
Fasta Sequence class

tendo tendo at fas.harvard.edu
Sat Dec 12 01:13:38 EST 1998


I have just written the first alpha version of a C++ class that reads FASTA
format, with a slight extension.
The basic features of the class are:

    o Reading a sequence is as simple as this:
            Sequence s;
            cin >> s;
        To read multiple sequences, just iterate.
    o There is no line length limit.
    o It recognizes the identifier as the first word (a run of non-space
        characters) of the '>' line.
    o The identifier, comments and sequence can be accessed through three
        methods:
            id(), rem() and seq(),
        where a range can be passed to seq().
    o A single site can be accessed by index, s[i] (where s is a Sequence
        instance).
    o It allows ';' in the first column as a comment line indicator.  Comments
        are accessible through the method rem() as a single string, with
        the lines separated by newlines (\n).
    o It comes with a function 'complement()' to obtain the complementary
        strand of DNA or RNA.
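
For readers who want to see how such an interface might fit together, here is
a minimal sketch. It is not the ReadFasta-0.1 source. The names Sequence, id(),
rem(), seq(), s[i] and complement() come from the list above; everything else
is an assumption made for illustration: the private members, the
(position, length) form of the range passed to seq(), the rule that ';' lines
follow the '>' line they belong to, and the choice that complement() is a free
function on strings that swaps bases without reversing the strand.

```cpp
#include <cctype>
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical sketch of the interface described above (not the original code).
class Sequence {
public:
    const std::string& id() const { return id_; }    // first word of the '>' line
    const std::string& rem() const { return rem_; }  // ';' lines joined by '\n'
    const std::string& seq() const { return seq_; }  // residues, whitespace removed
    // Assumed range form: start position and length.
    std::string seq(std::size_t pos, std::size_t len) const {
        return seq_.substr(pos, len);
    }
    char operator[](std::size_t i) const { return seq_[i]; }

    friend std::istream& operator>>(std::istream& in, Sequence& s);

private:
    std::string id_, rem_, seq_;
};

std::istream& operator>>(std::istream& in, Sequence& s) {
    std::string line;
    bool found = false;
    // Skip ahead to the next '>' header. std::getline grows the string as
    // needed, so there is no line length limit.
    while (std::getline(in, line)) {
        if (!line.empty() && line[0] == '>') { found = true; break; }
    }
    if (!found) return in;  // no more records; the stream is now in a failed state

    Sequence rec;
    std::istringstream header(line.substr(1));
    header >> rec.id_;  // the rest of the header line is ignored in this sketch

    // Consume lines until the next record starts or the input ends.
    for (;;) {
        const int c = in.peek();
        if (c == std::char_traits<char>::eof() || c == '>') break;
        std::getline(in, line);
        if (!line.empty() && line[0] == ';') {
            if (!rec.rem_.empty()) rec.rem_ += '\n';
            rec.rem_ += line.substr(1);
        } else {
            for (char ch : line) {
                if (!std::isspace(static_cast<unsigned char>(ch))) rec.seq_ += ch;
            }
        }
    }
    // Reaching end of input after a complete record is not an error.
    in.clear(in.rdstate() & ~std::ios_base::failbit);
    s = rec;
    return in;
}

// Assumed signature: base-by-base complement, strand order left unchanged.
std::string complement(const std::string& strand, bool rna = false) {
    std::string out(strand);
    for (char& c : out) {
        switch (c) {
            case 'A': c = rna ? 'U' : 'T'; break;
            case 'a': c = rna ? 'u' : 't'; break;
            case 'T': case 'U': c = 'A'; break;
            case 't': case 'u': c = 'a'; break;
            case 'C': c = 'G'; break;
            case 'c': c = 'g'; break;
            case 'G': c = 'C'; break;
            case 'g': c = 'c'; break;
            default: break;  // leave ambiguity codes and gaps untouched
        }
    }
    return out;
}
```

With a class along these lines, iterating over a multi-sequence file is a
loop of the form "while (cin >> s) { ... }", and the loop ends when no
further '>' line is found.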


This class requires the string class template from the C++ standard library.
It also includes the header <iostream>, which is part of the same standard
library.
The class is available, with a README included, at

        ftp://toshi-pc.fas.harvard.edu/pub/ReadFasta/ReadFasta-0.1.tgz

Although I have made every effort to avoid problems, this is just an alpha
version and it may still have bugs, so I reserve all rights and there is no
warranty.
However, anybody who wants to use it is free to do so.
I am still working on adding some features.
I would appreciate your comments, suggestions and bug reports.


==========================================
Toshinori Endo, Ph.D.
Harvard University Biological Laboratories
16 Divinity Avenue, Cambridge MA02138-2092, USA
tendo at fas.harvard.ed




