A C++ class library for FASTA format sequences

tendo tendo at fas.harvard.edu
Mon Dec 14 16:51:06 EST 1998

It seems that the message I posted a couple of days ago was removed from the
news, so I would like to post it again.

I just wrote the first alpha version of a C++ class for reading FASTA format,
with a slight extension that recognizes a semicolon (;) as the start of a
comment line.

Basic features of the class are:

    o To read a sequence is as simple as this:

            Sequence s;
            cin >> s;

        To read multiple sequences, just iterate.

    o No line length limit.

    o It recognizes the identifier as the first word (a run of non-space
        characters) of the '>' line.

    o The identifier, comments and sequence can be accessed through three
        methods,

            id(), rem() and seq(),

        where a range can be passed to seq().

    o A single site can be accessed by index, s[i] (where s is a Sequence).

    o It allows a semicolon (;) in the first column as a comment line
        indicator. Comments are accessible through the method rem() as a
        single concatenated string, with lines separated by newline (\n).

    o It comes with a function 'complement()' to obtain the complementary
        strand of DNA or RNA.
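
To make the feature list above concrete, here is a minimal sketch of what such
an interface could look like. It is not the code of the library itself: only
the names Sequence, id(), rem(), seq(), s[i] and complement() come from the
list above. Everything else is an assumption made for illustration, namely the
(pos, len) form of the range passed to seq(), the rule that comment lines are
attached to the record being read, the dropping of the rest of the '>' line
after the identifier, and a complement() that takes a plain string plus an RNA
flag and does not reverse the strand.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the Sequence class described above.
class Sequence {
public:
    const std::string& id() const { return id_; }    // first word of the '>' line
    const std::string& rem() const { return rem_; }  // ';' comments joined by '\n'
    const std::string& seq() const { return seq_; }  // whole sequence
    // Assumed range form: `len` residues starting at zero-based `pos`.
    std::string seq(std::size_t pos, std::size_t len) const {
        return seq_.substr(pos, len);
    }
    // Single site by zero-based index.
    char operator[](std::size_t i) const {
        assert(i < seq_.size());
        return seq_[i];
    }
    friend std::istream& operator>>(std::istream& in, Sequence& s);

private:
    std::string id_, rem_, seq_;
};

// Reads one record and stops just before the next '>' line, so repeated
// calls iterate over a multi-sequence input. Lines are read with getline
// into std::string, so there is no line length limit.
std::istream& operator>>(std::istream& in, Sequence& s) {
    Sequence rec;
    std::string line;
    bool have_header = false;
    while (in.good()) {
        const int next = in.peek();
        if (next == std::char_traits<char>::eof()) break;
        if (next == '>' && have_header) break;  // next record: leave it unread
        std::getline(in, line);
        if (!line.empty() && line.back() == '\r') line.pop_back();  // CRLF input
        if (line.empty()) continue;
        if (line[0] == ';') {
            // Comment line: drop the ';' and any blanks right after it.
            const std::size_t p = line.find_first_not_of(" \t", 1);
            if (!rec.rem_.empty()) rec.rem_ += '\n';
            if (p != std::string::npos) rec.rem_ += line.substr(p);
        } else if (line[0] == '>') {
            // Header line: keep only the identifier (assumption of this sketch).
            std::istringstream header(line.substr(1));
            header >> rec.id_;
            have_header = true;
        } else if (have_header) {
            // Sequence line: append the residues, skipping any whitespace.
            for (char c : line)
                if (!std::isspace(static_cast<unsigned char>(c))) rec.seq_ += c;
        }
    }
    if (have_header) s = rec;
    else in.setstate(std::ios::failbit);  // no record found: make `in >> s` false
    return in;
}

// Base-by-base complement without reversal; unknown symbols are kept as-is.
std::string complement(const std::string& strand, bool rna = false) {
    std::string out(strand);
    for (char& c : out) {
        switch (c) {
            case 'A': c = rna ? 'U' : 'T'; break;
            case 'T': case 'U': c = 'A'; break;
            case 'C': c = 'G'; break;
            case 'G': c = 'C'; break;
            default: break;
        }
    }
    return out;
}
```

With this sketch, reading every sequence from standard input is a loop of the
form "Sequence s; while (cin >> s) { ... }", and complement(s.seq()) gives the
complementary DNA strand of the record just read.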

This class requires the string class template from the standard C++ library.
It also includes the header <iostream>, which is part of the same standard
library.
The class is available, together with a README, at


Although I made every effort to avoid problems, this is just an alpha version
and may still have bugs, so I reserve all rights and there is no warranty.
However, anybody who wants to use it is free to do so.
I am still working on adding some features.
I would appreciate your comments, suggestions and bug reports.

Toshinori Endo, Ph.D.
Harvard University Biological Laboratories
16 Divinity Avenue, Cambridge, MA 02138-2092, USA
tendo at fas.harvard.edu
