Computer readable FASTA output

Brian Fristensky frist
Fri Oct 13 16:14:01 EST 1995

My first comment is on the use of periods "." as gap
characters. I don't know if there is any sort of official 
standard, but certainly dashes "-" are far more commonly used.

My interest in computer readability mainly concerns using fasta
output from GDE. At present, I have a script that extracts
names from fasta output and opens up one window for the output
itself, and one containing the names of the hits. Some or all of
the names can then be copied and pasted into another window for retrieval
of the GenBank entries by FETCH. The problem
is that all names occur twice in the output, meaning that they
occur twice in the name file.  Doing something like

 sort < rawfile |uniq >outputfile

would work, but you lose the ORDER of the names. If the names
aren't in the same order as they appear in the printout, it can
be tedious to copy and paste all the hits you want.
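
One way around this, offered as a minimal sketch rather than a tested
part of my script, is to let awk drop the repeats while keeping the first
occurrence of each name in its original position. The function name
unique_in_order is made up for illustration; rawfile and outputfile are
the same placeholder names used above.

```shell
# Hypothetical helper: print each line only the first time it is seen,
# so the names stay in the order in which they appear in the printout.
# With no arguments it reads standard input; otherwise it reads the
# named files.
unique_in_order() {
    # seen[$0] is 0 (false) on first sight, so !seen[$0]++ is true
    # exactly once per distinct line, and awk prints the line then.
    awk '!seen[$0]++' "$@"
}
```

It would be used in place of the sort/uniq pipeline, e.g.
unique_in_order < rawfile > outputfile. Since nothing is sorted, the
second copy of each name is simply skipped.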

It looks as though the "machine readable" output format
in your message would make it easy to extract a list of
names, in the order in which they appear. However, I don't
want the machine-readable output and the more traditional listing
to be mutually exclusive. It would be useful if one run of 
the program could produce BOTH types of listing.

Brian Fristensky                |  "Let us think the unthinkable, let us do
Department of Plant Science     |  the undoable. Let us prepare to grapple
University of Manitoba          |  with ineffable itself, and see if 
Winnipeg, MB R3T 2N2  CANADA    |  we may not eff it after all."
frist at cc.umanitoba.ca           |  
Office phone:   204-474-6085    | Douglas Adams, DIRK GENTLY'S HOLISTIC
FAX:            204-261-5732    |           DETECTIVE AGENCY

