NCBI Server Format --> FASTA converter

Dennis Benson dab at ray.nlm.nih.gov
Fri Oct 2 17:06:56 EST 1992

robison1 at husc10.harvard.edu (Keith Robison) writes:
: >Does someone already have a program to convert the results of a
: >NCBI Sequence Server Query into FASTA format?  
: I realize now that I omitted a key qualifier.  GenBank queries
: come back in GenBank format, PIR queries in something that looks
: like PIR format but with some extra linefeeds separating field
: labels from data.  But the Swiss-Prot returns not only have the
: extra line-feeds, but use different heading names than the
: Swiss-Prot distribution (full names rather than abbreviations).
: I guess the real question is whether these deviations will cause
: problems for various programs designed to read/convert PIR and
: Swiss-Prot formats.
: Keith Robison
: Harvard University
: Department of Cellular & Developmental Biology
: Department of Genetics / HHMI
: robison at ribo.harvard.edu 

Keith -- You've pointed out something I want to fix over the next couple
of weeks.  The only reason the PIR, Swiss-Prot (and EMBL) entries come out
looking a little unfamiliar is that you are seeing them as formatted for
IRX text retrieval.

GenBank entries have been passed through a filter to turn them back into
flat-file style records.  I just need to write the filters for the other
databases.  So please, if you can be patient a little longer, don't
write converters for the present output.

I also wonder whether, as an interim solution, a FASTA-type output option
would take care of some of your needs, i.e., do you need all of a PIR
record in PIR format, or is your principal need to get an identifier line
and the sequence itself?  It would be easy to have a field in the mail
message, 'FASTA yes' or just 'FASTA', and have the program return just
the FASTA-formatted sequence for any of the databases.
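
For the GenBank returns, which already come back in flat-file form, the
reduction to an identifier line plus sequence might look roughly like the
Python sketch below.  It is only an illustration, not the server's code: it
assumes the usual GenBank flat-file layout (LOCUS, DEFINITION, ORIGIN, and
a closing // line), and the function name genbank_to_fasta and its width
parameter are made up for this example.

```python
def genbank_to_fasta(text, width=60):
    """Convert GenBank flat-file text to FASTA text.

    Hypothetical helper for illustration; assumes each record has a
    LOCUS line, an optional DEFINITION, an ORIGIN block, and ends with //.
    """
    out = []
    name = None
    definition = []
    seq = []
    in_origin = False
    in_definition = False

    for line in text.splitlines():
        if line.startswith("LOCUS"):
            # Second field of the LOCUS line is the entry name.
            fields = line.split()
            name = fields[1] if len(fields) > 1 else "unknown"
            definition, seq = [], []
            in_origin = in_definition = False
        elif line.startswith("DEFINITION"):
            definition.append(line[len("DEFINITION"):].strip())
            in_definition = True
        elif line.startswith("ORIGIN"):
            in_origin = True
            in_definition = False
        elif line.startswith("//"):
            # End of record: emit the header line and the wrapped sequence.
            if name is not None:
                header = ">" + name
                if definition:
                    header += " " + " ".join(definition)
                out.append(header)
                s = "".join(seq)
                out.extend(s[i:i + width] for i in range(0, len(s), width))
            name = None
            in_origin = False
        elif in_origin:
            # Sequence lines carry a position number and spaced blocks;
            # keep only the letters.
            seq.append("".join(ch for ch in line if ch.isalpha()))
        elif in_definition and line.startswith(" "):
            # Indented continuation of a multi-line DEFINITION.
            definition.append(line.strip())
        else:
            in_definition = False

    return "\n".join(out) + ("\n" if out else "")
```

The function takes the text of one or more records and returns one FASTA
entry per record.  The PIR, Swiss-Prot, and EMBL returns would each need
their own parsing, which is not attempted here.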

Dennis Benson
