In article <Cotz82.19H at mozo.cc.purdue.edu>, pmiguel at bilbo.bio.purdue.edu writes:
> (This is about the version of FASTA in GCG.)
> When a DNA FASTA search displays an alignment in which the query is /REV
>(i.e., reverse complement, i.e., bottom strand) it numbers it incorrectly.
>That is, the last base of the sequence becomes "1", the next to the last
>"2" and so on. Why hasn't this been fixed over the years? BESTFIT, for
>example, doesn't do this. Why doesn't FASTA display reverse alignments
>like BESTFIT? What would it take, an extra 10 lines of code? It drives me
>nuts every time I have to do it by hand!
>
I've been asked why this numbering should be considered incorrect. If I
have 1000 bases of sequence in a file, then it must be in one orientation or
the other. Sometimes the orientation will be arbitrary, sometimes not -- but
that is not for a program to decide. If I get a hit on the reverse strand
I want to know where that hit is according to the numbering scheme (the
orientation) I'm using. FASTA, however, will show a /REV alignment from, say,
700 to 900. On my map that hit is not at 700 to 900; it is at 301 to 101. I
can't imagine anyone preferring the numbering used by FASTA. All it would take
to change it is to subtract each /REV number from the sequence length plus
one and print the result on the alignment. Right? Why make me do this
calculation myself?
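
For anyone else stuck doing this by hand, here is a minimal sketch of the
conversion in Python. It assumes 1-based numbering in which the last base of
the sequence is shown as "1" on the reverse strand, as described in the quoted
post. The function name rev_to_forward is my own invention, not anything in
FASTA or GCG.

```python
def rev_to_forward(pos, seq_len):
    """Map a 1-based reverse-strand coordinate back to forward numbering.

    Assumes the reverse strand is numbered so that the last base of the
    sequence is position 1 (hypothetical helper, not part of FASTA/GCG).
    """
    if not 1 <= pos <= seq_len:
        raise ValueError("position lies outside the sequence")
    return seq_len + 1 - pos


# A /REV alignment shown as 700..900 on a 1000-base sequence:
print(rev_to_forward(700, 1000), rev_to_forward(900, 1000))
```

Note the "+ 1": with 1-based coordinates, position 1 on the reverse strand
has to map to base 1000, not base 999.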
The situation is even more confusing if I'm restricting the database
search to a sub-section of my sequence -- say 200 to 600. If I get a /REV
hit (a hit on the reverse complement strand) from 300 to 350 according to
FASTA, the real region of homology will be from 500 to 450! It took me
hours of banging my head against the wall to figure this one out. If you
think about it, there is no legitimate reason why FASTA should display an
alignment like this. It's just an error. But in the 5 years I've been
using the program it's not been corrected.
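
The sub-section case can be sketched the same way. This is only my reading of
the behaviour, inferred from the 300-to-350 example: I am assuming that when
the search is restricted to begin..end, the reverse strand is numbered
starting at "begin" for the base that sits at "end". The function name and
parameters are hypothetical.

```python
def rev_hit_to_forward(hit_start, hit_end, begin, end):
    """Convert a reverse-strand hit within a begin..end sub-range.

    Assumption (inferred from one example, not from the FASTA source):
    the base at `end` is displayed as `begin` on the reverse strand, so a
    displayed coordinate p corresponds to forward base begin + end - p.
    """
    for p in (hit_start, hit_end):
        if not begin <= p <= end:
            raise ValueError("hit coordinate lies outside the sub-range")
    return begin + end - hit_start, begin + end - hit_end


# Search restricted to 200..600, /REV hit displayed as 300..350:
print(rev_hit_to_forward(300, 350, 200, 600))
```

With begin = 1 and end equal to the sequence length, this reduces to the
whole-sequence rule of length plus one minus the displayed coordinate.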
Phillip