Funny CDS from srs... More comments...

Peter Rice pmr at sanger.ac.uk
Tue Nov 28 04:53:38 EST 1995

In article <49d9le$k9d at rc1.vub.ac.be> Robert Herzog <rherzog at ulb.ac.be> writes:
>   There is something going wrong here:
>   In the record from Genbank, the translation looks correct : a bona fide 
>   peptide of a few hundred amino acids long. But SRS seems to confuse the "W" 
>   and the "*", as each (or most, I did not check everything ;-) W appears 
>   like the translation of a stop codon.
>   Here is a copy from the SRS query at Daresbury (we do not keep a full Genbank 
>   under SRS at BEN today):
>   LOCUS       DROMTM1      5292 bp    DNA             INV       20-SEP-1995
>   DEFINITION  Drosophila melanogaster mitochondrial cytochrome c oxidase
>               subunits, ATPase6, 7 tRNAs (Trp, Cys, Tyr, Leu(UUR), Lys, Asp,
>               Gly) genes, and unidentified reading frames A6l, 2 and 3.
>   CDS_pept        1071..2606
>                   /note="NCBI gi: 903727"
>                   /codon_start=1
>                   /transl_except=(pos:1071..1073,aa:Met)
>                   /transl_table=5
>                   /product="cytochrome c oxidase I"

That's because it's a mitochondrial sequence, in which one of the
standard stop codons (TGA) codes for tryptophan instead.

This is supposed to be what the "/transl_table=5" means, but I do
not know of many programs that use it. It refers to NCBI translation
table 5, which they also call "SGC4", so it appears in EGCG's
EGenRunData directory as "trans4.txt".
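
As a rough illustration of the difference, here is a minimal Python
sketch that translates the same DNA with the standard code (table 1)
and with table 5. The amino acid strings are the ones NCBI lists for
those two tables, in TCAG codon order; the names CODONS, AA_BY_TABLE
and translate are my own assumptions for the example and do not come
from SRS, EGCG or any other package.

```python
from itertools import product

# Codons in NCBI's conventional order: TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]

# Amino acids per codon, in the same order (hypothetical lookup name).
AA_BY_TABLE = {
    # Table 1: the standard code
    1: "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
    # Table 5: invertebrate mitochondrial (TGA=W, ATA=M, AGA/AGG=S)
    5: "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG",
}


def translate(dna, table=1):
    """Translate DNA in frame 1; unknown codons become 'X'."""
    code = dict(zip(CODONS, AA_BY_TABLE[table]))
    dna = dna.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(code.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


if __name__ == "__main__":
    seq = "ATGTGAATAAGATAA"
    print(translate(seq, table=1))
    print(translate(seq, table=5))
```

With the standard table the TGA codon shows up as "*", which is the
symptom described in the quoted message; with table 5 it is "W".
The sketch ignores /codon_start and /transl_except, which a real
translation of a CDS would also have to honour.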

I suppose SRS could try to handle it, but this is yet another of those
dreadful feature table parsing problems: the annotation is so
inconsistent that it has not been worth designing software to cope
with the many different possibilities.

I seem to recall that /transl_table was going to be hidden somewhere
(on the source key?) and not necessarily on every CDS (to reduce
redundancy). There was even talk of leaving it out, and assuming
all programs would be generating it from the taxonomy.
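
Because the qualifier may or may not be present on a given CDS, any
program that wants to use it needs a fallback. The following Python
sketch shows one way to do that: look on the CDS first, then on the
source feature, then fall back to a default. The function name, the
idea of passing qualifiers as lists of strings, and the default of
table 1 are all assumptions made for the example, not a description
of how SRS or the databases actually behave.

```python
import re

# Matches a qualifier such as /transl_table=5 and captures the number.
_TRANSL_TABLE = re.compile(r"/transl_table=(\d+)")


def transl_table_for(cds_qualifiers, source_qualifiers=(), default=1):
    """Return the translation table number for a CDS.

    Checks the CDS qualifiers first, then the source feature's
    qualifiers, and returns `default` if neither mentions a table.
    """
    for qualifiers in (cds_qualifiers, source_qualifiers):
        for line in qualifiers:
            match = _TRANSL_TABLE.search(line)
            if match:
                return int(match.group(1))
    return default


if __name__ == "__main__":
    cds = ["/codon_start=1", "/transl_table=5",
           '/product="cytochrome c oxidase I"']
    print(transl_table_for(cds))
```

Deriving the table from the taxonomy instead would need an
organism-to-table lookup, which is not attempted here.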

The situation is still just as confused today.

A quick check through the EMBL organelles file shows there is still
no consistency in the annotation. I suggest, if you want to work with
translated mitochondrial genes, that you should do the translations
yourself to be sure you know what you are getting.

Peter Rice                           | Informatics Division
E-mail: pmr at sanger.ac.uk             | The Sanger Centre
Tel: (44) 1223 494967                | Hinxton Hall, Hinxton,
Fax: (44) 1223 494919                | Cambs, CB10 1RQ
URL: http://www.sanger.ac.uk/~pmr/   | England

More information about the Bio-srs mailing list
