6-frame translation of a list of (large) DNA sequences in one go?

Stephane Vuilleumier svuilleu at micro.biol.ethz.ch
Wed Nov 20 09:23:36 EST 1996

Hi Netters,

I was wondering whether it is possible to translate a set of N large DNA
sequence files (say, a prokaryotic genome) into all 6 reading frames with a
single command line. I have the DNA sequence files in GCG format, plus a file
listing the names of these files.
The rationale is that the sequence annotations which, I think, are used to
build the TrEMBL protein database (which takes some time to update anyway)
might miss subtle things such as translational coupling, frameshifts and,
yes, sequencing errors that introduce stop codons.
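
To make the request concrete, here is a minimal Python sketch of what a
6-frame translation of one sequence involves. It is not a GCG command and
does not read GCG-format files: it assumes the DNA is already available as a
plain string, uses the standard genetic code, and marks any codon containing
an ambiguous base as "X". The function names are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate from the first base; stop codons become '*'.

    Trailing bases that do not fill a codon are ignored, and codons
    with bases other than A, C, G, T are translated as 'X'.
    """
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to proteins."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in six_frame_translations("ATGGCCTAA").items():
        print(label, protein)
```

Stop codons are kept as "*" rather than used to split the translation, so
that a frameshift or a sequencing error leaves a visible trace in the output
instead of silently truncating it.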

What I would do next is build a dataset from these 6N protein translations
with something like (Unix GCG):

>dataset @6Nprotein.list{*}

(I think I know how to do that) and then run FASTA searches against this
dataset.

Thanks for any input,

Stephane Vuilleumier
Mikrobiologisches Institut
ETH-Zurich                  Tel:   (+41) 1 632 33 57
ETH-Zentrum/LFV             Fax:   (+41) 1 632 11 48
8092 Zurich                 email: svuilleu at micro.biol.ethz.ch
Switzerland                 http://www.micro.biol.ethz.ch/sv1.htm

More information about the Info-gcg mailing list