Alignment of DNA with protein

Stuart Brown browns02 at mcrcr.med.nyu.edu
Thu Aug 13 13:30:52 EST 1998

In article <6qsc2c$hii at gap.cco.caltech.edu>, mathog at seqaxp.bio.caltech.edu wrote:

> In article <n1af5b21mj.fsf at speed.ebi.ac.uk>, Michele Clamp
> <michele at speed.ebi.ac.uk> writes:
> >mcbaet at MCBSGS1.IMCB.NUS.EDU.SG (Anthony Ting) writes:
> >
> >On a similar note, and while I'm thinking of it, does anyone out there
> >know of something that will compare 2 protein sequences at the DNA
> >level and give you the most likely DNA alignment?  I'd find this
> >useful for finding possible undetected frameshift errors in multiple
> >alignments.
> >
> The only tool that I know of which is even close is GCG's FRAMEALIGN. It
> aligns a protein to DNA, allowing for frameshifts.  It is the only safe way
> to go when aligning to an EST, since they have a very, very high rate of
> frameshifts and other errors.  However, it assumes that the protein
> sequence in the comparison is fully in the correct frame, which might not
> be true if, for instance, all you had to start with is a collection of
> ESTs. 
> Regards,
> David Mathog
> mathog at seqaxp.bio.caltech.edu

I'm not sure if this is exactly what you have in mind, but the TFASTX3 program
in Pearson's FASTA 3.0 package compares a protein to a DNA sequence, translating
the DNA in all reading frames and allowing for frameshift mutations.  It is a
hell of a lot faster than GCG's FRAMEALIGN and it is free.
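
To illustrate what "translating the DNA in all reading frames" means, here is
a minimal Python sketch.  It is not how TFASTX3 is implemented and it does not
model frameshifts; it only translates the three forward frames with the
standard genetic code (the reverse strand is left out for brevity).  The names
CODON_TABLE and translate are made up for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame):
    """Translate dna starting at offset frame (0, 1 or 2).

    Codons containing anything other than A, C, G, T become 'X';
    a trailing partial codon is ignored.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


dna = "ATGAAACTGGTTTAA"
frames = [translate(dna, f) for f in range(3)]
for f, protein in enumerate(frames):
    print(f, protein)
```

A frameshift error in the DNA shows up as the true protein continuing in a
different one of these frames partway along the sequence, which is why a
frameshift-aware comparison has to consider all of them at once.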

I am working on a similar problem and would appreciate all opinions.  I want
to add DNA sequences to a protein alignment.  The idea is to turn each protein
sequence into 3-letter code and match up the corresponding DNA codons for each
protein in the multiple alignment.  I want the DNA sequences to follow the gaps
that have been introduced in the protein alignment, but in the case of introns,
the entire protein alignment would be gapped across the intronic DNA.  This
whole process needs to be fairly quick and easy, since we want to make a lot of
these hybrid DNA/protein alignments and use them as the starting point for
other analyses.
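
A minimal Python sketch of the codon-matching step might look like the
following.  It assumes each DNA sequence is an intron-free coding sequence that
starts in frame with its protein, and that gaps in the protein alignment are
written as "-".  It does not handle introns or check that the codons actually
encode the aligned residues.  The function name thread_codons and the toy
sequences are invented for illustration.

```python
def thread_codons(aligned_protein, cds, gap="-"):
    """Gap a coding sequence so that it follows an aligned protein row.

    Every residue consumes the next three bases of cds; every protein
    gap becomes a three-character gap in the DNA row.
    """
    residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(cds) < 3 * residues:
        raise ValueError("coding sequence is too short for the protein")
    row = []
    pos = 0
    for aa in aligned_protein:
        if aa == gap:
            row.append(gap * 3)
        else:
            row.append(cds[pos:pos + 3])
            pos += 3
    return "".join(row)


# Toy protein alignment and the matching (ungapped) coding sequences.
alignment = {"seqA": "MK-LV", "seqB": "MKTLV"}
coding = {"seqA": "ATGAAACTGGTT", "seqB": "ATGAAAACCCTGGTG"}

codon_alignment = {
    name: thread_codons(alignment[name], coding[name]) for name in alignment
}
for name, row in codon_alignment.items():
    print(name, row)
```

Because every protein column maps to exactly three DNA columns, the DNA rows
all come out the same length and can be laid directly under the protein
alignment.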

-- Stuart Brown

Kim B. Foglia
WriteDesign   <http://www.write-design.com>
The Working Moms' Internet Refuge   <http://www.moms-refuge.com>
