Lefkowitz at orion.cmc.uab.edu writes:
>Does anyone know of a program which will allow you to align a nucleotide
>sequence given its amino acid sequence which has been previously aligned
>to some other sequence(s) and may therefore contain gaps? The program could
>either align the nt sequence itself based upon the alignment of the aa
>sequence, or simply an editor which would display both the aa and nt
>sequences together and allow you to introduce gaps into the nt sequence
>where appropriate.
>
>We utilize the GCG programs on a VAX using Macs as terminals, so anything
>compatible with GCG, VMS, or Mac formats would be most helpful.
I recently modified my 'Aligner' HyperCard stack to have this feature.
I got tired of doing an alignment of amino acids, followed by trying
to make the corresponding nucleotide alignment match the gaps in the
peptide alignment. 'Aligner' and my other stack, 'DNA Translator', use
a simple string format of the form 'Seqname<space(s)>ACGTCCGT...<rtn>'
etc. for however many sequences there are, but many other formats are
also supported from 'DNA Translator'. For compatibility with GCG
you can use "TOIG" or "FROMIG" to convert to Intelligenetics format,
which is supported.
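For anyone who wants to pre-process such files outside HyperCard, here is a
minimal Python sketch of a reader for the string format described above. It is
not code from either stack; the function name parse_string_format is
hypothetical, and it assumes one sequence per line, a name with no embedded
spaces, and blank lines that can be ignored.

```python
def parse_string_format(text):
    """Parse lines of the form 'Seqname<space(s)>SEQUENCE' into (name, sequence) pairs.

    Assumptions (not confirmed by the stacks themselves): one record per line,
    the name contains no whitespace, and blank lines are skipped.
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first run of whitespace: name first, sequence second.
        parts = line.split(None, 1)
        name = parts[0]
        seq = parts[1] if len(parts) > 1 else ""
        records.append((name, seq))
    return records


if __name__ == "__main__":
    example = "Human  ACGTCCGT\nMouse ACG-CCGT\n"
    for name, seq in parse_string_format(example):
        print(name, seq)
```

Returning a list of pairs rather than a dictionary keeps the sequences in file
order, which matters when a nucleotide file and a peptide file have to
correspond line by line.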
This feature may not be exactly ideal for your purposes, but it
will probably help. If you have corresponding text files for nucleotides
and peptides in the string format listed, with sequences in the same
order, starting at the same place, then you can import and align the
peptide file using 'Aligner', then use the menu item "Create Nuc Gaps"
to add appropriate gaps to the nucleotide file, which can be imported
directly to 'Aligner' or output as a new text file. You can use 'DNA
Translator' to create the inferred amino acid strings in the first
place, including support for known genetic code variations. You can
also convert the aligned strings to PAUP 3, PHYLIP, Hennig86, etc.
formats.
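The general idea behind adding nucleotide gaps from a peptide alignment can be
sketched in a few lines of Python. This is only an illustration of the
technique, not the stack's actual script: the function name gap_nucleotides is
hypothetical, and it assumes that "-" is the gap character, that the nucleotide
string starts in frame at the first codon, and that every aligned residue
corresponds to exactly one codon.

```python
def gap_nucleotides(aligned_peptide, nucleotides, gap="-"):
    """Insert three-base gaps into an unaligned nucleotide string so that it
    follows the gaps of the corresponding aligned peptide string.

    Assumptions: the nucleotide string is in frame from its first base and
    each non-gap peptide character consumes exactly one codon.
    """
    gapped = []
    pos = 0  # index of the next unused base in the nucleotide string
    for residue in aligned_peptide:
        if residue == gap:
            gapped.append(gap * 3)  # one peptide gap becomes three nucleotide gaps
        else:
            codon = nucleotides[pos:pos + 3]
            if len(codon) < 3:
                raise ValueError("nucleotide string is too short for the peptide")
            gapped.append(codon)
            pos += 3
    # Keep any bases left over after the last residue (e.g. a stop codon).
    gapped.append(nucleotides[pos:])
    return "".join(gapped)


if __name__ == "__main__":
    print(gap_nucleotides("M-KL", "ATGAAACTG"))  # prints ATG---AAACTG
```

Applied to each peptide/nucleotide pair in turn, with the two files in the same
order, this gives a nucleotide alignment that preserves the amino acid
alignment.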
Sorry to be long-winded but here is the current (version 1.0g) blurb
on Aligner. Previous versions do not have this feature and much
earlier versions may have trouble with data fidelity when you
toggle between match and no match characters in the alignment display
(should be fixed now):
Aligner is a freeware HyperCard 2.x stack for manual alignment of
sequences by D. J. Eernisse, author of DNA Translator stack. Aligner was
inspired by a stack called MultiDNA by Ralph Gonzalez, but was created
completely from scratch. Enhancements compared to MultiDNA include: 1)
Number of sequences that can be aligned increased from 6 to 100; 2)
interleave formatting and number lines automatically adjust for the
number of sequences imported; 3) now accommodates sequences up to 30,000
bp or amino acid residues long; 4) imports or exports multiple sequences
that are simple return-delineated strings or in 'DNA Translator' stack
string format (i.e., 'Name<space>AGCTGA...<rtn>'); 5) optionally
interleaves exported sequences; 6) enlarged for use on a 640 x 480
standard color or larger monitor (automatically adjusts for standard
small Mac screens); 7) much faster switching between sequences to be
edited; 8) can toggle between match characters (dashes) to the first
sequence and no match characters; 9) speaks nucleotide/peptide
characters during keyboard entry or after entry starting from the
current insertion point; 10) various sequence manipulations, such as
lower <-> UPPER case; 11) can add gaps to all but the current sequence;
12) align amino acids, then introduce gaps to the corresponding
unaligned nucleotide strings to preserve the amino acid alignment; 13)
custom menus; 14) optional help facility displays the function of fields
or buttons as the cursor enters them.
This stack is distributed separately or as part of the more extensive
DNA Translator stack package. The best way to get the most current
version of either stack is via anonymous ftp to "ub.cc.umich.edu" (after
about 3/92: um.cc.umich.edu) then "cd gdef" and "get aligner.hqx" or
"get dnastacks.hqx" for Aligner stack only or the entire package,
respectively. Then debinhex the resulting files. Aligner by itself is
about 170 K uncompressed, while DNA Translator and associated files
occupy about 1 MB on your hard disk or 500K in compressed Binhex format.
This stack is free for noncommercial use only and is not public
domain. It is copyrighted 1991 by D. J. Eernisse, and uses some XFCN
resources copyrighted by Nigel Perry with similar copyright restrictions
(see stack script for details). For more information, contact:
Douglas J. Eernisse
Museum of Zoology, Univ. of Michigan, Ann Arbor, MI 48109 USA
usergdef at um.cc.umich.edu or usergdef at umichum
Please cite as:
Eernisse, D.J. DNA Translator and Aligner: HyperCard utilities to aid
phylogenetic analysis of molecules. CABIOS, in press.
or:
Eernisse, D.J. 1991. Aligner: HyperCard utility for manual sequence
alignment. Electronically published software available by anonymous ftp
to Ftp.Bio.Indiana.Edu.
For further information, tips, trouble-shooting, and version history,
see the online help facility.