Synonymous DNA Sites Exporting

Doug Eernisse DEernisse at fullerton.edu
Thu Jul 27 12:49:18 EST 1995

In article <925C05EB2 at biology.biosci.wayne.edu>,
vicdef at BIOLOGY.BIOSCI.WAYNE.EDU ("Victor DeFilippis") wrote:

> Dear Anyone:
> I am looking for a program that can take my aligned (homologous) 
> DNA sequences and create export files (text, PHYLIP, MEGA, PAUP, etc.) of 
> specific (or user specified) positions only.  For example, only 
> synonymous positions, only nonsynonymous positions, only fourfold
> degenerate sites, only twofold degenerate sites, etc.  I know that 
> MEGA can make exports of only variable positions but can it do 
> anything like I've described above?  Any assistance/information would 
> be greatly appreciated.  Thanks in advance.
> Victor DeFilippis
> Wayne State University
> vicdef at biology.biosci.wayne.edu

I'm not exactly sure what you mean by the above, but if you mean
3rd codon positions for fourfold degenerate synonymous sites, and
1st positions for twofold degenerate sites, it is quite easy to
just make those a character set if you are using PAUP. Then you
can include/exclude them to your heart's content. It is my understanding
that synonymous/nonsynonymous substitutions are easy to identify in a
pairwise sequence comparison but, in a multiple sequence alignment,
depend on a particular tree hypothesis and method of character
optimization. The problem with equating, say, weighting
schemes with codon site is that not all amino acids have the same
level of degeneracy at 3rd or 1st positions. If you want to more accurately 
reflect, say, the twofold degeneracy found at 1st position sites, then you
might want to take the approach advocated by Sidow and Thomas (1994),
who convert first positions of leucine codons (T or C) to Y and
those of arginine codons (A or C) to M. The same could be done for
third positions, with considerably more substitution of ambiguity
symbols. Alternatively, you can simply ignore transitions by replacing
A, C, G, and T with purine/pyrimidine symbols (R and Y). This is easy to
accomplish with PAUP's equate feature. Apparently, at least the 
1st position conversion mentioned above can be done with 
Sidow's program, MakeInf, distributed with PHYLIP 3.5, but I haven't
confirmed this personally.
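
As a rough illustration of the two recodings described above, here is a
minimal Python sketch. It is not MakeInf and not Sidow and Thomas's own
code. The function names (recode_first_positions, ry_code) are made up for
this example, and it assumes in-frame DNA that starts at codon position 1
and uses the standard genetic code. It follows the description above
literally, recoding the first base of every leucine and arginine codon.

```python
# Hypothetical helpers for illustration only.
# Assumptions: standard genetic code, sequence is in frame from position 1.

LEU_CODONS = {"TTA", "TTG", "CTT", "CTC", "CTA", "CTG"}
ARG_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}

def recode_first_positions(seq):
    """Replace the first base of leucine codons with Y and of arginine codons with M."""
    seq = seq.upper()
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        if codon in LEU_CODONS:
            codon = "Y" + codon[1:]   # T or C at first position
        elif codon in ARG_CODONS:
            codon = "M" + codon[1:]   # A or C at first position
        out.append(codon)             # other codons (and gaps) pass through unchanged
    return "".join(out)

# A and G become R (purine); C and T become Y (pyrimidine).
RY_TABLE = str.maketrans("AGCT", "RRYY")

def ry_code(seq):
    """Collapse bases to R/Y so that transitions are no longer visible."""
    return seq.upper().translate(RY_TABLE)
```

For example, recode_first_positions("TTACGTAAA") gives "YTAMGTAAA", and
ry_code("ACGT") gives "RYRY". Symbols such as gaps or N are left alone.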

Example of the Nexus (PAUP/MacClade) character set method (within a PAUP or
ASSUMPTIONS block):

    charset 1stPos = 1-697\3;
    charset 2ndPos = 2-698\3;
    charset 3rdPos = 3-699\3;
    exclude 3rdPos;

[or, in the 'Options' statement of PAUP's data block:]

    options zap "3-699\3"; [then export file to a new matrix,
                            say, in 'simple text' format]
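
If you are working outside PAUP, the same include/exclude idea can be
sketched in a few lines of Python. This is only an illustration under my
own assumptions: the alignment is held as a dict of name -> sequence, the
reading frame starts at column 1, and the function names are hypothetical.

```python
# Hypothetical helpers; columns are 0-based here, unlike Nexus charsets.

def codon_position_columns(n_sites, position):
    """Return 0-based column indices for codon position 1, 2, or 3."""
    # position=3 with n_sites=699 selects the same sites as charset 3-699\3
    return list(range(position - 1, n_sites, 3))

def exclude_columns(alignment, excluded):
    """Return a new alignment without the excluded columns; the input is not modified."""
    drop = set(excluded)
    return {
        name: "".join(base for i, base in enumerate(seq) if i not in drop)
        for name, seq in alignment.items()
    }

def to_simple_text(alignment):
    """Format the alignment as one 'name  sequence' line per taxon."""
    width = max(len(name) for name in alignment)
    return "\n".join(f"{name.ljust(width)}  {seq}" for name, seq in alignment.items())
```

Usage would look something like
exclude_columns(aln, codon_position_columns(699, 3)), followed by
to_simple_text() on the result to write out the reduced matrix.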

Hope this helps, but if you are trying to do something else, please
clarify. It is an intriguing suggestion for a matrix filtering
option that, who knows, some of us might implement. In general,
I would advocate keeping your original data matrix/alignment intact
and then excluding characters or equating symbols, as illustrated above, for
particular analyses. Otherwise, you lose track of your original
data and alignment and become more subject to text editing errors.
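
For what it's worth, here is one way such a matrix filter might look in
Python, keeping the original alignment untouched and returning a reduced
copy. The criterion is my assumption, not an established definition: a
third position is kept only if every sequence has a codon from one of the
eight fourfold degenerate families of the standard genetic code at that
codon. The function names are hypothetical, and the alignment is assumed to
be a non-empty dict of name -> in-frame sequence.

```python
# First two bases of the eight fourfold degenerate codon families
# (Leu CTN, Val GTN, Ser TCN, Pro CCN, Thr ACN, Ala GCN, Arg CGN, Gly GGN).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}

def fourfold_third_positions(alignment):
    """Return 0-based third-position columns that are fourfold degenerate in every sequence."""
    seqs = [seq.upper() for seq in alignment.values()]
    n_sites = min(len(seq) for seq in seqs)
    keep = []
    for start in range(0, n_sites - 2, 3):      # complete codons only
        if all(seq[start:start + 2] in FOURFOLD_PREFIXES
               and seq[start + 2] in "ACGT"     # skip gaps and ambiguity codes
               for seq in seqs):
            keep.append(start + 2)
    return keep

def extract_columns(alignment, columns):
    """Return a new alignment containing only the given columns, in order."""
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}
```

Because both functions return new objects, the filtered matrix can be
exported for a particular analysis while the original alignment stays
exactly as it was.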

Doug Eernisse <DEernisse at fullerton.edu>
Dept. Biological Science MH282
California State University
Fullerton, CA 92634

More information about the Bio-soft mailing list
