Attached is a short (657 words) article of mine that I should appreciate your
comments on. Thank you in advance.
Rate heterogeneity of nonsynonymous substitutions among codon sites
The second codon site of protein-coding genes is more
conserved than the first and the third codon sites. The standard
explanation for this is that any nucleotide substitution at the second
codon site is invariably nonsynonymous and should be under strong
purifying selection, whereas most nucleotide substitutions at the third
codon site and at least some nucleotide substitutions at the first codon
site are synonymous and should be under relatively weak purifying
selection. Our study shows that this explanation is insufficient, and we
propose and empirically verify a supplementary explanation.
Among the 60 mitochondrial sense codons, there are 190 possible
nonsynonymous codon pairs in which one codon can mutate into the other
through a single nucleotide substitution (e.g., ACU and GCU). Of these 190
pairs, 82 differ at the first codon site (e.g., CCU and UCU), 84 at
the second codon site (e.g., CCU and CGU), and 24 at the third codon site
(e.g., CAU and CAA). Thus, if all such pairs were equally likely to be
substituted, we would expect 43.16% (= 82/190) of the nonsynonymous
substitutions between two DNA sequences to fall on the first codon site,
44.21% (= 84/190) on the second codon site, and only 12.63% (= 24/190) on
the third codon site.
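The tally of codon pairs by site can be checked by brute-force enumeration. The following is a minimal sketch, assuming the vertebrate mitochondrial genetic code (NCBI translation table 2, which has 60 sense codons); the function name `count_nonsyn_pairs` is ours and purely illustrative.

```python
from itertools import combinations

BASES = "UCAG"
# Assumption: vertebrate mitochondrial code (NCBI table 2), with codons
# ordered UUU, UUC, UUA, UUG, UCU, ... and '*' marking the four
# termination codons (UAA, UAG, AGA, AGG).
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"

CODE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
SENSE = [codon for codon, aa in CODE.items() if aa != "*"]


def count_nonsyn_pairs():
    """Count nonsynonymous sense-codon pairs that differ at exactly one
    site, returned as [first site, second site, third site]."""
    counts = [0, 0, 0]
    for a, b in combinations(SENSE, 2):
        diff = [i for i in range(3) if a[i] != b[i]]
        if len(diff) == 1 and CODE[a] != CODE[b]:
            counts[diff[0]] += 1
    return counts


counts = count_nonsyn_pairs()
total = sum(counts)
for site, n in enumerate(counts, start=1):
    print(f"site {site}: {n} pairs ({100 * n / total:.2f}%)")
```

Under this code table the enumeration gives 82, 84 and 24 pairs, matching the percentages quoted above.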
We retrieved ten DNA sequences of the cyt-b gene from 10 closely
related species of pocket gophers (GenBank accession numbers
L11900-L11909). Closely related species were chosen to minimise the
error in reconstructing ancestral states. We used 378 codons for
analysis (we deleted the initiation and termination codons, which are
identical in all ten sequences). The phylogenetic relationship among
these 10 sequences was verified by using PAUP (Swofford 1993), DNAML
(Felsenstein 1993) and PAML (Yang 1996).
Given a phylogenetic tree based on mtDNA sequences for a
protein-coding gene, with reconstructed ancestral states for all internal
nodes, one can count the number of times each of the 60 mitochondrial
codons is substituted by a nonsynonymous codon. For our data, there
are 10 terminal nodes and 8 internal nodes, resulting in 17 pairwise
comparisons between neighbouring nodes (one per branch). For each
pairwise comparison, nonsynonymous codon pairs were counted.
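The per-branch counting can be sketched as follows. This is only an illustration, not the procedure actually used: it assumes the vertebrate mitochondrial code (NCBI table 2), in-frame sequences written with U rather than T, and ancestral sequences that have already been reconstructed. The function names and the toy sequences are made up.

```python
from collections import Counter

BASES = "UCAG"
# Assumption: vertebrate mitochondrial code (NCBI table 2); '*' = stop.
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def codons(seq):
    """Split an in-frame sequence into codons, dropping any partial tail."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def tally_branch(ancestor, descendant, tally):
    """Add the nonsynonymous codon changes along one branch to `tally`."""
    for a, d in zip(codons(ancestor), codons(descendant)):
        if CODE[a] == CODE[d]:
            continue  # identical or synonymous codons are not counted
        diff = [i for i in range(3) if a[i] != d[i]]
        # Changes at two or three sites are kept apart, since the
        # intermediate codon cannot be identified.
        key = f"site{diff[0] + 1}" if len(diff) == 1 else "multi"
        tally[key] += 1
    return tally


# Toy branches (made-up sequences): (ancestral node, descendant node).
branches = [("CCUACUCAU", "UCUACUCAA"), ("ACUGGG", "CUUGGG")]
tally = Counter()
for anc, desc in branches:
    tally_branch(anc, desc, tally)
print(dict(tally))
```

With real data, `branches` would hold one pair of sequences for each of the 17 branches of the tree.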
The counting procedure revealed a total of 97 nonsynonymous
substitutions involving a single nucleotide substitution. There were also
10 nonsynonymous codon substitutions involving codon pairs that differ at
two codon sites (e.g., ACU and CUU). These are ignored in subsequent
analysis because of the difficulty in identifying the intermediate codons.
Of the 97 nonsynonymous substitutions, 54 occurred at the first codon
site, 24 at the second codon site, and 19 at the third codon site. These
values are significantly different from the theoretical expectations of
42 (= 97 * 43.16%), 43, and 12, respectively (chi-square = 15.55, df = 2,
p = 0.0004).
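The goodness-of-fit test can be reproduced with a few lines of arithmetic. This sketch uses only the counts reported above. It relies on the fact that, for 2 degrees of freedom, the upper-tail probability of the chi-square distribution is exactly exp(-x/2), so no statistics library is needed.

```python
import math

observed = [54, 24, 19]       # nonsynonymous substitutions at sites 1, 2, 3
pair_counts = [82, 84, 24]    # possible nonsynonymous codon pairs per site

n = sum(observed)
expected = [n * k / sum(pair_counts) for k in pair_counts]

# Pearson chi-square statistic for goodness of fit.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# With df = 2, the chi-square survival function reduces to exp(-x / 2).
p_value = math.exp(-chi2 / 2)

print("expected:", [round(e, 2) for e in expected])
print(f"chi-square = {chi2:.2f}, df = 2, p = {p_value:.4f}")
```

The unrounded expectations are about 41.86, 42.88 and 12.25, and the p-value is about 0.0004.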
We note that nonsynonymous substitutions have occurred more
frequently than expected at the first and third codon sites, and less
frequently than expected at the second codon site. One hypothesis for this
pattern is that an amino acid pair coded by a codon pair that differs at
the second codon site is more dissimilar than an amino acid pair coded by
a codon pair that differs at the first or the third codon site. Therefore,
a nonsynonymous substitution at the first or the third codon site is more
nearly neutral, and more likely to be fixed, than a nonsynonymous
substitution at the second codon site. This hypothesis can be tested by
using Grantham's distance as a general measure of amino acid dissimilarity.
For the 82, 84 and 24 nonsynonymous codon pairs that differ at the
first, second, and third codon sites, the mean Grantham's distance is
68.9, 100.5 and 68.3, respectively, with the first and third means
significantly smaller than the second mean. This confirms our
hypothesis that nonsynonymous substitutions due to a nucleotide
substitution at the second codon site involve amino acid replacements with
more dramatic effects than those due to nucleotide substitutions at the
first and third codon sites.
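The per-site means can be computed by grouping the amino acid pairs by the codon site at which their codons differ and averaging a dissimilarity score within each group. The sketch below shows only that grouping step. The function name and the numbers in `toy_distance` are placeholders, not Grantham's published values, which would have to be supplied from the original table.

```python
from collections import defaultdict
from statistics import mean


def mean_distance_by_site(pairs, distance):
    """Average an amino acid dissimilarity score per codon site.

    pairs    -- iterable of (site, aa1, aa2), where site is 1, 2 or 3
    distance -- dict keyed by frozenset({aa1, aa2})
    """
    by_site = defaultdict(list)
    for site, aa1, aa2 in pairs:
        by_site[site].append(distance[frozenset((aa1, aa2))])
    return {site: mean(values) for site, values in sorted(by_site.items())}


# Placeholder scores for illustration only (NOT Grantham's distances).
toy_distance = {
    frozenset("PS"): 70,
    frozenset("TA"): 50,
    frozenset("PR"): 100,
    frozenset("HQ"): 30,
}
# (site, amino acid 1, amino acid 2), e.g. CCU/UCU differ at site 1 (P vs S).
toy_pairs = [(1, "P", "S"), (1, "T", "A"), (2, "P", "R"), (3, "H", "Q")]

print(mean_distance_by_site(toy_pairs, toy_distance))
```

Keying the table by an unordered pair means each distance is stored once, regardless of the direction of the substitution.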