lamoran at gpu.utcc.utoronto.ca (L.A. Moran) writes:
>In article <badger.843783074 at phylo.life.uiuc.edu>,
>Jonathan Badger <badger at phylo.life.uiuc.edu> wrote:
>>lamoran at gpu.utcc.utoronto.ca (L.A. Moran) writes:
>>>You raise a number of interesting problems but don't forget that many of
>>>them also apply to those genes that seem to support Woese's Three Domain
>>>Hypothesis. We should be skeptical of *all* current hypotheses concerning
>>>the tree of life.
>>Yes, it is always worthwhile to be skeptical of all scientific
>>theories. However, it is also worthwhile to recognize that all data
>>are not of equal significance.
>I agree. Now, if we could only agree on which data was the most significant
>then all our problems could be solved. (-:
>> The fact that one can make gene trees
>>showing any desired relationship is not particularly surprising nor
>>informative in regard to organism trees.
>Actually I find this very surprising. Why was it not a surprise to you?
For many reasons. First, if one uses maximum parsimony to analyze
sequences, unequal amounts of change in different sequences can give
incorrect results. Secondly, the existence of paralogous genes can
give misleading results if you don't consistently choose the same
version of the gene in all cases. Thirdly, automated sequence
alignment programs don't always do a good job of alignment (neither do
people, of course, but people can at least correct some of the
more glaring errors the programs create). Fourthly, the possibility of
horizontal transfer can disconnect gene trees from organism trees.
Fifthly, if the amount of phylogenetic signal in the sequences is
weak, choosing a small number of sequences for analysis can result
in a tree wildly different from the correct one. And sixthly, if a
given sequence confers a great selective advantage, it is possible that
convergent evolution could have created it independently, thus giving a
misleading result on a phylogenetic tree.
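
To make the first point concrete, here is a minimal Python sketch of
how unequal amounts of change can mislead maximum parsimony. Everything
in it is an assumption made for illustration: the four taxa, the
ten-site sequences, and the helper names are invented, and the "true"
tree is simply declared to be ((A,B),(C,D)), with A and C imagined as
fast-evolving lineages that happen to share several parallel changes.
It scores each of the three possible four-taxon trees with Fitch's
parsimony count.

```python
# Toy illustration only: these sequences are invented so that taxa A and C
# share several parallel changes. Assume the true tree is ((A,B),(C,D)).
SEQS = {
    "A": "GCTGACATCG",
    "B": "GCACTGAACG",
    "C": "TATGACAAGG",
    "D": "TAACTGAACG",
}


def fitch_site(pair1, pair2):
    """Minimum number of changes at one site for the quartet (pair1),(pair2)."""
    changes = 0
    node_sets = []
    for left, right in (pair1, pair2):
        states = {left} & {right}
        if not states:
            # The two tips disagree: one change is needed below this node.
            states = {left} | {right}
            changes += 1
        node_sets.append(states)
    # One more change if the two internal nodes share no possible state.
    if not (node_sets[0] & node_sets[1]):
        changes += 1
    return changes


def parsimony_score(seqs, topology):
    """Sum the per-site Fitch counts over the whole alignment."""
    (a, b), (c, d) = topology
    return sum(
        fitch_site((seqs[a][i], seqs[b][i]), (seqs[c][i], seqs[d][i]))
        for i in range(len(seqs[a]))
    )


TOPOLOGIES = {
    "AB|CD": (("A", "B"), ("C", "D")),  # the assumed true tree
    "AC|BD": (("A", "C"), ("B", "D")),  # groups the two "long" branches
    "AD|BC": (("A", "D"), ("B", "C")),
}

scores = {name: parsimony_score(SEQS, topo) for name, topo in TOPOLOGIES.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(name, score)
print("most parsimonious:", best)
```

With this made-up alignment the tree that groups A with C needs 10
changes, while the assumed true tree needs 12, so parsimony picks the
wrong tree. Real data are messier, but the mechanism is the same.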
>>One of the reasons ribosomal RNA is a popular molecule for estimating
>>phylogeny is that it minimizes many of the problems given above. This
>>is not to suggest that *only* ribosomal RNA is good for phylogeny, but
>>genes should be chosen with some amount of care to minimize these
>>problems.
>Many workers in the field have stated that nucleic acid sequences are much
>less reliable than amino acid sequences. I agree with them.
What exactly do you mean here? If you mean protein sequencing is more
accurate than nucleotide sequencing, you are correct. However, almost
all "protein sequences" created today are inferred from nucleotide
sequences, making this argument irrelevant.
>also other problems associated with the use of rRNAs.) It seems to me that
>the use of ribosomal RNA sequences to reclassify bacteria willy-nilly
>is bound to lead to an extraordinary amount of confusion later on. I can
>think of no possible set of reasons that justifies using rRNA sequences in
>preference to, say, HSP70 or GDH. Can you?
I can. It involves thinking a bit about phylogeny. Why is it that
nobody builds bacterial phylogenies from antibiotic resistance genes
(besides the fact that not all bacteria have them)? It's because people
realize there is incredible selective pressure for bacteria to be
resistant to antibiotics, and therefore mutations that confer
resistance are highly selected for. In addition, bacteria that pick up
an antibiotic resistance gene via horizontal transfer are also
selected for. So it is clear from this example that such genes would
not be good for phylogeny estimation.
In effect what one would like to have for phylogeny estimation is
sequences immune from selective pressure for improvement. The reason
ribosomal RNA is so useful for phylogeny is that while it is under
huge selective pressure to maintain its function, it doesn't appear to
be under pressure to *improve* its function. (If it were, we would see
rRNAs with different amounts of activity in in-vitro translation
systems, and we just don't see this)
Getting back to HSP70 and GDH -- have people given much thought to
this problem? It seems to me enzymes are under constant pressure to
increase activity. Do all GDHs (for example) have equal activity? If
not, couldn't independent evolution have generated the same improved
This is what I meant in an earlier post by "not all data being
equal". Perhaps supporters of HSP70 and GDH phylogeny have given a lot
of thought to their choice of molecule, and this criticism is not
valid in this case, but from the outside it seems like they merely
chose molecules that had been sequenced a lot already in order to save