In article <schwarze-220594183224 at fennel.bio.caltech.edu>, schwarze at starbase1.caltech.edu (Erich Schwarz) writes:
> In most analyses of molecular evolution it is assumed that, if two
> protein sequences are similar to one another in a statistically significant
> manner, they have evolved divergently from one another. Furthermore, this
> assumption is often implicitly held not merely for global similarities
> (across hundreds of amino acids), but for the smaller "motifs" (sometimes
> only a dozen or so residues long) that can be discerned by multiple
> sequence alignment of otherwise dissimilar proteins.
>
> However, one can imagine that motifs, even if their simultaneous
> existence in several proteins is statistically significant, might have
> arisen by convergent rather than divergent evolution.
>
> Which brings me to my question for the "group mind" out there: is there
> any reasonable algorithm by which one might distinguish motifs that had
> arisen from convergent rather than divergent evolution? Intuitively it
> seems that, by the time a "motif" becomes 60 or so amino acids long (e.g.,
> the homeo box), it is probably impossible for it to have arisen
> convergently. Meanwhile, one instance of three residues (the catalytic
> triad in trypsin et al. versus subtilisin et al.) is probably a rock-solid
> case of convergent evolution. How does one draw a line through the middle
> ground between these extremes?
>
> (Since the most useful "algorithms" would be software, I am tossing
> this question into the bionet.software subgroup as well.)
>
> Thanks to all of you in advance for any ideas you have. Either posts
> or e-mail would be great.
>
> --Erich Schwarz / schwarze.ccomail at starbase1.caltech.edu
I think this is an interesting question, and one I have discussed with various
people over the years. Here is a neat example of what I think is convergent
evolution: the RNP-1 motif (an octamer) found in RNA-binding proteins was
also shown to have significant matches to the Y-box transcription factors
and cold shock proteins in bacteria (I think that's right).
However, this RNP-1 motif in RNA-binding proteins is part of a much
larger motif of about 80 amino acids. When you look at all eighty amino acids,
it is clear that the other elements don't fit... so is this just a case of
mistaken identity?
But this all changed when the solution structure (and crystal
structure, I think) was solved for a cold-shock protein and compared to the
already known crystal structure of the RNA-binding protein U1A.
The comparison showed that, although the fold is entirely different
in the two proteins, each RNP-1 motif sits in a beta strand with a
similar environment, and in each case it is the RNP-1 motif that is
thought to do the binding to nucleic acid. Neat eh?
So here's an example where sequence analysis really did seem to
pick up convergent evolution, which I'm quite impressed by.
But then I'm easily impressed.
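For what it's worth, the first step above (the octamer matches, but the rest of the 80 or so residues around it doesn't) can be roughed out in a few lines of Python. This is only a sketch under loud assumptions: the function names and the toy sequences are made up for illustration, the flanks are compared ungapped, and raw percent identity stands in for a proper significance test. A high motif identity with flank identity near background is a hint worth a closer look, not a verdict on convergence; in the example above it took the structures to settle it.

```python
def identity(a, b):
    """Fraction of identical positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def motif_vs_flank_identity(seq_a, seq_b, start_a, start_b, motif_len, flank):
    """Return (motif identity, flank identity) for a motif matched in two sequences.

    start_a / start_b are 0-based motif start positions. Flanks are compared
    without gaps (an assumption) and trimmed to what both sequences have.
    """
    end_a, end_b = start_a + motif_len, start_b + motif_len
    if start_a < 0 or start_b < 0 or end_a > len(seq_a) or end_b > len(seq_b):
        raise ValueError("motif does not fit inside both sequences")

    # Use only as much flanking sequence as both proteins actually provide.
    left = min(flank, start_a, start_b)
    right = min(flank, len(seq_a) - end_a, len(seq_b) - end_b)

    flank_a = seq_a[start_a - left:start_a] + seq_a[end_a:end_a + right]
    flank_b = seq_b[start_b - left:start_b] + seq_b[end_b:end_b + right]

    motif_id = identity(seq_a[start_a:end_a], seq_b[start_b:end_b])
    return motif_id, identity(flank_a, flank_b)


# Toy, entirely artificial sequences: a shared octamer in unrelated context.
seq_a = "ACDEF" + "WWHHWWHH" + "GIKLM"
seq_b = "NPQRS" + "WWHHWWHH" + "TVYAC"
motif_id, flank_id = motif_vs_flank_identity(seq_a, seq_b, 5, 5, 8, 5)
print(motif_id, flank_id)
```

On real proteins you would want gapped alignment of the flanks and a comparison against shuffled-sequence background before reading anything into the numbers.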
ewan birney
birney at molbiol.ox.ac.uk