"Ahmed Moustafa" <ahmed from pobox.com> wrote in message
news:mailman.165.1172707327.5139.bio-soft from net.bio.net...
> On 2/28/2007 7:45 AM, Kevin Karplus wrote:
>> On 2007-02-28, Ahmed Moustafa <ahmed from pobox.com> wrote:
>>> But MSA methods perform the pairwise alignments anyway as an initial
>>> step to cluster or join the sequences in the multiple alignment. Also
>>> MSA methods are approximations, while pairwise alignment methods compute
>>> the optimal alignments, so they give absolute values representing the
>>> relatednesses (or distances) between the sequences.
>>
>> Only very slow (and ancient) MSA methods start with full pairwise
>> alignments.
>>
>> Multiple sequence alignments are better alignments than independent
>> pairwise ones, because they can better disambiguate alignments where
>> the signal is weak. The evolutionary distance measures from MSAs are
>> crude, but the ones from pairwise alignments are often even
>> cruder.
>
> Regardless of the evolutionary distance, my guess is that for closely
> related sequences, pairwise alignments would be more sensitive than
> multiple alignment and resolve the relationships between the sequences and
> cluster them accurately.
>
That will only begin to be approximately true in cases where the sequences
are so close that you get the same answer by either route.
Pairwise alignment with dynamic programming is guaranteed to give the
optimal score. That is correct. What is not correct is how you use that
score. You wish to use it to estimate an evolutionary distance. Those
estimates will be better if they come from the multiple alignment.
If the sequences are very distantly related (as Kevin said already), those
estimates will be terrible if they come from the pairwise alignments.
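
To make the score-versus-distance point concrete, here is a minimal Python
sketch. The thread does not say which distance estimator is meant, so the
choice of the p-distance with a Jukes-Cantor correction is an assumption made
purely for illustration, and the function names are hypothetical. The thing to
notice is that the distance is computed from the aligned columns, so it is
only as good as the placement of the gaps, however optimal the score was.

```python
import math


def p_distance(aln_a, aln_b):
    """Fraction of differing residues over columns where neither row has a gap."""
    pairs = [(x, y) for x, y in zip(aln_a, aln_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    diffs = sum(1 for x, y in pairs if x != y)
    return diffs / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for nucleotides: -3/4 * ln(1 - 4p/3)."""
    if p >= 0.75:
        # The correction is undefined once the sequences look saturated.
        return math.inf
    return -0.75 * math.log(1 - 4 * p / 3)


# Two rows taken from some alignment (pairwise or multiple).
d = jukes_cantor(p_distance("ACGT", "ACGA"))
```

Move a gap by one column in either row and the p-distance can change, which
is why uncertain gap placement feeds straight into the distance estimate.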
If you take two very distant sequences and align them, there is huge
uncertainty as to where exactly the gaps go. That uncertainty is slightly
less if you do a multiple alignment first. You can get a guaranteed optimal
alignment between the two sequences by doing it pairwise, but it will not
mean much, and it is only optimal with respect to the particular parameters
you choose anyway.
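
As a rough illustration of "optimal only for the parameters you choose", here
is a small Python sketch of global pairwise alignment by dynamic programming
(Needleman-Wunsch with a linear gap penalty). The function name, the toy
sequences and the scoring values are all assumptions picked for the example,
not anything a particular program uses.

```python
def nw_align(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment with a linear gap penalty.

    Returns (optimal score, aligned a, aligned b).
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back one optimal path (ties are broken in favour of the diagonal).
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


# Same two sequences, two gap penalties.
strict = nw_align("ACGT", "ATGC", gap=-5)  # expensive gaps
loose = nw_align("ACGT", "ATGC", gap=0)    # free gaps
```

With the expensive gap penalty the optimal alignment is the ungapped one;
with free gaps the optimal alignment contains gaps and has a different score.
Both are "guaranteed optimal", each for its own parameters.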
Des