Gribskov's stupid comment

Michael Gribskov gribskov at SDSC.EDU
Tue Jun 30 11:24:46 EST 1992

Bill Pearson rightly points out that my comment in article 
<920629220811.21a51350 at sds.sdsc.edu> gribskov at SDSC.EDU (Michael Gribskov) 
was in error.

>> (gribskov)

> (Pearson)

>>  As you may recall, the NW algorithm (Needleman
>>  and Wunsch, J. Mol. Biol. 48, 443-453, 1970) ... requires a scoring table
>>  with all positive values since only the last row and column of the alignment
>>  matrix are examined for the maximum score. A scoring table with negative
>>  values is not guaranteed to give an optimum alignment with the NW
>>  algorithm. ...

>	I believe this statement to be incorrect.  While Needleman and
>Wunsch used a length independent gap penalty, there was no requirement
>that the substitution values be positive.  A global algorithm, such as
>N&W, does require that score be that calculated for the last row and
>column, but it places no limits on the scoring matrix.  A globally
>optimal score will be found with N&W even if all the substitution
>values are negative.
>	It is true, of course, that if the substitution matrix has
>both positive and negative scores, it is likely that there will be a
>locally optimal score (and alignment) that is different from the
>globally optimal score.  And, if all the scoring matrix values (and
>gap penalties) are 0 or greater, the optimal global and local scores
>(and alignments) will be the same.
>Bill Pearson

I see and agree with Bill's point.  If you use a scoring table with both
positive and negative values you get the optimum alignment THAT INCLUDES
THE END OF AT LEAST ONE OF THE SEQUENCES (or both if you count end gaps
as part of the alignment). It is unfair to fault the algorithm for not
finding alignments that end internally, and have better overall scores
(aligned residues minus gaps) since global algorithms such as NW
specificaly DO NOT try to find such alignments.  My real point should
have been that I think a local alignment approach is likely to make more
sense, particularly for distantly related sequences. 

The problem is the definition of optimum.  This definition varies
between global and local algorithms.  Since I usually deal with local
similarity algorithms (e.g. Smith and Waterman), I incorrectly used the
definition of optimum for the wrong algorithm. Only if you define
optimum as the highest scoring alignment ending at any residue pair is
my original statement correct. 

Michael Gribskov
San Diego Supercomputer Center

gribskov at sdsc.edu
(619) 534 - 8312

More information about the Bio-soft mailing list

Send comments to us at biosci-help [At] net.bio.net