Sellers Pattern-matching algorithm

Kerr Hatrick k-hatric at nimr.mrc.ac.uk
Mon Mar 22 07:18:11 EST 1993

 Are there any Dynamic Programmers out there able to give

 a puzzled PhD student help with the following problem ?

	Sellers' pattern matching algorithm enables the 

 calculation of the best global alignment between two 

 sequences . In addition it allows the comparison of this 

 alignment with other suboptimal alignments ( cf Sellers

 in "Time warps, String edits and Macromolecules: The theory

 and practice of sequence comparison", Sankoff and Kruskal

 (eds), Addison-Wesley (1983) ). Comparison of alignments

 is based on a distance metric, especially useful for 

 evolutionary comparisons.

 As such, I thought it would be ideal for drawing trees.

 Unfortunately, the algorithm as specified in the

 aforementioned book does not seem to behave quite as well

 as I expected it to : in particular, the distance between

 sequence(a) and pattern(b) is not equivalent to the 

 distance between sequence(b) and pattern(a). This is a

 direct consequence of the fact that the pattern can start

 anywhere in the length of the sequence. For instance, when

 the sequence is "puzzledPhDstudent" and the pattern is

 "student" the distance is 0. When the sequence and pattern

 are switched, the distance is 10 (using Identity Distance

 Matrix, gap penalty = insertion penalty = 1). The increased

 distance is due to the cost of inserting "puzzledPhD".

 Can anyone suggest a way of modifying the algorithm so that

 d(a,b) = d(b,a)? 

	Yours Hopefully,
			   Kerr Hatrick.

More information about the Bio-soft mailing list

Send comments to us at biosci-help [At] net.bio.net