Combining PFGE profiles to give a tree

Joe Felsenstein joe at evolution.genetics.washington.edu
Sat May 11 10:01:47 EST 1996

In article <4mvjdq$nnq at mserv1.dl.ac.uk>,
Dr. J.P. Clewley <jclewley at hgmp.mrc.ac.uk> wrote:
>I would like to take PFGE profiles (and possibly also ribotypes and
>PCR-RFLP patterns) of different strains of the same organism, and 
>generated with several different restriction enzymes, and combine the
>data to produce a phenetic (phylogenetic) tree. 
>For the PFGE profiles neither the location of the cutting sites nor
>the precise number of sites can be known. Thus, RESTML cannot be
>used as it assumes that the sites are mapped. Is this correct?

Quite correct.  It assumes that the presence/absence of individual sites
can be scored, not just that of individual fragments.

>Therefore, I propose to use the equation of Nei and Li (PNAS 76: 5269, 1979)
>D = 1 - 2(n(xy))/(n(x)/n(y)) as described by Vilgalys and Hester (J Bact 172:
>4238, 1990) and Gurtler et al (J Gen Microbiol 137: 2673, 1991). 
>In this approach the individual distance matrices of D values are averaged
>to produce a single matrix for e.g. FITCH. Is this valid?

In principle it is valid, but do go back to Nei and Li's paper for your
formula.  Aside from a typo that I see (n(x) is not divided by n(y); the
denominator is n(x) + n(y)), you need to have a distance that is additive
across branch lengths.  That
is, if we evolve along branch X we accumulate distance D (on average),
if we evolve along branch Y we accumulate distance D', and if we
evolve along one followed by the other, we accumulate D+D'.  Arbitrary
dissimilarity formulas do not have this property.  In the formula you
give the distance reaches a maximum of 1, so it cannot be additive
(i.e. if branch X gives distance 0.6, and so does branch Y, you want
branch X followed by branch Y to give 1.2, not 1.0).

I think that Nei and Li's paper will give a distance formula that is
additive (under a simple model of DNA evolution) and I would use that.
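
The difference between a bounded dissimilarity and an additive distance
can be sketched in a few lines of Python.  This is only an illustration
under stated assumptions, not a substitute for the paper: the function
names are hypothetical, bands are assumed to have already been matched
into discrete size classes, and the fragment formulas used here
(F = G^4/(3 - 2G), solved iteratively for G, then d = -(2/r) ln G, with
r the length of the enzyme's recognition sequence) are quoted from
memory of Nei and Li (1979) and should be checked against the original
before use.

```python
import math


def shared_fraction(bands_x, bands_y):
    """Fraction of shared fragments, F = 2*n_xy / (n_x + n_y).

    bands_x, bands_y: iterables of band labels (e.g. binned fragment sizes).
    Assumption: two bands are 'the same' only if their labels are equal.
    """
    x, y = set(bands_x), set(bands_y)
    if not x and not y:
        raise ValueError("both profiles are empty")
    return 2 * len(x & y) / (len(x) + len(y))


def additive_distance(f, r=6, tol=1e-12, max_iter=1000):
    """Convert F into an estimated number of substitutions per site.

    Assumed model (to be verified against Nei and Li 1979):
    F = G**4 / (3 - 2*G), solved by iterating G = (F*(3 - 2*G))**0.25
    from the starting value G = F**0.25, then d = -(2/r) * ln(G).
    r is the recognition-sequence length (6 is only a default guess).
    """
    if not 0 < f <= 1:
        raise ValueError("F must lie in (0, 1]; F = 0 gives an infinite distance")
    g = f ** 0.25
    for _ in range(max_iter):
        g_new = (f * (3 - 2 * g)) ** 0.25
        converged = abs(g_new - g) < tol
        g = g_new
        if converged:
            break
    return -(2 / r) * math.log(g)


if __name__ == "__main__":
    strain_a = [1, 2, 3, 4]   # hypothetical band classes
    strain_b = [3, 4, 5, 6]
    f = shared_fraction(strain_a, strain_b)
    print("F =", f, " bounded D = 1 - F =", 1 - f)
    print("additive d =", additive_distance(f))
```

Note that 1 - F can never exceed 1, whereas the transformed value grows
without bound as F approaches 0, which is what lets it add up along
successive branches.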

The strategy of adding up distance measures is in general fine.
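
Combining the per-enzyme matrices is then an element-by-element
operation.  A minimal sketch, assuming each enzyme yields a square
matrix of distances over the same strains in the same order (the
function name and the example numbers are made up for illustration);
averaging is just adding up and dividing by the number of matrices:

```python
def average_matrices(matrices):
    """Element-wise mean of several square distance matrices.

    matrices: list of n x n nested lists, one per enzyme or marker type,
    all with strains in the same order (an assumption of this sketch).
    """
    if not matrices:
        raise ValueError("need at least one matrix")
    n = len(matrices[0])
    for m in matrices:
        if len(m) != n or any(len(row) != n for row in m):
            raise ValueError("all matrices must be square and the same size")
    k = len(matrices)
    return [[sum(m[i][j] for m in matrices) / k for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    enzyme_1 = [[0.0, 0.2], [0.2, 0.0]]   # invented distances
    enzyme_2 = [[0.0, 0.4], [0.4, 0.0]]
    for row in average_matrices([enzyme_1, enzyme_2]):
        print(row)
```

The resulting single matrix is what would be written out for a
distance-matrix program such as FITCH.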

Joe Felsenstein         joe at genetics.washington.edu     (IP No.
 Dept. of Genetics, Univ. of Washington, Box 357360, Seattle, WA 98195-7360 USA
