Gap penalties, PAM matrices and so on

Gaston Gonnet gonnet at inf.ethz.ch
Fri Jun 26 15:59:21 EST 1992


In article <1992Jun26.103554.12221 at gserv1.dl.ac.uk> BIONET at EARN.FRCGM51 writes:
> 
>- "mutation matrices ... differ, depending on whether they were derived
>  from protein pairs that are distantly homologous or from protein pairs
>  that are closely homologous". What a discovery !!
> 
Mark Cohen responded to most of your points, but he did not elaborate
on this one.  I believe you are mistaking this result for a triviality.
What we say is that a Dayhoff matrix derived from data in a certain PAM
band (say 10-15 for example) and then normalized to 250 is different from
a Dayhoff matrix derived from another PAM band (say 50-60) and also
normalized to PAM 250.  That is, both Dayhoff matrices are computed
for the same amount of evolution, yet they are different.  Of course
some differences will always be found, because we are
approximating a matrix with a finite sample.
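
To make the comparison concrete, here is a minimal sketch, assuming
that "normalized to 250" means the usual Markov extrapolation by matrix
powers (a matrix for PAM distance d raised to the power 250/d).  The
3-letter alphabet, the matrix M1 and the helper names are hypothetical
and chosen only for illustration; this is not the code we used.  If
the model were exact, a matrix from a band near PAM 10 and one from a
band near PAM 50 would give the same PAM 250 matrix:

```python
# Toy illustration: under a strictly Markovian model, matrices taken
# at different PAM distances extrapolate to the same PAM 250 matrix.

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(m, k):
    """Raise a square matrix to a non-negative integer power."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    base = m
    while k > 0:
        if k & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        k >>= 1
    return result

# Hypothetical 1-PAM mutation matrix on a 3-letter alphabet.
# Each row sums to 1 and each letter changes with probability 0.01.
M1 = [[0.990, 0.006, 0.004],
      [0.003, 0.990, 0.007],
      [0.005, 0.005, 0.990]]

band_10 = mat_pow(M1, 10)   # what an ideal sample at PAM 10 would give
band_50 = mat_pow(M1, 50)   # what an ideal sample at PAM 50 would give

from_10 = mat_pow(band_10, 25)  # extrapolate PAM 10 -> PAM 250
from_50 = mat_pow(band_50, 5)   # extrapolate PAM 50 -> PAM 250

max_diff = max(abs(from_10[i][j] - from_50[i][j])
               for i in range(3) for j in range(3))
print("largest entry difference at PAM 250:", max_diff)
```

In this idealized setting the two results agree up to rounding, so any
systematic disagreement in real data has to come from sampling error
or from a failure of the model itself.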

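The non-trivial finding is that these differences (we studied 10 PAM
bands) are not random but show definite patterns.  Under a strictly
Markovian model, matrices from different bands would agree, up to
sampling error, once normalized to the same PAM distance.  This is the
first proof that the Markovian model of mutations is not completely
correct.  If you wish, I can give you a more formal explanation of this.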

The pattern unequivocally indicates that for short evolutions the
genetic code has some impact, while for longer evolutions this
impact vanishes in favour of other chemical/physical properties.
May I suggest that you read this section with more care.

Having read some comments from other people, I realize that some
readers may not be familiar with PAM distances.  A PAM is a unit of
mutation, or amount of evolution, defined originally by Dayhoff et al.
Dayhoff computed similarity matrices which were based on a PAM distance
of 250, since she was interested in studying remote homologies.  We
have derived methods that can estimate the distance between two aligned
sequences in terms of PAM units and hence we can classify our alignments
by PAM distance.  This is a sound measure of evolutionary distance in
the sense that in the evolution from A to B to C (a single line of descent),
the estimate of the distance between A and B plus the estimate of
the distance between B and C coincides with the estimate of the
distance between A and C.
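
The additivity can be sketched on a toy example.  Everything here is
an assumption made for illustration: the 3-letter alphabet, the
symmetric 1-PAM matrix M1, the uniform letter frequencies FREQ, and the
estimator itself, which simply finds the PAM distance whose expected
fraction of unchanged positions matches the observed one.  It is a
stand-in, not the estimation method we actually use.

```python
# Toy illustration: PAM distance estimates add up along a line of descent.

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Hypothetical symmetric 1-PAM matrix and uniform letter frequencies.
M1 = [[0.990, 0.005, 0.005],
      [0.005, 0.990, 0.005],
      [0.005, 0.005, 0.990]]
FREQ = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]

def evolve(k):
    """Mutation matrix after k PAM units, by repeated multiplication."""
    m = identity(3)
    for _ in range(k):
        m = mat_mul(m, M1)
    return m

def expected_identity(m):
    """Expected fraction of positions left unchanged under matrix m."""
    return sum(f * m[i][i] for i, f in enumerate(FREQ))

def estimate_pam(observed_identity, max_pam=1000):
    """Smallest k whose expected identity has dropped to the observed one."""
    m = identity(3)
    for k in range(max_pam + 1):
        if expected_identity(m) <= observed_identity + 1e-12:
            return k
        m = mat_mul(m, M1)
    raise ValueError("observed identity is below the range searched")

a_to_b = evolve(30)               # A -> B, 30 PAM units
b_to_c = evolve(40)               # B -> C, 40 PAM units
a_to_c = mat_mul(a_to_b, b_to_c)  # A -> C through B

d_ab = estimate_pam(expected_identity(a_to_b))
d_bc = estimate_pam(expected_identity(b_to_c))
d_ac = estimate_pam(expected_identity(a_to_c))
print(d_ab, d_bc, d_ac)
```

Note that the observed fraction of changed positions is not itself
additive, because repeated mutations at the same position hide earlier
ones; it is the PAM estimate that adds up.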

Gaston H. Gonnet.



