GCG V. 9.0-GAP

Jack Leunissen jackl at caos.kun.nl
Mon May 26 08:27:00 EST 1997

Iddo Friedberg wrote:
> I've encountered an interesting phenomenon in GCG v. 9.0, UNIX.
> Try the following:
> gap -in1=sw:lyg_ansan -in2=sw:kad3_bovin  -gap=5 -len=3 -out=out1.pair
> gap -in1=sw:kad3_bovin -in2=sw:lyg_ansan  -gap=5 -len=3 -out=out2.pair
> Now look at the results in files out1.pair and out2.pair
> Seems like the results are dependent upon sequence order. AFAIK, that
> shouldn't be the case in pairwise sequence alignment.

This idiotic behaviour is caused by the comparison matrices that GCG
now uses for sequence comparison and alignment. You will see similar
(bad) behaviour with COMPARE and DOTPLOT.

The actual cause of the problem is in fact the numbers on the main
diagonal in the comparison matrix: these numbers reflect the self-
matches of amino acids. In version 8 these numbers were artificially
set to one and the same value, in order to facilitate the scoring of 
Version 9 now uses the original values on the diagonal, which are
different for the amino acids, reflecting the "conservativeness" of
amino acids. Although this is correct in itself, for some reason it
doesn't work with the version 9 GCG programs!!!
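
To illustrate why unequal diagonal values are "correct in themselves",
here is a minimal Python sketch of a global alignment score. It is not
GCG's GAP code: it assumes a linear gap penalty (GAP uses separate gap
creation and extension penalties), and the three-residue matrix and the
name global_score are made up for the example, with values taken from
the standard BLOSUM62 table. With a symmetric matrix the optimal score
should not depend on the order of the two sequences.

```python
# Symmetric excerpt of BLOSUM62 for residues A, C and W (illustrative only).
SCORES = {
    ("A", "A"): 4, ("A", "C"): 0, ("A", "W"): -3,
    ("C", "A"): 0, ("C", "C"): 9, ("C", "W"): -2,
    ("W", "A"): -3, ("W", "C"): -2, ("W", "W"): 11,
}


def global_score(seq_a, seq_b, score, gap):
    """Optimal global alignment score (Needleman-Wunsch, linear gap penalty)."""
    # prev holds the previous row of the dynamic programming table.
    prev = [-gap * j for j in range(len(seq_b) + 1)]
    for i, a in enumerate(seq_a, start=1):
        curr = [-gap * i]
        for j, b in enumerate(seq_b, start=1):
            curr.append(max(
                prev[j - 1] + score[(a, b)],  # align a with b
                prev[j] - gap,                # a aligned to a gap
                curr[j - 1] - gap,            # b aligned to a gap
            ))
        prev = curr
    return prev[-1]


forward = global_score("ACWA", "AWC", SCORES, 5)
reverse = global_score("AWC", "ACWA", SCORES, 5)
print(forward == reverse)
```

Note that only the score is guaranteed to be the same here; when several
alignments tie for the best score, which one gets reported can still
depend on the traceback.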

To illustrate this, you might try the following: manually change all
the diagonal values in the BLOSUM62.CMP matrix to one value (I put
them all to 11, which is the W-W selfmatch). Now you will see that
the alignment is reasonable, and is the same for both orientations!
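
The edit described above amounts to the following, shown as a small
Python sketch. It works on an in-memory table rather than on the .CMP
file itself (the file format is not parsed here), and the three-residue
excerpt and the name flatten_diagonal are assumptions made for the
example; the scores are the standard BLOSUM62 values.

```python
# Illustrative excerpt of BLOSUM62 scores for residues A, C and W.
# An in-memory stand-in, not a parser for the GCG .CMP file format.
BLOSUM62_EXCERPT = {
    ("A", "A"): 4, ("A", "C"): 0, ("A", "W"): -3,
    ("C", "A"): 0, ("C", "C"): 9, ("C", "W"): -2,
    ("W", "A"): -3, ("W", "C"): -2, ("W", "W"): 11,
}


def flatten_diagonal(matrix, value):
    """Return a copy of matrix with every self-match score set to value."""
    return {
        (a, b): (value if a == b else score)
        for (a, b), score in matrix.items()
    }


flattened = flatten_diagonal(BLOSUM62_EXCERPT, 11)
# Collect the distinct self-match scores that remain after the change.
diagonal = sorted({score for (a, b), score in flattened.items() if a == b})
print(diagonal)  # [11]
```

The off-diagonal scores are left untouched, so only the self-matches
are changed, as in the manual edit.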

One more reason to stick to version 8 for the time being....


   Jack A.M. Leunissen       | Email: jackl at caos.kun.nl
   CAOS/CAMM Center          | Tel. : +31 24 365 22 48
   University of Nijmegen    | Fax  : +31 24 365 29 77
   Nijmegen, The Netherlands | Www  : http://www.caos.kun.nl

