Pairwise identity from multiple alignment with DISTANCES

Christos Ouzounis ouzounis at embl-heidelberg.de
Tue Dec 20 09:54:39 EST 1994

In article <3ct5dk$psr at mserv1.dl.ac.uk>, jackl at camms1.caos.kun.nl (Jack Leunissen) writes:
> Stephane Vuilleumier wrote:
>> I previously used DISTANCES with parameter set to 1.5 to calculate 
>> pairwise identities between sequences in a multiple alignment. In the
>> new GCG version now installed at our site, this possibility has (as far as I
>> can tell) disappeared (choice between uncorrected, Jukes-Cantor and 
>> Kimura distances), and the output is the number of substitutions per
>> 100 nucleotides. 
>> Can I still use GCG to do pairwise identities from an MSF alignment in 
>> this version of the program?
> You're probably referring to the extended version of DISTANCES, called
> HOMOLOGIES (yes, I know, very imaginative) that can be used to calculate
> either similarities or distances, augmented distances (Jukes-Cantor,
> Kimura), and produce PHYLIP input files. This program will be part of
> the EGCG (Extended GCG) distribution which is currently being produced
> by Peter Rice (EBI) et al.
> Don't despair; coming soon...!

Until HOMOLOGIES arrives, here is another solution. Take your x.msf file
containing your sequences, say x1, x2, ... x10. If you are interested in the
pairwise identity between x1 and x2 after multiple alignment, do this:

$ bestfit/{parameter list here} x.msf{x1} x.msf{x2}

with gap penalties as high as possible (say 100), so that BESTFIT has no
incentive to open new gaps. The alignment then remains the same as in the
multiple alignment, and you get the percent identity, percent similarity
and the other statistics (including gaps).
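
The same percent identity can also be computed outside GCG, directly from
two rows of the multiple alignment. What follows is only a minimal Python
sketch, not part of GCG. It assumes the two aligned rows have already been
extracted from x.msf as equal-length strings, and that '.', '-' and '~' are
the gap characters. The name pairwise_identity and the choice of
denominator (columns where both sequences have a residue) are assumptions
made for illustration; BESTFIT may count gapped positions differently, so
the two numbers need not agree exactly.

```python
# Hypothetical helper, not a GCG program: percent identity between two
# rows taken from the same multiple alignment.

GAP_CHARS = set(".-~")  # assumed gap symbols in the aligned rows


def pairwise_identity(row_a, row_b):
    """Return percent identity over columns where both rows have a residue."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")

    matches = 0
    compared = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        # Skip any column in which either sequence has a gap.
        if a in GAP_CHARS or b in GAP_CHARS:
            continue
        compared += 1
        if a == b:
            matches += 1

    # No residue is aligned against a residue: report 0 rather than divide by 0.
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    x1 = "AC.GT"  # toy rows, not real data
    x2 = "ACTGA"
    print(pairwise_identity(x1, x2))  # 3 matches in 4 compared columns: 75.0
```

Calling the function on every pair of rows gives the full table of pairwise
identities for x1 ... x10 without re-running the alignment.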

Good luck,

Christos Ouzounis
EMBL Heidelberg
