protein alignment display

Mathew Woodwark mathew.woodwark at bbsrc.ac.uk
Thu Feb 29 11:23:09 EST 1996

watson_j at bms.com (A. John Watson) wrote:
>In article <4g2gc0$1emc at bigblue.oit.unc.edu>, ewb at med.unc.edu (Edward W.
>Baptist) wrote:
>> Is there a program which will take a protein alignment (such as a GCG msf
>> file) and display all amino acids which match either a particular sequence
>> or a consensus sequence and display those  as either dots or dashes while
>> printing out only the ones that differ?
>> Ed Baptist
>> Lineberger Comprehensive Cancer Center
>> UNC
>MegAlign from DNAStar (608.258.7420) does that.  Not cheaply, though.
>John Watson
>Bristol-Myers Squibb Co.
>watson_j at bms.com
>"If you're not part of the solution, you're part of the precipitate."

Another possible solution (sic)

If you have a GCG .msf file, you presumably have GCG. If so, you can use the GCG 
program pretty, e.g. on an msf file called test.msf:

       % pretty -diff="-" -cons test.msf{*}

This will calculate a consensus using the pam250 table and a threshold of 1.0 for 
identity (you can raise this with the parameter -thr, so -thr=1.5 means absolute 
identity only), and then display only those residues that differ from the consensus. 
(And yes, I know it's confusing: the dashes are for those that agree, and the 
residues that differ are in lower case.)

The symbol in the quotation marks is up to you, but I would avoid dots as they are 
already used as gap characters.
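
For anyone without GCG, the masking step itself is easy to sketch. The Python below is only an illustration of the idea, not how pretty works: it assumes a plain plurality-vote consensus on identical residues (pretty scores with the pam250 table and a threshold), and the function names simple_consensus and mask_matches are made up for this example. It expects the aligned sequences as equal-length strings with dots as gap characters.

```python
from collections import Counter

GAP = "."  # GCG msf files use dots as gap characters


def simple_consensus(seqs, plurality=2):
    """Assumed simplification: the most common residue in each column wins
    if at least `plurality` sequences share it; otherwise the column gets '-'.
    (pretty itself scores with the pam250 table and a threshold.)"""
    consensus = []
    for column in zip(*seqs):
        counts = Counter(r.upper() for r in column if r != GAP)
        if counts:
            residue, n = counts.most_common(1)[0]
        else:
            residue, n = "-", 0
        consensus.append(residue if n >= plurality else "-")
    return "".join(consensus)


def mask_matches(seq, consensus, match_char="-"):
    """Replace residues that agree with the consensus by `match_char`,
    print differing residues in lower case, and leave gaps untouched."""
    out = []
    for residue, cons in zip(seq, consensus):
        if residue == GAP:
            out.append(GAP)
        elif residue.upper() == cons:
            out.append(match_char)
        else:
            out.append(residue.lower())
    return "".join(out)


# Hypothetical toy alignment, not taken from any real msf file
alignment = {"seq1": "MKTAY", "seq2": "MKSAY", "seq3": ".KTAF"}
cons = simple_consensus(list(alignment.values()))
for name, seq in alignment.items():
    print(f"{name:>10}  {mask_matches(seq, cons)}")
print(f"{'Consensus':>10}  {cons}")
```

Passing a different match_char changes the symbol used for agreeing positions; as with pretty, a dot would be a poor choice because it could not be told apart from a gap.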

Sample output should follow at the end of this message (otherwise much of it 
will not make sense).


Dr Mathew Woodwark                          mathew.woodwark at bbsrc.ac.uk
Molecular Biology Software Support          Tel: 01582 762271
BBSRC Computing Centre                      Fax: 01582 761710
West Common, Harpenden
Hertfordshire, AL5 2JE, UK

-------------- next part --------------
Plurality: 2.00  Threshold: 1.00  AveWeight 1.00  AveMatch 0.54  AvMisMatch -0.40

PRETTY of: pir1.msf{*}   February 29, 1996 16:14  ..

                1                                                   50
pir1.msf{ccch}  .....m---- ---------- s--------- ---------- ---------e 
pir1.msf{cccm}  ......---- ---------- ---------- ---------- ---------v 
pir1.msf{cccz}  ......---- -------m-- s--------- ---------- ---------p 
pir1.msf{ccca}  ......---- ---------- -------b-- ---v----w- ---------p 
pir1.msf{ccck}  papfeq-sak --atl-ktr- -------a-- p--v------ i-s-hs---e 

                51                                                 100
pir1.msf{ccch}  ---------- ---t------ ---------- ---------- ----s--v-- 
pir1.msf{cccm}  ---------- ---t------ ---------- ---------- ---------- 
pir1.msf{cccz}  -----a---- ---i------ ---------- --------v- ----e----- 
pir1.msf{ccca}  ---------s ---v-b---- ---------- ---------- ---------- 
pir1.msf{ccck}  ---------r a--e-a-p-- s--------- ------a-g- l--ak--n-- 

                101    110
pir1.msf{ccch}  -----d--sk 
pir1.msf{cccm}  -----k--n- 
pir1.msf{cccz}  -----k--n- 
pir1.msf{ccca}  -----s--s. 
pir1.msf{ccck}  -t--le-sk. 
     Consensus  IAYLK-AT-E 
