
Symbol comparisons in PRETTY

Peter Rice pmr at sanger.ac.uk
Mon Jul 10 07:45:04 EST 1995

In article <3tqua2$ko7 at cisun2000.unil.ch> vjongene at eliot.unil.ch (Victor Jongeneel) writes:
>I was trying to get PRETTY (GCG) to do a relatively simple thing, viz., 
>printing an alignment from a .MSF file with a consensus at the bottom, and 
>each sequence in the alignment with '-' at positions where it matches the 
>consensus, or the amino acid code at positions where it doesn't.  This is in 
>principle possible using the /CON and /DIF='-' command-line switches.
>The output however is not quite what you would expect.  It seems that the 
>consensus is calculated using a very permissive comparison table, and that 
>only one member of the permitted values in the consensus is displayed.  For 
>example, at one position PRETTY decided that W and R both matched the 
>consensus, which was given as W, while the sequences containing R just get 
>a '-' to indicate a match.  This of course is unacceptable.
>So the question(s): Is it the PrettyPep.Cmp file that does this?  Is there a 
>relatively easy way to change this behavior?

The problem is not with prettypep.cmp in this case, although I have seen
it cause confusion over consensus sequences from time to time.

The cause is the way in which PRETTY calculates the consensus.

Luckily, I believe a solution is available.

In EGCG, the PRETTYPLOT program started life as a much enhanced version
of PRETTY. PRETTYPLOT produces a boxed alignment as graphical output,
but for historical reasons it is also capable of generating text
output in the same style as PRETTY.

The consensus calculation in PRETTYPLOT is much improved. For your case,
the /NOCOLLISIONS option allows more than one residue to match the consensus
where all are above the plurality you specify.
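To make the idea concrete, here is a minimal Python sketch of a plurality-based consensus in which more than one residue may match at a column. It is not PRETTY or PRETTYPLOT code. The function names, the inclusive threshold (count >= plurality), the gap characters and the use of plain identity rather than a comparison table are all assumptions made for illustration.

```python
from collections import Counter

GAP_CHARS = ".~"  # assumed gap symbols; not taken from the GCG documentation


def column_consensus(column, plurality):
    """Return the set of residues whose count in this column reaches `plurality`."""
    counts = Counter(res for res in column if res not in GAP_CHARS)
    return {res for res, n in counts.items() if n >= plurality}


def mark_differences(sequences, plurality, match_char="-"):
    """Replace residues that match the column consensus with `match_char`.

    Returns (marked_rows, consensus), where consensus is a list holding,
    for each column, the sorted residues that reached the plurality
    (an empty string when none did).
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")

    consensus_sets = [
        column_consensus([seq[i] for seq in sequences], plurality)
        for i in range(length)
    ]
    marked_rows = [
        "".join(
            match_char if res in consensus_sets[i] else res
            for i, res in enumerate(seq)
        )
        for seq in sequences
    ]
    consensus = ["".join(sorted(s)) for s in consensus_sets]
    return marked_rows, consensus


if __name__ == "__main__":
    alignment = ["ACW", "ACW", "ACR", "ACR", "GCK"]
    rows, cons = mark_differences(alignment, plurality=2)
    for row in rows:
        print(row)
    print(cons)
```

With a plurality of 2, both W and R reach the threshold in the last column, so the sketch reports both as the consensus there instead of showing only W. Raising the plurality to 3 leaves that column with no consensus, and every sequence keeps its own residue at that position.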

Text output is not the obvious way to use PRETTYPLOT, and as far as I
am aware nobody has been using it, so it may need testing and some
changes. In particular, I suspect the /DIFF option may upset the way
the consensus is stored.

EGCG is available from ftp.sanger.ac.uk in directory pub/pmr/egcg8
or pub/pmr/egcg8vms.

Peter Rice                           | Informatics Division
E-mail: pmr at sanger.ac.uk             | The Sanger Centre
Tel: (44) 1223 494967                | Hinxton Hall, Hinxton,
Fax: (44) 1223 494919                | Cambs, CB10 1RQ
URL: http://www.sanger.ac.uk/~pmr/   | England

More information about the Info-gcg mailing list
