I'm using ClustalW 1.74 for the Mac to compare protein database
sequences to weed out redundant/repeated occurrences. I use the
default parameters and prepare the multiple sequence file in a
random order. One strange thing I found was that when I tried to
align two identical sequences (296 aa) with two somewhat scrambled but
somewhat related fragments (e.g. 100 aa), Clustal completely failed to
spot the identical nature of the two long forms and "aligned" them in
a very staggered manner. I had thought that the initial clustering
was designed to stop these things happening. Normally Clustal (since
version W) has behaved impeccably for me and happily aligns sequences
with big disparities in length (e.g. a plasmid+insert sequence with
a small piece of the insert) so I was surprised to see this.
Changing the order of the sequences in the file did not help.
Has anyone else come across this, and how did you cope?
(i.e. can you juggle the parameters, or do you have to
perform pairwise alignments before constructing the multiple
alignment?)
I'll test the same input file tonight with the DOS and Linux
versions just to make sure it's not a Mac-specific effect.
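
For the exact-duplicate part of the job, no alignment should be needed at
all. A minimal sketch in Python (not part of ClustalW; the function names
are made up for illustration, and it assumes the input is plain FASTA text)
that drops identical sequences before the file goes to Clustal:

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def drop_exact_duplicates(records):
    """Keep the first record for each distinct sequence (case-insensitive).

    Returns (unique, duplicates), where duplicates pairs each dropped
    header with the header of the record it is identical to.
    """
    seen = {}  # upper-cased sequence -> header of its first occurrence
    unique, duplicates = [], []
    for header, seq in records:
        key = seq.upper()
        if key in seen:
            duplicates.append((header, seen[key]))
        else:
            seen[key] = header
            unique.append((header, seq))
    return unique, duplicates
```

This only catches sequences that are identical over their whole length;
fragments and near-identical entries would still have to be compared by
alignment.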
Bernard P. Murray, PhD
Dept. Cell. Mol. Pharmacol., UCSF, San Francisco, USA