ClustalW problem

Bernard P. Murray, PhD bpmurray*STUFFER* at socrates.ucsf.edu
Thu Dec 3 15:14:23 EST 1998


In a previous post I told you of a (Mac) ClustalW 1.74 problem
I encountered when aligning two identical sequences with two
fragments.  For those of you who are interested, I'll summarise the
helpful suggestions and the eventual solution to my problem.

It turned out that this was not a Mac-specific problem as
I could reproduce it with a variety of ClustalW versions
(1.6x and 1.7x) for DOS (DJGPP-compiled) and Linux (gcc).

Ashok Aiyar and Doug Ernisse suggested trying ClustalX
and, as I anticipated, I saw the same results with MacClustalX
and ClustalX for Win3.11.  My NCBI vibrant library is out of
date so I haven't compiled the Linux version with the
ClustalW 1.74 code but the 1.60 "engine" in ClustalX 1.3 for
Linux gave the same result.
     In fact, ClustalW should be the same as ClustalX as
Francois Jeanmougin, one of the authors of the latter, wrote...

In article <742tcr$due at news.u-strasbg.fr>, jeanmougin at igbmc.u-strasbg.fr wrote:

>         In fact it is the same algorithm, so it should give the same result.
>         Using ClustalX would help finding the good matrix. In such a case, it
>         is also usefull to log the process in a log file.
> 
>         If Bernard could send me both input and output file, I could have a
>         look at what really happen in Clustal. The difference in length is a
>         known problem that we are trying to work around, as described in the
>         recent TiBS computer corner.
> 
>                                                 François.

I have to admit that I have a problem with the different ClustalX
versions as they do not report which ClustalW "engine" they are
using.  I found, for example, that ClustalX 1.3 uses ClustalW 1.6;
you can only work this out by checking the source code.  I mainly
use ClustalW 1.74 as I haven't found a "ClustalX 1.74" or later.
This being said, I am a great admirer of ClustalX, especially
of how portable it is.

THE CURE
========
The cure for the problem was given in an e-mail from Julie Thompson,
one of the authors of ClustalW 1.7, who advised me:

>can I suggest that you try setting the 'Negative matrix' option
>in the multiple alignment parameters menu.

This solved the problem completely for all versions of ClustalW
and ClustalX.  However, before re-running the alignment it is
necessary either to delete the old .dnd guide tree file or to specify
a new filename for it; otherwise Clustal will insist on reusing the
old file, with very strange results.
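
For anyone running Clustal from a script rather than the menus, the two
steps above (clear the stale guide tree, then align with the negative
matrix option) can be sketched in Python as below.  This is only a
sketch under assumptions: the helper names are hypothetical, the .dnd
file is assumed to sit next to the input file with the same base name,
and the -INFILE, -ALIGN and -NEGATIVE switches are the ones listed in
the command-line help of later ClustalW releases, not verified against
1.74.

```python
from pathlib import Path


def stale_guide_tree(infile):
    """Return the .dnd path Clustal is assumed to reuse for this input."""
    return Path(infile).with_suffix(".dnd")


def prepare_realignment(infile, executable="clustalw"):
    """Delete any old guide tree and build the command for a fresh run.

    The switch names are assumptions taken from later ClustalW releases.
    """
    tree = stale_guide_tree(infile)
    if tree.exists():
        # Without this, Clustal would pick up the tree from the earlier run.
        tree.unlink()
    return [executable, f"-INFILE={infile}", "-ALIGN", "-NEGATIVE"]
```

The returned list can be handed to subprocess.run(..., check=True); the
function itself only removes the old tree and never starts Clustal.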

...and just to reply to the remainder of the posts...

In article <slrn769hbf.4jb.aiyar at ebv.oncology.wisc.edu>,
aiyar at aiyar.ml.org wrote:

> Hello Bernard -- First off thank you for sending me your program to
> extract sequences from files created by Strider 1.2, and my apologies
> for not responding in a more timely fashion.
..
> Cheers,
> Ashok

No worries.  I now have a Perl script (my first!) which should
do the same thing and is probably a little more flexible.
I can send you that as well if it will be useful.


     THANKS VERY MUCH to you all (especially Julie)

For the curious I include my ClustalW input file below.
This simply contains the four sequences retrieved when
looking for "heme oxygenase" and "chicken".  I used the
default parameters in Clustal.

>G123476:HO_CHICK:P14791
        1 metsqphnae smsqdlsell keatkevheq aentpfmknf qkgqvslhef klvtaslyfi
       61 ysaleeeier nkdnpvyapv yfpmelhrka alekdleyfy gsnwraeipc peatqkyver
      121 lhvvgkkhpe llvahaytry lgdlsggqvl kkiaqkalql pstgeglaff tfdgvsnatk
      181 fkqlyrsrmn alemdhatkk rvleeakkaf llniqvfeal qklvsksqen ghavqpkael
      241 rtrsvnkshe nspaagkese rtsrmqadml ttsplvrwll algfiattva vglfam
>G2735479:GGU95209:U95209
       1 metsqphnae smsqdlsell keatkevheq aentpfmknf qkgqvslhef k
>G104690:S09337
        1 nfqkgqvslx efklvtasly fiykaalekd leyfygsnwk hpellvahay tralqlpstg
       61 eglafftfdg vnatkfkqly vleeakkafl lniqvfealq klvsksq
>G104689:S15123
        1 metsqphnae smsqdlsell keatkevheq aentpfmknf qkgqvslhef klvtaslyfi
       61 ysaleeeier nkdnpvyapv yfpmelhrka alekdleyfy gsnwraeipc peatqkyver
      121 lhvvgkkhpe llvahaytry lgdlsggqvl kkiaqkalql pstgeglaff tfdgvsnatk
      181 fkqlyrsrmn alemdhatkk rvleeakkaf llniqvfeal qklvsksqen ghavqpkael
      241 rtrsvnkshe nspaagkese rtsrmqadml ttsplvrwll algfiattva vglfam
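
The records above mix FASTA-style ">" headers with numbered,
space-separated residue blocks.  As a rough sketch (function names are
hypothetical, and this is not part of Clustal), the following Python
strips the position numbers and spacing and reports which records carry
exactly the same sequence, which is a quick way to confirm the "two
identical sequences with two fragments" layout of the input.  It
expects only the sequence records, without the surrounding message
text.

```python
import re


def parse_numbered_fasta(text):
    """Parse '>' headers followed by numbered, space-separated residue lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            # Drop the leading position number and the gaps between blocks.
            records[name] += re.sub(r"[\d\s]", "", line)
    return records


def identical_pairs(records):
    """Return pairs of record names whose sequences are exactly equal."""
    names = list(records)
    return [(a, b)
            for i, a in enumerate(names)
            for b in names[i + 1:]
            if records[a] == records[b]]
```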
-- 
Bernard P. Murray, PhD
Dept. Cell. Mol. Pharmacol., UCSF, San Francisco, USA



