Jerry Learn wrote:
> Hello,
>
> Does anyone out there know if there is a maximum number of sequences that
> ClustalW can align? One of our users is trying to align several thousand
> 100 nucleotide sequences. It appears to be using about 750 meg of RAM.
>
> Jerry Learn
> Research Associate
> Health Sci. Ctr., Rm. K443-C |
> Dept. of Microbiology | Learn at u.washington.edu
> University of Washington | Phone: (206) 616-4286
> Box 357740 | FAX: (206) 616-1575
> Seattle, WA 98195-7740 USA |
Hello,
If you're using version 1.7(x) or 1.8 of ClustalW, then the maximum number
of sequences that can be aligned should depend only on the memory available in
your computer. Older versions used static memory arrays, and the limits
were hard-coded into the program.
The main problem with using ClustalW to align thousands of sequences is
the time taken to build the guide tree. As the Neighbour-Joining algorithm
takes O(N^3) time, this will be the longest step. In that case, as Kevin
Karplus points out, an alternative program may well be better.
(Although what his alignment of 10,000 proteins looks like, I cannot
imagine! The whole of Swiss-Prot only contains about 80,000 sequences!)
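To give a rough feel for how a cubic-time guide-tree step scales with the number of sequences, here is a small Python sketch. It only does the arithmetic; the function names and the 100-sequence baseline are made up for illustration and have nothing to do with ClustalW's own code or its actual running times.

```python
def pair_count(n: int) -> int:
    """Number of distinct sequence pairs among n sequences."""
    return n * (n - 1) // 2


def nj_relative_cost(n: int, baseline: int = 100) -> float:
    """Cost of an O(N^3) step for n sequences, relative to `baseline` sequences.

    This is a pure scaling estimate (assumed constant factors cancel out);
    it says nothing about absolute run time on any real machine.
    """
    return (n / baseline) ** 3


if __name__ == "__main__":
    for n in (100, 1000, 5000):
        print(f"{n:>6} sequences: {pair_count(n):>12,} pairs, "
              f"cubic step ~{nj_relative_cost(n):,.0f}x the cost for 100")
```

Under this estimate, going from 100 to 5,000 sequences multiplies the cost of the cubic step by 125,000, which is why the guide tree would be expected to dominate for inputs of several thousand sequences.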
Julie Thompson