On 2007-02-27, Bastien Chevreux <bach from chevreux.org> wrote:
> This is because there are no real good multiple alignment sequences that can
> tackle more than a few dozen sequences at once without an explosion of time
> and memory requirements. (Stoye et al have some nice publications on this).
Actually, there are some decent multiple sequence alignment algorithms
that don't explode. MUSCLE does fairly well up to a few thousand
sequences. HMM-based methods (though their multiple sequence alignments
are not quite as good) are linear in the number of sequences and do
fairly well up to tens of thousands of sequences.
MUSCLE does have an O(n^2) component, but it is a very fast
approximate distance measure used to build a guide tree for
progressive alignment. The distances are then recomputed from
the resulting MSA, and the progressive alignment is redone.
See http://www.drive5.com/muscle/ for more info.
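
To make the O(n^2) step concrete, here is a rough sketch of a k-mer
based distance on unaligned sequences, in the spirit of the fast
approximate measure mentioned above. It is not MUSCLE's code: the
function names, the default k=3, and the use of the raw alphabet are
all assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def kmer_counts(seq, k=3):
    """Count the overlapping k-mers of one unaligned sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(a, b, k=3):
    """Approximate distance: 1 - (shared k-mers / k-mers in shorter seq).

    Hypothetical helper, not MUSCLE's actual implementation.
    """
    shorter = min(len(a), len(b)) - k + 1
    if shorter <= 0:
        # One sequence is too short to contain any k-mer.
        return 1.0
    # Counter & Counter keeps the minimum count of each shared k-mer.
    shared = sum((kmer_counts(a, k) & kmer_counts(b, k)).values())
    return 1.0 - shared / shorter


def distance_matrix(seqs, k=3):
    """All-pairs distances for a dict of name -> sequence.

    This loop over pairs is the O(n^2) part; each comparison is cheap
    because no alignment is computed.
    """
    dist = {(name, name): 0.0 for name in seqs}
    for x, y in combinations(seqs, 2):
        dist[(x, y)] = dist[(y, x)] = kmer_distance(seqs[x], seqs[y], k)
    return dist
```

A matrix like this would be handed to a clustering method to build
the guide tree. The pair loop is quadratic in the number of sequences,
but each entry only costs time roughly linear in the sequence lengths.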
------------------------------------------------------------
Kevin Karplus   karplus from soe.ucsc.edu   http://www.soe.ucsc.edu/~karplus
Professor of Biomolecular Engineering, University of California, Santa Cruz
Undergraduate and Graduate Director, Bioinformatics
(Senior member, IEEE) (Board of Directors & Chair of Education Committee, ISCB)
life member (LAB, Adventure Cycling, American Youth Hostels)
Effective Cycling Instructor #218-ck (lapsed)
Affiliations for identification only.