Massive Multiple Sequence Alignment tools?

Brian Foley brianf at med.uvm.edu
Wed Apr 24 17:48:55 EST 1996

Dear Bionet.Software;

	I am looking for an automated (or even semi-automated)
method for generating a multiple sequence alignment of something
like 6,000 sequences, all of them > 80% identical to one another.
	I wish to align the envelope gene (or portions thereof)
which have been sequenced from the Human Immunodeficiency Virus
type 1 or types 1 and 2.  A BLAST search against the nr dataset
provided by NCBI reveals that there are several thousand HIV
env sequences in the database today.

	If I cannot find a tool already suitable for this, I'd
like advice on building a program (perhaps using ASN.1 code from
the NCBI Software Developers Toolkit) that will build a massive
multiple sequence alignment, given a query sequence (I plan to
use a "consensus sequence" from an alignment of 50 HIV env genes
from diverse subtypes) and the GenBank/EMBL database.
	My first thought is to use a tool such as FASTA to
obtain information about each sequence from GenBank (Is
it highly similar to HIV env?  If so, what region of it
aligns with what region of my query?) and then use that
information as a starting point for the multiple sequence alignment.
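
	A minimal sketch of that idea in Python, under assumptions
that go beyond the description above: each pairwise hit is assumed
to be already parsed into two gapped strings (query and hit) plus
the 0-based query position where the alignment starts.  The function
names (project_onto_query, stack_alignments) and the toy sequences
are hypothetical, and residues inserted relative to the query are
simply dropped, which a real tool would have to handle.

```python
def project_onto_query(query_aln, hit_aln, q_start, q_len):
    """Place one pairwise alignment on query coordinates.

    query_aln, hit_aln: gapped strings of equal length taken from a
    pairwise alignment (assumed already parsed from search output).
    q_start: 0-based query position of the first aligned query residue.
    q_len: length of the ungapped query.
    Hit residues inserted relative to the query are dropped.
    """
    if len(query_aln) != len(hit_aln):
        raise ValueError("aligned strings must have equal length")
    row = ["-"] * q_len          # uncovered query columns stay as gaps
    pos = q_start
    for q_char, h_char in zip(query_aln, hit_aln):
        if q_char == "-":
            continue             # insertion in the hit: no query column for it
        if pos >= q_len:
            raise ValueError("alignment runs past the end of the query")
        row[pos] = h_char        # may itself be '-' (deletion in the hit)
        pos += 1
    return "".join(row)


def stack_alignments(query, hits):
    """Build a query-anchored alignment.

    hits: iterable of (name, query_aln, hit_aln, q_start) tuples.
    Returns a list of (name, row) with every row as long as the query.
    """
    rows = [("query", query)]
    for name, query_aln, hit_aln, q_start in hits:
        row = project_onto_query(query_aln, hit_aln, q_start, len(query))
        rows.append((name, row))
    return rows


if __name__ == "__main__":
    # Toy data only; not real HIV env sequence.
    query = "ATGAGAGTGAAG"
    hits = [
        ("seqA", "ATGAGAGTG", "ATGAAAGTG", 0),
        ("seqB", "GAG-TGAAG", "GAGCTG-AG", 4),
    ]
    for name, row in stack_alignments(query, hits):
        print(f"{name:<6} {row}")
```

	Because every row ends up with the same length as the query,
the cost grows linearly with the number of hits rather than with the
number of sequence pairs, which is what makes this kind of anchoring
plausible for thousands of closely related sequences.  The price is
that columns for insertions relative to the query are lost.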

	Any thoughts or help will be greatly appreciated. 

*  Brian Foley               *  btf at t10.lanl.gov                   *
*  T-10, MS-K710, LANL       *  http://hiv-web.lanl.gov            *
*  Los Alamos, NM 87545 USA  *  http://www.uvm.edu/~bfoley         *
