WU-BLAST 2.0a software available

Warren Gish gish at blast
Wed May 8 00:58:07 EST 1996

This is to announce availability of a development release of WU-BLAST
(Washington University BLAST) version 2 software (BLASTP, BLASTN, BLASTX,
TBLASTN, and TBLASTX) for sequence database searching.  Solaris 2.5-SPARC
executables can currently be downloaded from ftp://blast.wustl.edu/blast2.
Due to its present state of flux, source code is currently unavailable.
Executables for other computing platforms will be made available in the near
future at this same location.  Please send suggestions and bug reports to
gish at watson.wustl.edu

** Note that WU-BLAST version 2 software is copyrighted. **

The new feature list for WU-BLAST version 2 includes:

o gapped alignments are produced by default by the database search programs.
Effectively using a 2-dimensional BLAST-like algorithm and other heuristics,
the speed penalty for generating gapped alignments is typically only about 10%
over that of the version 1.4 software.  The speed of BLASTN is affected most.

o Karlin and Altschul (1993) "Sum" statistics are used to evaluate any
instances of multiple, locally optimal, gapped alignments found between the
query and a given database sequence, as described by Altschul and Gish (1996).

New command line options, which are currently undocumented, include the
following.  Terse program usage information can also be obtained by entering
one of the programs' names on the command line without any arguments or

nogap   do not create gapped alignments, in essence reverting entirely
        to WU-BLAST 1.4 behavior.

gapall  generate a gapped alignment for every HSP found

gapw=#  the window width within which gapped alignments are generated
        (default is gapw=32 for protein comparisons, gapw=16 for BLASTN).

Q=#     the penalty for a gap of length 1 (default Q=9)

R=#     the per-residue penalty for extending a gap (default R=2)

hspsepqmax   max. permitted distance along the query sequence separating two
             consistent HSPs
hspsepsmax   max. permitted distance along the subject (database) sequence
             between two consistent HSPs
gapsepqmax   max. permitted distance on the query sequence between two
             consistent gapped alignments
gapsepsmax   max. permitted distance on the subject sequence between two
             consistent gapped alignments
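
As a rough illustration of how Q and R combine, the sketch below assumes the
affine reading implied by the two option descriptions above: a gap of length 1
costs Q, and each further residue in the gap adds R.  The function name
gap_penalty is made up for this example and is not part of WU-BLAST.

```python
def gap_penalty(length, q=9, r=2):
    """Total penalty for a gap spanning `length` residues.

    Assumption: Q is charged for the first gapped residue and R for each
    additional residue, i.e. penalty = Q + R * (length - 1).
    Defaults mirror the documented defaults Q=9 and R=2.
    """
    if length < 1:
        raise ValueError("gap length must be at least 1")
    return q + r * (length - 1)


# Print the penalty for a few gap lengths using the default Q and R.
for k in (1, 2, 5):
    print(k, gap_penalty(k))
```

Under this reading and the default values, a gap of length 5 would cost
9 + 4*2 = 17.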


o There must be other bugs besides those listed below.  Caveat emptor -- buyer
beware.  This is development-quality code.

o Parameters lambda, K, and H for gapped alignments are obtained by looking up
their values in a precomputed table, not by finding roots to analytical
equations as can be (and is) done for ungapped alignments.  Values are not
available for all scoring matrix and gap penalty combinations.  When
appropriate values are unavailable, the programs will issue a WARNING.

o When a matched sequence in a nt. database contains ambiguity codes, it is
possible for the gapped alignment program code to yield a different alignment
score at output time than was obtained during the database search.  When such a
discrepancy arises, the programs issue a "SCORE_ERROR" message.

o the "hspsepqmax", "gapsepqmax", etc. parameters are measures of distance in
residues along the sequences in the specific form in which they are compared.
For instance, in a BLASTX search (conceptually translated nt. query sequence
compared against a protein sequence database), hspsepqmax refers to a distance
measured in amino acid residues, not the underlying nucleotides in the query.

o ASN.1 formatted output is currently broken.
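
The table lookup described in the lambda, K, and H item above can be pictured
with the following sketch.  It is only an illustration of the idea, not the
programs' actual code: the function name, the table layout, and above all the
numbers in EXAMPLE_TABLE are placeholders invented for this example and are
not real statistical parameters.

```python
import warnings

# Placeholder entries for illustration only; these are NOT real
# lambda, K, H values for any scoring system.
EXAMPLE_TABLE = {
    ("BLOSUM62", 9, 2): {"lambda": 0.25, "K": 0.05, "H": 0.2},
}


def lookup_gapped_params(matrix, q, r, table=EXAMPLE_TABLE):
    """Return precomputed gapped-alignment parameters, or None.

    The key is the scoring matrix name plus the Q and R gap penalties.
    When the combination is missing from the table, a warning is raised
    and None is returned, since there is no entry to fall back on.
    """
    params = table.get((matrix, q, r))
    if params is None:
        warnings.warn(
            f"no precomputed lambda, K, H for matrix={matrix} Q={q} R={r}"
        )
    return params
```

A caller would check for None before computing any statistics from the
returned values.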


Altschul, SF, and W Gish (1996).  Local alignment statistics.
In: R. Doolittle, ed.  Methods in Enzymology v.266 (in press).

Karlin, S, and SF Altschul (1993).  Applications and statistics
for multiple high-scoring segments in molecular sequences.
Proc. Natl. Acad. Sci. 90:5873-7.

