Help on BLAST searches

Keith Robison robison1 at fas.harvard.edu
Sun Nov 6 14:06:54 EST 1994

There are a growing number of sequence analysis tutorials on the Web.
Mine is at http://twod.med.harvard.edu/seqanal/  
It is far from perfect, but it does reference the other two I know about
(which are in many ways superior).  It also contains references to other
articles & books of interest.  

In reference to BLAST, the algorithm can crudely be described as:

1) For each sequence of k residues in the query ("k-tuple"), generate
   a list of all the k-tuples which could be the nucleus of a significant
   match

2) Search the database for these k-tuples
   In BLAST, this is done using a comp sci contraption called a 
   Deterministic Finite State Automaton (DFA).  In brief, a DFA is
   roughly analogous to an identification key for the set of k-tuples
   (such as the sort of key you might use to identify a living organism).

3) At each matched k-tuple, extend the alignment until further extensions
   do not improve the score
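
The three steps above can be sketched in a few lines of Python. This is
only a toy illustration, not BLAST's actual code. It assumes a DNA
alphabet, a made-up match/mismatch scoring table, a plain dictionary
lookup standing in for the DFA, and an extension rule that gives up once
the running score has dropped a fixed amount (`dropoff`) below the best
seen so far. All function names and parameter values are hypothetical.

```python
from itertools import product

ALPHABET = "ACGT"  # assumption: DNA; proteins would use 20 letters


def score(a, b, match=5, mismatch=-4):
    """Toy scoring table: one value for identities, one for mismatches."""
    return match if a == b else mismatch


def neighborhood(word, threshold):
    """Step 1: every k-tuple scoring at least `threshold` against `word`."""
    return [
        "".join(cand)
        for cand in product(ALPHABET, repeat=len(word))
        if sum(score(a, b) for a, b in zip(word, cand)) >= threshold
    ]


def build_index(query, k, threshold):
    """Map each neighborhood k-tuple to the query offsets that produced it.

    A dictionary is used here in place of the DFA described above.
    """
    index = {}
    for i in range(len(query) - k + 1):
        for word in neighborhood(query[i:i + k], threshold):
            index.setdefault(word, []).append(i)
    return index


def _grow(pairs, dropoff):
    """Walk outward over residue pairs, remembering the best running score."""
    best = run = length = 0
    for n, (a, b) in enumerate(pairs, start=1):
        run += score(a, b)
        if run > best:
            best, length = run, n
        elif best - run > dropoff:
            break  # further extension is unlikely to recover
    return best, length


def extend(query, subject, qi, si, k, dropoff):
    """Step 3: ungapped extension of a seed hit in both directions."""
    seed = sum(score(query[qi + j], subject[si + j]) for j in range(k))
    right, r_len = _grow(zip(query[qi + k:], subject[si + k:]), dropoff)
    left, l_len = _grow(zip(reversed(query[:qi]), reversed(subject[:si])), dropoff)
    # Return total score plus the query interval covered by the alignment.
    return seed + left + right, qi - l_len, qi + k + r_len


def search(query, subject, k=3, threshold=6, dropoff=10):
    """Step 2: scan the database sequence for indexed k-tuples, then extend."""
    index = build_index(query, k, threshold)
    hits = []
    for si in range(len(subject) - k + 1):
        for qi in index.get(subject[si:si + k], []):
            hits.append(extend(query, subject, qi, si, k, dropoff))
    return max(hits, default=None)
```

With these toy scores, a threshold of 6 admits the exact 3-tuple and
every 3-tuple that differs from it at one position. Calling
search("ACGTACGT", "TTTTACGTACGTTTTT") returns the best score together
with the start and end of the matched region in the query.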

At all stages, BLAST is using a simple table (substitution matrix) to
score the alignment of the query against the database sequence.
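
To make the table lookup concrete, here is a minimal sketch of scoring
an ungapped stretch of alignment. The matrix values below are invented
for illustration and are not taken from any published substitution
matrix; the names TOY_MATRIX, pair_score and alignment_score are
likewise hypothetical.

```python
# Illustrative values only: similar residues score positive,
# dissimilar ones negative. Only one order of each pair is stored.
TOY_MATRIX = {
    ("L", "L"): 4, ("I", "I"): 4, ("K", "K"): 5,
    ("L", "I"): 2, ("L", "K"): -2, ("I", "K"): -3,
}


def pair_score(a, b, matrix=TOY_MATRIX):
    """Look up one residue pair; the table is symmetric, so try both orders."""
    if (a, b) in matrix:
        return matrix[(a, b)]
    return matrix[(b, a)]


def alignment_score(query_seg, subject_seg, matrix=TOY_MATRIX):
    """Sum the table entries over each aligned pair of residues."""
    if len(query_seg) != len(subject_seg):
        raise ValueError("ungapped segments must have equal length")
    return sum(pair_score(a, b, matrix) for a, b in zip(query_seg, subject_seg))
```

The score of an alignment is simply the sum of the table entries for
each aligned pair, which is why the same table can serve all three
stages of the search.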

Keith Robison
Harvard University
Department of Cellular and Developmental Biology
Department of Genetics / HHMI

krobison at nucleus.harvard.edu 

More information about the Bio-soft mailing list
