UCSC Protein Structure Prediction and SAM 2.2

Richard Hughey rph at cse.ucsc.edu
Thu Nov 19 23:16:31 EST 1998

	UCSC SAM-T98 Protein Structure Prediction	      


We are pleased to announce the availability of a hidden Markov model
(HMM) protein structure prediction server.

The server has used UCSC's SAM-T98 method to create a library of HMMs,
one per PDB structure (about 2500 HMMs total).  You can search this
database of HMMs with a protein sequence.  The iterative method of
creating these models is detailed in two upcoming papers available
from our WWW site (one to appear in JMB, in collaboration with Jong
Park and Cyrus Chothia, and one to appear in Bioinformatics).  The
method is more sensitive for remote homology detection than PSI-BLAST
or ISS.  These methods, refinements of our CASP2 methods, formed the
core of our CASP3 (http://predictioncenter.llnl.gov/) structure
prediction contest entries, the results of which will be announced in
December.

You will receive by e-mail a list of the PDB identifiers of the hits,
as well as a series of pairwise alignments based on the library's HMM
for each of those structures.  When the server is lightly loaded, the
search will take a few minutes.

Also available on the page are SAM-T98 database searching, alignment
comparison, and alignment refinement.

The iterative construction of an HMM for SAM-T98 database searching
can take a particularly long time when the server is processing many
queries.  Please wait at least a day before giving up on a search.

	SAM Sequence Alignments and Modeling Software Suite 2.2


An upgrade of the UCSC HMM software is also available (it does not
currently include the SAM-T98 method).  The software includes tools
for building HMMs from aligned and unaligned sequences using a variety
of Dirichlet mixture priors and transition regularizers, and scoring
and multiply aligning sequences using the trained HMM.  The object
code is free for academic use, but our copyright office would like a
signed license, the details of which are on the WWW page.  For
commercial use, send email to sam-info at cse.ucsc.edu.
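
For readers unfamiliar with Dirichlet mixture priors, the following is
a minimal sketch of the general idea, not SAM's implementation.  It
combines the observed residue counts in one alignment column with a
mixture of Dirichlet components to give regularized emission
probabilities (the posterior mean estimate).  The function names, the
four-letter toy alphabet, and every number below are hypothetical and
chosen only for illustration; they are not taken from SAM or from any
published prior.

```python
from math import exp, lgamma, log


def log_beta(xs):
    """Log of the multivariate Beta function of a positive vector."""
    return sum(lgamma(x) for x in xs) - lgamma(sum(xs))


def mixture_posterior_mean(counts, components):
    """Regularized emission probabilities for one alignment column.

    counts     -- observed (possibly weighted) counts, one per letter
    components -- list of (mixture weight, alpha vector) pairs
    """
    # Log posterior weight of each component given the counts,
    # up to a constant shared by all components.
    log_w = [
        log(q)
        + log_beta([n + a for n, a in zip(counts, alpha)])
        - log_beta(alpha)
        for q, alpha in components
    ]
    top = max(log_w)  # subtract the maximum for numerical stability
    w = [exp(v - top) for v in log_w]
    total = sum(w)
    posterior = [v / total for v in w]

    # Mix the per-component posterior means.
    n_total = sum(counts)
    probs = [0.0] * len(counts)
    for p_j, (_, alpha) in zip(posterior, components):
        denom = n_total + sum(alpha)
        for i, (n, a) in enumerate(zip(counts, alpha)):
            probs[i] += p_j * (n + a) / denom
    return probs


# Hypothetical two-component prior over a toy four-letter alphabet.
toy_prior = [
    (0.6, [2.0, 0.5, 0.5, 0.5]),  # component favoring the first letter
    (0.4, [0.5, 0.5, 2.0, 2.0]),  # component favoring the last two
]
toy_counts = [3, 0, 1, 0]  # made-up column counts
toy_probs = mixture_posterior_mean(toy_counts, toy_prior)
```

With few observed sequences the estimate stays close to the prior, and
with many it approaches the observed frequencies, which is why priors
of this kind help when a model is built from a small alignment.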

Important additions to recent versions include:

     o An option for posterior-decoded alignments
     o Local and semi-local training and alignment
     o User-defined alphabets
     o Reduced space dynamic programming
     o Optional internal sequence weighting during training
     o Corrected MSF and HSSP file reading

       Other UCSC Servers of Interest


From our main Compbio WWW page, you can also access three other
servers:

     o Human gene prediction with the UCSC/LBL Genie project. 
     o E. coli gene prediction with EcoParse HMMs
     o Small subunit ribosomal RNA secondary
       structure prediction with STOCCFG, a stochastic
       context-free grammar modeling system.

These projects have been led by David Haussler, Richard Hughey, and
Kevin Karplus.  The servers include the work of Anders Krogh,
graduate students Christian Barrett, Melissa Cline, and David Kulp,
undergraduates Rachel Karchin, Nguyet Manh, and Jeffrey Sukharev,
and many other members of our computational biology group.

The servers are supported in part with a donation from Digital
Equipment Corporation.  Our research has been supported by NSF, DOE,
and other grants as detailed on the WWW page.
