[Computational-biology] Re: Point Specific Mutation Matrix vs. profile HMM ?

Kevin Karplus karplus at cheep.cse.ucsc.edu
Sun Feb 12 19:35:51 EST 2006

On 2006-02-12, harald <please_noSpam at gmx.de> wrote:
> 1 - Are the following both approaches nearly equivalent?
> 	A - 	Conduct a BLAST-Search with the query-protein.
> 		Take the first hits and make a multiple alignment by CLUSTALW or alike.
> 		Use this alignment to train a HMM model.
> 		Use the hmmsearch to search the database with the model.
> 	B -	Use two iterations of PSI-BLAST to search for the query-protein.

These are fairly close equivalents---the biggest difference is in the
quality of the multiple alignment and in the handling of gaps.  (The
HMM method has position-specific gap costs.)
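
To make the gap-cost point concrete, here is a toy sketch (my own
illustration, not the actual scoring used by psiblast, HMMer, or SAM).
It aligns a sequence to a three-column profile with a simple dynamic
program, once with one uniform gap cost and once with a cheaper gap at
the column where the family is assumed to tolerate deletions.  The
function name, the scores, and the gap costs are all made up for the
example.

```python
# Toy illustration only: real profile HMMs use log-odds scores and
# separate open/extend transitions; these numbers are hypothetical.
def align_score(seq, match_scores, gap_costs):
    """Best global alignment score of seq against a profile.

    match_scores[j] maps a residue to its score at profile column j.
    gap_costs[j] is the linear cost of an insertion or deletion at column j.
    """
    n, m = len(seq), len(match_scores)
    neg_inf = float("-inf")
    # dp[i][j] = best score aligning seq[:i] to the first j profile columns
    dp = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:  # residue i aligned to column j
                s = match_scores[j - 1].get(seq[i - 1], -1.0)
                dp[i][j] = max(dp[i][j], dp[i - 1][j - 1] + s)
            if i > 0:  # insertion: residue i is not aligned to any column
                dp[i][j] = max(dp[i][j], dp[i - 1][j] - gap_costs[max(j - 1, 0)])
            if j > 0:  # deletion: column j is skipped
                dp[i][j] = max(dp[i][j], dp[i][j - 1] - gap_costs[j - 1])
    return dp[n][m]


# Profile preferring A, C, D in columns 1..3; any other residue scores -1.
profile = [{"A": 2.0}, {"C": 2.0}, {"D": 2.0}]
uniform = [3.0, 3.0, 3.0]            # one gap cost everywhere
position_specific = [3.0, 0.5, 3.0]  # column 2 is cheap to delete

print(align_score("AD", profile, uniform))            # 1.0
print(align_score("AD", profile, position_specific))  # 3.5
```

The sequence "AD" lacks the middle column.  With a uniform cost the
deletion is penalized as heavily as it would be anywhere else; with a
position-specific cost the same alignment scores much better because
the model has learned that this column is often missing.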

> 2 - the HMMER-tool hmmt which was used to build a profile HMM out of an 
> unaligned set of sequences is not part of HMMER anymore. Does anyone 
> knows why? Is it generally better to multiple align a set of sequences 
> before training a HMM model instead of using hmmt?

HMMer was built primarily to support Pfam, which does not
require fancy training of HMMs.  If you want to build HMMs from
unaligned sequences, you are better off using the SAM package.
[Julian Gough, Kevin Karplus, Richard Hughey, and Cyrus Chothia.
Assignment of homology to genome sequences using a library of hidden
Markov models that represent all proteins of known structure.
Journal of Molecular Biology, 313:903--919, 2001.]

The current release of SAM 
(free to academics, non-profits, and government labs) includes the
target04 script, which does a similar job to that of psiblast, but
slower and better.  If you only have a few sequences, you can use the
SAM-T02 website, which uses the older target2k script, which is not
quite as sensitive.  

If you are willing to risk crashes, there is a beta test site up for
the SAM-T05 server, which runs both the target2k and the target04 scripts:
It also does local structure prediction, tertiary structure
prediction, and contact prediction, though there are still some bugs to
be worked out in the tertiary predictions before the new CASP season.

Disclaimer: I am one of the developers of SAM and the target2k and
target04 scripts---my views about HMMer and psiblast may be biased.

Kevin Karplus 	karplus at soe.ucsc.edu	http://www.soe.ucsc.edu/~karplus
Professor of Biomolecular Engineering, University of California, Santa Cruz
Undergraduate and Graduate Director, Bioinformatics
(Senior member, IEEE)	(Board of Directors & Chair of Education Committee, ISCB)
life member (LAB, Adventure Cycling, American Youth Hostels)
Effective Cycling Instructor #218-ck (lapsed)
Affiliations for identification only.
