concerning profile analysis

Steve Thompson: VADMS genetics THOMPSON at WSUVMS1.CSC.WSU.EDU
Tue Sep 15 11:53:10 EST 1992

Fellow bio-soft netters -

I thought you might be interested in a communication that I was involved in
last week.  Tom Doak (herricklab at bioscience.utah.edu) wrote me the following:

>>...I've been using Profiles in the GCG package quite a bit, so checked out
>>internet (a new toy to me, that I'm not really familiar with yet) exchanges
>>on that subject, and it sounded as if you might be able to help, or at least
>>lead me in the right direction. You asked if there were ways to compare
>>profiles, a question I have also had, and I was hoping you had had a useful
>>response that could be passed along.  The two things I would like to be able
>>to do are to 1) independently build profiles for two different families, 
>>that I claim to be related subfamilies, then show that their profiles are in
>>fact the same, at some confidence level.  2)  Know when I have added enough
>>members into the alignment I'm making my profile from, because the profile
>>hasn't changed 'significantly' from one to the next; you can imagine some sort
>>of plot leveling off at some point. Also, as long as I'm rambling, I've been
>>doing most of my alignments with GCG Pileup, and have found that it gives
>>'visually' correct alignments.  Is there anything good or bad you could say about it?...

And I replied to him:

>....Last year when I was posing questions to the net concerning profile 
>comparisons Roland Luethy offered the most concrete solution. He had 
>written a profile alignment algorithm back when he was working in
>Eisenberg's lab with Gribskov; it operates within the GCG environment to
>create a new, merged profile and a consensus alignment from two previous
>profiles.  It works rather well and I'm sure he would be happy to send it off
>to you also.  

>His e-mail address, as of April 1992, is "rluethy at ulrec2.unil.ch".   

>Unfortunately it does not adequately address your points:

>Even though his program does produce a new profile from preexisting, related 
>profiles, it does not give any indication, statistically, of "how well" the  
>two input profiles compared to each other---it is still a subjective call.   
>I suppose one could then ProfileSearch with the new profile and evaluate "z" 
>scores of the hits from each of the two families' members. 
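
A rough illustration of that last idea, as a minimal Python sketch: given raw
scores from a search with the merged profile, express each hit as a z score
against a background of unrelated database sequences, then compare the two
families.  The function name, the family labels and every number below are
made up for illustration, and this plain (score - mean) / standard deviation
calculation makes no attempt to reproduce whatever normalization ProfileSearch
itself applies.

```python
from statistics import mean, stdev

def z_scores(hit_scores, background_scores):
    """Return (score - background mean) / background std dev for each hit."""
    mu = mean(background_scores)
    sigma = stdev(background_scores)  # sample standard deviation
    return {name: (s - mu) / sigma for name, s in hit_scores.items()}

# Hypothetical raw scores; none of these numbers come from a real search.
background = [20.0, 22.0, 24.0, 26.0, 28.0]
family_a = {"famA_1": 44.0, "famA_2": 40.0}
family_b = {"famB_1": 42.0, "famB_2": 38.0}

z_a = z_scores(family_a, background)
z_b = z_scores(family_b, background)
for name, z in {**z_a, **z_b}.items():
    print(f"{name}: z = {z:.2f}")
```

If members of both families score comparably high against the merged profile,
that supports (but does not prove) the claim that the two profiles describe
one family; where to draw the line is still a judgment call.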

>Being able to ascertain when enough members are in an alignment would be 
>very handy, and, I do agree, one does reach a "point of diminishing returns"
>upon the addition of sequences to an alignment.  What we need is some "hot"
>statistician to get involved in this area and produce a nice set of auxiliary
>programs which would interpret and analyze profile data and produce nice
>graphical output easily understood by us biology types, all within the GCG
>environment.  The whole question of "how good" a profile is, the "quality" of
>its validation, needs to be addressed in a more quantitative and less
>subjective manner, as does the significance of search hits.
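
To make the "point of diminishing returns" concrete, here is a minimal Python
sketch of the kind of plot Tom describes: add sequences to an alignment one at
a time and measure how much the profile moves with each addition.  It is only
a stand-in under stated assumptions.  A real GCG profile is a position-specific
scoring table, whereas this sketch uses raw per-column residue frequencies; the
function names and the toy alignment are invented for illustration.

```python
from collections import Counter

def column_frequencies(alignment):
    """Per-column residue frequencies for equal-length aligned sequences."""
    n = len(alignment)
    return [
        {res: count / n for res, count in Counter(col).items()}
        for col in zip(*alignment)
    ]

def profile_change(old, new):
    """Mean over columns of the total absolute frequency difference."""
    total = 0.0
    for a, b in zip(old, new):
        total += sum(abs(a.get(r, 0.0) - b.get(r, 0.0)) for r in set(a) | set(b))
    return total / len(old)

# Toy alignment, invented for illustration; '-' marks a gap.
aligned = ["ACDE", "ACDE", "ACD-", "ACDE", "GCDE"]

changes = []
for n in range(1, len(aligned)):
    before = column_frequencies(aligned[:n])
    after = column_frequencies(aligned[:n + 1])
    changes.append(profile_change(before, after))

print(changes)  # plot these against n and look for the curve flattening
```

The result depends on the order in which sequences are added, and deciding
what counts as "flat enough" is exactly the statistical question that remains
open.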

Call for help in the area; are there any stat' people listening? :^) !!

>I also like PileUp and have found that of the standard multiple sequence
>alignment programs it and ClustAl seem to do the best jobs, especially if the
>sequences being aligned ARE fairly homologous along their entire lengths.  I
>believe many individuals in the area would tend to agree with this statement. 
>PIMA also has very strong advantages, especially in highly divergent sequences
>where only patches of homology remain.  Unfortunately it does not operate in a
>VMS environment.  

Take-home message!!:

>However, in ALL cases careful, subjective and biological evaluation of the 
>resultant alignments should be undertaken and, if deemed necessary, the 
>alignments should be modified by hand.  

Back to software discussion:

>MAlignEd is particularly good at this editing process and can help one to 
>decide which adjustments to make in the alignment based on its highlighting 

			Thought you might be interested, Steve Thompson

                              Steven M. Thompson
            Consultant in Molecular Genetics and Sequence Analysis
VADMS (Visualization, Analysis & Design in the Molecular Sciences) Laboratory
           Washington State University, Pullman, WA 99164-1224, USA
          AT&Tnet:  (509) 335-0533 or 335-3179  FAX:  (509) 335-0540
                  BITnet:  THOMPSON at WSUVMS1 or STEVET at WSUVM1
                   INTERnet:  THOMPSON at wsuvms1.csc.wsu.edu
