concerning profile analysis

Steve Thompson: VADMS genetics THOMPSON at WSUVMS1.CSC.WSU.EDU
Tue Sep 15 11:53:10 EST 1992


Fellow bio-soft netters -

I thought you might be interested in a communication that I was involved in
last week.  Tom Doak (herricklab at bioscience.utah.edu) wrote me the following
query:

>>...I've been using Profiles in the GCG package quite a bit, so I checked out
>>internet (a new toy to me, that I'm not really familiar with yet) exchanges
>>on that subject, and it sounded as if you might be able to help, or at least
>>lead me in the right direction. You asked if there were ways to compare
>>profiles, a question I have also had, and I was hoping you had had a useful
>>response that could be passed along.  The two things I would like to be able
>>to do are to 1) independently build profiles for two different families
>>that I claim to be related subfamilies, then show that their profiles are in
>>fact the same, at some confidence level.  2)  Know when I have added enough
>>members into the alignment I'm making my profile from, because the profile
>>hasn't changed 'significantly' from one to the next; you can imagine some sort
>>of plot leveling off at some point. Also, as long as I'm rambling, I've been
>>doing most of my alignments with GCG PileUp, and have found that it gives
>>'visually' correct alignments; anything good or bad you could say about it?...

And I replied to him:

>....Last year when I was posing questions to the net concerning profile 
>comparisons Roland Luethy offered the most concrete solution. He had 
>written a profile alignment algorithm back when he was working in
>Eisenberg's lab with Gribskov which operates within the GCG environment to
>create a new, merged profile and a consensus alignment from two previous
>profiles.  It works rather well and I'm sure he would be happy to send it off
>to you also.  

>His e-mail address, as of April 1992, is "rluethy at ulrec2.unil.ch".   

>Unfortunately it does not adequately address your points:

>Even though his program does produce a new profile from preexisting, related 
>profiles, it does not give any indication, statistically, of "how well" the  
>two input profiles compared to each other---it is still a subjective call.   
>I suppose one could then ProfileSearch with the new profile and evaluate "z" 
>scores of the hits from each of the two families' members. 
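
The z-score check suggested above can be sketched roughly as follows. This is
only an illustration, not ProfileSearch's own calculation: it assumes the raw
scores are already length-normalized, and the function names, the family
labels, and the "background" list of scores from unrelated sequences are all
hypothetical.

```python
from statistics import mean, stdev


def z_scores(scores, background):
    """Convert raw search scores to z-scores against a background sample.

    scores:     dict mapping sequence name -> raw score (assumed normalized)
    background: list of raw scores from presumed-unrelated sequences
    """
    mu = mean(background)
    sigma = stdev(background)  # sample standard deviation
    return {name: (s - mu) / sigma for name, s in scores.items()}


def summarize_by_family(z, family_of):
    """Group z-scores by family and report (lowest, mean) for each family."""
    grouped = {}
    for name, value in z.items():
        grouped.setdefault(family_of[name], []).append(value)
    return {fam: (min(vals), mean(vals)) for fam, vals in grouped.items()}
```

If members of both families score comparably well above background against
the merged profile, that supports (but does not prove) the claim that the two
families are related; the call remains partly subjective.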

>Being able to ascertain when enough members are in an alignment would be 
>very handy, and, I do agree, one does reach a "point of diminishing returns"
>upon the addition of sequences to an alignment.  What we need is some "hot"
>statistician to get involved in this area and produce a nice set of auxiliary
>programs which would interpret and analyze profile data and produce nice
>graphical output easily understood by us biology types, all within the GCG
>environment.  The whole idea of "how good" is this profile, the "quality" of
>validation, needs to be addressed in a more quantitative and less subjective
>manner as well as the significance of search hits.
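
One crude way to picture the "leveling off" plot is sketched below. It is a
stand-in, not a Gribskov-style profile: it assumes a fixed, pre-aligned set of
equal-length sequences, uses plain per-column residue frequencies, and measures
the change between successive profiles by mean total variation distance. All
names here are hypothetical.

```python
from collections import Counter


def column_frequencies(alignment):
    """Per-column residue frequencies for a list of equal-length aligned rows."""
    total = len(alignment)
    profile = []
    for i in range(len(alignment[0])):
        counts = Counter(seq[i] for seq in alignment)
        profile.append({res: n / total for res, n in counts.items()})
    return profile


def profile_distance(p, q):
    """Mean over columns of the total variation distance between two profiles."""
    total = 0.0
    for cp, cq in zip(p, q):
        residues = set(cp) | set(cq)
        total += 0.5 * sum(abs(cp.get(r, 0.0) - cq.get(r, 0.0)) for r in residues)
    return total / len(p)


def convergence_curve(alignment):
    """Distance between the profile of the first n-1 rows and the first n rows."""
    curve = []
    previous = column_frequencies(alignment[:1])
    for n in range(2, len(alignment) + 1):
        current = column_frequencies(alignment[:n])
        curve.append(profile_distance(previous, current))
        previous = current
    return curve
```

Plotting the returned list against the number of sequences gives a curve that
should flatten as new members stop changing the profile. The curve depends on
the order in which sequences are added, and a real analysis would also have to
deal with the alignment itself shifting as members are added.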

Call for help in the area; are there any stat' people listening? :^) !!

>I also like PileUp and have found that of the standard multiple sequence
>alignment programs it and Clustal seem to do the best jobs, especially if the
>sequences being aligned ARE fairly homologous along their entire lengths.  I
>believe many individuals in the area would tend to agree with this statement. 
>PIMA also has very strong advantages, especially in highly divergent sequences
>where only patches of homology remain.  Unfortunately it does not operate in a
>VMS environment.  

Take-home message!!:

>However, in ALL cases careful, subjective and biological evaluation of the 
>resultant alignments should be undertaken and, if deemed necessary, the 
>alignments should be modified by hand.  

Back to software discussion:

>MAlignEd is particularly good at this editing process and can help one to 
>decide which adjustments to make in the alignment based on its highlighting 
>mode....

			Thought you might be interested, Steve Thompson

                              Steven M. Thompson
            Consultant in Molecular Genetics and Sequence Analysis
VADMS (Visualization, Analysis & Design in the Molecular Sciences) Laboratory
           Washington State University, Pullman, WA 99164-1224, USA
          AT&Tnet:  (509) 335-0533 or 335-3179  FAX:  (509) 335-0540
                  BITnet:  THOMPSON at WSUVMS1 or STEVET at WSUVM1
                   INTERnet:  THOMPSON at wsuvms1.csc.wsu.edu
