My apologies to those involved in the recent discussion of the use of
the PROFILE* programs. The flurry of interest unfortunately
coincided with my move to San Diego, and I have been a little too
harried to reply properly. I have a few comments:
1) The programs are still under active development, and I expect
to have a new and much improved version available this fall.
Now is a very good time to voice your suggestions and complaints.
2) I currently have about forty of what I call validated
profiles (i.e. with known statistical properties). These will
be available as soon as I get a chance to properly document
them. Unfortunately my move has disrupted this rather badly.
3) As I have emphasized to those who have contacted me directly,
I have an ongoing interest in assembling a comprehensive library of
"validated" profiles. This message has probably not filtered
through to the people who use the package solely in its GCG
implementation. PLEASE, if you have developed profiles that you
consider to be powerful, send them to me so I can pass them on
to others with updates of the programs. It is also a good idea
to archive the profiles at a common site (e.g. iubio, uh, embl,
etc).
4) The biggest problem with assembling the profiles is
adequately documenting them. At least this has been the rub for
me. I would like to be able to provide for each profile:
a) the alignment used to make it
b) a list of sequences cross-referenced to databases
c) prose description of function with correct annotation
in terms of the multiple alignment coordinates
d) known statistics from database searches
e) known false positives/negatives
f) one or more references (to a review if possible) to
provide a hook into the literature.
However, do not neglect to send your profile just because you
don't have the time to do all of the above.
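As a rough illustration only, items a) through f) could be kept as one record per profile, with any item left empty until someone has time to fill it in. The sketch below is in Python; the class name ProfileRecord, its field names, and the missing_items helper are all hypothetical and are not part of the PROFILE* programs or the GCG package.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ProfileRecord:
    """Hypothetical documentation record for one profile (items a-f)."""
    name: str
    # a) the multiple alignment used to make the profile (text or a file name)
    alignment: str = ""
    # b) sequences in the alignment as (sequence name, database, accession)
    sequences: list = field(default_factory=list)
    # c) prose description of function, annotated in alignment coordinates
    description: str = ""
    # d) known statistics from database searches (layout left open here)
    search_statistics: dict = field(default_factory=dict)
    # e) known false positives and false negatives
    false_positives: list = field(default_factory=list)
    false_negatives: list = field(default_factory=list)
    # f) one or more literature references, ideally including a review
    references: list = field(default_factory=list)

    def missing_items(self):
        """Return the names of documentation fields that are still empty."""
        return [
            f.name
            for f in fields(self)
            if f.name != "name" and not getattr(self, f.name)
        ]


# A profile submitted with only a reference attached is still a valid record.
record = ProfileRecord(
    name="example_profile",
    references=["(citation to a review goes here)"],
)
print(record.missing_items())
```

An empty field here means only "not yet documented", so a partially documented profile can still be submitted and completed later.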
5) There are a couple of improvements to the package since
the most recent versions GCG distributes. The most useful is a
version of profilegap that produces a multiple alignment. If this
would be useful to you, please contact me. It is still a
prototype, but seems robust (if clumsy).
There is probably more to say on this thread that I have missed in my
move, but this will do for now.
Michael Gribskov
San Diego Supercomputer Center
P.O. Box 85608
San Diego, CA 92186-9784
(619) 534-8312
gribskov at sdsc.edu