alpha-helix signals -> protein folding

David Jones jones at bsm.biochemistry.ucl.ac.uk
Sun Jul 17 05:55:36 EST 1994

Rob Miller (rmiller at bsm.bioc.ucl.ac.uk) wrote:

stuff deleted...
> structure and folding.  My initial ideas are due to a draft paper by 
> Edward Fredkin on Digital Information Mechanics (later shortened to Digital
> Mechanics for publication -- certainly a better acronym :-), and I can 
> add some thoughts about `what this picture means for a predictive theory 
> of folding' :
> (1) Let us say that an amino acid sequence specifies a `program' for the 
> folding of the protein.  This is consistent with the described model 
> when one recognizes that it is possible to build a network of cellular 
> automata (CA) capable of `computation' in the sense that it can perform the 
> basic operations recognized by Turing.  In other words, for both the CA
> network and the protein, the system processes its input (the environment) 
> and eventually outputs a `final' structure. The Halting Problem states that,
> given a description of a machine and a program with input to run on it, it 
> is not possible to determine a priori even the simplest characteristic of 
> the program's results, such as whether the program + input will end up in
> an infinite loop.  The only way to find out for certain is to actually 
> run (or simulate) the program and see -- and even then you don't know 
> to really make this work out in a reasonable amount of time.
more stuff deleted...
> (2) Does this mean that the `protein threading' idea, where one evaluates
> the folding of an (unknown structure) amino acid sequence as a given
> structural motif shouldn't work ?  Certainly it does seem to work in 
> many cases (D.T. Jones, et al), and I am employed to be extending that

Rob has made some interesting points regarding the computability of the
protein folding function. In terms of theoretical protein folding I think
it is necessary to divide the approaches into two classes: ab initio and
knowledge-based. Ab initio protein folding simulations attempt to model
the precise form of the physical forces that are exerted on a protein
chain during the folding process and effectively minimize these forces
(generally speaking these days, dynamically). Such simulations offer
insight into the general nature of the folding process, but the results
obtained are far from the native conformation of the protein. These
simulations are hampered by a number of things. Firstly, accurate models
of the forces we think are important to the protein's conformation are
computationally expensive - they need to consider side chain positions,
they need at least the positions of the polar hydrogen atoms and they
really need a good model of the entropic effects of solvent. Secondly,
even when these forces are simulated to the best of our current ability,
it is clear that the models are still very poor approximations of
reality. Considering the complexity of average-sized protein domains, I
would be very surprised if a purely ab initio method (i.e. one without
any knowledge of the intended conformation) managed to closely
approximate the native fold of such structures within 20 years, if ever.

Knowledge-based methods (of which threading is just one example) take a
much more pragmatic approach to the problem. These methods effectively
side-step the issue of whether the fold of a protein is thermodynamically
or kinetically stable. What we know by looking at the native folds of
proteins is that tertiary, super-secondary and secondary structure is
highly recurrent as a result of evolution, stereochemistry or most likely
both. No matter whether the proteins we observe are sitting happily in
their global energy minimum or rolling about in a higher energy local
minimum, we know that the very next protein structure to be determined
will consist of some mixture of alpha helices, beta sheets, beta-alpha-beta
units, alpha or beta hairpins, alpha corners, beta barrels etc. We also
expect a well-defined hydrophobic core, no steric clashes, and a fairly
high proportion of the potential hydrogen bonds to be made. In addition,
protein topologies are generally right-handed, with some exceptions.
The knowledge-based methods are based on the hope that these
empirical constraints are sufficient to arrive at a good approximation
of the native fold. They are also based on the hope that by applying these
constraints it is possible to get away with a far cruder model of the
protein's energy function. Whilst we have very little idea of what form
the hydrophobic effect really takes, the fact that hydrophobic residues
tend to lie close together at the end of the real folding process
results in a very simple constraint: pull hydrophobic residues close
together or, in the case of threading for example, penalize any
hydrophobics that are not close together (or some variant thereof). Of
course, these
methods don't really shed much light on how proteins compute their
own conformation - they are (or at least will be) very useful tools for
helping humans work out the function of newly sequenced proteins, but
they will not necessarily help us fathom out how protein sequences
apparently code for a unique 3-D structure.
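
As a rough illustration of the "penalize any hydrophobics that are not
close together" constraint, here is a minimal Python sketch. Everything
specific in it is an assumption made for illustration: the set of
residues treated as hydrophobic, the 7 Angstrom cutoff, the neighbour
count, and the function name. Real threading potentials are considerably
more elaborate than this.

```python
import math

# Assumed (illustrative) set of hydrophobic residues, one-letter codes.
HYDROPHOBIC = set("AVLIMFWC")


def hydrophobic_penalty(sequence, coords, cutoff=7.0, min_neighbours=1):
    """Count hydrophobic residues that are not close to other hydrophobics.

    sequence: string of one-letter residue codes
    coords:   list of (x, y, z) tuples, one per residue, e.g. the C-alpha
              positions of the template the sequence is threaded onto
    cutoff:   distance in Angstroms below which two residues count as
              "close" (an assumed value, not taken from any real method)
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")

    # Positions in the chain that hold a hydrophobic residue.
    idx = [i for i, aa in enumerate(sequence) if aa in HYDROPHOBIC]

    penalty = 0
    for i in idx:
        # Number of other hydrophobic residues within the cutoff.
        close = sum(
            1
            for j in idx
            if j != i and math.dist(coords[i], coords[j]) <= cutoff
        )
        if close < min_neighbours:
            penalty += 1
    return penalty
```

A lower score means the sequence sits more comfortably on the template;
scoring the same sequence against every fold in a library and ranking
the results is the general shape of the idea, under these assumptions.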

So what are the future prospects? Well, I would expect that within
10-15 years or so, somebody, somewhere will have a program that will
be able to predict the tertiary structure of a given protein sequence
with at least reasonable accuracy (say to 3-4 Å RMSD). I expect this program
will be based on some kind of pattern recognition method - either
recognizing whole folds or recurrent sub-structures. I am very doubtful,
however, that the algorithm in use in this futuristic program will have
anything in common with the "algorithm" proteins actually use.
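
For reference, the RMSD figure quoted above is the root-mean-square
deviation between equivalent atoms of the predicted and the observed
structure. A minimal Python sketch follows, assuming the two coordinate
sets have already been optimally superposed and list the same atoms in
the same order. The function name is hypothetical, and the superposition
step that a real comparison needs is deliberately left out.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length coordinate lists.

    Each argument is a list of (x, y, z) tuples in Angstroms. The
    structures are assumed to be superposed already; no fitting is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")

    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the equivalent atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

Without the superposition step the value depends on how the two
structures happen to be oriented, so this is only meaningful after
fitting one structure onto the other.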

One question that I often toy with when the subject of the protein folding
problem and its possible solution arises is this: just how are we going to
recognize the solution when (if ever) it arrives? If someone
creates a black box which guesses the native conformation of a protein
chain right every time, is this a solution? If someone works out that
proteins fold by doing X, Y and then Z but that we cannot hope to
simulate X or Y let alone Z, is this a solution?

Does anyone have any comments on this? I'm sure we'd all like to know when it's
time to give up on protein folding and find another interesting problem
to work on!

- David -

This message was written, produced and executively directed by Dr David Jones
Email: jones at bsm.bioc.ucl.ac.uk         |     JANET: jones at uk.ac.ucl.bioc.bsm
Address: Dept. of Biochemistry          |       Tel: +44 71 387 7050 x3879
and Molecular Biology, University       |       Fax: +44 71 380 7193
College, London WC1E 6BT, U.K.          |
