correlation between codon and protein secondary structures

Dennis Farr defarr at use.usit.net
Fri Jan 2 15:07:34 EST 1998

In 1989, I gathered up information on the secondary structure of a few proteins 
and the corresponding DNA code for those proteins, where I could find both sets of
data for a protein. I found around twenty proteins for which I had both data sets.

I then checked the correlation between codon and secondary structure type for 
each amino acid that can be represented by multiple codons. (There are 20 standard 
amino acids and 64 codons, three of which are stop codons.) I found significant 
correlation coefficients for several of them.

This was an admittedly very small sample. I believe it would be quite easy to 
repeat my study using currently available datasets and come up with a much 
larger sample size with very little effort. 
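
As a rough illustration of how such a repeat study might be set up, here is a 
minimal Python sketch. It is not the method used in the 1989 study, whose details 
are not given here. It assumes the data has already been reduced to 
(amino acid, codon, secondary structure) observations, one per residue, and it 
substitutes a Pearson chi-square statistic on a codon-by-structure contingency 
table for the correlation coefficients mentioned above. The function name, the 
tuple layout, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def codon_structure_chi2(observations):
    """Chi-square statistic of codon vs. structure class, per amino acid.

    observations: iterable of (amino_acid, codon, structure) tuples
    (hypothetical layout, one tuple per residue).
    Returns {amino_acid: (chi2, degrees_of_freedom)}. Amino acids seen with
    fewer than two codons or fewer than two structure classes are skipped.
    """
    # One contingency table per amino acid, keyed by (codon, structure).
    tables = defaultdict(Counter)
    for aa, codon, ss in observations:
        tables[aa][(codon, ss)] += 1

    results = {}
    for aa, counts in tables.items():
        codons = sorted({c for c, _ in counts})
        structs = sorted({s for _, s in counts})
        if len(codons) < 2 or len(structs) < 2:
            continue
        total = sum(counts.values())
        row = {c: sum(counts[(c, s)] for s in structs) for c in codons}
        col = {s: sum(counts[(c, s)] for c in codons) for s in structs}
        chi2 = 0.0
        for c in codons:
            for s in structs:
                # Count expected if codon choice and structure were independent.
                expected = row[c] * col[s] / total
                chi2 += (counts[(c, s)] - expected) ** 2 / expected
        results[aa] = (chi2, (len(codons) - 1) * (len(structs) - 1))
    return results


# Toy data, invented purely to exercise the function (not real measurements).
toy = (
    [("K", "AAA", "H")] * 8 + [("K", "AAG", "H")] * 2
    + [("K", "AAA", "E")] * 2 + [("K", "AAG", "E")] * 8
    + [("M", "ATG", "H")] * 5   # single-codon amino acid, skipped
)
for aa, (chi2, dof) in codon_structure_chi2(toy).items():
    print(aa, round(chi2, 2), dof)
```

With a real dataset the statistic for each amino acid would still need to be 
compared against a chi-square distribution with the reported degrees of freedom, 
and corrected for the number of amino acids tested, before calling any result 
significant.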

I am seeking information on whether or not someone has done similar work, or has 
the resources to do so. I am no longer able to spend the kind of spare time I 
put into the first study, but would be glad to help out or provide additional 
details to anyone interested.

Caveats: I know the correlation I found is supposed to be impossible. I do not 
propose a direction for the arrow from cause to effect for the correlation I 
found, if it holds up under additional scrutiny. I am a computer programmer by 
trade, and a mathematician by training, not a molecular biologist. 

I believe the phenomenon I seem to have discovered, or at least conjectured, 
should be investigated. The cost of doing so is low. The impact of even a small 
correlation between structure and codon would be an improvement in protein 
structure prediction. Using codon rather than amino acid sequence as input to a 
protein structure prediction algorithm adds at most about 1.7 bits of information 
per amino acid to the input. If the additional information is at all relevant, the 
resulting predicted structure should be 'better'.
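
The size of that extra input can be checked with a short calculation. The sketch 
below is an upper bound only: it assumes every codon and every amino acid is 
equally likely, which real sequences do not satisfy, so the information actually 
gained per residue would be smaller. The function name is hypothetical.

```python
import math


def extra_bits(n_codons, n_amino_acids):
    """Upper bound on extra bits per residue when the input alphabet is
    codons instead of amino acids, assuming uniform usage of both."""
    return math.log2(n_codons) - math.log2(n_amino_acids)


# All 64 triplets versus the 20 standard amino acids.
print(round(extra_bits(64, 20), 2))
# Only the 61 sense codons (the three stop codons excluded).
print(round(extra_bits(61, 20), 2))
```

Both figures come out between 1.6 and 1.7 bits per residue.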
