Codon Frequency tables:
Both Don Gilbert and John Obrien have asked for "codon frequency tables"
for various groups of organisms. While one can easily generate
codon frequency tables for individual genes (e.g. using CODONFREQUENCY
in the GCG package) and then combine them to get a *pooled* table for all
the genes in one species, one must be *very* careful in using such
pooled tables.
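
To make the pooling step concrete, here is a rough Python sketch. It is
not the GCG CODONFREQUENCY program; the function names (codon_counts,
pooled_counts, per_thousand) are made up for illustration, and it assumes
each sequence is a coding sequence already in reading frame.

```python
from collections import Counter

def codon_counts(cds):
    """Count codons in one coding sequence, read in frame from position 0."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))

def pooled_counts(genes):
    """Sum the per-gene codon counts into a single pooled table."""
    total = Counter()
    for cds in genes:
        total += codon_counts(cds)
    return total

def per_thousand(counts):
    """Convert raw codon counts to frequencies per 1000 codons."""
    n = sum(counts.values())
    return {codon: 1000.0 * c / n for codon, c in counts.items()}

# Hypothetical toy sequences, not real genes.
genes = ["ATGGCCGCC", "ATGGCTGCT"]
pooled = pooled_counts(genes)
```

Note that long or numerous genes dominate a table pooled this way, which
is one more reason to treat it with care.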
There is considerable variation in codon usage between genes in the
same organism. Clear trends are often seen. For example, Human
codon usage is dominated by variation in G+C content in the third
codon position: third position G+C content varies from about 35 per
cent to about 95 per cent. Would a *pooled* Human codon frequency
table with a third position G+C content of (say) 65 per cent be a
useful thing? In contrast, E.coli codon usage appears essentially
independent of third position G+C content.
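
As an illustration of how a pooled figure can hide this spread, the
following Python sketch computes third position G+C content per gene and
for the pooled set. The function names (gc3, pooled_gc3) and the toy
sequences are invented for the example; it assumes in-frame coding
sequences and is not the method used in any particular package.

```python
def third_bases(cds):
    """Return the third base of every complete codon in an in-frame sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    return cds[2:usable:3]

def gc3(cds):
    """Fraction of codons in one gene whose third base is G or C."""
    thirds = third_bases(cds)
    if not thirds:
        raise ValueError("sequence shorter than one codon")
    return sum(base in "GC" for base in thirds) / len(thirds)

def pooled_gc3(genes):
    """Third position G+C content over all codons of all genes combined."""
    thirds = "".join(third_bases(cds) for cds in genes)
    if not thirds:
        raise ValueError("no complete codons in input")
    return sum(base in "GC" for base in thirds) / len(thirds)

# Hypothetical toy sequences chosen to sit at the two extremes.
low_gene = "GCAGCTGCA"    # third positions A, T, A
high_gene = "GCCGCGGCC"   # third positions C, G, C
per_gene = [gc3(low_gene), gc3(high_gene)]
pooled = pooled_gc3([low_gene, high_gene])
```

Here the two toy genes have third position G+C contents of 0 and 1, while
the pooled value is 0.5, a figure that describes neither gene.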
I was involved in a review of this within-species diversity in codon
usage patterns with Paul Sharp's group at TCD in Dublin. We looked at
genes from E.coli (sample of 165), B.subtilis (76), S.cerevisiae (154),
S.pombe (40), D.melanogaster (84), and H.sapiens (290). This work is
published in:
Sharp PM, Cowe E, Higgins DG, Shields DC, Wolfe KH & Wright F (1988)
"Codon usage patterns in Escherichia coli, Bacillus subtilis,
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Drosophila
melanogaster and Homo sapiens; a review of the considerable
within-species diversity."
Nucleic Acids Research 16: 8207-8211
I hope this is of use.
Frank Wright
DAFS Molecular Biology Support (Statistics & Computing)
Scottish Agricultural Statistics Service
J.C.M.B., University of Edinburgh, Edinburgh EH9 3JZ, U.K.