As a follow-up to Harry Mangalam's summary of methods for representing amino
acid similarities, people might also be interested in the following two
references (one in press, one in print):
The first paper is something of a sequel to Willie Taylor's original Venn
diagram paper from 1986...

Taylor, W.R., Jones, D.T.
"Deriving an Amino Acid Distance Matrix."
J. Theor. Biol. in press. (1993)

ABSTRACT

Various methods were investigated to convert an amino acid similarity matrix
into a low-dimensional, metric distance matrix. Using projection techniques,
no unique transformation was found and of the many inversion forms
investigated, simple negation normalised by the diagonal elements produced a
good fit to the original data. An inter-row distance also gave a comparable
fit and when evaluated by weighted least-squares minimisation was found to
be preferable. A rank-ordered form of the matrices was also derived by
constraining neighbours to be equidistant in (3) space. This produced a
network configuration not unlike that produced in a previous analysis of
amino acid physico-chemical properties. The derived forms might find
applications in sequence alignment, including pattern matching algorithms,
and the construction of phylogenetic trees.
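
For anyone who wants to play with the two conversions named in the abstract, here is a minimal sketch in Python. The abstract does not give the formulas, so both functions are assumptions: "negation normalised by the diagonal elements" is read here as d(i,j) = (s(i,i) + s(j,j))/2 - s(i,j), and "inter-row distance" as the Euclidean distance between rows i and j of the similarity matrix. The 3x3 matrix S and the function names are made up for illustration and are not taken from the paper or from any published matrix.

```python
import math

# Toy symmetric similarity matrix for three hypothetical residues.
# The values are invented for illustration only.
S = [
    [4.0, 1.0, -2.0],
    [1.0, 5.0, 0.0],
    [-2.0, 0.0, 6.0],
]


def negated_distance(sim):
    """Assumed form: d[i][j] = (s[i][i] + s[j][j]) / 2 - s[i][j].

    Negating the similarity and offsetting by the mean of the two
    diagonal entries makes every self-distance exactly zero.
    """
    n = len(sim)
    return [
        [(sim[i][i] + sim[j][j]) / 2.0 - sim[i][j] for j in range(n)]
        for i in range(n)
    ]


def inter_row_distance(sim):
    """Assumed form: Euclidean distance between rows i and j of sim."""
    n = len(sim)
    return [
        [
            math.sqrt(sum((a - b) ** 2 for a, b in zip(sim[i], sim[j])))
            for j in range(n)
        ]
        for i in range(n)
    ]


if __name__ == "__main__":
    for name, dist in (("negated", negated_distance(S)),
                       ("inter-row", inter_row_distance(S))):
        print(name)
        for row in dist:
            print("  ", " ".join(f"{x:6.2f}" for x in row))
```

Both functions return a symmetric matrix with a zero diagonal when the input is symmetric. Neither form is guaranteed to satisfy the triangle inequality in general, so whether the result is a true metric has to be checked for the matrix at hand.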

The other reference is...

Taylor, W.R.
"A template based method of pattern matching in protein sequences."
Prog. Biophys. Mol. Biol. 54:159-252 (1989) (published in 1991).

This paper contains possibly the oldest reference in any molecular biology
paper: Wan, "The I Ching", privately published (1142 BC). If you want to
know what Gray codes, the I Ching, and the genetic code have to do with
representing amino acid similarities, then this is the paper for you!
---
David Jones
Biomolecular Structure and Modelling Unit
University College London