SUMMARY: comparing allele frequencies

Joe Felsenstein joe at evolution.u.washington.edu
Fri Apr 8 17:32:59 EST 1994

```In article <2nrnpu$70d at mercury.hgmp.mrc.ac.uk> dcurtis at crc.ac.uk
(Dr. David Curtis) writes:

> I've been very interested in this discussion because we have just been
> working on a method which we think may be helpful for this situation.
> Our method essentially consists of two phases. If we imagine that there
> are two rows (for cases and controls) and one column for each allele,
> then we begin by lumping together columns into only two columns in such
> a way as to maximise the chi-squared statistic. Obviously the chi-squared
> value that one would get from such a two-by-two table is not a real
> chi-squared statistic, so we then carry out Monte Carlo simulations to
> find out how unlikely it would be to produce such a value by chance.
...
> The appealing feature of this approach is that it tests the post hoc
> hypothesis that springs to mind when one views the original contingency
> table - "Gee, it looks like alleles B and D are commoner in the cases
> than in the controls".
...
>Secondly - has anyone described this method previously? We (myself + Pak Sham) have started writing

This method is closely related to one of EJ Williams (Biometrika, 1952)
which is also known under the name "correspondence analysis".  In those
cases one is trying to find a set of weights for rows and columns so as to
maximize the interaction between rows and columns.  A presentation of
Williams's results is given by A.E. Maxwell (who was Professor in Curtis's
own Institute!) in his book "Multivariate Analysis in Behavioural Research"
(1977), in Chapter 10.

An extension of Williams's method would be to restrict the weights to 0's
and 1's, which is equivalent to grouping rows and columns in the ways
Curtis describes.  It follows from Williams's results that it would be
conservative to test the grouping with R+C-3 degrees of freedom, where R
and C are the numbers of rows and columns in the original table.  The
permutation test Curtis suggests would of course be more exact.

This is an important method for people to know about and a good way to
correct for an a posteriori grouping of rows and columns.

-----
Joe Felsenstein, Dept. of Genetics, Univ. of Washington, Seattle, WA 98195
Internet:         joe at genetics.washington.edu     (IP No. 128.95.12.41)
Bitnet/EARN:      felsenst at uwavm

```
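
For readers who want to try the two-phase procedure discussed above, here is a minimal Python sketch. It is not Curtis's own program. The function names (`chi2_2x2`, `max_grouped_chi2`, `monte_carlo_p`) and the example counts are made up for illustration, the search over groupings is brute force (so it is only practical for a modest number of alleles), and the Monte Carlo step is assumed to be a permutation of case/control labels with both margins held fixed, which the post does not spell out.

```python
import random


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared for the table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # an empty margin carries no information
    return n * (a * d - b * c) ** 2 / denom


def max_grouped_chi2(cases, controls):
    """Largest 2x2 chi-squared over all ways of lumping alleles into two groups."""
    k = len(cases)
    n_cases, n_controls = sum(cases), sum(controls)
    best = 0.0
    # Allele 0 always stays in the second group, so each split is tried once.
    for mask in range(1, 2 ** (k - 1)):
        a = sum(cases[i + 1] for i in range(k - 1) if (mask >> i) & 1)
        c = sum(controls[i + 1] for i in range(k - 1) if (mask >> i) & 1)
        best = max(best, chi2_2x2(a, n_cases - a, c, n_controls - c))
    return best


def monte_carlo_p(cases, controls, n_sims=1000, seed=0):
    """Return (observed maximised statistic, Monte Carlo p-value).

    Assumption: case/control labels are shuffled over the pooled allele
    observations, and the grouping is re-maximised in every simulated table.
    """
    rng = random.Random(seed)
    k = len(cases)
    totals = [cases[i] + controls[i] for i in range(k)]
    pooled = [i for i in range(k) for _ in range(totals[i])]
    n_cases = sum(cases)
    observed = max_grouped_chi2(cases, controls)
    hits = 0
    for _ in range(n_sims):
        rng.shuffle(pooled)
        sim_cases = [0] * k
        for allele in pooled[:n_cases]:
            sim_cases[allele] += 1
        sim_controls = [totals[i] - sim_cases[i] for i in range(k)]
        if max_grouped_chi2(sim_cases, sim_controls) >= observed - 1e-12:
            hits += 1
    # Count the observed table as one of the simulations.
    return observed, (hits + 1) / (n_sims + 1)


# Hypothetical counts for alleles A, B, C, D in cases and controls.
stat, p = monte_carlo_p([12, 25, 10, 23], [20, 14, 22, 14], n_sims=500)
print("maximised chi-squared:", round(stat, 3), "Monte Carlo p:", round(p, 4))
```

Because the grouping is re-maximised inside every simulated table, the p-value already accounts for the a posteriori choice of which alleles to lump together. Referring the maximised statistic to an ordinary 1-df chi-squared distribution would not.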