I've been very interested in this discussion because we have just been working on a method
which we think may be helpful for this situation. Our method essentially consists of two
phases. If we imagine that there are two rows (for cases and controls) and one column for
each allele, then we begin by lumping the columns together into only two columns in such a
way as to maximise the chi-squared statistic. Obviously the chi-squared value that one would
get from such a two-by-two table is not a real chi-squared statistic, because the grouping was
chosen precisely to make it as large as possible, so we then carry out Monte Carlo simulations
to find out how unlikely it would be to produce such a value by chance. These simulations are
conditioned on the marginal totals of the real table. The results of these simulations
provide a significance level for the original observations.
The appealing feature of this approach is that it tests the post hoc hypothesis that springs to
mind when one views the original contingency table - "Gee, it looks like alleles B and D are
commoner in the cases than in the controls". Preliminary tests suggest that it
is also more powerful than using the raw chi-squared statistic from the original table, which is
what we had hoped.
The purpose of this posting is two-fold. Firstly, if anyone would like a copy of this program then
I can make it available - as C source and/or a DOS executable. It isn't documented yet, but there's
not much to it so that shouldn't be a problem.
Secondly - has anyone described this method previously? We (myself + Pak Sham) have started writing
it up, but if it's old hat I'd be grateful if someone could let me know before an editor/reviewer
does. Thanks to everyone, especially to Ellen Wijsman for her helpful reply to my note.
Dave Curtis
Institute of Psychiatry Janet: dcurtis at UK.AC.MRC.HGMP
Denmark Hill Elsewhere: dcurtis at HGMP.MRC.AC.UK
London SE5 8AF
Tel 071-703 5411