We've now got a program that people may want to look at for doing this. It applies the Monte
Carlo approach to assessing the significance of a 2 x m contingency table clumped in a number of
different ways. It's called CLUMP, and it's available (or should be very soon) at
diamond.gene.ucl.ac.uk in /pub/packages/dcurtis. Our preliminary trials suggest that clumping
to maximise the chi-squared is fairly powerful. It may be slightly more powerful than using the
raw table, though there's hardly anything in it. In some circumstances it does seem to have a
useful increase in power over using the table made by clumping together columns whose cells
have small expected values. It is also more powerful (perhaps not surprisingly) than
tests which compare one column against all the others. The main advantage seems to be in using
the Monte Carlo method itself, since this allows one to deal with the raw table without having
to worry about small expected values.
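
For anyone who wants to see the general idea in code, here is a minimal sketch of a Monte Carlo
significance test on a raw 2 x m table. It is not taken from CLUMP and may differ from what CLUMP
does internally. It assumes the null tables are generated by shuffling individuals between the
two rows with both sets of marginal totals held fixed, and that the p-value is estimated as
(hits + 1) / (n_sims + 1). The function names, n_sims and seed are all made up for illustration.

```python
import random


def chi_squared(table):
    """Pearson chi-squared for a 2 x m table given as two lists of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip cells in empty columns
                stat += (observed - expected) ** 2 / expected
    return stat


def monte_carlo_p(table, n_sims=1000, seed=0):
    """Estimate a p-value by simulating tables with the observed margins.

    Assumption: null tables are produced by shuffling the column labels of
    all individuals and dealing the first sum(table[0]) of them to row 0.
    """
    rng = random.Random(seed)
    m = len(table[0])
    # One column label per individual, pooled over both rows.
    labels = [j for row in table for j, count in enumerate(row) for _ in range(count)]
    n_first = sum(table[0])
    observed = chi_squared(table)
    hits = 0
    for _ in range(n_sims):
        rng.shuffle(labels)
        sim = [[0] * m, [0] * m]
        for k, j in enumerate(labels):
            sim[0 if k < n_first else 1][j] += 1
        # Small tolerance so ties are not lost to floating-point rounding.
        if chi_squared(sim) >= observed - 1e-12:
            hits += 1
    # Counting the observed table itself keeps the estimate above zero.
    return (hits + 1) / (n_sims + 1)


if __name__ == "__main__":
    # Hypothetical counts for two groups (rows) over four columns.
    example = [[12, 3, 1, 4], [5, 9, 2, 4]]
    print(chi_squared(example), monte_carlo_p(example))
```

Because the reference distribution is built by simulation rather than taken from the
chi-squared distribution, small expected values in some cells do not invalidate the p-value.
The same loop could be used with any other statistic, such as the maximum chi-squared over a
set of clumpings, by swapping out chi_squared.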
All comments gratefully received...
Dave Curtis