Don Bowden "bowden at mgrp.bgsm.wfu.edu" writes:
> This is probably a dumb question, but... we have genotyped a
> micro-satellite in a large number of Caucasian and African-American samples and
> would like to compare the allele frequency distribution to see if they are
> different. I did this using a contingency table to calculate chi square
> and there is a significant difference. I seem to recall though that if any
> of the cells has less than 5 elements in it, chi square is not the appropriate
> way to go. What is the right way to do this, and more importantly, is there
> some textbook or other source which would show a simple-minded molecular
> biologist how to do it?
>
This is not a dumb question. It is not easy to deal with statistical analysis
of loci with lots of alleles, as is typical of micro-satellite repeats. You
could look at Bruce Weir's "Genetic Data Analysis" book; there is some stuff on
tests involving multiple locus markers. Depending on the number of alleles
it may be easy or hard; there has been quite a bit published in the last
few years on statistical tests involving extraordinarily polymorphic systems,
but this literature hasn't made it into books yet.
You are correct to be leery of tests which are based on large-sample
approximations when your samples aren't big enough. The "5" rule for the
chi-square test is more a rule-of-thumb than a hard-and-fast rule. For tables
with not too many cells, it is often possible to use exact permutation tests
instead. Rather than just consult a textbook if you are unsure of what
you are doing, why don't you see if your university has a statistical
consulting service? At least they might steer you to appropriate analyses,
even if you have to carry out the computations yourself.
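
As a rough illustration of the permutation idea mentioned above, here is a
minimal Python sketch. It is not a full exact test: it approximates one by
Monte Carlo, shuffling the population labels many times and asking how often
the shuffled chi-square statistic is at least as large as the observed one.
The function names, the allele labels, and the tiny data set are all made up
for illustration, and the sketch assumes each sampled allele can be treated
as an independent observation.

```python
import random
from collections import Counter


def chi_square_stat(group_a, group_b):
    """Pearson chi-square statistic for a 2 x k table of allele counts."""
    counts_a, counts_b = Counter(group_a), Counter(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = n_a + n_b
    stat = 0.0
    for allele in set(counts_a) | set(counts_b):
        column_total = counts_a[allele] + counts_b[allele]
        for observed, n in ((counts_a[allele], n_a), (counts_b[allele], n_b)):
            expected = n * column_total / total
            stat += (observed - expected) ** 2 / expected
    return stat


def permutation_p_value(group_a, group_b, n_perm=10000, seed=0):
    """Monte Carlo permutation p-value (hypothetical helper, not a library call).

    Population labels are shuffled n_perm times; the p-value is the share of
    shuffles whose statistic is at least as large as the observed one, with
    the usual +1 correction so the estimate is never exactly zero.
    """
    rng = random.Random(seed)
    observed = chi_square_stat(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Small tolerance guards against floating-point ties.
        if chi_square_stat(pooled[:n_a], pooled[n_a:]) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Made-up allele labels (repeat lengths), one entry per sampled allele.
    sample_1 = [12, 12, 13, 14, 14, 14, 15, 16, 16, 17]
    sample_2 = [12, 13, 13, 13, 15, 15, 16, 18, 18, 19]
    print(permutation_p_value(sample_1, sample_2, n_perm=5000))
```

Because the reference distribution comes from the data themselves rather
than from the large-sample chi-square approximation, sparse cells are less
of a worry here, though a statistician should still check that the
independence assumption suits the sampling design.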
Ellen Wijsman
Div of Medical Genetics, RG-25
and Dept of Biostatistics
University of Washington
Seattle, WA 98195
wijsman at u.washington.edu