Hi there,
I am a newcomer to the field of linkage analysis. I have a couple of
questions regarding my ongoing project, and I would really appreciate
your help.
I am a molecular biologist, and I am designing a genome-wide mapping
experiment to identify loci (and, hopefully, eventually genes) that
predispose an individual infected with Helicobacter pylori to stomach
cancer. In other words, it is well known that H. pylori causes stomach
cancer, but certain people with specific mutations in as yet unknown
genes may be much more susceptible to H. pylori-induced carcinogenesis.
I have collected about 20 extended families, each of which has more
than three stomach cancer patients, and I have tested the H. pylori
status of each patient (i.e., infected or not infected). Now my
questions are:
One, how do I calculate the sample size needed to give adequate
statistical power for the linkage analysis?
Two, I believe that I need to use a nonparametric model for the
analysis. Because there are two variables here, i.e., cancer status and
H. pylori infection status, how do I calculate the NPL Z score and the
LOD score so that I can compare a given locus between the patients
who have cancer and are H. pylori infected and the patients who have
cancer but are not infected? Assume that I am using the MERLIN or
GENEHUNTER programs.
I know these questions may sound stupid to you guys, but please help.
Thanks, Chris