From the Graduate Institute of Epidemiology, College of Public Health, National Taiwan University, Taipei, Taiwan, Republic of China.
Received for publication September 10, 2003; accepted for publication September 16, 2003.
I am grateful to Dr. Weinberg for her insightful commentary (1) on my paper (2). Weinberg looks forward eagerly to a day when researchers will be able to search the whole genome for susceptibility loci for complex diseases (1). Likewise, I notice with excitement that the conventional "risk factor" epidemiology as we know it has undergone a profound change. In this postgenomic era, epidemiology can be about gene mapping no less than it has long been about odds ratios and confidence intervals for, say, smoking and lung cancer.
To obtain a weighting scheme for the various family configurations encountered in actual practice, $D_i$ is defined as the allele count for the case ($C_i$) minus the allele count for one of the following: 1) the nontransmitted, $N_i$ (case-parents data); 2) the imputed nontransmitted (case-sibling data); 3) the spouse, $S_i$ (case-spouse data); and 4) the imputed spouse (case-offspring data). Here, the "nontransmitted" refers to the parental alleles not transmitted to the case (1); the "imputed nontransmitted" has an allele count of $2\bar{B}_i - C_i$, where $\bar{B}_i$ is the mean allele count of the control siblings; and the spouse and the imputed spouse are defined as in my paper (2). The $D_i$'s defined in this way all have the same "mean" (e1 in my paper (2)) under the alternative hypothesis, irrespective of family configuration. The statistical efficiencies of the various family configurations are then proportional to the inverses of their respective "variances" (e2 in my paper (2)).
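The four definitions of the difference score can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and both imputed counts are assumed to be twice the mean allele count of the relatives minus the case's own count (each child is expected to carry half of the two parents' combined alleles). The imputed spouse is formally defined in paper (2), so that line in particular should be read as an assumption.

```python
from statistics import mean


def d_case_parents(case, nontransmitted):
    """D = C - N, with N the count of parental alleles not transmitted."""
    return case - nontransmitted


def d_case_sibling(case, control_sibs):
    """D = C - (imputed nontransmitted count)."""
    # ASSUMPTION: imputed nontransmitted = 2 * (mean sibling count) - C
    imputed_nontransmitted = 2 * mean(control_sibs) - case
    return case - imputed_nontransmitted


def d_case_spouse(case, spouse):
    """D = C - S, with S the spouse's allele count."""
    return case - spouse


def d_case_offspring(case, offspring):
    """D = C - (imputed spouse count)."""
    # ASSUMPTION: imputed spouse = 2 * (mean offspring count) - C
    imputed_spouse = 2 * mean(offspring) - case
    return case - imputed_spouse
```

Under these assumptions the case-sibling and case-offspring scores both reduce to twice the difference between the case's count and the relatives' mean count, which is why the two configurations behave so similarly.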
Table 1 presents the relative efficiencies for the various family configurations (relative to case-parents data). Note that, in this paper, the disequilibrium test (1), but not the transmission/disequilibrium test (2), was applied to the case-parents data. For brevity, I present only the case with the allele frequency set at $P = 0.1$. It is of interest to note the following. First, the case-spouse data and the case-parents data have exactly the same efficiency. Second, the case-offspring data and the case-sibling data have roughly the same relative efficiency when the number of offspring per family and the number of control siblings per family are equal. Third, the relative efficiency of the case-offspring (case-sibling) data with one offspring (control sibling) per family is ~0.5. These findings largely confirm Weinberg's speculations (1).
The variance formula simplifies considerably under the null. We have, dropping the subscript $i$, $\mathrm{Var}(D) = \mathrm{Var}(C - N) = \mathrm{Var}(C) + \mathrm{Var}(N) = 4P(1 - P)$ for case-parents data. Likewise, $\mathrm{Var}(D) = \mathrm{Var}(C - S) = \mathrm{Var}(C) + \mathrm{Var}(S) = 4P(1 - P)$ for case-spouse data. Because the numbers of alleles shared "identical by descent" (3) by a sibling pair are 2 (probability = 0.25), 1 (probability = 0.5), and 0 (probability = 0.25), we have $\mathrm{Cov}(B_1, B_2) = \mathrm{Cov}(C, B_1) = 0.25 \times 2P(1 - P) + 0.5 \times P(1 - P) + 0.25 \times 0 = P(1 - P)$, with $B_1$ and $B_2$ being the allele counts for the first and the second control siblings, respectively. Thus, for case-sibling data with $x$ control siblings per family,

$$\mathrm{Var}(D) = 4P(1 - P)\left(1 + \frac{1}{x}\right).$$

Similarly, we can show that

$$\mathrm{Var}(D) = 4P(1 - P)\left(1 + \frac{1}{y}\right)$$

for case-offspring data with $y$ offspring per family.
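The sibling-pair covariance used above can be checked by exact enumeration rather than by the identical-by-descent argument. The sketch below assumes Hardy-Weinberg proportions and random mating under the null, so the four parental alleles are independent draws with frequency $P$. It also assumes that the null variance of the difference score with $n$ relatives per family takes the form $4P(1 - P)(1 + 1/n)$, which is what those covariances imply when the imputed count is twice the relatives' mean minus the case's count. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def sib_pair_null_moments(p):
    """Exact (Var(C), Cov(C, B1)) for a case and one sibling under the null."""
    q = 1 - p
    e_c = e_b = e_cc = e_cb = Fraction(0)
    # Enumerate the four parental alleles (1 = the allele being counted).
    for haps in product((0, 1), repeat=4):
        w_h = Fraction(1)
        for h in haps:
            w_h *= p if h else q
        father, mother = haps[:2], haps[2:]
        # Enumerate which parental allele each child receives.
        for a, b, c, d in product((0, 1), repeat=4):
            w = w_h / 16
            case = father[a] + mother[b]
            sib = father[c] + mother[d]
            e_c += w * case
            e_b += w * sib
            e_cc += w * case * case
            e_cb += w * case * sib
    return e_cc - e_c ** 2, e_cb - e_c * e_b


def null_variance(p, n=None):
    """ASSUMED form 4P(1-P)(1 + 1/n); n=None stands for n = infinity."""
    inv = Fraction(0) if n is None else Fraction(1, n)
    return 4 * p * (1 - p) * (1 + inv)


def relative_efficiency(n):
    """Null efficiency relative to case-parents data; P cancels out."""
    p = Fraction(1, 10)
    return null_variance(p) / null_variance(p, n)
```

With $P = 0.1$, `sib_pair_null_moments(Fraction(1, 10))` returns 9/50 and 9/100, that is, $2P(1 - P)$ and $P(1 - P)$. Under the null, `relative_efficiency(1)` is exactly 1/2, in line with the value of ~0.5 noted for one control sibling or one offspring per family.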
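Treating case-parents data as case-sibling data with $x = \infty$ and case-spouse data as case-offspring data with $y = \infty$, the variance (under the null) of a weighted average of the two lines of relatives (superscripts I and II) is

$$\mathrm{Var}\left[tD^{\mathrm{I}} + (1 - t)D^{\mathrm{II}}\right] = t^2 \times 4P(1 - P)\left(1 + \frac{1}{x}\right) + (1 - t)^2 \times 4P(1 - P)\left(1 + \frac{1}{y}\right) + 2t(1 - t)\,\mathrm{Cov}\left(2C - \frac{2}{x}\sum_{k=1}^{x} B_k,\; 2C - \frac{2}{y}\sum_{k=1}^{y} O_k\right),$$

where $O_k$ is the allele count for the $k$th offspring. With the use of the identical-by-descent probabilities again (for sibling, parent-offspring, and uncle-nephew pairs) (3), the last term can be shown to be $t(1 - t) \times 4P(1 - P)$. Thus, the variance is a quadratic function of $t$ and has a minimum value of

$$P(1 - P)\,\frac{4(1 + 1/x)(1 + 1/y) - 1}{1 + 1/x + 1/y}$$

when

$$t = \frac{1 + 2/y}{2(1 + 1/x + 1/y)}.$$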
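The minimization over $t$ can be illustrated numerically. The sketch below is not taken from the paper. It assumes that the two lines of relatives have null variances $4P(1 - P)(1 + 1/x)$ and $4P(1 - P)(1 + 1/y)$, that $t$ is the weight on the sibling (or parents) line, and that the cross term equals $t(1 - t) \times 4P(1 - P)$ as stated above. The closed-form weight in `optimal_weight` is obtained by setting the derivative of that quadratic to zero. Function names are hypothetical, and `None` stands for an infinite number of relatives.

```python
from fractions import Fraction


def _inv(n):
    """1/n, with n=None standing for n = infinity."""
    return Fraction(0) if n is None else Fraction(1, n)


def combined_variance(t, x, y, p):
    """Null variance of t*D_I + (1 - t)*D_II under the assumptions above."""
    a = 1 + _inv(x)  # sibling (or parents) line
    b = 1 + _inv(y)  # offspring (or spouse) line
    return 4 * p * (1 - p) * (a * t * t + b * (1 - t) ** 2 + t * (1 - t))


def optimal_weight(x, y):
    """Weight t that minimizes the quadratic in combined_variance."""
    a = 1 + _inv(x)
    b = 1 + _inv(y)
    return (2 * b - 1) / (2 * (a + b - 1))


def minimum_variance(x, y, p):
    """Variance at the optimal weight."""
    return combined_variance(optimal_weight(x, y), x, y, p)
```

For example, with equal numbers of relatives in the two lines the weight is 1/2, and with two control siblings and one offspring (`optimal_weight(2, 1)`) it is 3/5, so the better-informed line receives the larger weight.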
The last row of table 1 presents the relative efficiencies calculated from these formulas. It can be seen that the approximation is satisfactory as long as the risk parameter is not too far away from its null value of 1.0. Therefore, the following weighted disequilibrium test is proposed (with the subscript $i$ denoting the $i$th family):
with
and
which is distributed as a 1-degree-of-freedom chi-square distribution under the null.