1 Department of Epidemiology, Mailman School of Public Health at Columbia University in the City of New York, NY, USA
2 Herbert Irving Comprehensive Cancer Center, Columbia University in the City of New York, NY, USA
Correspondence: Habibul Ahsan M.D., Department of Epidemiology, Mailman School of Public Health, Columbia University, 722 West 168th Street, Room 720-G, New York, NY 10032, USA. E-mail: ha37{at}columbia.edu
Abstract
Methods We provide a formula in epidemiological terms that illustrates the relationship between the gene-environment association measured among controls and the gene-environment association in the source population. Using this formula, we conducted sensitivity analyses to describe the circumstances in which controls can be used as a proxy for the source population when evaluating gene-environment independence. Lastly, we generated hypothetical cohort data to examine whether multivariable modelling approaches can be used to control for non-independence.
Results Our sensitivity analyses show that controls should not be used to evaluate gene-environment independence in the population, even when the baseline risk of disease is low (i.e. 1%), and the interaction and independent effects are moderate (i.e. risk ratio = 2). When the factors are associated, it is possible to remove bias arising from non-independence using standard statistical multivariable techniques in case-only analyses.
Conclusions Even when the disease risk is low, evaluation of gene-environment independence in controls does not provide a consistent test for bias in the case-only study. Given that control for non-independence is possible when the source of the non-independence can be conceptualized, the case-only design may still be a useful epidemiological tool for examining gene-environment interactions.
Accepted 13 July 2004
As the name implies, the entire sample used for a case-only analysis of gene-environment interaction consists of people with the disease (cases). Each person with disease is coded as positive or negative for a genetic factor (G) and an environmental factor (E). The case-only odds ratio (OR) is derived from the cross-categorization of the study sample on G and E status. This OR is the odds of E given the presence of G divided by the odds of E given the absence of G. As always, the OR is reversible,1 such that the case-only OR can also be interpreted as the odds of G given the presence of E divided by the odds of G given the absence of E.2 The premise behind the case-only study is that this OR can be interpreted as the multiplicative interaction between G and E in causing disease.
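As a minimal sketch of the calculation just described, the case-only OR can be computed from the four counts of cases cross-classified by G and E. The function name and the counts below are hypothetical, chosen only for illustration.

```python
def case_only_or(g1e1, g1e0, g0e1, g0e0):
    """Cross-product odds ratio among cases only.

    Arguments are counts of cases by exposure category:
    g1e1 = G+E+, g1e0 = G+E-, g0e1 = G-E+, g0e0 = G-E-.
    """
    odds_e_given_g = g1e1 / g1e0      # odds of E among G+ cases
    odds_e_given_not_g = g0e1 / g0e0  # odds of E among G- cases
    return odds_e_given_g / odds_e_given_not_g


# Hypothetical counts of cases (illustrative, not taken from the paper).
cases = {"g1e1": 40, "g1e0": 20, "g0e1": 100, "g0e0": 150}
or_e_by_g = case_only_or(**cases)

# Reversibility: swapping the roles of G and E gives the same OR.
or_g_by_e = case_only_or(cases["g1e1"], cases["g0e1"], cases["g1e0"], cases["g0e0"])
print(or_e_by_g, or_g_by_e)
```

Both calls return the same value (up to floating-point rounding), which is the reversibility property noted above.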
Epidemiologists were initially enthusiastic about the case-only study to detect gene-environment interactions. This innovative design offered an opportunity to study interaction that was more statistically efficient than the analogous case-control study and was not subject to common biases arising from control selection.3 However, the validity of the case-only study hinges on one assumption: that the genetic and environmental factors of interest are independent of one another. More recently, concerns over violations of this assumption have caused the initial enthusiasm to wane.4-6
It has been shown that the validity of case-only estimates of gene-environment interaction is highly susceptible to bias arising from non-independence between G and E.4 Many researchers advocate the use of non-diseased controls selected from case-control studies to verify the independence assumption.2,7-10 This examination of independence assumes that the controls provide an appropriate proxy for the gene-environment association in the population that gave rise to the cases. In practice, some authors have performed case-only analyses after observing GE independence in controls,11,12 while others have rejected case-only analyses upon finding GE associations in controls.13-15 We will show that even when the disease risk is relatively low, controls may not provide a good approximation of the gene-environment association in the underlying population. Given the susceptibility to bias and that evaluation of independence may be more problematic than previously thought, it may seem that the case-only study is of limited utility. However, evaluation of and adjustment for GE non-independence warrants further consideration.
For many metabolic polymorphisms, associations between genetic and environmental factors seem unlikely since these genes rarely produce symptoms and individuals have little opportunity to learn their genetic status. Thus, behavioural modification of environmental factors based on genetic status is unlikely for these types of genes. In fact, examples of independence between genetic and environmental factors appear in the recent literature on Mendelian randomization.16
For circumstances when gene-environment associations are likely, it has been mentioned that the bias in case-only estimates caused by gene-environment associations can be controlled using stratification.17 Several studies have controlled for third variables in case-only analyses without any explanation.11,18-20 It is not clear whether the investigators were attempting to remove bias due to gene-environment associations, or simply including variables in multivariable models because they are potential confounders of main effects. Thus far, the concept and procedures to control for non-independence between genetic and environmental factors are undeveloped, and there are no methodological guidelines for what types of third variables should be included in case-only analyses.
Despite criticisms of the case-only study, several applications of the case-only design can be found in recent publications.11,12,14,15,18-22 For this reason, and because of the potential advantages offered by the design, further development of the case-only design is warranted. Here, we show that the case-only study may still prove useful in some circumstances, and may, in fact, lead to more valid estimation of gene-environment interaction than the analogous case-control study.
In this paper, we: (1) provide the conceptual basis for the case-only OR and its relationship to analyses of cohort data using simple epidemiological terms; (2) demonstrate the extent to which non-diseased controls can be used to evaluate gene-environment independence in the population from which the cases arose; and (3) show that control for non-independence between the genetic and environmental factors is possible using standard modelling techniques. In short, we provide a clearer conceptual basis for understanding how bias arises in the case-only study and demonstrate how such bias might be removed.
The case-only study design
To further the development of the case-only design, we provide a conceptual description of the mathematical relationship between the case-only OR and the estimate of multiplicative interaction based on RRs. Our approach, which begins with a description of the underlying cohort and ends with the case-only estimate of multiplicative interaction measured with RRs, incorporates the prior work of Yang et al.26 and Schmidt and Schaid.2 Understanding the relationship between the case-only OR and the measure of interaction based on RRs is essential for understanding the nature of bias in the case-only study.
Conceptually, the interaction between G and E refers to the extent to which the joint effect of the two factors on disease (D) differs from the independent effects for both factors. The joint effect of G and E is the effect on D due to the presence of both factors. The independent effect of each of the factors is its effect on D in the absence of the other. Multiplicative interaction is assessed by comparing the joint effect with the product of the independent effects. For instance, if the independent effect of G equals 3 and the independent effect of E equals 2, then we would expect the joint effect of G and E to be 6 if there is no multiplicative interaction. If G and E do interact to cause D, then we would expect the joint effect to be something other than 6, the product of their independent effects. This measure of multiplicative interaction is equivalent to stratified analyses (i.e. the ED association in G+ divided by the ED association in G-).
Population N is a hypothetical population that serves as the basis for a cohort study. The members of this population are categorized according to their G and E status and followed for incident disease. Assuming these two exposures are dichotomous, individuals are classified according to four exposure categories. This cohort study is illustrated in Figure 1a.
To demonstrate the calculation of the interaction between G and E using RRs, we have organized the cells a1 to d2 into the tables presented in Figure 1b. The joint and independent effects of G and E, and the calculation of the gene-environment interaction are also presented. The baseline risk of disease, or the risk of disease attributable to factors other than G and/or E, is represented by (c2/N00). The RR for the joint effect of G and E (RR_GE) is calculated by comparing the risk of disease among those who are G+E+ to this baseline risk. The RR for the independent effect of G (RR_G) compares the risk of disease among those who are G+E- to the baseline risk. Similarly, the RR for the independent effect of E (RR_E) compares the risk of disease among those who are G-E+ to the baseline risk. The baseline risk of disease serves as the referent risk for all three effects. Using these components, the gene-environment interaction based on risk ratios (G x E_RR) is measured by dividing the joint RR by the product of the independent RRs (Figure 1c). In our notation, GE refers to the association between G and E, while G x E refers to the interaction between G and E.
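In symbols, the quantities described above can be written as follows. The shorthand is an assumption introduced here for illustration and is not a reproduction of Figure 1: $r_{ge}$ denotes the risk of disease in the stratum with G status $g$ and E status $e$ (1 = present, 0 = absent), so that $r_{00}$ is the baseline risk.

```latex
\begin{align*}
RR_{GE} &= \frac{r_{11}}{r_{00}}, \qquad
RR_{G} = \frac{r_{10}}{r_{00}}, \qquad
RR_{E} = \frac{r_{01}}{r_{00}}, \\[4pt]
G \times E_{RR} &= \frac{RR_{GE}}{RR_{G}\, RR_{E}}
  = \frac{r_{11}\, r_{00}}{r_{10}\, r_{01}} .
\end{align*}
```

For example, with independent effects of 3 for G and 2 for E, a joint RR of 6 gives an interaction estimate of 1 (no multiplicative interaction), whereas a joint RR of 12 gives an interaction estimate of 2.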
Using the same notation and 2 x 2 tables from Figure 1, the joint and independent effects of G and E measured by ORs can also be derived. Just as the joint and independent effects measured by RRs are used to calculate the G x E_RR, the joint and independent effects measured by ORs are used to calculate the interaction estimate based on ORs (G x E_OR).* These ORs can be calculated from a cumulative incidence case-control study (i.e. traditional case-control study), in which all of the diseased people from the underlying cohort comprise the cases and a random sample of the non-diseased people taken at the end of the follow-up period comprise the controls. The G x E_OR derived from this case-control study is the same G x E_OR that would have been obtained had all of the non-diseased been used.
The case-only study can be conceptualized as an analysis of all individuals with disease illustrated in Figure 1. In Figure 2a, we present the calculation of the case-only OR using the notation from Figure 1. Figure 2 demonstrates how the case-only OR is equivalent to the G x E_RR. Using algebraic manipulation, it becomes apparent that the case-only OR (term I) is embedded within the G x E_RR.
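The algebraic manipulation can be sketched as follows. The notation is assumed here rather than taken from Figure 2: $D_{ge}$ is the number of cases and $N_{ge}$ the number of cohort members with G status $g$ and E status $e$ (1 = present, 0 = absent), so that the risk in each stratum is $D_{ge}/N_{ge}$.

```latex
\begin{align*}
G \times E_{RR}
  &= \frac{(D_{11}/N_{11})\,(D_{00}/N_{00})}{(D_{10}/N_{10})\,(D_{01}/N_{01})} \\[4pt]
  &= \underbrace{\frac{D_{11}\, D_{00}}{D_{10}\, D_{01}}}_{\text{case-only OR}}
     \times
     \underbrace{\frac{N_{10}\, N_{01}}{N_{11}\, N_{00}}}_{1 / (\text{GE OR in the cohort})} .
\end{align*}
```

When the GE OR in the cohort equals 1, the second factor drops out and the case-only OR equals the interaction estimate based on RRs. Otherwise, the case-only OR equals that interaction estimate multiplied by the GE OR in the cohort.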
Problems with using controls to assess non-independence
Clearly, the validity of the case-only OR is sensitive to GE associations in the source population. In the presence of a GE association, the case-only OR is biased; the extent of bias depends on the magnitude of association.2,4 However, the likelihood of GE associations is less clear. The current evidence for non-independence between genetic and environmental factors is limited and derives from associations observed among controls.13-15 It is recommended that case-only studies be interpreted cautiously, based on the mathematical susceptibility to bias4 and findings of non-independence among controls.13-15 However, as we show below, the assessment of GE independence in controls is problematic, making GE associations observed in controls difficult to interpret.
Using controls to approximate the GE OR in the underlying population can lead to the rejection of valid case-only data, an overestimation of the underlying interaction, or a finding of interaction when none exists. The following two examples illustrate the consequences of using the GE OR in the controls to approximate the GE OR in the base population.
In the first example (Figure 3), G and E are independent in the underlying cohort; that is, the GE OR is 1.0. The baseline risk of disease [p(D|G-E-)] is 4% and the overall risk of disease [p(D)] is 5%. Using the equations in Figure 1, the RR for the independent effect of G is 2 [RR_G = (22/280)/(549/13 720)] and the RR for the independent effect of E is also 2 [RR_E = (470/5880)/(549/13 720)]. The interaction estimate based on RRs (i.e. G x E_RR) from the underlying cohort study is 2.5.
In the next example (Figure 4), the GE OR in the total cohort is 2.0, a violation of the independence requirement of the case-only design. The baseline risk of disease is 4% and the overall risk of disease is 8%. The RR for the independent effect of G is 6 [RR_G = (132/548)/(538/13 452)] and the RR for the independent effect of E is 2.7 [RR_E = (599/5548)/(538/13 452)]. The interaction estimate based on RRs from the underlying cohort study is 1.
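The two examples can be checked numerically with the sketch below. The counts for the G+E-, G-E+ and G-E- strata are those quoted in the text. The G+E+ stratum is not quoted, so the code back-calculates it from the stated GE OR and the stated interaction estimate; this is an assumption, and the derived values may differ slightly from those in Figures 3 and 4 because of rounding. Function and variable names are illustrative.

```python
def odds_ratio(n11, n10, n01, n00):
    """Cross-product OR for counts ordered G+E+, G+E-, G-E+, G-E-."""
    return (n11 * n00) / (n10 * n01)


def rebuild_cohort(n10, n01, n00, d10, d01, d00, ge_or, gxe_rr):
    """Derive the G+E+ cell and summarise cases, controls and cohort.

    n.. are cohort members and d.. are cases in each stratum. The G+E+
    stratum is NOT quoted in the text, so it is back-calculated here from
    the stated GE OR (stratum size) and the stated G x E_RR (its risk).
    """
    n11 = ge_or * n10 * n01 / n00
    risk11 = gxe_rr * (d10 / n10) * (d01 / n01) / (d00 / n00)
    d11 = risk11 * n11
    persons = (n11, n10, n01, n00)
    cases = (d11, d10, d01, d00)
    controls = tuple(n - d for n, d in zip(persons, cases))
    return {
        "rr_g": (d10 / n10) / (d00 / n00),
        "rr_e": (d01 / n01) / (d00 / n00),
        "ge_or_cohort": odds_ratio(*persons),
        "ge_or_controls": odds_ratio(*controls),
        "case_only_or": odds_ratio(*cases),
    }


# First example: GE OR = 1.0 in the cohort, G x E_RR = 2.5.
example_1 = rebuild_cohort(280, 5880, 13720, 22, 470, 549, ge_or=1.0, gxe_rr=2.5)
# Second example: GE OR = 2.0 in the cohort, G x E_RR = 1.
example_2 = rebuild_cohort(548, 5548, 13452, 132, 599, 538, ge_or=2.0, gxe_rr=1.0)

for name, result in (("example 1", example_1), ("example 2", example_2)):
    print(name, {key: round(value, 2) for key, value in result.items()})
```

Under these assumptions, the controls in the first example show a GE OR of roughly 0.7 although G and E are independent in the cohort, so a valid case-only OR of 2.5 could be rejected. In the second example the controls show a GE OR close to 1 although the cohort GE OR is 2.0, and the case-only OR of 2 would suggest an interaction where none exists.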
These examples show that the use of controls to evaluate GE independence in the source population is problematic. Even when the disease risk is low, the GE OR in the controls may not accurately reflect the GE OR in the source population. Therefore, the observation of independence or non-independence between G and E among controls does not provide a consistent test for bias in case-only analyses.
Below, we provide a formula to describe the circumstances in which the OR for the gene-environment association derived from controls can be used to approximate the gene-environment OR in the source population (Equation 1).
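$$\mathrm{OR}_{GE \mid \text{controls}} = \mathrm{OR}_{GE \mid \text{population}} \times \frac{[1 - p(D \mid G{-}E{-})\, RR_{GE}]\,[1 - p(D \mid G{-}E{-})]}{[1 - p(D \mid G{-}E{-})\, RR_{G}]\,[1 - p(D \mid G{-}E{-})\, RR_{E}]} \qquad (1)$$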
All of the factors that determine the relationship between the GE OR in the controls and the GE OR in the source population are visible in Equation 1. They are: the baseline risk of disease [p(D|G-E-)], the RRs representing the independent effects of the genetic and environmental factors (RR_G and RR_E, respectively), and their joint effect on disease (RR_GE). Alternatively, the relationship between the GE OR in controls and the GE OR in the population can be expressed in terms of the disease risk in each of the four exposure categories (G+E+, G+E-, G-E+, G-E-) by factoring out the baseline risk. Equation 1 was adapted from previous research, in which two of the authors of this report (UBC and NMG) derived a formula that describes the mathematical relationship between interaction estimates based on ORs and RRs.27 In fact, the factors that cause divergence between the GE OR in the controls and the GE OR in the underlying population are the same factors that cause the divergence between the G x E_OR and the G x E_RR. Therefore, the circumstances in which ORs and RRs yield materially different estimates of interaction are the same circumstances in which controls cannot be used to estimate the GE OR in the population.
To illuminate some situations in which valid case-only results would be rejected based on an observed GE association in controls, we used Equation 1 to perform sensitivity analyses. Specifically, we assessed the impact of the baseline risk of disease and the independent effect of G on the GE OR in non-diseased controls under GE independence in the source population. For these analyses, the independent effect of E and the G x E_RR were each equal to 2.
Figure 5 highlights situations in which the baseline risk of disease ranges from 0.1% to 6%. As illustrated in the figure, the GE OR in the controls is a good approximation of the GE OR of 1 in the population when either the baseline risk of disease is low (baseline risk is 0.1%) or when the baseline disease risk is close to 1% and the independent effect of G is moderate (RR_G < 2.5). However, as the baseline risk of disease approaches 3%, the GE OR in the controls begins to diverge appreciably from the GE OR in the population. For example, when the independent effect of G is 2.3 and the baseline risk is 3%, the GE OR in the controls is 0.8. This finding among controls can have important implications for interpretation. A researcher may conclude that the assumption of independence required for a valid case-only analysis is not met and reject the case-only design, despite GE independence in the population. With even small further increases in the baseline risk of disease, the approximation becomes progressively worse.
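The sensitivity analysis can be sketched in a few lines. Among non-diseased controls, each exposure stratum contributes its cohort count multiplied by one minus its risk, so the GE OR in controls equals the GE OR in the population multiplied by a ratio of these complements. The function below is an illustrative assumption built from the factors listed for Equation 1 (baseline risk, RR_G, RR_E and G x E_RR); it is not the authors' program.

```python
def ge_or_in_controls(baseline_risk, rr_g, rr_e, gxe_rr, ge_or_population=1.0):
    """GE OR among non-diseased controls at the end of follow-up."""
    risk_00 = baseline_risk                          # G-E-
    risk_10 = baseline_risk * rr_g                   # G+E-
    risk_01 = baseline_risk * rr_e                   # G-E+
    risk_11 = baseline_risk * rr_g * rr_e * gxe_rr   # G+E+ (joint effect)
    if max(risk_10, risk_01, risk_11) >= 1:
        raise ValueError("risks must stay below 1")
    # Ratio of the proportions remaining disease-free in each stratum.
    disease_free_ratio = ((1 - risk_11) * (1 - risk_00)) / ((1 - risk_10) * (1 - risk_01))
    return ge_or_population * disease_free_ratio


# Settings used in the text: RR_E = 2, G x E_RR = 2, independence in the population.
for baseline in (0.001, 0.01, 0.03, 0.06):
    row = [round(ge_or_in_controls(baseline, rr_g, 2, 2), 2) for rr_g in (1.5, 2.3, 3.0)]
    print(f"baseline risk {baseline:.1%}: {row}")
```

With a baseline risk of 3% and RR_G = 2.3, the function returns approximately 0.8, in line with the value quoted above, while at a baseline risk of 0.1% the result stays within about 1% of the population value of 1.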
Although the case-only design is not specific to cancer epidemiology, it has been most often used in this context. The baseline risks of disease used in our analyses are higher than is typical for cancers, and indeed our results show that evaluation of GE associations in controls may be satisfactory when the disease risk is less than 0.1%. However, the prevalences used in our analyses are applicable to studies of precursor lesions such as colorectal adenomas,28,29 Barrett's oesophagus,30 and benign breast disease.31 In addition, researchers are increasingly studying intermediate biomarkers as endpoints, where the risk of the endpoint may be close to 50%.32
Although there are some situations in which the assessment of GE independence in the controls will be a good proxy for assessment in the underlying population, recognizing these situations in practice requires assurance of many unknown quantities (e.g. the risk of disease among those without the genetic or the environmental factor, the underlying magnitude of interaction, etc.). Even when it is possible to estimate these quantities, minimal errors at critical thresholds can cause a researcher to come to the incorrect conclusion about whether controls may be safely used to evaluate independence. Consequently, interpretation of the GE OR in the non-diseased controls as the GE OR in the population should be made cautiously.
Sources of non-independence
Here, we posit two examples of potential GE relationships as a matter of illustration. For instance, in a study assessing interaction between BRCA1 status and use of post-menopausal hormones in causing breast cancer, a strong family history of breast cancer may cause non-independence between BRCA1 status and hormone use. Specifically, a woman with a strong family history is more likely to carry the BRCA1 mutation and knowing her family history, may be more likely to avoid post-menopausal hormones. In this scenario, there is a positive association between family history and BRCA1 mutation and a negative association between family history and hormone use, resulting in a negative association between BRCA1 and hormone use. Another example might be a study assessing interaction between alcohol intake and the alcohol dehydrogenase (ADH) polymorphism in causing liver cancer. The adverse reaction to alcohol common in those with the ADH polymorphism may cause non-independence between the polymorphism and alcohol intake. Specifically, those with the ADH polymorphism may alter their drinking behaviour to avoid the resulting adverse reaction. There is a positive association between ADH polymorphism and alcohol-induced adverse reaction and a negative association between alcohol-induced adverse reaction and alcohol use, resulting in a negative association between ADH polymorphism and alcohol use. Here, the third variable, adverse reaction to alcohol, is a step in the causal pathway and accounts for the association between the polymorphism and alcohol intake.
In both examples, the covariate (family history, adverse reaction to alcohol) is responsible for a negative association between the gene and the environmental factor. In a case-only study of a positive multiplicative interaction, this negative association between G and E in the underlying population will cause the case-only OR to be biased toward the null.
In the worst-case scenario, G and E are plausibly related as in the examples discussed above and controls are deemed inappropriate to evaluate independence because of a high baseline risk or strong independent effects. However, there may be alternatives for assessing independence in this situation. For instance, Sturmer et al. measured the association between alcohol consumption (the exposure of interest) and the dehydrogenase II gene (the gene of interest) in a small random sample of the general population.22
Control for violations of independence
Aspects of controlling for non-independence in case-only studies examining interaction are similar to controlling for confounding in studies of main effects. When main effects are of interest, researchers are concerned with controlling for covariates that are risk factors for the disease and related to the exposure to ensure that any observed exposure-disease association reflects a causal effect. A covariate with these characteristics may represent a common cause of the exposure and disease or an intermediate variable that participates in a causal pathway between the exposure and disease. If such a covariate is indeed a confounder, adjustment for the covariate will remove its influence on the exposure-disease association and improve the validity of the effect estimate.
In case-only studies of interaction, researchers should be concerned with controlling for covariates that cause non-independence between G and E to ensure that any GE association among cases reflects their interaction in causing disease. In this context, a covariate of interest (C) may represent a common cause of G and E (e.g. family history of breast cancer) or an intermediate variable that participates in the causal pathway between them (e.g. adverse reaction to alcohol). If such a covariate is indeed the source of non-independence between G and E, adjustment for the covariate will remove the GE association and yield a valid estimate of interaction among cases.
It is important to note that the purpose of controlling for non-independence in a case-only study is to remove the association between G and E entirely, regardless of whether the association is causal or non-causal. This is in contrast to the purpose of controlling for confounding in a study of a main effect, which is to separate the causal effect of interest from the non-causal association. However, we find the analogy to confounding useful, especially when thinking about whether G and E are related in the population. Just as an investigator would not worry about covariates that are not plausible potential confounders, an investigator using the case-only design need not worry about third variables that are not plausibly associated with both G and E.
The diagrams shown in Figure 6 depict the scenario in which the third variable, C, is related to both G and E and the scenario in which C is in the causal pathway between G and E. The diagrams have been labelled with the factors involved in the two examples of non-independence described above. The mechanisms are simplified to include only the relevant pathways (e.g. other causes of reduced alcohol intake and adverse reaction to alcohol that do not result from the polymorphism are omitted).
Twelve hypothetical populations of 100 000 people were generated using SAS version 8.12.33 Each population can be conceptualized as a study cohort, similar to Figure 1a, but further stratified by C+ and C-, in which a1 to d2 represent those exposed to C and a3 to d4 represent those unexposed to C (Fig. 7). For each population, the SAS MODEL procedure was used to simultaneously solve for the desired relationships among the variables C, G, E, and D. Each solution was based on set parameter values. For details regarding these parameters, see Appendix 1. For each of the 12 examples, the cell counts, a1 to d4, derived from the SAS MODEL procedure were used to generate a dataset of 100 000 observations to which logistic regression modelling was applied.
All of the cases from each hypothetical cohort were used in the corresponding case-only analysis. All crude and adjusted case-only ORs were generated using the LOGISTIC procedure in SAS.33 In general, modelling any variable as a function of another in a case-only analysis gives the estimate of interaction between these two variables. The crude case-only OR was calculated by modelling G as a function of E (Equation 2). C was added to this model as a covariate to obtain the adjusted case-only OR (Equation 3). The exponentiations of β1 in Equations 2 and 3 are the unadjusted and adjusted case-only ORs, respectively.
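$$\operatorname{logit}\, P(G = 1 \mid D = 1) = \beta_0 + \beta_1 E \qquad (2)$$
$$\operatorname{logit}\, P(G = 1 \mid D = 1) = \beta_0 + \beta_1 E + \beta_2 C \qquad (3)$$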
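To illustrate how adjustment for C can recover the interaction, the sketch below builds a hypothetical set of cases in which C induces a negative GE association while G and E are independent within levels of C. All counts and parameter values are invented for illustration and are not those of the 12 simulated populations. For brevity, the sketch uses a Mantel-Haenszel summary OR stratified on C in place of the logistic model with C as a covariate; the two approaches address the same question but are not numerically identical in general.

```python
# Hypothetical cohort of 100 000: baseline risk 1%, RR_G = RR_E = 2, G x E_RR = 2,
# P(C) = 0.3, P(G|C+) = 0.4, P(G|C-) = 0.1, P(E|C+) = 0.2, P(E|C-) = 0.5,
# with G and E independent within each level of C and no effect of C on disease.
# Expected numbers of cases, ordered G+E+, G+E-, G-E+, G-E-, by level of C.
cases_by_c = {
    "C+": (192, 192, 72, 144),
    "C-": (280, 70, 630, 315),
}


def odds_ratio(a, b, c, d):
    """Cross-product OR for counts ordered G+E+, G+E-, G-E+, G-E-."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary OR across strata of 2 x 2 counts."""
    strata = list(strata)
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Crude analysis: collapse over C before computing the case-only OR.
crude_counts = tuple(sum(cells) for cells in zip(*cases_by_c.values()))
crude_or = odds_ratio(*crude_counts)

# Adjusted analysis: summarise the stratum-specific case-only ORs.
adjusted_or = mantel_haenszel_or(cases_by_c.values())
print(round(crude_or, 2), round(adjusted_or, 2))
```

In this constructed example the crude case-only OR is about 1.18, biased toward the null by the negative GE association, whereas the estimate adjusted for C is 2.0, the interaction built into the hypothetical data.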
Discussion
Furthermore, we discussed how non-independence seems most likely when an individual's genetic status is knowable or symptomatic, enabling modification of exposure to an environmental factor dependent on gene status. For most metabolic polymorphisms, it seems that obvious symptoms or knowledge of gene status would be uncommon. In studies of these genetic factors, gene-environment associations are less likely. Although the data are limited, some studies have shown a lack of association between metabolic polymorphisms and behavioural risk factors.16
Finally, we demonstrated that control of non-independence in the case-only analysis can yield a valid interaction estimate. If the factor that causes a gene-environment association is measured, multivariable logistic regression modelling can be used to remove the bias. If one can posit a mechanism by which G and E are related, then the source of this relationship should be measured and included in the case-only analysis. The extent to which complete control is possible depends on the same challenges that face epidemiological studies of main effects. These challenges include articulation of explicit hypotheses, and attention to the construct and measurement of all variables.
In practice, control for non-independence will not always be straightforward. For instance, control for non-independence requires that the source(s) of non-independence can be conceptualized and measured. However, this may be difficult or impossible in some situations. For example, when gene-gene interaction is of interest, linkage disequilibrium may cause non-independence between the genes. In this circumstance, the source of non-independence may not easily be attributable to a third variable, making adjustment impossible. Furthermore, in studies of main effects, adjustment for variables that does not change the validity of the effect estimate costs degrees of freedom and reduces the precision of the estimate. Similarly, control for variables that do not generate a GE association in the underlying population may have the same costs in case-only studies. Finally, given that it is difficult to control for sources of bias in cohort and case-control studies of main effects and/or interaction (e.g. misclassification, loss to follow-up, etc.), it may be difficult to control for these sources of bias in case-only studies.9 However, sensitivity analysis methods that prove useful in cohort and case-control studies may also apply to case-only studies.
In conclusion, recent criticisms of the case-only study may be overstated. Since non-independence can be accounted for in the analysis, the case-only study may still be a valuable tool for investigations of gene-environment interaction.
KEY MESSAGES
Appendix 1
The 16 parameters described above were translated into 16 equations relating 16 cells. These 16 cells (labelled a1 to d4) represent the cell counts for four 2 x 2 tables (Figure 7).
For each population, the SAS MODEL procedure was used to simultaneously solve for the 16 variables (a1 to d4), based on the parameter values entered. The solution values were used to generate a dataset of 100 000 observations in order to perform regression modelling. For example, if a1 equalled 5000, the dataset contained 5000 observations of people who were labelled as exposed to variables E, D, G, and C.
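The expansion from cell counts to individual records can be sketched as below. The mapping of the cell labels a1 to d4 to exposure patterns is defined in Figure 7 and is not reproduced here, so the sketch keys each count directly by a hypothetical (C, G, E, D) pattern. Apart from the 5000 used as an example above, the counts are placeholders.

```python
from itertools import chain, repeat

# Placeholder counts keyed by (C, G, E, D); 1 = present, 0 = absent.
cell_counts = {
    (1, 1, 1, 1): 5000,  # a cell of 5000 people labelled as exposed to C, G, E and D
    (1, 1, 1, 0): 3,     # placeholder
    (0, 0, 0, 0): 2,     # placeholder
}


def expand(cells):
    """Return one record (dict) per person implied by the cell counts."""
    patterns = chain.from_iterable(repeat(p, n) for p, n in cells.items())
    return [dict(zip(("C", "G", "E", "D"), p)) for p in patterns]


records = expand(cell_counts)
print(len(records), records[0])
```

A dataset built this way has one row per cohort member and can be passed to any regression routine; restricting it to rows with D = 1 gives the case-only sample.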
For additional details regarding this program, contact Ulka Campbell at uvb2{at}columbia.edu, or Nicolle Gatto at nmg22{at}columbia.edu.
Notes
* The calculation of the G x E_OR using this notation can be obtained by contacting the authors.
When the GE OR in the population is equal to one, the GE RR will also be equal to one. The GE OR is used to test for non-independence in the population because it is this ratio that must equal one for the case-only OR to be mathematically equivalent to the G x E_RR.
References
2 Schmidt S, Schaid DJ. Potential misinterpretation of the case-only study to assess gene-environment interaction. Am J Epidemiol 1999;150:878-85.
3 Khoury MJ, Flanders WD. Nontraditional epidemiologic approaches in the analysis of gene-environment interaction: case-control studies with no controls! Am J Epidemiol 1996;144:207-13.
4 Albert PS, Ratnasinghe D, Tangrea J et al. Limitations of the case-only design for identifying gene-environment interactions. Am J Epidemiol 2001;154:687-93.
5 Saunders CL, Gooptu C, Bishop DT et al. The use of case-only studies for the detection of interactions, and the non-independence of genetic and environmental risk factors for disease (Abstract). Genet Epidemiol 2001;21:174.
6 Saunders CL, Barrett JH. Flexible matching in case-control studies of gene-environment interactions. Am J Epidemiol 2004;159:17-22.
7 Yang Q, Khoury MJ. Evolving methods in genetic epidemiology. III. Gene-environment interaction in epidemiologic research. Epidemiol Rev 1997;19:33-43.
8 Goldstein AM, Andrieu N. Detection of interaction involving identified genes: Available study designs. J Natl Cancer Inst 1999;26:49-54.
9 Clayton D, McKeigue PM. Epidemiological methods for studying gene and environmental factors in complex diseases. Lancet 2001;358:1356-60.
10 Botto LD, Khoury MJ. Commentary: Facing the challenge of gene-environment interaction: The two-by-four table and beyond. Am J Epidemiol 2001;153:1016-20.
11 Marcus PM, Hayes RB, Vineis P et al. Cigarette smoking, N-acetyltransferase 2 acetylation status, and bladder cancer risk: a case-series meta-analysis of a gene-environment interaction. Cancer Epidemiol Biomarkers Prev 2000;9:461-6.
12 Chang-Claude J, Dunning A, Schnitzbauer U et al. The patched polymorphism PRO1315LEU (C3944T) may modulate the association between use of oral contraceptives and breast cancer risk. Int J Cancer 2002;103:779-83.
13 Deitz AC, Localio R, Mitchell L et al. Genotype-environment association in controls: Implications for case-only analyses (Poster). American Association for Cancer Research (AACR) Annual Meeting. San Francisco, CA, 2002.
14 Egan KM, Newcomb PA, Titus-Ernstoff L et al. Association of NAT2 and smoking in relation to breast cancer incidence in a population-based case-control study (United States). Cancer Causes Control 2003;14:43-51.
15 Becher H, Schmidt S, Chang-Claude J. Reproductive factors and familial predisposition for breast cancer by age 50 years. A case-control-family study for assessing main effects and possible gene-environment interaction. Int J Epidemiol 2003;32:38-48.
16 Davey Smith G, Ebrahim S. Mendelian randomization: can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol 2003;32:1-22.
17 Umbach DM, Weinberg CR. Designing and analysing case-control studies to exploit independence of genotype and exposure. Stat Med 1997;16:1731-43.
18 Bennett WP, Alavanja MC, Blomeke B et al. Environmental tobacco smoke, genetic susceptibility, and risk of lung cancer in never-smoking women. J Natl Cancer Inst 1999;91:2009-14.
19 Infante-Rivard C, Labuda D, Krajinovic M et al. Risk of childhood leukemia associated with exposure to pesticides and with gene polymorphisms. Epidemiology 1999;10:481-87.
20 Infante-Rivard C, Krajinovic M, Labuda D et al. Childhood acute lymphoblastic leukemia associated with parental alcohol consumption and polymorphisms of carcinogen-metabolizing genes. Epidemiology 2002;13:277-81.
21 Modan B, Hartge P, Hirsh-Yechezkel G et al. Parity, oral contraceptives, and the risk of ovarian cancer among carriers and noncarriers of a BRCA1 or BRCA2 mutation. N Engl J Med 2001;345:235-40.
22 Sturmer T, Wang-Gohrke S, Arndt V et al. Interaction between alcohol dehydrogenase II gene, alcohol consumption, and risk for breast cancer. Br J Cancer 2002;87:519-23.
23 Aalen OO, Borgan O, Keiding N et al. Interaction between life history events. Nonparametric analysis for prospective and retrospective data in the presence of censoring. Scand J Stat 1980;7:161-71.
24 Prentice RL, Vollmer WM, Kalbfleisch JD. On the use of case series to identify disease risk factors. Biometrics 1984;40:445-58.
25 Piegorsch WW, Weinberg CR, Taylor JA. Non-hierarchical logistic models and case-only designs for assessing susceptibility in population-based case-control studies. Stat Med 1994;13:153-62.
26 Yang Q, Khoury MJ, Sun F et al. Case-only design to measure gene-gene interaction. Epidemiology 1999;10:167-70.
27 Campbell UB, Gatto NM, Schwartz S. Distributional interaction: interpretational problems when using odds ratios to assess interaction (Poster). Society for Epidemiologic Research (SER) Annual Meeting. Salt Lake City, Utah, 2004.
28 Yamaji Y, Mitsushima T, Ikuma H et al. Incidence and recurrence rates of colorectal adenomas estimated by annually repeated colonoscopies on asymptomatic Japanese. Gut 2004;53:568-72.
29 Chan AT, Giovannucci EL, Schernhammer ES et al. A prospective study of aspirin use and the risk for colorectal adenoma. Ann Intern Med 2004;140:157-66.
30 Toruner M, Soykan I, Ensari A et al. Barrett's esophagus: prevalence and its relationship with dyspeptic symptoms. J Gastroenterol Hepatol 2004;19:535-40.
31 Rohan TE, Miller AB. A cohort study of oral contraceptive use and risk of benign breast disease. Int J Cancer 1999;82:191-96.
32 Li Y, Marion MJ, Rundle A et al. A common polymorphism in XRCC1 as a biomarker of susceptibility for chemically induced genetic damage. Biomarkers 2003;8:408-14.
33 SAS Institute Inc. SAS/STAT software. Cary, NC: SAS Institute Inc, 2002.