From the Biostatistics Branch, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC.
INTRODUCTION
The goal of gene-environment studies in epidemiology is to learn how the risk of a disease changes as a joint function of genotype and exposure. Such studies promise new insights into etiology. They also promise the eventual capability to tailor interventions more precisely, whether at a clinical level, where the therapeutic agent prescribed or its dose may be chosen in light of an individual's genetic makeup, or at a public health level, where programs may be targeted at high-risk subpopulations. Of course, the study of candidate genes and exposures presents methodological challenges. This commentary reviews some of these issues and indicates how successfully the report by Wang et al. (1) has addressed them. These authors study the interrelation of benzene exposure, maternal polymorphisms in the CYP1A1 and GSTT1 genes, and length of gestation and find that low-dose maternal benzene exposure is associated with shortened pregnancy among a genetically defined subset of mothers.
CONFOUNDING
Another potential source of confounding, particularly in studies of perinatal outcomes, arises when the mother's genotype is measured but the child's genotype is actually the relevant one (or vice versa). The basic biology of allele transmission ensures that a mother and child share at least one allele at each locus, so their genotypes are correlated. Consequently, the mother's genotype and the child's genotype at the same locus may each be viewed as a potential confounder of the other. In an analysis that uses the mother's genotype at the susceptibility locus when the child's alone is relevant, estimates of the risk associated with a variant allele can be severely biased (2).
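The strength of that mother-child correlation can be illustrated with a small exact calculation. The sketch below is not taken from the cited work; it assumes a single biallelic locus, random mating, and Hardy-Weinberg proportions, and the function name `mother_child_correlation` and the allele frequency used are hypothetical choices for illustration.

```python
import math
from fractions import Fraction
from itertools import product


def mother_child_correlation(p):
    """Correlation between maternal and child variant-allele counts.

    Assumes (illustratively) one biallelic locus with variant-allele
    frequency p, 0 < p < 1, random mating, and Mendelian transmission.
    """
    p = Fraction(p)
    allele_prob = {0: 1 - p, 1: p}
    e_m = e_c = e_mm = e_cc = e_mc = Fraction(0)
    # Enumerate both maternal alleles, the transmitted paternal allele,
    # and which maternal allele is passed on (each with probability 1/2).
    for a1, a2, paternal, pick in product((0, 1), repeat=4):
        prob = (allele_prob[a1] * allele_prob[a2]
                * allele_prob[paternal] * Fraction(1, 2))
        mother = a1 + a2
        child = (a1, a2)[pick] + paternal
        e_m += prob * mother
        e_c += prob * child
        e_mm += prob * mother * mother
        e_cc += prob * child * child
        e_mc += prob * mother * child
    cov = e_mc - e_m * e_c
    var_m = e_mm - e_m ** 2
    var_c = e_cc - e_c ** 2
    return float(cov) / math.sqrt(float(var_m * var_c))


# Hypothetical variant-allele frequency of 10 percent.
print(mother_child_correlation(Fraction(1, 10)))
```

Under these assumptions the correlation is one half whatever the allele frequency, which is large enough for either genotype to act as a confounder of the other.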
The same phenomenon can bias estimates of genotype-exposure interactions. As the mother's tissue encounters the toxicant first, her genotype likely plays a key role in activating or deactivating potentially toxic exposures before they reach the fetus. Yet, without evidence that the toxicant does not cross from the maternal to the child's circulation, ruling out any role for the child's genotype in the same biochemical processes is difficult. Wang et al. (1) point out the existence of an operating cytochrome P-450 enzyme system in the fetoplacental unit but do not raise the question of whose genotype may be most relevant here. A potential role for the child's genotype is difficult to judge, however, without a mechanistic link between benzene exposure and the initiation of parturition.
One goal of studies involving genetic susceptibility and perinatal exposures should be to disentangle the separate contributions of maternal and offspring genotypes. Because the correlation between maternal and offspring genotypes often leads to correlations between the corresponding risk estimates, the relative importance of these two interrelated risk factors (or of their interactions with exposures) may be difficult to assess with commonly used case-control or cohort designs. Specialized designs, such as case-parents designs, can help, at least when disease status (affected vs. unaffected) is the outcome of interest. With a case-parents design and appropriate analysis, the risk estimates associated with the mother's genotype are uncorrelated with those associated with the offspring's genotype (3, 4). Similarly, estimates of the mother's genotype-exposure interactions are uncorrelated with estimates of the offspring's genotype-exposure interactions in case-parents designs (unpublished).
Still another source of potential genetic confounding is hidden population structure. The sampled population may consist of several genetically distinct subpopulations that are incompletely mixed (admixture, population stratification). If those subpopulations differ in both the prevalence of a variant allele at the candidate locus and the prevalence or magnitude of a trait, apparent associations between the allele and trait may simply reflect confounding of the allele's effect by subpopulation identity. When an investigator recognizes relevant subpopulations, for example, ethnic groups, and can correctly classify respondents into them, controlling such admixture-induced confounding is straightforward. Population structure, however, can be subtler than overt ethnic differences. Barriers to gene flow within an apparently homogeneous population could lead to a subpopulation structure that is easily overlooked. Even when admixture is known to exist, identifying relevant subpopulations and accurately assessing an individual's membership therein can be difficult. Since exposure prevalence may also vary among genetically distinct subpopulations, exposure and gene-exposure interaction effects can also be biased by subpopulation structure.
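A toy calculation may make the mechanism concrete. The sketch below assumes two hypothetical subpopulations in which the allele has no effect on the trait at all, yet carrier prevalence and trait risk both differ between subpopulations. All numbers and function names are invented for illustration. It compares the crude odds ratio from the pooled sample with a Mantel-Haenszel odds ratio stratified on subpopulation.

```python
from fractions import Fraction as F

# Hypothetical subpopulations: (share of sample, carrier prevalence, trait risk).
# Within each subpopulation, trait risk does not depend on carrier status.
SUBPOPULATIONS = [
    (F(1, 2), F(1, 10), F(1, 100)),
    (F(1, 2), F(4, 10), F(5, 100)),
]


def cells(share, carrier, risk):
    """Expected 2x2 cell proportions for one subpopulation."""
    a = share * carrier * risk              # carrier, affected
    b = share * carrier * (1 - risk)        # carrier, unaffected
    c = share * (1 - carrier) * risk        # noncarrier, affected
    d = share * (1 - carrier) * (1 - risk)  # noncarrier, unaffected
    return a, b, c, d


def crude_odds_ratio(subpops):
    """Odds ratio from the pooled table, ignoring subpopulation."""
    a, b, c, d = (sum(x) for x in zip(*(cells(*s) for s in subpops)))
    return (a * d) / (b * c)


def mantel_haenszel_odds_ratio(subpops):
    """Odds ratio stratified on subpopulation membership."""
    num = den = F(0)
    for s in subpops:
        a, b, c, d = cells(*s)
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den


print(float(crude_odds_ratio(SUBPOPULATIONS)))
print(float(mantel_haenszel_odds_ratio(SUBPOPULATIONS)))
```

With these made-up inputs the pooled odds ratio is well above 1 even though the allele is inert, while the stratified estimate equals 1. The stratified analysis is available only when subpopulation membership is known and correctly classified, which is exactly the difficulty raised above.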
That confounding from genetic population structure is capable of biasing inference is inescapable; what is less clear is how severe a problem it presents. For diverse populations such as that of the United States, admixture-induced bias may be relatively small for common variants (5). Wang et al. (1) report that all the workers were Chinese but offer no information about the possibility of subtler ethnic distinctions among their subjects. Consequently, the potential for such structure is difficult to evaluate.
Clever study designs can help cope with biases that arise from population structure. The basic idea is to use data from family members of affected individuals in ways that, in essence, stratify possible confounding away. Case-parents designs, mentioned earlier, can eliminate spurious associations in studies of genetic effects alone (6, 7) or in studies of gene-environment interactions (8, 9). Matched case-control designs that use siblings or other relatives of cases as controls can also eliminate admixture-induced confounding (10). Weinberg and Umbach (11) provide a discussion of the pros and cons of various retrospective epidemiologic designs for gene-environment studies of disease incidence. Analogous study designs are applicable to studies of quantitative traits (12–15).
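One widely used analysis of case-parents data is the transmission/disequilibrium test, which compares how often heterozygous parents transmit the variant allele to an affected child with how often they do not. Because each comparison is made within a family, differences in allele frequency between subpopulations cannot produce a spurious signal. The sketch below is offered only as an illustration of that idea and is not necessarily the method used in the cited references; the function name and counts are hypothetical.

```python
def tdt_statistic(transmitted, not_transmitted):
    """McNemar-type chi-square statistic (1 degree of freedom).

    transmitted:     heterozygous parents who passed the variant allele
                     to the affected child
    not_transmitted: heterozygous parents who passed the other allele
    """
    total = transmitted + not_transmitted
    if total == 0:
        raise ValueError("no heterozygous parents: the statistic is undefined")
    return (transmitted - not_transmitted) ** 2 / total


# Hypothetical counts from 100 heterozygous parents of affected children.
print(tdt_statistic(60, 40))  # 4.0
```

Under the null hypothesis of no association (or no linkage), each heterozygous parent transmits either allele with probability one half, so the statistic is referred to a chi-square distribution with 1 degree of freedom.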
A different complication is that factors that confound interaction effects need not be the same as those that confound exposure effects. Put another way, adjusting for variables that confound exposure will not guarantee that gene-exposure interaction effects will be free from bias. Investigators tend to avoid explicit consideration of factors that may confound gene-exposure interactions. Perhaps this oversight reflects the relatively recent origin of studies in which detection of interaction is a primary goal. As a specialty, we may not have enough experience thinking about the kinds of factors that may confound interaction effects. On the other hand, despite its theoretical possibility, perhaps confounding of interactions rarely occurs in practice or is adequately controlled by adjustment for confounders of exposure or of genotype. Still, if some factor were a confounder of exposure, prudence would dictate checking whether a term for the interaction of that factor with genotype appeared to confound the gene-exposure interaction of interest. Although the authors adjusted for a number of variables related to gestational age, they did not explicitly consider variables that might be related to genotype-benzene interactions. Again, imagining what set of variables might be relevant is problematic without having some mechanism of action in mind. Because the CYP1A1 and GSTT1 enzymes have been related to the metabolic processing of benz[a]pyrene in cigarette smoke, however, one possibility might be to include CYP1A1-passive smoking or GSTT1-passive smoking interaction terms and look for possible changes in the corresponding gene-benzene interaction coefficients. This sort of check seems particularly apt since passive smoking was somewhat associated with benzene exposure among these workers.
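The kind of check suggested here can be sketched with a small exact calculation. The example below is hypothetical throughout: it builds a population in which exposure E has no effect and no interaction with genotype G, while a third factor S (standing in for something like passive smoking) is associated with E and has an effect on the outcome that differs by genotype. It then compares the fitted G-by-E coefficient from a linear model that adjusts for S alone with the coefficient from a model that also includes a G-by-S term. The cell probabilities, effect sizes, and function names are assumptions made for illustration, not estimates from Wang et al.

```python
from fractions import Fraction as F
from itertools import product


def solve(a, b):
    """Solve the linear system a x = b exactly by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        lead = m[col][col]
        m[col] = [v / lead for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[n] for row in m]


def weighted_ols(rows, weights, y):
    """Weighted least squares via the normal equations."""
    k = len(rows[0])
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights))
             for j in range(k)] for i in range(k)]
    xtwy = [sum(w * r[i] * yi for r, w, yi in zip(rows, weights, y))
            for i in range(k)]
    return solve(xtwx, xtwy)


def gxe_coefficient(include_gxs):
    """Fitted G-by-E coefficient, with or without a G-by-S term in the model."""
    p_g, p_e = F(1, 2), F(1, 2)             # hypothetical prevalences
    p_s_given_e = {0: F(1, 5), 1: F(3, 5)}  # S is more common among the exposed
    rows, weights, y = [], [], []
    for g, e, s in product((0, 1), repeat=3):
        pg = p_g if g else 1 - p_g
        pe = p_e if e else 1 - p_e
        ps = p_s_given_e[e] if s else 1 - p_s_given_e[e]
        weights.append(pg * pe * ps)
        # True mean gestation in days: no E effect and no G-by-E interaction.
        y.append(F(280) - 2 * s - 4 * g * s)
        row = [1, g, e, s, g * e]
        if include_gxs:
            row.append(g * s)
        rows.append([F(v) for v in row])
    return weighted_ols(rows, weights, y)[4]


print(gxe_coefficient(include_gxs=False))
print(gxe_coefficient(include_gxs=True))
```

In this constructed example, adjusting for S as a main effect leaves a spurious G-by-E coefficient of about -1.6 days, and adding the G-by-S term brings it to zero. A comparable shift in real data would suggest that the interaction of the third factor with genotype is confounding the gene-exposure interaction of interest.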
BIOLOGIC PLAUSIBILITY
Field studies are undertaken because some previous data or mechanistic rationale suggests that the exposure may be related to the disease outcome. Candidate genes are nominated for study analogously. Some connection between the candidate gene and the disease or the exposure under study is desirable. Polymorphic genes make stronger candidates when the different alleles lead to differential catalytic activity of enzymes or differential expression levels. The higher the biologic plausibility of any mechanisms that implicate the candidate gene in the disease process, the more causally relevant any observed associations become.
Criteria like those, while easy to enunciate, are not always easy to evaluate. Knowledge about functional polymorphisms is restricted to a relatively small number of well-studied genes. The Human Genome Project (http://www.nhgri.nih.gov/HGP/) will facilitate gene discovery as well as the identification of polymorphisms. Additional efforts such as the Environmental Genome Project (http://www.niehs.nih.gov/envgenom/) will evaluate selected genes for the existence, and eventually the function, of polymorphisms. As such efforts proceed, the choice of candidate genes for a given exposure-disease setting will become increasingly enlightened.
Science needs activities that generate hypotheses as well as those that test them, and valuable exploratory studies often have little a priori evidence for genetic or exposure effects. Wang et al. (1) offer no biologic mechanism relating benzene exposure to parturition. The prima facie plausibility of the observed interaction rests on the fact that the candidate genes, CYP1A1 and GSTT1, are involved in the detoxification of organic solvents. The authors are somewhat less definite about direct connections of either gene to benzene. The CYP1A1 variants do not seem to exhibit differential enzyme activity (17) but do, as the authors point out, show differential inducibility. Is benzene a substrate or an inducer for CYP1A1? Substrates for the CYP1A1 enzyme are generally polycyclic molecules such as benz[a]pyrene from cigarette smoke, whereas the phase I metabolism of benzene has been attributed to CYP2E1 (18). Even granting that biologic plausibility is a mercurial concept, the authors might have delved more deeply into mechanisms through which benzene might influence parturition or into the roles of these candidate genes in benzene metabolism to provide a stronger foundation for assessing the biologic plausibility of their results.
A FINAL REMARK
NOTES
REFERENCES