CORRESPONDENCE

RESPONSE: Re: Population Stratification in Epidemiologic Studies of Common Genetic Variants and Cancer: Quantification of Bias

Sholom Wacholder, Nathaniel Rothman, Neil Caporaso

Affiliation of authors: Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD.

Correspondence to: Sholom Wacholder, Ph.D., National Institutes of Health, EPS 8046, 6120 Executive Blvd., Bethesda, MD 20892 (e-mail: Wacholder@nih.gov).

Dr. Millikan's letter raises the question of how to assess the bias from population stratification in a particular study that did not account for race or ethnicity. We show here that, even in the most extreme situation, where a genotype is virtually universal in one group and absent in the other, the bias factor must lie between the ratio of disease rates in the two groups and its reciprocal. To see this, recall that the confounding risk ratio (CRR), which measures the bias, depends on the ratio of the disease rates in the two groups among those with the at-risk genotype (RR) and on the frequencies of that genotype in the two groups, P1b and P1w (1). When the rates or the genotype frequencies are equal (RR = 1 or P1b = P1w), CRR is 1. If RR is greater than 1, the confounding is positive and CRR is greater than 1 when P1b is greater than P1w (2); the confounding is negative and CRR is below 1 when P1b is less than P1w. When P1b = 1 and P1w = 0, CRR equals RR; when P1b = 0 and P1w = 1, CRR equals 1/RR. Thus, the bias factor CRR must lie between the rate ratio RR and its reciprocal 1/RR, regardless of the differences in frequency of the at-risk genotype. Table 1 shows the CRR for various genotype frequencies when there are two groups split 20%–80%.
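The limiting cases above can be checked numerically. The Python sketch below is illustrative only and is not necessarily the exact formulation used in (1): it assumes that the genotype has no effect within either group, so the crude genotype rate ratio is itself the bias factor, and that the group labeled b makes up 20% of the population. The function name and parameter names are hypothetical.

```python
def confounding_risk_ratio(frac_b, p1b, p1w, rr):
    """Crude genotype rate ratio when group membership is ignored.

    Assumptions (for illustration, not taken from the source):
      * the genotype has no effect on disease within either group;
      * group w has disease rate 1 and group b has disease rate rr;
      * frac_b is the population fraction in group b.
    """
    frac_w = 1.0 - frac_b
    # Population shares of carriers and non-carriers in each group.
    carriers_b, carriers_w = frac_b * p1b, frac_w * p1w
    non_b, non_w = frac_b * (1.0 - p1b), frac_w * (1.0 - p1w)
    if carriers_b + carriers_w == 0 or non_b + non_w == 0:
        raise ValueError("need both carriers and non-carriers in the population")
    # Disease rate among carriers and among non-carriers, pooled over groups.
    rate_carriers = (carriers_b * rr + carriers_w) / (carriers_b + carriers_w)
    rate_non = (non_b * rr + non_w) / (non_b + non_w)
    return rate_carriers / rate_non


# Extreme and null cases for RR = 1.7 with an assumed 20%/80% split.
for p1b, p1w in [(1.0, 0.0), (0.0, 1.0), (0.3, 0.3), (0.5, 0.1)]:
    crr = confounding_risk_ratio(0.2, p1b, p1w, 1.7)
    print(f"P1b={p1b}, P1w={p1w}: CRR={crr:.3f}")
```

Under these assumptions the first two cases reproduce RR and 1/RR, equal frequencies give a CRR of 1, and intermediate frequencies give a value strictly between the bounds.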


Table 1. Approximate bias factor (confounding risk ratios) due to population stratification for a study that ignores race as a function of at-risk genotype frequencies and ratio of disease rates in two racial or ethnic groups*
 
Millikan's results on the effects of controlling for race in a study of breast cancer in blacks and whites in North Carolina provide a good example of how one can predict the direction and extent of bias. Millikan reports that, in the North Carolina Breast Cancer Study, the empiric CRRs for adjusting by race lie between 0.94 and 1.12 when assessing the effects of several genotypes as risk factors for female breast cancer. Given that the ratio of breast cancer rates in blacks and in whites is 0.89, as reported by the North Carolina Central Cancer Registry for the period 1993 through 1997, the small estimated CRRs are consistent in magnitude with Table 1 and in the direction predicted by P1b/P1w. Even for the most extreme differences in genotype frequencies, the bias factor from ignoring race will be bounded approximately by 0.89 and 1.12, if the black to white breast cancer incidence rate ratio from the Registry applies to the 20 study counties in North Carolina from 1993 through 1996. The column in Table 1 for RR = 1.1 can be used as a more precise indicator of the bias for specified race-specific genotype frequencies.

In addition to a wide range of cancer rates and genotype frequencies among the groups, there are two requirements for important bias from population stratification when race or ethnicity is ignored (1). The genotype frequencies and cancer rates must vary together; clearly, with only two groups, they do so here. The differences in rates must remain after adjustment for known risk factors; while the North Carolina study collected information on all known breast cancer risk factors (3), it is unclear how much, if any, of the rate difference they explain.

In contrast to breast cancer, the ratio of incidence rates of prostate cancer in black and white males from 1993 through 1997 is near 1.7 in the Surveillance, Epidemiology, and End Results (SEER) Program (see Note 1). In Table 1, the column for RR = 1.7 shows that failure to adjust for race in a study of prostate cancer and genotypes with an extreme difference in frequency is likely to have a greater impact than in a comparable study of breast cancer.
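The contrast between the two cancers follows directly from the bounds RR and 1/RR. As a small worked example, the Python sketch below computes those bounds for the two rate ratios quoted in this letter; the function name crr_bounds is hypothetical, and the calculation covers only the limiting range, not the bias for any particular genotype frequencies.

```python
def crr_bounds(rr):
    """Return (lower, upper) limits on the bias factor: RR and its reciprocal."""
    if rr <= 0:
        raise ValueError("rate ratio must be positive")
    return min(rr, 1.0 / rr), max(rr, 1.0 / rr)


# Black-to-white incidence rate ratios quoted in the text.
for cancer, rr in [("breast", 0.89), ("prostate", 1.7)]:
    lower, upper = crr_bounds(rr)
    print(f"{cancer}: bias factor between {lower:.2f} and {upper:.2f}")
# breast: bias factor between 0.89 and 1.12
# prostate: bias factor between 0.59 and 1.70
```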

NOTES

1 SEER is a set of geographically defined, population-based, central cancer registries in the United States, operated by local nonprofit organizations under contract to the National Cancer Institute (NCI). Registry data are submitted electronically without personal identifiers to the NCI on a biannual basis, and the NCI makes the data available to the public for scientific research.

REFERENCES

1 Wacholder S, Rothman N, Caporaso N. Population stratification in epidemiologic studies of common genetic variants and cancer: quantification of bias. J Natl Cancer Inst 2000;92:1151–8.

2 Boivin JF, Wacholder S. Conditions for confounding of the risk ratio and of the odds ratio. Am J Epidemiol 1985;121:152–8.

3 Newman B, Moorman PG, Millikan R, Qaqish BF, Geradts J, Aldrich TE, et al. The Carolina Breast Cancer Study: integrating population-based epidemiology and molecular biology. Breast Cancer Res Treat 1995;35:51–60.



             
Copyright © 2001 Oxford University Press (unless otherwise stated)