Commentary: Considerations for Use of Racial/Ethnic Classification in Etiologic Research

Jay S. Kaufman1,2 and Richard S. Cooper3

1 Department of Epidemiology, University of North Carolina School of Public Health, Chapel Hill, NC.
2 Carolina Population Center, University of North Carolina at Chapel Hill, Chapel Hill, NC.
3 Department of Preventive Medicine and Epidemiology, Loyola University Stritch School of Medicine, Maywood, IL.

ABSTRACT

Numerous authors have critiqued the use of race as an etiologic quantity in medical research. Despite this criticism, the use of variables encoding racial/ethnic categorization has increased in epidemiology, and most researchers agree that important variation in disease risk is captured by this classification system. Previous discussions have generally neglected to articulate guidelines for appropriate use of racial/ethnic information in etiologic research. The authors summarize the logical, conceptual, and practical problems associated with the "ethnic paradigm" as currently applied in biomedical sciences and offer a set of methodological recommendations toward more valid use of racial/ethnic classification in etiologic studies. These suggested guidelines address issues of variable definition, study design, and covariate control, providing a consistent foundation for etiologic research programs that neither ignore racial/ethnic disease disparities nor obfuscate the nature of these disparities through inappropriate analytical approaches. This methodological analysis of racial/ethnic classification as an epidemiologic quantity provides a formal basis for a focus on racism (i.e., social relations) rather than race (i.e., innate biologic predisposition) in the interpretation of racial/ethnic "effects."

causality; confounding factors (epidemiology); epidemiologic factors; epidemiologic methods; epidemiologic research design; ethnic groups; racial stocks

Abbreviations: SES, socioeconomic status

Over the last 15 years, numerous authors have critiqued the use of race as an etiologic quantity in medical research (1–8). Much of the emphasis has been on the traditional notion that human races define "[p]ersons who are relatively homogeneous with respect to biological inheritance" (9, p. 139). Despite criticism of a biologic concept of race, use of variables encoding racial/ethnic categorization has increased during this period (10, 11). Even for those who embrace the view that the biologic content of racial/ethnic categories is limited, a rationale for the continued focus on these quantities is that they encode important variations in environment because of the central role they play in social stratification (6, 12–14). Under either set of assumptions, racial or ethnic designation is a remarkably strong predictor of health status (15). It is understandable, therefore, that researchers would seize upon this observed variability between racial/ethnic groups as an important natural resource for etiologic research.

The application of this approach, which we refer to as the "ethnic paradigm," is fraught with difficulty, however. Several previous treatments have emphasized various deficiencies but have generally failed to provide practical guidelines for utilizing racial or ethnic status in etiologic research. Although some journals have recently suggested guidelines, these are generally vague, often merely requiring authors to justify their use of race or ethnicity in subject matter terms and to explain how the variables are defined (16, 17). The topic also requires fresh examination in light of the revolution in molecular biology that allows genetic polymorphisms and their products to be assessed directly, obviating the need to rely on social categorizations as rough surrogates for unspecified biologic attributes. We summarize here the logical, conceptual, and practical problems associated with the ethnic paradigm as currently realized in observational research. We then offer a set of practical methodological guidelines, in light of this critique, to facilitate more consistent use of racial and ethnic classification in etiologic studies.

PRELIMINARY CONSIDERATIONS

Distinction between "race" and "ethnicity"
The prevailing notion of race in biomedical research has long been understood to imply that phenotypic traits like skin color and facial features can be used to categorize people into meaningful genetic subgroups (18). The concept of ethnicity has been suggested as an alternative to race because it is thought to carry less of a strictly biologic connotation, implying that groups may differ by cultural as well as biologic heritage (19–21). In practice, the distinction between these constructs is often blurred, leading many researchers to collapse them into a single dimension as "race/ethnicity" (14, 22) or "ethnorace" (23). Collapsing the terms is also justified because data are generally gathered by self-report, and many respondents consider the terms to be synonymous (24).

How do you know what race or ethnicity a person is/has?
The "gold standard" for racial/ethnic assessment is self-report (25). Although there are measurable biologic correlates of ancestry (26), there is no objective physiologic or anatomic verification of race/ethnicity because this is a descriptor of identity and therefore part of the subjective consciousness of the individual. The circularity of existing definitions for race in terms of ancestry reveals this necessary subjectivity. For example, the US Office of Management and Budget directive 15 definition for Black race is "A person having origins in any of the black racial groups of Africa" (27, p. 29835). The restriction to Black African ancestry, without guidance for how to operationalize this criterion, is necessary on prima facie grounds to exclude African groups that are not historically recognized as Black (e.g., Afrikaners, Arabs), but it is also vague enough to ensure that the formal definition conforms to any assertion of self-identity. In the broadest interpretation, all of humanity meets this definition (28).

Although recent innovations in molecular biology facilitate racial/ethnic identification from fragmentary biologic material (26, 29), this does not supersede self-identification as the ultimate basis for categorization. Although these techniques might seem to suggest a more objective basis for dividing humans into subspecies, this interpretation is faulty because it confuses prediction with validation. Investigators began with groups of people who were identified by self-report or social consensus as Black, White, and so forth and sought measures of physical or molecular traits that would predict the recorded racial classification. The standard against which such predictions are judged for accuracy, therefore, remains subjective identity. Rather than validating a biologic entity called race, these methods simply indicate that we can often predict how people are likely to define themselves or be recognized, even if no such category exists in nature as an objective entity (30).

The biologic content of racial/ethnic self-classification
Racial classifications that attempt validation rather than prediction presume a priori meaningful clustering of biologic traits in genetically isolated subpopulations. However, the degree and nature of that clustering remain undetermined. While many researchers are quick to cite the possible significance of various biologic differences among regional populations (e.g., sickle cell, Tay-Sachs), these sorts of traits have little relevance for etiologic research on the complex diseases that are the focus of most observational epidemiology. Although molecular analyses may reveal that regional populations vary in the frequency of some genetic markers, defining a race in genetic terms would require determining whether the human genome aggregates naturally into subunits. It is well-appreciated, for example, that as few as five tandem repeat or microsatellite markers can unambiguously identify most individuals; DNA fingerprinting, the most dramatic application of this principle, has precipitated a near collapse of the American system of assigning guilt for homicide (31). This observation follows from the fundamental uniqueness of all living organisms, whereas the challenge of classification is the opposite: finding meaningful categories amidst naturally occurring variation by assuming an inherent structure in the distribution of biologic traits.

The genetic basis of the difference between men and women, for example, appears clear at the level of the genome: the X and Y chromosomes throw large "switches" that influence every cell. Nonetheless, the essential genetic difference between the sexes remains obscure in a functional sense because we do not know what the genes are or what they do. Moving to a higher level of complexity, biologists have long speculated about what differentiates two closely related species. We share 99 percent DNA sequence identity with the chimpanzee, yet we are obviously very different, and we possess no meaningful biologic metric with which to quantify that genetic difference.

These considerations reinforce the complexity of defining human races at the level of the genome. There will be no "master switches," like the X/Y chromosomes, but at most an abundance of minor variants leading to subtle differences. Classification based on many minor variants requires a method of summarizing this information, which we currently lack. One popular conceptualization of a genome is a sequence of base pairs, but DNA has no objective dimensionality, only functionality. It follows that looking at sequence differences across groups is wholly insufficient as a basis for classification and comparison when the functional meaning of the variation remains unknown. Even if it were possible to sensibly quantify genomic variation, we would still need to determine how much variation is enough to make separate races. These problems must be resolved before we can make much use of the general concept of race as a coherent biologic entity, and until then we argue that creating categories within our species more often obscures than reveals meaning. Conceivably, genetics may someday advance to the point that variation across the genome can be measured, understood, quantified, and aggregated, but our first glimpse of that task reveals how difficult this will be and how hopelessly inadequate existing racial classification schemes have been for this purpose.

LOGICAL FOUNDATIONS

What is a cause?
Studies of disease etiology are concerned with causation. The counterfactual model of causality, which dominates modern quantitative inference in biomedicine, defines a "cause" in relation to an "effect" as a contrast between (hypothetical) intervention scenarios (32–34). A consequence of this definition appears to be that factors under consideration as potential causes must be plausibly manipulable; they cannot include fixed attributes such as race (35). In the social sciences, where manipulability is rarely possible, there has been greater resistance to counterfactual definitions of causality because of this implicit restriction, although no consistent alternative definition has been proposed (36, pp. 135–8; 37, pp. 40–5). When causal definitions are tied to human action, by analogy with experimental manipulation, there is no ambiguity about the meaning of a causal attribution; the effect is a contrast between the outcome distributions under various manipulative regimens (36, pp. 70–2). When such manipulation is not tenable, even hypothetically, then effects can only correspond to contrasts between conditional distributions such as Pr(Y = y|X = x1) and Pr(Y = y|X = x2), where x1 and x2 are observed levels of X. These contrasts provide no distinction between association through causation or through a common antecedent cause, a long-standing philosophic objection to nonmanipulative approaches to defining causation (38).

Race as a cause
The causal effect of race in an etiologic model is presumably the contrast between outcome distributions for subjects manipulated to various racial/ethnic states, for example, Pr(Y = y|SET[Race = Black]) versus Pr(Y = y|SET[Race = White]) (36, p. 70). To estimate such quantities we must accept the existence of counterfactual distributions, such as the outcome distribution for Whites, had they been Black (39). Both the untenability of the intervention and the absurdity of the counterfactual distribution have led several authors to reject race as a valid cause in this sense (40, 41).
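As a notational aside, the two kinds of contrast discussed so far can be set side by side. The sketch below reuses the SET notation from the text; writing the contrasts on the difference scale, and the labels \Delta_{causal} and \Delta_{assoc}, are assumptions made here for illustration rather than notation taken from the cited sources.

```latex
\begin{align*}
\Delta_{\text{causal}} &= \Pr(Y = y \mid \mathrm{SET}[X = x_1]) - \Pr(Y = y \mid \mathrm{SET}[X = x_2]) \\
\Delta_{\text{assoc}}  &= \Pr(Y = y \mid X = x_1) - \Pr(Y = y \mid X = x_2)
\end{align*}
```

The second line can always be computed from observed data, but it equals the first only under further assumptions, such as the absence of a common antecedent cause of X and Y. When X is race, the difficulty raised above is more basic: the first line may not be well defined at all.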

Positing a counterfactual racial/ethnic state may be judged plausible if racial identity is not considered to be a fundamental or unalterable characteristic of an individual (7). Individuals in racialized societies are not free to adopt any identity they wish, however, but rather must generally adhere to an identity consistent with social expectation based on phenotype and behavior. Moreover, identity is not generally a product of individual volition. If we consider a hypothetical manipulation in which a fetus in a White mother is treated to induce dark skin at birth and then endowed with an African-American self-identity by being raised culturally as Black, we would achieve something approaching the counterfactual contrast necessary for viewing race as a well-defined cause. The probability of a given outcome among treated individuals is contrasted with the probability that would have pertained if, counter to fact, the intervention had not occurred. Of course, for any given individual, only one of the two states is directly observable.

The imaginary intervention above reveals the extent to which race/ethnicity is ill-suited to be considered a cause, even if such an intervention were feasible. The hypothetical contrast is considered in the "closest possible world," in which only exposure is manipulated and all other variables are unperturbed (32; 36, pp. 238–40), because factors affected by exposure are part of its total effect (42). For race/ethnicity as an exposure, this contrast is difficult to articulate because the exposure is a state of lifelong identity. Virtually all other relevant variables in a study (e.g., diet, socioeconomic status, neighborhood characteristics) will, as consequences of exposure, be differentially distributed in the two contrasting states because few covariates are plausibly unaffected by race/ethnicity. Because this hypothetical manipulation is so global in its total effect, some have referred to social factors such as race/ethnicity as fundamental or ultimate causes (43).

IMPLICATIONS OF STUDY DESIGN

Study designs that permit race as a well-defined cause
When race/ethnicity is a trait of an individual and we wish to infer disease causation within that individual, it may be difficult to posit an alternate status. When the etiologic process under study is not internal to the individual whose race/ethnicity is assessed, however, valid causal contrasts are more readily defined. An example of this paradigm is the use of "testers" in discrimination investigations: actors who attempt activities such as renting an apartment or securing a bank loan with identical presentations (using fixed scripts) except for their racial/ethnic status. The difference in the experiences of the testers is attributable to racial discrimination, because the study design ensures that other relevant details of the encounter are held constant. Because the causal effect of race is directly estimated by contrasting the outcome distributions under each treatment in this experimental design, the use of race as an exposure is valid and interpretable. Similar conclusions have been stated regarding the causal effect of gender (36, pp. 128–30; 44).

This general approach has proliferated recently in medical research as well and constitutes a useful paradigm for understanding one aspect of racial/ethnic variation in health status (45–49). Designs may involve scripted case presentations from actors of various racial/ethnic backgrounds (50) or diagnostic decisions from duplicate medical records on which racial/ethnic status is systematically varied (51). Even for observational studies in which race/ethnicity is not directly manipulated, this genre of study still allows for valid causal inference in principle, under the assumption that other factors predictive of the outcome are included as covariates (52). Although such investigations may not be considered strictly etiologic because they address differential diagnosis and access to health services as opposed to a purely biologic hypothesis, most etiologic research on racial/ethnic differentials must address the contributions of systematic diagnostic and treatment differentials to measured status (53).

Characteristics of race as a cause in standard study designs
When a racial/ethnic contrast is estimated in standard designs and interpreted as an effect internal to study participants, inference is complicated because variables that are intrinsic are causally antecedent to nearly all measurable covariates. That is, a person's race/ethnicity is fixed prior to his/her measured social, physiologic, and psychological status; all of these measurable factors are downstream of the exposure in a racially stratified society (43). A consequence of this temporal primacy is that virtually all potential covariates in analyses of racial/ethnic disparities are causal intermediates. It has been suggested that adjustment for covariates, such as social class in racial/ethnic comparisons, risks overcontrol because social class is itself affected by race (2, 39). Indeed, conditioning on almost any other covariate will bias estimates of total effect because adjustment for causal intermediates using standard methods is not generally valid (42).
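The overcontrol problem can be illustrated with a small numerical sketch. All probabilities below are hypothetical, and the structure assumes for simplicity that race affects disease only through SES and that nothing else influences both SES and disease; the names pr_poor, pr_disease, and risk are illustrative and do not come from the article.

```python
# Hypothetical probabilities, chosen only for illustration.
pr_poor = {"r0": 0.2, "r1": 0.6}          # Pr(SES = poor | race)
pr_disease = {"poor": 0.3, "rich": 0.1}   # Pr(D | SES)


def risk(race, ses):
    """Risk of disease within a race-by-SES cell.

    Assumption: risk depends on SES only, so race acts entirely
    through the intermediate.
    """
    return pr_disease[ses]


def total_risk(race):
    """Marginal risk of disease for a race group, averaging over SES."""
    p = pr_poor[race]
    return p * risk(race, "poor") + (1 - p) * risk(race, "rich")


# Total effect of race: compare marginal risks without conditioning on SES.
total_rr = total_risk("r1") / total_risk("r0")

# SES-stratified comparison: conditions on the causal intermediate.
stratum_rr = {ses: risk("r1", ses) / risk("r0", ses) for ses in pr_disease}

print(round(total_rr, 2), stratum_rr)
```

Under these assumed numbers the marginal comparison gives a risk ratio of about 1.57, while each SES-specific comparison gives exactly 1, so the SES-adjusted estimate no longer reflects the total effect. The sketch does not show that the stratum-specific value is a valid direct effect in general; that requires further assumptions.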

ANALYTICAL ISSUES

Effect decomposition by adjustment for consequences of race/ethnicity
Some authors acknowledge that adjustment for consequences of racial/ethnic status yields a biased estimate of total effect, but they contend that adjustment for causal intermediates decomposes the total effect into indirect effects (transmitted through the intermediate) and direct effects (transmitted through unspecified pathways) (figure 1a). This is the strategy underlying the common practice of adjusting for socioeconomic variables to see if the race effect "goes away" (e.g., testing the null hypothesis that the partial correlation between race/ethnicity and outcome equals zero). This is presently the most common analytical framework for studying race/ethnicity in epidemiologic research (54–58). Despite the popularity of this approach, it is highly prone to providing misleading inference. Not only is this method likely to suggest spurious direct effects of race due to misspecification of the intermediates (22, 59, 60), but it is also prone to bias because adjustment for intermediates does not generally provide valid estimates of direct effects (36, pp. 163–5; 61). The exchangeability conditions that provide for valid causal inference for a given exposure are not sufficient to provide for separate identification of direct and indirect effects of that exposure (36, pp. 127–8).



FIGURE 1. Direct and indirect causal pathways linking race, socioeconomic status (SES), and disease.

 
As a simple illustration of this problem, consider the following thought exercise (62). Suppose that there are two race groups (Black, White), two socioeconomic states (poor, rich), and two outcomes (disease, health) and, for sake of simplicity, that all effects are completely deterministic. We are interested in two causal types in the population. For type 1 individuals, the entire effect of race is indirect, relayed through poverty: Black race leads invariably to poverty, and poverty leads invariably to disease (figure 1b). For type 2 individuals, the entire effect of race is direct: Black race leads invariably to poverty, and Black race leads invariably to disease, but poverty has no effect on disease whatsoever (figure 1c). The question regarding what proportion of the total observed effect is indirect (relayed through poverty) and what proportion is direct is therefore equivalent to merely determining the proportions of types 1 and 2 in the population. Now suppose that we observe values of these variables and intend to separate direct from indirect effects of race by controlling for socioeconomic status. Every Black subject presents with the same vector of values: Black, poor, disease. There is no logical way to determine the proportions of types 1 and 2 and, thus, no way to separate out the two types of effects. Consequently, for the causal structure shown in figure 1a, there is no clear interpretation of an estimate for the adjusted effect of race. Using a more elaborate mix of deterministic causal types, Robins and Greenland (61) demonstrate that attempts at statistical control from any method are prone to bias in these settings and can easily indicate "independent" effects even when none exist.
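The thought exercise above can be traced in a few lines of code. This is a minimal sketch under stated assumptions: the text specifies only what happens to Black subjects, so the rule that White subjects are rich and healthy is an assumption added here, and the function names outcome and observed_table are illustrative.

```python
from collections import Counter


def outcome(race, causal_type):
    """Return (ses, disease) for one person under the deterministic rules."""
    # Both types: Black race leads invariably to poverty.
    # Assumption: White subjects are invariably rich.
    ses = "poor" if race == "Black" else "rich"
    if causal_type == 1:
        # Type 1: disease is produced only by poverty (indirect pathway).
        disease = ses == "poor"
    else:
        # Type 2: disease is produced only by race (direct pathway).
        disease = race == "Black"
    return ses, disease


def observed_table(n_per_race, share_type1):
    """Count observed (race, ses, disease) vectors for a given mix of types."""
    n_type1 = round(n_per_race * share_type1)
    counts = Counter()
    for race in ("Black", "White"):
        for i in range(n_per_race):
            causal_type = 1 if i < n_type1 else 2
            ses, disease = outcome(race, causal_type)
            counts[(race, ses, disease)] += 1
    return counts


# Vary the (unobservable) share of type 1 individuals in the population.
tables = [observed_table(1000, share) for share in (0.0, 0.25, 0.5, 1.0)]
all_identical = all(table == tables[0] for table in tables)

print(tables[0])
print("identical across mixes:", all_identical)
```

Every mix yields the same two observed cells (Black, poor, diseased; White, rich, healthy), so the observed data carry no information about the share of type 1 individuals, and hence none about the split between direct and indirect effects.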

Race/ethnicity as a covariate
A common use of racial/ethnic categorization in observational research is as a covariate when another quantity is the primary exposure of interest. Adjustment in this context is equivalent to standardizing the distribution of study subjects to some alternate set of racial/ethnic proportions. Adjustments of this sort have been criticized because they provide no insight into the role or the meaning of the race/ethnicity quantity (63, 64). Despite this problem, adjustment does not invoke the logical and technical dilemmas described above. Whatever the unspecified myriad factors for which racial/ethnic status is a surrogate, these may be partially controlled when analyses are stratified or standardized by this variable. Because this is analogous to simply stratifying or sampling the populations with weighted probabilities (Pr), it does not lead to the same problems of interpretation created by focusing on race/ethnicity as the causal factor of interest.

For example, table 1 shows a hypothetical population (n = 18,750) with two race groups (r0, r1), two SES levels (i0, i1), and binary outcome D (d0, d1). The crude association between SES and disease (relative risk = 5.14) is confounded by an unbalanced representation of levels of SES within racial/ethnic groups in the observed population, so that the association measure does not equal the true causal contrast that would result from intervention on SES. Reweighting each cell by [Pr(SES)/Pr(SES|race)], we obtain a pseudopopulation with the same number of SES = i1 (n = 3,750) and SES = i0 (n = 15,000) but that is standardized to a new joint distribution so that race/ethnicity no longer confounds the relation between SES and outcome D (65). In the reweighted data in table 2, Pr(SES = i1|race = r0) = Pr(SES = i1|race = r1) = 0.2, so that no confounding is expected.


TABLE 1. A hypothetical population (n = 18,750) with SES effect on disease D confounded by race

 

TABLE 2. A pseudopopulation (n = 18,750) formed from weighted strata of table 1, with SES effect on disease D unconfounded by race

 
Table 2 merely reflects oversampling of certain strata in order to achieve an unbiased causal effect estimate for SES and, in this example with effect homogeneity over the racial/ethnic strata, the result is equivalent to that achieved with any weighted average of stratum-specific estimates (e.g., Mantel-Haenszel procedure). There may still be unmeasured confounding between SES and outcome D, as in the previous example, but there is an important distinction; that is, when SES is the factor of interest in a structure such as figure 1a, conditional independence between SES and the vector of counterfactual outcomes {D|SET(SES = i0), D|SET(SES = i1)} is sufficient for unbiased causal inference (35). When race/ethnicity is the factor of interest, on the other hand, this same condition does not imply unbiased estimation of direct effects by conditioning on SES (61, 66).
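The reweighting by Pr(SES)/Pr(SES|race) and its agreement with a Mantel-Haenszel summary can be sketched as follows. The cell counts are hypothetical values chosen for easy arithmetic; they are not the contents of table 1, and the function names are illustrative.

```python
# Hypothetical counts, NOT the values of table 1:
# (race, ses) -> (number of subjects, number of cases)
cells = {
    ("r0", "i0"): (8000, 80),
    ("r0", "i1"): (2000, 40),
    ("r1", "i0"): (2000, 100),
    ("r1", "i1"): (8000, 800),
}


def risk_ratio(table):
    """Risk ratio for SES = i1 versus SES = i0, collapsing over race."""
    n = {"i0": 0.0, "i1": 0.0}
    d = {"i0": 0.0, "i1": 0.0}
    for (_, ses), (size, cases) in table.items():
        n[ses] += size
        d[ses] += cases
    return (d["i1"] / n["i1"]) / (d["i0"] / n["i0"])


def reweight(table):
    """Weight each cell by Pr(SES) / Pr(SES | race) to form a pseudopopulation."""
    total = sum(size for size, _ in table.values())
    ses_total, race_total = {}, {}
    for (race, ses), (size, _) in table.items():
        ses_total[ses] = ses_total.get(ses, 0) + size
        race_total[race] = race_total.get(race, 0) + size
    pseudo = {}
    for (race, ses), (size, cases) in table.items():
        weight = (ses_total[ses] / total) / (size / race_total[race])
        pseudo[(race, ses)] = (size * weight, cases * weight)
    return pseudo


def mantel_haenszel_rr(table):
    """Mantel-Haenszel risk ratio for SES, stratified by race."""
    num = den = 0.0
    for race in sorted({r for r, _ in table}):
        n0, d0 = table[(race, "i0")]
        n1, d1 = table[(race, "i1")]
        stratum_total = n0 + n1
        num += d1 * n0 / stratum_total
        den += d0 * n1 / stratum_total
    return num / den


crude_rr = risk_ratio(cells)
pseudo = reweight(cells)
weighted_rr = risk_ratio(pseudo)
mh_rr = mantel_haenszel_rr(cells)

print(round(crude_rr, 2), round(weighted_rr, 2), round(mh_rr, 2))
```

With these assumed counts the crude risk ratio is about 4.67, whereas the reweighted and Mantel-Haenszel estimates both equal the common within-race risk ratio of 2. The pseudopopulation keeps the original SES margins (10,000 subjects at each level) while equalizing the SES distribution across the two race groups.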

SUMMARY RECOMMENDATIONS

When total effect of race/ethnicity is the quantity of interest
Surveillance. In the description of crude population incidence or prevalence, results stratified by race/ethnicity may be crucial for documenting existing inequalities and monitoring disparities over time (67) (table 3). This is an important activity for assessing the population burden of disease, allocating public health and medical resources, and motivating etiologic research. A notable potential drawback is the inadvertent reification of race as a biomedical quantity (68). Nonetheless, because of dramatic racial/ethnic disparities for many conditions, this is generally considered a significant and consequential research program.


TABLE 3. Summarization of methodological guidelines for use of race/ethnicity in observational research

 
Health care epidemiology. For hypotheses concerning the behaviors of patients and health care providers, interactions between patients and providers, and other aspects of social relations that influence care-seeking and evaluation and that shape the provision and consequences of health care, the effects of race/ethnicity are potentially valid, interpretable, and important.

Etiologic research: the ethnic paradigm. There is no unambiguous causal interpretation to total race effect estimates in the context of etiologic research. It is questionable, therefore, whether this should ever be a quantity of interest for biomedical researchers, except in cases when race is a marker for a process external to individual physiology, as in the investigation of health services disparities or other sociologic questions. In the event that an investigator is convinced that a pathophysiologic racial/ethnic effect is meaningful, however, covariate sets for adjustment must be chosen cautiously. Factors that may confound the estimated effect measures for race are other invariant attributes of the individuals, including sex, age, and genetic factors. Estimates for the total effect of race should not generally be adjusted for or stratified by other covariates.

When direct and indirect effects of race/ethnicity are the quantities of interest
Although a common analytical strategy is to adjust race/ethnicity for social factors in order to identify a direct biologic effect that is not mediated by measured covariates, this approach is highly problematic. Even if an interpretation could be granted to a racial/ethnic effect, the rigid assumptions necessary in order to render this decomposition strategy reliably valid are so far from plausible that it is difficult to imagine any useful and cogent inference resulting from this practice.

In studies of health care epidemiology, potentially valid covariates for adjustment depend on the particular design but are considerably less limited by consideration of causal order than in studies of individual pathophysiology. For example, in the study of the etiology of heart disease, comorbid diabetic status is affected by race/ethnicity and therefore not a candidate for adjustment when race/ethnicity is the factor of interest. On the other hand, when race/ethnicity is the factor of interest in a study of heart disease diagnosis or management (50), the causal process is external to the study participant. In this case, a comorbid condition, such as diabetes, is not causally subsequent to the exposure of interest and would often be an important and valid candidate for statistical adjustment in the effort to estimate an unbiased effect of race/ethnicity.

When the effect of a variable confounded by race/ethnicity is the quantity of interest
Adjustment for race/ethnicity may be reasonable when attempting to estimate the causal effect of another factor of interest (i.e., when race/ethnicity is merely a nuisance in the data). This use of racial/ethnic information has been criticized because researchers often fail to describe what they believe race/ethnicity represents in such a model. Although an understanding of the observed relation between race/ethnicity and the outcome is not furthered by this usage, this does not detract from the utility of improving the effect estimation of interest. Nonetheless, although conditioning on racial/ethnic status may reduce bias, other more specific measures, for which racial/ethnic status is acting as a rough surrogate, may reduce bias more effectively.

ACKNOWLEDGMENTS

Supported in part by grant R01-HD-39746 from the National Institute of Child Health and Human Development.

The authors thank Chandra Ford and Dr. Sol Kaufman for their insightful critiques of early drafts of the manuscript.

NOTES

Correspondence to Dr. Jay S. Kaufman, Department of Epidemiology (CB#7435), University of North Carolina School of Public Health, McGavran-Greenberg Hall, Pittsboro Road, Chapel Hill, NC 27599-7435 (e-mail: Jay_Kaufman{at}unc.edu).

Editor's note: An invited commentary on this article appears on page 299, and the authors' response appears on page 305.

REFERENCES

  1. Cooper RS. A note on the biologic concept of race and its application in epidemiologic research. Am Heart J 1984;108:715–23.
  2. Cooper RS, David R. The biological concept of race and its application to public health and epidemiology. J Health Polit Policy Law 1986;11:97–115.
  3. Kaufman JS, Cooper RS. Epidemiologic research on minority health: in search of the hypothesis. Public Health Rep 1995;110:662–6.
  4. Witzig R. The medicalization of race: scientific legitimization of a flawed social construct. Ann Intern Med 1996;125:675–9.
  5. Muntaner C, Nieto FJ, O'Campo P. The Bell Curve: on race, social class, and epidemiologic research. Am J Epidemiol 1996;144:531–6.
  6. Williams DR. Race and health: basic questions, emerging directions. Ann Epidemiol 1997;7:322–33.
  7. Muntaner C. Invited Commentary: social mechanisms, race, and social epidemiology. Am J Epidemiol 1999;150:121–6.
  8. Stolley PD. Race in epidemiology. Int J Health Serv 1999;29:905–9.
  9. Last JM, ed. A dictionary of epidemiology. 3rd ed. New York, NY: Oxford University Press, 1995.
  10. Jones CP, LaVeist TA, Lillie-Blanton M. "Race" in the epidemiologic literature: an examination of the American Journal of Epidemiology, 1921–1990. Am J Epidemiol 1991;134:1079–84.
  11. Ahdieh L, Hahn RA. Use of the terms "race," "ethnicity," and "national origins": a review of articles in the American Journal of Public Health, 1980–1989. Ethn Health 1996;1:95–8.
  12. Buehler JW, Stroup DF, Klaucke DN, et al. The reporting of race and ethnicity in the National Notifiable Diseases Surveillance System. Public Health Rep 1989;104:457–65.
  13. Cooper RS. Health and the social status of blacks in the United States. Ann Epidemiol 1993;3:137–44.
  14. Williams DR. Race/ethnicity and socioeconomic status: measurement and methodological issues. Int J Health Serv 1996;26:483–505.
  15. Cruickshank JK, Beevers DG, eds. Ethnic factors in health and disease. Boston, MA: Wright, 1989.
  16. Census, race, and science. (Editorial). Nat Genet 2000;24:97–8.
  17. For discussion: race, ethnicity and nationality. (Editorial). Paediatr Perinat Epidemiol 2000;14:13.
  18. Cartmill M. The status of the race concept in physical anthropology. Am Anthropologist 1999;100:651–60.
  19. Crews DE, Bindon JR. Ethnicity as a taxonomic tool in biomedical and biosocial research. Ethn Dis 1991;1:42–9.
  20. Senior PA, Bhopal R. Ethnicity as a variable in epidemiologic research. BMJ 1994;309:327–30.
  21. McKenzie K, Crowcroft NS. Describing race, ethnicity, and culture in medical research. (Editorial). BMJ 1996;312:1054.
  22. Herman AA. Toward a conceptualization of race in epidemiologic research. Ethn Dis 1996;6:7–20.
  23. Goldberg DT. Racist culture: philosophy and the politics of meaning. Cambridge, MA: Blackwell Publishers, 1993.
  24. McKenney NR, Bennett CE. Issues regarding data on race and ethnicity: the Census Bureau experience. Public Health Rep 1994;109:16–25.
  25. Kaufman JS. How inconsistencies in racial classification demystify the race construct in public health statistics. Epidemiology 1999;10:101–3.
  26. Shriver MD, Smith MW, Li J, et al. Ethnic-affiliation estimation by use of population-specific DNA markers. Am J Hum Genet 1997;60:957–64.
  27. Executive Office of the President, Office of Management and Budget. Race and ethnic standards for federal statistics and administrative reporting. Statistical policy directive no. 15. Federal Register 1994 (June 9);59:FR 29831–5 (http://frwebgate3.access.gpo.gov/cgi-bin/waisgate.cgi?WAISdocID=788876441+9+0+0&WAISaction=retrieve).
  28. Leakey RE. A review of the evidence for our African origins. Ethn Dis 1991;1:8–20.
  29. Parra EJ, Marcini A, Akey J, et al. Estimating African-American admixture proportions by use of population-specific alleles. Am J Hum Genet 1998;63:1839–51.
  30. Sauer NJ. Forensic anthropology and the concept of race: if races don't exist, why are forensic anthropologists so good at identifying them? Soc Sci Med 1992;34:107–11.
  31. Scheck B, Neufeld P, Dwyer J. Actual innocence: five days to execution, and other dispatches from the wrongly convicted. New York, NY: Doubleday, 2000.
  32. Lewis D. Causation. J Philos 1973;70:556–67.
  33. Greenland S. Causation. In: Armitage P, Colton T, eds. Encyclopedia of biostatistics. New York, NY: John Wiley & Sons, Inc, 1998:569–72.
  34. Greenland S, Robins JM, Pearl J. Confounding and collapsibility in causal inference. Stat Sci 1999;14:29–47.
  35. Stone R. The assumptions on which causal inferences rest. J R Stat Soc (B) 1993;55:455–66.
  36. Pearl J. Causality: models, reasoning and inference. Cambridge, United Kingdom: Cambridge University Press, 2000.
  37. Bollen KA. Structural equations with latent variables. New York, NY: John Wiley & Sons, Inc, 1989.
  38. Sobel ME. Causal inference in the social and behavioral sciences. In: Arminger G, Clogg C, Sobel ME, eds. Handbook of statistical modeling for the social and behavioral sciences. New York, NY: Plenum Press, 1995:1–38.
  39. Morgenstern H. Defining and explaining race effects. Epidemiology 1997;8:609–10.
  40. Holland PW. Statistics and causal inference. J Am Stat Assoc 1986;81:945–61.
  41. Kaufman JS, Cooper RS. Seeking causal explanations in social epidemiology. Am J Epidemiol 1999;150:113–20.
  42. Rosenbaum PR. The consequences of adjustment for a concomitant variable that has been affected by the treatment. J R Stat Soc (A) 1984;147:656–66.
  43. Link BG, Phelan J. Social conditions as fundamental causes of disease. J Health Soc Behav 1995;38(suppl):80–94.
  44. Holland PW. Causal mechanism or causal effect: which is best for statistical science? Discussion of "Employment discrimination and statistical science" by A. P. Dempster. Stat Sci 1988;3:186–8.
  45. Ford ES, Cooper RS. Implications of race/ethnicity for health and health care use. Health Serv Res 1995;30:237–52.
  46. King G. Institutional racism and the medical/health complex: a conceptual analysis. Ethn Dis 1996;6:30–46.
  47. Van Ryn M, Burke J. The effect of patient race and socioeconomic status on physicians' perceptions of patients. Soc Sci Med 2000;50:813–28.
  48. Rathore SS. The effects of patient sex and race on medical students' ratings of quality of life. Am J Med 2000;108:561–6.
  49. Fiscella K, Franks P, Gold MR, et al. Inequality in quality: addressing socioeconomic, racial and ethnic disparities in health care. JAMA 2000;283:2579–84.
  50. Schulman KA, Berlin JA, Harless W, et al. The effect of race and sex on physicians' recommendations for cardiac catheterization. N Engl J Med 1999;340:618–26.
  51. Loring M, Powell B. Gender, race and DSM-III: a study of the objectivity of psychiatric diagnostic behavior. J Health Soc Behav 1988;29:1–22.
  52. Conigliaro J, Whittle J, Good CB, et al. Understanding racial variation in the use of coronary revascularization procedures: the role of clinical factors. Arch Intern Med 2000;160:1329–35.
  53. Liu ET. The uncoupling of race and cancer genetics. Cancer 1998;83:1765–9.
  54. Brancati FL, Whelton PK, Kuller LH, et al. Diabetes mellitus, race and socioeconomic status: a population study. Ann Epidemiol 1996;6:67–73.
  55. Davey Smith G, Neaton JD, Wentworth D, et al. Mortality differences between black and white men in the USA: contribution of income and other risk factors among men screened for the MRFIT. Lancet 1998;351:934–9.
  56. Winkleby MA, Kraemer HC, Ahn DK, et al. Ethnic and socioeconomic differences in cardiovascular disease risk factors: findings for women from the Third National Health and Nutrition Examination Survey, 1988–1994. JAMA 1998;280:356–62.
  57. Ng-Mak DS, Dohrenwend BP, Abraido-Lanza AF, et al. A further analysis of race differences in the National Longitudinal Mortality Study. Am J Public Health 1999;89:1748–51.
  58. Robbins AS, Whittemore AS, Thom DH. Differences in socioeconomic status and survival among White and Black men with prostate cancer. Am J Epidemiol 2000;151:409–16.
  59. Kaufman JS, Durazo-Arvizu RA, McGee DL, et al. The difference in diabetes risk in blacks and whites. (Letter). Ann Epidemiol 1997;7:76–7.
  60. Kaufman JS, Cooper RS, McGee DL. Socioeconomic status and health in blacks and whites: the problem of residual confounding and the resiliency of race. Epidemiology 1997;8:621–8.
  61. Robins JM, Greenland S. Identifiability and exchangeability for direct and indirect effects. Epidemiology 1992;3:143–55.
  62. Kaufman JS. Progress and pitfalls in the social epidemiology of cancer. Cancer Causes Control 1999;10:489–94.
  63. LaVeist TA. Beyond dummy variables and sample selection: what health services researchers ought to know about race as a variable. Health Serv Res 1994;29:1–16.
  64. Moss N. What are the underlying sources of racial differences in health? Ann Epidemiol 1997;7:320–1.
  65. Robins JM, Hernán MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology 2000;11:550–60.
  66. Holland PW. Causal inference, path analysis, and recursive structural equations models. In: Clogg C, ed. Sociological methodology. Washington, DC: American Sociological Association, 1988:449–84.
  67. Barnett E, Armstrong DL, Casper ML. Evidence of increasing coronary heart disease mortality among black men of lower social class. Ann Epidemiol 1999;9:464–71.
  68. American Association of Physical Anthropologists (AAPA). AAPA statement on biological aspects of race. Am J Phys Anthropol 1996;101:569–70.
  69. Brancati FL, Kao WHL, Folsom AR, et al. Incident type 2 diabetes mellitus in African American and white adults: the Atherosclerosis Risk in Communities Study. JAMA 2000;283:2253–9.
  70. Ford ES. Serum copper concentration and coronary heart disease among US adults. Am J Epidemiol 2000;151:1182–8.
Received for publication July 3, 2000. Accepted for publication January 24, 2001.