Estimating causal effects

George Maldonadoa and Sander Greenlandb

a University of Minnesota School of Public Health, Mayo Mail Code 807, 420 Delaware St. SE, Minneapolis, MN 55455–0392, USA. E-mail: GMPhD{at}umn.edu
b Department of Epidemiology, UCLA School of Public Health, Los Angeles, CA 90095–1772, USA.

Although one goal of aetiologic epidemiology is to estimate ‘the true effect’ of an exposure on disease occurrence, epidemiologists usually do not precisely specify what ‘true effect’ they want to estimate. We describe how the counterfactual theory of causation, originally developed in philosophy and statistics, can be adapted to epidemiological studies to provide precise answers to the questions ‘What is a cause?’, ‘How should we measure effects?’ and ‘What effect measure should epidemiologists estimate in aetiologic studies?’ We also show that the theory of counterfactuals (1) provides a general framework for designing and analysing aetiologic studies; (2) shows that we must always depend on a substitution step when estimating effects, and therefore the validity of our estimate will always depend on the validity of the substitution; (3) leads to precise definitions of effect measure, confounding, confounder, and effect-measure modification; and (4) shows why effect measures should be expected to vary across populations whenever the distribution of causal factors varies across the populations.

Introduction

Imagine that the creator of the universe appears to you in a dream and grants you the answer to one public-health question. The conversation might go as follows:

You: What is the true effect of (your exposure here, denoted by E) on the occurrence of (your disease here, denoted by D)?

Creator: What do you mean by ‘the true effect’? The true value of what parameter?

You: The true relative risk.

Creator: Epidemiologists use the term relative risk for several different parameters. Which do you mean?

You: The ratio of average risk with and without exposure—what some call the risk ratio1 and others call the incidence proportion ratio.2

Creator: Which incidence proportion ratio?

You: Pardon?

Creator: Do you want a ratio of average disease risk in two different groups of people with different exposure levels?

You: Yes.

Creator: So you want a descriptive incidence proportion ratio?

You: No, not descriptive. Causal. An incidence proportion ratio that isolates the effect of E on D from all other causal factors.

Creator: By ‘isolate’, you mean a measure that applies to a single population under different possible exposure scenarios?

You: Yes, that's what I mean.

Creator: OK. Which causal incidence proportion ratio?

You: Pardon?

Creator: For what population, and for what time period? The true value of a causal incidence proportion ratio can be different for different groups of people and for different time periods. It's not necessarily a biological constant, you know.

You: Yes, of course. For population (your population here, denoted by P) between the years (your study time period here, denoted by t0 to t1).

Creator: By population P, do you mean: (1) everyone in population P, or (2) the people in population P who have a specific set of characteristics?

You: Pardon?

Creator: As I just said, the true value of a causal incidence proportion ratio is not necessarily a biological constant. It can be different for subgroups of a population.

You: Of course. Everyone in population P.

Creator: OK. Comparing what two exposure levels?

You: Exposed and unexposed.

Creator: What do you mean by exposed and unexposed? Exposed for how long, to how much, and during what time period? There are many different ways you could define exposed and unexposed, and each of the corresponding possible ratios can have a different true value, you know.

You: Of course. Ever exposed to any amount of E versus never exposed to E.

Creator: The incidence proportion ratio for the causal effect on D of ever E compared to never E in population P during the study time period t0 to t1 is (your causal incidence-proportion-ratio parameter value here).

The point of the above is that, while one goal of etiologic epidemiology is to estimate ‘the true effect’ of an exposure on disease frequency, we usually do not precisely specify what ‘true effect’ we want to estimate. We may not be able to do so. For example, before reading this paper, would you have required less prompting than in the dialogue above? How many published papers explicitly state what the authors mean by ‘true’ relative risk or odds ratio, or whether the estimated measure of association is intended to have a descriptive or causal interpretation? How many papers explicitly define the population or time period of interest? How many etiologic papers over-emphasize results that cannot be given a causal interpretation, such as significance tests, P-values, correlation coefficients, or proportion of variance ‘explained’?

In this paper we discuss the questions ‘What is a cause?’, ‘How should we measure effects?’ and ‘What effect measure should epidemiologists estimate in etiologic studies?’ We begin by adapting the counterfactual approach to causation, originally developed in philosophy and in statistics,3,4 to epidemiological studies. In the process, we give precise answers to these questions, and we describe how these answers have important implications for etiologic research: (1) Under the counterfactual approach, the measure we term a ‘causal contrast’ is the only meaningful effect measure for etiologic studies. (2) The counterfactual approach provides a general framework for designing and analysing epidemiological studies. (3) The counterfactual definition of causal effect shows why direct measurement of an effect size is impossible: We must always depend on a substitution step when estimating effects, and the validity of our estimate will thus always depend on the validity of the substitution.3,5–7 (4) The counterfactual approach makes clear that a critical step in study interpretation is the formal quantification of bias in study results. (5) The counterfactual approach leads to precise definitions of effect measure, confounding, confounder, and to precise criteria for effect-measure modification.

In the discussion that follows, we assume that the study outcome is a disease (e.g. lung cancer); this discussion can be readily extended to any outcome (e.g. a health behaviour such as cigarette smoking). We also assume for simplicity that disease occurrence is deterministic; under a stochastic model, the quantities we discuss are probabilities or expected values.6,7

The counterfactual approach

History
In 1748, the renowned Scottish philosopher David Hume wrote ‘we may define a cause to be an object followed by another ... where, if the first object had not been, the second never had existed.’3,8 A key innovation of this definition was that it pivoted on a clause of the form ‘if C had not occurred, D would not have either’, where C and D are what actually occurred. Such a clause, which hypothesizes what would have happened under conditions contrary to actual conditions, is called a counterfactual conditional. Despite its early appearance, this counterfactual concept of causation received no formal basis until 1923 when the statistician Jerzy Neyman presented a quantitative conceptual model for causal analysis.9 This model was originally known as the ‘randomization model’10 and was later called the ‘potential-outcomes model’ (or, inaccurately, ‘Rubin's model’) when extended to observational studies.11 The model has since been widely (though not universally12) adopted by statisticians and others seeking a logical foundation for statistical analysis of causation.4,10,13–16 These developments were paralleled by more extensive analysis of counterfactual reasoning by philosophers.17–20 A comprehensive review of causality theory is provided by Pearl,15 who shows how structural-equation models and graphical causal models (causal diagrams) translate directly to counterfactual models, shedding light on all three approaches. A brief review of these connections is given by Greenland,21 and Greenland et al.22 provide a more extensive review of graphical causal modelling for epidemiological research.

Target
We will use the term target population for the group of people about which our scientific or public-health question asks, and therefore for which we want to estimate the causal effect of an exposure. The target population could be composed of one group of people (as in most epidemiological studies), several groups of people (as in an intervention study in several communities), or one person. For simplicity, in the rest of this discussion we assume that the target population is one group of people.

Let the etiologic time period be the time period about which our scientific or public-health question asks. The beginning and end of this time period are specified by the study question. For example, in a study of the effectiveness of a back-injury prevention programme in a workplace, the etiologic time period could be any time period after the implementation of the programme. Note that this period may vary among individuals (e.g. the etiologic time period for a study of weight gain during pregnancy and pre-eclampsia spans only pregnancy), and not all of the period need be time at risk. For example, the etiologic time period for a study of intrauterine diethylstilbestrol exposure and subsequent fertility problems could include childhood, a time at no risk of such problems but during which etiologically relevant events (e.g. puberty) occur.


Let A be the number of new cases of the study disease in the target population during the etiologic time period. Let B be the denominator for computing disease frequency in the target population during the etiologic time period. If B is the number of people at risk at the beginning of the period and all individuals are followed throughout the etiologic time period, the disease-frequency parameter

R = A/B

is the proportion getting disease over the period (incidence proportion, average risk). If B is the amount of person-time at risk during the period, R is the person-time incidence rate. If B is the number of people who do not get disease by the end of the period, R is the incidence odds.
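
As a concrete illustration of these three choices of denominator, the following sketch (with invented counts, not taken from any real study) computes the incidence proportion, person-time rate, and incidence odds from one hypothetical cohort experience.

```python
# Hypothetical cohort experience over one etiologic time period (invented numbers).
new_cases = 30          # A: new cases of the study disease in the target
people_at_start = 1000  # B for the incidence proportion: people at risk at the start
person_years = 4850     # B for the rate: person-time at risk during the period
non_cases = 970         # B for the odds: people who do not get disease by the end

incidence_proportion = new_cases / people_at_start  # R as average risk: 0.03
incidence_rate = new_cases / person_years           # R as rate: about 0.0062 per person-year
incidence_odds = new_cases / non_cases              # R as odds: about 0.031

print(incidence_proportion, incidence_rate, incidence_odds)
```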

Let target refer to the target population during the etiologic time period.

Causal effect
Consider one target population during one etiologic time period, but under two different exposure distributions, as illustrated below. Let the subscript 1 denote one exposure distribution, and let the subscript 0 denote the other. These distributions represent different possible mixtures of individual exposure conditions. With, say, smoking as the exposure, distribution 0 could represent conditions under which 20% of the target population regularly smoked cigarettes during a given time period, whereas distribution 1 could represent conditions under which 40% (instead of 20%) of the population did so during that period.


Then R1 = A1/B1 is the disease frequency that would occur if the target population experienced exposure distribution 1, and R0 = A0/B0 is the disease frequency that would occur if the same group of people during the same time period instead experienced exposure distribution 0.

Let a causal contrast be a contrast between R1 and R0. For example, we define the ratio causal contrast as

RRcausal = R1/R0,

where we allow RR to denote a risk ratio, rate ratio, or odds ratio. Similarly, we define the difference causal contrast as

RDcausal = R1 − R0.

Synonyms for causal contrast are effect measure and causal parameter.2

A causal contrast compares disease frequency under two exposure distributions, but in one target population during one etiologic time period. This type of contrast has two important consequences. First, the only possible reason for a difference between R1 and R0 is the exposure difference. A causal contrast, therefore, measures the causal effect of the difference between exposure distributions 1 and 0 in the target population during the etiologic time period.2,4–7,23

Second, a causal contrast cannot be observed directly, as we explain below, and therefore a different type of measure must be used as a substitute for it.

Counterfactuals
Why is it not possible to directly observe a causal contrast? Because at least one of the disease-frequency parameters needed for a causal contrast, R1 and R0, must be counterfactual and therefore unobservable. A parameter (such as a disease frequency) that describes events under actual conditions is said to be actual (or factual); in contrast, a parameter that describes events under a hypothetical alternative to actual conditions is said to be counterfactual.2,3,7,17 Counterfactual parameters cannot be observed because, by their very definition, they describe consequences of conditions that did not exist—they describe events following hypothetical alternatives to actual conditions, not actual conditions. The entire collection of outcome parameters for the target, actual and counterfactual (here, R1, R0 and all Ri under all other exposure conditions), is sometimes called the set of potential outcomes, to note that each is a possibility before the exposure distribution becomes fixed.11,23
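
A minimal sketch of the potential-outcomes idea, using entirely made-up data: each individual carries an outcome under exposure and an outcome under non-exposure, but only the potential outcome matching the exposure actually received can ever be observed, so RRcausal and RDcausal are computable only from this ‘complete’ (and in practice unknowable) table.

```python
# Hypothetical potential outcomes for a five-person target population:
# (outcome if exposed, outcome if unexposed); 1 = disease, 0 = no disease.
potential_outcomes = [
    (1, 1),  # 'doomed': diseased under either condition
    (1, 0),  # disease caused by exposure
    (0, 0),  # immune under either condition
    (1, 0),  # disease caused by exposure
    (0, 1),  # disease prevented by exposure
]

n = len(potential_outcomes)
r1 = sum(y1 for y1, y0 in potential_outcomes) / n  # R1: risk if everyone were exposed
r0 = sum(y0 for y1, y0 in potential_outcomes) / n  # R0: risk if no one were exposed

rr_causal = r1 / r0   # ratio causal contrast
rd_causal = r1 - r0   # difference causal contrast
print(r1, r0, round(rr_causal, 2), round(rd_causal, 2))  # 0.6, 0.4, 1.5, 0.2
```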

For a given R1 and R0, one and only one of the following three scenarios may occur (as illustrated below).

Scenario 1: R1 is an actual disease frequency; it occurs, and therefore it can be observed. R0, then, must be counterfactual; as a hypothetical alternative to R1 it does not occur, and therefore it cannot be observed.

Scenario 2: R0 is an actual disease frequency; it occurs, and therefore it can be observed. R1, then, must be counterfactual; as a hypothetical alternative to R0 it does not occur, and therefore it cannot be observed.

Scenario 3: Both R1 and R0 are counterfactual disease frequencies—both are hypothetical alternatives to the actual disease frequency that occurs under the actual exposure distribution (which is neither exposure distribution 1 nor 0), and therefore neither R1 nor R0 can occur and be observed.

Substitutes
A causal contrast requires two quantities, at least one of which must be counterfactual and therefore unobservable. How, then, can we estimate a causal contrast? By using substitutes (illustrated below) for the counterfactual disease frequencies in the target. As before, the subscript indicates the exposure distribution.


In a substitute under exposure distribution 1, let C1 be the numerator of a disease-frequency measure, and let D1 be the denominator (number of people or amount of person-time at risk). Likewise, in a substitute under exposure distribution 0, let E0 be the numerator, and let F0 be the denominator.

In epidemiological practice, a substitute will usually be a population other than the target population during the etiologic time period. It may be the target population observed at a time other than the etiologic time period, or a population other than the target population. In theory, however, a substitute can be any source of information about a counterfactual parameter.

Below we describe how these quantities are used to predict5–7 or impute11 the counterfactual quantities in the causal contrast of interest.

Target experiences exposure distribution 1
If the target experiences exposure distribution 1, R1 = A1/B1 occurs and therefore can be observed directly, but we must substitute E0/F0 for the counterfactual disease-frequency parameter R0 = A0/B0; hence, we must substitute the association measure

RRassociation = (A1/B1)/(E0/F0)

for the causal contrast RRcausal. That is, we substitute what we can observe (a contrast in two populations or two time periods) for what we would like to observe directly, but cannot (a contrast in one population and one time period). In the diagrams below, an arrow indicates a substitution of an actual frequency for a counterfactual one.


Target experiences exposure distribution 0
If the target experiences exposure distribution 0, R0 = A0/B0 occurs and therefore can be observed directly, but we must substitute C1/D1 for the counterfactual disease-frequency parameter R1 = A1/B1; hence, we must substitute the association measure

RRassociation = (C1/D1)/(A0/B0)

for the causal contrast RRcausal.


Target experiences neither exposure distribution 1 nor 0
If the target experiences neither exposure distribution 1 nor 0, we must substitute C1/D1 for the counterfactual R1 = A1/B1, and E0/F0 for the counterfactual R0 = A0/B0; hence, we substitute the association measure

RRassociation = (C1/D1)/(E0/F0)

for the causal contrast RRcausal.


In this scenario, the exposed substitute is often the exposed subset of the target population, and the unexposed substitute is often the unexposed subset of the target population. That is, C1, D1, E0, and F0 are often subsets of A1, B1, A0, and B0, respectively. They may, however, have no overlap at all with the target population; this is the case whenever we generalize from a study to an external target population (see below).
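
A small numerical sketch of scenario 3 (all quantities hypothetical): because A0, B0, A1 and B1 are counterfactual and unobservable, the analyst can only compute RRassociation and RDassociation from the substitutes C1/D1 and E0/F0, and must rely on those substitutions being adequate.

```python
# Hypothetical substitute populations (the only quantities actually observable).
c1, d1 = 24, 400    # numerator and denominator under exposure distribution 1
e0, f0 = 10, 500    # numerator and denominator under exposure distribution 0

rr_association = (c1 / d1) / (e0 / f0)   # ratio of substitute frequencies (3.0 here)
rd_association = (c1 / d1) - (e0 / f0)   # difference of substitute frequencies (0.04 here)

# The causal contrast RRcausal = (A1/B1)/(A0/B0) involves counterfactual
# frequencies in the target and cannot be computed from data alone;
# RRassociation is a valid substitute only if C1/D1 = A1/B1 and E0/F0 = A0/B0.
print(round(rr_association, 2), round(rd_association, 3))
```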

Implications for aetiologic studies

The counterfactual approach and the concept of a causal contrast have important implications for designing, analysing, and interpreting aetiologic studies.

Choice and interpretation of effect measure
Under the counterfactual approach, causal contrasts are the only meaningful effect measures for aetiologic studies. Note that many measures are not causal contrasts; for example, the following are not, because they cannot be expressed as contrasts of a target under two exposure distributions of intrinsic interest: correlation coefficients, percent of variance explained (R2), P-values, χ2 statistics, and standardized regression coefficients.24

Causal contrasts can be given precise interpretations. RRcausal can be interpreted as the net proportionate change in disease frequency caused by the difference in exposure distributions 1 and 0 in the target population during the aetiologic time period. RDcausal can be similarly interpreted as the net absolute change in disease frequency.2,5–7 Because a causal contrast is a contrast in a target—not in a substitute or a ‘study base’—it should be interpreted as a measure of causal effect in the target.

We caution that not all population causal contrasts can be interpreted as averages of individual causal effects of exposure, or as averages of effects on subpopulations. This limitation arises when the denominators (Bi) of the disease-frequency measures are affected by exposure, as when the Bi represent non-cases or person-years, so that the Ri represent odds or incidence rates.2,6,25 For example, if the Bi represent person-years, so that RRcausal and RDcausal are the causal rate ratio and rate difference, and exposure, age, and sex all affect the rate, then RRcausal will not equal the average rate ratio across age and sex groups and RDcausal will not equal the average rate difference across these groups. For explanations of such problems, see refs2,6,7,25.
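
The following sketch (hypothetical risks, equal-sized strata) illustrates this kind of problem for the odds ratio, one of the measures named above: the stratum-specific causal odds ratios are identical, yet the population causal odds ratio does not equal them, even with no confounding involved.

```python
# Hypothetical risks under exposure (r1) and non-exposure (r0) in two
# equal-sized strata of the same target population.
strata = {"stratum A": (0.8, 0.6), "stratum B": (0.4, 0.2)}

def odds_ratio(r1, r0):
    return (r1 / (1 - r1)) / (r0 / (1 - r0))

for name, (r1, r0) in strata.items():
    print(name, round(odds_ratio(r1, r0), 2))         # 2.67 in each stratum

# Population (marginal) risks with equal stratum sizes:
r1_pop = (0.8 + 0.4) / 2   # 0.6
r0_pop = (0.6 + 0.2) / 2   # 0.4
print("population", round(odds_ratio(r1_pop, r0_pop), 2))  # 2.25, not 2.67
```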

General framework for design and analysis of aetiologic studies
The counterfactual approach leads to a general framework for designing and analysing aetiologic studies. Because all aetiologic designs should estimate causal contrasts, different designs can be viewed simply as different ways of (1) choosing a target that corresponds to the study question, and (2) choosing substitutes and sampling subjects from target and substitutes into the study to balance tradeoffs among bias, variance, study costs and study time. This approach works for all etiologic studies.26 Beginning with Fisher and Neyman's work on permutation tests,16 careful study of counterfactual models has also led to the invention of new analysis methods and new study designs.11,27–29

The above framework applies to randomized trials as well as observational studies. In fact, it was invented for the analysis of randomized trials, and only later extended to non-experimental studies.3,4,16,17 A typical randomized trial is an example of the scenario discussed above in which the target experiences neither exposure distribution 1 nor 0. Here the treatment arms are substitutes for the target under different treatments. For example, when a drug is approved for treatment of a particular disease, a generalization is being made from the clinical trials on which the approval was based to some external (target) population of patients with that disease. In effect, the treatment and placebo arms in those trials serve as substitutes for the target under different treatment scenarios.

Definition of confounding and confounder
The concept of a causal contrast facilitates precise and general definitions of confounding and confounder. Confounding is present if our substitute imperfectly represents what our target would have been like under the counterfactual condition. An association measure is confounded (or biased due to confounding) for a causal contrast if it does not equal that causal contrast because of such an imperfect substitution.2,5–7,30

Under scenario 1, in which the target experiences exposure distribution 1, confounding occurs if E0/F0 ≠ A0/B0. The bias due to confounding in the ratio and difference associations may be measured by

RRassociation/RRcausal and RDassociation − RDcausal,

which for this scenario equal

(A0/B0)/(E0/F0) and (A0/B0) − (E0/F0).
Under scenario 2, in which the target experiences exposure distribution 0, confounding occurs if C1/D1 ≠ A1/B1. The bias due to confounding in the ratio and difference associations may be measured by

RRassociation/RRcausal = (C1/D1)/(A1/B1) and RDassociation − RDcausal = (C1/D1) − (A1/B1).
Finally, under scenario 3, confounding may occur if E0/F0 ≠ A0/B0 or C1/D1 ≠ A1/B1. The biases due to confounding in the ratio and difference associations are

RRassociation/RRcausal = [(C1/D1)/(A1/B1)] × [(A0/B0)/(E0/F0)] and RDassociation − RDcausal = [(C1/D1) − (A1/B1)] + [(A0/B0) − (E0/F0)],
which is just the product or sum of the confounding factors under scenarios 1 and 2. Thus, RRassociation will be biased for RRcausal unless the product of its two confounding factors is 1, and RDassociation will be biased for RDcausal unless the sum of its two bias factors is zero. Note that, if confounding is present, at least one (and usually both) of the measures will be biased.
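
Continuing the hypothetical scenario-3 style numbers, this sketch checks the decomposition stated above: the overall ratio bias factor is the product, and the overall difference bias factor the sum, of the scenario-1 and scenario-2 factors. The target quantities A1/B1 and A0/B0 are of course unknowable in practice; they are invented here purely to verify the algebra.

```python
# Hypothetical target (counterfactual, normally unknowable) and substitute frequencies.
a1_b1, a0_b0 = 0.05, 0.02   # A1/B1 and A0/B0 in the target
c1_d1, e0_f0 = 0.06, 0.025  # C1/D1 and E0/F0 in the substitutes

rr_causal = a1_b1 / a0_b0                  # 2.5
rr_assoc = c1_d1 / e0_f0                   # 2.4
ratio_bias = rr_assoc / rr_causal          # 0.96
# Equals the product of the scenario-specific ratio bias factors:
print(round(ratio_bias, 3), round((c1_d1 / a1_b1) * (a0_b0 / e0_f0), 3))   # both 0.96

rd_causal = a1_b1 - a0_b0                  # 0.03
rd_assoc = c1_d1 - e0_f0                   # 0.035
diff_bias = rd_assoc - rd_causal           # 0.005
# Equals the sum of the scenario-specific difference bias factors:
print(round(diff_bias, 3), round((c1_d1 - a1_b1) + (a0_b0 - e0_f0), 3))    # both 0.005
```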

Roughly speaking, a confounder is a variable that at least partly explains why confounding is present. Many authors attempt to define a confounder more precisely as a variable that is a risk factor for disease and is associated with exposure but not affected by exposure. This definition has several limitations. One is that it applies only to the classical condition in which there is just one variable to consider. That variable may be a compound of several variables, such as an age-sex-race stratification used for standardization or Mantel-Haenszel analysis. Often, however, we must consider several variables at once while keeping them distinct, as when some have been measured and others have not. In that case, the status of a variable as a confounder, as well as the degree and direction of confounding, can change drastically according to which variables are controlled.5,7,22,31,32 One consequence is that control of a variable that meets the above definition can at times introduce more confounding than it removes.5,7,22,32 This happens, for example, when there is little or no confounding to explain; in that case we may still find many variables that satisfy the above definition, but whose confounding effects have balanced out. When this happens, control of one but not the others can increase confounding; see ref.5 for an illustration.

More generally, the fundamental equalities that must be met to control confounding are E0/F0 = A0/B0 in scenario 1 and C1/D1 = A1/B1 in scenario 2; in scenario 3, both equalities are needed except in the special case discussed above. Both these ‘no-confounding’ equalities, however, represent summary relations, and place no constraints on particular covariates or their effects.5,7 Control of confounding thus depends on creating strata within which these equalities are satisfied, rather than on the particular variables used to create the strata.2,5,7,15,22,32 Methods to aid in identifying sufficient sets of variables for control have been developed using counterfactual and graphical causal models.7,15,22,32
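
A hypothetical numerical sketch of the ‘balanced confounding’ point above (scenario-1 notation, invented frequencies): within each stratum of a covariate the substitute frequency differs from the target's counterfactual frequency, yet the two agree exactly when pooled, so the summary substitution is unconfounded while stratifying on that covariate would leave each stratum-specific comparison confounded.

```python
# Hypothetical counterfactual target frequencies (A0, B0) and substitute
# frequencies (E0, F0), within two strata of a covariate.
target =     {"stratum A": (20, 100), "stratum B": (20, 100)}   # A0, B0
substitute = {"stratum A": (30, 100), "stratum B": (10, 100)}   # E0, F0

for s in target:
    a0, b0 = target[s]
    e0, f0 = substitute[s]
    print(s, a0 / b0, e0 / f0)   # 0.2 vs 0.3, and 0.2 vs 0.1: unequal within each stratum

# Pooled over strata, the no-confounding equality E0/F0 = A0/B0 holds exactly:
a0 = sum(a for a, b in target.values());     b0 = sum(b for a, b in target.values())
e0 = sum(e for e, f in substitute.values()); f0 = sum(f for e, f in substitute.values())
print(a0 / b0, e0 / f0)          # 0.2 and 0.2: the summary substitution is unconfounded
```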

Properties of effect-measure modifiers
The size of a causal effect for a given pair of exposure distributions can be different for different targets. To see this, let Pdoomed be the proportion of individuals in the target population who would get disease during the etiologic time period regardless of their exposure status (‘doomed’ with respect to the study exposure), Pcausative the proportion in the target population who would get disease during the etiologic time period if and only if exposed, and Ppreventive the proportion in the target population who would get disease during the etiologic time period if and only if not exposed. The proportion of individuals in the target who would get disease if exposed is Pdoomed + Pcausative; the proportion who would get disease if not exposed is Pdoomed + Ppreventive. Then, we can write a causal risk ratio as the ratio of these proportions:2,5

RRcausal = (Pdoomed + Pcausative)/(Pdoomed + Ppreventive).
This formula shows that the size of a causal risk ratio not only tends to vary with the proportion of individuals in the target population whose outcome is altered by exposure (who are counted in Pcausative and Ppreventive), but also tends to vary with the proportion of individuals in the target population for whom disease is inevitable by the end of the etiologic time period (who are counted in Pdoomed).2

The causal risk difference does not depend on Pdoomed.2,5 To see this, we can write a causal risk difference as follows:

RDcausal = (Pdoomed + Pcausative) − (Pdoomed + Ppreventive) = Pcausative − Ppreventive.

This formula shows that the size of the causal risk difference will tend to vary only with the proportion of individuals in the target population whose outcome is altered by exposure.

It follows that a factor that affects Pcausative or Ppreventive can modify the size of a ratio or difference effect measure, and can modify the size of a ratio effect measure even if it affects only Pdoomed.2,5 Thus, one should not be surprised if an effect measure varies from one population to another or from one time period to another unless one expects other causal factors to have similar distributions across the populations or periods.
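
A sketch with invented proportions showing the asymmetry just described: two target populations share the same Pcausative and Ppreventive but differ in Pdoomed, so their causal risk ratios differ while their causal risk differences coincide.

```python
def causal_measures(p_doomed, p_causative, p_preventive):
    """Causal risk ratio and risk difference implied by the response-type proportions."""
    r1 = p_doomed + p_causative    # risk if everyone in the target were exposed
    r0 = p_doomed + p_preventive   # risk if no one in the target were exposed
    return round(r1 / r0, 3), round(r1 - r0, 3)

# Hypothetical targets: same Pcausative and Ppreventive, different Pdoomed.
print(causal_measures(0.01, 0.04, 0.01))  # (2.5, 0.03): low background risk
print(causal_measures(0.10, 0.04, 0.01))  # (1.273, 0.03): higher Pdoomed, same RD
```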

Implications for consistency criteria and meta-analysis
Because variations in the distribution of other factors can easily produce variations in effect measures, the consistency of an association measure across populations should not be viewed as a necessary causal ‘criterion’. Conversely, if one expects other causal factors to have similar distributions across a set of populations, one should in particular expect consistency in the distribution of uncontrolled confounders across the populations and hence similar amounts of confounding in the association measures; thus, consistency (homogeneity) of an association measure does not in itself provide logical support for causality, even if the distribution of all other factors is consistent across the populations.

These deductions from the counterfactual formulation provide a logical basis for earlier reservations about the consistency criterion:

A pertinent question is on what grounds consistency is to be decided. To ask for the same risk ratios to recur under many diverse circumstances is to ask for homogeneity, which is certainly to ask too much.33

In other words, the consistency criterion has general applicability only as a qualitative criterion rather than a quantitative one, and then only on the (often reasonable) assumption that either Pcausative or Ppreventive is negligible. The same deduction adds force to arguments that meta-analyses are better conducted as a search for sources of systematic variation among study results, rather than as an exercise in estimating a fictional common effect.34,35

The amount of bias in effect measures should be quantified
We must always use measures of association as surrogates for causal measures. This gives rise to the question, ‘How different are measures of association from causal measures?’ In other words, how much bias is inherent in the measures of association that we estimate? In practice, these questions are usually answered informally—that is, it is a matter of ‘judgement’. The magnitude of bias, however, is a complicated function of many parameters, and informal evaluation may be inadequate.27,29,36–38 Many authors hence recommend that formal methods, such as sensitivity analysis27,29,36–38 and validation substudies,36 be used to quantify the magnitude of bias.

Formal evaluation of bias requires formulas that describe the magnitude of bias as a function of relevant parameters. The counterfactual approach can help here. For example, it can be used to show that in special cases the approximate expected value of a relative risk estimate equals the causal relative risk times a bias factor for confounding, times a bias factor for losses to follow-up, times a bias factor for subject sampling, times a bias factor for subject non-response, times a bias factor for subjects excluded from analysis, times a bias factor for information bias.39 This result can be used in a sensitivity analysis to evaluate bias under different plausible scenarios.
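
A minimal sensitivity-analysis sketch along these lines, with all bias-factor values invented for illustration: under the multiplicative approximation described above, the causal relative risk implied by an observed estimate can be recovered by dividing out the assumed bias factors, and the calculation repeated over plausible ranges of those factors.

```python
from itertools import product

observed_rr = 1.8   # hypothetical relative-risk estimate from a study

# Plausible ranges for two of the multiplicative bias factors (others set to 1 here).
confounding_factors = [0.8, 1.0, 1.25]
selection_factors = [0.9, 1.0, 1.1]

# Under the approximation E[estimate] ≈ RRcausal × (product of bias factors),
# each assumed scenario implies a different causal relative risk.
for bf_conf, bf_sel in product(confounding_factors, selection_factors):
    implied_rr_causal = observed_rr / (bf_conf * bf_sel)
    print(f"confounding={bf_conf}, selection={bf_sel}: RRcausal ≈ {implied_rr_causal:.2f}")
```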

Discussion

By their very definition, counterfactuals cannot be observed. Some people find this property disconcerting and reject counterfactuals as a foundation for causal inference, even though they may use statistical methods that require hypothetical (and hence unobserved) study repetitions for proper interpretation. One reason for their discomfort is that the counterfactual definition of effect seems to contradict the common-sense notion that we can observe effects. This seeming contradiction arises because of the unfortunate tendency to use the word ‘effect’ for different concepts. Sometimes ‘effect’ refers to an observed (actual) outcome event, such as ‘John Smith's lung cancer was an effect of his smoking’. Often, however, ‘effect’ refers to an effect measure such as RRcausal, which has at least one counterfactual (and hence unobservable) component. Although we observe the effects of a cause, we can only infer the cause of an effect, because our inferences will always depend on substitutions that may be called into question.

Causal inference is possible because we can make logically sound conditional inferences about counterfactuals, despite the fact that we do not observe them. Indeed, following earlier writings2–7,11,16,17,23,27–29 we have shown how basic problems of causal inference can be made logically precise (and hence subject to logical analysis) by translating them into problems of inference about counterfactuals. Two other well-developed systems of reasoning about cause and effect, structural-equations models and causal diagrams, turn out to yield results equivalent to those obtained using counterfactuals.15,22,32 This equivalence points to a basic unity among logically sound methods for causal inference.

The physicist Richard Feynman considered science to be ‘confusion and doubt, ... a march through fog’.40,p.380 As it does in physics,41,42 counterfactual analysis can cut through some of the ‘fog’ in epidemiology, for it leads to a general framework for designing, analysing, and interpreting etiologic studies. It has already led to a number of analysis innovations,11,16,27–29,32 and we have found it an excellent teaching tool. We hope that this paper will prove useful in enabling epidemiologists to view problems from the counterfactual perspective.

Acknowledgments

This publication was made possible by support from grant number NIH/1R29-ES07986 from the National Institute of Environmental Health Sciences (NIEHS), NIH. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the NIEHS, NIH. We are grateful to Timothy Church, Aaron Cohen, Bud Gerstman, Jay Kaufman, Stephan Lanes, Malcolm Maclure, Wendy McKelvey, Mark Parascandola, Judea Pearl, Carl Phillips, Charles Poole, Eyal Shahar, and the referees for their helpful comments on earlier drafts of this manuscript.

References

1 Kelsey JL, Whittemore AS, Evans AS, Thompson WD. Methods in Observational Epidemiology. New York: Oxford University Press, 1996.

2 Greenland S, Rothman KJ. Chapter 4: Measures of effect and measures of association. In: Rothman KJ, Greenland S (eds). Modern Epidemiology. 2nd Edn. Philadelphia: Lippincott-Raven, 1998.

3 Lewis DK. Causation. J Philos 1973;70:556–67.

4 Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psych 1974;66:688–701.

5 Greenland S, Robins JM. Identifiability, exchangeability, and epidemiological confounding. Int J Epidemiol 1986;15:413–19.

6 Greenland S. Interpretation and choice of effect measures in epidemiologic analyses. Am J Epidemiol 1987;125:761–68.

7 Greenland S, Robins JM, Pearl J. Confounding and collapsibility in causal inference. Stat Sci 1999;14:29–46.

8 Hume D. An Enquiry Concerning Human Understanding. LaSalle: Open Court Press, 1748, p.115.

9 Neyman J (1923). Sur les applications de la théorie des probabilités aux expériences agricoles: Essai des principes. [English translation of excerpts by D. Dabrowska and T. Speed]. Statist Sci 1990;5:463–72.

10 Copas JB. Randomization models for matched and unmatched 2 × 2 tables. Biometrika 1973;60:467–76.

11 Rubin D. Bayesian inference for causal effects: the role of randomization. Ann Stat 1978;6:34–58.

12 Dawid AP. Causal inference without counterfactuals (with discussion). J Am Statist Assoc 2000;95:407–48.

13 Fisher RA. The Design of Experiments. Edinburgh: Oliver and Boyd, 1935.

14 Sobel ME. Causal inference in the social and behavioral sciences. In: Arminger G, Clogg CC, Sobel ME (eds). Handbook of Statistical Modeling for the Social and Behavioral Sciences. New York: Plenum Press, 1995.

15 Pearl J. Causality. New York: Springer, 2000.

16 Rubin DB. Comment: Neyman (1923) and causal inference in experiments and observational studies. Statist Sci 1990;5:472–80.

17 Simon HA, Rescher N. Cause and counterfactual. Philos Science 1966; 33:323–40.

18 Stalnaker RC. A theory of conditionals. In: Rescher N (ed.). Studies in Logical Theory. Oxford: Blackwell, 1968.

19 Lewis D. Counterfactuals. Oxford: Blackwell, 1973.

20 Harper WL, Stalnaker RC, Pearce G. Ifs. Dordrecht: Reidel, 1981.

21 Greenland S. Causal analysis in the health sciences. J Am Statist Assoc 2000;95:286–89.

22 Greenland S, Pearl J, Robins JM. Causal diagrams for epidemiologic research. Epidemiology 1999;10:37–48.

23 Rubin DB. Practical implications of modes of statistical inference for causal effects and the critical role of the assignment mechanism. Biometrics 1991;47:1213–34.

24 Greenland S, Maclure M, Schlesselman JJ, Poole C, Morgenstern H. Standardized regression coefficients: a further critique and a review of alternatives. Epidemiology 1991;2:387–92.

25 Greenland S. Absence of confounding does not correspond to collapsibility of the rate ratio or rate difference. Epidemiology 1996;7:498–501.

26 Maldonado G, Greenland S. The causal-contrast study design (abstract). Am J Epidemiol 2000;151:S39.

27 Rosenbaum PR. Observational Studies. New York: Springer-Verlag, 1995.

28 Robins JM. Causal inference from complex longitudinal data. In: Berkane M (ed.). Latent Variable Modeling with Applications to Causality. New York: Springer, 1997, pp.69–117.

29 Robins JM, Rotnitzky A, Scharfstein DO. Sensitivity analysis for selection bias and unmeasured confounding in missing data and causal inference models. In: Halloran E (ed.). Statistical Models in Epidemiology. New York: Springer, 1999.

30 Greenland S, Morgenstern H. Confounding in health research. Ann Rev Public Health 2001;22:189–212.

31 Fisher L, Patil K. Matching and unrelatedness. Am J Epidemiol 1974;100:347–49.

32 Pearl J. Causal diagrams for empirical research (with discussion). Biometrika 1995;82:669–710.

33 Susser M. Falsification, verification and causal inference in epidemiology. In: Rothman KJ (ed.). Causal Inference. Chestnut Hill, MA: Epidemiology Resources, 1988, pp.46.

34 Greenland S. A critical look at some popular meta-analytic methods. Am J Epidemiol 1994;140:290–96.

35 Poole C, Greenland S. Random effects meta-analyses are not always conservative. Am J Epidemiol 1999;150:469–75.

36 Greenland S. Basic methods for sensitivity analysis and external adjustment. In: Rothman KJ, Greenland S (eds). Modern Epidemiology, 2nd Edn. Philadelphia: Lippincott-Raven, 1998, pp.343–57.

37 Maldonado G. Informal evaluation of bias may be inadequate (abstract). Am J Epidemiol 1998;147:S82.

38 Leamer EE. Sensitivity analyses would help. Am Econ Rev 1985;75: 308–13.

39 Maclure M, Schneeweiss S. The confounding product (abstract). Am J Epidemiol 1997;145:S55.

40 Gleick J. Genius. The Life and Science of Richard Feynman. New York: Vintage Books, 1992.

41 Penrose R. Shadows of the Mind. New York: Oxford, 1994, Chapters 5 and 6.

42 Price H. Time's Arrow and Archimedes' Point. New York: Oxford, 1996, Chapters 6 and 7.