We thank Kaufman and Kaufman (K&K),1 Dawid,2 Elwert and Winship,3 and Shafer4 for their commentaries on our paper 'Estimating causal effects'.5 Here we hope to separate misunderstandings from substantial disagreements; we believe the latter arise only in the comments of Dawid2 and Shafer4 (and are described in refs 6–13).
Misunderstandings
According to K&K, 'The authors organize their presentation around an aggregate model, rather than the individual causal model that dominates elsewhere.' This is not entirely true. We organized our presentation around the target population as specified by the study question. Our target population could comprise one person, many people, or any collection of interest; our model of effects is therefore aggregate when the study question asks about a population of aggregates (e.g. all counties in California), but it is a model for effects on individuals when the study question asks about a group of individuals.
Kaufman and Kaufman then say, 'Following the authors, we consider the simplified case of only two levels of exposure, "exposed" and "not exposed".' But we did not use this simplification. We wrote that a causal contrast compares outcomes under two exposure distributions that represent different possible mixtures of individual exposure conditions. We did not imply that only two exposure distributions are possible. On the contrary, our conceptualization allows for any exposure distribution (all possible combinations of exposure timings, exposure metrics, people in the target, and exposure levels).
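To make this generality concrete (the notation here is ours, added for illustration; it does not appear in the original paper), let \(\theta(F)\) denote the value of an outcome measure (e.g. an incidence proportion) that the target population would have if it experienced exposure distribution \(F\), where \(F\) assigns each individual an exposure level, timing, and metric. A causal contrast is then a comparison

\[ \theta(F_1) \text{ versus } \theta(F_0) \]

for any two exposure distributions \(F_1\) and \(F_0\) of interest; the two-level case K&K describe is merely the special case in which \(F_1\) assigns every individual the exposed condition and \(F_0\) assigns every individual the unexposed condition.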
Kaufman and Kaufman also say, 'Another important source of variability is the sampling of the study population from the target population. The authors clearly subsume this under the rubric of confounding.' On the contrary, we do not assume any sampling from the target, nor would we subsume such sampling under confounding. Instead, we wrote, 'Confounding is present if our substitute imperfectly represents what our target would have been like under the counterfactual condition.' Thus, in our conceptualization, confounding results from an imperfect choice of substitute, which could be, but need not be, a result of sampling from the target to form the study population. For example, in an occupational study, workers at plant 1 might be our target population, and plant 2 might be used as a substitute for the counterfactual experience of the workers at plant 1. Here, the study population consists of everyone in the target plus workers at plant 2. If the experience of the workers at plant 2 is not a good substitute for the experience of plant 1 workers under the counterfactual exposure distribution, then the resulting effect estimates will be confounded.
The study population need not be sampled from the target population at all. Consider our scenario 3, in which the target experiences neither exposure distribution 1 nor 0. Here the study population would consist of two substitutes, because the target did not experience either of the exposure distributions we want to compare. Neither substitute is required to include any members of the target. In theory, the only requirement for a valid causal contrast is that a substitute is a good substitute for the counterfactual experience of the target. It need not also be a good substitute for the actual experience of the target; this is a stronger condition than necessary. Thus, we would strike the word 'actual' in K&K's parenthetical statement '(actual and counterfactual)'.
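In the illustrative notation introduced above (again ours, not the original's), write \(\theta_T(F_0)\) for the outcome measure the target T would have had under the counterfactual distribution \(F_0\), and \(\theta_S(F_0)\) for the measure in the substitute S, which did experience \(F_0\). Confounding is present when

\[ \theta_S(F_0) \neq \theta_T(F_0). \]

Nothing in this condition requires S to resemble the target's actual experience \(\theta_T(F_1)\), and nothing requires S to contain, or to be sampled from, members of T.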
Three of the commentators1–3 misinterpreted our examples of exposure distributions, in which 20% or 40% of the target population regularly smoked cigarettes during a given time period. We did not intend to imply that one does not know individual exposures. In our conceptualization, individuals experience exposures (treatments), and a group of individuals has a distribution that describes each individual's exposure. Therefore, in a study with data on individual exposures, neither Dawid's uniformity assumption nor K&K's assumption of random allocation of exposure is necessary for defining or estimating effects. This point may be seen in definitions of generalized population attributable fractions, in which exposure effects are allowed to vary across covariates (and hence across individuals).14 The assumptions imposed by K&K and Dawid are indeed made by conventional statistical procedures for effect estimation, but we avoid them because they have no justification in typical observational studies.15
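As a sketch of the kind of definition we have in mind (the general form follows ref. 14, but the symbols here are ours), a generalized attributable fraction for a target divided into covariate strata z can be written

\[ \mathrm{AF} = \frac{\sum_z N_z\,[R_z(F_1) - R_z(F_0)]}{\sum_z N_z\,R_z(F_1)}, \]

where \(N_z\) is the number of individuals in stratum z and \(R_z(F)\) is the stratum's risk under exposure distribution \(F\). Because the effect \(R_z(F_1) - R_z(F_0)\) may differ across strata, neither uniformity of effect nor random allocation of exposure is assumed.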
Elwert and Winship3 state, 'Instead of applying a particular treatment to an individual, MG apply a distribution of treatments (the exposure distribution) to a population', and from that they incorrectly concluded that our basic unit of analysis is the population rather than the individual. In reality, the average causal effect (ACE) of the potential-outcomes model is a special case of our causal-contrast measure. Both measures are derived from the outcomes of individuals under different exposures or treatments. In our conceptualization, individuals experience the causal effect of the difference in exposure levels being compared, and the individual outcomes are modelled, even when the average causal effect on a group of individuals is of ultimate interest. Contrary to what Elwert and Winship3 thought, we do employ the mapping of distinct exposures from the exposure distribution onto individuals within a target population. This mapping is known in most aetiological studies, because exposure information is typically collected on individuals (although not in ecologic studies). We suspect that our simplified notation obscured this important point. Of course, the individuals in our model may be aggregates, such as counties or states, but if so the treatments and outcomes must then be variables defined unambiguously at the aggregate (macro) level (such as laws, expenditures, and mortality rates).
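To display the special-case relation (in potential-outcome notation that is ours, not Elwert and Winship's or the original paper's), let \(Y_i(x)\) be the outcome individual i would have under exposure level x in a target of size N. Then

\[ \mathrm{ACE} = \frac{1}{N}\sum_{i=1}^{N} [\,Y_i(1) - Y_i(0)\,], \]

which is the causal contrast \(\theta(F_1)\) versus \(\theta(F_0)\) in which \(\theta\) is the average outcome, \(F_1\) assigns every individual level 1, and \(F_0\) assigns every individual level 0. More general contrasts simply allow \(F_1\) and \(F_0\) to assign different individuals different levels.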
Elwert and Winship3 also write that 'MG depart in important respects from the standard model of counterfactual causal inference'; this is true, although we do not depart from it in the way that Elwert and Winship describe. Perhaps most importantly, we are interested in the causal effect in a target population (the group of individuals about whom our scientific question asks), not in the study population. The two populations are not necessarily the same: the study population may include individuals who are not in the target population but are being used as substitutes for the counterfactual experience of the target (e.g. our scenarios 1 and 2); it may even include no individual from the target population (e.g. our scenario 3). The distinction is important because (1) the size of a causal-effect measure is not a biological constant, as it may vary with the composition of the target population, and (2) the target population may not be available for study (e.g. the people enrolled in a randomized trial may not be the target population of public health or medical interest). Thus a large, well-conducted randomized trial would usually not provide an unbiased estimator of a causal-effect measure unless the people enrolled in the trial are representative of the target population; this condition is rarely stated in presentations of the potential-outcomes model.
Dawid2 questions the meaning of deterministic disease occurrence, which we used only for simplicity. Kaufman and Kaufman explain what we mean: individual potential responses are 'fixed rather than random quantities'. The 'doomed' type, for example, represents individuals who would always get the study disease if the study were hypothetically repeated, fixing the entire history of that individual except for exposure and factors affected by exposure. Using Dawid's example of a penny toss, a two-tailed penny will always land tails up. Stochastic counterfactuals are discussed in detail elsewhere.16–18 Dawid also states, 'Things become much murkier if no uniformity assumption can be made, since then no perfect substitute exists.' This statement is just wrong: because of averaging, a substitute may perfectly represent a counterfactual outcome of the target even if there is no uniformity in either group.18 The practical problem is that we are usually unable to identify a perfect substitute with certainty.
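A two-person example (ours, purely for illustration) shows why averaging suffices: suppose the target's potential outcomes under non-exposure are \(Y_1(0)=1\) and \(Y_2(0)=0\), while an unexposed substitute's outcomes are \(Y_3(0)=0\) and \(Y_4(0)=1\). Neither group is uniform, yet the substitute's average risk,

\[ \frac{0+1}{2} = \frac{1}{2} = \frac{1+0}{2}, \]

exactly equals the target's counterfactual average risk, so any effect measure built from these averages would be unconfounded.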
Disagreements
Regarding Dawid's objections to counterfactual causal inference, we refer readers to section 1.4.4 of Pearl19 and the commentaries on 'Causal inference without counterfactuals',6 most of which embrace counterfactual models and address his objections in detail7–12 (the exception being Shafer13). Dawid objects to counterfactual events because (by definition) they do not occur and so cannot be observed (although he concedes a role for them in formulating causal models).6 We and others9–12 maintain that this property of counterfactuals leads to insights and reveals assumptions that are hidden by other models. For example, a P-value (the probability of observing a statistic as large as observed or larger) is defined in terms of counterfactuals (in the 'or larger'), which implies that one should reject use of P-values if one rejects inferences based on non-occurring events.
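Spelled out in symbols (ours, added for emphasis): with \(T\) the test statistic and \(t_{\mathrm{obs}}\) its observed value, the P-value is

\[ P = \Pr(T \geq t_{\mathrm{obs}} \mid H_0), \]

a sum over all the values \(t > t_{\mathrm{obs}}\) that did not occur. The computation thus depends on non-occurring (counterfactual) events just as surely as any potential-outcomes quantity does.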
Shafer4 offers predictive causality as an obvious alternative to counterfactual theories. While predictive theories are interesting, we find them as yet too limiting to supplant counterfactual theories, and quite obscure for teaching purposes. Treatments of predictive causality we have seen have dodged the thorny problem of defining causality by relying on circularities4,13 (which usually go unnoticed), on a metaphysical notion of covariate sufficiency6 (which becomes a derived concept in other theories18,19), or on hidden potential outcomes.13 Shafer4 indulges in the circularity when he defines weak causation by saying 'A is a probabilistic cause of B if it raises the probability of B'. What does it mean for A to raise a probability if not to cause an increase? Probabilistic counterfactuals16–18 provide the only non-circular answer we know of, notwithstanding Shafer's inability to understand them4 or even acknowledge their existence.13 Shafer's definition of strong causality,4 that 'A causes B in a strong sense if we can predict, using a method of prediction that proves consistently correct, that B will happen if we do A and will not happen if we do not do A', tacks a subjective observer (we who predict) onto an objective definition of causation identical to that based on the do (or set) operator of potential-outcome models.19 This does not strike us as an advantage of Shafer's theory13 over counterfactual theories.
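The non-circular answer can be displayed with the do (set) operator (our rendering, in Pearl's notation19): 'A raises the probability of B' becomes

\[ \Pr(B \mid \mathrm{do}(A)) > \Pr(B \mid \mathrm{do}(\text{not-}A)), \]

in which both probabilities refer to outcome distributions under interventions. Probability-raising is thereby defined without presupposing the notion of causing an increase.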
Shafer4 nicely sums up another reason why predictive causality has thus far failed to attract the usage that counterfactual theories have: 'The very word [counterfactual] places us in the situation where A has already been performed and so not(A) is counter to the facts. The counterfactual theory insists that there should be a well-defined answer ... to the question of what would have happened if A had not been performed. The predictive theory, on the other hand, considers only what can be predicted before the choice between A and not(A) is made.' The latter limitation means that Shafer's restrictive version of predictive causality declines to face directly the subjunctive causal questions of deep concern to individuals and society. Those questions are explicitly cast in the counterfactual 'but for' form put to American juries, such as: but for the action of tobacco companies in promoting the use of cigarettes, would the state of Minnesota have borne health-care costs as high as it did? The only substance we see in Shafer's criticism of such questions13 is addressed by prefixing 'health-care costs' with 'expected'.
No one doubts the difficulty of answering such questions, but we dispute Shafer's attempt to address these difficulties by denying meaning to the question.4,13 The philosophy espoused by Shafer4 as well as Dawid6 strikes us and others8–10 as a form of logical positivism (misattributed to Popper6,8) that attempts to hobble science by a fiat of restriction to questions that admit tidy solutions. All else is condemned as untestable, metaphysical, or silly, and therefore not scientific,6,13 without regard to the importance of the question; witness Dawid's2 claim that a causal relative risk is 'simply not an appropriate subject for discourse'! In contrast, counterfactual theories allow one to examine such questions logically, and make clear exactly where precise estimates cannot be attained without detailed mechanistic knowledge;17 they thus show why answers to certain causal questions must remain conjectural, and how to shape those conjectures to be consistent with background information (including results from predictive research).3,17 They also help us shape questions and answers to remove such ambiguity as can be removed,3,5,17 without introducing the distortions and oversimplifications that seem to attend extreme positivist approaches.13 If we do not avail ourselves of these advantages, special interests will still exploit them expertly.20
While we welcome other coherent theories of causality (such as decision-analytic6 and graphical theories19), they have so far failed to yield a broad set of widely tested statistical methods comparable to that of the potential-outcomes model of counterfactuals (invented by Neyman21 in the early 1920s, yet often misattributed to Rubin,3 though not by Rubin himself22). Analysts employ this model to good effect whenever they apply a permutation test (such as Fisher's exact test) to randomized-trial data,15,23 and the vast work by Rubin, Robins, and Rosenbaum has extended the model and methods to observational studies.11,22,24 Shafer points out correctly that Lewis's counterfactual theory is not equivalent to this model, but fails to point out that Lewis's theory (in which counterfactuals are taken as actual events in closest possible worlds) is far more metaphysical than Neyman's model, and lends absolutely no support to Shafer's theory.
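As an illustration of the permutation-test connection just mentioned (a minimal sketch of our own, not code from any cited work): under the sharp null hypothesis that exposure has no effect on any individual, every participant's two potential outcomes equal the single observed outcome, so re-randomizing the treatment labels generates the exact null distribution of any test statistic. The following Python sketch implements a Monte Carlo version of that randomization test; the data and function name are hypothetical.

```python
import random

def permutation_test(treated, control, n_perm=10000, seed=0):
    """Monte Carlo permutation (randomization) test of the sharp null
    hypothesis of no effect on any individual.

    treated, control: observed outcomes in each arm of a randomized trial.
    Returns a one-sided P-value for the difference in mean outcomes.
    """
    rng = random.Random(seed)
    pooled = list(treated) + list(control)
    n_t = len(treated)
    # Observed statistic: difference in mean outcomes between arms.
    obs = sum(treated) / n_t - sum(control) / len(control)

    count = 0  # re-randomizations giving a statistic >= the observed one
    for _ in range(n_perm):
        # Under the sharp null the outcomes are fixed; only the
        # treatment labels are re-assigned at random.
        rng.shuffle(pooled)
        stat = (sum(pooled[:n_t]) / n_t
                - sum(pooled[n_t:]) / (len(pooled) - n_t))
        if stat >= obs:
            count += 1
    # +1 counts the observed assignment itself among the re-randomizations.
    return (count + 1) / (n_perm + 1)

# Hypothetical binary outcomes from a small two-arm trial
print(permutation_test([1, 1, 1, 0, 1], [0, 1, 0, 0, 0]))
```

With binary outcomes and the number of exposed cases as the statistic, the fully enumerated version of this randomization distribution is hypergeometric, which yields Fisher's exact test.15,23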
Shafer4 also takes us to task for juxtaposing Feynman's name with citations of physicists who endorse counterfactuals.5 Indeed, Feynman did not advocate counterfactuals because, to the best of our knowledge, he never discussed them. As for whether many physicists oppose them,4 the truth will be unknown until there is a more thorough poll of physicists than either we or Shafer (we each cite one) have mustered. Although we doubt whether epidemiologists should base any decision on the poll's outcome, we note that Shafer's citation4 (like Shafer13) fails even to consider probabilistic counterfactuals.
References
1 Kaufman JS, Kaufman S. Commentary: Estimating causal effects. Int J Epidemiol 2002;31:431–32.
2 Dawid AP. Commentary: Counterfactuals: help or hindrance? Int J Epidemiol 2002;31:429–30.
3 Elwert F, Winship C. Commentary: Population versus individual level causal effects. Int J Epidemiol 2002;31:432–34.
4 Shafer G. Commentary: Estimating causal effects. Int J Epidemiol 2002;31:434–35.
5 Maldonado G, Greenland S. Estimating causal effects. Int J Epidemiol 2002;31:422–29.
6 Dawid AP. Causal inference without counterfactuals (with discussion). J Am Statist Assoc 2000;95:407–48.
7 Cox DR. Comment. J Am Statist Assoc 2000;95:424–25.
8 Casella G, Schwartz SP. Comment. J Am Statist Assoc 2000;95:425–27.
9 Pearl J. Comment. J Am Statist Assoc 2000;95:428–31.
10 Robins JM, Greenland S. Comment. J Am Statist Assoc 2000;95:431–35.
11 Rubin DB. Comment. J Am Statist Assoc 2000;95:435–38.
12 Wasserman L. Comment. J Am Statist Assoc 2000;95:442–43.
13 Shafer G. Comment. J Am Statist Assoc 2000;95:438–42.
14 Greenland S, Drescher K. Maximum likelihood estimation of attributable fractions from logistic models. Biometrics 1993;49:865–72.
15 Greenland S. Randomization, statistics, and causal inference. Epidemiology 1990;1:421–29.
16 Greenland S. Interpretation and choice of effect measures in epidemiologic analyses. Am J Epidemiol 1987;125:761–68.
17 Robins JM, Greenland S. The probability of causation under a stochastic model for individual risk. Biometrics 1989;45:1125–38.
18 Greenland S, Robins JM, Pearl J. Confounding and collapsibility in causal inference. Statist Sci 1999;14:29–46.
19 Pearl J. Causality. New York: Cambridge University Press, 2000.
20 Rubin DB. Estimating the causal effects of smoking. Stat Med 2001;20:1395–414.
21 Neyman J. Sur les applications de la théorie des probabilités aux expériences agricoles: Essai des principes. Original 1923; English translation of excerpts by Dabrowska D and Speed T. Statist Sci 1990;5:465–72.
22 Rubin DB. Comment: Neyman (1923) and causal inference in experiments and observational studies. Statist Sci 1990;5:472–80.
23 Greenland S. On the logical justification of conditional tests for two-by-two contingency tables. Am Stat 1991;45:248–51.
24 Greenland S. Causal analysis in the health sciences. J Am Statist Assoc 2000;95:286–89.