1 National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, MD.
2 Center for Weight and Health, University of California at Berkeley, Berkeley, CA.
3 National Cancer Institute, National Institutes of Health, Bethesda, MD.
4 Division of Diabetes Translation, Centers for Disease Control and Prevention, Atlanta, GA.
Received for publication March 14, 2003; accepted for publication March 12, 2004.
ABSTRACT
anthropometry; body mass index; body weight; cause of death; epidemiologic methods; risk; statistics; vital statistics
Abbreviations: BMI, body mass index; PAF, population attributable fraction.
INTRODUCTION
McGinnis and Foege (5) subsequently clarified that their estimate of 300,000 deaths per year referred to all aspects of diet and physical activity and not specifically to obesity. Nonetheless, this estimate has often been interpreted as representing the number of deaths caused by obesity (1, 2) and is also often cited to motivate increased efforts to treat and control obesity. For example, the total number of deaths attributable to obesity has been invoked in connection with new drug approvals (6).
In 1999, Allison et al. (7) used data from six large prospective epidemiologic cohorts to estimate that approximately 280,000 deaths per year could be attributed to obesity and overweight in the United States. For each study cohort, these authors estimated a hazard ratio for each body mass index (BMI; weight (kg)/height (m)²) category, adjusted for age, sex, and smoking. They then combined those hazard ratios with estimates of the prevalence of BMI categories and with vital statistics data to estimate the number of deaths attributable to overweight and obesity.
Allison et al. (7) used proportional hazards models to estimate hazard ratios adjusted for age, sex, and smoking status, stating that "[i]nteractions of age and sex with BMI terms were not included because of our interest in estimating the average effect of overweight or obesity across both sexes and all adult ages" (7, p. 1533). Their goal was "to estimate total societal obesity burden in terms of mortality" (7, p. 1533). They argued that this approach did "take into account differential effects of obesity by age and sex despite no corresponding interaction terms, simply by including both sexes and a cross section of ages in the derivation samples" (7, p. 1533). Considerable evidence suggests that the effects of obesity on mortality differ strongly by age (8–12).
The method used by Allison et al. (7) did not allow for effect modification and only partially adjusted for confounding. Use of this approach is contraindicated by several published statistical papers, which show that such an approach can lead to bias (13, 14).
Few studies, however, have estimated the magnitude of the bias that can occur in specific situations when inappropriate methods are used to calculate attributable fractions (13). In this paper, we investigate the possible magnitude and direction of the bias in estimates of deaths attributable to obesity when confounding and effect modification are not adequately accounted for.
METHODS
$$\mathrm{PAF} = \frac{P_e(\mathrm{RR} - 1)}{1 + P_e(\mathrm{RR} - 1)}, \tag{1}$$
where $P_e$ is the proportion of the population exposed to the factor (in this case, obesity), and RR is the unadjusted relative risk of mortality associated with obesity. This equation (equation 1) is appropriate when there is no confounding. A modified version of this equation (equation 2) for use when there are multiple categories ($i = 0, 1, \ldots, k$) of the exposure variable is
$$\mathrm{PAF} = \frac{\sum_{i=0}^{k} p_i(\mathrm{RR}_i - 1)}{1 + \sum_{i=0}^{k} p_i(\mathrm{RR}_i - 1)}, \tag{2}$$
where $i$ refers to the $i$th exposure category, $p_i$ is the proportion of the population in the $i$th exposure category, $\mathrm{RR}_i$ is the relative risk comparing the $i$th exposure category with the reference group ($i = 0$), and both summations are over $i = 0$ to $k$. Because of the distributive property of the PAF, this equation can also be used to calculate a level-specific attributable fraction for any given level of exposure, relative to a fixed reference category, by including only that level of exposure in the numerator. The equation used by Allison et al. (7) to calculate excess deaths is algebraically identical to equation 2 multiplied by the total number of deaths, although their equation is more complex and uses two additional parameters. To calculate deaths attributable to overweight and obesity, the attributable fraction and attributable deaths can be calculated for each level of overweight and obesity and the results added together. Strictly speaking, to calculate the total PAF, deaths attributable to the lowest BMI category (BMI <23) should also be included in the total to obtain all deaths attributable to not being in the reference category (BMI 23–25). However, equation 2 is still appropriate to calculate deaths attributable to overweight and obesity relative to the reference category.
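As a minimal sketch of equations 1 and 2, the Python functions below compute a single-exposure PAF, a multilevel PAF, and the level-specific attributable fractions. The function names, the three-category layout, and every numeric input are hypothetical illustrations; none of the values are taken from the paper's tables.

```python
def paf_single(p_exposed, rr):
    """Equation 1: PAF for one exposure level with an unadjusted relative risk."""
    excess = p_exposed * (rr - 1.0)
    return excess / (1.0 + excess)


def paf_multilevel(prevalences, relative_risks):
    """Equation 2: return (total PAF, list of level-specific fractions).

    Index 0 is the reference category, so relative_risks[0] should be 1.0.
    """
    excess = [p * (rr - 1.0) for p, rr in zip(prevalences, relative_risks)]
    denominator = 1.0 + sum(excess)
    # Each level keeps only its own term in the numerator.
    by_level = [e / denominator for e in excess]
    return sum(by_level), by_level


# Hypothetical inputs: reference, overweight, and obese categories.
prevalences = [0.45, 0.35, 0.20]
relative_risks = [1.0, 1.2, 1.5]
total_paf, by_level = paf_multilevel(prevalences, relative_risks)

# Attributable deaths are the PAF times a (hypothetical) total number of deaths.
attributable_deaths = total_paf * 1_000_000
```

Because every level shares the same denominator, the level-specific fractions add up to the total PAF, which is the distributive property described above.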
Methods of estimating attributable fractions when there is confounding or interaction
In the article by Allison et al. (7), relative risk estimates, adjusted for age, sex, and smoking, for the risk of mortality associated with obesity were derived from six available epidemiologic cohorts. To estimate the number of deaths due to obesity, these relative risk estimates were then combined with data from the target population (the 1991 US population) on the prevalence of obesity and the number of deaths. When there are subgroups within the population (e.g., age and sex subgroups), there are several different ways to combine relative risks estimated from a derivation cohort with the prevalence of obesity and number of deaths in a target population.
In one method, which we call the "partially adjusted" method and which is essentially the method used by Allison et al. (7), a single relative risk, adjusted for subgroup membership (e.g., adjusted for age and sex subgroup), is calculated from the derivation cohort and used in equation 1, along with the prevalence of exposure from the population, to calculate a single attributable fraction. The number of deaths attributable to obesity in the target population is then calculated by multiplying the total number of deaths by the attributable fraction. This method is in general biased when there is confounding of the exposure-disease association, because the attributable fraction formula it uses is appropriate for unadjusted relative risks only. We refer to this method as "partially adjusted" because the attributable fraction is calculated by using adjusted relative risks but with an equation appropriate for unadjusted risks only.
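The partially adjusted calculation can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code used by any of the cited authors: the function name and all three inputs are hypothetical.

```python
def partially_adjusted_deaths(adjusted_rr, prevalence, total_deaths):
    """Plug one subgroup-adjusted relative risk into equation 1.

    This is the shortcut described above: the formula assumes an
    unadjusted relative risk, so the result is biased under confounding.
    """
    excess = prevalence * (adjusted_rr - 1.0)
    attributable_fraction = excess / (1.0 + excess)
    return attributable_fraction * total_deaths


# Hypothetical inputs for a whole target population.
estimate = partially_adjusted_deaths(
    adjusted_rr=1.5, prevalence=0.2, total_deaths=2_000_000
)
```

Note that the subgroup structure of the target population never enters the calculation; only the overall prevalence and the overall number of deaths are used.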
In another approach, referred to by Benichou (13) as the weighted-sum method, unadjusted relative risks are calculated separately for each subgroup in the derivation cohort and used in equation 1 or 2, along with information on the prevalence of obesity and the number of deaths within each subgroup, to calculate the number of deaths attributable to obesity in each subgroup. These numbers are then summed over subgroups to obtain the total number of deaths in the population attributable to obesity. For example, the number of deaths attributable to obesity can be calculated separately for each age group from age-specific values for relative risk, the prevalence of obesity, and the number of deaths and then added together across age groups. In effect, this approach uses a fully saturated model that can account for both confounding and interaction because it allows for different relative risks within each subgroup. We used this method, along with our hypothetical relative risks for each subgroup, to obtain the hypothetical "true" values for our examples.
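A minimal sketch of the weighted-sum calculation follows, assuming one exposure level per subgroup so that equation 1 applies within each subgroup. The subgroup names, prevalences, relative risks, and death counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def weighted_sum_deaths(subgroups):
    """Sum subgroup-specific attributable deaths over all subgroups.

    Each subgroup supplies its own prevalence, relative risk, and deaths,
    so the relative risk is free to differ between subgroups.
    """
    total = 0.0
    for group in subgroups:
        excess = group["prevalence"] * (group["rr"] - 1.0)
        attributable_fraction = excess / (1.0 + excess)
        total += group["deaths"] * attributable_fraction
    return total


# Hypothetical target population with two age groups.
subgroups = [
    {"name": "younger", "prevalence": 0.25, "rr": 2.0, "deaths": 400_000},
    {"name": "older", "prevalence": 0.20, "rr": 1.1, "deaths": 1_600_000},
]
true_value = weighted_sum_deaths(subgroups)
```

In this made-up example most deaths occur in the older group, where the relative risk is close to 1, so that group contributes fewer attributable deaths than its share of total deaths might suggest.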
Estimates of potential bias
We estimated the direction and magnitude of bias from the partially adjusted method for calculating the number of attributable deaths, using hypothetical examples based on real data and published relative risks. The prevalence of BMI categories was calculated by using data from the Third National Health and Nutrition Examination Survey (excluding pregnant women and those for whom BMI data were missing), and the number of deaths in the US population in 1991 was taken from US vital statistics. For convenience, the percentage of the population in each sex-age group was calculated from the sample weights for the Third National Health and Nutrition Examination Survey sample, but note that the target population for the National Health and Nutrition Examination Survey is the civilian noninstitutionalized population rather than the total population.
As our hypothetical relative risks, we used the relative risk (hazard ratio) estimates from the Allison et al. (7) paper based on the National Health and Nutrition Examination Survey I Epidemiologic Follow-up Study. Although those hazard ratios were adjusted for three factors (age, sex, and smoking), here we consider only age and sex because mortality data are tabulated by age and sex. We follow the Allison et al. paper in using the category of BMI 23–<25 as the reference category and the category of BMI <23 as a nonreference, nonexposure category. We calculated the relative risk for the category of BMI 25–<30 as a prevalence-weighted average of the hazard ratios for the individual BMI values within that category from the Allison et al. paper.
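One plausible reading of the prevalence-weighted relative risk for a BMI category is a weighted average of the unit-specific hazard ratios, with the prevalence of each BMI unit as the weight. The sketch below assumes that reading; the function name and all values are hypothetical and are not the hazard ratios published by Allison et al.

```python
def prevalence_weighted_rr(prevalences, hazard_ratios):
    """Average the hazard ratios, weighting each by its BMI unit's prevalence."""
    total_prevalence = sum(prevalences)
    weighted = sum(p * hr for p, hr in zip(prevalences, hazard_ratios))
    return weighted / total_prevalence


# Hypothetical values for single BMI units 25, 26, 27, 28, and 29.
unit_prevalences = [0.08, 0.07, 0.06, 0.05, 0.04]
unit_hazard_ratios = [1.0, 1.1, 1.2, 1.3, 1.4]
overweight_rr = prevalence_weighted_rr(unit_prevalences, unit_hazard_ratios)
```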
In all cases, we held the adjusted relative risks constant in the target population so that the adjusted relative risks would match those in the Allison et al. (7) paper. The two sets of hypothetical relative risks within subgroups that we used are shown in table 1. One set did not vary by age-sex group (representing possible confounding but no interaction). Another set (arrived at empirically) varied by age-sex group (representing effect modification or interaction) but resulted in the same final adjusted relative risks (adjusted across subgroups by using the Mantel-Haenszel method). We selected this second set of values to show a decline in relative risk with age, consistent with the literature on this topic.
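For readers unfamiliar with Mantel-Haenszel adjustment, the sketch below uses the standard Mantel-Haenszel risk ratio for stratified cumulative-incidence data. It is an assumption that this variant matches the authors' calculation, which may have been based on rates instead; the function name, dictionary keys, and counts are hypothetical.

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel summary risk ratio across strata.

    Each stratum is a dict holding deaths and totals for the exposed
    and unexposed groups.
    """
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        stratum_total = s["exposed_total"] + s["unexposed_total"]
        numerator += s["exposed_deaths"] * s["unexposed_total"] / stratum_total
        denominator += s["unexposed_deaths"] * s["exposed_total"] / stratum_total
    return numerator / denominator


# Hypothetical strata with stratum-specific risk ratios of 2.0 and 1.5.
strata = [
    {"exposed_deaths": 20, "exposed_total": 100,
     "unexposed_deaths": 10, "unexposed_total": 100},
    {"exposed_deaths": 30, "exposed_total": 100,
     "unexposed_deaths": 60, "unexposed_total": 300},
]
adjusted_rr = mantel_haenszel_rr(strata)
```

The summary value lies between the two stratum-specific ratios, which is why different sets of subgroup relative risks can produce the same adjusted relative risk.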
To estimate the potential bias in the partially adjusted method when there are differences between the derivation cohort and the target population, we also developed hypothetical derivation cohorts that had the same relative risks within subgroups as did the target population but differed from the target population in the proportion of people in different age-sex subgroups, the prevalence of obesity, or the risk of mortality in the reference BMI category. We also considered the effects of minor differences in relative risk between the derivation cohort and the target population. In all cases, we considered the true values to be those derived from the weighted-sum method applied to the target population.
RESULTS
Effect modification: different relative risks in subgroups
A more realistic scenario for effects of obesity on mortality is that the effect of obesity on mortality differs between subgroups, particularly age groups, as shown in the effect-modification example in table 1. The "true" value in this hypothetical situation is 198,924 deaths (table 3). In this case, the partially adjusted method gives rise to a 16 percent overestimation of deaths attributable to overweight and obesity.
Bias within age-sex-BMI subgroups
The distribution of estimated excess deaths across age-sex subgroups and across BMI categories by the two methods is shown in table 4. The hypothetical true values are shown as WS1 (weighted sum 1, for confounding only) and WS2 (weighted sum 2, for effect modification). The bias in the partially adjusted method can be seen by comparing the values for the partially adjusted method in table 4 with the values for both weighted sums.
Effect of characteristics of the derivation cohort
Even if the relative risks within age-sex subgroups are identical between the derivation cohort and the target population, the derivation cohort may differ from the target population in terms of the distribution of subgroups, the prevalence of obesity, or the absolute risk of disease within the reference BMI category. The weighted-sum method is not affected by these differences between the derivation cohort and the target population. When there is confounding only, such differences also have no effect on the bias from the partially adjusted method. However, when there is effect modification, such differences affect the degree of bias from the partially adjusted method.
The two oldest age categories (80–84 years and ≥85 years) accounted for 3.4 percent of our target population. We allowed the proportion in the derivation cohort to increase or decrease by three percentage points, to 6.4 percent or 0.4 percent, respectively. Including a lower proportion of elderly in the derivation cohort increased the bias from use of the partially adjusted method; having a higher proportion decreased the bias (table 5). We also constructed derivation cohorts in which the prevalence of obesity (BMI ≥30) was 10 percentage points higher or lower than in the target population but that were otherwise identical to the target population. A lower prevalence of obesity in the derivation cohort increased the bias from the partially adjusted method; a higher prevalence decreased the bias (table 5). Similar but smaller effects were associated with an increase or decrease in the probability of mortality within the reference category (data not shown).
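The dependence on cohort composition can be illustrated with a small sketch that computes the expected Mantel-Haenszel relative risk from subgroup sizes, obesity prevalence, reference-category risk, and subgroup relative risks. This is a simplified two-age-group illustration under the assumption of a Mantel-Haenszel risk ratio; the function name and all inputs are hypothetical and do not reproduce table 5.

```python
def expected_mh_rr(strata):
    """Expected Mantel-Haenszel risk ratio for a cohort.

    strata: list of (n, prevalence, reference_risk, rr) tuples,
    one tuple per age group.
    """
    numerator = 0.0
    denominator = 0.0
    for n, prevalence, reference_risk, rr in strata:
        n_exposed = n * prevalence
        n_unexposed = n - n_exposed
        exposed_deaths = n_exposed * reference_risk * rr
        unexposed_deaths = n_unexposed * reference_risk
        numerator += exposed_deaths * n_unexposed / n
        denominator += unexposed_deaths * n_exposed / n
    return numerator / denominator


# Effect modification: RR of 2.0 in the young, 1.0 in the elderly.
rr_matching = expected_mh_rr([(900, 0.2, 0.01, 2.0), (100, 0.2, 0.10, 1.0)])
rr_fewer_elderly = expected_mh_rr([(950, 0.2, 0.01, 2.0), (50, 0.2, 0.10, 1.0)])

# Confounding only: the same RR of 1.5 in both age groups.
same_rr_matching = expected_mh_rr([(900, 0.2, 0.01, 1.5), (100, 0.2, 0.10, 1.5)])
same_rr_fewer = expected_mh_rr([(950, 0.2, 0.01, 1.5), (50, 0.2, 0.10, 1.5)])
```

In this sketch the adjusted relative risk rises when the elderly are underrepresented and the relative risk declines with age, while it stays fixed when the relative risk is the same in every subgroup. That pattern is consistent with the direction of the results reported above, although the magnitudes here are arbitrary.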
DISCUSSION
Our numerical examples, using plausible values for the US population, suggest that the Allison et al. (7) method of calculating deaths attributable to overweight and obesity may result in substantial bias when there is confounding or effect modification. It is well established in the statistical literature that using an adjusted relative risk estimate with an attributable fraction formula for an unadjusted relative risk will not in general give unbiased estimates when there is confounding (13). The degree of bias has not often been quantified, however. In the US population, age and sex are associated with both obesity and mortality and thus confound the obesity-mortality relation. In our example, the partially adjusted method gave an estimate too high by 17 percent when confounding by age and sex was not fully accounted for.
The method used by Allison et al. (7) also did not allow for effect modification. When there is effect modification (interaction), that is, differential effects of obesity by age or other factors, further bias can arise from the partially adjusted method when the derivation cohort differs from the target population. Allowing the proportion of people in the older age groups (≥80 years) to be slightly lower in the derivation cohort than in the target population led to a bias of 42 percent when the partially adjusted method was used (283,377 excess deaths rather than 198,924, a difference of almost 85,000 excess deaths).
In both of the above situations, if relative risks and deaths attributable to overweight and obesity are calculated separately for each subgroup (the weighted-sum method), the sum of deaths attributable to overweight and obesity will be correct. If the relative risk estimates from the derivation cohort are slightly inaccurate, however, neither a single adjusted risk estimate nor the weighted-sum method will give the correct results. Our numerical examples show that a variation of 0.1 or 0.2 in relative risk estimates can potentially produce an extremely large bias in the estimated deaths attributable to overweight and obesity, both with the partially adjusted method and with the weighted-sum method.
We demonstrated the possible magnitude and direction of the bias in estimates of deaths attributable to overweight and obesity in the United States with numerical examples that used plausible values of the mortality relative risks for age groups in the US population. These explorations suggest that when the mortality relative risk is estimated with a single age-adjusted relative risk, overestimation of deaths attributable to overweight and obesity in the US population is more likely than underestimation.
Certain types of differences between derivation cohort and population appear more likely than others. Sample selection in many cohorts underrepresents the elderly, particularly the older elderly. In addition, the proportion of elderly in the population has increased over time. Thus, it is unlikely that a cohort study conducted some years previously will have as high a proportion of elderly as the general population.
The relative risks associated with obesity may be different in epidemiologic cohort studies than in the general US population (16). Many cohort studies exclude, either in sample selection or in analysis, people at a higher risk of mortality, such as those in hospitals and nursing homes, those with cancer and heart disease at baseline, current smokers, and former smokers. For those excluded, the relative risk associated with obesity may be lower. For example, in the study by Calle et al. (11), limiting the analyses to subjects who had never smoked and who had no history of disease at enrollment increased the relative risk associated with high BMI. After these exclusions, relative risks in cohort samples may tend to be overestimates of the true mortality relative risk for overweight and obesity in the entire US population.
The relative proportions of the elderly, the death rates for the nonobese, and the prevalence of obesity in many cohorts are likely to be lower than those in the US population as a whole, and the relative risk may also be higher because of exclusions. In our numerical examples, a combination of these characteristics, including an overestimation of the relative risk by 0.1 and a lower proportion of older elderly in the sample, led to a remarkable 83 percent overestimation of deaths attributable to overweight and obesity. An overestimation of the relative risk by 0.2, coupled with a lower proportion of elderly, led to more than a doubling of the number of attributable deaths, with a bias of 120 percent (436,697 deaths rather than 198,924 deaths). These estimates are potentially very sensitive to minor differences.
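The percentage biases quoted in this discussion follow from the reported death counts by simple arithmetic, as the short check below shows. The helper name `percent_bias` is ours; the three counts are those stated in the text.

```python
def percent_bias(estimate, true_value):
    """Relative bias of an estimate, as a percentage of the true value."""
    return 100.0 * (estimate - true_value) / true_value


TRUE_DEATHS = 198_924  # weighted-sum value under effect modification

# Lower proportion of older elderly in the derivation cohort.
bias_fewer_elderly = percent_bias(283_377, TRUE_DEATHS)

# Relative risk overestimated by 0.2 plus a lower proportion of elderly.
bias_rr_plus_02 = percent_bias(436_697, TRUE_DEATHS)

excess_deaths_gap = 283_377 - TRUE_DEATHS
```

Rounded to whole percentages, these evaluate to the 42 percent and 120 percent figures given in the text, and the gap of 84,453 deaths is the "almost 85,000" mentioned earlier.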
The biases were greater for older people and for those with higher BMIs. Thus, the bias due to applying this method in the US population may increase over time as a higher proportion of deaths occur in older people and as obesity increases. In 1991, 51.7 percent of deaths in adults occurred in those 75 years of age or older; the corresponding figure in 2000 was 57.8 percent. Another consideration is whether the estimate of interest is all deaths attributable to overweight and obesity or only premature deaths occurring at ages of less than 75 years. If we restrict attention to deaths occurring in people under 75 years of age, the estimated number of deaths due to overweight and obesity would be considerably smaller (ranging from ~110,000 to ~149,000 in our examples), regardless of what method is used.
This paper has focused on how estimates of the number of deaths attributable to a risk factor can be biased if the estimates are calculated improperly or if the relative risks are not estimated accurately. Another issue is the construction of confidence intervals that reflect the random variation of these estimates. Previous estimates of the number of deaths attributable to overweight and obesity do not provide confidence intervals, perhaps because of the complexity of the estimates that involve combining estimates of complex quantities from multiple data sources. However, if the original data from the various sources are available, bootstrap or jackknife methods (17) can be applied to obtain standard errors from which confidence intervals can be constructed. Estimates of numbers of deaths attributable to overweight and obesity may be surprisingly unreliable, particularly if the relative risks are based on small numbers of outcomes or if the sample sizes for estimating exposure to risk factors are not large.
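A bootstrap standard error for an attributable fraction might be sketched as follows. This is a deliberately simplified illustration: it assumes a single simple random sample of individual records supplying both prevalence and relative risk, whereas real estimates combine several sources with complex survey designs. The function names, the synthetic records, the number of resamples, and the normal-approximation interval are all assumptions.

```python
import random
import statistics


def paf_from_records(records):
    """PAF (equation 1) from a list of (exposed, died) pairs of 0/1 flags."""
    n_exposed = sum(e for e, _ in records)
    n_unexposed = len(records) - n_exposed
    deaths_exposed = sum(d for e, d in records if e)
    deaths_unexposed = sum(d for e, d in records if not e)
    if n_exposed == 0 or n_unexposed == 0 or deaths_unexposed == 0:
        return None  # relative risk is undefined for this sample
    rr = (deaths_exposed / n_exposed) / (deaths_unexposed / n_unexposed)
    excess = (n_exposed / len(records)) * (rr - 1.0)
    return excess / (1.0 + excess)


def bootstrap_se(records, n_boot=500, seed=0):
    """Standard deviation of the PAF over resamples drawn with replacement."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_boot):
        value = paf_from_records(rng.choices(records, k=len(records)))
        if value is not None:
            draws.append(value)
    return statistics.stdev(draws)


# Synthetic cohort: 600 exposed (90 deaths), 1,400 unexposed (140 deaths).
records = [(1, 1)] * 90 + [(1, 0)] * 510 + [(0, 1)] * 140 + [(0, 0)] * 1260
point = paf_from_records(records)
se = bootstrap_se(records)
ci = (point - 1.96 * se, point + 1.96 * se)  # normal-approximation interval
```

With separate data sources, each source would be resampled independently according to its own design before the estimate is recomputed.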
Estimates of the attributable number of deaths due to a risk factor, such as obesity, can be dramatically affected by the relative risk estimates that are used. We recommend that sensitivity analyses be included in future studies. It would be advisable to estimate the attributable number of deaths several times by using relative risks estimated under alternative models, for example, with and without various interactions with other risk factors.
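A very simple form of sensitivity analysis is to shift the relative risk by small amounts and recompute the attributable deaths. The sketch below does this with equation 1 for a single exposure level; it stands in for, and is much cruder than, refitting alternative models as recommended above. The function names, the shifts, and all inputs are hypothetical.

```python
def attributable_deaths(prevalence, rr, total_deaths):
    """Attributable deaths from equation 1 for one exposure level."""
    excess = prevalence * (rr - 1.0)
    return total_deaths * excess / (1.0 + excess)


def sensitivity_table(prevalence, base_rr, total_deaths,
                      shifts=(-0.2, -0.1, 0.0, 0.1, 0.2)):
    """Return (shift, estimate, relative change from the base estimate)."""
    base = attributable_deaths(prevalence, base_rr, total_deaths)
    rows = []
    for shift in shifts:
        estimate = attributable_deaths(prevalence, base_rr + shift, total_deaths)
        rows.append((shift, estimate, (estimate - base) / base))
    return rows


# Hypothetical inputs for illustration only.
rows = sensitivity_table(prevalence=0.3, base_rr=1.3, total_deaths=2_000_000)
for shift, estimate, rel_change in rows:
    print(f"RR shift {shift:+.1f}: {estimate:>10,.0f} deaths ({rel_change:+.0%})")
```

When the base relative risk is close to 1, a shift of 0.1 is a large share of the excess risk, which is why the estimated number of deaths moves so much.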
Existing estimates of the number of deaths attributable to overweight and obesity (7) were calculated by using a method likely to produce biased estimates when the effects of obesity vary by age or other characteristics. Estimates of deaths attributable to overweight and obesity arrived at by using this approach may be biased and should be viewed cautiously.