RE: "PRESENTING STATISTICAL UNCERTAINTY IN TRENDS AND DOSE-RESPONSE RELATIONS"

John L. Hopper and Gillian S. Dite

The University of Melbourne, Centre for Genetic Epidemiology, Carlton, Victoria 3053, Australia

We thank Greenland et al. (1) for drawing attention to the "floating absolute risk" (FAR) method for presenting statistical uncertainty in trends and dose-response relations, originally introduced by Easton et al. (2). This is an important tool for assessing the pattern of risk when there is no obvious "unexposed" referent category, such as for a continuously distributed risk factor. It can also help determine the most likely mode of inheritance from genotype data generated by common variants (polymorphisms) (3).

We have developed a simple method, using standard statistical software and simple calculations, for deriving approximate confidence intervals for the floating log odds ratios for different categories from an unmatched case-control study, while concurrently adjusting for other measured covariates (4). Let the risk factor of major interest be categorized into k groups. For each category i, let $p_i$ be the observed proportion of cases and $O_i = p_i/(1 - p_i)$ the observed odds of being a case in the ith exposure category. Then, let $\hat{\theta}_i = \log(O_i)$, so that $\hat{\theta}_{ij} = \log(O_i/O_j) = \hat{\theta}_i - \hat{\theta}_j$ is an estimate of the log relative risk $\theta_{ij}$ in category i relative to category j, under the usual conditions.

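If there are no adjustments for covariates, $\hat{\theta}_i$ and $\hat{\theta}_j$ are independent for all pairs i and j. Therefore,

$$\mathrm{Var}(\hat{\theta}_{ij}) = \mathrm{Var}(\hat{\theta}_i) + \mathrm{Var}(\hat{\theta}_j), \qquad (1)$$

so, for any distinct values of i, j, and k,

$$\mathrm{Var}(\hat{\theta}_i) = \tfrac{1}{2}\left[\mathrm{Var}(\hat{\theta}_{ij}) + \mathrm{Var}(\hat{\theta}_{ik}) - \mathrm{Var}(\hat{\theta}_{jk})\right]. \qquad (2)$$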

If adjustments for covariates are made, the association between $\hat{\theta}_i$ and $\hat{\theta}_j$ is minimized by mean-centering the covariate(s) (refer to the Appendix in Greenland et al. (1)). In any case, the assumption of independence is approximately true provided there is no strong collinearity, even without centering. Simply fit an unconditional logistic regression model (including the intercept term) to the probability of being a case by using one category, i, as baseline. Estimate $\mathrm{Var}(\hat{\theta}_{ij})$ by $[\text{standard error}(\hat{\theta}_{ij})]^2$ for all $j \neq i$. Repeat once for a different baseline category, j, and use equation 2 to derive all $\mathrm{Var}(\hat{\theta}_i)$. Standard errors and confidence intervals follow from standard asymptotic likelihood theory.
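
The two-fit procedure can be sketched as follows. This is a minimal illustration rather than the software used for the published analysis: the counts are hypothetical, there is no covariate adjustment, and the squared standard error that a logistic regression program would report for each contrast is replaced by the closed form 1/a_i + 1/b_i + 1/a_j + 1/b_j, where a and b denote the numbers of cases and controls. The function names `contrast_variance` and `floating_variances` are assumptions made for this example.

```python
import math

# Hypothetical case and control counts for five exposure categories
# (illustrative only; not data from the letter).
cases = [40, 55, 60, 52, 30]
controls = [50, 50, 50, 50, 50]
n_cat = len(cases)


def contrast_variance(m, base):
    """Squared standard error of log(O_m / O_base) in the unadjusted case.

    Stands in for [standard error]^2 as reported by a logistic regression
    fitted with `base` as the baseline category.
    """
    return 1 / cases[m] + 1 / controls[m] + 1 / cases[base] + 1 / controls[base]


def floating_variances(base_i, var_i, base_j, var_j):
    """Recover Var(theta_hat_m) for every category from two model fits.

    var_i[m] is Var(theta_hat_{m,i}) for m != base_i (first fit);
    var_j[m] is Var(theta_hat_{m,j}) for m != base_j (second fit).
    """
    # Any third category k, distinct from both baselines.
    k = next(m for m in var_i if m != base_j)
    # Equation 2 applied to the triple (i, j, k).
    var_base = 0.5 * (var_i[base_j] + var_i[k] - var_j[k])
    floated = {base_i: var_base}
    # Equation 1 rearranged: Var(theta_m) = Var(theta_mi) - Var(theta_i).
    for m, v in var_i.items():
        floated[m] = v - var_base
    return floated


base_i, base_j = 2, 0  # middle category first, then one other baseline
var_i = {m: contrast_variance(m, base_i) for m in range(n_cat) if m != base_i}
var_j = {m: contrast_variance(m, base_j) for m in range(n_cat) if m != base_j}
floated = floating_variances(base_i, var_i, base_j, var_j)

ref_log_odds = math.log(cases[base_i] / controls[base_i])
for m in range(n_cat):
    theta = math.log(cases[m] / controls[m]) - ref_log_odds
    half_width = 1.96 * math.sqrt(floated[m])
    print(f"category {m}: floating log OR {theta:+.3f}, "
          f"95% CI ({theta - half_width:+.3f}, {theta + half_width:+.3f})")
```

In this unadjusted setting the recovered variance for each category equals 1/a + 1/b for that category, which gives a simple check on the algebra. With covariate adjustment the same arithmetic would be applied to the squared standard errors taken from the two fitted models, and the result is then only approximate.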

We originally considered height and weight as continuously distributed risk factors for early-onset breast cancer (4). Here, we fit body mass index (BMI), while concurrently adjusting for height and several other covariates (figure 1). The height, weight, and BMI scales were each divided into five categories based on their quintile distributions in controls. We arbitrarily assigned the middle category as the referent, so that it had a log odds ratio value of 0. On the log odds scale, the (relative sizes of) confidence intervals apply irrespective of the referent category; that is, they are "floating." On the right-hand side of figure 1, we have drawn attention to this by representing the log odds scale without anchoring it to any particular reference category. For any pair of categories, estimates and confidence intervals of the relative risk can be easily derived from the floating log odds ratios and their confidence intervals or standard errors. (As noted by Greenland et al. (1), this is not possible by using the conventional approach of presenting results relevant to just one referent group.)
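
As a hedged sketch of that last step, the following Python fragment derives an odds ratio and an approximate 95 percent confidence interval for one pair of categories. The floating estimates and standard errors are hypothetical values chosen for illustration (they are not read off figure 1), `pairwise_odds_ratio` is a name invented here, and the calculation assumes that the floating estimates are approximately independent.

```python
import math

# Hypothetical floating log odds ratios and standard errors for five
# categories (illustrative values, not taken from figure 1).
floating_log_or = [-0.05, -0.20, 0.00, 0.02, -0.45]
floating_se = [0.15, 0.15, 0.14, 0.15, 0.16]


def pairwise_odds_ratio(i, j, z=1.96):
    """Odds ratio for category i versus category j with a Wald-type interval.

    Treats the floating estimates as approximately independent, so the
    variance of the difference is the sum of the two floating variances.
    """
    log_or = floating_log_or[i] - floating_log_or[j]
    se = math.sqrt(floating_se[i] ** 2 + floating_se[j] ** 2)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


or_41, lower, upper = pairwise_odds_ratio(4, 1)
print(f"OR (category 4 vs 1) = {or_41:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Neither category in the comparison needs to be the referent used for plotting, which is the practical advantage of reporting floating standard errors.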



FIGURE 1. Relation between risk of breast cancer before age 40 years and body mass index, while concurrently adjusting for height and other covariates, as represented by floating case-control log odds ratios; refer to McCredie et al. (4). Estimates are relative to a referent risk category arbitrarily assigned to the middle category (i.e., floating log odds ratio = 0). The circles represent point estimates for the mean of each quintile for a typical control, and the triangles show upper and lower 95 percent confidence limits. The dashes represent the best-fitting straight line for the same typical control.

 
An important feature of this approach is its ability to assess the trend in risk as a function of the variable of interest. We previously found that a linear effect on the log odds scale gave a good fit for height after adjusting for weight but not for weight after adjusting for height (refer to figure 2 in McCredie et al. (4)). Figure 1 illustrates that the association between risk of early-onset breast cancer and BMI is not linear; the best-fitting straight line goes outside the 95 percent confidence interval for the second quintile and fails a goodness-of-fit test ($\chi^2_3 = 8.9$, p < 0.05). As for weight, women in the highest quintile were at a reduced risk, rather than those in the lowest quintile being at a higher risk, and this finding may have etiologic significance; refer to the Discussion in McCredie et al. (4).
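
The quoted significance level can be checked with a few lines of Python. This sketch assumes that the goodness-of-fit statistic is referred to a chi-square distribution with 3 degrees of freedom (five categories less the two parameters of a straight line); it does not reproduce how the statistic of 8.9 itself was computed. The function name `chi2_sf_3df` is hypothetical.

```python
import math


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variate with 3 degrees of freedom.

    Closed form valid for 3 degrees of freedom only:
    Q(x) = erfc(sqrt(x / 2)) + sqrt(2 x / pi) * exp(-x / 2).
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi2_sf_3df(8.9)
print(f"p = {p_value:.3f}")
```

The resulting tail probability is roughly 0.03, consistent with the reported p < 0.05.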

In summary, our approach and that of Greenland et al. (1) and Easton et al. (2) are essentially the same, as can be demonstrated empirically. Our algorithm uses standard statistical software and requires minimal calculations. We found it informative when applied to breast cancer, and we concur with Greenland et al. in recommending it to epidemiologists.

References

  1. Greenland S, Michels KB, Robins JM, et al. Presenting statistical uncertainty in trends and dose-response relations. Am J Epidemiol 1999;149:1077–86.
  2. Easton DF, Peto J, Babiker AG. Floating absolute risk: an alternative to relative risk in survival and case-control analysis avoiding an arbitrary reference group. Stat Med 1991;10:1025–35.
  3. Spurdle AB, Hopper JL, Dite GS, et al. CYP17 promoter polymorphism and breast cancer in Australian women under age forty years. J Natl Cancer Inst 2000;92:1674–81.
  4. McCredie MR, Dite GS, Giles GG, et al. Breast cancer in Australian women under the age of 40. Cancer Causes Control 1998;9:189–98.

 

THE AUTHORS REPLY

Sander Greenland, Karin B. Michels, Charles Poole, James M. Robins and Walter C. Willett

Departments of Epidemiology and Statistics, University of California, Los Angeles, CA 90095-1772
Obstetrics and Gynecology Epidemiology Center, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115
Department of Epidemiology, UNC School of Public Health, Chapel Hill, NC 27599-7400
Departments of Epidemiology and Biostatistics, Harvard School of Public Health, Boston, MA 02115
Channing Laboratory, Harvard School of Medicine, Boston, MA 02115

While we agree with Hopper and Dite (1) about the value of FAR-type methods, we have three problems with their approach and their examples (1, 2).

First, the "floated trend" model given in our paper (3) can be fitted easily to cohorts and to unmatched case-control data with conventional software via the following coding trick: After mean-centering the adjustment covariates, include indicators for all the exposure categories in the fitted model, as well as the covariates, making sure to specify "NO INTERCEPT" or "NO CONSTANT" or the equivalent command to exclude the intercept term from the model. The resulting estimates of the exposure-indicator coefficients are estimates of the $\theta_i$ (in the notation of Hopper and Dite (1)), and the estimated standard errors of these coefficients given by the program are approximately correct. This coding trick is much simpler than the repeated model fitting and use of equation 2 in Hopper and Dite.
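
The coding trick can be sketched in Python. This is a minimal illustration under stated assumptions: the data are simulated, `floated_design` and `fit_logistic` are hypothetical helper names, and a hand-written Newton-Raphson fit with NumPy stands in for the "NO INTERCEPT" option of a conventional logistic regression program.

```python
import numpy as np


def floated_design(category, covariates, n_categories):
    """Design matrix with an indicator for EVERY category and no intercept column."""
    indicators = np.eye(n_categories)[category]
    centered = covariates - covariates.mean(axis=0)  # mean-center the covariates
    return np.hstack([indicators, centered])


def fit_logistic(X, y, n_iter=25):
    """Maximum likelihood logistic regression by Newton-Raphson; X is used as given."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        info = X.T @ (X * (p * (1.0 - p))[:, None])  # Fisher information
        beta = beta + np.linalg.solve(info, X.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    info = X.T @ (X * (p * (1.0 - p))[:, None])
    se = np.sqrt(np.diag(np.linalg.inv(info)))
    return beta, se


# Simulated unmatched case-control data (illustrative only).
rng = np.random.default_rng(0)
n, n_categories = 2000, 5
category = rng.integers(0, n_categories, size=n)
age = rng.normal(35.0, 4.0, size=n)  # a single hypothetical adjustment covariate
true_theta = np.array([-0.2, 0.0, 0.1, 0.0, -0.5])
logit = true_theta[category] + 0.05 * (age - 35.0)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

X = floated_design(category, age[:, None], n_categories)
beta, se = fit_logistic(X, y)
theta_hat, theta_se = beta[:n_categories], se[:n_categories]
for i in range(n_categories):
    print(f"category {i}: theta_hat = {theta_hat[i]:+.3f}, SE = {theta_se[i]:.3f}")
```

With no adjustment covariates, the coefficient for category i in this no-intercept model reduces to the log of the observed odds in that category, and its standard error to the square root of 1/cases + 1/controls, so a single fit returns the floated quantities directly.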

Second, Hopper and Dite (1) plot categorical estimates based on control quintiles, which makes it difficult to evaluate the underlying trend. For example, is the unusual pattern in their figure compatible with simple curves such as a flat curve followed by a decline? Is the nadir at BMI = 21 an artifact of categorization? And where does the trend begin to decline after BMI = 25? Such questions would be better illuminated by a flexible continuous-data analysis without categories (3–5).

Third, there are serious biologic objections to modeling chronic-disease risk as a function of weight and height main effects only, as did McCredie et al. (2), rather than as a function of a relative-weight measure such as BMI (6), because of the likely variation in weight effects with height. In particular, the relation of weight to breast cancer is believed to be due to the effect of adiposity on breast cancer risk (7, pp. 197–199). The increased risk conferred by moving from 110 pounds to 150 pounds in a 6-foot woman (moving from a BMI of 14.9 to a BMI of 20.4) is that of moving from starvation to low-adiposity status. It seems doubtful that this risk increase would resemble that conferred by moving from 110 pounds to 150 pounds in a 5-foot woman (moving from a BMI of 21.5 to a BMI of 29.3), which is moving from low-adiposity to overweight status. Yet the model used by McCredie et al. assumes that the two effects are the same, that is, that height does not modify the relative risks comparing absolute weight categories. We do not think such an implausible model would yield valid inferences, especially about trends. Furthermore, with weight held constant in this model, the meaning of the height coefficient becomes obscure, because changes in height must now reflect changes in body composition (8). A more credible starting model would use BMI and height (6), as do Hopper and Dite (1). Even with this starting model, an analysis of potential variation in BMI effect across height would be warranted.
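
The BMI values in this comparison can be reproduced with a short calculation. The sketch below assumes the usual definition of BMI as weight in kilograms divided by the square of height in meters, with exact unit conversions; the helper name `bmi` is hypothetical.

```python
LB_TO_KG = 0.45359237  # exact definition of the avoirdupois pound
FT_TO_M = 0.3048       # exact definition of the foot


def bmi(weight_lb, height_ft):
    """Body mass index in kg/m^2 from weight in pounds and height in feet."""
    return (weight_lb * LB_TO_KG) / (height_ft * FT_TO_M) ** 2


for height_ft in (6, 5):
    low, high = bmi(110, height_ft), bmi(150, height_ft)
    print(f"{height_ft}-foot woman: BMI {low:.1f} at 110 lb, {high:.1f} at 150 lb")
```

The computed values agree with those quoted above to within 0.1 BMI unit; the same 40-pound gain spans quite different regions of the BMI scale at the two heights.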

References

  1. Hopper JL, Dite GS. Re: "Presenting statistical uncertainty in trends and dose-response relations." (Letter). Am J Epidemiol 2002;155:977–8.
  2. McCredie MR, Dite GS, Giles GG, et al. Breast cancer in Australian women under the age of 40. Cancer Causes Control 1998;9:189–98.
  3. Greenland S, Michels KB, Robins JM, et al. Presenting statistical uncertainty in trends and dose-response relations. Am J Epidemiol 1999;149:1077–86.
  4. Hastie T, Tibshirani R. Generalized additive models. New York, NY: Chapman & Hall, 1990.
  5. Greenland S. Analysis of polytomous exposures and outcomes. In: Rothman KJ, Greenland S, eds. Modern epidemiology. 2nd ed. Philadelphia, PA: Lippincott-Raven, 1998:301–28.
  6. Michels KB, Greenland S, Rosner BA. Does body mass index adequately capture the relation of body composition and body size to health outcomes? Am J Epidemiol 1998;147:167–72.
  7. Willett WC, Rockhill B, Hankinson SE, et al. Epidemiology and nongenetic causes of breast cancer. In: Harris JR, Lippman MF, Morrow M, et al, eds. Diseases of the breast. 2nd ed. Philadelphia, PA: Lippincott-Raven, 2000:175–219.
  8. Willett WC. Nutritional epidemiology. 2nd ed. New York, NY: Oxford University Press, 1998:252–3.