1 Robarts Clinical Trials, Robarts Research Institute, London, Ontario, Canada.
2 Department of Epidemiology and Biostatistics, University of Western Ontario, London, Ontario, Canada.
Received for publication August 7, 2003; accepted for publication September 25, 2003.
ABSTRACT |
---|
clinical trials; cohort studies; logistic regression; Mantel-Haenszel; odds ratio; relative risk
Abbreviations: CI, confidence interval; RR, relative risk.
INTRODUCTION |
---|
Despite repeated emphasis on the importance of the rare event rate assumption, consumers of medical reports often interpret the odds ratio as a relative risk, leading to its potential exaggeration. For example, several major US news media recently dramatically overstated the effects of race and sex on physicians' referrals for cardiac catheterization: a 7 percent reduction in the referral rate for Black women was mistakenly reported as 40 percent (3).
Extensive discussion in much of the literature has reached a consensus that the relative risk is preferred over the odds ratio for most prospective investigations (1, 4, 5). Nevertheless, the recent medical literature has frequently included uncritical application of logistic regression to prospective studies. Coupled with the perception that easily accessible alternatives are unavailable, naive conversion of an adjusted odds ratio to a relative risk has compounded the difficulties (6, 7). Not only will this conversion method provide invalid confidence limits (7), but, most importantly, it will also produce inconsistent estimates for the relative risk; that is, the bias will not decrease as the sample size increases. Suppose, for example, in a study with two strata, each having 200 subjects, the estimated risks are 0.8 for the exposed group (140 subjects) and 0.4 for the unexposed group (60 subjects) in stratum 1, while the corresponding risks are 0.1 (60 subjects) and 0.05 (140 subjects) in stratum 2. It is obvious that the standard Mantel-Haenszel estimate for the relative risk is 2.0, but converting the odds ratio as obtained from logistic regression results in an estimated value of 2.98. Moreover, increasing each cell size 10-fold will result in a 95 percent confidence interval of 2.68, 3.25.
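The Mantel-Haenszel figure in this two-stratum example can be checked by hand. The following is a minimal Python sketch, not taken from the paper: the function name `mantel_haenszel_rr` is hypothetical, and the event counts are simply the stated risks multiplied by the stated group sizes.

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel summary relative risk.

    Each stratum is a tuple
    (events_exposed, n_exposed, events_unexposed, n_unexposed).
    """
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n0 in strata:
        total = n1 + n0
        numerator += a * n0 / total    # exposed events, weighted by unexposed size
        denominator += c * n1 / total  # unexposed events, weighted by exposed size
    return numerator / denominator


# Event counts implied by the risks quoted in the text (assumed to be exact).
strata = [
    (112, 140, 24, 60),  # stratum 1: 0.8 * 140 and 0.4 * 60
    (6, 60, 7, 140),     # stratum 2: 0.1 * 60 and 0.05 * 140
]

rr_mh = mantel_haenszel_rr(strata)
print(round(rr_mh, 2))  # 2.0
```

Both stratum-specific relative risks equal 2.0 (0.8/0.4 and 0.1/0.05), so any sensible summary estimator should return 2.0 here. The converted-odds-ratio value of 2.98 is not reproduced, because it depends on a logistic regression fit and a conversion step that the text does not spell out.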
To estimate the relative risk directly, binomial regression (8) and Poisson regression (7) are usually recommended. However, as is commonly known, neither is very satisfactory. Convergence problems may arise with binomial regression models; in this case, they may fail to provide an estimate of the relative risk (7–10). On the other hand, use of Poisson regression tends to provide conservative results (7, 11, 12).
The purpose of this paper is to demonstrate how to estimate relative risk by using the Poisson regression model with a robust error variance. Since this procedure coexists with logistic regression analysis as implemented in standard statistical packages, there is no justification for relying on logistic regression when the relative risk is the parameter of primary interest.
MODIFIED POISSON REGRESSION |
---|
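Consider the case in which $x_i$ ($i = 1, 2, \ldots, n$) is a binary exposure with a value of 1 if exposed and 0 if unexposed. Then, the data can be summarized in a 2-by-2 table (table 1).

Let $\pi(x_i)$ denote the risk of the binary outcome $y_i$ for subject $i$. The log-link model is

$$\log[\pi(x_i)] = \alpha + \beta x_i.$$

The relative risk (RR) is then given by $\exp(\beta)$. If a Poisson distribution is assumed for $y_i$, the log-likelihood is given by

$$l(\alpha, \beta) = C + \sum_{i=1}^{n} \left[ y_i(\alpha + \beta x_i) - \exp(\alpha + \beta x_i) \right],$$

where $C$ is a constant. Application of standard likelihood theory yields

$$\widehat{RR} = \exp(\hat{\beta}) = \frac{a/n_1}{c/n_0},$$

where $a$ and $c$ are the numbers of events among the $n_1$ exposed and $n_0$ unexposed subjects, respectively, with the estimated variance of $\hat{\beta}$ given by

$$\widehat{\mathrm{var}}(\hat{\beta}) = \frac{1}{a} + \frac{1}{c}.$$

Now, since the error term is misspecified when the underlying data are binomially distributed, the sandwich estimator is used to make the appropriate correction. The corrected variance can be easily shown to be given by

$$\mathrm{var}_S(\hat{\beta}) = \frac{1 - \pi(1)}{n_1 \pi(1)} + \frac{1 - \pi(0)}{n_0 \pi(0)},$$

which is consistently estimated by

$$\widehat{\mathrm{var}}_S(\hat{\beta}) = \frac{1}{a} - \frac{1}{n_1} + \frac{1}{c} - \frac{1}{n_0}.$$

Note that this estimator is identical to the traditional variance estimator derived by using the delta method (14, p. 455). An extension of this result that incorporates covariate adjustment can be obtained by using the steps outlined elsewhere (Lachin, section A.9 (14)).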
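For a single 2-by-2 table, the two variance estimators can be compared with a few lines of arithmetic. The sketch below is illustrative only. It assumes `a` events among `n1` exposed subjects and `c` events among `n0` unexposed subjects, takes the Poisson model-based variance of the log relative risk as 1/a + 1/c, and takes the robust (delta-method) variance as 1/a - 1/n1 + 1/c - 1/n0. The counts and function names are hypothetical.

```python
import math


def rr_with_variances(a, n1, c, n0):
    """Return (RR, Poisson model-based variance, robust variance).

    Both variances refer to log(RR). a and c are event counts among the
    n1 exposed and n0 unexposed subjects, respectively.
    """
    rr = (a / n1) / (c / n0)
    var_poisson = 1 / a + 1 / c
    var_robust = 1 / a - 1 / n1 + 1 / c - 1 / n0
    return rr, var_poisson, var_robust


def wald_ci(rr, var_log_rr, z=1.96):
    """Wald confidence interval computed on the log scale."""
    half_width = z * math.sqrt(var_log_rr)
    return rr * math.exp(-half_width), rr * math.exp(half_width)


# Hypothetical counts, chosen only for illustration.
rr, var_poisson, var_robust = rr_with_variances(a=30, n1=100, c=15, n0=100)
ci_poisson = wald_ci(rr, var_poisson)
ci_robust = wald_ci(rr, var_robust)
print(rr, ci_poisson, ci_robust)
```

Because 1/n1 and 1/n0 are always positive, the robust variance is smaller than the Poisson model-based variance, which is why the uncorrected Poisson interval is wider.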
Sandwich error estimation can be implemented by using the SAS PROC GENMOD procedure (15) with the REPEATED statement. It is commonly known that this approach can be used to analyze clustered data, such as repeated measures obtained on the same subject (16) or observations arising from cluster randomization trials (17). It is less well known that the same statement with PROC GENMOD can also be used to obtain a robust error estimator when only one observation is available from each cluster. In the present context, this approach can be used to correctly estimate the standard error for the estimated relative risk.
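What a robust error estimate computes when each subject forms its own cluster can be sketched outside SAS. The following pure-Python example is an assumption-laden illustration, not the PROC GENMOD implementation. It fits log E[y] = alpha + beta * x by Newton-Raphson and forms the sandwich A^{-1} B A^{-1}, where A is the Poisson information matrix and B sums the outer products of the per-subject score contributions. All function names and the data are hypothetical.

```python
import math


def _sums(alpha, beta, x, y):
    """Score vector, information matrix A, and squared-residual matrix B.

    The symmetric 2x2 matrices are stored as [m00, m01, m11].
    """
    u = [0.0, 0.0]
    A = [0.0, 0.0, 0.0]
    B = [0.0, 0.0, 0.0]
    for xi, yi in zip(x, y):
        mu = math.exp(alpha + beta * xi)
        r = yi - mu
        u[0] += r
        u[1] += r * xi
        for M, w in ((A, mu), (B, r * r)):
            M[0] += w
            M[1] += w * xi
            M[2] += w * xi * xi
    return u, A, B


def fit_modified_poisson(x, y, iterations=25):
    """Return (beta, model-based variance, sandwich variance) for beta."""
    alpha, beta = math.log(sum(y) / len(y)), 0.0
    for _step in range(iterations):
        (u0, u1), (a00, a01, a11), _unused = _sums(alpha, beta, x, y)
        det = a00 * a11 - a01 * a01
        # Newton-Raphson update: add A^{-1} times the score vector.
        alpha += (a11 * u0 - a01 * u1) / det
        beta += (a00 * u1 - a01 * u0) / det
    _unused, (a00, a01, a11), (b00, b01, b11) = _sums(alpha, beta, x, y)
    det = a00 * a11 - a01 * a01
    i10, i11 = -a01 / det, a00 / det  # second row of A^{-1}
    model_var = i11
    robust_var = i10 * i10 * b00 + 2 * i10 * i11 * b01 + i11 * i11 * b11
    return beta, model_var, robust_var


# Hypothetical data: 30/100 events among exposed, 15/100 among unexposed.
x = [1] * 100 + [0] * 100
y = [1] * 30 + [0] * 70 + [1] * 15 + [0] * 85
beta, model_var, robust_var = fit_modified_poisson(x, y)
print(math.exp(beta), model_var, robust_var)
```

With a single binary exposure, the sandwich variance returned here agrees with the closed form 1/a - 1/n1 + 1/c - 1/n0, and the model-based variance agrees with 1/a + 1/c. This sketch applies no small-sample correction, so packaged software may differ slightly if it applies one.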
To validate this procedure numerically, I evaluated the performance of the modified Poisson regression approach in terms of relative bias for point estimation and percentage of confidence interval coverage. For comparison, I also included binomial regression and the standard Mantel-Haenszel procedure (18). Total sample sizes considered were 100, 200, and 500, with relative risk values of 1.0, 2.0, and 3.0. Sample sizes of less than 100 may provide confidence intervals that are too wide and thus were not considered here. In each of 1,000 simulated data sets, n subjects were randomly assigned to the exposure group with a probability of 0.5. Subjects in the exposure group were randomly assigned to the first stratum with a probability of 0.6, whereas those in the nonexposed group were assigned with a probability of 0.4 to this stratum. Regression analysis was performed by using the PROC GENMOD procedure for both binomial regression and Poisson regression and the PROC FREQ procedure for the Mantel-Haenszel method. The SAS macro used for the simulation is available from the author on request.
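The two performance measures, relative bias and confidence interval coverage, can be illustrated with a much-simplified simulation. The sketch below is not the SAS macro mentioned above. It omits the stratification, assumes a baseline risk of 0.2 (a value not given in the text), and uses the closed-form robust variance for a single 2-by-2 table. The function name, seed, and defaults are hypothetical.

```python
import math
import random


def simulate(n, rr, baseline_risk=0.2, reps=1000, seed=2003):
    """Return (relative bias, CI coverage) for the crude RR estimator."""
    rng = random.Random(seed)
    estimates = []
    covered = 0
    for _rep in range(reps):
        a = n1 = c = n0 = 0
        for _subject in range(n):
            if rng.random() < 0.5:  # assigned to the exposure group
                n1 += 1
                if rng.random() < baseline_risk * rr:
                    a += 1
            else:
                n0 += 1
                if rng.random() < baseline_risk:
                    c += 1
        if a == 0 or c == 0:
            continue  # log RR is undefined; skip this replicate
        estimate = (a / n1) / (c / n0)
        se = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
        lower = estimate * math.exp(-1.96 * se)
        upper = estimate * math.exp(1.96 * se)
        estimates.append(estimate)
        if lower <= rr <= upper:
            covered += 1
    mean_estimate = sum(estimates) / len(estimates)
    return (mean_estimate - rr) / rr, covered / len(estimates)


relative_bias, coverage = simulate(n=200, rr=2.0)
print(relative_bias, coverage)
```

Under these assumptions the coverage should land near the nominal 95 percent. The values will not match table 2, which comes from a stratified design with settings not reproduced here.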
Simulation results shown in table 2 indicate that the relative bias of all point estimators decreases with increasing sample size. The results also demonstrate that, by any reasonable standard, the modified Poisson regression approach is very reliable in terms of both relative bias and confidence interval coverage, even with sample sizes as small as 100. As expected, ordinary Poisson regression produces very conservative confidence intervals for the relative risk, and the Mantel-Haenszel procedure also performs well. Binomial regression provides very satisfactory results, in agreement with the findings reported by Skov et al. (10). These results disagree, however, with those reported by McNutt et al. (7), who found that confidence intervals obtained from this model and from the Mantel-Haenszel procedure have less-than-nominal coverage levels.
ILLUSTRATIVE EXAMPLES |
---|
Applying the modified Poisson regression procedure results in an estimated risk of microalbuminuria in the control group that is 2.95 times that in the treatment group. Had the estimated odds ratio been interpreted as a relative risk, the risk would have been overestimated by 65 percent (4.87 vs. 2.95). The relative bias of the converted relative risk as obtained from the logistic regression model is 13 percent compared with the result obtained from using Poisson regression. The confidence interval provided by the ordinary Poisson regression approach is 31 percent wider than that obtained by using the sandwich error approach. Interestingly, the binomial regression procedure failed to converge until a variety of starting values had been tried; it finally converged with a starting value of 1.1 for the intercept. The resulting estimate of the relative risk for patients treated with standard therapy is 2.85 (95 percent confidence interval (CI): 1.56, 5.23), which is fairly compatible with that obtained from the modified Poisson regression procedure.
Now let us consider data from a randomized clinical trial conducted in 1997–1998 at 18 US trauma centers (20, 21). The primary objective of this trial was to determine whether additional infusion of 500–1,000 ml of diaspirin cross-linked hemoglobin during the initial hospital resuscitation period could reduce 28-day mortality in patients suffering from traumatic hemorrhagic shock. Ninety-eight patients were randomly assigned to diaspirin cross-linked hemoglobin or to a control (saline) treatment. Three risk subgroups were then defined according to the baseline trauma-related injury severity score, which was available for 93 patients, producing the data summarized in table 3. My aim was to estimate the risk of death for patients treated with diaspirin cross-linked hemoglobin relative to that for patients treated with saline. Application of the modified Poisson regression procedure results in an estimated relative risk of 2.30 (95 percent CI: 1.27, 4.15), very close to the result obtained by using the Mantel-Haenszel procedure, 2.28 (95 percent CI: 1.27, 4.09). Use of logistic regression analysis, on the other hand, results in an estimated odds ratio of 6.823 (95 percent CI: 1.776, 26.214). Thus, the estimated relative risk obtained by converting the odds ratio is 3.31 (95 percent CI: 1.55, 4.69), over 40 percent higher than the result obtained by using the standard Mantel-Haenszel procedure. The estimated relative risk from binomial regression is 1.94 (95 percent CI: 1.05, 3.59), somewhat smaller than that from the Mantel-Haenszel method.
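The reported intervals can be compared on the log scale with simple arithmetic. The sketch below assumes each interval is a Wald interval of the form exp(log RR ± 1.96 SE), which is an assumption about how the intervals were built. It uses only the estimates and limits quoted in the text, and the helper name `se_from_ci` is hypothetical.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(RR) implied by a log-scale Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# (estimate, lower limit, upper limit) as quoted in the text.
reported = {
    "modified Poisson": (2.30, 1.27, 4.15),
    "Mantel-Haenszel": (2.28, 1.27, 4.09),
    "binomial regression": (1.94, 1.05, 3.59),
    "converted odds ratio": (3.31, 1.55, 4.69),
}

summary = {}
for method, (estimate, lower, upper) in reported.items():
    # A log-scale Wald interval is centred at the geometric mean of its limits.
    centre = math.sqrt(lower * upper)
    summary[method] = (se_from_ci(lower, upper), centre, estimate)
    print(method, summary[method])

# Relative excess of the converted estimate over the Mantel-Haenszel estimate.
excess = 3.31 / 2.28 - 1
print(round(100 * excess))  # 45
```

For the first three methods, the geometric mean of the limits reproduces the point estimate to within rounding. The converted interval is centred well below 3.31, so it is not a symmetric log-scale Wald interval around the converted estimate.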
DISCUSSION |
---|
Although it is possible to obtain the adjusted relative risk from logistic regression analysis, the required computations are fairly tedious (22, 23). Naively converting the odds ratio may not produce a consistent estimate, a minimum statistical requirement. Interestingly, a similar problem has previously been pointed out when dealing with converting an adjusted odds ratio to a risk difference (24); this pitfall continues to be seen in calculating the "number needed to be exposed" (25), a variant of the number needed to be treated (26). Therefore, it may still be very relevant to revisit a statement made by Greenland more than 20 years ago: "... there is a danger that the ease of application of the [logistic] model will lead to the inadvertent exclusion from consideration of other, possibly more appropriate models for disease risk" (27, p. 693). Many alternative models allow the relative risk to be estimated directly. As one such alternative, I have introduced a modified Poisson regression procedure at least as flexible and powerful as binomial regression. The additional advantage of estimating relative risk by using a logarithm link is that the estimates are relatively robust to omitted covariates (28, 29), in contrast to logistic regression.
The robust error estimate is commonly used to deal with variance underestimation in correlated data analysis. I have applied this approach here to deal with variance overestimation when Poisson regression is applied to binary data. It is thus interesting to investigate the performance of this approach with correlated binary data that arise from longitudinal studies or a cluster randomization trial. This research is in progress.
ACKNOWLEDGMENTS |
---|
The author is indebted to Dr. Allan Donner for reviewing drafts of the paper.