Interaction as Departure from Additivity in Case-Control Studies: A Cautionary Note

Anders Skrondal 

From the Division of Epidemiology, Norwegian Institute of Public Health, Oslo, Norway.

Received for publication December 9, 2001; accepted for publication January 8, 2003.


    ABSTRACT
It has been argued that assessment of interaction should be based on departures from additive rates or risks. The corresponding fundamental interaction parameter cannot generally be estimated from case-control studies. Thus, surrogate measures of interaction based on relative risks from logistic models have been proposed, such as the relative excess risk due to interaction (RERI), the attributable proportion due to interaction (AP), and the synergy index (S). In practice, it is usually necessary to include covariates such as age and gender to control for confounding. The author uncovers two problems associated with surrogate interaction measures in this case: First, RERI and AP vary across strata defined by the covariates, whereas the fundamental interaction parameter is unvarying. S does not vary across strata, which suggests that it is the measure of choice. Second, a misspecification problem implies that measures based on logistic regression only approximate the true measures. This problem can be rectified by using a linear odds model, which also enables investigators to test whether the fundamental interaction parameter is zero. A simulation study reveals that coverage is much improved by using the linear odds model, but bias may be a concern regardless of whether logistic regression or the linear odds model is used.

additivity; case-control studies; epidemiologic methods; interaction

Abbreviations: AP, attributable proportion due to interaction; RERI, relative excess risk due to interaction; S, synergy index.


    INTRODUCTION
Logistic regression analysis is the workhorse of contemporary epidemiology. Consequently, assessment of interaction is often performed by simply introducing product terms into logistic risk models. This practice has been vehemently criticized by some epidemiologists, who argue that assessment of interaction should mainly be based on additive rate or risk models (1–7). For rare outcomes, this notion of interaction follows from probabilistic independence, as embodied in the classical toxicologic notion of "simple independent action" discussed by Finney (8). The purpose of this article is not to engage in the debate on how interaction should be conceptualized in epidemiology. Rather, I confine my investigation to the performance of suggested measures of interaction as departure from additivity.

In cohort studies, the desired interaction assessment can easily be accomplished by fitting linear rate or risk models. However, the parameters of linear models cannot be validly estimated for case-control studies unless the sampling fractions for cases and controls are known or can be estimated. On the other hand, it is well known that odds ratios can be estimated in case-control studies. Furthermore, relative risks are often well approximated by odds ratios in case-control studies.

On the basis of these observations, Rothman (1, 2) suggested a synergy index (S) which can be used in case-control studies to measure interaction as departure from additive risks. Moreover, Rothman considered statistical inference for the index, deriving confidence intervals using the delta method. Rothman presented several additional measures of interaction (3), including the relative excess risk due to interaction (RERI), renamed the ICR by Rothman and Greenland (6), and the attributable proportion due to interaction (AP), which is the focus in Rothman’s latest book (7). Rothman furthermore pointed out (3, p. 324) that estimates of RERI, AP, and S are easily obtained from logistic regression analysis, as are Wald tests and confidence intervals (9). Alternatively, a likelihood ratio test of additive risks could be performed in the logistic regression model. Although this test would be expected to have better properties than the Wald test, it would be much harder to implement.

Discussion of the measures advocated by Rothman is typically confined to the somewhat unrealistic situation in which there are two exposures but no additional covariates to control for confounding. An exception is Flanders and Rothman (10), who suggested a likelihood approach to estimating S from stratified case-control data. As Rothman acknowledged (3), their approach only handles one or possibly two additional covariates, because otherwise data in each stratum become too sparse. Hence, Rothman suggests invoking "multivariate methods" in estimating RERI, AP, and S when there are additional covariates. Specifically, Rothman states, "Confounding factors can be controlled by including terms for those factors in the multiple logistic model" (3, p. 324). This suggestion has been adhered to by epidemiologists (for instance, see Olsen et al. (11)).

There has been a paucity of studies investigating the performance of RERI, AP, and S. The only paper I am aware of is that of Assmann et al. (12), where the investigation was limited to coverage of confidence intervals for RERI and AP in models without additional covariates. The primary concern in this article is the extent to which RERI, AP, and S are useful summary measures of interaction as departure from additive risks. In addition to the conventional approach based on logistic regression, I also suggest an alternative approach based on linear odds models. Attention is focused on the more realistic setting in which there are additional covariates. However, the concepts are best introduced in a setting with two exposures and no additional covariates.


    MODELS FOR TWO EXPOSURES
Let Y be a dichotomous outcome variable with outcomes 1 and 0. Consider the case of two dichotomous exposure variables $x_1$ and $x_2$ with levels $j = 0, 1$ and $k = 0, 1$, respectively.

Let $R_{jk} \equiv P(Y = 1 \mid x_1, x_2)$ be the conditional risk or probability that the outcome variable Y takes the value 1 given the values of the exposures. For all j and k, define risk differences as $RD_{jk} \equiv R_{jk} - R_{00}$, relative risks as $RR_{jk} \equiv R_{jk}/R_{00}$, odds as $O_{jk} \equiv R_{jk}/(1 - R_{jk})$, and odds ratios as $OR_{jk} \equiv O_{jk}/O_{00}$.

The linear risk model
A linear risk model is now specified as

$$R_{jk} = a + b_1 x_1 + b_2 x_2 + b_3 x_1 x_2, \tag{1}$$

where it is assumed that $a > 0$, $b_1 > 0$, and $b_2 > 0$. It follows that $a = R_{00}$, $b_1 = R_{10} - R_{00} = RD_{10}$, and $b_2 = R_{01} - R_{00} = RD_{01}$. Hence, $a$ is interpreted as the risk when there is no exposure, $b_1$ as the excess risk under exposure $x_1$ (compared with no exposure whatsoever), and $b_2$ as the excess risk under exposure $x_2$. The parameter $b_3$ can be expressed as

$$b_3 = RD_{11} - RD_{10} - RD_{01} = R_{11} - R_{10} - R_{01} + R_{00},$$

representing the excess risk due to interaction of the exposures. If b3 = 0, RD11 = RD01 + RD10, which is risk-difference additivity. According to Rothman (3, p. 320), b3 is the most fundamental epidemiologic measure of interaction.
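As a minimal sketch of the interaction contrast just described, the following Python snippet computes b3 from the four risks. The function name and the numerical risks are hypothetical and serve only as an illustration.

```python
def interaction_contrast(r00, r10, r01, r11):
    """Return b3 = R11 - R10 - R01 + R00, the departure from additive risks."""
    return r11 - r10 - r01 + r00

# Hypothetical risks per 100,000 (illustrative values, not from the article).
additive = interaction_contrast(10, 30, 25, 45)       # 45 = 10 + 20 + 15, so b3 = 0
superadditive = interaction_contrast(10, 30, 25, 60)  # joint risk exceeds the additive prediction
print(additive, superadditive)
```

A value of zero corresponds to risk-difference additivity; a positive value indicates that the joint excess risk is larger than the sum of the separate excess risks.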

Unfortunately, the linear risk model cannot in general be validly estimated from case-control designs, unless the sampling fraction of cases and controls is known or can be estimated. Since this rarely appears to be the case, it follows that direct inference regarding the fundamental interaction parameter b3 cannot be performed in this case. This was the impetus for the development of the surrogate interaction measures RERI, AP, and S.

The logistic risk model
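A logistic risk model is specified as

$$\ln\left(\frac{R_{jk}}{1 - R_{jk}}\right) = \alpha + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2. \tag{2}$$

Note that the parameters $\alpha$, $\beta_1$, $\beta_2$, and $\beta_3$ are different from the corresponding parameters $a$, $b_1$, $b_2$, and $b_3$ in the linear risk model. The model can alternatively be expressed as

$$R_{jk} = \frac{\exp(\alpha + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2)}{1 + \exp(\alpha + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2)}.$$

Often $RR_{jk} \approx OR_{jk}$, giving $RR_{10} \approx \exp(\beta_1)$, $RR_{01} \approx \exp(\beta_2)$, and $RR_{11} \approx \exp(\beta_1 + \beta_2 + \beta_3)$. If $\beta_3 = 0$, $RR_{11} = RR_{01} \times RR_{10}$ is obtained, which is relative-risk multiplicativity.

Importantly, the logistic model can be employed for case-control designs under reasonable assumptions (13). Regarding the parameters, the only difference is that the intercept now becomes

$$\alpha^* = \alpha + \ln\left(\frac{\phi_1}{\phi_0}\right),$$

where $\phi_1$ and $\phi_0$ are the sampling fractions of cases and controls, respectively.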
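The effect of case-control sampling on the logistic parameters can be checked numerically. The sketch below assumes that selection depends only on disease status; the parameter values, sampling fractions, and function names are hypothetical. Under that assumption the intercept shifts by the log ratio of the sampling fractions while the exposure coefficient is unchanged.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def case_control_risk(p, phi1, phi0):
    """P(Y = 1 | sampled) when cases are sampled with probability phi1
    and controls with probability phi0 (selection depends on Y only)."""
    return phi1 * p / (phi1 * p + phi0 * (1 - p))

alpha, beta1 = -7.0, 0.9    # hypothetical population logistic parameters
phi1, phi0 = 0.8, 0.001     # hypothetical sampling fractions

p0 = 1 / (1 + math.exp(-alpha))             # population risk, unexposed
p1 = 1 / (1 + math.exp(-(alpha + beta1)))   # population risk, exposed

s0 = logit(case_control_risk(p0, phi1, phi0))
s1 = logit(case_control_risk(p1, phi1, phi0))

intercept_shift = s0 - alpha    # compare with log(phi1 / phi0)
slope_in_sample = s1 - s0       # compare with beta1
print(intercept_shift, math.log(phi1 / phi0), slope_in_sample)
```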


    MEASURES OF INTERACTION
Several measures of interaction have been suggested that can serve as surrogates for the fundamental interaction parameter $b_3$ in the linear risk model, including RERI, AP, and S. The basic idea is that indirect statistical inference regarding $b_3$, including calculation of confidence intervals and testing, can be based on $\widehat{RERI}$, $\widehat{AP}$, or $\hat{S}$ from logistic modeling in case-control designs.

Relative excess risk due to interaction
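Rothman defines RERI (3, p. 323) as

$$RERI \equiv RR_{11} - RR_{10} - RR_{01} + 1 = \frac{R_{11} - R_{10} - R_{01} + R_{00}}{R_{00}}.$$

RERI can be interpreted as the excess risk due to interaction relative to the risk without exposure. Rothman suggests substituting estimated approximate risk ratios $\widehat{RR}_{10}$, $\widehat{RR}_{01}$, and $\widehat{RR}_{11}$ from the logistic risk model. Under our parameterization of the logistic risk model (equation 2), this leads to

$$\widehat{RERI} = \exp(\hat{\beta}_1 + \hat{\beta}_2 + \hat{\beta}_3) - \exp(\hat{\beta}_1) - \exp(\hat{\beta}_2) + 1. \tag{3}$$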

Attributable proportion due to interaction
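Rothman defines AP (3, p. 321) as

$$AP \equiv \frac{RR_{11} - RR_{10} - RR_{01} + 1}{RR_{11}} = \frac{RERI}{RR_{11}}.$$

AP is interpreted as the attributable proportion of disease which is due to interaction among persons with both exposures. However, this interpretation does not make sense under negative interaction ($b_3 < 0$), since the proportion would then be negative.

Substituting the estimated approximate risk ratios from the logistic risk model gives us

$$\widehat{AP} = \frac{\exp(\hat{\beta}_1 + \hat{\beta}_2 + \hat{\beta}_3) - \exp(\hat{\beta}_1) - \exp(\hat{\beta}_2) + 1}{\exp(\hat{\beta}_1 + \hat{\beta}_2 + \hat{\beta}_3)}. \tag{4}$$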

Synergy index
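Rothman defines S (3, p. 322) as

$$S \equiv \frac{RR_{11} - 1}{(RR_{10} - 1) + (RR_{01} - 1)}.$$

S can be interpreted as the excess risk from exposure (to both exposures) when there is interaction relative to the excess risk from exposure (to both exposures) without interaction.

Substituting the estimated approximate risk ratios from the logistic risk model (equation 2) gives us

$$\hat{S} = \frac{\exp(\hat{\beta}_1 + \hat{\beta}_2 + \hat{\beta}_3) - 1}{\exp(\hat{\beta}_1) + \exp(\hat{\beta}_2) - 2}. \tag{5}$$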
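The three surrogate measures can be computed from logistic coefficients in a few lines. The following Python sketch treats exponentiated coefficients as approximate relative risks, as described above; the function name and the coefficient values are hypothetical.

```python
import math

def surrogate_measures(beta1, beta2, beta3):
    """RERI, AP, and S from logistic coefficients, treating odds ratios
    as approximate relative risks (rare-outcome assumption)."""
    rr10 = math.exp(beta1)
    rr01 = math.exp(beta2)
    rr11 = math.exp(beta1 + beta2 + beta3)
    reri = rr11 - rr10 - rr01 + 1
    ap = reri / rr11
    # S is undefined when the two single-exposure excess relative risks sum to zero.
    s = (rr11 - 1) / ((rr10 - 1) + (rr01 - 1))
    return reri, ap, s

# Hypothetical coefficients chosen so that RR10 = 2, RR01 = 3, and RR11 = 7.
reri, ap, s = surrogate_measures(math.log(2), math.log(3), math.log(7 / 6))
print(reri, ap, s)
```

With these inputs the excess relative risk under joint exposure is 6, against 1 + 2 = 3 under additivity, so the measures indicate positive departure from additivity.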


    MODELS INCLUDING ADDITIONAL COVARIATES
Covariates are included in most epidemiologic models to control for confounding. I still consider two dichotomous exposures, but now I also include a dichotomous covariate z, which is coded 1 or 0. This definition of z is chosen for simplicity; there may of course be a vector of additional covariates, including both categorical and continuous covariates.

Let $R_{jkz} \equiv P(Y = 1 \mid x_1, x_2, z)$ be the conditional risk of Y taking the value 1 given covariates. Define stratum-specific risk differences as $RD_{jkz} \equiv R_{jkz} - R_{00z}$, relative risks as $RR_{jkz} \equiv R_{jkz}/R_{00z}$, odds as $O_{jkz} \equiv R_{jkz}/(1 - R_{jkz})$, and odds ratios as $OR_{jkz} \equiv O_{jkz}/O_{00z}$.

The linear risk model
Consider a linear risk model with an additional covariate, where there is interaction among exposures but not between the exposures and the additional covariate:

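$$R_{jkz} = a + b_1 x_1 + b_2 x_2 + b_3 x_1 x_2 + gz, \tag{6}$$

where $a > 0$, $b_1 > 0$, $b_2 > 0$, and $g > 0$. Here $a = R_{000}$ and $g = R_{001} - R_{000}$; $a$ is the risk under no exposure when $z = 0$, whereas $g$ represents the excess risk when $z = 1$ (compared with $z = 0$). Hence, the risk when there is no exposure can be expressed as $a + gz$; note that it depends on the value taken by the additional covariate. Irrespective of the value of z, it follows that $b_1 = R_{10z} - R_{00z} = RD_{10z}$, $b_2 = R_{01z} - R_{00z} = RD_{01z}$, and $b_3 = R_{11z} - R_{10z} - R_{01z} + R_{00z} = RD_{11z} - RD_{10z} - RD_{01z}$. It also follows that

$$RR_{10z} = \frac{a + b_1 + gz}{a + gz}, \quad RR_{01z} = \frac{a + b_2 + gz}{a + gz}, \quad RR_{11z} = \frac{a + b_1 + b_2 + b_3 + gz}{a + gz}. \tag{7}$$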

Note that RR10z, RR01z, and RR11z are functions of the covariate z, in contrast to the risk differences.

The logistic risk model
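A logistic risk model with an additional covariate, where there is interaction among exposures but not between exposures and the covariate, is specified as

$$\ln\left(\frac{R_{jkz}}{1 - R_{jkz}}\right) = \alpha + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \gamma z. \tag{8}$$

It follows that $OR_{00z} = 1$, $OR_{10z} = \exp(\beta_1)$, $OR_{01z} = \exp(\beta_2)$, and $OR_{11z} = \exp(\beta_1 + \beta_2 + \beta_3)$. When $RR_{jkz} \approx OR_{jkz}$,

$$RR_{10z} \approx \exp(\beta_1), \quad RR_{01z} \approx \exp(\beta_2), \quad RR_{11z} \approx \exp(\beta_1 + \beta_2 + \beta_3). \tag{9}$$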

Hence, the relative risks implied by the logistic risk model do not depend on the covariate z, in contrast to the linear case. On the other hand, risk differences depend on the covariates, unlike the case in the linear risk model.


    PROBLEMS WITH ADDITIONAL COVARIATES
There are two problems associated with using surrogates for the fundamental interaction parameter b3 when there are additional covariates.

The uniqueness problem
Noting that the interaction parameter of interest b3 is invariant across the strata defined by the covariates z, I investigate whether this also applies for the surrogate measures.

Consider RERI for a given value of the covariates z. Substituting for the relative risk from the true linear risk model (equation 7) gives us

$$RERI_z = RR_{11z} - RR_{10z} - RR_{01z} + 1 = \frac{b_3}{a + gz},$$

demonstrating that the magnitude of RERI generally depends on the values of z. In contrast, Rothman’s suggestion of including additional covariates in the logistic model would produce a single $\widehat{RERI}$, given in equation 3, where $\hat{\beta}_1$, $\hat{\beta}_2$, and $\hat{\beta}_3$ are now estimates from the logistic model (equation 8) including the covariate but no interactions between the covariate and either of the exposures or their product. Hence, there is clearly a tension between the suggested estimator, based on the implicit assumption that there is one measure to be estimated, and the fact that there are several unknown measures. The exception is when there is no interaction, $b_3 = 0$, since $RERI_z = 0$ in this case, whatever the value of z. Also note that RERI retains the sign of $b_3$, since $a + gz > 0$.

Regarding AP, substitution for the relative risk from the true linear risk model (equation 7) produces

$$AP_z = \frac{RERI_z}{RR_{11z}} = \frac{b_3}{a + b_1 + b_2 + b_3 + gz},$$

and there is a different AP for each stratum defined by the covariates, unless $b_3 = 0$. Following Rothman’s strategy, on the other hand, a single AP would be estimated as in equation 4, with estimates substituted from the logistic model with covariate (equation 8).

For S, substituting for the relative risk from the linear risk model (equation 7) gives us a unique measure

$$S = \frac{RR_{11z} - 1}{(RR_{10z} - 1) + (RR_{01z} - 1)} = \frac{b_1 + b_2 + b_3}{b_1 + b_2},$$

which does not depend on the covariate z. Analogous to the case without additional covariates, Rothman suggests estimating S using equation 5, with estimates substituted from equation 8. S does not suffer from the uniqueness problem when additional covariates are included, in contrast to RERI and AP, which suggests that S is the surrogate measure of choice.

The misspecification problem
If a logistic model is used in estimation of the surrogate interaction measures, specified with interaction among exposures (but not between exposures and additional covariates), the model is misspecified in the sense that it does not produce a relative risk identical to that of the corresponding true linear model when there are additional covariates. This is evident from noting that the relative risk from the logistic model (equation 9) does not depend on the value of the covariate z, whereas the relative risk from the linear model in equation 7 does. Hence, RERI, AP, and S based on the logistic risk model with an additional covariate (equation 8) only approximate the true measures from the corresponding linear risk model (equation 6). This stands in contrast to the case with only two exposures, where the logistic and linear models are both "saturated" (both have as many parameters as conditional probabilities) and produce identical relative risks (and hence RERI, AP, and S). An important implication is that the estimated logistic model cannot be used to check the validity of the linear model, since a linear model without interaction between exposures and covariate implies interaction in the logistic model.
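The last point can be illustrated numerically. The sketch below evaluates the log odds ratio for joint exposure in each covariate stratum under a linear risk model with no exposure-covariate interaction. The parameter values are borrowed from the simulation study described later in the article (moderate interaction, strong covariate effect); the function names are hypothetical.

```python
import math

def risk(x1, x2, z, a, b1, b2, b3, g):
    """Linear risk model with one additional covariate."""
    return a + b1 * x1 + b2 * x2 + b3 * x1 * x2 + g * z

def log_odds_ratio(x1, x2, z, **params):
    """Log odds ratio for exposure pattern (x1, x2) versus (0, 0) in stratum z."""
    def odds(u1, u2):
        r = risk(u1, u2, z, **params)
        return r / (1 - r)
    return math.log(odds(x1, x2) / odds(0, 0))

params = dict(a=0.0001, b1=0.0001, b2=0.0001, b3=0.0001, g=0.001)

lor_z0 = log_odds_ratio(1, 1, 0, **params)
lor_z1 = log_odds_ratio(1, 1, 1, **params)
print(lor_z0, lor_z1)
```

The two log odds ratios differ markedly, so a logistic model reproducing these risks would need product terms between the exposures and z even though the linear risk model has none.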


    RECTIFYING THE MISSPECIFICATION PROBLEM
Using a linear odds model

$$O_{jkz} = a^* + b_1^* x_1 + b_2^* x_2 + b_3^* x_1 x_2 + g^* z \tag{10}$$

enables us to estimate $a^* = ka$, $b_1^* = kb_1$, $b_2^* = kb_2$, $b_3^* = kb_3$, and $g^* = kg$ based on a case-control study (6, pp. 418–419; 14). The linear odds model is a misspecified version of the linear risk model (equation 6) in the sense that the parameters $a$, $b_1$, $b_2$, $b_3$, and $g$ of the latter model are recovered up to a proportionality factor $k$. This proportionality misspecification has two important implications: First, it follows that hypotheses specifying that parameters of linear risk models are zero can be tested, particularly the hypothesis of no departure from additive risks $b_3 = 0$ by testing $b_3^* = 0$ in the model shown by equation 10. Second, the surrogate measures of interaction as departure from additivity can be validly estimated from the linear odds model. Considering S,

$$S = \frac{b_1^* + b_2^* + b_3^*}{b_1^* + b_2^*} = \frac{kb_1 + kb_2 + kb_3}{kb_1 + kb_2} = \frac{b_1 + b_2 + b_3}{b_1 + b_2}. \tag{11}$$

Note that the unknown proportionality factor cancels out. Thus, although the linear odds model is a misspecified version of the linear risk model, no misspecification problem is involved in obtaining S (or RERIz and APz), in contrast to the approach based on logistic regression. However, the uniqueness problem involving RERI and AP persists, suggesting that linear odds modeling of S is the method of choice in assessing interaction as departure from additivity in case-control studies with additional covariates. The linear odds model can be fitted in software packages such as STATA, EPICURE, and SAS (a reparameterization is available in EGRET).
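The cancellation of the proportionality factor is easy to confirm numerically. In the sketch below, the parameter values are those of the linear risk model used in the article's example section (per 100,000), and the values of k are arbitrary, hypothetical scaling factors.

```python
def synergy_index(b1, b2, b3):
    """S expressed in terms of linear risk (or linear odds) parameters."""
    return (b1 + b2 + b3) / (b1 + b2)

b1, b2, b3 = 100.0, 40.0, 40.0   # example-section values (per 100,000)

# Rescaling all parameters by an unknown factor k leaves S unchanged.
for k in (0.002, 1.0, 37.5):
    print(k, synergy_index(k * b1, k * b2, k * b3))
```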


    EXAMPLE
Consider two dichotomous exposures, cigarette smoking (x1) and coffee drinking (x2), and the additional dichotomous covariate gender (z). The outcome variable indicates whether or not the subject experienced myocardial infarction. Interest concerns the interaction between cigarette smoking and coffee drinking.

To ease presentation, I now let risk and risk difference be expressed as number of cases per 100,000. That is, a risk of 0.0004 is written as 40. Remember that only (approximate) relative risks are generally available from case-control studies, and inference must hence be based on these.

I let z be coded 1 if male and 0 if female. A linear risk model with interaction between exposures but no interactions between the covariate and the exposures is specified:

$$R_{jkz} = a + b_1 x_1 + b_2 x_2 + b_3 x_1 x_2 + gz = 10 + 100x_1 + 40x_2 + 40x_1 x_2 + 90z.$$

This setup is exhibited in table 1.


TABLE 1. Example of a linear risk model for myocardial infarction with the exposures smoking and coffee drinking and the additional covariate of gender
 
For females, RERI = 19 – 11 – 5 + 1 = 4 and AP = (19 – 11 – 5 + 1)/19 = 0.21; for males, RERI = 2.8 – 2 – 1.4 + 1 = 0.4 and AP = (2.8 – 2 – 1.4 + 1)/2.8 = 0.14. In contrast, S attains the same value for both genders: S = (19 – 1)/[(11 – 1) + (5 – 1)] = (2.8 – 1)/[(2 – 1) + (1.4 – 1)] = 1.29.
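The stratum-specific quantities quoted above can be recomputed directly from the stated linear risk model. The following Python sketch does so; the function names are hypothetical, while the coefficients are those given for the example.

```python
def risk(x1, x2, z):
    """Risk per 100,000 from the example's linear risk model."""
    return 10 + 100 * x1 + 40 * x2 + 40 * x1 * x2 + 90 * z

def measures(z):
    """Relative risks and the surrogate measures RERI, AP, and S in stratum z."""
    r00 = risk(0, 0, z)
    rr10 = risk(1, 0, z) / r00
    rr01 = risk(0, 1, z) / r00
    rr11 = risk(1, 1, z) / r00
    reri = rr11 - rr10 - rr01 + 1
    ap = reri / rr11
    s = (rr11 - 1) / ((rr10 - 1) + (rr01 - 1))
    return rr10, rr01, rr11, reri, ap, s

for z, label in ((0, "female"), (1, "male")):
    rr10, rr01, rr11, reri, ap, s = measures(z)
    print(f"{label}: RR10={rr10:.1f} RR01={rr01:.1f} RR11={rr11:.1f} "
          f"RERI={reri:.1f} AP={ap:.2f} S={s:.2f}")
```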

The example illustrates the problems previously uncovered. Although the fundamental interaction parameter $b_3$ is invariant over gender, the surrogates RERI and AP both vary across gender. S is the only adequate measure, attaining a unique value for both gender strata. Regarding the misspecification problem, the relative risks for males in table 1 do not equal those for females, as would be the case for a logistic model without interaction between exposures and covariate. A hypothetical case-control study can be obtained from the table by letting the figures reported in the "Risk" column represent cases and considering 500 controls in each group. If logistic regression were used, the estimates $\widehat{RERI} = 0.80$, $\widehat{AP} = 0.18$, and $\hat{S} = 1.31$ would be obtained, whereas using the linear odds approach produces $\hat{S} = 1.29$.


    SIMULATION STUDY
In each replication of the study, a cohort is initially simulated from a linear risk model with an additional covariate (equation 6). All covariates are dichotomous, with 50 percent of individuals in each category. It is crucial to ensure that covariates are appropriately correlated in simulation studies in epidemiology, since such correlations are standard in observational studies. I have specified a "typical" Pearson correlation of 0.3 among exposures and between exposures and the additional covariate. This leads to a correlation of 0.32 between the additional covariate and the interaction term (the product of the exposure dummies) and a correlation of 0.69 between either exposure and the interaction term. I subsequently produce a case-control study by randomly sampling 500 cases and 500 controls from the cohort.
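The two derived correlations quoted above can be checked analytically. The sketch below assumes, beyond what the text states, that the joint distribution of the three dichotomous variables is unchanged when the codes 0 and 1 are swapped (as would hold, for example, for latent normal variables dichotomized at their means); this assumption fixes the three-way probability. Variable names are hypothetical.

```python
import math

p = 0.5     # prevalence of each dichotomous variable (from the text)
rho = 0.3   # pairwise Pearson correlation (from the text)

# P(both = 1) for any pair, from Cov = rho * p * (1 - p).
p11 = p * p + rho * p * (1 - p)

# ASSUMPTION: swapping 0 and 1 leaves the joint distribution unchanged,
# so P(1,1,1) = P(0,0,0); inclusion-exclusion then determines P(1,1,1).
p111 = (1 - 3 * p + 3 * p11) / 2

sd_x = math.sqrt(p * (1 - p))            # SD of a single dichotomous variable
sd_prod = math.sqrt(p11 * (1 - p11))     # SD of the product x1 * x2

# Cov(x1, x1*x2) = E[x1*x2] - E[x1] * E[x1*x2]
corr_exposure_product = (p11 - p * p11) / (sd_x * sd_prod)
# Cov(z, x1*x2) = E[z*x1*x2] - E[z] * E[x1*x2]
corr_covariate_product = (p111 - p * p11) / (sd_x * sd_prod)

print(round(corr_exposure_product, 2), round(corr_covariate_product, 2))
```

Under the stated symmetry assumption, the computed values round to the correlations reported in the text.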

On the basis of the resulting case-control data, I first consider the approach advocated by Rothman, basing inference regarding RERI, AP, and S on the logistic risk model (equation 8). Ninety-five percent confidence intervals for all measures are obtained as described by Hosmer and Lemeshow (9). I then consider the performance of the alternative approach based on fitting the linear odds model (equation 10). The Wald test of $H_0$: $b_3^* = 0$, which is also a test of the hypothesis that the fundamental interaction parameter $b_3$ is zero, is investigated. The actual rejection probability at the nominal level of 5 percent represents the actual significance level when $H_0$ is true and the power of the test otherwise. The performance of point estimates of S and corresponding 95 percent confidence intervals obtained via the delta method are also studied. Confidence intervals for S are not part of the standard output from linear odds modeling; therefore, I demonstrate in the Appendix how a calculator or spreadsheet can be used to obtain these. Since RERI and AP suffer from the uniqueness problem when there are additional covariates, I do not consider inference regarding these measures based on the linear odds model.

The nine scenarios investigated are presented in the left-hand portion of table 2. Throughout, I specify $a = b_1 = b_2 = 0.0001$ but consider several scenarios for the interaction parameter $b_3$ and the covariate effect $g$. Regarding the magnitude of interaction, no interaction ($b_3 = 0$), a moderate positive interaction ($b_3 = 0.0001$), and a strong positive interaction ($b_3 = 0.001$) are studied. Regarding the covariate effect, I consider no effect ($g = 0$), a moderate effect ($g = 0.0001$), and a strong effect ($g = 0.001$) on disease. The corresponding values of the interaction measures are given, where RERI and AP are given subscripts designating the strata defined by z.
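The true values of the measures in each scenario follow from the linear risk model parameters. As a rough sketch, the Python snippet below evaluates RERI and AP in both strata and S for the nine combinations of b3 and g; the function name is hypothetical, and the order in which combinations are printed is not claimed to match the scenario numbering in table 2.

```python
from itertools import product

def true_measures(a, b1, b2, b3, g, z):
    """RERI_z, AP_z, and S implied by the linear risk model with covariate z."""
    baseline = a + g * z                   # risk with no exposure in stratum z
    reri = b3 / baseline
    ap = b3 / (baseline + b1 + b2 + b3)    # denominator is the risk under joint exposure
    s = (b1 + b2 + b3) / (b1 + b2)         # does not involve z
    return reri, ap, s

a = b1 = b2 = 0.0001                       # values stated in the text
levels = (0.0, 0.0001, 0.001)              # none, moderate, strong

for b3, g in product(levels, levels):
    reri0, ap0, s = true_measures(a, b1, b2, b3, g, z=0)
    reri1, ap1, _ = true_measures(a, b1, b2, b3, g, z=1)
    print(f"b3={b3:<7} g={g:<7} RERI0={reri0:.2f} RERI1={reri1:.2f} "
          f"AP0={ap0:.2f} AP1={ap1:.2f} S={s:.2f}")
```

RERI and AP coincide across strata only when b3 = 0 or g = 0, whereas S is the same in both strata for every combination.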


TABLE 2. Scenarios and performance of interaction measures in a simulation study with 500 cases and 500 controls and 1,000 replications per scenario*
 
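Each of the scenarios was replicated 1,000 times. The logistic model (equation 8) and the linear odds model (equation 10) were used for each replication. Let $\hat{S}_r$ be the estimated S in replication r of a scenario. The mean estimate is defined as

$$\bar{S} = \frac{1}{1{,}000} \sum_{r=1}^{1{,}000} \hat{S}_r,$$

the variance as the empirical variance of the $\hat{S}_r$ about $\bar{S}$,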

and coverage as the fraction of the 1,000 95 percent confidence intervals including the true S. Analogous definitions apply for RERI and AP, but note that coverage cannot be defined when these measures vary across strata. For each scenario, the mean estimates and variances are reported in the right-hand portion of table 2, and the coverage of the 95 percent confidence intervals is reported when applicable.
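The three performance summaries can be computed with a short helper. This is a minimal sketch: the function name and toy inputs are hypothetical, and the use of the n − 1 denominator (as implemented by `statistics.variance`) is an assumption, since the exact variance formula is not stated here.

```python
from statistics import mean, variance

def summarize(estimates, intervals, true_value):
    """Mean estimate, variance across replications, and confidence interval coverage.

    `estimates` holds one point estimate per replication; `intervals` holds the
    matching (lower, upper) confidence limits.
    """
    covered = sum(lo <= true_value <= hi for lo, hi in intervals)
    # ASSUMPTION: statistics.variance divides by n - 1.
    return mean(estimates), variance(estimates), covered / len(intervals)

# Toy input with three replications (hypothetical numbers).
m, v, c = summarize([1.2, 1.5, 1.8],
                    [(0.9, 1.6), (1.1, 2.0), (1.6, 2.2)],
                    true_value=1.5)
print(m, v, c)
```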

Considering the performance of inference based on logistic regression, it is evident that RERI and AP are very problematic under scenarios 5, 6, 8, and 9, where there is not a unique measure. The evidence for bias in estimating RERI and AP for the remaining scenarios is statistically significant, except for scenarios 4 and 7, respectively, where p > 0.05. Bias in estimating S is significant for scenarios 1, 4, 5, and 6. However, the estimated bias is fairly tolerable in magnitude for all unique measures, apart from scenarios 6 and 9 for S. Regarding precision, $\hat{S}$ did not perform satisfactorily for scenarios 6, 7, 8, and 9. This is due to its construction as a fraction, often producing very large absolute values when the denominator by chance approaches zero. Coverage was generally quite dismal, and it grew worse (more discrepant from 95 percent) as the interaction and the magnitude of the covariate effect increased.

From a theoretical point of view, the linear odds model is the model of choice for estimating S. Interestingly, the results from the simulations are somewhat mixed. Regarding coverage, the performance of the linear odds approach is good, and it clearly outperforms the logistic approach. When it comes to estimation, the evidence of bias in $\hat{S}$ from the linear odds model is significant ($p \le 0.05$) for all scenarios except 3, 6, and 9 (lack of significance for the latter is due to extreme imprecision). Disappointingly, the estimated bias is generally somewhat more pronounced than for the logistic approach. The variances of the estimates are also generally higher for the linear odds model than for the logistic model, leading to larger mean squared errors. The nominal significance level for testing the fundamental interaction parameter $b_3$ in the linear odds model is reasonably well recovered. Observe that the power is low when the interaction parameter is of the same magnitude as the main effects, notwithstanding that there are as many as 500 cases and 500 controls. The power also appears to decrease as the covariate effect increases.

As expected, all measures perform fairly well in terms of bias when there is no covariate effect (scenarios 1, 4, and 7). Results based on the logistic and linear odds models differ because the estimated models are misspecified by inclusion of the covariate z. Identical results would be obtained for both models if the estimated models were correctly specified by omitting the covariate.

A simulation study with smaller samples, 250 cases and 250 controls, was also conducted. The results were similar but a bit more pronounced and are not reported here.


    CONCLUSION
I strongly endorse the notion that interaction assessment should be governed by the conceptualization of interaction. Logistic regression is appropriate if interaction is taken as departure from relative-risk multiplicativity, regardless of whether additional covariates are included. Given a conceptualization of interaction as departure from additive risks or rates, making direct inferences regarding the fundamental interaction parameter b3 would be preferred. Unfortunately, this is usually not possible in case-control studies. Hence, surrogate measures of interaction as departure from additivity such as RERI, AP, and S that can be estimated from case-control studies have been proposed. Estimation of the measures on the basis of logistic regression is appropriate for assessment of interaction as departure from additivity of risks in case-control studies when there are no additional covariates. This approach is problematic in practice, however, where additional covariates are usually included to control for confounding. A uniqueness problem arises because the surrogates RERI and AP vary across strata defined by the additional covariates, in contrast to the unique interaction parameter of interest b3. S, on the other hand, does not suffer from this problem, which suggests that it is the measure of choice in assessing interaction as departure from additivity in case-control studies that include additional covariates. A misspecification problem arises because the logistic model is no longer equivalent to the linear risk model when there are additional covariates. This problem can theoretically be rectified by using a linear odds model instead, and simulations reveal that coverage is much improved in comparison with the logistic approach. However, bias in estimating surrogate measures can be a problem regardless of whether logistic regression or the linear odds model is used. 
An advantage of the linear odds approach is that it enables us to test the hypothesis of interest b3 = 0 directly, without using surrogate measures, but the power appears to be rather low.

I conclude that considerable caution should be exercised in assessing interaction as departure from additivity in case-control studies with additional covariates.


    ACKNOWLEDGMENTS
 
The author acknowledges Drs. K. J. Rothman, S. O. Samuelsen, and L. C. Stene for their helpful comments.


    APPENDIX
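Confidence intervals for S are not part of the standard output from linear odds modeling. Hence, a calculator or spreadsheet can be used to obtain confidence intervals based on the parameter estimates $\hat{b}_1^*$, $\hat{b}_2^*$, and $\hat{b}_3^*$ and the estimated variances and covariances of these parameter estimates $\widehat{\mathrm{var}}(\hat{b}_1^*)$, $\widehat{\mathrm{var}}(\hat{b}_2^*)$, $\widehat{\mathrm{var}}(\hat{b}_3^*)$, $\widehat{\mathrm{cov}}(\hat{b}_1^*, \hat{b}_2^*)$, $\widehat{\mathrm{cov}}(\hat{b}_1^*, \hat{b}_3^*)$, and $\widehat{\mathrm{cov}}(\hat{b}_2^*, \hat{b}_3^*)$.

S based on the linear odds model was given in equation 11 as

$$S = \frac{b_1^* + b_2^* + b_3^*}{b_1^* + b_2^*},$$

from which it follows that

$$\ln S = \ln(b_1^* + b_2^* + b_3^*) - \ln(b_1^* + b_2^*).$$

Since S is a fraction, the coverage properties of a confidence interval for ln S are likely to be superior. The estimated standard error of $\ln \hat{S}$, $\widehat{SE}(\ln \hat{S})$, can be obtained using the multivariate delta method (15) as

$$\widehat{SE}(\ln \hat{S}) = \sqrt{h_1^2 \left[\widehat{\mathrm{var}}(\hat{b}_1^*) + \widehat{\mathrm{var}}(\hat{b}_2^*) + 2\,\widehat{\mathrm{cov}}(\hat{b}_1^*, \hat{b}_2^*)\right] + h_3^2\, \widehat{\mathrm{var}}(\hat{b}_3^*) + 2 h_1 h_3 \left[\widehat{\mathrm{cov}}(\hat{b}_1^*, \hat{b}_3^*) + \widehat{\mathrm{cov}}(\hat{b}_2^*, \hat{b}_3^*)\right]},$$

where

$$h_1 = \frac{1}{\hat{b}_1^* + \hat{b}_2^* + \hat{b}_3^*} - \frac{1}{\hat{b}_1^* + \hat{b}_2^*}$$

and

$$h_3 = \frac{1}{\hat{b}_1^* + \hat{b}_2^* + \hat{b}_3^*}.$$

An approximate 95 percent confidence interval for S will then have the lower confidence limit

$$\exp\left[\ln \hat{S} - 1.96\, \widehat{SE}(\ln \hat{S})\right]$$

and the upper confidence limit

$$\exp\left[\ln \hat{S} + 1.96\, \widehat{SE}(\ln \hat{S})\right].$$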
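The calculator or spreadsheet computation can equally be sketched in Python. The gradient terms below are derived here from ln S = ln(b1* + b2* + b3*) − ln(b1* + b2*); the function name, argument names, and all numerical inputs are hypothetical, and the sketch presumes that both sums of estimates are positive.

```python
import math

def synergy_ci(b1, b2, b3, v11, v22, v33, c12, c13, c23, z=1.96):
    """Point estimate and delta-method confidence limits for S, computed on the log scale.

    b1, b2, b3 are linear odds estimates; v.. and c.. are their estimated
    variances and covariances. Requires b1 + b2 > 0 and b1 + b2 + b3 > 0.
    """
    total = b1 + b2 + b3
    main = b1 + b2
    s = total / main
    h1 = 1 / total - 1 / main   # derivative of ln S with respect to b1 (and b2)
    h3 = 1 / total              # derivative of ln S with respect to b3
    var_ln_s = (h1 ** 2 * (v11 + v22 + 2 * c12)
                + h3 ** 2 * v33
                + 2 * h1 * h3 * (c13 + c23))
    se = math.sqrt(var_ln_s)
    return s, s * math.exp(-z * se), s * math.exp(z * se)

# Hypothetical estimates and covariance entries, for illustration only.
s, lower, upper = synergy_ci(0.2, 0.08, 0.08,
                             v11=0.004, v22=0.002, v33=0.005,
                             c12=0.0005, c13=-0.002, c23=-0.001)
print(s, lower, upper)
```

Working on the log scale keeps both limits positive, which suits a ratio measure such as S.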


    NOTES
 
Reprint requests to Dr. Anders Skrondal, Department of Epidemiology, Norwegian Institute of Public Health, P.O. Box 4404 Nydalen, N-0403 Oslo, Norway (e-mail: anders.skrondal@fhi.no).


    REFERENCES

  1. Rothman KJ. Synergy and antagonism in cause-effect relationships. Am J Epidemiol 1974;99:385–8.
  2. Rothman KJ. The estimation of synergy or antagonism. Am J Epidemiol 1976;103:506–11.
  3. Rothman KJ. Modern epidemiology. 1st ed. Boston, MA: Little, Brown and Company, 1986.
  4. Koopman JS. Interaction between discrete causes. Am J Epidemiol 1981;113:716–24.
  5. Rothman KJ, Greenland S, Walker AM. Concepts of interaction. Am J Epidemiol 1980;112:467–70.
  6. Rothman KJ, Greenland S. Modern epidemiology. 2nd ed. Philadelphia, PA: Lippincott Williams and Wilkins, 1998.
  7. Rothman KJ. Epidemiology: an introduction. Oxford, United Kingdom: Oxford University Press, 2002.
  8. Finney DJ. Probit analysis. 3rd ed. Cambridge, United Kingdom: Cambridge University Press, 1971.
  9. Hosmer DW, Lemeshow S. Confidence interval estimation of interaction. Epidemiology 1992;3:452–6.
  10. Flanders WD, Rothman KJ. Interaction of alcohol and tobacco in laryngeal cancer. Am J Epidemiol 1982;115:371–9.
  11. Olsen AO, Dillner J, Skrondal A, et al. Combined effect of smoking and human papillomavirus in cervical carcinogenesis. Epidemiology 1998;9:346–9.
  12. Assmann SF, Hosmer DW, Lemeshow S, et al. Confidence intervals for measures of interaction. Epidemiology 1996;7:286–90.
  13. Farewell VT. Some results on the estimation of logistic models based on retrospective data. Biometrika 1979;66:27–32.
  14. Greenland S. Multivariate estimation of exposure-specific incidence from case-control studies. J Chronic Dis 1981;34:445–53.
  15. Serfling RJ. Approximation theorems of mathematical statistics. London, United Kingdom: John Wiley and Sons, 1980.