RE: "FLEXIBLE MATCHING STRATEGIES TO INCREASE POWER AND EFFICIENCY TO DETECT AND ESTIMATE GENE-ENVIRONMENT INTERACTIONS IN CASE-CONTROL STUDIES"

Walter Schill1 and Pascal Wild2

1 Bremer Institut für Präventionsforschung und Sozialmedizin (BIPS), Universität Bremen, Linzer Str. 8–10, D-28359 Bremen, Germany
2 Department of Epidemiology, Institut National de Recherche et de Sécurité pour la Prévention des Accidents du Travail et des Maladies Professionnelles (INRS), Avenue de Bourgogne, 54501 Vandoeuvre Cedex, France

To increase the power and efficiency of case-control studies exploring gene-environment interaction, Stürmer and Brenner advocated "flexible matching strategies" (1, pp. 600–601) with varying proportions of a matching factor among selected controls. In essence, they advocate biased sampling of controls with respect to environmental exposure and develop a measure, DM (degree of matching), that describes deviations from population frequencies among controls and cases. To demonstrate the advantage of their proposal, they perform a series of simulations under a variety of scenarios (comprising the joint distribution of environmental exposure and genetic trait, disease-model parameters, and "degree of matching") and estimate the variance of the interaction term in the logistic model, which they then compare with the variance that would have been obtained from a simple case-control study.

We show in this letter that an optimal design does not depend on the parameters of the disease model—which one would prefer to leave unspecified when planning a study—and give a simple formula for an optimal design. Simply balancing exposure among controls turns out to be nearly optimal.

The aim is to obtain, for given numbers of cases and controls, designs with minimal variance of the interaction term $\beta_I$ in gene-environment interaction studies, where gene ($G$) and environment ($E$) are dichotomous and the following logistic model is used:

$$\operatorname{logit}\,\Pr(D = 1 \mid G, E) = \alpha + \beta_G G + \beta_E E + \beta_I EG.$$

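Such a study has the structure of a 2 × 4 table with counts $n_{dge}$, where $d$ indexes disease status ($d = 1$ for cases, $d = 0$ for controls) and $g$ and $e$ index the levels of $G$ and $E$. It is known that the variance of the interaction term $\beta_I$ is given by the sum of the reciprocals of all table entries:

$$\operatorname{var}(\beta_I) = \sum_{d=0}^{1} \sum_{g=0}^{1} \sum_{e=0}^{1} \frac{1}{n_{dge}}.$$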
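As an illustration of the sum-of-reciprocals rule, the following Python sketch computes the variance from the eight cells of a 2 × 4 table. The function name, the dictionary layout keyed by (g, e), and the cell counts are hypothetical choices made for this example only; they are not taken from any study.

```python
def interaction_variance(controls, cases):
    """Sum of reciprocals of all eight cells of the 2 x 4 table.

    controls, cases: dicts mapping (g, e), with g and e in {0, 1},
    to the observed cell count in that stratum.
    """
    cells = [(g, e) for g in (0, 1) for e in (0, 1)]
    counts = [controls[c] for c in cells] + [cases[c] for c in cells]
    if any(n <= 0 for n in counts):
        # The reciprocal sum is undefined when a cell is empty.
        raise ValueError("all eight cell counts must be positive")
    return sum(1.0 / n for n in counts)


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
controls = {(0, 0): 40, (1, 0): 10, (0, 1): 40, (1, 1): 10}
cases = {(0, 0): 30, (1, 0): 10, (0, 1): 40, (1, 1): 20}

variance = interaction_variance(controls, cases)
print(f"variance of the interaction term: {variance:.4f}")
```

Because each cell enters only through its reciprocal, the smallest cells dominate the variance, which is why the allocation of controls across exposure strata matters.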

It is immediately apparent that biased sampling of the controls affects only the cell frequencies among controls; the frequencies among cases are not affected. Therefore, to minimize $\operatorname{var}(\beta_I)$ for fixed numbers of cases and controls, it suffices to consider the counts among controls.

If we denote by $\tau_0$ the prevalence of $G$ among the nonexposed and by $\tau_1$ the prevalence of $G$ among the exposed, it can be shown that the variance is minimal if the ratio $\rho$ of exposed to nonexposed controls is equal to

$$\rho^* = \sqrt{\frac{\tau_0 (1 - \tau_0)}{\tau_1 (1 - \tau_1)}}.$$
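One way to arrive at the optimal ratio is sketched below. It assumes that the control cell counts can be replaced by their expected values; the symbols $N_0$ (total number of controls), $m_0$ and $m_1$ (numbers of nonexposed and exposed controls), and $V_0$ (the control contribution to the variance) are notation introduced here for illustration.

```latex
\begin{align*}
m_0 &= \frac{N_0}{1 + \rho}, \qquad m_1 = \frac{\rho N_0}{1 + \rho}, \qquad \rho = \frac{m_1}{m_0}, \\
V_0(\rho) &= \frac{1}{m_0 \tau_0} + \frac{1}{m_0 (1 - \tau_0)}
           + \frac{1}{m_1 \tau_1} + \frac{1}{m_1 (1 - \tau_1)} \\
          &= \frac{1 + \rho}{N_0}
             \left[ \frac{1}{\tau_0 (1 - \tau_0)} + \frac{1}{\rho\, \tau_1 (1 - \tau_1)} \right], \\
\frac{dV_0}{d\rho} &= \frac{1}{N_0}
             \left[ \frac{1}{\tau_0 (1 - \tau_0)} - \frac{1}{\rho^2 \tau_1 (1 - \tau_1)} \right] = 0
  \quad\Longrightarrow\quad
  \rho^* = \sqrt{\frac{\tau_0 (1 - \tau_0)}{\tau_1 (1 - \tau_1)}}.
\end{align*}
```

The second derivative of $V_0$ is positive for $\rho > 0$, so the stationary point is a minimum. Only $\tau_0$ and $\tau_1$ enter the result; the disease-model parameters do not.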

An immediate consequence is that if $\mathrm{OR}_{EG} = 1$, then $\tau_0 = \tau_1$ and $\rho^* = 1$; that is, the optimal numbers of exposed and nonexposed controls are equal (balanced design). Further, note that a simple case-control study, where the proportion of exposed controls is $P_E$, is characterized by $\rho = P_E/(1 - P_E)$.

When the above variance formula for the interaction parameter $\beta_I$ is applied to the basic scenarios of tables 1 and 3 in Stürmer and Brenner’s paper (1), it appears that using the optimal $\rho^*$ improves the variance of $\beta_I$ by less than 1 percent over a balanced design.
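A comparison of this kind can be explored numerically. The Python sketch below contrasts the control contribution to the variance under three allocations: the ratio $\rho^* = \sqrt{\tau_0 (1 - \tau_0) / [\tau_1 (1 - \tau_1)]}$, a balanced design ($\rho = 1$), and a simple case-control study ($\rho = P_E/(1 - P_E)$). It assumes that control cell counts equal their expected values. The values of `tau0`, `tau1`, `p_e`, and `n_controls` are hypothetical and are not the scenarios of Stürmer and Brenner's tables; the function names are likewise illustrative.

```python
import math


def optimal_ratio(tau0, tau1):
    """Ratio of exposed to nonexposed controls minimizing the control variance."""
    return math.sqrt(tau0 * (1 - tau0) / (tau1 * (1 - tau1)))


def control_variance(rho, tau0, tau1, n_controls):
    """Sum of reciprocals of the four expected control cells for a given rho."""
    n_exposed = n_controls * rho / (1 + rho)
    n_nonexposed = n_controls - n_exposed
    # 1/(n*tau) + 1/(n*(1 - tau)) simplifies to 1/(n*tau*(1 - tau)).
    return (1 / (n_nonexposed * tau0 * (1 - tau0))
            + 1 / (n_exposed * tau1 * (1 - tau1)))


# Hypothetical planning values (assumptions for illustration only).
tau0, tau1 = 0.2, 0.3   # prevalence of G among nonexposed and exposed
p_e = 0.1               # proportion exposed in a simple case-control sample
n_controls = 500

rho_star = optimal_ratio(tau0, tau1)
var_opt = control_variance(rho_star, tau0, tau1, n_controls)
var_bal = control_variance(1.0, tau0, tau1, n_controls)
var_simple = control_variance(p_e / (1 - p_e), tau0, tau1, n_controls)

print(f"rho* = {rho_star:.3f}")
print(f"optimal  : {var_opt:.5f}")
print(f"balanced : {var_bal:.5f}")
print(f"simple   : {var_simple:.5f}")
```

The sketch covers only the control part of the variance; the case cells add the same constant under every allocation, so the ranking of the three designs is unchanged when they are included.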

Note that, for a given value of $\rho$, which specifies the actual proportion of exposed controls, the degree of matching, DM, depends additionally on the unknown disease parameters. The DM design characterization is therefore unsuited to the present framework. However, the results of the simulations in Stürmer and Brenner’s paper (1) are valid and show the gain in power that can be obtained by choosing the proportion of exposed controls appropriately. In this letter, we showed that the ideal proportion is close to 50 percent.

REFERENCES

  1. Stürmer T, Brenner H. Flexible matching strategies to increase power and efficiency to detect and estimate gene-environment interactions in case-control studies. Am J Epidemiol 2002;155:593–602.