Departments of 1 Statistics, 2 Epidemiology, 3 Clinical Studies, 4 Radiobiology/Molecular Epidemiology, Radiation Effects Research Foundation, Hiroshima, Japan
5 Unit of Environmental Epidemiology, National Public Health Institute, Finland
6 Radiation Epidemiology Branch, National Cancer Institute, US
Correspondence: John B Cologne, Department of Statistics, Radiation Effects Research Foundation, 5-2 Hijiyama Park, Minami-ku, Hiroshima 732-0815, Japan. E-mail: cologne@rerf.or.jp
Abstract
Methods We estimated the power to detect interaction using four control-to-case ratios (1:1, 2:1, 4:1, and 8:1) in a planned case-control study of the joint effect of atomic bomb radiation exposure and serum oestradiol levels on breast cancer. Radiation dose is measured in the entire cohort, but because neither serum oestradiol level nor the true degree of interaction was known, we simulated values of oestradiol and hypothetical levels of oestradiol-radiation interaction.
Results Compared with random sampling, power to detect interaction was similarly higher with either matching or counter matching with two or more controls.
Conclusions Because counter matching is generally at least as efficient as random sampling, whereas matching on exposure can result in loss of efficiency and precludes estimation of exposure risk, we recommend counter matching for selecting controls in nested case-control studies of the joint effects of multiple risk factors when one is previously measured in the full cohort.
Accepted 17 December 2003
In cohort studies aimed at investigating the effects of already measured risk factors (such as radiation exposure in the survivors of the atomic bombings of Hiroshima and Nagasaki, Japan), nested case-control studies may be conducted to analyse the effects of additional factors that cannot be assessed practically in the entire cohort. The purpose might be to study the effects of potential confounders or effect-modifying factors on the exposure risk. If the exposure is rare or has a skewed distribution, ignoring it in selecting controls can lead to a loss of statistical efficiency, so exposure-based methods of control selection might be considered.
As an example, although radiation and oestradiol are both strong risk factors for breast cancer, only radiation dose is known for atomic bomb survivors. The relative risk of early-onset breast cancer (diagnosis under age 35) for 1 Sv of radiation is 14 among women irradiated by the atomic bombs before age 20; the overall relative risk of breast cancer ranges from 2.3 to 3.4 among all women exposed under the age of 40 [1]. Key, Verkasalo, and Banks showed that serum oestradiol is positively associated with risk of breast cancer, with risks being about twice as high for postmenopausal women with high, as opposed to low, serum oestradiol concentrations [2]. Little is known, however, about oestradiol levels and the risk of pre-menopausal breast cancer. Furthermore, the joint effect of radiation and oestradiol has not been studied, although Land et al. [3] demonstrated interactions between radiation and breast cancer risk factors that may be related to constitutional hormone levels. At the Radiation Effects Research Foundation (RERF), we are conducting a study of radiation and oestradiol as joint risk factors for pre-menopausal breast cancer using stored sera obtained from atomic bomb survivors who participated in biennial clinical examinations conducted for RERF's Adult Health Study. Oestradiol, which is expensive to assay and requires sera, for which supplies are limited, will be measured in all cases but only a subset of controls. This raises the issue of how best to select controls to provide maximum statistical efficiency.
Selecting controls by individually matching them to cases on radiation exposure can improve statistical efficiency for testing interaction with another factor [4]. If there is evidence of interaction or possible confounding in the case-control sample, a logical next step in the analysis would be to examine how exposure risk estimates vary with the level of the other factor. Matching on exposure allows studying the effect of the confounder/effect modifier per se but precludes studying its effect on the exposure risk without additional information, such as comes from the cohort [3]. An alternative to matching is weighted sampling of controls using counter matching, where controls are selected to fill exposure strata not occupied by the case [5–8]. Counter matching also allows estimation of the exposure risk, and the efficiency for studying both confounding and effect modification can be improved relative to random sampling of controls [9]. Furthermore, counter matching allows the investigator to fix the number of controls in advance and is easily implemented with prospective, risk-set based selection. Many of the references on these designs provide justification and intuitive explanation as to why exposure-based sampling is efficient.
Counter matching has been shown to generate better efficiency for testing interaction than matching over a wide range of exposure risks and degrees of correlation between exposure and another risk factor when both are dichotomous [8], but comparisons have not been made for continuous risk factors, such as radiation dose and oestradiol. The proposed breast cancer study provides a basis for making that comparison in the case of a rare exposure that may interact positively with an additional factor, the situation in which matching achieves the greatest gain in efficiency [4]. Our objective was to assess the extent to which matching and counter matching impact statistical power for detecting interaction relative to random control selection.
Subjects and Methods
To assess the power of the study for detecting interaction, we simulated oestradiol levels and their interaction with radiation. We then calculated the resulting power using one, two, four, or eight controls selected from case risk-sets using three approaches: (1) random sampling, (2) matching as closely as possible on radiation exposure, or (3) counter matching on radiation exposure.
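The text does not spell out how power was computed from the simulated data. One common approach, sketched here purely as an assumption, is Monte Carlo estimation: simulate many studies, test the interaction coefficient in each, and report the fraction of rejections. The names `estimate_power` and `toy_z` are hypothetical, and `toy_z` is only a stand-in for "simulate one study, fit the model, and return the Wald statistic for the interaction term".

```python
import random
from statistics import NormalDist


def estimate_power(simulate_z, n_rep=2000, alpha=0.05, seed=1):
    """Fraction of simulated studies whose two-sided Wald test rejects H0."""
    rng = random.Random(seed)
    # Two-sided critical value of the standard Normal distribution.
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = sum(abs(simulate_z(rng)) > crit for _ in range(n_rep))
    return rejections / n_rep


def toy_z(rng, effect=0.5, se=0.2):
    """Hypothetical stand-in: draw an interaction estimate and return its z value."""
    estimate = rng.gauss(effect, se)
    return estimate / se


power = estimate_power(toy_z)
print(f"estimated power: {power:.2f}")
```

In a real application `toy_z` would be replaced by a function that draws controls under one of the three sampling schemes and fits the interaction model to the resulting sample.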
Selecting controls by random sampling is simple. Within risk sets defined by age, date, and availability of stored serum, controls are selected at random, without regard for exposure status. Matching on radiation exposure is also straightforward; within each risk set, the potential controls whose exposure values are closest to the case's exposure are selected. If there are more tied values among the potential controls than the number needed, the controls are selected at random from among the tied subjects. Note, however, that matching on exposure in addition to matching on other factors can be more complicated [12].
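The two selection rules described above can be sketched as follows. This is an illustration only: the record layout (dictionaries with `id` and `dose` keys) and the function names are assumptions, and the risk set is taken to contain only the potential controls for one case.

```python
import random


def select_random(risk_set, m, rng):
    """Pick m controls at random, without regard for dose."""
    return rng.sample(risk_set, m)


def select_matched(risk_set, case_dose, m, rng):
    """Pick the m controls whose dose is closest to the case's dose."""
    # Shuffle a copy first: the sort is stable, so subjects tied on
    # distance to the case dose end up in random order among themselves.
    candidates = risk_set[:]
    rng.shuffle(candidates)
    candidates.sort(key=lambda s: abs(s["dose"] - case_dose))
    return candidates[:m]


rng = random.Random(0)
risk_set = [{"id": i, "dose": d}
            for i, d in enumerate([0.0, 0.0, 0.0, 0.5, 1.0, 1.1, 2.0])]
print(select_random(risk_set, 2, rng))
print(select_matched(risk_set, case_dose=1.05, m=2, rng=rng))
```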
With counter matching, exposure strata are defined based on the number of controls to be selected per case and on the distribution of exposure among the cases. In each risk set one control is selected from each of the exposure strata not occupied by the case. In the study described here, there were many people with dose zero; we therefore defined the lowest exposure category to be zero. The other categories were determined by the appropriate percentiles of the distribution of exposure values among the exposed cases. For example, with eight controls per case there were nine exposure strata: zero plus eighths defined by octiles of the case exposure values (cutpoints: 0.14, 0.45, 0.66, 0.79, 1.15, 1.50, and 2.04 Sv; Supplementary Material: Distribution of Radiation Doses). With four controls per case there were five strata: zero plus fourths defined by quartiles of the case exposures (cutpoints: 0.45, 0.79, and 1.50 Sv; Figure 1).
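As a rough sketch of the stratum definition and selection just described, using the cutpoints quoted above: the function names and record layout are hypothetical, the treatment of a dose exactly equal to a cutpoint (assigned to the lower stratum) is an assumption, and so is the choice to skip a stratum that has no eligible control.

```python
import bisect
import random

# Cutpoints in Sv as quoted in the text.
CUTPOINTS_4 = [0.45, 0.79, 1.50]                          # four controls per case
CUTPOINTS_8 = [0.14, 0.45, 0.66, 0.79, 1.15, 1.50, 2.04]  # eight controls per case


def dose_stratum(dose, cutpoints):
    """Return 0 for zero dose, otherwise 1..len(cutpoints)+1."""
    if dose == 0:
        return 0
    # Assumption: a dose equal to a cutpoint falls in the lower stratum.
    return 1 + bisect.bisect_left(cutpoints, dose)


def select_counter_matched(risk_set, case, cutpoints, rng):
    """Pick one control from each dose stratum not occupied by the case."""
    case_stratum = dose_stratum(case["dose"], cutpoints)
    n_strata = len(cutpoints) + 2  # zero stratum plus the quantile strata
    controls = []
    for j in range(n_strata):
        if j == case_stratum:
            continue
        candidates = [s for s in risk_set
                      if dose_stratum(s["dose"], cutpoints) == j]
        if candidates:  # assumption: an empty stratum is simply skipped
            controls.append(rng.choice(candidates))
    return controls


rng = random.Random(0)
risk_set = [{"id": i, "dose": d}
            for i, d in enumerate([0.0, 0.0, 0.2, 0.3, 0.6, 1.0, 1.2, 2.0])]
case = {"id": 99, "dose": 1.0}
print(select_counter_matched(risk_set, case, CUTPOINTS_4, rng))
```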
Because radiation dose and case status were already known in the cohort, we simulated the interaction by adjusting the mean of the Normal distribution from which case log10-oestradiol values were generated so that the average case-control mean difference in log10-oestradiol level increased linearly with radiation dose, while log10-oestradiol values among the controls were generated independently of radiation dose. The dose-dependent case log10-oestradiol means were calculated to produce log odds ratios (OR) according to the following model:
[Equation not recoverable from the source.]
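One way to generate values consistent with this description is sketched below; it is not necessarily the authors' procedure. It relies on a standard result: if a covariate is Normal with the same standard deviation s in cases and controls, the log odds ratio per unit of the covariate equals the case-control mean difference divided by s². A mean difference that is linear in dose therefore yields a log OR for log10-oestradiol that is linear in dose. All parameter names and default values (`mu`, `sd`, `beta_e`, `beta_int`) are hypothetical.

```python
import random


def case_mean_shift(dose, sd, beta_e, beta_int):
    """Case-control mean difference giving log OR = beta_e + beta_int * dose
    per unit of log10-oestradiol, for equal-variance Normal distributions."""
    return sd ** 2 * (beta_e + beta_int * dose)


def simulate_log10_e2(is_case, dose, rng, mu=2.0, sd=0.3,
                      beta_e=1.0, beta_int=0.5):
    """Draw one log10-oestradiol value (all defaults are illustrative only)."""
    # Controls are generated independently of dose; only cases are shifted.
    shift = case_mean_shift(dose, sd, beta_e, beta_int) if is_case else 0.0
    return rng.gauss(mu + shift, sd)


rng = random.Random(0)
print(simulate_log10_e2(True, 2.0, rng))   # a case exposed to 2 Sv
print(simulate_log10_e2(False, 2.0, rng))  # a control exposed to 2 Sv
```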
Statistical analysis of simulated data
We analysed the counter-matched design according to Langholz and Borgan [5], using conditional logistic regression with sampling weights as offsets [15]. (An offset is added to the logistic regression model by entering it as a covariate with coefficient fixed at 1.) Counter-matched sampling weights were calculated separately for each risk set (i, i = 1,...,I), and exposure stratum (j, j = 1,...,J) as:
[Equation not recoverable from the source.]
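A sketch of how such weights might be computed, assuming the usual counter-matching form in which the weight for stratum j of risk set i is the number of cohort subjects at risk in that stratum (including the case) divided by the number sampled from it. With one subject sampled per stratum the weight is simply the stratum count. The function name and input layout are hypothetical; the offset entered in the regression is the log of the weight.

```python
import math
from collections import Counter


def counter_matching_offsets(risk_set_strata, sampled_strata):
    """Return {stratum: log(n / m)} for a single risk set.

    risk_set_strata: stratum label of every cohort subject in the risk set,
                     including the case.
    sampled_strata:  stratum label of every sampled subject (case and controls).
    """
    n = Counter(risk_set_strata)  # subjects at risk per stratum
    m = Counter(sampled_strata)   # subjects sampled per stratum
    return {j: math.log(n[j] / m[j]) for j in m}


# Five strata (zero dose plus quartiles) with one subject sampled from each.
strata = [0] * 50 + [1] * 5 + [2] * 3 + [3] * 2 + [4] * 1
print(counter_matching_offsets(strata, [0, 1, 2, 3, 4]))
```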
We studied effect modification via statistical interaction by fitting the model:
[Equation not recoverable from the source.]
Counter-matched case-control samples were analysed using conditional logistic regression with weights as described above. Matched and randomly selected case-control samples were analysed using conditional logistic regression without weights. Cohort data were analysed using unconditional logistic regression. All analyses were conducted using Epicure (Hirosoft Inc., Seattle, Washington).
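To make the role of the offsets concrete, here is a minimal sketch of a conditional logistic log-likelihood for sampled risk sets. It is not the Epicure implementation; the function name and data layout are assumptions. Each sampled set lists the case first, and each member carries a covariate tuple (for example dose, log10-oestradiol, and their product) and an offset. Setting all offsets to zero gives the unweighted analysis used for the matched and randomly sampled designs.

```python
import math


def conditional_loglik(beta, sampled_sets):
    """Conditional logistic log-likelihood with offsets.

    sampled_sets: list of sampled risk sets; each is a list of
                  (covariates, offset) pairs with the case in position 0.
    """
    total = 0.0
    for members in sampled_sets:
        # Linear predictor plus offset (offset coefficient fixed at 1).
        scores = [sum(b * x for b, x in zip(beta, covs)) + off
                  for covs, off in members]
        # Log-sum-exp over the sampled set, shifted for numerical stability.
        top = max(scores)
        log_denom = top + math.log(sum(math.exp(s - top) for s in scores))
        total += scores[0] - log_denom
    return total


# One counter-matched set: the case in a stratum of 50 subjects,
# one control in a stratum containing a single subject.
example = [((1.0, 2.0, 2.0), math.log(50)), ((0.0, 2.0, 0.0), 0.0)]
print(conditional_loglik([0.0, 0.0, 0.0], [example]))
```

Maximizing this function over `beta` with any general-purpose optimizer would give the coefficient estimates; the third coefficient is the interaction term.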
Results
Conclusions and Discussion
The risk factors in our investigation were assumed to have positive interaction and one was a rare exposure already measured in the full cohort. This is the situation where matching performs best; in more general situations (i.e., when the two risk factors do not interact positively or when the matching factor is not rare), matching can lead to a loss of efficiency [4]. Counter matching generally improves statistical efficiency for studying interaction [9] and, unlike matching, further allows studying the exposure risk with adjustment for the other factor measured only in the case-control sample. Counter matching using sampling within risk-set strata is no more difficult to perform than matching on exposure, which can be complicated when additional risk-set matching factors are involved [11], and both strategies require the use of conditional logistic regression. Counter matching additionally requires sampling weights, which are calculated from the numbers of cohort subjects in each risk-set exposure stratum. Being able to examine the adjusted exposure risk would usually outweigh the extra effort involved in calculating the weights. We conclude that counter matching, and not matching, should generally be used to increase efficiency if a nested case-control study of joint effects is planned when one risk factor is known in the cohort.
When calculating the power of a study involving sampling from a cohort, two issues deserve consideration: power of the full cohort and power of the study design. There is little point in selecting a subset to investigate interaction if even the cohort is too small or the effect too weak to provide sufficient statistical power. If the cohort has sufficient power, then the question becomes what type and size of design will provide the greatest possible efficiency within the limitations of financial cost, time, biological specimen availability, and other considerations. Breslow and Day point out that some sampling of cohort risk sets can generally be performed with little loss of efficiency [18]. We have not considered the trade-off between cost and benefit here (see, for example, Reilly [19]), but in designing studies investigators must decide how much of the cohort power they are willing to sacrifice to achieve the necessary logistical savings. We have not investigated all possible designs, but for two approaches to nested case-control selection with fixed risk-set size, we have demonstrated that using counter matching can allow the researcher to achieve the same level of efficiency using about half as many controls as would be needed if controls were selected randomly.
The nested case-control design allows repeated selection of subjects in different risk sets; even cases can serve as controls in risk sets prior to their disease onset. Thus, there can be greater efficiency (in terms of number of subjects) depending on how many subjects are selected repeatedly by chance. In our application, the number of potential controls was large compared with the number of risk sets, so the probability of repeated selection was small. The total number of subjects needed for a study will depend on this ratio as well as on the random draw of subjects. With matching, the number of potential controls at rare levels of exposure is limited and may lead to repeated selection. However, when counter matching is used, because dose strata are defined by quantiles, repeated selection is not likely to occur except for very large control:case ratios.
In studies with exposure known in the entire cohort, there is additional information on exposure risk in the non-selected subjects. Two-stage designs can improve efficiency [20–22]. Langholz and Goldstein [23] proposed a likelihood for analysing the case-control data only using a proportional odds model with multi-stage sampling. Land and others [3] proposed a method for incorporating the cohort risk estimate into the analysis of case-control subsets matched on exposure using more general risk models. There are also alternatives to the nested case-control design with counter matching. Borgan et al. addressed exposure-stratified selection in the case-cohort design [24]. Randomized recruitment as an alternative to counter matching can also result in efficiency gains [25]. Much remains to be done to synthesize the various designs and methods of analysis, but that is beyond the scope of the present work.
Because the present investigation was based on our interest in effect modification, we had to speculate as to what form it might take in order to simulate study power. Huang et al. reported that risk of breast cancer for medical irradiation to the chest in pre-/peri-menopausal women tended to be associated with oestrogen-receptor negative tumours [26], suggesting that mechanisms other than those dependent on hormonal exposure may be involved. Radiation might cause additional genetic alterations that result in more rapid progression of breast cancer associated with oestrogen receptor negative phenotype. Oestrogen receptor negative breast cancer cells have been reported to be relatively resistant to IL-6 induced apoptosis [27], so they may be more proliferative. If radiation induced alterations in signal transduction systems that were independent of the oestrogen receptor signalling system, then the joint effect of radiation and oestradiol could be multiplicative. If such alterations were dependent on the oestrogen receptor signalling system, then the joint effect could be greater than multiplicative. On the other hand, radiation exposure may lead to early onset of menopause [28], which could indirectly reduce the risk of breast cancer by decreasing the duration of exposure to constitutional oestrogens. Therefore, in studying the joint effects of radiation and oestradiol, it is important to have sufficient power to detect or rule out interaction on the multiplicative scale to facilitate the planning of in-depth mechanistic studies.
Because these possibilities for interaction are mostly speculative, there was, in the present study, no basis to assume any particular type of effect modification between radiation and oestradiol. We therefore studied several arbitrary degrees of statistical interaction using a log-linear model. Effect modification can take other forms, including interaction on an additive scale. Such statistical interactions have been defined as effect-measure modification as distinguished from true effect modification, or biological interaction [29,30], which implies that the joint effect of multiple risk factors exceeds the sum of their individual risks. In the analysis of data from a nested case-control study, one should consider alternatives to the standard log-linear logistic-regression model for the joint effect of multiple risk factors, such as additive or mixture models [31].
In summary, we have demonstrated that matching and counter-matching on a known, continuous exposure variable provide equal gains in statistical power in a nested case-control study of risk-factor interaction with a control:case ratio of at least 2:1. However, matching on exposure prevents studying the effect of exposure after adjusting for one or more other risk factors which might confound or modify the exposure risk, study aspects that counter matching addresses with greater efficiency than random sampling. We conclude that counter matching is superior to both matching on exposure and random control selection for nested case-control studies of effect modification when there is a known exposure.
Acknowledgments
References
2 Key TJ, Verkasalo PK, Banks E. Epidemiology of breast cancer. Lancet Oncol 2001;2:133–40.
3 Land CE, Hayakawa N, Machado SG et al. A case-control interview study of breast cancer among Japanese A-bomb survivors. II. Interactions with radiation dose. Cancer Causes Control 1994;5:167–76.
4 Thomas DC, Greenland S. The efficiency of matching in case-control studies of risk-factor interactions. J Chron Dis 1985;38:569–74.
5 Langholz B, Borgan Ø. Counter-matching: A stratified nested case-control sampling method. Biometrika 1995;82:69–79.
6 Cologne JB. Counterintuitive matching (editorial). Epidemiology 1997;8:227–29.
7 Steenland K, Deddens JA. Increased precision using countermatching in nested case-control studies. Epidemiology 1997;8:238–42.
8 Cologne J, Langholz B. Selecting controls for assessing interaction in nested case-control studies. J Epidemiol 2003;13:184–93.
9 Langholz B, Goldstein L. Risk set sampling in epidemiologic cohort studies. Statist Sci 1996;11:35–53.
10 Sharp GB, Neriishi K, Hakoda M et al. A nested case-control study of breast and endometrial cancer in the cohort of Japanese atomic bomb survivors. Research Protocol 6-02: Radiation Effects Research Foundation, Hiroshima, Japan; 2002.
11 Roesch WC (ed.). US-Japan Joint Reassessment of Atomic Bomb Radiation Dosimetry in Hiroshima and Nagasaki. Hiroshima, Japan: Radiation Effects Research Foundation, 1975.
12 Cologne JB, Shibata Y. Optimal case-control matching in practice. Epidemiology 1995;6:221–25.
13 Kabuto M, Akiba S, Stevens RG, Neriishi K, Land CE. A prospective study of estradiol and breast cancer in Japanese women. Cancer Epidemiol Biomarkers Prev 2000;9:575–79.
14 Verkasalo PK, Thomas HV, Appleby PN, Davey GK, Key TJ. Circulating levels of endogenous hormones and their relations to risk factors for breast cancer: a cross-sectional study in 1092 pre- and postmenopausal women (United Kingdom). Cancer Causes Control 2001;12:47–59.
15 McCullagh P, Nelder JA. Generalized Linear Models. 2nd Edn. London: Chapman and Hall, 1989.
16 Greenland S. Tests for interaction in epidemiologic studies: a review and a study of power. Stat Med 1983;2:243–51.
17 Smith PG, Day NE. The design of case-control studies: The influence of confounding and interaction effects. Int J Epidemiol 1984;13:356–65.
18 Breslow NE, Day NE. Statistical Methods in Cancer Research. Volume II: The Design and Analysis of Cohort Studies. Lyon: International Agency for Research on Cancer, 1987, pp. 200–02.
19 Reilly M. Optimal sampling strategies for two-stage studies. Am J Epidemiol 1996;143:92–100.
20 Breslow NE, Cain KC. Logistic regression for two stage case-control data. Biometrika 1988;75:11–20.
21 Zhao LP, Lipsitz S. Designs and analysis of two-stage studies. Stat Med 1992;11:769–82.
22 Breslow NE, Chatterjee N. Design and analysis of two-phase studies with binary outcome applied to Wilms tumour prognosis. Appl Statist 1999;48:457–68.
23 Langholz B, Goldstein L. Conditional logistic analysis of case-control studies with complex sampling. Biostatistics 2001;2:63–84.
24 Borgan Ø, Langholz B, Samuelsen SO, Goldstein L, Pogoda J. Exposure stratified case-cohort designs. Lifetime Data Analysis 2000;6:39–58.
25 Weinberg CR, Wacholder S. The design and analysis of case-control studies with biased sampling. Biometrics 1990;46:963–75.
26 Huang W-Y, Newman B, Millikan RC, Schnell MJ, Hulka BS, Moorman PG. Hormone-related factors and risk of breast cancer in relation to estrogen receptor and progesterone receptor status. Am J Epidemiol 2000;151:703–14.
27 Chiu JJ, Sgagias MK, Cowan KH. Interleukin 6 acts as a paracrine growth factor in human mammary carcinoma cell lines. Clin Cancer Res 1996;2:215–21.
28 Soda M, Cologne J. Radiation-accelerated age at menopause. RERF Update 1993;5:5–6.
29 Rothman KJ. Epidemiology: An Introduction. Oxford: Oxford University Press, 2002; p. 170.
30 Greenland S, Rothman KJ. Concepts of interaction. In: Rothman KJ, Greenland S (eds). Modern Epidemiology. Philadelphia: Lippincott Williams & Wilkins, 1998; pp. 329–42.
31 Thomas DC. General relative-risk models for survival time and matched case-control analysis. Biometrics 1981;37:673–86.