1 Center for Clinical Epidemiology and Biostatistics and Department of Epidemiology and Biostatistics, University of Pennsylvania, Philadelphia, PA.
2 Department of Preventive Medicine, University of Southern California, Los Angeles, CA.
3 Division of Cancer Prevention and Control, Centers for Disease Control and Prevention, Atlanta, GA.
4 Division of Global Migration and Quarantine, Centers for Disease Control and Prevention, Atlanta, GA.
5 Division of Reproductive Health, Centers for Disease Control and Prevention, Atlanta, GA.
6 Fred Hutchinson Cancer Research Center, Seattle, WA.
7 Population Studies and Prevention Program, Karmanos Cancer Institute at Wayne State University, Detroit, MI.
Received for publication September 30, 2002; accepted for publication January 31, 2003.
ABSTRACT
breast neoplasms; case-control studies; mammography; mass screening; sensitivity and specificity
Abbreviations: CARE, Contraceptive and Reproductive Experiences; CI, confidence interval.
INTRODUCTION
Case-control studies of the efficacy of screening mammography also require detailed and accurate mammography histories because past screening of cases and controls during the detectable preclinical period is the exposure of interest. This period (sojourn time) is the time prior to clinical diagnosis during which a screening test can detect cancer; for mammography, it is estimated to be 1–4 years (13–15).
To our knowledge, there have been no case-control studies of the efficacy of screening mammography based on self-report. Past case-control studies have used deceased patients (16, 17), obtaining screening histories through medical or screening program records, or relying on the presence or absence of population-based screening programs in the geographic areas in which the study participants lived (16).
We are conducting a case-control interview study of screening efficacy among women aged 40–64 years within a large, population-based case-control study of risk factors for incident breast cancer, the National Institute of Child Health and Human Development Women's Contraceptive and Reproductive Experiences (CARE) Study (18). By using a hybrid design entailing follow-up of all cases for mortality from breast cancer (19), we will be able to assess how efficacious screening mammography is in reducing the risk of death from breast cancer among women aged 40–64 years.
Concerns about the accuracy of self-report are more complex in comparative studies than in descriptive surveys. We know of no prior validation studies that have compared the accuracy of screening mammography histories between women with and without breast cancer. Both nondifferential and differential misclassification of exposure by cases and controls can result in biased estimates of association. It is generally assumed that cases recall past exposures more accurately than do controls, particularly if, in the course of their cancer being diagnosed, they have been asked about tests related to that diagnosis. Statistical techniques that adjust for differential recall require careful measurement of the sensitivity and specificity of the self-report in cases and controls compared with a "gold standard" (20). Some studies suggest that women recall their mammograms taking place more recently than they actually did (2, 3, 8). This "telescoping" can produce false positives when self-reporting is used to estimate the frequency of recent mammography. In these studies, specificity (21) has been substantially lower than sensitivity (21). In this article, we estimate the validity of self-reports of screening mammography in the context of a case-control study.
MATERIALS AND METHODS
The reference date for cases was the month and year that their cancer was diagnosed. For controls, the reference date was considered the month and year that they were identified through random digit dialing.
Inclusion criteria
The subset of Women's CARE Study respondents included in the validation study 1) were aged 40–64 years on the reference date; 2) reported at least one mammogram in the 5 years before the reference date (or, for cases, reported that their cancer was discovered by a screening mammogram even if they did not report any other mammograms in the 5 years before the reference date); and 3) were interviewed on or after November 1, 1995. Only after that date were women asked to provide facility information for mammograms received during the reference month. Each woman included signed a consent form and gave us the names of the facilities she used.
Self-reported mammography history
The Women's CARE Study questionnaire defined a mammogram as "an x-ray taken only of the breasts by a machine that presses the breast against a plate." From self-report, we obtained age at first mammogram, month and year of the most recent mammogram before the reference date, and year of all intervening mammograms in the 5 years before the reference date. We assumed that respondents would not be able to recall the precise month for mammograms before their most recent examination. For each mammogram reported on the questionnaire, respondents were asked whether the examination was 1) for a specific breast problem, 2) for follow-up of a previous breast problem, 3) part of a routine physical examination or was a screening test, or 4) for another reason. The questionnaire did not specifically ask about mammograms in the reference month. Rather, all questions about mammograms were asked for the time period before the reference month to exclude mammograms expected to occur as part of the diagnostic work-up. However, we were able to determine whether a case had a screening mammogram in the reference month because the questionnaire also asked about each breast surgery or procedure that occurred on or before the date of the interview, the month and year of this surgery, and how the problem that necessitated the surgery was first discovered (e.g., routine screening mammogram). Thus, if a woman's breast cancer was discovered with her first mammogram, and it was a screening mammogram in the reference month, this information became evident because of the reasons for breast surgery.
Facility information
Facility records were used to document the presence and timing of the self-reported mammograms. If a woman reported no mammograms within the prior 5 years, we accepted that statement as the truth. With no means of searching the hundreds of facilities at each study site, we did not validate negative reports. Respondents provided the names of facilities used for mammograms in the 5 years before the reference date and, after November 1, 1995, the name of any facility used in the reference month. We validated information on mammograms in the reference month only for cases who reported that their breast surgery resulted from a problem discovered by mammography screening. We asked every facility listed by the respondent, including those out of the study area, for the dates of all mammograms she received for at least 5 years prior to the reference date up to the date of our request. If a facility did not have any information about the respondent, we also contacted other facilities with similar names or in close geographic proximity.
We did not confirm from the facility whether it conducted a screening or a diagnostic mammogram. Previous studies indicate that such information, if obtainable, frequently is not correct because restrictive insurance requirements lead to errors in recording the indicators for mammography (11, 22).
Data analysis
Summary variables
Self-reported screening mammography history. All analyses required an estimate of the number of months from the respondent's most recent self-reported screening mammogram to a defined endpoint. For cases, this endpoint was the date of diagnosis. A mammogram in the reference month sufficed if the case said that her cancer was discovered by screening. For controls, the analogous endpoint was the month prior to the reference month. To analyze whether a screening mammogram occurred within a specific time period, we added 1 month to the time period for the controls to adjust for this difference. Thus, if the time frame was within the prior year, the most recent screening mammogram for a case could have occurred during, or 12 months before, the reference month. For controls, the most recent screening mammogram could have occurred in the 13 months before the reference month.
If only patient age at the time of the mammogram was available and the exact month of the mammogram was not, we chose a date equal to the patient's birth month plus 5 or 6 months (assigned randomly) as the mammogram date. If we had only the year of the mammogram, we assumed June or July (randomly). When the respondent could not recall a year, we set the response to missing.
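The imputation rules above can be sketched in a few lines. This is a minimal illustration only, not the study's actual procedure: the function name `impute_mammogram_month`, its arguments, and the `(year, month)` return format are hypothetical, and the sketch assumes that "birth month plus 5 or 6 months" is counted from the birthday at the reported age.

```python
import random


def impute_mammogram_month(rng, year=None, month=None, age=None,
                           birth_year=None, birth_month=None):
    """Return an imputed (year, month) for a self-reported mammogram, or None.

    Hypothetical helper: names and return format are assumptions.
    """
    # Month and year both reported: nothing to impute.
    if year is not None and month is not None:
        return (year, month)
    # Only the year reported: assign June or July at random.
    if year is not None:
        return (year, rng.choice([6, 7]))
    # Only age at mammogram reported: birthday at that age plus 5 or 6 months.
    if None not in (age, birth_year, birth_month):
        offset = rng.choice([5, 6])
        total = (birth_year + age) * 12 + (birth_month - 1) + offset
        return (total // 12, total % 12 + 1)
    # No year could be recalled: treat the response as missing.
    return None


rng = random.Random(2003)  # fixed seed so the random assignment is reproducible
example = impute_mammogram_month(rng, age=44, birth_year=1950, birth_month=10)
```

With a birthday in October 1950 and a reported age of 44 years, the sketch returns either March or April 1995, depending on the random offset.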
Comparison of self-report and facility information. We evaluated the correspondence between the timing of the self-report and the facility record by using two time frames: within 1 year and within 2 years. Both are estimated to be within the detectable preclinical period for breast cancer. We cross-classified respondents' self-reports and facilities' records by whether they were within the specified time frame. Using the 2-year time frame for illustration, for cases' self-report of screening mammograms, "within 2 years" was defined as a mammogram in the 24 months prior to the reference date or, if the study subject reported that her cancer was discovered by screening, a mammogram in the reference month prior to the date of diagnosis. "Not within 2 years" was defined as a mammogram more than 24 months but less than or equal to 60 months before the reference date or no self-reported mammogram within the prior 5 years. More specifically, if a case's cancer was discovered by a screening mammogram, we generally used the self-reported date of her most recent screening mammogram. However, some of these respondents either did not report any mammograms at all prior to the reference date or reported their most recent self-reported screening mammogram to be more than 24 months ago (using the 2-year time frame). For these women, we adopted the reference date as the time of their most recent screening mammogram. For controls' self-report, "within 2 years" was defined as a screening mammogram in the 25 months prior to the reference date and "not within 2 years" as a screening mammogram more than 25 months but 60 months or less before the reference date, or none within the past 5 years. If no facility could find a mammogram for the respondent, we classified the woman as having no matching mammogram within the prior 5 years.
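The dichotomous classification of self-reports described above can be illustrated with a short sketch. The function `self_report_within` and its arguments are hypothetical, and the sketch assumes that timing is coded as whole months before the reference month, with `None` meaning no self-reported screening mammogram in the prior 5 years.

```python
def self_report_within(months_before, window_years, is_case,
                       found_by_screening=False):
    """Classify a self-report as within (True) or not within (False) the window.

    Hypothetical helper; months_before is None when no screening mammogram
    was reported in the prior 5 years.
    """
    # Cases whose cancer was discovered by screening are assigned a
    # mammogram at the reference date, which is always within the window.
    if is_case and found_by_screening:
        return True
    # Controls get 1 extra month because their endpoint is the month
    # before the reference month.
    limit = 12 * window_years + (0 if is_case else 1)
    return months_before is not None and months_before <= limit


# 2-year time frame: 24 months for cases, 25 months for controls.
case_at_25 = self_report_within(25, 2, is_case=True)      # not within 2 years
control_at_25 = self_report_within(25, 2, is_case=False)  # within 2 years
```

The same function covers the 1-year time frame by passing `window_years=1`, which gives limits of 12 months for cases and 13 months for controls.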
To test the robustness of our assumptions, we used different approaches to assess the correspondence between facility records and self-report (table 1). For approaches I and II, we started with the date of a woman's most recent self-reported screening mammogram and then located the facility-reported mammogram closest in time. Other approaches (III and IV) defined a match as any facility-reported mammogram in the same dichotomous time period as the self-report. For example, using the 2-year time frame, if a case reported a screening mammogram at 22 months before diagnosis and there were two facility reports, 25 months and 10 months before diagnosis, then approaches I and II would find a mismatch between the self-report and the facility report because the closer facility mammogram (25 months) did not occur within 2 years. However, approaches III and IV would declare this a match because there was a self-reported screening mammogram within 2 years (22 months) and a facility-reported mammogram within 2 years (10 months). This scenario would likely be an error because the facility-reported mammogram at 25 months was more likely to be the one the respondent recalled. Finally, we repeated the analyses by omitting any facility reports in the reference month (approaches II and IV) because they could be part of the diagnostic work-up and could lead to more matches than were justified. Excluding matches based on a facility mammogram in the reference month was most restrictive; if a case had only one screening mammogram and it was in the reference month, the corresponding facility-reported mammogram would be missed and a true match excluded.
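The worked example in this paragraph can be reproduced with a small sketch of the four approaches. The exact definitions in table 1 are not reproduced here, so the logic below is an assumption built only from the prose: approaches I and II classify the facility record by the facility mammogram nearest in time to the self-report, approaches III and IV by any facility mammogram in the window, and approaches II and IV first drop facility mammograms in the reference month (coded here as 0 months before the reference date). The function names are hypothetical.

```python
def facility_within(self_months, facility_months, window, approach):
    """Classify the facility record as within the window under one approach."""
    if approach not in ("I", "II", "III", "IV"):
        raise ValueError("approach must be one of I, II, III, IV")
    # Approaches II and IV omit facility mammograms in the reference month.
    if approach in ("II", "IV"):
        facility_months = [m for m in facility_months if m != 0]
    if not facility_months:
        return False
    if approach in ("I", "II"):
        # Use the facility mammogram closest in time to the self-report.
        nearest = min(facility_months, key=lambda m: abs(m - self_months))
        return nearest <= window
    # Approaches III and IV: any facility mammogram in the window counts.
    return any(m <= window for m in facility_months)


def is_match(self_months, facility_months, window, approach):
    """True when self-report and facility record fall in the same category."""
    return (self_months <= window) == facility_within(
        self_months, facility_months, window, approach)


# Example from the text: self-report at 22 months, facility reports at
# 25 and 10 months, 2-year (24-month) time frame for a case.
results = {a: is_match(22, [25, 10], 24, a) for a in ("I", "II", "III", "IV")}
```

Under these assumptions, `results` shows a mismatch for approaches I and II and a match for approaches III and IV, as described in the text.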
Our analyses relied on the usual statistics of diagnostic testing, sensitivity, specificity, and positive predictive value (21) and were based on the assumption that the matching facility report was the gold standard. By using the estimated sensitivity and specificity, we estimated the true prevalence of screening among cases and controls, respectively, as computed by Thompson (20).
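For readers less familiar with these statistics, the following sketch computes them from a 2 x 2 cross-classification in which the facility record is treated as the gold standard. The function name and the cell counts are hypothetical and are not taken from the study's tables.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Sensitivity, specificity, and positive predictive value of self-report.

    tp: self-report within window, facility record within window
    fp: self-report within window, facility record not within window
    fn: self-report not within window, facility record within window
    tn: self-report not within window, facility record not within window
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


# Hypothetical counts, for illustration only.
stats = diagnostic_accuracy(tp=80, fp=10, fn=20, tn=90)
```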
To assess telescoping, we restricted our analysis to the controls because 1) for cases, the phenomenon of telescoping could easily be confounded by their recollection of events related to the diagnosis of cancer; and 2) a large proportion of the cases' most recent screening mammograms occurred very near the reference date. We first examined the distribution of the signed (+/-) difference in months between respondents' most recent self-reported screening mammogram and the facility-reported mammogram nearest in time. Second, by limiting our analysis to women who reported a screening mammogram in the prior 5 years and for whom there was a facility record of a mammogram, we calculated the proportion of self-reports that were confirmed by facility data when the most recent self-reported mammogram was within the 1- or 2-year cutpoints.
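The signed difference can be sketched as follows. The sign convention is an assumption (the text does not state it): a positive value here means the self-reported mammogram was recalled as more recent than the nearest facility-reported mammogram, which is the direction of telescoping. The function name and the data are hypothetical.

```python
from statistics import mean, median


def signed_difference(self_months_before, facility_months_before):
    """Months by which the self-report is more recent than the nearest record."""
    nearest = min(facility_months_before,
                  key=lambda m: abs(m - self_months_before))
    return nearest - self_months_before


# Hypothetical controls: (self-report, facility reports), in months before
# the reference date. Eight agree exactly; two do not.
controls = [(12, [12, 25])] * 8 + [(11, [12]), (9, [48])]
diffs = [signed_difference(s, f) for s, f in controls]
summary = {"median": median(diffs), "mean": mean(diffs)}
```

In this made-up data set the median difference is zero while the mean is pulled upward by a single large discrepancy, which illustrates how a few aberrant responses can produce an average shift without most respondents telescoping.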
RESULTS
Overall, we could not obtain mammography information from facilities for 4 percent (96/2,495) of the cases and 9 percent (54/615) of the controls. Only rarely was there no response from any of the facilities listed (10 cases, six controls). More often, the facilities found no record of the woman (41 cases, 22 controls) or found some information but not for a mammogram (43 cases, 23 controls). We found no evidence that any facilities were particularly nonresponsive or that individual facilities changed reporting patterns over time.
Correspondence of self-reported and facility information
Table 2 cross-classifies self-reported and facility data for cases and controls for two different periods: within 1 year and within 2 years of the reference date. For the 2-year cutpoint, almost half of the false positives reflected the absence of any mammography date from a facility (100/225 (44 percent) for cases, 38/78 (49 percent) for controls). Using the 1-year cutpoint, the comparable proportions for cases and controls were 29 percent and 24 percent.
Evidence of "telescoping"
We found only minimal evidence of telescoping among the controls. The median and modal differences in months between the most recent self-reported mammogram and the facility-reported mammogram closest in time were both zero. The mean difference was 4 months, a direction consistent with telescoping, varying across sites from 2 months to 7 months. The phenomenon did not appear to be caused by most women recalling their mammogram as more recent than actual but rather by a few large discrepancies between self-reports and facility reports. Twenty-eight (5.7 percent) of the 495 controls involved in the analysis accounted for 60 percent of the telescoping.
Robustness of results to changes in assumptions
As expected, both specificity and positive predictive values for cases were lower when we used approaches that excluded facility reports in the reference month (approaches II and IV, table 1). Matches based on a facility report in the reference month that would have been true positives with approaches I or III became false positives (table 4). For example, using the 2-year window, 16 percent (159/970) of cases who reported that their cancer was discovered by screening would have been misclassified without the question on how their cancer was discovered. Sixty-two of these 159 cases did not report any mammogram before the reference month, and 97 reported that their most recent mammogram before the reference month was more than 24 months ago. Because these cases quite likely had their most recent screening mammogram in the reference month, not including facility matches in the reference month would cause such matches to be classified as false positives. Controls were not affected by this decision because facility-reported mammograms in the reference month were not counted.
Estimating the true prevalence of screening mammography
Table 5 gives estimates of the true prevalence of screening mammography for various levels of self-reported screening (p) based on corrections in which our estimates of sensitivity and specificity were used. For both cases and controls, the self-reports overstated the true prevalence of screening; however, the estimates became closer as self-reported prevalence increased.
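A correction of this kind can be sketched with the standard misclassification formula, true prevalence = (p + specificity - 1) / (sensitivity + specificity - 1). That the computation in reference 20 takes exactly this form is an assumption, and the sensitivity and specificity values below are hypothetical rather than the study's estimates.

```python
def corrected_prevalence(p, sensitivity, specificity):
    """Estimate true prevalence from self-reported prevalence p.

    Standard correction for a misclassified binary measure; assumed,
    not confirmed, to match the computation cited in the text.
    """
    denom = sensitivity + specificity - 1
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    estimate = (p + specificity - 1) / denom
    # Sampling error can push the estimate outside [0, 1]; clip it.
    return min(1.0, max(0.0, estimate))


# Hypothetical accuracy values, for illustration only.
se, sp = 0.98, 0.80
table = {p: corrected_prevalence(p, se, sp) for p in (0.5, 0.7, 0.9)}
gaps = {p: p - est for p, est in table.items()}
```

With these illustrative values, the corrected estimate is below the self-reported prevalence at each level, and the gap shrinks as self-reported prevalence rises, which is the qualitative pattern described for table 5.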
DISCUSSION
Sensitivity and specificity of self-reported screening mammography were high among both cases and controls (table 3). High sensitivity is consistent with the results of most other validation studies (1–6, 8): perhaps if a woman truly has had a recent screening mammogram, this information will be reported accurately. Specificity in our study, although lower than sensitivity, was considerably higher than in most other studies across a broad range of assumptions about the concordance between self-reported and facility data. In several other studies, specificity was about 0.50–0.60 for at least one of the time frames measured (1–5, 8); in our study, specificity was no lower than 0.78 for our base-case assumptions.
One possible explanation for this discrepancy is that our study, conducted from 1994 to 1998, was more recent than the others, which were conducted in the late 1980s and early to mid-1990s. Recent, extensive public health efforts to increase screening might have promoted better understanding of mammography, thereby decreasing the false-positive rate. Furthermore, if women are now more likely to obtain regular mammograms, they might recall the mammography dates more accurately.
Contrary to other reports about self-reported screening mammograms being on average 3–4 months more recent than actual (2, 3, 8), we found that the limited telescoping among controls was driven by a few aberrant responses, including potentially incorrect facility reports. Self-reported mammography histories of cases and controls were similarly accurate, indicating nondifferential misclassification of exposure. Our estimates and the narrow confidence intervals suggest that the sensitivity and specificity of self-reported histories among cases and controls differ by no more than a few percentage points (table 3).
This study has several limitations. If a woman reported no mammograms in the prior 5 years, we assumed that this report was accurate since prior studies in closed medical systems have found negative predictive values of a self-report of no mammogram close to 100 percent (2, 4, 6). This decision could have overestimated specificity but only in the unlikely event that facilities found mammograms that the respondents failed to report. Given the high negative predictive value, it is also unlikely that this decision overestimated sensitivity by placing potential false-negative reports in the true-negative cell.
Estimates of sensitivity and specificity could have also been affected because we relied on the subjects to name the mammography facilities. We may have overestimated sensitivity if a woman overlooked a recent mammogram at a different facility. In addition, if a contacted facility could not find information on a woman, we were not able to allocate the error to the subject or to the facility. Although we contacted other plausible facilities, we could not check them all. Our estimates of specificity assumed that the subject was in error for all false positives. If we assumed that some or all of these instances reflected facility errors, the specificity would have been higher.
Another limitation results from the form of the questions about screening. Our questionnaire elicited information only about mammograms before the reference month. Screening mammograms in the reference month that led to the diagnosis of cancer were captured by a separate question on how the problem leading to the breast surgery was discovered. For most of the women whose cancer was discovered by a screening mammogram, we did not have to rely on this question alone. However, for 16 percent of these women, we had to assume that the mammogram occurred in the reference month. In addition, we had to randomly impute the month of the most recent self-reported screening mammogram when we did not have this information; however, doing so should have introduced only random error and should not have biased our overall estimates of sensitivity or specificity.
Finally, we did not seek the reason for the mammogram from the facility because previous studies indicated that such information was frequently incorrect (11, 22). Instead, we assumed that the facility-reported mammogram closest in time to the most recent self-reported screening mammogram was also a screening mammogram. Our results were not materially changed when we adopted a more conservative approach to declaring a match between a self-reported screening mammogram and a facility-reported mammogram.
In summary, our validation study contributes to the evaluation of mammography screening in two ways. First, use of controls enabled us to adjust estimates of screening prevalence based on self-reports to more closely reflect the true prevalence of screening in the population. For example, according to nationwide data from the Behavioral Risk Factor Surveillance System for the year 2000, the proportions of women aged 40–49, 50–59, 60–64, and ≥65 years reporting mammograms in the prior year were 0.67, 0.74, 0.77, and 0.72, respectively, and the proportions who had mammograms in the prior 2 years were above 0.80 for all age groups (23). Over 90 percent of the mammograms were for routine screening. At these screening levels, our results indicate that self-reports of a recent screening mammogram somewhat overestimate the true prevalence of such screening.
Second, by including women both with and without breast cancer, we were able to compare the sensitivity and specificity of self-reported screening mammography histories in these two groups. Such information is needed to assess the validity of case-control studies of the efficacy of screening mammography (20). The similar sensitivity and specificity of the self-report of cases and controls suggest that nondifferential rather than differential misclassification is a more likely source of error and that odds ratios might understate the true efficacy of screening mammography in preventing mortality and morbidity from breast cancer.
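The attenuation that nondifferential misclassification produces can be illustrated numerically. The sketch below applies the same sensitivity and specificity to cases and controls and compares the resulting odds ratio with the true one; all prevalences and accuracy values are hypothetical, and the function names are illustrative.

```python
def observed_prevalence(true_prev, sensitivity, specificity):
    """Self-reported prevalence implied by a true prevalence and given accuracy."""
    return true_prev * sensitivity + (1 - true_prev) * (1 - specificity)


def odds_ratio(p_cases, p_controls):
    """Exposure odds ratio comparing cases with controls."""
    return (p_cases / (1 - p_cases)) / (p_controls / (1 - p_controls))


# Hypothetical true screening prevalences: lower among cases (women who
# died of breast cancer) than among controls, i.e., screening is protective.
true_cases, true_controls = 0.50, 0.65
se, sp = 0.95, 0.85  # same accuracy in both groups (nondifferential)

true_or = odds_ratio(true_cases, true_controls)
observed_or = odds_ratio(observed_prevalence(true_cases, se, sp),
                         observed_prevalence(true_controls, se, sp))
```

Under these assumptions, the observed odds ratio lies between the true odds ratio and 1, so an analysis based on self-report would understate the protective effect.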
ACKNOWLEDGMENTS
Special thanks are extended to Janet Ortiz, who coordinated data collection for the mammography study, and to Noemi Epstein, Julie Bamrick, Jane Sullivan-Halley, Brenda Rogers, Peter Briggs, and Janet Ortiz for their special efforts in obtaining facility data. Drs. Noel S. Weiss, Lars Holmberg, and W. Douglas Thompson served as the Scientific Advisory Board. Their advice and guidance were invaluable. The authors also thank all past and present members of the Women's CARE Study team for their important contributions to this project.
REFERENCES