ARTICLE

Frequency and Predictive Value of a Mammographic Recommendation for Short-Interval Follow-Up

Shagufta Yasmeen, Patrick S. Romano, Mary Pettinger, Rowan T. Chlebowski, John A. Robbins, Dorothy S. Lane, Susan L. Hendrix

Affiliations of authors: S. Yasmeen, P. S. Romano, J. A. Robbins, University of California, Davis; M. Pettinger, Women’s Health Initiative Clinical Coordinating Center, Fred Hutchinson Cancer Research Center, Seattle, WA; R. T. Chlebowski, Harbor–University of California at Los Angeles Research and Education Institute, Torrance, CA; D. S. Lane, State University of New York, Stony Brook; S. L. Hendrix, Wayne State University, Detroit, MI.

Correspondence to: Shagufta Yasmeen, M.D., University of California, Davis, Department of Obstetrics/Gynecology and Internal Medicine, 4860 Y St., Suite 2500, Sacramento, CA 95817 (e-mail: syasmeen{at}ucdavis.edu).


    ABSTRACT
 
Background: A recommendation for short-interval follow-up of a "probably benign finding" is associated with up to 11% of screening mammograms, but its predictive value for breast cancer is unclear. We examined the predictive values (i.e., the percentage of women with a diagnosis of breast cancer 2 years after a short-interval follow-up recommendation) and likelihood ratios (derived from the pretest and post-test odds of breast cancer in the Women’s Health Initiative sample) for breast cancer that are associated with a recommendation for short-interval follow-up among postmenopausal women. Methods: We performed a longitudinal analysis of a prospective cohort of 68 126 postmenopausal women (aged 50–79 years) who were participating in clinical trials as part of the Women’s Health Initiative at 40 centers across the United States. Eligible participants had screening mammograms at baseline and at least 2 years of follow-up that included a repeat mammography. Outcomes measured were breast cancer events at 1 and 2 years after baseline and the results of subsequent mammograms. All P values were two-sided. Results: A total of 2927 (5%) of the 58 408 eligible women had baseline mammograms that included recommendations for short-interval follow-up. The incidence of breast cancer for women with a short-interval follow-up recommendation was 1.0% at 2 years after the baseline mammogram compared with breast cancer incidences of 0.6% and 0.5% for women whose baseline mammograms were described as "benign" and "negative," respectively. Across the 40 participating centers, the prevalence of short-interval follow-up recommendations among baseline mammograms varied from 1.2% to 9.8% (P<.001), even when the analysis was adjusted for key variables in regression models. Centers reporting higher frequencies of such recommendations did not have lower positive predictive values for breast cancer than centers reporting lower frequencies. The likelihood ratio for breast cancer after a recommendation for short-interval follow-up on a subsequent mammogram was 2.20 (95% confidence interval = 1.65 to 2.86). Conclusion: Having a mammographic recommendation for short-interval follow-up was associated with a low positive predictive value for breast cancer among postmenopausal women during a 2-year follow-up. This result suggests that the current criteria for this recommendation (repeat mammography within 6 months) should be reconsidered.



    INTRODUCTION
 
The results of several randomized trials (1) indicate that mammographic screening is associated with an age-dependent reduction in breast cancer mortality. Mammographic findings in the United States are now universally reported using the Breast Imaging Reporting and Data System (BI-RADS) classification system (2), which includes five assessment categories that range from category 1 ("negative") and category 2 ("benign") to category 5 ("highly suggestive of malignancy – appropriate action should be taken"). BI-RADS category 3 ("probably benign finding – short-interval follow-up suggested") is applied to mammograms that show abnormalities that have a very high probability of being benign. Such abnormalities are not expected to change over the follow-up interval, but the radiologist would prefer to establish their stability (2).

The likelihood of breast cancer in women who have mammographic recommendations for short-interval follow-up may depend on risk factors such as age, family history, and hormone use, as well as the accuracy of the mammogram and its interpretation. Considerable variation exists among radiologists in the interpretation and subsequent management of these abnormalities (3–5).

Lesions that are judged to have a low probability of malignancy can be managed by prompt biopsy, early follow-up mammography, or routine follow-up mammography. On the basis of limited and indirect evidence, it has been suggested that careful mammographic surveillance (i.e., every 3–6 months) of such lesions may result in the detection of more asymptomatic cancers at an early stage when the prognosis is still favorable as well as a reduction in the morbidity and the cost associated with biopsies (4,6).

However, early recall for follow-up testing on screening mammography causes anxiety, psychological morbidity, and increased health care utilization in the year after a false-positive mammogram is reported (7,8). Furthermore, it is questionable whether repeating mammograms at 3–6 months increases the overall sensitivity of the test or has any positive impact on breast cancer outcomes. Current information regarding the benefits and the positive predictive value of a recommendation for short-interval follow-up is limited (9,10). In one report (9), only two breast cancers were found at a 6-month follow-up among 3184 mammograms with such recommendations. Indeed, the BI-RADS manual (2) states that "most approaches (to this problem) are intuitive . . . (recommendations) will likely undergo future modification as more data accrue as to the validity of an approach, the interval required, and the type of findings that should be followed."

The current study was designed to evaluate the prevalence and positive predictive value of mammographic recommendations for short-interval follow-up in a longitudinal prospective cohort of postmenopausal women participating in the Women’s Health Initiative (WHI) at 40 clinical centers throughout the United States.


    SUBJECTS AND METHODS
 
Data Collection

The WHI is a prospective study of 161 860 postmenopausal women aged 50–79 years who were enrolled from 1993 through 1998 at 40 clinical centers throughout the United States. The study population included women who enrolled in any of the WHI Clinical Trial (CT) arms (Dietary Modification [DM-CT], Hormone Replacement Therapy [HRT-CT], or Calcium/Vitamin D; 68 135 women) and who also had a baseline screening mammogram (n = 68 126 women).

Study methods for these WHI trials have been described in detail elsewhere (11). Briefly, women were eligible for enrollment if they were postmenopausal, unlikely to move or die within 3 years, and not currently participating in any other clinical trial. Women with a prior breast cancer diagnosis were excluded. At baseline, women completed screening and enrollment questionnaires by interview and self-report, underwent a physical examination, and provided a blood specimen. Special efforts were made to recruit a diverse sample that represented the population of community-dwelling, postmenopausal women in the United States.

We obtained reports of all screening mammograms for WHI clinical trial participants from the Clinical Coordinating Center in Seattle, Washington. Reports submitted before February 1995 (N = 9669) could not be analyzed because of inadequacies in the original design of the mammography reporting form. After having a negative screening examination (defined as BI-RADS category 1 or 2), HRT-CT participants had yearly screening mammograms and DM-CT participants had biennial screening mammograms. Although all participants received annual clinical breast examinations, the results of those examinations were not collected by the WHI Coordinating Center. Only women who completed at least 2 years of follow-up after having a baseline mammogram that was classified as "negative," "benign," or "probably benign finding – short-interval follow-up suggested" were included in the current study to avoid bias due to differences in 1-year follow-up procedures across the study arms. Baseline mammography preceded a woman’s assignment to a specific intervention. Women with baseline category 4 ("suspicious abnormality – biopsy should be considered") or category 5 ("highly suggestive of malignancy – appropriate action should be taken") mammograms were not eligible for enrollment in the WHI unless subsequent evaluation, performed independently of the WHI, ruled out malignant disease. Because we do not know the number of women who were deemed ineligible for this reason (and who, therefore, were never entered into our dataset), we excluded all women with baseline category 4 or 5 mammograms from our analyses. The study was reviewed and approved by the Human Subjects Review Committee at each participating institution, and all participants provided written informed consent.

Because our principal aim was to evaluate radiologists’ recommendations for short-interval follow-up (i.e., "less than 1 year" instead of "1 year" or "2 years"), we classified mammograms on the basis of whether such a recommendation appeared in the radiologist’s summary. Unfortunately, as previously documented by other authors (12), radiologists’ assessments and recommendations are sometimes inconsistent. In addition, the BI-RADS assessment scheme was not used universally until April 1999, which prevented us from analyzing our data strictly according to BI-RADS assessment categories. The WHI follows BI-RADS reporting practices, however, in that a definitive interpretation is provided only after "assessment is complete," based on any additional studies (e.g., magnification, ultrasound) that the interpreting radiologist may deem necessary (13). BI-RADS category 0 mammograms ("need additional imaging evaluation") must be resolved before they can be reported to the WHI Coordinating Center. Regardless of whether mammography was performed at a clinical center participating in the WHI or in the community, all results were entered into the WHI database. Because breast cancer is a primary endpoint in the WHI, all abnormal mammograms are followed up, in accordance with radiologists’ recommendations, until a diagnosis is obtained. Active follow-up using hospital, cancer registry, and mortality data ensures nearly complete ascertainment of relevant outcomes.

Potential breast cancer risk factors, including age, body mass index, waist-to-hip ratio, family history of breast cancer, hormone use, smoking history, alcohol use, level of physical activity, age at menarche, nulliparity, age at first full-term birth, age at menopause, education level, family income, and having medical insurance were collected from various WHI forms (11). Among all study participants, baseline characteristics were compared between women with a recommendation for short-interval follow-up and women without such a mammographic recommendation. Student’s t test was used to compare the means of continuous variables, and the chi-square test was used to compare the distributions of categorical variables.

For the current analyses, we combined each woman’s mammography results from the right and left breasts and used the most severe reading to categorize each participant. The primary study outcome was in situ or invasive breast cancer diagnosis based on local and central WHI adjudication following a review of medical records of each study participant within 2 years of the date of the baseline mammogram. The outcomes of women who had mammograms with a recommendation for short-interval follow-up were compared with the outcomes of women who had "negative" and "benign" mammograms. In accordance with previous studies, "negative" and "benign" mammograms were considered false negatives if the participant was diagnosed with breast cancer within the subsequent 2 years (14). A mammogram with a recommendation for short-interval follow-up was considered a false positive if the participant was found not to have breast cancer within 2 years of follow-up. Predictive value (positive or negative) was defined as the percentage of women with a specific mammographic finding who received a diagnosis of breast cancer within the subsequent 2 years, regardless of when the cancer was reported to the WHI. Our primary analysis included all baseline mammograms except those with category 4 or 5 recommendations. We performed a secondary analysis that was limited to women who had a "negative" or "benign" baseline mammogram and at least 2 years of follow-up after having a subsequent (annual or biennial) screening mammogram. In this latter analysis of new (incident) mammographic abnormalities, we classified each woman according to the most serious interpretation among all mammograms obtained after her baseline mammogram. A comparison mammogram was randomly selected for each woman with a series of "negative" mammograms. The likelihood ratio (LR) for each summary category was empirically derived from the pretest and post-test odds of breast cancer in the WHI sample. 
We estimated 95% confidence intervals (CIs) by using a recently described objective Bayesian technique, which outperforms traditional methods of estimating 95% CIs when observed probabilities are close to zero (15,16).
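The definitions above (predictive value as a simple proportion, and a likelihood ratio derived from pretest and post-test odds) can be illustrated with a minimal sketch. The counts below are hypothetical and are not WHI data; the function names are our own. The sketch also shows that the odds-based likelihood ratio equals the ratio of the probability of the finding among women with cancer to the probability of the finding among women without cancer.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


def predictive_value(n_cancer_with_finding, n_with_finding):
    """Proportion of women with a given finding who are diagnosed with cancer."""
    return n_cancer_with_finding / n_with_finding


def likelihood_ratio(pretest_prob, posttest_prob):
    """Likelihood ratio as post-test odds divided by pretest odds."""
    return odds(posttest_prob) / odds(pretest_prob)


# Hypothetical sample: 1000 women, 10 cancers overall;
# 100 women have the finding, 3 of whom are diagnosed with cancer.
pretest = 10 / 1000                    # overall probability of cancer
posttest = predictive_value(3, 100)    # probability of cancer given the finding
lr_from_odds = likelihood_ratio(pretest, posttest)

# Equivalent form: P(finding | cancer) / P(finding | no cancer)
lr_from_rates = (3 / 10) / (97 / 990)

print(round(lr_from_odds, 2), round(lr_from_rates, 2))
```

A likelihood ratio above 1 means the finding raises the odds of cancer relative to the sample as a whole; a value below 1 means it lowers them.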

Statistical Analyses

Our primary aim was to estimate the predictive value of a recommendation for short-interval follow-up among a national sample of women. We therefore focused our power analysis on the width of the 95% CIs surrounding these estimates. Given that 24 (3%) category 3 mammograms were identified among the 750 mammograms at the WHI clinical center at the University of California, Davis, and that the prevalence of category 3 mammograms reported in previous studies was as high as 11% (5,9), we expected to find approximately 2000 baseline mammograms with a recommendation for short-interval follow-up in the national WHI database. We estimated that if the positive predictive value of these mammograms was 2%, then the 95% CI would have a width of 1.2% (i.e., 95% CI = 1.4% to 2.6%). This CI was sufficiently narrow to justify this study (17).
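The expected interval quoted above can be reproduced with a short sketch. It assumes a simple normal-approximation (Wald) interval for a proportion; the text does not state which method was used for the power calculation, and the reported analyses used a Bayesian technique instead, so this is illustrative only.

```python
import math


def wald_ci(p, n, z=1.96):
    """Normal-approximation 95% CI for a proportion p estimated from n observations."""
    se = math.sqrt(p * (1.0 - p) / n)
    return p - z * se, p + z * se


# Planning values from the text: about 2000 mammograms, predictive value 2%.
low, high = wald_ci(0.02, 2000)
print(f"{low:.1%} to {high:.1%}, width {high - low:.1%}")
```

With these planning values the interval rounds to 1.4% to 2.6%, a width of about 1.2 percentage points.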

A secondary objective of this study was to estimate regional variation in the reported prevalence of short-interval follow-up recommendations. The prevalence and predictive value of these recommendations were computed for each of the 40 clinical centers. Prevalence was defined as the percentage of baseline mammograms with a recommendation for short-interval follow-up. The 40 clinical centers were categorized into four quartiles on the basis of prevalence. We then compared the predictive value of the recommendation for short-interval follow-up across these four prevalence quartiles to determine whether this recommendation was associated with a higher likelihood of breast cancer at centers that applied it more cautiously compared with centers that were less selective in applying this interpretation.
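The quartile comparison described above could be sketched as follows. The center labels and counts are hypothetical, and the rank-based split into four equal groups is an assumption about how the quartiles were formed; the text does not give the exact rule.

```python
def assign_quartiles(prevalence_by_center):
    """Rank centers by prevalence and split them into four equal-sized groups (1 to 4)."""
    ranked = sorted(prevalence_by_center, key=prevalence_by_center.get)
    n = len(ranked)
    return {center: (rank * 4) // n + 1 for rank, center in enumerate(ranked)}


def pooled_ppv_by_quartile(quartile_of, counts_by_center):
    """Pool (cancers, recommendations) within each quartile and return cancers / recommendations."""
    totals = {}
    for center, (cancers, recommended) in counts_by_center.items():
        q = quartile_of[center]
        c, r = totals.get(q, (0, 0))
        totals[q] = (c + cancers, r + recommended)
    return {q: c / r for q, (c, r) in sorted(totals.items())}


# Hypothetical example with 8 centers (the study had 40).
prevalence = {"A": 0.012, "B": 0.020, "C": 0.030, "D": 0.040,
              "E": 0.050, "F": 0.060, "G": 0.080, "H": 0.098}
counts = {c: (1, 100) for c in prevalence}  # (cancers, recommendations), made up

quartiles = assign_quartiles(prevalence)
print(quartiles)
print(pooled_ppv_by_quartile(quartiles, counts))
```

If less selective centers diluted the recommendation with lower-risk findings, the pooled predictive value would be expected to fall from the lowest to the highest prevalence quartile.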

We used multivariable logistic regression at the patient level to test whether the prevalence of a recommendation for short-interval follow-up varied across clinical centers, both before and after adjusting for differences in participant characteristics. Our initial regression model had the recommendation for short-interval follow-up as a dependent variable and 39 dummy variables representing 40 clinical centers as independent variables. To determine whether the variation among clinical centers was due to differences in participant characteristics, we ran additional regression models that included demographic, socioeconomic, lifestyle, reproductive, and family history risk factors for breast cancer. To test whether this variation could be due to preexisting cancer, we further augmented the model by using breast cancer that was diagnosed during the 2 years subsequent to baseline (which was presumably present at baseline) as a predictor. The likelihood ratio chi-square test, with 39 degrees of freedom, was used to evaluate the statistical significance of clinical center as an independent predictor. All P values are two-sided. Statistical analyses were conducted using Statistical Analysis Software (SAS), version 8 (SAS Institute, Cary, NC).
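For the initial model only (clinical-center dummy variables and no other covariates), the likelihood ratio chi-square statistic has a closed form, because the maximum-likelihood fit of that model reproduces each center's observed proportion. The sketch below illustrates this special case with hypothetical counts. It does not cover the adjusted models, which require iterative fitting in statistical software such as the SAS procedures used in the study, and it stops short of computing a P value.

```python
import math


def binom_loglik(events, total, p):
    """Log-likelihood of `events` positives among `total` independent trials at probability p."""
    ll = 0.0
    if events > 0:
        ll += events * math.log(p)
    if total - events > 0:
        ll += (total - events) * math.log(1.0 - p)
    return ll


def center_lr_chi2(counts):
    """
    counts: list of (n_recommended, n_mammograms), one pair per center.
    Returns (likelihood ratio chi-square statistic, degrees of freedom) comparing
    a center-specific model with a single pooled proportion.
    """
    total_events = sum(e for e, _ in counts)
    total_n = sum(n for _, n in counts)
    pooled = total_events / total_n
    ll_null = sum(binom_loglik(e, n, pooled) for e, n in counts)
    ll_full = sum(binom_loglik(e, n, e / n) for e, n in counts)
    return 2.0 * (ll_full - ll_null), len(counts) - 1


# Hypothetical counts for two centers (1% versus 10% prevalence).
stat, df = center_lr_chi2([(1, 100), (10, 100)])
print(round(stat, 2), df)
```

With 40 centers the same calculation yields 39 degrees of freedom, matching the test described above.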


    RESULTS
 
Baseline mammogram results and other necessary baseline data were available for 58 408 participants in the WHI HRT-CT and DM-CT. Of the baseline mammograms, 408 (0.7%) were category 4 or 5 (and were therefore excluded from these analyses for reasons described in the "Subjects and Methods" section), 2927 (5%) were recommended for short-interval follow-up on the basis of either the radiologist’s summary assessment or his or her associated recommendations, 23 676 (40.5%) were described as having a "benign finding," and 31 305 (53.6%) were described as "negative" (Fig. 1). There are no BI-RADS category 0 ("need additional imaging evaluation") mammograms in the WHI database because only final assessments are entered. Two-year follow-up data after baseline mammography were available for 56 542 of the 58 408 participants, of whom 2927 (5%) had received recommendations for short-interval follow-up at baseline and 53 615 (94.8%) had not (Fig. 1). The other 1458 participants had not yet completed 2 years of follow-up in the WHI after having their baseline mammogram. A total of 29 135 WHI participants with a baseline screening mammogram described as "negative" or "benign" had at least one subsequent screening mammogram 1 or 2 years after their baseline mammogram (with the specific timing dictated by the HRT-CT or DM-CT study protocol) and at least 2 years of subsequent follow-up. Of these women, 2097 (7.2%) received at least one recommendation for short-interval follow-up but not a more serious mammographic result (see Table 5).



 
Fig. 1. Flow chart showing number of participants enrolled in the Women’s Health Initiative (WHI) with baseline mammograms and 2 years of follow-up mammograms.

 

 
Table 5. Diagnosis of breast cancer among Women’s Health Initiative participants with normal or benign baseline mammograms, stratified by the results of subsequent annual or biennial mammographic screening (N = 29 135)*
 
We compared the demographic and clinical characteristics of women who had a mammographic recommendation for short-interval follow-up at baseline with those of women who had "negative" or "benign" baseline mammograms (Table 1). Women with recommendations for short-interval follow-up had a slightly higher mean body mass index and higher mean waist-to-hip ratio, were more likely to have experienced menarche before age 12, be nonwhite, be nulliparous, have used hormone replacement therapy, be a past or current smoker, have a family income of less than $35 000, be uninsured, and be physically inactive (Table 1). A family history of breast cancer, age at menopause (data not shown), alcohol intake, and level of education were not statistically significantly associated with baseline mammography findings.


 
Table 1. Demographic and clinical characteristics of women with baseline mammograms and at least 2 years of follow-up in the Women’s Health Initiative, stratified according to the presence or absence of a recommendation for short interval follow-up (N = 56 542)*
 
The incidence of newly diagnosed breast cancer within 2 years after a baseline screening mammogram that recommended short-interval follow-up was 1.02%. The incidences of newly diagnosed breast cancer within 2 years after a baseline screening mammogram that was described as having "negative" and "benign" findings were 0.45% and 0.61%, respectively (Table 2). The predictive value of a baseline mammogram with a recommendation for short-interval follow-up increased slightly with age, from 1.2% among women aged 50–59 years to 2.0% among women aged 70–79 years (Table 3). Of the 308 cases of breast cancer that were newly diagnosed within 2 years after the baseline mammogram, only 30 (10%) occurred in women who had a baseline mammogram with a recommendation for short-interval follow-up (Table 4). Eleven (37%) of those 30 cancers were diagnosed within 12 months after the baseline mammogram result was obtained; 10 of those 11 cancers involved the same breast for which the mammographic results were reported, and one cancer involved the opposite breast. Nine of the 11 cancers diagnosed within 12 months after the baseline mammogram result was obtained were invasive and none were associated with positive lymph nodes; tumor size at diagnosis ranged from 8 to 20 mm. The temporal distribution of cancer diagnoses was associated with a substantial shift toward an earlier diagnosis after a baseline mammogram with a recommendation for short-interval follow-up compared with the timing of diagnoses after a "negative" or "benign" baseline mammogram, which is consistent with the hypothesis that some of the abnormalities were malignant (Table 4).


 
Table 2. Incidence of breast cancer among participants in the Women’s Health Initiative for at least 2 years after having a baseline mammogram that recommended short-interval follow-up (N = 56 542)
 

 
Table 3. Incidence of breast cancer among participants in the Women’s Health Initiative who were followed for at least 2 years (N = 2927) after having a baseline mammogram with a recommendation for short-interval follow-up, stratified by age
 

 
Table 4. Time to breast cancer diagnosis after baseline mammography among participants in the Women’s Health Initiative who were diagnosed with in situ or invasive breast cancer within 2 years after their baseline mammogram (N = 308)
 
The mean prevalence of baseline mammograms with a recommendation for short-interval follow-up among the 40 WHI clinical centers was 5% (range = 1.2%–9.8%) (data not shown). This variability across centers remained statistically significant (N = 44 976; P<.001), even after we adjusted for all of the patient characteristics listed in Table 1, plus parity, age at menopause, and a diagnosis of breast cancer within 2 years after the baseline mammogram in the logistic regression analysis (data not shown).

The adjusted odds ratio for having a short-interval follow-up recommendation (versus a recommendation for routine 1- or 2-year follow-up) at baseline varied from 0.4 to 3.2 across WHI centers when we used the largest center as the reference group (data not shown). When we categorized the clinical centers into quartiles on the basis of prevalence of short-interval follow-up recommendations at baseline, the predictive value of a short-interval follow-up recommendation did not differ to a statistically or clinically significant degree across the quartiles (data not shown). A lower predictive value in the higher-prevalence quartiles would have been expected if centers with a higher prevalence were less selective in offering this recommendation.

We extended our analysis by calculating the incidence of breast cancer among WHI participants who had a baseline mammogram described as "negative" or "benign" and then subsequently had an annual or biennial follow-up mammogram with a recommendation for short-interval follow-up (Table 5). The cumulative incidence of short-interval follow-up recommendations among the participants with "negative" or "benign" baseline mammograms was 7.2% (data not shown), which was slightly greater than the point prevalence of such recommendations at baseline among all participants (5%). Among women who had completed at least 2 years of further follow-up, the incidence of newly diagnosed breast cancer was 83.8% (likelihood ratio [LR] = 531, 95% CI = 234 to 1320), 24.5% (LR = 33.4, 95% CI = 26.8 to 40.9), and 2.1% (LR = 2.20, 95% CI = 1.65 to 2.86) within 2 years after receiving mammographic recommendations that were consistent with BI-RADS categories 5, 4, and 3, respectively. Among the 44 (2.1%) women who had a breast cancer diagnosis after a short-interval follow-up recommendation, 36 women had cancer involving the same breast, 34 women had invasive breast cancer, and five women had associated nodal involvement at the time of diagnosis (Table 5). By comparison, the incidence of newly diagnosed breast cancer was 0.5% (LR = 0.54, 95% CI = 0.43 to 0.66) and 0.4% (LR = 0.37, 95% CI = 0.28 to 0.47) within 2 years after "benign" and "negative" mammograms, respectively. The overall sensitivity of mammography, based on 2-year follow-up, was 58% if mammograms with short-interval follow-up recommendations were classified as positive and 43% if these mammograms were classified as negative.
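The percentages in this paragraph follow directly from the reported counts, as the sketch below shows. It also back-calculates the pretest probability implied by the reported likelihood ratio of 2.20. That value is not reported in the text; it is derived here under the assumption that the likelihood ratio is the post-test odds divided by the pretest odds, and rounding in the published figures makes it approximate.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


def prob_from_odds(o):
    """Convert odds back to a probability."""
    return o / (1.0 + o)


def implied_pretest_probability(posttest_prob, lr):
    """Pretest probability implied by a post-test probability and a likelihood ratio."""
    return prob_from_odds(odds(posttest_prob) / lr)


cumulative_incidence = 2097 / 29135   # short-interval recommendations after a normal baseline
posttest = 44 / 2097                  # cancers among women with such a recommendation
pretest = implied_pretest_probability(posttest, 2.20)

print(f"{cumulative_incidence:.1%}")  # 7.2%
print(f"{posttest:.1%}")              # 2.1%
print(f"{pretest:.2%}")               # implied 2-year cancer probability in this subsample
```

The implied pretest probability comes out at roughly 1%, which is in line with the low overall 2-year incidence of breast cancer in this subsample.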


    DISCUSSION
 
The predictive value of baseline mammographic recommendations for short-interval follow-up was approximately 1% over a 2-year follow-up period for a large sample of postmenopausal women participating in multicenter clinical trials. A woman who had a mammogram with such a recommendation was 1.7 times more likely than a woman with a "benign" mammogram and 2.3 times more likely than a woman with a "negative" mammogram to be diagnosed with breast cancer in the 2 years following the mammogram. Although there was substantial variability in the prevalence of short-interval follow-up recommendations across the 40 clinical centers that could not be explained by observable patient characteristics or prevalence of preexisting breast cancer, that variability was not associated with the predictive value of the recommendation. The predictive value of short-interval follow-up recommendations at follow-up mammography appeared to be higher than the predictive value of short-interval follow-up recommendations at baseline mammography, perhaps because the availability of prior negative films for comparison in the former case reduced the risk of a false-positive interpretation (18). Although 95% of the 56 542 women eligible for this analysis had at least one mammogram before their enrollment in the WHI, we do not know whether that prior mammogram was available to the radiologist interpreting the baseline study.
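The relative likelihoods quoted above are the ratios of the 2-year incidences reported in the Results section (1.02%, 0.61%, and 0.45%). A minimal check, using only those published percentages:

```python
# Two-year breast cancer incidence by baseline mammogram result, as reported in the Results.
incidence = {
    "short_interval": 1.02,  # percent
    "benign": 0.61,
    "negative": 0.45,
}


def relative_likelihood(group, reference):
    """Ratio of 2-year incidence in `group` to that in `reference`."""
    return incidence[group] / incidence[reference]


print(round(relative_likelihood("short_interval", "benign"), 1))    # 1.7
print(round(relative_likelihood("short_interval", "negative"), 1))  # 2.3
```

These are unadjusted ratios of crude incidences and do not account for differences in participant characteristics between the groups.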

Our primary analyses focused on baseline mammograms because use of hormone replacement therapy, a potentially important confounder of the association between mammographic abnormalities and breast cancer, was unknown for HRT-CT participants at follow-up. However, our secondary analyses of subsequent mammograms revealed that a recommendation for short-interval follow-up was associated with a surprisingly low sensitivity of 58% (if we classified such mammograms as positive). This high false-negative rate should be interpreted cautiously, given the unusually high vigilance, with respect to breast health, of participants and physicians in the WHI. Many of these women were taking, or believed that they were taking, a medication known to increase the risk of breast cancer. In addition, these women were relatively well educated and sufficiently motivated and concerned about their health to enroll in a major randomized clinical trial. Hence, it is likely that they were receiving regular clinical breast examinations and doing frequent breast self-examinations. These factors might have led to the earlier clinical detection of cancers that, under ordinary circumstances, would have been detected only by mammography.

Our findings regarding the prevalence and predictive value of short-interval follow-up recommendations are consistent with the predictive values of 2.0% or less that were reported by previous studies (9,10,19) of mammographic follow-up of "probably benign finding – short-interval follow-up suggested" (i.e., BI-RADS category 3) lesions. The only studies that reported higher predictive values than ours included subsets of category 3 mammograms that were referred for biopsy (9,20). We found that the predictive value of recommendations for short-interval follow-up was slightly higher among women 70–79 years of age than among younger women, principally because prevalence of disease is greater among older women. By comparison with a prior study (21), we found a slightly lower positive predictive value associated with BI-RADS category 3 mammograms and a higher false-negative rate associated with BI-RADS category 1 and 2 mammograms. The superior performance of screening mammography in that study may be due to several factors, such as more aggressive follow-up or the possibility that higher quality mammography was performed at academic centers such as the University of California, San Francisco (9,19).

The temporal distribution of breast cancer diagnoses after baseline mammography suggests that a few cancers were, in fact, diagnosed earlier than they would have been had those women not received a recommendation for an early recall for follow-up mammography. However, the value of short-interval follow-up mammography was limited because only 11 cases of breast cancer, representing approximately 0.4% of the 2927 mammograms with that recommendation, were diagnosed during the 1-year period before the next regularly scheduled mammogram. Although 19 women were diagnosed with breast cancer in the second year after having a mammogram with such a recommendation, they appear not to have benefited from short-interval (6-month) follow-up because most of their 19 cancers presumably would have been diagnosed at about the same time if follow-up had been scheduled at 1 year. This result is supported by other studies (9,20) that have examined relatively large numbers of category 3 mammograms and their subsequent clinical outcomes. For example, in one study (9) of 3184 category 3 mammograms, only two breast cancers (0.06%) were found at the 6-month follow-up.

The size and geographic distribution of the study sample and the completeness of data collection at each study center are important strengths of our study. The WHI has 40 centers in a wide range of both academic and community settings, which were selected to maximize generalizability of trial results. The reporting of mammography results is likely to be accurate, because all abnormal mammograms receive special attention. Each category of mammogram is required by the WHI protocol to have appropriate follow-up, and the results must be documented at semiannual and annual visits before the study pills are dispensed. Our study also had the advantage that false-negative baseline mammograms could be identified because the WHI protocol requires aggressive follow-up of every participant. Participants who stop visiting the clinic are still contacted to obtain study endpoint data, which include the diagnosis of breast cancer from hospital and cancer registry data.

The quality of mammography varies greatly across the United States, such that the predictive value of an abnormal screening mammogram may be three times higher in academic centers than in community-based practices (22). WHI clinical centers use both university-based and community-based mammographic facilities, and the proportions of each vary from center to center. Reliability of mammographic interpretations may be suboptimal because each mammogram was probably interpreted by only one radiologist. Although these design features may explain why the positive predictive value of mammograms with a short-interval follow-up recommendation and the negative predictive value of "negative" and "benign" mammograms were lower among women participating in the WHI than among women in some prior studies (19,20), they also enhance the generalizability of our results to community practice.

Our study has several limitations. First, our results are not applicable to premenopausal women or to women aged 40–49 years, who were ineligible to participate in the WHI. The predictive value of short-interval follow-up recommendations among premenopausal women may be lower or higher than the estimates reported here because breast cancers in this age group are less prevalent but more aggressive than they are in postmenopausal women (23,24). Second, we focused on radiologists’ recommendations for short-interval follow-up rather than on the BI-RADS assessment scheme per se. We did so for two reasons. BI-RADS usage was not mandated under the Mammography Quality Standards Act (25) until April 1999; therefore, WHI data collectors had difficulty retrospectively assigning some prior mammography reports to BI-RADS categories. In addition, radiologists’ assessments and recommendations are sometimes inconsistent. In the primary care setting, physicians are likely to focus on and follow the radiologist’s final recommendation and only cursorily scan the accompanying explanation and assessment. Third, the results of interim steps in evaluating abnormal findings, such as 3- to 6-month follow-up mammography after a baseline recommendation for short-interval follow-up, are not documented in the WHI database. Therefore, we have no information on the diagnostic pathway by which women were proven to have, or not to have, breast cancer after receiving a mammographic recommendation for short-interval follow-up. We also have no data on whether a radiologist requested additional imaging before offering a definitive interpretation.

Given that a recommendation for short-interval follow-up may account for over 40%–50% of abnormal screening mammograms (12,21), our results should stimulate re-examination of the criteria used to make this recommendation. In addition, the recommended timing of short-interval follow-up at 6 months should be critically examined. BI-RADS category 3 mammograms have economic implications for health care delivery, and emotional consequences for the women who receive them, because they inevitably lead to more mammographic examinations and, probably, more biopsies for changes that turn out to be benign. Our results, and those of other recent studies (5,9,21), suggest that a 1-year follow-up of BI-RADS category 3 mammograms may be as appropriate as, or more appropriate than, a 6-month follow-up. The higher predictive value of new BI-RADS category 3 abnormalities, after a woman has had a normal baseline mammogram, suggests that the availability of prior mammograms may improve the usefulness of this classification.


    NOTES
 
Editor’s note: J. A. Robbins conducts research sponsored by Merck Research Laboratories (Whitehouse Station, NJ), Wyeth (Madison, NJ), and Novartis (Broomfield, CO) and is a member of the speakers’ bureau for Merck and Procter and Gamble (Informagen, Inc., Biotechnology/Pharmaceutical Companies, Newington, NH).

Supported by N01-WH-3–2100 through -2102, -2105, -2106, -2108 through -2113, -2115, -2118 through -2120, and -2122; M01-RR-00-425; N01-WH-4-2107 through -2126, and -2129 through -2132 as part of the WHI program (funded by the National Heart, Lung, and Blood Institute, National Institutes of Health, U.S. Department of Health and Human Services).


    REFERENCES
 

1 Humphrey LL, Helfand M, Chan BK, Woolf SH. Breast cancer screening: a summary of the evidence for the U.S. Preventive Services Task Force. Ann Intern Med 2002;137(5 Part 1):347–60.

2 American College of Radiology. Breast Imaging Reporting and Data System (BI-RADS). 3rd ed. Reston (VA): American College of Radiology; 1998. [Last accessed: 02/06/2003.] Available at: http://www.imaginis.com/breasthealth/acrbi.asp and then click on the first link under "Additional Resources and References" at the bottom of the page (http://www.acr.org/cgi-bin/fr?mast:masthead-products,text:/departments/standaccred/birads-a.html).

3 Velanovich V. Immediate biopsy versus observation for abnormal findings on mammograms: an analysis of potential outcomes and costs. Am J Surg 1995;170:327–32.

4 Hall FM, Storella JM, Silverstone DZ, Wyshak G. Non-palpable breast lesions: recommendations for biopsy based on suspicion of carcinoma at mammography. Radiology 1988;167:353–8.

5 Caplan LS, Blackman D, Nadel M, Monticciolo DL. Coding mammograms using the classification "probably benign finding--short interval follow-up suggested". AJR Am J Roentgenol 1999;172:339–42.

6 Cyrlak D. Induced costs of low cost screening mammography. Radiology 1988;168:661–3.

7 Barton MB, Moore S, Polk S, Shtatland E, Elmore JG, Fletcher SW. Increased patient concern after false-positive mammograms: clinician documentation and subsequent ambulatory visits. J Gen Intern Med 2001;16:150–6.

8 Ong G, Austoker J, Brett J. Breast screening: adverse psychological consequences one month after placing women on early recall because of diagnostic uncertainty. A multicenter study. J Med Screen 1997;4:158–68.

9 Sickles EA. Periodic mammographic follow-up of probably benign lesions: results in 3,184 consecutive cases. Radiology 1991;179:463–8.

10 Orel SG, Kay N, Reynolds C, Sullivan DC. BI-RADS categorization as a predictor of malignancy. Radiology 1999;211:845–50.

11 The Women’s Health Initiative Study Group. Design of the Women’s Health Initiative Clinical Trial and Observational Study. Controlled Clin Trials 1998;19:61–109.

12 Taplin SH, Ichikawa LE, Kerlikowske K, Ernster VL, Rosenberg RD, Yankaskas BC, et al. Concordance of breast imaging reporting and data system assessments and management recommendations in screening mammography. Radiology 2002;222:529–35.

13 Imaginis. Breast cancer diagnosis. Mammogram interpretation: categories and the ACR/BI-RADS. [Last accessed: 02/06/2003.] Available at: http://www.imaginis.com/breasthealth/acrbi.asp and then click on the first link under "Additional Resources and References" at the bottom of the page (http://www.acr.org/cgi-bin/fr?mast:masthead-products,text:/departments/standaccred/birads-a.html).

14 Roberts MM, Alexander FE, Anderson TJ, Chetty U, Donnan PT, et al. Edinburgh trial of screening for breast cancer: mortality at seven years. Lancet 1990;335:241–6.

15 Mossman D, Berger JO. Intervals for posttest probabilities: a comparison of 5 methods. Med Decis Making 2001;21:498–507.

16 Simel DL, Samsa GP, Matchar DB. Likelihood ratios with confidence: sample size estimation for diagnostic studies. J Clin Epidemiol 1991;44:763–70.

17 Weissert WG. Estimating the long-term care population: prevalence rates and selected characteristics. Health Care Financ Rev 1985;6:83–91.

18 Specificity of screening in United Kingdom trial of early detection of breast cancer. BMJ 1992;304:346–9.

19 Kerlikowske K, Grady D, Barclay J, Sickles EA, Ernster VL. Likelihood ratios for modern screening mammography. Risk of breast cancer based on age and mammographic interpretation. JAMA 1996;276:39–43.

20 Varas X, Leborgne F, Leborgne JH. Nonpalpable, probably benign lesions: role of follow-up mammography. Radiology 1992;184:409–14.

21 Lacquement MA, Mitchell D, Hollingsworth AB. Positive predictive value of the Breast Imaging Reporting and Data System. J Am Coll Surg 1999;189:34–40.

22 Brown ML, Houn F, Sickles EA, Kessler LG. Screening mammography in community practice: positive predictive value of abnormal findings and yield of follow-up diagnostic procedures. AJR Am J Roentgenol 1995;165:1373–7.

23 Retsky M, Demicheli R, Hrushesky W. Breast cancer screening for women aged 40–49 years: screening may not be the benign process usually thought. J Natl Cancer Inst 2001;93:1572.

24 Retsky M, Demicheli R, Hrushesky W. Premenopausal status accelerates relapse in node positive breast cancer: hypothesis links angiogenesis, screening controversy. Breast Cancer Res Treat 2001;65:217–24.

25 Code of Federal Regulations: Quality Mammography Standards. Final Rule. 21 C.F.R. Parts 16 and 900 (1997). [Last accessed: 02/06/2003.] Available at: http://www.fda.gov/cdrh/mammography/mqsa_accomplishments.html.

Manuscript received February 27, 2002; revised January 7, 2003; accepted January 21, 2003.


Copyright © 2003 Oxford University Press (unless otherwise stated)