Validity of Self-reported Mammography: Examining Recall and Covariates among Older Women in a Health Maintenance Organization

Lee S. Caplan1,5, Margaret T. Mandelson2, and Lynda A. Anderson3,4

1 Division of Cancer Prevention and Control, National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control and Prevention, Atlanta, GA.
2 Center for Health Studies, Group Health Cooperative of Puget Sound, Seattle, WA.
3 Prevention Research Centers Program, National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control and Prevention, Atlanta, GA.
4 Rollins School of Public Health, Emory University, Atlanta, GA.
5 Current affiliation: Prevention Research Center, Morehouse School of Medicine, Atlanta, GA.

Received for publication March 21, 2002; accepted for publication September 3, 2002.


    ABSTRACT
 
Self-reports of screening are frequently used in place of chart abstraction, particularly in outpatient settings, because they are generally less expensive and frequently provide the only information available. The authors expanded the literature on validation of self-reported mammography by including the validity of recall and by assessing covariates in a setting where women were examined more than once. In 1995, this study validated mammography use in a sample of 949 women aged 50–80 years who were members of a health maintenance organization with centralized automated records of mammographic examinations. The majority of women had had a mammogram within the previous 2 years according to self-reports and records, but self-reported rates exceeded record rates by 8.2%. Sensitivity was high (93.8%), whereas specificity was low (53.6%). The overall agreement between self-reports and records was 82.7%. The kappa value was 0.52, indicating fair agreement beyond chance. Modeling with logistic regression revealed that being a college graduate and having a first-degree relative with breast cancer were significantly associated with accurate recall. Comparison of actual time interval data revealed that disagreements consisted largely of women’s underestimates of time since their last screening. These results add to knowledge about the validity of self-reported mammographic screening data in settings where women are screened more than once.

data collection; mammography; neoplasms; recall; reliability; reproducibility of results; validity; women’s health


    INTRODUCTION
 
Assessment of the prevalence of screening in a particular population must rely on inexpensive, valid, and reliable measures. Self-reports of screening are frequently used in place of chart abstraction, particularly in outpatient settings, because they are generally less expensive and less difficult to monitor over time, and they frequently provide the only information available (1). A number of studies have examined the validity of self-reported mammography (1–10), and in most cases, the validity of self-reported mammography data was demonstrated. However, a recent systematic review of the literature examined the accuracy of self-reported data and cast serious doubts on the practice of relying exclusively on self-reported information on health behaviors, including mammographic screening (11). In that review, close examination of the mammographic screening data revealed good sensitivity (ranging from 81 percent to 99 percent) but lower specificity (ranging from 50 percent to 85 percent). One factor that influences the accuracy of self-reported data is "telescoping," a phenomenon whereby an event is thought to have happened more recently than it actually did (1–3, 7, 10). Other factors have been less consistently associated with self-reporting (11).

We assessed the validity of self-reports of mammography in a sample of women aged 50–80 years who were members of a health maintenance organization, the Group Health Cooperative of Puget Sound. Our study contributes to the literature in two important ways. First, our study included extensive information on covariates, which enabled us to examine the validity of self-reported mammography independently of confounding factors and within subgroups (e.g., by smoking status). Second, distinct from earlier work in this area, which generally focused on recall of the first or second mammogram, we examined a woman’s most recent mammogram, regardless of her number of prior mammograms. Such results should be more relevant to the assessment of mammography recall, because women are now more likely to obtain periodic mammograms than a single mammogram. Information on the most recent mammogram is being used as a key measure in analytical studies and is meaningful from a public health standpoint. For example, information on a recent mammogram would have more relevance to case-control studies evaluating an outcome of breast cancer, both in terms of assessing recent mammography as a risk/protective factor (or confounding factor) and in terms of using history of recent mammography to examine detection or screening bias when investigating issues such as oral contraceptive use as a risk factor. This consideration is particularly important in settings such as managed care programs; in the Group Health Cooperative in 1996, 96.8 percent of women aged 50 years or older who were screened reported having a prior mammogram.


    MATERIALS AND METHODS
 
Our study was part of the Encouraging Prevention in Older Women (EnPower) project, conducted at the Group Health Cooperative between June and November of 1995. The project addressed four important health promotion practices: cancer screening, hormone replacement therapy, smoking status, and physical activity. The Group Health Cooperative is a staff-model health maintenance organization serving more than 420,000 enrollees in western Washington State. All female members of the Group Health Cooperative aged 40 years or older are invited to enroll in the Breast Cancer Screening Program, which was established in 1986. Program enrollment begins with completion of a risk factor questionnaire and includes regular reminders to women who are due for screening. Eighty-five percent of women complete the questionnaire and enroll in the program. Physicians may also order mammograms during the course of usual care for screening and for diagnostic evaluation. Results of mammographic examinations are stored in a centralized database that includes the patient’s medical history number, the date of the examination, and the names of the Group Health Cooperative radiologist and the radiology center where the examination was performed. Examinations performed through the Breast Cancer Screening Program also contain information on the indication for examination, the results, and recommendations for follow-up. Female Group Health Cooperative members were eligible to participate in the telephone survey if they were between the ages of 50 and 80 years, had an identified primary care physician, and had been continuously enrolled in the Group Health Cooperative for at least 2 years prior to January 1, 1995. The survey methodology has been described in detail elsewhere (12, 13).

Of the 1,395 eligible women invited to take part in the survey, 1,120 (80.3 percent) agreed to participate. One woman had to be excluded because her data were lost due to a software problem. For our study, we excluded 75 participants who reported prior breast cancer and five who had missing survey data, which made it impossible to determine their mammography status. We also excluded 90 women who were not continuously enrolled in the Group Health Cooperative for at least 5 years prior to the survey, so that the study time frame was consistent with the mammography questions asked in the survey. The final sample (n = 949) had a mean length of enrollment in the Group Health Cooperative of approximately 19 years (median, 18 years; range, 12–57 years).
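The sample derivation above can be checked with simple arithmetic. The following minimal Python sketch uses only the counts reported in the text; the variable names are illustrative and are not taken from the study.

```python
# Counts reported in the text; variable names are illustrative only.
invited = 1395
agreed = 1120
lost_to_software_problem = 1
prior_breast_cancer = 75
missing_survey_data = 5
not_enrolled_5_years = 90

# Participation among eligible women who were invited.
response_rate = 100 * agreed / invited

# Apply the exclusions in the order described in the text.
final_sample = (agreed
                - lost_to_software_problem
                - prior_breast_cancer
                - missing_survey_data
                - not_enrolled_5_years)

print(f"Response rate: {response_rate:.1f} percent")
print(f"Final sample: n = {final_sample}")
```

The exclusions account exactly for the difference between the 1,120 respondents and the final sample of 949.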

We conducted computer-assisted telephone surveys that included questions on demographic information, perceived health status, perceived risk of ever getting breast cancer compared with other women of the same age, family history of breast cancer, number of years since last preventive checkup, and smoking status. Self-reported mammography was assessed through a series of questions concerning whether the woman had ever had a mammogram and when she had undergone her last one. For the women who reported ever having had a mammogram, the recency of the most recent mammogram was recorded as within the past year (≤12 months), over 1 year but within the past 2 years, over 2 years but within the past 5 years, and over 5 years. The self-reported data were subsequently linked to a centralized database containing computerized mammography records to validate the survey responses (herein referred to as medical records). Perceived personal risk of developing breast cancer was assessed by asking each woman to rate her own risk of ever getting breast cancer as lower, higher, or about the same as that of other women her age. We compared self-reports and medical records for having had a mammogram within the past 2 years. Using the medical records as our "gold standard," we calculated sensitivity, specificity, and positive and negative predictive values for the overall sample and for various subgroups. We then calculated the percentage of overall agreement between self-reports and medical records. Logistic modeling was used to determine whether any variables were independently associated with overall agreement regarding whether a woman had had a mammogram within the previous 2 years, while controlling for all other variables in the model (14). Cohen’s kappa statistic (κ) was then used to determine the level of agreement between self-reports and medical records once chance agreement was removed (15). Values of κ range from 1.0 (complete agreement) to –1.0 (complete disagreement), with a score of zero indicating expected agreement by chance alone. We chose a 2-year interval because most women aged 50 years or older in the Group Health Cooperative have coverage to receive mammographic screening once every 2 years, which is consistent with screening guidelines that recommend an interval of 1–2 years between screenings (16).
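For reference, the measures named above have standard textbook definitions, sketched below. The cell notation is introduced here for illustration and is not taken from the article: a is the number of women with a mammogram according to both self-report and record, b according to self-report only, c according to the record only, d according to neither, and n = a + b + c + d.

```latex
\begin{align*}
\text{sensitivity} &= \frac{a}{a+c}, & \text{specificity} &= \frac{d}{b+d},\\
\text{PPV} &= \frac{a}{a+b}, & \text{NPV} &= \frac{d}{c+d},\\
p_o &= \frac{a+d}{n}, & p_e &= \frac{(a+b)(a+c) + (c+d)(b+d)}{n^2},\\
\kappa &= \frac{p_o - p_e}{1 - p_e}.
\end{align*}
```

Here p_o is the observed percentage of overall agreement (as a proportion) and p_e is the agreement expected by chance from the marginal totals, so κ equals zero when observed agreement is no better than chance.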


    RESULTS
 
Of the 949 women included in our analysis, more than half (53.2 percent) were between 50 and 64 years of age, nearly one third (31.6 percent) were between the ages of 65 and 74 years, and the remainder were between the ages of 75 and 80 years (table 1). The majority of participants were White (89 percent), had an educational level of high school graduation or above (90.3 percent), and rated their health as good or excellent (85.7 percent). More than half (54.1 percent) reported that their risk of ever developing breast cancer was lower than other women’s risk; only one in six rated their risk as higher (16.2 percent).


 
TABLE 1. Characteristics of study participants and of those who reported having had a mammogram in the previous 2 years, Group Health Cooperative of Puget Sound, 1995
 
The proportion of women who reported that they had had a mammogram within the past 2 years (766/949) exceeded the proportion based on medical records (688/949). Sensitivity for determining whether a woman had had a mammogram within the past 2 years was 93.8 percent; the positive predictive value was 84.2 percent; and the negative predictive value was 76.5 percent (table 2). Specificity was 53.6 percent. The overall percentage of agreement was 82.7 percent, and the kappa value was 0.52.
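The reported measures can be reproduced from a 2 × 2 table. The Python sketch below is illustrative only: the cell counts of table 2 are not reproduced in this text, so the count of women with a mammogram according to both sources (645) is an assumption, reconstructed as the only integer consistent with the reported margins (766 and 688 of 949) and the reported sensitivity of 93.8 percent. The function and variable names are hypothetical.

```python
def agreement_measures(both_yes, self_only, record_only, both_no):
    """Validity and agreement measures, treating the record as the gold standard.

    both_yes:    self-report yes, record yes
    self_only:   self-report yes, record no
    record_only: self-report no,  record yes
    both_no:     self-report no,  record no
    """
    n = both_yes + self_only + record_only + both_no
    self_yes = both_yes + self_only
    record_yes = both_yes + record_only
    observed = (both_yes + both_no) / n
    # Chance agreement expected from the marginal totals (Cohen's kappa).
    expected = (self_yes * record_yes
                + (n - self_yes) * (n - record_yes)) / n**2
    return {
        "sensitivity": both_yes / record_yes,
        "specificity": both_no / (n - record_yes),
        "ppv": both_yes / self_yes,
        "npv": both_no / (n - self_yes),
        "agreement": observed,
        "kappa": (observed - expected) / (1 - expected),
    }


# Margins reported in the text.
n, self_yes, record_yes = 949, 766, 688

# ASSUMPTION: reconstructed from the margins and the reported sensitivity,
# not read from table 2.
both_yes = 645
self_only = self_yes - both_yes
record_only = record_yes - both_yes
both_no = n - both_yes - self_only - record_only

measures = agreement_measures(both_yes, self_only, record_only, both_no)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Under this reconstruction, the computed values agree with those reported in the text to the precision given, which suggests that the reported measures are internally consistent.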


 
TABLE 2. Validity of self-reports of mammography during the previous 2 years and agreement between self-reports and laboratory records on mammography (n = 949), Group Health Cooperative of Puget Sound, 1995
 
We explored the phenomenon of "telescoping" by examining the actual interval category since the most recent mammogram (data not shown). We found that among women whose self-reported interval differed from the interval in the computerized records, 218 underestimated or "telescoped" and 89 overestimated the interval since their last mammogram.

The odds of accurate recall of mammography within the past 2 years were over twice as high for women who graduated from college as for women who did not graduate from high school (odds ratio = 2.22, 95 percent confidence interval: 1.21, 4.09) and for women with a first-degree relative with breast cancer as for women without one (odds ratio = 2.22, 95 percent confidence interval: 1.09, 4.51). The accuracy of self-reports was not related to age, race/ethnicity, years since last preventive checkup, smoking status, perceived health status, or perceived risk of developing breast cancer, after controlling for all of the other variables in the model (table 3).
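A quick plausibility check can be applied to the intervals reported above. A Wald-type 95 percent confidence interval for an odds ratio is symmetric on the log scale, so the point estimate should lie near the geometric mean of its limits, and the standard error of the log odds ratio can be approximated from the interval width. The Python sketch below assumes the reported intervals are Wald intervals, which the text does not state; the function name is hypothetical.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Return (geometric-mean odds ratio, approximate SE of the log odds ratio).

    ASSUMPTION: the interval is a Wald interval, exp(log(OR) +/- z * SE).
    """
    midpoint = math.exp((math.log(lower) + math.log(upper)) / 2)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return midpoint, se


# Confidence limits as reported in the text.
reported = {
    "college graduate": (1.21, 4.09),
    "first-degree relative with breast cancer": (1.09, 4.51),
}

for label, (lower, upper) in reported.items():
    midpoint, se = log_scale_summary(lower, upper)
    print(f"{label}: implied OR = {midpoint:.2f}, SE(log OR) = {se:.2f}")
```

Both implied odds ratios fall within rounding error of the reported value of 2.22, and the wider interval for family history corresponds to a larger standard error.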


 
TABLE 3. Associations between characteristics of study participants and validated self-reports of mammography in the past 2 years, Group Health Cooperative of Puget Sound, 1995
 

    DISCUSSION
 
We found that self-reported data on mammography use for women in a managed care setting are highly sensitive for assessing the prevalence of mammographic screening. Our results are similar to those of prior studies on the use of screening tests for female cancers in health maintenance organization populations, which have found high sensitivity and low specificity (4, 10). In our study, specificity for having had a mammogram within the past 2 years was low. This could be due to underestimation of actual time since the last mammogram: When women were asked how long it had been since their last mammogram, they tended to underestimate the interval. Such "telescoping" is one of the most consistent findings among studies that compare self-reports and medical records (3, 5, 7–10). In fact, reporting the date of the procedure is generally more problematic than reporting whether mammography occurred at all. As a result, self-reports can often overestimate the use of the test within a certain period of time.

An important consideration in analysis of data such as these is whether the assessment is a validity study or a reliability study. A validity study is one in which an assumption is made that there is a "gold standard," and thus it is designed to determine how close the results from the nongold standard come to those from the gold standard. Therefore, measures of validity are appropriate to use, including sensitivity and specificity. A reliability study is one in which it is not assumed that there is a gold standard; the study is therefore designed to examine the degree to which the results from the two nongold standards are in agreement with each other. Thus, in a reliability study, only measures of agreement are appropriate to use, including percentage of overall agreement and κ. The basic question concerns whether the medical records (or, in this case, computerized laboratory data) are "accurate" and therefore can serve as the gold standard. We treated the computerized data as the gold standard against which we compared self-reported information about mammography. Because Group Health Cooperative members tend to receive most of their care through this organization, and because our sample was derived from a centralized data source, there was a high likelihood that the data were very complete. For example, a survey of 14,205 women enrolled in the Group Health Cooperative who received mammograms in 2001 showed that only 475 (3.4 percent) reported having a mammogram outside of the Group Health Cooperative system (unpublished data). Because a small number of women may have received care outside of this setting and because errors are sometimes made in the notation and entry of data, we included measures of reliability between the self-reports and the computerized records (i.e., percentage of overall agreement and κ) in addition to measures of validity.

Caution is necessary concerning the generalizability of our findings to the entire US population and other diverse populations, because of the characteristics of our study sample and setting. Because of our reliance on Group Health Cooperative laboratory records, validation of mammography was restricted to women whose last test was performed at a Group Health Cooperative facility. Additionally, we relied on the computerized mammography data and did not review data from individual women’s charts. Finally, no attempt was made to distinguish between screening tests and diagnostic tests.

Despite its limitations, our study contributes important information to our understanding of the validity of self-reported mammography, independently of confounding factors and within subgroups. It evaluated self-reporting in an evolving health system, where it is important to examine mammography among women who have received more than one mammogram. We found that self-reported data on whether the most recent mammogram occurred within a defined interval (2 years) could be used in clinical decision-making and surveillance. However, it would certainly be preferable to use medical records if they were available at a manageable cost and level of effort.

The accuracy of self-reported information on health behaviors in general has recently been called into question by a critical review of the literature conducted by Newell et al. (11). They found that there is often a marked discrepancy between what people report and what is found in their medical records, and that self-reported data consistently underestimate the proportion of individuals considered at risk. This is consistent with our finding that women overestimated whether they had had a recent mammogram, because women who had not had a recent mammogram would be at risk of not having breast cancer diagnosed early in comparison with women who had had one. In the review, Newell et al. suggested a number of important ways in which the accuracy of self-reported data could be improved, including: ensuring that respondents fully understand the questions; phrasing questions in a way that minimizes socially desirable responses; using bounded recall to improve respondents’ recall of less recent events; ensuring that questions have clear, exhaustive, mutually exclusive response categories which assess the behavior of interest; and employing "bogus pipeline" techniques which deceive respondents into believing that their answers to some behavioral questions will be objectively verified. Although our own findings lead us to suggest that self-reported data on mammographic examinations indeed have value, we agree that investigators relying on self-reported data should adopt the suggestions of Newell et al. on how to improve the accuracy of such data.


    ACKNOWLEDGMENTS
 
This study was supported by grant U48/CCU415794-05 from the National Center for Chronic Disease Prevention and Health Promotion.


    NOTES
 
Reprint requests to Dr. Lee S. Caplan, Morehouse School of Medicine, 777 Cleveland Avenue, Suite 410, Atlanta, GA 30315 (e-mail: lcaplan@msm.edu).


    REFERENCES
 

  1. Paskett ED, Tatum CM, Mack DW, et al. Validation of self-reported breast and cervical cancer screening tests among low-income minority women. Cancer Epidemiol Biomarkers Prev 1996;5:721–6.
  2. Etzi S, Lane DS, Grimson RC. The use of mammography vans by low-income women: the accuracy of self-reports. Am J Public Health 1994;84:107–9.
  3. Fulton-Kehoe D, Burg MA, Lane DS. Are self-reported dates of mammograms accurate? Public Health Rev 1992–93;20:233–40.
  4. King ES, Rimer BK, Trock B, et al. How valid are mammography self-reports? Am J Public Health 1990;80:1386–8.
  5. Degnan D, Harris R, Ranney J, et al. Measuring the use of mammography: two methods compared. Am J Public Health 1992;82:1386–8.
  6. Crane LA, Kaplan CP, Bastani R, et al. Determinants of adherence among health department patients referred for a mammogram. Women Health 1996;24:43–64.
  7. Zapka JG, Bigelow C, Hurley T. Mammography use among sociodemographically diverse women: the accuracy of self-report. Am J Public Health 1996;86:1016–21.
  8. Johnson CS, Archer J, Campos-Outcalt D. Accuracy of Pap smear and mammogram self-reports in a southwestern Native American tribe. Am J Prev Med 1995;11:360–3.
  9. Suarez L, Goldman DA, Weiss NS. Validity of Pap smear and mammogram self-reports in a low-income Hispanic population. Am J Prev Med 1995;11:94–8.
  10. Gordon NP, Hiatt RA, Lampert DI. Concordance of self-reported data and medical record audit for six cancer screening procedures. J Natl Cancer Inst 1993;85:566–70.
  11. Newell SA, Girgis A, Sanson-Fisher RW, et al. The accuracy of self-reported health behaviors and risk factors relating to cancer and cardiovascular disease in the general population: a critical review. Am J Prev Med 1999;17:211–29.
  12. Newton KM, LaCroix AZ, Leveille SG, et al. Women’s beliefs and decisions about hormone replacement therapy. J Womens Health 1997;6:459–65.
  13. Mandelson MT, LaCroix AZ, Anderson LA, et al. Comparison of self-reported fecal occult blood testing with automated laboratory records among older women in a health maintenance organization. Am J Epidemiol 1999;150:617–21.
  14. Coughlin SS, Pickle LW, Goodman MT, et al. The logistic modeling of interobserver agreement. J Clin Epidemiol 1992;45:1227–41.
  15. Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas 1960;20:37–46.
  16. US Preventive Services Task Force. Guide to clinical preventive services. 2nd ed. Baltimore, MD: Williams and Wilkins, 1996:73–87.