Accuracy of Self-Reports of Acquired Immunodeficiency Syndrome and Acquired Immunodeficiency Syndrome-related Conditions in Women

Nancy A. Hessol1, Sandra Schwarcz2, Niloufar Ameli1, Gary Oliver3 and Ruth M. Greenblatt1

1 Department of Medicine, University of California, San Francisco, CA.
2 AIDS Office, Department of Public Health, San Francisco, CA.
3 HIV/STD Epidemiology Surveillance and Field Service Section, Alameda County Public Health Department, San Francisco, CA.


    ABSTRACT
 
To investigate the validity of self-reported acquired immunodeficiency syndrome (AIDS) among women enrolled in a prospective study of human immunodeficiency virus (HIV) infection, the authors compared the self-reported occurrence of AIDS-specific diagnoses with AIDS diagnoses documented by county AIDS surveillance registries. Also examined was the association between participant characteristics and the validity of self-reports. Among the 339 HIV-infected participants in the Northern California Women's Interagency HIV Study between October 1994 and September 1998, 217 reported having been given a diagnosis of AIDS. Of these 217 women, 157 (72%) were listed in the registry as having AIDS. Among the specific AIDS-related conditions reported by three or more women, the sensitivity was highest for tuberculosis (100%), CD4 cell count less than 200 (84%), Mycobacterium avium complex (73%), and Pneumocystis carinii pneumonia (69%), and the positive predictive value was highest for CD4 cell count less than 200 (75%). Among all reported AIDS diagnoses, the kappa statistic was highest for cryptococcosis (0.67) and CD4 cell count less than 200 (0.57). The only statistically significant participant characteristic associated with inaccurate reporting of an AIDS diagnosis was being a current cigarette smoker (adjusted odds ratio = 2.57, 95% confidence interval: 1.17, 5.64). Overall, self-reporting of any AIDS-related condition is fairly accurate, but there is great variability in the accuracy of specific conditions.

acquired immunodeficiency syndrome; HIV; registries; self disclosure; sensitivity and specificity; women

Abbreviations: AIDS, acquired immunodeficiency syndrome; HIV, human immunodeficiency virus; WIHS, Women's Interagency HIV Study


    INTRODUCTION
 
Self-reported health events are common features of many epidemiologic studies and are often used to define outcome measures. However, self-reporting may be inaccurate, depending on the characteristics of the study participant, such as age (1, 2), level of education (2, 3), and substance use (4), or may vary by the type of disease or health event being reported (5). Additionally, self-reports may be influenced by the quality of information given to patients by medical providers (6). A few studies have used supplemental sources for verifying self-reported health outcomes. The two most common have been medical record review (7) and comparisons with disease registry data (matching studies) (1–3).

In natural history studies of human immunodeficiency virus (HIV) infection, ascertainment of key outcomes, such as acquired immunodeficiency syndrome (AIDS)-related conditions and dates of diagnoses, is essential to determine progression time to AIDS. Although study participants may be queried about past AIDS-related conditions, the completeness and accuracy of this information are uncertain and are likely to be variable. By linking participant data with local AIDS registries, we were able to verify self-reported information and to collect key information about women who were lost to follow-up (including those who had died).

To investigate the validity of self-reported AIDS and the completeness of AIDS case ascertainment among women enrolled in a prospective study of HIV infection and AIDS, we compared the self-reported occurrence of AIDS-related conditions with AIDS diagnoses documented by local county AIDS surveillance registries. We also examined the association of respondent characteristics, such as age, race, education, and substance use, with the validity of self-reports.


    MATERIALS AND METHODS
 
Study population
Between October 1994 and November 1995, 339 HIV-infected women were enrolled in the Northern California site of the Women's Interagency HIV Study (WIHS), a multisite natural history study of HIV infection. A detailed description of this cohort has been reported previously (8). Recruitment was performed at a variety of venues. These included HIV primary care clinics, hospital-based programs, research programs, community outreach sites, women's support groups, drug rehabilitation programs, and HIV testing sites. Recruitment was also done through referrals from enrolled participants.

The demographic characteristics of HIV-infected women enrolled at the Northern California site were compared with local and national AIDS registries to determine how representative study participants were of women with AIDS in the Bay Area. Baseline characteristics of participants were compared with 1994–1996 AIDS surveillance registry data from San Francisco and Alameda counties. Overall, local WIHS participants and locally reported cases are very well matched in terms of age, race, and exposure mode for HIV infection (table 1).


TABLE 1. Characteristics of the Northern California HIV*-infected WIHS* women in 1994–1995, Bay Area AIDS* cases in women in 1994–1996, and US AIDS cases in women through December 1995

 
In brief, study visits included collection of demographic information, medical and behavioral histories, physical examination data, and laboratory specimens. Prospective follow-up of the women in the cohort occurred every 6 months. At baseline, participants were asked whether a health care provider had ever told them that they had AIDS or any of the AIDS-related conditions. At each follow-up visit, participants were asked whether a health care provider had told them that they had AIDS or any of the AIDS-related conditions since the last study visit. The majority of the HIV-infected women (85 percent) resided in either San Francisco County or Alameda County.

AIDS registries
Two local AIDS registries, representing San Francisco and Alameda counties, performed computer matching with the WIHS HIV-infected study participants by name, gender, race/ethnicity, and date of birth. Additional information, such as Social Security number, address, and date of last WIHS follow-up, was used to verify a possible match. All possible matches were reviewed manually to confirm the match. To maintain confidentiality, the AIDS registries and the WIHS database were mutually encrypted. For each possible match, the registries provided information on all AIDS diagnoses, with corresponding dates, as well as the date of death (if applicable). Both of these AIDS registries use a combination of passive reporting by health care providers to the health department and active reporting, in which health department staff conduct periodic visits (ranging from more than once a week to once or twice a year) to inpatient and outpatient facilities to ascertain persons diagnosed with AIDS. In addition, both registries collect information on the type and date of the initial AIDS diagnosis as well as subsequent diagnoses and their respective dates. The estimate of the completeness of the AIDS case reporting in San Francisco County was 97 percent when last evaluated in 1995–1996 (9). In Alameda County, a validation study of AIDS cases found 93–98 percent complete reporting when evaluated in 1997 and 1999 (Linda Frank, Alameda County Public Health Department, personal communication, 2000). The median time from diagnosis to AIDS registry report is 1 month for San Francisco County (9) and 2 months for Alameda County (Linda Frank, Alameda County Public Health Department, personal communication, 2000).
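
As an illustration only, the computer-matching step can be sketched in Python. The registries' actual matching software and encryption scheme are not described in this article; the field names, the normalization rules, and the use of a keyed SHA-256 hash (HMAC) as a stand-in for the mutual encryption are all assumptions made for this sketch.

```python
import hashlib
import hmac


def match_key(record: dict, secret: bytes) -> str:
    """Build a keyed hash from the four matching fields.

    Field names and normalization are hypothetical; the keyed hash stands in
    for the (unspecified) encryption applied to both data sets.
    """
    fields = (
        record["name"].strip().lower(),
        record["gender"].strip().lower(),
        record["race_ethnicity"].strip().lower(),
        record["date_of_birth"].strip(),  # assumed ISO format, e.g. 1960-01-31
    )
    message = "|".join(fields).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def possible_matches(cohort: list, registry: list, secret: bytes) -> list:
    """Return (cohort record, registry record) pairs whose keys agree.

    These are only candidate matches; each would still need manual review
    against additional identifiers before being accepted.
    """
    registry_by_key = {match_key(r, secret): r for r in registry}
    pairs = []
    for person in cohort:
        hit = registry_by_key.get(match_key(person, secret))
        if hit is not None:
            pairs.append((person, hit))
    return pairs
```

Exact matching on hashed keys of this kind misses spelling variants and data-entry errors, which is one reason a manual review of possible matches with additional identifiers, as described above, remains necessary.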

Comparison of self-reported AIDS with registry data
Self-reported AIDS information, collected from October 1994 through September 1998 for the 339 WIHS participants, was submitted to both the San Francisco County and Alameda County surveillance units, and match results were received in early 1999. This allowed for at least a 4-month lag period for a self-reported AIDS diagnosis to be reported to AIDS surveillance.

AIDS cases are reported according to their county of residence. However, people with AIDS who move between counties or who receive medical care in a county different from their county of residence may be reported outside of their county of residence or in more than one county. Reassignment of the AIDS case to the county of residence and the removal of duplicate case reports are done on an ongoing basis through the state AIDS registry. Eighty-five percent of our study participants reported living in either Alameda or San Francisco counties. Due to the mobility and lack of permanent residence of the study population, calculation of person-years of residence was extremely difficult on an individual level. Instead, we used a weighted denominator that assumed that at any given time 85 percent of the participants would be residing in San Francisco County or Alameda County.

The registries provided linked information on all AIDS-related conditions and dates of diagnoses for women who reported that they had AIDS. Both sensitivity and positive predictive value for self-reported AIDS were calculated by using the registry reports as the standard. Sensitivity was computed as the number of women in the AIDS registry who also self-reported AIDS divided by the total number of women reported in the AIDS registry. The positive predictive value was calculated as the number of women in the AIDS registry who also self-reported AIDS divided by the total number of women who reported that they had AIDS. For AIDS-specific diagnoses, we defined a true-positive match between a self-report and a registry match as agreement on the type of AIDS-related diagnosis. To measure agreement between the two sources of information, the kappa statistic and the 95 percent confidence intervals were also calculated (10).
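
The three agreement measures defined above can be written out as a short Python sketch. The class and function names are hypothetical, the counts in the usage note are invented for illustration and are not study data, and the confidence interval for kappa is omitted because its formula is not given in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cross-classification of self-report against the registry (the standard)."""
    tp: int  # self-reported and listed in the registry
    fp: int  # self-reported but not listed in the registry
    fn: int  # listed in the registry but not self-reported
    tn: int  # neither self-reported nor listed


def sensitivity(t: TwoByTwo) -> float:
    # Registry cases that were also self-reported / all registry cases.
    return t.tp / (t.tp + t.fn)


def positive_predictive_value(t: TwoByTwo) -> float:
    # Self-reports confirmed by the registry / all self-reports.
    return t.tp / (t.tp + t.fp)


def kappa(t: TwoByTwo) -> float:
    # Cohen's kappa: observed agreement corrected for chance agreement.
    n = t.tp + t.fp + t.fn + t.tn
    observed = (t.tp + t.tn) / n
    expected = (
        (t.tp + t.fp) * (t.tp + t.fn) + (t.fn + t.tn) * (t.fp + t.tn)
    ) / (n * n)
    return (observed - expected) / (1 - expected)
```

For example, with the invented counts `TwoByTwo(tp=40, fp=10, fn=10, tn=40)`, sensitivity and positive predictive value are both 0.8 and kappa is about 0.6. Because the two proportions share the same numerator, a table with more false positives than false negatives always has a sensitivity higher than its positive predictive value.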

To determine whether respondents' attributes affected the validity of self-reports, we performed multivariate logistic regression analysis of both false-positive and false-negative reports for any AIDS diagnosis. Age at baseline was categorized in 10-year age groups; cigarette smoking at baseline was categorized as never (reference), former, and current; education at baseline was categorized as less than 12 years of school, high school graduate, and more than 12 years of school (reference); household income at baseline was categorized as $6,000 or less, $6,001–12,000, and more than $12,000 (reference); and duration of HIV infection was categorized as first tested positive in 1980–1989, 1990–1992, and 1993–1995 (reference). Other study variables that were dichotomized (yes or no) were African-American race, history of injection drug use, alcohol use 7 days a week, and having a primary care provider. Additional regression analyses were performed by examining false reports of disease-specific AIDS outcomes. Adjusted logistic regression odds ratios and 95 percent confidence intervals were calculated by using SAS software logistic procedure with logit link function (11).
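
For readers less familiar with how a logistic regression coefficient becomes an odds ratio, the conversion can be sketched in Python. This is not the SAS procedure used in the analysis; it assumes a Wald-type interval (the article does not state which interval type was used), and the coefficient and standard error in the example are back-calculated from the adjusted odds ratio reported in the abstract (2.57; 95 percent confidence interval: 1.17, 5.64) rather than taken from model output.

```python
from math import exp, log


def odds_ratio_with_ci(beta: float, se: float, z: float = 1.96):
    """Convert a logit coefficient and its standard error to (OR, lower, upper).

    Assumes a Wald interval: exp(beta +/- z * se).
    """
    return exp(beta), exp(beta - z * se), exp(beta + z * se)


# Back-calculated, illustrative inputs (an assumption, not reported values):
beta_current_smoker = log(2.57)
se_current_smoker = (log(5.64) - log(1.17)) / (2 * 1.96)

odds_ratio, lower, upper = odds_ratio_with_ci(beta_current_smoker, se_current_smoker)
print(f"OR = {odds_ratio:.2f}, 95% CI: {lower:.2f}, {upper:.2f}")
```

An interval whose lower bound lies above 1.0, as here, corresponds to an association that is statistically significant at the 5 percent level.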


    RESULTS
 
A total of 339 HIV-infected WIHS women were computer-linked with the San Francisco County and Alameda County AIDS Surveillance registries. These women were 56 percent African American, 27 percent White, 12 percent Latina, and 5 percent other races/ethnicities. The mean and median ages were 38 years, with a range of 19–61 years. Fifty-six percent reported a history of injection drug use, and 70 percent reported completion of high school.

As of September 30, 1998, 217 HIV-infected WIHS participants reported having been given a diagnosis of AIDS. Of these 217 women, 89 were listed in the San Francisco AIDS registry and 70 were listed in the Alameda AIDS registry; two of these women were listed in both registries, for a total of 157 unique matches (72 percent).
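
The match total can be checked with a few lines of Python. The sketch assumes that the two women listed in both registries are included in each county's count, which is the reading under which the counts add up to the reported total; the function name is hypothetical.

```python
def unique_matches(in_registry_a: int, in_registry_b: int, in_both: int) -> int:
    """Count distinct women when those listed in both registries appear in each count."""
    return in_registry_a + in_registry_b - in_both


total_matched = unique_matches(89, 70, 2)      # San Francisco, Alameda, both
percent_matched = round(100 * total_matched / 217)
print(total_matched, percent_matched)
```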

Supplemental information from medical record abstraction and tuberculosis and cancer registry matches found that seven additional self-reported AIDS diagnoses could be verified that were not verified by local matches with the AIDS registries. These cases all occurred in either San Francisco County or Alameda County and included one case of esophageal candidiasis, two cases of bacterial pneumonia, one case of Pneumocystis carinii pneumonia, two cases of tuberculosis, and one case of Kaposi's sarcoma.

There were no self- or surveillance reports for coccidioidomycosis, histoplasmosis, isosporiasis, primary lymphoma of the brain, or progressive multifocal leukoencephalopathy (table 2). The sensitivity and the positive predictive value varied by disease-specific AIDS diagnoses. AIDS diagnoses with the highest sensitivity were any AIDS diagnosis (91 percent including low CD4 count and 87 percent excluding low CD4 count), low CD4 count/percent (84 percent), HIV encephalopathy (70 percent), Mycobacterium avium complex disease (73 percent), P. carinii pneumonia (69 percent), toxoplasmosis (100 percent), and tuberculosis (100 percent). AIDS conditions with the highest positive predictive value were any AIDS diagnosis (including low CD4 cell count; 65 percent), low CD4 count/percent (75 percent), and cryptococcosis (100 percent). Only low CD4 count/percent (kappa = 0.57), cryptococcosis (kappa = 0.67), and M. avium complex (kappa = 0.54) had a kappa statistic over 0.50.


TABLE 2. Sensitivity and positive predictive value of self-reported AIDS* by specific AIDS diagnosis, Northern California WIHS,* 1994–1998

 
A total of 306 participants had complete data on all study variables and were included in the multivariate analyses. In the multivariate logistic regression analysis, the only statistically significant participant characteristic associated with inaccurate reporting of any clinical AIDS diagnosis was being a current cigarette smoker at baseline (table 3). In multivariate regression models of specific AIDS diagnoses, the association with cigarette smoking was strongest for inaccurate reporting of AIDS diagnoses that affect the lungs (P. carinii pneumonia, bacterial pneumonia, and tuberculosis). In a multivariate regression model that combined these three AIDS diagnoses, both being a current (odds ratio = 5.67, 95 percent confidence interval: 1.57, 20.57) and a former (odds ratio = 4.22, 95 percent confidence interval: 1.10, 16.25) cigarette smoker were significantly associated with inaccurate reporting.


TABLE 3. Multivariate logistic regression analysis of 306 HIV*-infected study participants, Northern California WIHS,* 1994–1998

 

    DISCUSSION
 
To our knowledge, this is the first study to use AIDS registry data to validate self-reported AIDS diagnoses. Because of the extent of the data collected at the county AIDS surveillance registries, we were able to investigate the accuracy of reports of both initial and subsequent AIDS-related diagnoses. Unlike some studies that were able to validate only positive reports of disease, we were able to examine both positive and negative reports of AIDS. However, the AIDS registry data are not the true "gold standard," and it is possible that some of the self-reported AIDS diagnoses were correct but were not verified by the two local AIDS surveillance registries due to a lag in reporting, to underreporting, or to a diagnosis outside of these jurisdictions. In fact, we found that seven additional self-reported AIDS diagnoses could be verified that were not validated by local matches with the AIDS registries. Nevertheless, studies that have evaluated the completeness of the San Francisco County AIDS registry data by using medical records as well as billing and laboratory diagnostic information found them to be 93–97 percent complete (9). This highlights the fact that no single source of information (self-report, medical records, or disease registries) is 100 percent accurate.

It is also possible that the matching process was imprecise and resulted in some women being incorrectly classified as false-positive or false-negative reports. By matching on several criteria (name, gender, race/ethnicity, and date of birth) and manually reviewing possible matches (using Social Security number, address, and date of last study interview), we tried to minimize this kind of error. There are also some preliminary data from medical record abstraction to indicate that false-positive reports may have occurred because some participants mistook a diagnosis of cervical dysplasia for cervical cancer, oral candidiasis for esophageal candidiasis, and acute outbreaks of herpes simplex for chronic herpes simplex virus (12). This may explain why certain AIDS diagnoses are more likely to be inaccurately reported and why there are more false-positive than false-negative reports. The higher number of false-positive compared with false-negative reports is also why the sensitivity is generally higher than the positive predictive value.

The demographic and risk group characteristics of our study population are also uncommon in prospective studies of disease. Most cohort studies of AIDS (13, 14), cancer (2), and heart disease (15) are mainly composed of White men who are relatively well educated and are not substance users. Our diverse study group shows that women, many of whom have challenging backgrounds, can reliably report certain AIDS-related health events.

Our finding that women who smoke cigarettes are less accurate reporters of disease is interesting and confirms what some (2), but not all (3, 4), studies have found. In general, current smokers are more likely to experience health problems than nonsmokers are. It is therefore possible that smoking-related illnesses may be confused with AIDS-related conditions. Additional analysis of specific AIDS-related conditions found that current smokers were statistically more likely to have inaccurately reported bacterial pneumonia, P. carinii pneumonia, and tuberculosis. Since cigarette smoking has been found to be associated with higher rates of pneumonia (16–18) and tuberculosis (19, 20), it is not surprising that these conditions were more likely to be reported among current smokers.

Our study is not the first to examine self-reported HIV conditions in HIV-infected individuals, but it does make a unique contribution by evaluating women with HIV infection for the report of all AIDS-defining conditions and comparing this report with AIDS surveillance data. Published articles have examined self-reports of bacterial infections (21), CD4 counts (22), and HIV-related symptoms (23) in HIV-infected individuals. The first study (21) found that women participants who reported smoking cigarettes in the previous 6 months were significantly more likely to self-report pneumonia than were women who did not report a recent history of smoking. This result is consistent with our finding that women who were current cigarette smokers were more likely to report bacterial pneumonia; however, this article did not verify the self-reports. The second study (22) found good agreement between self-reported CD4 counts and documentation of CD4 counts among hospitalized patients with HIV-related illnesses, similar to our finding comparing self-reports with AIDS surveillance data. However, this study population was quite different from ours and was 95 percent male, 69 percent White, and only 5 percent injection drug users. The last study (23) found provider reports to be less sensitive, less reproducible, and less clinically valid than self-reports of symptoms, which highlights some of the weaknesses of using only medical records for verification of HIV-related symptoms.

Our study was not able to assess whether there are gender differences in the accuracy of self-reported AIDS diagnoses because all of our participants were women. Since this study is the first that we know of to compare self-reports with AIDS registry data, there are no previously published reports that have comparable data on men. However, in the United States, men and women with AIDS have different demographic characteristics, different exposure risks for acquisition of HIV infection, and differing frequencies of AIDS-defining illnesses (24). These differences may contribute to the accuracy of self-reports among men and women with HIV infection.

Overall, self-reporting of any AIDS-related condition is fairly accurate, but there is great variability in the accuracy of specific conditions. These findings indicate that self-reported AIDS can be a reliable source of information for research studies of HIV-infected women. However, these HIV-infected women may have had such a broad complex of health problems that distinguishing the various sources of comorbidity was difficult. A person's confusion about a specific illness may also impact initiation of and adherence to treatment and therapy. We recommend that clinicians take the time to explain their diagnoses adequately to people with HIV and make it clear which illnesses are AIDS related and which are not. Additional study is needed to understand why the accuracy of the reporting of AIDS conditions varies by the type of condition.


    ACKNOWLEDGMENTS
 
Supported by National Institutes of Health grants AI034989, M01-RR00071, and M01-RR00083.

The authors gratefully acknowledge the important contribution of Ling Chin Hsu at the San Francisco Health Department and Linda Frank from the Alameda Health Department.


    NOTES
 
Correspondence to Nancy A. Hessol, University of California San Francisco, Department of Medicine, 405 Irving Street, 2nd Floor, San Francisco, CA 94122 (e-mail: nancyh@itsa.ucsf.edu).


    REFERENCES
 

  1. Kerber RA, Slattery ML. Comparison of self-reported and database-linked family history of cancer data in a case-control study. Am J Epidemiol 1997;146:244–8.
  2. Bergmann MM, Calle EE, Mervis CA, et al. Validity of self-reported cancers in a prospective cohort study in comparison with data from state cancer registries. Am J Epidemiol 1998;147:556–62.
  3. Bergmann MM, Byers T, Freedman DS, et al. Validity of self-reported diagnoses leading to hospitalization: a comparison of self-reports with hospital records in a prospective study of American adults. Am J Epidemiol 1998;147:969–77.
  4. West SL, Savitz DA, Koch G, et al. Demographics, health behaviors, and past drug use as predictors of recall accuracy for previous prescription medication use. J Clin Epidemiol 1997;50:975–80.
  5. Colditz GA, Martin P, Stampfer M, et al. Validation of questionnaire information on risk factors and disease outcomes in a prospective cohort study of women. Am J Epidemiol 1986;123:894–900.
  6. Rimer BK, Glassman B. Tailoring communications for primary care settings. Methods Inf Med 1998;37:171–7.
  7. Harlow SD, Linet MS. Agreement between questionnaire data and medical records: the evidence for accuracy of recall. Am J Epidemiol 1989;129:233–48.
  8. Barkan S, Melnick S, Preston-Martin S, et al. The Women's Interagency HIV Study (WIHS): design, methods, sample, cohort characteristics and comparison with reported AIDS cases in U.S. women. Epidemiology 1998;9:117–25.
  9. Schwarcz SK, Hsu LC, Parisi MK, et al. The impact of the 1993 AIDS case definition on the completeness and timeliness of AIDS surveillance. AIDS 1999;13:1109–14.
  10. Fleiss JL. Statistical methods for rates and proportions. New York, NY: John Wiley & Sons, 1973.
  11. SAS Institute, Inc. SAS/STAT user's guide. Cary, NC: SAS Institute, Inc, 1989.
  12. Hessol NA, Anastos K, Levine AM, et al. Factors associated with incident self-reported AIDS among women enrolled in the Women's Interagency HIV Study (WIHS). AIDS Res Hum Retroviruses 2000;16:1105–12.
  13. Muñoz A, Schrager LK, Bacellar H, et al. Trends in the incidence of outcomes defining acquired immunodeficiency syndrome (AIDS) in the Multicenter AIDS Cohort Study: 1985–1991. Am J Epidemiol 1993;137:423–38.
  14. Hessol NA, Koblin BA, van Griensven GJ, et al. Progression of human immunodeficiency virus type 1 (HIV-1) infection among homosexual men in hepatitis B vaccine trial cohorts in Amsterdam, New York City, and San Francisco, 1978–1991. Am J Epidemiol 1994;139:1077–87.
  15. Kannel WB, Feinleib M, McNamara PM, et al. An investigation of coronary heart disease in families: The Framingham Offspring Study. Am J Epidemiol 1979;110:281–90.
  16. Conley LJ, Bush TJ, Buchbinder SP, et al. The association between cigarette smoking and selected HIV-related medical conditions. AIDS 1996;10:1121–6.
  17. Galai N, Park LP, Wesch J, et al. Effect of smoking on the clinical progression of HIV-1 infection. J Acquir Immune Defic Syndr Hum Retrovirol 1997;14:451–8.
  18. Marrie TJ. Pneumococcal pneumonia: epidemiology and clinical features. Semin Respir Infect 1999;14:227–36.
  19. Yu GP, Hsieh CC, Peng J. Risk factors associated with the prevalence of pulmonary tuberculosis among sanitary workers in Shanghai. Tubercle 1988;69:105–12.
  20. Anderson RH, Sy FS, Thompson S, et al. Cigarette smoking and tuberculin skin test conversion among incarcerated adults. Am J Prev Med 1997;13:175–81.
  21. Flanigan TP, Hogan JW, Smith D, et al. Self-reported bacterial infections among women with or at risk for human immunodeficiency virus infection. Clin Infect Dis 1999;29:608–12.
  22. Cunningham WE, Rana HM, Shapiro MF, et al. Reliability and validity of self-report CD4 counts in persons hospitalized with HIV disease. J Clin Epidemiol 1997;50:829–35.
  23. Justice AC, Rabeneck L, Hays RD, et al. Sensitivity, specificity, reliability, and clinical validity of provider-reported symptoms: a comparison with self-reported symptoms. Outcomes Committee of the AIDS Clinical Trials Group. J Acquir Immune Defic Syndr 1999;21:126–33.
  24. Centers for Disease Control and Prevention. HIV/AIDS surveillance report. Atlanta, GA: Centers for Disease Control and Prevention, Department of Health and Human Services, 1999.
Received for publication February 23, 2000. Accepted for publication October 20, 2000.