1 School of Medicine, University of Alabama at Birmingham, Birmingham, AL
2 School of Public Health, University of Alabama at Birmingham, Birmingham, AL
Correspondence to Dr. Jeffrey Roseman, Department of Epidemiology, School of Public Health, University of Alabama at Birmingham, 220 M Ryals Building, 1665 University Boulevard, Birmingham, AL 35294-0008 (e-mail: JRoseman{at}ms.soph.uab.edu).
Received for publication March 28, 2003. Accepted for publication April 7, 2005.
ABSTRACT
cohort studies; data collection; hospitalization; longitudinal studies; medical records; reproducibility of results
INTRODUCTION
This study was intended to narrow this gap in our knowledge. Using data from the Coronary Artery Risk Development in Young Adults (CARDIA) Study, an ongoing population-based longitudinal study designed to describe and identify factors associated with the development of cardiovascular disease in a cohort of more than 5,000 Black and White men and women in the United States, we 1) evaluated the validity of self-reports of reasons for hospitalization by young White and Black adults in different geographic regions of the United States and 2) identified specific reasons for hospitalization and population subgroups or personal characteristics associated with discordant self-reports of reasons for hospitalization.
MATERIALS AND METHODS
For this study, subjects who self-reported being hospitalized overnight were included. A total of 399 subjects who had had 516 self-reported hospitalizations during follow-up years 7–12 were eligible. Among these persons, medical records for 195 hospitalizations of 134 subjects were not available, because of either the patient's refusal to give permission for review of the records (99 hospitalizations) or unavailability of the records in the hospital (96 hospitalizations) even though permission had been obtained from the subject. Those 195 hospitalizations were excluded. Thus, 265 subjects who had had 321 hospitalizations over a period of 5 years (follow-up years 7–12) were included in this study (figure 1).
A questionnaire was used to collect data on self-reported reasons for hospitalization by face-to-face interview during CARDIA examinations and by telephone and mail during interim follow-up interviews. The questionnaire posed the questions: "Since your last CARDIA examination or follow-up interview on (date), have you been hospitalized overnight? If so, what was (were) the reason(s) for hospitalization?" The next two questions asked about the name of the hospital(s) and the date(s) of admission. Permission was obtained from the subject to review medical records from the hospital(s) if he or she had been hospitalized. The hospital was then contacted, and the participant's medical records were obtained and reviewed for collection of data on reasons for hospitalization.
For the present study, we included data from questionnaires and medical records for 5 of the follow-up years (years 7–12) during 1992–1998. Data for all hospitalizations were used, with the exception of pregnancies and deliveries.
The International Classification of Diseases, Ninth Revision (ICD-9), was used to code the reasons for hospitalization. We had two sets of ICD-9 codes. The first set consisted of codes already in the medical records (medical record's code). Because these codes were not always available, we had a second set, assigned by our medical record technologists after a thorough review of the subject's medical history, investigations, diagnoses, and treatments provided in the medical records (coder's code). These two sets of codes were compared for similarity. Dissimilarity in coding was investigated by two of the authors, and adjudicated codes were obtained.
Data obtained from the questionnaires administered during CARDIA examinations and at interim follow-up contacts included data on demographic characteristics such as age, sex, race, and educational level and other factors examined for associations with discordant self-reports. Those other factors were hostility, depression, ever taking drugs by injection (intravenous drug use), alcohol intake in the past week, time between self-report and hospitalization, and length of stay in the hospital. Age was computed by subtracting the date of birth from the date of the follow-up interview. Data on the subject's date of birth, sex, and race were collected at the baseline examination and were verified at the year 2 examination. Data on educational level and depression were collected at the year 10 examination. Data on intravenous drug use and alcohol use were collected at the year 7 examination. Data on hostility score were collected at the year 5 examination. "Days at the hospital" and "months since hospitalization" were calculated by subtracting the date of admission from the date of discharge and subtracting the date of discharge from the date of the follow-up visit, respectively.
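The date arithmetic described above can be sketched as follows. This is a minimal illustration, not the study's actual code: the function names and example dates are hypothetical, and the conversion factors (365.25 days per year, 365.25/12 days per month) are assumptions, since the text does not say how days were converted to years or months.

```python
from datetime import date

DAYS_PER_YEAR = 365.25               # assumption: average year length
DAYS_PER_MONTH = DAYS_PER_YEAR / 12  # assumption: average month length

def age_in_years(birth: date, follow_up: date) -> float:
    """Age at follow-up: follow-up interview date minus date of birth."""
    return (follow_up - birth).days / DAYS_PER_YEAR

def days_at_hospital(admission: date, discharge: date) -> int:
    """Length of stay: discharge date minus admission date."""
    return (discharge - admission).days

def months_since_hospitalization(discharge: date, follow_up: date) -> float:
    """Elapsed time from discharge to the follow-up visit, in months."""
    return (follow_up - discharge).days / DAYS_PER_MONTH

# Hypothetical dates for one hospitalization (not study data).
birth = date(1965, 9, 5)
admission = date(1995, 3, 1)
discharge = date(1995, 3, 5)
follow_up = date(1995, 9, 5)

print(days_at_hospital(admission, discharge))                         # 4
print(round(months_since_hospitalization(discharge, follow_up), 1))   # 6.0
print(round(age_in_years(birth, follow_up), 1))                       # 30.0
```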
The instrument used to measure hostility was the 50-item Cook-Medley Hostility Scale, derived from the Minnesota Multiphasic Personality Inventory (13). On this scale, the individual hostility score ranges from 0 to 50, with a higher score indicating greater hostility. We used the median score (>17) to identify a higher level of hostility, since there is no standardized cutoff point otherwise available. The presence of depression was assessed using the Center for Epidemiologic Studies Depression Scale. Scores on this scale range from 0 to 60. In this report, the generally accepted cutoff point of 16 (6) was used for defining depression, with higher scores indicating more depressive symptoms.
Assessing concordance of self-reports with medical records: validation process and problems
Assessment of the validity of self-reports was conducted using medical records as the gold standard. For a particular hospitalization, a self-report of a reason for hospitalization was validated by confirming it with the reason (disease, injury or fracture, procedure) for hospitalization recorded in the medical records as a primary diagnosis, a secondary diagnosis, or a surgical or medical procedure. Thus, concordance for a particular reason for hospitalization was determined by one-to-one matching between self-reports and medical records.
There were several difficulties in conducting the validation process. In some cases, there were difficulties in translating self-reported lay terms (n = 6) for reasons for hospitalization into medical terms and comparing them with medical records. One example is "blood poisoning - right thumb" to indicate bacterial infection (staphylococcus) of a finger wound. Occasionally, self-reports of reasons for hospitalization appeared to be too vague, broad in meaning, or incomplete (n = 17) for comparison with the medical record diagnoses (e.g., use of the term "bleeding disorder" to mean hemorrhage in acute gastritis or "ulcer" to mean gastric ulcer). In several cases (n = 7), self-reports provided a specific symptom that reflected a specific disease or disorder (e.g., chest pain to indicate acute pericarditis, pancreas pain to indicate acute pancreatitis, and suicidal ideation to indicate depression). In all of the above cases, the self-reports were considered concordant.
Several other self-reported reasons for hospitalization (n = 5) were considered discordant even though they accurately reflected the secondary diagnoses, because those secondary diagnoses were not the reasons for hospitalization. As an example, the self-report "appendix removed" was concordant with "incidental appendectomy," one of the secondary diagnoses in medical records. However, that was not the reason for hospitalization, because the patient was hospitalized with another diagnosis for which he was undergoing a surgical procedure when he had an incidental appendectomy. In one case, the self-report appeared to be concordant in terms of cause and discordant in terms of body parts (e.g., "infection, esophagus" for acute laryngitis and epiglottitis).
Data analysis
The percentage of self-reports that were concordant with medical records was computed by dividing the number of accurate self-reports of reasons for hospitalization by the total number of hospitalizations. SAS was used in the data analysis (14). The chi-squared statistic was used to examine the difference in concordance by the mode of data collection (mailed questionnaire, telephone, and face-to-face interviews) and by center.
Generalized estimating equations (GEE) analyses were performed to determine whether subjects' background characteristics and other factors influenced discordance between self-reports and medical records. GEE is a type of regression analysis which takes into account the correlation of the repeated measures within a person, and it includes subjects regardless of missing values (15, 16). GEE analysis is suitable for both continuous and dichotomous outcome variables.
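For readers unfamiliar with the method, the standard form of the GEE estimating equations for a dichotomous outcome is sketched below. The notation is an assumption introduced here for illustration: y_i is the vector of discordance indicators for the hospitalizations of subject i, x_ij the covariates for hospitalization j, and R(α) a working correlation matrix. The text does not state which working correlation structure was used.

```latex
\sum_{i=1}^{N} D_i^{\top} V_i^{-1} \bigl( y_i - \mu_i(\beta) \bigr) = 0,
\qquad
\operatorname{logit}(\mu_{ij}) = x_{ij}^{\top} \beta,
\qquad
V_i = \phi \, A_i^{1/2} \, R(\alpha) \, A_i^{1/2},
```

where D_i = ∂μ_i/∂β and A_i is the diagonal matrix with entries μ_ij(1 − μ_ij). Under this logit link, exp(β_k) is interpreted as the odds ratio for the k-th covariate.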
Along with sociodemographic characteristics such as age, sex, race, and educational level, all other variables were entered into the GEE model. Models with interaction terms were also created for examination of possible interactions between variables. Overall model fit was assessed by the deviance statistic. For specific categories of reasons for hospitalization (infections and chronic disorders), only simple logistic regression was used, since the number of hospitalizations was too small.
Units of analysis: rationale for use
In other studies of self-reports, either subject (a subject himself/herself) or hospitalization (each episode of hospitalization) was used as the unit of analysis (1). There are rationales for the use of either unit. Use of "hospitalization" rather than "subject" as the unit of analysis increases sample size (a single subject with several instances of hospitalization can be analyzed several times for each of these hospitalizations) and provides an opportunity to examine each subject in the context of a variety of diseases or reasons for hospitalization. The concordance of reasons for hospitalization overall was 93.2 percent when subject was the unit of analysis and 92.5 percent when hospitalization was the unit of analysis. Hence, because it did not influence the results and provided us with a greater sample size, hospitalization was used as the unit of analysis in the validation of self-reports, as well as in examining associations between risk factors and discordance of self-reports with medical records.
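The difference between the two units of analysis can be illustrated with a short sketch. The records below are hypothetical, and the rule for subject-level concordance (a subject counts as concordant only if every one of his or her hospitalizations was reported concordantly) is an assumption made for illustration; the text does not define how concordance was scored per subject.

```python
from collections import defaultdict

# Hypothetical records: (subject_id, self_report_was_concordant).
records = [
    ("A", True), ("A", True), ("A", False),
    ("B", True),
    ("C", True), ("C", True),
    ("D", False),
]

def concordance_by_hospitalization(records):
    """Percent concordant when each hospitalization is one observation."""
    return 100 * sum(ok for _, ok in records) / len(records)

def concordance_by_subject(records):
    """Percent concordant when each subject is one observation.

    Assumption: a subject is concordant only if all of that subject's
    hospitalizations were concordant.
    """
    per_subject = defaultdict(list)
    for subject_id, ok in records:
        per_subject[subject_id].append(ok)
    concordant = sum(all(flags) for flags in per_subject.values())
    return 100 * concordant / len(per_subject)

print(round(concordance_by_hospitalization(records), 1))  # 71.4 (5 of 7)
print(round(concordance_by_subject(records), 1))          # 50.0 (2 of 4)
```

In this toy example the two units give different answers because one subject contributes three hospitalizations; in the study data the two percentages were nearly identical.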
Quality control
Quality control procedures were implemented in data collection, entry, editing, and analysis. All interviewers administering questionnaires to subjects were trained, certified, and later recertified in the measurements for which they had responsibility. The medical record technologists who collected data from medical records were trained in medical record and ICD-9 coding.
RESULTS
When diseases of specific body systems were analyzed, renal system diseases had the highest concordance (95.4 percent), followed by diseases of the respiratory system (93.3 percent), cardiovascular system (88.9 percent), digestive system (78.6 percent), and nervous system (62.5 percent).
When the reasons for hospitalization were grouped according to method of data collection, the concordance of self-reports was found to be 96.8 percent (30/31) for mailed questionnaires, 91.8 percent (78/85) for telephone contacts, and 92.2 percent (188/204) for face-to-face interviews. Overall, there was no significant difference in concordance by method of data collection (p = 0.6).
When the reasons for hospitalization were categorized by region or center, concordance was 92.0 percent (104/113) in Birmingham, 91.1 percent (51/56) in Chicago, 89.5 percent (77/86) in Minneapolis, and 98.5 percent (65/66) in Oakland. There was no significant difference in concordance of self-reports across regions or centers (p = 0.2).
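The two comparisons above can be checked with a Pearson chi-squared test on the reported counts. The sketch below is an illustration under stated assumptions rather than the SAS analysis itself: it assumes an uncorrected Pearson test on a groups-by-outcome table, and the helper names are hypothetical. Only the Python standard library is used, with the chi-squared tail probability computed from its closed-form series for integer degrees of freedom.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-squared variable with integer df."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_k (x/2)^(k-1/2) / Gamma(k+1/2)
    total, term = 0.0, math.sqrt(half) / math.gamma(1.5)
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= half / (k - 0.5)
        total += term
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total

def chi2_concordance(groups):
    """groups maps label -> (concordant, total); returns (statistic, df, p)."""
    concordant = sum(c for c, _ in groups.values())
    total = sum(n for _, n in groups.values())
    stat = 0.0
    for c, n in groups.values():
        for observed, column_total in ((c, concordant), (n - c, total - concordant)):
            expected = n * column_total / total
            stat += (observed - expected) ** 2 / expected
    df = len(groups) - 1
    return stat, df, chi2_sf(stat, df)

# Counts as reported in the text (concordant, total).
by_mode = {"mail": (30, 31), "telephone": (78, 85), "face-to-face": (188, 204)}
by_center = {"Birmingham": (104, 113), "Chicago": (51, 56),
             "Minneapolis": (77, 86), "Oakland": (65, 66)}

for label, groups in (("mode", by_mode), ("center", by_center)):
    stat, df, p = chi2_concordance(groups)
    print(f"{label}: chi2 = {stat:.2f}, df = {df}, p = {p:.2f}")
```

Under these assumptions the p values round to the reported 0.6 and 0.2. Some expected counts of discordant reports are below 5, so an exact test might be preferred in practice.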
GEE analyses: associations between risk factors and discordance
For hospitalizations overall, in GEE analyses, race and intravenous drug use were positively associated with discordance between self-reports and medical records (table 3). Self-reports by Blacks were more than four times as likely to be discordant as those by Whites (odds ratio (OR) = 4.23, 95 percent confidence interval (CI): 1.72, 10.40; p = 0.002). Even among Blacks, however, the overall rate of concordance for self-reports was 90.0 percent (162/180). Among Whites, the overall rate of concordance for self-reports was 95.7 percent (135/141). Intravenous drug users were more than six times as likely to give discordant self-reports as nonusers of intravenous drugs (OR = 6.06, 95 percent CI: 1.17, 31.22; p = 0.03). No other factor was significantly associated with discordance of self-reports in the GEE analyses. Possible interactions of race with intravenous drug use and sex were examined. No significant interaction between race and intravenous drug use (p = 0.26) or between race and sex (p = 0.27) was found when interaction terms were included in the model. The deviance statistic did not indicate any problems with model fit.
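The reported odds ratios, confidence intervals, and p values can be checked against one another. The sketch below assumes that the intervals are Wald intervals computed on the log-odds scale and that the p values come from a two-sided normal test; both are assumptions, and the function name is hypothetical. It recovers the standard error of the log odds ratio from the width of the interval and then recomputes the p value.

```python
import math

def wald_from_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Recover SE(log OR), the Wald z statistic, and a two-sided p value
    from a reported odds ratio and its 95 percent confidence interval."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard normal tail
    return se, z, p

# Estimates as reported in the text.
reported = {
    "Black vs. White race": (4.23, 1.72, 10.40),
    "intravenous drug use": (6.06, 1.17, 31.22),
}

for label, (or_, lo, hi) in reported.items():
    se, z, p = wald_from_ci(or_, lo, hi)
    print(f"{label}: SE = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Under these assumptions the recomputed p values round to 0.002 and 0.03, matching the reported values.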
DISCUSSION
Our results are similar to findings of other studies assessing concordance of self-reports with a gold standard of reporting (usually medical records), for most reasons for hospitalization: breast cancer (100 percent in this study vs. >90 percent in two other studies) (8, 17), large bowel cancer (100 percent vs. >90 percent in another study) (17), stroke (66.6 percent vs. 65–66 percent in two other studies) (17, 18), and cardiovascular diseases (88.9 percent vs. 89.9 percent in one other study) (6), as well as injuries or fractures (100 percent vs. 100 percent in another study) (17) and surgical procedures, including hysterectomy (100 percent vs. 100 percent in two other studies) (19, 20).
Our results were dissimilar, however, for diabetes: concordance in our study (85.7 percent) was somewhat lower than in previous studies, where it exceeded 95 percent (20, 21). One reason for the lower concordance in this study may have been that, out of 14 self-reports of diabetes as a reason for hospitalization, two cases of hospitalization were reported by the same person, who was admitted four times in the same year (1992). Although the participant was admitted with diabetes in each case of hospitalization, according to medical records he had several other health problems during that time and reported two of these other problems as the reason for hospitalization. Having multiple hospitalizations in a short period of time with multiple diagnoses might have confused the participant when he gave the reasons for hospitalization. It is also possible that diabetes caused the other health problems that the subject reported as the reasons for hospitalization.
Factors affecting concordance between self-reports and medical records may include complexity of diagnostic criteria and how important the effects of the illness are perceived to be. The relatively low level of concordance for infections and for categories of symptoms (80.0–82.0 percent) may be attributable to their not being perceived as life-threatening. Length of stay in the hospital, a potential indicator of seriousness of illness, was not associated with discordance. Mailed questionnaires appeared to work as well as interviews, even though they did not permit probing.
Concordance of self-reports with medical records may be influenced by certain background and lifestyle characteristics of subjects. Educational level was not significantly associated with discordance, which is similar to the finding in a previous study (6). Blacks were about four times as likely as Whites to give discordant self-reports. This finding was similar to that of a previous study (9), wherein agreement between self-reports and medical records was found to be greater for Whites. For two specific categories of hospitalization (chronic disorders and infections), Blacks were almost twice as likely as Whites to give discordant self-reports; however, the associations were not significant (p = 0.57 for chronic disorders and p = 0.44 for infections). Whites were not more likely than Blacks to overreport cancer, as had been found in another study (5). One possible explanation for the racial difference might be disparities in communication between doctors and patients. The higher level of discordance between self-reports and medical records for Black respondents in our study could be due to miscommunication or distrust between physicians and patients, particularly if they are of different racial backgrounds. Distrust of, or discomfort with, health-care providers of another race might cause inappropriate and careless responses, leading to discordant self-reports. Similarity of racial backgrounds of interviewers, physicians, or nurses with subjects might positively influence concordance of self-reports of reasons for hospitalization. A study examining the effect of race on medical student-patient communication demonstrated that when patients and interviewers were of different racial backgrounds, interviewing performance scores were adversely affected (22). Race-of-interviewer effects on participants' responses were also demonstrated in other studies (23–25).
Except for intravenous drug use, no lifestyle factors were significantly associated with discordant self-reports of reasons for hospitalization overall. Intravenous drug users were over six times as likely to provide discordant self-reports. The fact that intravenous drug users gave concordant reports only 80 percent of the time suggests that in studies involving intravenous drug users, investigators should probably not rely solely on self-reports. With respect to personality and attitudinal factors, neither depression nor hostility was significantly associated with discordant reports of reasons for hospitalization overall. Time since hospitalization (a surrogate measure of memory) and length of stay in the hospital, a potential indicator of seriousness of illness, were also not significantly associated with discordance. However, because the percentages of discordant self-reports were small, detecting factors associated with discordance was difficult.
The generalizability of these study findings is somewhat limited, given that the study was a follow-up study and the data were collected with the same questionnaire in each follow-up interview. This yearly follow-up might have helped participants become more attentive to their hospitalizations, leading them to provide more accurate self-reports. To judge the utility of self-reports for case-control studies, it would be important to determine whether these results generalize to an initial interview.
This study had a methodological limitation. Approximately 38 percent (n = 195; 135 Blacks, 60 Whites) of the self-reported hospitalizations could not be checked for validity because of the unavailability of medical records. The unavailability of medical records (because of missing records or patient refusal) was approximately 1.7 times higher for Blacks (OR = 1.7, 95 percent CI: 1.2, 2.5; p = 0.005) than for Whites and 1.8 times higher for persons with a lower level of education (OR = 1.8, 95 percent CI: 1.3, 2.6; p = 0.008). This might have caused bias in the findings, though the direction of the bias is uncertain.
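As a rough check on the race comparison, a crude (unadjusted) odds ratio can be computed from counts given in the text: 135 and 60 hospitalizations with unavailable records among Blacks and Whites, respectively, against 180 and 141 with available records. The sketch below uses the standard Woolf (log-scale) confidence interval; the function name is hypothetical, and this is not necessarily the model the authors fitted.

```python
import math

def crude_odds_ratio(a, b, c, d, z_crit=1.959964):
    """Odds ratio and Woolf 95 percent CI for a 2x2 table.

    a, b: group 1 with and without the event
    c, d: group 2 with and without the event
    """
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z_crit * se)
    upper = math.exp(math.log(odds_ratio) + z_crit * se)
    return odds_ratio, lower, upper

# Hospitalizations with unavailable vs. available records, by race.
or_, lower, upper = crude_odds_ratio(a=135, b=180, c=60, d=141)
print(f"OR = {or_:.2f}, 95 percent CI: {lower:.2f}, {upper:.2f}")
```

The crude estimate comes out close to, but not identical with, the reported OR of 1.7 (95 percent CI: 1.2, 2.5). The small difference would be expected if the reported estimate came from a model that accounts for repeated hospitalizations within a person or for other covariates.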
For hospitalizations whose medical records were unavailable because of patient refusal (19.2 percent (n = 99; 73 Blacks, 26 Whites)), the difference between missing and nonmissing records was significant by race (p = 0.002) and by years of education (p = 0.008). For hospitalizations whose records were missing for reasons other than patient refusal, there was no significant difference between missing and nonmissing records by race (p = 0.17) or by years of education (p = 0.1). Concordance was high, even for Blacks (90.0 percent) and persons with a low level of education (93 percent), in the 62 percent of cases with available records, and the approximately 19 percent of cases with records missing for reasons other than patient refusal did not differ significantly by race (p = 0.17) or by years of education (p = 0.1). One therefore should not expect any dramatic change in the overall result from the remaining 19 percent of cases, for which records were unavailable because of patient refusal. Moreover, our findings on race and education were very similar to those of previous studies (6, 9).
Another issue is that our results related only to self-reported hospitalizations (i.e., only when participants stated on a questionnaire that they had been hospitalized and gave a reason for hospitalization). We did not check for negative responses (i.e., when the response to the question "Were you hospitalized?" was "No"). We had no way of determining false negatives; therefore, it was not possible to determine their impact on our analyses. Nonetheless, false negatives, given the relative rarity of any specific reason for hospitalization, would have little impact on estimates of association from case-control studies or longitudinal studies where these would be the outcome variables. Moreover, the concordance of negative responses is usually higher than that of positive responses when medical records are used as the gold standard. In a study where information from medical records was used as the gold standard for validation of self-reports of myocardial infarction, the predictive value for negative responses was found to be 100 percent, whereas the predictive value for positive responses was 70 percent (26). In a similar study of myocardial infarction (4), the predictive value was also higher for negative responses (99 percent) than for positive responses (74 percent).
In conclusion, the overwhelmingly high concordance between self-reports and medical records for overall reasons for hospitalization and for some categories of reasons for hospitalization, such as injuries or fractures, surgical procedures, and chronic medical conditions, suggests that investigators can often rely on self-reports by young adults for this information. This is particularly relevant in times when obtaining medical records is becoming more costly and logistically difficult. Researchers, however, need to be aware that certain personal characteristics may influence the validity of self-reports of reasons for hospitalization.
ACKNOWLEDGMENTS
Conflict of interest: none declared.