From the Institute for Health, Health Care Policy, and Aging Research, Rutgers, The State University of New Jersey, New Brunswick, NJ.
ABSTRACT
activities of daily living; health status; mortality; survival
Abbreviations: ICD-8, International Classification of Diseases, Eighth Revision; NHANES I, First National Health and Nutrition Examination Survey; NHEFS, NHANES I Epidemiologic Follow-up Study.
INTRODUCTION
A far smaller number of studies have examined disability rather than mortality outcomes (3–8). These studies have found self-rated health to be significantly predictive of functional limitations at follow-up, even when data are adjusted for baseline level of disability. Studies of health outcomes other than mortality are important because they identify health risks for survivors, respondents whose outcomes in a survival analysis would be identical because of censoring. Functional decline is a strong precursor of mortality in elderly populations (9, 10), despite the conventional caveat that functional decline is not inevitable in old age (11, 12). If self-rated health has an impact on mortality, one would expect it to also be related to a decline that precedes the event. Indeed, self-ratings have strong cross-sectional relations with functioning (13, 14).
In this investigation, we examined the role of self-rated health in multiple health outcomes using data from the First National Health and Nutrition Examination Survey (NHANES I) Epidemiologic Follow-up Study (NHEFS). The NHEFS is a large, nationally representative sample of adults aged 25–74 years at baseline (1971–1975) who have been traced through 1992. We hypothesized that self-ratings gathered as part of the NHANES I data would predict hazard of death up to 1992, as well as functional limitations in 1982 and 1992.
There have been two previous reports on poor self-rated health as a risk factor for mortality in the NHEFS (15, 16) and one in which it was found to be a risk factor for functional limitations (8). All three of those investigations had shorter periods of follow-up, using data from either the 1982–1984 follow-up (16) or the 1987 follow-up (8, 15). The earliest study, one of the first reports of an association between self-rated health and mortality, utilized data from a subsample which had been selected for detailed physical examination by staff physicians (17, 18). The small number of other studies of self-rated health and mortality at that time relied on self-reports or medical records for data on other risk factors. The physician examination in the NHANES I eliminated bias due to overreporting or underreporting of conditions or symptoms, as well as bias in medical records stemming from unequal access to health care or failure to seek treatment. All three studies reported some statistically significant associations of self-rated health with outcomes measured after various lengths of follow-up.
None of these studies took full advantage of the rich array of data available in the NHANES I for use as covariates. The baseline NHANES I database contains nutrition data from a food frequency questionnaire and a 24-hour dietary recall, clinical measurements from blood and urine samples, anthropometric measures of height and weight, blood pressure measurements, and respondent reports of symptoms, in addition to the physician examination and self-report data on diagnosed conditions and health behaviors considered by the previous studies. If the predictive power of self-rated health derives from the respondent's awareness of undiagnosed symptoms, such knowledge, if it is unreported, could influence both global assessments and later outcomes. The NHANES I is unique in its continuum of health status measures, from quite subjective information on internal health states (self-reported symptoms) to the clearly objective results of laboratory analyses of serum samples. Thus, this study both replicates and extends previous studies of self-ratings of health by 1) taking both survival time and functional limitations as outcomes and 2) employing the full range of NHANES I self-reported and clinical risk factor data as covariates in the analyses.
MATERIALS AND METHODS
Tracing for the 1992 follow-up was undertaken for all NHANES I respondents not known to have died by 1987, regardless of whether they had been reinterviewed in a previous wave of the NHEFS. Ninety percent of the surviving cohort was successfully traced, and interviews were conducted with 92.1 percent of those traced. Respondents' deaths were confirmed by death certificate or proxy interview. Those whose vital status could not be determined were considered lost to follow-up. Follow-up rates were higher for the subsample of 6,913 who received the detailed physical examination; 68.3 percent were traced as alive, 28.2 percent were known to have died, and 3.5 percent were lost to follow-up. Complete 1992 data on tracing status and time of death were available for 6,641 persons in the subsample, and data on functional limitations in 1992 were available for 4,136 persons (see table 1).
Functional limitations. The 1982 and 1992 follow-up interviews contained items measuring functional limitations, but the baseline interview did not. The 1982 interview contained 26 items measuring function in eight domains, adopted from the Stanford Health Assessment Questionnaire (19); the 1992 interview contained 23 items. For consistency, we dropped the three items from 1982 that had not been repeated. Each item was scored from 0 for "no difficulty" to 3 for "unable to do." We totaled the amount of difficulty the respondent had over all activities to create a summary score. On the basis of screening criteria, certain groups of respondents were not asked about some items. For example, respondents with no use of their lower limbs were not asked questions involving lower body mobility. We assigned the appropriate maximum or minimum value for all skipped items (see footnotes of table 1). The reliability of the measures was excellent; Cronbach's α for the 1982 items was 0.96, and for the 1992 items it was 0.92. The validity of the 1982 measure is supported by its correlations with older age (r = 0.28, p = 0.0001), frequency of exercise (r = -0.15, p = 0.0001), death by 1992 (r = 0.30, p = 0.0001), and the 1992 functional limitations measure (r = 0.45, p = 0.0001).
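The scoring and reliability calculations described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names, the use of `None` to mark a skipped item, and the toy responses are all assumptions introduced here, and Cronbach's α is computed with its standard formula.

```python
from statistics import variance


def fill_skipped(item_scores, fill_value):
    """Replace skipped items (marked None here) with an assigned score, e.g. 3 or 0."""
    return [fill_value if s is None else s for s in item_scores]


def summary_score(item_scores):
    """Sum item scores, each from 0 ("no difficulty") to 3 ("unable to do")."""
    if any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("each item must be scored 0, 1, 2, or 3")
    return sum(item_scores)


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy data invented for illustration: four respondents, three items.
toy = [[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(summary_score(fill_skipped([3, None, 1], fill_value=3)))  # 7
print(cronbach_alpha(toy))
```

In the toy data every item moves in lockstep, so α is at its upper bound of 1; real item sets such as the 23 items used here would fall below that.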
Physical examination data. NHANES I physicians performed direct physical examinations of respondents, including examination of the ears, head, eyes, mouth, neck, abdomen, major and minor joints, and skin, as well as percussion of the liver and auscultation of the heart. Physicians recorded abnormalities found during the examination in International Classification of Diseases, Eighth Revision (ICD-8), categories (20). We created a set of 15 binary variables from these records, one for each ICD-8 category, except complications of pregnancy and perinatal morbidity. Table 2 shows the weighted frequencies for these diagnostic categories; the most frequent diagnoses were for circulatory system disorders (including hypertension).
Self-reported conditions. NHANES I respondents were asked, "Has a doctor ever told you that you had any of the following conditions?" This was followed by a list of 39 chronic and acute conditions, some described with vernacular expressions (table 3). The most frequently reported conditions included arthritis, fractures, bladder infections, and anemia.
Self-reported health practices. Two binary variables were created to indicate smoking status, current and past. A level-of-activity variable was defined on the basis of two questions concerning exercise and usual levels of daily activity. The categories for alcohol consumption separated regular moderate to heavy drinkers (two or more drinks at least several days per week) from regular light to moderate drinkers (one drink as often as daily) and less regular drinkers from nondrinkers. They were defined on the basis of answers to questions about frequency and amount of alcohol consumed.
The NHANES I contained a dietary frequency questionnaire, which asked about the number of times per week fiber, fish and shellfish, and fruits and vegetables were consumed. Only participants surveyed at baseline in 1971–1973 and some of those surveyed in 1974 were asked about diet. We imputed mean values based on regressions of the variables on age, race, and sex (53.0 percent of the sample). Respondents reported that they consumed fruits or vegetables approximately twice per day, while fish was eaten only once per week, on average.
Self-rated health. A single-item health status indicator with categories of excellent, very good, good, fair, and poor was treated as a set of four binary variables, with poor health being the reference category.
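The coding of self-rated health as four binary variables with poor health as the reference category can be illustrated with a short sketch. The names `LEVELS`, `REFERENCE`, and `dummy_code` are hypothetical and are not taken from the NHANES I files.

```python
# Response categories as listed in the text; "poor" is the reference category.
LEVELS = ["excellent", "very good", "good", "fair", "poor"]
REFERENCE = "poor"


def dummy_code(rating):
    """Return four 0/1 indicators; a respondent rated "poor" gets all zeros."""
    if rating not in LEVELS:
        raise ValueError(f"unknown rating: {rating!r}")
    return {level: int(rating == level) for level in LEVELS if level != REFERENCE}


print(dummy_code("good"))
print(dummy_code("poor"))
```

With this coding, each coefficient for a self-rated health category is a contrast with respondents who rated their health as poor.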
Data analysis
The NHEFS has a complex survey design, including multistage sampling and clustering. Women of childbearing age, elderly persons (up to age 74 years only), and persons living in poverty areas were oversampled. Weights were assigned to each respondent to adjust for his or her probability of selection and to ensure agreement with controls provided by the US Bureau of the Census (22). To reduce the very large number of covariates, models were fitted initially with unweighted Cox proportional hazards (for mortality) (23) and ordinary least squares regression (for functional limitations) procedures, using stepwise procedures that are not available in SUDAAN. We fitted multivariate models separately within the five categories of physician diagnoses, clinical measurements, conditions, symptoms, and health behaviors; variables whose parameter estimates had associated p values of 0.10 or less were retained for later analyses. Final models including all variables with significant coefficients for either males or females were refitted applying weights and adjusting the standard errors with a Taylor series approach, using SUDAAN software (22, 24, 25).
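The screening rule described above can be sketched as a simple filter. This is an illustrative assumption about how the rule might be applied, not the authors' procedure: the function name, the variable names, and the p values are invented, and the sketch applies the 0.10 threshold to either sex in a single step.

```python
def retained_variables(p_male, p_female, threshold=0.10):
    """Keep a variable if its p value is at or below the threshold for either sex.

    p_male and p_female map variable names to p values from the sex-specific
    screening models; a variable missing from one model is treated as p = 1.0.
    """
    names = set(p_male) | set(p_female)
    return sorted(
        name
        for name in names
        if min(p_male.get(name, 1.0), p_female.get(name, 1.0)) <= threshold
    )


# Hypothetical p values for three covariates in one category.
p_male = {"arthritis": 0.30, "morning_cough": 0.04, "hives": 0.55}
p_female = {"arthritis": 0.01, "morning_cough": 0.20, "hives": 0.60}
print(retained_variables(p_male, p_female))  # ['arthritis', 'morning_cough']
```

Keeping a variable that passes in either sex means the men's and women's final models share the same covariate set, which makes their coefficients comparable.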
Survival analysis results are reported as conditional hazard ratios with 95 percent confidence intervals. Hazard ratios for categorical variables compare the hazard rate (for mortality) for respondents with the factor to the hazard rate for those without it; for continuous variables, the hazard ratio represents the increase in hazard associated with a single unit of change in the hazard factor. In either case, hazard ratios larger than 1 signify an increased hazard associated with the factor in question and those smaller than 1 signify a decreased hazard. Confidence intervals that do not include 1.0 indicate that the association is statistically significant at the 0.05 level.
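As a worked illustration of how such hazard ratios are obtained, a Cox model coefficient β and its standard error can be converted to a hazard ratio and a Wald-type 95 percent confidence interval by exponentiation. The coefficient and standard error below are invented for illustration and are not results from this study; the function name is likewise hypothetical.

```python
import math


def hazard_ratio(beta, se, z=1.96):
    """Return (hazard ratio, lower limit, upper limit) for a Cox coefficient.

    The hazard ratio is exp(beta); the interval is exp(beta -/+ z * se),
    with z = 1.96 giving an approximate 95 percent confidence interval.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical coefficient for a binary risk factor.
hr, lower, upper = hazard_ratio(beta=math.log(2), se=0.2)
print(round(hr, 2), round(lower, 2), round(upper, 2))

# For a continuous covariate, the ratio for a c-unit change is exp(c * beta).
print(round(math.exp(10 * 0.05), 2))
```

In the first example the whole interval lies above 1, which corresponds to an increased hazard that is significant at the 0.05 level.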
RESULTS
With regard to the first study hypothesis, we found support in males but not in females that global self-ratings of health add significantly to mortality prediction, even when a large array of other information is considered in the model. To gauge the impact of self-ratings of health, we refitted the model for men without them; the results showed that the addition of self-rated health caused very little change in the hazard ratios for diagnoses, clinical measurements, conditions, symptoms, or risk behaviors, and also that the amount added to the -2 log likelihood ratio χ² was 22.2 (4 df; p < 0.001). By contrast, the amount added to the women's model -2 log likelihood ratio χ² was 7.6 (4 df; p > 0.10). However, a test of differences between the parameter estimates for "excellent" health in the two models found the women's parameter estimate to be not significantly different from the men's. This suggests that men's global self-ratings, but possibly not women's, contain information that is independent of the comprehensive measurement of health achieved in this survey.
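The p values attached to the two likelihood ratio statistics reported above (22.2 and 7.6, each on 4 df) can be checked with the closed-form upper tail of the chi-square distribution, which exists for even degrees of freedom. This is a verification sketch only; the function name is an assumption.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    For df = 2m the tail is exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    terms = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * terms


p_men = chi2_upper_tail_even_df(22.2, 4)
p_women = chi2_upper_tail_even_df(7.6, 4)
print(p_men < 0.001, p_women > 0.10)  # True True
```

Both results agree with the thresholds reported in the text: the men's statistic falls well below 0.001, while the women's is just above 0.10.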
Functional limitations
Table 1 shows frequencies for individual items and weighted means for the 1982 and 1992 total scores. In 1982, 65.1 percent of the sample had no difficulty with any activity; in 1992 this figure was 64.8 percent. Because functioning was not assessed at baseline, we took both the 1982 and 1992 measures as endpoints. In model 1, 1982 functioning was used as the dependent variable; in models 2 and 3, 1992 functioning was used as the dependent variable; and in model 3, 1982 functioning was added as a covariate, thus adjusting for level of disability prior to 1982. Parameter estimates shown in table 5 are unstandardized regression coefficients; they may be interpreted as the level of (absolute) change in the dependent variable (functioning score) associated with a change of one unit in the independent variable. As before, variables were included in the models if they were initially significant predictors within their group for either men or women.
Women's 1982 functional limitations were significantly associated with age, with diagnosis of musculoskeletal disease, and with self-reports of arthritis, leg pain, morning cough, and wheezy chest. None of these diagnoses or conditions resulted in an increase in the limitation score of more than 2.0 points. By contrast, excellent, very good, and good self-rated health resulted in scores that averaged more than 8 points lower than those for women with poor health, though not in precise dose-related fashion; self-ratings of fair health reduced scores by 5.2 points.
Men's 1992 functional limitations were associated with age and with self-reports of bronchitis, heart attack, hernia, hives, morning cough, and chest pain. Some of these effects were apparently protective (i.e., they had negative signs), which may signal selection effects in the reporting of relatively minor conditions. Excellent self-rated health subtracted 9.5 points from functioning scores, and self-ratings of fair health reduced scores by 7.3 points, with scores for very good and good health falling in between, again in a dose-related fashion. These effects were greater than the size of the 1982 effects.
Women's 1992 functioning was predicted by clinical indicators of circulatory system disease and musculoskeletal disorders and by overweight. Self-reported conditions that predicted 1992 limitations included arthritis, diabetes, heart attack, morning cough, leg pain, wheezy chest, and alcohol use. Excellent self-rated health decreased scores by 8.1, and very good, good, and fair health also made significant reductions.
Model 3 introduced 1982 functional limitations, so that baseline level of limitation and any increase in limitation that occurred up to 1982 were eliminated. Correlated measurement error, which might have existed because of the presence of both measures in the model, potentially biasing estimates of other terms in the model, is likely to have been small because the reliabilities of both measures were so high. For women, the introduction of this variable virtually eliminated the significant effect of self-rated health, with only one category remaining marginally significant. This reduction of the effect suggested that the initial effect on 1992 limitations was largely explained by an association between self-rated health and functioning up to 1982 but not beyond 1982. For men, there were still significant effects for excellent and very good health; the effects for good and fair health were marginally significant. Thus, for functional limitations, as for mortality hazards, the findings were substantially weaker for the women in the sample.
DISCUSSION
A primary limitation of many studies of self-rated health has been a perceived inadequacy of other risk factor measures in the analyses; clinical data have been available only rarely, and self-report data on conditions and symptoms have often been relatively crude. Thus, it is not surprising that respondents with poor self-rated health in these studies were less likely to survive; it is an unremarkable result, unless the global rating of health is analyzed in a multivariate context in which the effects of more specific health status measures are assessed first. The NHANES I physical examination was designed to have high sensitivity, to improve prevalence estimates by detecting previously undiagnosed disorders. Thus, the data from these particular physical examinations should provide a strong test of the independence of self-ratings. Moreover, self-reports, especially of symptoms, could represent the early or prodromal stages of conditions that might not have been detected in a physical examination. The strength of the risk factor data considered in the present study makes the findings among the most substantively significant that have been assembled to date on the predictiveness of subjective self-ratings of health.
The choice of multiple endpoints for this study was an attempt to take a well-replicated finding and move this area of research forward. Previous studies have examined differences in single endpoints, mortality, or (less often) functioning, and have as a result ignored data from one or another segment of their samples. Simple mortality studies with a binary outcome treat all deaths and all survivors as two homogeneous groups. Studies of hazard rates differentiate length of time to death within the deceased group but right-censor values for survivors, which in effect treats all survivor outcomes as identical. Even less complete are studies of functioning which analyze data only from living respondents.
Our results regarding the men in the sample confirm a frequently repeated pattern with more comprehensive collateral measures of health status than have been seen in the field. For women, however, the findings are substantially weaker, with no significant effect of self-rated health on mortality and a sharp reduction in the effect on functional limitations once a prior measure of limitations is introduced.
Jylhä et al. (26) have pointed out that gender differences in self-rated health have not been systematically examined. To address this question, we located 17 studies which analyzed self-ratings of health as predictors of mortality separately by sex. In six of the studies, relative risks of mortality for poor self-rated health compared with excellent self-rated health were higher among women (27–32), but several of these differences were slight. In 10 studies (one with two samples), the risks were higher for men (33–41), and in six of these there were no significant effects at all for women. There are no obvious differences between the "male-higher" and "female-higher" studies with respect to samples or study design; nearly all were studies of elderly populations, and five different countries were represented. Our similar results, with weaker findings for women in a sample that comprised young as well as elderly respondents and an extremely inclusive set of health status covariates, should raise this issue for further consideration. Research attention could focus on: 1) the differential distribution of the dependent variable, 2) the differential distribution of the self-rated health variable, and/or 3) the nature and distribution of other covariates in the model. If "health" means something different to men and to women, this may affect both their pre-existing knowledge of their own health and their postbaseline health risk behavior. Gender differences in the effects may offer us a window for comparison of these processes and actually aid us in understanding them.
In our study, we examined differences in mortality hazards for respondents who rated their health differently, and then we examined significant differences in functioning among survivors, differences which are themselves strong predictors of subsequent risk of death. We conclude that in this sample, self-ratings of health played an independent role in the prediction of survival for men and, to some extent, in the prediction of functional limitation for both men and women; self-ratings of health were not rendered superfluous by the inclusion of comprehensive data obtained more objectively from a standardized physical examination and a set of laboratory tests. We also conclude that studies of self-ratings of health must move to a new stage in which health outcomes for study samples are more fully explored and potential sex differences in effects are systematically addressed.
ACKNOWLEDGMENTS
The authors acknowledge the National Center for Health Statistics for fielding the NHANES I Epidemiologic Follow-up Study and the Inter-University Consortium for Political and Social Research at the University of Michigan for making the data available. The authors also acknowledge the valuable assistance of Edwin Milan, Sharon Cook, and Tami Videon in the preparation of the data for this project.
An early version of this paper was presented at the meeting of the American College of Epidemiology, Arlington, Virginia, September 12, 1994.
REFERENCES