Dichotomous or categorical response? Analysing self-rated health and lifetime social class

Orly Manor (a), Sharon Matthews (b) and Chris Power (b)

a School of Public Health and Community Medicine, The Hebrew University and Hadassah, P.O. Box 12272, Jerusalem 91010, Israel. E-mail: om{at}cc.huji.ac.il.
b Department of Epidemiology & Public Health, Institute of Child Health, 30 Guilford Street, London WC1N 1EH, UK.


    Abstract
 
Background Self-rated health is a commonly used measure of health status, usually having three to five categories. The measure is often collapsed into a dichotomous variable of good versus less than good health. This categorization has not yet been justified.

Methods Using data from the 1958 British birth cohort, we examined the relationship between socioeconomic conditions, indicated by occupational class at four ages, and self-rated health. Results obtained for a dichotomous variable using logistic regression were compared with alternative methods for ordered categorical variables including polytomous regression, cumulative odds, continuation ratio and adjacent categories models.

Results and Conclusions Findings concerning the relationship between socioeconomic position and self-rated health yielded by a logistic regression model were confirmed by alternative statistical methods which incorporate the ordered nature of self-rated health. Similarity of results was found regarding size and significance of main effects, type of association and interactive effects.

Keywords Self-rated health, social class, logistic regression, polytomous regression, cumulative odds model, continuation ratio model, adjacent categories model

Accepted 29 June 1999


    Introduction
 
Research on the influence of risk factors at different life stages on health requires measures of health that can be applied across the life span. One widely used measure is ‘self-rated health’, that is, an overall assessment by the individual of their health status. Self-rated health is easily administered, is not demanding of interview time, and may indicate ill-health at younger ages when mortality is rare. Evidence has accumulated showing that self-rated health is associated with fitness,1 morbidity,2,3 and general practitioner visits,4 and that it predicts mortality.5,6 Such associations are likely to reflect the fact that the individual possesses information on the complex range of factors (including health behaviours, family history of longevity and so on) influencing their health status. As a subjective measure it may, however, be imprecise due to differences in interpretation. Finally, methodological studies of this health measure have tended to focus only on reliability and validity.6–8

Although self-rated health is measured as a categorical response, usually having three to five categories, it is often collapsed into a dichotomous variable of good versus less than good health when it is used as a dependent variable.9–16 The justification for this practice has not been established. It may be that the categories of self-rated health represent an arbitrary classification of underlying continuous phenomena.17 Alternatively, the categories may represent intrinsically distinct health states, which are predicted by different factors.18,19 It is important, therefore, to establish whether analyses of self-rated health are sensitive to categorization. Investigation of the uncollapsed variable may shed light on whether threshold effects exist which would justify the practice of collapsing this health measure. To our knowledge this issue has not been examined.

The collapsing of categories of a categorical variable has been discussed in the statistical literature, and it is recognized that dichotomization, whilst valid, involves loss of information and may reduce the efficiency of the statistical analysis under consideration.20,21 Alternative approaches, which take account of the ordinal nature of a response variable, have been suggested within the framework of logit and loglinear models for ordered categorical variables. These approaches include the continuation ratio model, the cumulative odds model and the adjacent categories model.20–24

Elsewhere we have examined the association of cumulative socioeconomic circumstances and education with self-rated health among 33 year olds in the 1958 British birth cohort.25 Five specific questions were identified as follows. First, do socioeconomic conditions (as indicated by occupational class) at four life stages contribute to self-rated health at age 33? Second, are these contributions of similar magnitude? Third, is there an interaction effect of socioeconomic conditions at different ages? Fourth, is the cumulative effect of socioeconomic conditions distinct from that of education? Finally, are there gender differences in these relationships? This previous work used self-rated health as a dichotomous variable. The purpose here is to establish whether results for the dichotomous outcome differ from those obtained with alternative approaches based on self-rated health as an ordered categorical variable.


    Statistical models
 
When the response variable is binary, logistic regression models enable assessment of the association between one or more independent variables and the response variable. When the response variable has more than two categories, generalizations of the logistic model have been suggested. Whereas in a binary response model the definition of the logit to be modelled is unique, in the multinomial case a number of different logits could be modelled.

Consider a polytomous variable Y having k categories denoted by 1,2,..., k corresponding to the ordered values y1,..., yk and let x = (x1,..., xp)’ be a column vector of covariates. A possible generalization of the logistic regression model is the ordinary polytomous model.26 The model is given by


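$$\log\left[\frac{P(Y = y_j \mid x)}{P(Y = y_1 \mid x)}\right] = \alpha_j + \beta_j' x, \qquad j = 2, \ldots, k \tag{1}$$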

where βj are vectors of regression coefficients and βji represents the log odds ratio for Y = yj versus Y = y1 per unit increase in xi. The model is based on a reference category, here the first one, and involves k – 1 logit equations, each having separate parameters. While the model's parameters are easy to interpret, the model does not incorporate the natural ordering of the response variable Y. The following models address this issue.
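As an illustration of how the k – 1 logit equations of the ordinary polytomous model determine all k category probabilities, the following Python sketch may help. The function name and all parameter values are hypothetical and are not estimates from this study.

```python
import math

def baseline_category_probs(alphas, betas, x):
    """Category probabilities under a reference-category (polytomous) logit model.

    alphas[m] and betas[m] belong to the logit of category m + 2 versus
    category 1 (the reference); x is the covariate vector.
    """
    # Linear predictor of each non-reference category.
    etas = [a + sum(b_i * x_i for b_i, x_i in zip(b, x))
            for a, b in zip(alphas, betas)]
    # The reference category has linear predictor 0, so it contributes exp(0) = 1.
    denom = 1.0 + sum(math.exp(eta) for eta in etas)
    return [1.0 / denom] + [math.exp(eta) / denom for eta in etas]

# Hypothetical values: k = 4 categories, one covariate.
alphas = [0.5, -1.0, -3.0]
betas = [[0.05], [0.15], [0.20]]
probs = baseline_category_probs(alphas, betas, [8.0])
print([round(p, 4) for p in probs])
```

Each non-reference category has its own slope here, which is why the model needs (k – 1) × p regression parameters and does not use the ordering of the categories.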

The cumulative odds model does incorporate the ordered nature of the response variable. This model, which was suggested by Walker and Duncan27 and further discussed by McCullagh,20 is related to cumulative probabilities and is given by


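$$\log\left[\frac{P(Y \ge y_j \mid x)}{P(Y < y_j \mid x)}\right] = \alpha_j + \beta' x, \qquad j = 2, \ldots, k \tag{2}$$

with α2 > α3 > ... > αk.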

Consider transforming the Y variable into a binary variable in which all categories from 1 to j – 1 form one category, and those from j to k form the second category. The logits of such binary variables are modelled in the cumulative odds model. The vector of regression coefficients, β, does not depend on j; thus this model assumes that the association between X and Y is independent of j. Comparing the odds of having a response ≥ yj for x = x1 with that for x = x2 will yield

$$\log\left[\frac{P(Y \ge y_j \mid x_1) / P(Y < y_j \mid x_1)}{P(Y \ge y_j \mid x_2) / P(Y < y_j \mid x_2)}\right] = \beta'(x_1 - x_2),$$

which does not depend on j.
The assumption of equality of the log odds ratio over all the cut-off points is also referred to as the proportional odds assumption.20 A more general model, which relaxes this assumption, has αj + βj'x on the right-hand side of (2). Such a model can be used to assess the appropriateness of the proportional odds assumption by testing the equality of the βj. The model stays practically the same if the order of the categories in the Y variable is reversed. Further, collapsing some of the categories will only affect the intercept parameters. This collapsibility property allows modelling of an underlying continuous variable.24
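The proportional odds property can be checked numerically. The sketch below assumes the cumulative logit is defined for Y ≥ yj with decreasing intercepts, and it uses hypothetical parameter values. It shows that the log odds ratio for a one-unit difference in the covariate equals β at every cut-off point.

```python
import math

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def cumulative_odds_probs(alphas, beta, x):
    """Category probabilities; alphas must be decreasing (alpha_2 > ... > alpha_k)."""
    eta = sum(b * xi for b, xi in zip(beta, x))
    # upper[j] = P(Y >= category j + 1); the trailing 0.0 is a sentinel.
    upper = [1.0] + [expit(a + eta) for a in alphas] + [0.0]
    return [upper[j] - upper[j + 1] for j in range(len(alphas) + 1)]

def cumulative_log_odds(probs, j):
    """Log odds of Y >= category j versus Y < category j (j is 1-based, j >= 2)."""
    return math.log(sum(probs[j - 1:]) / sum(probs[:j - 1]))

# Hypothetical values: k = 4 categories, one covariate.
alphas, beta = [1.0, -2.0, -4.5], [0.14]
p_hi = cumulative_odds_probs(alphas, beta, [9.0])
p_lo = cumulative_odds_probs(alphas, beta, [8.0])
log_odds_ratios = [cumulative_log_odds(p_hi, j) - cumulative_log_odds(p_lo, j)
                   for j in (2, 3, 4)]
print([round(v, 6) for v in log_odds_ratios])
```

A logistic regression on the dichotomized response corresponds to fitting just one of these cut-off points, so under proportional odds it targets the same slope while using less of the information in the data.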

Another model which incorporates the natural order of the response is the continuation ratio model. Here the logits to be modelled are those yielded by transforming Y into a binary variable in the following way: one category is formed from category j of Y, and the other is formed by categories 1 to j – 1. The model has the following form


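$$\log\left[\frac{P(Y = y_j \mid x)}{P(Y < y_j \mid x)}\right] = \alpha_j + \beta' x, \qquad j = 2, \ldots, k \tag{3}$$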

The model, which was suggested by Fienberg,28 is based on the probability of being in category j, conditional on being in category j or a lower category. The slope βi, corresponding to the covariate xi, represents the change in the relative chance of a particular ranking, against a lower ranking, for a unit change in xi. Reversing the order of the categories will yield a different model, as will collapsing some of the categories of Y. The model is related to Cox's proportional hazards model.29
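A short sketch of how category probabilities follow from the continuation ratio logits, taking the logit to be that of category j against all lower categories with a common slope. Parameter values are hypothetical.

```python
import math

def continuation_ratio_probs(alphas, beta, x):
    """Category probabilities when log[P(Y = j) / P(Y < j)] = alpha_j + beta'x."""
    eta = sum(b * xi for b, xi in zip(beta, x))
    # ratios[m] = P(Y = j) / P(Y < j) for j = m + 2.
    ratios = [math.exp(a + eta) for a in alphas]
    # Build unnormalised masses: category 1 gets mass 1, and each later
    # category gets (mass of all lower categories) * ratio.
    masses = [1.0]
    below = 1.0
    for r in ratios:
        masses.append(below * r)
        below += below * r
    return [m / below for m in masses]

# Hypothetical values: k = 4 categories, one covariate.
alphas, beta, x = [0.2, -1.5, -3.0], [0.12], [8.0]
probs = continuation_ratio_probs(alphas, beta, x)
logit_3 = math.log(probs[2] / (probs[0] + probs[1]))
print(round(logit_3, 4))
```

Because each logit conditions on a different subset of categories, relabelling the categories in reverse order changes which subsets are compared, which is why the reversed model is not equivalent.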

The adjacent categories model is an additional model suggested for analysing an ordered categorical response. The model is based on comparing in each logit two adjacent categories and has the following form


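$$\log\left[\frac{P(Y = y_j \mid x)}{P(Y = y_{j-1} \mid x)}\right] = \alpha_j + \beta_j' x, \qquad j = 2, \ldots, k \tag{4}$$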

The parameter βji corresponds to the regression coefficient of the i-th covariate for the log odds of (Y = yj) relative to the adjacent lower category, (Y = yj–1). The model is related to the ordinary polytomous model presented in (1): in both models the regression coefficient β depends on j, and thus is different for each logit modelled. A simpler model may include the assumption that the βj are all equal, implying a similar logit effect for all (k – 1) pairs of adjacent response categories; such a model will have αj + β'x on the right-hand side. The adjacent categories model was presented by Goodman30 and the model with fixed β is similar to a loglinear model having a linear by linear association.21 Such a model requires assigning scores to the response variable.
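The link between the adjacent categories model with a common slope and the ordinary polytomous model can be sketched as follows. Summing adjacent logits gives the logit against the first category, so the implied slope for category j versus category 1 is (j – 1)β, which corresponds to equally spaced scores. Parameter values are hypothetical.

```python
import math

def adjacent_categories_probs(alphas, beta, x):
    """Category probabilities when log[P(Y = j) / P(Y = j - 1)] = alpha_j + beta'x."""
    eta = sum(b * xi for b, xi in zip(beta, x))
    # Log of unnormalised mass: cumulative sum of adjacent-category logits.
    log_mass = [0.0]
    for a in alphas:
        log_mass.append(log_mass[-1] + a + eta)
    total = sum(math.exp(v) for v in log_mass)
    return [math.exp(v) / total for v in log_mass]

# Hypothetical values: k = 4 categories, one covariate.
alphas, beta = [0.3, -1.8, -2.0], [0.09]
p_hi = adjacent_categories_probs(alphas, beta, [9.0])
p_lo = adjacent_categories_probs(alphas, beta, [8.0])
# Change in the log odds of category 4 versus category 1 per unit of x.
slope_4_vs_1 = math.log(p_hi[3] / p_hi[0]) - math.log(p_lo[3] / p_lo[0])
print(round(slope_4_vs_1, 6))
```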

It is important to note that the parameters of the models presented above, while representing in each case a constant (α) and a regression parameter (β), are not comparable across models, since each model predicts different probabilities.
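To see why the parameters are not comparable, it may help to compute the four kinds of logit from one and the same distribution. The distribution below is hypothetical (it is not Table 1), ordered from the first to the last category.

```python
import math

def four_logits(probs):
    """Reference-category, cumulative, continuation ratio and adjacent logits."""
    rows = []
    for j in range(1, len(probs)):  # j is the 0-based index of categories 2..k
        below = sum(probs[:j])
        at_or_above = sum(probs[j:])
        rows.append({
            "category": j + 1,
            "baseline": math.log(probs[j] / probs[0]),
            "cumulative": math.log(at_or_above / below),
            "continuation": math.log(probs[j] / below),
            "adjacent": math.log(probs[j] / probs[j - 1]),
        })
    return rows

# Hypothetical four-category distribution.
rows = four_logits([0.35, 0.53, 0.10, 0.02])
for row in rows:
    print({name: (value if name == "category" else round(value, 3))
           for name, value in row.items()})
```

The definitions coincide only in special cases: for the second category the reference-category, continuation ratio and adjacent logits are identical, and for the last category the cumulative and continuation ratio logits are identical. Elsewhere the same data give different logits, so intercepts and slopes fitted on different scales cannot be compared directly.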

Variations and generalizations of the models described above have also been presented,31 including the stereotype model,32 which is based on model (1) together with constraints on the regression parameters so as to have a monotonic pattern. While models based on logits are natural for studying an ordered categorical response variable, additional models based on mean response can also be employed. These models, which resemble linear regression models for a continuous response, are appropriate mainly for analysing underlying continuous variables and have some disadvantages when used for ordered categorical variables.26 In addition, a number of possibilities for assigning scores to an ordered categorical variable can be employed, including methods based on an underlying continuous distribution and methods based on maximizing correlation.33–35

The selection of an appropriate model is usually carried out using goodness of fit considerations. However, additional criteria should be addressed as well. These criteria include the formulation of the research question, namely which logit was of interest a priori, as well as the type of dependent ordered variable under consideration; that is, does it represent an underlying continuous phenomenon or a discrete one? Is the model parsimonious? Does modelling involve the assignment of scores to the ordered categories? Finally, what is the effect of combining adjoining categories on the model?

Given the diverse approaches available for analysing the association between a set of covariates and an ordered categorical response, it is of interest to establish whether conclusions would differ substantively using alternative methods, and especially whether the results yielded by models based on a polytomous response variable differ from those yielded by a model which dichotomizes the response variable. We therefore selected the ordinary polytomous model, the cumulative odds model, the continuation ratio model and the adjacent categories model and compared the conclusions yielded by these models with those of a logistic regression model based on a dichotomization of self-rated health.


    Health and socioeconomic circumstances in the 1958 birth cohort
 
Study sample
The 1958 birth cohort includes all children born in England, Wales and Scotland during 3–9 March 1958. Information was collected on 98% of births, totalling 17 414, during the Perinatal Mortality Study.36 Five subsequent follow-up studies were conducted at ages 7, 11, 16, 23 and 33, with 11 405 subjects included in the most recent sweep. Those remaining in the study are considered to be generally representative of the original sample, although as expected sample attrition has been associated with under-representation of those with the most disadvantaged backgrounds.2,37 Such biases tend to be small: within the sample used in our analyses, which includes 7063 individuals with complete information, 19.4% of men responding at age 33 had been born into classes IV and V compared with 21.1% in the original sample; for women the figures are 20.5% and 21.4%.

Measures
For self-rated health, cohort members gave an overall assessment of their health, rated as excellent, good, fair or poor during a personal interview at age 33.

Socioeconomic circumstances were indicated by social class, based on occupation, at four ages: (i) father's occupation at the time of the respondent's birth (using the Registrar General's (RG) 1950 classification); (ii) father's occupation when the cohort member was aged 16 (to reduce the effects of sample attrition due to missing data, father's social class at age 11 was used if 16-year data were not available); (iii) cohort member's own occupation at age 23 (using the 1980 RG classification); (iv) cohort member's own occupation at age 33 (using the 1990 RG classification). Four social groups were used: I & II (professional and managerial), III non-manual (skilled non-manual), III manual (skilled manual) and IV & V (unskilled manual). Both men and women were classified on the basis of their own occupation.

We constructed a summary measure of socioeconomic circumstances at the four ages (SES lifescore) in which each life stage had equal weighting and which was compared with other measures of cumulative SES.25 This simple score was derived by adding the social class value (1–4) at different ages, and ranged from 4 (for those always in the highest social class) to 16 (for those always in the lowest class). About 25% of men had scores below 8, and 25% had scores above 12; for women the percentages were 25% and 17% respectively.
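The construction of the lifescore can be sketched in a few lines of Python. The function and argument names are hypothetical; the coding (1 for classes I & II through 4 for classes IV & V) follows the description above.

```python
def ses_lifescore(class_at_birth, class_at_16, class_at_23, class_at_33):
    """Sum of social class values (1 = I & II ... 4 = IV & V) over four ages."""
    values = (class_at_birth, class_at_16, class_at_23, class_at_33)
    for value in values:
        if value not in (1, 2, 3, 4):
            raise ValueError("social class must be coded 1, 2, 3 or 4")
    # Each life stage carries equal weight in the summary score.
    return sum(values)

always_highest = ses_lifescore(1, 1, 1, 1)
always_lowest = ses_lifescore(4, 4, 4, 4)
mixed = ses_lifescore(3, 3, 2, 1)
print(always_highest, always_lowest, mixed)
```

Note that different class histories can share one score (for example 3, 3, 2, 1 and 1, 2, 3, 3 both give 9), so the score summarizes cumulative position rather than the direction of mobility.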

Educational qualifications obtained by age 33 were classified into five groups: above ‘A’ Level (29% men, 26% women), ‘A’ Level or equivalent (24% men, 10% women), ‘O’ Level or equivalent (24% men, 36% women), less than ‘O’ Level (14% men, 17% women), and no qualifications (9% men, 11% women).

Data analysis
The approaches for comparison were ordinary polytomous regression, the cumulative odds method, the continuation ratio method, the adjacent categories method and a logistic regression analysis based on a dichotomization of self-rated health. For the continuation ratio, the assumption of parallelism was tested. For the adjacent categories method, a uniform association having a fixed β was used to represent a more parsimonious model, and equally spaced scores were assigned to self-rated health.

Three models were examined for each method: first with social class at the four ages separately as the explanatory variables, second with the SES summary (lifescore), and third with lifescore and education simultaneously. Both social class and educational qualifications were used as ordinal categorical variables with equally spaced scores assigned to the categories. Different ways of assigning the scores to these variables and the sensitivity of the results of logistic regression models to the different scores have been reported elsewhere.35 A potential problem in Model 1 is collinearity arising from the correlation between the independent variables. We assessed the size of this problem and its impact on the estimated parameters for the logistic regression model, and found that the results were not affected by collinearity.38

Goodness of fit was assessed for each model using the deviance statistic. However, when the number of covariate patterns was large, we used a modified test based on a procedure suggested by Hosmer and Lemeshow.39 The estimated logits were divided into 10 equal groups and in each group the observed and expected frequencies were compared for the categories of self-rated health. A Pearson χ² test was used to summarize the comparison. Furthermore, evaluation of the residuals and graphical methods were also used to assess the fit of the models.
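The grouping procedure can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated, the parameter values and function names are hypothetical, and the exact grouping rule and reference distribution used in the study are not given in the text, so none are asserted here.

```python
import math
import random

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def category_probs(alphas, beta, x):
    """Cumulative odds probabilities for one covariate (used only to simulate)."""
    upper = [1.0] + [expit(a + beta * x) for a in alphas] + [0.0]
    return [upper[j] - upper[j + 1] for j in range(len(alphas) + 1)]

def grouped_pearson_chi2(records, n_groups=10):
    """records: (estimated_logit, observed_category_index, expected_probs)."""
    ordered = sorted(records, key=lambda r: r[0])
    n, k = len(ordered), len(ordered[0][2])
    chi2, table = 0.0, []
    for g in range(n_groups):
        # Near-equal groups of subjects, ordered by estimated logit.
        group = ordered[g * n // n_groups:(g + 1) * n // n_groups]
        observed, expected = [0] * k, [0.0] * k
        for _, category, probs in group:
            observed[category] += 1
            for c in range(k):
                expected[c] += probs[c]
        chi2 += sum((o - e) ** 2 / e for o, e in zip(observed, expected))
        table.append((observed, expected))
    return chi2, table

# Simulated, hypothetical data: a lifescore-like covariate from 4 to 16.
rng = random.Random(0)
alphas, beta = [-0.5, -3.5, -6.0], 0.14
records = []
for _ in range(2000):
    x = rng.randint(4, 16)
    probs = category_probs(alphas, beta, x)
    observed_category = rng.choices(range(4), weights=probs)[0]
    records.append((beta * x, observed_category, probs))

chi2, table = grouped_pearson_chi2(records)
print(round(chi2, 2))
```

With a rare last category, several expected counts in the grouped table will be small, which makes the resulting statistic unstable and is a reason to examine residuals as well.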

The results for logistic regression and the four alternative approaches were compared with respect to the effect size, the significance of results, gender differences, linearity of association (which was tested with a quadratic term) and tests of interactions.


    Results
 
Table 1 shows that the distribution of self-rated health at age 33 was similar for men and women, with slightly more than 1% reporting poor health. For the logistic regression analyses the poor category was combined with the fair, and excellent-rated health was combined with that rated as good. Results of the logistic regression are presented in Table 2. Model 1 estimates the effects of social class at birth, and ages 16, 23 and 33 simultaneously. It indicates that, apart from class for men at age 16, all other social class measures are significantly related to self-rated health at age 33. Thus, a predictive effect of social class at birth remained after allowing for social class at later ages, and conversely, adult class had an additional effect to that at earlier ages. Furthermore, the size of effect of social class was similar for each age (except for men at age 16). We tested models with interactions between class at two consecutive ages, and found no significant interaction. Thus, social mobility was not found to be significantly related to self-rated health. Model 2 shows a strong association between the lifetime SES score and self-rated health, where a one unit increase in this score increases the odds for less than good health by 15% in men and 18% in women. The contribution of lifetime SES in predicting self-rated health remains significant also after adjusting for education (Model 3). In all three models results for men and women were not significantly different. Regarding goodness of fit, all three models fit the data well.
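The per-unit percentages reported for Model 2 can be translated into a logit slope and into an odds ratio over the whole range of the lifescore (4 to 16, i.e. 12 units). The sketch below takes the rounded figures of 15% and 18% at face value and assumes the association is linear on the logit scale across the full range; function names are hypothetical.

```python
import math

def slope_from_percent(percent_increase):
    """Logit slope implied by a percentage increase in odds per unit."""
    return math.log(1.0 + percent_increase / 100.0)

def odds_ratio_over_range(percent_increase, units):
    """Odds ratio accumulated over a number of units, assuming log-linearity."""
    return (1.0 + percent_increase / 100.0) ** units

slope_men = slope_from_percent(15)
slope_women = slope_from_percent(18)
or_men = odds_ratio_over_range(15, 16 - 4)
or_women = odds_ratio_over_range(18, 16 - 4)
print(round(slope_men, 3), round(slope_women, 3))
print(round(or_men, 2), round(or_women, 2))
```

Under these assumptions the odds of less than good health for someone always in the lowest class would be roughly five times (men) and seven times (women) those of someone always in the highest class.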


 
Table 1 Self-rated health at age 33 (percentage and number)
 

 
Table 2 Results of fit of logistic regression
 
Results of the ordinary polytomous model are presented in Table 3 with excellent-rated health as the reference category. For poor health in Model 1, only social class at birth and age 23 are significant for men, and class at age 23 is the only significant variable for women. It is important to note, however, that few subjects rated their health as poor (45 men and 49 women, Table 1). For fair health the results accord with those for the logistic regression in Table 2. Differences between good and excellent-rated health are, as might be expected, of a smaller magnitude than for fair-rated health (Table 3). Model 2 indicates significant associations between lifescore and self-rated health for all logits considered. Using the estimates to derive the logits for adjacent categories indicates that, for men, lifescore has a similar effect on the odds of poorer health for every pair of adjacent categories, whereas for women the estimates for poor and fair health are similar, suggesting a threshold effect. Associations for lifescore remain significant after adjustment for education (apart from poor-rated health for women, Model 3). There is a trend in the parameters, where the largest slope corresponds to the odds of poor versus excellent health and the smallest one corresponds to the odds of good versus excellent health. Model 2 fits the data well for men but less well for women. The residuals reveal that the worst fit is for the logit of good-rated versus excellent-rated health; however, no specific trend was noted for the adjusted residuals.


 
Table 3 Results of fit of ordinary polytomous regression
 
Table 4 shows the cumulative odds models for the associations between self-rated health and social class, lifescore and education. This approach yielded similar results to those of the logistic regression, with respect to significant associations, size of effects for all social class measures (Model 1), no significant interactions (Model 1, results not shown) and no gender differences.


 
Table 4 Results of fit of cumulative odds method
 
Results based on the continuation ratio approach are presented in Table 5. These results are similar to those from the logistic regression model. Model 1 indicates that, apart from class for men at age 16, all other social class measures show a similar significant effect on self-rated health at age 33. The strong association between lifetime SES and self-rated health (Model 2) remains significant also after adjusting for education (Model 3). We tested and did not reject the assumption of parallelism for all three models considered.


 
Table 5 Results of fit of continuation ratio method
 
Table 6 presents results of fitting the adjacent categories approach with uniform association. Here, as well, the results are similar to those from the logistic regression. In Model 1 all four social class measures are significantly associated with self-rated health at age 33, excluding that of age 16 for men, with similar size of effect for each age and sex. Model 2 indicates that an increase of one unit in the lifetime SES score increases the odds of poorer health by 8.9% among men and 11% among women.


 
Table 6 Results of fit of adjacent categories method
 
None of the five approaches considered showed gender differences with respect to the associations examined, and in none of the analyses for lifescore in Model 2 was a significant quadratic term found, indicating a linear association. Similarities were also apparent regarding goodness of fit of the models, where the fit was better for men than for women irrespective of the approach employed, except for Model 1 in the logistic regression analysis. With respect to goodness of fit, the adjacent categories approach fitted the data better, in most cases, than the continuation ratio or the cumulative odds, and the ordinary polytomous approach generally had a better fit than the three former approaches. Among men the fit of the cumulative odds was better than that of the continuation ratio. An inspection of the residuals revealed that in some cases the lack of fit was mainly associated with one cell. For example, in the continuation ratio approach for Model 3 among men, it occurred for the highest logits, where the expected number of subjects in the poor category was far smaller than that observed. A lack of fit was apparent in the same cell also for women, in relation to the adjacent categories approach and Model 1; however, in this case the observed number was far smaller than that expected by the model.

The power of the statistical testing carried out under the different approaches is influenced, as in other cases of statistical inference, by the sample size and the effect size. In our analyses both the sample and effect sizes are large and so we considered whether our findings were robust with a smaller sample and effect size. Focusing on the association between lifetime SES and self-rated health among men, we therefore considered two additional situations: the first in which the same bivariate association prevailed between these two variables but the sample size was reduced to 10% of the original male sample (i.e. 346), and the second in which, for the reduced sample, there was also a smaller effect. We compared the results of the logistic regression model with those of the adjacent categories, cumulative odds and continuation ratio methods. For the first situation all methods showed a significant association between lifetime SES and self-rated health. For the second situation the results differed: logistic regression showed no significant association, with an estimated slope of 0.069 (SE 0.0521), whereas the other methods showed a significant association. For example, the estimated slope was 0.073 (SE 0.027) for the adjacent categories model and 0.099 (SE 0.035) for the cumulative odds model. The reduction in the sample size resulted in only four individuals in the poor category for self-rated health. We repeated the analyses described above after combining the poor and fair categories and found similar results.
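The contrast in significance can be reproduced from the reported slopes and standard errors. The sketch below assumes a two-sided Wald test based on the normal distribution at the 5% level; the text does not state which test was used, so this is an assumption made for illustration.

```python
import math

def wald_test(estimate, standard_error):
    """Return the Wald z statistic and its two-sided normal p-value."""
    z = estimate / standard_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Slopes and standard errors as reported for the reduced sample with smaller effect.
reported = {
    "logistic regression": (0.069, 0.0521),
    "adjacent categories": (0.073, 0.027),
    "cumulative odds": (0.099, 0.035),
}
results = {name: wald_test(est, se) for name, (est, se) in reported.items()}
for name, (z, p) in results.items():
    print(f"{name}: z = {z:.2f}, p = {p:.4f}")
```

The logistic slope is of a similar size to the adjacent categories slope, but its standard error is about twice as large, which is what moves it below the conventional significance threshold.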


    Discussion
 
The first main finding from this comparison is the similarity of results and conclusions that emerge from the five statistical approaches. The results are of course specific to self-rated health and conclusions may differ for other health measures that are frequently dichotomized (including grouped continuous data, such as body mass index and ordered data, such as a disease severity classification). In addition, the results apply to the specific statistical approaches selected for comparison. These represent a wide range of methods, but they do not include all the possible approaches40 mentioned in the statistical section above. Moreover, only a single dichotomization was examined (fair and poor versus good and excellent health). This arbitrary cut-off relates to the distribution of self-rated health at this particular age. Of course our results apply to a specific age group. At older ages, the distribution of responses will change, with fewer individuals rating their health as excellent and good. It is possible that different categorizations for self-rated health may be appropriate at older ages, but we are unable to speculate on this since we do not know whether our conclusions will be affected by the distribution of responses. Further work on older population samples would be necessary to resolve this issue.

Logistic regression and other ordered categorical models
The results yielded by the logistic regression approach were similar to those obtained by the other methods. The logistic regression analysis was based on a dichotomization of the self-rated health variable, which involves loss of information, and in addition, the ordinal nature of the variable is not incorporated. Both aspects may lead to a reduction in statistical efficiency.21,24 However, only small differences in power and efficiency were evident. These small differences are related to the relatively large sample size. Using simulations Armstrong and Sloan41 compared results from logistic regression with the cumulative odds and continuation ratio. They concluded that logistic regression involved only a slight reduction in power compared with the other approaches, when the ordinal scale is collapsed into a dichotomy with equal numbers in each category. Dichotomization in our study resulted in about 12% in one category and 88% in the other. However, our sample size was far larger than that of Armstrong and Sloan.41 The similarity of results together with the good model fit presented in the present paper provides further evidence of the robustness of the logistic regression results and justifies previous work carried out with the 1958 birth cohort.9,10 Further, given the simplicity and wide use of logistic regression, particularly in epidemiology, our results support this practice within the context of a large sample size. However, as was illustrated by an example, in situations characterized by either a smaller sample, or a smaller effect size or both, results from logistic regression may differ from those yielded by the other methods.

The findings of the logistic regression analyses regarding our main research questions, were confirmed by the other methods, in respect of the accumulation and magnitude of effect of socioeconomic conditions on self-rated health, interactive effects, the combined effects of lifetime SES and education, and gender differences.

Ordered categorical models
The methods which incorporated the categorical nature of self-rated health also gave similar results. Ananth and Kleinbaum24 used data with a single covariate and compared a number of approaches including the continuation ratio, the cumulative odds, the ordinary polytomous and the adjacent categories. The results from the different approaches were similar. Two other studies, one based on simple data with a single covariate42 and the other based on more complex data43 reported similarity of findings for several analytical approaches. An additional study44 used simulations to compare results based on proportional odds to those of the stereotype model. It concluded that the performance of the first approach was superior to that of the second. Our analysis, which is based on a large data set and a number of covariates, provides additional support for the similarity of results of the various methods.

All four approaches examined fit the data for men rather well, with a less good fit for women and an indication of better fit for the adjacent categories approach than the continuation ratio or the cumulative odds approaches. However, the deviances reported for each approach are not completely comparable, and thus cannot be used as the only measure of goodness of fit. Further, while we have used a modified test of goodness of fit when the number of covariate patterns was large, the resulting χ² tests were based on tables in which the expected number of cases was smaller than five in a number of cells. The small number of expected cases was due to the small number of subjects who rated their health as poor. This, together with the sensitivity of the χ² test to a large sample size such as ours, leads to model selection which depends on scrutiny of the basic relationships, graphical representations, conceptualization of the response variable and, most importantly, the specific research questions. The continuation ratio, cumulative odds and adjacent categories methods are more parsimonious than ordinary polytomous regression. The continuation ratio approach assumes that each category of self-rated health is of intrinsic interest, while the adjacent categories and cumulative odds approaches imply an underlying continuous attribute. The assumption of parallelism that was found to be appropriate for the continuation ratio model suggests that the associations hold across different comparisons (each health category versus better health). With the ordinary polytomous models, by contrast, there was limited evidence, for women but not men, for the existence of a threshold in the association between cumulative socioeconomic conditions and self-rated health. Our analyses did not indicate that one method was strongly preferable to any other, but were suggestive of monotonic relationships throughout the categories of self-rated health. This finding accords with recently published results on the continuity of self-rated health17,19 and contrasts with previously reported studies suggesting that there are different predictors for good and for less than good health.18 However, when self-rated health was used to predict mortality, each decrease in the level of rating was found to incrementally increase the risk of mortality.45,46

In conclusion, results concerning the relationship between socioeconomic position and self-rated health yielded by a logistic regression model were confirmed by alternative statistical methods which incorporate the ordered nature of self-rated health. Similarity of results was found regarding size and significance of main effects, type of association and interactive effects. Methods based on self-rated health with four categories suggest that self-rated health forms a continuum.


    Acknowledgments
 
The research was supported by the UK Economic and Social Research Council under the Health Variations Programme (L128251021). The Canadian Institute for Advanced Research supports C. Power as a Weston Fellow. Data acknowledgement: Centre for Longitudinal Studies, Institute of Education, National Child Development Study Composite File including selected Perinatal Data and sweeps one to five [computer file]. National Birthday Trust Fund, National Children's Bureau, City University Social Statistics Research Unit [original data producers]. Colchester Essex: The Data Archive [distributor], 21 June 1994. SN:3148.


    References
 
1 Allied Dunbar National Fitness Survey. London: Sports Council and Health Education Authority, 1992.

2 Power C, Manor O, Fox AJ. Health and Class: The Early Years. London: Chapman and Hall, 1991.

3 Moller L, Kristensen TS, Hollnagel H. Self rated health as a predictor of coronary heart disease in Copenhagen, Denmark. J Epidemiol Community Health 1996;50:423–28.

4 Fylkesnes K. Determinants of health care utilization – visits and referrals. Scand J Soc Med 1993;21:40–50.

5 Idler EL, Benyamini Y. Self rated health and mortality: a review of twenty-seven community studies. J Health Soc Behav 1997;38:21–37.

6 Miilunpalo S, Vuori I, Oja P, Pasanen M, Urponen H. Self-rated health status as a health measure: the predictive value of self-reported health status on the use of physician services and on mortality in the working age population. J Clin Epidemiol 1997;50:517–28.

7 Krause NM, Jay GM. What do global self-rated health items measure? Med Care 1994;32:930–42.

8 Lundberg O, Manderbacka K. Assessing reliability of a measure of self-rated health. Scand J Soc Med 1996;24:218–24.

9 Power C, Matthews S, Manor O. Inequalities in self-rated health in the 1958 birth cohort: life time social circumstances or social mobility? Br Med J 1996;313:449–53.

10 Power C, Matthews S, Manor O. Inequalities in self-rated health: explanations from different stages in life. Lancet 1998;351:1009–14.

11 Mackenbach JP, Kunst AE, Cavelaars AEJM, Groenhof F, Geurts JJM. Socioeconomic inequalities in morbidity and mortality in western Europe. Lancet 1997;349:1655–59.

12 Shetterly S, Baxter J, Mason LD, Hamman RF. Self rated health among Hispanic vs non-Hispanic white adults: the San Luis Valley Health and Aging Study. Am J Public Health 1996;86:1798–801.

13 Rahkonen O, Arber S, Lahelma E. Health inequalities in early adulthood: a comparison of young men and women in Britain and Finland. Soc Sci Med 1995;41:163–71.

14 West P. Inequalities? Social class differentials in health in British youth. Soc Sci Med 1988;27:291–96.

15 Arber S. Comparing inequalities in women's and men's health: Britain in the 1990s. Soc Sci Med 1997;44:773–87.

16 Macran S, Clarke L, Sloggett A, Bethune A. Women's socioeconomic status and self-assessed health – identifying some disadvantaged groups. Soc Health Ill 1994;16:182–208.

17 Manderbacka K, Lahelma E, Martikainen P. Examining the continuity of self-rated health. Int J Epidemiol 1998;27:208–13.

18 Smith AMA, Shelley JM, Dennerstein L. Self rated health: biological continuum or social discontinuity. Soc Sci Med 1994;39:77–83.

19 Mackenbach JP, Van Den Bos J, Joung IMA, Van De Mheen H, Stronks K. The determinants of excellent health: different from the determinants of ill-health? Int J Epidemiol 1994;23:1273–81.

20 McCullagh P. Regression models for ordinal data (with discussion). J R Statist Soc B 1980;42:109–42.

21 Agresti A. Analysis of Ordinal Categorical Data. New York: Wiley, 1984.

22 McCullagh P, Nelder JA. Generalized Linear Models (2nd Edn). London: Chapman and Hall, 1989.

23 Anderson JA, Philips PR. Regression, discrimination, and measurement models for ordered categorical variables. Appl Statist 1981;30:22–31.

24 Ananth CV, Kleinbaum DG. Regression models for ordinal responses: a review of methods and applications. Int J Epidemiol 1997;26:1323–33.

25 Power C, Manor O, Matthews S. Duration and timing of exposure: effects of socio-economic environment on adult health. Am J Public Health 1999;89:1059–65.

26 Agresti A. Categorical Data Analysis. New York: Wiley, 1990.

27 Walker SH, Duncan DB. Estimation of the probability of an event as a function of several independent variables. Biometrika 1967;54:167–79.

28 Fienberg S. The Analysis of Cross-Classified Categorical Data, 2nd Edn. Cambridge MA: MIT Press, 1980.

29 Läärä E, Matthews JNS. The equivalence of two models for ordinal data. Biometrika 1985;72:206–07.

30 Goodman LA. The analysis of dependence in cross-classifications having ordered categories, using log-linear models for frequencies and log-linear models for odds. Biometrics 1983;39:149–60.

31 Peterson BL, Harrell FE. Partial proportional odds models for ordinal response variables. Appl Statist 1990;39:205–17.

32 Anderson JA. Regression and ordered categorical variables (with discussion). J R Statist Soc B 1984;46:1–30.

33 Fielding A. Scoring functions for ordered classifications in statistical analysis. Quality Quantity 1993;27:1–17.

34 Gilula Z. Grouping and association in two-way contingency tables: a canonical correlation analytic approach. J Am Stat Ass 1986;81:773–79.

35 Manor O, Matthews S, Power C. Comparing measures of health inequality. Soc Sci Med 1997;45:761–71.

36 Butler NR, Bonham DG. Perinatal Mortality. Edinburgh: Livingstone, 1963.

37 Ferri E (ed). Life at 33: The Fifth Follow up of the National Child Development Study. London: National Children's Bureau, 1993.

38 Wax Y. Collinearity diagnosis for a relative risk regression analysis: An application to assessment of diet-cancer relationship in epidemiological studies. Stat Med 1992;11:1273–87.

39 Hosmer DW, Lemeshow S. Applied Logistic Regression. New York: Wiley, 1989.

40 Wagstaff A, van Doorslaer E. Measuring inequalities in health in the presence of multiple-category morbidity indicator. Health Econ 1994;3:281–91.

41 Armstrong BG, Sloan M. Ordinal regression models for epidemiologic data. Am J Epidemiol 1989;129:191–204.

42 Cox C, Chuang C. A comparison of chi-square partitioning and two logit analyses of ordinal pain data from a pharmaceutical study. Stat Med 1984;3:273–85.

43 Greenwood G, Farewell V. A comparison of regression models for ordinal data in an analysis of transplanted-kidney function. Can J Stat 1988;16:325–35.

44 Holtbrugge W, Schumacher M. A comparison of regression models for the analysis of ordered categorical data. Appl Statist 1991;40:249–59.

45 Pijls LTJ, Feskens EJM, Kromhout D. Self-rated health, mortality and chronic disease in elderly men: the Zutphen study, 1985–1990. Am J Epidemiol 1993;138:840–48.

46 McCallum J, Shadbolt B, Wang D. Self reported health and survival: a 7-year follow up study of Australian elderly. Am J Public Health 1994;84:1100–05.