1 Clinical Research Unit, Hvidovre University Hospital, University of Copenhagen, Hvidovre, Denmark.
2 Department of Community Medicine, Malmö University Hospital, Lund University, Malmö, Sweden.
Received for publication October 3, 2003; accepted for publication August 30, 2004.
ABSTRACT
data interpretation, statistical; epidemiologic methods; hierarchical model; logistic models; odds ratio; residence characteristics
INTRODUCTION
The partition of variance at different levels (e.g., neighborhood and individual) is the sine qua non of multilevel regression analysis, and its consideration is relevant for both statistical reasons (improved estimation) and substantive epidemiologic reasons (quantification of the importance of the neighborhoods for understanding individual health) (5). However, contrary to normally distributed continuous variables, components of variance are tricky to investigate when it comes to dichotomous response variables. Forcing classical interpretative schemes wastes information and may be inappropriate. New measures are needed in order to quantify effects and ultimately provide a better understanding of the data.
In this paper, our aim is to highlight two measures previously described (6): the median odds ratio (MOR) and the interval odds ratio (IOR). These measures facilitate the integration and presentation of both fixed and random effects in logistic regression.
MOTIVATION FOR NEW MEASURES OF EFFECT SIZE IN TWO-LEVEL MODELS WITH DICHOTOMOUS RESPONSES
When formulating a two-level model, it is common to assume normality for the cluster-level (level 2) variation and to assume independence of units within-cluster (level 1) conditional on the cluster variable, thus generating a model in which individuals are marginally correlated within clusters. In the multivariate normal case, the interpretation of both fixed effects and random effects (individual residual error and cluster variation) is simple. This is not true in the case of dichotomous response variables, because of the nonlinear relation between the covariates and the response variable (typically a logit relation).
Choosing a logistic regression model leads to the choice between two different odds ratio interpretations of the fixed-effects parameters, the subject-specific interpretation and the population-averaged interpretation (7, 8). Taking a population-averaged approach, heterogeneity is considered a nuisance, whereas the subject-specific approach opens up the possibility of quantification of heterogeneity based on the two-level model (6).
Quantifying the variance component is a different matter. Usually, researchers either calculate a so-called intraclass correlation coefficient based on an estimate of the variance components and the residual variance or report the variance component, neither of which is very useful. Many different measures of intraclass correlation have been suggested (9). Nevertheless, intraclass correlations have serious interpretational drawbacks (10) for binary responses. First, the intraclass correlation may be interpreted as the proportion of the total variation attributable to variation between clusters, or as correlation between persons within the same cluster. That is, the intraclass correlation does not convey information regarding variation between clusters. Therefore, it is not a very useful measure when determining whether or not clustering is an important factor. Second, the intraclass correlation is not comparable with the fixed effects, which have odds ratio interpretations. This is unfortunate, because random effects are not very different from fixed effects in nature; random effects are fixed effects with an additional distributional assumption. Therefore, it seems natural to quantify variation between the random effects using odds ratios. Alternatively, one may use the variance component itself, but it is quite difficult to interpret, since it is on the log odds ratio scale.
In this paper, we unify interpretations of fixed and random effects in a subject-specific approach, explaining the use of the MOR and IOR measures in the case of a two-level logistic regression model.
The model
Consider a population of N individuals. Each individual has a vector of covariates, x, and each individual belongs to one of K clusters. The parameters corresponding to the covariates are in the vector β. The K mutually independent cluster variables, u1, u2, ..., uK, are not to be estimated, since they are not of interest per se; rather, it is the variation between clusters that needs to be quantified. Therefore, a normal distribution is assumed for the cluster variables, and the parameters characterizing this distribution can then be used to characterize the heterogeneity induced by the random effects.
The response variable, Y, is a dichotomous variable. That is, for each individual, it is observed whether Y = 0 or 1. The model has two levels.
Level 1: For a person with covariate vector x, corresponding to the kth cluster, the probability of observing Y = 1 is
$$P(Y = 1 \mid x, u_k) = \frac{\exp(\eta(x, u_k))}{1 + \exp(\eta(x, u_k))}.$$
Level 2: For that individual, the second-level equation is
$$\eta(x, u_k) = \beta x + u_k,$$
where $u_k \sim N(0, \sigma^2)$. The covariate vector, x, contains individual-level (level 1) and cluster-level (level 2) covariates. Although there is no difference in terms of formulating the model, there are differences when interpreting the effects of variables varying within clusters and cluster-level variables.
For variables varying within a cluster, the usual odds ratio interpretations apply for comparisons of persons belonging to the same cluster; for example, a gender effect may be interpreted as an odds ratio between a woman and a man belonging to the same cluster and with the same covariates, except for gender.
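The within-cluster interpretation can be checked numerically. The following is a minimal sketch, assuming a logistic link with linear predictor βx + u_k; the coefficient values, covariate coding, and function names are hypothetical and chosen only for illustration. It shows that the odds ratio between two persons in the same cluster who differ only in one covariate equals the exponential of that covariate's coefficient, whatever the value of the cluster effect.

```python
import math

def linear_predictor(beta, x, u_k):
    """Linear predictor: sum of beta_j * x_j plus the cluster effect u_k."""
    return sum(b * xi for b, xi in zip(beta, x)) + u_k

def prob_y1(beta, x, u_k):
    """Probability of Y = 1 under a logistic link."""
    eta = linear_predictor(beta, x, u_k)
    return 1.0 / (1.0 + math.exp(-eta))

def odds(p):
    return p / (1.0 - p)

# Hypothetical coefficients: [gender indicator, low-education indicator].
beta = [0.5, -0.2]
x_woman = [1, 0]   # differs from x_man only in the gender indicator
x_man = [0, 0]

# The cluster effect cancels when both persons share the same cluster.
for u_k in (-1.0, 0.0, 0.3, 2.0):
    or_within = odds(prob_y1(beta, x_woman, u_k)) / odds(prob_y1(beta, x_man, u_k))
    print(f"u_k = {u_k:+.1f}: within-cluster OR = {or_within:.6f}")

print(f"exp(beta_gender) = {math.exp(beta[0]):.6f}")
```

The same cancellation does not occur when the two persons belong to different clusters, which is why cluster-level covariates require a different measure.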
For variables varying on the cluster level, the quantification is more difficult; the usual odds ratio interpretation is incorrect, because it is necessary to compare persons with different random effects, since the variable of interest does not vary between individuals within-cluster. A measure for this situation is described below in the section "The IOR."
It is of interest to quantify the clustering as something other than variation on the underlying linear scale, because this is difficult to interpret and relate to the fixed effects, which are quantified in terms of odds ratios. Therefore, it would be useful to have an odds ratio interpretation of the cluster variation as well. A measure is described below.
The MOR
The MOR quantifies the variation between clusters (the second-level variation) by comparing two persons from two randomly chosen, different clusters. Consider two persons with the same covariates, chosen randomly from two different clusters. The MOR is the median odds ratio between the person of higher propensity and the person of lower propensity.
The MOR is very easy to calculate, because it is a simple function of the cluster variance, $\sigma^2$:
$$\mathrm{MOR} = \exp\left(\sqrt{2\sigma^2}\,\Phi^{-1}(0.75)\right),$$
where $\Phi(\cdot)$ is the cumulative distribution function of the normal distribution with mean 0 and variance 1, $\Phi^{-1}(0.75) \approx 0.6745$ is the 75th percentile, and exp(·) is the exponential function. A theoretical derivation of the formula is provided in the Appendix.
The measure is always greater than or equal to 1. If the MOR is 1, there is no variation between clusters (no second-level variation). If there is considerable between-cluster variation, the MOR will be large. The measure is directly comparable with fixed-effects odds ratios.
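As an illustration of these properties, the MOR can be computed directly from the cluster variance. This is a minimal sketch, assuming the MOR is the exponential of the square root of twice the cluster variance multiplied by the 75th percentile of the standard normal distribution, as derived in the Appendix; the function name and the variance values are hypothetical.

```python
import math
from statistics import NormalDist

def median_odds_ratio(sigma2):
    """MOR for a two-level logistic model with cluster variance sigma2 >= 0."""
    z75 = NormalDist().inv_cdf(0.75)  # 75th percentile of N(0, 1), about 0.6745
    return math.exp(math.sqrt(2.0 * sigma2) * z75)

# Hypothetical cluster variances, from no clustering to strong clustering.
for sigma2 in (0.0, 0.1, 0.5, 1.0, 2.0):
    print(f"sigma^2 = {sigma2:.1f} -> MOR = {median_odds_ratio(sigma2):.2f}")
```

A variance of zero gives an MOR of exactly 1, and the MOR increases with the variance, so it can be read on the same scale as the fixed-effects odds ratios.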
The IOR
The IOR is a fixed-effects measure for quantification of the effect of cluster-level variables. Consider two persons with different cluster-level covariates, x1 and x2. The IOR is an interval for odds ratios between two persons with covariate patterns x1 and x2, covering the middle 80 percent of the odds ratios.
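The IOR is only slightly more difficult to calculate than the MOR. The lower and upper bounds of the interval are
$$\mathrm{IOR}_{\mathrm{lower}} = \exp\left(\beta(x_1 - x_2) + \sqrt{2\sigma^2}\,\Phi^{-1}(0.10)\right)$$
and
$$\mathrm{IOR}_{\mathrm{upper}} = \exp\left(\beta(x_1 - x_2) + \sqrt{2\sigma^2}\,\Phi^{-1}(0.90)\right),$$
where $\Phi^{-1}(0.10) = -1.2816$ and $\Phi^{-1}(0.90) = 1.2816$ are the 10th and 90th percentiles of the normal distribution with mean 0 and variance 1. A theoretical derivation of the formula is provided in the Appendix.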
The interval is narrow if the between-cluster variation is small, and it is wide if the between-cluster variation is large. If the interval contains 1, the cluster variability is large in comparison with the effect of the cluster-level variable. If the interval does not contain 1, the effect of the cluster-level variable is large in comparison with the unexplained between-cluster variation.
According to the model, the odds ratios can take any value between zero and infinity. However, to put meaning into the IOR, it is natural to report an interval that is likely to contain the odds ratio when randomly choosing two persons with two specific sets of covariates. Reporting an 80 percent interval has been suggested (6), and that is reasonable, because it covers a large fraction of the odds ratios. Note that the IOR is not a confidence interval.
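The calculation of the interval can be sketched as follows. The sketch assumes a single cluster-level covariate with coefficient beta, so that the interval is centered at exp(beta × (x1 − x2)) and widened by the square root of twice the cluster variance times the standard normal percentiles; the function name, the `coverage` argument, and all numerical values are hypothetical.

```python
import math
from statistics import NormalDist

def interval_odds_ratio(beta, x1, x2, sigma2, coverage=0.80):
    """Lower and upper bounds covering the middle `coverage` of the odds ratios."""
    tail = (1.0 - coverage) / 2.0
    z_lo = NormalDist().inv_cdf(tail)        # about -1.2816 for 80 percent coverage
    z_hi = NormalDist().inv_cdf(1.0 - tail)  # about +1.2816 for 80 percent coverage
    centre = beta * (x1 - x2)                # fixed effect on the log odds scale
    spread = math.sqrt(2.0 * sigma2)         # SD of the difference u1 - u2
    return math.exp(centre + spread * z_lo), math.exp(centre + spread * z_hi)

# Hypothetical log odds ratio of 1.0 for a dichotomous cluster-level covariate.
for sigma2 in (0.05, 1.5):
    lower, upper = interval_odds_ratio(beta=1.0, x1=1, x2=0, sigma2=sigma2)
    contains_one = lower <= 1.0 <= upper
    print(f"sigma^2 = {sigma2}: IOR = [{lower:.2f}; {upper:.2f}], contains 1: {contains_one}")
```

With the same fixed effect, a small cluster variance gives a narrow interval that excludes 1, whereas a large cluster variance gives a wide interval that contains 1.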
EXAMPLES
Multilevel analysis has provided evidence that the socioeconomic characteristics of the neighborhood environment affect individual risk of ischemic heart disease (12). However, neighborhood variation seems to be low (13).
In Sweden, cost is not a specific determinant of people's choice of a private versus public health-care practitioner, since the county councils support both economically. However, private physicians are outside of the public health-care system and therefore are less susceptible to health-care strategies directed by the county councils. Preferring a public practitioner versus a private practitioner might suggest dysfunction in some parts of the public health-care system, but it might also be explained by individual preferences, demands, and expectations related to socioeconomic position. The greater confidentiality of medical records in the private sector is another factor that could explain individual preference. Moreover, area of residence might influence individual decisions over and above individual characteristics.
Using a multilevel approach, we investigated the probability of being hospitalized for ischemic heart disease and the probability of visiting a public practitioner versus a private practitioner among Swedish men. We give two examples, followed by some comments on issues that have not already been covered.
Study population and assessment of variables
The study population consisted of 11,312 men aged 45–85 years residing in 98 of the 110 neighborhoods in the city of Malmö, Sweden. All of the men had visited a physician during the year 1999. Information was obtained from the Register on Health Care Utilization in Skåne, Sweden (14).
For the analysis, the variable "neighborhood" was used as a cluster (level 2) variable. Two individual-level (level 1) explanatory variables were considered: "age," which was a four-category variable with the categories 65–69, 70–74, 75–79, and 80–85 years, and "education," which was a dichotomous indicator of having 9 or fewer years of schooling. A cluster-level (level 2) explanatory variable, "neighborhood education," was also used in the analyses. This variable was a dichotomous indicator of the neighborhood's educational level being below the median for all neighborhoods. In example 1, the response variable was an indicator of whether a person had been hospitalized for ischemic heart disease (coded as 1) or not (coded as 0). Ischemic heart disease was defined by hospital discharge diagnosis in 1999. The relevant codes according to the International Classification of Diseases, Tenth Revision, are I20 (angina pectoris), I21 (acute myocardial infarction), I22 (subsequent myocardial infarction), and I50 (heart failure). In example 2, the response variable was whether a person had visited a public physician (coded as 1) or not (coded as 0). We restricted our analysis to persons who had used the health-care system and visited a general practitioner at least once during the year 1999.
Parameters in the random-effects model were estimated using restricted iterative generalized least squares. The MLwiN software package (15), version 1.1, was used to perform the analyses. Extrabinomial variation was explored systematically in all of the models, and there was no indication of either under- or overdispersion.
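Population-averaged parameters were calculated on the basis of the approximate formula (7)
$$\beta_{PA} \approx \frac{\beta_{SS}}{\sqrt{1 + c^2\sigma^2}}, \qquad c = \frac{16\sqrt{3}}{15\pi},$$
where $\beta_{PA}$ is the population-averaged parameter and $\beta_{SS}$ is the subject-specific parameter. The cluster variance is $\sigma^2$.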
Example 1: hospitalization for ischemic heart disease
In example 1, two models are considered. In model 1, age and individual education were included together with a random neighborhood effect. Model 2 was an extension of model 1 that also included the cluster-level covariate neighborhood education.
In both analyses, it appeared that older people were more likely to be hospitalized than younger people and that less-educated persons were more likely to be hospitalized than the more educated. The estimates shown in table 1 were very similar in the two models, but there were some differences (discussed in detail below). The parameter estimates were transformed into odds ratios, which are shown in table 2.
Note that the above odds ratios are correct only for comparisons of persons belonging to the same cluster. When comparing persons from different clusters, it is necessary to calculate an IOR or leave the subject-specific interpretation and consider population-averaged odds ratios. The population-averaged odds ratio is the odds ratio between two persons from different clusters. In models 1 and 2, the estimates on the linear predictor scale are only shrunk by factors of
and
respectively. This indicates an apparently small amount of heterogeneity of the clusters. Thus, in effect, the attenuation is hardly visible in this analysis.
In model 1, the cluster heterogeneity is interpreted in the following way. Consider two randomly chosen persons with the same covariates from two different clusters (e.g., two more-educated persons aged 65–69 years) and conduct the hypothetical experiment of calculating the odds ratio for the person with the higher propensity to be hospitalized versus the person with the lower propensity. Repeating this comparison of two randomly chosen persons will lead to a series of odds ratios. This distribution is estimated when estimating the parameters, and it is shown in figure 1 (on the log scale). The median of these odds ratios between the person with a higher propensity and the person with a lower propensity is estimated to be 1.17 in model 1. This is a low odds ratio, and it suggests that the clustering effect is small even without inclusion of any cluster-level covariates. When neighborhood education is included (model 2), the unexplained cluster heterogeneity (comparing persons from neighborhoods of the same kind; for example, both neighborhoods with a high level of education) decreases, yielding an MOR of 1.09, which is a very low odds ratio. Thus, there is very little variation between neighborhoods in the propensity for hospitalization for ischemic heart disease.
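The hypothetical experiment described above can be mimicked by simulation. The following is a sketch only: the cluster variance is backed out from the rounded MOR of 1.17 reported for model 1, not taken from the estimated model, and the function names, number of pairs, and seed are arbitrary choices. Pairs of cluster effects are drawn from a normal distribution, the odds ratio for the higher versus the lower propensity is computed for each pair, and the median of the simulated odds ratios is compared with the closed-form MOR.

```python
import math
import random
import statistics
from statistics import NormalDist

Z75 = NormalDist().inv_cdf(0.75)  # 75th percentile of N(0, 1)

def mor_formula(sigma2):
    """Closed-form MOR for cluster variance sigma2."""
    return math.exp(math.sqrt(2.0 * sigma2) * Z75)

def simulate_mor(sigma2, n_pairs=100_000, seed=1):
    """Median odds ratio over randomly drawn pairs of cluster effects."""
    rng = random.Random(seed)
    sd = math.sqrt(sigma2)
    odds_ratios = [
        math.exp(abs(rng.gauss(0.0, sd) - rng.gauss(0.0, sd)))
        for _ in range(n_pairs)
    ]
    return statistics.median(odds_ratios)

# Assumption: variance implied by the rounded MOR of 1.17 (illustration only).
sigma2 = (math.log(1.17) / Z75) ** 2 / 2.0

print(f"implied cluster variance: {sigma2:.4f}")
print(f"closed-form MOR:          {mor_formula(sigma2):.3f}")
print(f"simulated median OR:      {simulate_mor(sigma2):.3f}")
```

The simulated median should agree with the closed-form value up to Monte Carlo error, which illustrates that the MOR summarizes the whole distribution of pairwise odds ratios by its median.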
and some measure of within-cluster association. Contrary to the MOR and the IOR, these two measures do not convey much information regarding the effect of clustering on the likelihood of being hospitalized.
Example 2: visits to a public or private physician
In example 2, two models are considered. In model 3, age and individual education were included with a random neighborhood effect. Model 4 was an extension of model 3 that also included the cluster-level covariate neighborhood education.
The parameter estimates are shown in table 1, and the odds ratios are shown in table 2. The odds ratios for individual education were 1.28 and 1.27 in models 3 and 4, respectively; this suggests a higher propensity to visit a public physician for the less educated, conditional on age and neighborhood.
However, the results regarding clustering are quite different from those in example 1, and this implies substantial differences between the subject-specific and population-averaged estimates in this example. In models 3 and 4, the linear predictor parameters are attenuated by factors of
and
respectively. This yields population-averaged effects of 1.21 for individual education in both models. These effects are for comparisons of persons of the same age belonging to different clusters.
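The size of the attenuation can be approximated from the figures reported in the text. This is a sketch under two stated assumptions: that the approximation takes the commonly used form in which the subject-specific parameter is divided by the square root of 1 + c²σ², with c = 16√3/(15π), and that the cluster variances can be backed out from the rounded MORs of 3.61 and 3.33 reported for models 3 and 4. The function names are hypothetical, and the results are approximations rather than the estimates from the fitted models.

```python
import math
from statistics import NormalDist

C2 = (16.0 * math.sqrt(3.0) / (15.0 * math.pi)) ** 2  # about 0.346

def sigma2_from_mor(mor):
    """Cluster variance implied by a given MOR (inverse of the MOR formula)."""
    z75 = NormalDist().inv_cdf(0.75)
    return (math.log(mor) / z75) ** 2 / 2.0

def attenuation_factor(sigma2):
    """Multiplier taking a subject-specific log odds ratio to a population-averaged one."""
    return 1.0 / math.sqrt(1.0 + C2 * sigma2)

def population_averaged_or(or_ss, sigma2):
    """Approximate population-averaged odds ratio from a subject-specific one."""
    return math.exp(attenuation_factor(sigma2) * math.log(or_ss))

# (label, reported MOR, reported subject-specific OR for individual education)
reported = (("model 3", 3.61, 1.28), ("model 4", 3.33, 1.27))

for label, mor, or_ss in reported:
    sigma2 = sigma2_from_mor(mor)
    factor = attenuation_factor(sigma2)
    or_pa = population_averaged_or(or_ss, sigma2)
    print(f"{label}: sigma^2 ~ {sigma2:.2f}, factor ~ {factor:.2f}, PA OR ~ {or_pa:.2f}")
```

Under these assumptions, the population-averaged odds ratios for individual education come out close to the value of 1.21 reported for both models.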
In model 3, for two persons with the same individual-level covariates, the MOR between the person living in the neighborhood with the higher propensity to visit a public physician and the person living in the neighborhood with the lower propensity is 3.61. This is a high odds ratio, suggesting that the heterogeneity is substantial. Including neighborhood education as a covariate reduces the unexplained heterogeneity between neighborhoods to an MOR of 3.33, which is still high. Thus, the propensity to visit a public physician varies a great deal between neighborhoods. This is also reflected in the IOR, which is very broad: [0.28; 27.3]. The interpretation is that if one is randomly selecting two persons, one from a low-education neighborhood and one from a high-education neighborhood, and comparing their odds of having visited a public physician, the middle 80 percent of the odds ratios will lie within this interval. The interval contains 1, which implies that neighborhood education does not account for a substantial amount of the neighborhood heterogeneity. In addition, the interval is quite broad, reflecting a large amount of unexplained variation between neighborhoods in the propensity to visit a public physician. Other cluster-level variables are needed to explain the cluster heterogeneity.
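The reported IOR and MOR for model 4 can be checked against each other. This is a sketch that assumes the 80 percent IOR is centered, on the log scale, at the fixed effect of neighborhood education and has half-width equal to 1.2816 times the square root of twice the cluster variance; the function name is hypothetical, and the inputs are the rounded values quoted in the text.

```python
import math
from statistics import NormalDist

Z90 = NormalDist().inv_cdf(0.90)  # about 1.2816
Z75 = NormalDist().inv_cdf(0.75)  # about 0.6745

def decompose_ior(lower, upper):
    """Split an 80 percent IOR into its fixed-effect centre and random-effect spread.

    Returns (centre, spread): centre is the log odds ratio for the cluster-level
    covariate; spread is the square root of twice the cluster variance.
    """
    centre = 0.5 * (math.log(lower) + math.log(upper))
    spread = (math.log(upper) - math.log(lower)) / (2.0 * Z90)
    return centre, spread

centre, spread = decompose_ior(0.28, 27.3)
implied_or = math.exp(centre)         # subject-specific OR for neighborhood education
implied_mor = math.exp(spread * Z75)  # MOR implied by the width of the interval

print(f"implied fixed-effect OR: {implied_or:.2f}")
print(f"implied MOR:             {implied_mor:.2f}")
```

The MOR implied by the width of the interval should be close to the reported value of 3.33, which shows that the IOR combines the fixed effect (its center) with the same cluster heterogeneity that the MOR summarizes (its width).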
A population-averaged approach might lead to a different conclusion. The population-averaged effect of neighborhood education is
This is by far the most important of the fixed effects, and it suggests that neighborhood education is indeed an important variable when analyzing the propensity to visit a public physician. However, this is not completely in agreement with the previous interpretations based on the random-effects approach. The reason for this apparent discrepancy is the large unexplained cluster heterogeneity, which in the population-averaged approach is only considered a nuisance.
Other issues regarding the MOR and IOR
In a previous study (6), the IOR for being prescribed morphine when calling two different doctors and giving "back pain" versus "other" as the reason for the call was [2.08; 22.9]. This is an example of a large cluster effect (variation between physicians) along with a strong fixed effect (cause for the call).
Another, perhaps less interesting situation is also possible: the situation where there is relatively little variation between clusters and there is little or no effect of the cluster-level covariate. Although this may be a common situation, it is also the one that is most likely to be overlooked because of type II error.
It is also possible to include continuous cluster-level covariates using the formulas given above in the section "The IOR," and the interpretation is straightforward.
DISCUSSION
Regarding interpretation of fixed effects, the choice between generalized estimating equations and the random-effects model is the choice between population-averaged and subject-specific interpretations. Which one to choose depends entirely on the substantive research question. In examples 1 and 2, the effect of individual education is conditional on neighborhood. A subject-specific approach is proper, because it facilitates measures of effect on the individual level: What is the direct effect of having a high individual level of education versus a low level for a person under his/her specific conditions? In this sense, the subject-specific parameters are more "clean" measures of the effect of individual education, because they are not attenuated by unmeasured heterogeneity.
A general argument favoring the random-effects model is the possibility of quantification of heterogeneity in an intuitively attractive way. As is shown in the four models in the examples, the heterogeneity may be quantified by characteristics from the distribution of the odds ratios between pairs of randomly chosen persons from different neighborhoods. This provides a measure of heterogeneity on a scale that is familiar to researchers who have worked with the logistic regression model, namely odds ratios.
The usual odds ratio interpretations (conditioning on all other covariates and random effects) are proper for covariates that vary within-cluster, whereas they are improper for cluster-level covariates, because it is impossible to make comparisons within-cluster. That is, all comparisons must be made between persons belonging to two different clusters, and thus the odds ratio is no longer a fixed quantity but a random variable. In this way, the heterogeneity becomes relevant when quantifying the effect of a cluster-level covariate. The IOR incorporates both the fixed effect and the cluster heterogeneity in an interval, allowing for a more detailed description of the covariate effect.
The MOR and the IOR have been applied in a two-level model for the quantification of between-rater variability in a study of mammographic screening (16). Other studies have used the pairwise odds ratio (17, 18), which is similar to the MOR and IOR in the sense that it is also based on comparisons between pairs of individuals. However, where the MOR and IOR quantify between-cluster heterogeneity, the pairwise odds ratio quantifies association (concordance/discordance) between pairs of individuals within-cluster.
Both the MOR and the IOR are very easy to calculate, since they are simple functions of the parameters in the model. Thus, no additional analyses are necessary. A pocket calculator with exponential and square root functions is sufficient for calculating the measures from the parameters.
The MOR and the IOR may be extended to higher-order multilevel models, models with a nonhierarchical structure, and models with random slopes. They can also be extended to log-linear (Poisson) and linear (normal) responses, as well as to models with nonnormal random effects.
ACKNOWLEDGMENTS
The authors express their gratitude to Prof. Niels Keiding of the Department of Biostatistics, University of Copenhagen, for his comments on an early draft of this paper. The authors also thank Dr. Basile Chaix of the Research Team on the Social Determinants of Health and Health Care, French National Institute of Health and Medical Research, for his comments on the final manuscript.
APPENDIX
The median odds ratio (MOR)
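The odds ratio between two persons with identical covariates from two different clusters is $\exp(u_1 - u_2)$, where $u_1$ and $u_2$ are the two random cluster variables. Consequently, the odds ratio for the person with the higher propensity versus the person with the lower propensity is $\exp(|u_1 - u_2|)$. Since the two cluster variables are assumed to be independent and normally distributed with mean zero and variance $\sigma^2$, their difference is normally distributed with mean zero and variance $2\sigma^2$, and the distribution of $\exp(|u_1 - u_2|)$ may be characterized by the cumulative distribution function, F, in the following way. For $z \geq 1$,
$$F(z) = P\left(\exp(|u_1 - u_2|) \leq z\right) = P\left(|u_1 - u_2| \leq \ln z\right) = 2\Phi\left(\frac{\ln z}{\sqrt{2\sigma^2}}\right) - 1,$$
where $\Phi(\cdot)$ is the cumulative distribution function for the standard normal distribution, that is, a normal distribution with mean 0 and variance 1. Thus, the density function, f(·), for the distribution of $\exp(|u_1 - u_2|)$ becomes
$$f(z) = \frac{1}{z\sqrt{\pi\sigma^2}}\exp\left(-\frac{(\ln z)^2}{4\sigma^2}\right), \qquad z \geq 1.$$
The MOR is the median of this distribution, so it can be calculated as the solution to the equation F(z) = 0.5, which leads to
$$\mathrm{MOR} = \exp\left(\sqrt{2\sigma^2}\,\Phi^{-1}(0.75)\right).$$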
The interval odds ratio (IOR)
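The odds ratio between two persons from two different clusters with covariates $x_1$ and $x_2$ is $\exp(\beta(x_1 - x_2) + (u_1 - u_2))$. Since the two cluster variables are independent and normally distributed, the odds ratio is lognormally distributed with cumulative distribution function G, where, for z > 0,
$$G(z) = P\left(\exp(\beta(x_1 - x_2) + (u_1 - u_2)) \leq z\right) = \Phi\left(\frac{\ln z - \beta(x_1 - x_2)}{\sqrt{2\sigma^2}}\right).$$
Consequently, the density, g(·), becomes
$$g(z) = \frac{1}{z\sqrt{4\pi\sigma^2}}\exp\left(-\frac{(\ln z - \beta(x_1 - x_2))^2}{4\sigma^2}\right).$$
The a-percentile in the distribution of the odds ratio is the solution to G(z) = a, which leads to
$$z_a = \exp\left(\beta(x_1 - x_2) + \sqrt{2\sigma^2}\,\Phi^{-1}(a)\right).$$
In particular, the 10th percentile is
$$\exp\left(\beta(x_1 - x_2) - 1.2816\sqrt{2\sigma^2}\right)$$
and the 90th percentile is
$$\exp\left(\beta(x_1 - x_2) + 1.2816\sqrt{2\sigma^2}\right).$$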
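The two derivations can be verified numerically by a round trip through the cumulative distribution functions. The sketch below assumes that the difference between two independent cluster effects is normal with mean zero and variance 2σ², so that F(z) = 2Φ(ln z / √(2σ²)) − 1 for the higher-versus-lower odds ratio and G(z) = Φ((ln z − shift) / √(2σ²)) for the odds ratio between two covariate patterns, where shift stands for β(x1 − x2). The function names and the numerical values are hypothetical.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution

def F(z, sigma2):
    """CDF of exp(|u1 - u2|) for z >= 1 and sigma2 > 0."""
    return 2.0 * PHI.cdf(math.log(z) / math.sqrt(2.0 * sigma2)) - 1.0

def G(z, shift, sigma2):
    """CDF of exp(shift + u1 - u2) for z > 0 and sigma2 > 0."""
    return PHI.cdf((math.log(z) - shift) / math.sqrt(2.0 * sigma2))

def odds_ratio_percentile(a, shift, sigma2):
    """Solution z of G(z) = a."""
    return math.exp(shift + math.sqrt(2.0 * sigma2) * PHI.inv_cdf(a))

sigma2 = 1.5   # hypothetical cluster variance
shift = 1.0    # hypothetical value of beta * (x1 - x2)

mor = math.exp(math.sqrt(2.0 * sigma2) * PHI.inv_cdf(0.75))
print(f"F(MOR) = {F(mor, sigma2):.6f}")

for a in (0.10, 0.90):
    z_a = odds_ratio_percentile(a, shift, sigma2)
    print(f"a = {a:.2f}: percentile = {z_a:.3f}, G(percentile) = {G(z_a, shift, sigma2):.6f}")
```

If the formulas are consistent, F evaluated at the MOR returns 0.5, and G evaluated at each percentile returns the corresponding probability.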