Appropriate Assessment of Neighborhood Effects on Individual Health: Integrating Random and Fixed Effects in Multilevel Logistic Regression

Klaus Larsen¹ and Juan Merlo²

¹ Clinical Research Unit, Hvidovre University Hospital, University of Copenhagen, Hvidovre, Denmark.
² Department of Community Medicine, Malmö University Hospital, Lund University, Malmö, Sweden.

Received for publication October 3, 2003; accepted for publication August 30, 2004.


    ABSTRACT
 
The logistic regression model is frequently used in epidemiologic studies, yielding odds ratio or relative risk interpretations. Inspired by the theory of linear normal models, the logistic regression model has been extended to allow for correlated responses by introducing random effects. However, the model does not inherit the interpretational features of the normal model. In this paper, the authors argue that the existing measures are unsatisfactory (and some of them are even improper) when quantifying results from multilevel logistic regression analyses. The authors suggest a measure of heterogeneity, the median odds ratio, that quantifies cluster heterogeneity and facilitates a direct comparison between covariate effects and the magnitude of heterogeneity in terms of well-known odds ratios. Quantifying cluster-level covariates in a meaningful way is a challenge in multilevel logistic regression. For this purpose, the authors propose an odds ratio measure, the interval odds ratio, that takes these difficulties into account. The authors demonstrate the two measures by investigating heterogeneity between neighborhoods and effects of neighborhood-level covariates in two examples—public physician visits and ischemic heart disease hospitalizations—using 1999 data on 11,312 men aged 45–85 years in Malmö, Sweden.

data interpretation, statistical; epidemiologic methods; hierarchical model; logistic models; odds ratio; residence characteristics


Abbreviations: IOR, interval odds ratio; MOR, median odds ratio.


    INTRODUCTION
 
A growing number of epidemiologic studies apply multilevel regression analysis for the investigation of associations between area of residence (e.g., neighborhood) and individual health (1, 2). A majority of studies have focused on traditional measures of association such as fixed effects using regression models for the relation between neighborhood (or cluster) characteristics and individual health. Other multilevel regression analysis approaches have concentrated on determining the components of health variation (3). On some occasions, difficult questions arise, such as "Is it worthwhile to study the association between neighborhood characteristics and health when the neighborhood variation is very small?" (4). Neither pure fixed-effects regression analysis nor calculation of intraclass correlations can provide sufficient insight to answer such questions in a satisfactory way.

The partition of variance at different levels (e.g., neighborhood and individual) is the sine qua non of multilevel regression analysis, and its consideration is relevant for both statistical reasons (improved estimation) and substantive epidemiologic reasons (quantification of the importance of the neighborhoods for understanding individual health) (5). However, contrary to normally distributed continuous variables, components of variance are tricky to investigate when it comes to dichotomous response variables. Forcing classical interpretative schemes wastes information and may be inappropriate. New measures are needed in order to quantify effects and ultimately provide a better understanding of the data.

In this paper, our aim is to highlight two measures previously described (6): the median odds ratio (MOR) and the interval odds ratio (IOR). These measures facilitate the integration and presentation of both fixed and random effects in logistic regression.


    MOTIVATION FOR NEW MEASURES OF EFFECT SIZE IN TWO-LEVEL MODELS WITH DICHOTOMOUS RESPONSES
 
The multivariate normal distribution is an attractive framework for statistical modeling, when data on the response variable are continuous and normal. However, when the response variable is dichotomous, the distribution is incorrect, and therefore conclusions based on the normal distribution are likely to be flawed.

When formulating a two-level model, it is common to assume normality for the cluster-level (level 2) variation and to assume independence of units within-cluster (level 1) conditional on the cluster variable, thus generating a model in which individuals are marginally correlated within clusters. In the multivariate normal case, the interpretation of both fixed effects and random effects (individual residual error and cluster variation) is simple. This is not true in the case of dichotomous response variables, because of the nonlinear relation between the covariates and the response variable (typically a logit relation).

Choosing a logistic regression model leads to the choice between two different odds ratio interpretations of the fixed-effects parameters, the subject-specific interpretation and the population-averaged interpretation (7, 8). Taking a population-averaged approach, heterogeneity is considered a nuisance, whereas the subject-specific approach opens up the possibility of quantification of heterogeneity based on the two-level model (6).

Quantifying the variance component is a different matter. Usually, researchers either calculate a so-called intraclass correlation coefficient based on an estimate of the variance components and the residual variance or report the variance component, neither of which is very useful. Many different measures of intraclass correlation have been suggested (9). Nevertheless, intraclass correlations have serious interpretational drawbacks (10) for binary responses. First, the intraclass correlation may be interpreted as the proportion of the total variation attributable to variation between clusters, or as correlation between persons within the same cluster. That is, it is a relative measure and does not directly convey the magnitude of the variation between clusters. Therefore, it is not a very useful measure when determining whether or not clustering is an important factor. Second, the intraclass correlation is not comparable with the fixed effects, which have odds ratio interpretations. This is unfortunate, because random effects are not very different from fixed effects in nature; random effects are fixed effects with an additional distributional assumption. Therefore, it seems natural to quantify variation between the random effects using odds ratios. Alternatively, one may use the variance component itself, but it is quite difficult to interpret, since it is on the log odds ratio scale.

In this paper, we unify interpretations of fixed and random effects in a subject-specific approach, explaining the use of the MOR and IOR measures in the case of a two-level logistic regression model.

The model
Consider a population of N individuals. Each individual has a vector of covariates, x, and each individual belongs to one of K clusters. The parameters corresponding to the covariates are in the vector $\beta$. The K mutually independent cluster variables, $u_1, u_2, \ldots, u_K$, are not to be estimated, since they are not of interest per se; rather, it is the variation between clusters that needs to be quantified. Therefore, a normal distribution is assumed for the $u_k$, and parameters characterizing this distribution can then be used to characterize the heterogeneity induced by the random effects.

The response variable, Y, is a dichotomous variable. That is, for each individual, it is observed whether Y = 0 or 1. The model has two levels.

Level 1: For a person with covariate vector x, corresponding to the kth cluster, the probability of observing Y = 1 is

$$P(Y = 1 \mid x, u_k) = \frac{\exp(\eta(x, u_k))}{1 + \exp(\eta(x, u_k))}.$$

Level 2: For that individual, the second-level equation is

$$\eta(x, u_k) = \beta x + u_k,$$

where $u_k \sim N(0, \sigma^2)$. The covariate vector, x, contains individual-level (level 1) and cluster-level (level 2) covariates. Although there is no difference in terms of formulating the model, there are differences when interpreting the effects of variables varying within clusters and cluster-level variables.
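
To make the two levels concrete, the following is a minimal sketch in Python that simulates data from a model of this form. The function name simulate_two_level, the single dichotomous individual-level covariate, and all parameter values are illustrative assumptions and are not taken from the analyses in this paper.

```python
import math
import random


def simulate_two_level(n_clusters, n_per_cluster, beta, sigma2, seed=0):
    """Draw (cluster, x, y) triples from a two-level logistic model.

    beta and sigma2 are illustrative inputs; x is a single dichotomous
    individual-level covariate (an assumption of this sketch).
    """
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)
    rows = []
    for k in range(n_clusters):
        # Level 2: one random effect per cluster, u_k ~ N(0, sigma^2)
        u_k = rng.gauss(0.0, sigma)
        for _ in range(n_per_cluster):
            x = rng.randint(0, 1)
            eta = beta * x + u_k                   # linear predictor eta(x, u_k)
            p = 1.0 / (1.0 + math.exp(-eta))       # Level 1: P(Y = 1 | x, u_k)
            y = 1 if rng.random() < p else 0
            rows.append((k, x, y))
    return rows


# Hypothetical settings chosen only for illustration
rows = simulate_two_level(n_clusters=98, n_per_cluster=50, beta=0.2, sigma2=1.5)
print(len(rows), "simulated persons")
```

Persons in the same cluster share one value of u_k, which is what induces the marginal within-cluster correlation described above.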

For variables varying within a cluster, the usual odds ratio interpretations apply for comparisons of persons belonging to the same cluster; for example, a gender effect may be interpreted as an odds ratio between a woman and a man belonging to the same cluster and with the same covariates, except for gender.

For variables varying on the cluster level, the quantification is more difficult; the usual odds ratio interpretation is incorrect, because it is necessary to compare persons with different random effects, since the variable of interest does not vary between individuals within-cluster. A measure for this situation is described below in the section "The IOR."

It is of interest to quantify the clustering as something other than variation on the underlying linear scale, because this is difficult to interpret and relate to the fixed effects, which are quantified in terms of odds ratios. Therefore, it would be useful to have an odds ratio interpretation of the cluster variation as well. A measure is described below.

The MOR
The MOR quantifies the variation between clusters (the second-level variation) by comparing two persons from two randomly chosen, different clusters. Consider two persons with the same covariates, chosen randomly from two different clusters. The MOR is the median odds ratio between the person of higher propensity and the person of lower propensity.

The MOR is very easy to calculate, because it is a simple function of the cluster variance, $\sigma^2$:

$$\mathrm{MOR} = \exp\left(\sqrt{2\sigma^2}\,\Phi^{-1}(0.75)\right),$$

where $\Phi(\cdot)$ is the cumulative distribution function of the normal distribution with mean 0 and variance 1, $\Phi^{-1}(0.75) \approx 0.6745$ is the 75th percentile, and exp(·) is the exponential function. A theoretical derivation of the formula is provided in the Appendix.

The measure is always greater than or equal to 1. If the MOR is 1, there is no variation between clusters (no second-level variation). If there is considerable between-cluster variation, the MOR will be large. The measure is directly comparable with fixed-effects odds ratios.
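
As an illustration of how directly the MOR follows from the cluster variance, here is a minimal Python sketch of the formula above. The function names and the variance values in the loop are illustrative assumptions; only the formula itself comes from the text.

```python
import math
from statistics import NormalDist


def median_odds_ratio(sigma2):
    """MOR = exp(sqrt(2 * sigma2) * Phi^{-1}(0.75))."""
    return math.exp(math.sqrt(2.0 * sigma2) * NormalDist().inv_cdf(0.75))


def variance_from_mor(mor):
    """Invert the MOR formula to recover sigma2 from a reported MOR."""
    z75 = NormalDist().inv_cdf(0.75)
    return (math.log(mor) / z75) ** 2 / 2.0


# Illustrative variances only; none of these are estimates from the paper
for s2 in (0.0, 0.1, 0.5, 1.0, 2.0):
    print(f"sigma^2 = {s2:3.1f}  ->  MOR = {median_odds_ratio(s2):.2f}")
```

A variance of zero gives an MOR of exactly 1, and the MOR grows with the cluster variance, in line with the properties listed above.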

The IOR
The IOR is a fixed-effects measure for quantification of the effect of cluster-level variables. Consider two persons with different cluster-level covariates, x1 and x2. The IOR is an interval for odds ratios between two persons with covariate patterns x1 and x2, covering the middle 80 percent of the odds ratios.

The IOR is only slightly more difficult to calculate than the MOR. The lower and upper bounds of the interval are

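$$\mathrm{IOR}_{\text{lower}} = \exp\left(\beta (x_1 - x_2) + \sqrt{2\sigma^2}\,\Phi^{-1}(0.10)\right)$$

and

$$\mathrm{IOR}_{\text{upper}} = \exp\left(\beta (x_1 - x_2) + \sqrt{2\sigma^2}\,\Phi^{-1}(0.90)\right),$$

where $\Phi^{-1}(0.10) = -1.2816$ and $\Phi^{-1}(0.90) = 1.2816$ are the 10th and 90th percentiles of the normal distribution with mean 0 and variance 1. A theoretical derivation of the formula is provided in the Appendix.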

The interval is narrow if the between-cluster variation is small, and it is wide if the between-cluster variation is large. If the interval contains 1, the cluster variability is large in comparison with the effect of the cluster-level variable. If the interval does not contain 1, the effect of the cluster-level variable is large in comparison with the unexplained between-cluster variation.

According to the model, the odds ratios can take any value between zero and infinity. However, to put meaning into the IOR, it is natural to report an interval that is likely to contain the odds ratio when randomly choosing two persons with two specific sets of covariates. Reporting an 80 percent interval has been suggested (6), and that is reasonable, because it covers a large fraction of the odds ratios. Note that the IOR is not a confidence interval.
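
The interval can be computed in a few lines. The Python sketch below assumes a single cluster-level covariate, so that the fixed-effect contribution is the coefficient times the covariate difference; the function name, the coverage argument, and the numerical inputs are hypothetical and serve only to show the mechanics.

```python
import math
from statistics import NormalDist


def interval_odds_ratio(beta, x1, x2, sigma2, coverage=0.80):
    """Return (lower, upper) bounds covering the middle `coverage` share
    of odds ratios between persons with cluster-level covariates x1 and x2."""
    tail = (1.0 - coverage) / 2.0                 # 0.10 for an 80 percent interval
    center = beta * (x1 - x2)                     # fixed-effect part, log odds scale
    spread = math.sqrt(2.0 * sigma2)              # standard deviation of u1 - u2
    lower = math.exp(center + spread * NormalDist().inv_cdf(tail))
    upper = math.exp(center + spread * NormalDist().inv_cdf(1.0 - tail))
    return lower, upper


# Hypothetical coefficient and cluster variance, not estimates from the paper
lower, upper = interval_odds_ratio(beta=0.5, x1=1, x2=0, sigma2=0.2)
print(f"IOR = [{lower:.2f}; {upper:.2f}]")
```

With a cluster variance of zero the interval collapses to the usual fixed-effects odds ratio, which is one way of seeing that the IOR describes the spread of odds ratios rather than sampling uncertainty.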


    EXAMPLES
 
It is known that different geographic boundaries may have different effects on the same response, a problem that in medical geography is known as the "modifiable areal unit problem" (11). Analogously, the same neighborhood definition may affect different individual outcomes differently. Moreover, one also needs to consider a temporal or longitudinal perspective to define individual exposure to the neighborhood environment, because individuals move between different areas during the course of their lives. Therefore, in cross-sectional analysis of neighborhood effects on health, one should expect that atherosclerotic disorders with a long natural history (e.g., ischemic heart disease) are influenced only slightly by the actual neighborhood territorial limits. Following the same reasoning, behavior-related outcomes (e.g., the choice of a specific kind of physician) may be much more susceptible to neighborhood influences.

Multilevel analysis has provided evidence that the socioeconomic characteristics of the neighborhood environment affect individual risk of ischemic heart disease (12). However, neighborhood variation seems to be low (13).

In Sweden, cost is not a specific determinant of people’s choice of a private versus public health-care practitioner, since the county councils support both economically. However, private physicians are outside of the public health-care system and therefore are less susceptible to health-care strategies directed by the county councils. Preferring a public practitioner versus a private practitioner might suggest dysfunction in some parts of the public health-care system, but it might also be explained by individual preferences, demands, and expectations related to socioeconomic position. The greater confidentiality of medical records in the private sector is another factor that could explain individual preference. Moreover, area of residence might influence individual decisions over and above individual characteristics.

Using a multilevel approach, we investigated the probability of being hospitalized for ischemic heart disease and the probability of visiting a public practitioner versus a private practitioner among Swedish men. We give two examples, followed by some comments on issues that have not already been covered.

Study population and assessment of variables
The study population consisted of 11,312 men aged 45–85 years residing in 98 of the 110 neighborhoods in the city of Malmö, Sweden. All of the men had visited a physician during the year 1999. Information was obtained from the Register on Health Care Utilization in Skåne, Sweden (14).

For the analysis, the variable "neighborhood" was used as a cluster (level 2) variable. Two individual-level (level 1) explanatory variables were considered: "age," which was a four-category variable with the categories 65–69, 70–74, 75–79, and 80–85 years, and "education," which was a dichotomous indicator of having 9 or fewer years of schooling. A cluster-level (level 2) explanatory variable, "neighborhood education," was also used in the analyses. This variable was a dichotomous indicator of the neighborhood’s educational level being below the median for all neighborhoods. In example 1, the response variable was an indicator of whether a person had been hospitalized for ischemic heart disease (coded as 1) or not (coded as 0). Ischemic heart disease was defined by hospital discharge diagnosis in 1999. The relevant codes according to the International Classification of Diseases, Tenth Revision, are I20 (angina pectoris), I21 (acute myocardial infarction), I22 (subsequent myocardial infarction), and I50 (heart failure). In example 2, the response variable was whether a person had visited a public physician (coded as 1) or not (coded as 0). We restricted our analysis to persons who had used the health-care system and visited a general practitioner at least once during the year 1999.

Parameters in the random-effects model were estimated using restricted iterative generalized least squares. The MLwiN software package (15), version 1.1, was used to perform the analyses. Extrabinomial variation was explored systematically in all of the models, and there was no indication of either under- or overdispersion.

Population-averaged parameters were calculated on the basis of the approximate formula (7)

$$\beta^{PA} \approx \frac{\beta^{SS}}{\sqrt{1 + c^2 \sigma^2}}, \qquad c = \frac{16\sqrt{3}}{15\pi},$$

where $\beta^{PA}$ is the population-averaged parameter and $\beta^{SS}$ is the subject-specific parameter. The cluster variance is $\sigma^2$.

Example 1: hospitalization for ischemic heart disease
In example 1, two models are considered. In model 1, age and individual education were included together with a random neighborhood effect. Model 2 was an extension of model 1 that also included the cluster-level covariate neighborhood education.

In both analyses, it appeared that older people were more likely to be hospitalized than younger people and that less-educated persons were more likely to be hospitalized than the more educated. The estimates shown in table 1 were very similar in the two models, but there were some differences (discussed in detail below). The parameter estimates were transformed into odds ratios, which are shown in table 2.


TABLE 1. Estimates and standard errors from analyses of hospitalization for ischemic heart disease (models 1 and 2) and visiting a public physician (models 3 and 4), Malmö, Sweden, 1999
 

TABLE 2. Odds ratios for being hospitalized for ischemic heart disease (models 1 and 2) and visiting a public physician (models 3 and 4), Malmö, Sweden, 1999
 
The individual-specific fixed effects are conditional on the random effects. That is, they may be interpreted as odds ratios for within-cluster comparisons. For the individual education variable, the odds ratios for a low level of education versus a high level were 1.25 in model 1 and 1.22 in model 2. These odds ratios are conditional on age and neighborhood.

Note that the above odds ratios are correct only for comparisons of persons belonging to the same cluster. When comparing persons from different clusters, it is necessary to calculate an IOR or leave the subject-specific interpretation and consider population-averaged odds ratios. The population-averaged odds ratio is the odds ratio between two persons from different clusters. In models 1 and 2, the estimates on the linear predictor scale are only shrunk by factors of

{kwi017eq9}

and

{kwi017eq10}

respectively. This indicates an apparently small amount of heterogeneity of the clusters. Thus, in effect, the attenuation is hardly visible in this analysis.

In model 1, the cluster heterogeneity is interpreted in the following way. Consider two randomly chosen persons with the same covariates from two different clusters (e.g., two more-educated persons aged 65–69 years) and conduct the hypothetical experiment of calculating the odds ratio for the person with the higher propensity to be hospitalized versus the person with the lower propensity. Repeating this comparison of two randomly chosen persons will lead to a series of odds ratios. This distribution is estimated when estimating the parameters, and it is shown in figure 1 (on the log scale). The median of these odds ratios between the person with a higher propensity and the person with a lower propensity is estimated to be 1.17 in model 1. This is a low odds ratio, and it suggests that the clustering effect is small even without inclusion of any cluster-level covariates. When neighborhood education is included (model 2), the unexplained cluster heterogeneity (comparing persons from neighborhoods of the same kind—for example, both neighborhoods with a high level of education) decreases, yielding an MOR of 1.09, which is a very low odds ratio. Thus, there is very little variation between neighborhoods in the propensity for hospitalization for ischemic heart disease.



FIGURE 1. Model 1: density of the distribution of log odds ratios between a person with a higher propensity for being hospitalized for ischemic heart disease and someone with a lower propensity. The median value in the distribution of the log odds ratios is 0.16. MOR, median odds ratio.
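
The hypothetical experiment described for model 1 can be mimicked by simulation. The Python sketch below back-calculates the neighborhood standard deviation from the reported MOR of 1.17 (the estimate in table 1 is not used, so some rounding error is to be expected), draws pairs of neighborhood effects, and takes the median of the resulting odds ratios. The seed, the number of draws, and all variable names are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist, median

REPORTED_MOR = 1.17  # model 1, as reported in the text

# Back-calculate sigma from MOR = exp(sqrt(2) * sigma * Phi^{-1}(0.75))
z75 = NormalDist().inv_cdf(0.75)
sigma = math.log(REPORTED_MOR) / (math.sqrt(2.0) * z75)

rng = random.Random(1)
odds_ratios = []
for _ in range(100_000):
    u1 = rng.gauss(0.0, sigma)   # neighborhood effect of the first person
    u2 = rng.gauss(0.0, sigma)   # neighborhood effect of the second person
    # Odds ratio for the higher-propensity versus the lower-propensity person
    odds_ratios.append(math.exp(abs(u1 - u2)))

simulated_mor = median(odds_ratios)
print(f"sigma^2 ~ {sigma ** 2:.4f}, simulated MOR ~ {simulated_mor:.3f}")
```

Every simulated odds ratio is at least 1 by construction, and their median should land close to the closed-form value from which sigma was derived.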

 
Although the cluster heterogeneity is small in both models, it is interesting that neighborhood education accounts for quite a large part of the variation between clusters. This can best be seen from the IOR in the following way. The IOR is [1.07; 1.50]. This means that when randomly choosing two persons with identical individual covariates (e.g., 80- to 85-year-olds with a low individual level of education) from a low-education neighborhood and a high-education neighborhood, the odds ratio between the two lies within the interval in 80 percent of the cases. The estimated distribution of the odds ratios is shown in figure 2 (on the log scale). Two things are worth noticing regarding this interval. First, the interval does not contain 1. This suggests that the effect of neighborhood education is large relative to the cluster effect, because apparently there is little chance that the person from the less-educated neighborhood has a lower propensity for hospitalization than the person from the more-educated neighborhood. Second, the interval appears to be relatively narrow, which suggests that there is little cluster heterogeneity (as does the MOR). Therefore, a fairly large proportion of the (small) variation between neighborhoods in the propensity to be hospitalized may be explained by neighborhood educational level.



FIGURE 2. Model 2: density of the distribution of log odds ratios between a person from a less-educated neighborhood and a person from a more-educated neighborhood, when the persons belong to the same age group and have the same personal educational level. The 10th and 90th percentiles in the distribution of the log odds ratios are 0.07 and 0.41, respectively. IOR, interval odds ratio.
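
As a rough consistency check, the two bounds of an IOR can be decomposed into a fixed-effect part (the midpoint on the log scale) and a heterogeneity part (the half-width). The Python sketch below does this for the reported interval [1.07; 1.50], assuming the 80 percent interval formulas given in the section "The IOR". The function name decompose_ior is hypothetical, and the inputs are rounded published values rather than the estimates in table 1.

```python
import math
from statistics import NormalDist


def decompose_ior(lower, upper, coverage=0.80):
    """Recover beta * (x1 - x2) and sigma2 from the bounds of an IOR."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - coverage) / 2.0)   # 1.2816 for 80 percent
    log_lo, log_hi = math.log(lower), math.log(upper)
    beta_diff = (log_lo + log_hi) / 2.0                      # midpoint on the log scale
    sigma2 = ((log_hi - log_lo) / (2.0 * z)) ** 2 / 2.0      # from half-width = z * sqrt(2 * sigma2)
    return beta_diff, sigma2


beta_diff, sigma2 = decompose_ior(1.07, 1.50)
fixed_or = math.exp(beta_diff)
implied_mor = math.exp(math.sqrt(2.0 * sigma2) * NormalDist().inv_cdf(0.75))
print(f"fixed-effect odds ratio ~ {fixed_or:.2f}, implied MOR ~ {implied_mor:.2f}")
```

Under these assumptions, the implied MOR should agree, up to rounding, with the MOR of 1.09 reported for model 2, since both are functions of the same cluster variance.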

 
A population-averaged approach cannot convey the ramifications of the effect of neighborhood education on the propensity to be hospitalized in an equally satisfactory way. The results of such an analysis would be quantified in terms of a (slightly attenuated) odds ratio of

{kwi017eq11}

and some measure of within-cluster association. Contrary to the MOR and the IOR, these two measures do not convey much information regarding the effect of clustering on the likelihood of being hospitalized.

Example 2: visits to a public or private physician
In example 2, two models are considered. In model 3, age and individual education were included with a random neighborhood effect. Model 4 was an extension of model 3 that also included the cluster-level covariate neighborhood education.

The parameter estimates are shown in table 1, and the odds ratios are shown in table 2. The odds ratios for individual education were 1.28 and 1.27 in models 3 and 4, respectively; this suggests a higher propensity to visit a public physician for the less educated, conditional on age and neighborhood.

However, the results regarding clustering are quite different from those in example 1, and this implies substantial differences between the subject-specific and population-averaged estimates in this example. In models 3 and 4, the linear predictor parameters are attenuated by factors of

{kwi017eq12}

and

{kwi017eq13}

respectively. This yields population-averaged effects of 1.21 for individual education in both models. These effects are for comparisons of persons of the same age belonging to different clusters.
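
The attenuation can be checked approximately from quantities reported in the text. The following Python sketch assumes the approximation of Zeger et al. (7) in the form of an attenuation factor sqrt(1 + c^2 * sigma^2) with c = 16 * sqrt(3) / (15 * pi), and it back-calculates the cluster variance from the reported MORs (3.61 and 3.33) rather than from table 1, so the results are subject to rounding. All function and variable names are illustrative.

```python
import math
from statistics import NormalDist

C = 16.0 * math.sqrt(3.0) / (15.0 * math.pi)   # constant in the approximation (7)
Z75 = NormalDist().inv_cdf(0.75)


def variance_from_mor(mor):
    """Cluster variance implied by a reported MOR."""
    return (math.log(mor) / Z75) ** 2 / 2.0


def attenuation_factor(sigma2):
    """Factor by which subject-specific log odds ratios are divided."""
    return math.sqrt(1.0 + C ** 2 * sigma2)


def population_averaged_or(or_ss, sigma2):
    """Approximate population-averaged odds ratio from a subject-specific one."""
    return math.exp(math.log(or_ss) / attenuation_factor(sigma2))


# Reported values from the text: (model, MOR, odds ratio for individual education)
results = {}
for model, mor, or_ss in (("model 3", 3.61, 1.28), ("model 4", 3.33, 1.27)):
    s2 = variance_from_mor(mor)
    results[model] = population_averaged_or(or_ss, s2)
    print(f"{model}: sigma^2 ~ {s2:.2f}, factor ~ {attenuation_factor(s2):.3f}, "
          f"population-averaged OR ~ {results[model]:.2f}")
```

Because the inputs are rounded, the factors printed here need not match the published ones to the last digit, but the population-averaged odds ratios should be close to the value of 1.21 quoted above.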

In model 3, for two persons with the same individual-level covariates, the MOR between the person living in the neighborhood with the higher propensity to visit a public physician and the person living in the neighborhood with the lower propensity is 3.61. This is a high odds ratio, suggesting that the heterogeneity is substantial. Including neighborhood education as a covariate reduces the unexplained heterogeneity between neighborhoods to an MOR of 3.33, which is still high. Thus, the propensity to visit a public physician varies a great deal between neighborhoods. This is also reflected in the IOR, which is very broad: [0.28; 27.3]. The interpretation is that if one is randomly selecting two persons, one from a low-education neighborhood and one from a high-education neighborhood, and comparing their odds of having visited a public physician, the middle 80 percent of the odds ratios will lie within this interval. The interval contains 1, which implies that neighborhood education does not account for a substantial amount of the neighborhood heterogeneity. In addition, the interval is quite broad, reflecting a large amount of unexplained variation between neighborhoods in the propensity to visit a public physician. Other cluster-level variables are needed to explain the cluster heterogeneity.

A population-averaged approach might lead to a different conclusion. The population-averaged effect of neighborhood education is

{kwi017eq14}

This is by far the most important of the fixed effects, and it suggests that neighborhood education is indeed an important variable when analyzing the propensity to visit a public physician. However, this is not completely in agreement with the previous interpretations based on the random-effects approach. The reason for this apparent discrepancy is the large unexplained cluster heterogeneity, which in the population-averaged approach is only considered a nuisance.

Other issues regarding the MOR and IOR
In a previous study (6), the IOR for being prescribed morphine when calling two different doctors and giving "back pain" versus "other" as the reason for the call was [2.08; 22.9]. This is an example of a large cluster effect (variation between physicians) along with a strong fixed effect (cause for the call).

Another, perhaps less interesting situation is also possible—the situation where there is relatively little variation between clusters and there is little or no effect of the cluster-level covariate. Although this may be a common situation, it is also the one that is most likely to be overlooked because of type II error.

It is also possible to include continuous cluster-level covariates using the formulas given above in the section "The IOR," and the interpretation is straightforward.


    DISCUSSION
 
In this paper, we discuss interpretational aspects of the multilevel logistic regression model. Two measures, the MOR and the IOR, are proposed and applied to data concerning neighborhood effects on people’s propensity to visit public physicians and their likelihood of being hospitalized because of ischemic heart disease.

Regarding interpretation of fixed effects, the choice between generalized estimating equations and the random-effects model is the choice between population-averaged and subject-specific interpretations. Which one to choose depends entirely on the substantive research question. In examples 1 and 2, the effect of individual education is conditional on neighborhood. A subject-specific approach is proper, because it facilitates measures of effect on the individual level: What is the direct effect of having a high individual level of education versus a low level for a person under his/her specific conditions? In this sense, the subject-specific parameters are more "clean" measures of the effect of individual education, because they are not attenuated by unmeasured heterogeneity.

A general argument favoring the random-effects model is the possibility of quantification of heterogeneity in an intuitively attractive way. As is shown in the four models in the examples, the heterogeneity may be quantified by characteristics from the distribution of the odds ratios between pairs of randomly chosen persons from different neighborhoods. This provides a measure of heterogeneity on a scale that is familiar to researchers who have worked with the logistic regression model, namely odds ratios.

The usual odds ratio interpretations (conditioning on all other covariates and random effects) are proper for covariates that vary within-cluster, whereas they are improper for cluster-level covariates, because it is impossible to make comparisons within-cluster. That is, all comparisons must be made between persons belonging to two different clusters, and thus the odds ratio is no longer a fixed quantity but a random variable. In this way, the heterogeneity becomes relevant when quantifying the effect of a cluster-level covariate. The IOR incorporates both the fixed effect and the cluster heterogeneity in an interval, allowing for a more detailed description of the covariate effect.

The MOR and the IOR have been applied in a two-level model for the quantification of between-rater variability in a study of mammographic screening (16). Other studies have used the pairwise odds ratio (17, 18), which is similar to the MOR and IOR in the sense that it is also based on comparisons between pairs of individuals. However, where the MOR and IOR quantify between-cluster heterogeneity, the pairwise odds ratio quantifies association (concordance/discordance) between pairs of individuals within-cluster.

Both the MOR and the IOR are very easy to calculate, since they are simple functions of the parameters in the model. Thus, no additional analyses are necessary. A pocket calculator with exponential and square root functions is sufficient for calculating the measures from the parameters.

The MOR and the IOR may be extended to higher-order multilevel models, models with a nonhierarchical structure, and models with random slopes. They can also be extended to log-linear (Poisson) and linear (normal) responses, as well as to models with nonnormal random effects.


    ACKNOWLEDGMENTS
 
This study was supported by grants 2002-054 and 2003-0580 (Principal Investigator, Dr. Juan Merlo) from the Swedish Council for Working Life and Social Research.

The authors express their gratitude to Prof. Niels Keiding of the Department of Biostatistics, University of Copenhagen, for his comments on an early draft of this paper. The authors also thank Dr. Basile Chaix of the Research Team on the Social Determinants of Health and Health Care, French National Institute of Health and Medical Research, for his comments on the final manuscript.


    APPENDIX
 
Mathematical Derivations

The median odds ratio (MOR)
The odds ratio between two persons with identical covariates from two different clusters is $\exp(u_1 - u_2)$, where $u_1$ and $u_2$ are the two random cluster variables. Consequently, the odds ratio for the person with the higher propensity versus the person with the lower propensity is $\exp(|u_1 - u_2|)$. Since the two cluster variables are assumed to be independent and normally distributed with mean zero and variance $\sigma^2$, their difference $u_1 - u_2$ is normally distributed with mean zero and variance $2\sigma^2$, and the distribution of $\exp(|u_1 - u_2|)$ may be characterized by the cumulative distribution function, F, in the following way. For $z \ge 1$,

$$F(z) = P\left(\exp(|u_1 - u_2|) \le z\right) = P\left(|u_1 - u_2| \le \log z\right) = 2\,\Phi\left(\frac{\log z}{\sqrt{2\sigma^2}}\right) - 1,$$

where $\Phi(\cdot)$ is the cumulative distribution function for the standard normal distribution (that is, a normal distribution with mean 0 and variance 1). Thus, the density function, f(·), for the distribution of $\exp(|u_1 - u_2|)$ becomes

$$f(z) = \frac{1}{z\sqrt{\pi\sigma^2}} \exp\left(-\frac{(\log z)^2}{4\sigma^2}\right), \qquad z \ge 1.$$

The MOR is the median of this distribution, so it can be calculated as the solution to the equation F(z) = 0.5, which leads to

$$\mathrm{MOR} = \exp\left(\sqrt{2\sigma^2}\,\Phi^{-1}(0.75)\right).$$
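
For readers who want the intermediate algebra, the step from F(z) = 0.5 to the MOR can be written out as follows. This is a sketch under the assumptions already stated (independent, normally distributed cluster effects with common variance); the numerical constants are rounded.

```latex
\begin{align*}
F(z) = 0.5
  &\iff 2\,\Phi\!\left(\frac{\log z}{\sqrt{2\sigma^2}}\right) - 1 = 0.5 \\
  &\iff \Phi\!\left(\frac{\log z}{\sqrt{2\sigma^2}}\right) = 0.75 \\
  &\iff \log z = \sqrt{2\sigma^2}\,\Phi^{-1}(0.75)
        \approx 1.4142 \times 0.6745 \times \sigma
        \approx 0.954\,\sigma \\
  &\iff z = \exp\!\left(\sqrt{2\sigma^2}\,\Phi^{-1}(0.75)\right)
        \approx \exp(0.954\,\sigma).
\end{align*}
```

The last line shows that the MOR depends on the data only through the cluster standard deviation, which is why it can be computed without any additional analysis.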

The interval odds ratio (IOR)
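The odds ratio between two persons from two different clusters with covariates $x_1$ and $x_2$ is $\exp(\beta (x_1 - x_2) + (u_1 - u_2))$. Since the two cluster variables are independent and normally distributed, the odds ratio is lognormally distributed with cumulative distribution function G, where, for z > 0,

$$G(z) = P\left(\exp(\beta (x_1 - x_2) + (u_1 - u_2)) \le z\right) = \Phi\left(\frac{\log z - \beta (x_1 - x_2)}{\sqrt{2\sigma^2}}\right).$$

Consequently, the density, g(·), becomes

$$g(z) = \frac{1}{z\sqrt{4\pi\sigma^2}} \exp\left(-\frac{\left(\log z - \beta (x_1 - x_2)\right)^2}{4\sigma^2}\right).$$

The $\alpha$-percentile in the distribution of the odds ratio is the solution to $G(z) = \alpha$, which leads to

$$z_\alpha = \exp\left(\beta (x_1 - x_2) + \sqrt{2\sigma^2}\,\Phi^{-1}(\alpha)\right).$$

In particular, the 10th percentile is

$$\mathrm{IOR}_{\text{lower}} = \exp\left(\beta (x_1 - x_2) - 1.2816\sqrt{2\sigma^2}\right)$$

and the 90th percentile is

$$\mathrm{IOR}_{\text{upper}} = \exp\left(\beta (x_1 - x_2) + 1.2816\sqrt{2\sigma^2}\right).$$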


    NOTES
 
Correspondence to Dr. Klaus Larsen, Clinical Research Unit, Section 136, Hvidovre University Hospital, Kettegård Allé 30, DK-2650 Hvidovre, Denmark (e-mail: klaus.larsen{at}hh.hosp.dk).


    REFERENCES
 

  1. Diez-Roux AV. Multilevel analysis in public health research. Annu Rev Public Health 2000;21:171–92.
  2. Kawachi I, Berkman LF. Neighborhoods and health. New York, NY: Oxford University Press, 2003.
  3. Merlo J. Multilevel analytical approaches in social epidemiology: measures of health variation compared with traditional measures of association. (Editorial). J Epidemiol Community Health 2003;57:550–2.
  4. Oakes JM. The (mis)estimation of neighborhood effects: causal inference for a practicable social epidemiology. Soc Sci Med 2004;58:1929–52.
  5. Rodriguez G, Goldman N. An assessment of estimation procedures for multilevel models with binary responses. J R Stat Soc A 1995;158:73–90.
  6. Larsen K, Petersen JH, Budtz-Jørgensen E, et al. Interpreting parameters in the logistic regression model with random effects. Biometrics 2000;56:909–14.
  7. Zeger SL, Liang K-Y, Albert PS. Models for longitudinal data: a generalized estimation equation approach. Biometrics 1988;44:1049–60.
  8. Zeger SL, Liang K-Y. Longitudinal data analysis for discrete and continuous outcomes. Biometrics 1986;42:121–30.
  9. Ridout MS, Demétrio CG, Firth D. Estimating intraclass correlation for binary data. Biometrics 1999;55:137–48.
  10. Goldstein H, Browne W, Rasbash J. Partitioning variation in multilevel models. Understanding Stat 2002;1:223–32.
  11. Ratcliffe JH, McCullagh MJ. Hotbeds of crime and the search for spatial accuracy. J Geograph Systems 1999;1:385–98.
  12. Diez Roux AV, Merkin SS, Arnet D, et al. Neighborhood of residence and incidence of coronary heart disease. N Engl J Med 2001;345:99–106.
  13. Diez-Roux AV, Nieto FJ, Muntaner C, et al. Neighborhood environments and coronary heart disease: a multilevel analysis. Am J Epidemiol 1997;146:48–63.
  14. Merlo J, Gerdtham UG, Lynch J, et al. Social inequalities in health - do they diminish with age? Revisiting the question in Sweden 1999. Int J Equity Health 2003;2:2. (Electronic article).
  15. Rasbash J, Browne W, Goldstein H, et al. A user’s guide to MLwiN. 2nd ed. London, United Kingdom: Institute of Education, University of London, 2000. (World Wide Web URL: http://multilevel.ioe.ac.uk/1_10/manuals.html).
  16. Elmore JG, Miglioretti DL, Reisch LM, et al. Screening mammograms by community radiologists: variability in false-positive rates. J Natl Cancer Inst 2002;94:1373–80.
  17. Katz J, Carey VJ, Zeger SL, et al. Estimation of design effects and diarrhea clustering within households and villages. Am J Epidemiol 1993;138:994–1006.
  18. Carey VJ, Zeger SL, Diggle P. Modelling multivariate binary data with alternating logistic regressions. Biometrika 1993;80:517–26.