Comparison of a Spatial Perspective with the Multilevel Analytical Approach in Neighborhood Studies: The Case of Mental and Behavioral Disorders due to Psychoactive Substance Use in Malmö, Sweden, 2001

Basile Chaix1,2, Juan Merlo2, S. V. Subramanian3, John Lynch4 and Pierre Chauvin1

1 Research Unit in Epidemiology, Information Systems, and Modelisation (INSERM U707), National Institute of Health and Medical Research, Paris, France
2 Department of Community Medicine, Malmö University Hospital, Lund University, Malmö, Sweden
3 Department of Society, Human Development and Health, Harvard School of Public Health, Boston, MA
4 Department of Epidemiology, Center for Social Epidemiology and Population Health (CSEPH), University of Michigan, Ann Arbor, MI

Correspondence to Dr. Basile Chaix, INSERM U707, Faculté de Médecine Saint-Antoine, 27, rue Chaligny, 75571 Paris cedex 12, France (e-mail: chaix@u707.jussieu.fr).

Received for publication October 29, 2004. Accepted for publication March 7, 2005.


    ABSTRACT
Most studies of neighborhood effects on health have used the multilevel approach. However, since this methodology does not incorporate any notion of space, it may not provide optimal epidemiologic information when modeling variations or when investigating associations between contextual factors and health. Investigating mental disorders due to psychoactive substance use among all 65,830 individuals aged 40–59 years in 2001 in Malmö, Sweden, geolocated at their place of residence, the authors compared a spatial analytical perspective, which builds notions of space into hypotheses and methods, with the multilevel approach. Geoadditive models provided precise cartographic information on spatial variations in prevalence independent of administrative boundaries. The multilevel model showed significant neighborhood variations in the prevalence of substance-related disorders. However, hierarchical geostatistical models provided information on not only the magnitude but also the scale of neighborhood variations, indicating a significant correlation between neighborhoods in close proximity to each other. The prevalence of disorders increased with neighborhood deprivation. Far stronger associations were observed when using indicators measured in spatially adaptive areas, centered on residences of individuals, smaller in size than administrative neighborhoods. In neighborhood studies, building notions of space into analytical procedures may yield more comprehensive information than heretofore has been gathered on the spatial distribution of outcomes.

epidemiologic methods; logistic models; mental disorders; social environment; spatial analysis; substance-related disorders


    INTRODUCTION
During the past decade, there has been growing research interest in the impact of neighborhood of residence on health (1). Most studies have used the multilevel modeling technique (2). With this approach, standard errors for the measures of association between neighborhood factors and health are corrected for the nonindependence of individuals within neighborhoods (3, 4), and measures of variation based on random effects (e.g., neighborhood-level variance) allow quantifying the magnitude of variations in outcomes among neighborhoods (5–8). However, our hypothesis was that the multilevel approach may provide only limited information on the spatial distribution of outcomes, both when modeling variations and when investigating associations, since it fragments space into administrative neighborhoods and ignores spatial associations between them.

This methodological question was motivated by an epidemiologic investigation of spatial variations in mental disorders in Malmö, Sweden, using data on all 65,830 residents aged 40–59 years in 2001 geocoded at their exact place of residence. Several previous studies that investigated neighborhood variations in mental health as a general category reported only weak variations between neighborhoods (9–12). Such variations were usually explained by differences in neighborhood composition (9–11), but some analyses found that neighborhood deprivation was weakly but significantly associated with deteriorated mental health (12–14). However, we hypothesized that the absence of important neighborhood variations in mental health as a general category may conceal situations of strong context dependence for specific mental health outcomes. Therefore, we investigated spatial variations in mental and behavioral disorders due to psychoactive substance use, which we assumed to be particularly dependent on the social context.

After describing the spatial distribution of these mental disorders, our objective was to examine whether they were independently associated with the socioeconomic status of the context. Beyond spatial filtering of mentally ill individuals to places considered deprived (15), this relation may result from a direct influence of neighborhood deprivation on mental health (14). Describing spatial variations and investigating whether contextual poverty is a marker for places of high prevalence aid in determining whether the intensity of intervention programs for mental health should vary over space.

Original to our approach is an emphasis on the public health relevance of determining, beyond the magnitude of spatial effects, the spatial scale on which these effects operate—both when describing the spatial distribution of mental disorders and when investigating the impact of contextual characteristics. For example, the existence of clusters of increased prevalence that extend beyond administrative neighborhoods may indicate that public health interventions should be coordinated on a larger scale than that of the neighborhood.

To describe the spatial distribution of disorders, we did not use cluster recognition techniques (16, 17) but instead used regression approaches to identify individual and contextual characteristics contributing to clustering. We compared three modeling approaches (geoadditive, multilevel, and hierarchical geostatistical), all of which build upon different notions of space for investigating the spatial distribution of mental disorders.

Regression models that rely on a space fragmented into administrative neighborhoods are affected by the modifiable areal unit problem (18–20): their results depend on the particular size and shape of the administrative neighborhoods (21–23). To obtain precise cartographic information on mental disorders independent of neighborhood boundaries, we used a geoadditive model that captures spatial variations with a two-dimensional (longitude/latitude) smooth term (24, 25). Whereas spatial random effect approaches are often computationally unable to process the spatial coordinates for individuals, geoadditive models enabled us to use this accurate locational information to produce smoothed maps of prevalence (26, 27). However, this approach does not provide parametric information that would enable us to make inferences about the magnitude and scale of spatial variations. For example, it does not permit assessment of whether a similar prevalence noted in surrounding neighborhoods corresponds to real patterns of variation or simply results from the smoothing of data.

To make these inferences, we first used the multilevel model, considered the "gold standard" for contextual analysis. Taking into account the neighborhood affiliation of individuals using independent random effects that ignore spatial connections between neighborhoods, the multilevel model assumes that all spatial correlation can be reduced to within-neighborhood correlation. This model is not spatial because it does not incorporate any notion of space. Accordingly, it allows quantifying the magnitude of neighborhood variations, but it fails to provide information on their scale and does not indicate whether neighborhood variability follows a spatially organized pattern or consists of unstructured variations.

Many ecologic studies have explicitly modeled the spatial correlation of outcomes (28–31). However, there has been less effort to do so with individual data (32–35). To obtain information on the scale of spatial variations, we used a hierarchical geostatistical model (32) that geocodes individuals at the neighborhood level and splits neighborhood variability into a spatially structured component and an unstructured component (35, 36). Doing so allowed us to make statistical inferences on not only the magnitude of correlation within neighborhoods but also the range of correlation in space (33, 34).

Regarding the association between contextual deprivation and mental disorders, measuring deprivation within administrative neighborhoods may be overly restrictive because individual locational information was available. For one thing, the administrative neighborhood scale may be too broad to capture the effect of contextual deprivation. Moreover, using fixed boundary areas may not enable capture of contextual information in surrounding space for those individuals residing on the margins of administrative neighborhoods. Therefore, we measured contextual deprivation within small circular areas centered on the residences of individuals (i.e., within moving-window areas). Beyond quantifying the strength of association, this approach allowed us to examine whether contextual deprivation was related to mental health on the administrative neighborhood level commonly used in epidemiologic research or on a smaller or larger scale.

In summary, we aimed to describe precisely the spatial distribution of substance-related disorders (geoadditive model), make inferences of relevance to public health on the magnitude and scale of spatial variations (multilevel and hierarchical geostatistical models), and investigate the strength and spatial scale of the association between contextual deprivation and mental disorders.


    MATERIALS AND METHODS
Data and measures
After the research plan was approved, the Regional Office in the Skåne county of Sweden supplied us with data on all 65,830 individuals aged 40–59 years living in Malmö in 2001. Substance-related disorders are more common, and therefore more likely to constitute a public health problem, in this age bracket than in others. The Regional Office extracted the data from the Database for Resource Allocation, which includes demographic and socioeconomic data on individuals (see below) and information on all inpatient and outpatient contacts (including diagnoses and costs) between individuals and public or private health-care providers in 2001. Using the first three diagnoses provided for each contact (coded according to the International Classification of Diseases, Tenth Revision), the Regional Office created binary variables indicating whether individuals had received diagnoses in different categories of health problems. In the present study, the binary outcome investigated was the presence or absence of mental or behavioral disorders due to psychoactive substance use (International Classification of Diseases, Tenth Revision, codes F10–F19). For reasons of confidentiality, we had detailed information on subdiagnoses in a separate database only. We used these separate data to describe the distribution of individuals with a substance-related disorder by subgroups of conditions and psychoactive substances used (refer to the Results section).

As individual variables, we considered age, gender, marital status, education, and income. Age was divided into two categories (40–49 years, 50–59 years). Marital status was coded as "married or cohabiting" and "others" (single, divorced, widowed). Educational attainment was dichotomized (9 years or less vs. more than 9 years of education). Since household income was not available, we used individual income (dichotomized, with the median value as a cutoff) as a proxy for individual socioeconomic position.

Malmö is divided into 100 administrative neighborhoods. The Regional Office of Skåne used street addresses to geocode individuals' places of residence. Figure 1 indicates the spatial distribution of the 65,830 individuals aged 40–59 years over 13,730 locations. These locations correspond to houses or buildings. Figure 1 also provides basic information on the neighborhood structure.



FIGURE 1. Spatial distribution of all 65,830 individuals aged 40–59 years in Malmö, Sweden, in 2001 residing in 13,730 different locations in the city. Each point indicates the exact place of residence of these individuals. Information on neighborhood income in the 100 administrative neighborhoods is also plotted. The median area of the neighborhoods was 0.5 km² (interquartile range: 0.3–1.1 km²). The median distance of each individual from the population-weighted centroid of his or her neighborhood was 246 m. The median distance between centroids of adjacent neighborhoods was 904 m (interquartile range: 655–1,326 m). The median number of inhabitants (all ages) in the neighborhoods was 2,046, and the median number of individuals aged 40–59 years was 510.

 
We computed the mean income of individuals aged 25 years or older to capture the socioeconomic level of the residential context (37). This variable was first calculated within administrative neighborhoods. Then, to define it on a more local scale and avoid the problem of individuals residing on the margins of neighborhoods, we computed mean income within areas of smaller size than neighborhoods, centered on residences of individuals. We could have defined these areas as small, circular spaces of constant size. However, because of the uneven distribution of individuals (figure 1), this approach results in missing values or unreliable measurements for those individuals living in sparsely populated areas. Therefore, we computed mean income in circular areas of constant population size (rather than constant geographic size); these areas were centered on the residences of individuals and comprised a fixed number of inhabitants aged 25 years or older. This approach results in spatially adaptive areas (areas having an adaptive window width) of greater size in sparsely populated areas. It was inspired by the spatially adaptive filters used in health geography to obtain smoothed maps of disease incidence (38–40). For each of the 13,730 individual locations, we computed mean income for the 100, 200, 500, 1,000, and 1,500 closest inhabitants aged 25 years or older, aggregating contextual information from surrounding locations until the required number of inhabitants was attained. To compare the different measurement strategies, all contextual variables were divided into quartiles.
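The spatially adaptive measurement described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the data layout (one tuple per residential location holding coordinates, number of inhabitants aged 25 years or older, and their mean income), the function name, and the choice to add whole locations nearest first are all assumptions made for illustration.

```python
import math


def adaptive_mean_income(target, locations, k):
    """Mean income of (roughly) the k inhabitants closest to `target`.

    `locations` is a list of (x, y, n_inhabitants, mean_income) tuples,
    one per residential location. This layout is hypothetical.
    Whole locations are added, nearest first, until at least k
    inhabitants are covered, so the window may slightly exceed k.
    If fewer than k inhabitants exist, all of them are used.
    """
    tx, ty = target
    by_distance = sorted(
        locations, key=lambda loc: math.hypot(loc[0] - tx, loc[1] - ty)
    )
    total_people = 0
    total_income = 0.0
    for _x, _y, n, income in by_distance:
        total_people += n
        total_income += n * income
        if total_people >= k:
            break
    if total_people == 0:
        raise ValueError("no inhabitants available")
    return total_income / total_people


# Toy data: coordinates in meters, incomes in arbitrary units.
toy_locations = [
    (0.0, 0.0, 40, 100.0),
    (50.0, 0.0, 60, 200.0),
    (500.0, 0.0, 100, 400.0),
]
mean_100 = adaptive_mean_income((0.0, 0.0), toy_locations, k=100)
mean_200 = adaptive_mean_income((0.0, 0.0), toy_locations, k=200)
```

In the toy data, the window for k = 100 stops at the two nearby locations, whereas k = 200 must reach the distant third location, which is how the window widens in sparsely populated areas.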

Statistical analyses
To produce a precise smoothed map of prevalence based on individual locational information, we estimated a geoadditive model (24–27) by using a two-dimensional (latitude/longitude) smooth term for the spatial effect and simple regression coefficients for individual and contextual factors (refer to the Appendix). To obtain easily interpretable information on the magnitude of spatial variations, we propose an indicator on the odds ratio scale, the interquartile spatial odds ratio, defined as the odds ratio between an individual residing in a location in the first quartile and one from a location in the fourth quartile of spatial risk, as estimated from the geoadditive model (Appendix).
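One possible reading of the interquartile spatial odds ratio is sketched below. The exact computation is given in the Appendix, which is not reproduced here, so the rule used in this sketch is an assumption: the fitted spatial term (on the log-odds scale, one value per individual) is sorted, the median of the lowest quarter is compared with the median of the highest quarter, and the difference is exponentiated. The function name and the toy values are hypothetical.

```python
import math
import statistics


def interquartile_spatial_odds_ratio(spatial_terms):
    """Odds ratio contrasting the top and bottom quartiles of spatial risk.

    `spatial_terms` holds the fitted spatial smooth term on the log-odds
    scale, one value per individual. ASSUMPTION: each extreme quartile
    is summarized by its median; the source defers details to its Appendix.
    """
    values = sorted(spatial_terms)
    if len(values) < 4:
        raise ValueError("need at least four values to form quartiles")
    quarter = len(values) // 4
    lowest = values[:quarter]
    highest = values[-quarter:]
    return math.exp(statistics.median(highest) - statistics.median(lowest))


# Toy log-odds values, not taken from the study.
toy_terms = [-1.0, -0.5, -0.2, 0.0, 0.1, 0.3, 0.6, 1.0]
isor = interquartile_spatial_odds_ratio(toy_terms)
```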

To make inferences on the magnitude of spatial variations, we estimated a multilevel logistic model (3, 4) with individuals nested within administrative neighborhoods (refer to the Appendix). The neighborhood variance indicated the amount of variability between neighborhoods regarding substance-related disorders. Using Moran's I statistic (Appendix), we sought spatial autocorrelation in the neighborhood residuals of the model (41, 42). To investigate whether spatial correlation decreased with increasing distance, we computed Moran's I separately for those neighborhoods less than 1,000 m apart, 1,000–1,999 m apart, 2,000–2,999 m apart, and so forth.
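The distance-band computation of Moran's I can be sketched as follows. This is a minimal illustration, assuming binary weights (a pair of neighborhoods counts only if the distance between their centroids falls inside the band); the weighting actually used is described in the Appendix and may differ. Function and variable names are hypothetical.

```python
import math


def morans_i(values, coords, d_min, d_max):
    """Moran's I restricted to pairs whose distance lies in [d_min, d_max).

    `values` are neighborhood residuals and `coords` their (x, y)
    centroids in meters. ASSUMPTION: binary weights, 1 inside the
    distance band and 0 outside.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    cross = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist = math.hypot(
                coords[i][0] - coords[j][0], coords[i][1] - coords[j][1]
            )
            if d_min <= dist < d_max:
                cross += dev[i] * dev[j]
                weight_sum += 1.0
    if weight_sum == 0.0 or denom == 0.0:
        return float("nan")  # no pairs in the band, or no variation
    return (n / weight_sum) * (cross / denom)


# Toy example: two tight pairs of similar neighborhoods, far from each other.
toy_coords = [(0.0, 0.0), (500.0, 0.0), (3000.0, 0.0), (3500.0, 0.0)]
toy_residuals = [2.0, 2.0, -2.0, -2.0]
near = morans_i(toy_residuals, toy_coords, 0.0, 1000.0)
far = morans_i(toy_residuals, toy_coords, 2000.0, 4000.0)
```

In the toy data, nearby neighborhoods have identical residuals (positive autocorrelation in the first band) and distant ones have opposite residuals (negative autocorrelation in the far band).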

To assess the spatial scale of variations, we estimated a hierarchical geostatistical model (33, 34) with two sets of neighborhood random effects, including the usual unstructured effects u_j of variance σ_u² and an additional set of spatially correlated random effects s_j of variance σ_s² (refer to the Appendix) (32, 35, 36). Whereas u_j takes independent values in each neighborhood and therefore captures unstructured neighborhood variations, s_j adopts more similar values for neighborhoods close to each other than for those further apart, thereby reflecting spatially organized variations. We computed the proportion of total neighborhood variance attributable to the spatially structured component of variability as σ_s² / (σ_s² + σ_u²) (36). The parameter φ, which quantifies the rate of correlation decay with increasing distance between neighborhoods, was used to assess the spatial scale of variations in mental disorders: we computed the range of spatial correlation (3/φ), defined as the distance beyond which the correlation between neighborhoods is below 5 percent, that is, beyond which neighborhood risk levels are no longer correlated (33, 34).
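The link between the decay parameter and the 5 percent threshold can be checked numerically. The sketch below assumes an exponential correlation function, exp(−φ × distance), which is consistent with a range of 3/φ because exp(−3) is approximately 0.0498; the functional form is specified in the Appendix, so it is treated here as an assumption. The function names and the toy variance values are hypothetical.

```python
import math


def spatial_correlation(distance_m, phi):
    """Correlation between two neighborhoods `distance_m` meters apart.

    ASSUMPTION: exponential decay, exp(-phi * distance).
    """
    return math.exp(-phi * distance_m)


def correlation_range(phi):
    """Distance at which the assumed correlation drops to exp(-3), just under 5%."""
    return 3.0 / phi


def structured_share(var_structured, var_unstructured):
    """Share of total neighborhood variance that is spatially structured."""
    return var_structured / (var_structured + var_unstructured)


# Decay parameter implied by the 3,424 m range reported in the Results.
phi = 3.0 / 3424.0
corr_at_range = spatial_correlation(correlation_range(phi), phi)
corr_at_904m = spatial_correlation(904.0, phi)  # median distance between adjacent centroids

# Toy variances (not the study's estimates) giving a 92% structured share.
share = structured_share(0.92, 0.08)
```

Under this assumed form, neighborhoods separated by the median distance between adjacent centroids (904 m) would still be substantially correlated, which illustrates why a range of several kilometers points to variations on a broader scale than the neighborhood.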

As detailed in the Appendix, we performed a simulation to verify that the hierarchical geostatistical model was able to disentangle spatially structured from unstructured neighborhood variations. We disorganized the spatial structure of the data without modifying the multilevel structure (spatial connections between neighborhoods were modified, but the same individuals were still grouped together within neighborhoods) and examined the resulting changes in neighborhood variance parameters. We found that the model was able to distinguish between spatially structured and unstructured neighborhood variations but that the proportion of spatially structured variations needs to be interpreted jointly with the spatial range of correlation 3/φ.
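One simple way to disorganize the spatial structure while keeping the multilevel structure intact is to permute centroid coordinates among neighborhoods. The sketch below shows that idea only; the procedure actually used is described in the Appendix and may differ, and the function name, the dictionary layout, and the toy coordinates are hypothetical.

```python
import random


def shuffle_neighborhood_locations(centroids, seed=0):
    """Reassign centroid coordinates at random among neighborhoods.

    `centroids` maps a neighborhood identifier to its (x, y) centroid.
    Individuals keep their neighborhood identifier, so the grouping of
    individuals is unchanged; only the spatial arrangement of the
    neighborhoods is scrambled. ASSUMPTION: a plain random permutation.
    """
    rng = random.Random(seed)
    ids = list(centroids)
    coords = [centroids[i] for i in ids]
    rng.shuffle(coords)
    return dict(zip(ids, coords))


# Toy centroids in meters.
toy_centroids = {
    "A": (0.0, 0.0),
    "B": (900.0, 0.0),
    "C": (0.0, 900.0),
    "D": (900.0, 900.0),
}
shuffled = shuffle_neighborhood_locations(toy_centroids, seed=1)
```

Refitting the model on such shuffled locations should shift variance from the spatially structured to the unstructured component if the model separates the two as intended.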

Multilevel and hierarchical geostatistical models were estimated with Markov chain Monte Carlo simulation (refer to the Appendix) (43). We used the deviance information criterion to compare their goodness of fit (the smaller the deviance information criterion, the better the fit of the model) (44). We could not compare them with the geoadditive model in this way because of differences in the estimation procedures. For each modeling option, we first estimated an empty model (without explanatory variables), then we introduced the individual covariates and finally the contextual variable.
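For readers unfamiliar with the deviance information criterion, the sketch below shows its usual computation from Markov chain Monte Carlo output: the posterior mean deviance plus the effective number of parameters, the latter being the mean deviance minus the deviance evaluated at the posterior mean of the parameters. The function name and all numbers are hypothetical toy values, not the study's results.

```python
def deviance_information_criterion(deviance_samples, deviance_at_posterior_mean):
    """DIC = mean deviance + effective number of parameters (pD).

    `deviance_samples` are deviance values from the MCMC iterations;
    `deviance_at_posterior_mean` is the deviance evaluated at the
    posterior means of the parameters.
    """
    mean_deviance = sum(deviance_samples) / len(deviance_samples)
    effective_params = mean_deviance - deviance_at_posterior_mean
    return mean_deviance + effective_params


# Toy deviance chains for two competing models (hypothetical numbers).
dic_multilevel = deviance_information_criterion([1010.0, 1012.0, 1014.0], 1000.0)
dic_geostatistical = deviance_information_criterion([1000.0, 1002.0, 1004.0], 990.0)
better_fit = "geostatistical" if dic_geostatistical < dic_multilevel else "multilevel"
```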


    RESULTS
In our study population, 1.45 percent of all individuals sought care for mental or behavioral disorders due to psychoactive substance use in 2001. Alcohol was involved in the diagnoses of 88 percent of those individuals, sedatives or hypnotics 9 percent, stimulants 7 percent, and opioids 6 percent. Fourteen percent of those individuals with a disorder had used multiple substances. Clinical conditions constituted a dependence syndrome for 91 percent and harmful use for 13 percent of the individuals.

When individual locational information was used, the empty geoadditive model provided a precise smoothed map of the prevalence of disorders independent of administrative boundaries. The map in figure 2 shows increased prevalence in a large area of northern Malmö, including two local subareas of particularly high prevalence. Based on the spatial smooth term for the 65,830 individuals, the interquartile spatial odds ratio was 3.96, which approximately quantifies the odds ratio between individuals in the lowest and highest quartiles of spatial risk.



FIGURE 2. Smoothed map of the prevalence of mental disorders due to psychoactive substance use (top) and associated standard errors (bottom), estimated from the empty geoadditive model for all inhabitants aged 40–59 years in Malmö, Sweden, in 2001. The quantiles used to draw the maps were derived from distributions of the spatial smooth term and the standard error for the 65,830 individuals.

 
The empty multilevel model confirmed that there were important neighborhood variations in the prevalence of substance-related disorders (table 1). The Moran's I statistics computed from the neighborhood residuals indicated that spatial correlation between neighborhoods decreased with increasing distance between them (figure 3). Positive correlation between neighborhoods disappeared when the distance between them exceeded 3,000 m.


TABLE 1. Results from the multilevel models for mental and behavioral disorders due to psychoactive substance use for all inhabitants aged 40–59 years in Malmö, Sweden, 2001

 


FIGURE 3. Moran's I statistics for neighborhood-level residuals of the multilevel models for all inhabitants aged 40–59 years in Malmö, Sweden, in 2001 computed separately for pairs of neighborhoods less than 1,000 m apart, 1,000–1,999 m apart, 2,000–2,999 m apart, and 3,000–3,999 m apart. Bars, 95% credible interval.

 
We estimated an empty hierarchical geostatistical model to assess the spatial scale of variations in substance-related disorders. The model indicated that 92 percent of neighborhood variability was spatially structured (table 2). Figure 4 displays the estimated neighborhood variations as split into spatially structured and unstructured components of variability. Regarding the spatially structured component, figure 5 (top) indicates how correlation in neighborhood prevalence decreases with increasing distance between neighborhoods. It shows that the range of spatial correlation (3/φ) was equal to 3,424 m, a distance that far exceeds the distance between adjacent administrative neighborhoods (refer to figure 1), suggesting that spatial variations were operating on a much broader scale. The deviance information criterion was 10 points lower in the empty hierarchical geostatistical model than in the multilevel model (tables 1 and 2), indicating a better fit of the model that split neighborhood variations into spatially structured and unstructured components of variability.


TABLE 2. Results from the hierarchical geostatistical models for mental and behavioral disorders due to psychoactive substance use for all inhabitants aged 40–59 years in Malmö, Sweden, 2001

 


FIGURE 4. Neighborhood-level variations in the prevalence of substance-related disorders, split into a spatially structured component (top) and an unstructured component (bottom), as estimated from the empty hierarchical geostatistical model for all inhabitants aged 40–59 years in Malmö, Sweden, in 2001. The quartiles used to draw the maps were derived from the distribution of random effects for the 65,830 individuals.

 


FIGURE 5. Spatially structured neighborhood variations with regard to substance-related disorders, estimated from the empty hierarchical geostatistical model for all inhabitants aged 40–59 years in Malmö, Sweden, in 2001: correlation (top) and covariance (bottom) in the neighborhood risk level. The asterisks (top) indicate the spatial range of correlation (3/φ).

 
Both the individual variables and the neighborhood socioeconomic level explained an important part of the spatially structured neighborhood variations in substance-related disorders, as indicated by a decrease in the neighborhood variance and Moran's I statistics in the multilevel model (table 1 and figure 3) and a decrease in the spatially structured variance and spatial range of correlation in the hierarchical geostatistical model (table 2, figure 5 (top)). In the latter model, plotting the covariance function of the spatially structured effect also illustrates this aspect (figure 5 (bottom)). In the geoadditive model, the interquartile spatial odds ratio dropped from 3.96 to 1.92 and 1.67 when the individual variables and neighborhood deprivation were successively included.

A higher prevalence of disorders was found in deprived administrative neighborhoods, after adjustment for individual factors (tables 1, 2, and 3). When the mean income was measured in spatially adaptive areas smaller than administrative neighborhoods, the strength of association between contextual deprivation and prevalence increased markedly with decreasing size of the areas considered (table 3). In geoadditive models, the risk of substance-related disorders was 1.97 times higher (95 percent confidence interval: 1.39, 2.79) in the highest versus the lowest quartiles of contextual deprivation when the contextual factor was measured in administrative neighborhoods, but it was 4.12 times higher (95 percent confidence interval: 3.01, 5.64) when the 100 nearest inhabitants aged 25 years or older were considered. The raw data showed that 38 percent of those with a substance-related disorder (362 of 956 cases) resided in the highest quartile of contextual deprivation when the contextual factor was defined in administrative neighborhoods, whereas 51 percent of the cases (n = 485) were in the highest quartile when the factor was defined in local areas comprising the 100 nearest inhabitants.


TABLE 3. Contextual effect of mean income successively measured within neighborhoods and spatially adaptive areas, estimated from geoadditive models adjusted for individual covariates for all inhabitants aged 40–59 years in Malmö, Sweden, 2001

 

    DISCUSSION
Our investigation of the spatial variations in substance-related mental disorders in Malmö, Sweden, led us to compare a spatial analytical perspective with the usual multilevel approach. We found that the notion of space used during analysis conditioned the information we could obtain on the contextual dimension of mental disorders. Building a notion of space into statistical models, we were able to show not only important neighborhood variations in substance-related disorders but also that the spatial scale of variations was much larger than that of administrative neighborhoods. By measuring contextual deprivation in small areas centered on the residences of individuals, we found that the association between deprivation and mental health operated on a very local scale.

Investigating the magnitude and scale of spatial variations in outcomes
To investigate the spatial distribution of disorders, we initially sought precise cartographic information, independent of neighborhood boundaries. Our second objective was to make inferences about the magnitude and scale of spatial variations. Beyond knowing whether the magnitude of neighborhood variations justifies including a contextual dimension in public health programs (6), it is relevant to assess the spatial scale on which programs should be coordinated. To obtain this information, we compared three modeling approaches that, building on different notions of space, provided different insights into the spatial distribution of mental disorders.

A flexible way to obtain precise cartographic information was to fit a geoadditive model (24, 25). Working with continuous space, this model was able to process the spatial coordinates of individuals to produce a smoothed map of prevalence independent of neighborhood boundaries (26, 27), a result far more precise than maps obtained by using poor locational information at the neighborhood level. However, this approach provides mainly visual information and no parameters with which to make statistical inferences about the spatial distribution of mental disorders. The only quantitative information we obtained concerned the magnitude of spatial variations, expressed on the odds ratio scale with an interquartile odds ratio based on prevalence estimates at the 13,730 individual locations.

To make inferences, we considered analytical techniques such as the multilevel model (3) and the hierarchical geostatistical model (32). Since it is computationally intractable to estimate a parametric spatial correlation structure for a considerable number of locations (refer to the Appendix), these analyses were based on the 100 neighborhood locations, thus representing a dramatic underuse of information available.

The multilevel model showed important neighborhood variations in the prevalence of substance-related disorders, a mental health outcome that may be more context dependent than others. However, using multilevel models leads one to hypothesize that spatial correlation is reducible to within-neighborhood correlation, that is, that the distribution of neighborhoods at risk in space is completely random (35). Multilevel models do not incorporate any notion of space and, as such, may be described as nonspatial: they consider the neighborhood affiliation of individuals but neglect spatial connections between neighborhoods. Therefore, measures of variation/correlation in such models provided only partial insight into the spatial distribution of disorders, allowing us to make statistical inferences on the magnitude but not the spatial scale of variations.

To obtain this additional information, we used the hierarchical geostatistical model (33, 34). Georeferencing individuals at the administrative neighborhood level, our specific model splits neighborhood variability into a spatially organized component and an unstructured component (32, 35, 36). We found it more informative to use a geostatistical rather than a lattice formulation (45) of the spatial correlation structure (34): defining the correlation between neighborhoods as a decreasing function of the spatial distance between them enabled us to estimate the spatial range of correlation (34).

First, the hierarchical geostatistical model is of heuristic interest. Since many contextual factors have a strong spatial structure, disentangling spatially structured variability from other more chaotic sources of neighborhood variation may allow researchers to generate hypotheses on contextual mechanisms (35). Following recommendations in the literature (46), we compared the spatially structured variations in mental health (figure 4) with the geographic distribution of neighborhood income (figure 1) to gain preliminary insight into the association between deprivation and mental disorders.

Second, rather than being a nuisance parameter, the parameter φ for correlation decay allowed us to make inferences about the scale of spatial variations, showing that variations in substance-related disorders occurred on a larger scale than that of the neighborhood. As a public health implication, coordinating interventions between administrative neighborhoods close to each other may be an effective strategy. If recent developments in local regression techniques are used (47), one possible analytical refinement may consist of moving from a global to a local perspective in which the spatial autocorrelation parameter could vary over space.

Measuring contextual factors across continuous space around residences of individuals
In many instances, relying on administrative boundaries to define contextual factors may be restrictive. We found a much stronger relation between contextual deprivation and prevalence of disorders when we measured the factor in local areas of smaller size than administrative neighborhoods. Therefore, this association may operate on a more local scale than the neighborhood scale commonly used in contextual studies.

Contextual income was measured within spatially adaptive areas, that is, circles of variable width and fixed population size centered on residences of individuals. Using these areas appeared to be the only way to investigate whether contextual deprivation operated on a local scale, since measuring contextual income within areas having a small, fixed radius results in missing values or unreliable measurements in sparsely populated areas (39, 40). Theoretically, this approach that considers surrounding population rather than surrounding space may be particularly appropriate when considering contextual factors aggregating individual characteristics (e.g., income) rather than features of the physical environment.
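The idea of a spatially adaptive area can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the function name `adaptive_area_mean`, the brute-force neighbor search, and the toy coordinates and incomes are all assumptions made for illustration. For each individual, the sketch averages income over the k nearest residents (the individual included) and records the radius of the resulting circle.

```python
import math

def adaptive_area_mean(coords, income, k):
    """Mean income and circle radius for the k nearest residents of each individual.

    Hypothetical helper: the circle has a fixed population size (k) and a
    variable width, so it is narrow in dense areas and wide in sparse ones.
    """
    if not 1 <= k <= len(coords):
        raise ValueError("k must be between 1 and the number of individuals")
    results = []
    for (x0, y0) in coords:
        # Brute-force distances: fine for a sketch, too slow for a full population.
        dists = sorted(
            (math.hypot(x - x0, y - y0), idx)
            for idx, (x, y) in enumerate(coords)
        )
        nearest = dists[:k]
        mean_income = sum(income[idx] for _, idx in nearest) / k
        radius = nearest[-1][0]  # distance to the k-th nearest resident
        results.append((mean_income, radius))
    return results

# Toy data (invented): three residents close together, one isolated.
coords = [(0, 0), (10, 0), (20, 0), (1000, 0)]
income = [100.0, 200.0, 300.0, 400.0]
areas = adaptive_area_mean(coords, income, k=3)
```

In this toy example the isolated resident still receives a measurement based on three people, but over a much wider circle, which is the behavior that a small fixed radius cannot provide.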

In our cross-sectional study, causal mechanisms for the association between contextual deprivation and substance-related disorders may operate in both directions. On the one hand, although not yet definitely confirmed by quantitative studies, selective migration processes may contribute to the clustering of substance-related disorders in the most-deprived neighborhoods (15). On the other, deprivation may have an independent negative impact on mental well-being (14). Despite such uncertainty, an issue that has yet to be addressed in a longitudinal study, our findings show that interventions focused on individuals with substance-related disorders may be particularly useful in hot spots of contextual deprivation identified on a smaller scale than that of administrative neighborhoods.

In conclusion, beyond important geographic variations in the prevalence of substance-related disorders, our spatial analytical perspective showed that the spatial scale of variations was much larger than that of administrative neighborhoods. However, apart from such large-scale variations due to the clustering of poor socioeconomic circumstances in northern Malmö, we also found more local variations in prevalence that were attributable to differences in the intensity of deprivation.

We are aware that multilevel models may be appropriate when the context is defined in a way that is not strictly geographic (e.g., as hospitals (48), workplaces, or schools) or when spatial correlation can be reduced to within-area correlation. Similarly, it may be adequate to measure contextual factors within administrative boundaries when investigating effects operating on those scales (e.g., in relation to public policies). However, in many neighborhood studies, a deeper understanding of spatial variations in health outcomes may be gained by building notions of space into statistical models and measuring contextual factors across continuous space.


    APPENDIX
 
Geoadditive logistic model
Because of multicollinearity in the explanatory variables, generalized additive models based on nonparametric smoothers often lead to underestimation of the standard errors of parameters (49, 50). We therefore used a geoadditive model based on fully parametric smoothers. Following the work of Simon Wood (24, 25), the geoadditive logistic model was defined as $\mathrm{logit}(p_i) = \beta_0 + X_i\beta + t(x_i, y_i)$, where $p_i$ is the probability that individual $i$ has a substance-related disorder, $X_i\beta$ refers to the strictly parametric part of the linear predictor, and $t(x_i, y_i)$ is a two-dimensional smooth function of the exact spatial coordinates $(x_i, y_i)$ of individuals. The two-dimensional isotropic smooth term was defined as a thin plate regression spline, avoiding the problem of knot placement that one meets with conventional regression splines. To avoid overfitting of the spatial term, the model was estimated by using a penalized maximum likelihood approach (51), with a smoothing parameter controlling the trade-off between goodness of fit and smoothness. Geoadditive models were fitted to the data with the "mgcv" R software package (52).

To express the magnitude of spatial variations in an easily interpretable way, we propose an indicator on the odds ratio scale, the interquartile spatial odds ratio. We define it as the odds ratio between an individual located in the first quartile and one located in the fourth quartile of spatial risk, as estimated from the geoadditive model. The median odds for individuals residing in the first and last quartiles of spatial risk are equal to $\exp(\sum \beta X + t_{12.5})$ and $\exp(\sum \beta X + t_{87.5})$, where $t_{12.5}$ and $t_{87.5}$ are the 12.5th and 87.5th quantiles in the distribution of the spatial smooth term in the study population. Accordingly, the interquartile spatial odds ratio was computed as $\exp(t_{87.5} - t_{12.5})$.
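The computation of the interquartile spatial odds ratio can be sketched in a few lines of Python. This is an illustration under assumptions, not the code used for the analysis: the function name is hypothetical, the input is assumed to be one fitted value of the spatial smooth term per individual, and the "inclusive" quantile rule of the standard library is one of several possible conventions.

```python
import math
import statistics

def interquartile_spatial_odds_ratio(spatial_term):
    """Odds ratio between the medians of the last and first quartiles of spatial risk.

    `spatial_term` holds one value of the smooth term t(x, y) per individual.
    Cutting the distribution into 8 parts gives the 12.5th and 87.5th
    quantiles as the first and last cut points.
    """
    octiles = statistics.quantiles(spatial_term, n=8, method="inclusive")
    t_low, t_high = octiles[0], octiles[-1]
    return math.exp(t_high - t_low)

# Toy values of the smooth term on the logit scale (invented for illustration).
spatial_term = [i / 10 for i in range(9)]
isor = interquartile_spatial_odds_ratio(spatial_term)
```

Because the parametric part of the linear predictor cancels in the ratio, only the two quantiles of the spatial term are needed.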

Multilevel logistic model
To model variations in the probability $p_{ij}$ of individual $i$ in neighborhood $j$ having a substance-related disorder, we fitted a multilevel logistic model to the data as $\mathrm{logit}(p_{ij}) = \beta_0 + X_{ij}\beta + u_j$, where $X_{ij}$ is the vector of explanatory variables and $u_j$ is a normally distributed random intercept of variance $\sigma_u^2$ (3, 4). The $u_j$ for the different neighborhoods are independent of one another.

To assess spatial autocorrelation in the neighborhood residuals, we computed Moran's I statistic (41, 42):

$$I = \frac{N}{S_0} \cdot \frac{\sum_{i=1}^{N} \sum_{j \neq i} w_{ij} u_i u_j}{\sum_{i=1}^{N} u_i^2},$$

where $N$ is the number of neighborhoods ($N = 100$), $w_{ij}$ is a weight related to the spatial relation between neighborhoods $i$ and $j$ (see below), $u_i$ and $u_j$ are the residuals for neighborhoods $i$ and $j$ (with a mean equal to 0), and $S_0$ is the sum of the weights $w_{ij}$:

$$S_0 = \sum_{i=1}^{N} \sum_{j \neq i} w_{ij}.$$

We computed Moran's I statistic separately for neighborhoods less than 1,000 m apart (with wij equal to 1 for pairs of neighborhoods closer than 1,000 m, otherwise 0), those 1,000–1,999 m apart (with wij equal to 1 for only those pairs of neighborhoods 1,000–1,999 m apart), those 2,000–2,999 m apart, and so forth. Distances between neighborhoods were based on neighborhood population-weighted centroids (determined by using the coordinates of individuals). Since our aim was solely to compare the Moran's I's at these different distance ranges, it would have been less appropriate to define wij as a function of the distance between neighborhoods.

Multilevel models were estimated with Markov chain Monte Carlo simulation (see below) (43). In this Bayesian perspective, we do not need to make specific assumptions to obtain the standard error of the Moran's I statistic (42): by computing Moran's I for each set of sampled values of the neighborhood residuals, we obtain its posterior distribution and report the median, as well as the 2.5th and 97.5th quantiles, to construct a 95 percent credible interval. In the absence of spatial autocorrelation, the Moran's I statistic has a small negative expectation when applied to regression residuals (42, 53). In comparing the 95 percent credible interval with the value 0, we have therefore applied a conservative test.
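The following Python sketch illustrates the two steps described above: Moran's I with binary weights for one distance band, and its posterior summary across sampled residuals. It is a simplified illustration, not the original implementation. The function names, the toy centroids, and the toy "draws" are assumptions; real draws would come from the Markov chain Monte Carlo output, and the residuals are centered here as a precaution.

```python
import math
import statistics

def band_weights(centroids, lower, upper):
    """Binary weights: 1 when two centroids are lower <= d < upper meters apart."""
    n = len(centroids)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and lower <= math.dist(centroids[i], centroids[j]) < upper:
                w[i][j] = 1.0
    return w

def morans_i(residuals, w):
    """Moran's I = (N / S0) * sum_ij w_ij u_i u_j / sum_i u_i^2."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [u - mean for u in residuals]  # centering (no effect if the mean is 0)
    s0 = sum(sum(row) for row in w)
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    denom = sum(d * d for d in dev)
    if s0 == 0 or denom == 0:
        raise ValueError("no neighbor pairs in this band, or constant residuals")
    return (n / s0) * cross / denom

def posterior_summary(draws, w):
    """Median and 2.5th/97.5th quantiles of Moran's I over sampled residual sets."""
    values = [morans_i(draw, w) for draw in draws]
    cuts = statistics.quantiles(values, n=40, method="inclusive")
    return statistics.median(values), cuts[0], cuts[-1]

# Toy example (invented): four centroids on a line, 500 m apart.
centroids = [(0, 0), (500, 0), (1000, 0), (1500, 0)]
w_near = band_weights(centroids, 0, 1000)
draws = [[1, 1, -1, -1]] * 3 + [[1, -1, 1, -1]] * 2
median_i, lower_i, upper_i = posterior_summary(draws, w_near)
```

Repeating the call with `band_weights(centroids, 1000, 2000)`, and so on, would give one posterior summary per distance range, which is the comparison the text describes.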

Hierarchical geostatistical logistic model
We used a logistic model including independent neighborhood random effects $u_j$ and neighborhood spatially correlated random effects $s_j$ (32, 35, 36). For an individual $i$ in neighborhood $j$, the model was defined as $\mathrm{logit}(p_{ij}) = \beta_0 + X_{ij}\beta + u_j + s_j$. The $u_j$ are mutually independent and Gaussian, with mean 0 and variance $\sigma_u^2$. Let $S = (s_1, s_2, \ldots, s_{100})$ be the vector of spatial effects for the 100 neighborhoods. The distribution of $S$ is expressed as $S \sim N(0, V)$, with $V_{kl}$ defined as a parametric function of the distance $d_{kl}$ in meters between the population-weighted centroids of neighborhoods $k$ and $l$. We assumed an isotropic spatial process (in which spatial correlation does not depend on direction). $V_{kl}$ was defined as $V_{kl} = \sigma_s^2 \rho_{kl}$, where $\sigma_s^2$ is the variance of the spatially structured effects, with an exponential correlation function $\rho_{kl} = \exp(-\phi d_{kl})$ (33). The spatial range of correlation (beyond which the correlation is below 5 percent) was computed as $3/\phi$. The proportion of neighborhood variance that is spatially structured was computed as $\sigma_s^2 / (\sigma_s^2 + \sigma_u^2)$.
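To make the role of the decay parameter concrete, the short Python sketch below builds the exponential covariance matrix and the two derived quantities (spatial range and spatially structured share of variance). It only illustrates the formulas; it does not fit the model. The function names, the three toy centroids, and the variance values 0.5 and 0.1 are hypothetical, and the decay parameter is simply chosen so that the range equals the 3,424 m reported for the real data.

```python
import math

def exponential_correlation(distance_m, phi):
    """Correlation between two neighborhoods distance_m meters apart."""
    return math.exp(-phi * distance_m)

def spatial_range(phi):
    """Distance beyond which the correlation falls below 5 percent (exp(-3) ~ 0.0498)."""
    return 3.0 / phi

def covariance_matrix(centroids, var_s, phi):
    """V[k][l] = var_s * exp(-phi * d_kl) for population-weighted centroids."""
    return [
        [var_s * exponential_correlation(math.dist(a, b), phi) for b in centroids]
        for a in centroids
    ]

def spatially_structured_share(var_s, var_u):
    """Proportion of neighborhood variance that is spatially structured."""
    return var_s / (var_s + var_u)

phi = 3.0 / 3424.0                       # chosen so that 3/phi = 3,424 m
centroids = [(0.0, 0.0), (1000.0, 0.0), (0.0, 3424.0)]   # toy coordinates
V = covariance_matrix(centroids, var_s=0.5, phi=phi)     # var_s is hypothetical
share = spatially_structured_share(var_s=0.5, var_u=0.1) # hypothetical variances
```

The factor 3 in the range comes from exp(-3) being just under 0.05, so two neighborhoods separated by exactly the range have a correlation of about 5 percent.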

We examined whether the hierarchical geostatistical model was really able to disentangle spatially structured variations from the neighborhood unstructured variability. In six successive simulations, we randomly selected 10, 25, 50, 75, 90, and 100 neighborhoods out of 100 and randomly assigned all individuals from each of these neighborhoods as a group to one other selected neighborhood, while no changes were made for the other nonselected neighborhoods. We therefore did not modify the multilevel structure of the data, since the same individuals were still grouped together within neighborhoods, but progressively disorganized the neighborhood spatial structure. Fitting a hierarchical geostatistical model to each data set, we observed that the proportion of neighborhood variations attributable to the spatially structured component [$\sigma_s^2 / (\sigma_s^2 + \sigma_u^2)$, where $\sigma_s^2$ and $\sigma_u^2$ are the variances of the spatially structured and unstructured neighborhood effects] decreased as the number of neighborhoods selected for random reassignment of inhabitants increased (appendix table 1). However, spatially structured variations still constituted an important part of neighborhood variability when we completely disorganized the spatial structure of the data. To explain this result, one notes that the spatial range of correlation ($3/\phi$) regularly decreased with increasing disorganization of the neighborhood spatial structure (table 4). When inhabitants were randomly reassigned among the 100 neighborhoods, the spatial range of correlation was equal to 475 m (vs. 3,424 m in the real data), indicating that the spatially structured and unstructured components of variability were no longer intrinsically different. This simulation confirms a certain ability of the hierarchical geostatistical model to disentangle spatially structured variations from unstructured neighborhood variations, and it indicates that the proportion of spatially structured variations and the spatial range of correlation $3/\phi$ need to be interpreted jointly.

APPENDIX TABLE 1. Results of the empty hierarchical geostatistical models when randomly reassigning all individuals from one neighborhood as a group to another neighborhood for 10, 25, 50, 75, 90, and 100 neighborhoods out of 100 (data on all inhabitants aged 40–59 years in Malmö, Sweden, 2001)

| Reassignment made for | $\sigma_s^2 / (\sigma_s^2 + \sigma_u^2)$* | 95% CI† | $3/\phi$ (m)‡ | 95% CI |
|---|---|---|---|---|
| 0 neighborhoods (real data) | 0.92 | 0.52, 0.99 | 3,424 | 1,097, 10,760 |
| 10 neighborhoods | 0.86 | 0.40, 0.99 | 3,285 | 859, 12,005 |
| 25 neighborhoods | 0.81 | 0.24, 0.99 | 1,189 | 365, 3,724 |
| 50 neighborhoods | 0.76 | 0.22, 0.99 | 733 | 347, 2,187 |
| 75 neighborhoods | 0.66 | 0.22, 0.99 | 689 | 345, 2,054 |
| 90 neighborhoods | 0.57 | 0.21, 0.99 | 503 | 339, 1,314 |
| 100 neighborhoods | 0.47 | 0.21, 0.99 | 475 | 338, 1,277 |

* $\sigma_s^2 / (\sigma_s^2 + \sigma_u^2)$ is the proportion of neighborhood variations that is spatially structured.

† CI, credible interval.

‡ $3/\phi$ is the spatial range of correlation, defined as the distance beyond which the correlation in risk level between neighborhoods is below 5 percent.
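
The reassignment step of this simulation can be sketched in Python as follows. This is one possible reading of the procedure, not the original code: the function name is hypothetical, and moving the selected neighborhoods along a single random cycle is an assumption adopted here because it guarantees that every selected neighborhood lands on the location of a different selected neighborhood.

```python
import random

def reassign_neighborhoods(neighborhood_ids, n_selected, rng):
    """Map each neighborhood to the neighborhood whose location it takes over.

    Nonselected neighborhoods map to themselves. Selected neighborhoods are
    shuffled and shifted by one position along a cycle, so inhabitants move
    as a group and never stay in place (assumption of this sketch).
    """
    mapping = {nid: nid for nid in neighborhood_ids}
    if n_selected < 2:
        return mapping  # a single neighborhood cannot be moved to another one
    selected = rng.sample(list(neighborhood_ids), n_selected)
    for source, target in zip(selected, selected[1:] + selected[:1]):
        mapping[source] = target
    return mapping

rng = random.Random(0)  # fixed seed so the sketch is reproducible
mapping = reassign_neighborhoods(list(range(100)), 25, rng)
moved = [nid for nid, target in mapping.items() if nid != target]
```

Because the mapping is a permutation of neighborhood locations, the grouping of individuals within neighborhoods is unchanged and only the spatial arrangement of the groups is disturbed.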

Multilevel models and hierarchical geostatistical models were estimated with a Markov chain Monte Carlo approach by using the WinBUGS version 1.4 program (43). We used noninformative uniform priors for all parameters. We ran a single chain with a burn-in period of 100,000 iterations. After convergence, we retained every 10th iteration until a sample size of 10,000 was attained. For each parameter, we report the median of the posterior distribution and provide a 95 percent credible interval.

To illustrate that the hierarchical geostatistical model cannot take into account the 13,730 different locations of individuals, we successively grouped individuals within squares of different sizes (1,000, 750, 500, 250, or 125 m on a side), resulting in data sets with a different number of locations for geocoding individuals (154, 235, 422, 1,025, and 2,612 locations, respectively). We estimated an empty hierarchical geostatistical model for each data set. The computation time for 1,000 iterations was 0.04 hour, 0.13 hour, 0.66 hour, 8.96 hours, and 145 hours (or 6 days) in these five different cases. Far more than 1,000 iterations would have been needed to fit the models. Furthermore, models including covariates require much longer computation times. Therefore, it is obvious that our hierarchical geostatistical model could not take into account the 13,730 different locations.
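
The grouping of individuals within squares can be illustrated with a small Python sketch. The function names and toy coordinates are assumptions; the sketch only shows how the number of distinct locations grows as the squares shrink, not how the models were fitted or timed.

```python
import math

def snap_to_grid(coords, side_m):
    """Assign each (x, y) coordinate in meters to a square cell of width side_m."""
    return [(math.floor(x / side_m), math.floor(y / side_m)) for x, y in coords]

def count_locations(coords, side_m):
    """Number of distinct grid cells, i.e., locations used for geocoding."""
    return len(set(snap_to_grid(coords, side_m)))

# Toy coordinates in meters (invented for illustration).
coords = [(10, 10), (120, 40), (130, 130), (900, 900)]
locations_by_side = {side: count_locations(coords, side) for side in (1000, 250, 125)}
```

Each additional location adds a row and a column to the spatial covariance matrix, which is why the reported computation times rise so steeply as the squares get smaller.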


    ACKNOWLEDGMENTS
 
This research project was supported by a grant from the French Ministry of Research (TTT027), by a grant from the French Foundation for Medical Research (Dr. Basile Chaix), by the "Avenir" program of INSERM (the French National Institute of Health and Medical Research) dedicated to Dr. Pierre Chauvin and his research team, and by a grant from the FAS (the Swedish Council for Working Life and Social Research) for the project, "Socioeconomic disparities in cardiovascular diseases—a longitudinal multilevel analysis" (principal investigator: Dr. Juan Merlo (no. 2003-0580)).

The authors express their gratitude to Thor Lithman, Dennis Noreen, and Åke Boalt from the County Council of Skåne, Sweden, for their indispensable help and support regarding this research project.

Conflict of interest: none declared.


    References
 

  1. Kawachi I, Berkman LF. Neighborhoods and health. New York, NY: Oxford University Press, 2003.
  2. Diez-Roux AV. Multilevel analysis in public health research. Annu Rev Public Health 2000;21:171–92.
  3. Leyland AH, Goldstein H. Multilevel modelling of health statistics. Chichester, England: Wiley, 2001.
  4. Snijders T, Bosker R. Multilevel analysis. An introduction to basic and advanced multilevel modelling. London, England: Sage Publications, 1999.
  5. Merlo J. Multilevel analytical approaches in social epidemiology: measures of health variation compared with traditional measures of association. J Epidemiol Community Health 2003;57:550–2.
  6. Merlo J, Chaix B, Yang M, et al. A brief conceptual tutorial on multilevel analysis in social epidemiology—linking the statistical concept of clustering to the idea of contextual phenomenon. J Epidemiol Community Health 2005 (in press).
  7. Merlo J, Yang M, Chaix B, et al. A brief conceptual tutorial on multilevel analysis in social epidemiology—investigating contextual phenomena in different groups of individuals. J Epidemiol Community Health 2005 (in press).
  8. Merlo J, Chaix B, Yang M, et al. A brief conceptual tutorial on multilevel analysis in social epidemiology—interpreting neighbourhood differences and the effects of neighbourhood characteristics on individual health. J Epidemiol Community Health 2005 (in press).
  9. Weich S, Holt G, Twigg L, et al. Geographic variation in the prevalence of common mental disorders in Britain: a multilevel investigation. Am J Epidemiol 2003;157:730–7.
  10. Duncan C, Jones K, Moon G. Psychiatric morbidity: a multilevel approach to regional variations in the UK. J Epidemiol Community Health 1995;49:290–5.
  11. Reijneveld SA, Schene AH. Higher prevalence of mental disorders in socioeconomically deprived urban areas in the Netherlands: community or personal disadvantage? J Epidemiol Community Health 1998;52:2–7.
  12. Wainwright NW, Surtees PG. Places, people, and their physical and mental functional health. J Epidemiol Community Health 2004;58:333–9.
  13. Weich S, Twigg L, Holt G, et al. Contextual risk factors for the common mental disorders in Britain: a multilevel investigation of the effects of place. J Epidemiol Community Health 2003;57:616–21.
  14. Driessen G, Gunther N, Van Os J. Shared social environment and psychiatric disorder: a multilevel analysis of individual and ecological effects. Soc Psychiatry Psychiatr Epidemiol 1998;33:606–12.
  15. Dear M. Psychiatric patients and the inner city. Ann Assoc Am Geogr 1977;67:588–94.
  16. Knorr-Held L, Rasser G. Bayesian detection of clusters and discontinuities in disease maps. Biometrics 2000;56:13–21.
  17. Sabel CE, Boyle PJ, Loytonen M, et al. Spatial clustering of amyotrophic lateral sclerosis in Finland at place of birth and place of death. Am J Epidemiol 2003;157:898–905.
  18. Fotheringham AS, Wong DWS. The modifiable areal unit problem in multivariate statistical analysis. Environ Plan A 1991;23:1025–44.
  19. Amrhein CG. Searching for the elusive aggregation effect: evidence from statistical simulations. Environ Plan A 1995;27:105–19.
  20. Holt D, Steel DG, Tranmer M. Area homogeneity and the modifiable areal unit problem. Geograph Sys 1996;3:181–200.
  21. O'Campo P. Invited commentary: Advancing theory and methods for multilevel models of residential neighborhoods and health. Am J Epidemiol 2003;157:9–13.
  22. Mitchell R. Multilevel modeling might not be the answer. Environ Plan A 2001;33:1357–60.
  23. Boyle MH, Willms JD. Place effects for areas defined by administrative boundaries. Am J Epidemiol 1999;149:577–85.
  24. Wood SN. Thin plate regression splines. J R Stat Soc B 2003;65:95–114.
  25. Wood SN. Stable and efficient multiple smoothing parameter estimation for generalized additive models. J Am Stat Assoc 2004;99:673–86.
  26. Cakmak S, Burnett RT, Jerrett M, et al. Spatial regression models for large-cohort studies linking community air pollution and health. J Toxicol Environ Health A 2003;66:1811–23.
  27. Burnett R, Ma R, Jerrett M, et al. The spatial association between community air pollution and mortality: a new method of analyzing correlated geographic cohort data. Environ Health Perspect 2001;109(suppl 3):375–80.
  28. Leyland AH, Langford IH, Rasbash J, et al. Multivariate spatial models for event data. Stat Med 2000;19:2469–78.
  29. Langford IH, Leyland AH, Rasbash J, et al. Multilevel modelling of the geographical distributions of diseases. J R Stat Soc C 1999;48:253–68.
  30. Kleinschmidt I, Sharp B, Mueller I, et al. Rise in malaria incidence rates in South Africa: a small-area spatial analysis of variation in time trends. Am J Epidemiol 2002;155:257–64.
  31. Kleinschmidt I, Sharp BL, Clarke GP, et al. Use of generalized linear mixed models in the spatial analysis of small-area malaria incidence rates in KwaZulu Natal, South Africa. Am J Epidemiol 2001;153:1213–21.
  32. Diggle P, Moyeed R, Rowlingson B, et al. Childhood malaria in The Gambia: a case-study in model-based geostatistics. J R Stat Soc C 2002;51:493–506.
  33. Gemperli A, Vounatsou P, Kleinschmidt I, et al. Spatial patterns of infant mortality in Mali: the effect of malaria endemicity. Am J Epidemiol 2004;159:64–72.
  34. Banerjee S, Wall MM, Carlin BP. Frailty modeling for spatially correlated survival data, with application to infant mortality in Minnesota. Biostatistics 2003;4:123–42.
  35. Borgoni R, Billari FC. Bayesian spatial analysis of demographic survey data: an application to contraceptive use at first sexual intercourse. (Online journal; research article 3). Demogr Res 2003;8. (http://www.demographic-research.org/volumes/vol8/3/).
  36. Banerjee S, Gelfand AE, Carlin BP. Hierarchical modeling and analysis for spatial data. Boca Raton, FL: Chapman and Hall/CRC, 2003.
  37. Geronimus AT, Bound J. Use of census-based aggregate variables to proxy for socioeconomic group: evidence from national samples. Am J Epidemiol 1998;148:475–86.
  38. Bithell JF. An application of density estimation to geographical epidemiology. Stat Med 1990;9:691–701.
  39. Talbot TO, Kulldorff M, Forand SP, et al. Evaluation of spatial filters to create smoothed maps of health data. Stat Med 2000;19:2399–408.
  40. Tiwari C, Rushton G. Using spatially adaptive filters to map late stage colorectal cancer incidence in Iowa. In: Fisher P, ed. Developments in spatial data handling. Berlin, Germany: Springer-Verlag, 2005:665–76.
  41. Walter SD. The analysis of regional patterns in health data. II. The power to detect environmental effects. Am J Epidemiol 1992;136:742–59.
  42. Congdon P. Applied Bayesian modelling. Chichester, England: Wiley, 2003.
  43. Smith AFM, Roberts GO. Bayesian computation via the Gibbs sampler and related Markov chain Monte Carlo methods. J R Stat Soc B 1993;55:3–23.
  44. Spiegelhalter DJ, Best N, Carlin BP, et al. Bayesian measures of model complexity and fit. J R Stat Soc B 2002;64:583–639.
  45. Crook AM, Knorr-Held L, Hemingway H. Measuring spatial effects in time to event data: a case study using months from angiography to coronary artery bypass graft (CABG). Stat Med 2003;22:2943–61.
  46. Jerrett M, Burnett RT, Goldberg MS, et al. Spatial analysis for environmental health research: concepts, methods, and examples. J Toxicol Environ Health A 2003;66:1783–810.
  47. Pàez A, Uchida T, Miyamoto K. A general framework for estimation and inference of geographically weighted regression models: 2. Spatial association and model specification tests. Environ Plan A 2002;34:883–904.
  48. Merlo J, Östergren PO, Broms K, et al. Survival after initial hospitalisation for heart failure: a multilevel analysis of patients in Swedish acute care hospitals. J Epidemiol Community Health 2001;55:323–9.
  49. Dominici F, McDermott A, Zeger SL, et al. On the use of generalized additive models in time-series studies of air pollution and health. Am J Epidemiol 2002;156:193–203.
  50. Ramsay TO, Burnett RT, Krewski D. The effect of concurvity in generalized additive models linking mortality to ambient particulate matter. Epidemiology 2003;14:18–23.
  51. Wood SN. Modelling and smoothing parameter estimation with multiple quadratic penalties. J R Stat Soc B 2000;62:413–28.
  52. Wood SN. Multiple smoothing parameter estimation and GAMs by GCV. (http://stat.ethz.ch/R-manual/R-patched/library/mgcv/html/00Index.html).
  53. Odland J. Spatial autocorrelation. Newbury Park, CA: Sage Publications, 1988.