1 School of Geography and Geosciences, University of St Andrews, St Andrews, United Kingdom.
2 Department of Geography, University of Helsinki, Helsinki, Finland.
3 Institute for Health Research, Lancaster University, Lancaster, United Kingdom.
4 Department of Neurology, Päijät-Häme Central Hospital, Lahti, Finland.
5 Department of Medicine, University of Helsinki, Helsinki, Finland.
Received for publication July 16, 2002; accepted for publication December 10, 2002.
ABSTRACT |
---|
amyotrophic lateral sclerosis; cluster analysis; geography
Abbreviation: ALS, amyotrophic lateral sclerosis.
INTRODUCTION |
---|
Any explanation(s) must account for ALS's worldwide distribution, relatively little reliable evidence for clustering, rarity of onset before 40 years of age, a peak incidence in those aged 55–70 years (2), and an increased relative risk among males varying, depending on the study, from about 1.5:1 to 2:1 (3). The reported annual incidence of ALS varies from 0.07 per 100,000 population in Mexico (3) to 2.6 per 100,000 in Sweden (4), and several studies report rising incidence rates (5, 6) that have been at least partially explained by better case ascertainment (7), rising life expectancies, and decreases in competing causes of death such as infectious or cardiovascular diseases (8).
Sporadic ALS is the most common form of the disease, representing 90–95 percent of observed cases worldwide. Several etiologic hypotheses have been proposed for ALS, and a wide range of potential environmental risks has been examined, including those relating to particular activities, such as welding (9), agricultural employment (10, 11), electrical occupations (12), exposure to ionizing radiation (13), and smoking (14). Exposure to specific substances, including mercury (15), aluminium (16) and, particularly, lead (17, 18), has attracted considerable attention, but results from these occupational or environmental exposure studies have so far proved inconclusive.
It has also been recognized that ALS sometimes aggregates within kin groups, and it is usually accepted that between 5 and 10 percent of ALS cases are genetically determined, usually with autosomal dominant inheritance. Recently this number for familial ALS has been questioned as an underestimate (19), due largely to inadequate recording (or awareness) of family histories. Familial ALS tends to have a lower age of onset and a relatively shorter survival time, and it does not show the usual male predominance of sporadic ALS (2). Ongoing research is attempting to identify and quantify the responsible genetic mutations, and a recent study of Scandinavian patients identified mutations in the superoxide dismutase (SOD1) gene (20).
The relative importance of genetic and environmental causes of ALS is difficult to determine in cluster analyses of the condition. There may be, for example, a hereditary component to the development of the disease in (apparently) sporadic cases. On the other hand, familial incidence may merely reflect common exposure among family members to an environmental risk factor. Therefore, as regards the present analysis, both genetic and environmental explanations may result in observed spatial clustering of ALS. Furthermore, ALS may be a multifactorial disorder involving the combination of one or more genetic mutations that predispose an individual toward ALS and external environmental factors that trigger the disease. The fact that there may be a long latency period between the initial trigger and disease onset complicates the issue of identifying appropriate environmental risks.
Prior evidence for ALS clustering was examined. Cluster identification is of considerable epidemiologic importance, particularly in diseases of unknown etiology. Unfortunately, many reported localized clusters of ALS (21, 22) are based on anecdotal observations, often involving a small number of cases perhaps within a neighborhood or family and which, upon closer examination, show no statistically significant evidence for clustering beyond what might be expected to occur by chance. Commentators have concluded that there is good evidence that ALS does not cluster in space and that its distribution is random worldwide (23).
However, there is now recent and increasing evidence for an inhomogeneous distribution of ALS, and a number of studies have attributed spatial clustering to a genetic explanation. In a study in northern Sweden, apparent clusters of sporadic ALS cases were found to be linked via the D90A mutation of the SOD1 gene (19). This led to the identification of a common founder for this group (24), thereby introducing a genetic explanation for an observation that was initially thought to have an environmental basis. In central Italy, the SOD1 gene was sequenced in three Italian families with ALS and six sporadic ALS patients (25). It was found that all familial members and one of the six apparently sporadic cases carried the L84F mutation of the SOD1 gene, with the conclusion that the mutation was largely responsible for ALS clustering in the area.
Other studies have suggested a link between ALS clustering and environmental causes. An unusually high incidence form, known as the Western Pacific form, has been observed on Guam (26), the Kii Peninsula, Japan (27), Irian Jaya, and Groote Eylandt, Australia (28). These geographically localized clusters gave rise to the hope that an environmental risk factor may be identified. In fact, recent evidence suggests a familial clustering of the disease on Guam (29). Various environmental hypotheses have been tested in relation to the Western Pacific form, including exceptional exposure to toxins such as cycasin and the nonprotein amino acid β-N-methyl-amino-L-alanine, which are both found in the cycad nut, a traditional dietary staple on the island (1).
In northwest England, researchers investigated potential clustering of ALS at the place of onset among 173 cases between 1976 and 1986 by census wards (30). A significant excess was observed in two wards, although the authors suggest that this may be no more than expected by chance processes. Using a second, smaller data set of 112 cases, the same team examined data on all past residential locations for motor neuron disease patients, again in northwest England (31). Aggregating the point data to postal districts, the authors found that two districts in particular had an apparent excess of motor neuron disease residences. Again, the number of cases upon which this result was based was small.
In Skaraborg County, Sweden, 168 cases of motor neuron disease were identified between 1961 and 1990 (32). An epidemic-like cluster among male cases was identified, where standardized risk ratios were significantly elevated in six contiguous municipalities in the 1973–1984 period.
In Wisconsin, the death places of 529 ALS cases between 1973 and 1982 were considered (33). Three adjoining (rural) counties in northeast Wisconsin were identified that had significantly higher rates than might be expected. However, this analysis was based on only 15 cases in these three counties. Unlike most studies that consider place of death, a recent study of motor neuron disease in the United States focused on the state of birth (34). A significant northwest (high) to southeast (low) gradient was found, perhaps suggesting the importance of the early childhood environment or genetic history for the disease.
Of relevance to this investigation, a number of studies in Finland have reported an excess of cases in certain areas. The birthplaces of 82 ALS cases alive throughout Finland in January 1973 were examined, with an excess found in the southeast counties of Mikkeli, Kymi, and Pohjois-Karjala (35). Taking a random sample of 31 Finnish patients alive in January 1973, investigators traced the birthplaces of cases, together with those of their parents and grandparents, but no visual evidence of clustering was found (36). A further study investigated the effect of birthplace on the development of ALS (37). In comparing two cohorts (Finnish war evacuees and the local nonevacuated population), the authors found a statistically significant difference in ALS prevalence: 18 per 100,000 among the evacuees versus 8.8 per 100,000 among the nonevacuees. Whether the potentially stressful act of migrating could have been a confounding factor could not be established in the study. An alternative approach, using the same data as in this study, examined past places of residence between 1965 and 1995 to produce surfaces of excess disease over time (38). Significant areas of elevated risk among ALS cases were revealed, most notably in the southeast of Finland but additionally on the coast south of Vaasa and southeast of Oulu.
To summarize, there is some evidence of spatial clustering of ALS at different scales in a number of countries. However, a number of methodological problems remain, including the reliability of case ascertainment and the small number of cases upon which apparent clusters have been based. The analysis we present here extends the work on ALS clustering using a large data set collected throughout Finland.
MATERIALS AND METHODS |
---|
The geographic study region consists of the whole of Finland, including for the birthplace analysis those areas ceded to Russia after 1944. The retrospective data set was provided by Statistics Finland, who collated information from their death certificate, longitudinal central population, and building registers. Cases were selected when either the primary or secondary cause of death was recorded on the death certificate with code 348.0 until the end of 1986 or with code 3352A from 1987 onward from the International Classification of Diseases, Revision 8 or Revision 9, respectively. Each case was then linked via its unique Social Security number to the central population and building registers to obtain its place of birth. The data set consists of 1,000 ALS patients who died and were recorded in the Finnish death certificate registers between June 1985 and December 1995. Of these 1,000 cases, two lived outside Finland at the time of death, and five were born outside Finland and hence were excluded from the analysis. The relatively high mortality rate (1.91 per 100,000 over the whole study period) (5) and the overall quality of diagnosis, data collection, and recording in Finland suggest that the potential problem of case ascertainment is unlikely to be serious.
The data were then aggregated into the Finnish municipalities. The national boundary of Finland has been altered since World War II, because of the ceding of some areas of Karelia to Russia in 1944. In the analyses described here, the 499 municipalities that existed in the early 20th century were recreated and then used for the subsequent analyses of place of birth. The place of death analysis comprised the 452 municipalities of Finland that existed in the mid 1990s and excluded those parts of Finland that were ceded to Russia. The statistical method required that we specify the geographic position of each municipality, and we used the geometric, but not population-weighted, centroid of each municipality as this marker. In a small number of cases, where municipalities consisted of a number of islands, we visually adjusted the centroids to better reflect reality.
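As an illustration of the geometric (area-weighted, not population-weighted) centroid mentioned above, the following is a minimal sketch using the standard shoelace formula. The function name `polygon_centroid` and the representation of a municipality boundary as a list of (x, y) vertices are assumptions made for this example; the source does not describe the software used to derive the centroids, and the visual adjustment applied to island municipalities is not reproduced here.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_centroid(vertices: List[Point]) -> Point:
    """Area-weighted centroid of a simple (non-self-intersecting) polygon.

    Hypothetical helper: vertices are (x, y) pairs in a projected
    coordinate system, listed in order around the boundary.
    """
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0.0:
        raise ValueError("degenerate polygon with zero area")
    # Centroid = sum / (6 * area), and 6 * area == 3 * twice_area.
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)
```

A population-weighted centroid would instead average the coordinates of settlements weighted by their inhabitants, which requires data below the municipality level.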
Background population at risk figures, by municipality, for both 1920 (mean case year of birth) and 1990 (mean case year of death), were also obtained from Statistics Finland. The total population of Finland in 1920 was 3,377,432 and in 1990 was 4,998,428. We would have preferred to standardize for age/sex groups, but the data were not available at the municipality level for calculating age/sex background populations in the 1920s. Ideally, we would also have liked to examine the intervening period between birth and death, but unfortunately case migrations and population at risk data were not uniformly available. To allow for comparability between the place of birth and place of death analyses, we did not standardize for age/sex in the death data.
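Because no age/sex standardization was applied, the number of cases expected in each municipality under a uniform risk is simply proportional to its share of the background population. The sketch below shows that calculation; the function name `expected_cases` and the example municipality labels are hypothetical and do not come from the study data.

```python
from typing import Dict


def expected_cases(populations: Dict[str, float], total_cases: int) -> Dict[str, float]:
    """Crude expected case counts: total cases split by population share.

    Assumes a single uniform rate for the whole country, i.e. no
    adjustment for age or sex.
    """
    total_pop = sum(populations.values())
    if total_pop <= 0:
        raise ValueError("total population must be positive")
    return {
        name: total_cases * pop / total_pop
        for name, pop in populations.items()
    }


# Hypothetical example with made-up municipalities "A" and "B".
example = expected_cases({"A": 1000, "B": 3000}, total_cases=8)
```

By construction the expected counts sum to the observed total, so any excess in one area is balanced by a deficit elsewhere.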
Statistical methodology
Rather than attempting to test preconceived hypotheses by identifying specific localized clusters perhaps based on a small number of cases, an approach that has often been criticized (40), in this analysis we attempt to identify wider geographic clustering using a large, high quality population-based register. The absence of any consensus as to the etiology of ALS makes such an exploratory methodology particularly appropriate. An additional advantage of this "surveillance" approach across the whole of Finland is that, unlike some other studies, we cannot be accused of shrinking our study region boundaries to obtain a significant result (41).
To test for the presence of such clusters, we used the spatial scan statistic (SaTScan; Statistical Research and Applications Branch, Surveillance Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, Maryland) that has been used for a number of different diseases, including potential clusters of soft-tissue sarcoma and non-Hodgkin's lymphoma around a solid waste incinerator in France (42), childhood leukemia in Sweden (43), and breast cancer in the United States (44). This statistical approach can detect clusters of any size located anywhere in the study region, and it is not restricted to clusters that conform to predefined administrative or political borders (43–46). The method imposes a circular scanning window on the map and lets the center of the circle move over the study area so that at each position the window includes different sets of neighboring administrative areas. For each circle centroid, the radius varies continuously from zero to a previously user-defined maximum, based on the denominator population. Although the choice of maximum cluster circle size is somewhat arbitrary and there are no clear guidelines for its choice, it is important to make the choice of maximum cluster size a priori to avoid the problems of multiple hypothesis testing. In our analysis, we used 20 percent (or approximately one million people) of the total study population as the maximum to consider. By picking 20 percent as the maximum, we are evaluating all sizes from zero up to 20 percent, which means that rather than selecting a single arbitrary size we are evaluating numerous sizes and adjusting for the multiple testing related to each of them. Because most people in Finland live in the south, had we, for example, chosen a larger maximum such as 50 percent, it is possible that the resulting cluster could have covered as much as 90 percent of Finland.
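The scanning-window construction described above can be sketched as follows. This is an illustrative simplification, not SaTScan's implementation: the function name `candidate_windows` and the dictionary inputs are assumptions for the example. Although the radius varies continuously, the set of municipalities inside a circle changes only when the circle reaches another centroid, so it is enough to add municipalities one at a time in order of distance until the population cap (20 percent in the analysis) would be exceeded.

```python
import math
from typing import Dict, Iterator, Tuple

Point = Tuple[float, float]


def candidate_windows(
    centroids: Dict[str, Point],
    populations: Dict[str, float],
    max_fraction: float = 0.2,
) -> Iterator[Tuple[str, Tuple[str, ...]]]:
    """Yield (center, members) for every circular window under the cap.

    Hypothetical sketch: ties in distance are not treated specially,
    and distances are plain Euclidean distances between centroids.
    """
    limit = max_fraction * sum(populations.values())
    for center, (cx, cy) in centroids.items():
        # Order all municipalities by distance from the current center.
        by_distance = sorted(
            centroids,
            key=lambda name: math.hypot(
                centroids[name][0] - cx, centroids[name][1] - cy
            ),
        )
        members = []
        pop_inside = 0.0
        for name in by_distance:
            pop_inside += populations[name]
            if pop_inside > limit:
                break  # the window would exceed the population cap
            members.append(name)
            yield center, tuple(members)
```

Each yielded window is a candidate cluster; a municipality whose own population exceeds the cap produces no window centered on it.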
One criticism of the test, as implemented (47), is that it is appropriate for identifying clusters that are approximately circular but is unlikely to detect linear clusters that perhaps follow rivers or overhead power lines. If one does not know, a priori, what shape a cluster might take, the test will impose a circular one regardless. Kulldorff et al. (47) argue, however, that it is not the exact borders of the cluster that one is most interested in, but rather the general area upon which it is focused. In addition, unlike some other techniques such as the Geographical Analysis Machine (48), SaTScan takes into account the problem of multiple hypothesis testing and is capable of adjusting for heterogeneous background population densities.
For each location and size of the scanning window, the null hypothesis was that the risk of ALS was the same in all windows, with the alternative hypothesis being that there was an elevated rate within the window compared with outside it. The test statistic adopted is the likelihood ratio, which is maximized over all the windows to identify the most likely disease cluster (45, 46). The likelihood ratio for this window is noted and constitutes the maximum likelihood ratio test statistic. Its distribution under the null hypothesis and its corresponding simulated p value were obtained by using a Monte Carlo procedure to generate 99,999 random permutations of the data set. The maximum likelihood ratio test statistic was calculated for each of these random replications as well; the rank of the statistic from the real data set among the simulated values determined its p value, and clusters were deemed significant when this rank fell within the highest 5 percent (that is, at the 0.05 level).
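The testing procedure can be sketched in a few lines. The code below is a simplified illustration and not SaTScan itself: it assumes a Poisson model with the commonly cited log-likelihood ratio for a window with c observed and e expected cases out of C in total, namely c ln(c/e) + (C − c) ln((C − c)/(C − e)) when c > e and zero otherwise. The function names, the way null data sets are simulated (cases allocated at random in proportion to population), and the default of 999 replications are all assumptions chosen for the example; the analysis described here used 99,999.

```python
import math
import random
from typing import Dict, Sequence, Tuple


def poisson_llr(c: float, e: float, total: float) -> float:
    """Log-likelihood ratio for one window; zero unless there is an excess."""
    if c <= e:
        return 0.0
    llr = c * math.log(c / e)
    if total > c:  # skip the outside term when every case is inside
        llr += (total - c) * math.log((total - c) / (total - e))
    return llr


def max_llr(cases: Dict[str, int], populations: Dict[str, float],
            windows: Sequence[Tuple[str, ...]]) -> float:
    """Largest log-likelihood ratio over all candidate windows."""
    total_cases = sum(cases.values())
    total_pop = sum(populations.values())
    best = 0.0
    for members in windows:
        c = sum(cases[m] for m in members)
        e = total_cases * sum(populations[m] for m in members) / total_pop
        best = max(best, poisson_llr(c, e, total_cases))
    return best


def monte_carlo_p(cases: Dict[str, int], populations: Dict[str, float],
                  windows: Sequence[Tuple[str, ...]],
                  n_sim: int = 999, seed: int = 0) -> float:
    """Rank-based p value of the observed maximum against simulated maxima."""
    rng = random.Random(seed)
    names = list(populations)
    weights = [populations[n] for n in names]
    total_cases = sum(cases.values())
    observed = max_llr(cases, populations, windows)
    at_least = 0
    for _ in range(n_sim):
        # Null data set: each case lands in an area with probability
        # proportional to that area's population.
        sim = dict.fromkeys(names, 0)
        for name in rng.choices(names, weights=weights, k=total_cases):
            sim[name] += 1
        if max_llr(sim, populations, windows) >= observed:
            at_least += 1
    return (at_least + 1) / (n_sim + 1)
```

Because the maximum is taken over all windows in both the real and the simulated data, the resulting p value already accounts for the many locations and sizes examined.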
RESULTS |
---|
DISCUSSION |
---|
The results are compelling given the reliability of the Finnish data, the number of cases included in the analysis, the fine geographic resolution of the study, and the use of a modern statistical approach, the spatial scan statistic, for cluster detection. This may be the first convincing evidence for clustering of apparently sporadic ALS found to date that has been based on more than a handful of cases.
The identification of significant ALS clusters, at both places of birth and death, is etiologically important. The unknown etiology of the disease and the fact that there may be a long latency period between contact with environmental risks and onset mean that the search for clustering may be as important at the time of birth as it is at the time of death. Because the place of birth cluster had a higher log-likelihood ratio than either place of death cluster, it is tempting to suggest that we should be more interested in potential causal factors measured around the time of birth. However, the log-likelihood ratios cannot be directly compared across the place of birth and place of death data sets, because they involve different numbers of cases and the definitions of the geographic regions were slightly different.
Even so, one striking result was the consistency of the place of birth and place of death clusters. A similar cluster was identified in southeast Finland at the time of death, based on 227 cases, as that identified at the place of birth, based on 120 cases (note that the latter included parts of present-day Russian Karelia). This broad area of Finland would apparently be a useful focus for studies interested in identifying potential environmental triggers for ALS. On the other hand, a second statistically significant cluster, based on 229 cases, was identified in the death place analysis in south-central Finland that was not identified in the birthplace analysis.
These results could be consistent with either a genetic or environmental hypothesis. Finland has a distinctive history of population settlement from an original small founding population, subsequent isolation, and then rapid expansion. This has caused major genetic bottlenecks and the enrichment of rare genes. Consequently, a number of dominant and, in particular, recessive disorders exist in higher frequency in Finland than elsewhere in the world (49).
There are thus reasons to suspect that genetic factors may be at play in this present study, at least for the consistent cluster that was identified in both the place of death and place of birth analyses. The relative stability of the Finnish population after its initial settlement, from the early 17th century to the 1950s (50), may suggest that there could be a genetically susceptible subpopulation in southeast Finland and Karelia. Indeed, the same area that we identified as the cluster center at the time of birth has also been highlighted as the birthplace of parents of cases in a recent study of idiopathic pulmonary fibrosis that cited a founder effect as the potential cause of the rare clustering (51).
Of course, it remains entirely possible that environmental factors could also explain the clusters. Numerous studies suggest that environmental pollution has been a considerable problem along the Finland/Russia border, particularly in the southeast. A heavy pollution load in the forests of southern Finland has been identified (52), and there is evidence of deforestation in the area linked via acid rain to sulfur emissions in Russia (53). In Finnish lakes, long-range transported air pollution has been implicated in the high levels of the heavy metals lead, cadmium, and zinc (54), and high levels of mercury, cadmium, lead, copper, nickel, zinc, and iron have been detected in animals shot in Karelia from 1989 to 1991 (55).
Determining whether genetics or environmental pollution is a potential cause will be the subject of further investigation; however, comprehensive DNA profiling of individuals and families from the areas identified in this study might be the only way that this could be tested satisfactorily.
This study is not without its limitations. The birthplace cluster, while lying wholly within the pre-World War II boundaries of Finland, straddles the present day border with Russia. It has not been possible to trace ALS births or deaths that have occurred within former Finnish Karelia after 1944. However, because of the forced migration of the whole of the former Finnish Karelian population from Finland in 1944, we are confident that we have captured practically all births originating in former Finnish Karelia pre-1945. Additionally, it would have been preferable to standardize our raw rates by age and sex, but it was not possible to obtain age and sex group denominator populations for 1920, the approximate mean year of birth. To enable comparisons between the place of birth and place of death maps, we therefore used crude counts for both analyses. Differential internal migration by age and sex between cases and the background population could theoretically confound the place of death analysis; however, this process cannot operate at the place of birth, which is perhaps the most revealing result from this analysis.
An additional potential source of bias is differential emigration from Finland. We feel this is not a significant issue, largely because people who left prior to the 1990s came from all socioeconomic groups and from all areas of the country, largely in proportion to the underlying demographics of the municipalities. Finally, we used Finnish municipalities as the basis for our clustering work, and their small average size makes them appropriate for such a study. However, it is possible that the use of different spatial units would have provided different results, a predicament commonly referred to as the "modifiable areal unit problem" and one that is difficult to solve (41).
ACKNOWLEDGMENTS |
---|
NOTES |
---|
REFERENCES |
---|