a University of Sheffield, UK.
b Institut Municipal d'Investigacio Medica, Barcelona, Spain.
Reprint requests to: MJ Campbell, Community Sciences Centre, Northern General Hospital, Sheffield S5 7AU, UK. E-mail: m.j.campbell{at}sheffield.ac.uk
Abstract |
---|
Methods Models were fitted to daily levels of black smoke, nitrogen dioxide, sulphur dioxide, ozone and mortality from Barcelona 1986–1995 to account for seasonality, environmental temperature, days of the week and influenza epidemics. Cross-correlations of the residuals were plotted for different lags.
Results Clear evidence of a temporal relationship between mortality and air pollution was found for all four pollutants, in that changes in the pollutant preceded changes in mortality, implying causality. However, the pattern of dependence was different for each pollutant.
Conclusion The cross-correlation plot is a useful tool in the analysis of air pollution time series.
Keywords Cross-correlation, time series analysis, air pollution, mortality, causality
Accepted 29 September 1999
Introduction |
---|
The traditional approach to causality would be to calculate a series of correlation coefficients between the two series, shifted by different numbers of lags. The problem with calculating correlations between the raw mortality and pollution data is that, since the data are time series, any correlation may simply reflect their mutual dependence on some time-related variable, such as a trend or seasonality. Tests of significance for the correlation coefficient assume that the data within a series are independent, which is clearly invalid for the raw data. Bowie and Prothero suggested fitting separate models to both series to account for as many as possible of the factors known to jointly affect mortality and air pollution data.5 They used Box-Jenkins models to remove the serial correlation, a process known as prefiltering. The residuals from the mortality series can then be correlated with the residuals from the pollution series, using different lags between the series. The residuals from these models will be serially uncorrelated and so give valid significance levels for the correlation coefficients. In addition, we can assume that any significant correlation has not been caused by an intermediate variable, because these effects will have been removed by the model.
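The prefilter-then-correlate idea can be illustrated with a minimal sketch on simulated data. It is not the model used in this paper or by Bowie and Prothero: as a simplifying assumption, a harmonic regression plus a least-squares AR(1) fit stands in for the Box-Jenkins step, and the series, coefficients and function names (`prewhiten`, `cross_correlation`) are hypothetical.

```python
import numpy as np

def prewhiten(series, period=365.25, ar_order=1):
    """Remove trend and an annual cycle, then an AR fit, all by least squares.

    Assumption: this stands in for the Box-Jenkins prefiltering step.
    """
    n = len(series)
    t = np.arange(n)
    # Design matrix: intercept, linear trend, one annual harmonic.
    X = np.column_stack([
        np.ones(n), t,
        np.sin(2 * np.pi * t / period),
        np.cos(2 * np.pi * t / period),
    ])
    beta, *_ = np.linalg.lstsq(X, series, rcond=None)
    detrended = series - X @ beta
    # Regress each value on its previous ar_order values.
    lagged = np.column_stack(
        [detrended[ar_order - k - 1 : n - k - 1] for k in range(ar_order)]
    )
    target = detrended[ar_order:]
    phi, *_ = np.linalg.lstsq(lagged, target, rcond=None)
    return target - lagged @ phi

def cross_correlation(x, y, max_lag):
    """Correlation of x[t] with y[t + lag]; a positive lag means x leads y."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    n = len(x)
    out = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            out[lag] = float(np.sum(x[: n - lag] * y[lag:]) / n)
        else:
            out[lag] = float(np.sum(x[-lag:] * y[: n + lag]) / n)
    return out

# Simulated (hypothetical) data: both series share a seasonal cycle, and
# mortality also responds to the previous day's pollution fluctuation.
rng = np.random.default_rng(0)
n = 3650
t = np.arange(n)
season = np.cos(2 * np.pi * t / 365.25)
pollution = 50 + 10 * season + rng.normal(0, 5, n)
mortality = 40 + 8 * season + rng.normal(0, 3, n)
mortality[1:] += 0.3 * (pollution[:-1] - 50 - 10 * season[:-1])

res_pollution = prewhiten(pollution)
res_mortality = prewhiten(mortality)
ccf = cross_correlation(res_pollution, res_mortality, max_lag=5)
band = 2 / np.sqrt(len(res_pollution))  # approximate 2 standard error limit

for lag, r in ccf.items():
    flag = "*" if abs(r) > band else ""
    print(f"lag {lag:+d}: {r:+.3f} {flag}")
```

In this simulation the shared seasonal cycle is removed before correlating, so the only large coefficient should appear at lag +1, where pollution leads mortality by one day.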
As an example, in an earlier report we examined the association between environmental temperature and Sudden Infant Death Syndrome (SIDS)6 in the UK. We fitted models to each series which removed seasonality and serial correlation using an autoregressive process. The results of a cross-correlation between the residuals from these models showed that the coefficients were consistently and significantly negative only when the temperature series was in advance of the mortality series. Thus a decrease in temperature precedes an increase in deaths. When the mortality series was in advance of the temperature series the cross-correlation coefficient was non-significant and inconsistent. This suggests a causal relationship between environmental temperature and SIDS.
An analysis of this form on air pollution data in Barcelona has been undertaken for the Department of Health in the UK.7 This paper updates that report by analysing 10 years of data rather than 4 years. By following the APHEA protocol1 we fitted models which accounted for seasonality and influenza epidemics more completely than in the earlier reports.
Methods |
---|
Results |
---|
The cross-correlation plots are shown in Figure 1 together with the 2 standard error line. All four pollutants show temporal effects, in that the only statistically significant correlations are for positive lags, when mortality changes follow pollution changes, and these correlations are all positive. For negative lags the cross-correlations are small and inconsistent, which offers reassurance that seasonal effects have been successfully removed. The results show that the effect of an increase in black smoke on mortality stretches over 4 days, with a broadly similar effect for SO2. For NO2 the effects would appear to last over 3 days. For ozone, however, the effect is much less spread out, with an increase in mortality for just one day after an increase in ozone levels. All the correlations are small, suggesting that the effect on overall mortality variation is weak, although statistically significant associations between total mortality and all of these pollutants were reported.9 For black smoke, NO2 and ozone the positive correlations are followed by some negative ones, suggesting the possibility that at least some of the deaths were brought forward by only a few days (harvesting).
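The 2 standard error line can be sketched as follows, assuming the usual large-sample approximation that the cross-correlation of two serially uncorrelated series has standard error 1/sqrt(n). The coefficients below are invented for illustration and are not the estimates plotted in Figure 1. The helper name `significant_lags` is hypothetical, and the 3652 days assume that every day of 1986–1995 was observed.

```python
import math

def significant_lags(ccf, n_obs, z=2.0):
    """Return the lags whose coefficient lies outside +/- z / sqrt(n_obs)."""
    band = z / math.sqrt(n_obs)
    return {lag: r for lag, r in ccf.items() if abs(r) > band}

# Illustrative values only (NOT the paper's estimates).
example_ccf = {-2: 0.01, -1: -0.02, 0: 0.03, 1: 0.06, 2: 0.05, 3: -0.04}
n_days = 3652  # assumed: every day of 1986-1995, including two leap days

flagged = significant_lags(example_ccf, n_days)
print(f"2 SE limit: {2 / math.sqrt(n_days):.4f}")
print("lags outside the limit:", sorted(flagged))
```

With a series this long the limit is narrow, so even coefficients of a few hundredths fall outside it. This is consistent with the observation in the text that correlations can be small yet statistically significant.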
Discussion and Conclusions |
---|
The correlation coefficients are low but, as has been pointed out in the Department of Health report,7 when the outcome variable has an over-dispersed Poisson distribution there is a limit to how much of the variation can be explained by external factors. The over-dispersion can be removed, but the Poisson variability will remain. For some pollutants 8-hour moving averages are used, so for some series a particular observation carries information not just about the current day but also about the subsequent one. This was not a problem for our data, but it means that in some cases one should be cautious about accepting a significant lag at minus one day as counter-evidence to causality.
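The ceiling imposed by Poisson variability can be illustrated with a small simulation. The daily mean below is a hypothetical choice, not a figure from the Barcelona data. For counts that are Poisson around a varying mean, the law of total variance gives Var(Y) = E[mu] + Var(mu), so even a model that knew the true mean exactly could explain at most Var(mu) / (Var(mu) + E[mu]) of the variation.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000
t = np.arange(n)

# Hypothetical true daily mean: 40 deaths with a modest annual swing.
mu = 40 + 4 * np.sin(2 * np.pi * t / 365.25)
deaths = rng.poisson(mu)

# Share of variance "explained" when the true mean is known exactly.
r2_observed = np.corrcoef(deaths, mu)[0, 1] ** 2
# Theoretical ceiling from Var(Y) = E[mu] + Var(mu).
r2_ceiling = mu.var() / (mu.var() + mu.mean())

print(f"observed R^2: {r2_observed:.3f}")
print(f"ceiling  R^2: {r2_ceiling:.3f}")
```

Under these assumed numbers the ceiling is roughly one sixth, so low correlation coefficients are expected even when the systematic component is modelled perfectly.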
No observational study can prove causality, but these results satisfy the most important causality criterion, that of cause preceding effect, and suggest a clear causal pathway between air pollution and mortality. The ecological time series design is not a good design for checking other causality criteria because it is based on aggregate and not individual data. A number of studies2,3 have demonstrated problems in ascribing causality between health and air pollution for data based on individuals, and in particular in deciding which particular pollutants are to blame.
We recommend cross-correlation plots as a useful way of showing the shape of an exposure/response relationship. Investigators using time series designs should check that their data show associations only in the right direction, and investigate possible anomalies before fitting causal models.
References |
---|
2 Samet JM, Zeger SL, Kelsall JE, Xu J, Kalkstein LS. Particulate Air Pollution and Daily Mortality. Analyses of the Effects of Weather and Multiple Air Pollutants. Health Effects Institute, Andover MA, 1997.
3 Gamble JF, Lewis RJ. PM2.5 and mortality in long-term prospective cohort studies: cause-effect or statistical associations? Environ Health Perspect 1998;106:535–49.
4 Rothman KJ, Greenland S. Modern Epidemiology 2nd Edn. Philadelphia: Lippincott-Raven, 1998.
5 Bowie C, Prothero D. Finding causes of seasonal diseases using time series analysis. Int J Epidemiol 1981;10:87–92.
6 Murphy MFG, Campbell MJ. Sudden infant death syndrome and environmental temperature: an analysis using vital statistics. J Epidemiol Community Health 1987;41:63–71.
7 Department of Health. Quantification of the Effects of Air Pollution on Health in the United Kingdom. HM Stationery Office, 1997.
8 Sunyer J, Castellsagué J, Sáez M, Tobías A, Antó JM. Air pollution and mortality in Barcelona. J Epidemiol Community Health 1996;50(Suppl.1):S76–80.
9 Katsouyanni K, Touloumi G, Spix C et al. Short term effects of ambient sulphur dioxide and particulate matter on mortality in 12 European cities: results from time series data from the APHEA project. Br Med J 1997;314:1658–63.
10 Tobias A, Campbell MJ. Time series regression for counts allowing for autocorrelation. Stata Technical Bulletin Reprints Volume 8. 1999, Ed. H Joseph Newton, Stata Corporation, College Station, Texas, pp.291–96.