A note on temperature effect estimate in mortality time series analysis

Vito MR Muggeo

GRASPA, Research Group for Statistical Applications to Environmental Problems—Research Unit of Palermo, Dipartimento di Scienze Statistiche e Matematiche ‘S Vianelli’, Facoltà di Economia, Università di Palermo—90128 Palermo, Italy. E-mail: vito.muggeo{at}giustizia.it

Sirs—There is considerable and increasing interest in evaluating the effects of temperature on health, owing to the growing concentration of greenhouse gases that is causing average temperatures to rise. Excess deaths are observed at both low and high temperatures; it is therefore of interest to quantify both the cold- and the heat-related risks in order to gain insight into the possible consequences of global warming.

Taking a cue from the recent paper by Gouveia et al. published in this journal,1 I briefly discuss some drawbacks of the methodology used in that paper.

A brief introduction to the topic comes first.

Poisson regression has become the standard means of analysing daily mortality time series. Such a model relates the log-expected death count (log E[Y]) to explanatory variables, including temperature and confounders known to explain, to some extent, the variability in daily deaths.

It is well known that the mortality–temperature relationship is V-shaped; a possible way to account for such a non-linear relationship is therefore to define two complementary variables, TEMP1 = min(TEMP − ψ, 0) and TEMP2 = max(TEMP − ψ, 0). TEMP is the average temperature and ψ, usually assumed known, is the value at which mortality reaches its minimum. This approach is rather useful since it allows direct estimation of the per cent change in mortality associated with a 1°C increase in cold and heat.1,2
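The two complementary temperature variables can be sketched in a few lines of Python. This is only an illustration: the threshold value (`psi` in the code) and the temperatures are hypothetical and are not taken from any data set discussed here.

```python
def split_temperature(temp, psi):
    """Return (TEMP1, TEMP2) = (min(temp - psi, 0), max(temp - psi, 0))."""
    d = temp - psi
    return min(d, 0.0), max(d, 0.0)


psi = 20.0                   # hypothetical threshold, in degrees Celsius
temps = [5.0, 20.0, 28.5]    # hypothetical daily mean temperatures
pairs = [split_temperature(t, psi) for t in temps]

for t, (temp1, temp2) in zip(temps, pairs):
    # The two terms always add up to the distance from the threshold.
    print(t, temp1, temp2, temp1 + temp2 == t - psi)
```

Below the threshold only TEMP1 is non-zero, above it only TEMP2, so each coefficient describes the slope on one side of the V.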

Among the confounders, seasonality is undoubtedly the most important, as mortality time series are always characterized by a periodic pattern with peaks in winter. Several methods have been discussed and employed to control for long-term trend, including dummy variables for months and harmonic components. Recently, however, non-parametric smoothing terms have become quite popular, since they are able to account for non-regular cycles.

Let f(TIME, df) be the non-parametric smoother of seasonality (TIME = 1, 2, ...) with df degrees of freedom, and let x denote the other confounders (such as influenza epidemics, day of week, holidays, air pollution) with linear parameters α. Omitting a possible smoothing term for relative humidity, the semi-parametric model is:

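$$\log E[Y] = x^{\top}\alpha + \beta_1\,\mathrm{TEMP1} + \beta_2\,\mathrm{TEMP2} + f(\mathrm{TIME}, df) \qquad (1)$$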

The backfitting algorithm is usually employed to estimate the parameters of the so-called generalized additive model in equation (1); S-plus, for instance, uses backfitting in its gam() function.
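The idea behind backfitting can be illustrated with a deliberately simplified sketch. It makes several assumptions that differ from model (1): a Gaussian response with identity link rather than a Poisson model, a single linear covariate, and a plain running mean as the smoother of time. It is not the algorithm implemented in the S-plus gam() function; the function names and simulated data are hypothetical.

```python
import math
import random


def moving_average(values, half_width=5):
    """Centred running mean; the window is truncated at the series ends."""
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def backfit(y, x, n_iter=20):
    """Alternate a linear fit on x and a smooth of time on partial residuals."""
    n = len(y)
    intercept = sum(y) / n
    x_mean = sum(x) / n
    xc = [v - x_mean for v in x]                 # centred covariate
    sxx = sum(v * v for v in xc)
    beta, f = 0.0, [0.0] * n
    for _ in range(n_iter):
        # Linear step: regress y minus the current smooth on x.
        r = [yi - intercept - fi for yi, fi in zip(y, f)]
        beta = sum(a * b for a, b in zip(xc, r)) / sxx
        # Smooth step: smooth y minus the current linear part over time.
        r = [yi - intercept - beta * xi for yi, xi in zip(y, xc)]
        f = moving_average(r)
        f_mean = sum(f) / n
        f = [v - f_mean for v in f]              # keep the smooth centred
    return beta, f


# Simulated series: true slope 2, smooth cycle in time, small noise.
random.seed(1)
n = 400
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [1.0 + 2.0 * xi + math.sin(2.0 * math.pi * t / 100.0) + random.gauss(0.0, 0.1)
     for t, xi in enumerate(x)]
beta_hat, f_hat = backfit(y, x)
print(round(beta_hat, 3))
```

In this simulation the covariate is unrelated to time, so the two steps barely interfere and the slope is recovered well. The situation of concern in this letter is the opposite one, where the covariate itself follows a strong pattern in time.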

In air-pollution effect assessments, it has recently been pointed out that backfitting can lead to bias in the linear parameter estimates (α, β1 and β2 in the equation above) and to underestimation of the corresponding standard errors.3,4

Simulations have shown that the bias depends on several factors, the most important being the degree of concurvity: roughly speaking, the ‘correlation’ between the non-parametrically smoothed variable (TIME) and the parametric variables whose coefficients are of interest. These findings concern the effect of air pollutants, whose pattern in time (that is, the ‘correlation’ with the TIME variable) is usually moderate.
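A rough way to see what a high or moderate ‘correlation’ with TIME means is to ask how much of a variable's variance is reproduced by a smooth function of time. The sketch below uses a single annual harmonic as that smooth function and reports the resulting R², which is only a crude proxy for concurvity. Both series are simulated with hypothetical amplitudes and noise levels; they are not the Palermo data.

```python
import math
import random


def seasonal_r2(y, period=365):
    """R-squared of y on one sine/cosine pair with the given period.

    The projection formulas below are the least-squares coefficients only
    when len(y) is a whole number of periods, as assumed here.
    """
    n = len(y)
    mean = sum(y) / n
    w = 2.0 * math.pi / period
    a = 2.0 / n * sum((v - mean) * math.sin(w * t) for t, v in enumerate(y))
    b = 2.0 / n * sum((v - mean) * math.cos(w * t) for t, v in enumerate(y))
    fitted = [mean + a * math.sin(w * t) + b * math.cos(w * t) for t in range(n)]
    ss_tot = sum((v - mean) ** 2 for v in y)
    ss_res = sum((v - g) ** 2 for v, g in zip(y, fitted))
    return 1.0 - ss_res / ss_tot


random.seed(0)
n = 3 * 365                      # three full years of daily values
w = 2.0 * math.pi / 365
# Hypothetical temperature: strong seasonal cycle, little day-to-day noise.
temp = [18.0 + 8.0 * math.cos(w * t) + random.gauss(0.0, 2.0) for t in range(n)]
# Hypothetical pollutant: weak seasonal cycle, large day-to-day noise.
pm10 = [40.0 + 5.0 * math.cos(w * t) + random.gauss(0.0, 10.0) for t in range(n)]

r2_temp = seasonal_r2(temp)
r2_pm10 = seasonal_r2(pm10)
print(round(r2_temp, 2), round(r2_pm10, 2))
```

Under these assumptions most of the variance of the temperature series is explained by time alone, while only a small share of the pollutant series is.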

The problem that has to be emphasized when studying temperature effects is that temperature itself is much more strongly related to time than any pollutant and thus leads to a higher degree of concurvity; as a consequence, the bias in the estimates of β1 and β2 is expected to be larger. To illustrate, Table 1 shows the estimates for all natural causes (ICD-IX 1–799) mortality data, 1997–1999, in Palermo (southern Italy, approximately 700 000 residents). Estimates refer to per cent risk change, i.e. 100 × (exp(β) − 1), obtained by backfitting (actually the S-plus gam() function) and by a possible alternative approach based on parametric regression splines, which have been shown to yield unbiased estimates of the linear parameters.3 The model includes day of week, holidays, influenza epidemics and air pollution (PM10, evaluated as the mean of lags 0–1); temperature is also included as the mean of lags 0–1.


Table 1 Per cent changes in all-causes mortality for a 1°C increase in cold (TEMP1) and heat (TEMP2). Estimates and corresponding |t|-values from two models with different seasonality smoothers: parametric regression splines and non-parametric smoothing splines

 
The t-values are calculated as ratios of the point estimates to the corresponding standard errors: when |t| > 1.96 the parameter is usually regarded as significant. Differences in the cold-related risk are worth noting: the semi-parametric model estimated by backfitting shows a significant adverse influence of TEMP1, i.e. low temperatures increase mortality. The parametric model, by contrast, yields a weaker yet beneficial cold effect, although uncertainty is high (low |t| value) and hence the estimate is non-significant and negligible. The heat-related risk, on the other hand, is substantially unchanged. Hence, according to the results returned by the S-plus gam() function, one would conclude that a risk arising from increasing cold is plausible.
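The two quantities reported in Table 1 are simple transformations of a fitted coefficient and its standard error. A minimal sketch follows; the coefficient and standard error are hypothetical and are not values from Table 1.

```python
import math


def percent_change(beta):
    """Per cent risk change for a one-unit increase: 100 * (exp(beta) - 1)."""
    return 100.0 * (math.exp(beta) - 1.0)


def t_value(beta, se):
    """Ratio of the point estimate to its standard error."""
    return beta / se


beta, se = 0.02, 0.008      # hypothetical log-scale estimate and standard error
pct = percent_change(beta)
t = t_value(beta, se)
significant = abs(t) > 1.96  # conventional two-sided 5% threshold

print(round(pct, 2), round(t, 2), significant)
```

Because the t-value depends on the standard error, an underestimated standard error inflates |t| and can turn a negligible effect into an apparently significant one.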

Another point should be discussed here: estimation of model (1) is usually carried out conditional on the break-point, ψ, which is assumed to be fixed.1,2 Regardless of the estimation approach, assuming ψ known can lead to underestimation of the standard errors of the other parameters (including β1 and β2), since the uncertainty in ψ is neglected. To obtain correct estimates of the standard errors, one should fit a separate model for each fixed break-point ψ in the range of the observed temperature values and apply the conditional variance formula, averaging over all the ψ values selected.
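The conditional variance formula splits the total variance into the average of the conditional variances plus the variance of the conditional estimates. The sketch below applies it to a grid of break-points, assuming equal weight for every break-point, which is one reading of "taking averages". The estimates and standard errors are hypothetical placeholders for the output of models fitted at each fixed break-point.

```python
def total_variance(estimates, std_errors):
    """Combine fits at several fixed break-points with equal weights.

    Returns the averaged estimate and its variance, computed as
    mean(conditional variances) + variance(conditional estimates).
    """
    k = len(estimates)
    mean_est = sum(estimates) / k
    within = sum(se ** 2 for se in std_errors) / k
    between = sum((b - mean_est) ** 2 for b in estimates) / k
    return mean_est, within + between


# Hypothetical slope estimates and standard errors from three break-points.
estimates = [1.0, 2.0, 3.0]
std_errors = [1.0, 1.0, 1.0]
mean_est, var_total = total_variance(estimates, std_errors)
print(mean_est, var_total)
```

The second term is zero only if the estimate does not change with the break-point, so the combined variance is never smaller than the average conditional variance.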

Alternatively, one could use a recently proposed method that allows joint estimation of all the parameters of the model.5 Such a method could be very useful when one is interested in estimating a three-segment relationship. As pointed out by both referees, it may be more plausible to assume a rather wide range of moderate temperatures over which the risk is negligible. In principle, estimation of such multiple break-point models can be carried out quite straightforwardly according to the method outlined in reference 5, through a parameterization similar to the one used in the single break-point case. However, temperature–mortality curves often exhibit high heterogeneity, making estimation of two break-point patterns quite difficult.
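One possible way to write a three-segment relationship, by analogy with the single break-point variables, is to use a separate threshold for cold and for heat and to assume no effect between them. The sketch below shows only this construction of the covariates, with hypothetical thresholds. It does not estimate the break-points and is not the method of reference 5.

```python
def three_segment_terms(temp, psi1, psi2):
    """Return (cold term, heat term) with a flat stretch between psi1 and psi2."""
    if not psi1 < psi2:
        raise ValueError("psi1 must be smaller than psi2")
    cold = min(temp - psi1, 0.0)   # non-zero only below the lower threshold
    heat = max(temp - psi2, 0.0)   # non-zero only above the upper threshold
    return cold, heat


# Hypothetical thresholds of 15 and 25 degrees Celsius.
print(three_segment_terms(10.0, 15.0, 25.0))
print(three_segment_terms(20.0, 15.0, 25.0))
print(three_segment_terms(30.0, 15.0, 25.0))
```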

In conclusion, ignoring the side-effect of backfitting and leaving out the uncertainty in break-point detection are very likely to cause underestimation of standard errors and therefore overestimation of the precision of relative risks. While bias is expected in any linear parameter when backfitting is used, the bias in the estimates of the temperature parameters can be very severe and can lead to misleading findings and conclusions. In particular, owing to the high concurvity between TIME and TEMP1, the cold effect is likely to be seriously overestimated, with the heat-related risk remaining substantially unchanged, as shown in Table 1. Thus care must be taken in fitting data according to model (1) and in interpreting the relevant results about temperature effects.

References

1 Gouveia N, Hajat S, Armstrong B. Socioeconomic differentials in the temperature-mortality relationship in São Paulo, Brazil. Int J Epidemiol 2003;32:390–97.

2 Kunst AE, Looman CWN, Mackenbach JP. Outdoor air temperature and mortality in the Netherlands: a time series analysis. Am J Epidemiol 1993;137:331–41.

3 Dominici F, McDermott A, Zeger SL, Samet JM. On the use of generalized additive models in time-series studies of air pollution and health. Am J Epidemiol 2002;156:193–203.

4 Ramsay TO, Burnett RT, Krewski D. The effect of concurvity in generalized additive models linking mortality to ambient particulate matter. Epidemiology 2003;14:18–23.

5 Muggeo VMR. Estimating regression models with unknown break-points. Statist Med 2003;22:3055–71.




