From the Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD
Correspondence to Dr. Leah J. Welty, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, 680 North Lake Shore Drive, Suite 1120, Chicago, IL 60611 (e-mail: lwelty{at}northwestern.edu).
Received for publication April 28, 2004. Accepted for publication February 16, 2005.
ABSTRACT
air pollution; longitudinal studies; mortality; regression analysis; seasons; temperature; weather
INTRODUCTION
Despite the evidence, doubt about the findings remains, in part because of the potential confounding effects of weather and season (17). Time-series studies of temperature and mortality have found temperature effects (18–22) that are larger than particulate matter effects and that may be correlated with particulate matter levels (7, 23). Time-series studies of the effects of both air pollution and weather on mortality have identified the importance of adequate control for temperature and humidity when estimating air pollution effects (24–28). Multicity time-series studies of particulate matter and mortality have accounted for the potential confounding effects of weather by including nonlinear covariates for current and previous days' weather and by excluding extreme weather days. These studies have also conducted sensitivity analyses on the degree of flexibility for weather and season covariates (12, 16). However, concern remains that "the possibility of subtler effects within the normal climactic range continues" (17, p. 67). A recent report on the results from multicity time-series studies states that "further exploration of weather effects is merited (for example, considering correlated cumulative effects of multiday temperature or humidity)" (17, p. 67).
Motivated by this lingering challenge to the particulate matter–mortality association, we investigate possible subtler effects of weather and season on estimates of mortality relative risk for particulate matter of less than 10 µm in aerodynamic diameter (PM10). Using the recently updated NMMAPS database with time series from 1987 to 2000 for over 100 US cities (16), we conduct sensitivity analyses of city-specific and national average estimates of the acute effect of PM10 on mortality to control for the effects of weather and season, including cumulative temperature effects. For comparison, our sensitivity analysis includes models similar to those used in previous analyses of the NMMAPS.
Many single-city time-series studies investigating more aggressive control for weather and season have shown that particulate matter effect estimates are generally robust (7, 25, 26, 28). Results from single-city studies do not necessarily generalize to other cities, and single-city particulate matter effect estimates are more variable than are national estimates, making it difficult to distinguish noise from confounding. A substantial portion of evidence for the acute effects of PM10 on mortality comes from multicity studies, such as the NMMAPS and the APHEA project, so it is important to conduct extensive sensitivity analyses for these multicity studies.
We consider two strategies to control for the effects of temperature and season on mortality. Both are based on distributed lag models, time-series models that allow an exposure to affect response over an extended period of time (29). Using distributed lag models, researchers have shown that temperature affects mortality over several days (18, 19), so distributed lag models provide a natural framework for considering the effects of weather on mortality. Distributed lag models also provide a natural framework to quantify the health effects of multiple-day exposure to particulate matter, and they have been used throughout the air pollution epidemiology literature (11, 14, 15, 28, 30).
Our distributed lag models are formulated to capture the complexity of confounding by temperature and season and to account for documented aspects of the temperature–mortality relation. Temperatures up to 2 weeks prior are likely sufficient for capturing the lagged effects of temperature on mortality (18), so we use distributed lag models with 2 weeks of temperature lags. Numerous studies and our own exploratory analyses have shown that the relation between temperature and mortality is nonlinear and varies by season. The relation between temperature and mortality relative risk is convex (i.e., U shaped), implying that cold temperatures in winter or hot temperatures in summer are worse for health than are moderate temperatures (20, 22). The exact nature of the nonlinearity, or the degree to which temperature extremes affect mortality, varies by location (20). We formulate two versions of distributed lag models to control for the nonlinear effects of temperature on mortality: one that builds on the current approach to control for temperature and one that takes a new approach.
In early air pollution time-series studies, methods of control for temperature included categorizing days by weather regime, using sine and cosine terms or indicator variables for season, and using indicator, linear, and polynomial terms for meteorologic covariates (4–6, 10, 21). In recent time-series studies, smooth, unspecified functions of temperature have been used to account for the nonlinearity between temperature and mortality. In multicity time-series studies, this flexible specification allows for different nonlinear temperature–mortality relations in different cities. The advantage of models with flexible functions of temperature is that the exact nature of the temperature–mortality relation need not be explicitly defined. However, these models do not account for the possibility that the nonlinear relation between temperature and mortality may change slowly over time. For example, if air conditioner use in a city had increased over time, warm summer temperatures in the later years of the study could be less harmful than warm summer temperatures in the earlier years of the study. A single U-shaped function of temperature in this situation might capture only the average effect of warm temperatures on mortality and possibly have some subtle effect on the particulate matter effect estimate.
We consider two versions of distributed lag models to control for these complex effects of temperature on mortality. One version builds on the current use of flexible functions of temperature, while the new version allows for the nonlinear temperature–mortality relation to vary over time. The first part of this paper describes these two types of distributed lag models. In Materials and Methods, we introduce a basic distributed lag model for temperature and show how it may be extended so that temperature coefficients trend and vary seasonally, thereby allowing the temperature–mortality relation to vary nonlinearly and in time. We also illustrate how models with nonlinear temperature covariates are derived from the basic distributed lag model, and we consider more extensive versions than those often used. We follow up with subsections on the inclusion of temperature interactions and other covariates. The last portion of Materials and Methods outlines a hierarchical model for combining city-specific PM10 effect estimates into national estimates. The next part of this paper compares fitted-model results, both at the national and at the individual-city levels, to assess the sensitivity of the estimated particulate matter–mortality relation to control for weather and season.
MATERIALS AND METHODS
Methods
Distributed lag models for temperature.
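Poisson regression (32) is frequently used to estimate the relation between fluctuations in daily mortality counts and fluctuations in air pollution, while taking into account fluctuations in weather and other time-varying confounders (3). We assume that the daily mortality count $Y_t^c$ in city $c$ is an overdispersed Poisson random variable with $E(Y_t^c) = \mu_t^c$ and $\operatorname{Var}(Y_t^c) = \phi^c \mu_t^c$. The overdispersion parameter $\phi^c$ represents the variation in mortality not captured by the regression model. We let $\mathrm{temp}_t^c$ and $\mathrm{PM}_t^c$ be the daily temperature and PM10 time series for city $c$. Our analysis considers the effects of previous days' temperature and particulate matter on current day mortality, so let $\mathrm{temp}_{t-n}^c$ and $\mathrm{PM}_{t-l}^c$ refer to the temperature and PM10 time series lagged by $n$ and $l$ days, respectively. A model for lag $l$ PM10 on daily mortality with distributed lags of temperature is

$$\log \mu_t^c = \beta^c\,\mathrm{PM}_{t-l}^c + \sum_{n} \theta_n^c\,\mathrm{temp}_{t-n}^c + S(t, \nu \times \text{years}) + \text{other covariates}, \qquad (1)$$

where $\beta^c$ is the PM10 log relative risk and $S(t, \nu \times \text{years})$ is a smooth function of time with $\nu$ degrees of freedom per year.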
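The overdispersion parameter described above can be estimated in several ways, and the text does not say which estimator was used. As a hedged illustration only, the Python sketch below computes one common moment estimator: the Pearson chi-square statistic divided by the residual degrees of freedom. The function name `pearson_dispersion`, the toy counts, and the fitted means are hypothetical and are not taken from the NMMAPS analysis.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by residual degrees of freedom.

    observed : daily death counts
    fitted   : fitted means from a Poisson regression (same length)
    n_params : number of regression parameters used to obtain the fitted means
    """
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    residual_df = len(observed) - n_params
    if residual_df <= 0:
        raise ValueError("need more observations than parameters")
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi_square / residual_df


# Hypothetical counts around a constant fitted mean of 10 deaths per day.
fitted = [10.0] * 5
phi_poisson_like = pearson_dispersion([8, 12, 10, 14, 6], fitted, n_params=1)
phi_overdispersed = pearson_dispersion([4, 16, 10, 20, 0], fitted, n_params=1)
```

A value near 1 is what a Poisson model would predict; values well above 1 indicate variation in the counts beyond what the regression mean explains.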
Time lags of exposure variables are often correlated, making the distributed lag effects difficult to estimate. A common solution is to constrain the lag coefficients $\theta_n^c$ to a functional form, such as a polynomial (29) or a spline (33). Here, we let n = 14, allowing for temperatures up to 2 weeks prior to affect mortality, and we constrain the $\theta_n^c$ to lie on a step function. This formulation minimizes the number of distributed lag parameters, allows for effect differences among more and less recent temperatures, and facilitates interpretation of coefficients and of the total effect of temperature. We set steps at lag 0, lag 2, and lag 7, thereby constraining the $\theta_n^c$ so that lag 1 and lag 2 temperatures have the same effect on mortality, lag 3–7 temperatures have the same effect on mortality, etc. Defining $\overline{\mathrm{temp}}_{t,2}^c$ as the average of the past 2 days' temperatures, $\overline{\mathrm{temp}}_{t,7}^c$ as the average of the past 7 days' temperatures, and $\overline{\mathrm{temp}}_{t,14}^c$ as the average of the past 14 days' temperatures, we have

$$\log \mu_t^c = \beta^c\,\mathrm{PM}_{t-l}^c + \eta_0^c\,\mathrm{temp}_t^c + \eta_1^c\,\overline{\mathrm{temp}}_{t,2}^c + \eta_2^c\,\overline{\mathrm{temp}}_{t,7}^c + \eta_3^c\,\overline{\mathrm{temp}}_{t,14}^c + S(t, \nu \times \text{years}) + \text{other covariates}. \qquad (2)$$
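The step constraint can be checked numerically. The Python sketch below builds the current-day temperature and the 2-, 7-, and 14-day lagged averages, then recovers the per-lag coefficients that those four coefficients imply. It assumes that "the past k days" means lags 1 through k (current day excluded); the function names, temperatures, and coefficient values are hypothetical.

```python
def lag_average(series, t, last_lag):
    """Average of series[t-1], ..., series[t-last_lag] (current day excluded)."""
    window = [series[t - n] for n in range(1, last_lag + 1)]
    return sum(window) / len(window)


def step_lag_covariates(temp, t):
    """Current-day temperature plus the 2-, 7-, and 14-day lagged averages."""
    if t < 14:
        raise ValueError("need at least 14 earlier days of temperature")
    return [temp[t], lag_average(temp, t, 2),
            lag_average(temp, t, 7), lag_average(temp, t, 14)]


def implied_lag_coefficients(coefs):
    """Per-lag coefficients (lags 0..14) implied by coefficients on the averages."""
    c0, c2, c7, c14 = coefs
    per_lag = [c0]
    for n in range(1, 15):
        value = c14 / 14          # every lag 1..14 enters the 14-day average
        if n <= 7:
            value += c7 / 7       # lags 1..7 also enter the 7-day average
        if n <= 2:
            value += c2 / 2       # lags 1..2 also enter the 2-day average
        per_lag.append(value)
    return per_lag


temp = [60.0 + 0.5 * day for day in range(30)]  # hypothetical temperatures, degrees F
coefs = [0.5, 0.2, 0.7, 1.4]                    # hypothetical coefficients
t = 20
x = step_lag_covariates(temp, t)
per_lag = implied_lag_coefficients(coefs)

# The two parameterizations give the same temperature contribution.
from_averages = sum(c * v for c, v in zip(coefs, x))
from_lags = sum(per_lag[n] * temp[t - n] for n in range(15))
```

The implied coefficients are constant within lags 1–2, 3–7, and 8–14, which is the step function described in the text; four parameters replace fifteen.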
To illustrate this step distributed lag function as well as the seasonal variability in temperature effects, figure 1 shows estimates of the step distributed lags (equation 2) fitted separately for each of the 14 summers (May–August) and each of the 13 winters (November–February) of the NMMAPS data for New York City. The covariates day of week (as a factor) and day of month (as a linear term) were also included. The smooth function of time $S(t, \nu \times \text{years})$ was not included since we fitted the model separately over the 4-month periods. For New York City, a 10°F increase in current day temperature in May–August results in a greater increase in mortality than does a 10°F increase in current day temperature in November–February (figure 1). Increases in lag 1 and lag 2 day temperatures generally result in increases in mortality in May–August and decreases in November–February (figure 1).
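Replacing the fixed temperature coefficients $\eta_j^c$ (equation 2) with time-varying coefficients $\eta_j^c(t)$ gives

$$\log \mu_t^c = \beta^c\,\mathrm{PM}_{t-l}^c + \eta_0^c(t)\,\mathrm{temp}_t^c + \eta_1^c(t)\,\overline{\mathrm{temp}}_{t,2}^c + \eta_2^c(t)\,\overline{\mathrm{temp}}_{t,7}^c + \eta_3^c(t)\,\overline{\mathrm{temp}}_{t,14}^c + S(t, \nu \times \text{years}) + \text{other covariates}. \qquad (3)$$

This formulation allows for temperatures up to 2 weeks prior to affect mortality nonlinearly and for this nonlinearity to change over time.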
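One way to picture a time-varying temperature coefficient is as a weighted sum of basis functions of time. The Appendix describes four seasonal terms D1–D4, three trend terms D5–D7 from a natural spline, and twelve products D8–D19. The Python sketch below reproduces only that bookkeeping. The sine and cosine harmonics used for the seasonal terms are an assumption (their definition did not survive in this copy), the trend values are passed in rather than computed from a spline, and all function names and numbers are hypothetical.

```python
import math


def seasonal_basis(day_of_year, leap_year=False):
    """Four seasonal terms; the harmonic form here is an assumed illustration."""
    fraction = day_of_year / (366 if leap_year else 365)
    return [math.sin(2 * math.pi * fraction), math.cos(2 * math.pi * fraction),
            math.sin(4 * math.pi * fraction), math.cos(4 * math.pi * fraction)]


def full_basis(seasonal, trend):
    """D1..D4 seasonal, D5..D7 trend, D8..D19 seasonal-by-trend products.

    Products are ordered D1*D5, D1*D6, D1*D7, D2*D5, ..., D4*D7.
    """
    products = [s * g for s in seasonal for g in trend]
    return list(seasonal) + list(trend) + products


def varying_coefficient(constant, weights, basis):
    """Coefficient at one time point: constant + sum of weight_j * D_j(t)."""
    return constant + sum(w * d for w, d in zip(weights, basis))


# Hypothetical trend-basis values for one day (a spline would supply these).
trend_today = [0.1, 0.2, 0.3]
basis = full_basis(seasonal_basis(200), trend_today)
flat = varying_coefficient(1.0, [0.0] * 19, basis)       # no time variation
seasonal_only = varying_coefficient(1.0, [0.5] + [0.0] * 18, basis)
```

With all weights set to zero the coefficient reduces to a constant, which corresponds to the fixed-coefficient model; nonzero weights let the temperature effect differ by season and drift across years.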
Distributed lag models for temperature with nonlinear temperature covariates.
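These models include those currently used in multicity studies and use smooth functions of distributed lags of temperature to explicitly account for nonlinearity in the temperature–mortality relation. We let $S(\cdot, \lambda)$ denote a smooth function with $\lambda$ degrees of freedom for a city $c$, and we consider distributed lag models of the form

$$\log \mu_t^c = \beta^c\,\mathrm{PM}_{t-l}^c + S(\mathrm{temp}_t^c, \lambda) + \sum_{k=1}^{K} S\!\left(\overline{\mathrm{temp}}_{t,(k)}^c, \lambda\right) + S(t, \nu \times \text{years}) + \text{other covariates}, \qquad (4)$$

where $\overline{\mathrm{temp}}_{t,(k)}^c$ is the $k$th lagged temperature average.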
The value for $\lambda$ determines the nonlinearity of distributed lag temperature covariates, and K determines how many lags of temperature to include. By K = 0, we denote the model that contains only current day temperature. Previous multicity studies have used smooth functions of the current day's temperature and the average of the past 2 or 3 days' temperatures (9, 12–16), that is, K = 1, so we consider K = 0, 1, 2, 3. Previous multicity studies have generally set $\lambda$ = 3, and corresponding sensitivity analyses have considered the effects of varying $\lambda$ on particulate matter log relative risk (13, 16). We accordingly investigate how both the smoothness of the distributed lag temperature variables ($\lambda$) and the number of distributed lag variables (K) may influence particulate matter log relative risk.
In models with nonlinear temperature terms (equation 4), the exact nature of the seasonal relation between temperature and mortality is not explicitly defined. However, these models do not allow for the nonlinearity between temperature and mortality to change over time. The model with seasonally–temporally varying temperature coefficients (equation 3) allows implicitly for a nonlinear relation between temperature and mortality and additionally allows for this nonlinearity to change over time. We compare PM10 log relative risk estimates from both formulations to determine if either formulation alters PM10 estimates.
Interactions among lagged temperatures.
Models that include interactions of distributed lag temperature variables allow for synergy between current and previous days' temperatures when these affect mortality. We accordingly estimate our distributed lag models with and without interactions of temperature distributed lags. The interaction terms for seasonally–temporally varying distributed lag models (equation 3) take the form , etc. For models with smooth functions of temperature distributed lags (equation 4), interactions take the form
, etc. To keep the number of regression parameters from growing too large, we excluded all interactions with
.
Other covariates.
Measures of humidity, such as dew point, are important weather components in mortality–air pollution models (34). However, directly including dew point in models with many temperature covariates may result in variable parameter estimates due to collinearity. Residualized values of dew point variables regressed on temperature covariates are orthogonal to the temperature covariates but retain variation in dew point not explained by temperature. We regress the current-day dew point and the average of the past 2 days' dew points on temperature covariates and include the respective residuals in our models.
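The residualization step can be sketched as follows. This Python example regresses a dew point series on an intercept plus temperature covariates and keeps the residuals, using Gram-Schmidt orthogonalization in place of a regression routine so that it needs no external libraries. It is an illustration under stated assumptions rather than the authors' implementation: the function names and the six-day toy series are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def residualize(y, covariates):
    """Residuals of y from least squares on an intercept plus the covariates."""
    n = len(y)
    basis = []  # mutually orthogonal vectors spanning intercept + covariates
    for column in [[1.0] * n] + [list(c) for c in covariates]:
        v = list(column)
        for q in basis:
            weight = dot(v, q) / dot(q, q)
            v = [a - weight * b for a, b in zip(v, q)]
        if dot(v, v) > 1e-10:  # skip columns that are numerically redundant
            basis.append(v)
    residual = [float(value) for value in y]
    for q in basis:
        weight = dot(residual, q) / dot(q, q)
        residual = [a - weight * b for a, b in zip(residual, q)]
    return residual


# Hypothetical six-day series (degrees F).
temp_today = [70.0, 75.0, 80.0, 85.0, 90.0, 72.0]
temp_avg2 = [68.0, 71.0, 77.0, 83.0, 88.0, 80.0]
dew_point = [55.0, 60.0, 61.0, 70.0, 72.0, 58.0]

dew_resid = residualize(dew_point, [temp_today, temp_avg2])
```

The residual series is orthogonal to every temperature covariate, so adding it to the model cannot change the fitted contribution of temperature, yet it still carries the dew point variation that temperature does not explain.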
For convenience, we denote the distributed lag models with nonlinear smooth functions of temperature by NL$_\nu$(K + 1, $\lambda$); the models with seasonally–temporally varying temperature coefficients are labeled analogously. Here $\nu$ indicates the degrees of freedom per year in the smooth time trend, and the presence of superscript I indicates the inclusion of temperature interactions. For the nonlinear models, K + 1 refers to the number of distributed lag variables in the model (current day plus K additional lag averages), and $\lambda$ is the degree of smoothness of the distributed lag variables. For both model formulations, we include the covariates day of week and age category (as factors), day of month (as a linear term), and for each age category a natural spline of time with 14 df. The smooth of time by age category accounts for mortality trends related to an aging population or differing migration rates within age categories. Although not included in prior NMMAPS analyses, a few previous time-series studies of air pollution and health have adjusted for day of month (35). We conservatively included a linear day-of-month term in our models.
The PM10 log relative risk was estimated for all distributed lag models separately for each of 100 cities, using the "glm" function in R, version 1.8.0, software (see below). The models included PM10 exposure from the current day, the previous day, or 2 days previously (l = 0, 1, or 2). For each of these exposures in the seasonally–temporally varying model, the degree of adjustment for seasonal effects on mortality was varied, with $\nu$ = 1, 2, 4, or 8. Based on the results for the seasonally–temporally varying model, $\nu$ was set to 4 for the nonlinear model. All R programs are available in an R-script tdlm.R at http://www.ihapss.jhsph.edu/data/NMMAPS/R/, and the data are available as part of the NMMAPSdata Package in R (36).
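Because the regression is log-linear, a PM10 log relative risk converts to a percent change in mortality through the exponential function. The short Python sketch below shows the arithmetic. The function name `percent_increase` and the coefficient value are hypothetical; the value was chosen only so that the result lands near the 0.2 percent figure quoted in the Discussion.

```python
import math


def percent_increase(log_rr_per_unit, delta=10.0):
    """Percent change in expected mortality for a `delta`-unit rise in exposure.

    log_rr_per_unit : regression coefficient (log relative risk per 1 ug/m3)
    delta           : size of the exposure increase, in ug/m3
    """
    return 100.0 * (math.exp(log_rr_per_unit * delta) - 1.0)


# Hypothetical coefficient of 0.0002 per ug/m3 of PM10.
example = percent_increase(0.0002)
no_effect = percent_increase(0.0)
```

For coefficients this small, the percent increase is close to 100 times the coefficient times the exposure increase, which is why such results are often reported per 10 µg/m3.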
Multicity estimates.
The comparison of city-specific log relative risks across models informs the sensitivity of particulate matter estimates to control for weather and season for a single city but does not quantify the sensitivity of particulate matter estimates to control for weather and season generally. With a Bayesian hierarchical model (37), city-specific PM10 log relative risks
for a particular model may be used to estimate a pooled PM10 log relative risk
for that model. Then,
is a national average PM10 log relative risk. The hierarchical model supposes that
, and
are independent
and that
s are uniformly distributed. TLNise statistical software (37) was used to estimate
and
.
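To give a feel for how city-specific estimates are combined, the Python sketch below pools log relative risks with a method-of-moments random-effects estimator (DerSimonian-Laird). This is a simpler stand-in and not the Bayesian hierarchical model fitted with TLNise: it yields a pooled estimate, its standard error, and a between-city variance, but no posterior distributions. The function name and the toy inputs are hypothetical.

```python
import math


def pool_random_effects(estimates, variances):
    """Pool city-specific estimates with a method-of-moments random-effects model.

    estimates : city-specific log relative risks
    variances : their squared standard errors
    Returns (pooled estimate, standard error, between-city variance).
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fixed = sum(w * b for w, b in zip(weights, estimates)) / total
    # Cochran's Q measures spread of the city estimates around the fixed-effect mean.
    q = sum(w * (b - fixed) ** 2 for w, b in zip(weights, estimates))
    scale = total - sum(w * w for w in weights) / total
    k = len(estimates)
    tau2 = max(0.0, (q - (k - 1)) / scale) if scale > 0 else 0.0
    # Re-weight each city by within-city plus between-city variance.
    star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * b for w, b in zip(star, estimates)) / sum(star)
    standard_error = math.sqrt(1.0 / sum(star))
    return pooled, standard_error, tau2


# Hypothetical inputs: two cities that disagree, three cities that agree.
pooled_a, se_a, tau2_a = pool_random_effects([0.0, 4.0], [1.0, 1.0])
pooled_b, se_b, tau2_b = pool_random_effects([0.2, 0.2, 0.2], [0.01, 0.04, 0.09])
```

When the city estimates agree, the between-city variance is estimated as zero and the result is the inverse-variance weighted mean; when they disagree, the pooled standard error widens to reflect heterogeneity across cities.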
RESULTS
DISCUSSION
The variability of evidence from single-city analyses investigating confounding by weather and season highlights the advantages of a multicity approach. Samet et al. (26) found that estimates in Philadelphia, Pennsylvania, were robust to control for weather, while Smith et al. (34) found that PM10 coefficients for Birmingham, Alabama, and Chicago, Illinois, were sensitive to control for humidity. Our findings show that, though different weather and season models may alter the significance of estimates for specific cities, they do not significantly or substantially alter national PM10 log relative risk estimates.
Our findings support the use of smooth functions of current-day and average temperature and dew point from the past few days to control for weather effects. The nonlinear distributed lag model designated NL4(2, 4) is similar to the models used in multicity studies, such as the NMMAPS and the APHEA project (10, 12–14, 16), and we found that the PM10 log relative risk from NL4(2, 4) is similar to estimates from our other nonlinear models with more distributed lags of temperature. Time-series studies of temperature and mortality have found the strongest temperature effects on the current day and the past few days (18–20), perhaps explaining why including distributed lags of temperature for more than a few days does not alter particulate matter effect estimates. Models that control extensively for the effects of weather may be helpful when estimating the effects of pollutants that have more temperature dependence than does PM10, such as ozone. We note that our models and the associated software implementation (36) can be used in other multisite time-series studies of particulate matter or other pollutants.
Our analysis demonstrates that the short-term effects of PM10 on mortality estimated by Poisson regression are not artifacts of inadequate control for weather and season. Similar conclusions have been made for multicity time-series studies by use of alternative methods to Poisson regression, such as case-crossover analysis (38). Combined, the results suggest that the short-term effects of particulate matter on mortality are viable; they are the result of neither the lack of control for weather and season within the Poisson regression framework nor the use of Poisson regression itself.
Although the PM10 effect estimates for the NMMAPS are robust to control for weather and season, they cannot capture all of the short- or long-term effects of particulate matter on mortality. Our sensitivity analysis is limited to single-day PM10 exposures since the majority of cities in the NMMAPS lack daily PM10 measurements, but other studies have reported larger PM10 log relative risks for models that include distributed lags of PM10 (14, 15, 28, 30). These studies that find significant air pollution log relative risks using models that include multiple-day exposures of temperature and air pollution provide additional evidence that residual confounding by weather is not responsible for the observed air pollution–mortality relation (28). Our estimated 0.2 percent increase in mortality due to a 10-µg/m3 increase in previous-day PM10 reflects only a portion of the short-term health effects of PM10 and, furthermore, does not estimate the larger magnitude of chronic health effects identifiable only through cohort studies. However, our methods control for unmeasured differences across city populations that may confound cohort studies, and they therefore provide important evidence that particulate matter (and not other factors) is responsible for adverse health effects.
APPENDIX
In a leap year, we accordingly divide dt by 366. We let D5(t), D6(t), and D7(t) be basis vectors for slow temporal change in the temperature–mortality relation, designated by a natural spline over t = 1, 2, ..., 5,114 (i.e., years 1987–2000) with 3 df. To allow the seasonal effects of temperature to vary over time, we set D8(t) = D1(t) × D5(t), D9(t) = D1(t) × D6(t), ..., and D19(t) = D4(t) × D7(t). Although understanding the sensitivity of PM10 log relative risk is our primary interest, we note that the time-varying temperature log relative risks may be computed directly from the estimates of the corresponding coefficients.
ACKNOWLEDGMENTS
The authors also wish to thank Drs. Francesca Dominici, Roger Peng, and Thomas Louis for their helpful comments.
The work herein does not necessarily reflect the views of the funding agencies nor was it subject to their review.
Conflict of interest: none declared.
References