From the Department of Statistics, University of Milan-Bicocca, Milan, Italy.
Received for publication April 3, 2002; accepted for publication December 16, 2003.
alcohol drinking; dose-response; meta-analysis; mortality; polynomial regression; regression analysis; spline smoothing
Abbreviations: AIC, Akaike's Information Criterion; CI, confidence interval; RR, relative risk.
INTRODUCTION
Categorical methods are most commonly used to analyze the relation between an exposure and disease risk (2). In a single report, results are almost always presented in a 2 × J table, with a condition (present or absent) given on one margin of the table, J levels of exposure on the other, and a baseline category which serves as the common reference group for J − 1 dose-specific estimates of the relative risk (RR) of disease. When access to the original data is not allowed, the only way to combine the results of several reports is to use aggregate data from the above-described tables.
One method of summarizing dose-response relations across studies is to estimate b, the change in the natural logarithm of the relative risk per unit of exposure within each study, and to combine these estimates across studies (3). Such an approach can be misleading, however, because it assumes that the dose-response relation follows a specific model form, usually linear (4).
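For illustration only (the study values below are invented, not taken from the meta-analysis), the postpool approach amounts to an inverse-variance-weighted average of the study-specific slopes, as in the following Python sketch:

import numpy as np

# Hypothetical study-specific slopes (change in log RR per g/day of alcohol)
# and their standard errors; the numbers are illustrative only.
beta = np.array([0.0042, 0.0061, 0.0035])
se = np.array([0.0010, 0.0015, 0.0012])

w = 1.0 / se**2                               # inverse-variance weights
beta_pooled = np.sum(w * beta) / np.sum(w)    # pooled slope
se_pooled = np.sqrt(1.0 / np.sum(w))          # its standard error
print(beta_pooled, se_pooled)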
Attempts to represent nonlinearity are usually made by means of polynomial models, typically quadratic models (5, 6). However, low-order polynomials offer a limited family of shapes, and high-order polynomials may fit poorly at the extreme values of the exposure variable (7). A further disadvantage is that polynomials do not have asymptotes and cannot fit data for which a threshold value is expected.
Nonparametric regression methods are useful alternatives for avoiding strict assumptions about the form of the dose-response relation (8, 9). A main hindrance to this approach is the lack of widely available software. Moreover, nonparametric regression is particularly useful when nothing can be assumed about the form of the exposure-disease relation. By contrast, for most epidemiologic purposes, knowledge or strong prior evidence exists about the functional form of the relation of interest. This is particularly true in meta-analysis, for which the shape of the relation is supplied by the individual studies. Finally, because of their data-driven nature, nonparametric curve estimation methods provide accurate estimates of the effects of particular exposure levels rather than an investigation of the whole structure of the relation of interest.
Two alternative curve-fitting methods, fractional polynomial regression and spline regression, have recently been described by Greenland (10) and Royston (11). Fractional polynomials are a family of models that take as covariates power transformations of a continuous exposure variable, with the powers restricted to a small predefined set of integer and noninteger exponents (12). The family includes conventional polynomials as a particular case and therefore comes close to simple regression, while offering important improvements (13). Splines are a family of smooth functions that can take on virtually any shape and, as a consequence, come close to nonparametric regression (14).
Such approaches are underused in epidemiologic research, and to our knowledge they have never been compared in meta-analysis of dose-response aggregate data. Therefore, we attempted to explore the use of these flexible tools to model aggregate dose-response data. Below we use a meta-analysis of alcohol consumption and all-cause mortality to illustrate the methods discussed in this paper.
APPLICATION
Data collection
A meta-analysis of all of the published cohort studies investigating this issue was carried out. Articles included were found through a search of the literature published from 1966 to 2000 in MEDLINE and other bibliographic databases (Current Contents, EMBASE, and CAB Abstracts), supplemented by attention to all references listed in the selected articles and by a hand search of the most relevant journals of epidemiology and medicine. Each identified study was reviewed and was included in the analysis if its findings were reported as relative risks (and numbers of cases and person-years) corresponding to two or more levels of alcohol consumption with respect to the nondrinking category. Studies were excluded because of baseline exposure contamination (e.g., use of the lowest percentile category as the reference group) or consideration of alcoholism as an exposure category. A data set was constructed as a series of records including information about each dose-response relation reported by the included studies. The main fields in the data set were: 1) the exposure value x (expressed as g/day) assigned as the midpoint of the range of each reported category of alcohol intake (as suggested by Berlin et al. (5), the x values were calculated as 1.2 times the lower bound of open-ended upper categories); 2) frequency counts, adjusted estimates of log RR, and corresponding 95 percent confidence intervals for each exposure level (when confidence intervals were not reported, we estimated them using Woolf's large-sample standard error estimator (2)); and 3) covariates describing the characteristics of the study. Further details about the methods used in assessing the studies' characteristics are given elsewhere (19, 20).
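As a sketch of the two data-preparation rules just described (dose assignment and recovery of standard errors from reported confidence intervals), consider the following Python helpers; the function names are ours, not part of the original SAS macro.

import numpy as np

def dose_midpoint(lower, upper=None, open_factor=1.2):
    """Dose (g/day) assigned to an exposure category: the midpoint of a closed
    range, or open_factor times the lower bound for an open-ended upper
    category, following Berlin et al. as described in the text."""
    if upper is None:
        return open_factor * lower
    return (lower + upper) / 2.0

def se_log_rr(rr_lower, rr_upper):
    """Large-sample standard error of log RR recovered from a reported 95% CI."""
    return (np.log(rr_upper) - np.log(rr_lower)) / (2 * 1.96)

print(dose_midpoint(30, 59))     # closed category, 30-59 g/day
print(dose_midpoint(60))         # open-ended category, >= 60 g/day
print(se_log_rr(0.82, 0.99))     # illustrative confidence limits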
Describing aggregate data
Twenty-nine cohort studies (16, 25-52) including a total of 171,869 deaths were considered in the analysis (table 1). The scatterplot shown in figure 1 was obtained by pooling the results reported by all of the included studies. Although the data are quite scattered, a rather rapid decrease in log RR at low doses and an approximately linear increase thereafter are apparent. However, the data are not homogeneously distributed in number and precision along the scale of alcohol intake; the reported estimates mainly pertain to low doses, for which more precise estimates are available.
MODEL FITTING
First, the conventional procedure estimating the slope within each study and then combining these estimates across studies (the "postpool" method) is not suitable when more than one parameter needs to be estimated to explore nonlinear relations. In fact, within each study, J − 2 parameters can be estimated in an unsaturated model, implying that only linear relations can be explored for studies reporting three exposure levels. An alternative involves pooling of the study data before trend analysis (the "prepool" method). This method is algebraically equivalent to the postpool procedure when linear relations are investigated, but it has the advantage of allowing fitting and testing of nonlinear trends (5, 53).
Second, because the relative risks within each study were calculated using a common reference group, the independence assumption is systematically violated. Greenland and Longnecker (53) suggested correcting the reported relative risks for the covariance matrix asymptotically estimated from the correlation between all pairs of log RRs, thus obtaining more efficient estimators of the coefficients and more consistent estimators of their variances than the corresponding uncorrected estimators. In brief, the asymptotic covariance matrix of the log RRs of the kth study was estimated as C = W⁻¹RW⁻¹, where W⁻¹ is a diagonal matrix with the standard errors of the adjusted log RRs on the diagonal and R is the correlation matrix of the crude log RRs, estimated using the delta method applied to the cross-classification of exposure and outcome. (See Appendix 3 in Greenland and Longnecker (53) for further details.)
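The matrix calculation just described can be written compactly as follows (a minimal Python sketch; deriving the correlation matrix R itself requires the delta-method step of Greenland and Longnecker's Appendix 3, so here R and the standard errors are simply taken as given, with illustrative values):

import numpy as np

def approx_cov_logrr(se_adj, R):
    """C = W^-1 R W^-1: approximate covariance matrix of the adjusted log RRs,
    where W^-1 is diagonal with the adjusted standard errors and R is the
    correlation matrix of the crude log RRs."""
    W_inv = np.diag(se_adj)
    return W_inv @ R @ W_inv

se_adj = np.array([0.05, 0.08, 0.12])    # illustrative standard errors
R = np.array([[1.0, 0.3, 0.2],
              [0.3, 1.0, 0.4],
              [0.2, 0.4, 1.0]])          # illustrative correlation matrix
print(approx_cov_logrr(se_adj, R))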
Third, the relative risk within each study was estimated with differing precision, and consequently each observation should be weighted by the inverse of its variance, Var(log RR) = [(log RRU − log RRL)/3.92]², where RRU and RRL are the upper and lower limits of the 95 percent confidence interval of the relative risk estimate (3).
Hence, inverse-variance-weighted and correlation-corrected prepool methods were consistently used in fitting the flexible meta-regression models described in the following two sections. This method assumes that the only source of variability of the effect size is the within-study sampling variation (fixed-effects model). Nevertheless, meta-analyses often have to deal with issues relating to comparability of studies. This implies that differences in effect size among studies, in addition to those within studies, should be considered as a further source of random variability when important heterogeneity of the effect size is observed. Therefore, according to DerSimonian and Laird (1), an additional component of variance explaining the between-study heterogeneity was considered in weighting each observation (random-effects model). The deviance (D) of both the fixed- and random-effects models was computed, and the likelihood ratio test for homogeneity (1 df) was constructed as the difference between the fixed-effects model deviance and the random-effects model deviance (21).
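Putting the pieces together, the corrected prepool fit is a generalized-least-squares regression of the pooled log RRs on the chosen dose transformations, with the study-specific covariance matrices arranged block-diagonally. The sketch below shows the core computation with invented numbers and a single diagonal covariance matrix; a random-effects analogue adds an estimated between-study variance component to the diagonal of C before refitting, in the spirit of DerSimonian and Laird.

import numpy as np

def fit_gls(X, y, C):
    """Generalized-least-squares fit of log RRs y on design matrix X with
    outcome covariance matrix C; returns the coefficients and their covariance."""
    Cinv = np.linalg.inv(C)
    V = np.linalg.inv(X.T @ Cinv @ X)
    beta = V @ X.T @ Cinv @ y
    return beta, V

# Illustrative pooled data: doses (g/day), log RRs, and a diagonal covariance.
x = np.array([5.0, 20.0, 45.0, 70.0])
y = np.array([-0.15, -0.05, 0.10, 0.30])
C = np.diag([0.05, 0.08, 0.10, 0.15]) ** 2
X = np.column_stack([x, x**2])    # quadratic model constrained through the origin
beta, V = fit_gls(X, y, C)
print(beta)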
The 95 percent confidence intervals of the model-predicted log RRs were always obtained by the large-sample approximation x′β̂ ± 1.96√(x′Vx), where β̂ is the vector of estimated parameters, x is the vector of covariates, and V is the covariance matrix of β̂ (10).
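For a fitted model, this approximation can be evaluated directly; a small Python sketch (continuing the hypothetical quadratic fit above, with invented coefficient values):

import numpy as np

def predicted_logrr_ci(x_row, beta, V, z=1.96):
    """Predicted log RR and its large-sample 95% confidence limits at one
    covariate vector x_row, given coefficients beta and their covariance V."""
    fit = float(x_row @ beta)
    se = float(np.sqrt(x_row @ V @ x_row))
    return fit, fit - z * se, fit + z * se

beta = np.array([-0.02, 0.0004])            # illustrative estimates
V = np.array([[1.0e-5, -1.0e-7],
              [-1.0e-7, 4.0e-9]])           # illustrative covariance matrix
print(predicted_logrr_ci(np.array([10.0, 100.0]), beta, V))   # dose 10 g/day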
All of the analyses were carried out using a Statistical Analysis System macro (54, 55) developed for the current application. It is briefly described in the Appendix.
FRACTIONAL POLYNOMIAL MODELS
The fractional polynomial model of order M, log(RR|x) = β1x^(p1) + β2x^(p2) + ... + βMx^(pM) (equation 1), describes log RR as a function of power transformations of the exposure variable x, where pm is an element of the power vector p, with m = 1, ..., M, and βm is the corresponding coefficient (11). (No intercept term was considered for this model, or for any of the other models described in this paper, since for x = 0 the function must equal log RR = 0.)
It is worth considering the family of second-order fractional polynomials specifically, because only monotonic curves may be modeled by the first-order models, and because fractional polynomials with an order higher than 2 are rarely required in practice (11).
Although it is rather simple, the family of second-order fractional polynomial models offers considerable flexibility. In particular, by choosing p1 and p2 from the predefined set P = {−2, −1, −0.5, 0, 0.5, 1, 2, 3}, a very rich set of possible functions, including some so-called U-shaped and J-shaped relations, may be accommodated. The powers are expressed according to the Box-Tidwell transformation (12), in which x^(pi) denotes x^pi if pi ≠ 0 and log x if pi = 0. When p1 = p2 = p, the model becomes log(RR|x) = β1x^p + β2(x^p log x).
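For concreteness, the transformation convention described above can be sketched as follows (Python, with function names of our own choosing; the exposure values are illustrative and must be positive so that log x and negative powers are defined):

import numpy as np

def fp2_design(x, p1, p2):
    """Design matrix (no intercept) of a second-order fractional polynomial.

    Box-Tidwell convention: x^(p) means x**p for p != 0 and log(x) for p = 0;
    for repeated powers p1 = p2 = p, the two columns are x^p and x^p * log(x)."""
    x = np.asarray(x, dtype=float)
    bt = lambda p: np.log(x) if p == 0 else x ** p
    if p1 == p2:
        return np.column_stack([bt(p1), bt(p1) * np.log(x)])
    return np.column_stack([bt(p1), bt(p2)])

x = np.array([2.5, 10.0, 30.0, 60.0])    # illustrative doses, g/day
print(fp2_design(x, 0.5, 0.5))           # the best-fitting power pair reported below

With the eight candidate powers, there are 36 distinct second-order models (pairs with p1 ≤ p2), each of which is fitted and compared with the reference model through its gain G.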
The best fit among the family of models thus generated is defined as that with the highest likelihood or, equivalently, that with the lowest deviance. In the setting of fractional polynomials of order M (let us suppose, with M = 2), it is convenient to use the deviance associated with a reference model (e.g., with the conventional quadratic polynomial) as the baseline for reporting the deviances of other models. Thus, the gain (G) for a given model is defined as the deviance associated with the reference model minus that for the model in question. Since G moves in the direction opposite that of D, a larger gain indicates a better fit (56).
When several fractional polynomials of the same order are to be compared, Royston and Altman (12) suggest preferring a given model to the reference model when G exceeds the 90th percentile of the χ² distribution with M df (that is, 4.61 for second-order models).
The conventional quadratic model (p1 = 1, p2 = 2) fits the data better than the linear model (p1 = 1), with a gain in deviance of 54.32, consistent with the expected J-shaped dose-response relation between alcohol and mortality.
Table 2 shows the gain in fitting each second-order fractional polynomial model with respect to the quadratic polynomial. The best-fitting model (p1 = p2 = 0.5) offers a gain in deviance of 138.36 with respect to the reference model.
SPLINE REGRESSION MODELS
Suppose that the range of the exposure variable x is divided into C categories by C − 1 points k1, ..., kC−1, called knots, and that the cubic model log(RR|x) = β1x + β2x² + β3x³ (equation 2) is fitted within each category. Smooth joins are obtained by requiring the function to be twice continuously differentiable over the range of x. Simultaneously fitting the C category-specific cubic models (equation 2) subject to continuity of the function and its derivatives is equivalent to fitting the single cubic spline model log(RR|x) = β1x + β2x² + β3x³ + γ1(x − k1)₊³ + ... + γC−1(x − kC−1)₊³ (equation 3), where (x − ki)₊ = max(0, x − ki) denotes the positive part. Hence, fitting the model shown in equation 3 requires estimating C + 2 parameters.
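Under this parameterization, the design matrix of equation 3 has columns x, x², x³, and one truncated cubic term per knot; a minimal Python sketch (function and variable names are ours):

import numpy as np

def cubic_spline_design(x, knots):
    """Columns x, x^2, x^3 and (x - k)_+^3 for each knot k (equation 3, no intercept)."""
    x = np.asarray(x, dtype=float)
    cols = [x, x**2, x**3] + [np.clip(x - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)

x = np.linspace(0.0, 80.0, 9)                              # illustrative dose grid
print(cubic_spline_design(x, knots=[10.0, 30.0]).shape)    # C - 1 = 2 knots -> C + 2 = 5 columns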
Depending on the exposure distribution, it might sometimes happen that one finds odd behavior on the part of the fitted curve in the last exposure category, such as sudden and unexpected changes in trend. When this happens, further constraints can be easily introduced by restricting the fitted curve to be linear in the upper category (14).
The cubic spline with the upper category restricted to be linear (equation 4) is obtained by applying to equation 3 the linearity condition for x ≥ kC−1, expressed by the constraints β3 + γ1 + ... + γC−1 = 0 and β2 − 3(γ1k1 + ... + γC−1kC−1) = 0. It is worth noting that, in restricting a spline of order q with C − 1 knots, C − 1 + q − (q − 1) = C parameters are estimated in fitting equation 4, independently of the order of the polynomial.
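Under the same parameterization, substituting the two constraints into equation 3 leaves the linear term plus one basis column per knot, so the restricted fit estimates exactly C parameters. The following sketch is based on our own substitution of the constraints, not taken from the paper:

import numpy as np

def restricted_cubic_spline_design(x, knots):
    """Basis for the cubic spline that is linear above the last knot.

    Substituting beta3 = -(gamma1 + ...) and beta2 = 3*(gamma1*k1 + ...) into
    equation 3 gives one column 3*k*x^2 - x^3 + (x - k)_+^3 per knot k, plus x;
    each column is linear beyond its knot, so the whole curve is linear above
    the last knot, and only C coefficients remain to be estimated."""
    x = np.asarray(x, dtype=float)
    cols = [x] + [3 * k * x**2 - x**3 + np.clip(x - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)

x = np.linspace(0.0, 80.0, 9)
print(restricted_cubic_spline_design(x, knots=[10.0, 30.0]).shape)   # 2 knots -> C = 3 columns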
A crucial problem in spline regression is the choice of the location and number of knots. With the aim of avoiding arbitrary choices, knots were first located at the boundaries between equal-sized categories (e.g., one knot at the 50th percentile, two knots at the 33rd and 66th percentiles, and so on). A problem with this fixed percentile procedure, however, is that it examines only a few of the many possible, and putatively better, knot positions. Following the recommendation of Royston (56), the alternative approach used in the current application is to assign knots at random positions and evaluate the likelihood of each resulting model. This procedure, called the random percentile procedure, was repeated 100 times, generating a family of 400 models (4 × 100) randomly selected among those with one, two, three, and four knots, with the best-fitting model being regarded as the best among those entertained. The model minimizing Akaike's Information Criterion (AIC), a penalized likelihood which takes into account the number of parameters estimated in the model (56), was used as the general rule for model choice, because the AIC has been shown to be appropriate when the model complexity is potentially unlimited (11). In the random percentile procedure, the position of each knot was regarded as an additional parameter of the model to be taken into account in the computation of the AIC (56).
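The random percentile search can be sketched as follows (Python; ordinary weighted least squares and a weighted residual sum of squares stand in for the covariance-corrected fit and its deviance, and all names are ours):

import numpy as np

rng = np.random.default_rng(2004)

def fit_wls(X, y, w):
    """Weighted least squares; returns coefficients and the weighted RSS
    (used here as a stand-in for the model deviance)."""
    XtW = X.T * w
    beta = np.linalg.solve(XtW @ X, XtW @ y)
    resid = y - X @ beta
    return beta, float(resid @ (w * resid))

def random_knot_search(x, y, w, design_fn, n_rep=100, max_knots=4):
    """Draw knot sets at random percentiles of x, fit each spline, and keep the
    model with the smallest AIC; knot positions count as extra parameters."""
    best = None
    for n_knots in range(1, max_knots + 1):
        for _ in range(n_rep):
            knots = np.quantile(x, np.sort(rng.uniform(0.05, 0.95, size=n_knots)))
            X = design_fn(x, knots)
            _, dev = fit_wls(X, y, w)
            aic = dev + 2 * (X.shape[1] + n_knots)
            if best is None or aic < best[0]:
                best = (aic, knots)
    return best

# Usage (with the spline bases sketched above and pooled data x, y, weights w):
#   best_aic, best_knots = random_knot_search(x, y, w, restricted_cubic_spline_design)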
Table 3 shows the AIC values according to increasing numbers of knots of cubic splines from 1 to 4. Restricted cubic splines and, among these, models constructed by considering the random percentile procedure for the choice of knot locations had lower AIC values. Clearly, there is nothing to be gained in this example by increasing the number of knots, since this always increases the AIC. Thus, the model with one knot optimized at the 25th percentile remains the most satisfactory choice.
COMPARING MODELS
The functions obtained by fitting conventional quadratic and best-fitting fractional polynomial and cubic spline random-effects models are compared in figure 2. Very similar trends were obtained by means of the best-fitting fractional polynomial and cubic spline models, whereas the quadratic curve was very different. The nadir (the level of alcohol consumption at which all-cause risk is lowest) and the last protective dose were 20 and 52 g/day, 5 and 44 g/day, and 6 and 46 g/day according to the quadratic, fractional polynomial, and cubic spline models, respectively.
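Given any fitted curve, the nadir and the last protective dose can be read off a fine grid of predicted log RRs, for example (Python sketch with an invented curve, not the fitted models of figure 2):

import numpy as np

def nadir_and_last_protective_dose(predict_logrr, max_dose, step=0.1):
    """The nadir is the dose with the lowest predicted log RR; the last protective
    dose is the highest dose at which the predicted log RR is still below zero."""
    doses = np.arange(step, max_dose + step, step)
    logrr = predict_logrr(doses)
    nadir = float(doses[np.argmin(logrr)])
    protective = doses[logrr < 0]
    last_protective = float(protective.max()) if protective.size else None
    return nadir, last_protective

print(nadir_and_last_protective_dose(lambda d: 0.0004 * d**2 - 0.02 * d, 100.0))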
Figure 3 plots the dose-response fractional polynomial and cubic spline functions in men and women. In men, the nadir is reached at 6 g/day and 7 g/day according to fractional polynomials and cubic splines, respectively (RR = 0.82 (95 percent confidence interval (CI): 0.74, 0.90) and RR = 0.83 (95 percent CI: 0.74, 0.93), respectively), and the protective action is observed up to a level of 60 g/day for both models. However, the protective effect was statistically significant only up to a daily intake of 28 g (RR = 0.90 (95 percent CI: 0.82, 0.99) and RR = 0.90 (95 percent CI: 0.81, 0.99), respectively). In women, the nadir is reached at 5 g/day and 6 g/day according to fractional polynomials and cubic splines, respectively (RR = 0.76 (95 percent CI: 0.68, 0.85) and RR = 0.77 (95 percent CI: 0.68, 0.88), respectively), and the protective action is observed up to levels of 47 g/day and 51 g/day. The protective effect was statistically significant only up to a daily intake of 24 g (RR = 0.89 (95 percent CI: 0.80, 0.99) and RR = 0.87 (95 percent CI: 0.77, 0.97), respectively).
DISCUSSION
A challenge in epidemiologic studies, as well as in meta-analysis of such studies, is to determine the form of the relation between a given exposure and a disease. If the dose-response trend is not linear, there may be important public health implications. For example, if there is a threshold for the effect of alcohol intake on the risk of death, measures addressed to the general population to reduce exposure below the threshold level would have more public health implications than control measures focused primarily on high consumers. Hence, there is considerable interest in understanding the shape of the dose-response relation.
In this paper, we have discussed flexible curve-fitting methods for obtaining models whose functional form is consistent with background knowledge.
An advantage of parametric approaches is that they yield easily communicable results. The main advantage of parametric regression models using fractional polynomials rather than traditional polynomials is that models containing as few as two power transformations encompass a large range of shapes (13), accommodating almost all known dose-response relations. In the current application, a dramatic improvement in goodness of fit was observed for the best-fitting second-order fractional polynomial with respect to the quadratic polynomial, which is the most popular tool for encompassing J-shaped relations.
However, both traditional and fractional polynomials make use of all of the available data to build the global structure of the relation. Of course, this global structure may or may not be correct. Thus, perturbations of the data at one extreme of the exposure range necessarily affect the fitted model at the opposite extreme. This application, however, suggests that fractional polynomials are less affected by this problem than ordinary polynomials (see figure 2). In fact, the fast decrease in relative risk that is visually evident in the scatterplot of figure 1, followed by the gradual increase in relative risk thereafter, is better captured by the selected fractional polynomial model than by the quadratic polynomial. Accordingly, both the dose of alcohol showing the maximum protective effect and the dose that discriminates between protective and harmful effects were much lower for the best-fitting fractional polynomial than for the traditional quadratic polynomial. Nevertheless, the effect of perturbations at extreme values of the exposure variable must still be present in some form.
As a consequence, it should be considered good practice to compare the results of the best-fitting fractional polynomial with those of a spline analysis, which adapts locally. In this application, similar nadirs and last protective doses were found with the two approaches.
The meta-analysis described in this paper included most of the published information on alcohol and mortality currently available in the literature. The results suggest that the level of alcohol consumption at which all-cause risk is lowest is approximately 5 g/day and that alcohol exerts a protective action up to a daily intake of approximately 45 g, without any relevant differences between fractional polynomials and cubic splines.
However, because of the general weakness of meta-analysis of observational studies, our findings should be interpreted with caution. Use of different strategies in characterizing exposure categories (including the approximation involved in using the midpoint of each exposure category and the handling of the last open-ended category) may drive investigators toward different conclusions, the extent of which has not been investigated here. The main concerns relate to comparability between studies. For example, it is questionable to pool results from studies that differ in terms of the measurement of alcohol consumption, the demographic characteristics of the investigated subjects, the main patterns of alcohol consumption, the methods used to obtain the reported relative risks, the degree of adjustment of the reported relative risks, and other unmeasured factors. Among these sources of variability, we investigated only the effect of gender, finding that: 1) the level of alcohol consumption at which all-cause risk is lowest was approximately 6 g/day in both men and women; 2) the greatest reduction in death risk was approximately 18 percent in men and 24 percent in women; and 3) alcohol seems to exert a protective action up to daily intakes of 60 g in men and 50 g in women. These findings are consistent with a recently published pool of systematic reviews (58). Other sources of heterogeneity, however, need to be investigated in order to characterize the dose-response relation between alcohol and mortality accurately. Moreover, other issues, such as publication bias, were not addressed in our study.
Therefore, we think that flexible modeling of dose-response aggregate data allows exploration of functional relations but does not overcome the general weakness of meta-analysis of observational studies. More generally, meta-analysis should be considered a useful alternative to but not a substitute for large, well-designed single investigations.
In summary, our results show that summarized curves generated from conventional polynomials may be misleading. Fractional polynomials and cubic splines both provide great flexibility for meta-analysis of dose-response aggregate data and are especially valuable when important nonlinearity is anticipated. The simple algebraic forms of such models and the fact that they are linear in their coefficients make it easy to implement them in practice with standard statistical software. Fractional polynomials are easier to communicate mathematically, require the estimation of fewer parameters, and are less influenced by arbitrariness in the choice of the model. On the other hand, cubic splines more adequately encompass sudden and important changes in relative risks. Hence, parallel analysis with both approaches seems the best strategy for obtaining unbiased and plausible dose-response curves.
APPENDIX
The macro requires a SAS input data set specified by the user containing the following fields: id, the unique identifier of the study; exp, the assigned numerical value x of exposure; and n1 and n2, the numbers of cases and noncases (person-times or persons at risk) for each exposure level. In addition, adjusted estimates of relative risk (adjrr) for the nonzero exposure levels and the lower (lower) and upper (upper) limits of the corresponding 95 percent confidence intervals can be reported. A set of covariates describing the study's characteristics can also be added.
To call the macro, the user must be within PROC IML. The "flexible" macro is included in the program via the "%include filename" statement, where filename is the name of the "flexible" macro file. Next, the following macro variables are required: dataset, which indicates the name of the user's input data set; ncov and covlist, representing, respectively, the number and names of the covariates describing the study's characteristics; and type, which specifies whether the analysis is to be based on fractional polynomials (type = 1) or on cubic splines (type = 2). If analyses based on fractional polynomials (FP) are chosen, the "_FP_" module is invoked, and the following macro variables must be specified: power, defining the set of r and s power transformations of the exposure variable, and ref, defining the reference model in the gain computation. If analyses based on cubic splines (CS) are chosen, the "_CS_" module is invoked, and the following macro variables must be specified: restr, specifying whether the cubic smoothing spline is to be tail-restricted, and nrnd, defining the number of replications needed for the random percentile knot allocation method. Finally, the macro variable random defines whether random-effects models are to be fitted within the "_FP_" or "_CS_" modules.
The following instructions allow us to perform fractional polynomial analysis considering data from all of the 29 studies included in this analysis, which are stored in a data set called COHORT.
proc iml;
%include flex_mac;  /* load the "flexible" macro */
dataset = COHORT; ncov = 0; covlist = {" "};  /* input data set, no study-level covariates */
type = 1;  /* 1 = fractional polynomials, 2 = cubic splines */
power = {-2, -1, -0.5, 0, 0.5, 1, 2, 3};  /* candidate powers p1 and p2 */
ref = {1, 2};  /* reference model: the conventional quadratic (p1 = 1, p2 = 2) */
random = 0;  /* whether to fit random-effects models (here: no) */
%flexible(dataset, ncov, covlist, type, power, ref, random);
quit;
The resulting output is a table displaying the gain in the deviance for each second-order model with respect to the second-order reference model (see table 2). Parameter estimates and the corresponding standard errors of the reference and of the best-fitting model are also given.
In addition, an output data set called _PRED_ reports, for each model, the predicted value of the log relative risk (and 95 percent confidence interval) for each integer value of the exposure variable ranging from zero to the maximum value reported in the original data. Detailed graphs of the dose-response relation, like those presented in figure 2, could be obtained using any graphic software simply by plotting the data stored in _PRED_.
Further details on the use of the macro can be obtained from the first author (V. B.).
REFERENCES