THE AUTHORS REPLY

Karen Leffondré1,2, Michal Abrahamowicz1,2, Jack Siemiatycki3,4 and Bernard Rachet4

1 Department of Epidemiology and Biostatistics, McGill University, Montreal, Quebec, Canada
2 Division of Clinical Epidemiology, The Montreal General Hospital, Montreal, Quebec, Canada
3 Département de Médecine Sociale et Préventive, Université de Montréal, Montreal, Quebec, Canada
4 Department of Epidemiology and Biostatistics, INRS-Institut Armand-Frappier, Université du Québec, Laval-des-Rapides, Quebec, Canada

We thank Drs. Hoffmann and Bergmann for their helpful observations (1). They comment on some aspects of our study (2) and suggest a novel approach to modeling smoking history.

Hoffmann and Bergmann (1) suggest that we did not consider the difficulties involved in simultaneous modeling of several smoking-related variables. Yet comparison of our models 16–18 clearly underlined the difficulties in simultaneously modeling age at initiation, duration, and/or time since cessation while adjusting for age (2). We demonstrated that including all of these variables (model 17) was not tenable because of multicollinearity, such that "interpretation of the resulting estimates is impossible" (2, p. 820). McKnight et al. (3), in a study cited in our paper, focused on difficulties specific to simultaneous modeling of two continuous exposure variables that were categorized and that both had an assigned value of zero for nonexposed subjects. They suggested a solution similar to our inclusion of a binary smoking status indicator in model 9 (2). However, our approach avoids the limitations of categorizing continuous variables and allows exploration of multicollinearity problems, which were ignored by McKnight et al.
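The multicollinearity at issue can be illustrated with a minimal sketch. It assumes simulated ex-smokers who smoked without interruption, so that current age equals age at initiation plus duration plus time since cessation. The data and variable names are hypothetical and are not taken from the study (2).

```python
import numpy as np

# Simulated (hypothetical) ex-smokers; values are NOT from the original study.
rng = np.random.default_rng(0)
n = 200
age_init = rng.uniform(14, 30, n)               # age at smoking initiation (years)
duration = rng.uniform(1, 40, n)                # smoking duration (years)
time_since_cessation = rng.uniform(0, 20, n)    # years since quitting

# Assumption: continuous smoking, so age is an exact sum of the three variables.
age = age_init + duration + time_since_cessation

# Design matrix: intercept, age, and the three time-related smoking variables.
X = np.column_stack([np.ones(n), age, age_init, duration, time_since_cessation])

rank = np.linalg.matrix_rank(X)
print(f"columns: {X.shape[1]}, rank: {rank}")
```

Under this assumption the design matrix has one more column than its rank, so the four time-related coefficients cannot all be estimated separately. In real data the dependence is typically near rather than exact, which yields unstable rather than undefined estimates.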

It is unclear why Hoffmann and Bergmann (1) are concerned that in our analyses, goodness of fit improves with the inclusion of additional variables. Akaike’s Information Criterion does not necessarily improve with an increasing number of covariates; rather, it corrects for this number (4). Indeed, our model 17 included one more variable than model 16 but yielded a worse Akaike’s Information Criterion (2).
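As a small worked illustration of this point, the sketch below uses the standard definition AIC = 2k − 2 log L, where k is the number of estimated parameters and log L the maximized log-likelihood. The log-likelihood values and parameter counts are hypothetical and are not those of models 16 and 17.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike's Information Criterion: 2k - 2*logL (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical values, chosen only to illustrate the penalty term.
aic_smaller = aic(log_likelihood=-500.0, n_params=5)
# One extra covariate raises the log-likelihood by only 0.2.
aic_larger = aic(log_likelihood=-499.8, n_params=6)

print(aic_smaller, aic_larger)
print("larger model preferred:", aic_larger < aic_smaller)
```

In this sketch the larger model fits slightly better in terms of likelihood but has the worse (higher) criterion, because an added parameter must raise the log-likelihood by more than 1 to lower the AIC.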

Our paper focused on fairly simple and commonly used approaches to modeling smoking history. We never intended to explore all possible approaches. Specifically, we did not consider interactions between continuous smoking-related variables, partly because none of the recent studies we screened assessed such interactions and partly because of space limitations. For similar reasons, we did not consider more sophisticated approaches mentioned in our Discussion (2).

However, it may be of interest to investigate the advantages and limitations of the approach proposed by Hoffmann and Bergmann (1). Indeed, we believe that their smoking indicator X may be especially useful for testing the overall effect of smoking or adjusting for it. If the one-component model is consistent with the true (unknown) data structure, it might lead to a better goodness of fit than the use of, for example, separate variables for cigarette-years and time since cessation. There are, however, some potential limitations of using X. First, the corresponding regression coefficient may be difficult to interpret. Second, the proposed formula for X implies, for example, a gradual leveling off of the effect of increasing smoking duration, which may not apply in some circumstances. Third, using X implies choosing a priori the values of the half-time (τ) and lag (δ) parameters (1), both of which are likely to influence the results, yet in some studies there may not be sufficient prior knowledge to justify such choices. On the other hand, choosing these parameters a posteriori, as suggested by Hoffmann and Bergmann, may create some inferential problems, such as inflated type I error (5). Finally, some issues investigated in our paper (2) apply to X as well. Centering X and including the binary indicator of ever smoking in the model would help in interpreting the results of analyses that included never smokers. In addition, using X would not eliminate the problem of separating the effect of age at smoking initiation from other time-related smoking variables (2).
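The leveling off with duration and the sensitivity to the half-time and lag parameters can be sketched as follows. The exact definition of X is given in reference 1 and is not reproduced here. The function below is only an assumed form, with exponential half-life accumulation and decay plus a lag, and its name, arguments, and input values are hypothetical.

```python
def smoking_index(duration, time_since_cessation, intensity, tau, delta):
    """Illustrative one-component smoking index (ASSUMED form, may differ from ref. 1).

    tau   : half-time of the effect (years)
    delta : lag before exposure is assumed to matter (years)
    """
    # Lagged time since cessation and lagged duration, truncated at zero.
    tsc_lagged = max(time_since_cessation - delta, 0.0)
    dur_lagged = max(duration + time_since_cessation - delta, 0.0) - tsc_lagged
    accumulation = 1.0 - 0.5 ** (dur_lagged / tau)   # saturates as duration grows
    decay = 0.5 ** (tsc_lagged / tau)                # halves every tau years after quitting
    return accumulation * decay * intensity

# Current smokers (time since cessation = 0) at 20 cigarettes/day, hypothetical values.
for tau in (10.0, 20.0):
    values = [smoking_index(d, 0.0, 20.0, tau=tau, delta=1.0) for d in (10, 20, 30, 40)]
    print(f"tau={tau}:", [round(v, 2) for v in values])
```

Under this assumed form, each additional decade of smoking adds less to the index than the previous one, and changing τ changes both the values and the rate of saturation, which is why the a priori choice of τ and δ matters.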

In summary, we think that the approach suggested by Hoffmann and Bergmann (1) may be of interest as a parsimonious representation of different aspects of smoking history. However, further investigation is needed to assess its potential advantages and limitations.

REFERENCES

  1. Hoffmann K, Bergmann MM. Re: "Modeling smoking history: a comparison of different approaches." (Letter). Am J Epidemiol 2003;158:393.
  2. Leffondré K, Abrahamowicz M, Siemiatycki J, et al. Modeling smoking history: a comparison of different approaches. Am J Epidemiol 2002;156:813–23.
  3. McKnight B, Cook LS, Weiss NS. Logistic regression analysis for more than one characteristic of exposure. Am J Epidemiol 1999;149:984–92.
  4. Akaike H. A new look at the statistical model identification. IEEE Trans Automat Control 1974;AC-19:716–23.
  5. Abrahamowicz M, MacKenzie T, Esdaile JM. Time-dependent hazard ratio: modeling and hypothesis testing with application in lupus nephritis. J Am Stat Assoc 1996;91:1432–9.