Commentary: Are piecewise mixed effects models useful in epidemiology?

Rebecca Hardy

MRC National Survey of Health and Development, Department of Epidemiology and Public Health, Royal Free & University College Medical School, 1–19 Torrington Place, London WC1E 6BT, UK. E-mail: rebecca.hardy@ucl.ac.uk

The call for applied researchers to learn how to use mixed effects models in the paper by Naumova, Must and Laird1 in this issue of the International Journal of Epidemiology is to be welcomed. The value of longitudinal studies, in which measurements on the same sample of individuals are taken repeatedly over time, is well understood. Such studies are costly, both financially and in terms of the time and effort required on the part of study members and researchers. Correct analysis of the often vast amount of data collected is therefore vital. Despite considerable discussion regarding the analysis of repeated measures in the statistics literature, the use of more complex models in medical and epidemiological journals remains rare. As well as researchers being encouraged to use more complex models, journal editors should also welcome their use where appropriate.

Mixed effects models can be used to analyse straightforward repeated outcome measure data over time, many examples of which exist in epidemiology. However, the specific mixed effects model presented by Naumova et al.1 is more interesting. The idea of using a time scale other than age or calendar time is introduced, as is the use of a piecewise model. The piecewise model allows separate slopes to be fitted to the observations representing the periods before and after a ‘critical period’ or ‘event’. The extension of the basic model to determine the individual characteristics that relate to differences in the increase in fat after menarche is particularly valuable to the understanding of obesity development. Understanding how early life and childhood growth influence not only obesity, but also later disease development has been highlighted by the recent interest in life course epidemiology.2 For example, early age at menarche3 and tall adult height4 have been consistently related to increased breast cancer risk, but the mechanism underlying the associations may be due to childhood growth. Women who were both big babies and tall at 7 years have been found to have a greater risk of premenopausal breast cancer than others.5 Hence, understanding how growth patterns, in relation to menarche, are influenced by factors such as childhood diet and socioeconomic status is important in breast cancer aetiology. Menopause is another obvious event to which the proposed models could be applied. Insight into the debates regarding the influence of menopause on cardiovascular risk factors6 and on psychological health7 could be gained. The ‘event’ could even be social such as loss of job or divorce, both of which may influence health.
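The piecewise idea described above can be illustrated with a short Python sketch. It is not the model or the SAS code of Naumova et al.; it only shows one common way of coding time relative to an event so that separate slopes apply before and after it. The function names and all numerical values are hypothetical and chosen purely for illustration.

```python
def piecewise_terms(age, event_age):
    """Split time relative to the event into 'before' and 'after' components.

    Before the event the 'after' term is zero, and vice versa, so each
    slope only acts on its own side of the event.
    """
    t = age - event_age          # time since the event (negative before it)
    return min(t, 0.0), max(t, 0.0)


def predicted_outcome(age, event_age, level_at_event, slope_before, slope_after):
    """Fixed-effects part of a two-slope piecewise linear trajectory.

    In a mixed effects version, level_at_event and both slopes would each
    have an individual-specific random deviation added; that is omitted here.
    """
    before, after = piecewise_terms(age, event_age)
    return level_at_event + slope_before * before + slope_after * after


# Hypothetical girl: event at age 12.5, outcome 25.0 at the event,
# rising 0.5 units/year before it and 2.0 units/year afterwards.
for age in (10.5, 12.5, 14.5):
    print(age, predicted_outcome(age, 12.5, 25.0, 0.5, 2.0))
```

Because both time terms are zero at the event itself, the two line segments join there, and the intercept is interpretable as the outcome level at the event.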

The additional information obtained from mixed effects models regarding between-individual variability can be of interest in its own right.8 Assuming that the individual post menarcheal slopes are normally distributed, approximately 95% of girls' slopes will lie within ±1.96 standard deviations of the overall average slope. For the example presented by Naumova et al.1, the limits would be 0.62–4.30% and hence approximately 2.5% of girls have an increase in fat above 4.3% per year. The percentage of girls above or below any given slope can be obtained by reference to the standard normal distribution. It is this idea that is utilized when Naumova et al.1 observe that there are virtually no girls with a negative slope after menarche but 37% with a negative slope before menarche. Such information could be useful when planning obesity prevention strategy. The correlation between body fat at menarche and post menarche slope of –0.59 suggests that the greater the fat at menarche the slower the increase in fat afterwards.
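The normal-distribution calculation described above can be sketched in Python using only the standard library. This is an illustrative sketch: the mean and standard deviation of the slopes are not quoted in this commentary, so they are back-calculated here from the limits of 0.62 and 4.30% per year under the stated normality assumption. The variable names are hypothetical.

```python
from statistics import NormalDist

# Limits quoted in the text (% body fat per year after menarche).
lower, upper = 0.62, 4.30

# Two-sided 95% multiplier from the standard normal (about 1.96).
z = NormalDist().inv_cdf(0.975)

# Back-calculate the average slope and between-girl SD from the limits
# (an assumption: the limits are taken to be mean +/- z * SD).
mean_slope = (lower + upper) / 2
sd_slope = (upper - lower) / (2 * z)

slopes = NormalDist(mean_slope, sd_slope)

# Proportion of girls with a slope above the upper limit, and below zero.
prop_above_upper = 1 - slopes.cdf(upper)
prop_negative = slopes.cdf(0.0)

print(f"mean slope = {mean_slope:.2f}, SD = {sd_slope:.2f}")
print(f"proportion above {upper}: {prop_above_upper:.3f}")
print(f"proportion with negative slope: {prop_negative:.4f}")
```

Under these assumptions the proportion with a negative slope after menarche comes out well under 1%, which is in line with the observation that there are virtually no such girls.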

It may be unclear to the non-statistician why the desired statistical property of efficiency is important. In practice, the confidence intervals of the estimates in an efficient analysis will be narrower than those from a less efficient one, and hence use of the more efficient analysis means there is greater power to detect a given effect. Efficiency can therefore also have implications for study design and sample size requirements. Although some improvement in efficiency is achieved through the model fitting procedure itself, additional gains will often be made because mixed effects models allow more individuals to be included in the analysis than standard methods. This is due to their ability to handle missing outcome data and measures collected at different time points for different individuals. This represents a major practical advantage, as unbalanced data are unavoidable in longitudinal epidemiological studies. Although, in technical terms, such models can always handle missing data, assumptions about the missing data mechanism must be made in order for the inferences to be valid. Specifically, the outcome data must be missing at random, which means that the probability of a measure being missing may depend upon observed, but not unobserved, measures. Although this assumption is impossible to test, its plausibility should be considered in each analysis, as it may be more realistic in some cases than others. In studies of cognitive function in the elderly, for example, a missing cognitive test score may be more likely if function has deteriorated, in which case the missingness would depend on the unobserved score itself. It should also be noted that mixed effects models cope far less easily with missing covariate information.

With the increased ease of implementation, further aided by the SAS programs provided in the paper,1 wider use of such models should be encouraged. It has to be acknowledged, however, that these models are complex, and this is perhaps highlighted by the fact that Naumova et al.1 still require a certain amount of mathematical notation in their explanation. Effort is therefore required on the part of the applied researcher wishing to utilize such models correctly. The use of graphical displays at all stages of the analysis can certainly aid understanding. The figures presented by Naumova et al.1 highlight the two components of the mixed effects model, the subject-specific growth curves and the population averages, which are key to such models. Finally, as with all statistical techniques, but perhaps particularly with one as flexible as the mixed effects model, caution is required and all analyses should be carried out with due thought to the specific hypothesis under consideration. The added complexity of having to specify correctly the random effects component of the model as well as the fixed effects means that model checking in addition to that usually carried out is required.9,10

References

1 Naumova EN, Must A, Laird NM. Evaluating the impact of ‘critical periods’ in longitudinal studies of growth using piecewise mixed effects models. Int J Epidemiol 2001;30:1332–41.

2 Kuh D, Ben-Shlomo Y (eds). A Life Course Approach to Chronic Disease Epidemiology: Tracing the Origins of Ill-health from Early to Adult Life. Oxford: Oxford University Press, 1997.

3 Kelsey JL, Gammon MD, John EM. Reproductive factors and breast cancer. Epidemiol Rev 1993;15:36–47.

4 Wang DY, De Stavola BL, Allen DS et al. Breast cancer is positively associated with height. Breast Cancer Res Treatment 1997;43:123–28.

5 De Stavola BL, Hardy R, Kuh D, dos Santos Silva I, Wadsworth M, Swerdlow AJ. Birthweight, childhood growth and risk of breast cancer in a British cohort. Br J Cancer 2000;83:964–68.

6 Matthews KA, Kuller LH, Sutton-Tyrrell K, Chang Y-F. Changes in cardiovascular risk factors during the perimenopause and postmenopause and carotid artery atherosclerosis in healthy women. Stroke 2001;32:1104–11.

7 Klein P, Versi E, Herzog A. Mood and menopause. Br J Obst Gynaecol 1999;106:1–4.

8 Hardy R, Wadsworth M, Kuh D. The influence of childhood weight and socioeconomic status on change in adult body mass index in a British national cohort. Int J Obesity 2000;24:725–34.

9 Beacon H, Thompson S. Multi-level models for repeated measurement data: application to quality of life data in clinical trials. Stat Med 1996;15:2717–32.

10 Cnaan A, Laird NM, Slasor P. Using the general linear mixed model to analyze unbalanced repeated measures and longitudinal data. Stat Med 1997;16:2349–80.




