a Tufts University School of Medicine, Department of Family Medicine and Community Health, Boston, MA, USA.
b Harvard School of Public Health, Department of Biostatistics.
Elena N. Naumova, Department of Family Medicine and Community Health, Tufts University School of Medicine, 136 Harrison Avenue, Boston MA 02111, USA. E-mail: elena.naumova{at}tufts.edu
Abstract
Recent developments in modern multivariate methods provide applied researchers with the means to address many important research questions that arise in studies with repeated measures data collected on individuals over time. One such area of applied research is focused on studying change associated with some event or critical period in human development.
This tutorial deals with the use of the general linear mixed model for regression analysis of correlated data with a two-piece linear function of time corresponding to the pre- and post-event trends. The model assumes a continuous outcome is linearly related to a set of explanatory variables, but allows for the trend after the event to be different from the trend before it. This task can be accomplished using a piecewise linear random effects model for longitudinal data where the response depends upon time of the event.
A detailed example that examines the influence of menarche on changes in body fat accretion will be presented using data from a prospective study of 162 girls measured annually from approximately age 10 until 4 years post menarche.
Keywords Growth, fat accretion, menarche, obesity, critical periods, longitudinal data, piecewise linear model, random effects model, mixed effects model
Accepted 12 March 2001
Evaluation of the impact of critical or high-risk periods in longitudinal studies of growth may provide clues to the aetiology of life-long obesity or the optimal timing of preventive or therapeutic intervention. If the critical period has a lasting effect on an ongoing temporal process, we must be able to observe the changes in this temporal process with repeated measurements over pre- and post-critical periods. The ability to model the temporal pattern before and after the critical period provides the information needed to assess whether the critical period has had an enduring effect on the process under observation. The data that derive from such a study are inevitably and unavoidably complex because the timing of critical periods will vary from subject to subject and may not correspond to an occasion of measurement. In addition, subjects may exhibit considerable variability in time trends. Furthermore, the assumptions of independence required for traditional statistical methods are not met in these studies due to the nature of the process itself and to the limitations of any observational study design.
Observational studies to assess the impact of critical periods share a number of challenging features. The critical or high-risk period is often a fairly complex biological process and it is sometimes difficult to define the beginning and end of the critical period. Pregnancy exemplifies a well-defined critical period: it starts with fertilization and ends with delivery. In this case, evaluation of the impact of pregnancy on fat accretion, for example, would be straightforward. We would compare fat accretion before and after pregnancy. When the critical period is less well defined and consists of a series of complex processes, one is forced to choose an event within the critical period, which will serve as a proxy. The event proxy allows the time period to be separated into pre- and post- event periods. The process of sexual maturation with menarche as the event is illustrative of a less well-defined critical period. Here, menarche is of interest to obesity researchers as a proxy for the pubertal period, proposed as a critical period in the development of obesity.
A second challenge arises because the critical period, and consequently the period around the event, is often characterized by increased variability. Increased variability arises for two major reasons: first, the critical period, and the event per se can be thought of as a transient period of destabilization with uncertainty as to the timing of the event; second, since one does not know a priori when the event will occur, measurements at this time are not guaranteed. Menarche, for example, is characterized by changes in the levels of many hormones, some of which affect growth. Ideally, we would like a measurement at the time of the event, or, to increase the frequency of measurements just prior to the event. This, of course, requires one to know when the event will occurthe very information one lacks! Variability should be considered in assessing the impact of the event on the ongoing process.
A third challenge is the unbalanced design associated with the conduct of observational studies. In classically designed experiments, such as clinical trials or laboratory experiments, data can be collected completely and at standard intervals. Under these circumstances of balanced and complete data, covariates typically vary either within subjects (subject-specific level) or between subjects (population level), but not both. In observational studies, this is virtually never the caseunbalanced designs and/or missing data are the norm; it is rarely possible to design the study so that each subject is measured at the same set of times and measured at every single time. This unavoidable feature of observational studies should be incorporated into the analysis.
From a statistical perspective, to evaluate the impact of critical periods in longitudinal studies using a traditional analysis, one could use ordinary least squares (OLS) regression. One would need to do two regressions, one regression for the pre-event period and another for the post-event period, for each subject separately. Then, taking the difference of slopes for each subject, one could calculate the average difference in the slope pairs, its standard error, and perform a t-test to test whether this value is different from zero. The advantage of this approach is that it is simple to do and easy to explain. However, this approach does not take full advantage of the richness of the data, including the correlation structure. Also, the traditional analysis does not characterize individual variation relative to the population mean. In addition, this approach requires one to discard data on subjects with an insufficient number of measurements either before or after the event. This approach will also weight all remaining subjects equally even though their estimates are based on variable numbers of measurements. Finally, fitting two regressions separately will not insure that the lines will meet at the time of event, which is required by the growth process. It is not biologically possible to have an substantial instantaneous shift in body fat.
These limitations are elegantly addressed by modern applied statistical techniques, such as general linear mixed models.1 The mixed model intrinsically consists of two components: subject-specific and population, while accounting for the correlation at each level. The model allows one to characterize individual variation relative to the population mean. The estimates that derive from a mixed model are weighted based on the actual data. Missing data can be handled readily by these techniques. Finally, the mixed model allows one to simultaneously fit the two slopes, so that intersection of the lines at the time of the event is insured.
Given the clear advantages for the applied researcher, why have these methods not been utilized more widely? There are likely several reasons for this. One reason may be that unless compelled to do otherwise, researchers will continue to use familiar methods. The typical researcher may be intimidated by complex formulations and mathematical notation (e.g. Greek letters, double subscripts, matrix notation, multiple brackets). In addition, the complete absence of graphical displays, which can make the step-by-step process far easier to understand, are not often included. Papers describing these methods typically appear in statistical journals and, although they include clear examples and often do explain the model in terms of the variables under study, the amount of mathematical terminology the reader must wade through before they get to interpretation may be overwhelming. These descriptions meet the needs of the statistical reader, but may be difficult for the applied researcher.2
Another issue is terminology: the literature on repeated measurement analysis describes the same models and methods using different terminology, and the lack of consistency is a source of confusion. For example, random effects, mixed effects, two-stage and hierarchical models are variously used to describe the same model. The term hierarchical is especially common in the educational literature, where one may have repeat measures on children in several classrooms in several schools, etc. In this paper, we use the term random effects to describe models where all the parameters in a single individual's regression model are random. We use the term mixed effects to describe models where the parameters can be both fixed and random. Incorporating a fixed effect in the model allows one to include covariates that do not vary within a subject. As another example, consider the interchangeable usage of the terms repeated measures and longitudinal data. Sometimes the term repeated measures is used narrowly to refer to designs where each subject is measured under different conditions, and then they are appropriately analysed with a classical repeated measures analysis. In other instances, the term is used more broadly to include longitudinal studies, where the repeated measures are taken on the same subject over time. To our minds, longitudinal data represent a special case of repeated measures, where the condition is the time at which the measurement was obtained.
Addressing this need is particularly urgent. With the number of large longitudinal studies currently underway, these data will become more common, and they deserve to be analysed properly. The software needed for analysis of longitudinal data obtained from complex study designs is now part of standard statistical packages.3 User-friendly graphic interfaces in the commercially available software allow one to easily visualize each step of data analysis, validation and presentation of results. Together these developments suggest the opportunity to reach a broader audience.
Our goal is to illustrate and present the use of general linear mixed model in a manner that is accessible to applied researchers; we use prospectively collected data to assess the influence of menarche on changes in body fat accretion as an example. We take the reader through the process of data preparation, model formulation, fitting and validation. In this process, we will incorporate traditional and familiar statistical methods, as needed. To visualize the hypothesis under study, actual data are used, and, for practically every step in the development of the mixed model, we incorporate a variety of graphical displays.
Description of the Study
It is known that the peripubertal period is associated with increases in body fatness.4 This increase starts just before or around menarche and is presumed to level off by approximately 4 years after menarche. However, this phenomenon has not been systematically studied in population-based samples. The current epidemic of obesity in the US and elsewhere has stimulated interest in the aetiology, with the possible role of critical periods in the development of obesity as one important focus of efforts.5,6
The data presented here were derived from a prospective study of the development of obesity in a cohort of 162 girls. Initiated in 1990, this ongoing NIH-funded study examines the effects of energy expenditure and other metabolic and behavioural factors on subsequent changes in body fat as adolescence progresses, until 4 years after menarche. At study entry, all girls were non-obese, as determined by a triceps skinfold thickness less than the 85th percentile, and pre-menarcheal. All girls were followed according to a schedule of annual measurements. They reported their date of menarche and were then observed until their exit day, which was scheduled on the fourth anniversary of their menarche.
The outcome variable of interest in the study, body fatness, is based on bioelectric impedance analysis. Per cent body fat (%body fat) was obtained from three basic measurements: weight (Wt [kg]), height (Ht [cm]) and bioelectric impedance resistance (R). Per cent body fat was then calculated using the Kushner equation:7, where total body water TBW = (0.7 Ht2/R) 0.32. For each subject, date of birth, date of study entry, dates of each follow-up measurement, date of menarche, and date of exit were recorded. Then, for baseline and subsequent measurements, the age and times relative to menarche and corresponding age at the date of each measurement were calculated in years, to three significant digits after the decimal point.
Although the endpoint of the study is 4 years after the first menstrual period (menarche), the study is ongoing. As of June 1999, based on protocol completion, of the 162 girls enrolled in the study and followed through menarche, 94 girls have exited (they are the completers), 54 girls remain in the study, and 14 are lost-to-follow-up after menarche. On average, the time period between two consecutive measurements was 1.11 ± 0.3 years. There are 1049 individual %body fat measurements, with an average of 6.5 ± 1.41 measurements per subject: 3.06 ± 1.36 for the pre-menarcheal period (497 measurements) and 3.45 ± 1.22 for the post-menarcheal period (552 measurements). In our data, the number of measurements per subject pre- and post-menarche are almost equal; Table 1 provides the overall frequency of measurements.
|
We conducted an exploratory analysis to check for the presence of unusual features in the data, to visualize the hypothesis under study, and to check the validity of our initial assumptions for modelling. Per cent body fat is shown as a set of individual growth curves. Figure 1 shows the entire set of data, in which each data point is shown as an empty circle for pre-event and a solid diamond for post-event measurements. This plot reveals the variability in subjects' age at entry, age at exit and age at the menarcheal event, but includes too much data to reveal the underlying patterns.
|
|
|
|
We hypothesize that %body fat accretion increases linearly with age, but with different slopes before and after menarche. We assume that each subject has a two-piece linear spline growth curve with a knot at the time of menarche. Graphical representation of the research hypothesis and the modelling approach are shown schematically in Figure 4. To make the graph realistic and to reflect the range of time of measurements relative to the age at menarche, we used results from our exploratory analysis on sample means of age at baseline (10 years), of age on exit day (16 years), average age at menarche (12.8 years), and average of %body fat taken at the baseline and exit.
|
Piecewise Least Squares Analysis
For the piecewise least squares analysis, we fit two straight lines to the data for each girl, connected at the time of menarche. We regress %body fat as a function of time relative to menarche, obtaining one intercept and two slopes: one for before and one for after the event. Then, we take the difference in slopes for each girl. Next, we compute the mean intercept, mean slopes, the mean slope differences, and their standard errors. Finally, we perform a paired t-test in the usual way. Note that for this piecewise least squares regression, to estimate the three regression parameters, we need at least three data points (with at least one data point for both pre- and post- menarcheal time period), and known age at menarche. The number of girls with sufficient data to calculate the intercept and two slopes is 153.
The mean and standard deviation estimates are as follows: for pre-menarcheal slope: 0.81 ± 4.46; for post-menarcheal slope: 2.44 ± 1.93; and for post- and pre-difference: 1.62 ± 5.17. The correlation between slopes is weak and inverse (r = 0.18, P = 0.027). Since we are regressing %body fat as a function of time relative to menarche, the mean and standard error of the intercept provides the estimate of %body fat at menarche: 21.41 ± 7.12%. T-tests demonstrate that the post-, but not the pre-menarcheal slope differs significantly from zero, as does the difference in the two slopes.
Although this piecewise regression is simple to compute and interpret, there are a few disadvantages in this procedure: first, girls with fewer measurements are weighted equally to girls with more measurements in our estimation of population mean, and second, we must disregard data on subjects with insufficient number of data points and therefore reduce statistical power.
Random Effects Analysis
Here we provide a description of a piecewise approach for modelling growth using a random effects analysis. This analysis assumes that a continuous outcome variable is linearly related to a set of explanatory variables, allows one to fit pre- and post-slopes simultaneously using all data points, and also can account for dependence between observations collected over time.
We assume that our measurements depend upon the menarcheal event such that each subject has a baseline level, a pre-event slope and a post-event slope. Let di be age at menarche for i-subject and aij be age at j-measurement for i-subject, then tij = aij di is time of j-measurement for i-subject before or after menarche. Let dij be an indicator: ij = 1 for the time period before menarche, tij < 0, and
ij = 0 for the time period after menarche tij
0. Now, we specify a model at the subject specific level for the n subjects i = 1, ..., n, each of whom has mi observations, Yij, j = 1, ..., mi, so that for the time period before menarche
, for the time period after menarche
, and the combined model is:
![]() | ((1)) |
At the population level, the model specifies the population mean and variance:
|
If we assume that (eij, eik; j k) are independent (residuals are measurement errors), then the overall variances and correlations depend upon the actual times of the observations, and upon population variances and covariances. Note that at the subject-specific level
2 equals var(eij) and represents the measurement error. Therefore, for the time period before menarche, variances and covariances have the following form:
and
. Similarly for the time period after menarche:
and
The random effects model provides users with the means to illustrate and make estimates for individual curves. The population estimates: µ+0, µ+1, µ+2, and individual estimates:0i,
1i,
2i, can be used to create individual predicted trajectories, or growth curves. This feature of the model can be used, in theory, to assess how an individual girl's fat accretion differs or mirrors that of the population generally. How well this will work in practice depends on the number of individual measurements upon which a girl's curve is based. This has implications for practical study design issues, where with fixed resources, one often has to balance the number of subjects against the number of measurements per subject.
The model parameters can be estimated using maximum likelihood (ML) or its variant, restricted maximum likelihood (REML), assuming that the observations are normally distributed. The ML estimates of the mean are in essence weighted least squares, with weights calculated from the estimated variances. The REML is usually preferred for estimation of the variance-covariance components because it adjusts for the degrees-of-freedom lost in estimating the means. Parameter estimates, and their standard errors, can be obtained using a variety of software packages, including SAS (Proc MIXED), S-Plus (lme) and BMDP-5V.
We now apply the piecewise random effects model to estimate the pre- and post-menarcheal slopes and to predict body fatness. The results of this model consist of two main parts: a set of individual intercepts and two slopes (random effects), and estimates of their means (fixed effects) and standard deviations. Also, the correlations between the slopes and intercepts provide additional useful information. Even though the results for the individual curves form the basis for the fixed effects estimation, for pedagogical reasons, our description of results will begin with the fixed effects, since these results address our primary research question.
Our goal is to assess the influence of the menarcheal event on the rate of fat accretion. The fixed effects results indicate that the mean and standard deviation of the pre-menarcheal slope, the post-menarcheal slope, and %body fat at menarche are: 0.42 ± 1.28 (µ+1, +11); 2.46 ± 0.94 (µ+2,
+22); and 21.36 ± 6.78 (µ0,
00), respectively. As a next step, we use the results of the mixed effects model to test our hypotheses. First, we formally test the hypothesis: Is %body fat increasing before and after menarche? The estimate for the population mean pre-menarcheal slope (
1 ± SE) is 0.42 ± 0.16, with Z = 2.66, which is statistically significant (using the Z-test). The slope is quite shallow (the estimated rate of fat accretion is 0.42%), but different than zero. The high variability suggests that some girls are gaining fatness and others are losing fatness over the pre-menarcheal period. After menarche, the estimate for the population mean post-menarcheal slope (
2 ± SE) is 2.46 ± 0.12, and Z = 20.67, which translates to %body fat of 21.4% with a 95% CI ranging from 8.08% to 34.64%. Our focus now shifts to the second hypotheses regarding the difference in the pre- and post- menarcheal slopes: µ+2 µ+1. That difference is 2.05 ± 0.22, with Z for the hypothesis that slopes are equal of 9.32. Fat accretion is more uniform after menarche, and the rate of fat accretion is almost six times greater than before menarche. So for each year of age after menarche %body fat increases an average 2.46%, so that by 4 years post-menarche, body fatness has increased by approximately 10% (absolute %body fat). If we assume that the individual slopes are approximately normally distributed in the population, then, based on the results of the random effects model, about 37% of girls have pre-menarcheal slopes < 0, and virtually none have post-menarcheal slopes < 0. We conclude that the post-menarcheal slope is significantly greater than the pre-menarcheal slope.
The random effects model also provides very useful additional information, such as an estimate of the measurement error ( = 3.1) and covariance estimates and correlations: the correlation of %body fat at menarche and the pre-event slope (
01 = 0.29), the correlation of %body fat at menarche and the post-event slope (
02 = 0.56), and the correlation of the pre- and post- event slope (
12 = 0.10).
A graphical display of the resulting curve obtained from the random effects model and least squares model along with their confidence intervals are shown in Figure 5. The graph is rescaled to reflect the range of predicted %body fat at the expected average age at menarche (12.8 years), and for the time period of 4 years before and after menarche. In both models, the predicted fat accretion before menarche is unimpressive, but after menarche %body fat increases markedly. Although the predicted %body fat at menarche and post-menarcheal slope are very close in both models, the confidence intervals for the pre- and post- event slopes are narrower for random effects model estimates than for the least squares analysis which results from the weighting of the random effects estimates. Thus, the random effects model reduces variability and thereby improves the efficiency of estimation.
|
|
The random effects model represents an early step in data exploration. However, the researcher is often interested in what variables influence changes over time. For the present example, there are several questions that are of potential interest. If adolescence does in fact represent a critical period, we are interested in variables that might influence %body fat at menarche and the post-event (post-menarcheal) slope. Among the many questions one might pose are: Is there non-linearity in the post-menarcheal time period? Are there any variables measured at baseline that influence %body fat at menarche and/or the magnitude of the slopes? Examples of potential variables include ethnicity, socioeconomic status, or parental obesity.
To answer these questions, we extend our random effects model to a mixed effects model to make it more flexible. In the linear mixed effects model, fixed effects and random effects are connected to each other, so that in a single model for Yij any observable effect is a combination of the two. To see how mixed effects are formulated, let b0i = ß0i µ0; b1i = ß1i µ1; b2i = ß2i µ2, so E(bi) = 0 and var(bi) = var (ßi). Then, the piecewise random effects model can be rewritten as follows:
![]() | ((2)) |
Thus, to answer the questions posited above we will extend the model by adding parameters as fixed effects and/or random effects. To answer the first question regarding non-linearity, we can add a quadratic term to the model, such as tij2 (1 i). To answer the second question we can add baseline variables, which influence the %body fat at menarche and possibly the slopes as well. To evaluate an effect of a baseline variable on an intercept, the baseline variable should be added as a main term. To evaluate an effect of a baseline variable on a slope we should add an interaction term for this variable and time since menarche. If the baseline variable does not change over time, we are adding it to the model by extending the fixed effects terms.
Here we provide the results of the modelling from three mixed effects models. Model A includes the quadratic term for the post-menarcheal slope as fixed and random effects and is formulated as follows:
![]() | ((3)) |
Model B includes stratification by parental obesity status as a fixed effect (where v = {0, 1 or 2}, if none, one, or both parents were obese) and is formulated as follows:
![]() | ((4)) |
Model C examines the influence of the parental obesity status on the post-menarcheal slope by the addition of an interaction term: ß4tij (1 ij) vi. The model is formulated as follows:
![]() | ((5)) |
The estimates of the fixed effects from the basic random effects model and Models A, B and C are shown in Table 3. All models produce very similar estimates for the intercept, %body fat at menarche. The results of Model A suggest that there is a significant non-linearity in the post-event slope, which tends to level off by 4 years post-menarche. Model A also produces the smallest estimate for the pre-event slope. Model B suggests only a weak positive effect of parental obesity on body fat at menarche. The results of Model C suggest that parental obesity has no appreciable effect on the post-menarcheal slope. The piecewise nature of these models results in a strong interdependence among the parameters, so that a change in the intercept, for example, is reflected in changes in the two slope estimates.
|
This paper serves a dual purpose: to explain the advantages of using mixed effects models for longitudinal growth data and to demonstrate the model's utility in answering a question of epidemiological interest. The current thinking regarding the natural history of obesity is characterized by the invocation of critical periods. Because these mixed effects models can accommodate a change in slope, they are particularly well-suited for modelling growth data wherein such events occur. In regards to critical periods, the results of the models presented are quite clear and confirm what has been observed impressionistically. Clinical observation suggests that for many girls the first few years post-menarche are associated with appreciable increases in fatness.4 Our findings indicate that adolescence is a time of potential increased obesity incidence and that menarche does, in fact, emerge as the breakpoint in the fat accretion curve. In our analyses the inflection point is observed to occur (Figure 3), and the timing of that point of inflection turns out to be around menarche. Due to the absence of measurement at the time of menarche, the interval around this event represents a time of uncertainty. Using the approach of regressing %body fat against time relative to menarche we have acquired from this model the information, which is practically unobservable in a field study, the %body fat at menarche.
Use of random and mixed effects models substantially reduces parameter variability for all three estimates and thereby improves the efficiency of estimation. For pre- and post-menarcheal slopes, the standard deviations are 3.5 and 2 times smaller for random effects model estimates than for the least squares analysis. Although the large variation in estimated slopes might be explained by inherent biological variability during the critical period of sexual maturation, the large variation in estimated slopes obtained from OLS can also be exaggerated for at least two reasons: first, girls with fewer measurements are weighted equally to girls with more measurements in the estimation of population mean slope; and secondly, slope estimates depend on the number of measurements available. The requirement to have three data points and age at menarche to estimate both slopes and intercept in OLS would omit all subjects who were not yet menarcheal even if they have enough data points for pre-menarcheal time period. This clearly reduces the efficiency of OLS model for the interim and overall analysis.
Some practical considerations are worthy of mention. Because the mixed effects models are basically weighted least squares, with the weights estimated from the data, and because hypothesis testing relies on asymptotic normality of the estimated fixed effects, one needs a reasonable (>30) number of subjects to ensure that this property holds. Secondly, given that missing responses and attrition will occur in practice, the validity of the results depends upon the assumption that missingness is not related to unobserved measurements. The research question itself, the exploration of the pre- and post-menarcheal slope differences, emphasizes the importance of subject retention in a longitudinal study. Good retention ensures that the reasons for leaving a longitudinal study are likely to be largely independent of growth pattern, so that missingness assumptions are satisfied. Although subjects without a known time of menarche are excluded and this may bias the estimated pre-menarcheal slope, including these subjects may bias the estimates of change in slope. Our approach uses only within-subject data to estimate this change in slope.
Random effects and mixed effects models are very flexible and can accommodate a variety of study designs, data models and hypotheses. We have presented an example with a continuous Gaussian outcome variable; these models can readily be extendable to accommodate discrete outcomes as well. As described above, from a statistical perspective these approaches address many of the challenges present in longitudinal studies. The software needed to perform the analysis is readily available as part of standard statistical software programs. These programs have incorporated special procedures for handling missing data and have built-in user-friendly graphical interfaces that permit the analyst to visualize the data at each step. The applied researcher would do well to learn how to use these methods today, if not sooner.
Appendix
Available software and recommendations for data preparation
To make this approach amenable to wide use, we developed a set of recommendations for step-by-step data preparation, exploratory analysis, model implementation, and visualization of results using standard statistical software. The interested reader is directed to previous reviews of available software, such as BMDP 5-V, SAS, S-Plus, HLM, ML33,9 and codes for implementing these procedures previously published in the statistical literature.2 Pinheiro and Bates provide guidance for using S-Plus for mixed effects modelling with a strong emphasis on the use of graphical displays.10
In brief, to prepare the data set, one extracts data from baseline and annual follow-up files, checks data for completeness and develops decision rules for inclusion in the model. The data are then reformatted to fit the scheme for a covariate matrix, e.g. each row is a single record with identification number, outcome, time of measurement, and pre/post event index. The plots of the raw data for the complete and selected data set can be very useful in making assumptions for modelling. Below we provide S-plus and SAS codes for random effects (RE) and mixed effects (A, B, and C) modelling for all four models. The data set growth contains an identifier for each subject (id), age at menarche (age.at.menarche), two variables for time relative to event (time.before and time.after), a variable indicating parental obesity (parental.obesity), and an outcome variable%body fat (y).
S-plus CODE
function (MODEL)
{
if (MODEL == "RE") {
## random effects model: y ~ time.before + time.after
m <- lme(fixed = y ~ time.before + time.after, random = ~ time.before + time.after, cluster = ~ id, data = growth, na.action = na.omit)
}
if (MODEL == "A") {
##
## Model A: y ~ time.before + time.after + time.after^2
m <- lme(fixed = y ~ time.before + time.after + I(time.after ^2), random = ~ time.before + time.after + I(time.after^2), cluster = ~ id, data = growth, na.action = na.omit)
}
if(MODEL == "B") {
##
## Model B: y~ time.before + time.after + parental.obesity
m <- lme(fixed = y ~ time.before + time.after + parental.obesity, random = ~ time.before + time.after, cluster = ~ id, data = growth, na.action = na.omit)
}
if (MODEL == "C") {
##
## Model C: y~time.before + time.after * parental.obesity
m <- lme(fixed = y ~ time.before + time.after * parental.obesity, random = ~ time.before + time.after, cluster = ~ id, data = growth, na.action = na.omit)
}
print(m)
print(anova.lme(m))
}
SAS CODE
data growth;
set save.fatrw;
run;
Model RE
proc mixed data = growth;
class id;
model y = time.before time after / s;
random intercept time.before time.after / type = un subject = id;
run;
Model A
proc mixed data = growth;
class id;
model y = time.before time.after time.after*time.after / s;
random intercept time.before time.after time.after*time.after / type = un subject = id;
run;
Model B
proc mixed data = growth;
class id parental.obesity;
model y = time.before time.after parental.obesity / s;
random intercept time.before time.after / type = un subject = id;
= run;
Model C
proc mixed data = growth;
class id parental.obesity;
model y = time.before time.after parental.obesity time.after*parental.obesity / s;
random intercept time.before time.after time.after / type = un subject = id;
run;
KEY MESSAGES
|
Acknowledgments
We wish to acknowledge June Stevens, Ph.D., session organizer for our tutorial entitled, The Application of Longitudinal Analysis Methods to Repeated Measures of Body Weight and Fat Distribution presented at the Annual Meeting of the North American Association for the Study of Obesity, Charleston, South Carolina, 1418 November 1999. The data derive from a study supported by NIH grants DK-HD50537, DK46200 and M01-RR-00088.
References
1 Hogan JW, Laird NM. Intention-to-treat analyses for incomplete repeated measures data. Biometrics 1996;52:100217.[ISI][Medline]
2 Cnaan A, Laird NM, Slasor P. Tutorial in biostatistics: using the general linear mixed model to analyze unbalanced repeated measures and longitudinal data. Stat Med 1997;16:234980.[ISI][Medline]
3 Kreft IGG, deLeeuw J, van der Leeden R. Review of five multilevel analysis programs: BMDP-5V, GENMOD, HLM, ML3, VARCL. Am Statistician. 1994;48:32435.[ISI]
4 Garn SM, LaVelle M. Two-decade follow-up of fatness in early childhood. Am J Dis Child 1985;139:18185.[Abstract]
5 Allison D, Paultre F, Heymsfield SB. Pi-Sunyer FX. Is the intra-uterine period really a critical period for the development of adiposity? Int J Obesity Relat Metab Disord 1995;19:397402.[Medline]
6 Dietz WH. Critical periods in childhood for the development of obesity. Am J Clin Nutr 1994;59:95559.[Abstract]
7 Kushner R, Schoeller D, Fjeld CR, Danford L. Is the impedance index (ht2/R) significant in predicting total body water? Am J Clin Nutr 1992;56:83539.[Abstract]
8 Cleveland WS. Robust locally weighted regression and smoothing scatterplots. J Am Statist Assoc 1979;74:82936.[ISI]
9 Diggle PJ, Liang KY, Zeger SL. Analysis of Longitudinal Data. New York: Oxford University Press, 1994.
10 Pinheiro JC, Bates DM. Mixed-effects Models in S and S-PLUS. New York: Springer-Verlag, 2000.