1 Department of Psychiatry, University of Cambridge, Addenbrookes Hospital, Cambridge CB2 2QQ, United Kingdom.
2 Department of Epidemiology and Public Health, Faculty of Medicine, Imperial College of Science, Technology, and Medicine, London SW7 2AZ, United Kingdom.
3 MRC National Survey of Health and Development, University College London, London WC1E 6BT, United Kingdom.
Received for publication November 27, 2001; accepted for publication November 14, 2002.
![]() |
ABSTRACT |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
bladder; child development; enuresis; missing data; models, statistical; prevalence; prospective studies; repeated measures
Abbreviations: Abbreviations: AIC, Akaikes Information Criterion; BIC, Bayesian Information Criterion; LCGA, latent class growth analysis; LLCA, longitudinal latent class analysis.
![]() |
INTRODUCTION |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
![]() |
MATERIALS AND METHODS |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
A complete account of the representativeness of the sample and of losses through death and emigration up to age 43 years appears elsewhere (18). The initial data collection was carried out in the 1940s and 1950s before the development of modern notions of informed consent and ethical committee approval for epidemiologic studies. Written informed consent was obtained for later waves of data-gathering when subjects were aged 36 and 43 years.
Assessments
Prospective data on nighttime bed-wetting were available from parental assessments and medical examinations carried out at six ages (ages 4, 6, 8, 9, 11, and 15 years) in 1950, 1952, 1954, 1955, 1957, and 1961. For the purposes of analysis, a binary outcome was formed at each age: 1 = wetting, occasionally or more often during the past month; 0 = never wetting. Concurrent measures of daytime wetting (sometimes wetting vs. never wetting at ages 6 and 9 years), toilet training (age at which parent started to train the child to be clean, age at giving up napkins (diapers)), and bowel control (age at attaining bowel control, later bowel problems) were available. These provided opportunities for characterizing the trajectory groups.
Statistical framework: latent variable mixture modeling
In finite mixture models (19), it is assumed that observations made in a sample arising from a number of population groups can be represented by an unobserved categorical variable (the latent classes). Applications to longitudinal data have the aim of identifying subgroups in the population that have similar patterns of change in behavior over time. In our analysis, we assumed that there are a number of different and unobserved trajectories to and from the attainment of nighttime bladder control that are indicated by the patterns of observed responses to questions at the six ages (table ).
|
Our analysis was conducted in three steps.
First, we identified the number of mixture components (latent classes) and compared LLCA models with LCGA models. We applied LCGA with quadratic trajectories in all classes. We used modified and parametric bootstrap likelihood ratio test statistics for sequential models (T classes vs. T + 1 classes). We used parametric bootstrap p values for goodness-of-fit tests (Pearson and likelihood ratio tests).
Second, we used a confirmatory approach to characterize the trajectory groups for the best model. This involved re-estimating the best model while introducing additional variables measured in the cohort as extra latent class indicators. Here parameters for the bed-wetting outcomes were fixed at their estimated values (from step 1); the only free parameters were the class-specific prevalence estimates for the additional variablesfor example, the proportion of children in each class who were also wetting during the day at age 6 or 9 years.
Finally, we ascertained the population prevalence of the developmental trajectories in a model that included all cohort members with at least one measure of night wetting. This made maximum use of data from children with partially incomplete (missing) data and increased statistical power for detecting and differentiating between developmental trajectories.
To implement this analysis, we used a weighted likelihood approach to account for the differential sampling of social classes and introduced inverse probability weights to account for the survey design. In this model, we assumed that the missingness mechanism was "missing at random," in the terminology of Little and Rubin (21), and we included sex in the model to increase the likely validity of this assumption. Chi-square tests for the unrestricted latent class model were conducted to evaluate whether more restrictive assumptionsthose of "missing completely at random"were also met (see Little and Rubin (21), pp. 1923).
All models were estimated using maximum likelihood methods based on the expectation maximization algorithm (22) implemented in Mplus, version 2.12 (23). For LLCA, we used LatentGOLD software (24) to implement repeated runs from 1,000 random starting values to ensure that a true maximum likelihood solution had been reached (not a local minimum).
Model evaluation
Goodness of fit
The asymptotic properties of conventional chi-square tests for goodness of fit are often violated for the sparse, multiway cross-tabulations that arise in epidemiologic studies. In these circumstances, standard chi-square tests cannot be relied on uncritically.
Choosing the number of latent classes
Chi-square difference tests based on the maximized log likelihood are not appropriate for likelihood ratio test comparisons of models with higher numbers of classes, since the regularity conditions required for the conventional likelihood ratio test statistic do not hold.
These problems can be overcome with a Monte Carlo simulation approach (25), which can generate an empirical distribution for measures of fit. We used parametric bootstrapping to test the fit of LLCA and LCGA to the data (26, 27) and to compare sequential models with T versus T + 1 classes (recommended by McLachlan and Peel (19)).
For chi-square goodness-of-fit tests, this procedure involved generating a large number of random samples of size 3,272 simulated from the estimated model, treating the estimates as if they were population values. The proportion of chi-square values greater than that observed represents an approximate bootstrap p value. For sequential model tests of T versus T + 1 classes using the likelihood ratio test statistic, this approach involved estimating two models for each bootstrap sample.
We implemented parametric bootstrapping using the Monte Carlo procedures in Mplus and batch processing facilities executed under MS-DOS.
Adjusted likelihood ratio tests concerning the number of classes
Lo et al. (28) recently proposed modifications to the likelihood ratio test (the Lo-Mendell-Rubin method) that adjust the conventional likelihood ratio test for T versus T + 1 classes for violation of regularity conditions. Muthén has suggested their appropriateness for mixture models for longitudinal data (LLCA and LCGA) (B. Muthén, personal communication, University of California, Los Angeles, 2002; Mplus version 2.12, TECH 12 option). We report the results of conventional and adjusted likelihood ratio tests to evaluate a hypothesis regarding number of classes.
An alternative approach (29) to determining the best model that combines goodness of fit and parsimony seeks minimum values for information criteria (30). We report values for Akaikes Information Criterion (AIC), Schwarzs Bayesian Information Criterion (BIC), and a sample-size-adjusted BIC (30):
AIC = 2 log likelihood + 2r;
BIC = 2 log likelihood + r ln n,
where r is the number of free model parameters and n is the sample size. The sample-size-adjusted BIC uses n* = (n + 2)/24 in place of n. Yang (31) provided support for use of the sample-size-adjusted BIC when comparing latent class models in a psychometric context. Lin and Dayton (29) suggested that AIC may be more appropriate than BIC for more complex models.
The quality of the resulting classification can be evaluated in terms of the separation of the latent classes and is usually summarized using an entropy measure (24, 32) based on the posterior class membership probabilities (23). Entropy measures capture how well it is possible to predict class membership given the observed bed-wetting outcomes. Values range from 0 to 1, and high values are preferred.
A final consideration for model fit was the magnitude of the residuals for the two-way margins which show the deviation of observed-from-model estimated (expected) frequencies (33).
![]() |
RESULTS |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
Step 1: LLCA versus LCGA
Tables and summarize the results of LLCA and LCGA. For the LCGA model with five classes, convergence was not achieved. The LCGA model with four classes returned a bootstrap p value greater than 0.05 for the Pearson chi-square test (p = 0.07) but less than 0.01 for the likelihood ratio goodness-of-fit test. The LLCA model with four classes returned a bootstrap p value greater than 0.01 for both the Pearson (p = 0.10) and likelihood ratio (p = 0.02) chi-square tests. In all cases, p values from the parametric bootstrap were higher (less likely to reject the null hypothesis (H0)), indicating bias of conventional values against the null hypothesis. The bootstrap p value for the sequential model likelihood ratio test (comparison of a four-class LLCA (LLCA-4) with a five-class LLCA (LLCA-5)) returned a nonsignificant result (p > 0.50). The Lo-Mendell-Rubin adjusted likelihood ratio test comparing T classes with T + 1 classes returned a marginal result (p values between 0.01 and 0.05), proving some support for a five-class LLCA. Model identification was also stronger (higher condition numbers) for the LLCA than for the LCGA. Condition numbers are the ratio of the smallest eigenvalues to the largest eigenvalues in the information matrix.
|
|
Table shows the parameter estimates for the four-class LLCA, with standard errors estimated from 3,000 bootstrap samples. Table also shows the separation of the estimated classes (entropy statistic = 0.913). Classification accuracy can be gauged from the high diagonal and low off-diagonal elements in the assignment matrix.
|
|
Figure 1 profiles the results for the five-class LLCA. In addition to "normal" development (prevalence = 84.0 percent), two trajectories described delays in acquisition of bladder control ("transient" = 8.7 percent and "persistent" = 1.8 percent), capturing primary enuresis. A "chronic" class of children experienced night wetting through age 15 years (2.6 percent). A final trajectory described relapse after initial success (relapsing = 2.9 percent), capturing secondary enuresis.
|
![]() |
DISCUSSION |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
These results demonstrate the power of latent class analyses of longitudinal data to identify otherwise unobserved trajectory classes in epidemiologic data. Together, these unrestricted (LLCA) and restricted (LCGA) latent class models allow flexible modeling of developmental trajectoriescapturing either curvilinear or discontinuous development.
In this domain of application, both models (LLCA and LCGA) provided evidence of at least four distinct types of developmental heterogeneity. As empirically defined longitudinal phenotypes, these developmental trajectories may be used as either independent variables or dependent variables in studies of the causes, associations, and outcomes of these phenomena. In this context, it would seem prudent for us to compare four- and five-class solutions when exploring antecedents or consequences, since the existence of a single best model could not be conclusively established from our results (when missing data were included).
We adopted a confirmatory approach to model validation to avoid the limitations of two-stage approaches in which individuals are classified and regression analyses are then performed (assuming that the latent classification is an observed variable measured without error). This approach is known to be statistically inefficient and should be avoided where possible. Within the confirmatory modeling approach, it is a natural extension for researchers in subsequent investigations in this cohort to further define these trajectories in relation to other life course outcomes. We plan to consider continence outcomes in later life, other early indices of child development, and psychiatric morbidity (affective illness) captured by mental health assessments at ages 36 and 43 years.
In this paper, we have discussed the main methodological considerations. When applied to binary repeated measures of nighttime bladder control, LLCA provided a marginally better fit than quadratic LCGA and, importantly, resulted in a more stable model. Other applications may capitalize on the parsimony of LCGA. Further model differentiation may be possible by combining the features of LLCA and LCGA in models that consider some classes parameterized as growth curves and others estimated piecewise. Such a model is a development of the current modeling framework (18) but remains to be pursued in further research.
Our use of a parametric bootstrap approach to support model evaluation (goodness of fit) and comparison (sequential tests) offers an advance over heuristic methods such as those based on information criteria, and it may also be more reliable than adjusted likelihood ratio tests under small sample sizes. These issues can be usefully addressed in subsequent Monte Carlo simulation studies based on the characteristics of these data or other epidemiologic samples.
With regard to the comprehensive modeling framework outlined in the Appendix, we have not considered the introduction of covariates (although, in our characterization of the trajectory classes, we estimated models with sex as an indicator of the latent classes). In subsequent analyses, it may be more logical to introduce concomitant variables measured prior to the developmental data as covariates that predict class memberships. This approach has the potential to identify the determinants of the observed developmental heterogeneity. Alternatively, indicators of the classes may be considered distal outcomes of the developmental trajectories that we have identified. Finally, we highlight the potential for applications to ordered polytomous outcomes, but we caution that in epidemiologic samples this may result in sparse cross-tabulations requiring appropriate recoding of categorical measures.
![]() |
ACKNOWLEDGMENTS |
---|
The authors are grateful to the staff of the MRC National Survey of Health and Development at University College London, particularly Warren Hilder, for access to the data and variables. The authors thank Professor Bengt Muthén (University of California, Los Angeles) for discussions and advice on modeling. Russell Ecob provided comments on manuscript revisions.
![]() |
APPENDIX |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
Latent variable mixture models for categorical data (Muthén framework B)
Let u denote categorical observed variables. Let x denote categorical or continuous covariates. Let c denote a latent categorical variable with K classes, ci = (ci1, ci2, ..., ciK)', where cik = 1 if an individual i belongs to class k and zero otherwise. The model has two parts: c related to x and u related to c. c is related to x by multinomial logistic regression using the K 1 dimensional parameter vector of logit intercepts c and the (K 1) x q parameter matrix of logit slopes
c, where, for k = 1, 2, ..., K
, (1)
where the last class is a reference class with coefficients standardized to zero, cK = 0,
ck = 0. For u, conditional independence is assumed given ci and xi,
P(ui1, ui2, ...uir |ci, xi)P(ui2 |ci, xi) ... P(uirci, xi). (2)
The categorical variable uij (j = 1, 2, ..., r) with Sj ordered categories follows an ordered polytomous logistic regression (proportional odds model), where, for categories s = 0, 1, 2, ..., Sj 1 and j,k,0 =
,
,
uij = s, if j,k,s < u*ij
j,k,s+1, (3)
P(uij = s|ci, xi) = Fs+1(u*ij) Fs(u*ij), (4)
Fs(u*) = 1/(1 + , (5)
where for u*i = (u*1i, u*2i, ..., u*fi)', ui = (
u1i,
u2i, ...,
ufi)', and conditional on class k,
u*i = uk
ui + Kuk xi (6)
and
ui =
uk +
uk xi, (7)
where uk is an r x f logit parameter matrix varying across the K classes, Kuk is an r x q logit parameter matrix varying across the K classes,
uk is an f x 1 logit parameter vector varying across the K classes, and
uk is an f x q logit parameter matrix varying across the K classes. The thresholds may be stacked in the
rj=1(Sj 1) x 1 vectors
k varying across the K classes. Equation 6 does not include intercept terms given the presence of the
parameters, and
parameters have opposite signs than u* in equation 6 because of their interpretation as thresholds or cutpoints that a latent continuous response variable u* exceeds or falls below. With a binary u scored as 0 versus 1, equation 4 leads to
P(u = 1|c, x) = 1 (1/(1 + e(u* ))). (8)
For example, the higher the , the higher u* needs to be to exceed it, and the lower the probability of u = 1.
Latent class analysis
In traditional latent class analysis, the variables of u may be binary, ordered polytomous, or unordered polytomous. Under the conditional independence specification, the joint probability of all us is
(9)
The distribution of the categorical latent variable (the latent class sizes) is represented by P(c = k), which is expressed in terms of the logit parameters ck in equation 1. The conditional u probabilities are expressed via logit parameters in line with equation 6, where, for a binary u, logit =
k for class k, that is, the u* part of equation 6 is not needed. Posterior probabilities for each individual belonging to all classes can be computed by Bayes formula:
P(c = k|u1, u2 ..., ur) =
P(c = k), P(u1 |c = k), P(u2|c = k)...
P(ur|c = k)/P(u1, u2..., ur). (10)
Latent class growth analysis
In Muthéns general modeling framework, latent class growth analysis requires the introduction of a continuous latent variable u. For a single outcome at each time point u*i = (ui1, ui2, ..., uit, ..., uiT )', a second-order polynomial (quadratic) growth model corresponds to u*it =
0i +
1i at +
2i at2, where at and at2 are fixed time scores represented in
u,
(11)
where T is the number of time periods. Here u contains an intercept, a linear slope, and quadratic growth factors with differences across classes captured in
uk and
uk xi of equation 7. Conditional on x, there is zero within-class variation across individuals. Across-time and across-class measurement invariance for the categorical outcomes is imposed by a threshold specification (only one threshold per outcome if the repeated measure is binary). In equation 6, the
mean of the intercept growth factor
0i for the first class is fixed at zero for identification purposes. The mean of the intercept and all other growth factors are free to be estimated in the remaining classes.
![]() |
NOTES |
---|
![]() |
REFERENCES |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|