Biostatistics Unit, Centre for Epidemiology and Biostatistics, Leeds Institute of Genetics, Health & Therapeutics, University of Leeds, Leeds LS2 9LN, UK. E-mail: m.s.gilthorpe{at}leeds.ac.uk
Results from a paper published in this journal1 are acknowledged by its authors (see previous contribution2) as suffering bias due to the effects of mathematical coupling (MC). The authors report an alternative analysis that seeks to overcome MC. However, this is not without its problems: we explain why and propose an alternative strategy.
Taking an example from Gunnell et al.,1 if self-reported height (x1) tended to be underestimated compared with recorded height (x2) amongst tall people and overestimated amongst short people, the standard deviation (SD) of x1 would be less than that of x2. Under the null hypothesis (H0) that over-/under-reporting is not related to either measure, the SDs of x1 and x2 should be equal. However, even under H0, the difference x1 − x2, when correlated with or regressed on x1 or x2, nearly always yields a statistically significant association in large samples.3 Such analyses are therefore misleading. A solution is to assess the difference x1 − x2 with respect to the mean (x1 + x2)/2, as proposed by Oldham.4 Although MC remains, its effects are annulled because the statistical association between x1 − x2 and x1 + x2 is zero under H0. This can be illustrated geometrically if we envisage x1 and x2 as vectors with lengths equal to their SDs: under H0, the vectors representing x1 and x2 are of equal length and the cosine of the angle between them is their correlation (Figure 1). The vectors representing x1 − x2 and x1 + x2 are then always perpendicular, i.e. their values are uncorrelated. Consequently, under H0, differences correlated with or regressed on means yield a correlation or regression coefficient of zero.
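The contrast between the two analyses can be sketched with a small simulation under H0. The data-generating model here is an assumption made purely for illustration and is not taken from Gunnell et al.: each person has a true height, and both measures add independent errors with the same SD, so x1 and x2 have equal SDs and no genuine height-related misreporting. The function names and parameter values (`simulate`, `true_sd`, `error_sd`) are hypothetical.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def simulate(n=20000, true_sd=10.0, error_sd=3.0, seed=1):
    """Simulate H0: both measures share a true value and have equal error SD.

    Returns (corr(x1 - x2, x1), corr(x1 - x2, (x1 + x2) / 2)).
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    x1, x2 = [], []
    for _ in range(n):
        true_height = rng.gauss(170.0, true_sd)
        x1.append(true_height + rng.gauss(0.0, error_sd))  # "self-reported"
        x2.append(true_height + rng.gauss(0.0, error_sd))  # "recorded"
    diff = [a - b for a, b in zip(x1, x2)]
    mean = [(a + b) / 2 for a, b in zip(x1, x2)]
    return pearson(diff, x1), pearson(diff, mean)


r_coupled, r_oldham = simulate()
print(f"difference vs x1   (coupled): {r_coupled:.3f}")
print(f"difference vs mean (Oldham):  {r_oldham:.3f}")
```

Under these assumptions the correlation of the difference with x1 is expected to be about sqrt((1 − r)/2), where r is the correlation between x1 and x2 (roughly 0.2 for the values chosen), even though no real association exists, whereas the correlation with the mean should be close to zero.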
An alternative approach is to use multilevel modelling.5 The specific multilevel model required to analyse the difference between two variables in relation to their mean, while avoiding MC, specifies both measures as repeated outcomes at level-1, clustered within individuals at level-2. A covariate indicating the measure-type (self-reported or recorded) is included, and its coefficient is allowed to exhibit random variation about its mean; this is known as a random coefficient model.6 MC is not present since the dependent variable has no formulaic relationship with the independent variable. Furthermore, covariates associated with either measure, or their mean, or their difference, may be added without distorting the model. The unbiased correlation between differences and mean, akin to Oldham's correlation, is obtained from the covariance between the random intercept and the random slope.7 Note that the covariate for measure-type has to be centred about zero to avoid overestimation of the covariance term, and the covariate interval must be one in order to interpret the regression coefficient as the mean difference between measures. MC is clearly a problem not to be overlooked. However, the extent to which Oldham's method may remain biased when extended to multiple regression is not known, and the magnitude of any potential bias would vary from study to study. Multilevel modelling, on the other hand, circumvents the risk of MC entirely.
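The role of centring can be sketched algebraically. The notation below is an assumption introduced for illustration and is not taken from the cited references: y_ij is measure i on individual j, t_ij is the measure-type covariate, and, as a simplification, no separate level-1 residual is included, so the two random effects reproduce each person's two measurements exactly.

```latex
% Random coefficient model (illustrative notation; no separate level-1 residual assumed)
\[
  y_{ij} = \beta_0 + u_{0j} + (\beta_1 + u_{1j})\, t_{ij},
  \qquad
  t_{ij} =
  \begin{cases}
    +\tfrac{1}{2} & \text{self-reported } (x_1),\\[2pt]
    -\tfrac{1}{2} & \text{recorded } (x_2).
  \end{cases}
\]

% Centred coding with unit interval: intercept is the mean, slope is the difference
\[
  \beta_0 + u_{0j} = \tfrac{1}{2}\,(x_{1j} + x_{2j}),
  \qquad
  \beta_1 + u_{1j} = x_{1j} - x_{2j},
\]
\[
  \operatorname{Cov}(u_{0j}, u_{1j})
  = \tfrac{1}{2}\bigl[\operatorname{Var}(x_1) - \operatorname{Var}(x_2)\bigr]
  = 0 \quad \text{under } H_0 .
\]

% Uncentred coding t = 0 (recorded), 1 (self-reported): intercept is x_2 itself
\[
  \operatorname{Cov}(u_{0j}, u_{1j})
  = \operatorname{Cov}(x_2,\; x_1 - x_2)
  = \operatorname{Cov}(x_1, x_2) - \operatorname{Var}(x_2)
  \neq 0 \quad \text{under } H_0 \text{ whenever } \operatorname{Corr}(x_1, x_2) < 1 .
\]
```

With the centred coding the intercept-slope covariance is the covariance between the mean and the difference, which is the quantity Oldham's method examines. With 0/1 coding the same covariance reverts to the coupled comparison of the difference with one of the measures.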
References
2 Gunnell D, Berney L, Holland P et al. Does the misreporting of adult body size depend upon an individual's height and weight? Methodological debate. Int J Epidemiol 2004; 33:1398–99.
3 Tu Y-K, Gilthorpe MS, Griffiths GS. Is reduction of pocket probing depth correlated with the baseline value or is it mathematical coupling? J Dental Res 2002; 81:722–26.
4 Oldham PD. A note on the analysis of repeated measurements of the same subjects. J Chron Dis 1962; 15:969–77.
5 Goldstein H, Browne W, Rasbash J. Multilevel modelling of medical data. Stat Med 2002; 21:3291–315.
6 Bryk AS, Raudenbush SW. Hierarchical Linear Models: Applications and Data Analysis Methods. London: Sage; 1992.
7 Gilthorpe MS, Cunningham SJ. The application of multilevel, multivariate modelling to orthodontic research data. Community Dental Health 2000; 17:236–42.