In epidemiological studies of chronic disease risk in relation to diet, a crucial question is whether assessments of dietary intake can accurately characterize an individual's habitual intake of foods and nutrients. Over the last two decades, this question has been addressed in numerous validation studies. The OPEN study1 provides the best answer yet. It is a true landmark study because of both its size and the thoroughness of its design.
The main innovation of the OPEN study is the use of doubly labelled water (i.e. water made from stable isotopes of both hydrogen and oxygen) to measure energy expenditure. Doubly labelled water is extremely expensive to produce. It is therefore unlikely that the OPEN study will be replicated in the foreseeable future. With this lack of reproducibility in mind, we would like to consider the extent to which the conclusions of the OPEN study can be generalized.
Before considering the conclusions of OPEN, it is worth reviewing the evolution of dietary validation studies. Initially, dietary measurement validation studies were based on a comparison of two assessment methods, one of which (often based on a series of weighed food consumption records) was assumed to provide a perfectly valid intake measurement. This strong validity assumption was later relaxed by the use of statistical models for measurement error. These models impose certain constraints on the design of validation studies, where the nature of the constraint depends on the purpose of the study. If the aim is to correct for the attenuation effect of measurement error on estimates of disease risk then two independent measurements are required, one of which must be unbiased. If the aim is to completely characterize the error properties of the dietary assessment methods then three independent measurements are required.2,3 The difficulty is in finding three independent estimates of dietary intake.
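The attenuation effect described above can be illustrated with a small simulation. This is a minimal sketch under assumed values, not the model fitted in the OPEN study: the function names `simulate` and `cov`, the scaling factor 0.5 and the error standard deviations are all hypothetical choices made for illustration. It shows why an unbiased reference measurement with independent error is sufficient: regressing the reference on the questionnaire recovers the same attenuation factor as regressing the (unobservable) true intake on the questionnaire.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Sample covariance; cov(xs, xs) is the sample variance."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def simulate(n=20000, seed=1):
    """Simulate true intake T, a questionnaire Q and a reference R.

    All parameter values are illustrative assumptions:
      T ~ N(0, 1)                    true habitual intake (standardized)
      Q = 0.5 * T + e, sd(e) = 1.5   questionnaire on a miscalibrated scale
      R = T + u,       sd(u) = 1.0   unbiased reference, error independent of e
    """
    rng = random.Random(seed)
    true = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ffq = [0.5 * t + rng.gauss(0.0, 1.5) for t in true]
    ref = [t + rng.gauss(0.0, 1.0) for t in true]
    return true, ffq, ref


true, ffq, ref = simulate()

# Attenuation factor: slope of the regression of true intake on Q.
lam_true = cov(ffq, true) / cov(ffq, ffq)

# The same slope estimated without access to T, using the reference R.
lam_est = cov(ffq, ref) / cov(ffq, ffq)

# Under the assumed parameters the theoretical value is
# 0.5 * 1 / (0.5**2 * 1 + 1.5**2) = 0.2.
print(f"attenuation using true intake: {lam_true:.3f}")
print(f"attenuation using reference:   {lam_est:.3f}")
```

If the errors of the reference and the questionnaire were positively correlated, the second estimate would be biased upwards in this setup, which is why the independence assumption matters for the design of validation studies.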
In practice, three main categories of dietary assessment can be distinguished: questionnaires for assessment of habitual, long-term intake; methods based on recording of actual food intake on one or more days (e.g. weighed food records, 24-hour diet recall interviews); and biomarkers of diet. It is clear that the measurement errors of instruments in the first two categories are correlated, not least because the same food tables must be used when converting foods to nutrients. This leaves only certain biomarkers as a possible alternative. One set of biomarkers of particular interest consists of those based on the urinary recovery of chemical substances from diet. Such recovery-based markers allow the computation of absolute daily intakes of nutrients in time. They also have the advantage that repeat measurements may be assumed independent, if the repeats are sufficiently widely spaced, so that the necessary statistical assumptions for validation studies may be fulfilled.4 Recovery-based markers are currently available for energy expenditure, and for protein, sodium, and potassium intake. The first two of these biomarkers were used in the OPEN study. A previous validation study by Day et al.5 used biomarkers for protein, sodium, and potassium.
There are a number of interesting conclusions from the OPEN study. First, the food frequency questionnaire (FFQ) was found to have very low validity for both total energy and protein intake measurements: attenuation factors were found to be close to 0.1 in both cases. It is disconcerting to observe that the FFQ measurements predict so little variation in protein intake levels. A second, unexpected, result is that a single 24-hour recall (24HR) appeared to predict both protein and energy intakes better than the FFQ. This is somewhat surprising given the well-documented day-to-day variation in dietary intakes. A third interesting feature of the OPEN study is the comparison between the results for absolute and energy-adjusted protein intake, measured by the protein density. An energy-adjusted analysis simulates an intervention in which total energy intake is fixed, but the composition of the diet changes. Two motivations for this analysis have been put forward. First, energy requirements depend on the body size, physical activity, and metabolic efficiency of each individual, which collectively may confound relationships of absolute nutrient intakes with disease risk; adjustments for total energy intake might reduce such confounding effects. Second, it is claimed that measurements of absolute intake contain a substantial degree of measurement error that may be eliminated by energy adjustment.6 This latter claim is only partially borne out by the OPEN study. Whereas energy adjustment substantially decreases the degree of attenuation for protein intake measured by the FFQ, it has a more modest effect on the correlation with true intake. This suggests that the main problem of the FFQ is its inability to give absolute estimates of protein intake on a well-calibrated scale. For the 24HR, energy adjustment has very little effect on the attenuation factor, and no clear effect on correlation. There is even a decrease in correlation among women after energy adjustment.
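The inference that a change in attenuation without a matching change in correlation points to a scale problem can be made concrete. The following is a minimal sketch assuming a simple linear measurement error model, Q = slope × T + error, which is our own simplification and not the model used in the OPEN study; the function names and numerical values are hypothetical. Under this model, stretching the questionnaire scale by a constant changes the attenuation factor but leaves the correlation with true intake untouched.

```python
import math


def attenuation_and_correlation(slope, var_true, var_error):
    """Return (attenuation factor, correlation with true intake)
    for the assumed model Q = slope * T + error."""
    var_q = slope ** 2 * var_true + var_error
    cov_qt = slope * var_true
    attenuation = cov_qt / var_q
    correlation = cov_qt / math.sqrt(var_q * var_true)
    return attenuation, correlation


def rescale(slope, var_error, factor):
    """Multiply the whole instrument by `factor` (pure miscalibration):
    the slope scales by factor, the error variance by factor squared."""
    return slope * factor, var_error * factor ** 2


# Illustrative values only.
var_true = 1.0
slope, var_error = 1.0, 1.0

calibrated = attenuation_and_correlation(slope, var_true, var_error)

stretched_slope, stretched_error = rescale(slope, var_error, 4.0)
stretched = attenuation_and_correlation(stretched_slope, var_true, stretched_error)

print("calibrated scale: attenuation=%.3f correlation=%.3f" % calibrated)
print("stretched scale:  attenuation=%.3f correlation=%.3f" % stretched)
```

In this sketch the attenuation factor falls by the stretching factor while the correlation is identical in both cases, so an instrument can rank individuals equally well and still give badly attenuated estimates on the absolute scale.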
A general limitation of validation studies is that the results are context specific. The results of a validation study are not necessarily transferable to another population, or even to other nutrients in the same population, because the summary statistics (attenuation factor and correlation coefficient) depend on both the characteristics of the instruments used and the heterogeneity of intake in the population. This point, which is acknowledged by the authors, creates a paradox. On the one hand, the OPEN study has set a high standard for validation studies which is unlikely to be matched. On the other hand, the extent to which the results can be generalized is not clear. Some evidence for generalizability to other populations comes from the general agreement with the result of the previous validation study by Day et al.5 When it comes to generalization to other nutrients, however, there are no previous studies with a similar design to compare with, because all such studies are limited to recovery-based biomarkers. We note that the day-to-day variation in protein and total energy intakes is low compared with that of many other nutrients. For other nutrients, for which intake levels vary more over time, but also more between individuals, the FFQ may perform much better, both in absolute terms and relative to a 24HR. Having said this, we are not quite as optimistic as the authors that new biomarkers can be developed to measure other nutrients, so our speculation will probably remain unchallenged by empirical evidence.
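The dependence of both summary statistics on the heterogeneity of intake can be sketched numerically. The example below assumes the same simple linear model, Q = slope × T + error, with an instrument whose error variance is held fixed while the between-person variance of true intake changes; the function name `summary_statistics` and all numbers are hypothetical and are not estimates from any study.

```python
import math


def summary_statistics(var_true, slope=1.0, var_error=1.0):
    """Attenuation factor and correlation for the assumed model
    Q = slope * T + error, as a function of between-person variance."""
    var_q = slope ** 2 * var_true + var_error
    cov_qt = slope * var_true
    attenuation = cov_qt / var_q
    correlation = cov_qt / math.sqrt(var_q * var_true)
    return attenuation, correlation


# Same instrument (same slope and error variance), applied to populations
# with increasingly heterogeneous true intake. Values are illustrative.
results = {}
for var_true in (0.25, 1.0, 4.0):
    results[var_true] = summary_statistics(var_true)
    lam, rho = results[var_true]
    print(f"var(T)={var_true:4.2f}  attenuation={lam:.3f}  correlation={rho:.3f}")
```

With the error variance fixed, both statistics rise as the between-person variance grows, which is the sense in which an identical instrument can appear to perform poorly in a homogeneous population and well in a heterogeneous one.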
References
2 Plummer M, Clayton D. Measurement error in dietary assessment: an investigation using covariance structure models. Part II. Stat Med 1993;12:937–48.
3 Kaaks R, Riboli E, Esteve J, van Kappel AL, van Staveren WA. Estimating the accuracy of dietary questionnaire assessments: validation in terms of structural equation models. Stat Med 1994;13:127–42.
4 Kaaks R, Ferrari P, Ciampi A, Plummer M, Riboli E. Uses and limitations of statistical accounting for random error correlations, in the validation of dietary questionnaire assessments. Public Health Nutr 2002;5:969–76.
5 Day N, McKeown N, Wong M, Welch A, Bingham S. Epidemiological assessment of diet: a comparison of a 7-day diary with a food frequency questionnaire using urinary markers of nitrogen, potassium and sodium. Int J Epidemiol 2001;30:309–17.
6 Willett W. Isocaloric diets are of primary interest in experimental and epidemiological studies. Int J Epidemiol 2002;31:694–95.