Departments of Nutrition and Epidemiology, Harvard School of Public Health, and the Channing Laboratory, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
Correspondence: Walter Willett, Department of Nutrition, Harvard School of Public Health, 655 Huntington Ave, Boston, MA 02115, USA.
In conducting studies of diet and disease risk, methods of measuring diet with sufficient validity to detect important associations are essential. Cost is also a critical factor because prospective studies, which are necessarily large, are desirable to avoid problems of selection and recall bias. Most investigators have converged to use some form of a food frequency questionnaire (FFQ) for this purpose, and the validity of this approach has been documented repeatedly by comparisons with more detailed methods, correlations with biochemical indicators of dietary factors, and the ability to predict risk of future disease.1,2 However, all methods of dietary assessment are imperfect, and quantification of measurement error is desirable both to help in the interpretation of findings from epidemiological studies, and to correct relative risks and confidence intervals for this source of error.
In studies of questionnaire validity, the comparison method should have errors that are minimally correlated with those of the questionnaire to avoid spurious overestimation of validity. For this reason, we and others have chosen weighed dietary records for comparisons because they do not depend on memory and allow quantitative measurements of the amounts of foods at the time they are actually consumed. Biochemical indicators of diet are also attractive because their errors should have little correlation with those of reported food intake; unfortunately for many nutrients of interest appropriate biochemical indicators do not exist.
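The concern about correlated errors can be illustrated with a small simulation. This is only a sketch under assumed conditions: the function name `simulate_comparison`, the standardized true intake, and the error standard deviations (1.0 for the questionnaire, 0.7 for the comparison method) are hypothetical choices, not values from any study discussed here. It compares the observed questionnaire-versus-comparison correlation when the two methods' errors are independent and when they are correlated, while the questionnaire's correlation with true intake is held fixed.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_comparison(error_corr, n=20000, seed=1):
    """Simulate a questionnaire and a comparison method whose errors
    have correlation `error_corr` (all parameters are hypothetical)."""
    rng = random.Random(seed)
    truth, questionnaire, reference = [], [], []
    for _ in range(n):
        t = rng.gauss(0.0, 1.0)   # standardized true intake
        z1 = rng.gauss(0.0, 1.0)  # drives the questionnaire error
        z2 = rng.gauss(0.0, 1.0)  # independent part of the comparison error
        e_q = 1.0 * z1
        e_r = 0.7 * (error_corr * z1 + math.sqrt(1.0 - error_corr ** 2) * z2)
        truth.append(t)
        questionnaire.append(t + e_q)
        reference.append(t + e_r)
    return {
        "q_vs_reference": pearson(questionnaire, reference),
        "q_vs_truth": pearson(questionnaire, truth),
    }


independent = simulate_comparison(error_corr=0.0)
correlated = simulate_comparison(error_corr=0.5)
print(independent)
print(correlated)
```

Under these assumptions the questionnaire's correlation with true intake is the same in both runs, yet its correlation with the comparison method is noticeably higher when the errors are correlated, which is the spurious overestimation of validity described above.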
In this issue, Day et al.3 have combined data collected by an FFQ, diet diary, and biochemical measurements of urine to quantify measurement errors and to estimate the correlations of errors between different dietary assessment methods. Unfortunately, their data, although carefully collected, do more to obfuscate than enlighten. In particular, they have been unfair to their FFQ. The primary problems with this paper, which are inter-related, are that the authors have ignored the heterogeneity in their population, and have examined only absolute rather than energy-adjusted intakes.
As regards the first issue, the study population included both men and women of various ages and sizes. This heterogeneity spuriously increases the between-person variation in absolute intakes of nutrients because in any real epidemiological application we would normally control for age, sex, and body size. Any evaluation of questionnaire validity is unrealistic unless the major sources of variation that would not exist, or that could be controlled, in an epidemiological application are first removed.
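To make the heterogeneity point concrete, the following sketch simulates two subgroups with different mean intakes and compares the pooled correlation between two assessment methods with the correlation inside each subgroup. Everything here is an assumption made for illustration: the function name `simulate_pooled_vs_stratified`, the group means (70 and 95 g/day), and the standard deviations of true intake and of each method's error are invented, not taken from Day et al. or any other study.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_pooled_vs_stratified(n_per_group=5000, seed=2):
    """Compare pooled and within-group correlations between two methods
    (group means and error sizes are hypothetical)."""
    rng = random.Random(seed)
    group_means = {"women": 70.0, "men": 95.0}  # hypothetical g/day
    pooled_q, pooled_r = [], []
    within = {}
    for group, mean in group_means.items():
        q, r = [], []
        for _ in range(n_per_group):
            t = rng.gauss(mean, 10.0)          # true absolute intake
            q.append(t + rng.gauss(0.0, 15.0))  # questionnaire, independent error
            r.append(t + rng.gauss(0.0, 8.0))   # comparison method, independent error
        within[group] = pearson(q, r)
        pooled_q.extend(q)
        pooled_r.extend(r)
    return {"within": within, "pooled": pearson(pooled_q, pooled_r)}


result = simulate_pooled_vs_stratified()
print(result)
```

In this setup the pooled correlation exceeds both within-group correlations solely because the difference between group means adds between-person variation. An analysis that controls for sex removes that variation, so the within-group figures are closer to what would apply in practice.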
The second problem is that they have only examined validity for absolute nutrient intake and have ignored the consensus among nutritional epidemiologists over the last decade that the energy-adjusted intake of a dietary factor, i.e. dietary composition, is appropriately the primary focus of nutritional epidemiology.1,4 The principal reason for the focus on energy-adjusted intakes is that this is primarily what can be changed by individuals or populations. Individuals must increase or decrease their intake of nutrients by changing the composition of their diets because their total energy intake is constrained within a narrow range by their size and level of physical activity. Analogously, experimentalists evaluating the effects of specific nutrients routinely compare iso-caloric diets; otherwise changes in weight will confound the specific effects of the nutrient being evaluated. There are many important hypotheses relating the protein composition of the diet to risk of chronic disease, but unfortunately, the paper by Day et al., which uses absolute urinary nitrogen excretion as a measure of protein intake, fails to inform us about the value of their methods for examining these issues.
A secondary benefit of adjusting for total energy intake, but not the primary reason for doing so, is that errors in measuring individual nutrients are strongly correlated with errors in measuring total energy intake because over- or under-reporting of individual foods leads to similar errors in all constituents. These correlated errors are strong, ranging from about 0.6 for total energy versus micronutrients to 0.95 for total energy versus macronutrients (unpublished data based on the Nurses' Health Study). Thus, adjustment for total energy cancels a substantial amount of error. Food frequency questionnaires are directed primarily at dietary composition, which is largely determined by the mix of foods that are consumed. Diet records, which provide more precise quantification of foods consumed, will usually be relatively better for measuring absolute intake. However, a limited number of days of diet records will perform less well for dietary composition because within-person day-to-day variation is much greater for dietary composition than for absolute intake.5
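The cancellation of correlated error described at the start of this paragraph can be sketched with a simulation. It is a simplified illustration, not an analysis of real data: it uses the nutrient density (nutrient divided by energy, handled on the log scale) as one simple form of energy adjustment rather than the residual method, and the function name `simulate_energy_adjustment`, the shared over- or under-reporting factor, and all means and standard deviations are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_energy_adjustment(n=20000, seed=3):
    """Reported nutrient and energy share a multiplicative reporting error;
    dividing one by the other cancels it (all parameters are hypothetical)."""
    rng = random.Random(seed)
    true_abs, rep_abs, true_den, rep_den = [], [], [], []
    err_nutrient, err_energy = [], []
    for _ in range(n):
        log_energy = rng.gauss(math.log(2000.0), 0.20)  # true energy, kcal/day
        log_density = rng.gauss(math.log(0.04), 0.15)   # true g nutrient per kcal
        shared = rng.gauss(0.0, 0.40)                   # general over/under-reporting
        e_n = shared + rng.gauss(0.0, 0.10)             # log error in reported nutrient
        e_e = shared + rng.gauss(0.0, 0.10)             # log error in reported energy
        log_nutrient = log_density + log_energy
        true_abs.append(log_nutrient)
        rep_abs.append(log_nutrient + e_n)
        true_den.append(log_density)
        # Reported density: the shared term appears in both parts and cancels.
        rep_den.append((log_nutrient + e_n) - (log_energy + e_e))
        err_nutrient.append(e_n)
        err_energy.append(e_e)
    return {
        "error_corr": pearson(err_nutrient, err_energy),
        "absolute_validity": pearson(rep_abs, true_abs),
        "density_validity": pearson(rep_den, true_den),
    }


result = simulate_energy_adjustment()
print(result)
```

With errors this strongly correlated, the reported density tracks the true density more closely than the reported absolute intake tracks the true absolute intake, because the shared reporting factor drops out of the ratio. How much is gained in practice depends on how large the shared component really is.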
For the reasons noted above, an examination of validity for absolute rather than energy-adjusted nutrient intake will tend to favour a one-week diet diary compared to an FFQ. Indeed, there is much empirical evidence that the results for energy-adjusted nutrients would be substantially different to those reported by Day et al. Quite consistently, the correlations between FFQ and both diet records and biochemical indicators of nutritional status increase after adjustment for energy intake.1,6 Also, the correlations among different nutrients estimated by the same FFQ decrease greatly after energy adjustment and are much lower than those reported by Day et al.3 In addition, the apparent correlations in errors between the FFQ and diet diary method reported in their paper are likely to be larger than those using weighed diet records as the comparison because many of the cognitive processes were similar between the methods. A higher degree of correlated error is also likely in studies using 24-hour recalls as the comparison method. The issue of correlated errors between an FFQ and weighed diet records has been examined earlier by Spiegelman,7 who found that correlations between errors were much lower than reported by Day and not sufficiently strong to affect seriously the estimates of validity from studies comparing FFQ with weighed diet records.
The value of FFQ for assessing dietary composition has already been documented objectively by correlations with biochemical indicators and the prediction of outcomes in prospective studies.2 These questionnaires have great advantages over dietary records in cost and participant burden. These advantages are particularly important because they allow large populations to be enrolled in prospective studies and repeated assessments of diet during the follow-up period. Replicate assessments may not be of great value at a one-year interval as suggested by Day et al., although we do not know whether this is true for dietary composition. However, over a longer period, individual diets do change and repeated measures can be of great value.8,9 Whether 7-day diet diaries or records add useful information above and beyond FFQ remains an open question, and I will look forward to further findings based on the data collected by Day et al. Hopefully, in the future they will be analysed and presented in a way that will be useful to epidemiologists.
References
1 Willett WC. Nutritional Epidemiology. 2nd Edn. New York: Oxford University Press, 1998.
2 Willett WC. Invited commentary: comparison of food frequency questionnaires [editorial; comment]. Am J Epidemiol 1998;148:1157-59; discussion 1162-65.
3 Day NE, McKeown N, Wong MY, Welch A, Bingham S. Epidemiological assessment of diet: a comparison of a 7-day diary with a food frequency questionnaire using urinary markers of nitrogen, potassium and sodium. Int J Epidemiol 2001;30:309-17.
4 Willett WC, Howe GR, Kushi LH. Adjustment for total energy intake in epidemiologic studies. Am J Clin Nutr 1997;65(Suppl.):1220S-28S.
5 Beaton GH, Milner J, Corey P et al. Sources of variance in 24-hour dietary recall data: implications for nutrition study design and interpretation. Am J Clin Nutr 1979;32:2546-49.
6 Stram DO, Hankin JH, Wilkens LR et al. Calibration of the dietary questionnaire for a multiethnic cohort in Hawaii and Los Angeles [see comments]. Am J Epidemiol 2000;151:358-70.
7 Spiegelman D, Schneeweiss S, McDermott A. Measurement error correction for logistic regression models with an alloyed gold standard. Am J Epidemiol 1997;145:184-96.
8 Hu FB, Stampfer MJ, Manson JE et al. Dietary fat intake and the risk of coronary heart disease in women. N Engl J Med 1997;337:1491-99.
9 Hu FB, Sampson LA, Stampfer MJ, Rosner BA, Willett WC. A validation study of repeated measurement of diet through food frequency questionnaire in assessing long-term diet among female nurses (abstract). Fourth International Conference on Dietary Assessment Methods, 17-20 September 2000.