a Strangeways Research Laboratory, Institute of Public Health, University of Cambridge, Cambridge CB1 8RN, UK. E-mail: nick.day{at}srl.cam.ac.uk
b Dunn Human Nutrition Unit, Cambridge Institute for Medical Research, Cambridge CB2 2XY, UK.
c Department of Mathematics, Hong Kong University of Science and Technology, Hong Kong.
Abstract
Background Validation studies of dietary instruments developed for epidemiological studies have typically used some form of diet record as the standard for comparison. Recent work suggests that comparison with diet record may overestimate the ability of the epidemiological instrument to measure habitual dietary intake, due to lack of independence of the measurement errors. The degree of regression dilution in estimating diet-disease association may therefore have been correspondingly underestimated. Use of biochemical measures of intake may mitigate the problem. In this paper, we report on the use of urinary measures of intakes of nitrogen, potassium and sodium to compare the performance of a semi-quantitative food frequency questionnaire (FFQ) and a 7-day diet diary (7DD) to estimate average intake of these nutrients over one year.
Methods In all, 179 individuals were asked to complete an FFQ and a 7DD on two occasions separated by approximately 12 months. The individuals were also asked to provide 24-hour urine samples on six occasions over a 6–9-month period, covering the time at which the record FFQ and 7DD were completed. The urine was assayed for nitrogen, potassium and sodium. The protocol was completed by 123 individuals. The data from these individuals were analysed to estimate the covariance structure of the measurement errors of the FFQ, the 7DD and a single 24-hour urine measurement, and to estimate the degree of regression dilution associated with the FFQ and 7DD.
Results The results demonstrated that: (1) the error variances for each of the three nutrients were more than twice as great with the FFQ as with the 7DD; (2) there was substantial correlation (0.46–0.58) between the errors of both the FFQ and the 7DD completed on different occasions; (3) there was moderate correlation (0.24–0.29) between the error in the FFQ and the error in the 7DD for each nutrient; (4) the correlation between errors in different nutrients was higher for the FFQ (0.77–0.80) than for the 7DD (0.52–0.70).
Conclusions The regression dilution with the FFQ is considerably greater than with the 7DD and also, for the nutrients considered, greater than would be inferred if validation studies were based solely on record or diary type instruments.
Keywords Biological markers, food diaries, nutrition assessment, questionnaires, validity, EPIC-Norfolk
Accepted 1 August 2000
The development of instruments to measure habitual dietary intake in the context of large epidemiological studies has been investigated extensively. It is now widely acknowledged that prospective studies give more reliable estimates of association between diet and disease than retrospective studies.1 For most disease endpoints of interest, such prospective studies need to be large with several tens if not hundreds of thousands of individuals recruited to give baseline information. Dietary instruments to be used in studies of this size are clearly constrained by resource considerations. Food frequency questionnaires (FFQ) have been largely the instrument of choice,2 but increasingly the use of diet diaries is being proposed.3 No dietary instrument can capture habitual diet with complete accuracy, and methods are available to correct observed diet-disease association for the bias induced by the imprecision of the dietary assessment, the so-called regression dilution bias. Regression dilution bias is an intrinsic aspect of modern quantitative epidemiology. However, the credibility of correcting for regression dilution depends on the magnitude of the correction required. In addition, estimation of the requisite correction factors becomes problematic if the instrument measures habitual diet poorly. Both the potential for bias and the imprecision of estimation increase rapidly as the association between estimated and true dietary intake decreases.4–6 The performance of the FFQ has been extensively compared with other, more intensive, record-based methods, such as weighed diet records.2,7 There is increasing evidence, however, that record- or recall-based methods do not satisfy the independence criteria required to act as validation methods.8–11 Non-independence can lead to substantial overestimation of the capacity of the FFQ to assess habitual diet accurately.8 Few studies have characterized the performance of FFQ against validated biochemical measures of intake.
Twenty-four hour (24-h) urine collections, verified for completeness by the para-amino benzoic acid (PABA) method have been shown to give an unbiased, calibrated measure of intake for nitrogen (N),3 potassium (K)3 and sodium (Na).12 In this paper, we report on the results of a validation study performed in association with the UK component based in Norfolk of the European Prospective Investigation of Cancer (EPIC).13 This cohort study of approximately 25 000 adults, known as EPIC-Norfolk, used both a 7DD and a semi-quantitative FFQ.14 This validation study compared the performance of these two dietary instruments with 24-h urine measurements of N, K and Na, verified for completeness using PABA. An underlying assumption of this study is that the errors of measurement associated with the biochemical assessment of intake are independent from the errors of measurement of the 7DD and FFQ methods of dietary assessment, provided the urine collection and the completion of the diary do not occur together. This assumption is considered later in Discussion.
Method
Study design
Over an 18-month period, 179 members of the EPIC-Norfolk cohort took part in a validation study. In brief, each individual was asked to complete an FFQ and a 7DD on two occasions: once on their entry into EPIC-Norfolk and on another occasion 18 months later (±3 months). Over a 12-month period which covered the time of completion of the second FFQ and 7DD, six 24-h urine collections were requested, with completeness to be verified by PABA. The time relationship between urine collections and completion of the FFQ and 7DD is displayed in Figure 1. In all, 123 individuals completed the full protocol, and the results presented here refer to these 123 individuals. Nitrogen, K and Na were assayed on all urine samples.3
The food frequency questionnaire
The self-administered FFQ was designed to measure an individual's habitual food and nutrient intake during the past year. The questionnaire was a modified version of the FFQ in the US Nurses Health Study15,16 with a food list that was adapted to include foods that were commonly consumed within the UK. The food list was compiled from national dietary intake data and was based on 130 main food items.17 The FFQ in the present study was a revised version of the questionnaire previously used in validation studies.18 For each food item, participants were asked to indicate their usual consumption from nine frequency categories, ranging from never or <1/month to 6 times per day. The FFQ did not include specific questions on portion size but rather specified medium servings, defined by natural (e.g. apple, slice of bread) or household units (e.g. glass, cup, spoon). Calculation of nutrient intake for both instruments was based on published food composition tables19 which are frequently updated with data on new food items and nutrients.
Urine collections
Participants received written and verbal instructions on the technique of collecting 24-h urine samples and the use of PABA tablets (PABA; PABA check, Laboratories for Applied Biology, London). On the first morning of the urine collection, participants were asked to discard their first urine specimen and from then on to collect all specimens for the next 24 hours, up to and including the first urine specimen of the next day. They were given three 80 mg PABA tablets to take at each meal on the day of the urine collection to verify completeness of the 24-h urine collection.20
The Norwich District Ethics Committee gave permission for both the main EPIC-Norfolk study and this validation study.
Statistical methods
We designate diary measures by R, the FFQ by Q and the urinary measures by M. We assume there is a true but unobservable intake we designate T, in this case comprising the average intake over a year. We designate the population variance of T by $\sigma_T^2$. For each nutrient, we take M to be a calibrated but imprecise measure of intake, so that
$$M = T + \varepsilon_M$$
where $\varepsilon_M$ is an error term with variance $\sigma_M^2$, different for each of the three nutrients.
R and Q we take as biased and imprecise measures of intake, so that
$$R = \alpha_R + \beta_R T + \varepsilon_R, \qquad \operatorname{var}(\varepsilon_R) = \sigma_R^2$$
$$Q = \alpha_Q + \beta_Q T + \varepsilon_Q, \qquad \operatorname{var}(\varepsilon_Q) = \sigma_Q^2$$
where $\beta_R$ and $\beta_Q$ can be regarded as scale factors and $\alpha_R$ and $\alpha_Q$ as measurement bias. As before, the parameters differ for each nutrient. We also denote, by $\rho_{RR}$ and $\rho_{QQ}$ respectively, the correlation between values of $\varepsilon_R$ and $\varepsilon_Q$ from repeated measures of R and Q, and by $\rho_{RQ}$ the correlation between $\varepsilon_R$ and $\varepsilon_Q$.
This model is similar to the one introduced by Kipnis et al.,8 except that the error terms are parameterized differently, and a simplified version of that investigated by Plummer and Clayton.10,11 Person-specific bias is included in the error term, and manifests itself as correlated error. Kipnis et al. introduced a separate term for individual bias.
We are interested primarily in estimating the parameters $\sigma_T^2$, $\sigma_M^2$, $\sigma_R^2$, $\sigma_Q^2$, $\beta_R$, $\beta_Q$, $\rho_{RR}$, $\rho_{QQ}$ and $\rho_{RQ}$.
We assume that errors in M are independent of errors in R and Q, and also between repeated measures of M.
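In particular, we are interested in estimating the regression dilution correction for R and Q, given by
$$\lambda_R = \frac{\beta_R \sigma_T^2}{\beta_R^2 \sigma_T^2 + \sigma_R^2}$$
$$\lambda_Q = \frac{\beta_Q \sigma_T^2}{\beta_Q^2 \sigma_T^2 + \sigma_Q^2}$$
when only one measure of R or Q has been used as the basis for diet-disease association estimates.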
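The role of this ratio can be seen from a short derivation. The sketch below assumes a linear disease model with outcome Y, slope θ and a residual e that is independent of T and of the dietary measurement error; Y, μ, θ and e are illustrative symbols introduced here and are not part of the model above. The ratio is written as $\lambda_R = \operatorname{cov}(R,T)/\operatorname{var}(R)$, which is the quantity that the Appendix estimator $S_{R_i,M_j}/S_{R_i,R_i}$ targets.

```latex
\begin{aligned}
Y &= \mu + \theta T + e, \qquad R = \alpha_R + \beta_R T + \varepsilon_R,\\[4pt]
\frac{\operatorname{cov}(Y,R)}{\operatorname{var}(R)}
  &= \frac{\theta\,\beta_R\,\sigma_T^2}{\beta_R^2\sigma_T^2 + \sigma_R^2}
   = \theta\,\lambda_R ,\\[4pt]
\theta &= \frac{1}{\lambda_R}\cdot\frac{\operatorname{cov}(Y,R)}{\operatorname{var}(R)} .
\end{aligned}
```

Under these assumptions the slope observed when Y is regressed on a single diary measure is the true slope multiplied by $\lambda_R$, so the true slope is recovered by dividing the observed slope by $\lambda_R$. The same argument applies to Q with $\lambda_Q$.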
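To estimate the required parameters, we have used the method of moments, equating each observed variance and covariance to its expected value.5,6,10,11,21 We have
$$E(S_{R_i,R_i}) = \beta_R^2 \sigma_T^2 + \sigma_R^2$$
$$E(S_{R_i,R_j}) = \beta_R^2 \sigma_T^2 + \rho_{RR}\,\sigma_R^2$$
$$E(S_{Q_i,Q_i}) = \beta_Q^2 \sigma_T^2 + \sigma_Q^2$$
$$E(S_{Q_i,Q_j}) = \beta_Q^2 \sigma_T^2 + \rho_{QQ}\,\sigma_Q^2$$
$$E(S_{M_i,M_i}) = \sigma_T^2 + \sigma_M^2$$
$$E(S_{M_i,M_j}) = \sigma_T^2$$
$$E(S_{R_i,Q_j}) = \beta_R \beta_Q \sigma_T^2 + \rho_{RQ}\,\sigma_R \sigma_Q$$
$$E(S_{R_i,M_j}) = \beta_R \sigma_T^2$$
$$E(S_{Q_i,M_j}) = \beta_Q \sigma_T^2$$
where for X, Y ∈ {R, Q, M}, $S_{X_i,X_i}$ is the average of sample variances over the repeated measures on measurement method X, $S_{X_i,X_j}$ is the average of sample covariances over all pairs i, j of measures on measurement method X, i ≠ j, and $S_{X_i,Y_j}$ is the average over all i, j of sample covariances for the ith measure on measurement method X and the jth measure on measurement method Y, and where $R_i$, $Q_i$ and $M_i$ designate repeated measures of R, Q and M, respectively. Using the method of moments, no distributional assumptions on T, $\varepsilon_R$, $\varepsilon_Q$ and $\varepsilon_M$ are needed. We only assume that their variances and covariances exist and are finite. It is known that the moment estimators are identical to the maximum likelihood estimators if T, $\varepsilon_R$, $\varepsilon_Q$ and $\varepsilon_M$ are normally distributed and when the repeats are complete.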
We have estimated the unknown parameters for N, K and Na separately, and the regression dilution correction factor, together with the variance of the estimate for each nutrient for both the 7DD and the FFQ. The expressions of the estimates for unknown parameters are given in the Appendix.
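The moment calculation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used for the study: the function names, the dictionary keys and the simulated parameter values are all hypothetical, the intercepts of R and Q are set to zero (they do not affect covariances), and correlated errors are generated from person-specific components whose correlation is chosen to give the requested error correlation between R and Q.

```python
import math
import random
from itertools import combinations


def cov(x, y):
    """Unbiased sample covariance of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)


def s_same(X):
    """Average sample variance over the repeats of one method."""
    return sum(cov(x, x) for x in X) / len(X)


def s_pairs(X):
    """Average covariance over distinct pairs of repeats of one method."""
    pairs = list(combinations(X, 2))
    return sum(cov(a, b) for a, b in pairs) / len(pairs)


def s_cross(X, Y):
    """Average covariance over all pairs of repeats from two methods."""
    return sum(cov(a, b) for a in X for b in Y) / (len(X) * len(Y))


def moment_estimates(R, Q, M):
    """Moment estimates; R, Q, M are lists of repeats, each a list over subjects."""
    var_t = s_pairs(M)                      # errors in M assumed uncorrelated
    s_rm, s_qm = s_cross(R, M), s_cross(Q, M)
    var_r = s_same(R) - s_rm ** 2 / var_t
    var_q = s_same(Q) - s_qm ** 2 / var_t
    return {
        "var_t": var_t,
        "var_m": s_same(M) - var_t,
        "beta_r": s_rm / var_t,
        "beta_q": s_qm / var_t,
        "var_r": var_r,
        "var_q": var_q,
        "rho_rr": (s_pairs(R) - s_rm ** 2 / var_t) / var_r,
        "rho_qq": (s_pairs(Q) - s_qm ** 2 / var_t) / var_q,
        "rho_rq": (s_cross(R, Q) - s_rm * s_qm / var_t) / math.sqrt(var_r * var_q),
        "lambda_r": s_rm / s_same(R),       # cov(R, T) / var(R)
        "lambda_q": s_qm / s_same(Q),       # cov(Q, T) / var(Q)
    }


def simulate(n, p, n_r=2, n_q=2, n_m=6, seed=0):
    """Simulate n subjects from the model with hypothetical parameters p."""
    rng = random.Random(seed)
    R = [[] for _ in range(n_r)]
    Q = [[] for _ in range(n_q)]
    M = [[] for _ in range(n_m)]
    # Correlation between the person-specific error components of R and Q.
    r_u = p["rho_rq"] / math.sqrt(p["rho_rr"] * p["rho_qq"])
    for _ in range(n):
        t = rng.gauss(0.0, math.sqrt(p["var_t"]))
        u_r = rng.gauss(0.0, 1.0)
        u_q = r_u * u_r + math.sqrt(1.0 - r_u ** 2) * rng.gauss(0.0, 1.0)
        for rep in R:
            e = math.sqrt(p["rho_rr"]) * u_r + math.sqrt(1.0 - p["rho_rr"]) * rng.gauss(0.0, 1.0)
            rep.append(p["beta_r"] * t + math.sqrt(p["var_r"]) * e)
        for rep in Q:
            e = math.sqrt(p["rho_qq"]) * u_q + math.sqrt(1.0 - p["rho_qq"]) * rng.gauss(0.0, 1.0)
            rep.append(p["beta_q"] * t + math.sqrt(p["var_q"]) * e)
        for rep in M:
            rep.append(t + rng.gauss(0.0, math.sqrt(p["var_m"])))
    return R, Q, M


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the estimator.
    truth = {"var_t": 1.0, "var_m": 0.5, "beta_r": 0.8, "beta_q": 0.4,
             "var_r": 0.6, "var_q": 1.5, "rho_rr": 0.5, "rho_qq": 0.5, "rho_rq": 0.25}
    R, Q, M = simulate(20000, truth, seed=1)
    for name, value in moment_estimates(R, Q, M).items():
        print(f"{name}: {value:.3f}")
```

With a large simulated sample the estimates should settle close to the values used to generate the data. With a sample the size of the present study (123 subjects) they would be far more variable, particularly for the FFQ parameters.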
Results
Tables 1–4 give the main sample statistics. The FFQ gives higher mean intakes than the 7DD for N and K. The mean intakes from the three measurement methods differ significantly (P < 0.001), except for the FFQ and 7DD sodium intakes. Both the FFQ and the 7DD underestimate sodium intake, compared to the urinary measure. For each nutrient, the observed variance of the 7DD measure was smallest. For Na, the urine measure had by far the largest variance (Table 2). It is striking from Table 3 that, although the correlations between R and M and between R and Q are of similar magnitude for all three nutrients, the correlations between Q and M are markedly smaller. It is noticeable, in Table 4, that the correlations between the estimated intakes of the three nutrients are largest for the FFQ, and smallest for the urinary measure.
Table 8 displays the estimates of the scale factors for R and Q. They indicate, as does Table 5, that the FFQ is rather weakly associated with T, whereas the diary R, at least for N and K, relates more closely to true intake. The lower values seen for Na clearly reflect the underestimate of sodium intake by both the 7DD and the FFQ, as seen in Table 1.
The regression dilution correction factors for the 7DD and the FFQ, for each of the nutrients, are given in Table 9. The values given refer to the correction needed both when only one measure of R or Q has been used to estimate diet-disease association, and in the situation when the diet-disease association has been estimated from two measures of R or Q. This latter correction incorporates the correlation in the error terms between repeats of R and Q. The reduction in the required corrections in going from one to two measures of R and Q is clearly less than it would be if the error terms were independent.
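The effect of averaging repeats can be written out explicitly. The expression below is a sketch that follows from the model in Statistical methods, assuming every pair of repeats of R has the same error correlation $\rho_{RR}$; the symbols $\bar{R}_k$ and $\lambda_R^{(k)}$ are introduced here for illustration and do not appear in the tables.

```latex
\begin{aligned}
\bar{R}_k &= \frac{1}{k}\sum_{i=1}^{k} R_i, \qquad
\operatorname{var}(\bar{R}_k) = \beta_R^2\sigma_T^2
   + \sigma_R^2\,\frac{1 + (k-1)\rho_{RR}}{k},\\[4pt]
\lambda_R^{(k)} &= \frac{\operatorname{cov}(\bar{R}_k, T)}{\operatorname{var}(\bar{R}_k)}
   = \frac{\beta_R\sigma_T^2}
          {\beta_R^2\sigma_T^2 + \sigma_R^2\,\{1 + (k-1)\rho_{RR}\}/k}.
\end{aligned}
```

For k = 2 the error variance of the mean is $\sigma_R^2(1+\rho_{RR})/2$ rather than the $\sigma_R^2/2$ that independent errors would give, which is why a second measure helps less than might be expected. The same expression holds for Q with $\beta_Q$, $\sigma_Q^2$ and $\rho_{QQ}$.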
Discussion
The results presented here demonstrate that a 7DD provides a better estimate of average intake, as assessed by urinary measures, than does the EPIC-Norfolk FFQ, for each of the three nutrients we have considered. One can only speculate whether the same conclusion holds for other nutrients. Additional biomarkers for other nutrients, substantially correlated with true intake, are required. The correlations between the diary estimates of intake of the three nutrients and the urinary measures are between 0.36 and 0.49; for the FFQ the correlations vary from 0.13 to 0.22. This comparison can be seen graphically in Figures 2(a), (b) and (c). From the values in Tables 5 and 8, one can calculate the correlations between the 7DD measures and average intake T and between the FFQ and T. For the 7DD, the values are 0.64, 0.56 and 0.59 for nitrogen, potassium and sodium, respectively, whereas the corresponding values for the FFQ are 0.20, 0.33 and 0.18, respectively. For the 24-h urine, the corresponding values are 0.76, 0.69 and 0.69. The latter two values correspond to estimates of 0.74 and 0.66 from the INTERSALT study.24
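The correlations with true intake follow directly from the model parameters. The expressions below are a sketch derived from the model in Statistical methods, assuming errors in M are independent of errors in R; $\operatorname{corr}(\cdot,\cdot)$ is notation introduced here.

```latex
\begin{aligned}
\operatorname{corr}(R,T) &= \frac{\beta_R\,\sigma_T}{\sqrt{\beta_R^2\sigma_T^2 + \sigma_R^2}}, \qquad
\operatorname{corr}(M,T) = \frac{\sigma_T}{\sqrt{\sigma_T^2 + \sigma_M^2}},\\[4pt]
\operatorname{corr}(R,M) &= \frac{\beta_R\,\sigma_T^2}
   {\sqrt{\beta_R^2\sigma_T^2 + \sigma_R^2}\,\sqrt{\sigma_T^2 + \sigma_M^2}}
   = \operatorname{corr}(R,T)\,\operatorname{corr}(M,T).
\end{aligned}
```

As a rough check using the nitrogen values quoted above, 0.64 × 0.76 ≈ 0.49, which sits at the upper end of the 0.36 to 0.49 range reported for the diary against the urinary measure. The same expressions apply to Q with $\beta_Q$ and $\sigma_Q^2$.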
The correlations for the FFQ are much lower than those often reported for the FFQ from validation studies.2 Usually, however, the validation is done in terms of other record-based instruments. As can be seen from Table 6, correlated errors between the 7DD and the FFQ will lead to overestimation of the correlation between the FFQ and underlying intake. The values in Table 3, comparing the correlations between intakes from the FFQ and the 7DD and between intakes from the FFQ and the 24-h urinary measure, illustrate the point. The lower values for the latter could have been due to excessive error variation in the 24-h urinary measure, but Table 6 demonstrates that this is not the case. The lower correlations are due to the substantial correlation between the errors in the FFQ and the 7DD and are of a similar magnitude for each of the three nutrients.
The correlation between the errors of the 7DD and the FFQ can be expressed in terms of the corresponding correlation between person-specific biases, following Kipnis et al.,8 if one assumes that all the correlation between the two errors derives from correlation between person-specific biases. This correlation is then given by $\rho_{RQ}/(\rho_{RR}\,\rho_{QQ})^{1/2}$, with values of 0.56, 0.51 and 0.47 for N, K and Na, respectively. These values are at the top end of the values Kipnis et al. considered (Table 2 of ref. 8).
The lower correlations between the FFQ and underlying intake lead to a larger degree of regression dilution than has often been assumed. With correction factors in the range 4 to 9, only large underlying relative risks will lead to appreciably elevated observed risks when using the FFQ, at least for the three nutrients under consideration. In addition to the correction factor estimates being large for the FFQ, the associated standard errors for the estimated correction factor are also large, as can be seen in Figure 3. For a validation study to yield acceptable precision in the estimates of the correction factors for the FFQ, it would need to include several thousand individuals.
The substantial correlation of the error terms for both the diary and the FFQ requires comment. It may be associated with the well-known phenomenon of underreporting, i.e. that there is considerable inter-individual variation in the way dietary records and questionnaires are completed.4 One consequence for the design of epidemiological studies, however, is that there is limited value in obtaining several repeat diaries or FFQ. In particular, the poor performance of the FFQ cannot be overcome by simply repeating it on numerous occasions. Given the level of correlation in the error term, the most that can be achieved is the equivalent of two independent repeats.
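The limit on what repeats can achieve follows from the variance of a mean of equally correlated errors. The sketch below is an illustration only: `effective_repeats` is a hypothetical helper, and the value 0.5 is a round figure within the 0.46 to 0.58 range of between-occasion error correlations reported in the Abstract.

```python
def effective_repeats(k, rho):
    """Number of independent repeats whose mean has the same error variance
    as the mean of k repeats with common pairwise error correlation rho.

    The error variance of the mean of k such repeats is
    sigma^2 * (1 + (k - 1) * rho) / k, so the equivalent number of
    independent repeats is k / (1 + (k - 1) * rho), which tends to 1 / rho.
    """
    return k / (1.0 + (k - 1) * rho)


if __name__ == "__main__":
    rho = 0.5  # illustrative value, not an estimate from the tables
    for k in (1, 2, 4, 10, 100):
        print(k, round(effective_repeats(k, rho), 3))
```

With an error correlation of 0.5 the equivalent number of independent repeats rises from 1 towards, but never reaches, 1/0.5 = 2, however many times the instrument is repeated.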
Similar analyses have been published for nitrogen intake from an early validation study for the EPIC-Norfolk cohort.10,11 There is, however, an essential difference in that the earlier publication did not use an open-ended diet diary, but a 7-day daily check list. The earlier work also uses two different FFQ and so although our methodology derives in part from the earlier work, the substantive results are not comparable.
The assumption that the error associated with a single urinary measure is independent from the errors of the repeated urinary measures, and also from the errors of the 7DD and FFQ, needs some comment. The situation of dependence between errors in the repeated 24-h urine measures has been considered in an earlier paper.5 Since the correlation of the mean of six urine measures with habitual intake over one year will be high (from Table 5), the effect of error correlation up to 0.4 will be relatively small.3 With regard to possible correlations between errors in the urinary measures and the corresponding measures from the 7DD and the FFQ, the urinary measure is clearly of a different type, and physically independent from the questionnaire type instruments. The timing of the urine collection was also chosen not to coincide with completion of either the 7DD or the FFQ. It is conceivable, however, that any approach to study participants may change their behaviour, thereby inducing some level of correlation. It is impossible to estimate this correlation without a further independent measure, and we are not aware of one being available. We have undertaken sensitivity analysis to investigate the degree to which our estimates of the correction factors in Table 9 change if error in the FFQ or 7DD is correlated with error in the urinary measure. The sensitivity analysis focused on the nitrogen values. We define $\rho_{RM}$ and $\rho_{QM}$ as the correlation between error in the urinary measure, M, and error in the 7DD, R, and the FFQ, Q, respectively. The correction factor for the FFQ is highly sensitive to moderate values of $\rho_{QM}$, becoming infinite with $\rho_{QM} = 0.23$. The correction factor for the 7DD is relatively insensitive to moderate values of $\rho_{RM}$, equalling 3.09 with $\rho_{RM} = 0.3$. The conclusion thus is similar to that in our earlier paper,5 that the sensitivity of the estimated correction factor to departures from the assumption of independence becomes greater the more weakly the measured intake (R, Q or M) is related to true intake (T). It is also worth noting that the values in Table 10 indicate that the relative magnitudes of the correction factors (for potassium) for the 7DD and the FFQ under the independence assumptions appear to be approximately correct since the adjusted regression coefficients are the same for the 7DD and the FFQ for both FEV1 and plasma vitamin C.
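One way such a sensitivity analysis could be set up is sketched below. It is an illustration under stated assumptions and not necessarily the calculation behind the figures quoted above: it assumes that the FFQ error and the urinary error have the same correlation for every pair of repeats, that repeated urinary errors remain uncorrelated, and that the assumed correlation is non-negative. The function name and the input values are hypothetical. The function returns the ratio cov(Q, T)/var(Q); when this ratio reaches zero its reciprocal is infinite.

```python
import math


def attenuation_with_error_correlation(s_qq, s_qm, var_t, var_m, rho_qm):
    """Return cov(Q, T) / var(Q) when corr(error_Q, error_M) = rho_qm >= 0.

    With correlated errors, E[S_QM] = c + rho_qm * sigma_M * sigma_Q, where
    c = cov(Q, T) and sigma_Q^2 = s_qq - c^2 / var_t. The equation is solved
    for c by bisection on the interval [0, s_qm].
    """
    sigma_m = math.sqrt(var_m)

    def excess(c):
        var_eq = max(s_qq - c * c / var_t, 0.0)   # error variance of Q
        return c + rho_qm * sigma_m * math.sqrt(var_eq) - s_qm

    if excess(0.0) >= 0.0:
        # The assumed error correlation accounts for all of S_QM.
        return 0.0
    lo, hi = 0.0, s_qm
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) / s_qq


if __name__ == "__main__":
    # Hypothetical sample statistics, not values from the tables.
    s_qq, s_qm, var_t, var_m = 4.0, 0.3, 1.0, 1.0
    for rho in (0.0, 0.05, 0.10, 0.15):
        ratio = attenuation_with_error_correlation(s_qq, s_qm, var_t, var_m, rho)
        print(rho, round(ratio, 4))
```

Under these assumptions the ratio falls to zero once the assumed correlation reaches $S_{Q_i,M_j}/(\sigma_M\sqrt{S_{Q_i,Q_i}})$, so an instrument whose covariance with the urinary measure is small relative to its own variance reaches that point at a modest error correlation.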
An additional feature of the results presented in this paper is the high level of correlation between the errors in estimating the three different nutrients (Table 7). These correlations are greater for the FFQ than for the 7DD and will lead to greater apparent confounding in the univariate situation, but their main effect will be in the multivariate situation. If several dietary factors are being examined simultaneously, these correlations will contribute to the multivariate regression dilution. In the multivariate situation the effect of measurement error is not simply dilution of each parameter, since there is an additive component, as discussed by Kipnis et al.25 These authors label this additive component regression contamination. The greater the correlation between the errors of different nutrients, the greater the resulting contamination is likely to be.6
As a final point, the large values for the regression correction factor seen for the FFQ derive both from the error variances of the FFQ and the underlying between-individual variation in the study population. If the latter is increased, the correction factor will become smaller, which underlines the importance of variation across the study population. This forms an important part of the design rationale of EPIC.
Appendix
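Under the assumption that the errors between successive determinations using the biomarker method are uncorrelated (that is, $\rho_{MM} = 0$) and $\rho_{RM} = \rho_{QM} = 0$, if the validation study consists of repeated measures on R, Q and M, the estimators are obtained by equating the sample statistics to their expected values. We thus get
$$\hat{\sigma}_T^2 = S_{M_i,M_j}$$
$$\hat{\sigma}_M^2 = S_{M_i,M_i} - S_{M_i,M_j}$$
$$\hat{\beta}_R = \frac{S_{R_i,M_j}}{S_{M_i,M_j}}$$
$$\hat{\beta}_Q = \frac{S_{Q_i,M_j}}{S_{M_i,M_j}}$$
$$\hat{\sigma}_R^2 = S_{R_i,R_i} - \frac{S_{R_i,M_j}^2}{S_{M_i,M_j}}$$
$$\hat{\sigma}_Q^2 = S_{Q_i,Q_i} - \frac{S_{Q_i,M_j}^2}{S_{M_i,M_j}}$$
$$\hat{\rho}_{RR} = \frac{S_{R_i,R_j} - S_{R_i,M_j}^2 / S_{M_i,M_j}}{\hat{\sigma}_R^2}$$
$$\hat{\rho}_{QQ} = \frac{S_{Q_i,Q_j} - S_{Q_i,M_j}^2 / S_{M_i,M_j}}{\hat{\sigma}_Q^2}$$
$$\hat{\rho}_{RQ} = \frac{S_{R_i,Q_j} - S_{R_i,M_j} S_{Q_i,M_j} / S_{M_i,M_j}}{\hat{\sigma}_R\,\hat{\sigma}_Q}$$
The estimates of the univariate correction factor for R and Q are, thus, $S_{R_i,M_j}/S_{R_i,R_i}$ and $S_{Q_i,M_j}/S_{Q_i,Q_i}$, respectively.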
Since the estimates of the unknown parameters and univariate correction factors are functions of sample statistics from the validation study, we obtain their asymptotic variances by the delta method.
The estimated asymptotic variances of the univariate correction factors of R and Q are equal to
![]() |
and
![]() |
where mr, mq and mm are the number of repeated measures on R, Q and M, respectively.
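As a numerical cross-check on a delta-method variance, a standard error can also be approximated by resampling subjects. The sketch below uses a nonparametric bootstrap, which is a different technique from the delta method used in this paper and is offered only as an illustration; the function names, the number of resamples and the seed are hypothetical choices.

```python
import math
import random


def cov(x, y):
    """Unbiased sample covariance of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)


def correction_ratio(R, M):
    """Average cross-covariance of R and M repeats over average variance of R.

    R and M are lists of repeats; each repeat is a list over subjects.
    """
    s_rm = sum(cov(r, m) for r in R for m in M) / (len(R) * len(M))
    s_rr = sum(cov(r, r) for r in R) / len(R)
    return s_rm / s_rr


def bootstrap_se(R, M, n_boot=200, seed=0):
    """Bootstrap standard error of correction_ratio, resampling whole subjects."""
    rng = random.Random(seed)
    n = len(R[0])
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # subjects, with replacement
        r_boot = [[rep[i] for i in idx] for rep in R]
        m_boot = [[rep[i] for i in idx] for rep in M]
        stats.append(correction_ratio(r_boot, m_boot))
    mean = sum(stats) / n_boot
    return math.sqrt(sum((s - mean) ** 2 for s in stats) / (n_boot - 1))
```

Resampling whole subjects keeps each person's repeated measures together, so any within-person error correlation is preserved in every bootstrap sample. The same function can be applied to Q by passing the FFQ repeats in place of R.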
Acknowledgments
The EPIC-Norfolk Study is funded by the Cancer Research Campaign, the Medical Research Council, the British Heart Foundation, the Ministry of Agriculture, Fisheries and Food, and the Europe against Cancer Programme of the Commission of the European Communities. We are grateful to all the participants and general practitioners who have helped with this study, and to the nurses, technicians and the staff of the EPIC co-ordinating centre. Undergraduate students from Coleraine, Surrey, Newcastle, Leeds Metropolitan and Wageningen Universities assisted with the collection and interpretation of the fieldwork. The third author was funded by the British Council to do this collaborative work. The authors would like to thank the referees for their helpful comments.
References
1 COMA Nutritional Aspects of the Development of Cancer. Committee on Medical Aspects of Food and Nutrition Policy Report on Health and Social Subjects 48, London: Department of Health, The Stationery Office, 1998.
2 Willett W. Nutritional Epidemiology. New York: Oxford University Press, 1998.
3 Bingham S, Gill C, Welch A et al. Validation of dietary assessment methods in the UK arm of EPIC. Int J Epidemiol 1997;26:S137–51.
4 Bingham SA, Gill C, Welch A et al. Comparison of dietary assessment methods in nutritional epidemiology: weighed records v. 24 h recalls, food-frequency questionnaires and estimated-diet records. Br J Nutr 1994;72:619–43.
5 Wong MY, Day NE, Bashir SA, Duffy SW. Measurement error in epidemiology: the design of validation studies I: univariate situation. Stat Med 1999;18:2815–29.
6 Wong MY, Day NE, Wareham NJ. Measurement error in epidemiology: the design of validation studies II: bivariate situation. Stat Med 1999;18:2830–45.
7 Kaaks R, Slimani N, Riboli E. Pilot phase studies on the accuracy of dietary intake measurements in the EPIC project: overall evaluation of results. European Prospective Investigation into Cancer and Nutrition. Int J Epidemiol 1997;26(Suppl.1):S26–36.
8 Kipnis V, Carroll RJ, Freedman LS, Li L. Implications of a new dietary measurement error model for estimation of relative risk: application to four calibration studies. Am J Epidemiol 1999;150:642–51.
9 Kroke A, Klipstein-Grobusch K, Voss S et al. Validation of a self-administered food-frequency questionnaire administered in the European Prospective Investigation into Cancer and Nutrition (EPIC) Study: comparison of energy, protein, and macronutrient intakes estimated with the doubly labeled water, urinary nitrogen, and repeated 24-h dietary recall methods. Am J Clin Nutr 1999;70:439–47.
10 Plummer M, Clayton D. Measurement error in dietary assessment: an investigation using covariance structure models, Part I. Stat Med 1993;12:925–35.
11 Plummer M, Clayton D. Measurement error in dietary assessment: an investigation using covariance structure models, Part II. Stat Med 1993;12:937–48.
12 INTERSALT Cooperative Research Group. INTERSALT: an international study of electrolyte excretion and blood pressure: result for 24-hour urinary sodium and potassium excretion. Br Med J 1988;297:319–28.
13 Riboli E. Nutrition and cancer: background and rationale of the European Prospective Investigation into Cancer and Nutrition (EPIC). Ann Oncol 1992;3:783–91.
14 Day NE, Oakes S, Luben R et al. EPIC in Norfolk: study design and characteristics of the cohort. Br J Cancer 1999;80(Suppl.1):95–103.
15 Willett WC, Sampson L, Stampfer MJ et al. Reproducibility and validity of a semiquantitative food frequency questionnaire. Am J Epidemiol 1985;122:51–65.
16 Willett WC, Sampson L, Browne ML et al. The use of a self-administered questionnaire to assess diet four years in the past. Am J Epidemiol 1988;127:188–99.
17 Committee NFS. Annual Report of Household Food Consumption and Expenditure 1980. London: Her Majesty's Stationery Office, 1982.
18 Bingham SA, Gill C, Welch A et al. Comparison of dietary assessment methods in nutritional epidemiology: weighed records v. 24 h recalls, food-frequency questionnaires and estimated-diet records. Br J Nutr 1994;72:619–43.
19 Holland B, Welch AA, Unwin ID, Buss DH, Paul AA, Southgate DA. McCance and Widdowson's The Composition of Foods. Fifth revised and extended edition. The Royal Society of Chemistry and Ministry of Agriculture, Fisheries and Food, 1991 and Supplements to this edition.
20 Bingham S, Cummings JH. The use of 4-aminobenzoic acid as a marker to validate the completeness of 24h urine collections in man. Clin Sci 1983;64:629–35.
21 Kaaks R, Riboli E, Esteve J, van Kappel A, van Staveren W. Estimating the accuracy of dietary questionnaire assessments: validation in terms of structural equation models. Stat Med 1994;13:127–42.
22 Hu G, Cassano PA. Antioxidant nutrients and pulmonary function: The Third National Health and Nutrition Examination Survey (NHANES III). Am J Epidemiol 2000;151:975–81.
23 Price JF, Fowkes FGR. Antioxidant vitamins in the prevention of cardiovascular disease: the epidemiological evidence. Eur Heart J 1997;18:719–27.
24 Dyer AR, Shirley M, Elliott P. Urinary electrolyte excretion in 24 h and blood pressure in the Intersalt study: estimate of reliability. Am J Epidemiol 1994;139:927–39.
25 Kipnis V, Freedman LS, Brown CC, Hartman AM, Schatzkin A, Wacholder S. Effect of measurement error on energy-adjustment models in nutritional epidemiology. Am J Epidemiol 1997;146:842–55.