ACCURACY OF QUANTITY–FREQUENCY AND GRADUATED FREQUENCY QUESTIONNAIRES IN MEASURING ALCOHOL INTAKE: COMPARISON WITH DAILY DIARY AND COMMONLY USED LABORATORY MARKERS

Kari Poikolainen1,2,*, Irina Podkletnova3 and Hannu Alho2,4

1 Finnish Foundation for Alcohol Studies, P. O. Box 220, FIN-00531 Helsinki,
2 Department of Mental Health and Alcohol Research, National Public Health Institute, Mannerheimintie 166, FIN-00300 Helsinki,
3 Medical School, University of Tampere, P. O. Box 719, 33101 Tampere and
4 Research Unit of Substance Abuse Medicine, University of Helsinki, Finland

Received 11 February 2002; in revised form 28 March 2002; accepted 1 May 2002


    ABSTRACT
 
Aims: To ascertain the accuracy of a quantity–frequency questionnaire (QF) and a graduated frequency questionnaire (GF) as methods of obtaining self-reported alcohol intake in relation to a daily diary and biochemical tests. Methods: QF and GF data were obtained before and after a 1-month daily diary on alcohol intake in a sample of 52 volunteers aged 20–63 years, of whom 43 were female. A blood sample to measure serum aspartate aminotransferase (ASAT), alanine aminotransferase (ALAT), γ-glutamyltransferase (GGT) and % carbohydrate-deficient transferrin (CDT) was obtained at the outset. Results: Both QF and GF correlated closely with daily diary intake (r > 0.90). Compared with the daily diary, the mean QF intake was slightly lower, whereas the mean GF intake was about twice as high. The apparent overestimation by GF was independent of the actual consumption level. Self-reported alcohol intake by each method correlated closely with serum ASAT, ALAT and GGT (r = 0.41–0.67) but not with CDT. Conclusions: In adults motivated to recall alcohol intake, both QF and GF classify individuals in the correct rank order, but GF probably overestimates actual alcohol consumption.


    INTRODUCTION
 
Our knowledge of alcohol-related risks and benefits depends on the accuracy of self-reported recall of alcohol intake. Whereas the reliability of recall is reasonably good, intake figures are usually underestimates. When compared with alcohol sales, the underestimation has been observed to vary from 29 to 83% (Pernanen, 1974; Polich, 1981; Simpura et al., 1987). Another problem is that heavy drinkers have been noted to underestimate their intake more than light drinkers (Uchalik, 1979; Poikolainen, 1985). In these studies, intake has usually been assessed by inquiring into the quantity and frequency of alcohol consumption.

Traditional quantity–frequency questionnaires (QF) focus on typical, or habitual, intake. No options are provided to report more infrequent drinking occasions and less typical amounts. Hence, both frequent intake of small amounts and infrequent intake of high amounts remain underreported. Another problem is the lack of information on the total amount of alcohol consumed during one drinking day (except if all subjects under study were to drink only beer, or only wine, or only spirits). A partial solution to these difficulties is to ask separately about the frequency of heavy drinking days in addition to the questions that map out average alcohol intake. A logical extension of this approach is to inquire separately into the frequency of drinking days for various quantity levels of alcohol intake, say one to two drinks, three to four drinks, etc. Such series of questions are now known as graduated frequency questionnaires (GF). Recent World Health Organization (2000) guidelines consider GF to be the method of choice.

GF seems to yield higher estimates of alcohol intake than QF. Data from the 1979 US National Alcohol Survey showed that GF produced 38% higher alcohol intake estimates than QF. This was partly due to a longer recall period (1 year vs 1 month) as well as to better reporting of heavy drinking occasions (Room, 1991). In a representative sample of nearly 4000 adults in Ontario, Canada, the prevalence of harmful drinking was almost three times higher for GF than for QF (Rehm et al., 1999). However, GFs have not been much used in research, and little is known about their accuracy (Greenfield, 2000; Leigh, 2000). As Rehm et al. (1999) pointed out, GF methods have at present only indirect support in terms of reliability and validity. In the present study, we have compared GF and QF with a 1-month daily alcohol intake diary. A daily diary was used as the ‘gold standard’ because, at least in motivated volunteers, the literature over many years has tended to consider this the most accurate way to collect data on alcohol intake from subjects living in their natural community settings.


    SUBJECTS AND METHODS
 
Subjects
Adults with various levels of alcohol intake were recruited from workplaces through advertisements. Alcoholics and abstainers were not eligible. All volunteers submitted a written informed consent and were advised that they were entitled to withdraw from the study at any time without a given reason. The subjects were not paid for taking part in the study, but, in order to increase motivation, had blood tests taken and were informed about the findings. Of the 52 volunteers, 43 were females. Mean age was 42 (range 20–63) years.

Alcohol intake
Average alcohol intake was estimated both before (time 1) and after the daily diary period (time 2) by applying two self-report instruments: a quantity–frequency questionnaire (QF1 and QF2 respectively) and a graduated frequency questionnaire (GF1 and GF2 respectively).

QF presented 10 drinking frequency options (never, 1–2 times per year, 3–4 times per year, approximately every second month, approximately once a month, approximately twice a month, once a week, 2–3 times a week, 4–5 times a week, 6–7 times a week), and nine quantity options for beer (including cider and pre-mixed cocktails), wine and spirits. The highest quantity option was open-ended (5 or more litres of beer, 1.5 l or more of wine, 1 l or more of spirits). Examples of questions are as follows: how often did you consume spirits during the past 12 months?; how much spirits did you usually consume during the days when you drank spirits?

GF allowed for eight ‘number of drink’ levels (from ‘15 or more drinks per day’ to ‘1–2 drinks per day’). A ‘drink’ was defined as a standard serving in a restaurant or bar with an alcohol content of 12 g each. Detailed instructions were issued to calculate the number of drinks in various glass sizes and bottle volumes for beer, wine and spirits. The subject then chose one of seven drinking frequency levels (never, not more than 3 times a month, approximately once a week, 2–3 times a week, 4–5 times a week, 6 times a week, practically every day) for each relevant ‘number of drink’ level. QF1 and GF1 were administered immediately before the beginning of the diary-keeping period; QF2 and GF2 were returned within a week after the end of the diary-keeping period. Record-keeping lasted 31 consecutive days. The subjects were instructed not to deviate in any way from their earlier drinking patterns.
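
As a rough illustration of how a GF response of this kind can be converted into a daily intake figure, the sketch below multiplies a midpoint for each ‘number of drink’ level by a number of drinking days per year and by the 12 g per drink stated above. The level midpoints, the value used for the open-ended top level, the days-per-year figure assigned to each frequency option and the function name `gf_summary` are all illustrative assumptions; they are not the scoring key used in the study.

```python
GRAMS_PER_DRINK = 12  # from the text: one standard drink contains 12 g of alcohol

# ASSUMPTION: midpoints (drinks per day) for the eight quantity levels;
# the open-ended top level is scored at its lower limit.
LEVEL_MIDPOINTS = {
    "1-2": 1.5, "3-4": 3.5, "5-6": 5.5, "7-8": 7.5,
    "9-10": 9.5, "11-12": 11.5, "13-14": 13.5, "15+": 15.0,
}

# ASSUMPTION: drinking days per year assigned to each frequency option.
DAYS_PER_YEAR = {
    "never": 0,
    "max 3 times a month": 18,
    "once a week": 52,
    "2-3 times a week": 130,
    "4-5 times a week": 234,
    "6 times a week": 312,
    "every day": 365,
}


def gf_summary(responses):
    """Return (grams of alcohol per day, drinking days per year).

    `responses` maps each reported quantity level to the chosen
    frequency option for that level.
    """
    total_days = 0
    total_grams = 0.0
    for level, frequency in responses.items():
        days = DAYS_PER_YEAR[frequency]
        total_days += days
        total_grams += days * LEVEL_MIDPOINTS[level] * GRAMS_PER_DRINK
    return total_grams / 365, total_days


# Hypothetical respondent: light drinking weekly, heavier drinking monthly.
example = {"1-2": "once a week", "5-6": "max 3 times a month"}
grams_per_day, drinking_days = gf_summary(example)
print(round(grams_per_day, 2), drinking_days)
```

Because frequencies are reported separately for each level, the implied drinking days can add up to more than 365 per year; returning the total alongside the intake makes such internally inconsistent responses easy to flag.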

Blood tests
A fasting venous blood sample was drawn after QF1 and GF1 had been completed, and serum was prepared and stored at –20°C until assayed. Carbohydrate-deficient transferrin (CDT) was measured quantitatively with the Axis-Shield immunoassay (Axis-Shield Asa, Oslo, Norway) and is reported here as %CDT, i.e. as a proportion of total transferrin. Serum aspartate aminotransferase (ASAT), alanine aminotransferase (ALAT) and γ-glutamyltransferase (GGT) were determined using established clinical chemical methods (Johnson & Johnson, Clinical Diagnostics, Rochester, NY, USA). These values are reported in units per litre (U/l).

Data
Of the 52 volunteers, seven did not complete GF1 or the daily diary, or did not give a blood sample. An additional seven failed to complete one or more items in QF1. Fourteen cases produced inadequate data in QF2 and eight in GF2. Complete alcohol intake data were available for 34 subjects (28 females). One of the cases in this group reported very low alcohol intake in both QF and GF and no intake in the daily diary. Incomplete data were observed for 18 subjects (15 females). No significant (P < 0.05) differences were found between the cases with and without complete alcohol data in terms of sex, age (mean: 44 vs 40 years), quantity–frequency estimate at time 1 (13 vs 11 g/day), graduated frequency estimate at time 1 (31 vs 22 g/day), ASAT (22 vs 25 U/l), ALAT (28 vs 31 U/l), GGT (29 vs 26 U/l) or CDT (4.2 vs 3.8%). The χ²-test was applied for categorical data and a t-test for continuous data.


    RESULTS
 
Complete data were available for 34 subjects (Table 1).


 
Table 1. Mean values and ranges for the diary, quantity–frequency (QF) and graduated frequency (GF) estimates of alcohol intake (g/day), before (1) and after (2) diary period, and serum activities (U/l) of aspartate aminotransferase (ASAT), alanine aminotransferase (ALAT), γ-glutamyltransferase (GGT), and percentage (% of total transferrin) of carbohydrate-deficient transferrin (CDT) among 34 subjects
 
The graduated frequency questionnaire yielded consistently higher alcohol intake estimates than the daily diary. The mean values for GF alcohol intake were twice the daily diary estimate, whereas those for the QF intake estimate were slightly lower than the daily diary estimate (Table 1). There were no obvious errors or inconsistencies in the questionnaire responses that would explain these results, except that five persons reported more than 365 drinking days in the GF, which covered the previous 12 months.

When the daily diary estimate was compared with the quantity–frequency and graduated frequency estimates of alcohol intake, both before (1) and after (2) the daily record-keeping period, all estimates showed clear linear associations in scatterplots. There were no obvious outliers. All correlations between the five estimates were high (Table 2). Comparison between time 1 and time 2 showed that the test–retest reliability of both QF and GF was good.


 
Table 2. Pearson correlation coefficients between the diary, quantity–frequency (QF) and graduated frequency (GF) estimates of alcohol intake, before (1) and after (2) the diary period among 34 subjects
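
A high Pearson correlation shows that two methods rank subjects alike, but it says nothing about whether their mean levels agree. The following minimal sketch, which uses invented intake values and a hypothetical helper `pearson_r`, illustrates this: a questionnaire that exactly doubles every diary value still correlates almost perfectly with the diary.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# HYPOTHETICAL data (g/day), not taken from the study.
diary = [2.0, 5.0, 9.0, 14.0, 30.0]
questionnaire = [2 * g for g in diary]  # every value overestimated 2-fold

r = pearson_r(diary, questionnaire)
mean_ratio = (sum(questionnaire) / len(questionnaire)) / (sum(diary) / len(diary))
print(r, mean_ratio)  # r is essentially 1 although the mean is doubled
```

This is why agreement in mean level has to be examined separately from the correlation coefficients.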
 
Serum ASAT, ALAT and GGT levels correlated closely (P < 0.001) with the alcohol intake estimates, whereas CDT values correlated poorly (Table 3). For each biological marker, the correlation coefficients with the different self-reported alcohol intake measures were of similar magnitude (Table 3). When correlations between GF1, daily diary alcohol intake, ASAT, ALAT, GGT and CDT were studied among the 45 cases with complete data for these variables, the results did not materially differ from those described above.


 
Table 3. Pearson correlation coefficients of serum aspartate aminotransferase (ASAT), alanine aminotransferase (ALAT), γ-glutamyltransferase (GGT) and carbohydrate-deficient transferrin (CDT) values with the diary, quantity–frequency (QF) and graduated frequency (GF) estimates of alcohol intake, before (1) and after (2) the diary period among 34 subjects
 
The overestimation by GF was reflected in both the frequency and amount of drinking, and it was most obvious for heavy (≥5 drinks per day) and frequent (once a week or more often) drinking occasions (Table 4). When the two cases with the highest difference between GF1 and daily diary alcohol intake were removed from the data, the magnitude of the overestimation by GF1 remained essentially the same. Judging from laboratory test data (GGT 127 and 74 U/l), both these subjects were heavy drinkers. When all cases with more than 365 drinking days in the GF were removed, the magnitude of overestimation remained at the same level (mean 2.35).


 
Table 4. Ratio of GF1-classified drinking occasions to diary-classified occasions by frequency and quantity of drinking (number of occasions in parentheses)
 
Overestimation by GF did not depend significantly on the actual alcohol intake measured by the daily diary. When alcohol intake measured by the daily diary was regressed on the ratio of GF1 intake to diary intake, the regression coefficient did not deviate significantly from zero (P = 0.4; b = –0.022; 95% CL for b: –0.069, +0.025). Thus, the average 2.2-fold overestimation appears to hold independently of the actual consumption level.
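
The logic of this check can be sketched with an ordinary least-squares fit. In the sketch below the GF/diary ratio is treated as the outcome and diary intake as the predictor, which is an assumption about the orientation of the model; the data are invented and `ratio_slope` is a hypothetical helper. A slope near zero means that the ratio does not change with the consumption level.

```python
def ratio_slope(diary, gf):
    """OLS slope and intercept of the GF/diary ratio on diary intake.

    Subjects with zero diary intake are skipped because the ratio is
    undefined for them.
    """
    pairs = [(d, g / d) for d, g in zip(diary, gf) if d > 0]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# HYPOTHETICAL data (g/day): GF is 2.2 times the diary value for every
# subject with non-zero diary intake.
diary = [0.0, 2.0, 5.0, 9.0, 14.0, 30.0]
gf = [1.0, 4.4, 11.0, 19.8, 30.8, 66.0]

slope, intercept = ratio_slope(diary, gf)
print(slope, intercept)  # slope close to 0, intercept close to 2.2
```

With real data the slope would be accompanied by a confidence interval, as in the figures reported above.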


    DISCUSSION
 
We found that both QF and GF correlated closely with daily recording of alcohol intake and that the test–retest repeatability of both QF and GF was good. Compared with diary-based alcohol intake, the QF questionnaire yielded slightly lower intake estimates and the GF questionnaire estimates approximately twice as high. Overestimation of intake by the GF questionnaire was greater for frequent and heavy drinking than for infrequent or light drinking.

In theory, a daily diary kept by motivated volunteers is the best available way to collect comprehensive and accurate data on the many dimensions of alcohol intake from subjects living in their natural community settings. However, much depends on the motivation and conscientiousness of the volunteers. We offered laboratory tests free of charge before the diary period in the hope that this would increase motivation, both because the tests might be seen as a small compensation for participation and because some participants might regard the tests as a way to check the truthfulness of their responses. This was not entirely successful, however, since 35% of the volunteers did not complete the diary period. Perhaps they were motivated more by the free tests than by altruism. As we have no way to check the accuracy of the responses by the study completers, the possibility remains that their responses did not fully meet the ‘gold standard’ accuracy. However, the credibility of these responses is supported by the finding that all alcohol questionnaire measures showed close correlation with GGT, an objective blood test known to be associated with alcohol intake (Poikolainen et al., 1985), and slightly less close correlations with ASAT and ALAT. The poor correlation between alcohol intake and CDT might be at least partly related to the observation that this association is weaker among women than men (Allen et al., 2000), as most of our subjects were women.

To explore the possible causes of overestimation by GF, daily diary intake was compared with GF1 intake. The rationale for this choice was that diary-keeping could not have influenced GF1, since it was filled in before the start of the daily diary. The fact that GF1 and GF2 correlated closely and yielded similar mean and SD values suggests that record-keeping did not influence the responses to the later GF2. This agreement, together with the close similarity of QF1 and QF2 responses, points to the conscientiousness and precision of the study subjects in reporting their alcohol intake.

GF yielded clearly higher alcohol intake estimates than the diary. This might be due to (a) underestimation of actual intake by the diary or (b) overestimation of intake by GF. Weekday variation might bias the diary intake figures downwards if it was considerable and high-intake days were under-represented in the data. According to the diaries, our subjects consumed more on Fridays, Saturdays and Sundays than during the rest of the week. However, all subjects reported on four weekends except one whose diary included five weekends. Moreover, most of the 31-day diary periods included May Day, a traditional revelling festival in Finland. Therefore, it is unlikely that diary intake was underestimated. Overestimation by GF is more probable.

Some overestimation may also occur, if the actual number of drinks consumed is closer to the lower class limit than the category average (the latter is assumed in calculations). In our GF, an odd number of drinks was always the lower and an even number the higher option within a response category. Of all drinking days, 56% yielded an odd number of drinks. Thus, deviations from the category averages had little influence on the overestimation in the present data.
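
The size of this category-midpoint effect can be gauged with simple arithmetic. The sketch below assumes, for illustration only, that the 56% share of odd-numbered (lower-limit) drinking days applies equally within every two-drink response category; the helper `midpoint_inflation` is hypothetical.

```python
ODD_SHARE = 0.56  # from the text: share of drinking days with an odd number of drinks


def midpoint_inflation(lower, upper, lower_share=ODD_SHARE):
    """Ratio of the category midpoint to the actual mean number of drinks.

    ASSUMPTION: within the category, a fraction `lower_share` of drinking
    days falls on the lower limit and the remainder on the upper limit.
    """
    midpoint = (lower + upper) / 2
    actual_mean = lower_share * lower + (1 - lower_share) * upper
    return midpoint / actual_mean


for lower in (1, 3, 5):
    factor = midpoint_inflation(lower, lower + 1)
    print(f"{lower}-{lower + 1} drinks: midpoint/actual = {factor:.3f}")
```

Under this assumption the inflation is largest in the lowest category, at roughly 4%, and shrinks as the number of drinks rises, which is far too small to account for a 2-fold difference.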

GF is a family of questionnaires, and versions with many drink level categories may inflate the magnitude of intake estimates. Our GF contained as many as eight drink level categories. Earlier ‘thinking aloud’ analyses of similar GF versions suggest that some heavy drinkers, having much to report on the high quantity level items at the beginning of the GF, may feel obliged to find something to report at lower levels as well. Some may reinterpret the questions, for example thinking that to have only ‘one to two drinks’ means at least ‘one to two drinks’ (Greenfield, 2000). Because of these problems, GF versions with five drink level categories have been used recently.

A study of 83 volunteers from the San Francisco Bay area found that the average recalled past 30-day alcohol intake measured by a mailed graduated frequency questionnaire was 96% of the respective intake measured by weekly retrospective diaries (Hilton, 1989). The respective proportion for a mailed past 14-day quantity–frequency questionnaire was 95%. The intake estimates by all three methods were closely correlated. Good agreement between the various methods might be related to four factors. First, the GF had only five drink level categories. Secondly, the sample consisted of relatively heavily drinking young adults. Thirdly, the daily diary data came from the retrospective recall of drinking during the past 7 days and, fourthly, mail questionnaires were answered after the 10-week daily diary period. In contrast, larger underreporting has been found in a study in which the questionnaire was answered before the daily diary period (Poikolainen and Kärkkäinen, 1983).

In conclusion, the results suggest that, when the respondents are motivated to recall their alcohol intake, both QF and GF classify individuals in the correct rank order according to their actual alcohol intake. GF data seem to overestimate markedly actual alcohol consumption, whereas QF data slightly underestimate it.


    ACKNOWLEDGEMENTS
 
The authors thank Dr Tom Greenfield and the anonymous reviewers for valuable comments.


    FOOTNOTES
 
* Author to whom correspondence should be addressed at: Finnish Foundation for Alcohol Studies, P. O. Box 220, FIN-00531 Helsinki, Finland.


    REFERENCES
 
Allen, J. P., Litten, R. Z., Fertig, J. B. and Sillanaukee, P. (2000) Carbohydrate-deficient transferrin, gamma-glutamyltransferase, and macrocytic volume as biomarkers of alcohol problems in women. Alcoholism: Clinical and Experimental Research 24, 492–496.

Greenfield, T. K. (2000) Ways of measuring drinking patterns and the difference they make: experience with graduated frequencies. Journal of Substance Abuse 12, 33–49.

Hilton, M. A. (1989) Comparison of prospective diary and two summary recall techniques for recording alcohol consumption. British Journal of Addiction 84, 1085–1092.

Leigh, B. C. (2000) Using daily reports to measure drinking and drinking patterns. Journal of Substance Abuse 12, 51–65.

Pernanen, K. (1974) Validity of survey data on alcohol use. In Research Advances in Alcohol and Drug Problems, Gibbins, R. J., Israel, Y., Kalant, H., Popham, R. E., Schmidt, W. and Smart, R. G. eds, pp. 355–374. Wiley, New York.

Poikolainen, K. (1985) Underestimation of recalled alcohol intake in relation to actual consumption. British Journal of Addiction 80, 215–216.

Poikolainen, K. and Kärkkäinen, P. (1983) Diary gives more accurate information about alcohol consumption than questionnaire. Drug and Alcohol Dependence 11, 209–216.

Poikolainen, K., Kärkkäinen, P. and Pikkarainen, J. (1985) Correlations between biological markers and alcohol intake as measured by diary and questionnaire in men. Journal of Studies on Alcohol 46, 383–387.

Polich, J. (1981) Epidemiology of alcohol abuse in military and civilian populations. American Journal of Public Health 71, 1125–1132.

Rehm, J., Greenfield, T. K., Walsh, G., Xie, X., Robson, L. and Single, E. (1999) Assessment methods for alcohol consumption, prevalence of high risk drinking and harm: a sensitivity analysis. International Journal of Epidemiology 28, 219–224.

Room, R. (1991) Measuring alcohol consumption in the U.S.: methods and rationales. In Alcohol in America: Drinking Practices and Problems, Clark, W. B. and Hilton, M. E. eds, pp. 26–50. State University of New York Press, Albany, NY.

Simpura, J. ed. (1987) Finnish Drinking Habits: Results from Interview Surveys Held in 1968, 1976 and 1984. Finnish Foundation for Alcohol Studies, vol. 35, Helsinki.

Uchalik, D. (1979) A comparison of questionnaire and self-monitored reports of alcohol intake in a nonalcoholic population. Addictive Behaviours 4, 409–413.

World Health Organization (2000) International Guidelines for Monitoring Alcohol Consumption and Harm. Department of Mental Health and Substance Abuse, WHO, Geneva.