1 Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, Columbia, SC.
2 South Carolina Cancer Center, Columbia, SC.
3 Division of General Internal Medicine and Vanderbilt-Ingram Cancer Center, Vanderbilt University, Nashville, TN.
4 Division of Endocrinology, Children's Hospital and Harvard Medical School, Boston, MA.
5 South Carolina Comprehensive Breast Center, Palmetto Health, Columbia, SC.
6 Cancer Prevention and Control Program, Hollings Cancer Center, Charleston, SC.
Received for publication April 27, 2004; accepted for publication October 4, 2004.
ABSTRACT
energy metabolism; exercise; monitoring, physiologic; motor activity; social desirability; social environment
INTRODUCTION
Certain personality traits may affect self-reporting of physical activity. The traits of "social desirability" and "social approval" have been found to influence participants' reports of diet (25). "Social desirability" is the defensive tendency of individuals to portray themselves in keeping with perceived cultural norms, whereas "social approval" is the need to obtain a positive response in a testing situation (4). People who score higher on the social desirability scale, especially women, have been found to be more likely to underreport their fat and total energy intake (25).
To extend our understanding of systematic errors in self-reports of physical activity, we designed the present investigation to compare three self-reported physical activity assessment approaches commonly used in epidemiologic and clinical studies with objective measures of physical activity and to test for systematic errors that can be ascribed to social desirability and social approval. Our objective criterion measures were physical activity energy expenditure estimated from doubly labeled water and estimated resting energy expenditure, as well as intensity-specific activity duration derived by means of the ActiGraph accelerometer (Manufacturing Technology, Inc., Fort Walton Beach, Florida).
MATERIALS AND METHODS
Study timeline
At the day 0 visit, a fasting urine sample and anthropometric measurements were obtained. Completed questionnaires (mailed 1 week previously) included information on demographic factors, lifestyle factors, and general health, as well as the Marlowe-Crowne Social Desirability Scale (6) and the Martin-Larson Approval Motivation Scale (7). An ActiGraph accelerometer was provided to each participant, along with detailed usage instructions. Each participant was randomly assigned one of the two types of 7-day physical activity recalls (PARs), with instructions to complete the instrument on the evening of day 6. Over the next 14 days (days 1–14), seven telephone-administered 24-hour PARs and dietary recalls were obtained, such that one recall was obtained for each day of the week. On day 7, another nonfasting urine sample and weight measurement were obtained and the first 7-day PAR questionnaire was collected. All participants were then given the other 7-day PAR, with instructions to complete the instrument on the evening of day 13. On day 14, participants provided a final nonfasting urine sample. The surveys and ActiGraph data were then collected, and anthropometric measurements were obtained.
Study assessments
Doubly labeled water
Doubly labeled water (²H₂¹⁸O) was used to assess total energy expenditure, based on an individual's clearance of stable (i.e., nonradioactive) hydrogen (deuterium) and oxygen (¹⁸O) isotopes administered orally as water. A more detailed description of this assessment method can be found elsewhere (3).
Resting metabolic rate
Resting metabolic rate is the primary determinant of total energy expenditure. To estimate physical activity energy expenditure, it was necessary to estimate each person's resting metabolic rate (8–10). The equation developed by Arciero et al. (11) in older women was used for this purpose (resting metabolic rate (kcal/day) = 21 × fat-free mass (kg) + 369). Values from this fat-free mass-based equation are highly correlated with measured resting metabolic rate (R² = 0.79; standard error, 46 kcal/day) (11). Fat-free mass was quantified using doubly labeled water-derived total-body water data, assuming a hydration constant of 0.73 (12).
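The two steps in this paragraph (total-body water to fat-free mass, then fat-free mass to resting metabolic rate) can be sketched as follows. This is a minimal illustration only: the function names and the example total-body water value are hypothetical, and the hydration constant is assumed to be applied as fat-free mass = total-body water / 0.73.

```python
HYDRATION_CONSTANT = 0.73  # assumed water fraction of fat-free mass (value from the text)


def fat_free_mass_kg(total_body_water_kg: float) -> float:
    """Fat-free mass (kg) from total-body water (kg), assuming constant hydration."""
    return total_body_water_kg / HYDRATION_CONSTANT


def resting_metabolic_rate_kcal_day(ffm_kg: float) -> float:
    """Prediction equation quoted in the text: RMR (kcal/day) = 21 x FFM (kg) + 369."""
    return 21.0 * ffm_kg + 369.0


# Hypothetical participant with 32.0 kg of total-body water (not study data).
ffm = fat_free_mass_kg(32.0)
rmr = resting_metabolic_rate_kcal_day(ffm)
print(f"FFM = {ffm:.1f} kg, RMR = {rmr:.0f} kcal/day")
```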
Physical activity energy expenditure
As suggested by Schoeller and Jefford (13), physical activity energy expenditure estimated from doubly labeled water (PAEE_DLW) was calculated (kcal/kg/day) as follows: PAEE_DLW = (total energy expenditure − resting metabolic rate)/body mass (kg).
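As a worked illustration of this formula, the sketch below computes physical activity energy expenditure per kilogram of body mass. The function name and the input values are hypothetical and are not taken from the study data.

```python
def paee_dlw(tee_kcal_day: float, rmr_kcal_day: float, body_mass_kg: float) -> float:
    """Physical activity energy expenditure (kcal/kg/day).

    Implements (total energy expenditure - resting metabolic rate) / body mass,
    as stated in the text.
    """
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    return (tee_kcal_day - rmr_kcal_day) / body_mass_kg


# Hypothetical values: TEE 2,300 kcal/day, RMR 1,290 kcal/day, body mass 69.7 kg.
example = paee_dlw(2300.0, 1290.0, 69.7)
print(f"PAEE = {example:.2f} kcal/kg/day")
```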
ActiGraph
A uniaxial ActiGraph accelerometer (formerly Computer Science Applications model 7164) was used to assess motion over the 14-day study period. This small, lightweight instrument detects acceleration from 0.05g to 2g while rejecting other forms of movement such as vibration (14). The acceleration signal is filtered by an analog band-pass filter and digitized by an 8-bit analog/digital converter at a sampling rate of 10 samples per second, storing data at 1-minute intervals (15). ActiGraph data are summarized in counts per minute and have demonstrated reasonable validity and reliability for the evaluation of physical activity behaviors against a variety of criterion measures from direct observation to self-report diaries (16–18).
The following labels and count cutpoints were used to determine the duration (minutes/day) of time spent in activities of various levels: inactivity, 0–259 counts/minute; light activity, 260–759 counts/minute; moderate activity, 760–5,274 counts/minute; and vigorous activity, ≥5,275 counts/minute. To ensure the integrity of the data, we used a time on/off diary and an automated review of monitor wear to identify periods of noncompliance. Data were excluded if sustained periods of zero counts or sustained periods of improbably high counts (>30,000 counts/minute), indicating accelerometer malfunction, were noted.
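The cutpoints above amount to a simple per-minute classification. The sketch below is illustrative only: the function names are hypothetical, and because the text does not define how long a "sustained" period is, the `run_length` default of 10 minutes is purely an assumption.

```python
MALFUNCTION_THRESHOLD = 30_000  # counts/minute, from the text


def classify_minute(counts: int) -> str:
    """Map one minute of ActiGraph counts to an intensity label."""
    if counts < 0:
        raise ValueError("counts cannot be negative")
    if counts <= 259:
        return "inactivity"
    if counts <= 759:
        return "light"
    if counts <= 5274:
        return "moderate"
    return "vigorous"


def minutes_by_intensity(minute_counts):
    """Tally the number of minutes spent at each intensity level."""
    tally = {"inactivity": 0, "light": 0, "moderate": 0, "vigorous": 0}
    for counts in minute_counts:
        tally[classify_minute(counts)] += 1
    return tally


def has_sustained_high(minute_counts, run_length=10):
    """True if `run_length` consecutive minutes exceed the malfunction threshold.

    `run_length` is a hypothetical choice; the text does not specify it.
    """
    run = 0
    for counts in minute_counts:
        run = run + 1 if counts > MALFUNCTION_THRESHOLD else 0
        if run >= run_length:
            return True
    return False


print(minutes_by_intensity([0, 100, 300, 800, 6000, 6000]))
```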
7-day PAR 1
The first 7-day PAR was a self-administered early version of the Stanford Five-City Project's 7-day recall (19). It asked participants to report their amounts of sleep and moderate and vigorous physical activity for the previous five weekdays and two weekend days. Moderate, vigorous, and very vigorous activities were assessed, and examples of occupational, household, and leisure activities were provided for these intensity levels. Time spent in light activities was calculated by subtracting the total time for all other activities and sleep from 24 hours. Physical activity energy expenditure (not including sleep) was calculated using reports of duration for each activity intensity level and the following metabolic equivalent (MET) weights: light activity, 1.5 METs; moderate activity, 4.0 METs; vigorous activity, 6.0 METs; and very vigorous activity, 8.0 METs. Physical activity energy expenditure was calculated in terms of kcal/kg/day (1 MET-hour ≈ 1 kcal/kg) using standard methods (20). The average durations (minutes/day) of light, moderate, and vigorous (≥6 METs) activities were also calculated.
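The MET-weighted calculation described for 7-day PAR 1 can be sketched as below. This is an assumption-laden illustration: the function name and example hours are hypothetical, and the sketch sums gross MET-hours without any adjustment for resting expenditure, which may differ from the "standard methods" cited in the text.

```python
MET_WEIGHTS = {"light": 1.5, "moderate": 4.0, "vigorous": 6.0, "very_vigorous": 8.0}


def par_energy_expenditure(sleep_h, moderate_h, vigorous_h, very_vigorous_h):
    """Physical activity energy expenditure (kcal/kg/day), excluding sleep.

    Light activity is the remainder of the 24-hour day after sleep and all
    reported activity, and 1 MET-hour is treated as about 1 kcal/kg.
    """
    light_h = 24.0 - sleep_h - moderate_h - vigorous_h - very_vigorous_h
    if light_h < 0:
        raise ValueError("reported hours exceed 24 hours/day")
    hours = {
        "light": light_h,
        "moderate": moderate_h,
        "vigorous": vigorous_h,
        "very_vigorous": very_vigorous_h,
    }
    return sum(MET_WEIGHTS[level] * h for level, h in hours.items())


# Hypothetical average day: 8 h sleep, 1 h moderate, 0.5 h vigorous activity.
print(par_energy_expenditure(8.0, 1.0, 0.5, 0.0))
```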
7-day PAR 2
The second 7-day PAR was developed for use in this investigation and was modeled after the approach used in the Minnesota Leisure Time Physical Activity Survey to assess the frequency (per week) and duration (per day) of activity. The 7-day PAR 2 was expanded to capture six domains of activity and focused on the previous 7 days. The activity domains evaluated were household (indoor), household (outdoor), child-care, occupational and volunteer, leisure and sport, and miscellaneous. Each activity domain contained a list of 5–41 common activities; respondents were asked to report the number of days and average amount of time per day spent in each activity. The miscellaneous category included six sedentary activities. MET estimates for each line item were made on the basis of example activities provided in the text of the instrument (20, 21). As described above for 7-day PAR 1, physical activity energy expenditure (kcal/kg/day) and overall and intensity-specific activity durations (minutes/day) were calculated.
24-hour PAR
The 24-hour PAR was administered by trained interviewers, either immediately before or immediately after the 24-hour dietary recall (as chosen by the participant). The method employed has been previously described by Matthews et al. (22) and has been shown to have reasonable relative validity for assessment of short-term physical activity energy expenditure using the Baecke questionnaire and activity monitoring as criterion measures. Briefly, participants were asked, for the previous day, to recall the amount of time they had spent in bed and the amount of time they had spent in light, moderate, vigorous, and very vigorous activities in each of three activity domains (household, occupational, and leisure). Domain-specific example activities were provided for each activity intensity. As described above for 7-day PAR 1, physical activity energy expenditure (kcal/kg/day) and overall and intensity-specific activity durations (minutes/day) were calculated. The average of the seven recalls was used for analyses.
Social desirability and social approval
The 33-item Marlowe-Crowne Social Desirability Scale was used to ascertain a participant's tendency "to avoid criticism" and present herself in a favorable social image (6). The 20-item Martin-Larson Approval Motivation Scale was used to assess the social approval trait (7). Both scales have been shown to have good validity and reliability over time and were administered only at baseline (6, 7).
Statistical methods
Complete data from doubly labeled water measurements were available for 80 of the 81 women recruited into the study. SAS, version 8.1 (SAS Institute, Inc., Cary, North Carolina), was used for all analytic procedures (23). Descriptive statistics were computed for all variables. Student's t tests were used to assess differences in mean physical activity energy expenditure as estimated from doubly labeled water and each survey instrument. Continuous variables were assessed for violations of linear model assumptions, including nonnormality. Spearman correlation coefficients were used to assess the rank correlation among the energy expenditure estimates, ActiGraph counts, social approval scores, social desirability scores, and various other potentially confounding or effect-modifying variables. Social desirability and social approval scores were plotted against the difference in physical activity energy expenditure between each instrument and doubly labeled water measurements. We constructed Bland-Altman plots to compare each survey instrument with doubly labeled water assessments. To assess the degree of bias from social desirability or social approval, we fitted regression models using the GLM procedure (PROC GLM) in SAS, with the difference in physical activity energy expenditure between the self-reported measure of interest and doubly labeled water as the dependent variable (24). We included the social approval and social desirability scores simultaneously as independent variables. The regression coefficient for the social desirability or social approval score reflects the degree of bias, with a positive beta coefficient indicating overestimation of energy expenditure on the self-report instrument and a negative beta coefficient indicating underestimation of energy expenditure.
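The bias model described above (difference in energy expenditure regressed on both trait scores at once) can be illustrated with a small ordinary least squares sketch. This is not the SAS analysis itself: the function names are hypothetical, the data are invented so that the true coefficients are known, and the solver is a plain normal-equations implementation chosen only to keep the example self-contained.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular (collinear predictors?)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(n):
            if i != col:
                factor = aug[i][col] / aug[col][col]
                aug[i] = [x - factor * p for x, p in zip(aug[i], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_bias_model(desirability, approval, difference):
    """Least-squares fit of: difference = b0 + b_sd * desirability + b_sa * approval.

    `difference` is self-reported minus doubly labeled water energy expenditure,
    so a positive slope indicates overestimation on the self-report instrument.
    """
    rows = [(1.0, sd, sa) for sd, sa in zip(desirability, approval)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, difference)) for i in range(3)]
    return tuple(solve_linear_system(xtx, xty))


# Invented scores and differences (kcal/kg/day), generated from known coefficients.
sd_scores = [10, 15, 20, 12, 18]
sa_scores = [40, 55, 45, 60, 50]
diffs = [1.0 + 0.5 * sd - 0.2 * sa for sd, sa in zip(sd_scores, sa_scores)]
intercept, beta_sd, beta_sa = fit_bias_model(sd_scores, sa_scores, diffs)
print(f"beta_sd = {beta_sd:.3f}, beta_sa = {beta_sa:.3f}")
```

In this invented example, the positive desirability slope would be read as overestimation that grows with the social desirability score, and the negative approval slope as underestimation.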
Previous findings in the literature on diet and physical activity have shown evidence for effect modification by education and body mass index (weight (kg)/height (m)²) (1, 3). Thus, the social approval/social desirability models were stratified by educational status (less than a college education, college education or higher) and body mass index (<27, ≥27). The cutpoint used for body mass index stratification was the median value for the study population. The cutpoint for education was based on prior work from this study (3). Confounding by body mass index, educational status, menopausal status, and age was assessed.
Similar analyses were performed by regressing the difference in duration of activity between the self-reported measure (24-hour PAR, 7-day PAR 1, or 7-day PAR 2) and the ActiGraph measure on social desirability or social approval score. Duration models were stratified by intensity of activity (light, moderate, or vigorous). For analyses involving the 24-hour PAR (n = 72), only those ActiGraph data corresponding to the same day as the 24-hour PAR were included. For the 7-day PAR analyses, subjects were included if they had at least 3 days of ActiGraph data from the observation period of the 7-day PAR (for 7-day PAR 1, n = 68; for 7-day PAR 2, n = 71). The outcomes modeled for these analyses represent the difference in average daily physical activity between each instrument and the ActiGraph (difference = average minutes/day from the instrument minus average minutes/day from the ActiGraph).
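The outcome definition and the 3-day inclusion rule for the 7-day PAR analyses can be expressed compactly. In this sketch the function name and the example minutes are hypothetical; returning `None` for excluded participants is simply one convenient convention.

```python
MIN_VALID_DAYS = 3  # minimum days of ActiGraph data for the 7-day PAR analyses


def duration_difference(reported_min_per_day, actigraph_daily_minutes):
    """Instrument minus ActiGraph average minutes/day for one intensity level.

    Returns None when fewer than MIN_VALID_DAYS of ActiGraph data are available,
    mirroring the inclusion rule described in the text.
    """
    if len(actigraph_daily_minutes) < MIN_VALID_DAYS:
        return None
    actigraph_mean = sum(actigraph_daily_minutes) / len(actigraph_daily_minutes)
    return reported_min_per_day - actigraph_mean


# Hypothetical participant: 60 reported minutes/day of moderate activity.
print(duration_difference(60.0, [30.0, 40.0, 50.0]))  # positive = overreporting
print(duration_difference(60.0, [30.0, 40.0]))        # too few days: excluded
```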
RESULTS
The Bland-Altman plots for the comparison of each study instrument with doubly labeled water measures are depicted in figures 1, 2, and 3. As evidenced by these graphs, both 7-day PAR 1 and 7-day PAR 2 demonstrated proportional error. No instrument demonstrated an absolute systematic bias.
DISCUSSION
To more fully quantify the magnitude of the effect of systematic reporting errors on physical activity energy expenditure values, we calculated the possible range of systematic error using our regression results, the interquartile range of social desirability among women in this study (interquartile range = 7), and the average body mass of women in this study (69.7 kg). Calculations based on the average body mass and the interquartile range would result in a 317-kcal/day (4.55-kcal/kg/day) overestimation of physical activity on 7-day PAR 2 for women in the 75th percentile of social desirability score as compared with the 25th percentile. Increased social desirability score also was associated with a systematic overestimation of duration of activity for light activity (7-day PAR 2 only) and moderate activity (both 7-day PAR 1 and 7-day PAR 2). Using the same interquartile range for social desirability, the durations of light and moderate activity would be overreported by 79 minutes/day and 29 minutes/day, respectively, on 7-day PAR 2.
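The conversion between the per-kilogram and absolute estimates in this paragraph is simple arithmetic, sketched below. The per-point coefficient shown is back-calculated here from the quoted interquartile-range effect (4.55 kcal/kg/day over 7 scale points) and is an assumption for illustration, not a value reported in this section; the variable and function names are likewise hypothetical.

```python
IQR_SOCIAL_DESIRABILITY = 7        # scale points, from the text
MEAN_BODY_MASS_KG = 69.7           # from the text
IQR_EFFECT_KCAL_PER_KG_DAY = 4.55  # from the text


def to_kcal_per_day(kcal_per_kg_day: float, body_mass_kg: float) -> float:
    """Convert a per-kilogram estimate to absolute kcal/day."""
    return kcal_per_kg_day * body_mass_kg


# Implied bias per scale point (back-calculated, not reported in this section).
implied_beta = IQR_EFFECT_KCAL_PER_KG_DAY / IQR_SOCIAL_DESIRABILITY

# Absolute overestimate across the interquartile range for an average-mass woman.
absolute_kcal_day = to_kcal_per_day(IQR_EFFECT_KCAL_PER_KG_DAY, MEAN_BODY_MASS_KG)

print(f"implied beta = {implied_beta:.2f} kcal/kg/day per point")
print(f"overestimate = {absolute_kcal_day:.0f} kcal/day")
```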
The greater social desirability effect observed on 7-day PAR 2 compared with 7-day PAR 1 could be a result of the survey format. The first 7-day PAR was much less structured than the second and simply asked respondents to report the total amount of time spent in moderate, vigorous, and very vigorous activities. The participant was not queried directly about light activity. Time spent in light activity was calculated from reports of sleep and moderate-to-vigorous activity. These findings are consistent with our experience with dietary data in that biases may be more concentrated in response to more structured questionnaires (3, 25).
In contrast to 7-day PAR 1, 7-day PAR 2 was structured such that activities were grouped by activity domain and intensity in an effort to systematically assess the full range of activities encountered in daily living. On the basis of our examination of reported activity durations, overreporting appeared to occur for both light and moderate activities. For these women, household activities comprised the majority of reported time spent in light activity (70 percent). There are at least two possible explanations for this social desirability effect. First, the effect could be mediated through the societal norm for women to be "good caretakers" of the home. Second, the bias may have been expressed more strongly in reports of highly prevalent routine light- and moderate-intensity activities, because it may be that persons who are prone to overreporting may inflate reports of activities they engage in regularly rather than overreport activities they engage in less frequently or not at all. For example, when asked about their past week of both household and exercise-related activities in the structured survey, women with high social desirability scores may have reported spending more time in household activity, particularly when faced with leaving the leisure and sports sections empty. In contrast, on 7-day PAR 1, women were asked about all domains of activity; thus, less emphasis was placed on the types of activities performed. Further investigation of these initial findings appears to be warranted.
Our finding of a marginally significant negative bias associated with social approval on the 24-hour PAR was unexpected. We originally hypothesized that women with higher social approval scores would want to "please" study staff by reporting relatively high levels of activity. The stratified analyses suggested that this effect may have been concentrated among women with a body mass index of 27 or higher. Future research is needed to replicate this finding and to attempt to differentiate the possible influence of the interviewer's presence in eliciting reporting bias.
Other investigators have attempted to characterize reporting error by participant demographic characteristics such as age, body fat, and physical activity level (26). Irwin et al. (26) reported a significant correlation between body mass index, percentage of body fat, and reporting error for physical activity records but not for a 7-day PAR. Similarly, in our investigation, we did not observe a significant independent association of age, body mass index, or menopausal status with our 7-day PAR. Our finding of no independent association between body mass index and the 24-hour PAR, which is akin to physical activity records, further emphasizes the need to investigate reporting biases in the use of short-term recall methods.
This investigation had a number of limitations that should be considered when interpreting its findings. The study population was heavily scrutinized, with some type of contact being made at least 3–4 times per week and multiple activity assessments being completed by each subject. With this amount of observation, reporting accuracy in this population may have been greater than usual. In this case, the relation between social desirability or social approval and reporting accuracy may have been attenuated. In addition, the study population was composed predominantly of European-American women, thereby limiting the applicability of these findings to men and to minority populations.
In comparing the 24-hour PAR with the 7-day PAR, it was not possible to differentiate between interviewer effects and the effect of recall interval (i.e., the past 24 hours vs. the past 7 days). Future studies should be designed to separate the effect of recall interval from that of mode of administration on reporting errors. In addition, our reliance on estimated resting energy expenditure values in our calculation of energy expenditure from doubly labeled water certainly resulted in the introduction of some error in our criterion measure. To minimize this loss of precision in our doubly labeled water energy expenditure values, we employed a prediction equation that used our measured lean body mass values derived from the doubly labeled water procedure. The most likely effect of this loss of precision would be a loss of statistical power in our analyses and attenuation of the effects observed.
Similarly, use of the ActiGraph as the criterion measure for the duration analyses may have introduced some bias into the results. While it demonstrates good relative validity against a variety of criterion measures from direct observation to activity diaries (15, 18, 27), there are some activities that ActiGraph activity counts cannot adequately capture (e.g., bicycling or weight-lifting). Consequently, the reporting differences in durations may actually be smaller than calculated; that is, the ActiGraph may not record activity that the participant actually engages in and might ultimately report. Nevertheless, use of the ActiGraph remains one of the few feasible ways to objectively estimate the intensity and duration of physical activity in free-living populations.
This investigation also had several strengths that should be considered. This is one of the first studies to quantify the direct effect of social approval and social desirability on physical activity energy expenditure and duration in a relatively large group of women. The combined use of doubly labeled water and the ActiGraph as criterion measures enabled us to evaluate both bias in absolute physical activity energy expenditure and bias in different activity intensities.
Although this work is an important first step in examining specific sources of bias in self-reported physical activity, clearly much additional research is needed in this area. The stratified analyses suggested that the effects of this bias may be modified by demographic characteristics and body habitus. Investigators will need larger stratum-specific sample sizes to fully understand the relation of these variables. Given the prohibitive cost of doubly labeled water studies, less expensive measures of activity that overcome some of the limitations of waist-mounted accelerometers (i.e., multiple sensors and/or heart rate measurements) could be employed to replicate and extend our findings (28, 29). Because there is some evidence for differential expression of social desirability bias by ethnicity (30), future investigations should focus recruitment on minority populations.
In conclusion, we have described a possible source of systematic biases in certain self-reports of physical activity that are attributable to the personality traits social desirability and social approval. The presence of these biases may depend largely on the type of survey instrument employed. These results suggest that reporting biases may be minimized through survey and questionnaire design. Further study is required to confirm these findings and to better characterize differences in the expression of bias by mode of administration, length of recall, questionnaire structure, and type and intensity of activity reported. As with dietary intake (31, 32), fitting social desirability or social approval scores in regression equations may improve overall model explanatory ability. Additionally, this avenue of inquiry may aid researchers in the creation of new physical activity assessment methods that are less prone to biased reporting.