1 Department of Medicine, Vanderbilt-Ingram Cancer Center, Center for Health Services Research, Vanderbilt University Medical Center, Nashville, TN.
2 Department of Epidemiology, Shanghai Cancer Institute, Shanghai, People's Republic of China.
3 Department of Exercise and Nutritional Sciences, San Diego State University, San Diego, CA.
Received for publication April 23, 2002; accepted for publication June 3, 2003.
ABSTRACT
data collection; epidemiologic methods; exercise; questionnaires; reproducibility of results; validation studies
Abbreviations: ICC, intraclass correlation coefficient; MET(s), metabolic equivalent(s); PAQ, physical activity questionnaire; SWHS, Shanghai Women's Health Study.
INTRODUCTION
The Shanghai Women's Health Study (SWHS) is a population-based prospective cohort study. Several behavioral risk factors for cancer, including physical activity and diet, are of central interest in the research. The purpose of this investigation was to evaluate the reproducibility and validity of the physical activity questionnaire (PAQ) implemented at baseline in the SWHS.
MATERIALS AND METHODS
Potential participants for this investigation (n = 826) were randomly selected from the SWHS roster based on the proximity of the neighborhoods to study interviewers. Approximately 25 primary contacts and 50 alternate contacts were identified for possible recruitment by each interviewer. Approximately 30 percent of the women contacted enrolled in and completed at least a portion of the study (n = 200). Comparisons between participants and nonparticipants in the validation study indicated that participants were older (p = 0.001) and less educated (p = 0.04), but otherwise participants did not differ from nonparticipants (p > 0.05) with regard to exercise participation in the 5 years preceding cohort entry, body weight, waist-to-hip ratio, family (household) income, and numerous health behaviors (i.e., smoking, alcohol drinking, and energy and macronutrient intake).
The reproducibility of the interviewer-administered SWHS PAQ was evaluated using test-retest methods over an approximate 2-year interval (mean = 2.15 years (standard deviation, 0.36; range, 1.65–2.66 years)). Of the 200 women enrolled, 191 (95 percent) completed a second interview. The validity of the SWHS PAQ was evaluated using repeated administrations of two instruments, a self-administered physical activity log and a telephone-administered 7-day PAQ. The comparison instruments were initiated, on average, 12 months after the first SWHS PAQ administration, with the log being initiated about 6 months after the 7-day recall. The second SWHS PAQ was administered approximately 12 months after the use of comparison instruments was initiated (figure 1).
7-day PAQ
The 7-day PAQ was structured and worded similarly to the items evaluating current exercise and nonoccupational activities on the SWHS PAQ. The instrument evaluated exercise and sports participation during the past 7 days and obtained quantitative data on these activities. Data for up to three exercise activities were summarized in terms of intensity (METs), duration (hours/week), and average energy expenditure during the period (MET-hours/week) using standard methods (10). Nonoccupational lifestyle activities (i.e., stair climbing, transportation, walking, cycling, housework) were evaluated using a 7-day time frame. A total of 200 women provided at least nine interviews, and the average number of assessments completed was 24.5 (standard deviation, 1.9).
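The energy expenditure summary described above (intensity in METs multiplied by duration in hours/week) can be illustrated with a minimal sketch. The class name, function name, and MET values below are hypothetical and chosen only for illustration; they are not taken from the Compendium of Physical Activities or from the study data.

```python
from dataclasses import dataclass


@dataclass
class ActivityReport:
    """One reported exercise activity (illustrative structure, not the study's)."""
    name: str
    mets: float            # intensity in METs (hypothetical value)
    hours_per_week: float  # reported duration


def met_hours_per_week(reports):
    """Sum intensity x duration over all reported activities."""
    return sum(r.mets * r.hours_per_week for r in reports)


# Hypothetical 7-day report with two activities.
reports = [
    ActivityReport("walking", 3.5, 2.0),
    ActivityReport("tai chi", 4.0, 1.5),
]
total = met_hours_per_week(reports)
print(total)  # 13.0 MET-hours/week
```

A woman reporting no exercise simply yields a total of zero, which is how nonparticipants would enter such a summary.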
Physical activity log
The physical activity log was adapted for this population from existing instruments that have previously been used to evaluate PAQs (11, 12). It was designed to capture the full range of activities encountered in daily life, including household activities, transportation, occupational activities, and up to 26 different sport, exercise, or recreational activities. At the end of each assessment day, women were instructed to record in their logs the amount of time they had spent in each category of activity. Summary measures from the logs were obtained as duration of activity (hours/day) and energy expenditure in overall activity and for each activity domain (e.g., household, occupation). Physical activity energy expenditure was calculated using the Compendium of Physical Activities (10) and was expressed in terms of activity intensity (METs) and duration (hours/week), as MET-hours/week. A total of 180 women completed at least one physical activity log. The average number of logs completed was 3.9 (standard deviation, 0.4).
Statistical analyses
We evaluated data derived from the physical activity assessments to identify possible outliers, as well as the distribution of the summary measures. Activities were summarized in terms of reporting prevalence (percentage of women reporting the activity) and mean values for women reporting participation.
Reproducibility
Data from the two administrations of the SWHS PAQ allowed completion of test-retest analyses. We examined items for which responses were reported as categorical data using cross-tabulation of activity reports to obtain the proportion of persons reporting the same category consistently (correctly), as well as extreme reporting variation between administrations of the PAQ. Extreme variation in reporting reflects the largest possible change in reported participation between administrations. The kappa statistic (κ) was used to evaluate the reproducibility of classification for categorical responses (13). Repeated-measures models were used to test mean differences in continuous activity variables for which measures were obtained at each time point. To evaluate the reproducibility of continuous summary variables, we calculated intraclass correlation coefficients (ICCs) (14) using variance components from random-effects models derived from SAS PROC MIXED (15).
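The two reproducibility statistics named above can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in the study: the kappa function is the standard unweighted Cohen's kappa, and the ICC is a one-way analysis-of-variance estimator used here as a stand-in for the variance components that the study obtained from SAS PROC MIXED. All function names and data values are hypothetical.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Unweighted kappa for paired categorical responses (test vs. retest)."""
    n = len(first)
    if n == 0 or n != len(second):
        raise ValueError("need two non-empty sequences of equal length")
    observed = sum(a == b for a, b in zip(first, second)) / n
    c1, c2 = Counter(first), Counter(second)
    # Chance agreement from the marginal category frequencies.
    expected = sum(c1[k] * c2[k] for k in c1) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def icc_oneway(pairs):
    """One-way random-effects ICC for (test, retest) pairs, ANOVA estimator."""
    n, k = len(pairs), 2
    grand = sum(a + b for a, b in pairs) / (n * k)
    means = [(a + b) / 2 for a, b in pairs]
    # Between-subject and within-subject mean squares.
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((a - m) ** 2 + (b - m) ** 2
              for (a, b), m in zip(pairs, means)) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical yes/no exercise reports at the two administrations.
kappa = cohen_kappa(["yes", "yes", "no", "no"], ["yes", "no", "no", "no"])
print(kappa)  # 0.5

# Hypothetical MET-hours/week at the two administrations.
icc = icc_oneway([(2.0, 3.0), (4.0, 4.0), (6.0, 5.0)])
print(round(icc, 3))  # 0.862
```

With balanced data (two measurements per woman) the ANOVA estimator is a common simplification; a mixed-model fit may differ when data are unbalanced or when the between-subject variance estimate is constrained.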
Validation
To assess the utility of our criterion measures, we examined correlations between the 1-year averages of the 7-day questionnaires and the activity logs. To evaluate the validity of the SWHS PAQ, we compared data from both of its administrations with the 1-year averages of the comparison measures using Spearman rank-order correlation coefficients. We also conducted detailed analyses by age, education, and family income.
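The rank-order comparison described above can be sketched in a few lines. This is a minimal illustration, assuming tied values receive average ranks and that the coefficient is computed as the Pearson correlation of the ranks; the function names and the data are hypothetical and do not come from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical MET-hours/week: questionnaire value vs. 1-year criterion average.
paq = [0.0, 2.5, 5.0, 10.0, 1.0]
criterion_mean = [0.5, 3.0, 4.0, 12.0, 3.5]
rho = spearman(paq, criterion_mean)
print(round(rho, 2))  # 0.9
```

Because only ranks enter the calculation, the coefficient reflects how well the questionnaire orders women relative to the criterion measure rather than agreement in absolute energy expenditure.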
RESULTS
Descriptive analyses for the 7-day PAQ and the physical activity log are presented in table 1. Nearly 75 percent of women reported participation in predominantly moderate-intensity exercise during the 12-month measurement period (table 1). In detailed analyses, using exercise reports on 75 percent of the instrument administrations as an indicator of "regular exercise," only 48 percent and 32 percent of women reported exercising regularly on their activity logs and 7-day questionnaires, respectively (data not shown). Nearly all women (>85 percent) reported participation in lifestyle activities such as stair climbing, walking, and housework, but only 40 percent reported transportation-related activity (table 1).
Our initial validity analyses examined correlations between the two criterion measures. Moderate-to-strong correlations were noted between the physical activity log and the 7-day PAQ in most activity domains: adult exercise (r = 0.84), cycling (for transportation, r = 0.62; as a daily activity, r = 0.74), household activity (r = 0.60), and nonoccupational walking (r = 0.38). In terms of the validity of the SWHS PAQ relative to the criterion measures, moderate-to-strong rank-order correlations (r = 0.49–0.80) were noted for adult exercise duration and energy expenditure, with the correlations being higher in the 7-day PAQ comparisons and upon the second administration of the cohort questionnaire (table 3).
Relative validity comparisons between the 7-day PAQ and the first SWHS PAQ by age, education, and family income are presented in table 4. In general, there was some evidence of systematic variation in the validity data across these covariates. Small sample sizes (n < 40) and low activity prevalence (e.g., cycling) made interpretation of some of the comparisons difficult. In terms of age, it appeared that the strength of the validity coefficients for adult exercise, daily walking, and household activity was somewhat lower among younger women, but this trend was reversed for transportation-related activities (table 4). Educational attainment appeared to be positively associated with higher validity coefficients for adult exercise and transportation-related activities. No systematic patterns in the validity coefficients were noted across family income strata (table 4).
DISCUSSION
Previous validation studies of instruments used in prospective research have reported results that are consistent with the current findings. Comparisons of questionnaire-based reports of habitual exercise patterns with physical activity records have revealed correlations of moderate strength (r = 0.47–0.62) (11, 19, 20), while validity coefficients for household and transportation/daily walking have tended to be lower (7, 11). While no direct data were available for evaluating the validity of the adolescent exercise items on the SWHS PAQ, studies demonstrating the reproducibility of exercise reports obtained 10–30 years in the past (21, 22), as well as our previous study, which used similar questions about adolescent exercise and suggested that exercise early in life is an important contributor to reduced breast cancer risk (23), support the utility of the adolescent exercise items.
At least two studies have reported a reduction in indicators of reproducibility with test-retest intervals longer than 1 month (20, 24). Nine- to 24-month reproducibility values for adult exercise behaviors from the Minnesota Leisure Time (20), College Alumnus (24), and Nurses' Health Study II (19) PAQs have been in the range of 0.43–0.69, very close to the results reported here (ICC = 0.70). Reproducibility of adolescent exercise behavior in this report was similar to that of Friedenreich et al. (18) and Chasan-Taber et al. (25). Our finding of slightly higher reproducibility (kappa values) for the adolescent exercise questions versus the adult exercise questions may be due to the simplicity of the yes/no question in adolescence as compared with the question posed during a period of adulthood in which the participant's understanding of the question, and possibly her activity levels, may have evolved during the test-retest interval. Evaluation of the summary duration and exercise energy expenditure reproducibility values (e.g., ICC = 0.40 for the adolescent measure (hours/week/year) vs. ICC = 0.70 for the adult measure (MET-hours/week/year)) indicated that length of participation and duration of activity may be less reliably recalled given a longer recall period. There are fewer comparative data in the literature for reproducibility of nonexercise activities (e.g., household activity, transportation-related activity, walking) over a 2-year period, but 2-week to 12-month retest values of 0.30–0.80 for household activities have been reported (11, 20, 26). The reproducibility of household activity in this study was in the lower end of this range.
The weak inverse relation between the physical activity log and the SWHS PAQ in comparisons of walking for transportation may be attributable to variation in the way the instruments captured this behavior. In detailed analyses, we noted a significant positive relation (r 0.59) between occupational walking in the physical activity log and the SWHS PAQ item on walking for transportation, as well as a weak positive correlation between the physical activity log walking variable and the SWHS PAQ daily walking variable (r = 0.14, p = 0.06). This suggests that the assessment of walking in the physical activity log was more inclusive and included walking done in addition to transportation. The questions on walking in the SWHS PAQ focused solely on walking for transportation.
The 2-year retest interval in this investigation probably resulted in a lowering of the apparent reproducibility of the activity behaviors evaluated because of the mixing of true intraindividual variation of activity with true reporting variation, and therefore our results may be viewed as a conservative estimate of the reproducibility of this instrument. This effect would be expected to be most acute for lifestyle activities, because the time frame evaluated for these behaviors was only about half of the 2-year retest interval. This issue would appear to be less problematic in reports of adult exercise that utilized a longer exposure time frame (i.e., the past 5 years); however, ICC values below 1.0 for reports of adolescent exercise behaviors reflect only variation in reporting between administrations of the instrument.
The SWHS PAQ was designed to be culturally relevant and to capture the full range of daily activities that are important contributors to the overall physical activity energy expenditure of women. The work of Ainsworth et al. (6, 27) has consistently demonstrated the importance of capturing activities related to the household and occupational activities of women, rather than only recreational or leisure-time activities. A recent report on US women suggested that household activities accounted for 50 percent of the overall physical activity energy expenditure among the women, with occupational and leisure-time activities accounting for only 33 percent and 17 percent, respectively (8). Weller et al. (5) demonstrated that failing to account for the full range of activities important to women resulted in underestimation of the risks of all-cause and cardiovascular disease mortality by 20–40 percent. Unfortunately, optimal assessment of highly prevalent lower-intensity household and walking activities remains a challenge in physical activity research.
This study had a number of limitations that should be considered when interpreting this report. First, as with all studies seeking to determine the validity of self-reported physical activity levels, there is no easily administered "gold standard" available for measurement of overall activity as well as individual activity domains (e.g., exercise, household, transportation) that would allow true validation of the instrument (28). The SWHS PAQ evaluated habitual physical activity patterns over relatively long time periods (e.g., 1 year or 5 years), and it is subject to errors of recall attributable to memory (e.g., omission, intrusion) and long-term averaging (29–31). In contrast, our criterion measures minimized the potential for these types of recall errors because of their short recall periods; therefore, they may be considered to have conceptually different sources of error than the cohort questionnaire. Thus, our alloyed standards were selected as the most feasible means of evaluating the validity of the SWHS PAQ in this investigation.
The validity coefficients for the 7-day questionnaire were consistently higher than those for the activity logs, perhaps because of the greater similarities in question structure and content between the 7-day and cohort questionnaires and the frequent sampling with the 7-day instrument (e.g., 24 administrations vs. four). A plausible explanation for the stronger results for the 7-day PAQ comparisons is that attenuation of the correlations attributable to intraindividual variation in activity was minimized (1, 2). However, it is also plausible that the stronger correlations in analyses of the second SWHS PAQ may be due to either 1) better temporal sequencing with the criterion measures (figure 1) or 2) enhanced reporting accuracy following completion of the intensive measurement protocol during the study period. There appeared to be changes in reporting on the second SWHS PAQ (i.e., increased prevalence and duration), particularly for adult exercise activities. These changes could be attributed to either a true increase in activity during the study (i.e., reactivity) or an enhanced understanding of the types of activities being evaluated by the investigators (i.e., a learning effect). Given the inherent challenge of increasing the physical activity levels of individuals (32), the latter explanation seems more likely. We think the best estimate of the validity of the SWHS PAQ is a figure that falls within the range of the estimates provided in our evaluation of both administrations of the instrument (table 3).
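The attenuation argument above (24 administrations vs. four) can be made concrete with a small sketch. It assumes the classical measurement-error result that, when a criterion measure is the mean of n replicate assessments, the observed correlation equals the true correlation divided by sqrt(1 + ratio / n), where ratio is the within-person variance divided by the between-person variance. The variance ratio used below is hypothetical and was not estimated in this study; the function name is likewise illustrative.

```python
import math


def attenuation_factor(variance_ratio, n_replicates):
    """Multiplier applied to the true correlation when the criterion is the
    mean of n replicates; variance_ratio = within-person / between-person."""
    return 1.0 / math.sqrt(1.0 + variance_ratio / n_replicates)


ratio = 2.0  # hypothetical within/between variance ratio, for illustration only
factor_logs = attenuation_factor(ratio, 4)      # four activity logs
factor_recalls = attenuation_factor(ratio, 24)  # 24 seven-day recalls
print(round(factor_logs, 3), round(factor_recalls, 3))
```

Under this assumption the factor moves toward 1 as the number of replicates grows, which is consistent with the suggestion that more frequent sampling reduced attenuation in the 7-day PAQ comparisons. It does not by itself separate that effect from the similarity in question structure noted in the same paragraph.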
In conclusion, in the present investigation we observed that the SWHS PAQ was reproducible and valid with respect to self-reports of exercise behaviors, as well as a number of highly prevalent lifestyle activities (e.g., housework, transportation), in this cohort. In comparison with two criterion measures with conceptually different sources of measurement error, significant rank-order correlations were observed, suggesting that the PAQ would be useful for classifying participants in the SWHS into quantiles of physical activity. Overall, these findings support the SWHS PAQ as a useful measure of physical activity exposures in this cohort and suggest that this instrument may have utility in assessing the activity patterns of women in other populations.
ACKNOWLEDGMENTS
The authors acknowledge the invaluable contributions of the doctors and health workers in the study communities for their recruitment of study participants, as well as the contributions of Drs. Xiu-Zhen Li, Pei-Lan Zhu, and Hong-Lan Li in ensuring effective study implementation.