Department of General Practice and Health Services Research, University of Heidelberg, Heidelberg, Germany and 1 Centre for Quality of Care Research, Radboud University Medical Centre Nijmegen, Nijmegen, The Netherlands.
Correspondence to: T. Rosemann, Department of General Practice and Health Services Research, University of Heidelberg, Vosstr. 2, 69115 Heidelberg, Germany. E-mail: thomas_rosemann{at}med.uni-heidelberg.de
Abstract |
---|
Methods. A structured procedure was used for the translation and cultural adaptation of the AIMS2-SF into German. The questionnaire was administered to 220 primary care patients with OA of the knee or hip. Test-retest reliability was assessed in 35 randomly selected patients, who received the questionnaire a second time after 1 week. The physical scale of the original AIMS2-SF was divided into an upper body limitations scale and a lower body limitations scale.
Results. With values ranging from 0.52 to 0.97 for Pearson's r, item-scale correlations were reasonably good. The discriminative power of separate scales was also good, reflected in low values for correlation between different scales, indicating little redundancy. Only two items (13 and 15) referring to the symptom scale showed item-scale correlations of r = 0.72 and r = 0.67, respectively, with the lower body limitation scale. The assessment of internal consistency reliability also revealed satisfactory values: Cronbach's α was ≥0.83 for all scales, except for the social interaction scale (0.66). The test-retest reliability, estimated as the intraclass correlation coefficient (ICC), exceeded 0.85 except for the affect scale (0.72). Substantial floor effects occurred in the upper limb scale (33.8%). Principal factor analysis confirmed the postulated three-factor structure with physical, psychological and social dimensions, explaining 49.8, 14.1 and 6.4% of the variation, respectively. The assessment of external validity revealed satisfactory correlations with the corresponding WOMAC (Western Ontario and McMaster Universities Osteoarthritis Index) scales. As expected, correlations with radiological grading were moderate to low. The correlation with the physician's assessment was high in the scales that were dominated by physical factors, but rather low in the areas of health that were found to be dominated by psychological or social factors.
Conclusion. The German AIMS2-SF is a reliable and valid instrument to assess the quality of life in primary care patients suffering from OA. When addressing the different impacts of OA, the physical scale should be divided into an upper body scale and a lower body scale. The floor and ceiling effects revealed are in accordance with the disease characteristics of the study sample and do not limit the significance of the questionnaire.
KEY WORDS: Health status, Quality of life, Cross-cultural validation, Osteoarthritis, Arthritis Impact Measurement Scales 2 (German AIMS2-SF), Outcome assessments
Introduction |
---|
Subjects and methods |
---|
Translation and cultural adaptation
The German version of AIMS2-SF was translated and retranslated according to guidelines for cultural adaptation in order to address content validity [17]. Slight adaptations were necessary for item 7 ("Did you have problems either walking several blocks or climbing a few flights of stairs?"). The expression "blocks" is not commonly understood in German as a measure of distance and was replaced by "a few hundred metres". Item 49 contains the expression "bothered by nervousness or your nerves", which sparked intense discussion among the translators because there are various potential translations in German. The translators settled on a more understandable translation capturing the original idea of the item rather than the more direct translation.
The draft translation was piloted with 15 patients. In accordance with Taal et al., we replaced item 33 of the original AIMS2, "How often did you go to a meeting of a church, club, team or other group?", with item 31, "How often did you visit friends or relatives at their homes?", because this was expected to increase the internal consistency of the social interaction component [8, 18]. On the other hand, unlike Taal and Haavardsholm, we did not replace item 42 of the original AIMS version, "How often did your pain make it difficult for you to sleep?", with item 38, "How would you describe the arthritis pain you usually had?" [9, 18]. Like Ren and colleagues [7], who validated the AIMS2-SF in the USA among patients with OA, we divided the German AIMS2-SF component "physical" into the two components "upper body limitation" and "lower body limitation". Ren et al. discussed some possible limitations due to ceiling and floor effects, especially in patients suffering from OA of the knee or hip. As they did, we included item 1 (drive a car or use public transport), item 11 (need help to get dressed) and item 12 (need help to get out of bed) in the lower body limitation scale. We also followed their approach in including item 24 of the AIMS2-SF, "family and friends sensitive to personal needs", and item 19, "enjoy the things you do", in the affect scale and not in the social interaction scale. Contrary to Ren et al. and in agreement with some previous validation studies, we did not exclude the role component, even though this scale is usually answered by only half of the participants, because it addresses only those who are still working.
Other measures
In order to assess the external validity of the scales, additional data were retrieved.
The patient's general practitioner (GP) was asked to evaluate the severity of arthritis based on available radiographs, the patient's history and clinical examination, using the classification criteria of the American College of Rheumatology [15, 16]. The GP's evaluation was scored on a 0–10 scale, 10 representing no limitation of quality of life by arthritis and 0 representing massive limitation of quality of life. All patients were also given the validated German version of the WOMAC questionnaire [19], containing five-point Likert scales similar to the German AIMS2-SF questionnaire. For inclusion in the study, an X-ray of the affected joint, not older than 6 months, was required. The X-rays were scored according to the criteria of Kellgren and Lawrence [20]: grade 0 = normal; grade 4 = massive alterations with complete collapse of the joint space.
Statistical analysis
Data were entered in Microsoft Excel spreadsheets and analysed with the SPSS statistical package (version 11.0). When necessary, items were (according to the recommendations of Meenan et al. [1, 3]) recoded and transformed from graduated 10-point scales, Likert scales of the German AIMS2-SF and WOMAC and patient self-assessments, so that results for all items lay between 0 and 10, 0 representing the best and 10 the worst health status. Descriptive analysis included mean and standard deviation, and in order to assess floor and ceiling effects the percentage of participants achieving the lowest and highest possible score was calculated.
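The recoding step and the floor/ceiling calculation can be sketched as follows. This is a minimal illustration in Python, not the SPSS procedure used in the study; it assumes a simple linear transformation, and the function names `rescale_to_0_10` and `floor_ceiling_percent` as well as the example responses are hypothetical.

```python
def rescale_to_0_10(value, low, high, reverse=False):
    """Linearly map a response on [low, high] to the 0-10 range.

    With reverse=True the direction is flipped, e.g. for a scale on which
    a high raw value means good health (assumed use case).
    """
    score = (value - low) / (high - low) * 10
    return 10 - score if reverse else score


def floor_ceiling_percent(scores, lowest=0.0, highest=10.0):
    """Percentage of participants at the lowest and highest possible score."""
    n = len(scores)
    floor = 100 * sum(1 for s in scores if s == lowest) / n
    ceiling = 100 * sum(1 for s in scores if s == highest) / n
    return floor, ceiling


# Hypothetical five-point Likert responses (1-5) from four participants.
likert = [1, 1, 3, 5]
scores = [rescale_to_0_10(v, 1, 5) for v in likert]
print(scores)                         # [0.0, 0.0, 5.0, 10.0]
print(floor_ceiling_percent(scores))  # (50.0, 25.0)
```

A 0–10 rating on which 10 is the best state, such as the GP's evaluation, would be passed with `reverse=True` so that 0 again represents the best health status.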
Internal consistency reliability
As an indicator of internal consistency reliability, we calculated Cronbach's α to estimate whether each item of a scale is appropriate for assessing the underlying concept of its scale [21, 22]. Achievable values for Cronbach's α range from 0, signifying no internal consistency, to 1, signifying identical results. We considered high internal consistency to be represented by values of 0.50–0.70 for group comparisons and by values of over 0.90 for individual patients' results.
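For illustration, the coefficient can be computed from the item variances and the variance of the summed scale score, using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the total). The sketch below assumes complete data with no missing responses; the function name `cronbach_alpha` and the example data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent, all items of one scale."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical two-item scale answered by four respondents.
responses = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(round(cronbach_alpha(responses), 2))  # 0.75
```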
Test-retest reliability
We used the intraclass correlation coefficient (ICC) as an estimate of test-retest reliability. In order to assess the test-retest reliability of the individual scales, we computed the ICC based on Ren et al.'s six-component model of the AIMS2-SF: upper body limitation; lower body limitation; affect; symptom; social interaction; and role. A random sample of 35 patients from the initial sample of 220 was asked to complete the questionnaire again after 7 days. All of the 35 patients selected for the retest returned their questionnaires. In order to be eligible for retest, patients had to have no change in therapeutic regimen, lifestyle or medication during these 7 days.
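The text does not state which form of the ICC was computed. As a sketch only, the example below assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1), derived from the mean squares of a two-way ANOVA. The function name `icc_2_1` and the example scores are hypothetical.

```python
def icc_2_1(rows):
    """rows: one list per patient holding k repeated scores, e.g. [test, retest]."""
    n, k = len(rows), len(rows[0])
    grand = sum(sum(r) for r in rows) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(r[j] for r in rows) / n for j in range(k)]

    # Sums of squares for patients (rows), occasions (columns) and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical scale scores (0-10) at baseline and after 7 days.
test_retest = [[2.5, 3.0], [5.0, 4.5], [7.5, 7.0], [1.0, 1.5], [6.0, 6.5]]
print(round(icc_2_1(test_retest), 3))
```

Other ICC forms (for example consistency rather than absolute agreement) use the same mean squares with a different denominator, so the choice should be reported alongside the result.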
Scale internal validity
Scale internal validity was assessed by computing the correlation (Pearson's r) of the items with the respective scale corrected for overlap to avoid bias from self-correlation. A correlation of at least 0.4 was assumed as the standard for supporting scale internal consistency [7, 23]. Item-discriminant validity shows the extent to which an item measures what it is not supposed to measure: the degree of discriminatory power. It was assessed by computing the correlation (Pearson's r) of the items with the other scales. Cut-off values have not been defined, but in order to support the high discriminatory power of scales there should not be a high correlation for item discriminance.
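The correction for overlap can be illustrated as follows: each item is correlated with the sum of the remaining items of its scale, so the item does not contribute to both sides of the correlation. This is a minimal sketch assuming complete data; `pearson_r`, `corrected_item_scale_r` and the example responses are hypothetical names and values.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_scale_r(rows):
    """Correlate each item with the sum of the other items of the same scale."""
    k = len(rows[0])
    result = []
    for i in range(k):
        item = [row[i] for row in rows]
        rest = [sum(row) - row[i] for row in rows]  # scale score without item i
        result.append(pearson_r(item, rest))
    return result


# Hypothetical two-item scale answered by four respondents.
responses = [[1, 2], [2, 1], [3, 4], [4, 3]]
print([round(r, 2) for r in corrected_item_scale_r(responses)])  # [0.6, 0.6]
```

For item-discriminant validity, the same `pearson_r` function would be applied to an item and the score of each other scale; no overlap correction is needed there because the item is not part of those scales.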
Convergent validity
Convergent validity was assessed using external and internal criteria. In using external criteria to estimate convergent validity, different systems are usually compared and a linear relationship cannot be assumed. Therefore, Spearman rank correlation tests are most commonly used, e.g. in the validation studies of Stucki et al. (WOMAC) [19], Roos et al. (WOMAC) [24], Ludwig et al. (Lequesne) [25] and Salaffi et al. (AIMS) [23]. This study tested the hypothesis that AIMS2-SF scales correlate with corresponding scales of the previously validated WOMAC questionnaire. In addition, the correlations of the AIMS2-SF with the Kellgren score and the physician assessment were estimated using the Spearman rank correlation. As Roos et al. have discussed in this context [24], correlations usually range between 0.2 and 0.6: correlations between 0.40 and 0.60 are regarded as good correlations and values above 0.6 as very high correlations. P-values are provided in order to show levels of statistical significance.
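As an illustration of why a rank-based coefficient suits comparisons between different scoring systems, the sketch below computes Spearman's coefficient as the Pearson correlation of the ranks, assigning average ranks to ties. The function names and the example data are hypothetical, and the calculation of P-values is omitted.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for idx in order[i:j + 1]:
            ranks[idx] = rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but non-linear relationship still yields a perfect rank correlation.
print(round(spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]), 3))  # 1.0
```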
In addition to external criteria, convergent validity was also assessed by analysing demographic subgroups by age, gender and level of education. A low level of education was defined as education only as far as secondary school. Education more advanced than this was considered a high level of education. To compare the different groups we used Student's t-test for independent samples.
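The subgroup comparison can be sketched with the classical pooled-variance form of Student's t-test for two independent samples. The example returns only the t statistic and the degrees of freedom; converting these into a P-value requires the t distribution, which is not part of the Python standard library. The function name `student_t` and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    n1, n2 = len(a), len(b)
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Hypothetical scale scores (0-10) in two subgroups.
group_a = [1, 2, 3]
group_b = [2, 3, 4]
t, df = student_t(group_a, group_b)
print(round(t, 3), df)  # -1.225 4
```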
Construct validity
To explore construct validity we conducted a principal components factor analysis with varimax rotation. The criterion for factor extraction was an eigenvalue >1.0.
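The extraction criterion and the rotation can be sketched as follows. The example assumes NumPy is available, works directly on a correlation matrix, and uses a common SVD-based varimax iteration; it is not the SPSS routine used in the study, and the function names and the toy correlation matrix are hypothetical.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a (variables x factors) loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        criterion = s.sum()
        if previous != 0.0 and criterion < previous * (1 + tol):
            break
        previous = criterion
    return loadings @ rotation


def extract_factors(corr):
    """Keep components with eigenvalue > 1.0 and rotate their loadings."""
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]  # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    # Share of total variance per retained component, before rotation.
    explained = 100 * eigvals[keep] / corr.shape[0]
    return varimax(loadings), explained


# Toy correlation matrix: two blocks of three items, uncorrelated across blocks.
corr = np.eye(6)
corr[:3, :3] = 0.8
corr[3:, 3:] = 0.6
np.fill_diagonal(corr, 1.0)

rotated, explained = extract_factors(corr)
print(rotated.shape)          # (6, 2)
print(np.round(explained, 1))
```

Because the rotation is orthogonal, each item's communality (the sum of its squared loadings) is unchanged; only the distribution of variance across the factors shifts.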
Results |
---|
Cronbach's α values were excellent: all scales achieved values equal to or above 0.82, except for the social scale (0.66). In meeting and exceeding the recommended range of 0.5–0.7, the results indicated high internal consistency of the scales.
Generally high values for test-retest reliability were found for all scales except for the affect scale, for which the ICC was only 0.72, and the social interaction scale (ICC 0.77). With values above 0.81, the areas dominated by physical factors and the role scale showed slightly higher test-retest reliability.
Table 4 shows the results of the varimax rotation analysis with the three latent factors that could be extracted. The factor "physical" explained 49.8% of the cumulated variation, the factor "psychological" 14.1% and the factor "social" 6.4%. The high loadings of particular items on each of these three factors confirm that the dimensions are clearly distinguished. All items referring to the symptom component (item 13, severe arthritis pain; item 14, morning stiffness >1 h; and item 15, pain influencing sleep) loaded on two factors, physical as well as psychological, reflecting the complex nature of these areas of health.
Relationship with demographic subgroups
The use of demographic and socioeconomic subgroups is an additional approach to the assessment of convergent validity. Table 6 displays the results of these analyses. The differences were assessed with Student's t-test for independent samples. Women obtained significantly higher average scores in the physical and symptom scales, indicating worse health status and more burden due to OA. Women also had higher average scores for social interactions (5.69) and affect (4.81), with statistically significant differences in the means (P<0.05 and P<0.01, respectively). This is in line with previous studies from Salaffi et al. and may be due to the fact that female study participants on average suffered longer from arthritis than male participants (11.3 vs 8.8 yr) [23].
Discussion |
---|
The results of the assessment of scale internal validity, internal consistency and item-discriminant validity indicated that the German AIMS2-SF questionnaire appears to measure what it is supposed to measure and that its items are selective and non-redundant.
Cronbach's α showed very satisfactory results and by following the approach of Ren et al. in including item 19 (which asked the patients whether they enjoyed the things they did) in the affect scale, even the value for the social scale reached 0.66, a result quite similar to that of the validation study of Ren (0.67) and much better than in the study of Guillemin et al. (0.32) [6, 7]. The ICC values (assessing test-retest reliability) indicated good reproducibility. The lowest ICC values were found for the affect scale and the social interaction scale, areas that largely depend on external factors, such as telephone calls and visits from friends and family. These scales have performed similarly in validation studies in other languages [8, 23].
Correlations of the German AIMS2-SF scales with corresponding scales from the already validated German WOMAC were very satisfactory. In achieving quite different values for the upper body limitation and lower body limitation scales, the correlations with the WOMAC questionnaire substantiate the approach of Ren et al. [7] in dividing the physical scale into scales addressing upper and lower limb functioning. Ren et al. [7] (using a different coding, 0 representing worst and 10 representing best health status) found substantial ceiling effects in their validation study among patients with OA. They discussed possible limitations of the AIMS2-SF in applying it to patients with OA of the lower limb. In our sample of patients suffering from OA of the knee or hip we found no substantial ceiling effects in the lower limb scale but quite large floor effects in the upper limb scale. Our results are in line with the findings of Taal et al. [8], who reported moderate floor effects (1.6) and no ceiling effects (0.0) in the physical scale, and Salaffi et al. [26], who also found only moderate ceiling effects (0.6–4.1) in the scales representing lower limb function but substantial floor effects (43.9–67.1) in the scales representing upper limb function. Like Salaffi et al., we regard the disease characteristics of the study sample as responsible for these results. To summarize, we did not find any limitations due to ceiling or floor effects of the German AIMS2-SF in patients suffering from OA of the knee or hip.
As expected, and supported by clinical experience, the correlations between the AIMS2-SF and radiological scores were low to moderate in the symptom and lower body limitation scales [19, 27, 28]. It is known that self-reported functional ability assessed by instruments such as the AIMS reflects physical impairment due to the arthritic joint disease quite well [29]. Therefore, it is not surprising that the correlation was high between physician assessment and physical aspects of the AIMS. The fact that the correlation was much lower for items reflecting social interaction may indicate that these areas are outside the scope of physicians' assessment of their patients' quality of life, even for GPs who are well acquainted with their patients, as was the case in this study. Therefore, these results reflect the potential benefit of AIMS in OA-related quality of life assessment in primary care.
As in previous validation studies, the principal factor analysis indicated high construct validity by revealing three latent factors: physical, psychological and social; these explained 70.4% of the variance of the entire questionnaire [5, 23]. Comparison of demographic subgroups consistently showed plausible results: impairment increased with age [28]. In line with previous studies, educational level affected quality of life. Salaffi et al. [26] also found higher levels of education to be related to higher quality of life in the validation study of the Italian AIMS questionnaire. Therefore, the results for the demographic subgroups substantiate the convergent validity of the German AIMS2-SF.
Especially due to demographic trends, the incidence and prevalence of OA are increasing in most western industrialized nations. OA causes a substantial burden of disease, as well as high direct and indirect costs. It has a massive impact on patients' quality of life, which creates a need for reliable and valid measurement instruments for the assessment of potential interventions. The well-established WOMAC and Lequesne questionnaires can be used to assess medical and surgical interventions in OA. In addition to these two instruments, the AIMS2-SF questionnaire also addresses aspects of life that are less directly related to joint diseases. The AIMS2-SF is therefore suitable for use in the evaluation of multimodal interventions, such as self-management programmes [30]. Results from hospital-based studies cannot easily be generalized to primary care. The present validation study underlines that the German culturally adapted AIMS2 short version promises to have these qualities. The instrument could be used in different German regions with distinct dialects. As in previous studies, willingness to participate was high among OA patients. This was reflected by high response rates, both in the test and the retest, and in the low rate of unanswered items.
The results presented are particularly interesting because results of hospital- or treatment centre-based studies cannot easily be transferred to a primary care setting. The results of this validation study indicate that the German AIMS2-SF is a valid and reliable instrument for assessing the quality of life of patients with OA and it provides us with an important instrument to assess the effects of complex interventions in primary care.
Acknowledgments |
---|
All authors declare that there is no conflict of interest.
References |
---|