Clinica Reumatologica, Università di Ancona,
1 Fidia spa, Abano Terme,
2 Istituto di Reumatologia, Università di Siena,
3 Cattedra di Reumatologia, Università di Cagliari,
4 Divisione di Reumatologia, Ospedale S. Anna, Ferrara,
5 Divisione di Reumatologia, Ospedale Al Mare, Lido di Venezia,
6 Divisione di Reumatologia, Istituto Ortopedico Toscano, Firenze,
7 Unità Operativa di Reumatologia, Polo Ospedaliero ASL-Roma Est,
8 Cattedra e Divisione di Reumatologia, Università di Padova,
9 Cattedra e Divisione di Reumatologia, Istituto Ortopedico G.Pini, Milano,
10 Cattedra di Reumatologia, Università di Napoli,
11 Divisione di Reumatologia, Dipartimento di Medicina Interna, Università di Genova and
12 Cattedra di Reumatologia, Università di Parma, Italy
Abstract |
---|
Methods. The AIMS2 was translated into Italian and administered to a cohort of 178 outpatients with symptomatic OA of the knee who attended 12 participating rheumatological institutes in northern, central and southern Italy. A random sample of 71 patients was readministered the AIMS2 7 days after the first visit to evaluate the instrument's test-retest reliability. After 6 months, the subjects were asked to return to the institutes for a second administration of the questionnaire.
Results. The internal consistency reliability of each scale score, as estimated by Cronbach's alpha coefficient, was high and indicated that the components of the scale measured the same construct. The items all correlated with each other, but there was no redundancy; this indicates that each domain addressed a somewhat different aspect of functional disability. The test-retest reliability equalled or exceeded 0.80 for eight of the 12 scales. Factor analysis provided a three-factor health status model explaining 63.5% of the variance. The arthritis pain and psychological scales loaded on the first factor, together with the physical scales for mobility level and walking and bending. The upper limb function scales formed the second factor. The third factor was determined by the social dimension. These results demonstrate that the physical health status scales of the AIMS2 are valid, as shown by the significant, moderate to high correlations between the AIMS2 subscales and the majority of the clinical measures.
Conclusion. Our data suggest that, like the original questionnaire, the translated version of AIMS2 is a reliable, consistent and valid instrument for measuring health status and physical functioning in patients with OA of the knee.
KEY WORDS: Osteoarthritis of the knee, Italian Arthritis Impact Measurement Scales 2 (ITALIAN-AIMS2), Health status, Quality of life, Trial methodology
Introduction |
---|
Although there is no international agreement on which to base standardized procedures in future studies or to indicate which domains should be included, health-related quality-of-life (HRQL) measures were strongly recommended for OA trials at the 1996 Conference on Outcome Measures in Arthritis Clinical Trials (OMERACT III) [8]. However, despite the clinical importance of health status in OA, validated generic and disease-specific questionnaires have been employed infrequently in its measurement in the past.
The self-administered Arthritis Impact Measurement Scales (AIMS) questionnaire was one of the first instruments specifically designed for the purpose of assessing HRQL in patients with rheumatic diseases [9]. Its measurement properties are good and it has come to be widely accepted in the English-speaking countries for a variety of uses. Its Italian translation, ITALIAN-AIMS, was the first such instrument in the Italian language [10]. This tool has been shown to be acceptable, reliable and valid [10, 11]. Recently, the original AIMS was revised [12]. The new format was shown to be more comprehensive and acceptable to patients than its predecessor, without loss of reliability or validity.
ITALIAN-AIMS2 is the Italian version of AIMS2. We developed it by making analogous adaptations to its predecessor ITALIAN-AIMS [10], and translating all new items from the original, using methods similar to those used in validation studies for other languages [13]. To ensure that the instrument was reliable, in view of the cultural differences between the various regions of Italy, the study was conducted at 12 rheumatology centres located in different regions of the country. Our objective was to evaluate the reliability, validity and responsiveness of ITALIAN-AIMS2 and to compare the sensitivity of this questionnaire, which is specific for rheumatic diseases, with that of a generic one, the SF-36 (MOS 36-item short-form health status survey). Here, we report the validity and reliability of AIMS2 in a large group of subjects with symptomatic OA of the knee.
A paper reporting the responsiveness of the instrument and a comparison with SF-36 is in progress.
Patients and methods |
---|
ITALIAN-AIMS2
This self-administered questionnaire is a translation of AIMS2. For a detailed description of the AIMS2, see Meenan et al. [12]. Briefly, the AIMS2 is a revised and expanded version of the original AIMS questionnaire [9, 17]. Each of the nine AIMS health status scales (mobility, physical activity, dexterity, household activities, activities of daily living, pain, social activity, depression and anxiety) was revised to contain only four or five items; all scale items were changed to have a standard set of response options (five-point Likert scale); and several of the original scales were renamed. In addition, three new scales were added to assess arm function, work and support from family and friends. The arm function scale included items designed to measure elbow and shoulder motion. In contrast to the other AIMS2 scales, the new work scale could be calculated only for those patients who indicated that they were not unemployed, retired or disabled. The social support scale was added to supplement the social activities scale by measuring the qualitative aspect of social interactions. New sections were also added to evaluate patient satisfaction with health status, the impact of arthritis, and health perceptions. The AIMS2 questionnaire takes about 20 min to complete [12]. The items were translated by two translators aware of the field of application and of the target population of the questionnaire. The emphasis was on finding the best idiomatic translation rather than pure equivalence of vocabulary [18]. A few questionable items were discussed and resolved by three rheumatologists with experience in the use of the instrument in clinical studies, plus one bilingual clinical researcher with a general internal medicine background, and one with additional training in psychometry.
Clinical measures
To test the convergent validity of AIMS2, the following additional clinical measures were applied concurrently. The physician's global assessment was recorded using a 100-mm horizontal visual analogue scale (VAS), anchored at each end, on which 0 represented asymptomatic and 100 represented extremely severe. Similarly, the patient's global assessment, pain and joint stiffness were measured on a 100-mm VAS on which one end (0) represented no problem, no pain or no stiffness and the other end (100) represented extremely severe, maximum imaginable pain or extreme stiffness. Joint pain was also graded using the present pain index (PPI), a six-point categorical scale ranging from 0 = no pain to 5 = excruciating pain. Finally, the Lequesne index of severity for knee osteoarthritis was used [19].
Statistical analysis
All AIMS2 items were recoded such that low scores indicated good health status while high scores indicated poor status. After recoding, each scale score was calculated by summing the scores of the items on the scale. All scale scores were then standardized by a simple mathematical transformation, so that the best possible scale score was 0 and the worst possible score was 10. For each scale, a descriptive analysis was conducted, to examine the distributional properties of the scale itself. We calculated the mean, median, standard deviation (S.D.), skewness and kurtosis. We also calculated the percentage of the sample achieving the lowest (floor effect) and highest (ceiling effect) possible score.
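The scoring steps described above can be sketched in a few lines of Python. This is an illustration only: the text does not give the exact transformation, so the linear rescaling, the assumed item coding of 1 (best) to 5 (worst) and the function names are assumptions rather than the scoring routine actually used in the study.

```python
def standardize_scale(item_scores, min_item=1, max_item=5):
    """Sum recoded item scores and rescale linearly to 0 (best) .. 10 (worst).

    Assumes every item is already recoded so that a higher value means
    poorer health, and that items are coded min_item..max_item.
    """
    n_items = len(item_scores)
    raw = sum(item_scores)
    lowest = n_items * min_item    # best possible raw sum
    highest = n_items * max_item   # worst possible raw sum
    return 10.0 * (raw - lowest) / (highest - lowest)


def floor_ceiling(scale_scores, lowest=0.0, highest=10.0):
    """Return (floor %, ceiling %) following the definition in the text:
    floor = share at the lowest possible score, ceiling = share at the highest.
    """
    n = len(scale_scores)
    floor_pct = 100.0 * sum(1 for s in scale_scores if s == lowest) / n
    ceiling_pct = 100.0 * sum(1 for s in scale_scores if s == highest) / n
    return floor_pct, ceiling_pct


# Hypothetical respondents on a four-item scale (not study data).
respondents = [[1, 1, 1, 1], [3, 3, 3, 3], [5, 5, 5, 5], [1, 1, 1, 1]]
scores = [standardize_scale(r) for r in respondents]
floor_pct, ceiling_pct = floor_ceiling(scores)
```

With this rescaling, a respondent choosing the best option on every item scores 0 and one choosing the worst option on every item scores 10, whatever the number of items in the scale.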
The internal structure and reliability of the scales were evaluated by means of item-internal consistency, item-discriminant validity, internal consistency (Cronbach's alpha coefficient) [20] and test-retest reliability. Item-internal consistency was tested using correlation between each item and its scale corrected for overlap (a correlation corrected for overlap is the correlation of an item with the sum of the other items in the same scale, thus removing the bias of correlating an item with itself). We used a correlation of 0.40 as the standard for supporting item-internal consistency [21]. Item-discriminant validity was tested using correlation between each item and the other scales. Clearly, correlation between each item and its scale must be greater than correlation between the item and the other scales. In fact, it is not sufficient to demonstrate that an item measures what it is supposed to measure: it is also important to determine the extent to which an item measures concepts that it is not supposed to measure, that is, to examine the integrity of the hypothesized grouping of items. We computed Cronbach's alpha coefficient [20] to estimate the internal consistency of each scale score. Measures with reliability of 0.50–0.70 or greater have been recommended for group comparison, while an alpha value >0.90 is required when analysing an individual patient's score [21]. Reproducibility was evaluated using the intraclass correlation coefficient (ICC) [22]; values of ICC vary from 1 (perfectly reliable) to 0 (totally unreliable). Construct validity was assessed by performing principal components factor analysis on individual scales. Factor analysis is a statistical technique used to identify the presence of a relatively small number of underlying latent factors that can be used to represent relations among sets of many variables [12]. An eigenvalue criterion of 1.0 was used to select factors [23].
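For readers who wish to reproduce these reliability statistics, the following Python sketch shows one way to compute Cronbach's alpha, the item-scale correlation corrected for overlap, and a test-retest ICC. It is not the SAS code used in the study. The ICC form is not specified in the text, so a one-way random-effects, single-measurement ICC is assumed here; the function names and the sample data are hypothetical.

```python
from statistics import mean, variance


def _pearson(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent, all from the same scale."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def corrected_item_total(rows, j):
    """Correlate item j with the sum of the OTHER items of its scale,
    which removes the bias of correlating an item with itself."""
    item = [r[j] for r in rows]
    rest = [sum(r) - r[j] for r in rows]
    return _pearson(item, rest)


def icc_oneway(test, retest):
    """One-way random-effects ICC for two administrations (assumed form)."""
    n, k = len(test), 2
    grand = mean(test + retest)
    subj = [(a + b) / 2 for a, b in zip(test, retest)]
    ms_between = k * sum((m - grand) ** 2 for m in subj) / (n - 1)
    ms_within = sum((a - m) ** 2 + (b - m) ** 2
                    for a, b, m in zip(test, retest, subj)) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical item responses and scale scores (not study data).
responses = [[1, 2, 1, 2], [3, 3, 4, 3], [5, 4, 5, 5], [2, 2, 3, 2], [4, 5, 4, 4]]
alpha = cronbach_alpha(responses)
item0_r = corrected_item_total(responses, 0)
icc = icc_oneway([2.0, 4.5, 6.0, 3.5, 8.0], [2.5, 4.0, 6.5, 3.0, 7.5])
```

An item would meet the item-internal consistency standard described above when its corrected correlation is at least 0.40.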
Convergent validity was tested using both internal and external criteria. The external validity of each scale was calculated by correlations (Pearson's r) with the other clinical measures of OA severity; internal validity was calculated by using internal items of the questionnaire itself (problem with the area and priority for improvement). Subgroup analyses by different demographic characteristics were also made, to verify any possible difference among subgroups. Subjects were classified by sex (female vs male), age (up to 65 yr vs >65 yr) and level of school education (up to primary school vs high school or university). For each of these characteristics, comparison between subgroups was made by Student's t-test. All analyses were made using the SAS statistical software (SAS Institute, Cary, NC, USA).
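The two statistics named above can be illustrated with a short Python sketch. The study itself used SAS, so this is only an indication of the calculations involved. It assumes the classical pooled-variance form of Student's t-test for two independent groups and returns the t statistic and degrees of freedom only; obtaining a P value would additionally require the t distribution from a statistics package. Function names and data are hypothetical.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson's r between an AIMS2 scale score and a clinical measure."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def student_t(group_a, group_b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Equal variances in the two
    subgroups are assumed, as in the classical test.
    """
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    se = (pooled * (1 / na + 1 / nb)) ** 0.5
    return (mean(group_a) - mean(group_b)) / se, df


# Hypothetical data (not study data): pain scale scores vs VAS pain in mm.
pain_scale = [2.0, 4.0, 5.5, 7.0, 8.5]
vas_pain = [20, 35, 60, 65, 90]
r = pearson_r(pain_scale, vas_pain)

# Hypothetical subgroup comparison, e.g. female vs male scores on one scale.
t_stat, df = student_t([6.0, 7.0, 5.5, 8.0], [4.0, 5.0, 4.5, 6.0])
```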
Results |
---|
Description of the scales
Table 1 presents estimates of central tendency, dispersion and other important features of the distributions for the AIMS2 scales. Mean scores ranged from 1.12 for self-care to 5.98 for pain and walking and bending. For each scale, the subjects used the full range of response options, as demonstrated by the ranges of the calculated scale scores. The same table also reports skewness and kurtosis and the percentages of the sample showing floor and ceiling effects. While the scales for walking and bending, social activity, pain experience, work and level of tension were moderately skewed, indicating distributions in which more respondents scored among the more negative health states, the scales designed to measure hand and finger function, arm function, self-care and household tasks were positively skewed. As expected, given the characteristics of the disease in this population, the percentages of respondents with minimum (floor) scores were particularly high for hand and finger function (44.5%), arm function (43.9%), self-care (67.1%), household activities (50%) and support from family and friends (42.2%).
Reliability and validity |
---|
Factorial analysis
Each AIMS2 scale was constructed to measure a particular aspect of health, some aspects being strictly physical, some more psychological and some more sociological. To verify the presence of latent dimensions of health, a principal components factor analysis was conducted. This analysis yielded three latent factors (with an eigenvalue >1) explaining 63.5% of the cumulative variance. The lower-limb function scales (mobility level and walking and bending), together with the pain scale and the psychological scales (level of tension and mood), loaded on the first factor, which explained 27% of the total measured variance. The upper limb function scales (hand and finger function, arm function) formed the second factor, which explained 25% of the variance. The third factor (11.5% of the variance) was determined by the social dimension, consisting of the social activities scale and support from family and friends. We excluded the work scale from this analysis, as it was calculated only for subjects who claimed to be employed, a substantially smaller group (95) than the total number of subjects enrolled.
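The link between the eigenvalue criterion and the percentages of explained variance can be made concrete with a small Python sketch. It assumes that each factor's share of variance is its eigenvalue divided by the sum of all eigenvalues, as in a principal components analysis. The eigenvalues below are hypothetical and are not the study's values; only the three percentages in the final lines are taken from the text.

```python
def retain_factors(eigenvalues, threshold=1.0):
    """Keep the factors whose eigenvalue exceeds the threshold (eigenvalue > 1)."""
    return [ev for ev in eigenvalues if ev > threshold]


def explained_percent(eigenvalues, retained):
    """Percentage of total variance carried by each retained factor."""
    total = sum(eigenvalues)
    return [100.0 * ev / total for ev in retained]


# Hypothetical eigenvalues for four variables (NOT the study's values).
eigenvalues = [2.0, 1.2, 0.5, 0.3]
retained = retain_factors(eigenvalues)
shares = explained_percent(eigenvalues, retained)

# Percentages reported in the text for the three retained factors;
# their sum is the cumulative variance explained by the model.
reported = [27.0, 25.0, 11.5]
cumulative_reported = sum(reported)
```

The reported shares of the three retained factors add up to the 63.5% cumulative variance given above.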
Convergent validity
Convergent validity was tested using both internal and external criteria [12].
Validity based on internal criteria.
The results on the validity of the AIMS2 scales, tested using an internal standard based on the subject's attribution of the health status problem area and of the priority area for improvement, are presented in Table 3. In the attribution section of the AIMS2, the number of subjects who indicated that they had a problem in a particular health status area ranged from 32 who had a problem with support from family and friends to 162 who had a problem with arthritis pain. The subjects who designated an AIMS2 scale as a health status problem area consistently had significantly worse scores on that scale. As indicated in Table 3, subjects who designated an area as a priority for improvement had significantly worse health status scale scores in that area than those who did not.
Validity based on external criteria (comparison with other clinical measures).
Table 4 shows the product moment correlations between the AIMS2 scales and the clinical measures, including the patient's and physician's global assessments, pain, joint stiffness and the Lequesne index. The scores for mobility level, walking and bending, pain, work, level of tension and mood were moderately to highly correlated with VAS for patient assessment (r = 0.22–0.59), PPI (r = 0.36–0.50), VAS for pain (r = 0.31–0.55), joint stiffness (r = 0.30–0.43), VAS for physician's assessment (r = 0.28–0.54) and Lequesne index (r = 0.32–0.67). Hand and finger function, arm function, self-care and household tasks were moderately correlated with most measures, while the correlations of social activities were considerably lower. No clinical measures correlated with the support from family and friends.
Relationship with demographic characteristics.
To assess whether demographic characteristics predicted AIMS2 scale scores, subjects were grouped by sex (female vs male), age (<65 vs ≥65 yr) and school education (up to primary school vs high school or more). Comparison between groups was made by Student's t-test, and the results are presented in Table 5. Females had higher scores for mobility level, walking and bending, hand and finger function, pain, work, level of tension and mood than males (P < 0.05), but not for other AIMS2 subscores. The relationship between AIMS2 subscores and school education was further evaluated by dividing patients into groups with a low or high level of education. The differences in mean AIMS2 subscores for each education level were statistically significant for mobility level, arm function, pain, work, level of tension and mood (P < 0.05). No significant difference was observed, however, in the comparison between the two subgroups of patients stratified by age.
Discussion |
---|
Reliability was assessed in terms of item-internal consistency, item-discriminant validity, internal consistency (Cronbach's alpha coefficient) and test-retest reliability (Table 2). Construct validity was assessed by performing factor analysis on individual scales, internal validity was examined by using internal items of the questionnaire itself (problem with the area, and priority for improvement) (Table 3) [12], and convergent validity was tested by assessing the correlations with the other clinical measures of OA severity (Table 4). The findings suggest that AIMS2 is an internally consistent instrument in patients with OA of the knee, Cronbach's alpha coefficient exceeding 0.70 for all scales. The items all correlated with each other, but there was no redundancy; this indicates that each domain addresses a somewhat different aspect of functional disability. The test-retest reliability was ≥0.80 for eight of the 12 scales.
Factor analysis provides a three-factor health status model explaining 63.5% of the variance. The arthritis pain and psychological scales load on the first factor, together with the physical scales for mobility level and walking and bending. This is in accordance with the view that psychological variables strongly influence both pain perception and the degree of functional impairment experienced by patients with symptomatic OA of the knee, as demonstrated by the strong correlation between these scales [25–28]. In particular, variables such as depression and anxiety may decrease pain tolerance and become part of a vicious circle of pain and inactivity [29]. The upper limb function scales formed the second factor. The third factor was determined by the social dimension.
This study also provides some evidence of convergent validity, with significant and mostly stable associations between AIMS2 and the majority of the clinical measures of health status, particularly the Lequesne index. Validity against internal criteria was also supported, following the validation approach of Meenan et al. [12]; this is particularly important because focusing on areas that the individual patient considers a priority for improvement may produce a more sensitive measure of therapeutic benefit [12, 30]. As expected, noteworthy floor effects were observed for the scales for hand and finger function, arm function, self-care, household activities and support from family and friends. A potential approach to eliminating the response floor in AIMS2 is to focus on aspects of health status that are specific to the area of primary interest [30, 31]. The rationale for this approach lies in the potential for increased responsiveness that may result from including only important aspects of health status that are relevant to the patients being studied, in accordance with the above-mentioned concept of functional priority [32]. The resulting noise reduction effect may offer several advantages, including increasing the efficiency of outcome measurement, thus reducing the sample size needed [32].
Female sex and years of formal education have been associated with increased reporting of knee pain in some community studies [33]. We confirm a significant relationship between knee pain severity as assessed by the AIMS2 scale, sex and years of formal education, suggesting that these markers should be included as variables in clinical studies of knee OA. The mechanism by which education influences pain severity is unclear but may be related to enhanced self-efficacy and sense of control, allowing the patient to take advantage of a greater number of pain-reducing modalities. The differences in mean AIMS2 scores for each education level were also statistically significant for mobility level, arm function, work, level of tension and mood (P < 0.05).
In conclusion, the results reported here confirm the reliability and validity of the AIMS2 in patients with OA of the knee. The collection of information on health status using this instrument was acceptable to patients, though unfamiliar to them. A further prospective multicentre study will be necessary to prove the usefulness of the AIMS2 as an outcome measure in OA clinical trials.
Acknowledgments |
---|
Notes |
---|
References |
---|