Oslo City Department of Rheumatology, Diakonhjemmet Hospital, Box 23 Vinderen, N-0319 Oslo, Norway
Physical disability is one of the most widely used outcome measures in clinical studies and is recommended as one of the core measures both in controlled clinical trials [1] and in longitudinal observational studies [2]. Several instruments are available for measuring disability. Most commonly used are the Health Assessment Questionnaire (HAQ) [3] and the Arthritis Impact Measurement Scales (AIMS and AIMS2) [4]. The generic instruments SF-36 [5] and the Nottingham Health Profile [6] are also available. In general, these measures are strongly correlated with each other. In a large and representative sample of RA patients, we found that the SF-36 physical functioning scale correlated with the modified HAQ (MHAQ) (-0.69) and AIMS2 physical (-0.73), and the correlation between MHAQ and AIMS2 physical was 0.85 [7]. However, the scales behave differently with respect to the distribution of scores. There is a major floor effect for MHAQ and a less pronounced floor effect for AIMS2 physical, whereas the scores of SF-36 physical are more evenly distributed along the whole scale, from 0 to 100 [7].
It is desirable to use short questionnaires to save time and increase feasibility, but such advantages may be traded against loss of psychometric properties, especially the detection of treatment change. The eight-item MHAQ has lower sensitivity to change than the full HAQ [8] and the scoring instructions differ. In each of the eight components of the full HAQ, the item with the highest score on a scale from 0 to 3 should be used, and the score should be adjusted to 2 when patients report the use of technical aids or require assistance from another person within that specific functional component. We found, for example, that the average HAQ score in 178 patients with rheumatoid arthritis (RA) was 0.99 when the score was adjusted for the use of technical devices; the HAQ score without adjustment was 0.76 and the MHAQ gave a score of 0.46 [9]. The difference between HAQ and MHAQ scores increases with increasing disability [9].
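The scoring rule described above can be illustrated with a minimal sketch. It assumes that the component score is raised to 2 only when it would otherwise be lower, and that the overall score is the mean of the eight component scores; the function name `haq_score` and the example patient are hypothetical and are not taken from the cited studies.

```python
from statistics import mean


def haq_score(components, adjust_for_aids=True):
    """Return a HAQ-style disability score (0 to 3).

    components: one (item_scores, uses_aid_or_help) pair per functional
    component, where item_scores are integers from 0 to 3.
    Assumption: the adjustment raises a component score to at least 2.
    """
    component_scores = []
    for item_scores, uses_aid_or_help in components:
        score = max(item_scores)  # the highest-scoring item represents the component
        if adjust_for_aids and uses_aid_or_help:
            score = max(score, 2)  # adjustment for technical aids or personal assistance
        component_scores.append(score)
    return mean(component_scores)


# Hypothetical patient: eight components, three of them with aids or assistance.
example_components = [
    ([0, 1], False),
    ([1, 1], True),
    ([0, 0], False),
    ([2, 1], False),
    ([0, 1, 0], True),
    ([3, 2], True),
    ([0, 0, 1], False),
    ([1, 0, 0], False),
]

print(haq_score(example_components, adjust_for_aids=True))   # 1.5
print(haq_score(example_components, adjust_for_aids=False))  # 1.25
```

In this made-up example the adjustment raises the score from 1.25 to 1.5, the same direction of difference as that reported between adjusted and unadjusted HAQ scores [9].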
Another approach, proposed by Pincus et al. [10], included new items of advanced activities of daily living and gave higher scores. This approach may be helpful in overcoming some of the limitations related to the floor effect of the HAQ and MHAQ.
The short form of AIMS2, which lacks about half of the items [11], has been found to perform similarly to the original instrument across the whole range of scores, and the physical scales of the short-form AIMS2, the MHAQ and the SF-36 had similar responsiveness [12].
The HAQ is used most widely in controlled clinical trials and has been shown to discriminate between placebo and active therapy. For longitudinal observational studies, the responsiveness is still to be clarified. Several studies have demonstrated stable development of disability over time, but one study has suggested that the AIMS physical scale shows progression that is not captured by the HAQ [13].
Wolfe [14] has recently published a study of a large number of patients followed for up to 20 yr. He demonstrated a stable course of HAQ scores at a group level but also major individual variation in scores over time. He concluded that the HAQ disability scores are high at disease onset rather than gradually increasing and that the increase over time is very slow, estimated to be 0.03 units/yr on average, which is 1% of the maximum possible disability [14].
Attempts have been made to link the prevalence and incidence of RA to levels of disability as such data may be of special interest to health-care providers and for the planning of health services. In a county in which the prevalence of RA was 0.5%, we demonstrated that about half of the population of RA patients up to 80 yr of age had an MHAQ score exceeding 1.5 (scale 1–4) [15]; such scores may be considered to indicate severe disease and a high risk of reduced life expectancy and future additional disability [16]. In the same community, the incidence of RA with an MHAQ score exceeding 1.5 after 5 yr was 10/100 000; with an MHAQ score exceeding 2 the incidence was 3.5/100 000 and with an AIMS2 physical score exceeding 4 it was 4/100 000 [17].
Several studies have demonstrated moderate correlations of disability scores with age, disease duration, joint pain and other disease activity measures [18, 19], but also with reported depression and anxiety [14]. Smedstad et al. [20] attempted to examine longitudinally whether depression or anxiety led to subsequent disability or whether disability was a stronger predictor of subsequent mental distress. No definite answer could be given, but there was a tendency for disability to be a stronger predictor of subsequent mental distress than vice versa [20].
It has also been found that disability levels are related to educational status and low socioeconomic background as well as to female sex and low income [14]. Brekke et al. [21] recently showed that all self-reported measures, including physical disability, pain and mental distress, differed between areas in Oslo with low socioeconomic status and more affluent regions, whereas objective disease process measures, such as the erythrocyte sedimentation rate and joint count, did not differ between patients in the two areas [21].
The relationship between radiographic damage and disability has attracted special attention, as the traditional paradigm considers arthritis as the cause of structural damage, which in turn leads to disability. Recent studies have, however, demonstrated weak relationships between damage and disability in the first 10 yr of the disease course, whereas after about 10 yr the correlation between damage and function is stronger [22]. It appears that inflammation contributes much more to the level of disability during the first years of the disease course, whereas radiographic progression contributes more strongly after about 10 yr of disease duration [23].
Pincus et al. [24] showed in their original studies that RA patients with different levels of disability had different outcomes with respect to both physical status and survival. These results were later confirmed by a longitudinal analysis of a patient sample from the 1980s [25]. Wolfe et al. [16] found a two-fold increase in mortality rate in patients having moderate disability (HAQ score between 1 and 2) and a four- to five-fold increase in mortality in patients with HAQ scores between 2 and 3.
The major difference between clinical studies and clinical practice is that the former focus on data from patient groups whereas the latter focuses on individual scores recorded over time. This longitudinal pattern of individual scores is used to make therapeutic decisions. However, recent studies have demonstrated much greater intra-individual variation in disability scores over time than previously reported, both in RA [14, 23] and early inflammatory polyarthritis [26]. Using a Bland–Altman approach, Greenwood et al. [27] showed that the limit of agreement, capturing information on measurement errors, was 0.48 points for HAQ score differences over 2 months, which means that a change exceeding this limit is required if it is to be regarded with confidence as clinically important. The size and direction of the change in HAQ score were not related to the position on the scale. However, other clinical measures probably have measurement errors of the same magnitude, and the interpretation is that no measure can be used in isolation and that clinicians need to understand the limitations of the measurement tools they use [28].
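As an illustration of how limits of agreement are usually obtained, the sketch below computes the conventional 95% limits as the mean difference plus or minus 1.96 standard deviations of the paired differences. Whether Greenwood et al. [27] used exactly this formulation is an assumption; the paired scores are invented, and `limits_of_agreement` and `exceeds_measurement_error` are hypothetical helper names. Only the 0.48-point threshold comes from the text.

```python
from statistics import mean, stdev


def limits_of_agreement(first, second, z=1.96):
    """Return (lower, upper) limits of agreement for paired measurements."""
    diffs = [b - a for a, b in zip(first, second)]
    bias = mean(diffs)             # systematic difference between the two occasions
    half_width = z * stdev(diffs)  # spread of the differences (sample SD)
    return bias - half_width, bias + half_width


def exceeds_measurement_error(change, limit=0.48):
    """True if an individual change is larger than the reported limit of agreement."""
    return abs(change) > limit


# Invented HAQ scores for four stable patients measured twice.
baseline = [1.0, 1.5, 0.5, 2.0]
followup = [1.25, 1.25, 0.75, 1.75]

lower, upper = limits_of_agreement(baseline, followup)
print(round(lower, 2), round(upper, 2))

print(exceeds_measurement_error(0.5))   # True
print(exceeds_measurement_error(-0.3))  # False
```

Under this reading, a patient whose HAQ score changes by 0.3 points in either direction has not changed by more than the measurement error alone could explain.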
In conclusion, several instruments are available for measuring disability, but there is still no optimal, short instrument with good psychometric properties. The instruments currently available have three main limitations. Firstly, they are strongly influenced by factors such as socioeconomic status, gender and mental state, indicating that trait phenomena have a large effect on self-reported disability scores. Secondly, the scores progress slowly, which may indicate that patients adapt to the instruments in some way. Thirdly, despite their obvious usefulness in clinical studies, disability measures in isolation are of limited value in assessing the disease course in individual patients.
References