The mechanical joint score: a new clinical index of joint damage in rheumatoid arthritis

A. H. Johnson1, A. B. Hassell, P. W. Jones2, D. L. Mattey, J. Saklatvala and P. T. Dawes

Staffordshire Rheumatology Centre, Stoke-on-Trent ST6 7AG,
1 Leeds General Infirmary, Leeds LS1 3EX and
2 Department of Mathematics, Keele University, Stoke-on-Trent ST5 5BG, UK


    Abstract
 
Objectives. To evaluate the mechanical joint score (MJS) in terms of its reliability between observers and over time, its ease of use and its relationship with conventional measures of rheumatoid arthritis (RA) disease activity, severity and functional outcome.

Methods. The MJS was evaluated in 103 patients with reference to the following joints: total proximal interphalangeal (PIP) joints, total metacarpophalangeal (MCP) joints, wrists, elbows, shoulders, hips, knees, ankles and total metatarsophalangeal (MTP) joints. The score was based on the appearance of the joints on a scale of 0–3, 0 representing no abnormality and 3 severe abnormality or previous surgery. The MJS was evaluated in terms of its intra- and inter-observer variability and its content, construct and criterion validities. A subset of 29 patients were re-evaluated after 5 yr to examine change in MJS over time.

Results. The MJS performed well in terms of inter-observer and intra-observer reliability. The MJS showed strong correlation with the Larsen X-ray score of hands and feet (Spearman correlation coefficient 0.74) and with the modified Health Assessment Questionnaire (Spearman correlation coefficient 0.56) and only weak correlation with indices of disease activity, such as the Ritchie index and erythrocyte sedimentation rate. The MJS showed highly significant positive change over time.

Conclusion. The MJS is a reliable clinical index of joint damage and may be a useful new outcome measure in RA.

KEY WORDS: Mechanical joint score, Rheumatoid arthritis, Outcome measure, Joint damage.


    Introduction
 
Rheumatoid arthritis (RA) is a chronic and heterogeneous disease characterized by synovial inflammation (activity) leading to progressive joint damage. There may be considerable variation in the course of the disease, some patients having only mild joint damage after many years of disease while others develop gross abnormalities leading to serious functional impairment. In such a heterogeneous condition there is a need for indices and measures which can be applied easily and reliably to the majority of patients in order to quantify the extent of disease involvement. Not only are these indices important in the context of clinical trials, but there is also a need for simple measures of disease outcome in clinical practice, particularly in the current age of evidence-based medicine and clinical effectiveness. While there are a number of validated clinical and laboratory indices of disease activity in RA, there are few clinical measures of joint damage.

Radiographic evaluation of joint damage using a scoring system is probably the most direct and objective measure, but is limited in terms of joints assessed, dependence on X-rays and lack of a functional component [1]. Moreover, the aggregate radiological score is a simplification of quite complex data. Different methods of scoring radiological damage have been developed, with the emphasis on different features of rheumatoid joint pathology [1–4], which may vary in their ability to differentiate between erosive change, joint space narrowing and secondary degenerative changes [5, 6] as well as in their reproducibility and sensitivity to change [7, 8].

While it is clear that the assessment of radiological damage can be useful, it does not fully reflect the biological outcome of the disease, being primarily a measure of cartilage and bone damage and not of damage to other tissues and organs [1]. In the joint itself, damage to tendons, ligaments and soft tissue, together with neurological changes and muscle wasting, will all be important in terms of biological and functional outcome. These additional factors contribute to the discrepancy between radiological damage and functional ability [9].

Outcome measures which include functional ability [10] or health assessment scores [11, 12] are useful as they measure the patient's perceived disability, but they are influenced by confounding factors, including age, sex, pain perception, neuromuscular power, language and cultural differences [13, 14].

In order to try to provide a clinical measure which better reflects the overall biological outcome, we have devised a simple index, the mechanical joint score (MJS), to assess the total amount of joint damage and impairment of mechanical function in patients with RA. We investigated the reproducibility of this index and its relationship to other measures of joint damage and disease activity.


    Patients and methods
 
Patients
The MJS was evaluated in 103 patients who were participating in a long-term study of RA outcome and who have been described in detail [15]. These were patients who fulfilled the 1987 ARA criteria for definite or classical RA [16] and were recruited consecutively to a clinic to examine the effect of slow-acting anti-rheumatic drugs (SAARDs). In a subgroup of 29 patients (those still regularly attending SAARD monitoring clinics), MJS was re-evaluated 5 yr later.

The MJS was evaluated with reference to the following joints: total proximal interphalangeal (PIP) joints, total metacarpophalangeal (MCP) joints, wrists, elbows, shoulders, hips, knees, ankles and total metatarsophalangeal (MTP) joints, i.e. a total of 18 joints or sets of joints (nine on each side of the body). Joints were scored 0–3 according to their appearance.

Examination
In each hand the PIPs and MCPs were examined by observing the patient making a full fist and then fully extending the fingers (Fig. 1). Wrists were examined using the ‘prayer’ and ‘inverse prayer’ manoeuvres (Fig. 1). Each elbow was examined by bending and straightening it (Fig. 1).



 
FIG. 1.  Examination of the PIPs and MCPs (a), the wrist (b) and the elbow (c).

 
The shoulder was examined by asking the patient to place their hand behind their head with their arms pointing laterally; internal rotation of the shoulder was tested by asking the patient to put their hand behind their back and touch their shoulder blade (Fig. 2). Hip abduction was examined by using the examiner's hand to fix the pelvis; hip rotation was examined with the patient's legs held extended (Fig. 3). Flexion and extension of the knee were examined with the examiner's hand on the knee to feel for joint instability (Fig. 4a).



 
FIG. 2.  Examination of the shoulder.

 


 
FIG. 3.  Examination of the hip.

 


 
FIG. 4.  Examination of the knees (a) and feet (b).

 
Dorsiflexion, plantar flexion, inversion and eversion of the ankle joint were examined (Fig. 4b). Finally, in examining the MTPs the feet were inspected for gross deformity and palpated for subluxation and callus formation (Fig. 4b).

Scoring
When scoring each joint or set of joints, 0 was taken to mean no abnormality but was also the score given if any joint was absent for any reason or if the joint deformity was congenital in origin.

A score of 1 represented possible or minor abnormality; it was the score given if there was a slight resting deformity or if the reduction in the range of joint movement was less than 20%. A score of 2 represented definite or moderate abnormality; i.e. a definite resting deformity or a moderate reduction in the range of joint movement (20–40%). A score of 3 indicated severe abnormality or bony surgery. Total PIPs, MCPs and MTPs of each hand/foot were scored as one joint.

The final score was calculated by summing the scores for the individual joints or sets of joints, giving a minimum score of 0 and a maximum of 54.
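The scoring rules above can be summarized in a short sketch. All function and variable names are illustrative and not part of the original study. The sketch covers only the range-of-movement criterion (resting deformity is left to clinical judgement), and the rule that a reduction of more than 40% counts as severe is an assumption, since the text gives explicit thresholds only for scores 1 and 2.

```python
from typing import Dict, Tuple

# The nine joint areas named in the text, each assessed on both sides (18 sites).
JOINT_AREAS = ("PIP", "MCP", "wrist", "elbow", "shoulder",
               "hip", "knee", "ankle", "MTP")
SIDES = ("left", "right")


def score_site(range_loss_pct: float, bony_surgery: bool = False,
               absent_or_congenital: bool = False) -> int:
    """Return a 0-3 score for one joint or set of joints (illustrative only)."""
    if absent_or_congenital:
        return 0  # absent joints and congenital deformities score 0
    if bony_surgery:
        return 3  # previous bony surgery scores 3
    if range_loss_pct <= 0:
        return 0  # no abnormality
    if range_loss_pct < 20:
        return 1  # possible or minor abnormality
    if range_loss_pct <= 40:
        return 2  # definite or moderate abnormality
    return 3      # ASSUMPTION: a loss above 40% is treated as severe


def total_mjs(site_scores: Dict[Tuple[str, str], int]) -> int:
    """Sum the 18 site scores; the result lies between 0 and 54."""
    expected = {(area, side) for area in JOINT_AREAS for side in SIDES}
    if set(site_scores) != expected:
        raise ValueError("scores are needed for all 18 joints or sets of joints")
    if any(s not in (0, 1, 2, 3) for s in site_scores.values()):
        raise ValueError("each site score must be 0, 1, 2 or 3")
    return sum(site_scores.values())
```

For example, a patient scoring 3 at every site would reach the maximum of 18 × 3 = 54, and a patient with no abnormality would score 0.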

Clinical assessments
In all 103 patients, the modified Stanford Health Assessment Questionnaire (HAQ) [12] was completed and posterior–anterior radiographs of both hands and anterior–posterior radiographs of both feet were taken. All films were scored by one observer (JS) using the method of Larsen et al. [1, 5]. The erythrocyte sedimentation rate (ESR) and the Ritchie articular index [17] were measured. In 56 of the patients the overall status in RA (OSRA) was measured [18]. The OSRA consists of four parts: demographic details, activity score, damage score and treatment category. When 29 of the patients were re-examined, the HAQ was also re-evaluated.

Reliability assessments
The reliability of the MJS was analysed in two ways. Inter-observer reproducibility was assessed by two rheumatologists (ABH and PTD) independently; they examined 24 patients with RA (15 in-patients and nine out-patients), recording the MJS on a pro forma. ABH was the first examiner for 16 of the patients and PTD examined eight first. The second examiner assessed all patients within 20 min of the first assessment. Each examiner was blinded to the other's scores.

Intra-observer reproducibility was assessed by AHJ, who examined 15 patients (10 rheumatology day-ward patients and five in-patients) on two separate occasions 6–8 h apart. The MJS was again recorded on a pro forma.

Statistical analysis
Data were analysed using a statistical software package (NCSS; Number Cruncher Statistical System, version 5.01) (NCSS Statistical Software, Kaysville, UT, USA). In the inter- and intra-observer reliability data, the two sets of scores for each patient were examined joint by joint. The distribution of the paired differences in total scores was approximately normal and a paired t-test was used for comparison. Agreement between total scores was assessed using the method of Bland and Altman [19].

The data comparing the MJS with other outcome measures were non-normal and relationships were investigated using Spearman rank partial correlation.
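As an illustration of what a Spearman rank partial correlation involves, the sketch below ranks each variable (averaging tied ranks), computes Pearson correlations on the ranks, and then removes the effect of a third variable with the standard first-order partial correlation formula. The paper does not state the formula used by the software, so this is an assumed, textbook version; the function names and any data passed to them are hypothetical.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def spearman_partial(x: Sequence[float], y: Sequence[float],
                     z: Sequence[float]) -> float:
    """Correlation of x and y after removing the (rank) effect of z."""
    r_xy, r_xz, r_yz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

In the setting of this study, `x` might hold the MJS, `y` another outcome measure such as the Larsen score, and `z` the disease duration.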

The longitudinal data were evaluated using the Wilcoxon matched-pairs signed-rank test.


    Results
 
The MJS took approximately 5 min to complete.

Patient demographics
These are shown in Table 1.


 
TABLE 1.  Demographic data of the patients

 

Reliability
Following Bland and Altman [19], all the observed differences in total scores (both inter- and intra-observer) were within 2 S.D. of the mean difference, indicating acceptable reproducibility (Fig. 5).



 
FIG. 5.  Inter-observer (a) and intra-observer (b) reliabilities demonstrated by strong correlations between total scores in individual patients. (c) Differences in total scores plotted against the average scores for inter-observer reliability. The mean difference in scores is -0.46, the S.D. is 3.82 and the limits of agreement are -8.1 and 7.18 (mean±2 S.D.). In each case, all observed differences in scores are within 2 S.D. of the mean difference (dotted line). (d) Differences in total scores plotted against the average scores for intra-observer variability. The mean difference in scores is -0.8, the S.D. is 2.31 and the limits of agreement are -5.41 and 3.81 (mean±2 S.D.). In each case, all the observed differences in scores are within 2 S.D. of the mean difference (dotted line).
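The limits of agreement quoted in the legend of Fig. 5 follow from the mean and S.D. of the paired differences as mean ± 2 S.D. A minimal sketch of that calculation is given below; the function names are illustrative, and the use of the sample S.D. (n − 1 denominator) is an assumption, as the paper does not say which form was used.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def limits_from_summary(mean_diff: float, sd_diff: float) -> Tuple[float, float]:
    """Lower and upper limits of agreement: mean difference -/+ 2 S.D."""
    return mean_diff - 2 * sd_diff, mean_diff + 2 * sd_diff


def limits_of_agreement(first: Sequence[float],
                        second: Sequence[float]) -> Tuple[float, float, float, float]:
    """Return (mean difference, S.D., lower limit, upper limit) for paired scores."""
    if len(first) != len(second):
        raise ValueError("the two sets of scores must be paired")
    diffs = [a - b for a, b in zip(first, second)]
    m = mean(diffs)
    s = stdev(diffs)  # ASSUMPTION: sample S.D. with n - 1 in the denominator
    lower, upper = limits_from_summary(m, s)
    return m, s, lower, upper


if __name__ == "__main__":
    # Summary values taken from the legend of Fig. 5 (inter-observer data).
    low, high = limits_from_summary(-0.46, 3.82)
    print(f"{low:.2f} {high:.2f}")
```

With the inter-observer summary values (mean −0.46, S.D. 3.82) this gives −8.10 and 7.18, matching the legend. With the intra-observer values (mean −0.8, S.D. 2.31) it gives −5.42 and 3.82, which differs from the published limits in the last digit, presumably because the published mean and S.D. are themselves rounded.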

 
Twenty-four patients were examined for inter-observer error. There was a mean total score of 21.5 (S.D. 13.6); the minimum score was 1, the maximum score was 45 and the interquartile range was 25 (10.25–35.25). Of the 427 joints or sets of joints examined, 277 (65%) showed agreement in the scores, 137 (32%) showed a difference in the scores of 1 and 13 (3%) showed a difference in the scores of 2. Interestingly, six of the 13 joints with a difference of 2 between the two scores were ankle joints.

Fifteen patients were examined for intra-observer error. There was a mean total score of 21.9 (S.D. 12.6); the minimum score was 2, the maximum score was 41 and the interquartile range was 24 (13–35). Of the 285 joints or sets of joints that were examined, 197 (69%) showed agreement in scores, 87 (31%) showed a difference of 1 and one (ankle) joint showed a difference of 2.
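The joint-by-joint agreement figures above amount to tallying the absolute difference between the two scores at each site and expressing each tally as a percentage. The sketch below shows one way to do this; the function names are hypothetical, and the only data used are the counts reported in the text.

```python
from collections import Counter
from typing import Dict, Sequence


def tally_differences(first: Sequence[int], second: Sequence[int]) -> Counter:
    """Count how many paired site scores differ by 0, 1, 2 or 3 points."""
    if len(first) != len(second):
        raise ValueError("the two sets of site scores must be paired")
    return Counter(abs(a - b) for a, b in zip(first, second))


def percentages(counts: Dict[int, int]) -> Dict[int, int]:
    """Convert difference counts to whole-number percentages of all sites."""
    total = sum(counts.values())
    return {diff: round(100 * n / total) for diff, n in sorted(counts.items())}


if __name__ == "__main__":
    # Counts reported in the text for the inter- and intra-observer comparisons.
    print(percentages({0: 277, 1: 137, 2: 13}))
    print(percentages({0: 197, 1: 87, 2: 1}))
```

Applied to the reported inter-observer counts this reproduces the 65%, 32% and 3% quoted above.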

Relationship of MJS to other measures of disease activity and severity
There was a highly significant correlation between the MJS and disease duration (r=0.7, P<0.001). To investigate the presence of a relationship independent of disease duration, variables were corrected for disease duration.

There was a highly significant correlation between the MJS and the Larsen index (r=0.74, P<0.001) (Fig. 6) and the damage score of OSRA (r=0.68, P<0.001) (Fig. 7). There was also a strong relationship with the HAQ (r=0.56, P<0.001) (Fig. 8).



 
FIG. 6.  Correlation between the mechanical joint score and the Larsen score.

 


 
FIG. 7.  Correlation between the mechanical joint score and the damage score of OSRA.

 


 
FIG. 8.  Correlation table comparing the mechanical joint score with other indices. D(OSRA) is the damage score of OSRA; A(OSRA) is the activity score of OSRA.

 
The relationship between the MJS and the Ritchie articular index was weaker (r=0.29, P<0.01), as was the correlation with the ESR (r=0.25, P<0.01). There was no significant correlation between the MJS and the activity score of the OSRA (r=0.15, not significant) (Fig. 8).

Longitudinal data
Re-evaluation of the MJS and HAQ in 29 of the original cohort showed a significant change in both indices (P<0.001). The mean MJS at time 0 was 21 (S.D. 10). At time 0 plus 5 yr the mean MJS was 35 (S.D. 13). The mean difference in MJS was +14 (S.D. 7.3). The mean HAQ at time 0 was 1.5 (S.D. 0.9). At time 0 plus 5 yr the mean HAQ was 2 (S.D. 0.6). The mean difference in HAQ was +0.5 (S.D. 0.75). No significant correlation was observed between change in HAQ and change in MJS (Spearman's r=0.1023, P=0.6).
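For readers unfamiliar with the Wilcoxon matched-pairs signed-rank test used for these longitudinal comparisons, the sketch below computes its rank sums for paired baseline and follow-up scores. It is an illustration only: the function name is hypothetical, zero differences are dropped and tied absolute differences share an averaged rank (a common convention, assumed here), and no P value is calculated. The patient-level data from the study are not reproduced in the paper, so any input is the reader's own.

```python
from typing import Sequence, Tuple


def signed_rank_sums(before: Sequence[float],
                     after: Sequence[float]) -> Tuple[float, float]:
    """Return (W+, W-): rank sums of positive and negative paired changes."""
    if len(before) != len(after):
        raise ValueError("baseline and follow-up scores must be paired")
    # Change from baseline; pairs with no change are excluded.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return float(w_plus), float(w_minus)
```

The smaller of the two rank sums is the usual test statistic; a value close to zero, as would be expected when nearly every patient's MJS rises, corresponds to a small P value.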


    Discussion
 
Assessment of the outcome in RA is a complex and contentious area. To the patient with RA, the degree of functional disability and interference with daily life, i.e. the handicap, will be the most important disease outcome. For the clinician, the outcome measure used most frequently is an objective radiological score of joint damage.

Fuchs et al. [20] showed that quantitative radiographic scores were correlated highly with joint scores for limitation of motion. It is known that, on the whole, radiographic indices are significantly correlated with self-report measures of functional status and this is borne out in our study (Larsen score vs HAQ, r=0.41, P<0.001). The aim of our study, however, was to propose the use of a new, clinical index of joint damage and functional impairment which can be used instead of or in addition to radiographic assessment and self-report measures.

We have validated our score in terms of face, construct, content and criterion validities [21]. With respect to face validity, the index reflects a typical clinical assessment of the joints of someone with RA.

In the context of construct validity, the measure makes biological sense as we are measuring the clinical end-point of joint and soft tissue damage. The measure also agrees with expected results in terms of its relationship to disease duration and to other measures of joint damage and function. Content validity requires that outcomes sample multiple domains of RA improvement. This is only partly relevant to the MJS as we aimed specifically to assess joint damage, but using the MJS a large number of joints covering most of the important functional areas involved are assessed, in contrast with, for example, the commonly adopted Larsen score of hands and feet.

Criterion validity requires that outcomes predict or correlate with gold standard measures of RA outcome. We believe that, while there is not a gold standard clinical measure of joint damage, there is a gold standard radiological measure of joint damage. In this respect, the MJS has a strong relationship with the Larsen score of the hands and feet (r=0.74, P<0.001). The HAQ is currently the most widely used index of joint function and may be regarded as something of a gold standard. Both joint damage and joint inflammation affect the HAQ (i.e. there is a degree of reversibility). The MJS has been found to correlate strongly with the HAQ, again supporting its construct and criterion validity.

With respect to discriminant validity or sensitivity to change, thus far we have only limited data. We have clear evidence of a rising score over time but more work needs to be done to examine shorter-term changes and the smallest meaningful change.

The mechanical joint score has been found to be reliable, with good inter- and intra-observer reliability. The major advantage of the mechanical joint score over the Larsen score and the HAQ is that it is a clinical index that can be performed swiftly and objectively by the clinician with no recourse to X-rays or questionnaires. It also has the advantage over radiographic scores of reflecting damage to periarticular structures. This may explain why the MJS correlates more strongly with both the Larsen score (r=0.74) and the HAQ (r=0.56) than the Larsen score correlates with the HAQ (r=0.41). This lends further support to the use of the mechanical joint score as an outcome measure in addition to the Larsen score and the HAQ.

The mechanical joint score exhibits weak correlation with indicators of disease activity. This is in keeping with observations that, at a single point in time, radiographic scores were not correlated at all with joint tenderness scores [20] and that single measurements of disease activity do not predict radiological or functional outcome [15]. Common sense dictates, however, that a clinical index of joint damage and function should be associated to some degree with indices of joint tenderness, and this is borne out in the stronger correlation between the mechanical joint score and the Ritchie articular index (r=0.29, P<0.01) than between the mechanical joint score and other measures of disease activity.

Longitudinal data also support the validity of the MJS by showing a highly significant positive change in the score over time. A significant correlation between the change in HAQ and the change in MJS was not seen. A possible explanation is that HAQ is affected by both reversible inflammatory joint disease and by irreversible joint damage. The MJS, on the other hand, should reflect only the latter. Thus, whereas HAQ improves for some individuals, the MJS would not, and this indeed was our observation.

Good outcome measures are essential in a heterogeneous condition like RA. In this age of evidence-based medicine, there is increasing emphasis on well-constructed clinical trials to assess the efficacy of therapeutic interventions. In most trials of drug interventions, the gold standard for measuring outcome has been the assessment of radiological damage. This is not without drawbacks in terms of time, cost and the repeated exposure of patients to doses of radiation, albeit small doses. It is neither desirable nor feasible to X-ray all potentially affected joints regularly. Moreover, routine scoring of X-rays is not practical for the vast majority of rheumatology units, so that X-ray scoring is not a feasible outcome for everyday clinical practice.

For patients with RA, the most important outcome of any intervention is the preservation of or improvement in function. Scott et al. [22] showed that, in a group of patients treated with disease-modifying drugs over 10 yr, there was a discrepancy between deterioration in radiological features and improvements in functional capacity. Health status questionnaires which include self-reporting of functional ability are sensitive to drug-related improvements in patients treated with disease-modifying drugs [23]. Radiologically apparent changes in bony architecture take time to develop. In the assessment of short-term clinical changes in arthritis, e.g. when comparing the efficacy of anti-inflammatory drugs, indices of functional status are sensitive outcome measures [24]. A drawback of all self-reported measures of functional status, however well designed, is their intrinsic subjectivity and their openness to a number of confounding variables.

The MJS is potentially valuable in that it correlates strongly with accepted gold standard radiographic measures of damage and also with questionnaire-based measures of function. The MJS is readily performed in the clinical setting and has the advantages that all joints can be included in the assessment and that the assessment can be repeated as frequently as necessary. While it is not suggested that the mechanical joint score will replace X-rays or the HAQ score, we believe that such an index is a valuable new clinical outcome measure.


    Acknowledgments
 
We thank K. Brailsford-Atkinson for preparing the drawings.


    Notes
 
Correspondence to: A. B. Hassell, Staffordshire Rheumatology Centre, Stoke-on-Trent ST6 7AG, UK.


    References
 

  1. Larsen A, Dale K, Eek M. Radiographic evaluation of rheumatoid arthritis and related conditions by standard reference films. Acta Radiol Diagn 1977;18:481–91.
  2. Sharp JT, Lidsky MD, Collins LC, Moreland J. Methods of scoring the progression of radiological changes in rheumatoid arthritis. Arthritis Rheum 1971;17:706–20.
  3. Amos RS, Constable TJ, Crockson RA, Crockson AP, McConkey B. Rheumatoid arthritis: relation of serum C-reactive protein and erythrocyte sedimentation rates to radiographic changes. Br Med J 1977;1:195–7.
  4. Empire Rheumatism Council. Gold therapy in rheumatoid arthritis. Final report of a multicentre controlled trial. Ann Rheum Dis 1961;20:315–34.
  5. Larsen A, Dale K. Standardised radiological evaluation of rheumatoid arthritis in therapeutic trials. In: Dumonde DC, Jasani MK, eds. Recognition of anti-rheumatic drugs. Lancaster, UK: MTP Press, 1978:285–92.
  6. Sharp JT, Young DY, Bluhm GB et al. How many joints in the hands and wrists should be included in a score of the radiological abnormalities used to assess rheumatoid arthritis? Arthritis Rheum 1985;28:1326–35.
  7. Grindulis KA, Scott DL, Struthers GR. The assessment of radiological changes in the hands and wrists in rheumatoid arthritis. Rheumatol Int 1983;3:39–42.
  8. Kellgren JH. Radiological signs of rheumatoid arthritis, a study of observer differences in the reading of hand films. Ann Rheum Dis 1956;15:55–60.
  9. Scott DL, Coulton BL, Bacon PA, Popert AJ. Methods of x-ray assessment in rheumatoid arthritis: a re-evaluation. Br J Rheumatol 1985;24:31–9.
  10. Steinbroker O, Traeger CH, Batterman RC. Therapeutic criteria in rheumatoid arthritis. J Am Med Assoc 1949;140:659–62.
  11. Fries J, Spitz P, Kraines G, Holman H. Measurement of patient outcome in rheumatoid arthritis. Arthritis Rheum 1980;23:137–52.
  12. Kirwan JR, Reeback JS. Stanford Health Assessment Questionnaire modified to assess disability in British patients with rheumatoid arthritis. Br J Rheumatol 1986;25:206–9.
  13. Carr AJ, Thompson PW. Towards a measure of patient-perceived handicap in rheumatoid arthritis. Br J Rheumatol 1994;33:378–83.
  14. Thompson PW, Pegley FS. A comparison of disability measured by the Stanford Health Assessment Questionnaire disability scales (HAQ) in male and female rheumatoid outpatients. Br J Rheumatol 1991;30:298–300.
  15. Hassell AB, Davis MJ, Fowler PD et al. The relationship between serial measures of disease activity and outcome in rheumatoid arthritis. Q J Med 1993;86:601–7.
  16. Arnett FC, Edworthy SM, Bloch DA et al. The American Rheumatism Association 1987 revised criteria for the classification of rheumatoid arthritis. Arthritis Rheum 1988;31:315–24.
  17. Ritchie DM, Boyle JA, McInnes JM et al. Clinical studies with an articular index for the assessment of joint tenderness in patients with rheumatoid arthritis. Q J Med 1968;37:393–406.
  18. Symmons DPM, Hassell AB, Gunatillaka KAN, Jones PW, Schollum J, Dawes PT. Development and preliminary assessment of a simple measure of overall status in rheumatoid arthritis (OSRA) for routine clinical use. Q J Med 1995;88:429–37.
  19. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986;i:307–11.
  20. Fuchs HA, Callaghan LF, Kaye JJ et al. Radiographic and joint count finding of the hand in rheumatoid arthritis. Arthritis Rheum 1988;31:44–51.
  21. Tugwell P, Bombardier C. A methodological framework for developing and selecting end points in clinical trials. J Rheumatol 1982;9:758–62.
  22. Scott DL, Grindulis KA, Struthers GR, Coulton BL, Popert AJ, Bacon PA. Progression of radiological changes in rheumatoid arthritis. Ann Rheum Dis 1984;43:8–17.
  23. Meenan RF, Anderson JJ, Kazis LE et al. Outcome assessment in clinical trials: evidence for the sensitivity of a health status measure. Arthritis Rheum 1984;27:1344–52.
  24. Anderson JJ, Firschein HE, Meenan RF. Sensitivity of a health status measure to short-term clinical changes in arthritis. Arthritis Rheum 1989;32:844–50.
Submitted 17 August 1999; Accepted 31 August 2001