Department of Medicine, The University of Western Ontario, Canada and UCLA School of Medicine, Los Angeles, California, USA.
Correspondence to: J. E. Pope, Rheumatology Centre, St. Joseph's Health Care London, 268 Grosvenor Street, Box 5777, London ON N6A 4V2, Canada. E-mail: janet.pope{at}sjhc.london.on.ca
Abstract
Methods. We used the raw data from two randomized controlled trials (RCTs) in early diffuse scleroderma: methotrexate (Pope et al.) and D-penicillamine (Clements et al.). Subjects in the methotrexate trial were divided into the following groups: (1) those with at least 20% improvement in the primary outcome measurements [patient global assessment, physician global assessment, UCLA skin tethering score, modified Rodnan skin score (MRSS), DLCO as % predicted and HAQ disability] at 1 yr vs (2) the others. Baseline factors (including age, gender, skin scores, physician and patient global assessments, HAQ disability and pain scores, DLCO and physical parameters) were analysed to find baseline variables strongly correlated with later improvement. These variables were explored in the D-penicillamine trial to determine if (in a separate trial) they were still predictive of improved outcome at 1 and 2 yr. Adjusted models were used to find baseline predictors of good outcome. The median HAQ-DI was 1.3 (methotrexate) and 1.0 (D-penicillamine).
Results. A baseline HAQ disability score of less than the median was predictive of at least a 20% improvement at 1 and 2 yr with odds ratios of 1.77 to 5.05, in four of the five outcome measurements (in both groups); with strongly significant P values for 3 of 5 outcomes (UCLA skin score, MRSS, patient global skin score; P<0.02) from the methotrexate study group. These three outcomes were strongly correlated with improvement (r between 0.25 and 0.35). Although data from the D-penicillamine trial were less convincing, in both trials the less than median HAQ-DI and HAQ pain scores showed a stronger association with improved outcome, more so than age, gender, skin score and baseline global assessment.
Conclusion. A low baseline HAQ (defined as less than the median HAQ score) is predictive of improved outcome in diffuse scleroderma at 1 and 2 yr.
KEY WORDS: Early diffuse scleroderma, HAQ, Prognosis, Good outcome.
Introduction
To date, there have been no significant factors to help the clinician assess which patients should do better than others. The purpose of this study was to try to determine what factors were predictive of good outcome in early diffuse scleroderma.
The health assessment questionnaire disability index (HAQ-DI) is a standardized disability questionnaire that was initially developed for use in rheumatoid arthritis (RA) [5]. A high HAQ-DI score has been shown to be a strong predictor of morbidity and mortality in RA, and low HAQ-DI scores are predictive of better outcomes [6, 7]. The importance of the HAQ-DI as it pertains to SSc has been illustrated by analyses of data from the D-penicillamine trial [8, 9]. High and/or worsening HAQ-DI and skin scores are associated with increased mortality and morbidity [8].
Using data from two SSc trials of reasonable size (high-dose vs low-dose D-penicillamine [10] and methotrexate vs placebo trials [11]) we identified clinical characteristics in early diffuse SSc that might predict (or correlate with) improved status of patients at 1 and 2 yr of follow-up. Thus, for the purposes of this study, all subjects who completed the trials or had final assessment data were analysed.
Patients and methods
There were several differences in the two studies: (1) The UCLA skin tethering score (10 skin regions, scored from 0 to 3 for skin tethering, for a maximum score of 30, [13]) and an early modification of the Rodnan skin thickness score (MRSS) (26 skin regions, scored from 0 to 3 for skin thickening, for a maximum score of 78, [14]) were used in the methotrexate trial to evaluate the degree and extent of SSc skin involvement. A later modification of the MRSS (17 skin regions, scored 0 to 3 for skin thickening, for a maximum score of 51, [15]) was used in the D-penicillamine trial to evaluate the degree and extent of SSc skin involvement. Since the MRSS using 26 and the MRSS using 17 skin regions have similar performance characteristics, they are both called MRSS and analysed similarly in studying percentage change from baseline. (2) A HAQ-DI was used in both studies (0 to 3 scale). In addition to the HAQ-DI, the methotrexate trial also used a HAQ-pain index [0 to 3.0 scale from a 15-cm visual analogue scale (VAS)]. (3) A 10-cm VAS was used in the methotrexate trial to document the physician's global assessment of the patient's disease, whereas a 7-point Likert scale [assessment of whether the patient improved (3 grades), worsened (3 grades) or did not change from baseline] was used in the D-penicillamine trial. (4) A 10-cm VAS was used in the methotrexate trial to document the patient's global assessment. No patient global assessment was performed in the D-penicillamine trial. (5) The length of the two trials differed: the methotrexate trial was 1 yr while the D-penicillamine trial was 2 yr (with data also available at 1 yr in those patients who completed 2 yr). (6) The drop-out rates also differed in the two trials: 34% (24/71) dropped out over 1 yr in the methotrexate trial while 49% (66/134) dropped out over 2 yr in the D-penicillamine trial. Table 1 compares the two trials.
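The arithmetic behind these comparisons can be checked with a short Python sketch. It is illustrative only: the function names are hypothetical and the example skin scores are made up, not taken from either trial. It shows that the quoted maximum scores are regions multiplied by the top grade of 3, that percentage change from baseline puts the 26-region and 17-region MRSS on a common footing, and that the reported drop-out percentages match the stated counts.

```python
def max_skin_score(regions: int, max_grade: int = 3) -> int:
    """Maximum possible score when every region gets the top grade."""
    return regions * max_grade


def percent_change(baseline: float, follow_up: float) -> float:
    """Percentage change from baseline; negative means the score fell."""
    if baseline == 0:
        raise ValueError("percentage change is undefined for a zero baseline")
    return 100.0 * (follow_up - baseline) / baseline


def dropout_percent(dropped: int, enrolled: int) -> int:
    """Drop-out rate rounded to the nearest whole percent."""
    return round(100.0 * dropped / enrolled)


# Maximum scores quoted in the text
print(max_skin_score(10), max_skin_score(26), max_skin_score(17))

# Hypothetical patients: the same relative change on two different scales
print(percent_change(40.0, 30.0), percent_change(20.0, 15.0))

# Drop-out counts quoted in the text
print(dropout_percent(24, 71), dropout_percent(66, 134))
```

Working in percentage change rather than raw points is what allows a fall of 10 points on the 78-point scale and a fall of 5 points on the 51-point scale to be treated as comparable degrees of improvement.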
Plan for analysis
Over 1- and 2-yr follow-up, six primary outcome (dependent) variables were identified: (1) patient global assessment, (2) physician global assessment, (3) UCLA skin tethering score, (4) modified Rodnan skin score (MRSS), (5) DLCO as percentage predicted and (6) HAQ-DI. Baseline values that were considered as independent variables that might predict improvement at 1 yr included sex, age, function (HAQ-DI), skin involvement, hand spread, hand flexion, oral aperture, skin ulcers, cardiomegaly, albumin, creatinine and presence of lung disease.
Change scores for the primary outcome measurements were calculated for all patients who completed 1 yr of the methotrexate trial and for all patients who completed 1 and 2 yr of the D-penicillamine trial, by determining the difference between the value at 1 or 2 yr and at baseline. Initially change scores were calculated as continuous variables and linear regression was used to analyse the relationship between the change scores and the different baseline factors. Since occasional trial subjects were 200% worse on some variables, this initial analysis was skewed. This line of analysis was difficult to interpret without transforming the data and was abandoned.
The change scores of the primary dependent outcome measurements were then tested as binary variables by dichotomizing the scores as improved (at least 20% improvement for MRSS, UCLA skin scores and patient global assessment; at least 10% improvement in DLCO; at least 0.22 units improvement in the HAQ-DI; at least 2 grades of improvement for the physician global assessment in the D-penicillamine trial; and at least 20% improvement in physician global assessment in the methotrexate trial) or not improved if they did not meet the definitions of improved. Baseline variables were tested as continuous and dichotomous. Both analyses showed similar and consistent results. Analyses of continuous variables were used to produce correlation coefficients. Dichotomized variables were used to produce odds ratios, P values and confidence intervals to assess and compare various baseline characteristics. Baseline variables were dichotomized into equal to or greater than median and less than the median, except for age which was analysed as a continuous baseline variable and gender as dichotomous. Table 2 compares the number of patients improved in each primary outcome measurement of the two studies. Approximately 30% of trial completers were at least 20% better for various outcome measurements, except for diffusing capacity (% predicted), where only approximately 10% improved by at least 10%.
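The improved/not improved definitions can be sketched in Python as follows. This is a minimal illustration, not the authors' code. It assumes that lower values are better for the skin scores, global assessments and HAQ-DI, that higher values are better for DLCO, and that the DLCO cut-off is a relative change. The outcome keys and example values are hypothetical, and the 2-grade Likert rule used for the physician global assessment in the D-penicillamine trial is left out.

```python
# Cut-offs as stated in the text; the direction of improvement is an assumption.
THRESHOLDS = {
    "mrss": ("pct_decrease", 20.0),
    "ucla_skin": ("pct_decrease", 20.0),
    "patient_global": ("pct_decrease", 20.0),
    "physician_global_vas": ("pct_decrease", 20.0),
    "dlco_pct_predicted": ("pct_increase", 10.0),
    "haq_di": ("abs_decrease", 0.22),
}


def improved(outcome: str, baseline: float, follow_up: float) -> bool:
    """Return True if the change from baseline meets the cut-off for this outcome."""
    rule, cutoff = THRESHOLDS[outcome]
    if rule == "abs_decrease":
        # Absolute fall in units (used for the HAQ-DI)
        return (baseline - follow_up) >= cutoff
    if baseline == 0:
        # A percentage change cannot be computed from a zero baseline
        return False
    pct = 100.0 * (follow_up - baseline) / baseline
    if rule == "pct_decrease":
        return -pct >= cutoff
    return pct >= cutoff


# Hypothetical patients
print(improved("mrss", 30.0, 24.0))                # 20% fall in skin score
print(improved("haq_di", 1.25, 1.0))               # 0.25-unit fall in HAQ-DI
print(improved("dlco_pct_predicted", 60.0, 63.0))  # 5% rise in DLCO
```

Turning each outcome into a yes/no variable sidesteps the skew that a few patients with very large worsening introduce into the continuous change scores.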
In the methotrexate trial, predictive baseline values were sought for all six primary outcome variables. In the D-penicillamine trial, predictive baseline values were sought for four of the primary outcome variables (MRSS, HAQ-DI, DLCO and physician global assessment). Baseline values investigated included skin scores, age, gender, HAQ-DI and pain scores, physician and patient global assessments, DLCO% and other physical parameters such as grip strength. The analysis was first conducted in the methotrexate trial to establish which baseline variables were predictive of later improvement; then the analysis was repeated in the D-penicillamine trial to determine if, in a separate trial, they were still predictive of improved outcome at 1 and 2 yr. Thus, in separate analyses (methotrexate at 1 yr; D-penicillamine results at 1 and 2 yr) those with less than median baseline values vs the others were dichotomized and compared with those patients who had had improvement in the six primary outcome measurements.
In both studies, logistic and linear regression were used to examine the relationship of several baseline variables to primary outcome measurements, and the relationship of changes in these variables to changes in the outcome measurements. The analyses were repeated adjusting for age as we thought it may be an important predictor of outcome. The median HAQ disability at baseline was 1.3 (S.E. = 0.097) for the methotrexate study and 1.0 (S.E. = 0.057) for the D-penicillamine study.
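The median split applied to baseline variables can be sketched as below. This is a minimal Python illustration; the function name and the baseline HAQ-DI values are hypothetical and are not data from either trial.

```python
from statistics import median


def median_split(values):
    """Return the median and a flag per value: True if below the median."""
    cut = median(values)
    return cut, [v < cut for v in values]


# Hypothetical baseline HAQ-DI scores on the 0 to 3 scale
baseline_haq = [0.5, 1.0, 1.3, 1.8, 2.4]
cut, below = median_split(baseline_haq)
print(cut, below)
```

Values equal to the median fall in the "equal to or greater than median" group, matching the dichotomy described in the text.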
Statistical analysis
All statistical analyses were performed using JMP statistical software from SAS. Crude odds ratios (OR) and P values were estimated. P values of less than 0.05 were considered statistically significant. The r value was used to measure the correlation between improvement at 1 and 2 yr. Multiple comparisons were done and raw P values are presented.
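A crude odds ratio for a dichotomized baseline variable against a dichotomized outcome comes from a 2x2 table. The Python sketch below is illustrative only: the counts are invented, the function name is hypothetical, and the confidence interval uses the standard log (Woolf) approximation, which is an assumption because the text does not say how the intervals were computed.

```python
import math


def crude_odds_ratio(a: int, b: int, c: int, d: int):
    """Crude odds ratio with an approximate 95% confidence interval.

    a: below-median baseline, improved
    b: below-median baseline, not improved
    c: at-or-above-median baseline, improved
    d: at-or-above-median baseline, not improved
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this approximation")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation (an assumption here)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from either trial
or_value, lower, upper = crude_odds_ratio(12, 8, 6, 16)
print(round(or_value, 2), round(lower, 2), round(upper, 2))
```

An odds ratio above 1 with an interval that excludes 1 would indicate that patients below the median at baseline were more likely to meet the improvement definition.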
Results
Adjusting for age in the data produced consistent results with the above. When adjusting for age, a lower than median HAQ-DI showed a strong correlation with improved primary outcome measurements for UCLA skin score (OR 4.6, P<0.0004), MRSS (OR 2.4, P<0.029), physician global assessment (OR 3.1, P<0.0057) and patient global assessment (OR 6.9, P<0.0001). The DLCO (OR 1.96, P<0.3) and HAQ-DI (OR 0.5, P<0.15) did not show a strong association.
After adjusting for age, HAQ-pain also showed a strong correlation with changes in the primary outcome measurements, including UCLA skin score (OR 8.4, P<0.0001), MRSS (OR 4.6, P<0.0004), patient global assessment (OR 4.5, P<0.0013) and DLCO (OR 10.1, P<0.0066). The physician global assessment (OR 1.8, P<0.14) and HAQ-DI (OR 0.46, P<0.14) were not statistically significant. No other primary factors had significant predictive value in the study, including baseline skin score and gender. A 10% improvement (chosen as cut-off because few subjects reached 20% improvement) in DLCO % predicted score at 1 yr compared with baseline was correlated with only the physician global score (r 0.36, OR 6.0, P<0.04). No baseline variables predicted a 0.22 unit improvement in the HAQ-DI score at 1 yr. Table 3 and Fig. 1 compare the odds ratio of different baseline variables with primary outcome measurements in the methotrexate trial.
When adjusted for age, the associations were even stronger. HAQ-DI scores <1.0 showed strong associations with good outcome in MRSS [1 yr compared with baseline (OR 2.6, P<0.019), 2 yr compared with baseline (OR 4.4, P<0.0007), and 2 yr compared with 1 yr (OR 2.7, P<0.014)], DLCO [2 yr compared with 1 yr (OR 3.3, P<0.012)] and physician global assessment [2 yr compared with baseline (OR 3.64, P<0.0018) and 2 yr compared with 1 yr (OR 3.5, P<0.042)].
Two additional baseline characteristics were slightly associated with improved outcome in the D-penicillamine trial: (1) Age, which correlated with MRSS improvement in 1 yr compared with baseline (OR 2.96, P<0.0280) with correlation coefficient: r = 0.1, P<0.3; and (2) the presence of ulcers, which correlated with HAQ-DI improvement in 1 yr compared with baseline (OR 8.94, P<0.009), with correlation coefficient: r = 0.3, P<0.04. However, these associations were isolated and inconsistent (both age and the presence of ulcers showed no significance with any other primary outcome measurement and showed no association at 2 yr compared with baseline, and 2 yr compared with 1 yr before and after adjusting for age). The HAQ-DI was the only consistently significant predictive factor of the study. Similarly baseline MRSS (>median vs <median) was not strongly associated with any improved outcome measurement at 1 and 2 yr. Figure 2 compares the odds ratios of different baseline variables with primary outcome measurements in the D-penicillamine trial at 2 yr compared with baseline. Table 4 and Fig. 2 compare the odds ratio of different baseline variables with primary outcome measurements in the D-penicillamine trial.
Discussion
This agrees with the HAQ-DI being correlated with skin score [3], and skin score being important in future prognosis of disease [4]. Our data, which show that low levels of HAQ pain and disability are strong predictors for 1- and 2-yr improvement in SSc, demonstrate their profound importance in early diffuse scleroderma.
Our analysis only applies to early diffuse scleroderma. We cannot generalize results to diffuse scleroderma of longer disease duration or conclude anything about limited systemic sclerosis. In addition, from this study, we cannot extrapolate the predictive value at greater than 2 yr. The sample sizes of the two cohorts were relatively small. The differences between the two trials included slight variations in inclusion and exclusion criteria, length of the trials, baseline factors evaluated and primary outcome measurements used. Since subjects were excluded from these trials if they had significant organ involvement, other important prognostic factors may have been missed. The change scores of the primary outcome measurements were divided into improved vs not improved. This is a clinically relevant division. However, the diversity of the two trials strengthens our conclusion that low HAQ-DI scores predict improvement in several important outcomes at 1 to 2 yr into early diffuse scleroderma.
Future direction
The HAQ-DI should be included in future scleroderma trials and in cohorts with longer disease duration and those with limited scleroderma to see how robust the association we found really is in those groups. Determining positive predictive factors such as the HAQ pain and disability is important in early diffuse scleroderma. With these factors, patients presenting for the first time may be classified, prognosis over the next 1 to 2 yr may be more accurately determined, and treatment may be better evaluated. This paper adds to current literature as it suggests that low baseline HAQ pain and DI scores are strongly correlated to improved outcome at 1 and 2 yr in early diffuse scleroderma consistently in two trials.
Our scleroderma subjects with early disease had significant disability and their HAQ scores can be compared with those of subjects with RA of 8- to 10-yr disease duration who were in the tumour necrosis factor (TNF) blocker trials [16]. We also found that the sample sizes for HAQ-DI and HAQ pain scores in scleroderma patients studied for clinical trials for diffuse patients of varying disease duration were lower than for limited scleroderma because of different standard deviations [17].
There may be equally or more important predictors of good prognosis in early diffuse scleroderma that were not measured, such as serum tests (VCAM, endothelin-1, other cytokines). However, a self-assessed tool that is easily measured and is reliable is an advantage. One may have predicted that a scleroderma expert could predict who would do well (or poorly), but baseline physician's global assessments did not fare as well as the baseline HAQ scores.
In addition, one may have thought a priori that a low skin score would have been a prognostic factor of good outcome, but our subjects had early disease, and many worsened owing to the natural history of their early disease (i.e. at 1 or 2 yr did not attain the defined degree of improvement). A low skin score is important in predicting who will do well compared with those who have a higher skin score with respect to occurrence of organ involvement, and those with worsening skin scores have a higher mortality [3, 5].
Our statistics assume independence in the measurements. However, as with RA, the HAQ-DI is not independent of all other measurements and in scleroderma is associated with hand function and skin score [4] (as RA is associated with tender and swollen joints). Thus this measure violates statistical principles of total independence, but that is also true of most human research.
We may have erred in combining those on active treatment and placebo and could have missed a bias of treatment effects that could alter results. However, we repeated the main analyses separately for those on placebo and those on active treatment and found the same results.
One could also say the logic is circular. Those who do well (improve) are those who have less disease at baseline, but this is not conclusive for most baseline characteristics.
The clinical significance of having positive predictive factors in the future will depend on how much they influence treatment and long-term prognosis (not just up to 2 yr). Further research should include the appropriateness and effectiveness of altering treatment based on key baseline factors. In addition we recommend that HAQ-DI scores should be included as routine outcome measurements in scleroderma clinical trials and the HAQ-DI may also be a useful tool in the assessment of scleroderma patients in a clinic setting.
The authors have declared no conflicts of interest.