Disease-specific, patient-assessed measures of health outcome in ankylosing spondylitis: reliability, validity and responsiveness

K. L. Haywood1,2, A. M. Garratt3, K. Jordan4, K. Dziedzic5 and P. T. Dawes6

1 Department of Health Sciences and Clinical Evaluation, University of York, York YO1 5DD,
2 Interdisciplinary Research Centre in Health, Physiotherapy and Dietetics Subject Group, School of Health and Social Sciences, Coventry University, Coventry CV1 5FB,
3 Unit of Health-Care Epidemiology, Institute of Health Sciences, University of Oxford, Oxford OX3 7LF,
4 Primary Care Sciences Research Centre,
5 Department of Physiotherapy Studies and Primary Care Sciences Research Centre, Keele University, Staffordshire ST5 5BG and
6 Staffordshire Rheumatology Centre, High Lane, Burslem, Stoke-on-Trent, Staffordshire ST6 7AG, UK


    Abstract
Objective. To assess the acceptability and measurement properties of four ankylosing spondylitis (AS)-specific, patient-assessed measures of health outcome: AS Quality of Life Questionnaire (ASQoL), Bath AS Disease Activity Index (BASDAI), the Body Chart and the Revised Leeds Disability Questionnaire (RLDQ).

Methods. Instruments were administered by means of a self-completed questionnaire to patients recruited from across the United Kingdom (UK). Instruments were assessed for data quality and scaling assumptions. Where appropriate, dimensionality was assessed using principal component analysis (PCA). Internal consistency reliability was tested using Cronbach's alpha. Test–retest reliability was assessed in those patients reporting no change in AS-specific health at 2 weeks. The convergent validity of the instruments was assessed and scores were correlated with responses to the health transition questions. Responsiveness was assessed for patients reporting change in health at 6 months.

Results. The BASDAI and Body Chart had low self-completion rates. Item responses for the RLDQ were skewed towards higher levels of functional ability. PCA supported instrument unidimensionality. Cronbach's alpha ranged from 0.87 (BASDAI) to 0.93 (RLDQ). Test–retest reliability estimates support the use of the ASQoL and RLDQ in individual evaluation (>0.90). Correlations between instruments were in the hypothesized direction; the largest was between the ASQoL and BASDAI (0.79). The BASDAI had the strongest linear relationship with responses to both specific and general health transition questions (P<0.01). With the exception of the Body Chart, instruments had a stronger relationship with general health transition. The BASDAI was the most responsive instrument. The Body Chart and RLDQ had low levels of responsiveness.

Conclusion. The instruments have undergone a comprehensive comparative evaluation to assess the measurement properties required for patient-assessed measures of health outcome. Adequate levels of reliability and validity were found for all instruments. The BASDAI and the ASQoL were the most responsive to self-perceived change in health, but the BASDAI had low levels of self-completion.

KEY WORDS: Ankylosing spondylitis, Patient-assessed health outcome, Reliability, Responsiveness, Validity.


    Introduction
Patient-assessed instruments are becoming increasingly important in the measurement of health outcome in rheumatology, and provide supplementary information to traditional biomedical assessments. Ankylosing spondylitis (AS) is a chronic, often progressive, inflammatory disorder, primarily affecting the sacroiliac joints of the pelvis, the axial skeleton and thoracic cage [1]. Disease impact encompasses broad multi-dimensional issues including role and physical functioning, psychological well-being and social interactions [2]. The Assessment in Ankylosing Spondylitis Group (ASAS) have recommended several domains for the evaluation of patients with AS in both clinical research and routine practice [3], which include the assessment of functional disability, spinal stiffness and pain. Patient-assessed instruments with measurement properties to support their role in evaluation should now be identified to fulfil these domains.

Following a systematic review of the literature [4, 5], the Bath AS Disease Activity Index (BASDAI) [6] and the Revised Leeds Disability Questionnaire (RLDQ) [7] were identified as patient-assessed instruments worthy of further evaluation as measures of disease activity and functional disability, respectively. The BASDAI contains six items representative of disease activity in AS. Each item has a 10-cm horizontal visual analogue scale (VAS) anchored by adjectival descriptors ‘none’ and ‘very severe’. Item 6 (morning stiffness, duration) is anchored by a time scale (0–2 h). The mean of items 5 (morning stiffness, severity) and 6 is calculated. The summated item score is converted to a 0–10 scale, with a lower score indicating less disease activity. The RLDQ contains 16 items describing four areas of AS-specific functional disability: mobility, bending down, reaching up and neck movements, and posture. Each item has a four-point scale from 0 (no difficulty) to 3 (unable to do). Scores range from 0 to 48, where higher scores indicate greater functional disability. Both instruments require ~2 min to self-complete [5, 6].
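The scoring rules described above can be sketched in a few lines of code. This is a minimal illustration only: it assumes complete data, assumes every BASDAI item (including the 0–2 h time scale of item 6) has already been read off as a value between 0 and 10, and uses the hypothetical function names `basdai_score` and `rldq_score`. It does not implement any rule for missing items.

```python
def basdai_score(items):
    """Return a 0-10 BASDAI score from six item scores, each on a 0-10 scale.

    Assumption: item 6 (stiffness duration, 0-2 h) is already expressed on
    the same 0-10 scale as the other items.
    """
    if len(items) != 6:
        raise ValueError("expected six item scores")
    if any(not 0 <= x <= 10 for x in items):
        raise ValueError("item scores must lie between 0 and 10")
    stiffness = (items[4] + items[5]) / 2  # mean of items 5 and 6
    # Items 1-4 plus the combined stiffness score, rescaled to 0-10.
    return (sum(items[:4]) + stiffness) / 5


def rldq_score(items):
    """Return the 0-48 RLDQ score from 16 items, each scored 0 to 3."""
    if len(items) != 16:
        raise ValueError("expected 16 item scores")
    if any(x not in (0, 1, 2, 3) for x in items):
        raise ValueError("item scores must be 0, 1, 2 or 3")
    return sum(items)
```

For both functions a lower result indicates a better health state, in line with the scoring direction described in the text.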

The Body Chart has been developed as an AS-specific measure of global bodily pain [8, 9]. Patients sketch areas of current pain onto a body manikin (anterior and posterior views) and then score each area on a four-point scale [1 (mild pain) to 4 (very severe pain)]. Area scores are totalled, with a lower score indicating less bodily pain. There is no maximum score. The instrument was developed and tested following interview administration in a clinic environment, and preliminary evidence suggests satisfactory measurement properties [9]. Completion time ranges from several seconds (‘no pain’) to several minutes depending upon the extent of perceived pain and detail provided by the individual patient [5].

Despite increasing interest in the conceptualization and measurement of patient-assessed health-related quality of life (HRQL) in chronic disease [10, 11], a published AS-specific measure of HRQL is not currently available [5]. Indeed, the ASAS group did not recommend quality of life as a core domain in AS evaluation because of uncertainty over the most suitable approach [3]. Communication with measurement experts identified the AS Quality of Life Questionnaire (ASQoL) (P. Helliwell and L. Doward, Galen Research, Manchester, personal communication, 1998) [12], a new and unpublished measure comprising 18 items relating to AS-specific HRQL. It adopts dichotomous responses (yes/no) and requires ~2 min to self-complete. Items are summated (0–18), with a lower score indicating a better level of AS-specific HRQL.

The aim of the study was to examine the acceptability, data quality and measurement properties of four patient-assessed measures of health outcome in AS patients recruited from UK rheumatology centres. The ASQoL (HRQL), BASDAI (disease activity), Body Chart (global pain) and the RLDQ (functional disability) describe several domains considered important in the evaluation of patients with AS, and the results of the study will support the recommendation of instruments to fulfil these domains.


    Methods
Data collection
Following published work evaluating patient-assessed measures of health outcome, a population of >400 patients was deemed acceptable for the postal survey [13, 14]. A simple random sample of patients aged between 18 and 75 yr, with a confirmed diagnosis of AS (Modified New York Criteria, 1984) [15] and registered with one of a group of specialist rheumatology centres in England and Scotland, were invited to participate in the study. Pregnancy, learning difficulties or an inability to comprehend written English were study exclusion criteria. The survey used a multi-centre study design and was approved by the Northern and Yorkshire Multi-Centre Research Ethics Committee and relevant local ethics committees. Written consent was gained from all patients.

Four-hundred and fifty-one patients were asked to self-complete a mailed questionnaire. Patients not wishing to participate were asked to return uncompleted and pre-coded questionnaires using a reply-paid envelope. Non-responders were sent reminders at 2 and 4 weeks. The questionnaire included the four disease-specific measures of health outcome, two health transition items and sociodemographic questions.

Instrument evaluation
Data quality. Individual items within each instrument were assessed for missing data, the distribution and symmetry of item response scores and endorsement frequencies. Principal component analysis (PCA) was used to assess the dimensionality of instruments based on multi-item scales [16]. The ASQoL, BASDAI and RLDQ are unidimensional, and PCA was used to confirm the existence of a single dimension for each instrument. The item-total correlation of individual items within these instruments was also assessed.
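The item-level checks for missing data and end-effects can be illustrated as follows. This is a sketch under stated assumptions, not the analysis code used in the study: missing responses are represented by `None`, the function name `item_quality` is hypothetical, and floor and ceiling percentages are calculated among answered items only.

```python
def item_quality(responses, floor, ceiling):
    """Summarize one item: percentage missing, at the floor and at the ceiling.

    responses: list of item scores, with None marking a missing response.
    floor, ceiling: the lowest and highest response options of the item.
    Assumption: end-effects are expressed as a percentage of answered items.
    """
    answered = [r for r in responses if r is not None]

    def pct(count, total):
        return 100.0 * count / total if total else 0.0

    return {
        "missing_pct": pct(len(responses) - len(answered), len(responses)),
        "floor_pct": pct(sum(r == floor for r in answered), len(answered)),
        "ceiling_pct": pct(sum(r == ceiling for r in answered), len(answered)),
    }
```

An item would then be flagged for an end-effect when `floor_pct` or `ceiling_pct` exceeds whatever threshold the analyst has chosen.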

Reliability. The internal consistency reliability of the instruments was assessed by Cronbach's alpha [16, 17]. The Body Chart is not a multi-item scale and therefore could not be tested for internal reliability.
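For illustration, Cronbach's alpha for a k-item scale is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below assumes complete data arranged as one row of item scores per patient; the function name `cronbach_alpha` is hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha from complete data: one row of item scores per patient."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two patients")
    # Sample variance of each item (columns) and of the summed scale score.
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```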

Patient-reported health transition questions that describe the magnitude and direction of change in general or specific health over a given time period are a valid approach to measuring change, and have been widely used as external criteria in the evaluation of instrument test–retest reliability and responsiveness [18, 19]. Instrument test–retest reliability was assessed for those patients indicating that their AS-specific health had remained the same at 2 weeks on a health transition question. This method reduces the influence of information recall associated with shorter periods of retest and produces a more robust estimate of instrument reliability [17]. Patients were asked to complete a second questionnaire at 2 weeks. The intra-class correlation coefficient [ICC(2,1)] [20] was used to measure the agreement between test and retest [21]. For group comparisons, levels of reliability over 0.70 are required [16, 17], and for the evaluation of individuals levels above 0.90 have been recommended [17, 22].
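The ICC(2,1) of Shrout and Fleiss [20] can be computed from the mean squares of a two-way analysis of variance. The following is a minimal sketch, assuming complete data with one row per patient and one column per administration (test, retest); the function name `icc_2_1` is hypothetical and confidence intervals are not calculated.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    scores: one row per patient, one column per administration.
    """
    n, k = len(scores), len(scores[0])
    if n < 2 or k < 2:
        raise ValueError("need at least two patients and two administrations")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(col) / n for col in zip(*scores)]

    # Sums of squares for patients, administrations and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)               # between-patient mean square
    jms = ss_cols / (k - 1)               # between-administration mean square
    ems = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)
```

Because the between-administration mean square enters the denominator, a systematic shift in scores between test and retest lowers this coefficient, which is why it measures agreement rather than association alone.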

Validity
Construct validity was assessed by correlating the scores for the separate instruments to assess the convergent validity of related dimensions (Pearson's correlation coefficient). Hypothesized theoretical relationships between instruments were considered a priori. The ASQoL (HRQL), BASDAI (disease activity), Body Chart (bodily pain) and RLDQ (functional disability) measure related aspects of HRQL. The main issues measured by the BASDAI and Body Chart, and several items measured by the RLDQ, are within the item content of the ASQoL. Therefore, a high level of correlation (>0.70) was hypothesized between the ASQoL and BASDAI, with moderate to high levels of correlation (0.50–0.70) between the ASQoL and both the Body Chart and RLDQ. The BASDAI and Body Chart measure closely related aspects of health and have a similar item content, and a large correlation was hypothesized. There is minimal overlap of item content between the BASDAI, Body Chart and RLDQ. However, all instruments measure related aspects of health and disease that may impact on normal function. Therefore, moderate levels of correlation between these instruments were hypothesized.
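As an illustration of the convergent validity test, the sketch below computes Pearson's correlation coefficient for paired instrument scores and places it against the hypothesized bands quoted above (>0.70 and 0.50–0.70). The function names and the label for values under 0.50 are assumptions made for this example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_band(r):
    """Place a correlation in the bands used for the a priori hypotheses."""
    size = abs(r)
    if size > 0.70:
        return "high"
    if size >= 0.50:
        return "moderate to high"
    return "below the hypothesized range"  # label assumed for this sketch
```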

Validity was further assessed in relation to occupational status. Patients reporting an inability to work due to ill health were expected to have scores reflecting poorer health than their counterparts. t-Tests were used to test for differences in scores.

For purposes of assessing longitudinal construct validity, instrument scores were compared with self-reported AS-specific and general health transition at 6 months (Compared with 6 months ago, how would you rate your AS/general health now: much better, somewhat better, about the same, somewhat worse, much worse?). Changes in instrument scores and patient response to the transition questions were assessed for a linear trend [23]. To the extent that the patient-assessed instruments are valid measures of health capable of measuring change, a strong association with a patient-reported health transition item is expected [20, 21].

Responsiveness
Instruments were compared for responsiveness to change over the 6-month period by calculating the modified standardized response mean (MSRM), which is equal to the mean change in scores divided by the standard deviation of change scores in patients defined as stable [21]. Guidance for data interpretation has been proposed: a score of >0.8 represents a high level of responsiveness, a score of 0.5 a moderate level, and a score of 0.2 a low level [24]. MSRMs were calculated for patients reporting an improvement or deterioration in health on health transition (general or AS-specific).
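The MSRM defined above can be sketched as follows, assuming change scores (follow-up minus baseline) are available both for the group reporting change and for the group reporting stable health. The function name `msrm` is hypothetical.

```python
from statistics import mean, stdev


def msrm(changed_group_changes, stable_group_changes):
    """Modified standardized response mean.

    Mean change score in patients reporting a change in health, divided by
    the standard deviation of change scores in patients reporting no change.
    """
    return mean(changed_group_changes) / stdev(stable_group_changes)
```

Because lower scores indicate better health on all four instruments, an improving group yields a negative value under this convention; the interpretation thresholds apply to the absolute size.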


    Results
Data collection
Of the 451 patients who were mailed a postal questionnaire, 349 (77.4%) returned a completed questionnaire. One patient had changed address at the 2-week follow-up and 303 (87.1%) patients returned the 2-week questionnaire. Two-hundred and eighty-nine (82.8%) patients returned the 6-month questionnaire.

The majority of patients were male (n=259, 74.2%), with a mean age of 46.1 yr [standard deviation (S.D.) 12.6 yr, range 18–75 yr] (Table 1). The mean symptom duration of participants was 19.8 yr (S.D. 11.8; range 1–56 yr), suggesting a broad spectrum of disease presentation.


TABLE 1. Patient characteristics at baseline

 

Instrument evaluation
Data quality. The item and scale properties of the disease-specific instruments are shown in Table 2. For all instruments a lower item or scale score reflects a better health state and may be described as the floor of the scale [24].


TABLE 2. Instrument item and scale properties at baseline (n=271)

 
The levels of missing data for the 18 items of the ASQoL ranged from 1.1 to 3.4%. Scale scores were computable for 97.1% (n=339) of patients. Both responses of the dichotomous response format were covered for all items. Items 5 (It's impossible to sleep; n=255, 74.1%) and 9 (I have unbearable pain; n=269, 79.8%) had high levels of endorsement for the ‘no’ response. Some patients were unable to make a clear distinction between ‘yes’ and ‘no’ response options, and responded by placing a tick between the two boxes, often supplementing this with a written comment such as ‘sometimes’ or ‘it depends’ (n=10, 3.7%). ASQoL scale scores covered the full range (0–18) and approximated normality. The single dimension of the ASQoL proposed by the developers was supported by the results of the PCA, which produced a single-component solution with component loadings all above 0.50. Item–total correlation ranged from 0.51 to 0.73, and Cronbach's alpha was 0.92 (Table 2).

The levels of missing data for the six items of the BASDAI ranged from 8.9 to 24.9% (Table 2). Item 6 was the most frequently omitted. Scoring allows for the omission of up to two items and scale scores were computable for 91.1% (n=318) of patients. The full range of responses was observed for all items and no item produced an end-effect >80%. Score distribution at scale level approximated normality. The single dimension of the BASDAI proposed by the instrument developers was supported by the results of the PCA, which produced a single-component solution with all-item component loadings above 0.65. Item–total correlation ranged from 0.56 to 0.81, and Cronbach's alpha was 0.87 (Table 2).

The Body Chart does not consist of individual items and is therefore assessed at the scale level (Table 2). Final scores were computable for 89.9% (n=310) of responders. Features of incorrect completion included a failure to score the areas indicated on the body chart as painful. If more than two areas are not scored, a final score cannot be computed. A wide range of Body Chart scores was observed (range 0–122). Score distribution was skewed towards lower pain levels (2.3% reported no pain) (Table 2). Because the Body Chart has no maximum score, the percentage of patients scoring at the ceiling represents those patients recording the maximum scores in this study population.

The levels of missing data for the 16 items of the RLDQ ranged from 0.9 to 5.2% and a scale score was computable for 98.0% (n=342) of patients (Table 2). The full range of response options was observed for all items. Three items had very low levels of endorsement for the option ‘Unable to do’ and a large proportion of patients scored at the floor of several items, indicating that the activity could be performed ‘without difficulty’. Although no item produced an end-effect of >80.0%, the skewed distribution of item responses shows that the majority of respondents experience no or moderate limitations in functional activities assessed by the RLDQ. This is reflected in the low mean values for all items and in the final scale scores, which are positively skewed towards better levels of functional ability (Table 2). No patient scored >41 on the 0–48 scale. The single dimension of the RLDQ was supported by the results of the PCA, which produced a single-component solution with all component loadings >0.48. Item–total correlation ranged from 0.41 to 0.79, and Cronbach's alpha was 0.93 (Table 2).

Reliability. The test–retest reliability of all instruments was assessed by correlating the two sets of scores for those patients who indicated no change on the AS-specific health transition question. The ICCs are shown in Table 3. Reporting the associated confidence intervals for the ICC as an estimation of test–retest reliability is based on the assumption that data are approximately normally distributed. Therefore, the highly skewed data for the Body Chart were logarithmically transformed to yield a log-normal distribution [25] (Table 3). The highest levels of reliability were observed for the ASQoL and RLDQ (>0.90), but for all instruments levels were >0.80.


TABLE 3. Test–retest reliability (no change in AS-specific health by self-reported transition question at 2 weeks)

 
Validity. As hypothesized, the tests of convergent validity produced correlations of a moderate to large size (Table 4). The ASQoL produced the largest correlations. Also as hypothesized, compared with those unable to work due to ill health, patients in work had significantly better levels of health (P<0.01) (Table 5).


TABLE 4. Convergent validity (Pearson's correlation coefficient) (n=271)

 

TABLE 5. Mean (standard deviation) instrument scores according to whether working or not due to ill health

 
The test of longitudinal validity is shown in Table 6. The change scores for all instruments reflect the categories of both AS-specific and general health transition. The largest levels of change were found for the BASDAI on both transition questions. Patients who reported that their AS was better had an average score improvement of -1.04 over the 6 months (on a scale of 0–10), and where general health was better they had an average score improvement of -1.26. Those whose AS was worse had an average score deterioration of 0.85, and where general health was worse they had an average score deterioration of 0.92. Three instruments had a stronger relationship with general health transition than with AS-specific transition.


TABLE 6. Mean score change (standard deviation) and modified standardized response mean (MSRM) by 6-month AS-specific and general health transition

 
The strongest linear relationships with both AS (P<0.01) and general health transition items (P<0.01) were observed for the ASQoL and BASDAI.

Responsiveness. The results of responsiveness testing are shown in Table 6. The BASDAI produced the largest levels of responsiveness for groups of patients whose AS-specific or general health had improved or deteriorated according to transition question responses. In patients reporting an improvement in health, responsiveness statistics over 0.5 were found for the BASDAI, representing a level of change that is at least one half a standard deviation of the change scores for stable patients. The ASQoL also produced responsiveness statistics over 0.5 in patients reporting a deterioration in AS-specific health or an improvement in general health. Lower levels of responsiveness were found for the remaining instruments, with the lowest levels found for the Body Chart.


    Discussion
Accurate measurement of outcome across the wide spectrum of health and disease has become an important medical and social issue [26]. The role of the patient is increasingly seen as central to this process and is reflected in the increasing availability of patient-assessed instruments [27]. Although a wide range of instruments now exist, poor standardization and limited evidence of measurement properties can make instrument selection for clinical practice or research difficult. Selection has often been guided by historical precedence [28] or expert opinion [29]. The selection of instruments with clear evidence supporting their development, measurement properties, acceptability to patients and feasibility for the required application will enhance patient evaluation and is an important requirement for all fields of health care [27, 30].

This study represents an extensive comparative evaluation of the measurement properties and acceptability of an evidence-based selection of disease-specific, patient-assessed instruments in a large and representative population of AS outpatients. The study has provided important information against which instruments can be judged in terms of necessary measurement properties. The results of tests of data quality and scaling assumptions have not been reported previously. Furthermore, there is no published evidence of empirical tests for dimensionality.

The ASQoL and the RLDQ had low levels of missing data, which is evidence for their acceptability to patients. However, the high levels of missing data for the BASDAI suggest that a revision of the instrument scaling or item content may improve completion rates. The Body Chart was not originally developed as a self-administered postal questionnaire and there is scope for improving the written instructions provided to patients. The instruments based on summated rating scales (ASQoL, BASDAI and RLDQ) had good evidence to support the unidimensional structure recommended by the instrument developers. The data for the Body Chart were highly skewed, and assessment using conventional methods was therefore difficult.

The tests of data quality and scaling assumptions were largely met by the instruments and all items showed a moderate to high level of correlation with hypothesized scales. The four instruments have levels of reliability that support their use in groups. The ASQoL and RLDQ had levels of internal and test–retest reliability that support their use in individual patients.

Evidence for the construct validity of the four instruments was provided by the moderate to high levels of correlation between the instrument scores, which met a priori hypotheses. Correlations were of a sufficient magnitude to suggest that the instruments are measuring related aspects of disease-specific health. Further evidence for the validity of the instruments was provided by the significant association with work status and self-reported health transition. Scores for three of the instruments had a stronger association with general health transition than specific health transition, which was not as hypothesized. The exception, the Body Chart, is a measure of global body pain and it is possible that AS-related pain is a dominant component of self-reported changes in the disease. The other instruments measure broader considerations, and items within the ASQoL are not anchored to the disease.

The BASDAI showed a good level of responsiveness for self-perceived improvement and deterioration in both AS and general health. Good levels of responsiveness were also found for the ASQoL for change in general health. However, only moderate and low levels of responsiveness were found for the RLDQ and Body Chart, respectively.

Most patients with AS present with multiple coexisting problems, and a multi-dimensional approach to evaluation has been recommended [3]. The study instruments address four domains of disease impact: HRQL (ASQoL), disease activity (BASDAI), pain (Body Chart) and functional disability (RLDQ). Instrument selection must consider available evidence in light of the proposed application. For applications in research including clinical trials, all instruments demonstrate adequate levels of reliability and validity. In addition, there is good evidence for the responsiveness of the BASDAI and the ASQoL. However, the low levels of completion for the BASDAI and the self-administered version of the Body Chart are cause for concern. Furthermore, the Body Chart and RLDQ were not responsive to self-perceived change in AS. These are also important considerations for applications in both research and clinical practice. Moreover, reliability may be an even more important issue when selecting instruments for individual evaluation [31]. The BASDAI and Body Chart are close to the reliability criterion of 0.90 recommended by some authors [17]. The ASQoL and RLDQ exceed this criterion but were found not to be as responsive as the BASDAI.

Two additional AS-specific measures of functional disability have been recommended by the ASAS group [29]: the (modified) Dougados Functional Index (DFI) [32, 33] and the Bath AS Functional Index (BASFI) [34]. Direct comparison of available disease-specific measures of functional disability is required to support the selection of a single instrument. Following the extensive evaluation of instrument performance described in the current study, modification of the item content and response format of the RLDQ is suggested to enhance both data quality and measurement properties [5]. Such a detailed evaluation of the data quality, scaling assumptions and measurement properties of the DFI or BASFI has not been published [5]. For example, the response scales of the BASFI are identical to those of the BASDAI, and self-completion in a patient population unfamiliar with the instrument, as described here, will provide important information relating to instrument acceptability.

It is increasingly recognized that generic and disease-specific instruments are complementary [35]. The broad content of generic instruments may support the identification of co-morbid features and treatment side-effects that may not be captured by disease-specific instruments. Furthermore, generic instruments are useful for comparing different groups of patients and have wider scope for application in economic evaluation. Evidence of the performance of widely used generic instruments such as the Short-Form 36-item Health Survey Questionnaire (SF-36) [14, 22] and the EuroQol [36, 37] in the AS population has yet to be described.

In conclusion, this study provides valuable information on the data quality, measurement properties and acceptability of four disease-specific measures of health outcome following self-completion in a large UK population of patients with AS. The best performing instruments were the ASQoL and BASDAI. However, it is recommended that all instruments undergo further evaluation following modification to improve data quality and responsiveness before recommendation for their use in the evaluation of AS patients in routine practice or clinical research can be made.


    Acknowledgments
 
We are very grateful to all of the patients who so willingly gave their time to complete the various questionnaires. We also thank the following Consultant Rheumatologists for allowing access to AS patient databases and local physiotherapists for their support: Professor R. Sturrock and Ms F. Gough; Professor I. Haslock, Dr M. Plant and Mrs K. West; Dr T. Price, Ms C. David and Ms L. Preston; Professor H. Gaston and Mrs J. Isaacson; Dr P. Creemer and Mrs R. Lewis; all Consultant Rheumatologists, nursing and clinical staff from the Staffordshire Rheumatology Centre, with particular thanks to Ms J. Waterfield for assistance with data collection. This study was supported by a grant from the Arthritis Research Council and funding from the Staffordshire Rheumatology Centre, and forms part of a larger study submitted for a DPhil in Health Sciences at the University of York.


    Notes
 
Correspondence to: K. Haywood, Interdisciplinary Research Centre in Health, Physiotherapy and Dietetics Subject Group, School of Health and Social Sciences, Coventry University, Priory Street, Coventry CV1 5FB, UK.


    References

  1. Russell AS. Ankylosing spondylitis: history. In: Klippel JH, Dieppe PA, (eds). Rheumatology. 2nd edn. London: Mosby, 1998, pp. 1–2.
  2. Ward MM. Quality of life in patients with ankylosing spondylitis. Rheum Dis Clin North Am1998;24:815–27.[ISI][Medline]
  3. van der Heijde D, Bellamy N, Calin A, Dougados M, Khan MA, van der Linden S. Preliminary core sets for endpoints in ankylosing spondylitis. J Rheumatol1997;24:2225–9.[ISI][Medline]
  4. Haywood KL, Garratt AM, Dziedzic K, Dawes PT. A systematic review of outcome measures in ankylosing spondylitis (AS). Br J Rheumatol1998;37[Abstr Suppl 1]:43.
  5. Haywood KL. Health outcomes in ankylosing spondylitis: an evaluation of patient-based and anthropometric measures. DPhil thesis,2000, University of York, York.
  6. Garrett S, Jenkinson T, Kennedy G, Whitelock H, Gaisford P, Calin A. A new approach to defining disease status in ankylosing spondylitis: the Bath Ankylosing Spondylitis disease activity index. J Rheumatol1994;1:2286–91.
  7. Abbott CA, Helliwell PS, Chamberlain MA. Functional assessment in ankylosing spondylitis—evaluation of a new self-administered questionnaire and correlation with anthropometic variables. Br J Rheumatol1994;33:1060–6.[ISI][Medline]
  8. Dziedzic KSG. The Body Chart: A further sketch towards a fuller picture of ankylosing spondylitis. PhD thesis, 1997, University of Keele, Staffordshire.
  9. Dziedzic K. Ankylosing spondylitis. In: David C, Lloyd J, eds. Rheumatological physiotherapy. London: Mosby1998:97–114.
  10. Fitzpatrick R. The measurement of health status and quality of life in rheumatological disorders. Ballieres Clin Rheumatol1993;7:297–317.[ISI][Medline]
  11. Jenkinson C, Fitzpatrick R, Peto V. The Parkinson's Disability Questionnaire. User manual for the PDQ-39, PDQ-8 and PDQ Summary Index. Health Services Research Unit, Department of Public Health, University of Oxford,1998, Oxford: Joshua Horgan Print Partnership.
  12. Reynolds S, Doward LC, Spoorenberg A et al. The development of the Ankylosing Spondylitis Quality of Life Questionnaire (ASQoL). Qual Life Res1999;8:651.
  13. Garratt AM, Ruta DA, Abdalla MI, Russell IT. Responsiveness of the SF-36 and a condition-specific measure of health for patients with varicose veins. Qual Life Res1996;5:223–34.[ISI][Medline]
  14. Ruta DA, Hurst NP, Kind P, Hunter M, Stunnings A. Measuring health status in British patients with rheumatoid arthritis: reliability, validity and responsiveness of the short form 36-item health survey (SF-36). Br J Rhematol1998;37:425–36.
  15. van der Linden SJ, Valkenburg HA, Cats A. Evaluation of diagnostic criteria for ankylosing spondylitis: a proposal for modification of the New York criteria. Arthritis Rheum 1984;27:361–8.
  16. Nunnally JC, Bernstein IH. Psychometric theory. 3rd edn. McGraw-Hill Series in Psychology, Columbus, OH: McGraw-Hill, 1994.
  17. Streiner DL, Norman GR. Health measurement scales. A practical guide to their development and use. 2nd edn. Oxford: Oxford Medical Publications, Inc., 1995.
  18. Fitzpatrick R, Ziebland S, Jenkinson C, Mowat A, Mowat A. Transition questions to assess outcomes in rheumatoid arthritis. Br J Rheumatol 1993;32:807–11.
  19. Keller SD, Majkut TC, Kosinski M, Ware JE Jr. Monitoring health outcomes among patients with arthritis using the SF-36 health survey. Med Care 1999;37(5 Suppl):MS1–9.
  20. Shrout PE, Fleiss JL. Intraclass correlations: Uses in assessing rater reliability. Psychol Bull 1979;86:420–8.
  21. Deyo RA, Diehr P, Patrick DL. Reproducibility and responsiveness of health status measures. Statistics and strategies for evaluation. Control Clin Trials 1991;12:142S–58S.
  22. Ware JE. SF-36 Health Survey Manual and Interpretation Guide. The Medical Outcomes Trust. Boston, MA: Nimrod Press, 1997.
  23. Garratt AM. A comparison of four approaches to measuring health outcome. DPhil thesis, 1997, University of Aberdeen.
  24. Fitzpatrick R, Davey C, Buxton MJ, Jones DR. Evaluating patient-based outcome measures for use in clinical trials. Health Technology Assessment, Vol. 2, no. 14, 1998.
  25. Altman DG. Practical statistics for medical research. London: Chapman and Hall, 1996.
  26. Ware JE. International Quality of Life Assessment (IQOLA) project. [Preface]. J Clin Epidemiol 1998;51:891–2.
  27. McDowell I, Newell C. Measuring health. A guide to rating scales and questionnaires. 2nd edn. Oxford: Oxford University Press, 1996.
  28. Jenkinson TR, Mallorie PA, Whitelock HC, Kennedy GL, Garrett SL, Calin A. Defining spinal mobility in ankylosing spondylitis (AS). The Bath AS Metrology Index. J Rheumatol 1994;21:1694–8.
  29. van der Heijde D, Calin A, Dougados M, Khan MA, van der Linden S, Bellamy N. Selection of instruments in the core set for DC-ART, SMARD, physical therapy, and clinical record keeping in ankylosing spondylitis. Progress report of the ASAS Working Group. Assessments in ankylosing spondylitis. J Rheumatol 1999;26:951–4.
  30. Kirshner B, Guyatt G. A methodological framework for assessing health indices. J Chronic Dis 1985;38:27–36.
  31. McHorney CA, Tarlov AR. Individual-patient monitoring in clinical practice: are available health status surveys adequate? Qual Life Res 1995;4:293–307.
  32. Dougados M, Gueguen A, Nakache JP, Nguyen M, Mery C, Amor B. Evaluation of a functional index and an articular index in ankylosing spondylitis. J Rheumatol 1988;15:302–7.
  33. Spoorenberg A, van der Heijde D, de Klerk E et al. A comparative study of the usefulness of the Bath Ankylosing Spondylitis Functional Index and the Dougados Functional Index in the assessment of ankylosing spondylitis. J Rheumatol 1999;26:961–5.
  34. Calin A, Garrett S, Whitelock H et al. A new approach to defining functional ability in ankylosing spondylitis: the development of the Bath Ankylosing Spondylitis functional index. J Rheumatol 1994;21:2281–5.
  35. Guyatt GH, Feeny DH, Patrick DL. Measuring health-related quality of life. Ann Intern Med 1993;118:622–9.
  36. EuroQol Group. EuroQol: a new facility for the measurement of health-related quality of life. Health Policy 1990;16:199–208.
  37. Hurst NP, Kind P, Ruta D, Hunter M, Stubbings A. Measuring health-related quality of life in rheumatoid arthritis: validity, responsiveness and reliability of EuroQol (EQ-5D). Br J Rheumatol 1997;36:551–9.
Submitted 26 September 2001; Accepted 13 May 2002