Development and validation of an elbow score

P. Sathyamoorthy, G. J. Kemp, A. Rawal, V. Rayner and S. P. Frostick

Department of Musculoskeletal Science, Royal Liverpool University Hospital and Liverpool Upper Limb Surgery Unit, Royal Liverpool University Hospital, Liverpool, UK

Correspondence to: S. P. Frostick, Department of Musculoskeletal Science, University of Liverpool, Liverpool L69 3GA, UK. E-mail: s.p.frostick{at}liv.ac.uk


    Abstract
 
Objectives. Few of the questionnaires available for evaluating the function and clinical state of the elbow have been validated. An ideal score would be consistent, sensitive, reliable and elbow-specific, incorporating both patient perception and clinician assessment. Our aim was to develop and validate such a score.

Methods. Items were generated using 25 patients and expert opinion, and reduced using 25 new patients to yield a nine-item patient questionnaire and a six-item clinical evaluation (of strength, motion and ulnar nerve involvement). This was validated using 63 new patients (of whom 28 were studied twice without therapy and 18 were studied again after appropriate surgery).

Results. The test–retest reliability coefficient of determination (R² = 0.93) and internal consistency (Cronbach's alpha = 0.98) were both good. Convergent validity was attested by good correlations with other scores, the Disabilities of Arm, Shoulder and Hand Questionnaire (DASH) and the Nottingham Health Profile (NHP) (physical) (R² = 0.62 and 0.29, P<0.0005). Sensitivity to change was demonstrated by correlating preoperative–postoperative changes to those in DASH and NHP (physical) (R² = 0.50 and 0.27, P<0.04).

Conclusion. This is a reliable, internally consistent score, correlating well with other, non-elbow specific scores and sensitive to change on treatment.

KEY WORDS: Clinical score, Elbow, Outcome measure, Validation


    Introduction
 
Outcome measures are used for various purposes, such as studying the progress of disease and the effects of treatment, and comparing different treatments. Questionnaires are widely used and are basically of two types: physician-rated questionnaires, which use clinical and functional measurements, and patient-rated questionnaires, which may be general or contain questions related to a specific condition. In either case, they must be properly validated in terms of consistency, sensitivity and reliability. For the elbow, almost none of the clinical questionnaires used have been validated, and this makes results of studies and trials difficult to interpret.

Many outcome measures have been used for elbow conditions: the Mayo Elbow Performance Index [1–4] and its several variants [5–8], the Ewald Scoring system [9, 10], the Hospital for Special Surgery (HSS) Scoring System [11] and its variants [12, 13], the Flynn Criteria [14, 15], the Pritchard Score [1], the Brumfield Score [1], the Neviaser Criteria [16], the Jupiter Score [17–19], the Khalfayan Score [20, 21], the Disabilities of Arm, Shoulder and Hand Questionnaire (DASH) [22], the Modified American Shoulder and Elbow Surgeons self-evaluation form (M-ASES) [23], and the same organization's Elbow Assessment form [24]. Of these, DASH and M-ASES are patient-completed functional and general questionnaires. However, general questionnaires alone may not assess accurately the symptoms and functions of an individual joint [25]; they are often lengthy and contain questions irrelevant to a specific problem or procedure [26]. The ASES Elbow Assessment form [24] is elbow-specific and combines patient- and physician-completed questions; however, it has not been validated and is somewhat unwieldy, containing over 50 responses. The others listed are physician-completed questionnaires scored by clinical assessment, some (HSS, Ewald, Mayo and Khalfayan) also containing functional questions completed by the physician on the patient's behalf. The components of these scores are compared in Table 1.


 
TABLE 1. Comparison of available scores

 
Both objective criteria (derived from physical examination) and subjective criteria (determined by interview) may contain bias [27]. Concerns and priorities of surgeons may differ [28], and even purely clinical scoring systems depend on variable judgements about pain, motion, strength, stability, function, deformity, activity and disability [29]. Generally, the domains of these scores are unrelated, so that different weights are given to different aspects of elbow performance. This precludes valid comparison between studies. Frequently, domains are scored separately and the aggregated scores are then categorically ranked, but there is little uniformity in the distribution of categories; categorical rankings cannot be relied on to provide meaningful comparisons either within or between cohorts of patients [30], and the categorical rankings of different systems are not interchangeable [30].

Above all, none of these scores have been properly validated. Full validation [31] requires assessment of internal consistency (the degree to which component responses agree with each other, giving confidence that they are measuring different aspects of the same thing); construct validity (the degree to which the instrument supports predefined hypotheses), of which an important aspect is usually convergent validity (the degree to which the instrument correlates with other established and accepted questionnaires); test–retest reliability (reproducibility); and sensitivity to change (usually to treatment). A further aspect is discriminant validity, the degree to which an instrument diverges from instruments designed to measure different things; in the present context this means anatomical specificity, which must be built into both the functional and clinical parts of the questionnaire [32–34]. Of the scores listed above, Turchin et al. [30] examined construct validity alone for the HSS, Mayo and some variants, and the Ewald and Pritchard scoring systems. They found that the variable mixture of clinical and functional criteria impaired validity. There was very low agreement between scores. Construct validity was reported, but this was done by comparing with patient- and physician-rated severity rather than a valid, standard score. Internal consistency and sensitivity to change were not assessed, and test–retest reliability was measured only for M-ASES and DASH. These authors recommended that an ideal tool for the assessment of the elbow would measure pain, function and disability simultaneously, and that the outcome of treatment should be assessed on the basis of function, clinical examination and assessment of pain [30].

We have therefore developed an instrument which we call the Liverpool Elbow Score (LES). This contains two main components: a patient-rated questionnaire assessment of function, relevant to the functions of the elbow (unlike some general upper limb questionnaires mentioned above), including a question about pain; and some important relevant clinical data, which can be measured objectively and consistently, regardless of the condition of the elbow. It is simple and quick to administer. We present here evidence of its validity.


    Patients and methods
 
Under local rules for institutional review applicable at the time the bulk of this work was carried out (1999–2001), formal ethical approval was not required for this type of study. The present paper was reviewed in draft (April 2004) by the Liverpool (Adult) Research Ethics Executive Sub-Committee, who have stated that they are satisfied that the local guidelines in force at this time were complied with, and have no objection to publication. University statisticians were consulted both at the beginning and during the study for structuring and for data analysis.

There are three stages of scale development [35–37]: item generation, item reduction (comprising the selection of the item pool and choice of item scaling), and the determination of reliability, validity and responsiveness. We describe separately the patient-response and clinically-assessed items.

Generation of the scale: the patient-answered questionnaire
Item generation
Items were generated from interviews with 25 patients at the Royal Liverpool University Hospital Upper Limb Unit. These had elbow problems which included rheumatoid arthritis, primary osteoarthritis, post-traumatic osteoarthritis, elbows with joint replacements, elbows with failed joint replacements, tennis elbow, elbows with loose bodies, etc. Patients were asked what they thought were the most important activities of daily living affected specifically by their elbow problem. A total of 21 common items were obtained. Experts in elbow surgery were consulted, including members of the university department where this project was done and eminent elbow surgeons from various parts of the UK. Based on their judgement, the patient responses and literature review, a list was drawn up of functional and clinical criteria comprising a total of 42 items.

Item reduction
From this list, items were selected using the following criteria. For inclusion, items had to reflect a problem which is common, has a substantial effect on daily life, is important to patients and judged important by expert clinicians, and which affects tasks which are performed by all subjects at some functional level; it should be stable over at least short periods of time yet have potential for change. We excluded items which were generic, repetitive, not reflective of disability, not relevant to the elbow and not highly endorsed by expert opinion and patients. We separated the functional items (activities of daily living) and the clinical items (assessed by the clinician), and obtained a reduced list: the functional items included the use of the other arm due to difficulty with the affected arm, combing hair, personal hygiene, dressing, household activities, lifting, pain, sport, leisure activities, driving, gardening, keyboard, writing and feeding; the clinical items assessed motion, strength, ulnar nerve problems, instability and deformity. These were then tested on 25 out-patients attending with elbow problems. Patients were asked whether they could understand the questions, and how relevant they were to their problems. The clinicians who administered these questionnaires were consulted regarding the practicality and difficulty of administering the clinical items. Six patients felt that instead of ‘personal hygiene’ the term ‘washing’ would be easier to understand. Seven felt that the question on driving was irrelevant because they had never driven in their lives. Sport and leisure were grouped together as most patients considered they indulged in only one of these, and some took them to be the same. In the same way gardening, keyboard and writing were not universally approved by the patients, and we omitted these as potentially lacking consistency.

The final list of items was as listed in Table 2. For each item we used a five-level Likert scale, as commonly used for questionnaires of this kind and also advised by our statisticians, starting from zero for simplicity of calculation. Wording was such that 0 represented worst/least function and 4 best/most function. In the earlier studies, patients were asked to answer about ‘how you are now’. With the last 21 patients studied (and used for the test–retest study), we addressed the question of timeframe explicitly by specifying the last 4 weeks (Table 2). We asked these patients whether they ‘would have answered any of these questions differently if a time limit had not been given’, and the opposite question of 20 of the patients who had been given the earlier questionnaire. Both questions were invariably answered in the negative, and there was no discernible difference in reproducibility or internal consistency between the earlier and this slightly modified form of the questionnaire.


 
TABLE 2. The components of the Liverpool Elbow Score

 
Generation of the scale: the clinical assessment
We considered the categories of deformity, instability, motion, strength and ulnar nerve assessment. It was widely felt by experts that deformity does not contribute to the functional efficiency of the elbow. Instability was omitted as it was impossible to assess consistently because of the influence of elbow position: instability is usually measured in full extension and approximately 15° of flexion, and this could not be done consistently due to the varying amount of flexion deformity in patients. These two components were therefore omitted from the clinical assessment scale. The items remaining included motion, strength and ulnar nerve assessment. Under the heading of motion, flexion, extension, supination and pronation were assessed. A minimum value of 0 and maximum of 3 were given, 3 being the best motion: a greater scale length was felt to be redundant. Consideration was given to wrist and forearm pathologies as restrictions in supination or pronation might well affect the movements at the elbow: thus 1 was added to the score for supination and pronation to compensate. Strength was initially thought best measured using instruments, but these could not in practice be always available or usable in a busy clinic. Instead the MRC strength grading was modified by leaving out the conventional Grade 1 (flicker) as being irrelevant in orthopaedic functional assessment; the modified grading is 0 = absent, 1 = complete motion with gravity eliminated, 2 = complete motion against gravity, 3 = complete motion against gravity and some resistance, 4 = apparently normal strength. Ulnar nerve assessment is very important in assessing elbow function, especially in the joint replacement groups [38], and was given a score from 0 to 3 depending on sensory and motor involvement.

The scale as used
The final questionnaire combined a nine-item patient-answered questionnaire (PAQ) and a six-item clinical assessment score (CAS) component. For calculation of the final score, all responses were transformed to a scale of 0–10, and equally weighted (see Discussion) for summation by averaging, so the final score runs from 10 (best) to 0 (worst). Thus the total score can be expressed as

\[ \mathrm{LES} = \frac{10}{15}\left( \sum_{i=1}^{6} \frac{C_i}{M_{C_i}} + \sum_{j=1}^{9} \frac{P_j}{M_{P_j}} \right) \]

that is, 10 times the average of the six clinician-assessed components (C_i) and the nine patient-answered components (P_j), each divided by its theoretical maximum value (the M-terms): these maxima are 4 for all the patient-answered questions (i.e. all M_{P_j} = 4) and for the fifth clinician-assessed component, i.e. strength (so M_{C_5} = 4), and 3 for the remaining clinician-assessed components (so M_{C_1} = M_{C_2} = M_{C_3} = M_{C_4} = M_{C_6} = 3). To put this explicitly, and rearranging for convenience,

\[ \mathrm{LES} = \frac{10}{15}\left( \frac{C_1 + C_2 + C_3 + C_4 + C_6}{3} + \frac{C_5 + \sum_{j=1}^{9} P_j}{4} \right) \]

The limiting range 0–10 was chosen somewhat arbitrarily, to avoid on the one hand an excessive use of decimal places, and on the other hand an irrelevantly large number of levels.
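The calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' software: the function and constant names are hypothetical, and the item order (strength as the fifth clinical item) follows the description in the text.

```python
from typing import Sequence

# Theoretical maxima as stated in the text: strength (the fifth clinical
# item) is graded 0-4, the other five clinical items 0-3, and all nine
# patient-answered items 0-4. The names below are illustrative.
CLINICAL_MAXIMA = (3, 3, 3, 3, 4, 3)
PATIENT_MAXIMA = (4,) * 9


def liverpool_elbow_score(clinical: Sequence[int], patient: Sequence[int]) -> float:
    """Return the score on the 0 (worst) to 10 (best) scale."""
    if len(clinical) != len(CLINICAL_MAXIMA) or len(patient) != len(PATIENT_MAXIMA):
        raise ValueError("expected 6 clinical and 9 patient responses")
    values = list(clinical) + list(patient)
    maxima = CLINICAL_MAXIMA + PATIENT_MAXIMA
    fractions = []
    for value, maximum in zip(values, maxima):
        if not 0 <= value <= maximum:
            raise ValueError(f"response {value} outside 0..{maximum}")
        # Each item is rescaled to 0..1 so that all 15 items weigh equally.
        fractions.append(value / maximum)
    # Ten times the average of the 15 rescaled items.
    return 10 * sum(fractions) / len(fractions)
```

With every item at its maximum the function returns 10, and with a perfect clinical assessment but the worst possible patient responses it returns 4, reflecting the 6:9 split between clinical and patient-answered items.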

Assessment of the scale: characteristics of the patients
The final questionnaire thus developed was validated in a prospective way for internal consistency, reproducibility, validity and sensitivity to clinical change [32–34]. During their assessment patients were examined by the clinician and then asked to answer the questionnaire (with the clinician present). There were 112 assessments in all. We studied 63 patients (median age 55 yr, range 15–77 yr) with various elbow conditions: 19 rheumatoid arthritis, 14 osteoarthritis, eight tennis elbow, six arthroplasty, six loose bodies, three fractured radial head, three ulnar nerve problems, two golfers’ elbow, and one each of osteochondritis dissecans and synovial chondromatosis. Of these, 28 were studied again between 1 and 3 days later when no change had occurred in their condition. Of the original 63 patients, 18 were studied again (median 24 weeks) after surgery of various kinds.

Assessment of the scale: other instruments
Patients were also asked to answer the SF-12 [39], NHP (Nottingham Health Profile) [40] and the DASH [41]. SF-12 is a short-form health survey of 12 general health questions, which are weighted separately for mental and physical assessments [39]: we used the physical weights in this study. NHP is a generic health-related quality of life measure [39], structured to assess physical mobility (eight items), pain (eight items), social isolation (five items), emotional reactions (nine items), energy (three items) and sleep (five items). Each item is weighted: we used the physical weights. The DASH questionnaire is designed to measure upper limb disability and symptoms [41]. Functional domains include physical, social and psychological. It uses a single-scale, 30-item questionnaire of upper extremity function and symptoms. All three questionnaires are already well validated.

Methods of validation
Internal consistency was assessed by administering the instrument to a group of patients on one occasion and estimating to what extent the items yield similar results. For this we used Cronbach's alpha coefficient [42], which (in effect) assumes that each actual item represents a retest of a single notional item, and also the correlation between each individual item and the overall score.
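For readers unfamiliar with Cronbach's alpha, the standard textbook formula can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name and data layout (one row of item scores per patient) are assumptions, and it is not the statistical software used in the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of rows, one row of item scores per patient."""
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two patients")
    if any(len(row) != n_items for row in rows):
        raise ValueError("all rows must have the same number of items")
    # Sample variance of each item across patients.
    item_variances = [variance([row[i] for row in rows]) for i in range(n_items)]
    # Sample variance of the per-patient total score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)
```

Items that rise and fall together across patients give a value close to 1, while items that vary independently of each other give a value close to 0.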

Construct validity concerns the extent to which the questionnaire supports predefined hypotheses and the degree to which it relates to other established and accepted questionnaires. In the present context we assessed a particular component of construct validity, convergent validity, by measuring the correlation with other established and accepted questionnaires; for this we used Pearson's correlation coefficient. We considered that the score should correlate well with the ‘physical’ components of SF-12 and NHP, and also with DASH.

Test–retest reliability (reproducibility) was assessed by administering the test to the same sample on two different occasions, on the assumption that there is no substantial change in what is measured. This was assessed using Pearson's correlation coefficient (R), expressed here as the coefficient of determination (R²), and by the mean ± S.D. of the test–retest difference; the mean test–retest S.D. was also expressed as a coefficient of variation (CV%).
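The test–retest quantities named above can be illustrated with a short Python sketch. The function names are hypothetical. The text does not spell out how the CV% was computed, so the sketch makes a labelled assumption: the within-subject S.D. is taken as the S.D. of the paired differences divided by √2, expressed as a percentage of the grand mean of all scores.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def retest_summary(first, second):
    """Summarise agreement between two administrations of the same score."""
    diffs = [b - a for a, b in zip(first, second)]
    r = pearson_r(first, second)
    sd_diff = stdev(diffs)
    # ASSUMPTION (not stated in the paper): within-subject S.D. is the S.D.
    # of the paired differences divided by sqrt(2), relative to the grand mean.
    within_sd = sd_diff / sqrt(2)
    grand_mean = mean(list(first) + list(second))
    return {
        "r_squared": r * r,
        "mean_diff": mean(diffs),
        "sd_diff": sd_diff,
        "cv_percent": 100 * within_sd / grand_mean,
    }
```

A constant shift between the two occasions leaves R² at 1 but shows up in the mean difference, which is why the two statistics are reported together.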

Sensitivity to change was investigated by comparing pre- and postoperative scores with changes in other scores administered at the same time [41, 43], using the coefficient of determination (R²). We are of course comparing the preoperative–postoperative differences, not making claims about the benefits or otherwise of the operation.
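As an illustration of this change-score comparison, the following Python sketch correlates the preoperative–postoperative differences of two instruments. The function names are hypothetical and the sketch is not the authors' analysis code. Because DASH is scored so that higher values indicate more disability whereas a higher LES indicates better function, the underlying correlation is expected to be negative; R² does not show the sign.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def change_score_r_squared(pre_a, post_a, pre_b, post_b):
    """R squared between the changes in instrument A and instrument B."""
    # Change is defined as postoperative minus preoperative score.
    delta_a = [post - pre for pre, post in zip(pre_a, post_a)]
    delta_b = [post - pre for pre, post in zip(pre_b, post_b)]
    r = pearson_r(delta_a, delta_b)
    return r * r
```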


    Results
 
The new score was easy to administer. Patients did not find any of the components in the PAQ difficult to understand or interpret. Because of the small number of questions, the scoring did not take much time (<2 min). The CAS was also easy and rapid (<1 min).

Results of validation did not differ significantly between pre- and postoperative patients. The initial (pre-op) studies are used in the analysis and figures which follow.

Internal consistency for the score was good (Table 3); of the two main components of the score, the PAQ by itself had better internal consistency and better correlation with the overall score.


 
TABLE 3. Internal consistency

 
Construct (convergent) validity was assessed by comparison with other scores (Fig. 1 and Table 4): overall, correlation was excellent with DASH, slightly less so with the physical parts of NHP and SF-12. Test–retest reliability was excellent (Fig. 2). Preoperative–postoperative changes correlated well with those in DASH and NHP (physical) (Fig. 3) but not SF-12 (Table 4).



 
FIG. 1. Convergent validity. Data are shown from the initial study on all 63 patients. Solid lines represent the linear regression. R² and corresponding P-values are given in each panel. (A) Correlation of LES and DASH. (B) Correlation of LES and SF-12 (physical weights). (C) Correlation of LES and NHP (physical weights). Of the two basic components of the LES, the correlations for the PAQ (patient-answered) part alone were better than for the whole score. See Table 4.

 

 
TABLE 4. Correlations with other scores

 


 
FIG. 2. Test–retest reliability. The dashed line is the line of identity; the solid line is the linear regression. The corresponding value of R² for the CAS component alone was 0.96, while that for the PAQ component was 0.92. These can be compared with similar values for the other scores: R² = 0.90 for NHP (physical), 0.95 for DASH, 0.79 for SF-12 (physical score). The mean ± S.D. difference for LES was –0.1 ± 0.50 (not significantly different from zero), corresponding to a coefficient of variation of 4% [cf. 7% for DASH and SF-12 and 10% for NHP (physical)]. Number of comparisons = 28.

 


 
FIG. 3. Sensitivity to effects of surgical treatment. The figure shows the relationship between the preoperative–postoperative differences (i.e. postoperative score minus preoperative score) in LES and two other clinical scores. Solid lines are linear regressions. R² and corresponding P-values are given in each panel. (A) Correlation of preoperative–postoperative differences of LES and DASH. (B) Correlation of preoperative–postoperative differences of LES and NHP (physical). The corresponding correlations for the PAQ component alone were slightly higher than for the whole score. See Table 4.

 
There was, as one would expect, a good correlation between the pain questions in PAQ and those in DASH (R² = 0.32, P<0.0001) and NHP (R² = 0.38, P<0.0001).

Diagnosis appeared to have no significant influence on the results (i.e. internal consistency, reproducibility or correlation with other measures), as assessed pragmatically by dividing the patients into two main groups: disease involving mainly the joint surface (largely osteoarthritis, but also loose bodies and osteochondritis dissecans) and disease involving mainly the periarticular structures (largely rheumatoid arthritis, but also ulnar nerve problems, tennis elbow, golfers’ elbow, synovial chondromatosis and posterior impingement).


    Discussion
 
We used currently accepted methods [35, 36] to generate an elbow-specific score, simple enough to be rapidly administered in clinics. Reproducibility and internal consistency were good. Both preoperative results and the effects of surgery correlated acceptably with general scores and with the upper-limb score DASH. It is therefore the first properly validated [33] elbow score: although MacDermid [44] evaluated the reliability of several patient questionnaires (the ASES Elbow Form, the Patient-rated Elbow Evaluation, the DASH questionnaire and SF-36) in patients with elbow pathology, there was no assessment of sensitivity to change or internal consistency. Sensitivity to change in particular is an important aspect of validation of this kind [45].

The score follows the principle that the patient can provide reliable and valid judgements of health status and of the benefits of the treatment [31]: 60% of the total score comes from the PAQ. There is some debate on the need for clinical assessment as part of a score, alongside patient evaluation. As in other cases [33], the PAQ performed somewhat better alone, at least as judged by Cronbach's alpha coefficient (Table 3). We would suggest that while leaving the CAS component out might miss important aspects of elbow pathology, it would certainly be possible to use the PAQ alone in situations where a purely postal or telephone assessment would be desirable on practical grounds.

Is a new score needed for the elbow? Existing elbow scores have not been fully validated [30, 44] and there is little agreement between the scores [30], and also little consistency in their exact definition. It was our belief that there are desirable components of an elbow score which none of the existing scores possessed. The present elbow score was developed by consulting patients with various different problems, and tested on patients with a variety of pathologies and operations. The items in the PAQ were included on the basis of interviews with patients, which itself assists in achieving content validity, and refined by testing, retesting and several consultations with experts. The resulting score was tested statistically in a prospective manner for all the components of validity which are a prerequisite for an assessment tool [34]. It has good internal consistency and reproducibility.

For purposes of construct validity and sensitivity to change, we compared it with three well-established and well-validated scores. NHP and SF-12 are questionnaires that can be used for any disease state, and DASH is a general upper limb questionnaire. We contend that this is better than comparing clinician-assessed severity and patient-assessed severity, as has been done in some validation studies [30]. Sensitivity to change is an essential criterion in scale validation [45] which has not been assessed in patients with elbow pathology [44]. We have shown that the new score responds to operative treatment at least as well as the general scores against which we compared it. The difference between them, of course, is the direct focus on the elbow (content validity).

We chose to weight each item in the score equally. This is a difficult point, and a number of different choices are available. It would have been wrong to allow the weighting to be a function of the scale length for each item, as this is decided on the basis of precedent, convenience and a judgement of the reasonableness of the size of the distinctions that the scale length implies. Thus, we adjusted each score to result in equal weight. However, it would of course be possible to alter the weightings, for example to optimize reproducibility or internal consistency. This would require a very much larger number of patients, and in fact is unlikely to depart very much from equal weighting (as the contributions to Cronbach's alpha are so similar). Any such optimization would be likely to reduce considerably the contribution of the CAS, but we would argue on general grounds that it would be unwise to ignore objective findings in the assessment of an elbow problem. One consequence of this equal weighting is that the pain item is only 1/9 (~11%) of the PAQ and 1/15 (~7%) of the whole score. This is in contrast to, for example, the five major scoring systems reviewed by Turchin et al. [30], in which pain is weighted as 30–50% of the final score. However, pain can make a contribution to the functional limitations referred to in any of the other PAQ items. We felt it would be impractical and inappropriate to distinguish this contribution from, for example, restrictions due to deformity, instability or weakness. The pain weighting in fact makes little difference to the performance of the test. Increasing its weighting so that the pain question contributes 40% to the overall score (54% to PAQ alone) results in a trivial decrease in reproducibility (R² now 0.89 for both PAQ and the full LES), and very small effects on the correlations between LES and other scores, both absolute values and preoperative–postoperative differences.
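The weighting arithmetic in this paragraph can be checked with a small Python sketch. The function name is hypothetical, and the sketch assumes that when the pain item is up-weighted the remaining 14 items continue to share the rest of the weight equally; the text does not state this, but the assumption reproduces the quoted figure of 54% of the PAQ.

```python
def pain_weights(pain_share_of_total, n_patient_items=9, n_clinical_items=6):
    """Item weights when pain carries a given share of the total score."""
    n_other = n_patient_items + n_clinical_items - 1
    # ASSUMPTION: the other 14 items share the remaining weight equally.
    other_weight = (1 - pain_share_of_total) / n_other
    # The PAQ comprises the pain item plus eight other patient items.
    paq_weight = pain_share_of_total + (n_patient_items - 1) * other_weight
    return {
        "other_item": other_weight,
        "paq_share_of_total": paq_weight,
        "pain_share_of_paq": pain_share_of_total / paq_weight,
    }


equal = pain_weights(1 / 15)   # equal weighting of all 15 items
heavy = pain_weights(0.40)     # pain up-weighted to 40% of the total
```

Under equal weighting the pain item is 1/9 of the PAQ and the PAQ is 60% of the total; with pain at 40% of the total, its share of the PAQ alone is 0.4 / (0.4 + 8 × 0.6/14), about 54%.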

We acknowledge that larger studies would be necessary to validate the LES separately for different elbow conditions. The LES was developed in tertiary care, and it would be interesting to test it in the primary care setting; we would expect that the increasing number of general practitioners with an interest in musculoskeletal conditions would find the CAS straightforward, and the PAQ could easily be administered by any health-care professional.

The authors have declared no conflicts of interest.


    References
 

  1. Morrey BF, An KN, Chao EYS. Functional evaluation of the elbow. In: Morrey BF, ed. The elbow and its disorders, 2nd edn. Philadelphia, PA, USA: W. B. Saunders, 1993, pp. 86–97.
  2. Cobb TK, Morrey BF. Total elbow arthroplasty as primary treatment for distal humeral fractures in elderly patients. J Bone Joint Surg Am 1997;79:826–32.[Abstract/Free Full Text]
  3. King GJW, Adams RA, Morrey BF. Total elbow arthroplasty: revision with use of a non-custom semiconstrained prosthesis. J Bone Joint Surg Am 1997;79:394–400[Abstract/Free Full Text]
  4. Modabber R, Jupiter JB. Reconstruction for post-traumatic conditions of the elbow joint. J Bone Joint Surg Am 1995;77:1431–46.[ISI][Medline]
  5. Morrey BF, Bryan RS. Revision total elbow arthroplasty. J Bone Joint Surg Am 1987;69:523–32.[Abstract]
  6. Verstreken F, De Smet L, Westhovens R, Fabry G. Results of the Kudo elbow prosthesis in patients with rheumatoid arthritis: a preliminary report. Clin Rheumatol 1998;17:325–8.[ISI][Medline]
  7. Regan W, Morrey B. Fractures of the coronoid process of the ulna. J Bone Joint Surg Am 1989;71:1348–54.[Abstract]
  8. Broberg MA, Morrey BF. Results of treatment of fracture-dislocations of the elbow. Clin Orthop 1987;216:109–19.[Medline]
  9. Ewald FC. Total elbow replacement. Orthop Clin North Am 1975;3:685–96.
  10. Schemitsch EH, Ewald FC, Thornhill TS. Results of total elbow arthroplasty after excision of the radial head and synovectomy in patients who had rheumatoid arthritis. J Bone Joint Surg Am 1996;78:1541–7.[Abstract/Free Full Text]
  11. Figgie MP, Inglis AE, Mow CS, Figgie HE. Total elbow arthroplasty for complete ankylosis of the elbow. J Bone Joint Surg Am 1989;71:513–20.[Abstract]
  12. Figgie MP, Inglis AE, Mow CS, Wolfe SW, Sculco TP, Figgie HE. Results of reconstruction for failed total elbow arthroplasty. Clin Orthop 1990;253:123–32.[Medline]
  13. Schneider T, Hoffstetter I, Fink B, Jerosch J. Long-term results of elbow arthroscopy in 67 patients. Acta Orthop Belg 1994;60:378–83.[Medline]
  14. Flynn JC, Matthews JG, Benoit RL. Blind pinning of displaced supracondylar fractures of the humerus in children. J Bone Joint Surg Am 1974;56:263–72.[Medline]
  15. Cheng JCY, Lam TP, Shen WY. Closed reduction and percutaneous pinning for type III displaced supracondylar fractures of the humerus in children. J Orthop Trauma 1995;9:511–5.[ISI][Medline]
  16. Neviaser JS, Wickstrom JK. Dislocation of elbow: a retrospective study of 115 patients. South Med J 1977;70:172–3.[ISI][Medline]
  17. Jupiter JB, Neff U, Holzach P, Allgower M. Intercondylar fractures of the humerus. J Bone Joint Surg Am 1985;67:226–39.[Abstract]
  18. Holdsworth BJ, Mossad MM. Fractures of the adult distal humerus. J Bone Joint Surg Br 1990;72:362–5.[ISI][Medline]
  19. Low CK, Wong DHC, Toh CL, Wong HP, Low YP. A retrospective study on elbow function after internal fixation of intercondylar fracture of adult humerus. Ann Acad Med 1997;26:168–71.
  20. Khalfayan EE, Culp RW, Alexander H. Mason Type II radial head fractures: operative versus nonoperative treatment. J Orthop Trauma 1992;6:283–9.
  21. Yokoyama K, Itoman M, Kobayashi A, Shindo M, Futami T. Functional outcomes of ‘floating elbow’ injuries in adult patients. J Orthop Trauma 1998;12:284–90.
  22. Beaton DE, Katz JN, Fossel AH, Wright JG, Tarasuk V, Bombardier C. Measuring the whole or the parts? Validity, reliability and responsiveness of the disabilities of the arm, shoulder, and hand outcome measure in different regions of the upper extremity. J Hand Ther 2001;14:128–46.
  23. Petre D, Verborgt O, Vanglabbeek F, Verstreken J. Treatment of advanced impingement syndrome by arthroscopic subacromial decompression. Acta Orthop Belg 1998;64:257–62.
  24. King GJW, Richards RR, Zuckermann JD et al. A standardised method for assessment of elbow function. J Shoulder Elbow Surg 1999;8:351–4.
  25. Beaton DE, Richards RR. Measuring function of the shoulder: a cross-sectional comparison of five questionnaires. J Bone Joint Surg Am 1996;78:882–90.
  26. Patrick DL, Deyo RA. Generic and disease-specific measures in assessing health status and quality of life. Med Care 1989;27(3 Suppl.):217–32.
  27. L’Insalata JC, Warren RF, Cohen SB, Altchek DW, Peterson MGE. An administered questionnaire for assessment of symptoms and function of the shoulder. J Bone Joint Surg Am 1997;79:738–48.
  28. Amadio PC. Editorial. Outcomes measurement: more questions; same answers. J Bone Joint Surg Am 1993;75:1583–4.
  29. MacDonald DA. The shoulder and elbow. In: Pynsent PB, Fairbank JC, Carr A, eds. Outcome measures in orthopaedics. Oxford, UK: Butterworth-Heinemann, 1993:144–73.
  30. Turchin DC, Beaton DE, Richards RR. Validity of observer-based aggregate scoring systems as descriptors of elbow pain, function, and disability. J Bone Joint Surg Am 1998;80:154–62.
  31. Dawson J, Fitzpatrick R, Murray D, Carr A. Questionnaire on the perceptions of patients about total knee replacement. J Bone Joint Surg Br 1998;80:63–9.
  32. Cox DR, Fitzpatrick R, Fletcher AE et al. Quality of life assessment: can we keep it simple? J R Statist Soc Ser A 1992;155:353–93.
  33. Dawson J, Fitzpatrick R, Carr A, Murray D. Questionnaire on the perceptions of patients about total hip replacement. J Bone Joint Surg Br 1996;78:185–90.
  34. Dawson J, Fitzpatrick R, Carr A. Questionnaire on the perceptions of patients about shoulder surgery. J Bone Joint Surg Br 1996;78:593–600.
  35. Guyatt GH, Bombardier C, Tugwell PX. Measuring disease-specific quality of life in clinical trials. Can Med Assoc J 1986;134:889–95.
  36. Streiner DL, Norman GR. Health measurement scales: a practical guide to their development and use. New York, NY: Oxford University Press, 1995.
  37. Kirshner B, Guyatt G. A methodological framework for assessing health indices. J Chron Dis 1985;38:27–36.
  38. Thomas W, Ewald F, Morrey B. Total elbow replacement—expert exchange. Perspect Orthop Surg 1992;3:17–30.
  39. Ware J Jr, Kosinski M, Keller SD. A 12-item short form health survey: construction of scales and preliminary tests of reliability and validity. Med Care 1996;34:220–33.
  40. Hunt SM, McKenna SP, McEwen J, Backett EM, Williams J, Papp E. A quantitative approach to perceived health status: a validation study. J Epidemiol Community Health 1980;34:281–6.
  41. Hudak PL, Amadio PC, Bombardier C and Upper Extremity Collaborative Group. Development of an upper extremity outcome measure: The DASH (Disabilities of the Arm, Shoulder, and Hand). Am J Ind Med 1996;29:602–8.
  42. Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika 1951;16:297–334.
  43. Kazis LE, Anderson JJ, Meenan RF. Effect sizes in interpreting changes in health status. Med Care 1989;27(3 Suppl.):178–89.
  44. MacDermid JC. Outcome evaluation in patients with elbow pathology: issues in instrument development and evaluation. J Hand Ther 2001;14:105–14.
  45. Fitzpatrick R, Ziebland S, Jenkinson C, Mowat A, Mowat A. Importance of sensitivity to change as a criterion for selecting health status measures. Qual Health Care 1992;1:89–93.
Submitted 19 December 2003; revised version accepted 13 July 2004.

An erratum has been published