Cross-validation of a composite pain scale for preschool children within 24 hours of surgery

S. Suraseranivongse*, U. Santawat, K. Kraiprasit, S. Petcharatana, S. Prakkamodom and N. Muntraporn

Department of Anesthesiology, Faculty of Medicine, Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand. *Corresponding author

Accepted for publication: March 26, 2001


    Abstract
 
This study was designed to cross-validate a composite measure of the pain scales CHEOPS (Children’s Hospital of Eastern Ontario Pain Scale), OPS (Objective Pain Scale, simplified for parent use by replacing blood pressure measurement with observation of body language or posture), TPPPS (Toddler Preschool Postoperative Pain Scale) and FLACC (Face, Legs, Activity, Cry, Consolability) in 167 Thai children aged 1–5.5 yr. The pain scales were translated and tested for content, construct and concurrent validity, including inter-rater and intra-rater reliabilities. Discriminative validity in immediate and persistent pain for the age groups ≤3 and >3 yr was also studied. The children’s behaviour was videotaped before and after surgery, before analgesia had been given in the post-anaesthesia care unit (PACU), and on the ward. Four observers then rated pain behaviour from rearranged videotapes. The decision to treat pain was based on routine practice and was made by a researcher unaware of the rating procedure. All tools had acceptable content validity and excellent inter-rater and intra-rater reliabilities (intraclass correlation >0.9 and >0.8 respectively). Construct validity was determined by the ability to differentiate the group with no pain before surgery from the group with a high pain level after surgery, before analgesia (P<0.001). The positive correlations among all scales in the PACU and on the ward (r=0.621–0.827, P<0.0001) supported concurrent validity. Use of the κ statistic indicated that CHEOPS yielded the best agreement with the routine decision to treat pain. The younger and older age groups both yielded very good agreement in the PACU but only moderate agreement on the ward. On the basis of data from this study, we recommend CHEOPS as a valid, reliable and practical tool.

Br J Anaesth 2001; 87: 400–5

Keywords: pain, postoperative; pain, paediatric; children, preschool; statistics


    Introduction
 
Postoperative pain has been undertreated in children compared with adults.1 One of the main reasons is the difficulty of assessing pain.2 Preschool children usually lack the verbal and cognitive skills to describe their feeling of pain or physical discomfort. Pain assessment in this population depends on the observation and expertise of the care-provider. To date, many multidimensional composite measures that include self-report of pain in addition to behavioural and/or physiological indicators have been developed.3 These pain scales, including CHEOPS (Children’s Hospital of Eastern Ontario Pain Scale),4 OPS (Objective Pain Scale, simplified for parent use by replacing blood pressure measurement with observation of body language or posture),5 6 TPPPS (Toddler Preschool Postoperative Pain Scale)7 and FLACC (Face, Legs, Activity, Cry, Consolability),8 were validated in North American children immediately after emergence from general anaesthesia in the post-anaesthesia care unit (PACU).

The pain response is affected by several psychological factors, including cultural differences, observational learning, cognitive appraisal and coping style.9 Cross-validation is needed in order to apply these measures in different cultures. Subsequent research has found that CHEOPS is ineffective in measuring pain which persists several hours after surgery. Beyer and colleagues found that the CHEOPS scores were generally very low and did not correlate well with a self-reported measure in children aged 3–7 yr.10 Most scales in this group are highly correlated with CHEOPS and use many of the same behaviours, so it is likely that they would have similar problems. However, this speculation is based on extrapolation from one study, and more psychometric work is needed on these scales and their effectiveness in measuring persistent pain.11

The aims of this study were to cross-validate these composite pain scales in Thai children in terms of validity and reliability, to assess the discriminative ability of these measures during two periods (immediate pain in the PACU and persistent pain on the ward) and in two age groups (≤3 and >3 yr), and to assess the practicality of the tools.


    Patients and methods
 
After obtaining approval from the ethics committee and informed written or oral consent from parents, we studied preschool children aged 1–5 yr who were undergoing general anaesthesia and surgery at Siriraj Hospital and the Queen Sirikit National Institute of Child Health, which are tertiary care hospitals in Bangkok, Thailand. Pain relief during anaesthesia was given according to the decision of the anaesthetists who had clinical care of the patients. The children were observed after awakening from anaesthesia until 24 h after surgery. Patients were excluded if they were ambulatory or had postoperative ventilatory support, chronic pain or developmental delay.

Behaviour at baseline before surgery and after surgery, before the administration of analgesia in the PACU and on the ward, was recorded with a video camera which was hidden from view. A nurse called the researcher to record a videotape of the child’s behaviour when she diagnosed pain. Criteria for the diagnosis of pain were discomfort, reflection of pain in the eyes, moaning, a facial expression or grimace, or complaint of pain with active or passive movement or without movement. Other causes of distress, such as hunger, thirst, nausea and vomiting, were relieved. Distraction techniques, such as telling stories, playing with toys and watching television, were attempted before pain was diagnosed. The decision to give analgesics was made by one of the researchers, who was unaware of the child’s pain scale rating. This decision was based on observations in routine clinical practice combined with the patient’s self-report (older children) and the parent’s opinion (if available). Intravenous fentanyl 1 µg kg⁻¹ or pethidine 0.5 mg kg⁻¹ was given as analgesia in the PACU. On the ward, analgesics were prescribed by surgeons. Most children were given pethidine 1 mg kg⁻¹ i.m. for moderate to severe pain and oral paracetamol 10 mg kg⁻¹ for mild pain.

Patients who were not considered to have pain by both nurses and the researcher in the PACU were videotaped just before they left the PACU, and the behaviour was recorded as behaviour before analgesics were given. Similarly, if the patients on the ward were not considered to have pain during the period since the last analgesic, their behaviour was videotaped 6 h after the last analgesic was given and recorded as behaviour before the analgesic period.

The chronological sequence of the videotapes was rearranged into a new random sequence by a computer program in order to blind the observers. We trained four observers (nurse anaesthetists) to rate all the pain scales. The inter-rater reliability between the four observers was tested by rating 30 pain behaviours for each pain scale. All pain behaviours were randomized into four blocks. As the inter-rater reliability yielded a high score (intraclass correlation >0.9), each observer rated each block on a single occasion.

Cross-validations were performed by translation, validity testing and reliability testing.

Translation
The four pain scales were translated from English into Thai by an anaesthetist who was fluent in both languages. Then, another bilingual anaesthetist, who was not associated with the translation phase, translated the Thai version back into English. Finally, the back-translated scales were rechecked with the original scales by another translator whose mother tongue was English. Alterations were made on the basis of the third expert’s opinion in order to produce the same meaning as the original scales.

Reliability testing
Reliability is a measure of consistency and was assessed as the intra- and inter-rater reliabilities. For intra-rater reliability, the observers were asked to rate 30 pain behaviours for each pain scale on a second occasion, 2 weeks after the first occasion. For inter-rater reliability, the four observers were asked to rate the same 30 pain behaviours for each pain scale.
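As an illustration only, the variance-component form of the intra-class correlation given in the Statistics subsection can be sketched as follows. This is a minimal sketch, not the authors' SPSS procedure: the function name, the two-way ANOVA estimation of the components and the example scores are assumptions, and negative variance estimates are truncated at zero.

```python
def icc_from_variance_components(ratings):
    """Intra-class correlation from a subjects x observers table of scores.

    ratings: list of rows, one row per rated behaviour, one column per observer.
    Assumes a complete table with at least two rows and two columns.
    """
    n = len(ratings)        # number of rated behaviours (subjects)
    k = len(ratings[0])     # number of observers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_subject = k * sum((m - grand) ** 2 for m in row_means)
    ss_observer = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_subject - ss_observer

    ms_subject = ss_subject / (n - 1)
    ms_observer = ss_observer / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Variance components; negative estimates are truncated at zero
    var_subject = max((ms_subject - ms_error) / k, 0.0)
    var_observer = max((ms_observer - ms_error) / n, 0.0)
    var_error = max(ms_error, 0.0)

    total = var_subject + var_observer + var_error
    if total == 0:
        raise ValueError("no variance in the ratings; ICC is undefined")
    # R = var_subject / (var_subject + var_observer + var_error)
    return var_subject / total


# Hypothetical scores for five behaviours rated by four observers (not study data)
example = [
    [4, 4, 5, 4],
    [9, 10, 9, 9],
    [6, 6, 6, 7],
    [12, 11, 12, 12],
    [5, 5, 4, 5],
]
print(round(icc_from_variance_components(example), 3))
```

The same function could be applied to intra-rater data by treating the first and second rating occasions of one observer as the two columns.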

Validity testing
Validity is a measure of accuracy, and was evaluated as follows.

Content validity
All pain scales were tested, on the basis of content relevance, coverage and scaling, by a paediatrician, a paediatric anaesthetist, a paediatric surgeon, a paediatric psychologist, a nurse and a kindergarten teacher. Content validity of each item was scored as 1=relatively valid, 0=not sure, –1=relatively irrelevant.
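The item-level scoring described above, combined with the acceptance rule stated in the Statistics subsection (total score of each item divided by the number of experts, acceptable if at least 0.5), can be sketched as below. The function names and the expert ratings are hypothetical and are shown only to make the arithmetic concrete.

```python
def item_correlation(scores):
    """Total expert score for one item divided by the number of experts."""
    allowed = {-1, 0, 1}
    if not scores or any(s not in allowed for s in scores):
        raise ValueError("each expert score must be -1, 0 or 1")
    return sum(scores) / len(scores)


def is_acceptable(scores, threshold=0.5):
    """Apply the acceptance rule IC >= threshold."""
    return item_correlation(scores) >= threshold


# Hypothetical ratings from six experts for two items (not study data)
ratings = {
    "item_a": [1, 1, 1, 0, 1, 1],
    "item_b": [1, 0, -1, 0, 1, 0],
}
for name, scores in ratings.items():
    print(name, round(item_correlation(scores), 2), is_acceptable(scores))
```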

Construct validity
This is an assessment of the meaning of the instrument in terms of its theoretical basis by comparison with external variables related to the construct. We compared the scores for all four pain scales at no pain before surgery with those after surgery, before analgesia, as the postoperative pain scores were expected to be higher than those during the preoperative period.
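The before-versus-after comparison can be illustrated with a rank sum test, which is the test named in the Statistics subsection. The sketch below is an assumption-laden illustration rather than the authors' analysis: it uses the large-sample normal approximation with a tie correction and no continuity correction, and the scores are invented.

```python
from statistics import NormalDist


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (W, z, two-sided p), where W is the rank sum of x."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    w = sum(average_ranks(combined)[:n1])
    mean_w = n1 * (n + 1) / 2
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(c ** 3 - c for c in counts.values())
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_w == 0:
        raise ValueError("all values are identical; the test is undefined")
    z = (w - mean_w) / var_w ** 0.5
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w, z, p


# Hypothetical scores (not study data): before surgery vs after surgery, before analgesia
before = [4, 4, 5, 4, 5, 4, 6, 4]
after = [9, 11, 8, 10, 7, 12, 9, 10]
w, z, p = rank_sum_test(before, after)
print(w, round(z, 2), p)
```

A negative z here means that the first group (before surgery) has the lower ranks, which is the direction expected if the scale measures pain.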

Concurrent validity
Correlations of OPS, TPPPS and FLACC with CHEOPS were tested at the same point in time. We identified appropriate cut-off points for each pain scale, which yielded the highest agreement (κ value) with the clinical decision to treat pain.
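The correlation between two scales scored on the same behaviours can be sketched as a Spearman rank correlation, the method named in the Statistics subsection. The code below is a minimal illustration with invented scores; it computes the Pearson correlation of tie-averaged ranks and does not compute a P value.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("one variable is constant; correlation is undefined")
    return cov / (sx * sy)


# Hypothetical paired scores on the same eight behaviours (not study data)
cheops = [4, 6, 7, 9, 10, 5, 12, 8]
flacc = [0, 2, 3, 6, 7, 2, 9, 4]
print(round(spearman(cheops, flacc), 3))
```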

Effectiveness of the measures
Discriminative capability
The agreement (κ value) of the four pain scales with the routine decision to treat pain, together with sensitivity and specificity, was assessed on the basis of time (immediately in the PACU and several hours later on the ward) and age group (≤3 and >3 yr). Factors that might affect pain behaviour, such as previous surgery, the presence of the parents and the site of surgery, were also recorded.
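The agreement analysis can be sketched as follows: dichotomize a scale at a candidate cutoff, cross-tabulate against the decision to treat, and compute κ, sensitivity and specificity. This is an illustrative sketch with invented data. It assumes that a score above the cutoff counts as pain, which is our reading of the cutoff definition in the Results, and the interpretation bands follow the Statistics subsection (the handling of values exactly on a band boundary is an assumption).

```python
def cross_tabulate(scores, treated, cutoff):
    """Count (tp, fp, fn, tn) with 'score > cutoff' as the test for pain."""
    tp = fp = fn = tn = 0
    for score, was_treated in zip(scores, treated):
        in_pain = score > cutoff
        if in_pain and was_treated:
            tp += 1
        elif in_pain:
            fp += 1
        elif was_treated:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def cohen_kappa(tp, fp, fn, tn):
    """Chance-corrected agreement for a 2 x 2 table."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    if expected == 1:
        return 0.0  # kappa is undefined here; treated as no agreement beyond chance
    return (observed - expected) / (1 - expected)


def sensitivity_specificity(tp, fp, fn, tn):
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


def best_cutoff(scores, treated, candidates):
    """Candidate cutoff with the highest kappa (first one wins on ties)."""
    return max(candidates, key=lambda c: cohen_kappa(*cross_tabulate(scores, treated, c)))


def interpret_kappa(k):
    if k <= 0.2:
        return "poor"
    if k <= 0.4:
        return "fair"
    if k <= 0.6:
        return "moderate"
    if k <= 0.8:
        return "good"
    return "very good"


# Hypothetical scale scores and treatment decisions (not study data)
scores = [0, 1, 1, 2, 2, 3, 3, 4, 5, 6, 7, 8]
treated = [False, False, False, False, True, True, False, True, True, True, True, True]
cutoff = best_cutoff(scores, treated, range(0, 9))
table = cross_tabulate(scores, treated, cutoff)
k = cohen_kappa(*table)
print(cutoff, round(k, 2), interpret_kappa(k), sensitivity_specificity(*table))
```

Selecting the cutoff by κ alone is a simplification; the study also weighed sensitivity, specificity and the content of the scale items when choosing cutoff points.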

Practicality of measures
After training in the use of the pain scales, 30 nurses rated 10 behaviours from a videotape on four pain scales. The time taken to rate each behaviour on each scale was recorded. The nurses were asked to rank the scales according to the feasibility of their use in clinical situations, ease of use, ability of the scales to help assess pain, and general satisfaction with the scales.

Statistics
The sample size was calculated on the basis of a descriptive study with a variation of 8% and incidence of absence of pain of 25%.12 The formula n = (Z_α)² pq / δ² was used, with α = 0.05, p = 0.25, q = 1 − p and δ = 0.08. The estimated sample size was 113. Demographic data are presented as mean (SD), median (interquartile range) and 95% confidence interval. Content validity was assessed for each item by using item correlation (IC), which is the total score of each item divided by the number of experts. If IC is ≥0.5, the item would be acceptable. Inter-rater and intra-rater reliabilities were analysed by intra-class correlation [R = σ²_subject / (σ²_subject + σ²_observer + σ²_error)]. An intra-class correlation of >0.8 was considered acceptable. As all pain scores were non-parametric data, discriminant validity was determined with the Wilcoxon rank sum test to assess the difference in pain scores before and after surgery, before analgesia. The correlations among CHEOPS, OPS, TPPPS and FLACC were analysed with the Spearman correlation. The agreement of all pain scales at various cutoff points, corresponding to the decision to treat pain in clinical practice, was analysed with the κ statistic. Values of κ were interpreted as follows: <0.2, poor agreement; 0.21–0.4, fair agreement; 0.41–0.6, moderate agreement; 0.61–0.8, good agreement; 0.81–1.0, very good agreement.13 The appropriate cutoff point was that point which yielded the highest κ value, sensitivity and specificity. The differences in pain scores related to previous experience and parental presence were analysed with the Mann–Whitney U-test. The practicality of the scales, such as the time taken to rate the pain scores and the ranking of questionnaires, was analysed with descriptive statistics. All analyses were performed with SPSS for Windows v. 7.0 (SPSS, Chicago, IL, USA).
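The sample size arithmetic can be checked with a few lines of code. The sketch below assumes that Z is the two-sided standard normal quantile for α = 0.05 (about 1.96) and that the result is rounded up to the next whole patient; the function name is ours.

```python
import math
from statistics import NormalDist


def sample_size(p, delta, alpha=0.05):
    """n = Z^2 * p * q / delta^2 for estimating a proportion p within +/- delta."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # assumed two-sided quantile, about 1.96
    q = 1 - p
    return math.ceil(z ** 2 * p * q / delta ** 2)


print(sample_size(p=0.25, delta=0.08))  # 113
```

Under these assumptions the formula gives about 112.5, which rounds up to the reported estimate of 113.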


    Results
 
Of the 167 patients enrolled in the study (age range 1–5.5 yr), who were recovering from most types of surgery (Table 1), 18 had missing PACU data and 28 had missing ward data. The pain severity resulting from these surgical procedures was classified as moderate to severe.14


 
Table 1 Patient characteristics and type of surgery (n=167)
 
The content validation of OPS and FLACC was accepted totally by all experts. Two behaviours in the CHEOPS score—upright behaviour (an alternative in item ‘torso’) and standing behaviour (an alternative in item ‘leg’)—produced disagreement as they were behaviours seen rarely in Thai children. ‘Squint’, an alternative in the item ‘eye’ of the TPPPS, was opposed for the same reason.

The inter-rater and intra-rater reliabilities of the four observers were excellent (Table 2). Construct or discriminant validity clearly demonstrated a significant difference in pain scores before and after surgery, before analgesia. The median pain scores on the wards were lower than in the PACU (Table 3).


 
Table 2 Inter- and intra-rater reliabilities for the pain scores CHEOPS, OPS, TPPPS and FLACC
 

 
Table 3 Construct validity of pain scores before surgery and after surgery, before analgesia in the PACU and on the ward. Wilcoxon rank sum test
 
Concurrent validity was assessed in terms of correlation. All data before analgesia, both in the PACU and on the ward, were analysed for association. The correlation of the four pain scales with each other was moderate to good (r=0.621–0.769, P<0.0001) in the PACU and good (r=0.757–0.827, P<0.0001) on the ward (Table 4).


 
Table 4 Correlations among CHEOPS, OPS, TPPPS and FLACC scores. Spearman correlation, *P<0.0001
 
Corresponding with the clinical decision to treat pain, the cutoff points which yielded the highest agreement for OPS, TPPPS and FLACC were 2, both in the PACU and on the ward. Based on the κ value, the cutoff point of CHEOPS on the ward (5) was lower than that in the PACU (6). However, in terms of pain scale content, patients with OPS, TPPPS and FLACC scores of >2 and, as in other studies,15 16 a CHEOPS score of >6 were considered to have pain. The agreement of CHEOPS with the clinical decision to treat pain was the highest among all scales both in the PACU and on the ward (Table 5).


 
Table 5 Cutoff points which yielded the highest agreement (κ value) with the clinical decision to treat pain
 
According to age group, a cutoff point of 6 for CHEOPS in the PACU yielded very good agreement with the routine decision to treat pain in both age groups (≤3 and >3 yr) but there was only moderate agreement on the ward (Table 6).


 
Table 6 Agreement of CHEOPS with the clinical decision to treat pain categorized by age group
 
The cutoff point associated with the highest κ agreement of all scales in the PACU provided high sensitivity and specificity (>80%). On the ward, a CHEOPS cutoff point of 5 yielded high sensitivity and specificity (>80%), but the discriminative abilities of a CHEOPS cutoff point of 6, OPS, TPPPS and FLACC were lower than this (Table 5). The result was similar in the younger and older groups (Table 6). Neither pain experience from previous surgery nor parental presence seemed to affect the severity of immediate or delayed postoperative pain (Table 7). Regarding the type of surgery, a high level of pain from eye surgery in the PACU was not seen on the ward (Table 8).


 
Table 7 Postoperative pain scores related to previous surgery and parental presence. Mann–Whitney U-test
 

 
Table 8 Postoperative pain scores related to site of surgery
 
In terms of practicality, CHEOPS and FLACC were comparable except for the duration of rating with CHEOPS, which was longer (Table 9).


 
Table 9 Practicality of pain scales
 

    Discussion
 
Differences in culture and coping style may affect the behavioural responses of children. Some behaviours in CHEOPS and TPPPS were not accepted by our observers as they were rarely seen. However, these behaviours were alternatives to several choices in those items and even if they were excluded all items of CHEOPS and TPPPS could be scored properly. Therefore, the content validity of all four pain scales was considered to be appropriate on the basis of content relevance and coverage.

The construct validity of all pain scales was determined by comparing the group experiencing no pain in a baseline situation before surgery with the group experiencing a high level of pain after surgery, before analgesia. The difference in pain scores was clinically and statistically significant. The positive correlations of all scales with each other support concurrent validity. The stability and consistency of all the pain scales was proven by excellent inter-rater and intra-rater reliability (intraclass correlation >0.8).

In terms of discriminative ability, assessment of pain in the PACU by all measures yielded higher agreement with the decision to treat than assessment of pain on the ward. The results were similar in the younger (1–3 yr) and older age groups (>3–5.5 yr). These findings support the study of Beyer and colleagues, who found that CHEOPS scores were generally low several hours after surgery and did not correlate well with self-report measures.10 There are several possible reasons for the expression of fewer pain behaviours in persistent pain. First, emergence delirium or agitation occurs most commonly in preschool-age children.17 Pain combined with emergence agitation might increase the severity of behaviours observed in the immediate postoperative period.18 Secondly, other causes of distressed behaviour cannot be distinguished from pain.19 Children might gradually adapt themselves to this distress and reduce the severity of their pain behaviours as time passes. Thirdly, in the delayed postoperative period when the children are wide awake, older children, in particular, may behave in a socially desirable way or may underexpress their pain in order to avoid unpleasant medication (e.g. injection). Some of them may escalate their pain behaviour to obtain increased adult attention, especially when their parents are present. This study did not find any influence of previous surgery or parental presence on pain scores. Postoperative pain from eye surgery seemed to disappear more quickly than pain after other types of surgery.

Pain assessment and treatment decisions may be influenced by practice settings20 and characteristics of the providers such as age, education and personal pain experience.21 Research has shown that the use of a standardized pain assessment tool results in provider ratings of pain that more closely match the child ratings.22 Cutoff points, which are needed to implement pain scales in clinical practice, were not reported for all tools in the original studies. Our discrimination points might be of benefit in treatment decisions in research and clinical settings. The cutoff points for OPS, TPPPS and FLACC corresponding with the decision to treat were all 2, both in the PACU and on the ward. On the basis of the contents, a score of 3 could represent pain behaviour. A cutoff point of 6 with CHEOPS in the PACU was similar to that found in other studies.15 16 A score of 7 was able to represent pain behaviour. Although a CHEOPS cutoff point of 5 yielded the highest agreement with the decision to treat on the ward, a CHEOPS score of 6, which is obtained by scoring 1 on each of the six items (no cry, composed face, not talking, neutral body position, not touching wound and neutral leg position), might not necessarily represent pain behaviour. Therefore, in this study we decided to use a CHEOPS cutoff point of 6 on the ward. The agreement of CHEOPS with the decision to treat was still higher than that of the other scales.

The behaviour of all pain scales in Thai children appeared to be similar to that in North American children, and the different scores were all highly correlated. Among the four scales, CHEOPS yielded the highest agreement with the decision to treat. Concerning practicality, FLACC and CHEOPS seemed to be similar, except that CHEOPS took about 14 s longer to rate. The time taken to rate may be decreased by further training. On the basis of our findings, we recommend using CHEOPS to assess pain in children aged 1–5 yr, especially in the immediate postoperative period.


    Acknowledgements
 
The authors wish to thank Professor Chitr Sitthi-Amorn for his invaluable suggestions, Queen Sirikit National Institute of Child Health for permission to collect data, and Siriraj Research Development and Educational Fund for financial support.


    References
 
1 Schechter NL, Allen DA, Hanson K. Status of pediatric pain control: a comparison of hospital analgesic usage in children and adults. Pediatrics 1986; 77: 11–5

2 Schechter NL. The undertreatment of pain in children: an overview. Pediatr Clin N Am 1989; 36: 781–94

3 Stevens B. Composite measures of pain. In: Finley GA, McGrath PJ, eds. Measurement of Pain in Infants and Children. Seattle: IASP Press, 1998; 161–78

4 McGrath PJ, Johnston G, Goodman JT, et al. CHEOPS: a behavioral scale for rating postoperative pain in children. In: Fields HL, ed. Advances in Pain Research and Therapy. New York: Raven Press, 1985; 395–402

5 Norden J, Hannallah R, Geston P, et al. Reliability of an objective pain scale in children. J Pain Symptom Manage 1991; 6: 196

6 Wilson GAM, Doyle E. Validation of three paediatric pain scores for use by parents. Anaesthesia 1996; 51: 1005–7

7 Tarbel SE, Cohen IT, March JL. The Toddler–Preschool Postoperative Pain Scale: an observational scale for measuring postoperative pain in children aged 1–5. Preliminary report. Pain 1992; 50: 273–80

8 Merkel SI, Shayefitz JR, Lewis TV, Malwiya S. The FLACC: a behavioral scale for scoring postoperative pain in young children. Pediatr Nurs 1997; 23: 293–7

9 Cousins M. Acute and postoperative pain. In: Wall PD, Melzack R, eds. Textbook of Pain, 3rd edn. Edinburgh: Churchill Livingstone, 1994; 357–85

10 Beyer JE, McGrath PJ, Berde CB. Discordance between self-report and behavioral pain measures in children aged 3–7 years after surgery. J Pain Symptom Manage 1990; 5: 350–6

11 McGrath PJ. Behavioral measures of pain. In: Finley GA, McGrath PJ, eds. Measurement of Pain in Infants and Children. Seattle: IASP Press, 1998; 83–102

12 Mather L, Mackie J. The incidence of postoperative pain in children. Pain 1983; 15: 271–82

13 Altman DG. Some common problems in medical research. In: Altman DG, ed. Practical Statistics for Medical Research. London: Chapman & Hall, 1991; 404–8

14 Karl T, van der Laan K, Crombach J, et al. Postoperative pain management in children has been improved, but can be further optimized. Eur J Pediatr Surg 1996; 6: 259–64

15 Anatol TI, Pitt-Miller P, Holder Y. Trial of three methods of intraoperative bupivacaine analgesia for pain after paediatric groin surgery. Can J Anaesth 1997; 44: 1053–9

16 Huntink-Sloot MT, Faber-Nijholt R, Zwierstra RP, Skalnik-Polackova D, Hennis PJ, Fidler V. Better postoperative pain management in children by introduction of guideline: a prospective study [abstract]. [Dutch.] Ned Tijdschr Geneeskd 1997; 141: 998–1002

17 Aono J, Ueda W, Mamiya K, et al. Greater incidence of delirium during recovery from sevoflurane anesthesia in preschool boys. Anesthesiology 1997; 87: 1289–300

18 Davis PJ, Greenburg JA, Gendelman M, Fertal K. Recovery characteristics of sevoflurane and halothane in preschool-aged children undergoing bilateral myringotomy and pressure equalization tube insertion. Anesth Analg 1999; 88: 34–8

19 McGrath PA. An assessment of children’s pain: a review of behavioural, physiological and direct scaling techniques. Pain 1987; 31: 147–76

20 Burokas L. Factors affecting nurses’ decisions to medicate pediatric and adult patients after surgery. Heart Lung 1985; 14: 373–9

21 Bradshaw C, Zeanah PD. Pediatric nurses’ assessments of pain in children. J Pediatr Nurs 1986; 1: 314–22

22 Colwell C, Clarke L, Perkins R. Postoperative use of pediatric pain scale: children’s self report vs. nurse assessment of pain intensity and affect. J Pediatr Nurs 1996; 11: 275–82