1Department of Anaesthesiology, Julius Maximilians University, Josef-Schneider Strasse 2, D-97080 Würzburg, Germany. 2Department of Anaesthesiology and Intensive Care Medicine, Philipps-University of Marburg, Baldingerstrasse 1, D-35033 Marburg, Germany. *Corresponding author
Accepted for publication: September 25, 2001
Abstract |
---|
Methods. Data were analysed from 1566 patients who underwent balanced anaesthesia without prophylactic antiemetic treatment for various types of surgery. A systematic literature search identified six predictive models for PONV. These models were compared with respect to validity (discriminating power and calibration characteristics) and practicability. Discriminating power was measured by the area under the receiver operating characteristic curve (AUC) and calibration was assessed by weighted linear regression analysis between predicted and actual incidences of PONV. Practicability was assessed according to the number of factors to be considered for the model (the fewer factors the better), and whether the score could be used in combination with a previously applied cost-effective concept.
Results. The incidence of PONV was 600/1566 (38.1%). The discriminating power (AUC) obtained by the models (named according to the first author) using the risk classes from the recommended prophylactic concept was as follows: Apfel, 0.68; Koivuranta, 0.66; Sinclair, 0.66; Palazzo, 0.63; Gan, 0.61; Scholz, 0.61. For four models, the following calibration curves (expressed as the slope and the offset) were plotted: Apfel, y=0.82x+0.01, r²=0.995; Koivuranta, y=1.13x−0.10, r²=0.999; Sinclair, y=0.49x+0.29, r²=0.789; Palazzo, y=0.30x+0.30, r²=0.763. The numbers of parameters to be considered were as follows: Apfel, 4; Koivuranta, 5; Palazzo, 5; Scholz, 9; Sinclair, 12; Gan, 14.
Conclusion. The simplified risk scores provided better discrimination and calibration properties compared with the more complex risk scores. Therefore, simplified risk scores can be recommended for antiemetic strategies in clinical practice as well as for group comparisons in randomized controlled antiemetic trials.
Br J Anaesth 2002; 88: 234–40
Keywords: vomiting, nausea, postoperative; anaesthesia, general; statistics, risk score
Introduction |
---|
In order to identify patients at high risk who would benefit from a cost-effective antiemetic treatment,10 11 several models and scores have been proposed. As one of the risk scores predicting the probability of PONV has been developed by our centre, we decided to compare its validity and practicability with all other published models in patients from a different centre.
Patients and methods |
---|
All patients received balanced anaesthesia with isoflurane, enflurane, sevoflurane or desflurane in nitrous oxide/oxygen (ratio 2:1). Propofol, thiopental or etomidate was used for induction at clinically required doses. None of the patients included in the study received perioperatively (at least 24 h before and after anaesthesia) any drug with potential antiemetic properties. Vecuronium, atracurium or mivacurium was used to facilitate intubation and repeat doses were only given if needed, so that the need for antagonizing neuromuscular blockade (atropine 0.5 mg and pyridostigmine 0.1 mg kg⁻¹) was rare (approximately 3%). The most frequently used anaesthetic technique (60% of cases) consisted of propofol, fentanyl and vecuronium for induction followed by desflurane in nitrous oxide/oxygen for maintenance and additional doses of fentanyl as required.
Nausea and vomiting were assessed in the postanaesthetic care unit by specially instructed nurses or anaesthesiologists. Vomiting or retching was considered as an emetic episode. Nausea was assessed on a four-point verbal rating scale (none, mild, moderate, severe) and patients were interviewed explicitly before transfer to the ward. During the afternoon of the same day (at least 6 h postoperatively) and during the following day (at least 24 h postoperatively), further standardized postoperative visits were performed. A patient was considered to have had PONV if any degree of nausea and/or any emetic episodes occurred in at least one of the periods investigated.
A systematic search of a literature database (http://www.nlm.nih.gov/databases/freemedl.html) revealed 11 studies dealing with risk models and PONV1 2 12–19 when the following search terms were used: postoperative together with (=logical AND) score AND (nausea OR vomiting) AND risk. However, we excluded five studies because they did not present a predictive model or were validations of previous scores1 13 16 18 19 and two further studies from our centre because they dealt only with postoperative vomiting and not with PONV.15 16 Thus, four scores for predicting PONV were identified initially.2 12 14 17 Additional hand searching of abstracts and other sources revealed two additional models,20 21 so that a total of six models/scores were found for the prediction of PONV.
For each patient, four probabilities of PONV were calculated according to the four scores.2 12 14 17 The remaining two models allowed classification of patients only into those with low (less than 30%) and high (more than 30%) risk.20 21 Either probability of PONV or classification into low or high risk was used to create so-called receiver operating characteristic (ROC) curves, as described previously.2
In short, a predictive model provides for an individual a probability of an event between 0 and 1. In practice, a decision is needed about whether this event will occur. Depending on the value of the probability above which an event is expected to occur (decision criterion), the sensitivity and specificity of a predictive model vary. The ROC curve displays the relationship between the sensitivity and the specificity for all possible decision criteria. Therefore, the area under the ROC curve (AUC) is an overall measure of the ability of a risk score/model to discriminate patients with PONV from those without PONV (discriminating power) and is frequently used to compare different risk scores/models.
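The construction described above can be illustrated with a minimal Python sketch. It is not the procedure used in this study (the AUC was calculated with MedCalc); the function names `roc_points` and `auc` are hypothetical, and each distinct predicted probability is assumed to serve as one decision criterion, with straight lines joining neighbouring points.

```python
def roc_points(probs, outcomes):
    """Return (1 - specificity, sensitivity) pairs, one per decision criterion.

    probs    -- predicted probabilities of PONV (0..1), one per patient
    outcomes -- 1 if the patient actually had PONV, 0 otherwise
    """
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    points = [(0.0, 0.0)]  # strictest criterion: nobody is predicted to have PONV
    # Assumption: a patient is predicted to have PONV if the predicted
    # probability is greater than or equal to the decision criterion.
    for threshold in sorted(set(probs), reverse=True):
        tp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 1)
        fp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the straight lines joining consecutive ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0  # trapezoidal rule
    return area
```

An AUC of 0.5 corresponds to a model that cannot discriminate at all, and 1.0 to perfect discrimination; the sketch assumes at least one patient with and one without PONV.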
However, it should be noted that most scores and models yield only a limited number of possible output values, resulting in a limited number of possible decision criteria. As ROC curves are constructed by direct lines between the data pairs of specificity and sensitivity, scores/models with fewer risk classes may result in a smaller AUC. Thus, the AUCs of the different scores/models were compared in three ways.
First, the AUC was calculated from the ROC curve that was given by the original scores/models.
Secondly, the four scores that allowed the calculation of the probability of PONV were adapted to the risk classes (<10%, low risk; 10–30%, mild to moderate risk; 30–60%, high risk; >60%, extremely high risk) recommended by White and Watcha.10 11 This resulted in three risk classes for the scores from Koivuranta and colleagues and Apfel and colleagues and four risk classes for the scores from Palazzo and Evans and from Sinclair and colleagues, so that all scores had a similar number of decision criteria and were directly comparable.

Thirdly, ROC curves with their AUC were calculated for the recommendation of Gan or Scholz and colleagues to apply prophylactic antiemetic treatment if three or more points were obtained20 or if the risk was greater than 30%.21 The latter also applied to the risk scores with more than one decision criterion,2 12 14 17 i.e. the information was collapsed to a dichotomous outcome so that a direct comparison of all scores was possible.
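The mapping of a predicted probability to the recommended risk classes, and its collapse to a dichotomous decision at 30%, can be sketched in Python as follows. The function names are hypothetical, and the handling of values lying exactly on a class boundary (10%, 30% or 60%) is an assumption, because the text does not specify it.

```python
def risk_class(prob):
    """Map a predicted probability of PONV (0..1) to one of four risk classes.

    Assumption: exactly 10% counts as mild to moderate, and exactly 30% or
    60% falls into the lower of the two adjacent classes.
    """
    if prob < 0.10:
        return "low"
    if prob <= 0.30:
        return "mild to moderate"
    if prob <= 0.60:
        return "high"
    return "extremely high"


def prophylaxis_recommended(prob):
    """Dichotomous decision: prophylaxis if the predicted risk exceeds 30%."""
    return prob > 0.30
```

With these boundaries, a predicted probability of 10% falls into the mild to moderate class rather than the low-risk class.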
Differences between AUCs were considered to be statistically significant at P<0.05.
For the four scores for which the probability of PONV could be determined,2 12 14 17 the predicted incidences were compared with the real incidences and weighted linear regression analysis was applied across the recommended classes. This regression analysis resulted in calibration curves for which a slope of 1.00 or 45° with no offset would represent a perfect fit. A deviation from the slope of less than 25% was defined to be clinically acceptable.
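A calibration line of this kind can be sketched in Python with a weighted least-squares fit of actual against predicted incidence. This is an illustration only: the weights are assumed here to be the number of patients in each risk class, which the text does not state, and the function names are hypothetical.

```python
def weighted_linear_fit(predicted, actual, weights):
    """Weighted least-squares line: actual = slope * predicted + offset.

    predicted -- predicted incidence of PONV per risk class (0..1)
    actual    -- observed incidence of PONV per risk class (0..1)
    weights   -- weight per risk class (assumed: number of patients in it)
    """
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, predicted)) / total
    mean_y = sum(w * y for w, y in zip(weights, actual)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, predicted))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, predicted, actual))
    slope = sxy / sxx
    offset = mean_y - slope * mean_x
    return slope, offset


def slope_acceptable(slope, tolerance=0.25):
    """True if the slope deviates from the ideal 1.00 by less than 25%."""
    return abs(slope - 1.0) < tolerance
```

Applied to the slopes reported in the abstract, `slope_acceptable` accepts 0.82 and 1.13 and rejects 0.49 and 0.30.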
The risk calculations were performed using Excel 2000 (Microsoft, USA). Logistic regression analysis was performed with SPSS 9.0 (SPSS, Chicago, IL, USA). The area under the ROC curve was calculated using MedCalc 4.2 (Mariakerke, Belgium).
Results |
---|
Discussion |
---|
Palazzo and Evans were the first to apply logistic regression analysis to patients undergoing minor orthopaedic surgery under well-defined conditions.12 Although they did not originally intend to develop a score with universal applicability for predicting PONV, the first validation with 400 patients suggested that their model could well be transferred to other types of surgery.13 However, in this validation, the calibration curve presented a flattened slope of 0.3 and an offset of 30%. Thus, patients who were classified as having a low risk showed a much higher incidence, while patients who were really at low risk were not identified. The reason for this difficulty may lie in the extremely high odds ratio of 53.0 for a positive history of PONV in the data from Palazzo and Evans, because of the smaller study population,12 while the odds ratio of this risk factor was between 2.27 and 3.13 in all other studies1 2 13 14 17 and 1.71 in this population (Table 1). Although this calibration makes this score less suitable for clinical practice, we would like to point out that this model for the prediction of PONV has stimulated several other investigators to apply this method on a larger scale.
The score from Koivuranta and colleagues appeared to be one of the best models in the population investigated.14 First, they identified all significant factors and calculated the corresponding AUC. Interestingly, the type of operation did not have any significant effect on PONV in their original multivariable model, i.e. when corrections for the other predictors were considered. Secondly, they reduced the number of predictors from eight to five and found no effect on overall discriminating power. Thirdly, they observed that the coefficients of the predictors were all quite similar. Thus, they investigated whether a model which was based only on the simple number of risk factors resulted in a similar AUC compared with the rather complicated logit model, which considers different coefficients for each factor. They found that such a score can be simplified without a significant decrease in the AUC. In our view, simplification is an important characteristic for a score suitable for daily practice (Table 3).
The score of Apfel and colleagues2 is a result of collaboration between a German and a Finnish centre. Reanalysing their previously published data,14–16 the authors investigated whether risk scores were valid for different centres and whether risk scores based on logistic regression coefficients could be simplified without loss of discriminating power. While testing different scores, they developed a final score from a joint data set which consisted of four predictors: female gender, history of motion sickness or PONV, non-smoking status and the use of postoperative opioids. If none, one, two, three or four of these risk factors were present, the incidences of PONV were 10, 21, 39, 61 and 79% respectively. Owing to the definition of a low risk as less than 10%, patients with no risk factor or one risk factor were classified into the mild to moderate risk class. However, for practical purposes it may make more sense to classify patients with no risk factors as low-risk patients. We believe that this will be even more reasonable when the recommendation of White and Watcha is applied.10 11 In the population studied here, the score of Apfel and colleagues gave an acceptable AUC of 0.68 and a slope very close to 1. Similar positive results in favour of the score of Apfel and colleagues2 have now been obtained by Pierre and colleagues.23 In the light of this, and its easy applicability, we suggest that this score and its underlying model provide a tool that can usefully be put into clinical practice (Table 3).
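Because the simplified score only counts risk factors, it can be written as a simple lookup. The following Python sketch uses the four predictors and the incidences quoted above; the function and parameter names are hypothetical and chosen for illustration.

```python
# Incidence of PONV by number of risk factors present, as quoted in the text.
APFEL_INCIDENCE = {0: 0.10, 1: 0.21, 2: 0.39, 3: 0.61, 4: 0.79}


def apfel_score(female, history_motion_sickness_or_ponv,
                non_smoker, postoperative_opioids):
    """Return (number of risk factors, expected incidence of PONV)."""
    factors = (female, history_motion_sickness_or_ponv,
               non_smoker, postoperative_opioids)
    count = sum(1 for present in factors if present)
    return count, APFEL_INCIDENCE[count]


# Example: a non-smoking woman with no history of motion sickness or PONV
# who is not given postoperative opioids has two risk factors.
points, risk = apfel_score(True, False, True, False)
```

No coefficients are needed: every risk factor present adds one point, which is what makes the score easy to apply at the bedside.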
While the other two models are guidelines for the use of prophylactic antiemetics rather than genuine risk scores to estimate the probability of PONV, they were included in this comparison for the sake of completeness.20 21 Unfortunately, it is not clear to what extent the weight attributed to their risk factors is based on multivariable analysis. Moreover, they are more complex than the simplified scores of Apfel and Koivuranta and their colleagues, they do not have better discriminating power and do not provide a calibration curve. In addition, they cannot be used for the cost-effective strategy suggested by White and Watcha (Table 3).10 11 The overall weak performance may indicate that risk models based on personal conviction yield poorer results than risk scores where the evidence is based on conscientious multivariate analyses of thoroughly collected data.
It seems surprising that models with a higher number of predictors were not superior to those with few predictors. The main reason may be that there are some predictors (e.g. female gender) which appear to be fairly constant in every population and are therefore part of every model, while other predictors included in the more complicated models do not seem to have more general applicability. For example, one model considered orthopaedic procedures to be associated with odds ratios between 2.6 and 5.9,17 while they had no significant effect on PONV in this data-set (Table 1). Another reason is that the increase in discriminating power with an additional predictor is relatively high when the first predictors are introduced in a model but decreases when four or five predictors are already present, so that further inclusion of predictors is not justified.24
We have demonstrated that the simplified models appear to be easily applicable and to provide a valid concept to predict the probability of PONV in individuals. However, none of the models is able to predict with certainty which individual will actually suffer from PONV. An experiment with a virtual population has shown that this limitation results from the fact that the odds ratio of the risk factors for PONV is at best in the range of 2–3 (when non-reproduced outliers are not considered).24 Therefore, unless significantly stronger predictors are found which appear to be applicable to most centres, it seems unlikely that further risk scores will lead to significantly better predictions.
Despite the above-mentioned limitations, the simplified risk scores can provide a rational basis for an antiemetic strategy. In addition, they may be used for group comparisons of randomized controlled trials of antiemetic strategies.22 If, for example, several risk factors have a tendency to be more frequent in one group than in the other, although not being statistically significant on their own, this may lead to a significantly different risk, as assessed by a scoring model. Thus, a more cautious interpretation of the results may be needed. On the other hand, when one risk factor is significantly more frequent in one group than in the other, although the overall risk is similar, this finding would be put into perspective. Therefore, we believe that validated scores predicting PONV should be used not only in clinical practice but also in demographic tables for group comparisons in randomized controlled trials.
Acknowledgement |
---|
References |
---|
2 Apfel CC, Läärä E, Koivuranta M, et al. A simplified risk score for predicting postoperative nausea and vomiting: conclusions from cross-validations between two centres. Anesthesiology 1999; 91: 693–700
3 Kovac AL. Prevention and treatment of postoperative nausea and vomiting. Drugs 2000; 59: 213–43
4 Foster PN, Stickle BR, Laurence AS. Akathisia following low-dose droperidol for antiemesis in day-case patients. Anaesthesia 1996; 51: 491–4
5 Scuderi PE, James RL, Harris L, et al. Antiemetic prophylaxis does not improve outcomes after outpatient surgery when compared to symptomatic treatment. Anesthesiology 1999; 90: 360–71
6 Tramèr MR, Reynolds DJ, Moore RA, et al. Efficacy, dose-response, and safety of ondansetron in prevention of postoperative nausea and vomiting: a quantitative systematic review of randomized placebo-controlled trials. Anesthesiology 1997; 87: 1277–89
7 Schumann R, Polaner DM. Massive subcutaneous emphysema and sudden airway compromise after postoperative vomiting. Anesth Analg 1999; 89: 796–7
8 Scuderi PE, James RL, Harris L, et al. Multimodal antiemetic management prevents early postoperative vomiting after outpatient laparoscopy. Anesth Analg 2000; 91: 1408–14
9 Carroll NV, Miederhoff PA, Cox FM, et al. Costs incurred by outpatient surgical centres in managing postoperative nausea and vomiting. J Clin Anesth 1994; 6: 364–9
10 White PF, Watcha MF. Postoperative nausea and vomiting: prophylaxis versus treatment [editorial; comment]. Anesth Analg 1999; 89: 1337–9
11 Watcha MF. The cost-effective management of postoperative nausea and vomiting [editorial; comment]. Anesthesiology 2000; 92: 931–3
12 Palazzo M, Evans R. Logistic regression analysis of fixed patient factors for postoperative sickness: a model for risk assessment. Br J Anaesth 1993; 70: 135–40
13 Toner CC, Broomhead CJ, Littlejohn IH, et al. Prediction of postoperative nausea and vomiting using a logistic regression model. Br J Anaesth 1996; 76: 347–51
14 Koivuranta M, Läärä E, Snåre L, et al. A survey of postoperative nausea and vomiting. Anaesthesia 1997; 52: 443–9
15 Apfel CC, Greim CA, Haubitz I, et al. A risk score to predict the probability of postoperative vomiting in adults. Acta Anaesthesiol Scand 1998; 42: 495–501
16 Apfel CC, Greim CA, Haubitz I, et al. The discriminating power of a risk score for postoperative vomiting in adults undergoing various types of surgery. Acta Anaesthesiol Scand 1998; 42: 502–9
17 Sinclair D, Chung F, Mezei G. Can postoperative nausea and vomiting be predicted? Anesthesiology 1999; 91: 109–18
18 Eberhart LHJ, Seeling W, Staack AM, et al. Validation of a risk score for prediction of vomiting in the postoperative period. Anaesthesist 1999; 48: 607–12
19 Eberhart LH, Hogel J, Seeling W, et al. Evaluation of three risk scores to predict postoperative nausea and vomiting. Acta Anaesthesiol Scand 2000; 44: 480–8
20 Gan TJ. Current controversies in the management of PONV. In: Sugarman M, Gan TJ, Levy JH, Cahalan M, eds. Advances in Anesthesia Research and Patient Management, Vol. 74. Honolulu, Hawaii, 2000; 25
21 Scholz J, Hennes HJ, Bardenheuer HJ, Kretz FJ. Postoperative nausea and vomiting – incidence, prophylaxis and therapy. In: Purschke R, ed. Refresher course: current knowledge for anaesthesiologists. Deutsche Akademie für Anästhesiologische Fortbildung 26. Berlin: Springer Verlag, 2000; 173–81
22 Matson A, Palazzo M. Postoperative nausea and vomiting. In: Adams AP, Cashman JN, eds. Recent Advances in Anaesthesia and Analgesia, Vol. 19. London: Churchill-Livingstone, 1995; 107–26
23 Pierre S, Bernais H, Pouymayou J. A comparison of two risk scores for predicting postoperative nausea and vomiting. Eur J Anaesthesiol 2001; 18: 8 A-28
24 Apfel CC, Kranke P, Greim CA, et al. What can be expected from risk scores for predicting postoperative nausea and vomiting? Br J Anaesth 2001; 86: 822–7