Department of Anaesthesiology, University of Würzburg, Josef-Schneider-Str. 2, D-97080 Würzburg, Germany. *Corresponding author
Accepted for publication: January 2, 2001
Abstract
Br J Anaesth 2001; 86: 822–7
Keywords: vomiting, nausea; risk
Introduction
In the past, high-risk patients were classified intuitively by reference to a history of postoperative nausea and vomiting (PONV) or the type of surgery. Recently, risk scores have provided an objective risk assessment for PONV [9–11]. Several studies have shown that the risk assessments derived from such scores are robust enough to remain valid in other hospitals and under different conditions [11–14]. However, their power to discriminate which individual will suffer from PONV (discriminating power) remains limited. Thus, some centres are starting to develop more complex scores [15], hoping to gain better results by including more predictors. Unfortunately, the development and validation of such scores require a large number of patients not receiving prophylactic antiemetics, which may be ethically questionable if they are at high risk according to current risk scores. Thus, for ethical reasons and to be independent of centre-specific populations, we created virtual populations to explore what can be expected from risk scores for predicting PONV by investigating how the number of risk factors and different study settings may affect their discriminating power.
Methods
Population I was created to verify that this model of a virtual population, when based on parameters taken from a previous study on real patients, yields an area under the receiver operating characteristic curve (AUC) similar to that previously reported from the real world [11]. This was achieved by using the frequencies and odds ratios (ORs) from the previous publication. Then, risk scores were created considering female gender alone, then successively adding non-smoking status, history of motion sickness or PONV (MSPONVhist), postoperative opioids, and the interaction of male gender and MSPONVhist (Tables 1 and 2). Their discriminating power was measured by the AUC.
Finally, for populations III and IV, risk scores considering 1–7 idealized predictors (frequency 50% and OR 2 and 3, respectively) were applied to approximate what can at best be expected from an n-predictor model (Tables 1 and 2). In summary, four populations with different characteristics were created (Tables 1 and 2).
Population I: Incidence of PONV, frequency, and OR of the predictors were taken from a previous study based on real patients [11].
Population II: Incidence, frequency, and OR were modified to represent a gynaecological setting.
Population III: Incidence of PONV=50%, frequency of risk factors=50%, OR=2.0.
Population IV: Incidence and frequency as in population III, but OR=3.0.
The characteristics used to create the populations were derived from the literature. A systematic review of studies investigating predictors using logistic regression models revealed two publications analysing postoperative vomiting (POV) [13, 17] and six analysing PONV [9–11, 15, 16, 18]. One study from a pharmaceutical company was excluded [18] because individual predictors (e.g. the patient's history of PONV) were not appropriately considered. A second study was not considered [15] for reasons described in a joint letter from the UK, Finland, and Germany [19]. The remaining studies with more than 1000 patients [10, 11, 13, 16, 17] reported ORs for female gender of about 3, while the ORs of the other predictors were about 2 or less. Thus, the first assumption was that the ORs of clinically relevant predictors for PONV are at best in the range of 2–3.
While some scores allow exact calculation of the probability using the coefficients derived from the logistic models [9, 13, 17], two recent studies provided evidence that a simplification that just counts the number of binary predictors present does not significantly impair the discriminating power and still provides an appropriate estimate of the individual's risk [10, 11]. This led to the second assumption: that considering binary variables does not significantly alter the results.
Palazzo and Evans found an interaction between gender and history of PONV [9]. This was also found in a cross-validation between two centres, but detailed analysis revealed this effect to be negligible [11]. Because the other studies [10, 13] did not find any other relevant interactions, the third assumption was that interactions (covariation) between risk factors for PONV are negligible.
The discriminating power of a score was measured by the area under the receiver operating characteristic (ROC) curve (AUC) as previously described [10, 11, 13–15, 17]. When the average risk of PONV is 25% and the score results in a probability of 25% for almost every patient, it is impossible to discriminate the individuals in that population who will suffer from PONV from those who will not. If, in contrast, 75% of the patients are predicted to have a relatively low risk while the remaining 25% are predicted to have a relatively high risk, then the score may predict PONV correctly in most individuals, that is, with acceptable discriminating power. Details on the calculation and interpretation of the discriminating power, which considers the relationship between sensitivity and specificity in the ROC curve, are described elsewhere [20, 21].
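The two scenarios in the preceding paragraph can be illustrated with a minimal sketch that computes the AUC from grouped data as the probability that a randomly chosen patient with PONV has a higher score than a randomly chosen patient without PONV, with ties counted as one half. The function name `auc_from_groups` and the patient counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
from collections import defaultdict


def auc_from_groups(groups):
    """AUC from grouped data.

    groups: iterable of (score, n_with_ponv, n_without_ponv).
    The AUC is the probability that a patient with PONV has a higher
    score than a patient without PONV; tied scores count as 0.5.
    """
    pos_by_score = defaultdict(float)
    neg_by_score = defaultdict(float)
    for score, pos, neg in groups:
        pos_by_score[score] += pos
        neg_by_score[score] += neg

    total_pos = sum(pos_by_score.values())
    total_neg = sum(neg_by_score.values())
    if total_pos == 0 or total_neg == 0:
        raise ValueError("AUC is undefined unless both outcomes occur")

    wins = 0.0       # weighted count of correctly ordered pairs
    neg_below = 0.0  # patients without PONV at strictly lower scores
    for score in sorted(set(pos_by_score) | set(neg_by_score)):
        pos, neg = pos_by_score[score], neg_by_score[score]
        wins += pos * (neg_below + 0.5 * neg)
        neg_below += neg
    return wins / (total_pos * total_neg)


# Hypothetical scenario 1: every patient gets the same score (risk 25%).
uniform = [(1, 250, 750)]
# Hypothetical scenario 2: 75% of patients at low risk (10%),
# 25% at high risk (70%); overall incidence is still 25%.
split = [(0, 75, 675), (1, 175, 75)]

print(auc_from_groups(uniform))  # 0.5, no discrimination
print(auc_from_groups(split))    # 0.8
```

Both hypothetical populations have the same overall incidence of 25%; only the spread of predicted risk across patients differs, and that spread is what the AUC measures.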
For practical reasons, an increase in the AUC of 0.025 per additional risk factor was considered clinically relevant. This is roughly associated with a 5% higher sensitivity at a specificity of 50%. As this modelling is based on virtual populations, confidence intervals and P-values are not appropriate.
Results
The areas under the ROC curves for populations III and IV, when 0–7 risk factors are considered, are depicted in Figure 3. Again, the improvement in discriminating power is most obvious when the first predictor is introduced and decreases with every additional factor. The data demonstrate that four predictors are clearly superior to a single-predictor model (AUC difference 0.12), but the increase in the AUC with any further risk factor is less than 0.025.
Discussion
Population I, created from the parameters of a previously published population, yielded a discriminating power, expressed as the area under the receiver operating characteristic curve (AUC), of 0.72 when four predictors were considered. This is within the range of 0.683–0.746 found when a score was applied to its original centre [11]. As the discriminating power of the risk scores is comparable in the virtual and the real population, we conclude that this model is a good representation of reality. Although the characteristics used to create the virtual population were identical to those of the real population, other underlying characteristics were of course not considered. If, for example, a relatively strong but as yet undiscovered predictor were unevenly distributed in the real but not in the virtual population, the AUC might have been lower in the real than in the virtual population. Thus, it was reassuring that the AUC in the virtual population was in the expected range. Further, female gender as the sole predictor already had an AUC of 0.64, and this improved to 0.72 when three additional predictors were considered. However, any further predictor, such as an interaction between male gender and MSPONVhist, did not lead to a clinically relevant improvement (AUC increase <0.025). For clinical purposes, a score needs to be easily applicable; as adding further risk factors may complicate calculations for little improvement in risk assessment, the introduction of a further predictor needs to be justified.
To investigate the potential impact of a different study setting on the discriminating power of a score, we constructed the most extreme deviation from the original, a gynaecological setting, which eliminates the benefit of the strongest predictor (population II). The discriminating power with the remaining three predictors was 0.65. This lower discrimination was to be expected. It is noteworthy that in a very homogeneous population, for example one in which all patients are female non-smokers with MSPONVhist undergoing procedures that will most likely require postoperative opioids, a score based on those predictors will probably calculate a risk of 79% for every patient [11]. As all patients will have the same risk, the score will not be able to discriminate between expected vomiters and non-vomiters, that is, the AUC will be 0.5. This does not necessarily mean that the score is useless in that setting, as it may still indicate that all these patients have a very high risk of PONV, which could justify the use of prophylactic antiemetics. This example demonstrates that the discriminating power of a scoring system may be affected by the investigated population and that the calibration curve is another important descriptor, which cannot be analysed with this model. In this respect, some validation studies may be needed to provide acceptable calibration curves in other centres. Interestingly, the risk models that have been validated in other centres [10, 11] were independent of the type of operation, as its relative impact was negligible.
To the best of our knowledge, there seem to be only two valid studies in which the type of operation led to statistically significant ORs in multivariate analyses [13, 16]. However, their results are controversial, and a score considering the operation did not lead to a better prediction than the previously described operation-independent model [13]. Although there is little evidence supporting the widespread notion that the type of surgery is a strong risk factor, we tested this hypothesis. Assuming that 25% of gynaecological patients undergo laparoscopic surgery with an OR of 2.3 [16], the AUC improved by about 0.02 to 0.67, which does not appear to be clinically relevant.
The best results can be expected if the frequency of PONV is 50% and the frequencies of the predictors are also 50%. As pointed out, the overall OR can at best be assumed to be in the range of 2–3 [10, 11, 16, 17]. These assumptions were applied in virtual populations III and IV and resulted in AUCs ranging from 0.72 to 0.8, which supports the previously established models reporting AUCs between 0.71 [10] and 0.78 [17] in real populations. Again, including more than four or five risk factors does not lead to a clinically relevant increase in the discriminating power in the population models, in line with Koivuranta and coworkers, who reported an AUC of 0.721 for an eight-factor and 0.710 for a five-factor model.
Creating a virtual population led to results practically identical to those reported from real populations. The discriminating power of risk scores for predicting PONV can at best be expected to be in the range of 0.7–0.8, which means that the discrimination of individuals who will suffer from PONV from those who will not is still imperfect and will not improve significantly with further predictors. Unless there is consistent evidence that other predictors with a much stronger impact exist, it is unlikely that future risk scores will provide a significantly better prediction of PONV. Thus, for the time being, it may be ethically questionable to develop new risk scores based on a large number of patients known to be at high risk who would be deprived of an effective prophylactic antiemetic strategy.
Appendix
The conditioned frequency of a combination was calculated as the product of the single frequencies of the predictors. This value was multiplied by 100 and the integer part was taken, giving the number of patients with that combination, which is proportional to the share of that combination in the population.
The conditioned probability of PONV, P(PONV), for that combination was calculated from the sum of the coefficients of the predictors present, which was submitted to a logistic (inverse logit) transformation: $P(\mathrm{PONV}) = 1/(1 + e^{-(\text{sum of coefficients})})$.
The number of patients with PONV was derived from the number of patients in that combination multiplied by the conditioned probability.
The patients in all combinations were added (=total number of patients=population). All patients with PONV were added and divided by the total number of patients (=incidence of PONV). A constant was fitted into the calculation of the conditioned probability of PONV until the target incidence of PONV was reached (regression process).
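The steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name `build_virtual_population` and the `size` parameter are hypothetical, the coefficient of each predictor is assumed to be the natural logarithm of its OR, patient counts are kept as fractions instead of being truncated to integers, and the constant is fitted by simple bisection because the text does not specify the fitting procedure.

```python
import math
from itertools import product


def build_virtual_population(frequencies, odds_ratios, target_incidence,
                             size=100_000):
    """Return (constant, rows) for a virtual population.

    Each row is (combination, n_patients, n_with_ponv), where
    combination is a tuple of 0/1 flags, one per binary predictor.
    """
    # Assumption: logistic coefficient of a predictor = ln(OR).
    coefs = [math.log(o) for o in odds_ratios]

    combos = []
    for combo in product((0, 1), repeat=len(frequencies)):
        # Frequency of a combination = product of single frequencies.
        share = 1.0
        for present, f in zip(combo, frequencies):
            share *= f if present else 1.0 - f
        linear = sum(c for present, c in zip(combo, coefs) if present)
        combos.append((combo, share * size, linear))

    def incidence(constant):
        # Expected PONV cases summed over all combinations / population.
        cases = sum(n / (1.0 + math.exp(-(constant + lin)))
                    for _, n, lin in combos)
        return cases / size

    # Fit the constant by bisection: incidence rises with the constant.
    lo, hi = -20.0, 20.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if incidence(mid) < target_incidence:
            lo = mid
        else:
            hi = mid
    constant = (lo + hi) / 2.0

    rows = [(combo, n, n / (1.0 + math.exp(-(constant + lin))))
            for combo, n, lin in combos]
    return constant, rows


# Idealized setting as in population III, here with four predictors:
# frequency 50%, OR 2.0, target incidence of PONV 50%.
constant, rows = build_virtual_population([0.5] * 4, [2.0] * 4, 0.5)
total = sum(n for _, n, _ in rows)
cases = sum(p for _, _, p in rows)
print(f"constant={constant:.3f}, incidence={cases / total:.3f}")
```

Grouping the rows by the number of predictors present gives, for each score value, the expected number of patients with and without PONV, from which an AUC can be computed.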
References
2 Rowbotham DJ. Current management of postoperative nausea and vomiting. Br J Anaesth 1992; 69: 46S–59S
3 Tramèr MR, Moore RA, Reynolds DJ, et al. A quantitative systematic review of ondansetron in treatment of established postoperative nausea and vomiting. BMJ 1997; 314: 1088–92
4 Sneyd JR, Carr A, Byrom WD, et al. A meta-analysis of nausea and vomiting following maintenance of anaesthesia with propofol or inhalational agents. Eur J Anaesthesiol 1998; 15: 433–45
5 Figueredo ED, Canosa LG. Prophylactic ondansetron for postoperative emesis. Meta-analysis of its effectiveness in patients with previous history of postoperative nausea and vomiting. Acta Anaesthesiol Scand 1999; 43: 637–44
6 Scuderi PE, James RL, Harris L, et al. Antiemetic prophylaxis does not improve outcomes after outpatient surgery when compared to symptomatic treatment. Anesthesiology 1999; 90: 360–71
7 White PF, Watcha MF. Postoperative nausea and vomiting: prophylaxis versus treatment [editorial]. Anesth Analg 1999; 89: 1337–9
8 Watcha MF. The cost-effective management of postoperative nausea and vomiting [editorial]. Anesthesiology 2000; 92: 931–3
9 Palazzo M, Evans R. Logistic regression analysis of fixed patient factors for postoperative sickness: a model for risk assessment. Br J Anaesth 1993; 70: 135–40
10 Koivuranta M, Läärä E, Snare L, et al. A survey of postoperative nausea and vomiting. Anaesthesia 1997; 52: 443–9
11 Apfel CC, Läärä E, Koivuranta M, et al. A simplified risk score for predicting postoperative nausea and vomiting: conclusions from cross-validations between two centers. Anesthesiology 1999; 91: 693–700
12 Toner CC, Broomhead CJ, Littlejohn IH, et al. Prediction of postoperative nausea and vomiting using a logistic regression model. Br J Anaesth 1996; 76: 347–51
13 Apfel CC, Greim CA, Haubitz I, et al. The discriminating power of a risk score for postoperative vomiting in adults undergoing various types of surgery. Acta Anaesthesiol Scand 1998; 42: 502–9
14 Eberhart LH, Hogel J, Seeling W, et al. Evaluation of three risk scores to predict postoperative nausea and vomiting. Acta Anaesthesiol Scand 2000; 44: 480–8
15 Sinclair D, Chung F, Mezei G. Can postoperative nausea and vomiting be predicted? Anesthesiology 1999; 91: 109–18
16 Cohen MM, Duncan PG, DeBoer DP, et al. The postoperative interview: assessing risk factors for nausea and vomiting. Anesth Analg 1994; 78: 7–16
17 Apfel CC, Greim CA, Haubitz I, et al. A risk score to predict the probability of postoperative vomiting in adults. Acta Anaesthesiol Scand 1998; 42: 495–501
18 Haigh CG, Kaplan LA, Durham JM, et al. Nausea and vomiting after gynaecological surgery: a meta-analysis of factors affecting their incidence. Br J Anaesth 1993; 71: 517–22
19 Apfel CC, Palazzo M, Koivuranta M, et al. Models for predicting PONV are well established. Data acquisition and analysis bias of a new model may not add to current knowledge. Anesthesiology 2000; 92: 1489–92
20 Hanley JA, McNeil BJ. The meaning and use of the area under a ROC curve. Radiology 1982; 143: 29–36
21 Hanley JA, McNeil BJ. A method of comparing the area under receiver operating characteristic curves derived from the same cases. Radiology 1983; 148: 839–43
22 Matson A, Palazzo M. Postoperative nausea and vomiting. In: Adams AP and Cashman JN (eds). Recent Advances in Anaesthesia and Analgesia, Vol. 19. London: Churchill-Livingstone, 1995; 107–126