Department of Anaesthesia, Intensive Care and Pain Medicine, St Vincent's University Hospital, Dublin, Ireland
*Corresponding author: Department of Anaesthesia, Intensive Care and Pain Medicine, St Vincent's University Hospital, Elm Park, Dublin 4, Ireland. E-mail: l.mcnicholas@st-vincents.ie
This work was presented in part at the National Scientific Meeting of the Royal College of Physicians in Ireland, Dublin, March 1998, and at the International Anesthesia Research Society, Honolulu, Hawaii, March 2000.
Accepted for publication: August 30, 2002
Abstract |
---|
Methods. In 40 previously reported patients, neuromuscular function, use of neuromuscular blocking and antagonist drugs, and time intervals were recorded throughout anaesthesia until tracheal extubation by an observer uninvolved in patient care. Postoperative residual curarization (PORC) was defined as significant fade (train-of-four ratio <0.7) at extubation. Neuromuscular function was classified as PORC (value=1) or no PORC (value=0). A back-propagation neural network was trained to assign similar values (0, 1) for prediction of PORC from two inputs, (i) the degree of spontaneous recovery at reversal, and (ii) the time since pharmacological reversal, and was tested using the jackknife method. Successful prediction was defined as attainment of a predicted value within 0.2 of the target value.
Results. Twenty-six patients (65%) had PORC at tracheal extubation. Clinical detection of PORC had a sensitivity of 0 and specificity of 1, with an indeterminate positive predictive value and a negative predictive value of 0.35. Using the artificial neural network, one patient with residual block and one with adequate neuromuscular function were incorrectly classified during the test phase, with no indeterminate predictions, giving an artificial neural network sensitivity of 0.96 (χ²=44, P<0.001) and specificity of 0.92 (P=1), with a positive predictive value of 0.96 and a negative predictive value of 0.93 (χ²=12, P<0.001).
Conclusions. Neural network-based prediction, using readily available clinical measurements, is significantly better than human judgement in predicting recovery of neuromuscular function.
Br J Anaesth 2003; 90: 48–52
Keywords: measurement techniques, train-of-four; model, artificial neural network; neuromuscular block, atracurium
Introduction |
---|
Several variables predict successful reversal of neuromuscular block.10 However, the failure rates observed suggest that these are imperfect and insensitive in practice. Artificial neural networks consist of computer software designed to mimic multiple inputs and non-linear interactions, and are being increasingly used to examine complex data.11 We tested the hypothesis that a neural network-based analysis would enhance the prediction of PORC when compared with human decision-making.
Patients and methods |
---|
To construct the predictive model, a feed-forward, back-propagation neural network model was developed using a commercial neural network software package (Neuralyst 1.40; Cheshire Engineering, Cheshire, CA, USA), integrated with a spreadsheet programme (Microsoft Excel 98). We used a training algorithm with three layers (an input layer, an output layer and a hidden layer containing four nodes; Fig. 1). The nodes in artificial neural networks are abstract entities which correspond to the weighting and summation of input data and its transformation into output values. A neural network is an algorithm consisting of a series of parallel mathematical equations in which variables from the external world are assigned internal weightings by each equation. Summation of these equations results in output values which are then used as variables for one or more series of new equations (one or more intermediate, hidden layers), and the summed, weighted output from these equations is projected to an output layer as a non-linear function of input. The internal weightings for each equation in the network are set at random values initially and are altered during training as the network-derived, predicted outputs are compared with actual outputs (back-propagation of error). Training consists of multiple empirical modifications of the weightings to minimize the difference between predicted and actual output values, allowing a form of pattern recognition to occur.
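The description above can be made concrete with a minimal sketch in Python. It is not the Neuralyst implementation used in the study: the sigmoid activation, learning rate, number of epochs, input scaling and the four toy cases are all assumptions chosen for illustration. Only the layout (two inputs, one hidden layer of four nodes, one output) and the idea of random starting weights adjusted by back-propagation of error come from the text.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyNetwork:
    """2 inputs -> 4 hidden nodes -> 1 output, all sigmoid units."""

    def __init__(self, n_in=2, n_hidden=4, seed=0):
        rng = random.Random(seed)
        # Weights start at random values; the last entry of each row is a bias.
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]

    def forward(self, x):
        # Weighted sum of inputs at each hidden node, then at the output node.
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0])))
                  for row in self.w_hidden]
        out = sigmoid(sum(w * v for w, v in zip(self.w_out, hidden + [1.0])))
        return hidden, out

    def predict(self, x):
        return self.forward(list(x))[1]

    def train_one(self, x, target, rate):
        x = list(x)
        hidden, out = self.forward(x)
        # Back-propagation of error: output delta first, then hidden deltas.
        d_out = (out - target) * out * (1.0 - out)
        d_hidden = [d_out * self.w_out[j] * h * (1.0 - h)
                    for j, h in enumerate(hidden)]
        for j, v in enumerate(hidden + [1.0]):
            self.w_out[j] -= rate * d_out * v
        for j, row in enumerate(self.w_hidden):
            for i, v in enumerate(x + [1.0]):
                row[i] -= rate * d_hidden[j] * v


def train(net, data, epochs=2000, rate=0.5):
    for _ in range(epochs):
        for x, target in data:
            net.train_one(x, target, rate)


# Hypothetical, made-up cases: (recovery at reversal, time since reversal),
# both scaled to 0..1, with target 1 = PORC and 0 = no PORC.
toy_data = [([0.1, 0.1], 1), ([0.2, 0.2], 1), ([0.8, 0.9], 0), ([0.9, 0.7], 0)]
net = TinyNetwork()
train(net, toy_data)
```

After training, `net.predict([0.1, 0.1])` returns a value near 1 and `net.predict([0.9, 0.8])` a value near 0 on this toy data; real inputs would need the same scaling that was applied during training.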
For diagnostic and predictive purposes, positive was defined as the presence of residual neuromuscular block, and negative was defined as the absence of residual neuromuscular block. Neural network sensitivity and specificity data were derived by sequentially using each of the 40 patients as the test case. Human and neural network performances were compared using χ² analysis or Fisher's exact probability test as appropriate.
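With these definitions, the four diagnostic indices follow directly from the counts of correct and incorrect classifications. The short Python sketch below is illustrative only; the function name is hypothetical, and the counts are those given in the abstract (26 patients with PORC, 14 without, none detected clinically, and one patient in each group misclassified by the network).

```python
def diagnostic_indices(tp, fn, tn, fp):
    """Return sensitivity, specificity, PPV and NPV; None where undefined."""
    def ratio(num, den):
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Clinical judgement: none of the 26 PORC cases detected, no false alarms.
clinical = diagnostic_indices(tp=0, fn=26, tn=14, fp=0)
# Network: one PORC patient and one non-PORC patient misclassified.
network = diagnostic_indices(tp=25, fn=1, tn=13, fp=1)
```

The positive predictive value for clinical judgement comes out as `None` because no positive diagnosis was made, which corresponds to the indeterminate value in the abstract. With these counts the network's specificity and negative predictive value are both 13/14 (about 0.929).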
Results |
---|
Discussion |
---|
Increasing processor speeds and computer power have moved neural networks from an esoteric topic to one where they offer practical uses in a wide range of fields. At their most basic they can be used as cheap, small programs (essentially macro add-ins to commercial spreadsheets). Successful medical uses of neural networks include their use in pharmacokinetic/pharmacodynamic prediction (antibiotic peak and trough levels)13 and enhancement of diagnostic skills. In the clinical arena, the emergency room diagnosis of myocardial infarction14 and radiologists' diagnosis of pulmonary embolism15 have been studied with favourable results. A neural network for detecting oesophageal intubation has been described.16 More recently, neural network-based prognostication in critical care has been shown to be superior to a conventional statistical approach.17 Enhanced performance by neural networks relative to humans is related to (i) more appropriate weighting given by the network to common diagnostic factors, and (ii) empirical use by networks of diagnostic rules of thumb not previously utilized. In this study, the diagnostic variables used by the software were deliberately limited to two of the commonest ones used in clinical practice, instead of allowing all possible information to be used. Although one patient with residual block was consistently misclassified by the network, less common predictors of incomplete reversal are beyond the scope of this study. Our findings suggest that their assessment would require much larger patient numbers. For example, assuming a successful prediction rate of 90% by the network using only the two variables employed in this study, in order to have an 80% chance of detecting a further improvement in performance to, say, 95%, we would have needed about 240 patients.
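The figure of about 240 patients can be reproduced under one plausible set of assumptions. The text does not state which formula was used, so the sketch below assumes a one-sample comparison of proportions using the normal approximation, with a two-sided significance level of 0.05; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def one_sample_n(p0, p1, alpha=0.05, power=0.80):
    """Patients needed to detect a change in success rate from p0 to p1."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_b = NormalDist().inv_cdf(power)          # desired power
    num = z_a * math.sqrt(p0 * (1 - p0)) + z_b * math.sqrt(p1 * (1 - p1))
    return math.ceil((num / (p1 - p0)) ** 2)


# Baseline success rate 90%, improved rate 95%, 80% power.
n_needed = one_sample_n(0.90, 0.95)
```

Under these assumptions `n_needed` is close to 240. A two-group comparison of the same proportions would call for a considerably larger sample, so the one-sample form appears to be the closer match to the estimate in the text.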
Neural networks have several limitations. A major theoretical concern is the black box nature of their output, i.e. conclusions are generated without explanations. This has led to claims that they lack rigour, and a reluctance to accept their results. This criticism implies that conclusions should follow from hypotheses supported by data and rejects the role of pattern recognition in decision making. However, the success of neural networks in medical decision-making, including a performance equal to that of radiologists in diagnosis of pulmonary embolism,15 and superior to that of emergency physicians in the diagnosis of myocardial infarction,14 suggests that they are competent at the very least. Black box concerns may be overcome by sensitivity analyses, where the effects of variables are assessed by their inclusion or exclusion. Our a priori choice of a small number of predictor variables in this study also addresses this criticism.
Neural networks may be difficult to validate in practice. If settings, especially learning rate, are not chosen carefully during the training phase, overlearning may occur. In overlearning, the network memorizes the data points and gives correct answers without recognizing patterns at all. As in this case, the problem is addressed by using low values for learning rate at the expense of decreased speed, and/or using separate training and test sets. The jackknife technique used during this study successfully prevented overlearning, as demonstrated by similar performances for the training and test phases, without reducing success rates. The sensitivity and specificity of any diagnostic or predictive technique will vary as a function of the cut-off (tolerance) points used to determine agreement. The success rate seen with tolerance values of 0.2 indicates that, in 95% of cases, the network correctly predicted adequate or inadequate return of neuromuscular function.
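The jackknife procedure and the tolerance rule can be sketched as follows, in Python. This is an illustration under stated assumptions: the treatment of a prediction exactly 0.2 from the target as successful, the labelling of a prediction within 0.2 of the opposite value as incorrect and of anything in between as indeterminate, the nearest-neighbour stand-in predictor and the toy cases are not taken from the study.

```python
def jackknife_splits(cases):
    """Yield (training_cases, test_case), leaving out one case at a time."""
    for i, held_out in enumerate(cases):
        yield cases[:i] + cases[i + 1:], held_out


def score(predicted, target, tolerance=0.2):
    """Label one prediction against a 0/1 target using the tolerance band."""
    if abs(predicted - target) <= tolerance:
        return "correct"
    if abs(predicted - (1 - target)) <= tolerance:
        return "incorrect"
    return "indeterminate"


def nearest_neighbour(training, x):
    """Stand-in predictor (NOT the study's network): copy the closest case."""
    closest = min(
        training,
        key=lambda case: sum((a - b) ** 2 for a, b in zip(case[0], x)),
    )
    return float(closest[1])


# Hypothetical, made-up cases: ((recovery, time since reversal), target).
toy_cases = [((0.1, 0.1), 1), ((0.2, 0.3), 1), ((0.3, 0.2), 1),
             ((0.8, 0.9), 0), ((0.9, 0.7), 0), ((0.7, 0.8), 0)]

# Each case is predicted by a model that never saw it during training.
results = [score(nearest_neighbour(train_set, x), y)
           for train_set, (x, y) in jackknife_splits(toy_cases)]
```

Because every case serves once as the test case, the proportion of `"correct"` labels in `results` is an estimate of performance on unseen patients, which is why similar training and test performance argues against overlearning.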
The high success rate with this model may be partly related to an unusually high incidence of undetected residual block in the training data set. The presumed failure to make a clinical diagnosis of PORC in any patient limited our ability to make a comparison of indices such as sensitivity and positive predictive value; however, the negative predictive values suggest that, compared with clinical care providers, the artificial neural network was much more effective in predicting adequate recovery of neuromuscular function. Even board-certified anaesthetists may fail to diagnose residual relaxant effects, as evidenced by a 25% requirement for unscheduled antagonism in adults receiving mivacurium without pharmacological antagonism in a clinical trial setting.18 It is possible that trainees have even less capacity to assess the speed and degree of pharmacological antagonism and that this could be improved significantly. The pilot nature of these data and the fact that they provide a single point estimate of the prevalence of PORC in a small number of patients make it premature to suggest that neural networks be used for real-time decision-making in the antagonism of neuromuscular block.
Neural networks are limited by the quality of their data and may need to be retrained periodically if performance changes over time, but this may be less applicable in the assessment of drug pharmacokinetics and pharmacodynamics. In predicting peak and trough antibiotic levels, network-based prediction has already been shown to be as good as or better than standard population-based pharmacokinetic modelling.13 For optimal assessment of network-based prediction, further validation would be required in the form of a prospective comparison with clinical practice and conventional statistical analysis. Also, higher train-of-four values (of the order of 0.9) are required to ensure full airway competence19 and minimize subjective discomfort.2 Because such data are not clinically detectable with routine monitoring, neural network techniques may play a role in enhancing prediction of greater degrees of neuromuscular recovery. This area merits further study.
We deliberately chose simple variables as predictors of residual block. Other data collected, such as absolute values for twitch height depression, were excluded from analysis, because they are not routinely measured in practice. In the present study, the predictive ability of the neural network was so strong using two simple indices that incorporating extra variables could not have enhanced performance with this sample size. This suggests that most of the variation seen in neuromuscular antagonism is attributable to these two factors. The difference in negative predictive values obtained using the two methods suggests that routinely collected variables are adequate in most patients for assessment of neuromuscular block when emphasis is placed on specificity (i.e. priority is given to exclusion rather than detection of PORC). Although our data are limited to the assessment of a single clinical decision (i.e. whether to extubate) in a small series of patients, our findings suggest that human factors are a major component in the erroneous diagnosis of adequate neuromuscular antagonism.
In conclusion, using simple, routinely measured variables, a neural network-based prediction is useful in estimating the likelihood of clinically significant residual neuromuscular block at the time of tracheal extubation, with a significant performance improvement over that of the anaesthetic trainees from whose practice the data were derived. Other predictive problems in anaesthesia where low sensitivity or non-linear interactions are a limiting factor may be amenable to neural network-based analysis.
Acknowledgement |
---|
References |
---|
2 Kopman AF, Yee PS, Neuman GG. Relationship of the train-of-four fade ratio to clinical signs and symptoms of residual paralysis in awake volunteers. Anesthesiology 1997; 86: 765–71
3 Shorten GD, Merk H, Sieber T. Perioperative train-of-four monitoring and residual curarization. Can J Anaesth 1995; 42: 711–15
4 Viby-Mogensen J, Jensen NH, Olsen NV, et al. Tactile and visual evaluation of the response to train-of-four nerve stimulation. Anesthesiology 1985; 63: 440–3
5 Brull SJ, Silverman DG. Real time versus slow-motion train-of-four monitoring: a theory to explain the inaccuracy of visual assessment. Anesth Analg 1995; 80: 548–51
6 Gaba DM, Howard SK, Jump B. Production pressure in the work environment: California anesthesiologists' attitudes and experiences. Anesthesiology 1994; 81: 488–500
7 Brand JB, Cullen DJ, Wilson NE, Ali HH. Spontaneous recovery from nondepolarizing neuromuscular blockade: correlation between clinical and evoked responses. Anesth Analg 1977; 56: 55–9
8 Harper NN, Martlew R, Strang T, Wallace M. Monitoring neuromuscular block by acceleromyography: comparison of the Mini-Accelograph with the Myograph 2000. Br J Anaesth 1994; 72: 411–14
9 Loan PB, Paxton LD, Mirakhur RK, Connolly FM, McCoy EP. The TOF-Guard neuromuscular monitor: a comparison with the Myograph 2000. Anaesthesia 1995; 50: 699–702
10 Bevan DR, Donati F, Kopman AF. Reversal of neuromuscular blockade. Anesthesiology 1992; 77: 785–805
11 Cross SS, Harrison RF, Kennedy RL. Introduction to neural networks. Lancet 1995; 346: 1075–9
12 McCaul C, Tobin É, Boylan JF, McShane AJ. Atracurium is still associated with postoperative residual curarization. Br J Anaesth 2002; 89: 766–9
13 Brier ME, Zurada JM, Aronoff GR. Neural network predicted peak and trough gentamicin concentrations. Pharm Res 1995; 12: 406–12
14 Baxt WG. Use of an artificial neural network for the diagnosis of myocardial infarction. Ann Intern Med 1991; 115: 843–8
15 Patil S, Henry JW, Rubenfire M, Stein PD. Neural network in the clinical diagnosis of acute pulmonary embolism. Chest 1993; 104: 1685–9
16 León MA, Räsänen J, Mangar D. Neural network-based detection of esophageal intubation. Anesth Analg 1994; 78: 548–53
17 Dybowski R, Weller P, Chang R, Gant V. Prediction of outcome in critically ill patients using artificial neural network synthesised by genetic algorithm. Lancet 1996; 347: 1146–50
18 Bevan DR, Kahwaji R, Ansermino JM, et al. Residual block after mivacurium with or without edrophonium reversal in adults and children. Anesthesiology 1996; 84: 362–71
19 Eriksson LI, Sundman E, Olsson R, et al. Functional assessment of the pharynx at rest and during swallowing in partially paralyzed humans. Simultaneous videomanometry and mechanomyography of awake human volunteers. Anesthesiology 1997; 87: 1035–43