London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK.
In January 2001, a national newspaper in the UK, the Sunday Times, published a supplement purporting to demonstrate the relative performances of British hospitals in terms of in-hospital mortality.1 The data were supplied and promoted by a group of researchers from a leading medical school, internationally renowned for its scientific excellence. The publication attracted much public, political and media attention. Hospitals at the top of the league congratulated themselves, while staff working in badly performing hospitals were demoralized or angry or, more sensibly, dismissed the findings. Here was another example of the misuse of inadequate observational data and yet further ammunition for the critics of such an approach. If only the researchers involved had read the report by FJ Double published 165 years earlier!
That work elegantly highlights the potential shortcomings of using observational data to make meaningful comparisons of clinical effectiveness.2 While Civiale's claims regarding the aetiology of bladder calculi are measured and show awareness of the limitations of drawing conclusions from selected case series, his interpretation of the relative merits of lithotomy and lithotripsy is highly suspect. This surprises me, given his acknowledgement of the difference between a real and an apparent increase in the incidence of calculi (the latter arising from differences in the judgement and practice of individual clinicians) and the influence of fashion on clinical practice. He also demonstrates his awareness of the vagaries of basing incidence on surgical rates rather than prospective surveys of representative populations. Yet when it comes to comparing surgical techniques, he seems content to use crude post-operative mortality rates (20% versus 2.3%) even though he acknowledges the age mix of the patient populations differed considerably (50% under 14 years compared with 1%).
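The pitfall of comparing crude mortality rates across groups with different age mixes can be illustrated with a short sketch. The counts below are invented purely for illustration and are not Civiale's data; the names `treatment_a`, `treatment_b`, `crude_rate` and `standardized_rate` are likewise hypothetical. The sketch assumes two age strata and uses direct standardization to the combined population, which is only one of several possible adjustment methods.

```python
# Hypothetical counts, invented for illustration only (NOT Civiale's data).
# Each entry maps an age stratum to (deaths, patients).
treatment_a = {"under_14": (10, 100), "14_and_over": (270, 900)}
treatment_b = {"under_14": (90, 900), "14_and_over": (30, 100)}


def crude_rate(groups):
    """Overall deaths divided by overall patients, ignoring age."""
    deaths = sum(d for d, _ in groups.values())
    patients = sum(n for _, n in groups.values())
    return deaths / patients


def standardized_rate(groups, standard_population):
    """Directly standardized rate: stratum-specific rates weighted
    by the size of each stratum in a common standard population."""
    total = sum(standard_population.values())
    weighted = sum(
        (groups[s][0] / groups[s][1]) * standard_population[s]
        for s in standard_population
    )
    return weighted / total


# Assumption: use the pooled patients of both groups as the standard.
standard = {s: treatment_a[s][1] + treatment_b[s][1] for s in treatment_a}

for name, groups in (("A", treatment_a), ("B", treatment_b)):
    print(
        f"Treatment {name}: crude {crude_rate(groups):.2f}, "
        f"age-standardized {standardized_rate(groups, standard):.2f}"
    )
```

In this invented example the stratum-specific mortality is identical for both treatments (10% under 14 years, 30% otherwise), yet the crude rates are 28% and 12% simply because one group is mostly older patients. After standardization both rates are 20%, so the apparent difference is entirely an artefact of age mix.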
In contrast, the report by Double and his colleagues recognizes that there are numerous sources of error that are difficult to avoid. They rightly identify the potential confounding factors that might influence the sort of crude comparison carried out by Civiale: seasons, surgical difficulty, inflammation, bilious complications, illness duration, bladder damage and the general constitution of the patient. Not unreasonably, they conclude that the application of numerical methods in medicine is inevitably severely limited and that clinicians should continue to rely on intuition, experience and wisdom in deciding how to treat individual patients. Not unreasonable in 1835, but is such a conclusion reasonable in 2001?
Despite continuing examples of misleading use of observational data, such as that highlighted above, excellent examples do exist that demonstrate the potential application of these techniques.3 Such applications are only possible if the data are accurate (valid and reliable) and complete, and if sufficient steps have been taken to adjust for case-mix or risk differences. Some people believe that adequate adjustment is never achievable.4 If that view is accepted, much of health care will never be evaluated. A more pragmatic view seeks a role for research in improving the quality of health care. This inevitably involves the use of observational data which, if collected and analysed carefully, can make a major contribution.5 Two recent examples are the demonstration of the danger of premature discharge from intensive care units at night6 and the comparison of the relative merits of surgical procedures to correct stress incontinence in women.7
So, while the Parisian Academy was correct in 1835 to treat unadjusted crude comparisons based on selected case series with scepticism, 165 years later we have the information technology to collect high-quality clinical data and the statistical techniques to make meaningful comparisons. The task is to ensure that methodological rigour is achieved and that poor analyses do not further damage the reputation of observational approaches.
References
1 Sunday Times Good Hospital Guide for Britain and Ireland. Your Guide to Better Health. Sunday Times, 14 January 2001.
2 Poisson SD, Double FJ et al. Rapports: Recherches de Statistique sur l'affection calculeuse, par M. Le docteur Civiale. Comptes Rendus Hebdomadaires des Séances de l'Académie des Sciences 1835;1:171–72.
3 Britton A, McKee M, Black N et al. Choosing between randomised and non-randomised studies: a systematic review. Health Technol Assess 1998;2:13.
4 MacMahon S, Collins R. Reliable assessment of the effects of treatment on mortality and major morbidity, II: observational studies. Lancet 2001;357:455–62.
5 Black NA. High-quality clinical databases: breaking down barriers. Lancet 1999;353:1205–06.
6 Goldfrad C, Rowan K. Consequences of discharges from intensive care at night. Lancet 2000;355:1138–42.
7 Hutchings A, Black NA. Surgery for stress incontinence: a non-randomised trial of colposuspension, needle suspension and anterior colporrhaphy. Eur Urol 2001;39:375–82.