A NICE TRY THAT FAILS: THE SWEDISH COUNCIL ON TECHNOLOGY ASSESSMENT IN HEALTH CARE (SBU) EVALUATION OF THE EFFECT OF TREATMENT OF ALCOHOL AND DRUG PROBLEMS: THE EPIDEMIOLOGIST’S VIEW

Kari Poikolainen

Finnish Foundation for Alcohol Studies, PO Box 220, FIN-00531 Helsinki, Finland

Received 6 December 2001; in revised form 22 April 2002;

ABSTRACT

Background and Aims: The Swedish Council on Technology Assessment in Health Care (SBU) has recently published a large, >800-page systematic review. It reviews brief interventions to reduce alcohol intake, long-term prognosis of substance dependence, obstetric questions and economic aspects of addiction treatments. The main part aims to evaluate, by meta-analytical techniques, treatments for problems with alcohol and other addictive substances. Results and Conclusions: The report summarizes 641 individual studies. Unfortunately, several methods are weak, some inadequately documented, and many conclusions rest on shaky grounds.

INTRODUCTION

A new Swedish book of more than 800 pages in two volumes is an ambitious attempt to evaluate, by meta-analytical techniques, treatments for problems with alcohol and other addictive substances (SBU, 2001). This is the main focus, although the report also reviews brief interventions to reduce alcohol intake, long-term prognosis of substance dependence, obstetric questions and economic aspects of addiction treatments. This book is the focus of this Commentary.

A team of 11 experts, seven reviewers and two co-ordinators has produced this twin book. The employer is the Swedish Council on Technology Assessment in Health Care (abbreviated to ‘SBU’ in Swedish). Their yellow book series includes not only the assessment made by the expert group but, based on the former, the Council’s executive conclusions and recommendations for action in practice. Thus, this magnum opus might have great influence on treatment practice in future. A forthcoming English translation will extend its influence beyond Sweden.

MAIN RESULTS

Several types of treatment for various addictive problems have been reviewed in individual chapters. The easiest way to get an idea of the topics and results of these reviews is to read the light blue pages with the executive summary. Table 1 summarizes the report’s conclusions regarding effectiveness or otherwise of various interventions.


Table 1. Conclusions of the SBU (2001) report concerning the effectiveness of various interventions in the treatment of alcohol and other drug problems
 
Unfortunately, in my view, many of the methods to reach these conclusions were either weak or poorly documented. This opinion pertains to the chapters on brief interventions to reduce alcohol intake and on the evaluations of treatments of alcohol and other addictive substances by meta-analytical techniques. I shall not discuss the chapters on the long-term prognosis of substance dependence, obstetric questions or the economic aspects of addiction treatments.

METHODS OF THE REPORT

Literature search
Literature searches were well described, and the search terms were stated. Published studies were identified mainly from Medline. For some topics, studies were also sought in other databases and in reviews. The earliest studies included were from the 1950s or later, depending on the topic. The latest studies included were published in the summer of 2000. There were, however, wide variations in the thoroughness with which the various topics were approached. For some, but not all, topics, unpublished studies were sought by contacting researchers and by examining conference programmes and abstracts.

For most topics, however, the objectives of the review have been less clearly stated. For example, it is not sufficient to state whether a particular treatment has an effect, but one should also specify on which factor there might be an effect. Various treatment outcomes were studied. Typically, these included abstinence from the drug of dependence and retention in treatment. The latter is, in my view, more of a means to, than a goal of, treatment and is thus less useful as an effect measure.

It is commendable that individual studies were rated for quality and that the quality rating scheme was explained. However, more detailed explanations would have been useful. Altogether, the report was based on 641 relevant studies. Most were randomized controlled trials and the quality was considered to be reasonably high.

Generalizability
It is important to know to which patient group the results can be generalized. Therefore, only studies with similar patient groups should be combined in a meta-analysis. The review on psychosocial treatments for alcohol problems attempted to solve this problem by combining studies with patients having similar severity of problems or alcohol consumption. This must be a difficult task, since individual studies are seldom clear on this point. Misleadingly, the review on long-term opioid agonist treatment claims that the target group was patients with DSM-IV or ICD-10 opioid dependence. None of the eight studies included in the meta-analysis of agonist vs control treatment effect on opioid misuse used these criteria (four applied DSM-III-R criteria). In fact, the inclusion criteria in the major studies were more stringent than simple dependence criteria, the former including parenteral use, misuse history of several years and earlier withdrawal treatments that had failed. This important point has not been mentioned in the review.

Classification of studies
The individual studies combined in meta-analysis should be reasonably similar with respect to the treatments given. This is difficult to achieve with respect to psychosocial treatments. The principles used in combining studies varied. For alcohol problems, the principle was the type of the treatment goal. The main treatment goals were change in motivation, misuse behaviour, background factors for misuse, or in support, or treatment focusing on ‘significant others’. For example, different types of cognitive behavioural treatments and 12-step programmes were combined together into the group where the aim of treatment was to change misuse behaviour. For drug problems, the classifying principle was the amount of education needed in order to qualify as a treatment professional. The main categories were supportive treatment, re-educational treatment and psychotherapy. For example, the latter group pooled family therapy, cognitive therapy and dynamic psychotherapy treatments. It is hard to imagine the rationale behind pooling these divergent forms of treatment.

Although the meta-analysis on opioid agonist (mainly methadone) maintenance treatment for opioid dependence was described as covering long-term opioid agonist treatment, in one of the studies included in the analysis the treatment lasted for only 2 weeks. This is in stark contrast with the other, genuinely long-term studies used in this meta-analysis, with 50, 104 or 156 weeks of treatment.

Primary outcome
In randomized controlled studies, it is the primary outcome and primary analysis, defined before the onset of the study, that is important. It is quite legitimate to carry out secondary analyses, but these should not be considered to be the main result. It is therefore disturbing that, in some cases, secondary analyses have been chosen to represent the study result. For example, Project MATCH (1997) concluded that there was no difference between cognitive–behavioural treatment enhancement and a 12-step programme in the primary outcome, but the SBU (2001) review prefers the latter.

Statistical analysis
In most review topics, the main objective was to compare different treatments for the same problem and find out the best treatment options. To achieve this, one needs a common measure of effect to summarize the effect sizes for various types of treatment, and one must also ensure that there is neither publication bias nor lack of homogeneity between the studies used to calculate the summary effect size of a certain treatment. In the SBU (2001) Report, the possibility of publication bias was analysed only in a few instances (by funnel plots). Although some efforts were made to include unpublished studies for some topics, this was not true for all chapters. There were no analyses of homogeneity, nor any exploration of the reasons for a possible lack of it. It was not made clear whether the meta-analyses were based on a fixed-effects or a random-effects model. The measure of effect was the d-statistic. Although widely used, it is not suitable for comparing the differences in effects of various treatments.
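The homogeneity analysis and the fixed-effects versus random-effects distinction mentioned above can be illustrated with a minimal sketch. It is not taken from the SBU report. The function name `pool_effects`, the choice of Cochran's Q, the I² index and the DerSimonian-Laird between-study variance, and the three effect sizes and variances are all assumptions made purely for illustration.

```python
import math

def pool_effects(effects, variances):
    """Inverse-variance pooling with a basic homogeneity check.

    effects   -- per-study effect estimates (hypothetical values here)
    variances -- per-study sampling variances of those estimates
    """
    k = len(effects)
    # Fixed-effects weights: inverse of each study's sampling variance.
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Cochran's Q: weighted squared deviations from the fixed-effects mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    # I^2: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance tau^2.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to every study's variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    random_mean = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)

    return {
        "fixed": fixed,
        "se_fixed": math.sqrt(1.0 / sum(w)),
        "q": q,
        "df": df,
        "i2": i2,
        "tau2": tau2,
        "random": random_mean,
        "se_random": math.sqrt(1.0 / sum(w_re)),
    }

# Hypothetical effect sizes and variances for three studies.
result = pool_effects([0.1, 0.5, 0.9], [0.04, 0.04, 0.04])
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With heterogeneous inputs such as these, the two models give the same point estimate but the random-effects standard error is clearly larger, which is why a report should state which model was used.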

The d-statistic is generally computed as:

$$ d_i = \frac{\bar{x}_{e_i} - \bar{x}_{c_i}}{SD_{p_i}} $$

where $\bar{x}$ is the group mean of the effect measure in study $i$, e denotes the experimental group, c the control group, and $SD_{p_i}$ is the pooled estimate of the SD of the effect measure for each study (Petitti, 1994). The SBU (2001) Report does not document its statistical procedures unambiguously, but this seems to be the effect measure they have used in most cases, although the book also suggests that they may have replaced the pooled SD by the SD in one of the groups compared.

The d-statistic effect size estimates have been nicely plotted in many figures showing the effect sizes and their 95% confidence limits. Unfortunately, the effect sizes are not comparable. First, continuous outcome variables tend to yield much higher d-values than dichotomous ones. The authors were aware of this problem and therefore performed meta-analyses excluding results based on continuous variables. An unsolved and more serious problem is, however, that the d-values from dichotomous outcomes depend on the number of successful outcomes in the comparison group. Say, for example, that you treat groups of 50 patients and in each group five will have good clinical outcome due to treatment. Then the clinical workload is always 50 patients, the net benefit from treatment always five successfully treated patients, and the percentage difference in good outcome between treated and untreated always 10%. However, in such samples of 50 patients and 50 controls, the d-value will vary between 1.0 and 2.3, depending on the good outcome rate in the untreated group of 50 patients (using the formula above, SD calculation based on the binomial distribution). Clearly, the d-statistic is useless for identifying the best treatment options.
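The dependence of d on the comparison-group outcome rate can be checked with a short sketch. The text does not spell out exactly how the binomial SD was formed, so the version below rests on an assumption: the SD is taken as the square root of the summed binomial variances of the two success counts. The function name `d_from_counts` is hypothetical.

```python
import math

def d_from_counts(successes_control, benefit=5, n=50):
    """Standardized difference for two groups of n patients each.

    The treated group always has `benefit` more successes than the controls.
    """
    p_c = successes_control / n
    p_e = (successes_control + benefit) / n
    # Assumption (not stated in the text): SD of the difference in success
    # counts, from the summed binomial variances n*p*(1-p) of the two groups.
    sd = math.sqrt(n * p_c * (1 - p_c) + n * p_e * (1 - p_e))
    return benefit / sd

# Control-group successes can range from 0 to 45 (treated: 5 to 50).
d_values = {k: d_from_counts(k) for k in range(0, 46)}
d_min = min(d_values.values())
d_max = max(d_values.values())
print(f"smallest d: {d_min:.2f}, largest d: {d_max:.2f}")
```

Under this assumption the values run from about 1.0, when roughly half of the controls do well, to about 2.4, when none or nearly all do, which is close to the range quoted above even though the absolute benefit is five patients in every case. A different pooling of the SD changes the numbers but not the dependence on the control-group rate.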

It has been known for some time that effect measures based on SDs are problematic (Greenland et al., 1987). Moreover, these effect measures are difficult to grasp. Petitti (1994) noted that, when there is no reason to convert the measures of effect to units of SD, natural units should be used. For example, if the goal of treatment is to reduce alcohol intake, natural units such as days of abstinence or average alcohol intake per day should be preferred to effect sizes based on the d-statistic.

Fortunately, some, but not all (e.g. chapter 3), chapters of the SBU (2001) Report also tabulated results from the meta-analyses. For example, the tables in the chapter on pharmacological treatments of drug problems showed the percentages of bad outcome in the treatment and control groups, both for individual studies and the respective meta-analyses. These data are potentially very useful, because, based on these categorical data, odds ratios (ORs) can be calculated. OR is in most cases a far better effect size estimator than the d-statistic. The question is, however, can these data be relied on? The evaluation of this is undermined by the lack of detailed explanation of how these data were abstracted. For example, the procedures of making reliability checks, if any, and the handling of missing data were not explained.
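As a minimal sketch of how an OR could be derived from such tabulated data, the function below takes the number of bad outcomes and the group size for the treatment and control groups. The function name `odds_ratio`, the Woolf (log-based) confidence interval, the 0.5 continuity correction for empty cells and the example counts are all assumptions for illustration. None of them comes from the SBU report.

```python
import math

def odds_ratio(bad_treated, n_treated, bad_control, n_control, z=1.96):
    """Odds ratio of a bad outcome (treated vs control) with a Woolf-type CI."""
    a, b = bad_treated, n_treated - bad_treated      # treated: bad, good
    c, d = bad_control, n_control - bad_control      # control: bad, good
    if min(a, b, c, d) == 0:
        # Continuity correction so that the OR and its SE stay defined.
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper

# Hypothetical study: 30% bad outcome among 100 treated, 50% among 100 controls.
or_est, or_low, or_high = odds_ratio(30, 100, 50, 100)
print(f"OR = {or_est:.2f} (95% CI {or_low:.2f} to {or_high:.2f})")
```

Percentages alone are not enough for this calculation. The group sizes are also needed, which is one reason why the way the data were abstracted matters.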

Data abstracting
A comprehensive review of the correctness of abstracting and combining data from individual studies is beyond the scope of this Commentary. However, I compared a few studies I had at hand. While nothing general can be said on the basis of this convenience sample, I found sufficient disagreements between data abstracting choices by myself and SBU to suggest that an independent assessment of the data in the original articles should be preferred to the SBU data.

GENERAL COMMENTS AND CONCLUSIONS

Brief interventions to reduce alcohol intake
Since the aim of the book was to summarize the evidence by meta-analysis, it is surprising that there is no such analysis in the chapter on brief interventions to reduce alcohol intake, even if the authors were well aware that such analyses have been done earlier. The reason given was that the studies were too different to be included in a meta-analysis. I agree with that, but dividing the studies into smaller homogeneous sets might have solved the problem.

Executive summary
The executive summary contains an astounding contradiction. It first suggests that there is no scientifically sound definition of risky alcohol intake, yet it later states that brief interventions have been shown to reduce intake below risky levels. As there is no sound definition of risky alcohol intake, it is misleading to use numbers needed to treat as the effect measure. Changes in levels of alcohol consumption (and some other continuous measures, such as laboratory test values) should have been used instead.

Presentation
There are enough minor defects in the presentation to make the reader slightly frustrated. The use of the book is undermined by the lack of an index and the brevity of the list of contents. The figures showing treatment effect sometimes fail to point out what is the outcome at hand, and the reader has to read carefully a few pages to reach a likely explanation. The percentage value of confidence intervals is not always mentioned; you just have to guess that it probably is 95%.

Review verdict
Systematic review by meta-analysis is an important, but difficult, exercise. A study on the quality of Cochrane systematic reviews first published in 1998 showed major problems in 29% of the reviews and warned that these reviews should be interpreted cautiously, particularly those with conclusions favouring experimental interventions and those with many typographical errors (Olsen et al., 2001). The SBU review does not compare favourably with the former reviews. Although the SBU (2001) review has its uses in listing the existing literature and presenting condensed overviews, it contains several serious errors and has a fair share of minor ones. The main findings cannot be trusted. The SBU (2001) review leaves a great deal of room for improvement. A new, state-of-the-art systematic review would be most welcome.

REFERENCES

Greenland, S., Schlesselman, J. J. and Criqui, M. H. (1987) The fallacy of employing standardized regression coefficients and correlations as measures of effect. American Journal of Epidemiology 123, 203–208.

Olsen, O., Middleton, P., Ezzo, J., Gotzsche, P. C., Hadhazy, V., Herxheimer, A., Kleijnen, J. and McIntosh, H. (2001) Quality of Cochrane reviews: assessment of sample from 1998. British Medical Journal 323, 829–832.

Petitti, D. B. (1994) Meta-analysis, Decision Analysis and Cost-effectiveness Analysis: Methods for Quantitative Analysis in Medicine. Monographs in Epidemiology and Biostatistics, vol. 24. Oxford University Press, New York.

Project MATCH Research Group (1997) Matching alcoholism treatments to client heterogeneity: project MATCH posttreatment drinking outcomes. Journal of Studies on Alcohol 58, 7–29.

SBU (2001) Behandling av alkohol- och narkotikaproblem. En evidensbaserad kunskapssammanställning. Statens beredning för medicinsk utvärdering, rapport nr 156. Stockholm. ISBN: 91-87890-73-9. Also available at http://www.sbu.se/admin/index.asp