Frederiksborg General Hospital, Hilleroed, Denmark
Service de Pharmacologie Clinique, Lyon, France
Psychopharmacology Clinical Research, Eli Lilly and Co, Indianapolis, Indiana, USA
Declaration of interest P. Bech is Head of a World Health Organization Collaborating Centre for psychometrics. J.P. Boissel, P. Cialdella, M.C. Haugh, and A. Hours were financed by APRET, a non-profit research organisation, for this project. M. A. Birkett and G. D. Tollefson are employed by Eli Lilly and Company.
Correspondence: Dr P. Bech, Psychiatric Research Unit, Frederiksborg General Hospital, Dyrehavevej 48, DK-3400 Hilleroed, Denmark
ABSTRACT
Aims To provide an estimate of the effect of fluoxetine compared with placebo and tricyclic antidepressants (TCAs), and to investigate reasons for early discontinuation from acute treatment.
Method Randomised trials were analysed using three approaches: intention-to-treat, efficacy and end-point analysis.
Results Fluoxetine was superior to placebo but effect size was low. In trials comparing fluoxetine v. TCA, the results for all trials and for the USA trials showed a trend in favour of fluoxetine. Those for the non-USA trials showed a trend in favour of TCA. When combined, the results showed that significantly fewer patients on fluoxetine discontinued treatment because of adverse events.
Conclusion Fluoxetine is superior to placebo, irrespective of the analytical approach used, whereas the results obtained v. TCAs depend on the approach used. Hence, the results should be interpreted in this light.
INTRODUCTION
MATERIAL AND METHOD
In our protocol for this meta-analysis we defined the criteria for selecting trials before we accessed the trials database: (a) identical or very similar clinical inclusion criteria for patients (major depression as defined by DSM-III; American Psychiatric Association, 1980); (b) use of the Hamilton Depression Rating Scale (HDRS-17; Hamilton, 1967; and the first 17 items from trials that used more than 17); and (c) a double-blind follow-up phase of at least six weeks. For the non-USA trials, we analysed only trials of fluoxetine v. TCAs since the three non-USA placebo-controlled trials (116 patients) in the database did not satisfy our inclusion criteria or included very few patients. The same inclusion criteria were used for non-USA trials, except that trials with a five-week, double-blind follow-up period were also included since their exclusion would have led to only a handful of trials with a small number of patients being included. The database contained only one USA trial with a five-week double-blind follow-up period, but this was not included.
Trials without a control treatment (e.g. dose-ranging trials) and those
with a control treatment other than placebo or a TCA were excluded. In
addition, trials in which all control patients received fixed doses of 75
mg/day of a TCA were eliminated, as were those in which treated patients
received less than 10 mg/day of fluoxetine. Within a trial, all patients were
pooled according to the treatment received, irrespective of the dose received;
this is equivalent to comparing a single fluoxetine-treated group with a
single TCA-treated group and a single placebo-treated group.
The first evidence-based diagnostic system in psychiatry is the DSM-III. New-generation antidepressants are indicated for major depression as defined using this diagnostic system in most countries, and this is the reason we decided to use DSM-III major depression as the only diagnostic inclusion criterion.
The database contained information for 69 trials, including 6633 patients; of these, 21 trials were USA trials and 48 had been performed elsewhere (non-USA trials). Of the 21 USA trials, five including 400 patients were excluded for the following reasons: three because of the diagnostic system used (Research Diagnostic Criteria; RDC; Spitzer et al, 1978); one because the double-blind follow-up was only for five weeks; and one because the TCA dose was too low. In addition, 96 patients randomised to receive a fixed dose of 5 mg/day were excluded, as per our protocol. Of the 48 non-USA trials, 34 trials including 2047 patients were excluded for the following reasons: 23 because the DSM-III was not used (RDC; Feighner diagnostic criteria; ICD-9; World Health Organization, 1978); five because the control treatment was not a TCA (maprotiline, a monoamine reuptake inhibitor, was considered to be similar to TCAs although it is tetracyclic, but mianserin was not); two because they were open-label uncontrolled trials; two because only one and two patients, respectively, had been recruited; one because a fixed dose of a TCA was used (clomipramine 75 mg); and one because there were only sparse data available for the 11 patients included. In total, 30 trials (16 USA and 14 non-USA) and 4120 patients (3447 USA and 673 non-USA) were included (62% of the total database) in accordance with the criteria defined in our protocol.
ANALYSIS GROUPS AND METHODS
The trials were analysed in groups defined by where they were performed (USA and non-USA trials) and type of control treatment (placebo or TCA). Three types of analyses were performed for each outcome: (a) all randomised patients, classifying prematurely discontinued patients (before Day 42 in USA trials and Day 35 in non-USA trials) as failures (intention-to-treat); (b) all randomised patients who completed at least four weeks of therapy, using a "last-observation-carried-forward" technique (efficacy analysis); and (c) all randomised patients with at least one post-baseline visit, using a "last-observation-carried-forward" technique (end-point analysis).
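The three approaches can be illustrated with a minimal sketch. The `Patient` record, its field names and the `classify` helper are hypothetical and are not taken from the trials database; the response criterion is the 50% reduction in HDRS-17 used as the primary outcome, and the planned duration defaults to the 42 days of the USA trials.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Patient:
    baseline: float                   # baseline HDRS-17 score
    visits: List[Tuple[int, float]]   # post-baseline (study day, HDRS-17 score)
    last_day: int                     # last study day on treatment


def locf(p: Patient) -> Optional[float]:
    """Last observation carried forward: score at the latest post-baseline visit."""
    if not p.visits:
        return None
    return max(p.visits, key=lambda v: v[0])[1]


def responder(p: Patient, score: Optional[float]) -> bool:
    """Partial remission: at least a 50% reduction from the baseline score."""
    return score is not None and score <= 0.5 * p.baseline


def classify(p: Patient, planned_days: int = 42) -> Dict[str, Optional[bool]]:
    """Responder status under each approach; None means excluded from that analysis."""
    last = locf(p)
    completed = p.last_day >= planned_days
    return {
        # (a) premature discontinuation counts as failure
        "itt": completed and responder(p, last),
        # (b) only patients with at least four weeks of therapy
        "efficacy": responder(p, last) if p.last_day >= 28 else None,
        # (c) only patients with at least one post-baseline visit
        "end_point": responder(p, last) if p.visits else None,
    }
```

A patient who improves from 24 to 11 but stops on Day 21 is a failure under (a), excluded under (b) and a responder under (c), which is how the same trial can yield different response rates.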
Outcomes
Frank et al (1991)
suggested using the term remission, rather than recovery, when defining
response to drug therapy in the short-term treatment of depression. Partial
remission after 4-6 weeks of treatment can be defined as at least a 50%
reduction compared with the baseline value for the HDRS-17 score, which
corresponds to very much or much improved on the Clinical Global Impression
Scale (CGI) (Guy, 1976). The
CGI was used in all the USA trials, but only in a few of the non-USA trials.
The primary outcome for USA and non-USA trials was defined as a binary
variable on the HDRS-17: partial remission, that is, at least a 50% reduction
compared with the baseline score on the HDRS-17 instrument. The secondary
outcome in the USA trials was also a binary variable, defined as much
improved or very much improved on the CGI scale. Another secondary, but
quantitative, outcome was the mean change in HDRS-17 scores from baseline to
end-point. In this part of the analysis an HDRS subscale, the depression
factor (including the six items of depressed mood, guilt, work and interests,
retardation, psychic anxiety and general somatic), was also used (HDRS-6;
Bech, 1989; O'Sullivan et al, 1997).
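As a minimal sketch of how the HDRS-6 subscale relates to the full scale, the six items named above can be summed from a per-item score record. The dictionary layout and the item labels used as keys are assumptions made for illustration only.

```python
from typing import Dict

# The six depression-factor items listed in the text (labels are illustrative keys).
HDRS6_ITEMS = (
    "depressed mood",
    "guilt",
    "work and interests",
    "retardation",
    "psychic anxiety",
    "general somatic",
)


def hdrs6(item_scores: Dict[str, int]) -> int:
    """Sum the six depression-factor items; other HDRS-17 items are ignored."""
    missing = [item for item in HDRS6_ITEMS if item not in item_scores]
    if missing:
        raise ValueError(f"missing HDRS-6 items: {missing}")
    return sum(item_scores[item] for item in HDRS6_ITEMS)
```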
The reasons for early treatment discontinuation were analysed as binary variables (adverse event, lack of efficacy or any reason).
Meta-analytical methods
Log odds ratio analysis for binary data
We used the logarithm of the odds ratio method, which is based on a
multiplicative model, that is the success rate (partial remission) in the
treatment group is assumed to be a multiplicative function of that in the
control group (Boissel et al,
1989). Due to the large number of statistical tests performed the
level of statistical significance was set at a robust value P=0.01 or
less. A test for heterogeneity was also performed, and because this is an
insensitive test, the level of statistical significance was set at a value of
P=0.10 or less. When heterogeneity was detected we analysed the data
using a random effects model, which gives more conservative results, but can
deal with a certain amount of heterogeneity.
An odds ratio equal to one indicates that there is no difference between the two treatment groups. A value greater than one indicates that more patients in the fluoxetine group were classified as being in partial remission, and therefore that fluoxetine was better; a value of less than one indicates that more patients in the control group were classified as being in partial remission, and therefore that control treatment (placebo or TCA) was better. However, in the analyses of early treatment discontinuations an odds ratio of less than one indicates fewer discontinuations in the fluoxetine group, and that fluoxetine was better. Conversely, an odds ratio of greater than one indicates that there were fewer discontinuations in the control (placebo or TCA) group, and therefore that the control treatment was better.
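The pooling logic can be sketched as follows. This is an illustration under stated assumptions, not the authors' software: it uses inverse-variance weighting of per-trial log odds ratios, Cochran's Q for heterogeneity and a DerSimonian-Laird moment estimate of the between-trial variance for the random effects model. The function names, the tuple layout and the 0.5 continuity correction are choices made for this example.

```python
import math
from typing import Dict, List, Tuple

# (events in fluoxetine group, n fluoxetine, events in control group, n control)
Trial = Tuple[int, int, int, int]


def log_odds_ratio(a: float, n1: float, c: float, n2: float) -> Tuple[float, float]:
    """Return (log odds ratio, approximate variance) for one 2x2 table."""
    b, d = n1 - a, n2 - c
    if 0 in (a, b, c, d):  # continuity correction for empty cells (assumption)
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pool(trials: List[Trial]) -> Dict[str, float]:
    ys, vs = zip(*(log_odds_ratio(*t) for t in trials))
    w = [1 / v for v in vs]                          # fixed effects weights
    fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, ys))  # Cochran's Q
    df = len(trials) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-trial variance
    w_re = [1 / (v + tau2) for v in vs]              # random effects weights
    random = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    return {
        "fixed_or": math.exp(fixed),
        "random_or": math.exp(random),
        "q": q,
        "df": df,
        "tau2": tau2,
    }
```

Q would be referred to a chi-squared distribution with `df` degrees of freedom and judged at P=0.10. When `tau2` is zero the two models coincide; when it is positive the random effects weights are more equal across trials and the pooled confidence interval is wider, which is the sense in which that model is more conservative.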
Effect size for the meta-analysis of quantitative data
Effect size analysis was introduced by Glass
(1976) as a means of combining
data from several independent clinical trials. In our analysis the effect size
was defined as the difference between the two groups under investigation in
the mean change of HDRS from baseline to end-point, divided by the standard
deviation of the change score (Cohen, 1977). The 95%
confidence intervals (95% CIs) were calculated according to Hedges & Olkin
(1985). Data for all
randomised patients with at least one post-baseline visit (end-point
analysis), using a last-observation-carried-forward technique,
were included in these analyses. The method of calculation used is in
accordance with that described by Whitehead & Whitehead
(1991), using either a fixed
or random effects model as deemed appropriate. As for the meta-analysis of
binary data, a test of heterogeneity (Cochran's Q-test;
Laird & DerSimonian,
1986) and a test of significance of the effect size were
performed.
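A per-trial effect size of this kind can be sketched as below. The pooled standard deviation and the large-sample variance formula are assumptions of this example (a common form of the standardised mean difference); the text does not specify which standard deviation was used or whether a small-sample correction was applied. Change is taken as end-point minus baseline, so a negative value favours the first group.

```python
import math
from typing import Tuple


def effect_size(mean1: float, sd1: float, n1: int,
                mean2: float, sd2: float, n2: int) -> Tuple[float, float, float]:
    """Standardised difference in mean change, with an approximate 95% CI.

    mean1/sd1/n1: mean change, SD of change and size of the fluoxetine group.
    mean2/sd2/n2: the same for the control group.
    """
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    d = (mean1 - mean2) / pooled_sd
    # Large-sample variance of a standardised mean difference (assumption).
    var = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    half_width = 1.96 * math.sqrt(var)
    return d, d - half_width, d + half_width
```

For example, mean changes of -11 and -8 points with a standard deviation of 10 in two groups of 100 patients give an effect size of -0.30; the interval narrows as trials are pooled and the total sample grows.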
RESULTS
For the non-USA trials, data were analysed from 13 single- and multi-centre, randomised, double-blind trials in which fluoxetine was compared with a TCA in 643 patients (i.e. without non-USA trial 10; see Table 1b). Of these 643 patients, 314 had received fluoxetine and 329 had received TCAs (either amitriptyline, clomipramine, dothiepin, doxepin, imipramine or maprotiline). There were no statistically significant differences between the treatment groups in the percentage of men included in the trials (approximately 40% overall), the mean age (approximately 45 years), or the baseline HDRS-17 score (total mean score approximately 22). Only one of the USA trials v. placebo included both in- and out-patients; the others included only out-patients.
The dose ranges for the individual trials are shown in Tables 1a and 1b. Only two of the USA trials v. TCA included both in- and out-patients (both started with only in-patients and the protocols were amended during the trials); the other trials included only out-patients. Three non-USA trials v. TCA included only in-patients, three included only out-patients, six included both, and this was not specified for the remaining trial. The percentage of patients completing the trial was generally higher in the non-USA trials (Table 1b) than in the USA trials (Table 1a).
Meta-analysis of binary data for treatment effects
HDRS-17
Table 2 shows the results
obtained with HDRS-17, using both the percentage of responders and odds ratio
analysis. The efficacy analysis had the highest response rates in the
comparisons. The overall difference for fluoxetine v. placebo was
21.4% in the efficacy analysis but only 13.6% in the intention-to-treat
analysis. In all the analyses fluoxetine showed a statistically significant
benefit compared with placebo. In the trials of fluoxetine v. TCA
no statistically significant differences were observed, either in the USA
trials or in the non-USA trials.
CGI
Meta-analysis of quantitative data (effect size)
When the results for all seven trials assessing fluoxetine v.
placebo were pooled, an effect size of -0.30 in favour of fluoxetine was
obtained, with a 95% CI of -0.39 to -0.21 (see
Fig. 1). For the HDRS-6 outcome
an effect size of -0.37 was observed (95% CI: -0.46 to -0.28).
Figure 2 shows the results for
the trials v. TCAs. The pooled effect size for the HDRS-17 outcome in
the USA trials was 0.00 with a 95% CI of -0.18 to 0.10. The pooled effect size
for the HDRS-6 outcome showed a non-significant trend in favour of fluoxetine
(-0.10; 95% CI -0.21 to 0.01). A trend in favour of TCAs was observed for the
non-USA trials v. TCAs, with a pooled effect size for the HDRS-17
outcome of 0.17 (95% CI 0.01 to 0.34). There was a similar trend in favour of
TCAs for the HDRS-6 outcome, with a pooled effect size of 0.18 (95% CI 0.01 to
0.34). When the results from all the trials comparing fluoxetine v.
TCAs were pooled, the effect size for the HDRS-17 outcome showed a
non-significant trend in favour of TCAs (0.05; 95% CI -0.04 to 0.14). The
pooled effect size for the HDRS-6 outcome, by contrast, showed a
non-significant trend in favour of fluoxetine (-0.02; 95% CI -0.11 to 0.07).
Meta-analysis of early treatment discontinuation data (binary)
The results of the analyses of the reasons for discontinuations in the
trials v. placebo were as predicted, that is significantly more
discontinuations in the fluoxetine-treated group due to an adverse event, and
significantly more discontinuations in the placebo-treated group due to lack
of efficacy, with a non-significant trend for discontinuation for any reason
favouring fluoxetine (see Table
4). Using the fixed effects model the test for homogeneity was
significant indicating heterogeneity among the trials for the three outcomes,
and visual inspection of the graphical results (not shown) suggested this was
due to two trials (USA-trial-15 and USA-trial-16). We therefore decided to use
a random effects model, which gave more conservative results, but removed the
heterogeneity.
The analysis of the reasons for discontinuation in the USA trials of fluoxetine v. TCA showed that, while receiving fluoxetine, significantly fewer patients discontinued their treatment because of an adverse event, and significantly fewer patients discontinued for any reason. No significant difference was seen with respect to discontinuations due to lack of efficacy. The results from a similar analysis for the non-USA trials v. TCA did not indicate any significant differences between the two groups; however, the width of the confidence intervals suggests a potential lack of power to detect clinically significant differences. When the USA and non-USA trials were combined the results showed that significantly fewer patients on fluoxetine discontinued treatment due to adverse events or for any reason.
DISCUSSION
Antidepressive responsiveness to fluoxetine in major depression
A 50% reduction in the baseline HDRS-17 score was the primary outcome in
our study. In the intention-to-treat analysis, both for HDRS-17 and for CGI,
fluoxetine showed an advantage of approximately 15% over placebo. This is a
similar result to that found in one of the first overviews comparing TCAs with
placebo (Smith et al,
1969) as well as that reported in the Medical Research Council
trial (Medical Research Council,
1965). The odds ratio analysis confirmed that fluoxetine was
significantly superior to placebo, although no difference was seen between the
USA trials and non-USA trials.
Improved safety acceptance of fluoxetine
In this meta-analysis the results for discontinuation due to adverse
reactions were evaluated by the intention-to-treat analysis. Compared with
placebo we observed that significantly more patients ceased treatment with
fluoxetine due to adverse events while significantly more patients dropped out
on placebo due to lack of efficacy. This is reflected in the relatively
lower differences in antidepressive improvement in the intention-to-treat
analysis. However, compared with patients in the TCA groups, in the USA
trials, and to a lesser extent the non-USA trials, we observed that
significantly fewer patients in the fluoxetine group stopped treatment due to
adverse events. This seems to explain why the intention-to-treat analysis for
the USA trials favoured fluoxetine while that for the non-USA trials did not.
However, when combined, significantly fewer patients on fluoxetine compared
with those on TCAs discontinued treatment due to adverse events.
These results are in agreement with results of the meta-analyses published by Anderson & Tomenson (1995) and Hotopf et al (1997). In the latter meta-analysis, Hotopf et al analysed the old TCAs (e.g. imipramine and amitriptyline) separately from the newer TCAs (e.g. dothiepin, nortriptyline, clomipramine and doxepin). They found that the lower rate of discontinuation in patients on SSRIs was observed in the comparison with the old TCAs. This may explain our finding concerning the intention-to-treat analysis in the USA v. non-USA trials, as the old TCAs were used in 65% of the USA trials compared with 47% of the non-USA trials.
Comparison with other meta-analyses with fluoxetine
Previous meta-analyses, for example those of
Song et al (1993),
Greenberg et al
(1994) and Anderson &
Tomenson (1994), mainly used the effect size. Our results
for fluoxetine v. placebo are in agreement with Greenberg et
al (1994), although our
effect size of -0.30 for the HDRS-17 remission outcome is low. In the
Greenberg et al
(1994) analysis we have
detected some publication bias (i.e. unpublished trials not included) and
double publication (i.e. data included from two publications of the same
trial). When using the core symptoms of depression, the HDRS-6 outcome, we
showed an effect size of -0.37, indicating that fluoxetine has an effect on
the specific symptoms for major depression. This is in agreement with the
results from our previous meta-analyses on citalopram and fluvoxamine
(Bech, 1989; Bech & Cialdella, 1992).
Our results for fluoxetine v. TCAs are in agreement with those
published by Anderson & Tomenson
(1994), that is, there is no
difference in the antidepressive effect. This was confirmed by the HDRS-6
outcome results.
In conclusion, we have shown that results from meta-analyses can differ depending on how patients who withdraw from treatment early are counted in the analyses. Generally, the approach used is intention-to-treat, whereby patients who withdraw from treatment early are considered failures in the trial group to which they were allocated, and it may be important in the future to consider using other approaches (efficacy and end-point) in meta-analyses, to determine if there is a difference. Thus, in our analyses, we have confirmed the superiority of fluoxetine over placebo for the short-term treatment of major depression, and although we were unable to show a difference in efficacy with TCAs, fewer patients on fluoxetine withdrew due to adverse effects.
Clinical Implications and Limitations
LIMITATIONS
REFERENCES
Anderson, I. M. (1998) SSRIs versus tricyclic antidepressants in depressed in-patients: A meta-analysis of efficacy and tolerability. Depression and Anxiety, 7 (suppl. 1), 11-17.
Anderson, I. M. & Tomenson, B. M. (1994) The efficacy of selective serotonin re-uptake inhibitors in depression: a meta-analysis of studies against tricyclic antidepressants. Journal of Psychopharmacology, 8, 238-249.
Anderson, I. M. & Tomenson, B. M. (1995) Treatment discontinuation with selective serotonin reuptake inhibitors compared with tricyclic antidepressants: A meta-analysis. British Medical Journal, 310, 1433-1438.
Ansseau, M. (1992) The Atlantic gap: clinical trials in Europe and the United States. Biological Psychiatry, 31, 109-111.
Bech, P. (1989) Clinical effects of selective serotonin reuptake inhibitors. In Clinical Pharmacology in Psychiatry (Psychopharmacology Series 7) (eds S. G. Dahl & L. F. Gram), pp. 81-93. Berlin & Heidelberg: Springer-Verlag.
Bech, P. & Cialdella, P. (1992) Citalopram in depression: meta-analysis of intended and unintended effects. International Clinical Psychopharmacology, 6 (suppl. 5), 45-54.
Boissel, J. P., Blanchard, J., Panak, E., et al (1989) Considerations for the meta-analysis of randomized clinical trials. Summary of a panel discussion. Controlled Clinical Trials, 10, 254-281.
Cohen, J. (1977) Statistical Power Analysis for the Behavioral Sciences. Orlando, FL: Academic Press Inc.
Cucherat, M., Boissel, J. P., Leizorovicz, A., et al (1997) Easy MA: a program for the meta-analysis of clinical trials. Computer Methods and Programs in Biomedicine, 53, 187-190.
Frank, E., Prien, R. F., Jarrett, R. B., et al (1991) Conceptualisation and rationale for consensus definitions of terms in major depressive disorders. Archives of General Psychiatry, 48, 851-855.
Glass, G. V. (1976) Primary, secondary and meta-analysis of research. Review of Educational Research, 5, 3-9.
Greenberg, R. P., Bornstein, R. F., Zborowski, M. J., et al (1994) A meta-analysis of fluoxetine outcome in the treatment of depression. Journal of Nervous and Mental Disease, 182, 547-551.
Guy, W. (1976) ECDEU Assessments Manual for Psychopharmacology. Rockville, MD: National Institute of Mental Health.
Hamilton, M. (1967) Development of a rating scale for primary depressive illness. British Journal of Social and Clinical Psychology, 6, 278-296.
Hedges, L. V. & Olkin, I. (1985) Statistical Methods for Meta-Analysis. New York: Academic Press.
Hotopf, M., Hardy, R. & Lewis, G. (1997) Discontinuation rates of SSRIs and tricyclic antidepressants: a meta-analysis and investigation of heterogeneity. British Journal of Psychiatry, 170, 120-127.
Laird, N. & DerSimonian, R. (1986) Meta-analysis in clinical trials. Controlled Clinical Trials, 7, 177-188.
Medical Research Council (1965) Clinical trial treatment of depressive illness. British Medical Journal, 1, 881-886.
O'Sullivan, R. L., Fava, M., Agustin, C., et al (1997) Sensitivity of the six-item Hamilton Depression Rating Scale. Acta Psychiatrica Scandinavica, 95, 379-384.
Review Manager (1997) Computer Programme Version 3.0.1. Oxford: Update Software.
Smith, A., Traganza, E. & Harrison, G. (1969) Studies on the effectiveness of antidepressant drugs. Psychopharmacology Bulletin, suppl., 1-53.
Song, F., Freemantle, N., Sheldon, T. A., et al (1993) Selective serotonin reuptake inhibitors: a meta-analysis of efficacy and acceptability. British Medical Journal, 306, 683-687.
Spitzer, R. L., Endicott, J. & Robins, E. (1978) Research Diagnostic Criteria: Rationale and research reliability. Archives of General Psychiatry, 35, 773-785.
Whitehead, A. & Whitehead, J. (1991) A general parametric approach to the meta-analysis of randomized clinical trials. Statistics in Medicine, 10, 1665-1677.
World Health Organization (1978) Mental Disorders: Glossary and Guide to their Classification in Accordance with the Ninth Revision of the International Classification of Diseases (ICD-9). Geneva: WHO.
Received for publication June 22, 1998. Revision received October 18, 1999. Accepted for publication October 18, 1999.