A multicentre randomized controlled trial of expectant management versus IVF in women with Fallopian tube patency

E.G. Hughes1,7, M.L. Beecroft1, V. Wilkie2, L. Burville1, P. Claman2, I. Tummon3, E. Greenblatt4, M. Fluker5 and K. Thorpe6

1 Department of Obstetrics and Gynecology, and 6 Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Ontario, 2 Department of Obstetrics and Gynecology, University of Ottawa, Ottawa, 3 Department of Obstetrics and Gynecology, University of Western Ontario, London, 4 Department of Obstetrics and Gynecology, University of Toronto, Toronto, Ontario and 5 Genesis Fertility Centre Incorporated, Vancouver, British Columbia, Canada

7 To whom correspondence should be addressed at: Department of Obstetrics and Gynecology, McMaster University Medical Centre, 1200 Main Street West, Room 4D14, Hamilton, ON L8N 3Z5, Canada. Email: hughese@mcmaster.ca


    Abstract
 
BACKGROUND: Although observational studies suggest that IVF is more effective than no treatment for women with Fallopian tube patency, this has not been tested rigorously in a randomized controlled trial (RCT). METHODS: Eligible consenting couples planning their first treatment cycle in five Canadian fertility clinics received either IVF, within 90 days of randomization, or a period of 90 days with no treatment. Random allocation was stratified by female age and sperm quality, and administered using numbered, opaque, sealed envelopes. Follow-up assessed live birth and associated morbidity. RESULTS: Sixty-eight couples were randomized to a first cycle of IVF and 71 couples had 3 months without treatment. The live birth rates were 20/68 (29%) and 1/71 (1%), respectively. The single delivery in the untreated group was of twins, as were six of the 20 IVF deliveries (30%). An average of 2.0 embryos were transferred and no triplet pregnancies resulted. The relative likelihood of delivery after allocation to IVF was 20.9-fold higher than after allocation to no treatment [95% confidence interval (CI) 2.8–155]. The presence of abnormal sperm did not reduce this likelihood. Treating four women (95% CI 3–6) with one cycle of IVF is required to achieve a single additional birth. CONCLUSIONS: This study provides a valid and up-to-date comparison for policy makers and patients as they make choices around IVF, accurately measuring and confirming a major benefit from treatment.

Key words: effectiveness/IVF/treatment-independent pregnancy


    Introduction
 
IVF has been widely accepted as appropriate therapy for subfertility associated with bilateral tubal occlusion. For persistent subfertility with evidence of Fallopian tube patency, however, IVF has not been proven effective using an experimental research design. Although observational studies (case series and cohort studies) continue to suggest that it is a highly effective therapy, there remains a need for a randomized controlled trial (RCT) of IVF versus no treatment in this patient group, because of the potential for bias in available subexperimental research.

There are many examples of observational data overestimating the true value of treatments, leading to dissemination of useless and at times even harmful interventions. These include the widespread adoption of lidocaine to prevent post-myocardial infarction (MI) arrhythmias (Hine et al., 1989). When this treatment was tested appropriately through RCTs, it was found to significantly increase post-MI mortality and may have resulted in thousands of deaths. Gastric freezing for the treatment of peptic ulcer is a less dangerous but equally ineffective treatment widely adopted and then discounted after a small RCT proved it to be worthless (Wangensteen et al., 1962).

There is also a need for an RCT of IVF versus no treatment to assist in health policy decisions. Access to medical treatment in countries with ‘universal health care systems’ is generally predicated on ‘medical necessity’ and ‘effectiveness’ (Hughes and Giacomini, 2001). While most observers agree that subfertility is both a social and a medical problem, proof of the effectiveness of IVF treatment is lacking. The current trial therefore asks: among women with subfertility of ≥2 years’ duration, having any diagnosis but bilateral tubal occlusion, how effective is IVF treatment, with or without ICSI, in terms of live birth rate?


    Methods
 
The study took place in five Canadian university-affiliated and private IVF programmes. No payment or reduced costs were offered, although for couples with no insurance coverage for medication, 300 U of FSH were provided. Approval was first received from each centre’s institutional Ethics Review Board. Couples were eligible for inclusion if they fulfilled all of the following criteria: duration of subfertility ≥2 years, defined as no live birth during that time; no previous IVF treatment; female age 18–39 years; willingness to commence either IVF within 6 weeks of allocation or a 3 month period of observation without intervention; day 3 serum FSH level of ≤15 IU/l or the standard level for inclusion in an individual centre’s IVF programme, whichever level was lower; semen analysis available within the last 6 months showing an adequate number of sperm to perform ICSI; and evidence of Fallopian tube patency, based on a hysterosalpingogram (HSG) or laparoscopy. The following exclusion criteria were applied: bilateral Fallopian tube occlusion confirmed by HSG or laparoscopy; the use of donor sperm; need for sperm recovery procedures; and concurrent serious medical illnesses that could be a relative contraindication to IVF. All couples had exhausted appropriate lower intensity treatment options, such as ovulation induction and intrauterine insemination.

Random allocation was based on a blocked schedule using numbered, sealed, opaque envelopes. Randomization was stratified by centre, by female age (≥35 years) and by the presence or absence of abnormal sperm (defined relative to a total motile sperm count of 20 million).

The interventions compared were the patient’s first ever cycle of IVF treatment and 90 days of observation with no treatment. Similar IVF techniques were used across centres. All programmes used ‘long protocol’ GnRH analogue suppression followed by recombinant FSH as a prelude to oocyte retrieval and IVF. The drugs and dosages used for each patient’s stimulation were recorded, along with the number of oocytes retrieved, embryos produced, quality of individual embryos, day of transfer, and the number and quality of embryos transferred and frozen. Oocyte retrieval was carried out under vaginal ultrasound guidance and no centre transferred more than four embryos per cycle. The day of embryo transfer was not standardized and ranged between day 3 and day 5 post-retrieval.

Medication was begun within 42 days of randomization, to ensure that all embryo transfers occurred within 90 days, the same period of observation used in the control group. In the latter study group, no medications that might reduce spontaneous conception were allowed, such as the commencement of a GnRH analogue pre-treatment for subsequent IVF.

The primary study outcome was live birth. This was defined as delivery of a fetus or fetuses beyond 24 completed weeks of gestation with a heart rate at birth, or a neonate that survives for >10 min after resuscitation with a heart rate recorded. Delivery of more than one live fetus to a woman with a multiple pregnancy was considered as a single live birth. However, data on individual fetuses were recorded. Details of the delivery were gathered from patient charts by each centre’s research assistant and recorded in the final data collection sheet at 12 months post-randomization. If patients delivered elsewhere, these institutions were contacted and written documentation of outcome obtained.

Neonatal outcomes were recorded, and included birth weight, need for resuscitation, admission to and number of days stay in a neonatal nursery, neonatal seizures and death. Other secondary outcomes included clinical pregnancy, defined as evidence of pregnancy, based on serum βHCG level >5 IU/l and visible evidence of a gestation sac with or without fetal heart rate or histological evidence of pregnancy after miscarriage, dilatation and curettage, or surgical excision of ectopic pregnancy. Written documentation for these outcomes was obtained from centres and checked centrally by the study coordinator. Details of adverse outcomes such as ovarian hyperstimulation syndrome (OHSS) and pelvic infection were included on the data collection forms.

Sample size estimation was based on the average reported delivery rates from the largest case series for North American IVF programmes at the time of study design: 26% per oocyte retrieval and 28% per embryo transfer (Society for ART, 1999). An estimate of treatment-independent pregnancy rate of 1% per month was based on five comparative studies of IVF versus no treatment, ranging from 0.35 to 0.9% (Ben-Rafael et al., 1986; Haney et al., 1987; Roh et al., 1987; Evers et al., 1998; Donderwinkel et al., 2000). For the purpose of this study, an absolute increase of at least 25% per cycle was considered clinically significant.

A conventional level for type I error was chosen: α (two-tailed) = 0.05. In defining β, the need for a high degree of certainty to detect a true difference prompted the choice of β = 0.10. Based on all of the above assumptions, using the arcsine transformation and Fleiss continuity correction, a total of 124 patients were required in order to demonstrate a clinically and statistically significant difference between treatment and no treatment.
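
The type of calculation described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' actual computation: the input proportions (28% live birth with IVF, 3% over 3 months without treatment) are read from the preceding paragraph, the function name is hypothetical, and the result is not expected to reproduce the reported total of 124, which depends on inputs and conventions that are not fully specified in the text.

```python
from math import asin, ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p_treat, p_control, alpha=0.05, power=0.90):
    """Per-group sample size for comparing two proportions.

    Uses the arcsine (variance-stabilizing) approximation, then applies
    the Fleiss continuity correction. Returns (uncorrected, corrected).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed type I error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    # Effect size on the arcsine scale.
    h = 2 * asin(sqrt(p_treat)) - 2 * asin(sqrt(p_control))
    n = 2 * ((z_alpha + z_beta) / h) ** 2
    # Fleiss continuity correction inflates the uncorrected n.
    delta = abs(p_treat - p_control)
    n_cc = (n / 4) * (1 + sqrt(1 + 4 / (n * delta))) ** 2
    return ceil(n), ceil(n_cc)


# Assumed inputs: 28% with IVF versus 3% over 3 months untreated.
uncorrected, corrected = sample_size_per_group(0.28, 0.03)
print(uncorrected, corrected)
```

Doubling the corrected per-group figure gives the total enrolment under these assumed inputs; smaller assumed differences between groups increase the requirement sharply.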

Data collation was centralized at the lead centre (Hamilton) and analysis was done in collaboration with the Clinical Trials Methodology Group of the Department of Clinical Epidemiology, McMaster University. Baseline prognostic variables included in the entry questionnaire were compared between groups and strata, in order to assess the effectiveness of randomization. The primary analysis was by intent to treat, comparing the number of live birth deliveries in women allocated to a single cycle of IVF or 3 months of untreated observation. Fisher’s exact test was used. Secondary analyses compared clinical pregnancy rates and the proportion of women achieving delivery with any outcome in the two groups. Confidence intervals (CIs) were calculated using Mantel–Haenszel statistics (Mantel and Haenszel, 1959).
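
As an illustration of the primary analysis, the sketch below computes a two-sided Fisher's exact P value from first principles, using the intent-to-treat live birth counts reported in the Results (20/68 versus 1/71). The function name is hypothetical, and the two-sided convention used here (summing all tables no more probable than the observed one) is an assumption, since the text does not state which convention was applied.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_obs * (1 + 1e-9)
    )


# Rows: IVF, no treatment; columns: live birth, no live birth.
p_value = fisher_exact_two_sided(20, 48, 1, 70)
print(f"P = {p_value:.2e}")
```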

Even with 90% power to evaluate benefits, the study had inadequate power to assess different rates of adverse events in the IVF and untreated groups. Adverse outcomes were summarized and reported.


    Results
 
Between May 2000 and April 2002, 139 of 425 eligible couples agreed to be randomized. Their demographic and diagnostic data are summarized in Table I, which confirms that the groups were similar with respect to important prognostic variables. Overall, the mean (SD) female age was 33.0 (3.5) years and duration of subfertility 56 (29) months. The primary diagnosis was unexplained or male factor-related subfertility in more than half of the couples.


Table I. Demographics—data for 139 enrolled women
 
Of 71 couples randomized to expectant management, two inadvertently received IVF. Neither conceived. A total of three clinical pregnancies occurred in this group, resulting in two first trimester abortions and one live birth of twins, corresponding to a 1.4% (95% CI 0.4–7.6) live birth rate (Figure 1).



Figure 1. Flow diagram of the process through phases of the randomized trial.

 
Sixty-eight couples were randomized to a first cycle of IVF treatment. The mean peak serum estradiol level (SD) was 8024 (3441) pmol/l, the mean number (SD) of oocytes retrieved was 10 (4.3) and of embryos transferred was 2.0 (0.7). The maximum number of embryos transferred was four. Three women conceived after randomization but before IVF, and 18 as a result of IVF. Of 21 clinical pregnancies in the IVF group, 20 resulted in a live birth, six of which were twin deliveries. The live birth rate was thus 29% per woman randomized to early IVF (Table II), the absolute difference was 28 percentage points (95% CI 16.8–39.2) and the relative likelihood of delivery with IVF treatment allocation was 20.9 (95% CI 2.8–155) (Figure 2). The number needed to treat to achieve one additional live birth with IVF was 4 (95% CI 3–6).
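
The effect measures quoted above follow from the two live birth counts (20/68 and 1/71). The sketch below recomputes them as a check on the arithmetic. The function name is hypothetical, the interval for the absolute difference is a simple Wald interval, and the interval for the relative likelihood uses the log-scale normal approximation, which is swapped in here for the Mantel–Haenszel method named in the Methods; its limits therefore differ slightly from the reported 2.8–155.

```python
from math import ceil, exp, log, sqrt

Z = 1.96  # normal quantile for a 95% interval


def effect_measures(events_t, n_t, events_c, n_c):
    """Risk difference, relative risk and NNT with approximate 95% CIs."""
    p_t, p_c = events_t / n_t, events_c / n_c

    # Absolute difference with a Wald interval.
    rd = p_t - p_c
    se_rd = sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    rd_ci = (rd - Z * se_rd, rd + Z * se_rd)

    # Relative risk with an interval built on the log scale.
    rr = p_t / p_c
    se_log_rr = sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    rr_ci = (exp(log(rr) - Z * se_log_rr), exp(log(rr) + Z * se_log_rr))

    # NNT is the reciprocal of the risk difference, rounded up; its
    # limits come from inverting the risk-difference limits.
    nnt = ceil(1 / rd)
    nnt_ci = (ceil(1 / rd_ci[1]), ceil(1 / rd_ci[0]))
    return {"rd": rd, "rd_ci": rd_ci, "rr": rr, "rr_ci": rr_ci,
            "nnt": nnt, "nnt_ci": nnt_ci}


result = effect_measures(20, 68, 1, 71)
for name, value in result.items():
    print(name, value)
```

The very wide interval around the relative likelihood reflects the single live birth in the comparison group: with one event in the denominator, the ratio is estimated far less precisely than the absolute difference.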


Table II. Pregnancy data
 


Figure 2. Live birth rate following 3 months of untreated observation versus one cycle of IVF treatment.

 
Admission to hospital for severe OHSS was necessary for two women following IVF (3%). No cases of haemorrhage or pelvic infection occurred. Caesarean section was done for 8 of 20 IVF deliveries (40%). The neonatal outcomes are summarized in Table III.


Table III. Neonatal outcomes
 

    Discussion
 
IVF is a widely accepted treatment for persistent subfertility in couples with Fallopian tube patency, and the need for an RCT comparing it with no treatment has been questioned (Haney et al., 1987). However, treating patients on the basis of potentially biased observational data has in many instances proved costly, resulting in the administration of useless and sometimes harmful therapies to countless patients. It is therefore reassuring that the present RCT indicates that IVF is indeed effective, with a 21-fold increase in live birth rate following one IVF cycle, compared with 3 months of no treatment.

In planning the current trial, sample size was based on fecundity rates from non-experimental studies, which raised two important questions: first, do the numbers accurately reflect the rates of treatment-dependent and independent live birth in the population under study, or are these data significantly biased; and, second, what is a clinically significant difference between treatment-dependent and independent delivery rates for IVF? Although the current trial design answers the first question, the second is highly subjective. An absolute increase in live birth rate of 25% is much greater than would be considered clinically significant for most medical treatments, but was chosen because of the high cost and current controversy around access to therapy.

The intervention of first IVF cycle was selected because its outcome is relevant to all couples seeking IVF. Studying subsequent treatment cycles would risk bias; the good prognosis patients that achieve treatment-related conception would not be included in the analysis of subsequent cycles. Three months of untreated observation was chosen as a comparator for pragmatic reasons. A pilot study suggested that ~25% of couples approached might accept a 3 month delay in their IVF treatment for altruistic reasons alone. It was felt that extending the observation period further would lower the response rate, making recruitment and completion unmanageable. In reality, even asking patients to endure 3 months of treatment delay proved challenging, particularly as all were paying privately for care.

Does this trial provide an unbiased and reliable estimate of effectiveness? The inclusion criteria were broad and multiple centres participated, strengthening its external validity. Perhaps the most important element ensuring internal validity, secure randomization, was also achieved. Potential differences between populations, differing interventions across centres, female age and sperm quality were anticipated and dealt with by stratification. Once randomized to the observation period, two women received IVF treatment. Their outcomes were included in the ‘no treatment’ group as part of an intent to treat analysis. A similar number randomized to early IVF conceived before treatment and these again were analysed as allocated. All randomized couples were thus accounted for, with no loss to follow-up. An alternative ‘as treated’ analysis produces similar results: 17/66 IVF couples and 4/73 no treatment couples achieved a live birth. Here, the absolute treatment difference is 20 percentage points, the relative likelihood 4.7 (2.7–8.1) and the number needed to treat 5 (3–11).

A further concern in assessing internal validity is the influence of time. The recruitment phase of this study lasted 2 years. During that time, clinical practices and pregnancy rates may have varied. However, it is reassuring that clinical pregnancy rates have continued to improve steadily, according to annually published results from North American and European IVF registries (Society for ART, 1999). Since there is no reason to suspect that treatment-independent pregnancy rates have changed during the course of the trial, the treatment effect measured in the early part of the study should have underestimated rather than overestimated the IVF delivery rates achieved by centres by the end of the trial.

In testing the effectiveness of IVF, this study has not taken into account subsequent frozen embryo transfer. The primary question posed addresses the effectiveness of fresh embryo transfer following a stimulated cycle of IVF. A mean of two and a maximum of four embryos were transferred, with a mean of two more frozen in the IVF group. The delivery rate following thawed embryo transfer is generally less than that seen following fresh embryo transfer. The SART database reports a rate of 20% per thaw transfer in 1999, an important and relatively inexpensive adjunct to the measured effect.

In summary, this rigorously designed and executed multicentre RCT has shown that a first cycle of IVF increases the likelihood of live birth by >20-fold in couples with persistent subfertility but Fallopian tube patency. Although six pairs of twins resulted from IVF and one pair from untreated observation, no triplets were conceived. Maternal and neonatal adverse events were relatively minor. The number needed to treat to achieve an additional live birth after IVF is only 4, considerably fewer than for most alternative fertility treatments.


    Acknowledgements
 
The authors gratefully acknowledge the important work of the following team members in the completion of this study: Clare Stewart, Hamilton; Lynn Watson, London; Deborah Davies, Toronto; Caroline Robertson, Toronto; Joan Prewer, Toronto; Heather MacDonald, Vancouver; Physicians’ Services Incorporated; and Organon Canada. We also wish to acknowledge the sage advice of John Collins and Herzel Gerstein in the design phase of this study.


    References
 
Ben-Rafael Z, Mashiach S, Dor J, Rudak E and Goldman B (1986) Treatment-independent pregnancy after in vitro fertilization and embryo transfer trial. Fertil Steril 45, 564–567.

Donderwinkel PF, van der Vaart H, Wolters VM, Simons AH and Kroon G (2000) Treatment of patients with long-standing unexplained subfertility with in vitro fertilization. Fertil Steril 73, 334–337.

Evers JL, de Hass HW, Land JA, Dumoulin JC and Dunselman GA (1998) Treatment-independent pregnancy rate in patients with severe reproductive disorders. Hum Reprod 13, 1206–1209.

Haney AF, Hughes CL, Jr, Whitesides DB and Dodson WC (1987) Treatment-independent, treatment-associated, and pregnancies after additional therapy in a program of in vitro fertilization and embryo transfer. Fertil Steril 47, 634–638.

Hine LK, Laird N, Hewitt P and Chalmers TC (1989) Meta-analytic evidence against prophylactic use of lidocaine in acute myocardial infarction. Arch Intern Med 149, 2694–2698.

Hughes EG and Giacomini M (2001) Funding IVF treatment for persistent sub-fertility: the pain and the politics. Fertil Steril 76, 431–442.

Mantel N and Haenszel W (1959) Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst 22, 719–748.

Roh SI, Awadalla SG, Friedman CI and Park JM (1987) In vitro fertilization and embryo transfer: treatment-dependent versus -independent pregnancies. Fertil Steril 48, 982–986.

Society for ART (1999) Assisted reproductive technology in the United States: 1996 results generated from The American Society for Reproductive Medicine/Society for Assisted Reproductive Technology Registry. Fertil Steril 71, 798–807.

Wangensteen OH, Peter ET, Nicoloff DM, Walder AI, Sosin H and Bernstein EF (1962) Achieving ‘physiological gastrectomy’ by gastric freezing. J Am Med Assoc 180, 439–444.

Submitted on January 14, 2004; accepted on February 18, 2004.