1 Department of Obstetrics and Gynecology, and 6 Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Ontario, 2 Department of Obstetrics and Gynecology, University of Ottawa, Ottawa, 3 Department of Obstetrics and Gynecology, University of Western Ontario, London, 4 Department of Obstetrics and Gynecology, University of Toronto, Toronto, Ontario and 5 Genesis Fertility Centre Incorporated, Vancouver, British Columbia, Canada
7 To whom correspondence should be addressed at: Department of Obstetrics and Gynecology, McMaster University Medical Centre, 1200 Main Street West, Room 4D14, Hamilton, ON L8N 3Z5, Canada. Email: hughese{at}mcmaster.ca
Abstract |
---|
Key words: effectiveness/IVF/treatment-independent pregnancy
Introduction |
---|
There are many examples of observational data overestimating the true value of treatments, leading to dissemination of useless and at times even harmful interventions. These include the widespread adoption of lidocaine to prevent post-myocardial infarction (MI) arrhythmias (Hine et al., 1989). When this treatment was tested appropriately through RCTs, it was found to significantly increase post-MI mortality and may have resulted in thousands of deaths. Gastric freezing for the treatment of peptic ulcer is a less dangerous but equally ineffective treatment widely adopted and then discounted after a small RCT proved it to be worthless (Wangensteen et al., 1962).
There is also a need for an RCT of IVF versus no treatment to assist in health policy decisions. Access to medical treatment in countries with universal health care systems is generally predicated on medical necessity and effectiveness (Hughes and Giacomini, 2001). While most observers agree that subfertility is both a social and a medical problem, proof of the effectiveness of IVF treatment is lacking. The current trial therefore asks: among women with subfertility of 2 years' duration, having any diagnosis but bilateral tubal occlusion, how effective is IVF treatment, with or without ICSI, in terms of live birth rate?
Methods |
---|
Random allocation was based on a blocked schedule using numbered, sealed, opaque envelopes. Randomization was stratified by centre, by female age (35 years) and by the presence or absence of abnormal sperm (total motile sperm count <20 million).
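The stratified, blocked allocation described above can be illustrated with a minimal sketch. The block size of 4, the number of blocks, the seed and the stratum labels are all assumptions made for illustration; the trial's actual block size and schedule are not reported here.

```python
import random
from itertools import product

ARMS = ("IVF", "observation")

def permuted_block(rng, block_size=4):
    """Return one shuffled block holding equal numbers of each arm."""
    if block_size % len(ARMS) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    block = list(ARMS) * (block_size // len(ARMS))
    rng.shuffle(block)
    return block

def build_schedule(strata, blocks_per_stratum=3, block_size=4, seed=2004):
    """Give every stratum its own independent sequence of permuted blocks."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    schedule = {}
    for stratum in strata:
        sequence = []
        for _ in range(blocks_per_stratum):
            sequence.extend(permuted_block(rng, block_size))
        schedule[stratum] = sequence
    return schedule

# Hypothetical stratum labels: centre x female age group x sperm status.
centres = ("centre_A", "centre_B")
age_groups = ("younger", "older")
sperm_status = ("normal", "abnormal")
strata = list(product(centres, age_groups, sperm_status))

schedule = build_schedule(strata)
```

Within each stratum, every consecutive block of four allocations contains two of each arm, so the groups stay balanced on centre, age and sperm quality however many couples are eventually recruited.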
The interventions compared were the patient's first-ever cycle of IVF treatment and 90 days of observation with no treatment. Similar IVF techniques were used across centres. All programmes used long protocol GnRH analogue suppression followed by recombinant FSH as a prelude to oocyte retrieval and IVF. The drugs and dosages used for each patient's stimulation were recorded, along with the number of oocytes retrieved, embryos produced, quality of individual embryos, day of transfer, and the number and quality of embryos transferred and frozen. Oocyte retrieval was carried out under vaginal ultrasound guidance and no centre transferred more than four embryos per cycle. The day of embryo transfer was not standardized and ranged between day 3 and day 5 post-retrieval.
Medication was begun within 42 days of randomization, to ensure that all embryo transfers occurred within 90 days, the same period of observation used in the control group. In the latter study group, no medications that might reduce spontaneous conception were allowed, such as the commencement of a GnRH analogue pre-treatment for subsequent IVF.
The primary study outcome was live birth. This was defined as delivery of a fetus or fetuses beyond 24 completed weeks of gestation with a heart rate at birth, or a neonate that survived for >10 min after resuscitation with a heart rate recorded. Delivery of more than one live fetus to a woman with a multiple pregnancy was considered as a single live birth. However, data on individual fetuses were recorded. Details of the delivery were gathered from patient charts by each centre's research assistant and recorded in the final data collection sheet at 12 months post-randomization. If patients delivered elsewhere, these institutions were contacted and written documentation of outcome obtained.
Neonatal outcomes were recorded, and included birth weight, need for resuscitation, admission to and number of days' stay in a neonatal nursery, neonatal seizures and death. Other secondary outcomes included clinical pregnancy, defined as evidence of pregnancy based on serum HCG level >5 IU/l and visible evidence of a gestation sac with or without fetal heart rate, or histological evidence of pregnancy after miscarriage, dilatation and curettage, or surgical excision of ectopic pregnancy. Written documentation for these outcomes was obtained from centres and checked centrally by the study coordinator. Details of adverse outcomes such as ovarian hyperstimulation syndrome (OHSS) and pelvic infection were included on the data collection forms.
Sample size estimation was based on the average reported delivery rates from the largest case series for North American IVF programmes at the time of study design: 26% per oocyte retrieval and 28% per embryo transfer (Society for ART, 1999). An estimate of treatment-independent pregnancy rate of 1% per month was based on five comparative studies of IVF versus no treatment, ranging from 0.35 to 0.9% (Ben-Rafael et al., 1986; Haney et al., 1987; Roh et al., 1987; Evers et al., 1998; Donderwinkel et al., 2000). For the purpose of this study, an absolute increase of at least 25% per cycle was considered clinically significant.
A conventional level for type I error was chosen: α (two-tailed) = 0.05. In defining β, the need for a high degree of certainty to detect a true difference prompted the choice of β = 0.10. Based on all of the above assumptions, using the arcsine transformation and Fleiss continuity correction, a total of 124 patients were required in order to demonstrate a clinically and statistically significant difference between treatment and no treatment.
Data collation was centralized at the lead centre (Hamilton) and analysis was done in collaboration with the Clinical Trials Methodology Group of the Department of Clinical Epidemiology, McMaster University. Baseline prognostic variables included in the entry questionnaire were compared between groups and strata, in order to assess the effectiveness of randomization. The primary analysis was by intent to treat, comparing the number of live birth deliveries in women allocated to a single cycle of IVF or 3 months of untreated observation. Fisher's exact test was used. Secondary analyses compared clinical pregnancy rates and the proportion of women achieving delivery with any outcome in the two groups. Confidence intervals (CIs) were calculated using Mantel–Haenszel statistics (Mantel and Haenszel, 1959).
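As an illustration of the test named above, the following sketch computes a two-sided Fisher's exact P-value for a 2×2 table using only the Python standard library. It is not the authors' analysis code. The counts used are the as-treated live birth figures quoted in the Discussion (17/66 with IVF, 4/73 with no treatment), and the two-sided rule (summing all tables no more probable than the observed one) is one common convention, assumed here.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1 with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is no more probable than the observed table;
    # the small tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)

# Rows: IVF, no treatment. Columns: live birth, no live birth.
p_value = fisher_exact_two_sided(17, 66 - 17, 4, 73 - 4)
```

Because the test conditions on the table margins and enumerates every possible table, it remains valid when one cell is small, as with the four live births in the untreated group.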
Even with 90% power to evaluate benefits, the study had inadequate power to assess different rates of adverse events in the IVF and untreated groups. Adverse outcomes were summarized and reported.
Results |
---|
Discussion |
---|
In planning the current trial, sample size was based on fecundity rates from non-experimental studies, which raised two important questions: first, do the numbers accurately reflect the rates of treatment-dependent and treatment-independent live birth in the population under study, or are these data significantly biased; and, second, what is a clinically significant difference between treatment-dependent and treatment-independent delivery rates for IVF? Although the current trial design answers the first question, the second is highly subjective. An absolute increase in live birth rate of 25% is much greater than would be considered clinically significant for most medical treatments, but was chosen because of the high cost of therapy and the current controversy around access to it.
The first IVF cycle was selected as the intervention because its outcome is relevant to all couples seeking IVF. Studying subsequent treatment cycles would risk bias: the good-prognosis patients who achieve treatment-related conception would not be included in the analysis of subsequent cycles. Three months of untreated observation was chosen as a comparator for pragmatic reasons. A pilot study suggested that 25% of couples approached might accept a 3-month delay in their IVF treatment for altruistic reasons alone. It was felt that extending the observation period further would lower the response rate, making recruitment and completion unmanageable. In reality, even asking patients to endure 3 months of treatment delay proved challenging, particularly as all were paying privately for care.
Does this trial provide an unbiased and reliable estimate of effectiveness? The inclusion criteria were broad and multiple centres participated, strengthening its external validity. Perhaps the most important element ensuring internal validity, secure randomization, was also achieved. Potential differences between populations, differing interventions across centres, female age and sperm quality were anticipated and dealt with by stratification. Once randomized to the observation period, two women received IVF treatment. Their outcomes were included in the no treatment group as part of an intent-to-treat analysis. A similar number randomized to early IVF conceived before treatment and these again were analysed as allocated. All randomized couples were thus accounted for, with no loss to follow-up. An alternative as-treated analysis produces similar results: 17/66 IVF couples and 4/73 no treatment couples achieved a live birth. Here, the absolute treatment difference is 20 percentage points, relative likelihood 4.7 (2.7–8.1) and number needed to treat 5 (3–11).
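The as-treated point estimates quoted above follow directly from the two proportions. The short sketch below reproduces that arithmetic; the function and key names are illustrative only, and the confidence intervals are not recomputed because the exact interval method applied to these figures is not stated.

```python
def effect_measures(events_treated, n_treated, events_control, n_control):
    """Point estimates for a two-arm comparison of a binary outcome."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    risk_difference = risk_treated - risk_control
    return {
        "risk_treated": risk_treated,
        "risk_control": risk_control,
        "risk_difference": risk_difference,
        "relative_risk": risk_treated / risk_control,
        # Number needed to treat is the reciprocal of the risk difference.
        "nnt": 1 / risk_difference,
    }

# As-treated live births: 17 of 66 IVF couples, 4 of 73 untreated couples.
result = effect_measures(17, 66, 4, 73)

print(f"Risk difference: {result['risk_difference'] * 100:.0f} percentage points")
print(f"Relative likelihood: {result['relative_risk']:.1f}")
print(f"Number needed to treat: {result['nnt']:.1f}")
```

The live birth proportions are 17/66 (about 26%) and 4/73 (about 5%), giving a difference of about 20 percentage points, a ratio of about 4.7 and a reciprocal of about 4.9, which rounds to the reported number needed to treat of 5.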
A further concern in assessing internal validity is the influence of time. The recruitment phase of this study lasted 2 years. During that time, clinical practices and pregnancy rates may have varied. However, it is reassuring that clinical pregnancy rates have continued to improve steadily, according to annually published results from North American and European IVF registries (Society for ART, 1999). Since there is no reason to suspect that treatment-independent pregnancy rates have changed during the course of the trial, the treatment effect measured in the early part of the study should have underestimated rather than overestimated the IVF delivery rates achieved by centres by the end of the trial.
In testing the effectiveness of IVF, this study has not taken into account subsequent frozen embryo transfer. The primary question posed addresses the effectiveness of fresh embryo transfer following a stimulated cycle of IVF. A mean of two and a maximum of four embryos were transferred, with a mean of two more frozen in the IVF group. The delivery rate following thawed embryo transfer is generally less than that seen following fresh embryo transfer. The SART database reports a rate of 20% per thaw transfer in 1999, an important and relatively inexpensive adjunct to the measured effect.
In summary, this rigorously designed and executed multicentre RCT has shown that a first cycle of IVF increases the likelihood of live birth by >20-fold in couples with persistent subfertility but Fallopian tube patency. Although six pairs of twins resulted from IVF and one pair from untreated observation, no triplets were conceived. Maternal and neonatal adverse events were relatively minor. The number needed to treat to achieve an additional live birth after IVF is only 4, significantly lower than for most alternative fertility treatments.
Acknowledgements |
---|
References |
---|
Donderwinkel PF, van der Vaart H, Wolters VM, Simons AH and Kroon G (2000) Treatment of patients with long-standing unexplained subfertility with in vitro fertilization. Fertil Steril 73, 334–337.
Evers JL, de Haas HW, Land JA, Dumoulin JC and Dunselman GA (1998) Treatment-independent pregnancy rate in patients with severe reproductive disorders. Hum Reprod 13, 1206–1209.
Haney AF, Hughes CL, Jr, Whitesides DB and Dodson WC (1987) Treatment-independent, treatment-associated, and pregnancies after additional therapy in a program of in vitro fertilization and embryo transfer. Fertil Steril 47, 634–638.
Hine LK, Laird N, Hewitt P and Chalmers TC (1989) Meta-analytic evidence against prophylactic use of lidocaine in acute myocardial infarction. Arch Intern Med 149, 2694–2698.
Hughes EG and Giacomini M (2001) Funding IVF treatment for persistent sub-fertility: the pain and the politics. Fertil Steril 76, 431–442.
Mantel N and Haenszel W (1959) Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst 22, 719–748.
Roh SI, Awadalla SG, Friedman CI and Park JM (1987) In vitro fertilization and embryo transfer: treatment-dependent versus -independent pregnancies. Fertil Steril 48, 982–986.
Society for ART (1999) Assisted reproductive technology in the United States: 1996 results generated from The American Society for Reproductive Medicine/Society for Assisted Reproductive Technology Registry. Fertil Steril 71, 798–807.
Wangensteen OH, Peter ET, Nicoloff DM, Walder AI, Sosin H and Bernstein EF (1962) Achieving physiological gastrectomy by gastric freezing. J Am Med Assoc 180, 439–444.
Submitted on January 14, 2004; accepted on February 18, 2004.