The Fertility Institute of New Orleans, New Orleans, LA and Section of Reproductive Endocrinology, Department of Obstetrics and Gynecology, Louisiana State University School of Medicine, New Orleans, LA, USA
1 To whom correspondence should be addressed at: 6020 Bullard Avenue, New Orleans, LA 70128, USA. e-mail: info{at}fertilityinstitute.com
In the review of statistical errors in the analysis and design of subfertility trials by Vail and Gardener (2003), and in comments regarding this paper, the authors of the review and both the editor (Barlow, 2003) and associate editor (Daya, 2003) of Human Reproduction recommend statistical practices that, whilst they may make analysis more reliable, may unintentionally reduce the clinical significance of any results. As a consequence, important clinical and indeed scientific questions will either not be answered, or be answered so simplistically as to be of questionable usefulness. The area of particular concern from a clinical viewpoint is the unit of analysis.
Both Barlow (2003) and Vail and Gardener (2003) emphasize that success should be reported as live births rather than as pregnancies. This is also the viewpoint of regulatory authorities in the UK and USA. However, Vail and Gardener (2003) further declare that UK, and by inference US, regulatory authorities publish inappropriate live birth rates per started cycle, per oocyte collection and per embryo transfer, instead of appropriate per person rates. Clinicians wonder how per person success rates of clinics can be compared when 26–66% of patients drop out after each cycle, most often for reasons unrelated to response to treatment (Land et al., 1997), when some clinics proceed directly to IVF without a trial of clomiphene or gonadotropin and intrauterine insemination, when clinics set different limits on the number of IVF cycles that they will perform for an individual patient and the number of embryos they will transfer, and when patients who are unsuccessful at one clinic may seek treatment at another. Per person pregnancy rates cannot be accurate unless all patients, except those who become pregnant, complete the same number of cycles, have had similar treatment before IVF and have the same number of embryos transferred. Since this is rarely possible, therapeutic efficacy trials need to be restricted to the first cycle and to subjects not previously treated with gonadotropins. Incidentally, most clinicians would welcome a reporting system in which they were rated on results from the first cycle of patients never treated previously, as well as on total performance.
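The effect of dropout on per person rates can be illustrated with a small calculation. The sketch below is only an illustration under stated assumptions: the function name `per_person_rate`, the per cycle live birth rate of 25% and the limit of three cycles are hypothetical, and the model assumes a constant success rate per cycle and a constant dropout fraction after each unsuccessful cycle. Only the two dropout fractions (26% and 66%) come from the range quoted above.

```python
def per_person_rate(per_cycle_rate, dropout_rate, max_cycles):
    """Expected fraction of starting patients with a live birth.

    Simplifying assumptions (hypothetical, for illustration only):
    the chance of success is the same in every cycle, and a fixed
    fraction of unsuccessful patients drops out after each cycle.
    """
    still_in_treatment = 1.0  # fraction of the original cohort starting a cycle
    successes = 0.0           # cumulative fraction with a live birth
    for _ in range(max_cycles):
        successes += still_in_treatment * per_cycle_rate
        # Only patients who failed AND did not drop out start the next cycle.
        still_in_treatment *= (1 - per_cycle_rate) * (1 - dropout_rate)
    return successes


# Two hypothetical clinics with the SAME per cycle rate (assumed 25%)
# but dropout fractions at the two ends of the quoted 26-66% range.
clinic_a = per_person_rate(per_cycle_rate=0.25, dropout_rate=0.26, max_cycles=3)
clinic_b = per_person_rate(per_cycle_rate=0.25, dropout_rate=0.66, max_cycles=3)

print(f"Clinic A per person rate: {clinic_a:.3f}")
print(f"Clinic B per person rate: {clinic_b:.3f}")
```

Under these assumptions the two clinics, which perform identically in every cycle, would report per person rates of roughly 0.47 and 0.33, a difference produced entirely by dropout.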
Daya (2003) goes further than the others when he states that a singleton live birth should be selected as the outcome of interest, and that selecting the number of oocytes retrieved or the implantation rate for analysis is methodologically incorrect and should be discontinued because the subject being randomized is the patient and not the oocytes or embryos. His statement that a singleton live birth should be selected as the outcome of interest results, no doubt, from the consensus recommendation of the ESHRE taskforce on risks and complications in assisted reproductive technology (ART), which declares that the outcome measure of ART and non-ART should be singleton live birth rate (Land and Evers, 2003). Whilst a singleton live birth may be the desired outcome of IVF, although many patients and clinicians in the USA would disagree, it is not the appropriate unit of analysis for comparing ovulation induction drugs.
Contrary to Daya's opinion, the number of oocytes retrieved and implantation rate, not live birth, are the appropriate units of analysis when comparing ovulation induction drugs, for reasons that are obvious to all clinicians. Even these outcomes are dependent on decisions, often arbitrary, regarding how long drugs are administered, whether the dose of drug is increased or decreased during the treatment cycle, and when HCG is administered in both IVF and non-IVF cycles. In IVF cycles, pregnancy outcome, including whether there will be a singleton or multiple birth, is additionally dependent on the number of quality embryos transferred. After ovulation in non-ART cycles, or embryo transfer in ART cycles, additional confounding factors affect the live birth rate. The probability that an embryo will implant depends on the quality of the endometrium, which in turn is dependent on estrogen and progesterone produced in response to ovulation induction. Excessive progesterone levels and possibly excessive estrogen levels at the time of HCG administration are associated with lower pregnancy rates and early abortion (Fanchin et al., 1997). The implantation rate is thus an indicator of both endometrial quality and embryo quality in response to ovulation induction drugs. A truism, understood by all clinicians, is that the further in time from infertility treatment that a pregnancy occurs, the less likely it is to be the result of treatment. Therefore, estrogen levels, number of preovulatory follicles, number of oocytes and number of embryos, the earlier indicators of response, not live birth, which is influenced by multiple confounding factors, are the appropriate units of analysis when comparing ovulation induction drugs.
Lastly, Barlow (2003), speaking from the viewpoint of chief editor, writes of his need to reconcile the desirability of requiring live birth as the unit of analysis with the desires of young researchers and sponsors to have their studies published as rapidly as possible. Clinicians in the UK and USA, who are required to report to regulatory authorities annually the outcome of thousands of cycles of IVF as live births, may well wonder why researchers cannot report the outcome of smaller numbers of study subjects in the same manner, and whether they are being held to a higher standard than are researchers.
All information that can be obtained from a properly designed and controlled study is important, and should be analysed and reported. Multiple units of analysis should be presented when possible, whether or not they are statistically significant. Clinical as well as statistical knowledge is necessary when designing and interpreting fertility studies.
References
Daya, S. (2003) Pitfalls in the design and analysis of efficacy trials in subfertility. Hum. Reprod., 18, 1005–1009.
Fanchin, R., Righini, C., Olivennes, F., Ferreira, A.L., de Ziegler, D. and Frydman, R. (1997) Consequences of premature progesterone elevation on the outcome of in-vitro fertilization: insights into a controversy. Fertil. Steril., 68, 799–805.
Land, J.A. and Evers, J.L.H. (2003) Risks and complications in assisted reproduction techniques: Report of an ESHRE consensus meeting. Hum. Reprod., 18, 455–457.
Land, J.A., Courtar, D.A. and Evers, J.L.H. (1997) Patient dropout in an assisted reproductive technology program: implications for pregnancy rates. Fertil. Steril., 68, 278–281.
Vail, A. and Gardener, E. (2003) Common statistical errors in the design and analysis of subfertility trials. Hum. Reprod., 18, 1000–1004.