1 Department of Obstetrics and Gynaecology, University Hospital Nijmegen, PO Box 9101, NL-6500 HB Nijmegen, and 2 Department of Medical Affairs, University Hospital, Nijmegen, The Netherlands
Abstract |
---|
Key words: external validation/IVF/prediction/prognosis
Introduction |
---|
In the field of infertility, several authors have published models for the probability of pregnancy. Before any of these models can be implemented in clinical practice, sound external validation is required (Stolwijk et al., 1998). The predictive accuracy of a prognostic model can be expressed by calibration and discrimination (Harrell et al., 1996). Calibration refers to the amount of bias in the predictions, while discrimination refers to the ability to separate patients with different outcomes. Unfortunately, prognostic models for the probability of pregnancy after IVF presented in the literature have often not been validated (Hughes et al., 1989; Haan et al., 1991; Templeton et al., 1996). Prognostic models were externally validated in one study (Stolwijk et al., 1996); however, when tested in another centre, these models did not predict well. Templeton et al. (1996) developed a model to predict live birth after treatment with IVF using data from 26 389 women treated in all IVF centres in the UK. Although a model based on such a large population seems convincing, it might not predict well in other populations. To assess external validity, a model should be applied to data other than those on which it was based. To our knowledge, such an external validation of the 'Templeton model' has not been done before. We therefore started a retrospective study to validate the model and thereby estimate its clinical usability.
Materials and methods |
---|
The predicted probability (P) of achieving a live birth after IVF was calculated using the Templeton model:
$$P = \frac{1}{1 + e^{-y}}$$
where y was defined as

$$
\begin{aligned}
y = {} & -2.028 + [0.00551 \times (\text{age} - 16)^2] - [0.00028 \times (\text{age} - 16)^3] \\
& + [i - (0.0690 \times \text{no. of unsuccessful IVF attempts})] - (0.0711 \times \text{tubal subfertility}) \\
& + (0.7587 \times \text{live birth after IVF}) \\
& + (0.2986 \times \text{previous pregnancy after IVF which did not result in a live birth}) \\
& + (0.2277 \times \text{live birth which was not a result of IVF}) \\
& + (0.1117 \times \text{previous pregnancy, not after IVF and which did not result in a live birth})
\end{aligned}
$$

Tubal subfertility and previous pregnancies were dichotomized in the model: 1 if applicable, 0 if not. The indicator 'i' represented the duration of infertility in years and was 0.2163 if the duration was 1–3 years, 0.0839 if it was 4–6 years, −0.1036 if it was 7–12 years, and −0.4179 if it was ≥13 years.
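The calculation can be sketched in Python. This is a minimal illustration, not the authors' code: it assumes a standard logistic link, P = 1/(1 + e^(−y)), and it assumes negative signs on the intercept, the cubic age term, the unsuccessful-attempt and tubal terms, and the two longest duration bands. The function and parameter names are hypothetical, and durations are matched to bands by their upper bounds.

```python
import math


def duration_indicator(years):
    """Indicator 'i' for duration of infertility (bands matched by upper bound)."""
    if years <= 3:
        return 0.2163
    if years <= 6:
        return 0.0839
    if years <= 12:
        return -0.1036  # sign assumed negative
    return -0.4179      # sign assumed negative


def templeton_probability(age, duration_years, failed_attempts,
                          tubal=False,
                          ivf_live_birth=False,
                          ivf_pregnancy_no_birth=False,
                          other_live_birth=False,
                          other_pregnancy_no_birth=False):
    """Predicted probability of a live birth for one IVF cycle."""
    a = age - 16
    y = (-2.028
         + 0.00551 * a ** 2
         - 0.00028 * a ** 3
         + duration_indicator(duration_years)
         - 0.0690 * failed_attempts
         - 0.0711 * int(tubal)
         + 0.7587 * int(ivf_live_birth)
         + 0.2986 * int(ivf_pregnancy_no_birth)
         + 0.2277 * int(other_live_birth)
         + 0.1117 * int(other_pregnancy_no_birth))
    # Logistic link (assumed): convert the linear predictor to a probability.
    return 1.0 / (1.0 + math.exp(-y))


# Example: a 33-year-old woman, 3 years of infertility, first IVF cycle.
p = templeton_probability(age=33, duration_years=3, failed_attempts=0)
print(f"{p:.3f}")
```

Under these assumptions the example yields a probability of about 0.17, which is of the same order as the overall success rates reported for the study population.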
The Templeton model is based upon information from clinic forms which do not specify criteria for diagnosis (Craft and Forman, 1997). The variable 'diagnosis', as used in the model, is therefore the result of different work-ups and criteria. Furthermore, other variables were not specified at all. We therefore made a few assumptions to define the following variables in the model: (i) age: age of the woman at the specific IVF cycle; (ii) duration of infertility: duration of subfertility at the first IVF cycle; (iii) unsuccessful IVF attempts: the total number of previous IVF cycles in which no ongoing pregnancy was achieved (maximum 10); (iv) previous pregnancy not resulting in a live birth: spontaneous abortion or ectopic pregnancy; (v) tubal pathology: tubal pathology exclusively; (vi) predicted outcome: live birth. Because follow-up was incomplete, we assumed for our calculations that all ongoing pregnancies, defined as pregnancies that continued for at least 12 weeks after embryo transfer, resulted in live births.
We performed three external validations in which we intended to study the influence of different definitions by comparing the outcome of these three validations: (i) following the assumptions mentioned above; (ii) woman's age at the first IVF cycle (instead of at the specific IVF cycle); and (iii) tubal pathology exclusively or in combination with one or more other subfertility diagnoses (male factor, endometriosis, or cervical factor) (instead of tubal pathology exclusively).
We evaluated the predictive performance of the model in two ways. Firstly, we calculated the c index, which indicates the overall discriminative performance (Harrell et al., 1982, 1996). Secondly, we compared observed and predicted proportions of success for groups with a low probability (<5%, <10%) and a high probability (≥20%). We presented predicted proportions with mid-P exact 95% confidence intervals (CI) (Vollset, 1993). The c index, calculated as (number of concordant pairs + 0.5 × number of tied pairs)/total number of pairs, can be interpreted as the probability of a correct prediction for a random pair that comprises a woman with an ongoing pregnancy and a woman without an ongoing pregnancy. A c index of 0.5 indicates that the model does not discriminate at all; such predictions are comparable to the flip of a coin. A c index of 1 indicates the ability to make perfect predictions.
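The pair-counting definition of the c index can be illustrated with a short Python sketch. It is a direct, quadratic-time reading of the definition rather than the software used in the study; the function name and the toy data are hypothetical.

```python
def c_index(predicted, outcomes):
    """c index for a binary outcome.

    predicted: predicted probabilities, one per cycle
    outcomes:  1 (or True) for an ongoing pregnancy, 0 (or False) otherwise
    """
    with_event = [p for p, o in zip(predicted, outcomes) if o]
    without_event = [p for p, o in zip(predicted, outcomes) if not o]
    total_pairs = len(with_event) * len(without_event)
    if total_pairs == 0:
        raise ValueError("need at least one cycle with and one without the event")
    # A pair is concordant when the cycle with the event has the higher prediction.
    concordant = sum(1 for p in with_event for q in without_event if p > q)
    tied = sum(1 for p in with_event for q in without_event if p == q)
    return (concordant + 0.5 * tied) / total_pairs


# Toy data: four cycles, two of which resulted in an ongoing pregnancy.
print(c_index([0.30, 0.20, 0.20, 0.10], [1, 0, 1, 0]))  # 0.875
```

In the toy data, three of the four pairs are concordant and one is tied, giving (3 + 0.5)/4 = 0.875.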
Results |
---|
The mean age of the women at the beginning of treatment was 32.8 years (SD = 4.0; range 22–44; median 33 years) and the mean duration of infertility was 3.7 years (SD = 2.5; range 1–21; median 3 years). The mean number of previous unsuccessful IVF attempts was 0.8 (SD = 1.0; range 0–6; median 0 unsuccessful attempts). In the validation, 7% of the cycles were preceded by at least one previous live birth after IVF, 6% by at least one previous IVF pregnancy not resulting in a live birth, 13% by at least one live birth (excluding IVF births) and 22% by at least one pregnancy not resulting in a live birth (excluding IVF pregnancies). Of all cycles that were used in the validation, 47% were first cycles, 29% were second cycles, 16% were third cycles, 5% were fourth cycles and 2% were of a higher rank (range 5–8). The distribution of indications for treatment, also important to test Templeton's model, is shown in Table I.
Discussion |
---|
In the model, the relative importance of the presented factors can be deduced from the parameter estimates of the multiple logistic regression model. The 'duration of infertility' as well as 'previous pregnancies' play an important role (in the latter, all applicable variables are multiplied by the regression coefficient), whereas the influence of the woman's age is less obvious. We therefore calculated the relative effect of the woman's age. For this purpose we used the formula presented in the model:
$$[0.00551 \times (\text{age} - 16)^2] - [0.00028 \times (\text{age} - 16)^3]$$
Table III shows that 29 years is the most favourable age to achieve a live birth after IVF: the likelihood decreases rapidly as the patient becomes older, and the relative positive influence of a lower age diminishes in younger women. Although the parameters used by Templeton et al. all contribute to the predictive capacity of the model, age is still a very important predictor. We found no remarkable differences between the results of our original validation (A) and our second validation (B), suggesting that the definition of the woman's age has no significant influence. This was reflected in the virtually unchanged c index (from 0.629 to 0.632).
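The optimum age of 29 years can be checked directly from the age terms of the model. The short derivation below is a sketch that assumes the cubic term carries a negative sign; the symbols f and a are introduced here for illustration only.

$$
\begin{aligned}
f(a) &= 0.00551\,a^{2} - 0.00028\,a^{3}, \qquad a = \text{age} - 16 \\
f'(a) &= 2(0.00551)\,a - 3(0.00028)\,a^{2} = 0
\;\Rightarrow\; a = \frac{0.01102}{0.00084} \approx 13.1 \\
f''(13.1) &\approx 0.01102 - 0.00168 \times 13.1 \approx -0.011 < 0 \\
\text{age} &\approx 16 + 13.1 = 29.1 \text{ years}
\end{aligned}
$$

The negative second derivative confirms that the stationary point is a maximum, so the age contribution to y peaks at about 29 years and declines on either side.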
In our assumptions we chose to use ongoing pregnancy as our endpoint instead of live birth, because the follow-up of pregnancies was not accurate enough. Data from our own clinic show that from July 1991 to December 1997, 506 ongoing pregnancies resulted in 482 live births (95%). The predicted probabilities for an ongoing pregnancy will therefore overestimate the expected probabilities of a live birth. For the entire population, we observed that an ongoing pregnancy was achieved in 17.6% of the started cycles. The model by Templeton et al. predicted that a live birth would be achieved in 14.4% (95% CI = 13.1–15.7%) of the started cycles. Craft and Forman (1997) pointed out that Templeton reported an incidence of unexplained infertility of >30%, which they felt was very high, considering that patients were treated in tertiary fertility referral centres. Our data show a considerably lower percentage (20.8%) of unexplained infertility cases. Last but not least, the original study revealed large differences between the contributing centres, which could contribute to the poor reproducibility of the model.
The question arises whether a better model can be developed. Other promising predictive factors may increase the predictive value of a model, as pointed out by Craft and Forman (1997). For instance, basal FSH (Sharif et al., 1998) or day 3 oestradiol (Smotrich et al., 1995) values showed better predictive value than age alone. Inhibin is regarded as another promising predictor of the outcome of IVF (Seifer et al., 1997; Lindheim et al., 1998). Moreover, since new techniques and medication influence the results of assisted reproductive technologies, a prognostic model has a limited lifetime and needs constant adaptation.
In conclusion, the model presented by Templeton et al., based upon a data set of unrivalled size, seems to be able to discriminate between a group of women with a very low probability of achieving success after IVF and those with a very high probability. However, for the majority of women, the application of the Templeton model did not give any more certainty, because their prior and posterior probabilities hardly differed.
References |
---|
Haan, G., Bernardus, R.E., Hollanders, J.M.G. et al. (1991) Results of IVF from a prospective multicentre study. Hum. Reprod., 6, 805–810.
Harrell, F.E., Califf, R.M., Pryor, D.B. et al. (1982) Evaluating the yield of medical tests. J. Am. Med. Assoc., 247, 2543–2546.
Harrell, F.E., Lee, K.L. and Mark, D.B. (1996) Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat. Med., 15, 361–387.
Hughes, E.G., King, C. and Wood, E.C. (1989) A prospective study of prognostic factors in in vitro fertilization and embryo transfer. Fertil. Steril., 51, 838–844.
Lindheim, S.R., Chang, P.L., Vidali, A. et al. (1998) The utility of serum progesterone and inhibin A for monitoring natural-cycle IVF-ET. J. Assist. Reprod. Genet., 15, 538–541.
Seifer, D.B., Lambert-Messerlian, G., Hogan, J.W. et al. (1997) Day 3 serum inhibin-B is predictive of assisted reproductive technologies outcome. Fertil. Steril., 67, 110–114.
Sharif, K., Elgendy, M., Lashen, H. and Afnan, M. (1998) Age and basal follicle stimulating hormone as predictors of in vitro fertilisation outcome. Br. J. Obstet. Gynaecol., 105, 107–112.
Smotrich, D.B., Widra, E.A., Gindoff, P.R. et al. (1995) Prognostic value of day 3 estradiol on in vitro fertilization outcome. Fertil. Steril., 64, 1136–1140.
Stolwijk, A.M., Zielhuis, G.A., Hamilton, C.J.C.M. et al. (1996) Prognostic models for the probability of achieving an ongoing pregnancy after in vitro fertilization and the importance of testing their predictive value. Hum. Reprod., 11, 2298–2303.
Stolwijk, A.M., Straatman, H., Zielhuis, G.A. et al. (1998) The search for externally prognostic models for ongoing pregnancy after in vitro fertilization. Hum. Reprod., 13, 3542–3549.
Templeton, A., Morris, J.K. and Parslow, W. (1996) Factors that affect outcome of in-vitro fertilization treatment. Lancet, 348, 1402–1406.
Vollset, S.E. (1993) Confidence intervals for a binomial proportion. Stat. Med., 12, 809–824.
Submitted on June 21, 1999; accepted on January 11, 2000.