Accuracy of ovarian reserve tests

Bülent Gülekli1,2,4, Yesim Bulbul1, Ata Onvural1,2, Kutsal Yorukoglu3, Cemal Posaci1,2, Namik Demir1 and Oktay Erten1

1 Obstetrics and Gynaecology, 2 Reproductive Endocrinology and 3 Pathology, Dokuz Eylul University School of Medicine, Izmir, Turkey


    Abstract
 
Several tests predict ovarian reserve in women undergoing assisted reproductive technologies. However, the accuracy of these tests in assessing the number of the remaining follicles within the ovary (ovarian reserve) has not been previously validated. The aim of this study was to assess the accuracy of ovarian reserve tests, namely basal and clomiphene-stimulated follicle stimulating hormone (FSH) concentrations and gonadotrophin-releasing hormone (GnRH) agonist stimulation test in predicting the number of the follicles within the ovaries. The ovaries of 22 parous women over 35 years of age who underwent oophorectomy were examined histologically for follicle number. Early follicular phase serum FSH, clomiphene citrate challenge tests (CCCT) and GnRH agonist stimulation test (GAST) were performed in the menstrual cycle prior to the surgery. The predictive value of these tests was then assessed. A positive correlation was detected between basal serum oestradiol concentrations and follicles per unit tissue but no significant correlation was detected between basal and clomiphene-stimulated FSH and follicles per unit tissue. The receiver operator characteristic curves indicated that the clomiphene citrate challenge test was the most accurate of the three tests assessed. In conclusion, none of the tests in this study accurately reflects ovarian reserve.

Key words: ageing/ovarian reserve tests/ovary


    Introduction
 
The current trend towards delayed childbearing has decreased the reproductive potential of women. This diminished fertility has been expressed as the decreased number and quality of oocytes, changes in fertilizability and implantation, and increased risk of embryonic chromosomal abnormalities. Female fecundity is related to the total number of primordial follicles remaining within the ovaries, referred to as ovarian reserve, and this declines with age (Faddy and Gosden, 1996). The limited predictive value of age alone in estimating fecundity rates and response to the exogenous stimulation led to evaluation of other parameters.

Basal serum follicle stimulating hormone (FSH) concentrations (Scott et al., 1989; Toner et al., 1991; Farhi et al., 1997), clomiphene citrate challenge tests (CCCT) (Navot et al., 1987; Loumaye et al., 1990; Scott et al., 1993), gonadotrophin-releasing hormone agonist stimulation tests (GAST) (Winslow et al., 1991), exogenous follicle stimulating hormone ovarian reserve test (EFORT) (Fanchin et al., 1994), elevated day 10 serum progesterone levels (Hofmann et al., 1995) and inhibin B (Hall et al., 1999) have all been performed in order to assess response to gonadotrophins and chances of conception. Unfortunately, all the above tests are indirect measurements of ovarian reserve. Previous studies have assessed the role of these tests mainly in predicting ovarian follicular or oocyte response to gonadotrophins or chances of conception. To the best of our knowledge, the accuracy of these tests as a measure of the number of remaining primordial follicles within the ovaries has not been previously validated.

Attempts have been made to use follicular density in ovarian biopsy specimens as a direct means of assessing ovarian reserve (Lass et al., 1997). Although this may be a novel clinical tool for the evaluation of ovarian reserve, the limitation of this test is that follicular density within the biopsy specimen may not accurately represent the density of follicles within the whole ovary. The purpose of this study, therefore, was to examine the accuracy of the ovarian reserve tests by comparing the ability of basal serum FSH, CCCT, and GAST to predict the number of follicles within the ovaries as assessed by histology, in women over 35 years of age who underwent oophorectomy.


    Materials and methods
 
This study was performed in the Obstetrics and Gynaecology Department of Dokuz Eylul University School of Medicine, Izmir, Turkey, between February 1996 and May 1997. The study was approved by the Human Investigation Review Committee of the Department, and informed written consent was obtained from all the participants. Twenty-four pre-menopausal parous women over 35 years of age, with regular menses, and about to undergo laparotomy with unilateral or bilateral salpingo-oophorectomy for uterine pathology were enrolled into the study. None of the women had ultrasound evidence of polycystic ovaries or any other ovarian pathology. One patient postponed her operation after all ovarian reserve tests had been performed, so she was excluded from the study. One other patient (patient no. 20 in Table I) was excluded because her basal FSH level was 48 mIU/ml, which is in the post-menopausal range. All other women had a basal FSH <15 mIU/ml; thus 22 patients comprised our study group.


Table I. Hormone concentrations before and after the tests and total number of the follicles and follicles per cubic centimetre
 
Blood samples for basal FSH, luteinizing hormone (LH) and oestradiol were taken from the antecubital vein on the second day of the menstrual cycle. Immediately afterwards, 1 mg of buserelin acetate was injected s.c. for the purpose of GAST. Twenty-four hours after the buserelin injection, blood samples for FSH, LH, and oestradiol were taken. After a 3 day washout period for buserelin, clomiphene citrate (100 mg orally) was administered on days 5–9, and serum was obtained again on day 10 for FSH, LH, and oestradiol. Following centrifugation, sera were stored at –20°C until hormonal assessment. The operations were performed in the subsequent cycle after all tests had been completed.

All three tests were performed in the same cycle but we were confident that there was no carry-over effect, especially between gonadotrophin-releasing hormone (GnRH) agonist and clomiphene. The response to a single s.c. administration of GnRH agonist (buserelin) in normal women has been studied (Lemay et al., 1983). It was found that, 24 h after a single 1 mg injection of buserelin, serum concentrations of both FSH and LH had returned to pre-treatment control values. Serum oestradiol concentrations rose progressively and reached their maximum (2.5-fold) at 11 h, then began to decrease steadily, but remained above pre-treatment control values (0.5-fold) at 24 h. In the present study there was an interval of at least 8 days between the two tests (GAST and CCCT).

FSH and LH were measured by a double antibody assay (Coat-A-Count IRMA; DPC Diagnostics, Los Angeles, CA, USA). The intra- and interassay coefficients of variation (CV) for the FSH assay (for ≈24 mIU/ml) were 2.4 and 4.8% respectively. The intra- and interassay CV for the LH assay (for ≈23.8 mIU/ml) were 1 and 3.4% respectively. 17β-oestradiol concentration was measured with commercially available radioimmunoassay kits (Coat-A-Count; DPC Diagnostics). The intra- and interassay CV for the oestradiol assay (for ≈50 pg/ml) were 7 and 8.1% respectively.

Basal serum FSH concentration was considered abnormal if it exceeded 10 mIU/ml. This threshold was based on the upper limit of normal (95th centile) in our laboratory. Serum FSH >12 mIU/ml on day 10 was considered abnormal for CCCT (Tanbo et al., 1992). GnRH agonist stimulation test was considered normal for ovarian reserve if the serum oestradiol value on day 3 was double that of the baseline value (on day 2).
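The three decision rules above can be summarized in a short sketch. This is only an illustration of the stated thresholds: the class and function names are hypothetical, the example values are invented, and reading "double that of the baseline value" as "at least double" is an assumption.

```python
from dataclasses import dataclass

BASAL_FSH_LIMIT = 10.0  # mIU/ml; basal FSH above this is abnormal
CCCT_FSH_LIMIT = 12.0   # mIU/ml; day 10 FSH above this is abnormal
GAST_FOLD_RISE = 2.0    # assumed: day 3 oestradiol at least double the day 2 value


@dataclass
class HormoneProfile:
    """Hypothetical container for one patient's measurements."""
    basal_fsh: float      # day 2 FSH, mIU/ml
    day10_fsh: float      # FSH after clomiphene (day 10), mIU/ml
    basal_e2: float       # day 2 oestradiol, pg/ml
    post_gnrha_e2: float  # day 3 oestradiol (24 h after buserelin), pg/ml


def classify(profile: HormoneProfile) -> dict:
    """Return True for each test that flags ovarian reserve as abnormal."""
    return {
        "basal_fsh_abnormal": profile.basal_fsh > BASAL_FSH_LIMIT,
        "ccct_abnormal": profile.day10_fsh > CCCT_FSH_LIMIT,
        "gast_abnormal": profile.post_gnrha_e2 < GAST_FOLD_RISE * profile.basal_e2,
    }


# Invented example values, not taken from Table I.
example = classify(HormoneProfile(basal_fsh=8.0, day10_fsh=14.0,
                                  basal_e2=40.0, post_gnrha_e2=90.0))
```

In this invented example only the CCCT flags an abnormal result, which illustrates how the three tests can disagree for the same patient.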

Of the 22 patients, 20 had operations for uterine myomata, one patient had uterine prolapse with urinary stress incontinence and the other had a stage 1 cervical carcinoma. Twenty of the patients had total abdominal hysterectomy (TAH) and two patients had myomectomy. In all women >39 years, at the time of the study it was routine to offer oophorectomy as prophylaxis for ovarian carcinoma. In the younger women, oophorectomy was offered as treatment for premenstrual syndrome or pelvic pain.

All the specimens were evaluated by the same pathologist (K.Y.). To assess the total number of follicles within the ovaries, all the ovaries were fixed for 24 h with buffered formaldehyde and cut into slices every 3 mm with a fractionator. All the tissues were processed in a routine manner and paraffin-embedded. From each 3 mm slice, a 5 µm section was cut and stained with haematoxylin–eosin (Gundersen et al., 1988). The sections were evaluated by light microscopy. The volume of each ovary (V(ov)) was calculated by the Cavalieri method (Gundersen et al., 1988), by multiplying the known distance between parallel sections, t (3 mm), the area associated with one point in the grid, a (4 mm²), and the sum of the number of points hitting the sections of the ovary (ΣP).
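As a minimal sketch of the Cavalieri estimate described above, V(ov) = t × a × ΣP, the following function uses the section spacing and grid-point area given in the text. The function name and the point counts in the example are hypothetical.

```python
def cavalieri_volume_cm3(points_per_section, t_mm=3.0, a_mm2=4.0):
    """Estimate ovarian volume as t * a * sum(P), converted to cubic centimetres.

    points_per_section: number of grid points hitting the ovary on each section.
    t_mm: distance between parallel sections (3 mm in the text).
    a_mm2: area associated with one grid point (4 mm^2 in the text).
    """
    total_points = sum(points_per_section)
    volume_mm3 = t_mm * a_mm2 * total_points
    return volume_mm3 / 1000.0  # 1 cm^3 = 1000 mm^3


# Invented point counts for four sections: 85 points in total, giving 1.02 cm^3.
volume = cavalieri_volume_cm3([10, 25, 30, 20])
```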

The sections were then evaluated for the number of follicles (primordial, primary, secondary, and Graafian follicles). Atretic follicles were not counted. The total follicle number (N(ov)) and follicle number per cubic centimetre (N(ov)/cc) of the tissue were calculated (Gundersen et al., 1988) using the formulae:


where Σo and Σv represent the counted total follicle number and the calculated total section volume, respectively.
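The formulae themselves are not preserved in this copy of the text. One reading that is consistent with the symbol definitions given here, offered as an assumption rather than as the authors' exact expressions, is that the follicle density is the counted follicles divided by the sampled section volume, and that the total is this density scaled by the Cavalieri volume of the ovary:

```latex
\[
  N(\mathrm{ov})/\mathrm{cc} = \frac{\Sigma o}{\Sigma v},
  \qquad
  N(\mathrm{ov}) = \frac{\Sigma o}{\Sigma v} \times V(\mathrm{ov})
\]
```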

At the beginning of the calculations, after the first 10 counts, the variance of Σ area, the total variance of ΣP, the nugget effect (i.e. the independent variance of each estimate) and Nug % were assessed, and Nug % was kept at 22.66 for point counting. The total variance of ΣP representing the error ratio was 7.1%. Uni- or bilateral ovarian follicle numbers and ovarian volume as well as oocyte numbers per cubic centimetre were calculated and the number of follicles per unit tissue was determined.

Statistical Package for Social Sciences (SPSS, Corp., Chicago, IL, USA) was used for statistical analyses. The paired and unpaired t-test and simple linear regression analyses were used where appropriate. The cut-off value for the number of follicles per unit tissue was taken as the mean minus 2 SEM. The sensitivity and specificity of each test were calculated and receiver operator characteristic (ROC) curve analysis was performed using this cut-off value. Values are given as mean ± SEM.
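The cut-off rule (mean minus 2 SEM) can be illustrated with a small sketch. The function name and the input values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) to obtain the SEM is an assumption, since the text does not state it.

```python
import math
import statistics


def mean_minus_2sem(values):
    """Return mean - 2 * SEM, where SEM = sample SD / sqrt(n)."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean - 2 * sem


# Invented follicle densities (arbitrary units), not the study data.
cutoff = mean_minus_2sem([10, 20, 30, 40])
```

Patients whose follicle density falls below such a cut-off would be treated as having an abnormal reserve for the purposes of the sensitivity, specificity and ROC calculations.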


    Results
 
A total of 22 patients with a mean age of 39.2 ± 0.3 (range 36–42) years comprised our study group. The results of all the tests for each patient are presented in Table I.

Table II shows the summary of results for each test. Ovarian reserve was considered as abnormal in a varying number of patients depending on the type of test performed. However, if the cut-off value for number of follicles was taken as 20 000/cc based on the mean – 2 SEM, the ovarian reserve was abnormal in 11 patients. For this value, the sensitivity was 45.4% for CCCT whereas the specificity was 90.9%. Sensitivity and specificity for basal FSH were 36.3 and 81.8%, and for GAST were 72.7 and 45.4% respectively.
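The reported percentages can be reproduced from 2 × 2 counts. The counts below are not given in the text: they are inferred from the percentages on the assumption that 11 patients fell below the histological cut-off and 11 did not, so they should be read as an illustration only.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Inferred counts (assumption): 11 patients with low follicle density, 11 without.
inferred_counts = {
    "CCCT": dict(tp=5, fn=6, tn=10, fp=1),
    "basal FSH": dict(tp=4, fn=7, tn=9, fp=2),
    "GAST": dict(tp=8, fn=3, tn=5, fp=6),
}

results = {name: sensitivity_specificity(**c) for name, c in inferred_counts.items()}
for name, (sens, spec) in results.items():
    print(f"{name}: sensitivity {100 * sens:.1f}%, specificity {100 * spec:.1f}%")
```

Under these inferred counts the number of abnormal results per test (tp + fp) is six for CCCT, six for basal FSH and 14 for GAST, which agrees with the proportions quoted in the Discussion. The printed values are rounded, whereas the percentages in the text appear to be truncated to one decimal place.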


Table II. Hormonal values and number of the follicles per cubic centimetre according to the tests
 
The receiver operator characteristic curves and area under the curve indicated that the CCCT was the most accurate of the three tests assessed (Table III, Figure 1).
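For readers unfamiliar with the area under the ROC curve, the sketch below computes it as the probability that a randomly chosen patient with low follicle density has a higher test value than a randomly chosen patient with normal density (ties counted as one half). This pairwise formulation is a standard equivalent of the area under the empirical ROC curve, not necessarily the routine used by the statistics package in the study; the function name and example values are hypothetical.

```python
def roc_auc(values_abnormal, values_normal):
    """Area under the ROC curve via pairwise comparison.

    Assumes that higher test values indicate abnormal ovarian reserve
    (as for FSH); a value of 0.5 corresponds to chance performance.
    """
    wins = 0.0
    for a in values_abnormal:
        for n in values_normal:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(values_abnormal) * len(values_normal))


# Invented day 10 FSH values (mIU/ml), not the study data.
auc = roc_auc([12, 9, 15], [8, 9, 7])
```

An area close to 0.5, such as the 0.48 reported for GAST in the Discussion, means the test ranks patients no better than chance.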


Table III. Area under the curve for each test according to ROC analysis
 


Figure 1. The receiver operator characteristic curves for basal follicle stimulating hormone (FSH), gonadotrophin-releasing hormone agonist stimulation test (GAST), and clomiphene citrate challenge test (CCCT).

 
The number of follicles per unit tissue showed a negative correlation with basal FSH (r = –0.39) and with FSH after CCCT (r = –0.38), while there was a positive correlation between oestradiol after GAST and follicles per unit tissue (r = 0.12). However, none of the correlations reached statistical significance (Figure 2). A negative correlation was also noted between age and follicles per unit tissue (r = –0.21, data not shown).
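The lack of significance follows from the sample size. As a rough check, assuming Pearson correlation coefficients computed on all 22 patients (neither of which is stated explicitly), the t statistic for each r can be compared with the two-sided 5% critical value for 20 degrees of freedom, which is 2.086. The function name is hypothetical.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


N_PATIENTS = 22      # assumed: all study patients contributed to each correlation
T_CRITICAL = 2.086   # two-sided 5% critical value of Student's t for 20 df

reported_r = {"basal FSH": -0.39, "FSH after CCCT": -0.38, "oestradiol after GAST": 0.12}
t_values = {name: correlation_t_statistic(r, N_PATIENTS) for name, r in reported_r.items()}
significant = {name: abs(t) > T_CRITICAL for name, t in t_values.items()}
```

With 22 patients, a correlation would need an absolute value of roughly 0.42 or more to reach significance at the 5% level, so the reported coefficients fall short.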





Figure 2. Correlation between follicles/cc with (a) GAST, (b) Basal FSH, (c) CCCT.

 

    Discussion
 
The clinical use of gonadotrophins to induce follicular development is associated with a wide variation in ovarian response. Indeed, this ovarian responsiveness to stimulation plays a major role in assisted reproductive technologies, which are complex and costly. On the other hand, poor response to ovarian stimulation may alert clinicians to a reduction in the woman's fecundity. In most cases decreased ovarian function seems to be progressive. Another difficult area for clinicians is to provide infertile couples with accurate information about their chances of pregnancy. Although several factors influence the outcome of ovarian stimulation, ovarian response to stimulation is a major one in relation to IVF success. Each factor that influences ovarian response has its limitations, so ovarian reserve tests have been designed accordingly. Nevertheless, it remains unclear which of the currently available screening tests is the best for predicting diminished ovarian reserve.

Traditionally, as a woman's age increases, her oestradiol response to ovulation induction and the number of oocytes retrieved decrease. In fact her functional ovarian reserve plays the major role in this impaired response. This decrease in ovarian function has been attributed to an absolute reduction in the number of the follicles available for stimulation. To the best of our knowledge this is the first study that directly compares ovarian reserve tests to the number of the follicles within the ovaries.

The chronological age of a woman is related to her chance of conceiving and the decline in fertility after the age of 35 years has been well documented, even in the assisted reproductive technologies (Tan et al., 1992). For this reason our study population was restricted to women over 35 years of age.

We are aware that a screening test should have a high sensitivity, even at the cost of a lower specificity, in order not to underestimate the number of affected subjects. On the other hand, a test can only be diagnostic for a certain disease if it has a high specificity, although raising the cut-off value to achieve this allows some affected individuals to be missed. It is therefore essential to construct receiver operator characteristic curves when comparing the accuracy of screening tests.

In the present study the basal FSH concentrations were abnormal in 27% (six out of 22) of our patients and did not accurately reflect the actual number of the follicles per unit tissue. The sensitivity of basal FSH was 36.3% and specificity 81.8%. In other words, basal FSH has poor predictive value for estimating the follicles per unit tissue. However, the number of our patients who had basal FSH values <10 mIU/ml (i.e. normal) was greater than the number classified as abnormal, and this may have had an adverse effect on our results. Nevertheless, of the tests performed, basal serum FSH was the second best in providing accurate prognostic information according to ROC analysis (Figure 1).

The CCCT was first used by Navot et al. to assess the ovarian reserve (Navot et al., 1987) and this test appears to be more sensitive than basal FSH (Loumaye et al., 1990; Tanbo et al., 1992; Loumaye, 1995). In our study, CCCT indicated a normal ovarian reserve in 73% (16 out of 22) of our patients. The mean number of follicles per unit tissue was significantly higher in patients with normal CCCT compared to those with an abnormal result. According to the ROC analysis, the CCCT was the most accurate test (area under the curve = 0.73, Table III).

The prognostic value of GAST has been documented by many authors in different protocols (Muasher et al., 1988; Padilla et al., 1990; Winslow et al., 1991). Eight of our patients (36%) had normal ovarian reserve according to GAST. However, there was no statistically significant difference in the mean number of follicles per unit tissue between the groups with normal and abnormal GAST. Furthermore, GAST had the smallest area under the ROC curve (0.48, Table III) and was therefore less accurate than the other two tests. In addition, we believe that the GAST is more invasive and expensive, and needs further standardization.

Some studies (Lass et al., 1997; Tomas et al., 1997; Chang et al., 1998) have assessed the use of ultrasound in evaluating ovarian reserve. Although ultrasound was performed in this study, ultrasonographic parameters were not an intended outcome measure.

We are aware of the limitations of the present study. First, the difficulty in recruiting patients who need an oophorectomy after 35 years of age and before they reach 50 years limited the size of the study population. Secondly, since all of our patients were of proven fertility, the other possible factors that can affect the ovarian reserve in an infertile population cannot be eliminated in our study group. Finally, the number of abnormal tests was small because of the limited size of the study group.

In conclusion, it seems unlikely that any of the above tests is sensitive enough to assess ovarian reserve accurately. Nevertheless, according to ROC analysis, CCCT is more predictive of ovarian reserve than basal serum FSH and GAST.


    Acknowledgments
 
The authors would like to thank Mr Alp Ergör of the Public Health Department at the Dokuz Eylül University, Izmir, Turkey for his assistance with the statistics. We sincerely appreciate Professor Roy Homburg of the Tel Aviv University, Israel for critical review of the manuscript and correcting the English text.


    Notes
 
4 To whom correspondence should be addressed at: McGill Reproductive Center, Department of Obstetrics and Gynecology, Royal Victoria Hospital, 687 Pine Avenue West, # F6.58, Montreal, Quebec, Canada H3A 1A1


    References
 
Chang, M.Y., Chiang, C.H., Hsieh, T.T. et al. (1998) Use of antral follicle count to predict the outcome of assisted reproductive technologies. Fertil. Steril., 69, 505–510.

Faddy, M.J. and Gosden, R.G. (1996) A model confirming the decline in the follicle numbers to the age of menopause in women. Hum. Reprod., 11, 1484–1486.

Fanchin, R., de Ziegler, D., Olivennes, F. et al. (1994) Exogenous follicle stimulating hormone ovarian reserve test (EFORT): a simple and reliable screening test for detecting `poor responders' in in-vitro fertilization. Hum. Reprod., 9, 1607–1611.

Farhi, J., Homburg, R., Ferber, A. et al. (1997) Non-response to ovarian stimulation in normogonadotrophic, normogonadal women: a clinical sign of impending onset of ovarian failure pre-empting the rise in basal follicle stimulating hormone levels. Hum. Reprod., 12, 241–243.

Gundersen, H.J., Bendtsen, T.F., Korbo, L. et al. (1988) Some new, simple and efficient stereological methods and their use in pathological research and diagnosis. APMIS, 96, 379–394.

Hall, J.E., Welt, C.K. and Cramer, D.W. (1999) Inhibin A and inhibin B reflect ovarian function in assisted reproduction but are less useful at predicting outcome. Hum. Reprod., 14, 409–415.

Hofmann, G.E., Thie, J., Scott, R.T. et al. (1995) Evaluation of the reproductive performance of women with elevated day 10 progesterone levels during ovarian reserve screening. Fertil. Steril., 3, 979–983.

Lass, A., Silye, R., Abrams, D.-C. et al. (1997) Follicular density in ovarian biopsy of infertile women: a novel method to assess ovarian reserve. Hum. Reprod., 12, 1028–1031.

Lemay, A., Metha, A.E., Tolis, G. et al. (1983) Gonadotropins and estradiol responses to single intranasal or subcutaneous administration of a luteinizing hormone-releasing hormone agonist in the early follicular phase. Fertil. Steril., 39, 668–673.

Loumaye, E. (1995) Ovarian challenge tests. In Hedon, B., Bringer, J., and Mares, P. (eds). Fertility and Sterility: A Current Overview. The Parthenon Publishing Group, New York, London, pp. 373–377.

Loumaye, E., Psialti, I., Billion, J.M. et al. (1990) Prediction of individual response to controlled ovarian hyperstimulation by means of a clomiphene citrate challenge test. Fertil. Steril., 53, 295–301.

Muasher, S.J., Ellis, L.M., Oehninger, S. et al. (1988) The value of basal and/or stimulated serum gonadotropin levels in prediction of stimulated response and in vitro fertilization outcome. Fertil. Steril., 50, 298–307.

Navot, D., Rosenwaks, Z. and Margalioth, E.J. (1987) Prognostic assessment of female fecundity. Lancet, ii, 645–647.

Padilla, S.L., Bayati, J. and Garcia, J.E. (1990) Prognostic value of early serum estradiol response to leuprolide acetate in in vitro fertilization. Fertil. Steril., 53, 288–294.

Scott, R.T., Oehninger, S., Toner, J.P. et al. (1989) Follicle-stimulating hormone levels on cycle day 3 are predictive of in vitro fertilization outcome. Fertil. Steril., 51, 651–654.

Scott, R.T., Leonardi, M.R., Hofmann, G.E. et al. (1993) A prospective evaluation of clomiphene citrate challenge test screening of the general infertility population. Obstet. Gynecol., 82, 539–544.

Tan, S.L., Royston, P., Campell, S. et al. (1992) Cumulative conception and livebirth rates after in-vitro fertilisation. Lancet, 339, 1390–1394.

Tanbo, T., Norman, N., Dale, P.O. et al. (1992) Prediction of response to controlled ovarian hyperstimulation: a comparison of basal and clomiphene citrate-stimulated follicle-stimulating hormone levels. Fertil. Steril., 57, 819–824.

Tomas, C., Nuojua-Huttunen, S. and Martikainen, H. (1997) Pretreatment transvaginal ultrasound examination predicts ovarian responsiveness to gonadotrophins in in-vitro fertilization. Hum. Reprod., 12, 220–223.

Toner, J.P., Philput, C.B., Jones, G.S. et al. (1991) Basal follicle-stimulating hormone level is a better predictor of in vitro fertilization performance than age. Fertil. Steril., 55, 784–791.

Winslow, K.L., Oehninger, S.C., Toner, J.P. et al. (1991) The gonadotropin-releasing hormone agonist stimulation test—a sensitive predictor of performance in the flare-up in vitro fertilization cycle. Fertil. Steril., 56, 711–717.

Submitted on March 29, 1999; accepted on August 16, 1999.