1 Department of Reproductive Medicine and Gynaecological Endocrinology, Staedtische Kliniken Duesseldorf gGmbH, Frauenklinik Benrath and Institute of Natural Family Planning, 2 Biometric Research Group, Clinic for Thoracic and Cardiovascular Surgery and 3 Department of Gynaecological Endocrinology, University of Heidelberg and 4 Stiftung Warentest, Berlin, Germany
5 To whom correspondence should be addressed at: Frauenklinik Städt. Krankenhaus Düsseldorf-Benrath, Urdenbacher Allee 83, 40593 Düsseldorf, Germany. e-mail: freundlg{at}uni-duesseldorf.de
Abstract |
---|
Key words: cycle monitor/hormonal computer/mini-microscopes/natural family planning/temperature computers
Introduction |
---|
Materials and methods |
---|
Hormonal assays
The LH surge in urine was determined semi-quantitatively using the commercially available monoclonal antibody test ClearPlan. Participants collected urine samples daily, beginning at the same time as ultrasound tracking, in collecting tubes prepared with thiomersal as a preservative. The samples were stored in a refrigerator until all of them were analysed together in the laboratory. The day with the most intense colour signal was called the LH peak day.
Use of the various devices
The devices were used according to the manufacturers' instructions. Each day of the cycle was classified as fertile or infertile as predicted by the device.
Definitions
As suggested by the World Health Organization (1980), the probable fertile window is defined as day −5 to day +2 inclusive, relative to the LH peak (day 0), which roughly corresponds to the day on which the dominant follicle nearly reaches its maximum diameter and is 1 day prior to ovulation (24–28 h). This definition of the complete and objective fertile window assumes a maximum of 6 days of sperm survival and 24 h of oocyte survival. Time of ovulation is defined as being between maximum follicular diameter plus 12 h and LH peak + 24 h (see Table IV).
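The window definition above can be sketched in a few lines of Python. This is only an illustration: the function name `fertile_window` and the example LH peak on cycle day 14 are assumptions, not values from the study.

```python
def fertile_window(lh_peak_day, start_offset=-5, end_offset=2):
    """Return the cycle days of the probable fertile window.

    lh_peak_day is the cycle day of the LH peak (day 0 in the text).
    The default offsets follow the day -5 to day +2 definition above.
    """
    return [lh_peak_day + offset for offset in range(start_offset, end_offset + 1)]


# Hypothetical cycle with the LH peak on cycle day 14.
window = fertile_window(14)
print(window)       # [9, 10, 11, 12, 13, 14, 15, 16]
print(len(window))  # 8
```

Under this definition the window always spans eight cycle days, whatever the cycle day of the LH peak.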
$$P = k\left[1 - (1 - P_{-5})^{x_{-5}} (1 - P_{-4})^{x_{-4}} \cdots (1 - P_{0})^{x_{0}} (1 - P_{1})^{x_{1}} (1 - P_{2})^{x_{2}}\right]$$
as the worst case probability. In this formula, the index $i$ runs over $-5, -4, \ldots, 2$, and $x_i$ takes the value 1 if day $i$ is indicated as infertile by the cycle tester (assuming the woman has intercourse on such a false infertile day) and the value 0 if this day is predicted correctly as fertile (and she had no intercourse on that day). Using the ideas outlined by Colombo and Masarotto (2000), two quantities are needed to calculate the worst case probability, $P$, of becoming pregnant for every woman: the day-specific conditional conception probabilities $P_{-5}, \ldots, P_{2}$ (under the condition that the oocyte is fertilized) of each day of the menstrual cycle in the fertile window, and the cycle viability $k$, which is roughly the maximum probability of pregnancy in any given cycle even if intercourse occurs on every fertile day. For details of the model, the reader is referred to Colombo (1989) and Colombo and Masarotto (2000). They give estimates of the day-specific conception probabilities $p_i = k P_i$ (with $p_0$ denoting the conception probability of the day of ovulation).
Higher values of $P$ indicate a higher risk of conception for those women who want to use such a monitor to avoid pregnancy. In this sense, the worst case probability $P$ is also an indicator of the quality of a monitor, with a smaller $P$ indicating better contraceptive quality. We used this formula to obtain estimates of maximum pregnancy rates for every woman and device, together with the specific values for $k$ and $P_{-5}, \ldots, P_{2}$ from Table 10 of Colombo and Masarotto (2000). These values are shown in Table IV. For example, if in the test cycle days −5, −4, −3 and +2 of the fertile window (clinically detected by ultrasound scans and urinary LH) had not been detected as fertile by the device, and days −2 to +1 had been detected correctly, then

$$P = k\left[1 - (1 - P_{-5})(1 - P_{-4})(1 - P_{-3})(1 - P_{2})\right] = 0.277\left[1 - (1 - 0.2455)(1 - 0.6354)(1 - 0.8556)(1 - 0.1264)\right] = 0.2674$$

was calculated as the individual risk of an unintended pregnancy, which is the maximum estimated pregnancy rate in this cycle for the user of this device and is very close to the cycle viability factor $k = 0.277$. This is a worst case analysis since we assume intercourse occurred on each of the false infertile days, thus maximizing the individual risk of each participant.
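The worked example can be checked with a short Python sketch. It is not the authors' Mathematica notebook. The function name and the dictionary layout are assumptions, and only the four conditional probabilities quoted in the example are included, because the remaining Table IV values are not reproduced in the text.

```python
from math import prod

K = 0.277  # cycle viability k used in the worked example

# Conditional probabilities quoted in the worked example only (key = day
# relative to the LH peak). The other days of the window are listed in
# Table IV and are deliberately left out here.
P_COND = {-5: 0.2455, -4: 0.6354, -3: 0.8556, 2: 0.1264}


def worst_case_probability(false_infertile_days, p_cond=P_COND, k=K):
    """P = k * [1 - product of (1 - P_i)] over the false infertile days.

    Days predicted correctly as fertile have x_i = 0 and contribute a
    factor of 1, so they are simply omitted from the product.
    """
    no_conception = prod(1.0 - p_cond[day] for day in false_infertile_days)
    return k * (1.0 - no_conception)


p = worst_case_probability([-5, -4, -3, 2])
print(round(p, 4))  # 0.2674
```

With an empty list of false infertile days the product is 1 and the function returns 0, which corresponds to a cycle in which every fertile day was flagged correctly.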
A quality measure (quality index, QI) can be derived using P:
QI = P/k
for each woman and device. QI ranges between 0 and 1 with small values indicating a good method for preventing a pregnancy. A value close to 1 indicates that the device or method is close to no method at all. Thus, QI serves as a normalization between 0 and 1 with the cycle viability as normalizing factor. For the example above, QI = 0.2674/0.277 = 0.9653, indicating that the use of the device is virtually no better than no use of any device.
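As a minimal sketch, the normalization can be written as a one-line function. The name `quality_index` and the range check are illustrative assumptions. The value of k is the one used in the example above.

```python
def quality_index(p_worst, k=0.277):
    """Quality index QI = P / k, a value between 0 and 1."""
    if not 0.0 <= p_worst <= k:
        raise ValueError("worst case probability must lie between 0 and k")
    return p_worst / k


qi = quality_index(0.2674)
print(round(qi, 4))  # 0.9653
```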
Performing these calculations for each combination of women and devices, the means and SD of these risks and quality measures for every device were computed. For the calculation of the individual risks of becoming pregnant, a Mathematica notebook was written for the product formula above (Mathematica program package from Wolfram Research for symbolic and numerical mathematics, version 4.2, www.wolfram.com). In addition, a Microsoft Excel table was written, into which the data from the notebook could be transferred to perform further statistical analysis with the SAS program package for statistics from SAS Institute (www.sas.com, version 8.2). The data from the participants were described by the usual means of descriptive statistics: we computed the means and SD of these risks and quality measurements for every cycle in which each device was tested.
For inferential statistics, the Kruskal–Wallis test as a non-parametric one-way test was used. After demonstrating significant differences for the means of the logarithms of the worst case probabilities for unwanted pregnancies between the methods, the one-factorial analysis of variance and Duncan's a-posteriori test were used to find the possible grouping of the methods according to these average conception probabilities. This was considered as the main outcome for this research. The logarithms of the probabilities were computed to homogenize the variances between the groups instead of the original values. In addition, an analysis of variance for the percentage of false negative days was performed. The results were virtually the same as the primary analysis and are not reported here.
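The authors ran these tests in SAS. Purely as an illustration of the rank-based test, the Kruskal–Wallis H statistic can be sketched in plain Python as below. The three groups of worst case probabilities and the constant 1.0 added before taking logarithms are hypothetical and are not study data.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1 to N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for idx in order[i:j + 1]:
            ranks[idx] = shared_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the usual correction for ties."""
    pooled = [v for group in groups for v in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)
    tie_counts = Counter(pooled).values()
    correction = 1.0 - sum(t ** 3 - t for t in tie_counts) / (n ** 3 - n)
    if correction == 0.0:
        raise ValueError("all observations are identical")
    return h / correction


# Hypothetical worst case probabilities for three devices (not study data).
groups = [[0.25, 0.26, 0.27], [0.10, 0.12, 0.11], [0.00, 0.01, 0.02]]
h_raw = kruskal_wallis_h(groups)
h_log = kruskal_wallis_h([[math.log(v + 1.0) for v in g] for g in groups])
print(round(h_raw, 2))  # 7.2
print(h_raw == h_log)   # True
```

Because H depends only on the ranks, a monotone transformation such as adding a constant and taking the logarithm leaves it unchanged. The transformation matters only for the analysis of variance.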
Results |
---|
In all cycles, ovulation could be detected by the LH surge and maximum follicular diameter occurring on the same day (see Table IV).
In a previously published study of 87 cycles, we examined the correlation between the symptoms of self-observation and ovulation detected by ultrasound/maximum follicle diameter/LH (Gnoth et al., 1996). The basal body temperature (BBT) rise identified according to the three-over-six rule was detected +0.92 (± 1.17) days around objective ovulation by ultrasound and LH monitoring.
Table V shows the total number of computed menstrual cycle days for each system and gives the numbers of false infertile days (true fertile days predicted as not fertile) and false fertile days (true infertile days predicted as fertile). The mini-microscopes clearly had many more false infertile days than the temperature or hormonal devices.
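The counting behind such a table can be sketched as follows. The function name, the 28-day cycle, the reference window on days 9 to 16 and the device prediction on days 11 to 18 are all hypothetical and serve only to show how false infertile and false fertile days relate to sensitivity and specificity.

```python
def classify_days(predicted_fertile, reference_fertile, cycle_length):
    """Compare a device's fertile days with the reference fertile window.

    Both day collections are cycle days in 1..cycle_length. The reference
    window is assumed to lie entirely within the cycle.
    """
    predicted, reference = set(predicted_fertile), set(reference_fertile)
    days = range(1, cycle_length + 1)
    false_infertile = [d for d in days if d in reference and d not in predicted]
    false_fertile = [d for d in days if d in predicted and d not in reference]
    n_fertile = len(reference)
    n_infertile = cycle_length - n_fertile
    return {
        "false_infertile": false_infertile,
        "false_fertile": false_fertile,
        # Share of truly fertile days that the device flagged as fertile.
        "sensitivity": (n_fertile - len(false_infertile)) / n_fertile,
        # Share of truly infertile days that the device flagged as infertile.
        "specificity": (n_infertile - len(false_fertile)) / n_infertile,
    }


# Hypothetical 28-day cycle: reference window days 9-16, device says 11-18.
result = classify_days(range(11, 19), range(9, 17), 28)
print(result["false_infertile"])  # [9, 10]
print(result["false_fertile"])    # [17, 18]
print(result["sensitivity"])      # 0.75
print(result["specificity"])      # 0.9
```

For contraceptive use, only the false infertile days enter the worst case probability. False fertile days lower specificity and lengthen abstinence but do not add to the pregnancy risk.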
These recently published probabilities (Table IV) correlate well with other previously published figures (Schwartz et al., 1979; Bremme, 1991; Miolo et al., 1993; Weinberg et al., 1994). Table V summarizes the results of this worst case analysis. Values obtained from the mini-microscopes, with their relatively high rates of false infertile cycle days, did not differ much from the estimated cycle viability, i.e. the maximal probability of pregnancy. These high false negative rates and low false positive rates account for their low sensitivity in detecting truly fertile cycle days. In contrast, the temperature computers and the STM of NFP are highly sensitive but less specific in detecting the fertile window, which accounts for their optimal use in contraception. Persona was found to have only a medium sensitivity and specificity. The non-parametric Kruskal–Wallis test showed significant differences between the different devices and fertility prediction methods (P < 0.0001). Using one-factorial analysis of variance together with Duncan's method of a-posteriori testing to validate a grouping for these worst case probabilities, the analysis showed a significant difference between three groups of methods: group A consisting of the mini-microscopes PG 53, PC 2000 and Maybe Baby, group B consisting of Persona only, and group C consisting of the temperature computers and natural family planning methods. We used the Kruskal–Wallis test to test for possible differences between the means of the different methods since we could not assume normal distribution of the worst case probabilities. We additionally transformed the data by adding a constant to each value and then taking the logarithm to obtain homogeneous variances for the groups.
We performed the Kruskal–Wallis test to find possible differences in the expectations of the worst case probabilities (and thus in the quality measures) per device. We calculated the worst case probability in the test cycle of every participant. The results of the Kruskal–Wallis test for differences and the Duncan test for a-posteriori grouping were the same for the transformed and the original data. Table VI shows the descriptive statistics for the original data together with the statistical grouping. This presentation was chosen for easier understanding, since a table of the Kruskal–Wallis test (rankings of all women, together with the average ranking per group) does not give as much information as a table with the means and SD of the worst case probabilities and quality measures. For all women in the NFP group, the worst case probability for pregnancy was 0 (thus giving 0 also as the mean and SD of this probability in this group according to the usual formulas used for descriptive statistics). Table V shows that, compared with the other methods, NFP does not classify an excessive number of infertile days as fertile.
Discussion |
---|
Prospective efficacy studies have been carried out on a few devices. One such device, the Ovarian Monitor (Brown and Blackwell, 1980; Brown et al., 1989, 1991), which we did not test here, showed a Pearl Index of 7.3 (Brown et al., 1991) in a study involving 37 women with 569 cycles. With this system, the beginning of the fertile period was marked by the rise in urinary metabolites of estrogen and the end by the rise in progesterone metabolites. This system still has some technical problems and is currently not available in Europe.
The largest prospective efficacy trial of cycle monitors was done for Persona (Freundl, 1998; Bonnar et al., 1999; Trussell, 1999; Trussell, 2001), involving 710 participants in three European countries. The method failure rate was estimated to be 6.4%. In the present study, the failure rate of Persona with daily intercourse on false negative days was in the middle range of all devices tested (average of 0.1155 for the worst case probability and of 0.4169 for the quality index). Essentially, a modified device is now used to identify the fertile days in order to achieve pregnancy (ClearPlan Fertility Monitor: Behre et al., 2000; Behre, 2001; May, 2001).
Another prospective trial was reported for Bioself (Drouin et al., 1994). This study included 83 women with 745 cycles. The pregnancy rate was 9.02 (Pearl Index). Another study by Flynn et al. (1991), involving 131 women with 1238 cycles, showed a Pearl Index of 23. However, out of 24 unplanned pregnancies, only two could be definitely considered as method failures.
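For orientation, the Pearl Index expresses pregnancies per 100 woman-years of exposure. The sketch below uses the conventional formula with 1200 months per 100 woman-years and, as an assumption not stated in the cited studies, counts each cycle as one month of exposure. Some authors instead use 1300 cycles per 100 woman-years, which is why the scaling factor is a parameter. The function name is illustrative.

```python
def pearl_index(pregnancies, exposure_periods, periods_per_100_woman_years=1200):
    """Pregnancies per 100 woman-years of exposure.

    exposure_periods is the number of months (or cycles) of use. Counting
    one cycle as one month is an assumption made here for illustration.
    """
    return pregnancies * periods_per_100_woman_years / exposure_periods


# Figures quoted above for Flynn et al. (1991): 24 pregnancies, 1238 cycles.
print(round(pearl_index(24, 1238)))  # 23
```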
For the other temperature computers tested in the present trial, only small prospective and retrospective efficacy finding studies (EFS) (Freundl et al., 1992, 1998a,b) have been performed. However, it is noteworthy that the most effective devices (and the STM) in the present study are based on BBT and that the estimates from Colombo and Masarotto (2000) are likewise derived from a BBT reference point.
No efficacy studies have been performed for the mini-microscopes prior to this study.
Recent research by Braat et al. (1998) investigated the reliability of predicting fertile days by observing ferning in saliva. In 30 women with regular menstrual cycles, the day of ovulation was confirmed either by ultrasound or by BBT recordings. Every morning a drop of saliva was dried and assessed with a mini-microscope in group 1 (17 women) and a normal light microscope in group 2 (13 women). Tests were judged positive with the appearance of ferning or intermediate (some) ferning. The sensitivity was 53% for group 1 and 86% for group 2. They reported a strong correlation between saliva estradiol and serum estradiol values, but no correlation was detected between the estradiol concentrations in saliva and the ferning pattern, and they concluded that "... the saliva ferning is unreliable for predicting the fertile period and its use should therefore be discouraged."
The Cue Fertility Monitor uses the changes in salivary electrical resistance. Presently, it is not available in Europe but may be of interest after some technical changes. A computerized version, OvaCue, also exists. Two small studies with 42 cycles of 19 women (Moreno et al., 1997) and 21 cycles of 11 women (Fehring, 1996) reported astonishing effectiveness. However, our small prospective study (Freundl et al., 1996) could not confirm these results.
In summary, there is an urgent need to test systems which are developed to detect the fertile window in the menstrual cycle. Before undertaking the effort and expense of a full prospective clinical trial, a primary efficacy estimation analysis as proposed in this paper should be performed. The QI should be ≤0.5 (Persona, temperature computers, STM of NFP). Systems with a QI >0.5 cannot be expected to have a reasonable failure rate in a prospective efficacy finding study (pEFS), should not be offered to patients and are not worth further investigation.
Acknowledgements |
---|
References |
---|
Barrett, J. and Marshall, J. (1969) The risk of conception on different days of the menstrual cycle. Population Studies, 23, 455–461.
Behre, H. (2001) Trial protocol and sample result of a study comparing the ClearPlan Easy Fertility Monitor with serum hormone and vaginal ultrasound measurements in the determination of ovulation. J. Int. Med. Res., 29 (Suppl. 1), 21A–27A.
Behre, H.M., Kuhlage, J., Gassner, C., Sonntag, B., Schem, C., Schneider, H.P. and Nieschlag, E. (2000) Prediction of ovulation by urinary hormone measurements with the home use ClearPlan Fertility Monitor: comparison with transvaginal ultrasound scans and serum hormone measurements. Hum. Reprod., 15, 2478–2482.
Bonnar, J., Flynn, A., Freundl, G., Kirkman, R., Royston, R. and Snowden, R. (1999) Personal hormone monitoring for contraception. Br. J. Fam. Plann., 24, 128–134.
Braat, D.M.D., Smeenk, J.M.J., Manger, A.P., Thomas, C.M.G. and Veersema, S. (1998) Saliva test as ovulation predictor. Lancet, 352, 1283.
Bremme, J. (1991) Sexualverhalten und Konzeptionswahrscheinlichkeit (Auswertung einer prospektiven Studie zur Natürlichen Familienplanung). Thesis, Med. Fakultät der Heinrich-Heine-Universität Düsseldorf, pp. 152.
Brown, J.B. and Blackwell, L.F. (1980) Ovarian Monitor Instruction Manual. OM Research and Reference Center of Australia, Melbourne.
Brown, J.B., Blackwell, L.F., Holmes, J. and Smyth, K. (1989) New assays for identifying the fertile period. Int. J. Gynecol. Obstet., 1 (Suppl.), 111–122.
Brown, J.B., Holmes, J. and Barker, G. (1991) Use of the Home Ovarian Monitor in pregnancy avoidance. Am. J. Obstet. Gynecol., 165, 2008–2011.
Colombo, B. (1989) Biometrical research on some parameters of the menstrual cycle. Int. J. Gynecol. Obstet., 1 (Suppl.), 13–18.
Colombo, B. and Masarotto, G. (2000) Daily fecundability: first results from a new data base. Demogr. Res., 315, Internet edition.
Drouin, J., Guilbert, E.E. and Desaulniers, G. (1994) Contraceptive application of the Bioself fertility indicator. Contraception, 50, 229–238.
Fehring, R.J. (1996) A comparison of the ovulation method with the CUE ovulation predictor in determining the fertile period. J. Am. Acad. Nurse Pract., 8, 461–466.
Flynn, A., Pulcrano, J., Royston, P. and Spieler, J. (1991) An evaluation of the Bioself 110 electronic fertility indicator as a contraceptive aid. Contraception, 44, 125–139.
Freundl, G. (1998) Kontrazeption per Computer. Hormonmesssystem Persona: Studienergebnisse in Deutschland. (Contraception per computer. Hormone system Persona: results of studies in Germany). Fortschritte der Medizin, 116, 47–48.
Freundl, G., Baur, S., Bremme, M., Döring, G., Frank-Herrmann, P., Godehardt, E. and Kunert, J. (1992) Temperaturcomputer zur Bestimmung der fertilen Zeit im Zyklus der Frau: Babycomp, Bioself 110, Cyclotest D. Fertilität, 8, 66–76.
Freundl, G., Bremme, M., Frank, H.P., Baur, S., Godehardt, E. and Sottong, U. (1996) The Cue Fertility Monitor compared to ultrasound and LH peak measurements for fertile time ovulation detection. Adv. Contracept., 12, 111–121.
Freundl, G., Frank-Herrmann, P. and Bremme, M. (1998a) Results of an efficacy-finding study (EFS) with the computer-thermometer Cyclotest 2 plus containing 207 cycles. Adv. Contracept., 14, 201–207.
Freundl, G., Frank-Herrmann, P., Godehardt, E., Klemm, R. and Bachhofer, M. (1998b) Retrospective clinical trial of contraceptive effectiveness of the electronic fertility indicator Ladycomp/Babycomp. Adv. Contracept., 14, 97–108.
Gnoth, C., Frank-Herrmann, P., Bremme, M., Freundl, G. and Godehardt, E. (1996) Do the symptoms of self-observation correlate with ovulation? Zentralbl. Gynäkol., 118, 650–654.
May, K. (2001) Home monitoring with the ClearPlan Easy Fertility Monitor for fertility awareness. J. Int. Med. Res., 29 (Suppl. 1), 14A–20A.
Miolo, L., Colombo, B. and Marshall, J. (1993) A database for biometric research on changes in basal body temperature in the menstrual cycle. Statistica, LIII, 563–572.
Moreno, J.E., Khan, D.F. and Goldzieher, J.W. (1997) Natural family planning: suitability of the CUE method for defining the time of ovulation. Contraception, 55, 233–237.
Schwartz, D., Mayaux, M.J., Martin, B.A., Czyglik, F. and David, G. (1979) Donor insemination: conception rate according to cycle day in a series of 821 cycles with a single insemination. Fertil. Steril., 31, 226–229.
Trussell, J. (1999) Contraceptive efficacy of the personal hormone monitoring system Persona. Br. J. Fam. Plann., 25, 34–35.
Trussell, J. (2001) Measuring the contraceptive efficacy of Persona. Contraception, 63, 77–79.
Weinberg, C.R., Gladen, B.C. and Wilcox, A.J. (1994) Models relating the timing of intercourse to the probability of conception and the sex of the baby. Biometrics, 50, 358–367.
Wilcox, A.J. and Dunson, D. (2000) The timing of the fertile window in the menstrual cycle: day specific estimates from a prospective study. Br. Med. J., 321, 1259–1262.
Wilcox, A.J., Weinberg, C.R. and Baird, D.D. (1998) Post-ovulatory ageing of the human oocyte and embryo failure. Hum. Reprod., 13, 394–397.
Wilcox, A.J., Dunson, D.B., Weinberg, C.R., Trussell, J. and Baird, D.D. (2001) Likelihood of conception with a single act of intercourse: providing benchmark rates for assessment of post-coital contraceptives. Contraception, 63, 211–215.
World Health Organization (1980) WHO Task Force on methods for the determination of the fertile period: temporal relationship between ovulation and defined changes in the concentration of plasma estradiol-17β, LH, FSH and progesterone. Am. J. Obstet. Gynecol., 138, 383–390.
Submitted on December 23, 2002; resubmitted on May 27, 2003; accepted on August 26, 2003.