1 Department of Clinical Epidemiology and Medical Technology Assessment and 2 Department of Obstetrics and Gynaecology, Academisch Ziekenhuis Maastricht, PO Box 5800, 6202 AZ Maastricht and 3 Department of Health Organisation, Policy and Economics, University Maastricht, PO Box 616, 6200 MD Maastricht, The Netherlands
4 To whom correspondence should be addressed. Email: afi{at}kemta.azm.nl
Abstract
Key words: Chlamydia antibody testing/cost-effectiveness/decision model/diagnostic test
Introduction
In a previous study (Land et al., 2003), we compared the clinical performance of five CAT tests and a combination of tests. In clinical practice, however, the decision as to which CAT test is to be preferred in a particular clinical setting should not be based on diagnostic test accuracy alone. Costs related to the performance of the tests, and costs related to the consequences of the tests (of false-positive and false-negative test results in particular), must also be taken into consideration. The first aim of this study is to compare five commercially available Chlamydia immunoglobulin G antibody tests (MIF Biomerieux, MIF Anilabsystems, ELISA Anilabsystems, pELISA Medac and ELISA Savyon) with respect to their cost-effectiveness (costs per IVF pregnancy) in a screening strategy for tubal pathology. The second aim is to evaluate whether serial testing, with an enzyme-linked immunosorbent assay (ELISA) test as the first test and retesting of all positive serum samples with a microimmunofluorescence (MIF) test, is to be preferred from an economic point of view. A decision model technique was applied, since it is not possible to answer these questions by means of a prospective cohort study in which all patients with subfertility problems undergo all CAT tests, hysterosalpingography (HSG) and laparoscopy. It is shown how a decision model technique can be used to structure the evidence on clinical and economic outcomes in a form that can help to inform decisions about clinical practices and health care resource allocations (Weinstein et al., 2003).
Materials and methods
Decision analytic model
The decision analytic model, shown in Figure 1, is based mainly on clinical guidelines used in the Department of Obstetrics and Gynaecology at the University Hospital Maastricht, and on expert opinion. Since hardly any comparative studies have been done so far, few data could be obtained from the literature (Table I). Therefore, the decision analytic model is partly hypothetical. The principles of good practice for decision analytic modelling according to Weinstein et al. (2003) were used in our model.
The model, shown in Figure 1, is applied to all six CAT strategies, using their own test characteristics such as sensitivity, specificity and costs. The seventh strategy evaluated is a direct laparoscopy strategy. We assumed that the drop out in the direct laparoscopy strategy is equal to the drop out in the other six strategies. Complications of HSG and laparoscopy have been taken into consideration in the model. After HSG, in 7.0–10.0% of cases, subfebrile morbidity occurs (Forsey et al., 1990). In our model, costs of antibiotic treatment have been included. In 1.5% of laparoscopies, surgical complications occur which usually require laparotomy (Chapron et al., 1998). The additional costs of these laparotomies have been incorporated in our model.
Data decision model
The data that we used as a basis for the decision analytic model were obtained from an empirical study population of 315 patients, which has been described elsewhere (Land et al., 2003). The empirical study population was selected in the University Hospital Maastricht between 1992 and 2001, and only those patients for whom cryopreserved serum was available, and who underwent laparoscopy and tubal testing with methylene blue dye as part of their fertility work-up, were included. Five different CAT tests were performed after thawing the cryopreserved sera of the participating patients. This empirical cohort is not representative of the population of subfertile women at intake for their fertility work-up, since, in daily practice, not all patients in whom a CAT test is performed as part of the initial fertility work-up will undergo evaluation of tubal function by HSG or laparoscopy. Therefore, an initial cohort had to be calculated as the source of this empirical population: the source population. Based on data of the empirical population, registration data of the Departments of Medical Microbiology and Obstetrics and Gynaecology of the University Hospital Maastricht, our own unpublished data and the literature (Forsey et al., 1990; Swart et al., 1995; Chapron et al., 1998; Mol et al., 1999; Veenemans and van der Linden, 2002), it was estimated that the source population would have consisted of 1715 patients, of whom 600 would have had a positive CAT test at intake and 1115 a negative CAT test (Figure 2). From the empirical population, it is known that 22% of CAT-positive patients had a laparoscopy, and that in 78% no further evaluation of tubal function was done. The main reasons for no further testing are spontaneous pregnancies, arriving at diagnoses in which evaluation of tubal status is not indicated (e.g. severe male factor subfertility) and patient drop out before HSG or laparoscopy. Therefore, of the 600 CAT-positive patients in the source population, 132 would have had a laparoscopy and 468 would have had no additional testing after CAT. From the empirical population, it is known that of the CAT-negative patients, 32% have an HSG and that 68% drop out after CAT. After HSG, 52% of the patients in the empirical population had a laparoscopy and 48% had no further evaluation of tubal function after HSG. Therefore, of the 1115 patients in the source population with a negative CAT, 352 patients would have had an HSG and 763 would have dropped out after CAT. After HSG, 183 patients would have had a laparoscopy and 169 patients would have dropped out before laparoscopy. Consequently, 484 patients (132+352) of the source population would have had further evaluation of tubal function by HSG and/or laparoscopy, and 315 patients would have had a laparoscopy. This group of 315 patients corresponds to the actual study population, all of whom had a laparoscopy (Figure 2).
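The patient flow described above can be checked with a few lines of arithmetic. The following is a minimal sketch that only uses the counts quoted in the text; the variable names are illustrative and not part of the original model. The percentages are recomputed from the counts, so they may differ slightly from the rounded percentages reported for the empirical population.

```python
# Patient flow of the source population, using the counts quoted in the text.
source_total = 1715
cat_positive = 600
cat_negative = 1115

lap_after_positive_cat = 132       # CAT positive -> laparoscopy
no_test_after_positive_cat = 468   # CAT positive -> no further evaluation
hsg_after_negative_cat = 352       # CAT negative -> HSG
dropout_after_negative_cat = 763   # CAT negative -> drop out after CAT
lap_after_hsg = 183                # HSG -> laparoscopy
dropout_after_hsg = 169            # HSG -> no laparoscopy


def pct(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


# Every set of branches must add up to its parent node.
assert cat_positive + cat_negative == source_total
assert lap_after_positive_cat + no_test_after_positive_cat == cat_positive
assert hsg_after_negative_cat + dropout_after_negative_cat == cat_negative
assert lap_after_hsg + dropout_after_hsg == hsg_after_negative_cat

further_evaluation = lap_after_positive_cat + hsg_after_negative_cat  # HSG and/or laparoscopy
laparoscopies = lap_after_positive_cat + lap_after_hsg                # empirical study population
no_evaluation = no_test_after_positive_cat + dropout_after_negative_cat

print("further evaluation:", further_evaluation)
print("laparoscopies:", laparoscopies)
print("laparoscopy after positive CAT (%):", pct(lap_after_positive_cat, cat_positive))
print("HSG after negative CAT (%):", pct(hsg_after_negative_cat, cat_negative))
print("laparoscopy after HSG (%):", pct(lap_after_hsg, hsg_after_negative_cat))
print("no evaluation after CAT (%):", pct(no_evaluation, source_total))
```

The share of patients without any evaluation of tubal function after CAT comes out at roughly 72% of the source population.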
Outcome measure
For this study, we defined that patients with tubal pathology have no probability of spontaneous pregnancy, but only a pregnancy probability if they have treatment by IVF. Since all patients with a diagnosis of tubal pathology were assumed to be treated by IVF, the outcome measure IVF pregnancy was used. IVF pregnancy reflects the probability of pregnancy after three cycles of IVF in patients who have had evaluation of tubal function by HSG and/or laparoscopy, and in whom tubal pathology is diagnosed. Consequently, in patients who have no evaluation of tubal function (because of spontaneous pregnancy, other diagnoses for which tubal evaluation is not indicated or drop out), tubal pathology will not be diagnosed, referral for IVF will not take place, and the IVF pregnancy probability will be zero.
The number of patients with tubal pathology is equal in all strategies. Therefore, the differences between the strategies are not based on different numbers of patients treated, but on the moment of treatment, which is related to test characteristics. A CAT test with high sensitivity will identify patients at high risk for tubal pathology at intake, and tubal factor subfertility can be diagnosed without delay. Early diagnosis of tubal pathology and immediate IVF treatment was defined to have a 50% cumulative pregnancy rate after three cycles (Mol et al., 2001; Granberg et al., 2003; National Collaborating Centre for Women's and Children's Health, 2004). A longer lasting diagnostic path, in the case where CAT is false negative, was estimated to reduce the cumulative pregnancy rate by 5% per year of delay, i.e. from 50 to 47.5% after 12 months (National Collaborating Centre for Women's and Children's Health, 2004). In CAT-negative patients, if HSG is abnormal, patients will have a delay of 3 months before they have a laparoscopy, mainly because of waiting lists. If HSG seems normal in patients who have tubal pathology at laparoscopy, expectant management is defined for half a year, which means that these patients, if they do not conceive, have a delay of 9 months before laparoscopy is done. Thus, in our model, in patients with tubal pathology, postponing IVF treatment after abnormal or normal HSG (a delay of 3 or 9 months) will decrease the cumulative pregnancy rate by 0.6 or 1.9 percentage points, respectively. It can be shown that, after a delay of t months, the cumulative pregnancy rate is multiplied by $(1 - P_e)^{t/12}$, with $P_e$ being the proportional reduction in cumulative pregnancy rate after 12 months delay. The moment of IVF treatment does not influence the expected costs of the different strategies, and therefore costs of IVF are considered not to be relevant for our model.
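The effect of a diagnostic delay on the cumulative pregnancy rate can be worked through numerically. The sketch below assumes that the baseline rate is multiplied by (1 - Pe)^(t/12) for a delay of t months, with Pe = 0.05 and a baseline rate of 50%, as stated in the text; the function names are illustrative only.

```python
def remaining_fraction(annual_reduction: float, delay_months: float) -> float:
    """Fraction of the baseline pregnancy rate left after a delay of `delay_months`."""
    return (1.0 - annual_reduction) ** (delay_months / 12.0)


def delayed_rate(baseline_rate: float, annual_reduction: float, delay_months: float) -> float:
    """Cumulative pregnancy rate (in %) after postponing IVF by `delay_months`."""
    return baseline_rate * remaining_fraction(annual_reduction, delay_months)


BASELINE = 50.0  # % cumulative pregnancy rate after three IVF cycles, no delay
PE = 0.05        # proportional reduction after 12 months of delay

# 3 months: abnormal HSG; 9 months: normal HSG; 12 months: reference delay.
for months in (3, 9, 12):
    rate = delayed_rate(BASELINE, PE, months)
    print(f"{months:>2} months: {rate:.2f}% (loss of {BASELINE - rate:.2f} percentage points)")
```

Under these assumptions, a 3-month delay costs about 0.6 percentage points and a 9-month delay about 1.9 percentage points, while a full year brings the rate down to 47.5%.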
Statistical analysis
The baseline values of the probabilities and costs were determined and incorporated into the decision tree by using the software programme DATA version 3.5 (TreeAge software, Williamstown, MA). Analysing the decision tree, the path probabilities of each branch of the tree, the expected costs per patient, the IVF pregnancies and finally the expected costs per IVF pregnancy were given for each strategy. An incremental analysis was done for the seven strategies. This analysis is based on the costs per extra IVF pregnancy of a particular strategy, compared with the least expensive strategy.
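To illustrate the incremental analysis, the following is a minimal sketch and not the DATA/TreeAge implementation used in the study. It assumes that each strategy is compared with the next least expensive strategy that has not been ruled out, and that a strategy is dominated when it costs more without being more effective. The three strategies and their numbers are hypothetical and chosen only to show the mechanics.

```python
from typing import NamedTuple, Optional


class Strategy(NamedTuple):
    name: str
    cost: float    # expected cost per patient
    effect: float  # expected IVF pregnancies per patient


def incremental_analysis(strategies: list[Strategy]) -> tuple[list[tuple[str, Optional[float]]], list[str]]:
    """Return (name, incremental cost per extra effect) for each retained strategy,
    cheapest first, plus the names of the dominated strategies."""
    ordered = sorted(strategies, key=lambda s: (s.cost, -s.effect))
    reference = ordered[0]                  # least expensive strategy
    retained: list[tuple[str, Optional[float]]] = [(reference.name, None)]
    dominated: list[str] = []
    for s in ordered[1:]:
        if s.effect <= reference.effect:    # more expensive and not more effective
            dominated.append(s.name)
            continue
        icer = (s.cost - reference.cost) / (s.effect - reference.effect)
        retained.append((s.name, icer))
        reference = s                       # next comparison starts from here
    return retained, dominated


# Hypothetical inputs, not taken from Table II.
example = [
    Strategy("A (hypothetical)", cost=200.0, effect=0.0140),
    Strategy("B (hypothetical)", cost=210.0, effect=0.0135),
    Strategy("C (hypothetical)", cost=230.0, effect=0.0150),
]

results, dominated = incremental_analysis(example)
for name, icer in results:
    print(name, "reference" if icer is None else f"{icer:,.0f} per extra IVF pregnancy")
print("dominated:", dominated)
```

This sketch handles only simple dominance; extended dominance, where a combination of two strategies outperforms a third, is deliberately left out.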
To test the robustness of these results, a sensitivity analysis is required (Briggs et al., 1994). This is the process of repeatedly analysing the tree by using different values for probability (e.g. values for specificity), utility (e.g. values of the outcome measure) (Krahn et al., 1997) and cost variables. In this study, univariate sensitivity analyses were performed by using the 95% confidence intervals of baseline values (sensitivity range), or a predetermined value range (Table I, column 2). The disadvantage of a univariate sensitivity analysis is that only one variable at a time can be changed. Therefore, a multivariate sensitivity analysis on accuracy numbers (sensitivity and specificity) was also done. In this analysis, a worst and best case analysis (extreme sensitivity and specificity values of the CAT tests) (Briggs et al., 1994) was performed. Finally, a threshold analysis was performed to determine whether, and at which value of a variable, the preferred strategy changes. The threshold value represents the value of a variable above which another strategy is to be preferred.
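A threshold analysis of this kind can be sketched as a search for the parameter value at which two strategies have equal costs per IVF pregnancy. The example below is a toy model under loudly stated assumptions: the costs (220 and 222), sensitivities (0.6 and 0.7), the 3% share of diagnosed tubal pathology and the one-year delay for missed patients are all hypothetical, and the function names are illustrative. It is not the decision tree of this study.

```python
from typing import Callable, Optional


def cost_per_pregnancy(cost: float, sensitivity: float, annual_reduction: float,
                       diagnosed: float = 0.03, baseline_rate: float = 0.5) -> float:
    """Toy model: detected patients get IVF at once, missed patients one year later."""
    effect = diagnosed * baseline_rate * (
        sensitivity + (1.0 - sensitivity) * (1.0 - annual_reduction)
    )
    return cost / effect


def threshold(f: Callable[[float], float], low: float, high: float,
              tol: float = 1e-9) -> Optional[float]:
    """Bisection: the value in [low, high] at which f changes sign, or None."""
    f_low = f(low)
    if f_low * f(high) > 0:
        return None          # no change of preferred strategy in this range
    while high - low > tol:
        mid = (low + high) / 2.0
        if f_low * f(mid) <= 0:
            high = mid       # sign change lies in the lower half
        else:
            low = mid        # f(mid) has the sign of f(low); move up
    return (low + high) / 2.0


def difference(annual_reduction: float) -> float:
    """Cost per pregnancy of hypothetical test A minus that of hypothetical test B."""
    a = cost_per_pregnancy(cost=220.0, sensitivity=0.6, annual_reduction=annual_reduction)
    b = cost_per_pregnancy(cost=222.0, sensitivity=0.7, annual_reduction=annual_reduction)
    return a - b


t = threshold(difference, 0.0, 1.0)
print(f"threshold annual reduction: {t:.4f}")
```

Below the threshold the cheaper, less sensitive test has the lower cost per pregnancy; above it the more sensitive test is preferred. A univariate sensitivity analysis reports whether such a threshold falls inside the plausible range of the variable.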
Results
Results of the model
Results, as presented in Table II, are expressed in four ways.
Expected effect per strategy. This is the probability of IVF pregnancy for the population seeking subfertility care. Thus, for instance, for the pELISA Medac strategy, an expected effect of 1.468% indicates that in a population of 100 000 couples seeking subfertility care, 1468 IVF pregnancies can be expected.
Cost per effect. These are the expected costs for any IVF pregnancy in patients diagnosed as having tubal pathology. For the pELISA Medac strategy, this outcome measure indicates that for each of the 1468 IVF pregnancies in 100 000 couples, an investment of 15 075 is required.
Incremental costs per effect. These are the costs per extra unit of effect, comparing different CAT strategies. In this study, it is the costs per extra IVF pregnancy of a particular strategy compared with the least expensive strategy. The incremental costs per effect show that two strategies (ELISA Anilabsystems and Med + Ani) are ruled out as dominated by pELISA Medac, since they are more expensive and less effective. In the MIF Anilabsystems strategy, 1469 IVF pregnancies are expected in 100 000 couples, which is an extra effect of one pregnancy compared with 1468 IVF pregnancies in the pELISA Medac strategy. However, for this extra pregnancy, an investment of 61 391 is required. The ELISA Savyon strategy is ruled out as dominated by the MIF Anilabsystems strategy, since it is more expensive and less effective. In the MIF Biomerieux strategy, 1473 IVF pregnancies are expected in 100 000 couples, which is an extra four compared with MIF Anilabsystems. For each of these four pregnancies, an investment of 321 989 is required (Table II). In the reference strategy, in which the patients will have a laparoscopy immediately after intake, 1480 IVF pregnancies are expected in 100 000 couples, which is an extra seven pregnancies compared with the MIF Biomerieux strategy. For each of these seven pregnancies, an investment of 726 520 is required.
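The relation between these outcome measures can be made explicit with the figures quoted above. The sketch below uses only numbers from the text; the amounts are in the monetary unit of Table II, and the totals are approximate because the expected numbers of pregnancies are rounded to whole pregnancies per 100 000 couples. Variable names are illustrative.

```python
COUPLES = 100_000

# pELISA Medac: expected effect and cost per effect as reported in the text.
medac_effect = 0.01468           # probability of IVF pregnancy per couple (1.468%)
medac_cost_per_effect = 15_075   # expected cost per IVF pregnancy

medac_pregnancies = round(medac_effect * COUPLES)
medac_cost_per_couple = medac_effect * medac_cost_per_effect
print(f"pELISA Medac: {medac_pregnancies} pregnancies, "
      f"about {medac_cost_per_couple:.2f} per couple")

# Non-dominated strategies in order of increasing cost:
# (name, pregnancies per 100 000 couples, incremental cost per extra pregnancy)
steps = [
    ("MIF Anilabsystems", 1469, 61_391),
    ("MIF Biomerieux", 1473, 321_989),
    ("Direct laparoscopy", 1480, 726_520),
]

extras = []
previous = medac_pregnancies
for name, pregnancies, incremental_cost in steps:
    extra = pregnancies - previous            # extra pregnancies versus previous strategy
    extras.append(extra)
    extra_total = extra * incremental_cost    # approximate extra cost per 100 000 couples
    print(f"{name}: {extra} extra pregnancies, about {extra_total:,} extra in total")
    previous = pregnancies
```

The cost per couple follows from multiplying the cost per effect by the expected effect, and each incremental figure spreads the extra cost of a strategy over only the few extra pregnancies it yields, which is why the incremental costs rise so steeply.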
Sensitivity results
Results of both the univariate and multivariate sensitivity analyses are given in Table III. In Table IIIA, the baseline values and corresponding sensitivity ranges are given for all characteristics in which a threshold value is exceeded.
In the multivariate sensitivity analysis, pELISA Medac remained the most cost-effective CAT screening strategy in four of the six strategies (Table IIIB) that were tested. However, when the specificity of all tests had the lowest value (Table I), or when the specificity as well as the sensitivity of all tests had the lowest value (Table I), MIF Anilabsystems became most cost-effective.
Discussion
Comparison of the five CAT strategies in the source population showed that pELISA Medac is the most cost-effective strategy, followed by MIF Anilabsystems. The differences between the strategies were small, however, mainly because 72% of the patients had no further evaluation of tubal function after CAT (because of spontaneous pregnancy, a diagnosis for which tubal status is not relevant or no more visits to the hospital), and had no additional costs. By performing sensitivity analyses on the baseline values of the source population, it was possible to evaluate in which circumstances MIF Anilabsystems became more cost-effective than pELISA Medac. The baseline value of the annual decrease in cumulative pregnancy rate after postponement of IVF was considered to be 5%. The sensitivity analyses showed a threshold value of 19% decrease in cumulative pregnancy rate (from 50 to 40.5%), above which the MIF Anilabsystems strategy was to be preferred. In clinical practice, however, it is unlikely that the cumulative pregnancy rate after three IVF cycles will decrease by more than 19% if treatment is postponed for 12 months. MIF Anilabsystems also became most cost-effective if the cut-off titre of a positive test was 64 instead of 32. Although a cut-off titre of 32 is recommended by the manufacturer, a cut-off titre of 64 might be preferred from a clinical point of view, if one prefers to have higher specificity and fewer false-positive test results (Land et al., 1998). Finally, if specific test characteristics such as sensitivity, specificity and costs of pELISA Medac and MIF Anilabsystems changed, MIF Anilabsystems became the most cost-effective screening strategy.
pELISA Medac remained the most cost-effective screening strategy in the case where the prevalence of tubal pathology was between 5 and 40%. In a primary, secondary and tertiary care population, the prevalence of tuboperitoneal disorders has been estimated to be 11% (Snick et al., 1997; Evers, 2002), 20% (Hull et al., 1985; Evers, 2002) and 30% (Collins et al., 1995; Evers, 2002), respectively. It is unlikely that the prevalence of tubal pathology will exceed 40%, even in a tertiary care population.
Cost prices were calculated for testing 10, 20 and 40 serum samples simultaneously in one laboratory session. The baseline number of samples tested simultaneously was 20, based on the clinical experience in the University Hospital Maastricht, where 300 CAT tests are done yearly, and CAT test results are available within 2–3 weeks. In a smaller fertility centre (<150 CAT tests per year), it may be preferred to test 10 samples simultaneously, and in a larger fertility centre (>600 CAT tests per year) the preferred number might be 40. In cases where 10, 20 or 40 serum samples were tested simultaneously, pELISA Medac remained most cost-effective.
pELISA Medac was also the most cost-effective screening strategy in the case where the cumulative pregnancy rate after three IVF cycles was between 20 and 70%. These values are extremes, and it is unlikely that the cumulative pregnancy rate will be lower than 20% or higher than 70%.
Apart from the five CAT tests, a combination of two CAT tests (Med + Ani) was evaluated. For this combination of tests, pELISA Medac was chosen as a first test, since it can be automated, and the more laborious MIF Anilabsystems was chosen as the second test, because of its diagnostic accuracy. It was concluded that if pELISA Medac is performed on all samples, and only those samples with positive test results (i.e. 20% of all samples) are retested with MIF Anilabsystems, the predictive value of the set is comparable with the predictive value of MIF Anilabsystems as a single test (Land et al., 2003). Performing Med + Ani might be preferred from a clinical point of view, since it is less laborious than testing all samples with MIF Anilabsystems, and a titre is available for positive test results. A titre gives additional information, since the height of a titre correlates with the risk for tubal pathology (Akande et al., 2003). From a cost-effectiveness point of view, however, Med + Ani turned out to be less cost-effective than pELISA Medac or MIF Anilabsystems as a single test. Only in cases where the sensitivity, specificity or cost of the Med + Ani test were changed did Med + Ani become the most cost-effective strategy.
Finally, the reference strategy, in which the patients had a laparoscopy after intake, was found to have the greatest effect (1.48% IVF pregnancies). From a cost-effectiveness point of view, however, the reference strategy turned out to be inferior to the pELISA Medac strategy, because of the high cost of the high number of laparoscopies in the reference strategy.
Although the different strategies were evaluated in the University Hospital Maastricht, the results of this study can be used in other settings. The sensitivity analyses, based on clinical, economic and hospital characteristics, show that the model results are very robust, indicating that pELISA Medac is the most cost-effective screening strategy in most settings. In Table III, we have given the threshold values, which enable one to compare the baseline values used in this study with one's own specific situation. Before introducing a CAT screening strategy into clinical practice, one should take the baseline values in that particular setting into account, and not base decisions on differences in diagnostic accuracy of the CAT tests only. In the methodology of the economic evaluation of diagnostic tests, it is important to evaluate the effects and costs of a complete strategy, instead of only individual cost prices of the tests and separate clinical outcome measures.
In conclusion, only small differences were found between the CAT screening strategies for tubal factor subfertility evaluated, concerning cost-effectiveness. pELISA Medac was the most cost-effective strategy, followed by MIF Anilabsystems. A combination of tests (pELISA Medac as the first test and retesting of all positive serum samples with MIF Anilabsystems) did not improve the cost-effectiveness of pELISA Medac or MIF Anilabsystems as single tests.
Acknowledgements
References
Briggs A, Sculpher M and Buxton M (1994) Uncertainty in the economic evaluation of health care technologies: the role of sensitivity analysis. Health Econ 3, 95–104.
Chapron C, Querleu D, Bruhat MA, Madelenat P, Fernandez H, Pierre F and Dubuisson JB (1998) Surgical complications of diagnostic and operative gynaecological laparoscopy: a series of 29 966 cases. Hum Reprod 13, 867–872.
Collins JA, Burrows EA and Wilan AR (1995) The prognosis for live birth among untreated infertile couples. Fertil Steril 64, 22–28.
Evers JLH (2002) Female subfertility. Lancet 360, 151–159.
Forsey J, Caul E, Paul ID and Hull MGR (1990) Chlamydia trachomatis, tubal disease and the incidence of symptomatic and asymptomatic infection following hysterosalpingography. Hum Reprod 5, 444–447.
Granberg M, Strandell A, Thornburn J, Daya S and Wikland M (2003) Economic evaluation of infertility treatment for tubal disease. J Assist Reprod Genet 20, 301–308.
Hull M, Glazener CM, Kelly NJ, Conway DI, Foster PA, Hinton RA, Coulson C, Lambert PA, Watt EM and Desai KM (1985) Population study of causes, treatment, and outcome of infertility. Br Med J 291, 1693–1697.
Krahn MD, Naglie G, Naimark D, Redelmeier DA and Detsky AS (1997) Primer on medical decision analysis: part 4 – analysing the model and interpreting the results. Med Decis Making 17, 142–151.
Land JA, Evers JLH and Goossens VJ (1998) How to use Chlamydia antibody testing in subfertility patients. Hum Reprod 13, 1094–1098.
Land JA, Gijsen AP, Kessels AGH, Slobbe MEP and Bruggeman CA (2003) Performance of five serological chlamydia antibody tests in subfertile women. Hum Reprod 18, 2621–2627.
Mol BWJ, Collins JA, Burrows EA, Veen van der F and Bossuyt PMM (1999) Comparison of hysterosalpingography and laparoscopy in predicting fertility outcome. Hum Reprod 14, 1237–1242.
Mol BWJ, Collins JA, Veen van der F and Bossuyt PMM (2001) Cost-effectiveness of hysterosalpingography, laparoscopy, and Chlamydia antibody testing in subfertile couples. Fertil Steril 75, 571–580.
National Collaborating Centre for Women's and Children's Health (2004) Commissioned by the National Institute for Clinical Excellence. Fertility assessment and treatment for people with fertility problems, Clinical Guidelines, pp. 81–98. Accessible at http://www.rcog.org.uk/resources/Public/Fertility_full.pdf.
Oostenbrink JB, Koopmanschap MA and Rutten FFH (2000) Handleiding voor kostenonderzoek, methoden en richtlijnprijzen voor economische evaluaties in de gezondheidszorg. College voor zorgverzekeringen, Amstelveen.
Snick HK, Snick TS, Evers JLH and Collins JA (1997) The spontaneous pregnancy prognosis in untreated subfertile couples: the Walcheren primary care study. Hum Reprod 12, 1582–1588.
Swart P, Mol BWJ, Veen van der F, Beurden van M, Redekop WK and Bossuyt PMM (1995) The accuracy of hysterosalpingography in the diagnosis of tubal pathology: a meta-analysis. Fertil Steril 64, 486–491.
Veenemans L and van der Linden PJQ (2002) The value of Chlamydia trachomatis antibody testing in predicting tubal factor infertility. Hum Reprod 17, 695–698.
Weinstein MC, O'Brien B, Hornberger J, Jackson J, Johannesson M, McCabe C and Luce B (2003) Principles of good practice for decision analytic modelling in health-care evaluation: report of the ISPOR task force on good research practices – modelling studies. Value Health 6, 9–17.
Submitted on December 18, 2003; resubmitted on August 5, 2004; accepted on October 15, 2004.