McMaster University, Hamilton, Ontario, Canada
Abstract
Key words: assisted reproduction/gonadotrophins/homogeneity/meta-analysis/validity
Introduction
We agree that logistic regression analysis is one of several methods available for conducting a meta-analysis. In fact, we performed such an analysis, and the results, demonstrating that the magnitude of the overall treatment effect and its precision were similar to those obtained with the Peto modification of the Mantel–Haenszel method, were presented in our paper. The odds ratios (OR) and their 95% confidence intervals (CI) were 1.20 (1.1–1.5) and 1.20 (1.02–1.42), respectively (Daya and Gunby, 1999). We should point out that the slight discrepancy noted by Dr Walters in the overall OR [i.e. 1.261 (using the weighted mean value approach) and 1.26 (using logistic regression analysis) compared with 1.20 (with our analysis)] arises because the data used in his analysis were limited to IVF cycles only (he did not include the data from ICSI cycles), whereas our analysis used data from both IVF and ICSI cycles. When we restricted our analysis to IVF cycles, we too observed a higher OR (1.26, 95% CI 1.05–1.52), as noted in our paper; this estimate is identical to that obtained by Dr Walters using the other two methods.
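For readers who wish to see how a pooled odds ratio of this kind is obtained, a minimal sketch of the Mantel–Haenszel estimator follows. The 2x2 counts are hypothetical, chosen only for illustration; they are not the data from the review.

```python
# Sketch of a Mantel-Haenszel pooled odds ratio across trials.
# Each trial is a 2x2 table (a, b, c, d):
#   a = treatment events, b = treatment non-events,
#   c = control events,   d = control non-events.
# The counts below are hypothetical, not the review's data.

def mantel_haenszel_or(trials):
    """Pooled OR = sum(a*d/n) / sum(b*c/n), summed over trials."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in trials)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in trials)
    return num / den

trials = [(30, 70, 25, 75), (45, 55, 38, 62), (12, 88, 10, 90)]
print(round(mantel_haenszel_or(trials), 3))  # -> 1.297
```

The same tables fed to a logistic regression with a treatment indicator would be expected to yield a similar estimate, which is the point of agreement made in the text.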
We agree with Dr Walters that investigation of trial heterogeneity is extremely important in any systematic review and meta-analysis, and there are several methods for ascertaining it. An underlying assumption, in combining the data from individual trials to arrive at a summary estimate of the effect of treatment, is that differences among trials are due to chance alone. Statistical heterogeneity can be attributable to one of two causes. First, the estimates observed in trials may differ because of random sampling error: even if the true (but unknown) effect is the same in each trial, because all samples are drawn from the same population, the results observed in the different trials would be expected to vary randomly around the true, fixed effect. This variability is called the within-study variance. Alternatively, each trial sample may have been drawn from a different population, so that the treatment effect estimates would be expected to differ. These differences, called random effects, describe the between-study variance around the overall mean of the estimates of all trials. Algebraically, the fixed-effects and random-effects models are identical except for the inclusion of a random-effect term in the latter model; this term represents unmeasured sources of heterogeneity among trials. To the extent that confidence intervals are supposed to reflect subjective uncertainty about the estimate of the treatment effect, random-effects models can be superior to fixed-effects models (Dickersin and Berlin, 1992) because the random-effect term provides some allowance for sources of heterogeneity beyond sampling error. In the absence of significant heterogeneity, the random-effect term is zero, and both analyses (fixed-effects and random-effects) should produce similar results.
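The fixed-effects versus random-effects comparison described above can be sketched in code. The inverse-variance weighting with a DerSimonian–Laird estimate of the between-study variance shown here is one common implementation of the two models, not necessarily the method used in the analysis, and the trial counts are hypothetical. When the between-study variance estimate is zero, the two models coincide, which is the behaviour the text describes.

```python
import math

def pooled_log_odds_ratio(trials, random_effects=False):
    """Pooled OR and 95% CI by inverse-variance weighting.
    trials: list of (a, b, c, d) 2x2 tables, a/b = treatment
    events/non-events, c/d = control events/non-events.
    Sketch only; counts are hypothetical, not the review's data."""
    y = [math.log((a * d) / (b * c)) for a, b, c, d in trials]  # log ORs
    v = [1/a + 1/b + 1/c + 1/d for a, b, c, d in trials]        # Woolf variances
    w = [1.0 / vi for vi in v]
    if random_effects:
        # DerSimonian-Laird estimate of the between-study variance tau^2
        ybar = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
        q = sum(wi * (yi - ybar) ** 2 for wi, yi in zip(w, y))
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - (len(trials) - 1)) / c)
        # widen the weights by tau^2; if tau2 == 0 this is the fixed model
        w = [1.0 / (vi + tau2) for vi in v]
    est = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    se = math.sqrt(1.0 / sum(w))
    return math.exp(est), (math.exp(est - 1.96 * se),
                           math.exp(est + 1.96 * se))
```

Running both variants on the same homogeneous set of trials returns essentially identical estimates and intervals, mirroring the sensitivity analysis reported below.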
After performing such a sensitivity analysis on our data, we obtained the following (identical) results: common OR 1.21, 95% CI 1.03–1.43 (fixed-effects model) and 1.21, 95% CI 1.03–1.43 (random-effects model). Thus, there was no heterogeneity in the estimates of treatment effect among trials selected for the systematic review.
Another method of investigating variation in study outcomes is to assess the statistical significance of between-study heterogeneity based on the χ2 distribution (Fleiss, 1981). It provides a measure of the sum of the squared differences between the results observed and the results expected in each trial, under the assumption that each trial estimates the same common treatment effect. If the total deviation observed is large, then a single common treatment effect is unlikely. Using a test such as the Breslow–Day test (Breslow and Day, 1980), we observed no significant heterogeneity of treatment effect across all trials (Breslow–Day statistic = 7.5, P = 0.94). In practice, because this test has low sensitivity for detecting heterogeneity, it has been suggested that a liberal significance threshold, such as 0.1, be used to determine whether the result is statistically significant (Fleiss, 1981). The probability value we obtained was clearly well above this threshold.
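The P value for such a heterogeneity statistic comes from the upper tail of the χ2 distribution with degrees of freedom equal to one less than the number of trials. A minimal sketch follows; the survival function is computed with a standard regularized lower incomplete-gamma series, and the degrees-of-freedom value in the example is hypothetical.

```python
import math

def chi2_sf(x, df, terms=200):
    """Survival function P(X > x) for a chi-square variable with df
    degrees of freedom, via the regularized lower incomplete-gamma
    series with a = df/2 evaluated at x/2. Sketch implementation."""
    a, x = df / 2.0, x / 2.0
    if x == 0:
        return 1.0
    term = 1.0 / a
    total = term
    for n in range(1, terms):
        term *= x / (a + n)  # next series term x^n / (a(a+1)...(a+n))
        total += term
    p_lower = total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return 1.0 - p_lower

# For a heterogeneity statistic of 7.5 on, say, 15 degrees of freedom
# (a hypothetical 16 trials), the P value sits far above the liberal
# 0.1 threshold, so homogeneity would not be rejected:
print(chi2_sf(7.5, 15) > 0.1)  # True
```

A statistic well below its degrees of freedom, as here, always yields a large P value, which is why the test result reported in the text is so clearly non-significant.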
In addition to the statistical approaches mentioned, investigators may wish to display the trial outcomes graphically to make a subjective judgement on homogeneity of treatment effect, especially when the formal statistical tests fail to reject the homogeneity assumption. One method displays variation in the observed estimates of treatment effect by plotting the event rates in the treatment groups on the vertical axis against the event rates in the control groups on the horizontal axis (L'Abbé et al., 1987). The data from our study are displayed in Figure 1 using this graphical approach. The scatter plot shows the data clustered together, indicating consistency of treatment effect (i.e. homogeneity). If there were a lack of consistent treatment effect (i.e. heterogeneity), the data points would be more widely dispersed.
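The coordinates for an L'Abbé display of this kind are simply the per-trial event rates in the two arms. A minimal sketch follows; the trial counts are hypothetical, for illustration only.

```python
# Event-rate pairs for an L'Abbe plot (L'Abbe et al., 1987).
# Trial counts below are hypothetical, not the review's data.

def labbe_points(trials):
    """trials: list of (a, b, c, d) 2x2 tables, a/b = treatment
    events/non-events, c/d = control events/non-events.
    Returns one (control event rate, treatment event rate) pair per
    trial, i.e. the (x, y) coordinates of its point on the plot."""
    return [(c / (c + d), a / (a + b)) for a, b, c, d in trials]

points = labbe_points([(30, 70, 25, 75), (45, 55, 38, 62),
                       (12, 88, 10, 90)])
# Points on the diagonal y = x indicate no treatment effect; a
# homogeneous beneficial effect puts the points in a tight band
# above the diagonal, while heterogeneity scatters them widely.
```

The resulting pairs can be passed to any scatter-plot routine; the visual clustering (or dispersion) of the points is the subjective homogeneity check described in the text.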
References
Daya, S. and Gunby, J. (1999) Recombinant versus urinary follicle stimulating hormone for ovarian stimulation in assisted reproduction. Hum. Reprod., 14, 2207–2215.
Dickersin, K. and Berlin, J.A. (1992) Meta-analysis: state-of-the-science. Epidemiol. Rev., 14, 154–176.
Fleiss, J.L. (1981) Statistical Methods for Rates and Proportions, 2nd edn. John Wiley, New York, pp. 161–165.
L'Abbé, K.A., Detsky, A.S. and O'Rourke, K. (1987) Meta-analysis in clinical research. Ann. Intern. Med., 107, 224–233.