Thorpes, The Grip, Linton, Cambridge CB1 6NR, UK
Dear Sir,
The publication of a letter in a recent issue of Human Reproduction (James, 1999) has prompted this appeal for more statistical rigour when pooling several sets of data to provide a single composite finding. The datasets in question are summarized in Table I, which displays the proportions of male offspring for two periods of conception: the 'most fertile' days of the menstrual cycle and the remaining period. A previous paper by Gray et al. (1998) had failed to detect any importance for this factor, but James (1999) assembled the data from five published papers and produced a composite finding suggesting that there was a higher than expected proportion of males when conception occurred outside the 'most fertile' period.
It would appear that James made his inference by simply pooling the relevant frequencies over the five references to produce a χ² statistic of 9.9 on one degree of freedom, a highly significant result.
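To put a figure on 'highly significant', the tail probability for a χ² of 9.9 on one degree of freedom can be computed directly. The short Python sketch below relies on the fact that a χ² variate on one degree of freedom is the square of a standard normal deviate; the function name is introduced here purely for illustration.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variate on 1 degree of freedom.

    Uses P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for standard normal Z.
    """
    return math.erfc(math.sqrt(x / 2.0))


if __name__ == "__main__":
    # 9.9 is the pooled statistic quoted in the text.
    print(f"p = {chi2_sf_1df(9.9):.4f}")
```

The probability comes out a little under 0.002. That is the strength of evidence the pooled test appears to offer, and it is the figure that needs to be re-examined once heterogeneity is allowed for.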
In fact, that simple test would be valid only if each of the five references provided an estimate of an unknown but constant proportion for each of the two columns of data. Experience has shown that there is very often a degree of heterogeneity between data sets, which represent differing conditions generally unknown to the analyst and completely beyond his control. It is therefore essential to investigate heterogeneity before applying the simplest of tests to the total frequencies. The effect of wrongly applying the test would almost invariably be to exaggerate the importance of the effect being investigated, since heterogeneity inflates the effective error, and a test that allowed for it would diminish the apparent size of the effect.
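A quick check of this kind can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: it uses Woolf's test for heterogeneity of log odds ratios, which is a simpler technique than the logistic regression analysis discussed in this letter, and every count in it is hypothetical rather than taken from Table I. Each table is written as (males, females) in the 'most fertile' period followed by (males, females) in the remaining period.

```python
import math


def pearson_chi2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def pooled_chi2(tables):
    """Chi-square computed on the frequencies summed over all tables."""
    sums = [sum(t[i] for t in tables) for i in range(4)]
    return pearson_chi2(*sums)


def woolf_heterogeneity(tables):
    """Return (Q, df) for Woolf's test that the odds ratio is constant across tables.

    Q is referred to a chi-square distribution on (number of tables - 1) df.
    All four cells of every table must be non-zero.
    """
    log_ors = [math.log(a * d / (b * c)) for a, b, c, d in tables]
    weights = [1.0 / (1 / a + 1 / b + 1 / c + 1 / d) for a, b, c, d in tables]
    mean = sum(w * l for w, l in zip(weights, log_ors)) / sum(weights)
    q = sum(w * (l - mean) ** 2 for w, l in zip(weights, log_ors))
    return q, len(tables) - 1


if __name__ == "__main__":
    # Hypothetical counts for three references (NOT the data in Table I).
    tables = [(48, 52, 61, 39), (55, 45, 50, 50), (40, 60, 66, 34)]
    q, df = woolf_heterogeneity(tables)
    print(f"pooled chi-square = {pooled_chi2(tables):.2f} on 1 df")
    print(f"heterogeneity Q   = {q:.2f} on {df} df")
```

A large Q relative to its degrees of freedom is a warning that the references do not share a common effect, in which case the pooled statistic should not be taken at face value.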
The complete and rigorous analysis of the data in Table I would formerly have presented severe problems, but fairly recent advances in statistical computing have now resolved these difficulties. Logistic regression may be used to model the data and investigate all the effects. The GENSTAT package (GENSTAT, 1988) is particularly well suited to this type of analysis and summarizes the findings very concisely. It is not appropriate here to delve into the mathematical complexities of the analysis, but, stripped of statistical jargon, we fit a model such that the logistic transform of the proportion in any of the 10 cells in Table I is computed as follows:
[Equation not reproduced in the source.]
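For readers who wish to try an analysis of this general kind, the following Python sketch may be helpful. It is not the GENSTAT analysis itself, and the exact terms of the fitted model are an assumption: the sketch supposes a logistic model with one term for each reference and one term for conception period, fitted by Newton-Raphson. The function names are introduced here for illustration and all counts are hypothetical. The residual deviance of the additive model measures heterogeneity of the period effect across references, and the drop in deviance on adding the period term measures the period effect after allowing for differences between references.

```python
import math


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fitted(X, beta):
    """Fitted proportions obtained by inverting the logistic transform."""
    return [1.0 / (1.0 + math.exp(-sum(x * b for x, b in zip(row, beta)))) for row in X]


def deviance(X, males, totals, iterations=25):
    """Fit a binomial logit model by Newton-Raphson and return its residual deviance."""
    rows, cols = range(len(X)), range(len(X[0]))
    beta = [0.0] * len(X[0])
    for _ in range(iterations):
        mu = fitted(X, beta)
        score = [sum(X[i][j] * (males[i] - totals[i] * mu[i]) for i in rows) for j in cols]
        info = [[sum(X[i][j] * X[i][k] * totals[i] * mu[i] * (1 - mu[i]) for i in rows)
                 for k in cols] for j in cols]
        beta = [b + s for b, s in zip(beta, solve(info, score))]
    mu = fitted(X, beta)
    dev = 0.0
    for y, n, m in zip(males, totals, mu):
        if y > 0:
            dev += y * math.log(y / (n * m))
        if y < n:
            dev += (n - y) * math.log((n - y) / (n * (1 - m)))
    return 2.0 * dev


def analyse(cells):
    """cells: list of (reference_index, period, males, total), with period 0 or 1.

    Returns (period_chi2 on 1 df, heterogeneity_chi2, heterogeneity_df).
    """
    k = 1 + max(c[0] for c in cells)
    males = [c[2] for c in cells]
    totals = [c[3] for c in cells]
    # Intercept plus indicator columns for references 1..k-1 (reference 0 is the baseline).
    base = [[1.0] + [1.0 if c[0] == r else 0.0 for r in range(1, k)] for c in cells]
    with_period = [row + [float(c[1])] for row, c in zip(base, cells)]
    dev_reference_only = deviance(base, males, totals)
    dev_additive = deviance(with_period, males, totals)
    return dev_reference_only - dev_additive, dev_additive, k - 1


if __name__ == "__main__":
    # Hypothetical counts for three references (NOT the data in Table I).
    cells = [(0, 0, 48, 100), (0, 1, 61, 100),
             (1, 0, 55, 100), (1, 1, 50, 100),
             (2, 0, 40, 100), (2, 1, 66, 100)]
    period, hetero, df = analyse(cells)
    print(f"period effect: chi-square = {period:.2f} on 1 df")
    print(f"heterogeneity: chi-square = {hetero:.2f} on {df} df")
```

Both quantities are referred to χ² distributions on the stated degrees of freedom. When the heterogeneity term is large, the period effect should be judged against that variation between references rather than against the binomial error alone.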
It has not been the intention in this letter to discuss the scientific proposition relating to conception time, but rather to highlight the shortcomings of the simple analysis based on a χ² test of the summed frequencies. Unfortunately, the bias introduced by this unsatisfactory approach will almost certainly be in the same direction: that of exaggerating the importance of any effect being investigated. This is because the 'error' implicit in the χ² test is very frequently a gross underestimate of the true error. Analysts need to resist the temptation simply to aggregate frequencies before carrying out a careful study of trial heterogeneity.
In view of the points outlined above, readers will appreciate that a lack of attention to study heterogeneity in meta-analyses will, more often than not, lead to exaggerated claims for effects for which there is little or no real statistical evidence. Fortunately, the widespread availability of statistical software for generalized linear model (GLM) analyses now permits a detailed, rigorous analysis of data of this sort.
References
Gray, R.H., Simpson, J.L., Bitto, A.C. et al. (1998) Sex ratio associated with timing of insemination and length of the follicular phase in planned and unplanned pregnancies during use of natural family planning. Hum. Reprod., 13, 1397–1400.
GENSTAT (1988) The GENSTAT V Reference Manual. Clarendon Press, Oxford, UK.
James, W.H. (1999) The status of the hypothesis that the human sex ratio at birth is associated with the cycle day of conception. Hum. Reprod., 14, 2177–2178.