1 Service de Biologie de la Reproduction-CECOS, Hôpital Cochin, 75014 Paris, France,
2 INSERM U292, CHU de Bicêtre, 94276 Le Kremlin Bicêtre, France,
3 CECOS Aquitaine (Dr J.J.Berjon), Hôpital Pellegrin, 33076 Bordeaux, France,
4 Centre de Stérilité Masculine, CECOS Midi-Pyrénées (Dr L.Bujan), Hôpital La Grave, 31052 Toulouse, France,
5 Unidad de Fertilidad, Procreation Medicamente Asistida, Clinica Marly, Santafe de Bogota, Colombia,
6 CECOS Basse Normandie (Dr A.Sauvalle), CHU Côte de Nacre, 14033 Caen, France,
7 Laboratoire d'Histologie, Faculté de Médecine, Sfax, Tunisia,
8 Centere za andrologico (Dr B.Zorn), Gynekoloska klinika, Univerzitetni Klinicni Center, Ljubljana, Slovenija,
9 Laboratoire de Biologie de la Reproduction, CECOS Lyon (Pr J.F.Guérin), Hôpital Edouard Herriot, 69373 Lyon, France,
10 Service d'Anatomie Pathologique, Antenne CECOS (Dr M.Roudier), CHU de Pointe-à-Pitre, 97159 Guadeloupe, France and
11 Laboratoire de Biologie de la Reproduction, CECOS Ouest (Pr D.Le Lannou), Hôpital de l'Hôtel-Dieu, 35000 Rennes, France
Abstract |
---|
Key words: quality control/routine semen analysis
Introduction |
---|
Recently published retrospective studies indicate secular and geographical differences in semen quality (Carlsen et al., 1992; Auger et al., 1995; Fédération CECOS et al., 1997; Swan et al., 1997). However, many possible methodological biases prevent final conclusions from being drawn. Among them, laboratory biases such as variability in semen analysis procedures and assessments have been mentioned (Brake and Krause, 1992; Tumon and Mortimer, 1992). For forthcoming studies in this area, it is especially important to evaluate whether differences in semen quality are real or reflect differences in measuring methods. Therefore, external quality assessment (EQA) and internal quality control (IQC), which are complementary processes, should be performed throughout the course of this type of investigation on semen quality.
In the few EQA schemes reported previously (Neuwinger et al., 1990; Matson, 1995; Cooper et al., 1999), samples of prepared semen were sent to the participating laboratories. This essential practice gives individual laboratories the opportunity to compare their own methods broadly against those of others. However, one limitation is that semen cannot be assessed on native samples or under the usual conditions of semen analysis. For example, assessment of sperm motility requires either frozen material diluted in a cryoprotectant or video recordings.
On the initiative of the Paris group, technicians and biologists involved in prospective multicentre studies on sperm production and quality were invited to meet for one week at the Reproductive Biology Laboratory, Hospital Cochin in Paris, in order to analyse native or prepared semen samples of various quality. This made possible an in-depth assessment of their intra- and inter-individual variability in the measurement of sperm concentration, motility and vitality.
Materials and methods |
---|
Experimental design
Intra- and inter-individual variability in routine semen analysis was assessed using semen samples obtained from healthy donors and infertile patients who gave informed consent for participation. All semen samples were collected by masturbation in the laboratory after 3–5 days of sexual abstinence. The semen characteristics evaluated in the present study were sperm concentration, the percentage of motile spermatozoa, and the percentage of living spermatozoa. Only the overall motility (grades a + b + c; World Health Organization, 1992) was considered in data analysis because only five teams evaluated the four WHO grades separately. No participant rigorously followed the WHO guidelines for routine semen analysis, and there were some differences in procedure among centres, as summarized in Table I. Except for the microscopes, each participant used their own equipment (e.g. counting chamber, diluents, pipettes and tips, dyes) and followed their usual working method.
Since the volume of the semen samples was not large enough, intra-individual variability could not be assessed from evaluations made on native material. For a blind evaluation, the samples for intra-individual assessment were coded and distributed at even intervals of time during the entire week. Sperm concentration was assessed from pools of five frozen samples kept at -20°C without cryoprotectant. Each participant made three evaluations per sample. Each participant evaluated at random the percentage of motile spermatozoa three times in five pools of frozen straws kept at -196°C with a cryoprotectant added. All straws for the motility assessment were thawed for 10 min at 37°C before the analysis. The percentage of live spermatozoa was assessed on five slides from patients previously used for IQC, with three evaluations per sample for each participant. Eosin-nigrosin-stained smears were prepared according to WHO procedures (WHO, 1992).
Data analysis
Inter-participant variability
Inter-participant variability in the assessment of sperm concentration and the percentages of motile and live spermatozoa was expressed as the coefficient of variation: CV (%) = 100 × SD/mean value. A random effect model (SAS mixed model software; SAS Institute Inc., Cary, NC, USA) was used to compare the values found by each participant for the three sperm characteristics studied. Correlation (Spearman's rank correlation test) was used to assess whether the inter-participant variability in the evaluation of sperm concentration and the percentages of motile and live spermatozoa was related to the average values of these characteristics. Bland-Altman plots (Bland and Altman, 1986) were used to illustrate the differences to the mean (%) for each participant and the 17 semen samples studied.
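The two quantities defined above can be illustrated with a minimal Python sketch. It assumes the sample standard deviation (n - 1 denominator), which the text does not specify, and uses made-up concentrations rather than study data; the function names are hypothetical.

```python
from statistics import mean, stdev

def inter_participant_cv(values):
    """CV (%) = 100 x SD / mean for one sample, across participants.
    Assumption: sample SD (n - 1); the source does not say which SD was used."""
    return 100.0 * stdev(values) / mean(values)

def percent_difference_to_mean(values):
    """Difference of each participant's value to the group mean, in % of that mean
    (the quantity plotted on the vertical axis of a Bland-Altman-style plot)."""
    m = mean(values)
    return [100.0 * (v - m) / m for v in values]

# Hypothetical sperm concentrations (10^6/ml) reported by four participants
# for one sample; illustrative numbers only, not data from the study.
sample = [40.0, 50.0, 60.0, 50.0]
cv = inter_participant_cv(sample)
diffs = percent_difference_to_mean(sample)
print(f"CV = {cv:.1f}%")
print("differences to the mean (%):", diffs)
```

Repeating the calculation for each of the 17 samples and averaging the results would give a mean inter-participant CV of the kind reported in the Discussion.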
Intra-participant variability
For the three sperm characteristics, intra-participant variability was expressed as the coefficient of variation: CV (%) = 100 × SD/mean value.
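As an illustration, the intra-participant CV can be computed from the three replicate evaluations each participant made per sample. The sketch below is hedged: the sample SD (n - 1) and the averaging of per-sample CVs into a single figure per participant are assumptions, and the replicate values are invented.

```python
from statistics import mean, stdev

def replicate_cv(replicates):
    """CV (%) of one participant's repeated evaluations of the same sample.
    Assumption: sample SD (n - 1)."""
    return 100.0 * stdev(replicates) / mean(replicates)

def mean_intra_cv(samples):
    """Average of the per-sample CVs for one participant.
    Assumption: a plain arithmetic mean over samples."""
    return mean([replicate_cv(r) for r in samples])

# Hypothetical data: one participant, two samples, three evaluations each.
participant = [
    [48.0, 50.0, 52.0],  # replicates for sample 1
    [18.0, 20.0, 22.0],  # replicates for sample 2
]
per_sample = [replicate_cv(r) for r in participant]
overall = mean_intra_cv(participant)
print(per_sample, overall)
```

The same absolute spread gives a larger CV for the sample with the lower mean, which is why CVs should be compared across samples of similar magnitude.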
Influence of training
The participants were allocated to two groups according to their level of practice in order to assess the possible role of training. The first group included eight participants who practised semen analysis daily and had at least 3 years of experience. The second group included four participants with recent training and/or episodic semen analysis practice. For these two groups, the differences in inter-individual variability were assessed by classifying the participants into three categories: (i) exact and accurate; (ii) exact and inaccurate or inexact and accurate; and (iii) inexact and inaccurate. The thresholds chosen for exactness were an average difference from the mean (%) for the 17 samples studied within 15% for sperm concentration and the percentage of motile spermatozoa, and within 10% for the percentage of live spermatozoa. The thresholds chosen for accuracy were an average SD of the difference to the mean for the 17 samples studied within 10% for the three sperm characteristics. The differences in intra-individual variability were expressed as the mean of the intra-individual CV in both groups. After data analysis, an individual detailed report with recommendations was sent to each participant; this allowed them to evaluate their own results in comparison with the mean values obtained by the group.
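The three-category classification can be sketched as follows. This is an interpretation, not the authors' procedure: it assumes that exactness is judged on the absolute value of the mean signed difference, that accuracy is judged on the sample SD of the per-sample differences, and that the thresholds are inclusive (the comparison symbol is not legible in the text). All names and data are hypothetical.

```python
from statistics import mean, stdev

def classify(diffs_pct, exact_limit, accurate_limit=10.0):
    """Classify one participant from per-sample differences to the mean (%).

    Assumptions (not confirmed by the source):
      - exactness uses |mean of signed differences| <= exact_limit
      - accuracy uses SD of the differences <= accurate_limit
    Returns "i", "ii" or "iii" as in the text's three categories.
    """
    exact = abs(mean(diffs_pct)) <= exact_limit
    accurate = stdev(diffs_pct) <= accurate_limit
    if exact and accurate:
        return "i"    # exact and accurate
    if exact or accurate:
        return "ii"   # only one of the two criteria met
    return "iii"      # inexact and inaccurate

# Hypothetical differences to the mean (%) for sperm concentration
# (exactness limit 15%); illustrative values only.
small_bias_small_spread = [2.0, -4.0, 5.0, -3.0]
large_bias_small_spread = [30.0, 25.0, 35.0, 30.0]
large_bias_large_spread = [40.0, -5.0, 45.0, 0.0]

results = [
    classify(small_bias_small_spread, exact_limit=15.0),
    classify(large_bias_small_spread, exact_limit=15.0),
    classify(large_bias_large_spread, exact_limit=15.0),
]
print(results)
```

For the percentage of live spermatozoa, the same function would be called with `exact_limit=10.0`.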
Results |
---|
Discussion |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
A mean inter-individual CV for sperm concentration of 22.9% was found for the 12 participants and the 17 semen samples studied. There was no significant difference in the values of sperm concentration obtained by the different participants using different dilution methods and counting chambers. In an external quality control study (Neuwinger et al., 1990), which included 10 experienced German laboratories evaluating eight sperm samples, the mean CV was 37.5%. This result was obtained even though the study was carried out on clean preparations of spermatozoa selected by swim-up, a condition that does not normally apply in routine semen analysis. From the data of the EQA conducted under the auspices of the British Andrology Society (Matson, 1995), the mean inter-individual coefficient of variation for sperm concentration assessment was calculated to be 64.7% for the technicians from the 20 laboratories, who were supposed to be trained for routine semen analysis and who evaluated 24 semen samples. In the current study, and in the German and British studies, the samples studied covered a wide range of sperm concentrations. The observed differences in CV might reflect a greater disparity in the equipment and procedure steps used for sperm concentration measurement in the British and German studies, which unfortunately were not reported in the publications. The differences in the British and German studies could also have resulted from additional factors of variation related to the mailing of the samples: the time between the collection and preparation of samples and their analysis might lead to the biological material being damaged. From the current study, it could be postulated that when sperm concentration is assessed on fresh samples, the inter-individual CV is lower than previously reported. Two studies have reported results of workshops organized on a similar principle to the present study. In the first (Jequier and Ukombe, 1983), 26 technicians and pathologists from medical laboratories participated, and a mean inter-individual CV of 44.3% for sperm concentration was found. However, only a single semen sample was studied (mean value 46.7 × 10⁶ spermatozoa/ml; range: 10–98 × 10⁶ spermatozoa/ml). In the second study (Jorgensen et al., 1997), technicians from four experienced teams involved in research on geographical variations of semen quality met for 1 week to analyse 26 semen samples. Although different equipment and procedures were used and the mean inter-individual CV for sperm concentration was not provided, the authors concluded that there was a remarkable consistency between teams for the vast majority of samples studied. From the present study, it could be said that deviations from the mean values or the intra-individual variations were not dependent on the equipment used or the procedure followed (data not shown), and that daily practice and training are important modulators of the variations observed between laboratories. However, inter-participant variation was unexpectedly greater, rather than lower (WHO, 1999), for high concentrations (Figure 1a), even though most participants counted a greater number of spermatozoa in those samples. This suggested that the different counting chambers used, as well as the different dilutions applied for high concentrations or the different pipettes used for dilution, contributed to the higher variation. This result illustrated the pressing need for standardized methods to minimize variations in sperm counting among laboratories. The present study also suggested that EQA in which the same semen samples are evaluated by various people at the same time yields lower variation than EQA using biological materials sent to various laboratories.
The inter-individual CV of sperm motility assessment was 21.8%, and therefore very similar to the CV found in an earlier study (Neuwinger et al., 1990). Since the assessments of overall motility in that study were made from material frozen with a cryoprotectant (which makes the evaluation more difficult), it might be supposed that the methodologies used were more homogeneous and/or the participants better trained. Very wide variations in the evaluations of motility were found in a more recent study (Jorgensen et al., 1997), where the methodologies for sperm motility assessment were heterogeneous. In the present study, there was also a considerable disparity in the methodology for the assessment of sperm motility. Sperm motility assessment is clearly influenced by the temperature or the depth of the chamber used (Le Lannou et al., 1992; Kraemer et al., 1998). However, there is no a priori reason why this should markedly influence the estimation of the overall motility (a + b + c WHO grades). Therefore, the major factors of variation are probably related to the amount of training of the observer: the results of the present trial for intra-individual variability revealed that experienced participants had a CV of 22.8% compared with 33.0% for participants recently trained and/or with episodic practice. However, it should be pointed out that these values expressed the overall within-participant variation for all participants: in Figure 4, it can be seen that there were substantial differences in intra-individual variation among participants, and from one sample to another. Nevertheless, intra-observer variability in assessing sperm motility appeared to be related to the amount of training of the observer. Low intra-individual variation in the evaluation of sperm motility (CV within 15%) was reported for highly trained technicians from the same laboratory (Neuwinger et al., 1990; D.Mortimer, personal communication). However, better reproducibility in the assessment of sperm motility could also depend on the natural ability of the observer for this subjective task, as was suggested in an earlier study (Dunphy et al., 1989).
The current study appears to be the first to report results of quality control in the assessment of the percentage of live spermatozoa. Because of the principle of the vitality test (immobilized spermatozoa, with or without staining) and its quantitative nature, a low variability was expected. The lowest inter- and intra-individual CV were found for this characteristic (17.5% and 13.1% respectively) in comparison with the CV found for the two other sperm characteristics studied. This result was obtained despite small variations in procedures among participants, and even though, in the intra-individual trial, some participants evaluated this characteristic on smears although they normally perform the test with a fresh drop of the stained semen deposited on a slide (see Table I). Because of the remarkable homogeneity found for the percentage of living spermatozoa (which can be further improved), this characteristic should be incorporated in repeated EQA (and of course IQC) schemes. Moreover, it could be useful to report the percentage of living spermatozoa in studies on secular and geographical variation in semen quality, because its measurement is probably subject to little confounding and because this characteristic reflects the maturation of spermatozoa in the male genital tract, which in turn influences their survival in the female genital tract and their fertilizing ability.
Past and present EQA and IQC raise an unsolved question in the absence of highly reproducible methods to assess semen quality, namely, what is the target value? As has been proposed in the UK NEQAS (United Kingdom National Quality Control Assessment Schemes, Sheffield, UK) in Andrology, it can be decided that the mean value obtained from highly experienced laboratories is the reference value (Cooper et al., 1999). However, previously reported studies (Neuwinger et al., 1990; Jorgensen et al., 1997) and the present study indicate that even experienced groups show a noticeable amount of disagreement for some characteristics. It has not been demonstrated that the mean value obtained by these teams provides the best reference point. Therefore, efforts should be made to develop reproducible objective methods in order to provide reliable target values, particularly for quality control schemes. Flow cytometry offers some prospects for sperm concentration assessment (Neuwinger et al., 1990). Unfortunately, there is currently no objective method which allows reproducible assessment of the percentage of motile spermatozoa. In particular, computer-assisted semen analysis (CASA), which is the sole technology offering the possibility to analyse sperm motion reliably (provided that there is rigorous control of all stages of the analysis; Kraemer et al., 1998), has not proved to be superior to visual estimation in terms of reproducibility of results. Expert groups in andrology do not recommend the use of CASA to assess percent motility of spermatozoa (Mortimer et al., 1995; ESHRE Andrology Special Interest Group, 1998). However, it should be pointed out that CASA might be very useful in quality control schemes to discriminate between the relative amounts of WHO grades a and b motile spermatozoa (Yeung et al., 1997), since assessment of such spermatozoa is a major source of variability among individuals and laboratories (Dunphy et al., 1989). Finally, no reproducible objective method has been proposed for the assessment of sperm vitality by microscopy, and the methods to distinguish between viable and non-viable cells using fluorescent dyes (e.g. propidium iodide) and flow cytometry applied to the evaluation of mammalian sperm viability (Garner et al., 1986; Auger et al., 1989) are not adapted for routine semen analysis. Furthermore, it was shown recently that, by microscopy, vital staining with propidium iodide gave different results from staining with eosin-nigrosin (Pintado et al., 2000).
Basic semen analysis courses and training are prerequisites for novices in the field to minimize their basal variability in assessment (Mortimer, 1994). Subsequently, regular IQC and EQA are needed to reduce the variability inherent to semen analysis practice, and therefore the differences between evaluations made by different laboratories. Discussion of the results with the biologists in charge of the laboratories is essential for motivating the participants and defining corrective measures if necessary. The positive effects of these measures have been reported previously (Björndahl and Kvist, 1998; Punjabi and Spiessens, 1998). The significant improvements in the evaluation of semen characteristics resulting from these strategies are particularly important in order to harmonize results between laboratories, and ultimately for the management of infertile couples.
IQC is also required for intra-centre studies on temporal trends in semen quality to provide evidence that the observed variations are real. Likewise, better agreement in semen assessment among laboratories is the basis for validating the conclusions of multicentre studies on differences in semen quality. Therefore, any future prospective study in this field should be based on standardized methods and should include internal and/or external quality assessments, depending on the type of study. For planned multicentre studies, a prestudy EQA should be performed, followed by corrective measures if necessary, as in a recent study of geographical variation in semen quality in Europe (Jorgensen et al., 1997). This is very useful when the same characteristic cannot be analysed centrally, as may be done for sperm morphology. Moreover, the initial EQA should be followed by repeated quality control throughout the study in order to identify any possible deviation in assessment. Such approaches offer the opportunity to adjust data in the statistical analysis to take into account the variations related to methodological factors.
Acknowledgments |
---|
Notes |
---|
References |
---|
Auger, J., Kunstmann, J.M., Czyglik, F. and Jouannet, P. (1995) Decline in semen quality among fertile men in Paris during the last 20 years. N. Engl. J. Med., 332, 281–285.
Björndahl, L. and Kvist, U. (1998) Basic semen analysis courses: experience in Scandinavia. In Ombelet, W., Bosmans, E., Vandeput, H. et al. (eds), Modern ART in the 2000s. Andrology in the Nineties. Parthenon Publishing Group, New York, pp. 91–101.
Bland, J.M. and Altman, D.G. (1986) Statistical methods for assessing agreement between two methods of clinical measurement. Lancet, i, 307–310.
Brake, A. and Krause, W. (1992) Decreasing quality of semen (letter). Br. Med. J., 305, 1498.
Carlsen, E., Giwercman, A., Keiding, N. and Skakkebaek, N.E. (1992) Evidence for decreasing quality of semen during past 50 years. Br. Med. J., 305, 609–613.
Clements, S., Cooke, I.D. and Barratt, C.L. (1995) Implementing comprehensive quality control in the andrology laboratory. Hum. Reprod., 10, 2096–2106.
Cooper, T.G., Neuwinger, J., Bahrs, S. and Nieschlag, E. (1992) Internal quality control of semen analysis. Fertil. Steril., 58, 172–178.
Cooper, T.G., Atkinson, A.D. and Nieschlag, E. (1999) Experience with external quality control in spermatology. Hum. Reprod., 14, 765–769.
De Jonge, C. (1998) Total quality management and the clinical andrology laboratory: essential partners. In Ombelet, W., Bosmans, E., Vandeput, H. et al. (eds), Modern ART in the 2000s. Andrology in the Nineties. Parthenon Publishing Group, New York, pp. 55–60.
Dunphy, B.C., Kay, R., Barratt, C.L.R. and Cooke, I.D. (1989) Quality control during the conventional analysis of semen, an essential exercise. J. Androl., 10, 378–385.
ESHRE Andrology Special Interest Group (1998) Guidelines on the application of CASA technology in the analysis of spermatozoa. Hum. Reprod., 13, 142–145.
Fédération CECOS, Auger, J. and Jouannet, P. (1997) Evidence for regional differences of semen quality among fertile French men. Hum. Reprod., 12, 740–745.
Garner, D.L., Pinkel, D., Johnson, L.A. and Pace, M.M. (1986) Assessment of spermatozoal function using dual fluorescent staining and flow cytometric analyses. Biol. Reprod., 34, 127–138.
Jequier, A.M. and Ukombe, E.B. (1983) Errors inherent in the performance of a routine semen analysis. Br. J. Urol., 55, 434–436.
Jorgensen, N., Auger, J., Giwercman, A. et al. (1997) Semen analysis performed by different laboratory teams: an intervariation study. Int. J. Androl., 20, 201–207.
Knuth, U.A., Neuwinger, J. and Nieschlag, E. (1989) Bias to routine semen analysis by uncontrolled changes in laboratory environment: detection by long-term sampling of monthly means for quality control. Int. J. Androl., 12, 375–383.
Kraemer, M., Fillion, C., Martin-Pont, B. and Auger, J. (1998) Factors influencing human sperm kinematic measurements by the Celltrak computer-assisted sperm analysis system. Hum. Reprod., 13, 611–619.
Le Lannou, D., Griveau, J.F., Le Pichon, J.P. and Quero, J.C. (1992) Effects of chamber depth on the motion pattern of human spermatozoa in semen or in capacitating medium. Hum. Reprod., 7, 1417–1421.
Matson, P.L. (1995) External quality assessment for semen analysis and sperm antibody detection: results of a pilot scheme. Hum. Reprod., 10, 620–625.
Michelmann, H.W. (1997) Quality management in the andrology laboratory. Int. J. Androl., 20 (Suppl. 3), 50–54.
Mortimer, D. (1994) Technician training and quality control aspects. Practical Laboratory Andrology. Oxford University Press, Oxford, pp. 337–347.
Mortimer, D., Shu, M.A. and Tan, R. (1986) Standardization and quality control of sperm concentration and sperm motility counts in semen analysis. Hum. Reprod., 1, 299–303.
Mortimer, D., Aitken, R.J., Mortimer, S.T. and Pacey, A.A. (1995) Workshop report: clinical CASA, the quest for consensus. Reprod. Fertil. Dev., 7, 951–959.
Neuwinger, J., Behre, H.M. and Nieschlag, E. (1990) External quality control in the andrology laboratory: an experimental multicenter trial. Fertil. Steril., 54, 308–314.
Pintado, B., de la Fuente, J. and Roldan, E.R.S. (2000) Permeability of boar and bull spermatozoa to the nucleic acid stains propidium iodide or Hoechst 33258, or to eosin: accuracy in the assessment of cell viability. J. Reprod. Fertil., 118, 145–152.
Punjabi, U. and Spiessens, C. (1998) Basic semen analysis courses: experience in Belgium. In Ombelet, W., Bosmans, E., Vandeput, H. et al. (eds), Modern ART in the 2000s. Andrology in the Nineties. Parthenon Publishing Group, New York, pp. 107–113.
Rowe, P.J., Comhaire, F.H., Hargreave, T.B. and Mellows, H.J. (1993) WHO Manual for the Standardized Investigation and Diagnosis of the Infertile Couple. Cambridge University Press, Cambridge, pp. 33–39.
Swan, S.H., Elkin, E.P. and Fenster, L. (1997) Have sperm densities declined? A reanalysis of global trend data. Environ. Health Perspect., 105, 1228–1232.
Tumon, A. and Mortimer, D. (1992) Decreasing quality of semen (letter). Br. Med. J., 305, 1228–1229.
World Health Organization (1992) WHO Laboratory Manual for the Examination of Human Semen and Sperm-Cervical Mucus Interaction, 3rd edn, Cambridge University Press, Cambridge.
World Health Organization (1999) WHO Laboratory Manual for the Examination of Human Semen and Sperm-Cervical Mucus Interaction, 4th edn, Cambridge University Press, Cambridge.
Wyrobek, A.J. (1983) Methods for evaluating the effects of environmental chemicals on human sperm production. Environ. Health Perspect., 48, 53–59.
Yeung, C.H., Cooper, T.G. and Nieschlag, E. (1997) A technique for standardization and quality control of subjective sperm motility assessments in semen analysis. Fertil. Steril., 67, 1156–1158.
Submitted on March 31, 2000; accepted on July 18, 2000.