Does the potential for selection bias in semen quality studies depend on study design? Experience from a study conducted within an infertility clinic

Russ Hauser1,2,3, Linda Godfrey-Bailey1 and Zuying Chen2

1 Department of Environmental Health, Occupational Health Program, Harvard School of Public Health, Boston, MA 02115 and 2 Vincent Memorial Obstetrics & Gynecology Service, Andrology Laboratory and In Vitro Fertilization Unit, Massachusetts General Hospital, Boston, MA 02114, USA

3 To whom correspondence should be addressed at: Occupational Health Program, Department of Environmental Health, Harvard School of Public Health, Building 1, room 1405, 665 Huntington Avenue, Boston, MA 02115, USA. Email: rhauser@hohp.harvard.edu


    Abstract
 
BACKGROUND: The low participation rates in human semen quality studies raise concern about the potential for differential participation based on semen quality (or a surrogate). To explore the potential for differential participation, we compared semen analysis results from study subjects with those of non-study subjects. METHODS: We obtained semen analysis results from 235 study subjects and retrospectively obtained results from a subset of 235 infertility clinic patients who were not study subjects but met the same eligibility criteria. The study was conducted at the Massachusetts General Hospital Infertility Clinic. All semen samples (study subjects and non-study subjects) were analysed for sperm concentration and motility by computer-aided semen analysis (CASA), and morphology was assessed using strict criteria. Semen analysis parameters for the non-study subjects were compared with the semen analysis results from study subjects. RESULTS: For all semen characteristics (sperm concentration, total sperm count, sperm motility and morphology), there were only marginal (non-significant) differences between study subjects and non-study subjects. CONCLUSIONS: Among men from an infertility clinic, we found no strong evidence of differential participation based on semen quality. This is reassuring since the potential for selection bias is of concern in semen quality studies. However, the potential for selection bias in other study designs remains unclear.

Key words: infertility/selection bias/semen quality


    Introduction
 
A recurring difficulty in conducting studies on human semen quality is the very low participation rate among eligible subjects (Cohn et al., 2002; Tielemans et al., 2002). This has heightened concern about differential participation of subjects based on semen quality. Selection bias may occur if participation is differentially based on semen quality, or a surrogate thereof, such as history of infertility, and also based on the exposure of interest. The exposure of interest can be the level of an environmental chemical, a lifestyle factor or even a birth history variable such as birth weight. It is important to emphasize that for selection bias to occur, participation must be based on both semen quality (or a surrogate) and the exposure of interest (or surrogate).
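The requirement that participation depend on both axes can be illustrated with a standard epidemiological identity that is not taken from this article: in a 2x2 setting, the odds ratio observed among participants equals the true odds ratio multiplied by a bias factor built from the four participation probabilities. The sketch below is only an illustration; the function name and all participation probabilities are hypothetical.

```python
def selection_bias_factor(p_participate):
    """Multiplicative bias in the exposure/outcome odds ratio.

    p_participate maps (exposed, poor_semen_quality) -> probability of
    participating, with both keys coded 0/1. A factor of 1.0 means the
    odds ratio among participants equals the odds ratio in the source
    population.
    """
    return (p_participate[(1, 1)] * p_participate[(0, 0)]) / (
        p_participate[(1, 0)] * p_participate[(0, 1)]
    )


# Hypothetical scenario 1: participation depends on semen quality only.
quality_only = {(1, 1): 0.6, (0, 1): 0.6, (1, 0): 0.3, (0, 0): 0.3}

# Hypothetical scenario 2: exposed men with poor semen quality are
# especially likely to participate (participation depends on both axes).
quality_and_exposure = {(1, 1): 0.8, (0, 1): 0.5, (1, 0): 0.3, (0, 0): 0.3}

factor_quality_only = selection_bias_factor(quality_only)
factor_both = selection_bias_factor(quality_and_exposure)
```

Under these made-up probabilities the first scenario leaves the odds ratio unchanged, whereas the second inflates it, which is the situation the text warns about.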

To explore the potential for selection bias, one would ideally collect data on semen quality from both study subjects and those who refuse to participate. Since this is usually not possible, we generally obtain data on characteristics that are considered surrogates for semen quality and compare participants with non-participants. Characteristics that are frequently measured include age, smoking history, educational level, and medical and reproductive history (Eustache et al., 2004; Muller et al., 2004).

Several studies have explored whether participants in semen quality studies differ from non-participants. Since, by definition, semen analysis results on non-participants are generally not available, innovative designs have been used to explore the potential for differential participation. In a French study on male partners of pregnant women, information was obtained from a group of men agreeing to collect a semen sample and complete a questionnaire (group A), a group of men completing only the questionnaire (group B) and from men refusing to participate altogether (group C) (Eustache et al., 2004). The participation rate for men in group A was only 15.8%. There were no differences in the ages and socio-professional status across the three groups. However, a higher proportion of men giving a semen sample had a high educational level.

Time to pregnancy was also not significantly different across the groups, although there was a higher proportion of couples in group A taking longer than 12 months to conceive than in group B. Pregnancy outcomes were also similar in the three groups. Interestingly, although not statistically significant, the study found a higher proportion of group A volunteers with a history of urogenital diseases (such as cryptorchidism, infection and varicocele). However, semen characteristics were not significantly different between men with and without a history of urogenital disease. This suggests that participation may not relate to semen characteristics. Overall, the authors concluded that despite the low participation rates, there did not appear to be major differences in characteristics between the three groups of subjects. There were minor differences in education and urogenital history; therefore, data on these characteristics need to be collected and adjusted for in statistical analyses as potential confounders.

In another recently published French study, Muller and co-workers (2004) explored differential participation among three groups of men who were partners of pregnant women. Group 1 men completed a questionnaire and produced a semen sample (13% participation rate), group 2 men completed only the questionnaire and group 3 men refused the questionnaire and did not produce a semen sample. Smokers and poorly educated men were more likely to refuse to participate than non-smokers and highly educated men. Semen providers were more likely to have reported unfavourable pregnancy outcomes (odds ratio 1.68, 95% confidence interval 1.14–2.49) as compared with men only completing the questionnaire. However, time to pregnancy (TTP) was similar across groups, suggesting that fecundity was similar across groups.

Muller and co-workers (2004) found that there were differences in socio-demographic characteristics by level of participation, and therefore men who agree to participate may not represent the population from which they originate. Although they found participation differed based on education and reproductive history, because they did not have semen analysis results on men in groups 2 and 3 they were unable to determine whether participation was also related to the other axis (i.e. semen quality). Based on differences of unfavourable pregnancy outcomes and maternal smoking among the men in group 1, they hypothesized that semen quality may be lower for group 1 men. The TTP results, however, did not support this, although TTP may not have been sensitive enough to detect changes in semen quality.

In a subset of subjects from the Child Health and Development Studies (CHDS), Cohn and co-workers (2002) explored potential participation bias. Between 1960 and 1963, while pregnant, the mothers of the sons targeted for study recruitment were enrolled in the CHDS. Sons were 36–39 years of age at the time of attempted recruitment by Cohn et al. The projected participation rate was 53% for semen donation. Sons who agreed to participate after the initial mailing (respondents) and sons who were subsequently traced and recruited after the initial mailing (initial non-respondents) differed in semen parameters, including sperm concentration and the percentage of sperm with normal morphology. In addition, race, marital history and fertility history differed between respondents to the initial mailing and recruited non-respondents. The design of the CHDS study was population based, and therefore participation patterns and motivations may differ from studies conducted within infertility clinics (our study) and studies conducted among male partners of pregnant women (Eustache et al., 2004; Muller et al., 2004).

Another design frequently used in semen quality studies is to conduct the study within an occupational cohort. These studies are also difficult to conduct due to traditionally low participation rates. Larsen and co-workers (1998) analysed data from three Danish occupational sperm studies and evaluated how age and subfertility were related to participation and providing a semen sample. Participation rates varied from a high of 62 and 74% among the exposed and comparison groups, respectively, in the greenhouse workers study, to 28% among traditional farmers and 49% among organic farmers in the farmers' study. In the metalworkers and electricians study, participation rates were 38% in the exposed group and 35% in the unexposed group. In these studies, age and subfertility were found to be strong predictors of participation. Participation was higher among men <40 years of age and among men who experienced an infertile period. The effect of infertility was modified by occupational exposure status, producing a tendency for participation to be based on both subfertility, a surrogate for semen quality, and occupational exposure. This may introduce selection bias in studies conducted within these occupational cohorts.

Several studies explored whether hormone levels, specifically testosterone, may be useful to distinguish between participants and non-participants and as a surrogate for semen quality (Mullard et al., 1999; Andersen et al., 2000). Participation rates in the Mullard and Andersen studies were 35 and 18%, respectively. Andersen and co-workers compared serum testosterone levels in semen donors and non-donors among military recruits in a study in Denmark. They used the same cut-off as Mullard et al. (testosterone level of 400 ng/dl) and found similar mean testosterone levels among donors and non-donors. Four percent of semen donors had testosterone levels <400 ng/dl compared with 7% of non-donors. Mullard et al. also found a lower percentage of men with testosterone levels <400 ng/dl among semen donors (0%) compared with non-donors (7%).

Because it is difficult, and generally not possible, to measure semen quality among non-participants, the published studies described above relied on measuring factors that may be related to semen quality, such as difficulty having a child or medical reproductive history. Some studies relied on comparisons of demographic or lifestyle characteristics between participants and non-participants. Demographic characteristics included age and educational level, and lifestyle factors included smoking history.

Because our study was conducted in an infertility clinic, we had the unique opportunity to explore directly whether semen analysis results of study subjects differed from the infertility clinic population from which the study subjects were recruited. The infertility clinic population includes three sets of subjects: (i) subjects that were approached and agreed to participate in the study (study subjects); (ii) subjects that were approached and refused to participate (refusals); and (iii) subjects not approached because of scheduling conflicts or other reasons. Those not approached represent a heterogeneous group consisting of those who would have refused if given the opportunity and those who would have participated if given the opportunity. Despite our ability to compare semen quality results of study subjects with the clinic population from which they arose, we were not able to identify a pure group of non-participants (i.e. subset number ii as described above). We were therefore limited to a comparison of study subjects and the population of infertility clinic patients that included both refusals and subjects not approached.


    Materials and methods
 
Because all men visiting the Andrology Laboratory provide a semen sample as part of their clinical care at Massachusetts General Hospital (MGH), we had access to semen analysis results from both men who agreed to participate in our study and all other men presenting to the infertility clinic. Men were eligible for study recruitment if they met the following criteria: (i) were between the ages of 18 and 55 years; (ii) were referred by a study-approved physician; (iii) had not had a vasectomy; and (iv) were not receiving hormonal treatments.

Men who agreed to participate provided consent through the research nurse and were asked to provide a urine sample, to allow us to draw a blood sample and to allow us access to the results of the semen analysis from their clinic visit. Approximately 65% of men approached agreed to participate fully. Fewer than 5% agreed to partial participation, which included not agreeing to provide a blood sample. We determined the participation rate of the study by tracking the recruitment contacts that the research nurse had with men in the infertility clinic. Men who verbally declined to participate in the study were noted as declining. An additional category of men, those who were not ready to make a decision about study participation or who had not been approached by the nurse, was not included in the total participation rate. Men who had not decided about the study after a first meeting with the research nurse were eligible to be approached again about the study at subsequent clinic visits. If this second contact led to either participation or declining to participate, the participation rate was adjusted accordingly.
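The tracking rule described above can be sketched as follows. This is a minimal illustration, not the study's actual tracking system: the status labels, the function names and the choice to compute the rate as agreed / (agreed + declined) are assumptions, and partial participation is not modelled.

```python
from collections import Counter


def latest_status(contacts):
    """Keep only the most recent recorded status for each man.

    contacts is a chronologically ordered list of (man_id, status) pairs;
    a later contact overrides an earlier one, mirroring the adjustment
    made after a second contact.
    """
    status_by_man = {}
    for man_id, status in contacts:
        status_by_man[man_id] = status
    return status_by_man


def participation_rate(contacts):
    """Share of men who agreed among those who agreed or declined.

    Men who are still undecided or were never approached are left out
    of the denominator (hypothetical labels 'undecided', 'not_approached').
    """
    counts = Counter(latest_status(contacts).values())
    decided = counts["agreed"] + counts["declined"]
    if decided == 0:
        return None
    return counts["agreed"] / decided


# Hypothetical contact log: m1 is undecided at first, then agrees later.
example_contacts = [
    ("m1", "undecided"),
    ("m2", "agreed"),
    ("m3", "declined"),
    ("m1", "agreed"),
    ("m4", "not_approached"),
]
example_rate = participation_rate(example_contacts)
```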

To explore the potential for differential participation based on semen analysis results, we retrospectively and anonymously obtained semen analysis results from a subset of infertility clinic patients who were not study subjects. The MGH and Harvard School of Public Health (HSPH) Human Subject Committees approved this study. Since semen analysis results were collected anonymously from the subset of infertility clinic patients, very limited information on characteristics of the infertility clinic patients was available. For instance, we did not have information on abstinence time, smoking history or race. The only information available to us was that needed to confirm eligibility: the referring physician, the patient's age and a limited medical history [i.e. post-vasectomy (yes/no) and receiving hormonal treatments (yes/no)]. Since results were collected anonymously, none of this information was recorded for the non-study subjects.

Semen analysis data from men visiting the infertility clinic were readily available for the years 2001 and 2002 using the Andrology Laboratory records. Using the 2001–2002 database of subjects recruited into the study, the date of enrollment for each study subject was determined. The next potentially eligible non-study subject was identified. Potentially eligible non-study subjects included all men that visited the Andrology Laboratory for semen analysis and who met the same eligibility criteria as the study subjects. We attempted to match on the date of enrollment of the study subject. However, if an eligible non-study subject on the same date was not found, we selected the first eligible non-study subject on the following day. A list of the selected non-study subjects was made, and as subsequent men were selected they were checked against both the study subject list and the non-study subject list to prevent duplication. Once a final list was prepared, the semen analysis results were extracted from the Andrology Laboratory file. We then removed the identifiers (name and medical record no.) and substituted an anonymous study ID number.
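The selection procedure above can be sketched roughly as follows. This is an illustrative reading of the text rather than the procedure actually used: the record layout and function name are hypothetical, and the search simply continues to later dates when nobody eligible is found on the enrollment date, whereas the text only mentions the following day.

```python
from datetime import date


def select_non_study_subjects(enrollment_dates, visits, study_ids):
    """Pick one eligible non-study subject per study-subject enrollment date.

    visits is a list of (patient_id, visit_date, eligible) tuples from a
    hypothetical laboratory log. For each enrollment date, the first
    eligible patient seen on or after that date is taken, skipping study
    subjects and anyone already selected (to prevent duplication).
    """
    ordered_visits = sorted(visits, key=lambda visit: visit[1])
    selected = []
    already_chosen = set()
    for enrolled in sorted(enrollment_dates):
        for patient_id, visit_date, eligible in ordered_visits:
            if visit_date < enrolled or not eligible:
                continue
            if patient_id in study_ids or patient_id in already_chosen:
                continue
            selected.append(patient_id)
            already_chosen.add(patient_id)
            break
    return selected


# Hypothetical laboratory log and enrollment dates.
example_visits = [
    ("a", date(2001, 3, 1), True),
    ("s1", date(2001, 3, 1), True),   # a study subject, never selected
    ("b", date(2001, 3, 2), False),   # fails the eligibility criteria
    ("c", date(2001, 3, 2), True),
    ("a", date(2001, 3, 5), True),    # repeat visit by a selected man
    ("d", date(2001, 3, 6), True),
]
example_selection = select_non_study_subjects(
    [date(2001, 3, 1), date(2001, 3, 1), date(2001, 3, 5)],
    example_visits,
    study_ids={"s1"},
)
```

After selection, identifiers would be replaced with anonymous study ID numbers, as described in the text; that step is omitted here.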

Semen analysis
Semen samples were produced by masturbation into a sterile plastic specimen cup. The sample was liquefied at 37°C for 20 min prior to analysis. Men were instructed to abstain from ejaculation for 48 h prior to producing the semen sample. Since the semen analysis results from the non-study subjects were collected anonymously, we were unable to confirm abstinence time.

All semen samples (study subjects and non-study subjects) were analysed by Z.C. (or her trained technician) for sperm concentration and motion parameters by computer-aided semen analysis (CASA; Hamilton-Thorn Version 10HTM-IVOS, Beverly, MA). Setting parameters and the definition of measured sperm motion parameters for CASA were established by Hamilton-Thorn Company [frames acquired, 30; frame rate, 60 Hz; straightness (STR) threshold, 80.0%; medium VAP cut-off, 25.0 µm/s; and the duration of the tracking time, 0.38 s]. To measure both sperm concentration and motility, aliquots of semen samples (5 µl) were placed into a pre-warmed (37°C) Makler counting chamber (Sefi-Medical Instruments, Haifa, Israel). A minimum of 200 sperm from at least four different fields were analysed from each specimen. The percentage of motile sperm was defined as WHO grade ‘a’ sperm (rapidly progressive with a velocity ≥25 µm/s at 37°C) plus ‘b’ grade sperm (slow/sluggish progressive with a velocity ≥5 µm/s but <25 µm/s).
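The definition of percentage motile sperm given above can be expressed as a small calculation. This sketch is a simplification for illustration only: it classifies sperm by velocity alone using the stated thresholds, ignores the progressiveness judgement made by the CASA system, and uses hypothetical function names and velocities.

```python
def who_grade(velocity_um_per_s):
    """Assign a simplified WHO motility grade from velocity alone.

    'a': velocity >= 25 um/s; 'b': velocity >= 5 but < 25 um/s;
    everything slower is lumped together as 'other'.
    """
    if velocity_um_per_s >= 25:
        return "a"
    if velocity_um_per_s >= 5:
        return "b"
    return "other"


def percent_motile(velocities):
    """Percentage of sperm graded 'a' or 'b' among all sperm tracked."""
    if not velocities:
        raise ValueError("at least one tracked sperm is required")
    motile = sum(1 for v in velocities if who_grade(v) in ("a", "b"))
    return 100.0 * motile / len(velocities)


# Hypothetical velocities (um/s) for eight tracked sperm; a real analysis
# would use at least 200 sperm from at least four fields.
example_percent = percent_motile([30, 25, 10, 5, 4.9, 0, 0, 0])
```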

Using the ‘feathering’ method (World Health Organization, 1999), at least two slides were made for each fresh semen sample. The resulting thin smear was allowed to air dry for 1 h before staining with a Diff-Quik staining kit (Dade Behring AG, Düdingen, Switzerland). Morphological assessment was performed with a Nikon microscope using an oil immersion 100× objective (Nikon Company, Tokyo, Japan). Spermatozoa were assessed and scored as normal or abnormal using the strict criteria of Kruger et al. (1988). Results were expressed as the percentage of normal spermatozoa.

Because covariate information was not available for the non-study subjects, all results are unadjusted. Because distributions of sperm concentration and total sperm count were skewed, Wilcoxon signed rank tests were used to compare these semen characteristics across study and non-study subjects. t-tests were performed to compare sperm motility and morphology across study and non-study subjects. χ² tests were used to compare the percentage of study and non-study subjects who had semen characteristics below the reference range. All statistical analyses were conducted using Statistical Analysis Software (SAS), version 8.2 (SAS Institute Inc., Cary, NC).
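As an illustration of the comparison of proportions below the reference range, the following sketch computes a Pearson χ² statistic for a 2x2 table using only the Python standard library. It is not the SAS analysis used in the study: the counts are hypothetical, no continuity correction is applied (the text does not say whether one was used), and the function name is made up for this example.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Rows are the two groups; columns are 'below reference' and
    'not below reference'. Returns (statistic, p_value) with 1 degree
    of freedom and no continuity correction.
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column total must be positive")
    statistic = n * (a * d - b * c) ** 2 / denominator
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: 40 of 235 study subjects and 38 of 235 non-study
# subjects below the reference value for some semen parameter.
example_statistic, example_p = chi_square_2x2(40, 195, 38, 197)
```

With counts this similar the statistic is close to zero and the p-value is large, which is the pattern one would expect when the two groups have nearly identical proportions.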


    Results
 
Semen analysis parameters for 235 non-study subjects were compared with the semen analysis results from 235 study subjects (Table I). For all semen characteristics (sperm concentration, total sperm count, sperm motility and morphology), there were only marginal non-significant differences between study subjects and non-study subjects. For instance, median (25th and 75th percentiles) sperm concentrations (×10⁶/ml) were 68.8 (28.0, 148.9) for study subjects and 72.8 (36.4, 134.6) for non-study subjects. The mean (SD) sperm motility (percentage motile) was 46.8 (25.6) for study subjects and 45.6 (24.6) for non-study subjects. The mean (SD) morphology (percentage normal) was 6.7 (4.5) for study subjects and 6.6 (4.2) for non-study subjects.
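The size of these differences can be gauged with a rough recomputation from the reported means and SDs. The sketch below is not the authors' analysis: it computes an unpooled two-sample t statistic from summary statistics and, as a simplifying assumption, approximates the two-sided p-value with the normal distribution, which is reasonable only because each group has 235 men. Function names are hypothetical.

```python
import math


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic (unpooled variances) from summary statistics."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


def approx_two_sided_p(t_statistic):
    """Normal approximation to the two-sided p-value (large samples only)."""
    return math.erfc(abs(t_statistic) / math.sqrt(2))


# Means and SDs as reported in the text, n = 235 per group.
t_motility = t_from_summary(46.8, 25.6, 235, 45.6, 24.6, 235)
t_morphology = t_from_summary(6.7, 4.5, 235, 6.6, 4.2, 235)
p_motility = approx_two_sided_p(t_motility)
p_morphology = approx_two_sided_p(t_morphology)
```

Both statistics come out well below conventional significance thresholds, consistent with the description of the differences as marginal and non-significant.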


 
Table I. Semen characteristics for study subjects (n=235) and non-study subjects (n=235)

 
The percentage of subjects with below reference semen parameters was nearly identical for the study subjects and non-study subjects. This suggests that in analyses using categorical semen parameters, there would be no difference between study subjects and non-study subjects.


    Discussion
 
Among men from an infertility clinic, we did not find strong evidence of differential participation based on semen quality. This is reassuring since participation rates are generally low (<50%) in semen quality studies, raising concerns with the potential for selection bias if there was differential participation based on both semen quality and the exposure of interest. However, despite our results, the potential for selection bias in other study designs, as discussed below, remains unclear.

One of the advantages of conducting a study in an infertility clinic is that subject motivation is high and participation rates would be expected to be higher than among general population studies. Although higher participation rates generally improve the validity of the study by lowering concern with selection bias, the 65% participation rate in the present study is only moderately high and therefore does not ensure that selection bias is not a threat to the validity of the study. Because participation rates can only provide limited insight into the likelihood of selection bias, it is important to collect data that allow one to assess empirically whether there is differential participation based on both semen quality and the exposure of interest. In the present study, we were able to collect semen quality data on participants and non-participants. However, the present study was limited since it lacked exposure information on non-participants, as well as socio-demographic information on non-participants.

As described earlier, several studies have explored whether participants in semen quality studies differ from non-participants (Larsen et al., 1998; Mullard et al., 1999; Andersen et al., 2000; Cohn et al., 2002; Eustache et al., 2004; Muller et al., 2004). These studies compared demographic, medical or lifestyle factors of participants with those of non-participants.

Results across these studies were not entirely consistent; some studies found evidence of differential participation based on demographic or lifestyle characteristics, medical history and semen quality (Larsen et al., 1998; Cohn et al., 2002; Muller et al., 2004), while others did not (Eustache et al., 2004; this study). Upon closer inspection, the design of the study may be predictive of whether participation was likely to be differentially related to semen quality or the characteristics of the subjects.

Studies among men who were partners of pregnant women, referred to as ‘fertile male studies’ (Eustache et al., 2004; Muller et al., 2004), and studies among men who were partners in an infertile couple, referred to as ‘infertile male studies’ (this study), generally had lower potentials for differential participation. In these study designs, the men targeted for recruitment represent homogeneous groups in terms of their fertility and also in terms of their concern with infertility. Concern with infertility is likely to be low for men in ‘fertile male studies’ and high for men in ‘infertile male studies’. Because a man's concern with infertility is similar within the targeted group of ‘fertile men’ (or ‘infertile men’), this may minimize the likelihood of differential participation. For example, in the present study, because all men were visiting an infertility clinic as a partner in an infertile couple, they all had similar concerns with infertility. Thus participation was likely to be independent of the man's concern with infertility. Because concern with infertility may be related to semen quality, participation would probably be independent of the man's semen quality. We recognize that the level of concern will vary even among men visiting an infertility clinic; however, on average they are expected to have higher levels of concern than men not visiting an infertility clinic or men who have fathered a child.

Studies among men from the general population (Cohn et al., 2002) or from occupational groups (Larsen et al., 1998) generally had higher potentials for differential participation related to semen quality or demographic and lifestyle factors. This may be because these designs targeted a more heterogeneous group of men in terms of their fertility. In these designs, among the men targeted for recruitment, there may be large differences in the man's concern with infertility since some men are fertile (low concern) and others are infertile (high concern). Therefore, there may be differential participation related to concerns with infertility. If concerns with infertility are related to poor reproductive history or poor semen quality, this may lead to the potential for bias.

In conclusion, in the present study within an infertility clinic population, there was little evidence of differential participation based on semen quality. However, it is unclear whether there may be differential participation in studies using other designs. When assessing the likelihood and magnitude of bias related to differential participation in semen quality studies, it is important to consider how the design impacts the potential for bias. Two studies may have equally low rates of participation but, because their design differs, the likelihood of bias may also differ. In addition to concerns with how the study design impacts the potential for differential participation and bias, the design also has implications for the external validity (generalizability) of the results. In studies on infertile or fertile men, generalizability may be more restricted than in studies on men from the general population. Although a detailed discussion of generalizability is beyond the scope of this study, it is important to consider both the internal validity (bias) and external validity (generalizability) when designing human semen quality studies. Further work in these areas will help identify more optimal designs to study predictors of semen quality.


    Acknowledgements
 
The authors thank the staff of the Vincent Memorial Obstetrics and Gynecology Service Andrology Laboratory and In Vitro Fertilization Unit at Massachusetts General Hospital, Computer Programmers, Ms Lucille Pothier and Ms Janna Frelich, and Research Assistants Ms Ana Trisini and Mr Ramace Dadd. We thank Dr John Meeker for his thoughtful comments on the draft of the manuscript. This study was supported by grants ES09718 and ES00002 from the National Institute of Environmental Health Sciences, NIH.


    References
 
Andersen A, Jorgensen N, Andersson A, Carlsen E, Skakkebaek N, Jensen TK, Keiding N and Swan S (2000) Serum levels of testosterone do not provide evidence of selection bias in studies of male reproductive health. Epidemiology 11, 232–233.

Cohn BA, Overstreet JW, Fogel RJ, Brazil CK, Baird DD and Cirillo PM (2002) Epidemiologic studies of human semen quality: considerations for study design. Am J Epidemiol 155, 664–671.

Eustache F, Auger J, Cabrol D and Jouannet P (2004) Are volunteers delivering semen samples in fertility studies a biased population? Hum Reprod 19, 2831–2837.

Kruger TF, Acosta AA, Simons KF, Swanson RJ, Matta JF and Oehninger S (1988) Predictive value of abnormal sperm morphology in in vitro fertilization. Fertil Steril 49, 112–117.

Larsen SB, Abell A and Bonde JP (1998) Selection bias in occupational sperm studies. Am J Epidemiol 147, 681–685.

Mullard A, Wirth JJ, Karmaus W and Paneth N (1999) Testosterone level as a potential selection bias for semen donors in assessing population fertility. Epidemiology 10, 467–468.

Muller A, De La Rochebrochard E, Labbe-Decleves C, Jouannet P, Bujan L, Mieusset R, Le Lannou D, Guerin JF, Benchaib M, Slama R et al. (2004) Selection bias in semen studies due to self-selection of volunteers. Hum Reprod 19, 2838–2844.

Tielemans E, Burdorf A, te Velde E, Weber R, van Kooij R and Heederik D (2002) Sources of bias in studies among infertility clients. Am J Epidemiol 156, 86–92.

World Health Organization (1999) WHO Laboratory Manual for the Examination of Human Semen and Sperm–Cervical Mucus Interaction. 4th edn. Cambridge University Press, Cambridge.

Submitted on February 1, 2005; resubmitted on April 19, 2005; accepted on April 22, 2005.




