1 Department of Epidemiology, University of Michigan, Ann Arbor, MI.
2 Department of Biostatistics, University of Michigan, Ann Arbor, MI.
Received for publication April 10, 2003; accepted for publication November 6, 2003.
ABSTRACT |
---|
epidemiologic methods; menstrual cycle; models, statistical; prospective studies; sample size; sampling studies
INTRODUCTION |
---|
Prospective menstrual calendars are considered the most reliable method for collecting data on menstrual function; however, little information is available about optimal sampling strategies for prospective studies of menstrual function, that is, how many women are needed and for how long they should be followed. Studies of menstrual function typically focus either on comparing differences in mean or variance of cycle length with respect to host and environmental factors, such as ethnicity, smoking, human immunodeficiency virus serostatus, or body size (6–10), or on characterizing how mean and variance of cycle length change across age or during periods of transition (11–14). With the exception of a few truly longitudinal cohorts (12, 13), studies of menstrual function have generally been short-sequence prospective studies whose duration ranges from 6 months to 2 years and whose sample sizes range from 100 to 500 women. Whether these studies are sufficiently powered has not been adequately evaluated; sample size and study duration are often driven by feasibility and cost considerations.
Although longitudinal methods for estimating sample size exist (15, 16), most investigators do not have access to the longitudinal menstrual data needed to apply these methods. Using data from the Tremin Trust, we determined sampling strategies for prospective studies of menstrual function during three spans of reproductive life. In this paper, we first consider sampling strategies for studies that aim to assess differences in mean cycle length with respect to an exposure. Next, we discuss sample size requirements for assessing changes in mean cycle length across the reproductive life span.
MATERIALS AND METHODS |
---|
Menstrual calendars were used to record the days on which women experienced bleeding. Each calendar covered a 1-year period. Women completed a short questionnaire at the end of each year to collect information regarding marital status, medical treatment for menstrual difficulties, and pregnancies as well as to identify their willingness to continue participating. Questions about approaching menopause were added in 1952, and questions pertaining to oral contraceptive use and surgery were added in 1963.
Definitions recommended by the World Health Organization were used to summarize the menstrual diary data (17). Per these definitions, a bleeding episode is a period of consecutive bleeding days, and a bleeding-free interval is a period of consecutive bleeding-free days. A menstrual segment is defined as a bleeding episode and the subsequent bleeding-free interval. The term menstrual segment is analogous to the term menstrual cycle but acknowledges that diary data cannot distinguish between menstrual and nonmenstrual bleeds. Additionally, a criterion that a menstrual segment had to include at least 2 consecutive days in the bleeding-free interval was applied to avoid unusually short segments. A single bleeding-free day occurring between 2 bleed days was considered a bleed day. Pregnancy intervals and the first two cycles after a birth and the first cycle after a spontaneous abortion were coded as nonmenstrual intervals.
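The segment definitions above can be illustrated with a minimal Python sketch. It assumes the diary is available as one boolean per calendar day (`True` for a bleed day), which is a hypothetical encoding rather than the format used in the study, and it does not handle the pregnancy and postpartum exclusions. The function names are illustrative only.

```python
from typing import List


def fill_single_day_gaps(bleed: List[bool]) -> List[bool]:
    """Recode a single bleeding-free day flanked by bleed days as a bleed day."""
    filled = list(bleed)
    for i in range(1, len(bleed) - 1):
        # Check against the original diary so adjacent gaps are judged independently.
        if not bleed[i] and bleed[i - 1] and bleed[i + 1]:
            filled[i] = True
    return filled


def segment_lengths(bleed: List[bool]) -> List[int]:
    """Return the length in days of each complete menstrual segment.

    A segment runs from the first day of a bleeding episode to the day
    before the next episode begins (episode plus bleeding-free interval).
    Assumption: an episode already under way on the first diary day is
    ignored because its true start is unknown.
    """
    days = fill_single_day_gaps(bleed)
    starts = [i for i in range(1, len(days)) if days[i] and not days[i - 1]]
    return [later - earlier for earlier, later in zip(starts, starts[1:])]


# Hypothetical diary: 1 free day, a 5-day episode containing a one-day gap,
# 23 free days, a 4-day episode, 24 free days, then the start of a new episode.
diary = (
    [False]
    + [True] * 3 + [False] + [True]
    + [False] * 23
    + [True] * 4
    + [False] * 24
    + [True]
)
lengths = segment_lengths(diary)
```

Because single-day gaps are filled before segments are counted, every bleeding-free interval in the result is at least 2 days long, which matches the criterion described above.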
Sampling strategies for three spans of reproductive life were investigated. The three age strata, 18–25 years of age (young adult), 26–39 years of age (adult), and 40–53 years of age (menopausal transition), were necessary because certain sample size calculations assume a linear relation with age, and the association between age and cycle length is nonlinear. The age strata were defined on the basis of prior investigations of menstrual function across the life span suggesting that the relation of cycle length with age within each of these strata is approximately linear (11, 14). The sample sizes for the three age strata were 963, 830, and 435 women for 18–25 years of age, 26–39 years of age, and 40–53 years of age, respectively; 194 women in the last stratum were followed through menopause.
Estimating the effect of host and environmental factors on mean cycle length
To evaluate sampling strategies for assessing differences in mean menstrual cycle length between two exposure groups, we estimated the magnitude of the detectable difference by using the following equation (16):

$$N = \frac{2\,(z_{\alpha} + z_{Q})^{2}\,\sigma^{2}\,[1 + (P - 1)\rho]}{P\,d^{2}} \qquad (1)$$

where $N$ is the number of subjects in each exposure group, $\sigma^{2}$ is the total variation (between plus within) in mean cycle length, $\rho$ is the correlation among repeated observations for individual women, $d$ is the smallest difference to be detected, $P$ is the number of cycles, $\alpha$ is the probability of a type I error assumed to be 5 percent, and $Q$ is the power of the statistical test assumed to be 80 percent; $z_{\alpha}$ and $z_{Q}$ denote the standard normal quantiles corresponding to $\alpha$ and $Q$. The parameters $\sigma^{2}$ and $\rho$ were estimated for each age range by fitting random intercept mixed models using the SAS Proc Mixed procedure (18). Age was modeled as dummy variables for each year. Estimates of $\sigma^{2}$ and $\rho$ used for sample size estimation were based on median values for the given age ranges. Because these models assume normality, models were run for cycles up to 180 days in length to create roughly normal distributions within each age range. The distribution of cycle lengths remained skewed in the age category 40–53 years; thus, a natural log transformation was applied to cycles for this age range. We also estimated the number of subjects needed to detect specified differences in mean cycle length between exposure groups by using equation 1. Note that equation 1 pertains to subject-level covariates or exposures, which do not change over time, and does not directly apply to the situation in which an exposure is a time-varying covariate.
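As a rough illustration of how equation 1 can be applied in both directions, the following Python sketch assumes the standard repeated-measures form N = 2(z_α + z_Q)²σ²[1 + (P − 1)ρ] / (P d²). The function names, the `two_sided` switch, and the numeric inputs are assumptions introduced here for illustration; they are not values or conventions taken from the study.

```python
from math import ceil, sqrt
from statistics import NormalDist


def _z_sum(alpha: float, power: float, two_sided: bool) -> float:
    """Sum of the standard normal quantiles for the type I error and the power."""
    tail = alpha / 2 if two_sided else alpha  # assumption: sidedness is the caller's choice
    std_normal = NormalDist()
    return std_normal.inv_cdf(1 - tail) + std_normal.inv_cdf(power)


def subjects_per_group(sigma2: float, rho: float, d: float, cycles: int,
                       alpha: float = 0.05, power: float = 0.80,
                       two_sided: bool = True) -> int:
    """Women needed per exposure group to detect a mean difference d."""
    z = _z_sum(alpha, power, two_sided)
    n = 2 * z ** 2 * sigma2 * (1 + (cycles - 1) * rho) / (cycles * d ** 2)
    return ceil(n)


def detectable_difference(sigma2: float, rho: float, n_per_group: int, cycles: int,
                          alpha: float = 0.05, power: float = 0.80,
                          two_sided: bool = True) -> float:
    """Smallest mean difference detectable with n_per_group women per group."""
    z = _z_sum(alpha, power, two_sided)
    return sqrt(2 * z ** 2 * sigma2 * (1 + (cycles - 1) * rho) / (n_per_group * cycles))


# Hypothetical inputs: total variance of 16 days^2 and within-woman correlation of 0.3.
for cycles in (6, 12, 24):
    diff = detectable_difference(sigma2=16.0, rho=0.3, n_per_group=100, cycles=cycles)
    print(f"{cycles} cycles per woman: detectable difference of about {diff:.2f} days")
```

The term [1 + (P − 1)ρ] / P shrinks toward ρ as P grows, so additional cycles per woman yield diminishing returns whenever cycles within a woman are correlated.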
Describing menstrual function across the reproductive life span
To evaluate the precision of different sampling strategies for measuring annual rate of change in mean cycle length as women age, we estimated maximum error values as the half width of the 95 percent confidence interval for the annual rate of change in mean cycle length, $\beta$, using the following equation (15):

$$\text{maximum error} = z_{1-\alpha/2} \times SE(\hat{\beta})$$

The standard error for the annual rate of change was calculated as follows:

$$SE(\hat{\beta}) = \sqrt{\frac{\sigma_{\beta}^{2} + \dfrac{12\,(P - 1)\,\sigma^{2}}{D^{2}\,P\,(P + 1)}}{N}}$$

where $\sigma_{\beta}^{2}$ is the variation in slope between women, $\sigma^{2}$ is the variation in slope within women, $P$ is the number of cycles, $D$ is the duration of the study in years, $N$ is the number of women in the sample, and $\alpha$ is the probability of a type I error assumed at 5 percent. The parameters $\sigma_{\beta}^{2}$ and $\sigma^{2}$ were estimated for each age range by fitting random-intercept, random-slope mixed models using the SAS Proc Mixed procedure (18). The models fit average cycle length as a function of age, with age modeled as a continuous variable, using an unstructured covariance matrix. We again included cycles up to 180 days and applied a natural log transformation to cycles in the age range 40–53 years.
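The maximum error calculation can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: it assumes the standard error takes the common form for P equally spaced observations over D years, SE = sqrt{[σ_β² + 12(P − 1)σ² / (D² P (P + 1))] / N}, and that `cycles` counts all observations per woman across the study. The variance values and function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def slope_standard_error(var_between: float, var_within: float,
                         cycles: int, years: float, n_women: int) -> float:
    """Standard error of the mean annual rate of change.

    Assumes `cycles` equally spaced observations per woman over `years`.
    """
    if cycles < 2:
        raise ValueError("at least two observations are needed to estimate a slope")
    within_term = 12 * (cycles - 1) * var_within / (years ** 2 * cycles * (cycles + 1))
    return sqrt((var_between + within_term) / n_women)


def maximum_error(var_between: float, var_within: float, cycles: int,
                  years: float, n_women: int, alpha: float = 0.05) -> float:
    """Half width of the two-sided (1 - alpha) confidence interval for the slope."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * slope_standard_error(var_between, var_within, cycles, years, n_women)


# Hypothetical variance components; roughly 12 recorded cycles per year of follow-up.
for years in (2, 3, 4, 5, 6):
    err = maximum_error(var_between=0.05, var_within=9.0,
                        cycles=12 * years, years=years, n_women=200)
    print(f"{years} years of follow-up: maximum error of about {err:.3f} days per year")
```

Because the within-woman term falls with the square of study duration, it quickly becomes small relative to the between-woman variance in slopes, after which only a larger sample of women improves precision.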
RESULTS |
---|
Differences in mean cycle length between two exposure groups
Table 1 displays estimated detectable differences in mean menstrual cycle length between two exposure groups for fixed sample sizes and study durations. Estimates for the age range 40–53 years were calculated on a log scale. If the anticipated difference in mean cycle length between two exposure groups is 1 day, then any of the footnoted sampling strategies in table 1 would be appropriate. For the same anticipated difference in mean cycle length, the largest sample size is needed to study women 40–53 years of age, followed by women aged 18–25 years and then women aged 26–39 years. To provide additional detail for investigators, table 2 presents estimates of the numbers of subjects needed to detect specified differences in mean menstrual cycle length between two exposure groups.
Increasing study duration from 2 to 3 years or from 3 to 4 years has a large impact on precision when estimating annual rate of change in mean cycle length, as illustrated in figure 2 for women aged 26–39 years. Gains in precision from increasing the study duration to more than 4 years are not as large.
DISCUSSION |
---|
For studies that focus on detecting differences in mean cycle length with respect to host or environmental factors, increasing study duration beyond 1–2 years does little to reduce the detectable difference in means between exposure groups. In contrast, precision in estimates of annual rate of change in mean cycle length improves considerably when study duration is increased up to about 4 years.
Examples of exposures investigated in previous studies include race-ethnicity, socioeconomic status, weight, physical activity, stress, occupational exposures, and exposure to diethylstilbestrol (4, 6, 10, 19). On the basis of our findings, differences of 2–4 days in mean cycle length, characteristic of some occupational exposures (4), could be detected with as few as 25–100 women per exposure group and 6 months of prospective follow-up. Smaller effects, such as those that may occur with in utero exposure to diethylstilbestrol (19), require larger samples: detecting a 0.75-day difference in mean cycle length would need 200 women per group followed for 6 months.
Most studies characterizing menstrual function across the reproductive life span have been conducted in highly educated, White populations (12, 13). An efficient approach to obtaining information on menstrual patterns across age in other populations would be to use an accelerated or mixed longitudinal study design, an approach that economizes the time that participants are required to maintain menstrual diaries (20). With this approach, investigators would select multiple distinct, but overlapping age cohorts and track each for a relatively short period of time. An accelerated strategy for characterizing menstrual patterns from age 15 to 50 years could reduce follow-up to 5 years (table 3) in contrast to the 35 years required if women were to be followed for their entire reproductive life, as occurred in the Tremin Trust.
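To make the accelerated design concrete, the following Python sketch lays out overlapping age cohorts that together span a target age range. The 1-year overlap between adjacent cohorts and the function name are assumptions chosen for illustration; they are not the cohort layout reported in table 3.

```python
from typing import List, Tuple


def accelerated_cohorts(start_age: int, end_age: int,
                        follow_up_years: int, overlap_years: int) -> List[Tuple[int, int]]:
    """Return (entry age, exit age) for each cohort in an accelerated design.

    Each cohort is followed for `follow_up_years`; adjacent cohorts share
    `overlap_years` of age so that their trajectories can be linked.
    """
    step = follow_up_years - overlap_years
    if step <= 0:
        raise ValueError("overlap must be shorter than the follow-up period")
    cohorts = []
    entry = start_age
    while True:
        exit_age = entry + follow_up_years
        cohorts.append((entry, min(exit_age, end_age)))
        if exit_age >= end_age:
            break
        entry += step
    return cohorts


# Hypothetical layout: ages 15 to 50, 5 years of follow-up, 1 year of overlap.
design = accelerated_cohorts(start_age=15, end_age=50, follow_up_years=5, overlap_years=1)
for entry, exit_age in design:
    print(f"cohort enrolled at age {entry}, followed to age {exit_age}")
```

A larger overlap gives more shared ages for linking adjacent cohorts but requires more cohorts to cover the same span, so the overlap is a design choice that trades linkage strength against recruitment effort.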
This analysis does have some limitations. Transformation of the data, while appropriate for the statistical models, complicates interpretation of our findings. More work is needed to optimally account for the nonnormality of menstrual data. We considered sample size calculations for two specific study objectives that focus on mean cycle length. Other objectives of interest might include assessing change in cycle variability with age, evaluating differences in cycle variability across populations, and comparing rates of change in mean cycle length or cycle variability between two exposure groups. These study objectives would require different sampling methods.
Finally, the Tremin Trust includes data for only White, college-educated females. Studies suggest that menstrual cycle characteristics differ by race-ethnicity, socioeconomic status, body composition, and other host and environmental characteristics (3, 6, 9, 10). For example, postmenarcheal European-American girls have a longer mean cycle length than do African-American girls in the United States (6), whereas the association between body mass index and cycle length is curvilinear (9, 10). Although specific sample size estimates would likely vary slightly in populations that differ with respect to the above factors, the Tremin Trust is one of only two data sources in which this type of sample size estimation can be conducted (12, 13). Furthermore, because age is a primary determinant of variability in menstrual function within a population, sample size estimates provided here can serve as a basic guideline. The finding that increasing study duration for studies comparing exposure groups results in little gain in precision, whereas increasing study duration to 4–5 years for studies evaluating changes in the mean across age adds precision, is an important insight that is broadly generalizable.
In summary, using data from a prospective study of menstrual function, we have provided sampling strategies to guide investigators who want to design prospective menstrual calendar studies. This analysis suggests efficient sampling strategies that will result in adequate power to detect differences in mean cycle length with respect to exposures or acceptable values of maximum error when assessing annual rate of change in mean cycle length. Following women for a shorter period of time (e.g., 1–2 years) is optimal for studies investigating host and environmental exposures that alter menstrual function. In contrast, following women for an extended period of time (e.g., 4–5 years) is optimal for studying how menstrual patterns across the reproductive life span vary in different populations. Regardless of the study objective, sampling strategies should be tailored to the age range to be investigated.