1 Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA
2 Department of Family and Preventive Medicine, University of California, San Diego, CA
3 Stanford Prevention Research Center, Stanford University School of Medicine, Stanford, CA
4 Medstar Research Institute, Hyattsville, MD
5 Department of Obstetrics and Gynecology, Albert Einstein School of Medicine, Bronx, NY
6 John A. Burns School of Medicine, University of Hawaii, Honolulu, HI
7 Division of Epidemiology, Health Policy Institute, Medical College of Wisconsin, Milwaukee, WI
8 Department of Epidemiology, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA
9 Division of Cardiovascular Medicine, University of Florida College of Medicine, Gainesville, FL
10 Department of Social and Preventive Medicine and Gynecology-Obstetrics, University at Buffalo, Buffalo, NY
Reprint requests to Dr. Ross Prentice, Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, M3-A410, P.O. Box 19024, Seattle, WA 98109-1024 (e-mail: rprentic{at}fhcrc.org).
Received for publication September 16, 2004. Accepted for publication February 3, 2005.
ABSTRACT |
---|
cardiovascular diseases; clinical trials; cohort studies; estrogens; hormone replacement therapy; postmenopause; progestins
INTRODUCTION |
---|
The lack of explanation for this departure from expectation has prompted some clinicians and researchers to hypothesize flaws in the WHI trial (8, 9). Others have argued that the trial results lack relevance to important groups of combined hormone therapy users. For example, a recent contribution noted that the WHI was not designed to provide a powerful test of cardioprotective effects among women aged 50–54 years in menopausal transition, and it concluded that observational studies provide "the only applicable clinical guide to this issue" (10, p. 1498).
Other authors have speculated on reasons for a discrepancy between WHI trial results and related observational research, citing confounding in observational studies, the limited ability of observational studies to assess short-term effects, differences among combined hormone therapy preparations, and differences among the populations of women studied (11–13). Along these lines, a review (14) noted that evidence for coronary heart disease benefit from hormone therapy is not apparent among studies that control for socioeconomic and other confounding factors. The April 2004 issue of the International Journal of Epidemiology includes a review (15) and several commentaries (16–21) on this topic that illustrate the continuing diversity of opinion on the sources of the discrepancy and on the clinical implications of the available evidence.
The implications of WHI trial results for the study designs needed to obtain reliable therapeutic or public health information have also been debated. Perspectives have ranged from the statement that "many people suspended ordinary standards of evidence concerning medical interventions and concluded that hormone therapy was the right thing to prevent heart disease in millions of postmenopausal women despite the absences of any large-scale clinical trials quantifying its overall risk-benefit ratio" (22, p. 519) to the assertion that "the good agreement between the observational studies and the [WHI] trial on endpoints other than CHD [coronary heart disease] confirms the utility and validity of observational studies as monitors of new preventive agents" (23, p. 9).
The WHI, with its multifaceted clinical trial among 68,133 women, including 16,608 in the estrogen-plus-progestin trial, and its observational study among 93,676 women, provides an excellent setting to understand and resolve these discrepancies. Specifically, women were recruited to the clinical trial and observational study from the same underlying populations, and over essentially the same time period, at the 40 WHI clinical centers. Many elements of the protocol and procedures were common to the two WHI components, including the baseline questionnaire and interview data collection, as well as major elements of outcome ascertainment. The clinical trial has more information concerning the effects of hormone therapy during the first few years of use, while the observational study mostly provides information on the effects of longer term use. Both clinical trial and observational study women were personally interviewed at baseline concerning prior hormone therapy use, and they were periodically queried to ascertain hormone therapy use during WHI follow-up. Hence, the WHI provides a context for quantitative assessment of the discrepancy between clinical trial and observational study results. To test assertions (11, 23) that the WHI estrogen-plus-progestin results agree closely with those of observational studies for outcomes other than coronary heart disease, we chose to include stroke and venous thromboembolism, in addition to coronary heart disease, in the comparisons presented here.
MATERIALS AND METHODS |
---|
Baseline exogenous hormones and clinical trial hormone regimen
Information on lifetime hormone use was obtained on clinical trial and observational study women at baseline by a trained interviewer, assisted by a structured questionnaire and chart displaying colored photographs of various hormone preparations. For postmenopausal hormone therapy, detailed information was obtained on the preparation, estrogen-and-progestin dose, schedule, and route of administration. The age at starting and stopping each preparation was recorded.
Women interested in the hormone therapy trials who were using postmenopausal hormone therapy at initial screening were required to undergo a 3-month washout period. Women with a uterus were potentially eligible for the combined hormone trial, which involved randomization to 0.625 mg of conjugated equine estrogens (CEE) and 2.5 mg of medroxyprogesterone acetate (MPA) in a single daily tablet or to a matching placebo, while women with a prior hysterectomy were potentially eligible for a companion trial of unopposed estrogen, which involved randomization to a daily dose of 0.625 mg of CEE or a matching placebo. A group of 331 women (each with a uterus) were initially randomized to estrogen alone. Following the release of the Postmenopausal Estrogen/Progestin Interventions (PEPI) study results (27), these women were unblinded and reassigned to estrogen plus progestin, and they are included in the clinical trial combined hormone group in this analysis.
Study population
This report is based on information from the 16,608 women randomized to the combined hormone trial, 8,506 (51.2 percent) of whom were assigned to active estrogen plus progestin, and from the 53,054 women enrolled in the observational study who had a uterus and were not using unopposed estrogen at the time of WHI enrollment. Among these 53,054 women, 17,503 (33.0 percent) were current users of combined estrogen-plus-progestin preparations at baseline.
Follow-up and outcome ascertainment
Follow-up and outcome ascertainment procedures in the clinical trial (6, 7, 28–30) involved semiannual contacts and in-clinic annual visits for the collection of standardized information on safety concerns, adherence to study medications, and structured initial reporting of clinical outcome events. Annual mailed follow-up forms in the observational study updated information on the use of hormone therapy, updated selected other risk factor information, and employed the same structured initial reporting of clinical events.
Disease events were initially self-reported for all three clinical outcomes. Ascertainment of information on coronary heart disease, comprising myocardial infarction and death due to coronary heart disease, involved physician adjudication based on the review of pertinent documents at each clinical center. In the clinical trial, and for a fraction of cases in the observational study, coronary heart disease and related outcomes were further adjudicated by a central committee, with agreement rates of 90 percent for myocardial infarction and 97 percent for death due to coronary heart disease. Similarly, cases of hospitalized stroke were based (29) on rapid neurologic deficit attributable to obstruction or rupture of the arterial system or on a demonstrable lesion compatible with acute stroke. Central neurologists reviewed all stroke cases, as well as transient ischemic attacks and self-reports of stroke in the clinical trial, along with a fraction of such cases in the observational study. Of locally adjudicated strokes in the clinical trial, 94.5 percent were confirmed on central review, while 93.8 percent of centrally adjudicated strokes had been classified as strokes by local adjudicators. Venous thromboembolism comprised (30) hospitalized deep vein thrombosis and pulmonary embolism. The confirmation rates in the clinical trial for locally adjudicated venous thromboembolism events in central review were 96 percent for deep vein thrombosis and 98 percent for pulmonary embolism. In the observational study, only self-reports of (hospitalized) deep vein thrombosis or pulmonary embolism were routinely obtained. The confirmation rate in the clinical trial for self-reported venous thromboembolism events on central adjudication was 80.4 percent.
Statistical analysis
Primary analyses used time-to-event methods based on the Cox regression procedure (31), with time from randomization in the clinical trial and time from enrollment in the observational study as the basic "time" variable. Disease incidence rates during follow-up were stratified on baseline age in 5-year categories and on the WHI component (clinical trial or observational study). Hence, hazard ratio estimates derive from comparisons among women in the same 5-year age interval and the same WHI component who are at the same length of time from enrollment in the WHI.
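The stratification described above can be illustrated with a minimal sketch of a stratified Cox partial log-likelihood for a single binary exposure. This is an illustration under stated assumptions, not the analysis code used for the WHI: the function name, the record layout, and the toy data are hypothetical, ties are handled in the simplest (Breslow-type) way, and the time-dependent covariates used later in the paper are not represented.

```python
import math
from collections import defaultdict


def stratified_partial_loglik(beta, records):
    """Cox partial log-likelihood with stratum-specific risk sets.

    records: iterable of (stratum, time, event, x), where stratum is any
    hashable label (here: WHI component and 5-year age category), time is
    years from randomization or enrollment, event is 1/0, and x is a 0/1
    estrogen-plus-progestin indicator. All names are illustrative.
    """
    by_stratum = defaultdict(list)
    for stratum, time, event, x in records:
        by_stratum[stratum].append((time, event, x))

    loglik = 0.0
    for rows in by_stratum.values():
        for t_i, event_i, x_i in rows:
            if not event_i:
                continue
            # Risk set: women in the SAME stratum still under follow-up at t_i.
            denom = sum(math.exp(beta * x_j) for t_j, _, x_j in rows if t_j >= t_i)
            loglik += beta * x_i - math.log(denom)
    return loglik


# Hypothetical toy data: two strata, one event in each.
toy = [
    (("CT", "50-54"), 1.0, 1, 1),
    (("CT", "50-54"), 2.0, 0, 0),
    (("OS", "50-54"), 1.5, 1, 0),
    (("OS", "50-54"), 3.0, 0, 1),
]
ll_null = stratified_partial_loglik(0.0, toy)
ll_alt = stratified_partial_loglik(math.log(2.0), toy)
```

Because each event is compared only with women in its own stratum, the hazard ratio `exp(beta)` is estimated from within-component, within-age-category contrasts at equal follow-up time, which is the comparison the paragraph describes.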
Disease events in the estrogen-plus-progestin trial were included through July 7, 2002, when women stopped taking study pills. This gives an average 5.6 years of follow-up and a maximum of 8.6 years of follow-up. Follow-up in the observational study subsample was included through February 28, 2003, giving comparable average (5.5 years) and maximum (8.4 years) durations. We used the best available outcome data, comprising all centrally adjudicated coronary heart disease, stroke, and venous thromboembolism events in the estrogen-plus-progestin trial and all locally adjudicated coronary heart disease, stroke, and self-reported venous thromboembolism events in the observational study.
The possibility that baseline characteristics confound the relation of estrogen-plus-progestin use to cardiovascular disease risk was examined by carrying out regression analysis of the clinical trial and observational study data that included selected baseline risk factors. The dependence of the hazard ratio on time from initiation of the current episode of estrogen-plus-progestin use was examined in Cox regression analyses by estimating separate hazard ratios for less than 2, 2–5, and more than 5 years, with proportional hazards within these time periods. The regression variable for these estrogen-plus-progestin hazard ratios is time dependent as women move from one time-from-initiation period to another during WHI follow-up. At a specific follow-up time in the WHI, the time from initiation of the current episode for the estrogen-plus-progestin group in the clinical trial was defined as the time from randomization, and for the estrogen-plus-progestin group in the observational study, it was defined by summing the time that a woman had used estrogen plus progestin at baseline plus the time from observational study enrollment. The time that a woman had used estrogen plus progestin at baseline was determined by going back in time from observational study enrollment until a gap in estrogen-plus-progestin usage was encountered, with a usage gap of 1 year or longer defining the starting point for the episode. Combined hormone use in the observational study was classified several ways, including estrogen-plus-progestin preparation, estrogen preparation and dose, and progestin preparation and dose.
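The episode definition above can be sketched in a few lines. This is a hedged illustration only: the function names, the representation of hormone use as (start age, stop age) intervals, and the handling of the exact 2-year and 5-year boundaries are assumptions that the text does not specify.

```python
from typing import List, Tuple

GAP_YEARS = 1.0  # from the text: a usage gap of 1 year or longer starts a new episode


def baseline_episode_years(intervals: List[Tuple[float, float]],
                           enrollment_age: float) -> float:
    """Years of the current estrogen-plus-progestin episode at enrollment.

    intervals: (start_age, stop_age) periods of use (hypothetical layout).
    Walks backward from enrollment until a gap of GAP_YEARS or more is found.
    """
    ordered = sorted(intervals)
    # Assumption: a woman whose last use ended a year or more before
    # enrollment is not a current user and has no current episode.
    if not ordered or enrollment_age - ordered[-1][1] >= GAP_YEARS:
        return 0.0
    start = ordered[-1][0]
    for prev, nxt in zip(reversed(ordered[:-1]), reversed(ordered[1:])):
        if nxt[0] - prev[1] >= GAP_YEARS:
            break  # gap of 1 year or longer: the episode starts at `start`
        start = prev[0]
    return enrollment_age - start


def initiation_period(baseline_years: float, followup_years: float) -> str:
    """Time-from-initiation period at a given WHI follow-up time.

    Clinical trial women have baseline_years == 0 (time from randomization).
    Boundary handling (2 and 5 years fall in "2-5") is an assumption.
    """
    t = baseline_years + followup_years
    if t < 2:
        return "<2"
    if t <= 5:
        return "2-5"
    return ">5"


# Hypothetical observational study woman enrolled at age 60.
episode = baseline_episode_years([(50, 52), (52.5, 55), (57, 60)], 60)
```

In this example the 2-year gap between ages 55 and 57 ends the backward search, so the current episode is 3 years long at enrollment; the woman then contributes follow-up to the 2–5-year period for 2 years and to the more-than-5-year period thereafter, which is why the regression variable is time dependent.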
The sensitivity of hazard ratio estimates to lack of adherence to estrogen-plus-progestin group designation was examined by restricting the follow-up period for each clinical trial or observational study woman to the time period when she remained adherent to her estrogen-plus-progestin or control group designation. Specifically, the follow-up period for each woman was censored 6 months after she stopped taking combined hormones in the estrogen-plus-progestin groups or initiated hormone therapy use in the control groups, after which hazard ratio estimates were recalculated. The 6-month period was included to accommodate hormone therapy changes resulting from diagnostic workup. Nominal 95 percent confidence intervals and two-sided significance tests (p values) are presented.
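The adherence-restricted follow-up rule can be sketched as follows. The function and argument names are hypothetical, and times are assumed to be measured in years from randomization or enrollment; only the 6-month lag comes from the text.

```python
from typing import Optional, Tuple

GRACE_YEARS = 0.5  # from the text: censor 6 months after the change in hormone use


def adherent_followup(event_time: Optional[float],
                      end_of_followup: float,
                      nonadherence_time: Optional[float] = None) -> Tuple[float, bool]:
    """Return (follow-up time, event observed) under adherence-based censoring.

    nonadherence_time is when a woman stopped combined hormones (estrogen-plus-
    progestin groups) or started hormone therapy (control groups); None means
    she remained adherent to her group designation throughout.
    """
    censor_time = end_of_followup
    if nonadherence_time is not None:
        censor_time = min(censor_time, nonadherence_time + GRACE_YEARS)
    if event_time is not None and event_time <= censor_time:
        return event_time, True
    return censor_time, False
```

For example, a woman who stopped her pills at 2.0 years and had an event at 2.3 years still contributes that event, whereas an event at 3.0 years would fall after her censoring time of 2.5 years and would not be counted.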
RESULTS |
---|
Hazard ratio dependence on time from initiation of the current estrogen-plus-progestin episode
An important remaining source of discrepancy between clinical trial and observational study hazard ratios is elucidated by accommodating a dependence of hazard ratios on time from initiation of the current estrogen-plus-progestin episode (refer to Materials and Methods). Table 5 includes the same potential confounding factors (not shown for brevity) as shown in table 4. The left side of table 5 provides separate hazard ratio estimates for the clinical trial and observational study in each of the three estrogen-plus-progestin time-from-initiation periods. The numbers of estrogen-plus-progestin group women experiencing cardiovascular disease events in each time period are also shown. These numbers make clear that the observational study is very sparse concerning the first 2 years from estrogen-plus-progestin initiation, while the clinical trial is comparably sparse after 5 years' duration. Note that hazard ratios within time-from-initiation periods are now more similar for each of the three clinical outcomes, with little evidence of hazard ratio reduction with estrogen plus progestin, except possibly for coronary heart disease beyond 5 years from estrogen-plus-progestin initiation.
Hazard ratio in subgroups
The analyses described above were repeated in the following baseline subgroups of the clinical trial and observational study cohorts: the approximately 94 percent of clinical trial and observational study women without a personal history of cardiovascular disease (coronary heart disease, stroke, or venous thromboembolism); the 74 percent of clinical trial women and 85 percent of observational study control group women who had not used hormone therapy prior to WHI enrollment and the 79 percent of observational study estrogen-plus-progestin group women who had not used estrogen plus progestin prior to their "current hormone therapy episode" at baseline screening; women aged less than 60 years; women less than 10 years from menopause; and women (24 percent in the clinical trial and 39 percent in the observational study) having fewer than 5 years from menopause during which they did not use hormone therapy. The estimated estrogen-plus-progestin hazard ratios were fairly similar for the clinical trial and observational study in each of these subgroups. Moreover, estrogen-plus-progestin hazard ratios in the two cohorts did not differ substantially between women aged less than 60 years at baseline and women aged 60 years or more, between women less than 10 years from menopause and women 10 or more years from menopause, or between women having less than 5 years from menopause without hormone therapy and women having 5 or more years.
Hazard ratio sensitivity to lack of estrogen-plus-progestin adherence
Comparisons so far have focused on the estrogen-plus-progestin randomization group in the clinical trial and on current use of estrogen plus progestin at baseline in the observational study, since the intention-to-treat clinical trial analyses have both reliability and useful interpretation. However, differential adherence patterns between the two cohorts could affect hazard ratio comparisons. Hence, analyses were also carried out with the follow-up period for each woman restricted to the time period within which she continued to adhere to her estrogen-plus-progestin group designation (refer to Materials and Methods). The estimated ratios of estrogen-plus-progestin hazard ratio in the observational study to those in the clinical trial, following control for confounding and time from estrogen-plus-progestin initiation, were 0.86 (95 percent CI: 0.56, 1.30) for coronary heart disease, 0.82 (95 percent CI: 0.50, 1.34) for stroke, and 0.79 (95 percent CI: 0.50, 1.25) for venous thromboembolism in these analyses, rather similar to those given above.
Hazard ratio dependence on preparation and on estrogen and progestin dose
Among the 17,503 baseline estrogen-plus-progestin users in the observational study, 13,565 (78 percent) used CEE, while 1,377 used other estrone sulfate-dominant estrogens, 1,359 used oral estradiol, and 642 used transdermal estradiol. A total of 16,649 (95 percent) of these women used MPA, with 13,065 (75 percent) using a CEE/MPA combination. About 95 percent of the CEE/MPA users were on a daily regimen. Among these women, 11,095 (87 percent) used the standard 0.625-mg/day CEE dose, while 966 women used 0.3 mg/day, and 632 women used a higher dose. Similarly, 10,188 (80 percent) of these women used 2.5 mg/day of MPA, while 5,440 used a higher daily dose.
To ensure that comparisons of hazard ratios in the clinical trial and observational study were not influenced by the range of estrogen-plus-progestin preparations, dosages, and schedules in the observational study, analyses were carried out restricting the estrogen-plus-progestin group in the observational study to the 12,136 (69 percent of total) women who used (at baseline) the same daily combination of 0.625 mg of CEE and 2.5 mg of MPA studied in the clinical trial. Following this restriction, the estimated ratio of estrogen-plus-progestin hazard ratio in the observational study to that in the clinical trial was 0.85 (95 percent CI: 0.57, 1.28) for coronary heart disease, 0.77 (95 percent CI: 0.48, 1.22) for stroke, and 0.92 (95 percent CI: 0.59, 1.42) for venous thromboembolism, again similar to the estimated ratios previously given.
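As a reading aid, the reported ratios of hazard ratios can be examined on the log scale. The sketch below assumes, as is conventional for Cox regression but not stated in the text, that the nominal 95 percent confidence intervals are symmetric on the log scale; under that assumption the geometric midpoint of each interval should match the point estimate, and an approximate standard error and z statistic can be recovered. The function name is hypothetical.

```python
import math


def log_scale_summary(estimate, lower, upper, z_crit=1.96):
    """Recover log-scale quantities from a ratio estimate and its 95% CI.

    Assumes the interval is exp(log(estimate) +/- z_crit * se).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return {
        "se_log": se,
        "geometric_midpoint": math.sqrt(lower * upper),
        "z": math.log(estimate) / se,
    }


# Ratios (observational study / clinical trial) reported above for women
# using daily 0.625 mg of CEE plus 2.5 mg of MPA at baseline.
reported = {
    "coronary heart disease": (0.85, 0.57, 1.28),
    "stroke": (0.77, 0.48, 1.22),
    "venous thromboembolism": (0.92, 0.59, 1.42),
}
summaries = {name: log_scale_summary(*vals) for name, vals in reported.items()}
```

Each interval includes 1, so under this assumption none of the three ratios differs from unity at the nominal two-sided 0.05 level, which is what the corresponding z statistics below 1.96 in absolute value express.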
DISCUSSION |
---|
These analyses reinforce early elevations in cardiovascular disease risk among estrogen-plus-progestin users. For coronary heart disease, such early elevation is consistent with the Heart Estrogen/progestin Replacement Study (HERS) secondary prevention trial of the same CEE/MPA regimen (34), which found no overall effect on coronary heart disease risk over an average 4.1-year follow-up period. Other hormone therapy trials of secondary prevention of atherosclerosis progression (35–40) have mostly reported neutral or unfavorable effects over fairly short follow-up periods. For stroke, the Heart Estrogen/progestin Replacement Study (41) reported a nonsignificant elevation in risk, while the secondary prevention Women's Estrogen and Stroke Trial of estradiol (42) found no overall effect but with an indication of elevation in the first months of use. For venous thromboembolism, the Heart Estrogen/progestin Replacement Study (43) found a substantial early elevation in risk. Moreover, a small secondary prevention trial of estradiol plus norethisterone was stopped early on the basis of an excess of venous thromboembolism events (44).
It is interesting to consider other observational study findings in relation to the WHI clinical trial. For coronary heart disease, several observational studies among healthy postmenopausal women included confounding control efforts and reported results as a function of hormone therapy duration. Some (45–48), but not all (49, 50), provided hints of early coronary heart disease risk elevation, but most had limited precision for estimating early hormone therapy effects. In addition, few reported associations separately for estrogen plus progestin and for estrogen alone, and there is now randomized controlled trial evidence (51) to suggest that the coronary heart disease implications of CEE alone are more favorable than for combined CEE and MPA. Similarly, an early stroke elevation among hormone therapy users has been reported in some (52), but not all (53), recent observational studies.
The Nurses' Health Study followed women over the age interval when postmenopausal hormone therapy would likely be initiated, and it reported coronary heart disease results separately for estrogen and for estrogen plus progestin but did not find an early elevation in risk among healthy postmenopausal women (49). However, analyses of Nurses' Health Study data to date use only a snapshot of current hormone therapy use at biennial contacts, so that an estrogen-plus-progestin user would be classified as a nonuser for her first year of use on average and would be classified as a nonuser permanently if estrogen-plus-progestin use started and stopped within a biennial period prior to hormone therapy status ascertainment. As an exercise, we conducted a simulation study, wherein the estrogen-plus-progestin group assignment in the WHI clinical trial was randomly contaminated in a similar fashion and found that evidence for an early elevation in coronary heart disease risk and evidence for a time trend in the coronary heart disease hazard ratio typically disappeared under these circumstances.
These analyses have implications for the design and analysis of observational studies when the hazard ratio of interest varies with time. For example, a cohort comprising new initiators of the study exposure (12) can be expected to yield meaningful average hazard ratio estimates over various periods of time from exposure initiation, even if a proportional hazards assumption is imposed. However, studies like the WHI observational study that enrolled participants having various exposure durations will need to use nonstandard data analysis methods to assess exposure effects. Specifically, such studies may be able to provide hazard ratio estimates as a function of time from exposure initiation, from which average hazard ratios over certain time-from-initiation periods can be calculated.
In summary, these analyses aid in our understanding of sources of bias in observational studies, and they indicate that the apparent discrepancy between the clinical trial and observational studies, such as the WHI observational study, may be substantially explained by classical confounding and differences in the distributions of time from estrogen-plus-progestin initiation. The inability of those factors to provide a full explanation for differences between stroke hazard ratios reinforces the importance of randomized controlled trial evidence, especially when public health implications are great.
ACKNOWLEDGMENTS |
---|
The authors thank WHI investigators and staff for their outstanding dedication and commitment; thank Drs. Jacques Rossouw, JoAnn Manson, and Sylvia Wassertheil-Smoller for critiquing earlier versions of this paper; and acknowledge the contributions of Dr. Catherine "Kit" Allen who participated in the writing of this paper until her death in 2003.
A full listing of WHI investigators can be found at http://www.whi.org. A list of key investigators involved in this research follows. Program Office. National Heart, Lung, and Blood Institute, Bethesda, Maryland: Barbara Alving, Jacques Rossouw, and Linda Pottern. Clinical Coordinating Centers. Fred Hutchinson Cancer Research Center, Seattle, Washington: Ross Prentice, Garnet Anderson, Andrea LaCroix, Charles Kooperberg, and Anne McTiernan; Wake Forest University School of Medicine, Winston-Salem, North Carolina: Sally Shumaker and Pentti Rautaharju; Medical Research Labs, Highland Heights, Kentucky: Evan Stein; University of California at San Francisco, San Francisco, California: Steven Cummings; University of Minnesota, Minneapolis, Minnesota: John Himes; and University of Washington, Seattle, Washington: Bruce Psaty. Clinical Centers. Albert Einstein College of Medicine, Bronx, New York: Sylvia Wassertheil-Smoller; Baylor College of Medicine, Houston, Texas: Jennifer Hays; Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts: JoAnn Manson; Brown University, Providence, Rhode Island: Annlouise R. Assaf; Emory University, Atlanta, Georgia: Lawrence Phillips; Fred Hutchinson Cancer Research Center, Seattle, Washington: Shirley Beresford; George Washington University Medical Center, Washington, DC: Judith Hsia; Harbor-UCLA Research and Education Institute, Torrance, California: Rowan Chlebowski; Kaiser Permanente Center for Health Research, Portland, Oregon: Evelyn Whitlock; Kaiser Permanente Division of Research, Oakland, California: Bette Caan; Medical College of Wisconsin, Milwaukee, Wisconsin: Jane Morley Kotchen; MedStar Research Institute/Howard University, Washington, DC: Barbara V. Howard; Northwestern University, Chicago/Evanston, Illinois: Linda Van Horn; Rush-Presbyterian St. Luke's Medical Center, Chicago, Illinois: Henry Black; Stanford Center for Research in Disease Prevention, Stanford University, Stanford, California: Marcia L. Stefanick; State University of New York at Stony Brook, Stony Brook, New York: Dorothy Lane; The Ohio State University, Columbus, Ohio: Rebecca Jackson; University of Alabama at Birmingham, Birmingham, Alabama: Cora Beth Lewis; University of Arizona, Tucson/Phoenix, Arizona: Tamsen Bassford; University at Buffalo, Buffalo, New York: Jean Wactawski-Wende; University of California at Davis, Sacramento, California: John Robbins; University of California at Irvine, Orange, California: Allan Hubbell; University of California at Los Angeles, Los Angeles, California: Howard Judd; University of California at San Diego, La Jolla/Chula Vista, California: Robert D. Langer; University of Cincinnati, Cincinnati, Ohio: Margery Gass; University of Florida, Gainesville/Jacksonville, Florida: Marian Limacher; University of Hawaii, Honolulu, Hawaii: David Curb; University of Iowa, Iowa City/Davenport, Iowa: Robert Wallace; University of Massachusetts/Fallon Clinic, Worcester, Massachusetts: Judith Ockene; University of Medicine and Dentistry of New Jersey, Newark, New Jersey: Norman Lasser; University of Miami, Miami, Florida: Mary Jo O'Sullivan; University of Minnesota, Minneapolis, Minnesota: Karen Margolis; University of Nevada, Reno, Nevada: Robert Brunner; University of North Carolina, Chapel Hill, North Carolina: Gerardo Heiss; University of Pittsburgh, Pittsburgh, Pennsylvania: Lewis Kuller; University of Tennessee, Memphis, Tennessee: Karen C. Johnson; University of Texas Health Science Center, San Antonio, Texas: Robert Brzyski; University of Wisconsin, Madison, Wisconsin: Gloria Sarto; Wake Forest University School of Medicine, Winston-Salem, North Carolina: Denise Bonds; and Wayne State University School of Medicine/Hutzel Hospital, Detroit, Michigan: Susan Hendrix.
Conflict of interest: none declared.