Address all correspondence and requests for reprints to: Michael Thorner, MB BS, DSc, FRCP, Department of Internal Medicine, University of Virginia Health System, McKim Box 800466, Charlottesville, Virginia 22908. E-mail: mot@virginia.edu
The development of interventions to prevent or treat age-associated diseases and disability is an important part of the mission of the National Institute on Aging. In the last decade, there has been increased interest in the potential of T replacement therapy for older men with low levels of T. Whether the gradual decline in serum T concentrations to levels significantly lower than those found in young men contributes to frailty, and whether treatment with T can prevent or reverse these changes, remains unknown.
Several recent studies of T replacement in older men have included relatively small numbers of participants (maximum 108 subjects) and have been of limited duration. These studies have shown increases in bone mineral density, muscle strength, and lean body mass; decreases in body fat; and improved serum lipid profiles (1, 2, 3, 4, 5, 6, 7). However, results have not been consistent in all trials, and their implications for clinical outcomes are not clear. The studies have also been too small to adequately address many potential adverse effects such as an increase in prostate cancer risk.
The Advisory Panel considered the following questions.
1. Do current epidemiological data indicate that androgen levels in middle-aged and older men are related to morbidity or elevated risk for disease? If so:
a) Which diseases and functions (e.g. osteoporosis, cardiovascular disease, prostate cancer, cognitive function, quality of life, mood, sexuality) or risk factors are related to androgen levels? The strongest statement that can be made based on existing epidemiological research is that clinical hypogonadism (e.g. total T below 250 ng/dl) is associated with osteoporosis and decreased sexual function. The evidence for associations of measures of serum T with health outcomes is inconsistent and inconclusive. Despite biological plausibility that higher T levels could affect heart disease (favorably) and prostate cancer risk (unfavorably), the epidemiological evidence is inconclusive.
b) What measures of androgen status (free, bound, metabolites, etc.) are related most strongly to disease or risk factors? Population-based data are available for estimating the number of men in various age groups with various degrees of hypogonadism as defined by total serum T. However, it is difficult to identify the "best" serum marker of T and hypogonadism. Until more research is done on the stability of T measures in the same individual over time, it is premature to conclude that the level of deficiency can be established by a single measure. Thus, the present data can provide approximate estimates of the number of men with various degrees of deficiency, but these estimates may shift substantially as research clarifies how best to measure T deficiency.
c) What is the "dose-response curve" of these relationships (e.g. at what levels of hypogonadism is disease risk significantly elevated)?
d) Are there adequate data on the numbers of men in various age groups with various degrees of hypogonadism that could affect risks (e.g. serum T <100, <200, <300 ng/dl)? It is difficult to identify the best serum markers of T level, and the best marker may depend on the target organ or disease of interest. It is not clear which marker is the best measure of the amount of T available for binding ARs. For example, it has been proposed that the level of T that is not bound to SHBG, or the level of free T as measured by dialysis, is a better marker of tissue availability of T than total T.
The effect of T on some tissues is probably due to metabolites of T. The effect of T on prostate is probably due to conversion to dihydrotestosterone by 5α-reductase. However, the estimation of this conversion by measuring 5α-reductase activity would reflect a mixture of two isoforms, only one of which is present in the prostate. This would diminish the ability to detect effects of a naturally occurring variation in type 2 5α-reductase activity on prostate cancer. Hence, in epidemiological studies of prostate cancer, total serum T may not be a sufficiently sensitive indicator of how T could affect prostate cancer risk. The effect of T on bone may be mediated by more than one pathway, and conversion to E2 by aromatase may be the key pathway.
Many conditions of interest have a complicated pathogenesis, and the causal effects of T, while important, may be difficult to measure. For example, it would be surprising if T did not have some effect on muscle strength in older men, but such an effect can be difficult to detect in the presence of physical activity, or with motor neuron loss and contraction-induced muscle damage. Analyses of an independent effect of T on cardiovascular disease risk may lack statistical power in some data sets after adjustment for the multiple known risk factors.
The panel made the following observations about the purpose of continued epidemiological research:
Epidemiological research will likely continue to play a role in defining the best serum markers of T level. Indeed, if the serum marker problem were solved, epidemiological research could become much more useful.
There is still a rationale for epidemiological studies that generate new hypotheses about age-related effects of T deficiency.
Use of frozen serum banks of existing epidemiological studies is a cost-effective method of doing epidemiological research.
Comparisons of measurements of serum markers obtained in prior studies:
Studies should provide quality control for one another in the laboratory analysis of serum markers, to ensure consistency between studies. All studies should use a common minimum set of T measurements. One concern is the quality of storage of frozen serum. There should be a method for determining the quality of storage of each sample propter hoc.
Studies should seek to define the stability of T measurements over time for three reasons: 1) the ability to perform serial measurements; 2) insight into how much odds ratios should be adjusted for measurement error; and 3) insight into the role of multiple measurements in the clinical diagnosis of T deficiency.
Only longitudinal cohort studies, or studies derived from them such as nested case-control studies, should be undertaken.
It is not clear whether funding a meta-analysis of existing studies would be useful.
Studies that examine the relationship of T to behaviors such as tobacco use, physical activity, diet, alcohol use, and sleep should be undertaken.
More understanding of the molecular biology of T and tissue-specific mechanisms of action is of importance.
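The observation above about adjusting odds ratios for measurement error can be illustrated with a regression-dilution correction. This is a minimal sketch, not a method described by the panel. It assumes classical (random, nondifferential) error in a continuous serum T measure, takes the correlation between two repeat measurements in the same men as the reliability estimate, and divides the observed log odds ratio by that reliability. The function names and all numbers are hypothetical.

```python
import math
from statistics import mean


def pearson_r(first, second):
    """Correlation between two repeat T measurements on the same men.

    Used here as a simple (assumed) estimate of measurement reliability.
    """
    m1, m2 = mean(first), mean(second)
    sxy = sum((a - m1) * (b - m2) for a, b in zip(first, second))
    sxx = sum((a - m1) ** 2 for a in first)
    syy = sum((b - m2) ** 2 for b in second)
    return sxy / math.sqrt(sxx * syy)


def corrected_odds_ratio(observed_or, reliability):
    """Regression-dilution correction: log(OR_true) = log(OR_obs) / reliability.

    Valid only under the classical error assumptions stated above.
    """
    if not 0 < reliability <= 1:
        raise ValueError("reliability must be in (0, 1]")
    return math.exp(math.log(observed_or) / reliability)


if __name__ == "__main__":
    # Hypothetical repeat measurements (ng/dl) in five men, one year apart.
    visit_1 = [310, 420, 280, 510, 390]
    visit_2 = [330, 400, 300, 470, 410]
    r = pearson_r(visit_1, visit_2)
    # Hypothetical observed odds ratio per unit decrease in serum T.
    print(round(r, 3), round(corrected_odds_ratio(1.5, r), 3))
```

The less stable a single T measurement is over time, the lower the reliability and the more an observed odds ratio understates the underlying association. This is one reason why the stability of T measurements matters for interpreting epidemiological results.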
2. Do results of intervention studies indicate potential benefits of androgen replacement therapy (ART) for middle-aged or older men with low androgen levels? If so:
a) What conditions do these studies indicate could be prevented or treated by ART? b) For what types of subjects [e.g. age or androgen status (both the level and form of hormone)] do these studies indicate potential efficacy of ART? Could more conclusive information on potential benefits of ART, and identification of subjects likely to respond to ART, be obtained from further analyses of data from previous studies, and if so, what studies, and what analyses? Five controlled studies have focused on potential benefits in healthy older men but none in frail elderly men. The average age was 72 yr in three studies and 64 yr in two studies. Definable benefits of androgen therapy might be present in frail elderly that are not detectable in the healthy younger elderly. In general, subjects have been selected using total T concentrations. Two studies attempted to use bioavailable T defined as the fraction of T in serum not bound to SHBG. Some men with severe hypogonadism (defined as very low serum T and elevated LH concentrations) may have been included inadvertently. This emphasizes the need to use both total T as well as the fraction of T not bound to SHBG for screening.
Most studies have shown a small increase in lean body mass, and both the Tenover and Snyder studies showed a significant decrease in fat mass. Changes in strength were variable, with three studies reporting an increase and four studies reporting no change. Both Snyder's and Tenover's studies reported an increase in bone mineral density. In Tenover's study, there was a strong correlation between increase in bone mineral density and increase in plasma E2. The panel noted that the increase in bone mineral density was the most consistent effect noted in all studies where it had been measured. In Snyder's studies, the men receiving T reported greater perceived benefit in strength improvement than was actually observed on strength testing. This outcome could have been due either to a direct effect of T or some metabolite on perception of strength or to tests of muscle function that were not sensitive enough to detect an improvement.
Group comparisons across these seven studies are difficult, due to the variable selection criteria, duration of therapy, and different definitions of hypogonadism. It was noted that men with the lowest total serum T concentrations usually had the greatest response in bone density. Reduction in total and low-density lipoprotein cholesterol was noted in some studies. Many of the functional parameters measured, such as grip strength, probably do not correlate well with functional improvements that would result in a major health benefit in the elderly, such as prevention of falls, ability to tolerate falls without fractures, and adequate functional strength to perform daily tasks, such as cooking or house cleaning, that are associated with the ability to live independently.
The majority of the studies to date have enrolled relatively healthy Caucasian men, and it is difficult to extrapolate their results to more ethnically diverse or sicker populations. One important question identified by the panel that deserves further study is whether non-SHBG-associated T correlates better with improvements in functional parameters as well as bone accretion. An analysis of preexisting plasma from some of these studies for assessment of bioavailable T and comparison with the results already accumulated would be of further benefit. In the future, assessment of serum androgen response modifiers might also be of benefit.
3. Do data from epidemiological or intervention studies indicate potentially serious risks from ART in middle-aged or older men? If so, for what conditions? What additional studies would be useful to clarify this issue?
The risks of long-term ART are unknown. ART could have at least six important adverse effects:
1) Increased risk of prostate cancer.
2) Increased risk of clinically significant benign prostatic hypertrophy (BPH). Also, if ART increases symptomatic BPH and/or serum prostate-specific antigen, men on ART could be more likely to have a diagnostic evaluation for prostate cancer and, therefore, be at increased risk for complications from diagnostic procedures. Even assuming T does not increase the risk of prostate cancer, men on ART would be more likely to have their "naturally" occurring indolent, slow-growing, clinically insignificant cancers detected, thus leading to unnecessary treatment for prostate cancer.
3) Stimulation of erythropoiesis, which could cause adverse effects due to hyperviscosity of the blood.
4) Increased risk of sleep apnea.
5) Increased risk of cardiovascular disease. Data are insufficient to determine whether ART would increase, decrease, or have no effect on cardiovascular disease incidence.
6) Increased risk of aggressive behavior or inappropriate sexual behavior. Studies of pharmacological doses of T in younger men as a birth control method generally report treatment does not increase risk of aggressive behavior. It is quite unlikely that much smaller replacement doses of T would increase risk of aggressive behavior in older men. However, since the popular belief is that T does increase aggressive behavior, studies should monitor this outcome, using, for example, standard hostility questionnaires.
For future trials of ART, the panel recommended the following safety monitoring:
Periodic monitoring of hematocrit in clinical trials would be useful to help define the frequency, magnitude, and timing of any abnormal elevations and to help guide use in clinical practice.
BPH can be followed with noninvasive tests (after void abdominal ultrasound for urinary retention, urine flow rates) and by questionnaires (international prostate symptom score). Outcomes such as episodes of acute urinary retention and initiation of medical or surgical therapy for BPH should also be captured. Repeated trans-rectal ultrasound to assess prostate volume in a subset of men may have a role in monitoring for BPH.
Sleep apnea should be evaluated by appropriate validated techniques.
Aggressive behavior can be monitored by self reports of patient and spouse/partner.
There are existing protocols for adjudicating heart disease end points in clinical trials. Power calculations presented to the panel suggest that a study with power to exclude a 20–30% increase in prostate cancer risk due to ART would also have sufficient power to evaluate whether ART increased cardiovascular risk.
The most difficult problem for a clinical trial, or group of trials, is how to assess risk of ART on prostate cancer, given the detection bias problem mentioned above. Clinical trials should be designed to minimize detection bias.
Serum banks and DNA banks are important in randomized trials, because future research may suggest that T increases risk only in certain subgroups (e.g. as defined by polymorphisms of the androgen receptor).
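To give a sense of scale for the power calculations mentioned above, the following sketch computes the number of men per arm needed to detect a 20% or 30% relative increase in an event rate. It is not the calculation presented to the panel. It uses the standard normal-approximation formula for comparing two proportions, and the 5% control-arm cumulative incidence, the two-sided alpha of 0.05, and the 80% power are all hypothetical assumptions. The function name `n_per_group` is also an assumption.

```python
import math
from statistics import NormalDist


def n_per_group(p_control, relative_risk, alpha=0.05, power=0.80):
    """Men needed per arm to detect a given relative risk (two-sided test).

    Normal-approximation formula for two independent proportions.
    """
    p_treat = p_control * relative_risk
    if not (0 < p_control < 1 and 0 < p_treat < 1):
        raise ValueError("event proportions must lie strictly between 0 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_treat) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p_control * (1 - p_control) + p_treat * (1 - p_treat))
    ) ** 2
    return math.ceil(numerator / (p_treat - p_control) ** 2)


if __name__ == "__main__":
    # Hypothetical 5% cumulative incidence in the placebo arm over the trial.
    for rr in (1.2, 1.3):
        print(rr, n_per_group(0.05, rr))
```

Under these assumed inputs the required enrollment runs to several thousand men per arm, and it rises steeply as the excess risk to be detected shrinks.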
4. What do current information on the mechanisms of androgen actions and responses, and age-related changes in these factors, imply regarding the likely efficacy of ART in preventing specific age-related chronic diseases?
What are the implications of this information for possible ART choices, regarding selection of androgens or AR modulators, routes of administration, and subjects for interventions?
T acts in cells both directly, through interactions with the AR, and indirectly, through metabolism to either E2 or dihydrotestosterone. Dihydrotestosterone acts principally through the AR and principally on prostate. E2 acts directly on the E receptor. The role of other T metabolites in maintenance of muscle strength and function, as well as bone mineral density and sexual function, is not well defined. The panel agreed that several deficits exist in our knowledge base regarding this question, particularly as they pertain to understanding these physiological processes in aged and elderly men and how these processes change during normal aging. For example, there is very little information regarding the normal age-associated changes in T metabolism or conversion to dihydrotestosterone and E in elderly men and whether these changes are tissue specific. Measurements, such as clearance of T, dihydrotestosterone, and E2 as they change in serum as a function of age, are not well established. Dihydrotestosterone has a major role in stimulation of the prostate; therefore, further development of selective androgens that are not converted to dihydrotestosterone, such as 7α-methyl-nortestosterone, may be important. These compounds will provide model substances to test the hypothesis that an androgen lacking this conversion is potentially safer. Similarly, the panel noted that much of the effect of T on bone may be due to conversion to E2, and additional studies should be designed to assess how the degree of conversion affects changes in bone metabolism. Aromatization and conversion to E2 may also be important for determining whether beneficial effects exist on cholesterol metabolism.
Whether emphasis should be placed on selective AR modulator development will probably depend on a better understanding of individual tissue responsiveness and metabolism of androgens in elderly subjects.
Regarding routes of administration, the panel recommended use of T patches or gels, where adequate androgen concentrations for a given biological response can be achieved by this route.
Additional studies recommended by the panel included attempts to examine relationships between AR polymorphisms and responses to androgens, and to determine whether these polymorphisms predict androgen responsiveness in long-term, longitudinal, epidemiological studies. The panel felt that the development of androgens or selective AR modulators for use in studies designed to assess their effects on specific targets, such as prostate growth, would be desirable and that Phase II studies designed to look at specific target tissues, such as the prostate, should be undertaken. Dose-response studies should also be undertaken in younger compared with older men, and the effects of various doses on the responses in specific target tissues (e.g. muscle, bone, etc.) should be analyzed, to determine whether the responsiveness of one particular tissue changes more with age than that of other tissues. The panel felt that more data were needed on the natural history of Leydig cell function with age, in particular whether there is a decrease in the number of Leydig cells or a decrease in the response of each cell to a given concentration of LH.
5. Based on the considerations in items 1–3, what, if any, ART intervention studies (Phase III clinical trials and/or smaller studies) would be particularly valuable to conduct at this time, with regard to: a) outcomes to be measured; b) subject population [androgen status (both the level and form of the androgen), age, etc.]; c) duration of intervention; d) provisions for patient safety; and e) ancillary studies regarding mechanisms affecting responses to ART?
There was a strong consensus that Phase III randomized trials of T replacement are needed. Reasons to conduct Phase III trials include:
The public health importance of ART is currently unknown, but could be enormous. Depending on how age-related T deficiency is eventually defined, between 5% and 50% of older men could be candidates for ART. Also, a paradigm might emerge that prevention of age-related decline in T levels is more effective than treatment of a deficiency syndrome initiated in old age. Whereas large trials are expensive, it is unrealistic to expect that a public health issue this large and complicated can be resolved without them.
Experience with HRT in women strongly suggests that epidemiological research alone, or even this research combined with small clinical trials, will not provide sufficient information.
There is presently a window of opportunity to study ART while it is still relatively uncommon. As new preparations of T come to market that are more convenient than im injections or currently available patches, ART will probably become more prevalent.
Regarding the selection of outcome measures for ART trials, clinical fractures (rather than bone density) as an end point is appropriate, because the effect of T on bone is of high potential importance, and there is substantial evidence T could be effective in preventing or treating osteoporosis in men. Whether to power a single multisite study to be able, in addition, to determine the effect of ART on prostate cancer and on cardiovascular disease would depend on whether this were the only large trial to be done in the near future. It would be appropriate to develop a plan for a prespecified combined analysis of prostate cancer and cardiovascular disease from more than one trial for these end points. One or more multisite studies would provide the opportunity for site-specific ancillary studies that can be done with modest sample sizes. Outcomes that can be addressed with sample sizes of 20–100 per group apparently include cognitive outcomes, body composition, muscle strength, and lipid measurements. Somewhat larger trials may be needed to quantify effects of T on performance measures of functional limitations.
In particular, there exist valid, sensitive performance measures of functional limitations that predict the risk of fall and the status of independent living. Ancillary studies might also include factorial designs, such as whether exercise and ART have additive effects on muscle mass.
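The statement that some outcomes can be addressed with 20–100 men per group can be made concrete by asking what standardized effect such a study could detect. The sketch below uses the normal approximation for a two-group comparison of means, in which the smallest detectable effect, in standard deviation units, is (z for alpha plus z for power) times the square root of 2/n. The two-sided alpha of 0.05 and the 80% power are assumptions, and the function name is hypothetical. An exact calculation based on the t distribution would give slightly larger values at small n.

```python
import math
from statistics import NormalDist


def minimal_detectable_effect(n_per_group, alpha=0.05, power=0.80):
    """Smallest standardized mean difference detectable with n men per arm.

    Normal approximation for two equal-sized groups with a common SD.
    """
    if n_per_group < 2:
        raise ValueError("need at least 2 subjects per group")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return (z_alpha + z_beta) * math.sqrt(2 / n_per_group)


if __name__ == "__main__":
    for n in (20, 50, 100):
        print(n, round(minimal_detectable_effect(n), 2))
```

Under these assumptions, 20 men per group suffice only for effects approaching 0.9 SD, whereas 100 per group can detect effects of about 0.4 SD. This is consistent with the text's expectation that outcomes with smaller or noisier effects, such as performance measures of functional limitations, need larger trials.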
The panel had a few suggestions about general design and conduct of ART studies:
As already mentioned, there should be standardized and centralized monitoring for adverse effects across studies. This would include monitoring for mortality effects (e.g. by using the National Death Index). It should also include monitoring for prostate cancer with a standard core set of data to be collected on all diagnosed cases and with central pathology review. For cardiovascular events, standard case definitions should be used across all studies, and all reported events meeting prespecified criteria should be adjudicated in a blinded manner by an expert panel.
Emphasis should be placed on understanding the effects of ART not just in typically healthy 50- to 75-yr-old men, but also in 75- to 90+-yr-old men who are at most risk of frail health, osteoporosis, heart disease, sarcopenia, etc., and generally have the lowest serum T levels. Men with chronic illness have T levels 10–15% lower than healthy men, and older men will have experienced the most functional decline. Trials need to answer the question of whether the oldest old receive the most benefits from ART.
Average treatment duration should be in the range of 3–5 yr.
Behavioral outcomes should be included, such as effects on sleep, physical activity, cognition, physical functioning, social functioning, sexual functioning, alcohol use, etc. Validated questionnaires exist for measuring a number of these behavioral outcomes. Careful review will be needed to select the questionnaires most suited to the populations and end points of interest.
Finally, the recommendation to fund Phase III trials is not a statement that Phase II trials are not useful. Phase II trials are still useful for estimating effect sizes on outcomes that are not well studied, for assessing dose response, for understanding relative effects of different T preparations, for defining the mechanism of T effects, and so on.
The panel recommended that if new, large, placebo-controlled trials are undertaken, at least one study should focus on changes in carbohydrate metabolism and insulin resistance, particularly for those subjects who lose substantial amounts of fat. Consideration should be given to clinical trials to evaluate postmyocardial infarction survival in men, particularly those with elevated cholesterol. Similarly, it was felt that a limited Phase II study to determine whether selective E receptor modulators are effective in preventing further bone loss in elderly men with established osteopenia would be desirable. If a large trial were undertaken, consideration should be given to quantifying prevention of the development of type 2 diabetes, myocardial infarction, and disability due to sarcopenia.
Footnotes
1 The Advisory Panel on Testosterone Replacement in Men convened on June 5–6, 2000. Panel Members: Dr. Michael Thorner (Chairman), University of Virginia Health System (Charlottesville, VA); Dr. David Buchner (Rapporteur), CDC/NCCDPHP (Atlanta, GA); Dr. David Clemmons (Rapporteur), University of North Carolina (Chapel Hill, NC); Dr. Dennis Ausiello, Massachusetts General Hospital, Harvard Medical School (Boston, MA); Dr. Elizabeth Barrett-Connor, University of California-San Diego (La Jolla, CA); Dr. William Bremner, University of Washington (Seattle, WA); Dr. Harry A. Guess, Merck Research Laboratories (Blue Bell, PA); Dr. Murray Raskind, VA Puget Sound Health Care System, University of Washington School of Medicine (Seattle, WA). Conveners: Drs. Stanley L. Slater and Evan Hadley, National Institute on Aging (Bethesda, MD).
Abbreviations: ART, Androgen replacement therapy; BPH, benign prostatic hypertrophy.
Received May 1, 2001.
Accepted June 16, 2001.
References