Department of Environmental Health Sciences, School of Public Health and Health Sciences, University of Massachusetts, Amherst, Massachusetts 01003
Received November 17, 2000; accepted April 17, 2001
ABSTRACT
Hormesis has been defined as a dose-response relationship in which there is a stimulatory response at low doses, but an inhibitory response at high doses, resulting in a U- or inverted U-shaped dose response. To assess the proportion of studies satisfying criteria for evidence of hormesis, a database was created from published toxicological literature using rigorous a priori entry and evaluative criteria. One percent (195 out of 20,285) of the published articles contained 668 dose-response relationships that met the entry criteria. Subsequent application of evaluative criteria revealed that 245 (37% of 668) dose-response relationships from 86 articles (0.4% of 20,285) satisfied requirements for evidence of hormesis. Quantitative evaluation of false-positive and false-negative responses indicated that the data were not very susceptible to such influences. A complementary analysis of all dose responses assessed by hypothesis testing or distributional analyses, where the units of comparison were treatment doses below the NOAEL, revealed that of 1089 doses below the NOAEL, 213 (19.5%) satisfied statistical significance or distributional data evaluative criteria for hormesis, 869 (80%) did not differ from the control, and 7 (0.6%) displayed evidence of false-positive values. The 32.5-fold (19.5% vs 0.6%) greater occurrence of hormetic responses than a response of similar magnitude in the opposite (negative) direction strongly supports the nonrandom nature of hormetic responses. This study, which provides the first documentation of a data-derived frequency of hormetic responses in the toxicologically oriented literature, indicates that when the study design satisfies a priori criteria (i.e., a well-defined NOAEL, at least 2 doses below the NOAEL, and an end point with the capacity to display either stimulatory or inhibitory responses), hormesis is frequently encountered and is broadly represented according to agent, model, and end point. These findings have broad-based implications for study design, risk assessment methods, and the establishment of optimal drug doses and suggest important evolutionarily adaptive strategies for dose-response relationships.
Key Words: hormesis; compensatory responses; overcompensation; U-shaped; J-shaped; dose response; low doses; risk assessment; extrapolation.
The occurrence of hormesis in the toxicological sciences has a long and controversial history (Calabrese and Baldwin, 2000a,b,c,d,e). Evidence supporting the existence of hormesis is substantial, with numerous reproducible examples suggesting potential broad generalizability (Calabrese et al., 1999). However, little information exists concerning the frequency of hormesis within the toxicological literature; that is, how often one would expect to observe hormesis given appropriate study design parameters. Two databases were previously created from the published literature to quantify aspects of hormetic responses in toxicological studies. In the case of Davis and Svendsgaard (1994), an attempt was made to estimate the incidence of hormetic responses based on the frequency of deviation from control responses independent of study design, NOAEL (no observed adverse effect level), and statistical significance. The second database (Calabrese and Baldwin, 1997a,b) focused on describing the quantitative features of the hormetic dose response and issues relating to generalizability rather than frequency in the toxicological literature.
Taking into consideration the limitations of the previous databases and incorporating suggestions by Crump (2001), a new database was created to assess the proportion of studies in the toxicological literature satisfying criteria for evidence of hormesis consistent with the definition of Stebbing (1998). Rigorous a priori entry criteria were established based on study design characteristics to identify data sets with the potential to detect a hormetic effect. Data sets meeting these criteria, independent of outcome, were entered into the database. Subsequent application of a priori evaluative criteria identified those dose-response relationships satisfying requirements for evidence of hormesis.
METHODS
Journal selection.
Because a broad range of experimental models, end points, and agents, including mixtures, was desired, two environmentally oriented toxicological journals (Environmental Pollution, 1970–1998; The Bulletin of Environmental Contamination and Toxicology, 1966–1998) and one pharmacologically oriented toxicological journal (Life Sciences, 1962–1998) were selected. Use of these journals ensured broad coverage of the toxicological literature without truncated end-point selection associated with more specialized journals. This was viewed as a desirable and necessary journal selection strategy at this stage of project development, as it would offer greater opportunity to address issues of generalizability. Furthermore, inclusion of approximately 30 years of articles from each journal ensured the opportunity to incorporate independent peer review over prolonged periods, studies reflecting changes in toxicological funding priorities (thereby enhancing the range of chemicals, end points, and hypotheses assessed), improvements in study design, analysis, and technical developments as the field evolved, and assessment of historical trends if needed.
Screening protocol.
All articles were initially screened in ascending chronological order beginning with volume 1, number 1 of each journal through 1998, with the exception of Life Sciences. Due to the increasingly large number of articles published per year in this journal (by the end of 1979 approximately 6000 articles had been screened with an annual publication rate increasing to over 600 articles), a decision was made to limit additional screening to 6 years, approximately equally spaced over the remaining 19 years of publication (1982, 1985, 1988, 1992, 1995, 1998). During the initial screening, exclusion and entry criteria described below were applied to all dose-response relationships reported in tabular or graphical form in each article. Dose-response relationships meeting the entry criteria were later examined with evaluative criteria described below for satisfying or not satisfying evidence of hormesis. The initial screening and the subsequent application of evaluative criteria were performed by the two authors; the results of the application of evaluative criteria were examined a second time by one of the authors.
Exclusion criteria.
Only studies with experimental data were considered. Review articles, abstracts, non-English language articles, epidemiologic studies, and field studies were excluded. Studies lacking any of the following conditions were excluded: (1) a concurrent control; (2) the capacity to achieve responses greater than (or less than, depending on end point) the control response (e.g., studies where the end point was survival and the control response was 100% or where the end point was tumor incidence and the control response was zero); (3) at least two doses below the NOAEL; and (4) at least one dose showing a priori criteria-based inhibition.2
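The four study-design conditions listed above can be pictured as a simple screening check. The following is a minimal sketch only; the `DoseResponse` record, its field names, and the function name are hypothetical conveniences for illustration and are not part of the original database.

```python
from dataclasses import dataclass


@dataclass
class DoseResponse:
    """Hypothetical summary of one dose-response relationship."""
    has_concurrent_control: bool
    can_exceed_control: bool       # end point is able to move in the stimulatory direction
    doses_below_noael: int
    doses_showing_inhibition: int


def passes_exclusion_screen(dr: DoseResponse) -> bool:
    """Return True only if none of the four exclusion conditions applies."""
    return (
        dr.has_concurrent_control
        and dr.can_exceed_control
        and dr.doses_below_noael >= 2
        and dr.doses_showing_inhibition >= 1
    )
```

For instance, a relationship with a concurrent control, a responsive end point, two doses below the NOAEL, and one inhibitory dose would pass, whereas the same relationship with only one dose below the NOAEL would not.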
NOAEL designation.
The NOAEL designation represents a unique dose that can be satisfied by only one dose. In the hormesis database this dose is satisfied by definitional determinants such that this dose represents the highest dose not differing from the control and having defined decrements at immediately higher doses. Any dose lower than this designated NOAEL that displays a response below that of the control would be interpreted as displaying either variability or error. As a result of this definition of NOAEL and applying it consistently throughout the database, possible subjective reinterpretation and designation of the NOAEL dose was prevented. The implications of this scheme were to allow for the inclusion of negative variability/error in the dose-response relationship below a designated NOAEL to permit false-positive estimation. If this approach had not been followed, some dose-response relationships could have been eliminated from satisfying entry criteria, ultimately resulting in a higher proportion of studies satisfying the evaluative criteria.
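The designation rule can be illustrated with a short sketch. It assumes responses expressed as percent of control, listed in ascending dose order, for an inverted U-shaped end point, and it assumes the quantitative threshold of at least 90% of control given in note 2 for studies without significant hypothesis tests. The function name and signature are hypothetical.

```python
from typing import Optional, Sequence


def designate_noael(pct_of_control: Sequence[float],
                    threshold: float = 90.0) -> Optional[int]:
    """Return the index of the highest dose whose response is not below
    `threshold`, or None if no higher dose shows a decrement.

    Responses are percent of control in ascending dose order.
    """
    last = len(pct_of_control) - 1
    for i in range(last, -1, -1):
        if pct_of_control[i] >= threshold:
            # Every dose above index i is below the threshold by construction,
            # so i qualifies only if at least one higher dose exists.
            return i if i < last else None
    return None
```

With responses of 88, 115, 95, and 60 percent of control, the sketch designates the third dose (index 2) as the NOAEL. The 88% response at the lowest dose is left in place as variability or error rather than prompting a redesignation, which is the behavior described above.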
Residual bias may occur as a result of the NOAEL designation used in this assessment. Some doses that are characterized as NOAELs may in fact display evidence of low/modest toxic responses. However, if the decrement does not achieve a certain designated level (e.g., statistical significance, percent decrement), a determination could be made for that dose being the NOAEL. Thus, it is possible to inappropriately designate a bona fide LOAEL (lowest observed adverse effect level) as a NOAEL. This concern is widely recognized in regulatory toxicology and is one of the reasons why the NOAEL has been broadly criticized with respect to its no adverse effect designation. This possible limitation has led to proposals for application of statistical procedures, such as the benchmark dose (BMD), to estimate the NOAEL. If a NOAEL is actually a LOAEL in the current hormesis database, this would have implications for detection of hormesis at lower doses in the dose response spectrum. In fact, it could limit the potential detection to possibly one dose under certain study design scenarios. Again, even this one dose may still actually represent a type of LOAEL, if in fact it too had low residual deficits. This suggests that for dose responses in the present hormesis database where the NOAEL reflects a dose with a slight/modest toxic response, a false-negative potential for hormesis estimation may exist.
A decision was made in the development of the criteria to include as NOAELs for evaluative purposes doses that could satisfy evaluative criteria for evidence of hormesis. Although it is possible that one could have eliminated NOAELs within an evaluative designation, this approach was rejected, since the NOAEL, when it exceeds the control value, could be considered as being in the hormetic zone. This is because the designation of the NOAEL is not a perfect representation of the zero equivalent point (i.e., the highest dose with a response equal to the control response), but could err on either side of the control for real biological effect purposes. For this reason, it was decided that it would be unfair to bias a determination against a hormetic perspective. It should be noted that it was argued above that mischaracterization of a LOAEL with a NOAEL could lead to false-negative representation. However, allowing a NOAEL to be positively identified as a hormetic response is not a misrepresentation.
Entry criteria.
The entry criteria were designed to ensure consistency with the U (or inverted U) shape of the hormetic dose-response relationship. That is, all studies needed to have sufficient evidence to demonstrate the occurrence of high-dose inhibition based on statistical and/or quantitative criteria, a NOAEL, and doses below the NOAEL that were to be evaluated for the potential of a low-dose stimulatory response based on statistical and/or quantitative criteria. Studies satisfying these general criteria were placed into one of three entry criteria tiers (T1, T2, T3) presented in Table 1: T1 includes dose-response relationships subjected to hypothesis testing; T2 was designed to identify dose-response relationships lacking hypothesis testing but reporting standard deviation (SD) or standard error of the mean (SEM) information, thereby providing information on the distribution of the data. T3 was designed to identify dose-response relationships defined only by data points reflecting mean/median values with no reference to variation.
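The tier assignment amounts to a two-step decision, sketched below. The function and argument names are hypothetical; the tier labels follow the text.

```python
def entry_tier(has_hypothesis_test: bool, reports_sd_or_sem: bool) -> str:
    """Assign an entry tier from the statistical information a study reports."""
    if has_hypothesis_test:
        return "T1"  # dose responses subjected to hypothesis testing
    if reports_sd_or_sem:
        return "T2"  # no hypothesis testing, but SD or SEM reported
    return "T3"      # mean/median values only, no measure of variation
```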
In cases where data were graphically represented, on some occasions error bars were depicted for treatment data points, but not for the control. In those cases the dose responses were considered indicative of potential statistical significance if the error bars (SD/SEM x 2) of the treatment did not cross the control value.
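One reading of this error-bar rule is sketched below: the treatment mean is bracketed by twice its reported SD or SEM, and the dose is flagged when that interval does not contain the control value. This is an interpretation for illustration, and the function name is hypothetical.

```python
def potentially_significant(treatment_mean: float,
                            treatment_err: float,
                            control_mean: float) -> bool:
    """True if the treatment mean +/- 2 x SD (or SEM) does not cross the control."""
    lower = treatment_mean - 2 * treatment_err
    upper = treatment_mean + 2 * treatment_err
    return not (lower <= control_mean <= upper)
```

A treatment response of 130 with an error of 10 against a control of 100 gives an interval of 110 to 150 and is flagged; a response of 115 with the same error gives 95 to 135 and is not.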
In order to avoid exclusion of potentially relevant data below the NOAEL and to enhance the rigor of evaluative criteria, dose-response relationships with at least three doses with responses ≥ 110% of control (or with responses ≤ 90% for J- or U-shaped curves), i.e., alternative quantitative criteria, were considered as satisfying evidence of low-dose stimulation in the absence of statistical significance or potential statistical significance as determined by data distribution.
In order to avoid exclusion of potentially relevant data due to absence of a statistically significant or potentially statistically significant inhibitory response at high doses, dose-response relationships with at least two doses with responses < 90% of control (or > 110% for J- or U-shaped curves) were considered satisfying evidence of inhibition in the absence of statistical significance or potential statistical significance as determined by data distribution.
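The alternative quantitative criteria for stimulation and inhibition can be sketched together. The example assumes an inverted U-shaped end point with responses in percent of control, and it reads the stimulation threshold as at least 110% and the inhibition threshold as below 90%. The function and parameter names are hypothetical.

```python
from typing import Sequence, Tuple


def meets_alternative_criteria(low_dose_pct: Sequence[float],
                               high_dose_pct: Sequence[float]) -> Tuple[bool, bool]:
    """Return (stimulation, inhibition) flags for an inverted U-shaped end point.

    low_dose_pct:  responses of the doses evaluated for stimulation
    high_dose_pct: responses of the doses above the NOAEL
    """
    stimulation = sum(r >= 110 for r in low_dose_pct) >= 3  # at least three doses
    inhibition = sum(r < 90 for r in high_dose_pct) >= 2    # at least two doses
    return stimulation, inhibition
```

For a U- or J-shaped end point the comparisons would be mirrored (at most 90% for stimulation, above 110% for inhibition).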
Assessment of false-positive responses.
An indication of the frequency of false-positive responses (i.e., to what extent the positive findings could be accounted for by chance or random variation) was obtained by assessing the responses of treatment doses below the NOAEL and comparing the proportion of negative findings to positive findings. This is based on the assumption that if chance or random variation was responsible for the positive findings (i.e., a hormesis designation) then the number of negative responses should approximate the number of positive responses. It should be noted that although the NOAEL dose was included when assessing dose-response relationships with the evaluative criteria (Table 1), only treatment doses below the NOAEL were evaluated for false-positive responses. The NOAEL by definition cannot display an adverse (or negative) response, and its inclusion in the assessment of false-positives would therefore bias the outcome. By excluding the NOAEL values in this assessment, bias favoring false-positive estimation was minimized. This approach therefore provides a rate of false-positive/negative estimates that could be applied to the total rather than deriving the absolute number by direct estimation.
When the evaluative criteria were based on the response of single doses, the proportion of false-positive findings was derived by dividing the total number of doses below the NOAEL showing significant or potentially significant negative responses by the total number of significant or potentially significant responses of a positive and negative nature for both the hypothesis testing (T1) and distributional data (T2) categories. A similar procedure was employed to estimate false-positive findings when alternative quantitative criteria were used.
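The proportion described above is a simple ratio, sketched here with a hypothetical function name. The counts in the usage note are invented for illustration and are not taken from the database.

```python
def false_positive_proportion(n_negative: int, n_positive: int) -> float:
    """Negative significant responses divided by all significant responses
    (positive plus negative) among doses below the NOAEL."""
    total = n_negative + n_positive
    if total == 0:
        raise ValueError("no significant responses to compare")
    return n_negative / total
```

With 5 significant negative responses and 95 significant positive responses, for example, the proportion would be 5/100. If chance alone produced the positive findings, the value would be expected to approach one half.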
Assessment of false-negative responses.
An indication of the frequency of false-negative responses was obtained by assessing the proportion of dose-response relationships satisfying the alternative quantitative evaluative criteria to the total number of dose-response relationships not satisfying evaluative criteria in the hypothesis testing category T1. This procedure was also applied to dose-response relationships that failed to satisfy evaluative criteria for the distributional data category T2.
RESULTS
Frequency of Hormetic Effects
Table 2 presents the results of application of the entry criteria organized by journal and year of publication. Of the 20,285 articles screened, 195 articles (1%) contained 668 dose-response relationships meeting the entry criteria. The number of articles screened was approximately equally divided between the environmentally oriented journals (51.5%; 10,462 articles published in Environmental Pollution and The Bulletin of Environmental Contamination and Toxicology) and the more pharmacologically oriented journal (48.4%; 9823 articles published in Life Sciences). Approximately 1% of the articles in each journal contained dose-response relationships meeting the entry criteria (Environmental Pollution, 1.2%; The Bulletin of Environmental Contamination and Toxicology, 0.8%; Life Sciences, 1.0%). The number of dose-response relationships meeting the entry criteria was approximately equally divided among the three journals (Environmental Pollution, 37.5%; The Bulletin of Environmental Contamination and Toxicology, 30.8%; Life Sciences, 31.5%).
DISCUSSION
The findings indicate that in studies satisfying entry criteria, 36.7% satisfied the evaluative criteria for a hormetic response. Although the above assessment indicates that the study findings cannot be accounted for by false-positive responses or by random variation, there are fundamental limitations in the current study methodology that are likely to yield a tendency for false-negative conclusions (values lower than actual hormesis estimates). First, the false-negative rate was nearly three times greater than the false-positive rate (i.e., 9.7% vs 3.5%). Second, the false-negative criteria were established as being twice as rigorous as the false-positive estimation procedure. Finally, while false-positive evaluation could be applied to all possible instances for positive responses, this was not the case for the 170 negative dose responses in the alternative quantitative criteria, for which no validation procedure is available. Given these three factors, it is likely that the 36.7% estimate of hormetic dose-response frequency is conservative and that the actual frequency is somewhat higher.
In addition, the study did not take temporal factors into consideration. Numerous investigations exist that demonstrate stimulatory responses occur only following a disruption in homeostasis, that is, after an initial decrement in response (Stebbing, 1998; Calabrese, 2001). If responses were not taken at multiple times during the experiment, possible stimulatory responses could be missed, leading to false-negative conclusions.
It is interesting to note that of 1089 treatment doses below the NOAEL using hypothesis testing and distributional data entry criteria (Table 4), 213 (19.5%) of the treatment doses were determined to satisfy hormesis evaluative criteria. Only seven treatment doses (7/1089 = 0.6%) were significantly below the control. This suggests that hormetic responses in these categories occurred approximately 30-fold (19.5%/0.6% = 32.5) more frequently than a response of similar magnitude in the opposite (negative) direction. This finding, which employs the treatment doses below the NOAEL as the unit of comparison, provides striking support for the position that hormetic effects cannot be attributed to chance.
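The two fold figures quoted in this paragraph can be reconciled with a short calculation from the reported counts. The sketch below uses only numbers stated in the text; the variable names are arbitrary.

```python
doses_below_noael = 1089
hormetic = 213        # doses satisfying hormesis evaluative criteria
below_control = 7     # doses significantly below the control

hormetic_pct = 100 * hormetic / doses_below_noael            # about 19.6
below_control_pct = 100 * below_control / doses_below_noael  # about 0.64

# Ratio taken directly from the counts (about 30.4)
fold_from_counts = hormetic / below_control

# Ratio taken from the rounded percentages quoted in the text (32.5)
fold_from_rounded = 19.5 / 0.6

print(f"{hormetic_pct:.2f}% vs {below_control_pct:.2f}%")
print(f"fold from counts: {fold_from_counts:.1f}")
print(f"fold from rounded percentages: {fold_from_rounded:.1f}")
```

The ratio of the raw counts (213/7) is about 30.4, which matches the "approximately 30-fold" statement, while the 32.5 figure arises from dividing the rounded percentages.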
The data further revealed that hormetic dose responses occurred broadly across biological models, end points, and chemical classes. These findings represent the first attempt to assess the frequency of hormetic responses within the context of a biological/toxicological model based on study design, dose response, and statistical features. The results are particularly noteworthy, as they directly challenge the long-held view that hormetic responses should be seen as statistical exceptions, paradoxical findings, or otherwise unexpected events.
Although the above findings suggest that hormetic responses are quite common if assessed with the appropriate study design criteria, only 1% of the more than 20,000 published articles contained data meeting the study design criteria for entry into the database. This emphasizes the fact that very few published studies have the potential for detecting hormetic responses in the low-dose region of dose-response relationships. In fact, the criteria used in the present study ignored temporal features. If adequate temporal features had been required, the proportion of studies satisfying entry criteria would have been far less than 1%. Yet, if hormetic effects are to be adequately characterized, multiple appropriately spaced doses need to be assessed over multiple periods. The combination of multiple doses and multiple periods places extraordinary demands on the investigation, and these requirements are generally ignored, at least in part, thereby limiting the opportunity to assess hormetic effects. Thus, it is not surprising that hormetic effects have been considered exceptions or paradoxical responses, as our findings indicate that only 1 of 100 studies has the appropriate dosage design needed to assess this hypothesis.
Although there are multiple reasons why entry criteria were not satisfied, the most likely reason is that the hormetic evaluation has high study design criteria requirements, especially with respect to the number of doses and doses below the NOAEL. At a minimum, four doses plus a concurrent control are required, with two of the four doses being below the NOAEL. Historically, there has been a strong emphasis on high-dose evaluation, as these responses are often more definitive and publishable for defining the NOAEL. These factors minimize the proportion of experiments that emphasize below-NOAEL responses. Likewise, there has been the long-standing belief that responses below the NOAEL are most likely due to normal variation and not reproducible treatment effects. It is this very central assumption of modern experimental and regulatory toxicology that the present findings challenge. Yet, it is this historically controlling assumption that has strongly influenced past toxicological study designs and contributes to the observation that 99% of studies do not satisfy the entry criteria for hormetic responses.
The selection of the three journals noted in the Methods section was designed to achieve a broad representation of biological models, agents tested, and end points assessed. Although this approach was generally successful in achieving these goals, there were several important omissions or underrepresentations in certain categories. For example, the number of microbiological models was minimally represented; likewise, studies involving various types of radiation were also minimal. Nonetheless, areas such as microbiological responses and effects of radioactivity have been extensively documented and are represented in the earlier and separate database developed by Calabrese and Baldwin (1997a,b). Such underrepresentation in the present study is believed to be a result of the journal selection rather than a biological restriction of the hormetic response.
The findings presented here add to and strengthen the earlier reports on the potential widespread generalizability of hormesis. They provide a useful complement to the Calabrese and Baldwin (1997a,b) hormesis database, which includes several thousand examples of dose-response relationships satisfying quantitative criteria for assessing hormesis, as well as study replication and mechanistic findings that account for the biphasic nature of the dose and temporal responses of the hormetic phenomenon. Although numerous examples of apparent hormetic responses exist independent of chemical, biological model, and end point, the previous database cannot address the issue of frequency of occurrence of hormetic responses. The current study addresses this limitation and suggests that hormetic responses are commonly encountered if the study design is appropriate.
The present findings have important implications for the design, conduct and interpretation of toxicological investigations as well as the potential to alter current concepts of NOAEL and challenge findings of risk assessment modeling activities commonly used for regulatory practices that assume linearity in low-dose areas. More specifically, for hormetic effects to be properly assessed, it is important that consideration be given to animal model and end-point selection. For example, an assessment of end points such as mutagenicity, carcinogenicity, and teratogenicity within a hormetic framework cannot be made using models with zero or negligible background/control incidence. It is also important to establish in a reliable manner the NOAEL for end points of interest and to include multiple and carefully spaced doses below the NOAEL. Furthermore, it may be necessary to include a temporal component within the study design if the hormetic mechanism represents an overcompensation response (Hart and Frame, 1996; Morré, 2000; Stebbing, 1998). The above suggestions are not trivial recommendations, as they require the commitment of substantial additional resources. Nonetheless, these features are necessary to more properly determine the nature of the dose response in the low-dose zone.
These findings address fundamental aspects of the nature of the dose response in the low-dose zone and suggest the need to incorporate U-shaped features in future modeling aspects of biological responses. Although the current investigation has focused on toxicologically derived data, sufficient data exist within the original hormesis database to indicate that this phenomenon is operational and similarly significant across the broad spectrum of biological, pharmacological, and other biomedical disciplines.
ACKNOWLEDGMENTS
The authors would like to thank Dr. Kenny Crump (ICF Clements) for his guidance and technical assistance. This effort was sponsored by the Air Force Office of Scientific Research, Air Force Materiel Command, USAF, under grant F49620-98-1-0091. The U.S. Government is authorized to reproduce and distribute for Governmental purposes notwithstanding any copyright notation thereon.
NOTES
The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Air Force Office of Scientific Research or the U.S. Government.
1 To whom correspondence should be addressed. Fax: (413) 545-4692. E-mail: edwardc{at}schoolph.umass.edu.
2 For purposes of this study, the NOAEL was defined as the highest dose with a response not statistically significantly different with respect to adverse responses from the control in studies where hypothesis testing was performed; in studies lacking hypothesis testing and in studies where hypothesis testing was performed but statistical significance was not observed with respect to adverse effects, the NOAEL was defined as the highest dose with a response ≥ 90% of the control for inverted U-shaped dose-response relationships or as the highest dose with a response ≤ 110% of the control for U- or J-shaped dose-response relationships. Inhibition was defined as occurring when: (1) the response for at least one dose higher than the NOAEL was statistically significantly different from the control in studies where hypothesis testing was performed; (2) the response for at least one dose higher than the NOAEL showed no 2x SD/SEM overlap with the control response in studies where only data distribution was reported; or (3) in the absence of statistical significance or nonoverlapping distributions, the response for at least two doses higher than the NOAEL was < 90% of the control for inverted U-shaped dose-response relationships or > 110% of the control for U- or J-shaped dose-response relationships.
REFERENCES
Calabrese, E. J. (in press). Overcompensation stimulation: A mechanism for hormetic effects. Crit. Rev. Toxicol.
Calabrese, E. J., and Baldwin, L. A. (1997a). A quantitatively-based methodology for the evaluation of chemical hormesis. Hum. Ecol. Risk Assess. 3, 545–554.
Calabrese, E. J., and Baldwin, L. A. (1997b). The dose determines the stimulation (and poison): Development of a chemical hormesis database. Int. J. Toxicol. 16, 545–559.
Calabrese, E. J., and Baldwin, L. A. (2000a). Chemical hormesis: Its historical foundations as a biological hypothesis. Hum. Exp. Toxicol. 19, 2–31.
Calabrese, E. J., and Baldwin, L. A. (2000b). Radiation hormesis: Its historical foundations as a biological hypothesis. Hum. Exp. Toxicol. 19, 41–75.
Calabrese, E. J., and Baldwin, L. A. (2000c). Radiation hormesis: The demise of a legitimate hypothesis. Hum. Exp. Toxicol. 19, 76–84.
Calabrese, E. J., and Baldwin, L. A. (2000d). Tales of two similar hypotheses: The rise and fall of chemical and radiation hormesis. Hum. Exp. Toxicol. 19, 85–97.
Calabrese, E. J., and Baldwin, L. A. (2000e). The marginalization of hormesis. Hum. Exp. Toxicol. 19, 32–40.
Calabrese, E. J., Baldwin, L. A., and Holland, C. D. (1999). Hormesis: A highly generalizable and reproducible phenomenon with important implications for risk assessment. Risk Anal. 19, 261–281.
Crump, K. S. (2001). The regulatory implications of hormesis: Is hormesis a universal phenomenon? Crit. Rev. Toxicol., in press.
Davis, J. M., and Svendsgaard, D. J. (1994). Nonmonotonic dose-response relationships in toxicological studies. In Biological Effects of Low Level Exposures: Dose-Response Relationships (E. J. Calabrese, Ed.), pp. 67–85. CRC Press, Boca Raton.
Hart, R. W., and Frame, L. T. (1996). Toxicological defense mechanisms and how they may affect the nature of dose-response relationships. BELLE Newsl. 5, 1–16.
Morré, D. J. (2000). Chemical hormesis in cell growth: A molecular target at the cell surface. J. Appl. Toxicol. 20, 157–163.
Stebbing, A. R. D. (1998). A theory for growth hormesis. Mutat. Res. 403, 249–258.