* DuPont Haskell Laboratory for Toxicology and Industrial Medicine, P. O. Box 50, Newark, Delaware 19714;
Procter and Gamble, Miami Valley Laboratories, P. O. Box 538707, Cincinnati, Ohio 45253-8707;
University of Washington, Department of Environmental Health, 4225 Roosevelt Avenue, Suite 100, Seattle, Washington 98105-6099; and
§ National Center for Environmental Assessment, and
¶ Office of Pollution Prevention and Toxics, U.S. Environmental Protection Agency, Ariel Rios Building, 1200 Pennsylvania Ave, NW, Washington, DC 20460
Received September 29, 2000; accepted January 3, 2001
ABSTRACT
Key Words: risk assessment; harmonization; cancer; noncancer; mode of action; dose response; uncertainty factors; interspecies extrapolation.
Introduction
EPA's proposed carcinogen risk-assessment guidelines are precisely in line with one of the major goals in the Society of Toxicology's long-range plan to improve the scientific basis of risk assessment. However, the publication of these guidelines highlights the significant differences that have evolved between the assessment of risks for developing cancer and that for other manifestations of toxicity, even when they all arise from the same mode of action. Current science, which establishes mechanistic links between noncancer responses to toxic agents and subsequent overt manifestations of toxicity such as cancer, suggests that these differences need to be resolved and that a common broad paradigm for dose-response assessment be developed for all toxicity endpoints.
Separate approaches to the assessment of cancer and noncancer health risks can be traced back to the origins of the cancer risk-assessment guidelines, in which the process of chemical carcinogenesis was thought to be similar to that of radiation carcinogenesis (IRLG, 1979). That is, risk assessment for chemical carcinogens was based on the assumption that any exposure carries with it a risk of cancer. For noncancer risk assessment, Lehman and Fitzhugh (1954) applied the more traditional toxicology principle of dose thresholds. In the past 20 years, however, there has been an explosion of basic research into molecular mechanisms of toxicity. Much of this work has identified common toxicological responses following exposure to carcinogens and noncarcinogens, and these findings raise the question of whether distinctly different philosophical approaches to cancer and noncancer risk assessment are appropriate (Butterworth and Bogdanffy, 1999). Furthermore, many of these mechanisms are likely not unique to a particular manifestation of toxicity, but rather are parts of a pathogenic pathway that results in a number of toxic responses. These responses may depend on a number of factors, such as the type of exposure and the sensitivity of the individual.
The goal of the workshop reported here was to provide a forum for the exchange of scientific views on the most critical risk assessment issues involved in developing a more consistent and unified approach to risk assessment for all toxic endpoints. The intent of the workshop was to build consensus where it could be achieved and to identify the range of opinions for those areas where consensus was not yet possible. The aim of this manuscript is to summarize these discussions. Thus, the content reflects the discussions of the workshop and individual breakout groups and does not necessarily reflect the views of any individual author.
Meeting Participants and Charge to the Participants
Prior to the meeting, the participants received background material and a list of focus questions that were used to guide discussions of the breakout groups and the plenary sessions. Also included were three examples of case studies and relevant scientific literature.2 These studies were used as resources for the discussions. The case studies included information on mammalian toxicology and risk assessments for ethylene oxide, ethylene thiourea, and trichloroethylene, and were chosen to provide a range of toxic responses and potential modes of action for the discussions. The focus questions assigned to each group were developed by the steering committee and are presented below in the individual breakout group reports.
Following an introductory plenary session in which Dr. Vu (U.S. EPA) presented an overview of various international definitions of the term "mode of action" (Table 1), Dr. Conolly (Chemical Industry Institute of Toxicology) provided his perspectives on an integrated approach to risk assessment for cancer and noncancer endpoints (Conolly, 1995). The participants then convened in their assigned breakout groups and began discussions of the charge questions.
Report of Breakout Group 1: Mode of Action as the Basis for Harmonization
Group 1 adopted the following working definition, which is applicable to all toxic manifestations:
Mode of action is a series of key events supported by a body of scientific knowledge that provide a biologically plausible explanation of causality for a given toxic effect within a context of dose and duration of exposure and susceptibility of target tissues. In contrast, "mechanism" of action refers to a complete understanding and demonstration of all biological steps leading to toxicity.
Question 1: Are There Similar Modes of Action Identified or Suspected for a Variety of Toxic Manifestations?
In light of current knowledge, and given the working definition of mode of action described above, the group generally agreed that there are common modes of action for different toxicities. Examples of such modes of action are cytotoxicity, mutagenesis, endocrine modulation, and immune suppression. However, the group was not comfortable with speaking in generalities and felt a need to link mode of action with dose and duration of exposure, and with the level of response (molecular, cellular, tissue, physiological).
The group also reviewed the IPCS framework for analyzing and evaluating a postulated mode of action for cancer risk assessment (IPCS, 1999). The IPCS framework is a tool providing a structured approach to assessing the overall weight of evidence for a postulated mode of action of a carcinogenic agent. The framework was developed by an international review group and was built on concepts discussed in the revised EPA cancer guidelines (U.S. EPA, 1999). It utilizes a modification of the Bradford Hill criteria for causality in human epidemiological studies (Hill, 1965), as adapted for noncancer endpoints (Faustman et al., 1996). The framework begins with a summary description of the postulated mode of action, followed by an evaluation of available data pertaining to the following issues or topics:
The framework is designed to bring transparency to the analysis of a postulated mode of action and thereby promote confidence in the conclusions reached; it is not designed to provide criteria for what constitutes sufficient evidence to establish a particular postulated mode of action.
Group 1 generally endorsed the utility of the IPCS analytical framework. It was recommended, however, that the section on dose-response relationship of the framework should be expanded to include other toxicity endpoints besides cancer, and an explicit and careful evaluation of the dose-response relationships for key events, precursor lesions, and more frank manifestations of toxicity. The discussion should include a qualitative evaluation of the effect and the shape of dose-response curves.
Questions 2 and 3: How Much Evidence Is Needed to Show That a Substance Acts Via a Particular Mode/Mechanism? How Much Evidence Is Needed to Show that Two Toxic Manifestations Caused by the Same Substance Were Produced by Different Modes of Action?
Overall, this breakout group expressed great difficulty in prescribing how much information or evidence is needed, both to establish a plausible mode of action and to judge whether different modes of action are involved (i.e., data-sufficiency criteria). The group could cite examples of mode-of-action information being used to conclude qualitatively that a particular mode of action was not relevant to humans (e.g., renal tumors in male rats associated with alpha-2u-globulin). However, more quantitative dose-response information is usually needed to judge that sufficient evidence is available to support the use of mode-of-action data in low-dose extrapolation. The group recommended that criteria for "sufficiency of evidence" be developed, although no specific criteria were proposed. They felt that a peer review process alone is not a sufficient means for accepting or rejecting a postulated mode of action, although peer review remains important for judging the technical integrity of the data presented to support a mode of action.
In order to facilitate the discussion of "sufficiency of evidence," the group reviewed the information provided as part of the workshop background materials on two of the case studies: ethylene thiourea (ETU) and ethylene oxide (ETO). With regard to ETU, the group focused the discussion on the three major target organs: thyroid (rats more susceptible than mice), pituitary (mice only), and liver (mice only). The group concluded that there was sufficient evidence to support the postulated mode of action for ETU-induced thyroid effects in the rat (i.e., inhibition of the enzyme thyroid peroxidase, resulting in decreased serum T3 and T4 levels, increased thyroid-stimulating hormone, thyroid follicular-cell hyperplasia, and, subsequently, thyroid follicular-cell adenomas and carcinomas). Thus, the group supported a threshold approach for both ETU-induced thyroid toxicity and carcinogenicity in the rat. However, they felt that there was not enough evidence provided in the case studies they reviewed to support a mode of action involving disruption of thyroid hormone homeostasis (i.e., via antithyroid action) that was common to thyroid, liver, and pituitary tumors.
The discussion of the ETO case study focused on cancer and genetic, developmental, reproductive, and neurotoxic endpoints. The group concluded that although a common mode of action, DNA and protein alkylation, is plausible for cancer and noncancer endpoints, the sufficiency of evidence for this mode of action for all endpoints varies greatly. Also, it was felt that this particular mode of action (i.e., via DNA and protein alkylation) does not necessarily imply that the dose response is linear for cancer and noncancer endpoints, as these effects arise from many events subsequent to macromolecular binding.
Taken together, these discussions illustrated that, in establishing a mode of action for a given endpoint in a single tissue or organ system, it is inappropriate to assume the same mode of action for other endpoints or organ systems without sufficient data to bridge the mode of action between endpoints or organ systems.
In summary, Group 1 participants were able to make constructive modifications to the mode-of-action framework (adding additional emphasis on dose-response and in vivo contexts) and were able to apply these points in evaluating the mode-of-action data for two case studies. Although the committee was unable to prescribe a set of common criteria for "sufficiency of evidence" a priori, within the context of these specific case studies, the committee was able to identify when they felt comfortable with proposed modes of action within a given endpoint and across endpoints.
Question 4: What Should Be the Dose-Response Approach for a Chemical that Produces Multiple Manifestations, but through a Similar Mode of Action?
The group generally supported the use of similar dose-response approaches for low-dose extrapolation regardless of endpoint (i.e., cancer or noncancer) when there is sufficient evidence for a common mode of action. The group had considerable discussion of EPA's current default approaches for low-dose extrapolation, which stipulate that, in the absence of knowledge supporting a particular mode of action, nonthreshold linear approaches are to be used for carcinogenic effects and threshold approaches for noncancer endpoints. In response to Question 4, more than half of the group felt that, in light of current scientific knowledge, it is a plausible default that both carcinogenic and noncarcinogenic responses follow biological threshold-like responses. Thus, a majority of the participants held the view that there should be no difference between genotoxic and nongenotoxic carcinogens with regard to thresholds or points of departure for low-dose considerations. Some expressed the view that all responses should follow nonthreshold responses, while others felt that nonthreshold responses should be considered for certain types of genotoxic mechanisms. In summary, the Breakout Group 1 discussions concluded that more options are needed for evaluating dose-response relationships. These considerations should be driven not by a distinction of cancer versus noncancer but rather by considerations of mode of action.
Report of Breakout Group 2: Common Levels of Adverse Effect across Toxicities for Use in Dose-Response Assessment
Initially, the group focused on quantal endpoints. Standard toxicology studies have been designed to characterize the potential hazard for specific toxicities. As a consequence, the studies may have different sample sizes, and the power may vary between studies as well as among endpoints within a specific study. This can result in difficulties when one attempts to compare the results of various toxicology studies. The breakout group discussed two examples that are often encountered. The first example was how to compare a neuropathology evaluation with a sample size of 5, a subchronic toxicity study with a sample size of 10, a prenatal developmental toxicity study with a sample size of 20, and a cancer bioassay with a sample size of 50. Some of these studies can detect a 5% change in some endpoints, whereas others may only detect a 40% change. The second example discussed was the situation where two chemicals are being compared for a given endpoint. One of the chemicals has been examined in a large study with a 1% detection level, while the second chemical has been examined in a small study with a 30% detection level. Just by the nature of the study design for the two chemicals, one of them will appear to be more toxic than the other.
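The resolving-power mismatch described above can be made concrete with a rough power calculation. The sample sizes follow the example in the text, while the 1% background rate, one-sided 5% significance level, 80% power, and the normal approximation are all illustrative assumptions, not workshop values:

```python
# Sketch: minimum detectable added incidence for a two-group quantal
# comparison (control vs. treated, equal n), normal approximation.
from math import sqrt

Z_ALPHA, Z_BETA = 1.645, 0.842  # one-sided 5% test, 80% power
BACKGROUND = 0.01               # assumed control incidence

def min_detectable_rate(n, p0=BACKGROUND, step=0.001):
    """Smallest treated-group incidence p1 distinguishable from p0."""
    p1 = p0
    while p1 < 1.0:
        p1 += step
        se0 = sqrt(2 * p0 * (1 - p0) / n)            # SE under H0
        se1 = sqrt((p0*(1-p0) + p1*(1-p1)) / n)      # SE under H1
        if p1 - p0 >= Z_ALPHA * se0 + Z_BETA * se1:
            return round(p1, 3)
    return 1.0

# The study designs compared in the text differ sharply in resolving power:
for n in (5, 10, 20, 50):
    print(n, min_detectable_rate(n))
```

Under these assumptions, the n = 5 design can resolve only changes several times larger than the n = 50 design can, which is the crux of the comparability problem the group discussed.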
The group agreed that there is no immediate solution to the problem and offered several available options. One option is simply to acknowledge that there are differences in power among studies and to select a standard level of change like an ED10, LED10, or ED50. However, caution needs to be exercised in the selection of the specific level of change. The group stressed that modeling should only be done within, or very near, the detectable range; modeling below that range is problematic. For this reason, selection of an ED10 would not be useful for studies with a resolving power of 30 or 50%. On the other hand, selection of an ED50 would not make full use of dose-response data for studies with statistical power to resolve a 5% change.
Epidemiology data were also discussed in the context of statistical power and limits of detection. Using epidemiological data, it is possible to identify a BMD that is much smaller (e.g., an ED0.01) than one based on typical chronic bioassay data, with correspondingly less extrapolation to environmentally relevant exposures. Proposed criteria for the selection of a BMD based on epidemiological data are that the BMD be within the observational range and that the BMD estimate be consistent across models.
A second option for providing consistency in dose-response assessments of the various types of data used in risk assessment is to acknowledge the bounds of the studies and utilize all the available data in a sliding-scale approach. The sliding scale would acknowledge that the resolving power of different types of studies is not comparable. It would permit the risk assessor to use a starting point for risk estimation that is within the experimental dose range for the type of study being used, with that starting point being dependent on study design. However, because study designs differ, more consideration of a study's resolving power would need to go into the consideration of the margin of exposure necessary to be protective. Criteria would need to be established for the sliding scale. The group acknowledged that one drawback to this approach is resistance by risk managers. Finally, the group stressed that in order to resolve this issue and make standard toxicology studies more comparable, some may need to be redesigned. Many current studies are designed for hazard identification, not dose-response assessment; they provide qualitative information, but not necessarily useful quantitative information.
Breakout Group 2 also discussed continuous endpoints and offered several options, one of which was to quantalize the data based on a justifiable cut-off point (Kavlock et al., 1995). For example, one could decide that changes of 10% or greater from control means are "adverse," and that changes of less than 10% are acceptable (Gaylor and Slikker, 1990). Another option is simply to model the continuous endpoint as a continuous variable. Finally, the group stressed that continuous endpoints should be viewed within the context of what is known about the mode of action and how a specific continuous endpoint may relate to a particular quantal endpoint. For example, a continuous endpoint may represent a precursor event to a particular quantal endpoint, and therefore provide information about the shape of the dose-response curve within a lower dose range. One would then have to apply knowledge about the biology and mode of action to determine a response level for the precursor that is considered acceptable.
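The quantalizing option can be sketched as follows. The 10% cutoff mirrors the example above, while the endpoint and the data themselves are purely illustrative:

```python
# Sketch of the "quantalize" option: flag each animal whose continuous
# response deviates from the control mean by >= 10%, then report a
# per-dose incidence. Data and the choice of endpoint are hypothetical.
control = [101.0, 99.5, 100.4, 98.9, 100.2]        # e.g., fetal weights (g)
groups = {0: control,
          10: [99.0, 97.5, 96.0, 98.8, 95.5],      # dose in mg/kg/day
          30: [92.0, 88.5, 90.1, 86.0, 91.2]}

def quantalize(dose_groups, cutoff=0.10):
    """Return {dose: fraction of animals beyond the cutoff vs. control mean}."""
    mean_c = sum(dose_groups[0]) / len(dose_groups[0])
    out = {}
    for dose, values in dose_groups.items():
        hits = sum(1 for v in values if abs(v - mean_c) / mean_c >= cutoff)
        out[dose] = hits / len(values)
    return out

print(quantalize(groups))
```

The resulting {dose: incidence} dictionary can then be treated like any other quantal dataset, which is exactly the appeal, and the hazard, of the approach: the cut-off point has to be biologically justified, not just convenient.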
Question 2: Should One Use a Consistent Response Level across Toxicities (e.g., 5%, 10%) for Comparing Adverse Effects and for Selecting the Point of Departure? How Does Mode of Action Impact the Point of Departure Selection?
The breakout group recognized several limitations in trying to use a consistent response level for comparing adverse effects, and cautioned that this may lead to similar problems encountered in the past when comparing NOAELs, RfDs, or cancer potencies. The breakout group stressed that some endpoints/studies should not be compared (e.g., 5% change in fetal body weight [continuous endpoint] vs. 5% increase in fetal death [dichotomous endpoint]) and that knowledge of the shape of the dose-response curve is requisite to making any comparison. Some study designs are useful for hazard identification but not well designed for quantitative risk assessment (e.g., the functional observation battery in the EPA neurotoxicity guideline). If the shape of the dose-response curve is not known, then the curves could cross, leading to errors in the comparison. In general, Group 2 did not recommend making such comparisons, but acknowledged that if this had to be done, it should be restricted to endpoints with similar response metrics, and should be based on a response level within the detectable limit of all the studies concerned (e.g., ED50). The breakout group stressed that it is not appropriate to prescribe, a priori, a consistent response level as the point of departure for all endpoints of toxicity.
Knowledge of the mode of action can impact the point of departure selection. If there is a common mode of action for the different toxicities associated with a chemical, then one should select the study with the greatest resolving power, or base the point of departure on a precursor event, which would require support for the conclusion that the event is linked to an accepted adverse effect. Use of a common precursor effect would increase confidence that the assessment was protective for all toxicities. The breakout group also discussed how mode-of-action information could influence decisions about which toxicology studies should be conducted for a given chemical. Knowledge of mode of action could allow for limited testing on chemicals with an assumed common mode of action. However, the breakout group also stressed that knowledge of mode of action could lead to false conclusions about potential effects. For example, a chemical may exhibit weak estrogenic activity, but not be associated with clinically adverse effects.
Question 3: How Can Severity of Response within an Endpoint Be Considered in Dose-Response Assessment?
Modern toxicology studies involve numerous measurements of the structure and function of tissues, organs, and individuals. For many of these, there appear to be progressive, dose-related increases, not only in magnitude of response but also in the severity of the effect elicited. That is, at low-dose levels, effects may be limited to pretoxic effects such as altered clinical chemistry parameters, xenobiotic enzyme induction, or subtle histological changes, while larger doses evoke responses that are frankly adverse. As toxicity measurements become more sophisticated, it will become possible to determine that many of these effects are mechanistically related. Whereas more subtle changes at the molecular or biochemical level may be evident at low doses, toxic manifestations may become more severe and involve cellular or whole-organ levels of biological organization at higher doses or longer duration of exposure.
It seems both a waste of information and an unreasonable approach to simply choose the most sensitive effect as the starting point for dose-response assessment, particularly given that studies designed to identify different toxic responses and conducted under different guidelines, or simply with less care, will likely be limited to the more fulminant manifestations of toxicity. Breakout Group 2 explored the available options for modeling effects of graded severity, and also issues around choosing the most sensitive effect as a point of departure.
It was noted that statistical models already exist that allow the use of all of the effects along a mechanistic continuum, from precursor events to frank toxicity. Categorical regression models were cited as having the longest history of use (Simpson et al., 1996). Other models have also been presented in the literature, such as one used to account for the relatively lesser severity of rib variations vs. frank rib malformations in rodents after oral exposure to high levels of boric acid (Allen et al., 1996). The group felt that the use of such models, and the development of biologically based models in the future, would improve risk assessment. However, the development and use of models must be coupled to the collection of richer data sets from toxicity studies.
The discussion then moved to a consideration of the problems inherent in choosing the most sensitive response (e.g., a precursor event) as the critical effect. If such an effect is chosen for one agent, and the same uncertainty factors are applied as might be applied to a more severe effect for a different agent, the reference dose for the former may be unduly conservative (or insufficient) relative to the latter. The breakout group briefly examined the traditional uncertainty factors to determine whether it would be appropriate to decrease their magnitude, or to eliminate certain factors, when basing an assessment on subtle precursor effects. While the group was able to identify some examples where uncertainty factors may not be relevant, little progress was made, and the group recommended that this issue be discussed separately by Group 3. Nevertheless, the group did urge that risk assessments explicitly distinguish between precursor events and frankly adverse effects. It was further recommended that, with enough information, mode of action-based models might be a viable alternative to uncertainty factor-based approaches for setting RfDs.
Question 4: How Should Slope of the Dose-Response Curve Be Considered in Dose-Response Assessment?
There was a great deal of discussion about whether steep or shallow dose-response curves were of greater concern. If a dose-response curve has a shallow slope in the observable range, then the expectation is that it will continue to have a shallow slope at lower dosages, and the default factors applied to arrive at an RfD may not be sufficient to achieve the desired risk reduction. Steep curves would be expected to require smaller margins between the NOAEL and the RfD to achieve comparable risk reduction.
There was some concern in the committee that the combination of a steep slope and small margin of safety is imprudent, because errors in estimation may have significant health consequences. The discussion on this point was far ranging, with some members of the group expressing their opinion that the level of their concern was contingent on the nature of the effect or the population that may be affected (e.g., children). No consensus was reached on this problem, but the nature of the concerns suggested that this is more an issue of risk management than risk assessment.
There are always numerous dose-response curves from the same study, even for different effects that are produced by the same mode of action. The question was raised of which dose-response curve to use in determining slope, particularly given that precursor events or lesser manifestations of toxicity would be present at more dose levels and might, therefore, have dose-response curves with better resolution than frank effects. It was decided that, if precursor effects were used as the basis for determining slope magnitude, the causal, rate-limiting precursor effect was the appropriate one to use. Limitations to this approach might be experienced when more severe effects of higher doses mask precursor effects, or if there is uncertainty concerning the "true" causal events.
The group discussed whether the slope of the dose-response curve from an animal study had predictive value for the shape of the dose-response curve in humans; if not, then there would be little reason to adjust for slope from animal studies. It was felt that mode-of-action information, including interspecies studies of pharmacokinetics, would be useful in addressing this concern. If it can be shown that the mode of action driving the dose-response relationship is operable in humans, then it would be more likely that the slopes of the dose-response curves would be similar.
The group considered the use of adjustment factors to account for slope in an RfD determination. One idea from the proposed cancer risk-assessment guidelines is an adjustment factor that would decrease as the slope increases (Fig. 1). It was also pointed out that confidence intervals on the dose-response curve partially address the shallow/steep controversy, because shallow curves have wider confidence intervals. In plenary discussion, it was noted that such use of adjustment factors to account for slope contrasts with other recommendations calling for greater direct use of dose-response relationships.
On the other hand, noncancer risk assessment involves the application of factors that account for the uncertainty in extrapolating across species or across the human population, but with no explicit factor for risk reduction other than an uncertainty factor for extrapolating a NOAEL from a LOAEL. In other words, if one determines experimentally that there is no more than a 10% risk at a certain dosage in the tested animal species, then applies 10-fold uncertainty factors to cover the possibility that humans are 10 times more sensitive than animals and that the most sensitive human subpopulation is 10 times more sensitive than the average human, it can be argued that the risk of toxicity to that subpopulation from exposure at the RfD is still around 10%, at least in theory (Sheehan et al., 1989).
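The back-of-the-envelope arithmetic in that argument can be written out explicitly; all doses and factors here are hypothetical:

```python
# Illustrative arithmetic: the 10-fold factors shift the dose scale but
# contain no explicit risk-reduction step, so under the stated worst-case
# sensitivity assumptions the ~10% response rate travels with the dose.
animal_dose_10pct_risk = 50.0              # mg/kg/day; <=10% risk observed
rfd = animal_dose_10pct_risk / (10 * 10)   # inter- x intraspecies factors
print(rfd)  # 0.5 mg/kg/day

# If humans are exactly 10x more sensitive than the test species, and the
# most sensitive subpopulation exactly 10x more sensitive again, the dose
# carrying ~10% risk for that subpopulation coincides with the RfD:
sensitive_human_10pct_dose = animal_dose_10pct_risk / 10 / 10
print(sensitive_human_10pct_dose == rfd)  # True: no margin below ~10% risk
```

The point, of course, is a theoretical worst case: the factors are intended as bounds on sensitivity differences, not as guarantees of reduced residual risk.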
The group found that maintaining these two different approaches is likely to be an impediment to harmonization of risk-assessment practices. The divergent approaches currently used for cancer and noncancer risk assessment are also an impediment to the acceptance of the benchmark dose as an alternative to the no-observed-adverse-effect level (NOAEL) as a starting point for risk estimation. The benchmark dose is a defined, nonzero level of effect, whereas the NOAEL represents an indeterminate level of response that may be zero or greater and that is highly influenced by study design issues affecting the statistical power of the study. One solution that was suggested for the benchmark dose/NOAEL problem is better communication of the true meaning of NOAEL, particularly in showing that there may be risk at the NOAEL. For studies of low statistical power, it may be that risks of adverse effects at the NOAEL are significant and may be much higher than expected at the benchmark response level. However, estimating doses associated with lower risks requires extrapolation outside of the region of experimental observation, and this process is also fraught with uncertainty. The closer the extrapolation is to the region of observation, the less significant these uncertainties become.
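The contrast between the benchmark dose and the NOAEL can be sketched with a simple quantal-linear model, P(d) = p0 + (1 − p0)(1 − exp(−b·d)). The background rate, slope parameter, and model choice are illustrative assumptions, not values from the workshop:

```python
# Sketch: benchmark dose at 10% extra risk for a one-parameter
# quantal-linear model. Parameters are hypothetical.
from math import exp, log

P0 = 0.02   # assumed background incidence
B = 0.004   # assumed fitted slope parameter (per mg/kg/day)

def extra_risk(d):
    """Extra risk over background at dose d for the quantal-linear model."""
    p = P0 + (1 - P0) * (1 - exp(-B * d))
    return (p - P0) / (1 - P0)

# For this model, the BMD at 10% extra risk has a closed form:
bmd10 = -log(1 - 0.10) / B
print(round(bmd10, 1))  # ~26 mg/kg/day under these assumed parameters
```

Unlike a NOAEL, the benchmark dose corresponds to a defined, nonzero response level and uses the shape of the fitted curve; a NOAEL, by contrast, depends on which doses happened to be tested and on the power of the study.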
Report of Breakout Group 3: Scaling and Uncertainty Factors
For ingestion exposures, the current practice advocated by most regulatory agencies worldwide is to make adjustments for dose when extrapolating from data in test animals to humans by assuming that pharmacokinetic differences impart greater internal exposure to the parent chemical or its toxic metabolites in humans, relative to test species. Traditionally, dosimetry adjustment has been accomplished using categorical defaults (Appendix 1); i.e., by dividing the NOAEL or LOAEL by factors of 10 to allow for interspecies differences and human variability (Dourson and Stara, 1983). More recently, it has been recognized that each factor of 10 must allow for both pharmacokinetics (PK) and pharmacodynamics (PD), both of which contribute to the overall response (Andersen et al., 1995; Renwick, 1991, 1993). It has been suggested that the inter- and intraspecies extrapolation factors of 10 are each composed of two log-normally distributed factors of 10^0.5, or approximately 3, each. Thus, it has been proposed that each 10-fold factor be subdivided into two numerical values, the product of which is the original value of 10 (Renwick, 1993; IPCS, 1994). This scheme allows chemical-specific data to replace either the PK or the PD component of the default factors.
Evaluations of data on PK and PD differences between rodents and humans, and among different human individuals, gave rise to the proposal that the interspecies factor be split (4.0 and 2.5 for PK and PD, respectively), and that the 10-fold factor for human variability be evenly divided into PK and PD values of 3.16 each (Appendix 2, Fig. 2) (IPCS, 1994; Renwick, 1993; Renwick and Lazarus, 1998). Segregating the interspecies factor of 10 into separate PK and PD components allows investigation of these components for a given chemical risk assessment through further directed experimental research, and replacement of a PK or PD default by chemical-specific data (Renwick, 1993). The product of a chemical-specific value and the remaining default factors for which data are not available gives rise to a chemical-specific factor termed a "data-derived uncertainty factor." Appendix 2 and Figure 2 establish a framework whereby the results of chemical-specific investigations conducted to inform these components can be easily integrated into risk assessment.
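The bookkeeping behind a data-derived uncertainty factor can be sketched as follows. The default component values are those cited above (4.0 × 2.5 interspecies, 3.16 × 3.16 intraspecies); the chemical-specific PK value of 2.0 is a hypothetical stand-in for experimental data:

```python
# Sketch of the Renwick/IPCS subdivision: each 10-fold factor splits into
# PK and PD components, and a chemical-specific value can replace any one
# component while the remaining defaults are retained.
DEFAULTS = {"inter_PK": 4.0, "inter_PD": 2.5,
            "intra_PK": 3.16, "intra_PD": 3.16}

def data_derived_factor(chemical_specific=None):
    """Product of the four components, with any replaced by data."""
    factors = dict(DEFAULTS)
    if chemical_specific:
        factors.update(chemical_specific)  # e.g., measured PK ratio
    product = 1.0
    for v in factors.values():
        product *= v
    return product

print(round(data_derived_factor(), 1))                   # ~100 with all defaults
print(round(data_derived_factor({"inter_PK": 2.0}), 1))  # measured PK replaces 4.0
```

When a measured interspecies PK ratio of 2.0 (hypothetical) replaces the 4.0 default, the composite factor roughly halves, which is precisely the incentive the scheme creates for generating chemical-specific data.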
The introduction of chemical-specific data by the application of data-derived factors for dosimetry (PK) adjustment is gaining wider acceptance internationally (IPCS, 1994, 1999
; Meek, et al., 1994
; Renwick, 1993
) for both cancer and noncancer dose-response assessments. However, when conducting cancer risk assessments, the EPA uses a different approach to achieve a similar accounting for dosimetric differences between rodents and humans. For cancer risk assessments only, EPA applies a default approach whereby interspecies dosimetry adjustment is accomplished through allometric scaling. In this approach, the daily dose, in mg, is multiplied by the ratio of body weights raised to the 3/4 power.3 This approach effectively reduces the daily dose by a factor of approximately 4 or 7 when extrapolating from rat or mouse data, respectively. This adjustment, based largely on empirical observations of metabolism, clearance, and cancer potencies across species, is said to account for both PD and PK differences between rodents and humans (U.S. EPA, 1992
).
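The magnitude of this default adjustment can be checked directly from the scaling rule. The body weights below (rat 0.35 kg, mouse 0.03 kg, human 70 kg) are commonly used illustrative defaults, not values taken from the text:

```python
# Cross-species scaling by body weight**(3/4): a dose in mg/kg/day is
# multiplied by (BW_animal / BW_human)**(1/4) to obtain a human
# equivalent dose, which reduces the animal dose by the factors below.

def interspecies_dose_factor(bw_animal_kg, bw_human_kg=70.0):
    """Dose-reduction factor implied by BW**(3/4) scaling of mg/kg/day doses."""
    return (bw_human_kg / bw_animal_kg) ** 0.25

print(round(interspecies_dose_factor(0.35), 1))  # rat:   ~3.8
print(round(interspecies_dose_factor(0.03), 1))  # mouse: ~6.9
```

These results reproduce the "approximately 4 or 7" reduction cited above for rat and mouse data.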
The Group 3 participants discussed these approaches to dosimetric adjustment and clarified the differences between the default uncertainty factors being considered here and those additional factors that are more appropriately referred to as "safety factors." The workgroup agreed that, for the purposes of this workshop, the term "safety factor," such as the factor proposed in the context of the Food Quality Protection Act to protect children from risks of pesticide exposure, applies strictly to factors imposed to provide additional precaution or comfort. The necessity and magnitude of these factors are the purview of risk management; as such, safety factors were not discussed by the group. It should be made clear, however, that in some countries and advisory bodies, the default uncertainty factors for inter- and intraspecies differences are referred to as "safety factors."
The following points of consensus were reached by Breakout Group 3:
Question 2: How Should Exposure-Duration Relationships Be Taken into Account?
Currently, many aspects of adjusting for exposure duration are dealt with for both cancer and noncancer health effects using defaults based on the C x T concept. This concept, associated with Haber (1924), assumes that equal effects are observed when the product of concentration and time is equal, regardless of the value of either parameter individually. For example, in setting a chronic RfD for noncancer health effects, C x T is the default assumption, and adjustments are made on that basis. Examples of such adjustments include extrapolation from 5 days per week to 7 days per week for an oral exposure, or from 6 to 24 h/day for inhalation exposure. However, when extrapolating from a subchronic (approximately 90-day) study to a chronic (usually 2-year rodent) study, an uncertainty factor (default 10-fold) is applied to account for uncertainties associated with the lack of chronic data. For shorter-term reference values, for example an acute inhalation reference value, the C x T adjustment is made when extrapolating from shorter-term to longer-term exposures (e.g., 6 to 24 h), but not when extrapolating from longer-term to shorter-term exposure durations. In the latter instance, the longer-term reference value is typically used for both exposure durations, as a more conservative estimate of the potentially toxic dose. This approach to duration adjustment is a conservative use of Haber's Law, as shown in several studies on neurotoxicity and developmental toxicity (e.g., Bushnell, 1997; Crofton and Zhao, 1997; Weller et al., 1999). Modeling of data from these studies indicates that using C x T to adjust from shorter to longer durations tends to overestimate risk, while similar adjustments from longer to shorter durations tend to underestimate risk.
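The default duration adjustments described above amount to simple proportional scaling under the C x T assumption. A minimal sketch follows; the function name and the 100-ppm effect level are invented for illustration:

```python
# Haber's C x T default: scale an intermittent experimental exposure to a
# continuous-equivalent concentration by holding C * T constant.

def duration_adjust(conc, hours_per_day, days_per_week):
    """Continuous (24 h/day, 7 day/week) equivalent of an intermittent
    exposure, under the C x T assumption."""
    return conc * (hours_per_day / 24.0) * (days_per_week / 7.0)

# Hypothetical 100-ppm effect level from a 6 h/day, 5 day/week study:
print(round(duration_adjust(100.0, 6, 5), 1))  # ~17.9 ppm continuous equivalent
```

As the modeling studies cited above suggest, this proportional rule is conservative in one direction (shorter to longer) and anticonservative in the other, so it is applied asymmetrically in practice.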
For cancer, most data are from near-lifetime (2-year bioassay) studies in rodents, although less-than-lifetime data are sometimes used, and data from humans may be available. The C x T concept is also used in cancer dose-response assessment to calculate the lifetime average daily dose (LADD) as follows:
LADD = (average daily dose during the exposure period x exposure duration) / lifetime
Where interim sacrifice data are available, the validity of the C x T assumption can be assessed. Alternatively, these data may be useful for deriving more appropriate methods of relating exposure duration to response. Group 3 agreed that studies should be designed with greater attention to the utility of interim and postexposure sacrifice data for addressing dose/temporal toxicity relationships.
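The LADD calculation described above reduces to a one-line proration of dose over lifetime. In this sketch, the 70-year default lifetime and the dose and duration values are illustrative assumptions:

```python
# Lifetime average daily dose (LADD): prorate the dose received during the
# exposure period over the full lifetime, per the C x T concept.

def ladd(daily_dose_mg_per_kg, exposure_years, lifetime_years=70.0):
    """Average daily dose over a lifetime, in the same units as the input."""
    return daily_dose_mg_per_kg * exposure_years / lifetime_years

# Hypothetical exposure: 1 mg/kg/day for 10 years of a 70-year lifetime
print(round(ladd(1.0, 10), 3))  # 0.143 mg/kg/day
```

Note that this averaging embeds the C x T assumption: a high dose over a short period and a low dose over a long period yield the same LADD, which is exactly the equivalence the interim sacrifice data discussed above could test.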
Within the framework proposed (Appendix 2), there were a number of issues pointed out by the group that should be considered in adjusting for exposure-duration relationships:
The workgroup agreed that the approaches used for exposure-duration adjustment should be the same for cancer and noncancer endpoints. These approaches should be harmonized under the concept that mode of action provides the rationale used for exposure-duration adjustments.
Question 3: How Can Interspecies and Intraspecies Variability Be Treated Consistently for All Endpoints?
Variability in responses to toxic agents is a combined function of variability in individual physiology and biochemistry affecting PK and PD. The group was very clear in endorsing the need for explicit and consistent accounting for variability within the human population. Thus, the framework presented in Appendix 2 and Figure 2 allows for data-derived categorical expressions of variability or the use of biologically based PK/PD models. These approaches are appropriate for both cancer and noncancer endpoints.
The group also agreed that where data are available, distributional approaches to individual susceptibility among humans for toxic effects (governed by PK and PD differences) might replace the default PK and PD components that account for within-human-population variability (Hattis et al., 1999). The challenge will be to adequately separate and characterize the variability of the PK and PD components. Thus, it might be possible, with further research, to replace both interindividual factors of 3.16 (10^0.5) proposed by IPCS (1994) with chemical-specific population distributions for kinetics and dynamics (e.g., population distributions of the key PK and PD parameters) (Fig. 3). While these approaches are very much in their infancy, they offer the opportunity to inform and increase the level of objectivity of our assessments of human variability. The limited work to date, evaluating the PK and specific PD responses to therapeutic agents, suggests that the product of the two factors of 3.16 is adequate to cover the variability in all but approximately 2 in 10,000 individuals (assuming that variability is log-normally distributed) or 3 in 10^6 individuals (assuming that variability is normally distributed) (Renwick and Lazarus, 1998
).
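The coverage argument can be sketched numerically with a Monte Carlo draw. Assume, as in the text, that PK and PD variability are each log-normal; the additional assumption made here, purely for illustration, is that a factor of 3.16 covers the 99th percentile of each component. The fraction of the simulated population falling outside the combined factor of 10 then comes out near 5 in 10,000, the same order of magnitude as the log-normal estimate quoted above:

```python
import numpy as np

# Monte Carlo sketch: combine two log-normal variability components (PK, PD),
# each scaled so that a factor of 3.16 (10**0.5) covers 99% of individuals.
# The 99% coverage level per component is an illustrative assumption.

rng = np.random.default_rng(0)
n = 1_000_000
z99 = 2.3263                # standard normal 99th percentile
sigma = 0.5 / z99           # SD of log10(component) so P(component < 3.16) = 0.99

pk = 10 ** rng.normal(0.0, sigma, n)
pd_ = 10 ** rng.normal(0.0, sigma, n)
combined = pk * pd_         # overall individual sensitivity ratio

frac_uncovered = np.mean(combined > 10.0)
print(frac_uncovered)       # on the order of 5e-4 under these assumptions
```

The exact tail fraction is sensitive to the per-component coverage assumption, which is precisely why chemical-specific population distributions would be more informative than fixed defaults.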
The workgroup stressed that the type of factor applied in dose-response assessment depends on the adequacy of both the pharmacokinetic and pharmacodynamic information available. All data for both cancer and noncancer health effects must be considered in making judgements about severity and its influence on the size of the uncertainty factor applied. Certainly, the move toward using mode of action information in risk assessment will include consideration of the nature of an effect. One major concern expressed about using severity as a basis for comparison is that the levels of effect for different outcomes need to be comparable, particularly when comparing across organ systems. However, this requires scientific judgement, which can be very difficult. From a public health point of view, one might ask questions about how much harm will be done if the choice of the critical effect is wrong, and whether the effect is reversible, a common or rare effect, and/or associated with a common exposure. It was stated that dose-response evaluations should be conducted for all endpoints, including calculation of the human equivalent doses and the application of uncertainty factors in order to determine the most appropriate endpoint to use for risk-assessment purposes.
Group 3 briefly discussed the issue of the relative adversity of responses (e.g., developmental toxicity vs. neurotoxicity) and recognized that this issue is too large to be resolved in the workshop. It was apparent from the discussion that there are several ways to approach the problem, and that it should be pursued in a separate forum.
Question 5: What Aggregate Uncertainty Factor (i.e., Product of Individual Uncertainty Factors) Is Appropriate to Determine the RfD and RfC, or to Evaluate the MOE? How Does Mode of Action or Spectrum of Toxic Manifestations Influence the Answer to this Question?
The group declined to specify a default aggregate uncertainty factor, feeling that this would not be a science-based approach to the problem, and emphasized that the choice of the total uncertainty factor depends on the database available as well as the mode of action. The group strongly agreed that approaches to cancer and noncancer assessment should be harmonized, particularly by abandoning the cancer/noncancer dichotomy and instead considering the mode of action in guiding decisions on methods of low-dose and interspecies extrapolation. The group discussed the different assumptions that are made as the basis for linear (genotoxic, no-threshold) and nonlinear (nongenotoxic, possible threshold) low-dose approaches. In the case of linear low-dose assessment, the resulting metric is a measure of risk per unit dose (a cancer potency or slope factor) that can be used to derive a "virtually safe dose" (e.g., a dose associated with a 1 x 10^-6 cancer risk). These expressions of potency can impart more precision than is warranted. In contrast, for nonlinear assessments, the result is a "virtually safe dose" with no associated estimate of risk. Some members of the group questioned the linear low-dose assumption, and stated that much of the early work leading to this assumption was based on radiation data, which may behave very differently from chemicals reacting in the body. Others raised the issue of additivity to background, especially for agents that occur endogenously or add to endogenous mechanisms, advocating the use of linear low-dose assessment in these cases. Ultimately, there was consensus that biological thresholds are possible for both cancer and noncancer endpoints. Most advocated the use of uncertainty factors for both cancer and noncancer risk assessments and the calculation of only a "virtually safe dose" or its equivalent, without a slope factor.
A minority of participants did not want to exclude the possibility of low-dose linearity for any carcinogenic mechanism and expressed the need for slope factors that enable population risk estimates.
Question 6: How Should One Account for a Database Deficiency?
Currently, risk-assessment approaches used by various regulatory agencies account for database insufficiency by further dividing the NOAEL by an uncertainty factor. For example, the U.S. EPA uses a database uncertainty factor of up to 10 (generally either 3 or 10) if key toxicity studies are not available or are unreliable. The group's discussion of this question concluded with the following major points:
Summary and Conclusions
Accomplishments of the Workshop
The workshop developed specific definitions that will prove useful in the future. One was the definition of "harmonization," as it was used for the purposes of this workshop:
Harmonization refers to developing a consistent set of principles and guidelines for drawing inferences from scientific information. It does not mean that a single method should be used for the assessment of all toxicities and chemicals.
In addition, as noted above, Group 1 developed a working definition of mode of action that could be considered applicable to all toxic manifestations. There was acceptance that understanding the mode of action of a chemical is ultimately critical for nondefault risk assessment, that common modes of action for different toxicities can be defined, and that our approach to assessing toxicity data should be biologically consistent. However, it was also recognized that a toxic response to an environmental exposure should not be viewed simplistically, and that there is a need to link mode of action with exposure and response at different levels of biological organization. Particular importance was given to assessing toxicity within the context of the exposure-response relationship. This pertains not only to the ultimate manifestation of toxicity, but to the entire process of pathogenesis. This will include effects at the molecular, cellular, and tissue/organ levels, as well as at the whole-organism level, requiring a better understanding of how various biological events are related, both qualitatively and quantitatively, to the overall effect on an organism. There was consensus that, to the extent possible, all data should be considered in assessing the potential toxicity of an exposure, that the data should drive the choice of analysis and not vice versa, and that the basis for any assessment should be biological and not statistical. Choosing the "most sensitive endpoint" can severely limit the data available for consideration, and will not provide a profile of the overall toxicity that may be associated with an exposure. Limiting the data also squanders information that may be important when considering other exposure scenarios or regulatory uses of the data.
The Future
This workshop was the culmination of over a year of planning by the cosponsors and organizers. While much was accomplished, the workshop should be considered a significant first step toward harmonizing approaches to cancer and noncancer risk assessment. Certainly, the need for continued work and discussion on the harmonization of risk-assessment approaches is clear. Hopefully, these workshop proceedings can serve as a jump start for future work in this area.
APPENDIX 1
All of these factors could be removed by the development of appropriate test data. Further analysis of existing databases, for example by distributional analysis, may help to provide more scientifically based default values or allow estimation of the uncertainty distribution.
Interspecies defaults (adjustment factors).
The selection of the default factor depends on the route of exposure and the site of uptake/deposition. Different factors may be used for oral exposure causing systemic effects (typically 10) and for inhalation exposure (with dosimetric correction for deposition) causing local toxicity (typically 3). Interspecies differences arise from both kinetic and dynamic differences. Alternative defaults are possible to account for kinetic aspects, such as the ratio of body weight^0.75 or body weight^0.66, or a generic kinetic default of 4.0 to allow for species differences in the parent compound after oral dosage (see Appendix 2). These kinetic defaults would be multiplied by the generic default for dynamics of 2.5. Future developments could include pathway-related defaults (for kinetics) and process- or mode of action-related defaults (for dynamics) to allow fine-tuning of the factor to the chemical in the absence of detailed chemical-specific data (see Fig. 2).
Intraspecies (interindividual) default.
A categorical default of 10 is usually applied for both oral and inhalation to account for kinetic and dynamic variability in the human population. Future developments will include pathway-related defaults (for kinetics) and process or mode of action-related defaults (for dynamics), to allow fine-tuning of the factor to the chemical in the absence of detailed chemical-specific data.
APPENDIX 2
Chemical-specific (adjustment) factors.
Factors in which one or more of the categorical defaults for inter- and intraspecies differences has been modified by the incorporation of chemical-specific data regarding kinetics or mode of action.
Data-derived factors.
Factors in which physiologically based kinetic parameters or target organ sensitivity data are used to replace part of a default (requiring information on mode of action). (Factors may be required to consider uncertainty in the chemical-specific data/approach.) The product of the chemical-specific values and any remaining defaults is termed a data-derived safety factor or a data-derived uncertainty factor.
PBPK analysis.
Used to replace the toxicokinetic aspect of defaults (requires information on mode of action). PBPK models are useful for describing interspecies, high-to-low-dose, and temporal extrapolations, but require key physiological and biochemical kinetic constants. These may be available from the literature or derived experimentally. (Factors may be required to consider uncertainty in the chemical-specific data/approach.)
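As a toy illustration of the idea of replacing a kinetic default with chemical-specific data (not a validated PBPK model), a one-compartment comparison of internal dose per unit external dose can supply an interspecies PK factor in place of the generic 4.0. All parameter values below are invented:

```python
# Toy kinetic replacement: compare internal dose (AUC of parent compound)
# per unit external dose between test species and humans, and use that
# ratio in place of the generic interspecies PK default of 4.0.

def auc_per_dose(fraction_absorbed, clearance_l_per_h_per_kg):
    """AUC (h*mg/L) per 1 mg/kg oral dose for a one-compartment model:
    AUC = F * Dose / CL."""
    return fraction_absorbed * 1.0 / clearance_l_per_h_per_kg

# Invented illustrative parameters:
auc_human = auc_per_dose(fraction_absorbed=0.9, clearance_l_per_h_per_kg=0.05)
auc_rat = auc_per_dose(fraction_absorbed=0.8, clearance_l_per_h_per_kg=0.30)

# Chemical-specific interspecies PK factor: how much more internal dose a
# human receives than a rat at the same external dose (mg/kg).
pk_factor = auc_human / auc_rat
print(round(pk_factor, 2))  # 6.75 -> would replace the PK default of 4.0
```

A full PBPK model would replace this single-clearance sketch with physiological flows and tissue partitioning, but the logic of substituting a measured kinetic ratio for the default component is the same.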
BBDR (biologically based dose-response) modeling, used to replace the pharmacokinetic aspects together with part or all of the pharmacodynamic aspects.
Usually applied to interspecies comparisons; rarely used to model human variability. (Factors may be required to consider uncertainty in the chemical-specific data/approach.)
ACKNOWLEDGMENTS
NOTES
2 The case studies are available at www.toxicology.org or by written request to the corresponding author.
3 In practice, the test species dose in mg/kg/day is multiplied by the ratio of test species body weight^(1/4) to human body weight^(1/4).
REFERENCES
Andersen, M. E., Clewell, H., and Krishnan, K. (1995). Tissue dosimetry, pharmacokinetic modeling, and interspecies scaling factors. Risk Anal. 15, 533–537.
Bushnell, P. J. (1997). Concentration-time relationships for the effects of inhaled trichloroethylene on signal detection behavior in rats. Fundam. Appl. Toxicol. 36, 30–38.
Butterworth, B. E., and Bogdanffy, M. S. (1999). A comprehensive approach for integration of toxicity and cancer risk assessments. Regul. Toxicol. Pharmacol. 29, 23–36.
Conolly, R. B. (1995). Cancer and noncancer risk assessment: Not so different if you consider mechanisms. Toxicology 102, 179–188.
Crofton, K. M., and Zhao, X. (1997). The ototoxicity of trichloroethylene: Extrapolation and relevance of high-concentration, short-duration animal exposure data. Fundam. Appl. Toxicol. 38, 101–106.
Dourson, M. L., and Stara, J. F. (1983). Regulatory history and experimental support of uncertainty (safety) factors. Regul. Toxicol. Pharmacol. 3, 224–238.
Faustman, E. M., Ponce, R. A., Seeley, M. R., and Whittaker, S. G. (1996). Experimental approaches to evaluate mechanisms of developmental toxicity. In Handbook of Developmental Toxicology (Hood, R., Ed.), pp. 13–41. CRC Press, Boca Raton, FL.
Gaylor, D. W., and Slikker, W. (1990). Risk assessment for neurotoxic effects. Neurotoxicology 11, 211–218.
Haber, F. (1924). Zur Geschichte des Gaskrieges [On the history of gas warfare]. In Fuenf Vortraege aus den Jahren 1920–1923 [Five Lectures from the Years 1920–1923], pp. 76–92. Julius Springer, Berlin.
Hattis, D., Banati, P., and Goble, R. (1999). Distributions of individual susceptibility among humans for toxic effects: How much protection does the traditional tenfold factor provide for what fraction of which kinds of chemicals and effects? Ann. N.Y. Acad. Sci. 895, 286–316.
Health Canada (1994). Human Health Risk Assessment for Priority Substances. Environmental Health Directorate, Health Canada, Ottawa, Ontario.
Hertzberg, R. C. (1989). Extrapolation and scaling of animal data to humans: Fitting a model to categorical response data with application to species extrapolation of toxicity. Health Phys. 57(Suppl. 1), 405–409.
Hertzberg, R. C., and Miller, M. (1985). A statistical model for species extrapolation using categorical response data. Toxicol. Ind. Health 1, 43–57.
Hill, A. B. (1965). The environment and disease: Association or causation? Proc. R. Soc. Med. 58, 295–300.
IPCS (1994). Environmental Health Criteria No. 170: Assessing Human Health Risks of Chemicals: Derivation of Guidance Values for Health-Based Exposure Limits. International Programme on Chemical Safety, World Health Organization, Geneva.
IPCS (1999). Environmental Health Criteria No. 210: Principles for the Assessment of Risks to Human Health from Exposure to Chemicals. International Programme on Chemical Safety, World Health Organization, Geneva.
IPCS (1999). IPCS Workshop on Developing a Conceptual Framework for Cancer Risk Assessment, 16–18 February 1999, Lyon, France. IPCS/99.6.
IRLG, Working Group on Risk Assessment, Interagency Regulatory Liaison Group (1979). Scientific basis for identification of potential carcinogens and estimation of risks. J. Natl. Cancer Inst. 63, 241–268.
Jarabek, A. M. (1995). Inhalation RfC methodology: Dosimetric adjustment and dose-response estimation of noncancer toxicity in the upper respiratory tract. Inhal. Toxicol. 6(Suppl.), 301–325.
Kavlock, R. J., Allen, B. C., Faustman, E. M., and Kimmel, C. A. (1995). Dose-response assessments for developmental toxicity: IV. Benchmark doses for fetal weight changes. Fundam. Appl. Toxicol. 26, 211–222.
Lehman, A. J., and Fitzhugh, O. G. (1954). 100-Fold margin of safety. Assoc. Food Drug Off. U.S. Q. Bull. 18, 33–35.
Meek, M. E., Newhook, R., Liteplo, R. G., and Armstrong, V. C. (1994). Approach to assessment of risk to human health for priority substances under the Canadian Environmental Protection Act. Environ. Carcinog. Ecotoxicol. Rev. C12, 105–134.
Moore, J. A. (1995). An assessment of lithium using the IEHR Evaluative Process for Assessing Human Developmental and Reproductive Toxicity of Agents. IEHR Expert Scientific Committee. Reprod. Toxicol. 9, 175–210.
National Research Council (1994). Science and Judgment in Risk Assessment. National Academy Press, Washington, DC.
Renwick, A. G. (1991). Safety factors and establishment of acceptable daily intakes. Food Addit. Contam. 8, 135–149.
Renwick, A. G. (1993). Data-derived safety factors for the evaluation of food additives and environmental contaminants. Food Addit. Contam. 10, 275–305.
Renwick, A. G. (1999). Duration of intake above the ADI/TDI in relation to toxicodynamics and toxicokinetics. Regul. Toxicol. Pharmacol. 30, S69–S78.
Renwick, A. G., and Lazarus, N. R. (1998). Human variability in noncancer risk assessment: An analysis of the default uncertainty factor. Regul. Toxicol. Pharmacol. 27, 3–20.
Schlosser, P. M., and Bogdanffy, M. S. (1999). Determining modes of action for biologically based risk assessments. Regul. Toxicol. Pharmacol. 30, 75–79.
Sheehan, D., Young, J. F., Slikker, W., Gaylor, D., and Mattison, D. (1989). Workshop on risk assessment in reproductive and developmental toxicology: Addressing the assumptions and identifying the research needs. Regul. Toxicol. Pharmacol. 10, 110–122.
Simpson, D. G., Carroll, R. J., Zhou, H., and Guth, D. (1996). Weighted logistic regression and robust analysis of diverse toxicology data. Commun. Stat. Methods 25, 2615–2632.
U.S. EPA (1992). Draft report: A cross-species scaling factor for carcinogen risk assessment based on equivalence of mg/kg^(3/4)/day. Fed. Reg. 57, 24152–24173.
U.S. EPA (1996). Proposed Guidelines for Carcinogen Risk Assessment. Fed. Reg. 61, 17960–18011.
U.S. EPA (1998). Assessment of Thyroid Follicular Cell Tumors. U.S. Environmental Protection Agency, Washington, DC. EPA/630/R-97/002.
U.S. EPA (1999). Guidelines for Carcinogen Risk Assessment. Review draft, NCEA-F-0644, July 1999. Risk Assessment Forum, U.S. Environmental Protection Agency, Washington, DC.
Weller, E., Long, N., Smith, A., Williams, P., Ravi, S., Gill, J., Henessey, R., Skornik, W., Brain, J., Kimmel, C., Kimmel, G., Holmes, L., and Ryan, L. (1999). Dose-rate effects of ethylene oxide exposure on developmental toxicity. Toxicol. Sci. 50, 259–270.
Zhao, Q., Unrine, J., and Dourson, M. (1999). Replacing default values of 10 with data-derived values: A comparison of two different data-derived uncertainty factors for boron. Hum. Ecol. Risk Assess. 5, 973–983.