* Division of Biometry and Risk Assessment, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas 72079;
Division of Biometrics II, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Rockville, Maryland 20857; and
R.O.W. Sciences, Inc., National Center for Toxicological Research, Jefferson, Arkansas 72079
Received October 12, 1999; accepted January 13, 2000
ABSTRACT |
---|
Key Words: chronic; dose-response; Food and Drug Administration (FDA); Monte Carlo; MVK model; power; survival; trend.
INTRODUCTION |
---|
Two alternatives that have been discussed, in the interest of having sufficient numbers of surviving animals at the terminal sacrifice, are (1) shortening the length of the bioassay to 18 months (or less) (Davies and Monro, 1996; Rao, 1994) and (2) using some form of caloric restriction to increase the life span of the animals (Allaben et al., 1996; Contrera, 1994). Whereas many questions arise with respect to both of these proposed approaches, there appears to be one key question in each alternative that pertains to the statistical power to detect carcinogenic potential. For the first alternative, the question is, assuming no restriction on feeding, to what degree would early stopping reduce power? For the second alternative, the question is, how long would the bioassay with caloric restriction need to be conducted, and at what level of restriction, in order to achieve adequate power? Both of these questions are of concern to regulatory agencies as they strive to harmonize efforts to evaluate the carcinogenic potential of human drugs. The purpose of this paper is to address the first question regarding early stopping, specifically, the effect of early study termination on the ability to detect the carcinogenic potential of drugs that operate by various postulated carcinogenic mechanisms.
METHODS |
---|
Development of Representative Models for Generating Data
Because of its underlying biological foundation, and its ability to represent variations of the initiation, promotion, and completion (progression) events in the cancer process, the 2-stage clonal expansion model for carcinogenesis (Moolgavkar and Luebeck, 1990) is used to represent the time-to-onset distribution function for each dose level of a drug. This model's exact cumulative hazard function, assuming time-homogeneous genetic event rates, cell growth rates, and population size of normal cells, may be expressed for the continuous dosing case (Kopp-Schneider et al., 1994; Zheng, 1994) by:

$$
\Lambda(t) = \frac{\nu N}{\beta}\left\{ \frac{t}{2}\,(\beta - \delta - \mu - C) + \ln\!\left[ \frac{(C - \beta + \delta + \mu)\,e^{Ct} + (C + \beta - \delta - \mu)}{2C} \right] \right\}
$$

where C = [(β + δ + µ)² − 4βδ]^{1/2}, t represents time, N is the number of normal cells in the tissue, and the quantities ν, µ, β, and δ are rates corresponding to initiation (ν = first genetic event rate), progression or completion (µ = second genetic event rate), and promotion (β = cell birth rate, δ = cell death rate). Whereas N is often taken to be a large constant (e.g., 10⁷), here N is absorbed into the model's parameter, ν.
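As an illustration of how the cumulative hazard translates into a probability of tumor onset, the following minimal sketch evaluates one closed form of the exact two-stage cumulative hazard and the corresponding onset probability 1 − exp(−Λ(t)). The closed form is the one obtained by integrating the model's Riccati equation with constant rates, and N is taken as absorbed into `nu`; the function names and the numerical rates in the usage lines are hypothetical and are not the values fitted in this study.

```python
import math


def cumulative_hazard(t, nu, mu, beta, delta):
    """Exact two-stage cumulative hazard at time t, with N absorbed into nu.

    Assumed closed form (constant rates, continuous dosing):
      (nu/beta) * { (t/2)(a - C) + ln[ ((C - a) e^{Ct} + (C + a)) / (2C) ] }
    with a = beta - delta - mu and C = sqrt((beta + delta + mu)^2 - 4 beta delta).
    """
    a = beta - delta - mu
    c = math.sqrt((beta + delta + mu) ** 2 - 4.0 * beta * delta)
    log_term = math.log(((c - a) * math.exp(c * t) + (c + a)) / (2.0 * c))
    return (nu / beta) * (0.5 * t * (a - c) + log_term)


def tumor_onset_probability(t, nu, mu, beta, delta):
    """Probability that tumor onset has occurred by time t."""
    return 1.0 - math.exp(-cumulative_hazard(t, nu, mu, beta, delta))


# Hypothetical rates (per month), for illustration only.
rates = dict(nu=1.0, mu=1e-3, beta=0.5, delta=0.4)
p18 = tumor_onset_probability(18.0, **rates)
p24 = tumor_onset_probability(24.0, **rates)
```

For small t the hazard behaves like ν µ t² / 2, which offers a quick sanity check on any implementation of the formula.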
For the simulation study described below, it is assumed that the 2-stage model's parameters are linearly related to dose, i.e., ν = α1 + β1 d, µ = α2 + β2 d, β = α3 + β3 d, and δ = α4 + β4 d, where d is the dose level of drug. Various submodels can be obtained from the above fully parameterized model simply by setting various βi coefficients equal to zero. For example, setting β2 = 0 gives an initiator-promoter model, while setting β2, β3, and β4 all to zero gives a pure initiator model.
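The construction of submodels by zeroing dose coefficients can be sketched as follows. This is only an illustrative data structure, assuming the linear dose relations stated above; the class name, the helper `submodel`, and all numerical intercepts and slopes are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TwoStageDoseModel:
    """Linear dose dependence of the four rates, ordered as
    (initiation, completion, cell birth, cell death)."""
    intercepts: tuple  # rate values at dose 0
    slopes: tuple      # dose coefficients, indexed 1..4 in the text

    def rates(self, dose):
        """Return the four rates at the given dose."""
        return tuple(a + b * dose for a, b in zip(self.intercepts, self.slopes))


def submodel(full, active):
    """Keep only the dose coefficients whose 1-based index is in `active`."""
    slopes = tuple(b if i in active else 0.0
                   for i, b in enumerate(full.slopes, start=1))
    return replace(full, slopes=slopes)


# Hypothetical full model, for illustration only.
full = TwoStageDoseModel(intercepts=(1.0, 1e-3, 0.5, 0.4),
                         slopes=(0.2, 1e-4, 0.01, 0.005))
pure_initiator = submodel(full, {1})            # only initiation depends on dose
initiator_promoter = submodel(full, {1, 3, 4})  # completion rate held constant
```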
Tumorigenicity data submitted by new-drug sponsors to the Center for Drug Evaluation and Research (CDER) of the U.S. Food and Drug Administration (FDA), along with data from a National Toxicology Program (NTP) drug study (Gold et al., 1984), serve as a basis for parameterizing the 2-stage clonal expansion model for the simulation study. These data have been used to develop basic parameterizations of various submodels of the full model, which are then used to develop a set of models to represent the broad spectrum of bioassay data that are seen in practice. This ensures that the generated tumor data are representative of actual drug studies carried out in rats. The raw data (crude tumor proportions) are given in Table 1.
The estimated parameter values corresponding to the predicted values in Table 2 are given in Table 3. These estimated parameters provide a useful starting point for representing variations of the initiation, promotion, and completion (progression) events in the cancer process. The exact procedure to be followed is described below.
$$
S(t) = \exp\left\{ -\phi(d)\left( \gamma_1 t + \gamma_2 t^{\gamma_3} \right) \right\}
$$

where φ(d) ≥ 1 and γi ≥ 0 (i = 1,2,3). This formulation was suggested based on a study of a large historical NTP bioassay database (Portier et al., 1986). In particular, the distribution of time to death from competing risks is used to represent various levels of reduced survival that have been observed over time in Sprague-Dawley rats. The parameter φ(d) allows the additional modeling of differential treatment-related mortality.
For both the distribution of time to death from competing risks and the distribution of time to death after onset of tumor, the values of the γi are as follows: γ1 = 0.0001, γ2 = 10⁻¹⁶, and γ3 = 7.783381. For the distribution of time to death from competing risks, these values give a survival probability at 104 weeks (24 months) of 0.6, with φ(d) = 1. For other survival probabilities, 0.4, 0.3, 0.2, 0.1, the values of φ(d) are found by solving Q(d) = exp{−φ(d)[γ1(104) + γ2(104)^γ3]}, where Q(d) is the desired survival probability at 104 weeks for dose d. For the distribution of time to death from tumor after onset, the values of φ(d) have been chosen to represent either low lethality (99% survival at 26 weeks after onset: φ(d) = 4) or high lethality (1% survival at 26 weeks after onset: φ(d) = 1764).
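The quoted survival probabilities can be checked numerically. The sketch below assumes the survival function has the form exp{−m(c1 t + c2 t^c3)} with time in weeks, where m is the dose-dependent multiplier and c1 = 0.0001, c2 = 10⁻¹⁶, c3 = 7.783381; this form reproduces the stated values of 0.6 at 104 weeks and 99% or 1% at 26 weeks. The function and constant names are illustrative only.

```python
import math

# Constants quoted in the text; time is measured in weeks.
C1, C2, C3 = 0.0001, 1e-16, 7.783381


def baseline_cumulative_hazard(t_weeks):
    """Cumulative hazard for a multiplier of 1 (assumed functional form)."""
    return C1 * t_weeks + C2 * t_weeks ** C3


def survival(t_weeks, multiplier=1.0):
    """Survival probability at t_weeks for a given dose-dependent multiplier."""
    return math.exp(-multiplier * baseline_cumulative_hazard(t_weeks))


def multiplier_for_target(q, t_weeks=104.0):
    """Multiplier that yields survival probability q at t_weeks."""
    return -math.log(q) / baseline_cumulative_hazard(t_weeks)


s_104 = survival(104.0)            # competing-risk survival with multiplier 1
s_low = survival(26.0, 4.0)        # low-lethality setting
s_high = survival(26.0, 1764.0)    # high-lethality setting
m_30 = multiplier_for_target(0.3)  # multiplier for 30% survival at 104 weeks
```

Because the multiplier enters only as a scale on the cumulative hazard, solving for a target survival probability needs no iteration.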
Generation of Data via Monte Carlo Simulation
The bioassay design includes one untreated control group and 3 treated groups, each with 50 animals. The middle-dose and low-dose levels are 0.5 and 0.25 of the high-dose level, respectively. For the purposes of this investigation, it is unnecessary to include dual control groups. In fact, the international community is moving away from the practice of having duplicate control groups in pharmaceutical studies (Fairweather et al., 1998; DIA, 1999).
It is assumed that 3 independent random variables completely determine an animal's fate. These are the time to onset of tumor, T1, the time after onset until death from the tumor, T2, and the time to death from a competing risk, T3. Note that the sum, T1 + T2, represents the overall time to death from the tumor of interest. The random variables T1, T2, and T3 are generated using the distributions outlined above. Thus, with Ts denoting the time of terminal sacrifice, a simulated animal dies without the tumor of interest if T3 < min(T1, Ts); it dies with a tumor but from a competing risk if T1 < T3 < min(Ts, T1 + T2); it dies from the tumor of interest if T1 + T2 < min(T3, Ts); it is sacrificed without the tumor of interest if Ts < min(T1, T3); or it is sacrificed with the tumor of interest if T1 < Ts < min(T3, T1 + T2). Note that the simulated data are representative of occult tumors, that is, tumors that can be detected only upon the death or sacrifice of an animal. Although it is possible to detect certain types of tumors in live animals (e.g., skin, mammary), occult tumors represent the type most commonly observed in rodent bioassays, and hence, they are the focus of this study.
Since there is complete knowledge of the 3 random variables generated for each animal, it is possible to determine for each animal the precise effect that altering the value of Ts, the time of terminal sacrifice, would have with regard to the type of information observed. Thus each simulated experimental data set can be evaluated as if the terminal sacrifice were conducted at either 18, 21, or 24 months. The models for time to onset of tumor, time to death from tumor, and time to death from competing risks, discussed above, are parameterized to achieve a range of effect levels. By using Monte Carlo techniques to simulate 1000 data sets for each parameterization of these models, appropriate data are produced for the power comparison carried out in the next section below.
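The five possible fates, and the re-evaluation of one simulated animal under different sacrifice times, can be sketched as below. The function name and labels are illustrative, and ties between times are ignored because the variables are continuous; the example animal is hypothetical.

```python
def classify_fate(t1, t2, t3, ts):
    """Return (observed time, fate) for one animal.

    t1: time to tumor onset; t2: time from onset to tumor death;
    t3: time to death from a competing risk; ts: terminal sacrifice time.
    All times are in the same units (weeks here).
    """
    if t3 < min(t1, ts):
        return t3, "died without tumor"
    if t1 < t3 < min(ts, t1 + t2):
        return t3, "died with tumor, from competing risk"
    if t1 + t2 < min(t3, ts):
        return t1 + t2, "died from tumor"
    if ts < min(t1, t3):
        return ts, "sacrificed without tumor"
    # Remaining case: t1 < ts < min(t3, t1 + t2).
    return ts, "sacrificed with tumor"


# One hypothetical animal evaluated at 18, 21, and 24 months (78, 91, 104 weeks).
animal = dict(t1=80.0, t2=30.0, t3=95.0)
fates = {ts: classify_fate(ts=ts, **animal) for ts in (78.0, 91.0, 104.0)}
```

For this animal the observed outcome changes with the stopping time: no tumor is found at a 78-week sacrifice, the tumor is found at a 91-week sacrifice, and at 104 weeks the animal has already died with the tumor from a competing risk at week 95.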
For time to onset of tumor, each submodel of the 2-stage clonal expansion model was configured to produce the following tumor probabilities at 24 months: 0.01 in controls with either 0.15 or 0.30 at the high dose, 0.05 in controls with either 0.30 or 0.50 at the high dose, and 0.15 in controls with either 0.50 or 0.75 at the high dose. The submodels represented in Table 3 and Figure 1 were first standardized (normalized) so that each model's predicted probabilities would cover the entire interval from 0 to 1. That is, if P(0,t) represents the predicted probability of tumor at dose 0 and time t, and P(dH,t) the predicted probability of tumor at the high dose, dH, at time t, for a given model in Figure 1, then the model will be normalized to
![]() |
where P(d,t) is the probability of tumor at dose d (0 ≤ d ≤ dH) and time t (0 ≤ t ≤ 24). The first term on the right-hand side of P'(d,t) gives the normalized dose cross-sectional profile and the second term on the right-hand side of P'(d,t) gives the normalized time profile. Normalized versions of all the submodels in Table 3 and Figure 1 are represented in Figure 2 for t = 24 months, where the dose has also been normalized to the unit interval by d' = d/dH. As can be seen from Figure 2, the normalized submodels represent all types of dose-response curvature. Let P'(d1,24) and P'(d4,24) correspond, respectively, to the control and high-dose probabilities that are desired to be simulated for a given submodel at 24 months. The corresponding values of d1 and d4 were obtained by iteration for the particular normalized submodel. The values of d2 and d3, the low and middle doses, respectively, were chosen as d2 = d1 + (d4 − d1)/4 and d3 = d1 + (d4 − d1)/2. Figure 3 illustrates this process of dose selection for the pure promoter model with P'(d1,24) = 0.05 and P'(d4,24) = 0.30. Thus, the probabilities that were used to generate times to onset of tumor for the 4 dose groups were P'(d1,t), P'(d2,t), P'(d3,t) and P'(d4,t), and the corresponding dose values used for the trend test were 0, 1, 2, and 4, respectively.
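The dose-selection step can be sketched as a root search on the normalized 24-month curve. The example below is an assumption-laden illustration: the curve `d ** 2` is a hypothetical convex stand-in rather than any of the fitted submodels, bisection stands in for the unspecified iteration, and the low-dose spacing assumes that the low dose sits one quarter of the way from the control dose to the high dose, in keeping with the stated 0.25 ratio.

```python
def find_dose(curve, target, lo=0.0, hi=1.0, tol=1e-10):
    """Bisection for the normalized dose at which an increasing curve equals target."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if curve(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def select_doses(curve, p_control, p_high):
    """Return (d1, d2, d3, d4) on the normalized dose scale."""
    d1 = find_dose(curve, p_control)
    d4 = find_dose(curve, p_high)
    d2 = d1 + (d4 - d1) / 4.0  # low dose (assumed spacing)
    d3 = d1 + (d4 - d1) / 2.0  # middle dose
    return d1, d2, d3, d4


def hypothetical_curve(d):
    """Hypothetical normalized 24-month tumor probability, for illustration only."""
    return d ** 2


d1, d2, d3, d4 = select_doses(hypothetical_curve, p_control=0.05, p_high=0.30)
group_probabilities = [hypothetical_curve(d) for d in (d1, d2, d3, d4)]
```

The resulting tumor probabilities for the low and middle groups then depend on the curvature of the submodel between d1 and d4, which is how the different dose-response shapes enter the simulation.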
For the survival distribution for time to death after onset, only 2 parameterizations were used: low lethality (99% survival at 26 weeks after onset) and high lethality (1% survival at 26 weeks after onset). There was no differential tumor lethality among groups.
The simulated experimental data were generated using SAS (SAS Language: Reference, 1990) on an Alpha workstation. The SAS random number generator CALL RANUNI (seed, random #) was used to generate pseudo-random numbers uniformly distributed on the interval (0,1) to represent the event probabilities for T1, T2 and T3. Because the CALL form of RANUNI allowed independent seeds to be used for each call, it ensured the independence of T1, T2, and T3 in that they were generated from different random number streams. The seeds were chosen systematically based on an iteration number. Each set of three initial seeds generated a stream used for 5 replicates, where a replicate consisted of all combinations of parameterizations: 6 shapes of T1 models, 6 parameterizations of T1 models, 2 parameterizations of T2 models, and 6 parameterizations of T3 models. After 5 replicates, a new seed was systematically chosen for each of the 3 streams and 5 more replicates were produced. In total, 1000 replicates were generated using 200 different random number streams.
Because of their mathematical complexity, none of the models for T1, T2, or T3 allowed solving for the variable, t, in closed form. Thus, the value of t corresponding to a particular pseudo-random uniform variate on the (0,1) probability scale was found using a standard halving algorithm in each case. While the models for T2 and T3 were parameterized for time expressed in weeks, the model for T1 was parameterized for time expressed in months. The generated value of T1 was converted to weeks for subsequent use on the basis of 4 1/3 weeks/month. Stopping times of 18, 21, and 24 months were similarly converted to weeks.
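The halving step amounts to inverse-transform sampling by bisection. The sketch below applies it to the competing-risk distribution with a multiplier of 1, assuming the survival form exp{−(c1 t + c2 t^c3)} with the constants quoted earlier. Python's `random` module stands in for the SAS generator, and the bracket-doubling step, tolerance, and seed are illustrative choices rather than details of the original program.

```python
import math
import random

C1, C2, C3 = 0.0001, 1e-16, 7.783381  # constants quoted in the text (weeks)
WEEKS_PER_MONTH = 13.0 / 3.0          # 4 1/3 weeks per month


def death_cdf(t_weeks, multiplier=1.0):
    """Probability of death by t_weeks under the assumed survival form."""
    return 1.0 - math.exp(-multiplier * (C1 * t_weeks + C2 * t_weeks ** C3))


def invert_by_halving(cdf, u, upper=1.0, tol=1e-8):
    """Find t such that cdf(t) = u for an increasing cdf, by interval halving."""
    while cdf(upper) < u:  # widen the bracket until it contains the root
        upper *= 2.0
    lower = 0.0
    while upper - lower > tol:
        mid = 0.5 * (lower + upper)
        if cdf(mid) < u:
            lower = mid
        else:
            upper = mid
    return 0.5 * (lower + upper)


rng = random.Random(12345)  # arbitrary seed, for reproducibility of the sketch
t3_weeks = invert_by_halving(death_cdf, rng.random())
stopping_weeks = [m * WEEKS_PER_MONTH for m in (18, 21, 24)]
```

With this conversion the three stopping times are 78, 91, and 104 weeks.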
RESULTS |
---|
For each parameterization of the models described above, the analysis was carried out as if there were a terminal sacrifice at 18, 21, or 24 months. For each terminal sacrifice time, the simulated power of the test for positive dose-related trend was calculated as the proportion of the 1000 simulated data sets for which the hypothesis of no differences in tumor rates was rejected in a right-tailed test. Nominal significance levels of 1% and 5% were evaluated. Tables 4 to 9 contain results of the simulated power calculations for the 3 stopping times, along with the relative efficiencies (relative powers) at 18 and 21 months compared to 24 months. Each table contains complete results (72 combinations) for one of the 6 types of time-to-onset models. Figure 4 summarizes the results of the tables in graphical form. For each submodel, Figure 4 gives the average power at 18, 21, and 24 months for a 1% significance level, where the average has been taken over all competing risk survival rates and both levels of tumor lethality. In addition to line plots connecting the average powers, Figure 4 presents box and whisker plots to show the variability in the simulated results.
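The two summary quantities can be written compactly. The sketch below assumes that each simulated data set yields a right-tailed p-value from the trend test and that rejection means the p-value falls below the nominal level; the function names and the p-values in the usage lines are hypothetical and are not results from this study.

```python
def simulated_power(p_values, alpha):
    """Proportion of simulated data sets whose right-tailed p-value is below alpha."""
    return sum(p < alpha for p in p_values) / len(p_values)


def relative_efficiency(power_early, power_24):
    """Power at an early stopping time as a fraction of the 24-month power."""
    return power_early / power_24 if power_24 > 0 else float("nan")


# Hypothetical p-values for the same four data sets analyzed at two stopping times.
p_at_24 = [0.001, 0.020, 0.004, 0.300]
p_at_18 = [0.008, 0.090, 0.030, 0.450]

power_24 = simulated_power(p_at_24, alpha=0.05)
power_18 = simulated_power(p_at_18, alpha=0.05)
efficiency_18 = relative_efficiency(power_18, power_24)
```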
Quite a number of combinations in Tables 4–9 (see also Fig. 4), for each of the 6 types of time-to-onset models, exhibit both absolute and relative powers that are substantially reduced for an 18-month stopping time relative to the conventional 24-month stopping time. For the 3 simplest types of models (initiator, completer, and promoter), the power of the promoter model is most adversely affected by early stopping, with the completer model the next most affected. The loss of power is much attenuated for the more complex models (initiator-completer, initiator-promoter, promoter-completer), but there are many combinations for which the relative power is low (e.g., below 80%), especially for a 1% significance level.
For the 21-month stopping time, the relative efficiency remains high for many of the combinations in Tables 4–9 (see also Fig. 4). It is greater than 80% for all configurations for the pure initiator, initiator-completer, initiator-promoter, and promoter-completer models. Except for the case of low-background tumor rates with a weak dose-related trend and a 1% significance level, the relative efficiency is well above 80% for the pure completer model, as well. Even for the exceptional case, the relative efficiency is never below 67%. However, the pure promoter model is a clear exception. For a 1% right-tailed test, the highest relative efficiency observed is 65%, and for a 5% test more than half of the observed relative efficiencies are below 70%.
DISCUSSION |
---|
The case for or against stopping at 21 months is not so clear. For most combinations of tumor-onset, tumor-lethality, and competing-risk-survival models, the efficiency at 21 months relative to 24 months is high enough (e.g., 80% or higher) to warrant stopping at the earlier time. The most notable exception is in the pure promoter model. For a 5% right-tailed test, the relative efficiency for more than half of the combinations of models is below 70%. For a 1% test, the relative efficiency for all combinations is below 70%, being appreciably lower in many cases. Based on the results of this simulation study, it must be concluded that stopping at 21 months or earlier would be ill advised for chemicals that are pure promoters. Because the likely mechanism of action of a potential chemical carcinogen is rarely presumed with great certainty in advance, the results of this study can be used to conclude that the conventional rodent bioassay should not be shortened from 24 to 21 months. This conclusion differs from that of a previous simulation study by Ahn et al. (1998), which indicated that a study of 91 weeks (21 months) duration would maintain adequate power. The difference in conclusions appears to be due to the broader class of time-to-onset models considered in the present study.
The vast majority of rat carcinogenicity studies of pharmaceuticals that are submitted to CDER for review have durations of 24 months and have reasonable survival. The duration of mouse studies ranges from 18 to 24 months, with many lasting only 18 or 21 months. One reason for shorter-term bioassays in mice appears to be a 1985 guideline stating that carcinogenicity studies should be conducted for at least 18 months in mice and 24 months in rats (OSTP, 1985). However, the guideline goes on to say that a longer duration may be appropriate if cumulative mortality at the planned terminal sacrifice is low. CDER requests that pharmaceutical sponsors conduct mouse studies for 24 months, unless there is excessive mortality. The results of this simulation study appear to support CDER's request for studies of 24-months duration.
There is an important point that needs to be clarified regarding the conduct of the present study. The motivation for the study is the observed reduction in survival of certain rodent test species and strains. If this reduced survival is the result of acceleration of the total aging process, then the measured endpoints of survival and carcinogenesis might not be independent. That is, the cancer process might accelerate at the same rate as the overall aging process. If this were so, then it would not be correct to assume the mutual independence of T1, T2, and T3, the random variables representing time to onset of a tumor, time after onset until death from a tumor, and time to death from a competing risk, as was done in generating the data for the present study. In caloric restriction studies carried out at the National Center for Toxicological Research, ad libitum-fed rats and mice of various strains actually developed greater numbers of spontaneous tumors than calorically restricted animals, despite the significantly shorter lifespan of the ad libitum-fed animals (e.g., Thurman et al., 1994). If the same phenomenon were to occur with chemically induced tumors, then decreased survival might not reduce the power to detect carcinogens at all. However, it is generally agreed that bioassays ought to have good survival and relatively low-background tumor rates in order to minimize potential confounding of the various pathological changes resulting from exposure to chemical agents. This line of thought has led to proposals to use dietary control as a means of increasing survival and reducing variability among treatment groups in toxicity studies (e.g., Turturro et al., 1996). As was stated in the Introduction, the focus of the present study is early stopping rather than the equally important issue of dietary control. 
However, the 24-month power results of the present study can be used to infer that the use of dietary control to reduce differential survival among treatment groups would enhance the statistical power to detect chemical carcinogens.
NOTES |
---|
1 To whom correspondence should be addressed. Fax: (870) 543-7662. E-mail: rkodell@nctr.fda.gov.
REFERENCES |
---|
Ali, M. W. (1989). Exact versus asymptotic tests of trend in tumorigenicity experiments: a comparison of p-values for a small number of events. Proceedings of the Biopharmaceutical Section, American Statistical Association, pp. 148–153.
Allaben, W., Turturro, A., Leakey, J., Seng, J., and Hart, R. (1996). FDA points-to-consider document: The need for dietary control for the reduction of experimental variability within animal assays and the use of dietary restriction to achieve dietary control. Toxicol. Pathol. 24, 279–284.
Bailer, A. J., and Portier, C. J. (1988). Effects of treatment-induced mortality and tumor-induced mortality on tests for carcinogenicity in small samples. Biometrics 44, 417–431.
CDER (Center for Drug Evaluation and Research) (1999). Guidance for Industry: Statistical Aspects of the Design, Analysis, and Interpretation of Chronic Rodent Carcinogenicity Studies of Pharmaceuticals. U.S. Food and Drug Administration. Document in preparation.
Contrera, J. F. (1994). Control of dietary intake: implications for drug evaluation. ILSI Conference on Dietary Restriction: Implications for the Design and Interpretation of Toxicity and Carcinogenicity Studies; February 28–March 2, 1994; Washington, DC.
Davies, T. S., and Monro, A. M. (1996). The duration of the rodent carcinogenicity bioassay necessary to detect agents carcinogenic to humans. Fundam. Appl. Toxicol. 30, S201 (abstract).
DIA (Drug Information Association) (1999). DIA Workshop on Carcinogenicity Testing of Pharmaceuticals in the EU. September 12, 1999; Zurich, Switzerland.
Fairweather, W. R., Bhattacharyya, A., Ceuppens, P. P., Heimann, G., Hothorn, L. A., Kodell, R. L., Lin, K. K., Mager, H., Middleton, B. J., Slob, W., Soper, K. A., Stallard, N., Ventre, J., and Wright, J. (1998). Biostatistical methodology in carcinogenicity studies. Drug Information Journal 32, 401–421.
Gold, L. S., Sawyer, C. B., Magaw, R., Backman, G. M., de Veciana, M., Levinson, R., Hooper, N. K., Havender, W. R., Bernstein, L., Peto, R., Pike, M. C., and Ames, B. N. (1984). A carcinogenic potency database of the standardized results of animal bioassays. Environ. Health Perspect. 58, 244–245.
Hart, R. W. (1994). Dietary restriction: an update. Keynote address, ILSI Conference on Dietary Restriction: Implications for the Design and Interpretation of Toxicity and Carcinogenicity Studies; February 28–March 2, 1994; Washington, DC.
Keenan, K. P. (1994). The effects of overfeeding and moderate dietary restriction on Sprague-Dawley (SD) rat survival, carcinogenesis, and chronic disease. ILSI Conference on Dietary Restriction: Implications for the Design and Interpretation of Toxicity and Carcinogenicity Studies; February 28–March 2, 1994; Washington, DC.
Kodell, R. L., Chen, J. J., and Moore, G. E. (1994). Comparing distributions of time to onset of disease in animal tumorigenicity experiments. Commun. Stat. Theory Methods 23, 959–980.
Kodell, R. L., Haskin, M. G., Shaw, G. W., and Gaylor, D. W. (1983). CHRONIC: a SAS procedure for statistical analysis of carcinogenesis studies. J. Stat. Comput. Simulation 16, 287–310.
Kodell, R. L., Shaw, G. W., and Johnson, A. M. (1982). Nonparametric joint estimators for disease resistance and survival functions in survival/sacrifice experiments. Biometrics 38, 43–58.
Kopp-Schneider, A., Portier, C. J., and Sherman, C. D. (1994). The exact formula for tumor incidence in the two-stage model. Risk Anal. 14, 1079–1080.
McKnight, B., and Wahrendorf, J. (1992). Tumour incidence rate alternatives and the cause-of-death test for carcinogenicity. Biometrika 79, 131–138.
Moolgavkar, S. H., and Luebeck, G. (1990). Two-event model for carcinogenesis: biological, mathematical, and statistical considerations. Risk Anal. 10, 323–341.
OSTP (Office of Science and Technology Policy) (1985). Chemical Carcinogens: A Review of the Science and Its Associated Principles. U.S. Interagency Staff Group on Chemical Carcinogenesis. Federal Register 50, 10371–10442.
Peto, R., Pike, M. C., Day, N. E., Gray, R. G., Lee, P. N., Parish, S., Peto, J., Richards, S., and Wahrendorf, J. (1980). Guidelines for simple, sensitive significance tests for carcinogenic effects in long-term animal experiments. In IARC Monographs on the Evaluation of the Carcinogenic Risk of Chemicals to Humans, Supplement 2: Long-Term and Short-Term Screening Assays for Carcinogens, a Critical Appraisal. IARC, Lyon.
Portier, C., Hedges, J., and Hoel, D. (1986). Age-specific models of mortality and tumor onset for historical control animals in the National Toxicology Program's carcinogenicity experiments. Cancer Res. 46, 4372–4378.
Rao, G. N. (1994). Husbandry procedures other than diet restriction for lowering body weight and tumor/disease rates in Fischer 344 rats. ILSI Conference on Dietary Restriction: Implications for the Design and Interpretation of Toxicity and Carcinogenicity Studies; February 28–March 2, 1994; Washington, DC.
SAS Language: Reference (1990). Version 6, 1st ed. SAS Institute, Cary, NC.
SAS/STAT User's Guide (1990). Version 6, 4th ed. SAS Institute, Cary, NC.
Thurman, J. D., Bucci, T., Hart, R., and Turturro, A. (1994). Survival, body weight, and spontaneous neoplasms in ad libitum-fed and dietary-restricted Fischer 344 rats. Toxicol. Pathol. 22, 1–9.
Turturro, A., Duffy, P., Hart, R., and Allaben, W. (1996). Rationale for the use of dietary control in toxicity studies: B6C3F1 mouse. Toxicol. Pathol. 24, 769–775.
Zheng, Q. (1994). On the exact hazard and survival functions of the MVK stochastic carcinogenesis model. Risk Anal. 14, 1081–1084.