Bioassays of Shortened Duration for Drugs: Statistical Implications

Ralph L. Kodell*,1, Karl K. Lin†, Brett T. Thorn‡ and James J. Chen*

* Division of Biometry and Risk Assessment, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas 72079; † Division of Biometrics II, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Rockville, Maryland 20857; and ‡ R.O.W. Sciences, Inc., National Center for Toxicological Research, Jefferson, Arkansas 72079

Received October 12, 1999; accepted January 13, 2000


    ABSTRACT
 
Declining survival rates in rodent carcinogenesis bioassays have raised a concern that continuing the practice of terminating such studies at 24 months could result in too few live animals at termination for adequate pathological evaluation. One option for ensuring sufficient numbers of animals at the terminal sacrifice is to shorten the duration of the bioassay, but this approach is accompanied by a reduction in statistical power for detecting carcinogenic potential. The present study was conducted to evaluate the loss of power associated with early termination. Data from drug studies in rats were used to formulate biologically based dose-response models of carcinogenesis using the 2-stage clonal expansion model as a context. These dose-response models, which were chosen to represent 6 variations of the initiation-promotion-completion cancer model, were employed to generate a large number of representative bioassay data sets using Monte Carlo simulation techniques. For a variety of tumor dose-response trends, tumor lethality, and competing risk-survival rates, the power of age-adjusted statistical tests to assess the significance of carcinogenic potential was evaluated at 18 and 21 months, and compared to the power at the normal 24-month stopping time. The results showed that stopping at 18 months would reduce power to an unacceptable level for all 6 submodels of the 2-stage clonal expansion model, with the pure-promoter and pure-completer models being most adversely affected. For the 21-month stopping time, the results showed that, unless pure promotion can be ruled out a priori as a potential carcinogenic mode of action, the loss of power is too great to warrant early stopping.

Key Words: chronic; dose-response; Food and Drug Administration (FDA); Monte Carlo; MVK model; power; survival; trend.


    INTRODUCTION
 
For a number of years, it has been noted that the survival of various strains of rats commonly used to evaluate the carcinogenic potential of human drugs has been declining. In addition, a strong negative association between survival and the amount of food consumed has been observed (Keenan, 1994). The reduced survival of the Sprague-Dawley strain has raised concerns about the advisability of continuing to conduct the cancer bioassay in this strain for the usual time period of 24 months and under the usual ad libitum feeding conditions (Hart, 1994).

Two alternatives that have been discussed, in the interest of having sufficient numbers of surviving animals at the terminal sacrifice, are (1) shortening the length of the bioassay to 18 months (or less) (Davies and Monro, 1996; Rao, 1994) and (2) using some form of caloric restriction to increase the life span of the animals (Allaben et al., 1996; Contrera, 1994). Although many questions arise with respect to both approaches, each raises one key question about the statistical power to detect carcinogenic potential. For the first alternative, the question is, assuming no restriction on feeding, to what degree would early stopping reduce power? For the second alternative, the question is, how long would the bioassay with caloric restriction need to be conducted, and at what level of restriction, in order to achieve adequate power? Both of these questions are of concern to regulatory agencies as they strive to harmonize efforts to evaluate the carcinogenic potential of human drugs. The purpose of this paper is to address the first question regarding early stopping, specifically, the effect of early study termination on the ability to detect the carcinogenic potential of drugs that operate by various postulated carcinogenic mechanisms.


    METHODS
 
The approach involves 3 steps. In the first step, existing bioassay data on drugs are used to develop mathematical models for generating representative tumor data for various types of carcinogenic pharmaceutical agents. This step also involves identifying suitable models for representing a range of survival patterns. In the second step, a large number of bioassay data sets are simulated based on the models selected in step one. A given data set is presented in 3 ways, as if a terminal sacrifice were conducted at 18 months, 21 months, or 24 months. In the third step, the data generated in step 2 are evaluated using age-adjusted tests for carcinogenicity. Comparisons of statistical power are made between the 24-month study length and the shorter studies.

Development of Representative Models for Generating Data
Because of its underlying biological foundation, and its ability to represent variations of the initiation, promotion, and completion (progression) events in the cancer process, the 2-stage clonal expansion model for carcinogenesis (Moolgavkar and Luebeck, 1990) is used to represent the time-to-onset distribution function for each dose level of a drug. This model's exact cumulative hazard function, assuming time-homogeneous genetic event rates, cell growth rates, and population size of normal cells, may be expressed for the continuous dosing case (Kopp-Schneider et al., 1994; Zheng, 1994) by:



where $C = [(\beta + \delta + \mu)^2 - 4\beta\delta]^{1/2}$, $t$ represents time, $N$ is the number of normal cells in the tissue, and the quantities $\nu$, $\mu$, $\beta$, and $\delta$ are rates corresponding to initiation ($\nu$ = first genetic event rate), progression or completion ($\mu$ = second genetic event rate), and promotion ($\beta$ = cell birth rate, $\delta$ = cell death rate). Whereas $N$ is often taken to be a large constant (e.g., $10^7$), here $N$ is absorbed into the model's parameter $\nu$.

For the simulation study described below, it is assumed that the 2-stage model's parameters are linearly related to dose, i.e., $\nu = \alpha_1 + \beta_1 d$, $\mu = \alpha_2 + \beta_2 d$, $\beta = \alpha_3 + \beta_3 d$, and $\delta = \alpha_4 + \beta_4 d$, where $d$ is the dose level of drug. Various submodels can be obtained from the above fully parameterized model simply by setting various $\beta_i$ coefficients equal to zero. For example, setting $\beta_2 = 0$ gives an initiator-promoter model, while setting $\beta_2$, $\beta_3$, and $\beta_4$ all to zero gives a pure initiator model.
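As a minimal sketch of how these dose-dependent rates translate into tumor-onset probabilities, the Python below assumes the standard closed form of the exact cumulative hazard for constant rates, H(t) = (ν/β){t(C + β − δ − µ)/2 + ln[((C − β + δ + µ) + (C + β − δ − µ)exp(−Ct))/(2C)]}, with C as defined above and N absorbed into ν. That expression and the function names are assumptions made for illustration, and the coefficients in `PURE_INITIATOR` are hypothetical, not the fitted values of Table 3.

```python
import math

def cumulative_hazard(t, nu, mu, beta, delta):
    """Assumed closed form of the 2-stage cumulative hazard at time t (N absorbed into nu)."""
    c = math.sqrt((beta + delta + mu) ** 2 - 4.0 * beta * delta)
    a = c - beta + delta + mu
    b = c + beta - delta - mu
    log_term = math.log((a + b * math.exp(-c * t)) / (2.0 * c))
    return (nu / beta) * (0.5 * t * b + log_term)

def onset_probability(t, dose, coef):
    """P(tumor onset by time t) when each rate is linear in dose.

    coef lists (intercept, slope) pairs in the order nu, mu, beta, delta.
    """
    nu, mu, beta, delta = (a_i + b_i * dose for a_i, b_i in coef)
    return 1.0 - math.exp(-cumulative_hazard(t, nu, mu, beta, delta))

# Hypothetical pure-initiator submodel: only nu has a nonzero dose slope.
PURE_INITIATOR = [(2.0, 10.0), (1e-4, 0.0), (0.30, 0.0), (0.25, 0.0)]

if __name__ == "__main__":
    for d in (0.0, 0.25, 0.5, 1.0):
        print(d, round(onset_probability(24.0, d, PURE_INITIATOR), 4))
```

Zeroing a different set of slope coefficients in the same structure gives the other submodels, for example a pure promoter model when only the birth-rate slope is nonzero.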

Tumorigenicity data submitted by new-drug sponsors to the Center for Drug Evaluation and Research (CDER) of the U.S. Food and Drug Administration (FDA), along with data from a National Toxicology Program (NTP) drug study (Gold et al., 1984), serve as a basis for parameterizing the 2-stage clonal expansion model for the simulation study. These data have been used to develop basic parameterizations of various submodels of the full model, which are then used to develop a set of models to represent the broad spectrum of bioassay data that are seen in practice. This ensures that the generated tumor data are representative of actual drug studies carried out in rats. The raw data (crude tumor proportions) are given in Table 1.


TABLE 1 Actual Drug Data on Which Simulation Study Is Based
 
To make the tumor data compatible with the 2-stage clonal expansion model, the raw data in Table 1 have been adjusted according to the method of Kodell et al. (1982), in order to obtain empirical estimates of the uncensored probability of tumor onset by 24 months for each dose group. That is, the time-to-tumor-onset data have been adjusted for effects of mortality unrelated to the tumors themselves. Each animal's death time was used in a procedure to estimate nonparametrically the distribution of time to onset, utilizing cause-of-death information where applicable. The 24-month empirical tumor onset probabilities are given in Table 2 for each data set, along with predicted values from the 2-stage model that were obtained using SAS NLIN (SAS/STAT User's Guide, 1990). For each data set, the particular submodel of the 2-stage clonal expansion model that was used is indicated. For submodels involving promotion, only the parameter governing the birth process of initiated cells, $\beta$, was assumed to be affected, while the death-process parameter, $\delta$, was assumed to be unaffected. Plots of the fitted models to the adjusted empirical tumor probabilities are given in Figure 1.


TABLE 2 Empirical Adjusted Tumor Probabilities and Predicted Values from the Indicated Model
 


FIG. 1. Plots of various submodels of the 2-stage clonal expansion model fitted to survival-adjusted 24-month tumor rates observed in rodent bioassays of drugs.

 
As indicated in Table 2 and Figure 1, appropriate representations have been obtained for the pure initiator model (CDER-1, F), the pure promoter model (NTP#194, F), the pure completer model (CDER-1, F), the initiator-promoter model (CDER-2, M), the initiator-completer model (CDER-1, M), and the promoter-completer model (CDER-3, M). Thus, all relevant models of interest are captured with these data. It should be noted that Table 2 and Figure 1 merely suggest submodels of the 2-stage clonal expansion model that are consistent with the data. While these are useful for simulating representative data, no strong inference can be made as to mechanisms of action for a given drug. Also, as indicated in Figure 1, some representations are better than others. It is interesting to note that the data are consistent with the notion that a given drug could operate by different mechanisms in separate organs of a given sex (CDER-1, F).

The estimated parameter values corresponding to the predicted values in Table 2 are given in Table 3. These estimated parameters provide a useful starting point for representing variations of the initiation, promotion, and completion (progression) events in the cancer process. The exact procedure to be followed is described below.


TABLE 3 Parameter Values Corresponding to Predicted Tumor Probabilities in Table 2
 
Following Portier et al. (1986), the cumulative hazard functions for time to death from competing risks and time to death after onset of tumor are of the form

$$\Lambda(t; d) = \theta(d)\left(\gamma_1 t + \gamma_2 t^{\gamma_3}\right),$$

where $\theta(d) \ge 1$ and $\gamma_i \ge 0$ ($i = 1, 2, 3$). This formulation was suggested based on a study of a large historical NTP bioassay database (Portier et al., 1986). In particular, the distribution of time to death from competing risks is used to represent various levels of reduced survival that have been observed over time in Sprague-Dawley rats. The parameter $\theta(d)$ allows the additional modeling of differential treatment-related mortality.

For both the distribution of time to death from competing risks and the distribution of time to death after onset of tumor, the values of the $\gamma_i$ are as follows: $\gamma_1 = 0.0001$, $\gamma_2 = 10^{-16}$, and $\gamma_3 = 7.783381$. For the distribution of time to death from competing risks, these values give a survival probability at 104 weeks (24 months) of 0.6, with $\theta(d) = 1$. For other survival probabilities, 0.4, 0.3, 0.2, 0.1, the values of $\theta(d)$ are found by solving $\exp\{-\theta(d)[\gamma_1(104) + \gamma_2(104)^{\gamma_3}]\} = Q(d)$, where $Q(d)$ is the desired survival probability at 104 weeks for dose $d$. For the distribution of time to death from tumor after onset, the values of $\theta(d)$ have been chosen to represent either low lethality (99% survival at 26 weeks after onset: $\theta(d) = 4$) or high lethality (1% survival at 26 weeks after onset: $\theta(d) = 1764$).
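The quoted survival figures can be checked with a short Python sketch. It assumes the cumulative hazard has the form θ(d)(γ1 t + γ2 t^γ3) with t in weeks; the function names are illustrative only.

```python
import math

# Gamma values quoted in the text (time in weeks).
G1, G2, G3 = 0.0001, 1e-16, 7.783381

def cum_hazard(t_weeks, theta=1.0):
    """Assumed form of the cumulative hazard: theta * (g1*t + g2*t**g3)."""
    return theta * (G1 * t_weeks + G2 * t_weeks ** G3)

def survival(t_weeks, theta=1.0):
    """Probability of surviving past t_weeks."""
    return math.exp(-cum_hazard(t_weeks, theta))

def theta_for_target(q, t_weeks=104.0):
    """theta(d) that gives survival probability q at t_weeks."""
    return -math.log(q) / cum_hazard(t_weeks, 1.0)

if __name__ == "__main__":
    print(round(survival(104.0), 3))           # competing risks, theta = 1
    print(round(survival(26.0, 4.0), 3))       # low tumor lethality
    print(round(survival(26.0, 1764.0), 3))    # high tumor lethality
    for q in (0.4, 0.3, 0.2, 0.1):
        print(q, round(theta_for_target(q), 3))
```

Under this assumed form the three printed survival values come out at approximately 0.60, 0.99, and 0.01, in agreement with the text, and θ(d) has the closed-form solution ln Q(d) divided by the log of the baseline 104-week survival.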

Generation of Data via Monte Carlo Simulation
The bioassay design includes one untreated control group and 3 treated groups, each with 50 animals. The middle-dose and low-dose levels are 0.5 and 0.25 of the high-dose level, respectively. For the purposes of this investigation, it is unnecessary to include dual control groups. In fact, the international community is moving away from the practice of having duplicate control groups in pharmaceutical studies (Fairweather et al., 1998; DIA, 1999).

It is assumed that 3 independent random variables completely determine an animal's fate. These are the time to onset of tumor, T1, the time after onset until death from the tumor, T2, and the time to death from a competing risk, T3. Note that the sum, T1 + T2, represents the overall time to death from the tumor of interest. The random variables T1, T2, and T3 are generated using the distributions outlined above. Thus, with Ts denoting the time of terminal sacrifice, a simulated animal dies without the tumor of interest if T3 < min(T1, Ts), it dies with a tumor but from a competing risk if T1 < T3 < min(Ts, T1 + T2), it dies from the tumor of interest if T1 + T2 < min(T3, Ts), it is sacrificed without the tumor of interest if Ts < min(T1, T3), or it is sacrificed with the tumor of interest if T1 < Ts < min(T3, T1 + T2). Note that the simulated data are representative of occult tumors, that is, tumors that can be detected only upon the death or sacrifice of an animal. Although it is possible to detect certain types of tumors in live animals (e.g., skin, mammary), occult tumors represent the type most commonly observed in rodent bioassays, and hence, they are the focus of this study.

Since there is complete knowledge of the 3 random variables generated for each animal, it is possible to determine for each animal the precise effect that altering the value of Ts, the time of terminal sacrifice, would have with regard to the type of information observed. Thus each simulated experimental data set can be evaluated as if the terminal sacrifice were conducted at either 18, 21, or 24 months. The models for time to onset of tumor, time to death from tumor, and time to death from competing risks, discussed above, are parameterized to achieve a range of effect levels. By using Monte Carlo techniques to simulate 1000 data sets for each parameterization of these models, appropriate data are produced for the power comparison carried out in the next section below.
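The five fate rules and the re-evaluation of one animal at several sacrifice times can be sketched in Python as follows. The function names and outcome labels are illustrative, ties between event times are ignored because they have probability zero for continuous variables, and the stopping times are expressed in weeks (78, 91, and 104 for 18, 21, and 24 months).

```python
def classify(t1, t2, t3, ts):
    """Observed fate of one animal.

    t1: time to tumor onset; t2: time from onset to death from the tumor;
    t3: time to death from a competing risk; ts: time of terminal sacrifice.
    """
    if t3 < min(t1, ts):
        return "died without tumor"
    if t1 < t3 < min(ts, t1 + t2):
        return "died with tumor (incidental)"
    if t1 + t2 < min(t3, ts):
        return "died from tumor (fatal)"
    if ts < min(t1, t3):
        return "sacrificed without tumor"
    return "sacrificed with tumor"  # remaining case: t1 < ts < min(t3, t1 + t2)

def fates_by_stopping_time(t1, t2, t3, stops=(78.0, 91.0, 104.0)):
    """Re-evaluate the same latent times as if sacrifice occurred at each stop."""
    return {ts: classify(t1, t2, t3, ts) for ts in stops}

if __name__ == "__main__":
    # One animal: onset at week 85, tumor death at week 115, competing death at week 100.
    for ts, fate in fates_by_stopping_time(85.0, 30.0, 100.0).items():
        print(ts, fate)
```

For the animal in the usage example, the tumor is unobserved at an 18-month sacrifice, found at sacrifice at 21 months, and found incidentally at death under a 24-month design, which illustrates how shortening the study changes the information available to the test.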

For time to onset of tumor, each submodel of the 2-stage clonal expansion model was configured to produce the following tumor probabilities at 24 months: 0.01 in controls with either 0.15 or 0.30 at the high dose; 0.05 in controls with either 0.30 or 0.50 at the high dose; and 0.15 in controls with either 0.50 or 0.75 at the high dose. The submodels represented in Table 3 and Figure 1 were first standardized (normalized) so that each model's predicted probabilities would cover the entire interval from 0 to 1. That is, if $P(0,t)$ represents the predicted probability of tumor at dose 0 and time $t$, and $P(d_H,t)$ the predicted probability of tumor at the high dose, $d_H$, at time $t$, for a given model in Figure 1, then the model will be normalized to


where $P(d,t)$ is the probability of tumor at dose $d$ ($0 \le d \le d_H$) and time $t$ ($0 \le t \le 24$). The first term on the right-hand side of $P'(d,t)$ gives the normalized dose cross-sectional profile and the second term gives the normalized time profile. Normalized versions of all the submodels in Table 3 and Figure 1 are represented in Figure 2 for $t = 24$ months, where the dose has also been normalized to the unit interval by $d' = d/d_H$. As can be seen from Figure 2, the normalized submodels represent all types of dose-response curvature. Let $P'(d_1,24)$ and $P'(d_4,24)$ correspond, respectively, to the control and high-dose probabilities that are desired to be simulated for a given submodel at 24 months. The corresponding values of $d_1$ and $d_4$ were obtained by iteration for the particular normalized submodel. The values of $d_2$ and $d_3$, the low and middle doses, respectively, were chosen as $d_2 = d_1 + (d_4 - d_1)/4$ and $d_3 = d_1 + (d_4 - d_1)/2$. Figure 3 illustrates this process of dose selection for the pure promoter model with $P'(d_1,24) = 0.05$ and $P'(d_4,24) = 0.30$. Thus, the probabilities that were used to generate times to onset of tumor for the 4 dose groups were $P'(d_1,t)$, $P'(d_2,t)$, $P'(d_3,t)$, and $P'(d_4,t)$, and the corresponding dose values used for the trend test were 0, 1, 2, and 4, respectively.
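The dose-selection step can be illustrated with a small Python sketch. The text says only that the control and high-dose values were obtained "by iteration", so the bisection search used here is an assumption, as is the spacing of the low dose, d2 = d1 + (d4 − d1)/4, which is inferred from the statement that the low dose is 0.25 of the high dose. The cubic `example_curve` is a hypothetical stand-in for a normalized submodel, not one of the fitted curves.

```python
def solve_dose(curve, target, lo=0.0, hi=1.0, tol=1e-10):
    """Bisection for the dose in [lo, hi] at which an increasing curve equals target."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if curve(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def select_doses(curve, p_control, p_high):
    """Return (d1, d2, d3, d4) on the normalized dose scale."""
    d1 = solve_dose(curve, p_control)
    d4 = solve_dose(curve, p_high)
    d2 = d1 + (d4 - d1) / 4.0   # assumed spacing for the low dose
    d3 = d1 + (d4 - d1) / 2.0   # middle dose, as stated in the text
    return d1, d2, d3, d4

def example_curve(d):
    """Hypothetical convex normalized dose-response curve at 24 months."""
    return d ** 3

if __name__ == "__main__":
    doses = select_doses(example_curve, 0.05, 0.30)
    print([round(d, 4) for d in doses])
    print([round(example_curve(d), 4) for d in doses])
```

Because the normalized curve runs from 0 to 1 over the unit dose interval, any control and high-dose target pair can be placed on it, and the curvature of the chosen submodel then determines the tumor probabilities at the two intermediate doses.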



FIG. 2. Normalized dose-response curves representing 6 submodels of the 2-stage clonal expansion model of carcinogenesis.

 


FIG. 3. Illustration of dose selection for a pure promoter model with a 5% background tumor rate at 24 months and a 30% tumor rate at the highest dose. Dose d'1 corresponds to zero dose and dose d'4 to the high dose. Intermediate doses d'2 and d'3 are selected in terms of d'1 and d'4 as indicated.

 
The probability of survival with respect to competing risks at 24 months was set at representative values of 0.60, 0.40, and 0.20 in untreated controls. The survival probability in treated groups varied from equality with control down to 0.10 survival in the high-dose group. Six combinations of survival probabilities, denoted by (control, low, medium, high), were used to give target survival probabilities at 104 weeks (24 months). These were:

For the survival distribution for time to death after onset, only 2 parameterizations were used: low lethality (99% survival at 26 weeks after onset) and high lethality (1% survival at 26 weeks after onset). There was no differential tumor lethality among groups.

The simulated experimental data were generated using SAS (SAS Language: Reference, 1990) on an Alpha workstation. The SAS random number generator CALL RANUNI (seed, random #) was used to generate pseudo-random numbers uniformly distributed on the interval (0,1) to represent the event probabilities for T1, T2, and T3. Because the CALL form of RANUNI allowed independent seeds to be used for each call, it ensured the independence of T1, T2, and T3 in that they were generated from different random number streams. The seeds were chosen systematically based on an iteration number. Each set of three initial seeds generated a stream used for 5 replicates, where a replicate consisted of all combinations of parameterizations: 6 shapes of T1 models, 6 parameterizations of T1 models, 2 parameterizations of T2 models, and 6 parameterizations of T3 models. After 5 replicates, a new seed was systematically chosen for each of the 3 streams and 5 more replicates were produced. In total, 1000 replicates were generated using 200 different random number streams.
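The bookkeeping of streams and replicates can be sketched in Python. This is not the SAS implementation: Python's `random.Random` (a Mersenne Twister generator) stands in for CALL RANUNI, and the seed formula is hypothetical because the text does not give the actual seeding rule.

```python
import random

N_SHAPES, N_T1_PARAMS, N_T2_PARAMS, N_T3_PARAMS = 6, 6, 2, 6
REPLICATES_PER_STREAM_SET, N_STREAM_SETS = 5, 200

def stream_set(iteration):
    """Three separately seeded generators, one each for T1, T2, and T3.

    The seed rule (3 * iteration + k) is a hypothetical example of choosing
    seeds systematically from an iteration number.
    """
    base = 3 * iteration
    return tuple(random.Random(base + k) for k in (1, 2, 3))

# Each replicate covers every combination of model parameterizations.
combos_per_replicate = N_SHAPES * N_T1_PARAMS * N_T2_PARAMS * N_T3_PARAMS
total_replicates = REPLICATES_PER_STREAM_SET * N_STREAM_SETS

if __name__ == "__main__":
    rng_t1, rng_t2, rng_t3 = stream_set(0)
    print(combos_per_replicate, total_replicates)
    print(rng_t1.random(), rng_t2.random(), rng_t3.random())
```

The counts follow directly from the text: 6 × 6 × 2 × 6 = 432 parameter combinations per replicate (72 for each of the 6 model shapes), and 200 stream sets of 5 replicates each give the 1000 replicates.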

Because of their mathematical complexity, none of the models for T1, T2, or T3 allowed solving for the variable, t, in closed form. Thus, the value of t corresponding to a particular pseudo-random uniform variate on the (0,1) probability scale was found, using a standard halving algorithm in each case. While the models for T2 and T3 were parameterized for time expressed in weeks, the model for T1 was parameterized for time expressed in months. The generated value of T1 was converted to weeks for subsequent use on the basis of 4 1/3 weeks/month. Stopping times of 18, 21 and 24 months were similarly converted to weeks.
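A halving (bisection) inversion of a distribution function, together with the month-to-week conversion, might look like the Python sketch below. The search bounds, the tolerance, and the function names are assumptions, and the competing-risk distribution used in the usage lines assumes the cumulative hazard γ1 t + γ2 t^γ3 with θ(d) = 1.

```python
import math
import random

def months_to_weeks(months):
    """Convert months to weeks on the basis of 4 1/3 weeks per month."""
    return months * 13.0 / 3.0

def invert_cdf(cdf, u, lo=0.0, hi=1.0e4, tol=1e-9):
    """Halving search for t with cdf(t) = u, for a nondecreasing cdf on [lo, hi].

    Returns infinity when the event does not occur within the search range.
    """
    if cdf(hi) < u:
        return math.inf
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def competing_risk_cdf(t_weeks):
    """Assumed distribution of time to competing-risk death with theta = 1."""
    return 1.0 - math.exp(-(0.0001 * t_weeks + 1e-16 * t_weeks ** 7.783381))

def draw_time(cdf, rng):
    """Inverse-transform draw: map a uniform (0,1) variate through the inverse cdf."""
    return invert_cdf(cdf, rng.random())

if __name__ == "__main__":
    print([months_to_weeks(m) for m in (18, 21, 24)])
    print(round(invert_cdf(competing_risk_cdf, 0.4), 2))
    print(round(draw_time(competing_risk_cdf, random.Random(1)), 2))
```

Halving needs only evaluations of the distribution function and its monotonicity, which is why it suits models whose inverse has no closed form; the cost is a few dozen function evaluations per simulated time.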


    RESULTS
 
Comparison of Statistical Power at Various Stopping Times
The statistical analysis of each generated experimental data set was carried out using the standard IARC context-of-observation (cause-of-death) approach as outlined by Peto et al. (1980). NCTR's user-written program, PC CHRONIC, was used to implement the IARC test for dose-related trend based on pooled information on fatal and incidental tumors. This pooled test is a test of equality of tumor incidence (onset) rates for non-crossing alternatives (McKnight and Wahrendorf, 1992). Because of problems that have been noted in maintaining the nominal significance level when using IARC's ad hoc data-determined time intervals for the incidental portion of the analysis in the presence of differential treatment mortality (Kodell et al., 1994), the NTP time intervals (Bailer and Portier, 1988) were used in all cases. These intervals are weeks 0–52, 53–78, 79–92, and 93–104. When the total number of tumors across treatment groups was 10 or less, the exact permutation version of the IARC test, as recommended by CDER (Ali, 1989; CDER, 1999), was used instead of the asymptotic test to calculate a p-value.
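Two of the fixed rules in this paragraph, the NTP interval boundaries and the switch to the exact test for small tumor counts, are simple enough to sketch in Python. The function names are hypothetical and this is not the PC CHRONIC implementation; the trend test itself is not reproduced here.

```python
import bisect

# Last week of each NTP interval: 0-52, 53-78, 79-92, 93-104.
NTP_INTERVAL_ENDS = [52, 78, 92, 104]

def ntp_interval(week):
    """Index (0 to 3) of the NTP interval containing an integer week in 0..104."""
    if not 0 <= week <= 104:
        raise ValueError("week must lie between 0 and 104")
    return bisect.bisect_left(NTP_INTERVAL_ENDS, week)

def use_exact_test(total_tumors, threshold=10):
    """True when the exact permutation test replaces the asymptotic test."""
    return total_tumors <= threshold

if __name__ == "__main__":
    print([ntp_interval(w) for w in (0, 52, 53, 78, 79, 92, 93, 104)])
    print(use_exact_test(10), use_exact_test(11))
```

With an 18-month (78-week) stopping time only the first two intervals can contain deaths, so the incidental-tumor analysis has fewer strata to work with than under the 24-month design.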

For each parameterization of the models described above, the analysis was carried out as if there were a terminal sacrifice at 18, 21, or 24 months. For each terminal sacrifice time, the simulated power of the test for positive dose-related trend was calculated as the proportion of the 1000 simulated data sets for which the hypothesis of no differences in tumor rates was rejected in a right-tailed test. Nominal significance levels of 1% and 5% were evaluated. Tables 4 to 9 contain results of the simulated power calculations for the 3 stopping times, along with the relative efficiencies (relative powers) at 18 and 21 months compared to 24 months. Each table contains complete results (72 combinations) for one of the 6 types of time-to-onset models. Figure 4 summarizes the results of the tables in graphical form. For each submodel, Figure 4 gives the average power at 18, 21, and 24 months for a 1% significance level, where the average has been taken over all competing risk survival rates and both levels of tumor lethality. In addition to line plots connecting the average powers, Figure 4 presents box and whisker plots to show the variability in the simulated results.
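The power and relative-efficiency calculations reduce to a few lines of Python. The sketch assumes rejection when the right-tailed p-value is at or below the nominal level; the function names and the p-values in the usage lines are hypothetical. The Monte Carlo standard error is included as a reminder of the sampling variability of a proportion based on 1000 data sets.

```python
import math

def simulated_power(p_values, alpha):
    """Proportion of simulated data sets rejected at level alpha (p <= alpha assumed)."""
    return sum(p <= alpha for p in p_values) / len(p_values)

def relative_efficiency(power_early, power_24):
    """Power at an early stopping time divided by power at 24 months."""
    return power_early / power_24 if power_24 > 0 else float("nan")

def mc_standard_error(power, n_sets=1000):
    """Binomial standard error of a simulated power based on n_sets data sets."""
    return math.sqrt(power * (1.0 - power) / n_sets)

if __name__ == "__main__":
    p_18 = [0.20, 0.004, 0.03, 0.60]     # hypothetical p-values, 18-month analysis
    p_24 = [0.02, 0.001, 0.008, 0.009]   # same data sets, 24-month analysis
    pow_18 = simulated_power(p_18, 0.01)
    pow_24 = simulated_power(p_24, 0.01)
    print(pow_18, pow_24, relative_efficiency(pow_18, pow_24))
    print(round(mc_standard_error(0.5), 4))
```

With 1000 data sets the standard error of a simulated power is at most about 0.016, so differences of a few percentage points between entries in the tables are within simulation noise.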


TABLE 4 Onset Power and 24 Month Relative Efficiency for Initiator Model
 

TABLE 5 Onset Power and 24 Month Relative Efficiency for Promoter (Birth) Model
 

TABLE 6 Onset Power and 24 Month Relative Efficiency for Completer Model
 

TABLE 7 Onset Power and 24 Month Relative Efficiency for Initiator-Promoter Model
 

TABLE 8 Onset Power and 24 Month Relative Efficiency for Initiator-Completer Model
 

TABLE 9 Onset Power and 24 Month Relative Efficiency for Promoter-Completer Model
 


FIG. 4. Simulated power at 18, 21, and 24 months for a 1% right-tailed test. Straight lines join average powers at adjacent stopping times, with averages taken over competing risk survival rates and tumor lethalities. Box and whisker plots demonstrate variability in simulated results.

 
As can be seen from Tables 4–9 and Figure 4, for each of the 6 types of time-to-tumor models except the pure promoter model, there are greater power losses for cases of weak dose-related trends compared to strong dose-related trends. For all models except the pure promoter and promoter-completer models, there are greater power losses for lower-background tumor rates compared to higher-background rates. Perhaps surprisingly, whether tumor lethality is high or low does not appear to have much effect on either absolute or relative power. It is difficult to establish a clear pattern with respect to the effect of differential competing risk survival. Surprisingly, there are a number of configurations for several of the model types for which the power at 18 or 21 months relative to 24 months is actually increased at lower compared to higher survival rates. However, the absolute power was reduced at lower survival rates in virtually all cases for all stopping times.

Quite a number of combinations in Tables 4–9 (see also Fig. 4), for each of the 6 types of time-to-onset models, exhibit both absolute and relative powers that are substantially reduced for an 18-month stopping time relative to the conventional 24-month stopping time. For the 3 simplest types of models (initiator, completer, and promoter), the power of the promoter model is most adversely affected by early stopping, with the completer model the next most affected. The loss of power is much attenuated for the more complex models (initiator-completer, initiator-promoter, promoter-completer), but there are many combinations for which the relative power is low (e.g., below 80%), especially for a 1% significance level.

For the 21-month stopping time, the relative efficiency remains high for many of the combinations in Tables 4–9 (see also Fig. 4). It is greater than 80% for all configurations for the pure initiator, initiator-completer, initiator-promoter, and promoter-completer models. Except for the case of low-background tumor rates with a weak dose-related trend and a 1% significance level, the relative efficiency is well above 80% for the pure completer model, as well. Even for the exceptional case, the relative efficiency is never below 67%. However, the pure promoter model is a clear exception. For a 1% right-tailed test, the highest relative efficiency observed is 65%, and for a 5% test more than half of the observed relative efficiencies are below 70%.


    DISCUSSION
 
Based on the results of this simulation study, it is clear that stopping the conventional rodent bioassay at 18 months instead of the standard 24 months would result in a substantial loss of statistical power to detect chemical carcinogens. The loss of power is greater for the pure initiator, pure completer, and pure promoter models than for the initiator-completer, initiator-promoter and promoter-completer models. This result agrees with intuition, in that a chemical carcinogen that acts by more than one mechanism would seem to be easier to detect in a rodent bioassay than one that acts by a single mechanism. Overall, there are too many combinations of tumor-onset, tumor-lethality, and competing-risk-survival models for which the relative efficiency is reduced too greatly to warrant stopping as early as 18 months.

The case for or against stopping at 21 months is not so clear. For most combinations of tumor-onset, tumor-lethality, and competing-risk-survival models, the efficiency at 21 months relative to 24 months is high enough (e.g., 80% or higher) to warrant stopping at the earlier time. The most notable exception is in the pure promoter model. For a 5% right-tailed test, the relative efficiency for more than half of the combinations of models is below 70%. For a 1% test, the relative efficiency for all combinations is below 70%, being appreciably lower in many cases. Based on the results of this simulation study, it must be concluded that stopping at 21 months or earlier would be ill advised for chemicals that are pure promoters. Because the likely mechanism of action of a potential chemical carcinogen is rarely presumed with great certainty in advance, the results of this study can be used to conclude that the conventional rodent bioassay should not be shortened from 24 to 21 months. This conclusion differs from that of a previous simulation study by Ahn et al. (1998), which indicated that a study of 91 weeks (21 months) duration would maintain adequate power. The difference in conclusions appears to be due to the broader class of time-to-onset models considered in the present study.

The vast majority of rat carcinogenicity studies of pharmaceuticals that are submitted to CDER for review have durations of 24 months and have reasonable survival. The duration of mouse studies ranges from 18 to 24 months, with many lasting only 18 or 21 months. One reason for shorter-term bioassays in mice appears to be a 1985 guideline stating that carcinogenicity studies should be conducted for at least 18 months in mice and 24 months in rats (OSTP, 1985). However, the guideline goes on to say that a longer duration may be appropriate if cumulative mortality at the planned terminal sacrifice is low. CDER requests that pharmaceutical sponsors conduct mouse studies for 24 months, unless there is excessive mortality. The results of this simulation study appear to support CDER's request for studies of 24 months' duration.

There is an important point that needs to be clarified regarding the conduct of the present study. The motivation for the study is the observed reduction in survival of certain rodent test species and strains. If this reduced survival is the result of acceleration of the total aging process, then the measured endpoints of survival and carcinogenesis might not be independent. That is, the cancer process might accelerate at the same rate as the overall aging process. If this were so, then it would not be correct to assume the mutual independence of T1, T2, and T3, the random variables representing time to onset of a tumor, time after onset until death from a tumor, and time to death from a competing risk, as was done in generating the data for the present study. In caloric restriction studies carried out at the National Center for Toxicological Research, ad libitum-fed rats and mice of various strains actually developed greater numbers of spontaneous tumors than calorically restricted animals, despite the significantly shorter lifespan of the ad libitum-fed animals (e.g., Thurman et al., 1994). If the same phenomenon were to occur with chemically induced tumors, then decreased survival might not reduce the power to detect carcinogens at all. However, it is generally agreed that bioassays ought to have good survival and relatively low-background tumor rates in order to minimize potential confounding of the various pathological changes resulting from exposure to chemical agents. This line of thought has led to proposals to use dietary control as a means of increasing survival and reducing variability among treatment groups in toxicity studies (e.g., Turturro et al., 1996). As was stated in the Introduction, the focus of the present study is early stopping rather than the equally important issue of dietary control. 
However, the 24-month power results of the present study can be used to infer that the use of dietary control to reduce differential survival among treatment groups would enhance the statistical power to detect chemical carcinogens.


    NOTES
 
The authors' views expressed in this paper are not necessarily the views of the U.S. Food and Drug Administration.

1 To whom correspondence should be addressed. Fax: (870) 543–7662. E-mail: rkodell@nctr.fda.gov.


    REFERENCES
 
Ahn, H., Zhu, W., Yang, J., and Kodell, R. L. (1998). Efficient designs for animal carcinogenicity experiments. Commun. Stat. Theory Methods 27, 1275–1287.

Ali, M. W. (1989). Exact versus asymptotic tests of trend in tumorigenicity experiments: a comparison of p-values for a small number of events. Proceedings of the Biopharmaceutical Section, American Statistical Association, pp. 148–153.

Allaben, W., Turturro, A., Leakey, J., Seng, J., and Hart, R. (1996). FDA points-to-consider document: The need for dietary control for the reduction of experimental variability within animal assays and the use of dietary restriction to achieve dietary control. Toxicol. Pathol. 24, 279–284.

Bailer, A. J., and Portier, C. J. (1988). Effects of treatment-induced mortality and tumor-induced mortality on tests for carcinogenicity in small samples. Biometrics 44, 417–431.

CDER (Center for Drug Evaluation and Research) (1999). Guidance for Industry: Statistical Aspects of the Design, Analysis, and Interpretation of Chronic Rodent Carcinogenicity Studies of Pharmaceuticals. U.S. Food and Drug Administration. Document in preparation.

Contrera, J. F. (1994). Control of dietary intake: implications for drug evaluation. ILSI Conference on Dietary Restriction: Implications for the Design and Interpretation of Toxicity and Carcinogenicity Studies; February 28–March 2, 1994; Washington, DC.

Davies, T. S., and Monro, A. M. (1996). The duration of the rodent carcinogenicity bioassay necessary to detect agents carcinogenic to humans. Fundam. Appl. Toxicol. 30, S201 (abstract).

DIA (Drug Information Association) (1999). DIA Workshop on Carcinogenicity Testing of Pharmaceuticals in the EU. September 1–2, 1999; Zurich, Switzerland.

Fairweather, W. R., Bhattacharyya, A., Ceuppens, P. P., Heimann, G., Hothorn, L. A., Kodell, R. L., Lin, K. K., Mager, H., Middleton, B. J., Slob, W., Soper, K. A., Stallard, N., Ventre, J., and Wright, J. (1998). Biostatistical methodology in carcinogenicity studies. Drug Information Journal 32, 401–421.

Gold, L. S., Sawyer, C. B., Magaw, R., Backman, G. M., de Veciana, M., Levinson, R., Hooper, N. K., Havender, W. R., Bernstein, L., Peto, R., Pike, M. C., and Ames, B. N. (1984). A carcinogenic potency database of the standardized results of animal bioassays. Environ. Health Perspect. 58, 244–245.

Hart, R. W. (1994). Dietary restriction: an update. Keynote address, ILSI Conference on Dietary Restriction: Implications for the Design and Interpretation of Toxicity and Carcinogenicity Studies; February 28–March 2, 1994; Washington, DC.

Keenan, K. P. (1994). The effects of overfeeding and moderate dietary restriction on Sprague-Dawley (SD) rat survival, carcinogenesis, and chronic disease. ILSI Conference on Dietary Restriction: Implications for the Design and Interpretation of Toxicity and Carcinogenicity Studies; February 28–March 2, 1994; Washington, DC.

Kodell, R. L., Chen, J. J., and Moore, G. E. (1994). Comparing distributions of time to onset of disease in animal tumorigenicity experiments. Commun. Stat. Theory Methods 23, 959–980.

Kodell, R. L., Haskin, M. G., Shaw, G. W., and Gaylor, D. W. (1983). CHRONIC: a SAS procedure for statistical analysis of carcinogenesis studies. J. Stat. Comput. Simulation 16, 287–310.

Kodell, R. L., Shaw, G. W., and Johnson, A. M. (1982). Nonparametric joint estimators for disease resistance and survival functions in survival/sacrifice experiments. Biometrics 38, 43–58.

Kopp-Schneider, A., Portier, C. J., and Sherman, C. D. (1994). The exact formula for tumor incidence in the two-stage model. Risk Anal. 14, 1079–1080.

McKnight, B., and Wahrendorf, J. (1992). Tumour incidence rate alternatives and the cause-of-death test for carcinogenicity. Biometrika 79, 131–138.

Moolgavkar, S. H., and Luebeck, G. (1990). Two-event model for carcinogenesis: biological, mathematical, and statistical considerations. Risk Anal. 10, 323–341.

OSTP (Office of Science and Technology Policy) (1985). Chemical Carcinogens: A Review of the Science and Its Associated Principles. U.S. Interagency Staff Group on Chemical Carcinogenesis. Federal Register 50, 10371–10442.

Peto, R., Pike, M. C., Day, N. E., Gray, R. G., Lee, P. N., Parish, S., Peto, J., Richards, S., and Wahrendorf, J. (1980). Guidelines for simple, sensitive significance tests for carcinogenic effects in long-term animal experiments. In IARC Monographs on the Evaluation of the Carcinogenic Risk of Chemicals to Humans, Supplement 2: Long-Term and Short-Term Screening Assays for Carcinogens, a Critical Appraisal. IARC, Lyon.

Portier, C., Hedges, J., and Hoel, D. (1986). Age-specific models of mortality and tumor onset for historical control animals in the National Toxicology Program's carcinogenicity experiments. Cancer Res. 46, 4372–4378.

Rao, G. N. (1994). Husbandry procedures other than diet restriction for lowering body weight and tumor/disease rates in Fischer 344 rats. ILSI Conference on Dietary Restriction: Implications for the Design and Interpretation of Toxicity and Carcinogenicity Studies; February 28–March 2, 1994; Washington, DC.

SAS Language: Reference (1990). Version 6, 1st ed. SAS Institute, Cary, NC.

SAS/STAT User's Guide (1990). Version 6, 4th ed. SAS Institute, Cary, NC.

Thurman, J. D., Bucci, T., Hart, R., and Turturro, A. (1994). Survival, body weight, and spontaneous neoplasms in ad libitum-fed and dietary-restricted Fischer 344 rats. Toxicol. Pathol. 22, 1–9.

Turturro, A., Duffy, P., Hart, R., and Allaben, W. (1996). Rationale for the use of dietary control in toxicity studies—B6C3F1 mouse. Toxicol. Pathol. 24, 769–775.

Zheng, Q. (1994). On the exact hazard and survival functions of the MVK stochastic carcinogenesis model. Risk Anal. 14, 1081–1084.