Department of Psychiatry, University of Mainz, Mainz and 1 Institute of Psychology, University of Tübingen, Germany
* Author to whom correspondence should be addressed at: Department of Psychiatry, University of Mainz, Untere Zahlbacher Straße 8, D-55131 Mainz, Germany. Tel.: +49 6131 17 2920; Fax: +49 6131 17 66 90; E-mail: mjm{at}mail.psychiatrie.klinik.uni-mainz.de
(Received 11 March 2003; first review notified 6 May 2003; revised and accepted 23 November 2004)
INTRODUCTION |
---|
Generally, one of the essential problems in planning and conducting comparative randomized clinical trials with two or more treatment arms is to provide a balanced assignment of subjects to the experimental groups (Pocock, 1983). Although simple randomization procedures offer several advantages, e.g. high unpredictability of treatment and applicability of inference statistics based on random sample theory, practical considerations and empirical findings often show shortcomings and irregularities of these techniques, resulting in major interpretative problems (Simon, 1979). Simple randomization seems to be sufficient and recommendable in trials with n > 200 (Lachin, 1988a). In smaller samples, matching or stratification techniques have been suggested to avoid accidentally occurring unbalanced designs (covariate imbalance) (Billewicz, 1965; Chase, 1968; Bailey, 1983). Both approaches have advantages and shortcomings. In the case of primarily unknown proportions of subjects with a prognostic characteristic, randomization within groups of individuals with the same characteristic (stratum) seems most appropriate (Simon, 1979). However, block randomization or simple randomization within strata will yield the same results as without stratification, e.g. each subject has a probability of P = 0.50 of being assigned to one of two treatments, without any correction for deviations from balance.
On the other hand, highly deterministic approaches have been proposed, e.g. pairwise block randomization within each stratum, where each assignment determines the next. Alternative strategies determine treatment assignment in the event of a defined deviation from random balance (Taves, 1974). However, every deterministic interference can invalidate the results of randomized trials and should therefore be avoided (selection bias).
The key idea to improve randomization within strata is to apply an algorithm that is efficient in adjusting or forcing the balance of randomized assignments without significantly reducing treatment unpredictability. Therefore, the sequential progress of assignment has to be taken into account in some way. Depending on previous assignments and the resulting current over- or under-representation of one treatment, the immediate next step of the randomization process should be influenced. Efron (1971) proposed and Pocock and Simon (1975) elaborated on the idea of the biased coin, i.e. to vary the probability of treatment assignment in favour of the so-far under-represented treatment (Atkinson, 1982). Wei (1978) and Wei and Lachin (1988) presented a general outline, including mathematical properties, of adaptive randomization techniques within the framework of standard urn designs. The biased coin procedure should reduce possible imbalances in stratified randomization, particularly if subjects enter the study sequentially and if the prevalence of the categories of a stratification variable is not known a priori.
According to Efron's (1971) biased coin idea, the following rationale can be derived:
Efron (1971) suggested that a value of P = 2/3 is generally acceptable in case (ii); Pocock and Simon (1975) used values of P = 1 and P = 3/4. More sophisticated procedures to optimize balance in stratified designs have also been developed (Klotz, 1978) but did not reach widespread use, possibly because of the complex computer calculations that have to be carried out. In clinical studies, for example, a randomization routine has to be maximally straightforward and safe in terms of the treatment blindness of patients and clinical staff. Pocock (1983) has emphasized that the possibility of implementing and running a design with ease is at least as important as its theoretical optimality. Therefore, we propose an easily and routinely applicable approach and show that our approach, in the case of a priori unknown but low numbers of subjects fulfilling specific stratification criteria, is at least comparable to the biased coin procedure with respect to forcing a balance between two treatments. The proposed approach is a sequentially adjusted randomization technique: as the trial proceeds and subjects are successively assigned, the base probability of P = 0.50 for each of two treatment alternatives is continuously adjusted depending on previous assignments (without having to calculate them). The rationale follows the above-mentioned three steps. However, while Efron's approach used a fixed value of P = 2/3 whenever one treatment group was under-represented, we propose a more flexible strategy in which the assignment probability depends on the numbers of subjects already assigned to each treatment (see Methods).
Due to the statistical properties and simplicity of the proposed approach, it should be preferable in designs with a priori unknown but low prevalence of specific stratification characteristics.
The conjectured features and advantages will be tested empirically by computer simulation techniques comparing the present approach with simple randomization and Efron's method. For illustrative purposes, an application of the algorithm to a hypothetical clinical study on different treatments of alcoholism will be given.
METHODS |
---|
Efron's biased coin approach (P = 2/3). The approach is carried out as described by Efron (1971); whenever a treatment is under-represented during sequential assignment, the probability for this treatment is set at P = 0.67 for subsequent assignments. If both treatments are balanced, chances of P = 0.50 are used for random assignment to both treatments. For each new assignment, the proportion of previous assignments to each treatment has to be calculated and taken into account.
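The biased coin rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' simulation code: the function names `efron_prob_a` and `efron_sequence` and the parameter `p_favoured` are hypothetical, and the standard-library generator `random.Random` stands in for whatever random source a real trial would use.

```python
import random


def efron_prob_a(n_a, n_b, p_favoured=2 / 3):
    """Probability that the next subject is assigned to treatment A.

    n_a, n_b: numbers of subjects already assigned to A and B.
    p_favoured: probability given to the under-represented arm
    (2/3 as suggested by Efron, 1971).
    """
    if n_a == n_b:
        return 0.5  # balanced so far: fair coin
    # The arm with fewer subjects receives the larger probability.
    return p_favoured if n_a < n_b else 1 - p_favoured


def efron_sequence(n, rng, p_favoured=2 / 3):
    """Assign n subjects one after another; return the final (n_a, n_b)."""
    n_a = n_b = 0
    for _ in range(n):
        if rng.random() < efron_prob_a(n_a, n_b, p_favoured):
            n_a += 1
        else:
            n_b += 1
    return n_a, n_b


if __name__ == "__main__":
    rng = random.Random(2003)  # arbitrary seed, for reproducibility only
    print(efron_sequence(20, rng))
```

Note that the rule needs only the two running counts, which is why the proportion of previous assignments has to be available at every step.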
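Sequentially adjusted randomization (new method). (i) $P_A$ or $P_B$ is the probability for a subject to be assigned to treatment A or B; (ii) $n_A$ or $n_B$ is the number of subjects already assigned to treatment A or B; and (iii) for each new subject of a stratum $s_x$ to be assigned, let $P_A = (n_B + 1)/(n_A + n_B + 2)$ and $P_B = (n_A + 1)/(n_A + n_B + 2)$. For example, before the first assignment ($n_A = n_B = 0$) both probabilities are 1/2; if the first subject is assigned to A ($n_A = 1$, $n_B = 0$), the next subject has $P_A = 1/3$ and $P_B = 2/3$.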
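The assignment probabilities defined in step (iii) can be sketched as follows. This is a minimal sketch rather than the authors' program: the names `adjusted_prob_a` and `adjusted_sequence` are hypothetical, and exact fractions are used only so that the probabilities can be inspected without rounding.

```python
import random
from fractions import Fraction


def adjusted_prob_a(n_a, n_b):
    """P_A = (n_B + 1) / (n_A + n_B + 2), as an exact fraction.

    By symmetry, P_B is adjusted_prob_a(n_b, n_a), and the two sum to 1.
    """
    return Fraction(n_b + 1, n_a + n_b + 2)


def adjusted_sequence(n, rng):
    """Assign n subjects of one stratum sequentially; return (n_a, n_b)."""
    n_a = n_b = 0
    for _ in range(n):
        if rng.random() < float(adjusted_prob_a(n_a, n_b)):
            n_a += 1
        else:
            n_b += 1
    return n_a, n_b


if __name__ == "__main__":
    # Probabilities after 0, 1 and 2 consecutive assignments to A.
    for n_a in range(3):
        print(n_a, adjusted_prob_a(n_a, 0))
    print(adjusted_sequence(20, random.Random(2004)))  # arbitrary seed
```

Unlike the fixed biased coin, the size of the correction here depends on how many subjects have already been assigned: it is strong in the first few assignments and fades as the stratum grows.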
Simulation procedure and outcome parameters
To test the accuracy of the aforementioned approaches, different a priori specified hypothetical values for the prevalence of subjects with specific stratum characteristics were used (Table 1); in each simulation, a particular number of individuals (2–100) had to be randomly assigned to either treatment A or B. A randomization was considered accurate whenever the simulation resulted in a balanced assignment of the hypothetical subjects of the hypothetical stratum to the treatments. Balance was accepted if the distribution of assignments did not deviate significantly from equal distribution (P = 0.50 for assignment to either group). For that purpose, binomial tests were calculated before simulations were performed, and distributions that did not deviate significantly (one-sided P > 0.05) from the hypothesis of balance were accepted as balanced. Table 1 shows the hypothetical prevalence, i.e. the number of subjects to be assigned, used for simulation and the accepted distributions with the corresponding binomial test results.
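One way to read the acceptance criterion above is sketched below. It assumes that the one-sided binomial test is the lower-tail probability of the smaller group under Binomial(n, 0.5); the exact rule behind Table 1 is not reproduced here and may differ, especially for very small strata. The function names `one_sided_p` and `accepted_splits` are hypothetical.

```python
from math import comb


def one_sided_p(n, k):
    """Lower-tail probability P(X <= min(k, n - k)) for X ~ Binomial(n, 0.5).

    n: stratum size; k: number of subjects assigned to treatment A.
    """
    m = min(k, n - k)
    return sum(comb(n, i) for i in range(m + 1)) / 2 ** n


def accepted_splits(n, alpha=0.05):
    """All (n_a, n_b) splits of n subjects that count as balanced,
    i.e. whose one-sided P exceeds alpha."""
    return [(k, n - k) for k in range(n + 1) if one_sided_p(n, k) > alpha]


if __name__ == "__main__":
    for n in (6, 10, 20):
        print(n, accepted_splits(n))
```

Under this reading a stratum of six subjects is accepted for every split except 6:0 and 0:6. For very small strata no split reaches significance at all, so the test alone accepts everything there; a stricter rule would be needed if such strata are to be judged.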
Illustrative example
For illustrative purposes, data from a hypothetical study are used to show the applicability and results of the approach. A placebo-controlled clinical study of pharmacological relapse prevention in 100 alcohol-dependent patients is outlined. Three prognostic factors with unknown prevalence in the sample are used to stratify the sample of sequentially enrolled patients: antisocial personality features [according to the Diagnostic and Statistical Manual for Mental Disorders (DSM-IV)], severity of alcohol dependence (more than three hospitalizations for alcohol detoxification) and social maladjustment (unemployment).
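To make the bookkeeping for such a study concrete, the sketch below keeps one pair of counters per stratum and applies the sequentially adjusted probabilities within each stratum. Several things here are assumptions rather than part of the study description: the class name `StratifiedAllocator`, the decision to treat every combination of the three factors as its own stratum (up to eight strata), and the prevalences 0.2, 0.3 and 0.4 used to generate hypothetical patients.

```python
import random
from collections import defaultdict


class StratifiedAllocator:
    """Sequentially adjusted randomization within strata (hypothetical helper)."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        # One pair of running counts per stratum, created on first use.
        self.counts = defaultdict(lambda: {"A": 0, "B": 0})

    def assign(self, antisocial, severe, unemployed):
        """Assign the next patient and return 'A' or 'B'."""
        stratum = (bool(antisocial), bool(severe), bool(unemployed))
        c = self.counts[stratum]
        # P_A = (n_B + 1) / (n_A + n_B + 2) within this stratum.
        p_a = (c["B"] + 1) / (c["A"] + c["B"] + 2)
        arm = "A" if self.rng.random() < p_a else "B"
        c[arm] += 1
        return arm


if __name__ == "__main__":
    patients = random.Random(7)  # generates hypothetical patient features
    allocator = StratifiedAllocator(seed=11)
    for _ in range(100):
        allocator.assign(
            patients.random() < 0.2,  # antisocial personality features
            patients.random() < 0.3,  # more than three detoxifications
            patients.random() < 0.4,  # unemployment
        )
    for stratum, c in sorted(allocator.counts.items()):
        print(stratum, c)
```

Because strata are created only when the first patient with that combination arrives, the prevalence of each characteristic does not have to be known in advance.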
RESULTS |
---|
The results of the simulation procedures with 10 000 runs are given in Tables 2 and 3. Table 2 gives expected and observed probabilities for an assignment of subjects with a specific prognostic feature to one of two treatments. Only minor deviations from the expected value of P = 0.50 were observed for all three approaches under investigation.
The rationale for deciding whether or not an assignment trial is balanced was derived from inference statistics, i.e. binomial tests as outlined in Table 1. Randomization runs resulting in proportions of assignments to both treatments that did not deviate significantly (one-sided P > 0.05) from equal distribution were accepted as sufficiently balanced. According to this pragmatic guideline, a numerical comparison of proportions revealed highly satisfactory figures (acceptable balance in >90% of trials) for both Efron's approach and the newly proposed method in all cases with n > 2 subjects to be randomized per stratum. Both approaches were clearly superior to simple randomization. In strata of size n ≤ 20, the new method was numerically superior to Efron's approach, whereas assignments in strata with n > 20 favoured Efron's approach. Taking into account all computed assignment simulations, the new approach reached a proportion of statistically acceptable balanced randomization of >95% in all stratum sizes. Efron's method achieved similar results, with the exception that in the stratum with n = 6, a proportion of 93.5% sufficiently balanced assignments was calculated from 10 000 simulation runs. Figure 3 shows the comparison of Efron's and the new approach.
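The comparison described above can be approximated with a short Monte Carlo sketch. It is not the authors' program: the function names are hypothetical, the seed is arbitrary, and the balance rule is the same assumed reading of the one-sided binomial criterion (lower-tail probability of the smaller group), so the proportions it prints should not be expected to match Tables 2 and 3.

```python
import random
from math import comb


def p_next_a(method, n_a, n_b):
    """Probability of assigning the next subject to A under each method."""
    if method == "simple":
        return 0.5
    if method == "efron":  # biased coin with P = 2/3
        if n_a == n_b:
            return 0.5
        return 2 / 3 if n_a < n_b else 1 / 3
    if method == "adjusted":  # sequentially adjusted randomization
        return (n_b + 1) / (n_a + n_b + 2)
    raise ValueError(f"unknown method: {method}")


def is_balanced(n_a, n_b, alpha=0.05):
    """Assumed criterion: one-sided binomial P of the smaller group > alpha."""
    n, m = n_a + n_b, min(n_a, n_b)
    return sum(comb(n, i) for i in range(m + 1)) / 2 ** n > alpha


def proportion_balanced(method, n, runs=10_000, seed=0):
    """Share of simulated strata of size n that end up acceptably balanced."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        n_a = n_b = 0
        for _ in range(n):
            if rng.random() < p_next_a(method, n_a, n_b):
                n_a += 1
            else:
                n_b += 1
        hits += is_balanced(n_a, n_b)
    return hits / runs


if __name__ == "__main__":
    for n in (6, 10, 20, 50, 100):
        row = [proportion_balanced(m, n) for m in ("simple", "efron", "adjusted")]
        print(n, row)
```

Running the three methods over a range of even stratum sizes gives a rough picture of how the balance of each method changes with stratum size under the assumed criterion.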
DISCUSSION |
---|
The problem of unbalanced assignments in so-called randomized trials is presumably underestimated (Stout et al., 1994). For most researchers, simple random assignment of patients to one of two or more treatment alternatives still seems sufficient to protect against ending up with treatment groups that differ significantly in essentially relevant features. This source of bias was labelled accidentally occurring bias because, by definition, it cannot be a systematic bias in randomized assignment, but it still represents one serious form of bias (covariate imbalance). The only way to protect against such randomly occurring influences, which can substantially invalidate research findings, seems to be a sufficiently large sample size (n > 200) (Lachin, 1988a). In all other cases, potentially relevant features should be thoroughly assessed prior to the study, and a matching or stratification procedure should be used (Simon, 1979; Fleiss, 1981). For these designs, feasible procedures are still required, although the mathematical background has already been developed (Wei, 1978; Lachin, 1988b; Lachin et al., 1988; Wei and Lachin, 1988). The method we propose deals with the issue of gaining methodologically sound results with a straightforward, robust and practically tractable procedure. We do not claim to propose an optimal solution to the problem of stratified randomization, as there have been several excellent contributions to that field (Pocock and Simon, 1975; Freedman and White, 1978; Klotz, 1978). The scope of our approach was to evaluate a foolproof randomization algorithm with respect to its relevant statistical properties. Therefore, we used a simulation procedure with 10 000 runs, which should be sufficient for comparison of different approaches. We did not choose a more or less arbitrary but mathematically derived criterion for deviation from balanced assignment, such as the |q1 − q2| statistic proposed by Pocock and Simon (1975). Instead, in line with our pragmatic aims, we defined acceptable balance in assignment from an inference statistical standpoint.
One typical step in data analysis is to check for comparability of treatment groups with respect to relevant characteristics that could influence the outcome measures. If there are significant differences between groups, in some cases disturbing variables can be used as covariates for analysis. However, the best statistical approach cannot compensate for shortcomings in a study design. In the case of nominal categorical variables (e.g. gender), the use of covariates is problematic, because analysis of covariance artificially creates an individual of mixed gender, comprising e.g. 2/5 male and 3/5 female features, who does not exist. On the other hand, post hoc stratification lowers the power of statistical tests, but even in this case a balanced distribution between treatment groups would be clearly recommendable. Nevertheless, stratification approaches and covariance analyses are not mutually exclusive, and could be useful in combination.
When analysing the balance of relevant prognostic values between treatments for categorical data, usually χ² tests or binomial tests are used. Hence, it is not necessary that prognostic features are exactly balanced between treatment groups. Instead, a distribution not deviating significantly from the hypothesis of equal distribution is, in most cases, sufficient for assuming balance. According to Cui et al. (2002), imbalance frequently occurs in strata with a small number of subjects (numerical imbalance). However, numerical imbalance does not necessarily imply clinical relevance (pragmatic balance); i.e. pragmatic balance allows numerical imbalance, and binomial tests can in such cases be useful to decide on the cut-off (treating failure to reject the null hypothesis of balance as acceptance of balance), as shown in Table 1.
Thus, we calculated binomial tests prior to the simulation of assignments and regarded all assignment distributions as acceptable if they did not lead to the rejection of the null hypothesis of equal distribution between groups. We chose a rather conservative level of significance (one-tailed P < 0.05).
Several limitations of our approach should, however, be mentioned. First, for the sake of simplicity we decided to use only even numbers of subjects to be assigned in our simulation procedures. The application example was extended to the use of odd numbers of subjects hypothetically assigned to two treatment groups (n = 5, n = 15). Second, strata comprising a very small number of subjects (n ≤ 3) cannot be assigned in a balanced fashion by the proposed approach with the same accuracy as in the case of larger strata. Third, the use of even numbers of subjects to be assigned and the discrete cut-off values for the statistical decision about balanced or unbalanced simulation outcomes led to discrete and not strictly monotone curves of proportions of balanced assignments. Nevertheless, for the aforementioned practical reasons, we decided to show these results. The outcome of assignment simulation for the new method and for Efron's approach was highly satisfactory, as an acceptable balance was obtained in clearly >90% of simulated runs (for n > 2 in each stratum), and, in general, both approaches were substantially superior to a simple randomization procedure (Table 4). If unbalanced assignments must be definitely avoided, deterministic assignment procedures are recommended (Taves, 1974); however, this comes at the cost of lost treatment unpredictability. For high numbers of stratification features and combinations thereof, the approach leads to unsatisfactory results, as most of the strata will contain no or only very low numbers of subjects, and the newly proposed assignment procedure, as well as Efron's, will approximate simple randomization (Pocock and Simon, 1975).
For strata with high numbers of subjects (n > 20), Efron's approach seems superior to the new method because the assignment probability of the new approach asymptotically reaches P = 0.50 for n → ∞. Another critical point is that we have used only the biasing probability of 1/3 or 2/3 for forcing balance within Efron's approach. As has been shown (Pocock and Simon, 1975), more extreme biasing probabilities lead to even better results in typical designs. Hence, the advantage of the newly proposed method has to be seen in the context of accuracy, practicability and safety with respect to administration and treatment unpredictability (blindness). A descriptive comparison of different approaches with respect to these features is given in Table 6.
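The asymptotic behaviour mentioned above follows directly from the assignment rule given in the Methods section. The short derivation below is a worked restatement of that rule, where n denotes the number of subjects already assigned in the stratum (notation introduced here for convenience):

```latex
\begin{aligned}
P_A - \tfrac{1}{2}
  &= \frac{n_B + 1}{n_A + n_B + 2} - \frac{1}{2}
   = \frac{2(n_B + 1) - (n_A + n_B + 2)}{2\,(n_A + n_B + 2)}
   = \frac{n_B - n_A}{2\,(n + 2)}, \qquad n = n_A + n_B, \\
\lim_{n \to \infty} P_A &= \tfrac{1}{2}
  \quad \text{for any fixed imbalance } n_B - n_A .
\end{aligned}
```

For a given imbalance the correction therefore shrinks as the stratum grows. With two more subjects in B than in A, the rule gives P_A = 4/6 = 2/3 when n = 4 (the same as Efron's biased coin) but only P_A = 22/42 ≈ 0.52 when n = 40, whereas Efron's rule stays at 2/3 whenever A is under-represented.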
REFERENCES |
---|
Bailey, R. A. (1983) Restricted randomization. Biometrika 70, 183–198.
Billewicz, W. Z. (1965) The efficiency of matched samples. An empirical investigation. Biometrics 21, 623–644.
Chase, G. R. (1968) On the efficiency of matched pairs in Bernoulli trials. Biometrika 55, 365–369.
Cochran, W. G. (1968) The effectiveness of subclassification in removing bias in observational studies. Biometrics 24, 295–313.
Cui, L., Hung, H. M. J., Wang, S. J. et al. (2002) Issues related to subgroup analysis in clinical trials. Journal of Biopharmaceutical Statistics 12, 241–252.
Efron, B. (1971) Forcing a sequential experiment to be balanced. Biometrika 58, 403–417.
Fleiss, J. L. (1981) Statistical Methods for Rates and Proportions. Wiley, New York.
Freedman, L. S. and White, S. J. (1978) On the use of Pocock and Simon's method for balancing treatment numbers over prognostic factors in the controlled clinical trial. Biometrics 32, 691–694.
Johnson, B. A., Roache, J. D., Javors, M. A. et al. (2000) Ondansetron for reduction of drinking among biologically predisposed alcoholic patients: a randomized controlled trial. Journal of the American Medical Association 284, 963–971.
Klotz, J. H. (1978) Maximum entropy constrained balance randomization for clinical trials. Biometrics 34, 283–287.
Lachin, J. M. (1988a) Statistical properties of randomization in clinical trials. Controlled Clinical Trials 9, 289–311.
Lachin, J. M. (1988b) Properties of simple randomization in clinical trials. Controlled Clinical Trials 9, 312–326.
Lachin, J. M., Matts, J. P. and Wei, L. J. (1988) Randomization in clinical trials: conclusions and recommendations. Controlled Clinical Trials 9, 365–374.
National Institute on Alcohol Abuse and Alcoholism (1993) Project MATCH (Matching Alcoholism Treatment to Client Heterogeneity): rationale and methods for a multisite clinical trial matching patients to alcoholism treatment. Alcohol: Clinical and Experimental Research 17, 1130–1145.
Nielsen, B., Nielsen, A. S. and Wraae, O. (1998) Patient-treatment matching improves compliance of alcoholics in outpatient treatment. Journal of Nervous and Mental Disease 186, 752–760.
Pocock, S. J. (1983) Clinical Trials: A Practical Approach. Wiley, New York.
Pocock, S. J. and Simon, R. (1975) Sequential treatment assignment with balancing for prognostic factors in the controlled clinical trial. Biometrics 31, 103–115.
Simon, R. (1979) Restricted randomization designs in clinical trials. Biometrics 35, 503–512.
Stout, R. L., Wirtz, P. W., Carbonari, J. P. et al. (1994) Ensuring balanced distribution of prognostic factors in treatment outcome research. Journal of Studies on Alcohol, Suppl. 12, 70–75.
Taves, D. R. (1974) Minimization: a new method of assigning patients to treatment and control groups. Clinical Pharmacology and Therapeutics 15, 443–453.
Wei, L. J. (1978) An application of urn model to the design of sequential controlled clinical trials. Journal of the American Statistical Association 73, 559–563.
Wei, L. J. and Lachin, J. M. (1988) Properties of the urn randomization in clinical trials. Controlled Clinical Trials 9, 345–364.