Evaluation of Cluster Randomized Controlled Trials in Sub-Saharan Africa

Petros Isaakidis1 and John P. A. Ioannidis1,2 

1 Clinical Trials and Evidence-Based Medicine Unit, Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece.
2 Division of Clinical Care Research, Department of Medicine, Tufts University School of Medicine, Boston, MA.

Received for publication April 3, 2002; accepted for publication May 8, 2003.


    ABSTRACT
 
Cluster randomized controlled trials (CRCTs) are attractive in settings in which individual randomization is difficult or impossible. This issue is common when studying several health problems in developing countries. The authors aimed to assess empirically the extent to which the prerequisite design and analysis aspects of cluster randomization were taken into account and reported properly in CRCTs conducted in sub-Saharan Africa. CRCTs published in the last three decades were evaluated by using a checklist based on the Consolidated Standards of Reporting Trials (CONSORT) statement. The authors identified 51 eligible CRCTs; 40 of them (78%) had been published after 1990. Only 10 (20%) studies took clustering into account in sample size or power calculations, and only 19 (37%) took clustering into account in the analysis. Intracluster correlation coefficients and design effects were reported in only one (2%) and three (6%) trials, respectively. An increasing number of CRCTs are conducted in sub-Saharan Africa, but many are not analyzed and reported properly. The special features stemming from cluster randomization need to be addressed in the design, analysis, and reporting of these studies.

Africa; cluster analysis; data collection; developing countries; planning techniques; random allocation; randomized controlled trial

Abbreviations: CONSORT, Consolidated Standards of Reporting Trials; CRCT, cluster randomized controlled trial; ICC, intracluster correlation coefficient.


    INTRODUCTION
 
Cluster randomization is increasingly being used in medical research when it is difficult or impossible to apply an experimental intervention to individual subjects. Cluster randomized controlled trials (CRCTs) have their own methodological peculiarities that need to be taken into account in their design, conduct, analysis, and reporting to avoid serious errors and misinterpretations (1–5). However, empirical evaluations have shown that methodological shortcomings are common in the analysis and reporting of CRCTs conducted in developed countries (6–8). One area in which CRCTs are particularly attractive as a method of research is the evaluation of interventions in developing countries. Developing countries, in particular those of sub-Saharan Africa, carry a major portion of the global burden of disease (9, 10). Many interventions in sub-Saharan Africa are likely to be feasible primarily or exclusively at the community or group level. Thus, CRCTs are particularly important in this setting.

In this article, we report the results of a methodological evaluation of CRCTs performed in sub-Saharan Africa. Our aim was to assess the extent to which the prerequisite design and analysis aspects of cluster randomization were taken into account and reported properly in the trial publications. This information may be important for gaining insight into improving the conduct and reporting of future CRCTs in developing countries.


    MATERIALS AND METHODS
 
Eligibility criteria
We considered all CRCTs that were conducted in sub-Saharan Africa and were published until November 2001. We excluded nonrandomized and pseudorandomized controlled trials, trials randomized at the level of the individual, trials that used only random sampling within clusters without randomization, and reports that described only baseline data collection without information on the interventional phase of a randomized trial.

Study reports that obviously reflected secondary publication of a main study report were also excluded. However, whenever secondary publications reported additional useful information about the trial design or analysis, this information was recorded and was used to give respective credit to the trial.

Identification of trials
We used a comprehensive database of randomized controlled trials in sub-Saharan Africa (10). This database is based on MEDLINE, the Cochrane Controlled Trials Register, and the African Published Trials Register of the South African Cochrane Center, which includes hand searching of major African journals. MEDLINE and the Cochrane Controlled Trials Register searches were updated until November 2001. A number of terms reflecting CRCTs were used in conjunction with "Africa" and "sub-Saharan Africa" and with specific geographic names. Details on the search strategy are available on request.

Data extraction
We reviewed each potentially eligible article to determine whether it satisfied the selected criteria. From each eligible article, we extracted the following information: author, journal, year of publication, country or countries of recruitment, disease(s) or condition(s) targeted, number of trial arms and type of intervention, and methodological criteria (as described below). One author extracted data on all items. The other assessed all questionable items. In case of disagreement, consensus was reached after discussion.

Methodological criteria
To formulate the specific items of the methodological evaluation of the included trials, we referred to the Consolidated Standards of Reporting Trials (CONSORT) statement checklist (11, 12), taking into account the published suggestions on extending the CONSORT statement to CRCTs (13). The standard CONSORT checklist of items that should be included in the trial report was thus modified selectively to take into account the main specific methodological issues referring to the design, conduct, analysis, and reporting of CRCTs.

Thus, for each article, we recorded whether 1) the study was identified as a CRCT in the title; 2) the rationale was given for choosing the cluster design and, if so, what the rationale was; 3) the exclusion and inclusion criteria were stated for individuals, clusters, or both; 4) the planned intervention was aimed at individuals, clusters, or both; 5) the primary outcome(s) was stated clearly; 6) the sample size, number of clusters, and cluster size were reported; 7) the sample size calculations took clustering into account; 8) the intracluster correlation coefficient (ICC) was calculated and recorded; 9) the design effect was estimated and reported; 10) the unit of randomization was described; 11) pairing and/or stratification were used; 12) the within-cluster recruitment procedure was stated to be cross-sectional or longitudinal; 13) within-cluster sampling was used; 14) the method of masking was stated; 15) allocation schedule control (location of code) was described; 16) a participant flow diagram was provided; 17) the level of analysis was stated; 18) clustering was taken into account in calculating confidence intervals or p values; 19) results in absolute numbers were provided in sufficient detail; 20) prognostic variables by treatment group and any attempt to adjust for them were described; and 21) protocol deviations were reported. The ICC is defined as the ratio of the between-cluster component of variance to the total variance (the sum of the between-cluster and within-cluster variances). The design effect is estimated by the formula 1 + (m − 1) × ICC, where m is the mean cluster size; it signifies how many more individuals a cluster design requires, relative to a trial randomizing individuals, to achieve the same power.
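As a quick illustration of this arithmetic, the design effect and the resulting effective sample size can be computed in a few lines. This is a sketch with hypothetical numbers, not values drawn from any of the reviewed trials:

```python
def design_effect(m, icc):
    """Variance inflation factor for a cluster design:
    1 + (m - 1) * ICC, where m is the mean cluster size."""
    return 1 + (m - 1) * icc

def effective_sample_size(n_total, m, icc):
    """Number of individually randomized subjects that would give
    the same power as n_total cluster-randomized subjects."""
    return n_total / design_effect(m, icc)

# Hypothetical trial: 20 villages of 100 people each, ICC = 0.02
deff = design_effect(100, 0.02)                     # 1 + 99 * 0.02 = 2.98
n_eff = effective_sample_size(2000, 100, 0.02)
print(round(deff, 2), round(n_eff))                 # 2.98 671
```

Even a seemingly small ICC of 0.02 nearly triples the sample size needed, which is why ignoring clustering at the design stage leads to underpowered trials.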

For each of the qualitative criteria listed above that was met by less than 20 percent of the trial reports, we used Fisher's exact test to examine whether the situation was better in more recent trials (published in 1996 or later) than in earlier trials. We also used Spearman's correlation coefficients to evaluate whether sample size measures correlated with the year of publication.

Analyses were conducted by using SPSS software (SPSS Inc, Chicago, Illinois). All p values reported in this article are two-tailed.
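For readers without access to SPSS, the Fisher's exact comparison can be reproduced from first principles. The sketch below implements the standard two-sided test (summing all hypergeometric table probabilities no larger than the observed one) and applies it to the sample-size result reported in the Results section (clustering accounted for in 10 of 26 recent trials versus 0 of 25 earlier ones):

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for a 2x2 table [[a, b], [c, d]]:
    sum the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table."""
    n = a + b + c + d
    row1, col1 = a + b, a + c
    def p_table(x):  # probability of a table with x in the top-left cell
        return comb(row1, x) * comb(n - row1, col1 - x) / comb(n, col1)
    p_obs = p_table(a)
    lo, hi = max(0, col1 - (n - row1)), min(col1, row1)
    return sum(p_table(x) for x in range(lo, hi + 1)
               if p_table(x) <= p_obs * (1 + 1e-9))

# Clustering accounted for in sample size: 10/26 recent vs. 0/25 earlier trials
p = fisher_exact_two_sided(10, 16, 0, 25)
print(round(p, 4))  # 0.0007, in line with the reported p = 0.001
```

The small tolerance factor guards against floating-point ties; the same counts fed to a packaged implementation (e.g., scipy.stats.fisher_exact) give an equivalent result.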


    RESULTS
 
The searches yielded 83 potentially eligible reports. Of these, 29 were excluded because they were ineligible (14 publications pertaining to other, already included CRCTs; eight studies randomized at the level of the individual; two studies that simply used random sampling (without randomization); two pseudorandomized studies; two publications describing only baseline data collection; one case-control study nested within a CRCT), and three potentially eligible articles could not be retrieved in full text for further scrutiny. Thus, 51 eligible CRCTs were included in this analysis.

The number of CRCTs published per year increased over time. The earliest sub-Saharan CRCT identified was published in 1973, but 40 (78 percent) trials were published after 1990, and 25 (49 percent) were published in 1996 or later. The trials were conducted in 20 sub-Saharan African countries, including the Gambia (n = 9 trials), South Africa (n = 8), Tanzania (n = 6), Kenya (n = 5), Ghana (n = 3), Zimbabwe (n = 3), Zaire (n = 2), Uganda (n = 2), and Ethiopia (n = 2); another 11 countries contributed one trial each. Only one CRCT had been conducted in multiple countries. A large number of these trials were published in The Lancet (n = 11); other trials were published in Tropical Medicine & International Health (n = 6), Transactions of the Royal Society of Tropical Medicine and Hygiene (n = 5), Bulletin of the World Health Organization (n = 4), AIDS (n = 3), The American Journal of Tropical Medicine and Hygiene (n = 2), International Journal of Epidemiology (n = 2), Journal of the Dental Association of South Africa (n = 2), Social Science & Medicine (n = 2), and 14 other journals (one trial each).

Common subjects included malaria (18 trials (35 percent)) as well as sexually transmitted diseases, acquired immunodeficiency syndrome, and reproductive health (seven trials (14 percent)). Another 16 CRCTs focused on other infectious and parasitic diseases (diarrheal diseases (n = 4), trachoma (n = 4), hepatitis B (n = 2), intestinal helminths (n = 2), childhood-cluster diseases and immunization (n = 2), trichiasis (n = 1), and otitis media (n = 1)). Oral conditions and proper drug use were the focus of four and three trials, respectively, while nutrition, epilepsy, and antenatal care accounted for one trial each.

Most trials (n = 37 (73 percent)) had two arms, but nine trials had three arms and five had four to six arms. Thirty-four CRCTs (67 percent) focused predominantly on prevention and 12 on treatment, while the type of intervention in the remaining five studies was either a combination of the two (therapeutic interventions combined with health education activities) or focused on drug management (including stock management and rational drug use). Among preventive intervention trials, 13 focused on the use of insecticide-treated bed nets for preventing malaria.

As shown in table 1, only one trial was identified as a CRCT in the title, whereas several trials were identified as "community trials" or "community-based trials" without stating their specific cluster design. Most of the trials offered no justification for randomizing clusters rather than individuals. The rationale for choosing the cluster design was given in 11 studies; five studies reported logistics and/or administrative reasons, four used this design to avoid intercontamination of the randomized groups, and two more ascribed their choice of the design to some specific characteristics of the type of intervention. Most of the trials stated the inclusion/exclusion criteria at the level of the individual, whereas the planned intervention was aimed at the individual in almost half of the studies reviewed.


TABLE 1. Protocol characteristics of eligible cluster randomized controlled trials
 
All studies clearly reported their primary outcomes (table 1). Sample size considerations were often suboptimal; only a fifth of the trials provided a sample size justification that accounted for the cluster randomization (either in the Methods section or in a post hoc discussion of statistical power). Furthermore, only one study provided an estimate of the ICC, and even this information was given in a separate publication (focusing exclusively on the ICC estimation) rather than in the main publication. Three studies provided an estimate of the design effect. Despite relatively large numbers of individual subjects, the effective sample size based on the number of clusters was usually small. Only 11 trials had more than 50 clusters (table 1).

Table 2 reports on the assignment and masking features of the 51 eligible trials. All trials described the unit of randomization, many in sufficient detail. Various pairing and stratification methods and recruitment procedures were used. Although the types of clusters were quite diverse, they can be broadly classified into three main categories: villages and residential areas, schools or classes, and health care settings (mostly primary health clinics). Double blinding was uncommon, and allocation control was rarely described (table 2).


TABLE 2. Assignment and masking characteristics of eligible cluster randomized controlled trials
 
A diagram showing the flow of the participants (starting from randomization up to those remaining to contribute to the main outcomes) was rarely provided. Several analytical shortcomings were noted (table 3). In almost half of the studies, the analysis was performed at the level of the individual rather than the cluster. Almost two thirds of the trials did not take clustering into account in calculating confidence intervals or p values. The p values reported in these studies, calculated under the assumption of statistical independence, may be spurious.
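The magnitude of this problem is easy to demonstrate: inflating the variance of a standard two-proportion test by the design effect (a common simple correction) can turn a nominally significant result nonsignificant. The sketch below uses hypothetical numbers, not data from any reviewed trial:

```python
from math import sqrt, erf

def normal_p_two_sided(z):
    """Two-sided p value from the standard normal distribution."""
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

def two_proportion_z(p1, n1, p2, n2, deff=1.0):
    """Pooled two-sample z statistic for proportions; deff > 1 inflates
    the variance to account for cluster randomization."""
    p_pool = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2) * deff)
    return (p1 - p2) / se

# Hypothetical CRCT: event rates 8% vs. 5%, 1,000 subjects per arm,
# clusters of 50 with ICC = 0.02, so design effect = 1 + 49 * 0.02 = 1.98
p_naive = normal_p_two_sided(two_proportion_z(0.08, 1000, 0.05, 1000))
p_adjusted = normal_p_two_sided(two_proportion_z(0.08, 1000, 0.05, 1000, deff=1.98))
print(round(p_naive, 3), round(p_adjusted, 3))
# the naive p is below 0.05; the design-effect-adjusted p is not
```

Multiplying the standard error by the square root of the design effect (equivalently, dividing a chi-square statistic by the design effect) is the simplest of the available corrections; model-based approaches such as random-effects regression reach the same end.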


TABLE 3. Results and analysis of eligible cluster randomized controlled trials
 
Recognition of the need to provide results in absolute numbers was better, but often little consideration was given to potential confounders and adjustments thereof (table 3). Protocol deviations from the study as planned, together with the reasons for them, were rarely described (table 3).

Some improvement was documented in several parameters over time. Specifically, clustering had been accounted for in sample size calculations in 10 of 26 trials published in 1996 or later but in none of the earlier reports (p = 0.001); clustering was accounted for in the analysis in 13 of these 26 trials versus 6 of 25 earlier ones (p = 0.083). The only trials that reported an ICC and/or design effects were also recent. Less impressive, nonsignificant improvements were also seen in the reporting of rationale, description of allocation control, and presentation of absolute numbers in sufficient detail (not shown). A nonsignificant correlation was found between year of publication and sample size (r = 0.17, p = 0.23) or number of clusters (r = 0.12, p = 0.44).


    DISCUSSION
 
This empirical evaluation shows that the methodological issues associated with cluster designs in trials performed in sub-Saharan Africa are still not widely recognized. The prerequisite design and analysis aspects of cluster randomization were not taken into account and were not reported properly in the majority of the trials we reviewed. Only 20 percent of the studies considered clustering in sample size calculations or discussions of power, and in less than 40 percent was clustering taken into account in calculating confidence intervals or p values. The ICC and design effect were reported very rarely. Reporting of other methodological details that would be important regardless of whether individual or cluster randomization is used was also sometimes neglected. Despite a definite improvement in more recent trials, several CRCT reports in the recent literature have considerable deficiencies.

The results of our study are quite similar to those of previous empirical evaluations that targeted smaller sets of CRCTs in other medical domains, and they may even suggest a lower rate of use of appropriate statistical methods than estimated previously. In their review of 16 nontherapeutic intervention trials, Donner et al. (7) found that only three studies (19 percent) accounted for between-cluster variability in sample size or power calculations; however, eight of the 16 trials (50 percent) took the effect of clustering into account in the analysis. Simpson et al. (6) recorded a similar picture in an appraisal of 21 primary prevention trials. Finally, in a review of 24 trials of computer-based clinical decision support systems, Chuang et al. (8) found that only one study (4 percent) took clustering into account in calculating sample size, although 14 of the 24 trials (58 percent) used adequate statistical methods for analysis. With the exception of four trials in the Chuang et al. report, all other trials in these three empirical evaluations were published before 1996. Half of the trials in our evaluation were published in 1996–2001, and, despite some definite improvement in appreciating the implications of the cluster design in these trials compared with earlier ones, deficiencies were still common.

We also tried to record several other facets of the quality of reporting of CRCTs in sub-Saharan Africa, and several problems were detected. Inadequate reporting may be associated with either clear overestimation of the effects of interventions (14, 15) or unpredictable bias in the effect size (16). Moreover, the report of a trial is a proxy for the true trial quality (17, 18), although sometimes the actual quality of a trial’s design, conduct, and analysis may not be adequately reflected in the study report (19). For example, it is possible that trial investigators may give considerable thought to the rationale for using a cluster design, but this rationale may not be presented properly in the final report. Adopting a standardized checklist may facilitate adequate reporting of CRCTs. The CONSORT statement checklist has been adopted by most of the main medical journals and by many research teams. The modified CONSORT checklist that we used is a relatively simple tool that could assist researchers when reporting a CRCT.

Several methodological articles have been published addressing the specific issues of the design, conduct, and analysis of CRCTs (17, 18, 20–22). However, publications presenting essential, valuable information such as estimates of ICCs and design effects are still limited (23, 24). It is therefore recommended that authors include their ICC estimates in the main trial publications to help future investigators plan CRCTs.

The lack of appropriate statistical implementation of cluster designs in the past might also have been due in part to a lack of readily available software. Currently, however, analyses can be conducted easily by using the PROC MIXED procedure of the Statistical Analysis System (25). Specialized software such as ACLUSTER (26) has also become available, providing estimation of intracluster correlation coefficients, sample size calculations for cluster designs, and analytical methods for binary, continuous, and time-to-event outcomes.
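Even without specialized software, the simplest valid analysis treats cluster-level summaries as the unit of analysis, a standard fallback when clustering cannot be modeled directly. The sketch below compares hypothetical village-level event rates (invented for illustration) with an ordinary two-sample t test:

```python
from statistics import mean, stdev

def cluster_level_t(arm_a, arm_b):
    """Two-sample t statistic computed on cluster-level summaries
    (one outcome rate per cluster), the simplest analysis that
    respects the unit of randomization. Returns (t, df)."""
    na, nb = len(arm_a), len(arm_b)
    sp2 = ((na - 1) * stdev(arm_a) ** 2 +
           (nb - 1) * stdev(arm_b) ** 2) / (na + nb - 2)  # pooled variance
    t = (mean(arm_a) - mean(arm_b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5
    return t, na + nb - 2

# Hypothetical village-level event rates, 10 villages per arm
intervention = [0.04, 0.06, 0.05, 0.03, 0.07, 0.05, 0.04, 0.06, 0.05, 0.04]
control      = [0.08, 0.06, 0.09, 0.07, 0.10, 0.08, 0.07, 0.09, 0.06, 0.08]
t, df = cluster_level_t(intervention, control)
# compare |t| with the two-sided critical value t(0.975, 18) = 2.101
print(round(t, 2), df, abs(t) > 2.101)
```

A cluster-level analysis is statistically valid but discards within-cluster information; mixed models such as those fitted by PROC MIXED recover some of that efficiency when the number of clusters is adequate.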

Despite the relative inefficiency of cluster randomization compared with individual randomization (in terms of statistical power), future investigators should not be discouraged from using this design whenever it is indicated. For many interventions, there is no alternative to cluster randomization. Ethical and political concerns, the need to minimize the potential for intercontamination of the randomized groups, administrative problems, and a limited budget can force researchers to abandon a design that randomizes individuals. In developing countries, such as those of sub-Saharan Africa, where the burden of disease is large and research resources are limited (10), CRCTs are likely to be very useful for addressing a variety of important medical and public health questions. Careful design, conduct, and analysis as well as proper reporting of CRCTs could improve the quality of medical research and contribute toward more effective health care in developing countries.


    NOTES
 
Correspondence to Dr. John P. A. Ioannidis, Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina 45110, Greece (e-mail: jioannid@cc.uoi.gr).


    REFERENCES
 

  1. Donner A, Klar N. Statistical considerations in the design and analysis of community intervention trials. J Clin Epidemiol 1996;49:435–9.
  2. Elbourne D. Guidelines are needed for evaluations that use cluster approach. BMJ 1997;315:1620–1.
  3. Campbell MK, Grimshaw JM. Cluster randomized trials: time for improvement. BMJ 1998;317:1171–2.
  4. Donner A, Klar N. Methods for comparing event rates in intervention studies when the unit of allocation is a cluster. Am J Epidemiol 1994;140:279–89.
  5. Sashegyi AI, Brown KS, Farrell PJ. Application of a generalized random effects regression model for cluster-correlated longitudinal data to a school-based smoking prevention trial. Am J Epidemiol 2000;152:1192–200.
  6. Simpson JM, Klar N, Donner A. Accounting for cluster randomization: a review of primary prevention trials, 1990 through 1993. Am J Public Health 1995;85:1378–83.
  7. Donner A, Brown KS, Brasher P. A methodological review of non-therapeutic intervention trials employing cluster randomization, 1979–1989. Int J Epidemiol 1990;19:795–800.
  8. Chuang JH, Hripcsak G, Jenders RA. Considering clustering: a methodological review of clinical decision support system studies. Proc AMIA Symp 2000:146–50.
  9. Murray CJL, Lopez AD. The global burden of disease: a comprehensive assessment of mortality and disability from diseases, injuries, and risk factors in 1990 and projected to 2020. Cambridge, MA: Harvard School of Public Health, Harvard University Press on behalf of the World Health Organization and The World Bank, 1996.
  10. Isaakidis P, Swingler GH, Pienaar E, et al. Relation between burden of disease and randomized evidence in sub-Saharan Africa: survey of research. BMJ 2002;324:702–5.
  11. Moher D, Schulz KF, Altman DG. The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomised trials. Lancet 2001;357:1191–4.
  12. Begg C, Cho M, Eastwood S, et al. Improving the quality of reporting of randomized controlled trials. The CONSORT statement. JAMA 1996;276:637–9.
  13. Elbourne DR, Campbell MK. Extending the CONSORT statement to cluster randomized trials: for discussion. Stat Med 2001;20:489–96.
  14. Schulz KF, Chalmers I, Hayes RJ, et al. Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA 1995;273:408–12.
  15. Juni P, Altman DG, Egger M. Systematic reviews in health care: assessing the quality of controlled clinical trials. BMJ 2001;323:42–6.
  16. Balk EM, Bonis PA, Moskowitz H, et al. Correlation of quality measures with estimates of treatment effect in meta-analyses of randomized controlled trials. JAMA 2002;287:2973–82.
  17. Alexander F, Roberts MM, Lutz W, et al. Randomization by cluster and the problem of social class bias. J Epidemiol Community Health 1989;43:29–36.
  18. Donner A, Klar N. Cluster randomization in epidemiology: theory and application. J Stat Plann Infer 1994;42:37–56.
  19. Ioannidis JP, Lau J. Can quality of clinical trials and meta-analyses be quantified? Lancet 1998;352:590–1.
  20. Ukoumunne OC, Gulliford MC, Chinn S, et al. Methods for evaluating area-wide and organisation-based interventions in health and health care: a systematic review. Health Technol Assess 1999;3:iii–92.
  21. Hayes RJ, Bennet S. Simple sample size calculation for cluster-randomized trials. Int J Epidemiol 1999;28:319–26.
  22. Kerry SM, Bland JM. The intracluster correlation coefficient in cluster randomization. BMJ 1998;316:1455–60.
  23. Smeeth L, Siu-Woon Ng E. Intraclass correlation coefficients for cluster randomized trials in primary care: data from the MRC Trial of the Assessment and Management of Older People in the Community. Control Clin Trials 2002;23:409–21.
  24. Reading R, Harvey I, McLean M. Cluster randomized trials in maternal and child health: implications for power and sample size. Arch Dis Child 2000;82:79–83.
  25. Littell RC, Milliken GA, Stroup WW, et al. SAS system for mixed models. Cary, NC: SAS Institute, Inc, 1996.
  26. ACLUSTER software. Oxford, United Kingdom: Update Software Ltd. (www.update-software.com/Acluster).