a Medical Statistics Unit, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK.
b ICRF Medical Statistics Group, Centre for Statistics in Medicine, Institute of Health Sciences, Old Road, Headington, Oxford OX3 7LF, UK.
c MRC Biostatistics Unit, Institute of Public Health, Robinson Way, Cambridge CB2 2SR, UK.
d Oral Health and Development, University Dental Hospital of Manchester, Higher Cambridge Street, Manchester M15 6FH, UK.
e R & D Academic Affairs Directorate, Clinical Sciences Building, Hope Hospital, Stott Lane, Salford M6 8HD, UK.
Prof. Diana Elbourne, Medical Statistics Unit, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK. E-mail: diana.elbourne{at}lshtm.ac.uk
Abstract
Background Meta-analysis of randomized controlled trials (RCTs) is usually based on trials where patients are randomized individually into two different, parallel, treatment groups. This paper concentrates on RCTs of a different design: two-period, two-treatment cross-over trials.
Methods The characteristics of these trials are outlined, with detailed examples of methods for analysis for both continuous and binary data. These case studies are then extended into the context of a meta-analysis. The Cochrane Library was surveyed to assess current practice for synthesis.
Results Methods are described for continuous and binary data for use both when the necessary paired data are given and also when they need to be calculated or imputed, and some suggestions are provided to help people wishing to synthesize data from cross-over trials into meta-analyses. The survey suggested that about 8% of the trials in the Cochrane Library were cross-over trials and 18% of the reviews referred to such trials, although there was no consistent approach to their inclusion into the reviews.
Conclusions Methods do exist for including valuable information from two-period, two-treatment cross-over trials into quantitative reviews. However, poor reporting of cross-over trials will often impede attempts to perform a meta-analysis using the available methods.
Keywords Cross-over trials, meta-analysis
Accepted 18 October 2001
Meta-analysis is a statistical approach to the synthesis of quantitative data. It has been particularly used in the context of systematic reviews of randomized controlled trials (RCTs). The design of these RCTs is usually based on patients being randomized individually into two different, parallel, treatment groups. While the majority of RCTs are of this design, a substantial minority are not. A challenge for meta-analysts is to find ways to incorporate RCTs of different designs into the synthesis, and to establish whether this is appropriate.
This paper concentrates on cross-over trials, and considers issues of data synthesis in principle and in practice. We outline the characteristics of cross-over trials, give examples illustrating particular issues related to synthesizing them in meta-analyses, describe current practice in reviews published in the Cochrane Library, and conclude with general comments to help people wishing to synthesize data from cross-over trials into meta-analyses. Throughout the paper we consider only the case where two treatments are compared using the two-period, two-treatment cross-over design (see below).
Cross-over Trials
In contrast to a parallel group trial, each individual in a cross-over trial receives two or more treatments but in a random order, i.e. it is the sequence which is randomized. In this way, each patient acts as his or her own control. This pre-specified design should not be confused with trials in which some individuals cross over through non-compliance or use of rescue medication, or in which all participants in the control group are given the chance to cross over to the experimental treatment at the end of the main trial.
The particular strength of the cross-over design is that treatments (interventions) are evaluated on the same patients, allowing comparison at the individual rather than group level. Also, as patients receive multiple treatments they can express preferences for or against particular treatments.1
Because each patient receives both interventions, cross-over trials usually require no more than half the number of patients to produce the same precision as a parallel group trial. Furthermore, variation in repeated responses within a patient is usually less than that between different patients. This means cross-over trials can give more precise results than parallel group trials even when the cross-over trial involves half as many patients. When within-patient variation is smaller than between-patient variation there is said to be correlation between patients' responses to the different treatments. To exploit this correlation, cross-over trials should be analysed using a method of analysis specific to paired data (such as a paired t test or the McNemar test). In addition, the analysis may examine the possibility of order, period, or period by treatment interaction (carryover) effects.2,3 We do not consider these aspects in detail in this paper.
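The sample size argument can be made concrete under a simple model. The model is an assumption made here for illustration, not part of the paper: responses have a common variance \sigma^2 under either treatment, the two responses from the same patient have correlation \rho, and there are no period or carryover effects. N denotes the total number of patients in a parallel group trial and n the number in a cross-over trial.

```latex
% Parallel group trial, N patients in total, N/2 per arm:
\operatorname{Var}(\bar{x}_A - \bar{x}_P)
  = \frac{\sigma^2}{N/2} + \frac{\sigma^2}{N/2} = \frac{4\sigma^2}{N}

% Cross-over trial, n patients each contributing a paired difference d_i:
\operatorname{Var}(\bar{d})
  = \frac{\sigma^2 + \sigma^2 - 2\rho\sigma^2}{n} = \frac{2\sigma^2(1-\rho)}{n}

% The two designs are equally precise when
\frac{2\sigma^2(1-\rho)}{n} = \frac{4\sigma^2}{N}
  \quad\Longleftrightarrow\quad n = \frac{N(1-\rho)}{2}
```

Under these assumptions a cross-over trial needs half as many patients when \rho = 0, and fewer still when \rho is positive.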
Cross-over trials are widely used in some branches of medicine, such as clinical pharmacology, or paediatrics4 yet almost unknown in others, such as cancer, or schizophrenia.5 In principle, cross-over trials are most appropriate for symptomatic treatment of conditions or diseases that are chronic or relatively stable (such as multiple sclerosis or rheumatoid arthritis), at least over the time period under study, and when the treatment effects are likely to be reversible, and short-lived.
Carryover effect
A particular concern with the cross-over design is the risk of a carryover effect. Carryover occurs when the treatment given in the first period has an effect that carries over into the second period. In general it refers to any effect of an intervention (perhaps pharmacological, physiological, or psychological) that influences the patient's response in the subsequent intervention period.
For example, in a placebo-controlled cross-over trial, patients taking placebo followed by active treatment may be unaffected by carryover, while patients taking active treatment followed by placebo may have exaggerated responses during the later, placebo period. The carryover effect reduces the apparent effectiveness of the treatment for the patients in the second group, and thus for all patients combined. Other forms of carryover are more difficult to foresee and could modify the effect of treatment in either direction. A washout period between treatment periods may reduce the risk of a carryover effect.
A previously recommended method of analysing cross-over trials was to test for carryover, and if this was significant to discard the data from the second period and analyse only the data from the first period as if from a parallel group trial.6 Although it is highly desirable that authors report the results from each treatment in each period separately, Freeman7 showed that this strategy is seriously flawed, and leads to biased answers, as is generally the case when the choice between two analyses depends on the result of a preliminary hypothesis test. Senn2 and others have argued that the use of the cross-over design is effectively built on the assumption that there is minimal carryover of the effect of a treatment into the next period. Following this philosophy, rather than testing for carryover, one should therefore proceed as if there were none.
Inappropriate use of the cross-over design
Cross-over trials are inappropriate when the treatment may alter the condition of interest to such an extent that, on entry to subsequent phases, patients systematically differ from their initial state. An extreme example of this is where treatments provide cure or when patients might die. The design is quite commonly used, however, in inappropriate circumstances. For example, pregnancy is an intended outcome of sub-fertility treatment. If a woman becomes pregnant in the first period of the trial (i.e. before cross-over), she will be precluded from entry into subsequent phases of the trial, making intra-subject comparison impossible in the latter period. The effect of simply pooling the results from different periods is to exaggerate the benefits of the more successful treatment.8 Nevertheless the design is defended in the field9 and remains common despite criticism.10
Illustrative Cross-over Trials
In this section we discuss possible analyses of two illustrative cross-over trials, first when the data are continuous and second when they are binary.
Additional methods for analysing both binary data and continuous data from a cross-over trial exist, including methods that address period effects and carryover.2 In our analyses we assume that neither is present.
A cross-over trial with continuous data
A cross-over trial reported by Johnson and colleagues11 investigated the opiate dihydrocodeine to reduce the symptom of breathlessness, a common complaint of patients with chronic airflow obstruction. The 19 patients were assigned a regimen of one week of dihydrocodeine and one week of a matching placebo in a random order. No washout period was built into the trial since the effect of opiates on breathlessness is believed to be short acting. At the end of each week breathlessness was measured using a 10 cm visual analogue scale (VAS) ranging from 'not breathless' to 'extremely breathless'. Measurements are treated as continuous outcome data in units of centimetres. Since one patient was withdrawn from the trial due to medical complications, data are reported for 18 patients only.
Breathlessness results appear in three formats in the paper (Table 1). These illustrate the range of levels of detail that might be encountered in the report of a cross-over trial, and directly relate to the ability to extract suitable data for a meta-analysis. We draw on the three formats to illustrate correct and incorrect analyses of continuous data from a simple cross-over trial. Formulae are provided in the Appendix.
In the second format, a table gives the mean (SD) for assessments after active treatment and the mean (SD) for assessments after placebo. This common approach to reporting results from a cross-over trial treats the data as if they arose from a parallel group trial, that is as if the active and placebo treatments were taken by different groups of patients. For illustration, we perform a two-sample t test, the analysis that would be appropriate for such a parallel group trial. The mean difference is −0.9 with a t statistic (this time with 16 degrees of freedom) of −1.28. This corresponds to a two-sided P-value of 0.22 and a 95% CI from −2.39 to 0.59. Thus, without taking into account the cross-over nature of the trial there is inadequate evidence of a beneficial effect of dihydrocodeine. Means and SDs presented separately for each treatment are insufficient for a proper analysis of a cross-over trial. It is important to take into account the within-person differences. This can be achieved only through a paired analysis or with knowledge of the correlation between active and placebo outcomes measured on the same patient. Using results from the groups separately in conjunction with the paired data in the graph, we can estimate the correlation between active and placebo treatment period VAS scores to be 0.91 (Appendix). The substantial difference between results of the paired and unpaired t tests is due to the high correlation.
In the third format, a reported t or P-value can be used to retrieve the SDs of paired observations. The success of this procedure will depend upon how accurately the value is given. In the present case, Johnson et al.11 reported the result of a paired t test through the presentation of a bounded P-value: P < 0.001. Bounded P-values are a popular means of presenting statistical test results. Upper bounds may be used to yield conservative approximations to the true result. Lower bounds (such as P > 0.2) are not useful. A conservative approximation to the paired t test described above may be produced by taking P = 0.001, for which the corresponding t statistic (with 17 degrees of freedom) is 3.97. Taking the mean difference as −0.9, the SD of the within-patient differences can be calculated to be 0.963 (Appendix). A 95% CI for the mean difference is −1.38 to −0.42. In this particular example, the assumption of P = 0.001 provides a considerable improvement on the parallel group analysis, and yields a result that is only slightly conservative compared to the correct paired t test (as judged by the width of the CI).
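The back-calculation from a bounded P-value can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name is hypothetical, the mean difference is entered as −0.9 on the assumption that it is measured as active minus placebo, and the critical value 2.110 for 17 degrees of freedom is taken from standard t tables.

```python
import math

def sd_from_paired_t(mean_diff, t_stat, n):
    """Recover the SE of the mean difference and the SD of the
    within-patient differences from a paired t statistic."""
    se = abs(mean_diff) / abs(t_stat)   # t = mean_diff / se
    sd = se * math.sqrt(n)              # se = sd / sqrt(n)
    return se, sd

n = 18               # patients analysed
mean_diff = -0.9     # assumed sign: active minus placebo, in cm
t_stat = 3.97        # t for P = 0.001 with 17 df, as quoted in the text
T_CRIT_17 = 2.110    # two-sided 5% point of t with 17 df (table value)

se, sd = sd_from_paired_t(mean_diff, t_stat, n)
ci_low = mean_diff - T_CRIT_17 * se
ci_high = mean_diff + T_CRIT_17 * se
print(f"SE = {se:.3f}, SD = {sd:.3f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

With the rounded t statistic of 3.97 the recovered SD differs from the quoted 0.963 only in the third decimal place, and the interval agrees with the text to two decimals.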
A method to estimate trial variance and between-period correlation in cross-over trials is described by Follmann et al.12
A cross-over trial with binary data
A multicentre, cross-over trial reported by Anderson and colleagues,13 and later by Brunelle and colleagues14 compared regular insulin with a variation on the insulin molecule, insulin lispro, in patients with Type I diabetes. Patients were randomized to different orderings of the two treatments for 3 months each. No carryover effects were identified. A binary outcome is whether or not each patient experienced at least one episode of severe hypoglycaemia. Data are available for 977 patients (Table 2). Such a breakdown of paired data is uncommon in reports of cross-over trials with binary outcome data. We use them to illustrate some methods of analysis and to demonstrate how data reported in less detail might be used in preparation for a meta-analysis.
It is also possible to calculate the difference between the proportions with the event of interest, and an associated CI.15 Here we would compare the proportions 24/977 (2.5%) and 33/977 (3.4%) that experienced severe hypoglycaemia on insulin lispro and regular insulin, respectively. The method takes proper account of the fact that these estimates are based on paired observations from the same individuals. The estimated risk difference is 0.9% (95% CI : −0.4%, +2.3%). Relative effects, such as risk ratio or odds ratio (OR), tend to be more consistent across studies. We therefore do not consider the risk difference further in this paper, although in principle the same ideas can be extended to meta-analysis with risk differences.
The Mantel-Haenszel OR for paired outcomes is calculated as the ratio of discordant pairs.16 In our example it gives an OR of 17/26 = 0.65 (95% CI : 0.33, 1.25). The Mantel-Haenszel method is based on the conditional probabilities of success and thus the magnitude of this OR varies with the strength of the correlation between paired outcomes.17 This feature is an issue when conditional OR from different trials with different correlations are to be combined.
An analysis that ignores the cross-over design would be based on the information that 24/977 patients experienced severe hypoglycaemia on insulin lispro and 33/977 experienced hypoglycaemia on regular insulin. Standard techniques for evaluating treatment effects in trials with binary data can then be used. For example, the OR is (24 × 944)/(953 × 33) = 0.72 (95% CI : 0.42, 1.23). An OR specific to two-treatment, two-period cross-over trials has been developed by Becker and Balagtas.18 We describe this in the Appendix. Evaluating the OR for these data gives the same estimate of 0.72 but with a narrower CI (0.45, 1.15). A binary correlation coefficient estimated from the data is 0.23.
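The figures quoted for the insulin trial can be reproduced with a short sketch. Two points are assumptions rather than statements from the paper. First, the paired cell counts are inferred from the totals (24 and 33 of 977) and the discordant pairs (17 and 26) given in the text. Second, the paired variance uses a delta-method formula for the marginal log OR; it reproduces the quoted interval, but whether it matches the Becker and Balagtas formula term for term is not confirmed here.

```python
import math

# Paired 2x2 counts, inferred from the text (assumption, see lead-in).
both = 7            # severe hypoglycaemia on both treatments
lispro_only = 17    # event on insulin lispro only
regular_only = 26   # event on regular insulin only
neither = 927       # no event on either treatment
n = both + lispro_only + regular_only + neither   # 977 patients

lispro_events = both + lispro_only      # 24
regular_events = both + regular_only    # 33

# Conditional (Mantel-Haenszel) OR: ratio of discordant pairs.
or_conditional = lispro_only / regular_only

# Marginal OR, identical in the naive and paired analyses.
or_marginal = (lispro_events * (n - regular_events)) / (
    (n - lispro_events) * regular_events)

# Naive variance of ln(OR), treating the periods as independent groups.
var_unpaired = (1 / lispro_events + 1 / (n - lispro_events)
                + 1 / regular_events + 1 / (n - regular_events))

# Binary (phi) correlation between the paired outcomes.
cross = both * neither - lispro_only * regular_only
denom = (lispro_events * (n - lispro_events)
         * regular_events * (n - regular_events))
phi = cross / math.sqrt(denom)

# Delta-method variance allowing for the pairing (assumed form).
var_paired = var_unpaired - 2 * n * cross / denom

def wald_ci(odds_ratio, var_ln_or, z=1.96):
    """95% Wald interval for an OR from the variance of its logarithm."""
    half = z * math.sqrt(var_ln_or)
    centre = math.log(odds_ratio)
    return math.exp(centre - half), math.exp(centre + half)

ci_unpaired = wald_ci(or_marginal, var_unpaired)
ci_paired = wald_ci(or_marginal, var_paired)
print(round(or_conditional, 2), round(or_marginal, 2), round(phi, 2))
print([round(x, 2) for x in ci_unpaired], [round(x, 2) for x in ci_paired])
```

The point estimate is unchanged by the pairing; only the variance shrinks, by an amount that grows with the correlation between the paired outcomes.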
Cross-over Trials in Systematic Reviews
Having illustrated examples of individual cross-over trials, we consider in the rest of the paper how such studies can be incorporated into systematic reviews. We start from the premise that it is entirely reasonable to combine them within a review if (1) cross-over and parallel trials are estimating the same treatment effect (ignoring the possible presence of a carryover effect), and (2) the choice of trial design has not been dictated by any differences in therapeutic indication or clinical conditions which could potentially influence the observed treatment effect and so preclude the combination of these different types of trials.
As in any meta-analysis, regardless of the design, we may exclude individual trials on design quality grounds. For example, this would apply to cross-over trials when the design was inappropriate (as discussed above). Many of the issues considered below apply not just to situations in which the review contains both cross-over and parallel group trials, but also when all the trials are cross-over trials.
If it is felt appropriate, in principle, to include cross-over trials in a review, there are a number of strategies to consider. The simplest option is to adopt a qualitative approach, and to describe them separately, without including them formally in quantitative synthesis or meta-analysis. This is preferable to ignoring the cross-over trials altogether, but does not allow the calculation of a bottom-line answer to the question addressed by the review. Second, results of two or more cross-over trials might be combined, but with this pooled result kept separate from the data from parallel group trials. Finally, in situations in which the review contains both cross-over and parallel group trials, we could combine results from trials of both designs.
In the next section we consider approaches that can be adopted to include cross-over trials into quantitative meta-analyses in either of the cases just described. We also consider the practical difficulties in trying to implement these methods. A key issue is whether the data are available in the desired form.
Methods of meta-analysis including cross-over trials
Suppose that we wish to combine the results from one or more cross-over trials, or to combine results from cross-over trials with results from parallel trials. We adopt a general approach to meta-analysis in which an estimate of treatment effect and its standard error is required from each study.19 Estimates of treatment effect from the cross-over trials can be obtained in several ways, as shown earlier.
Ideally, cross-over trials should be included in meta-analyses using the results from paired analyses. Crucially, such analyses require information regarding the within-individual comparison of treatments. The ideal situation is for estimates of the treatment effect with appropriate standard errors to be available from the trial reports or through correspondence with the trialists. If these data are not directly available they may be calculated or imputed (as noted above). The most straightforward methods of analysis are when data are continuous,20 but methods also exist if data are binary,17 and/or there is carryover, although this may exist but not be detected.21 Some methods are shown below, extending the earlier illustrative examples for continuous and binary data.
A second approach is to include the trial but using the data from the first period only. The logic here is that in a randomized cross-over trial the first period is, in effect, a parallel group trial. Discarding the second period allows all trials to be considered on the same footing. However, this approach is possible only in the unusual situation when the data are available in this form. Even when the data are available there are disadvantages to this approach. Excluding later periods loses some of the information collected. More importantly, if the data are available, they are likely to represent a biased subset of trials, usually because authors have found evidence of carryover. For example, a meta-analysis of fish oil in rheumatoid arthritis included eight parallel group and two cross-over trials. As stated in the paper, one of the cross-over trials was analysed as a parallel group study, including data from the first arm of the trial when it was found that the carryover effect exceeded the washout period.22 We advocate the use of first period data alone only when the cross-over design is considered inappropriate for the condition or outcome being investigated.
A third option is simply to ignore the cross-over design of the trials. This implies treating the results from the first period as if they came from one group of patients and results from the second period as if they came from a different group of patients. This approach is not usually to be recommended. At best, it is conservative as it ignores the within-patient correlation and so does not make use of the design advantages of a cross-over trial. More importantly, this approach ignores the fact that the same patients appear in both arms of the study and so they are not independent of each other, as required in the standard statistical methods.
Extended Case Studies
A meta-analysis of cross-over trials with continuous data
A Cochrane review has sought randomized trials of opioids versus placebo for the palliation of breathlessness in terminal illness.23 Eighteen trials were identified, all of which had a simple cross-over design. Twelve trials provide information on suitable measures of breathlessness. It was considered appropriate to combine the results from these 12 trials in a meta-analysis.
The trials had used a variety of instruments for measuring breathlessness, including different visual analogue scales and different rating scales. This was overcome by standardizing each mean difference by dividing by a pooled SD for the outcome in that trial. The reviewers faced two methodological problems related to the cross-over design of the trials. First, results of paired analyses were not always reported, or were reported in different ways. Second, some studies addressed a change-from-baseline measurement within each treatment period, and others focused on the post-treatment measurement from each period, as in the trial by Johnson et al.11 above.
Of the 12 eligible trials, four provided individual patient data (measurements for each treatment period for each patient) either in the publication or through provision of raw data by the original investigators. From these data the desired mean differences and their standard errors could be obtained (as in column A of Table 1). Further, correlations between post-treatment outcomes on treatment and placebo were calculated, yielding values of 0.89, 0.84, 0.68 and 0.49. The trial by Johnson et al.,11 described above, reported within-patient differences in post-treatment measures and mean (SD) for the treatments separately. These enabled estimation of the desired mean difference, its standard error and a correlation of 0.91 (using equation B of the Appendix). A further four trials provided mean outcomes for the treatment and placebo periods separately (as in column B of Table 1) as well as results of a paired t test (as in column C of Table 1). These enabled the reviewers to approximate the standard error of the mean difference, and to estimate correlations of 0.79, 0.80 and 0.71.
The remaining three trials did not report any paired results, though they did provide means and SD for treatment-specific outcomes (column B of Table 1). Paired analyses from these trials could only be approximated by assuming a certain degree of correlation between treatment and placebo outcomes (equation A of the Appendix). The reviewers used a value of 0.68, the lowest observed correlation among the other studies (not including the trial with a correlation of 0.49, which measured outcomes using a different technique). Sensitivity analysis was performed to assess the impact of the assumed correlation on the outcome of the meta-analysis by repeating the analysis assuming zero correlation (equivalent to a parallel group analysis of the results).
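The imputation and its sensitivity analysis can be sketched as follows. The sketch assumes that equation A expresses the variance of the within-patient differences as the sum of the two treatment-specific variances minus twice the covariance. The function name and the SDs and sample size are hypothetical and are not taken from any of the trials.

```python
import math

def imputed_paired_se(sd_active, sd_placebo, n, r):
    """SE of the mean within-patient difference when only
    treatment-specific SDs are reported and a correlation r is assumed."""
    var_diff = sd_active**2 + sd_placebo**2 - 2 * r * sd_active * sd_placebo
    return math.sqrt(var_diff / n)

# Hypothetical trial summary (illustrative values only).
sd_active, sd_placebo, n = 2.0, 2.2, 18

se_assumed = imputed_paired_se(sd_active, sd_placebo, n, r=0.68)  # main analysis
se_zero = imputed_paired_se(sd_active, sd_placebo, n, r=0.0)      # sensitivity analysis
print(f"r = 0.68: SE = {se_assumed:.3f};  r = 0: SE = {se_zero:.3f}")
```

Setting r to zero returns the standard error that a parallel group analysis of the same summary data would give, which is why it serves as the conservative end of the sensitivity analysis.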
Change-from-baseline outcome measures are advantageous in many parallel group trials. Since the correlation coefficient between baseline and post-treatment outcomes will often be larger than 0.5, between-patient variation can be substantially reduced. However, the use of change-from-baseline measures in cross-over trials raises interesting questions. The correlation coefficient between changes from baseline from the two treatment periods is likely to be very low, and the variance of treatment effect estimates may actually be increased. A possible benefit of using change-from-baseline in a cross-over trial is when there is substantial variation across treatment periods but not within them. However, if the variation across periods is in a consistent direction, for example when the condition is degenerative, then a cross-over design is not generally deemed to be a suitable choice.
Three trials from the breathlessness meta-analysis enabled estimation of the correlation of change-from-baseline measures between active and placebo treatment periods. The values were 0.38, 0.13 and 0.29, suggesting that the cross-over design did not add much precision to results (and in one trial was detrimental to the precision). The reviewers therefore decided to assume a zero correlation between change-from-baseline outcome measures when paired analyses were not available.
Results from the 12 trials are shown in Table 3. The mean difference is the (unstandardized) result for breathlessness. The unpaired standard error arises from an analysis assuming a parallel group design, whereas the paired analysis is the appropriate analysis for the trial. Figure 1 presents meta-analyses of the 12 trials to illustrate the impact of performing the correct, paired analyses. Standardized mean differences were used in the meta-analyses and change-from-baseline results were used in preference to post-treatment measures. When change-from-baseline results were used, the pooled SD for changes from baseline were used to standardize the mean difference; otherwise pooled SDs of post-treatment measures were used. Figure 1(a) is a meta-analysis using unpaired analyses from each trial, and Figure 1(b) is a meta-analysis in which paired analyses have been performed (or approximated). Several of the CI are substantially narrowed in the paired analysis. However, the pooled result is similar to that obtained from the (inappropriate) unpaired analysis.
To calculate a pooled estimate, a similar metric should be used for each trial. The classic maximum likelihood OR estimate obtained in parallel trials should be combined with the marginal OR calculated in the cross-over trials according to Becker and Balagtas.18 The computation of the pooled logarithm of the OR (lnOR) is based on the usual weighted average of trial lnOR where weights are the inverse of the lnOR variance.17,24 If the cross-over trials have different between-period correlations, conditional OR should not be combined as the results may be biased.17
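The weighted average described here is the standard fixed-effect, inverse-variance estimator. A minimal sketch follows; the function name is hypothetical and the three trial results are invented for illustration, not taken from Table 4.

```python
import math

def pool_inverse_variance(estimates, variances):
    """Fixed-effect pooled estimate: weights are inverse variances."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical trial-level ln(OR) values and variances (illustrative only).
ln_ors = [math.log(0.72), math.log(0.90), math.log(0.60)]
variances = [0.0575, 0.0400, 0.1000]

pooled_ln_or, pooled_se = pool_inverse_variance(ln_ors, variances)
pooled_or = math.exp(pooled_ln_or)
ci = (math.exp(pooled_ln_or - 1.96 * pooled_se),
      math.exp(pooled_ln_or + 1.96 * pooled_se))
print(f"pooled OR = {pooled_or:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

For a cross-over trial, the variance supplied to such a function should come from the paired analysis; supplying the unpaired variance would give that trial too little weight.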
Our analysis in Table 4 focuses on ORs rather than risk ratios. Using the technique of Becker and Balagtas18 (Appendix) we calculate ORs for the cross-over trials that account for the paired design of the studies. We present binary correlation coefficients in the last column. While ORs themselves are identical in the naïve and paired analyses, the CI from the paired analyses are narrower. The difference is not substantial in this example since the correlations do not greatly exceed zero.
A meta-analysis may be undertaken of the log OR, with the variances, and hence weights, being based on the CI from the paired analyses.17 Figure 2 shows separate meta-analysis of the parallel and cross-over trials.
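When only the OR and its CI from a paired analysis are available, the standard error of the log OR can be recovered from the width of the interval on the log scale. The sketch below assumes the interval is a symmetric 95% Wald interval on that scale. The function name is hypothetical, and the interval used is the paired CI quoted earlier for the insulin trial.

```python
import math

def ln_or_se_from_ci(lower, upper, z=1.96):
    """SE of ln(OR) from a CI assumed symmetric on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

se_ln_or = ln_or_se_from_ci(0.45, 1.15)   # paired CI for the insulin trial
weight = 1.0 / se_ln_or**2                # inverse-variance weight
print(f"SE of ln(OR) = {se_ln_or:.3f}, weight = {weight:.1f}")
```

Because the published limits are rounded to two decimals, the recovered standard error is approximate.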
Current Practice
Given the existence of the methods described above, we investigated the extent to which they are used in current practice, concentrating on the Cochrane Library.25 The Cochrane Controlled Trials Register in the January 2001 issue of the Cochrane Library was searched for cross-over or crossover. Eight per cent (24 710 out of 294 369) of trials contained these terms in the title or abstract. A search of the Cochrane Database of Systematic Reviews (Issue 1, 2001) revealed that 315 out of 1000 complete reviews contained the free-text terms cross-over or crossover. We identified 184 (18%) that referred to cross-over trials.
Eleven of these (6%) specifically excluded cross-over trials from consideration, only one of which specified a reason that the design was inappropriate. The authors of this review on prophylactic antibiotics for cystic fibrosis felt that a cross-over design would mask the effects of the prophylaxis on long-term outcomes such as lung function, nutrition and acquisition of resistant organisms.26 A further 21 (11%) reviews excluded cross-over trials from analysis, but considered their results separately in the text of the review. The two most common policies are seeking data from the first period of the trial only (95 [52%]) and including data from both periods as though a parallel group design had been used (56 [30%]). Of those seeking first period data, when data were unavailable 54 excluded trials, 13 included the data from both periods, and 28 gave no clear policy. Only one review (1%) incorporated the paired data into the meta-analysis. This was achieved by devising a common SD that reproduced the correct CI in the statistical package, Metaview, which was not designed to handle the data from cross-over trials.27 Thus the absence of paired analyses found in the other Cochrane reviews may not reflect the analyses that the reviewers would have liked to perform. Similar comments apply to other meta-analysis software.
It is clear from the above investigation that although methods exist, there was usually no clear policy in the methods section of reviews to indicate how data from cross-over trials were to be used in the review. Nor is there a consistent approach among the various Cochrane review groups.
Comments
In this paper, we have described methods for the synthesis of data from cross-over trials into meta-analyses, and provided illustrative examples of practice. We suggest that before adopting the approaches we have demonstrated for both continuous and binary data, meta-analysts may first wish to consider the following questions:
Is a meta-analysis on this topic justified in principle? In particular, are all the trials addressing a similar enough question in terms of populations, interventions and outcomes?
Are the individual trials of adequate quality to be considered for inclusion? The standard quality criteria which exist,28 also apply to cross-over trials. There may, however, also be issues about the quality of cross-over trials specifically. These might include an assessment of whether a cross-over design was inappropriate for the meta-analytic question, or whether there was a substantial possibility of carryover. Criteria for including or excluding such trials and/or conducting sensitivity analyses should be detailed in the methods section of the protocol for the systematic review.
Are data available in a suitable form? Ideally there should be paired data from each patient. These should be in the form of the mean and SD (or SE or CI) of within-patient paired differences for continuous outcomes, and the 2 x 2 table of paired responses on the two interventions for binary outcomes. Even if these data are not available in the published paper, they can often be elicited from authors, or the missing data may be calculated or imputed as described above.
Is it appropriate to combine results if there is statistical heterogeneity between the trials (or between parallel group and cross-over trials)? As above, the criteria for synthesis and/or conducting sensitivity analyses should be detailed in the methods section of the protocol for the systematic review.
How should the data for the review be reported? In order to allow readers of the review to be able to understand or replicate the results, if possible summaries of paired raw data (such as in Table 5, column A; or Table 6), and basic summary data (such as in Table 5, column B; or Table 7) as well as the results of paired analyses should be provided.
Although it is unlikely that there will be sufficient information in all reports of cross-over trials to apply any one synthesis method consistently, we have shown that methods do exist for including valuable information from two-period, two-treatment cross-over trials into quantitative reviews.
Many of the issues for cross-over trials apply also to other designs. Some, such as randomized within-subject trials, have strong similarities to cross-over trials and are very common in specific fields, in particular dentistry29 and ophthalmology.30 For instance, a recent Cochrane systematic review comparing two different types of periodontal surgery includes a meta-analysis of a continuous outcome for 10 trials.31 Four of these trials employed a standard parallel group design, but four had a split-mouth design where pairs of sites within a patient's mouth were randomly allocated to the two treatment groups.
The inclusion into reviews of RCTs of different designs such as factorial or cluster randomized trials,32 or of non-randomized studies, or of qualitative data,33 will provide further challenges for meta-analysts in the future.
KEY MESSAGES
Appendix: Formulae
Continuous data
Suppose a two-treatment, two-period cross-over trial involves n individuals. In this Appendix, as in the rest of the paper, we assume that neither period effects nor carryover pose a problem, and describe paired analyses that ignore the ordering of treatments. Let the outcomes measured on individual i be $x_{iA}$ from the active treatment period and $x_{iP}$ from the placebo treatment period. The paired analysis focuses on the differences, $d_i = x_{iA} - x_{iP}$. The fundamental result of the trial is the value of $\bar{d}$ and its standard error, $SE(\bar{d})$. Note that $\bar{d}$ can easily be calculated as $\bar{x}_A - \bar{x}_P$. The paired t statistic is $T = \bar{d}/SE(\bar{d})$, having n − 1 degrees of freedom, which can be used to obtain a P-value, P(T). Table 5 gives the algebraic representations of the data appearing in Table 1.
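As an illustration of the paired analysis described above, the following is a minimal Python sketch, assuming that individual patient outcomes for both periods are available. The function name `paired_summary` and the example values are hypothetical and are not data from any trial in this paper. Obtaining the P-value from T would additionally require a t-distribution routine, which is omitted here.

```python
import math
from statistics import mean, stdev


def paired_summary(active, placebo):
    """Return (mean difference, its SE, paired t statistic, degrees of freedom).

    Period and carryover effects are ignored, as assumed in this Appendix.
    """
    if len(active) != len(placebo) or len(active) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    # Within-patient differences: active minus placebo
    diffs = [a - p for a, p in zip(active, placebo)]
    n = len(diffs)
    d_bar = mean(diffs)
    se = stdev(diffs) / math.sqrt(n)  # stdev uses the n - 1 denominator
    if se == 0:
        raise ValueError("all differences are identical; t is undefined")
    return d_bar, se, d_bar / se, n - 1


# Hypothetical outcomes for five patients (illustration only)
active = [5.0, 6.0, 7.0, 8.0, 9.0]
placebo = [4.0, 4.0, 6.0, 5.0, 6.0]
d_bar, se, t_stat, df = paired_summary(active, placebo)
print(d_bar, se, t_stat, df)
```

The mean difference returned here equals the difference between the two period means, so it can be checked against basic summary data even when only the paired SE is in doubt.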
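From basic statistical theory the variance of the within-patient differences is
$$s_d^2 = s_A^2 + s_P^2 - 2\rho\, s_A s_P \qquad \text{(A)}$$
where $s_A$ and $s_P$ are the SDs of the outcomes in the active and placebo periods and $\rho$ is the correlation between them.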
[Equation (B) not available]
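The relationship between the SDs in the two periods, the correlation, and the SD of the within-patient differences can be sketched in Python as follows. This uses the standard identity for the variance of a difference of two correlated measurements. The function names and the numerical values are hypothetical, and the sketch is not claimed to reproduce the exact layout of the Appendix formulae.

```python
import math


def sd_of_differences(sd_a, sd_p, rho):
    """SD of within-patient differences from the period SDs and a correlation."""
    var_d = sd_a ** 2 + sd_p ** 2 - 2 * rho * sd_a * sd_p
    return math.sqrt(max(var_d, 0.0))  # guard against tiny negative rounding


def implied_correlation(sd_a, sd_p, sd_d):
    """Correlation implied by the period SDs and the SD of the differences."""
    return (sd_a ** 2 + sd_p ** 2 - sd_d ** 2) / (2 * sd_a * sd_p)


def se_mean_difference(sd_a, sd_p, rho, n):
    """Standard error of the mean paired difference for n patients."""
    return sd_of_differences(sd_a, sd_p, rho) / math.sqrt(n)


# Sensitivity analysis over assumed (imputed) correlations, illustration only
for rho in (0.0, 0.25, 0.5, 0.75):
    print(rho, round(se_mean_difference(3.0, 4.0, rho, 20), 4))
```

Setting the correlation to zero gives the standard error that would apply if the two periods were treated as independent groups of the same size, so the loop shows how much precision is gained as the assumed correlation increases.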
Binary data
Suppose a two-treatment, two-period cross-over trial involves n individuals and has a binary outcome. That is, for each individual, a 'success' or 'failure' is measured for each of an active treatment period and a placebo treatment period. Results from such a cross-over trial with binary data may be represented as in Table 6, which is an algebraic representation of data such as those shown in Table 2. Again, we assume that neither period effects nor carryover pose a problem. A common way of presenting the results, however, is as in Table 7. These are results as if the trial were a parallel group trial, and each patient appears twice, so that the apparent total sample size is 2n.
The McNemar test is based only on those patients whose outcomes differed in the two periods (i.e. discordant pairs). Based on the notation of Table 6, the asymptotic approximation of the McNemar test is given by
$$\chi^2 = \frac{(t - u)^2}{t + u}$$
The Mantel-Haenszel OR for paired outcomes is calculated as the ratio of discordant pairs t/u. An exact 95% CI is obtained by treating t as a Binomial variable with sample size t + u.16 As noted in the main text, this method is based on the conditional probabilities of success and the magnitude of this OR varies with the strength of the correlation between paired outcomes.17
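The McNemar statistic and the conditional OR with its exact CI can be sketched in Python as below. This is a sketch under stated assumptions: t and u are taken to be the two discordant-pair counts, the McNemar statistic is used without a continuity correction, and the exact CI is obtained as a Clopper-Pearson interval for the proportion t/(t + u), which is then converted to the odds scale. The function names and the counts are hypothetical.

```python
import math


def mcnemar_chi2(t, u):
    """Asymptotic McNemar statistic from the discordant pairs (1 df)."""
    if t + u == 0:
        raise ValueError("no discordant pairs")
    return (t - u) ** 2 / (t + u)


def binom_cdf(k, m, p):
    """P(X <= k) for X ~ Binomial(m, p)."""
    return sum(math.comb(m, i) * p ** i * (1 - p) ** (m - i) for i in range(k + 1))


def _bisect(func, target, increasing):
    """Solve func(p) = target for p in (0, 1), func being monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci_proportion(t, m, alpha=0.05):
    """Clopper-Pearson interval for a binomial proportion t/m."""
    lower = 0.0 if t == 0 else _bisect(
        lambda p: 1 - binom_cdf(t - 1, m, p), alpha / 2, True)
    upper = 1.0 if t == m else _bisect(
        lambda p: binom_cdf(t, m, p), alpha / 2, False)
    return lower, upper


def conditional_or(t, u, alpha=0.05):
    """Paired (conditional) OR t/u with an exact CI on the odds scale."""
    lower, upper = exact_ci_proportion(t, t + u, alpha)

    def to_odds(p):
        return math.inf if p == 1.0 else p / (1 - p)

    or_hat = math.inf if u == 0 else t / u
    return or_hat, to_odds(lower), to_odds(upper)


# Hypothetical discordant counts (illustration only)
print(mcnemar_chi2(10, 5))
print(conditional_or(10, 5))
```

Only the discordant pairs enter either calculation, so trials in which most patients respond identically on both treatments contribute little information regardless of their total size.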
An alternative OR approach is based on the marginal probabilities of success.18 The OR is calculated as
[Equations not available]
The term /n is a covariance and can be rewritten:
[Equations not available]
Cross-over trials with binary outcomes quite often yield 2 x 2 tables with empty cells. For OR methods, if either of the discordant frequencies is zero it is customary to add 0.5 to all four cell frequencies. The Becker-Balagtas marginal estimated OR is more robust to empty cells than the Mantel-Haenszel conditional OR.17
As for continuous data, it may be reasonable to impute correlations when they are unknown. This may be particularly appropriate as part of a sensitivity analysis. If a correlation, $\rho$, is available then the cross-over OR variance may be calculated with s given by
[Equation not available]
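A marginal OR and a variance for its logarithm that allows for the pairing can be sketched in Python as follows. Several assumptions are made here and are not confirmed by the text above: a and b denote the numbers of successes and failures on active treatment, c and d those on placebo, s the number of patients succeeding on both treatments, and the variance is the usual delta-method approximation with a covariance term between the two log odds. The function name `marginal_or` and the counts are hypothetical, and this should not be read as the exact form of the Becker-Balagtas formulae.

```python
import math


def marginal_or(a, b, c, d, s=None, rho=None):
    """Marginal OR and approximate variance of its log for paired binary data.

    a, b: successes and failures on active treatment (assumed notation)
    c, d: successes and failures on placebo (assumed notation)
    s:    number of patients succeeding on both treatments, if known
    rho:  correlation between paired outcomes, used to impute s if s is None
    """
    n = a + b
    if c + d != n:
        raise ValueError("both treatments must total n patients")
    if min(a, b, c, d) <= 0:
        raise ValueError("all marginal counts must be positive")
    if s is None:
        if rho is None:
            raise ValueError("supply either s or rho")
        # Value of s implied by the correlation between paired outcomes
        s = (rho * math.sqrt(a * b * c * d) + a * c) / n
    or_hat = (a * d) / (b * c)
    # Approximate covariance between the two marginal log odds
    cov = n * (n * s - a * c) / (a * b * c * d)
    var_log_or = 1 / a + 1 / b + 1 / c + 1 / d - 2 * cov
    return or_hat, var_log_or


# Hypothetical trial: 50 patients, 20 succeeding on both treatments
print(marginal_or(30, 20, 25, 25, s=20))
# Sensitivity analysis with imputed correlations (illustration only)
for rho in (0.0, 0.2, 0.4):
    print(rho, marginal_or(30, 20, 25, 25, rho=rho))
```

With a correlation of zero the variance reduces to the familiar parallel group expression, and positive correlations reduce it, which is why ignoring the pairing is usually conservative.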
Acknowledgments
We would like to thank Anne Louise Jennings for permission to use data from the review of opioids. JH was funded by MRC project grant G9815466.
References
1 Gøtzsche PC. Patients' preference in indomethacin trials: an overview. Lancet 1989;i:88–91.
2 Senn S. Cross-over trials in Clinical Research. Chichester: Wiley, 1993.
3 Jones B, Kenward M. Design and Analysis of Cross-over Trials. London: Chapman and Hall, 1989.
4 Campbell H, Surry SAM, Royle EM. A review of randomised controlled trials published in Archives of Disease in Childhood from 1982–96. Arch Dis Child 1998;79:192–97.
5 Thornley B, Adams C. Content and quality of 2000 controlled trials in schizophrenia over 50 years. Br Med J 1998;317:1181–84.
6 Grizzle JE. The two-period change over design and its use in clinical trials. Biometrics 1965;21:467–80.
7 Freeman PR. The performance of the two-stage analysis of two-treatment, two-period cross-over trials. Stat Med 1989;8:1421–32.
8 Hills M, Armitage P. The two-period cross-over clinical trial. Br J Clin Pharmacol 1979;8:7–20.
9 Cohlen BJ, te Velde ER, Looman CWN, Eijckemans R, Habbema JDF. Crossover or parallel design in infertility trials? The discussion continues. Fertil Steril 1998;70:40–45.
10 Daya S. Differences between crossover and parallel study designs – debate? Fertil Steril 1999;71:771–72.
11 Johnson MA, Woodcock AA, Geddes DM. Dihydrocodeine for breathlessness in pink puffers. Br Med J 1983;286:675–77.
12 Follmann D, Elliott P, Suh I, Cutler J. Variance imputation for overviews of clinical trials with continuous response. J Clin Epidemiol 1992;45:769–73.
13 Anderson JH Jr, Brunelle RL, Koivisto VA et al. and the Multicenter Insulin Lispro Study Group. Reduction of postprandial hyperglycemia and frequency of hypoglycemia in IDDM patients on insulin-analog treatment. Diabetes 1997;46:265–70.
14 Brunelle RL, Llewelyn J, Anderson JH Jr, Gale EAM, Koivisto VA. Meta-analysis of the effect of insulin lispro on severe hypoglycemia in patients with Type I diabetes. Diabetes Care 1998;21:1726–31.
15 Newcombe RN, Altman DG. Proportions and their differences. In: Altman DG, Machin D, Bryant TN, Gardner MJ (eds). Statistics with Confidence. 2nd Edn. London: BMJ Books, 2000, pp.45–56.
16 Morris JA, Gardner MJ. Epidemiological studies. In: Altman DG, Machin D, Bryant TN, Gardner MJ (eds). Statistics with Confidence. 2nd Edn. London: BMJ Books, 2000, pp.57–72.
17 Curtin F, Elbourne D, Altman DG. Meta-analysis combining parallel and cross-over clinical trials: II. binary outcomes. Stat Med 2002a (In press).
18 Becker MP, Balagtas CC. Marginal modelling of binary cross-over data. Biometrics 1993;49:997–1009.
19 Deeks JJ, Altman DG, Bradburn MJ. Statistical methods for examining heterogeneity and combining results from several studies in meta-analysis. In: Egger M, Davey Smith G, Altman DG (eds). Systematic Reviews in Health Care. Meta-analysis in Context. 2nd Edn. London: BMJ Books, 2001, pp.285–312.
20 Curtin F, Altman DG, Elbourne D. Meta-analysis combining parallel and cross-over clinical trials: I. continuous outcomes. Stat Med 2002b (In press).
21 Curtin F, Elbourne D, Altman DG. Meta-analysis combining parallel and cross-over clinical trials: III. the issue of carry-over. Stat Med 2002c (In press).
22 Fortin PR, Lew RA, Liang MH et al. Validation of a meta-analysis: the effects of fish oil in rheumatoid arthritis. J Clin Epidemiol 1995;48:1379–90.
23 Jennings AL, Davies A, Higgins JPT, Broadley K. Opioids for the palliation of breathlessness in terminal illness (Cochrane review). In: The Cochrane Library, Issue 4, 2001. Oxford: Update Software, 2001.
24 Fleiss JL. The statistical basis of meta-analysis. Stat Meth Med Res 1993;2:121–45.
25 Cochrane Collaboration. The Cochrane Library. Issue 1. Oxford: Update Software, 2001.
26 Smyth A, Walters S. Prophylactic antibiotics for cystic fibrosis (Cochrane Review). In: The Cochrane Library, Issue 1, 2001. Oxford: Update Software. Updated quarterly.
27 Huppert FA, Van Niekerk JK, Herbert J. Dehydroepiandrosterone (DHEA) supplementation for cognition and well-being (Cochrane Review). In: The Cochrane Library, Issue 1, 2001, Oxford: Update Software. Updated quarterly.
28 Moher D, Schulz KF, Altman DG for the CONSORT Group. The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomised trials. Lancet 2001;357:1191–94. (Also published in Ann Intern Med 2001;134:657–62 and JAMA 2001;285:1987–91.)
29 Riordan PJ, FitzGerald PEB. Outcome measures in split mouth caries trials and their statistical evaluation. Community Dent Oral Epidemiol 1994;22:192–97.
30 Murdoch IE, Morris SS, Cousens SN. People and eyes: statistical approaches in ophthalmology. Br J Ophthalmol 1998;82:971–73.
31 Needleman IG, Giedrys-Leeper E, Tucker RJ, Worthington HV. Guided tissue regeneration for periodontal infra-bony defects (Cochrane Review). In: The Cochrane Library, Issue 2, 2001. Oxford: Update Software. Updated quarterly.
32 Donner A, Piaggio G, Villar J. Statistical methods for the meta-analysis of cluster randomization trials. Stat Meth Med Res 2001;10:325–38.
33 Roberts KA, Jones DR, Abrams KR, Dixon-Woods M, Fitzpatrick R. Meta-analysis of qualitative and quantitative evidence: an example based on studies of patient satisfaction. Technical Report (Statistics) 9801. Department of Epidemiology and Public Health, University of Leicester, 1998.