Alberta Cancer Board, Calgary, AB, Canada.
Alberta Cancer Board, Division of Epidemiology, Prevention and Screening, 1331-29 St NW, Calgary, Alberta T2N 4N2, Canada. E-mail: chrisf{at}cancerboard.ab.ca
Four types of reviews of existing literature and extant data have been defined for epidemiological studies: traditional narrative reviews, meta-analyses of published studies, and pooled analyses of individual-level data that are either retrospectively or prospectively planned.1 The methods for the conduct and reporting of meta-analyses and pooled analyses of observational epidemiological studies have been described2,3 and these types of reviews of epidemiological evidence are now being performed more frequently. The Pooling Project, a retrospectively planned pooled analysis, has been on-going since 1990, and the most recent publication on the association between meat and dairy products and breast cancer risk from that pooled data set is presented in this issue of the International Journal of Epidemiology.4 Investigators at the Harvard Medical School identified all cohort studies conducted by the late 1980s that used a comprehensive and validated assessment of dietary intake at baseline of the cohort, and that had a sample size of at least 200 breast cancer cases by the time of the pooling. Since the initial pooling of the data, several papers have been published that have examined the association between breast cancer and dietary fat,5,6 alcohol,7 fruit and vegetable consumption,8 anthropometric risk factors9 and non-dietary risk factors.10 Given the large size of the data set (over 350 000 women and nearly 7500 cases), the quality of the original studies that were combined, and the attention to the statistical methods used for the pooled analysis, the Pooling Project is an excellent example of the type of co-operative analytical effort that should be increasingly pursued in epidemiology. The investigators can be commended for providing leadership in this field. Nonetheless, as with any scientific endeavour, there are still methodological issues that need to be addressed and improvements that should be considered for future pooled analyses.
Eight cohort studies were combined in the Pooling Project: one Canadian cohort, two European cohorts and five American cohorts. In each of the papers published to date, the investigators chose to divide the Nurses' Health Study (NHS) cohort into two groups: the first covers follow-up from 1980 to 1986 and the second from 1986 to 1996. Risk estimates are provided for each of these two groups separately and they are treated as though they arose from separate cohort studies, even though the baseline cohort in the later follow-up period (1986–1996) is a subset of that in the earlier follow-up period (1980–1986). The investigators justify dividing the NHS cohort into two groups on the grounds that the dietary assessment was repeated during the follow-up period and that the exposure assessment in 1986 was more detailed than the baseline assessment in 1980. Furthermore, they argue that the person-time in the two time periods is statistically independent despite being derived from the same individuals. Although the investigators are correct that the person-time is independent, they do not address the issue of the collinearity of the observations within these two subsets of the NHS. The random effects model used to estimate the pooled effects takes between-study variation into consideration, and its underlying assumption is that each cohort is independent. In the analysis performed for the Pooling Project, this assumption is not upheld since the two sub-cohorts of the NHS are not independent and the study-specific biases will be the same. In addition, the NHS was the largest of the seven cohorts pooled and, by dividing its follow-up period into two groups, the investigators have given the NHS data more weight in the final pooled analysis. Consequently, the summary estimates are noticeably influenced by the results obtained in that cohort.
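The effect of splitting one cohort on the random effects weights can be illustrated with a minimal sketch. It assumes the DerSimonian-Laird estimator, which is one common random effects method and not necessarily the exact procedure used in the Pooling Project. The log relative risks and variances are hypothetical and chosen only for illustration; the function and variable names are likewise assumptions.

```python
# Illustrative sketch only: all numbers are hypothetical, not Pooling Project data.

def dersimonian_laird(estimates, variances):
    """Pool study-specific log relative risks with the DerSimonian-Laird
    random effects estimator. Returns (pooled, tau2, weight_shares)."""
    k = len(estimates)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed effect) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    # Cochran's Q: weighted squared deviations from the fixed effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance, truncated at 0
    w_star = [1.0 / (v + tau2) for v in variances]  # random effects weights
    sws = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sws
    return pooled, tau2, [wi / sws for wi in w_star]


# Scenario 1: one large cohort (first entry) and two smaller cohorts.
whole_est, whole_var = [0.10, -0.25, 0.35], [0.01, 0.04, 0.04]

# Scenario 2: the large cohort entered as two follow-up periods, each with
# the same estimate and twice the variance (half the information).
split_est, split_var = [0.10, 0.10, -0.25, 0.35], [0.02, 0.02, 0.04, 0.04]

_, tau2_whole, shares_whole = dersimonian_laird(whole_est, whole_var)
_, tau2_split, shares_split = dersimonian_laird(split_est, split_var)

large_share_whole = shares_whole[0]
large_share_split = shares_split[0] + shares_split[1]

print(f"Share of weight, large cohort entered once:  {large_share_whole:.3f}")
print(f"Share of weight, large cohort entered twice: {large_share_split:.3f}")
```

In this made-up example the large cohort's share of the random effects weight rises from roughly 46% to roughly 61% when it is entered as two studies, whereas its share of the fixed effect (inverse-variance) weight would be the same in both scenarios. The estimated between-study variance also falls, because two entries with a common source look more homogeneous than independent cohorts would.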
Another decision by the investigators that influenced the results is the use of study-specific quartile cut-points rather than common cut-points for all of the studies combined. Dietary intake differed widely across these cohorts, given the heterogeneous populations included in these studies. By maintaining within-study comparisons, the investigators missed one of the opportunities of a pooled analysis: to examine risks across larger samples of individuals with more heterogeneous exposures. The possibility exists that associations may have been observed for meat and dairy products within these data if the investigators had used common cut-points.
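The difference between study-specific and common cut-points can be sketched as follows. The two cohorts and their intakes (nominally in g/day) are hypothetical, as are the helper names; the quartile boundaries come from Python's `statistics.quantiles` with its default method, which is only one of several ways to define quartiles.

```python
# Illustrative sketch only: intakes are hypothetical, not Pooling Project data.
from bisect import bisect_right
from statistics import quantiles


def quartile_cutpoints(values):
    """Return the three boundaries that split the values into quartiles."""
    return quantiles(values, n=4)


def quartile_of(value, cutpoints):
    """Return the quartile (1 to 4) that a value falls into."""
    return bisect_right(cutpoints, value) + 1


# Two hypothetical cohorts with very different intake distributions.
low_cohort = [10, 20, 30, 40, 50, 60, 70, 80]
high_cohort = [60, 80, 100, 120, 140, 160, 180, 200]

# Study-specific cut-points: each cohort is divided on its own distribution.
low_cuts = quartile_cutpoints(low_cohort)
high_cuts = quartile_cutpoints(high_cohort)

# Common cut-points: one set of boundaries from the combined distribution.
common_cuts = quartile_cutpoints(low_cohort + high_cohort)

intake = 70  # the same absolute intake observed in either cohort
print("Study-specific, low-intake cohort: ", quartile_of(intake, low_cuts))
print("Study-specific, high-intake cohort:", quartile_of(intake, high_cuts))
print("Common cut-points, either cohort:  ", quartile_of(intake, common_cuts))
```

With study-specific cut-points the same absolute intake is classified in the highest quartile of one cohort and the lowest quartile of the other, so the pooled contrast between quartiles compares ranks within studies rather than levels of exposure. Common cut-points place the same intake in the same category regardless of cohort.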
One of the advantages of a pooled analysis is the increased study power that permits a full examination of effect modification within the data. Unfortunately, a tendency exists to evaluate all statistical interactions that are possible with the available data rather than to limit the assessment to biologically meaningful interactions. Hence, the assessment of 84 interactions in this paper of the Pooling Project is difficult to support and is not the preferred approach for future pooled analyses. Likewise, another problem with pooled analyses of such large data sets that has not been addressed here is multiple comparisons, since several associations have been evaluated in this paper and in previous publications from the Pooling Project.
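The scale of the multiple comparisons problem for 84 interaction tests can be gauged with simple arithmetic. The sketch below assumes a conventional significance level of 0.05 and, for the second quantity, independent tests; neither assumption is stated in the paper, and tests on the same data set are unlikely to be fully independent.

```python
# Illustrative arithmetic only: alpha and independence are assumptions.
n_tests = 84   # number of interactions assessed, as stated in the text
alpha = 0.05   # assumed per-test significance level

# Number of tests expected to reach significance by chance if no true
# interaction exists (holds whether or not the tests are independent).
expected_false_positives = n_tests * alpha

# Probability of at least one chance finding, assuming independent tests.
prob_at_least_one = 1.0 - (1.0 - alpha) ** n_tests

# Bonferroni-corrected per-test threshold, one simple (conservative) remedy.
bonferroni_alpha = alpha / n_tests

print(f"Expected chance findings:        {expected_false_positives:.1f}")
print(f"P(at least one chance finding):  {prob_at_least_one:.3f}")
print(f"Bonferroni per-test threshold:   {bonferroni_alpha:.5f}")
```

Under these assumptions about four of the 84 interactions would be expected to appear statistically significant by chance alone, which is one reason to restrict testing to interactions specified in advance on biological grounds.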
Several sources of heterogeneity exist between the cohorts in the Pooling Project including the study populations and sampling methods used in the baseline cohorts, the dietary assessment and validation methods including the level of detail on food items consumed and cooking methods used, and the type and quality of information available on confounding risk factors and effect modifiers. Although the investigators have developed sophisticated methods to deal with measurement error, ones that they have used in previous publications,5–7 no adjustment for measurement error was made in this paper because the original studies did not have validation study data on individual foods or food groups. Besides considering measurement error in the dietary data, the investigators have not considered other sources of error or bias in the individual studies that could exist when the studies are combined. Hence, more effort should be given in future studies to examining and controlling sources of heterogeneity across studies.
The best method for pooling observational epidemiological studies, one that avoids some of the limitations found in the Pooling Project, is to conduct prospectively planned pooled analyses. This type of pooled analysis can minimize the measurement errors and bias that arise when studies with heterogeneous designs and data collection methods are combined. The International Agency for Research on Cancer has been conducting prospectively planned pooled analyses for the past two decades with the SEARCH programme11 and European Prospective Investigation into Cancer and Nutrition (EPIC)12 studies. These are but two examples of how common protocols can be developed and applied across individual studies with the plan to pool the data for the analyses of numerous outcomes. With increased world-wide collaboration in epidemiology, more prospectively planned pooled analyses need to be conducted that use standardized study designs, data collection and analytical methods. In so doing, the validity, reliability and quality of these methods can be improved and more clarity on disease-exposure associations obtained.
References
1 Blettner M, Sauerbrei W, Schlehofer B, Scheuchenpflug T, Friedenreich C. Traditional reviews, meta-analyses and pooled analyses in epidemiology. Int J Epidemiol 1999;28:1–9.
2 Friedenreich CM. Methods for pooled analyses of epidemiologic studies. Epidemiology 1993;4:295–305.
3 Stroup DF, Berlin JA, Morton SC et al. Meta-analysis of observational studies in epidemiology: a proposal for reporting. JAMA 2000;283:2008–12.
4 Missmer SA, Smith-Warner SA, Spiegelman D et al. Meat and dairy food consumption and breast cancer: a pooled analysis of cohort studies. Int J Epidemiol 2002;31:78–85.
5 Hunter DJ, Spiegelman D, Adami H-O et al. Cohort studies of fat intake and the risk of breast cancer - a pooled analysis. N Engl J Med 1996;334:356–61.
6 Smith-Warner SA, Spiegelman D, Adami H-O et al. Types of dietary fat and breast cancer: a pooled analysis of cohort studies. Int J Cancer 2001;92:767–74.
7 Smith-Warner SA, Spiegelman D, Yaun S-S et al. Alcohol and breast cancer in women: a pooled analysis of cohort studies. JAMA 1998;279:535–40.
8 Smith-Warner SA, Spiegelman D, Yaun S-S et al. Intake of fruits and vegetables and risk of breast cancer: a pooled analysis of cohort studies. JAMA 2001;285:769–76.
9 van den Brandt PA, Spiegelman D, Yaun S-S et al. Pooled analysis of prospective cohort studies on height, weight, and breast cancer risk. Am J Epidemiol 2000;152:514–27.
10 Hunter DJ, Spiegelman D, Adami H-O et al. Non-dietary factors as risk factors for breast cancer, and as effect modifiers of the association of fat intake and risk of breast cancer. Cancer Causes Control 1997;8:49–56.
11 Boyle P. SEARCH programme of the International Agency of Research on Cancer. Eur J Cancer 1990;26:547–49.
12 Riboli E, Kaaks R. The EPIC project: rationale and study design. Int J Epidemiol 1997;26:S6–S14.