1 Epidemiology Group, Department of Medicine and Therapeutics, University of Aberdeen, Aberdeen, Scotland.
2 Sunrise Medical Laboratory, Hauppauge, NY.
3 University of Texas Health Science Center at Houston, Houston, TX.
4 Office of Genomics and Disease Prevention, Centers for Disease Control and Prevention, Atlanta, GA.
5 Department of Epidemiology, University of Pittsburgh Graduate School of Public Health, Pittsburgh, PA.
6 National Heart, Lung, and Blood Institute, Bethesda, MD.
7 Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD.
8 Division of Clinical Care Research, New England Medical Center, Boston, MA.
9 National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control and Prevention, Atlanta, GA.
10 Ospedale Policlinico IRCCS, Milano, Italy.
11 Biostatistics Division, Department of Preventative Medicine, University of Southern California, Los Angeles, CA.
12 International Agency for Research on Cancer, Lyon, France.
13 National Institute of Environmental Health Sciences, Research Triangle Park, NC.
Received for publication September 5, 2001; accepted for publication June 13, 2002.
ABSTRACT
association; case-control studies; causality; cohort studies; epidemiologic methods; gene frequency; genetic techniques; meta-analysis
INTRODUCTION
Although the spectrum of the relation between genes and disease is very broad, ranging from single gene disorders to multifactorial conditions, many common methodological issues apply throughout this spectrum (4). These issues relate to the planning and analysis of original studies, to the critical appraisal of individual studies, and to the integration of evidence from diverse studies.
Many papers deal with critical appraisal. A number of national organizations have specified criteria for assessing evidence on which policies and guidelines are based, for example, the Preventive Services Task Force's Guide to Community Preventive Services (5), the Scottish Intercollegiate Guidelines Network (6), and the Australian National Health and Medical Research Council (7). Issues that are particularly important in the appraisal of studies of genotype prevalence and gene-disease associations include the analytic validity of genotyping, selection of subjects, confounding (especially as a result of population stratification), gene-environment and gene-gene interactions, statistical power, and multiple statistical comparisons. There appears to be a need to develop guidelines to assess evidence from these studies. For example, a checklist for studies of associations between asthma and candidate genes has been proposed (8). Issues regarding the integration of evidence include identification of studies, the adequacy of reporting methods and results from individual studies, publication bias, quality scoring schemes, and the appropriateness of quantitative synthesis of the evidence.
Critical appraisal and integration of evidence require that the evidence be adequately reported. This paper presents recommendations regarding considerations that should be addressed when reporting studies of genotype prevalence and gene-disease associations, both for individual investigations and for systematic reviews. Although the focus of this paper is on recommendations for reporting, these recommendations necessarily have implications for study conduct and analysis. The recommendations resulted from an expert panel workshop convened by the Centers for Disease Control and Prevention and the National Institutes of Health in January 2001. The methods used to develop the recommendations are described in the accompanying commentary (9).
The checklist presented in table 1 is intended to guide investigators in the preparation of manuscripts, to guide those who need to appraise manuscripts and published papers, and to be useful to journal editors and readers. It should not be regarded as an exhaustive list of points that have to be presented in all journal articles. We recognize that it may not always be feasible to address all of the considerations, for example, in studies of rare conditions in clinical settings. However, it would be useful if a record were kept of the coverage of these points for each study to accommodate syntheses from different studies. We suggest establishing a Web-based methods register to record such information.
REPORTING AND APPRAISAL OF SINGLE STUDIES
Definition and grouping of genotypes
We recommend that a clear definition of the genotype(s) investigated should be presented. The validity of grouping on the basis of putative functional effects depends on the availability and quality of functional studies of gene variants, and this is likely to change over time. For multiallelic systems, genotypes have been grouped according to functional effects in some investigations. For example, this has been done for the N-acetyltransferase 2 gene (NAT2) polymorphisms (10). Therefore, we recommend that, when there are multiple alleles, those tested for should be specified.
It is important to distinguish true functional variants from markers that are associated with a disease only because they are in linkage disequilibrium with a functional variant. It may be useful to type several polymorphisms throughout the region of a candidate gene in order to construct haplotypes, which could be tested for association with the phenotype of interest. The increasing availability of mapped single nucleotide polymorphism markers (11–15) offers the opportunity for such an approach and presents methodological challenges (see the section on statistical issues).
Type of samples and timing of collection
A variety of DNA sources can be used. Polymerase chain reaction methods are widely used to genotype DNA extracted from blood or a number of other sources (16–22). Theoretically, genotyping results using genomic DNA from different tissues should be identical, because the DNA should contain an identical sequence in all tissues. However, participation rates may be higher for studies based on mouthwash or hair samples than those based on blood samples (19), and DNA is often more difficult to obtain and purify from certain tissues than from whole blood.
Sometimes the source of DNA is peripheral blood for controls, while it is a different tissue (e.g., tumor specimen) for cases. Although this should not in theory affect the genotypic assay when germline mutations are being assessed, the technique may be easier to perform in DNA extracted from blood, and therefore the results of the genotype may be more accurate in controls than in cases. In studies on cancer in particular, the ability to extract DNA from tumor tissue depends on several factors, including the amount of necrosis and the quantity of tissue, and therefore can result in the inclusion of selected cases. Because the timing of sample collection and the storage period before DNA extraction and/or analysis may differ between cases and controls, we recommend that the number of specimens collected, the success rate and timing of DNA extraction, and the success rate and timing of genotyping should be reported by the study group. Differences in timing of the sample collection and the storage period may be of especial importance when the genotype is assessed on the basis of phenotype.
Genotyping by polymerase chain reaction methods
The results of genotyping by polymerase chain reaction methods may be affected by the technology used, operational conditions in the laboratory, and inadequate safeguards for scoring the results. A recent appraisal of 40 studies in which molecular genetic techniques were used demonstrates the need for universal standards for quality control. Molecular genetic analyses were repeated in a pertinent sample of specimens or the test was confirmed with another procedure in only 15 (37.5 percent) studies, and assays were conducted in a manner blind to pertinent characteristics of subjects or hypotheses in only 13 (32.5 percent) studies (23). With the application of high-throughput methods, a number of which are under development, quality control procedures are particularly important. Numerous polymerase chain reaction-based methods are available for single nucleotide polymorphism genotyping, including restriction fragment length polymorphism analysis (24), oligonucleotide ligation assays (25), the "TaqMan" assay (26), single-base-pair extension assays, and others. We recommend that the description of the genotyping assay should include the primer sequences, thermocycle profile, number of cycles, and reference.
The accuracy of polymerase chain reaction methodology is generally quite high, although different types of laboratory errors can occur. Poor or nonspecific amplification of polymerase chain reaction products or lack of complete enzymatic function of restriction enzymes, leading to incomplete digestion in restriction fragment length polymorphism assays, can occur with measurable frequency. These errors can be controlled for by optimization of assays prior to genotyping and the use of internal controls and repetitive experiments. A data set containing less than 95 percent reproducibility between replicates indicates a potential problem and should not be considered as accurate. Rothman et al. (27) noted that misclassification of the genotype can bias measures of association between the genotype and disease, especially when the prevalence of the genotype is either very high or very low.
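The point about misclassification can be illustrated numerically. The following is a minimal sketch, not the method of Rothman et al.: it assumes nondifferential misclassification (the same sensitivity and specificity in cases and controls), and the function name and all input values are hypothetical.

```python
def observed_odds_ratio(true_or, prevalence_controls, sensitivity, specificity):
    """Odds ratio expected after nondifferential genotype misclassification.

    All inputs are illustrative assumptions, not values from any study.
    """
    p0 = prevalence_controls
    # True genotype prevalence in cases implied by the true odds ratio.
    odds_cases = true_or * p0 / (1 - p0)
    p1 = odds_cases / (1 + odds_cases)

    def observed(p):
        # Apparent prevalence = true positives + false positives.
        return sensitivity * p + (1 - specificity) * (1 - p)

    q0, q1 = observed(p0), observed(p1)
    return (q1 / (1 - q1)) / (q0 / (1 - q0))


# Same assay accuracy and true odds ratio, different genotype prevalence.
or_common = observed_odds_ratio(2.0, 0.50, sensitivity=0.95, specificity=0.95)
or_rare = observed_odds_ratio(2.0, 0.02, sensitivity=0.95, specificity=0.95)
print(round(or_common, 2), round(or_rare, 2))
```

Under these assumed values, the observed odds ratio is pulled toward 1 in both scenarios, and more strongly when the genotype is rare.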
Some genotypic tests require visual inspection and interpretation of electrophoresis gels, and therefore, observer variability may be important (23). Observer variability can be minimized by double blind scoring and data entry, followed by electronic comparison of the blind entries. Discrepancies are then flagged automatically and adjudicated by a third (experienced) person.
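The electronic comparison of blind entries might look like the following sketch. The data layout (a mapping from sample identifier to genotype call) and the function name are assumptions made for illustration only.

```python
def flag_discrepancies(scores_a, scores_b):
    """Return sample IDs whose genotype calls differ between two blind readers.

    Each argument maps a sample ID to a genotype call (hypothetical format).
    A sample scored by only one reader is also flagged.
    """
    all_ids = sorted(set(scores_a) | set(scores_b))
    return [sid for sid in all_ids if scores_a.get(sid) != scores_b.get(sid)]


reader_1 = {"S001": "AA", "S002": "AG", "S003": "GG", "S004": "AG"}
reader_2 = {"S001": "AA", "S002": "GG", "S003": "GG"}

# Samples to pass to the third, experienced adjudicator.
to_adjudicate = flag_discrepancies(reader_1, reader_2)
print(to_adjudicate)  # ['S002', 'S004']
```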
We recommend that authors specify the quality measures used for the genotyping analysis and provide information on the degree of reproducibility between quality control replicates. Quality control measures include 1) internal validation for analytic validity; 2) blinding of laboratory personnel to pertinent characteristics of the samples, donor subjects, and hypotheses being investigated; 3) procedures for establishing duplicates and quality control numbers from blind duplicates; 4) test failure rate, by study group; 5) inspection of whether genotype frequencies conform to Hardy-Weinberg equilibrium (in controls in case-control studies) and are consistent with other reports for the same population (this criterion should not be binding); and 6) blind or automated data entry and third party adjudication. If a large number of samples do not produce acceptable genotyping results, comparability data should be provided with samples that yield acceptable results.
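The Hardy-Weinberg check in item 5 can be sketched for a biallelic locus as a one-degree-of-freedom chi-square goodness-of-fit test. This is one common approach and an assumption here, since the text does not prescribe a specific test; the function name and genotype counts are hypothetical, and the sketch assumes both alleles are observed.

```python
import math


def hardy_weinberg_chi2(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg proportions for two alleles.

    Assumes both alleles are present, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical control genotype counts (AA, AB, BB).
chi2, p_value = hardy_weinberg_chi2(400, 400, 200)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2g}")
```

A departure from equilibrium is a prompt to look for genotyping error or selection, not proof of either, which is consistent with the statement that this criterion should not be binding.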
Genotyping on the basis of phenotype
In some studies, the genotype has been inferred on the basis of a phenotypic test. A potential advantage of this approach is that the phenotypic assay reflects the enzyme activity level and therefore may provide a direct measure of the functional significance of the underlying genetic polymorphism. The genotype is one determinant of the long-term enzyme activity level. Nevertheless, an enzyme activity assay provides a measure only at a single point in time and potentially may be distorted by systematic influences (e.g., effects of disease stress on metabolism and inducing factors), as well as by random measurement error. Contrasting results between studies based on phenotypic and polymerase chain reaction methods have been observed, for example, for the acetylator polymorphism and colorectal cancer (10) and for the glutathione S-transferase µ polymorphism and lung cancer (28). These differences underline the need to specify the genotyping method.
Phenotypic methods have also been used because enormous sequence variability within some genes presents challenges to genotyping by polymerase chain reaction methods. For example, more than 240 different constitutional neurofibromatosis type 1 mutations have been documented (29). As the majority of these lead to a truncated protein product, a protein truncation test was developed. This test is commercially available, but its sensitivity, specificity, and predictive value have not been established. A "significant" proportion of cases identified by the protein truncation test are not confirmed by sequencing, suggesting a problem of false positives for the protein-based assay. False negative results may also be a problem as the protein truncation test had a sensitivity of around 70 percent in small series of clinically diagnosed neurofibromatosis type 1 (29).
As for genotyping by polymerase chain reaction methods, we recommend that quality control measures and the degree of reproducibility between quality control replicates be specified.
Selection of subjects
Evaluation of potential selection bias requires consideration of the aim and design of the study and fieldwork. To date, most studies of gene-disease associations for late-onset diseases have used the case-control design. It is important to distinguish between studies whose aim is the detection of an association and those whose aim is the estimation of an association. In the former situation, cases may be "overselected" from multiplex families to increase the power to detect an association; it would be inappropriate to present the measure of association as an estimate of population association. In the latter situation, the principles underlying study design are essentially the same as for the investigation of the magnitude of association with environmental risk factors, including the minimization of the potential for selection bias emphasized in many epidemiologic textbooks (30–33). In a number of studies, the selection of cases has not been well described (34). In a review of type 1 diabetes and human leukocyte-DQ antigen locus (HLA-DQ) polymorphisms, it was noted that many studies were based on convenience samples of cases in which type 2 diabetic persons who use insulin in their treatment regimen have been included (35). In several studies of cancer, prevalent cases have been included to varying extents (28). In these studies, bias in both detecting and estimating association would occur if the genotype affected survival and thereby ascertainment or if genotypes were assayed by a phenotypic test that was influenced by disease progression and/or treatment.
A recurrent problem in case-control studies of gene-disease associations with unrelated controls has been that the controls were not selected from the same source population as the cases (10, 34–36). The potential problem of selecting controls who do not represent the population from which cases arise is illustrated by the divergence in odds ratios for the association between colorectal cancer and the glutathione S-transferase T1 gene (GSTT1) null genotype (37), when the different control groups were analyzed (36). Thus, sufficient information should be presented to assess whether controls would have become cases if they had developed the disease. In regard to genotype prevalence, many early studies were based on convenience samples, and not infrequently, little information was given on how the samples were selected (10, 34, 36, 38).
There are a number of ongoing cohort studies in which DNA samples have been collected. Compared with case-control studies, cohort studies have a number of advantages, including the capacity to examine age at onset distributions and multiple disease outcomes. The use of case-cohort and nested case-control analysis of archived samples that are suitable for genotypic analysis potentially can minimize the disadvantages of the cost of genotyping an entire cohort (39–41). A major advantage of the case-cohort design for studies where expensive biologic markers are collected is that the same comparison group can be used for several different disease outcomes. Therefore, this design is likely to be used increasingly.
In case-cohort studies, comparison subjects are a random sample of the cohort, and the effect of age, which is the key time variable, is controlled for in the analysis. In more traditional nested case-control designs, controls are selected to match the cases on a temporal factor, such as age, and the main comparisons are within the time-matched sets (42). We recommend that the method of age adjustment be specified in case-cohort studies and that in nested case-control studies details of the matching on age or other temporal factors be presented.
Population stratification
There has been concern about the possible effects of population stratification on the results of population-based case-control studies (43–46). Population stratification includes differences between groups in ethnic origin, and it can also arise because of differences between groups of similar ethnic origin but between which there has been limited admixture, such as in isolated populations. For example, a population might comprise the descendants of waves of immigrants from the same source but differ genetically because of founder effects. The differences may then be apparent because insufficient time has elapsed for mixture between the groups. In an exploration of the possible degree of bias from population stratification in US studies of cancer among non-Hispanics of European descent, it was concluded that this bias is unlikely to be substantial when the epidemiologic principles of study design, conduct, and analysis are rigorously applied (47). Variations in the frequency of certain genotypes in African Americans appear to be much wider than those observed in subjects of European origin, and therefore the possibility of stratification may be higher (48).
Concern about the possible effects of population stratification has stimulated the development of family-based case-control designs, which essentially eliminate potential confounding from this source (49, 50). The most commonly used examples of such designs involve the use of siblings or parents as controls. Sibling controls are derived from the same gene pool as cases. However, selection bias could arise from the fact that a sibling may not be available for every case. Bias would arise if determinants of availability, for example, sibship size, were associated with genotype, particularly if a substantial proportion of cases had to be excluded because no sibling control was available. In addition, because of overmatching on genotype, there is a loss of statistical power compared with the use of unrelated controls (51). This loss of power generally does not occur for case-parental control studies (50), which have been advocated for the identification of modest gene-disease associations (52). However, the need to obtain samples from parents is a practical problem that limits the applicability of the design for diseases of late onset. Therefore, it is important that the study design be reported so that the possible impact of population stratification can be assessed.
Another approach proposed to minimize the potential problem of population stratification when using unrelated controls is to measure and adjust for genetic markers of ethnicity that are not linked to the disease under investigation (53–56). This would be expected to control for ethnic variation in disease risk attributable to genetic factors. However, residual confounding from other sources of ethnic variation in disease risk would be a potential issue. It is unlikely that a single measure will capture the important sources of ethnic variation (57). In case-control studies with unrelated controls or in cohort studies, we recommend that the details of matching for ethnicity or adjustment for ethnicity in analysis be reported.
Statistical issues
We recommend that genotype frequencies be presented when available or inexpensive to obtain, rather than allele frequencies alone, both for studies of genotype prevalence and for studies of gene-disease associations. We make this recommendation because it is the genotype that determines risk and because allele frequencies can be calculated from genotype data (e.g., to determine Hardy-Weinberg equilibrium), whereas if only allele frequencies are presented, genotype frequencies cannot be calculated. Clearly, this recommendation would not be appropriate for studies based on DNA pooling, which may be a valuable approach in estimating allele frequency distributions in many loci in multiple populations (58), for initial investigation of disease loci, or for follow-up to confirm regions identified in linkage studies (59). In studies in which pooling is used, the strategy for pooling specimens from cases and controls should be specified.
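The asymmetry between genotype and allele frequencies can be shown with a short sketch. The genotype counts and the function name below are hypothetical; the calculation is ordinary allele counting for a biallelic locus.

```python
def allele_frequencies(n_aa, n_ab, n_bb):
    """Allele frequencies (A, B) computed from genotype counts."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return freq_a, 1 - freq_a


# Two hypothetical samples with different genotype distributions (AA, AB, BB).
sample_1 = (360, 480, 160)
sample_2 = (400, 400, 200)

print(allele_frequencies(*sample_1))
print(allele_frequencies(*sample_2))
```

Both samples yield the same allele frequencies although their genotype counts differ, so a report giving allele frequencies alone could not be used to recover either genotype distribution.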
Methodological issues relating to haplotype analysis are still under development. In particular, in studies based on unrelated individuals, haplotypes can only be estimated probabilistically based on allele frequencies. If external estimates of haplotype frequencies in the population are used instead of estimating them within the study, inference may be affected by the quality and availability of the data on haplotype frequencies in the relevant population. As more single nucleotide polymorphism loci are identified, the number of possible haplotypes will become very large, in turn raising the issues of multiple comparisons and sparse data for many haplotypes (60–62). We recommend that, when haplotypes are used, the method of construction should be specified.
Calculation of risk difference (i.e., the risk of the disease in those with the genotype under investigation minus the risk of the disease in those without the genotype) in the context of gene-disease associations may be useful in that it measures the potential impact of the association in public health terms (33). However, the magnitude of the risk difference is less likely to be generalizable to other populations than is the relative risk (30), because it depends on the baseline risk in those without the genotype, which is likely to vary between populations.
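The dependence of the risk difference on baseline risk can be made concrete with a small sketch. The baseline risks and the relative risk below are hypothetical values chosen only for illustration.

```python
def risk_difference(baseline_risk, relative_risk):
    """Risk in those with the genotype minus risk in those without it."""
    risk_with_genotype = baseline_risk * relative_risk
    return risk_with_genotype - baseline_risk


# The same relative risk applied to two populations with different baseline risk.
rd_low_baseline = risk_difference(baseline_risk=0.01, relative_risk=2.0)
rd_high_baseline = risk_difference(baseline_risk=0.05, relative_risk=2.0)
print(rd_low_baseline, rd_high_baseline)
```

With an identical relative risk of 2, the risk difference is five times larger in the population with the higher baseline risk, which is why the risk difference transfers less readily between populations.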
A small study size is a limitation of many studies that test a priori hypotheses about gene-disease associations (e.g., 36, 63). A possible solution is pooled analysis (see below). One research strategy proposed for the future is large-scale testing by genome-wide association mapping (52, 60, 64, 65). It is important to note that this strategy is hypothesis generating rather than hypothesis testing, and thus it may require additional safeguards against type I error. For example, Risch and Merikangas (52) suggested specifying a more stringent significance level. However, a more stringent significance level will increase the number of subjects required to have adequate statistical power (52), which may significantly increase recruitment costs and make some studies unfeasible. An alternative approach is to emphasize replication of findings and to obtain data on biologic plausibility, for example, from in vitro studies. We recommend that all tests performed should be reported, as long as the tests have adequate statistical power, not just the "significant" ones. This would require reviewers and editors to give importance (and journal space) to negative results.
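The trade-off between the significance threshold and the number of subjects can be sketched with the usual normal-approximation formula for comparing two proportions. This is an illustration, not the calculation used by Risch and Merikangas; the carrier frequencies, the power, and both thresholds are assumed values, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def subjects_per_group(p_controls, p_cases, alpha, power=0.80):
    """Approximate subjects per group for a two-sided test of two proportions."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the chosen threshold
    z_beta = z.inv_cdf(power)
    variance = p_controls * (1 - p_controls) + p_cases * (1 - p_cases)
    n = (z_alpha + z_beta) ** 2 * variance / (p_cases - p_controls) ** 2
    return math.ceil(n)


# Assumed genotype frequencies: 20 percent in controls, 25 percent in cases.
n_conventional = subjects_per_group(0.20, 0.25, alpha=0.05)
n_stringent = subjects_per_group(0.20, 0.25, alpha=5e-8)  # illustrative strict threshold
print(n_conventional, n_stringent)
```

Under these assumptions, the stricter threshold requires several times as many subjects per group for the same power.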
Associations between several genes and a disease can be tested according to a priori hypotheses based, for example, on the biologic mechanism of these genes in determining the disease. It is recognized that it is becoming the usual practice in human genome epidemiology studies to initiate a study to test hypotheses that are current at that time and also to establish a resource to test additional hypotheses proposed later, on the basis of knowledge external to the resource. These are all a priori hypotheses. We reiterate the need to distinguish between hypothesis testing and hypothesis generation.
INTEGRATION OF EVIDENCE
Identification of studies
A comprehensive search is one of the key differences between a systematic review and a traditional review (68). We recommend that the details of the strategy used to identify relevant papers should be specified as described by Stroup et al. (69). There have been several instances of sequential or multiple publications of analyses of the same or overlapping data sets. An aid to identifying this problem is to organize evidence tables first by geographic area and then by study period within a specified area. If it is clear that the reports relate to the same or overlapping data sets, then we recommend including data only from the largest or most recent publication. It is possible that, under these circumstances, details of the methodology are described in greater detail in an earlier publication. If so, we recommend including the reference to the earlier publication with the reference to the publication from which the data were abstracted in the evidence tables.
Publication bias
Publication bias is potentially a serious problem for the integration of evidence. One method of minimizing the potential impact of publication bias is to identify and include "gray literature," which includes abstracts, technical reports, and non-English journals (70) that may not be identified by electronic searches. We recommend caution in using various types of "gray literature" because the material may not be peer reviewed and may be subject to modification and revision and because the information on study methods may be insufficient to assess study quality. We suggest that consideration should be given to including "gray literature" if the study quality can be assessed adequately.
A labor-intensive method to minimize publication bias is to establish a research register for studies of gene-disease association similar to the Cochrane Collaboration, which maintains a register of controlled trials (71), and the Directory of On-going Research in Cancer Prevention (72).
In other fields, quantitative and qualitative methods of detecting publication bias have been used, such as the fail-safe technique, in which the number of new studies averaging a null result that would be needed to bring the overall effect to nonsignificance is calculated (73, 74). After this is calculated, a judgment can be made as to whether it is realistic to assume that so many unpublished studies exist in the field of investigation. If the assumption were realistic, then there would be doubt about the validity of conclusions based on the published evidence. Other quantitative and qualitative methods have been reviewed by Sutton et al. (75) and Thornton and Lee (76). In general, all the methods have limitations. Therefore, it seems appropriate to take into account the fact that the evidence base may be skewed toward positive results in drawing conclusions about causal relations.
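The fail-safe calculation can be sketched as follows. This uses the combined-Z form commonly attributed to Rosenthal, which is an assumption about the variant intended by the cited references; the study Z scores are hypothetical and the function name is illustrative.

```python
from statistics import NormalDist


def fail_safe_n(z_scores, alpha=0.05):
    """Number of additional null (Z = 0) studies needed to make the combined
    one-tailed result nonsignificant at the given alpha."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    k = len(z_scores)
    # Combined Z is sum(Z) / sqrt(k + n); solve for n at combined Z = z_crit.
    n = (sum(z_scores) / z_crit) ** 2 - k
    return max(0.0, n)


# Hypothetical Z scores from four published studies.
published = [2.0, 2.5, 1.8, 3.0]
n_null = fail_safe_n(published)
print(round(n_null, 1))
```

The judgment described above then turns on whether that many unpublished null studies could plausibly exist in the field.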
Quality scoring
There are a number of publications concerning the rating of the quality of analytic observational studies. Several relate to case-control studies (30, 32, 77–81). Some (77, 78) are part of a series of articles documenting the deficiencies of epidemiologic research; they have been challenged on the grounds of technical errors, failure to distinguish important from unimportant biases, and ignoring the need to weigh the totality of the evidence about a relation (82, 83). Other issues include possible overemphasis of the potential problems of case-control studies as compared with cohort studies (80) and difficulty in assessing differences between methods applied in the case and control groups or between different exposure (prognostic) groups (81, 84).
A number of authors have proposed quantitative quality scoring systems for critical appraisal (84). Other schemes have been developed for the purposes of meta-analyses in which an attempt has been made to assess the importance of study quality in accounting for heterogeneity of results between studies (85–87). This type of assessment has also been considered for pooled analysis (88, 89). Certain features of the assessment schemes are specific to the disease and/or the exposure under consideration, and each aspect of the study is given equal weight. Thus, summation of points might result in worse quality scores for a study with several minor flaws than for a study with one major flaw. Although empirical studies on a large number of primary investigations might suggest an overall relation between a specific aspect of study design and the reported results, this relation is ecologic and may not be true for a specific investigation. Therefore, it is very difficult to isolate specific noncausal factors, which might affect the interpretation of a single investigation. Jüni et al. (90) observed that the use of scores to identify clinical trials of high quality is problematic, and they recommended that relevant methodological aspects should be assessed individually and their influence on the magnitude of the effect of the intervention explored. We recommend similar caution in consideration of studies of gene-disease associations. As in clinical trials, it may be more appropriate to consider multidimensional domains than a single grade in the integration of evidence from observational studies.
Currently, there is little or no empirical evaluation of the quality scoring of association studies. However, we recognize that many users of data on genotype prevalence and gene-disease associations need a robust means of grading evidence. We recommend following the approach of the Scottish Intercollegiate Guidelines Network (6). In this approach, studies of the gene-disease association in which all or most of the criteria specified in table 1 are satisfied would be graded as "++." Criteria that have not been fulfilled would not affect the grade if it were thought that the conclusions of the study would be very unlikely to be affected by their omission. Studies in which some of the criteria have been fulfilled, and those that were not fulfilled would be thought to be unlikely to alter the conclusions, would be graded as "+." Studies in which few or no criteria were fulfilled, and the conclusions of the study would be thought likely or very likely to be altered by multiple omissions in required criteria for an acceptable study, would be graded as "-." For studies of genotype prevalence, similar considerations would be applied.
QUALITATIVE SYNTHESIS
Causal inference
There are well-established criteria for causal inference (66, 67). In relation to the consistency of gene-disease associations, heterogeneity between studies is frequent (91). Consideration has to be given to methodological factors that might account for inconsistency. For example, in a meta-analysis of 370 studies that assessed 36 gene-disease associations, a small sample size of the first publication was a predictor of inconsistent results (91). In addition, it is important to consider that differences between studies in distributions of subjects by age and gender will be sources of heterogeneity. For example, hormonal alterations can affect ligand binding, enzyme activity, gene expression, and the metabolic pathways influenced by gene expression. Some inconsistency among the results of gene-disease association studies may be secondary to variation among studies in the prevalence of interacting environmental factors that have not been assessed (D. J. Hunter, Channing Laboratory, unpublished manuscript). It would be appropriate to test a priori hypotheses about differences in gene-disease associations and genotype frequencies between studies that may arise from these sources. We recommend that information on the age distribution of subjects be presented and that consideration be given to presenting data on gene-disease associations by gender. As noted in the section on analytic validity, contrasting results between studies in which genotyping was based on polymerase chain reaction methods and those in which it was based on phenotypic assays have been observed. However, there is a need to consider the possibility that these differences may have been due to reasons other than the genotyping method, such as selection or participation bias of cases and/or controls. In addition, a given gene may metabolize multiple environmental substrates, and thus phenotypic assays, using one chemical to induce the gene, may not truly reflect the metabolizing activity of that gene. It is also possible that other DNA variants may alter enzyme function or activity.
Regarding the strength of association, many of the genetic variants so far identified as influencing susceptibility to common diseases are associated with a low relative and absolute risk (92). Therefore, exclusion of noncausal explanations for associations is crucial.
Biologic plausibility is a particularly important issue. It is linked with consideration of 1) whether a known function of the gene product can be linked to the observed phenotype; 2) whether the gene is expressed in the tissue of interest; and 3) temporal relations, including the time window of gene expression in relation to age-specific gene-disease relations. Thus, the gene should be in the disease pathway and/or involved in the mechanism that is responsible for the development of the disease. If not, then the effect of the gene may be indirect. It may also be relevant to consider maternally mediated effects of the maternal genotype and parental imprinting. Case reports may provide clues that could not be obtained from epidemiologic designs. For example, evidence from a heteropaternal twin pair provided a lead to genetic differences in the metabolism of phenytoin that accounted for a lack of concordance for teratogenic effect (93).
In regard to temporality, it is possible that the disease could influence the result of a phenotypic assay of the genotype under investigation. This should not be a problem with polymerase chain reaction methods. Methods to analyze longitudinal phenotypes, such as changes in blood pressure or obesity (94) over time, are being developed. If data were available on the time window of gene expression, it would be relevant to consider this in relation to the age specificity of gene-disease relations.
Experimental support for a gene-disease association is most likely to be derived from studies of gene expression in knockout or other experimental animals, from in vitro data on gene function, or from experimental interactions based on clinical protocols aimed at normalizing the levels of a product regulated by the gene (e.g., with gene therapy in cystic fibrosis).
Quantitative synthesis
There are two types of quantitative synthesis of evidence: 1) meta-analysis of the results of studies and 2) pooled analysis of data on individual subjects obtained in several studies. There has been debate about the validity of meta-analysis of observational studies (69, 95). On the one hand, meta-analysis may indicate a "spurious precision," and it has been suggested that either meta-analysis of observational studies should be abandoned altogether (96) or the focus of attention should be the consideration of possible sources of heterogeneity between studies (91, 97). On the other hand, meta-analysis can help to clarify whether or not an association exists and to provide an indication of the quantitative relation between the dependent and independent variables (98). The indication of the quantitative relation, although potentially biased, may be of value in consideration of the public health effects of interventions based on knowledge of the genetic factor and/or its interactions.
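The role of heterogeneity in a meta-analysis of study results can be illustrated with a minimal sketch. The code below is not taken from the studies cited; it assumes that each study reports a 2 x 2 table of genotype by case-control status, combines the log odds ratios by inverse-variance weighting, and reports Cochran's Q, the I² statistic, and a DerSimonian-Laird random-effects estimate. The function names and the counts in `example_tables` are hypothetical.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Return (log OR, variance) for one study.

    a = cases with the variant, b = cases without it,
    c = controls with the variant, d = controls without it.
    All four cells are assumed to be nonzero.
    """
    log_or = math.log((a * d) / (b * c))
    variance = 1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d
    return log_or, variance


def meta_analysis(tables):
    """Inverse-variance meta-analysis of odds ratios with heterogeneity statistics."""
    effects = [log_odds_ratio(*t) for t in tables]
    y = [effect for effect, _ in effects]
    w = [1.0 / variance for _, variance in effects]
    sum_w = sum(w)

    # Fixed-effect summary: weighted mean of the log odds ratios.
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w

    # Cochran's Q measures the spread of study results around that summary.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(tables) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects summary: weights include the between-study variance.
    w_re = [1.0 / (variance + tau2) for _, variance in effects]
    random = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)

    return {
        "fixed_or": math.exp(fixed),
        "random_or": math.exp(random),
        "q": q,
        "df": df,
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical counts for three studies, for illustration only.
example_tables = [(30, 70, 20, 80), (45, 155, 30, 170), (12, 38, 15, 35)]
example_summary = meta_analysis(example_tables)
```

A large Q relative to its degrees of freedom, or a high I², suggests that a single summary odds ratio conveys more precision than the data support and that the sources of heterogeneity deserve attention in their own right.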
Pooled analysis requires data on individual subjects. This approach offers many advantages over the meta-analysis of the results of studies, including standardization of definitions of cases and variables, testing the assumptions of time-to-event models, better control of confounding, standardization of analyses of genetic loci that are in linkage disequilibrium, evaluation of alternative genetic models and multiple genes, consistent treatment of subpopulations, and assessment of sampling bias (88). For example, this approach has been used successfully to study the effect of chemokine and chemokine receptor alleles on human immunodeficiency virus type 1 disease progression (99). Nevertheless, pooling approaches require a much greater commitment of time and resources to collect primary data and to coordinate a large collaborative project (100). For questions that justify the required intensive effort, the pooling approach is a useful tool to help to clarify the role of candidate genes in complex human diseases (101).
When a high degree of accuracy of the measures of effect is required, we recommend that pooled analysis of data on individual subjects be done whenever possible in preference to meta-analysis of the results of studies. However, stratification by original study may still be important, to allow for and elucidate the causes of heterogeneity among the data sets being pooled.
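The importance of stratification by original study can be shown with a small worked sketch. It assumes the simplest possible setting, in which each study contributes a 2 x 2 table of genotype by case-control status, and it uses the Mantel-Haenszel estimator as one convenient stratified summary; the recommendation above does not prescribe a particular estimator, and the counts are hypothetical.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel odds ratio across strata (here, original studies).

    Each stratum is (a, b, c, d): cases with the variant, cases without it,
    controls with the variant, controls without it.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


def crude_or(strata):
    """Odds ratio after collapsing all strata into a single 2 x 2 table."""
    a = sum(s[0] for s in strata)
    b = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    d = sum(s[3] for s in strata)
    return (a * d) / (b * c)


# Hypothetical studies: the odds ratio is 2 within each study, but the
# variant frequency and the case-to-control ratio differ between them.
studies = [(40, 10, 20, 10), (10, 40, 10, 80)]

stratified_estimate = mantel_haenszel_or(studies)  # respects study membership
collapsed_estimate = crude_or(studies)             # ignores study membership
```

In this constructed example, the collapsed estimate is further from the null than the odds ratio observed within either study, because study membership is related both to variant frequency and to the proportion of cases. Retaining the study as a stratum avoids this distortion.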