*Department of Ecology and Evolution, University of Chicago;
Windber Research Institute, Windber, Pennsylvania
Abstract |
---|
Introduction |
---|
For similar reasons, it is also useful to determine the degree of rate variation among different regions of a genome. Although there has been general agreement that considerable variation in substitution rates exists among different regions of the mammalian genome (Filipski 1988; Wolfe, Sharp, and Li 1989; Bernardi, Mouchiroud, and Gautier 1993; Casane et al. 1997; Matassi, Sharp, and Gautier 1999; Lercher, Williams, and Hurst 2001; Ebersberger et al. 2002), some authors have proposed a uniform mutation rate along the mammalian genome (Kumar and Subramanian 2002).
Previous studies of the molecular clock hypothesis have generally had two drawbacks. First, estimates of nucleotide substitution rates often involved uncertain assumptions about divergence dates between taxa. For example, Kumar and Subramanian (2002) relied mainly on divergence dates estimated under the assumption of a global protein clock in vertebrates and the assumption of 310 MYA for the mammal-reptile split (e.g., Kumar and Hedges 1998) to examine mutation rate differences within and among genomes of diverse mammalian lineages. However, the assumption of a global protein clock in vertebrates is unlikely to be true because, for example, the rate of amino acid substitution has been shown to be much higher in the rodent lineage than in the primate lineage (Gu and Li 1992). Further, some of the fossil dates they used may be questionable; e.g., they assumed the same divergence date of 90 MYA for several pairs of mammalian orders. Second, the sequence data used in many studies were usually limited and often biased. For example, Kumar and Subramanian (2002) excluded all sequences that failed their disparity test; e.g., they excluded 41% of the available sequence data in the comparison between primates and rodents. In all their between-taxa comparisons, the excluded sequences have evolved, on average, considerably faster than the sequences retained in their analysis (see figure 2c of Kumar and Subramanian [2002]). This could be a major reason why their estimated rates are considerably lower than previous estimates.
In this article, we address the above issues in three ways. First, we show that the molecular clock in Old World monkeys (OWMs), humans, and apes (i.e., catarrhines) runs at a rate much lower than rates estimated for other mammals. For this purpose we take advantage of the large amount of genomic sequence data from human and chimpanzee and a recent discovery of an early hominid fossil. As questions regarding the rate of neutral substitutions can be adequately addressed only using selectively neutral sequences, and as repetitive sequences tend to evolve at different rates than the surrounding genomic regions, we use only noncoding, nonrepetitive sequences. Then we use the dating of a recently discovered hominid fossil (Brunet et al. 2002; Vignaud et al. 2002) as a calibration point. This fossil allows us to estimate a minimal divergence time for these two lineages, which in turn can be used to infer a "maximal" mutation rate for these lineages. In a similar manner, we obtain a maximal rate for the human and OWM lineages and show that it is much lower than the rates estimated for other mammals.
Second, we revisit the hominoid rate-slowdown hypothesis. This was first proposed on the basis of immunological data (Goodman 1961, 1962) and further debated using DNA sequence data (Bailey et al. 1991; Easteal 1991; Herbert and Easteal 1996; Li et al. 1996). In this study we have obtained new intron sequence data and have retrieved existing intron sequence data and fourfold degenerate site data from higher primates. Using the relative rate test (Sarich and Wilson 1967; Wu and Li 1985) with a New World monkey (NWM) species as the outgroup, we show that the rate of nucleotide substitution is significantly lower in the hominoid lineage than in the OWM lineage. This and the above results together reject the proposal of a global molecular clock in mammals.
Third, using new and existing intron sequence data we show that the rate of nucleotide substitution is not uniform among genomic regions. This result corroborates previous observations (Filipski 1988; Wolfe, Sharp, and Li 1989; Casane et al. 1997; Matassi, Sharp, and Gautier 1999; Lercher, Williams, and Hurst 2001; Ebersberger et al. 2002). Further, using the same data set, we show that the rate heterogeneity is in part due to variation in GC content.
Materials and Methods |
---|
In addition, 75 regions including exons and introns were retrieved from GenBank for at least one OW monkey species and at least one NW monkey species. From these data we excluded sequences from the X and Y chromosomes because the rate of molecular evolution on the sex chromosomes is also affected by other evolutionary factors (e.g., Bachtrog and Charlesworth 2002; Makova and Li 2002). After exclusion, 27 intron sequences from various genomic regions were available for alignment. RepeatMasker was used to exclude repeats before alignment. Manual adjustments were made, if necessary, to finalize each alignment.
Retrieval of Primate Fourfold Degenerate Sites
We extracted protein-coding exon sequences from 41 protein-coding regions that fit the above criteria in the sequences retrieved from GenBank. Amino acid sequences were then produced from the exon-only DNA sequences and aligned with those from other species using the Clustal option of MegAlign (part of the DNASTAR package). Sequence lengths were trimmed to obtain a complete alignment set for the three species. We then back-translated the amino acid sequences to produce DNA sequence alignments. Only fourfold degenerate sites were used for further analyses.
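As an illustration of the last step, the following minimal sketch extracts fourfold degenerate sites from a codon alignment. It assumes the standard genetic code and a conservative criterion that is our assumption, not a documented detail of the original procedure: a codon column is kept only when all sequences share the same fourfold degenerate codon family. The function name `fourfold_sites` is hypothetical.

```python
# First-two-base prefixes of the eight codon families whose third position
# is fourfold degenerate under the standard genetic code
# (Leu CTN, Val GTN, Ser TCN, Pro CCN, Thr ACN, Ala GCN, Arg CGN, Gly GGN).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def fourfold_sites(aligned_cds):
    """Return, for each sequence, its third-position bases at fourfold sites.

    aligned_cds: list of in-frame coding sequences of equal length
    (a codon alignment). A codon column is kept only if every sequence
    has the same fourfold prefix and an unambiguous third base.
    """
    length = len(aligned_cds[0])
    if any(len(s) != length for s in aligned_cds) or length % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and equal length")
    kept = ["" for _ in aligned_cds]
    for i in range(0, length, 3):
        codons = [s[i:i + 3].upper() for s in aligned_cds]
        prefixes = {c[:2] for c in codons}
        same_family = len(prefixes) == 1 and prefixes <= FOURFOLD_PREFIXES
        if same_family and all(c[2] in "ACGT" for c in codons):
            kept = [k + c[2] for k, c in zip(kept, codons)]
    return kept
```

For a toy three-species alignment such as `["GCTAAACCG", "GCCAAGCCA", "GCAAAACCT"]`, only the first and third codon columns (Ala and Pro) contribute sites; the Lys codons in the middle are skipped.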
Statistical Analysis
The number of nucleotide substitutions per site (K) between two sequences was estimated using Kimura's two-parameter method (Kimura 1980) and the Tamura-Nei method (Tamura and Nei 1993) available in the DAMBE program (Xia and Xie 2001). Because these methods provided essentially the same results for all sequence comparisons, only the results obtained by Kimura's two-parameter method are shown. The relative rate tests were performed using Wu and Li's method (Wu and Li 1985) only on introns longer than 100 bp.
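For readers unfamiliar with Kimura's two-parameter estimator, a minimal sketch is given below. It is not the DAMBE implementation; it simply applies the published formula K = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. The function name `k2p_distance` and the choice to skip gapped or ambiguous columns are our assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura (1980) two-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases (an assumption)
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

At the low divergences considered here the correction is small: one transition in ten sites (P = 0.1) gives K of about 0.112 rather than 0.100.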
The disparity index test (Kumar and Gadagkar 2001) was done for each set of orthologous sequences using MEGA2 (Kumar et al. 2001). We used a stringent criterion such that if the disparity index test failed in any of the three pairwise comparisons (Homo-OW monkey, OW-NW monkeys, Homo-NW monkeys), then we considered that the set of three sequences has failed the disparity index test. We also performed a G-test for goodness of fit (Sokal and Rohlf 1995, p. 738) to a uniform mutation rate model for the observed number of substitutions between human and NW monkeys. For each gene, we chose the longest intron sequence, which had to be longer than 250 bp after excluding repeats. A total of 23 introns were thus selected. The same data set was used to investigate the correlation between the GC content and the evolutionary rate between human and a NW monkey species.
Results and Discussion |
---|
First, we use the huge amount of sequence data from the human and chimpanzee genomes to obtain a reliable estimate of the mean sequence divergence between these two lineages. Beginning with $2.73 \times 10^{6}$ bp of homologous Pan and Homo genomic contigs available in GenBank, we extracted and aligned approximately 875 kb of noncoding intergenic regions between human and chimpanzee (table 1). This was after excluding all the predicted and known coding exons, introns, and regulatory regions. We also excluded repetitive sequences. The sequence divergences in aligned regions between human and chimpanzee are listed in table 1.
The degree of sequence divergence differs considerably among regions (0.67%-2.91%, table 1). This has been observed earlier in two different studies (Chen and Li 2001; Ebersberger et al. 2002). It may reflect statistical fluctuations and variation among genomic regions in the degree of preexisting polymorphism in the common ancestral population of humans and chimpanzees. The average divergence from these regions, excluding repetitive sequences, is 1.19% (table 1). This estimate is almost the same as previously reported (1.24%) based on 53 short segments of noncoding intergenic regions and many long contigs (Chen and Li 2001; Chen et al. 2001) and almost 2 million bases of randomly scattered sequences in the entire genome (Ebersberger et al. 2002).
Next we take advantage of the recent discovery of a hominid fossil, Sahelanthropus tchadensis (Brunet et al. 2002), which was dated to 6-7 MYA (Vignaud et al. 2002). This is even older than the earliest previously known fossil hominid, Ardipithecus ramidus kadabba, dated to 5.2-5.8 MYA (Haile-Selassie 2001). As the Sahelanthropus fossil is hominid, it gives a minimal date of 6 MYA for the divergence between human and chimpanzee. (Note that as long as Sahelanthropus was a hominid, the fossil would have postdated the human-chimpanzee divergence, even if it was not a direct ancestor of Homo sapiens.) We therefore obtain a maximal rate of $0.99 \times 10^{-9}$ substitutions per site per year for the human and chimpanzee lineages. The actual human-chimp divergence should be older than the Sahelanthropus hominid fossil, which can be as old as 7 MYA (Vignaud et al. 2002). If we assume that the date for the human-chimp split is 7.5 MYA, then the average substitution rate in noncoding, nonrepetitive regions becomes $0.79 \times 10^{-9}$ substitutions per site per year. This result may still overestimate the substitution rate because it does not exclude the effect of preexisting polymorphism in the common ancestor of human and chimpanzee. We understand that fossil dates may not be reliable, but even if we assume 5 MYA for the human-chimpanzee divergence, which is a minimum date commonly used in the literature, we still obtain a rate of only $1.19 \times 10^{-9}$ substitutions per site per year.
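The arithmetic behind these three rates can be sketched as follows, assuming that the average rate along each lineage is the pairwise divergence divided by twice the divergence time (both lineages accumulate substitutions since the split). The function name `substitution_rate` is illustrative and does not come from the original analysis.

```python
def substitution_rate(divergence, divergence_time_my):
    """Average substitutions per site per year along each of two lineages.

    divergence: pairwise substitutions per site (e.g., 0.0119 for 1.19%).
    divergence_time_my: assumed time since the split, in millions of years.
    """
    years = divergence_time_my * 1e6
    return divergence / (2 * years)


K_HUMAN_CHIMP = 0.0119  # average human-chimpanzee divergence (table 1)

for t_my in (5.0, 6.0, 7.5):
    rate = substitution_rate(K_HUMAN_CHIMP, t_my)
    print(f"T = {t_my} MYA -> {rate:.2e} substitutions per site per year")
```

Because the rate is inversely proportional to the assumed divergence time, a minimal date from the fossil record yields a maximal rate.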
The above values are less than half the "global rate" proposed by Kumar and Subramanian (2002) and much lower than estimates obtained from other mammalian lineages. For example, the maximum-likelihood estimate of the number of synonymous substitutions per site along the Artiodactyl lineage is 0.327 (Bielawski, Dunn, and Yang 2000), which leads to a rate of approximately $3.5 \times 10^{-9}$ substitutions per site per year, using the divergence time estimates of 90-95 MYA by Kumar and Hedges (1998). The average sequence divergence between mouse and rat estimated from fourfold degenerate sites is approximately 18% (Smith and Hurst 1999). The divergence time between mouse and rat is controversial, ranging from 12 MYA based on the fossil record (Flynn, Jacobs, and Lindsay 1985) to over 40 MYA based on a global vertebrate molecular clock (Kumar and Hedges 1998). Even if we use the range of 20 to 40 MYA, the substitution rate range is $2.5$-$5.0 \times 10^{-9}$. Therefore, the substitution rate in humans and chimpanzees is exceedingly low, which contradicts the proposal of a global clock in eutherians (Kumar and Subramanian 2002).
Slow Molecular Clock in Old World Primates
We also estimate the evolutionary rates in the human and OWM lineages for introns and fourfold degenerate sites. Using new and existing sequence data, we estimate a divergence of 6.85% for introns (table 2). We also divide the introns used into those that failed the disparity test and those that passed the test. Note that, as expected, the average divergence (7.05%) for the first group of introns is higher than that (6.77%) for the second group of introns (table 2). In the following calculations we shall include all introns because we are interested in obtaining a maximal estimate of the substitution rate. We also obtain a divergence of 7.17% for fourfold degenerate sites (data not shown) between the Homo and OWM lineages. The simple average of the estimates from introns and fourfold degenerate sites is 7.01%. As the earliest fossil record of true hominoids is represented by the genus Proconsul, which lived in Africa 23-27 MYA during the Early Miocene (Delson 2000, pp. 595-597), the human-OWM divergence would have predated this fossil. If we assume 23 MYA, a commonly used fossil calibration point for the human-cercopithecoid divergence (Goodman et al. 1998), the average rate of nucleotide substitution for the two lineages is approximately $1.5 \times 10^{-9}$. If we assume a divergence date of 30 MYA, the average rate becomes only $1.17 \times 10^{-9}$. Clearly, these estimates are much lower than the values cited above for other mammals. Note that our estimated rates for humans, apes, and OW monkeys are inflated to some extent because we did not exclude those sequences that failed the disparity test, as was done by Kumar and Subramanian (2002). Therefore, our results strongly contradict the proposal of a global clock in mammals.
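Written out, the calculation for the human-OWM comparison is as follows. The symbols $\bar{K}$ (mean divergence), $T$ (assumed divergence time), and $r$ (average rate per lineage) are notation introduced here for illustration, under the assumption that both lineages contribute equally to the observed divergence.

```latex
\begin{align*}
\bar{K} &= \frac{6.85\% + 7.17\%}{2} = 7.01\% \\
r &= \frac{\bar{K}}{2T} \\
r_{T = 23\,\mathrm{MY}} &= \frac{0.0701}{2 \times 23 \times 10^{6}} \approx 1.52 \times 10^{-9} \\
r_{T = 30\,\mathrm{MY}} &= \frac{0.0701}{2 \times 30 \times 10^{6}} \approx 1.17 \times 10^{-9}
\end{align*}
```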
In the majority of cases, the rate of nucleotide substitution is faster in the OWM lineage (table 2). When all the sites are combined the rate of nucleotide substitution in the OWM lineage is significantly faster (P < 0.001). The average ratio of the substitution rate in the OWM lineage to that in the Homo lineage is 1.33, which implies a 33% faster rate in the OWM lineage. When we restrict our analyses to the introns that passed the disparity index test, we reach the same conclusion (table 2). In fact, the rate difference between the Homo and OWM lineages is more pronounced in the introns that passed the disparity index test (table 2).
The above result supports the hominoid rate-slowdown hypothesis, which postulates that the rate of molecular evolution has become slower in hominoids (humans and apes) after their separation from the OW monkeys (Goodman 1961, 1962). The same trend has been observed earlier (Bailey et al. 1991; Seino, Bell, and Li 1992; Li et al. 1996) but has been challenged by Easteal (1991) and Herbert and Easteal (1996). Although Herbert and Easteal (1996) did find an approximately 30% rate increase in the OWM lineage in the noncoding regions they studied, they doubted the universality of this trend because most of the data (83%) were from the β-globin gene region (including the -globin gene). In our analysis, many different genomic regions were analyzed and the -globin region was not included (table 2). Moreover, we obtained the same result (K13 - K23 = 0.9% ± 0.27%) when we excluded all the sequences from the β-globin gene region. This suggests the generality of our observation. Kumar and Subramanian (2002) reported a 10.9% increase in the evolutionary rate in the OWM lineage compared with the human lineage. However, their result was obtained from a much smaller data set, less than 4,000 fourfold degenerate sites from 23 genes (Kumar and Subramanian 2002). In any case, the average rate difference implied by their estimate and ours is likely to be somewhere between 20% and 30%.
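To show how a relative rate difference such as K13 - K23 = 0.9% ± 0.27% translates into a significance level, the sketch below applies a normal approximation. It assumes that the quoted ± value is the standard error of the difference and that lineage 3 is the NWM outgroup; it does not reproduce Wu and Li's variance calculation, in which the variance of the difference is Var(K13) + Var(K23) - 2 Cov(K13, K23). The function name is hypothetical.

```python
import math


def rate_difference_test(difference, std_error):
    """Z statistic and two-sided P value for a relative rate difference.

    difference: K13 - K23, the excess divergence of lineage 1 over
    lineage 2, each measured against the same outgroup (lineage 3).
    std_error: standard error of that difference (assumed given).
    """
    z = difference / std_error
    # Two-sided tail probability of a standard normal variable.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


z, p = rate_difference_test(0.009, 0.0027)
print(f"z = {z:.2f}, two-sided P = {p:.4f}")
```

Under these assumptions the difference is a little more than three standard errors from zero, which is consistent with a rejection of rate equality at the 0.1% level.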
The basis of the hominoid rate-slowdown hypothesis lies in the notion that the molecular clock should run faster in organisms with a short generation time, for they go through many more generations per unit time than do organisms with a long generation time (Ohta 1993; Li et al. 1996). This entails that replication-dependent mutational errors, such as errors in DNA replication in germ cells, are the major source of the mutations responsible for lineage effects. This view is strengthened by other observations. For example, the male-to-female ratios of germ cell division in mice and primates roughly correspond to the estimated male-to-female ratios of the mutation rate (Li et al. 1996). Also, when compared with autosomes, the X-linked sequences usually show the lowest sequence divergence between species (Smith and Hurst 1999; Lercher, Williams, and Hurst 2001; Castresana 2002; Ebersberger et al. 2002). The generation-time effect hypothesis is also supported by the observation that the rate of substitution in the chloroplast genome is more than five times higher in grasses than in palms (Gaut et al. 1992). Nevertheless, the large variation in substitution rate among autosomal regions suggests that replication-independent factors may also be important sources of mutation (Lercher, Williams, and Hurst 2001; Castresana 2002; Ebersberger et al. 2002).
Rate Variation Among Genomic Regions
Next let us examine the proposal of a uniform mutation rate along the mammalian genome. For this purpose, we use the human and NW monkey lineages because they show a moderate degree of sequence divergence (14%, table 2) and the sequences can be reliably aligned. We choose introns longer than 250 bp after exclusion of repeats. This is to minimize selective constraints due to splicing signals and other functional motifs. Furthermore, only one intron (the longest one) from a given gene is examined; the purpose is to avoid the correlation of data from the same region, which is an important consideration when we compute the correlation between substitution rate and GC content. In total we include 23 introns from 17 chromosomes, with varying GC contents. We test a simple model of independence between genomic regions by applying the G-test statistic of heterogeneity to the counts of substituted and conserved sites in different introns (Sokal and Rohlf 1995, p. 738). The test rejects the model (G2 = 63.6, df = 22, P < 0.001), implying that the mutation rate differs significantly among genomic regions.
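A minimal sketch of such a heterogeneity test is shown below. It computes the log-likelihood-ratio statistic G = 2 Σ O ln(O/E) for a 2 × k table of substituted and conserved site counts, with k - 1 degrees of freedom. The function name and the example counts are hypothetical and are not taken from the data set analyzed here.

```python
import math


def g_test_heterogeneity(substituted, conserved):
    """G statistic and degrees of freedom for a 2 x k contingency table.

    substituted, conserved: per-region counts of substituted and
    conserved sites. Under homogeneity, every region shares the same
    expected proportion of substituted sites.
    """
    if len(substituted) != len(conserved) or len(substituted) < 2:
        raise ValueError("need matching counts for at least two regions")
    total_sub = sum(substituted)
    total_con = sum(conserved)
    grand_total = total_sub + total_con
    g = 0.0
    for s, c in zip(substituted, conserved):
        region_total = s + c
        for observed, column_total in ((s, total_sub), (c, total_con)):
            expected = region_total * column_total / grand_total
            if observed > 0:  # a zero cell contributes nothing to the sum
                g += observed * math.log(observed / expected)
    return 2.0 * g, len(substituted) - 1


# Hypothetical counts for two regions of 100 sites each.
g, df = g_test_heterogeneity([10, 30], [90, 70])
print(f"G = {g:.2f}, df = {df}")
```

With 23 introns the statistic has 22 degrees of freedom, for which the 0.001 critical value of the chi-square distribution is about 48.3; the reported value of 63.6 exceeds it.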
We now determine whether the substitution rate in an intron is correlated with its GC content. Because we have no knowledge of the ancestral GC contents, we take the average of the GC contents of the introns from both species. We find a significant correlation (correlation coefficient = 0.44; fig. 1). When we use only the human or only the NWM species GC content, the correlation coefficient varies only slightly (0.42 and 0.45, respectively), as expected from the relatively small divergence between the two taxa.
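The calculation can be sketched as follows, assuming a Pearson product-moment correlation (the type of coefficient is not stated in the surviving text, so this is an assumption) and a t statistic with n - 2 degrees of freedom for its significance. All function names are illustrative.

```python
import math


def gc_content(seq):
    """Fraction of G and C among unambiguous bases of a sequence."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("no unambiguous bases")
    return (seq.count("G") + seq.count("C")) / counted


def mean_gc(seq_a, seq_b):
    """Average GC content of two orthologous sequences."""
    return (gc_content(seq_a) + gc_content(seq_b)) / 2


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_t(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


print(f"t = {correlation_t(0.44, 23):.2f} with 21 degrees of freedom")
```

Under the Pearson assumption, a coefficient of 0.44 from 23 introns gives t of roughly 2.2, which exceeds the two-sided 5% critical value of about 2.08 for 21 degrees of freedom.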
Concluding Remarks |
---|
Acknowledgements |
---|
Footnotes |
---|
Keywords: molecular clocks, mutation rate, hominoid rate-slowdown, Old World monkeys, humans, apes
Address for correspondence and reprints: Wen-Hsiung Li, Department of Ecology and Evolution, University of Chicago, 1101 East 57th Street, Chicago, Illinois 60637. E-mail: whli{at}uchicago.edu.
References |
---|
Bachtrog D., B. Charlesworth, 2002 Reduced adaptation of a non-recombining neo-Y chromosome Nature 416:323-326
Bailey W. J., D. H. A. Fitch, D. A. Tagle, J. Czelusniak, J. L. Slightom, M. Goodman, 1991 Molecular evolution of the -globin gene locus: Gibbon phylogeny and the hominoid slowdown Mol. Biol. Evol 8:155-184
Bernardi G., D. Mouchiroud, C. Gautier, 1993 Silent substitutions in mammalian genomes and their evolutionary implications J. Mol. Evol 37:583-589
Bielawski J. P., K. A. Dunn, Z. Yang, 2000 Rates of nucleotide substitution and mammalian nuclear gene evolution: approximate and maximum-likelihood methods lead to different conclusions Genetics 156:1299-1308
Brunet M., F. Guy, D. Pilbeam, et al. (38 co-authors). 2002 A new hominid from the upper Miocene of Chad, Central Africa Nature 418:145-151
Casane D., S. Boissinot, B. H.-J. Chang, L. C. Shimmin, W.-H. Li, 1997 Mutation pattern variation among regions of the primate genome J. Mol. Evol 45:216-226
Castresana J., 2002 Genes on human chromosome 19 show extreme divergence from the mouse orthologs and a high GC content Nucleic Acids Res 8:1751-1756
Chen F.-C., W.-H. Li, 2001 Genomic divergence between humans and other hominoids and the effective population size of the common ancestor of humans and chimpanzees Am. J. Hum. Genet 68:444-456
Chen F.-C., E. J. Vallender, H. Wang, C. S. Tzeng, W.-H. Li, 2001 Genomic divergence between human and chimpanzee estimated from large-scale alignments of genomic sequences J. Hered 92:481-489
Delson E., 2000 Encyclopedia of human evolution and prehistory Garland Publishing Co., New York
Easteal S., 1991 The relative rate of cDNA evolution in primates Mol. Biol. Evol 8:115-127
Ebersberger I., D. Metzler, C. Schwarz, S. Pääbo, 2002 Genomewide comparison of DNA sequences between humans and chimpanzees Am. J. Hum. Genet 70:1490-1497
Filipski J., 1988 Why the rate of silent codon substitutions is variable within a vertebrate genome J. Theor. Biol 34:159-164
Flynn L. J., L. L. Jacobs, E. H. Lindsay, 1985 Problems in muroid phylogeny: relationships to other rodents and origin of major groups Pp. 589-616 in W. P. Luckett and J.-L. Hartenberger, eds. Evolutionary relationships among rodents. Plenum, New York
Gaut B. S., S. V. Muse, W. D. Clark, M. T. Clegg, 1992 Relative rates of nucleotide substitution at the rbcL locus of monocotyledonous plants J. Mol. Evol 35:292-303
Goodman M., 1961 The role of immunochemical differences in the phyletic development of human behavior Hum. Biol 33:131-162
Goodman M., 1962 Evolution of the immunologic species specificity of human serum proteins Hum. Biol 34:104-150
Goodman M. C., C. A. Porter, J. Czelusniak, S. L. Page, H. Schneider, J. Shoshani, G. Gunnell, C. P. Groves, 1998 Toward a phylogenetic classification of primates based on DNA evidence complemented by fossil evidence Mol. Phyl. Evol 9:585-598
Gu X., W.-H. Li, 1992 Higher rates of amino acid substitution in rodents than in humans Mol. Phylogenet. Evol 1:211-214
Haile-Selassie Y., 2001 Late miocene hominids from the middle Awash, Ethiopia Nature 412:178-181
Herbert G., S. Easteal, 1996 Relative rates of nuclear DNA evolution in human and old world monkey lineages Mol. Biol. Evol 13:1054-1057
Kimura M., 1968 Evolutionary rate at the molecular level Nature 217:624-626
Kimura M., 1980 A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences J. Mol. Evol 16:111-120
Kumar S., B. Hedges, 1998 A molecular timescale for vertebrate evolution Nature 392:917-920
Kumar S., S. R. Gadagkar, 2001 Disparity index: a simple statistic to measure and test the homogeneity of substitution patterns between molecular sequences Genetics 158:1321-1327
Kumar S., S. Subramanian, 2002 Mutation rates in mammalian genomes Proc. Natl. Acad. Sci. USA 99:803-808
Kumar S., K. Tamura, I. B. Jakobsen, M. Nei, 2001 MEGA2: molecular evolutionary genetics analysis software Bioinformatics 17:1244-1245
Lercher M. J., E. J. Williams, L. D. Hurst, 2001 Local similarity in evolutionary rates extends over whole chromosomes in human-rodent and mouse-rat comparisons: implications for understanding the mechanistic basis of the male mutation bias Mol. Biol. Evol 18:2032-2039
Li W.-H., 1997 Molecular evolution Sinauer Associates, Sunderland, Mass
Li W.-H., D. L. Ellsworth, J. Krushkal, B. H.-J. Chang, D. Hewett-Emmett, 1996 Rates of nucleotide substitution in primates and rodents and the generation-time effect hypothesis Mol. Phyl. Evol 5:182-187
Makova K. D., W.-H. Li, 2002 Strong male-driven evolution of DNA sequences in humans and apes Nature 416:624-626
Matassi G., P. M. Sharp, C. Gautier, 1999 Chromosomal location effects on gene sequences evolution in mammals Curr. Biol 9:786-791
Nei M., S. Kumar, 2000 Molecular evolution and phylogenetics Oxford University Press, New York
Ohta T., 1993 An examination of the generation-time effect on molecular evolution Proc. Natl. Acad. Sci. USA 90:10676-10680
Sarich V. M., A. C. Wilson, 1967 Immunological time scale for hominoid evolution Science 158:1200-1203
Seino S., G. I. Bell, W.-H. Li, 1992 Sequences of primate insulin genes support the hypothesis of a slower rate of molecular evolution in humans and apes than in monkeys Mol. Biol. Evol 9:193-203
Smith N. G. C., L. D. Hurst, 1999 The causes of synonymous rate variation in the rodent genome: can substitution rates be used to estimate the sex bias in mutation rate? Genetics 152:661-673
Sokal R. R., F. J. Rohlf, 1995 Biometry: the principles and practice of statistics in biological research 3rd edition. W. H. Freeman and Company, New York
Tamura K., M. Nei, 1993 Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees Mol. Biol. Evol 10:512-526
Vignaud P., P. Duringer, H. T. Mackaye, et al. (21 co-authors). 2002 Geology and palaeontology of the upper miocene Toros-Menalla hominid locality, Chad Nature 418:152-155
Wolfe K. H., P. M. Sharp, W.-H. Li, 1989 Mutation rates differ among regions of the mammalian genome Nature 337:283-285
Wu C.-I., W.-H. Li, 1985 Evidence for higher rates of nucleotide substitution in rodents than in man Proc. Natl. Acad. Sci. USA 82:1741-1745
Xia X., Z. Xie, 2001 DAMBE: data analysis in molecular biology and evolution J. Hered 92:371-373
Zuckerkandl E., L. Pauling, 1965 Evolutionary divergence and convergence in proteins Pp. 357-417 in V. Bryson and H. J. Vogel, eds. Evolving genes and proteins. Academic Press, New York