* Center for Evolutionary Functional Genomics, Arizona Biodesign Institute, and School of Life Sciences, Arizona State University
Department of Biological Sciences, Tokyo Metropolitan University, Tokyo, Japan
Correspondence: E-mail: s.kumar{at}asu.edu.
Abstract
Key Words: mutation rate, Drosophila, speciation, molecular evolution, molecular clock, evolutionary distance estimation
Introduction
Furthermore, codon usage biases are known to vary significantly among lineages (Rodriguez-Trelles et al. 1999), which leads to variation in synonymous substitution rates among lineages. This poses severe problems when inferring molecular time scales of fruit fly evolution. For instance, the Hawaiian Drosophila species, whose divergence time is considered to be the best calibration point for molecular clocks, show much lower codon usage biases than D. melanogaster (Rodriguez-Trelles et al. 2000); the D. melanogaster Adh gene sequence shows a 36% higher codon adaptation index (CAI; Sharp and Li 1987) than the Hawaiian D. picticornis Adh gene sequence. The same holds when we compute codon usage statistics that are independent of knowledge of the optimal codons, such as the effective number of codons (Wright 1990); D. picticornis has a much higher effective number of codons (47.1) than D. melanogaster (36.1), and a higher effective number indicates weaker bias. This difference in codon usage biases is also reflected in the rejection of the homogeneity of substitution patterns in the third codon positions at the 1% level when the disparity index test (Kumar and Gadagkar 2001) is used.
Because evolutionary divergences now used for building fruit fly molecular clocks directly employ the actual amount of synonymous or nonsynonymous change that has been permitted in sequence evolution, the intrinsic non-clock-like behavior of substitution accumulation has repeatedly impeded those efforts (Thomas and Hunt 1993; Russo, Takezaki, and Nei 1995; Rodriguez-Trelles, Tarrio, and Ayala 2001a, b). Therefore, to build molecular time scales, we need to estimate distances that are independent of the effects of codon usage bias and selection. Here we present a method for estimating mutation distance based on analysis of multiple genes (we refer to this as genomic mutation distance) and use it for inferring timing of major fruit fly speciation events.
Methods
Distance Estimation
Synonymous Distances
We used the number of nucleotide substitutions per fourfold-degenerate site as the measure of synonymous distance to avoid estimation biases from approximations needed to separate synonymous and nonsynonymous sites (Dunn, Bielawski, and Z. Yang 2001; Kumar and Subramanian 2002). A third codon position was considered fourfold degenerate only if it was fourfold degenerate in both sequences compared. To reduce estimation errors, only sequence pairs with 50 or more fourfold-degenerate sites were included. This produced 6,085 sequence pairs from 176 genes. Of these, only the 2,977 pairs involved in major divergence events in the lineage leading to D. melanogaster were used. The Tamura-Nei method (Tamura and Nei 1993) was used to correct for multiple hits in order to account for transition/transversion rate and base-composition biases. The disparity index test (Kumar and Gadagkar 2001) revealed significant base composition differences among lineages: the stationarity of the substitution pattern was rejected in 35% of pairwise comparisons at the 5% level, confirming significant differences in codon usage among lineages. Therefore, we used the modified Tamura-Nei method (Tamura and Kumar 2002) to account for substitution pattern heterogeneity in fourfold-degenerate sites among lineages.
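The site-selection rule described above can be sketched in Python as follows. This is a minimal illustration and not the authors' pipeline: the function names and the example threshold argument are hypothetical, the standard genetic code is assumed when listing the eight codon families with a fourfold-degenerate third position, and an uncorrected proportion of differences stands in for the modified Tamura-Nei correction used in the paper.

```python
# Codon prefixes whose third position is fourfold degenerate under the
# standard genetic code (Leu CTN, Val GTN, Ser TCN, Pro CCN, Thr ACN,
# Ala GCN, Arg CGN, Gly GGN). Assumption: standard code applies.
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}
MIN_SITES = 50  # inclusion threshold stated in the text


def fourfold_site_pairs(seq_a, seq_b):
    """Collect third-position base pairs from codons that are fourfold
    degenerate in BOTH aligned, in-frame coding sequences."""
    pairs = []
    usable_len = min(len(seq_a), len(seq_b))
    for i in range(0, usable_len - 2, 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        both_fourfold = (codon_a[:2] in FOURFOLD_PREFIXES
                         and codon_b[:2] in FOURFOLD_PREFIXES)
        # Skip gaps and ambiguous bases at the third position.
        if both_fourfold and codon_a[2] in "ACGT" and codon_b[2] in "ACGT":
            pairs.append((codon_a[2], codon_b[2]))
    return pairs


def uncorrected_fourfold_distance(seq_a, seq_b, min_sites=MIN_SITES):
    """Return the proportion of differing fourfold-degenerate sites, or None
    when the pair has fewer than `min_sites` such sites and is excluded."""
    pairs = fourfold_site_pairs(seq_a, seq_b)
    if len(pairs) < min_sites:
        return None
    differences = sum(1 for a, b in pairs if a != b)
    return differences / len(pairs)
```

A multiple-hit correction would be applied to the retained site pairs in place of the raw proportion; the sketch stops at the filtering step because that is the part the text specifies completely.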
Genomic Mutation Distances
For computing genomic mutation distances, we express the relationship between synonymous distance (dSi) and the mutation distance (dµi) for a given gene i by
Results
In the estimation of divergence times of the ananassae subgroup from the melanogaster subgroup, we found that the average of uncorrected synonymous distances was much higher than that for the obscura-melanogaster comparison. This is unexpected because the species in the ananassae subgroup are considered to be more closely related to D. melanogaster than those in the obscura group. In fact, in the ananassae-melanogaster sequence comparisons, the observed heterogeneity of substitution pattern was higher than that expected in >99% of Monte Carlo replicates of the disparity index test (Kumar and Gadagkar 2001) for 7 of 10 genes. For the other 3 genes, this percentage was >80%. This extreme substitution pattern heterogeneity in the ananassae lineage might be a reason for the high synonymous distance estimates. However, three genes were shared by species belonging to the ananassae subgroup, the melanogaster subgroup, and the obscura group. In those genes, the mutation distance for the ananassae-melanogaster comparison was always lower than that for the obscura-melanogaster comparison. This allowed us to use those genes to estimate the ananassae-melanogaster divergence time by using the obscura group as an outgroup. For each gene, we divided the mutation distance for the ananassae-melanogaster divergence by the mutation distance for the obscura-melanogaster divergence and multiplied the ratio by the obscura-melanogaster divergence time to obtain the ananassae-melanogaster divergence time. Finally, the three gene-specific estimates were averaged to reduce the effect of gene sampling errors. This estimate will need to be refined in the future as more sequences become available.
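The per-gene scaling and averaging described above amounts to a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and any distances or calibration time passed in are placeholders chosen by the caller, since the gene-specific values are not given in this section.

```python
def outgroup_scaled_times(ingroup_distances, outgroup_distances, outgroup_time):
    """For each gene, scale the known outgroup divergence time by the ratio
    of the ingroup mutation distance to the outgroup mutation distance."""
    if len(ingroup_distances) != len(outgroup_distances):
        raise ValueError("one ingroup and one outgroup distance per gene")
    times = []
    for d_in, d_out in zip(ingroup_distances, outgroup_distances):
        if d_out <= 0:
            raise ValueError("outgroup distance must be positive")
        times.append(d_in / d_out * outgroup_time)
    return times


def mean_time(times):
    """Average the gene-specific estimates to damp gene sampling error."""
    return sum(times) / len(times)
```

For example, calling `outgroup_scaled_times([0.30, 0.36, 0.33], [0.50, 0.60, 0.55], 55.0)` with these made-up distances and a placeholder calibration of 55 Myr gives three gene-specific times that `mean_time` then averages.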
Divergence Times in Fruit Fly Evolution
Figure 4 shows the species divergence times for taxon-pairs with multiple genes (364 genes); 2,977 pairwise sequence comparisons from 176 genes were used in this analysis. The estimated molecular time scales for major divergence events are in good accordance with the inferences from biogeography and fossil records. The divergence of melanogaster + simulans from orena + erecta and from yakuba + teissieri (12.6-12.8 MYA) is in agreement with previous inferences based on biogeography (Lachaise et al. 1988). A fossil of a Scaptomyza species was found in Dominican amber with a minimal age of 23 Myr (Grimaldi 1987). Because Scaptomyza originated in Hawaii from a common ancestor of the Hawaiian Drosophila (Tamura et al. 1995; Tatarenkov, Zurovcova, and Ayala 2001), their divergence time should be at least 23 Myr. A molecular estimate of 30.5 (± 6.6) MYA is consistent with this requirement. In a similar way, the oldest time estimate of 62.9 ± 12.4 MYA is consistent with the maximum possible divergence time of 80 MYA between subgenera Drosophila and Sophophora as inferred from biogeographic considerations (Beverley and Wilson 1984). These agreements support the potential appropriateness of the calibration of the genomic mutation clock.
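The two consistency checks above can be written out as simple interval arithmetic. The helper below is a hypothetical sketch: it assumes the ± values are symmetric half-widths around the point estimate (the text does not say here whether they are standard errors or confidence limits) and uses the strict criterion that the whole interval must respect the bound.

```python
def satisfies_bounds(estimate, half_width, minimum=None, maximum=None):
    """True if the interval estimate ± half_width lies at or above `minimum`
    and at or below `maximum` (each bound is optional)."""
    low = estimate - half_width
    high = estimate + half_width
    ok_min = minimum is None or low >= minimum
    ok_max = maximum is None or high <= maximum
    return ok_min and ok_max


# Values quoted in the text (in Myr): fossil minimum of 23 for the
# Scaptomyza split, biogeographic maximum of 80 for the subgenus split.
scaptomyza_ok = satisfies_bounds(30.5, 6.6, minimum=23.0)
subgenera_ok = satisfies_bounds(62.9, 12.4, maximum=80.0)
```

Under this strict reading both checks pass, since 30.5 - 6.6 = 23.9 exceeds 23 and 62.9 + 12.4 = 75.3 is below 80.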
Single Gene Molecular Clocks
The linear fit of species divergence times for the mutation distances observed for each individual gene, with divergence times estimated by using all the genes, is shown in table 2. For almost all genes shown, R² values in linear regressions are very high for mutation distances, and the average rates of mutation are rather similar. The multigene histogram of these mutation rates is shown in figure 6a. This distribution has an average rate (± 1 SE) of 0.0114 ± 0.0003. These mutation rates are contrasted with the nonsynonymous substitution rates for the same set of genes shown in figure 6a. The L-shaped distribution of the nonsynonymous substitution rates shows that Drosophila genes are under strong purifying selection, with only one gene showing a nonsynonymous substitution rate significantly higher than the average genomic mutation rate. This is the Acp26Aa gene, which is well known to be under strong positive selection (Tsaur and Wu 1997). Other genes showing high nonsynonymous substitution rates included Acp29AB, mei-218, and rux, all of which have already been recognized as rapidly evolving at the protein sequence level (Aguade 1999; Avedisov et al. 2001; Manheim et al. 2002). However, the nonsynonymous rates were much lower than the average genomic mutation rate in all of these cases.
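A per-gene fit of this kind can be sketched with ordinary least squares. The code below is an assumption-laden illustration, not the procedure behind table 2: it fits distance against time with a free intercept, and whether the slope should be halved to give a per-lineage rate depends on how the distances are defined, which this section does not state.

```python
def linear_fit(times, distances):
    """Ordinary least-squares fit of distance on divergence time.
    Returns (slope, intercept, r_squared)."""
    n = len(times)
    if n != len(distances) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_t = sum(times) / n
    mean_d = sum(distances) / n
    s_tt = sum((t - mean_t) ** 2 for t in times)
    s_td = sum((t - mean_t) * (d - mean_d) for t, d in zip(times, distances))
    if s_tt == 0:
        raise ValueError("divergence times must not all be equal")
    slope = s_td / s_tt
    intercept = mean_d - slope * mean_t
    ss_res = sum((d - (intercept + slope * t)) ** 2
                 for t, d in zip(times, distances))
    ss_tot = sum((d - mean_d) ** 2 for d in distances)
    # A constant response is fitted exactly, so report R^2 = 1 in that case.
    r_squared = 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared
```

Applied gene by gene, the slopes would form a distribution of rates analogous to the histogram described above, and the R² values indicate how clock-like each gene's distances are.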
Discussion
Comparative sequence analyses inherently generate average genomic rates among species. Therefore our estimates are average estimates among species. However, the similarity of base compositions of introns and noncoding regions over widely divergent Drosophila species (Moriyama and Hartl 1993; Bergman and Kreitman 2001) suggests that the mutation bias has not changed much during fruit fly evolution and that our estimates may be close approximations for species-specific mutation rates. (See, however, a small departure for D. willistoni reported by Bergman et al. [2002].) This permitted us to use genomic mutation distances from different gene sets from different species pairs to maximally utilize the available data in our analyses.
Relative rate tests conducted using the mutation distances at different levels of taxonomic divergence show that the null hypothesis of a mutation clock in diverse fruit fly species is not rejected. This allows us to assume a mutation clock and to infer the temporal pattern of species divergences during fruit fly evolution leading to D. melanogaster. In fact, the order of divergences from D. melanogaster for all the species groups (obscura and willistoni) and subgroups (takahashii, montium, and ananassae) belonging to the subgenus Sophophora (fig. 4) is consistent with previous multigene studies of the molecular phylogeny of Sophophora (Goto and Kimura 2001; O'Grady and Kidwell 2002).
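To make the idea of a relative rate test concrete, the sketch below implements Tajima's one-degree-of-freedom test on three aligned sequences. This is a stand-in chosen for its simplicity: the text does not say which relative rate test was applied to the mutation distances, and the function name and site-handling rules here are assumptions.

```python
VALID_BASES = set("ACGT")
CHI2_CRITICAL_5PCT_1DF = 3.841  # chi-square critical value, 1 df, 5% level


def tajima_relative_rate(seq1, seq2, outgroup):
    """Count sites unique to each ingroup lineage and return (m1, m2, chi2).

    m1: sites where seq1 differs while seq2 matches the outgroup.
    m2: sites where seq2 differs while seq1 matches the outgroup.
    Under equal rates m1 and m2 are expected to be equal, and
    chi2 = (m1 - m2)^2 / (m1 + m2) is compared with the 1-df critical value.
    """
    m1 = m2 = 0
    for a, b, o in zip(seq1.upper(), seq2.upper(), outgroup.upper()):
        if a not in VALID_BASES or b not in VALID_BASES or o not in VALID_BASES:
            continue  # skip gaps and ambiguous bases
        if a != b and b == o:
            m1 += 1
        elif a != b and a == o:
            m2 += 1
    if m1 + m2 == 0:
        return m1, m2, 0.0
    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    return m1, m2, chi2
```

Rate constancy is not rejected at the 5% level when the returned statistic is below `CHI2_CRITICAL_5PCT_1DF`.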
On the basis of the divergence times estimated, it is interesting to speculate about the temporal pattern of speciation events in the evolutionary history leading to D. melanogaster. Figure 4 suggests that the speciation events have not occurred regularly in time, as several events are clustered. Three independent pairs of sibling species (melanogaster-simulans, yakuba-teissieri, and orena-erecta) diverged within a short time (5.4-6.8 MYA). Evolutionary divergences among these three pairs also occurred in a short span of time, 10.4-12.8 MYA. The melanogaster subgroup (containing these six species) diverged from the cluster of the three major subgroups in the melanogaster group (containing ananassae, montium, and takahashii subgroups) during the Late to Middle Eocene (35-45 MYA). The Hawaiian Drosophila also diverged from the lineage leading to D. virilis during this time period. Finally, both the obscura and willistoni groups diverged from the melanogaster group about 55 to 62 MYA after the divergence of the subgenera Drosophila and Sophophora, close to the K-T boundary, 63 MYA. These clustered timings of divergence are compatible with "radiations" proposed by Throckmorton (1975) to explain coincidental distribution patterns of species from independent species groups.
Many of the clustered divergence times either coincide with or fall close to the periods of major climate change during the Cenozoic. Marine sediment records suggest that important cooling steps occurred during the Late Miocene (5.0-6.5 MYA) and the Middle Miocene (12-15 MYA) (Kennett 1995; Zachos et al. 2001), and these coincide with a number of divergence time estimates (fig. 4). By contrast, there is a paucity of speciation events from the Oligocene to the Early Miocene, periods with relatively uniform climatic temperatures or warming. An event mapped to this period is the Hawaiian Drosophila-Scaptomyza split, which occurred in Hawaii, where local volcanic activities are thought to be responsible for speciation (Carson 1992). There are also clear correspondences of older species divergence events with climatic cooling. Although we have many species from most of the major species groups and subgroups related to D. melanogaster in our analysis, speciation patterns for independent species groups and subgroups need to be examined with a number of genes to generalize these inferences. Nevertheless, if the observed correspondence between the time of species divergences and paleoclimate changes is true, it supports Wallace's hypothesis for a rapid species change resulting from climatic change (Wallace 1870a, b). In the present case, the factor is postulated to be climatic cooling in the Cenozoic. A major consequence of this cooling was an extensive increase in aridification in the middle to low latitude regions, which led to expansions of savannas and grasslands as well as the fragmentation of forests (Kennett 1995) that were primary habitats of ancestral fruit fly species and populations (Throckmorton 1975). The adaptation to the newly arisen dry environment and the allopatry caused by the forest fragmentation are potential causes for stimulating fruit fly speciation. The former adaptation is supported by the distribution patterns of D. teissieri and D. yakuba, which are adapted to forests and savannas, respectively (Lachaise et al. 1988). Allopatric speciation is also plausible given the overlapping distribution patterns for independent species pairs, for example, D. melanogaster-simulans and D. teissieri-yakuba (Lachaise et al. 1988).
Therefore, the mutation clock proposed here provides opportunities to get an important glimpse of speciation processes and mechanisms when they are examined in the context of the contemporary earth history and environmental changes. Although the current view is still mostly speculative, the correlation between speciation events and Cenozoic climatic cooling will become better understood with the accumulation of gene sequence data for other fruit fly species. The inferred temporal pattern of speciation from these efforts will also be useful in selecting genomes for sequencing and annotation, calibrating the tempo of DNA loss, building temporal contexts for the origin and horizontal transfer of transposable elements, and understanding the timing of gene duplications, chromosomal changes, and the evolution of genome anatomies in general (Petrov and Hartl 1998; Silva and Kidwell 2000; Bergman et al. 2002; Kaminker et al. 2002).
Acknowledgements
Footnotes
Literature Cited
Aguade, M. 1999. Positive selection drives the evolution of the Acp29AB accessory gland protein in Drosophila. Genetics 152:543-551.
Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, and Z. Zhang, et al. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389-3402.
Avedisov, S. N., I. B. Rogozin, E. V. Koonin, and B. J. Thomas. 2001. Rapid evolution of a cyclin A inhibitor gene, roughex, in Drosophila. Mol. Biol. Evol. 18:2110-2118.
Bergman, C. M., and M. Kreitman. 2001. Analysis of conserved noncoding DNA in Drosophila reveals similar constraints in intergenic and intronic sequences. Genome Res. 11:1335-1345.
Bergman, C. M., B. D. Pfeiffer, D. E. Rincon-Limas, R. A. Hoskins, and A. Gnirke, et al. 2002. Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome. Genome Biol 3:RESEARCH00860086.
Beverley, S. M., and A. C. Wilson. 1984. Molecular evolution in Drosophila and the higher Diptera II. A time scale for fly evolution. J. Mol. Evol. 21:1-13.
Carson, H. L. 1992. Inversions in Hawaiian Drosophila. Pp. 407-439 in C. B. Krimbas and J. R. Powell, eds. Drosophila inversion polymorphism. CRC Press, Boca Raton, Fla.
Carson, H. L., and D. A. Clague. 1995. Geology and biogeography of Hawaii. Pp. 14-29 in W. L. Wagner and V. A. Funk, eds. Hawaiian biogeography: evolution on a hot spot archipelago. Smithsonian Institution Press, Washington, D.C.
Dunn, K. A., J. P. Bielawski, and Z. Yang. 2001. Substitution rates in Drosophila nuclear genes: implications for translational selection. Genetics 157:295-305.
Eanes, W. F., M. Kirchner, J. Yoon, C. H. Biermann, and I. N. Wang, et al. 1996. Historical selection, amino acid polymorphism and lineage-specific divergence at the G6pd locus in Drosophila melanogaster and D. simulans. Genetics 144:1027-1041.
Easteal, S., and J. G. Oakeshott. 1985. Estimating divergence times of Drosophila species from DNA-sequence comparisons. Mol. Biol. Evol. 2:87-91.
Goto, S. G., and M. T. Kimura. 2001. Phylogenetic utility of mitochondrial COI and nuclear Gpdh genes in Drosophila. Mol. Phylogenet. Evol. 18:404-422.
Grimaldi, D. A. 1987. Amber fossil Drosophilidae (Diptera), with particular reference to the Hispaniolan taxa. Am. Mus. Novitates 2880:1-23.
Hedges, S. B., and S. Kumar. 2003. Genomic clocks and evolutionary timescales. Trends Genet. 19:200-206.
Hughes, A. L., and M. Yeager. 1997. Comparative evolutionary rates of introns and exons in murine rodents. J. Mol. Evol. 45:125-130.
Kaminker, J. S., C. M. Bergman, B. Kronmiller, J. Carlson, and R. Svirskas, et al. 2002. The transposable elements of the Drosophila melanogaster euchromatin: a genomics perspective. Genome Biol. 3:RESEARCH00840084.
Kennett, J. P. 1995. A review of polar climatic evolution during the Neogene, based on the marine sediment record. Pp. 49-64 in E. S. Vrba, G. H. Denton, T. C. Partridge, and L. H. Burckle, eds. Paleoclimate and evolution, with emphasis on human origins. Yale University Press, New Haven, Conn.
Kimura, M. 1983. The neutral theory of molecular evolution. Cambridge University Press, Cambridge, U.K.
Kumar, S., and S. R. Gadagkar. 2001. Disparity index: a simple statistic to measure and test the homogeneity of substitution patterns between molecular sequences. Genetics 158:1321-1327.
Kumar, S., and S. Subramanian. 2002. Mutation rates in mammalian genomes. Proc. Natl. Acad. Sci. USA 99:803-808.
Lachaise, D., M. L. Cariou, J. R. David, F. Lemeunier, and L. Tsacas, et al. 1988. Historical biogeography of the Drosophila melanogaster species subgroup. Evol. Biol. 22:159-225.
Manheim, E. A., J. K. Jang, D. Dominic, and K. S. McKim. 2002. Cytoplasmic localization and evolutionary conservation of MEI-218, a protein required for meiotic crossing-over in Drosophila. Mol. Biol. Cell 13:84-95.
Moriyama, E. N., and D. L. Hartl. 1993. Codon usage bias and base composition of nuclear genes in Drosophila. Genetics 134:847-858.
Nei, M., and S. Kumar. 2000. Molecular evolution and phylogenetics. Oxford University Press, Oxford, New York.
O'Grady, P. M., and M. G. Kidwell. 2002. Phylogeny of the subgenus Sophophora (Diptera: Drosophilidae) based on combined analysis of nuclear and mitochondrial sequences. Mol. Phylogenet. Evol. 22:442-453.
Petrov, D. A., and D. L. Hartl. 1998. High rate of DNA loss in the Drosophila melanogaster and Drosophila virilis species groups. Mol. Biol. Evol. 15:293-302.
Powell, J. R. 1997. Progress and prospects in evolutionary biology: the Drosophila model. Oxford University Press, New York.
Rodriguez-Trelles, F., R. Tarrio, and F. J. Ayala. 1999. Switch in codon bias and increased rates of amino acid substitution in the Drosophila saltans species group. Genetics 153:339-350.
Rodriguez-Trelles, F., R. Tarrio, and F. J. Ayala. 2000. Fluctuating mutation bias and the evolution of base composition in Drosophila. J. Mol. Evol. 50:1-10.
Rodriguez-Trelles, F., R. Tarrio, and F. J. Ayala. 2001a. Erratic overdispersion of three molecular clocks: GPDH, SOD, and XDH. Proc. Natl. Acad. Sci. USA 98:11405-11410.
Rodriguez-Trelles, F., R. Tarrio, and F. J. Ayala. 2001b. Xanthine dehydrogenase (XDH): episodic evolution of a "neutral" protein. J. Mol. Evol. 53:485-495.
Rowan, R. G., and J. A. Hunt. 1991. Rates of DNA change and phylogeny from the DNA sequences of the alcohol dehydrogenase gene for five closely related species of Hawaiian Drosophila. Mol. Biol. Evol. 8:49-70.
Russo, C. A. M., N. Takezaki, and M. Nei. 1995. Molecular phylogeny and divergence times of drosophilid species. Mol. Biol. Evol. 12:391-404.
Sharp, P. M., and W. H. Li. 1987. The Codon Adaptation Index: a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. 15:1281-1295.
Sharp, P. M., and W. H. Li. 1989. On the rate of DNA sequence evolution in Drosophila. J. Mol. Evol. 28:398-402.
Shields, D. C., P. M. Sharp, D. G. Higgins, and F. Wright. 1988. Silent sites in Drosophila genes are not neutral: evidence of selection among synonymous codons. Mol. Biol. Evol. 5:704-716.
Silva, J. C., and M. G. Kidwell. 2000. Horizontal transfer and selection in the evolution of P elements. Mol. Biol. Evol. 17:1542-1557.
Takezaki, N., A. Rzhetsky, and M. Nei. 1995. Phylogenetic test of the molecular clock and linearized trees. Mol. Biol. Evol. 12:823-833.
Tamura, K., and S. Kumar. 2002. Evolutionary distance estimation under heterogeneous substitution pattern among lineages. Mol. Biol. Evol. 19:1727-1736.
Tamura, K., and M. Nei. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10:512-526.
Tamura, K., G. Toba, J. Park, and T. Aotsuka. 1995. Origin of Hawaiian drosophilids inferred from alcohol dehydrogenase gene sequences. Pp. 9-18 in M. Nei and N. Takahata, eds. Current topics on molecular evolution: proceedings of the US-Japan workshop. The Pennsylvania State University, USA, Graduate School for Advanced Studies, Hayama, Japan.
Tatarenkov, A., J. Kwiatowski, D. Skarecky, E. Barrio, and F. J. Ayala. 1999. On the evolution of Dopa decarboxylase (Ddc) and Drosophila systematics. J. Mol. Evol. 48:445-462.
Tatarenkov, A., M. Zurovcova, and F. J. Ayala. 2001. Ddc and amd sequences resolve phylogenetic relationships of Drosophila. Mol. Phylogenet. Evol. 20:321-325.
Thomas, R. H., and J. A. Hunt. 1993. Phylogenetic relationships in Drosophila: a conflict between molecular and morphological data. Mol. Biol. Evol. 10:362-374.
Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.
Throckmorton, L. H. 1975. The phylogeny, ecology, and geography of Drosophila. Pp. 421-469 in R. C. King, ed. Handbook of genetics. Plenum Press, New York.
Tsaur, S. C., and C. I. Wu. 1997. Positive selection and the molecular evolution of a gene of male reproduction, Acp26Aa of Drosophila. Mol. Biol. Evol. 14:544-549.
Wallace, A. R. 1870a. The measurement of geological time I. Nature 1:399-401.
Wallace, A. R. 1870b. The measurement of geological time II. Nature 1:425-455.
Wright, F. 1990. The "effective number of codons" used in a gene. Gene 87:23-29.
Wu, C. I., and W. H. Li. 1985. Evidence for higher rates of nucleotide substitution in rodents than in man. Proc. Natl. Acad. Sci. USA 82:1741-1745.
Zachos, J., M. Pagani, L. Sloan, E. Thomas, and K. Billups. 2001. Trends, rhythms, and aberrations in global climate 65 Ma to present. Science 292:686-693.