Estimation of Nucleotide Substitution Rates in Eurotiomycete Fungi

Takao Kasuga*, Thomas J. White{dagger} and John W. Taylor*

*Department of Plant and Microbial Biology, University of California–Berkeley;
{dagger}Celera Diagnostics, Alameda, California

In the entire fungal kingdom, only DNA substitution rates in the SSU rRNA gene (Berbee and Taylor 1993Citation , 2001Citation ) and amino acid substitution rates (Heckman et al. 2001Citation ) have been estimated and used to date fungal divergences. However, these molecules are not sufficiently variable to date events at or below the genus level. DNA sequences of protein-coding genes and the internal transcribed spacer (ITS) region are sufficiently variable, but their substitution rates are not known. In this article, we investigated the DNA substitution rates in the protein-coding genes and the ITS region by comparison with the SSU rDNA divergence in a representative fungal lineage, Eurotiomycetes (=plectomycetes) (Eriksson and Winka 1998Citation ). The Eurotiomycete lineage is a monophyletic class of Ascomycota (Berbee and Taylor 1992Citation ) and includes many economically important fungi, such as Penicillium chrysogenum (antibiotic production) and Aspergillus oryzae (soy sauce production), as well as many human pathogens such as Coccidioides immitis (lung disease), Histoplasma capsulatum (lung disease), and Trichophyton rubrum (athlete's foot). Estimating DNA substitution rates in quickly evolving molecules such as the ITS and protein-coding genes will not only provide a means to estimate the divergence times between lineages at and below the genus level, but in conjunction with coalescent theory, it may provide information for estimating epidemiological parameters such as effective population size (Watterson 1975Citation ) and recombination rates (Hey and Wakeley 1997Citation ).

With the aid of published Eurotiomycete phylogenies based on SSU rDNA (Ogawa, Yoshimura, and Sugiyama 1997Citation ; Sugiyama, Ohara, and Mikawa 1999Citation ), we searched for pairs of closely related species which had not only SSU rDNA sequences but also sequences of homologous protein-coding genes. Next, we compared the extent of differentiation at synonymous sites of protein-coding genes between sister taxa. It was found that saturation at synonymous sites of protein-coding loci is attained before divergence of the SSU rDNA sequences reaches 1%–2%. A 1%–2% divergence usually corresponds to divergences seen within a genus or among closely related genera. Therefore, the protein genes should be well suited to dating divergences at and below the genus level.

In this paragraph we briefly outline our methodology. The first step of our approach was to estimate the divergence times between sister species from the SSU rDNA phylogenetic tree (Ochman and Wilson 1987Citation ; Ochman, Elwyn, and Moran 1999Citation ). Twenty-seven Eurotiomycete species and two out-group taxa, i.e., the Sordariomycete Neurospora crassa and the loculoascomycete Capronia pilosella, were subjected to distance analysis. The regularity of nucleotide substitution rates at the SSU rDNA gene was evaluated by the two-cluster test and the branch length test (Takezaki, Rzhetsky, and Nei 1995Citation ; Nei and Kumar 2000Citation ). Rates were not found to be constant for six species and four nodes using the branch length test and the two-cluster test, respectively (P < 0.05, data not shown). Apparently, rate constancy does not hold throughout this system. Because rates may not be constant, we also considered a model in which molecular evolutionary rates vary across lineages instead of remaining constant. We compared the Langley Fitch (LF) algorithm, which assumes rate constancy, with the nonparametric rate-smoothing (NPRS) algorithm, which does not (Langley and Fitch 1974Citation ; Sanderson 1997Citation ). All methods of phylogenetic inference depend heavily on their underlying models. For this reason we used a hierarchical likelihood ratio test to search for the DNA substitution model that best fit our SSU rDNA data set (Posada and Crandall 1998Citation ). Phylogenetic relationships were then inferred using the selected evolutionary model and the heuristic search option for maximum likelihood (ML) implemented in PAUP 4.0b8a (Swofford 2001Citation ). The ML tree obtained was then used to estimate the divergence times between closely related taxa using the NPRS and LF methods. A simple and widely used Kimura two-parameter model was also used to construct a neighbor-joining (NJ) tree; the NJ tree was also subjected to the NPRS and LF methods as a comparison. Both NPRS and LF methods require at least one calibration point to fit branch lengths to the geological time scale. We used the divergence of Eurotiomycetes (=plectomycetes) and Sordariomycetes (=pyrenomycetes) for calibration. Initially, the divergence was set to 100 arbitrary units, and the relative divergence times between taxa were estimated (fig. 1 ; details are available in Supplementary Material at http://www.molbiolevol.org/). For all pairs of taxa, the NPRS algorithm gave 15.2% to 87.5% larger values for divergence times than did the LF algorithm. For both NPRS and LF algorithms, most of the divergence values calculated from the NJ tree with a Kimura two-parameter algorithm were larger than those from the ML tree; NJ values were up to 48.6% larger than ML values. A simulation study indicated that divergence times are estimated more accurately using NPRS rather than LF when (1) there are enough data, (2) evolution is not clock-like, and (3) levels of rate autocorrelation are moderate to high (Sanderson 1997Citation ). Unfortunately, although we believe we have sufficient data (1,631 bp), we have no way of assessing the other parameters, so we have used the divergence times of both the NPRS and the LF methods, hoping to account for the ambiguity of the past.



View larger version (22K):
[in this window]
[in a new window]
 
Fig. 1.—Estimation of divergence times according to the NPRS (a) and LF (b) methods. Branch lengths are proportional to time. Either of the three published values (310, 400, or 670 Myr) for the ES divergence time can be used to calibrate the trees. The tree topology was obtained through the heuristic search for ML implemented in PAUP 4.0b8a (Swofford 2001)Citation using the Tamura Nei DNA substitution model with equal base frequencies, among-site rate variation Gamma (0.6431), and the proportion of sites unable to accept substitutions (0.6425). Aligned sequences have been deposited in TreeBase (study accession number S804, www.treebase.org/treebase/)

 
As mentioned above, the NPRS and LF algorithms give only relative divergence times, which may be understood as a percentage of the time since the divergence of Eurotiomycetes and Sordariomycetes (ES). This ES divergence time can be inferred from the fossil record or estimated by molecular clock approaches, both of which again require calibration points from fossils. The oldest well-documented ascomycete fossils are the perithecia, asci, and ascospores that resemble those of extant Sordariomycetes and are found in the 400-Myr-old Rhynie chert (Taylor, Hass, and Kerp 1999Citation ). Thus, 400 Myr for the ES divergence would seem to provide a conservative date; however, earlier dates can be expected. The ES divergence time estimated from a molecular clock approach largely depends on a fundamental calibration point: the fungi-animal or fungi-animal-plant divergence time. Berbee and Taylor (Berbee and Taylor 2001Citation ) calibrated an SSU rDNA phylogenetic tree using a fungi-animal divergence time of 965 Myr (Doolittle et al. 1996Citation ) and estimated the ES divergence to be 310 Myr. Heckman et al. (2001)Citation used a fungi-animal-plant divergence time of 1,576 Myr, and their protein diversity analyses yielded an ES divergence of 670 Myr. The 310-Myr date is probably too recent, judging from the fossil evidence, and the 670-Myr date provides a more ancient estimate. We used three values, 310, 400, and 670 Myr, as calibration points for estimation of the divergence times between closely related Eurotiomycete species.

Our goal was to estimate DNA substitution rates at six independent protein-coding loci and the ITS of the rDNA repeat unit. To obtain substitution rates, we required estimates for divergence times and genetic distances between sister species. Divergence times between sister species of Eurotiomycete fungi were estimated using NPRS and LF algorithms based on three calibration times corresponding to the three estimates of ES divergence; hence, a total of six rate estimates were calculated for each of the species pairs (table 1 Go Go ). For class 1 chitin synthase (CHS1), class 2 chitin synthase (CHS2), zinc finger protein (creA), and orotidine-5'-phosphate decarboxylase (pyrG), four different measures of genetic distance were calculated: DNA substitutions in synonymous sites (KS), nonsynonymous sites (KA), the third bases of codons (K3rd), and exons. Introns were not included in these data sets. The modified Nei-Gojobori method implemented in MEGA2.0 (Kumar et al. 2000Citation ) was used to estimate KS and KA. The Kimura two-parameter model was used for all the other distance measures. Large parts of the data sets for ADP-ribosylation factor (arf) and alpha tubulin (tub1) are intron sequences, so for these loci four measures of genetic distance were calculated: DNA substitutions in exons, introns, K3rd, and KS. There were no nonsynonymous substitutions in the arf or tub1 loci.


View this table:
[in this window]
[in a new window]
 
Table 1 Estimates for Nucleotide Substitution Rates According to NPRS and LF Algorithms. Divergence Times were Estimated from the Tamura Nei ML Tree in Figure 1. There are Three Rate Estimates at Each Site for Each of the Algorithms Because of the Three ES Calibration Time Points 310, 400, and 670 Myr

 

View this table:
[in this window]
[in a new window]
 
Table 1 Continued

 

View this table:
[in this window]
[in a new window]
 
Table 1 Continued

 
The most data were available for CHS1, which was found in 4 out of 10 pairs of sister species. We used these data to evaluate how estimates of nucleotide substitution varied in different regions of the Eurotiomycete clade and how estimates were influenced by assumptions of rate constancy or rate variability (LF vs. NPRS), the substitution model (ML vs. NJ), or the ES divergence value (310, 400, or 670 Myr). Using data for all four pairs of species, KS for CHS1 varied 4.7-fold between the NPRS and LF algorithms, 1.5-fold between the ML–Tamura Nei and NJ-Kimura two-parameter models, and 2.2-fold over the three ES calibration points. The average synonymous substitution rates KS estimated with NPRS and the three ES values were 7.8 ± 3.4 x 10-9 (310 Myr), 6.1 ± 2.6 x 10-9 (400 Myr), and 3.0 ± 1.3 x 10-9 (670 Myr) (n = 4; ± values are one standard deviation). The average KS value estimated with LF and an ES of 400 Myr was 28.5 ± 28.7 x 10-9. Most of the variation could be explained by use of the LF algorithm or the NPRS algorithm and its underlying assumptions of rate constancy or variability. In spite of the several sources of variation, KS estimates ranged only from 10-9 to 10-8

Only one pair of sister species could be used for substitution rate estimation in the remaining protein-coding loci, CHS2, creA, pyrG, arf, and tub1. KS values for these genes are also in the range of 10-9–10-8. The KS values for the four species pairs at the CHS1 and one value for each of the CHS2, creA, pyrG, arf, and tub1 loci gave the average of 6.3 ± 5.4 x 10-9 (n = 9) when the ES divergence of 400 Myr and NPRS were used. On average, in CHS1, CHS2, creA, and pyrG, synonymous substitutions were found to accumulate about 22 times faster (KS/KA = 22.2 ± 8.4, n = 7) than nonsynonymous substitutions. This value is higher than the average ratio of Drosophila genes (KS/KA = 8.2) but is not an extreme value (Li 1997Citation ). The KS/KA ratio is positively influenced by both the intensity of negative selection and the population size.

The arf and tub1 data sets consist of exons and introns, for which rates were estimated separately. At arf and tub1 loci, introns mutate 2.2 and 1.6 times faster than synonymous sites and 8.6 and 4.7 times faster than all sites in exons, respectively.

The ITS in the rDNA repeat unit have been widely used for phylogenetic studies (e.g., Berbee et al. 1995Citation ; Cullings, Szaro, and Bruns 1996Citation ) and population studies (e.g., O'Donnell 1992Citation ; Kasuga et al. 1993Citation ). DNA substitution rates were estimated using six pairs of sister species. Unlike with protein-coding genes such as CHS1 (Bowen et al. 1992Citation ) and RPB2 (Liu, Whelen, and Hall 1999Citation ), multiple alignment of DNA sequences from species belonging to different genera was impossible due to the frequent occurrence of indels. The length at the spacer regions (ITS1 + ITS2) varied greatly between the species used in this study, ranging from 302 to 372 bp. DNA substitution rates at the ITS region (ITS1, ITS2 + 5.8S rDNA) vary extensively between lineages, and the average was 1.4 ± 1.3 x 10-9 (n = 5) when the NPRS and an ES divergence of 400 Myr were used. Despite the large functional difference between the ITS region and the protein-coding genes, DNA substitution rates are found to be comparable. In a wide range of plants, substitution rates at the ITS region fall between 1.72 x 10-9 and 7.83 x 10-9 (Richardson et al. 2001Citation ). Therefore, estimates for Eurotiomycetes are comparable to the lower values for plants.

Neutral Mutation Rates Across Kingdoms
Overall, neutral mutation rates in protein-coding genes, which were measured as synonymous substitution rates or approximated as substitution rates at the third bases of codons, ranged from 0.9 x 10-9 to 16.7 x 10-9 substitutions per site per year (NPRS and ES = 400 Myr). These values are in the range of DNA substitution rates in most of the protein-coding genes in plants (Gaut et al. 1996Citation ), animals (Li 1997Citation ), and bacteria (Ochman, Elwyn, and Moran 1999Citation ). For example, synonymous substitution rates in cereals are in the range of 5.1 x 10-9 to 7.1 x 10-9 (Wolfe, Sharp, and Li 1989Citation ). In mammals, rates were between 1.6 x 10-9 and 6.4 x 10-9, and in Drosophila, rates were between 3.7 x 10-9 and 30.0 x 10-9 (Li 1997Citation ; Moriyama and Powell 1997Citation ). In Escherichia coli and Salmonella typhimurium, substitution rates varied from 0.14 x 10-9 to 5.6 x 10-9 (Ochman and Wilson 1987Citation ). Although neutral mutation rates vary across loci, their range is surprisingly constant among the four major clades plants, animals, bacteria, and fungi in spite of the enormous differences in cellular organization, body size, generation time, and ecology of these organisms.

Acknowledgements

The authors are grateful to Michael Sanderson at the University of California at Davis, Chen Su and Helen Piontkivska at Pennsylvania State University, and Naoko Takezaki at the Max Planck Institute for helping with data analysis; Masato Sugiyama at Mitsubishi Chemical for sharing unpublished data with us; and Anne Pringle for comments on the manuscript. Financial support for this work was provided by the National Institutes of Health (grant A1 43491 to J.W.T.).

Footnotes

Julian Adams, Reviewing Editor

Keywords: synonymous substitution rates molecular clock plectomycete fungi evolution ITS Back

Address for correspondence and reprints: Takao Kasuga, Department of Plant and Microbial Biology, 111 Koshland Hall, University of California, Berkeley, California 94720. E-mail: kasugat{at}uclink4.berkeley.edu Back

References

    Berbee M. L., J. W. Taylor, 1992 Two ascomycete classes based on fruiting-body characters and ribosomal DNA sequence Mol. Biol. Evol 9:278-284[Abstract]

    ———. 1993 Dating the evolutionary radiations of the true fungi Can. J. Bot 71:1114-1127[ISI]

    ———. 2001 Fungal molecular evolution: gene trees and geologic time Pp. 229–246 in D. J. McLaughlin, E. McLaughlin, and P. A. Lemke, eds. The Mycota. Springer, Berlin

    Berbee M. L., A. Yoshimura, J. Sugiyama, J. W. Taylor, 1995 Is Penicillium monophyletic? An evaluation of phylogeny in the family Trichocomaceae from 18S, 5.8S, and ITS ribosomal DNA sequence data. Mycologia 87:210-222

    Bowen A. R., J. L.-P. Chen-Wu, M. Momany, R. Young, P. J. Szaniszlo, P. W. Robbins, 1992 Classification of fungal chitin synthases Proc. Natl. Acad. Sci. USA 89:519-523[Abstract]

    Cullings K. W., T. M. Szaro, T. D. Bruns, 1996 Evolution of extreme specialization within a lineage of ectomycorrhizal epiparasites Nature 379:63-66[ISI]

    Doolittle R. F., D. F. Feng, S. Tsang, G. Cho, E. Little, 1996 Determining divergence times of the major kingdoms of living organisms with a protein clock Science 271:470-477[Abstract]

    Drysdale M. R., S. E. Kolze, J. M. Kelly, 1993 The Aspergillus niger carbon catabolite repressor encoding gene, creA Gene 130:241-245[ISI][Medline]

    Eriksson O. E., K. Winka, 1998 Families and higher taxa of Ascomycota Myconet 1:17-24

    Gaut B. S., B. R. Morton, B. C. McCaig, M. T. Clegg, 1996 Substitution rate comparisons between grasses and palms: synonymous rate differences at the nuclear gene Adh parallel rate differences at the plastid gene rbcL Proc. Natl. Acad. Sci. USA 93:10274-10279[Abstract/Free Full Text]

    Heckman D. S., D. M. Geiser, B. R. Eidell, R. L. Stauffer, N. L. Kardos, S. B. Hedges, 2001 Molecular evidence for the early colonization of land by fungi and plants Science 293:1129-1133[Abstract/Free Full Text]

    Hey J., J. Wakeley, 1997 A coalescent estimator of the population recombination rate Genetics 145:833-846[Abstract/Free Full Text]

    Kasuga T., C. Woods, S. Woodward, K. Mitchelson, 1993 Heterobasidion annosum 5.8s ribosomal DNA and internal transcribed spacer sequence: rapid identification of European intersterility groups by ribosomal DNA restriction polymorphism Curr. Genet 24:433-436[ISI][Medline]

    Kasuga T., J. W. Taylor, T. J. White, 1999 Phylogenetic relationships of varieties and geographical groups of the human pathogenic fungus Histoplasma capsulatum Darling J. Clin. Microbiol 37:653-663[Abstract/Free Full Text]

    Kumar S., K. Tamura, I. Jakobsen, M. Nei, 2000 MEGA: molecular evolutionary genetics analysis Arizona State University, Tempe

    Langley C. H., W. Fitch, 1974 An estimation of the constancy of the rate of molecular evolution J. Mol. Evol 3:161-177[ISI][Medline]

    Li W.-H., 1997 Rates and patterns of nucleotide substitution Pp. 177–213 in Molecular evolution. Sinauer Associates, Sunderland, Mass

    Liu Y. J., S. Whelen, B. D. Hall, 1999 Phylogenetic relationships among ascomycetes: evidence from an RNA polymerase II subunit Mol. Biol. Evol 16:1799-1808[Abstract/Free Full Text]

    Moriyama E. N., J. R. Powell, 1997 Synonymous substitution rates in Drosophila: mitochondrial versus nuclear genes J. Mol. Evol 45:378-391[ISI][Medline]

    Nei M., S. Kumar, 2000 Molecular clocks and linearized trees Pp. 187–206 in Molecular evolution and phylogenetics. Oxford University Press, New York

    Nino-Vega G. A., C. A. Munro, G. San-Blas, G. W. Gooday, N. A. R. Gow, 2000 Differential expression of chitin synthase genes during temperature induced dimorphic transitions in Paracoccidioides brasiliensis Med. Mycol 38:31-39[ISI][Medline]

    Ochman H., S. Elwyn, N. A. Moran, 1999 Calibrating bacterial evolution Proc. Natl. Acad. Sci. USA 96:12638-12643[Abstract/Free Full Text]

    Ochman H., A. C. Wilson, 1987 Evolution in bacteria: evidence for a universal substitution rate in cellular genomes J. Mol. Evol 26:74-86[ISI][Medline]

    O'Donnell K., 1992 Ribosomal DNA internal transcribed spacers are highly divergent in the phytopathogenic ascomycete Fusarium sambucinum (Gibberella pulicaris) Curr. Genet 22:213-220[ISI][Medline]

    Ogawa H., A. Yoshimura, J. Sugiyama, 1997 Polyphyletic origin of species of the anamorphic genus Geosmithia and the relationships of the cleistothecial genera: evidence from 18S, 5S, and 28S rDNA sequence analysis Mycologia 89:756-771[ISI]

    Posada D., K. A. Crandall, 1998 MODELTEST: testing the model of DNA substitution Bioinformatics 14:817-818[Abstract]

    Richardson J. E., R. T. Pennington, T. D. Pennington, P. M. Hollingsworth, 2001 Rapid diversification of a species-rich genus of neotropical rain forest trees Science 293:2242-2245[Abstract/Free Full Text]

    Sanderson M. J., 1997 A nonparametric approach to estimating divergence times in the absence of rate constancy Mol. Biol. Evol 14:1218-1231[Free Full Text]

    Sugiyama M., A. Ohara, T. Mikawa, 1999 Molecular phylogeny of onygenalean fungi based on small subunit ribosomal DNA (SSU rDNA) sequences Mycoscience 40:251-258

    Swofford D. L., 2001 PAUP*: phylogenetic analysis using parsimony (*and other methods) Sinauer Associates, Sunderland, Mass

    Takezaki T., A. Rzhetsky, M. Nei, 1995 Phylogenetic test of the molecular clock and linearized tree Mol. Biol. Evol 12:823-833[Abstract]

    Taylor T. N., H. Hass, H. Kerp, 1999 The oldest fossil ascomycetes Nature 399:648.[ISI][Medline]

    Watterson G. A., 1975 On the number of segregating sites in genetical models without recombination Theor. Popul. Biol 7:256-276[ISI][Medline]

    Weidner G., C. d'Enfert, A. Koch, P. C. Mol, A. A. Brakhage, 1998 Development of a homologous transformation system for the human pathogenic fungus Aspergillus fumigatus based on the pyrG gene encoding orotidine 5'-monophosphate decarboxylase Curr. Genet 33:378-385[ISI][Medline]

    Wolfe K. H., P. M. Sharp, W. H. Li, 1989 Rates of synonymous substitution in plant nuclear genes J. Mol. Evol 29:208-211[ISI]

Accepted for publication July 25, 2002.