*College of Life Sciences, Peking University, Beijing;
College of Life Sciences, Fudan University, Shanghai;
Department of Biology, University College London
Abstract |
---|
Introduction |
---|
Most of the enzymatic components of plant secondary metabolism are encoded by small families of genes that originated through gene duplications (Durbin, McCaig, and Clegg 2000). Chalcone synthase (CHS) is encoded by a multigene family in many plants, such as Petunia (Koes et al. 1989), Ipomoea (Durbin et al. 1995), Gerbera (Helariutta et al. 1995, 1996), and leguminous plants (Ryder et al. 1987; Wingender et al. 1989; Ito et al. 1997). Durbin, McCaig, and Clegg (2000) reviewed CHS evolution in flowering plants with special reference to the genus Ipomoea. They pointed out that new CHS genes are recruited recurrently in flowering plants and that the rate of nucleotide substitution was frequently accelerated in new duplicate genes. There is growing evidence for repeated divergence in the CHS gene family (Tropf et al. 1994; Helariutta et al. 1996). One example is the functional shift from CHS to stilbene synthase (SS), which appears to have occurred repeatedly in plants (Tropf et al. 1994). SSs are similar to CHS in that they are polyketide synthases, but the final products of the synthetic pathway in which SSs are involved are stilbene phytoalexins rather than flavonoids (Tropf et al. 1995). The SSs are thought to have evolved independently from CHS several times over the course of plant evolution (Tropf et al. 1994), and only a small number of amino acid changes are necessary to convert CHS to SS function (Tropf et al. 1995). A second example of functional divergence of the CHS gene is found in Gerbera hybrida (Asteraceae). Helariutta et al. (1995) analyzed the enzymatic properties of CHS-like genes and showed that their properties are distinct from those of both CHS and SS. Substrate testing showed that the novel CHS gene was unable to use 4-coumaroyl-CoA as a substrate but was able to use benzoyl-CoA. A novel product is produced by the enzymatic reaction, although the role of this product in plants is unknown.
Comparison of CHS gene sequences from different species revealed that the CHS gene is structurally conserved: most CHS genes contain two exons and one intron, and the position of the intron is conserved. Exon 1 usually encodes 37 to 64 amino acid residues and exon 2 encodes about 340 residues. The latter is more conserved than the former in length and encodes almost all the active sites. The length of the intron varies significantly among species, ranging from less than 100 bp to several kilobases.
Plants of Dendranthema exhibit considerable variation in flower color. Some of them are famous ornamental plants, e.g., florist's dendranthema (formerly known as florist's chrysanthemum). The diversity in flower color in Ipomoea is almost certainly due to differences in either the structural or the regulatory genes of the flavonoid biosynthetic pathway (Durbin et al. 1995). In chrysanthemum cultivar Moneymaker, flower color was altered by modifications to the CHS gene (Courtney-Gutterson et al. 1994). It is thus interesting to examine whether the diversity in flower color correlates with the diversity of the CHS gene in Dendranthema. In this article the Dendranthema CHS genes were cloned and sequenced from six species, four of which (D. indicum, D. indicum var. aromaticum, D. lavandulifolium, and D. nankingense) have yellow flowers, whereas D. chanetii has pink flowers and D. vestitum has white flowers. By performing phylogenetic and evolutionary analyses, we hope to characterize the organization and evolution of the CHS gene family in Dendranthema.
Gene duplication is considered to be a major mechanism for evolutionary innovations and functional divergence (Ohno 1970; Ohta 1993; Force et al. 1999). There has been considerable debate as to whether rapid evolution in gene families is caused by positive Darwinian selection after gene duplication (Ohta 1993) or by relaxation of the functional constraints in redundant genes (Kimura 1983, pp. 104-113). The CHS gene family provides an interesting case for testing those hypotheses. To examine the differences of selective pressures among evolutionary lineages, maximum likelihood (ML) models of codon substitution were used to analyze the functional sequences in the Dendranthema CHS gene family (Yang 1998; Yang and Nielsen 1998). These models used the nonsynonymous-synonymous rate ratio (ω = dN/dS) as an indicator of selective pressure on the protein. An ω greater than 1 means that nonsynonymous mutations were fixed at a higher rate than synonymous mutations and that protein evolution is driven by positive selection. Our analysis provides evidence for positive selection driving functional divergence after gene duplication in the Dendranthema CHS gene family.
Materials and Methods |
---|
To test for possible rate changes following gene duplication, the method of Muse and Gaut (1994), implemented in Hyphy (Pond 2001), and the method of Li and Bousquet (1992), implemented in RRTree (Robinson-Rechavi and Huchon 2000), were used for relative rate tests, with PHCHSA (Petunia x hybrida CHSA) used as the reference sequence. The method of Muse and Gaut is a likelihood ratio test (LRT) of rate constancy between two lineages with reference to a third outgroup lineage. The method of Li and Bousquet is a distance-based method that compares substitution rates between monophyletic groups of sequences. The test was applied to synonymous and nonsynonymous rates separately, with the method of Li (1993) being used for rate estimation.
Codon substitution models implemented in the codeml program in the PAML package (Yang 1997) were used to analyze changes in selective pressure during functional divergence of the Dendranthema CHS gene family. Two kinds of codon-substitution models were used. The "branch" models allowed the ω ratio to vary among lineages and were used to construct LRTs to examine whether the ω ratio along lineages after gene duplication was higher than along other lineages. Those models average synonymous and nonsynonymous rates over all sites in the sequence. The "branch-site" model accounts for variation in selective pressure among sites and is used to test for positive selection that affects only a few amino acid sites along the branches of interest (Yang and Nielsen 2002). The model assumes four classes of sites. The first two site classes have ω₀ and ω₁ along all lineages in the phylogeny. The third and fourth site classes have ω₀ and ω₁ along all branches except a few branches of interest, which have ω₂. When the estimate of ω₂ is greater than 1, some sites are under positive selection along the branches of interest. This model can be compared with a "site-specific" model (M3 discrete, K = 2; Yang et al. 2000), which allows for two site classes with ω₀ and ω₁ only, to construct an LRT.
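The assignment of dN/dS ratios (written ω here) to site classes in the branch-site model can be summarized in a few lines of Python. This is only an illustrative sketch of the lookup described above, not the codeml implementation; the class name `BranchSiteOmegas`, the term "foreground" for the branches of interest, and the numerical values in the usage note are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BranchSiteOmegas:
    """Hypothetical container for the three dN/dS ratios of the branch-site model."""
    omega0: float
    omega1: float
    omega2: float

    def omega(self, site_class: int, foreground: bool) -> float:
        """Return the dN/dS ratio for one of the four site classes (0..3).

        Classes 0 and 1 keep omega0 and omega1 on every branch.
        Classes 2 and 3 have omega0 and omega1 on background branches,
        but switch to omega2 on the branches of interest (foreground).
        """
        if site_class not in (0, 1, 2, 3):
            raise ValueError("site_class must be 0, 1, 2, or 3")
        if foreground and site_class >= 2:
            return self.omega2
        # Even-numbered classes use omega0, odd-numbered classes use omega1.
        return self.omega0 if site_class % 2 == 0 else self.omega1
```

For instance, with made-up values `BranchSiteOmegas(0.05, 0.5, 3.0)`, a site in class 2 would have ω = 0.05 on background branches and ω = 3.0 on the branches of interest. Setting aside the last two classes recovers the two-class site-specific model used as the null in the comparison.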
Results |
---|
The phylogenetic relationships between Dendranthema CHS genes and the CHS-like sequences available for Asteridae were inferred by NJ, MP, and ML analyses. NJ trees were reconstructed with several distance measures (JC69, K80, and F84), and a parsimony tree search was conducted using the TBR perturbation algorithm. The HKY85 model of sequence evolution was used for inferring the ML tree. The tree topologies produced by different methods were similar in overall structure. Furthermore, analyzing the first and second codon positions only and analyzing all three codon positions produced similar results. Figure 1 shows the phylogenetic tree reconstructed by ML using the first and second codon positions. It highlights three important features in all inferred trees. First, the CHS sequences of Dendranthema formed three distinct subfamilies, designated here as SF1, SF2, and SF3 (fig. 1). Indeed the different methods produced identical relationships between the subfamilies, although the within-subfamily relationships are not stable. Second, the CHS genes of Dendranthema did not form species-specific clusters but instead formed subfamilies with the CHS genes from other plants. For example, sequences DICHS39, DICHS24, and DICHS3 were from the same species, D. indicum, but they were grouped into different subfamilies. Third, the three subfamilies of the Dendranthema CHS gene clustered with GHCHS1, GHCHS2, and GHCHS3 of Gerbera, respectively. Those three genes have been studied in detail in both gene structure and function. Helariutta et al. (1995) pointed out that GHCHS1, GHCHS2, and GHCHS3 are different members of the Gerbera CHS gene family, and that GHCHS1 and GHCHS3 code for typical CHS enzymes, whereas the GHCHS2 enzyme differs from CHSs in its substrate specificity and reaction. CHS genes in Dendranthema showed a similar divergence pattern and fell into three subfamilies as well. Sequences DNCHS16 and DICHS39 were homologous to GHCHS1, and sequences DICHS24 and DCCHS1 were homologous to GHCHS2, whereas the rest of the Dendranthema CHS sequences were more closely related to GHCHS3 (fig. 1). The phylogenetic analysis thus suggests that the CHS genes of Dendranthema and Gerbera originated from common ancestral genes, and the duplications giving rise to those ancestral genes occurred before the divergence of Dendranthema and Gerbera.
As the molecular clock assumption appears to hold at synonymous sites, we use synonymous rates to calculate rough estimates of the gene duplication times. No good fossil data are available to calibrate the clock. Instead we use the average synonymous substitution rate for plant nuclear genes, about 5 × 10⁻⁹ substitutions per site per year (Li 1997, p. 193). The average synonymous distance from SF1 to SF2 and SF3 was 0.66 substitutions per site. The average synonymous distance between SF2 and SF3 was 0.602 substitutions per site. Thus the times of duplications were estimated to be about 66 (node A in fig. 1) and 60 (node B) million years ago. The divergence time of Asteraceae was estimated to be 30 to 60 million years ago (Cronquist 1977).
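The dating arithmetic can be checked with a short Python sketch. It assumes the usual clock relationship T = K / (2r), where the factor of 2 reflects substitutions accumulating along both lineages since the duplication; this formula is not stated explicitly in the text but is consistent with the reported numbers. The function name `duplication_time_mya` is introduced here for illustration only.

```python
def duplication_time_mya(synonymous_distance: float,
                         rate_per_site_per_year: float = 5e-9) -> float:
    """Rough duplication time in millions of years under a strict clock.

    synonymous_distance: average synonymous substitutions per site
        between the two duplicate lineages (K).
    rate_per_site_per_year: assumed synonymous rate per lineage (r).
    The distance accumulates along both lineages, hence T = K / (2 r).
    """
    years = synonymous_distance / (2.0 * rate_per_site_per_year)
    return years / 1e6


if __name__ == "__main__":
    # Distances reported in the text for nodes A and B.
    for node, distance in (("A", 0.66), ("B", 0.602)):
        print(f"node {node}: about {duplication_time_mya(distance):.1f} million years ago")
```

With the distances given above, the sketch yields roughly 66 and 60 million years, matching the estimates for nodes A and B.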
LRTs of Adaptive Evolution after Gene Duplication
To understand the mechanisms of evolutionary rate variation among lineages, we apply two kinds of likelihood ratio tests based on models of codon substitution. The first analysis examines the variation of selective pressures among lineages, with the selective pressure indicated by the nonsynonymous-synonymous rate ratio (ω = dN/dS) (Yang 1997, 1998; Yang and Nielsen 1998). The phylogeny of Asteraceae CHS genes was assumed as shown in figure 1, but only four sequences from the four outgroup families (PHCHSA, IpurD, PFAB2815, and AMCHS) were used. Three different models were used. The "one-ratio" model assumes the same ω ratio for all lineages (fig. 1; table 3). The log-likelihood value under this model was ℓ₀ = -6881.85, with the estimate ω = 0.053. The low ω ratio highlights the overwhelming role of purifying selection in this gene family. The "free-ratios" model assumes an independent ω ratio for each branch in the tree. The log-likelihood value under this model was ℓ₁ = -6748.87. Comparison of twice the log-likelihood difference, 2Δℓ = 2(ℓ₁ - ℓ₀) = 2 × (-6748.87 - (-6881.85)) = 265.96, with the χ² distribution (df = 46) suggested rejection of the one-ratio model, with P < 0.001. The difference between the two models was significant, indicating that the ω ratios were extremely variable among lineages.
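The likelihood ratio test above can be reproduced with the Python standard library. The sketch below is not part of the original analysis (which used codeml); the function names are hypothetical, and the chi-square tail probability uses the closed-form series that is valid only for an even number of degrees of freedom, which happens to cover df = 46.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate, for even df only.

    For df = 2k the survival function has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch handles positive even df only")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(lnl_null: float, lnl_alt: float, df: int):
    """Return (2 * delta log-likelihood, P value) for nested models."""
    statistic = 2.0 * (lnl_alt - lnl_null)
    return statistic, chi2_sf_even_df(statistic, df)


if __name__ == "__main__":
    # One-ratio (null) versus free-ratios (alternative), values from the text.
    stat, p = likelihood_ratio_test(-6881.85, -6748.87, df=46)
    print(f"2*delta lnL = {stat:.2f}, P < 0.001: {p < 0.001}")
```

The statistic evaluates to 265.96, as reported, and the corresponding tail probability is far below the 0.001 threshold quoted in the text.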
Discussion |
---|
Gene duplication is often followed by an elevated evolutionary rate (Li 1985; Zhang, Rosenberg, and Nei 1998; Duda and Palumbi 1999; Schmidt, Goodman, and Grossman 1999; Bielawski and Yang 2001), which can be due to either positive Darwinian selection for functional divergence (Ohta 1993) or relaxation of selective constraints (Kimura 1983). In the former case, the requirement of the new function exerts directional selective pressure, promoting fixation of advantageous nonsynonymous mutations. In the latter case, neutral mutations are fixed at random, which, perhaps due to environmental changes, leads to a novel function in one or both copies. The two hypotheses are often difficult to distinguish; for example, an elevated ω ratio that is not >1 is compatible with both hypotheses.
In the present analysis of the Dendranthema and Gerbera CHS genes, the relative rate test suggested homogeneous synonymous rates, whereas nonsynonymous rates differed when sequences from SF2 were compared with those from SF1 or SF3. Likelihood analysis using the branch models showed that the ω ratios were highly variable among lineages of the tree, and the ω ratio for the branch ancestral to SF2 is much higher than for other lineages (table 2). Although the estimated ω ratio is not greater than 1, this ratio is an average across all codons in the gene. A second analysis using the branch-site model accounts for heterogeneous selective pressure among sites and provided significant evidence for positive selection (with ω > 1) acting at some sites along the branch ancestral to SF2. Furthermore, studies of Gerbera CHS genes indicate different functions of GHCHS2 from GHCHS1 and GHCHS3 (Helariutta et al. 1995), and the phylogeny of figure 1 suggests similar functional divergence of SF2 from SF1 and SF3 in Dendranthema. Combining those results, we conclude that positive selection is a more likely explanation for the evolution of the Dendranthema CHS gene family than relaxed selective constraints.
The species sampled in this article have different flower colors. Yet no clear correlation can be found between flower color and the CHS genes of Dendranthema. The yellow-flowered species D. indicum has CHS genes from all three subfamilies, whereas the other yellow-flowered species, D. indicum var. aromaticum, D. nankingense, and D. lavandulifolium, have CHS genes from only some of the subfamilies (fig. 1). The white-flowered D. vestitum has CHS genes from SF3 only, and the pink-flowered species D. chanetii has CHS genes from both SF2 and SF3. We note that some members of the CHS gene family may not have been sequenced in some of the species. Also, flower color might not be determined by the number of CHS family members. Furthermore, other loci may affect flower color as well. Clegg, Cummings, and Durbin (1997) examined flower color variation in Ipomoea purpurea and suggested that as many as five loci control floral phenotypes in I. purpurea. Of those, only the A/a locus, which encodes the CHS gene, is well characterized at the molecular level, whereas the other four are not. In Dendranthema very few studies have been conducted on the molecular biology of the genes of flavonoid biosynthesis that determine flower color, and we do not yet know how many genes are responsible for flower color polymorphisms of Dendranthema plants. More work is needed for a complete causal analysis that connects floral phenotypes to genes.
Acknowledgements |
---|
Footnotes |
---|
Keywords: duplication; adaptive evolution; chalcone synthase; gene family; Dendranthema
Address for correspondence and reprints: Ziheng Yang, Department of Biology, University College London, Darwin Building, Gower Street, London WC1E 6BT, U.K. E-mail: z.yang@ucl.ac.uk
References |
---|
Bielawski J. P., Z. Yang, 2001 Positive and negative selection in the DAZ gene family Mol. Biol. Evol 18:523-529
Clegg M. T., M. P. Cummings, M. L. Durbin, 1997 The evolution of plant nuclear genes Proc. Natl. Acad. Sci. USA 94:7791-7798.
Courtney-Gutterson N., C. Napoli, C. Lemieux, A. Morgan, E. Firoozabady, K. E. Robison, 1994 Modification of flower color in florist's chrysanthemum: production of a white-flowering variety through molecular genetics Biotechnology 12:268-271
Cronquist A., 1977 The Compositae revisited Brittonia 29:137-153
Duda T. F. Jr., S. R. Palumbi, 1999 Molecular genetics of ecological diversification: duplication and rapid evolution of toxin genes of the venomous gastropod Conus Proc. Natl. Acad. Sci. USA 96:6820-6823
Durbin M. L., G. H. Learn, G. A. Huttley, M. T. Clegg, 1995 Evolution of the chalcone synthase gene family in the genus Ipomoea Proc. Natl. Acad. Sci. USA 92:3338-3342
Durbin M. L., B. McCaig, M. T. Clegg, 2000 Molecular evolution of the chalcone synthase multigene family in the morning glory genome Plant Mol. Biol 42:79-92
Felsenstein J., 1981 Evolutionary trees from DNA sequences: a maximum likelihood approach J. Mol. Evol 17:368-376
Force A., M. Lynch, F. B. Pickett, A. Amores, Y. L. Yan, J. Postlethwait, 1999 Preservation of duplicate genes by complementary, degenerative mutations Genetics 151:1531-1545.
Hasegawa M., H. Kishino, T. Yano, 1985 Dating the human-ape split by a molecular clock of mitochondrial DNA J. Mol. Evol 22:160-174
Helariutta Y., P. Elomaa, M. Kotilainen, R. J. Griesbach, J. Schroder, T. H. Teeri, 1995 Chalcone synthase-like genes active during corolla development are differentially expressed and encode enzymes with different catalytic properties in Gerbera hybrida (Asteraceae) Plant Mol. Biol 28:47-60
Helariutta Y., M. Kotilainen, P. Elomaa, N. Kalkkinen, K. Bremer, T. H. Teeri, V. A. Albert, 1996 Duplication and functional divergence in the chalcone synthase gene family of Asteraceae: evolution with substrate change and catalytic simplification Proc. Natl. Acad. Sci. USA 93:9033-9038
Ito M., Y. Ichinose, H. Kato, T. Shiraishi, T. Yamada, 1997 Molecular evolution and functional relevance of the chalcone synthase genes of pea Mol. Gen. Genet 255:28-37
Jukes T. H., C. R. Cantor, 1969 Evolution of protein molecules Pp. 21-123 in H. N. Munro, ed. Mammalian protein metabolism. Academic Press, New York
Kimura M., 1980 A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences J. Mol. Evol 16:111-120
Kimura M., 1983 The neutral theory of molecular evolution Cambridge University Press, Cambridge, U.K
Koes R. E., F. Quattrocchio, J. N. M. Mol, 1994 The flavonoid biosynthetic pathway in plants: function and evolution BioEssays 16:123-132
Koes R. E., C. E. Spelt, P. J. van den Elzen, J. N. M. Mol, 1989 Cloning and molecular characterization of the chalcone synthase multigene family of Petunia hybrida Gene 81:245-257
Kumar S., K. Tamura, I. B. Jakobsen, M. Nei, 2001 MEGA2: molecular evolutionary genetics analysis software Arizona State University, Tempe, Ariz
Li W.-H., 1985 Accelerated evolution following gene duplication and its implications for the neutralist-selectionist controversy Pp. 333-352 in T. Ohta and K. Aoki, eds. Population genetics and molecular evolution. Japan Scientific Press, Tokyo
Li W.-H., 1993 Unbiased estimation of the rates of synonymous and nonsynonymous substitution J. Mol. Evol 36:96-99
Li W.-H., 1997 Molecular evolution Sinauer Associates, Inc., Sunderland, Mass
Li P., J. Bousquet, 1992 Relative-rate test for nucleotide substitutions between two lineages Mol. Biol. Evol 9:1185-1189
Muse S. V., B. S. Gaut, 1994 A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates with application to the chloroplast genome Mol. Biol. Evol 11:715-724
Nei M., T. Gojobori, 1986 Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions Mol. Biol. Evol 3:418-426
NC-IUB (Nomenclature Committee of the International Union of Biochemistry). 1985 Nomenclature for incompletely specified bases in nucleic acid sequences Recommendations. Eur. J. Biochem 150:1.
Ohno S., 1970 Evolution by gene duplication Springer-Verlag, New York
Ohta T., 1993 Pattern of nucleotide substitution in growth hormone-prolactin gene family: a paradigm for evolution by gene duplication Genetics 134:1271-1276
Pond S. K., 2001 Hypothesis testing using phylogenies (HYPHY). Version 0.91 beta University of Arizona, Tucson.
Robinson-Rechavi M., D. Huchon, 2000 RRTree: relative-rate tests between groups of sequences on a phylogenetic tree Bioinformatics 16:296-297
Ryder T. B., S. A. Hedrick, J. N. Bell, X. Liang, S. D. Clouse, C. J. Lamb, 1987 Organization and differential activation of a gene family encoding the plant defense enzyme chalcone synthase in Phaseolus vulgaris Mol. Gen. Genet 210:219-233
Saitou N., M. Nei, 1987 The neighbor-joining method: a new method for reconstructing phylogenetic trees Mol. Biol. Evol 4:406-425
Schmidt T. R., M. Goodman, L. I. Grossman, 1999 Molecular evolution of the COX7A gene family in primates Mol. Biol. Evol 16:619-626
Swofford D. L., 1998 PAUP*: phylogenetic analysis using parsimony (* and other methods). Version 4.0 Sinauer Associates, Sunderland, Mass
Thompson J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, D. G. Higgins, 1997 The CLUSTAL-X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools Nucleic Acids Res 25:4876-4882
Tropf S., B. Karcher, G. Schroder, J. Schroder, 1995 Reaction mechanisms of homodimeric plant polyketide synthases (stilbene and chalcone synthase). A single active site for the condensing reaction is sufficient for synthesis of stilbenes, chalcones, and 6'-deoxychalcones J. Biol. Chem 270:7922-7928
Tropf S., T. Lanz, S. A. Rensing, J. Schroder, G. Schroder, 1994 Evidence that stilbene synthases have developed from chalcone synthases several times in the course of evolution J. Mol. Evol 38:610-618
Wang J., L. Qu, J. Chen, H. Gu, Z. Chen, 2000 Molecular evolution of the exon 2 of CHS genes and the possibility of its application to plant phylogenetic analysis Chinese Sci. Bull 45:1735-1742
Wingender R., H. Rohrig, C. Horicke, D. Wing, J. Schell, 1989 Differential regulation of soybean chalcone synthase genes in plant defence, symbiosis and upon environmental stimuli Mol. Gen. Genet 218:315-322
Yamazaki Y., D. Suh, W. Sitthithaworn, K. Ishiguro, Y. Kobayashi, M. Shibuya, Y. Ebizuka, U. Sankawa, 2001 Diverse chalcone synthase superfamily enzymes from the most primitive vascular plant, Psilotum nudum Planta 214:75-84
Yang Z., 1994 Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods J. Mol. Evol 39:306-314
Yang Z., 1997 PAML: a program package for phylogenetic analysis by maximum likelihood Comput. Appl. Biosci 13:555-556 [http://abacus.gene.ucl.ac.uk/software/paml.html]
Yang Z., 1998 Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution Mol. Biol. Evol 15:568-573
Yang Z., R. Nielsen, 1998 Synonymous and nonsynonymous rate variation in nuclear genes of mammals J. Mol. Evol 46:409-418
Yang Z., R. Nielsen, 2002 Codon-substitution models for detecting molecular adaptation at individual sites along specific lineages Mol. Biol. Evol 19:908-917
Yang Z., R. Nielsen, N. Goldman, A.-M. K. Pedersen, 2000 Codon-substitution models for heterogeneous selection pressure at amino acid sites Genetics 155:431-449
Zhang J., H. F. Rosenberg, M. Nei, 1998 Positive Darwinian selection after gene duplication in primate ribonuclease genes Proc. Natl. Acad. Sci. USA 95:3708-3713