* Laboratory of Biometrics and Bioinformatics, Graduate School of Agriculture and Life Sciences, University of Tokyo, Bunkyo-ku, Tokyo, Japan
Institute for Bioinformatics Research and Development (BIRD), Japan Science and Technology Agency (JST), Kawaguchi, Saitama, Japan
Correspondence: E-mail: zzhang{at}lbm.ab.a.u-tokyo.ac.jp.
Abstract |
---|
Key Words: amylase; duplicated genes; genomic background; evolutionary fate; Drosophila
Introduction |
---|
Eukaryotic genomes are not uniform in recombination and mutation rates (Wolfe, Sharp, and Li 1989; Hey and Kliman 2002), and this heterogeneity affects the evolutionary rates and patterns of genes (Stephan and Langley 1989; Takano-Shimizu 1999, 2001; Munte, Aguade, and Segarra 2001). Furthermore, local recombination and mutation rates may or may not differ between the two copies of a duplicated gene. At one extreme, tandemly repeated genes may share a similar genomic background and are likely to evolve via concerted evolution. At the other extreme, duplicated genes located far apart may have very different genomic backgrounds and experience very different evolutionary processes. Therefore, genomic background factors such as local recombination and mutation rates may predict the fates of the latter kind of recently duplicated genes.
Thornton and Long (2002) found that the average ratio of nonsynonymous to synonymous substitutions between duplicated genes on the X chromosome is significantly higher than the genome average in Drosophila melanogaster, implying that genomic locations affect the divergence between duplicated genes. Based on their survey for new retrogenes and the functionality and evolution of those genes, they found that there is a significant excess of retrogenes from the X chromosome that retropose to autosomes. Moreover, most X-derived autosomal retrogenes have evolved a testicular expression pattern (Betran, Thornton, and Long 2002). These observations may be explained by natural selection favoring those new retrogenes that moved to autosomes and thus avoided X inactivation; they also suggest the importance of genome position for the origin of new genes.
The Amy genes of Drosophila constitute a relatively small multigene family with two to seven members in different species. Inomata and Yamazaki (2000) first found that in D. kikkawai and its sibling species there are two divergent Amy gene clusters, each encoding active isozymes. Cluster 1 is in the middle of the B arm of chromosome 2, thought to be a region with a normal recombination rate, and cluster 2 is near the centromere, a region with reduced recombination (Ashburner 1989). The two clusters exhibit significant divergence at synonymous sites and different expression levels and patterns (Inomata and Yamazaki 2000). Similar observations were reported in D. ananassae (Da Lage, Maczkowiak, and Cariou 2000). Zhang et al. (2002) showed that the difference in GC3 content at synonymous sites between clusters 1 and 2 was caused primarily by changes in selection intensity immediately after gene duplication in the montium subgroup. Based on analyses of the coding and 3'-flanking regions of the extended Amy gene sequences, we show here that the difference in local recombination rate, rather than mutation bias, has contributed significantly to the divergence at synonymous sites between clusters 1 and 2 in Drosophila species. Alternatively, the different patterns and levels of expression between the two clusters might be caused by chromatin potentiation, and this may explain the decreased codon bias in cluster 2. Both of these hypotheses suggest that genomic background has had a significant effect on the divergence of non-tandemly duplicated genes.
Materials and Methods |
---|
As there is a great difference in GC3 content between the two types of Amy genes (Da Lage, Maczkowiak, and Cariou 2000; Inomata and Yamazaki 2000; Zhang et al. 2002), the sequences of the first and second codon positions were used for phylogenetic analysis to reduce the effects of compositional bias on phylogenetic reconstruction. Neighbor-Joining (NJ), maximum parsimony (MP), and maximum likelihood (ML) methods, implemented in PAUP* 4.0 (Swofford 2001), were used for phylogenetic analysis. NJ analyses were carried out using the JC69, K80, and TN93 distances to examine their effects on topological stability. A heuristic tree search under parsimony was performed using the tree-bisection-reconnection (TBR) swapping algorithm. Maximum likelihood trees were generated under the general time-reversible (GTR) model of evolution with a discrete gamma model (d) allowing for four categories of rate variation among sites (Swofford 2001). Heuristic searches under the ML optimality criterion were conducted using an MP starting tree and the nearest-neighbor interchange (NNI) branch-swapping algorithm. The reliability of the tree topology was assessed by bootstrap analysis, with 1,000 resampling replicates for the MP and NJ methods and 100 replicates for the ML method.
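The codon-position filtering and the simplest of the distances named above can be sketched in a few lines of Python. This is only an illustrative sketch, not the PAUP* implementation: the function names are hypothetical, the input is assumed to be an in-frame, aligned coding sequence, and only JC69 is shown (K80 and TN93 need separate transition and transversion counts).

```python
import math

def first_second_positions(cds: str) -> str:
    """Keep codon positions 1 and 2, dropping every third base.

    Assumes the sequence starts in frame; trailing partial codons are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(cds[i:i + 2] for i in range(0, usable, 3))

def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, skipping gaps and ambiguous bases."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def jc69_distance(a: str, b: str) -> float:
    """Jukes-Cantor (JC69) distance: d = -3/4 * ln(1 - 4p/3)."""
    p = p_distance(a, b)
    if p >= 0.75:
        raise ValueError("sequences too divergent for the JC69 correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

In use, each aligned coding sequence would first be passed through `first_second_positions`, and the pairwise distances would then be computed on the reduced alignment before tree building.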
To test for differences in evolutionary rate between the two types of the Amy genes, the distance-based method of Li and Bousquet (1992), implemented in RRTree (Robinson-Rechavi and Huchon 2000), and a likelihood ratio test method of Muse and Gaut (1994), implemented in Hy-Phy (Muse and Pond 2000), were used for relative rate tests. The tests were applied to synonymous and nonsynonymous substitution rates separately.
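The logic of a likelihood ratio test of rate equality can be sketched as follows. This is a minimal illustration and not the Hy-Phy procedure itself: it assumes the null model (equal rates in the two lineages) differs from the alternative by a single constrained parameter, so that the statistic is compared with a chi-square distribution with one degree of freedom, and the log-likelihood values in the usage note are hypothetical.

```python
import math

def lrt_one_df(lnl_constrained: float, lnl_free: float):
    """Likelihood ratio test with one degree of freedom.

    lnl_constrained: log-likelihood with rates forced to be equal (null).
    lnl_free: log-likelihood with rates estimated separately (alternative).
    Returns the test statistic and its chi-square (1 df) p-value.
    """
    stat = max(2.0 * (lnl_free - lnl_constrained), 0.0)
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

For example, hypothetical log-likelihoods of -1005.0 (equal rates) and -1000.0 (free rates) give a statistic of 10.0, which would reject rate equality at conventional significance levels.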
Results and Discussion |
---|
The Amy genes with lower GC3 contents are located in the regions near the centromeres of chromosomes 3 and 2 in D. ananassae (Da Lage, Maczkowiak, and Cariou 2000) and in D. kikkawai (Inomata and Yamazaki 2000). These regions have reduced local recombination rates (Ashburner 1989). In contrast, Amy genes with higher GC3 contents are located in the middle of the arm of chromosome 2, where the local recombination rate is normal (Ashburner 1989). This suggests that the local recombination rate affected nucleotide divergence between the two Amy gene clusters within species. Based on the pattern of polymorphism and divergence at synonymous sites, synonymous substitutions have been found to be subject to weak selection against major and non-major codons (Akashi 1995). Increasing evidence suggests that natural selection acts on synonymous sites in genes of Drosophila (Takano-Shimizu 1999; Munte, Aguade, and Segarra 2001). The results in table 2 indicate that the lower local recombination rate relaxed the selective constraint on synonymous substitutions in cluster 2 in the different subgroups because of the Hill-Robertson effect (Hill and Robertson 1966).
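For readers unfamiliar with the statistic, GC3 is the fraction of G or C among third codon positions. The sketch below is a simplified illustration with a hypothetical function name; it assumes an in-frame coding sequence and, unlike some published definitions, does not exclude Met, Trp, or stop codons.

```python
def gc3_content(cds: str) -> float:
    """Fraction of G or C at third codon positions of an in-frame sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    third = [cds[i + 2] for i in range(0, usable, 3)]
    # Ignore gaps and ambiguous bases rather than counting them as A/T.
    called = [base for base in third if base in "ACGT"]
    if not called:
        raise ValueError("no unambiguous third positions")
    return sum(base in "GC" for base in called) / len(called)
```

Applied to each Amy coding sequence, a function of this kind would yield the per-gene values that are compared between clusters 1 and 2.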
It should be pointed out that the differences in recombination rate between clusters 1 and 2 were qualitatively inferred based on their locations on particular chromosomes. Local recombination rates of orthologous regions may vary among the genomes of related species. Takano-Shimizu (1999) observed differences in the GC3 contents of the yellow gene between closely related species of Drosophila and experimentally suggested a difference in local recombination rates as a potential cause. In the present analysis, however, GC3 contents were very similar within clusters 1 and 2 (table 1). Thus, we have no reason to expect large variations in local recombination rates within the clusters.
Local mutation bias might also explain the divergence between the two Amy clusters within species. However, Zhang et al. (2002) examined the GC contents of the introns and the 5'-flanking nucleotide sequences and found no difference in mutation bias between clusters 1 and 2 in the montium subgroup. It may be argued that the intron and 5'-flanking sequences are under some selective constraint: about 50% of the unique short intron corresponds to elements required for the splicing reaction (Mount et al. 1992), and the 5'-flanking region harbors the promoter and other regulatory elements of gene expression. We therefore examined the 24 3'-flanking nucleotide sequences available from GenBank. The sequence lengths varied from 121 bp to 400 bp. Although the alignment columns without gaps are too short to estimate detailed phylogenetic relationships, the clusters formed groups consistent with those of the coding regions (results not shown). In contrast to the coding regions, the average GC content in the 3'-flanking region of Amy gene cluster 1 is lower than that of cluster 2 in the montium subgroup (table 1). In D. ananassae, the GC contents of the 3'-flanking regions are 45.75% and 27.75% for Amy35 and Amy58 of cluster 1, respectively, and 28.26% and 36.75% for Amy4N and Amyi5 of cluster 2, respectively. The four D. ananassae Amy genes have 3'-flanking regions with a shared length of 400 bp. GC content varies more widely in these regions than in the coding regions. There is thus no evidence that mutation bias has shaped the composition patterns of the two Amy clusters within species. Therefore, in the case of the Amy gene family in Drosophila, local recombination rate may be an important factor in the genomic background. By comparing orthologous sequences in Drosophila species, Takano-Shimizu (2001) observed a positive correlation in GC content between coding and noncoding regions. However, we did not find such a correlation in the comparison of paralogous genes. Mutation bias explains species-specific GC contents, but the location effect, acting via local recombination rate, makes a large contribution to the divergence between duplicated genes.
Local recombination rates affect natural selection through changes in the effective population size. Thus, changes in the effective population size should affect all types of substitutions in genes. However, changes in local recombination rate do not seem to have affected the corresponding divergence at the amino acid level between the two Amy clusters within species (table 2). Alpha-amylase plays a major role in the digestion of carbohydrates by hydrolyzing starch from food substrates into smaller sugars, such as maltose and glucose. Both Amy clusters are active and expressed (Da Lage, Maczkowiak, and Cariou 2000; Inomata and Yamazaki 2000; Zhang et al. 2002). Strong purifying selection might therefore prevent amino acid changes. This implies that most possible replacement substitutions are deleterious and suggests that rates of amino acid replacement are insensitive to differences in the effective population size of the Amy gene region. Similar results have been observed in a study of the yellow (y) gene in Drosophila (Munte, Aguade, and Segarra 2001). In addition, Zeng et al. (1998) inferred that the rates of amino acid replacement in Drosophila were not overdispersed. One possibility is that effective population sizes in Drosophila are large enough for most nonsynonymous mutations to be effectively deleterious, and therefore they do not become fixed (Zeng et al. 1998).
It is also of interest that the codon usage bias of a gene is positively related to its expression level (Shields et al. 1988). As indicated in previous studies (Da Lage, Maczkowiak, and Cariou 2000; Inomata and Yamazaki 2000; Zhang et al. 2002), clusters 1 and 2 have different patterns and levels of expression, cluster 2 being expressed less than cluster 1. Changes in codon usage bias might therefore be caused by changes in expression level. In cluster 2, a decrease in expression level has led to a decrease in selection on codon usage bias. Consequently, codon usage bias has gone down and the synonymous substitution rate has gone up (table 1). Two factors might explain this variation. One possibility is changes to regulatory elements such as cis-sequences or trans-acting elements. In fact, cluster 2 in general has lost some cis-regulatory elements (Da Lage, Maczkowiak, and Cariou 2000; Inomata and Yamazaki 2000; Zhang et al. 2002), which might be caused by relaxation of purifying selection in a region of lower recombination. The other possibility lies in chromosomal domains of expression, such as chromatin potentiation in higher eukaryotes (Kramer et al. 1998; Boutanaev et al. 2002), because cluster 2 is near the centromere and cluster 1 is in the middle of its chromosomal arm; they therefore probably have different chromatin structures. It is not clear which of these scenarios is true. However, all the evidence suggests that genomic background drives the divergence between the two Amy clusters within Drosophila species, although it may act through different mechanisms.
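Codon usage bias can be summarized in several ways. One common summary is relative synonymous codon usage (RSCU), the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below is illustrative only and is not necessarily the measure used in the studies cited above; the function name is hypothetical and the standard genetic code is assumed.

```python
from collections import Counter, defaultdict

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def rscu(cds: str) -> dict:
    """RSCU per codon for every amino acid present in an in-frame sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":  # stop codons carry no synonymous choice of interest
            families[aa].append(codon)
    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the sequence
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values
```

Values near 1 for all codons of an amino acid indicate little bias, whereas a gene under stronger selection for preferred codons shows a few values well above 1 and the rest near 0.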
Thus, a change of selection intensity triggered by variation in genomic background along a genome seems to be the most general model that can account for the divergence at synonymous sites, but not at the amino acid level, between the two Amy gene clusters within Drosophila species. With these clues from the evolution of the Amy gene family in Drosophila, together with emerging evidence from genomic data (see review in Kondrashov et al. 2002), it is most likely that duplicated genes are not redundant from the start because of selection for increased dosage (Graur and Li 1999; Force et al. 1999). Thus, after duplication, either copy should be subject to purifying selection (Kondrashov et al. 2002). Under certain selection pressures, the fates of duplicate genes will depend on their genomic backgrounds. Concerted evolution between tandemly repeated genes is also consistent with the genomic background hypothesis, because they share a similar genomic background (Graur and Li 1999). In contrast, if two duplicate genes are located in different regions, the difference in genomic backgrounds will be a driving force for divergence. Because eukaryotic genomes are heterogeneous in recombination and mutation rates and in chromatin potentiation, this heterogeneity will accelerate the divergence of non-tandemly duplicated genes.
Literature Cited |
---|
Akashi, H. 1995. Inferring weak selection from patterns of polymorphism and divergence at "silent" sites in Drosophila DNA. Genetics 139:1067-1076.
Ashburner, M. D. 1989. Drosophila. A laboratory handbook. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
Betran, E., K. Thornton, and M. Long. 2002. Retroposed new genes out of the X in Drosophila. Genome Res. 12:1854-1859.
Boutanaev, A. M., A. I. Kalmykova, Y. Y. Shevelyov, and D. I. Nurminsky. 2002. Large clusters of co-expressed genes in the Drosophila genome. Nature 420:666-669.
Brown, C. J., C. F. Aquadro, and W. W. Anderson. 1990. DNA sequence evolution of the amylase multigene family in Drosophila pseudoobscura. Genetics 126:131-138.
Da Lage, J.-L., F. Maczkowiak, and M.-L. Cariou. 2000. Molecular characterization and evolution of the amylase multigene family of Drosophila ananassae. J. Mol. Evol. 51:391-403.
Force, A., M. Lynch, B. Pickett, A. Amores, and Y.-L. Yan, et al. 1999. Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151:1531-1545.
Graur, D., and W.-H. Li. 1999. Fundamentals of molecular evolution, 2nd Edition. Sinauer Associates, Sunderland, Mass.
Hill, W. G., and A. Robertson. 1966. The effect of linkage on limits to artificial selection. Genet. Res. 8:269-294.
Hey, J., and R. M. Kliman. 2002. Interactions between natural selection, recombination and gene density in the genes of Drosophila. Genetics 160:595-608.
Inomata, N., H. Tachida, and T. Yamazaki. 1997. Molecular evolution of the Amy multigenes in the subgenus Sophophora of Drosophila. Mol. Biol. Evol. 14:942-950.
Inomata, N., and T. Yamazaki. 2000. Evolution of nucleotide substitutions and gene regulation in the amylase multigenes in Drosophila kikkawai and its sibling species. Mol. Biol. Evol. 17:601-615.
Kondrashov, F. A., I. B. Rogozin, Y. I. Wolf, and E. V. Koonin. 2002. Selection in the evolution of gene duplications. Genome Biol. 3:research 0008.1-0008.9.
Kramer, J. A., J. R. McCarrey, D. Djakiew, and S. A. Krawetz. 1998. Differentiation: the selective potentiation of chromatin domains. Development 125:4749-4755.
Li, P., and J. Bousquet. 1992. Relative-rate test for nucleotide substitutions between two lineages. Mol. Biol. Evol. 9:1185-1189.
Lynch, M., and J. S. Conery. 2000. The evolutionary fate and consequences of duplicate genes. Science 290:1151-1155.
Mount, S. M., C. Burks, G. Hertz, G. D. Stormo, O. White, and C. Fields. 1992. Splicing signals in Drosophila: intron size, information content, and consensus sequences. Nucleic Acids Res. 20:4255-4262.
Munte, A., M. Aguade, and C. Segarra. 2001. Changes in the recombinational environment affect divergence in the yellow gene of Drosophila. Mol. Biol. Evol. 18:1045-1056.
Muse, S. V., and B. S. Gaut. 1994. A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates with application to the chloroplast genome. Mol. Biol. Evol. 11:715-724.
Muse, S. V., and S. K. Pond. 2000. Hy-Phy user manual. North Carolina State University, Raleigh; University of Arizona, Tucson.
Ohno, S. 1970. Evolution by Gene Duplication. Springer-Verlag, Heidelberg.
Robinson-Rechavi, M., and D. Huchon. 2000. RRTree: relative-rate tests between groups of sequences on a phylogenetic tree. Bioinformatics 16:296-297.
Rubin, G. M., M. D. Yandell, and J. R. Wortman, et al. (50 co-authors). 2000. Comparative genomics of the eukaryotes. Science 287:2204-2215.
Shields, D. C., P. M. Sharp, D. G. Higgins, and F. Wright. 1988. "Silent" sites in Drosophila genes are not neutral: evidence of selection among synonymous codons. Mol. Biol. Evol. 5:704-716.
Steinemann, S., and M. Steinemann. 1999. The amylase gene cluster on the evolving sex chromosomes of Drosophila miranda. Genetics 151:151-161.
Stephan, W., and C. H. Langley. 1989. Molecular genetic variation in the centromeric region of the X chromosome in three Drosophila ananassae populations. 1. Contrasts between the vermilion and forked loci. Genetics 121:89-99.
Swofford, D. L. 2001. PAUP*: phylogenetic analysis using parsimony (*and other methods), Version 4. Sinauer Associates, Sunderland, Mass.
Takano-Shimizu, T. 1999. Local recombination and mutation effects on molecular evolution. Genetics 153:1285-1296.
Takano-Shimizu, T. 2001. Local changes in GC/AT substitution biases and in crossover frequencies on Drosophila chromosomes. Mol. Biol. Evol. 18:606-619.
Thornton, K., and M. Long. 2002. Rapid divergence of gene duplicates on the Drosophila melanogaster X chromosome. Mol. Biol. Evol. 19:918-925.
Wolfe, K. H., P. M. Sharp, and W.-H. Li. 1989. Mutation rates differ among regions of the mammalian genome. Nature 337:283-285.
Zeng, L.-W., J. M. Comeron, B. Chen, and M. Kreitman. 1998. The molecular clock revisited: the rate of synonymous vs. replacement change in Drosophila. Genetica 102/103:369-382.
Zhang, Z., N. Inomata, T. Ohba, M.-L. Cariou, and T. Yamazaki. 2002. Codon bias differentiates between the duplicated amylase loci following gene duplication in Drosophila. Genetics 161:1187-1196.