Laboratory of Plant Genetics, Graduate School of Agriculture, Kyoto University, Kyoto, Japan
Correspondence: E-mail: akirakawabe{at}hotmail.com.
Abstract |
---|
Key Words: PgiC, Arabidopsis halleri ssp. gemmifera, pseudogene, polymorphism
Introduction |
---|
In the present study, DNA polymorphisms in the two PgiC genes were analyzed in A. halleri ssp. gemmifera. DNA polymorphism in the PgiC gene was described previously in A. thaliana (Kawabe, Yamane, and Miyashita 2000). The nucleotide diversity in the PgiC region was 0.0038, which was relatively low compared with that in other genes of A. thaliana. In A. thaliana, two divergent sequences associated with the two allozyme classes were observed. An excess of singleton sites and a high proportion of polymorphic replacements were also observed in the PgiC region of A. thaliana. A newly arisen advantageous allozyme (Fast type) was proposed as the cause of these nonneutral DNA polymorphisms in A. thaliana PgiC.
The sequences of the PgiC genes in A. halleri ssp. gemmifera have been published (Kawabe and Miyashita 2002b). There are two PgiC loci in A. halleri ssp. gemmifera, which are thought to have arisen by a duplication after A. halleri ssp. gemmifera and A. thaliana split. Expression of one of these loci, PgiC2, was not detected by RT-PCR of cDNAs derived from whole plants during vegetative growth stages. Thus, the pattern of expression of PgiC2 should differ from those of PgiC1 and A. thaliana PgiC. The PgiC2 locus may not have a functional promoter. In the present study, the possibility that expression of PgiC2 is silenced was examined by analyzing DNA polymorphisms. The other locus (PgiC1) shares a common structure with A. thaliana PgiC and is expressed normally.
The present study had two primary objectives. The first was to identify and characterize DNA polymorphisms in the PgiC loci. DNA polymorphisms in the PgiC1 gene of A. halleri ssp. gemmifera were then compared with those in PgiC of A. thaliana and other species to evaluate gene-specific patterns in DNA polymorphism in the PgiC gene. The second objective was to compare DNA polymorphisms between the active gene and the pseudogene in A. halleri ssp. gemmifera. The rates of nonsynonymous substitutions are higher in pseudogenes than in functional genes (Li 1981; Miyata and Hayashida 1981; Gojobori, Li, and Graur 1982), because pseudogenes are not subject to selective constraints. Thus, in pseudogenes, population structure and/or species history influence patterns and levels of DNA polymorphism directly. By comparing DNA polymorphisms among Adh, ChiA, and the active and pseudogene forms of PgiC, it may be possible to distinguish the effects of population structure and species history from those of selection. If selection caused an excess of singleton sites and a high proportion of polymorphic replacements, the active PgiC gene, which is under different selection pressure from that of Adh and ChiA, may have a different pattern of DNA polymorphism. However, if population structure and species history caused the nonneutral patterns of DNA polymorphism, all genes, including pseudogenes, should show similar DNA polymorphism patterns.
Materials and Methods |
---|
DNA Sequencing
Total DNAs were purified by the CTAB method as described in Kawabe, Yamane, and Miyashita (2000) and used as templates for PCR amplification of an approximately 6-kb fragment that includes the entire coding region of the PgiC gene. Two primers, AGPGIF2 (5'-GGT TTG GGT TCG TAT TAG AT-3') and AGPGIDU2 (5'-ATC ATT GTG GTT CTG TCT AA-3'), were designed in the 5' flanking regions of PgiC1 and PgiC2, respectively. These primers and ATHPGI102 (5'-TTT ATG GGG TTT GGA TTA TTA G-3'), which had been designed in the 3' flanking region of the PgiC locus of A. thaliana (Thomas et al. 1993), were used for PCR amplification. PCR products were cloned into pUC18. Three clones were then mixed at equal concentrations and used as templates for sequencing reactions to avoid PCR artifacts and heterozygous sites. If heterozygous length variations were present in the sample and caused sequencing failures, the three clones were sequenced separately, and the consensus sequence was obtained. The PgiC2 locus of strain Taihei15 had two PCR bands with an approximately 500-bp difference in length. Both bands were cloned and sequenced. Twenty primers designed at approximately 500-bp intervals were used to sequence both strands of PgiC1 and PgiC2. Newly determined DNA sequences were deposited in the DNA Data Bank of Japan database under accession numbers AB100274 to AB100303.
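The text does not say how the consensus of the three separately sequenced clones was derived. One plausible reading is a per-site majority vote, sketched below. The function name `majority_consensus`, the use of `N` for unresolved sites, and the assumption that the clone sequences are already aligned to equal length are illustrative choices, not details from the study.

```python
from collections import Counter


def majority_consensus(clones):
    """Return the per-site majority base of pre-aligned clone sequences.

    Assumption: all sequences have the same length (already aligned).
    Sites without a strict majority are reported as 'N'.
    """
    if len({len(seq) for seq in clones}) != 1:
        raise ValueError("clone sequences must be aligned to equal length")
    consensus = []
    for column in zip(*clones):
        base, count = Counter(column).most_common(1)[0]
        # Require a strict majority so that a single PCR error is outvoted.
        consensus.append(base if count > len(column) / 2 else "N")
    return "".join(consensus)
```

With three clones, a substitution introduced by PCR in one clone is outvoted by the other two, which is the rationale given in the text for using several clones.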
Data Analyses
Analyzed regions included 0.8-kb and 1.2-kb of the 5' flanking regions of PgiC1 and PgiC2, respectively. In the present study, nucleotide positions were assigned relative to the translation initiation site (+1) of strain Ashibi. For the coding region of the PgiC2 gene, the original frame was considered for analyses of synonymous and replacement changes irrespective of frame shift variations. The DnaSP program version 3.0 (Rozas and Rozas 1999) was used to analyze intraspecific and interspecific variations. For A. halleri ssp. gemmifera, nucleotide diversity (π) (Nei and Li 1979; Tajima and Nei 1984) and θ (4Neµ) (Watterson 1975) were estimated. The neutral hypothesis was assessed with the tests of Tajima (1989a), McDonald and Kreitman (1991), and Fu and Li (1993). A Neighbor-Joining (NJ) tree (Saitou and Nei 1987) was constructed based on Jukes and Cantor distances (1969) with MEGA2 (Kumar et al. 2000). Because introns could not be aligned between PgiC1 and PgiC2, only coding sequences were used for construction of a phylogenetic tree. Bootstrap probabilities with 500 replications were obtained for each internal branch.
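The summary statistics named above can be illustrated with a small sketch. It computes nucleotide diversity, Watterson's estimator, and Tajima's D using the standard formulas of Tajima (1989a). It is not the DnaSP implementation: it assumes a gap-free alignment of equal-length sequences, counts each variable column as one segregating site, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Count alignment columns that carry more than one nucleotide (S)."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differences over all pairs of sequences (k)."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """Per-site nucleotide diversity: k divided by the alignment length."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def watterson_theta(seqs):
    """Per-site Watterson estimator: S / (a1 * L)."""
    a1 = sum(1 / i for i in range(1, len(seqs)))
    return segregating_sites(seqs) / a1 / len(seqs[0])


def tajima_d(seqs):
    """Tajima's D; returns None when the sample has no variation."""
    n = len(seqs)
    if n < 4:
        raise ValueError("the variance term is zero for fewer than 4 sequences")
    s = segregating_sites(seqs)
    if s == 0:
        return None
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    k = mean_pairwise_differences(seqs)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
```

When every variable site is a singleton, the mean pairwise difference falls below S/a1 and D is negative, which is the signature of an excess of singleton sites.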
Results |
---|
DNA Variations in the PgiC Loci in A. halleri ssp. gemmifera and Between Species
The differences in the ratios of synonymous to replacement changes between polymorphism and divergence were analyzed for the PgiC loci of A. halleri ssp. gemmifera (table 2). For divergence between species, fixed synonymous differences were much more numerous than fixed replacement differences. The high proportions of polymorphic replacement sites in both PgiC loci of A. halleri ssp. gemmifera yielded statistically significant results for the McDonald and Kreitman (MK) test; the results for the PgiC2 locus showed greater significance because of the higher proportion of polymorphic replacement sites. Thus, the proportion of polymorphic replacement sites was high in the PgiC loci, especially in PgiC2, in comparison with that of between-species changes. Again, the high proportion of replacements in PgiC2 and the statistical significance of these differences suggest that PgiC2 is subject to relaxed selective pressure because it is a pseudogene.
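The MK test compares replacement and synonymous counts within species (polymorphism) against those fixed between species (divergence) in a 2 x 2 table. The sketch below is a minimal illustration that assumes a two-sided Fisher's exact test as the significance test; the text does not state which test statistic was applied. The counts in the usage lines are invented for illustration and are not the values of table 2.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # Sum all tables that are no more probable than the observed one.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-9)))


def mk_test(poly_rep, poly_syn, fixed_rep, fixed_syn):
    """Return (neutrality index, p-value) for an MK table.

    The neutrality index is (Pn/Ps) / (Dn/Ds); it is None when a
    denominator is zero. Values above 1 indicate an excess of
    replacement polymorphism relative to divergence.
    """
    if poly_syn == 0 or fixed_rep == 0 or fixed_syn == 0:
        index = None
    else:
        index = (poly_rep / poly_syn) / (fixed_rep / fixed_syn)
    return index, fisher_exact_two_sided(poly_rep, poly_syn, fixed_rep, fixed_syn)


# Hypothetical counts, for illustration only.
index, p_value = mk_test(poly_rep=12, poly_syn=8, fixed_rep=5, fixed_syn=30)
print(index, p_value)
```

A neutrality index well above 1 together with a small p-value corresponds to the pattern described above, in which replacement changes are overrepresented among polymorphic sites.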
The excesses of singleton sites were due to strain-specific variants, which generated a starlike phylogeny (fig. 2). The lengths of the internal branches were short for both PgiC loci of A. halleri ssp. gemmifera. Internal branches within each locus had low bootstrap probabilities. A starlike phylogeny is typically observed after a selective sweep or a population bottleneck. In either case, the starlike phylogeny may have been caused by population expansion (Kaplan, Hudson, and Langley 1989).
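Branch lengths in the NJ tree derive from pairwise Jukes and Cantor distances between coding sequences. The sketch below shows that distance for a single pair of sequences. It assumes an ungapped pairwise alignment, and the function name is hypothetical; MEGA2 handles gaps and missing data in ways not reproduced here.

```python
from math import log


def jukes_cantor_distance(seq1, seq2):
    """Jukes and Cantor (1969) distance between two aligned sequences.

    d = -(3/4) * ln(1 - (4/3) * p), where p is the proportion of
    sites that differ. Assumes equal-length, gap-free sequences.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    p = sum(a != b for a, b in zip(seq1, seq2)) / len(seq1)
    if p >= 0.75:
        raise ValueError("distance is undefined for p >= 0.75")
    return -0.75 * log(1 - 4 * p / 3)
```

For closely related alleles such as those compared within a species, p is small and the corrected distance is only slightly larger than p itself, so short internal branches reflect few shared differences among strains.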
Discussion |
---|
Normal Level of Polymorphism in the PgiC2 Locus
In the coding region of PgiC2, novel stop codons, frame shifts, and large indel variations were observed. These DNA variations may alter the mRNA or affect mature protein structures. The presence of these mutations supports the hypothesis that PgiC2 is a pseudogene. The high proportion of replacement polymorphic sites and nonconservative amino acid changes also suggest that PgiC2 is a pseudogene. The nucleotide diversity in the PgiC2 region was 0.0033, which was similar to those of previously analyzed regions of A. halleri ssp. gemmifera (Miyashita, Innan, and Terauchi 1996; Kawabe and Miyashita 2002a). In addition, the level of variation in replacement sites of PgiC2 was similar to those of other A. halleri ssp. gemmifera loci. This result was unexpected because PgiC2 is likely a pseudogene, and pseudogenes, being free from selective constraints, are expected to evolve neutrally.
Previous investigations of DNA polymorphisms in pseudogenes in D. melanogaster also did not reveal higher levels of DNA variation (Pritchard and Schaeffer 1997; Ramos-Onsins and Aguadé 1998). Background selection was suggested to explain the finding that the pseudogene of larval cuticle protein had very low polymorphism in D. melanogaster (Pritchard and Schaeffer 1997). The pseudogenes of the Cecropin multigene family had lower levels of replacement variation than synonymous variation (Ramos-Onsins and Aguadé 1998), and the Cecropin pseudogenes were suggested to be either active genes with some null alleles or young pseudogenes. In these two cases of pseudogenes in Drosophila, the levels of variation in the pseudogenes were lower than those of the active counterparts and noncoding regions. However, the level of variation in the PgiC2 locus of A. halleri ssp. gemmifera is similar to that of noncoding regions of other loci in A. halleri ssp. gemmifera (table 4).
In A. halleri ssp. gemmifera, levels of replacement variation were relatively high compared with those of synonymous variation in all genes investigated (table 4). In contrast, the level of variation was much lower in replacement sites than in synonymous sites in Drosophila (Moriyama and Powell 1996) and A. thaliana (Innan et al. 1996; Kawabe et al. 1997; Purugganan and Suddith 1998, 1999; Kawabe and Miyashita 1999; Kuittinen and Aguadé 2000; Aguadé 2001). One possible explanation for this finding is that the effective population size of A. halleri ssp. gemmifera is small. In a small population, replacement substitutions with slightly deleterious effects are not eliminated as readily as those in a large population (Ohta 1973, 1992).
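The dependence on population size can be made concrete with Kimura's diffusion approximation for the fixation probability of a new mutation. This formula is brought in here for illustration and is not part of the analyses in the study. The sketch assumes a diploid population whose effective size equals its census size, additive selection with coefficient s, and an initial frequency of 1/(2N); the values of s and N are arbitrary.

```python
from math import expm1


def fixation_probability(s, n):
    """Fixation probability of a new mutation (Kimura's approximation).

    u = (1 - exp(-2s)) / (1 - exp(-4Ns)); for s == 0 this reduces to
    the neutral value 1 / (2N). Very large values of -4Ns (above about
    700) overflow the float range and are outside this sketch's scope.
    """
    if s == 0:
        return 1 / (2 * n)
    # expm1(x) = exp(x) - 1, so the ratio below equals the formula above.
    return expm1(-2 * s) / expm1(-4 * n * s)


def relative_to_neutral(s, n):
    """Fixation probability divided by the neutral expectation 1/(2N)."""
    return fixation_probability(s, n) * 2 * n


# Arbitrary illustrative values: a slightly deleterious mutation.
for size in (100, 10_000):
    print(size, relative_to_neutral(-0.001, size))
```

Under these assumptions the same slightly deleterious mutation fixes at a rate close to the neutral rate when N is 100, but almost never when N is 10,000, which is the behavior the nearly neutral explanation relies on.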
Nonneutral Patterns of DNA Polymorphism in the PgiC2 Region
The PgiC2 region showed significant deviation from neutrality in the tests of MK, Tajima, and Fu and Li. The high significance in the MK test is consistent with PgiC2 being a pseudogene. In general, a high proportion of replacement polymorphism is expected in pseudogenes. In PgiC2, a high proportion of nonconservative amino acid changes was also observed. These observations suggest that PgiC2 was released from selective constraint after silencing. However, the significant results from Tajima's and Fu and Li's tests cannot be explained by silencing of PgiC2. Considering that a pseudogene is under selectively neutral conditions, polymorphisms in PgiC2 may have been influenced directly by population structure and/or species history.
In A. halleri ssp. gemmifera, DNA polymorphisms in the Adh (Miyashita, Innan, and Terauchi 1996; Miyashita 2001) and ChiA (Kawabe and Miyashita 2002a) regions were analyzed previously. An excess of singletons and a high proportion of replacement polymorphic sites were observed in all A. halleri ssp. gemmifera genes examined. An excess of low-frequency alleles could be caused by strong purifying selection (Tajima 1989a), a recent population bottleneck (Tajima 1989b), small population size (Tajima 1989a), and/or a hitchhiking effect (Braverman et al. 1995). The significant results from Tajima's test and Fu and Li's test for silent sites and the high level of DNA polymorphism in replacement sites suggest that strong purifying selection is not the cause for these four loci in A. halleri ssp. gemmifera. If the hitchhiking effect caused the excess of singletons in A. halleri ssp. gemmifera genes, we would have to assume that advantageous mutations occur frequently throughout the genome of A. halleri ssp. gemmifera because Adh, ChiA, and PgiC encode proteins with different functions and may be located in different chromosomal regions. Although the locations of these genes in the genome were not examined in A. halleri ssp. gemmifera, Adh, ChiA, and PgiC are single-copy genes located on chromosomes 1L, 5S, and 5L, respectively, of A. thaliana.
A recent bottleneck and/or small population size should be considered in the case of A. halleri ssp. gemmifera. If the mutation rates are not significantly different between A. thaliana and A. halleri ssp. gemmifera, the rather low level of DNA variation in A. halleri ssp. gemmifera (table 4) suggests that the population size of this species is small. Small populations are responsible for excesses of less frequent alleles. Similar to the two PgiC loci, starlike phylogenies were also observed for Adh and ChiA (data not shown). A starlike phylogeny is not caused by occurrence of recombination or recurrent mutations, but by strain-specific variations. This suggests the occurrence of a recent bottleneck in A. halleri ssp. gemmifera. Additionally, the extensive subpopulation structure of A. halleri ssp. gemmifera may yield an excess of singletons and a high proportion of replacement polymorphic sites. Under a subpopulation structure with a small population size, subpopulation-specific DNA variants would be fixed frequently. The high proportion of replacement polymorphic sites in A. halleri ssp. gemmifera might be explained by a subpopulation structure with a small population size of A. halleri ssp. gemmifera. In this case, replacement mutations with slightly deleterious effects would have been fixed frequently in each subpopulation. The disequilibrium population structure should have influenced DNA polymorphisms across the A. halleri ssp. gemmifera genome. Thus, the similar DNA polymorphism patterns among A. halleri ssp. gemmifera genes may be a consequence of the population structure of A. halleri ssp. gemmifera.
Small population size with low migration could explain the present results. However, such conditions are expected mainly in highly clonal or selfing species, whereas A. halleri ssp. gemmifera is an outcrossing species. To determine whether the population structure of A. halleri ssp. gemmifera is highly fragmented, it will be necessary to analyze genome-wide variation within and between populations, as has been done for A. thaliana (Abbott and Gomes 1989; Todokoro, Terauchi, and Kawano 1995; Bergelson et al. 1998).
Acknowledgements |
---|
Footnotes |
---|
Literature Cited |
---|
Abbott, R. J., and M. F. Gomes. 1989. Population genetic structure and outcrossing rate of Arabidopsis thaliana (L.) Heynh. Heredity 62:411-418.
Aguadé, M. 2001. Nucleotide sequence variation at two genes of the phenylpropanoid pathway, the FAH1 and F3H genes, in Arabidopsis thaliana. Mol. Biol. Evol. 18:1-9.
Bergelson, J., E. Stahl, S. Dudek, and M. Kreitman. 1998. Genetic variation within and among populations of Arabidopsis thaliana. Genetics 148:1311-1323.
Braverman, J. M., R. R. Hudson, N. L. Kaplan, C. H. Langley, and W. Stephan. 1995. The hitchhiking effect on the site frequency spectrum of DNA polymorphisms. Genetics 140:783-796.
Ford, V. S., B. R. Thomas, and L. D. Gottlieb. 1995. The same duplication accounts for PgiC genes in Clarkia xantiana and C. lewisii (Onagraceae). Syst. Bot. 20:147-160.
Fu, Y. X., and W.-H. Li. 1993. Statistical tests of neutrality of mutation. Genetics 133:693-709.
Gojobori, T., W.-H. Li, and D. Graur. 1982. Patterns of nucleotide substitution in pseudogenes and functional genes. J. Mol. Evol. 18:360-369.
Innan, H., F. Tajima, R. Terauchi, and N. T. Miyashita. 1996. Intragenic recombination in the Adh locus of the wild plant Arabidopsis thaliana. Genetics 143:1761-1770.
Jukes, T. H., and C. R. Cantor. 1969. Evolution of protein molecules. Pp. 21-132 in H. N. Munro, ed. Mammalian protein metabolism. Academic Press, New York.
Kaplan, N. L., R. R. Hudson, and C. H. Langley. 1989. The "hitchhiking effect" revisited. Genetics 123:887-899.
Kawabe, A., H. Innan, R. Terauchi, and N. T. Miyashita. 1997. Nucleotide polymorphism in the acidic chitinase locus (ChiA) region of the wild plant Arabidopsis thaliana. Mol. Biol. Evol. 14:1303-1315.
Kawabe, A., and N. T. Miyashita. 1999. DNA variation in the basic chitinase locus (ChiB) region of the wild plant Arabidopsis thaliana. Genetics 153:1445-1453.
Kawabe, A., and N. T. Miyashita. 2002a. DNA variation in the acidic chitinase locus (ChiA) region in Arabis gemmifera and among its related species. Genes Genet. Syst. 77:167-175.
Kawabe, A., and N. T. Miyashita. 2002b. Characterization of duplicated two cytosolic phosphoglucose isomerase (PgiC) loci in Arabidopsis halleri ssp. gemmifera. Genes Genet. Syst. 77:159-165.
Kawabe, A., N. T. Miyashita, and R. Terauchi. 1997. Phylogenetic relationship among the section Stenophora in the genus Dioscorea based on the analysis of nucleotide sequence variation in the phosphoglucose isomerase (Pgi) locus. Genes Genet. Syst. 72:253-262.
Kawabe, A., K. Yamane, and N. T. Miyashita. 2000. DNA polymorphism at the cytosolic phosphoglucose isomerase (PgiC) locus of the wild plant Arabidopsis thaliana. Genetics 156:1339-1347.
Koch, M. A., B. Haubold, and T. Mitchell-Olds. 2000. Comparative evolutionary analysis of chalcone synthase and alcohol dehydrogenase loci in Arabidopsis, Arabis, and related genera (Brassicaceae). Mol. Biol. Evol. 17:1483-1498.
Kuittinen, H., and M. Aguadé. 2000. Nucleotide variation at the chalcone isomerase locus in Arabidopsis thaliana. Genetics 155:863-872.
Kumar, S., K. Tamura, I. Jakobsen, and M. Nei. 2000. MEGA2: molecular evolutionary genetics analysis. Version 2.0. Pennsylvania and Arizona State Universities, University Park, Pennsylvania and Tempe, Arizona.
Li, W.-H. 1981. Pseudogenes as a paradigm of neutral evolution. Nature 292:237-239.
Liu, F., D. Charlesworth, and M. Kreitman. 1999. The effect of mating system differences on nucleotide diversity at the phosphoglucose isomerase locus in the plant genus Leavenworthia. Genetics 151:343-357.
McDonald, J. H., and M. Kreitman. 1991. Adaptive protein evolution at the Adh locus in Drosophila. Nature 351:652-654.
Miyashita, N. T. 2001. DNA variation in the 5' upstream region of the Adh locus of the wild plant Arabidopsis thaliana and Arabis gemmifera. Mol. Biol. Evol. 18:164-171.
Miyashita, N. T., H. Innan, and R. Terauchi. 1996. Intra- and interspecific variation in the alcohol dehydrogenase locus region of wild plants Arabis gemmifera and Arabidopsis thaliana. Mol. Biol. Evol. 13:433-436.
Miyashita, N. T., A. Kawabe, H. Innan, and R. Terauchi. 1998. Intra- and interspecific DNA variation and codon bias of the alcohol dehydrogenase (Adh) locus in Arabis and Arabidopsis species. Mol. Biol. Evol. 15:1420-1429.
Miyata, T., and H. Hayashida. 1981. Extraordinary high evolutionary rate of pseudogenes: evidence for the presence of selective pressure against changes between synonymous codons. Proc. Natl. Acad. Sci. USA 78:5739-5743.
Miyata, T., S. Miyazawa, and T. Yasunaga. 1979. Two types of amino acid substitutions in protein evolution. J. Mol. Evol. 12:219-236.
Moriyama, E. N., and J. R. Powell. 1996. Intraspecific nuclear DNA variation in Drosophila. Mol. Biol. Evol. 13:261-277.
Nei, M., and W.-H. Li. 1979. Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc. Natl. Acad. Sci. USA 76:5269-5273.
Ohta, T. 1973. Slightly deleterious mutant substitution in evolution. Nature 246:96-98.
Ohta, T. 1992. The nearly neutral theory of molecular evolution. Annu. Rev. Ecol. Syst. 23:263-286.
O'Kane, S. L., and I. A. Al-Shehbaz. 1997. A synopsis of Arabidopsis (Brassicaceae). Novon 7:323-327.
Pritchard, J. K., and S. W. Schaeffer. 1997. Polymorphism and divergence at a Drosophila pseudogene locus. Genetics 147:199-208.
Purugganan, M. D., and J. J. Suddith. 1998. Molecular population genetics of the Arabidopsis cauliflower regulatory gene: nonneutral evolution and naturally occurring variation in floral homeotic function. Proc. Natl. Acad. Sci. USA 95:8130-8134.
Purugganan, M. D., and J. J. Suddith. 1999. Molecular population genetics of floral homeotic loci: departure from equilibrium neutral model at the apetala3 and pistillata genes of Arabidopsis thaliana. Genetics 151:839-848.
Ramos-Onsins, S., and M. Aguadé. 1998. Molecular evolution of Cecropin multigene family in Drosophila: functional genes vs. pseudogenes. Genetics 150:157-171.
Rozas, J., and R. Rozas. 1999. DnaSP version 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15:174-175.
Saitou, N., and M. Nei. 1987. The Neighbor-Joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.
Tajima, F. 1989a. Statistical test for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585-595.
Tajima, F. 1989b. The effect of change in population size on DNA polymorphism. Genetics 123:597-601.
Tajima, F., and M. Nei. 1984. Estimation of evolutionary distance between nucleotide sequences. Mol. Biol. Evol. 1:269-285.
Terauchi, R., T. Terachi, and N. T. Miyashita. 1997. DNA polymorphism at the Pgi locus of a wild yam, Dioscorea tokoro. Genetics 147:1899-1914.
Thomas, B. R., V. S. Ford, E. Pichersky, and L. D. Gottlieb. 1993. Molecular characterization of duplicate cytosolic phosphoglucose isomerase gene in Clarkia and comparison to the single gene in Arabidopsis. Genetics 135:895-905.
Todokoro, S., R. Terauchi, and S. Kawano. 1995. Microsatellite polymorphisms in natural populations of Arabidopsis thaliana in Japan. Jpn. J. Genet. 70:543-554.
Watterson, G. A. 1975. On the number of segregating sites in genetical models without recombination. Theor. Popul. Biol. 7:256-276.