Laboratorio de Evolución, Facultad de Ciencias, Montevideo, Uruguay
Abstract
Introduction
The lack of any apparent function and the presumed absence of selective pressures (Li, Gojobori, and Nei 1981; Kimura 1983; Li 1997) make retropseudogenes an interesting component of the genome for detailed evolutionary studies. Because all mutations occurring in a pseudogene have the same chance of being fixed in a population, pseudogenes are ideal for studying patterns of spontaneous mutation and the resulting biases in nucleotide composition under the assumption of neutrality (Li, Wu, and Luo 1984).
Relatively few studies have taken advantage of pseudogenes to examine patterns of substitution (and evolution or phylogenetic reconstruction) in different kinds of organisms. However, Gojobori, Li, and Graur (1982), Li, Wu, and Luo (1984), and Bulmer (1986) have documented biases in nucleotide substitutions, with an excess of changes from C to T and from G to A. Therefore, it is expected that pseudogenes will eventually become A-T-rich sequences. These trends can be interpreted as mutation biases, and we can assume in general that the deviations from the neutral evolutionary pattern are not significant (Li, Wu, and Luo 1984; Li 1997).
The present study focuses on retropseudogenes of the Aldolase A gene. Aldolase genes constitute an example of a small, dispersed multigene family. They encode glycolytic enzymes associated with basal metabolism and are highly conserved in evolution. They are phylogenetically widespread, occurring in organisms as diverse as protozoans and humans (Hori et al. 1987; Kukita et al. 1988; Marsh and Lebherz 1992). In vertebrates, the gene family consists of three functional loci, A, B, and C, each apparently represented by a single copy (Sakakibara, Mukai, and Hori 1985; Tsutsumi et al. 1985; Kukita et al. 1988).
Processed pseudogenes derived from Aldolase A appear to be ubiquitous in mammals. They have previously been documented in humans (Keiichiro et al. 1986) and rabbits (Amsden, Penhoet, and Tolan 1992), and recently, we amplified them in a variety of rodents, including Rattus, North American pocket gophers (Thomomys), South American tuco-tucos (Ctenomys), and long-nosed mice (Oxymycterus) (data not shown). To examine the evolutionary details of Aldolase A retropseudogenes, we identified a variety of these sequences in several species of the rodent genus Mus. Studies of pseudogenes have often relied on comparisons of sequences obtained across divergent taxa (e.g., humans and mice), resulting in questions about the orthologous nature of the sequences and problems related to substitutional saturation. The fairly well established phylogeny of Mus and time-since-divergence estimates for several species (Bonhomme 1986; Boursot et al. 1993; Lundrigan and Tucker 1994; Silver 1995) provide an opportunity for detailed studies of retropseudogene emergence, divergence, and overall rates and patterns of evolution.
Materials and Methods
DNA Extractions and PCR Amplifications
Total genomic DNA extractions were performed using SDS-proteinase K-sodium chloride and alcohol precipitation according to Miller, Dikes, and Polesky (1988). Primers ALD6 (5'-GTGATCCTCTTCCACGAGACACT-3') and ALD11 (5'-ACCGTTGCCATGGCAATCTCCTC-3') were designed to match highly conserved sites in exons 3 and 7 of mouse Aldolase A (fig. 1). Amplifications were performed in a Thermolyne DB 66920-26 thermocycler. Fragments of approximately 530 bp, corresponding to the expected length for processed sequences (fig. 1), were obtained after 30 cycles of denaturation at 95°C for 1 min, annealing at 58°C for 1 min, and extension at 72°C for 1 min, followed by a final extension of 2 min at 72°C. Recombinant Taq polymerase from either Promega or Gibco-BRL was used. Fragments were electrophoresed on 1% agarose gels and visualized on a UV transilluminator after ethidium bromide staining (0.5 µg/ml). The same amplification procedure was used for denaturing gradient gel electrophoresis (DGGE) analysis, except that primer ALD6+GC was substituted for primer ALD6 (Myers et al. 1986; Myers, Maniatis, and Lerman 1987; Myers, Sheffield, and Cox 1989). The ALD6+GC primer sequence was 5'-CGCCCGCCGCGCCCCGCGCCCGGCCCGCCGCCCCCGCCCCGTGATCCTCTTCCACGAGACACT-3'. The two primers differ only in that ALD6+GC carries an additional 40-bp GC clamp at the 5' end of the original primer sequence (Sheffield et al. 1989).
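The relationship between the two primers can be checked directly from the printed sequences. The sketch below is only an illustration, not part of the original protocol: the helper names `gc_fraction` and `split_clamp` are hypothetical, and the sequences are copied from the text above.

```python
# Primer sequences copied from the Materials and Methods text.
ALD6 = "GTGATCCTCTTCCACGAGACACT"
ALD6_GC = "CGCCCGCCGCGCCCCGCGCCCGGCCCGCCGCCCCCGCCCCGTGATCCTCTTCCACGAGACACT"


def gc_fraction(seq):
    """Return the fraction of G or C bases in a DNA string."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def split_clamp(clamped, primer):
    """Return the 5' extension of `clamped` that precedes `primer`."""
    if not clamped.endswith(primer):
        raise ValueError("clamped primer does not end with the base primer")
    return clamped[: len(clamped) - len(primer)]


clamp = split_clamp(ALD6_GC, ALD6)
print(len(clamp), gc_fraction(clamp))
```

For the sequences as printed, the 5' extension is 40 nucleotides long and consists entirely of G and C, in agreement with the description of the clamp.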
Cloning
Two primers (ALD 6up18 [5'-GAGAUCUCUGTGATCCTCTTCCACGAGACACT-3'] and ALD 11up18 [5'-ACGCGUACUAGUACCGTTGCCATGGCAATCTCCTC-3']) were modified and used for PCR of products for cloning. Amplified PCR products were excised from 1.8% agarose gels, digested overnight with Agarase I (Sigma), and cloned using the Cloning Amp TM System (Gibco-BRL). Eight to thirteen clones from each of seven selected individuals (table 1) were cycle sequenced using a 2400 Perkin Elmer thermocycler, PRISM reaction mix (Perkin Elmer), and an Applied Biosystems 377 automated sequencer. Each clone was sequenced in both directions with "Forward" and "Reverse" M13-pUC18 primers from Gibco-BRL and the ALD11 primer.
Side-by-side comparisons of reamplified cloned products and the original amplicons from genomic DNA, carried out using parallel DGGE, confirmed the correspondence of the vast majority of the sequences with DGGE bands. Sequences that did not match the original bands in DGGE were discarded as likely cloning or PCR artifacts, as were redundant sequences.
Alignment and Phylogenetic Analysis
Alignment was performed with CLUSTAL W (Higgins, Bleasby, and Fuchs 1997) and optimized by visual inspection. Corresponding exon sequences of the functional Aldolase A genes of rat and mouse (GenBank accession numbers Y00516 and M12919, respectively) were included in all analyses. Phylogenies were obtained using equally weighted parsimony analysis in PAUP, version 4.0b2 (Swofford 1999). Distance analyses were performed using Kimura two-parameter distances and the neighbor-joining algorithm as implemented in PAUP and MEGA, version 1.01 (Kumar, Tamura, and Nei 1993). Support for nodes was assessed with 1,000 bootstrap replicates (in both distance and parsimony analyses).
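For readers unfamiliar with the Kimura two-parameter distance, the following sketch shows how it is computed for one pair of aligned sequences. It is not the PAUP or MEGA implementation. The function name `k2p_distance` is hypothetical, and skipping gapped or ambiguous sites (pairwise deletion) is an assumption of this example. With P and Q the proportions of sites showing transitions and transversions, the distance is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)].

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: ignore gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    # math.log / math.sqrt raise ValueError if the sequences are too
    # divergent for the correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

The correction grows with divergence, which is why highly divergent pairs are prone to the saturation problems mentioned in the Introduction.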
Substitution biases were assessed as follows. First, the topology of the neighbor-joining tree, excluding the functional genes, was input into MacClade, version 3.07 (Maddison and Maddison 1997). The program estimated the average number of substitutions of each type that must have taken place along the branches of the tree using the parsimony criterion. To obtain relative rates of substitutions, these values were divided by the frequency of each nucleotide. Nucleotide frequencies were estimated for all data, as well as exclusively for the mouse and rat functional genes, and the results were essentially identical. Finally, a separate analysis was conducted using the same methods but excluding CG dinucleotides to eliminate the possible effects of these dinucleotides on C-to-T changes (Gojobori, Li, and Graur 1982; Li, Wu, and Luo 1984; see Results and Discussion).
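The normalization step described above can be sketched as follows. This is an illustration under stated assumptions, not the MacClade procedure: the function name `relative_rates` is hypothetical, each count is divided by the frequency of the originating nucleotide, and the final rescaling so that the rates sum to 1 is an added convenience. The counts and frequencies in the usage line are made-up values.

```python
def relative_rates(counts, base_freqs):
    """Convert substitution counts into relative rates.

    counts     -- {(from_base, to_base): reconstructed number of changes}
    base_freqs -- {base: frequency of that base in the sequences}
    """
    # Divide each count by the frequency of the base being replaced, so
    # that common bases do not appear more mutable than rare ones.
    raw = {pair: n / base_freqs[pair[0]] for pair, n in counts.items()}
    total = sum(raw.values())
    if total == 0:
        raise ValueError("no substitutions to normalize")
    # Assumption of this sketch: rescale so the relative rates sum to 1.
    return {pair: value / total for pair, value in raw.items()}


# Hypothetical values, for illustration only.
example = relative_rates(
    {("C", "T"): 10, ("A", "G"): 10},
    {"A": 0.4, "C": 0.2, "G": 0.2, "T": 0.2},
)
print(example)
```

In this toy case the two changes are equally frequent in raw counts, but C-to-T receives the higher relative rate because C is the rarer base.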
Evolutionary rates were estimated by two methods. First, pairwise distances (Kimura two-parameter model) between sequences were estimated and plotted against estimated times of divergence of the species taken from Boursot et al. (1993) and Silver (1995). This was done independently for each of the major clades identified in the phylogeny. Intraspecific comparisons within M. musculus domesticus were assigned a maximum age of 0.5 Myr, as suggested by Silver (1995). A second method entailed the computation of maximum-likelihood estimates of branch lengths (substitutions along branches) of the neighbor-joining topology as implemented in PAUP 4.0b2 (Swofford 1999). Unlike pairwise comparisons, maximum-likelihood estimates are deemed independent of each other. However, they demand more assumptions about the nature of evolutionary changes and about the correctness of the tree. We also conducted a likelihood ratio test of the molecular clock for all pseudogenes by comparing a model that assumes a single underlying rate of evolution with one that releases that constraint and allows variation in rates among branches (Felsenstein 1995b).
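The first method amounts to reading a rate off the slope of distance against divergence time. A minimal sketch follows, assuming a least-squares line forced through the origin and the usual relation d = 2rt (two lineages accumulate changes since their split). The text does not say how the curves were fitted, so both choices are assumptions, as are the function name `rate_from_pairs` and the example numbers.

```python
def rate_from_pairs(times_myr, distances):
    """Estimate a per-lineage substitution rate (per site per Myr).

    Fits distance = slope * time through the origin by least squares,
    then halves the slope because each pairwise distance sums the
    changes along two diverging lineages.
    """
    if len(times_myr) != len(distances) or not times_myr:
        raise ValueError("need equally long, non-empty inputs")
    sum_tt = sum(t * t for t in times_myr)
    if sum_tt == 0:
        raise ValueError("all divergence times are zero")
    slope = sum(t * d for t, d in zip(times_myr, distances)) / sum_tt
    return slope / 2.0


# Hypothetical divergence times (Myr) and K2P distances.
rate = rate_from_pairs([0.5, 1.0, 3.0], [0.01, 0.02, 0.06])
print(rate)
```

Because pairwise comparisons share branches, the points in such a fit are not independent, which is the limitation that motivates the second, maximum-likelihood method.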
Results and Discussion
Computational (Lerman and Silverstein 1985) and empirical studies of DNA fragments differing by single-base-pair substitutions, in correspondence with melting theory, have been carried out (Fixman and Freire 1977; Fisher and Lerman 1983). Changes in migration during DGGE are predictable for single mutations but not for multiple substitutions, as in our case. Thus, bands with different migratory behaviors must differ in sequence, but the converse is not necessarily true for a complex set of amplicons: fragments that differ in sequence may still comigrate.
The number and pattern of observed bands suggest the presence of more than one locus (of processed fragments) in all cases. For example, at least nine distinct bands are observed in M. pahari (fig. 2c), which must therefore have at least five distinct loci, given that a diploid individual carries at most two alleles per locus. Even among individuals of the same population, it was possible to detect different bands, suggesting intrapopulational, possibly allelic, variation. In sum, DGGE analysis suggests a minimum of three or possibly four loci.
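The step from band counts to a minimum number of loci is simple arithmetic, sketched below. It assumes that each band corresponds to one distinct allele and that a diploid individual contributes at most two alleles per locus. Heteroduplex bands or comigrating sequences would violate these assumptions, and the function name `min_loci` is hypothetical.

```python
import math


def min_loci(n_bands, ploidy=2):
    """Smallest number of loci that can account for n_bands distinct
    bands in one individual, with at most `ploidy` alleles per locus."""
    if n_bands < 0 or ploidy < 1:
        raise ValueError("n_bands must be >= 0 and ploidy >= 1")
    return math.ceil(n_bands / ploidy)


print(min_loci(9))
```

With nine bands in a diploid individual, this gives the minimum of five loci stated for M. pahari.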
Phylogenetic Analysis of Sequence Variation
To corroborate and further examine the rich diversity of retropseudogenes suggested by DGGE, a subset of PCR products from seven individuals spanning the range of taxonomic diversity (table 1) was subjected to multiple cloning and sequencing (see Materials and Methods) to produce a total of 43 distinct sequences.
Variation among processed amplicons ranged from single substitutions to 30% uncorrected sequence divergence, with typical values of about 10%. The presence of multiple insertions and deletions confirmed the pseudogene status of these sequences.
Processed pseudogenes may originate by reverse transcription from mature mRNA and subsequent insertion in the genome. Once such a processed pseudogene has been formed, additional copies may be generated by subsequent duplications. Each of these two pseudogene-generating mechanisms, reverse transcription and subsequent duplication, leaves a characteristic phylogenetic pattern (fig. 3). A phylogeny consisting solely of independently generated retrocopies from a single individual should be starlike (Felsenstein 1995a); i.e., it should show no demonstrable cladistic relationships among loci other than through their historical connections with the functional gene. In contrast, pseudogenes generated by duplication are no different from other gene families in that paralogous loci are related to each other by a dichotomous gene tree that reflects the sequence of duplications (fig. 3).
The neighbor-joining tree (fig. 4) combines the two patterns outlined above (fig. 3) and therefore suggests that several loci originated by independent insertion of reverse transcripts and that in some cases such inserts underwent subsequent duplications. Thus, there are four strongly supported clusters (A-D in fig. 4) of pseudogene sequences that connect to the base of the phylogeny with the mouse and rat functional genes. These four clusters likely represent independently generated retrocopies. However, there is also evidence for the presence of more than one locus in some of these clusters, suggesting that additional loci were generated by pseudogene duplication. These two processes, independent retrotransposition and subsequent duplication, are illustrated by the following features of the phylogeny:
Estimates of nucleotide substitutions reconstructed with MacClade (Maddison and Maddison 1997) on the phylogeny (fig. 4) show clear biases in favor of transitions over transversions. Among the former, C-to-T changes are the most frequent. Indeed, estimation of nucleotide substitutions excluding CG dinucleotides essentially eliminates the excess of C-to-T over A-to-G transitions (fig. 6). More generally, the patterns of nucleotide substitutions in aldolase pseudogenes compare well with those previously estimated by O'hUigin and Li (1992). A comparison of the average nucleotide composition in our collection of pseudogenes with the corresponding regions of the functional Aldolase A gene of Mus shows no significant difference (χ² = 1.332, P > 0.72). It seems that there has not been enough time for the observed substitution biases to result in significant changes in nucleotide frequencies toward A-T.
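The reported probability can be reproduced from the test statistic if one assumes 3 degrees of freedom (four nucleotide classes minus one). The text does not state the degrees of freedom, so that value is an assumption of this sketch, as is the function name `chi2_sf_3df`. For 3 degrees of freedom the upper-tail probability of the chi-square distribution has a closed form, so only the standard library is needed.

```python
import math


def chi2_sf_3df(x):
    """Upper-tail probability P(X >= x) for a chi-square variable
    with exactly 3 degrees of freedom (closed-form expression)."""
    if x < 0:
        raise ValueError("chi-square statistic must be non-negative")
    half = x / 2.0
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 * x / math.pi) * math.exp(-half)


p_value = chi2_sf_3df(1.332)
print(round(p_value, 3))
```

Under that assumption, a statistic of 1.332 corresponds to a probability slightly above 0.72, consistent with the value given in the text.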
Molecular Clock and Rates of Evolution
Figure 7 plots pairwise Kimura two-parameter distances against estimated times of divergence between species of Mus (Boursot et al. 1993; Silver 1995). Values for each of the major clusters in the phylogeny fit different curves, suggesting rate variation among them. Furthermore, our slopes suggest rates of pseudogene evolution that are two- to fourfold larger than those estimated by O'hUigin and Li (1992).
A likelihood ratio test yields support for the latter alternative. We computed the best maximum-likelihood tree for the aldolase pseudogenes, excluding the functional genes, using a 2:1 rate of transitions over transversions and no enforcement of a molecular clock. The resulting topology and a similar transition : transversion bias were used in a second model that enforced a molecular clock. As suggested by Felsenstein (1995b), these two models may be statistically compared to produce a test of the molecular-clock hypothesis. In this case, the model without a clock is significantly better (-2 ln(L) = 2,095.23500, χ² = 51.61505, P < 0.001). This analysis does not eliminate the possible effects of different times of origin of the four major clusters. Nonetheless, it reinforces the idea of substantial rate heterogeneity among aldolase pseudogenes. It should come as no surprise that, in turn, these data show somewhat different rates than published estimates.
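The general form of such a likelihood ratio test can be sketched as follows. This is a generic illustration, not a reconstruction of the analysis reported here: the log-likelihoods in the usage line are hypothetical, the function name `clock_lrt` is an assumption, and the degrees of freedom follow the common convention of n - 2 for n sequences (an unrooted tree has 2n - 3 free branch lengths, a clock tree n - 1 node heights). The text does not report the degrees of freedom actually used.

```python
def clock_lrt(lnl_clock, lnl_no_clock, n_tips):
    """Return (statistic, df) for a likelihood ratio test of the clock.

    lnl_clock    -- log-likelihood with the molecular clock enforced
    lnl_no_clock -- log-likelihood with branch rates free to vary
    n_tips       -- number of sequences on the shared topology
    """
    if n_tips < 3:
        raise ValueError("need at least three sequences")
    if lnl_no_clock < lnl_clock:
        raise ValueError("the unconstrained model cannot fit worse "
                         "than the nested clock model")
    statistic = 2.0 * (lnl_no_clock - lnl_clock)
    df = n_tips - 2  # (2n - 3) branch lengths minus (n - 1) node heights
    return statistic, df


# Hypothetical log-likelihoods for a 10-sequence tree.
print(clock_lrt(-1010.0, -1000.0, 10))
```

The statistic is then compared with a chi-square distribution with the stated degrees of freedom; a large value favors the model without a clock.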
Conclusions
Our family of at least eight pseudogene loci, derived from a single functional gene, is one of the largest described thus far (Weiner 1986; Dhawan et al. 1998). We have found evidence of both independent retrotranscription and subsequent duplication being sources of pseudogene diversity. The underlying process responsible for the generation of pseudogenes deserves further analysis. One explanation for the abundance of pseudogenes may relate to a very active expression rate in germ line tissues, facilitating the generation of retrocopies (Lee et al. 1983; Lindsey and Wilkinson 1996a, 1996b, 1996c; Maiti et al. 1996). In some cases, however, positive selection may be responsible (Sutton and Wilkinson 1997).
A related, unresolved question is whether there is any association between the amount of retropseudosequences generated and the location of their parental functional genes within early- or late-replicating chromatin. Alternatively, there could be a relationship between the stability of the functional genes' mRNAs and this high number of retrocopies.
If the chance of fixation depends on the expression level in the germ line, two other predictions should hold: (1) other genes from families with high levels of expression in the germ line should have incorporated retrosequences; and (2) males, with continuous production of sperm, should have more suitable conditions in the germ line for the accumulation of retrocopies than females, whose fixed number of reproductive cells is arrested in meiotic prophase (Lee et al. 1983). It is just as important to examine what triggers the secondary duplication of retropseudogenes.
Because of their relatively rapid, and presumably neutral, evolution, processed pseudogenes are in principle ideal for phylogenetic analysis of species relationships. However, the diversity, multiple origins, and complex history of Aldolase A pseudogenes indicate that molecular phylogeneticists should take great care in distinguishing orthologous from paralogous sequences. Although we cannot be sure at this point, it is quite possible that families of related pseudogenes such as the one uncovered here are very common (Weiner 1986).
Supplementary Material
Acknowledgements
Footnotes
1 Keywords: Aldolase A, pseudogenes, molecular evolution, silent substitutions
2 Address for correspondence and reprints: María Noel Cortinas, Iguá 4225, Montevideo 11400, Uruguay. manoel{at}fcien.edu.uy.
References
Amsden A. B., E. E. Penhoet, D. R. Tolan, 1992 A rabbit Ald A pseudogene derived from a partially spliced primary Aldolase A transcript Gene 120:323-324
Bonhomme F., 1986 Evolutionary relationships in the genus Mus Curr. Top. Microbiol. Immunol 127:19-34
Boursot P., J. C. Auffray, J. Briton-Davidian, F. Bonhomme, 1993 The evolution of house mice Annu. Rev. Ecol. Syst 24:119-152
Bulmer M., 1986 Neighboring base effects on substitution rates in pseudogenes Mol. Biol. Evol 3:322-329
Coulondre C., J. H. Miller, P. J. Farabaugh, W. Gilbert, 1978 Molecular basis of base substitutions hotspots in Escherichia coli Nature 274:775-780
Dhawan P., E. Yang, A. Kumar, K. D. Metha, 1998 Genetic complexity of the human geranylgeranyltransferase I beta-subunit gene: a multigene family of pseudogenes derived from mis-spliced transcripts Gene 210:9-15
Felsenstein J., 1995a. Phylogenies and the comparative method American Naturalist 125:1-12
Felsenstein J., 1995b. PHYLIP (phylogeny inference package). Version 3.5c Distributed by the author, Department of Genetics, University of Washington, Seattle
Fisher S. G., L. S. Lerman, 1983 DNA fragments differing by single base-pair substitutions are separated in denaturing gradient gels: correspondence with melting theory Proc. Natl. Acad. Sci. USA 80:1579-1583
Fixman M., J. J. Freire, 1977 Theory of DNA melting curves Biopolymers 16:2693-2704
Gojobori T., W.-H. Li, D. Graur, 1982 Patterns of nucleotide substitutions in pseudogenes and functional genes J. Mol. Evol 18:360-369
Graur D., Y. Shuali, W.-H. Li, 1989 Deletions in processed pseudogenes accumulate faster in rodents than in humans J. Mol. Evol 28:279-285
Higgins D. G., A. J. Bleasby, R. Fuchs, 1997 CLUSTAL W 1.07: improved software for multiple sequencing alignment Comput. Appl. Biosci 8:189-191
Hori K., T. Mukai, K. Joh, Y. Arai, M. Sakakibara, H. Yatsuki, 1987 Structure and expression of human and rat aldolase isozyme genes: multiple mRNA species of Aldolase A produced from a single gene Pp. 153-175 in Isozymes: current topics in biological research. Vol. 14.
Keiichiro J., Y. Arai, T. Mukai, K. Hori, 1986 Expression of three mRNA species from a single Rat Aldolase A gene differing in their 5' non-coding regions J. Mol. Biol 190:401-410
Kimura M., 1983 The neutral theory of molecular evolution Cambridge University Press, Cambridge, England
Kukita A., T. Mukai, T. Miyata, K. Hori, 1988 The structure of brain-specific rat aldolase C mRNA and the evolution of aldolase isozyme genes Eur. J. Biochem 171:471-478
Kumar S., K. Tamura, M. Nei, 1993 MEGA: molecular evolutionary genetics analysis. Version 1.01 Pennsylvania State University, University Park
Lee M. G.-S., S. A. Lewis, C. D. Wilde, N. J. Cowan, 1983 Evolutionary history of a multigene family: an expressed human β-tubulin gene and three processed pseudogenes Cell 33:477-487
Lerman L. S., K. Silverstein, 1985 Computational simulation of DNA melting and its application to denaturing gradient gel electrophoresis Methods Enzymol 155:483-501
Lessa E. P., 1992 Rapid surveying of DNA sequence variation in natural populations Mol. Biol. Evol 9:323-330
Lessa E. P., 1993 Analysis of DNA sequence variation at populational level by polymerase chain reaction and denaturing gradient gel electrophoresis Methods Enzymol 224:419-428
Lessa E. P., G. Applebaum, 1993 Screening techniques for detecting allelic variation in DNA sequences Mol. Ecol 2:119-129
Li W.-H., 1997 Molecular evolution Sinauer, Sunderland, Mass
Li W.-H., T. Gojobori, M. Nei, 1981 Pseudogenes as a paradigm of neutral evolution Nature 292:237-239
Li W.-H., D. Graur, 1991 Fundamentals of molecular evolution Sinauer, Sunderland, Mass
Li W.-H., C.-I. Wu, C.-C. Luo, 1984 Nonrandomness of point mutations as reflected in nucleotide substitutions in pseudogenes and its evolutionary implications J. Mol. Evol 21:58-71
Li W.-H., C.-I. Wu, C.-C. Luo, 1985 Evolution of DNA sequences Pp. 1-94 in R. J. MacIntyre, ed. Molecular evolutionary genetics. Plenum Press, New York
Lindsey J. S., M. F. Wilkinson, 1996a. Pem: a testosterone- and LH-regulated homeobox gene expressed in extraembryonic tissues during early murine development Dev. Biol 166:170-179
Lindsey J. S., M. F. Wilkinson, 1996b. An androgen-regulated homeobox gene expressed in rat testis and epididymis Biol. Reprod 55:975-983
Lindsey J. S., M. F. Wilkinson, 1996c. Homeobox genes and male reproductive development J. Assist. Reprod. Genet 13:182-192
Lundrigan B. L., P. K. Tucker, 1994 Tracing paternal ancestry in mice, using the Y-linked, sex-determining locus, Sry Mol. Biol. Evol 11:483-492
Maddison W. P., D. R. Maddison, 1997 MacClade: analysis of phylogeny and character evolution. Version 3.07 Sinauer, Sunderland, Mass
Maiti S., J. Doscow, K. Sutton, R. P. Nhim, D. A. Lawlor, K. Levan, J. S. Lindsey, M. F. Wilkinson, 1996 The Pem homeobox gene: rapid evolution of the homeodomain, X chromosomal localization, and expression in reproductive tissue Genomics 34:304-316
Marsh J. J., H. G. Lebherz, 1992 Fructose-bisphosphate aldolases: an evolutionary history Trends Biochem. Sci 17:110-113
Miller S. A., D. D. Dikes, H. F. Polesky, 1988 A simple salting procedure for extracting DNA from human nucleated cells Nucleic Acids Res 16:215
Myers R. M., S. G. Fisher, T. Maniatis, L. S. Lerman, 1986 Modification of the melting properties of duplex DNA by attachment of a GC-rich DNA sequence as determined by denaturing gel electrophoresis Nucleic Acids Res 13:3111-3130
Myers R. M., T. Maniatis, L. S. Lerman, 1987 Detection and localization of single base changes by denaturing gradient gel electrophoresis Methods Enzymol 155:501-527
Myers R. M., V. C. Sheffield, D. R. Cox, 1989 Polymerase chain reaction and denaturing gradient gel electrophoresis Pp. 177-181 in H. A. Erlich, R. Gibbs, and H. H. Kazazian, eds. Polymerase chain reaction. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y
O'hUigin C., W.-H. Li, 1992 The molecular clock ticks regularly in muroid rodents and hamsters J. Mol. Evol 35:377-384
Razin A., A. D. Riggs, 1980 DNA methylation and gene function Science 210:604-610
Saitou N., S. Ueda, 1994 Evolutionary rates of insertion and deletion in noncoding nucleotide sequences of primates Mol. Biol. Evol 11:504-512
Sakakibara M., T. Mukai, K. Hori, 1985 Nucleotide sequence of a cDNA clone for human aldolase messenger RNA in the liver Biochem. Biophys. Res. Commun 131:413-420
Sanguinetti C. J., E. D. Neto, A. J. G. Simpson, 1994 Rapid silver staining and recovery of PCR products separated on polyacrylamide gels BioTechniques 17:915-918
Sheffield V. C., D. R. Cox, L. S. Lerman, R. Myers, 1989 Attachment of a 40-base-pair G+C-rich sequence (GC-clamp) to genomic DNA fragments by the polymerase chain reaction results in improved detection of single-base changes Proc. Natl. Acad. Sci. USA 86:232-236
Silver L. M., 1995 Mouse genetics: concepts and applications Oxford University Press, Oxford, England
Sutton K. A., M. F. Wilkinson, 1997 Rapid evolution of a homeodomain: evidence for positive selection J. Mol. Evol 45:579-588
Swofford D. L., 1999 PAUP*: phylogenetic analysis using parsimony (*and other methods). Version 4.0b2 Sinauer, Sunderland, Mass
Tsutsumi K.-I., T. Mukai, R. Tsutsumi, S. Hidaka, Y. Arai, K. Hori, K. Ishikawa, 1985 Structure and genomic organization of the rat aldolase B gene J. Mol. Biol 181:153-160
Weiner A. M., 1986 Nonviral retroposons: genes, pseudogenes, and transposable elements generated by the reverse flow of genetic information Annu. Rev. Biochem 55:631-661