* Institute of Experimental Pathology (ZMBE), University of Münster, Münster, Germany; and Institute for Systems Biology, Seattle
Correspondence: E-mail: jueschm{at}uni-muenster.de.
Abstract
Key Words: tRNA SINE armadillo retroposition SINE evolution
Introduction
The initial steps of retroposition require an RNA template. Reverse transcription is probably performed by a long interspersed element (LINE)-encoded endonuclease/reverse transcriptase, the only known eukaryotic source that performs reverse transcription in trans (Kajikawa and Okada 2002; Dewannieux, Esnault, and Heidmann 2003). Processed as well as unprocessed RNAs can serve as templates for reverse transcription (Schmitz et al. 2004). The genomic integration sites of the most abundant primate and rodent SINEs have been shown to exhibit a preference for TTAAAA motifs (Jurka 1997). In addition, Schmitz et al. (2004) showed that the 3' end of template RNAs can select appropriate genomic targets by base complementarity.
tRNA-derived SINEs are found in different combinations. Monomeric tRNA-SINEs are present, for example, in the primate infraorder Lorisiformes (Roos, Schmitz, and Zischler 2004), first discovered in Galago crassicaudatus (Galago monomer [Daniels and Deininger 1991]), in the dermopteran species Cynocephalus variegatus (CYN-I-SINE [Schmitz and Zischler 2003]), in rodents (ID-SINEs [Kass, Kim, and Deininger 1996]), and in the camelid species Vicugna vicugna (vic-1 [Lin et al. 2001]). Dimeric and trimeric tRNA-SINEs with identical subunits have been characterized in dermopterans (CYN-II, CYN-III [Schmitz and Zischler 2003], or t-SINE [Piskurek et al. 2003]). In many instances, tRNA-derived SINEs are fusion products of a tRNA-derived and a tRNA-unrelated subunit.
There are at least three conceivable mechanisms that generate combined tRNA-derived SINEs: (1) tRNA genes are frequently arranged in clusters and may, for instance, be cotranscribed as dicistrons. One example of a retroposed, probably unprocessed RNA Pol III transcript is the Twin SINE of the mosquito Culex pipiens. The unprocessed dicistronic transcript was retroposed and serves as a master locus for new SINEs (Feschotte et al. 2001). Subunits of the Twin SINEs are separated by a 39-nt spacer that is assumed to be present also at the locus of the founder tRNA genes. (2) Genomic integration of a reverse-transcribed monomeric element into the oligo(A) tail of a preexisting master SINE can lead to subsequent transcription of the new dimeric structure. Besides tRNA-derived SINEs, the most familiar example of dimerization is the fusion of 7SL-derived FLAM (free left Alu monomers) and FRAM (free right Alu monomers) to form Alu dimers (Quentin 1992). (3) Template switching during reverse transcription generates chimeric retronuons (Brosius 1999; Gilbert and Labuda 2000; Cost et al. 2002; Buzdin 2004).
Borodulina and Kramerov (2001) distinguished two structural variants of mammalian SINEs by comparing tRNA-related SINEs. Like all 7SL derivatives and, for example, the tRNA-derived SINEs of dermopterans (CYN-SINEs), class T SINEs show a more or less homogeneous 3' terminal oligo(A) segment. On the other hand, class T+ SINEs, like most other tRNA-related SINEs, present a more complex A-rich tail, including an AATAAATCTTT/(T)3A(n)-like motif. The conserved AATAAA motif has been suggested to act as a polyadenylation signal followed by an RNA Pol III termination signal (Borodulina and Kramerov 2001). Class T and class T+ SINEs are discussed as possible outcomes of different mechanisms of retroposition (Borodulina and Kramerov 2001). Based on our findings from surveying the sequences of the armadillo genome, we discuss the involvement of both mechanisms to account for the different DAS subfamilies.
Methods
Characterization of the Novel SINE Elements
We aligned all extracted candidates of DAS-SINEs by using the Mac OS X/Darwin version of the MAFFT multisequence alignment program (Katoh et al. 2002). To improve the alignment, we performed local realignments with XCED, the graphical user interface of MAFFT. This alignment was used to distinguish several subtypes of DAS-SINEs. All DAS-SINEs feature a tRNAAla-like sequence, in most cases preceded by a short GGGAA-like motif. A secondary structure model of this element was drawn according to secondary structure models compiled at the tRNAscan-SE Search Server (http://rna.wustl.edu/GtRDB/Hs/Hs-align.html) and mfold (http://www.bioinfo.rpi.edu/applications/mfold/old/rna/form1.cgi) (default options; Zuker 2003) and compared with the human tRNAAla structure, which shows the greatest similarity found so far. Phylogenetic analysis was performed with the maximum-likelihood algorithm implemented in Tree-Puzzle version 5.2 (Schmidt et al. 2002), using the HKY model of substitution (Hasegawa, Kishino, and Yano 1985) and 1,000 quartet-puzzling steps (QPS). All DAS-SINE locations are given as coordinates of GenBank entries (see Supplementary Material online).
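For readers unfamiliar with the HKY model named above, its instantaneous rate matrix can be written as follows. This is the standard textbook form, not a formula taken from this study; the symbols are assumptions of the sketch, with \pi_A, \pi_C, \pi_G, \pi_T the equilibrium base frequencies and \kappa the transition/transversion rate ratio. The pmatrix environment assumes the amsmath package.

```latex
% Rows and columns are ordered A, C, G, T.
% Each diagonal entry (\ast) is set so that its row sums to zero.
% Transitions (A<->G, C<->T) carry the extra factor \kappa.
Q \propto
\begin{pmatrix}
\ast          & \pi_C         & \kappa\,\pi_G & \pi_T         \\
\pi_A         & \ast          & \pi_G         & \kappa\,\pi_T \\
\kappa\,\pi_A & \pi_C         & \ast          & \pi_T         \\
\pi_A         & \kappa\,\pi_C & \pi_G         & \ast
\end{pmatrix}
```

Setting \kappa = 1 removes the distinction between transitions and transversions, leaving unequal base frequencies as the only source of rate differences.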
Results
Direct Repeats
A hallmark of retroposed elements is their flanking direct repeats (DRs). Perfect DRs are an indicator of recent retroposition events, and the existence of a retronuon subfamily with predominantly perfect DRs indicates recent activity of those elements. DAS-III3b retronuons exhibit the highest proportion of elements with DRs (90%) compared with DAS-II2b (50%) and DAS-Ia (44%) elements. This suggests that DAS-III elements are the youngest members of the DAS family (fig. 2A).
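The idea of scoring subfamilies by DR presence can be illustrated with a minimal sketch. It is not the procedure used in this study: the function names, the minimum repeat length of 8 nt, and the toy flanking sequences are all hypothetical, and a real screen would also require the repeat to lie directly adjacent to the element ends.

```python
def longest_shared_repeat(upstream: str, downstream: str, min_len: int = 8) -> str:
    """Return the longest perfect repeat (>= min_len) found in both flanks, or ''."""
    upstream = upstream.upper()
    downstream = downstream.upper()
    # Try the longest candidate lengths first, so the first hit is the longest repeat.
    for length in range(min(len(upstream), len(downstream)), min_len - 1, -1):
        for start in range(len(upstream) - length + 1):
            candidate = upstream[start:start + length]
            if candidate in downstream:
                return candidate
    return ""


def fraction_with_repeats(flank_pairs, min_len: int = 8) -> float:
    """Fraction of elements whose two flanks share a perfect repeat."""
    if not flank_pairs:
        return 0.0
    hits = sum(1 for up, down in flank_pairs
               if longest_shared_repeat(up, down, min_len))
    return hits / len(flank_pairs)


# Hypothetical toy flanks (not armadillo data): only the first pair shares a repeat.
toy_pairs = [
    ("CCGTAAGTTCAGGCTTAAC", "AAGTTCAGGCTTAACGGAT"),
    ("CCGTACGTTAGCAGTCAGT", "TTGACCATGGCAATCCGGA"),
]
print(longest_shared_repeat(*toy_pairs[0]))
print(fraction_with_repeats(toy_pairs))
```

Applied per subfamily, the second function yields the kind of percentage reported above. Tolerating mismatches would be needed to score older, imperfect DRs.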
Self-Integration
A clear indicator of the evolutionary order of transpositions is the integration of SINE family members into already established older family members. We could detect DAS-III members in DAS-III and DAS-II types. On the other hand, we were unable to detect any DAS-I or DAS-II elements integrated into DAS-III retronuons (see Supplementary Material online). This provides additional evidence that DAS-III members are the most recent and/or still active elements. In that respect, they resemble Alu Y elements of primates (Batzer et al. 1996) (fig. 2B).
Phylogenetic Analysis
To test the order of descent we established for the distinct DAS subfamilies, we performed a maximum-likelihood phylogenetic reconstruction of the tRNA-related parts (consensus sequences) of all DAS members. As an outgroup, we chose a human tRNAAla. Although the analyzed sequences are short, the major groups of DAS members are strongly supported by high QPS values. DAS-III2b and DAS-III3b cluster with 100% QPS support. All DAS-III members form one group with 100% QPS support. The next relative is DAS-IIb, supported by 68% QPS. However, the basal resolution of the phylogenetic tree is too low to show a clear cluster of DAS-Ia and DAS-IIa. Furthermore, the origin of the second tRNA part of DAS-IIa and DAS-IIb cannot be clearly resolved by phylogenetic reconstruction (fig. 3).
Discussion
The novel D. novemcinctus SINE elements are currently not found in GenBank entries from any other species. Their taxonomic distribution within the edentates remains to be investigated experimentally. In total, we found 687 DAS-SINEs within the estimated 0.22% of genomic sequence information available (BAC clones). Extrapolating from this sample, we assume a total of more than 300,000 copies per genome (fig. 2A). The DAS-SINE family is characterized by several clearly distinct subfamilies that probably originated from corresponding single or low-copy chromosomal source genes that "seeded" additional SINE source genes into different genomic locations. A high degree of similarity to a human tRNAAla gene and the distinct tRNA-like structural features (figs. 1 and 5) suggest tRNAAla as the initial source for DAS-SINEs. An additional common characteristic of DAS elements is a GGGAA-like motif at their 5' ends that presumably corresponds to the 5' tRNA flanking region of the original chromosomal locus. Most tRNA-derived SINEs start with similar G-rich, 3- to 7-nt-long, tRNA-unrelated 5' regions, possibly because a partially unprocessed tRNA was used as the retroposition template. An example is the rodent-specific BC1 RNA that probably arose by retroposition of a partially processed tRNAAla with a 5' GCGGCT leader sequence (Rozhdestvensky et al. 2001). Correspondingly, the BC1 leader sequence starts with a similar 5' GGGGTT motif. Mature tRNAs, without any leader sequence, may serve as primers in reverse transcription of certain retroviruses and long terminal repeat (LTR) retrotransposons (Mak and Kleiman 1997). They may also serve as templates for the generation of certain SINE elements (Ohshima et al. 1996). DAS-SINEs are clearly distinct in this respect because they derive from an unprocessed template.
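The copy-number figure follows from a simple extrapolation of the two numbers given above (687 elements in an estimated 0.22% of the genome). The sketch below reproduces that arithmetic; the function name is hypothetical, and the calculation assumes that DAS-SINEs are distributed evenly enough for the sampled BAC clones to be representative.

```python
def extrapolate_copy_number(observed_copies: int, sampled_percent: float) -> float:
    """Scale a count observed in a genome sample up to the whole genome.

    Assumes the sampled fraction is representative of the genome as a whole.
    """
    if not 0 < sampled_percent <= 100:
        raise ValueError("sampled_percent must be in the interval (0, 100]")
    return observed_copies / (sampled_percent / 100.0)


# Values from the text: 687 DAS-SINEs in an estimated 0.22% of the genome.
estimate = extrapolate_copy_number(687, 0.22)
print(f"Estimated copies per genome: about {estimate:,.0f}")
```

The result is roughly 312,000, consistent with the stated figure of more than 300,000 copies per genome.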
A possible sequence of events forming the distinct DAS subfamilies can be summarized as follows: A first wave of retroposition generated a monomeric subfamily of DAS-Ia SINEs with a short leader sequence, a characteristic homogeneous oligo(A) tail, and flanking DRs. The dimeric DAS-IIa retronuons probably arose by integration of a DAS-Ia element into the oligo(A) tail of a preexisting DAS-Ia master gene, followed by transcription and retroposition of the dimeric DAS-IIa elements, which feature two tRNA domains separated by an oligo(A) region.
Perhaps in an independent second wave, as indicated by parallel lines in figure 2B, a more complex dimeric group of DAS-SINEs arose, featuring additional heterogeneous A-rich motifs, approximately 120 nt long, distal to the tRNA parts. Presumably, the A-rich tail of a monomeric DAS master gene grew substantially in size, probably by extension during the retroposition process. This phenomenon has been described for Alu elements by Hagan, Sheffield, and Rudin (2003) and Dewannieux, Esnault, and Heidmann (2003) and is thought to be the result of reverse transcriptase slippage. During dimerization, the corresponding monomeric transcript retroposed into the enlarged 3' flank of a corresponding master gene to yield the dimeric structure of DAS-IIb. Analysis of DAS-SINE retropositions into preexisting DAS-SINE loci suggests that in 14 out of 16 cases, the integration took place in the A-rich spacer region between the two tRNA domains. This indicates a high acceptor affinity of those regions, probably a situation comparable to the formation of the first dimeric structures. However, we could not detect the typical TTAAAA target motif.
DAS-III chimeric elements simply emerged because of a deletion and insertion within a DAS-IIb-like master locus rather than by template switching. Hence, scenario two (see Introduction), in combination with insertions and deletions, is the most probable route to the dimeric and chimeric DAS subfamilies. In contrast to DAS-Ia and DAS-IIa, all DAS-IIb and DAS-III subfamilies terminate with a 3' ATAAATCTTT-like motif, the trait of class T+ elements (Borodulina and Kramerov 2001). This observation supports the hypothesis that the DAS-Ia/DAS-IIa subfamilies originated independently of the DAS-IIb/DAS-III subfamilies, even though both are derived from a similar RNA template.
The DAS-III3b subfamily is the youngest, with 90% of the members exhibiting distinct DRs, as well as the lowest degree of internal heterogeneity (see figure 5). Finally, DAS-III3 elements are found integrated in most other DAS members but not vice versa, indicating a clear pattern of appearance over time (see figure 2B). Most of the splitting points defined by key deletion or insertion events are congruent with the sequence phylogeny of the tRNA-related consensus part of all DAS-SINE subfamilies. Although the phylogenetic reconstruction was restricted to the tRNA-related parts, strong QPS support values could be derived for the DAS-IIb and DAS-III subfamilies. Sequence phylogeny could not, however, establish a close relationship between DAS-Ia and DAS-IIa or the affiliation of the second tRNA-related region of DAS-IIa and DAS-IIb (fig. 3). This underscores once more the superiority of insertion and deletion analysis for phylogenetic inquiry.
In the lineage leading to the class T+ DAS-SINEs, compensatory changes occurred at the RNA level (U-A → U-G → C-G [see figure 2]). These changes correspond to the D-arm of the tRNAAla and are unknown for any published tRNA. Consequently, the DAS-specific compensatory change seems to be under structural selective pressure to conserve a tRNA D-arm-like structure in the founder RNAs (see figure 1B). This indicates that the respective master genes may encode a functional RNA.
For 354 DAS-III SINEs with recognizable DRs, we reconstructed the nucleotide composition of the genomic target sites and derived a TTAAAAA consensus sequence (fig. 5B; boxed area). SINE retroposition depends on the transpositional machinery of autonomous elements, such as LINEs. For Alu-SINEs, L1-mediated reverse transcription and integration is the most apparent mechanism of retroposition (Jurka 1997). Both retronuons share a characteristic 3' end, oligo(A), that is responsible for their target-site preference. DAS-SINEs show the same target-site integration profile as Alu-SINEs and, correspondingly, have similar 3' ends. In contrast, L2 or L3 and their nonautonomous associates deviate substantially (Kapitonov, Pavlicek, and Jurka 2004). Presumably, L2 and L3 elements were active 200 to 300 MYA, long before the mammalian radiation (Kapitonov, Pavlicek, and Jurka 2004). To selectively detect LINE elements in armadillo whose activity coincides with that of DAS-SINEs, we extracted all LINE-related sequences flanked by perfect DRs (data not shown). Together with the aforementioned target-site preference, the detection of only L1 elements supports our notion that L1 elements mediated retroposition of DAS-SINE elements.
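Deriving a consensus from aligned target sites amounts to taking the most frequent base at each position. The following is a minimal sketch of that step, not the method used in this study; the function name is hypothetical and the five toy sites are invented for illustration, standing in for the 354 real target sites.

```python
from collections import Counter


def column_consensus(sites):
    """Return the most frequent base at each position of equal-length sites."""
    if not sites:
        raise ValueError("at least one target site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all target sites must have the same length")
    consensus = []
    # zip(*sites) walks through the alignment column by column.
    for column in zip(*sites):
        base, _count = Counter(b.upper() for b in column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Hypothetical, pre-aligned toy target sites (not data from the study).
toy_sites = ["TTAAAAA", "TTAAAGA", "CTAAAAA", "TTAAAAT", "TTGAAAA"]
print(column_consensus(toy_sites))
```

A simple majority rule hides how strong the preference is at each position; reporting per-column base frequencies would give a fuller picture of the integration profile.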
In conclusion, we could follow the evolution of a newly discovered SINE family in edentates, the first described in this order to date. The origin can clearly be traced back to a presumably incompletely processed tRNAAla yielding a monomeric SINE master gene, followed by the emergence of dimeric and chimeric forms. Compensatory changes point to a structural constraint on the tRNA D-arm-corresponding region of the DAS-SINE source gene. Other features correspond to canonical characteristics of SINEs (e.g., flanking DRs, oligo(A) terminal regions, and the preference for AT-rich target sites). This preference is shared by resident armadillo L1 elements that were probably active during the same period and provided the necessary enzymatic machinery for DAS-SINE retroposition. Our study completes the evidence that SINE elements are present in all mammalian orders. It will facilitate further investigations of SINE elements and their evolutionary impact in the order Xenarthra.
Acknowledgements
Footnotes
References
Batzer, M. A., P. L. Deininger, U. Hellmann-Blumberg, J. Jurka, D. Labuda, C. M. Rubin, C. W. Schmid, E. Zietkiewicz, and E. Zuckerkandl. 1996. Standardized nomenclature for Alu repeats. J. Mol. Evol. 42:3–6.
Bentolila, S., J. M. Bach, J. L. Kessler, I. Bordelais, C. Cruaud, J. Weissenbach, and J. J. Panthier. 1999. Analysis of major repetitive DNA sequences in the dog (Canis familiaris) genome. Mamm. Genome 10:699–705.
Borodulina, O. R., and D. A. Kramerov. 2001. Short interspersed elements (SINEs) from insectivores: two classes of mammalian SINEs distinguished by A-rich tail structure. Mamm. Genome 12:779–786.
Brosius, J. 1999. RNAs from all categories generate retrosequences that may be exapted as novel genes or regulatory elements. Gene 238:115–134.
Brosius, J., and S. J. Gould. 1992. On "genomenclature": a comprehensive (and respectful) taxonomy for pseudogenes and other "junk DNA". Proc. Natl. Acad. Sci. USA 89:10706–10710.
Buzdin, A. A. 2004. Retroelements and formation of chimeric retrogenes. Cell. Mol. Life Sci. 61:1–14.
Cheng, J. F., R. Printz, T. Callaghan, D. Shuey, and R. C. Hardison. 1984. The rabbit C family of short, interspersed repeats: nucleotide sequence determination and transcriptional analysis. J. Mol. Biol. 176:1–20.
Cost, G. J., Q. Feng, A. Jacquier, and J. D. Boeke. 2002. Human L1 element target-primed reverse transcription in vitro. EMBO J. 21:5899–5910.
Daniels, G. R., and P. L. Deininger. 1991. Characterization of a third major SINE family of repetitive sequences in the galago genome. Nucleic Acids Res. 19:1649–1656.
Dewannieux, M., C. Esnault, and T. Heidmann. 2003. LINE-mediated retrotransposition of marked Alu sequences. Nat. Genet. 35:41–48.
Feschotte, C., N. Fourrier, I. Desmons, and C. Mouches. 2001. Birth of a retroposon: the twin sine family from the vector mosquito Culex pipiens may have originated from a dimeric tRNA precursor. Mol. Biol. Evol. 18:74–84.
Gilbert, N., and D. Labuda. 2000. Evolutionary inventions and continuity of CORE-SINEs in mammals. J. Mol. Biol. 298:365–377.
Hagan, R., F. Sheffield, and C. M. Rudin. 2003. Human Alu element retroposition induced by genotoxic stress. Nat. Genet. 35:219–220.
Hasegawa, M., H. Kishino, and T. Yano. 1985. Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22:160–174.
Jurka, J. 1997. Sequence patterns indicate an enzymatic involvement in integration of mammalian retroposons. Proc. Natl. Acad. Sci. USA 94:1872–1877.
Kajikawa, M., and N. Okada. 2002. LINEs mobilize SINEs in the eel through a shared 3' sequence. Cell 111:433–444.
Kapitonov, V. V., and J. Jurka. 2003. A novel class of SINE elements derived from 5S rRNA. Mol. Biol. Evol. 20:694–702.
Kapitonov, V. V., A. Pavlicek, and J. Jurka. 2004. Anthology of human repetitive DNA. Pp. 251–306 in R. A. Meyers, ed. Encyclopedia of molecular cell biology and molecular medicine, Vol. 1. Wiley VCH, Weinheim, Germany.
Kass, D. H., J. Kim, and P. L. Deininger. 1996. Sporadic amplification of ID elements in rodents. J. Mol. Evol. 42:7–14.
Katoh, K., K. Misawa, K. Kuma, and T. Miyata. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059–3066.
Kramerov, D. A., A. A. Grigoryan, A. P. Ryskov, and G. P. Georgiev. 1979. Long double-stranded sequences (dsRNA-B) of nuclear pre-mRNA consist of a few highly abundant classes of sequences: evidence from DNA cloning experiments. Nucleic Acids Res. 6:697–713.
Lander, E. S., L. M. Linton, B. Birren, et al. (256 co-authors). 2001. Initial sequencing and analysis of the human genome. Nature 409:860–921.
Lin, Z., O. Nomura, T. Hayashi, Y. Wada, and H. Yasue. 2001. Characterization of a SINE species from vicuna and its distribution in animal species including the family Camelidae. Mamm. Genome 12:305–308.
Mak, J., and L. Kleiman. 1997. Primer tRNAs for reverse transcription. J. Virol. 71:8087–8095.
Murphy, W. J., E. Eizirik, W. E. Johnson, Y. P. Zhang, O. A. Ryder, and S. J. O'Brien. 2001. Molecular phylogenetics and the origins of placental mammals. Nature 409:614–618.
Nikaido, M., H. Nishihara, Y. Hukumoto, and N. Okada. 2003. Ancient SINEs from African endemic mammals. Mol. Biol. Evol. 20:522–527.
Nikaido, M., A. P. Rooney, and N. Okada. 1999. Phylogenetic relationships among cetartiodactyls based on insertions of short and long interpersed elements: hippopotamuses are the closest extant relatives of whales. Proc. Natl. Acad. Sci. USA 96:10261–10266.
Nishihara, H., Y. Terai, and N. Okada. 2002. Characterization of novel Alu- and tRNA-related SINEs from the tree shrew and evolutionary implications of their origins. Mol. Biol. Evol. 19:1964–1972.
Ohshima, K., M. Hamada, Y. Terai, and N. Okada. 1996. The 3' ends of tRNA-derived short interspersed repetitive elements are derived from the 3' ends of long interspersed repetitive elements. Mol. Cell. Biol. 16:3756–3764.
Okada, N., and K. Ohshima. 1995. Evolution of tRNA-derived SINEs. Pp. 61–79 in R. J. Maraia, ed. The impact of short interspersed elements (SINEs) on the host genome. RG Landes Company, Austin, Tex.
Piskurek, O., M. Nikaido, Boeadi, M. Baba, and N. Okada. 2003. Unique mammalian tRNA-derived repetitive elements in Dermopterans: The t-SINE family and its retrotransposition through multiple sources. Mol. Biol. Evol. 20:1659–1668.
Quentin, Y. 1992. Fusion of a free left Alu monomer and a free right Alu monomer at the origin of the Alu family in the primate genomes. Nucleic Acids Res. 20:487–493.
Rogers, J. H. 1985. The origin and evolution of retroposons. Int. Rev. Cytol. 93:187–279.
Roos, C., J. Schmitz, and H. Zischler. 2004. Primate jumping genes elucidate strepsirrhine phylogeny. Proc. Natl. Acad. Sci. USA 101:10650–10654.
Rozhdestvensky, T. S., A. M. Kopylov, J. Brosius, and A. Huttenhofer. 2001. Neuronal BC1 RNA structure: evolutionary conversion of a tRNA(Ala) domain into an extended stem-loop structure. RNA 7:722–730.
Schmidt, H. A., K. Strimmer, M. Vingron, and A. von Haeseler. 2002. TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18:502–504.
Schmitz, J., G. Churakov, H. Zischler, and J. Brosius. 2004. A novel class of mammalian-specific tailless retropseudogenes. Genome Res. 14:1911–1915.
Schmitz, J., and H. Zischler. 2003. A novel family of tRNA-derived SINEs in the colugo and two new retrotransposable markers separating dermopterans from primates. Mol. Phylogenet. Evol. 28:341–349.
Shimamura, M., H. Yasue, K. Ohshima, H. Abe, H. Kato, T. Kishiro, M. Goto, I. Munechika, and N. Okada. 1997. Molecular evidence from retroposons that whales form a clade within even-toed ungulates. Nature 388:666–670.
Szemraj, J., G. Plucienniczak, J. Jaworski, and A. Plucienniczak. 1995. Bovine Alu-like sequences mediate transposition of a new site-specific retroelement. Gene 152:261–264.
Ullu, E., and C. Tschudi. 1984. Alu sequences are processed 7SL RNA genes. Nature 312:171–172.
Zuker, M. 2003. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 31:3406–3415.