Evolutionary Dynamics in a Novel L2 Clade of Non-LTR Retrotransposons in Deuterostomia

Nika Lovsin, Franc Gubensek, and Dusan Kordis

Department of Chemistry and Biochemistry, Faculty of Chemistry and Chemical Technology, University of Ljubljana, Slovenia;
Department of Biochemistry and Molecular Biology, Jozef Stefan Institute, Ljubljana, Slovenia


    Abstract
 
The evolution of the novel L2 clade of non-long terminal repeat (LTR) retrotransposons and their evolutionary dynamics in Deuterostomia have been examined. The short-term evolution of long interspersed nuclear element 2s (LINE2s) has been studied in 18 reptilian species by analysis of a PCR-amplified 0.7-kb fragment encoding the palm/fingers subdomain of reverse transcriptase (RT). Most of the reptilian LINE2s examined are inactive since they contain multiple stop codons, indels, or frameshift mutations that disrupt the RT. Analysis of reptilian LINE2s has shown a high degree of sequence divergence and an unexpectedly large number of deletions. The evolutionary dynamics of LINE2s in reptiles has been found to be complex. LINE2s are shown to form a novel clade of non-LTR retrotransposons that is well separated from the CR1 clade. This novel L2 clade is more widely distributed than previously thought, and new representatives have been discovered in echinoderms, insects, teleost fishes, Xenopus, Squamata, and marsupials. There is an apparent absence of LINE2s from different vertebrate classes, such as cartilaginous fishes, Archosauria (birds and crocodiles), and turtles. Whereas the LINE2s are present in echinoderms and teleost fishes in a conserved form, in most tetrapods only highly degenerated pseudogenes can be found. The predominance of inactive LINE2s in Tetrapoda indicates that, in the host genomes, only inactive copies are still present. The present data indicate that the vertical inactivation of LINE2s might have begun at the time of Tetrapoda origin, 400 MYA. The evolutionary dynamics of the L2 clade in Deuterostomia can be described as a gradual vertical inactivation in Tetrapoda, stochastic loss in Archosauria and turtles, and strict vertical transmission in echinoderms and teleost fishes.


    Introduction
 
Retrotransposons are class I transposable elements (TEs) that transpose through reverse transcription of an RNA intermediate. They possess a replicative mode of transposition, so that the insertions are mostly stable. Retrotransposons are present in all eukaryotic genomes, where they are the most abundant class of mobile DNA. In many cases, they constitute over 50% of the nuclear DNA, a situation that may have arisen in just a few million years. Retrotransposons play a central role in the structure, evolution, and function of eukaryotic genomes (Kidwell and Lisch 1997, 2000; Shapiro 1999; Bennetzen 2000; Makalowski 2000).

Non-long terminal repeat (LTR) retrotransposons have been classified into 12 clades (Malik, Burke, and Eickbush 1999; Malik and Eickbush 2000). In vertebrate genomes, representatives of four clades of non-LTR retrotransposons are known. The L1 clade is present in mammals (Smit et al. 1995), in teleost fishes (Duvernell and Turner 1998), and in amphibians (Garrett and Carroll 1986). The CR1 clade is found in species from elasmobranch fishes to birds (Haas et al. 1997; Kajikawa, Ohshima, and Okada 1997; Ogiwara et al. 1999) and in degenerate form in mammals (L3 and CR1-HS, in Repbase only). It has been proposed that LINE2 elements belong to the CR1 clade (Malik, Burke, and Eickbush 1999). Very recently the novel Rex1 clade, specific for teleost fishes, has been described (Volff, Koerting, and Schartl 2000). The RTE clade (Malik and Eickbush 1998) is represented in vertebrates by Bov-B LINEs in Squamata and in Ruminantia (Kordis and Gubensek 1998, 1999; Zupunski, Gubensek, and Kordis 2001) and by Rex3 elements in teleost fishes (Volff et al. 1999).

The evolutionary dynamics of non-LTR retrotransposons can be described in most cases by vertical transmission and also, exceptionally, by horizontal transfer (Kordis and Gubensek 1998, 1999; Zupunski, Gubensek, and Kordis 2001). Transposition of active non-LTR retrotransposons frequently generates 5' truncated copies that are unable to transpose because they lack essential sequences, including promoters and parts of the open reading frames (ORFs) that encode proteins required for transposition (Luan et al. 1993). These "dead-on-arrival" (DOA) elements evolve essentially as pseudogenes (Petrov and Hartl 1997).

LINE2s represent one of the vertebrate non-LTR retrotransposons, but their origin and evolutionary relationships are still largely unknown. They were discovered relatively recently as mammalian-specific SINEs, termed mammalian-wide interspersed repeat (MIR) elements (Smit and Riggs 1995; Jurka, Zietkiewicz, and Labuda 1995). Later, they were recognized as being LINE elements (Smit 1996). It was believed that MIR elements are Mesozoic molecular fossils (Jurka, Zietkiewicz, and Labuda 1995), but later they were also discovered in teleost fishes (Okada et al. 1997; Terai, Takahashi, and Okada 1998). The only full-length vertebrate LINE2 element currently known has been identified in the Takifugu genome (Poulter, Butler, and Ormandy 1999). The size of the full-length LINE2 element is 4.5 kb, encoding two ORFs, and its structural organization is very similar to that of the vertebrate CR1 LINEs (Haas et al. 1997; Kajikawa, Ohshima, and Okada 1997; Ogiwara et al. 1999). Surprisingly, the level of sequence conservation between LINE2 and CR1 elements is quite low, even within the same species, reaching only 30% identity at the amino acid level. In mammalian genomes, all LINE2 fragments are highly defective and no full-length elements exist (Lander et al. 2001). The data available for LINE2s in other vertebrate classes are very limited, and in most cases only a short fragment of LINE2 is known. In reptiles the only known LINE2 fragment is from the viperid snake Trimeresurus flavoviridis, whereas in amphibia the only partial LINE2 element is from the salamander Hynobius (Terai, Takahashi, and Okada 1998).

Evolutionary relationships among the Deuterostomia are central to understanding the evolution of the L2 clade. Deuterostomia are a major group of metazoans composed of three phyla: Chordata (including three subphyla: Vertebrata, Cephalochordata, and Urochordata), Hemichordata, and Echinodermata (Cameron, Garey, and Swalla 2000). Echinoderms offer unique advantages for comparative genomics as an outgroup to vertebrates for the study of genome evolution. Extant vertebrates consist of gnathostomes (jawed vertebrates) and lampreys. The cartilaginous fishes (Chondrichthyes) are commonly accepted to be the most basal lineage of jawed vertebrates and are therefore also the most primitive group of fishes. The appearance of land vertebrates, or Tetrapoda (amphibians, reptiles, birds, and mammals), was one of the most important events in the evolution of vertebrates. Phylogenetic relationships within reptiles are still controversial. Lepidosauria (snakes and lizards) have a basal position within reptiles, whereas turtles are the sister group to Archosauria (birds and crocodiles) (Zardoya and Meyer 2000; Janke et al. 2001).

In contrast to the large number of truncated mammalian and teleost LINE2s available in GenBank databases, almost nothing is known about their distribution, variability within and among species, and their evolutionary dynamics in reptiles. To get a deeper insight into their evolutionary history, their long- and short-term evolutionary dynamics in reptiles and Deuterostomia, and their evolutionary relationships with the CR1 clade and other non-LTR retrotransposons, we have carried out an extensive study of the distribution and evolution of LINE2s. In the present study, we clarified the evolutionary history of the L2 clade in Deuterostomia, their evolutionary origin and dynamics of vertical inactivation in Tetrapoda, which were until now unknown.


    Materials and Methods
 
Species Analyzed
To analyze the distribution of LINE2s by PCR, the genomic DNA of the same vertebrate and invertebrate species as tested for Bov-B LINEs (Kordis and Gubensek 1998) has been used. A blood sample of Macropus (kangaroo, Marsupialia) was obtained from Ljubljana Zoo. Genomic DNA extraction was performed as previously described (Kordis and Gubensek 1998).

PCR Amplification of the LINE2s
All experiments were performed in parallel with negative and positive controls. Different rooms, reagents, equipment, and positive displacement pipettes were used, according to the general precautions for PCR performance. The sense L2s (5'-ATCGGAATTCTWGACCTCTCAGCRGC-3') and the antisense L2as (5'-ATCGGAATTCYATTGCARTAGTCCAGG-3') degenerate oligonucleotide primers were based on conserved regions in the reverse transcriptase domains (LDLSAA for L2s; LDYCN for L2as) of a partial T. flavoviridis LINE2 and a partial human consensus LINE2 sequence (obtained from Repbase). PCR amplification of the 0.7-kb fragment was performed on 1 µg of genomic DNA from each species in a 100-µl volume of 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 1.5 mM MgCl2, 200 µM of each dNTP, 0.5 µM of each oligonucleotide, and 2.5 units of AmpliTaq polymerase (Perkin-Elmer). After an initial denaturation step for 5 min at 95°C, the PCR reactions were subjected to 30 cycles of amplification consisting of 1-min denaturation at 95°C, 1-min annealing at 60°C, and 1-min extension at 72°C, with a 5-min final extension at 72°C. Ten microliters of each reaction solution containing the amplified DNA fragments were electrophoresed on a 1% agarose gel, stained with ethidium bromide, and visualized with UV light.
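The degenerate positions in the two primers use the standard IUPAC nucleotide codes (W = A or T, R = A or G, Y = C or T). As an illustrative sketch that is not part of the original protocol, the following Python snippet enumerates the individual oligonucleotides contained in each primer pool. The function name `expand_primer` and the variable names are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_primer(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Primer sequences as given in the text (5' to 3').
L2S = "ATCGGAATTCTWGACCTCTCAGCRGC"    # sense primer L2s
L2AS = "ATCGGAATTCYATTGCARTAGTCCAGG"  # antisense primer L2as

for name, primer in (("L2s", L2S), ("L2as", L2AS)):
    variants = expand_primer(primer)
    print(f"{name}: {len(primer)} nt, {len(variants)} variants")
    for seq in variants:
        print("  ", seq)
```

Each primer contains two twofold-degenerate positions, so each pool consists of four distinct sequences.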

Cloning and Sequencing of LINE2s
The resulting PCR products were ligated directly into a pGEM vector using the pGEM-T-easy cloning kit (Promega) for sequence determination. The inserts were sequenced on both strands with an ABI fluorescent sequencing kit on an ABI 310 sequencer.

DNA- and Protein-based Sequence Analyses
Computer-based nucleotide and protein searches of the sequence databases with reptilian and vertebrate LINE2 sequences and several representatives from the L2 and CR1 clades were performed with the BLAST (Altschul et al. 1990) and FASTA (Pearson 1990) programs. Evolutionary rates were estimated by standard methods (Nei and Kumar 2000, p. 20). Poisson correction distances (d) were estimated by the equation d = -ln(1 - p), where p represents the proportion of different amino acids. The rate of amino acid substitution (r) was estimated by the standard equation r = d/(2T), where T is the divergence time of the last common ancestor (LCA) of the species compared. Kimura two-parameter genetic distances were calculated from sequences of the reverse transcriptase (RT) domain using the MEGA2 program (Kumar et al. 2000).
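The two equations above can be evaluated directly. The following Python sketch is included only to make the arithmetic concrete; the function names and the input values (p = 0.50, T = 400 Myr) are hypothetical and are not taken from table 3.

```python
import math

def poisson_distance(p):
    """Poisson-corrected distance d = -ln(1 - p), where p is the
    proportion of amino acid sites that differ between two sequences."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.log(1.0 - p)

def substitution_rate(d, divergence_time_years):
    """Rate r = d / (2T) in replacements per site per year; the factor 2
    accounts for change along both lineages since the common ancestor."""
    return d / (2.0 * divergence_time_years)

# Hypothetical inputs, for illustration only.
p = 0.50   # half of the compared amino acid sites differ
T = 400e6  # divergence time of 400 million years

d = poisson_distance(p)
r = substitution_rate(d, T)
print(f"d = {d:.3f}, r = {r:.2e} replacements per site per year")
```

With these assumed inputs the rate comes out just under 1 × 10⁻⁹ replacements per site per year; real estimates depend on the alignment and on the divergence time chosen for each pair of taxa.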

Phylogenetic Analyses of LINE2s
The amino acid sequences of the RT domain of metazoan LINE2 elements and of the representatives of nine non-LTR retrotransposon clades (L1, RTE, R1, LOA, Tad1, Jockey, I, CR1, and Rex1) were aligned using Clustal W (Thompson, Higgins, and Gibson 1994). Gaps in aligned sequences were removed for the purpose of analysis. Phylogenetic trees were inferred using the Neighbor-Joining (NJ) method (Saitou and Nei 1987) as implemented in Treecon (van de Peer and de Wachter 1994). As an outgroup we used group II intron RT from Neurospora (acc. no. S07649). The significance of the various phylogenetic lineages was assessed by bootstrap analysis.
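The gap-removal step can be pictured with a short Python sketch. It assumes complete deletion, that is, every alignment column in which any sequence carries a gap is dropped; the text does not state which deletion scheme was applied, and the function name, labels, and toy sequences below are made up for illustration.

```python
def strip_gap_columns(aligned, gap_chars="-."):
    """Drop every alignment column in which at least one sequence has a gap.

    `aligned` maps sequence names to equal-length aligned strings.
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    names = list(aligned)
    columns = zip(*(aligned[name] for name in names))
    kept = [col for col in columns if not any(ch in gap_chars for ch in col)]
    return {name: "".join(col[i] for col in kept)
            for i, name in enumerate(names)}

# Toy alignment with made-up sequences and labels.
toy = {
    "elementA": "LDLSAAFD-TV",
    "elementB": "LDLS-AFDKTV",
    "elementC": "LDMSKAFDRTV",
}
for name, seq in strip_gap_columns(toy).items():
    print(name, seq)
```

Complete deletion keeps all sequences at the same length, which distance-based methods such as NJ require, at the cost of discarding informative columns when many sequences are fragmentary.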


    Results and Discussion
 
Distribution of LINE2s in Reptiles
The presence of a LINE2 fragment in the third intron of the T. flavoviridis TBBP gene (Okada et al. 1997; Terai, Takahashi, and Okada 1998) prompted a more extensive investigation of its distribution among reptiles. The main part of the RT (domains three to eight) has been amplified by PCR, using degenerate oligonucleotide primers based on two highly conserved sequence motifs derived from the LINE2 of T. flavoviridis and the human LINE2 consensus sequence. Their distribution has been analyzed in 18 reptilian species, representatives of crocodiles, turtles, and Squamata (table 1), as well as in some representatives of vertebrates and invertebrates. LINE2s have been detected in the earliest reptilian lineage represented by Squamata (snakes and lizards), but not in the genomes of the evolutionarily younger turtles and Archosauria (crocodiles and birds).


 
Table 1 Reptilian Species that were Tested by PCR Assay

 
Reptilian LINE2s Are Highly Defective
To confirm that the amplified PCR products were LINE2s and to analyze their evolutionary relationships, they were cloned and sequenced from 15 reptilian species. Two to nine clones were sequenced from each species (table 1) in order to estimate the level of within-species nucleotide sequence divergence. The multiple copies of LINE2s obtained from some species, such as Boidae snakes, show a high level of nucleotide sequence divergence, with a number of indels and stop codons. In a large number of reptilian LINE2s, we found stop codons and short deletions; these result in frameshift or in-frame stop mutations. In the reptilian species where only highly degenerate LINE2 copies are present, the intraspecific Kimura two-parameter genetic distances among different copies are very high, indicating that they are molecular fossils of once active LINE2s. In contrast, in the species where LINE2s are highly conserved, we observed very low Kimura two-parameter genetic distances (data not shown). Alignment of the RT with conserved domains three to eight of reptilian LINE2s is shown in figure 1. Only a few conserved RT sequences from representatives of different snake families (Viperidae, Elapidae, Colubridae, and Boidae) and a lacertid lizard are shown to highlight the conserved regions in the RT domain. Comparisons of the newly available reptilian LINE2s with the elements from Deuterostomia show a moderate level of conservation at the amino acid level, reaching 45%–50% identity.
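For readers unfamiliar with the Kimura two-parameter distance, the following Python sketch computes it for one pair of aligned nucleotide sequences using the standard formula d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences. This is a minimal stand-alone illustration, not the MEGA2 implementation used in the study; the function name is hypothetical, and skipping columns with gaps or ambiguous bases is an assumption.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Columns containing a gap or any non-ACGT character are skipped.
    math.log raises ValueError if the sequences are too divergent for
    the correction to be defined.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    P = transitions / sites
    Q = transversions / sites
    return -0.5 * math.log(1 - 2 * P - Q) - 0.25 * math.log(1 - 2 * Q)

# Made-up sequences: one transition (A->G) in ten comparable sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

Averaging such pairwise values over all clones from one species gives a simple measure of intraspecific divergence of the kind discussed above.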



 
Fig. 1. Multiple alignment of the RT domain of LINE2s in reptiles. The alignment was constructed with the program Clustal W (Thompson, Higgins, and Gibson 1994). RT conserved domains are given according to Malik, Burke, and Eickbush (1999). Identical residues are in black, and conservative substitutions are in gray.

 
Complex Evolutionary Dynamics of LINE2s in Reptiles
The evolutionary relationships among paraphyletic reptiles (Archosauria: crocodiles and birds are monophyletic) have been recently clarified (Hedges and Poling 1999; Janke et al. 2001). The long-standing debate about the most ancestral lineage of reptiles has now been resolved in favor of Lepidosauria, whereas turtles and crocodiles are evolutionarily younger. The distribution pattern of LINE2s in reptiles is striking: they are present only in the most ancestral reptilian lineage (Lepidosauria) but absent from the genomes of the evolutionarily younger turtles and Archosauria. It is apparent that in turtles and Archosauria the LINE2s have undergone stochastic loss. A very interesting observation is that LINE2s are absent from vertebrates with relatively small genomes, such as birds, turtles, and crocodiles.

We analyzed LINE2s from a number of Lepidosauria representatives: from three lizard infraorders (Gekkota, Scincomorpha, and Anguimorpha) and from four snake families (Boidae, Viperidae, Elapidae, and Colubridae); the species are listed in table 1. In some cases, we analyzed several species from the same family of snakes, in order to get an insight into their short-term evolutionary dynamics in closely related species.

We found that in two species of the genus Vipera (V. ammodytes and V. palaestinae) quite different levels of conservation exist among LINE2s at the intraspecific level. LINE2s are highly conserved at the nucleotide level in V. ammodytes, but at a much lower level in V. palaestinae. Translatable LINE2 products in V. ammodytes are more common than in V. palaestinae, indicating that in closely related species LINE2s may have had very different evolutionary histories. Similarly, in mammals, LINE2s in mouse and human also have very different evolutionary histories (Lander et al. 2001).

We found only a few snake species, i.e., Natrix, V. ammodytes, Echis, Crotalus, Walterinnesia, and Bothrops, with relatively well-conserved LINE2 nucleotide sequences, whereas the ancestral snake lineage (Boidae) showed a much lower level of conservation with a large number of indels. Most of the translatable snake LINE2 sequences contain stop codons. The LINE2 nucleotide sequences from some lizards are also conserved, as in Tupinambis and Podarcis, whereas in Hemidactylus and Anguis they are more divergent.

If reptilian genomes contained conserved and potentially active LINE2 elements, they could be easily amplified by PCR. The analysis of translated sequences of the RT domain from reptilian LINE2s provides no evidence for the presence of active LINE2 elements in the reptilian genomes. Whereas some of the translated LINE2 sequences appear to be relatively conserved, they contain stop codons and frameshift mutations.
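The kind of check described above, translating a cloned fragment and looking for stop codons, can be sketched in a few lines of Python. The snippet below uses the standard genetic code and scans the three forward reading frames only; the function names and the two short test sequences are made up for illustration and are not sequences from this study. A real analysis would also compare each clone against an intact reference to locate frameshifting indels.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna, frame=0):
    """Translate DNA in a forward frame (0, 1, or 2).

    Alignment gaps are removed first; codons with unknown bases give 'X'
    and stop codons give '*'.
    """
    dna = dna.upper().replace("-", "")
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)

def stop_counts(dna):
    """Number of stop codons in each of the three forward reading frames."""
    return {frame: translate(dna, frame).count("*") for frame in range(3)}

# Made-up example: an intact stretch and a copy with a nonsense mutation
# (TCA -> TGA) in the fourth codon.
intact = "CTGGACCTCTCAGCAGCC"
mutant = "CTGGACCTCTGAGCAGCC"

for label, seq in (("intact", intact), ("mutant", mutant)):
    print(label, translate(seq), stop_counts(seq))
```

A fragment with stop codons in every frame, or one whose conserved motifs can only be followed by switching frames, would be scored as inactive under this kind of test.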

Our analysis of a large number of different reptilian species shows the importance of good taxon sampling for deeper insight into the evolutionary history, dynamics, and relationships of LINE2s. In sequence databases, the only known reptilian LINE2 fragment was from Trimeresurus, and one could draw completely misleading conclusions about the distribution and evolution of LINE2s in reptiles on the basis of their presence in a single species. The data currently available in sequence databases are not adequate for recognizing the long-term evolution of LINE2s in other vertebrate classes or their evolutionary dynamics within a particular vertebrate class. Therefore, additional analyses based on the PCR-amplified RT domain of LINE2s are necessary for mammals (Marsupionta and Eutheria), amphibians, cartilaginous fishes, and lampreys.

Widespread, but Discontinuous Distribution of LINE2s in Deuterostomia
The distribution of LINE2s has been analyzed and found to be wider than previously thought. By searching specific GenBank databases, and by searching within specific taxonomic groups, using the full-length sequence of Takifugu LINE2 and other novel LINE2 elements, we discovered several new representatives (table 2).


 
Table 2 Novel Representatives of L2 Clade in Metazoa

 
The echinoderm LINE2 lineage is particularly interesting. In a sea urchin (Strongylocentrotus) GSS database we found a nearly full-length LINE2 element (table 2). By TBLASTN searching of the GSS database with ORF1, apurinic-apyrimidinic endonuclease (AP-EN), RT, and the C-terminal part of RT, we found a high level of conservation between teleost and echinoderm LINE2s, with up to 50% amino acid identity. The level of conservation at the amino acid level between vertebrate and echinoderm LINE2s, in the range of 40%–55%, seems relatively high considering their taxonomic position and the age of their last common ancestor (around 600 MYA). The most probable reason for the close similarity of echinoderm and vertebrate LINE2s is a low evolutionary rate in the LINE2s (table 3). In echinoderms the RT domain is well conserved, indicating that these elements might be active.


 
Table 3 Evolutionary Rates in the L2 Clade

 
LINE2s have been previously reported from three teleost orders: Perciformes (fam. Cichlidae) (Terai, Takahashi, and Okada 1998; Oliveira et al. 1999), Tetraodontiformes (fam. Tetraodontidae, genus Takifugu) (Poulter, Butler, and Ormandy 1999), and Cypriniformes (fam. Cyprinidae, genus Danio) (Okada et al. 1997). By TBLASTN searching of GenBank databases, further LINE2 fragments have been found in numerous teleost species and orders (table 2). The presence of LINE2 in eight different teleost orders indicates that they are widespread in teleost fishes. By searching NR and HTGS databases we found a novel full-length LINE2 element in medakafish (Oryzias latipes) and several in zebrafish (Danio rerio) (table 2). The full-length LINE2 element from medakafish shows 60%–70% amino acid identity to the LINE2 from Takifugu or other teleost fishes, whereas zebrafish full-length LINE2 elements show only 50% identity. By searching databases we found no sequence data for LINE2s in cartilaginous fishes.

The only LINE2 so far reported for amphibians is from the salamander Hynobius (Okada et al. 1997; Terai, Takahashi, and Okada 1998). We amplified and sequenced LINE2 from Xenopus, which showed that LINE2s are present in both main amphibian groups, Anura (frogs and toads) and Caudata (salamanders), and might be widespread in amphibia. The amphibian LINE2 elements show a number of indels and stop codons.

In mammalian genomes only highly defective molecular fossils of LINE2s are present (Lander et al. 2001). Human and mouse genomes and other mammalian databases were searched by TBLASTN with the full-length Takifugu LINE2 element. No full-size elements were discovered, only very short fragments of the RT. An examination of mammalian sequences in GenBank databases provides additional support for the widespread presence of highly degenerated LINE2 sequences in mammals. It has been reported that marsupials also contain MIRs (Jurka, Zietkiewicz, and Labuda 1995). Marsupial (Macropus) LINE2 nucleotide sequences show low levels of similarity when compared with those of placental mammals. To shed light on the origin of LINE2s and their current distribution in mammals, however, more extensive analyses of the monotremes, marsupials, and placental mammals are necessary.

When potential LINE2s were amplified from turtles, crocodiles, and birds (Gallus), the sequences of several clones were found to show no similarity to LINE2. TBLASTN searching of Aves in GenBank databases with reptilian or Takifugu LINE2 elements provides no evidence for their presence in avian genomes. This suggests that the LINE2s were lost from the turtle, crocodile, and bird genomes through genetic drift (stochastic loss). Avian genomes are very compact and contain a very small number of TE families, mainly CR1 LINEs. The currently known distribution of LINE2s in vertebrates is discontinuous, being absent in cartilaginous fishes, turtles, and Archosauria.

It has been reported that a short, highly conserved LINE2 fragment exists in the gastropod Patella (Mollusca) (Poulter, Butler, and Ormandy 1999). A search of GenBank databases revealed a highly significant match (E-value: e-61) with a non-LTR retrotransposon termed Samurai from the silkmoth (Bombyx mori), which may represent the first nearly full-length LINE2-like element in insects. In sandflies (Phlebotomus) we found fragments of LINE2-like elements. Surprisingly, we also found LINE2 in a highly conserved form in the human parasite Schistosoma japonicum (Trematoda). Trematode LINE2 sequences seem too highly conserved, considering the age of their LCA with Deuterostomia (900 MYA) (Feng, Cho, and Doolittle 1997), and they probably originated from their vertebrate hosts.

The ORF1 of LINE2s is shorter than that of CR1 elements. The full-length LINE2s from teleost fishes show relatively conserved sequences of ORF1 (fig. 2). The only similarly conserved, but shorter, ORF1 fragments have been found in the Tetraodon GSS database. The ORF2 encodes AP-EN and RT (including thumb subdomain) domains; comparisons of deuterostomian LINE2 representatives are shown in figure 2. Available amino acid sequence data of the L2 clade show that the level of conservation in the RT domain (including thumb subdomain) is relatively high (around 40%), but the ORF1 and AP-EN show considerably lower levels of conservation (fig. 2).



 
Fig. 2. Multiple alignments of ORF1, AP-EN, RT, and thumb domain of LINE2s in Deuterostomia. The alignments were constructed with the program Clustal W (Thompson, Higgins, and Gibson 1994). RT conserved domains are given according to Malik, Burke, and Eickbush (1999). Identical residues are in black, and conservative substitutions are in gray. a, ORF1 alignment: the following representatives were used: Takifugu (AF086712), Oryzias (AB054295), Danio (AL591210), and Gallus CR1 (U88211). b, AP endonuclease domain alignment: the following sequences were used: Takifugu (AF086712), Oryzias (AB054295), Tetraodon (AL292608), Xiphophorus (AF278692), Homo LINE2 consensus (Repbase), Strongylocentrotus (AZ137326), and Gallus CR1 (U88211). c, Reverse transcriptase domain alignment: the following sequences were used: Trimeresurus (D31777), Homo LINE2 consensus (Repbase), Takifugu (AF086712), Oryzias (AB054295), Tetraodon (AL238310), Xiphophorus (AF278692), Strongylocentrotus (AZ138419), and Gallus CR1 (U88211). d, Thumb subdomain of RT alignment: the following sequences were used: Trimeresurus (D31777), Hynobius (D17442), Takifugu (AF086712), Oryzias (AB054295), Strongylocentrotus (AZ211660), Tetraodon (AL197005), Xiphophorus (AF278692), Homo LINE2 consensus (Repbase), B. mori (AB055391), and Gallus CR1 (U88211).

 



 
Knowledge relating to the distribution of LINE2s in Metazoa has been extended and shows a much wider taxonomic distribution than previously thought. Our analysis shows that the L2 clade is moderately widespread, originating relatively late during the evolution of Metazoa.

Evolutionary Rates in LINE2s
All LINE2s compared show a relatively high level of conservation (40%–50%) at the amino acid level among different vertebrate classes, but conservation within a particular taxonomic group is always much higher. The evolutionary rates estimated for representatives of the L2 clade have been found to be very low, only 0.3 to 1.2 × 10⁻⁹ amino acid replacements per amino acid site per year (table 3). This contrasts strongly with the fact that the majority of LINE2s are present in deuterostomian genomes as pseudogenes. Whereas nonfunctional copies of TEs are expected to evolve at the same rate as pseudogenes, the data obtained for relatively conserved LINE2s show the opposite pattern. It seems that after massive initial amplification of LINE2s in the ancestor of Deuterostomia they have been undergoing very slow extinction in all tetrapods for more than 400 Myr. In some taxonomic groups (e.g., Boidae snakes, mammals) LINE2s are represented only by highly degenerated, ancient, and decaying members. This indicates that all active copies were inactivated, followed by gradual erosion of existing LINE2 sequences.

LINE2s Form a Novel Clade of Non-LTR Retrotransposons
The evolutionary relationship of LINE2s with the other non-LTR retrotransposon clades has been uncertain. It was assumed that they belong to the CR1 clade (Malik, Burke, and Eickbush 1999; Poulter, Butler, and Ormandy 1999; Arkhipova and Meselson 2000) or represent a separate clade (Terai, Takahashi, and Okada 1998), but this has not been supported by phylogenetic analyses. In the largest evolutionary analysis of non-LTR retrotransposons to date, LINE2 elements were not included, as most of them are highly degenerated in mammals (Malik, Burke, and Eickbush 1999).

To determine the evolutionary relationship of the LINE2s with other non-LTR retrotransposon clades, the amino acid sequences of the RT domain were subjected to phylogenetic analysis using the NJ algorithm (Saitou and Nei 1987). The analyses were based on the representatives of nine clades of non-LTR retrotransposons and LINE2s. The Neurospora group II intron RT was used as an outgroup, and the topology of the NJ tree consists of 10 different clades, nine of them already reported (Malik, Burke, and Eickbush 1999; Volff, Koerting, and Schartl 2000) (fig. 3).



 
Fig. 3. LINE2s form a novel clade of non-LTR retrotransposons. The rooted Neighbor-Joining phylogenetic tree using the Poisson correction model and Neurospora group II intron RT as an outgroup was drawn by the Treecon program (van de Peer and de Wachter 1994). It represents the bootstrap consensus following 1,000 replicates; nodes with confidence values greater than 70% are indicated. The corresponding alignment is available on request. The accession numbers of L2 clade representatives are as follows: B. mori (AB055391), S. purpuratus (AZ138419 and AZ211660 joined), Patella (X77618), Trimeresurus (D31777), Xiphophorus (AF278692), Oryzias (AB054295), Takifugu (AF086712). The majority of representatives from the CR1, I, Jockey, LOA, Tad1, R1, RTE, and L1 clades are the same as reported by Malik, Burke, and Eickbush (1999), except the Scyliorhinus HER 1 (AB027732), Xenopus CR1 (BF612246), Rana CR1 (AF175979), Balanus CR1 (AB044077), Calliactis (AF221986), Xiphophorus Rex3 (AF125981), Takifugu Rex3 (AF108422), Schistosoma SR2 (AF025672), and Vipera Bov-B LINE (AF332697). Rex1 sequences were taken from Volff, Koerting, and Schartl (2000).

 
This analysis shows that the LINE2s form a novel and significantly supported clade that is well separated from CR1 and Rex1 clades; LINE2 elements are never clustered within the other clades. Additional support for the existence of the novel L2 clade is the very low level of sequence identity between the CR1 and L2 clades. Comparison of the full-length Takifugu LINE2 with Platemys or Gallus CR1 LINEs shows only 20%–27% identity at the amino acid level, as does comparison of CR1 and LINE2 elements from the same genome (e.g., from Squamata, Xenopus, or echinoderms).
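Percent identity figures such as those quoted above depend on how gapped columns are counted. The following Python sketch shows one common convention, in which identity is computed only over columns where both aligned sequences carry a residue; the text does not state which convention was used, and the function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned1, aligned2, gap="-"):
    """Percent identical residues over the columns in which neither of
    the two aligned sequences has a gap."""
    if len(aligned1) != len(aligned2):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(aligned1.upper(), aligned2.upper())
             if a != gap and b != gap]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)

# Made-up aligned fragments: 6 identical residues in 9 ungapped columns.
seq_a = "LDLSAAFDTV"
seq_b = "LDMS-AYDKV"
print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

Using the full alignment length or the length of the shorter sequence as the denominator would give lower values, so identities are only comparable when computed under the same convention.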

This is the first comprehensive phylogenetic analysis of the L2 clade and uses a number of LINE2 representatives from insects to mammals. Phylogenetic analysis of reptilian and vertebrate LINE2 elements alone does not reveal concordance with the taxonomic classification of the host species (data not shown), as is expected when pseudogenes are used in phylogenetic analyses.

Evolutionary Dynamics of LINE2s in Deuterostomia
LINE2 elements may be considered as a model system for investigating the evolutionary dynamics of TEs in the "arms race" with host-directed silencing mechanisms. During their coexistence with the host genomes, TEs are subjected to short bursts of activity followed by long-term inactivation through host-directed silencing mechanisms. Models of evolutionary dynamics of TEs indicate that the ultimate fate of these elements within species lineages is inactivation and eventual extinction (Kaplan, Darden, and Langley 1985).

What happens during the long-term evolution of the L2 clade in a large and well-defined taxonomic group, such as vertebrates or Deuterostomia, over very long periods of time (more than 600 Myr)? The present study shows that vertically inactivated LINE2s are prevalent in Tetrapoda genomes. The vast majority of reptilian, amphibian, and mammalian LINE2s so far examined are pseudogenes; they contain multiple mutations that inactivate the RT.

In global terms, the evolutionary dynamics and history of the L2 clade in Deuterostomia are relatively simple. In echinoderms and teleost fishes the LINE2 elements are well conserved and full-size elements exist, making it likely that they are retrotranspositionally active. In contrast, in all tetrapods (Amphibia, Reptilia, and Mammalia) only highly degenerated LINE2 pseudogenes can be recognized, indicating that LINE2s have been vertically inactivated. The vertical inactivation of LINE2s in Tetrapoda could have occurred only once, some 400 MYA in the ancestor of the Tetrapoda, or alternatively, there may have been several independent inactivations in each vertebrate class (Amphibia, Reptilia, Aves, and Mammalia).

One possible reason for such a rapid inactivation of active LINE2 elements in Tetrapoda genomes might be the small number of active copies in their ancestor. Very rough estimates indicate that the Takifugu genome contains about 3,000 LINE2 copies, 8% of which (roughly 240 elements) may be of full size (Poulter, Butler, and Ormandy 1999). As in all other TEs, the number of active copies is much smaller than that of the full-length members. The limited number of full-length copies and a very small number of active copies are probably responsible for the relatively rapid vertical inactivation of active LINE2 elements in Tetrapoda. In the near future the completed Takifugu genome sequence will show the actual number of full-length LINE2 elements.

In reptiles, whose phylogeny is known and in which LINE2s have been studied, we found that turtles and Archosauria (birds and crocodiles) no longer harbor LINE2 elements, whereas the most basal lineage of reptiles, the Lepidosauria, still contains them. It is obvious that in turtles and Archosauria, the LINE2s have undergone stochastic loss. This means that over a long enough time, the rate of loss of LINE2 elements by random genetic drift exceeds the rate of gain by transposition until eventually no LINE2 elements remain in their genomes (as seen in birds). The apparent lack of LINE2 elements in turtles and Archosauria is difficult to explain by vertical inactivation of LINE2 at the time of the origin of the tetrapods (which could, however, explain the presence of only degenerated LINE2 elements in tetrapods).
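The notion of stochastic loss can be illustrated with a toy simulation. The Python sketch below is not a population-genetic model of the kind analyzed by Kaplan, Darden, and Langley (1985); it only tracks the copy number in a single lineage, with each copy independently duplicated or deleted once per generation. The function name, the rates, and the seed are hypothetical choices made for illustration.

```python
import random


def simulate_copy_number(initial_copies, gain_rate, loss_rate, generations, seed=0):
    """Track the copy number of an element family in one lineage.

    In each generation every existing copy gives rise to a new copy with
    probability gain_rate and is deleted with probability loss_rate.
    The simulation stops early once no copies remain.
    """
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    copies = initial_copies
    history = [copies]
    for _ in range(generations):
        gained = sum(1 for _ in range(copies) if rng.random() < gain_rate)
        lost = sum(1 for _ in range(copies) if rng.random() < loss_rate)
        copies = copies + gained - lost
        history.append(copies)
        if copies == 0:
            break  # the family is extinct; it cannot be regained vertically
    return history


if __name__ == "__main__":
    # Loss slightly exceeding gain: the family tends to decline over time.
    trajectory = simulate_copy_number(50, gain_rate=0.01, loss_rate=0.02, generations=2000)
    print(trajectory[-1], len(trajectory) - 1)
```

Once the count reaches zero it stays there, which is the sense in which loss is irreversible in the absence of horizontal transfer.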

An alternative model is based on the existence of a small number of active LINE2 elements at the time of the origin of tetrapods. These elements would have been lost from some lineages (turtles and Archosauria), but would have remained in the genomes of amphibians, lepidosaurs, and mammals, where they amplified independently and subsequently became inactivated. Although this alternative model seems quite reasonable, it is difficult to reconcile with the failure to detect active or full-length LINE2 elements in Tetrapoda. Searching the Xenopus EST database with the full-length Takifugu LINE2 elements revealed no long LINE2 remnants in the most ancestral tetrapods (amphibians). Repeatedly, only very short C-terminal parts of RT could be found (by TBLASTN). In contrast, by searching the same database with an avian CR1 element, a nearly full-length CR1 element could be easily reconstructed on the basis of relatively long translated CR1 ORF2 sequences.

The data presented here provide the strongest support for the vertical inactivation of LINE2 at the time of the origin of Tetrapoda. Several observations support this conclusion.

First, there is a gradual increase in the number of defective LINE2 copies, going from teleost fishes to mammals, the highest level being found in mammals. By searching the human and mammalian databases with the nucleotide or translated sequences of consensus human LINE2, only a small number of very short fragments at the DNA level and a much smaller number of translatable LINE2 copies are shown to be present. The present data show that the LINE2s are widespread in Squamata genomes but are in most cases degenerated pseudogenes, although they are still much better conserved than in mammals. In some species (e.g., Python) the indels in LINE2 sequences are so different that it is impossible to align or translate them. In birds no LINE2 sequences could be found in sequence databases, indicating that they were removed from their genomes through genetic drift (stochastic loss). In the genomes of all Tetrapoda analyzed, we found the same pattern: the prevalence of highly degenerated pseudogenes. We may conclude that LINE2s in tetrapods have been inactive for a very long period of time, probably from the origin of Tetrapoda 400 MYA (Feng, Cho, and Doolittle 1997).

Second, our attempt to amplify full-length LINE2 elements in tetrapods failed, and additional searching of the Tetrapoda sequence databases also showed the absence of full-length LINE2 elements.

Third, if a particular non-LTR retrotransposon family contains active copies, it should always be possible to amplify by PCR the conserved RT domain, which should be easily translated. However, in the case of LINE2, all elements isolated had numerous indels, frameshift mutations, and stop codons. Non-LTR retrotransposons produce dead-on-arrival (DOA) copies by the process of retrotransposition (Luan et al. 1993; Petrov and Hartl 1997), but when the ORF2 or RT domain is conserved it is always possible to amplify and obtain conserved RT sequences. This has been shown for some other vertebrate non-LTR retrotransposons, such as Bov-B LINEs (Kordis and Gubensek 1998; Zupunski, Gubensek, and Kordis 2001), analyzed from the same reptilian species.

Finally, comparison of the LINE2 sequences with the other available non-LTR retrotransposon families in tetrapods, such as LINE1s, CR1 LINEs, and Bov-B LINEs, shows that LINE2s are poorly conserved in a particular species or among higher taxonomic groups.
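
The criterion in the third observation, that a conserved RT fragment should translate without interruption, can be sketched as a simple reading-frame check. The Python example below is only an illustration under stated assumptions, not the analysis performed here: it scans the three forward frames only, uses the stop codons of the standard genetic code, and the function names are hypothetical. A fragment that fails the check in every frame carries at least one stop codon or frameshift-induced stop, whereas passing it does not prove that an element is active.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def inframe_stops(dna, frame=0):
    """Count stop codons read in the given forward frame (0, 1, or 2)."""
    dna = dna.upper()
    return sum(
        1
        for i in range(frame, len(dna) - 2, 3)
        if dna[i:i + 3] in STOP_CODONS
    )


def has_open_frame(dna):
    """Return True if at least one forward frame is free of stop codons."""
    return any(inframe_stops(dna, frame) == 0 for frame in range(3))
```

For example, `has_open_frame("ATGAAAGGG")` is true, whereas a fragment such as `"TAACTAACTAAC"` has a stop codon in each of the three forward frames.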


    Conclusions
 
In this study we have resolved the evolutionary origin of LINE2s and their evolutionary relationships with other non-LTR retrotransposons, their short-term (in reptiles) and long-term (in Deuterostomia) evolutionary dynamics, and the ongoing process of vertical inactivation in Tetrapoda. The evolutionary dynamics of LINE2 in reptiles have been found to be complex. There is an apparent absence of LINE2s from different vertebrate classes, such as cartilaginous fishes, Archosauria (birds and crocodiles), and turtles. Evolutionary analysis shows that LINE2s are not members of the CR1 clade but form a novel L2 clade of non-LTR retrotransposons. Whereas the LINE2s are present in echinoderms and teleost fishes in a conserved form, in most tetrapods only highly degenerated LINE2 pseudogenes can be found. Our data indicate that the vertical inactivation of LINE2s might have begun at the time of Tetrapoda origin, 400 MYA. The evolutionary dynamics of the L2 clade in Deuterostomia can be described as a gradual vertical inactivation in Tetrapoda, stochastic loss in Archosauria and turtles, and strict vertical transmission in echinoderms and teleost fishes.


    Sequence Availability
 
The new sequences used in this paper have been deposited in the GenBank database (accession numbers AF373334 to AF373406).


    Acknowledgements
 
For critical reading of the manuscript we thank Prof. R. H. Pain. This work was supported by the Ministry of Science and Technology of Slovenia through program P0-0501-0106.


    Footnotes
 
Dan Graur, Reviewing Editor

Abbreviations: AP-EN, apurinic-apyrimidinic endonuclease; LCA, last common ancestor; LINE, long interspersed nuclear element; LTR, long terminal repeat; MIR, mammalian-wide interspersed repeat; NJ, Neighbor-Joining; ORF, open reading frame; RT, reverse transcriptase; TE, transposable element.

Keywords: non-LTR retrotransposon, L2 clade, LINE2, vertical inactivation, evolutionary dynamics

Address for correspondence and reprints: Dusan Kordis, Department of Biochemistry and Molecular Biology, Jozef Stefan Institute, Jamova 39, 1001 Ljubljana, Slovenia. E-mail: dusan.kordis@ijs.si.


    References
 

    Altschul S. F., W. Gish, W. Miller, E. W. Myers, D. J. Lipman. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403-410.

    Arkhipova I., M. Meselson. 2000. Transposable elements in sexual and ancient asexual taxa. Proc. Natl. Acad. Sci. USA 97:14473-14477.

    Bennetzen J. L. 2000. Transposable element contributions to plant gene and genome evolution. Plant Mol. Biol. 42:251-269.

    Cameron C. B., J. R. Garey, B. J. Swalla. 2000. Evolution of the chordate body plan: new insights from phylogenetic analyses of deuterostome phyla. Proc. Natl. Acad. Sci. USA 97:4469-4474.

    Duvernell D. D., B. J. Turner. 1998. Swimmer 1, a new low copy number LINE family in teleost genomes with sequence similarity to mammalian L1. Mol. Biol. Evol. 15:1791-1793.

    Feng D., G. Cho, R. F. Doolittle. 1997. Determining divergence times with a protein clock: update and reevaluation. Proc. Natl. Acad. Sci. USA 94:13028-13033.

    Garrett J. E., D. Carroll. 1986. Tx1: a transposable element from Xenopus laevis with some unusual properties. Mol. Cell. Biol. 6:933-941.

    Haas N. B., J. M. Grabowski, A. B. Sivitz, J. B. E. Burch. 1997. Chicken repeat 1 (CR1) elements, which define an ancient family of vertebrate non-LTR retrotransposons, contain two closely spaced open reading frames. Gene 197:305-309.

    Hedges S. B., L. L. Poling. 1999. A molecular phylogeny of reptiles. Science 283:998-1001.

    Janke A., D. Erpenbeck, M. Nilsson, U. Arnason. 2001. The mitochondrial genomes of the iguana (Iguana iguana) and the caiman (Caiman crocodylus): implications for amniote phylogeny. Proc. R. Soc. Lond., Ser. B Biol. Sci. 268:623-631.

    Jurka J., E. Zietkiewicz, D. Labuda. 1995. Ubiquitous mammalian-wide interspersed repeats (MIRs) are molecular fossils from the Mesozoic era. Nucleic Acids Res. 23:170-175.

    Kajikawa M., K. Ohshima, N. Okada. 1997. Determination of the entire sequence of turtle CR1: the first open reading frame of the turtle CR1 element encodes a protein with a novel zinc finger motif. Mol. Biol. Evol. 14:1206-1217.

    Kaplan N., T. Darden, C. H. Langley. 1985. Evolution and extinction of transposable elements in Mendelian populations. Genetics 109:459-480.

    Kidwell M. G., D. Lisch. 1997. Transposable elements as sources of variation in animals and plants. Proc. Natl. Acad. Sci. USA 94:7704-7711.

    Kidwell M. G., D. Lisch. 2000. Transposable elements and host genome evolution. Trends Ecol. Evol. 15:95-99.

    Kordis D., F. Gubensek. 1998. Unusual horizontal transfer of a long interspersed nuclear element between distant vertebrate classes. Proc. Natl. Acad. Sci. USA 95:10704-10709.

    Kordis D., F. Gubensek. 1999. Molecular evolution of Bov-B LINEs in vertebrates. Gene 238:171-178.

    Kumar S., K. Tamura, I. Jakobsen, M. Nei. 2000. MEGA 2 (molecular evolutionary genetics analysis program). Version 2.0. Pennsylvania State University, University Park, and Arizona State University, Tempe.

    Lander E. S., L. M. Linton, B. Birren, et al. (243 co-authors). 2001. Initial sequencing and analysis of the human genome. Nature 409:860-921.

    Luan D. D., M. H. Korman, J. L. Jakubczak, T. H. Eickbush. 1993. Reverse transcription of R2Bm RNA is primed by a nick at the chromosomal target site: a mechanism for non-LTR retrotransposition. Cell 72:595-605.

    Makalowski W. 2000. Genomic scrap yard: how genomes utilize all that junk. Gene 259:61-67.

    Malik H. S., W. D. Burke, T. H. Eickbush. 1999. The age and evolution of non-LTR retrotransposable elements. Mol. Biol. Evol. 16:793-805.

    Malik H. S., T. H. Eickbush. 1998. The RTE class of non-LTR retrotransposons is widely distributed in animals and is the origin of many SINEs. Mol. Biol. Evol. 15:1123-1134.

    Malik H. S., T. H. Eickbush. 2000. NeSL-1, an ancient lineage of site-specific non-LTR retrotransposons from Caenorhabditis elegans. Genetics 154:193-203.

    Nei M., S. Kumar. 2000. Molecular evolution and phylogenetics. Oxford University Press, New York.

    Ogiwara I., M. Miya, K. Ohshima, N. Okada. 1999. Retropositional parasitism of SINEs on LINEs: identification of SINEs and LINEs in elasmobranchs. Mol. Biol. Evol. 16:1238-1250.

    Okada N., M. Hamada, I. Ogiwara, K. Ohshima. 1997. SINEs and LINEs share common 3' sequences: a review. Gene 205:229-243.

    Oliveira C., J. S. K. Chew, F. Porto-Foresti, M. J. Dobson, J. M. Wright. 1999. A LINE2 repetitive DNA sequence from the cichlid fish, Oreochromis niloticus: sequence analysis and chromosomal distribution. Chromosoma 108:457-468.

    Pearson W. R. 1990. Rapid and sensitive sequence comparison with FASTP and FASTA. Methods Enzymol. 183:63-98.

    Petrov D. A., D. L. Hartl. 1997. Trash DNA is what gets thrown away: high rate of DNA loss in Drosophila. Gene 205:279-289.

    Poulter R., M. Butler, J. Ormandy. 1999. A LINE element from the pufferfish (fugu) Fugu rubripes which shows similarity to the CR1 family of non-LTR retrotransposons. Gene 227:169-179.

    Saitou N., M. Nei. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.

    Shapiro J. A. 1999. Transposable elements as the key to a 21st century view of evolution. Genetica 107:171-179.

    Smit A. F. A. 1996. The origin of interspersed repeats in the human genome. Curr. Opin. Genet. Dev. 6:743-748.

    Smit A. F., A. D. Riggs. 1995. MIRs are classic, tRNA-derived SINEs that amplified before the mammalian radiation. Nucleic Acids Res. 23:98-102.

    Smit A. F. A., G. Toth, A. D. Riggs, J. Jurka. 1995. Ancestral, mammalian-wide subfamilies of LINE-1 repetitive sequences. J. Mol. Biol. 246:401-417.

    Terai Y., K. Takahashi, N. Okada. 1998. SINE cousins: the 3'-end tails of the two oldest and distantly related families of SINEs are descended from the 3' ends of LINEs with the same genealogical origin. Mol. Biol. Evol. 15:1460-1471.

    Thompson J. D., D. G. Higgins, T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.

    van de Peer Y., R. de Wachter. 1994. TREECON for Windows: a software package for the construction and drawing of evolutionary trees for the Microsoft Windows environment. Comput. Appl. Biosci. 10:569-570.

    Volff J. N., C. Koerting, M. Schartl. 2000. Multiple lineages of the non-LTR retrotransposon Rex1 with varying success in invading fish genomes. Mol. Biol. Evol. 17:1673-1684.

    Volff J. N., C. Koerting, K. Sweeney, M. Schartl. 1999. The non-LTR retrotransposon Rex3 from the fish Xiphophorus is widespread among teleosts. Mol. Biol. Evol. 16:1427-1438.

    Zardoya R., A. Meyer. 2000. Mitochondrial evidence on the phylogenetic position of caecilians (Amphibia:Gymnophiona). Genetics 155:765-775.

    Zupunski V., F. Gubensek, D. Kordis. 2001. Evolutionary dynamics and evolutionary history in the RTE clade of non-LTR retrotransposons. Mol. Biol. Evol. 18:1849-1863.

Accepted for publication August 13, 2001.