L1 (LINE-1) Retrotransposon Evolution and Amplification in Recent Human History

Stéphane Boissinot, Pascale Chevret, and Anthony V. Furano

Section on Genomic Structure and Function, Laboratory of Molecular and Cellular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland


    Abstract
 
L1 (LINE-1) elements constitute a large family of mammalian retrotransposons that have been replicating and evolving in mammals for more than 100 Myr and now compose 20% or more of the DNA of some mammals. Here, we investigated the evolutionary dynamics of the active human Ta L1 family and found that it arose ~4 MYA and subsequently differentiated into two major subfamilies, Ta-0 and Ta-1, each of which contains additional subsets. Ta-1, which has not heretofore been described, is younger than Ta-0 and now accounts for at least 50% of the Ta family. Although Ta-0 contains some active elements, the Ta-1 subfamily has replaced it as the replicatively dominant subfamily in humans; 69% of the loci that contain Ta-1 inserts are polymorphic for the presence or absence of the insert in human populations, as compared with 29% of the loci that contain Ta-0 inserts. This value is 90% for loci that contain Ta-1d inserts, which are the youngest subset of Ta-1 and now account for about two thirds of the Ta-1 subfamily. The successive emergence and amplification of distinct Ta L1 subfamilies shows that L1 evolution has been as active in recent human history as it has been found to be for rodent L1 families. In addition, Ta-1 elements have been accumulating in humans at about the same rate per generation as recently evolved active rodent L1 subfamilies.


    Introduction
 
L1 (LINE-1) elements (fig. 1) are mammalian long interspersed repeated DNA elements that replicate by retrotransposition (Voliva et al. 1984; Hattori et al. 1986; Fanning and Singer 1987a; Mathias et al. 1991; Furano 2000). Most L1 copies are defective and accumulate mutations at the pseudogene (neutral) rate (e.g., Voliva et al. 1983; Hardies et al. 1986; Pascale et al. 1993). They have been replicating and evolving in mammals since before the marsupial/placental divergence (~170 MYA) and account for ~14% of the mass of the human genome (Smit 1999). Thus, L1 activity has had a major and perhaps defining effect on the structure and function of modern genomes.



 
Fig. 1.—Ta L1 elements. The top diagram shows a typical full-length human L1 element. The 5' untranslated region (UTR) has a regulatory function, open reading frame I (ORF I) encodes a 40-kDa nucleic acid–binding protein, ORF II encodes a protein with endonuclease (EN) and reverse transcriptase (RT) activity, and the 3' UTR contains a conserved G-rich motif (filled rectangle) (reviewed in Furano 2000). Each horizontal line represents one of the 71 Ta elements of known size that were present in the nonredundant human database (GenEMBL) as of August 13, 1999. The filled circle indicates Ta-0 elements (see text). The first (left) vertical line represents the location of two of the nucleotide characters (T, G) that distinguish the Ta-0 and Ta-1 subfamilies, and the second vertical line indicates the location of the ACA motif in the 3' UTR that is a diagnostic feature of the Ta family (Skowronski, Fanning, and Singer 1988) (see text and fig. 2). The arrows indicate inversions of the corresponding L1 sequence. Some of the inversion-containing inserts also harbor deletions, indicated by the open rectangle. The filled rectangle represents a member of the youngest Ya5 human Alu (SINE) family (Batzer et al. 1996).

 
However, an important unresolved issue is the effect of L1 replication on present-day populations. In addition to causing polymorphisms and genetic defects (reviewed in Kazazian and Moran 1998), L1 insertion could also affect the structural (Usdin and Furano 1989; Schwartz et al. 1998; Moran, DeBerardinis, and Kazazian 1999) and regulatory properties of the genome (Wade et al. 1997; Yang et al. 1998). Although some L1 transposition takes place in modern humans, it has been suggested that L1 amplification occurs at such a low rate that the impact on the genome could be considered minor (DeBerardinis et al. 1998). In contrast, there is clear evidence in murine rodents (Old World rats and mice) for novel, rapidly expanding families of L1 elements (Cabot et al. 1997; DeBerardinis et al. 1998; Naas et al. 1998; Saxton and Martin 1998). These novel L1 families are the latest products of a vigorous evolutionary process that has exemplified rodent L1 evolution for at least the last 10 Myr (e.g., Adey et al. 1994; Casavant and Hardies 1994; Cabot et al. 1997; Saxton and Martin 1998; Verneau, Catzeflis, and Furano 1998).

A subset of human L1 elements called the Ta family (Skowronski, Fanning, and Singer 1988) is the source of 11 of 12 de novo L1 insertions identified so far (quoted in Kimberland et al. 1999), and two active elements belonging to this family have been isolated (Dombroski et al. 1991; Holmes et al. 1994). In addition, several other full-length Ta elements were shown to be active in a cell culture–based retrotransposition assay (Moran et al. 1996; Sassaman et al. 1997). However, the evolutionary dynamics of the Ta family have yet to be examined.

Here, we show that the Ta family emerged ~4 MYA, somewhat after the divergence (6 MYA; Goodman et al. 1998) of humans and chimpanzees. Since then, the Ta family has differentiated into two major subfamilies, Ta-0 and Ta-1, each of which spawned additional subsets. Ta-0 is older than Ta-1, and although Ta-0 retains some active elements, Ta-1 now accounts for about one half of the Ta family and has largely replaced Ta-0 as the replicatively dominant subfamily in humans. The youngest subset of Ta-1, Ta-1d, arose about 1.4 MYA and accounts for about two thirds of the Ta-1 subfamily. The extensive differentiation of the Ta L1 family, typified by the emergence of, and eventual replacement by, novel active subsets, recapitulates the active evolution typical of murine L1 elements (Adey et al. 1994; Casavant and Hardies 1994; Cabot et al. 1997; Saxton and Martin 1998; Verneau, Catzeflis, and Furano 1998). In addition, the Ta family has been expanding at about the same rate per generation as the most active rodent L1 families. Thus, human and murine genomes are being altered to about the same extent by L1 activity.


    Materials and Methods
 
Sequence and Phylogenetic Analysis
The BLAST (Altschul et al. 1990) search program was used to find every sequence in the GenEMBL database with a perfect match to a 20mer sequence cognate to different regions of the L1 sequence. For example, to identify all of the Ta elements in the database, we performed the search using the sequence of oligonucleotide 3 depicted in figure 4, which is cognate to the 3' untranslated region (UTR) ACA motif characteristic of the Ta family. Sequences were initially aligned using the GCG Pileup program (Wisconsin Package, version 10.0, Genetics Computer Group, Madison, Wis.) and refined manually. Phylogenetic analyses were performed using PAUP, version 4.0b2a (Swofford 1998).
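The perfect-match criterion can be illustrated with a minimal sketch. It swaps BLAST for a naive exact string search on both strands, which is practical only for small sequence sets. The probe below is a made-up 20mer, not the sequence of oligonucleotide 3, and the function names and toy records are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_perfect_matches(records: dict, probe: str) -> dict:
    """Map each record name to the 0-based start positions of exact probe matches.

    Both strands are searched; positions are given on the record's forward strand.
    Records without a perfect match are omitted.
    """
    probe = probe.upper()
    targets = {probe, reverse_complement(probe)}
    hits = {}
    for name, seq in records.items():
        seq = seq.upper()
        positions = [
            i
            for i in range(len(seq) - len(probe) + 1)
            if seq[i : i + len(probe)] in targets
        ]
        if positions:
            hits[name] = positions
    return hits


# Hypothetical 20mer and toy records (assumptions for illustration only).
PROBE = "ACGTTGCAACATGGTCAGTC"
records = {
    "toy_forward": "TTTT" + PROBE + "GGGG",
    "toy_reverse": "CC" + reverse_complement(PROBE) + "AA",
    "toy_mismatch": "TTTT" + PROBE[:-1] + "A" + "GGGG",  # last base altered
}
hits = find_perfect_matches(records, PROBE)
print(hits)
```

A single mismatch anywhere in the 20mer excludes a record, which is what lets one diagnostic motif separate one L1 family from its relatives.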



 
Fig. 4.—The Ta family is human-specific. Primer numbers correspond to the boxed sequences at the bottom. The primate phylogeny (branch lengths not to scale) is adapted from Goodman et al. (1998). In addition to humans, we analyzed four hominoids (the chimpanzee [Pan troglodytes], the gorilla [Gorilla gorilla], the orangutan [Pongo pygmaeus], and the gibbon [Hylobates lar]), one Old World monkey, the macaque (Macaca arctoides), one New World monkey, the squirrel monkey (Saimiri sciureus), and one prosimian, the lemur (Lemur catta). The sequence at the bottom shows a region of open reading frame II and the 3' untranslated region of a non-Ta element. The relationship between this sequence and the corresponding region of Ta-1 and Ta-0 elements is also shown. The unlabeled sequence is derived from a non-Ta element (accession number ac004053) that harbors the Ta-1 (T, G) character at positions 5557 and 5560 (see text and fig. 2). Symbols are as in figure 2.

 
Quantification of Ta and Ta-1 Elements
One and two micrograms of human DNA, as well as about 300, 600, 1,200, or 2,400 haploid genomic equivalents of a Ta-1 PCR product mixed with 2 µg of rat DNA, were transferred using a slot-blotter to Bio-Rad Zeta-Probe GT nylon membranes. Membranes were hybridized to an oligonucleotide probe cognate to either the Ta (oligonucleotide 3 in fig. 4) or the Ta-1 (oligonucleotide 1 in fig. 4) subfamilies, together with their respective competitor oligonucleotides, as previously described (Verneau, Catzeflis, and Furano 1997). The Ta- and Ta-1-type probes were hybridized overnight at 43°C and 45°C, respectively. Blots were washed three times in 0.3 M NaCl, 0.03 M Na citrate (pH 7.4). The amount of radioactivity hybridized to each slot was quantified by analyzing the scanned autoradiograms using NIH Image (Wayne Rasband, NIH). In control experiments, we found that oligonucleotide 1 was specific to the Ta-1 subfamily, since it did not hybridize to a cloned Ta-0 element (sequence U09116 in clone JM104-Lre2; Moran et al. 1996).

PCR
PCRs were performed in either an Idaho Technology Air-Thermo Cycler or an MJ Research PTC 100 thermocycler. In both cases, the reactions contained (as suggested by Idaho Technology) 50 mM Tris-Cl (pH 8.3), 2 mM MgCl2, 0.2 mM of each dNTP, 250 µg/ml bovine serum albumin, 2% sucrose, 0.1 mM Cresol Red (as an electrophoretic dye marker), and 0.5 µM primers. The primers used to determine the phylogenetic distribution of the Ta family are shown on figure 4 . The specificity of primers to amplify different L1 sequences was verified with clones of known sequence. Fifty to one hundred nanograms of genomic DNA was amplified in a total volume of 25 µl using the following conditions: denaturation at 94°C for 0 s (30 s in the MJ instrument), primer annealing at 40–50°C (depending on the pair of primers used) for 0 s (30 s), and chain extension at 72°C for 10 s (40 s), for 30 cycles.

We determined the polymorphism of each L1 insert using two PCRs and the primers listed in the appendix: one included the primer pair cognate to the non-L1 flanking sequence, and the second included the primer for the 3' flank and one specific for a 3' region of open reading frame (ORF) II (oligonucleotides 2 or 5; fig. 4 ). All of the human DNAs used in this study were purchased from the Coriell Institute for Medical Research. Nonhuman primate DNAs were gifts from Dr. C. Roos.
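The logic of the two-PCR assay can be sketched as a small decision table. The interpretation rule below is an assumption made for illustration: the flank/flank (F/R) reaction is scored for the band expected from an empty site, and the L1/flank (L/R) reaction is scored for the band expected from a filled site. The function name and return strings are hypothetical, and only autosomal loci are handled.

```python
def call_genotype(fr_product: bool, lr_product: bool) -> str:
    """Call the insertion genotype of one individual at one autosomal locus.

    fr_product: empty-site-sized band seen in the flank/flank (F/R) PCR.
    lr_product: band seen in the L1/flank (L/R) PCR.
    """
    if fr_product and lr_product:
        return "+/-"  # one filled allele and one empty allele
    if lr_product:
        return "+/+"  # only the filled allele was detected
    if fr_product:
        return "-/-"  # only the empty allele was detected
    return "no call"  # neither reaction gave a product; repeat the PCRs


for fr, lr in [(True, True), (False, True), (True, False), (False, False)]:
    print(f"F/R={fr!s:5} L/R={lr!s:5} -> {call_genotype(fr, lr)}")
```

Running both reactions guards against scoring a failed PCR as an absent insert, because every genotype yields at least one product.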


    Results
 
The Ta Family Consists of at Least Two Major Subfamilies: Ta-0 and Ta-1
We identified 91 Ta elements in the GenEMBL database (as of August 13, 1999; see appendix). Of these, 18 were obtained in studies designed to select L1 elements (Dombroski et al. 1991; Dombroski, Scott, and Kazazian 1993; Holmes et al. 1994; Sassaman et al. 1997), and 73 were the result of nonselective sequencing. The size distribution of all of the nonselected elements of known size (71 of 73) is shown in figure 1. Interestingly, 34% of the nonselected Ta elements are full-length. This is considerably higher than the previous estimates of 5%–10% full-length elements in primates and mice (Fanning and Singer 1987b), but it is about the proportion of full-length L1 elements in the rat genome (D'Ambrosio et al. 1986). Most (81%) of the partial elements represent simple truncations. However, the remaining 19% contain inversions, some of which contain deletions (see legend to fig. 1). These types of L1 structures have long been noted and are thought to be generated during L1 insertion (Rogers 1985; Hutchison et al. 1989; Casavant 1994).

Figure 2 shows an alignment of all 42 full-length Ta elements (24 of the nonselected plus 18 of the selected elements), along with two "pre-Ta" elements (see below) and five ancestral non-Ta elements (Smit et al. 1995). The 150 (of 6,048) positions (numbers given across the top of fig. 2) at which a character difference from the consensus is shared by three or more elements are presented. We do not show the additional 108 informative positions at which a difference is shared between just two elements, because these additional data do not change any of the conclusions and make inspection of the alignment unwieldy. We also aligned all of the non–full-length Ta elements and found no subsets other than those revealed here (results not shown). The full alignment of all informative positions of the full-length elements is available in the file Boissinot.Ta-align and can be obtained by anonymous ftp from helix.nih.gov in pub/avf.



 
Fig. 2.—Alignment of full-length L1 elements. This alignment includes all of the full-length Ta elements present in GenEMBL (i.e., both the intentionally cloned [selected] L1 elements and those resulting from nonselective [for L1] sequencing). The Ta consensus sequence is shown at the bottom, and only positions at which differences from the consensus are shared between three or more elements are shown. Identity with the consensus is indicated by a dot, and gaps are indicated by dashes. The positions are numbered across the top, and the demarcation between the 5' untranslated region (UTR), open reading frame (ORF) I, ORF II, and the 3' UTR is indicated by a vertical gray line. The ORF II "Ta-1 versus Ta-0" (T, G) and the 3' UTR ACA characters that correspond to the vertical lines in figure 1 are highlighted in the Ta consensus sequence. Several distinct subsets within either the major Ta-0 or the major Ta-1 subfamilies are outlined in gray boxes (see text). Note, however, that the bottommost pair of Ta-0 sequences (al096677 and u93573) may be the same sequence. These sequences are identical except for the T at position 1154 and share eight differences from all of the other elements shown here. Except for sequence m80340, the distinguishing T at position 1154 is found only in the set of sequences U93562–U93574, reported by Sassaman et al. (1997).

 
Since only five ancestral non-Ta elements were included in the alignment, the consensus sequence is that of the Ta elements. The Ta-defining ACA character in the 3' UTR is boxed. There has been little differentiation of the 3' UTR among the Ta elements. Elements ac00632 and ac007043 (arrows in fig. 2) might be considered "pre-Ta" elements because they contain ACg instead of the ACA character. However, these elements clearly belong in Ta, because they share numerous characters with bona fide Ta elements in other regions of the sequence. In addition, at least some of these elements retain activity, as one generated an insert in the factor VIII gene (Kazazian et al. 1988).

The sequence alignment of the 5' UTR, ORF I, and ORF II reveals obvious subdivisions within the Ta family. For example, the Ta family can be divided into two groups based on the nucleotide pair at positions (5557, 5560) of ORF II. One, which we call Ta-1, has (T, G) at these positions, and the second one, Ta-0, almost always has (G, C). Although some ancestral non-Ta elements contain (T, G), (G, C) is quite likely the ancestral character at these positions. First, this (G, C) was invariably found associated with the ancestral LPA2 and LPA3 non-Ta L1 families in an earlier survey of primate L1 sequences (Smit et al. 1995; the Ta family was designated LPA1 in this study). Second, a BLAST search of GenEMBL with a 20mer sequence cognate to this region of ORF II returned 500 L1 entries containing the (G, C) ORF-II sequence (500 is the maximal number returned by BLAST) but only 50 L1 entries containing the (T, G) sequence. Of these 50, 44 were Ta L1 elements, which we here call Ta-1. Third, as figure 3A shows, the (G, C)-containing Ta-0 subfamily is, on the whole, more divergent than Ta-1. Sequence divergence within an L1 subfamily is positively correlated with its age (Pascale et al. 1993; Adey et al. 1994; Casavant and Hardies 1994; Furano et al. 1994; Verneau, Catzeflis, and Furano 1998). Therefore, the results in figure 3A suggest that most Ta-0 elements have resided longer in the genome than have most Ta-1 elements. This would be consistent with the ancestral nature of the (G, C) character.
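The pairwise differences summarized in figure 3 can be illustrated with a minimal sketch of an uncorrected p-distance calculation over an alignment. Skipping any column that has a gap in either sequence is an assumption made here, not a documented choice of the study. The toy alignment and function names are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differences / compared


def pairwise_distances(alignment: dict) -> dict:
    """Return {(name1, name2): p-distance} for every unordered pair of sequences."""
    return {
        (n1, n2): p_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(sorted(alignment), 2)
    }


# Toy alignment for illustration only; real elements are about 6 kb long.
toy_alignment = {
    "e1": "ACGTACGTAC",
    "e2": "ACGTACGTAT",
    "e3": "ACG-ACGTTT",
}
distances = pairwise_distances(toy_alignment)
for pair, d in distances.items():
    print(pair, f"{100 * d:.1f}%")
```

A histogram of these values within one subfamily gives the kind of frequency distribution used to compare the divergence of Ta-0 and Ta-1.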



 
Fig. 3.—Divergence between members of L1 subfamilies. The timescale was calibrated using a mutation rate per lineage of 0.25%/Myr (see text). A, Frequency distribution of the pairwise differences between the full-length Ta-0 or Ta-1 elements shown in figure 2 . B, Frequency distribution of the pairwise differences between the Ta-1d or Ta-1nd elements shown in figure 2

 
Figure 2 also reveals additional subdivisions within both the Ta-1 and the Ta-0 subfamilies. A deletion at position 74 of the 5' UTR (and several other characters) divides Ta-1 into two groups, Ta-1nd (no deletion) and Ta-1d, and the latter group itself harbors a subset consisting of the first four elements (in the gray box) in the alignment. As was the case for the Ta-1 and Ta-0 subfamilies, the distinction between the Ta-1d and Ta-1nd subsets based on the alignment is also evident from the difference in their divergence; figure 3B shows that Ta-1d is considerably less divergent (younger) than Ta-1nd. In addition, the Ta-0 subfamily also contains several apparent subsets (boxed). The divergences among the members of the top two subsets are, respectively, ~0.5% and ~0.4% less than the ~1.1% divergence of the Ta-0 subfamily on the whole (results not shown and fig. 3A). As we show below, these younger Ta-1 and Ta-0 subsets are supported by phylogenetic analysis. (The pair of sequences [al096677 and u93573] that compose the bottommost Ta-0 subset differ at just one position and may actually be the same sequence; see legend to fig. 2.)

Expansion of the Ta Family in Humans
Given the correlation between sequence divergence and age, we converted the percentage of divergence of the Ta-1 and Ta-0 subfamilies into time by using a molecular clock calibrated from the ~7% average divergence between human-specific and orangutan-specific L1 subfamilies (unpublished data). Assuming that humans and orangutans diverged 14 MYA (Goodman et al. 1998), we obtained a nucleotide substitution rate per lineage of ~0.25% per Myr, which is ~15% to ~60% higher than other estimates of the hominid pseudogene rate (e.g., Casane et al. 1997; Easteal and Herbert 1997). Since this difference would not materially change any of our conclusions, we used the L1-derived rate, as it is based on structurally similar sequences.
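The clock calibration is a short piece of arithmetic that can be sketched as follows. Dividing by two lineages reproduces the stated rate of ~0.25% per Myr. The conversion of a pairwise divergence into an age assumes that two elements each accumulate changes independently after insertion, so their divergence grows at twice the per-lineage rate; that reading, like the function names, is an assumption made for illustration.

```python
def rate_per_lineage(divergence_pct: float, split_mya: float) -> float:
    """Substitution rate per lineage (% per Myr) from the divergence of two lineages."""
    return divergence_pct / (2.0 * split_mya)


def pairwise_divergence_to_age(pairwise_pct: float, rate_pct_per_myr: float) -> float:
    """Age (Myr) at which two elements diverging independently reach pairwise_pct."""
    return pairwise_pct / (2.0 * rate_pct_per_myr)


# Values quoted in the text: ~7% divergence and a 14-MYA human/orangutan split.
rate = rate_per_lineage(7.0, 14.0)
print(f"rate per lineage: {rate:.2f}% per Myr")

# Example: the ~1.1% overall divergence reported for the Ta-0 subfamily.
age = pairwise_divergence_to_age(1.1, rate)
print(f"1.1% pairwise divergence is about {age:.1f} Myr")
```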

We estimated that the Ta family emerged as early as ~4 MYA but that most of the amplification occurred in the last 3 Myr (fig. 3A). We confirmed this recent origin of the Ta family by PCR. Ta family–specific primers generated a product from humans but not from their closest relative, the chimpanzee, which diverged from humans ~6 MYA (Goodman et al. 1998) (fig. 4). In contrast, primers specific for older primate L1 families generated products in both humans and other primates (fig. 4). The data in figure 3A indicate that the Ta-1 subfamily first arose about 2.5 MYA, with most (75%) of the Ta-1 elements having been generated during the last ~1.6 Myr. In contrast, 80% of the Ta-0 elements had apparently already been inserted before ~1.6 MYA.

Slot blot analysis showed that the haploid human genome contains ~700 Ta elements (fig. 5 ). At the time of our search, ~13% of the human genome had been sequenced and this portion contained 73 Ta elements (after removal of the entries that had been purposely cloned). This would extrapolate to ~560 elements, which agrees reasonably well with the hybridization data (table 1 , cf. columns 2 and 5). Of these 73 "nonselected" Ta elements, 55 (75%) were long enough (fig. 1 ) to be classified as either Ta-1 or Ta-0. Applying this proportion to the ~700 Ta elements detected by hybridization gives a value of ~525 classifiable Ta elements (table 1 , column 3). Hybridization revealed ~300 sequences that hybridize to the Ta-1 (T, G) character (figs. 2, 4, and 5 and table 1 ). As mentioned above, ~12% of this hybridization may be due to non-Ta elements and correction for these yields ~265 Ta-1 elements per haploid genome (table 1 ). Thus, amplification of the Ta-1 subfamily accounts for at least half of the Ta family. This value is consistent with the proportion of these elements (counting only the nonselected ones) present in the GenEMBL database (cf. columns 3 and 6 in table 1 ).
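The copy-number reasoning in this paragraph chains several proportions, which the following sketch reproduces from the figures quoted in the text. Variable names are hypothetical. The small differences from the rounded values in the text (for example ~562 versus ~560) come only from rounding.

```python
# Values quoted in the text.
SEQUENCED_FRACTION = 0.13        # share of the genome sequenced at search time
TA_IN_DATABASE = 73              # nonselected Ta elements in that portion
CLASSIFIABLE_IN_DATABASE = 55    # long enough to be called Ta-0 or Ta-1
TA_BY_HYBRIDIZATION = 700        # slot-blot estimate per haploid genome
TG_SIGNAL = 300                  # sequences hybridizing to the Ta-1 (T, G) probe
TG_ENTRIES = 50                  # BLAST entries with the (T, G) character
TG_TA_ENTRIES = 44               # of those, entries that are Ta elements

# Database count scaled up to the whole genome.
extrapolated_ta = TA_IN_DATABASE / SEQUENCED_FRACTION

# Share of Ta elements that can be assigned to a subfamily, applied to the blot estimate.
classifiable_fraction = CLASSIFIABLE_IN_DATABASE / TA_IN_DATABASE
classifiable_ta = TA_BY_HYBRIDIZATION * classifiable_fraction

# Correct the Ta-1 probe signal for non-Ta elements carrying (T, G).
non_ta_fraction = (TG_ENTRIES - TG_TA_ENTRIES) / TG_ENTRIES
ta1_elements = TG_SIGNAL * (1 - non_ta_fraction)

ta1_share = ta1_elements / classifiable_ta

print(f"extrapolated Ta elements: {extrapolated_ta:.0f}")
print(f"classifiable Ta elements: {classifiable_ta:.0f}")
print(f"non-Ta share of (T, G) signal: {100 * non_ta_fraction:.0f}%")
print(f"Ta-1 elements: {ta1_elements:.0f}")
print(f"Ta-1 share of classifiable Ta: {100 * ta1_share:.0f}%")
```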



 
Fig. 5.—Hybridization of Ta- and Ta-1-specific oligonucleotides to genomic DNA. The left side of the figure shows a slot blot hybridized with the Ta-specific probe (oligonucleotide 3, fig. 4 ), and the right side shows a slot blot hybridized with the Ta-1-specific probe (oligonucleotide 1, fig. 4 ). In both experiments, the genome equivalents of the Ta-1 PCR product correspond to 2 µg of genomic DNA. These samples and the 0-copy control contained 2 µg of rat DNA. The human DNA used here was from a female Biaka pygmy (Coriell Institute repository number NA10471). Hybridization of oligonucleotide 1 to blots of restricted DNA from a Druze female, a Melanesian female, and a Chinese male showed that they contained about the same numbers of Ta-1 elements as the Biaka pygmy (data not shown)

 

 
Table 1 Copy Numbers of Ta and Ta-1 Elements

 
Although Ta-1 elements account for at least half of the Ta family, the Ta-1 subfamily had not been recognized earlier. In particular, no full-length Ta-1d elements (which account for about two thirds of Ta-1; see Discussion) were recovered in a previous search for full-length Ta elements (Sassaman et al. 1997). As figure 2 shows, Ta-1d elements contain a deletion in the 5' UTR at what would be position 74, and we noted that the oligonucleotide (oligonucleotide A) used to select full-length elements in the above study was cognate to the nondeleted version of the 5' UTR (Sassaman et al. 1997). Figure 6 shows that oligonucleotide A hybridizes very poorly to the deleted version of the 5' UTR. Thus, it is not surprising that of the 13 full-length elements isolated (Sassaman et al. 1997), only 3 were Ta-1 elements, each of which (U93566, U93572, U93568) belonged to the smaller and older Ta-1nd subset that is not deleted at position 74 (fig. 2).



 
Fig. 6.—The 5' untranslated region (UTR) of most Ta-1 elements differs from that of Ta-0 elements. We generated PCR fragments that contained the 5' UTR of two different Ta-1 insertions, one deleted at position 74 (on the right; from a Ta-1d element, AC004554) and one with the ancestral state, i.e., the absence of deletion (on the left; from a Ta-1nd element, AC004694). The PCR products were blotted and hybridized to oligonucleotide A (see Sassaman et al. 1997). The efficiency of the hybridization of oligonucleotide A to the deleted version of the 5' UTR is ~5% of the hybridization to the nondeleted version.

 
Figure 1 shows that 34% of the 71 nonselected L1 Ta elements analyzed are full-length. Extrapolating this value to the ~700 present in the entire haploid genome indicates that it contains ~240 full-length Ta elements, or about three times the 80 previously reported (Sassaman et al. 1997). Sassaman et al. (1997) estimated the number of full-length Ta elements by hybridization of a Ta-specific oligonucleotide (e.g., analogous to oligonucleotide 3 in fig. 4) to a ~6-kb L1 fragment on a blot of electrophoretically fractionated genomic DNA that had been digested with the AccI restriction endonuclease. Except for Ta-0 element ac003080, AccI sites are present at positions 41 and 5987 in every Ta and pre-Ta element shown in figure 2. Therefore, the presence of a subset of Ta elements that lacks either of the AccI sites seems not to explain the apparent underestimation of full-length Ta elements in this earlier study (Sassaman et al. 1997).
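The extrapolation of full-length elements follows directly from the counts given earlier: 24 full-length elements among the 71 nonselected elements of known size. The sketch below uses hypothetical variable names and shows how the ~240 and "about three times" figures arise.

```python
# Counts and estimates quoted in the text.
FULL_LENGTH_NONSELECTED = 24   # full-length elements among the nonselected set
NONSELECTED_KNOWN_SIZE = 71    # nonselected Ta elements of known size
TA_PER_HAPLOID_GENOME = 700    # slot-blot estimate
PREVIOUS_ESTIMATE = 80         # full-length Ta elements reported previously

full_length_fraction = FULL_LENGTH_NONSELECTED / NONSELECTED_KNOWN_SIZE
full_length_in_genome = TA_PER_HAPLOID_GENOME * full_length_fraction
fold_over_previous = full_length_in_genome / PREVIOUS_ESTIMATE

print(f"full-length fraction: {100 * full_length_fraction:.0f}%")
print(f"full-length Ta elements per haploid genome: {full_length_in_genome:.0f}")
print(f"ratio to previous estimate: {fold_over_previous:.1f}")
```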

Polymorphic Ta Inserts
Figures 2 and 3 show that the Ta family has a distinct subfamily structure and that these subfamilies are of different but somewhat overlapping divergence. If these differences actually reflect different ages (fig. 3), then we would expect that the loci which contain inserts of the less divergent (younger) subfamilies would more often be polymorphic (for the presence or absence of the L1 insert) than the loci that contain inserts of the older L1 subfamilies. For brevity, we refer to these insert-containing loci as polymorphic and fixed L1 inserts, respectively. We designed oligonucleotide primers cognate to the 5' (F) and 3' (R) flanking sequences of each of 14 nonselected Ta-0 and 25 nonselected Ta-1 elements present in GenEMBL (fig. 1). Figure 7 summarizes the results of duplicate PCRs with these oligonucleotides (one with the F/R pair, the second with R and an L1 [L] oligonucleotide) on a single individual from each of eight populations. Whereas 17 (68%) of 25 Ta-1 inserts were polymorphic, only 3 (22%) of 14 Ta-0 inserts were. These results are consistent with the difference between the divergence (age) of the Ta-1 and Ta-0 subfamilies (fig. 3A). Since we sampled only a single individual from each of the eight populations, it is possible that some "fixed" L1-containing loci may be absent in some individuals. However, this would not change the fact that any arbitrarily sampled Ta-1–containing locus is much rarer in humans than an arbitrarily chosen Ta-0–containing locus.
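How a locus is scored as fixed or polymorphic can be sketched as follows. The rule is an assumption based on the description of the assay: a locus counts as fixed only if every sampled individual is homozygous for the insert, and a locus with no insert in any sampled individual still counts as polymorphic because the insert exists in the sequenced genome. Genotype strings, function names, and the example genotypes are hypothetical; only autosomal genotypes are handled. The counts at the end are those reported in the text.

```python
def classify_locus(genotypes: list) -> str:
    """Return 'fixed' if all sampled individuals are '+/+', otherwise 'polymorphic'."""
    if not genotypes:
        raise ValueError("at least one genotype is required")
    return "fixed" if all(g == "+/+" for g in genotypes) else "polymorphic"


def polymorphic_fraction(polymorphic: int, tested: int) -> float:
    """Fraction of tested loci that are polymorphic."""
    return polymorphic / tested


# Hypothetical genotypes for eight individuals, for illustration only.
print(classify_locus(["+/+"] * 8))            # every individual carries two copies
print(classify_locus(["+/+"] * 6 + ["+/-", "-/-"]))
print(classify_locus(["-/-"] * 8))            # insert known only from the database

# Counts reported in the text.
ta1 = polymorphic_fraction(17, 25)
ta0 = polymorphic_fraction(3, 14)
print(f"Ta-1: {100 * ta1:.1f}% polymorphic; Ta-0: {100 * ta0:.1f}% polymorphic")
```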



 
Fig. 7.—Polymorphic Ta insertions. The electrophoretic patterns of a pair of PCRs for just one locus are shown. All of the Ta insertions were obtained from the database and include both full-length and partial elements. The PCR shown on the top contained primers cognate to the 5' flank (the F primer) and the 3' flank (R), and the one on the bottom contained a primer (L) cognate to either Ta-1 or Ta-0 (oligonucleotides 2 or 5, fig. 4) and primer R. A plus sign indicates the presence of the insertion, and a minus sign indicates the absence of one. In all reactions, we obtained PCR products of the sizes expected for the locations of the PCR primers. Note that a (-/-) result in all of the individuals tested here nonetheless means that the insert-containing locus is polymorphic, because the tested locus does contain an insert in the individual(s) being used to generate the human genome database. Samples 10469, 11321, 10975, and 10492 were from males; thus, the X chromosome genotype of these samples is given as (/0). The other samples were from females.

 
This correlation was even more dramatic when we mapped polymorphic Ta inserts on a phylogenetic tree of the Ta family. Figure 8 shows a single neighbor-joining tree (Saitou and Nei 1987) of all of the full-length Ta elements. Several nodes (circled) within the Ta family are supported by bootstrap values of ~70% or greater. This would correspond to an ~95% confidence level, since the extents (and presumably the rate) of character change are not very different among the members of our data set (Hillis and Bull 1993). One of these nodes groups the Ta-1d elements as a distinct subset within Ta-1, separate from the Ta-1nd elements. The Ta-1d subset is distinctly less divergent (younger) than the Ta-1nd subset (fig. 3B), and 10 of the 11 (91%) Ta-1d elements tested (here or by others, see legend to fig. 8 and appendix) were polymorphic. In contrast, only one of five tested Ta-1nd elements was polymorphic (fig. 8).



 
Fig. 8.—Phylogenetic tree of full-length L1 elements. This tree is a single neighbor-joining tree (using uncorrected p-distances) of all of the full-length elements shown in figure 2 (see Materials and Methods) and is built on the entire sequence exclusive of the 3' untranslated region (UTR), which showed little or no differentiation (see fig. 2). The number at a particular node indicates its percentage of appearance in 1,000 bootstrap replicates. Only values >50% are indicated. The tree is rooted with a non-Ta L1 element. "F" and "P" represent fixed and polymorphic elements, respectively, as determined either by us (fig. 7) or by others (see appendix). The subsets outlined in gray boxes correspond to those outlined in figure 2. The elements that descend from the node labeled "?S" may be the same (see text and the legend to fig. 2). The "pre-Ta" elements (i.e., those that contain an ACg instead of the ACA in the 3' UTR; see text and fig. 2) are indicated. The box lists those elements analyzed in earlier studies (Sassaman et al. 1997; Kimberland et al. 1999). These elements are underlined in the tree and listed in the box (by both accession number and previous designation) in the same order as they are on the tree except for the Ta-1nd element, u93572 (L1.25, flanked by "*"). This element clusters with the Ta-0 elements in the neighbor-joining tree even though it lacks several defining features of Ta-0 elements (see fig. 2). Figure 2 shows that u93572 shares a number of nucleotide characters with the Ta-0 element, u93571, with which it groups on the tree. Although some of the shared characters may be due to chance, some may represent regions of common ancestry acquired by recombination (or gene conversion), as has been demonstrated for rodent elements (Hayward, Zavanelli, and Furano 1997; Saxton and Martin 1998). Element af148856 (L1RP) is in intron 1 of the retinitis pigmentosa-2 gene, and af149422 (L1β-thal) is in intron 2 of the β-globin gene (see Kimberland et al. 1999 and references therein). The elements in bold type are active in a cell culture retrotransposition assay. The elements flanked by "<" have retrotransposition frequencies of ≤10% that of the others, and the elements in parentheses contain open reading frames interrupted by a termination codon or a frameshift (or both).

 
Figure 8 also shows that the generally older Ta-0 subfamily contains at least two well-supported subsets of elements. These subsets are young, as their members are quite similar to each other (connected by short branch lengths). This implies that these elements are recent inserts, and all four of the elements tested from these subsets were polymorphic. On the other hand, none of five Ta-0 elements outside of these subsets was polymorphic. Thus, within both the Ta-1 and the Ta-0 subfamilies, there is a consistent correlation between low divergence and polymorphism.

Figure 8 also compares our results with previous studies of Ta elements using the cell culture retrotransposition assay (Sassaman et al. 1997; Kimberland et al. 1999). Although the earlier failure to clone Ta-1d elements (see above) somewhat limits the comparison, there is nonetheless a generally excellent correlation between the grouping of an element within a young cluster (i.e., in Ta-1 or the "young" Ta-0 subsets) and whether the element is active in the retrotransposition assay. In particular, those elements that are among the most active in this assay (l19088, L1.3; l19092, L1.4; af148856, L1RP) are all Ta-1d elements (Sassaman et al. 1997; Kimberland et al. 1999).


    Discussion
 
Sequence alignment revealed that the human Ta L1 family consists of distinct subsets of elements. We defined two major groups (subfamilies) based on the presence or absence of a number of ancestral nucleotide characters. For example, all members of the Ta-0 subfamily retain the ancestral C at 5560, most retain the ancestral G's at (4920, 5557) and the ancestral T at 5413, and about one half retain the ancestral A at 2188 and the ancestral G at 2380 (fig. 2). In contrast, none (or few) of the Ta-1 elements have retained any of these characters. The presence of the Ta-1 (T, G) character at (5557, 5560) in several non-Ta elements does not invalidate its use as a hallmark of the Ta-1 subfamily. This character was found only about 12% of the time in non-Ta elements, and the ancestral (G, C) character is invariably associated with ancestral primate L1 families (Smit et al. 1995). We are now investigating whether these sites represent mutational hot spots.

Both Ta-0 and Ta-1 harbor additional subsets of elements, and although the various Ta subgroups are clearly distinguished by a number of nucleotide characters, the alignment in figure 2 shows that determining the genealogical relationship between them could be difficult. First, while some characters are exclusively (or largely) confined to a particular subset (e.g., the deletion at position 74 or the T at position 1820), others are shared between several members of two or more subsets: the C at position 155, the G at position 1645, the G at position 2380, and the T at position 5131. Furthermore, although the T at position 1820 that distinguishes Ta-1d is apparently a derived character (this T is not present in the five ancestral non-Ta elements), the diagnostic G at position 355 of Ta-1d may be an ancestral character, as it is present in some of the ancestral non-Ta elements. This inconsistent pattern of shared characters and admixture of derived and ancestral characters will confound phylogenetic methods, like maximum parsimony or maximum likelihood, that rely on the pattern of inherited characters. Additionally, all of the Ta sequences are quite similar; most Ta-0 elements are 0.7%–1.1% divergent from most Ta-1 elements. Nonetheless, the neighbor-joining method, which is a distance method and groups sequences based on their overall sequence similarity, did generate some well-supported subsets, including Ta-1d (fig. 8).
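The uncorrected p-distance on which a neighbor-joining tree of this kind is built is simply the proportion of aligned sites at which two sequences differ. A minimal Python sketch follows; the toy sequences, the element names, and the decision to skip gapped positions are illustrative assumptions, not the alignment or settings used in this study.

```python
from itertools import combinations

def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: positions with a gap ('-') in either sequence are skipped.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)

# Hypothetical toy alignment (not real L1 sequence).
ALIGNMENT = {
    "elem_A": "ACGTACGTAC",
    "elem_B": "ACGTACGTAT",
    "elem_C": "ACG-ACGAAT",
}

# Pairwise distance table, the input a distance method such as
# neighbor-joining would start from.
DISTANCES = {
    (x, y): p_distance(ALIGNMENT[x], ALIGNMENT[y])
    for x, y in combinations(ALIGNMENT, 2)
}

for (x, y), d in DISTANCES.items():
    print(f"{x} vs {y}: {d:.3f}")
```

Because the distance pools all sites, it does not depend on any single character being consistently inherited, which is why a distance method can still recover groups when individual characters conflict.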

The different subsets within the Ta family could also be distinguished by their degree of sequence divergence (fig. 3) and the extent to which their inserts are polymorphic in human populations (figs. 7 and 8). The excellent correlation between the low sequence divergence of a particular Ta subset and the high degree of polymorphism of its members again validates the use of sequence divergence within an L1 subfamily as a measure of its age (Pascale et al. 1993; Adey et al. 1994; Casavant and Hardies 1994; Furano et al. 1994; Verneau, Catzeflis, and Furano 1998). Thus, the differentiation within the Ta family was the result of the successive amplification of different L1 subfamilies over the last ~4 Myr since it first arose.

The distributions of pairwise divergence between members of the Ta-1 and Ta-0 subfamilies (fig. 3A) suggest that the amplification of Ta-1, and particularly Ta-1d (fig. 3B), generally coincided with a decline of transpositional activity of Ta-0 elements, despite the fact that Ta-0 still retains some active subsets (fig. 8). This apparent replacement of a preexisting active subfamily by a more recent one recapitulates the mode of L1 evolution in rats (Cabot et al. 1997; Hayward, Zavanelli, and Furano 1997) and mice (Adey et al. 1994; Casavant and Hardies 1994; Saxton and Martin 1998). The distributions of pairwise divergence shown in figure 3 closely resemble those expected when L1 families are the product of a single lineage of replication-competent elements (the "master model") (Clough et al. 1996). This model, wherein novel L1 subfamilies belonging to a single dominant lineage are generated successively in time, describes L1 evolution in most of the studied mammalian taxa. Although distinct active L1 subsets may coexist for short periods, a single lineage usually prevails (Rikke, Garvin, and Hardies 1991; Pascale et al. 1993; Adey et al. 1994; Casavant and Hardies 1994; Furano et al. 1994).

The differentiation of a single 3' UTR lineage into distinct subfamilies that have waxed and waned over the past ~4 Myr describes the evolutionary dynamics of the human Ta family. This has also been observed for a recent rat L1 family (Cabot et al. 1997). Figure 2 shows that the Ta 3' UTR sequence has hardly changed since it first arose in hominids ~4 MYA. In contrast, significant variation has taken place in all other regions of the Ta family. Selective constraint on the 3' UTR sequence or adaptive changes in the non-3' UTR sequence (or both) could account for this difference. An essential role for the 3' UTR sequence in L1 replication, as has been demonstrated for L1-like elements in insects (Luan and Eickbush 1995; Mathews et al. 1997), could account both for the conservation of the Ta 3' UTR sequence shown here and for the persistence of certain sequence motifs in the 3' UTR throughout mammalian L1 evolution (Howell and Usdin 1997). On the other hand, adaptive changes in the non-3' UTR region could have enabled the element to either evade host repression or gain replicative superiority over existing elements. Whatever the case, competition for replicative dominance could explain the successive replacement of existing L1 subfamilies by novel ones (e.g., fig. 3) and the expansion of one L1 subfamily at the expense of another (Casavant and Hardies 1994; Cabot et al. 1997).

In addition to mimicking the evolutionary dynamics of murine L1 elements, Ta has also been accumulating at about the same rate per generation as its murine counterparts. The average (haploid) accumulation rate of the Ta family since it began amplifying in earnest ~3.5 MYA (fig. 3A) is ~0.2 elements per 1,000 years (700 ÷ 3,500; table 1). By comparison, the most recent active L1 subfamilies in murine rodents have accumulated at ~12 elements per 1,000 years for the L1Rnmlvi2-rn subfamily in Rattus norvegicus (~5,600 elements per 450,000 years; Cabot et al. 1997) and ~5 elements per 1,000 years for the Tf subfamily in Mus musculus (~1,825 elements per 325,000 years, the average of three published determinations; DeBerardinis et al. 1998; Naas et al. 1998; Saxton and Martin 1998). Because the impact of L1 amplification would be related to the number of elements accumulating per generation, we normalized the accumulation rates using generation times of ~20 years for humans and ~0.5 years for rodents. The normalized accumulation rates are ~0.006 elements per generation in rats and ~0.0025 elements per generation in mice, as compared with ~0.004 elements per generation in humans. These accumulation rates reflect the rate of both L1 transposition and L1 elimination by either selection (in the case of deleterious or lethal insertions) or genetic drift. Although these factors could be very different between these species, our results suggest that L1 transpositional activity might play equally important roles in the genetic diversity and evolution of both human and rodent genomes.
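The normalization above is simple arithmetic and can be reproduced in a few lines of Python. The element counts, time spans, and generation times are those quoted in the text; the dictionary layout and function name are assumptions made for this sketch. The unrounded mouse values come out slightly higher than the text's figures because the text rounds the per-1,000-year rate to ~5 before normalizing.

```python
# Element counts, time spans, and generation times as quoted in the text.
SUBFAMILIES = {
    "human Ta": {"elements": 700, "years": 3_500_000, "generation_years": 20},
    "rat L1Rnmlvi2-rn": {"elements": 5_600, "years": 450_000, "generation_years": 0.5},
    "mouse Tf": {"elements": 1_825, "years": 325_000, "generation_years": 0.5},
}

def accumulation_rates(elements, years, generation_years):
    """Return (elements per 1,000 years, elements per generation)."""
    per_year = elements / years
    return per_year * 1_000, per_year * generation_years

RATES = {name: accumulation_rates(**values) for name, values in SUBFAMILIES.items()}

for name, (per_kyr, per_generation) in RATES.items():
    print(f"{name}: {per_kyr:.2f} per 1,000 years, "
          f"{per_generation:.4f} per generation")
```

The three per-generation rates fall within a factor of about two of one another, even though the per-1,000-year rates differ by more than an order of magnitude.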

About 90% of the Ta-1d and a smaller percentage of Ta-1nd and Ta-0 insertions are polymorphic and could be useful for gene mapping or population genetics studies. Polymorphic inserts due to Ta-1d alone may well number several hundred in world human populations. L1 insertions have several advantages over commonly used polymorphic markers such as microsatellites and single-nucleotide polymorphisms. Parallelism (i.e., independent retrotransposition into the same chromosomal location) is unlikely, and the ancestral state of the polymorphism is known (i.e., the absence of the insert). Therefore, Ta insertions, like Alu insertions, could be used as robust markers to root population trees (e.g., Batzer et al. 1994; Novick et al. 1998) or allele phylogenies (e.g., Hammer 1994).


    Conclusions
 
Evolutionary analysis of the Ta L1 family has shown that the tempo and mode of L1 evolution in recent human history recapitulate the process in murine rodents and that L1 activity may have affected the genomes of these taxa similarly. Furthermore, having identified the most recently active human L1 subfamily(s), we may now be able to identify the factors responsible for its replicative success. To this end, we recently isolated every member of the Ta-1 subfamily and can compare such parameters as the genetic environment and the regulatory and enzymatic properties of various members of the Ta-1 subsets. Finally, by identifying the entire insertional history of the Ta-1 subfamily in several human populations, we may be better able to estimate the effect of this most current wave of L1 replication on present-day humans.



    Acknowledgements
 
We thank C. Roos for the primate DNA samples and Dr. John Moran for clone JM104-Lre2.


    Footnotes
 
Thomas Eickbush, Reviewing Editor

1 Keywords: L1/LINE-1, human evolution, polymorphism, retrotransposon

1 Present address: Institut des Sciences de l'Evolution, Case courrier 064, Université Montpellier II, Montpellier, France.

3 Address for correspondence and reprints: Anthony V. Furano, National Institutes of Health, Building 8, Room 203, 8 Center Drive MSC 0830, Bethesda, Maryland 20892-0830. E-mail: avf{at}helix.nih.gov


    literature cited
 

    Adey, N. B., S. A. Schichman, D. K. Graham, S. N. Peterson, M. H. Edgell, and C. A. Hutchison III. 1994. Rodent L1 evolution has been driven by a single dominant lineage that has repeatedly acquired new transcriptional regulatory sequences. Mol. Biol. Evol. 11:778–789.

    Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403–410.

    Batzer, M. A., P. L. Deininger, U. Hellmann-Blumberg, J. Jurka, D. Labuda, C. M. Rubin, C. W. Schmid, E. Zietkiewicz, and E. Zuckerkandl. 1996. Standardized nomenclature for Alu repeats. J. Mol. Evol. 42:3–6.

    Batzer, M. A., M. Stoneking, M. Alegria-Hartman et al. (11 co-authors). 1994. African origin of human-specific polymorphic Alu insertions. Proc. Natl. Acad. Sci. USA 91:12288–12292.

    Cabot, E. L., B. Angeletti, K. Usdin, and A. V. Furano. 1997. Rapid evolution of a young L1 (LINE-1) clade in recently speciated Rattus taxa. J. Mol. Evol. 45:412–423.

    Casane, D., S. Boissinot, B. H. Chang, L. C. Shimmin, and W. Li. 1997. Mutation pattern variation among regions of the primate genome. J. Mol. Evol. 45:216–226.

    Casavant, N. C. 1994. Dynamics of LINE-1 amplification in the mouse. Ph.D. thesis, University of Texas Health Science Center, San Antonio.

    Casavant, N. C., and S. C. Hardies. 1994. The dynamics of murine LINE-1 subfamily amplification. J. Mol. Biol. 241:390–397.

    Clough, J. E., J. A. Foster, M. Barnett, and H. A. Wichman. 1996. Computer simulation of transposable element evolution: random template and strict master models. J. Mol. Evol. 42:52–58.

    D'Ambrosio, E., S. D. Waitzkin, F. R. Witney, A. Salemme, and A. V. Furano. 1986. Structure of the highly repeated, long interspersed DNA family (LINE or L1Rn) of the rat. Mol. Cell. Biol. 6:411–424.

    DeBerardinis, R. J., J. L. Goodier, E. M. Ostertag, and H. H. Kazazian Jr. 1998. Rapid amplification of a retrotransposon subfamily is evolving the mouse genome. Nat. Genet. 20:288–290.

    Dombroski, B. A., S. L. Mathias, E. Nanthakumar, A. F. Scott, and H. H. Kazazian Jr. 1991. Isolation of an active human transposable element. Science 254:1805–1808.

    Dombroski, B. A., A. F. Scott, and H. H. Kazazian Jr. 1993. Two additional potential retrotransposons isolated from a human L1 subfamily that contains an active retrotransposable element. Proc. Natl. Acad. Sci. USA 90:6513–6517.

    Easteal, S., and G. Herbert. 1997. Molecular evidence from the nuclear genome for the time frame of human evolution. J. Mol. Evol. 44:S121–S132.

    Fanning, T., and M. Singer. 1987a. The LINE-1 DNA sequences in four mammalian orders predict proteins that conserve homologies to retrovirus proteins. Nucleic Acids Res. 15:2251–2260.

    ———. 1987b. LINE-1: a mammalian transposable element. Biochim. Biophys. Acta 910:203–212.

    Furano, A. V. 2000. The biological properties and evolutionary dynamics of mammalian LINE-1 retrotransposons. Prog. Nucleic Acids Res. Mol. Biol. 64:255–294.

    Furano, A. V., B. E. Hayward, P. Chevret, F. Catzeflis, and K. Usdin. 1994. Amplification of the ancient murine Lx family of long interspersed repeated DNA occurred during the murine radiation. J. Mol. Evol. 38:18–27.

    Goodman, M., C. A. Porter, J. Czelusniak, S. L. Page, H. Schneider, J. Shoshani, G. Gunnell, and C. P. Groves. 1998. Toward a phylogenetic classification of primates based on DNA evidence complemented by fossil evidence. Mol. Phylogenet. Evol. 9:585–598.

    Hammer, M. F. 1994. A recent insertion of an Alu element on the Y chromosome is a useful marker for human population studies. Mol. Biol. Evol. 11:749–761.

    Hardies, S. C., S. L. Martin, C. F. Voliva, C. A. Hutchison III, and M. H. Edgell. 1986. An analysis of replacement and synonymous changes in the rodent L1 repeat family. Mol. Biol. Evol. 3:109–125.

    Hattori, M., S. Kuhara, O. Takenaka, and Y. Sakaki. 1986. L1 family of repetitive DNA sequences in primates may be derived from a sequence encoding a reverse transcriptase-related protein. Nature 321:625–628.

    Hayward, B. E., M. Zavanelli, and A. V. Furano. 1997. Recombination creates novel L1 (LINE-1) elements in Rattus norvegicus. Genetics 146:641–654.

    Hillis, D. M., and J. J. Bull. 1993. An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Syst. Biol. 42:182–192.

    Holmes, S. E., B. A. Dombroski, C. M. Krebs, C. D. Boehm, and H. H. Kazazian Jr. 1994. A new retrotransposable human L1 element from the LRE2 locus on chromosome 1q produces a chimaeric insertion. Nat. Genet. 7:143–148.

    Howell, R., and K. Usdin. 1997. The ability to form intrastrand tetraplexes is an evolutionarily conserved feature of the 3' end of L1 retrotransposons. Mol. Biol. Evol. 14:144–155.

    Hutchison, C. A. III, S. C. Hardies, D. D. Loeb, W. R. Shehee, and M. H. Edgell. 1989. LINEs and related retroposons: long interspersed repeated sequences in the eucaryotic genome. Pp. 593–617 in D. E. Berg and M. M. Howe, eds. Mobile DNA. American Society for Microbiology, Washington, D.C.

    Kazazian, H. H. Jr., and J. V. Moran. 1998. The impact of L1 retrotransposons on the human genome. Nat. Genet. 19:19–24.

    Kazazian, H. H. Jr., C. Wong, H. Youssoufian, A. F. Scott, D. G. Phillips, and S. E. Antonarakis. 1988. Haemophilia A resulting from de novo insertion of L1 sequences represents a novel mechanism for mutation in man. Nature 332:164–166.

    Kimberland, M. L., V. Divoky, J. Prchal, U. Schwahn, W. Berger, and H. H. Kazazian Jr. 1999. Full-length human L1 insertions retain the capacity for high frequency retrotransposition in cultured cells. Hum. Mol. Genet. 8:1557–1560.

    Luan, D. D., and T. H. Eickbush. 1995. RNA template requirements for target DNA-primed reverse transcription by the R2 retrotransposable element. Mol. Cell. Biol. 15:3882–3891.

    Mathews, D. H., A. R. Banerjee, D. D. Luan, T. H. Eickbush, and D. H. Turner. 1997. Secondary structure model of the RNA recognized by the reverse transcriptase from the R2 retrotransposable element. RNA 3:1–16.

    Mathias, S. L., A. F. Scott, H. H. Kazazian Jr., J. D. Boeke, and A. Gabriel. 1991. Reverse transcriptase encoded by a human transposable element. Science 254:1808–1810.

    Moran, J. V., R. J. DeBerardinis, and H. H. Kazazian Jr. 1999. Exon shuffling by L1 retrotransposition. Science 283:1530–1534.

    Moran, J. V., S. E. Holmes, T. P. Naas, R. J. DeBerardinis, J. D. Boeke, and H. H. Kazazian Jr. 1996. High frequency retrotransposition in cultured mammalian cells. Cell 87:917–927.

    Naas, T. P., R. J. DeBerardinis, J. V. Moran, E. M. Ostertag, S. F. Kingsmore, M. F. Seldin, Y. Hayashizaki, S. L. Martin, and H. H. Kazazian Jr. 1998. An actively retrotransposing, novel subfamily of mouse L1 elements. EMBO J. 17:590–597.

    Novick, G. E., C. C. Novick, J. Yunis et al. (11 co-authors). 1998. Polymorphic Alu insertions and the Asian origin of Native American populations. Hum. Biol. 70:23–39.

    Pascale, E., C. Liu, E. Valle, K. Usdin, and A. V. Furano. 1993. The evolution of long interspersed repeated DNA (L1, LINE 1) as revealed by the analysis of an ancient rodent L1 DNA family. J. Mol. Evol. 36:9–20.

    Rikke, B. A., L. D. Garvin, and S. C. Hardies. 1991. Systematic identification of LINE-1 repetitive DNA sequence differences having species specificity between Mus spretus and Mus domesticus. J. Mol. Biol. 219:635–643.

    Rogers, J. H. 1985. The origin and evolution of retroposons. Int. Rev. Cytol. 93:187–279.

    Saitou, N., and M. Nei. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406–425.

    Sassaman, D. M., B. A. Dombroski, J. V. Moran, M. L. Kimberland, T. P. Naas, R. J. DeBerardinis, A. Gabriel, G. D. Swergold, and H. H. Kazazian Jr. 1997. Many human L1 elements are capable of retrotransposition. Nat. Genet. 16:37–43.

    Saxton, J. A., and S. L. Martin. 1998. Recombination between subtypes creates a mosaic lineage of LINE-1 that is expressed and actively retrotransposing in the mouse genome. J. Mol. Biol. 280:611–622.

    Schwartz, A., D. C. Chan, L. G. Brown, R. Alagappan, D. Pettay, C. Disteche, B. McGillivray, A. de la Chapelle, and D. C. Page. 1998. Reconstructing hominid Y evolution: X-homologous block, created by X-Y transposition, was disrupted by Yp inversion through LINE-LINE recombination. Hum. Mol. Genet. 7:1–11.

    Skowronski, J., T. G. Fanning, and M. F. Singer. 1988. Unit-length LINE-1 transcripts in human teratocarcinoma cells. Mol. Cell. Biol. 8:1385–1397.

    Smit, A. F. A. 1999. Interspersed repeats and other mementos of transposable elements in mammalian genomes. Curr. Opin. Genet. Dev. 9:657–663.

    Smit, A. F. A., G. Tóth, A. D. Riggs, and J. Jurka. 1995. Ancestral, mammalian-wide subfamilies of LINE-1 repetitive sequences. J. Mol. Biol. 246:401–417.

    Swofford, D. L. 1998. PAUP*. Phylogenetic analysis using parsimony (*and other methods). Version 4. Sinauer, Sunderland, Mass.

    Usdin, K., and A. V. Furano. 1989. Insertion of L1 elements into sites that can form non-B DNA. Interactions of non-B DNA-forming sequences. J. Biol. Chem. 264:20736–20743.

    Verneau, O., F. Catzeflis, and A. V. Furano. 1997. Determination of the evolutionary relationships in Rattus sensu lato (Rodentia: Muridae) using L1 (LINE-1) amplification events. J. Mol. Evol. 45:424–436.

    ———. 1998. Determining and dating recent rodent speciation events by using L1 (LINE-1) retrotransposons. Proc. Natl. Acad. Sci. USA 95:11284–11289.

    Voliva, C. F., C. L. Jahn, M. B. Comer, C. A. Hutchison III, and M. H. Edgell. 1983. The L1Md long interspersed repeat family in the mouse: almost all examples are truncated at one end. Nucleic Acids Res. 11:8847–8859.

    Voliva, C. F., S. L. Martin, C. A. Hutchison III, and M. H. Edgell. 1984. Dispersal process associated with the L1 family of interspersed repetitive DNA sequences. J. Mol. Biol. 178:795–813.

    Wade, D. P., L. H. Puckey, B. L. Knight, F. Acquati, A. Mihalich, and R. Taramelli. 1997. Characterization of multiple enhancer regions upstream of the apolipoprotein(a) gene. J. Biol. Chem. 272:30387–30399 [published erratum appears in J. Biol. Chem. 273:3798].

    Yang, Z., D. Boffelli, N. Boonmark, K. Schwartz, and R. Lawn. 1998. Apolipoprotein(a) gene enhancer resides within a LINE element. J. Biol. Chem. 273:891–897.

Accepted for publication February 22, 2000.