Characterization of the Intragenomic Spread of the Human Endogenous Retrovirus Family HERV-W

Javier Costas

Departamento de Bioloxía Fundamental, Universidade de Santiago de Compostela, Spain


    Abstract
 
This study examines the intragenomic spread of the human endogenous retrovirus family HERV-W from insertions present within the draft sequence of the human genome. Identification of shared diagnostic differences and phylogenetic analyses revealed the existence of three main subfamilies. The average divergence between sequences for each of the subfamilies suggests that most of the HERV-W elements were inserted within the genome during a short period of evolutionary time. Each of the subfamilies consists of two types of insertions: the expected proviral sequences and other sequences resembling the structure of processed retrogenes. These HERV-W retrosequences extend from the R region of the 5' long-terminal repeat (LTR) to the R region of the 3' LTR (as viral genomic RNAs do), end in 3' poly(A) tails, and are flanked by direct repeats longer than those flanking proviral integrations. Furthermore, several of the HERV-W retrosequences are 5'-truncated at different sites. I suggest the involvement of the L1 machinery in these integrations and discuss the characteristic features of the evolutionary history of HERV-W, with emphasis on the putative impact of HERV-W retrosequence integrations on the mammalian genome.


    Introduction
 
Approximately 8% of the human genome is derived from retrovirus-like elements (International Human Genome Sequencing Consortium 2001). Most of them are human endogenous retroviruses (HERVs) that originated from germ-line infection by their exogenous counterparts in the remote past of primate evolution. Presumably, subsequent retrotransposition (although reinfection cannot be formally ruled out) led to an increase in the copy number of each HERV family (Löwer, Löwer, and Kurth 1996). The structure of an integrated retrovirus (provirus) consists of two long-terminal repeats (LTRs) flanking a central coding region. The LTR sequence can be divided into three regions from the 5' to the 3' end, called U3, R, and U5 (fig. 1A). Transcription of genomic RNA begins at the R region of the 5' LTR and ends at the R region of the 3' LTR. Thus, the U5 region is unique to the 5' end of the genomic RNA, and the U3 region is unique to the 3' end (fig. 1B). The structure of the LTRs is restored during reverse transcription of genomic RNA. The U3 region contains the promoter and a series of regulatory sequences, which may influence the expression of the neighboring cellular genes (reviewed in Brosius 1999).



Fig. 1.—Diagrammatic representation of several HERV-W structures. Length of the different regions is not to scale. Structural features shown: black box, U3 region; white box, R region; grey box, U5 region; box with angled bars, internal region; grey arrow, short direct repeat (DR) of 4 bp; black arrow, short direct repeat of 10–16 bp; AAA, poly(A) tail; dashed line, chromosomal DNA.

 
There are at least 22 independently acquired HERV families within the human genome (Tristem 2000). Very little is known about the evolutionary history of most of these families, with a few exceptions, such as HERV-K (Medstrand and Mager 1998; Lebedev et al. 2000), ERV9 (Costas and Naveira 2000), ERV-L (Benit et al. 1999), and HERV-H (Goodchild, Wilkinson, and Mager 1993; Anderssen et al. 1997). HERV-W has been one of the most extensively studied HERVs during the last few years, since the isolation of an HERV-W–related retrovirus (named multiple sclerosis-associated retrovirus [MSRV]) from retroviral particles produced by cell cultures from patients with multiple sclerosis (Perron et al. 1997; Blond et al. 1999; Komurian-Pradel et al. 1999). Recently, its transcriptional activation in the brain has also been related to schizophrenia (Karlsson et al. 2001). HERV-W proviruses probably entered the genome of primates before the split between Old World and New World monkeys (Kim, Takenaka, and Crow 1999). The human genome contains at least 70, 100, and 30 HERV-W–related gag, pro, and env regions, respectively (Voisset et al. 2000), although apparently none of the elements of the family is competent for replication (Blond et al. 1999). Interestingly, one HERV-W provirus may have been recruited by its host to serve an important physiological function. The envelope gene of this proviral insertion codes for the syncytin protein, which mediates placental cytotrophoblast fusion in vivo (Blond et al. 2000; Mi et al. 2000; Stoye and Coffin 2000).

In the present work I used a comparative sequence approach to reconstruct the evolutionary history of HERV-W, using data from the draft sequence of the human genome. This analysis revealed several unexpected features of HERV-W evolution.


    Materials and Methods
 
HERV-W homologous sequences within the human genome were identified using BLAST (Altschul et al. 1990) at the specialized human genome BLAST page of the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/BLAST). The proviral insertion coding for syncytin (genomic sequence NT_017168.3, positions 6814289–6824510) was used as the query sequence for the initial searches. Further searches were done using as a query the region from the splice acceptor site 3 (AS3) of HERV-W, located 240 bp upstream of the 3' LTR (Blond et al. 1999), to the 3' end of the R region, with the addition of a 25-bp poly(A) tail. ClustalX (Thompson et al. 1997) was used for sequence alignments, which were later refined by visual inspection with GeneDoc (Nicholas and Nicholas 1997). Duplicated entries were excluded from the alignment before the other analyses. The open reading frame (ORF) finder at NCBI was used to detect long ORFs in the collected HERV-W proviruses.
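The ORF search mentioned above can be illustrated with a minimal sketch. It is not the NCBI ORF finder: it assumes the standard genetic code, requires an ATG start codon, and scans all six reading frames. The function names and the 1,000-bp default threshold (chosen to echo the "longer than 1,000 bp" criterion used in the Results) are illustrative assumptions.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orfs(seq: str, min_len: int = 1000) -> List[Tuple[str, int, int, int]]:
    """Return (strand, frame, start, length) for ATG-to-stop ORFs of at
    least min_len bp, longest first. Coordinates are 0-based on the
    scanned strand; the length includes the stop codon."""
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # open an ORF at the first in-frame ATG
                elif start is not None and codon in STOP_CODONS:
                    length = i + 3 - start
                    if length >= min_len:
                        hits.append((strand, frame, start, length))
                    start = None  # close the ORF and keep scanning
    return sorted(hits, key=lambda h: -h[3])


# Hypothetical toy sequence: one short ORF on the forward strand.
toy = "CC" + "ATG" + "AAA" * 4 + "TAA" + "GG"
print(longest_orfs(toy, min_len=10))
```

Because the sketch insists on an ATG start, it would miss partial ORFs that begin inside a truncated gene; a stop-to-stop scan is the usual alternative in that case.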

HERV-W subfamilies were established by grouping sequences into different sets according to the most variable sites. CpG dinucleotide positions (defined using the same criterion as in Costas and Naveira 2000) were excluded, because the fast mutation rate of these dinucleotides to TpG or CpA makes them poor discriminators between subfamilies. Subfamily status was conferred on a sequence set if it comprised at least five elements sharing at least two diagnostic nucleotide differences. Subfamily consensus sequences were obtained by choosing the most frequent nucleotide at each position, with one exception: those positions considered as CpG in the general alignment were also considered as CpG in the subfamily consensus sequences.
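The consensus and CpG-exclusion steps can be sketched as follows. The CpG criterion of Costas and Naveira (2000) is not restated in this paper, so the sketch substitutes a simple assumption: a column pair is treated as CpG when the majority-rule consensus reads CG there. The function names, the `min_minor` threshold, and the toy alignment are all hypothetical.

```python
from collections import Counter
from typing import List, Set


def majority_consensus(alignment: List[str]) -> str:
    """Most frequent character in each alignment column (gaps count as a
    character; ties go to the character seen first in the column)."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*alignment))


def cpg_columns(consensus: str) -> Set[int]:
    """Columns that form a CG dinucleotide in the consensus
    (assumed criterion, not necessarily the one used in the paper)."""
    cols: Set[int] = set()
    for i in range(len(consensus) - 1):
        if consensus[i] == "C" and consensus[i + 1] == "G":
            cols.update((i, i + 1))
    return cols


def variable_sites(alignment: List[str], skip: Set[int], min_minor: int = 2) -> List[int]:
    """Columns outside `skip` where the second most common nucleotide
    occurs in at least `min_minor` sequences."""
    sites = []
    for i, col in enumerate(zip(*alignment)):
        if i in skip:
            continue
        counts = Counter(c for c in col if c in "ACGT").most_common(2)
        if len(counts) == 2 and counts[1][1] >= min_minor:
            sites.append(i)
    return sites


# Hypothetical toy alignment of five sequences.
aln = ["ACGTAT", "ATGTAT", "ACGTCT", "ACGTCT", "ACATAT"]
cons = majority_consensus(aln)
masked = cpg_columns(cons)
print(cons, sorted(masked), variable_sites(aln, masked))
```

In the toy alignment, columns 1 and 2 vary only through C-to-T and G-to-A changes at the consensus CG, so they are masked, and column 4 is the only remaining candidate diagnostic site.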

MEGA v2.1 (Kumar et al. 2001) was used to calculate divergence values within each set of sequences and between different sets of sequences. Net divergence values between different sets of sequences (dN) were calculated by the following formula:

$$d_N = d_{XY} - \frac{d_X + d_Y}{2}$$

where dXY is the average distance between groups X and Y, and dX and dY are the mean within-group distances. This program was also used to reconstruct phylogenetic relationships by the neighbor-joining method (Saitou and Nei 1987) and to calculate bootstrap values for each internal branch (1,000 replicates). In all cases, Kimura's two-parameter model was applied to correct for multiple substitutions. The average age of amplification for each subfamily (T) was calculated using the formula:

$$T = \frac{K}{2r}$$

where T is the time of divergence, K the average pairwise divergence between sequences from the same subfamily, and r the substitution rate of pseudogene sequences in primates.
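The distance calculations described above can be sketched in a few lines. This is not MEGA's implementation: the Kimura two-parameter distance uses the standard closed form with pairwise deletion of gaps, and the net divergence and age functions assume the usual expressions d_N = d_XY - (d_X + d_Y)/2 and T = K/(2r). The numerical inputs in the example are hypothetical and are not taken from the paper's tables.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(a: str, b: str) -> float:
    """Kimura two-parameter distance between two aligned sequences.
    Sites with a gap or ambiguity code in either sequence are skipped."""
    transitions = transversions = compared = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x == y:
            continue
        if (x in PURINES) == (y in PURINES):
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def net_divergence(d_xy: float, d_x: float, d_y: float) -> float:
    """Between-group distance corrected for mean within-group distance."""
    return d_xy - (d_x + d_y) / 2


def amplification_age(k: float, r: float) -> float:
    """Age in Myr from mean pairwise divergence k (as a fraction) and
    substitution rate r (substitutions per site per Myr), assuming T = K/(2r)."""
    return k / (2 * r)


# Hypothetical inputs: 7% mean pairwise divergence, r = 0.2% per Myr.
print(amplification_age(0.07, 0.002))
print(net_divergence(0.10, 0.06, 0.07))
```

The factor of 2 in the age formula reflects that two copies derived from the same source accumulate substitutions independently, so their pairwise divergence grows at twice the per-lineage rate.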


    Results
 
A BLAST search for HERV-W homologous sequences within the human genome was carried out in April 2001, using the syncytin genomic sequence as a query. This search revealed an unexpected result. Several of the identified elements begin at the R region of the 5' LTR and end at the R region of the 3' LTR and, in addition, carry a 3' poly(A) tail (fig. 1C). Some of these unusual elements are truncated at their 5' ends at different positions (fig. 1D). Furthermore, although HERV-W proviruses are flanked by direct repeats of 4 bp, these elements resembling the genomic RNA structure are flanked by longer repeats, typically from 10 to 16 bp. A visual inspection of the flanking regions indicates that the 5'-TT/AAAA sequence and its variants derived by a single base substitution, representing an L1-endonuclease consensus cleavage site (Jurka 1997; Toda, Saito, and Tomita 2000), are frequent at the preintegration site of the unusual elements (data not shown). Nevertheless, no further analysis was done, because the relatively old age of the insertions (see later) makes these preintegration sites difficult to infer. All these facts are characteristic of retroposed sequences (see Discussion). Because of this, I shall refer to these sequences as HERV-W retrosequences, in contrast with the normal HERV-W proviruses.
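The two structural hallmarks described above, a 3' poly(A) tail and flanking direct repeats, can be checked with a short sketch. The paper reports a visual inspection, not an automated procedure, so the functions below are purely illustrative: they require exact matches, whereas repeats flanking old insertions are expected to have accumulated substitutions. All names and the toy sequences are hypothetical.

```python
def poly_a_length(seq: str) -> int:
    """Length of the uninterrupted run of A at the 3' end of seq."""
    seq = seq.upper()
    return len(seq) - len(seq.rstrip("A"))


def flanking_direct_repeat(upstream: str, downstream: str, max_len: int = 20) -> str:
    """Longest sequence (up to max_len bp) that both ends the upstream
    flank and begins the downstream flank of an insertion. Exact matches
    only; returns an empty string if none is found."""
    upstream, downstream = upstream.upper(), downstream.upper()
    for k in range(min(max_len, len(upstream), len(downstream)), 0, -1):
        if upstream[-k:] == downstream[:k]:
            return downstream[:k]
    return ""


# Hypothetical example: a 12-bp direct repeat flanking an insertion
# whose 3' end carries a poly(A) tail.
repeat = "AAGTCATTCAGT"
upstream_flank = "GGCC" + repeat
downstream_flank = repeat + "TTGG"
insert_3prime_end = "CTCGTTCACATAAAAAAAAAAAAA"

found = flanking_direct_repeat(upstream_flank, downstream_flank)
print(len(found), poly_a_length(insert_3prime_end))
```

A tolerant version would allow a small number of mismatches per repeat, at the cost of more spurious short matches.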

An additional BLAST search was done using as a query the region from the AS3 of HERV-W, located 240 bp upstream of the 3' LTR (Blond et al. 1999), to the 3' end of the R region, followed by a poly(A) tail. By this strategy, I collected novel 5'-truncated HERV-W retrosequences not recovered in the previous search because of their shorter length. A total of 140 sequences, representing 39 HERV-W proviruses, 40 full-length HERV-W retrosequences, and 61 truncated HERV-W retrosequences, were collected (table 1). Furthermore, this search also revealed the existence of solitary R regions with a poly(A) tail flanked by short direct repeats (fig. 1E), showing that inter-R recombination efficiently removes full-length HERV-W retrosequences from the genome, in a way similar to that giving rise to solitary LTRs from full-length proviruses (Mager and Goodchild 1989). Besides the known env ORF coding for syncytin (Blond et al. 2000; Mi et al. 2000), there are two other HERV-W proviruses preserving ORFs longer than 1,000 bp. One of them, included within the genomic clone NT_022833, extends from amino acids 64 to 524 of syncytin, sharing 87.6% homology with it. The other (NT_006307) is 1,638 bp long, corresponding to the main portion of the pol gene, from the conserved domain 3 of the reverse transcriptase (according to Xiong and Eickbush 1988) to the end of the RNaseH region.


Table 1 Collected HERV-W Sequences

 
Alignment of the 140 sequence fragments led to their classification into three main subfamilies on the basis of consistent correlated nucleotide differences between them (fig. 2 and table 2). Interestingly, all of these diagnostic differences are located within the 3' LTR. A total of 16 sequences remained unclassified. These unclassified elements present autapomorphic deletions removing key diagnostic positions, exclusive differences at diagnostic sites, or a combination of diagnostic nucleotides from different subfamilies (most probably because of gene conversion or recombination). Some of the unclassified sequences might represent intermediate subfamilies, eliminated from the analysis because of the absence of other elements belonging to them. Phylogenetic analyses of the sequences are consistent with this classification, with the exception of NT_011896, which belongs to subfamily 2 on the basis of diagnostic differences but clusters with sequences from subfamily 1 in the phylogenetic trees. I removed this sequence from the rest of the analyses to avoid putative artifactual results. Figure 3 presents a neighbor-joining tree of the remaining 123 sequences. Sequences from subfamily 3 cluster together with high bootstrap support (86%). In agreement with the greater number of diagnostic differences defining this subfamily (fig. 2), the branch connecting this cluster is the longest internodal branch. The second-longest internodal branch leads to the cluster of sequences from subfamily 2. Nevertheless, this cluster is not supported by a high bootstrap value. The remaining sequences constitute subfamily 1. Within each subfamily, HERV-W proviruses and HERV-W retrosequences are interspersed, with no tendency to form separate clusters.



Fig. 2.—Alignment of subfamily consensus sequences. The "general" consensus sequence is shown above as a reference. Dots represent identical nucleotide positions in the three subfamily consensus sequences. Gaps are represented by dashes. CpG dinucleotide positions are shown in italics. The diagnostic positions used to classify the sequences in subfamilies are outlined in grey boxes. The arrow marks the beginning of the 3' LTR. Nucleotides present in more than 70% of the sequences from the subfamily are shown in capital letters, whereas nucleotides present in between 50% and 70% are shown in lowercase letters. R = A or G, N = any nucleotide, and a = A or gap

 

Table 2 Subfamily Classification of HERV-W

 


Fig. 3.—Neighbor-joining tree of the 123 sequences classified as members of any of the three subfamilies. Bootstrap values higher than 50% are shown. Brackets indicate the different subfamilies. HERV-W proviruses are marked by dots. The syncytin sequence (subfamily 3 provirus) is marked by a horizontal arrow

 
Table 3 shows the divergence values within and between different subfamilies. In agreement with the subfamily classification, divergence values between subfamilies are always greater than those within each of the subfamilies. The net divergence values between subfamilies are in accordance with the number of diagnostic differences between them (fig. 2). On the other hand, divergence values between HERV-W proviruses and retrosequences from the same subfamily are always intermediate between the two within-group values, and the net divergence values are less than 0.1% (table 4), indicating that proviruses and retrosequences do not form different clusters within the same subfamily. Based on the average pairwise divergence of elements from each of the subfamilies, the estimated amplification ages range from 15.5 to 18.6 MYA, assuming r = 0.2% per million years (as in Anderssen et al. 1997); from 19.0 to 23.2 MYA, assuming r = 0.16% (as in Costas and Naveira 2000); or from 23.4 to 28.6 MYA, assuming r = 0.13% (as in Lebedev et al. 2000).


Table 3 Divergence Values (%) Within and Between Subfamilies

 

Table 4 Divergence Values (%) Within and Between HERV-W Proviruses and Retrosequences

 

    Discussion
 
This paper reconstructs the main features of the evolutionary history of HERV-W. This family consists of three different subfamilies whose main periods of activity extend over a short period of evolutionary time (~5 Myr). On the basis of the average pairwise divergence between members of each subfamily, subfamily 1 seems to be the oldest, and the other two originated independently from it. The average pairwise divergence must be considered a rough estimate of the relative amplification ages, because of stochastic errors and variation in substitution rates (r) within different genomic regions and over time, but another observation supports this hypothesis. There are several sites within the subfamily 1 consensus sequence with two alternative nucleotides (or a nucleotide and a gap) in similar proportions, whereas each of the other two subfamilies presents only one of the two alternatives (positions 386, 415, and 438 in the alignment of fig. 2). We must take into account that an element must be able to create copies of itself at a relatively high level over a significant period of time in order to give rise to a detectable subfamily (Deininger et al. 1992). So, it is possible that minor subfamilies are hidden under the umbrella of the big ones, especially subfamily 3, which accounts for ~75% of all the HERV-W insertions. For instance, 25 of the 92 elements from this subfamily share a 9-bp deletion, and 21 present a 10-bp insertion. The existence of several ambiguous positions in the subfamily consensus sequences also suggests this possibility (fig. 2). Nevertheless, there are no other correlated diagnostic differences characterizing these putative groups, and furthermore, the divergence values between them and the other elements do not support the existence of new subfamilies (data not shown). Therefore, within each of the subfamilies, all the elements most probably arose from very few closely related active elements.

This picture of the intragenomic spread of HERV-W is in clear contrast with that of other HERV families, such as HERV-K, HERV-H, or ERV9, which remained transpositionally active over extended periods of primate evolution, leading to several distinct subfamilies over time (Anderssen et al. 1997; Medstrand and Mager 1998; Costas and Naveira 2000; Lebedev et al. 2000). Thus, each HERV family underwent its particular evolutionary history, and these histories may be quite different from each other. The presumably shorter period of amplification in the case of HERV-W (based on the average integration age of the different subfamilies), as well as the apparent lack of intact ORFs, suggests that the MSRV isolated from retroviral particles produced by cell cultures from patients with multiple sclerosis (Perron et al. 1997) may be an exogenous member of the HERV-W family. The failure to detect intermediate subfamilies between subfamily 1 and subfamily 3 (which differ at seven diagnostic positions within the U3 region; fig. 2) also suggests the possibility that these two subfamilies originated from two independent germ-line infections.

The most surprising fact of the evolutionary dynamics of HERV-W is the existence of a high proportion of insertions showing characteristic features of retrosequences, such as acquisition of a 3' poly(A) tail, presence of direct flanking repeats of 10–16 bp, and a structure resembling mRNAs. Recently, Esnault, Maestre, and Heidmann (2000) and Wei et al. (2001) formally demonstrated the ability of the non-LTR retrotransposon L1 to retrotranspose polyadenylated RNA transcripts in trans, producing insertions with these characteristics. Thus, HERV-W presumably spread by two different mechanisms: (1) the normal retrotransposition process of retroviruses, giving rise to full-length proviruses with intact LTRs, and (2) parasitism of the L1 element, as in the case of short interspersed elements (SINEs; Mathias et al. 1991; Ohshima et al. 1996), giving rise to HERV-W retrosequences. Alternatively, it is legitimate to speculate that the reverse transcriptase of HERV-W itself might be responsible for the formation of HERV-W retrosequences. Nevertheless, the fact that nonviral RNAs encapsidated in retroviral particles generate integrated cDNA genes lacking the hallmarks of naturally occurring processed pseudogenes (they are 5'- and 3'-truncated and do not contain poly(A) tails) strongly militates against this hypothesis (Dornburg and Temin 1988, 1990). The existence of both types of elements within each of the subfamilies clearly supports the idea that the formation of HERV-W retrosequences depends on the expression of full-length proviruses, which are the source of genomic RNA. The alternative hypothesis of independent evolution of retrosequences after their origin should give rise to subfamilies constituted only by HERV-W retrosequences, but no such subfamilies have been identified. Taking into account that HERV-W retrosequences are expected to be "dead on arrival" copies, the lower success of HERV-W within the genome, compared with the other aforementioned HERV families, might be related to the sequestration of a considerable proportion of genomic RNA by the L1 machinery.

The putative impact of HERV-W retrosequences on the genome might be quite different from that of HERV-W proviruses. Retroviral protein expression may cause deleterious effects on the host by several processes. Thus, the antigenic character of proteins encoded by gag and env has been associated with several autoimmune pathologies (Nakagawa and Harrison 1996; Perron et al. 1997). The transmembrane domain of the envelope protein has immunosuppressive effects (Cianciolo et al. 1985; Haraguchi et al. 1997), suggesting its possible implication in tumoral processes, leading to the escape of tumoral cells from immune rejection (Mangeney and Heidmann 1998). Other peptides encoded by small ORFs (two putative small ORFs have been described in HERV-W; Blond et al. 1999) might interfere with the cellular machinery (Boese et al. 2000). Furthermore, active proviruses may be the source of new insertions, acting as insertional mutagens (Mitreiter et al. 1994; Vasicek et al. 1997). None of these deleterious effects is associated with HERV-W retrosequences, which lack the capability to be expressed because of the loss of LTRs (not only in truncated but also in full-length retrosequences). HERV insertions may also be involved in deleterious chromosomal rearrangements by ectopic recombination between two copies of the same family of HERVs located at different chromosomal loci (Kamp et al. 2000; Sun et al. 2000). This effect is expected to be substantially reduced in the case of truncated retrosequences of short length. On the other hand, insertion of HERV-W retrosequences might introduce short enhancer sequences near genes (most of the enhancer signals are within the U3 region), providing raw material for natural selection. Thus, this type of insertion might represent a novel potential mechanism for the evolution of enhancers, adding a new possibility for L1 to shape mammalian genomes (Kazazian and Moran 1998; Moran, DeBerardinis, and Kazazian 1999; Pickeral et al. 2000).


    Supplementary Material
 
The alignment of the 140 insertions is available as Supplementary Material on-line.


    Acknowledgements
 
The author is a recipient of a postdoctoral fellowship from the USC/Xunta de Galicia.


    Footnotes
 
Thomas Eickbush, Reviewing Editor

Abbreviations: AS3, splice acceptor site 3; HERV, human endogenous retrovirus; LTR, long-terminal repeat; MSRV, multiple sclerosis-associated retrovirus; NCBI, National Center for Biotechnology Information; ORF, open reading frame.

Keywords: endogenous retrovirus, retrotransposition, HERV-W, L1, MSRV, retrosequence

Address for correspondence and reprints: Departamento de Bioloxía Fundamental, Facultade de Bioloxía, Universidade de Santiago de Compostela, E-15782 Santiago de Compostela, Spain. E-mail: bfcostas@usc.es.


    References
 

    Altschul S. F., W. Gish, W. Miller, E. W. Myers, D. J. Lipman. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403-410.

    Anderssen S., E. Sjottem, G. Svineng, T. Johansen. 1997. Comparative analyses of LTRs of the ERV-H family of primate-specific retrovirus-like elements isolated from marmoset, African green monkey, and man. Virology 234:14-30.

    Benit L., J. B. Lallemand, J. F. Casella, H. Philippe, T. Heidmann. 1999. ERV-L elements: a family of endogenous retrovirus-like elements active throughout the evolution of mammals. J. Virol. 73:3301-3308.

    Blond J. L., F. Beseme, L. Duret, O. Bouton, F. Bedin, H. Perron, B. Mandrand, F. Mallet. 1999. Molecular characterization and placental expression of HERV-W, a new human endogenous retrovirus family. J. Virol. 73:1175-1185.

    Blond J. L., D. Lavillette, V. Cheynet, O. Bouton, G. Oriol, S. Chapel-Fernandes, B. Mandrand, F. Mallet, F. L. Cosset. 2000. An envelope glycoprotein of the human endogenous retrovirus HERV-W is expressed in the human placenta and fuses cells expressing the type D mammalian retrovirus receptor. J. Virol. 74:3321-3329.

    Boese A., M. Sauter, U. Galli, B. Best, H. Herbst, J. Mayer, E. Kremmer, K. Roemer, N. Mueller-Lantzsch. 2000. Human endogenous retrovirus protein cORF supports cell transformation and associates with the promyelocytic leukemia zinc finger protein. Oncogene 19:4328-4336.

    Brosius J. 1999. RNAs from all categories generate retrosequences that may be exapted as novel genes or regulatory elements. Gene 238:115-134.

    Cianciolo G. J., T. D. Copeland, S. Oroszlan, R. Snyderman. 1985. Inhibition of lymphocyte proliferation by a synthetic peptide homologous to retroviral envelope proteins. Science 230:453-455.

    Costas J., H. Naveira. 2000. Evolutionary history of the human endogenous retrovirus family ERV9. Mol. Biol. Evol. 17:320-330.

    Deininger P. L., M. A. Batzer, C. A. Hutchison III, M. H. Edgell. 1992. Master genes in mammalian repetitive DNA amplification. Trends Genet. 8:307-311.

    Dornburg R., H. M. Temin. 1988. Retroviral vector system for the study of cDNA gene formation. Mol. Cell. Biol. 8:2328-2334.

    Dornburg R., H. M. Temin. 1990. cDNA genes formed after infection with retroviral vector particles lack the hallmarks of natural processed pseudogenes. Mol. Cell. Biol. 10:68-74.

    Esnault C., J. Maestre, T. Heidmann. 2000. Human LINE retrotransposons generate processed pseudogenes. Nat. Genet. 24:363-367.

    Goodchild N. L., D. A. Wilkinson, D. L. Mager. 1993. Recent evolutionary expansion of a subfamily of RTVL-H human endogenous retrovirus-like elements. Virology 196:778-788.

    Haraguchi S., R. A. Good, G. J. Cianciolo, R. W. Engelman, N. K. Day. 1997. Immunosuppressive retroviral peptides: immunopathological implications for immunosuppressive influences of retroviral infections. J. Leukoc. Biol. 61:654-666.

    International Human Genome Sequencing Consortium. 2001. Initial sequencing and analysis of the human genome. Nature 409:860-921.

    Jurka J. 1997. Sequence patterns indicate an enzymatic involvement in integration of mammalian retroposons. Proc. Natl. Acad. Sci. USA 94:1872-1877.

    Kamp C., P. Hirschmann, H. Voss, K. Huellen, P. Vogt. 2000. Two long homologous retroviral sequence blocks in proximal Yq11 cause AZFa microdeletions as a result of intrachromosomal recombination events. Hum. Mol. Genet. 9:2563-2572.

    Karlsson H., S. Bachmann, J. Schroder, J. McArthur, E. F. Torrey, R. H. Yolken. 2001. Retroviral RNA identified in the cerebrospinal fluids and brains of individuals with schizophrenia. Proc. Natl. Acad. Sci. USA 98:4634-4639.

    Kazazian H. H. Jr., J. V. Moran. 1998. The impact of L1 retrotransposons on the human genome. Nat. Genet. 19:19-24.

    Kim H. S., O. Takenaka, T. J. Crow. 1999. Isolation and phylogeny of endogenous retrovirus sequences belonging to the HERV-W family in primates. J. Gen. Virol. 80:2613-2619.

    Komurian-Pradel F., G. Paranhos-Baccala, F. Bedin, et al. (11 co-authors). 1999. Molecular cloning and characterization of MSRV-related sequences associated with retrovirus-like particles. Virology 260:1-9.

    Kumar S., K. Tamura, I. B. Jakobsen, M. Nei. 2001. MEGA2: molecular evolutionary genetics analysis software. Distributed by the authors (http://www.megasoftware.net/).

    Lebedev Y. B., O. S. Belonovitch, N. V. Zybrova, P. P. Khil, S. G. Kurdyukov, T. V. Vinogradova, G. Hunsmann, E. D. Sverdlov. 2000. Differences in HERV-K LTR insertions in orthologous loci of humans and great apes. Gene 247:265-277.

    Löwer R., J. Löwer, R. Kurth. 1996. The viruses in all of us: characteristics and biological significance of human endogenous retrovirus sequences. Proc. Natl. Acad. Sci. USA 93:5177-5184.

    Mager D., N. Goodchild. 1989. Homologous recombination between the LTRs of a human retrovirus-like element causes a 5-kb deletion in two siblings. Am. J. Hum. Genet. 45:848-854.

    Mangeney M., T. Heidmann. 1998. Tumor cells expressing a retroviral envelope escape immune rejection in vivo. Proc. Natl. Acad. Sci. USA 95:14920-14925.

    Mathias S. L., A. F. Scott, H. H. Kazazian Jr., J. D. Boeke, A. Gabriel. 1991. Reverse transcriptase encoded by a human transposable element. Science 254:1808-1810.

    Medstrand P., D. L. Mager. 1998. Human-specific integrations of the HERV-K endogenous retrovirus family. J. Virol. 72:9782-9787.

    Mi S., X. Lee, X. Li, et al. (12 co-authors). 2000. Syncytin is a captive retroviral envelope protein involved in human placental morphogenesis. Nature 403:785-789.

    Mitreiter K., J. Schmidt, A. Luz, M. J. Atkinson, H. Hofler, V. Erfle, P. G. Strauss. 1994. Disruption of the murine p53 gene by insertion of an endogenous retrovirus-like element (ETn) in a cell line from radiation-induced osteosarcoma. Virology 200:837-841.

    Moran J. V., R. J. DeBerardinis, H. H. Kazazian Jr. 1999. Exon shuffling by L1 retrotransposition. Science 283:1530-1534.

    Nakagawa K., L. C. Harrison. 1996. The potential roles of endogenous retroviruses in autoimmunity. Immunol. Rev. 152:193-236.

    Nicholas K. B., H. B. Nicholas Jr. 1997. GeneDoc: a tool for editing and annotating multiple sequence alignment. Distributed by the authors (http://www.cris.com/~ketchup/genedoc.shtml).

    Ohshima K., M. Hamada, Y. Terai, N. Okada. 1996. The 3' ends of tRNA-derived short interspersed repetitive elements are derived from the 3' ends of long interspersed repetitive elements. Mol. Cell. Biol. 16:3756-3764.

    Perron H., J. Garson, F. Bedin, et al. (13 co-authors). 1997. Molecular identification of a novel retrovirus repeatedly isolated from patients with multiple sclerosis. The Collaborative Research Group on Multiple Sclerosis. Proc. Natl. Acad. Sci. USA 94:7583-7588.

    Pickeral O. K., W. Makalowski, M. S. Boguski, J. D. Boeke. 2000. Frequent human genomic DNA transduction driven by LINE-1 retrotransposition. Genome Res. 10:411-415.

    Saitou N., M. Nei. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.

    Stoye J. P., J. M. Coffin. 2000. A provirus put to work. Nature 403:715-717.

    Sun C., H. Skaletsky, S. Rozen, J. Gromoll, E. Nieschlag, R. Oates, D. Page. 2000. Deletion of azoospermia factor a (AZFa) region of human Y chromosome caused by recombination between HERV15 proviruses. Hum. Mol. Genet. 9:2291-2296.

    Thompson J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, D. G. Higgins. 1997. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25:4876-4882.

    Toda Y., R. Saito, M. Tomita. 2000. Characteristic sequence pattern in the 5- to 20-bp upstream region of primate Alu elements. J. Mol. Evol. 50:232-237.

    Tristem M. 2000. Identification and characterization of novel human endogenous retrovirus families by phylogenetic screening of the human genome mapping project database. J. Virol. 74:3715-3730.

    Vasicek T. J., L. Zeng, X. J. Guan, T. Zhang, F. Costantini, S. M. Tilghman. 1997. Two dominant mutations in the mouse fused gene are the result of transposon insertions. Genetics 147:777-786.

    Voisset C., O. Bouton, F. Bedin, L. Duret, B. Mandrand, F. Mallet, G. Paranhos-Baccala. 2000. Chromosomal distribution and coding capacity of the human endogenous retrovirus HERV-W family. AIDS Res. Hum. Retroviruses 16:731-740.

    Wei W., N. Gilbert, S. L. Ooi, J. F. Lawler, E. M. Ostertag, H. H. Kazazian, J. D. Boeke, J. V. Moran. 2001. Human L1 retrotransposition: cis preference versus trans complementation. Mol. Cell. Biol. 21:1429-1439.

    Xiong Y., T. H. Eickbush. 1988. Similarity of reverse transcriptase-like sequences of viruses, transposable elements, and mitochondrial introns. Mol. Biol. Evol. 5:675-690.

Accepted for publication December 4, 2001.