Ylli, a Non–LTR Retrotransposon L1 Family in the Dimorphic Yeast Yarrowia lipolytica

Serge Casaregola, Cécile Neuvéglise, Elisabeth Bon and Claude Gaillardin

Collection de Levures d'Intérêt Biotechnologique, Laboratoire de Génétique Moleculaire et Cellulaire, Thiverval-Grignon, France


    Abstract
 
During the course of a random sequencing project of the genome of the dimorphic yeast Yarrowia lipolytica, we have identified sequences that were repeated in the genome and that matched the reverse transcriptase (RT) sequence of non–long terminal repeat (non–LTR) retrotransposons. Extension of sequencing on each side of this zone of homology allowed the definition of an element over 6 kb long. The conceptual translation of this sequence revealed two open reading frames (ORFs) that displayed several characteristics of non–LTR retrotransposons: a Cys-rich motif in the ORF1, an N-terminal endonuclease, a central RT, and a C-terminal zinc finger domain in the ORF2. We called this element Ylli (for Y. lipolytica LINE). A total of 19 distinct repeats carrying the 3' untranslated region (UTR) and all ending with a poly-A tail were detected. Most of them were very short, 17 being 134 bp long or less. The number of copies of Ylli was estimated to be around 100 if these short repeats are 5' truncations. No 5' UTR was clearly identified, indicating that entire and therefore active elements might be very rare in the Y. lipolytica strain tested. Ylli does not seem to have any insertion specificity. Phylogenetic analysis of the RT domain unambiguously placed Ylli within the L1 clade. It forms a monophyletic group with the Zorro non–LTR retrotransposons discovered in another dimorphic yeast Candida albicans. BLAST comparisons showed that ORF2 of Ylli is closely related to that of the slime mold Dictyostelium discoideum L1 family, TRE.


    Introduction
 
Retrotransposons are found in all eukaryotes and in some cases constitute a major part of the genome (Le et al. 2000; Li et al. 2001; The Arabidopsis Genome Initiative 2000). Because of their ability to undergo transposition, they were shown to contribute to genome plasticity and to have been involved in numerous deletions, duplications, and rearrangements in lower eukaryotes (Warmington et al. 1987; Picologlou et al. 1988). Two types of retrotransposons are present in eukaryotes: long terminal repeat (LTR) retrotransposons, closely related to retroviruses, and non–LTR retrotransposons. All these elements use reverse transcription to propagate, but they differ in their gene structure, the transcription of their genes, and the mechanisms of transposition. On the basis of sequence conservation of their reverse transcriptase (RT) domain, they were proposed to have been derived from a common ancestor (Xiong and Eickbush 1990).

Non–LTR retroelements (also called LINEs for Long Interspersed Repetitive Elements) have been found in all eukaryotes, including plants, mammals, and fungi. The most extensively studied non–LTR retrotransposon, L1, was found to constitute 15.4% of the human genome (Li et al. 2001). L1 consists of a 5' untranslated region (UTR) of variable length and structure that carries a polymerase-II promoter, two open reading frames (ORFs), and a 3' UTR terminating in a poly-A tract (Hutchinson III et al. 1989), whereas the oldest non–LTR retrotransposon families are constituted by a single ORF. ORF1 is considered to be the equivalent of the retroviral gag gene, although the proteins do not show sequence conservation. The human ORF1 gene product was shown to bind L1 RNA in vitro (Hohjoh and Singer 1997). ORF2 encodes several domains with enzymatic characteristics, including endonuclease and RT activities involved in the retrotransposition process, and a cysteine-rich (Cys-rich) motif at the C-terminus thought to be involved in nucleic acid binding. A total of 11 families of non–LTR retrotransposons were recently described, and variability between families was proposed to be linked to the gain or loss of various domains in both ORFs (Malik, Burke, and Eickbush 1999). Non–LTR retrotransposon maintenance is based on vertical transmission (Malik, Burke, and Eickbush 1999; Furano 2000), although some instances of horizontal transfer have been documented (Kordis and Gubensek 1998; Volff et al. 2001).

Propagation of non–LTR retrotransposons requires a reverse transcription step, and these elements were mostly found truncated because of abortive transcription. Over 800,000 copies are present in humans, but very few are entire. Even fewer (30–60) are capable of transposition (Sassaman et al. 1997). Besides the colonization of their host genome, non–LTR retrotransposons were recently found to be responsible for generating pseudogenes (Esnault, Maestre, and Heidmann 2000). Non–LTR retrotransposons were also proposed to be involved in the propagation of X chromosome inactivation in mammals (Bailey et al. 2000).

Non–LTR retrotransposons have been detected in four fungal species: Tad-1 in Neurospora crassa (Cambareri, Helber, and Kinsey 1994), MGR583 in Magnaporthe grisea (Hamer et al. 1989), CgT1 in Colletotrichum gloeosporioides (He et al. 1996), and Mars in Ascobolus immersus (Goyon, Rossignol, and Faugeron 1996), although no entire element was described for the last species. Whereas LTR retrotransposons are widely distributed in all yeast species tested, except in Pichia sorbitophila (Génolevures 2000; Goodwin and Poulter 2000; unpublished data), no non–LTR retrotransposons had been found in yeast until recently. Chibana et al. (1998) mentioned a gene, LRT2, that displayed homology with the non–LTR retrotransposon RT and hybridized to three chromosomes in the yeast Candida albicans. Very recently, three elements, the Zorros, phylogenetically related to the L1 clade, have been described in C. albicans (Goodwin, Ormandy, and Poulter 2001). Interestingly, only one was full length; the others were either truncated at the 5' end or degenerate. In addition, unlike most non–LTR retrotransposons, the Zorros are present in only a few copies per genome. Another non–LTR retrotransposon family was very recently described in the basidiomycetous yeast Cryptococcus neoformans and was found to belong to the most ancient clade, CRE (Goodwin and Poulter 2001).

Yarrowia lipolytica is a yeast with many unusual properties. It is dimorphic, and it can grow on fatty acids and alkanes. It is heterothallic; although it can display sexuality, most of its isolates are haploid, reminiscent of filamentous fungi. In contrast to other yeasts, the Y. lipolytica genome shares several properties with higher eukaryotes, such as dispersion of the rDNA clusters and the 5S RNA genes and the presence of a typical Signal Recognition Particle 7S RNA (Barth and Gaillardin 1997). Despite the numerous features that Y. lipolytica shares with higher eukaryotes and filamentous fungi, rDNA sequence phylogeny placed it unambiguously among the hemiascomycetous yeasts (Kurtzman and Robnett 1998). A retrotransposon, Ylt1, belonging to the Ty3-gypsy group and bound by unusually long (714 bp) LTRs, was identified (Schmid-Berger, Schmid, and Barth 1994). Surprisingly, Ylt1 is not systematically present in all wild isolates of Y. lipolytica (Juretzek et al. 2001).

A recent genome survey of Y. lipolytica based on random sequencing (Casaregola et al. 2000) confirmed that no sequence homologous to Ylt1 could be found in the wild French isolate W29, as suggested by Juretzek et al. (2001), but that another Ty3-like retrotransposon was present. This work also revealed BLASTX matches to mammalian non–LTR retrotransposons. Here, we report the characterization of this Y. lipolytica non–LTR retrotransposon family.


    Materials and Methods
 
Strains and Media
The Y. lipolytica strains used in this study are W29, a wild strain isolated from Paris sewage; H222, a wild strain isolated from German soil; and CX161-1B, a laboratory derivative of the type strain isolated from an American corn processing plant. Cells were routinely grown in YPD (1% yeast extract, 1% peptone, and 1% glucose) at 28°C with shaking.

DNA Techniques
Common DNA manipulations were performed as described in Sambrook, Fritsch, and Maniatis (1989). Restriction enzymes were purchased from GIBCO-BRL and BioLabs. Blotting of DNA onto a GeneScreen nylon membrane (DuPont) was performed according to Casaregola et al. (1998). DNA-DNA hybridization was performed according to Church and Gilbert (1984), with [α-32P]dCTP–labeled DNA probes prepared using the Megaprime labeling kit (Amersham, U.K.). Final washes were done at 65°C in 0.1× SSC, 0.1% SDS. For pulsed-field gel electrophoresis, genomic DNA in agarose plugs was prepared according to Casaregola et al. (1997). Chromosomes were separated using a BioRad CHEF MAPPER apparatus in 0.5× TBE running buffer at 14°C for 15 h 39 min with pulses from 0.05 to 0.92 s in 1% BioRad pulsed-field certified agarose gels with voltages of 9 and 6 Volts/cm and an angle of 180°.

The program consed (Gordon, Abajian, and Green 1998) was used for the design of the primers. The primers used for PCR amplification of the 5'-end of Ylli are 5'-CCTCGAGCACCTCGATA and 5'-GAGCCGGTAAGCCTAGC, and those for the 3' end are 5'-TGACCTTCGAAAACACTACAT and 5'-GTACGCTTACACGAATATTACTAA. PCR amplifications were performed with 50 ng genomic DNA prepared according to Romano et al. (1996) using a Crocodile III or a Perkin-Elmer 9600 thermocycler. Amplification conditions were as follows: 4 min at 94°C; 25 or 30 cycles consisting of 30 s at 94°C, 30 s at the Tm of the primers, and 1 min per kb to be amplified at 72°C; followed by 7 min at 72°C. Appligene Taq polymerase (Oncor) (2.5 units) in the supplied buffer was used. The PCR products were run on a 0.8% agarose gel (ICN) in 1× TAE buffer.
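As a rough illustration of the cycling parameters, the sketch below estimates a primer Tm with the Wallace rule (2°C per A or T, 4°C per G or C) and derives the extension time from the 1 min per kb rule given above. The Wallace rule is an assumption made for illustration; the text does not state how the Tm of the primers was calculated, and the function names and primer labels are hypothetical.

```python
# Illustrative sketch. ASSUMPTION: the Wallace rule is used as the Tm estimate;
# the method actually used for these primers is not stated in the text.

def wallace_tm(primer: str) -> int:
    """Estimate Tm in degrees C as 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def extension_seconds(product_bp: int) -> float:
    """Extension time at 72 degrees C, using 1 min per kb of expected product."""
    return 60.0 * product_bp / 1000.0


# Primer sequences are taken from the text; the labels are hypothetical.
primers = {
    "5'-end primer a": "CCTCGAGCACCTCGATA",
    "5'-end primer b": "GAGCCGGTAAGCCTAGC",
    "3'-end primer a": "TGACCTTCGAAAACACTACAT",
    "3'-end primer b": "GTACGCTTACACGAATATTACTAA",
}

for label, seq in primers.items():
    print(f"{label}: {len(seq)} nt, estimated Tm {wallace_tm(seq)} C")
```

The Wallace rule is only a first approximation for short oligonucleotides; nearest-neighbor estimates would generally differ by a few degrees.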

Sequencing and Sequence Assembly
Most of the sequences used in this study were generated during the Génolevures project (Artiguenave et al. 2000; Casaregola et al. 2000) on strain W29. Additional sequences were obtained from clones of the Y. lipolytica genomic DNA library used in Génolevures. Sequence assembly was performed using programs phred (Version 0.980904.c) and phrap (Version 0.960731) with a minscore of 14 and a minmatch of 30 (Ewing and Green 1998; Ewing et al. 1998). The sequence compilation was edited with consed, Version 10.38 (beta) (Gordon, Abajian, and Green 1998).

Sequence Analysis
BLAST (Altschul et al. 1997) was used to screen sequence databases for homology. Sequences were analyzed with various programs in the GCG environment (Genetics Computer Group, Madison), including FASTA (Pearson and Lipman 1988). The alignment of the RT domains of non–LTR retrotransposons described by Malik, Burke, and Eickbush (1999) was used as a basis for the alignment shown in figure 5. Sequence alignments were generated using CLUSTAL W (Higgins, Thompson, and Gibson 1996) and CLUSTAL X (Thompson et al. 1997) and were manually adjusted in Genedoc (http://www.psc.edu/biomed/genedoc). Phylogenetic trees were generated by the neighbor-joining method (Saitou and Nei 1987) and were visualized with Treeview, Version 1.6.5 (Page 1996).



 
Fig. 5.—Schematic representation of the structural organization of the yeast non–LTR retrotransposons. APE: apurinic endonuclease, RT: reverse transcriptase, RNH: RNase H. Vertical bars represent Cys-rich motifs. The ORFs are drawn to scale. Hatched representation of Zorro-1 ORF1 indicates that the ORF might not be full length

 
Source of Sequences
The sequences used in this study are listed in table 1. The Génolevures random sequence tags (RSTs) are available at the Génolevures web site http://cbi.labri.u-bordeaux.fr/Genolevures. Part of the sequence data for C. albicans was obtained from the Stanford Genome Technology Center web site http://www-sequence.stanford.edu/group/candida. Sequencing of C. albicans was accomplished with the support of the NIDR and the Burroughs Wellcome Fund.


 
Table 1 Source of Sequences

 

    Results
 
Sequencing of the Y. lipolytica Non–LTR Retrotransposon
The Génolevures project consisted of generating and analyzing a large number of RSTs for 13 hemiascomycetous yeasts. The RSTs represent both extremities of 3- to 5-kb-long inserts cloned in a high copy number plasmid. For Y. lipolytica, 4,940 RSTs with an average size of 995 bp were thus obtained and analyzed either separately or after assembly of the RSTs into contigs (Casaregola et al. 2000). A 3,582-bp-long contig, Yl-C834, made of 10 RSTs was found to match the mouse L1Md ORF2 over the RT domain with 22% amino acid identity over 738 amino acids (BLASTX). We extended the sequence from each of the extremities of the contig Yl-C834 and found overlapping clones belonging to another contig, Yl-C828. Further upstream and downstream sequencing led to a 6,942-bp sequence compilation called 02-1100. It comprised sequences forming a contig and sequences whose central part matched the 02-1100 sequence but whose 5' end or 3' end did not. In addition, a number of reads contained, at their 3' end, a poly-A tract of various lengths (11–23 A's) and an abruptly diverging sequence past this poly-A tract. This was reminiscent of the 5' truncated non–LTR retrotransposons. The sequence 02-1100 carries two large ORFs of 728 and 1,300 amino acids with the same polarity but in two different reading frames and separated by two nucleotides (fig. 1). Conceptual translation of ORF1 started with a methionine, a second Met codon being at position 15. To obtain sequence 02-1100, we further sequenced RSTs upstream from the ORF1. Two sequences were obtained, each corresponding to four RSTs, and diverged between the two Met codons. We inferred that ORF1 started with the second ATG encountered in the ORF. Ylli ORF1, with 714 amino acids, is one of the largest non–LTR retrotransposon ORF1s. The first Met codon in ORF2 is 2 bp from the ORF1 stop codon. Other Met codons can also be found 57 and 107 bp downstream from the ORF1 stop codon, although amino acid sequence comparisons with other non–LTR retrotransposon ORF2s clearly indicate that Ylli ORF2 starts with the first Met codon of the ORF (see subsequently). Three stop codons, each separated by 6 bp, constitute a strong signal for translation arrest (fig. 1).
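The arrangement described above, two ORFs of the same polarity lying in different reading frames and separated by two nucleotides, can be illustrated with a minimal forward-strand ORF scan. This is a sketch on an invented toy sequence, not the procedure applied to 02-1100; the function name, the minimum-length parameter, and the toy sequence are assumptions.

```python
STOPS = {"TAA", "TAG", "TGA"}


def forward_orfs(seq: str, min_codons: int = 3):
    """Return (frame, start, end) for ATG-to-stop ORFs on the forward strand.

    start is the 0-based index of the ATG and end is the index just past the
    stop codon. In each frame, only the first ATG after the previous stop
    (or after the sequence start) opens an ORF.
    """
    seq = seq.upper()
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                # Count codons from the ATG up to, not including, the stop.
                if (i - start) // 3 >= min_codons:
                    found.append((frame, start, i + 3))
                start = None
    return found


# Invented toy sequence: a frame-0 ORF, a 2-nt spacer, then a second ORF.
toy = "ATGAAACCCGGGTAA" + "CC" + "ATGGCTGCTGCTTGA"
orfs = forward_orfs(toy)
for frame, start, end in orfs:
    print(f"frame {frame}: {start}..{end}  {toy[start:end]}")
```

Because the spacer is 2 nt long, the second ORF falls in a different frame from the first, which mirrors the ORF1 and ORF2 layout reported for Ylli.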



 
Fig. 1.—Structure of Ylli. A, The different parts of Ylli are depicted. On top, the nucleotide number of the start and stop of the ORFs are shown. B, Sequence of the 5' region and of the 3' UTR. The 5' region: the direct repeat CAGCAA is in bold face and underlined. The T-rich region is in boldface and is indicated by dashed underlining. The translated regions are in small letters and the start codon and the stop codons are in boldface and underlined. The 35-bp-long repeats present in the 5' region and in the 3' UTR are indicated with a thick arrow. The location of divergence between the two available sequences of the 5' region is indicated with an open triangle. The 3' UTR: dashed underlining indicates the dyad symmetry in the 3' end. The various locations of the start of the 3'-end short repeats are indicated with filled triangles. C, A potential stem loop structure deduced from the dyad symmetry in the 3' UTR as indicated in B is depicted

 
We also detected 15 bases over the 6.9-kb sequence that diverged between copies of Ylli. Two divergent bases are in ORF1 and change the amino acids (33Pro->His and 535Thr->Ile). Four divergent bases are in ORF2 with two amino acid changes, including one conservative change (1200Arg->Lys, 1226Pro->Thr) and two silent changes (459Asn AAT->AAC, 822Arg CGG->CGT), indicating that the element is not degenerate and has transposed recently. The composite sequence compiled from the various Génolevures RSTs and additional sequences that made up the sequence compilation 02-1100 is deposited at EMBL under the accession number AJ319752.
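The two silent changes quoted above can be checked against a codon table. The sketch below builds the standard genetic code (NCBI translation table 1) and classifies a codon substitution; it assumes that Y. lipolytica nuclear genes are read with the standard code, and the function name is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons in the order
# TTT, TTC, TTA, TTG, TCT, ... GGG (third base varies fastest).
# ASSUMPTION: the standard code applies to the ORFs discussed in the text.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def classify(codon_a: str, codon_b: str) -> str:
    """Label a codon substitution as 'silent', 'missense', or 'nonsense'."""
    aa_a = CODON_TABLE[codon_a.upper()]
    aa_b = CODON_TABLE[codon_b.upper()]
    if aa_a == aa_b:
        return "silent"
    if "*" in (aa_a, aa_b):
        return "nonsense"
    return "missense"


# The two silent changes reported in ORF2.
for a, b in [("AAT", "AAC"), ("CGG", "CGT")]:
    print(a, "->", b, CODON_TABLE[a], "->", CODON_TABLE[b], classify(a, b))
```

Both substitutions fall at the third codon position, where the code is most degenerate.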

Whereas the 3'-end of Ylli could be identified without difficulty (see previously), we were unable to find more than two clones with divergent sequences immediately upstream of ORF1 in our Y. lipolytica library. Therefore, we could not define precisely the start of the element, and we have no evidence either for the location of the 5'-end of Ylli in the contig 02-1100 or for the presence of an entire element in the Y. lipolytica genome. In addition, non–LTR retrotransposon transposition creates a 7- to 20-bp genomic duplication bounding the element, and no such duplication was detected. Of the two sequences associated with ORF1 generated by additional sequencing of existing clones, the one included in the sequence compilation 02-1100 spans 476 bp and carries features that are reminiscent of a yeast promoter. We found a poly-T tract, interrupted by three Cs, 121 bp upstream from the first ATG of ORF1 with the following sequence: T9 C T3 C T8 C T2 C T7. This type of poly-dT, found in the promoter of three different genes of S. cerevisiae and located upstream from a TATA box, was shown to act as a cis-activating sequence (Struhl 1985). Two putative TATA boxes are present 27 and 52 bp downstream from the poly-dT (fig. 1B). We also found a 6-bp-long sequence, CAGCAA, that we called the Y box, repeated five times within the 476 bp upstream from the ORF1, with two of these repeats in tandem. Interestingly, within this region, a 35-bp sequence, CTTTCAC A/G GCAAAGTTATAATTAAATGAAT A/G TATA, located 31 bp upstream of the Met codon of ORF1, is directly repeated, with two mismatches, at the end of the 3' UTR, ending 5 bp upstream from the poly-A tract (fig. 1B). The first mismatch (A/G) is caused by two different sequences within the compilation, whereas the second mismatch (A/G) corresponds to a divergence between the 5' and the 3' repeats. It is noteworthy that one of the six Y boxes that we detected in the sequence upstream from the ORF1 is located within the 35-bp repeat, supporting the idea that the 476-bp sequence may belong to Ylli. Further work will be necessary to demonstrate this possibility. In addition to this direct repeat, we found in the 3' UTR a long region of dyad symmetry with an internal energy of -3.8 kcal that overlaps the second 35-bp-long direct repeat of the element (fig. 1B). It is depicted in figure 1C (see subsequently). This is reminiscent of the end of the 3' UTR that is recognized by the protein encoded by the Bombyx mori site-specific element R2 and is likely to form a secondary structure (Mathews et al. 1997).

Ylli Copy Number
A total of 11 RSTs containing the poly-A and corresponding to eight different insertions in the genome of W29 were found in the sequence compilation 02-1100. Overall, BLASTN comparison of the sequence 02-1100 with the Génolevures Y. lipolytica RSTs identified 23 RSTs that had between 23 and 134 bp matching the 3' end of the element; 16 of these were not included by the assembler used to generate the contig Yl-C834 (table 2) because of the short length of the homology. These 23 RSTs carry isolated repeats flanked by genomic material, indicating that they might be extreme 5' truncations of the elements. Overall, 19 distinct repeats matching the 3'-end of Ylli, including the poly-A, were identified. One is 431 bp long and another 350 bp long. All the others are short repeats varying in size from 23 to 134 bp (table 2). Other RSTs homologous to Ylli and carrying genomic material, which diverged from the Ylli consensus sequence at points along the element, were also detected (AW0AA006D11T1 at position 447, AW0AA003D09D1 at position 1393, AW0AA027H02T1 at position 5826, and AW0AA018D09D1 at position 6173). These 5' truncates were rare compared with the short repeats. As the coverage of the Y. lipolytica genome in the Génolevures project is around 20% (Casaregola et al. 2000), the overall number of Ylli insertions in the genome of the strain W29 was estimated to be around 100, if short repeats are considered, or 15–20 if these are omitted.
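The extrapolation behind the figure of around 100 copies can be written as a simple scaling of the observed count by the genome coverage: 19 distinct repeats at about 20% coverage gives roughly 95. The sketch below makes the arithmetic explicit; it assumes that insertions are sampled uniformly by the random sequencing, and the function name is hypothetical.

```python
def estimate_copies(observed: int, coverage: float) -> float:
    """Scale a count observed in a partial survey to the whole genome.

    ASSUMPTION: insertions are sampled uniformly, so the expected number
    observed is proportional to the fraction of the genome covered.
    """
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    return observed / coverage


# 19 distinct 3'-end repeats found at about 20% genome coverage.
print(round(estimate_copies(19, 0.20)))
```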


 
Table 2 List of Génolevures RSTs Carrying Ylli Sequence

 
In order to test whether 5' truncation of Ylli occurred in Y. lipolytica, two probes, located either in ORF1 or in the 3' UTR but not overlapping the short repeats, were hybridized to genomic DNA digested with two enzymes, EcoRI and SphI, that do not cut Ylli. The 3'-end probe does not encompass the entire 3' UTR (bp 6007 to 6873) and therefore overlaps the 70-bp short repeats over 11 bp and the 77-bp short repeats over 18 bp. Figure 2 indicates that the number of copies of Ylli detected is lower with the 5'-end probe than with the 3'-end probe. The 3'-end probe used can hybridize to 3'-end repeats over 70 bp long; there are seven of these, ranging from 105 to 431 bp. This could account for the difference in the number of bands between the 5' and the 3' probes. The hybridization with the 3'-end–specific probe nevertheless confirmed our estimation of 15–20 copies per haploid genome when very short 3' repeats are not taken into account and seems to indicate that, as in higher eukaryotes but to a lesser extent, Ylli RT causes 5' truncation.
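The overlaps quoted for the 70-bp and 77-bp repeats follow from interval arithmetic on the 02-1100 coordinates. The sketch below reproduces them under one assumption that is not stated in the text: that the short repeats all end at bp 6932, the last base before the poly-A tract. That value is inferred here from the two quoted overlaps; the function name is hypothetical.

```python
def overlap_bp(repeat_len: int, repeat_end: int,
               probe_start: int, probe_end: int) -> int:
    """Overlap in bp between a probe and a 3'-anchored repeat.

    All coordinates are 1-based and inclusive.
    """
    repeat_start = repeat_end - repeat_len + 1
    lo = max(repeat_start, probe_start)
    hi = min(repeat_end, probe_end)
    return max(0, hi - lo + 1)


# ASSUMPTION: repeats end at bp 6932 of 02-1100 (inferred, not stated).
REPEAT_END = 6932
PROBE = (6007, 6873)  # 3'-end probe coordinates given in the text

for length in (23, 70, 77, 134):
    print(length, "bp repeat:", overlap_bp(length, REPEAT_END, *PROBE), "bp overlap")
```

Under this assumption, repeats of 59 bp or less share no sequence with the probe, which is consistent with the shortest repeats going undetected in the hybridization.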



 
Fig. 2.—Hybridization of Ylli-specific probes to separated genomic DNA from Y. lipolytica. A, Genomic DNA from strain W29 was digested by EcoRI (lanes 1 and 2) and SphI (lanes 3 and 4), separated on a pulsed-field gel (see Materials and Methods), blotted to a nylon membrane, and hybridized with a 645-bp probe (bp 923 to 1,567 of the sequence 02-1100) specific for ORF1 and a 766-bp probe (bp 6007 to 6873 of the sequence 02-1100) specific for the 3' UTR. B, Genomic DNA from strain W29 (lanes 1 and 4), H222 (lanes 2 and 5), and CX161-1B (lanes 3 and 6) was digested by EcoRI (lanes 1–3) or HindIII (lanes 4–6), separated on gel, blotted to a nylon membrane, and hybridized with the Ylli 3'-end probe (see Materials and Methods). M indicates the molecular weight in kb

 
The short repeats are reminiscent of SINEs (Short Interspersed Nuclear Elements), short sequences found in very high copy number in mammals and thought to carry an RNA polymerase-III promoter associated with the 3'-end of non–LTR retrotransposons. SINEs thus have two conserved sequences, boxes A and B, typical of RNA polymerase-III–transcribed genes. We did not find these conserved sequences upstream from the short repeats. Interestingly, the junction between genomic DNA and the 5' end of the short repeats falls, for most of them, in or in the close vicinity of the dyad symmetry detected at the extremity of Ylli (fig. 1), suggesting that this dyad symmetry may be involved in the process that generates these repeats (see Discussion).

Analysis of Ylli ORF1
Pairwise amino acid sequence comparisons with FASTA gave an amino acid identity between Ylli and Zorro-3 ORF1 amounting to 18.5% over 583 amino acids, whereas we obtained 23.5% identity over 136 amino acids when ORF1 of Zorro-1 and Ylli were compared. Comparisons of both C. albicans retrotransposons ORF1 gave 20.3% identity over 291 amino acids. These results are consistent with the difference in size observed for the ORFs of the three elements because the best match is obtained when Ylli and Zorro-3 are compared (table 3 ). Pairwise comparison of Ylli ORF1 gave 19.2% identity over 208 amino acids with human L1Hs ORF1 and 8.9% identity over 307 amino acids with D. discoideum TRE3_A ORF1 (see subsequently). Comparisons with other non–LTR retrotransposon ORF1 sequences, including those of fungi, gave poor scores.
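Identity values of this kind are the number of identical aligned positions divided by the length of the aligned region. The sketch below computes this for two pre-aligned toy sequences; it is not a reimplementation of FASTA, which derives the aligned region from its own local alignment. The convention of skipping gapped columns, the function name, and the toy sequences are assumptions.

```python
def percent_identity(aln_a: str, aln_b: str) -> float:
    """Percent identical residues over columns where neither sequence has a gap.

    ASSUMPTION: gapped columns are excluded from the denominator; alignment
    programs differ on this convention.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aln_a, aln_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Invented toy alignment: 4 identities over 5 gap-free columns.
print(percent_identity("MKT-LC", "MRTALC"))
```

Because the denominator is the aligned length, an identity of 23.5% over 136 amino acids and one of 18.5% over 583 amino acids are not directly comparable; the longer alignment carries more weight.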


 
Table 3 Size of the ORFp Encoded by the Yeast LINEs and Comparison to Neurospora, Dictyostelium, Xenopus, and human

 
Non–LTR retrotransposon ORF1 is assumed to be related to the retroviral gag gene. The L1 ORF1 gene product was shown to specifically bind RNA (Hohjoh and Singer 1997) and seems to act as a nucleic acid chaperone in the reverse transcription process (Martin and Bushman 2001). Most ORF1 gene products carry at least one Cys-rich motif assumed to be involved in the binding to nucleic acids. Unlike ORF1 from other organisms such as N. crassa, which have several Cys-rich motifs that can be degenerate, we detected only one Cys-rich motif in Ylli ORF1p, of the type found in gag-like proteins, CX2CX4HX4C, located at position 392 (fig. 3B). Two Cys-rich motifs of this type were found in Zorro-1 and Zorro-3, starting at position 121 and 427, respectively. We also noticed that Ylli ORF1p is enriched in proline in the C-terminal part, as has been observed for N. crassa Tad-1 and gag-like proteins (Cambareri, Helber, and Kinsey 1994).
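A motif written as CX2CX4HX4C translates directly into a regular expression, which gives a quick way to locate candidate Cys-rich motifs in a protein sequence. The sketch below is illustrative only: the test sequence is invented and is not Ylli ORF1p, and the function name is hypothetical.

```python
import re

# CX2CX4HX4C: Cys, any 2 residues, Cys, any 4 residues, His, any 4 residues, Cys.
# The lookahead lets overlapping candidates be reported as well.
CCHC = re.compile(r"(?=C.{2}C.{4}H.{4}C)")


def find_cchc(protein: str):
    """Return the 1-based start positions of CX2CX4HX4C matches."""
    return [m.start() + 1 for m in CCHC.finditer(protein.upper())]


# Invented toy sequence with one motif (CKRCGKTGHIARDC) starting at residue 5.
toy_protein = "MAAACKRCGKTGHIARDCPPP"
print(find_cchc(toy_protein))
```

A strict pattern of this kind will miss the degenerate second and third domains described for Zorro-1, Zorro-3, and Tad-1, which depart from the canonical spacing.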

An alignment of the ORF1p of Ylli, Zorro-1, and Zorro-3, in addition to the human L1Hs and the mouse L1Md ORF1p sequences, is shown in figure 3A. This alignment clearly shows that the yeast ORFs have a C-terminal extension with respect to the mammalian proteins, consistent with the lack of the Cys-rich motif in the mammalian proteins. From this alignment we confirmed that, as mentioned by Goodwin, Ormandy, and Poulter (2001), the N-terminal part of Zorro-1 ORF1p is not carried by the C. albicans C6-1996 contig because the most conserved part of the aligned sequences is located before the beginning of Zorro-1 ORF1p.



 
Fig. 3.—Amino acid sequence alignment of yeast non–LTR retrotransposon ORF1p with related proteins. A, The three yeast non–LTR retrotransposon ORF1 (Ylli, Zorro-1, Zorro-3) amino acid sequences and the human (L1Hs) and mouse (L1Md) ORF1 amino acid sequences were aligned with CLUSTAL W and CLUSTAL X with minor manual adjustments. Conserved residues that are common to three and two species are boxed in gray, with white and black letters, respectively. The Cys-His residues of the zinc finger domains are indicated below the alignment. B, The Cys-His domains of Ylli, Zorro-1, Zorro-3, and Tad-1 (Cambareri, Helber, and Kinsey 1994) were aligned. Cys and His residues are in bold characters. The second and third domains in Zorro-1, Zorro-3, and Tad-1 departed from the canonical CX2CX4HX4C domain. Numbers within brackets refer to the residues between two conserved blocks

 
Analysis of Ylli ORF2
Several features characterize non–LTR retrotransposon ORF2. Depending on the element, these features include an apurinic endonuclease (APE) or a restriction enzyme–like endonuclease (REL-endo) domain, the RT domain, a C-terminal Zinc finger (ZN), and an RNase H (RNH) domain, the central RT domain being always present.

Because the best score in BLASTX comparisons of the entire Ylli ORF2p with proteins in databases in the initial search was obtained with L1Hs, we compared systematically the yeast ORF2p to this protein and to other ORF2s, including the fungal proteins. Pairwise comparisons over the entire ORFs gave an identity of 21.9% over 1,073 amino acids for Ylli and L1Hs. Pairwise FASTA comparisons between yeast proteins Ylli/Zorro-1, Ylli/Zorro-3, and Zorro-1/Zorro-3 gave 22.6% identity over 1,179 amino acids, 21.4% identity over 868 amino acids, and 29.6% identity over 1,073 amino acids, respectively. In addition, the sizes of the Ylli, Zorro-1, and Zorro-3 ORF2p are comparable to that of mammalian ORF2p (table 3). A recent search in public databases with Ylli ORF2 revealed that the best match (E value of 10^-50 compared with 10^-39 for human L1Hs) was with ORF2 of TRE3-A of the slime mold D. discoideum, with 25.2% identity over 865 amino acids.

Although the APE domain is not well conserved, we were able to detect such a domain in the N-terminal part of Ylli ORF2p. In retrotransposons of the L1 clade and in most non–LTR retrotransposon families, this domain is located in the N-terminal end. The multiple alignment shown in figure 4A indicated that, except for an aspartate replaced by a serine in the fifth block of the domain, all of the residues common to the non–LTR retrotransposons tested in Feng et al. (1996) are present in Ylli. In addition, the essential residues required for endonuclease activity detected so far in L1Hs (Feng et al. 1996) are also present in Ylli ORF2. By comparison, the APE domain, especially at its 5' end, seems to have diverged considerably in the Zorros as five common residues are missing in Zorro-1 and four in Zorro-3.



 
Fig. 4.—Amino acid sequence alignment of yeast ORF2 domains with related proteins. The sequences are from mammals, L1Hs (human L1), L1Md (mouse L1), from yeasts Ylli (Y. lipolytica), Zorro-1 and Zorro-3 (C. albicans), from filamentous fungi Tad-1 (N. crassa) and CgT1 (C. gloeosporioides), and from insect (B. mori). The amino acid sequences were aligned using CLUSTAL W and CLUSTAL X with minor manual adjustments. The N-terminus part of the CgT1 is missing. Conserved residues in all species are boxed in black. Conserved residues in more than six elements are boxed in gray with white letters. Conserved residues in less than six elements are boxed in gray with black letters. A, Comparison of the N-terminal endonuclease domain (APE). Conserved residues and putative endonuclease active site residues described by Feng et al. (1996) are indicated on top of the alignment. Numbers within brackets refer to the residues between two conserved blocks. Asterisks indicate the putative endonuclease active site residues described by Feng et al. (1996). B, Comparison of the RT domain. Amino acids that are common to seven out of eight species are displayed at the top of the alignment. Numbers within brackets indicate the number of residues between the shown conserved blocks. Bold numbers indicate expansions of the Ylli sequence. Figures below the sequences number the eight segments of reverse transcriptase defined by Xiong and Eickbush (1988) and modified by Malik, Burke, and Eickbush (1999). C, Comparison of the C-terminal Cys-His region. The Cys-rich regions of various elements defining a putative zinc finger within the C-terminus part of the ORF2 were aligned. Conserved cysteine and histidine are indicated above the alignment. Conserved residues in more than six elements are boxed in dark gray. Conserved residues in less than six elements are boxed in light gray. Numbers on the right indicate the location of the domain within the protein

 
From residue 468 to 828 of Ylli ORF2, we could easily align, using well-identified invariant residues in all the proteins tested, the first nine blocks of high conservation described by Malik, Burke, and Eickbush (1999) (fig. 4B), block 0 being the previously described Z domain. The C-terminal part of ORF2 is less conserved. Some non–LTR retrotransposons carry an RNase H domain, a Cys-rich motif, and a restriction enzyme–like endonuclease. We found in the C-terminal part of the three yeast ORF2 gene products a putative Cys-rich motif (fig. 4C). A schematic representation of the structure of Ylli and various non–LTR retrotransposons is shown in figure 5.

Phylogenetic Analysis
Because of its strong conservation, the RT domain has been widely used for phylogenetic studies of retrotransposons (Xiong and Eickbush 1990; Malik, Burke, and Eickbush 1999). The data shown previously, in particular the overall amino acid sequence comparisons, indicate that the structure of Ylli is closer to that of the elements belonging to the L1 family, in spite of some differences such as the longer ORF1 associated with the presence of a putative Cys-rich motif. To confirm this, we performed a phylogenetic analysis on the RT domain of the non–LTR retrotransposons, including the yeast element described here. We used at least two members of each of the clades described by Malik, Burke, and Eickbush (1999), and we added a few elements belonging to the L1 clade, including members of the D. discoideum TRE family that gave a good score in BLAST searches with Ylli ORF2 in nonredundant public databases. We also included the Zorro-1 and Zorro-3 sequences. Sequences were aligned with CLUSTAL W and CLUSTAL X (see Materials and Methods). Phylogenetic reconstruction was then performed on the RT domain, including the nine segments 0–7, as shown in figure 4B, and the less conserved segments 8 and 9, recently described by Malik, Burke, and Eickbush (1999). We included the intervening regions that separate the conserved blocks, even for the rare domains that showed interblock expansion. The phylogenetic tree obtained is shown in figure 6. The results confirmed the close relationship between the D. discoideum TRE elements, L1 from mammals, and Ylli. Ylli and the Zorros form a monophyletic group, but the branching separating the yeast clade from the clade that contains the mammalian retrotransposons and those of D. discoideum is not supported by a high bootstrap value. The poor support of the branching between mammalian L1s and the slime mold TRE was already observed by Malik, Burke, and Eickbush (1999).
On the other hand, the branching separating the mammalian L1s and the D. discoideum TRE from the other L1 families, including those from plants (maize and Arabidopsis) or animals (Xenopus), is better supported. It is clear from this tree and the sequence comparisons that Ylli forms a novel family that belongs to the L1 clade. We also confirm the classification of the Zorros within the L1 clade (Goodwin, Ormandy, and Poulter 2001). Interestingly, the yeast clade is quite distant from the Tad clade containing the fungal sequences. The yeast and fungal non–LTR retrotransposons are therefore clearly distinct from each other on the basis of ORF2 sequence comparisons.
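The idea behind the bootstrap values discussed above can be illustrated with a small sketch. It is not the procedure used for figure 6 (which relied on neighbor-joining trees): as a simplified stand-in, it resamples alignment columns with replacement and counts how often a chosen partner remains the nearest sequence to a query under a plain p-distance. The function names, the toy alignment, and the use of p-distance are all assumptions made for illustration.

```python
import random


def p_distance(a, b):
    """Fraction of gap-free alignment columns at which two sequences differ."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)


def nearest(alignment, query):
    """Name of the sequence closest to `query` (first one wins on ties)."""
    others = [name for name in alignment if name != query]
    return min(others, key=lambda name: p_distance(alignment[query], alignment[name]))


def bootstrap_support(alignment, query, partner, replicates=1000, seed=0):
    """Share of column-resampled replicates in which `partner` stays nearest to `query`."""
    rng = random.Random(seed)
    length = len(next(iter(alignment.values())))
    hits = 0
    for _ in range(replicates):
        # Draw `length` columns with replacement, as in a standard bootstrap.
        cols = [rng.randrange(length) for _ in range(length)]
        resampled = {
            name: "".join(seq[c] for c in cols) for name, seq in alignment.items()
        }
        if nearest(resampled, query) == partner:
            hits += 1
    return hits / replicates


# Hypothetical toy alignment; these are not real RT-domain sequences.
toy_alignment = {
    "elemA": "MKVLAAGIT",
    "elemB": "MKVLAAGIS",
    "elemC": "QRTWPPHEN",
}
support = bootstrap_support(toy_alignment, "elemA", "elemB", replicates=200)
```

A low value from such a resampling indicates that the grouping depends on a few columns only, which is the sense in which a branching is described as poorly supported.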



Fig. 6.—Phylogenetic relationships of non–LTR retrotransposons from yeasts and from various organisms. The phylogeny, based on the sequences covering the 11 conserved blocks of the RT domain defined by Malik, Burke, and Eickbush (1999), was obtained using the neighbor-joining method (Saitou and Nei 1987) and rooted on a group-II intron. Bootstrap values based on 1,000 replications are shown next to the nodes. Only bootstrap values above 50% are shown. The source of the sequences is described in table 1.

 
Localization of Ylli Within the Genome
Non–LTR retrotransposons have variable insertion specificity. The first described element, R1 from B. mori, was shown to insert specifically in ribosomal DNA (rDNA). Since then, several non–LTR retrotransposons were shown to insert specifically in rDNA, in telomeres, or in the vicinity of tRNAs. On the other hand, no specificity was observed for the rest of the non–LTR retrotransposons found in various hosts. For instance, L1s were shown to inactivate genes in mammals, causing several types of disease (Kazazian 1998). From the data shown in table 2, we conclude that Ylli does not have any obvious insertion specificity. Indeed, among the 26 RSTs carrying part of Ylli and part of the Y. lipolytica genome, we found that Ylli was associated with matches defined previously (Casaregola et al. 2000) in five RSTs, four to proteins and one to a tRNA (table 2 and fig. 7). Except for the strong match to a hypothetical Escherichia coli protein, for which Ylli may be inserted in the promoter region, Ylli was found to be inserted downstream from the detected ORFs in the opposite orientation. By comparing the sequence of Ylli to all Y. lipolytica sequences, we also found a 430-bp-long 5'-truncated Ylli insertion (including the poly-dA) in the promoter region of ylPEX10 of the strain CX161-1B (Iida et al., accession number AB036770). Interestingly, this insertion is not present in the same promoter from strain W29 (LeDall et al., accession number AJ012084). The RST XAW0AA002D05D1 is very likely chimeric, as it carries an internal part of Ylli of 705 bp, and the rest of the RST carries a sequence homologous to a S. cerevisiae tRNA. The RST AW0AA018C11D1 carries a short 114-bp-long match to mitochondrial DNA located 476 bp upstream of a 110-bp Ylli truncate. In that case, the RST AW0AA018C11T1 on the same clone entirely matches Y. lipolytica mtDNA.
The clone AW0AA018C11 is probably chimeric, but we cannot exclude that this clone carries a mitochondrial DNA insertion in nuclear DNA, as described in Ricchetti, Fairhead, and Dujon (1999).



Fig. 7.—Schematic representation of RSTs carrying part of Ylli and matches to known ORFs. The long arrow indicates the RST and the sequencing orientation. The dotted boxes indicate the 3' end of Ylli and the black triangle the poly-A tail. The open arrows indicate ORFs and their transcription orientation. The hatched arrow indicates a tRNA and its transcription orientation. All the representations are oriented with respect to Ylli transcription. The names of the RSTs and of the matching ORFs and tRNA are indicated. Distances between Ylli and the ORFs are not indicated, as they cannot be determined.

 
Distribution of Ylli Among Y. lipolytica Lineages
We amplified 866 bp of the 3' end of Ylli and used the PCR product as a probe to hybridize complete EcoRI and HindIII digestions of genomic DNA from three different strains of Y. lipolytica: W29, H222, and CX161-1B (see Materials and Methods). By choosing these strains, we selected a representative of each of the three known lineages of this species (Barth and Gaillardin 1997). Figure 2 shows that Ylli is present in all three lineages. This is in contrast with Ylt1, which is present only in the strains from the American lineage (Juretzek et al. 2001). The results shown in figure 2 confirm that Ylli is present in a high copy number in each strain and indicate a substantial RFLP between the strains tested, which very likely reflects variability of insertion.


    Discussion
 
This study clearly demonstrates the existence of a new non–LTR retrotransposon family in the yeast Y. lipolytica. Ylli, for Y. lipolytica LINE, shares most characteristics of non–LTR retrotransposons: it has two ORFs showing structure and sequence similarities with the ORFs of known non–LTR elements; the first ORF product carries a Cys-His domain, and the second carries an APE, an RT, and a Cys-His domain. Ylli also carries a homopolymeric d(A) tract, whose size varies with the element, located 409 bp downstream from the end of ORF2. Ylli is present in multiple copies in the Y. lipolytica genome and does not display any obvious insertion specificity.

We were not able to demonstrate that the sequence described in this paper corresponds to a full-length element. Although Ylli RT probably generates 5' truncates (see subsequently), we could not find in the RST library a number of different RSTs that would have allowed the detection of the Ylli 5' end. We only found four RSTs, corresponding to two different sequences, that, after additional sequencing on the clones that carried them, extended past the putative start of ORF1 and diverged 30 bp upstream from this start (fig. 1). One of these two sequences, whose 5' end overlapped with another RST (AW0AA027B04T1), carried several features that could constitute part of a promoter. A 35-bp-long region that is very T-rich and interrupted by a few Cs is located 27 and 52 bp upstream from two putative TATA boxes. Such a poly-dT tract was found to serve as an upstream promoter element for constitutive expression of two different genes in S. cerevisiae, HIS3 and DED1. The poly-dA complementary sequence of the promoter of HIS3 was also shown to act positively on the transcription of the divergent gene PET56 (Struhl 1985). Interestingly, such a poly-A, containing 27 A residues, is present 5 bp upstream from the putative start codon of C. albicans Zorro-3 ORF1. In Ylli, five direct repeats CAGCAA, which we called the Y-boxes, are clustered in the 476-bp-long sequence. This is very reminiscent of the 6-bp consensus motif CANNTG, called the E-box, first described in the human L1 promoter (Minakami et al. 1992) and also found in several copies in the turtle CR1 5' UTR (Kajikawa, Ohshima, and Okada 1997). In humans, these cis-elements are binding targets for basic helix-loop-helix (bHLH) proteins. We have no evidence that this sequence constitutes the 5' UTR of Ylli because Ylli could have inserted past the promoter of an unrelated gene.
This sequence nevertheless contains the 5' 35-bp repeat also found in the 3' end of the element; interestingly, the 5' 35-bp repeat itself contains one of the five Y-boxes detected in this region.
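Counting short motifs such as the Y-box (CAGCAA) or the E-box consensus (CANNTG) in a candidate upstream region is a simple pattern search. The following is a minimal sketch of such a scan, not the method used in this study; the helper names and the toy sequence are hypothetical, and only the bases A, C, G, T and the wildcard N are handled.

```python
import re


def motif_to_regex(motif):
    """Compile a motif in which N stands for any base into a regular expression."""
    parts = ["[ACGT]" if base == "N" else base for base in motif.upper()]
    # A lookahead is used so that overlapping occurrences are all reported.
    return re.compile("(?=(" + "".join(parts) + "))")


def find_motif(sequence, motif):
    """Return the 0-based start positions of every match of `motif` in `sequence`."""
    pattern = motif_to_regex(motif)
    return [match.start() for match in pattern.finditer(sequence.upper())]


# Hypothetical toy sequence, not taken from the Ylli element.
toy = "TTCAGCAATTTCACGTGTTCAGCAA"
y_boxes = find_motif(toy, "CAGCAA")   # exact hexamer
e_boxes = find_motif(toy, "CANNTG")   # consensus with two free positions
```

Only the top strand is scanned here; a search on both strands would also need the reverse complement of the sequence.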

The presence of non–LTR retrotransposons in yeast raises two questions. The first one concerns the existence of an internal promoter, which has never been documented in yeasts. The other question concerns the mechanism involved in the translation of both ORFs of Ylli. In most yeast LTR retrotransposons, ribosomal frameshifting ensures the translation of both ORFs, including the downregulation of ORF2, without the need for reinitiation of translation. Recently, the existence of a second mechanism was suggested for Tca2 (previously called pCA1) in C. albicans (Matthews et al. 1997). This mechanism uses the read-through suppression of an in-frame stop codon that separates the two ORFs of the retrotransposon. In Ylli, ORF1 ends with a TAA stop codon, and 2 bp separate this stop codon from the putative ORF2 start codon. This structure suggests that, as in other non–LTR retrotransposons (Bouhidel, Terzian, and Pinon 1994; Szafranski et al. 1999), translation may be reinitiated past the ORF1 stop codon by a mechanism as yet undescribed in yeasts.

Another non–LTR retrotransposon characteristic shared by Ylli is the presence of 5'-truncated elements, probably resulting from the low processivity of the reverse transcriptase. We found four RSTs along Ylli that carry part of Ylli and are associated with genomic DNA at the 5' end. We also showed, by hybridization of genomic DNA to 5'- and 3'-specific probes, that fewer bands were detected with the 5'-specific probe. A puzzling feature of Ylli is the existence of a large number of short repeats matching the end of the element, including the poly-A tail; some of these could constitute extreme 5' truncations of Ylli. These short repeats are also reminiscent of SINEs, which are found in very high copy number in mammals and are derived from a tRNA gene promoter associated with the 3' end of non–LTR retrotransposons. SINEs thus have two conserved sequences, boxes A and B, conserved in promoters of RNA polymerase-III–transcribed genes. The Ylli repeats are much shorter than SINE 3' ends (126 out of the 19 detected short repeats are less than 78 bp long), and no homology to the conserved boxes was detected in the Y. lipolytica repeats, ruling out the possibility that the Y. lipolytica 3' short repeats could be classical SINEs. Such conserved boxes were found in the Y. lipolytica 7SL RNA genes, SCR1 and SCR2 (He et al. 1989). Alternatively, the presence of these short repeats could be explained by the existence of two 35-bp repeats immediately upstream from ORF1 and immediately upstream from the poly-A tract. By analogy to the presence of solo LTRs in S. cerevisiae, which result from recombination between LTRs and excision of the Tys, the short repeats may be generated by imprecise recombination between the two 35-bp direct repeats and excision of the element.
This is consistent with the fact that most of the junctions between genomic DNA and short repeats are located in or immediately upstream of the 3'-end 35-bp direct repeat, although this homology is very short to allow homologous recombination. Although the role of these 35-bp repeats in the life cycle of Ylli cannot be explained, their remarkable sequence conservation is consistent with their possible involvement in the excision of Ylli by recombination.

A likely possibility is that these short 3' repeats are extreme 5' truncates caused by a secondary structure generated by the dyad symmetry overlapping the 3'-end 35-bp direct repeat shown in figure 1. This structure might reduce reverse transcription by provoking abortive transcription. This is consistent with the few 5' truncates detected along Ylli, which indicate that the Ylli reverse transcriptase seems to be more processive than those of higher eukaryotes. The mechanism mentioned here could therefore be involved in the regulation of the propagation of Ylli. Again, the fact that most of the junctions between genomic DNA and short repeats are located in or immediately upstream of the dyad symmetry fits this explanation. It was shown that the last 250 nucleotides of the 3' UTR of the B. mori R2 non–LTR element were required for reverse transcription, and that secondary structure was likely to be involved in the binding of the transcript to the reverse transcriptase during initiation of transcription (Luan and Eickbush 1995). The 3'-end sequence of different non–LTR elements was poorly conserved but seemed to be able to form secondary structures with common characteristics (Mathews et al. 1997), supporting a role for the putative secondary structure formed by the Ylli 3' end. Further work is necessary to choose between these hypotheses.
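A dyad symmetry of the kind invoked above is an inverted repeat: a stretch of sequence followed, after a short spacer, by its reverse complement, so that the transcript can fold into a hairpin. The sketch below shows one simple way to look for such arrangements; it is an illustration only, with hypothetical function names, arbitrary default parameters, and a toy sequence that is not the Ylli 3' end. It requires perfect arm complementarity and does not evaluate folding energy.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_dyads(seq, stem=6, max_loop=10):
    """List (left_start, right_start, loop_length) for perfect inverted repeats.

    Two arms of `stem` bases are reported when the right arm is the reverse
    complement of the left arm and at most `max_loop` bases separate them.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem + 1):
        target = reverse_complement(seq[i:i + stem])
        for loop in range(max_loop + 1):
            j = i + stem + loop
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == target:
                hits.append((i, j, loop))
    return hits


# Hypothetical toy sequence: arm GGCTCA, spacer TTTT, arm TGAGCC.
toy = "AAGGCTCATTTTTGAGCCTT"
dyads = find_dyads(toy)
```

Because overlapping windows of a longer stem are each reported, a real analysis would merge adjacent hits into a single stem before interpreting them.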

Non–LTR retrotransposons, especially in mammals, are present in high copy number. This is less true for elements in other organisms, such as Swimmer in the teleost fish medaka (Duvernell and Turner 1998). On the basis of the number of Ylli 3' ends containing a poly-A tract in the Génolevures sequences and the coverage of the Y. lipolytica genome in the Génolevures project, we have estimated the number of Ylli copies in the Y. lipolytica W29 genome at over 100 if the short repeats are taken into account or at 15–20 if they are not. Considering that we are dealing with only one element, this is a high figure for a yeast, as 52 LTR retrotransposons (Tys) were found in the sequenced strain of S. cerevisiae, and a total of 250 remnants of previous insertions of Tys, represented by solo LTRs, exist in this strain. Recent work performed in the Génolevures project and in the laboratory indicates that various yeast species tend to have fewer LTR-retrotransposon copies than S. cerevisiae, and that species belonging to the Zygosaccharomyces, Kluyveromyces, and Saccharomyces genera carry very few, if any, LTR retrotransposons (Génolevures 2000; unpublished data). In Y. lipolytica, the only characterized LTR retrotransposon, Ylt1, is present at 35 copies, and 50–60 solo LTRs can be found (Schmid-Berger, Schmid, and Barth 1994).
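The reasoning behind a copy-number estimate drawn from a partial random-sequencing survey can be written as a short calculation. The text does not state the formula that was used, so the sketch below rests on an assumption: reads are taken to fall at random on the genome, so that a given site is sampled at least once with probability 1 - exp(-c), where c is the sequencing coverage. The function name and the coverage value in the example are hypothetical and are not figures from this study.

```python
import math


def estimate_copy_number(observed_copies, coverage):
    """Scale the number of distinct copies seen by the chance of sampling a site.

    `coverage` is the average number of reads covering a genomic site. Under
    random (Poisson) sampling, a site is seen at least once with probability
    1 - exp(-coverage); dividing by that probability corrects for unseen copies.
    """
    if observed_copies < 0:
        raise ValueError("observed_copies must be non-negative")
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    detection_probability = 1.0 - math.exp(-coverage)
    return observed_copies / detection_probability


# Illustrative call: 19 distinct 3' ends (the count reported in the abstract)
# and a coverage of 0.2, which is a made-up value chosen only for the example.
example_estimate = estimate_copy_number(19, 0.2)
```

At low coverage the correction is close to a simple division by the coverage, so the estimate is very sensitive to how well the coverage itself is known.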

This is in contrast with the low number of representatives of the Zorro families of non–LTR retrotransposon-like elements found in C. albicans (Goodwin, Ormandy, and Poulter 2001), detected by hybridization with the RT gene of Zorro-1 and Zorro-3. We found that both elements of C. albicans, Zorro-1 and Zorro-3, are unique and that 5' truncates were absent from the available Stanford C. albicans sequence assembly 6, except for a 624-bp-long 5' truncate of Zorro-1 and a few central parts of Zorro-3 ORF2 that are very likely caused by genomic rearrangements (see also Goodwin, Ormandy, and Poulter 2001). This might indicate that Ylli and the Zorros do not use the same mechanism for propagation, that they are not regulated in the same way, or that Zorros are being lost from the C. albicans genome. Indeed, one degenerate Zorro-2 was detected, and some strains do not carry Zorro-1 or Zorro-3 (Goodwin, Ormandy, and Poulter 2001), whereas this is not the case for Ylli. The yeasts Y. lipolytica and C. albicans are both dimorphic and related on the basis of 26S rDNA sequence comparison (Souciet et al. 2000). Although it is thus tempting to link dimorphism and the presence of non–LTR retrotransposons in the genome, we also studied 12 other yeasts in the Génolevures project, including Candida tropicalis, a dimorphic yeast close to C. albicans, but no non–LTR element-like sequences were detected in any of the yeasts studied in this particular project. This rules out the possibility that dimorphism and non–LTR retrotransposon maintenance could be somehow linked.

We have shown that, unlike the LTR retrotransposon Ylt1, Ylli is present in all known traceable strain lineages: American, French, and German. On the basis of RFLP analysis, clear differences could be seen between strains, indicating that this element may still be active. In this respect, we detected an insertion within the promoter of the PEX10 gene in the "American" strain CX161-1B that is not present at the same location in the "French" strain W29. In addition, we detected few single nucleotide polymorphisms (SNPs) in the Ylli family, and none of these SNPs affected the ORFs by introducing stop codons. The variability between strains and the high sequence conservation among the copies of Ylli suggest that Ylli transposed recently.

We have clearly shown, on the basis of the RT domain phylogeny and overall ORF2 sequence comparison, that Ylli, like the Zorros, belongs to the L1 clade, one of the oldest non–LTR retrotransposon clades. We also found that the yeast non–LTR retrotransposons are closely related to the mammalian L1 and to the members of the D. discoideum TRE family. From various sequence alignments, Malik, Burke, and Eickbush (1999) could date the acquisition of the diverse domains that constitute ORF2s and the transition from elements made of one single ORF to elements that contain two ORFs. Ylli and Zorro-1 ORF1s are long (up to 714 amino acids for Ylli) and carry Cys-rich motifs in the second half of the ORF gene product. They thus display a size and a structure that closely resemble those of the most recent non–LTR element clades (Tad1, R1, LOA...), whereas the phylogenetic analysis of the RT domain as well as the comparison of entire ORF2 sequences clearly placed the yeast elements within the L1 clade.

We also showed that Ylli does not have any obvious insertion specificity. The closest elements, the TRE non–LTR retrotransposons from D. discoideum, are site specific. This confirms that RT phylogeny does not differentiate between elements with different mechanisms of insertion, as pointed out by Malik, Burke, and Eickbush (1999). Ylli is the only retrotransposon described in yeast, together with Ylt1, that does not insert in the vicinity of tRNA genes and therefore could constitute an efficient tool for insertional mutagenesis.

Together with the Zorros in C. albicans, this work confirms the presence of non–LTR retrotransposons in hemiascomycetous yeasts. It further strongly suggests that non–LTR retrotransposons in yeasts are restricted to a few species. Interestingly, both species have the ability to filament. Among the yeast non–LTR elements, Ylli was shown to be closest to the already described elements. Indeed, Ylli may still be active, as sequence conservation between the various copies would suggest. In addition to the evolutionary implications of this work, analysis of the various mechanisms of retrotransposition will be greatly facilitated in a genetically amenable yeast species like Y. lipolytica.


    Supplementary Material
 
The composite sequence of Ylli was deposited at EMBL under accession number AJ319752. The sequence alignment used for the phylogenetic tree shown in figure 6 is available at the web site http://www.molbiolevol.org/.


    Acknowledgements
 
S.C. is grateful to Bernard Labedan (IGM, Orsay) for help with phylogenetic trees and to Robert Kelly (Institut Pasteur, Paris) for reading the manuscript. This work was supported by INRA, CNRS, and the GDR/CNRS 2354 "Génolevures II". E.B. was supported by the EEC scientific research grant QLRI-1999-01333. Part of this work was supported by a BRG grant (ressources génétiques des microorganismes Number 11-0926-99).


    Footnotes
 
Thomas Eickbush, Reviewing Editor

Abbreviations: ORF, open reading frame; LTR, long terminal repeat; UTR, untranslated region.

Keywords: yeast non–LTR retrotransposon Yarrowia lipolytica repeats phylogeny

Address for correspondence and reprints: Serge Casaregola, Collection de Levures d'Intérêt Biotechnologique, Laboratoire de Génétique Moleculaire et Cellulaire, INRA UR216, CNRS URA1925, INA-PG, F-78850 Thiverval-Grignon, France. serge.casaregola{at}grignon.inra.fr


    References
 

    Altschul S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, D. J. Lipman, 1997 Gapped BLAST and PSI-BLAST: a new generation of protein database search programs Nucleic Acids Res 25:3389-3402

    Artiguenave F., P. Wincker, P. Brottier, S. Duprat, F. Jovelin, C. Scarpelli, J. Verdier, V. Vico, J. Weissenbach, W. Saurin, 2000 Genomic exploration of the hemiascomycetous yeasts: 2. Data generation and processing FEBS Lett 487:13-16

    Bailey J. A., L. Carrel, A. Chakravarti, E. E. Eichler, 2000 Molecular evidence for a relationship between LINE-1 elements and X chromosome inactivation: the Lyon repeat hypothesis Proc. Natl. Acad. Sci. USA 97:6634-6639

    Barth G., C. Gaillardin, 1997 Physiology and genetics of the dimorphic fungus Yarrowia lipolytica FEMS Microbiol. Rev 19:219-237

    Bouhidel K., C. Terzian, H. Pinon, 1994 The full-length transcript of the I factor, a LINE element of Drosophila melanogaster, is a potential bicistronic RNA messenger Nucleic Acids Res 22:2370-2374

    Cambareri E. B., J. Helber, J. A. Kinsey, 1994 Tad1-1, an active LINE-like element of Neurospora crassa Mol. Gen. Genet 242:658-665

    Casaregola S., C. Feynerol, M. Diez, P. Fournier, C. Gaillardin, 1997 Genomic organization of the yeast Yarrowia lipolytica Chromosoma 106:380-390

    Casaregola S., C. Neuveglise, A. Lepingle, E. Bon, C. Feynerol, F. Artiguenave, P. Wincker, C. Gaillardin, 2000 Genomic exploration of the hemiascomycetous yeasts: 17. Yarrowia lipolytica FEBS Lett 487:95-100

    Casaregola S., H. V. Nguyen, A. Lepingle, P. Brignon, F. Gendre, C. Gaillardin, 1998 A family of laboratory strains of Saccharomyces cerevisiae carry rearrangements involving chromosomes I and III Yeast 14:551-564

    Chibana H., B. B. Magee, S. Grindle, Y. Ran, S. Scherer, P. T. Magee, 1998 A physical map of chromosome 7 of Candida albicans Genetics 149:1739-1752

    Church G. M., W. Gilbert, 1984 Genomic sequencing Proc. Natl. Acad. Sci. USA 81:1991-1995

    Duvernell D. D., B. J. Turner, 1998 Swimmer 1, a new low-copy-number LINE family in teleost genomes with sequence similarity to mammalian L1 Mol. Biol. Evol 15:1791-1793

    Esnault C., J. Maestre, T. Heidmann, 2000 Human LINE retrotransposons generate processed pseudogenes Nat. Genet 24:363-367

    Ewing B., P. Green, 1998 Base-calling of automated sequencer traces using phred. II. Error probabilities Genome Res 8:186-194

    Ewing B., L. Hillier, M. C. Wendl, P. Green, 1998 Base-calling of automated sequencer traces using phred. I. Accuracy assessment Genome Res 8:175-185

    Feng Q., J. V. Moran, H. H. Kazazian Jr., J. D. Boeke, 1996 Human L1 retrotransposon encodes a conserved endonuclease required for retrotransposition Cell 87:905-916

    Furano A. V., 2000 The biological properties and evolutionary dynamics of mammalian LINE-1 retrotransposons Prog. Nucleic Acid Res. Mol. Biol 64:255-294

    Génolevures, 2000 Genomic exploration of the hemiascomycetous yeasts FEBS Lett 487

    Goodwin J. D., J. E. Ormandy, R. T. M. Poulter, 2001 L1-like non LTR retrotransposons in the yeast Candida albicans Curr. Genet 39:83-91

    Goodwin T. J., R. T. Poulter, 2000 Multiple LTR-retrotransposon families in the asexual yeast Candida albicans Genome Res 10:174-191

    ———. 2001 The diversity of retrotransposons in the yeast Cryptococcus neoformans Yeast 18:865-880

    Gordon D., C. Abajian, P. Green, 1998 Consed: a graphical tool for sequence finishing Genome Res 8:195-202

    Goyon C., J. L. Rossignol, G. Faugeron, 1996 Native DNA repeats and methylation in Ascobolus Nucleic Acids Res 24:3348-3356

    Hamer J. E., L. Farrall, M. J. Orbach, B. Valent, F. G. Chumley, 1989 Host species-specific conservation of a family of repeated DNA sequences in the genome of a fungal plant pathogen Proc. Natl. Acad. Sci. USA 86:9981-9985

    He C., J. P. Nourse, S. Kelemu, J. A. Irwin, J. M. Manners, 1996 CgT1: a non–LTR retrotransposon with restricted distribution in the fungal phytopathogen Colletotrichum gloeosporioides Mol. Gen. Genet 252:320-331

    He F., J. M. Beckerich, V. Ribes, D. Tollervey, C. M. Gaillardin, 1989 Two genes encode 7SL RNAs in the yeast Yarrowia lipolytica Curr. Genet 16:347-350

    Higgins D. G., J. D. Thompson, T. J. Gibson, 1996 Using CLUSTAL for multiple sequence alignments Methods Enzymol 266:383-402

    Hohjoh H., M. F. Singer, 1997 Sequence-specific single-strand RNA binding protein encoded by the human LINE-1 retrotransposon EMBO J 16:6034-6043

    Hutchinson C. A. III, S. C. Hardies, D. D. Loeb, W. R. Shehee, M. H. Edgell, 1989 LINEs and related retrotransposons Pp. 593–617 in D. E. Berg and M. M. Howe, eds. Mobile DNA. American Society for Microbiology, Washington DC

    Juretzek T., M. Le Dall, S. Mauersberger, C. Gaillardin, G. Barth, J.-M. Nicaud, 2001 Vectors for gene expression and amplification in the yeast Yarrowia lipolytica Yeast 18:97-113

    Kajikawa M., K. Ohshima, N. Okada, 1997 Determination of the entire sequence of turtle CR1: the first open reading frame of the turtle CR1 element encodes a protein with a novel zinc finger motif Mol. Biol. Evol 14:1206-1217

    Kazazian H. H. Jr., 1998 Mobile elements and disease Curr. Opin. Genet. Dev 8:343-350

    Kordis D., F. Gubensek, 1998 Unusual horizontal transfer of a long interspersed nuclear element between distant vertebrate classes Proc. Natl. Acad. Sci. USA 95:10704-10709

    Kurtzman C. P., C. J. Robnett, 1998 Identification and phylogeny of ascomycetous yeasts from analysis of nuclear large subunit (26S) ribosomal DNA partial sequences Antonie Leeuwenhoek 73:331-371

    Le Q. H., S. Wright, Z. Yu, T. Bureau, 2000 Transposon diversity in Arabidopsis thaliana Proc. Natl. Acad. Sci. USA 97:7376-7381

    Li W. H., Z. Gu, H. Wang, A. Nekrutenko, 2001 Evolutionary analyses of the human genome Nature 409:847-849

    Luan D. D., T. H. Eickbush, 1995 RNA template requirements for target DNA-primed reverse transcription by the R2 retrotransposable element Mol. Cell. Biol 15:3882-3891

    Malik H. S., W. D. Burke, T. H. Eickbush, 1999 The age and evolution of non–LTR retrotransposable elements Mol. Biol. Evol 16:793-805

    Martin S. L., F. D. Bushman, 2001 Nucleic acid chaperone activity of the ORF1 protein from the mouse LINE-1 retrotransposon Mol. Cell. Biol 21:467-475

    Mathews D. H., A. R. Banerjee, D. D. Luan, T. H. Eickbush, D. H. Turner, 1997 Secondary structure model of the RNA recognized by the reverse transcriptase from the R2 retrotransposable element RNA 3:1-16

    Matthews G. D., T. J. Goodwin, M. I. Butler, T. A. Berryman, R. T. Poulter, 1997 pCal, a highly unusual Ty1/copia retrotransposon from the pathogenic yeast Candida albicans J. Bacteriol 179:7118-7128

    Minakami R., K. Kurose, K. Etoh, Y. Furuhata, M. Hattori, Y. Sakaki, 1992 Identification of an internal cis-element essential for the human L1 transcription and a nuclear factor(s) binding to the element Nucleic Acids Res 20:3139-3145

    Page R. D., 1996 TreeView: an application to display phylogenetic trees on personal computers Comput. Appl. Biosci 12:357-358

    Pearson W. R., D. J. Lipman, 1988 Improved tools for biological sequence comparison Proc. Natl. Acad. Sci. USA 85:2444-2448

    Picologlou S., M. E. Dicig, P. Kovarik, S. W. Liebman, 1988 The same configuration of Ty elements promotes different types and frequencies of rearrangements in different yeast strains Mol. Gen. Genet 211:272-281

    Ricchetti M., C. Fairhead, B. Dujon, 1999 Mitochondrial DNA repairs double-strand breaks in yeast chromosomes Nature 402:96-100

    Romano A., S. Casaregola, P. Torre, C. Gaillardin, 1996 Use of RAPD and mitochondrial DNA RFLP for typing of Candida zeylanoides and Debaryomyces hansenii yeast strains isolated from cheese Syst. Appl. Microbiol 19:255-264

    Saitou N., M. Nei, 1987 The neighbor-joining method: a new method for reconstructing phylogenetic trees Mol. Biol. Evol 4:406-425

    Sambrook J., E. Fritsch, T. Maniatis, 1989 Molecular cloning: a laboratory manual Cold Spring Harbor, Cold Spring Harbor Laboratory Press, New York

    Sassaman D. M., B. A. Dombroski, J. V. Moran, M. L. Kimberland, T. P. Naas, R. J. DeBerardinis, A. Gabriel, G. D. Swergold, H. H. Kazazian Jr., 1997 Many human L1 elements are capable of retrotransposition Nat. Genet 16:37-43

    Schmid-Berger N., B. Schmid, G. Barth, 1994 Ylt1, a highly repetitive retrotransposon in the genome of the dimorphic fungus Yarrowia lipolytica J. Bacteriol 176:2477-2482

    Souciet J., M. Aigle, F. Artiguenave, et al. (21 co-authors) 2000 Genomic exploration of the hemiascomycetous yeasts: 1. A set of yeast species for molecular evolution studies FEBS Lett 487:3-12

    Struhl K., 1985 Naturally occurring poly(dA-dT) sequences are upstream promoter elements for constitutive transcription in yeast Proc. Natl. Acad. Sci. USA 82:8419-8423

    Szafranski K., G. Glockner, T. Dingermann, K. Dannat, A. A. Noegel, L. Eichinger, A. Rosenthal, T. Winckler, 1999 Non–LTR retrotransposons with unique integration preferences downstream of Dictyostelium discoideum tRNA genes Mol. Gen. Genet 262:772-780

    The Arabidopsis Genome Initiative. 2000 Analysis of the genome sequence of the flowering plant Arabidopsis thaliana Nature 408:796-815

    Thompson J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, D. G. Higgins, 1997 The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools Nucleic Acids Res 25:4876-4882

    Volff J. N., C. Korting, A. Meyer, M. Schartl, 2001 Evolution and continuous distribution of Rex3 retrotransposon in fish Mol. Biol. Evol 18:427-431

    Warmington J. R., R. P. Green, C. S. Newlon, S. G. Oliver, 1987 Polymorphisms on the right arm of yeast chromosome III associated with Ty transposition and recombination events Nucleic Acids Res 15:8963-8982

    Xiong Y., T. H. Eickbush, 1988 The site-specific ribosomal DNA insertion element R1Bm belongs to a class of non–long-terminal-repeat retrotransposons Mol. Cell. Biol 8:114-123

    ———. 1990 Origin and evolution of retroelements based upon their reverse transcriptase sequences EMBO J 9:3353-3362

Accepted for publication December 18, 2001.