Collection de Levures d'Intérêt Biotechnologique, Laboratoire de Génétique Moleculaire et Cellulaire, Thiverval-Grignon, France
Introduction
NonLTR retroelements (also called LINEs, for Long Interspersed Repetitive Elements) have been found in all eukaryotes, including plants, mammals, and fungi. The most extensively studied nonLTR retrotransposon, L1, was found to constitute 15.4% of the human genome (Li et al. 2001). L1 consists of a 5' untranslated region (UTR) of variable length and structure that carries a polymerase-II promoter, two open reading frames (ORFs), and a 3' UTR terminating in a poly-A tract (Hutchinson III et al. 1989), whereas the oldest nonLTR retrotransposon families are constituted by a single ORF. ORF1 is considered to be the equivalent of the retroviral gag gene, although the proteins do not show sequence conservation. The human ORF1 gene product was shown to bind L1 RNA in vitro (Hohjoh and Singer 1997). ORF2 encodes several domains with enzymatic characteristics, including endonuclease and reverse transcriptase (RT) activities involved in the retrotransposition process, and a cysteine-rich (Cys-rich) motif at the C-terminus thought to be involved in nucleic acid binding. A total of 11 families of nonLTR retrotransposons were recently described, and variability between families was proposed to be linked to the gain or loss of various domains in both ORFs (Malik, Burke, and Eickbush 1999). NonLTR retrotransposon maintenance is based on vertical transmission (Malik, Burke, and Eickbush 1999; Furano 2000), although some instances of horizontal transfer have been documented (Kordis and Gubensek 1998; Volff et al. 2001).
Propagation of nonLTR retrotransposons requires a reverse transcription step, and these elements are mostly found truncated because of abortive reverse transcription. Over 800,000 copies are present in humans, but very few are entire. Even fewer (30 to 60) are capable of transposition (Sassaman et al. 1997). Besides colonizing their host genome, nonLTR retrotransposons were recently found to be responsible for generating pseudogenes (Esnault, Maestre, and Heidmann 2000). NonLTR retrotransposons were also proposed to be involved in the propagation of X chromosome inactivation in mammals (Bailey et al. 2000).
NonLTR retrotransposons have been detected in four fungal species: Tad-1 in Neurospora crassa (Cambareri, Helber, and Kinsey 1994), MGR583 in Magnaporthe grisea (Hamer et al. 1989), CgT1 in Colletotrichum gloeosporioides (He et al. 1996), and Mars in Ascobolus immersus (Goyon, Rossignol, and Faugeron 1996), although no entire element was described for the last species. Whereas LTR retrotransposons are widely distributed in all yeast species tested, except in Pichia sorbitophila (Génolevures 2000; Goodwin and Poulter 2000; unpublished data), no nonLTR retrotransposons had been found in yeast until recently. Chibana et al. (1998) mentioned a gene, LRT2, that displayed homology with the nonLTR retrotransposon RT and hybridized to three chromosomes in the yeast Candida albicans. Very recently, three elements, the Zorros, phylogenetically related to the L1 clade, were described in C. albicans (Goodwin, Ormandy, and Poulter 2001). Interestingly, only one was full length; the others were either truncated at the 5' end or degenerate. In addition, unlike most nonLTR retrotransposons, the Zorros are present in only a few copies per genome. Another nonLTR retrotransposon family was very recently described in the basidiomycetous yeast Cryptococcus neoformans and was found to belong to the most ancient clade, CRE (Goodwin and Poulter 2001).
Yarrowia lipolytica is a yeast with many unusual properties. It is dimorphic, and it can grow on fatty acids and alkanes. It is heterothallic; although it can display sexuality, most of its isolates are haploid, reminiscent of filamentous fungi. In contrast to other yeasts, the Y. lipolytica genome shares several properties with higher eukaryotes, such as dispersion of the rDNA clusters and the 5S RNA genes and the presence of a typical Signal Recognition Particle 7S RNA (Barth and Gaillardin 1997). Despite the numerous features that Y. lipolytica shares with higher eukaryotes and filamentous fungi, rDNA sequence phylogeny placed it unambiguously among the hemiascomycetous yeasts (Kurtzman and Robnett 1998). A retrotransposon, Ylt1, belonging to the Ty3-gypsy group and bounded by unusually long (714 bp) LTRs, was identified (Schmid-Berger, Schmid, and Barth 1994). Surprisingly, Ylt1 is not systematically present in all wild isolates of Y. lipolytica (Juretzek et al. 2001).
A recent genome survey of Y. lipolytica based on random sequencing (Casaregola et al. 2000) confirmed that no sequence homologous to Ylt1 could be found in the wild French isolate W29, as suggested by Juretzek et al. (2001), but that another Ty3-like retrotransposon was present. This work also revealed BLASTX matches to mammalian nonLTR retrotransposons. Here, we report the characterization of this Y. lipolytica nonLTR retrotransposon family.
Materials and Methods
DNA Techniques
Common DNA manipulations were performed as described in Sambrook, Fritsch, and Maniatis (1989). Restriction enzymes were purchased from GIBCO-BRL and BioLabs. Blotting of DNA onto a GeneScreen nylon membrane (DuPont) was performed according to Casaregola et al. (1998). DNA-DNA hybridization was performed according to Church and Gilbert (1984), with [α-32P]dCTP-labeled DNA probes prepared using the Megaprime labeling kit (Amersham, U.K.). Final washes were done at 65°C in 0.1 x SSC, 0.1% SDS. For pulsed-field gel electrophoresis, genomic DNA in agarose plugs was prepared according to Casaregola et al. (1997). Chromosomes were separated using a BioRad CHEF MAPPER apparatus in 0.5 x TBE running buffer at 14°C for 15 h 39 min with pulses from 0.05 to 0.92 s in 1% pulsed-field certified BioRad agarose gels with voltages of 9 and 6 V/cm and an angle of 180°.
The program consed (Gordon, Abajian, and Green 1998) was used for the design of the primers. The primers used for PCR amplification of the 5'-end of Ylli are 5'-CCTCGAGCACCTCGATA and 5'-GAGCCGGTAAGCCTAGC, and those for the 3'-end are 5'-TGACCTTCGAAAACACTACAT and 5'-GTACGCTTACACGAATATTACTAA. PCR amplifications were performed with 50 ng of genomic DNA prepared according to Romano et al. (1996) using a Crocodile III or a Perkin-Elmer 9600 thermocycler. Amplification conditions were as follows: 4 min at 94°C; 25 or 30 cycles consisting of 30 s at 94°C, 30 s at the Tm of the primers, and 1 min per kb to be amplified at 72°C; followed by 7 min at 72°C. Appligene Taq polymerase (Oncor) (2.5 units) in the supplied buffer was used. The PCR products were run on a 0.8% agarose gel (ICN) in 1 x TAE buffer.
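The cycling conditions above can be written out as a small sketch. The Tm formula is an assumption: the text does not say how primer Tm was computed, so the rough Wallace rule (2°C per A or T, 4°C per G or C) is used here for illustration only, and taking the lower Tm of a primer pair as the annealing temperature is likewise an assumption. The function names are hypothetical.

```python
def wallace_tm(primer):
    """Rough melting temperature in deg C by the Wallace rule.

    Assumption: the source does not state which Tm formula was used.
    """
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def pcr_program(primers, product_kb, cycles=25):
    """Return the program as (step, temperature in deg C, seconds) tuples."""
    # Assumption: anneal at the lower Tm of the primer pair.
    anneal = min(wallace_tm(p) for p in primers)
    extension_s = round(60 * product_kb)  # 1 min per kb to be amplified
    return [
        ("initial denaturation", 94, 240),
        (f"{cycles}x denaturation", 94, 30),
        (f"{cycles}x annealing", anneal, 30),
        (f"{cycles}x extension", 72, extension_s),
        ("final extension", 72, 420),
    ]


# Primers given in the text for the 5'-end of Ylli; the 1.5-kb product
# size is a made-up value for illustration.
five_prime = ["CCTCGAGCACCTCGATA", "GAGCCGGTAAGCCTAGC"]
for step in pcr_program(five_prime, product_kb=1.5):
    print(step)
```

More accurate Tm estimates (for example nearest-neighbor models) would give different annealing temperatures; the sketch only illustrates how the stated program is parameterized by primer Tm and product length.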
Sequencing and Sequence Assembly
Most of the sequences used in this study were generated during the Génolevures project (Artiguenave et al. 2000; Casaregola et al. 2000) on strain W29. Additional sequences were obtained from clones of the Y. lipolytica genomic DNA library used in Génolevures. Sequence assembly was performed using programs phred (Version 0.980904.c) and phrap (Version 0.960731) with a minscore of 14 and a minmatch of 30 (Ewing and Green 1998; Ewing et al. 1998). The sequence compilation was edited with consed, Version 10.38 (beta) (Gordon, Abajian, and Green 1998).
Sequence Analysis
BLAST (Altschul et al. 1997) was used to screen sequence databases for homology. Sequences were analyzed with various programs in the GCG environment (Genetics Computer Group, Madison), including FASTA (Pearson and Lipman 1988). The alignment of the RT domains of nonLTR retrotransposons described by Malik, Burke, and Eickbush (1999) was used as a basis for the alignment shown in figure 5. Sequence alignments were generated using CLUSTAL W (Higgins, Thompson, and Gibson 1996) and CLUSTAL X (Thompson et al. 1997) and were manually adjusted in Genedoc (http://www.psc.edu/biomed/genedoc). Phylogenetic trees were generated by the neighbor-joining method (Saitou and Nei 1987) and were visualized with Treeview, Version 1.6.5 (Page 1996).
Results
Whereas the 3'-end of Ylli could be identified without difficulty (see previously), we were unable to find more than two clones with divergent sequences immediately upstream of ORF1 in our Y. lipolytica library. Therefore, we could not define precisely the start of the element, and we have no evidence either for the location of the 5'-end of Ylli in the contig 02-1100 or for the presence of an entire element in the Y. lipolytica genome. In addition, nonLTR retrotransposon transposition creates a 7- to 20-bp genomic duplication bounding the element, and no such duplication was detected. Of the two sequences associated with ORF1 that were generated by additional sequencing of existing clones, the one included in the sequence compilation 02-1100 spans 476 bp and carries features reminiscent of a yeast promoter. We found a poly-T tract, interrupted by three Cs, 121 bp upstream from the first ATG of ORF1 with the following sequence: T9 C T3 C T8 C T2 C T7. This type of poly-dT, found in the promoter of three different genes of S. cerevisiae and located upstream from a TATA box, was shown to act as a cis-activating sequence (Struhl 1985). Two putative TATA boxes are present 27 and 52 bp downstream from the poly-dT (fig. 1B). We also found a 6-bp-long sequence, CAGCAA, that we called the Y box, repeated five times within the 476 bp upstream from ORF1, with two of these repeats in tandem. Interestingly, within this region, a 35-bp sequence, CTTTCAC A/G GCAAAGTTATAATTAAATGAAT A/G TATA, located 31 bp upstream of the Met codon of ORF1, is directly repeated, with two mismatches, at the end of the 3' UTR, ending 5 bp upstream from the poly-A tract (fig. 1B). The first mismatch (A/G) is caused by two different sequences within the compilation, whereas the second mismatch (A/G) corresponds to a divergence between the 5' and the 3' repeats. It is noteworthy that one of the five Y boxes that we detected in the sequence upstream from ORF1 is located within the 35-bp repeat, supporting the idea that the 476-bp sequence may belong to Ylli. Further work will be necessary to demonstrate this possibility. In addition to this direct repeat, we found in the 3' UTR a long region of dyad symmetry with an internal energy of -3.8 kcal that overlaps the second 35-bp-long direct repeat of the element (fig. 1B). It is depicted in figure 1C (see subsequently). This is reminiscent of the end of the 3' UTR that is recognized by the protein encoded by the Bombyx mori site-specific element R2 and is likely to form secondary structure (Mathews et al. 1997).
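The motif descriptions above lend themselves to a short sketch: expanding the run-length notation used for the poly-T tract, and locating copies of the Y box (CAGCAA) and tandem pairs of it. The function names are hypothetical, and the sequence scanned for Y boxes is a made-up toy string, not the 476-bp Ylli upstream region, which is not reproduced in the text.

```python
import re


def expand_runs(notation):
    """Expand run-length notation such as 'T9 C T3' into 'TTTTTTTTTCTTT'."""
    out = []
    for token in notation.split():
        m = re.fullmatch(r"([ACGT])(\d*)", token)
        if m is None:
            raise ValueError(f"unrecognized token: {token!r}")
        base, count = m.groups()
        out.append(base * int(count or 1))  # a bare base counts once
    return "".join(out)


def find_motif(seq, motif="CAGCAA"):
    """Return 0-based start positions of every occurrence (overlaps allowed)."""
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", seq)]


def tandem_starts(positions, motif_len=6):
    """Return starts of motif copies immediately followed by another copy."""
    pos = set(positions)
    return [p for p in positions if p + motif_len in pos]


tract = expand_runs("T9 C T3 C T8 C T2 C T7")
print(tract, len(tract))

toy = "GG" + "CAGCAA" * 2 + "TT" + "CAGCAA"  # invented example sequence
hits = find_motif(toy)
print(hits, tandem_starts(hits))
```

Run on the real upstream sequence, the same two functions would report the positions of the Y boxes and which of them sit in tandem.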
Ylli Copy Number
A total of 11 RSTs containing the poly-A and corresponding to eight different insertions in the genome of W29 were found in the sequence compilation 02-1100. Overall, BLASTN comparison of the sequence 02-1100 with the Génolevures Y. lipolytica RSTs identified 23 RSTs that had between 23 and 134 bp matching the 3' end of the element; 16 of these were not included by the assembler used to generate the contig Yl-C834 (table 2) because of the short length of the homology. These 23 RSTs carry isolated repeats flanked by genomic material, indicating that they might be extreme 5' truncations of the element. Overall, 19 distinct repeats matching the 3'-end of Ylli, including the poly-A, were identified. One is 431 bp long and another 350 bp long. All the others are short repeats varying in size from 23 to 134 bp (table 2). Other RSTs homologous to Ylli and carrying genomic material were also detected along the element; they diverged from the Ylli consensus sequence at internal positions (AW0AA006D11T1 at position 447, AW0AA003D09D1 at position 1393, AW0AA027H02T1 at position 5826, and AW0AA018D09D1 at position 6173). These 5' truncates were rare compared with the short repeats. As the coverage of the Y. lipolytica genome in the Génolevures project is around 20% (Casaregola et al. 2000), the overall number of Ylli insertions in the genome of the strain W29 was estimated to be around 100 if short repeats are considered, or 15 to 20 if these are omitted.
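The copy-number estimate is a simple scaling of the number of insertions seen in the survey by the fraction of the genome covered. A minimal sketch follows, assuming that insertions are sampled uniformly by the random sequencing; the function name is hypothetical.

```python
def estimate_total(observed, coverage_percent):
    """Scale a count seen in a partial genome survey up to the whole genome.

    Assumes the surveyed fraction samples insertions uniformly.
    """
    if not 0 < coverage_percent <= 100:
        raise ValueError("coverage_percent must be in (0, 100]")
    return observed * 100 / coverage_percent


# 19 distinct 3'-end repeats at about 20% coverage, as reported above.
print(estimate_total(19, 20))
```

With 19 distinct repeats this gives 95, in line with the figure of around 100. The text does not state which count underlies the lower estimate; at the same coverage, 15 to 20 copies corresponds to 3 to 4 observed insertions.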
Analysis of Ylli ORF1
Pairwise amino acid sequence comparisons with FASTA gave an amino acid identity between Ylli and Zorro-3 ORF1 of 18.5% over 583 amino acids, whereas we obtained 23.5% identity over 136 amino acids when ORF1 of Zorro-1 and Ylli were compared. Comparison of the ORF1s of the two C. albicans retrotransposons gave 20.3% identity over 291 amino acids. These results are consistent with the difference in size observed for the ORFs of the three elements because the best match is obtained when Ylli and Zorro-3 are compared (table 3). Pairwise comparison of Ylli ORF1 gave 19.2% identity over 208 amino acids with human L1Hs ORF1 and 8.9% identity over 307 amino acids with D. discoideum TRE3_A ORF1 (see subsequently). Comparisons with other nonLTR retrotransposon ORF1 sequences, including those of fungi, gave poor scores.
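Figures of the form "x% identity over N amino acids" can be reproduced from any pairwise alignment by counting matching columns. The sketch below is not FASTA itself: it takes two already aligned sequences and, as an assumed convention, skips columns with a gap in either sequence (programs differ on whether gapped columns count toward the overlap). The function name and the toy alignment are invented for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, compared columns) for two aligned sequences.

    Columns with a gap in either sequence are skipped (assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared, compared


# Toy alignment, invented for illustration (not Ylli or Zorro data).
pid, overlap = percent_identity("MKT-LLVA", "MRTALL-A")
print(f"{pid:.1f}% identity over {overlap} amino acids")
```

Because the overlap length differs between comparisons (136 to 583 amino acids above), identities are best read together with the length over which they were computed.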
An alignment of the ORF1p of Ylli, Zorro-1, and Zorro-3, in addition to the human L1Hs and the mouse L1Md ORF1p sequences, is shown in figure 3A. This alignment clearly shows that the yeast ORFs have a C-terminal extension with respect to the mammalian proteins, consistent with the lack of the Cys-rich motif in the mammalian proteins. From this alignment we confirmed that, as mentioned by Goodwin, Ormandy, and Poulter (2001), the N-terminal part of Zorro-1 ORF1p is not carried by the C. albicans C6-1996 contig because the most conserved part of the aligned sequences is located before the beginning of Zorro-1 ORF1p.
Because the best score in the initial BLASTX comparisons of the entire Ylli ORF2p with proteins in databases was obtained with L1Hs, we systematically compared the yeast ORF2p to this protein and to other ORF2s, including the fungal proteins. Pairwise comparisons over the entire ORFs gave an identity of 21.9% over 1,073 amino acids for Ylli and L1Hs. Pairwise FASTA comparisons between the yeast proteins Ylli/Zorro-1, Ylli/Zorro-3, and Zorro-1/Zorro-3 gave 22.6% identity over 1,179 amino acids, 21.4% identity over 868 amino acids, and 29.6% identity over 1,073 amino acids, respectively. In addition, the sizes of the Ylli, Zorro-1, and Zorro-3 ORF2p are comparable to that of mammalian ORF2p (table 3). A recent search in public databases with Ylli ORF2 revealed that the best match (E value of 10^-50 compared with 10^-39 for human L1Hs) was with ORF2 of TRE3-A of the slime mold D. discoideum, with 25.2% identity over 865 amino acids.
Although the APE domain is not well conserved, we were able to detect such a domain in the N-terminal part of Ylli ORF2p. In retrotransposons of the L1 clade and in most nonLTR retrotransposon families, this domain is located at the N-terminal end. The multiple alignment shown in figure 4A indicated that, except for an aspartate replaced by a serine in the fifth block of the domain, all of the residues common to the nonLTR retrotransposons tested in Feng et al. (1996) are present in Ylli. In addition, the essential residues required for endonuclease activity detected so far in L1Hs (Feng et al. 1996) are also present in Ylli ORF2. By comparison, the APE domain, especially at its 5' end, seems to have diverged considerably in the Zorros, as five common residues are missing in Zorro-1 and four in Zorro-3.
Phylogenetic Analysis
Because of its strong conservation, the RT domain has been widely used for retrotransposon phylogenetic studies (Xiong and Eickbush 1990; Malik, Burke, and Eickbush 1999). The data shown previously, in particular the overall amino acid sequence comparisons, indicate that the structure of Ylli is closer to that of the elements belonging to the L1 family, in spite of some differences such as the larger size of ORF1 associated with the presence of a putative Cys-rich motif. To confirm this, we performed a phylogenetic analysis on the RT domain of the nonLTR retrotransposons, including the yeast element described here. We used at least two members of each of the clades described by Malik, Burke, and Eickbush (1999), and we added a few elements belonging to the L1 clade, including members of the D. discoideum TRE family that gave a good score in BLAST searches with Ylli ORF2 in nonredundant public databases. We also included the Zorro-1 and Zorro-3 sequences. Sequences were aligned with CLUSTAL W and CLUSTAL X (see Materials and Methods). Then, phylogenetic reconstruction was performed on the RT domain, including the nine segments 0 to 7, as shown in figure 4B, and the less conserved segments 8 and 9, recently described by Malik, Burke, and Eickbush (1999). We included the intervening regions that separate the conserved blocks, even for rare domains that showed interblock expansion. The phylogenetic tree obtained is shown in figure 6. The results confirmed the close relationship between the D. discoideum TRE elements, L1 from mammals, and Ylli. Ylli and the Zorros form a monophyletic group, but the branching separating the yeast clade from the clade that contains the mammalian retrotransposons and that of D. discoideum is not supported by a high bootstrap value. The poor support of the branching between mammalian L1s and the slime mold TRE was already observed by Malik, Burke, and Eickbush (1999). On the other hand, the branching separating the mammalian L1s and the D. discoideum TRE from the other L1 families, including those from plants (maize and Arabidopsis) or animals (Xenopus), is better supported. It is clear from this tree and the sequence comparisons that Ylli forms a novel family that belongs to the L1 clade. We also confirm the classification of the Zorros within the L1 clade (Goodwin, Ormandy, and Poulter 2001). Interestingly, the yeast clade is quite distant from the Tad clade containing the fungal sequences. The yeast and fungal nonLTR retrotransposons are therefore clearly distinct from each other on the basis of ORF2 sequence comparisons.
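The neighbor-joining method used for the tree (Saitou and Nei 1987) can be illustrated with a compact sketch. This is not the software pipeline used in the study, and it omits bootstrapping; the distance matrix is a small textbook-style toy example, not distances computed from the RT alignment. The function name, the node labels, and the returned data layout are hypothetical.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Minimal neighbor joining.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) -> distance between a and b.
    Returns (joins, last_edge): joins is a list of
    (child1, length1, child2, length2, new_node) in joining order, and
    last_edge is (node_a, node_b, length) connecting the final two nodes.
    """
    d = dict(dist)
    nodes = list(names)
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all others.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Pick the pair minimizing Q(i, j) = (n - 2) d(i, j) - r_i - r_j.
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        dij = d[frozenset((i, j))]
        li = dij / 2 + (r[i] - r[j]) / (2 * (n - 2))  # branch i -> new node
        lj = dij - li                                  # branch j -> new node
        u = f"node{len(joins) + 1}"
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                ) / 2
        nodes = [k for k in nodes if k not in (i, j)] + [u]
        joins.append((i, li, j, lj, u))
    a, b = nodes
    return joins, (a, b, d[frozenset((a, b))])


# Toy additive distance matrix for five taxa (illustration only).
taxa = ["a", "b", "c", "d", "e"]
raw = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
       ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
       ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
distances = {frozenset(pair): float(v) for pair, v in raw.items()}
joins, last_edge = neighbor_joining(taxa, distances)
for step in joins:
    print(step)
print(last_edge)
```

In a real analysis the input distances would be derived from the aligned RT segments, and branch support would be assessed by repeating the procedure on bootstrap resamples of the alignment columns.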
Discussion
We were not able to demonstrate that the sequence described in this paper corresponds to a full-length element. Although Ylli RT probably generates 5' truncates (see subsequently), we could not find in the RST library enough different RSTs to allow the detection of the Ylli 5' end. We found only four RSTs, corresponding to two different sequences, that, after additional sequencing of the clones that carried them, extended past the putative start of ORF1 and diverged 30 bp upstream from this start (fig. 1). One of these two sequences, whose 5' end overlapped with another RST (AW0AA027B04T1), carried several features that could constitute part of the promoter. A 35-bp-long region that is very T-rich and interrupted by a few Cs is located 27 and 52 bp upstream from two putative TATA boxes. Such a poly-dT tract was found to serve as an upstream promoter element for constitutive expression of two different genes in S. cerevisiae, HIS3 and DED1. The poly-dA complementary sequence of the promoter of HIS3 was also shown to act positively on the transcription of the divergent gene PET56 (Struhl 1985). Interestingly, such a poly-A tract, containing 27 A residues, is present 5 bp upstream from the putative start codon of C. albicans Zorro-3 ORF1. In Ylli, five direct repeats of CAGCAA, which we called the Y boxes, are clustered in the 476-bp-long sequence. This is very reminiscent of the 6-bp consensus motif CANNTG, called the E-box, first described in the human L1 promoter (Minakami et al. 1992) and also found in several copies in the turtle CR1 5' UTR (Kajikawa, Ohshima, and Okada 1997). In humans, these cis-elements are binding targets for basic helix-loop-helix (bHLH) proteins. We have no evidence that this sequence constitutes the 5' UTR of Ylli because Ylli could have inserted past the promoter of an unrelated gene. This sequence nevertheless contains the 5' 35-bp repeat also found at the 3' end of the element; interestingly, the 5' 35-bp repeat itself contains one of the five Y boxes detected in this region.
The presence of nonLTR retrotransposons in yeast raises two questions. The first one concerns the existence of an internal promoter, which has never been documented in yeasts. The other question concerns the mechanism involved in the translation of both ORFs of Ylli. In most yeast LTR retrotransposons, ribosomal frameshifting ensures the translation of both ORFs, including the downregulation of ORF2, without the need for reinitiation of translation. Recently, the existence of a second mechanism was suggested for Tca2 (previously called pCA1) in C. albicans (Matthews et al. 1997). This mechanism uses the read-through suppression of an in-frame stop codon that separates the two ORFs of the retrotransposon. In Ylli, ORF1 ends with a TAA stop codon, and 2 bp separate this stop codon from the putative ORF2 start codon. This structure suggests that, as in other nonLTR retrotransposons (Bouhidel, Terzian, and Pinon 1994; Szafranski et al. 1999), translation may be reinitiated past the ORF1 stop codon by a mechanism as yet undescribed in yeasts.
Another nonLTR retrotransposon characteristic shared by Ylli is the presence of elements truncated at the 5' end, probably because of the low processivity of the reverse transcriptase. We found four RSTs along Ylli that carry part of Ylli and are associated with genomic DNA at the 5' end. We also showed, by hybridization of genomic DNA to a 5'- and a 3'-specific probe, that fewer bands were detected with the 5'-specific probe. A puzzling feature of Ylli is the existence of a large number of short repeats matching the end of the element, including the poly-A tail; some of these could constitute extreme 5' truncations of Ylli. These short repeats are also reminiscent of SINEs, which are found in very high copy number in mammals and are derived from a tRNA gene promoter associated with the 3' end of nonLTR retrotransposons. SINEs thus have two conserved sequences, boxes A and B, that are conserved in promoters of RNA polymerase III-transcribed genes. The Ylli repeats are much shorter than the SINE 3' end (126 out of the 19 detected short repeats are less than 78 bp long), and no homology to the conserved boxes was detected in the Y. lipolytica repeats, ruling out the possibility that the Y. lipolytica 3' short repeats could be classical SINEs. Such conserved boxes were found in the Y. lipolytica 7SL RNA genes, SCR1 and SCR2 (He et al. 1989). Alternatively, the presence of these short repeats could be explained by the existence of two 35-bp repeats immediately upstream from ORF1 and immediately upstream from the poly-A tract. By analogy to the presence of solo LTRs in S. cerevisiae because of recombination between LTRs and excision of the Tys, the short repeats may be generated after imprecise recombination between the two 35-bp direct repeats and excision of the element. This is consistent with the fact that most of the junctions between genomic DNA and short repeats are located in or immediately upstream of the 3'-end 35-bp direct repeat, although this homology is very short to support homologous recombination. Although the role of these 35-bp repeats in the life cycle of Ylli cannot yet be explained, their remarkable sequence conservation is consistent with their possible involvement in Ylli excision by recombination.
A likely possibility is that these short 3' repeats are extreme 5' truncates caused by a secondary structure generated by the dyad symmetry overlapping the 3'-end 35-bp direct repeat shown in figure 1. This structure might impede reverse transcription by causing it to abort. This is consistent with the few 5' truncates detected along Ylli, indicating that the Ylli reverse transcriptase seems to be more processive than those of higher eukaryotes. The mechanism mentioned here could therefore be involved in the regulation of the propagation of Ylli. Again, the fact that most of the junctions between genomic DNA and short repeats are located in or immediately upstream of the dyad symmetry fits this explanation. It was shown that the last 250 nucleotides of the 3' UTR of the B. mori R2 nonLTR element were required for reverse transcription, and that secondary structure was likely to be involved in the binding of the transcript to the reverse transcriptase during initiation of reverse transcription (Luan and Eickbush 1995). The 3'-end sequence of different nonLTR elements was poorly conserved but seemed to be able to form secondary structure with common characteristics (Mathews et al. 1997), supporting a role for the putative secondary structure formed by the Ylli 3' end. Further work is necessary to choose between these hypotheses.
NonLTR retrotransposons, especially in mammals, are present in high copy number. This is less true for elements in other organisms, such as Swimmer in the teleost fish medaka (Duvernell and Turner 1998). On the basis of the number of Ylli 3' ends containing a poly-A tract in the Génolevures sequences and the coverage of the Y. lipolytica genome in the Génolevures project, we have estimated the number of Ylli copies in the Y. lipolytica W29 genome at over 100 if the short repeats are taken into account or at 15 to 20 if they are not. Considering that we are dealing with only one element, this is a high figure for a yeast, as 52 LTR-retrotransposon Tys were found in the sequenced strain of S. cerevisiae, and a total of 250 remnants of previous insertions of Tys, represented by solo LTRs, exist in this strain. Recent work performed in the Génolevures project and in the laboratory indicates that various yeast species tend to have fewer LTR-retrotransposon copies than S. cerevisiae, and that species belonging to the Zygosaccharomyces, Kluyveromyces, and Saccharomyces genera carry very few if any LTR retrotransposons (Génolevures 2000; unpublished data). In Y. lipolytica, the only characterized LTR retrotransposon, Ylt1, is present at 35 copies, and 50 to 60 solo LTRs can be found (Schmid-Berger, Schmid, and Barth 1994).
This is in contrast with the low number of representatives of the Zorro families of nonLTR retrotransposon-like elements found in C. albicans (Goodwin, Ormandy, and Poulter 2001), which were detected by hybridization with the RT gene of Zorro-1 and Zorro-3. We found that both elements of C. albicans, Zorro-1 and Zorro-3, are unique and that 5' truncates were absent in the available Stanford C. albicans sequence assembly 6, except for a 624-bp-long 5' truncate of Zorro-1 and a few central parts of Zorro-3 ORF2 that are very likely caused by genomic rearrangements (see also Goodwin, Ormandy, and Poulter 2001). This might indicate that Ylli and the Zorros do not use the same mechanism for propagation, that they are not regulated in the same way, or that Zorros are being lost from the C. albicans genome. Indeed, one degenerate Zorro-2 was detected, and some strains do not carry Zorro-1 or Zorro-3 (Goodwin, Ormandy, and Poulter 2001), whereas this is not the case for Ylli. The yeasts Y. lipolytica and C. albicans are both dimorphic and related on the basis of 26S rDNA sequence comparison (Souciet et al. 2000). Although it is thus tempting to link dimorphism and the presence of nonLTR retrotransposons in the genome, we also studied 12 other yeasts in the Génolevures project, including Candida tropicalis, a dimorphic yeast close to C. albicans, but no nonLTR element-like sequences were detected in any of the yeasts studied in this particular project. This argues against a link between dimorphism and nonLTR retrotransposon maintenance.
We have shown that, unlike the LTR retrotransposon Ylt1, Ylli is present in all known traceable strain lineages: American, French, and German. On the basis of RFLP analysis, clear differences could be seen between strains, indicating that this element may still be active. In this respect, we detected an insertion within the promoter of the PEX10 gene in the "American" strain CX161-1B that is not present at the same location in the "French" strain W29. In addition, we detected few single nucleotide polymorphisms (SNPs) in the Ylli family, and none of these SNPs affected the ORFs by introducing stop codons. The variability between strains and the high sequence conservation among the copies of Ylli suggest that Ylli transposed recently.
We have clearly shown, on the basis of the RT domain phylogeny and overall ORF2 sequence comparison, that Ylli, like the Zorros, belongs to the L1 clade, one of the oldest nonLTR retrotransposon clades. We also found that the yeast nonLTR retrotransposons are closely related to the mammalian L1 and to the members of the D. discoideum TRE family. From various sequence alignments, Malik, Burke, and Eickbush (1999) could date the acquisition of the diverse domains that constitute ORF2s and the transition from elements made of a single ORF to elements that contain two ORFs. Ylli and Zorro-1 ORF1s are long (up to 714 amino acids for Ylli) and carry Cys-rich motifs in the second half of the ORF gene product. They thus display a size and a structure that closely resemble those of the most recent nonLTR element clades (Tad1, R1, LOA...), whereas the phylogenetic analysis of the RT domain as well as entire ORF2 sequence comparison clearly placed the yeast elements within the L1 clade.
We also showed that Ylli does not have any obvious insertion specificity. The closest elements, the TRE nonLTR retrotransposons from D. discoideum, are site specific. This confirms that RT phylogeny does not differentiate between elements with different mechanisms of insertion, as pointed out by Malik, Burke, and Eickbush (1999). Ylli is the only retrotransposon described in yeast, together with Ylt1, that does not insert in the vicinity of tRNA genes and therefore could constitute an efficient tool for insertional mutagenesis.
Together with the Zorros in C. albicans, this work confirms the presence of nonLTR retrotransposons in hemiascomycetous yeasts. It further strongly suggests that nonLTR retrotransposons in yeasts are restricted to a few species. Interestingly, both species are able to form filaments. Among the yeast nonLTR elements, Ylli was shown to be closest to the already described elements. Ylli may also still be active, as sequence conservation between the various copies would suggest. In addition to the evolutionary implications of this work, analysis of the various mechanisms of retrotransposition will be greatly facilitated in a genetically amenable yeast species like Y. lipolytica.
Footnotes
Abbreviations: ORF, open reading frame; LTR, long terminal repeat; UTR, untranslated region.
Keywords: yeast, nonLTR retrotransposon, Yarrowia lipolytica, repeats, phylogeny
Address for correspondence and reprints: Serge Casaregola, Collection de Levures d'Intérêt Biotechnologique, Laboratoire de Génétique Moleculaire et Cellulaire, INRA UR216, CNRS URA1925, INA-PG, F-78850 Thiverval-Grignon, France. serge.casaregola{at}grignon.inra.fr
References
Altschul S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, D. J. Lipman, 1997 Gapped BLAST and PSI-BLAST: a new generation of protein database search programs Nucleic Acids Res 25:3389-3402
Artiguenave F., P. Wincker, P. Brottier, S. Duprat, F. Jovelin, C. Scarpelli, J. Verdier, V. Vico, J. Weissenbach, W. Saurin, 2000 Genomic exploration of the hemiascomycetous yeasts: 2. Data generation and processing FEBS Lett 487:13-16
Bailey J. A., L. Carrel, A. Chakravarti, E. E. Eichler, 2000 Molecular evidence for a relationship between LINE-1 elements and X chromosome inactivation: the Lyon repeat hypothesis Proc. Natl. Acad. Sci. USA 97:6634-6639
Barth G., C. Gaillardin, 1997 Physiology and genetics of the dimorphic fungus Yarrowia lipolytica FEMS Microbiol. Rev 19:219-237
Bouhidel K., C. Terzian, H. Pinon, 1994 The full-length transcript of the I factor, a LINE element of Drosophila melanogaster, is a potential bicistronic RNA messenger Nucleic Acids Res 22:2370-2374
Cambareri E. B., J. Helber, J. A. Kinsey, 1994 Tad1-1, an active LINE-like element of Neurospora crassa Mol. Gen. Genet 242:658-665
Casaregola S., C. Feynerol, M. Diez, P. Fournier, C. Gaillardin, 1997 Genomic organization of the yeast Yarrowia lipolytica Chromosoma 106:380-390
Casaregola S., C. Neuveglise, A. Lepingle, E. Bon, C. Feynerol, F. Artiguenave, P. Wincker, C. Gaillardin, 2000 Genomic exploration of the hemiascomycetous yeasts: 17. Yarrowia lipolytica FEBS Lett 487:95-100
Casaregola S., H. V. Nguyen, A. Lepingle, P. Brignon, F. Gendre, C. Gaillardin, 1998 A family of laboratory strains of Saccharomyces cerevisiae carry rearrangements involving chromosomes I and III Yeast 14:551-564
Chibana H., B. B. Magee, S. Grindle, Y. Ran, S. Scherer, P. T. Magee, 1998 A physical map of chromosome 7 of Candida albicans Genetics 149:1739-1752
Church G. M., W. Gilbert, 1984 Genomic sequencing Proc. Natl. Acad. Sci. USA 81:1991-1995
Duvernell D. D., B. J. Turner, 1998 Swimmer 1, a new low-copy-number LINE family in teleost genomes with sequence similarity to mammalian L1 Mol. Biol. Evol 15:1791-1793
Esnault C., J. Maestre, T. Heidmann, 2000 Human LINE retrotransposons generate processed pseudogenes Nat. Genet 24:363-367
Ewing B., P. Green, 1998 Base-calling of automated sequencer traces using phred. II. Error probabilities Genome Res 8:186-194
Ewing B., L. Hillier, M. C. Wendl, P. Green, 1998 Base-calling of automated sequencer traces using phred. I. Accuracy assessment Genome Res 8:175-185
Feng Q., J. V. Moran, H. H. Kazazian Jr., J. D. Boeke, 1996 Human L1 retrotransposon encodes a conserved endonuclease required for retrotransposition Cell 87:905-916
Furano A. V., 2000 The biological properties and evolutionary dynamics of mammalian LINE-1 retrotransposons Prog. Nucleic Acid Res. Mol. Biol 64:255-294
Génolevures, 2000 Genomic exploration of the hemiascomycetous yeasts FEBS Lett. 487
Goodwin J. D., J. E. Ormandy, R. T. M. Poulter, 2001 L1-like non LTR retrotransposons in the yeast Candida albicans Curr. Genet 39:83-91
Goodwin T. J., R. T. Poulter, 2000 Multiple LTR-retrotransposon families in the asexual yeast Candida albicans Genome Res 10:174-191
Goodwin T. J., R. T. Poulter, 2001 The diversity of retrotransposons in the yeast Cryptococcus neoformans Yeast 18:865-880
Gordon D., C. Abajian, P. Green, 1998 Consed: a graphical tool for sequence finishing Genome Res 8:195-202
Goyon C., J. L. Rossignol, G. Faugeron, 1996 Native DNA repeats and methylation in Ascobolus Nucleic Acids Res 24:3348-3356
Hamer J. E., L. Farrall, M. J. Orbach, B. Valent, F. G. Chumley, 1989 Host species-specific conservation of a family of repeated DNA sequences in the genome of a fungal plant pathogen Proc. Natl. Acad. Sci. USA 86:9981-9985
He C., J. P. Nourse, S. Kelemu, J. A. Irwin, J. M. Manners, 1996 CgT1: a nonLTR retrotransposon with restricted distribution in the fungal phytopathogen Colletotrichum gloeosporioides Mol. Gen. Genet 252:320-331
He F., J. M. Beckerich, V. Ribes, D. Tollervey, C. M. Gaillardin, 1989 Two genes encode 7SL RNAs in the yeast Yarrowia lipolytica Curr. Genet 16:347-350
Higgins D. G., J. D. Thompson, T. J. Gibson, 1996 Using CLUSTAL for multiple sequence alignments Methods Enzymol 266:383-402
Hohjoh H., M. F. Singer, 1997 Sequence-specific single-strand RNA binding protein encoded by the human LINE-1 retrotransposon EMBO J 16:6034-6043
Hutchinson C. A. III, S. C. Hardies, D. D. Loeb, W. R. Shehee, M. H. Edgell, 1989 LINEs and related retrotransposons Pp. 593-617 in D. E. Berg and M. M. Howe, eds. Mobile DNA. American Society for Microbiology, Washington DC
Juretzek T., M. Le Dall, S. Mauersberger, C. Gaillardin, G. Barth, J.-M. Nicaud, 2001 Vectors for gene expression and amplification in the yeast Yarrowia lipolytica Yeast 18:97-113
Kajikawa M., K. Ohshima, N. Okada, 1997 Determination of the entire sequence of turtle CR1: the first open reading frame of the turtle CR1 element encodes a protein with a novel zinc finger motif Mol. Biol. Evol 14:1206-1217
Kazazian H. H. Jr., 1998 Mobile elements and disease Curr. Opin. Genet. Dev 8:343-350
Kordis D., F. Gubensek, 1998 Unusual horizontal transfer of a long interspersed nuclear element between distant vertebrate classes Proc. Natl. Acad. Sci. USA 95:10704-10709
Kurtzman C. P., C. J. Robnett, 1998 Identification and phylogeny of ascomycetous yeasts from analysis of nuclear large subunit (26S) ribosomal DNA partial sequences Antonie Leeuwenhoek 73:331-371
Le Q. H., S. Wright, Z. Yu, T. Bureau, 2000 Transposon diversity in Arabidopsis thaliana Proc. Natl. Acad. Sci. USA 97:7376-7381
Li W. H., Z. Gu, H. Wang, A. Nekrutenko, 2001 Evolutionary analyses of the human genome Nature 409:847-849
Luan D. D., T. H. Eickbush, 1995 RNA template requirements for target DNA-primed reverse transcription by the R2 retrotransposable element Mol. Cell. Biol 15:3882-3891
Malik H. S., W. D. Burke, T. H. Eickbush, 1999 The age and evolution of nonLTR retrotransposable elements Mol. Biol. Evol 16:793-805
Martin S. L., F. D. Bushman, 2001 Nucleic acid chaperone activity of the ORF1 protein from the mouse LINE-1 retrotransposon Mol. Cell. Biol 21:467-475
Mathews D. H., A. R. Banerjee, D. D. Luan, T. H. Eickbush, D. H. Turner, 1997 Secondary structure model of the RNA recognized by the reverse transcriptase from the R2 retrotransposable element RNA 3:1-16
Matthews G. D., T. J. Goodwin, M. I. Butler, T. A. Berryman, R. T. Poulter, 1997 pCal, a highly unusual Ty1/copia retrotransposon from the pathogenic yeast Candida albicans J. Bacteriol 179:7118-7128
Minakami R., K. Kurose, K. Etoh, Y. Furuhata, M. Hattori, Y. Sakaki, 1992 Identification of an internal cis-element essential for the human L1 transcription and a nuclear factor(s) binding to the element Nucleic Acids Res 20:3139-3145
Page R. D., 1996 TreeView: an application to display phylogenetic trees on personal computers Comput. Appl. Biosci 12:357-358
Pearson W. R., D. J. Lipman, 1988 Improved tools for biological sequence comparison Proc. Natl. Acad. Sci. USA 85:2444-2448
Picologlou S., M. E. Dicig, P. Kovarik, S. W. Liebman, 1988 The same configuration of Ty elements promotes different types and frequencies of rearrangements in different yeast strains Mol. Gen. Genet 211:272-281
Ricchetti M., C. Fairhead, B. Dujon, 1999 Mitochondrial DNA repairs double-strand breaks in yeast chromosomes Nature 402:96-100
Romano A., S. Casaregola, P. Torre, C. Gaillardin, 1996 Use of RAPD and mitochondrial DNA RFLP for typing of Candida zeylanoides and Debaryomyces hansenii yeast strains isolated from cheese Syst. Appl. Microbiol 19:255-264
Saitou N., M. Nei, 1987 The neighbor-joining method: a new method for reconstructing phylogenetic trees Mol. Biol. Evol 4:406-425
Sambrook J., E. Fritsch, T. Maniatis, 1989 Molecular cloning: a laboratory manual Cold Spring Harbor, Cold Spring Harbor Laboratory Press, New York
Sassaman D. M., B. A. Dombroski, J. V. Moran, M. L. Kimberland, T. P. Naas, R. J. DeBerardinis, A. Gabriel, G. D. Swergold, H. H. Kazazian Jr., 1997 Many human L1 elements are capable of retrotransposition Nat. Genet 16:37-43
Schmid-Berger N., B. Schmid, G. Barth, 1994 Ylt1, a highly repetitive retrotransposon in the genome of the dimorphic fungus Yarrowia lipolytica J. Bacteriol 176:2477-2482
Souciet J., M. Aigle, F. Artiguenave, et al. (21 co-authors) 2000 Genomic exploration of the hemiascomycetous yeasts: 1. A set of yeast species for molecular evolution studies FEBS Lett 487:3-12
Struhl K., 1985 Naturally occurring poly(dA-dT) sequences are upstream promoter elements for constitutive transcription in yeast Proc. Natl. Acad. Sci. USA 82:8419-8423
Szafranski K., G. Glockner, T. Dingermann, K. Dannat, A. A. Noegel, L. Eichinger, A. Rosenthal, T. Winckler, 1999 NonLTR retrotransposons with unique integration preferences downstream of Dictyostelium discoideum tRNA genes Mol. Gen. Genet 262:772-780
The Arabidopsis Genome Initiative. 2000 Analysis of the genome sequence of the flowering plant Arabidopsis thaliana Nature 408:796-815
Thompson J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, D. G. Higgins, 1997 The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools Nucleic Acids Res 25:4876-4882
Volff J. N., C. Korting, A. Meyer, M. Schartl, 2001 Evolution and continuous distribution of Rex3 retrotransposon in fish Mol. Biol. Evol 18:427-431
Warmington J. R., R. P. Green, C. S. Newlon, S. G. Oliver, 1987 Polymorphisms on the right arm of yeast chromosome III associated with Ty transposition and recombination events Nucleic Acids Res 15:8963-8982
Xiong Y., T. H. Eickbush, 1988 The site-specific ribosomal DNA insertion element R1Bm belongs to a class of nonlong-terminal-repeat retrotransposons Mol. Cell. Biol 8:114-123
Xiong Y., T. H. Eickbush, 1990 Origin and evolution of retroelements based upon their reverse transcriptase sequences EMBO J 9:3353-3362