Departamento de Genética, Facultad de Ciencias Biológicas, Universitat de València, Valencia, Spain
Introduction
The small genome of A. thaliana (130 Mb) has a low content of repetitive DNA. The interspersed DNA fraction is especially low, constituting 2% of the genome (Meyerowitz 1992). During the last decade, some families of transposable elements have been described in this species: the non-long-terminal-repeat (non-LTR) retrotransposons Ta11-1 (Wright et al. 1996) and TSCL (Chye, Cheung, and Xu 1997), the LTR-retrotransposon gypsy-like elements Tat1 (Peleman et al. 1991; Wright and Voytas 1998) and Athila (Pélissier et al. 1995, 1996), the transposon-like elements Limpet1 (Klimyuk and Jones 1997), Tag1 (Tsay et al. 1993; Frank et al. 1997), and Tag2 (Henk, Warren, and Innes 1999), the superfamilies Arnold and Harbinger (Kapitonov and Jurka 1999) and Basho (Le et al. 2000), and the foldback transposon Hairpin (Adé and Belzile 1999). Some families of repetitive elements structurally related to the miniature inverted-repeat transposable elements (MITEs) have also been described (Casacuberta et al. 1998; Surzycki and Belknap 1999; Feschotte and Mouchès 2000; Le et al. 2000). As for the copia-like elements, 10 related families, designated Ta1-Ta10, were considered a superfamily within the genome of A. thaliana (Voytas and Ausubel 1988; Voytas et al. 1990; Konieczny et al. 1991). Some sequences related to Ta1-Ta10 elements have been reported (Brandes et al. 1997), and the divergence between members of these families is high. Voytas (1992) considered that elements sharing >95% nucleotide identity can be regarded as members of the same family, and those sharing <85% can be considered members of different families. The copy numbers are low, with only a few copies per family, compared with other copia-like plant families. It has been estimated that the Ta1-Ta10 superfamily constitutes 0.1% of the A. thaliana genome (Konieczny et al. 1991), with most of the elements located in clusters at the paracentromeric heterochromatin (Brandes et al. 1997). These data agree with studies showing that copia-like elements concentrate in the centromeric regions (Heslop-Harrison et al. 1997) and, more generally, that regions flanking the centromeres are densely populated by transposable elements (Copenhaver et al. 1999; Cold Spring Harbor Laboratory et al. 2000). Other copia-like elements outside the Ta1-Ta10 superfamily, such as Evelknievel (Henikoff and Comai 1998), Art1 (Hervé et al. 1999), Meta1 (Kapitonov and Jurka 1999), and AtRE1-AtRE2 (Kuwahara, Kato, and Komeda 2000), have also been described within the A. thaliana genome. copia-like elements have likewise been reported in the mitochondrial genome of this species (Knoop et al. 1996).
The complete sequences of A. thaliana chromosomes II and IV have recently been reported (European Union Arabidopsis Genome Sequencing Consortium et al. 1999; Lin et al. 1999). The analysis of those sequences has shown the low content of repetitive DNA of the A. thaliana genome when compared with other plant species. Dispersed repeats, which consist predominantly of LTR and non-LTR retrotransposons, are found throughout the chromosome arms. However, the main fraction of transposable elements is concentrated at the pericentromeric heterochromatin, as had previously been described. This region contains few genes and a high density of presumably inactive mobile elements (Lin et al. 1999). In addition to the complete sequences of chromosomes II and IV, studies of different stretches of the A. thaliana chromosomes have been performed (Quigley et al. 1996; Thompson, Schmidt, and Dean 1996; Comella et al. 1999; Terryn et al. 1999), as have computer-assisted analyses looking for the presence of different families of transposable elements (Kapitonov and Jurka 1999; Le et al. 2000). In this paper, we report the results of a computer-based analysis performed to identify, characterize, and establish the evolutionary relationships among the copia-like elements that exist in the A. thaliana genome.
Materials and Methods
Sequence Analyses
The searches for open reading frames (ORFs) on the genomic clones were performed with the GENSCAN program (Burge and Karlin 1997) in collaboration with the Munich Information Center for Protein Sequences (MIPS). The GenBank (GB) and European Molecular Biology (EMB) databases were searched for sequence similarities using the BLAST programs at the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/BLAST/) (Altschul et al. 1997). An estimate of the similarity between sequences was obtained with the GAP program from the GCG Software Package (University of Wisconsin). Multiple alignments were performed with the CLUSTAL W program (Thompson, Higgins, and Gibson 1994). Genetic distances were calculated with the Poisson correction method (Nei and Chakraborty 1976) for amino acid sequences. The phylogenetic trees were constructed with the neighbor-joining (Saitou and Nei 1987) and UPGMA (Swofford and Selander 1981) methods, and the bootstrap test was carried out with 1,000 iterations. These evolutionary analyses were performed with the MEGA platform (Kumar, Tamura, and Nei 1993).
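The Poisson correction named above can be illustrated with a minimal sketch. It assumes the standard form d = -ln(1 - p), where p is the proportion of differing residues between two aligned amino acid sequences; the function names and the toy sequences are hypothetical and are not taken from MEGA.

```python
import math


def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing residues over aligned columns without gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where neither sequence has a gap (pairwise deletion).
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable columns")
    differences = sum(1 for a, b in pairs if a != b)
    return differences / len(pairs)


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance d = -ln(1 - p) for amino acid sequences."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("distance is undefined when every column differs")
    return -math.log(1.0 - p)


if __name__ == "__main__":
    # Toy aligned fragments (illustrative only, not real polyprotein data).
    print(round(poisson_distance("MKTAY", "MKSAY"), 4))
```

The correction matters most for divergent pairs: as p grows, the corrected distance increasingly exceeds p, compensating for multiple substitutions at the same site.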
Results
We used the GENSCAN program to predict the ORFs present on the sequences we had determined. We obtained a total of 113 putative ORFs, in agreement with the analysis performed at MIPS, which produced a total of 107 predicted coding sequences. The hypothetical proteins encoded by the predicted ORFs were used in systematic searches against the GB and EMB databases using the BLAST sequence similarity search tool (http://www.ncbi.nlm.nih.gov/BLAST/). These searches yielded only four sequences that showed similarity to previously described transposable elements: one similar to the Ac-like transposable element, another similar to gypsy-like sequences, and the remaining two showing a high degree of similarity to copia-like retrotransposons. We decided to concentrate our studies on the two copia-like sequences, which were named AtC1 (accession number AF287471) and AtC2 (accession number AF287472), and we performed further analyses on them.
AtC1 was predicted as a single 4,359-bp-long ORF coding for a hypothetical protein of 1,452 amino acids. Computer searches of the GB and EMB databases found significant similarity between AtC1 and several known copia-like retrotransposons. These comparisons revealed that AtC1 contains all the amino acid domains found to be conserved among autonomously active retroelements. Both the amino acid conservation and the domain order served to identify AtC1 as a copia-like retrotransposon.
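The relationship between the ORF length and the protein length quoted above can be checked with a short sketch. It assumes that the 4,359-bp figure includes the terminal stop codon; the function name is hypothetical.

```python
def orf_protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of `orf_bp` base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies one codon but adds no amino acid.
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    print(orf_protein_length(4359))  # AtC1 ORF length reported in the text
```

Under that assumption, 4,359 bp correspond to 1,453 codons, one of which is the stop codon, leaving the 1,452 amino acids reported for AtC1.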
We analyzed the genomic sequence to determine the structure of this new element (fig. 1A). AtC1 has a long internal region of 4,629 bp, bounded by 335-bp LTRs, which present the canonical inverted repeats at their ends (TG-CA). The nucleotide sequences of the LTRs were found to be completely identical; a 5-bp direct repeat (CTGCT) flanking the LTRs could correspond to the target site duplication (TSD) (fig. 1B). The internal region contains one large ORF that could encode the gag and pol genes, with nucleic acid-binding, protease, integrase, reverse transcriptase, and RNase H domains, in that order (fig. 1A). The primer-binding site (PBS) is in accord with the one described by Gauss and Sprinzl (1983) for plant tRNAiMet (fig. 1B). There is also a polypurine tract (PPT) at the end of the ORF. Thus, the structural analysis of AtC1 suggests that it could be an active element, as it displays all the main features described in retrotransposons that have been shown to be able to transpose.
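The structural checks described for AtC1 (identical LTRs, canonical TG...CA ends, and a flanking direct repeat) can be sketched as simple string tests. This is an illustrative outline under stated assumptions, not the procedure used in the study: the function names, the coordinate convention, and the toy sequence are all hypothetical.

```python
from typing import Optional


def has_canonical_ends(ltr: str) -> bool:
    """True if the LTR starts with TG and ends with CA."""
    ltr = ltr.upper()
    return ltr.startswith("TG") and ltr.endswith("CA")


def ltrs_identical(ltr5: str, ltr3: str) -> bool:
    """True if the 5' and 3' LTRs have exactly the same sequence."""
    return ltr5.upper() == ltr3.upper()


def target_site_duplication(genomic: str, start: int, end: int,
                            size: int = 5) -> Optional[str]:
    """Return the direct repeat flanking the element at genomic[start:end].

    The `size` bases immediately upstream and downstream of the element
    are compared; None is returned if they differ or are out of range.
    """
    if start - size < 0 or end + size > len(genomic):
        return None
    left = genomic[start - size:start].upper()
    right = genomic[end:end + size].upper()
    return left if left == right else None


if __name__ == "__main__":
    # Toy element: short made-up LTRs around a made-up internal region.
    ltr = "TGTTAGCA"
    element = ltr + "AAAGGGCCC" + ltr
    genomic = "GGA" + "CTGCT" + element + "CTGCT" + "TTC"
    start = 8
    end = start + len(element)
    print(has_canonical_ends(ltr), ltrs_identical(ltr, ltr),
          target_site_duplication(genomic, start, end))
```

Exact string comparison is the strictest possible criterion. A real survey would probably tolerate a mismatch or a one-base indel between LTRs, as the heterogeneous elements described later in the Results would require.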
The overall similarity at the amino acid level between AtC1 and AtC2 was 42.21%, although the conservation varied extensively along the sequence. The best-conserved domains were the integrase (68.1% similarity) and the reverse transcriptase (60% similarity). The similarity data and the differences in the lengths and structures of the elements indicate that AtC1 and AtC2 diverged long ago, as the degree of conservation between them is similar to that found between these elements and the copia element from Drosophila melanogaster (43.2% and 44.15% similarity, respectively).
Search for Homologous Sequences
Database searches performed at the DNA and amino acid levels with the programs BLASTP and TBLASTN did not yield any sequences identical to AtC1 or to AtC2. However, the TBLASTN searches allowed us to identify two new copia-like sequences on clones T15G18 (accession number AC006567; located on chromosome IV) and K2N11 (accession number AB022213; located on chromosome V) that were very similar to AtC1 and AtC2, respectively. We named the elements AtC3 (accession number AJ292423) and AtC4 (accession number AJ293575).
The BLASTP searches performed on the GB and EMB databases with AtC1 and AtC2 produced several hundred sequences related to the Ty-copia family of retrotransposons, and we decided to study them in detail. Only a few of these sequences corresponded to previously described elements: Evelknievel (Henikoff and Comai 1998), the Ta1 (Voytas et al. 1990; Konieczny et al. 1991) and AtRE1 (Kuwahara, Kato, and Komeda 2000) families from A. thaliana, Hopscotch from Zea mays (White, Habera, and Wessler 1994), Bare-1 from barley (Manninen and Schulman 1993), Tnt1 from tobacco (Grandbastien, Spielmann, and Caboche 1989), and SIRE-1 from soybean (Laten, Majumdar, and Gaucher 1998). Most of the other database entries corresponded to predicted proteins from A. thaliana and had been produced by the automatic annotation of genomic clones. These sequences are the result of the work of the international consortium that is sequencing the genome of this model plant.
The copia-like sequences we obtained showed extreme heterogeneity in length and degree of conservation. The size of the predicted proteins ranged from approximately 80 to 1,500 amino acids, which indicates the existence of many defective elements. Many proteins had been described as the products of ORFs with between 2 and 8 predicted exons. We verified that, as was the case with AtC2, the predicted introns were artifacts caused by the presence of indels or stop codons. In this way, we established that the copia-like polyproteins described as multiexonic ORFs corresponded to defective or mutated copia-like elements that were unable to transpose autonomously. We then decided to select those copia-like polyproteins encoded by a single ORF in order to identify putative active elements.
We chose a total of 25 sequences which might correspond to active elements and studied them in detail. Eight of the elements belonged to copia-like families previously described in A. thaliana: Ta1 (Voytas et al. 1990; Konieczny et al. 1991), Evelknievel (Henikoff and Comai 1998), and AtRE1 and AtRE2 (Kuwahara, Kato, and Komeda 2000). The remaining 17 sequences, including AtC1, AtC3, and AtC4, are characterized here for the first time, and the 14 new elements were named AtC5-AtC18. Although AtC2 produces a truncated polyprotein, we included it in the study, as it presented most of the structural features of this kind of transposon and could be representative of a new copia-like family.
Structural Characterization of the New copia-like Elements
We identified the DNA sequences from which the analyzed polyproteins were derived and carried out a structural analysis in order to find all of the characteristic features of the copia-like retrotransposons. The results are summarized in table 1, in which the known families Ta1, Evelknievel, AtRE1, and AtRE2 have also been included. The most striking observation is the high degree of heterogeneity among the 26 analyzed elements. They all have different sizes, ranging from 4,629 to 5,738 bp, a difference of more than 1 kb. The 26 elements present many of the structural features that characterize autonomously active elements. All of them present LTRs, with 17 showing identical 5' and 3' repeats. In the remaining 9 elements (AtRE2, AtC2, AtC4, AtC5, AtC8, AtC11, AtC14, AtC15, and AtC17), the 5' and 3' LTRs present small differences in size, mainly due to one-base indels. The canonical inverted repeats are present in 19 of our elements, while AtC2, AtC8, AtC11-AtC13, AtC17, and AtC18 present LTRs in which the inverted repeats begin with the canonical TG sequence but have a 3' change. Only the elements from the families AtRE1 and Ta1 present LTRs identical in size and sequence. The rest of the elements display LTRs that are very heterogeneous in size, ranging from 120 to 734 bp, and in sequence, with no significant similarities between them. When we analyzed the genomic sequences adjacent to the LTRs, we found that 17 elements presented direct repeats that could correspond to the TSD, suggesting recent transposition events. Finally, most of the elements presented both the primer-binding site (PBS) and the polypurine tract (PPT) (adjacent to the 5' LTR and before the 3' LTR, respectively), suggesting that they can transpose autonomously.
Copy Number and Chromosome Distribution
TBLASTN and BLASTN searches on the GB and EMB databases were performed for each of the elements studied in this work in order to find out how many identical copies existed in the A. thaliana genome (table 1). Most of the elements were present as single copies in the genome; AtC3, AtC5, AtC7, AtC13, AtC15, and AtC67 showed two identical copies in the same or different chromosomes, and only Evelknievel and AtC10 could be considered multicopy elements, with three and six copies, respectively. Nevertheless, these data will have to be confirmed when the A. thaliana genome is completely sequenced. The elements are widely dispersed on the five A. thaliana chromosomes, with seven copies on chromosome I, eight on chromosome II, seven on chromosome III, six on chromosome IV, and two on chromosome V. The smaller number of elements on chromosome V is probably due to the status of the sequencing project, with a smaller number of sequenced clones in the databases. The chromosome positions of the genomic clones which carry the retrotransposons indicate that they are dispersed along the chromosomes.
Search for Expressed Sequence Tags
The DNA sequences of the newly described elements were used to search the expressed sequence tag (EST) databases for corresponding ESTs. Three of the analyzed copia-like sequences matched identical ESTs: AtC7, with eight different ESTs; AtC10, with two ESTs; and AtC18, with one EST. The fact that AtC7 and AtC10 show both transcriptional activity and several copies (two and six, respectively) in the A. thaliana genome reinforces the idea that they might be active retrotransposons.
Phylogenetic Analysis
The comparison of the amino acid sequences of the polyproteins coded by the copia-like elements characterized in this work reflects the high heterogeneity found at the structural level. The degree of similarity varies from 99% to 41%. As has been mentioned above, the similarity was much higher when the conserved domains, rather than the complete protein, were compared (data not shown). A phylogenetic tree was constructed with the neighbor-joining method based on complete sequences of the 26 polyproteins, and the bootstrap test was performed with a total of 1,000 iterations (fig. 2). A similar topology was obtained using the UPGMA method. We observed a distribution of the sequences into six major lineages, or families. Elements belonging to the same lineage showed similarities higher than 50%. The topology of the tree was supported by the high bootstrap values of the main branches. We decided to name the lineages copia I-VI.
The tree also includes copia elements from different plant species: Hopscotch from maize (White, Habera, and Wessler 1994), Tnt1 from tobacco (Grandbastien, Spielmann, and Caboche 1989), and SIRE-1 from soybean (Laten, Majumdar, and Gaucher 1998). Each of these sequences groups with one of the lineages we have described, indicating that the copia elements of those families are closer to the elements from different species than to the other A. thaliana families (fig. 2).
In an effort to clarify the structure and evolutionary relationships of the copia-like sequence population in A. thaliana, we identified, in all the copia-like sequences we analyzed, the RT region used by Konieczny et al. (1991) in their study of the Ta1 family. We performed a multiple alignment with the RT domains of 63 Arabidopsis polyproteins and included the 8 TA sequences (TA2-TA10) characterized in the above-mentioned work. The phylogenetic tree constructed from the alignment using the neighbor-joining method (fig. 3) displays a topology very similar to the one obtained for the complete polyprotein, although the bootstrap values are much lower. We also obtained a similar cluster configuration when the UPGMA method was used to construct the tree. Most of the RT sequences could be assigned to the six major lineages described in figure 2, although a number of them could form lineages different from the ones we have proposed.
Discussion
The estimated size for the A. thaliana genome is 130 Mb and, to date, 120.7 Mb are available on the databases (updated on July 27, 2000, obtained from the Arabidopsis Genome Initiative web page at http://www.arabidopsis.org/agi.html). Thus, 92.8% of the A. thaliana genome is available, and 60% has already been annotated. A systematic computer-based survey had already been performed to identify mobile elements near wild-type genes, with negative results in this species (Bureau, Ronald, and Wessler 1996). Recently, some analyses on the mobile element fraction of the A. thaliana genome using computer-based methodology have been performed. Six new lineages of the Ty3/gypsy group have been found, and phylogenetic analysis reveals that this group of plant retroelements forms two main monophyletic clades (Marin and Llorens 2000). The presence of 142 groups of putative transposable elements has been detected in a systematic survey of a large sample (17.2 Mb) of the A. thaliana genome, of which 27 are copia-like retrotransposons (Le et al. 2000). According to Kapitonov and Jurka (1999), at least 100 diverse families of copia-like retrotransposons are present in this species.
The BLASTN and TBLASTN searches performed on the DNA databases allowed us to calculate that the approximate number of copia-related sequences present in the A. thaliana genome was 300. Previous works had estimated it at 200 copies (Voytas et al. 1990; Konieczny et al. 1991; Flavell et al. 1997); thus, our estimate represents a 50% increase over the previous one. This difference can be explained by the approach used in the previous works, in which the PCR technique limited detection to sequences that had conserved the binding sites for the primers used in the amplifications. If we assume that the mean size of the copia-like sequences is between 4 and 5 kb and that the copy number is approximately 300, the copia-like fraction detected in this computer analysis represents approximately 1% of the A. thaliana genome. Previous data suggested that members of the Ta1-Ta10 families constituted approximately 0.1% of the genome of this species (Konieczny et al. 1991). Our results agree with recently published works that suggest that the copia-like fraction would constitute 1% of the A. thaliana genome (Kapitonov and Jurka 1999). If this figure is confirmed, it would constitute an extremely low proportion of the A. thaliana genome when compared with the copia-like components of other plants, such as barley (Manninen and Schulman 1993), maize (SanMiguel et al. 1996), rye (Pearce et al. 1997), and Avena (Linares, Serna, and Fominaya 1999). Several works have pointed out that there is a relationship between the total copy number of retrotransposons and the host genome size and that this could largely account for most genome size variation in plants (Katsiotis, Schmidt, and Heslop-Harrison 1996; Kumar et al. 1997; Linares, Serna, and Fominaya 1999). If this is the case, the low copy number of copia sequences in A. thaliana would correspond to the small size of its genome.
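The arithmetic behind the copy-number comparison and the genome fraction can be reproduced with a short sketch. It uses only figures quoted in the text (130 Mb genome, about 300 copies, mean size of 4 to 5 kb, earlier estimate of 200 copies); the function names are hypothetical.

```python
GENOME_MB = 130.0   # estimated A. thaliana genome size, from the text
COPIES_NEW = 300    # copia-related sequences estimated in this analysis
COPIES_OLD = 200    # earlier PCR-based estimate cited in the text


def percent_of_genome(copies: int, mean_kb: float,
                      genome_mb: float = GENOME_MB) -> float:
    """Percentage of the genome occupied by `copies` elements of `mean_kb` kb."""
    return 100.0 * (copies * mean_kb) / (genome_mb * 1000.0)


def percent_increase(new: float, old: float) -> float:
    """Relative increase of `new` over `old`, in percent."""
    return 100.0 * (new - old) / old


if __name__ == "__main__":
    low = percent_of_genome(COPIES_NEW, 4.0)
    high = percent_of_genome(COPIES_NEW, 5.0)
    print(f"copia-like fraction: {low:.2f}% to {high:.2f}% of the genome")
    print(f"increase over earlier estimate: "
          f"{percent_increase(COPIES_NEW, COPIES_OLD):.0f}%")
```

With these inputs the fraction falls between roughly 0.9% and 1.2% of the genome, that is, close to 1% and about ten times the 0.1% attributed to the Ta1-Ta10 families.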
We have analyzed 71 copia-like sequences, and most of them appeared to be highly heterogeneous in size and potentially inactive. Only 25 elements showed many of the structural characteristics necessary for potential activity, with eight of these corresponding to families described in previous works. The presence of target site duplications in most of these elements, along with LTRs that are identical or nearly identical within each element, suggests recent activity. In this respect, the high sequence identity between the 5' and 3' LTRs of retrotransposons from different families of maize (SanMiguel et al. 1998) led to the proposal of a recent burst of activity of the elements in the maize genome. Although most of the analyzed sequences indicate the presence of inactive elements that could be considered relicts of ancient active ones, it is interesting to note the presence of potentially active copia-like elements in the A. thaliana genome. Supporting the potential activity of these elements, three of them have associated ESTs, which implies transcriptional activity. None of the analyzed elements is located in the heterochromatin, suggesting that they have inserted in coding regions, although a more exhaustive analysis of the genomic sequences adjacent to the insertion points would be necessary to find how many genes, if any, have been interrupted by the insertions.
The most noticeable characteristic of the 23 analyzed copia-like elements is their heterogeneity. All of these elements vary in sequence and size, although they maintain similar copia-like structures. Divergence values ranging from 5% to 40% are in accord with the extreme variability previously described for copia-like elements in plants (Flavell et al. 1992, 1997; Flavell, Smith, and Kumar 1992; VanderWiel, Voytas, and Wendel 1993; Kumar 1996; Matsuoka and Tsunewaki 1996, 1999; Kumar et al. 1997; Pearce et al. 1997; Wang et al. 1997; Kuipers, Heslop-Harrison, and Jacobsen 1998; Yañez et al. 1998; Kumar and Bennetzen 1999). Some of these works find that the nucleotide divergence among copia-like elements ranges from 0.4% to 57.8% in several grass species (Matsuoka and Tsunewaki 1999) and from 17% to 61% in rye (Pearce et al. 1997), and that the divergence at the amino acid level varies from 1% to 64% in rice (Wang et al. 1997) and from 33% to 58% in Lycopersicon chilense (Yañez et al. 1998). Nevertheless, a relatively high homogeneity has been found among members of the copia-like SIRE-1 family in the soybean genome (Laten, Majumdar, and Gaucher 1998). In A. thaliana, high divergence values between members of the Ta1-Ta10 families were reported early (Voytas and Ausubel 1988; Voytas et al. 1990; Konieczny et al. 1991). This high divergence is confirmed by the fact that all of the elements we analyzed in our work differ markedly from one another. It could be proposed that all of the elements belong to different families, and therefore at least 23 families of different copia-like elements would exist in the A. thaliana genome. The phylogenetic analysis indicated strong phylogenetic relationships among them, and six major clades were defined, supported by high bootstrap values. These results strongly agree with recent data from Le et al. (2000), which mention the presence of 27 groups of copia-like elements in the A. thaliana genome. In contrast, Kapitonov and Jurka (1999) reported the presence of 100 diverse copia-like families in the genome of this species. These strong differences in the estimated number of families are probably caused by the criteria used to establish families of copia-like elements.
Ten families, Ta1-Ta10, were described early in the A. thaliana genome, and the authors considered that elements sharing >95% nucleotide identity could be considered members of the same family and that those sharing <85% identity could be regarded as members of different families (Voytas 1992). Assuming the same criterion, with the exception of AtRE1 and AtRE2, the rest of the elements could be considered members of separate families. In a similar way, we propose that there are at least 23 families of copia-like elements in the A. thaliana genome. These 23 families are grouped into six major lineages which display high divergence values between them. Interestingly, divergences between members belonging to different lineages are higher than divergences with respect to copia-like elements from other plant species. For example, the copia I lineage includes the Hopscotch element from maize, copia V includes SIRE-1 from soybean, and copia VI includes Tnt1 from Nicotiana tabacum. It is remarkable that the copia I lineage presents elements from such distant species as the monocotyledonous Z. mays and the dicotyledonous A. thaliana. Our present data suggest the existence of these 23 copia-like families among plant genomes; these families would be grouped into a few main lineages. Four superfamilies, named families G1-G4 by Matsuoka and Tsunewaki (1999), have been described across grass species. We think that the lineages copia I-VI described here are related to the G1-G4 superfamilies described in that work. In fact, copia VI and G2 share some elements analyzed in both works. The presence of sharply diverged lineages within the copia-like fraction of several plant species has been established, for example, in Vicia, Solanum, Gossypium, and Lycopersicon (VanderWiel, Voytas, and Wendel 1993; Kumar et al. 1997; Yañez et al. 1998). However, to establish the phylogenetic relationships between these interspecific lineages of copia-like elements, a more extensive analysis must be done.
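The family criterion attributed to Voytas (1992) can be expressed as a small decision rule. The sketch below is only an illustration: it assumes identity is computed over pre-aligned nucleotide sequences with gapped columns ignored, and the label for the interval between 85% and 95%, which the criterion leaves undefined, is our own placeholder.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable columns")
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def classify_pair(identity: float, same_above: float = 95.0,
                  different_below: float = 85.0) -> str:
    """Apply the >95% / <85% nucleotide identity thresholds quoted in the text."""
    if identity > same_above:
        return "same family"
    if identity < different_below:
        return "different families"
    # The quoted criterion does not say how to treat this interval.
    return "unresolved"


if __name__ == "__main__":
    # Toy aligned sequences (illustrative only).
    identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(identity, classify_pair(identity))
```

Keeping the thresholds as parameters makes it easy to see how a different cutoff would change the family count, which is the kind of criterion-dependent discrepancy noted above between the published estimates.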
As we discussed above, most of the analyzed members of the corresponding families are potentially active. They tend to maintain the structural characteristics necessary for retrotransposition. The existence in a given superfamily of active elements present in the genomes of phylogenetically very distant species suggests the presence of active copies of the ancient element in the ancestral species. Therefore, divergent changes operating in the elements of the different species would have been subjected to functional constraints. Finally, most of the 71 analyzed sequences were defective and probably located in the heterochromatin region. We estimate that approximately 80% of copia-like elements in the A. thaliana genome are defective, which is consistent with the defective fraction estimated for other plants, such as rye, with 96% defective elements (Pearce et al. 1997). The presence of such a large fraction of defective elements could constitute a good strategy for the plants to avoid the deleterious effects of the huge number of transposable elements that inhabit their genomes.
Footnotes
1 Keywords: copia-like elements, Arabidopsis thaliana, evolution of transposable elements
2 Address for correspondence and reprints: Rosa de Frutos, Departamento de Genética, Facultad de Ciencias Biológicas, Dr. Moliner 50, 46100 Burjasot, Valencia, Spain. rosa.frutos{at}uv.es
Literature Cited
Adé, J., and F. J. Belzile. 1999. Hairpin elements, the first family of foldback transposons (FTs) in Arabidopsis thaliana. Plant J. 19:591597.
Altschul, S. F., T. L. Madden, A. A. Schäffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:33893402.
Brandes, A., J. S. Heslop-Harrison, A. Kamm, S. Kubis, R. L. Doudrick, and T. Schmidt. 1997. Comparative analysis of the chromosomal genomic organization of Ty1-copia-like retrotransposons in pteridophytes, gymnosperms and angiosperms. Plant Mol. Biol. 33:1121.[ISI][Medline]
Bureau, T. E., P. C. Ronald, and S. R. Wessler. 1996. A computer-based systematic survey reveals the predominance of small inverted-repeat elements in wild-type rice genes. Proc. Natl. Acad. Sci. USA 93:85249.
Burge, C., and S. Karlin. 1997. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268:7894.[ISI][Medline]
Casacuberta, E., J. M. Casacuberta, P. Puigdomènech, and A. Monfort. 1998. Presence of miniature inverted-repeat transposable elements (MITEs) in the genome of Arabidopsis thaliana: characteristics of the Emigrant family of elements. Plant J. 16:7985.[ISI][Medline]
Chye, M. L., K. Y. Cheung, and J. Xu. 1997. Characterization of TSCL, a nonviral retroposon from Arabidopsis thaliana. Plant Mol. Biol. 35:893904.
Cold Spring Harbor Laboratory, Washington University Genome Sequencing Center, and PE Biosystems Arabidopsis Sequencing Consortium. 2000. The complete sequence of a heterochromatic island from a higher eukaryote. Cell 100:377386.
Comella, P., H. J. Wu, M. Laudie, C. Berger, R. Cooke, M. Delseny, and F. Grellet. 1999. Fine sequence analysis of 60 kb around the Arabidopsis thaliana AtEm1 locus on chromosome III. Plant Mol. Biol. 41:687700.[ISI][Medline]
Copenhaver, G. P., K. Nickel, T. Kuromori et al. (14 co-authors). 1999. Genetic definition and sequence analysis of Arabidopsis centromeres. Science 286:24682474.
European Union Arabidopsis Genome Sequencing Consortium, Cold Spring Harbor Laboratory, Washington University in St Louis, and PE Biosystems Arabidopsis Sequencing Consortium. 1999. Progress in sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana. Nature 402:769777.
European Union Chromosome 3 Arabidopsis Sequencing Consortium, Institute for Genomic Research, and Kazusa DNA Research Institute. 2000. Sequence and analysis of chromosome 3 of the plant Arabidopsis thaliana. Nature 408:820823.
Feschotte, C., and C. Mouchès. 2000. Evidence that a family of miniature inverted-repeat transposable elements (MITEs) from Arabidopsis thaliana genome has arisen from a pogo-like transposon. Mol. Biol. Evol. 17:730737.
Flavell, A. J., E. Dunbar, R. Anderson, S. R. Pearce, R. Hartley, and A. Kumar. 1992. Ty1-copia group retrotransposons are ubiquitous and heterogeneous in higher plants. Nucleic Acids Res. 20:36393644.[Abstract]
Flavell, A. J., S. R. Pearce, P. Heslop-Harrison, and A. Kumar. 1997. The evolution of Ty1-copia group retrotransposons in eukaryote genomes. Genetica 100:185195.
Flavell, A. J., D. B Smith, and A. Kumar. 1992. Extreme heterogeneity of Ty1-copia group retrotransposons in plants. Mol. Gen. Genet. 231:233242.[ISI][Medline]
Frank, M. J., D. Liu, Y. F. Tsay, C. Ustach, and N. M. Crawford. 1997. Tag1 is an autonomous transposable element that shows somatic excision in both Arabidopsis and tobacco. Plant Cell 9:17451756.
Garber, K., I. Bilic, O. Pusch, J. Tohme, A. Bachmair, D. Schweizer, and V. Jantsch. 1999. The Tpv2 family of retrotransposons of Phaseolus vulgaris: structure, integration characteristics, and use for genotype classification. Plant Mol. Biol. 39:797807.[ISI][Medline]
Gauss, D. H., and M. Sprinzl. 1983. Compilation of sequences of tRNA genes. Nucleic Acids Res. 11:55–103.
Grandbastien, M. A., A. Spielmann, and M. Caboche. 1989. Tnt1, a mobile retroviral-like transposable element of tobacco isolated by plant cell genetics. Nature 337:376–380.
Henikoff, S., and L. Comai. 1998. A DNA methyltransferase homolog with a chromodomain exists in multiple polymorphic forms in Arabidopsis. Genetics 149:307–318.
Henk, A. D., R. F. Warren, and R. W. Innes. 1999. A new Ac-like transposon of Arabidopsis is associated with a deletion of the RPS5 disease resistance gene. Genetics 151:1581–1589.
Hervé, C., J. Serres, P. Dabos, H. Canut, A. Barre, P. Rougé, and B. Lescure. 1999. Characterization of the Arabidopsis lecRK-a genes: members of a superfamily encoding putative receptors with an extracellular domain homologous to legume lectins. Plant Mol. Biol. 39:671–682.
Heslop-Harrison, J. S., A. Brandes, S. Taketa et al. (15 co-authors). 1997. The chromosomal distributions of Ty1-copia group retrotransposable elements in higher plants and their implications for genome evolution. Genetica 100:197–204.
Hirochika, H., K. Sugimoto, Y. Otsuki, H. Tsugawa, and M. Kanda. 1996. Retrotransposons of rice involved in mutations induced by tissue culture. Proc. Natl. Acad. Sci. USA 93:7783–7788.
Janetzky, B., and L. Lehle. 1992. Ty4, a new retrotransposon from Saccharomyces cerevisiae, flanked by tau-elements. J. Biol. Chem. 267:19798–19805.
Johnson, M. S., M. A. McClure, D. F. Feng, J. Gray, and R. F. Doolittle. 1986. Computer analysis of retroviral pol genes: assignment of enzymatic functions to specific sequences and homologies with nonviral enzymes. Proc. Natl. Acad. Sci. USA 83:7648–7652.
Kapitonov, V. V., and J. Jurka. 1999. Molecular paleontology of transposable elements from Arabidopsis thaliana. Genetica 107:27–37.
Katsiotis, A., T. Schmidt, and J. S. Heslop-Harrison. 1996. Chromosomal and genomic organization of Ty1-copia-like retrotransposon sequences in the genus Avena. Genome 39:410–417.
Kim, J. M., S. Vanguri, J. D. Boeke, A. Gabriel, and D. F. Voytas. 1998. Transposable elements and genome organization: a comprehensive survey of retrotransposons revealed by the complete Saccharomyces cerevisiae genome sequence. Genome Res. 8:464–478.
Klimyuk, V. I., and J. D. G. Jones. 1997. AtDMC1, the Arabidopsis homologue of the yeast DMC1 gene: characterization, transposon-induced allelic variation and meiosis-associated expression. Plant J. 11:1–14.
Knoop, V., M. Unseld, J. Marienfeld, P. Brandt, S. Sünkel, H. Ullrich, and A. Brennicke. 1996. copia, gypsy and LINE-like retrotransposon fragments in the mitochondrial genome of Arabidopsis thaliana. Genetics 142:579–585.
Konieczny, A., D. F. Voytas, M. P. Cummings, and F. M. Ausubel. 1991. A superfamily of Arabidopsis thaliana retrotransposons. Genetics 127:801–809.
Kuipers, A. G., J. S. Heslop-Harrison, and E. Jacobsen. 1998. Characterisation and physical localisation of Ty1-copia-like retrotransposons in four Alstroemeria species. Genome 41:357–367.
Kumar, A. 1996. The adventures of the Ty1-copia group of retrotransposons in plants. Trends Genet. 12:41–43.
Kumar, A., and J. L. Bennetzen. 1999. Plant retrotransposons. Annu. Rev. Genet. 33:479–532.
Kumar, A., S. R. Pearce, K. McLean, G. Harrison, J. S. Heslop-Harrison, R. Waugh, and A. J. Flavell. 1997. The Ty1-copia group of retrotransposons in plants: genomic organisation, evolution, and use as molecular markers. Genetica 100:205–217.
Kumar, S., K. Tamura, and M. Nei. 1993. MEGA: molecular evolutionary genetics analysis. Version 1.0. Pennsylvania State University, University Park.
Kuwahara, A., A. Kato, and Y. Komeda. 2000. Isolation and characterization of copia-type retrotransposons in Arabidopsis thaliana. Gene 244:127–136.
Laten, H. M., A. Majumdar, and E. A. Gaucher. 1998. SIRE-1, a copia/Ty1-like retroelement from soybean, encodes a retroviral envelope-like protein. Proc. Natl. Acad. Sci. USA 95:6897–6902.
Le, Q. H., S. Wright, Z. Yu, and T. Bureau. 2000. Transposon diversity in Arabidopsis thaliana. Proc. Natl. Acad. Sci. USA 97:7376–7381.
Lin, X., S. Kaul, S. Rounsley et al. (37 co-authors). 1999. Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana. Nature 402:761–768.
Linares, C., A. Serna, and A. Fominaya. 1999. Chromosomal organization of a sequence related to LTR-like elements of Ty1-copia retrotransposons in Avena species. Genome 42:706–713.
Manninen, I., and A. H. Schulman. 1993. BARE-1, a copia-like retroelement in barley (Hordeum vulgare L.). Plant Mol. Biol. 22:829–846.
Marin, I., and C. Llorens. 2000. Ty3/Gypsy retrotransposons: description of new Arabidopsis thaliana elements and evolutionary perspectives derived from comparative genomic data. Mol. Biol. Evol. 17:1040–1049.
Matsuoka, Y., and K. Tsunewaki. 1996. Wheat retrotransposon families identified by reverse transcriptase domain analysis. Mol. Biol. Evol. 13:1384–1392.
Matsuoka, Y., and K. Tsunewaki. 1999. Evolutionary dynamics of Ty1-copia group retrotransposons in grass shown by reverse transcriptase domain analysis. Mol. Biol. Evol. 16:208–217.
Meyerowitz, E. M. 1992. Introduction to the Arabidopsis genome. Pp. 100–118 in C. Koncz, N. Chua, and J. Schell, eds. Methods in Arabidopsis research. World Scientific Publishing, Singapore.
Miller, J. T., F. Dong, S. A. Jackson, J. Song, and J. Jiang. 1998. Retrotransposon-related DNA sequences in the centromeres of grass chromosomes. Genetics 150:1615–1623.
Nei, M., and R. Chakraborty. 1976. Empirical relationship between the number of nucleotide substitutions and interspecific identity of amino acid sequences in some proteins. J. Mol. Evol. 26:313–323.
Oosumi, T., B. Garlick, and W. R. Belknap. 1996. Identification of putative nonautonomous transposable elements associated with several transposon families in Caenorhabditis elegans. J. Mol. Evol. 43:11–18.
Pearce, S. R., G. Harrison, P. J. Heslop-Harrison, A. J. Flavell, and A. Kumar. 1997. Characterization and genomic organization of Ty1-copia group retrotransposons in rye (Secale cereale). Genome 40:617–625.
Pearce, S. R., G. Harrison, D. Li, J. Heslop-Harrison, A. Kumar, and A. J. Flavell. 1996a. The Ty1-copia group retrotransposons in Vicia species: copy number, sequence heterogeneity and chromosomal localisation. Mol. Gen. Genet. 250:305–315.
Pearce, S. R., U. Pich, G. Harrison, A. J. Flavell, J. S. Heslop-Harrison, I. Schubert, and A. Kumar. 1996b. The Ty1-copia group retrotransposons of Allium cepa are distributed throughout the chromosomes but are enriched in the terminal heterochromatin. Chromosome Res. 4:357–364.
Pearl, L. H., and W. R. Taylor. 1987. A structural model for the retroviral proteases. Nature 329:351–354.
Peleman, J., B. Cottyn, M. Van Montagu, and D. Inzé. 1991. Transient occurrence of extrachromosomal DNA of an Arabidopsis thaliana transposon-like element, Tat1. Proc. Natl. Acad. Sci. USA 88:3618–3622.
Pélisier, T., S. Tutois, J. M. Deragon, S. Tourmente, S. Genestier, and G. Picard. 1995. Athila, a new retroelement from Arabidopsis thaliana. Plant Mol. Biol. 29:441–452.
Pélisier, T., S. Tutois, S. Tourmente, J. M. Deragon, and G. Picard. 1996. DNA regions flanking the major Arabidopsis thaliana satellite are principally enriched in Athila retroelement sequences. Genetica 97:141–151.
Prats, A. C., L. Sarih, C. Gabus, S. Litvak, G. Keith, and J. L. Darlix. 1988. Small finger protein of avian and murine retroviruses has nucleic acid annealing activity and positions the replication primer tRNA onto genomic RNA. EMBO J. 7:1777–1783.
Quigley, F., P. Dao, A. Cottet, and R. Mache. 1996. Sequence analysis of an 81 kb contig from Arabidopsis thaliana chromosome III. Nucleic Acids Res. 24:4313–4318.
Saitou, N., and M. Nei. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406–425.
SanMiguel, P., B. S. Gaut, A. Tikhonov, Y. Nakajima, and J. L. Bennetzen. 1998. The paleontology of intergene retrotransposons of maize. Nat. Genet. 20:43–45.
SanMiguel, P., A. Tikhonov, Y. K. Jin et al. (11 co-authors). 1996. Nested retrotransposons in the intergenic regions of the maize genome. Science 274:765–768.
Staden, R. 1996. The Staden sequence analysis package. Mol. Biotechnol. 5:233–241.
Surzycki, S. A., and W. R. Belknap. 1999. Characterization of repetitive DNA elements in Arabidopsis. J. Mol. Evol. 48:684–691.
Swofford, D. L., and R. R. Selander. 1981. BIOSYS-1. A computer program for the analysis of allelic variation in genetics. University of Illinois, Urbana.
Terryn, N., L. Heijnen, A. De Keyser et al. (21 co-authors). 1999. Evidence for an ancient chromosomal duplication in Arabidopsis thaliana by sequencing and analyzing a 400-kb contig at the APETALA2 locus on chromosome 4. FEBS Lett. 445:237–245.
Thompson, H. L., R. Schmidt, and C. Dean. 1996. Analysis of the occurrence and nature of repeated DNA in an 850 kb region of Arabidopsis thaliana chromosome 4. Plant Mol. Biol. 32:553–557.
Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680.
Tsay, Y. F., M. J. Frank, T. Page, C. Dean, and N. M. Crawford. 1993. Identification of a mobile endogenous transposon in Arabidopsis thaliana. Science 260:342–344.
VanderWiel, P. L., D. F. Voytas, and J. F. Wendel. 1993. copia-like retrotransposable element evolution in diploid and polyploid cotton (Gossypium L.). J. Mol. Evol. 36:429–447.
Voytas, D. F. 1992. Arabidopsis and cotton (Gossypium) as models for studying copia-like retrotransposon evolution. Genetica 86:13–20.
Voytas, D. F., and F. M. Ausubel. 1988. A copia-like transposable element family in Arabidopsis thaliana. Nature 336:242–244.
Voytas, D. F., M. P. Cummings, A. Konieczny, F. M. Ausubel, and S. R. Rodermel. 1992. copia-like retrotransposons are ubiquitous among plants. Proc. Natl. Acad. Sci. USA 89:7124–7128.
Voytas, D. F., A. Konieczny, M. P. Cummings, and F. M. Ausubel. 1990. The structure, distribution and evolution of the Ta1 retrotransposable element family of Arabidopsis thaliana. Genetics 126:713–721.
Wang, S., N. Liu, K. Peng, and Q. Zhang. 1999. The distribution and copy number of copia-like retrotransposons in rice (Oryza sativa L.) and their implications in the organization and evolution of the rice genome. Proc. Natl. Acad. Sci. USA 96:6824–6828.
Wang, S., Q. Zhang, P. J. Maughan, and M. A. Saghai Maroof. 1997. copia-like retrotransposons in rice: sequence heterogeneity, species distribution and chromosomal locations. Plant Mol. Biol. 33:1051–1058.
White, S. E., L. F. Habera, and S. R. Wessler. 1994. Retrotransposons in the flanking regions of normal plant genes: a role for copia-like elements in the evolution of gene structure and expression. Proc. Natl. Acad. Sci. USA 91:11792–11796.
Wright, D. A., N. Ke, J. Smalle, B. M. Hauge, H. M. Goodman, and D. F. Voytas. 1996. Multiple non-LTR retrotransposons in the genome of Arabidopsis thaliana. Genetics 142:569–578.
Wright, D. A., and D. F. Voytas. 1998. Potential retroviruses in plants: Tat1 is related to a group of Arabidopsis thaliana Ty3/gypsy retrotransposons that encode envelope-like proteins. Genetics 149:703–715.
Xiong, Y., and T. H. Eickbush. 1990. Origin and evolution of retroelements based upon their reverse transcriptase sequences. EMBO J. 9:3353–3362.
Yáñez, M., I. Verdugo, M. Rodríguez, S. Prat, and S. Ruiz-Lara. 1998. Highly heterogeneous families of Ty1/copia retrotransposons in the Lycopersicon chilense genome. Gene 222:223–228.