* Department of Biology, Loyola University Chicago
Department of Zoology and Genetics, Iowa State University
Correspondence: E-mail: hlaten{at}luc.edu.
Abstract
Key Words: endogenous retrovirus, retrotransposon, phylogenetics, Ty1-copia, Glycine max
Introduction
Among the plant LTR retrotransposons, families of both Ty1-copia and Ty3-gypsy elements have been discovered that encode envelope-like proteins (Turcich et al. 1996; Laten, Majumdar, and Gaucher 1998; Wright and Voytas 1998; Kapitonov and Jurka 1999; Laten 1999; Peterson-Burch et al. 2000; Vicient, Kalendar, and Schulman 2001; Wright and Voytas 2002). This suggests that plant genomes, like those of Drosophila, other lower animals, and vertebrates (Boeke and Stoye 1997), harbor endogenous retroviruses. Of the plant elements with env-like genes, SIRE1-1 from Glycine max is the only element whose coding sequences are not truncated or peppered with nonsense and/or frameshift mutations (Laten, Majumdar, and Gaucher 1998; Laten 1999; Peterson-Burch et al. 2000). Furthermore, Southern hybridization analyses suggested that the approximately 1,000 SIRE1 copies in the soybean genome are homogeneous (Laten and Morris 1993; Laten, Majumdar, and Gaucher 1998), implying that SIRE1 elements may have recently undergone a replication burst. To explore this hypothesis and to understand the evolutionary dynamics of the SIRE1 family, we characterized the DNA sequences of seven additional independent SIRE1 insertions, as well as SIRE1 reverse transcriptases from the ancestral wild soybean, Glycine soja.
Materials and Methods
For sequencing, phage DNAs were isolated from plate lysates (Qiagen). SIRE1-7, SIRE1-8, and SIRE1-9 DNAs were sequenced directly from recombinant phage at the University of Chicago Cancer Research Center DNA Sequencing Facility, as were selected regions of SIRE1-2, SIRE1-4, SIRE1-13, and SIRE1-14. Phage DNAs for SIRE1-2, SIRE1-4, SIRE1-13, and SIRE1-14 were amplified using the high fidelity DNA polymerase, Pfx (Invitrogen), with primers based on the sequence of SIRE1-1. PCR products were purified (Qiagen) and sequenced at the Iowa State University DNA Sequencing Facility. Repeated amplifications using the same templates yielded products with identical DNA sequences. DNAs isolated from Glycine soja leaves (Klimyuk et al. 1993) were PCR-amplified under standard conditions using forward and reverse primers based on SIRE1-1 rt (GAGGCACTGACTGATGAGTTC and TTCTTTGCATACTTGCTTTGTGAG, respectively). PCR products were cloned into TOPO2.1 vectors (Invitrogen), and three clones were sequenced using vector primers at the Iowa State University DNA Sequencing Facility.
DNA sequences were aligned using ClustalW (Higgins, Thompson, and Gibson 1996). The presence of size polymorphisms in the region between the env-like ORF and the 3' LTR (bases 8200 to 8700) made alignments difficult, and so the region was manually realigned. Gaps were inserted to maximize alignments of nearly identical blocks of duplicated nucleotides. Phylogenetic and molecular evolutionary analyses were conducted using MEGA version 2.1 (Kumar et al. 2001). DNA p-distances were used for closely related sequences (d < 0.05) and, where appropriate, gamma distances were calculated using Kimura's two-parameter method (Kimura 1980). Minimum-evolution, neighbor-joining, and maximum-parsimony trees were constructed and were evaluated on the basis of 5,000 bootstrap replicates. To evaluate the synonymous to nonsynonymous substitution ratios (dS/dN), ORF1 was split into two subregions, one encoding just the structural Gag protein(s) and one encoding PR, IN, and RT (Pol). Since the cleavage site between Gag and Pol is not yet known, the junction was defined to be 25 codons upstream of the conserved Asp-Ser-Gly presumed to be the protease active site. This position approximates the protease cleavage site for HIV (Pearl and Taylor 1987) as well as for Ty1 (Merkulov et al. 1996) and Ty3 (Kirchner and Sandmeyer 1993). To evaluate the dS/dN ratios for the env-like ORF, the amino acid immediately following the pol termination codon was designated the start codon. Codon-aligned nucleotide sequences were analyzed using SNAP (Nei and Gojobori 1986). The ages of selected elements were determined by calculating p-distances between the two LTRs of individual elements and by using a cruciferous molecular clock to estimate the times of insertion (Haubold and Wiehe 2001). Sequences in GenBank related to SIRE1 and those flanking SIRE1 insertions were sought using BlastN, TBlastN, and TBlastX (Altschul et al. 1997).
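The two pairwise distance measures named above can be illustrated with a short sketch. This is not the MEGA implementation: it assumes two pre-aligned sequences of equal length, skips any column containing a gap or an ambiguous base, and omits the gamma correction. The function name `pairwise_distances` and the example sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
VALID = set("ACGT")

def pairwise_distances(seq_a, seq_b):
    """Return (p_distance, k2p_distance) for two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in VALID or b not in VALID:
            continue  # skip gaps and ambiguous bases (pairwise deletion)
        compared += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1    # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    P = transitions / compared    # proportion of transitional differences
    Q = transversions / compared  # proportion of transversional differences
    p_distance = P + Q
    # Kimura (1980) two-parameter distance, without gamma correction
    k2p = -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))
    return p_distance, k2p

# Toy example: one transition in ten compared sites
p, k2p = pairwise_distances("ACGTACGTAC", "ACGTACGTAT")
print(round(p, 4), round(k2p, 4))
```

For very similar sequences the two values are nearly equal, which is consistent with using the uncorrected p-distance when d < 0.05.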
Potential TATA promoter elements and transcriptional start sites were predicted using time delay neural network (TDNN) (Reese 2001) and ProScan 1.7 (Prestridge 1995). Transmembrane peptides were identified using TMpred (Hofman and Stoffel 1993) and PHDhtm (Rost et al. 1995). Searches for potential transcription factor binding sites were performed using MatInspector V2.2 based on the TRANSFAC 4.0 database (Quandt et al. 1995) and SignalScan (Higo et al. 1999). For clarity and consistency, all nucleotide positions refer to the consensus sequence found with the full alignment (see Supplementary Material below).
Results
Another measure of the relative age of the SIRE1 elements is the divergence between the LTRs of the same element. The LTRs of a single retroelement are theoretically identical at the time of insertion because they are reverse transcribed from the same template sequence. Once integrated, changes in LTR sequences should not be subject to selection, and their frequency should approximate the mutation rate. Of the elements with two complete LTRs, SIRE1-4 had one base-pair change and SIRE1-8 had two base-pair changes. The three elements truncated in the 3' LTR (SIRE1-7, SIRE1-9, and SIRE1-13) had no base-pair changes, one base-pair change, and no base-pair changes, respectively. Using an Arabidopsis molecular clock for the synonymous evolution rate (Haubold and Wiehe 2001), we estimated that the insertions of SIRE1-4 and SIRE1-8 occurred approximately 30,000 and 70,000 years ago, respectively.
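The arithmetic behind an LTR-based age estimate can be sketched as follows. Because both LTRs accumulate changes independently after insertion, the divergence K between them corresponds to an age of roughly K / (2r), where r is the substitution rate per site per year. The aligned LTR length (1,000 bp) and the rate (1.5e-8) used below are placeholder assumptions for illustration, not values taken from the text, and `ltr_insertion_age` is a hypothetical name.

```python
def ltr_insertion_age(differences, aligned_length, rate_per_site_per_year):
    """Estimate years since insertion from the divergence of an element's two LTRs."""
    if aligned_length <= 0 or rate_per_site_per_year <= 0:
        raise ValueError("length and rate must be positive")
    divergence = differences / aligned_length  # uncorrected p-distance
    # Both LTRs diverge from the shared ancestral sequence, hence the factor 2
    return divergence / (2 * rate_per_site_per_year)

ASSUMED_LTR_LENGTH = 1000  # bp, placeholder
ASSUMED_RATE = 1.5e-8      # substitutions per site per year, placeholder

for name, diffs in (("SIRE1-4", 1), ("SIRE1-8", 2)):
    age = ltr_insertion_age(diffs, ASSUMED_LTR_LENGTH, ASSUMED_RATE)
    print(f"{name}: about {age:,.0f} years")
```

With these placeholder inputs the results are of the same order as the estimates quoted above, but the exact figures depend on the true LTR lengths and on the clock calibration.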
The LTRs and Putative cis-Acting Sequences
The LTRs range in length from 902 bp to 1,194 bp (table 1). The length polymorphisms among LTRs are due primarily to tandem sequence duplications. The 5' ends of the SIRE1-4, SIRE1-7, SIRE1-9, and SIRE1-13 LTRs have a common 96-bp duplication separated by 5 bp (fig. 4). The distribution of this duplication replicates that of the length polymorphisms (see table 3). In addition, the LTRs of SIRE1-4 and SIRE1-7 have four tandem copies of an imperfect 20-bp repeat beginning at base 726; SIRE1-13 has three and one quarter copies; SIRE1-9 has three copies; and SIRE1-2, SIRE1-8, and SIRE1-14 have two copies each.
The LTRs contain several repeats of variable length that are suggestive of regulatory elements (fig. 4). Although none of these repeats contains motifs resembling cis-acting regulatory elements in characterized plant retrotransposons (Grandbastien et al. 1997; Takeda et al. 1999), several contain the sequence AAAG, which forms the core binding site for Dof zinc-finger transcription factors (Yanagisawa and Schmidt 1999). Between bases 418 and 508, this tetranucleotide is present five times in SIRE1-1, SIRE1-2, SIRE1-8, SIRE1-13, and SIRE1-14 and eight times in SIRE1-4, SIRE1-7, and SIRE1-9. The same sequence is also present at elevated density on the complementary strand (fig. 4). Based on the overall DNA composition of the LTR, AAAG and CTTT would be expected to occur only 0.6 and 0.4 times, respectively, in this region. The cluster of AAAG is most dense between 95 and 185 bp upstream of the putative TATA box, a position typical of regulatory elements in other retrotransposons (Grandbastien et al. 1997; Takeda et al. 1999).
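The comparison between observed and expected motif counts can be sketched as below. The expected count assumes a simple model in which bases occur independently at the frequencies seen in a background sequence; whether the figures above were derived in exactly this way is not stated, so this is an illustration only. The function names and the toy inputs are hypothetical.

```python
from collections import Counter

def count_motif(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    seq, motif = seq.upper(), motif.upper()
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)

def expected_count(background, motif, window_length):
    """Expected motif count in a window under independent base frequencies."""
    freqs = Counter(background.upper())
    total = sum(freqs[b] for b in "ACGT")
    prob = 1.0
    for base in motif.upper():
        prob *= freqs[base] / total  # probability of this base at one position
    # Number of possible start positions times the per-position probability
    return (window_length - len(motif) + 1) * prob

# Toy inputs: a uniform background and a 91-bp window (bases 418 to 508)
print(count_motif("AAAGAAAG", "AAAG"))
print(expected_count("ACGT", "AAAG", 91))
```

An observed count several times larger than the expected value, as reported for AAAG, is what marks the cluster as non-random under this model.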
The tRNA primer binding site (PBS) in SIRE1 is complementary to soybean initiator methionine tRNA (tRNAiMet) (Bi and Laten 1996). Among the insertions sequenced, the PBS of every clade 1 element is complementary to 10 bases at the 3' end of the tRNA, whereas the PBS of clade 2 elements is complementary to the first 12 bases. Interestingly, the first 10 bases of the PBS (TGGTATCAGA) are repeated just upstream of the 3' end of the LTR in every SIRE1 member. The polypurine tract (PPT) lies adjacent to the 3' LTR and has the sequence AAAGGGGGAGA. There are no sequence polymorphisms within the PPT or in the 50 bp upstream of this sequence.
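As a small illustration of PBS complementarity, the reverse complement of the 10-base PBS sequence given above yields the stretch of tRNA 3' end (written as DNA) that it would pair with. The helper names are hypothetical, and the 12-base PBS region in the usage line is an invented example rather than a SIRE1 sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def pbs_match_length(pbs_region, trna_3prime_end):
    """Count leading PBS bases complementary to the tRNA 3' end (DNA letters, 5'->3')."""
    target = reverse_complement(trna_3prime_end)
    matched = 0
    for a, b in zip(pbs_region.upper(), target):
        if a != b:
            break
        matched += 1
    return matched

pbs = "TGGTATCAGA"                 # first 10 bases of the PBS, from the text
trna_end = reverse_complement(pbs)  # implied tRNA 3' end, as DNA
print(trna_end)
print(pbs_match_length("TGGTATCAGAGC", trna_end))  # hypothetical 12-base region
```

The implied tRNA segment ends in CCA, as expected for the 3' terminus of a mature tRNA.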
gag-pol
A consensus sequence of SIRE1 elements encodes Gag and Pol on a single open reading frame, which is presumably translated as a single polyprotein. Within Gag-Pol are the invariant amino acid residues and conserved motifs found in most Ty1-copia class retrotransposons (Peterson-Burch and Voytas 2002). These include a zinc finger-like Cys-Cys-His-Cys motif in the presumed nucleocapsid protein (SIRE1 has two), an Asp-Ser-Gly motif in the catalytic site of protease, His-His-Cys-Cys and Asp-Asp-35-Glu motifs in IN, and several conserved domains within RT.
The SIRE1 gag-pol coding region is remarkably conserved, ranging between 95% and 99% identity, with an average of 98%. Some of the nucleotide changes likely compromise SIRE1 function. SIRE1-2 has four single-base frameshift mutations within gag, whereas SIRE1-13 and SIRE1-1 each have a single nonsense mutation. Despite these obvious mutations, six short indels have occurred that preserve the reading frame. All but one of these indels are located in the first 1,700 bp of ORF1, within the Gag and PR coding regions. In addition, we calculated the ratio of synonymous to nonsynonymous substitutions (dS/dN ratio), which reflects the extent to which nucleotide changes preserved the amino acid sequence. For gag, defined as the coding region from the presumed start codon to 25 amino acids upstream of the protease active site, the average dS/dN ratio among elements was 3.90, denoting selective constraint at most sites. Selection for function of pol was considerably stronger, with a dS/dN ratio of 7.45.
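The kind of calculation behind a dS/dN ratio can be sketched with a compact version of the Nei and Gojobori (1986) counting method. This is an illustrative reimplementation, not the SNAP program: it assumes the standard genetic code, counts changes to stop codons as nonsynonymous when tallying sites, discards mutational paths that pass through a stop codon, and skips codons containing gaps, ambiguous bases, or stops. All function names are hypothetical.

```python
import math
from itertools import permutations

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def synonymous_sites(codon):
    """Number of synonymous sites (0 to 3) in one sense codon."""
    sites = 0.0
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                    sites += 1 / 3
    return sites

def codon_differences(c1, c2):
    """Average (synonymous, nonsynonymous) changes over all stop-free paths."""
    positions = [i for i in range(3) if c1[i] != c2[i]]
    syn_total = non_total = 0.0
    n_paths = 0
    for order in permutations(positions):
        current, syn, non = c1, 0, 0
        for pos in order:
            step = current[:pos] + c2[pos] + current[pos + 1:]
            if CODON_TABLE[step] == "*":
                break  # discard paths that pass through a stop codon
            if CODON_TABLE[step] == CODON_TABLE[current]:
                syn += 1
            else:
                non += 1
            current = step
        else:  # path completed without hitting a stop codon
            syn_total += syn
            non_total += non
            n_paths += 1
    if n_paths == 0:
        raise ValueError(f"no stop-free path between {c1} and {c2}")
    return syn_total / n_paths, non_total / n_paths

def nei_gojobori_counts(seq1, seq2):
    """Return (S, N, Sd, Nd) for two codon-aligned coding sequences."""
    seq1, seq2 = seq1.upper(), seq2.upper()
    S = N = Sd = Nd = 0.0
    for i in range(0, min(len(seq1), len(seq2)) - 2, 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        aa1, aa2 = CODON_TABLE.get(c1), CODON_TABLE.get(c2)
        if aa1 is None or aa2 is None or "*" in (aa1, aa2):
            continue  # skip gapped, ambiguous, or stop codons
        s = (synonymous_sites(c1) + synonymous_sites(c2)) / 2
        sd, nd = codon_differences(c1, c2)
        S, N, Sd, Nd = S + s, N + (3 - s), Sd + sd, Nd + nd
    return S, N, Sd, Nd

def jukes_cantor(p):
    """Correct a proportion of differences for multiple hits."""
    return -0.75 * math.log(1 - 4 * p / 3)

# Toy example: one synonymous and one nonsynonymous difference
S, N, Sd, Nd = nei_gojobori_counts("TTAGCTATG", "CTAGCTATA")
dS, dN = jukes_cantor(Sd / S), jukes_cantor(Nd / N)
print(f"dS/dN = {dS / dN:.2f}")
```

A ratio well above 1, as reported for gag and pol, indicates that synonymous changes accumulate faster per site than amino acid replacements, the signature of purifying selection.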
The env-like Gene
The env-like gene is in the same reading frame as gag-pol, and except for SIRE1-1, it is separated from gag-pol by a single stop codon. Immediately after the stop codon is a nucleotide motif (CARYTA) known to facilitate stop codon suppression in tobacco mosaic virus (Skuzeski et al. 1991) and several other ssRNA plant viruses (Beier and Grimm 2001). Although there are no examples of Pol-Env fusions in retroelements, constructs carrying the sequence promoted readthrough of the SIRE1 pol stop codon in vivo (Havecker and Voytas 2003).
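A check for the stop codon context described above can be sketched as follows. The degenerate motif CARYTA is expanded using IUPAC codes (R = A or G, Y = C or T) and tested against the six bases immediately downstream of the first in-frame stop codon. The function names and the example sequences are hypothetical and are not SIRE1 sequences.

```python
import re

IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}
STOP_CODONS = {"TAA", "TAG", "TGA"}

def motif_to_regex(motif):
    """Compile an IUPAC nucleotide motif into a regular expression."""
    return re.compile("".join(IUPAC[symbol] for symbol in motif.upper()))

def first_stop_context(seq, motif="CARYTA"):
    """Find the first stop codon in frame 0 and test whether the motif follows it.

    Returns (stop_index, motif_follows), or None if no in-frame stop is found.
    """
    seq = seq.upper()
    pattern = motif_to_regex(motif)
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            downstream = seq[i + 3:i + 3 + len(motif)]
            return i, pattern.fullmatch(downstream) is not None
    return None

print(first_stop_context("ATGGCTTAGCAATTAGGG"))  # stop followed by CAATTA
print(first_stop_context("ATGGCTTAGGGGTTAGGG"))  # stop followed by GGGTTA
```

Only the sequence context is tested here; whether readthrough actually occurs is an experimental question of the kind addressed by Havecker and Voytas (2003).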
The length polymorphisms in env are primarily the result of 11 in-frame indels. All but one of these are confined to the first 550 and last 300 bp of this 2,080-bp ORF. Of the 285 polymorphic nucleotide sites, 25% are located within the first 300 bp of the coding region.
To calculate the dS/dN ratio, the nucleotide sequences were codon-aligned, and the ratio was found to average 3.29 between element pairs. Previously, we identified three motifs in the conceptual translation of this ORF analogous to structural elements in retroviral envelope proteins: a transmembrane domain, a fusion peptide, and a coiled-coil domain (Laten, Majumdar, and Gaucher 1998). The putative 19-amino-acid fusion peptide is perfectly conserved among all eight sequenced elements, and the presumed 32-residue coiled-coil has only two polymorphic positions, neither of which alters the heptad repeat pattern (data not shown). The amino terminal transmembrane domain is polymorphic at 16 of 24 residues, yet all variants are predicted to be membrane-spanning peptides with strong confidence (data not shown).
The Interval Between the env-like Gene and the 3' LTR
The most variable region in SIRE1 lies immediately downstream of the env-like gene and extends to within 100 bp of the PPT adjacent to the 3' LTR (fig. 5). Variation is primarily in the form of a complex pattern of sequence duplications ranging from simple trinucleotide repeats to imperfect tandem duplications of 100 bp. One shared feature of many of the sequence duplications is the presence of PPT-like sequences. Between bases 8176 and 8845, each SIRE1 member contains four to six copies of the sequence AGGGGGAG. Another is the presence of short duplications bordering the indels.
SIRE1-1 is adjacent to the gag-pol region of a member of the Ty3-gypsy-like retroelement family diaspora (Yano, Das, Panbehi, Damergis, and Laten, unpublished [GenBank accession number AF095730]). When the sequence adjacent to SIRE1-10 was translated and used to query GenBank using TBlastN and BlastP, the gag-pol ORF of Cinful-1 retroelement sequences from maize (SanMiguel et al. 1996) and RIRE2 from Oryza sativa (Ohtsubo, Kumekawa, and Ohtsubo 1999) were returned, as were several additional closely related sequences from Lotus japonicus (Sato et al. 2001). The DNA that abuts the gag region of SIRE1-14 also appears to be a member of a repetitive family. Paralogs of this sequence are present upstream of a coumarate:CoA ligase isoenzyme 3 gene (GenBank accession number AF002257) and a coding region identified as an allergen (GenBank accession number AB013289). The sequence is also represented in 38 BAC-end sequences. The DNA adjacent to the 3' LTR of SIRE1-14 is not associated with the upstream sequence in either GenBank accession or in the BAC-ends. However, paralogs of this DNA are present in over 20 additional BAC-ends. The 230 bp flanking the 5' end of SIRE1-4 are 95% identical to a single BAC-end sequence (GenBank accession number AZ221409). None of the other flanking DNAs contained extended ORFs, nor did BlastN or TBlastX database searches generate significant hits.
Discussion
Of the sequence diversity observed among SIRE1 family members, most occurs within the noncoding regions, namely the LTRs and the spacer region between the env-like ORF and the 3' LTR. Particularly evident are tandem sequence duplications in the 5' portion of the LTR that result in length polymorphisms ranging from 902 to 1,205 bp. In addition, the shorter duplications contain multiple candidate binding sites for the Dof zinc finger transcription factor just upstream of the putative promoter. Dof proteins regulate a broad spectrum of target genes in both monocots and dicots, including those that are auxin regulated (Kisu et al. 1997; Baumann et al. 1999), light responsive (Yanagisawa and Sheen 1998), and stress induced (Zhang et al. 1995). Stress conditions and defense elicitors are known to induce Tnt1, Tto1, and Tos17 (Hirochika et al. 1996b; Grandbastien et al. 1997; Takeda et al. 1998). Repetition of putative cis-acting sequence motifs in LTRs has been noted in four actively transcribed elements: BARE1, Tos17, Tnt1, and Tto1 (Hirochika et al. 1996b; Suoniemi, Narvanto, and Schulman 1996; Grandbastien et al. 1997; Takeda et al. 1999). In the cases of Tnt1 and Tto1, the repeated motifs have been shown experimentally to sponsor inducible element expression (Grandbastien et al. 1997; Takeda et al. 1999), and a MYB-related transcription factor was shown to interact with and regulate Tto1 at these motifs (Sugimoto, Takeda, and Hirochika 2000). In barley, a MYB transcription factor interacts with the Dof transcription factor, BPBF, to regulate endosperm-specific genes (Diaz et al. 2002). Interestingly, the SIRE1 LTRs contain two potential MYB-binding sites just upstream of the AAAG-dense region (fig. 4). As yet, there is no evidence that SIRE1 RNA is induced by these types of stimuli.
The region between the env-like ORF and the 3' LTR varies in length from 496 to 636 bp. The sequence duplications in this region are unusual but not unprecedented among retroelements. The Grande1 family from maize contains two arrays of tandem repeats between pol and the 3' LTR (Martinez-Izquierdo, Garcia-Martinez, and Vicient 1997), and numerous PPT-like sequences characterize the large noncoding region following the env-like ORF of the Arabidopsis Athila elements (Wright and Voytas 2002). The best explanation for the gain and loss of these repeats is replication slippage (Viguera, Canceill, and Ehrlich 2001). Since strand transfer is a requisite component of retrovirus and retrotransposon replication, some replication slippage by RT at internal regions is quite plausible. Reinitiation at nearby similar or duplicated sequences upstream or downstream could be expected, generating the kind of duplications and subsequent deletions that pervade retroviral genomes (Temin 1993). The presence of tandem triplet repeats and direct repeats of 4 to 7 bp flanking several of the gaps (fig. 5) is consistent with this explanation. In fact, long direct repeats in retroviral DNAs are deleted at high frequency (Rhode, Emerman, and Temin 1987).
With the exception of SIRE1-2, sequence variation within gag-pol and the env-like gene seems to preserve coding information, as all duplications and deletions maintain the open reading frame. For example, the variable N-termini of the env-like ORFs among the eight elements contain five different indels, all of which preserve the reading frame. In addition, the high ratio of synonymous to nonsynonymous changes among the SIRE1 genes further indicates that the elements are evolving under purifying selection. In our study, the dS/dN ratio for pol averages 7.45, whereas the ratios for gag and the env-like gene average 3.90 and 3.29, respectively. Interestingly, in the predicted ENV-like protein, amino acid substitutions still preserved structural features of the protein. For example, the predicted N-terminal transmembrane domain was preserved in all SIRE1 proteins analyzed, despite the relatively high number of nonsynonymous substitutions in this region. The functional constraint placed on the SIRE1 env-like gene contrasts with what has been found in mammalian retroviral envelope genes, where adaptive selection results in high levels of variation to avoid the immune response (Nielsen and Yang 1998; Yamaguchi-Kabata and Gojobori 2000).
The flanking DNAs of 10 SIRE1 insertions were sequenced and two belong to identified plant members of the Ty3-gypsy family. Of the remaining eight, one is flanked on either side by members of two different repetitive families, and one is an apparent paralog of a single BAC-end sequence. The identities of the rest are unknown. These results are suggestive of clustering and/or nesting of some high-copy-number retroelements in G. max, similar to what has been reported for other plant genomes (Bennetzen 2000).
Glycine max has been under cultivation for approximately 10,000 years (Hymowitz and Newell 1981) and was derived from wild ancestral Glycine soja. Both species contain SIRE1 elements, indicating that this family was present long before soybean domestication. However, the high degree of similarity among SIRE1 sequences suggests that this family may still be proliferating in the soybean genome.
The presence of env-like ORFs in SIRE1 and some Ty3-gypsy retroelements has raised speculation that these elements may be retroviruses. The functional role, if any, of an envelope protein for viral propagation in a plant host is unknown, and cell walls preclude membrane fusion as a suitable invasive strategy. However, the presence of env genes in plant viruses is not unusual. All enveloped plant viruses utilize invertebrate vectors in which the glycosylated envelope proteins sponsor host cell recognition and membrane fusion (VandenHeuvel, Franz, and VanderWilk 2002). ENV has been shown to be dispensable in the plant host. When tospoviruses, plant members of the Bunyaviridae, are maintained solely by mechanical inoculation of host plants, morphological isolates that lack functional envelope proteins can be recovered with point and frameshift mutations in the glycoprotein gene (Goldbach and Peters 1996). These isolates are active in the plant host but fail to reinfect the native thrips host (Goldbach and Peters 1996; Nagata et al. 2000).
The presence of a conserved ORF in a multicopy element family that can be identified across diverse host plant taxa constitutes strong evidence that it has been and may continue to be selectively maintained. Although the conceptual polypeptides from the unusual ORFs of SIRE1, Athila4, and Bagy-2 have been designated as env-like (Laten, Majumdar, and Gaucher 1998; Vicient, Kalendar, and Schulman 2001; Wright and Voytas 2002), they may embody heretofore unknown functions related to maintenance in their plant hosts. Whatever the role of the env-like gene, the high levels of sequence conservation observed among SIRE1 elements offer promise that these elements can be used to understand the function of this additional coding sequence.
Supplementary Material
Acknowledgements
Footnotes
Literature Cited
Agrawal, G. K., M. Yamazaki, M. Kobayashi, R. Hirochika, A. Miyao, and H. Hirochika. 2001. Screening of the rice viviparous mutants generated by endogenous retrotransposon Tos17 insertion: tagging of a zeaxanthin epoxidase gene and a novel OsTATC gene. Plant Physiol. 125:1248-1257.
Altschul, S. F., T. L. Madden, A. A. Schaffer, J. H. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389-3402.
Baumann, K., A. De Paolis, P. Costantino, and G. Gualberti. 1999. The DNA binding site of the Dof protein NtBBF1 is essential for tissue-specific and auxin-regulated expression of the rolB oncogene in plants. Plant Cell 11:323-333.
Beier, H., and M. Grimm. 2001. Misreading of termination codons in eukaryotes by natural nonsense suppressor tRNAs. Nucleic Acids Res. 29:4767-4782.
Bennetzen, J. L. 2000. Transposable element contributions to plant gene and genome evolution. Plant Mol. Biol. 42:251-269.
Bi, Y. A., and H. M. Laten. 1996. Sequence analysis of a cDNA containing the gag and prot regions of the soybean retrovirus-like element, SIRE-1. Plant Mol. Biol. 30:1315-1319.
Boeke, J. D., and J. P. Stoye. 1997. Retrotransposons, endogenous retroviruses, and the evolution of retroelements. Pp. 343-435 in J. M. Coffin, S. H. Hughes and H. E. Varmus, eds. Retroviruses. Cold Spring Harbor Laboratory Press, Plainview, New York.
Casacuberta, J. M., S. Vernhettes, and M. A. Grandbastien. 1995. Sequence variability within the tobacco retrotransposon Tnt1 population. EMBO J. 14:2670-2678.
Diaz, I., J. Vicente-Carbajosa, Z. Abraham, M. Martinez, I. Isabel-La Moneda, and P. Carbonero. 2002. The GAMYB protein from barley interacts with the Dof transcription factor BPBF and activates endosperm-specific genes during seed development. Plant J. 29:453-464.
Goldbach, R., and D. Peters. 1996. Molecular and biological aspects of tospoviruses. Pp. 129-157 in R. M. Elliot, ed. The Bunyaviridae. Plenum Press, New York.
Grandbastien, M. A., H. Lucas, J. B. Morel, C. Mhiri, S. Vernhettes, and J. M. Casacuberta. 1997. The expression of the tobacco Tnt1 retrotransposon is linked to plant defense responses. Genetica 100:241-252.
Grandbastien, M. A., A. Spielmann, and M. Caboche. 1989. Tnt1, a mobile retroviral-like transposable element of tobacco isolated by plant cell genetics. Nature 337:376-380.
Haubold, B., and T. Wiehe. 2001. Statistics of divergence times. Mol. Biol. Evol. 18:1157-1160.
Havecker, E. R., and D. F. Voytas. 2003. The soybean retroelement SIRE1 uses stop codon suppression to express its envelope-like protein. EMBO Rep. 4:274-277.
Higgins, D. G., J. D. Thompson, and T. J. Gibson. 1996. Using Clustal for multiple sequence alignments. Meth. Enzymol. 266:383-402.
Higo, K., Y. Ugawa, M. Iwamoto, and T. Korenaga. 1999. Plant cis-acting regulatory DNA elements (PLACE) database: 1999. Nucleic Acids Res. 27:297-300.
Hirochika, H., H. Otsuki, M. Yoshikawa, Y. Otsuki, K. Sugimoto, and S. Takeda. 1996a. Autonomous transposition of the tobacco retrotransposon Tto1 in rice. Plant Cell 8:725-734.
Hirochika, H., K. Sugimoto, Y. Otsuki, H. Tsugawa, and M. Kanda. 1996b. Retrotransposons of rice involved in mutations induced by tissue culture. Proc. Natl. Acad. Sci. USA 93:7783-7788.
Hofman, K., and W. Stoffel. 1993. TMbase: a database of membrane spanning protein segments. Biol. Chem. Hoppe. Seyler 374:166.
Hymowitz, T., and C. A. Newell. 1981. Taxonomy of the genus Glycine: domestication and uses of soybeans. Econ. Bot. 35:272-288.
Jaaskelainen, M., A. H. Mykkanen, T. Arna, C. M. Vicient, A. Suoniemi, R. Kalendar, H. Savilahti, and A. H. Schulman. 1999. Retrotransposon BARE-1: expression of encoded proteins and formation of virus-like particles in barley cells. Plant J. 20:413-422.
Kapitonov, V. V., and J. Jurka. 1999. Molecular paleontology of transposable elements from Arabidopsis thaliana. Genetica 107:27-37.
Kimura, M. 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.
Kirchner, J., and S. Sandmeyer. 1993. Proteolytic processing of Ty3 proteins is required for transposition. J. Virol. 67:19-28.
Kisu, Y., Y. Harada, M. Goto, and M. Esaka. 1997. Cloning of the pumpkin ascorbate oxidase gene and analysis of a cis-acting region involved in induction by auxin. Plant Cell Physiol. 38:631-637.
Klimyuk, V. I., B. J. Carroll, C. M. Thomas, and J. D. Jones. 1993. Alkali treatment for rapid preparation of plant material for reliable PCR analysis. Plant J. 3:493-494.
Kumar, S., K. Tamura, I. B. Jakobsen, and M. Nei. 2001. MEGA2: molecular evolutionary genetics analysis software. Bioinformatics 17:1244-1245.
Laten, H. M. 1999. Phylogenetic evidence for Ty1-copia-like endogenous retroviruses in plant genomes. Genetica 107:87-93.
Laten, H. M., A. Majumdar, and E. A. Gaucher. 1998. SIRE-1, a copia/Ty1-like retroelement from soybean, encodes a retroviral envelope-like protein. Proc. Natl. Acad. Sci. USA 95:6897-6902.
Laten, H. M., and R. O. Morris. 1993. SIRE-1, a long interspersed repetitive DNA element from soybean with weak sequence similarity to retrotransposons: initial characterization and partial sequence. Gene 134:153-159.
Marek, L. F., J. Mudge, and L. Darnielle, et al. (19 co-authors). 2001. Soybean genomic survey: BAC-end sequences near RFLP and SSR markers. Genome 44:572-581.
Martinez-Izquierdo, J. A., J. Garcia-Martinez, and C. M. Vicient. 1997. What makes Grande1 retrotransposon different? Genetica 100:15-28.
Merkulov, G. V., K. M. Swiderek, C. B. Brachmann, and J. D. Boeke. 1996. A critical proteolytic cleavage site near the C terminus of the yeast retrotransposon Ty1 Gag protein. J. Virol. 70:5548-5556.
Nagata, T., A. K. Inoue-Nagata, M. Prins, R. Goldbach, and D. Peters. 2000. Impeded thrips transmission of defective tomato spotted wilt virus isolates. Phytopathology 90:454-459.
Nei, M., and T. Gojobori. 1986. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3:418-426.
Nielsen, R., and Z. Yang. 1998. Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics 148:929-936.
Ohtsubo, H., N. Kumekawa, and E. Ohtsubo. 1999. RIRE2, a novel gypsy-type retrotransposon from rice. Genes Genet. Syst. 74:83-91.
Pearl, L. H., and W. R. Taylor. 1987. Sequence specificity of retroviral proteases. Nature 328:482.
Peterson-Burch, B. D., and D. F. Voytas. 2002. Genes of the Pseudoviridae (Ty1/copia retrotransposons). Mol. Biol. Evol. 19:1832-1845.
Peterson-Burch, B. D., D. A. Wright, H. M. Laten, and D. F. Voytas. 2000. Retroviruses in plants? Trends Genet. 16:151-152.
Pouteau, S., E. Huttner, M. A. Grandbastien, and M. Caboche. 1991. Specific expression of the tobacco Tnt1 retrotransposon in protoplasts. EMBO J. 10:1911-1918.
Prestridge, D. S. 1995. Predicting pol II promoter sequences using transcription factor binding sites. J. Mol. Biol. 249:923-932.
Quandt, K., K. Frech, H. Karas, E. Wingender, and T. Werner. 1995. MatInd and MatInspector: new fast and versatile tools for detection of consensus matches in nucleotide sequence data. Nucleic Acids Res. 23:4878-4884.
Reese, M. G. 2001. Application of a time-delay neural network to promoter annotation in the Drosophila melanogaster genome. Comp. Chem. 26:51-56.
Rhode, B. W., M. Emerman, and H. M. Temin. 1987. Instability of large direct repeats in retrovirus vectors. J. Virol. 61:925-927.
Rost, B., R. Casadio, P. Fariselli, and C. Sander. 1995. Transmembrane helices predicted at 95-percent accuracy. Protein Sci. 4:521-533.
Sambrook, J., E. F. Fritsch, and T. Maniatis. 1989. Molecular cloning. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.
SanMiguel, P., B. S. Gaut, A. Tikhonov, Y. Nakajima, and J. L. Bennetzen. 1998. The paleontology of intergene retrotransposons of maize. Nat. Genet. 20:43-45.
SanMiguel, P., A. Tikhonov, and Y. K. Jin, et al. (11 co-authors). 1996. Nested retrotransposons in the intergenic regions of the maize genome. Science 274:765-768.
Sato, S., T. Kaneko, Y. Nakamura, E. Asamizu, T. Kato, and S. Tabata. 2001. Structural analysis of a Lotus japonicus genome. I. Sequence features and mapping of fifty-six TAC clones which cover the 5.4 Mb regions of the genome. DNA Res. 8:311-318.
Skuzeski, J. M., L. M. Nichols, R. F. Gesteland, and J. F. Atkins. 1991. The signal for a leaky UAG stop codon in several plant viruses includes the two downstream codons. J. Mol. Biol. 218:365-373.
Sugimoto, K., S. Takeda, and H. Hirochika. 2000. MYB-related transcription factor NtMYB2 induced by wounding and elicitors is a regulator of the tobacco retrotransposon Tto1 and defense-related genes. Plant Cell 12:2511-2527.
Suoniemi, A., A. Narvanto, and A. H. Schulman. 1996. The BARE-1 retrotransposon is transcribed in barley from an LTR promoter active in transient assays. Plant Mol. Biol. 31:295-306.
Takeda, S., K. Sugimoto, H. Otsuki, and H. Hirochika. 1998. Transcriptional activation of the tobacco retrotransposon Tto1 by wounding and methyl jasmonate. Plant Mol. Biol. 36:365-376.
Takeda, S., K. Sugimoto, H. Otsuki, and H. Hirochika. 1999. A 13-bp cis-regulatory element in the LTR promoter of the tobacco retrotransposon Tto1 is involved in responsiveness to tissue culture, wounding, methyl jasmonate and fungal elicitors. Plant J. 18:383-393.
Temin, H. M. 1993. Retrovirus variation and reverse transcription: abnormal strand transfers result in retrovirus genetic variation. Proc. Natl. Acad. Sci. USA 90:6900-6903.
Turcich, M. P., A. Bokhaririza, D. A. Hamilton, C. P. He, W. Messier, C. B. Stewart, and J. P. Mascarenhas. 1996. Prem-2, a copia-type retroelement in maize is expressed preferentially in early microspores. Sex. Plant Reprod. 9:65-74.
Vandenheuvel, J. F. J. M., A. W. E. Franz, and F. Vanderwilk. 2002. Molecular basis of virus transmission. Pp. 183-210 in C. L. Mandahar, ed. Molecular biology of plant viruses. Kluwer, Boston.
Vicient, C. M., R. Kalendar, and A. H. Schulman. 2001. Envelope-class retrovirus-like elements are widespread, transcribed and spliced, and insertionally polymorphic in plants. Genome Res. 11:2041-2049.
Viguera, E., D. Canceill, and S. D. Ehrlich. 2001. Replication slippage involves DNA polymerase pausing and dissociation. EMBO J. 20:2587-2595.
Wessler, S. R., T. E. Bureau, and S. E. White. 1995. LTR-retrotransposons and MITEs: important players in the evolution of plant genomes. Curr. Opin. Genet. Dev. 5:814-821.
Wright, D. A., and D. F. Voytas. 1998. Potential retroviruses in plants: Tat1 is related to a group of Arabidopsis thaliana Ty3/gypsy retrotransposons that encode envelope-like proteins. Genetics 149:703-715.
Wright, D. A., and D. F. Voytas. 2002. Athila4 of Arabidopsis and Calypso of soybean define a lineage of endogenous plant retroviruses. Genome Res. 12:122-131.
Yamaguchi-Kabata, Y., and T. Gojobori. 2000. Reevaluation of amino acid variability of the human immunodeficiency virus type 1 gp120 envelope glycoprotein and prediction of new discontinuous epitopes. J. Virol. 74:4335-4350.
Yanagisawa, S., and R. J. Schmidt. 1999. Diversity and similarity among recognition sequences of Dof transcription factors. Plant J. 17:209-214.
Yanagisawa, S., and J. Sheen. 1998. Involvement of maize Dof zinc finger proteins in tissue-specific and light-regulated gene expression. Plant Cell 10:75-89.
Zhang, B., W. Chen, R. C. Foley, M. Buttner, and K. B. Singh. 1995. Interactions between distinct types of DNA binding proteins enhance binding to ocs element promoter sequences. Plant Cell 7:2241-2252.