(Received for publication, September 16, 1994; and in revised form, November 2, 1994)
From the
Bacterial reverse transcriptase (RT) is responsible for synthesis of multicopy single-stranded DNA (msDNA) consisting of single-stranded DNA linked to an internal guanosine residue of RNA by an unusual 2`,5`-phosphodiester linkage. Here we purified a bacterial RT to homogeneity from Escherichia coli harboring the RT gene from retron-Ec73. The purified RT-Ec73 was able to synthesize msDNA in a cell-free system using an RNA template produced in vitro by T7 RNA polymerase. The in vitro synthesized msDNA was released from the template RNA only when treated with yeast debranching enzyme DBR1, a specific nuclease for a 2`,5`-phosphodiester linkage. The position of the branching G residue in the template RNA and the DNA sequence of the cell-free product were identical to those of msDNA-Ec73 synthesized in vivo. These results clearly demonstrate that the formation of the 2`,5`-phosphodiester linkage in msDNA synthesis is carried out by RT itself.
Reverse transcriptases (RT) are unique among DNA
polymerases because of their ability to use RNA as templates.
Eukaryotic RTs associating with retroviruses and retrotransposons are
known to use a specific cellular tRNA as a primer for cDNA
synthesis (1). Recently it has been demonstrated that the 3`-OH
group of the 3`-end A residue of tRNA is used
exclusively for the priming reaction by HIV retroviral RT (2).
In contrast to these eukaryotic RTs, it has been suggested that
bacterial RTs specifically initiate cDNA synthesis from the 2`-OH group
of an internal G residue of a template RNA(3, 4) .
Bacterial RTs have been shown to exist in a Gram-negative soil bacterium, Myxococcus xanthus (5, 6), and in Escherichia coli (7, 8). Bacterial RTs homologous to retroviral RTs are responsible for the synthesis of an unusual satellite single-stranded DNA called msDNA (multicopy single-stranded DNA). The 5`-end of msDNA is covalently linked to the 2`-OH group of an internal G residue of a single-stranded RNA (msdRNA). The DNA and RNA molecules form a heteroduplex at their 3`-ends. A number of different msDNAs have been found in Myxobacteria, E. coli, Rhizobium, Salmonella, Proteus, and Klebsiella (3, 4, 9, 10). A genetic element called ``retron'' is required for msDNA production and consists of msr (a coding region for msdRNA), msd (a coding region for msDNA), and RT (a gene for RT) (3). Eight different retrons have so far been identified and characterized in E. coli and Myxobacteria.
The proposed synthesis of msDNA is shown in Fig. 1(3, 4) . First, an RNA transcript encompassing msr, msd, and RT is produced from the promoter located upstream of msr (step 1 in Fig. 1). The RNA transcript containing a1 and a2 inverted repeat sequences is thought to form secondary structures including a stable a1-a2 stem structure (step 2). The branching G residue is placed at the end of the a1-a2 stem in this folded structure. No specific primary sequences are required in the a1-a2 structure(11, 12) . In order for RT to initiate msDNA synthesis, another specific structure for each RT located downstream of the branching G residue is required in addition to the a1-a2 stem structure and the G residue(12) . The primary reaction of msDNA or cDNA synthesis is thought to start from the 2`-OH group of the G residue (step 3). The first base is added using the RNA transcript as the template, and cDNA synthesis continues on the same template RNA to a specific termination site (steps 3 and 4). The template RNA is removed by RNase H, leaving the short 3`-end RNA-DNA overlapping region.
Figure 1: Biosynthetic pathway of msDNA synthesis. Short thin arrows represent the inverted repeats (a1-a2 and b1-b2). Thick arrows represent the genes for msdRNA (msr), msDNA (msd), and RT. The branching G residue is circled. Long solid lines represent mRNA transcript. Thick lines in the transcript correspond to msdRNA, and wavy lines correspond to cDNA or msDNA. See text for details.
In this report, we purified a bacterial RT to homogeneity from E. coli harboring the RT gene from retron-Ec73. Using the purified enzyme (RT-Ec73) and an RNA template synthesized in vitro by T7 RNA polymerase (Fig. 2A), we established a cell-free system synthesizing the full-length msDNA-Ec73 (Fig. 2B). Furthermore, we now unambiguously demonstrate that the bacterial RT is indeed able to prime cDNA synthesis from the 2`-OH group of a specific internal G residue of the template RNA molecule. Therefore, the cDNA-priming mechanism of bacterial RTs appears to be quite different from that of retroviral RTs. This raises interesting questions as to the molecular mechanism and the evolutionary significance of the 2`-OH priming reaction.
Figure 2: Proposed secondary structures of msr-msd RNA from retron-Ec73 synthesized in vitro by T7 RNA polymerase and msDNA-Ec73. A, putative secondary structure of msr-msd RNA from retron-Ec73 synthesized in vitro by T7 RNA polymerase using pUCT7MS73 as a template. The branching G residue is circled. The two arrows indicate a1-a2 inverted repeats. A solid triangle indicates a termination site of msDNA-Ec73 synthesis in vivo and an open triangle indicates a cleavage site of the template RNA by RNase H after msDNA synthesis (see Fig. 2B). The synthesized RNA is 192 bases in length. B, secondary structure of msDNA-Ec73(19) . The branching G residue is circled, and the RNA region is boxed. The trinucleotide (5`-AGC-3`) covered with a shaded box is linked to cDNA after digestion with RNase A.
pUCT7MS73 was constructed for in vitro transcription. First, an 84-base pair fragment including a T7 promoter (φ10) was amplified by polymerase chain reaction using oligonucleotides T7p-73 (5`-CTAGGTTTGGCTCTGCTATAGTGAGTCGTA-3`) and PBR-b (5`-CCGGCCACGATGCGTCC-3`) as primers and pET11a (13) as a template. A G residue was placed as the first base for transcription in front of the 5`-end C residue of msr because T7 RNA polymerase preferentially uses G for transcription initiation (22). After purification by polyacrylamide gel electrophoresis, this fragment was mixed with p73-Hc0.7, and the second polymerase chain reaction was carried out using PBR-b and 73f (5`-TCGGATCCTTATGCACCTT-3`) (12) as primers. The amplified fragment was isolated and cloned into the SmaI site of pUC19 (23). A plasmid that carried the insert in the opposite orientation relative to the lac promoter of pUC19 was selected and designated pUCT7MS73. The DNA sequence of the inserted fragment was confirmed by the method of Sanger et al. (20).
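The design of primer T7p-73 can be checked by taking its reverse complement, which should read as the T7 promoter followed by the added G and the first base of msr. The sketch below is only an illustration: the helper `reverse_complement` and the comparison string `T7_CORE` (the commonly cited T7 promoter consensus up to position -1) are assumptions introduced here and are not part of the original procedure.

```python
# Illustrative check of the primer design; not part of the original protocol.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return dna.translate(COMPLEMENT)[::-1]


# Primer sequence as given in the text (5'->3').
T7P_73 = "CTAGGTTTGGCTCTGCTATAGTGAGTCGTA"

# Assumption: widely cited T7 promoter consensus ending at position -1.
T7_CORE = "TACGACTCACTATA"

top_strand = reverse_complement(T7P_73)

# Index of the first transcribed base (+1), directly after the promoter core.
start = top_strand.index(T7_CORE) + len(T7_CORE)

print(top_strand)             # strand with the same sense as the transcript
print(top_strand[start])      # first transcribed base
print(top_strand[start + 1])  # following base, the 5'-end residue of msr
```

Under these assumptions the first transcribed base is G and the next base is C, in agreement with the statement that a G residue was placed in front of the 5`-end C residue of msr.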
Figure 3: Purification of RT-Ec73(His). RT-Ec73(His) was purified as described under ``Experimental Procedures.'' Lane 1, total cell protein of LE392(DE3)/pET73RT(His) without IPTG induction; lane 2, with IPTG induction; lane 3, the soluble protein fraction after high speed centrifugation; lane 4, the membrane protein fraction; lane 5, protein fraction solubilized with guanidine-HCl; lane 6, purified RT-Ec73(His) from Ni-NTA column. An arrowhead indicates the band corresponding to RT-Ec73(His). The calculated molecular mass of RT-Ec73(His) is 37.6 kDa. Bovine serum albumin (66 kDa), ovalbumin (43 kDa), and carbonic anhydrase (29 kDa) were used as molecular mass markers. The gel was stained with Coomassie brilliant blue.
To characterize the cDNA products, cDNA was synthesized on a large scale and labeled at the 3`-end. 120 µl of the RNA transcript was first annealed and mixed with RT buffer and 0.3 mM (each) dNTP. cDNA synthesis was started by adding 400 µl (20 µg) of the purified RT-Ec73(His) to 3 ml of reaction mixture and stopped by adding the stop solution and 3 volumes of ethanol. The precipitated sample was redissolved in 1 mM EDTA and loaded on a polyacrylamide gel for electrophoresis. All bands were cut out of the gel and electroeluted to remove unincorporated nucleotides. The eluted products were labeled at the 3`-end in 200 µl of labeling buffer containing 250 µCi of [α-³²P]ddATP (5,000 Ci/mmol) and 96 units of terminal deoxynucleotidyl transferase (TdT, International Biotechnologies, Inc.) at 37 °C for 1 h as described previously (12). The reaction was stopped with EDTA, and the products were precipitated with isopropyl alcohol. The labeled products were redissolved in 10 µl of 1 mM EDTA and heat-denatured. RNA was digested by incubating twice with 0.8 µg of RNase A at 37 °C for 10 min. Then the labeled products were separated on a preparative sequencing gel (12% polyacrylamide gel in 8 M urea). Each labeled band was cut out and eluted from the gel as described previously (12). The DNA sequences were determined by the method of Maxam and Gilbert (25).
Figure 7: Analysis of the phosphodiester linkage between RNA and cDNA. A, schematic diagram of the digestion of the band 4 product with yeast debranching enzyme and ligation with the oligonucleotides. The branching G residues are circled, and RNA regions are boxed. Boldface letters represent cDNA product synthesized in the cell-free system. Asterisks indicate the ³²P-labeled dideoxyadenosine. B, digestion of cDNA products with yeast debranching enzyme and ligation of the resulting products with oligonucleotides. Experiments were carried out as described under ``Experimental Procedures.'' Band 4 product (lanes 1-4) and band 5 product (lanes 5-8) in Fig. 5 were used in this analysis. The products digested with yeast debranching enzyme from bands 4 and 5 are indicated by arrowheads in lanes 2 and 6, respectively. The digests (lanes 4 and 8) and undigested products (lanes 3 and 7) were ligated with oligonucleotides A and B. The ligated products from bands 4 and 5 are indicated by arrowheads in lanes 4 and 8, respectively. Numbers on the left-hand side indicate bases in length.
Figure 5: Analysis of cDNA products labeled at their 3`-ends. cDNAs were first synthesized in the cell-free system without any radioactive nucleotide and then labeled at their 3`-ends with [α-³²P]ddATP as described under ``Experimental Procedures.'' The synthesized cDNA products were divided into two parts, and each part was loaded onto one lane of a 12% polyacrylamide, 8 M urea preparative sequencing gel. Numbers with arrowheads represent the fragments used to determine DNA and RNA sequences (see Figs. 6-8). Molecular weight standards were the same as those described in Fig. 4.
Figure 6: DNA sequences of the cDNA products synthesized in the cell-free system. DNA sequences were determined by the method of Maxam and Gilbert(25) . Lane N, the same samples without sequencing reaction. A, DNA sequence of band 1 in Fig. 5. Dotted lines between the two gels indicate the same bases. B, DNA sequence of band 2 in Fig. 5. C, DNA sequence of band 3 in Fig. 5.
Figure 4: cDNA synthesis in vitro by using the purified RT-Ec73(His) and RNA template (msr-msd) synthesized in vitro. cDNAs synthesized as described under ``Experimental Procedures'' were treated without (lane 1) or with (lane 2) RNase A and separated on a 6% polyacrylamide, 8 M urea gel. The predicted structures of band a (lane 1) and bands b and c (lane 2) are depicted on the right-hand side of each band (see ``Results'' for details). Products in band X are assumed to result from further cDNA extension of the band b product. Molecular weight standards were pBR322 digested with MspI and labeled at the 3`-ends with [α-³²P]dCTP and Klenow fragment of E. coli DNA polymerase I. Both thick and thin lines represent RNA template, where the thick region corresponds to msdRNA. Wavy lines represent cDNA. The branching G residue is circled.
Figure 8: RNA sequence analysis by RNase digestion. Band 4 product in Fig. 5 was first labeled with [γ-³²P]ATP and T4 polynucleotide kinase at the 5`-end of the RNA portion. Then the labeled product digested with debranching enzyme (lanes 2, 4, and 6) or without digestion (lanes 1, 3, and 5) was treated with RNase T1 (lanes 3 and 4) or U2 (lanes 5 and 6). Each product was separated by electrophoresis on a 20% polyacrylamide, 8 M urea gel. Asterisks indicate ³²P labeling.
Note that the trinucleotide (5`-AGC-3`) migrated at exactly the same
position as that from msDNA-Ec73 isolated in vivo (data not
shown).
Next, we determined the DNA sequences of bands 1, 2, and 3 (Fig. 5) by the method of Maxam and Gilbert (25). As shown in Fig. 6A, the band 1 product had the same sequence as the in vivo msDNA-Ec73 (from T at position 1 to T at position 73, Fig. 2B) except for 2 or 3 extra A residues at the 3`-end. To determine whether the products migrating faster than the 27-base marker are produced by premature termination, the DNA sequences of the band 2 and band 3 products (25 and 21 bases in length, respectively; Fig. 5) were determined. As shown in Fig. 6, B and C, the 5`-end DNA sequence of both the band 2 and band 3 products was 5`-TTGAGCACGTCGAT-3`, which is identical to that of msDNA-Ec73 produced in vivo (Fig. 2B) and of the band 1 product (see Fig. 6A). The exact 3`-end sequences of the band 2 and 3 products could not be determined. These sequencing analyses clearly demonstrate that cDNA synthesis started precisely from the T residue complementary to the A residue at position 143 (Fig. 2A). Note that all samples without sequencing reaction (lane N in Fig. 6) migrated more slowly by one base than the other sequencing lanes. This is due to the piperidine treatment during the sequencing reaction, which eliminated one 5`-end base of the RNA molecule attached to the DNA (12, 26). Therefore the cDNAs synthesized in the cell-free system are likely to be linked to RNA.
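The template stretch implied by the shared 5`-end sequence can be written out explicitly. The following sketch is illustrative only: the function name `template_rna_for` is hypothetical, and the position range it reports assumes that the 14 sequenced bases were copied from a contiguous stretch of template whose 3`-most residue is the A at position 143 stated above.

```python
# Illustrative derivation; names and the contiguity assumption are ours.
PAIRS = {"A": "U", "C": "G", "G": "C", "T": "A"}  # DNA base -> template RNA base


def template_rna_for(cdna: str) -> str:
    """Return the RNA stretch (5'->3') complementary to a cDNA given 5'->3'."""
    return "".join(PAIRS[base] for base in reversed(cdna))


CDNA_5P_END = "TTGAGCACGTCGAT"   # 5'-end sequence of bands 1-3, from the text
FIRST_TEMPLATE_POSITION = 143    # A residue copied first, from the text

rna_stretch = template_rna_for(CDNA_5P_END)

# Assumed contiguous copying: the stretch ends at position 143 of the template.
stretch_start = FIRST_TEMPLATE_POSITION - len(CDNA_5P_END) + 1

print(rna_stretch)                              # template RNA, 5'->3'
print(stretch_start, FIRST_TEMPLATE_POSITION)   # inferred template positions
```

The 3`-most base of the derived stretch is A, which pairs with the first T of the cDNA, consistent with initiation opposite the A residue at position 143.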
The 5`-end structures of both molecules were further examined by ligation analysis as depicted in Fig. 7A. Oligonucleotide B of 18 bases in length is designed to be complementary to both oligonucleotide A and the band 5 product in such a way that both oligonucleotides can be complementarily aligned on oligonucleotide B as shown in Fig. 7A. However, if the 5`-end of the band 5 product is blocked by a branched RNA molecule, the two oligonucleotides (oligonucleotide A and the band 5 product) cannot be ligated on oligonucleotide B. Indeed, the band 5 product, as well as the band 4 product, was unable to ligate to oligonucleotide A without treatment with debranching enzyme (Fig. 7B, lanes 7 and 3, respectively). On the other hand, when they were treated with the debranching enzyme, the 9- and 11-base molecules were generated (lanes 6 and 2, respectively) which were capable of ligating to the oligonucleotide A resulting in new longer products of 18 and 20 bases in length for bands 5 and 4, respectively (lanes 8 and 4). Debranching enzyme is known to generate a 5`-phosphoryl group(27) , which is consistent with the present result.
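The fragment lengths reported for the ligation analysis can be cross-checked with simple arithmetic. The sketch below is a consistency check only; the length of oligonucleotide A is not stated in this section and is inferred here from the reported product sizes, so it should be read as an assumption.

```python
# Consistency check on the reported lengths; the oligonucleotide A length is inferred.
OLIGO_B_LEN = 18  # stated in the text

debranched_len = {"band 5": 9, "band 4": 11}   # after debranching (lanes 6 and 2)
ligated_len = {"band 5": 18, "band 4": 20}     # after ligation (lanes 8 and 4)

# Length contributed by oligonucleotide A in each ligation product.
implied_oligo_a_len = {
    band: ligated_len[band] - debranched_len[band] for band in debranched_len
}

print(implied_oligo_a_len)
# Oligonucleotide A plus the debranched band 5 product should span oligonucleotide B.
print(implied_oligo_a_len["band 5"] + debranched_len["band 5"] == OLIGO_B_LEN)
```

Both ligation products imply the same length for oligonucleotide A, and that length added to the debranched band 5 product matches the 18 bases of oligonucleotide B.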
These data clearly demonstrate that both the band 4 and 5 products were blocked at their 5`-ends by forming a 2`,5`-phosphodiester linkage and had the identical sequence to the 5`-end sequence of msDNA-Ec73.
When the labeled band 4 product was digested with the yeast debranching enzyme, a new band (lane 2, Fig. 8) appeared at exactly the same position as the tri-ribonucleotide, 5`-AGC-3` that was derived by the same treatment from msDNA-Ec73 produced in vivo (data not shown). In order to determine the sequence of the tri-ribonucleotide from the band 4 product, it was further digested with RNase T1. As shown in lane 4, Fig. 8, a dinucleotide band was generated by RNase T1 treatment. It should be noted that the dinucleotide band was not obtained if the band 4 product was treated with RNase T1 before digestion with the debranching enzyme (lane 3). These results clearly indicate that the cDNA molecule was branched out from the 2`-OH group of a G residue. Furthermore, it was found that a mononucleotide was released by the treatment with RNase U2, a specific RNase for A residues, either before or after the debranching enzyme treatment (lanes 5 and 6). The mononucleotide released from the band 4 product was further confirmed as A by treating it with nuclease P1 followed by two-dimensional thin layer chromatography as described previously(26) . Together, the RNA sequence of the trinucleotide derived from the band 4 product was concluded to be 5`-AG(C or U)-3`, consistent with the sequence for msDNA-Ec73 (see Fig. 2B).
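The logic of the RNase mapping can be summarized with a small model. This is a deliberately simplified sketch under stated assumptions: RNase T1 is taken to cleave on the 3` side of G and RNase U2 on the 3` side of A (as described in the text), a residue whose 2`-OH carries the cDNA is assumed to resist cleavage, and only the 5`-labeled fragment is tracked. The function name `labeled_fragment` is hypothetical.

```python
from typing import Optional

# Simplified specificities: the residue after which each enzyme cleaves.
CLEAVES_AFTER = {"T1": "G", "U2": "A"}


def labeled_fragment(rna: str, enzyme: str, branched_at: Optional[int]) -> Optional[str]:
    """Return the 5'-labeled fragment released by the enzyme, or None.

    rna         -- RNA sequence 5'->3', labeled at its 5' end
    enzyme      -- "T1" or "U2"
    branched_at -- index of the residue whose 2'-OH is linked to cDNA,
                   or None after debranching (assumed to block cleavage there)
    """
    target = CLEAVES_AFTER[enzyme]
    # Cleavage after the last residue would not release a shorter fragment.
    for i, base in enumerate(rna[:-1]):
        if base == target and i != branched_at:
            return rna[: i + 1]
    return None


TRINUCLEOTIDE = "AGC"   # RNA portion of msDNA-Ec73 after RNase A, from the text
BRANCH_INDEX = 1        # the G residue

for enzyme in ("T1", "U2"):
    before = labeled_fragment(TRINUCLEOTIDE, enzyme, BRANCH_INDEX)
    after = labeled_fragment(TRINUCLEOTIDE, enzyme, None)
    print(enzyme, "branched:", before, "debranched:", after)
```

In this model RNase T1 releases a labeled dinucleotide only after debranching, whereas RNase U2 releases a labeled mononucleotide in both cases, which mirrors the pattern described for lanes 3 to 6 of Fig. 8. The model says nothing about the third residue, in line with the conclusion that the digests alone give 5`-AG(C or U)-3`.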
In this report, we constructed a complete cell-free system for the synthesis of msDNA-Ec73, which consists of the purified RT from retron-Ec73 (RT-Ec73), the msr-msd RNA template, and four dNTPs. Using this system, we now unambiguously demonstrate that the bacterial reverse transcriptase indeed initiates the cDNA priming reaction from the 2`-OH group of a specific internal G residue in the RNA template forming a 2`,5`-phosphodiester linkage at the 5`-end of the cDNA. The ability to form the 2`,5`-phosphodiester linkage is therefore an intrinsic property of the bacterial enzyme.
Bacterial
RTs, although evolutionarily related to eukaryotic RTs, are
significantly smaller than eukaryotic RTs. For example, RT-Ec73
consists of only 316 amino acid residues and has no RNase H domain (19) . Nevertheless, bacterial RTs have remarkably stringent
requirements for the cDNA priming reaction. Each bacterial RT requires
specific secondary structures downstream of the branching G residue for
the cDNA priming reaction in addition to a stem structure immediately
upstream of the G residue(11, 12) . Furthermore, we
have recently demonstrated a requirement of secondary structures in the
region serving as the cDNA template(28) . It is of great
interest to determine how bacterial RTs are able to specifically
recognize these RNA secondary structures. It has been shown that HIV-1
RT forms a heterodimer between the full-length 66-kDa product (p66) and
the 51-kDa product (p51) lacking the C-terminal RNase H
domain(29, 30) . The primer tRNA molecule is proposed to bind to the p51 subunit(31) .
Recently, the requirements of secondary structures around the primer
binding site have been reported for the cDNA priming reaction for
retroviral RTs(32, 33, 34) . These
requirements may somehow be related to those found for bacterial RTs in
terms of the three-dimensional structures of the enzymes.
It is also an intriguing question how various RTs have developed their own specific cDNA priming mechanism; retroviral RTs use cellular tRNAs (1) , whereas RT from hepatitis virus (hepadnavirus) uses the protein itself as a primer for cDNA synthesis(35) . RT from a non-LTR retrotransposon, R2Bm, is known to have endonuclease activity, which creates a nick in double-stranded DNA to prime the reverse transcription of RNA template(36) . The Mauriceville plasmid RT initiates cDNA synthesis from the 3`-end of the template RNA without any specific primers as in the case of RNA-dependent RNA polymerase (37) .
The significance of the formation of the 2`,5`-phosphodiester linkage remains to be determined. Interestingly, the transposition efficiency of the Ty1 element, a yeast retrotransposon, was significantly reduced in a strain lacking the debranching enzyme DBR1, suggesting that the formation of a 2`,5`-phosphodiester linkage may somehow be involved in the retrotransposition of the Ty1 element (27).
It is interesting to note that RNase H was not
required for msDNA synthesis, although the addition of RNase H to the
cell-free system stimulated the production of msDNA. The
accumulation of msDNA of 10-15 bases in length observed in the present
study may be due to the stable secondary structure in the RNA template,
which hinders the elongation of cDNA synthesis. It is possible that
another protein factor(s) such as an RNA helicase may be used for
efficient production of msDNA in vivo.