©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
A Long Purine-Pyrimidine Homopolymer Acts as a Transcriptional Diode (*)

(Received for publication, June 10, 1994; and in revised form, October 12, 1994)

Ed Grabczyk and Mark C. Fishman (§)

From the Developmental Biology Laboratory and Cardiovascular Research Center, Massachusetts General Hospital and the Department of Medicine, Harvard Medical School, Charlestown, Massachusetts 02129-2600

ABSTRACT

Polypurine-polypyrimidine (R•Y) sequences have the unusual ability to form DNA triple helices. Such tracts are overrepresented upstream of eukaryotic genes, although their function there has not been clear. We report that transcription in vitro into one such upstream R•Y tract, in the direction that makes a predominantly purine RNA, is effectively blocked by formation of an intramolecular triple helix. The triplex is triggered by transcription and stabilized by the binding of nascent purine RNA to the template. Transcription in the opposite direction is not restricted. Polypurine-polypyrimidine DNA may provide a dynamic and selective block to transcription without the aid of accessory proteins.


INTRODUCTION

DNA sequences with an asymmetric strand distribution of purine and pyrimidine bases (R•Y) constitute up to 0.4% of mammalian genomes and are frequent in the upstream regions of genes, where they contribute S1 nuclease-hypersensitive sites (1, 2, 3). This has led to the speculation that R•Y sequences may be involved in gene regulation (4, 5, 6). The nuclease sensitivity of R•Y sequences stems from their ability to adopt non-B secondary structures, including DNA triple helices. To form an intramolecular triplex, part of the R•Y DNA duplex must dissociate and wind back down the major groove of the DNA helix to form nonstandard bonds with a central purine strand. In so doing, it relaxes negative supercoils. Consequently, the presence of negative supercoiling has a strong influence on the formation of intramolecular triple helices (4, 5, 6).

An argument against the existence of intramolecular triple helices in vivo has been that such structures require negative supercoiling to form at physiologic pH and salt conditions, whereas the evidence does not suggest that genomic DNA is highly supercoiled. A source of negative supercoiling for these R•Y tracts could be transcription elongation, during which a local wave of negative supercoiling is generated by the passage of DNA through a polymerase (7, 8). Indeed, it has been reported that transcription through an R•Y repeat can lead to formation of intramolecular triple helices on a supercoiled template (9).

We examined the effects of transcription through a naturally occurring R•Y sequence from the upstream region of a neuron-specific gene. More than 80% of the coding strand in the first 500 base pairs (bp) (^1) immediately upstream of the rat GAP-43 gene is composed of purines (10). Within this are three R•Y tracts (Fig. 1, top). We demonstrate that these sequences form stable non-B structures that are consistent with triple helices on either supercoiled or linear DNA under conditions that approximate the in vivo state. These structures form only when transcription is in the direction that produces a purine-rich RNA, and formation of these structures blocks further transcription.
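The strand-composition statements above (purine fraction of the coding strand, location of uninterrupted R•Y tracts) can be sketched computationally. The following is a minimal illustration only: the function names, the 20-base minimum tract length, and the toy sequence are assumptions made for this example and are not taken from the GAP-43 sequence or from the methods of this paper.

```python
import re

PURINES = set("AG")


def purine_fraction(seq: str) -> float:
    """Fraction of A/G among the unambiguous bases of one DNA strand."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in PURINES for b in bases) / len(bases)


def find_ry_tracts(seq: str, min_len: int = 20):
    """Return (start, end, kind) for uninterrupted purine ("R") or
    pyrimidine ("Y") runs of at least min_len bases (0-based, end exclusive).
    min_len is an arbitrary illustrative threshold."""
    seq = seq.upper()
    tracts = []
    for kind, pattern in (("R", "[AG]"), ("Y", "[CT]")):
        for match in re.finditer(f"{pattern}{{{min_len},}}", seq):
            tracts.append((match.start(), match.end(), kind))
    return sorted(tracts)


# Toy sequence (hypothetical, NOT the GAP-43 upstream region).
toy = "ACGT" + "GA" * 15 + "TCAT" + "GGAAGGGAGA" + "CATG"

print(f"purine fraction: {purine_fraction(toy):.2f}")
print("tracts:", find_ry_tracts(toy, min_len=20))
```

Applied to the coding strand shown in Fig. 1, a scan of this kind would be expected to report the three underlined tracts, provided the threshold is chosen to tolerate or exclude interruptions as appropriate.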


Figure 1: Transcription into an R•Y tract to make a purine-rich transcript relaxes negative supercoils. Top, the sequence of the insert used in this work is shown 50 bases/line (10). R•Y tracts I-III (numbered from the top or 5′ end) are underlined. Plasmids containing the insert in the orientation that produces a purine-rich transcript from the T7 promoter are designated T7+. Bottom, the native mobility of plasmids pSP72 (-), 72T7-, and 72T7+ is shown in lanes 1-3, respectively. The effects of transcription with either SP6 (S) or T7 (T) RNA polymerase on gel mobility are shown in lanes 4-9. Lanes 10-15 contain the same material after single strand-specific RNase treatment (RNase A, 20 µg/ml and T1, 1000 units/ml for 1 h at 37 °C). Templates transcribed to make a purine-rich RNA now exhibit bands with the mobility of relaxed and partially relaxed conformers (lanes 12 and 15), and all others have been returned to a pretranscription mobility. Lane M is the λ DNA BstEII size marker.




EXPERIMENTAL PROCEDURES

Construction of Plasmids

A 562-base pair XbaI/SspI fragment (.48R1X.08) that contains the rat GAP-43 first exon was inserted into pSP72 (Promega) (XbaI/PvuII) to form 72T7+ (2994 bp), into pSP72 (XbaI/EcoRV) to form 72T7- (2985 bp), into pSP73 (XbaI/PvuII) to form 73T7- (2996 bp) and pSP73 (XbaI/EcoRV) to form 73T7+ (2987 bp). All restriction enzymes were from New England Biolabs.

In Vitro Transcription

T7 and SP6 polymerases were from U. S. Biochemical Corp. RNA transcription from phage promoters was initially carried out in SP6 transcription buffer provided by the manufacturer (40 mM Tris pH 8, 6 mM MgCl2, 2 mM spermidine, 1 mM dithiothreitol, and 0.5 mM each NTP) for 15 min at 37 °C. For some samples prepared directly for EM, the transcription buffer was changed (40 mM HEPES pH 7.5, 6 mM MgCl2, 1 mM beta-mercaptoethanol, and 0.5 mM each NTP) to avoid potential EM artifacts from dithiothreitol or spermidine.

RNase Digestion

Unless otherwise specified in the figure legends, RNase A and T1 were used at a concentration of 20 µg/ml and 1000 units/ml, respectively, for 1 h at 37 °C in TE (10 mM Tris HCl, pH 8.0, 1 mM EDTA). RNase H (250 units/ml) digestion was carried out in 1× T4 polymerase buffer (33 mM Tris acetate pH 8, 66 mM potassium acetate, 10 mM magnesium acetate, 5 mM dithiothreitol, and 1 mg/ml bovine serum albumin) for 1 h at 37 °C. The effects of RNase A and T1 were independent of the buffer used; digestion in TE, 300 mM NaCl, T4 polymerase buffer, or SP6 polymerase buffer all gave comparable results. RNase A was from Sigma, T1 was from Boehringer Mannheim, and H was from Life Technologies, Inc.

Gel Electrophoresis

Electrophoresis was carried out in 1% agarose gels in standard TAE buffer (40 mM Tris acetate, 2 mM EDTA at pH 8.0). Gels were stained with ethidium bromide after electrophoresis. In some cases DNA was isolated from agarose gel fragments by extrusion of the liquid using low speed centrifugation over a small column, followed by ethanol precipitation.

Analysis of Transcription Arrest Points

At t = -2 min, 20 µl of 1× T7 transcription buffer with 0.5 mM each NTP and 500 units/ml T7 enzyme, prewarmed to 37 °C, was added to 1 µl (20 ng) of linear plasmid (T7+ or T7-) and incubated at 37 °C (2 min was empirically found to allow for nearly 100% blockade of the non-permitted direction). At time t = 0, 3 µl of prewarmed label mix with 25 µCi of [γ-32P]GTP (DuPont NEN, NEG-004Z, 6000 Ci/mmol) in 1× T7 buffer was added to each tube. At t = 2, 5, and 10 min, 6-µl samples were taken from each reaction and stopped by mixing with 50 µl of stop buffer (2 M guanidinium isothiocyanate, 50% phenol, and 10 µg/ml tRNA). The samples were ethanol precipitated twice, resuspended in a loading buffer containing 90% formamide, heated to 80 °C, and loaded on a prewarmed 7% polyacrylamide sequencing gel containing 8 M urea. A sequence reaction provided a size marker.
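The quantities in this protocol imply a large excess of unlabeled over labeled GTP, which can be checked with simple arithmetic. The sketch below is illustrative only and rests on assumptions not stated in the text: that the 25 µCi refers to the amount added per tube, that the label mix contributes no unlabeled GTP, and that radioactive decay since the reference date is ignored. The function names are hypothetical.

```python
def labeled_gtp_pmol(activity_uci: float, specific_activity_ci_per_mmol: float) -> float:
    """Amount of labeled nucleotide in pmol.
    µCi -> Ci (1e-6), Ci / (Ci/mmol) -> mmol, mmol -> pmol (1e9)."""
    mmol = (activity_uci * 1e-6) / specific_activity_ci_per_mmol
    return mmol * 1e9


def unlabeled_pmol(conc_mM: float, volume_ul: float) -> float:
    """mM x µl gives nmol; multiply by 1000 for pmol."""
    return conc_mM * volume_ul * 1000.0


# Values from the protocol; per-tube interpretation is an assumption.
hot_pmol = labeled_gtp_pmol(25.0, 6000.0)      # [gamma-32P]GTP added at t = 0
cold_pmol = unlabeled_pmol(0.5, 20.0)          # unlabeled GTP in 20 µl of buffer
total_volume_ul = 20.0 + 1.0 + 3.0             # buffer + template + label mix

cold_to_hot = cold_pmol / hot_pmol
final_gtp_uM = (cold_pmol + hot_pmol) / total_volume_ul  # pmol/µl equals µM

print(f"labeled GTP:   {hot_pmol:.2f} pmol")
print(f"unlabeled GTP: {cold_pmol:.0f} pmol")
print(f"unlabeled:labeled ratio: about {cold_to_hot:.0f}:1")
print(f"final GTP concentration: about {final_gtp_uM:.0f} µM")
```

Under these assumptions the labeled nucleotide is a trace component, so adding it at t = 0 should not materially change the GTP concentration established during the 2-min preincubation.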

Electron Microscopy

Unless otherwise specified in the figure legends, reactions were stopped by ethanol precipitation and aliquots of templates were resuspended in the appropriate spreading mix. DNA samples were titrated into either a denaturing (50% formamide, 100 mM Tris pH 8.0, 10 mM EDTA, and 5 µg/ml cytochrome c) or protein-free (10-100 µg/ml ethidium bromide, 0.5-4 mM MgCl2, and 10 mM HEPES pH 7.5) solution. Drops (50 µl) on parafilm were incubated for 2-20 min in a humidified chamber before being picked up by a parlodion-coated grid. The grids were stained for 60 s with 100 µM uranyl acetate in 70% ethanol, rinsed with 70% ethanol, then blotted dry. Grids were rotary-shadowed with platinum/palladium (80:20) at an average angle of 7.5° at 2 times 10 torr.


RESULTS

The possibility that transcription might cause the GAP-43 upstream R•Y DNA tract (Fig. 1, top) to adopt an alternative structure was first tested with supercoiled templates. The templates were transcribed from flanking phage promoters, and the effect of such transcription on the mobility of supercoiled templates was analyzed by gel electrophoresis (Fig. 1, bottom). Transcription initially produces a smeared appearance for all templates (Fig. 1, bottom, lanes 4-9). The majority of this smearing is resolved by treatment with single strand-specific RNases, suggesting that it is due to RNA tangled with the supercoiled templates. The parental plasmid transcribed in either direction, and the plasmids with insert transcribed to make a predominantly pyrimidine (Y) transcript, are fully resolved by RNase treatment (lanes 10, 11, 13, and 14). However, templates transcribed to make a predominantly purine transcript are resistant to complete resolution by single strand-specific RNases (lanes 12 and 15). The same results were obtained with pSP73, 73T7-, and 73T7+ (not shown). Only transcription to make a purine-rich transcript produces this RNase-resistant mobility shift. It is independent of the polymerase used, orientation within the plasmid, or distance from the promoter. RNase treatment of the untranscribed plasmids had no effect (not shown). The several discrete species that remain in lanes 12 and 15 after RNase treatment exhibit the mobility of relaxed conformers, as has been shown by Reaban and Griffin (9) for a different R•Y tract in a supercoiled template. The sequence we tested resulted in greater relaxation of supercoils than was evident in the figures presented in that report (9), an indication that it may have a higher propensity to form alternative structures.

A Long Purine-Pyrimidine Tract Permits Transcription in Only One Direction

We speculated that local supercoiling generated by polymerase might suffice to drive the formation of an intramolecular triplex on this upstream R•Y sequence on linear DNA without the benefit of previously existing negative supercoils. Therefore, linear plasmids with the R•Y tracts inserted in opposite orientations (Fig. 2, bottom) were transcribed with T7 RNA polymerase. The products of these reactions were subjected to electrophoresis through an agarose gel to determine if transcription through the R•Y repeats could trigger formation of non-B structures on linear templates, as revealed by altered electrophoretic mobility.


Figure 2: Transcription through an R•Y tract is blocked in one direction. Linear plasmids with the GAP-43 upstream R•Y tracts inserted in opposite orientations are illustrated at the bottom, with boxes indicating the length contribution and orientation of the R•Y tracts. The cut point is 10 base pairs downstream of the sequence shown in the top of Fig. 1 for the GA-rich direction, and 40 bp upstream of the sequence as shown, for the inverted (CU) direction. Lanes 1 and 2 show the mobility of the control linear plasmids. Lanes 3-8 show mobility after T7 transcription followed by no further treatment (lanes 3 and 4), treatment with RNases specific for single strands (lanes 5 and 6), or treatment with an RNase specific for the RNA in RNA•DNA hybrids (lanes 7 and 8). Transcription to make a purine-rich RNA (lane 3) results in no detectable product and retarded template mobility. Transcription producing a pyrimidine-rich RNA (lane 4) results in an abundance of product and no alteration in template mobility. The retardation of the template is partially reduced by RNase A and T1 treatment (lane 5) and completely eliminated by RNase H treatment (lane 7). Lane M, λ DNA BstEII digest.



Transcription that produces a purine-rich RNA (Fig. 2, lane 3) causes a very large mobility shift in the linear DNA template but generates little free product RNA. Conversely, transcription that produces a pyrimidine-rich RNA transcript (Fig. 2, lane 4) generates an abundance of product and no alteration in the mobility of the template. The aberrant migration of templates generating purine transcripts (lane 3) appears to be due to retention of nascent transcripts and polymerase. The combination of single strand-specific RNases (A and T1) reduces the retardation (lane 5). Partial protection from RNase A and T1 implies that some of the RNA can base pair with the template, as is confirmed by the complete resolution by RNase H (lane 7), which is specific for the RNA in an RNA•DNA hybrid. As expected, RNase A and T1 completely remove the product of opposite orientation transcription (lane 6), while RNase H treatment does not affect the mobility of the template or transcripts from that transcription (lane 8).

Electron Microscopy Shows a Stalled Transcription Complex

To further investigate the nature of the mobility-altering structural changes, transcribed templates were analyzed by electron microscopy after preparation with a protein-free spreading technique that would allow for visualization of bound proteins and better retention of structure than conventional denaturing methods (11). After transcription in the permitted direction (Fig. 3A), the templates have a uniform appearance, identical to that of untranscribed controls (not shown), except that they are surrounded by transcripts. The molar excess of transcripts indicates that these templates sustained multiple rounds of transcription. This is consistent with the appearance these templates exhibit in lane 4 of Fig. 2. Since the pyrimidine-rich transcripts migrate as a single band of expected size when electrophoresis is carried out under denaturing conditions (Fig. 4), the variable appearance of the transcripts in the EM (and the smearing in Fig. 2) is most likely the result of secondary structures.


Figure 3: Electron micrographs reveal a knotted complex. A, EM of templates transcribed in the "permitted" direction (pyrimidine RNA, as in lane 4 of Fig. 2). Small structures corresponding to RNA transcripts are in large molar excess over the plasmid templates. Of 221 templates viewed, 218 (99%) had this simple, linear appearance. B, EM of templates transcribed in the "blocked" direction (purine RNA, as in lane 3 of Fig. 2). Knotted complexes near the end of the templates may correspond to stalled transcription units. Of 251 templates scored, structures were present at one end in 240 (96%), while eight had a simple linear appearance and three had additional structures away from the end. Note the absence of anything resembling free transcript molecules. C, EM of templates transcribed in the blocked direction in which the R•Y tract is located closer to the center. Templates were generated by cutting the 2987-bp templates 660 bp downstream of the site used in B. The point of knotting is closer to the center, as predicted, in 30 (88%) out of 34 templates viewed. Again, little that might correspond to free transcripts is visible in these spreads. D, a typical molecule, as in B, at higher resolution. A sharp bend causes the free end (wide arrow) to bend back sharply, almost touching the plasmid at a point upstream of the knot. Dark stained beads near the bend point (small arrows) may correspond to polymerase units. Roughly 80% (194/240) of the templates with complexes at the end exhibit a similar conformation. In templates where the free end is clearly visible, the bend ranges from 90 to 180 degrees (as in B). E, about 20% of the templates with complexes at the end (46/240) have a more "relaxed" appearance, exhibiting an open loop (hollow arrow) of varying size. Templates with a large open loop, such as this example, exhibit no particular orientation of the free end (wide filled arrow) in relation to the rest of the template. The dark staining beads (small arrows) may correspond to polymerase, giving this the appearance of an extended transcription bubble. Size bar equals 0.25 µm.
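The template tallies quoted in this legend can be re-derived from the raw counts. The short sketch below simply recomputes the percentages and checks that the sub-counts add up; the dictionary layout and helper name are illustrative choices, and the counts are those given in the legend.

```python
def percent(count: int, total: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / total)


# Counts as reported in the legend to Fig. 3.
panel_a = {"linear": 218, "total": 221}
panel_b = {"complex_at_end": 240, "linear": 8, "extra_structures": 3, "total": 251}
panel_c = {"knot_near_center": 30, "total": 34}
end_complexes = {"sharp_bend": 194, "open_loop": 46}  # subsets of the 240 in panel B

print("A, linear:", percent(panel_a["linear"], panel_a["total"]), "%")
print("B, complex at end:", percent(panel_b["complex_at_end"], panel_b["total"]), "%")
print("C, knot near center:", percent(panel_c["knot_near_center"], panel_c["total"]), "%")
print("D-type (sharp bend):", percent(end_complexes["sharp_bend"], 240), "%")
print("E-type (open loop):", percent(end_complexes["open_loop"], 240), "%")

# Internal consistency of the reported sub-counts.
b_sum = panel_b["complex_at_end"] + panel_b["linear"] + panel_b["extra_structures"]
de_sum = end_complexes["sharp_bend"] + end_complexes["open_loop"]
print("panel B categories sum to total:", b_sum == panel_b["total"])
print("bend + loop classes sum to 240:", de_sum == 240)
```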




Figure 4: Transcription is blocked at multiple points. End-labeled transcripts show the accumulation of CU-rich RNA after 2, 5, and 10 min of labeling (lanes 1-3) and the lack of accumulation of GA-rich transcripts over the same period (lanes 4-6). Longer exposure of lanes 4-6 (lanes 4′-6′) shows multiple points of blockade to GA production. The schematic map indicates the distance, in bases, from the point of T7 initiation (bottom) to the cut end of the plasmid (top) for the GA-rich RNA. The total length of the expected transcript is 621 bases, including 47 bases of plasmid sequence 5′ to the insert shown in Fig. 1. The majority of stops occur 5′ to and within the 5′ end of the polypurine tracts. Blockade does not appear to occur within the 3′ half of tract I (210-290 bases) nor beyond the 5′ portion of tract III (450 to the end of the template).



Fig. 3B illustrates the appearance of the templates transcribed to make a purine-rich RNA product (blocked direction), corresponding to those with an altered mobility in lane 3 of Fig. 2. The 2996-bp templates had been made linear 10 bp downstream of the sequence shown in Fig. 1 and exhibit structures on one end that are consistent with stalled transcription complexes. The position of the complex is close to the end of the linear DNA in Fig. 3B, as predicted by the near-terminal location of the R•Y tracts on the templates. When templates are made linear by cutting 660 bp downstream of the point used in Fig. 3B, the knotted complex formed after transcription in the blocked direction is more toward the center of the template (Fig. 3C). This demonstrates that the R•Y tract determines the position of the knot and that the unidirectional transcription block does not require proximity to a free end.

In Fig. 3, D and E, individual templates from a preparation shown in Fig. 3B are displayed at higher resolution. In Fig. 3D, the knotted structure near the end of the template includes a sharp bend that causes the free duplex end of the template (wide arrow) to kink back sharply toward the rest of the template. Such a kink is predicted by models of intramolecular triple helices (12). Triplex-induced kinks may have consequences for both transcription and recombination by positioning formerly distant sequences (4, 6, 12, 13), as in this example. The small arrows point to blobs with tails that we believe are polymerase units with nascent transcripts that remain with the template. No fixation step was used in any of these experiments, so the association of polymerase/nascent transcript with the template must be fairly stable. The template exhibited in Fig. 3E has a more relaxed appearance. The free end of the template (wide arrow) is similar in length to the one present in Fig. 3D but is not kinked. The sharp bend is replaced by a bubble (wide, hollow arrow). The dark stained objects opposite this (small arrows) may correspond to polymerase molecules. We suggest that Fig. 3D is consistent with the expected conformation of a template containing an intact triple helix and that Fig. 3E is consistent with one that has unwound after trapping RNA polymerase.

The structures corresponding to stalled transcription units in Fig. 3 predicted the existence of truncated transcripts. Such structures sometimes appeared to be bound at slightly different distances from the free end of comparable templates (Fig. 3C), and some templates appear to contain several such structures bound (e.g. Fig. 3B, Fig. 3D, and Fig. 3E), which predicted multiple truncation points. To examine the extent and location of these transcription stop points, transcripts were end-labeled during transcription and the products analyzed on a denaturing polyacrylamide gel.

End-labeled Transcripts Reveal Multiple Points of Arrest

Templates with opposing insert orientations were linearized as in Fig. 2, to place the R•Y sequences at the end. T7 polymerase initiates transcription with a G residue outside the polylinker of these templates. Transcription in the presence of γ-labeled GTP allows only the first G residue to retain label, regardless of insert orientation, thus avoiding the inherent strand bias of the R•Y tracts. In Fig. 4, lanes 1-3 show the progressive accumulation of a single band corresponding to the predicted 612 nucleotide pyrimidine-rich transcript 2, 5, and 10 min after addition of label. In contrast, samples taken 2, 5, and 10 min from parallel transcription reactions to make a purine-rich transcript through the R•Y tracts (lanes 4-6) show little accumulation of transcript over the same time.

Lanes 4′-6′ are an 8-fold longer exposure of lanes 4-6, to show the points of arrest during purine-rich transcription. There are many stop points spanning nearly 400 bases (over 10% of the total template length). The stopping points cluster into two primary groups, those between 300 and 400 bases and those below 210 bases in length. The schematic map shown to the right indicates the 621 base region transcribed (bottom to top) to produce the purine-rich material in lanes 4-6. Thin lines join selected size intervals on the gel to the corresponding point on the schematic, so that stop points can be correlated with the locations of the R•Y tracts. It can be seen that the lower group of stop points is located 5′ to, and within the 5′ half of, tract I. Transcription does not stop from midway through the first R•Y tract to its end (from around 210 to 300 bases). Similarly, the second major cluster of stops is 5′ to, and within the 5′ end of, tract II. Few additional stops occur from midway through this tract to the cut end of the plasmid (621 bases, or the top of the schematic). Given the frequency of transcription stops in both repetitive and non-repetitive sequences, recognition of a particular stop sequence by the polymerase seems unlikely. Rather, it appears that structures formed by the R•Y sequences block entry by RNA polymerase. The many stop points may indicate either that a number of structures are formed, or that polymerase units stack 5′ to a few structures. The existence of two primary clusters of stops indicates that there are likely to be at least two classes of structures.
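The grouping of arrest points described above can be expressed as a simple binning of truncated-transcript lengths. The sketch below is purely illustrative: the stop lengths are hypothetical values invented for the example (no band sizes are tabulated in the text), the cluster boundaries are the approximate ones quoted above, and the 2994-bp template length is taken from the 72T7+ plasmid in Construction of Plasmids on the assumption that a template of about that size was used.

```python
from collections import Counter


def classify_stop(length_nt: int) -> str:
    """Assign a truncated-transcript length (bases from T7 initiation)
    to one of the two clusters described in the text."""
    if length_nt < 210:
        return "lower cluster (<210)"
    if 300 <= length_nt <= 400:
        return "upper cluster (300-400)"
    return "outside clusters"


# HYPOTHETICAL stop lengths for illustration; not measured data.
hypothetical_stops = [95, 140, 180, 205, 250, 310, 345, 390, 480]

tally = Counter(classify_stop(n) for n in hypothetical_stops)
for cluster, count in tally.items():
    print(f"{cluster}: {count}")

# Fraction of the template spanned by the roughly 400-base stop region.
template_length_bp = 2994  # assumed: 72T7+ size from the plasmid construction
span_fraction = 400 / template_length_bp
print(f"stop region spans about {100 * span_fraction:.0f}% of the template")
```

With real band sizes read against the sequencing ladder, the same binning would allow each stop to be assigned to the region 5′ of, or within, a given tract.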

Separate R•Y Tracts May Cooperate

The stability of the H-DNA triplex has been demonstrated to be dependent on both the length and sequence of an R•Y repeat (14, 15, 16, 17). It is likely that the unidirectional transcription blockade by the proposed R-R•Y triplex is also subject to some sequence and length constraints. Its greater promiscuity in predicted bonding possibilities (18, 19, 20) suggests that its formation requires less dyad symmetry than does H-DNA (15, 16). Whatever the length requirement, the opportunities for this transcription block to occur are enhanced if multiple, separated tracts can interact. We took advantage of the nature of the region studied here (see Fig. 1A) to determine if multiple, heterogeneous R•Y tracts could interact over a short distance.

Fig. 5 shows the changes in gel mobility for templates that had been transcribed in the blocked direction into tract I by itself (lanes 1-4), tracts I and II (lanes 5-8), or all three (lanes 9-12). When more than one tract is transcribed, additional smearing and a slight mobility change are apparent even after RNase treatment (lanes 7 and 8). When all three tracts are transcribed, two broad smears that may correspond to multiple mobility classes are visible (lanes 11 and 12). These span the mobility characteristic of transcription through just tract I (lanes 3 and 4) and tracts I and II (lanes 7 and 8). Multiple mobility classes are also apparent when all three tracts are transcribed in the blocked direction in supercoiled plasmids (Fig. 1, lanes 12 and 15). Our interpretation is that transcription through all three R•Y tracts can generate a variety of triplex structures both within and between the separate tracts.


Figure 5: The three GAP-43 R•Y tracts cooperate in forming mobility-altering structures. The plasmid 73T7+ was digested with restriction enzyme NsiI (lanes 1-4), NheI (lanes 5-8), or BglII (lanes 9-12) to include either one, two, or all three R•Y tracts downstream of the T7 promoter in the blocked orientation. One-fourth of each digest was left untreated. The rest was reacted with T7 polymerase and then incubated with no RNase (-), 0.2 µg/ml RNase A and 10 units/ml T1 (+), or 20 µg/ml A and 1000 units/ml T1 (+++) for 30 min at 37 °C. The inclusion of all three tracts leads to a greater degree of complexity in mobility after RNase treatment.



This was confirmed by direct EM visualization of templates corresponding to those in lane 11 of Fig. 5 after preparation with either a denaturing spreading technique (21) or a non-denaturing technique (11). The fields in Fig. 6, A and B, show the predominant classes of structures found in these preparations. Formamide is predicted to denature the putative triplex, resulting in an R-loop (Fig. 6C, top). The denatured templates shown in Fig. 6A clearly demonstrate the presence of the predicted RNA•DNA hybrids as multiple size classes of R-loops.


Figure 6: Multiple structures formed by transcription resist single strand-specific RNase treatment. A, partial denaturation by formamide reveals the predicted RNA•DNA hybrids as R-loops. Of 119 molecules scored, 58 (49%) had a large single loop, 27 (23%) had a small single loop, 23 (19%) appeared to be linear, 9 were Y-shaped, and 3 had two separate loops. B, lack of denaturation reveals knotted complexes. Ethidium bromide intercalates into double-stranded nucleic acid and extends it, giving the templates a long thin appearance. The templates have a characteristic knot and loop structure, and the knot corresponds to the predicted location of the putative R-R•Y triplex. The loops in B correspond to the RNase-resistant RNA•DNA hybrids that appear as R-loops in A. Of 82 separate molecules scored, 30 (37%) had a large loop (form 3), 17 (21%) had a double loop (form 2), 15 (18%) had a small loop (form 1), and 4 (5%) were linear. An additional 16 molecules had a loop evident but were not spread enough to interpret as structures 1, 2, or 3. C, interpretations. After spreading with formamide/cytochrome c, the triplex would be denatured, leaving R-loops which should vary in size. Below are shown interpretations of the three major species found in the ethidium bromide/magnesium spreads. Thickened lines in the area of the triplex regions are used to indicate the condensed nature of the triplex. Size bars equal 0.25 µm.



When spread using the non-denaturing technique to permit better retention of structure, the templates exhibit the conformations displayed in Fig. 6B. Interpretations of the non-denatured conformations are shown at the bottom of Fig. 6C. The knot in the templates shown in Fig. 6B is characteristic and corresponds to the predicted location of the proposed R-R•Y triplex structure. This condensed area is not the polymerase, which stains much more densely in these preparations (see Fig. 3C for a polymerase next to a knot). The loops correspond to the RNA•DNA hybrids that are seen as R-loops under the denaturing conditions in Fig. 6A (compare also to Fig. 3, D and E). The various structures shown in Fig. 6 indicate that R-R•Y triple helices are capable of forming between non-symmetrical R•Y sequences and across a distance.


DISCUSSION

Transcription into the GAP-43 upstream R•Y tract in the direction that produces a predominantly purine RNA triggers formation of a structure that traps the transcription complex and blocks subsequent rounds of transcription. Since it permits transcription in only one direction, we refer to the R•Y tract as a DNA diode.

The documented ability of R•Y sequences to form triple helices suggests that the blocking complex contains a triple helix. Blockade occurs only when a purine-rich RNA is generated, so any triple helix formed is not likely to be the well-described Y-R•Y (H-DNA) triplex, but rather the more recently described R-R•Y structure (5, 18, 22, 23). This would be true whether this RNA acts as a third strand or replaces the "donated" strand in an intramolecular triplex. The relaxation of negatively supercoiled templates in Fig. 1 is indicative of an intramolecular triplex.

This is the first example of a transcription elongation block by an intramolecular triplex, and its dynamic nature is somewhat surprising. Triple helices have been shown to interfere with transcription in other systems. Oligonucleotides acting as the third strand in a triplex near the site of a DNA-binding protein needed for transcription activation may compete with it and thereby lower transcription initiation (24, 25, 26, 27). Oligonucleotide-directed interference with in vitro transcription elongation by intermolecular triple helix formation has been reported for Escherichia coli RNA polymerase (28) and for RNA polymerase II (29). In both cases chemically modified pyrimidine oligonucleotides caused some pausing of polymerase, but efficient blockade of transcription elongation was observed only when these oligonucleotides were covalently cross-linked to the templates. In contrast, the transcription elongation block by the naturally occurring R•Y repeats we observe is due to an intramolecular R-R•Y triplex and needs no base modifications, exogenous oligonucleotides, or cross-linking for its formation and stability. It is also very efficient.

An advancing polymerase generates local negative supercoiling in its wake (7, 8, 30), while intramolecular triple helices relax negative supercoils and can form in response to increased negative supercoiling (4, 6, 17). The diagram in Fig. 7 (A and B) is presented as a way to visualize the combination of these events. As a polymerase moves through the R•Y tract, local negative supercoiling behind the polymerase helps drive the formation of a triple-stranded structure. As the donated third strand is wound down the major groove, the rotation of the acceptor helix releases negative supercoils (Fig. 7A). The spring-like tension relaxed by the denaturation and winding back of the third strand helps make formation of the triplex energetically favorable.


Figure 7: Genesis of the DNA diode. A, when RNA polymerase moves through an R•Y tract in the direction that makes purine RNA, the local negative supercoiling generated in the wake of the polymerase drives the formation of a triple-stranded structure. As the acceptor duplex DNA rotates (direction shown by wide arrow), it releases negative supercoils. This rotation pulls the donated third strand (direction shown by small arrow) into the major groove, much as a string would be wrapped around a rotating screw. The initiation and progression of this triple strand formation are probably also aided by the polymerase-induced denaturation of the DNA duplex and, perhaps most importantly, by the propensity of the R-R•Y triplex to form. B, after formation, an RNA•DNA duplex stabilizes the triplex. The triplex blocks subsequent passage by RNA polymerase. Polymerase in the loop during triplex formation may become trapped, but how this occurs is not yet clear. Because RNase treatment specific for single strands can remove trapped polymerase molecules, we favor a mechanism of steric hindrance by a higher order structure(s) involving the free end of the transcript. For clarity, this has been omitted from the figure. C, a representation of an R-R•Y triplex (the H-r3 conformation) that could form behind an advancing polymerase. The additional purine strand is antiparallel to the central purine strand, and all bonds are stable at physiologic pH. Normal Watson-Crick base pairs are indicated by a single large dot, and alternative bonds stable at physiologic pH are indicated by two small dots. D, a representation of the Y-R•Y or "H-DNA" triplex is shown (H-y3 conformation). The pyrimidine third strand is paired in parallel to the central purine strand. The G-C Hoogsteen bonds require protonation and are indicated by a plus sign. In both triplex structures shown, a donated strand folds back in an intramolecular triplex and would alter the supercoiling. In contrast, an intermolecular triplex would not affect supercoiling.



As shown in Fig. 7B, an RNA·DNA hybrid stabilizes the triplex (9) by blocking the donated purine DNA strand from reassociation. The sensitivity of the complex to RNase H, resistance to RNases A and T1, and direct EM visualization of R-loops all support the hypothesis that an RNA·DNA hybrid with normal base pairing stabilizes the structure. Such kinetic trapping of a triplex by occupation of the complement to the donated strand has been demonstrated for the H-DNA triplex using pyrimidine oligonucleotides (31) and has been proposed to result in the protection of purine oligonucleotides by the R-R·Y triplex in vivo (32).

We believe that the diode-like effect of the R·Y sequences stems from differential stability of the two potential types of intramolecular triple helices generated by transcription. The triple helix under discussion here, the R-R·Y triplex (Fig. 7C), is stable at physiologic pH, and its stability is enhanced by the presence of divalent cations, conditions that are likely to prevail in the nucleus (18, 22, 23, 32, 33, 34).

The Y-R·Y triplex (Fig. 7D) might be expected to form in response to transcription that makes a pyrimidine-rich RNA. However, cytosine bases on the third strand must be protonated to make Hoogsteen base pairs in a C+G·C triplet, and H-DNA usually requires a low pH to achieve triplex formation (4, 6). Even then, when free energy was calculated, the stability of the third strand in an R-R·Y triplex at pH 7.3 was found to be twice that of the third strand in the corresponding Y-R·Y structure at pH 5.5 (35, 36). Had the Y-R·Y (H-DNA) triplex been as stable as the R-R·Y triplex under conditions of transcription, both directions should have been blocked.

A prior report suggesting transcription-mediated formation of intramolecular triple helices on supercoiled templates originally did predict an H-DNA triplex stabilized by a pyrimidine RNA (9), which would seem to be at odds with our findings. However, the model was subsequently modified to that of an R-R·Y triplex (37), in accordance with what we have found.

The conditions for triplex formation may be less stringent for the R-R·Y structure even beyond that of pH requirements, since it appears able to form without strict mirror symmetry, between non-contiguous tracts, and on linear DNA. In Fig. 8, a model of one of the possible interactions between tracts I and II is shown. The apparent mismatches between these non-mirror tracts may not be as destabilizing as they appear, given the relative promiscuity and flexibility in the proposed hydrogen bonding schemes for R-R·Y triplets (18, 19, 20). Accordingly, transcription-triggered formation of an R-R·Y triplex as presented in Fig. 7 may be able to initiate anywhere within the perfectly symmetrical (GA) repeats and also at a number of points within the more variable sequences of tracts II and III. The only requirement may be that any triplex formed must be stable enough to remain after the transient wave of negative supercoiling has dissipated.
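The notion of mirror symmetry in a purine strand can be made concrete with a small sketch. The Python below is an illustration only, not part of the analysis reported here: the function names `mirror_arm` and `longest_mirror_repeat` are hypothetical, only base-centered (odd-length) mirror repeats are considered, and exact matching is assumed, whereas the text above suggests that R-R·Y triplets may tolerate mismatches.

```python
def mirror_arm(seq, center):
    """Count how many bases on each side of `center` mirror each other.

    A mirror repeat reads the same forward and backward on ONE strand
    (unlike an inverted repeat, which involves the complement).
    """
    arm = 0
    while (center - arm - 1 >= 0
           and center + arm + 1 < len(seq)
           and seq[center - arm - 1] == seq[center + arm + 1]):
        arm += 1
    return arm


def longest_mirror_repeat(seq):
    """Return (start, end) of the longest base-centered mirror repeat.

    `end` is exclusive, so seq[start:end] is the repeat itself.
    Ties are resolved in favor of the leftmost center.
    """
    best = (0, 0)
    for center in range(len(seq)):
        arm = mirror_arm(seq, center)
        if 2 * arm + 1 > best[1] - best[0]:
            best = (center - arm, center + arm + 1)
    return best


# Hypothetical example: in a pure GA repeat every base is a mirror
# center, limited only by the distance to the nearer end of the tract.
ga_repeat = "GA" * 5
arms = [mirror_arm(ga_repeat, c) for c in range(len(ga_repeat))]
```

In a perfect alternating GA repeat the arm at each position is bounded only by the tract ends, which is one way to picture why initiation could occur anywhere within such a tract; a less regular purine run gives short or zero arms at most positions.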


Figure 8: A possible alignment of R·Y tracts I and II across 70 bp of non-R·Y DNA. An example of how non-mirror R·Y tracts may interact. The sequence shown is from Fig. 1 and is numbered accordingly. It shows one possible alignment of a 3′ section of tract I with the purine strand of tract II. The hairpin shown is one of several roughly equivalent possibilities predicted by a folding program (38, 39). Cruciform extrusion of intervening DNA may assist in the formation of alternative structures between separate R·Y tracts (40). The pyrimidine-rich strand is shown only in the area of the triplex for clarity but is continuous with the rest of the plasmid on both ends (as is the folded strand). As in the previous figure, a single black dot denotes normal base pairs, and two small dots indicate alternative interactions.



In the example in Fig. 8, the intervening non-R·Y DNA between tracts I and II is shown as a self-paired hairpin, one of several predicted by a folding program (38, 39). The non-R·Y DNA between regions II and III is a perfect inverted repeat (centered on base 369 in Fig. 1). It is intriguing that both intervening non-R·Y segments are potentially capable of forming hairpins, since cruciform extrusion of inverted repeats between separate R·Y tracts has been demonstrated to facilitate the interaction of those tracts (40). Moreover, cruciform extrusion of palindrome sequences is driven by negative supercoiling and initiated by a denaturation event (41). Transcription could be expected to aid cruciform extrusion in a way analogous to the model presented in Fig. 7 for triplex formation.

Given the propensity of these structures to form in vitro, and their remarkable stability after formation, it seems very likely that they do exist, at least transiently, in vivo. R·Y tracts of greater than 20 consecutive base pairs are rare in prokaryotes and phage but are overrepresented in the genomes of higher eukaryotes, and in birds and mammals in particular (1, 2). In theory, long R·Y tracts may be able to regulate transcription without additional DNA-binding proteins. Whether or not that is the case, the relatively facile formation of these structures suggests a need for enzymes to resolve them.

Our results suggest that the R·Y tracts upstream of the major GAP-43 transcription starts may serve to block polymerase units starting upstream of the gene from transcribing through it. R·Y sequences are present in the upstream regions of numerous genes (4, 12), but since the observed transcription blockade is directional, analysis of orientation is necessary before concluding that it has more general relevance as a motif upstream of tightly controlled transcription units.
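The kind of orientation analysis referred to above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in this work: the function name `find_ry_tracts`, the "rightward"/"leftward" labels, and the example sequence are hypothetical, the input is taken to be one strand written 5′ to 3′, and only uninterrupted runs of at least `min_len` bases are counted (the 20-bp default follows the tract length mentioned above).

```python
import re


def find_ry_tracts(seq, min_len=20):
    """Find uninterrupted purine or pyrimidine runs on the given strand.

    `seq` is one strand written 5'->3'. Each hit is reported as
    (start, end, kind, purine_rna_direction) with `end` exclusive.

    Direction logic: RNA made by rightward transcription has the same
    sequence as the given strand, so a purine run on this strand yields
    purine RNA when transcribed rightward. A pyrimidine run yields
    purine RNA when transcribed leftward, because that RNA is
    complementary to the given strand.
    """
    seq = seq.upper()
    patterns = (
        ("purine", r"[AG]+", "rightward"),
        ("pyrimidine", r"[CT]+", "leftward"),
    )
    tracts = []
    for kind, pattern, direction in patterns:
        for match in re.finditer(pattern, seq):
            if match.end() - match.start() >= min_len:
                tracts.append((match.start(), match.end(), kind, direction))
    return sorted(tracts)


# Hypothetical test sequence: a 24-base purine run followed by a
# 22-base pyrimidine run, separated by mixed sequence.
example = "ACGT" + "GA" * 12 + "CATG" + "CT" * 11 + "ACGT"
hits = find_ry_tracts(example)
```

For the example sequence, `hits` contains one purine tract at positions 4 to 28 flagged "rightward" and one pyrimidine tract at positions 32 to 54 flagged "leftward", meaning that the two tracts would be expected to block transcription arriving from opposite directions. Real tracts contain interruptions, so a practical scan would likely need to tolerate a small number of mismatches.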

For ribosomal RNA, the potential for R·Y sequences to play an important role in transcription blockade is particularly striking. The tandem array of genes coding for ribosomal RNA contains long R·Y tracts in the so-called "non-transcribed spacer" (42). In the human ribosomal non-transcribed spacer, R·Y tracts constitute more than 10% of the sequence (43), five R·Y homopolymers range from 50 to 200 bp in length, and all are oriented such that antisense transcription into this region would make a predominantly purine RNA and cause triplex formation. This could serve to prevent production of antisense ribosomal RNA from cryptic initiation sites upstream of the strong polymerase I enhancer.


FOOTNOTES

*
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number(s) L21190.

§
To whom correspondence should be addressed: Developmental Biology Laboratory & Cardiovascular Research Center, Massachusetts General Hospital, 149 13th St., 4th Floor, Charlestown, MA 02129-2600. Tel.: 617-726-3738; Fax: 617-726-5806.

(^1)
The abbreviation used is: bp, base pair(s).


ACKNOWLEDGEMENTS

We thank Richard Kolodner for use of the EM. Thanks also to Arlen Johnson, David Norris, and Karen Rock for helpful tips and assistance with EM.


REFERENCES

  1. Manor, H., Rao, B. S., and Martin, R. G. (1988) J. Mol. Evol. 27, 96-101
  2. Behe, M. J. (1987) Biochemistry 26, 7870-7875
  3. Larsen, A., and Weintraub, H. (1982) Cell 29, 609-622
  4. Wells, R. D., Collier, D. A., Hanvey, J. C., Shimizu, M., and Wohlrab, F. (1988) FASEB J. 2, 2939-2949
  5. Bernues, J., Beltran, R., Casasnovas, J. M., and Azorin, F. (1989) EMBO J. 8, 2087-2094
  6. Htun, H., and Dahlberg, J. E. (1989) Science 243, 1571-1576
  7. Liu, L. F., and Wang, J. C. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 7024-7027
  8. Wu, H. Y., Shyy, S. H., Wang, J. C., and Liu, L. F. (1988) Cell 53, 433-440
  9. Reaban, M. E., and Griffin, J. A. (1990) Nature 348, 342-344
  10. Grabczyk, E., Zuber, M. X., Federoff, H. J., Ng, S.-C., Pack, A., and Fishman, M. C. (1990) Eur. J. Neurosci 2, 822-827
  11. Koller, T., Sogo, J. M., and Bujard, H. (1974) Biopolymers 13, 995-1009
  12. Htun, H., and Dahlberg, J. E. (1988) Science 241, 1791-1796
  13. Kohwi, Y., and Panchenko, Y. (1993) Genes & Dev 7, 1766-1778
  14. Hanvey, J. C., Klysik, J., and Wells, R. D. (1988) J. Biol. Chem. 263, 7386-7396
  15. Mirkin, S. M., Lyamichev, V. I., Drushlyak, K. N., Dobrynin, V. N., Filippov, S. A., and Frank-Kamenetskii, M. D. (1987) Nature 330, 495-497
  16. Voloshin, O. N., Mirkin, S. M., Lyamichev, V. I., Belotserkovskii, B. P., and Frank-Kamenetskii, M. D. (1988) Nature 333, 475-476
  17. Collier, D. A., and Wells, R. D. (1990) J. Biol. Chem. 265, 10652-10658
  18. Beal, P. A., and Dervan, P. B. (1991) Science 251, 1360-1363
  19. Durland, R. H., Kessler, D. J., Gunnell, S., Duvic, M., Pettitt, B. M., and Hogan, M. E. (1991) Biochemistry 30, 9246-9255
  20. Radhakrishnan, I., de los Santos, C., and Patel, D. J. (1991) J. Mol. Biol. 221, 1403-1418
  21. Gordon, C. N., and Kleinschmidt, A. K. (1968) Biochim. Biophys. Acta 155, 5-7
  22. Kohwi, Y., and Kohwi-Shigematsu, T. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 3781-3785
  23. Bernues, J., Beltran, R., Casasnovas, J. M., and Azorin, F. (1990) Nucleic Acids Res. 18, 4067-4073
  24. Cooney, M., Czernuszewicz, G., Postel, E. H., Flint, S. J., and Hogan, M. E. (1988) Science 241, 456-459
  25. Grigoriev, M., Praseuth, D., Guieysse, A. L., Robin, P., Thuong, N. T., Helene, C., and Harel, B. A. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 3501-3505
  26. Gee, J. E., Blume, S., Snyder, R. C., Ray, R., and Miller, D. M. (1992) J. Biol. Chem. 267, 11163-11167
  27. Orson, F. M., Thomas, D. W., McShan, W. M., Kessler, D. J., and Hogan, M. E. (1991) Nucleic Acids Res. 19, 3435-3441
  28. Duval-Valentin, G., Thuong, N. T., and Helene, C. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 504-508
  29. Young, S. L., Krawczyk, S. H., Matteucci, M. D., and Toole, J. J. (1991) Proc. Natl. Acad. Sci. U. S. A. 88, 10023-10026
  30. ten Heggeler-Bordier, B., Wahli, W., Adrian, M., Stasiak, A., and Dubochet, J. (1992) EMBO J. 11, 667-672
  31. Belotserkovskii, B. P., Krasilnikova, M. M., Veselkov, A. G., and Frank-Kamenetskii, M. D. (1992) Nucleic Acids Res. 20, 1903-1908
  32. Michel, D., Chatelain, G., Herault, Y., and Brun, G. (1992) Nucleic Acids Res. 20, 439-443
  33. Kang, S., and Wells, R. D. (1992) J. Biol. Chem. 267, 20889-20891
  34. Kohwi, Y., Malkhosyan, S. R., and Kohwi-Shigematsu, T. (1992) J. Mol. Biol. 223, 817-822
  35. Pilch, D. S., Brousseau, R., and Shafer, R. H. (1990) Nucleic Acids Res. 18, 5743-5750
  36. Pilch, D. S., Levenson, C., and Shafer, R. H. (1991) Biochemistry 30, 6081-6088
  37. Stavnezer, J. (1991) Nature 351, 447-448
  38. Zuker, M. (1989) Science 244, 48-52
  39. Jaeger, J. A., Turner, D. H., and Zuker, M. (1989) Proc. Natl. Acad. Sci. U. S. A. 86, 7706-7710
  40. Klysik, J. (1992) J. Biol. Chem. 267, 17430-17437
  41. Murchie, A. I., and Lilley, D. M. (1987) Nucleic Acids Res. 15, 9641-9654
  42. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) J. Mol. Biol. 215, 403-410
  43. Dickson, K. R., Braaten, D. C., and Schlessinger, D. (1989) Gene (Amst.) 84, 197-200
