©1996 by The American Society for Biochemistry and Molecular Biology, Inc.
The Chicken beta-Globin Gene Promoter Forms a Novel Cinched Tetrahelical Structure (*)

(Received for publication, September 11, 1995; and in revised form, December 20, 1995)

R. M. Howell (§) K. J. Woodford M. N. Weitzmann K. Usdin (¶)

From the Section on Genomic Structure and Function, Laboratory of Biochemical Pharmacology, NIDDK, National Institutes of Health, Bethesda, Maryland 20892-0830


ABSTRACT

We have previously shown that the G-rich sequence G(16)CG(GGT)(2)GG in the promoter region of the chicken beta-globin gene poses a formidable barrier to DNA synthesis in vitro (Woodford et al., 1994, J. Biol. Chem. 269, 27029-27035). The K requirement, template-strand specificity, template concentration independence, and involvement of Hoogsteen bonding suggested that the underlying basis of this new type of DNA synthesis arrest site might be an intrastrand tetrahelical structure. However, the arrest site lacks the four G-rich repeats that are a hallmark of previously described intramolecular tetraplexes and contains a number of noncanonical bases that would be expected to greatly destabilize such a structure. Here we report evidence for an unusual K-dependent intrastrand ``cinched'' tetraplex. This structure has several unique features including the incorporation of bases other than guanine into the stem of the tetraplex, interaction between loop bases and bases in the flanking region, and base pairing between bases 3` and 5` of the tetrahelix-forming region to form a molecular ``cinch.'' This finding extends the range of sequences capable of tetraplex formation as well as our appreciation of the conformational complexity of the chicken beta-globin promoter.


INTRODUCTION

Sequences that cause arrest of DNA synthesis have been identified in plasmids, viruses, and chromosomes. Some of these arrest sites signal the point of replication termination in plasmids and chromosomes (1). Others are associated with phenomena such as the amplification of genomic sequences (2), strand switching during replication (3, 4, 5, 6), or mutational hotspots (7). Some of these sequences act by binding specific proteins that then block progression of the polymerase (8, 9, 10), while others block DNA synthesis by forming DNA structures that are sufficient in and of themselves to impede the progress of the polymerase (2, 6, 11, 12, 13, 14, 15, 16).

Previously identified structures implicated in DNA synthesis arrest include hairpins (14) and triplexes (2, 17). We have recently described the presence of a strong K-dependent DNA synthesis arrest site in a G-rich region of the chicken beta-globin gene promoter (18). The sequence of the arrest site is shown in boldface type in Fig. 1A. The arrest site is composed of three independent blocks to DNA synthesis (K1, K2, and K3), suggesting that three different structures are involved. The first block is the strongest, and under some conditions no chain extension is seen beyond this site (Fig. 1B). The characteristics of this region are not consistent with any previously defined category of DNA synthesis arrest site.


Figure 1: A, sequence of a portion of the chicken beta-globin promoter (GenBank locus: CHKHBBA) (32) indicating the sequence of the previously identified DNA arrest sites and some of the bases flanking the arrest site (18). The arrest site sequence is shown in boldface type with numbers in boldface type above the sequence indicating the numbering scheme for bases in the arrest site that is used in this report. The arrest site sequence is thus labeled 1-26, with G^1 being the 5`-most base in the arrest site and G^26 being the base at the 3` end of the arrest site. G^1 corresponds to the residue 195 bases upstream of the start of transcription. The positions of the previously described DNA synthesis arrest sites are also marked, K1-K3, respectively, and the relative strength of each arrest site is indicated by the number of filled arrowheads at that position. B, arrest of DNA synthesis by the templates G(16)CG(GGT)(2)GG (chicken beta-globin promoter) and G(26) in the presence (+) and absence (-) of 40 mM K. DNA synthesis arrest assays were performed as described previously (18). C, diagram of generic intrastrand tetraplex.



We have previously shown that the underlying physical basis of this block to DNA synthesis is the formation of a series of intrastrand DNA structures that involve Hoogsteen base interactions between guanines (18). It is known that some G-rich sequences associate into higher order structures via guanine tetrad formation. Four DNA strands containing sequences with a single G-rich motif can associate to form an intermolecular tetraplex referred to as G4 DNA (19). Sequences containing two G-rich repeats can form G-G hairpins that can then dimerize to form tetraplexes made up of two DNA strands (20), and sequences with four G-rich repeats or long G runs (21) can fold back to form an intrastrand tetraplex. An example of a generic intrastrand tetraplex is shown in Fig. 1C.

The properties of the chicken beta-globin DNA arrest site are consistent with the formation of an intrastrand tetraplex, in that they are template concentration-independent, are specific to the G-rich strand, are stable at elevated temperatures, require K, and involve non-Watson-Crick base interactions between guanines. The K specificity is particularly compelling since the binding constants of alkali metal ions to the phosphate groups in DNA are known to decrease slightly with increasing metal ion radius, i.e. Li > Na > K > Rb > Cs, and it is therefore difficult to rationalize the K specificity in terms of a hairpin or other similar structure. It has been suggested that the K specificity for tetraplexes results from some sort of size constraint for which K ions are particularly well suited (20). Hydrogen bonding between four DNA strands in an intramolecular tetraplex creates an internal cavity that would exclude large ions such as Cs. Small ions such as Li can fit inside the cavity but are too small to form stable complexes with multiple ligand binding sites within the cavity. It has been claimed that the K ion is both small enough to fit inside the cavity and large enough to be able to bridge multiple binding sites within the cavity, thus forming octahedral coordination complexes with O-6 atoms in adjacent tetrads, thereby stabilizing the tetraplex (20). However, the chicken beta-globin promoter arrest site sequence G(16)CG(GGT)(2)GG lacks the repeated motif normally associated with tetraplexes and contains a number of non-guanine bases that might be expected to reduce the stability of the tetraplex.
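The contrast between the arrest site and a conventional four-repeat motif can be made concrete by listing the guanine runs in each sequence. The following is a minimal sketch, not part of the original analysis. It assumes the 26-base arrest site is read as it appears in the oligonucleotide listed under ``Materials and Methods'' (16 guanines, then CG, two GGT units, and GG), and the function name `g_runs` is hypothetical.

```python
import re

def g_runs(seq, min_len=2):
    """Return (start, length) for each run of consecutive G's.

    Positions are 1-based; runs shorter than min_len are ignored.
    """
    return [
        (m.start() + 1, len(m.group()))
        for m in re.finditer(r"G+", seq.upper())
        if len(m.group()) >= min_len
    ]

# Assumed reading of the 26-base arrest site (see lead-in).
arrest_site = "G" * 16 + "CG" + "GGT" * 2 + "GG"

# A conventional four-repeat motif, (T2G5)4, used later in the paper.
four_repeat = ("TT" + "G" * 5) * 4

print(g_runs(arrest_site))   # one long run followed by three short ones
print(g_runs(four_repeat))   # four equal runs of five guanines
```

Under this reading the arrest site has one long G run and three short ones rather than four equivalent repeats, which is the irregularity the text refers to.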

Data presented here indicate that the chicken beta-globin promoter DNA synthesis arrest site does indeed form an intrastrand tetrahelical structure in the presence of K. However, this structure differs from conventional tetraplexes in a number of important respects. In addition to the incorporation of a number of non-guanine bases into the stem of the tetraplex, the structure is stabilized by interactions between a loop guanine and a guanine in the flanking region and hydrogen bonding between bases in the 5`- and 3`-flanking regions to form a ``cinch'' that holds one end of the tetraplex together. We suggest that the stabilizing effect is due to duplex formation by the G-rich flanking sequence that effectively closes off the ``open'' end of the tetraplex. We also demonstrate that in the absence of K, part of this region is able to form a hairpin containing a mixture of G-G and G-C base pairs. Our findings, together with those that describe the triplex-forming ability of this same sequence, demonstrate the structural complexity of the chicken beta-globin promoter. This conformational complexity may have implications for the transcriptional regulation of this gene. Our data also indicate that since the absence of four perfect G-motifs does not preclude tetraplex formation, the number of potential tetrahelix-forming sequences is much broader than previously thought. Our observations demonstrate a clear link between K-dependent DNA synthesis arrest sites and tetrahelix formation, suggesting that the K-dependent blocks to DNA synthesis might be a general feature and useful diagnostic property of this class of structures.


MATERIALS AND METHODS

Oligomer Synthesis and Purification

Oligonucleotides were synthesized using an Applied Biosystems 381A synthesizer according to standard procedures and resuspended in 10 mM Tris-HCl, 1 mM EDTA, pH 8.0 (TE). Oligonucleotides used for sequencing and polymerase chain reaction amplification were used without further purification. Oligonucleotides used for chemical modification experiments were purified by electrophoresis on 20% denaturing polyacrylamide gels. Bands containing full-length synthesis products were excised and eluted with 100 mM Tris-HCl, pH 8.0, 5 mM EDTA, and 0.3 M NaCl overnight at 55 °C. Eluates were filtered through a Durapore filter and precipitated with ethanol. Oligonucleotides were labeled with [gamma-^32P]ATP using 6 units of T4 polynucleotide kinase (U.S. Biochemical Corp.) in 10 mM Tris-HCl and 10 mM MgCl(2) for 30 min at 37 °C and then purified by elution from Nensorb columns (DuPont NEN) using 40% ethanol. Eluates were evaporated to dryness, and the residue was resuspended in TE.

Chemical Modification

The arrest site oligonucleotide used in these studies contained the arrest site flanked by 10 bases of sequence at the 5` end and 25 bases at the 3` end (5`-GTA CGA ATT CGG GGG GGG GGG GGG GGC GGG TGG TGG TGT GGC TCG AGT CAA CGT AAC ACT TT-3`). Oligonucleotides were suspended in 50 mM Tris-HCl, pH 9.3, 2.5 mM MgCl(2), or TE at a concentration of approximately 13 nM, overlaid with a drop of mineral oil, and denatured for 5 min at 94 °C. Immediately following denaturation, KCl was added to some of the samples to a final concentration of 40 mM. Oligonucleotides were heated for 30 s at 94 °C, 30 s at 55 °C, and 30 s at 72 °C on a Perkin-Elmer thermocycler. Samples were removed from under the oil and placed in screw-capped 1.5-ml tubes.
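The base numbering used in the rest of the paper can be recovered directly from this oligonucleotide. The sketch below is illustrative only: it assumes the arrest site is the 26 bases that follow the 10-base 5` flank, and the names `OLIGO` and `arrest_site_positions` are hypothetical.

```python
# Oligonucleotide sequence as given in the text, with spaces removed.
OLIGO = (
    "GTA CGA ATT CGG GGG GGG GGG GGG GGC GGG TGG TGG TGT GGC "
    "TCG AGT CAA CGT AAC ACT TT"
).replace(" ", "")

FLANK_5 = 10   # bases of 5` flanking sequence, from the text
SITE_LEN = 26  # arrest site bases are labeled 1-26 in the paper

def arrest_site_positions(oligo=OLIGO, flank5=FLANK_5, site_len=SITE_LEN):
    """Map 1-based arrest site positions to bases."""
    site = oligo[flank5:flank5 + site_len]
    return {i + 1: base for i, base in enumerate(site)}

positions = arrest_site_positions()

# Positions within the arrest site that are not guanine.
non_g = {pos: base for pos, base in positions.items() if base != "G"}
print(non_g)
```

With this numbering the single cytosine falls at position 17, in agreement with the position quoted for it under ``Results.''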

Modification of labeled oligonucleotides with dimethyl sulfate (DMS) (^1) was performed using reagents from a Maxam-Gilbert sequencing kit (Sigma) using a modified version of the manufacturer's procedure. Briefly, the oligonucleotide suspensions were diluted with 180 µl of DMS reaction buffer. One microliter and 0.5 µl of DMS were added to tubes with KCl and without KCl, respectively, and incubated at 18 °C for 1 min.

Bromoacetaldehyde (BAA) modification was carried out as follows. BAA was prepared as described previously (22). Labeled oligonucleotides were brought to a volume of 49 µl with distilled H(2)O and mixed with 1 µl of BAA and then incubated for 10 min at 37 °C. The volume was brought to 100 µl with distilled H(2)O, extracted with 100 µl of phenol:chloroform:isoamyl alcohol (25:24:1) and then with 100 µl of ether.

Formic acid modification was carried out using a final concentration of 36% formic acid for 40 s at 18 °C. Osmium tetroxide (OsO(4); Sigma) modification was carried out as described by Palecek (23).

All reactions were stopped by precipitating the oligonucleotides with 1 ml of butanol. Pellets were washed with 70% ethanol and dried under vacuum. Pellets were resuspended in 100 µl of 1 M piperidine, and the cleavage reaction was carried out for 30 min at 90 °C. Reactions were removed from heat and butanol-precipitated. Samples were resuspended in 6-10 µl of distilled H(2)O to which a volume of Sequenase Stop buffer (U.S. Biochemical Corp., Amersham Corp.) had been added. A portion of the reaction was run on a 20% polyacrylamide gel containing 7 M urea.

The image from the gel autoradiograph of the BAA chemical modification assays was captured using a CCD camera, and the relative density of each band on the autoradiograph was determined using NIH Image (24) .
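The kind of calculation involved in determining relative band density can be sketched as follows. This is not the NIH Image procedure itself, whose settings are not given in the text. It is a minimal illustration that assumes a one-dimensional intensity profile along a lane, a constant background, and hand-chosen band boundaries; the function name and the numbers in the example are invented for illustration and are not data from this study.

```python
def relative_band_density(profile, bands, background=None):
    """Background-subtracted band areas as fractions of the lane total.

    profile    -- pixel intensities along one lane (higher = darker)
    bands      -- dict mapping band name to a half-open (start, end) pixel range
    background -- constant background level; defaults to the profile minimum
    """
    if background is None:
        background = min(profile)
    areas = {
        name: sum(max(value - background, 0) for value in profile[start:end])
        for name, (start, end) in bands.items()
    }
    total = sum(areas.values())
    if total == 0:
        return {name: 0.0 for name in areas}
    return {name: area / total for name, area in areas.items()}

# Invented example profile with two bands (illustrative values only).
example_profile = [10, 10, 30, 50, 30, 10, 10, 20, 30, 20, 10]
example_bands = {"band_a": (2, 5), "band_b": (7, 10)}
densities = relative_band_density(example_profile, example_bands)
print(densities)
```

Normalizing to the lane total makes lanes with different loading comparable, which matters when a lane with KCl is compared against one without.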

Construction of Plasmids Used in the DNA Arrest Site Assays

Fragments were amplified from pZ189 (25) using 5` primers (T(2)G(5))(4) (5`-GTA CGA ATT C(T(2)G(5))(4) TCG AGT CAA CGT AAC ACT TT-3`) and C-stem (5`-GTA CGA ATT C(T(2)G(5)) T(2) GGC GG (T(2)G(5))(2) TCG AGT CAA CGT AAC ACT TT-3`) and 3` primer supFR1(18) . Amplified fragments were cloned into pMS189Delta as described previously (18) to create p(T(2)G(5))(4) and pCstem. The pM clones 1-3 were constructed in a similar manner from fragments amplified using pBG6 (18) as template and M1, M2, or M3 (M1 is 5`-GTA CGA ATT CGG GAC ACC ACC CG-3`, M2 is 5`-GTA CGA ATT CGG ACC ACC ACC CGC CCC CCC CCC CCC CGA GTC AAC GTA ACA CTT T-3`, and M3 is 5`-GTA CGA ATT CGG ACC ACC ACC CGC CCC CCC CCC CCG GGT CAA CGT AAC ACT TT-3`) as 5` primers and supFR1 as the 3` primer.
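The repeat shorthand used for the (T(2)G(5))(4) and C-stem inserts can be expanded to check that the two differ by a single base. The sketch below is illustrative: the variable and function names are hypothetical, and only the insert portions of the primers (without the common flanking sequences) are compared.

```python
T2G5 = "TT" + "G" * 5  # one T(2)G(5) repeat unit

# (T(2)G(5))(4): four identical repeats.
tetraplex_insert = T2G5 * 4

# C-stem: (T(2)G(5)) T(2) GGC GG (T(2)G(5))(2), as written in the primer.
cstem_insert = T2G5 + "TT" + "GGCGG" + T2G5 * 2

def mismatches(a, b):
    """Return (1-based position, base in a, base in b) for each difference."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]

print(tetraplex_insert)
print(cstem_insert)
print(mismatches(tetraplex_insert, cstem_insert))
```

The only difference is a G-to-C change in the middle of the second guanine tract, consistent with the description of the Cstem sequence under ``Results.''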

DNA Synthesis Arrest Assay

DNA synthesis arrest assays were performed with the SequiTherm DNA sequencing kit (EpiCentre Technologies) and end-labeled primers Zseq (5`-AGT GCC ACC TGA CGT CTA-3`), M2prime (5`-TTC GCC ACC TCT GAC TTG AGC GTC-3`), or supFR4 (5`-ATG CTT TTA CTG GCC TGC T-3`) as described previously (18).


RESULTS

We have previously shown that the chicken beta-globin gene promoter contains a G-rich sequence, G(16)CG(GGT)(2)GG, that forms a strong DNA synthesis arrest site in the presence of K (18). The location of this arrest site in the promoter is shown in Fig. 1A. The individual bases in the arrest site are labeled 1-26 with base 1 being the 5`-most guanine in the arrest site (G^1). In fact, this arrest site consists of a series of three successive blocks to DNA synthesis, since under some conditions three stops are seen opposite successive T residues in the template (see arrows in Fig. 1A, labeled K1, K2, and K3). Polymerase arrest is significantly more efficient at K1 than at K2 and K3, and under some conditions almost no read-through is seen past K1 (Fig. 1B). The amount of DNA synthesis arrest by the chicken beta-globin promoter is similar to that seen for a run of uninterrupted guanines of the same length (Fig. 1B). These blocks to DNA synthesis are eliminated if some of the guanines in this sequence are blocked at the N-7 position, suggesting that formation of a series of structures involving non-Watson-Crick base interactions is responsible for DNA synthesis arrest. Arrest of DNA synthesis is independent of the anion present and is not seen in the presence of other cations such as Li, NH(4), Rb, or Cs (18). This K-specific effect is thus not simply a general ion-screening effect. We have also previously shown that a hairpin-forming sequence, GC, of the same length as the beta-globin arrest site does not form a K-specific block to DNA synthesis, (^2) suggesting that the K-specific effect seen in the chicken beta-globin promoter is not due to hairpin formation. Neither the pattern of DNA synthesis arrest nor the ion specificity is consistent with the formation of triplexes (2).

Previous data showed that the arrest site was found only when the G-rich strand served as template and was independent of template concentration, with the arrest of DNA synthesis being observed even when only femtomoles of template were present (18). These findings suggested that unusual intrastrand tetraplex-like structures might form the basis of the blocks to DNA synthesis.

The intrastrand nature of these structures was confirmed by the observation that even at very low oligonucleotide concentrations, corresponding to template concentrations at which the blocks to DNA synthesis are still clearly visible, no intermolecular associations of an oligonucleotide containing the arrest site were observed by nondenaturing polyacrylamide gel electrophoresis (data not shown). However, in gels containing KCl, this oligonucleotide migrates slightly faster than an oligonucleotide containing the complement of the arrest site, suggesting that it can form a more compact intrastrand structure. While the difference in mobility is small, it is reproducible and is consistent with mobility differences that we have found for known tetraplex-forming sequences(26) .

Evidence for Two Classes of Structures That Can Form on the Template Strand

To examine the K-dependent structures at single-nucleotide resolution we probed the arrest site oligonucleotide with DMS, OsO(4), and BAA. These results are shown in Fig. 2 and Fig. 3.


Figure 2: Densitometric analysis of the bromoacetaldehyde modification of the arrest site oligonucleotide. The oligonucleotide was reacted with bromoacetaldehyde in the absence (A) and presence of 40 mM K (B) and treated with hot piperidine as described under ``Materials and Methods.'' The products were resolved on a 20% sequencing gel. The gel was autoradiographed, and the autoradiographic image was captured using a CCD camera and imported to NIH Image for analysis. The C residue in the arrest site is marked with an arrow.




Figure 3: Chemical modification of the G(16)CG(GGT)(2)GG sequence with DMS and OsO(4) in the absence and presence of K. The arrest site oligonucleotide was reacted with either DMS or OsO(4) in the absence (0) and presence (K) of 40 mM K as described under ``Materials and Methods.'' The lanes labeled C represent control reactions in which no DMS or OsO(4) was used.



Chemical modification reactions carried out under the same conditions as the assay of DNA synthesis arrest produce a result that reflects the sum total of the chemical modification of all the structures in the mixture. From a comparison of the amount of prematurely terminated polynucleotide chains relative to full-length products it is clear that the major molecular species present in the reaction represent those that cause DNA synthesis arrest at K1, with minor contributions from the structures that cause arrest at K2 and K3 (see Fig. 1), and that for all intents and purposes the chemical modification data will reflect the K1 structure.

To examine whether the cytosine residue at position 17 was base-paired, we first reacted the oligonucleotide with BAA and then treated it with formic acid, followed by piperidine. BAA reacts with the N-3 and the N-4 positions of unpaired C residues. Treatment of a BAA-modified cytosine with formic acid enhances beta-elimination by piperidine. Fig. 2 shows the results obtained for the BAA/formic acid modification of the arrest site oligomer. In the presence of 40 mM KCl, a strong band corresponding to the C^17 residue is seen on a sequencing gel. This strong band translates into a tall peak for this residue on densitometric analysis. However, in the absence of KCl the density of this band is much reduced. These data suggest that C^17 is modified by BAA in the presence of KCl, i.e. it is unpaired, while in the absence of KCl, it is resistant to BAA modification and is thus involved in a hydrogen-bonding interaction. As expected, the C^17 residue was not reactive in the presence or absence of K when treated only with formic acid (data not shown).

To examine the thymidine residues in the arrest site, the arrest site oligonucleotide was modified with OsO(4) in the presence of pyridine. Under these conditions, addition to the C-5 and C-6 double bonds of thymidine residues promotes formation of osmium esters that are susceptible to cleavage with hot piperidine. OsO(4) is significantly more reactive with unpaired residues and has been used successfully as a probe for DNA conformation junctions and for identifying loop regions in cruciforms ((27) and references therein). Our results are shown in Fig. 3. All thymidine residues in the sequence were reactive in the presence and absence of K, but the intensity of modification of the T residues in the G tract was markedly increased in the presence of 40 mM K, with T^27 being particularly susceptible to OsO(4) modification in comparison with T^21 and T^24.

DMS treatment of DNA results in the methylation of G residues at the N-7 position. This modification makes the residue susceptible to cleavage by piperidine. In the absence of KCl, cleavage with piperidine is significantly above background at all positions. G is most reactive, followed by G-G, G^1-G^5, and the guanines outside of the arrest site (Fig. 3). Under these conditions, BAA modification indicated some protection of the C site, suggesting that it is base-paired. The pattern of slight protection from DMS by bases in the middle of the arrest site, combined with the hyperreactivity of G, is consistent with the formation of a stem-loop structure with the G being in the loop. In such a hairpin, the N-7 of each G in a G-G base pair would be available for DMS modification about 50% of the time. The base that constitutes the hairpin loop, G, would be the only base that was consistently available for DMS modification and would therefore appear hyperreactive. Since no arrest of DNA synthesis is seen under these conditions, it seems that this hairpin structure does not block DNA synthesis. This is consistent with our observation that a G-C hairpin of the same length also does not block DNA synthesis under these conditions.

In contrast, almost complete protection from DMS modification of some residues was seen in the presence of 40 mM KCl (Fig. 3). In the presence of K, DMS modification at G^1 was similar to guanines outside the arrest site, and intermediate reactivity was observed at G^2, G^4, G^7, and G. DMS hyperreactivity was observed at position G^3. The reactivity of the remaining G residues was reduced to close to background levels. Protection of the N-7 position of guanine residues is diagnostic of structures containing G-G Hoogsteen base interactions. The apparent complete protection of some of the guanine residues from DMS modification indicates that they are involved in hydrogen bonding interactions in which they act as N-7 donors almost all of the time. The DMS reactivity pattern observed in 40 mM NaCl was identical to that observed without potassium, illustrating that the DMS protection observed in the presence of K is not simply a general cation effect (data not shown).

The Effect of Interruption of Guanine Runs in a Tetraplex Stem

If the structure formed by the wild-type chicken beta-globin gene promoter were indeed a tetrahelical structure of some sort, it would suggest that a number of non-guanine bases could be accommodated in the tetrahelix. Our chemical modification experiments with BAA indicated that the cytosine in the structure is unpaired. We addressed the question of the effect of a C residue in a tetraplex stem on structural stability in two other sets of experiments. First, a plasmid bearing the sequence (T(2)G(5))T(2)G(2)CG(2)(T(2)G(5))(2) (pCstem) was constructed and tested for its ability to arrest DNA synthesis in vitro. For comparison a plasmid bearing the known tetraplex-forming sequence (T(2)G(5))(4) that was constructed in our lab and shown to block DNA synthesis in the presence of K was employed. (^3) The Cstem sequence has a single cytosine in place of a guanine in the central guanine-tract that comprises one strand of the stem of the tetraplex. The Cstem sequence still formed a block to DNA synthesis at the same position as that observed for the (T(2)G(5))(4) sequence (Fig. 4), although the arrest site was significantly weaker than that observed with the (T(2)G(5))(4) sequence. This is consistent with our observation of strong protection from DMS modification for the (T(2)G(5))(4) sequence, and much weaker protection for the Cstem sequence (data not shown).


Figure 4: A sequence with an interrupted guanine motif can arrest DNA synthesis in vitro. The ability of two sequences, (T(2)G(5))(4) and Cstem ((T(2)G(5))T(2)G(2)CG(2)(T(2)G(5))(2)) to arrest DNA synthesis was tested. Sequencing reactions were conducted on plasmids bearing these sequences in the absence (0) and presence (K) of 40 mM K. The location where DNA synthesis arrest occurred in both sequences is shown by the arrow.



However, the arrest site in the chicken beta-globin locus seems to contain at least three non-G interruptions. That this region still forms such a strong block to DNA synthesis indicates that additional stabilizing factors must be present that compensate in some way for these interruptions.

Defining Arrest Site Requirements Using Arrest Site Sequence Variants

To define those bases important for the structure adopted by the G(16)CG(GGT)(2)GG sequence, we constructed a series of plasmids with slight sequence variations in the arrest site. The pattern of DNA synthesis arrest in these variants is shown in Fig. 5. Stopping at K1 was restored only in variants pM2 and pM3, but the strength of the wild-type arrest site was not duplicated in any of the constructed mutants. Replacement of G^1-G^4 with TCGA, TGGA, TCGG, and TCCC abolished stopping at K2 (BG6, pM1, pM2, and pM3), indicating that the structure forming the underlying block to synthesis at K2 requires at least the sequence GCG(3)TG(2). The stop at K3 was not negatively affected by any of these mutations, indicating that the sequence necessary and sufficient for the structure that causes the stop at K3 is GCG(3). This observation is interesting since it demonstrates that even relatively short interrupted G runs can still form blocks to DNA synthesis. With respect to K1, replacing G^1-G^2 with TC (pM2) reduced the extent of stopping but did not abolish it, suggesting that these residues are important for stability but are not essential for a block at this point. However, replacement of G^4 by an A was sufficient to eliminate the stop (BG6), and this effect could be partially alleviated by substitution of a C for the A (pM3). Elimination of T changed the pattern of polymerase arrest, illustrating that this residue is not looped out of the structure but is an integral feature of the arrest site.


Figure 5: DNA synthesis arrest patterns of mutants with altered sequence in the region proposed to form a molecular cinch. Construction of mutants of the chicken beta-globin sequence that forms a block to DNA synthesis is described in the text. Positions 1-26 correspond to -195 through -169 of the chicken beta-globin sequence (GenBank locus: CHKHBBA). The bases that vary from that of the wild type arrest site are shown in outline. The position and relative strength of each arrest site seen in the presence of 40 mM KCl is denoted by the position and number of triangles.




DISCUSSION

We have previously shown that the chicken beta-globin promoter contains a strong composite arrest site for DNA synthesis in vitro (18). That DNA synthesis arrest is template concentration-independent and is seen only on one strand suggested that the underlying physical basis was the formation of a series of intrastrand structures. The G-richness of the arrest site (the sequence 5`-G(16)CG(GGT)(2)GG-3` is necessary and sufficient to cause synthesis arrest) suggested that the arrest site might involve G-G base interactions. The K specificity suggested that despite its relatively short length, its lack of four clearly identifiable G-repeats, and the presence of a number of non-canonical bases, arrest was due to a series of intrastrand tetrahelical structures of some kind.

These conclusions are supported by experiments shown here. In gel electrophoresis of oligonucleotides containing the arrest site in the presence of K a high mobility species was observed consistent with intrastrand folding. The fact that a hairpin-forming sequence (GC) of the same length as the arrest site produces no K-dependent block to DNA synthesis suggested that arrest of DNA synthesis by the chicken promoter is not due to hairpin formation.

Our chemical modification data are consistent with the major DNA synthesis arrest site being due to the formation of a novel intramolecular tetrahelical structure in the presence of K. The complete protection of bases G-G from DMS modification indicates that guanine tetrads are involved. The hyperreactivity of G^3 suggests that it might be located at the junction between the tetraplex and bases 5` of the tetraplex, and the OsO(4) hyperreactivity of the T just 3` of the arrest site defines the 3` limit of bases involved in the structure. A tetraplex of the length defined by the distance between these two bases, i.e. 23, would have three loops spaced approximately an equal number of bases apart at around G^7-G^9, G-G, and G-T. The DMS reactivity seen for bases G^4-G is confined to bases G^4, G^7, and G. It is hard to fit all of these reactive bases into the loops of the tetraplex, and it seems likely that the loop bases are not in fact DMS-reactive and that reactivity at G^4, G^7, and G is the result of some other structural feature. The lack of reactivity of loop bases may be due to stacking interactions, transient base pairing in or between loops, or binding to K. The reactivity of G^4, G^7, and G can perhaps be accounted for by placing them adjacent to some of the non-G bases. The reactivity of C with BAA and the OsO(4) hyperreactivity of T and T suggest that these bases are all unpaired. The DMS protection of G residues that would be in the same plane as G^4, G^7, and G is consistent with G-G-G base triplets, with the DMS-reactive base acting as an N-7 acceptor but not an N-7 donor. The non-G base adjacent to the reactive G presumably fills the space that would normally be occupied by the fourth G in the tetrad but does not participate in hydrogen bonding. One possible structure that accounts for this pattern of reactivity is shown in Fig. 6B. In this structure G is shown as being in a loop on the same side of the structure as the base G and G. Interaction among G, G, and G in a G-G-G triplet in which G and G act as N-7 donors would explain the DMS protection of G and G.


Figure 6: Structures formed by the chicken beta-globin sequence in the absence and presence of K. Structural models were generated on the basis of chemical modification data as described in the text. In the absence of K the sequence CGCG(GGT)(2)GG forms a hairpin structure (A) that does not present a block to DNA synthesis in vitro. In the presence of K the sequence GCG(GGT)(2)GG forms a cinched tetrahelix (B). Bases adjacent to the four-stranded tetraplex structure interact to stabilize the structure. This structure presents a formidable block to DNA synthesis in vitro. The G residues shown in outline are those modified by DMS.



The pattern of DNA synthesis arrest by sequence variants confirms various details of the structure shown in Fig. 6B. Replacement of G^1-G^2 with the residues T-C eliminates the second arrest site (K2) and reduces the extent of arrest at K1. Replacement of these residues together with a substitution of A for G^4 eliminates the first arrest site altogether. On the other hand, replacement of G^4 by a C reduces but does not eliminate this arrest site. This might indicate that G^4 is involved in hydrogen bonding in a context in which a C can substitute at least partially. We interpret the hydrogen-bonding contribution made by G^3 in terms of a molecular cinch that holds the end of the tetraplex closed, making it more difficult for the polymerase to traverse this region. The fact that a G-to-A substitution at G^4 eliminates the stop at K1 and that a C at that position partially restores the stop might be due to the fact that the C permits interaction with the top portion of the stem that, according to this model, becomes folded back, while an A at that position would hydrogen bond to the T in the same end of the stem, providing no stabilization of the fold-back structure.

The fact that the ability of all of these variants to block DNA synthesis is considerably less than that of the wild type suggests that bases G^1-G^2 also make a contribution to the stability of the structure, perhaps as a result of stacking interactions on the single strand or from pairing with bases outside the tetraplex-forming region. The DMS protection of G is consistent with a G-G interaction between G and G^2 in which G^2 acts as the N-7 acceptor. Deletion of T (Tout in Fig. 5) abolished the original pattern of DNA synthesis arrest and resulted in the formation of two new arrest sites both located at bases 3` of the original arrest site. This illustrates that T plays an important role in the arrest site structure. The pattern of arrest in this mutant is also consistent with the structure shown in Fig. 6B, if it is assumed that formation of a G-G base pair with bases immediately flanking the tetrahelix is an important stabilizing factor. In this case G would move into the tetrahelix, and G would be available for hydrogen bonding to G^3 and G^14. However, in the absence of a hydrogen bonding partner for G^2 3` of G this interaction might not be stable, resulting in the stop at G. In spite of the paucity of complete tetrads in the wild type arrest site structure, K may still be able to bind to guanines in adjacent rungs of the structure since the internal dimension of the channel might still resemble a more conventional tetraplex.

Direct evidence for the ability of non-G bases to be accommodated into the stems of tetraplexes was obtained by comparing a known tetraplex-forming sequence, (T(2)G(5))(4), with a sequence, T(2)G(5)T(2)G(2)CG(2)(T(2)G(5))(2), that is identical except for a single C residue that disrupts one of the four G-rich repeat motifs. Both the ability to block DNA synthesis and the DMS protection of stem guanines were decreased markedly in the template containing the interrupted motif, but clear evidence for tetraplex formation was still visible. The extent of DNA synthesis arrest by the full-length arrest site is comparable with that of a pure G tract of the same length as the chicken arrest site (Fig. 1B) despite the presence of three non-G bases. Given the effect of a single interruption in these experiments, the relative efficiency of the chicken beta-globin DNA arrest site is thus all the more remarkable.
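The relationship between the two templates compared above can be checked by writing out the shorthand in full. The following is a minimal sketch, not part of the original analysis: the helper `expand` and the variable names are hypothetical, and the only inputs are the two sequences as written in the text, with a parenthesized number read as a repeat count for the preceding base or group.

```python
import re

# Matches either a parenthesized group of bases or a single base,
# followed by a repeat count in parentheses, e.g. "(TTGGGGG)(4)" or "G(5)".
_REPEAT = re.compile(r"(?:\(([ACGT]+)\)|([ACGT]))\((\d+)\)")


def expand(shorthand: str) -> str:
    """Expand shorthand such as '(T(2)G(5))(4)' into a plain base string.

    Hypothetical helper: repeat counts are applied one at a time, leftmost
    first, so inner units are expanded before the groups that contain them.
    """
    seq = shorthand
    while True:
        match = _REPEAT.search(seq)
        if match is None:
            return seq
        unit = match.group(1) or match.group(2)
        count = int(match.group(3))
        seq = seq[:match.start()] + unit * count + seq[match.end():]


# The two templates named in the text.
control = expand("(T(2)G(5))(4)")
interrupted = expand("T(2)G(5)T(2)G(2)CG(2)(T(2)G(5))(2)")

# Positions (0-based) at which the two templates differ.
diffs = [
    (i, a, b)
    for i, (a, b) in enumerate(zip(control, interrupted))
    if a != b
]

print(control)
print(interrupted)
print(diffs)
```

Under this reading of the notation, both templates expand to 28 bases and differ at a single position in the second repeat, where a G in the control is replaced by the C that interrupts the G run.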

Tetraplex-forming sequences studied to date have not shown evidence for incorporation of bases other than G into the tetraplex stems. In telomere sequences with a (T(x)AG(y))(4) repeated motif, the A bases have been shown to reside in the loop of the intramolecular tetraplex, and other variants of this sequence such as (T(4)G(3)A)(4), (T(4)AGAG)(4), and (T(4)GAGA)(4) were unable to form stable intramolecular tetraplexes (28). Our observation that a number of non-G bases can be accommodated within the stem of a tetraplex, particularly if additional stability is provided by hydrogen-bonding interactions of flanking G-rich sequences to form a cinch, greatly extends the range of sequences that could potentially form tetrahelical structures that block DNA synthesis.

In theory the structure we have described could form in vivo any time that the duplex region containing the sequence becomes unpaired. Replication or transcription would provide such an opportunity, as would local melting of the duplex or formation of the triplex found in this region (29). In fact, a large region of the chicken beta-globin gene promoter is known to be susceptible to chemical modification in vivo (30, 31). One possible role for the hairpin or cinched tetraplex could be in modifying expression of the beta-globin gene, with this region perhaps acting as a K^+-sensitive switch. The structure may act by binding conformation-specific factors that affect transcription or by occluding a binding site.


FOOTNOTES

*
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked ``advertisement'' in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

§
Current Address: Biotech Research Laboratories, 3 Taft Court, Rockville, MD 20850.

¶
To whom correspondence should be addressed: Building 8, Room 202, National Institutes of Health, 8 Center Dr MSC 0830, Bethesda, MD 20892-0830. Tel: 301-496-2189; Fax: 301-402-0240; E-mail: ku@helix.nih.gov.

(^1)
The abbreviations used are: DMS, dimethyl sulfate; BAA, bromoacetaldehyde.

(^2)
M. N. Weitzmann, K. J. Woodford, and K. Usdin, manuscript in preparation.

(^3)
M. N. Weitzmann, K. J. Woodford, and K. Usdin, unpublished results.


REFERENCES

  1. Germino, J., and Bastia, D. (1981) Cell 23, 681-687
  2. Baran, N., Lapidot, A., and Manor, H. (1991) Proc. Natl. Acad. Sci. U. S. A. 88, 507-511
  3. DeStefano, J. J., Mallaber, L. M., Rodriguez, R. L., Fay, P. J., and Bambara, R. A. (1992) J. Virol. 66, 6370-6378
  4. Buiser, R. G., Bambara, R. A., and Fay, P. J. (1993) Biochim. Biophys. Acta 1216, 20-30
  5. Fry, M., and Loeb, L. A. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 763-767
  6. Klarmann, G. J., Schauber, C. A., and Preston, B. D. (1993) J. Biol. Chem. 268, 9793-9802
  7. Bebenek, K., Abbotts, J., Wilson, S. H., and Kunkel, T. A. (1993) J. Biol. Chem. 268, 10324-10334
  8. Smith, M. T., and Wake, R. G. (1989) Gene (Amst.) 85, 187-192
  9. Hill, T. M., Tecklenburg, M. L., Pelletier, A. J., and Kuempel, P. L. (1989) Proc. Natl. Acad. Sci. U. S. A. 86, 1593-1597
  10. Hidaka, M., Kobayashi, T., Takenaka, S., Takeya, H., and Horiuchi, T. (1989) J. Biol. Chem. 264, 21031-21037
  11. Lapidot, A., Baran, N., and Manor, H. (1989) Nucleic Acids Res. 17, 883-900
  12. d'Ambrosio, E., and Furano, A. V. (1987) Nucleic Acids Res. 15, 3155-3175
  13. Bedinger, P., Munn, M., and Alberts, B. M. (1989) J. Biol. Chem. 264, 16880-16886
  14. Weaver, D. T., and DePamphilis, M. L. (1984) J. Mol. Biol. 180, 961-986
  15. Weisman-Shomer, P., Dube, D. K., Perrino, F. W., Stokes, K., Loeb, L. A., and Fry, M. (1989) Biochem. Biophys. Res. Commun. 164, 1149-1156
  16. Usdin, K., and Furano, A. V. (1989) J. Biol. Chem. 264, 15681-15687
  17. Giovannangeli, C., Thuong, N. T., and Hélène, C. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 10013-10017
  18. Woodford, K. J., Howell, R. M., and Usdin, K. (1994) J. Biol. Chem. 269, 27029-27035
  19. Sen, D., and Gilbert, W. (1988) Nature 334, 364-366
  20. Sen, D., and Gilbert, W. (1990) Nature 344, 410-414
  21. Panyutin, I. G., Kovalsky, O. I., Budowsky, E. I., Dickerson, R. E., Rikhirev, M. E., and Lipanov, A. A. (1990) Proc. Natl. Acad. Sci. U. S. A. 87, 867-870
  22. Usdin, K., and Furano, A. V. (1989) J. Biol. Chem. 264, 20736-20743
  23. Palecek, E., Boublikova, P., Galazka, G., and Klysik, J. (1987) Gen. Physiol. Biophys. 6, 327-341
  24. O'Neill, R. R., Mitchell, L. G., Merril, C. R., and Rasband, W. S. (1989) Appl. Theor. Electrophor. 1, 163-167
  25. Kraemer, K. H., and Seidman, M. M. (1989) Mutat. Res. 220, 61-72
  26. Usdin, K., and Woodford, K. J. (1995) Nucleic Acids Res. 23, 4202-4209
  27. Wells, R. D., Collier, D. A., Hanvey, J. C., Shimizu, M., and Wohlrab, F. (1988) FASEB J. 2, 2939-2949
  28. Murchie, A. I., and Lilley, D. M. (1994) EMBO J. 13, 993-1001
  29. Kohwi, Y. (1989) Nucleic Acids Res. 17, 4493-4502
  30. Kohwi-Shigematsu, T., Gelinas, R., and Weintraub, H. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 4389-4393
  31. McGhee, J. D., Wood, W. I., Dolan, M., Engel, J. D., and Felsenfeld, G. (1981) Cell 27, 45-55
  32. Day, L. E., Hirst, A. J., Lai, E. C., Mace, M. J., and Woo, S. L. (1981) Biochemistry 20, 2091-2098
