©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
Pausing of DNA Synthesis in Vitro at Specific Loci in CTG and CGG Triplet Repeats from Human Hereditary Disease Genes (*)

(Received for publication, March 20, 1995; and in revised form, August 9, 1995)

Seongman Kang (§)(¶), Keiichi Ohshima (§), Miho Shimizu, Sorour Amirhaeri, Robert D. Wells (**)

From the Institute of Biosciences and Technology, Texas A & M University, Texas Medical Center, Houston, Texas 77030

ABSTRACT

Several human hereditary neuromuscular disease genes are associated with the expansion of CTG or CGG triplet repeats. DNA synthesis on templates containing CTG repeats from 17 to 180 repeats in length and CGG repeats from 9 to 160 repeats in length was studied in vitro. Primer extensions using the Klenow fragment of DNA polymerase I, the modified T7 DNA polymerase (Sequenase), or the human DNA polymerase β paused strongly at specific loci in the CTG repeats. The pausings were abolished by heating at 70 °C. The magnitude of the pausings increased with the length of the triplet repeats in duplex DNA, but not in single-stranded DNA. The location of the pause sites was determined by the distance between the site of primer hybridization and the beginning of the triplet repeats. CGG triplet repeats showed similar, but not identical, patterns of pausings. These results indicate that appropriate lengths of the triplets adopt a non-B conformation(s) that blocks DNA polymerase progression; the resultant idling polymerase may catalyze slippages to give expanded sequences and hence provide the molecular basis for this non-Mendelian genetic process. These mechanisms, if present in human cells, may be related to the etiology of certain neuromuscular diseases such as myotonic dystrophy and Fragile X syndrome.


INTRODUCTION

CTG and CGG trinucleotide repeat expansions are associated with a number of human hereditary diseases, including myotonic dystrophy (1, 2, 3), Kennedy's disease (4), spinocerebellar ataxia type I (5), Huntington's disease (6), dentatorubral-pallidoluysian atrophy (7, 8), Haw River syndrome (9), Machado-Joseph disease (10), and Fragile X and XE syndromes (11, 12, 13). The triplet repeat sequences occur at different locations relative to the coding regions. For example, autosomal dominant myotonic dystrophy, which is characterized mainly by myotonia and progressive muscle weakness, has CTG triplet repeats in the 3′-untranslated region of the myotonin kinase gene (3, 14). Fragile X syndrome, the most frequent form of inherited mental retardation, involves multiple CGG repeats within the mRNA noncoding region of a Fragile X associated gene (FMR-1), while, for the other hereditary diseases, the CTG repeat is located within the genes and encodes a tract of glutamines (4, 5, 6, 7, 8, 9, 10, 11, 12, 13). These highly polymorphic triplet repeats have been shown to range from 5 to 37 copies on normal chromosomes, whereas carriers and affected individuals have more than 39 copies (15), and the largest expansions observed are 2000 or more copies of the repeats in some Fragile X syndrome and myotonic dystrophy patients (14). These diseases share a property known as anticipation, in which the severity of the disease increases and the age of onset decreases with each successive generation. These behaviors correlate with the massive expansions of triplet repeats, a non-Mendelian genetic process.

DNA structural investigations have shown evidence that a number of repeating DNA sequences including triplets such as GAA, GGA, and TTA adopt non-B conformations under appropriate environmental conditions (15, 16, 17, 18). (^1) Cruciform structures form within regions of inverted repeat symmetry in negatively supercoiled DNA, and left-handed Z-DNA is formed at regions of alternating purines and pyrimidines (20, 21). pur·pyr (^2) tracts (i.e. (A-G) or (G runs)) can form intramolecular triplexes (15, 22, 23, 24, 25). In addition, other alternative conformations including bent DNA, slipped structures, and nodule DNA are known to exist in microsatellite-type DNA sequences (15, 17, 26).

Repeating sequences can cause DNA synthesis aberrations by the cellular replication machinery. Inverted repeats generate frequent deletions, and simple repeats promote multiple slippages in templates, which give rise to deletions and duplications as well as strand switching during replication in cells (27, 28, 29). Previous studies indicated that pur·pyr sequences may be pause (arrest) sites for DNA replication and amplification, and replication pause sites are potentially mutagenic (30, 31, 32). Recently, the replication of CGG triplet repeats of Fragile X syndrome was found to be delayed, compared with the replication of alleles from normal males (33).

Herein, we show that CTG and CGG triplet repeat sequences, which originated from hereditary genetic disease patients, have unorthodox properties; DNA polymerases pause at specific locations in these sequences. The pausings are dependent on the length of the repeat tract and temperature. Our results suggest that non-B conformations of the triplet repeats may be responsible for the pausing and that these properties are related to the etiology of some hereditary genetic diseases.


MATERIALS AND METHODS

Plasmid Construction

pRW1981 and pRW1980, which were derived from pSH2 and pSH1 (34), contain genomic DNAs with 130 and 75 CTG triplet repeats, respectively, in the HincII site of the polylinker of pUC19. pRW1980 has a four-bp deletion in the HincII site. The CTG repeat sequences (gifts of C. T. Caskey, Baylor College of Medicine), which were derived from the genomes of myotonic dystrophy patients (3, 34), contain one G to A mutation at the 28th repeat. pRW3211 and pRW3212 have 17 and 26 CTG repeats, respectively, and are deletion derivatives of the 130-repeat sequence. Plasmids containing various lengths of CTG triplets (50-180 repeats) were constructed by the expansion and deletion method (35). Plasmids were isolated by the cesium chloride method and investigated for the CTG inserts using restriction enzymes and dideoxy chain termination sequencing. For the construction of pRW3234, pRW3213 (34) (which contains (CTG)) was digested with SmaI and HindIII, and then the insert was filled in using the Escherichia coli DNA polymerase I (Klenow fragment) and dNTPs. The insert was cloned in the SmaI site of pRW3213. Thus, the 100-bp random DNA sequence between the two (CTG) tracts in pRW3234 consists of the 43-bp genomic DNA flanking the CTG triplet repeats and the polylinker of pUC19. Also, DNA sequencing revealed that the pair of (CTG) tracts are on the same strand. For the construction of pRW3111, pRW1981 was digested with SacI and HindIII, and the fragment containing (CTG) was cloned into pGEM3Zf(+) (Promega). For the construction of pRW3262, a fragment containing (CTG) that was prepared by digesting pRW1981 with Sau3AI was cloned into the BamHI site of pUC19. This plasmid has a 38-bp deletion in the genomic DNA flanking the CTG triplets compared with pRW1981.

pRW3306 containing 160 CGG triplet repeats was constructed as follows. A DNA fragment, which contains (CGG), was isolated from pTM10 (a gift of B. A. Oostra, Erasmus University, The Netherlands) by digestion with BstUI and HaeIII. The insert was ligated to generate multimers using T4 DNA ligase. A head-to-tail dimer of (CGG) was cloned into the HincII site of the polylinker of pUC19. The CGG repeat sequences in pTM10, which were derived from the cDNA of Fragile X patients(11) , contain mutations of the perfect repeat at the 12th repeat (AGG) and at the 73rd repeat (CAG). pRW3306 also contains a non-CGG repeat sequence (CTGGG) at the junction of the two blocks of (CGG). Plasmids were grown in SURE cells (Stratagene) and were isolated by the methods described above for the CTG-containing plasmids.

Generation of Topoisomers

Topoisomers were generated as described previously (36). 6 µg of pRW1981 (34) was incubated in 100 µl of a solution containing Tris·Cl (pH 7.6), 50 mM KCl, 10 mM 2-mercaptoethanol, 1 mM EDTA, 0-4 µM ethidium bromide, and chicken erythrocyte topoisomerase (gift of J. E. Larson, this laboratory) for 60 min at 37 °C. The DNAs were purified by two phenol extractions, two ether extractions, and ethanol precipitation.

Isolation of Single-stranded DNA

E. coli NM522 (Promega) harboring pRW3111, a phagemid pGEM3Zf(+) derivative, was grown in 50 ml of TYP broth containing 50 µg/ml ampicillin. After 30 min of incubation with vigorous shaking at 37 °C, helper phage M13KO7 (Promega) was added at a multiplicity of infection of 10, and the culture was grown for 16 h with vigorous agitation and good aeration. Phage particles were separated from E. coli by centrifugation at 12,000 × g for 15 min and then precipitated by adding 5% polyethylene glycol and 0.9 M ammonium acetate (pH 7.5). The pellet was resuspended in 400 µl of TE buffer (10 mM Tris·Cl (pH 8.0) and 1 mM EDTA) followed by phenol and chloroform extraction, and then the single-stranded DNA was purified by ethanol precipitation.

Primer Extension Analyses

Unless otherwise noted, primer extension experiments were performed by the following procedure. 2 µg of plasmid was resuspended in 13 µl of H2O, 2 µl of 2 N NaOH, and 5 µl of 32P-labeled primer (3 ng, 3 Ci/µmol). Primers were chemically synthesized by the Gene Technologies Laboratory at Texas A & M University or were purchased from New England BioLabs Inc. (M13 reverse primers (-24 and -48) and M13 primers (-20 and -40)) and Promega (SP6 primer and T7 primer). The sample was heated for 90 s at 90 °C followed by incubation for 4 min at room temperature. The reaction mixture was neutralized with 2 µl of 3 M sodium acetate (pH 5.2). After DNA was purified by ethanol precipitation, the DNA was dissolved in a buffer containing 40 mM Tris·Cl (pH 7.5), 20 mM MgCl2, 50 mM NaCl, and 1 mM of dNTPs. For the experiments where dGTP was replaced with dITP or 7-deaza-dGTP (U. S. Biochemical Corp.), the nucleotide analogs were in the same molar amounts as dGTP in the reaction mixtures. Before the Sequenase version 2.0 (U. S. Biochemical Corp.) or the Klenow fragment of E. coli DNA polymerase I (U. S. Biochemical Corp.) was added, the DNAs were preincubated under the conditions indicated in the figures. 5 units of either the Sequenase version 2.0 or the Klenow fragment of E. coli DNA polymerase I were added, and the mixture was incubated for 10 min at 37 °C; for the definition of units, see the U. S. Biochemical Corp. catalog. The polymerases were present at a 3-fold molar ratio to the primers. The T7 Sequenase was fully saturated with thioredoxin. Studies with the human DNA polymerase β were conducted and analyzed under the same conditions as for the two prokaryotic DNA polymerases. The reactions were stopped by the addition of 20 mM EDTA and 95% formamide and analyzed by electrophoresis on 12% denaturing polyacrylamide gels (CTG) or on 10% Long Ranger (J.T. Baker) gels containing 30% formamide and 7 M urea (CGG).
Where necessary, the amounts of newly synthesized DNA at the CTG major pausing site (triplet repeats numbered 88-94) were measured using a densitometer (Molecular Dynamics).
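
As a worked check of the volumes given above, the final concentrations in the denaturation and neutralization steps follow from simple dilution (C1 × V1 = C2 × V2). The sketch below is illustrative only: the function name `final_concentration` is hypothetical, and it assumes that the listed volumes are simply additive. Under that assumption the alkali comes to 0.2 N and the sodium acetate to roughly 0.27 M, which is consistent with the 0.2 N NaOH and 0.3 M sodium acetate quoted in the legend to Fig. 3C.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Return the diluted concentration from C1 * V1 = C2 * V2.

    Hypothetical helper for illustration; units of the result match stock_conc.
    """
    if total_volume_ul <= 0:
        raise ValueError("total volume must be positive")
    return stock_conc * stock_volume_ul / total_volume_ul


# Volumes taken from the procedure above (microliters).
water_ul, naoh_ul, primer_ul = 13.0, 2.0, 5.0
denaturation_volume_ul = water_ul + naoh_ul + primer_ul  # assumed additive

# 2 N NaOH stock diluted into the denaturation mixture.
naoh_final_n = final_concentration(2.0, naoh_ul, denaturation_volume_ul)

# 3 M sodium acetate stock added to the denatured sample.
acetate_ul = 2.0
neutralized_volume_ul = denaturation_volume_ul + acetate_ul
acetate_final_m = final_concentration(3.0, acetate_ul, neutralized_volume_ul)

print(f"NaOH during denaturation: {naoh_final_n:.2f} N")
print(f"Sodium acetate after neutralization: {acetate_final_m:.2f} M")
```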


RESULTS

DNA Polymerases Pause at CTG Triplet Repeat Sequences

One possible explanation for the relationship between the massive expansion of CTG and CGG repeats and several human genetic neuromuscular diseases is that their unorthodox properties cause aberrant DNA replication slippage reactions that result in the extension of the repeats. Previously, it was shown that CT repeat sequences are pause sites for DNA replication(31) .

Thus, we investigated CTG repeat sequences for their ability to be replicated in vitro using primer extension methods with the Klenow fragment of DNA polymerase I and a modified T7 DNA polymerase (Sequenase). Supercoiled pRW1981, which contains (CTG)130, was denatured with alkali and then annealed with primers. After renaturation, the primer extension mixture was preincubated for 10 min at 37 or 50 °C, and then either the Klenow fragment or the Sequenase was added and incubated for 10 min at 37 °C. Fig. 1A shows that, unexpectedly, the DNA polymerases do not progress uniformly through the triplet repeat sequences; instead, pause sites were found for both Sequenase (S) and the DNA polymerase I (Klenow fragment) (K) in both strands of (CTG)130. In the primer extension of the bottom strand with preincubation at 37 °C, the Sequenase (lane 2) and the DNA polymerase I (Klenow fragment) (lane 3) paused at the beginning region of the multiple CTG insert (triplets numbered 118-126, Fig. 1B) and at 37 CTG triplets away (triplets numbered 88-94, Fig. 1B); two brackets on the left side in Fig. 1A indicate these regions. Sequenase paused much more strongly in the beginning region than in the distal region 37 CTG triplets away (very weak pausing), but the DNA polymerase I (Klenow fragment) showed more pausing in the distal region (37 CTG triplets away) than in the beginning region. At 50 °C (lane 1), the amount of pausing of Sequenase in the bottom strand was greatly reduced in the beginning region compared with the 37 °C study.


Figure 1: DNA polymerase pause sites in CTG repeat sequence in plasmids. A, DNA sequencing gel analysis of pausing of DNA polymerase in CTG repeats. S and K represent Sequenase and the Klenow fragment of DNA polymerase I, respectively. The arrows represent the beginning site of the multiple CTG insert. The principal pause sites are marked by the brackets. B, summary of the pausing in 130-CTG repeat sequences. The length of the bars represents an approximate visual estimate of the amount of pausing. The data at both 50 and 37 °C are included to show all of the pause sites. C, pause sites for the human DNA polymerase β. DNA was preincubated at 37 °C for 10 min, and reactions were performed at 37 °C for 20 min. Data for the bottom strand are shown. The enzyme was isolated as described previously (19) (generous gift of Drs. R. K. Singhal and S. H. Wilson). The arrow and bracket have the same designations as in panel A. D, effect of temperature on the pausings. Primer extensions using Sequenase were performed with various sizes of CTG repeating sequences (130, 75, and 26 repeats) as described under "Materials and Methods" except using the DNAs and the conditions designated at the top. The pausings in the beginning of the bottom strand are shown. The arrows designate the beginning of the triplet repeats.



Also, a second type of pausing, albeit weak, was observed. The DNA polymerases paused throughout the CTG repeat sequences with a base pair periodicity of 12 + 9 (i.e. an average of 10.5 bp) (lane 1). This phenomenon was more pronounced with preincubation at 50 than at 37 °C.

In the top strand, Sequenase (sequencing lanes G, A, T, and C) paused at the beginning region of the insert and at the 30th CTG triplet (indicated by the brackets on the right side of Fig. 1A). However, the pausing was abolished in sequencing experiments by preincubation of the DNA at 70 °C, indicating that the observed bands were pause sites of the DNA polymerases and not due to the sequencing reactions. The DNA polymerase I (Klenow fragment) (lane 4) paused strongly at about the 30th CTG triplet on the top strand.

Fig. 1B summarizes the overall pausing sites; the 3′-half of both strands could not be investigated since they were too far from the primer binding sites to be studied by these methods.

That DNA polymerases are arrested at specific locations in the CTG repeat sequences suggests that DNA secondary structures exist at CTG repeat sequences that inhibit polymerase progression. Also, the properties of secondary structures in the proximal region and the distal region (the 37th triplet from the 5′-end of the bottom strand and the 30th triplet from the 5′-end of the top strand) seem to be different from each other, since the Klenow fragment and the Sequenase polymerases paused in different manners.

Studies were also performed with the human DNA polymerase β (collaboration with Drs. R. K. Singhal and S. H. Wilson, University of Texas Medical Branch, Galveston, TX). As found for the two prokaryotic polymerases, the human enzyme also paused (Fig. 1C) at the vector-triplet repeat interface, but the principal pause began at 33 repeats away on the bottom strand. Hence the pausing behavior was found for prokaryotic as well as eukaryotic DNA polymerases. The majority of the studies described below were conducted with the model prokaryotic enzymes because of their convenience and the similarity of the reactions.

Effect of Preincubation Temperature

It is well known that DNA structures such as triplexes and hairpin conformations that are stabilized by hydrogen bonds are affected by temperature and that their stability is dependent on their length(15, 16, 17, 18, 20) . Thus, we investigated various plasmids containing different lengths of CTG sequences (26, 75, and 130 repeats) at various temperatures of preincubation (50, 40, and 37 °C) with Sequenase. As shown in Fig. 1D, longer inserts showed stronger pausings of the Sequenase. For shorter inserts, the strength of the pausings was weakened. Also, the pausings were stronger at the lower temperature. These results demonstrate that the strength of pausing is dependent on the length of the CTG repeat insert and the temperature, supporting the concept that the pausings were caused by a DNA structure(s). It may be noted that Fig. 1D shows only the beginning of the repeats because Sequenase pauses weakly in the 37th CTG (triplets numbered 88-94, Fig. 1B) of the bottom strand.

The Conformation of the CTG Repeats Causes the Pausing

We also tested whether the structure that arrested the DNA polymerase I (Klenow fragment) progression could be abolished at high temperature, which would be expected to destroy H-bonds. Primer extension experiments with pRW1981 were performed as described under "Materials and Methods." The pause sites at the CTG repeats numbered 88-94 were monitored. Before the Klenow fragment was added, preincubation was performed at 37 or 70 °C for 10 min. The polymerase paused in the sample with the preincubation at 37 °C, but no pausing was found with the 70 °C preincubation. In order to test if the structure could be bypassed during the polymerase reaction, another experiment was performed as follows: the first preincubation was at 37 °C followed by the polymerase reaction at 37 °C for 10 min. This mixture then was kept at 70 °C for 10 min and cooled to 37 °C, fresh polymerase was added, and the reaction was continued for 10 min. No pausing was observed, indicating that the apparent structure could be abolished even after it had caused the arrest step. The heat treatment at 70 °C did not significantly dissociate the primer from the template or otherwise adversely affect the DNA synthesis reaction; 84% of the annealed primers remained after the treatment, and the long "readthrough" product (at the top of the gel) was present. Hence, these results indicate that heat treatment destroyed a structure (probably H-bonded) that blocked polymerase movement.

Pausings Are CTG Length-dependent

The pausings of the Klenow fragment of DNA polymerase I were studied with plasmids containing various lengths (50, 83, 100, 130, 140, and 180) of CTG repeats and with a plasmid that contains two tracts of 50 CTG triplet repeats separated by a 100-bp nontriplet repeat interruption. As shown in Fig. 2, primer extension experiments showed that the strength of pausings increased as the CTG triplet length increased. However, the pausing seemed to be saturated at lengths over 130 CTG repeats. The location of the pause sites was unaffected by the length of the triplet repeat. Interestingly, the plasmid containing two (CTG) blocks with a central 100-bp nontriplet repeat interruption sequence (rectangle) showed little pausing. By comparison, little pausing was found for 50 CTG repeats, while (CTG) showed strong pausings. These results indicate that the CTG repeats must form a contiguous tract of >80 repeats for pausing to occur. This length is apparently required to enable the repeat sequence to form a non-B conformation of sufficient stability to cause the arrest of DNA synthesis.


Figure 2: Effect of CTG repeat length on the pausings. Plasmids containing various CTG repeat sequences were constructed as described under "Materials and Methods." Primer extensions of the bottom strand of plasmids containing 50, 80, 100, 130, 140, and 180 repeats of CTG sequences, respectively, were performed with the Klenow fragment and M13 reverse primer (-24). The amounts of newly synthesized DNA at the major pausing site (triplet repeats numbered 88-94) were measured. The amount of pausing is expressed relative to the amount of (CTG) as 100. Reproducibility is 9%. The rectangle represents pRW3234, which contains two (CTG) repeats flanking a 100-bp nontriplet repeat interruption sequence.
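
The normalization described in this legend can be sketched as follows. This is a minimal illustration, not the authors' analysis procedure: the function name `relative_pausing` is hypothetical, the reference repeat length is left as a parameter because its value is not legible in the legend, and the band intensities in the usage example are arbitrary placeholders rather than measured data.

```python
def relative_pausing(band_intensity, reference_length):
    """Express pause-band intensities relative to a reference tract set to 100.

    band_intensity maps repeat length -> densitometer reading (arbitrary units).
    reference_length is the repeat length whose reading is defined as 100.
    """
    if reference_length not in band_intensity:
        raise KeyError("reference length has no measurement")
    reference = band_intensity[reference_length]
    if reference <= 0:
        raise ValueError("reference intensity must be positive")
    return {
        length: 100.0 * value / reference
        for length, value in band_intensity.items()
    }


# Placeholder readings in arbitrary units; NOT data from the paper.
# The choice of 130 as the reference here is likewise only an assumption.
example_intensity = {50: 1.0, 100: 2.0, 130: 4.0}
example_relative = relative_pausing(example_intensity, reference_length=130)
print(example_relative)
```

Keeping the reference as an explicit argument makes it easy to renormalize the same readings against a different tract length.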



Pausings Require Double-stranded DNA, but Not Supercoiling

To study whether pausings can occur in single-stranded DNA, we used the pGEM plasmid system for the production of (CAG) as single-stranded DNA. Primer extension was performed for both the single-stranded viral form and the double-stranded replicative form of pRW3111, which contains (CTG). The investigation showed no pausings in the single-stranded DNA (less than 10% of that found for duplex DNA) (Fig. 3A, lanes 1 and 2). However, CTG repeat sequences in the double-stranded replicative form DNA did show pausings (lanes 3-6). These results demonstrate that double-stranded DNA is necessary for the pausings. Furthermore, the studies described below on linear DNA treated with heat and with alkali are consistent with this conclusion.


Figure 3: Pausings require double-stranded CTG repeats but not supercoiling. A, primer extension experiments were performed for single-stranded (S-S) and double-stranded (D-S) pRW3111. T7 and SP6 promoter primers were used. S, Sequenase; K, Klenow. The arrows designate the beginning of the triplet repeats. B, various topoisomers of pRW1981 were used for primer extension. Lanes 1-4 have average supercoil densities of 0, -0.02, -0.04, and -0.06, respectively. Lane 6 is linear DNA that was prepared by digestion with SacI, and lane 5 is a mixture of linear DNA (1 µg) and supercoiled DNA (average supercoil density = -0.053) (1 µg). The arrow represents the beginning site of the multiple CTG insert. C, pRW3219, which contains (CTG), was denatured with 0.2 N NaOH in the presence of 32P-labeled primer and renatured by adding 0.3 M sodium acetate. The DNAs were resuspended in a buffer containing 10 mM Tris·Cl (pH 7.9), 10 mM MgCl2, 1 mM dithiothreitol, and 50 or 100 mM NaCl and incubated for 10 min at 37 °C. 10 units of EcoRI were added to the samples as indicated at the top, and the samples were incubated for a further 20 min. 5 or 0.5 units of Sequenase or DNA polymerase I (Klenow fragment) were added and incubated for 10 min. After the reactions were stopped, the DNAs were run on a 12% polyacrylamide gel.



Usually, non-B DNA structures are underwound relative to normal B-DNA (17, 20, 37). Hence, negative supercoiling stabilizes unusual DNAs such as cruciforms, Z-DNA, and triplexes (15, 17, 20, 37). We examined the pausing of the Klenow fragment with various topoisomers of pRW1981. Fig. 3B shows that the DNA polymerase paused at all supercoil densities (average supercoil density, 0 to -0.06) (lanes 1-4), indicating that negative supercoil density does not have a major influence on the pausing. However, linear DNA does not display the same pausing pattern as circular DNAs, whereas a mixture of linear DNA and circular DNA shows half the amount of pausing at the same loci as in the closed circular DNA. Linear DNA probably shows no pausing after treatment with heat and alkali because its strands dissociate, which disfavors renaturation in the annealing step compared with the circular DNA.

The extension of primers by DNA polymerases with closed circular DNAs may generate waves of positive supercoils in front of the synthesizing polymerases, as shown in the translocation of RNA polymerase along duplex DNA (twin-supercoiled domain)(38, 39) . The accumulated torsional stress may cause the pausing of DNA polymerases. To test this, we performed a primer extension experiment for DNA that was linearized with EcoRI after the supercoiled pRW3219 and the primer were annealed and preincubated for 10 min at 37 °C. Fig. 3C shows that the DNA polymerases paused in DNAs linearized after the preincubation with the same pattern as the supercoiled DNA (no EcoRI treatment); the bands at the tops of the gels (top arrow) (lanes 1-4 and 7-10) indicate that a high percentage of the DNAs were linearized with EcoRI. Also, Fig. 3C shows that pausings in the distal region are increased, although not dramatically, in the presence of 100 mM NaCl compared with 50 mM NaCl. In addition, the DNA polymerases pause more at the beginning region (bottom arrow) of the CTG repeats with low concentration of the enzymes and more in the distal region (37th CTG) (bracket) with high concentration. Hence, these results indicate that the pausing was not induced by an accumulated torsional stress resulting from the progression of the DNA polymerases.

Primer Binding Sites Determine the Loci of Pausings

The location of the pause site(s) was investigated as a function of the position of hybridization of the primers to pRW1981. Surprisingly, we found that different primers, ranging from 36 to 113 bp in the distance between the 5′-end of the primer and the first CTG triplet, caused pausing in different loci. A primer whose 5′-end is 63 bp from the first CTG has pause sites at the 28th CTG (Fig. 4A, lanes 3 and 4), whereas a primer whose 5′-end is 89 bp from the first CTG has pause sites at the 37th CTG (Fig. 4A, lanes 5 and 6). The complementary strand (the CAG triplet repeat is the template) also shows a dependence of pause sites on the distance between the primer and the first CAG repeat. DNA polymerase paused at the 30th triplet with a primer whose 5′-end is 70 bp from the first CAG (Fig. 4A, lanes 1 and 2). Also, we examined an extremely closely located primer (whose 5′-end is 36 bp from the CTG triplet repeats). As shown in lane 4 of Fig. 4B, the Klenow fragment paused at the 22nd CTG as well as in the beginning region. In addition, the amount of pausing is much less, compared with the primer whose 5′-end is 89 bp away (Fig. 4B, lanes 1 and 2). Most of the pause sites with Sequenase (Fig. 4B, lane 3) were in the beginning region as found in the primer extensions using primers that were distant from the first CTG triplet.


Figure 4: The pausing sites are determined by the location of the primer binding site. Primer extension experiments were performed with various primers whose 5′-ends have various distances from the first CTG repeats in pRW1981 and pRW3262. pRW3262 was used in place of pRW1981; pRW3262 is pRW1981 with a 38-bp deletion (on the left side of the triplets) and has (CTG) in the BamHI site instead of the HincII site. Thus, the primer (-20) was effectively moved 26 bp closer to the (CTG) tract. The primers were GTAAAACGACGGCCAGT (-20), AACAGCTATGACCATG (-21), and CCTGGCCGAAAGAAAT (p19). Other experimental details were as described under "Materials and Methods." A, lanes 1 and 2, the 5′-end of the primer is 70 bp from the first CTG triplets; lanes 3 and 4, 63 bp; lanes 5 and 6, 89 bp. The arrows designate the beginning of the triplet repeats. B, lanes 1 and 2, 89 bp distance; lanes 3 and 4, 36 bp. S, Sequenase; K, DNA polymerase I (Klenow fragment). The arrow represents the beginning site of the multiple CTG insert. C, summary of relationships between the locations of 5′-ends of primers (open bars) and pause sites by the Klenow fragment of DNA polymerase I (filled bars).



A summary of these observations is shown in Fig. 4C. The extension of primers with a longer distance from the initiation site of the CTG repeat stopped at triplets located farther from the initiation site; the distance between the pausing site in the distal region and the first CTG is about 20 bp longer than the distance between the first CTG and the 5′-end of primers. This phenomenon occurs in both strands, although the extent of the pausings seemed to be different. The lengths of the filled bars in Fig. 4C represent an approximate visual estimate of the amounts of pausing of the DNA polymerase I (Klenow fragment) occurring in the distal regions; the closest primer (e) showed the least amount of pausing. These results imply that the length of DNA synthesized influences the conformation that causes the pausing and hence the location of the pause sites.
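
The "about 20 bp" relationship can be checked against the primer distances and pause sites reported above with a few lines of arithmetic. The sketch below is an illustration only: the function name `predicted_pause_triplet` is hypothetical, the 20-bp offset is the approximate value stated in the text, and rounding up to the triplet that contains the predicted base pair is an assumption made here, not a rule given by the authors.

```python
TRIPLET_BP = 3          # length of one CTG (or CAG) repeat unit
APPROX_OFFSET_BP = 20   # "about 20 bp" from the text; an approximate value


def predicted_pause_triplet(primer_distance_bp, offset_bp=APPROX_OFFSET_BP):
    """Estimate the distal pause site as a triplet number within the repeat.

    primer_distance_bp: distance from the primer 5'-end to the first triplet.
    The pause is placed (primer distance + offset) bp into the repeat tract and
    converted to a triplet number by rounding up (assumed convention).
    """
    pause_bp = primer_distance_bp + offset_bp
    return (pause_bp + TRIPLET_BP - 1) // TRIPLET_BP  # integer ceiling


# (primer distance in bp, distal pause triplet reported in the text)
reported = [(70, 30), (63, 28), (89, 37), (36, 22)]

for distance_bp, observed_triplet in reported:
    estimate = predicted_pause_triplet(distance_bp)
    print(f"{distance_bp} bp: predicted triplet {estimate}, "
          f"reported triplet {observed_triplet}")
```

With these assumptions the estimate matches the reported triplet for the 63-, 70-, and 89-bp primers and falls three triplets short for the closest (36-bp) primer, which is in keeping with the relationship being stated only as approximate.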

Comparison of Pausing Patterns with dGTP, dITP, and 7-deaza-dGTP as Substrates

In 7-deaza-dGTP and dITP, the N-7 atom and the exocyclic amino group of dGTP are replaced with a carbon atom and a hydrogen, respectively. These replacements would be expected to eliminate hydrogen bonds involved in triplexes and tetraplexes (30, 40). To determine whether changes in the dNTP substrates could alter the pausing patterns, the nucleotide analogs dITP or 7-deaza-dGTP were used instead of dGTP in the primer extension experiments. Fig. 5A shows that the pausing patterns are similar in the presence of dGTP (lanes 1, 4, 7, and 10) and 7-deaza-dGTP (lanes 3, 6, and 9), indicating that the N-7 position of guanine is not required for the pausing. However, the substitution of dGTP with dITP eliminated the pausing observed in the distal region (37th CTG). Instead, stronger pausing at the five CTG repeats in the beginning region of the CTG triplets was observed with both Sequenase and the Klenow fragment (lanes 2, 5, 8, and 11). However, this pausing pattern is not the same as that seen in the beginning region of the CTG repeats when dGTP or 7-deaza-dGTP was used as the substrate.


Figure 5: Capacity of dITP or 7-deaza-dGTP to replace dGTP. A, dGTP (G) as a substrate was replaced by dITP (I) or 7-deaza-dGTP (D) in primer extension experiments. The preincubation was done at 37 or 50 °C. The arrows represent the beginning site of the multiple CTG insert. B, primer extensions were performed with dNTPs containing either dGTP or dITP as substrates for plasmids containing 130, 26, or 17 CTG repeats. The preincubation was done at 37 °C. The arrow represents the beginning site of the multiple CTG insert.



The progression of the DNA polymerase I (Klenow fragment) in the presence of dITP was arrested at the five dIs at the beginning region of the CTG repeats. Although the pattern is quite different from that of dGTP, the pausing is length-dependent as also occurred in the primer extension with dGTP. The progression of the Klenow fragment of DNA polymerase I with dITP (Fig. 5B) was arrested in the beginning region of (CTG) (lane 4) but not with (CTG) (lane 5) and (CTG) (lane 6).

Our results indicate that the substitution of 7-deaza-dGTP for dGTP does not change the pattern of pausing, whereas the substitution of dITP does, suggesting that the conformation(s) that causes the pausings is not a tetraplex or a triplex in which the N-7 atom is involved. The details of the structure remain to be elucidated.

Pausings in Long CGG Triplet Repeats

Since expansions of CGG triplet repeats are associated with Fragile X and XE syndromes, we investigated whether DNA polymerases also pause in CGG repeats. Long CGG repeat sequences also caused pausings. Primer extension of pRW3306, which contains 160 CGG repeats, was performed using the Klenow fragment. The 5'-end of the -40 M13 primer is 72 bp away from the CGG repeats. As shown in Fig. 6, a pausing pattern similar to that with CTG repeats was observed at the 29-31st triplets away from the CGG start site along with a weak pausing at the second repeat site (lane 5). Among the repeat sizes examined (9, 20, 26, 44, 61, 71, 80, and 160), pausing was observed only with tracts longer than 61 repeats (data not shown). As for the CTG repeats, the length of the CGG repeat affected the strength but not the location of the pause sites. However, unlike the CTG repeats, the polymerase paused only when the CCG strand was the template; no pausing was found when CGG was the template. With Sequenase, no pausing was observed in the distal region of the CGG inserts (data not shown). Other factors that affect the polymerase pausing in CTG sequences were also examined with the CGG-containing plasmids. Polymerase pausing in plasmids containing CGG triplets did not need negative supercoiling but required closed circular DNA. The primer binding sites determined both the location and the intensity of pausings. The structure that blocked polymerase movement must be more thermally stable than the conformation in CTG, since the heat treatment (second incubation) up to 90 °C did not abolish the pausing. Methylation, which plays an important role in the complete inactivation of FMR-1 gene expression in the Fragile X syndrome, had no effect on the polymerase pausing; complete inhibition of cleavage by AciI showed that pRW3306 was fully methylated in vitro by the SssI methylase for these studies.
Thus, the DNA polymerase pausing studies with long CGG repeats show similar, but not identical, results to those found with CTG repeats.
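
The pause positions quoted above are given in repeat units, whereas a sequencing gel reports the length of the extended product. The following minimal Python sketch converts one to the other. It is an illustration only: the function names (triplet_span, pause_window) are hypothetical, and the counting convention is an assumption, namely that exactly 72 nucleotides, counted from the primer 5'-end, precede the first nucleotide of the repeat tract. The text states only that the primer 5'-end is "72 bp away" from the repeats, so the result could be off by one or two nucleotides under a different convention.

```python
REPEAT_UNIT = 3  # nucleotides per triplet repeat


def triplet_span(offset_nt, triplet_index, unit=REPEAT_UNIT):
    """Return (first, last) nucleotide positions of one triplet.

    Positions are 1-based and counted from the primer 5'-end.
    Assumption: offset_nt nucleotides precede the first repeat nucleotide.
    """
    if triplet_index < 1:
        raise ValueError("triplet_index is 1-based")
    first = offset_nt + unit * (triplet_index - 1) + 1
    last = offset_nt + unit * triplet_index
    return first, last


def pause_window(offset_nt, first_triplet, last_triplet):
    """Return the nucleotide window covered by a run of paused triplets."""
    if last_triplet < first_triplet:
        raise ValueError("last_triplet must not precede first_triplet")
    start, _ = triplet_span(offset_nt, first_triplet)
    _, end = triplet_span(offset_nt, last_triplet)
    return start, end


if __name__ == "__main__":
    # Strong pause at the 29-31st CGG triplets, primer 5'-end 72 bp upstream.
    print(pause_window(72, 29, 31))
    # Weak pause at the second repeat.
    print(triplet_span(72, 2))
```

Under this convention the strong pause corresponds to extension products of roughly 157 to 165 nucleotides and the weak pause to products of 76 to 78 nucleotides, measured from the primer 5'-end.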


Figure 6: DNA polymerases pause in long CGG triplet repeat sequences. Primer extension of pRW3306 was performed using the M13 primer(-40). Lanes 1-4, dideoxy termination reactions using Sequenase, G, A, T, and C, respectively. Lane 5, primer extension with the DNA polymerase I (Klenow fragment). DNA was preincubated at 37 °C for 10 min before the extension reaction. The arrow on the left side indicates the CGG repeat start site. Polymerase pausing was observed at the 29-31st CGG repeats as indicated on the right side of lane 5.




DISCUSSION

Our in vitro experiments show aberrant DNA synthesis in massive CTG and CGG repeat sequences; pausings of DNA polymerases occur at specific loci. The distance between the primer binding site and the initiation of the triplet repeats determines the pause location. The pausing is dependent on the length of the contiguous CTG insert and is temperature-dependent. These data suggest that DNA secondary structures, probably stabilized by H-bonds, exist in long CTG and CGG repeat sequences that inhibit DNA polymerase movement.

Our experiments indicate that CTG triplet repeats have unorthodox properties. If these features are present in vivo in human cells, they may play a role in the expansion and the subsequent molecular pathology of myotonic dystrophy, Huntington's disease, Kennedy's disease, spinocerebellar ataxia type I, and dentatorubral-pallidoluysian atrophy. Other studies(^3) also show the non-B properties of longer CTG repeats that are different from previously characterized unusual DNA conformations (15, 17, 20, 37) (left-handed Z-DNA, cruciforms, triplexes, nodule DNA, etc.) and formation of tetraplexes for CGG oligomers(40) .

DNA polymerases as well as chemical probes and physical analyses have been used to study unusual DNA structures(30, 31, 32) . Mirkin and co-workers (32) showed that DNA polymerases pause at intramolecular triplex forming sequences and that the pause sites are dependent on the type of triplex isomer. In addition, polymerases may be able to recognize sites such as smoothly curved DNAs that chemical probes cannot detect since there are no perturbed (unpaired) bases. Thus, DNA polymerases may recognize non-B DNA structures that preexist or that are formed in the course of DNA polymerization. The observed pausing of DNA polymerases in the CTG repeats is apparently caused by a non-B DNA conformation. Since no pausing was found in single-stranded triplet repeats, double-stranded DNA must be a prerequisite for the structure. However, it is not clear whether the structure preexisted in the duplex DNA, resulted from a misalignment created after the denaturation-renaturation process, and/or was formed by the process of polymerase elongation.

Expansion and contraction of CTG triplet repeats occur in E. coli(35) . Expansion may be related to the pausing of DNA polymerase as follows. Insertions and deletions of a few bases by a slippage mechanism have been reported(41, 42) . As proposed previously(15, 43) , slippage might be promoted by an "idling polymerase" at a strong block such as a DNA structure or bound proteins such as nucleosomes; the result could be multiple slippages that cause expansion to larger sequences. Other studies showed that nucleosomes are preferentially positioned at CTG triplets and that this behavior is more pronounced for longer repeats (34, 44) . We proposed that the non-B conformation adopted by the CTG repeats is a toroid;(^3) this structure would be expected to serve as a superior histone binding site, as observed(34, 44) . This unorthodox DNA structure may block polymerase movement and transiently cause the dissociation of the template and the newly synthesized strand (Fig. 7). A primer reassociation in a misaligned configuration may generate a hairpin structure in the newly synthesized strand and hence elicit expansions.


Figure 7: A model for the relationship between the pausing of DNA polymerases and expansion of triplet repeat sequences as mediated by primer realignment. The third process (Expansion) is a multistep reaction. n is the number of triplet repeats and n+s represents the repeat numbers expanded by the slipped (s) increment.
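
The bookkeeping behind the n to n+s relationship in this model can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a description of the experiments: the function name expand_by_slippage, the repeat number n = 130, the pause site at repeat 30, and the slip sizes are all hypothetical values chosen for the example. Each idling cycle is assumed to consist of the nascent 3'-end realigning backwards by s repeats on the template (looping out s repeats in the new strand) and then recopying those s template repeats up to the pause site.

```python
def expand_by_slippage(n, pause_at, slips):
    """Count repeats in the nascent strand after pausing and realignment.

    n        : number of repeats in the template tract
    pause_at : template repeat at which the polymerase stalls
    slips    : sizes (in repeats) of successive backward realignments

    Toy model only; all parameters are illustrative assumptions.
    """
    if not 0 <= pause_at <= n:
        raise ValueError("pause_at must lie within the repeat tract")

    nascent = pause_at        # repeats synthesized up to the pause site
    template_pos = pause_at   # template repeat paired with the nascent 3'-end

    for s in slips:
        if s < 0 or s > template_pos:
            raise ValueError("cannot slip back past the start of the tract")
        template_pos -= s     # misaligned reassociation, s repeats looped out
        nascent += s          # the same s template repeats are copied again
        template_pos += s     # polymerase is back at the pause site

    nascent += n - template_pos   # block resolved, rest of the tract copied
    return nascent


if __name__ == "__main__":
    print(expand_by_slippage(n=130, pause_at=30, slips=[]))         # no slippage
    print(expand_by_slippage(n=130, pause_at=30, slips=[2, 1, 3]))  # three cycles
```

In this sketch the product contains n plus the sum of all slip increments, which mirrors the description of expansion as a multistep reaction; a single slip of size s gives n+s.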



The pausing phenomenon is general for both eukaryotic and prokaryotic polymerases. The replication of eukaryotic chromosomes requires the participation of a multi-component ensemble that includes DNA polymerases, helicases, ligases, and single-stranded binding proteins. These proteins may influence the pausing of polymerases. Whereas further studies are required to determine the relationship of these observations to the behavior in living human cells, our studies provide a basis for exploring the genetic and biochemical mechanisms of expansion and deletion in a well characterized simple organism as related to expansion in human neuromuscular genetic diseases.


FOOTNOTES

*
This work was supported by the National Institutes of Health (Grant GM52982), the National Science Foundation (Grant DMB-9103942), and the Robert A. Welch Foundation. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

§
Contributed equally to this work.

¶
Present address: Laboratory of Genetic Disease Research, NCHGR, NIH, Bldg. 49, Bethesda, MD 20892.

**
To whom correspondence should be addressed: Inst. of Biosciences and Technology, Texas A & M University, Texas Medical Center, 2121 Holcombe Blvd., Houston, TX 77030. Tel.: 713-677-7651; Fax: 713-677-7689; rwells@ibt.tamu.edu.

(^1)
K. Ohshima, S. Kang, J. E. Larson, and R. D. Wells, manuscript in preparation.

(^2)
The abbreviations used are: pur·pyr, homopurine·homopyrimidine; bp, base pair(s); TYP, tryptone-yeast extract-phosphate broth.

(^3)
R. Gellibolian, M. Shimizu, S. Amirhaeri, S. Kang, K. Ohshima, J. E. Larson, Y.-H. Fu, C. T. Caskey, B. A. Oostra, and R. D. Wells, manuscript in preparation.


ACKNOWLEDGEMENTS

We thank Drs. J. Klysik and A. Jaworski for critical discussions.


REFERENCES

  1. Mahadevan, M., Tsilfidis, C., Sabourin, L., Shutler, G., Amemiya, C., Jansen, G., Neville, C., Narang, M., Barcelo, J., O'Hoy, K., Leblond, S., Earle-MacDonald, J., De Jong, P. J., Wieringa, B., and Korneluk, R. G. (1992) Science 255, 1253-1255
  2. Brook, J. D., McCurrach, M. E., Harley, H. G., Buckler, A. J., Church, D., Aburatani, H., Hunter, K., Stanton, V. P., Thirion, J.-P., Hudson, T., Sohn, R., Zemelman, B., Snell, R. G., Rundle, S. A., Crow, S., Davies, J., Shelbourne, P., Buxton, J., Jones, C., Juvonen, V., Johnson, K., Harper, P. S., Shaw, D. J., and Housman, D. E. (1992) Cell 68, 799-808
  3. Fu, Y.-H., Pizzuti, A., Fenwick, R. G., Jr., King, J., Rajnarayan, S., Dunne, P. W., Dubel, J., Nasser, G. A., Ashizawa, T., De Jong, P., Wieringa, B., Korneluk, R., Perryman, M. B., Epstein, H. F., and Caskey, C. T. (1992) Science 255, 1256-1258
  4. La Spada, A. R., Wilson, E. M., Lubahn, D. B., Harding, A. E., and Fischbeck, K. H. (1991) Nature 352, 77-79
  5. Orr, H. T., Chung, M.-y., Banfi, S., Kwiatkowski, T. J., Jr., Servadio, A., Beaudet, A. L., McCall, A. E., Duvick, L. A., Ranum, L. P. W., and Zoghbi, H. Y. (1993) Nature Genet. 4, 221-226
  6. The Huntington's Disease Collaborative Research Group (1993) Cell 72, 971-983
  7. Koide, R., Ikeuchi, T., Onodera, O., Tanaka, H., Igarashi, S., Endo, K., Takahashi, H., Kondo, R., Ishikawa, A., Hayashi, T., Saito, M., Tomoda, A., Miike, T., Naito, H., Ikuta, F., and Tsuji, S. (1994) Nature Genet. 6, 9-13
  8. Nagafuchi, S., Yanagisawa, H., Sato, K., Shirayama, T., Ohsaki, E., Bundo, M., Tanaka, T., Tadokoro, K., Kondo, I., Murayama, N., Tanaka, Y., Kikushima, H., Umino, K., Kurosawa, H., Furukawa, T., Nihei, K., Inoue, T., Sano, A., Komure, O., Takahashi, M., Yoshizawa, T., Kanazawa, I., and Yamada, M. (1994) Nature Genet. 6, 14-18
  9. Burke, J. R., Wingfield, M. S., Lewis, K. E., Roses, A. D., Lee, J. E., Hulette, C., Pericak-Vance, M. A., and Vance, J. M. (1994) Nature Genet. 7, 521-524
  10. Kawaguchi, Y., Okamoto, T., Taniwaki, Aizawa, M., Inoue, M., Katayama, S., Kawakami, H., Nakamura, S., Nishimura, M., Akiguchi, I., Kimura, J., Narumiya, S., and Kakizuka, A. (1994) Nature Genet. 8, 221-228
  11. Verkerk, A. J. M. H., Pieretti, M., Sutcliffe, J. S., Fu, Y.-H., Kuhl, D. P. A., Pizzuti, A., Reiner, O., Richards, S., Victoria, M. F., Zhang, F., Eussen, B. E., van Ommen, G.-J. B., Blonden, L. A. J., Riggins, G. J., Chastain, J. L., Kunst, C. B., Galjaard, H., Caskey, C. T., Nelson, D. L., Oostra, B. A., and Warren, S. T. (1991) Cell 65, 905-914
  12. Knight, S. J. L., Flannery, A. V., Hirst, M. C., Campbell, L., Christodoulou, Z., Phelps, S. R., Pointon, J., Middleton-Price, H. R., Barnicoat, A., Pembrey, M. E., Holland, J., Oostra, B. A., Bobrow, M., and Davies, K. E. (1993) Cell 74, 127-134
  13. Nelson, D. L. (1993) in Genome Analysis Vol. 7: Genome Rearrangement and Stability (Davis, K. E., and Warren, S. T., eds) pp. 1-24, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
  14. Wieringa, B. (1994) Hum. Mol. Genet. 3, 1-7
  15. Wells, R. D., and Sinden, R. R. (1993) in Genome Analysis Vol. 7: Genome Rearrangement and Stability (Davis, K. E., and Warren, S. T., eds) pp. 107-138, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
  16. Hanvey, J. C., Klysik, J., and Wells, R. D. (1988) J. Biol. Chem. 263, 7386-7396
  17. Sinden, R. R. (1994) DNA Structure and Function, Academic Press, San Diego, CA
  18. Kang, S., Wohlrab, F., and Wells, R. D. (1992) J. Biol. Chem. 267, 1259-1264
  19. Singhal, R. K., and Wilson, S. H. (1993) J. Biol. Chem. 268, 15906-15911
  20. Wells, R. D. (1988) J. Biol. Chem. 263, 1095-1098
  21. Zheng, G., Kochel, T., Hoepfner, R. W., Timmons, S. E., and Sinden, R. R. (1991) J. Mol. Biol. 221, 107-129
  22. Bernués, J., Beltrán, R., Casasnovas, J. M., and Azorín, F. (1989) EMBO J. 8, 2087-2094
  23. Kohwi, Y., and Kohwi-Shigematsu, T. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 3781-3785
  24. Wells, R. D., Collier, D. A., Hanvey, J. C., Shimizu, M., and Wohlrab, F. (1988) FASEB J. 2, 2939-2949
  25. Kang, S., and Wells, R. D. (1992) J. Biol. Chem. 267, 20889-20891
  26. Panyutin, I. G., and Wells, R. D. (1992) J. Biol. Chem. 267, 5495-5501
  27. Trinh, T. Q., and Sinden, R. R. (1991) Nature 352, 544-548
  28. Ohshima, A., Inouye, S., and Inouye, M. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 1016-1020
  29. Ripley, L. S. (1990) Annu. Rev. Genet. 24, 189-213
  30. Baran, N., Lapidot, A., and Manor, H. (1991) Proc. Natl. Acad. Sci. U. S. A. 88, 507-511
  31. Lapidot, A., Baran, N., and Manor, H. (1989) Nucleic Acids Res. 17, 883-900
  32. Dayn, A., Samadashwily, G. M., and Mirkin, S. M. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 11406-11410
  33. Hansen, R. S., Canfield, T. K., Lamb, M. M., Gartler, S. M., and Laird, C. D. (1993) Cell 73, 1403-1409
  34. Wang, Y.-H., Amirhaeri, S., Kang, S., Wells, R. D., and Griffith, J. (1994) Science 265, 669-671
  35. Kang, S., Jaworski, A., Ohshima, K., and Wells, R. D. (1995) Nature Genet. 10, 213-218
  36. Singleton, C. K., and Wells, R. D. (1982) Anal. Biochem. 122, 253-257
  37. Palecek, E. (1991) Crit. Rev. Biochem. Mol. Biol. 26, 151-226
  38. Liu, L. F., and Wang, J. C. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 7024-7027
  39. Rahmouni, A. R., and Wells, R. D. (1992) J. Mol. Biol. 223, 131-144
  40. Fry, M., and Loeb, L. A. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 4950-4954
  41. Strand, M., Prolla, T. A., Liskay, R. M., and Petes, T. D. (1993) Nature 365, 274-276
  42. Levinson, G., and Gutman, G. A. (1987) Nucleic Acids Res. 15, 5323-5338
  43. Richards, R. I., and Sutherland, G. R. (1994) Nature Genet. 6, 114-116
  44. Wang, Y. W., and Griffith, J. (1995) Genomics 25, 570-573
