(Received for publication, December 10, 1996, and in revised form, April 4, 1997)
From the Center for Genome Research, Institute of Biosciences and Technology and the Department of Biochemistry and Biophysics, Texas A & M University, Texas Medical Center, 2121 W. Holcombe Blvd., Houston, Texas 77030-3303
Genetic expansion of DNA triplet repeat sequences
(TRS) found in neurogenetic disorders may be due to abnormal DNA
replication. We have previously observed strong DNA synthesis pausings
at specific loci within the long tracts (>~70 repeats) of CTG·CAG
and CGG·CCG as well as GTC·GAC by primer extensions in
vitro using DNA polymerases (the Klenow fragment of
Escherichia coli DNA polymerase I, the modified T7 DNA
polymerase (Sequenase), and human DNA polymerase β). Herein, we have
isolated and analyzed the products of stalled synthesis found at
~30-40 triplets from the beginning of the TRS. DNA sequence analyses
revealed that the stalled products contained short tracts of
homogeneous TRS (6-12 repeats) in the middle of the sequence
corresponding to the flanking region of the primer-template sequence.
The sequence at the 3′-side terminated at the end of the primer,
indicating that the primer molecule had served as a template. In
addition, chemical probe and polyacrylamide gel electrophoretic
analyses revealed that the stalled products existed in hairpin
structures. We postulate that these products of the DNA polymerases are
caused by an unusual DNA conformation(s) within the TRS that, during
in vitro DNA synthesis, enhances DNA slippage and hairpin formation in
the TRS due to primer
realignment. The consequence of these steps is DNA synthesis to the end
of the primer and termination. Primer realignment including hairpin
formation may play an important intermediate role in the replication of
TRS in vivo to elicit genetic expansions.
Neurogenetic diseases including myotonic dystrophy, fragile X syndrome, Huntington's disease, spinobulbar muscular atrophy, spinocerebellar ataxia type 1, and Friedreich's ataxia have been found to result from expanded triplet repeat sequences (TRS)1 CTG·CAG, CGG·CCG, and GAA·TTC within their genes (1, 2). Also, instabilities (expansions and deletions) of TRS have been associated with limb developmental diseases including human synpolydactyly (3), hypodactyly in mice (4), and hereditary nonpolyposis colon cancer (5). The age of onset and the severity of the neurological diseases are influenced by the lengths of the TRS. Long tracts of TRS are unstable and are prone to change repeat size by expansions and deletions in successive generations. In addition to observations in humans, TRS instabilities have been demonstrated in Escherichia coli (6), and subsequent investigations revealed the influences of genetic factors including host strain (6-9), DNA replication (6, 8), methyl-directed mismatch repair (10), transcription,2 and nucleotide excision repair.3
Simple repetitive sequences are known to cause misalignment-mediated DNA synthesis errors that give rise to the instabilities (deletions and expansions) (11-14). These instabilities are thought to be due to the formation of unusual secondary structures, which can cause frameshift mutations during DNA synthesis (15). Mono- and dinucleotide repeats including Z-DNA-forming sequences promote multiple slippages in templates (12-14, 16). Inverted repeats, which have the potential to form hairpin structures that stabilize misalignments (17), produced a stem-loop DNA structure in pUC19 (18) due to template switching during replication in E. coli (11). Pause sites are hot spots for mutation caused by DNA misalignment (19-21). Polypurine·polypyrimidine tracts including GAA·TTC repeats caused pausing (arrest) of DNA polymerases in vitro due to triplex formation (22-24).
The mechanism of TRS expansion is uncertain. However, it has been postulated that DNA replication errors may cause the expansion of the TRS due to the slippage of DNA complementary strands (6, 25, 26). The molecular basis of expansions versus deletions of CTG·CAG (6) and CGG·CCG (7) was explained on the basis of preferential stabilization of loop structures during replication (26). Also, the FMR1 gene containing CGG·CCG repeats associated with the fragile X syndrome delayed replication in vivo (27). Similar studies with CTG·CAG in yeast agreed with this molecular model (28).
Our previous studies showed that DNA polymerases, including the Klenow
fragment of E. coli DNA polymerase I, the modified T7 DNA
polymerase (Sequenase), and human DNA polymerase β, paused in
vitro within certain TRS, including CTG·CAG, CGG·CCG, and
GTC·GAC, for lengths of >~70 repeats (29). Herein, we report that
the stalled products at the distal region of the TRS insert were
termination products during DNA synthesis due to primer realignment by
the formation of hairpin structures of TRS. Our results indicate that the unusual DNA conformation(s) of the TRS (30, 31) that blocks DNA
polymerization may be responsible for the primer realignment and may
play an important role in expansion in vivo.
Table I lists the plasmids used in this study. pRW1981 contains a segment of the myotonic dystrophy gene (32) with 130 CTG·CAG triplet repeats along with 19 and 43 base pairs of the genomic flanking sequences, which was cloned into the HincII site of the polylinker of pUC19 (29, 33). pRW3262 was a recloned product of the Sau3AI fragment containing (CTG·CAG)130 of pRW1981 cloned into the BamHI site of pUC19 (29). pRW3306 containing (CGG·CCG)160 was cloned using a head-to-tail dimer of (CGG·CCG)80, which was derived from the FMR1 gene (34), into the HincII site of pUC19 (7). pRW3416 containing (GTC·GAC)98 was produced using a synthetic oligonucleotide and the in vivo expansion method (6, 22). The lengths of the repeat sequences for pRW3306 and pRW3416 were confirmed by DNA sequence analysis. The inserts for pRW1981 and pRW3262 were partially sequenced, and the repeat lengths were estimated by agarose gel electrophoresis. The number of repeats is estimated as ±5 triplets.
Primers (New England Biolabs
Inc.) were the M13/pUC forward sequencing primer 1212 (17-mer) or the
M13/pUC reverse sequencing primer 1201 (16-mer) and were labeled with
[γ-32P]ATP (DuPont NEN) using 5 units of T4
polynucleotide kinase (U. S. Biochemical Corp.) for 60 min at
37 °C. The mixtures were heated for 5 min at 65 °C and purified
by gel filtration through Sephadex G-50 (medium) microcolumns
(Pharmacia Biotech Inc.) equilibrated with H2O.
3 µg of each plasmid DNA was dissolved in 20 µl of a solution
containing 0.2 M NaOH and 2 µg of
5′-32P-end-labeled primer. The mixture was heated for
90 s at 90 °C, followed by cooling to room temperature for 4 min, and neutralized with 2 µl of 3 M sodium acetate (pH
5.2). After DNA was precipitated by ethanol, the pellet was washed with
70% ethanol and dried under vacuum. The DNA was resuspended in 10 µl
of a buffer containing 40 mM Tris-Cl (pH 7.5), 50 mM NaCl, 20 mM MgCl2, 10 mM dithiothreitol, and 0.5 mM
2′-deoxynucleoside triphosphates. Before the extension reaction was
performed, the mixture was incubated at 37 °C for 10 min. 5 units of
the Klenow fragment of E. coli DNA polymerase I (U. S.
Biochemical Corp.) was added to the mixture, which was incubated for 10 min at 37 °C. The reaction was terminated by the addition of 95%
formamide and 20 mM EDTA. The mixture was heated for 4 min
at 90 °C and chilled on ice. A portion of the reaction mixture was
run on a 12% polyacrylamide gel containing 7 M urea.
For the investigations with betaine and the E. coli single-stranded DNA-binding protein (SSB), before adding the Klenow fragment, 5.5 M betaine (Sigma) or SSB (>98% pure; U. S. Biochemical Corp.) was added to the reaction mixtures to give different concentrations in which 1 or 0.5 µg of DNA was used as a template, respectively. The mixtures were incubated at 37 °C for 30 min. After DNA elongation reactions, SSB was extracted with phenol prior to gel electrophoresis.
DNA Sequencing and Chemical Modifications
To sequence the
stalled products, a total of 24 µg of DNA was used for the primer
extension reactions as described above (3 µg of DNA/reaction). After
the reaction mixtures were run on a 12% denaturing polyacrylamide
electrophoresis gel, the stalled products (see Fig. 1) were excised and
eluted by electrophoresis. The eluates were purified by Sephadex G-50
(medium) and precipitated by ethanol along with an oligomer as a
carrier.
The purified stalled products were sequenced by the chemical
degradation method (35). For the G reaction, the DNA was incubated with
0.5% dimethyl sulfate (DMS) (Aldrich) for 2 min at 25 °C in 200 µl of DMS buffer (50 mM sodium cacodylate (pH 7.0) and 1 mM EDTA). For the reactions of purine bases (G and A), the
DNA was incubated with 1% formic acid (Fluka) at 37 °C for 15 min in 20 µl of H2O. For the pyrimidine bases (T and C), the
DNA was incubated with 60% hydrazine (Aldrich) at 25 °C for 9 min
in 50 µl of H2O. The reactions were terminated by the
addition of DMS stop buffer (1.5 M sodium acetate (pH 7.0),
1 M β-mercaptoethanol, and 250 µg/ml yeast tRNA) for
the DMS reactions and hydrazine stop buffer (0.3 M sodium
acetate (pH 7.0), 0.1 mM EDTA, and 100 µg/ml yeast tRNA)
for both the formic acid and hydrazine reactions.
Chemical modifications of the stalled products were performed with bromoacetaldehyde (BAA) and diethyl pyrocarbonate (DEPC) (Aldrich). BAA was prepared as described previously (36). For the BAA reactions, the eluted stalled products described above were incubated with 2% BAA in 100 µl of DMS buffer for 90 min at 25 °C. For the DEPC reactions, the DNAs were incubated with 10% DEPC in 100 µl of DMS buffer at 25 °C for 30 min. Both reactions were terminated by chilling on ice and washed twice with cold ether.
After recovery by ethanol precipitation, the DNAs were dissolved in 100 µl of 10% piperidine (Aldrich) and heated to 90 °C for 30 min, and then the piperidine was removed by lyophilization. The DNAs were dissolved in 95% formamide and 20 mM EDTA, heated at 90 °C for 4 min, and then chilled on ice. A portion of the sample was fractionated on a 12% denaturing polyacrylamide gel. The bands were visualized by autoradiography.
Electrophoretic Mobility
The purified stalled products described above were dissolved in 10 µl of DMS buffer. The DNAs were heated at 90 °C for 15 min, followed by cooling to room temperature for 12 h, and subjected to 15% polyacrylamide gel electrophoresis in a buffer containing 45 mM Tris borate (pH 8.3) and 1 mM EDTA at room temperature at 8.3 V/cm. The bands were visualized by autoradiography.
Our previous studies suggested that a temperature-sensitive unusual DNA conformation(s) (30, 31) might be formed within long tracts of CTG·CAG, CGG·CCG, and GTC·GAC repeats that arrested DNA synthesis in vitro (29, 37). To further investigate this idea, we isolated some of the stalled products from the gels shown in Fig. 1 and performed chemical DNA sequence analyses.
For (CTG·CAG)130 in pRW1981, three strong stalled
products were observed at sites corresponding to the G residues in the
37th, 39th, and 41st CAG triplets, respectively (Fig. 1). DNA sequence analyses (Fig. 2A) revealed that the stalled
product at the 37th repeat contained 8 repeats of CAG adjacent to the
5′-flanking sequence (88 nt). In addition, the complementary sequence
(87 or 88 nt; described below) to the 5′-flanking region was contiguous on the 3′-side of (CAG)8 (Fig. 2A), and the
product molecule ended with the sequence complementary to the primer.
The stalled products at the 39th and 41st repeats also contained
homogeneous (CAG)10 and (CAG)12, respectively,
that were contiguous with the 5′-flanking and the primer-complementary
sequences (Fig. 2A). We are not certain that the 3′-flanking
sequence contained an A residue adjacent to G5. However,
the A residue might not have been incorporated, since subtracting 24 nt
(the length of the synthesized (CAG)8) from 111 nt (the
length between the first pausing site and the beginning of the CAG
triplets) gives 87 nt rather than 88 nt, the known length of the
flanking region (Fig. 1).
Since the distance between the primer-binding site and the beginning of
the triplet repeat units influenced the length of the stalled product
(29), we analyzed pRW3262. For pRW3262, four strong arrest sites were
observed at the G residues in the 29th, 31st, 33rd, and 35th CAG
triplet units from the beginning (Fig. 1). The first three of these
products were identified to contain (CAG)8,
(CAG)10, and (CAG)12, respectively, as well as the 5′-flanking (63 nt) and the primer-complementary (63 nt) sequences (Fig. 2B). The other stalled product (35th unit) was not
sequenced due to poor recovery. The three arrest products observed for
pRW1981 and pRW3262 contained (CAG)8, (CAG)10,
and (CAG)12, indicating that the previously observed dependence of the
stall positions on the distance between the primer and
the TRS was due to the different lengths of the flanking
sequences.
For pRW1981 and pRW3262, another strong arrest product was found
(open arrowheads in Fig. 1). DNA sequence analyses revealed that these stalled products contained (CTG)20, and the
reactions were extended through the 3′-flanking sequences of the
plasmids, indicating that the products were derived from the deletion
of the plasmids containing (CTG·CAG)130. In fact, these
plasmid preparations were found to contain small amounts (~15%) of
deletions. Although the products prematurely terminated by DNA
polymerase were observed when the CAG strand was the template for
CTG·CAG (29), we could not identify the stalled products due to their
weaker intensity.
In the case of the CGG·CCG
repeats in pRW3306, the paused products at the 30th, 31st, and 32nd CGG
triplets also contained symmetric sequences in the 5′-flanking (73 nt)
and 3′-flanking (72 nt) regions, along with the triplet repeat units,
which were composed of (CGG)6, (CGG)7, and
(CGG)8, respectively (Fig. 2C). The three arrest
sites were identified at the 3rd nucleotide (CGG) in the
CGG triplet unit (Fig. 1); hence, for the 30th product, the 18th nt of the (CGG)6 tract was the stalled site. The distance between
the primer and the beginning of the triplets was 73 nt, and thus, there
were 91 nt from the primer to the arrest site. However, the 3rd
nucleotide (CGG) in the 6th CGG unit formed a base pair
with the C residue in the 5′-flanking sequence. Therefore, the
difference of 1 nucleotide between the 73-nt 5′-flanking and 72-nt
3′-flanking regions was due to the base pairing of the two complementary sequences in the hairpin (discussed below (Fig. 4B)).
Stalled Products from GTC·GAC
For pRW3416 containing
(GTC·GAC)98, the arrest product at the G residue in the
28th CGA triplet unit (Fig. 1) was composed of (CGA)7CG,
the 60-nt 5′-flanking sequence, and its complementary 60-nt sequence on
the 3′-side (Fig. 2D). This result indicates that the 8th A
residue was not incorporated during DNA synthesis, suggesting that DNA
polymerase was arrested at the G residue in the 7th CGA unit, which was
similar to our observations for the CTG·CAG and CGG·CCG repeats,
where DNA polymerase arrested at the G residues (Figs. 1 and 2). This
result was consistent with the observations that the 83 nt between the
beginning of the triplets and the arrest site were composed of
(CGA)7CG (23 nt) and the 3′-flanking sequence that was
complementary to the 60-nt 5′-flanking sequence. The stalled products
at the 29th, 30th, and 32nd triplet units (Fig. 1) were not sequenced
due to their scarcity.
In summary, these results revealed that the stalled products contain short tracts of homogeneous TRS and the symmetric flanking sequences, suggesting that they were produced by the termination of DNA synthesis due to template switching (see "Discussion").
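The length bookkeeping behind these assignments can be sketched as a short consistency check. It assumes, as the arithmetic in the text implies, that the apparent stall position (counted in nucleotides from the first nucleotide of the repeat tract) equals the length of the synthesized repeat tract plus the length of the 3′-flanking sequence. The function name and the tuple layout are illustrative only; the numbers are those quoted above for Figs. 1 and 2.

```python
def tract_length_nt(stall_triplet, pos_in_triplet, flank3_nt):
    """Infer the repeat-tract length (nt) of a stalled product.

    stall_triplet  -- apparent triplet number of the stall site on the gel
    pos_in_triplet -- position (1-3) of the stalled nucleotide in that triplet
    flank3_nt      -- length of the 3'-flanking (fold-back) sequence
    """
    # Apparent distance from the start of the repeats to the stall site.
    offset = 3 * (stall_triplet - 1) + pos_in_triplet
    # What is left after removing the 3' flank is the repeat tract itself.
    return offset - flank3_nt


# (plasmid, stall triplet, position in triplet, 3' flank nt, reported tract nt)
OBSERVED = [
    ("pRW1981", 37, 3, 87, 24),  # (CAG)8
    ("pRW1981", 39, 3, 87, 30),  # (CAG)10
    ("pRW1981", 41, 3, 87, 36),  # (CAG)12
    ("pRW3262", 29, 3, 63, 24),  # (CAG)8
    ("pRW3262", 31, 3, 63, 30),  # (CAG)10
    ("pRW3262", 33, 3, 63, 36),  # (CAG)12
    ("pRW3306", 30, 3, 72, 18),  # (CGG)6
    ("pRW3306", 31, 3, 72, 21),  # (CGG)7
    ("pRW3306", 32, 3, 72, 24),  # (CGG)8
    ("pRW3416", 28, 2, 60, 23),  # (CGA)7CG, stalled at the G of CGA
]


def all_consistent(rows=OBSERVED):
    """Return True if every reported tract length matches the arithmetic."""
    return all(
        tract_length_nt(triplet, pos, flank) == reported
        for _, triplet, pos, flank, reported in rows
    )
```

Under this reading, pRW1981 and pRW3262 give the same tract lengths even though their apparent stall sites differ by 8 triplets, because their flanking sequences differ by 24 nt.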
Formation of Hairpin Structures
The formation of hairpin
structures with duplex antiparallel conformations has been proposed for
oligonucleotides containing single-stranded CTG, CAG, CGG, CCG, GTC,
and GAC repeats with varying degrees of stability from thermodynamic,
electrophoretic gel mobility, and NMR studies (38-49). Thus, we
conducted investigations to determine if the stalled products, which
had the potential to exist as hairpins as revealed by DNA sequence
analyses, were, in fact, hairpins or were in other conformations
(i.e. linear duplexes). Polyacrylamide gel electrophoretic
analyses showed that the stalled products that had been heat-denatured
and then annealed migrated as single bands; the size of these products was almost identical to the duplex length formed by the 5′- and 3′-halves of the products (data not shown). These results suggested that the stalled products formed duplex hairpin structures.
Chemical and enzymatic probe analyses (38-42) were used to identify
hairpin DNA conformations at the base pair level in oligomers containing TRS, making use of the specific reactivities for the stems
and for the single-stranded loops. Therefore, we performed analyses
using BAA and DEPC to identify the structures formed in the homogeneous
TRS stalled products: BAA reacts specifically with adenines and
cytosines in single-stranded DNA (50); DEPC reacts at the N-7 positions
of adenines and guanines in single-stranded DNA (50). For the stalled
products containing CAG repeats, only the C base in the 5th CAG triplet
at the 37th stalled site (Fig. 1), which was identified to contain
(CAG)8 (Fig. 3A), was
hypersensitive to BAA, but the rest of the C and A bases in
(CAG)8 as well as the flanking sequences were not
specifically modified. However, the DEPC modifications were observed at
all A bases in (CAG)8, except for the A base in the 8th
unit, as well as at the G base in the 4th unit; the A bases in the 4th
and 5th repeats reacted strongly, whereas the A bases in the other
repeats (1st, 2nd, 3rd, 6th, and 7th repeats) reacted less strongly.
For the stalled products at the 39th and 41st repeats, which contained
(CAG)10 and (CAG)12, respectively, BAA strongly
modified the C bases in the 6th and 7th triplet units, respectively,
and DEPC was hyper-reactive at the middle A bases and the G base in the
5th and 6th units, respectively. These results indicate that the
hairpin loops were composed of 4 bases (AGCA) and that the hairpin
stems were composed of the other residues in the CAG repeats and the
symmetric flanking regions (Fig. 4A).
However, the strength of the A·A mismatch pairings in the hairpin
stems for the CAG repeats was dependent on the repeat lengths: A·As
embedded in the stems nearest the G·C tracts were stronger than the
A·As nearest the loops (Fig. 4A). Hence, the longer
lengths of CAG repeats probably form more stable hairpins.
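The loop and stem assignments for the CAG tracts can be illustrated with a purely geometric sketch. It assumes a symmetric fold of the repeat tract around a central 4-base loop and classifies each stem position as a Watson-Crick pair or a mismatch; it does not model pairing strength or chemical reactivity, and the function name is hypothetical.

```python
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def fold_hairpin(tract, loop_len=4):
    """Fold a repeat tract symmetrically around a central loop.

    Returns (loop, pairs), where pairs is a list of tuples
    (5' position, 3' position, 5' base, 3' base, kind) using 1-based
    positions, ordered from the open end of the stem toward the loop.
    """
    stem_len, remainder = divmod(len(tract) - loop_len, 2)
    if remainder or stem_len < 1:
        raise ValueError("tract cannot fold symmetrically around this loop")
    loop = tract[stem_len:stem_len + loop_len]
    pairs = []
    for i in range(stem_len):
        five, three = tract[i], tract[len(tract) - 1 - i]
        kind = "WC" if (five, three) in WATSON_CRICK else "mismatch"
        pairs.append((i + 1, len(tract) - i, five, three, kind))
    return loop, pairs


loop, pairs = fold_hairpin("CAG" * 8)
mismatches = [p for p in pairs if p[4] == "mismatch"]
```

For (CAG)8, (CAG)10, and (CAG)12 this fold places AGCA in the loop, and every mismatch in the stem is an A·A pair flanked by C·G and G·C pairs. An odd number of repeats cannot satisfy the symmetric 4-base-loop assumption, so the function raises an error for those tracts.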
For CGG·CCG, BAA specifically modified the C bases in the 4th CGG units at the 30th and 31st arrested sites and the C base in the 5th unit at the 32nd stalled site, whereas neither the other C bases in the CGG repeats nor the C and A bases in the flanking regions were modified (Fig. 3B). DEPC weakly modified the G bases in the 3rd and 4th CGG units at the 30th and 32nd stalled sites, respectively. However, at the 31st arrest site, the 2nd G base in the 3rd CGG unit (CGG) was strongly modified, and the G bases in the 4th CGG unit were weakly reactive (Fig. 3B). The rest of the G bases in the CGG repeats as well as the G and A bases in the flanking regions were not sensitive to DEPC. These results indicate that, unlike the case of the CAG repeats described above, the loop structures were different depending on the repeat lengths. In addition, G·G mismatch base pairings were formed to give duplexes in the stems (Fig. 4B), in which the Hoogsteen base pairing involved atoms at O-6 and N-7 of the guanine in the syn-conformation with atoms at N-1 and N-2 of the anti-paired guanine (51). In fact, the DMS sequencing reactions, which attack N-7 of guanine (50), protected the G bases in the CGG repeats of the stem regions to a certain extent (Figs. 2C, 3B, and 4B).
For GTC·GAC, at the 28th and 30th stalled sites, similar modifications were observed by BAA and DEPC (Fig. 3C), except for the difference in the stem structures due to the repeat lengths. BAA modifications were observed in the middle of the GTC repeats; the C base in the 5th repeat was strongly modified, and the C and A bases in the 4th repeat were slightly modified at the 28th stalled site. The A bases in the middle region (3rd, 4th, and 5th repeats) were also hypersensitive to DEPC. The rest of the A bases were weakly modified. These modifications indicate the presence of unpaired A bases in the stems, and the loop regions contained 7 bases (Fig. 4C). This hairpin loop, which is larger than those observed for CTG·CAG and CGG·CCG (Fig. 4), is probably a manifestation of the different stabilities of the hairpin structures. At the 29th arrest site, BAA modification was observed only at the C base in the 5th repeat, and strong DEPC modifications were observed at the A base in the 4th repeat and at the G and A bases in the 5th repeat, indicating a different loop structure from that found at the 28th and 30th stalled sites (Fig. 3C).
In summary, chemical probe analyses of the stalled products revealed that different types of loop structures formed in the hairpins as well as the stem structures depending on the TRS.
Effect of Secondary Structure-destabilizing Agents on Stalling
Betaine alters DNA stability by reducing the dependence of DNA thermal melting transitions on base pair composition (52, 53). 2 M betaine in the DNA primer extension reactions effectively suppressed pausing by DNA polymerases in G·C-rich regions (52). Thus, we performed polymerase studies to investigate if betaine reduces or eliminates the arrest sites. The primer-template complexes for (CGG·CCG)160 in pRW3306 were incubated in the presence of 2 M betaine at 37 °C for 30 min before the DNA elongation reactions by the Klenow fragment; the stallings were not significantly influenced (data not shown). Other studies with higher concentrations of betaine (3.3 and 5.2 M) were not informative due to the general inhibition of DNA polymerization. Hence, betaine did not alter the secondary structure (30, 31) in the TRS, in agreement with the observation that the stallings were heat-resistant up to 90 °C.
Furthermore, we also investigated the effect of SSB on the stallings. SSB is known to prevent the formation of secondary structures such as hairpins and triplexes, which cause the pausing of DNA polymerases (24, 54). Also, SSB stabilizes plasmids containing long tracts of TRS in E. coli (9). Different amounts (1, 2, and 10 µg) of SSB were preincubated at 37 °C for 30 min with the primer-template complexes for (CGG·CCG)160 in pRW3306. The addition of 10 µg of SSB decreased the stallings by 70%, whereas 1 or 2 µg showed an ~20% reduction (data not shown). These results indicate that, in contrast to betaine treatment, SSB may stabilize TRS by preventing the formation of DNA secondary structures (9).
This study revealed that the
strongly arrested in vitro products of DNA polymerases (the
Klenow fragment of E. coli DNA polymerase I and human DNA
polymerase β), which were observed in the region distal (~30-40
triplet units) from the beginning of long tracts of CTG·CAG,
CGG·CCG, and GTC·GAC, contain short tracts of homogeneous TRS
embedded in the middle of symmetric flanking sequences. Hence, these
products have the capability to form hairpin structures. Thus, we
propose that the strong stallings were caused by termination of DNA
synthesis due to the hairpin formation of the TRS followed by primer
realignment (Fig. 5). First, the DNA polymerase might encounter a flexible and writhed TRS (30, 31), which impedes the
progression of DNA synthesis. Pausings in the proximal regions of the
TRS (~12 triplet units) (brackets in Fig. 1) could be
intermediate paused products during the DNA synthesis; however, since
they have not been sequenced, their identity is uncertain. Second, the
idling of the impeded DNA polymerase might enable DNA slippage in the
nascent strand. Third, the hairpin which was formed might allow for
primer realignment, creating a functional primer 3′-end. Fourth, as the
template is now switched, the DNA chain would be elongated using the
nascent strand as a template. Finally, DNA synthesis will be terminated
at the end of the primer-nascent strand molecule. Hence, the stalling
of DNA polymerase in the proximal region (~12 triplet units) caused
by an unusual TRS conformation (30, 31) is critical for template
switching.
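The sequence of steps proposed above can be written as a toy string model. This is a minimal sketch under several assumptions: the primer and flanking sequences below are arbitrary placeholders, not the actual pUC19 or insert sequences; hairpin folding is represented only implicitly; and the one-nucleotide shortening of the 3′ arm discussed for pRW1981 and pRW3306 is ignored.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.translate(_COMPLEMENT)[::-1]


def stalled_product(primer, flank, repeat_unit, n_repeats):
    """Model the terminated product of primer realignment.

    Steps 1-2: the primer is extended through the flank and n_repeats
    copies of the repeat on the original template; the polymerase idles.
    Step 3: the repeat tract folds into a hairpin, so the 3'-end now
    faces the nascent strand's own 5' part (represented implicitly).
    Steps 4-5: synthesis copies that 5' part and terminates at the
    5'-end of the primer.
    """
    nascent = primer + flank + repeat_unit * n_repeats
    self_template = primer + flank
    return nascent + reverse_complement(self_template)


# Hypothetical placeholder sequences, for illustration only.
PRIMER = "GATCCATGGTACCTGAA"
FLANK = "TCGACCTGCAGGCATGC"

product = stalled_product(PRIMER, FLANK, "CAG", 8)
```

The resulting molecule has the architecture reported for the sequenced products: a homogeneous repeat tract in the middle of two mutually complementary arms, ending with the sequence complementary to the primer.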
Unusual DNA structures including triplexes and hairpins cause premature termination by DNA polymerases in vitro (23, 24, 54-56). Our previous study showed that double-stranded (not single-stranded) DNA is required for the termination (29). Long duplex tracts of CTG·CAG and CCG·CGG are more flexible and writhed than random sequence DNA (30, 31), which may cause the aberrant DNA polymerization reactions. We observed that SSB reduced the stallings, probably by preventing the formation of the flexible and writhed conformation(s). In contrast, the stallings were not influenced by betaine or heat treatment (29), which indicates the thermodynamic stability of the flexible and writhed structure.
Hairpin Formation in TRS
The discovery of hairpins formed in vitro is important for confirming, at least partially, our in vivo model of genetic instabilities (6, 11, 28, 38, 43, 57). Other investigations with relatively short oligonucleotides postulated models for hairpin structures of CTG·CAG, CGG·CCG, and GTC·GAC repeats on the basis of NMR, thermodynamic, and polyacrylamide gel electrophoretic studies (38-49). Moreover, Mitas et al. (38-41) have used enzymatic and chemical probe analyses to identify the single-stranded regions in the hairpins. However, our structures (Fig. 4) are somewhat different from those proposed previously (38-41), which may be due to the conditions investigated (physical properties versus DNA polymerase reactions). Also, our investigations provided information on the minimum length appropriate to adopt hairpin structures during DNA synthesis. In the case of CAG repeats in the nascent strand when CTG was the template, (CAG)8, (CAG)10, and (CAG)12 were the terminated products, indicating their stability as hairpin structures. The other repeat lengths, such as (CAG)6, (CAG)7, (CAG)9, (CAG)11, and (CAG)13, might be too unstable to form hairpin structures. This could explain why the CAG repeat units differed in length by 2 repeats.
For CGG·CCG, the terminated products contained (CGG)6,
(CGG)7, and (CGG)8. The distance between the
two stallings was 1 CGG unit, instead of 2 as observed for CTG·CAG,
indicating that the capacity to form hairpins for CGG repeats was not
influenced by the different repeat lengths even though two different
loop structures were found (Fig. 4B). However, the hairpin
structures might be influenced by the flanking sequences since the last
incorporated G nucleotide was paired with the C nucleotide from the
5′-flanking sequence. Therefore, we cannot rule out the formation of
another type of hairpin structure that is composed of G·G mismatches
between the 2nd G bases (39, 43) in the CGG unit instead of the 3rd G
bases.
The termination of DNA
synthesis was dependent on the processivity of the DNA polymerases
as well as temperature. The Klenow fragment of E. coli DNA
polymerase I and human DNA polymerase β produced strong terminated
products in the distal region of the TRS insert, whereas Sequenase
showed stallings principally in the proximal region (29). It is
plausible that since the Klenow fragment and human DNA polymerase β
have low processivity (58, 59), they stalled upon encountering the
flexible and writhed conformation (30, 31), resulting in idling of the DNA polymerase, which allows the template switching observed in this
study. On the other hand, even though Sequenase was arrested in the
proximal TRS region (29), it could pass by the unusual DNA conformation
due to its high processivity (58, 59). As expected, when we studied the
processive Taq polymerase at 70 °C in a parallel
experiment, no distal stalling was observed (data not shown).
Plasmids containing inverted repeats produced a stem-loop DNA structure during DNA replication in E. coli due to template switching (11, 18). We have shown that certain TRS produced similar stem-loop structures during DNA synthesis in vitro; hence, replication may be terminated even in vivo. In fact, the growth kinetics of E. coli harboring plasmids containing CTG·CAG repeats are influenced by the repeat length; as the length of CTG·CAG repeats is increased, the growth rate is reduced (8).2 Thus, replication of the TRS in vivo may be subject to the stalling of DNA synthesis, which could result in the slower growth.
The stalling by DNA polymerases may play an important role in the expansion process in vivo. As an alternative to the third line in Fig. 5, if the template was not switched but the DNA polymerase continued to copy the original template strand, expansion of the product molecule would ensue. Hence, certain types and lengths of TRS in human chromosomes may cause the stalling of DNA polymerases, which promotes genetic expansion.
We thank Drs. K. J. Marians and W. A. Beard and R. R. Iyer for helpful suggestions and Dr. S. H. Wilson for advice and critically reviewing the manuscript.