(Received for publication, November 26, 1996, and in revised form, March 22, 1997)
From the Center for Genome Research, Institute of
Biosciences and Technology and the Department of Biochemistry and
Biophysics, Texas A & M University, Texas Medical Center, 2121 Holcombe
Blvd., Houston, Texas 77030, the Department of Biochemistry
and Molecular Genetics, University of Alabama at Birmingham, School of
Medicine, Birmingham, Alabama 35294-0005, and the Department of
Biochemistry, Tufts University, School of Medicine, 136 Harrison
Ave., Boston, Massachusetts 02111
The properties of duplex CTG·CAG and CGG·CCG,
which are involved in the etiology of several hereditary
neurodegenerative diseases, were investigated by a variety of methods,
including circularization kinetics, apparent helical repeat
determination, and polyacrylamide gel electrophoresis. The bending
moduli were 1.13 × 10⁻¹⁹ erg·cm for CTG and
1.27 × 10⁻¹⁹ erg·cm for CGG, ~40% less than for
random B-DNA. Also, the persistence lengths of the triplet repeat
sequences were ~60% the value for random B-DNA. However, the
torsional moduli and the helical repeats were 2.3 × 10⁻¹⁹ erg·cm and 10.4 base pairs (bp)/turn for CTG and
2.4 × 10⁻¹⁹ erg·cm and 10.3 bp/turn for CGG,
respectively, all within the range for random B-DNA. Determination of
the apparent helical repeat by the band shift assay indicated that the
writhe of the repeats was different from that of random B-DNA. In
addition, molecules of 224-245 bp in length (64-71 triplet repeats)
were able to form topological isomers upon cyclization. The low bending moduli are consistent with predictions from crystallographic variations in slide, roll, and tilt. No unpaired bases or non-B-DNA structures could be detected by chemical and enzymatic probe analyses,
two-dimensional agarose gel electrophoresis, and immunological studies.
Hence, CTG and CGG are more flexible and highly writhed than random
B-DNA and thus would be expected to act as sinks for the accumulation of superhelical density.
Eleven human genetic disorders (including fragile X syndrome, myotonic dystrophy, Kennedy's disease, Huntington's disease, spinocerebellar ataxia type 1, dentatorubral-pallidoluysian atrophy, and Friedreich's ataxia) are characterized at the molecular level by the expansion of DNA triplet repeats (CTG, CGG, or AAG) from <15 copies in normal individuals to scores of copies in affected cases (1-6). In some cases, the CTG and CGG tracts are transcribed into mature mRNAs, whereas the AAG tracts in Friedreich's ataxia are in the first intron of the frataxin gene. The mechanism for expansion is not known, but it may involve slippage of the complementary strands during DNA synthesis (7-10). Expanded alleles undergo further expansions upon passage to offspring and, in some diseases, are associated with the clinical observation called anticipation, whereby the symptoms become more severe in each successive generation and with an earlier age of onset (1-5). This is a novel type of mutation and shows non-mendelian genetic transmission (11, 12).
Prior investigations suggested that triplet repeat sequences (TRS) do not have the properties of random B-DNA. First, CTG tracts greatly facilitate nucleosome assembly (13-15), which, in turn, may repress transcription. Second, DNA synthesis in vitro pauses at specific loci in fragments containing CTG and CGG (16). Third, long tracts of AAG and AGG form intramolecular triplexes that arrest DNA synthesis (17). Fourth, CTG and CGG migrate up to 30% more rapidly than expected on polyacrylamide gel electrophoresis, whereas their migration is normal on agarose gels (18). Fifth, CTG is preferentially expanded in Escherichia coli compared with the other nine TRS (8). Sixth, the frequency of expansions and deletions in E. coli (7, 9, 10) is influenced by the direction of replication, suggesting the formation of stable hairpin loops in the lagging strand template or the newly synthesized nascent strand.
Conformational investigations were conducted on plasmids and restriction fragments containing CTG and CGG to evaluate their role in the biological behaviors described above. Several methods were applied, including circularization kinetics, apparent helical repeat determinations, the rate of migration through acrylamide and agarose gel electrophoresis, chemical and enzymatic probe analyses, two-dimensional gel electrophoresis, and the induction of an immune response. The analyses indicate that both CTG and CGG exist as fully paired, right-handed B-helices. However, their flexibilities are substantially greater than that of random B-DNA, and this causes the TRS to be more writhed. As a result, the average superhelical density of a DNA domain containing a TRS region will be unevenly distributed, a higher density being concentrated within the TRS tracts. This finding is unprecedented and enables the hypothesis that part of the biological response elicited by CTG and CGG is mediated by topological features associated with their increased flexibility.
Cloning of Recombinant Plasmids
Recombinant plasmids with (CTG)n and (CGG)n inserts used for the cyclization experiments were obtained by cloning a synthetic duplex that had XbaI and BamHI ends flanking (CTG)36 or (CGG)24 into pUC19-NotI cleaved with XbaI and BamHI. The top-strand sequence of the 5′-XbaI → 3′-BamHI insert was TCTAGAGGATCGCTCTTCG(TRS)nCGAAGAGCGGATCGCTAGCGGATCC. 3′ to the TRS was an NheI site (GCTAGC) that allowed the TRS-containing fragments to self-anneal via the complementary CTAG resulting from XbaI-NheI cleavage. Plasmids with longer lengths of TRS were obtained as described (9, 19). Inserts were sequenced on both strands.
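The length of the fragment released from these constructs can be checked directly from the insert sequence. The short Python sketch below is an illustration only: it assumes the usual cleavage positions for XbaI (T^CTAGA) and NheI (G^CTAGC), and the function names are hypothetical rather than taken from the paper.

```python
# Minimal sketch: top-strand length of the TRS fragment released by XbaI + NheI.
# Function names are illustrative; cut positions are the standard ones (assumed).

LEFT_FLANK = "TCTAGAGGATCGCTCTTCG"         # begins with the XbaI site TCTAGA
RIGHT_FLANK = "CGAAGAGCGGATCGCTAGCGGATCC"  # contains NheI (GCTAGC) and BamHI (GGATCC)


def top_strand(repeat: str, n: int) -> str:
    """Top-strand sequence of the synthetic insert carrying n copies of the repeat."""
    return LEFT_FLANK + repeat * n + RIGHT_FLANK


def xbai_nhei_fragment(repeat: str, n: int) -> str:
    """Top strand remaining after XbaI (T^CTAGA) and NheI (G^CTAGC) cleavage."""
    seq = top_strand(repeat, n)
    start = seq.index("TCTAGA") + 1   # XbaI cuts after the first T of its site
    end = seq.rindex("GCTAGC") + 1    # NheI cuts after the first G of its site
    return seq[start:end]


if __name__ == "__main__":
    for repeat, n in [("CTG", 36), ("CTG", 80), ("CGG", 24), ("CGG", 73)]:
        fragment = xbai_nhei_fragment(repeat, n)
        print(f"({repeat}){n}: {len(fragment)} nt, 5' end {fragment[:4]}")
```

Under these assumptions the top strand of the released fragment is 3n + 32 nucleotides long and begins with the CTAG overhang, for example 140 nt for (CTG)36 and 104 nt for (CGG)24.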
Cloning of the 32 plasmids containing (CTG)n and (CGG)n inserts used for the apparent helical repeat determination has been reported (7-10, 19). In addition, five plasmids harboring random sequence DNA inserts were obtained by cloning HaeIII restriction fragments of pUC18 into HincII of pUC19-NotI.
Kinetics of Circularization: Theory of Ring Closure
In a random-coil chain, the distribution (W) of the end-to-end distance (v) is given by the normalized gaussian function (20, 21),
(Eq. 1)
(Eq. 2)
For a linear DNA duplex with cohesive ends, free in solution, the term (3/2πn)^(3/2)·l⁻³, or J-factor, specifies the concentration of intramolecular ends in dV and is directly correlated to the equilibrium constant Kc for the reaction M ⇌ C (Kc = [C]/[M]), where C and M are circular and linear monomers, respectively. The association and dissociation rate constants between any two ends are determined by their homogeneous distribution throughout the volume of the system, i.e. by the equilibrium constant for the intermolecular association 2M ⇌ D (Ka = [D]/[M]²), where [D] is the concentration of linear dimers. It is assumed that the noncovalent interactions formed during the intramolecular and intermolecular reactions are identical and that the entropy change (ΔS) at the reactive site is the only factor determining the rate constants (no influence from length and composition of the molecule). Under these conditions, Kc = KaJ, whereby the concentration of circular monomers J is given by the ratio of the intramolecular to the intermolecular equilibrium constants (22). In this formulation, Ka is the observed equilibrium constant K* multiplied by a factor related to the permutation number by which the monomers can associate to give the same dimer. Under steady-state kinetic conditions, Kc ∝ k1 and Ka ∝ k2 for Reactions A and B,
(Reaction A)
(Reaction B)
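For orientation, the ring-closure term discussed above can be evaluated numerically. The sketch below assumes the standard Gaussian-chain form, (3/2πn)^(3/2)/l³, for a chain of n statistical segments of length l, and converts it to a molar concentration. The function name and the example inputs are hypothetical, and the Gaussian limit holds only for chains much longer than the fragments studied here, which is why a worm-like chain treatment is used for the actual fits.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def gaussian_j_factor_molar(n_segments: float, segment_length_angstrom: float) -> float:
    """Effective molar concentration of one chain end near the other
    for an ideal Gaussian chain: (3 / (2*pi*n))**1.5 / l**3."""
    l_cm = segment_length_angstrom * 1e-8  # Å -> cm
    per_cm3 = (3.0 / (2.0 * math.pi * n_segments)) ** 1.5 / l_cm ** 3
    return per_cm3 * 1000.0 / AVOGADRO     # molecules/cm^3 -> mol/L


if __name__ == "__main__":
    # Hypothetical example: 10 segments of 950 Å each.
    print(f"J = {gaussian_j_factor_molar(10, 950.0):.2e} M")
```

In this limit J falls off as n^(-3/2), so quadrupling the number of segments lowers J eightfold.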
Determination of k1
(Eq. 3)
Determination of k2
(Eq. 4)
Interpolation Formulas
The log J(M) values were plotted against log nbp. Log J(M) is a complex oscillatory function of log nbp. Interpolation of log J(M) with the equations derived by Shimada and Yamakawa (25) for the ring-closure probabilities of a twisted worm-like chain yields the bending modulus α, the torsional modulus β, the length of the Kuhn segment λ⁻¹ (λ⁻¹ = 2P, P being the persistence length), and the helical repeat h0 of the DNA. Three successive computational steps were performed.
Equations 5-9 were used as the starting functions to construct the theoretical curve for J1(L). J1(L) evaluates the behavior of the contour length of the DNA and is not complicated by the twist dependence of cyclization. The dependence on twist arises from the fact that linear DNA molecules with a non-integral number of helical turns need to untwist (or overtwist) to cyclize. G(0,u0|u0;L) expresses the length-dependent probability of ring closure for a polymer with the end tangents specified. L denotes the reduced contour length, defined as the ratio of the contour length of the DNA chain (nbp × 3.4 Å) to the length of the Kuhn segment,
(Eq. 5)
(Eq. 6)
(Eq. 7)
(Eq. 8)
(Eq. 9)
σ is Poisson's ratio and is related to the bending and torsional moduli by α/β = 1 + σ. σ establishes the upper and lower boundaries for the oscillating log J(M). Here, r is a periodic function of L (0 ≤ r ≤ 0.5) that reproduces the varying fractional helical turn (and therefore twist) of the DNA chain. For the evaluation of the upper and lower boundaries, r was set equal to 0 and 0.5 to follow the log J(M) values of DNA fragments with 0 and 0.5 fractional helical turns, respectively. Equations 10-14 were used to construct a theoretical J(L)* function, and the previous conversion factor was then used to transform J(L)* into log J(M)*.
(Eq. 10)
(Eq. 11)
(Eq. 12)
(Eq. 13)
(Eq. 14)
τ0 is the constant torsion and determines the period of oscillation of log J(M). τ0 is related to the helical repeat (h0) of the DNA by τ0 = 2π/(h0·lbp), where lbp is the distance between base pairs, 3.4 Å.
The theoretical J(L) value was evaluated with
Equations 10-14, where r was taken in small increments
according to the following,
(Eq. 15)
The previous conversion factor was again used to convert J(L) into log J(M). τ0 was varied to find a good fit to the experimental log J(M) values. For illustration purposes, log nbp was converted to nbp in the figures. For all of the DNAs, the r + 1 term in Equation 10 was omitted because |ΔLk|/(1 + σ) […]. This has been shown to cause large errors in the extrapolations (25).
Satisfactory values for λ⁻¹, σ, and τ0 were estimated by visual inspection of the fits, and no statistical tests were performed. It should be noted that the torsional (β) moduli obtained by these fits were identical to those estimated manually based on pairs of molecules having r (ΔTw) values of 0 and 0.5 (23).
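The relation between the constant torsion and the helical repeat stated in this section, 2π divided by the product of h0 and the 3.4 Å base pair spacing, is simple to evaluate. The following Python sketch is illustrative only; the function names are hypothetical, and the helical repeats used in the example are the fitted values reported later in the text.

```python
import math

BP_RISE_ANGSTROM = 3.4  # distance between base pairs used in the text


def constant_torsion(helical_repeat_bp: float) -> float:
    """Constant torsion in radians per Å: 2*pi / (h0 * l_bp)."""
    return 2.0 * math.pi / (helical_repeat_bp * BP_RISE_ANGSTROM)


def twist_per_bp_degrees(helical_repeat_bp: float) -> float:
    """Twist angle between adjacent base pairs implied by the helical repeat."""
    return 360.0 / helical_repeat_bp


if __name__ == "__main__":
    for name, h0 in [("CTG", 10.41), ("CGG", 10.35), ("random", 10.46)]:
        print(f"{name}: torsion = {constant_torsion(h0):.4f} rad/Å, "
              f"twist = {twist_per_bp_degrees(h0):.2f} deg/bp")
```

Because the torsion enters only through h0, varying it during the fit shifts the positions of the maxima and minima of log J(M) along the length axis.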
Variance of Writhe (⟨Wr²⟩) and Free Energy of Supercoiling (nbpK/RT)
These computations are reported in detail in the accompanying paper (26).
Apparent Helical Repeat Determination
30 µg of plasmid DNA was treated with chicken erythrocyte DNA topoisomerase I at 0 °C overnight and purified by phenol/chloroform extraction and ethanol precipitation. The resulting set of topoisomers was resolved by agarose gel electrophoresis. This was performed with 0.5 µg of DNA at 23 °C (60 V for 18 h) on 1% agarose containing 40 mM Tris-HCl, 25 mM sodium acetate, and 1 mM EDTA at pH 8.3. The gel dimensions were 20 cm × 22 cm × 3 mm. Positively supercoiled molecules were obtained by relaxing the plasmids at 0 °C and performing the electrophoresis at 23 °C, whereas negatively supercoiled plasmids were generated by relaxation at 37 °C and electrophoresis at 23 °C. Average unwinding was 0.012°/°C/bp. Apparent helical repeat (h0) was calculated (27) from an average of three determinations (with a S.D. of ±0.1 bp/turn). Methylation of the CGG tracts was performed by treating plasmid DNA with SssI methylase in the presence of S-adenosylmethionine at 37 °C overnight. To check for complete reaction, cleavage studies with AciI (CCGC) and PAGE analysis were conducted.
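The sign and approximate size of the supercoiling introduced by the temperature shift follow from the unwinding value quoted above. The Python sketch below is a rough illustration: the plasmid size of 2900 bp is a hypothetical example, and the function name is not from the paper.

```python
UNWINDING_DEG_PER_C_PER_BP = 0.012  # average unwinding reported in the text


def linking_difference(n_bp: int, relax_temp_c: float, gel_temp_c: float) -> float:
    """Approximate linking difference (in turns) of a plasmid relaxed at one
    temperature and analysed at another; positive means positive supercoils."""
    delta_t = gel_temp_c - relax_temp_c
    return n_bp * UNWINDING_DEG_PER_C_PER_BP * delta_t / 360.0


if __name__ == "__main__":
    n_bp = 2900  # hypothetical plasmid size, for illustration only
    print(f"relaxed at 0 C,  run at 23 C: {linking_difference(n_bp, 0, 23):+.2f} turns")
    print(f"relaxed at 37 C, run at 23 C: {linking_difference(n_bp, 37, 23):+.2f} turns")
```

For a plasmid of this size the two protocols give roughly +2 and -1 turns, which places a few topoisomers on either side of the relaxed position on the gel.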
Other Methods
Chemical probe analyses, polyacrylamide gel electrophoresis, and induction of mouse antibodies were conducted as referred to below.
To calculate the J-factors for the TRS, eight plasmids containing 36-80 consecutive CTG repeats (140-272 bp long) and 19 plasmids containing 24-73 consecutive CGG repeats (104-251 bp long) were prepared (Table I). The TRS-containing restriction fragments had a protruding CTAG at each end that allowed the intramolecular or intermolecular association (Reactions A and B under "Experimental Procedures") to take place. k1 and k2 were measured under conditions in which the intermediates C and D were converted into covalently closed products by T4 DNA ligase (23). Fig. 1 shows representative time course reactions and plots. For k2, two molecules of different lengths were chosen and blunted at one end. This scheme enables the molecules to dimerize via the single CTAG end, but disallows the intramolecular circularization. Fig. 1A shows a PAGE separation of the products between (CGG)24 and (CGG)40. Three ligated species were observed, which correspond to dimers of (CGG)24, dimers of (CGG)24 and (CGG)40, and dimers of (CGG)40 in the ratio of 1:2:1 (20, 25). In general, higher order aggregates were also detected at concentrations within a few percent of the linear dimers. These aggregates may have originated from blunt-end ligations.
The rate of disappearance of the combined linear monomers was plotted as a function of time. Fig. 1B shows the results for (CGG)24 plus (CGG)40. The average k2 obtained from four determinations, two with (CTG)n and two with (CGG)n fragments, was (0.89 ± 0.48) × 10² M⁻¹ s⁻¹ when normalized for an enzyme concentration of 1 × 10⁻⁹ M. The error associated with k2 was quite large and may be a reflection of both experimental variation and differences in T4 DNA ligase-DNA interactions between the (CTG·CAG)n and (CGG·CCG)n sequences. Nevertheless, the average value was comparable (~20% lower) to the value of 1.12 × 10² M⁻¹ s⁻¹ measured previously (28) for DNAs with different sequences.
k1 was measured on each of the restriction fragments in Table I under conditions that inhibited the intermolecular association, i.e. low concentrations of DNA and ligase. Fig. 1C shows the electrophoretic separation of a kinetic reaction for (CTG)59. One major ligated product was seen, corresponding to circularized (CTG)59. A fainter band was also evident, which was the result of dimerization events. For some other fragments, linear trimers and multimers also were seen, depending on the amount of ligase that was needed to detect the cyclized monomer. The rate of accumulation of the circular product was quantitated and plotted as a function of time. Fig. 1D shows the time-dependent circle formation for (CGG)59 at three ligase concentrations. In this experiment, k1 obtained from the three plots was (2.39 ± 0.32) × 10⁻⁵ s⁻¹ when normalized for 1.0 × 10⁻⁹ M T4 DNA ligase. In general, the errors associated with k1 were much smaller than for k2.
The log J(M)
(k1/k2) values are
graphed in Fig. 2 as a function of the number of base
pairs. Also shown (bottom panel) are the molar
J-factors for random sequence DNA of comparable lengths as
determined previously (23). This panel was included for a comparison
between the TRS and random sequence DNA. Our results show that
(a) the log J(M) values for
(CGG)n and (CTG)n are ~10- and ~20-fold higher,
respectively, than for random DNA, indicating that both TRS are very
flexible; and (b) the widely oscillating behavior of log
J(M) reflects the requirement of torsionally aligned ends and suggests the presence of fully paired duplexes (23).
It should be noted that all of the values of log J(M) are affected by the choice of k2. Therefore, if we were to consider the upper limit of k2 (0.89 + (2 × 0.48) = 1.85), then all of the J(M) values would be reduced by ~50% (log J(M) lowered by ~0.3). Nevertheless, these values would still be higher than for random DNA by 5-10-fold.
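The dependence of the J-factor on the choice of k2 can be made concrete with the rate constants quoted above. The Python sketch below takes k1 = 2.39 × 10⁻⁵ s⁻¹ and k2 = (0.89 ± 0.48) × 10² M⁻¹ s⁻¹; the exponent signs are assumed here, and the function name is hypothetical.

```python
import math

# Rate constants as quoted in the text (exponent signs are an assumption).
K1 = 2.39e-5      # s^-1, cyclization rate constant for one fragment
K2_MEAN = 0.89e2  # M^-1 s^-1, average bimolecular rate constant
K2_SD = 0.48e2    # M^-1 s^-1, standard deviation of k2


def j_factor(k1: float, k2: float) -> float:
    """Molar J-factor as the ratio of cyclization to dimerization rate constants."""
    return k1 / k2


if __name__ == "__main__":
    j_mean = j_factor(K1, K2_MEAN)
    j_low = j_factor(K1, K2_MEAN + 2 * K2_SD)  # the upper limit of k2 gives the lowest J
    print(f"J (mean k2)  = {j_mean:.2e} M, log J = {math.log10(j_mean):.2f}")
    print(f"J (upper k2) = {j_low:.2e} M, log J = {math.log10(j_low):.2f}")
    print(f"ratio        = {j_low / j_mean:.2f}")
```

With these inputs J is about 2.7 × 10⁻⁷ M, and using the upper limit of k2 roughly halves it, which corresponds to a downward shift of about 0.3 in log J(M).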
To estimate the torsional and bending moduli, we fit the experimental data to the equations for the ring-closure probabilities of a twisted worm-like chain (25). The oscillating solid line in Fig. 2 (top two panels) represents the fit to the experimental data as specified by the parameters τ0, λ⁻¹, and σ. τ0, or continuous torsion, describes the twist angle between adjacent base pairs and is related to the number of bp/turn in the DNA helix (h0). λ⁻¹ is related to the persistence length (P) and, therefore, to the bending flexibility of the DNA. Finally, σ, or Poisson's ratio, is related to the torsional stiffness of the chain. The upper and lower boundaries in log J(M) (dotted lines in the top two panels) correspond to the values for linear molecules with integral numbers of helical turns (upper) and those with a fractional 0.5 helical turn (lower). The fits describe the experimental data very well.
The values of λ⁻¹, σ, and τ0 (converted to h0) along with the calculated bending (α) and torsional (β) moduli are reported in Table II. As anticipated, the values of the Kuhn segment (λ⁻¹) for CTG and CGG were much lower than that for random DNA (556 and 630 Å versus 950 Å, respectively), indicating that both TRS have a high degree of flexibility (low bending modulus α). By contrast, the torsional modulus β (2.3 and 2.4 × 10⁻¹⁹ erg·cm versus 2.4 × 10⁻¹⁹ erg·cm) and the helical repeat h0 (10.41 and 10.35 bp/turn versus 10.46 bp/turn) were close to those of random DNA, showing that fragments containing CTG and CGG form right-handed B-type helices under these conditions. Considering the S.D. associated with k2 and the three interpolation steps, the accuracy of these values (Table II and Fig. 2) is within ±10%. This error also applies to the results from the analyses that follow (see Figs. 3 and 4) on the variance of writhe and the free energy of supercoiling.
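The link between the Kuhn segment length and the bending modulus can be illustrated with the worm-like chain relation, in which the bending modulus equals kT times the persistence length (half the Kuhn length). The Python sketch below applies it to the three Kuhn lengths quoted above and also evaluates Poisson's ratio from the ratio of bending to torsional modulus. The temperature of 293.15 K is an assumption, since it is not stated in this section, and the function names are hypothetical.

```python
BOLTZMANN_ERG_PER_K = 1.380649e-16
ASSUMED_TEMPERATURE_K = 293.15  # assumption: about 20 °C


def bending_modulus(kuhn_length_angstrom: float,
                    temperature_k: float = ASSUMED_TEMPERATURE_K) -> float:
    """Bending modulus (erg·cm) = kT * P, with P equal to half the Kuhn length."""
    persistence_cm = 0.5 * kuhn_length_angstrom * 1e-8  # Å -> cm
    return BOLTZMANN_ERG_PER_K * temperature_k * persistence_cm


def poisson_ratio(bending: float, torsional: float) -> float:
    """Poisson's ratio from bending/torsional = 1 + sigma."""
    return bending / torsional - 1.0


if __name__ == "__main__":
    for name, kuhn in [("CTG", 556.0), ("CGG", 630.0), ("random", 950.0)]:
        print(f"{name}: bending modulus = {bending_modulus(kuhn):.3e} erg·cm")
    print(f"random DNA: sigma = {poisson_ratio(1.9e-19, 2.4e-19):.2f}")
```

At the assumed temperature the computed moduli agree with the reported 1.13, 1.27, and 1.9 × 10⁻¹⁹ erg·cm to within about 1-2%, and the random-DNA moduli give a Poisson's ratio close to -0.2.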
The choice of λ⁻¹ (950 Å), α (1.9 × 10⁻¹⁹ erg·cm), and β (2.4 × 10⁻¹⁹ erg·cm) for B-DNA was based on the following. Measurements of the persistence length (P) performed by cryoelectron microscopy (29), cyclization kinetics (25, 30), Monte Carlo simulation (31), and electro-optic techniques (32, 33) yielded a consensus value close to 500 Å. Measurements of β performed by ring closure, Monte Carlo simulation, topoisomer distribution, fluorescence depolarization, and electron paramagnetic resonance experiments produced results that ranged from 1.5 to 3.6 × 10⁻¹⁹ erg·cm (reviewed in Ref. 34). The value of 2.4 × 10⁻¹⁹ erg·cm (23) was selected because it was within the range of 2.0-2.4 × 10⁻¹⁹ erg·cm measured by cyclization kinetics (23, 30, 35). This choice of α and β yields σ = −0.20, which is equal to the value measured experimentally on plasmid DNA at low superhelical densities (36). It should be noted that in the analyses that follow, a variation in β from 1.5 to 3.6 × 10⁻¹⁹ erg·cm for random DNA would not change the outcome of the conclusions (26).
The data reported in Table II enable the prediction of the topological behaviors of the three DNA species. Fig. 3 shows the results of an analysis based on the ring-closure probability J1 (34), which enables the determination of the optimal length for circularization (J1max). The probability J1 takes into account the proximity of the two ends within the volume dV, as well as the correct orientation of their tangents, but does not consider the twist-dependent alignment (25).
The J1/J1max ratio reflects the efficiency of circularization as a function of length. The end points of the curves indicate the nbp at which J1 is maximum. An interpolation for phased A-tract DNA is also shown. These molecules circularize ~1000 times more efficiently than random sequence DNA due to phased, in-plane, static curvatures (28, 37). For random DNA, the calculated optimal length is 552 bp, whereas for fragments containing (CTG)n and (CGG)n, the calculated optimal lengths are 326 and 366 bp, respectively, or ~2.4 times greater than for A-tract DNA (~140 bp). Hence, the decrease in persistence length in (CTG)n and (CGG)n causes an ~40% drop in the optimal length of circularization as compared with random DNA.
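A quick arithmetic check suggests that the optimal lengths quoted above correspond to nearly the same reduced contour length (contour length divided by the Kuhn segment length) for all three DNAs. The Python sketch below uses the 3.4 Å rise per base pair and the Kuhn lengths of 950, 556, and 630 Å given in the text; the function name is hypothetical.

```python
BP_RISE_ANGSTROM = 3.4

# (optimal length in bp, Kuhn segment length in Å), as quoted in the text
OPTIMA = {"random": (552, 950.0), "CTG": (326, 556.0), "CGG": (366, 630.0)}


def reduced_contour_length(n_bp: int, kuhn_length_angstrom: float) -> float:
    """Contour length (n_bp * 3.4 Å) divided by the Kuhn segment length."""
    return n_bp * BP_RISE_ANGSTROM / kuhn_length_angstrom


if __name__ == "__main__":
    random_bp = OPTIMA["random"][0]
    for name, (n_bp, kuhn) in OPTIMA.items():
        print(f"{name}: L = {reduced_contour_length(n_bp, kuhn):.2f}, "
              f"optimal length relative to random DNA = {n_bp / random_bp:.2f}")
```

All three optima fall near a reduced contour length of 2, so the shorter optimal lengths of the TRS fragments follow directly from their shorter Kuhn segments.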
The second prediction concerns the writhe and the free energy of supercoiling. Writhe quantitates the out-of-plane trajectory of the helix axis in circular DNA (38). Its variance, ⟨Wr²⟩, is directly related to λ⁻¹ (34). Fig. 4 (inset) compares ⟨Wr²⟩ as a function of chain length. (CTG)n and (CGG)n display greater variations than random DNA, in the order CTG > CGG > random DNA, with differences increasing rapidly with length. Thus, we conclude that the range of conformations adopted by (CTG)n and (CGG)n is greater than for random DNA.
In sufficiently long molecules, twisting of the two ends before ring closure produces a set of topological isomers that differ in the number of helical turns (Lk). The strain generated by twisting is then partitioned with writhe, which gives rise to tertiary turns (τ), or supercoils. The distribution of the topoisomer population is described by ⟨(ΔLk)²⟩ = ⟨Wr²⟩ + ⟨(ΔTw)²⟩ (see definitions under "Experimental Procedures"). The free energy associated with supercoiling (ΔGτ) is related to τ by ΔGτ = Kτ², where K is the apparent twisting coefficient. K and ⟨(ΔLk)²⟩ are related by nbpK/RT = nbp/(2⟨(ΔLk)²⟩), where R is the gas constant, and T the absolute temperature. A computation of nbpK/RT is shown in Fig. 4. The calculations indicate a rapid decrease in this range. The values for (CTG)n and (CGG)n are always lower than for random DNA, implying that, for a defined length, it will require less energy to supercoil (CTG)n and (CGG)n than random DNA. In other words, these TRS will act as a "sink" for localizing writhe when embedded in DNA of random sequence.
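The two relations used in this paragraph (the linking number variance as the sum of the writhe and twist variances, and nbpK/RT as nbp divided by twice that variance) can be sketched in a few lines of Python. The variances in the example are hypothetical placeholders chosen only to show the trend; the actual computations are described in the accompanying paper (26).

```python
def linking_number_variance(writhe_variance: float, twist_variance: float) -> float:
    """Variance of the linking difference as the sum of writhe and twist variances."""
    return writhe_variance + twist_variance


def reduced_supercoiling_energy(n_bp: int, lk_variance: float) -> float:
    """n_bp * K / RT = n_bp / (2 * variance of the linking difference)."""
    return n_bp / (2.0 * lk_variance)


if __name__ == "__main__":
    n_bp = 250
    twist_var = 0.04  # hypothetical value, identical for both chains
    for label, writhe_var in [("stiffer chain", 0.002), ("more flexible chain", 0.010)]:
        var = linking_number_variance(writhe_var, twist_var)
        print(f"{label}: nbpK/RT = {reduced_supercoiling_energy(n_bp, var):.0f}")
```

With the twist variance held fixed, the chain with the larger writhe variance has the lower nbpK/RT, which is the sense in which a more flexible tract is easier to supercoil.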
Additional evidence that
(CTG)n and (CGG)n writhe more easily than random DNA
comes from the analysis of cyclized products in the ring-closure
experiments. Fig. 5 (left panel) shows the
pattern of (CGG)63 (221 bp) before the addition of T4 DNA
ligase (lane 1) and 4 min after the enzyme was added (lane 2). The linear monomer (m1) was
converted to the linear dimer (d1) plus one circular
monomer (c). Lanes 1 and 2 for
(CGG)70 (242 bp) report identical time courses. However, in
this case, the linear monomer (m2) was converted to two circular species (ct1 and
ct2) in almost equal amounts. A similar pattern was
also seen for (CGG)71 (data not shown). Analogously, the
reaction performed on (CTG)54 (194 bp) (right
panel) ligated the monomer (M1) into a linear dimer (D1) and a single circular species
(C1). On the other hand, the reaction performed on
(CTG)64 (224 bp) converted the linear monomer
(M2) into four circular species
(CT1 to CT4). Thus,
(CTG)64 circularized to give four isomers, whereas for
(CGG)70/71, only two species were observed, despite their
longer length. (CGG)70, (CGG)71, and
(CTG)64 had fractional helical turns (ΔLk0; ΔLk0 = Lk0 − Int(Lk0)) of 0.38, 0.67, and 0.52, respectively. The formation of topoisomers is
facilitated in linear molecules that have ΔLk0 close to
0.5. In fact, during ring closure, such molecules need to untwist, or
overtwist, to align their ends. If both of these movements occur,
topoisomers will be observed. The energetic barrier to be overcome by
this process is inversely proportional to chain length (Fig. 2); thus,
for very short chains, only untwisting, or overtwisting, is practically
observed. The finding that fragments of CGG of 242 and 245 bp and of
CTG of 224 bp form topoisomers in almost equimolar amounts indicates
that this barrier is low at these lengths. This contrasts with random
DNA, for which fragments of up to ~245 bp, with
ΔLk0 ~ 0.5, were shown to circularize in mostly one species (39, 40).
Thus, experimental results and theoretical predictions fully agree that
the onset and distribution of topoisomeric species follow the order
CTG > CGG > random DNA.
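The fractional helical turns quoted in this paragraph follow from the fragment lengths and the fitted helical repeats (10.35 bp/turn for CGG and 10.41 bp/turn for CTG). The Python sketch below reproduces the calculation; the function name is hypothetical.

```python
import math


def fractional_turn(n_bp: int, helical_repeat_bp: float) -> float:
    """Fractional part of the number of helical turns, Lk0 - Int(Lk0)."""
    lk0 = n_bp / helical_repeat_bp
    return lk0 - math.floor(lk0)


if __name__ == "__main__":
    fragments = [("(CGG)70", 242, 10.35), ("(CGG)71", 245, 10.35), ("(CTG)64", 224, 10.41)]
    for name, n_bp, h0 in fragments:
        print(f"{name}: {n_bp} bp, fractional turn = {fractional_turn(n_bp, h0):.2f}")
```

A value of 0.5 means that untwisting and overtwisting by half a turn are equally costly, which is the situation in which two topoisomers are expected in comparable amounts.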
Writhe in Plasmids Containing (CTG)n and (CGG)n
To further test the above predictions, we
performed band shift assays, a method based on the relative
electrophoretic velocities of a family of circular plasmids (27, 41).
Enzymatic nicking and closing of circular DNA result in a gaussian distribution of topoisomers that peaks at the planar species (ΔLk closest to 0). Flanking isomers differ by ±1 in their value of ΔLk due to pivoting of the ends about the nick. If a segment of xbp containing an integral number of helical turns is inserted into a plasmid of nbp, then the topoisomers of A (A = nbp) and B (B = nbp + xbp) have identical ΔLk values and, therefore, identical writhe. Since writhe affects the electrophoretic mobility, the migration pattern of A and B will be identical when corrected for the size difference. On the other hand, if xbp contains a non-integral number of helical turns, topoisomers A and B will have different ΔLk as well as writhe. This will cause a shift in migration, which may be used to calculate the h0 of xbp.
It is obvious that the calculation of h0 by this method requires that the writhe of the inserted fragment be the same as that of the vector. If the bending force of xbp is different from that of nbp, the magnitude and the partition of writhe between xbp and nbp will be modified, and the velocity of migration altered.
Fig. 6A shows a representative agarose gel
comparing plasmids containing n repeats of CGG with those
containing n+a repeats. The latter is then compared with a
plasmid containing n+a+b repeats and so on. Studies were
also performed on a family of plasmids containing random DNA ranging in
length from 90 to 270 bp (Fig. 6B). Each of the
triangles corresponds to the h0
calculated from inserting xbp of random DNA into
a vector. The average value of h0 (10.26 ± 0.1 bp/turn) is in excellent agreement with that reported previously
(27, 41). The straight horizontal line passing through these
points is a good indication that the writhe of the random sequence
inserts is the same as that of the vector.
Fig. 6 (B (circles), C, and
D) shows the effect of inserting various lengths of
(CTG)n or (CGG)n into plasmids. The widely oscillating
behavior of the calculated h0 reflects complex
changes in the topology of the molecules, as expected if the writhe of
the inserted TRS differs from that of the vector. The results include
inserts with methylated cytosine residues or naturally occurring
polymorphisms (panel D) as well as positively and negatively
writhed molecules (panel C). In all experiments, the
calculated h0 was not a constant value, as
observed for the random DNA inserts. In addition, the results of
panel D substantiate the observed lack of influence (42) on
the J-factor by methylated cytosines. Thus, the writhe of
(CTG)n and (CGG)n is different from that of random
sequence DNA, and this change causes drastic alterations in the
topology of the supercoiled molecules. Specifically, due to their low
bending modulus α, the TRS regions must be the most flexible or
contortable domains of the plasmids and, hence, the preferential sites
for the partitioning of supercoil density.
Chemical and enzymatic probes used for characterizing non-B-DNA conformations such as cruciforms, B-Z junctions, triplexes, nodule DNA, and unpaired AT-rich regions (38, 43-45) were employed on plasmids containing various lengths of CTG and CGG. Exhaustive analyses were performed with bromoacetaldehyde, osmium tetroxide, dimethyl sulfate, potassium permanganate, chloroacetaldehyde, copper-phenanthroline, diethyl pyrocarbonate, S1 nuclease, and DNase I under a wide variety of experimental conditions. No reactivities were detected that would indicate accessible bases or unpaired regions as found for the conformations identified above. However, positive internal control studies with a cruciform or B-Z junctions confirmed the validity of the interpretations.
Also, two-dimensional agarose gel electrophoresis has been widely used to study supercoil-driven structural transitions (38, 43-45). These studies were performed on plasmids containing from 24 to 240 CGG repeats (19) (both non-methylated and methylated) and on CTG repeats containing 75 and 130 units (7-10) (data not shown). No supercoil-induced transitions were observed.
Furthermore, a 420-bp BamHI fragment containing (CTG)130 (16) behaved immunologically as expected for a fully base-paired B-DNA of high G + C content. It competed as strongly as calf thymus DNA for binding to a monoclonal antibody that favors G/C sequences in B-DNA conformation and less strongly for a monoclonal antibody that favors A/T sequences. It did not react with a monoclonal antibody specific for single-stranded DNA. When (CTG)130-methylated bovine serum albumin complexes with adjuvant were injected into three normal mice, there was no antibody response above that stimulated by adjuvant alone, and no antibodies specific for the flexible (CTG)130 structure were formed. Conformational variants (such as Z-DNA, cruciforms, triplex, A-helix, or single-stranded DNA) do induce structure-specific antibodies (46). In summary, all of the above structural analyses revealed that the flexible (CTG)n and (CGG)n TRS behave as fully paired, right-handed B-DNA and that there are no structural transitions induced by supercoil density that are detected by these methods.
Alternatively, nondenaturing PAGE analyses were performed on restriction fragments containing various lengths of CTG (18) and CGG (9-240 repeats, both methylated and non-methylated), which revealed their expected rapid migration (by up to 30%). However, these sequences migrated normally on agarose gel electrophoresis. The increased velocity could be abolished by treatment with chloroquine or reduced by the addition of 7 M urea (data not shown). These results further confirm the inherent structural properties of these TRS. Comparable studies conducted on the other eight TRS (17, 47) (repeat lengths investigated as follows: AGT, 77 and 45; ATG, 91 and 49; AAC, 58 and 42; ACC, 140 and 46; AGG, 74 and 53; AAG, 100 and 52; GTC, 98 and 62; and TTA, 90 and 31) showed normal mobilities except for the longest ACC and GTC, which displayed the behavior of CTG and CGG, but to a smaller extent (data not shown).
TRS (CTG, CGG, and AAG) occupy a central role in our comprehension of several human hereditary neuromuscular diseases. This report documents the flexible structure (with an inherently greater writhe than random sequence DNA) of right-handed B-helical CTG·CAG and CGG·CCG based on cyclization kinetics, helical repeat determinations, and PAGE analyses.
The low persistence lengths of (CTG)n and (CGG)n
reflect an enhanced flexibility along the helix axis. Two mechanisms
contribute to the deflection of the helix axis from linearity: dynamic
thermal motion and static bends (48, 49). A static bend is represented
by a wedge between 2 adjacent bp, which may be caused by the geometry
of the base pair step itself or by an intercalated ligand or amino acid
residue(s). Thermal motion produces constant fluctuations of the helix
axis, and consequently, the observed (apparent) persistence length
(Pa) is composed of static (Ps)
and dynamic (Pd) components (1/Pa = (1/Ps) + (1/Pd)) (48, 49). For
a sequence that repeats regularly along the helix,
Ps → ∞, and therefore, Pa = Pd.
Both (CTG)n and (CGG)n are a monotonous succession of
trinucleotide units, each one occupying the same position along the
helix every other turn (3 nucleotides × 7 = 21 nucleotides/2h0, h0 ≈ 10.5 bp/turn). Hence, the macroscopic, idealized shape of a TRS should
be that of a straight rod, in the absence of thermal motion. The
estimated Pd for straight random B-DNA is ~800 Å (29). The values of Pa (=Pd) of
278 and 315 Å measured for (CTG)n and (CGG)n,
respectively, are ~60% Pa (500 Å) and ~40%
Pd for random DNA. This suggests that one or more
dinucleotide steps within each trinucleotide repeat unit are more
flexible than average (50) and/or that there are flexible hinges along
the sequence.
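The decomposition of the apparent persistence length into static and dynamic components, 1/Pa = 1/Ps + 1/Pd, can be explored numerically with the values quoted in this paragraph. In the Python sketch below the function names are hypothetical, and the static component computed for random DNA is a derived illustration rather than a value reported in the text.

```python
def static_persistence_length(p_apparent: float, p_dynamic: float) -> float:
    """Solve 1/Pa = 1/Ps + 1/Pd for the static component Ps (same units as inputs)."""
    return 1.0 / (1.0 / p_apparent - 1.0 / p_dynamic)


def fraction_of(value: float, reference: float) -> float:
    """Ratio of a measured persistence length to a reference value."""
    return value / reference


if __name__ == "__main__":
    pa_random, pd_random = 500.0, 800.0  # Å, values quoted in the text
    print(f"random DNA: implied Ps = {static_persistence_length(pa_random, pd_random):.0f} Å")
    for name, pa in [("CTG", 278.0), ("CGG", 315.0)]:
        print(f"{name}: {fraction_of(pa, pa_random):.0%} of Pa, "
              f"{fraction_of(pa, pd_random):.0%} of Pd for random DNA")
```

The ratios come out near 60% of the apparent and 40% of the dynamic persistence length of random DNA, in line with the percentages given above.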
Of the 10 possible combinations, (CTG)n contains 1:3 each of AG/CT, CA/TG, and GC/GC, whereas (CGG)n contains 1:3 each of GC/GC, GG/CC, and CG/CG. Thus, both (CTG)n and (CGG)n share GC/GC. The geometry (flexibility at the dinucleotide step) of duplex DNA is best analyzed by x-ray crystallography.
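The dinucleotide step composition described above can be derived mechanically from the repeat unit. The Python sketch below treats a step and its reverse complement as the same duplex step (for example AG/CT) and lists the steps present in a tandem repeat; the function names are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def duplex_step(step: str) -> frozenset:
    """A dinucleotide step identified with its reverse complement."""
    return frozenset((step, reverse_complement(step)))


def repeat_steps(repeat_unit: str) -> set:
    """Distinct duplex dinucleotide steps in a long tandem array of the repeat unit."""
    cyclic = repeat_unit + repeat_unit[0]  # include the step across the repeat junction
    return {duplex_step(cyclic[i:i + 2]) for i in range(len(repeat_unit))}


if __name__ == "__main__":
    all_steps = {duplex_step(a + b) for a, b in product("ACGT", repeat=2)}
    print(f"unique duplex steps: {len(all_steps)}")
    for unit in ("CTG", "CGG"):
        labels = sorted("/".join(sorted(s)) for s in repeat_steps(unit))
        print(f"({unit})n: {', '.join(labels)}")
```

Each repeat contributes three distinct steps in equal proportion, and the only step common to both repeats is GC/GC.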
Studies indicate that CA/TG is highly polymorphic (51-54), especially in the degree of slide, roll, and twist, and is very dynamic (55). GC/GC also is rated high on a "flexibility" scale (third in one analysis and fifth in the other) (52, 54). This dinucleotide step is associated with high roll wedge angles, which may be stabilized by Mg2+ ions (56). AG/CT is not well represented in the family of crystal structures and, therefore, was excluded from one of the studies (54). For GG/CC and CG/CG, the most recent study indicates that they are flexible (54).
For each of the 10 dinucleotide steps, the mean roll and tilt angles are estimates of the equilibrium values, while the statistical scatter in these angles reflects the intrinsic flexibility for roll and tilt. It has been shown that such an analysis, averaged over all sequences, gives a reasonable value for the persistence length of B-DNA (52). We note that in the dinucleotide steps discussed above, CA/TG, CG/CG, GC/GC, and GG/CC are ranked second through fifth in terms of flexibility, out of 10 (AG/CT is ranked eighth). The dinucleotide flexibilities can then be used to predict the relative flexibilities of the different TRS (Table III). CGG and CTG are predicted to be the third and fourth most flexible, out of a total of 12 sequences. Interestingly, the TRS (ACC and GTC) that ranked first and fifth also showed anomalously rapid PAGE mobilities.
Flexibility may also be caused by the unpairing of the double helix (35). Due to their repetitive nature, the complementary strands of (CTG)n as well as (CGG)n may slide relative to each other following transient melting and, therefore, form slipped structures (38, 43) of varying size, which exist in low proportions, at multiple sites. Because of their random location, these structures might escape detection by chemical probe analyses. However, they would also be expected to cause the loss of dependence of the J-factors on the fractional twist (35), an occurrence not observed experimentally.
Thus, despite the imprecision of the model, the limitations of the dinucleotide approximation, and uncertainties associated with crystal packing forces, the low persistence lengths of (CTG)n and (CGG)n are consistent with the variations in crystallographic values of slide, roll, and tilt.
Flexible (highly writhed) DNA is the first intrinsic, unusual DNA conformational feature associated with human hereditary neuromuscular diseases. It is tempting to speculate that this property promotes the slippage of complementary DNA strands and is responsible for the expansion and the non-mendelian transmission of these diseases (7-10, 38, 43). However, the role of flexibility (and writhe) in expansion, toroidal nucleosome structure (13-15), DNA polymerase pausing (16), recognition of methyl-directed mismatch repair enzymes (57), binding of certain specific proteins (58, 59), and preferential methylation of long CGG tracts (60) remains to be elucidated. The establishment of methodologies to investigate conformational problems in living cells (43-45) offers hope for the evaluation of these flexible and highly writhed structures in molecular mechanisms responsible for human genetic diseases.
We thank Timothy Farrell and Ela Klysik for technical help; Drs. R. L. Baldwin, S. D. Levene, and M. A. El Hassan for helpful discussions; Dr. J. C. Wang for critically reading the manuscript; and Dr. C. T. Caskey for providing plasmids containing 75 and 105 CTG·CAG repeats.