(Received for publication, December 27, 1996)
From the Center for Genome Research, Institute of Biosciences and Technology and the Department of Biochemistry and Biophysics, Texas A & M University, Texas Medical Center, 2121 Holcombe Blvd., Houston, Texas 77030-3303
The variance of writhe, the contribution of writhe to supercoiling, and the free energies of supercoiling were calculated for (CTG·CAG)n and (CGG·CCG)n triplet repeat sequences (TRS) by statistical mechanics from the bending and torsional moduli previously determined. Expansions of these sequences are inherited by non-mendelian transmission and are linked with several hereditary neuromuscular diseases. The variance of writhe was greater for the TRS than for random B-DNA. For random B-DNA, (CGG)n, and (CTG)n, the contribution of writhe to supercoiling was 70, 78, and 79%, whereas the free energy of supercoiling at a length of 10 kilobase pairs was 1040·RT, 760·RT, and 685·RT, respectively. These data indicate that the TRS are preferential sites for the partitioning of supercoiling. Calculations of the differences in free energy of supercoiling between the TRS and random B-DNA revealed a local minimum at ~520 base pairs. Human medical genetic studies have shown that individuals carrying up to 180-200 copies of TRS (540-600 base pairs, premutations) in the fragile X or myotonic dystrophy gene loci are usually asymptomatic, whereas large expansions (>200 repeats, full mutations), which lead to disease, are observed in their offspring. Therefore, the length corresponding to the local minimum in free energy of supercoiling correlates with the genetic breakpoint between premutation and full mutation. We propose that (a) TRS instability is mediated by DNA mispairing caused by the accumulation of supercoiling within the repeats, and (b) the expansions that take place at the premutation to full mutation threshold are associated with increased mispairing caused by the optimal partitioning of writhe within the TRS at this length.
Several human loci associated with neurodegenerative disorders have been shown to carry a new form of mutation, i.e. the expansion of a DNA triplet repeat sequence (TRS) with composition (CTG·CAG)n, (CGG·CCG)n, or (GAA·TTC)n, referred to as (CTG)n, (CGG)n, and (GAA)n, respectively (reviewed in Refs. 1-4). These diseases fall into two categories (1). The first, which includes spinal and bulbar muscular atrophy, Huntington's disease, spinocerebellar ataxia type 1, dentatorubral-pallidoluysian atrophy, and Machado-Joseph disease, is characterized by small expansions of a (CTG)n repeat, from ~10-40 units in the normal population to ~40-120 units in affected individuals; the repeat encodes a polyglutamine tract in the corresponding gene products. This mutation may impart a gain of function to the mature polypeptides that is deleterious to neuronal activity (1-3).
The second, which includes the myotonic dystrophy (dystrophia myotonica (DM)) and the fragile X and E (FRAXA and FRAXE) genes, is characterized by much larger expansions of either a (CTG)n or (CGG)n repeat, respectively, in the untranslated region of the genes. The mechanisms by which these expansions lead to disease are not fully understood (5, 6). The number of repeats is polymorphic in the normal population and ranges from 6 to 52 in FRAXA, from 6 to 25 in FRAXE, and from 5 to 37 in DM. Most asymptomatic carriers have expanded CTG or CGG tracts between 50 and 180 repeats in DM and between 60 and 200 repeats in FRAXA, respectively. The stability of the repeats decreases as their number increases, and offspring of carriers tend to inherit a higher number of repeats than the donor parent. However, inherited expanded alleles with >200 TRS copies in FRAXA, FRAXE, and DM loci lead to disease with a severity that is proportional to the number of repeats. Furthermore, the "quantum jump" of an inherited expansion may amount to thousands of repeats if the carrier approaches the threshold of 200 (1-3). These behaviors are unique to this second category of disease genes. Nevertheless, the mechanisms of expansion and the basis for the association of the "180-200" threshold with large expansions and the onset of disease remain unknown.
We have previously determined by circularization kinetics, helical repeat determination, and polyacrylamide gel electrophoresis that both (CTG)n and (CGG)n are highly flexible (7), being characterized by a persistence length ~40% shorter than random DNA (B-DNA) (278 Å for (CTG)n, 315 Å for (CGG)n, and 475 Å for B-DNA). This property enables these TRS to sustain higher levels of writhe (supercoiling) than random sequence DNA.
Herein, we quantitated the variance of writhe, its contribution to supercoiling, and the free energy of supercoiling for (CTG)n and (CGG)n of up to 10 kbp and compared them with the same parameters calculated for B-DNA. Interestingly, we found that the differences were greatest for DNA lengths around 200 repeats, whereas the free energies of supercoiling were always lower for the TRS than for B-DNA. These results suggest that supercoiling (writhe) plays a crucial role in the mechanism of expansion of (CTG)n and (CGG)n.
The calculations of the variance of writhe, the variance of twist, and the free energy of supercoiling were performed with the equations of Shimada and Yamakawa (8) developed by statistical mechanics for the probabilities of ring closure of a twisted worm-like chain. The following equations were used. For the variance of writhe, ⟨Wr²⟩,
[Equation image not available] (Eq. 1)
[Equation image not available] (Eq. 2)
[Equation image not available] (Eq. 3)
λ⁻¹ denotes the length of the Kuhn segment, whereas σ represents Poisson's ratio. The values of λ⁻¹ and σ employed in the calculations were 556 Å and −0.51 for (CTG)n, 630 Å and −0.46 for (CGG)n, and 950 Å and −0.20 for random B-DNA. These values were taken from circularization kinetic experiments performed on restriction fragments containing the DNA sequences of (CTG)n, (CGG)n, or random composition (7, 9, 10). For the TRS, all calculations pertain to perfect, non-interrupted sequences. λ⁻¹ and σ enable the calculation of the bending (α) and the twisting (β) moduli from the following relations,
[Equation image not available] (Eq. 4)
[Equation image not available] (Eq. 5)
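The two relations between (λ⁻¹, σ) and the moduli can be sketched as follows. This is a hedged reconstruction, not a transcription of Equations 4 and 5: it assumes the standard worm-like chain identities and a temperature of 20 °C (the temperature quoted for RT in the free energy calculations). It is offered because it reproduces the moduli quoted in the text, with σ = −0.20 for B-DNA being the sign that yields the 2.4 × 10⁻¹⁹ erg·cm torsional modulus.

```latex
\begin{align}
\alpha &= \tfrac{1}{2}\,\lambda^{-1} k_{B} T, &
\beta &= \frac{\alpha}{1+\sigma} \\[4pt]
% Worked check for random B-DNA: lambda^{-1} = 950 A, sigma = -0.20, k_B T = 4.05e-14 erg
\alpha_{\text{B-DNA}} &= \tfrac{1}{2}\,(950\times10^{-8}\,\mathrm{cm})(4.05\times10^{-14}\,\mathrm{erg})
  \approx 1.92\times10^{-19}\,\mathrm{erg\cdot cm} \\
\beta_{\text{B-DNA}} &= \frac{1.92\times10^{-19}\,\mathrm{erg\cdot cm}}{1-0.20}
  \approx 2.40\times10^{-19}\,\mathrm{erg\cdot cm}
\end{align}
```

With the same assumptions, the (CTG)n and (CGG)n parameters give torsional moduli of about 2.3 and 2.4 × 10⁻¹⁹ erg·cm, within the 2.0-2.4 × 10⁻¹⁹ erg·cm range cited for B-DNA.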
Closed circular plasmid DNA contains different numbers of helical turns when the linear form with cohesive ends is circularized and ligated (11, 12). This behavior is due to the fact that DNA is flexible and the helices bend and twist under the influence of thermal energy. Since the molecules vary in their degree of bend and twist at the time of closure, topological isomers (topoisomers) are formed. Within this population, which is described by a gaussian envelope, each isomer differs from its neighbor by one in the total number of helical turns. For a closed circular DNA, the linking number (Lk) is the number of times the two strands cross each other. The variance in Lk (⟨(ΔLk)²⟩), which describes the width of a distribution of topological isomers, results from the sum of the variance of writhe and twist: ⟨(ΔLk)²⟩ = ⟨Wr²⟩ + ⟨(ΔTw)²⟩ (13). Both the variance of writhe and the variance of twist may be calculated by statistical mechanics (8) from the length of the Kuhn segment, λ⁻¹ (which is twice the persistence length, P), and Poisson's ratio, σ. λ⁻¹ and σ have been determined experimentally by the kinetics of circularization of restriction fragments containing (CTG)n, (CGG)n, or B-DNA (7, 9, 10).
Fig. 1 shows the variance of writhe normalized for chain length, ⟨Wr²⟩/L, or reduced writhe, for B-DNA, (CGG)n, and (CTG)n repeats. The calculations (see "Experimental Procedures") show that in all cases the normalized writhe converges to 0.095 at infinite length. However, (CTG)n and (CGG)n approach the limit at much shorter lengths than B-DNA, implying greater fluctuations of the helix axis. The differences in ⟨Wr²⟩/L between the TRS and B-DNA, Δ(⟨Wr²⟩/L), vary with length (Fig. 1, inset). Δ(⟨Wr²⟩/L) rises sharply, reaches a maximum of 0.0260 at 730 bp for (CTG)n and 0.0202 at 780 bp for (CGG)n, and then declines to ~9 × 10⁻⁴ at 10 kbp. Hence, this analysis predicts that molecules of (CTG)n or (CGG)n that are 700-800 bp in length (230-270 repeats) have the greatest fluctuations in the helix axis compared with B-DNA.
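As a rough cross-check of the lengths quoted above, the following Python sketch converts base pairs to triplet repeats and to contour length in Kuhn segments. The 3.4 Å rise per base pair is an assumption (the text does not state the value used), and the function names are illustrative only.

```python
RISE_A_PER_BP = 3.4  # ASSUMED helical rise per base pair, in angstroms


def bp_to_repeats(n_bp):
    """Number of triplet repeats in a tract of n_bp base pairs."""
    return n_bp / 3.0


def reduced_length(n_bp, kuhn_length_A):
    """Contour length expressed in Kuhn segments (dimensionless)."""
    return n_bp * RISE_A_PER_BP / kuhn_length_A


# Positions of the maxima in the writhe difference, with the Kuhn lengths from the text.
for name, n_bp, kuhn in [("(CTG)n", 730, 556.0), ("(CGG)n", 780, 630.0)]:
    print(f"{name}: {n_bp} bp = {bp_to_repeats(n_bp):.0f} repeats, "
          f"{reduced_length(n_bp, kuhn):.2f} Kuhn segments")
```

Under these assumptions, both maxima fall at a contour length between 4 and 5 Kuhn segments of the respective repeat.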
Contribution of Writhe and Twist to the Linking Number (Lk)
ΔTw expresses the difference in twist between an unconstrained linear DNA molecule and the constrained closed species. The calculation of ⟨(ΔTw)²⟩ was performed according to Equation 2 under "Experimental Procedures." This formula expresses the linear relation between ⟨(ΔTw)²⟩ and DNA length. Since ⟨Wr²⟩ and ⟨(ΔTw)²⟩ both add to the distribution of topological isomers, their relative contribution to supercoiling may be obtained from the ratio ⟨Wr²⟩/⟨(ΔTw)²⟩. Fig. 2 shows this calculation for B-DNA, (CGG)n, and (CTG)n. The values increase rapidly at short DNA lengths and reach a plateau of 2.307 for B-DNA, 3.448 for (CGG)n, and 3.806 for (CTG)n at 10 kbp. The data show that writhe and twist contribute as follows to the supercoiling in long molecules: for B-DNA, 70 and 30%; for (CGG)n, 78 and 22%; and for (CTG)n, 79 and 21%, respectively. Thus, the ratios of (CTG)n and (CGG)n are greater than for B-DNA. Since the estimated torsional modulus for (CTG)n and (CGG)n is within the range of 2.0-2.4 × 10⁻¹⁹ erg·cm found for B-DNA by circularization kinetics (7, 10, 14, 15), we conclude that segments of (CTG)n or (CGG)n supercoil more efficiently than B-DNA due to an increased contribution from writhe.
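The percentages above follow directly from the plateau ratios, as the short Python sketch below illustrates. The second function is a hedged consistency check rather than the authors' formula: it assumes that the twist variance per Kuhn segment is (1 + σ)/(2π²), the usual worm-like chain form, and combines it with the limiting reduced writhe of 0.095 quoted in the text.

```python
import math


def writhe_fraction(ratio):
    """Fraction of the linking-number variance due to writhe,
    given the ratio <Wr^2>/<(dTw)^2>."""
    return ratio / (1.0 + ratio)


def limiting_ratio(sigma, wr_per_kuhn=0.095):
    """Long-chain limit of <Wr^2>/<(dTw)^2>.
    ASSUMPTION: <(dTw)^2>/L = (1 + sigma)/(2*pi^2), with L in Kuhn segments."""
    twist_per_kuhn = (1.0 + sigma) / (2.0 * math.pi ** 2)
    return wr_per_kuhn / twist_per_kuhn


# Plateau ratios at 10 kbp and Poisson's ratios taken from the text.
data = {"B-DNA": (2.307, -0.20), "(CGG)n": (3.448, -0.46), "(CTG)n": (3.806, -0.51)}
for name, (ratio, sigma) in data.items():
    f = writhe_fraction(ratio)
    print(f"{name}: writhe {100 * f:.0f}%, twist {100 * (1 - f):.0f}%, "
          f"assumed infinite-length ratio {limiting_ratio(sigma):.2f}")
```

Under the stated assumption, the infinite-length ratios come out near 2.34, 3.47, and 3.83, slightly above the 10-kbp values, which is consistent with the ratios still approaching their plateau at that length.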
Region of Hyperflexibility
The free energy of supercoiling, ΔG_ΔLk, is related to ΔLk by ΔG_ΔLk = K(ΔLk)² (13), where K, which is expressed in kcal·bp/mol and is also called apparent twisting coefficient, decreases with increasing DNA length (16, 17). Fig. 3 shows the calculation of K as nbpK/RT for random DNA, (CGG)n, and (CTG)n. nbpK reaches a constant value of 1040·RT (606 kcal/mol, RT = 0.5825 kcal/mol at 20 °C) at 10 kbp for B-DNA, 760·RT (443 kcal/mol) for (CGG)n, and 685·RT (399 kcal/mol) for (CTG)n. The values for the TRS are 27 and 34% lower, respectively, than that for random sequence DNA for chains longer than 3 kbp. Below 2 kbp, nbpK rises sharply and approaches 3400·RT (1980 kcal/mol) at 0 length for all DNAs. The differences in nbpK/RT between the TRS and B-DNA, Δ(nbpK/RT), are computed in Fig. 3 (inset A). ΔnbpK becomes progressively more negative with increasing DNA length, reaching a minimum of −1241·RT at 500 bp for (CTG)n (167 repeats, −723 kcal/mol) and −958·RT at 540 bp for (CGG)n (180 repeats, −558 kcal/mol). Thus, this analysis shows that (CTG)n and (CGG)n require less energy to supercoil (writhe) than random DNA. In addition, the difference in free energy of supercoiling between the TRS and random DNA is not uniform with length, but is largest in magnitude at ~520 bp. At this nbp, the variance in linking number, obtained from the relation ⟨(ΔLk)²⟩ = RT/2K, equals 0.100 for B-DNA, whereas it is 0.157 for (CGG)n and 0.190 for (CTG)n, an increase of 57 and 90%, respectively.
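The unit conversions and the variance relation used in this paragraph can be sketched in Python as follows. The gas constant value and the function names are supplied here for illustration; the only relation taken from the text is ⟨(ΔLk)²⟩ = RT/2K, rewritten in terms of nbpK/RT.

```python
R_KCAL_PER_MOL_K = 1.987204e-3   # gas constant in kcal/(mol*K)
T_KELVIN = 293.15                # 20 degrees C, as stated in the text
RT = R_KCAL_PER_MOL_K * T_KELVIN  # about 0.5825 kcal/mol


def to_kcal_per_mol(value_in_RT):
    """Convert a free energy expressed in units of RT to kcal/mol."""
    return value_in_RT * RT


def linking_number_variance(n_bp, nbpK_over_RT):
    """<(dLk)^2> = RT/(2K), with K = (nbpK)/n_bp, so the variance is n_bp / (2 * nbpK/RT)."""
    return n_bp / (2.0 * nbpK_over_RT)


def implied_nbpK_over_RT(n_bp, variance):
    """Invert the same relation: nbpK/RT implied by a given linking-number variance."""
    return n_bp / (2.0 * variance)


for name, plateau in [("B-DNA", 1040), ("(CGG)n", 760), ("(CTG)n", 685)]:
    print(f"{name}: {plateau} RT = {to_kcal_per_mol(plateau):.0f} kcal/mol")

# Values of nbpK/RT implied by the variances quoted at ~520 bp.
for name, var in [("B-DNA", 0.100), ("(CGG)n", 0.157), ("(CTG)n", 0.190)]:
    print(f"{name}: nbpK/RT at 520 bp = {implied_nbpK_over_RT(520, var):.0f}")
```

The implied differences relative to B-DNA at 520 bp come out near −940 for (CGG)n and −1230 for (CTG)n, of the same order as the minima quoted at 540 and 500 bp.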
K was also calculated for B-DNA, (CTG)n, and (CGG)n from nbpK/RT. Contrary to nbpK/RT, for which the values spanned a 3-5-fold range, K covered a range of several thousandfold from 0 to 10 kbp. For B-DNA, for example, the results obtained at 10 kbp, 1 kbp, 100 bp, and 10 bp were 0.06, 1.0, 19, and 200 kcal·bp/mol, respectively. As a consequence, when the differences in K were taken, the values were greatly amplified as the lengths approached 0, with their magnitude being highly sensitive to the initial choice of λ⁻¹ and σ. For these reasons, these analyses involving K were less rigorous than those with nbpK/RT. The differences in K between the TRS and random DNA are reported in Fig. 3 (inset B). As for Δ(nbpK/RT), ΔK did not decrease uniformly with length. A local minimum was located at 355 bp (118 repeats) for (CTG)n and at 400 bp (133 repeats) for (CGG)n, at which positions, ΔK was −1.7 and −1.2 kcal·bp/mol, respectively, ~1 kcal·bp/mol lower than expected if the differences were steadily decreasing.
Therefore, this analysis confirms that there is an optimal length at which (CTG)n and (CGG)n writhe more favorably than at shorter or longer lengths and indicates that the free energy of supercoiling further decreases by ~1 kcal·bp/mol at this optimal length.
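The sensitivity of K to λ⁻¹ and σ at short lengths can be illustrated with the following Python sketch. It rests on assumptions not stated in the text: that writhe fluctuations vanish as the length approaches 0, that the twist variance per Kuhn segment is (1 + σ)/(2π²), and that the rise per base pair is 3.4 Å. The function names are hypothetical.

```python
import math

RISE_A_PER_BP = 3.4  # ASSUMED helical rise per base pair, in angstroms
RT_KCAL = 0.5825     # kcal/mol at 20 degrees C, value quoted in the text


def zero_length_limit(kuhn_length_A, sigma):
    """Limit of nbpK/RT as the chain length goes to 0.
    ASSUMPTIONS: only twist fluctuates, with <(dTw)^2> = (1 + sigma) * L / (2*pi^2)
    and L the contour length in Kuhn segments; then nbpK/RT = n_bp / (2 * <(dTw)^2>)."""
    return math.pi ** 2 * kuhn_length_A / (RISE_A_PER_BP * (1.0 + sigma))


def twisting_coefficient(n_bp, nbpK_over_RT):
    """K obtained by dividing nbpK (in kcal/mol) by the number of base pairs."""
    return nbpK_over_RT * RT_KCAL / n_bp


for name, kuhn, sigma in [("B-DNA", 950.0, -0.20),
                          ("(CGG)n", 630.0, -0.46),
                          ("(CTG)n", 556.0, -0.51)]:
    limit = zero_length_limit(kuhn, sigma)
    print(f"{name}: nbpK/RT -> {limit:.0f} at 0 length, "
          f"K at 10 bp about {twisting_coefficient(10, limit):.0f}")

# K for B-DNA at 10 kbp, from the plateau value of 1040 quoted in the text.
print(f"B-DNA, 10 kbp: K = {twisting_coefficient(10000, 1040):.2f}")
```

Under these assumptions, the three limits fall between about 3300 and 3450, in line with the value of ~3400·RT quoted for all DNAs. Dividing by a small number of base pairs then magnifies the modest spread among them, which is why differences in K grow so steeply near 0 length.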
In summary, all of the calculations that involved the evaluation of ⟨Wr²⟩, namely Δ(⟨Wr²⟩/L), Δ(nbpK/RT), and ΔK, indicate that, whereas the TRS have a greater ability to supercoil than random sequence DNA, tracts of (CTG)n or (CGG)n 500-550 bp long (167-183 repeats as estimated from Δ(nbpK/RT)) have the greatest tendency to writhe when compared with the same lengths of random B-DNA. This length is referred to as a region of hyperflexibility.
The presence of this optimal length for the partitioning of writhe in the TRS is significant, considering its surprisingly close correspondence with the repeat size of 180-200 that demarcates both the premutation range from the full mutation range and the occurrence of small expansions versus large expansions in the FRAXA and DM loci (1-3).
Dominant Role of the Persistence Length
λ⁻¹ and σ are related to the bending (α) and torsional (β) moduli by Equations 4 and 5. We wished to compare a hypothetical DNA that had α equal to that of random sequence DNA, but a greater or smaller β, or, alternatively, a hypothetical DNA that had β equal to that of random DNA, but a greater or smaller α, to determine whether the region of hyperflexibility depends on differences in α, β, or both. For these comparisons, two sets of analyses were performed. In the first, λ⁻¹ was held constant at 950 Å, and σ was varied between 0.3 and −0.5 in intervals of 0.2 so as to simulate five hypothetical DNAs with a torsional modulus varying from 1.48 to 3.84 × 10⁻¹⁹ erg·cm. nbpK/RT was calculated for the five DNAs, and the value of nbpK/RT for B-DNA was subtracted from each of them. The differences in nbpK/RT are plotted in Fig. 4. This calculation shows that variations in the torsional modulus alone do not produce a substantial local maximum or minimum in Δ(nbpK/RT).
In the second comparison (Fig. 5), five hypothetical DNAs were simulated in which the torsional modulus β was held constant at 2.4 × 10⁻¹⁹ erg·cm, and λ⁻¹ was varied from 1300 to 500 Å in intervals of 200 Å (corresponding to a bending modulus α from 2.63 to 1.01 × 10⁻¹⁹ erg·cm). nbpK/RT was calculated for each DNA, and nbpK/RT for B-DNA was subtracted. In this case, it is evident that a local maximum or minimum occurs according to the magnitude of λ⁻¹. Thus, a persistence length smaller than B-DNA gives rise to a local minimum, whereas a persistence length greater than B-DNA results in a local maximum. These comparisons indicate that whenever two DNA sequences differ in their values of bending moduli, an optimum length window (between ~400 and 800 bp when one of them is random B-DNA) will result, in which the differences in free energy of supercoiling will be largest. In addition, different torsional moduli have little influence on the magnitude of the local maximum (or minimum).
In summary, these comparisons enabled us to conclude that the regions of hyperflexibility for (CTG)n and (CGG)n are caused by the lower persistence lengths of the TRS as compared with random B-DNA.
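The parameter sets for the two comparisons can be regenerated with the Python sketch below. It assumes the worm-like chain relations α = ½·λ⁻¹·k_B·T and β = α/(1 + σ) at 20 °C. These are inferred from the moduli quoted in the text rather than copied from Equations 4 and 5, and the function names are hypothetical.

```python
K_B_ERG_PER_K = 1.380649e-16  # Boltzmann constant in erg/K
T_KELVIN = 293.15             # 20 degrees C (ASSUMED to match the RT value in the text)
ANGSTROM_TO_CM = 1e-8


def bending_modulus(kuhn_length_A):
    """alpha in erg*cm, ASSUMING alpha = (1/2) * Kuhn length * k_B * T."""
    return 0.5 * kuhn_length_A * ANGSTROM_TO_CM * K_B_ERG_PER_K * T_KELVIN


def torsional_modulus(kuhn_length_A, sigma):
    """beta in erg*cm, ASSUMING beta = alpha / (1 + sigma)."""
    return bending_modulus(kuhn_length_A) / (1.0 + sigma)


def sigma_for_fixed_beta(kuhn_length_A, beta):
    """Poisson's ratio needed to keep beta fixed while the Kuhn length changes."""
    return bending_modulus(kuhn_length_A) / beta - 1.0


# First comparison: Kuhn length fixed at 950 A, sigma varied in steps of 0.2.
for sigma in (0.3, 0.1, -0.1, -0.3, -0.5):
    beta = torsional_modulus(950.0, sigma)
    print(f"sigma = {sigma:+.1f}: beta = {beta / 1e-19:.2f} x 10^-19 erg*cm")

# Second comparison: beta fixed at 2.4e-19 erg*cm, Kuhn length varied in steps of 200 A.
for kuhn in (1300.0, 1100.0, 900.0, 700.0, 500.0):
    alpha = bending_modulus(kuhn)
    sigma = sigma_for_fixed_beta(kuhn, 2.4e-19)
    print(f"Kuhn length = {kuhn:.0f} A: alpha = {alpha / 1e-19:.2f} x 10^-19 erg*cm, "
          f"sigma = {sigma:+.2f}")
```

The end points reproduce the ranges quoted in the text to within rounding: about 1.48 to 3.84 × 10⁻¹⁹ erg·cm for β in the first comparison and 2.63 to 1.01 × 10⁻¹⁹ erg·cm for α in the second.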
We show by statistical mechanical calculations that the free energy of supercoiling for the triplet repeats (CTG)n and (CGG)n is lower than for random DNA and that TRS lengths of 500-550 bp can accommodate the highest degrees of writhe when compared with the same lengths of random B-DNA. The bending of DNA is required during chromatin organization, recognition among protein complexes distally located along the DNA, and high affinity interactions involving the binding of regulatory factors (13). In addition, torsional constraints are introduced during replication and transcription, due to underwinding and overwinding of the helices (18).
The restoring forces opposing bending and twisting have been measured for B-DNA of random composition by optical anisotropy (Ref. 19 and reviewed in Ref. 20), kinetics of ring closure (10, 14, 15, 21), cryoelectron microscopy (22), and topoisomer distribution (16, 17) and have been estimated by Monte Carlo simulation (23-27) and statistical mechanics (8, 9). By these methods, the bending modulus for random B-DNA is found to be close to 2.0 × 10⁻¹⁹ erg·cm, which gives a persistence length of 500 Å. By contrast, the values for the torsional modulus β have ranged from 1.5 to 3.6 × 10⁻¹⁹ erg·cm, which prevents its accurate estimate.
The free energies of supercoiling have also been determined experimentally for random B-DNA from the topoisomer distribution (16, 17) and have been computed by statistical mechanics from the values of the bending and torsional moduli (8). These analyses indicated that the free energy of supercoiling is ~1150·RT for B-DNA chains longer than 2 kbp.
The calculations performed in this study on (CTG)n, (CGG)n, and random B-DNA show that TRS lengths of 400-600 bp have the highest differences in free energies of supercoiling when compared with analogous lengths of random sequence DNA. We also demonstrate that this behavior depends on the differences in persistence lengths and is independent of the torsional moduli. This increased hyperflexibility of the TRS coincides with the length of repeats (180-200 units) that demarcates the premutation from the full mutation range in fragile X and myotonic dystrophy and also coincides with the repeat size that leads to far greater expansions (hundreds of repeats) in offspring (1-3). This correspondence of the region of hyperflexibility with the premutation to full mutation threshold makes it tempting to speculate that triplet repeat expansion is associated with the supercoiling of DNA. No doubt, the poorly understood molecular and cellular events involved with DNA slippage, genetic instabilities, anticipation, and alterations in gene expression that elicit changes in development that are recognized as disease syndromes are complex. We do not propose that DNA structure alone is responsible. However, the dynamic as well as static conformational features of the TRS may play a role.
Cellular processes such as transcription and replication dramatically
alter the local superhelical densities of DNA due to the unwinding of
the helices (18, 28, 29). Also, the extent of instability for a
(CTG)n or (CGG)n tract depends on its length and
orientation relative to the origin of replication in Escherichia
coli (30, 31). We propose that high levels of supercoiling
(writhe) accumulate within the TRS during processes such as
transcription and replication, which lead to instability. Some of these
events are outlined in Fig. 6. Translocation of the DNA
polymerase complex generates positive supercoils ahead of the
replication fork, and possibly negative supercoils behind it, on the
leading strand (step A) (13, 18). Positive and negative
supercoils partition preferentially within a tract of the TRS (rather
than within random B-DNA sequences) due to its higher flexibility
(step B). This partitioning is influenced by the length of
the TRS, with segments of 400-600 bp accommodating the greatest levels
of writhe as opposed to the same lengths of random DNA. The
TRS-localized increases in positive superhelicity hinder their
efficient removal by topoisomerases, thus decreasing the processivity
of the polymerase complex. Pausing (32) of the enzymatic complex would
then allow reiterative DNA synthesis (33) to take place (step
C), thus leading to expanded daughter strands. Alternatively, the
polymerase complex may dissociate from the template (step D)
and allow the diffusion of positive and negative supercoil domains.
This diffusion is accompanied by the release of a newly synthesized
strand(s) from its parent strand(s), which causes hairpins (34) to form
in the daughter strand(s). This leads to expansion, then synthesis
resumes (step E), and the process is repeated. Hence, TRS
writhing may be involved in genetic instabilities.
We thank Drs. S. C. Harvey and R. R. Sinden for critically reading the manuscript.