Flexible DNA: Genetically Unstable CTG·CAG and CGG·CCG from Human Hereditary Neuromuscular Disease Genes*

(Received for publication, November 26, 1996, and in revised form, March 22, 1997)

Albino Bacolla‡§, Robert Gellibolian‡§, Miho Shimizu‡, Sorour Amirhaeri‡, Seongman Kang‡, Keiichi Ohshima‡, Jacquelynn E. Larson‡, Stephen C. Harvey, B. David Stollar‖, and Robert D. Wells‡**

From the ‡Center for Genome Research, Institute of Biosciences and Technology and the Department of Biochemistry and Biophysics, Texas A & M University, Texas Medical Center, 2121 Holcombe Blvd., Houston, Texas 77030, the Department of Biochemistry and Molecular Genetics, University of Alabama at Birmingham, School of Medicine, Birmingham, Alabama 35294-0005, and the ‖Department of Biochemistry, Tufts University, School of Medicine, 136 Harrison Ave., Boston, Massachusetts 02111


ABSTRACT

The properties of duplex CTG·CAG and CGG·CCG, which are involved in the etiology of several hereditary neurodegenerative diseases, were investigated by a variety of methods, including circularization kinetics, apparent helical repeat determination, and polyacrylamide gel electrophoresis. The bending moduli were 1.13 × 10⁻¹⁹ erg·cm for CTG and 1.27 × 10⁻¹⁹ erg·cm for CGG, ~40% less than for random B-DNA. Also, the persistence lengths of the triplet repeat sequences were ~60% the value for random B-DNA. However, the torsional moduli and the helical repeats were 2.3 × 10⁻¹⁹ erg·cm and 10.4 base pairs (bp)/turn for CTG and 2.4 × 10⁻¹⁹ erg·cm and 10.3 bp/turn for CGG, respectively, all within the range for random B-DNA. Determination of the apparent helical repeat by the band shift assay indicated that the writhe of the repeats was different from that of random B-DNA. In addition, molecules of 224-245 bp in length (64-71 triplet repeats) were able to form topological isomers upon cyclization. The low bending moduli are consistent with predictions from crystallographic variations in slide, roll, and tilt. No unpaired bases or non-B-DNA structures could be detected by chemical and enzymatic probe analyses, two-dimensional agarose gel electrophoresis, and immunological studies. Hence, CTG and CGG are more flexible and highly writhed than random B-DNA and thus would be expected to act as sinks for the accumulation of superhelical density.


INTRODUCTION

Eleven human genetic disorders (including fragile X syndrome, myotonic dystrophy, Kennedy's disease, Huntington's disease, spinocerebellar ataxia type 1, dentatorubral-pallidoluysian atrophy, and Friedreich's ataxia) are characterized at the molecular level by the expansion of DNA triplet repeats (CTG, CGG, or AAG)1 from <15 copies in normal individuals to scores of copies in affected cases (1-6). In some cases, the CTG and CGG tracts are transcribed into mature mRNAs, whereas the AAG tracts in Friedreich's ataxia are in the first intron of the frataxin gene. The mechanism for expansion is not known, but it may involve slippage of the complementary strands during DNA synthesis (7-10). Expanded alleles undergo further expansions upon passage to offspring and, in some diseases, are associated with the clinical observation called anticipation, whereby the symptoms become more severe in each successive generation and with an earlier age of onset (1-5). This is a novel type of mutation and shows non-mendelian genetic transmission (11, 12).

Prior investigations suggested that triplet repeat sequences (TRS)2 do not have the properties of random B-DNA. First, CTG tracts greatly facilitate nucleosome assembly (13-15), which, in turn, may repress transcription. Second, DNA synthesis in vitro pauses at specific loci in fragments containing CTG and CGG (16). Third, long tracts of AAG and AGG form intramolecular triplexes that arrest DNA synthesis (17). Fourth, CTG and CGG migrate up to 30% more rapidly than expected on polyacrylamide gel electrophoresis, whereas their migration is normal on agarose gels (18). Fifth, CTG is preferentially expanded in Escherichia coli compared with the other nine TRS (8). Sixth, the frequency of expansions and deletions in E. coli (7, 9, 10) is influenced by the direction of replication, suggesting the formation of stable hairpin loops in the lagging strand template or the newly synthesized nascent strand.

Conformational investigations were conducted on plasmids and restriction fragments containing CTG and CGG to evaluate their role in the biological behaviors described above. Several methods were applied, including circularization kinetics, apparent helical repeat determinations, the rate of migration through acrylamide and agarose gel electrophoresis, chemical and enzymatic probe analyses, two-dimensional gel electrophoresis, and the induction of an immune response. The analyses indicate that both CTG and CGG exist as fully paired, right-handed B-helices. However, their flexibilities are substantially greater than that of random B-DNA, and this causes the TRS to be more writhed. As a result, the average superhelical density of a DNA domain containing a TRS region will be unevenly distributed, a higher density being concentrated within the TRS tracts. This finding is unprecedented and enables the hypothesis that part of the biological response elicited by CTG and CGG is mediated by topological features associated with their increased flexibility.


EXPERIMENTAL PROCEDURES

Cloning of Recombinant Plasmids

Recombinant plasmids with (CTG)n and (CGG)n inserts used for the cyclization experiments were obtained by cloning a synthetic duplex that had XbaI and BamHI ends flanking (CTG)36 or (CGG)24 into pUC19-NotI cleaved with XbaI and BamHI. The top-strand sequence of the 5'-XbaI → 3'-BamHI insert was TCTAGAGGATCGCTCTTCG(TRS)nCGAAGAGCGGATCGCTAGCGGATCC. 3' to the TRS was an NheI site (GCTAGC) that allowed the TRS-containing fragments to self-anneal via the complementary CTAG resulting from XbaI-NheI cleavage. Plasmids with longer lengths of TRS were obtained as described (9, 19). Inserts were sequenced on both strands.

Cloning of the 32 plasmids containing (CTG)n and (CGG)n inserts used for the apparent helical repeat determination has been reported (7-10, 19). In addition, five plasmids harboring random sequence DNA inserts were obtained by cloning HaeIII restriction fragments of pUC18 into HincII of pUC19-NotI.

Kinetics of Circularization: Theory of Ring Closure

In a random-coil chain, the distribution (W) of the end-to-end distance (v) is given by the normalized gaussian function (20, 21),
$$W(v)\,\mathrm{d}v = \left(\frac{\delta^{2}}{\pi}\right)^{3/2} \exp\left(-\delta^{2}v^{2}\right)\,4\pi v^{2}\,\mathrm{d}v \qquad \text{(Eq. 1)}$$
where δ is related to the mean square end-to-end distance (⟨v²⟩₀) by δ² = 3/(2⟨v²⟩₀) (⟨v²⟩₀ = nᵢλ⁻²), nᵢ is the number of statistical segments (Kuhn segments), and λ⁻¹ is the length of a statistical segment in Å. The probability (W(0)) that the ends are confined to a volume V at a distance v (v → 0) is as follows.
$$W(0)\,\mathrm{d}v = \left(\frac{3}{2\pi n_{i}}\right)^{3/2} \lambda^{3}\,\mathrm{d}V \qquad \text{(Eq. 2)}$$
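As a numerical sanity check on Equations 1 and 2, the following Python sketch integrates the gaussian distribution to confirm that it is normalized and evaluates the ring-closure density (3/2πnᵢ)^(3/2)λ³. It is an illustration only, not part of the original analysis; the function names and the chain parameters (20 Kuhn segments of 950 Å) are assumptions chosen for the example.

```python
import math

def gaussian_radial_density(v, n_i, kuhn_length):
    """Right-hand side of Eq. 1 per unit dv, for n_i Kuhn segments of given length."""
    mean_sq = n_i * kuhn_length ** 2        # <v^2>_0 = n_i * lambda^-2
    delta_sq = 3.0 / (2.0 * mean_sq)        # delta^2 = 3 / (2 <v^2>_0)
    return ((delta_sq / math.pi) ** 1.5
            * math.exp(-delta_sq * v * v) * 4.0 * math.pi * v * v)

def ring_closure_density(n_i, kuhn_length):
    """Prefactor of Eq. 2: (3 / (2 pi n_i))^(3/2) * lambda^3, lambda = 1 / kuhn_length."""
    return (3.0 / (2.0 * math.pi * n_i)) ** 1.5 / kuhn_length ** 3

def trapezoid(f, a, b, steps=20000):
    """Plain trapezoidal integration of f on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

# Illustrative (assumed) chain: 20 Kuhn segments of 950 Angstrom each.
n_i, kuhn = 20, 950.0
upper = 10.0 * math.sqrt(n_i) * kuhn        # ten r.m.s. end-to-end distances
norm = trapezoid(lambda v: gaussian_radial_density(v, n_i, kuhn), 0.0, upper)
w0 = ring_closure_density(n_i, kuhn)        # molecules per cubic Angstrom
```

The gaussian form applies only to chains of many Kuhn segments; the fragments studied here are shorter than two Kuhn lengths, which is why the worm-like chain expressions are used in the sections that follow.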

For a linear DNA duplex with cohesive ends, free in solution, the term (3/2πnᵢ)^(3/2)λ³, or J-factor, specifies the concentration of intramolecular ends in dV and is directly correlated to the equilibrium constant Kc for the reaction M ⇌ C (Kc = [C]/[M]), where C and M are circular and linear monomers, respectively. The association and dissociation rate constants between any two ends are determined by their homogeneous distribution throughout the volume of the system, i.e. by the equilibrium constant for the intermolecular association 2M ⇌ D (Ka = [D]/[M]²), where [D] is the concentration of linear dimers. It is assumed that the noncovalent interactions formed during the intramolecular and intermolecular reactions are identical and that the entropy change (ΔS) at the reactive site is the only factor determining the rate constants (no influence from length and composition of the molecule). Under these conditions, Kc = KaJ, whereby the concentration of circular monomers J is given by the ratio of the intramolecular to the intermolecular equilibrium constants (22). In this formulation, Ka = νK*, where K* is the observed equilibrium constant, and ν is related to the permutation number by which the monomers can associate to give the same dimer. Under steady-state kinetic conditions, Kc ≅ k1 and Ka ≅ k2 for Reactions A and B,
$$\mathrm{M} \underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}} \mathrm{C} \qquad \text{(Reaction A)}$$
$$2\mathrm{M} \underset{k_{-2}}{\overset{k_{2}}{\rightleftharpoons}} \mathrm{D} \qquad \text{(Reaction B)}$$
so that the J-factor is defined as the ratio between the two forward rate constants, J = k1/k2 (23, 24).

Determination of k1

k1 is defined as follows,
$$k_{1} \equiv -\frac{\mathrm{d}[\mathrm{D}]/\mathrm{d}t}{[\mathrm{D}]} \qquad \text{(Eq. 3)}$$
where [D] ≡ [S] + [M], and [S] is the concentration of non-ligated, but hybridized, circles (23). To determine k1, plasmid DNA was cleaved with XbaI, dephosphorylated with calf intestinal phosphatase, end-labeled with [γ-³²P]ATP and T4 polynucleotide kinase, and cleaved with NheI. The XbaI-NheI fragments containing TRS were isolated on 5-8% polyacrylamide gels, purified, and quantitated by liquid scintillation using the specific activity of free [γ-³²P]ATP. DNA (2.1 × 10⁻¹⁰ M) was incubated in 100 µl of buffer A containing 50 mM Tris-HCl (pH 7.6), 10 mM dithiothreitol, 10 mM MgCl2, 1 mM ATP, and 0.1 mg/ml bovine serum albumin for 20 min at 20 °C. 5 µl was withdrawn at time 0, and 5 µl of T4 DNA ligase (USB™; specific activity of 6326 units/mg) was added to 95 µl of reaction mixture; 8 µl was withdrawn at 20-s intervals for 4 min and quenched in 8 µl of 100 mM EDTA at 0 °C. Samples were heated at 70 °C for 20 min and loaded onto 5-6% polyacrylamide gels. Gels were dried and quantitated on a PhosphorImager (Molecular Dynamics, Inc., Sunnyvale, CA). T4 DNA ligase was diluted in buffer B (25 mM Tris-HCl (pH 7.6), 1 mM dithiothreitol, 100 mM NaCl, 0.1 mM EDTA, 30% glycerol, and 2 mg/ml bovine serum albumin) and used at 0.25-3.5 × 10⁻⁷ M. Three reactions were performed on each fragment at different enzyme concentrations. k1 was calculated from the slope of the integrated form of Equation 3, where [S] was omitted. It has been shown that [S] contributes only a few percent under these conditions (23). The products were identified by (a) comparing the migration to that of a reference ladder and (b) assessing the loss of radioactivity due to dephosphorylation of the linear species and the changes in relative percentages following re-cleavage with NheI and/or XbaI. In fact, no restriction sites are regenerated for the circular monomer, whereas all other ligated species contain at least one of the restriction sites.
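The slope calculation can be illustrated with a short Python sketch. It assumes that the integrated form of Equation 3 is ln([D]0/[D]t) = k1·t and that, with dimerization negligible, the fraction of cyclized product fc equals 1 − [D]t/[D]0, so that −ln(1 − fc) is linear in time. The function name and the synthetic time course are hypothetical; only the sampling scheme (20-s intervals for 4 min) is taken from the text.

```python
import math

def fit_k1(times_s, fraction_cyclized):
    """Slope through the origin of -ln(1 - f_c) versus time (least squares)."""
    y = [-math.log(1.0 - f) for f in fraction_cyclized]
    numerator = sum(t * yi for t, yi in zip(times_s, y))
    denominator = sum(t * t for t in times_s)
    return numerator / denominator

# Synthetic, noise-free time course (hypothetical rate constant, in s^-1).
true_k1 = 2.0e-3
times = [20.0 * i for i in range(13)]                 # 0, 20, ..., 240 s
fractions = [1.0 - math.exp(-true_k1 * t) for t in times]

k1_estimate = fit_k1(times, fractions)
```

With real gel data the fractions would come from the PhosphorImager band intensities, and the fitted slope would then be normalized to a common ligase concentration before comparison between fragments.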

Determination of k2

k2 is defined as follows,
$$k_{2} \equiv -\frac{\mathrm{d}[\mathrm{D}]/\mathrm{d}t}{[\mathrm{D}]^{2}} \qquad \text{(Eq. 4)}$$
where [D] = [H] + 2[S], and [H] is the concentration of two distinguishable half-molecules, A and B (23). A half-molecule is a linear DNA fragment that can hybridize only through one end, the other end being blunt. Half-molecules of (CGG)24 were obtained by cleaving plasmid DNA with XbaI; filling in with the Klenow fragment of E. coli DNA polymerase I, dGTP, dATP, dTTP, and [α-³²P]dCTP; and then cleaving with NheI. Purification and quantitation were as described before. Half-molecules of (CGG)40 were obtained similarly, except that plasmid DNA was cleaved with NheI first and then filled in and cleaved with XbaI. Equimolar amounts (1.5-6.0 × 10⁻⁹ M) of half-molecules of (CGG)24 and (CGG)40 were mixed in 100 µl of buffer A and processed as described for k1. Concentrations of T4 DNA ligase were 1 and 2 × 10⁻⁶ M. Identical procedures were applied to half-molecules of (CTG)36 and (CTG)56. k2 was calculated for the association of (CGG)24 (A) with (CGG)40 (B) and of (CTG)36 (A) with (CTG)56 (B) from the slope of the integrated form of Equation 4 times the initial DNA concentrations. The values obtained were then divided by 2 due to the asymmetry of the molecules. The molar J-factors (J(M)) were obtained from the ratio k1/k2. Note that k2 is a constant, whereas k1 is unique to each length of DNA.

Interpolation Formulas

The log J(M) values were plotted against log nbp. Log J(M) is a complex oscillatory function of log nbp. Interpolation of log J(M) with the equations derived by Shimada and Yamakawa (25) for the ring-closure probabilities of a twisted worm-like chain yields the bending modulus α, the torsional modulus β, the length of the Kuhn segment λ⁻¹ (λ⁻¹ = 2P, P being the persistence length), and the helical repeat h0 of the DNA. Three successive computational steps were performed.

Step A: Evaluation of λ⁻¹

Equations 5-9 were used as the starting functions to construct the theoretical curve for J1(L). J1(L) evaluates the behavior of the contour length of the DNA and is not complicated by the twist dependence of cyclization. The dependence on twist arises from the fact that linear DNA molecules with a non-integral number of helical turns need to untwist (or overtwist) to cyclize. G(0, u₀|u₀; L) expresses the length-dependent probability of ring closure for a polymer with the end tangents specified. L denotes the reduced contour length, defined as the ratio of the contour length of the DNA chain (nbp × 3.4 Å) to the length of the Kuhn segment,
$$J_{1}(L) = 4\pi\,G(0, \mathbf{u}_{0} \mid \mathbf{u}_{0}; L) \qquad \text{(Eq. 5)}$$
$$G(0, \mathbf{u}_{0} \mid \mathbf{u}_{0}; L) = \pi^{2} L^{-6} \exp\left(-\pi^{2}/L + 0.514L\right) \quad (\text{for } L < 1.9) \qquad \text{(Eq. 6)}$$
$$G(0, \mathbf{u}_{0} \mid \mathbf{u}_{0}; L) = (4\pi)^{-1} F_{0}(L) \quad (\text{for } 2.8 < L \le 4) \qquad \text{(Eq. 7)}$$
$$F_{0}(L) = \sum_{k=0}^{3} f_{0k} L^{-k-3/2} \qquad \text{(Eq. 8)}$$
$$J_{1}(L) = 0.03882 + 0.003494(L-1.9) - 0.01618(L-1.9)^{2} + 0.008601(L-1.9)^{3} \quad (\text{for } 1.9 < L < 2.8) \qquad \text{(Eq. 9)}$$
where f0k are numerical constants. J1(L) was then transformed into J1 by the conversion factor 10²⁷λ³/NA (28), where NA is Avogadro's number. Log J(M) oscillates around log J1. Thus, by varying λ⁻¹ in the conversion factor, log J1 may be found that runs midway through the log J(M) values. During the transformation, L (x axis) was also converted to log nbp according to the value of λ⁻¹.
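A minimal Python sketch of Step A is given below. It evaluates J1(L) from Equations 5, 6, and 9 and applies the conversion factor 10²⁷λ³/NA to obtain a molar value. The range 2.8 < L ≤ 4 (Equations 7 and 8) is deliberately left out because the constants f0k are not listed here. Function names are illustrative, and the Kuhn lengths used in the example are the fitted values reported later in Table II.

```python
import math

AVOGADRO = 6.02214076e23
RISE_PER_BP = 3.4  # Angstrom per base pair, as used in the text

def j1_reduced(L):
    """Twist-independent ring-closure probability J1(L) for reduced length L < 2.8."""
    if L <= 0.0:
        raise ValueError("reduced contour length must be positive")
    if L < 1.9:
        # Eq. 5 combined with Eq. 6
        g = math.pi ** 2 * L ** -6 * math.exp(-math.pi ** 2 / L + 0.514 * L)
        return 4.0 * math.pi * g
    if L < 2.8:
        # Eq. 9 (empirical cubic bridging the two analytical ranges)
        x = L - 1.9
        return 0.03882 + 0.003494 * x - 0.01618 * x ** 2 + 0.008601 * x ** 3
    raise ValueError("Eqs. 7-8 need the f_0k constants, which are not listed here")

def j1_molar(n_bp, kuhn_length_angstrom):
    """J1 in mol/liter for a fragment of n_bp base pairs."""
    L = n_bp * RISE_PER_BP / kuhn_length_angstrom   # reduced contour length
    lam = 1.0 / kuhn_length_angstrom                # lambda, in 1/Angstrom
    return j1_reduced(L) * 1e27 * lam ** 3 / AVOGADRO

# Example: a 209-bp fragment with the three Kuhn lengths of Table II.
j_ctg = j1_molar(209, 556.0)
j_cgg = j1_molar(209, 630.0)
j_random = j1_molar(209, 950.0)
```

Varying `kuhn_length_angstrom` shifts the whole log J1 curve, which is how a value of λ⁻¹ that runs midway through the measured log J(M) points can be located.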

Step B: Evaluation of σ

σ is Poisson's ratio and is related to the bending and torsional moduli by α/β = 1 + σ. σ establishes the upper and lower boundaries for the oscillating log J(M). Here, r is a periodic function of L (0 ≤ r ≤ 0.5) that reproduces the varying fractional helical turn (and therefore twist) of the DNA chain. For the evaluation of the upper and lower boundaries, r was set equal to 0 and 0.5 to follow the log J(M) values of DNA fragments with 0 and 0.5 fractional helical turns, respectively. Equations 10-14 were used to construct a theoretical J(L)* function, and the previous conversion factor was then used to transform J(L)* into log J(M)*.
$$J(L)^{*} = \sum_{\Delta Lk} J_{Lk}(L) \quad (\Delta Lk = r \text{ and } r \pm 1) \qquad \text{(Eq. 10)}$$
Lk, the linking number, expresses the number of times the two helices revolve about one another in circular DNA. This number is always an integer. Lk0 relates to the linking number of a linear DNA in its unconstrained state and need not be an integer. ΔLk = Lk - Lk0. ΔTw is the difference in twist between linear and circular DNA. ΔTw = ΔLk + Wr, Wr (writhe) being the measure of the deviation of the helix axis from planarity. For small circles of random DNA, Wr is approximated to 0, so that ΔTw = ΔLk. JLk(L), the linking number-dependent J-factor, is defined as follows,
$$J_{Lk}(L) = 8\pi^{2}\,G(0, \Omega_{0} \mid \Omega_{0}; Lk, L) \qquad \text{(Eq. 11)}$$
and G(0, Ω₀|Ω₀; Lk, L), the linking number-dependent ring-closure probability, is as follows,
$$G(0, \Omega_{0} \mid \Omega_{0}; Lk, L) = C_{0} L^{-13/2} \exp\left(-\frac{\pi^{2}}{L}\left(1 + \frac{(\Delta Lk)^{2}}{1+\sigma}\right) + \left(C_{1} + \tfrac{1}{4}\right)L\right) \qquad \text{(Eq. 12)}$$
with
$$C_{0} = (1+\sigma)^{-1/2} \sum_{j=0}^{7} a_{0j}\left(\frac{\Delta Lk}{1+\sigma}\right)^{2j} \qquad \text{(Eq. 13)}$$
$$C_{1} = \sum_{j=0}^{7}\left(a_{1j}^{(0)} + \frac{a_{1j}^{(1)}}{1+\sigma}\right)\left(\frac{\Delta Lk}{1+\sigma}\right)^{2j} \qquad \text{(Eq. 14)}$$
where a0j, a1j(0), and a1j(1) are numerical constants.

Step C: Evaluation of τ0

τ0 is the constant torsion and determines the period of oscillation of log J(M). τ0 is related to the helical repeat (h0) of the DNA by τ0 = 2π/(h0·lbp), where lbp is the distance between base pairs, 3.4 Å. The theoretical J(L) value was evaluated with Equations 10-14, where r was taken in small increments according to the following,
$$r(L) = \begin{cases} |\tau_{0}|L/2\pi - k & (\text{for } k \le |\tau_{0}|L/2\pi \le k + \tfrac{1}{2}) \\ 1 - |\tau_{0}|L/2\pi + k & (\text{for } k + \tfrac{1}{2} < |\tau_{0}|L/2\pi < k + 1) \end{cases} \qquad \text{(Eq. 15)}$$
where k is an integer ≥ 0.
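Equation 15 simply measures how far the number of helical turns in a fragment lies from the nearest integer. A small Python sketch makes this concrete. It works with the contour length in Å and τ0 in radians per Å rather than in reduced units, which leaves the product |τ0|L unchanged; the function names are illustrative.

```python
import math

RISE_PER_BP = 3.4  # Angstrom per base pair (l_bp)

def constant_torsion(h0_bp_per_turn):
    """tau_0 = 2*pi / (h0 * l_bp), in radians per Angstrom."""
    return 2.0 * math.pi / (h0_bp_per_turn * RISE_PER_BP)

def fractional_turn(n_bp, h0_bp_per_turn):
    """r of Eq. 15: distance of the helical turn count from the nearest integer."""
    contour = n_bp * RISE_PER_BP
    turns = abs(constant_torsion(h0_bp_per_turn)) * contour / (2.0 * math.pi)
    k = math.floor(turns)
    frac = turns - k
    return frac if frac <= 0.5 else 1.0 - frac

# Two fragment lengths from this study with the CTG helical repeat of 10.41 bp/turn.
r_209 = fractional_turn(209, 10.41)   # close to an integral number of turns
r_224 = fractional_turn(224, 10.41)   # close to a half-integral number of turns
```

Fragments with r near 0 fall on the upper boundary of log J(M), and those with r near 0.5 fall on the lower boundary.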

The previous conversion factor was again used to convert J(L) into log J(M). τ0 was varied to find a good fit to the experimental log J(M) values. For illustration purposes, log nbp was converted to nbp in the figures. For all of the DNAs, the r + 1 term in Equation 10 was omitted because |ΔLk|/(1 + σ) ≥ √3. This has been shown to cause large errors in the extrapolations (25).

Satisfactory values for λ⁻¹, σ, and τ0 were estimated by visual inspection of the fits, and no statistical tests were performed. It should be noted that the torsional (β) moduli obtained by these fits were identical to those estimated manually based on pairs of molecules having r (ΔTw) values of 0 and 0.5 (23).

Variance of Writhe (⟨Wr²⟩) and Free Energy of Supercoiling (nbpK/RT)

These computations are reported in detail in the accompanying paper (26).

Apparent Helical Repeat Determination

30 µg of plasmid DNA was treated with chicken erythrocyte DNA topoisomerase I at 0 °C overnight and purified by phenol/chloroform extraction and ethanol precipitation. The resulting set of topoisomers was resolved by agarose gel electrophoresis. This was performed with 0.5 µg of DNA at 23 °C (60 V for 18 h) on 1% agarose containing 40 mM Tris-HCl, 25 mM sodium acetate, and 1 mM EDTA at pH 8.3. The gel dimensions were 20 cm × 22 cm × 3 mm. Positively supercoiled molecules were obtained by relaxing the plasmids at 0 °C and performing the electrophoresis at 23 °C, whereas negatively supercoiled plasmids were generated by relaxation at 37 °C and electrophoresis at 23 °C. Average unwinding was 0.012°/°C/bp. Apparent helical repeat (h0) was calculated (27) from an average of three determinations (with a S.D. of ±0.1 bp/turn). Methylation of the CGG tracts was performed by treating plasmid DNA with SssI methylase in the presence of S-adenosylmethionine at 37 °C overnight. To check for complete reaction, cleavage studies with AciI (CCGC) and PAGE analysis were conducted.
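The temperature-shift procedure can be put in numbers with a short Python sketch. It uses the stated average unwinding of 0.012°/°C/bp to estimate the mean linking difference, in turns, of a topoisomer population relaxed at one temperature and electrophoresed at another. The function name and the 2900-bp plasmid size are hypothetical values chosen for illustration.

```python
def thermal_linking_shift(n_bp, t_relax_c, t_gel_c, unwinding_deg_per_c_per_bp=0.012):
    """Mean linking difference (turns) seen on the gel after a temperature shift.

    DNA unwinds as temperature rises, so plasmids relaxed cold and run warm
    carry an excess of linking number (positive result), and vice versa.
    """
    degrees = unwinding_deg_per_c_per_bp * (t_gel_c - t_relax_c) * n_bp
    return degrees / 360.0

plasmid_bp = 2900  # hypothetical plasmid size, for illustration only

shift_positive = thermal_linking_shift(plasmid_bp, t_relax_c=0.0, t_gel_c=23.0)
shift_negative = thermal_linking_shift(plasmid_bp, t_relax_c=37.0, t_gel_c=23.0)
```

Under these assumptions the 0 °C relaxation gives roughly two positive superhelical turns at 23 °C and the 37 °C relaxation gives somewhat more than one negative turn, which is enough to place the topoisomer distributions on either side of the relaxed position.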

Other Methods

Chemical probe analyses, polyacrylamide gel electrophoresis, and induction of mouse antibodies were conducted as referred to below.


RESULTS

Determination of J-Factors

To calculate the J-factors for the TRS, eight plasmids containing 36-80 consecutive CTG repeats (140-272 bp long) and 19 plasmids containing 24-73 consecutive CGG repeats (104-251 bp long) were prepared (Table I). The TRS-containing restriction fragments had a protruding CTAG at each end that allowed the intramolecular or intermolecular association (Reactions A and B under "Experimental Procedures") to take place. k1 and k2 were measured under conditions in which the intermediates C and D were converted into covalently closed products by T4 DNA ligase (23). Fig. 1 shows representative time course reactions and plots. For k2, two molecules of different lengths were chosen and blunted at one end. This scheme enables the molecules to dimerize via the single CTAG end, but disallows the intramolecular circularization. Fig. 1A shows a PAGE separation of the products between (CGG)24 and (CGG)40. Three ligated species were observed, which correspond to dimers of (CGG)24, dimers of (CGG)24 and (CGG)40, and dimers of (CGG)40 in the ratio of 1:2:1 (20, 25). In general, higher order aggregates were also detected at concentrations within a few percent of the linear dimers. These aggregates may have originated from blunt-end ligations.

Table I. Plasmids containing CTG · CAG and CGG · CCG inserts used for cyclization kinetic studies

After cleavage by XbaI and NheI the purified inserts have the following 5' → 3' top-strand sequence organization: a 5'-protruding CTAG followed by duplex AGGATCGCTCTTCG(TRS)nCGAAGAGCGGATCG and finally a 3'-recessive GATC. Therefore, the length of the circularized molecules equals the length of the triplet repeat tract (3 bp per repeat) plus 32 bp.

Triplet repeat   Plasmid    No. of repeats   Size of fragmentsᵃ (bp)
(CTG)n           pRW3036    36               140
                 pRW3047    47               173
                 pRW3054    54               194
                 pRW3056    56               200
                 pRW3059    59               209
                 pRW3064    64               224
                 pRW3068    68               236
                 pRW3080    80               272
(CGG)n           pRW3651    24               104
                 pRW3661    29               119
                 pRW3671    32               128
                 pRW3675    34               134
                 pRW3681    40               152
                 pRW3679    43               161
                 pRW3653    44               164
                 pRW3677    46               170
                 pRW3683    49               179
                 pRW3697    52               188
                 pRW3699    56               200
                 pRW3689    57               203
                 pRW3685    59               209
                 pRW3687    60               212
                 pRW3693    62               218
                 pRW3695    63               221
                 pRW3673    70               242
                 pRW3655    71               245
                 pRW3691    73               251

ᵃ Due to the presence of cohesive ends on the fragments, the total number of base pairs is the number of bases in one strand of the DNA.


Fig. 1. Representative PAGE and plots for the calculation of k1 and k2. A and B, bimolecular association, k2. A, 6% PAGE of (CGG)24 and (CGG)40. Fragments and reaction conditions were as described under "Experimental Procedures." 3.0 × 10⁻⁹ M each (CGG)24 and (CGG)40 were used; T4 DNA ligase was at 1.0 × 10⁻⁶ M. Each lane represents a 20-s time point during the course of the reaction from 0 to 4 min. M1, linear monomer of (CGG)24; M2, linear monomer of (CGG)40; D1, linear dimer of (CGG)24; D2, linear dimer of (CGG)24 and (CGG)40; D3, linear dimer of (CGG)40. B, integrated rate of disappearance of the combined monomers of (CGG)24 and (CGG)40 as a function of time. ●, [DNA] = 6.0 × 10⁻⁹ M and [T4 DNA ligase] = 1.0 × 10⁻⁶ M; ■, [DNA] = 1.5 × 10⁻⁹ M and [T4 DNA ligase] = 2.0 × 10⁻⁶ M. M0 = M1 + M2 at time 0; Mt = M1 + M2 at subsequent time points. C and D, circularization, k1. C, 5% PAGE of (CTG)59. The fragment was prepared and processed as described under "Experimental Procedures." Each lane represents a 20-s time point during the course of the reaction from 0 to 4 min. [DNA] = 2.0 × 10⁻¹⁰ M and [T4 DNA ligase] = 1.0 × 10⁻⁷ M. M, linear monomer; D, linear dimer; C, circular monomer. Lane L, reference size ladder. D, integrated rate of accumulation of the (CGG)59 circular monomer. fc is the fraction of cyclized product at each time point. [DNA] = 2.0 × 10⁻¹⁰ M and [T4 DNA ligase] = 1.0 × 10⁻⁷ M (●), 5.0 × 10⁻⁸ M (■), and 2.5 × 10⁻⁸ M (▲).

The rate of disappearance of the combined linear monomers was plotted as a function of time. Fig. 1B shows the results for (CGG)24 plus (CGG)40. The average k2 obtained from four determinations, two with (CTG)n and two with (CGG)n fragments, was (0.89 ± 0.48) × 10² M⁻¹ s⁻¹ when normalized for an enzyme concentration of 1 × 10⁻⁹ M. The error associated with k2 was quite large and may reflect both experimental variation and differences in T4 DNA ligase-DNA interactions between the (CTG·CAG)n and (CGG·CCG)n sequences. Nevertheless, the average value was comparable (~20% lower) to the value of 1.12 × 10² M⁻¹ s⁻¹ measured previously (28) for DNAs with different sequences.

k1 was measured on each of the restriction fragments in Table I under conditions that inhibited the intermolecular association, i.e. low concentrations of DNA and ligase. Fig. 1C shows the electrophoretic separation of a kinetic reaction for (CTG)59. One major ligated product was seen, corresponding to circularized (CTG)59. A fainter band was also evident, which was the result of dimerization events. For some other fragments, linear trimer and multimers also were seen, depending on the amount of ligase that was needed to detect the cyclized monomer. The rate of accumulation of the circular product was quantitated and plotted as a function of time. Fig. 1D shows the time-dependent circle formation for (CGG)59 at three ligase concentrations. In this experiment, k1 obtained from the three plots was (2.39 ± 0.32) × 10⁻⁵ s⁻¹ when normalized for 1.0 × 10⁻⁹ M T4 DNA ligase. In general, the errors associated with k1 were much smaller than for k2.

The log J(M) (k1/k2) values are graphed in Fig. 2 as a function of the number of base pairs. Also shown (bottom panel) are the molar J-factors for random sequence DNA of comparable lengths as determined previously (23). This panel was included for a comparison between the TRS and random sequence DNA. Our results show that (a) the J(M) values for (CGG)n and (CTG)n are ~10- and ~20-fold higher, respectively, than for random DNA, indicating that both TRS are very flexible; and (b) the widely oscillating behavior of log J(M) reflects the requirement of torsionally aligned ends and suggests the presence of fully paired duplexes (23). It should be noted that all of the values of log J(M) are affected by the choice of k2. Therefore, if we were to consider the upper limit of k2 (0.89 + (2 × 0.48) = 1.85, in units of 10² M⁻¹ s⁻¹), then all of the J(M) values would be reduced by ~50%, i.e. log J(M) would be lowered by ~0.3. Nevertheless, these values would still be higher than for random DNA by 5-10-fold.
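The sensitivity of J(M) to the choice of k2 can be checked with a few lines of Python, using the k1 reported above for the 59-repeat fragment and the mean and S.D. of k2. The function name is illustrative; the numbers are those quoted in the text.

```python
import math

def j_factor(k1_per_s, k2_per_molar_per_s):
    """Molar J-factor as the ratio of the forward rate constants, J = k1 / k2."""
    return k1_per_s / k2_per_molar_per_s

k1 = 2.39e-5              # s^-1, normalized to 1e-9 M ligase (Fig. 1D)
k2_mean = 0.89e2          # M^-1 s^-1, normalized to 1e-9 M ligase
k2_sd = 0.48e2            # M^-1 s^-1

j_mean = j_factor(k1, k2_mean)
j_low = j_factor(k1, k2_mean + 2.0 * k2_sd)    # k2 at its upper limit

fold_change = j_low / j_mean                   # about 0.48
log_shift = math.log10(j_low) - math.log10(j_mean)
```

Because J is inversely proportional to k2, raising k2 to its upper limit roughly halves every J(M) value and moves each point on the log scale down by the same constant amount, so the shape of the oscillating curve is unaffected.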


Fig. 2. Plots of log J(M) versus number of base pairs. Two top panels, (CTG)n and (CGG)n. The values of log J(M) for the various DNA restriction fragments are plotted as a function of the number of base pairs. Solid curves represent the fit resulting from the interpolation with Equations 10-15 under "Experimental Procedures." Dotted lines represent the upper and lower boundaries of log J(M) (ΔLk = 0 and ΔLk = 0.5) calculated from Equations 10-14 under "Experimental Procedures." Bottom panel, random DNA. The dotted curve represents λ⁻¹ = 900 Å, σ = -0.4, and h0 = 10.46 bp/turn. The solid curve represents λ⁻¹ = 950 Å, σ = -0.2, and h0 = 10.46 bp/turn. Reprinted with permission from Shimada and Yamakawa (25). Copyright 1984 American Chemical Society.

To estimate the torsional and bending moduli, we fit the experimental data to the equations for the ring-closure probabilities of a twisted worm-like chain (25). The oscillating solid line in Fig. 2 (top two panels) represents the fit to the experimental data as specified by the parameters τ0, λ⁻¹, and σ. τ0, or continuous torsion, describes the twist angle between adjacent base pairs and is related to the number of bp/turn in the DNA helix (h0). λ⁻¹ is related to the persistence length (P) and, therefore, to the bending flexibility of the DNA. Finally, σ, or Poisson's ratio, is related to the torsional stiffness of the chain. The upper and lower boundaries in log J(M) (dotted lines in the top two panels) correspond to the values for linear molecules with integral numbers of helical turns (upper) and those with a fractional 0.5 helical turn (lower). The fits describe the experimental data very well.

The values of λ⁻¹, σ, and τ0 (converted to h0) along with the calculated bending (α) and torsional (β) moduli are reported in Table II. As anticipated, the values of the Kuhn segment (λ⁻¹) for CTG and CGG were much lower than that for random DNA (556 and 630 Å versus 950 Å, respectively), indicating that both TRS have a high degree of flexibility (low bending modulus α). By contrast, the torsional modulus β (2.3 and 2.4 × 10⁻¹⁹ erg·cm versus 2.4 × 10⁻¹⁹ erg·cm) and the helical repeat h0 (10.41 and 10.35 bp/turn versus 10.46 bp/turn) were close to those of random DNA, showing that fragments containing CTG and CGG form right-handed B-type helices under these conditions. Considering the S.D. associated with k2 and the three interpolation steps, the accuracy of these values (Table II and Fig. 2) is within ±10%. This error also applies to the results from the analyses that follow (see Figs. 3 and 4) on the variance of writhe and the free energy of supercoiling.

Table II. Elastic moduli of random DNA, (CTG)n, and (CGG)n

For (CTG)n and (CGG)n, the values were derived from the best fit in Fig. 2. Values for random DNA are from the solid line in the bottom panel.

Parameter    Bending modulus αᵃ   Torsional modulus βᵇ   Length of Kuhn segment, λ⁻¹ᶜ   Poisson ratio, σ   Helical repeat, h0ᵈ
             (×10⁻¹⁹ erg·cm)      (×10⁻¹⁹ erg·cm)        (Å)                                               (bp/turn)
Random DNA   1.92                 2.4                    950                            -0.20              10.46
(CTG)n       1.13                 2.3                    556                            -0.51              10.41
(CGG)n       1.27                 2.4                    630                            -0.46              10.35

ᵃ Bending modulus α = kBTP, where kB is the Boltzmann constant, T is the temperature in kelvin, and P is the persistence length.
ᵇ Torsional modulus β = α/(1 + σ).
ᶜ Length of the Kuhn segment (λ⁻¹) = 2P.
ᵈ Helical repeat h0 = 2π/(τ0·lbp), where τ0 and lbp are the continuous torsion and the distance between adjacent base pairs (3.4 Å), respectively.
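The footnote relations can be verified against the tabulated values with a short Python sketch. It assumes T = 293.15 K (20 °C, the incubation temperature given under "Experimental Procedures"); that temperature and the function names are assumptions of this example rather than statements from the table.

```python
BOLTZMANN_ERG_PER_K = 1.380649e-16
ANGSTROM_IN_CM = 1.0e-8

def bending_modulus(kuhn_length_angstrom, temperature_k=293.15):
    """alpha = k_B * T * P in erg*cm, with P = (Kuhn length) / 2."""
    persistence_cm = 0.5 * kuhn_length_angstrom * ANGSTROM_IN_CM
    return BOLTZMANN_ERG_PER_K * temperature_k * persistence_cm

def torsional_modulus(alpha, sigma):
    """beta = alpha / (1 + sigma), where sigma is Poisson's ratio."""
    return alpha / (1.0 + sigma)

# (Kuhn length in Angstrom, Poisson ratio) from Table II.
fits = {
    "random": (950.0, -0.20),
    "CTG": (556.0, -0.51),
    "CGG": (630.0, -0.46),
}

moduli = {}
for name, (kuhn, sigma) in fits.items():
    alpha = bending_modulus(kuhn)
    moduli[name] = (alpha, torsional_modulus(alpha, sigma))
```

Under that temperature assumption the computed α and β agree with the tabulated entries to within rounding, which shows that the five columns of Table II carry only three independent fitted quantities per DNA (λ⁻¹, σ, and h0).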


Fig. 3. Relative efficiencies of cyclization. The oscillation-independent ring-closure probability (J1) was calculated for (CTG)n, (CGG)n, and random DNA according to Equations 5-9 and the values in Table II. The length at which J1 reaches a maximum is defined as J1max. The curves represent J1/J1max ratios. For phased A-tract DNA, the analysis was performed on experimental ratios (28) (●).


Fig. 4. Variance of writhe and free energy of supercoiling. nbpK/RT was calculated as reported (26). Inset, the variance of writhe (⟨Wr²⟩) versus nbp was calculated for (CTG)n, (CGG)n, and random DNA as described (26) using the values reported in Table II.

The choice of λ⁻¹ (950 Å), α (1.9 × 10⁻¹⁹ erg·cm), and β (2.4 × 10⁻¹⁹ erg·cm) for B-DNA was based on the following. Measurements of the persistence length (P) performed by cryoelectron microscopy (29), cyclization kinetics (25, 30), Monte Carlo simulation (31), and electro-optic techniques (32, 33) yielded a consensus value close to 500 Å. Measurements of β performed by ring closure, Monte Carlo simulation, topoisomer distribution, fluorescence depolarization, and electron paramagnetic resonance experiments produced results that ranged from 1.5 to 3.6 × 10⁻¹⁹ erg·cm (reviewed in Ref. 34). The value of 2.4 × 10⁻¹⁹ erg·cm (23) was selected because it was within the range of 2.0-2.4 × 10⁻¹⁹ erg·cm measured by cyclization kinetics (23, 30, 35). This choice of α and β yields σ = -0.20, which is equal to the value measured experimentally on plasmid DNA at low superhelical densities (36). It should be noted that in the analyses that follow, a variation in β from 1.5 to 3.6 × 10⁻¹⁹ erg·cm for random DNA would not change the outcome of the conclusions (26).

Writhe and Supercoiling

The data reported in Table II enable the prediction of the topological behaviors of the three DNA species. Fig. 3 shows the results of an analysis based on the ring-closure probability J1 (34), which enables the determination of the optimal length for circularization (J1max). The probability J1 takes into account the proximity of the two ends within the volume dV, as well as the correct orientation of their tangents, but does not consider the twist-dependent alignment (25).

The J1/J1max ratio reflects the efficiency of circularization as a function of length. The end points of the curves indicate the nbp at which J1 is maximum. An interpolation for phased A-tract DNA is also shown. These molecules circularize ~1000 times more efficiently than random sequence DNA due to phased, in-plane, static curvatures (28, 37). For random DNA, the calculated optimal length is 552 bp, whereas for fragments containing (CTG)n and (CGG)n, the calculated optimal lengths are 326 and 366 bp, respectively, or ~2.4 times greater than for A-tract DNA (~140 bp). Hence, the decrease in persistence length in (CTG)n and (CGG)n causes an ~40% drop in the optimal length of circularization as compared with random DNA.
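The ratios quoted in this paragraph follow directly from the optimal lengths. The sketch below only redoes that arithmetic; the dictionary and function names are illustrative and not part of the original analysis.

```python
# Optimal circularization lengths (bp) as quoted in the text.
OPTIMAL_LENGTH_BP = {"random": 552, "CTG": 326, "CGG": 366, "A-tract": 140}


def relative_drop(sequence, reference="random"):
    """Fractional decrease in optimal length relative to the reference DNA."""
    return 1.0 - OPTIMAL_LENGTH_BP[sequence] / OPTIMAL_LENGTH_BP[reference]


def fold_over(sequence, reference="A-tract"):
    """Ratio of optimal lengths, sequence over reference."""
    return OPTIMAL_LENGTH_BP[sequence] / OPTIMAL_LENGTH_BP[reference]


if __name__ == "__main__":
    for seq in ("CTG", "CGG"):
        print(f"{seq}: drop vs random = {relative_drop(seq):.0%}, "
              f"fold over A-tract = {fold_over(seq):.2f}")
```

The individual values bracket the rounded figures in the text: the drop is larger for CTG than for CGG, and the fold difference over A-tract DNA is smaller for CTG than for CGG.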

The second prediction concerns the writhe and the free energy of supercoiling. Writhe quantitates the out-of-plane trajectory of the helix axis in circular DNA (38). Its variance, ⟨Wr²⟩, is directly related to λ⁻¹ (34). Fig. 4 (inset) compares ⟨Wr²⟩ as a function of chain length. (CTG)n and (CGG)n display greater variations than random DNA, in the order CTG > CGG > random DNA, with differences increasing rapidly with length. Thus, we conclude that the range of conformations adopted by (CTG)n and (CGG)n is greater than for random DNA.

In sufficiently long molecules, twisting of the two ends before ring closure produces a set of topological isomers that differ in the number of helical turns (Lk). The strain generated by twisting is then partitioned with writhe, which gives rise to tertiary turns (τ), or supercoils. The distribution of the topoisomer population is described by ⟨(ΔLk)²⟩ = ⟨Wr²⟩ + ⟨(ΔTw)²⟩ (see definitions under "Experimental Procedures"). The free energy associated with supercoiling (ΔGτ) is related to τ by ΔGτ = Kτ², where K is the apparent twisting coefficient. K and ⟨(ΔLk)²⟩ are related by nbpK/RT = nbp/(2⟨(ΔLk)²⟩), where R is the gas constant and T the absolute temperature. A computation of nbpK/RT is shown in Fig. 4. The calculations indicate that nbpK/RT decreases rapidly over this range of lengths. The values for (CTG)n and (CGG)n are always lower than for random DNA, implying that, for a defined length, it will require less energy to supercoil (CTG)n and (CGG)n than random DNA. In other words, these TRS will act as a "sink" for localizing writhe when embedded in DNA of random sequence.
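The two relations in this paragraph, ΔGτ = Kτ² and K/RT = 1/(2⟨(ΔLk)²⟩), can be combined into a small sketch of how a larger variance lowers the cost of supercoiling. The variance values used in the example are hypothetical placeholders, not numbers from Table II or Fig. 4, and the Boltzmann weighting assumes a relaxed state centered on τ = 0.

```python
import math


def reduced_twisting_coefficient(var_delta_lk):
    """K/RT = 1 / (2 <(dLk)^2>), from nbp*K/RT = nbp / (2 <(dLk)^2>)."""
    if var_delta_lk <= 0:
        raise ValueError("variance of the linking difference must be positive")
    return 1.0 / (2.0 * var_delta_lk)


def supercoiling_free_energy_over_rt(tau, var_delta_lk):
    """dG_tau / RT = (K / RT) * tau**2."""
    return reduced_twisting_coefficient(var_delta_lk) * tau ** 2


def topoisomer_fractions(var_delta_lk, max_abs_tau=3):
    """Boltzmann-weighted fractions for tau = -max..+max (assumed centered on 0)."""
    weights = {
        tau: math.exp(-supercoiling_free_energy_over_rt(tau, var_delta_lk))
        for tau in range(-max_abs_tau, max_abs_tau + 1)
    }
    total = sum(weights.values())
    return {tau: w / total for tau, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical variances: a stiffer and a more flexible molecule.
    for label, variance in [("stiffer", 0.1), ("more flexible", 0.2)]:
        fractions = topoisomer_fractions(variance)
        print(label, {tau: round(f, 4) for tau, f in fractions.items()})
```

A molecule with the larger variance populates the τ = ±1 species more heavily, which is the qualitative behavior the text attributes to (CTG)n and (CGG)n.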

Topoisomers of Small Circles

Additional evidence that (CTG)n and (CGG)n writhe more easily than random DNA comes from the analysis of cyclized products in the ring-closure experiments. Fig. 5 (left panel) shows the pattern of (CGG)63 (221 bp) before the addition of T4 DNA ligase (lane 1) and 4 min after the enzyme was added (lane 2). The linear monomer (m1) was converted to the linear dimer (d1) plus one circular monomer (c). Lanes 1 and 2 for (CGG)70 (242 bp) show the same time points. However, in this case, the linear monomer (m2) was converted to two circular species (ct1 and ct2) in almost equal amounts. A similar pattern was also seen for (CGG)71 (data not shown). Analogously, the reaction performed on (CTG)54 (194 bp) (right panel) ligated the monomer (M1) into a linear dimer (D1) and a single circular species (C1). On the other hand, the reaction performed on (CTG)64 (224 bp) converted the linear monomer (M2) into four circular species (CT1 to CT4). Thus, (CTG)64 circularized to give four isomers, whereas for (CGG)70/71, only two species were observed, despite their longer length. (CGG)70, (CGG)71, and (CTG)64 had fractional helical turns (ΔLk0; ΔLk0 = Lk0 - Int(Lk0)) of 0.38, 0.67, and 0.52, respectively. The formation of topoisomers is facilitated in linear molecules that have ΔLk0 close to 0.5. In fact, during ring closure, such molecules need to untwist, or overtwist, to align their ends. If both of these movements occur, topoisomers will be observed. The energetic barrier to be overcome by this process is inversely proportional to chain length (Fig. 2); thus, for very short chains, only untwisting, or overtwisting, is practically observed. The finding that fragments of CGG of 242 and 245 bp and of CTG of 224 bp form topoisomers in almost equimolar amounts indicates that this barrier is low at these lengths. This contrasts with random DNA, for which fragments of up to ~245 bp, with ΔLk0 ~ 0.5, were shown to circularize in mostly one species (39, 40).
Thus, experimental results and theoretical predictions fully agree that the onset and distribution of topoisomeric species follow the order CTG > CGG > random DNA.
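The fractional helical turn defined above, ΔLk0 = Lk0 - Int(Lk0), can be sketched as follows, assuming Lk0 = nbp/h0 for a fragment of uniform helical repeat. The function names are illustrative. The sketch is not expected to reproduce the reported values of 0.38, 0.67, and 0.52, which depend on the exact helical repeat and fragment composition used by the authors.

```python
import math


def fractional_helical_turn(n_bp, h0):
    """dLk0 = Lk0 - Int(Lk0), with Lk0 = n_bp / h0 (uniform-repeat assumption)."""
    lk0 = n_bp / h0
    return lk0 - math.floor(lk0)


def distance_from_half_turn(n_bp, h0):
    """How far dLk0 lies from 0.5, the value that most favors topoisomers."""
    return abs(fractional_helical_turn(n_bp, h0) - 0.5)


if __name__ == "__main__":
    # Illustrative inputs only: a 224-bp fragment at two trial helical repeats.
    for h0 in (10.4, 10.5):
        print(h0, round(fractional_helical_turn(224, h0), 3))
```

The example shows how sensitive ΔLk0 is to small changes in h0, which is why the fractional turn has to be determined from the measured helical repeat of each sequence.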


Fig. 5. Topoisomers of circular monomers. Fragments were prepared and processed as described for the determination of k1. Lanes 1 show the migration of linear monomers before the addition of T4 DNA ligase. Lanes 2 show the products formed after 4 min of reaction with the enzyme. m1, m2, M1, and M2, linear monomers of (CGG)63, (CGG)70, (CTG)54, and (CTG)64, respectively; d1, d2, D1, and D2, linear dimers of (CGG)63, (CGG)70, (CTG)54, and (CTG)64, respectively; c, circular monomer of (CGG)63; ct1 and ct2, circular monomers of (CGG)70; C1, circular monomer of (CTG)54; CT1, CT2, CT3, and CT4, circular monomers of (CTG)64. Lane L, reference size ladder.

Writhe in Plasmids Containing (CTG)n and (CGG)n

To further test the above predictions, we performed band shift assays, a method based on the relative electrophoretic velocities of a family of circular plasmids (27, 41). Enzymatic nicking and closing of circular DNA result in a Gaussian distribution of topoisomers that peaks at the planar species (ΔLk closest to 0). Flanking isomers differ by ±1 in their value of ΔLk due to pivoting of the ends about the nick. If a segment of xbp containing an integral number of helical turns is inserted into a plasmid of nbp, then the topoisomers of A (A = nbp) and B (B = nbp + xbp) have identical ΔLk values and, therefore, identical writhe. Since writhe affects the electrophoretic mobility, the migration pattern of A and B will be identical when corrected for the size difference. On the other hand, if xbp contains a non-integral number of helical turns, topoisomers A and B will have different ΔLk as well as writhe. This will cause a shift in migration, which may be used to calculate the h0 of xbp.

Calculating h0 by this method requires that the writhe of the inserted fragment be the same as that of the vector. If the bending force of xbp is different from that of nbp, the magnitude and the partition of writhe between xbp and nbp will be modified, and the velocity of migration altered.
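The logic of the band shift calculation can be sketched under a simplifying assumption: the measured shift equals the fractional number of helical turns contributed by the insert, so that xbp/h0 = m + shift for some whole number m. The function name, the sign convention of the shift, and the use of a trial repeat (h0_guess) to pick m are all assumptions of this sketch, not the published procedure (27, 41).

```python
def apparent_helical_repeat(x_bp, band_shift, h0_guess=10.5):
    """Estimate h0 (bp/turn) of an insert from a fractional band shift.

    Assumes x_bp / h0 = m + band_shift, where m is the whole number of
    turns closest to the value expected from the trial repeat h0_guess.
    """
    whole_turns = round(x_bp / h0_guess - band_shift)
    turns = whole_turns + band_shift
    if turns <= 0:
        raise ValueError("insert must span a positive number of helical turns")
    return x_bp / turns


if __name__ == "__main__":
    # Hypothetical insert lengths and shifts, for illustration only.
    for x_bp, shift in [(105, 0.0), (100, 0.5), (90, 0.6)]:
        print(x_bp, shift, round(apparent_helical_repeat(x_bp, shift), 2))
```

If the insert and the vector partition writhe differently, the measured shift no longer reflects twist alone, and the value returned by such a calculation becomes an apparent rather than a true helical repeat.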

Fig. 6A shows a representative agarose gel comparing plasmids containing n repeats of CGG with those containing n+a repeats. The latter is then compared with a plasmid containing n+a+b repeats and so on. Studies were also performed on a family of plasmids containing random DNA ranging in length from 90 to 270 bp (Fig. 6B). Each of the triangles corresponds to the h0 calculated from inserting xbp of random DNA into a vector. The average value of h0 (10.26 ± 0.1 bp/turn) is in excellent agreement with that reported previously (27, 41). The straight horizontal line passing through these points is a good indication that the writhe of the random sequence inserts is the same as that of the vector.


Fig. 6. Apparent helical repeat as a measure of writhe. A, agarose gel electrophoretic pattern of plasmids containing (CGG)n. Lane a, pUC19-NotI; lane b, pUC19-NotI/pRW3006 (6 repeats); lane c, pRW3006 (6 repeats); lane d, pRW3006 (6 repeats)/pRW3008 (8 repeats); lane e, pRW3008 (8 repeats); lane f, pRW3008 (8 repeats)/pRW3010 (10 repeats); lane g, pRW3010 (10 repeats); lane h, pRW3010 (10 repeats)/pRW3017 (17 repeats); lane i, pRW3017 (17 repeats); lane j, pRW3017 (17 repeats)/pRW3024 (24 repeats); lane k, pRW3024 (24 repeats); lane l, pRW3024 (24 repeats)/pRW3029 (29 repeats); lane m, pRW3029 (29 repeats); lane n, pRW3029 (29 repeats)/pRW3032 (32 repeats); lane o, pRW3032 (32 repeats). B, apparent h0 values for random DNA (▲) and (CTG)n (●). C, dependence of the apparent h0 of (CGG)n on positive (○) and negative (●) supercoiling. D, apparent h0 values for (CGG)n (▲), (CGG)n containing interruptions (●), methylated (CGG)n (△), and methylated (CGG)n containing interruptions (○). Each experimental data point represents an average value from triplicate experiments, and the S.D. of each point is within ±0.1 bp/turn.

Fig. 6 (B (circles), C, and D) shows the effect of inserting various lengths of (CTG)n or (CGG)n into plasmids. The widely oscillating behavior of the calculated h0 reflects complex changes in the topology of the molecules, as expected if the writhe of the inserted TRS differs from that of the vector. The results include inserts with methylated cytosine residues or naturally occurring polymorphisms (panel D) as well as positively and negatively writhed molecules (panel C). In all experiments, the calculated h0 varied with insert length, unlike the constant value observed for the random DNA inserts. In addition, the results of panel D substantiate the observed lack of influence (42) on the J-factor by methylated cytosines. Thus, the writhe of (CTG)n and (CGG)n is different from that of random sequence DNA, and this change causes drastic alterations in the topology of the supercoiled molecules. Specifically, due to their low bending modulus α, the TRS regions must be the most flexible or contortable domains of the plasmids and, hence, the preferential sites for the partitioning of supercoil density.

Other Structural Analyses

Chemical and enzymatic probes used for characterizing non-B-DNA conformations such as cruciforms, B-Z junctions, triplexes, nodule DNA, and unpaired AT-rich regions (38, 43-45) were employed on plasmids containing various lengths of CTG and CGG. Exhaustive analyses were performed with bromoacetaldehyde, osmium tetroxide, dimethyl sulfate, potassium permanganate, chloroacetaldehyde, copper-phenanthroline, diethyl pyrocarbonate, S1 nuclease, and DNase I under a wide variety of experimental conditions. No reactivities were detected that would indicate accessible bases or unpaired regions as found for the conformations identified above. However, positive internal control studies with a cruciform or B-Z junctions confirmed the validity of the interpretations.

Also, two-dimensional agarose gel electrophoresis has been widely used to study supercoil-driven structural transitions (38, 43-45). These studies were performed on plasmids containing from 24 to 240 CGG repeats (19) (both non-methylated and methylated) and on CTG repeats containing 75 and 130 units (7-10) (data not shown). No supercoil-induced transitions were observed.

Furthermore, a 420-bp BamHI fragment containing (CTG)130 (16) behaved immunologically as expected for a fully base-paired B-DNA of high G + C content. It competed as strongly as calf thymus DNA for binding to a monoclonal antibody that favors G/C sequences in B-DNA conformation and less strongly for a monoclonal antibody that favors A/T sequences. It did not react with a monoclonal antibody specific for single-stranded DNA. When (CTG)130-methylated bovine serum albumin complexes with adjuvant were injected into three normal mice, there was no antibody response above that stimulated by adjuvant alone, and no antibodies specific for the flexible (CTG)130 structure were formed. Conformational variants (such as Z-DNA, cruciforms, triplex, A-helix, or single-stranded DNA) do induce structure-specific antibodies (46). In summary, all of the above structural analyses revealed that the flexible (CTG)n and (CGG)n TRS behave as fully paired, right-handed B-DNA and that there are no structural transitions induced by supercoil density that are detected by these methods.

Alternatively, nondenaturing PAGE analyses were performed on restriction fragments containing various lengths of CTG (18) and CGG (9-240 repeats, both methylated and non-methylated), which revealed their expected rapid migration (by up to 30%). However, these sequences migrated normally on agarose gel electrophoresis. The increased velocity could be abolished by treatment with chloroquine or reduced by the addition of 7 M urea (data not shown). These results further confirm the inherent structural properties of these TRS. Comparable studies conducted on the other eight TRS (17, 47) (repeat lengths investigated as follows: AGT, 77 and 45; ATG, 91 and 49; AAC, 58 and 42; ACC, 140 and 46; AGG, 74 and 53; AAG, 100 and 52; GTC, 98 and 62; and TTA, 90 and 31) showed normal mobilities except for the longest ACC and GTC, which displayed the behavior of CTG and CGG, but to a smaller extent (data not shown).


DISCUSSION

TRS (CTG, CGG, and AAG) occupy a central role in our comprehension of several human hereditary neuromuscular diseases. This report documents the flexible structure (with an inherently greater writhe than random sequence DNA) of right-handed B-helical CTG·CAG and CGG·CCG based on cyclization kinetics, helical repeat determinations, and PAGE analyses.

The low persistence lengths of (CTG)n and (CGG)n reflect an enhanced flexibility along the helix axis. Two mechanisms contribute to the deflection of the helix axis from linearity: dynamic thermal motion and static bends (48, 49). A static bend is represented by a wedge between 2 adjacent bp, which may be caused by the geometry of the base pair step itself or by an intercalated ligand or amino acid residue(s). Thermal motion produces constant fluctuations of the helix axis, and consequently, the observed (apparent) persistence length (Pa) is composed of static (Ps) and dynamic (Pd) components (1/Pa = (1/Ps) + (1/Pd)) (48, 49). For a sequence that repeats regularly along the helix, Ps → ∞, and therefore, Pa = Pd.

Both (CTG)n and (CGG)n are a monotonous succession of trinucleotide units, each one occupying the same position along the helix every other turn (3 nucleotides × 7 = 21 nucleotides/2h0, h0 ≅ 10.5 bp/turn). Hence, the macroscopic, idealized shape of a TRS should be that of a straight rod, in the absence of thermal motion. The estimated Pd for straight random B-DNA is ~800 Å (29). The values of Pa (=Pd) of 278 and 315 Å measured for (CTG)n and (CGG)n, respectively, are ~60% Pa (500 Å) and ~40% Pd for random DNA. This suggests that one or more dinucleotide steps within each trinucleotide repeat unit are more flexible than average (50) and/or that there are flexible hinges along the sequence.
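The decomposition 1/Pa = 1/Ps + 1/Pd and the percentages quoted above can be checked with a brief sketch. The function names are illustrative, and the static component computed for random DNA is simply what the relation implies for Pa = 500 Å and Pd = 800 Å; it is not a value reported in the text.

```python
import math


def apparent_persistence(p_static, p_dynamic):
    """Combine components: 1/Pa = 1/Ps + 1/Pd (lengths in angstroms)."""
    return 1.0 / (1.0 / p_static + 1.0 / p_dynamic)


def static_component(p_apparent, p_dynamic):
    """Solve 1/Ps = 1/Pa - 1/Pd; returns infinity when Pa equals Pd."""
    inverse = 1.0 / p_apparent - 1.0 / p_dynamic
    if inverse < 0:
        raise ValueError("Pa cannot exceed Pd in this model")
    return math.inf if inverse == 0 else 1.0 / inverse


if __name__ == "__main__":
    print("implied Ps for random DNA:", round(static_component(500.0, 800.0)))
    for name, pa in [("CTG", 278.0), ("CGG", 315.0)]:
        print(f"{name}: {pa / 500.0:.0%} of Pa and {pa / 800.0:.0%} of Pd "
              "for random DNA")
```

For a perfectly regular repeat the static term drops out (Ps → ∞), so the measured persistence length is taken to equal the dynamic component.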

Of the 10 possible dinucleotide steps, (CTG)n contains one-third each of AG/CT, CA/TG, and GC/GC, whereas (CGG)n contains one-third each of GC/GC, GG/CC, and CG/CG. Thus, both (CTG)n and (CGG)n share GC/GC. The geometry (flexibility at the dinucleotide step) of duplex DNA is best analyzed by x-ray crystallography.

Studies indicate that CA/TG is highly polymorphic (51-54), especially in the degree of slide, roll, and twist, and is very dynamic (55). GC/GC also is rated high on a "flexibility" scale (third in one analysis and fifth in the other) (52, 54). This dinucleotide step is associated with high roll wedge angles, which may be stabilized by Mg2+ ions (56). AG/CT is not well represented in the family of crystal structures and, therefore, was excluded from one of the studies (54). For GG/CC and CG/CG, the most recent study indicates that they are flexible (54).

For each of the 10 dinucleotide steps, the mean roll and tilt angles are estimates of the equilibrium values, while the statistical scatter in these angles reflects the intrinsic flexibility for roll and tilt. It has been shown that such an analysis, averaged over all sequences, gives a reasonable value for the persistence length of B-DNA (52). We note that in the dinucleotide steps discussed above, CA/TG, CG/CG, GC/GC, and GG/CC are ranked second through fifth in terms of flexibility, out of 10 (AG/CT is ranked eighth). The dinucleotide flexibilities can then be used to predict the relative flexibilities of the different TRS (Table III). CGG and CTG are predicted to be the third and fourth most flexible, out of a total of 12 sequences. Interestingly, the TRS (ACC and GTC) that ranked first and fifth also showed anomalously rapid PAGE mobilities.
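The way dinucleotide flexibilities combine into a TRS flexibility can be sketched as follows, using the variance-addition scheme described in the legend to Table III. The roll and tilt standard deviations in the example are hypothetical placeholders (the per-step crystallographic values are not listed in this paper), and the step labels use the alphabetically first member of each complementary pair, which is a naming choice of this sketch.

```python
import math

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def canonical_step(step):
    """Label a dinucleotide step by the first of its two strand readings."""
    return min(step, reverse_complement(step))


def steps_of_repeat(unit):
    """Dinucleotide steps of a tandem repeat, e.g. CTG -> CT, TG, GC."""
    n = len(unit)
    return [canonical_step(unit[i] + unit[(i + 1) % n]) for i in range(n)]


def trs_flexibility_sd(unit, roll_sd, tilt_sd):
    """Square root of the summed roll and tilt variances over all steps."""
    variance = sum(roll_sd[s] ** 2 + tilt_sd[s] ** 2 for s in steps_of_repeat(unit))
    return math.sqrt(variance)


if __name__ == "__main__":
    all_steps = {canonical_step(a + b) for a in "ACGT" for b in "ACGT"}
    # Hypothetical, uniform S.D. values in degrees; real values differ per step.
    roll = {s: 3.0 for s in all_steps}
    tilt = {s: 4.0 for s in all_steps}
    print(sorted(all_steps))
    print(steps_of_repeat("CTG"), round(trs_flexibility_sd("CTG", roll, tilt), 2))
```

With step-specific standard deviations in place of the uniform placeholders, the same function would produce a ranking of the kind shown in Table III.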

Table III. Relative TRS flexibilities based on analysis of random DNA crystal structures

The S.D. of roll and tilt angle for each of the 10 dinucleotide steps was determined as described previously (53) using all of the crystal structures available as of summer 1996. If those two modes of flexibility are assumed to be independent, then the flexibility of each dinucleotide step can be estimated by adding the variances in roll and tilt. Furthermore, the flexibility of each TRS is estimated by adding the variances of the appropriate three dinucleotide steps. Thus, for CTG, contributions are added for the CT, TG, and GC steps. The flexibility of each TRS is reported as a S.D., i.e. the square root of the variance, measured in degrees. This corresponds to the predicted root mean square deviation in the direction of the helix axis as a result of thermal fluctuations. The relative flexibilities are ranked, with 1 indicating the most flexible TRS and 12 the least flexible. There are 4³ = 64 possible TRS, but only 12 of these sequences are unique.

Rank  TRS                                  Flexibility

1     ACC = CCA = CAC = GGT = GTG = TGG    15.2°
2     AAC = ACA = CAA = GTT = TTG = TGT    14.5°
3     CGG = GGC = GCG = CCG = CGC = GCC    14.3°
4     CTG = TGC = GCT = CAG = AGC = GCA    14.0°
5     GTC = TCG = CGT = GAC = ACG = CGA    13.9°
6     CCC = GGG                            13.0°
7     ACT = CTA = TAC = AGT = GTA = TAG    12.7°
8     ATC = TCA = CAT = GAT = ATG = TGA    11.8°
9     AGG = GGA = GAG = CCT = CTC = TCC    11.0°
10    AAA = TTT                            10.4°
11    AAG = AGA = GAA = CTT = TTC = TCT    10.1°
12    AAT = ATA = TAA = ATT = TTA = TAT    9.8°
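The statement that only 12 of the 64 possible triplets are unique can be verified by grouping triplets that describe the same duplex repeat, i.e. those related by cyclic permutation of the repeating unit or by reading the complementary strand. The sketch below is an independent check written for illustration; the function names are not from the original work.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def equivalence_class(unit):
    """All triplets naming the same duplex repeat as `unit`."""
    members = set()
    for strand in (unit, reverse_complement(unit)):
        for i in range(len(strand)):
            members.add(strand[i:] + strand[:i])  # cyclic permutation
    return frozenset(members)


def triplet_classes():
    """Distinct equivalence classes among all 4**3 triplets."""
    return {equivalence_class("".join(t)) for t in product("ACGT", repeat=3)}


if __name__ == "__main__":
    classes = triplet_classes()
    print(len(classes), "unique TRS")
    for cls in sorted(classes, key=lambda c: sorted(c)[0]):
        print(" = ".join(sorted(cls)))
```

Ten of the classes contain six triplets each and two (the homopolymers) contain two each, which accounts for all 64 sequences.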

Flexibility may also be caused by the unpairing of the double helix (35). Due to their repetitive nature, the complementary strands of (CTG)n as well as (CGG)n may slide relative to each other following transient melting and, therefore, form slipped structures (38, 43) of varying size, which exist in low proportions, at multiple sites. Because of their random location, these structures might escape detection by chemical probe analyses. However, they would also be expected to cause the loss of dependence of the J-factors on the fractional twist (35), an occurrence not observed experimentally.

Thus, despite the imprecision of the model, the limitations of the dinucleotide approximation, and uncertainties associated with crystal packing forces, the low persistence lengths of (CTG)n and (CGG)n are consistent with the variations in crystallographic values of slide, roll, and tilt.

Flexible (highly writhed) DNA is the first intrinsic, unusual DNA conformational feature associated with human hereditary neuromuscular diseases. It is tempting to speculate that this property promotes the slippage of complementary DNA strands and is responsible for the expansion and the non-mendelian transmission of these diseases (7-10, 38, 43). However, the role of flexibility (and writhe) in expansion, toroidal nucleosome structure (13-15), DNA polymerase pausing (16), recognition by methyl-directed mismatch repair enzymes (57), binding of certain specific proteins (58, 59), and preferential methylation of long CGG tracts (60) remains to be elucidated. The establishment of methodologies to investigate conformational problems in living cells (43-45) offers hope for the evaluation of these flexible and highly writhed structures in molecular mechanisms responsible for human genetic diseases.


FOOTNOTES

*   This work was supported by National Institutes of Health Grants GM 52982 (to R. D. W.) and GM 32375 (to B. D. S.) and by a grant from the Robert A. Welch Foundation (to R. D. W.). The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
§   Contributed equally to this work.
**   To whom correspondence should be addressed: Center for Genome Research, Inst. of Biosciences and Technology and Dept. of Biochemistry and Biophysics, Texas A & M University, Texas Medical Center, 2121 W. Holcombe Blvd., Houston, TX 77030. Tel.: 713-677-7651; Fax: 713-677-7689; E-mail: RWELLS{at}IBT.TAMU.EDU.
1   For nomenclature of triplet repeat sequences, CTG or (CTG)n designates a duplex sequence of repeating CTG·CAG units, which may also be written as GCT or TGC; the complementary strand may also be written as GCA or AGC. The orientation is 5' to 3' for both designations of the antiparallel strands. The same principles hold for the other triplet repeat sequences.
2   The abbreviations used are: TRS, triplet repeat sequence(s); bp, base pair(s); PAGE, polyacrylamide gel electrophoresis.

ACKNOWLEDGEMENTS

We thank Timothy Farrell and Ela Klysik for technical help; Drs. R. L. Baldwin, S. D. Levene, and M. A. El Hassan for helpful discussions; Dr. J. C. Wang for critically reading the manuscript; and Dr. C. T. Caskey for providing plasmids containing 75 and 105 CTG·CAG repeats.


REFERENCES

  1. Davies, K. E., and Warren, S. T. (eds) (1993) Genome Analysis: Genome Rearrangement and Stability, Vol. 7, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
  2. Bates, G., and Lehrach, H. (1994) Bioessays 16, 277-284
  3. Sutherland, G. R., and Richards, R. I. (1995) Proc. Natl. Acad. Sci. U. S. A. 92, 3636-3641
  4. Panzer, S., Kuhl, D. P. A., and Caskey, C. T. (1995) Stem Cells 13, 146-157
  5. Krahe, R., and Ashizawa, T. (1995) in Hypervariable Genetic Markers (Wetherall, J., and Groth, D., eds), pp. 29-60, CRC Press, Inc., Boca Raton, FL
  6. Campuzano, V., Montermini, L., Moltò, M. D., Pianese, L., Cossée, M., Cavalcanti, F., Monros, E., Rodius, F., Duclos, F., Monticelli, A., Zara, F., Cañizares, J., Koutnikova, H., Bidichandani, S. I., Gellera, C., Brice, A., Trouillas, P., De Michele, G., Filla, A., De Frutos, R., Palau, F., Patel, P. I., Di Donato, S., Mandel, J.-L., Cocozza, S., Koenig, M., and Pandolfo, M. (1996) Science 271, 1423-1427
  7. Kang, S., Ohshima, K., Jaworski, A., and Wells, R. D. (1996) J. Mol. Biol. 258, 543-547
  8. Ohshima, K., Kang, S., and Wells, R. D. (1996) J. Biol. Chem. 271, 1853-1856
  9. Kang, S., Jaworski, A., Ohshima, K., and Wells, R. D. (1995) Nat. Genet. 10, 213-218
  10. Bowater, R. P., Rosche, W. A., Jaworski, A., Sinden, R. R., and Wells, R. D. (1996) J. Mol. Biol. 264, 82-96
  11. Nelson, D. L. (1993) in Genome Analysis: Genome Rearrangement and Stability (Davies, K. E., and Warren, S. T., eds), Vol. 7, pp. 1-24, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
  12. Wieringa, B. (1994) Hum. Mol. Genet. 3, 1-7
  13. Wang, Y.-H., Amirhaeri, S., Kang, S., Wells, R. D., and Griffith, J. D. (1994) Science 265, 669-671
  14. Wang, Y.-H., and Griffith, J. (1995) Genomics 25, 570-573
  15. Godde, J. S., and Wolffe, A. P. (1996) J. Biol. Chem. 271, 15222-15229
  16. Kang, S., Ohshima, K., Shimizu, M., Amirhaeri, S., and Wells, R. D. (1995) J. Biol. Chem. 270, 27014-27021
  17. Ohshima, K., Kang, S., Larson, J. E., and Wells, R. D. (1996) J. Biol. Chem. 271, 16773-16783
  18. Chastain, P. D., II, Eichler, E. E., Kang, S., Nelson, D. L., Levene, S. D., and Sinden, R. R. (1995) Biochemistry 34, 16125-16131
  19. Shimizu, M., Gellibolian, R., Oostra, B. A., and Wells, R. D. (1996) J. Mol. Biol. 258, 614-626
  20. Jacobson, H., and Stockmayer, W. H. (1950) J. Chem. Phys. 18, 1600-1606
  21. Wang, J. C., and Davidson, N. (1966) J. Mol. Biol. 19, 469-482
  22. Cantor, C. R., and Schimmel, P. R. (1980) Biophysical Chemistry, Vol. III, W. H. Freeman & Co., New York
  23. Shore, D., and Baldwin, R. L. (1983) J. Mol. Biol. 170, 957-981
  24. Crothers, D. M., Drak, J., Kahn, J. D., and Levene, S. D. (1992) Methods Enzymol. 212, 3-29
  25. Shimada, J., and Yamakawa, H. (1984) Macromolecules 17, 689-698
  26. Gellibolian, R., Bacolla, A., and Wells, R. D. (1997) J. Biol. Chem. 272, 16793-16797
  27. Wang, J. C. (1979) Proc. Natl. Acad. Sci. U. S. A. 76, 200-203
  28. Koo, H.-S., Drak, J., Rice, J. A., and Crothers, D. M. (1990) Biochemistry 29, 4227-4234
  29. Bednar, J., Fuller, P., Katritch, V., Stasiak, A., Dubochet, J., and Stasiak, A. (1995) J. Mol. Biol. 254, 579-594
  30. Taylor, W. H., and Hagerman, P. J. (1990) J. Mol. Biol. 212, 363-376
  31. Levene, S. D., and Crothers, D. M. (1986) J. Mol. Biol. 189, 73-83
  32. Hagerman, P. J. (1988) Annu. Rev. Biophys. Biophys. Chem. 17, 265-286
  33. Diekmann, S., Hillen, W., Morgeneyer, B., Wells, R. D., and Pörschke, D. (1982) Biophys. Chem. 15, 263-270
  34. Shimada, J., and Yamakawa, H. (1985) J. Mol. Biol. 184, 319-329
  35. Kahn, J. D., Yun, E., and Crothers, D. M. (1994) Nature 368, 163-166
  36. Boles, T. C., White, J. H., and Cozzarelli, N. R. (1990) J. Mol. Biol. 213, 931-951
  37. Ulanovsky, L., Bodner, M., Trifonov, E. N., and Choder, M. (1986) Proc. Natl. Acad. Sci. U. S. A. 83, 862-866
  38. Sinden, R. R. (1995) DNA Structure and Function, Academic Press, Inc., San Diego, CA
  39. Shore, D., and Baldwin, R. L. (1983) J. Mol. Biol. 170, 983-1007
  40. Horowitz, D. S., and Wang, J. C. (1984) J. Mol. Biol. 173, 75-91
  41. Peck, L. J., and Wang, J. C. (1981) Nature 292, 375-377
  42. Hodges-Garcia, Y., and Hagerman, P. J. (1995) J. Biol. Chem. 270, 197-201
  43. Wells, R. D. (1996) J. Biol. Chem. 271, 2875-2878
  44. Lukomski, S., and Wells, R. D. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 9980-9984
  45. Herbert, A., and Rich, A. (1996) J. Biol. Chem. 271, 11595-11598
  46. Stollar, B. D. (1992) Prog. Nucleic Acid Res. Mol. Biol. 42, 39-77
  47. Ohshima, K., Kang, S., Larson, J. E., and Wells, R. D. (1996) J. Biol. Chem. 271, 16784-16791
  48. Trifonov, E. N., Tan, R. K.-Z., and Harvey, S. C. (1987) in Structure & Expression: DNA Bending and Curvature (Olson, W. K., Sarma, M. H., Sarma, R. H., and Sundaralingam, M., eds), Vol. 3, pp. 243-253, Adenine Press, New York
  49. Schellman, J. A., and Harvey, S. C. (1995) Biophys. Chem. 55, 95-114
  50. Olson, W. K., Marky, N. L., Jernigan, R. L., and Zhurkin, V. B. (1993) J. Mol. Biol. 232, 530-554
  51. Dickerson, R. E., Goodsell, D. S., and Neidle, S. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 3579-3583
  52. Gorin, A. A., Zhurkin, V. B., and Olson, W. K. (1995) J. Mol. Biol. 247, 34-48
  53. Young, M. A., Ravishanker, G., Beveridge, D. L., and Berman, H. M. (1995) Biophys. J. 68, 2454-2468
  54. El Hassan, M. A., and Calladine, C. R. (1996) J. Mol. Biol. 259, 95-103
  55. Harrington, R. E., and Winicov, I. (1994) Prog. Nucleic Acid Res. Mol. Biol. 17, 195-270
  56. Brukner, I., Susic, S., Dlakic, M., Savic, A., and Pongor, S. (1994) J. Mol. Biol. 236, 26-32
  57. Jaworski, A., Rosche, W. A., Gellibolian, R., Kang, S., Shimizu, M., Bowater, R. P., Sinden, R. R., and Wells, R. D. (1995) Proc. Natl. Acad. Sci. U. S. A. 92, 11019-11023
  58. Richards, R. I., Holman, K., Yu, S., and Sutherland, G. R. (1993) Hum. Mol. Genet. 2, 1429-1435
  59. Deissler, H., Behn-Krappa, A., and Doerfler, W. (1996) J. Biol. Chem. 271, 4327-4334
  60. Ashley, C. T., and Warren, S. T. (1995) Annu. Rev. Genet. 29, 703-728

©1997 by The American Society for Biochemistry and Molecular Biology, Inc.