(Received for publication, July 5, 1995; and in revised form, August 30, 1995)
Expansion of a d(CGG)n run within the
5′-untranslated region of the X-linked human gene FMR1 blocks FMR1 transcription, delays its replication, and precipitates
the fragile X syndrome. We showed previously that d(CGG)n
tracts aggregate into interstrand tetrahelical complexes
(Fry, M., and Loeb, L. A. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 4950-4954). Here we show that, under
physiological conditions in vitro, these sequences also form unimolecular
hairpin structures. Folding is demonstrated by temperature-dependent
mobility of d(CGG)n oligomers in a nondenaturing
polyacrylamide gel, by UV hyperchromicity of thermally denaturing
oligomers, and by UV cross-linking of compact forms of d(CGG)n
runs interspersed by thymidine clusters. That the compact
d(CGG)n structures are unimolecular is suggested by
their zero-order kinetics of formation. Diethyl pyrocarbonate
modification reveals a single, 4-5-residue-long central or
epicentral unpaired loop in folded d(CGG)n
oligomers. The position of this loop remains unchanged by
insertion of thymidine clusters into 15- or 33-mer d(CGG) tracts, as
indicated by KMnO4 probing of unpaired thymidines. The
presence of a single loop in folded d(CGG)n
oligomers and the accessibility of every guanine to dimethyl
sulfate modification suggest that they are hairpin and not tetraplex
structures. Modeling indicates that different d(CGG)n
hairpins are stabilized by guanine-guanine Hoogsteen hydrogen
bonds or by Hoogsteen and Watson-Crick bonds. If formed in
vivo, d(CGG)n hairpins could entail slippage
and trinucleotide expansion during replication and could obstruct FMR1 gene transcription and replication.
Fragile X syndrome is an inherited, X-linked dominant mental
retardation disorder affecting about one male in 1500 and one female in
2500 (Webb, 1989; Richards and Sutherland, 1992). This syndrome is
frequently associated with a folate-sensitive fragile site, Xq27.3, on
chromosome X of cells of affected males (Sutherland, 1977). The fragile
X syndrome is characterized by a substantial expansion of a d(CGG)n
trinucleotide repeat located in the 5′-untranslated region of a
housekeeping gene, FMR1, which was identified at a locus
coincident with the Xq27.3 breakpoint (Verkerk et al., 1991).
Whereas normal individuals have 2-50 copies of the d(CGG)
sequence, the trinucleotide is amplified in affected subjects to
>200-2000 copies (Fu et al., 1991; Kremer et al., 1991;
Oberlé et al., 1991; Nakahori et al., 1991; Verkerk et al., 1991;
Yu et al., 1991). Expansion of the d(CGG) repeat is accompanied
by methylation of the FMR1 promoter and of the amplified
trinucleotide tract (Bell et al., 1991; Vincent et al., 1991;
Pieretti et al., 1991; Luo et al., 1993). Subsequent to d(CGG)
expansion and hypermethylation, the FMR1 gene becomes
transcriptionally silent (Pieretti et al., 1991; Hansen et al., 1992;
Sutcliffe et al., 1992), and the replication of a chromosomal segment spanning
150 kb 5′ and
34 kb 3′ from the d(CGG)n
stretch is delayed (Hansen et al., 1993).
The molecular mechanisms that govern d(CGG) amplification and
hypermethylation and that link d(CGG)
expansion to the suppressed transcription of FMR1 and to its
delayed replication are not known. In addressing these problems, we
showed recently that, under physiological conditions, model d(CGG)n
oligomers form a tetramolecular four-stranded structure
with time-dependent and DNA concentration-dependent kinetics. We also
found that the multimolecular tetraplex structure of short
d(CGG)n tracts is stabilized by 5-methylation of
the cytosine residues (Fry and Loeb, 1994). The facile formation of
tetramolecular complexes by d(CGG)n raised the
hypothesis that single-stranded stretches of this tract may also
generate unimolecular hairpin or tetraplex structures (Fry and Loeb,
1994). In this communication we use gel mobility analysis and chemical
probing to demonstrate that d(CGG)n
oligodeoxynucleotides can fold back at a physiological range
of salt concentrations, temperatures, and pH values to form hairpin
structures. In contrast to the second-order kinetics of formation of
tetramolecular quadruplex d(CGG)n (Fry and Loeb,
1994), hairpin structures of this sequence are formed with zero-order
kinetics. Our evidence for hairpin formation by d(CGG)n
is supported by the reports of Gacy et al. (1995) and
Chen et al. (1995), which were published as this work was
completed and which provide NMR evidence for the formation of
hairpin structures by d(CGG)n and by other
trinucleotide tracts.
The folding of exposed expanded single-strand
runs of d(CGG)n during the replication of fragile X
cell DNA could entail slippage and trinucleotide expansion.
Furthermore, guanine-rich tracts of RNA or DNA that fold back to
generate a hairpin (Christiansen et al., 1994) or tetraplex
structure (Christiansen et al., 1994; Woodford et al., 1994)
were shown to arrest DNA synthesis in vitro.
If formed in vivo along single-strand stretches, back-folded
d(CGG)n structures may also obstruct DNA
transcription and replication. These structures might, therefore,
contribute to the observed blocking of FMR1 transcription and
to its delayed replication in fragile X cells.
Thymidine residues in oligomers that contain thymidine clusters were
probed by KMnO4, which selectively modifies unpaired
thymidine residues in folded nucleotide sequences (Burton and Reilly,
1966; Hayatsu and Ukita, 1967; Friedman and Brown, 1978). DNA
oligomers, folded or denatured as described above, were exposed to
KMnO4 at a final concentration of 0.5 mM for either
30 min at 4 °C (folding conditions) or 5 min at 80 °C
(denaturation conditions). Termination of the KMnO4
reaction, isolation of the DNA, its treatment by piperidine, and
electrophoresis in a 20% polyacrylamide sequencing gel were as
described above for DEPC.
Figure 3: Formation by UV-irradiation of covalently cross-linked forms of the H3T3 oligomer. End-labeled H3T3 DNA was heat-denatured, annealed, and irradiated by UV light in the absence or presence of 200 mM of the indicated salt as described under ``Experimental Procedures.'' Shown is an autoradiogram of 12% polyacrylamide denaturing gel electrophoresis of control unirradiated and UV irradiated H3T3 oligomer samples. Arrows mark the band of rapidly migrating cross-linked DNA.
Figure 1: Electrophoretic migration
of d(CGG)n oligomers in a nondenaturing
polyacrylamide gel at 4 or 75 °C. A, electrophoregram of
d(CGG)n oligomers run under nondenaturing
conditions at 4 °C and in the absence of salt. End-labeled oligomer
or DNA marker,
500 pg of DNA in 2.5 µl of TE buffer, pH 8.0,
was heat-denatured for 3 min at 90 °C, annealed for 30 min at 4
°C, and electrophoresed at 4 °C through a nondenaturing 12%
polyacrylamide gel (see ``Experimental Procedures''). B, electrophoregram of d(CGG)n oligomers
run under nondenaturing conditions at 75 °C and in the absence of
salt. End-labeled oligomer or DNA marker,
500 pg of DNA in 2.5
µl of TE buffer, pH 8.0, was heat-denatured for 3 min at 90 °C,
incubated for an additional 30 min at 75 °C, and electrophoresed at 75
°C through a nondenaturing 12% polyacrylamide gel (see
``Experimental Procedures''). C, electrophoregram of
d(CGG)n oligomers run under nondenaturing
conditions at 4 °C and in the presence of salt. DNA sample
preparation and electrophoresis were performed as described for A above, except that DNA annealing and electrophoresis were
conducted in buffers containing 100 mM KCl. D,
electrophoregram of d(CGG)n oligomers run under
nondenaturing conditions at 75 °C and in the presence of salt. DNA
sample preparation and electrophoresis were performed as in B above, except that DNA annealing and electrophoresis were conducted
in buffers containing 100 mM KCl. E, electrophoregram
of d(CGG)n oligomers run through a 12%
polyacrylamide, 8.0 M urea denaturing gel. DNA sample
preparation and electrophoresis were conducted as described under
``Experimental Procedures.'' Marker DNA designations were as
follows: Univ., bacteriophage M13 universal primer; Tet, Tetrahymena telomeric sequence
5′-d(T2G4)4-3′; H3T2 CL,
UV-light cross-linked oligomer H3T2.
The rapidly migrating forms of all of the examined
d(CGG)n oligomers are found to form within 5-7 min,
which is the shortest testable annealing time (data not presented).
Furthermore, these structures are generated at an identical efficiency
over a range of DNA concentrations of 1.0-500.0 µg/ml
(results not shown). Thus, in contrast to the second-order kinetics of
formation of a multimolecular d(CGG)n tetraplex (Fry and
Loeb, 1994), the observed compact forms of d(CGG)n oligomers
are generated with zero-order kinetics, suggesting that these
structures are unimolecular. It is notable that when incubated for
>30 min at 4 °C in the presence of 100 mM KCl, the
24-mer d(CGG) sequence displays in a nondenaturing gel a major rapidly
migrating band and a minor electrophoretically retarded band that
presumably represents a multimolecular tetraplex (Fry and Loeb, 1994)
(results not shown). It thus appears that an equilibrium between a
folded unimolecular form of d(CGG)n and a multistranded
tetraplex form of such a sequence is determined by the concentration of
DNA, the presence of salt, and the time of incubation.
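The concentration argument above can be made explicit with a generic rate law. The following is a sketch under stated assumptions, not a derivation taken from the experiments: strands are assumed to combine in a single irreversible step of molecularity n, and the symbols C_0 (total strand concentration), k_n (rate constant), and f (fraction of strands in the product) are introduced here only for illustration.

```latex
% Requires amsmath for \xrightarrow.
% n strands U combine into one complex M (n = 1 is unimolecular folding).
\[
n\,\mathrm{U} \xrightarrow{\;k_n\;} \mathrm{M}, \qquad
\frac{d[\mathrm{M}]}{dt} = k_n\,[\mathrm{U}]^{n}, \qquad
f = \frac{n\,[\mathrm{M}]}{C_0}
\]
% Initial fractional rate, using [U] = C_0 at t = 0.
\[
\left.\frac{df}{dt}\right|_{t=0} = n\,k_n\,C_0^{\,n-1}
\]
% n = 1: the rate is k_1, independent of C_0, and f(t) = 1 - e^{-k_1 t}.
% n > 1: the rate scales as C_0^{n-1}.
```

Under these assumptions, a 500-fold change in DNA concentration (1.0-500.0 µg/ml) would leave the fractional yield of a unimolecular fold unchanged at any given time, whereas for any n > 1 the initial fractional rate would change by a factor of 500^(n-1). The concentration independence described in the text corresponds to the n = 1 case in this notation.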
That d(CGG)n oligomers unfold at elevated temperatures was
substantiated by absorbance thermal denaturation analysis. As shown in Fig. 2, heating of every examined d(CGG)n oligomer
elicits hyperchromicity at 257 mµ, indicating the melting of these
DNA fragments.
Figure 2: Absorbance thermal denaturation of
d(CGG)n oligomers. Oligomer DNA, 2.0-4.0
µg in 1.0 ml of TE buffer, pH 8.0, 100 mM KCl, was boiled
for 3 min and annealed for an additional 30 min at 4 °C. Changes in
the absorbance of the DNA at 257 mµ as a function of temperature
were measured as described under ``Material and Methods.''
The curve depicts the increase in absorbance of each oligomer relative
to its absorption at 20 °C normalized to
1.0.
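The normalization described in the legend (absorbance relative to the 20 °C reading, set to 1.0) can be sketched as follows. This is a minimal illustration and not the procedure used to produce Fig. 2: the function names are hypothetical, the readings are invented, and the midpoint estimate is a simple linear interpolation added here as an assumption about how such a curve might be summarized.

```python
def relative_absorbance(temps_c, absorbances, ref_temp_c=20.0):
    """Scale each reading so that the reading at ref_temp_c equals 1.0."""
    ref = absorbances[temps_c.index(ref_temp_c)]
    return [a / ref for a in absorbances]


def midpoint_temperature(temps_c, rel_abs):
    """Interpolate the temperature at which the curve is halfway between
    its first and last values (a simple stand-in for a melting midpoint)."""
    half = (rel_abs[0] + rel_abs[-1]) / 2.0
    points = list(zip(temps_c, rel_abs))
    for (t0, a0), (t1, a1) in zip(points, points[1:]):
        # Only a rising segment that brackets the halfway value qualifies.
        if a0 < a1 and a0 <= half <= a1:
            return t0 + (half - a0) * (t1 - t0) / (a1 - a0)
    return None  # no rising segment crosses the halfway value


# Hypothetical readings, invented for illustration only.
temps = [20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
a257 = [0.200, 0.200, 0.204, 0.216, 0.228, 0.238, 0.240]

rel = relative_absorbance(temps, a257)
tm = midpoint_temperature(temps, rel)
print([round(r, 2) for r in rel])
print(None if tm is None else round(tm, 1))
```

Dividing by the reference reading makes curves from samples of different DNA content (2.0-4.0 µg here) directly comparable, which is presumably why the legend reports relative rather than raw absorbance.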
To further demonstrate the formation of compact forms
of d(CGG) variants, we irradiated the annealed DNA with UV
light to form covalently cross-linked folded species. A unimolecular
tetrahelical form of the Tetrahymena telomeric sequence Tet,
d(T2G4)4, was shown to become
cross-linked by UV irradiation through the generation of thymidine
dimers (Williamson et al., 1989). To examine the formation of
cross-linked folded d(CGG)
structures, oligomers H3T2 and
H3T3 were constructed that contain d(CGG)
tracts
interspersed by thymidine runs (Table 1). End-labeled H3T2 or
H3T3 oligodeoxynucleotides were UV irradiated, and the formation of
covalently cross-linked compact forms of these oligomers was detected
by discerning species that retain rapid migration in a denaturing gel
(see ``Experimental Procedures''). As shown in Fig. 3,
H3T3 that was UV-irradiated in a buffer solution with no added salt or
in the presence of 200 mM chloride salts of
Na+, K+, Cs+, or Rb+
displays similar amounts of a form that migrates in
a denaturing sequencing gel ahead of the main band of denatured
oligomer. This form is generated at a lower efficiency when H3T3 is
UV-irradiated in the presence of 200 mM NH4Cl (Fig. 3),
and it is undetectable when this oligomer is
irradiated in the presence of 10 mM Mg2+ (not
shown). Comparable results are obtained by UV irradiation of H3T2
oligomer, except that a rapidly migrating cross-linked form of this
sequence is generated efficiently in the presence of
NH4+ ions (data not presented). Hence,
d(CGG)n tracts interspersed by runs of thymidines form
compact structures that can be cross-linked by UV irradiation. Unlike
quadruplex DNA, which requires specific cations for its formation and
stabilization (Sen and Gilbert, 1991; Williamson, 1992), the compact
forms of d(CGG)n appear to be generated independently of the
type or the presence of a cation.
Figure 4:
Modification by DEPC of
d(CGG)n oligomers. A, denaturing
polyacrylamide gel electrophoregram of DEPC-treated d(CGG)n
oligomers. End-labeled 8-, 11-, 15-, 24- and 33-mer
oligodeoxynucleotides, 0.20-0.25 µg of DNA in 40 µl of TE
buffer, pH 8.0, containing 100 mM KCl, were heat-denatured as
described under ``Experimental Procedures.'' The oligomers
were either annealed and treated with DEPC at 4 °C or heated and
exposed to DEPC at 80 °C. The DEPC-treated DNA was cleaved by
piperidine and electrophoresed through a denaturing 12% polyacrylamide,
8.0 M urea gel as detailed under ``Experimental
Procedures.'' Bck, piperidine cleavage of oligomers that
were not exposed to DEPC; solid arrowheads, strong cleavage; open arrowheads, weak cleavage. B, allocation of
DEPC-modified nucleotide tracts within the d(CGG)
sequence. The solid-line frame marks a strongly cleaved
region, and the broken line demarcates the weakly cleaved
nucleotides.
Telomeric
DNA fragments contain clusters of thymidines (Guo et al.,
1993; Murchie and Lilley, 1994) or of thymidine and adenine residues
(Murchie and Lilley, 1994) interspersed between tracts of contiguous
guanines. Folding of these sequences into tetraplex structures involves
the looping-out of the thymidine or thymidine-adenine runs. Adenine
residues were also found to be interspersed within the d(CGG) sequences of the FMR1 gene in normal subjects (Kunst and
Warren, 1994). We inquired, therefore, whether or not the introduction
of thymidine clusters into a d(CGG)
stretch alters its
pattern of folding. Annealed or denatured 15-mer and H1T2 oligomers
were modified by DEPC or KMnO4, which detect unpaired
guanine or thymidine residues, respectively. As seen in Fig. 5,
the region that loops out at 4 °C is located at a similar position
in the two oligomers. However, the two thymidine residues introduced
into H1T2 remain unpaired at 4 °C and merge with the unpaired
guanines to form a larger loop. To broaden this inquiry, the 33-mer,
H3T1, H3T2, and H3T3 oligomers were similarly probed by DEPC and
KMnO4 for regions of unpaired guanine and thymidine
residues, respectively. Patterns of DEPC modification shown in Fig. 6A indicate that the introduction of three runs of one,
two, or three thymidine residues each into the 33-mer DNA does not
alter the position of the primary unpaired loop. Note that the patterns
of KMnO4 modification demonstrate that whereas the
central thymidine cluster merges with the unpaired guanines to form a
larger loop, the remaining two thymidine runs form two additional
unpaired regions that do not include guanine residues (Fig. 6B). However, all three unpaired regions in H3T3
contain thymidine residues with only a negligible participation of
guanines in the looped-out tracts (Fig. 6, A and B and the scheme in Fig. 6C). It appears,
therefore, that the symmetry of the primary fold in d(CGG)
remains unaltered whether or not thymidine residues are
introduced. Yet, as additional tracts of thymidine are added, they may
form auxiliary loops that do not include guanine residues.
Figure 5:
Modification by DEPC and KMnO4 of the 15-mer and H1T2 oligomers. A, denaturing
polyacrylamide gel electrophoregram of DEPC- and
KMnO4-treated oligomers. End-labeled oligomers,
0.20-0.25 µg of DNA in 40 µl of TE buffer, pH 8.0, 100
mM KCl, were heat-denatured, annealed at 4 °C, or heated
at 80 °C and treated by DEPC or KMnO4 at these
respective temperatures as described under ``Experimental
Procedures.'' Piperidine cleavage and 12% polyacrylamide, 8.0 M urea denaturing gel electrophoresis were conducted as
described for Fig. 4. Bck, piperidine cleavage of
oligomers not treated by DEPC or KMnO4; solid
arrowheads, strong cleavage; open arrowheads, weak
cleavage. B, allocation of DEPC- and KMnO4-modified
nucleotide tracts within the 15-mer and H1T2 sequences. The solid-line
frame marks a strongly cleaved region, and the broken line demarcates the weakly cleaved
nucleotides.
Figure 6:
Modification by DEPC and KMnO4 of d(CGG)n
oligomers containing thymidine
tracts of different length. A, denaturing polyacrylamide gel
electrophoregram of DEPC-treated oligomers. End-labeled 33-mer, H3T1,
H3T2, and H1T2 oligomers, 0.20-0.25 µg of DNA in 40 µl of
TE buffer, pH 8.0, 100 mM KCl, were denatured, annealed at 4
°C, or heated at 80 °C and reacted with DEPC at these
respective temperatures as described under ``Experimental
Procedures.'' The oligomers were cleaved by piperidine and
electrophoresed through a 12% polyacrylamide, 8.0 M urea
denaturing gel as described in the legend to Fig. 4. Bck, piperidine cleavage of oligomers not treated by DEPC; solid arrowheads, strong cleavage; open arrowheads,
weak cleavage. B, denaturing polyacrylamide gel
electrophoregram of KMnO4-treated oligomers. End-labeled
33-mer, H3T1, H3T2, and H1T2 oligomers were treated with
KMnO4, cleaved by piperidine, and electrophoresed through a
12% polyacrylamide denaturing gel under conditions detailed under
``Experimental Procedures.'' Bck, piperidine
cleavage of oligomers not treated by KMnO4; solid
arrowheads, strong cleavage; open arrowheads, weak
cleavage. C, allocation of DEPC- and KMnO4-modified
nucleotide tracts within the examined sequences. A solid-line frame marks the strongly cleaved region, and the broken line demarcates the weakly cleaved
nucleotides.
Figure 7:
Methylation-protection analysis of
d(CGG) oligomers. Oligomers, 0.20-0.25
µg of DNA in 40 µl of TBE buffer, pH 8.3, 100 mM KCl
were heat-denatured for 3 min at 90 °C and annealed for 30 min at 4
°C. After the addition of 0.1 volume of 1.0 mg/ml
dG
carrier DNA and 0.1 volume of 0.5% DMS freshly
diluted in TBE buffer, pH 8.3, 100 mM KCl, the samples were
either incubated for 15 min at 4 °C or for 45 s at 90 °C.
Termination of the methylation reaction, piperidine cleavage, and
denaturing gel electrophoresis were conducted as described under
``Experimental Procedures.''
The substantial expansion of a d(CGG) trinucleotide repeat
within the 5′ exon of the FMR1 gene in fragile X syndrome
cells (Fu et al., 1991; Kremer et al., 1991;
Oberlé et al., 1991; Nakahori et al., 1991;
Verkerk et al., 1991; Yu et al., 1991)
and the hypermethylation of this tract and of the FMR1 gene promoter (Bell et al., 1991;
Verkerk et al., 1991; Yu et al., 1991) entail transcriptional
silencing of FMR1 (Pieretti et al., 1991; Hansen et al., 1992; Sutcliffe et al., 1992) and a delay in
the replication of a DNA region that encompasses FMR1 (Hansen et al., 1993). Elucidating the molecular basis for the
expansion of the d(CGG) tract in fragile X cells, and for the
ensuing block of FMR1 transcription and its delayed
replication, requires a better understanding of the alternative
structures of the d(CGG)n sequence. We reported previously
that d(CGG)n oligomers readily form in vitro a
multistrand Hoogsteen-bonded tetrahelical structure that is stabilized
by cytosine C-5 methylation (Fry and Loeb, 1994). This observation led
us to speculate that single-strand tracts of d(CGG)n, which
become exposed during replication or transcription, might also fold
back to form Hoogsteen-bonded unimolecular hairpin or tetraplex
domains. Such structures could act in vivo to precipitate
slippage during replication that could result in d(CGG)n
expansion, and they may also block the progression of the
transcription or replication machineries.
In this communication, we
demonstrate that short stretches of d(CGG) readily loop
back under physiological conditions in vitro to form hairpin
structures. Unimolecular hairpin or tetraplex structures of telomeric
DNA display an increased electrophoretic mobility in a nondenaturing
gel and conversely, their electrophoretic migration slows when they
become thermally denatured (Williamson et al., 1989). In
analogy, we find that whereas heat-denatured d(CGG)
oligomers migrate in a nondenaturing gel at rates inversely
proportional to their length, their relative mobility becomes
anomalously accelerated under annealing conditions (Fig. 1).
Taken as evidence for the folding of d(CGG)
into more
compact forms, this observation is strengthened by the demonstration of
UV hyperchromicity of thermally denaturing d(CGG)
oligomers (Fig. 2). Similar hyperchromicity is displayed by unfolding
telomeric DNA sequences (Scaria et al., 1992). Finally, as was
shown in the past for telomeric DNA (Williamson et al., 1989),
compact forms of d(CGG)
runs interspersed by thymidine
clusters can be covalently cross-linked by UV light (Fig. 3).
That the compact structures of d(CGG)n are unimolecular
is indicated by their faster mobility relative to the unfolded
oligomers (Fig. 1 and Fig. 3) and by the zero-order
kinetics of their formation (see ``Results''). Such
monomolecular compact forms may represent either fold-back hairpin
structures or an intrastrand tetrahelix. Formation and stabilization of
tetraplex DNA require Na+ or K+ cations
(Sen and Gilbert, 1991; Williamson, 1992). As the compact forms of
d(CGG)n are generated and maintained to a comparable extent
with or without various cations (Fig. 1 and Fig. 3), it
is unlikely that they represent quadruplex structures. The modification
of all of the guanines in folded d(CGG)n
oligomers by DMS (Fig. 7) further argues against their arrangement in a tetraplex
conformation. Finally, DEPC modification provides direct evidence for
the folding of d(CGG)n into a hairpin structure rather than
into a tetrahelix. Had the folded structures been unimolecular
quadruplexes, three unpaired loops that serve as hinges should have
been observed. Instead, a single region of unpaired nucleotides is
discerned at a central or epicentral location within several
d(CGG)n oligomers (Fig. 4). Insertion of three
thymidine clusters consisting of one, two, or three residues each into
a d(CGG)n tract does not change the location of this primary
looped-out region (Fig. 5 and Fig. 6). Insertion of three
thymidine clusters into the d(CGG)n sequence introduces two
additional regions of unpaired thymidines, which do not include
contiguous guanines (Fig. 5 and Fig. 6). These oligomers
remain fully accessible to DMS (Fig. 7), and thus it appears
that they also maintain a hairpin structure rather than a
tetraplex conformation. The weight of the evidence provided here
points, therefore, to a similarity between the d(CGG)n
tracts and telomeric DNA sequences, which are capable of forming
back-folded hairpin structures at a physiological range of salt
concentrations, temperatures and pH values (Sundquist and Klug, 1989;
Sen and Gilbert, 1990; Balagurumoorthy and Brahmachari, 1994; Choi and
Choi, 1994).
The gel mobility and chemical probing evidence provided
here, which indicates the facile formation of hairpin structures by
d(CGG) tracts, is corroborated by the works of Gacy et al. (1995)
and Chen et al. (1995), which were published as
this work was completed. Employing NMR analysis, these authors
demonstrated that d(CGG)n, as well as the complementary
d(CCG)n sequence (Chen et al., 1995) and other
trinucleotide sequences that expand in human disease (Gacy et
al., 1995), form hairpin structures. Calculations indicate that
the stability of d(CGG)
hairpin increases linearly with
length and that a threshold stability is attained at lengths conforming
with those of expanded d(CGG)
sequence in individuals
afflicted with the fragile X syndrome (Gacy et al., 1995).
Based on the location of the looped-out sector in d(CGG) oligomers (Fig. 4) or in tracts interspersed with
thymidine clusters (Fig. 5 and Fig. 6), we constructed
models for the structure of d(CGG)n fold-back hairpins. The
scheme presented in Fig. 8A suggests that the 24-mer DNA
folds back asymmetrically to form a hairpin structure that is
stabilized by five guanine-guanine Hoogsteen base pairs. By contrast,
the 33-mer folds back with a nearly perfect symmetry to form a hairpin
structure that is maintained by four guanine-guanine Hoogsteen
base pairs and/or 10 guanine-cytosine Watson-Crick pairs (Fig. 8A). Modification by DEPC of purines in the
folded 33-mer d(CGG) tract cannot reveal whether the cytosine
residue at position 14 from the 5′ end (marked *C in Fig. 8A) is paired. Hence, an alternative
structure of the folded 33-mer can be envisioned in which this residue
remains unpaired and the guanine at position 19 pairs with a guanine at
position 13. The resulting hairpin structure is less symmetric than the
one depicted in Fig. 8A, having a dinucleotide 3′
unpaired tail and being stabilized by eight Watson-Crick and/or five
Hoogsteen hydrogen bonds (model not shown). Since the electrophoregram
shown in Fig. 8A fails to distinguish between a single-
or dinucleotide 3′ overhang, it cannot be determined which alternative
folded form exists. A generally similar model for the hairpin structure
of a d(CGG)n oligomer, based on NMR analysis, was presented
recently by Gacy et al. (1995). A stereo model of d(CGG)n
constructed on the basis of NMR analysis implicates both
G·C and G·G pairs in the stabilization of the stem
of the hairpin (Chen et al., 1995). Hence, it appears that the
schematic representation of the hairpin structures in Fig. 8A is sterically plausible as attested by
independent NMR analysis and model construction.
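The bookkeeping behind such fold-back schemes, in which the two arms flanking a loop are aligned antiparallel and each opposed pair is classed as Watson-Crick, guanine-guanine, or unpaired, can be sketched as follows. This is only an illustrative tally under stated assumptions, not the modeling procedure used for Fig. 8: the function name, the 15-mer written as (CGG)5, and the loop positions are hypothetical, and the count says nothing about energetics or about which bonds actually form.

```python
WATSON_CRICK = {("G", "C"), ("C", "G"), ("A", "T"), ("T", "A")}


def fold_back_pairs(seq, loop_start, loop_len):
    """Tally opposed bases when seq folds back around a loop.

    seq        -- sequence written 5' to 3'
    loop_start -- 0-based index of the first loop residue
    loop_len   -- number of unpaired loop residues
    """
    five_arm = seq[:loop_start]
    three_arm = seq[loop_start + loop_len:]
    counts = {"watson_crick": 0, "gg": 0, "unpaired": 0}
    # Walk outward from the loop: last base of the 5' arm faces the
    # first base of the 3' arm, and so on.
    for a, b in zip(reversed(five_arm), three_arm):
        if (a, b) in WATSON_CRICK:
            counts["watson_crick"] += 1
        elif a == "G" and b == "G":
            counts["gg"] += 1
        else:
            counts["unpaired"] += 1
    # Residues left over on the longer arm form a single-stranded tail.
    counts["overhang"] = abs(len(five_arm) - len(three_arm))
    return counts


# Hypothetical 15-mer; the repeat frame is an assumption made here.
oligo = "CGG" * 5

print(fold_back_pairs(oligo, loop_start=5, loop_len=4))
print(fold_back_pairs(oligo, loop_start=6, loop_len=4))
```

In this toy case, shifting the loop by a single residue changes the stem from mostly Watson-Crick pairs with one guanine-guanine pair to guanine-guanine pairs only. This illustrates why the register of the fold determines whether a hairpin is described as Hoogsteen-bonded or as carrying both bond types.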
Figure 8:
Models for the back-folded hairpin forms
of d(CGG) oligomers. A, schemes of the
hairpin forms of the 24- and 33-mer d(CGG)
fragments. The model is based on the identification by
DEPC-modification of looped regions of unpaired guanines in the
annealed oligomers (Fig. 4). Watson-Crick and non-Watson-Crick
bonds are marked by three dashes and two Xs,
respectively. The absence of a bridging notation indicates lack of base
pairing. The marking of pairing in the scheme does not indicate that
any specific bond is actually formed. The notation *C in the
model of folded 33-mer marks a cytosine residue that may or may not be
paired (see ``Discussion''). B, schemes of
alternative folded forms of the H3T3 oligomer. The models are based on
the identification by DEPC and KMnO
modification of
unpaired guanines and thymidines, respectively, in the annealed
oligomers (Fig. 6).
As indicated in Fig. 8B, the H3T3 oligomer can assume either one of two alternative folded structures. Both possible hairpin forms are maintained by eight guanine-guanine pairs. However, whereas conformation I is linear, the central loop in form II serves as a hinge for a parallel arrangement of two duplex arms (Fig. 8B). Form II is perhaps less likely to exist than conformation I since the former could easily convert into a tetraplex that, as evidence indicates (Fig. 7), is not generated.
The propensity of the d(CGG) sequence to form Hoogsteen-bonded
tetrahelix (Fry and Loeb, 1994) or back-folded hairpin structures (Gacy et al., 1995, and this communication) may have a biological
significance. First, the exposure of long stretches of single-stranded
d(CGG)n during DNA replication and the formation of a stable
hairpin structure by this tract could lead to slippage and trinucleotide
expansion. Second, hairpin formation could block replication,
transcription, and perhaps translation. It was shown recently that the
progression in vitro of AMV reverse transcriptase along
insulin-like growth factor II mRNA is blocked at a guanine-rich region
that forms a quadruplex domain and two stable hairpins (Christiansen et al., 1994). Similarly, the progression in vitro of
a variety of DNA polymerases along a guanine-rich region in the
globin promoter DNA is arrested at an intrastrand tetraplex-forming
sequence (Woodford et al., 1994). Last, expansion of the
trinucleotide (CGG) repeat stalls the progression of the 40 S ribosomal
subunit during the translation of FMR1 mRNA (Feng et
al., 1995). Hence, it may be argued that the exposure of stretches
of expanded (CGG)
tracts in DNA or mRNA and their folding
into hairpin structures might block replication and transcription or
translation, respectively. The formation of (CGG)
secondary
structures in FMR1 DNA or mRNA could thus contribute to the
observed delay of FMR1 replication in fragile X cells (Hansen et al., 1993), to the silencing of its transcription (Pieretti et al., 1991; Hansen et al., 1992; Sutcliffe et
al., 1992), and to the interrupted translation of its mRNA (Feng et al., 1995). Last, evidence indicates that cytosine
methylation by methyltransferase is enhanced by hairpins formed by
human c-Ha-ras telomere-like sequence and d(CCG) trinucleotide
repeat (Smith et al., 1994). Thus, the folding of d(CGG)n
may also be instrumental in the hypermethylation of the
d(CGG)n region in DNA of fragile X syndrome cells.
Folding of d(CGG) and the resulting inhibition of DNA
synthesis and transcription may not occur in cells of normal and
``premutation'' individuals that have a shorter and unstable
d(CGG)
stretch (Gacy et al., 1995). These shorter
tracts may also be stabilized in a rigid conformation by single strand
d(CGG)
binding proteins of the type that was recently
identified in HeLa cells (Richards et al., 1993). The
expansion of the d(CGG)
stretch in individuals afflicted
with the fragile X syndrome could, however, defeat the capacity of such
binding proteins to coat it, and the naked d(CGG)n stretches
may fold back to form stable hairpin domains.