(Received for publication, May 9, 1995; and in revised form, July 12, 1995)
From the
The solution structure of the DNA duplex
d(C1G2C3G4A5D6A7C8G9C10C11)-d(G12C13G14C15T16A17T18G19C20G21G22),
with D indicating a deoxyribose aldehyde abasic site and numbering from
5` to 3`, has been determined by the combined use of NMR and restrained
molecular dynamics. The ³¹P and ³¹P-¹H
correlation data indicate that the backbones of these duplex DNAs are
regular. One- and two-dimensional ¹H NMR data indicate that
the duplexes are right-handed and B-form. Conformational changes due to
the presence of the abasic site extend to the base pairs adjacent to
the lesion site, with the local conformation of the DNA being dependent
on whether the abasic site is in the α or β configuration. When
the sugar of the abasic site is in the β configuration the
deoxyribose is within the helix, whereas when the sugar is in the α
configuration the deoxyribose is out of the helix. The base of residue
A17 in the position opposite the abasic site is
predominantly stacked in the helix in both cases. A water molecule can
apparently form a hydrogen bond bridge between the β abasic site
and A17.
Damage to DNA bases can arise by a number of naturally occurring routes, including oxidative stress, the action of various chemical agents, and radiative processes. Base damage such as the spontaneous deamination of cytosine to form uracil, the oxidation of thymine to thymine glycol, or the oxidation of guanine to 8-oxoguanine can be repaired via abasic sites. The first step in repair in vivo is often the hydrolytic cleavage of the C-N bond between the sugar and the damaged or unusual base to generate an abasic site. The cleavage of the glycosidic bond is catalyzed by DNA glycosylases, which were first identified in 1974 (1), and there are nine known distinct classes of glycosylases (2, 3, 4, 5). Uracil glycosylase is the most familiar of the glycosylases and catalyzes the reaction shown below. Structures of two uracil glycosylases have been recently determined (6, 7) (Fig. Z1).
Figure Z1: Structure 1
The abasic site is not a chemically unique species but is an equilibrium
mixture of four forms (8, 9, 10): the α- (I) and β- (II)
hemiacetals (2-deoxy-D-erythro-pentofuranoses), the
aldehyde (III), and the hydrated aldehyde (IV), as depicted below. The
hemiacetal forms predominate, with about 1% aldehyde being
present (8, 11). The strand cleavage at the 3` side of
the abasic site catalyzed by UV endonuclease V of bacteriophage T4 or
endonuclease III of Escherichia coli occurs via a syn
β-elimination reaction (9, 10, 12).
The hydroxide-catalyzed reaction proceeds via a trans
β-elimination reaction (10) (Fig. Z2).
Figure Z2: Structure 2
A primary source of abasic sites is the spontaneous deamination of
cytosine to uracil (2, 13, 14, 15, 16, 17).
For a typical E. coli cell, it has been estimated that there are
about 40-400 such events per cell division, and in a typical
mammalian cell 4,000-40,000 uracils are formed per cell
division (2, 3, 16, 17). A genetic
test has shown that the rate of C deamination in
double-stranded M13 in vitro is about
10/s (18). The number of abasic sites in a
"typical" human cell is not known, as the rate of formation
and the rate of repair are dependent on many factors. Ames and
co-workers (19) have estimated that there are more than 10,000
damaged sites/typical human cell at any given time. The number and
types of damage to DNA which are required for transformation to occur
are just now being determined (20, 21, 22, 23).
Damage to mitochondrial DNA can also lead to cancer, and the repair
processes of mitochondria are not well understood (24).
During the past few years there has been a growing appreciation of the diversity of DNA repair responses depending on the state of the cell. DNA repair can be highly coupled with transcription (25). Thus, in mature cells damaged DNA sites can accumulate, and the structural and dynamical effects of damaged DNA and the intermediates in DNA repair may be more important in mature than in dividing cells. It is now known that DNA repair can be strand specific, with only the transcribed strand being repaired (2, 3, 24, 25, 26, 27). In mature nerve cells there is apparently little DNA repair occurring, and damaged sites accumulate (28, 29, 30). When a mature nerve cell is infected by herpes, pseudorabies, or another virus, DNA repair is activated by repair enzymes, including uracil glycosylase, coded by the virus (28, 29, 30). It is now recognized that damaged DNA can also have pronounced effects on transcription and chromosome integrity (21, 31) as well as cellular aging (32, 33).
There have been a number of investigations of the effects of unrepaired abasic sites on replication (2, 3, 4, 5, 21, 22, 34, 35, 36, 37, 38). When DNA polymerase copies DNA containing an abasic site there is a strong preference, about 90%, for dA to be put in the daughter strand in the position opposite the abasic site (2, 3, 16, 17, 36, 39). There may be special features associated with dA interacting with the abasic site, with these physical properties contributing to the dA preference which occurs during the rate determining step in the polymerase reaction. On the other hand, if the various bases have similar interactions with the abasic site in the context of duplex DNA, then the preference for dA is most likely that of the replication complex (22). There is no consensus at the present time on the origin of this preference, and there is some evidence that it is due to a kinetic effect (40), suggesting that the conformation of the abasic site may be important.
Abasic sites also affect transcription. The limited results to date suggest that the presence of an abasic site slows down but does not block transcription(21, 22, 26, 31, 41, 42) . The base most commonly placed in the RNA at the position complementary to the abasic site is rA. This is the same preference as found for DNA polymerases and suggests that the dA/rA preference might be at least partially due to the preferential interaction of A with the abasic site.
There have been a number of studies of DNA duplexes containing analogues of the naturally occurring abasic site(43, 44, 45, 46) . These tetrahydrofuran or other analogues differ from the natural abasic site in hydrogen bonding potential, chemical reactivity, and perhaps other properties as well. The hydrogen bonding potential may be important since it is likely that the abasic site interacts with water molecules. A recent investigation indicated that the conformational preferences of at least some of these abasic site analogues are different from those of aldehydic abasic sites(47) .
Studies on abasic sites are also of interest as they relate to studies on DNA degradation by drugs such as bleomycin and neocarzinostatin (15, 48, 49, 50). These drugs lead to products such as deoxyribolactones and other types of non-aldehydic abasic sites. The refined solution state structures of duplex DNAs containing abasic sites may be useful for comparison with those generated by anti-tumor and other drugs.
The abasic site single strand
d(C1G2C3G4A5D6A7C8G9C10C11)
was prepared by treating the single strand
d(C1G2C3G4A5U6A7C8G9C10C11),
which contains a single U residue, with N-uracil glycosylase as
described previously (8, 9, 10, 11, 12).
The extent of reaction was monitored during the reaction by the reverse
phase HPLC method described above which separates free uracil, the DNA
single strand containing U, and the DNA single strand containing the
abasic site. The single strand containing the abasic site was purified
by gel filtration chromatography on a preparative TSK-GEL G2000SW
column and eluted with 25 mM sodium phosphate buffer and 100
mM sodium chloride at pH 7.0 to remove N-uracil
glycosylase and free uracil. The purified single strand was
subsequently dialyzed, lyophilized to dryness, and dissolved in pH 7.0
buffer containing 10 mM sodium phosphate, 100 mM sodium chloride, and 0.05 mM EDTA in 99.96%
²H₂O. Abasic site containing single-stranded DNA
prepared by this approach was found to be pure both by proton and
³¹P NMR and to be free of phosphodiester cleavage products.
Overall yield for the conversion of the single-stranded material to
abasic site containing DNA was about 85%. Due to the degradation of DNA
containing abasic sites at elevated temperatures with subsequent
irreversibility of duplex formation, a precise melting temperature for
the duplex was not determined.
The heteroduplexes were formed by
mixing equimolar quantities of the two strands, based on their extinction
coefficients, and by monitoring the titration of the single strand
containing the residue dU or D with the complementary strand. The duplex was
lyophilized several times in H₂O and dissolved
in 0.4 ml of 99.96% ²H₂O. The purified duplex was
studied at 1-1.5 mM concentration in pH 7.0 buffer
containing 10 mM sodium phosphate, 100 mM sodium
chloride, and 0.05 mM EDTA in 99.96% ²H₂O.
For one-dimensional NMR experiments
involving the exchangeable imino protons, the duplex was lyophilized
and dissolved in 90% H₂O, 10% ²H₂O.
Heteronuclear ³¹P-¹H correlation data were obtained using the
Unityplus 400 via a two-dimensional PHH-heterotocsy
experiment (52). The heteronuclear spinlock time was 100 ms.
The spectral width in the ¹H dimension was 2232 Hz collected
into 1664 points, and in the ³¹P dimension was 500 Hz
collected into 128 complex points. 128 scans for each of the 128 t1
increments were obtained with an acquisition
time of 0.4 s. The data were zero-filled to 2K data points in the F2
and 1K data points in the F1 dimension. The data were
weighted with a Gaussian apodization in each dimension.
NOESY experiments in H₂O were carried out at mixing
times of 100 and 200 ms with a 1.6 s equilibration delay and
presaturation of the water resonance, using the Bruker 500 MHz spectrometer with a
spectral width of 5000 Hz in each dimension. At each mixing time, 512 t1
increments were acquired with 64 scans for each
increment. The F1 dimension was zero filled to 2K, and the
data were weighted with a Gaussian apodization in each dimension prior
to 2K × 2K Fourier transformation. These data were used for
quantification of the NOESY cross-peaks.
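The processing steps named above (Gaussian apodization, zero filling, Fourier transformation) can be sketched for a single FID as follows. This is a minimal NumPy illustration under stated assumptions, not the processing software used in this work: the function name `process_fid` and the line-width parameter `gauss_fwhm_hz` are hypothetical, and the Gaussian width actually applied to the data is not given in the text.

```python
import numpy as np


def process_fid(fid, sw_hz, n_out, gauss_fwhm_hz):
    """Gaussian-apodize, zero-fill to n_out points, and Fourier transform one FID.

    fid           : 1D complex array sampled at intervals of 1 / sw_hz
    sw_hz         : spectral width in Hz
    n_out         : number of points after zero filling (e.g. 2048 for "2K")
    gauss_fwhm_hz : assumed full width at half maximum of the Gaussian
                    line shape contributed by the window (hypothetical value)
    """
    fid = np.asarray(fid, dtype=complex)
    n = fid.shape[0]
    if n > n_out:
        raise ValueError("n_out must be at least the number of acquired points")

    # Sampling times of the acquired points, in seconds.
    t = np.arange(n) / sw_hz

    # A time-domain Gaussian exp(-t^2 / (2 sigma_t^2)) transforms to a
    # frequency-domain Gaussian with FWHM = sqrt(2 ln 2) / (pi sigma_t).
    sigma_t = np.sqrt(2.0 * np.log(2.0)) / (np.pi * gauss_fwhm_hz)
    window = np.exp(-0.5 * (t / sigma_t) ** 2)

    # Zero filling: the apodized FID is padded with zeros up to n_out points.
    padded = np.zeros(n_out, dtype=complex)
    padded[:n] = fid * window

    # Fourier transform, with the zero-frequency bin shifted to the center.
    spectrum = np.fft.fftshift(np.fft.fft(padded))
    freqs = np.fft.fftshift(np.fft.fftfreq(n_out, d=1.0 / sw_hz))
    return freqs, spectrum
```

With a 5000 Hz spectral width and zero filling to 2K points, the spacing between frequency points is 5000/2048, or about 2.4 Hz. A two-dimensional data set would apply the same steps along each time dimension in turn.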
A NOESY spectrum was
obtained with a 300-ms mixing time using the Varian Unityplus 400
spectrometer at 20 °C in H₂O with
³¹P decoupling during the evolution time. The data were
collected into 2K complex points in t2 and 1K
complex points in t1 with a spectral width of 4000
Hz in each dimension, and 64 transients were acquired for each of 512
increments of t1. A Gaussian weighting was used in
both dimensions, and the spectra were zero-filled to 4096 by 4096 real
points. The heteronuclear J couplings were determined by
comparison of the proton linewidth along F1 and F2.
NOESY experiments in 90% H₂O, 10% ²H₂O
were carried out on the Bruker 600 MHz
spectrometer with jump and return pulses replacing the last 90°
pulse in the standard NOESY sequence. The delay between the jump and
return pulses was 55 ms. 650 increments of t1 with
96 scans/increment were used, and 4K data points in t2
were acquired. The spectral width in each dimension was 12,000
Hz. The data were processed to minimize the intensity of the water
signal of each FID prior to the first Fourier transformation in the F2
dimension. Gaussian apodization was used in both dimensions, the
data were zero-filled to 4K data points in t1, and a
second order polynomial base-line correction was used for the F
dimension.
PECOSY spectra were collected at 400 MHz, and the
data were collected into 2K complex points in t2 and 512 complex points in t1. The spectral
width in F
was 3200 Hz, and in F
2600 Hz. 256
transients were acquired for each increment in t1.
The data were linear predicted to 360 points before zero filling,
Gaussian weighting was used in both dimensions, and the spectra were
zero-filled to 4K × 4K real points.
One-dimensional ³¹P NMR spectra were obtained at 161.9 MHz with proton
decoupling. The spectral width was 2687 Hz with 4288 complex points and
128 scans. A Lorentzian apodization of 3 Hz was applied prior to
Fourier transformation. One-dimensional spectra of the imino protons
were obtained at 400 MHz with a spectral width of 10,000 Hz and 8K
complex points, using a jump and return pulse for water suppression. The
standard solvent subtraction was used prior to Fourier transformation.
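The acquisition parameters quoted in this section are related by simple arithmetic, which the following sketch makes explicit. It is an illustration only: the helper names are hypothetical, the ³¹P/¹H frequency ratio is an approximate literature value, and the acquisition-time formula assumes that each complex point is sampled at an interval of one over the spectral width.

```python
# Approximate ratio of the 31P and 1H resonance frequencies at the same field.
FREQ_RATIO_31P_TO_1H = 0.40481


def larmor_31p_mhz(proton_mhz):
    """31P observe frequency (MHz) on a spectrometer of given 1H frequency."""
    return proton_mhz * FREQ_RATIO_31P_TO_1H


def acquisition_time_s(n_complex_points, spectral_width_hz):
    """Acquisition time, assuming one complex point per 1/SW seconds."""
    return n_complex_points / spectral_width_hz


def digital_resolution_hz(spectral_width_hz, n_points):
    """Spacing between adjacent frequency points after transformation."""
    return spectral_width_hz / n_points


if __name__ == "__main__":
    # A nominal 400 MHz proton frequency corresponds to roughly 161.9 MHz for 31P.
    print(round(larmor_31p_mhz(400.0), 1))
    # One-dimensional 31P spectrum: 4288 complex points over 2687 Hz.
    print(round(acquisition_time_s(4288, 2687.0), 2))
    # NOESY spectral width of 5000 Hz zero filled to 2K points.
    print(round(digital_resolution_hz(5000.0, 2048), 2))
```

Under these assumptions the one-dimensional ³¹P acquisition lasts about 1.6 s per scan, and the 2K-point NOESY dimension has a point spacing of about 2.4 Hz.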
The optimum relative weighting of the NOE constraint, 40 kcal/mol, was found to be larger for the abasic site DNA than for undamaged DNAs, 20 kcal/mol. In addition, the optimum weighting of the biharmonic potential for the heavy atoms involved in experimentally observed base pair hydrogen bonds was found to be 30 rather than the 20 kcal/mol found for undamaged DNA. The other force constants were the same as used for undamaged DNA (53). The X-PLOR force field has been optimized for normal duplex DNA. The presence of the abasic site eliminates both stacking and steric interactions between the bases that are found in undamaged DNA and are built into the X-PLOR force field. The increase in the relative weighting of the experimental constraints compensates for the differences of the abasic site DNA.
The starting structures were
generated from a canonical B-DNA by replacing the base with a hydroxyl
group at the C1` α- or β-position of the abasic site. This
hydroxyl group has a partial negative charge of -0.325. The
α and β structures were refined separately.
The energy of each of the starting structures was minimized in 100 steps of Powell's conjugate gradient minimization using X-PLOR. The relaxation matrix refinements were carried out in vacuum at 300 K. The structures were further minimized using the force field with all restraints for 100 steps of minimization and then subjected to a 100 ps relaxation matrix simulation followed by 200 steps of conjugate gradient energy minimization. Each trajectory was run for a total of 100 ps, and the structures appeared to reach equilibrium after about 20 ps. There were no significant differences between the structures after 20 ps and those after 100 ps. The structures at 20 ps were used for generating the back calculated spectra. Coordinates of both structures are available at the ftp site.
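One common way to quantify whether two snapshots of a trajectory differ is the coordinate root-mean-square deviation after optimal superposition. The text does not state which measure was used to compare the 20 ps and 100 ps structures, so the following is only an illustrative sketch of a Kabsch superposition; the function name `kabsch_rmsd` and its use for this comparison are assumptions.

```python
import numpy as np


def kabsch_rmsd(p, q):
    """RMSD between two (N, 3) coordinate sets after optimal superposition.

    Both sets are centered on their centroids, then the proper rotation
    that best maps p onto q is obtained from a singular value decomposition
    of the 3 x 3 covariance matrix (Kabsch algorithm).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.shape != q.shape or p.ndim != 2 or p.shape[1] != 3:
        raise ValueError("p and q must both have shape (N, 3)")

    # Remove the translational difference.
    p0 = p - p.mean(axis=0)
    q0 = q - q.mean(axis=0)

    # Covariance matrix and its SVD.
    h = p0.T @ q0
    u, _, vt = np.linalg.svd(h)

    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0.0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    # Rotate p onto q and compute the residual deviation.
    diff = (rotation @ p0.T).T - q0
    return float(np.sqrt((diff ** 2).sum() / p.shape[0]))
```

Applied to matched heavy-atom coordinates from two time points of the same trajectory, a small value would be consistent with the statement that the structures do not change significantly after equilibration.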
The NOE cross-peak volumes for each
of these structures were back-calculated separately using an overall
correlation time of 5 ns, a leakage rate of 0.33 s⁻¹, and a distance cutoff of 5.5 Å. The spectra calculated for
each of the structures were added together for comparison with the
experimental results.
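The role of the three parameters quoted above (correlation time, leakage rate, distance cutoff) can be illustrated with a generic full relaxation matrix calculation. This is a sketch under stated assumptions, not the program used in this work: it assumes rigid isotropic tumbling with a single correlation time, treats every proton as an isolated point spin, and uses hypothetical function names. The spectrometer frequency is an input.

```python
import numpy as np

MU0_OVER_4PI = 1.0e-7        # T m / A
GAMMA_H = 2.6752218744e8     # proton gyromagnetic ratio, rad s^-1 T^-1
HBAR = 1.054571817e-34       # J s


def spectral_density(omega, tau_c):
    """Lorentzian spectral density for isotropic tumbling."""
    return tau_c / (1.0 + (omega * tau_c) ** 2)


def relaxation_matrix(coords, larmor_hz, tau_c=5e-9, leak=0.33, cutoff=5.5):
    """Proton relaxation rate matrix (s^-1) from coordinates in Angstrom."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    omega = 2.0 * np.pi * larmor_hz

    # Dipolar prefactor, converted from m^6 s^-2 to Angstrom^6 s^-2.
    k = 0.1 * (MU0_OVER_4PI * GAMMA_H ** 2 * HBAR) ** 2 * 1.0e60
    j0 = spectral_density(0.0, tau_c)
    j1 = spectral_density(omega, tau_c)
    j2 = spectral_density(2.0 * omega, tau_c)

    rate = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            r = np.linalg.norm(coords[i] - coords[j])
            if r > cutoff:
                continue  # pairs beyond the cutoff are ignored
            sigma = k * (6.0 * j2 - j0) / r ** 6             # cross-relaxation
            rho = k * (j0 + 3.0 * j1 + 6.0 * j2) / r ** 6    # auto-relaxation
            rate[i, j] = rate[j, i] = sigma
            rate[i, i] += rho
            rate[j, j] += rho

    # Leakage accounts for relaxation pathways not modeled explicitly.
    rate[np.diag_indices(n)] += leak
    return rate


def noe_volumes(rate, mixing_time):
    """Peak volume matrix exp(-R * tm), via eigendecomposition of symmetric R."""
    w, v = np.linalg.eigh(rate)
    return v @ np.diag(np.exp(-w * mixing_time)) @ v.T
```

In this picture the off-diagonal elements of the volume matrix are the predicted cross-peak volumes for a given mixing time. Volume matrices computed separately for two structures can then be added element by element before comparison with experiment, which mirrors the summation of the two calculated spectra described in the text.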
An anonymous ftp site has been set up that
contains the assignments, NOE and dihedral constraints, and the
structures of the α and β forms of the DNA. This site can be
accessed either at http://prophet.chem.wesleyan.edu/Chemistry.html
using a web browser and following the links
papers-bolton-ABASICSITEJBC1995 or at
ftp://prophet.chem.wesleyan.edu, with the material in
the /pub/papers/bolton/ABASICSITEJBC1995 directory.
The imino proton spectrum of the duplex is shown in Fig. 1. The spectrum indicates that the resonances in the AT region are broadened due to exchange. The integration of the AT region corresponds to less than two base pairs. The GC iminos are similar to those of the parent duplex, with the broad resonance at 12.5 ppm being from a terminal base pair.
Figure 1:
The top spectrum is the 161 MHz proton decoupled ³¹P spectrum of
d(C1G2C3G4A5D6A7C8G9C10C11)-d(G12C13G14C15T16A17T18G19C20G21G22),
with D indicating a deoxyribose aldehyde abasic site and numbering from
5` to 3`. The middle spectrum is the 400 MHz proton spectrum
of the imino proton region of this DNA. The bottom spectrum is
the 400 MHz proton spectrum of the non-exchangeable
protons.
The proton-decoupled ³¹P spectrum of the duplex is also shown in Fig. 1.
The resonances all appear in the region associated with B-form DNA, and
there is relatively low resolution in the spectrum. Two resonances
appear slightly downfield from the others. The most downfield resonance
is from the phosphate between residues 16 and 17. Analogous downfield
resonances have been previously observed from DNA duplexes containing
abasic sites.
The one-dimensional proton spectrum of the non-exchangeable protons of the DNA is shown in Fig. 1. This spectrum was used to demonstrate that a one to one complex was formed, by comparison with the spectra of the two individual strands that were combined to form the duplex.
The assignments of the protons of the
DNA were made using the standard connectivities of B-form DNA. Fig. 2 contains the NOE connectivities of the aromatic and H1`
protons as well as the connectivities of the aromatic and H2`, H2``, and
methyl protons. Most of the interresidue connections are indicated in Fig. 2. In the β configuration of the abasic site, the H1`
and H2`` protons are spatially close, and in the α configuration the
H1` and H2` protons are spatially close. These NOEs were observed and
allowed the assignment of the H1`, H2`, and H2`` resonances of the
α and β forms of the abasic site. The assignments of the proton
resonances and the assignments of the analogous duplex which does not
contain an abasic site are available at the ftp site. NOEs involving
the abasic site include those between A7H8 and D6H1` (
), A5H2 and
D6H1`(
), A7H8 and D6H2`, D6H2"(
) as indicated in Fig. 2. The NMR results are consistent with our prior
observations (8, 11) of essentially equal amounts of
the α and β forms.
Figure 2:
On the left is shown the two-dimensional 600 MHz NOESY spectrum of the
DNA obtained with a 100 ms mixing time, with the sequential
assignments of the d(C1G2C3G4A5D6A7C8G9C10C11)
strand indicated in the top spectrum and the assignments of
the d(G12C13G14C15T16A17T18G19C20G21G22)
strand indicated in the bottom spectrum. This spectral region
contains the aromatic to H5 and H1` cross-peaks. On the right is shown
the two-dimensional 600 MHz NOESY spectrum of the DNA obtained with a
100 ms mixing time, with the sequential assignments of the
d(C1G2C3G4A5D6A7C8G9C10C11)
strand indicated in the top spectrum and the assignments of
the d(G12C13G14C15T16A17T18G19C20G21G22)
strand indicated in the bottom spectrum. This spectral region
contains the aromatic to H2`, H2``, and methyl
cross-peaks.
The NOEs involving the imino protons are shown in Fig. 3. A strong NOE is observed between A5H2 and the imino proton of T18. This indicates that A5-T18 is a good base pair and that it is A7-T16 which is disrupted by the presence of the abasic site. The NOE connectivities of the imino protons of the G residues involved in base pairs are those of B-form DNA.
Figure 3:
The two-dimensional 600 MHz NOESY spectrum
of the DNA obtained with the sample in 90% H₂O, 10% ²H₂O.
The qualitative
analysis of the NOE and other results indicates that the presence of
the abasic site at position 6 disrupts the A7-T16 base pair but not the
A5-T18 base pair. The NOEs indicate that the structure of the DNA
duplex in the region near the abasic site depends on whether the abasic
site is in the α or β configuration, since distinctly different
NOEs are observed for the two forms. The results also indicate that the
structure of the duplex DNA is quite similar to that of the analogous
duplex at the base pairs more than one removed from the abasic site.
The NMR results were used to obtain refined structures for duplex
DNA containing an abasic site by separately refining the structures and
comparing the results predicted by each structure with the experimental
results. The experimental results are actually the combination of the
results from the α and β forms, so the results predicted by each
form were combined and compared with the experimental data.
Fig. 4 contains the NOE cross-peaks in the aromatic to H1`
region for the refined α structure, the refined β structure,
and the combination of the α and β structures; the
experimental spectrum is shown for comparison. The differences between
the predicted spectra of the α and β forms are highlighted in
this figure. In this spectral region two of the main differences
between the α and β forms are the A7H2-D6H1` and the A5H8-G4H1`
cross-peaks which are only found in the predicted spectrum of the
form.
Figure 4:
The back calculated NOESY spectra for the
NMR refined structures of the α and β forms of the DNA are
shown as well as the sum of these two back calculated spectra. The
experimental spectrum is also shown for comparison. This spectral
region contains the aromatic to H5 and H1`
cross-peaks.
Fig. 5 contains the cross-peaks of the aromatic to
H2`, H2``, and methyl region for the refined α structure, the refined
β structure, and the combination of the α and β structures;
the experimental spectrum is shown for comparison. The differences
between the predicted spectra of the α and β forms are
highlighted in this figure. In this spectral region, the A7H8-H2` and
A7H8-H2` are only predicted by the
form structure, and the
A17H8-T18 methyl cross-peak is only predicted for the
form
structure. The two forms also differ in their predictions of some of
the NOEs of the terminal residues.
Figure 5:
The back calculated NOESY spectra for the
NMR refined structures of the α and β forms of the DNA are
shown as well as the sum of these two back calculated spectra. The
experimental spectrum is also shown for comparison. This spectral
region contains the aromatic to H2`, H2``, and methyl
cross-peaks.
These results show that the NOE
spectrum back calculated from either the α or the β form alone does
not offer good agreement with the NOEs observed from the protons of the
abasic site, while the sum of the two back calculated spectra accounts
for the NOEs of the abasic site. This is in accord with both forms being
present in solution.
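The comparison described above can be made quantitative with a simple residual between calculated and observed cross-peak volumes. The text does not report such a statistic, so the following is only an illustrative sketch: the R-factor definition, the function names, the equal weighting of the two forms, and the three-peak toy volumes are all assumptions chosen to show the logic, not data from this work.

```python
import numpy as np


def noe_r_factor(calc, obs):
    """Sum of absolute volume differences divided by the summed observed volumes."""
    calc = np.asarray(calc, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.abs(calc - obs).sum() / np.abs(obs).sum())


def combine(vol_a, vol_b, frac_a=0.5):
    """Population-weighted combination of two back-calculated volume sets."""
    vol_a = np.asarray(vol_a, dtype=float)
    vol_b = np.asarray(vol_b, dtype=float)
    return frac_a * vol_a + (1.0 - frac_a) * vol_b


if __name__ == "__main__":
    # Hypothetical volumes for three cross-peaks (not measured values):
    # peak 2 is predicted only by form A, peak 3 only by form B.
    form_a = [1.0, 1.0, 0.0]
    form_b = [1.0, 0.0, 1.0]
    observed = [1.0, 0.5, 0.5]

    print(noe_r_factor(form_a, observed))
    print(noe_r_factor(form_b, observed))
    print(noe_r_factor(combine(form_a, form_b), observed))
```

In this toy case each single-form prediction misses the peak that is unique to the other form, while the equally weighted combination reproduces all three volumes, which is the pattern the text describes for the abasic site NOEs.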
The two refined structures are shown in Fig. 6 with
the minor groove prominent and Fig. 7 with the major groove
prominent. The structure of the β form is much more
"B-like" in the region near the abasic site than is the
α form. In both cases A5-T18 is a good base pair while A7-T16 is
not. In the α form the abasic site sugar is almost extrahelical and
residues 5 and 7 are relatively close together. In the β form the
spacing between the aromatic rings of residues 5 and 7 is comparable to
that of B-form DNA. Thus, it appears that when the abasic site is in the
β form the structural distortion induced by the abasic site is
minimal, whereas in the α form there is a considerable distortion
induced by the presence of the abasic site. Fig. 8 shows the
overlay of structures consistent with the NMR data for the
α and β forms.
Figure 6:
The structures of the α and β forms of the DNA are shown with the minor groove
prominent.
Figure 7:
The structures of the α and β forms of the DNA are shown with the major groove
prominent.
Figure 8:
The structures obtained during the
80-100 ps time intervals of the trajectories used for determining
the structures of the α and β forms of the duplex DNA
containing the abasic site. The structures obtained at points equally
spaced in time are overlaid for each of the two
trajectories.
A rationale for this difference can be ascribed to the
hydrogen bonding potential of the abasic site. In the β form D6 and
A17 can be bridged by an intervening water molecule which hydrogen
bonds to both, as depicted in Fig. 9. The position of the water
molecule was found by minimizing the energy of the water molecule while
keeping the DNA fixed. The water molecule can form a bifurcated
hydrogen bond to the OH and ribose ring oxygen of D6 as well as a
regular hydrogen bond to the N1 of A17. The analogous
hydrogen bonding is less likely to occur for the α form since
positioning the 1`OH of D6 would be accompanied by unfavorable steric
interactions of the abasic site sugar with the ring of A7. The abasic
site can act as the donor in hydrogen bonds to the ring
nitrogen of dA and dC residues and as the acceptor in hydrogen bonds to
the imino protons of dG and dT residues. The hydrogen bonding
interactions of the abasic site are likely to be important in
determining the differences in the properties of DNA duplexes with
different bases opposite the abasic site. The interaction of the
α and β abasic sites with DNA polymerase and other enzymes may be
different due to their different hydrogen bonding capabilities.
Additional studies to determine the structures of DNA duplexes with
residues other than dA opposite the abasic site are underway.
Figure 9:
The structure of the β form of the DNA
is shown with a water molecule forming a hydrogen bond bridge between
the abasic site and the A17 on the opposite strand. The water molecule
is shown darker than the DNA structure, and the hydrogen bonds are
indicated by the dashed lines. This stereo depiction has the
minor groove of the DNA prominent.