(Received for publication, November 3, 1995; and in revised form, January 31, 1996)
A set of phosphorothioate-containing oligonucleotides based on
pGACGATATCGTC, a self-complementary dodecamer that contains the EcoRV recognition sequence (GATATC), has been prepared. The
phosphorothioate group has been individually introduced at the central
nine phosphate positions and the two diastereomers produced at each
site separated and purified. The Km and Vmax
values found for each of these modified DNA
molecules with the EcoRV restriction endonuclease have been
determined and compared with those seen for the unmodified
all-phosphate-containing dodecamer. This has enabled an evaluation of
the roles that both of the non-esterified oxygen atoms in the
individual phosphates play in DNA binding and hydrolysis by the
endonuclease. The results have also been compared with crystal
structures of the EcoRV endonuclease, complexed with an
oligodeoxynucleotide, to allow further definition of phosphate group
function during substrate binding and turnover. For further study, see
the related article ``Probing the Indirect Readout of the
Restriction Enzyme EcoRV: Mutational Analysis of Contacts to
the DNA Backbone'' (Wenz, A., Jeltsch, A., and Pingoud, A. (1996) J. Biol. Chem. 271, 5565-5573).
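The design of the oligonucleotide set can be illustrated with a short Python sketch. It checks that the dodecamer is self-complementary, locates the GATATC site, and enumerates nine singly substituted sequences. The function names are illustrative and do not come from the paper. Reading "central nine" as every internucleotide linkage except the two terminal ones is an assumption. The "s" notation follows pGACGATATCsGTC.

```python
# Illustrative sketch only; names and the "central nine" interpretation are assumptions.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
DODECAMER = "GACGATATCGTC"
ECORV_SITE = "GATATC"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_self_complementary(seq: str) -> bool:
    """True when the sequence can pair with a second copy of itself."""
    return seq == reverse_complement(seq)


def phosphorothioate_variants(seq: str, skip_terminal: int = 1) -> list:
    """Place one 's' at each internucleotide linkage in turn.

    Linkage i lies between bases i and i+1 (1-indexed). `skip_terminal`
    linkages are left unmodified at each end (assumed to be 1 here).
    """
    n_links = len(seq) - 1
    first = 1 + skip_terminal
    last = n_links - skip_terminal
    return [seq[:i] + "s" + seq[i:] for i in range(first, last + 1)]


if __name__ == "__main__":
    print(is_self_complementary(DODECAMER))
    print(DODECAMER.find(ECORV_SITE))
    for variant in phosphorothioate_variants(DODECAMER):
        print(variant)
```

Each printed sequence stands for two diastereomers (Rp and Sp), so the nine positions correspond to eighteen purified oligonucleotides.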
The specific recognition of DNA sequences by proteins involves
the formation of a very precise and intimate interface between the two
macromolecules. Direct contacts are observed between the protein and
both the bases and phosphates, and in addition complicated networks
bind the interacting elements
together (1, 2, 3, 4). Much of the
emphasis has been on the study of contacts between the protein and the
DNA bases. This mode of recognition, known as direct readout, is thought to contribute much
of the selectivity (5, 6, 7). The phosphate
backbone has received less attention. However, as well as a passive
role in endowing a basal DNA binding affinity to a DNA-binding protein,
it is clear that the phosphates normally play additional, more active,
roles in the generation of specificity. Work with the trp repressor has led to the concept of indirect readout (8).
Most simply, this takes place when a protein binds specifically to the
phosphates in a DNA sequence that has an unusual conformation, with an
altered phosphate backbone, that differs from ``ideal''
B-DNA. In addition, almost all DNA sequences are distorted on binding
to proteins and this is often required to match up protein-phosphate
contacts. Thus phosphates can play a role in selectivity following any
DNA distortion that takes place on binding. This coupling of the
recognition of specific bases, DNA distortion, and specific phosphate
binding occurs with the restriction endonuclease EcoRI (9, 10). The importance of DNA
phosphate-protein contacts provides a compelling reason for their
study. One of the commonest methods for evaluating protein-substrate
interactions is to make alterations to the partners involved and
observe the consequences. This has been widely carried out for protein
amino acids using site-directed mutagenesis and for DNA with modified
bases (11, 12, 13, 14, 15, 16).
For DNA phosphates the most useful analogues are the phosphorothioates (17, 18, 19, 20, 21),
illustrated in Fig. 1, which exist as a pair of diastereomers.
Phosphorothioates probably represent the most conservative change that
can be made to a phosphate. The sulfur atom is slightly bigger than the
oxygen it replaces, and the P-S bond is a little longer than the
P-O bond (22). Phosphorothioates are also slightly more acidic than
phosphates (23) and may be differently solvated. With
phosphates the negative charge is evenly distributed over the two
non-bridging oxygens, but in the case of phosphorothioates, current
evidence favors negative charge localization on
sulfur (22, 24, 25). Importantly,
Mg²⁺, an essential cofactor for the EcoRV
endonuclease, strongly favors co-ordination to oxygen atoms in
phosphorothioates (26, 27). Despite these differences,
the single known structure of a phosphorothioate-containing
oligonucleotide shows minimal structural perturbations as compared to
the all-phosphate parent (28). Oligonucleotide
phosphorothioates have been used to determine the stereochemical course
of the reactions catalyzed by EcoRI (29) and EcoRV (30) and to probe details of protein-DNA
interactions (31, 32, 33, 34).
Figure 1: The structure of oligodeoxynucleotide phosphorothioates, which exist as a pair of diastereomers.
The EcoRV endonuclease is a well characterized enzyme (35, 36) for which crystallographic data are
available (37, 38, 39). A large body of
kinetic and binding data has revealed that, in the absence of
Mg²⁺, the endonuclease binds to all DNA sequences with
equal affinity (40, 41, 42, 43).
However, specific contacts, made only to the bases in the cognate
GATATC sequence, cause severe distortion of the DNA, and this
concomitantly creates a high affinity Mg²⁺ binding
site, allowing hydrolysis to take place. The observation that specific
binding to GATATC sequences occurs in the presence of the essential
metal ion, and that the DNA in these complexes is bent to the same
extent as is observed in the crystal structure, lends support to the
above model (44, 45, 46). Recent data (39, 47, 48) have suggested that the
endonuclease might use two metal ions for catalysis.
The
endonuclease DNA contacts seen with cognate but not non-cognate
sequences provide the energy required for the energetically unfavorable
distortion of the bound DNA. This distortion is essential for
hydrolysis, and so it is these interactions that are ultimately
responsible for the discrimination of 10
(40) shown by the enzyme. The interactions, between the
protein and the GATATC bases, have been probed using alternative
sequences (40, 42, 49), base
analogues (13, 14, 15, 50, 51, 52, 53),
and site-directed mutagenesis (43, 54), and many have
been shown to be essential for efficient catalysis. The endonuclease
also makes extensive contacts to the phosphate backbone (38, 39). However, there has been little systematic
investigation into the role that these interactions play in DNA
recognition. Work with M13 DNA containing Rp-phosphorothioates
showed that GATsATC sequences
were very refractory to hydrolysis and that substitution elsewhere tended to
reduce cutting rates (32). Very recently, the phosphorothioate
oligonucleotides used in this study have been used to show the presence
of a metal ion binding site distinct from the catalytic
center (55). In this publication, we examine the effects of
both isomers of phosphorothioates within and immediately flanking the EcoRV restriction site on endonuclease-catalyzed hydrolysis
and relate the results found to the available crystal structures.
Figure 2: The reverse phase HPLC trace of crude
pGACGATATCsGTC (blue line). The two diastereomers are clearly
visible as the two largest peaks between 10 and 12 min. The Rp isomer elutes before the Sp.
The traces of the purified diastereomers following HPLC separation are
also shown. Green line, Rp isomer; red line, Sp isomer.
Many examples (17, 19, 30, 61) have shown that Rp-oligonucleotide phosphorothioates elute before
the Sp on reverse phase columns using
triethylammonium acetate buffers. This was found here in every case,
and the faster elution of Rp-isomers was also
maintained when the buffer was altered to morpholine acetate. The
configuration of fully purified oligonucleotide phosphorothioates can
be unequivocally assigned using digestion with enzymes of known
stereospecificities. Snake venom phosphodiesterase cuts oligonucleotide
phosphorothioates having the Rp configuration but
does not digest those of Sp configuration (62). These
stereospecificities are reversed with nuclease P1 (63).
Therefore the phosphorothioates were separately digested with nuclease
P1 and snake venom phosphodiesterase and the resulting deoxynucleoside
products were analyzed by reverse phase HPLC (17, 30).
When ``early'' isomers were treated with the snake venom
enzyme, a peak corresponding to dNMPs was observed by HPLC, whereas
reaction with P1 gave a NsN unhydrolyzed dinucleotide peak (not shown).
``Late'' isomers gave NsN with venom phosphodiesterase and
dNMPs with nuclease P1 (not shown). This confirms that the early
eluting isomers have the Rp configuration and the
late the Sp.
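The assignment logic described above can be written as a small decision rule. The following Python sketch is illustrative only: the function name and the string labels for the HPLC outcomes ("dNMPs" for a hydrolyzed linkage, "NsN" for a surviving dinucleotide) are assumptions chosen to mirror the text.

```python
# Illustrative decision rule; the labels "dNMPs" and "NsN" mirror the text,
# while the function name and return values are hypothetical.
def assign_configuration(venom_product: str, p1_product: str) -> str:
    """Assign Rp or Sp from the products of two stereospecific digests.

    venom_product: HPLC outcome after snake venom phosphodiesterase digestion.
    p1_product:    HPLC outcome after nuclease P1 digestion.
    """
    if venom_product == "dNMPs" and p1_product == "NsN":
        # Cut by venom phosphodiesterase, resistant to nuclease P1.
        return "Rp"
    if venom_product == "NsN" and p1_product == "dNMPs":
        # Resistant to venom phosphodiesterase, cut by nuclease P1.
        return "Sp"
    # Any other combination does not fit the expected stereospecificities.
    return "ambiguous"


if __name__ == "__main__":
    print(assign_configuration("dNMPs", "NsN"))  # early-eluting isomer
    print(assign_configuration("NsN", "dNMPs"))  # late-eluting isomer
```

Returning "ambiguous" for mixed outcomes reflects the requirement that the two digests agree before a configuration is assigned.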
Figure 3: Determination of the Km and Vmax
values for both the Rp and the Sp
diastereomers of pGACGATATCsGTC. All the other phosphorothioates
gave results of similar quality.
Figure 4: A summary of the specificity constants (Vmax/Km) for
all the phosphorothioate-containing oligonucleotides. These are
referenced to the all-phosphate-containing dodecamer control, for which
this value is set to 100%.
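The fitting procedure behind these kinetic constants is not described in this section. As a minimal sketch, assuming simple Michaelis-Menten behavior, Km and Vmax can be estimated from initial rates with a Hanes-Woolf linearization ([S]/v plotted against [S]), and the specificity constant can then be expressed as a percentage of the control. The data below are synthetic and noise-free, the units are arbitrary, and every name is hypothetical. The choice of linearization is an assumption and not necessarily the authors' method.

```python
# Sketch under stated assumptions: synthetic data, Hanes-Woolf fit, arbitrary units.
def michaelis_menten(substrate: float, km: float, vmax: float) -> float:
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * substrate / (km + substrate)


def fit_hanes_woolf(substrate, rates):
    """Estimate (Km, Vmax) from the line [S]/v = [S]/Vmax + Km/Vmax."""
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals 1 / Vmax
    intercept = mean_y - slope * mean_x    # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


def relative_specificity(km, vmax, km_control, vmax_control):
    """Vmax/Km as a percentage of the control value (control = 100%)."""
    return 100.0 * (vmax / km) / (vmax_control / km_control)


if __name__ == "__main__":
    # Synthetic example: rates generated from known constants, then refitted.
    concentrations = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
    rates = [michaelis_menten(s, km=2.0, vmax=10.0) for s in concentrations]
    km_fit, vmax_fit = fit_hanes_woolf(concentrations, rates)
    print(km_fit, vmax_fit)
    print(relative_specificity(km_fit, vmax_fit, km_control=1.0, vmax_control=10.0))
```

With real, noisy data a direct non-linear fit to the Michaelis-Menten equation is generally preferred, because linearizations weight the experimental errors unevenly.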
Figure 5: The interactions seen between the DNA
phosphate groups and the EcoRV restriction endonuclease. This
is taken from the complex of EcoRV with AAAGATATCTT that
contains Mg²⁺ bound to one of the DNA strands (1RVB
co-crystal structure in the Protein Data Bank, Brookhaven National
Laboratory, Upton, NY) (39). ASN185 is asparagine 185
from one subunit of the dimeric endonuclease (the A subunit) and ASN*185 is asparagine 185 from the other (B) subunit, etc. MC, main chain; SC, side chain.
, water
molecule. All the contacts illustrated are
3.5 Å in length.
The sequence used for crystallography has been substituted, at the
appropriate positions, by the sequence used in this study, and the
GATATC recognition site is shown in uppercase letters. Only
the contacts made to the phosphates that have been investigated in this
paper are shown.
In addition to the above difficulties, the tacit
assumption that the analogue does not change the overall global
structure of the macromolecule is usually made. This is hard to
establish unequivocally. Only one crystal structure of an oligonucleotide
phosphorothioate, that of all-Rp-GsCGsCGsC, is known (28). Comparison with (GC)
runs of other sequences indicates that the phosphorothioates
cause no structural perturbations in this instance. However, it remains
to be established whether this will hold generally for other
phosphorothioate-containing oligonucleotides. Many synthetic
oligonucleotides containing one or a small number of phosphorothioates,
analogous to those used in this study, have been prepared. However,
very little structural characterization has been reported. The low
resolution methods used (usually based on Tm or circular
dichroism spectroscopy) have shown near identity with parent,
all-phosphate oligomers. It is also possible to model phosphorothioates
into ideal B-DNA without altering structural parameters. In Rp-isomers
the sulfur points into the major groove,
and for Sp it is located directly on the face of
the sugar phosphate backbone. Thus we have made the usual simplifying
assumption that phosphorothioates do not alter overall DNA structure.
The three phosphates GACGpATATCGTC,
GACGApTATCGTC, and GACGATpATCGTC, which fall within the recognition
site and include the scissile phosphodiester can be considered
together. These phosphates are characterized by extensive direct
contacts to the protein and participation in an extended interconnected
network of hydrogen bonds. Much of this network is mediated by highly
structured and ordered water molecules (39). This network
performs the following roles: 1) it interconnects the three phosphates,
2) it joins the phosphates to the amino acids in the recognition
(R)-loop (these comprise amino acids 182-187, which interact with
the GATATC bases via the major groove), 3) it links the phosphates to
the Q-loop (this is centered on amino acid 70 and interacts with the
DNA via the minor groove), and 4) it assembles the catalytic components
(Mg²⁺ and three acidic residues: Glu, Asp, and Asp) in a manner competent for
catalysis. The pro-S oxygen of GACGpATATCGTC makes the fewest contacts,
an isolated charge/charge interaction with the flexible side chain of
Lys, and this is the only position where sulfur
substitution gives a reasonable substrate. Alterations to all the other
five oxygen atoms give poor substrates. This undoubtedly arises from
disturbances to the elaborate network illustrated in Fig. 5,
which gives the impression of a very high degree of co-operativity,
whereby these three phosphates are connected to the amino acids
responsible for both specific base recognition and for catalysis. Thus
changes to these phosphate oxygens, even the very minor one of sulfur
substitution, are likely to lead to movements throughout the
hydrogen-bonded network and to the weakening of further protein-DNA
interactions. This concerted breakdown of the macromolecular interface
will reduce the binding energy available for DNA distortion and thus
prevent efficient catalysis. Two further points deserve mentioning.
First, by comparing the Mg²⁺- and
Mn²⁺-catalyzed cleavage of the phosphorothioates, an
additional metal ion, remote from the active site, was proposed to bind
to the GACGpATATCGTC phosphate (55). The exact role of this
metal ion in selectivity remains obscure, but it could certainly be
incorporated as an additional structural element in the network
illustrated. Second, in the case of the scissile bond GACGATpATCGTC,
the endonuclease is required to cleave a phosphorothioate diester
rather than a phosphodiester. These two diesters show approximately
similar hydrolysis rates, using model compounds in
solution (69), and so the very low enzyme-catalyzed rate cannot
be due to intrinsically lower chemical reactivity. Rather, as above, it is
likely to arise because of network disturbance and, in the case of the Sp-phosphorothioate
(a non-substrate), reduced Mg²⁺ binding. The atom replaced by sulfur in this isomer provides one of the ligands for
the metal ion, and it is well documented that phosphorothioates bind
Mg²⁺ more poorly than do
phosphates (26, 27).
The Sp phosphorothioate of GACGATApTCGTC is also not a substrate,
whereas the Rp is well cut. A similar result has
been observed with the EcoRI endonuclease (34). Based
on these results, and a number of other observations, it has been
proposed that the pro-R oxygen of this phosphate is the base that
deprotonates the hydrolytic water molecule for both
nucleases (66, 70). The location of the negative
charge on sulfur means that, with the Rp
phosphorothioate, an S atom replaces the pro-R
oxygen, and this is able to deprotonate the attacking water molecule,
giving turnover. In contrast, with the Sp
phosphorothioate, an uncharged double-bonded oxygen is placed in
the Rp position. This cannot abstract a proton from
the water, and so no hydrolysis is seen. This proposal has some
difficulties, such as phosphodiesters having (at least in free
solution) the wrong pKa value to deprotonate water.
Nevertheless, it is clear from our results that the pro-R oxygen of
this phosphate has a very important function. However, it is possible
that the loss of catalysis could also be due to disruptions to the
interactions shown in Fig. 5. These differ slightly between the
two strands and involve direct contacts to the side chains of
Lys and Thr, as well as a set of water-mediated
contacts that interconnects this phosphate with the adjacent
scissile phosphate and the catalytic apparatus.
Both
phosphorothioates of GACGATATpCGTC were poor substrates, with an
extremely low rate being observed with the Rp isomer. The pro-R oxygen of GACGATATpCGTC contacts the side chain
of Thr and is also in contact, via two water molecules,
with the important pro-R oxygen of the neighboring 5′ phosphate. One
assumes that the large reductions in rate seen with the Rp
phosphorothioate of GACGATATpCGTC arise both from
alterations to its immediate protein contacts and from disturbances
to the critical preceding phosphate. The pro-S oxygen of GACGATATpCGTC
makes a hydrogen bond to the side chain of Thr. In a
related article (75), it is shown, using site-directed
mutagenesis, that Thr is the most critical of all
phosphate binding residues. As pointed out in that report, Thr
is at one end of an α-helix that also contains a critical
catalytic residue, Glu. Furthermore, Fig. 5 shows
that Thr interacts with Gln on the other
subunit. This Gln is in the Q loop and approaches the minor groove of
the DNA. The two oxygen atoms of GACGATATCpGTC interact with the side
chains of Tyr and Arg, and this simply explains
the phosphorothioate results. At least one of the contacts will
be weakened by phosphorothioate substitution. We also note that the
pro-S oxygen forms a solvent-mediated contact with the side chain of
Gln. The main chain carbonyl oxygen of this amino acid is
a Mg²⁺ ligand in the enzyme product
complex (39).
The important phosphates, as assessed using
phosphorothioates, comprise GACGpApTpApTpCpGTC. This agrees very well
with the site-directed mutagenesis results reported in a related
paper (75). Furthermore, a confirmation of the critical nature
of these phosphates has come from an ethylation interference study with
the EcoRV endonuclease. At present we are unable to comment
in depth as to what each phosphate contributes, quantitatively, to
catalysis and specificity. This is because the steady state kinetics we
have used only report on the slowest step of the reaction, which for
these oligonucleotides is a mixture of the cleavage step and product
release (47). Thus the results in Table 1 report on the
important cutting step but may not provide a full and accurate
description of it. In addition, the symmetrical oligonucleotides used
place two phosphorothioates in the double-stranded 12-mer, and in some
cases situations like this can give rise to non-additive
effects (33). Nevertheless, we note that many of the
phosphorothioates have Vmax/Km
values that are much less than 30% of the value seen with the
control. This occurs not only for phosphates that may be directly
involved in catalysis (the scissile phosphate and the potential
phosphate base removed one step in the 3′ direction) but also for
several others, removed from the cleavage site, such as GACGpATATCGTC,
GACGATATpCGTC, and GACGATATCpGTC. It has previously been suggested that
a 0.7 kcal/mol penalty, i.e. a drop in Vmax/Km
to 30% of the control
value, is the most one might expect from a single phosphorothioate
substitution (12, 33). In our case we have two
phosphorothioates, one per strand, and this would translate to a
reduction to 15%. We propose that the larger Vmax/Km
reductions seen arise
from an initial perturbation of the protein-phosphate interaction,
which is followed by changes in the network of hydrogen bonds outlined
in Fig. 5. These concerted alterations to the protein-DNA
interface cause the disruption of additional protein-DNA contacts, and
so result in poorer turnover than might be expected from simple
phosphorothioate substitution. Of the important phosphates, only
GACGApTATCGTC gives Vmax/Km
reductions consistent with straightforward phosphorothioate
replacement. However, these preliminary suggestions await confirmation
using more sophisticated kinetic measurements.
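The link between a free energy penalty and a fractional Vmax/Km can be made explicit with the standard relation ratio = exp(-ΔΔG/RT). The Python sketch below is illustrative: the function names are hypothetical, and the temperature of 298.15 K is an assumption, since the assay temperature is not given in this section.

```python
import math

# Gas constant in kcal/(mol K); the default temperature is an assumed value.
R_KCAL = 1.987e-3


def ratio_from_penalty(ddg_kcal: float, temperature_k: float = 298.15) -> float:
    """Fraction of the control Vmax/Km left after a penalty of ddg_kcal."""
    return math.exp(-ddg_kcal / (R_KCAL * temperature_k))


def penalty_from_ratio(ratio: float, temperature_k: float = 298.15) -> float:
    """Apparent penalty (kcal/mol) implied by a fractional Vmax/Km."""
    return -R_KCAL * temperature_k * math.log(ratio)


if __name__ == "__main__":
    # A 0.7 kcal/mol penalty leaves roughly 30% of the control value.
    print(round(ratio_from_penalty(0.7), 2))
    # A drop to 15% of the control corresponds to roughly 1.1 kcal/mol.
    print(round(penalty_from_ratio(0.15), 2))
```

Converting each measured percentage into an apparent penalty in this way gives a common scale for comparing positions, although it inherits the caveat stated above that the steady state values may not isolate the cleavage step.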
Under physiological conditions the EcoRV endonuclease must distinguish between GATATC and all other sequences. One-base pair alterations to the GATATC recognition site give substrates that, under optimal conditions, are cut at vanishingly small rates (40, 42, 49). Using base analogues, we and others have shown that even small changes to the bases in the recognition site frequently give very poor substrates (13, 14, 15, 50, 51, 52). Often the reduction in turnover is much greater than can be accounted for in terms of simple loss of protein-DNA contacts, exactly as is seen with some of the phosphates. Just as alterations to particular phosphates lead to much poorer substrates than expected, because of weakening of other protein-DNA interactions, it is clear that changes to the GATATC sequence do not simply result in the deletion of the contact in question. Instead, concerted movements lead to further misalignments that include incorrect protein-phosphate interactions. This paper begins to identify the phosphates involved in this process. Only with a fully cognate GATATC sequence are the proper protein-phosphate contacts made, and only this provides enough energy to bend the DNA, create the metal binding site, and give rapid hydrolysis.