©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
Sequence-specific DNA Recognition by the SmaI Endonuclease (*)

(Received for publication, August 22, 1994; and in revised form, January 11, 1995)

Barbara E. Withers (§), Joan C. Dunbar (¶)

From the Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, Michigan 48201

ABSTRACT

SmaI endonuclease recognizes and cleaves the sequence CCCGGG. The enzyme requires magnesium for catalysis; however, equilibrium binding assays revealed that the enzyme binds specifically to DNA in the absence of magnesium. A specific association constant of 0.9 × 10^8 M^-1 was determined for SmaI binding to a 22-base duplex oligonucleotide. Furthermore, the K(A) was a function of the length of the DNA substrate: the enzyme exhibited an affinity of 1.2 × 10^9 M^-1 for a 195-base pair fragment, which represented a 10^4-fold increase in affinity over binding to nonspecific sequences. A K(m) of 17.5 nM was estimated from kinetic assays based on cleavage of the 22-base oligonucleotide and is not significantly different from the K(D) estimated from the thermodynamic analyses. Footprinting (dimethyl sulfate and missing nucleoside) analyses revealed that SmaI interacts with each of the base pairs within the recognition sequence. Ethylation interference assays suggested that the protein contacts three adjacent phosphates on each strand of the recognition sequence. Significantly, a predicted protein contact with the phosphate 3′ of the scissile bond may have implications in the mechanism of catalysis by SmaI.


INTRODUCTION

The molecular mechanism of sequence-specific recognition by DNA-binding proteins is a complex phenomenon, the details of which are still emerging. In recent years, resolution of the structures of a number of DNA-binding proteins and of protein·DNA cocomplexes has provided insight into the variety of structures of the DNA-binding domains, the network of bonding between the macromolecules, and also the significance of both protein and DNA conformational changes in the specific recognition process (1, 2, 3, 4).

The endonucleases of the bacterial type II restriction-modification systems provide excellent models by which to study mechanisms of sequence specificity of DNA-binding proteins (5, 6, 7). Two of the endonucleases, EcoRI and EcoRV, have been extensively studied by both biochemical and x-ray crystal structural analyses, which have revealed distinct mechanisms by which they achieve their high degree of sequence specificity: the cleavage selectivity of EcoRI is derived from both binding and catalytic specificity (8, 9), whereas the EcoRV endonuclease exhibits the same intrinsic affinity for all DNA sequences but achieves catalytic specificity in the presence of magnesium (10, 11, 12). The crystal structures also revealed that the enzymes utilize novel protein structures for DNA recognition and that there are significant differences in the topology of the respective protein·DNA complexes. In the EcoRI·DNA complex, the protein approaches the DNA predominantly from the major groove and makes specific interactions in the major groove with each of the purines and pyrimidines of the recognition site. In contrast, the EcoRV endonuclease approaches the DNA predominantly from the minor groove, although the sequence-specific contacts are made in the major groove. Protein-induced unwinding and bending of the DNA, including unstacking of the central base pairs, are features that characterize both endonucleases. However, whereas the net effect in the EcoRI·DNA complex is widening of the major groove (9), the DNA in the EcoRV·DNA complex is reminiscent of A-form DNA, with a narrower and deeper major groove (10). Furthermore, the overall kink and curvature appear to preclude the formation of hydrogen bonds between EcoRV and the central base pairs of the recognition sequence, such that direct interactions only occur at the outer 2 base pairs of each half site of the recognition palindrome (10).

The general lack of sequence similarity between the type II endonucleases (13) has restricted the use of sequence comparisons to probe the structure-function relationship of these enzymes. Furthermore, detailed biochemical and structural analyses of endonucleases have in the past been limited. Nonetheless, recent reports suggest that EcoRI and EcoRV may be paradigms for other endonucleases. 1) It has now been shown (14) that TaqI is similar to EcoRV in that it achieves specificity only in the presence of magnesium. 2) The recent determination of the structure of the BamHI endonuclease (15) has revealed an overall conformation very similar to that of EcoRI. The structural similarity occurs in the absence of any obvious sequence similarity between these enzymes. Furthermore, it has been suggested that the structure of the "common core motif" which, in EcoRI and BamHI, provides an ideal scaffold for positioning the active sites of the enzyme near the scissile bond, may also be conserved in other enzymes that similarly cleave their hexanucleotide recognition sequences to yield a 5′, 4-bp (^1) stagger (15). 3) Similarities have been detected in the architecture of the active site of all four endonucleases (EcoRI (9), EcoRV (10), BamHI (15), and PvuII (16, 17)) for which crystal structures are currently available.

Analyses of additional endonucleases should therefore enable potential trends to be discerned in the mechanism of recognition by these DNA-binding proteins. Furthermore, the existence of endonuclease isoschizomers makes it possible to analyze and compare the mechanism by which different enzymes interact with the same DNA sequence and how the requirements for recognition and catalysis are satisfied. We have initiated a comparative study of the SmaI and XmaI endonucleases. The enzymes recognize the sequence CCCGGG but cleave at different positions within the sequence such that SmaI cleaves at the internal CpG to yield a blunt-end scission whereas XmaI cleaves between the external cytosines to yield a 4 bp stagger. In the present study, an initial examination has been undertaken of the mechanism of sequence-specific recognition by the SmaI endonuclease and compared to that of EcoRV and PvuII which also produce blunt-end scissions.


MATERIALS AND METHODS

Enzymes and Chemicals

Synthetic oligonucleotides were prepared by solid-phase phosphoramidite synthesis at the Macromolecular Structure Facility, Michigan State University. T4 polynucleotide kinase was from New England Biolabs. [γ-^32P]ATP (>3,000 Ci/mmol) for end-labeling was obtained from Amersham Corp. NA-45-DEAE membranes were from Schleicher and Schuell and used according to the manufacturer's instructions. Bio-Gel P6 was obtained from Bio-Rad. Formic acid, dimethyl sulfate, and piperidine were obtained from Aldrich. Ethylnitrosourea was purchased from Sigma and hydrazine was supplied by Kodak. Duracryl, high tensile strength acrylamide, was obtained from Millipore Corp. In the studies reported, the endonuclease used was a commercial preparation obtained from Life Technologies, Inc. However, enzyme obtained from New England Biolabs or SmaI purified to homogeneity (^2) yielded similar results in all assays. The concentration of the endonuclease ([E](t)) was estimated from the x axis intercept in Scatchard plots of equilibrium binding data (18).
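The Scatchard estimate of the active enzyme concentration can be illustrated with a minimal sketch. It assumes a single class of independent binding sites, so that bound/free = K(A)([E](t) − bound); the function names and the data points are hypothetical (generated from an assumed K(A) of 0.09 nM^-1 and 4.0 nM enzyme), not measurements from this study.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def scatchard(bound_nM, free_nM):
    """Return (K_A in nM^-1, active enzyme in nM) from a Scatchard plot.

    Bound/free (y) is regressed on bound (x): the slope is -K_A and the
    x-axis intercept is the concentration of active binding sites.
    """
    ratios = [b / f for b, f in zip(bound_nM, free_nM)]
    slope, intercept = linear_fit(bound_nM, ratios)
    return -slope, -intercept / slope


# Hypothetical, noise-free data (assumed K_A = 0.09 nM^-1, 4.0 nM enzyme).
K_A_TRUE, E_T_TRUE = 0.09, 4.0
free = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 40.0]
bound = [E_T_TRUE * K_A_TRUE * f / (1.0 + K_A_TRUE * f) for f in free]

k_a, e_t = scatchard(bound, free)
print(f"K_A = {k_a * 1e9:.2e} M^-1, [E]t = {e_t:.2f} nM")
```

With real gel data the points scatter about the line, and the intercept then gives only an estimate of the fraction of enzyme that is competent to bind.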

Preparation of DNA Substrates

Oligonucleotides were 5′-end-labeled with T4 polynucleotide kinase and purified by electrophoresis on 18% denaturing polyacrylamide gels. The full-length oligomers were recovered by electroelution onto NA-45 membranes and desalted by gel filtration on a Bio-Gel P6 column from which the fragments were eluted with 20% ethanol. Equimolar concentrations of the complementary strands were resuspended in 66 mM Tris-Cl, pH 7.6, 7 mM MgCl(2), 3 mM dithiothreitol and annealed by heating at 90 °C for 5 min and then cooling slowly to room temperature. The sample was desalted by gel filtration as described above and the ethanol removed by evaporation. The samples were resuspended in H(2)O immediately before use. The concentration of DNA was estimated from the absorbance at 260 nm.

Electrophoretic Analyses of Protein·DNA Complexes

SmaI·DNA complexes were formed by incubation of the enzyme with ^32P-labeled DNA in 20 mM HEPES, pH 7.5, 20 mM potassium glutamate, 0.5 mM EDTA, and 0.1 mM dithiothreitol in a 20-µl reaction volume. Samples were incubated for 1 h at room temperature, after which 5 µl of loading buffer (20 mM HEPES, pH 7.8, containing 2 mM EDTA, pH 8.0, 30% glycerol and 0.01% bromphenol blue) was added to the binding reaction. The samples were electrophoresed on 10% polyacrylamide gels containing 50 mM HEPES, 2 mM EDTA, pH 7.8. Electrophoresis was carried out at room temperature at a constant voltage of 200 V in an electrophoresis buffer of 50 mM HEPES and 2 mM EDTA, pH 7.8. The protein·DNA complexes were detected by autoradiography.

Determination of the Equilibrium Binding Constant

The equilibrium association constant was determined for the binding of SmaI to a specific recognition oligonucleotide as shown: CATGACTGGCCCGGGATCCAGT, CTGACCGGGCCCTAGGTCAGTA, and also to a 195-bp fragment. The latter fragment was obtained from amplification of the sequences surrounding the SmaI site in M13mp18 using the reaction conditions and primer set C as described previously (19). The DNA substrates were 5′-end-labeled with T4 polynucleotide kinase, and binding reactions were carried out as described above in a 20-µl reaction volume. The concentration of endonuclease was 1 nM for binding to the 195-bp fragment (0.01-5 nM DNA) and 4.0 nM for incubation with the oligonucleotide substrate (0.2-40 nM DNA). Reactions were incubated for 1 h at room temperature, after which the free DNA and protein·DNA complexes were separated by electrophoresis on either a 7.5% (195-bp fragment) or 10% (oligonucleotide substrate) non-denaturing polyacrylamide gel as described above. The distribution of radioactivity between protein-bound and free DNA was quantitated on an AMBIS Radioanalytical Imaging System (AMBIS, San Diego, CA). The amount of single-stranded DNA in the oligonucleotide-containing samples was less than 10% of the total DNA concentration. The association constant was determined as the reciprocal of the dissociation constant, which was estimated from non-linear regression analysis of the binding data. The nonspecific association constant was estimated from equilibrium competition assays as described by Terry et al. (20). The oligonucleotide fragment in the competition assays was a GC-rich 22-bp duplex oligonucleotide (ATTCGATCGGGGCGGGGCGAGC) lacking the SmaI recognition site. The competitor DNA for the 195-bp substrate was obtained from polymerase chain reaction amplification of M13mp19 using the forward primer of primer set E and the reverse primer from primer set F, as described previously (19). The 229-bp product was digested with BamHI to remove the SmaI site. The cleaved sample was electrophoresed on a 1.5% agarose gel, and the resulting 185-bp fragment was subsequently purified using the Magic PCR Preps DNA Purification System (Promega).
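As an illustration of how a dissociation constant can be extracted by non-linear regression and converted to an association constant, the sketch below fits a single-site hyperbola, bound = B(max) · free / (K(D) + free). The fitting routine (a one-dimensional search over K(D) with B(max) solved in closed form) is an assumption chosen for simplicity and is not the software used in this study; the data are hypothetical values generated from an assumed K(D) of 11 nM.

```python
import math


def fit_kd(free_nM, bound_nM, kd_low=0.01, kd_high=1000.0, rounds=60):
    """Least-squares fit of bound = b_max * free / (K_D + free).

    For any trial K_D the best b_max has a closed form, so only K_D is
    searched (ternary search on log10 K_D, assuming a single minimum).
    Returns (K_D in nM, b_max in nM).
    """
    def best_bmax(kd):
        shape = [f / (kd + f) for f in free_nM]
        return sum(s * b for s, b in zip(shape, bound_nM)) / sum(s * s for s in shape)

    def sse(log_kd):
        kd = 10.0 ** log_kd
        b_max = best_bmax(kd)
        return sum((b - b_max * f / (kd + f)) ** 2
                   for f, b in zip(free_nM, bound_nM))

    lo, hi = math.log10(kd_low), math.log10(kd_high)
    for _ in range(rounds):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if sse(m1) < sse(m2):
            hi = m2
        else:
            lo = m1
    kd = 10.0 ** ((lo + hi) / 2.0)
    return kd, best_bmax(kd)


# Hypothetical, noise-free isotherm (assumed K_D = 11 nM, 4.0 nM enzyme).
free = [0.2, 0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 40.0]
bound = [4.0 * f / (11.0 + f) for f in free]

k_d, b_max = fit_kd(free, bound)
k_a = 1e9 / k_d  # nM -> M, then reciprocal: association constant in M^-1
print(f"K_D = {k_d:.2f} nM, K_A = {k_a:.2e} M^-1, B_max = {b_max:.2f} nM")
```

Fitting the untransformed data in this way avoids the distortion of experimental error that a linearized plot introduces.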

Kinetic Assays

Steady-state cleavage assays were carried out at room temperature in 50 mM potassium acetate, 20 mM Tris-acetate, 10 mM MgCl(2), 1 mM dithiothreitol, pH 7.4. The reaction was initiated by the addition of the endonuclease to a final concentration of 70 pM, and 3-µl samples were withdrawn at various intervals and quenched by the addition of an equal volume of cold "stop" solution (50 mM EDTA, pH 8.0, 30% glycerol, 0.01% bromphenol blue). Samples were kept on ice until all time points were taken and then diluted with an equal volume of H(2)O prior to loading on a 20% polyacrylamide gel. Electrophoresis was carried out in TBE buffer in 8-cm gels at a constant voltage of 17 V/cm. The percent radioactivity in substrate and product was quantitated on an AMBIS Radioanalytical Imaging System. The initial velocities at each substrate concentration were obtained from the linear plots of product formed versus time over a period in which less than 10% of substrate was utilized. Substrate concentrations ranged from 5 to 100 nM, which corresponds to approximately 0.3-6.0 times the calculated K(m). The kinetic constants were calculated from the initial velocity data using Enzfitter (Biosoft, Cambridge, United Kingdom) software.
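The initial-velocity criterion described above can be sketched as follows. The function name, the use of an ordinary least-squares line, and the time course itself are hypothetical illustrations; only the 10% conversion limit is taken from the text.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def initial_velocity(times_min, product_nM, substrate_nM, max_fraction=0.10):
    """Slope (nM/min) of product versus time over the early, linear window.

    Only time points at which no more than max_fraction of the substrate
    has been converted to product are used in the fit.
    """
    window = [(t, p) for t, p in zip(times_min, product_nM)
              if p <= max_fraction * substrate_nM]
    if len(window) < 2:
        raise ValueError("need at least two points in the linear window")
    slope, _ = linear_fit([t for t, _ in window], [p for _, p in window])
    return slope


# Hypothetical time course at 25 nM substrate; the later points curve off.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 12.0]
product = [0.0, 0.5, 1.0, 1.5, 2.0, 2.8, 3.5, 4.6]

v0 = initial_velocity(times, product, substrate_nM=25.0)
print(f"v0 = {v0:.2f} nM/min")
```

Restricting the fit to the first 10% of conversion keeps substrate depletion and product accumulation from depressing the measured rate.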

Interference Footprinting Assays: Dimethyl sulfate (DMS) Methylation

Chemical modification of the oligonucleotides with dimethyl sulfate was carried out essentially as described by Siebenlist and Gilbert(21) . Approximately 10 µg of the duplex oligonucleotide, in which only one of the strands was 5`-end-labeled, was resuspended in 200 µl of 50 mM sodium cacodylate, pH 8.0, and 1 mM EDTA, pH 8.0. 1 µl of DMS was added, and the samples were incubated for 10 min at room temperature. Reactions were stopped by the addition of 40 µl of DMS stop solution (1.5 M sodium acetate, pH 7.0, containing 1 M 2-mercaptoethanol). The methylated DNA was ethanol precipitated (in the presence of 5 µg of tRNA), washed with 80% ethanol, and dried under vacuum. The DNA was resuspended in dH(2)O and subsequently used as a substrate for endonuclease binding and gel retardation assays as described above.

The DNA in the protein·DNA complexes and the free DNA was electrotransferred onto NA-45 membranes, eluted with 1 M NaCl, extracted with phenol-chloroform, and recovered by ethanol precipitation. Strand cleavage at the sites of modification was carried out by incubation of the sample in 100 µl of 10% piperidine at 90 °C for 30 min. Samples were lyophilized three times to remove the piperidine and subsequently analyzed on denaturing (20%) polyacrylamide gels.

Missing Nucleoside Analysis

The depurination and depyrimidination modification reactions were carried out on the single-stranded oligonucleotides which were subsequently annealed to the complementary, unmodified strand.

Pyrimidine modification of the 5′-end-labeled oligonucleotides (10 µg) was carried out essentially as described by Brunelle and Schleif (22). The reaction was stopped by the addition of 200 µl of hydrazine stop buffer (0.3 M sodium acetate, pH 7.0, containing 1 mM EDTA). The DNA was ethanol precipitated in the presence of 5 µg of tRNA, washed with 80% ethanol, and dried under vacuum. Depurination reactions were carried out using formic acid as described previously (22).

Phosphate Alkylation

10 µg of 5′-end-labeled single-stranded DNA was suspended in 100 µl of 50 mM sodium cacodylate. 100 µl of ethanol saturated with ethylnitrosourea was added, and the reaction was incubated for 1 h at 50 °C, after which the DNA was recovered by ethanol precipitation. Modified oligonucleotides were annealed to the unmodified complementary strand and the duplex oligonucleotides used as substrates for protein binding. Binding reactions, retardation assays, and DNA isolation were performed as described above. The DNA was resuspended in 15 µl of 10 mM NaPO(4), pH 7.0, containing 1 mM EDTA, and strand scission was carried out by the addition of 2.5 µl of 1 M NaOH and incubation of the reaction at 90 °C for 30 min. The DNA was recovered by ethanol precipitation and analyzed on 18% denaturing polyacrylamide gels. A ladder of sized DNA fragments was generated by phosphodiesterase digestion of the unmodified DNA. The oligonucleotide was resuspended in 25 mM Tris-Cl, pH 8.4, containing 5 mM MgCl(2) and incubated at 37 °C with 3 times 10 units of phosphodiesterase. Aliquots were withdrawn at 2-min intervals over a period of 15-20 min. The reaction was stopped at each time point by the addition of chloroform. The samples were pooled and subsequently electrophoresed in an 18% denaturing polyacrylamide gel.

Quantitation

Band intensities on the sequencing gels used in the interference assays were quantitated after scanning of the autoradiograms with a Bio Image (Kodak) laser scanning densitometer. The data were analyzed with Visage system software (Milligen). Peak areas were used as a measure of image density.


RESULTS

Characterization of the SmaI·DNA Complex

Incubation of the SmaI endonuclease with a 22-base duplex oligonucleotide containing the recognition sequence resulted in the appearance of a single protein·DNA complex in gel retardation assays (Fig. 1A). In contrast, no retardation of the DNA was detected when a GC-rich fragment lacking the recognition site was used as a substrate (Fig. 1B) or when the recognition oligonucleotide was modified with the SmaI methylase (data not shown). The inclusion of magnesium in the binding reaction with the specific substrate or the addition of magnesium to preformed complexes resulted in the loss of the complex and the appearance of the cleavage products (Fig. 1C). The SmaI endonuclease therefore appears to form stable specific complexes with DNA in the absence of magnesium.


Figure 1: Gel retardation assays of DNA binding by SmaI endonuclease. A, protein·DNA complexes formed after incubation of the specific 22-bp duplex oligonucleotide (0.3 nM) with SmaI endonuclease. Lanes 1-7 correspond to 0, 0.2, 0.38, 0.75, 1.5, 3.0, and 6.0 nM SmaI, respectively. B, electrophoresis of binding reactions in which SmaI was incubated with a 22-bp nonspecific DNA fragment (0.3 nM). Lanes 1 and 2 correspond to endonuclease concentrations of 0.75 and 3 nM, respectively. C, 0.3 nM specific oligonucleotide (lane 1) and nonspecific oligonucleotide (lane 2) were incubated with SmaI endonuclease for 1 h at room temperature in the presence of 10 mM MgCl(2).



Binding of the endonuclease to the recognition fragment occurs over a narrow range of KCl concentrations with maximum binding occurring at approximately 25 mM KCl. The apparent salt dependence of the protein-DNA interaction was anion-specific. At higher (>30 mM) salt concentrations, the inhibition by potassium glutamate was considerably less than that of potassium chloride at an equivalent concentration (Fig. 2). A similar effect of glutamate has been observed with RsrI (5) as well as other DNA-binding proteins including DNA polymerase III holoenzyme (23) and the lac repressor(24) . In the latter study, glutamate was concluded to be an inert anion in the relative competition between anions and DNA phosphate groups for binding to the protein. It is significant that, as also shown in Fig. 2, binding of SmaI to the specific substrate also occurred in the absence of KCl or potassium glutamate. SmaI has an absolute requirement for potassium, as well as magnesium, for catalytic activity. This requirement, therefore, appears to reflect a property of the cleavage reaction rather than a role for potassium in substrate recognition.


Figure 2: Salt dependence of the formation of the SmaI·DNA complexes. The endonuclease (2 nM) was incubated with the specific oligonucleotide (0.3 nM) in 20 mM HEPES, pH 8.0, containing 0.5 mM EDTA and 0.1 mM dithiothreitol and various concentrations of KCl (C) or potassium glutamate (G) as shown. In the last lane (0 mM), the reactions were carried out in 20 mM HEPES, pH 7.8, containing 0.5 mM EDTA and 0.1 mM dithiothreitol. Binding reactions were carried out at room temperature for 1 h.



Preliminary studies revealed that the SmaI·DNA complexes are not efficiently retained on nitrocellulose filters under standard binding reaction conditions. Quantitation of the protein·DNA complexes in gel retardation assays was therefore used to estimate the equilibrium association constant for the protein-DNA interaction. The binding isotherm for the interaction of SmaI with the 22-bp oligonucleotide is shown in Fig. 3A. Incubation of a fixed concentration of the enzyme with an increasing concentration of the recognition fragment resulted in a saturable, hyperbolic binding curve. The site-specific association constant was calculated from the derived Scatchard plot (Fig. 3B) and yielded a K(A) of 0.91 (±0.32) × 10^8 M^-1. Competition assays, similar to those described by Terry et al. (20) and Aiken and Gumport (5), were used to determine the relative affinity of SmaI for specific and nonspecific DNA sequences. The K(i) for a 22-bp GC-rich fragment, lacking the recognition site, was estimated to be 1.09 × 10^6 M^-1. It can be estimated from the minimum size of the oligonucleotide required for maximum activity of the enzyme that the stable interaction of SmaI with DNA appears to require at least 12 bp. The competitor oligonucleotide therefore contains at least 10 possible binding sites for the endonuclease. Consequently, a K(i) of approximately 1 × 10^5 M^-1/site can be estimated. This value suggests that there is at least a 1,000-fold difference in the affinity of the enzyme for its recognition sequence over that for non-cognate sequences.
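The per-site correction can be reproduced with a few lines of arithmetic. This is a sketch that assumes overlapping binding frames on a linear fragment, so that a fragment of N bp offers N − n + 1 frames for a site of n bp; that counting rule and the function names are assumptions, whereas the constants are the values quoted above.

```python
def nonspecific_sites(fragment_bp, site_size_bp):
    """Number of overlapping binding frames on a linear fragment (N - n + 1)."""
    return fragment_bp - site_size_bp + 1


def per_site_constant(k_i, fragment_bp, site_size_bp):
    """Fragment-level K_i divided by the number of potential binding frames."""
    return k_i / nonspecific_sites(fragment_bp, site_size_bp)


K_A_SPECIFIC = 0.91e8  # M^-1, 22-bp recognition oligonucleotide (from the text)
K_I_FRAGMENT = 1.09e6  # M^-1, 22-bp GC-rich competitor (from the text)

sites = nonspecific_sites(22, 12)                     # 11 frames
k_site = per_site_constant(K_I_FRAGMENT, 22, 12)      # about 1 x 10^5 M^-1 per site
specificity = K_A_SPECIFIC / k_site                   # close to 10^3

print(f"{sites} sites, K_i/site = {k_site:.2e} M^-1, ratio = {specificity:.0f}")
```

Under this counting rule the ratio comes out slightly below 10^3, consistent with the order-of-magnitude statement in the text.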


Figure 3: A, binding isotherm of SmaI endonuclease to a 22-bp specific recognition oligonucleotide. 4.0 nM endonuclease was incubated with 0-80 nM DNA as shown. B, Scatchard analysis of the binding data.



The specificity of the endonuclease was further examined using a 195-bp recognition fragment. Titration of the DNA substrate with increasing concentrations of the enzyme again resulted in the appearance of a single retarded complex (Fig. 4A). There was no evidence of multiple protein·DNA complexes indicative of nonspecific binding of the endonuclease to the DNA fragment. At the higher concentrations of enzyme there was smearing of the band corresponding to the protein·DNA complex. This effect arises from adding a larger volume of the enzyme, and hence increasing the percentage of glycerol (in which the enzyme is stored) in the samples. Decreasing the concentration of glycerol in the reactions eliminated the smearing. Titration of the enzyme with the 195-bp fragment and the resulting binding isotherm are shown in Fig. 4, B and C. The corresponding specific association constant was calculated to be 1.23 (±0.1) × 10^9 M^-1 and represents an affinity which is an order of magnitude greater than that observed for the short oligonucleotide substrate. Binding assays carried out in the presence of the 185-bp competitor fragment yielded a value of 3.7 × 10^7 M^-1 for the K(i) as determined directly from the Dixon plot (Fig. 4D). The site-specific association constant for non-cognate sites (K(i)/number of potential binding sites (25)) is approximately 2.08 × 10^5 M^-1. The SmaI endonuclease therefore exhibited an affinity for its recognition site approximately 10^4 times greater than that for random DNA sequences, which is indicative of sequence-specific binding by the enzyme in the absence of magnesium.
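One way a competitor association constant can be read from a Dixon-type plot is sketched below. It assumes simple competitive binding with free concentrations approximated by total concentrations, for which 1/θ = 1 + 1/(K(A)[D]) + (K(i)/(K(A)[D]))[I], where θ is the fraction of enzyme bound to the specific DNA, [D] the specific DNA concentration, and [I] the competitor concentration. The exact procedure used for Fig. 4D is not stated, so the function names, the 1 nM DNA concentration, and the competitor series are all hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fraction_bound(dna_nM, competitor_nM, k_a, k_i):
    """Fraction of enzyme on specific DNA with competitor present (constants in nM^-1)."""
    specific = k_a * dna_nM
    return specific / (1.0 + specific + k_i * competitor_nM)


def dixon_ki(competitor_nM, fractions, dna_nM, k_a):
    """K_i (nM^-1) from the slope of 1/fraction versus competitor concentration."""
    slope, _ = linear_fit(competitor_nM, [1.0 / f for f in fractions])
    return slope * k_a * dna_nM


# Synthetic data generated from the constants quoted in the text
# (K_A = 1.23 x 10^9 M^-1 = 1.23 nM^-1; K_i = 3.7 x 10^7 M^-1 = 0.037 nM^-1).
K_A, K_I, DNA = 1.23, 0.037, 1.0
competitor = [0.0, 5.0, 10.0, 20.0, 40.0, 80.0]
theta = [fraction_bound(DNA, i, K_A, K_I) for i in competitor]

k_i = dixon_ki(competitor, theta, DNA, K_A)
print(f"K_i = {k_i * 1e9:.2e} M^-1")
```

Dividing the recovered K(i) by the number of potential binding sites on the competitor then gives the per-site constant quoted in the text.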


Figure 4: Analysis of binding of SmaI endonuclease to a specific 195-bp substrate. A, titration of 0.3 nM specific DNA with SmaI endonuclease. Lanes 1-7 correspond to 0, 0.1, 0.2, 0.4, 0.8, 1.6, 3.2, and 6.4 nM endonuclease, respectively. Binding reactions were carried out in 20 mM HEPES, pH 7.8, containing 20 mM potassium glutamate for 1 h at room temperature. The protein·DNA complexes were visualized by autoradiography after electrophoresis of the samples on 7.5% non-denaturing gels. B, titration of 1 nM SmaI with increasing concentrations of 195-bp substrate. Lanes 1-9 correspond to 0, 0.01, 0.025, 0.05, 0.10, 0.25, 0.5, 1.0, 2.5, and 5.0 nM DNA, respectively. C, binding isotherm (inset) and derived Scatchard analysis of the binding data shown in B. D, Dixon plot representing competition for SmaI binding by a 185-bp competitor lacking the CCCGGG recognition sequence.



Steady-state kinetic analysis of SmaI cleavage of the specific oligonucleotide substrate is shown in Fig. 5. The reactions were carried out at room temperature, and no dissociation of the double-stranded substrate during the course of the reaction was evident in the gel electrophoresis assays. The endonuclease obeys Michaelis-Menten kinetics and can be saturated with substrate. The kinetic analyses yielded a K(m) of 17.3 nM and a k(cat) of 23.8 min^-1 (average of three determinations).


Figure 5: Steady-state kinetic analysis of SmaI endonuclease cleavage of the 22-base duplex oligonucleotide substrate. SmaI was incubated with 0.005-0.085 µM DNA as described under "Materials and Methods" and the initial rates of hydrolysis determined. A, initial velocity of cleavage as a function of the substrate concentration. B, Eadie-Scatchard analysis of the initial velocity data. The line represents a linear least-squares fit to the data.
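The Eadie-Scatchard transformation in panel B can be sketched as follows. For Michaelis-Menten kinetics, v/[S] = V(max)/K(m) − v/K(m), so a plot of v/[S] against v has a slope of −1/K(m) and an x-axis intercept of V(max); k(cat) follows from V(max)/[E]. The velocities below are synthetic, generated from the reported constants and the 70 pM enzyme concentration given under "Materials and Methods"; the function names and substrate series are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def eadie_scatchard(substrate_nM, velocity_nM_per_min):
    """Return (K_m in nM, V_max in nM/min) from a plot of v/[S] versus v."""
    ratios = [v / s for v, s in zip(velocity_nM_per_min, substrate_nM)]
    slope, intercept = linear_fit(velocity_nM_per_min, ratios)
    return -1.0 / slope, -intercept / slope


ENZYME_NM = 0.070                    # 70 pM endonuclease (from the text)
KM_TRUE, KCAT_TRUE = 17.3, 23.8      # nM and min^-1 (from the text)
V_MAX_TRUE = KCAT_TRUE * ENZYME_NM   # nM/min

# Synthetic, noise-free initial velocities over a 5-85 nM substrate range.
substrate = [5.0, 10.0, 20.0, 40.0, 60.0, 85.0]
velocity = [V_MAX_TRUE * s / (KM_TRUE + s) for s in substrate]

k_m, v_max = eadie_scatchard(substrate, velocity)
k_cat = v_max / ENZYME_NM
print(f"K_m = {k_m:.1f} nM, k_cat = {k_cat:.1f} min^-1")
```

Because v appears on both axes of this plot, experimental error is correlated between them, which is one reason direct non-linear fitting of the initial velocity data is generally preferred for the final constants.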



Interference Footprinting Assays: SmaI Base Contacts

The results of the methylation interference assays in which the DNA was partially alkylated with dimethyl sulfate at the N-7 position of guanine are shown in Fig. 6A. The same 22-bp duplex oligonucleotide, as used in the binding assays, was also used as a substrate for DNA binding in these experiments. The protein-bound and free DNA probes were isolated and cleaved at the phosphodiester bond at each of the methylated bases. Comparison of the cleavage patterns of the free DNA and the DNA present in the protein·DNA complex suggested that N^7 methylation of any of the purines within the recognition site significantly decreased the protein-DNA interaction. Modification of guanines beyond the recognition site did not appear to interfere with complex formation. SmaI therefore appears to make sequence-specific contacts with each of the guanines in the major groove of the DNA.


Figure 6: A, DMS interference footprinting assays of the SmaI·DNA complexes for the top and bottom strand of the 22-bp recognition fragment. Lane G corresponds to the G+A specific (Maxam-Gilbert) sequencing products. B (bound) and F (free) correspond to the fragments generated from the DNA isolated from the protein·DNA complexes and free DNA, respectively. B, missing nucleoside analysis of SmaI binding to the 30-bp recognition oligonucleotide. Lanes G+A correspond to the fragments generated from the acid depurination of the substrate; T+C represents the depyrimidination of the substrate DNA. B and F correspond to the protein-bound and free DNA as described in A. C, histogram summary of the missing nucleoside analysis for SmaI binding to the top and bottom strands of the 30-bp recognition oligonucleotide. Relative intensity represents the ratio of the intensities of the free/bound sample for each fragment.
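The free/bound ratio plotted in the histogram can be computed from densitometric peak areas as in the sketch below. The normalization to flanking reference bands (to correct for unequal loading of the two lanes), the threshold of 2.0, the position labels, and all peak areas are hypothetical assumptions for illustration; they are not values or procedures reported in this study.

```python
def interference_ratios(free_areas, bound_areas, reference_positions):
    """Free/bound peak-area ratio per position.

    Ratios are normalized so that the reference (flanking) positions
    average 1, which is an assumed correction for unequal lane loading.
    """
    raw = {pos: free_areas[pos] / bound_areas[pos] for pos in free_areas}
    ref_mean = sum(raw[p] for p in reference_positions) / len(reference_positions)
    return {pos: ratio / ref_mean for pos, ratio in raw.items()}


# Hypothetical peak areas for five guanine bands on one strand.
free_areas = {"G-flank-1": 100.0, "G-site-1": 120.0, "G-site-2": 110.0,
              "G-site-3": 115.0, "G-flank-2": 90.0}
bound_areas = {"G-flank-1": 95.0, "G-site-1": 12.0, "G-site-2": 10.0,
               "G-site-3": 23.0, "G-flank-2": 92.0}

ratios = interference_ratios(free_areas, bound_areas, ["G-flank-1", "G-flank-2"])

# Positions depleted from the bound fraction (ratio above an assumed threshold).
THRESHOLD = 2.0
contacts = [pos for pos, ratio in ratios.items() if ratio > THRESHOLD]
print(contacts)
```

A modification that interferes with binding is under-represented in the bound lane, so its free/bound ratio rises well above that of the flanking positions.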



The importance of the guanine bases in sequence-specific binding was further indicated by missing nucleoside analyses (22). This latter approach, in which the purine or pyrimidine bases can be selectively removed from the DNA, provides a method for analyzing the contribution of both purines and pyrimidines to the specific recognition of the DNA by the protein. Furthermore, while methylation effects may arise from steric hindrance, the effect of base removal, as used in these experiments, more accurately reflects potential H-bond interactions.

The oligonucleotide substrate used in the missing base analysis was a 30-bp oligomer GCATGCACATGACTGGCCCGGGATCCAGT, ACGTGTACTGACCGGGCCCTAGGTCATCGT. The internal sequence of the oligonucleotide was identical to that of the 22-mer used in the methylation interference assays, but the fragment contained additional bases at the 5′-end to facilitate the recovery of more fragments flanking the recognition site after cleavage of the DNA. Maxam-Gilbert sequencing of each strand of the control oligonucleotide revealed that the upper strand was heterogeneous in that it contained a dG in addition to the correct dC at the third position of the recognition site, i.e. CC(C/G)GGG (this is apparent in the control T+C lane in Fig. 6B). The presence of the mixed bases is presumed not to interfere with the footprinting experiments since the fragments with the incorrect recognition sequence will not be specifically bound by the endonuclease.

Fig. 6B is representative of the autoradiograms obtained for the missing base analyses of the SmaI-DNA interaction. An autoradiogram corresponding to the top strand only is shown for illustrative purposes; the results for both top and bottom strands are summarized in Fig. 6C and are representative of three different experiments. Comparison of the band intensities in the lanes corresponding to the protein-bound and free DNA revealed that depurination of the guanines within the recognition site significantly reduced protein binding. The absence of guanines in the lanes corresponding to the protein·DNA complex was evident for both the top and bottom strands of the recognition fragment. Similarly, in the protein-bound lanes there was a marked reduction in intensity of the fragments corresponding to the modification of cytosines within the recognition sequence. However, there appeared to be an increase in the band intensity for the cytosine of the central CpG dinucleotide relative to the adjacent cytosines of the recognition site. Similar differences in the relative intensity were detected in the free and control DNA samples and suggest that this base may be generally more reactive toward the modification reagent than the adjacent bases. Nonetheless, the free/bound ratio was consistently less than that of the outer two cytosines of the recognition sequence. There appeared to be no significant interaction of the SmaI endonuclease with bases flanking the CCCGGG recognition site.

The interference assays suggest full site recognition by the SmaI endonuclease. At least part of the recognition appears to occur in the major groove of the DNA, as evidenced by the effect of guanine N^7 methylation on the formation of the protein·DNA complexes. Furthermore, the concordance between the DMS and missing base analyses suggests the effect of DMS methylation may be attributed to the loss of hydrogen bond interactions rather than a steric effect of the methyl group.

Interference Footprinting Assay: SmaI-Phosphate Contacts

Ethylation interference assays were used to delineate the phosphate groups which are important in the endonuclease-DNA interaction. Fig. 7A shows the interference pattern typically obtained for SmaI binding to the substrate premodified with ethylnitrosourea. Doublet bands are apparent, particularly in the lane corresponding to the free DNA. The double bands arise from the alkaline hydrolysis which can occur at either side of the alkylated phosphate, yielding fragments that terminate in a 3′-hydroxyl or 3′-ethylphosphate (26). Due to the short length of the oligonucleotides used in these studies, these species are partially resolved on the sequencing gels. To accurately identify the fragments in the footprinting assays, several controls were simultaneously run on the sequencing gels. In addition to the Maxam-Gilbert sequence ladder (in which fragments resulting from the alkaline cleavage migrate approximately 0.5 positions slower than the corresponding base-specific cleavage bands (21)), the control oligonucleotides comprising the upper and lower strand of the duplex substrate were individually 5′-end-labeled and hydrolyzed with phosphodiesterase to yield a ladder of fragments terminating in a 3′-hydroxyl. Furthermore, the duplex end-labeled fragments were cleaved with XmaI, which generates two fragments, a 17-base fragment from the upper strand and a 12-base fragment from the lower strand. The XmaI cleavage products were used to identify fragments in the sample lanes of the sequencing gels which terminate at the first cytosine of the recognition sequence and from which the identity of the remaining bands in the sample can be unambiguously assigned.
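The sizes of the XmaI marker fragments can be checked directly from the oligonucleotide sequences. The sketch below assumes that the upper strand of the 30-bp substrate is printed 5′ to 3′ and the lower strand 3′ to 5′ (as in a conventional duplex display), and it uses the cleavage positions stated in the Introduction: SmaI cuts at the internal CpG (CCC^GGG) and XmaI after the first cytosine (C^CCGGG). The function and variable names are hypothetical.

```python
SITE = "CCCGGG"
# Number of recognition-site bases retained on the 5' side of the cut.
CUT_OFFSET = {"SmaI": 3, "XmaI": 1}


def labeled_fragment_length(strand_5to3, enzyme):
    """Length of the 5'-end-labeled fragment released from one strand."""
    start = strand_5to3.find(SITE)
    if start < 0:
        raise ValueError("recognition site not found")
    return start + CUT_OFFSET[enzyme]


TOP = "GCATGCACATGACTGGCCCGGGATCCAGT"           # as printed, read 5'->3'
BOTTOM_3TO5 = "ACGTGTACTGACCGGGCCCTAGGTCATCGT"  # as printed, assumed 3'->5'
BOTTOM = BOTTOM_3TO5[::-1]                       # rewritten 5'->3'

top_xmai = labeled_fragment_length(TOP, "XmaI")        # 17
bottom_xmai = labeled_fragment_length(BOTTOM, "XmaI")  # 12
print(top_xmai, bottom_xmai)
```

Under these assumptions the calculation reproduces the 17- and 12-base fragments quoted above; the same function with "SmaI" gives the positions of the blunt-end cleavage products.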


Figure 7: A, ethylation interference footprints of SmaI binding to the top strand of the recognition fragment. Lanes G+A and T+C correspond to the Maxam-Gilbert sequencing products. B and F correspond to the fragments derived from the protein-bound and free DNA, respectively. C1 represents the ladder of sized DNA fragments generated by partial phosphodiesterase cleavage of the substrate DNA. C2 corresponds to fragments generated from XmaI cleavage of the duplex substrate and which yields a 17- and 12-base fragment resulting from cleavage of the top and bottom strands, respectively. B, histogram summary of the phosphate alkylation interference assays for the top and bottom strands of the duplex substrate.



The interference pattern obtained for the SmaI·DNA complex revealed only three potential phosphate contacts per strand. The contacts were symmetrical and corresponded to the GGG trinucleotide on each strand. Inspection of the autoradiogram and resulting histogram summary (Fig. 7B) suggested there may be additional protein-phosphate contacts beyond the recognition site on the upper strand. However, the relative intensity (bound/free) is considerably less than that of the proposed phosphate contacts within the recognition site.


DISCUSSION

The SmaI endonuclease appears to readily discriminate between specific and nonspecific sequences in the absence of magnesium. The endonuclease formed a stable complex with a short (22 bp) recognition oligonucleotide but failed to bind to oligonucleotides lacking the cognate sequence. Furthermore, titration of a 195-bp fragment containing the recognition site revealed only a single protein·DNA complex even in the presence of greater than a 20-fold molar excess of the enzyme.

The specific association constants have been determined for only a limited number of endonucleases. The affinity of SmaI for short oligonucleotide substrates is lower than that of EcoRI for which an association constant of approximately 1 times 10M (for a 34-bp substrate) has previously been reported(20) . Nonetheless, there are additional examples of endonucleases that have relatively low specific association constants yet bind specifically to DNA. Thielking and co-workers(11) , using a 20-mer as a substrate, have determined an affinity constant of 4 × 10^8 M^-1 for an inactive mutant of EcoRV that, in the presence of magnesium, binds specifically to DNA but fails to cleave the substrate. In addition, Jen-Jacobsen et al.(27) constructed N-terminal deletion mutants of EcoRI with reduced (100-fold) binding affinity that nonetheless retained the ability to discriminate between specific and nonspecific sequences. The specificity index for SmaI of 10^3-10^4 is also consistent with sequence-specific binding when compared to the less than 40-fold difference in the affinity of EcoRV for specific and nonspecific sequences (25) and the 4-fold difference reported for TaqI(14) .

The affinity of the SmaI endonuclease for specific (and nonspecific) sequences was also a function of the length of the DNA substrate. Increasing the substrate from 22 to 195 bp resulted in an apparent 10-fold increase in the binding affinity. Similar trends have previously been observed for other proteins including the lac repressor (28) and the EcoRV endonuclease (29) . In addition, Taylor et al. (25) have reported that the effective equilibrium constant for EcoRV binding to its recognition sequence (calculated from preferential cleavage assays) ranged from 5 × 10^7 (55-mer) to 2.5 × 10^8 (381-mer) to 1 × 10^9 M^-1 for a 3.9-kilobase fragment. The dependence of the enzyme affinity (determined as K(m) for several endonucleases) on the length of the DNA substrate is frequently interpreted in terms of long range effects such as facilitated diffusion. It should be noted, however, that although facilitated diffusion has been well documented for EcoRI (29, 30) there is no significant difference in the K(A) of EcoRI for a 34-bp and pBR322 substrate(20) . Furthermore, for SmaI the apparent dependence of K(A) on the length of the substrate may also reflect the conformation or conformational stability of the substrate: the 22-base duplex substrate is a GC-rich oligonucleotide. A decamer containing the CCCGGG recognition sequence has been shown to assume an A-form conformation under crystallographic conditions(32) . The substrate DNA may therefore be subject to local DNA distortions. Furthermore, binding of SmaI appears to bend the DNA toward the major groove(19) . It is possible that the intrinsic sequence-specific conformation of the substrate or the SmaI-induced DNA conformational changes may not be stable under the conditions (of low cation concentrations) used for the binding assays. The longer substrate may help stabilize such conformations.
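The length dependence can be made concrete with a short Python sketch that expresses each association constant relative to that of the shortest substrate. The SmaI values are those reported in this paper for the 22-base and 195-bp substrates, and the EcoRV values are those quoted above from Taylor et al. (25). The helper name is illustrative, and treating the 3.9-kilobase fragment as 3900 bp is an assumption made only for labeling.

```python
# Association constants (M^-1) keyed by substrate length in bp.
SMAI = {22: 0.9e8, 195: 1.2e9}
ECORV = {55: 5e7, 381: 2.5e8, 3900: 1e9}  # 3.9 kb taken as 3900 bp (assumption)


def fold_increase(constants: dict[int, float]) -> dict[int, float]:
    """Return each K(A) relative to the K(A) of the shortest substrate."""
    reference = constants[min(constants)]
    return {length: k_a / reference for length, k_a in sorted(constants.items())}


for name, data in (("SmaI", SMAI), ("EcoRV", ECORV)):
    for length, fold in fold_increase(data).items():
        print(f"{name:5s} {length:5d} bp  {fold:5.1f}-fold")
```

On these numbers the ratio for SmaI is roughly 13, in line with the order-of-magnitude increase described above, and the EcoRV constants rise 5-fold and 20-fold over the same kind of comparison.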

It generally has been noted that K(m) and K(D) for an enzyme are not necessarily equivalent (33) . There is a considerable (greater than 100-fold) difference in the value of the affinity constant estimated for EcoRI from the kinetic and thermodynamic assays(31, 34) . The discrepancy has been attributed to differences between the dissociation rate constant and the cleavage rate constant(34, 35) . A much smaller, 3-5-fold, difference is apparent between the K(D) and K(m) reported for RsrI using pBR322 as a substrate(36) . The K(m) (17.5 nM) for SmaI for the 22-base oligonucleotide is lower than that reported for several other endonucleases. However, many of the previously described kinetic assays have utilized very short (8-12 bp) oligonucleotide substrates (37, 38, 39) which may not be optimal for endonuclease binding(35) . A K(m) near 30 nM reported for the PaeR7 endonuclease with a 30-nucleotide substrate (40) is comparable to that obtained for SmaI. Furthermore, the K(m) for SmaI is not significantly different from the K(D) (calculated from the inverse of the equilibrium association constant). A mechanism in which strand scission is the rate-limiting step in the SmaI cleavage reaction would be one interpretation of the similarity between the kinetic and thermodynamic constants. Alternatively, magnesium, which is present only in the kinetic assays, may increase the DNA affinity of SmaI so that the value of K(m) more closely approaches the K(D). Magnesium, in addition to conferring substrate specificity, has been reported to increase the DNA affinity of EcoRV(29) , and it has been suggested that PaeR7 fails to bind DNA in the absence of magnesium (40) . For EcoRI it has been shown that magnesium does not influence the equilibrium association constant(27) . There appears, then, to be a variable role for magnesium in the activity of the type II endonucleases. The isolation of catalytically defective mutants that retain the ability to bind to DNA will be useful for examining the role of magnesium in the sequence-specific binding of the SmaI endonuclease.
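The comparison between the kinetic and thermodynamic constants is a one-line conversion, sketched here in Python. The two input values are those reported in this paper for the 22-base oligonucleotide; the function and variable names are illustrative only.

```python
# Values reported for the 22-base oligonucleotide substrate.
K_A_22MER = 0.9e8    # equilibrium association constant, M^-1
K_M_22MER = 17.5e-9  # Michaelis constant from the cleavage assays, M


def dissociation_constant(k_a: float) -> float:
    """Return K(D) in molar units as the inverse of K(A) given in M^-1."""
    if k_a <= 0:
        raise ValueError("association constant must be positive")
    return 1.0 / k_a


k_d = dissociation_constant(K_A_22MER)
ratio = K_M_22MER / k_d

print(f"K(D) = {k_d * 1e9:.1f} nM")
print(f"K(m) = {K_M_22MER * 1e9:.1f} nM")
print(f"K(m)/K(D) = {ratio:.1f}")
```

The inverse of 0.9 × 10^8 M^-1 is about 11 nM, so K(m) exceeds K(D) by less than 2-fold, far smaller than the greater than 100-fold discrepancy noted for EcoRI.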

Footprinting analyses of the SmaI·DNA complexes suggest a direct readout of each of the bases within the recognition site by the endonuclease. DMS interference analyses indicated that the protein contacted each of the guanines of the recognition site within the major groove of the DNA. This conclusion was supported by the missing base analyses and implies specific hydrogen bond interactions between the protein and the donor and/or acceptor groups of the purines. Missing base analyses also implicated interactions between the protein and each of the cytosines within the recognition sequence. Since both the DMS interference assays and the missing base analyses implicated each of the guanine bases, the selective removal of the cytosines in the latter assays may result in subtle changes in the conformation of the guanines and, therefore, indirectly influence binding of the enzyme. Nonetheless, N-4 methylation of the second cytosine of the recognition sequence by the cognate methylase inhibits binding of the enzyme. C-5 methylation of the external cytosine also markedly reduces the K(m) of the enzyme,^2 suggesting a direct role for the cytosines in the sequence-specific recognition by SmaI. Protein-DNA contacts at each of the base pairs within the recognition site also characterize the EcoRI (9) and PvuII (16) protein·DNA complexes.

Ethylation interference assays revealed that SmaI interacts with the phosphates of three adjacent bases on each strand of the recognition hexanucleotide. Lesser and colleagues (35) examined the protein-phosphate interactions in the EcoRI·DNA complex and have similarly determined that only six symmetry-related phosphates have a crucial role in recognition, although the pattern of phosphate interactions is quite distinct for SmaI and EcoRI. SmaI also exhibited "half-site" recognition of the phosphates by interacting only with those phosphates 5' of the guanosines. The proposed phosphate contacts for SmaI therefore differ from the other characterized blunt-end cutters, EcoRV (10) and PvuII(16) , both of which exhibit an extensive network of phosphate interactions both within and flanking the recognition sequence.
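The half-site pattern can be enumerated with a small Python sketch. It takes the phrase "phosphates 5' of the guanosines" literally, naming each phosphodiester by the two residues it links (1-based positions within CCCGGG), and uses the known SmaI cleavage position CCC^GGG. The function name and the labeling scheme are illustrative assumptions, not the numbering used in Fig. 7.

```python
SITE = "CCCGGG"
SMAI_CUT = 3  # SmaI cleaves between residues 3 and 4 (CCC^GGG)

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def phosphates_5prime_of_guanosines(site: str) -> list[tuple[int, int]]:
    """Return (i, i + 1) pairs, 1-based, for each phosphodiester whose
    3' residue (position i + 1) is a guanosine."""
    return [(i, i + 1) for i in range(1, len(site)) if site[i] == "G"]


# The site is palindromic, so both strands read identically 5' to 3'.
other_strand = SITE.translate(COMPLEMENT)[::-1]
assert other_strand == SITE

scissile = (SMAI_CUT, SMAI_CUT + 1)
for i, j in phosphates_5prime_of_guanosines(SITE):
    label = f"{SITE[i - 1]}{i}p{SITE[j - 1]}{j}"
    if (i, j) == scissile:
        note = "scissile bond"
    elif i == scissile[1]:
        note = "3' of the scissile bond"
    else:
        note = ""
    print(label, note)
```

Under this reading there are three candidate phosphates per strand and six in the duplex, and the set includes the phosphate immediately 3' of the scissile bond.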

Identification of the potential protein-phosphate contacts is important not only for the analysis of sequence-specific recognition but also for the potential mechanism of catalysis. Substrate-assisted catalysis has recently been suggested for the EcoRI and EcoRV endonucleases based on the structural similarity of the PD(X) EXK catalytic motif(41, 42) . It has been proposed that the attacking water molecule in the cleavage reaction is activated by the phosphoryl oxygen of the phosphate group on the 3'-side of the scissile bond. Recent studies of EcoRI and EcoRV cleavage of substrates containing phosphate substitutions 3' of the scissile bond are consistent with the proposal of substrate-assisted catalysis(43) . In neither of the protein·DNA complexes does the phosphate make a contact required for specific binding(43) . Although the active site structure of BamHI is very similar to that of EcoRI and EcoRV, the sequence of the catalytic motif is not well conserved(15, 44) . Consequently, it has been suggested that BamHI may utilize an alternative mechanism for the activation of a water molecule for nucleophilic attack during catalysis(15) . SmaI similarly lacks the consensus PD(X) EXK sequence motif(45) , and a protein contact with the 3'-phosphate is inferred from the ethylation interference studies. SmaI may, therefore, resemble BamHI in a reaction mechanism that differs from that proposed for EcoRI and EcoRV. The requirement for KCl by SmaI also suggests some differences in the reaction mechanism.

Potential mechanisms of sequence discrimination and catalysis by the type II endonucleases have begun to emerge from recent biochemical and structural analyses of these proteins. The architecture of the active site appears to be conserved although the functional amino acids may differ(15, 16) . Furthermore, the similarity in the overall structure of the EcoRI and BamHI endonucleases has prompted the suggestion of a relationship between the position of cleavage within the recognition site and the structure of the enzyme(15) . Anderson (6) has suggested a correlation between the position of the scissile bond (i.e. within the major or minor groove of the DNA) and the orientation of the DNA-binding domain. The SmaI and EcoRV endonucleases are similar in that they each cleave within a 6-bp recognition sequence to produce a blunt-end scission. However, they appear to differ significantly in the interaction with their specific sequences: SmaI belongs to the class of enzymes designated by Zebala et al.(14) as SEL (Specificity Early and Late) whereas EcoRV is the prototype of the SLO class, in which specificity occurs predominantly at the cleavage reaction(25, 29) . SmaI induces bending of the DNA, and although the direction of the bend is similar to that of EcoRV the bend angle is significantly smaller(19) . Consequently, whereas the extensive EcoRV-induced DNA conformational changes preclude the formation of hydrogen bonds at the central base pairs of the binding site(10) , SmaI appears to interact with each of the base pairs within the recognition sequence. Furthermore, it appears that the amino acids within the active sites of SmaI and EcoRV differ, and SmaI may not utilize substrate-assisted catalysis. The only other blunt-end cutter that has been examined to date is PvuII(16) . Although the role of magnesium in the specificity of PvuII has not yet been determined, the enzyme displays certain similarities with SmaI, including (i) interaction with each of the base pairs within the recognition site, (ii) a potential protein contact to the phosphate 3' to the scissile bond, (iii) the absence of significant DNA bending, and (iv) active site residues that differ from the consensus sequence motif. It will be of interest to determine whether there is any conservation of the structures of the PvuII and SmaI endonucleases.

In contrast to the general lack of sequence similarities between the type II restriction endonucleases, current studies are beginning to reveal some common themes in their mechanism of interaction with their DNA substrate. Determination of the structure and interactions of more endonucleases will provide insight into the extent of the diversity of mechanisms by which these enzymes achieve their binding and catalytic specificity.


FOOTNOTES

*
This work was supported by National Science Foundation Grant MCB9004611 (to J. D.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

§
Supported in part by a Rumble Graduate Fellowship from Wayne State University. Present address: Dept. of Cancer Research, Parke-Davis Pharmaceutical, Warner-Lambert Company, 2800 Plymouth Rd., Ann Arbor, MI 48105.

¶
To whom correspondence should be addressed: Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, 3126 Scott Hall, Detroit, MI 48201. Tel.: 313-577-5545; Fax: 313-577-5218; E-mail: jdunbar@cmb.biosci.wayne.edu.

(^1)
The abbreviations used are: bp, base pair(s); DMS, dimethyl sulfate.

(^2)
J. C. Dunbar, unpublished results.


REFERENCES

  1. Steitz, T. A. (1990) Quart. Rev. Biophys. 23, 205-280
  2. Von Hippel, P. (1993) Science 263, 769-770
  3. Burley, S. K. (1994) Curr. Opin. Struct. Biol. 4, 3-11
  4. Raumann, B. E., Brown, B. M., and Sauer, R. T. (1994) Curr. Opin. Struct. Biol. 4, 36-43
  5. Aiken, C. R., Fisher, E. W., and Gumport, R. I. (1991) J. Biol. Chem. 266, 19063-19069
  6. Anderson, J. (1993) Curr. Opin. Struct. Biol. 3, 24-30
  7. McLaughlin, L. W., Benseler, F., Graeser, E., Piel, N., and Scholtissek, S. (1987) Biochemistry 26, 7238-7245
  8. Heitman, J. (1992) Bioessays 14, 445-454
  9. Rosenberg, J. M. (1991) Curr. Opin. Struct. Biol. 1, 104-113
  10. Winkler, F., Banner, D. W., Oefner, C., Tsernoglou, D., Brown, R. S., Heathman, S. P., Bryan, R. K., Martin, P. D., Petraros, K. K., and Wilson, K. S. (1993) EMBO J. 12, 1781-1795
  11. Thielking, V., Selent, U., Kohler, E., Wolfes, H., Pieper, U., Geiger, R., Urbanke, C., Winkler, F., and Pingoud, A. (1991) Biochemistry 30, 6416-6422
  12. Vermote, C. L., and Halford, S. E. (1992) Biochemistry 31, 6082-6089
  13. Wilson, G. G., and Murray, N. E. (1991) Annu. Rev. Genet. 25, 585-627
  14. Zebala, J. A., Choi, J., Trainor, G. L., and Barany, F. (1992) J. Biol. Chem. 267, 8106-8116
  15. Newman, M., Strzelecka, T., Dorner, L., Schildkraut, I., and Aggarwal, A. (1994) Nature 368, 660-664
  16. Cheng, X., Balendiran, K., Schildkraut, I., and Anderson, J. E. (1994) EMBO J. 13, 3927-3935
  17. Athanasiadis, A., Vlassi, M., Kotsifaki, D., Tucker, P. A., Wilson, K. S., and Kokkinidis, M. (1994) Nature Struct. Biol. 1, 469-475
  18. Segel, I. H. (1975) Enzyme Kinetics, Wiley, New York
  19. Withers, B. E., and Dunbar, J. C. (1993) Nucleic Acids Res. 21, 2571-2577
  20. Terry, B., Jack, W., Rubin, R., and Modrich, P. (1983) J. Biol. Chem. 258, 9820-9825
  21. Siebenlist, U., and Gilbert, W. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 121-126
  22. Brunelle, A., and Schlief, R. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 6673-6676
  23. Griep, M. A., and McHenry, C. S. (1989) J. Biol. Chem. 264, 11294-11301
  24. Ha, J-H., Capp, M. W., Hohenwalter, M. D., Baskerville, M., and Record, T. (1992) J. Mol. Biol. 228, 252-264
  25. Taylor, J. D., Badcoe, I. G., Clarke, A. R., and Halford, S. E. (1991) Biochemistry 30, 8743-8753
  26. Sakonju, S., and Brown, D. D. (1982) Cell 31, 395-405
  27. Jen-Jacobsen, L., Lesser, D., and Kurpiewski, D. (1986) Cell 45, 619-629
  28. Winter, R. B., and von Hippel, P. H. (1981) Biochemistry 20, 6948-6960
  29. Thielking, V., Selent, U., Kohler, E., Landgraf, A., Wolfes, H., Alves, J., and Pingoud, A. (1992) Biochemistry 15, 3727-3732
  30. Terry, B. J., Jack, W. E., and Modrich, P. (1985) J. Biol. Chem. 260, 13130-13137
  31. Jack, W. E., Terry, B. J., and Modrich, P. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 4010-4014
  32. Harran, T. E., Shakked, Z., Wang, A. H-J., and Rich, A. (1987) J. Biomol. Struct. Dyn. 5, 199-217
  33. Fersht, A. (1985) Enzyme Structure and Mechanism, 2nd Ed., Freeman, New York
  34. Modrich, P., and Zabel, D. (1976) J. Biol. Chem. 251, 5866-5874
  35. Lesser, D. R., Kurpiewski, M. R., and Jen-Jacobsen, L. (1990) Science 250, 776-786
  36. Aiken, C. R., McLaughlin, L. W., and Gumport, R. I. (1991) J. Biol. Chem. 266, 19070-19078
  37. Brennan, C. A., Van Cleve, M. D., and Gumport, R. I. (1986) J. Biol. Chem. 261, 7270-7278
  38. Newman, P. C., Williams, D. M., Cosstick, R., Seela, F., and Connolly, B. (1990) Biochemistry 29, 9902-9910
  39. Waters, T., and Connolly, B. (1994) Biochemistry 33, 1812-1819
  40. Ghosh, S. S., Obermiller, P. S., Kwoh, T. J., and Gingeras, T. R. (1990) Nucleic Acids Res. 18, 5063-5068
  41. Selent, U., Ruter, T., Kohler, E., Liedtke, M., Thielking, V., Alves, J., Oelheschlager, T., Wolfes, H., Peters, F., and Pingoud, A. (1992) Biochemistry 31, 4804-4815
  42. Jeltsh, A., Alves, J., Maass, G., and Pingoud, A. (1992) FEBS Lett. 304, 4-8
  43. Jeltsh, A., Alves, J., Wolfes, H., Maass, G., and Pingoud, A. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 8499-8503
  44. Brooks, J. E., Nathan, P. D., Landry, D., Waite-Rees, P., Ives, C., Moran, L. S., Slatko, B., and Benner, J. S. (1991) Nucleic Acids Res. 19, 841-850
  45. Heidmann, S., Seifert, W., Kessler, C., and Domdey, H. (1989) Nucleic Acids Res. 23, 9783-9796
