(Received for publication, September 25, 1995; and in revised form, January 3, 1996)
Multicopy clones of Escherichia coli cytosine
methyltransferases Dcm and EcoRII methylase (M. EcoRII) cause a 50-fold increase in C → T mutations at
their canonical site of methylation, 5′-CmeCAGG (meC is
5-methylcytosine). These plasmids also cause transition mutations at
the second cytosine in the sequence CCGGG at a 10-fold lower
frequency. Similarly, M. HpaII was found to cause a
significant increase in C → T mutations at a CCAG site, in
addition to causing mutations at its canonical site of methylation,
CCGG. Using a plasmid that substantially overproduces M. EcoRII, in vivo methylation at CCSGG (S is C or G)
and other non-canonical sites could be detected using a gel
electrophoretic assay. There is a direct correlation between the level
of M. EcoRII activity in cells, the extent of methylation at
non-canonical sites and frequency of mutations at these same sites.
Overproduction of M. EcoRII in cells also causes degradation
of DNA and induction of the SOS response. In vitro, M. EcoRII methylates an oligonucleotide duplex containing a CCGGG
site at a slow rate, suggesting that overproduction of the enzyme is
essential for significant amounts of such methylation to occur.
Together these results show that cytosine methyltransferases
occasionally methylate cellular DNA at non-canonical sites and suggest
that in E. coli, methylation-specific restriction systems and
sequence specificity of the DNA mismatch correction systems may have
evolved to accommodate this fact. These results also suggest that
mutational effects of cytosine methyltransferases may be much broader
than previously imagined.
There is good correlation between the presence of methylation at
position 5 of cytosines in DNA and transition mutations. Several years
ago Coulondre et al.(1) showed that two cytosine
methylation sites in the lacI gene of Escherichia coli were hot spots for spontaneous C → T mutations. These
cytosines mutated to thymine at frequencies many times higher than the
frequencies of mutations at any other base in the gene (1).
More recently, a cytosine methylation site in the cI gene of a λ phage
lysogen (2) was also shown to be a hot spot for
spontaneous C → T mutations. In vertebrates, methylation of
cytosines predominantly occurs within CpG dinucleotides, and cataloging
of sequence changes that cause human genetic diseases has revealed that
a disproportionately high fraction of these involve transition
mutations at CpG sites(3) . There is also a striking
correlation between some types of cancers and mutations at CpG
dinucleotides in the tumor suppressor gene p53 (reviewed in (4, 5, 6) ). In addition, 53% of all the
germline mutations found in Li-Fraumeni syndrome are C:G to T:A
mutations within CpGs(5) .
Spontaneous hydrolytic
deamination of 5-methylcytosine (5-meC) in DNA to thymine (7, 8) has traditionally been proposed (1) as
the explanation of this phenomenon. Recently several alternate
hypotheses for the occurrence of such mutational hot spots have been
proposed and studied. These include error-prone copying of 5-meC by DNA
polymerases (9), cytosine methyltransferase (C5 MTase)-mediated
C → U (10) and 5-meC → T (11) conversions,
excision of 5-meC in DNA followed by error-prone repair (12),
and inhibition of mismatch correction systems by C5
MTases (13, 14).
To help choose between these
alternative hypotheses, we have developed two genetic systems in E. coli that can quantitate C → T mutations at sites of cytosine
methylation. The assay involves scoring of kanamycin-resistant
(KanR) revertants from kanamycin-sensitive (KanS)
alleles in which Leu (TTG) at codon 94 of the kan gene was
replaced with Pro (CCG or CCA). Replacement of the second C
in the codon with T restores a different Leu codon and is scored as
KanR (Fig. 1 and Refs. 15 and 16). The two KanS
alleles will be referred to as kanS-H94 (codon 94; CCG)
and kanS-D94 (CCA), respectively. While the former system
detects C → T mutations within CpG sequences, the latter detects
mutations within the sequence context of cytosine methylation in E. coli. EcoRII methyltransferase (M. EcoRII) is part of a
plasmid-borne restriction-modification system found in a clinical E. coli isolate (17, 18). M. EcoRII
methylates position 5 of the second cytosine within the sequence
5′-CCWGG-3′ (W is A or T; Refs. 19 and 20). The chromosome of E. coli K-12 also codes for a C5 MTase called Dcm (21), and
Dcm and M. EcoRII have identical methylation
specificities (22, 23). As a result, both Dcm and M. EcoRII methylate within codon 94 of kanS-D94 (Fig. 1).
Figure 1: Wild-type, mutant, and revertant kan sequences. DNA sequences surrounding codons 70 and 94 are shown. For codon 94, the wild-type sequence is shown in the middle, and the two mutant alleles containing CCA and CCG sequences and their revertants are, respectively, shown above and below the wild-type sequence. The Dcm/EcoRII site and HpaII sites in the sequences are underlined. In the case of codon 70, the pentanucleotide sequence within which Dcm is expected to methylate is underlined.
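The logic of the reversion assay can be checked with a short sketch. It assumes only the standard genetic code for the codons named in the text; the function names are illustrative and not part of the original work. It shows that a C → T change at the second position of either mutant codon restores a Leu codon, whereas the same change at the first position does not.

```python
# Minimal codon table covering only the codons relevant to codon 94
# (standard genetic code; not a complete table).
CODON_TO_AA = {
    "TTG": "Leu", "CTG": "Leu", "CTA": "Leu",
    "CCG": "Pro", "CCA": "Pro",
    "TCG": "Ser", "TCA": "Ser",
}


def c_to_t(codon: str, position: int) -> str:
    """Return the codon after a C -> T transition at a 0-based position."""
    if codon[position] != "C":
        raise ValueError(f"no cytosine at position {position} of {codon}")
    return codon[:position] + "T" + codon[position + 1:]


def reversion_outcomes(codon: str) -> dict:
    """Map each cytosine position in the codon to the amino acid encoded
    after a C -> T transition at that position."""
    return {
        pos: CODON_TO_AA[c_to_t(codon, pos)]
        for pos, base in enumerate(codon)
        if base == "C"
    }


# Allele names and codons are taken from the text.
for allele, codon in (("kanS-H94", "CCG"), ("kanS-D94", "CCA")):
    print(allele, codon, reversion_outcomes(codon))
```

Position 1 (the second base) is the cytosine that is methylated by the cognate MTase in each allele, so only mutations at the methylation target are scored as revertants in this scheme.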
Presence of a cognate MTase in cells
containing one of the kan alleles results in a
40-100-fold increase in the KanS to KanR reversion frequency (15, 16). These studies
provide direct evidence that an action of C5 MTases, presumably
methylation itself, is the cause of mutational hot spots at sites of
cytosine methylation. While Dcm or M. EcoRII were used as the
cognate MTases with kanS-D94 in these studies, the
methyltransferase in the HpaII restriction-modification system
(M. HpaII) was used with kanS-H94. M. HpaII
methylates the second cytosine within the sequence CCGG (24).
We have used restriction mapping and DNA sequencing to confirm that the
KanR revertants obtained during these studies contain the
expected C → T mutation at codon 94 (15, 16).
This shows that the occurrence of second-site revertants is low in the
system and that the system is well suited to study the effect of
methylation within specific sequences on C → T mutations.
An
unexpected finding of these experiments was that the presence of
non-cognate methylases in the cells also increased the KanR reversion frequency, although to a lesser extent. We describe
this phenomenon below, identify its cause, and explore its implications
for the structure and biology of C5 MTases.
Duplexes I and II were
methylated with M. EcoRII purified to apparent homogeneity
(specific activity 9.8 pmol of methyl groups/min/µg of protein)
using S-[methyl-3H]adenosyl-L-methionine
(DuPont NEN) (0.078 µM, 85 Ci/mmol) as the methyl donor in
methylase buffer (100 mM Tris-HCl, pH 7.8, 20 mM EDTA, pH 8.0, 0.4 mM dithiothreitol). The reaction volume
was 50 µl, and the reactions were carried out at 37 °C and
terminated at various times by the addition of 2 µl of 10% SDS. The
samples were purified by extraction with phenol-chloroform and passed
through Sephadex G-50 (Pharmacia Biotech Inc.) spin columns to remove
unincorporated radioactive label. The incorporated radioactivity was
quantitated by scintillation counting. When duplex I was used, it was
at concentrations between 10 and 200 nM and the enzyme was at
2.1 nM. When duplex II was used, it was at concentrations
between 1.0 and 20.0 µM and the enzyme was at 0.454
µM. Steady-state kinetics of methyl transfer was analyzed
using the Statview Student package for the Macintosh, and the kinetic
constants were calculated.
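As an illustration of how kinetic constants can be obtained from initial-rate data of this kind, the following minimal sketch fits Km and Vmax by the Hanes-Woolf linearization, [S]/v = [S]/Vmax + Km/Vmax. This is an assumed, generic approach and not necessarily the procedure used in the Statview analysis; the data are synthetic and noise-free, and the "true" constants are arbitrary values chosen only to check the fit.

```python
def hanes_woolf_fit(substrate, rate):
    """Estimate (Km, Vmax) by least squares on the Hanes-Woolf plot.

    [S]/v is regressed on [S]; the slope is 1/Vmax and the
    intercept is Km/Vmax.
    """
    x = list(substrate)
    y = [s / v for s, v in zip(substrate, rate)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


# Synthetic data: the concentrations span the 10-200 nM range quoted for
# duplex I; true_km and true_vmax are hypothetical, not measured values.
true_km, true_vmax = 50.0, 2.0
conc_nM = [10.0, 25.0, 50.0, 100.0, 200.0]
rates = [true_vmax * s / (true_km + s) for s in conc_nM]

km, vmax = hanes_woolf_fit(conc_nM, rates)
print(f"Km = {km:.3f} nM, Vmax = {vmax:.3f} (arbitrary units)")
```

With real, noisy measurements a direct nonlinear fit to the Michaelis-Menten equation is generally preferred, since linearizations weight the data points unevenly.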
When the gene for M. EcoRII (ecoRIIm) was cloned into
pBR322 and introduced into the test strains, it also caused an increase
in KanR reversion frequency with the kanS-H94 allele (Table 2). Once again, the level of increase in the
reversion frequency was substantially lower than that with the kanS-D94 allele. Together, these data suggest that when dcm
or ecoRIIm
genes are cloned into medium copy number plasmids, there is an
increase in the frequency of C → T mutations at the non-canonical
site CCGGG by a factor of up to 4.
We were concerned that the
observed increases in KanR reversion frequency caused by
these MTases may be due to mutations at a site other than the CCWGG
site at codon 94. To eliminate this possibility, plasmid DNA was
extracted from 10 revertants obtained in experiments involving Dcm and kanS-H94. These DNA preparations typically contained three
plasmids: the plasmid carrying the MTase gene, the plasmid with the
original kanS-H94 allele, and the plasmid with the revertant.
The latter plasmid was separated from the other plasmids by
retransformation into a new host and selecting for the KanR
phenotype. DNA was isolated from the transformants and analyzed
by restriction digests. When codon 94 is CCG, a C → T mutation at
the second position in this codon (but not at the first position)
eliminates a SmaI site (CCCGGG) and creates a BstNI
site (CCTGG, Fig. 1). Fig. 2 shows the restriction
pattern for two such revertants. As expected, both the plasmids had
lost the SmaI site (lanes 6 and 7) and
gained a BstNI site (lanes 9 and 10).
Furthermore, the sizes of the newly created BstNI fragments in
the revertants were consistent with the sizes expected if a new BstNI site were to be created at codon 94 (not shown). The
remaining eight revertants also showed a similar pattern of restriction
sites. These results confirm that Dcm causes the expected sequence
change at codon 94 of kanS-H94.
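The restriction-site diagnostic described above can be illustrated with a short sketch. It assumes that codon 94 (CCG) occupies the second to fourth bases of the SmaI hexamer CCCGGG, a placement inferred from the statements that the second cytosine of CCGGG is the mutated base and that the mutation yields CCTGG; the flanking sequence is omitted and the helper names are illustrative.

```python
import re

# IUPAC codes needed for the recognition sequences used here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]"}

SMAI = "CCCGGG"   # SmaI recognition sequence
BSTNI = "CCWGG"   # BstNI recognition sequence (W is A or T)


def has_site(seq: str, site: str) -> bool:
    """True if the recognition sequence occurs anywhere in seq."""
    pattern = "".join(IUPAC[base] for base in site)
    return re.search(pattern, seq) is not None


def c_to_t_at(seq: str, index: int) -> str:
    """Return seq with a C -> T transition at a 0-based index."""
    if seq[index] != "C":
        raise ValueError(f"no cytosine at index {index}")
    return seq[:index] + "T" + seq[index + 1:]


context = "CCCGGG"   # SmaI site containing codon 94 of kanS-H94
codon_start = 1      # assumed offset of codon 94 (CCG) within the hexamer

for label, codon_pos in (("first", 0), ("second", 1)):
    mutant = c_to_t_at(context, codon_start + codon_pos)
    print(
        f"C->T at {label} codon position: {mutant} "
        f"SmaI={has_site(mutant, SMAI)} BstNI={has_site(mutant, BSTNI)}"
    )
```

In this simple string check, a change at either position destroys the SmaI hexamer, but only the change at the second codon position also creates a BstNI site, which is why the combined loss of SmaI and gain of BstNI identifies the expected revertant.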
Figure 2:
Restriction mapping of revertants.
pKanS-H94 DNA and DNAs of two independent revertants were digested with
different restriction enzymes, the products were electrophoresed on a
0.7% agarose gel and stained with ethidium bromide. Lanes 2, 5, and 8 contain pKanS-H94 DNA. Lanes 3, 6, and 9 contain DNA from one of the revertants, and lanes 4, 7, and 10 contain DNA from the
second revertant. Lane 1, φX174 HaeIII digest; lanes 2-4, uncut DNA; lanes 5-7, SmaI digest; lanes 8-10, BstNI digest; lane 11, bacteriophage
λ BstEII digest. Reversion
causes disappearance of SmaI fragment marked A (lane 5) and a smaller fragment (not seen), and
appearance of a new fragment (marked B in lanes 6 and 7). The same mutation creates a BstNI site at codon
94, causing the disappearance of fragment C (lane 8)
and appearance of two new fragments (D and E in lanes 9 and 10).
It seemed conceivable that
somehow the presence of a DNA methyltransferase (MTase) in the cells or
the mere interruption of the tetracycline resistance (Tet)
gene in the vector causes a small increase in general mutation
frequency in the host and that this phenomenon was responsible for the
observed increase in KanR reversion frequency. To eliminate
this possibility, EcoRI MTase was cloned into pACYC184 and the
plasmid was introduced into the test strains. M. EcoRI is an
MTase that methylates the second adenine in the sequence
GAATTC (20). When the KanR reversion frequency in
these strains was compared with that in strains with pACYC184, no
increase in the frequency of mutations was observed with either of the two
test plasmids (Table 3). Therefore, the apparent increase in
reversion frequency due to the C5 MTases at non-canonical sites is not
the result of some nonspecific effect of the presence of an MTase in
the host on a multicopy plasmid or of the interruption of the
tetracycline resistance gene in pACYC184.
M. HpaII caused a 4-fold increase in
KanR reversion frequency with the kanS-D94 allele (Table 4). Once again, although this increase was substantially
less than that seen at the canonical site, it was quite reproducible.
In different experiments the enhancement in mutagenesis caused by M. HpaII at the non-canonical site has been found to vary between
about 3- and 4-fold above background (not shown). Plasmid DNA was
isolated from nine such revertants, and restriction analysis of the DNA
was performed. In this case, the previously existing BstNI
site (CCAGG) at codon 94 of kanS-D94 was found to be lost in 8
out of the 9 revertants (not shown). The remaining revertant had
suffered DNA rearrangements and was not studied further. These results
show that in nearly every case, the increase in the KanR
reversion frequency caused by M. HpaII was the result of a
C → T change at codon 94 and was not due to second-site
revertants. Based on these results, we conclude that M. HpaII
is also capable of causing mutations at a non-canonical sequence.
We tested the ability of M. EcoRII to transfer methyl groups in vitro to a 27-bp DNA duplex containing a CCGGG sequence (Duplex I) and compared the kinetic parameters for this reaction with those for methyl transfer to a duplex containing a CCAGG sequence (Duplex II). The DNAs used in these experiments contained the sequences surrounding codon 94 of kanS-D94 or kanS-H94. For this reason, they were considered to be good models for understanding methylation at codon 94 of kan by M. EcoRII in vivo.
The enzyme methylated the
non-canonical DNA sequence at a low rate (Table 6).
Interestingly, the principal difference in the interaction of the
enzyme with the two substrates was reflected in differences in kcat (Table 6). While the kcat
for methyl transfer to the CCGGG-containing substrate was lower
than that for the canonical substrate by a factor of
2.0 × 10^[…], Km
for the non-canonical substrate
was higher by only a factor of
27 (Table 6). If the Km
values for the two substrates are taken to
reflect Kd
values, M. EcoRII can be said
to discriminate between the two substrates more at the level of
catalysis than at the level of DNA binding.
Although the rate of
methylation of the non-canonical duplex by M. EcoRII is poor,
methylation does take place at the expected site. We demonstrated this
by methylating 32P-labeled Duplex I with excess M. EcoRII and challenging the DNA with HpaII
endonuclease. The digested DNA was separated from resistant DNA by gel
electrophoresis, and the extent of protection against HpaII
was quantitated. While HpaII digested 93-96% of the
untreated DNA, it consistently cut the M. EcoRII-treated DNA
less well. Analysis of the gel using a PhosphorImager revealed that
7.0% (S.D.= ±2.4%; n = 3) more of the
total DNA was resistant to HpaII as a result of reaction with
M. EcoRII, than without it. Because HpaII is
inhibited by methylation of either cytosine in its recognition
sequence(34, 35) , we can only conclude that M. EcoRII must have methylated one of the cytosines in the
sequence CCGGG to render it resistant to HpaII.
To improve the chances of detecting methylation at non-canonical sites by the MTases, two changes were made in the procedure. First, the gene for M. EcoRII was cloned into a high copy number plasmid, pUC118. Cells containing the resulting plasmid, pR400, were found to contain approximately 50 times as much methyltransferase activity as those containing pR300 (Table 7). Because pBR322-based plasmids have 30-50 copies/cell, while pUC-based plasmids have several hundred copies per cell, the higher level of expression of M. EcoRII may be the result of a gene dosage effect. We reasoned that the overproduction of the MTase should result in greater methylation at non-canonical sites and increase the likelihood of its detection by ethidium bromide staining. This plasmid was introduced into cells containing kanS-H94 in pACYC184 (pKanS-H94/ACYC), and the ability of pR400 to methylate at codon 94 of kan was studied. Second, SmaI was used instead of HpaII to detect methylation at codon 94. Codon 94 is within a SmaI site (Fig. 1), and C-5 methylation of the innermost cytosine in the SmaI recognition sequence, CCCGGG, inhibits this enzyme (35). Furthermore, as pR400 contains no SmaI sites (Fig. 3, lane 4), the SmaI restriction pattern of plasmids from cells containing pR400 and pKanS-H94/ACYC consists of linear fragments from pKanS-H94/ACYC.
Figure 3:
Protection of pR400 against SmaI.
Plasmid DNAs were digested with SmaI and the products
separated by gel electrophoresis. Lane 1, uncut pKanS-H94; lane 2, pKanS-H94 cut with SmaI; lane 3,
uncut pR400; lane 4, pR400 cut with SmaI; lane
5, uncut pR400 + pKanS-H94; lane 6, SmaI-cut pR400 + pKanS-H94; lane 7,
bacteriophage λ BstEII markers.
When pKanS-H94/ACYC was isolated from a strain lacking pR400 and was
digested with SmaI, the DNA was cut to completion revealing
three bands on the agarose gel (Fig. 3, lane 2). In
contrast, when pKanS-H94/ACYC was isolated from a strain containing
pR400, the former plasmid was found to be partially protected against SmaI. In this case, a significant fraction of pKanS-H94/ACYC
DNA appeared to be uncut (Fig. 3, compare lanes 5 and 6). As a result, although 3 times as much DNA was loaded in lane 6 compared to lane 2 of the gel, bands
corresponding to complete SmaI digest were more intense in the
latter lane (Fig. 3). In addition, three partial digestion
products could be seen on the gel (Fig. 3, lane
6), two of which had sizes consistent with the sizes
of expected partial digestion products containing codon 94. These
results directly demonstrate that when M. EcoRII is
overproduced in cells, the second cytosine in codon 94 of kanS-H94 is methylated in some molecules.
The higher levels of M. EcoRII in cells containing pR400 also caused higher
frequencies of KanR reversion. When pR400 was introduced in
cells containing pKanS-H94, the KanR reversion frequency
increased by a factor of 18 (Table 2). This was more than
four times higher than the increase caused by pR300 at this site. This
suggests that the ability of M. EcoRII to cause mutations at
the CCGGG site may be directly related to the ability of the enzyme to
methylate this site.
Figure 4:
Protection of pR400 at CCSGG sites. Lane 1 contains bacteriophage λ BstEII markers. Lane 2 contains uncut pBR322 DNA. Lanes 3-11 contain different DNAs digested with different enzymes. Lanes
3, 6, and 9, pBR322; lanes 4, 7, and 10, pR300; lanes 5, 8, and 11, pR400. Lanes 3-5, EcoRII; lanes 6-8, NciI; lanes 9-11, ScrFI. The positions of partial digestion products in ScrFI digest are marked by brackets on the right side of lane 11.
As expected,
the ScrFI digestion pattern of pR400 was different than its NciI digestion pattern and contained a number of partial
digestion products (Fig. 4, compare lanes 8 and 11). It is clear from these data that M. EcoRII
produced by pR400 methylates several CCSGG sites in addition to its
methylation of CCWGG sites (Fig. 4, lane 5). We have
further shown that this result is not restricted to pR400, but is also
true of other overproducers of M. EcoRII. Plasmid DNAs from
other overproducers of M. EcoRII including one other pUC-based
overproducer and those based on overexpression of the gene from
P or P
promoters were partially protected at
CCSGG sites from ScrFI digestion (see below and data not
shown). In each case the DNA was completely sensitive to BstNI
and NciI, demonstrating that the lack of cutting by ScrFI was unlikely to be due to inhibition by contaminants in
the DNA.
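The reasoning behind the ScrFI/NciI comparison can be sketched as follows. ScrFI recognizes CCNGG, whereas NciI recognizes only the CCSGG subset, so once the canonical CCWGG sites are fully methylated any additional ScrFI resistance must arise at CCSGG sites. The sketch below partitions the CCNGG sites of a sequence into these two classes; the demonstration sequence is hypothetical and is not taken from pR400.

```python
import re


def classify_ccngg_sites(seq: str):
    """Return (index, site, class) for every CCNGG site in seq.

    CCNGG is its own reverse complement, so scanning one strand
    finds every duplex site. Overlapping sites are reported through
    the use of a lookahead.
    """
    sites = []
    for match in re.finditer(r"(?=(CC[ACGT]GG))", seq.upper()):
        site = match.group(1)
        # Central base A/T: canonical Dcm/M. EcoRII site (CCWGG).
        # Central base C/G: NciI-cleavable, non-canonical site (CCSGG).
        kind = "CCWGG" if site[2] in "AT" else "CCSGG"
        sites.append((match.start(), site, kind))
    return sites


# Hypothetical sequence used only to exercise the function.
demo = "TTCCAGGAACCGGGTTCCTGGAACCCGGTT"
for index, site, kind in classify_ccngg_sites(demo):
    print(index, site, kind)
```

Applied to a real plasmid sequence, the CCSGG entries would mark the positions at which partial ScrFI digestion products are expected to arise if M. EcoRII methylates non-canonical sites.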
We were interested in finding out whether sequences other than CCNGG were protected by M. EcoRII. In particular, it seemed possible that M. EcoRII may also recognize other ``four-out-of-five'' (4/5, for short) sites such as NCWGG and CCWGN in DNA and methylate the second base in the sequences. To test for such methylation, the sequence of pR400 was scanned and a PstI site (GCCTGCAGGT) in the pUC polylinker that overlaps with four 4/5 sites was identified. PstI is known to be inhibited by C-5 methylation within its recognition sequence(36, 37) . When pR400 DNA was digested with PstI, about 10% of its DNA was found to be resistant to PstI, confirming the partial methylation of this site (not shown).
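The search for 4/5 sites can be sketched as a scan for pentanucleotides that differ from CCWGG at exactly one position. The definition used below (exactly one mismatch, with each strand counted separately) is an assumption about how the sites were tallied; under that assumption the PstI region quoted above yields four 4/5 sites.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Allowed bases at each position of the canonical site CCWGG (W is A or T).
CANONICAL = ("C", "C", "AT", "G", "G")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def mismatches(window: str) -> int:
    """Count positions at which a 5-base window departs from CCWGG."""
    return sum(base not in allowed for base, allowed in zip(window, CANONICAL))


def four_of_five_sites(seq: str):
    """Return (index, window) for windows with exactly one mismatch."""
    seq = seq.upper()
    return [
        (i, seq[i:i + 5])
        for i in range(len(seq) - 4)
        if mismatches(seq[i:i + 5]) == 1
    ]


pst_region = "GCCTGCAGGT"  # PstI site and flanking bases, as given in the text
top = four_of_five_sites(pst_region)
bottom = four_of_five_sites(reverse_complement(pst_region))
print("top strand:", top)
print("bottom strand:", bottom)
print("total 4/5 sites:", len(top) + len(bottom))
```

Note that this definition also counts CCSGG as a 4/5 site, because the mismatch may fall at the central W position; none occurs in this particular region.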
Using pKanS-H94 it was not possible to determine whether M. EcoRII caused an increase in the rate of C → T mutations
at the other non-canonical sites. However, we noticed that Jones
and colleagues had fortuitously constructed a mutant of kan with a 4/5 site at codon 70 (Fig. 1 and (13)). If
this site were to be methylated by Dcm or M. EcoRII, the
mutant would be expected to revert at a higher frequency. To test this,
pDCM72 was introduced into cells containing this kan allele
(pKanS-D70) and the reversion frequency was determined. Presence of Dcm
in the cells more than doubled the reversion frequency (Table 1,
lines 5 and 6). A plasmid carrying the dcm gene
was used in these experiments instead of pR300 or pR400, because
the latter two plasmids are incompatible with the plasmid containing
the kan allele. These data further support our earlier
conclusion that C5 MTases can increase rates of C → T mutations at
sites related to their canonical sites.
Although M. EcoRII was found to methylate several different non-canonical sites, it does not methylate DNA indiscriminately. When pR400 was digested with Sau3AI (recognition sequence GATC), AluI (AGCT), HhaI (GCGC), or ApaLI (GTGCAC) and the products separated on agarose gels, the digests appeared to be complete (not shown). In addition a StuI site (AGGCCT) within ecoRIIm gene which overlaps two 4/5 sites appeared to be completely susceptible to StuI digestion by the gel electrophoretic assay. In contrast, HpaII or MspI (CCGG; inhibited by meCCGG) digests of pR400 contained some incomplete digestion products (not shown). Presumably, this is because the recognition sequences for the former group of enzymes are much less likely to overlap with 4/5 M. EcoRII sites than is CCGG.
Under
repressed conditions, there was enough M. EcoRII produced in
the cells to methylate the CCWGG sites making plasmid DNA isolated from
the cells resistant to EcoRII (not shown). ScrFI
digestion of the same DNA gave rise to a pattern identical to the
pattern generated by NciI (not shown) and contained no
indication of the presence of partial digestion products (Fig. 5, lane 3). When the P promoter was induced by the
addition of IPTG to the growth medium (final concentration 100
µM), M. EcoRII activity in the cells increased by
a factor of 8 (Table 7). When plasmid DNA isolated from
cells after 3.5 h of induction was subjected to digestion by ScrFI, a substantial fraction of the DNA appeared as partial
digestion products (Fig. 5, compare lanes 3 and 5). This is consistent with our earlier conclusion that the
overproduction of M. EcoRII leads to methylation of CCSGG
sites.
Figure 5:
Protection of pR234 at CCSGG sites. Lane 1 contains bacteriophage λ HindIII markers. Lanes 2 and 3 contain pR234 DNA from uninduced cells. Lanes 4 and 5 contain pR234 DNA from induced cells. Lanes 2 and 4, uncut DNA; lanes 3 and 5, DNA cut with ScrFI.
Interestingly, the DNA isolated from the induced cells also
contained high molecular weight DNA that appeared as a smear in the gel
or did not enter the gel (Fig. 5, compare lanes 2 and 4). Such high molecular weight DNA was not found to
contaminate plasmids isolated from uninduced cells (Fig. 5, lane 2) or from cells induced for only 2 h (data not shown).
Furthermore, plasmid DNA isolated after 2 h of induction was only
slightly methylated at non-canonical sites (data not shown). We suggest
that the contaminating high molecular weight DNA is chromosomal
DNA that has been degraded by the Mcr systems between 2 and 3.5 h after
the addition of IPTG. It should be noted that such DNA was absent from
the preparations of pR400 (Fig. 4), probably because the plasmid
was isolated from an Mcr− Mrr− host,
BH143.
We confirmed the apparent degradation of DNA following M. EcoRII induction using a genetic assay. Heitman and Model (26) have described Mcr+ Mrr+
E. coli strains in which a promoter-less lacZ gene is inserted downstream from a DNA damage-inducible promoter.
In these strains degradation of DNA results in the induction of the SOS
response, which causes induction of β-galactosidase. In many cases,
DNA degradation is not severe enough to cause immediate cell death and
hence colonies can be obtained on plates. Because the presence of
β-galactosidase can be determined by a colorimetric assay, this is
a convenient system to study restriction by the Mcr and Mrr systems.
Using this strain, Heitman and Model (26) showed that presence
of M. HpaII and M. MspI in cells causes degradation
of DNA.
pR234 and pNK627 were introduced into one such strain
(JH140), and the cells were plated on LB plates containing
5-bromo-4-chloro-3-indolyl β-D-galactoside (X-gal) alone or
with IPTG. Appropriate antibiotics were also present in the plates to
ensure the retention of the two plasmids during cell growth. Overnight
incubation of the plates without IPTG at 30 °C gave rise to light
blue colonies, while the colonies that appeared on plates with X-gal
and IPTG were a much deeper shade of blue (not shown). The color of the
latter set of colonies was typical of strains that are
LacZ+. Presumably, IPTG-mediated induction of the ecoRIIm gene resulted in significant damage to DNA leading to
SOS induction and expression of β-galactosidase.
To provide a
quantitative measure for this phenomenon, β-galactosidase activity
from IPTG-induced cells was determined and compared to activity from
uninduced cells. The results are summarized in Fig. 6. The
uninduced cells contained significant levels of β-galactosidase
activity, and the activity increased by a factor of 3 as cells
entered stationary phase. This is consistent with the observation
mentioned above that colonies on plates without IPTG had a faint blue
color (see also (26)). But the cells to which IPTG had been
added behaved differently in two significant ways. The
β-galactosidase activity in these cells increased by a factor of
17, and the bulk of this increase occurred within the 3rd hour
after the addition of IPTG (Fig. 6B). Also, in contrast
to the uninduced cells, these cells appeared to have stopped dividing
at about 3 h following the addition of the inducer (Fig. 6A). The time course of induction of
β-galactosidase activity and the cessation of cell division
correlate well with the increase in methylation at non-canonical sites
and the appearance of high molecular weight DNA in the plasmid
preparations described above.
Figure 6:
SOS response following the induction of M. EcoRII activity. Vertical arrow marks the time at
which IPTG was added to one culture. Open squares, culture
grown with IPTG. Open circles, culture grown without IPTG. A, turbidity of cultures monitored using optical density
measurements at 600 nm. B, β-galactosidase activity was
determined for samples from the same two cultures at various times. The
enzyme activity is normalized with respect to cell density as described
by Miller(32) .
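For readers unfamiliar with the normalization, the commonly used form of the Miller calculation is sketched below: units = 1000 × (A420 − 1.75 × A550) / (t × v × A600), with t in minutes and v in milliliters. It is an assumption that exactly this form was applied to the data in Fig. 6; the function name and the example readings are hypothetical.

```python
def miller_units(a420: float, a550: float, a600: float,
                 minutes: float, volume_ml: float) -> float:
    """Beta-galactosidase activity normalized to cell density.

    a420      absorbance of the reaction (o-nitrophenol plus light scatter)
    a550      absorbance at 550 nm, used to correct for cell debris scatter
    a600      optical density of the culture at the time of sampling
    minutes   reaction time
    volume_ml volume of culture assayed
    """
    if minutes <= 0 or volume_ml <= 0 or a600 <= 0:
        raise ValueError("minutes, volume_ml and a600 must be positive")
    return 1000.0 * (a420 - 1.75 * a550) / (minutes * volume_ml * a600)


# Hypothetical readings, used only to show the call.
units = miller_units(a420=0.90, a550=0.02, a600=0.50,
                     minutes=10.0, volume_ml=0.1)
print(f"{units:.0f} Miller units")
```

Because the result is divided by the culture density, a rise in normalized activity in induced cells reflects increased expression per cell rather than a difference in cell number between the two cultures.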
We have shown here that overproduction of M. EcoRII
in E. coli causes significant methylation within several CCSGG
sequences and within other sequences that were not known to be
substrates for its catalytic action. This methylation causes a
substantial increase in C → T mutations at the non-canonical
sites. This correlation between MTase overproduction, methylation at
non-canonical sites and mutagenesis at these sites helps explain our
initial observation that even at lower levels of MTase activity
significant amounts of C → T mutations can be detected at
non-canonical sites. These mutations are likely to be caused by the
deamination of 5-meC or due to other mutagenic interactions at sites of
methylation (see Introduction). Because we have demonstrated the
existence of this phenomenon with M. HpaII, in addition to M. EcoRII and Dcm, we expect that all C5 MTases will display such
effects.
The ability of C5 MTases to methylate non-canonical sites
was anticipated in one of our earlier studies(33) . In it we
showed that a multicopy plasmid carrying the dsaV gene is restricted by McrBC
and concluded that this was consistent with the methylation by M. DsaV at sites other than its canonical sequence: CCNGG.
Methylation by N6-adenine methyltransferases at
non-canonical sites has been demonstrated
before(42, 43, 44, 45) , but this is
the first demonstration of similar behavior by C5 MTases.
Although the SOS response is
known to cause mutations, it cannot be the cause of mutations at the
non-canonical sites. One of the reasons for this conclusion is that the
strain used in the mutational studies was deleted for all known Mcr
systems and hence it is unlikely that SOS induction occurred in this
strain. Also, mutations at non-canonical sites are found to occur at
levels of M. EcoRII and Dcm at which there is no evidence of
SOS induction. When dcm and ecoRIIm
genes are present in medium copy
number plasmids such as pBR322 or pACYC184 and are expressed from their
endogenous promoters, the SOS response is not
observed (46).
Additionally, SOS mutagenesis
requires error-prone "translesion" synthesis, and copying of
5-meC by at least E. coli polymerase I Klenow fragment is
known not to be error-prone (9). Finally, we and others have found that plasmids similar to pR300 and pDCM72 do
not cause an overall increase in forward mutation frequency in E. coli as measured by rifampicin and streptomycin resistance
assays.
Oddly, VSP repair also corrects T:G mismatches that lie within sequences other than CCWGG, albeit at lower efficiencies. Genetic evidence exists for the repair of T:G mismatches in NTAGG/NGTCC or 5`-CTAGN/3`-GGTCN to C:G by this system(47) . Additionally, purified Vsr protein (which is an endonuclease that nicks DNA immediately upstream of the mismatched T; (51) ) cleaves substrates that differ from the canonical Dcm sequence by one base pair(52, 53) . The most efficiently repaired mismatch is CTAGG/GGTCC, and replacement of the central A:T pair or the terminal C:G or G:C pairs by other pairs reduces the efficiency of repair to between 5 and 68% of this value (53) . In addition, two duplexes in which both the terminal base pairs had been substituted (TTAGA/AGTCT and TTAGC/AGTCG) are also repaired at a low efficiency(53) . It is not easy to understand why VSP repair has such broad sequence specificity if Dcm (and M. EcoRII) are assumed to interact with and methylate only CCWGG sequences. However, now that we have shown that Dcm and M. EcoRII can methylate non-canonical 4/5 sites under certain conditions, the imagined discrepancy between specificities of the MTases and the DNA repair process (53) may be eliminated. What remains to be done is to catalog all the sequences methylated by Dcm and M. EcoRII and compare them to the family of sequences within which VSP repair corrects T:G mismatches.
Significant levels of methylation at non-canonical sites only occur when the MTase is overexpressed. The level of Dcm protein expressed from the chromosomal copy of this gene is low enough that not all CCWGG sites are protected from R. EcoRII endonuclease cleavage (54, 55) . Like M. EcoRII (Table 6), Dcm is expected to have a strong preference for the canonical sequence; hence, under these conditions it is unlikely that a significant fraction of 4/5 sites will be methylated by Dcm. However, little is known about the regulation of dcm or the role of Dcm in E. coli, and it is possible that under certain physiological conditions, Dcm is overproduced in the cells causing complete methylation of the canonical sites and partial methylation of 4/5 sites. We suggest that it is this possibility that Vsr is designed to deal with. Deamination of 5-meC at the non-canonical sites would create T:G mismatches within 4/5 sequences, and these would then be repaired by VSP repair.
It is also interesting to note that M. EcoRII is a negative regulator of its own expression(57, 58) . Although the regulation of dcm and hpaIIm genes has not been studied, mspIm gene is also under autogenous control(59) . It is possible that one of the reasons for such tight regulation of C5 MTase genes is to reduce mutagenic damage caused by these enzymes. Negative regulation by M. EcoRII appears to be related to methylation of its canonical site. The enzyme has two DNA-binding domains, one for the methylation sequence and the other for an operator sequence within its promoter(58) . Further, unmethylated CCWGG sequence, but not the methylated sequence, inhibits binding of the enzyme to the operator. Presumably, the enzyme binds the operator and shuts off transcription when all CCWGG sites have been methylated. In this way, efficient transcription of ecoRIIm gene occurs only when unmethylated CCWGG sites are present.
It is not possible from these studies to pinpoint the level of MTase activity in the cells at which methylation of non-canonical sites becomes significant. While dcm is present as a single-copy chromosomal gene in E. coli, the EcoRII genes were originally found on a low copy number natural plasmid in a clinical isolate of E. coli (17, 18). Compared to levels of the two MTases in these strains, the pBR322- and pACYC184-based clones used in our study express the MTases at a level that is higher by at least an order of magnitude ((25); see above). The same may be true of the clones that express M. HpaII, because the HpaII restriction-modification genes were originally found in the Haemophilus parainfluenzae chromosome (60). Based on the genetic assay used in our studies, the level of expression of the hpaIIm gene from the plasmid-based clone is sufficient to significantly methylate the non-canonical sites. Interestingly, at this level of methylation the gel electrophoresis-based assay is unable to detect this additional methylation. Clearly, the genetic assay should be of considerable use in studying this phenomenon further.
Note Added in Proof-Recently Clark et al. (65) transfected mammalian cells with DNA from E. coli and studied its methylation pattern after several generations. Their observation that some CAG, CTG, and CCG sequences that were not Dcm sites were methylated, can be explained by the results described here. Presumably, input E. coli DNA contained occasional methylation at CCWG, CWGG, and CCSGG sites. Therefore, it is unlikely that mammalian cells contain a de novo CNG methylating activity.