©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
Natural DNA Precursor Pool Asymmetry and Base Sequence Context as Determinants of Replication Fidelity (*)

Xiaolin Zhang, Christopher K. Mathews (§)

From the Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331-7305 and the NIEHS, National Institutes of Health, Research Triangle Park, North Carolina 27709

ABSTRACT

Previous studies showed a complex relationship between the nucleotide composition of a gene and the rate of the gene's evolutionary variation. We have investigated mechanisms by constructing M13 phagemids containing part of the Escherichia coli lacZ gene, in which an opal codon is flanked either by nine adenine·thymine base pairs on each side, or by nine guanine·cytosine pairs, or by its wild-type sequence context. Reversions or pseudoreversions within the opal codon yield a lacZ α-peptide that can undergo α-complementation and yield a blue plaque when plated with a chromogenic substrate. When these constructs were replicated in HeLa cell extracts, in the presence of equimolar deoxyribonucleoside triphosphate (dNTP) mixtures, reversion was near background levels in both the AT-rich and GC-rich contexts. By contrast, when the DNAs were replicated at dNTP concentrations approximating those in HeLa cell nuclei, increases over background were seen in all three contexts. Replication of the phagemids in vivo led to even higher mutation frequencies. Replication in the presence of dGMP, added to inhibit proofreading, caused extraordinarily high reversion frequencies in the GC-flanked opal codon. Apparently, dNTP concentrations approximating intracellular concentrations are mildly but significantly mutagenic, and pool asymmetries and base sequence context both contribute to the natural fidelity of DNA replication.


INTRODUCTION

The studies described in this paper were inspired by a report (Wolfe et al., 1989) that the rate at which mammalian DNA sequences undergo evolutionary variation is a complex function of the guanine·cytosine content of the sequence, with the highest rates observed in sequences containing about 50% G+C. It seemed likely that natural asymmetries in intracellular concentrations of deoxyribonucleoside triphosphates (dNTPs) could be at least partly responsible for the variations observed (Mathews and Ji, 1992). Specifically, dGTP accounts for only 5-10% of the total pool of the four common dNTPs in most mammalian cell lines that have been studied. Thus, misinsertion opposite template dCMP residues might be relatively frequent, causing increased mutation rates with increasing G+C content. At the same time, replication immediately 5′ to a template dCMP residue might be more accurate if dGMP insertion at that site were slow enough to increase the probability of excision of misinserted nucleotides at the upstream site; this effect would decrease the replication error rate as a function of increased G+C content, as schematized below.

It is well known that dNTP pool biases are mutagenic during DNA replication both in vitro and in vivo (cf. Kunkel (1992) and Kunz et al. (1994)). The relationship between dNTP pool biases and replication fidelity has become of special interest with the report that pool imbalances during reverse transcription may be responsible for hypervariability of the human immunodeficiency virus genome (Vartanian et al., 1994).

Most in vitro studies have been carried out either at rather extreme dNTP pool biases, or with unnatural DNA templates, or both. The present studies were designed to ask: 1) whether replication of a natural gene sequence, at dNTP concentrations approximating natural pool asymmetries, is mutagenic; and 2) whether the immediate base sequence context influences replicative error rates in ways that would help explain the observed relationships between base composition of a gene and its evolutionary variation.


EXPERIMENTAL PROCEDURES

These studies used the modified phage M13mp2SV, described by Roberts and Kunkel (1988). This phage contains in its genome an SV40 DNA replication origin and a mutational target consisting of the first 45 codons for Escherichia coli β-galactosidase plus 115 nucleotides of upstream sequence. Expression of the 45 codons of wild-type sequence generates a peptide that can undergo α-complementation when introduced into a host strain, E. coli CSH50, which expresses the remainder of the lacZ gene. Complementation is scored by plating in the presence of 5-bromo-4-chloro-3-indolyl β-D-galactoside, a chromogenic substrate for β-galactosidase. A deep blue plaque is scored as wild-type, while mutants yield white or light blue plaques. Constructs for the reversion assays described here were prepared from M13mp2SV by site-directed mutagenesis, using the methods of Kunkel et al. (1987).

Other methods were also as described by Roberts and Kunkel (1988) and by Roberts et al. (1991), including preparation of HeLa cell extracts, preparation and purification of double-strand replicative form DNAs, conditions for SV40 origin-dependent DNA replication catalyzed by HeLa cell extracts, DpnI digestion to eliminate unreplicated DNA from analyses, electroporation of replicated DNA, plating on E. coli CSH50, and scoring mutations on the basis of plaque color. The host strain for electroporation, E. coli NR9162, was mutS, to minimize loss of replicational heterozygotes due to mismatch repair. In the reversion assays, plates were incubated for 15-18 h at 37 °C, followed by an additional 48-h incubation at room temperature, to allow detection of the maximum number of mutational events.

dNTP pool measurements were carried out essentially as described by North et al. (1980). These analyses, when carried out on the concentrated HeLa cell extracts used for the replication reactions, confirmed that the dNTPs in these extracts contributed negligibly toward the dNTP concentrations in each reaction mixture. Similar analyses also confirmed that dNTP degradation was negligible (less than 5% of the starting values) over the course of the replication reactions.

For analysis of the M13mp2SV derivatives that had replicated in vivo, COS7 cells were grown in Dulbecco's modified Eagle medium (DMEM) plus 10% fetal bovine serum to about 40% confluence, then washed twice with Opti-MEM reduced serum medium (Life Technologies, Inc.). For each 100-mm culture dish, 3 µg of replicative form I DNA and 10 µl of Transfectase reagent (Life Technologies, Inc.) were diluted to 300 µl with Opti-MEM I reduced serum medium, then mixed together. After standing 20 min at room temperature for formation of lipid-DNA complexes, each mixture was diluted to 3.0 ml with the same medium and added to cells treated as described above. Cells were incubated for 10 h at 37 °C, and then 3.0 ml of DMEM plus 20% fetal bovine serum was added, and incubation was continued for 24 h. At that point, medium was replaced with DMEM containing 10% fetal bovine serum and incubation continued for 12 h more. After trypsin treatment and centrifugation, each cell pellet was washed with phosphate-buffered saline and resuspended in 200 µl of 50 mM Tris-HCl, pH 7.5, 10 mM EDTA, and 100 µg/ml RNase A. Cells were lysed by adding an equal volume of 0.2 M NaOH and 1% SDS. Chromosomal DNA and cell debris were precipitated by adding potassium acetate, pH 4.8, to a final concentration of 0.44 M, followed by centrifugation. Episomal DNA was purified through Wizard mini-columns (Promega), and unreplicated DNA was eliminated from each mixture by digestion with DpnI prior to electroporation into E. coli NR9162 and subsequent analysis for revertants.


RESULTS AND DISCUSSION

Originally we contemplated a "global" approach to analyzing the relationship between DNA base composition, dNTP pool asymmetry, and mutagenesis. We planned to modify the G+C content of the 135-base pair protein-coding part of the lacZ gene in M13mp2SV, to values as high and low as possible without changing the amino acid sequence of the encoded lacZ α-peptide. We would then replicate these modified constructs in vitro, in the presence of dNTP concentrations chosen to represent the approximate levels within HeLa cell nuclei (Leeds et al., 1985) and determine the extent to which the natural asymmetry in dNTP levels influenced replication error frequencies. This latter analysis would involve a forward mutation assay (lacZ⁺ → lacZ⁻), where mutations anywhere in the 135-base pair target could be scored as a change in plaque color from dark blue to white or light blue.

However, a preliminary analysis (Table I) indicated that this approach would not be feasible. M13mp2SV DNA was replicated either in the presence of an equimolar dNTP mixture (100 µM each dNTP) or an asymmetric mixture representing the estimated dNTP concentrations in S-phase HeLa cell nuclei (60 µM dATP, 60 µM dTTP, 30 µM dCTP, 10 µM dGTP) (Leeds et al., 1985). In both cases the replicated DNA samples showed significantly more mutants than unreplicated controls, which were treated identically to the experimental reaction mixtures, except for the omission of SV40 T-antigen during incubations with HeLa cell extract. However, we did not see a significant difference in mutation frequencies between the equimolar and asymmetric dNTP mixtures. It was apparent that an extremely large number of plates would have to be counted, if we were to learn whether the small difference we did observe was significant, since detection of white or light blue plaques can be done only if there are fewer than about 500 dark blue plaques/plate.
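To make the degree of asymmetry concrete, the two reaction mixtures can be expressed as fractions of the total dNTP pool. The following Python sketch is illustrative only: the concentrations are those quoted above, whereas the dictionary names and the helper `pool_fractions` are hypothetical and not part of the original work.

```python
# Concentrations (µM) quoted in the text for the two reaction mixtures.
EQUIMOLAR = {"dATP": 100.0, "dTTP": 100.0, "dCTP": 100.0, "dGTP": 100.0}
BIOLOGICAL = {"dATP": 60.0, "dTTP": 60.0, "dCTP": 30.0, "dGTP": 10.0}


def pool_fractions(pool):
    """Return each dNTP's share of the total pool (hypothetical helper)."""
    total = sum(pool.values())
    return {name: conc / total for name, conc in pool.items()}


if __name__ == "__main__":
    for label, pool in (("equimolar", EQUIMOLAR), ("biological", BIOLOGICAL)):
        fractions = pool_fractions(pool)
        summary = ", ".join(f"{name} {frac:.3f}" for name, frac in fractions.items())
        print(f"{label}: {summary}")
```

In the asymmetric mixture dGTP makes up 10 of 160 µM, or 6.25% of the total, which falls within the 5-10% range cited in the Introduction for mammalian cell lines.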

Accordingly, we turned from the global to a more local approach, involving reversion and pseudoreversion events within one codon. Reversion analysis in this system involves scoring blue plaques against a white plaque background, and this allows inspection of a far larger number of plaques per plate than does the forward mutation assay. For our analysis we chose a serine codon (residue 7) in a flexible part of the lacZ mutational target. The TCA encoding this serine was changed to an opal codon (TGA), and revertants or pseudorevertants were scored as dark blue or light blue plaque formers.

Extensive sequence analysis of mutants generated during in vitro replication of the lacZ target in M13mp2 and its derivatives has revealed few null (white plaque) mutations within this region (cf. Kunkel and Alexander (1986)), suggesting that most mutations occurring here allow some retention of wild-type protein function. This means that: 1) we can alter the sequences flanking this codon and expect relatively little effect on protein function, and 2) we can expect most single-base substitution errors involving the engineered opal codon to generate a wild or pseudo-wild phenotype and, hence, to be scored as mutational events in a reversion analysis.
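The second expectation can be checked at the level of the genetic code by listing every single-base substitution of the opal codon. The Python sketch below is a minimal illustration, assuming the standard genetic code; the function name `single_base_substitutions` is hypothetical. It shows only which codons result, not whether the resulting peptide would in fact give a blue plaque.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
# ("*" marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def single_base_substitutions(codon):
    """Yield (mutant_codon, amino_acid) for each of the 9 one-base changes."""
    for pos, original in enumerate(codon):
        for base in BASES:
            if base != original:
                mutant = codon[:pos] + base + codon[pos + 1:]
                yield mutant, CODON_TABLE[mutant]


if __name__ == "__main__":
    for mutant, amino_acid in single_base_substitutions("TGA"):
        kind = "stop" if amino_acid == "*" else "sense"
        print(mutant, amino_acid, kind)
```

Of the nine possible substitutions in TGA, eight produce sense codons (including TCA, the true reversion to serine) and only one, TAA, produces another stop codon.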

Because we are interested in sequence context as a determinant of replication fidelity in the presence of biologically biased dNTP concentrations, we wished to alter the base sequences flanking the local mutational target, namely the opal codon introduced in place of the serine-7 codon. Accordingly, we designed two sets of flanking sequences, as shown in Table II. In one construct, (AT)TGA(AT), the opal codon was flanked on each side by nine adenine·thymine base pairs, which generated two conservative changes in the six codons from the wild-type flanking sequence. Design of the other construct, (GC)TGA(GC), required more changes in order to flank the opal codon with nine guanine·cytosine pairs on each side. However, the changes apparently had little effect on function of the gene product, because many of the revertant plaques seen in analysis of all three constructs had a deep blue color indistinguishable from that given by the wild-type sequence.

Data from two experiments, summarized in Table III, reveal several noteworthy results. First, as noted elsewhere (cf. Roberts et al. (1991)), replication in this in vitro system is quite accurate. The DNAs replicated in equimolar dNTP mixtures showed error rates comparable to those seen in the unreplicated controls ("background" in Experiment 2). Second, the biological asymmetry in DNA precursor pools apparently does contribute toward the natural mutation rate. In all three constructs the mutant fraction was significantly higher when the DNA was replicated at "biological" dNTP concentrations, biased as described in Table I. This effect was particularly significant in the (AT)TGA(AT) construct, where the mutant fractions in biological and equimolar dNTP mixtures differed by a factor of 3.5.
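The mutant fraction used in these comparisons is defined in the legend to Table III as the ratio of mutant plaques to total plaques counted. A minimal Python sketch of that arithmetic follows. The function names are hypothetical, and the plaque counts are invented purely for illustration; they are NOT the values in Table III.

```python
def mutant_fraction(mutant_plaques, total_plaques):
    """Ratio of mutant (blue) plaques to all plaques counted."""
    if total_plaques <= 0:
        raise ValueError("total_plaques must be positive")
    if not 0 <= mutant_plaques <= total_plaques:
        raise ValueError("mutant_plaques must lie between 0 and total_plaques")
    return mutant_plaques / total_plaques


def fold_difference(fraction_a, fraction_b):
    """How many times larger fraction_a is than fraction_b."""
    if fraction_b <= 0:
        raise ValueError("fraction_b must be positive")
    return fraction_a / fraction_b


if __name__ == "__main__":
    # Hypothetical counts, invented for illustration; NOT data from Table III.
    biological = mutant_fraction(12, 400_000)
    equimolar = mutant_fraction(4, 400_000)
    print(f"biological pool: {biological:.2e}")
    print(f"equimolar pool:  {equimolar:.2e}")
    print(f"fold difference: {fold_difference(biological, equimolar):.1f}")
```

Because revertants are rare events, the totals must be large for such a ratio to be meaningful, which is why the reversion assay, with its far greater number of scorable plaques per plate, was preferred over the forward assay.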

The third noteworthy result is the fact that replication of the three constructs in vivo, rather than under biologically biased pool conditions in vitro, also yielded mutant fractions significantly above background. In fact, as shown in the fourth line of Experiment 2, these values were even somewhat higher than the corresponding values from the in vitro experiment (third line). Of course, factors other than dNTP asymmetries may well contribute toward the error rates seen during replication in living cells. Mismatch repair, for example, could occur in vivo, but this would tend to decrease the mutant fractions to values lower than those seen after incubation in vitro. In any event, the results are consistent with the hypothesis that biological dNTP pool asymmetries contribute toward the natural replication error rate.

Fourth, whether replicated in vivo in bacterial cells ("background") or mammalian cells, or in vitro in equimolar or biologically biased dNTP pools, replication was significantly more accurate when the opal codon was flanked by GC base pairs than with either AT base pairs or with the natural nucleotides. This may reflect the stability of guanine·cytosine base pairs, which could lower the tolerance for insertion of incorrectly base paired nucleotides. This interpretation is consistent with the relatively large difference in mutant fraction between equimolar and biologically biased pools for the AT-flanked opal codon, described above. Formation of a C·dGTP pair occurs during normal replication of the 3′-ACT-5′ trinucleotide in the antisense strand at the opal codon. If mispairing in this site at low dGTP concentrations occurs more readily in an AT-rich sequence context, then the results described in the previous two paragraphs are readily understood. It seems unlikely that variations in mismatch repair are involved, because mismatch repair activities are thought to be low during replication in HeLa cell extracts (Roberts et al., 1991).

Fifth, although a GC-rich sequence context seems to promote correct base pairing at the insertion step, maintenance of high fidelity is highly dependent upon proofreading of insertion errors that do occur. Note, both from Experiment 1 and from the fifth and sixth lines of Experiment 2 in Table III, the extraordinary sensitivity of the (GC)TGA(GC) target to inhibition of proofreading, brought about by addition of a deoxyribonucleoside monophosphate at high concentrations. Error rates increased in all three constructs, but the severalfold increment in replication accuracy caused by GC-rich flanking sequences when proofreading was not inhibited was replaced by a decrement in replication accuracy, by about an order of magnitude, when proofreading was inhibited.

Essentially the same conclusion can be drawn from the "dGTP excess" experiments (Experiment 2, last line). Presumably, the mutations here were caused largely by the next nucleotide effect, which involves pool-driven incorporation of nucleotides past the site of a substitution error before that error can be repaired exonucleolytically (Roberts et al., 1991). Again, if helix unwinding is slower when the site of an error is flanked by guanine·cytosine pairs, the sensitivity of the (GC)TGA(GC) construct to mutagenesis under these conditions is easily understood.

To propose an effect of flanking helix stability upon proofreading efficiency in this system is to propose that proofreading in eukaryotic DNA replication involves significant helix unwinding to place the primer terminus in the 3′ exonuclease site, as evidently occurs in prokaryotic DNA replication (Beese et al., 1993). Whereas structural studies on eukaryotic replication proteins make this a reasonable expectation (cf. Wang, 1991; Beckman and Loeb, 1994), it has not been explicitly demonstrated. However, our results are consistent with this model.

The influence of base sequence context upon replication fidelity has long been apparent, simply from the existence of hot spots for spontaneous mutagenesis. However, systematic analysis of this phenomenon has begun only recently. Of particular interest is a study of Bloom et al. (1994), who used pre-steady-state kinetic analysis to analyze 3′ exonucleolytic proofreading, and who also showed the influence of helix stability at the primer terminus upon replication accuracy. The system of Bloom et al. involves proofreading of a nucleotide analog, in replication of synthetic DNA templates by a purified DNA polymerase. By contrast, our system involves replication of natural or near-natural DNA sequences by a multiprotein replication apparatus using natural DNA precursors at concentrations that can be adjusted to near-natural levels. The two kinds of analyses should be mutually supportive as investigations of spontaneous mutagenesis continue.

The preliminary results reported here demonstrate, we believe, the utility of this approach to understanding the effects of natural dNTP asymmetries upon replication accuracy and spontaneous mutagenesis. The results suggest a variety of informative approaches to be taken in subsequent investigations, including sequence analysis of the revertants, analysis of different sequence contexts (e.g. AT upstream, GC downstream), and more definitive analyses of the effective dNTP concentrations at eukaryotic DNA replication sites.

Table I: Forward mutation assay: lacZ⁺ → lacZ⁻

0.5 µg of M13mp2SV replicative form DNA was replicated in each assay for 6 h, as described by Roberts and Kunkel (1988), at the specified dNTP concentrations. Incorporation of radioactivity from [α-³²P]dCTP confirmed that replication was undetectable in the controls incubated in the absence of SV40 T antigen.


Table II: DNA constructs used in the reversion assay

Aside from the sequence alterations shown, each construct is identical to M13mp2SV. Each altered sequence extends from codon 4 through 10 of the coding sequence for the lacZ α-peptide.


Table III: Reversion and pseudoreversion mutations generated during DNA replication

The DNA constructs described in Table II were replicated in vitro as described for Table I, except that the incubation period was 3 h. All dark blue and light blue plaques were scored as mutations, and the mutant fraction is the ratio of mutant to total plaques counted. The actual numbers of mutant and total plaques scored after each incubation are shown in parentheses. "Background" denotes DNAs incubated in the absence of SV40 T antigen, where no detectable replication occurred. "Replicated in vivo" means replicated in, and isolated from, COS7 cells. ND, not determined.



FOOTNOTES

*
This work was supported by National Science Foundation Research Grant DMB 9119854. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

§
To whom correspondence should be addressed: Dept. of Biochemistry and Biophysics, Oregon State University, 2011 Agricultural and Life Sciences Bldg., Corvallis, OR 97331-7305. Tel.: 503-737-1865; Fax: 503-737-0481.

The abbreviation used is: DMEM, Dulbecco's modified Eagle's medium.


ACKNOWLEDGEMENTS

Much of the work described in this paper was carried out in the laboratory of Dr. Thomas A. Kunkel, NIEHS, National Institutes of Health during separate visits by each of the two authors to that laboratory. We are grateful to Dr. Kunkel and to Drs. John D. Roberts and David C. Thomas of that laboratory for hospitality, instruction, and guidance.


REFERENCES
  1. Beckman, R. A., and Loeb, L. R. (1994) Q. Rev. Biophys. 26, 225-331
  2. Beese, L. S., Derbyshire, V., and Steitz, T. A. (1993) Science 260, 352-355
  3. Bloom, L. B., Otto, M. R., Eritja, L. J., Reha-Krantz, L. J., Goodman, M. F., and Beechem, J. M. (1994) Biochemistry 33, 7576-7586
  4. Kunkel, T. A. (1992) BioEssays 14, 303-308
  5. Kunkel, T. A., and Alexander, P. S. (1986) J. Biol. Chem. 261, 160-166
  6. Kunkel, T. A., Roberts, J. D., and Zakour, R. A. (1987) Methods Enzymol. 154, 367-382
  7. Kunz, B. A., Kohalmi, S. E., Kunkel, T. A., Mathews, C. K., McIntosh, E. M., and Reidy, J. A. (1994) Mutat. Res. 318, 175-239
  8. Leeds, J. M., Slabaugh, M. B., and Mathews, C. K. (1985) Mol. Cell. Biol. 5, 3443-3450
  9. Mathews, C. K., and Ji, J. (1992) BioEssays 14, 295-301
  10. North, T. W., Bestwick, R. K., and Mathews, C. K. (1980) J. Biol. Chem. 255, 6640-6645
  11. Roberts, J. D., and Kunkel, T. A. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 7064-7068
  12. Roberts, J. D., Thomas, D. C., and Kunkel, T. A. (1991) Proc. Natl. Acad. Sci. U. S. A. 88, 3465-3469
  13. Vartanian, J.-P., Meyerhans, A., Sala, M., and Wain-Hobson, S. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 3092-3096
  14. Wang, T. S.-F. (1991) Annu. Rev. Biochem. 60, 513-552
  15. Wolfe, K. H., Sharpe, P. M., and Li, W.-H. (1989) Nature 337, 283-285
