(Received for publication, July 26, 1994; and in revised form, December 18, 1994)
A gel fidelity assay, previously used in the analysis of DNA
polymerases having no associated 3′ to 5′ exonuclease activity, has
been generalized for use with polymerases that contain exonucleolytic
proofreading. The main purpose of this study was the development of a
general analysis, using a standard Markov model, to convert
experimentally observed DNA primer gel bands arising from insertion and
proofreading of right and wrong deoxyribonucleotides into nucleotide
incorporation velocities and, most importantly, fidelities. The model
has been applied primarily to an analysis of polymerase kinetics and
fidelity in the presence of a next correct rescue dNTP, but the model
can be conveniently modified to investigate other experimental designs.
In the presence of rescue dNTP, direct competition occurs between
excision and extension of a mismatch. At concentrations of rescue dNTP
sufficient to suppress the gel band intensity at the mismatch target
site, nucleotide incorporation and misincorporation rates can be
obtained from the ratios of gel band intensities 3′ (downstream) and 5′
(upstream) to the target site, measured as a function of dNTP
concentration for "wrong" and "right" dNTP
substrates. The polymerase misincorporation efficiency, in the presence
of proofreading, is given by the ratio of wrong to right incorporation
efficiencies, Vmax/Km,
obtained from the gel band ratios. The bacteriophage T4 polymerase with
a highly active 3′-exonuclease activity was used to illustrate the
assay. Nucleotide misincorporation efficiencies measured at several
template sites were dCMP
A
10
,
dGMP
A
10
, dTMP
T
2
10
, and dAMP
A < 10
.
Proofreading of the dGMP·A mispair was suppressed about 3-fold
in the presence of high concentrations of next correct
"rescue" dNTP, causing a concomitant reduction in the
fidelity of dGMP
A to about 3
10
.
A large number of studies on the fidelity of DNA synthesis using
purified DNA polymerase have been performed over the past 30 years (1, 2). An initial experiment performed by Kornberg
and co-workers (3) measured the frequency of mispairing of the
mutagenic base analogue bromouracil with G using Escherichia coli DNA pol I. Hall and Lehman (4) compared
misincorporation of dTMP opposite G using DNA polymerases purified from
wild-type and mutator (tsL56) T4 bacteriophage. E. coli pol I and T4 polymerases contain 5′→3′ polymerase and 3′→5′ proofreading exonuclease activities on a single polypeptide
chain. Biochemical studies using pol I (5) and T4 mutator,
wild-type, and antimutator polymerases (6, 7, 8, 9, 10) were
crucial in demonstrating that discrimination against non-Watson-Crick
base pairs occurred during nucleotide insertion and excision, and the
combined action of both steps produced a high fidelity DNA product.
Subsequent experiments using a wide variety of DNA polymerases and
reverse transcriptases showed that the spectrum of errors and their
template locations could differ widely among polymerases; recent
reviews dealing with DNA synthesis fidelity are contained in Refs. 1,
2, 11, and 12.
Although DNA polymerase errors occur infrequently, different types of errors (e.g. transitions, transversions, and frameshifts) cover a wide range of frequencies and are distributed nonrandomly in DNA. Many factors, perhaps involving subtle differences in the interactions linking polymerase, matched and mismatched dNTP substrates, and primer-template DNA, can contribute to enzyme-specific variations in mutational spectra and nonrandom error distributions. Examples include fluctuations in base stacking that can perturb nucleotide insertion rates and fidelities (13), primer-template slippage events that can lead both to base substitution and frameshift errors (14, 15, 16), and sequence context effects (e.g. A-T or G-C "richness") that can influence proofreading efficiencies (17, 18, 19, 20, 21, 22).
The availability of a rapid assay measuring fidelity at arbitrary template locations would be useful for analyzing how differences in polymerases and primer template sequences contribute to each individual type of base substitution and frameshift mutation. Direct observations of incorporation fidelity, e.g. using two radioactive labels to measure the relative incorporation of mismatched and matched base pairs(9, 10) , are simple in concept but difficult in practice. In experiments where there is direct competition between matched versus mismatched dNTPs for incorporation into DNA, it is often difficult to detect misincorporations, even when dNTP pool concentrations are biased to favor insertion of a mispaired dNTP substrate(23) .
A "gel kinetic" assay, in which
incorporation of matched and mismatched dNTP substrates is measured,
in separate reactions, as a function of dNTP concentration,
has been used as an alternative method of deducing insertion fidelity
for polymerases (2, 24, 25). In the gel
kinetic assay, extension of 32P-labeled primers by the
addition of either matched or mismatched unlabeled nucleotides opposite
a template target site can be visualized using polyacrylamide gel
electrophoresis to resolve primers differing in length by one or more
nucleotides. In the absence of exonuclease activity, each
gel band arises from either a newly inserted "right" or
"wrong" nucleotide or from polymerase dissociation. The
assay can be performed rapidly, and the kinetic analysis leads to
simple expressions for insertion rates and fidelities in terms of gel
band intensity ratios (2, 24, 25). However,
the presence of proofreading adds considerable complexity to the
kinetic analysis, because gel bands can arise from either insertion or
excision of a nucleotide or from polymerase dissociation (2).
In this paper, we provide a detailed kinetic analysis of the gel
fidelity assay for DNA synthesis in the presence of exonucleolytic
proofreading. We use the assay to measure the fidelity of synthesis by
T4 DNA polymerase at several defined template sites.
The primers and templates used were synthesized by Dr. Linda B. Bloom, University of Southern California, on an Applied Biosystems Inc. 381A DNA synthesizer, and by Dr. Lynn Williams (Comprehensive Cancer Center, University of Southern California, Los Angeles) and used after gel purification. The 30-mer template sequence was 5`-TCATCGAGCATGATCACGTCGTGACTGGGA-3`. The 15-mer primer used for the primer extension and turnover experiments was 5`-TCCCAGTCACGACGT-3`. Additional 18-mer primers used for the exonuclease partitioning experiments had the sequences 5`-TCCCAGTCACGACGTGAT-3` and 5`-TCCCAGTCACGACGTGAG-3`.
Figure 7: Partitioning between polymerization and
proofreading at matched and mismatched primer termini from a standing
start. T4 DNA polymerase was presented with preformed correctly paired
or mispaired P/T termini. a, correctly paired dTMP·A
primer-3′ terminus; b, incorrectly paired dGMP·A
primer-3′ terminus. The ratio of the extended primers (the sum of bands
after the primer band) to degraded primers (the sum of bands before the
primer band) is plotted as a function of the concentration of the next
correct nucleotide, dCTP.
Figure 1: Sketch of gel kinetic DNA polymerase
fidelity assay. A 5′-32P-labeled primer annealed to DNA
template is extended by polymerase in a running-start reaction. The
running-start deoxynucleoside triphosphate substrates G (dGTP)
and A (dATP) are present at concentrations required to achieve
nearly maximum forward polymerization rates. Insertion of the dNTPs, dTTP (right) and dGTP (wrong), at the template
target site T (the target base is A for this example) is measured as
a function of concentration, in separate reactions, in the
presence of a rescue substrate (dCTP) present at a concentration
required to achieve approximately maximum rates of extension of the
mismatched primer terminus. The ratio of gel bands, I(Σ)/I(T-1),
is plotted as a function of target dNTP concentrations, and the values
of Vmax/Km are
determined separately for the right and wrong incorporations
(I(Σ) = I(T+1) + I(T+2) + I(T+3) + . . . ). The expected
gel band patterns for correct insertions (dTMP·A) yield a
relatively large value of I(Σ)
compared with a low intensity band at T - 1; see Fig. 4a. For incorrect insertions (dGMP·A), I(Σ)
is small compared with a
high intensity band at T - 1 (see Fig. 4b, where
the T - 1 band corresponds to the intense A·T band located 2
nucleotides downstream of the primer-3′ terminus). Ideally, the band at
the target site T is either missing or small compared with the bands at
T + 1 and T - 1. To satisfy SCH conditions (see
"Markov Model Analysis"), less than 20% of the input
32P-labeled primer band is extended during the course of the
reaction. The misincorporation efficiency, f, is
determined from the Vmax/Km
ratios for wrong and right
incorporations.
Figure 4:
Incorporation of one right and three wrong
nucleotides opposite a single site on the template. Primer extension
experiments were performed with 30 µM dGTP, 30 µM dATP and varying amounts of the nucleotide to be inserted opposite
template A, as well as varying amounts of the next correct nucleotide
(dCTP). In a, the incorporation of the correct dTMP opposite A
(the third nucleotide to be added) in the presence of dCTP (30
µM) is shown. In b, the misincorporation of dGMP
in the absence and presence of dCTP (30 µM) is shown,
while in c, the concentration of dCTP was varied and served as
both the misincorporated nucleotide and the next correct nucleotide.
The relative velocity of misincorporation of dCTP, at low
concentrations, was fit accurately by the quadratic expression v = (2.0 10
µM
)[dCTP]
. Panel d shows the incorporation of the incorrect
dAMP in the absence and presence of the rescue nucleotide (dCTP). Note
that the incorporation in the lanes with the rescue nucleotide present
was caused by the misincorporation of
dCMP.
Extension of 5′-32P-labeled primers by
incorporation of matched and mismatched deoxyribonucleotides opposite
individual template target sites can be visualized directly by
polyacrylamide gel
electrophoresis (24, 27, 28). Measurements of
integrated gel band intensities as a function of the
concentration of right and wrong dNTP substrates have been used to
deduce the fidelity of nucleotide insertion for DNA polymerases devoid
of 3′-exonucleolytic proofreading
activity (13, 24, 25). Our primary
experimental objective in this paper is to investigate the use of a gel
kinetic assay to measure DNA polymerase fidelity in the presence of
proofreading. A sketch illustrating the fidelity measurement is shown
in Fig. 1. Wild-type bacteriophage T4 DNA polymerase, known to
contain a relatively active associated 3`-exonuclease
(nuclease/polymerase ratio
0.1, see (6) and (10) ) has been used in the analysis.
The T4 proofreading exonuclease has been shown to excise a
sizable fraction (10-20%) of correctly inserted
nucleotides during rapid forward synthesis(9, 10) ,
and it was therefore important to investigate whether misincorporation
bands could be detected in the presence of an active proofreading
exonuclease (Fig. 2). The template sequence is shown at the
right-hand side, and the primer-3′ terminus is indicated by the intense
band located opposite template site A (e.g. see lane 1, None). A band corresponding to correct incorporation
opposite template C was observed only when dGTP was present (Fig. 2, lane 2, G). No detectable
primer extension was observed when either dATP, dTTP, or dCTP was
included as the sole source of substrate (Fig. 2, lanes 3-5). Extensive primer degradation caused by the
action of the polymerase-associated 3′→5′ exonuclease was
detected in the absence of dGTP (lanes 3-5),
while the only detectable degradation band in the presence of dGTP
corresponded to removal of a single dTMP at the primer-3′ terminus (lane 2).
Figure 2: Observation of misincorporations by T4 DNA polymerase. Each gel lane illustrates a primer extension reaction containing 70 µM of each of the nucleotides shown. The template sequence is given at the right-hand side, and the primer-3′ terminus is indicated by the intense band located opposite template site A (e.g. see lane 1, None).
Primer extension was measured in the
presence of various combinations of two dNTP substrates (Fig. 2, lanes 6-11). The gel patterns revealed the
presence of a variety of misincorporation events. The upper band opposite the template A position (lane 7) arose from
misincorporation opposite template T followed by extension of the
mismatched primer terminus by the correct incorporation of dTMP
opposite A. A band of barely detectable intensity located opposite
template T (lane 7) reflects a low probability of
polymerase dissociation from the P/T prior to either excision or
extension of the mispair. A point of interest concerning the T4
polymerase is that, contrary to expectations, the identity of the
mispair appears to be dTMP·T rather than dGMP·T. Primer
extension was observed to increase with increasing concentrations of
dTTP but not dGTP (data not shown). The formation of a dCMP·T
mismatch is indicated by the band opposite template T (lane 8). The primer terminated
following formation of the C·T mispair (lane 8)
because, unlike the case of extension of the T·T mispair (lane 7), the next correct
nucleotide, dTTP, was absent from the reaction shown in lane 8.
Misincorporation bands were also observed in the absence of the
nucleotide (dGTP) required for insertion at the primer-3′ terminus
(lanes 9-11). Thus, the presence of detectable
misincorporation gel bands provides a means for analyzing the fidelity
of polymerases in the presence of exonucleolytic proofreading by using
a gel kinetic analysis analogous to the assay developed to measure
fidelity in the absence of
proofreading (13, 24, 25).
In comparison
with reactions carried out with two substrates, combinations of three
substrates resulted in a small fraction of primers extended by as many
as 9 nucleotides containing at least one misincorporation band (data
not shown). A reaction containing all four dNTP substrates (lane 12, All) showed prominent pause bands at
template incorporation sites 2, 12, 14, and 15 (e.g. template incorporation site 2 refers to the
template T site 2 nucleotides downstream from the primer-3′ terminus A
site, lane 1). Thus, during a single hit, a majority
of primers can be extended by at least 12 nucleotides, but the T4
polymerase also has a small (10%) probability of dissociating
following addition of 2 nucleotides.
Figure 3: T4 DNA polymerase incorporation kinetics.
Primer extension experiments were run for 1 min with 30 µM dGTP and varying amounts of the nucleotide substrate to be
inserted opposite template T. Shown to the left of each gel is
the nucleotide inserted on the primer strand and the template base
corresponding to each band (i.e. A·T represents dAMP
inserted opposite T). In a, the substrate was dATP, present at
the concentrations indicated, both with (closed circles) and without (open circles) the
next correct nucleotide dTTP (3 µM). A plot showing the
ratio (I(T) + I(T+1))/I(T-1) was fit (in the presence
and absence of rescue dTTP) to a rectangular hyperbola. A least-squares
fit gave a relative Vmax = 45 and Km = 12 µM (Vmax/Km =
3.8 µM⁻¹). In b, the substrate
for misincorporation was dTTP, which served as its own next correct
rescue nucleotide. The ratio I(T+1)/I(T-1)
was plotted, and a linear fit was obtained corresponding to Vmax/Km = 6
10
µM
.
Reaction conditions are described under ``Experimental
Procedures,'' see ``Reaction Conditions for Primer
Extension Measurements.''
For
misincorporation of dTMP opposite template T, the target and rescue
substrates were both dTTP (Fig. 3b). Rescue bands at
site T + 1 are clearly visible at the two highest
concentrations of dTTP, 100 and 300 µM, and a very weak,
but measurable, band is present at 30 µM dTTP (Fig. 3b). The misincorporation velocities were a
linear function of [dTTP], with no detectable curvature to
permit an estimate of Km (Fig. 3b, right-hand side). An important point to emphasize is
that the absence of a detectable primer extension band at the target
site implies that the rescue nucleotide was present at high enough
concentration (>30 µM) to allow extension of the
mismatched primer-3′ terminus prior to polymerase dissociation and that
the complete removal of a mismatched dTMP·T by the exonuclease,
prior to polymerase dissociation, created a band at site T - 1 by removing the remaining unextended portion of the primer band at
the target T.
The fidelity analysis is simplified when a
target band at T is either absent or, more generally, small
compared with the integrated intensities at both T + 1 and T - 1. The presence of an appreciable target band leads to
potential ambiguities in the analysis (see "Markov Model
Analysis") because a band at T can arise either from
incorporation at the target site followed by polymerase dissociation or
from exonucleolytic excision at the rescue site. Proofreading of a next
correct rescue nucleotide could occur with relatively high efficiency
because it is likely to be destabilized when located next to a mispair.
Thus, the origin of the target band will be ambiguous unless additional
measurements are made, specifically, a measurement of the polymerase rate
of dissociation at the target site (2, 29, 30) and a measurement of turnover of
rescue dNTP → dNMP (9, 10). For the case of the
correct nucleotide, however, the target band T was used in the
analysis because the kinetics for incorporation of the correct
nucleotide were essentially the same in the presence and absence of the
next correct nucleotide (Fig. 3a, right-hand side).
The misincorporation
efficiency, f, defined as the reciprocal of the
incorporation fidelity, is equal to the ratio of dTMP/dAMP
incorporation opposite T, and is expressed as a ratio of Vmax/Km
values for T·T
(wrong) compared with A·T (right) base pairs.
A value of f
1.6
10
was obtained from the data in Fig. 3, a and b.
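The ratio defining the misincorporation efficiency can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, the right-insertion parameters are the relative Vmax = 45 and Km = 12 µM quoted in the Fig. 3a legend, and the wrong-insertion slope is a made-up placeholder rather than a measured value.

```python
def hyperbola(s, vmax, km):
    """Rectangular hyperbola v = Vmax*[S]/(Km + [S])."""
    return vmax * s / (km + s)


def efficiency(vmax, km):
    """Vmax/Km, the initial slope of the hyperbola at low [S]."""
    return vmax / km


def misincorporation_efficiency(slope_wrong, slope_right):
    """f = (Vmax/Km)_wrong / (Vmax/Km)_right; fidelity is 1/f."""
    return slope_wrong / slope_right


# Right insertion (dAMP opposite T): relative values from the Fig. 3a legend.
right_slope = efficiency(vmax=45.0, km=12.0)  # in µM^-1

# HYPOTHETICAL slope of the linear wrong-insertion plot (µM^-1).
# This is a placeholder for illustration, not a value from the paper.
wrong_slope = 1.0e-5

f = misincorporation_efficiency(wrong_slope, right_slope)
print(f"Vmax/Km (right) = {right_slope:.2f} per µM, f = {f:.2e}")
```

When the wrong-insertion plot shows no curvature, only its slope is available, which is why the function takes slopes rather than separate Vmax and Km values.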
Insertion of all four nucleotides was
measured opposite template A, three nucleotides downstream from the
primer-3′ terminus (Fig. 4). Two running-start nucleotides, dGTP
and dATP, were present at concentrations of 30 µM, and the
concentration of each of the four dNTPs was varied as shown.
Misincorporation kinetics for dGTP and dATP were measured in the
presence and absence of the rescue nucleotide, dCTP (30
µM), under multiple-hit conditions (there was an average of
about 1.5 hits/extended primer). The SCH approximation is extremely
useful for describing the assay; however, accurate fidelity
determinations can still be made when multiple hits have occurred (see "Correcting for Multiple Hits" under
"Appendix").
DNA synthesis was first carried out in the
presence of all four dNTPs, with increasing concentrations of the
correct substrate, dTTP (Fig. 4a). An increase in the
primer extension band corresponding to dGMP·A misincorporation was
observed with increasing dGTP concentration in the absence of rescue
dCTP (Fig. 4b). There appeared to be no detectable
primer extension continuing beyond the G·A mismatch in the absence
of dCTP. When dCTP was included in the reaction, the dGMP·A
misincorporation band diminished significantly, and a trace dGMP·A
band was observed only at the highest concentration of dGTP (1000
µM).
A small fraction of primers was extended by the
addition of at least 10 nucleotides beyond the initial mismatch in the
presence of the rescue nucleotide, and these contained at least two,
and possibly three, additional mismatches located opposite the three
template A sites downstream from the initial target A site. At each
site, the band opposite A was missing (dTTP was absent from the
reaction in Fig. 4b), suggesting that the enzyme
excised the misinserted nucleotide with high efficiency, giving rise to
a relatively intense band at T - 1, 1 base before the
target site, and extended the mismatch at lower efficiency, giving rise
to a band at T + 1. Note that the excision bands
corresponding to 1 base prior to each A template site increased in
intensity with increasing dGTP concentration. It is therefore likely
that the mispairs were predominantly dGMP·A, but some dCMP·A
mispairs may also have occurred. We estimated the efficiency of
misincorporating dGMP opposite A located 9 nucleotides from the primer
terminus by summing the integrated intensities of all the bands
downstream of the dGMP·A site, dividing by the integrated
intensity of the band prior to the mispair site, I(Σ)/I(T-1),
and plotting the ratio as a function of the dGTP concentration (see and ). The efficiency of incorporating dTMP
opposite A can be determined in the same manner (Fig. 4a; note that inclusion of the low intensity
(T·A) band, I(T), at the target site has a
negligible effect on the value of Vmax/Km for the correct
insertion (see "Markov Model Analysis")). For this template
A site, V
/K
(dGMP
A)
=1.5
10
, V
/K
(dTMP
A) =1.6
10
, resulting in f
10
.
The misincorporation of dCMP opposite A as a function of increasing dCTP concentration is shown in Fig. 4c. Significant primer extension beyond the four template A sites (sites 3, 6, 9, and 13) was observed, with a much greater extent of misincorporation and continued synthesis past the site of the mispair occurring at higher rescue dCTP levels (Fig. 4c). Summation of the integrated intensities of bands extending beyond the first A site (site 3) resulted in a quadratic dependence of mispair extension efficiency as a function of dCTP concentration, in the low dCTP concentration range (data not shown). A quadratic dependence is expected to occur at low concentration of rescue dNTP when the target and rescue dNTPs are the same(31) .
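One way to see why a quadratic dependence arises is to note that, when a single dNTP serves as both target and rescue substrate, the misinsertion step and the rescue step each scale with its concentration at low levels. The sketch below is an illustration under stated assumptions, not the treatment given in Ref. 31: both steps are taken as linear in concentration, the rescue step competes with a fixed excision rate, and all names and numbers are hypothetical.

```python
def extension_velocity(conc, ins_slope, res_slope, k_exo):
    """Relative velocity of mispair formation followed by rescue.

    conc      : concentration of the dNTP acting as target AND rescue substrate
    ins_slope : assumed linear coefficient for misinsertion (rate = ins_slope*conc)
    res_slope : assumed linear coefficient for rescue (rate = res_slope*conc)
    k_exo     : assumed excision rate competing with rescue
    """
    k_ins = ins_slope * conc
    k_res = res_slope * conc
    return k_ins * k_res / (k_res + k_exo)


# Hypothetical parameters, chosen only to illustrate the two regimes.
params = dict(ins_slope=1e-4, res_slope=0.1, k_exo=10.0)

# Low concentration (rescue much slower than excision): doubling the
# concentration roughly quadruples the velocity, i.e. v ~ conc**2.
low_ratio = extension_velocity(0.2, **params) / extension_velocity(0.1, **params)

# High concentration (rescue much faster than excision): doubling the
# concentration roughly doubles the velocity, i.e. v ~ conc.
high_ratio = extension_velocity(2e4, **params) / extension_velocity(1e4, **params)

print(f"low-concentration doubling ratio:  {low_ratio:.3f}")
print(f"high-concentration doubling ratio: {high_ratio:.3f}")
```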
Misincorporation of dAMP
opposite A was not detected using dATP concentrations up to 1000
µM (Fig. 4d, No Rescue).
The absence of a detectable dAMP·A band at high concentrations of
dATP was not likely to have been caused by substrate inhibition of the
polymerase because primer extension bands were observed in the presence
of rescue dCTP (Fig. 4d, Rescue). Therefore, the
misincorporation of dCMP opposite template A sites was most likely
responsible for primer elongation past the first two A sites in the
absence of dTTP. From the data in Fig. 4, the misincorporation
efficiencies () were estimated as dCMP
A
10
and dGMP
A
3
10
and 1
10
, in the presence and absence
of rescue dCTP, respectively. The absence of a detectable primer
extension band corresponding to misincorporation of dAMP opposite A (Fig. 4d, No Rescue) places an
approximate upper limit on the value of f
(dAMP
A) < 10
.
Figure 5:
Kinetics of the rescue of a mismatch by
the next correct nucleotide. In the primer extension reaction, the
substrate concentrations were dGTP = 100 µM and
dATP = 30 µM, and dCTP concentration was varied as
shown. In a, the gel bands are shown; in b, I/I
is
plotted as a function of dCTP concentration. Note that in the absence
of rescue nucleotide, some of the mismatches escaped cleavage because
of polymerase dissociation. In c, the value of I
/I
is plotted versus dCTP concentration, and this ratio reflects
the kinetics of incorporation of the rescue
nucleotide.
We considered the obvious possibility that a small
fraction of rescue dCTP could have undergone deamination to dUTP as an
explanation for the low Km observed for extension
of the G·A mispair. In the presence of low contaminating levels of
dUTP, U·A correct pairs could have been formed in preference to
G·A mispairs, and rescue of the U·A pairs could then have
occurred at much lower dCTP concentrations than required for rescuing
G·A mispairs, since the addition of dCMP opposite G would occur
after a correct base pair (U·A). Since the Km
for the addition of dCMP would be in the micromolar range, the
rescue should be saturated at [dCTP] above
1 µM. However, the absence of a detectable incorporation
band opposite A at 10 µM dCTP (Fig. 4c)
argues against this possibility because the rescue reaction is
approximately saturated in the presence of 10 µM dCTP (Fig. 5b). A second argument against the presence of
dUTP contamination is that as the concentration of dCTP is increased
between 0 and 10 µM in Fig. 5a, there is
no increase in insertion opposite A above the level of incorporation of
dGMP, where an increase would be expected if the dCTP were contaminated
with dUTP.
A comparison between wild-type and an
exonuclease-deficient T4 polymerase for dGMP·A misincorporation
showed that f
(dGMP
A)
5
10
measured for the exonuclease-deficient enzyme was
about 2-fold higher than the wild-type error rate measured in the
presence of 30 µM dCTP rescue nucleotide and was about
5-fold higher than the wild-type rate measured in the absence of rescue
dCTP (Fig. 4b). The total amount of primer extension
for the exonuclease-deficient polymerase was essentially independent of
the concentration of rescue dCTP (Fig. 6), whereas primer
extension increased appreciably for the wild-type polymerase (Fig. 4b, Rescue). A second misincorporation,
either dGMP·G or possibly dAMP·G, formed immediately adjacent
to the dGMP·A misincorporation band, was clearly observed in the
banding pattern for the exonuclease-deficient enzyme (Fig. 6,
position indicated by an arrow) but was not present in the case of
wild-type T4 polymerase (Fig. 4b). The mispair was most
likely to be G·G because dGTP was present at a 30-fold higher
concentration than dATP.
Figure 6:
Misincorporation by a T4
exonuclease-deficient polymerase and the effect of the next correct
nucleotide. In the primer extension reaction, the substrate
concentrations were dGTP = 100 µM and dATP =
30 µM, and dCTP concentration was varied as shown. The arrow indicates the location of a band in the 0 µM dCTP lane that corresponds most likely to the misincorporation of
dGMP opposite G (or possibly to the misincorporation of dAMP opposite
template G) following the formation of the G·A
mismatch.
The probability of a P/T
being hit n times by polymerase is given by a Poisson
distribution, p(n) = (µ^n/n!) exp(-µ), assuming that the
interaction between polymerase and any given P/T is independent of the
identity of the P/T and of the number of times it was previously hit.
The probability that any P/T is not extended is p(0) = exp(-µ), where µ is equal to the average
number of hits/primer, µ = -ln(p(0)). The probability that a primer is not
extended during the reaction, p(0), is measured as
the fraction of unextended primers present at the end of the reaction.
The distribution of bands provides an accurate reflection of the
probabilities that a polymerase dissociates at a particular template
site, and the band intensities reflect the probability that polymerase
either inserts or excises a right or wrong dNMP to reach a particular
template site prior to dissociation. For example, if the integrated
intensities of the bands are I(+0) = 100, I(+1) = 20, I(+2) = 10, and I(+3) = 30, then the relative probabilities of a
polymerase incorporating a given number of nucleotides during a single
hit are P(+1) = 33%, P(+2) =
17%, and P(+3) = 50%. These probabilities result
from a single polymerase·P/T encounter and do not depend on any
specific polymerization model. An explicit model is required for the interpretation of band intensities in terms of an underlying
kinetic scheme. The effect of multiple encounters between polymerase
and P/T DNA is discussed below (see "Correcting for
Multiple Hits" under "Appendix").
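The arithmetic in this paragraph can be reproduced with a short sketch. The function names are hypothetical, and applying the Poisson estimate of µ to the same four example bands is an illustrative assumption (the text gives the intensities only as an example of single-hit probabilities).

```python
import math


def mean_hits(p_unextended):
    """Average hits per primer, mu = -ln(p0), from the unextended fraction."""
    return -math.log(p_unextended)


def poisson(n, mu):
    """Probability that a P/T is hit exactly n times."""
    return mu ** n / math.factorial(n) * math.exp(-mu)


def extension_probabilities(intensities):
    """Relative probability of adding k nucleotides in a single hit.

    intensities maps k (nucleotides added) to the integrated band intensity;
    the k = 0 band (unextended primer) is excluded from the normalization.
    """
    extended = {k: v for k, v in intensities.items() if k > 0}
    total = sum(extended.values())
    return {k: v / total for k, v in extended.items()}


# Example intensities from the text: I(+0)=100, I(+1)=20, I(+2)=10, I(+3)=30.
bands = {0: 100.0, 1: 20.0, 2: 10.0, 3: 30.0}
probs = extension_probabilities(bands)

# Illustrative assumption: treat the same lane as the source of p0.
p0 = bands[0] / sum(bands.values())
mu = mean_hits(p0)

for k in sorted(probs):
    print(f"P(+{k}) = {100 * probs[k]:.0f}%")
print(f"p0 = {p0:.3f}, mu = {mu:.3f} hits/primer")
```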
Figure 8: Markov models for DNA polymerase. In all
figures, transient states are indicated by large gray circles, absorbing states by small white circles, and transitions by single arrows labeled with the rate of the transition. The polymerase is assumed
to reach the site immediately before the target site by adding the
running start nucleotides (double arrows). a, polymerase without an exonuclease. The polymerase begins in
transient state 3. It will either make a transition to absorbing state
1 by dissociating with rate k or to absorbing
state 2 by adding the next nucleotide with rate k
. b, polymerase possessing an
exonuclease. This is a minimal model for a DNA polymerase possessing an
associated 3`-exonuclease. The system begins in transient state 4, from
which it will either make a transition to absorbing state 1 with rate k
by dissociating or make a transition to
transient state 5 with rate k
by adding the
target nucleotide. If the polymerase is in transient state 5, it will
either return to transient state 4 with rate k
by excising the target nucleotide, go to absorbing state 2 with
rate k
by dissociating, or go to
absorbing state 3 with rate k
by adding the next
correct rescue nucleotide. c, polymerase possessing an
exonuclease with discrete states for the binding of the target
nucleotide. The model is similar to the one shown in b, except
that the transient states have been expanded to include association of
a dNTP (indicated by N) with a polymerase·P/T complex,
initially in transient state 3, to make a transition to transient state
4 (polymerase·P/T·dNTP complex) and dissociation of dNTP to
make the reverse transition from state 4 back to state 3. Polymerase
dissociation from either state 3 or state 4 into the absorbing state 1
will give rise to a gel band at template site T - 1. The
polymerase can incorporate N, with rate constant k
, to enter state 5. Once in state 5, the
3`-exonuclease can excise the newly incorporated N, with rate constant k
, to return to state 3, or the polymerase can
insert a rescue nucleotide, with rate constant k
, to enter the absorbing state 2, giving rise
to a gel band at template site T + 1. It is assumed that, while in
state 3, the dissociation of the polymerase is small, i.e.k
(k
+ k
).
A knowledge of koff rates at specific P/T
sites, which can be evaluated by measuring the decrease in primer
utilization as a function of time caused by polymerase
dissociation (2, 33), is not required to measure
fidelity (2). The polymerase is bound at the same correctly matched P/T terminus when making either right or wrong
insertions, and thus k
cancels from the
expression for fidelity (). The relative misincorporation
efficiency, f
, is expressed as the ratio of
apparent second order rate constants(26) ,
[(V
/K
)
/(V
/K
)
]
giving the efficiency of insertion of wrong compared with right dNMPs
at the target site (). V
/K
is given by the slope
of the linear portion of the rectangular hyperbola plot, k
versus dNTP concentration. A measurement of
the ratio I
/I
is sufficient to obtain the polymerase insertion fidelity in the
absence of proofreading.
In order to determine the relative band intensities in the presence of proofreading, we have used a Markov model to describe the system (see "Appendix" and (34)). Solution of the model requires some simple matrix manipulations and gives the final distribution of band intensities in terms of the rate constants of the individual microscopic steps in the model (see Fig. 8b). Basically, solution of the model is accomplished by determining the final distribution of states of the system after a large amount of time has elapsed, assuming all systems initially prepared were polymerases located at template position T - 1. The evolution of the ensemble of systems into the final distribution is governed by the multiplication of the vector corresponding to the initial state by a matrix that depends on the rates of transition between the states of the system. The final distribution of states when all polymerases have dissociated from their templates is identified with the gel band intensities. This corresponds to the completed hit approximation given earlier.
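The procedure described here, repeated multiplication of an initial state vector by a transition matrix until all probability sits in absorbing states, can be sketched for the five-state scheme of Fig. 8b. This is a minimal illustration rather than the Appendix derivation: it uses the jump probabilities of the embedded chain (each rate divided by the total rate of leaving that state), and the rate names and numerical values are hypothetical.

```python
def jump_matrix(k_off, k_pol, k_exo, k_off_t, k_res):
    """Row-stochastic matrix for states ordered [1, 2, 3, 4, 5] as in Fig. 8b.

    States 1-3 are absorbing (bands at T-1, T, T+1); 4 and 5 are transient.
    k_off   : dissociation at T-1 (state 4 -> 1)
    k_pol   : insertion of the target nucleotide (4 -> 5)
    k_exo   : excision of the inserted nucleotide (5 -> 4)
    k_off_t : dissociation at T (5 -> 2)
    k_res   : addition of the rescue nucleotide (5 -> 3)
    """
    d4 = k_off + k_pol
    d5 = k_exo + k_off_t + k_res
    return [
        [1.0, 0.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0, 0.0],
        [k_off / d4, 0.0, 0.0, 0.0, k_pol / d4],
        [0.0, k_off_t / d5, k_res / d5, k_exo / d5, 0.0],
    ]


def propagate(vec, mat):
    """One step of the chain: row vector times matrix."""
    n = len(vec)
    return [sum(vec[i] * mat[i][j] for i in range(n)) for j in range(n)]


def final_distribution(mat, start, tol=1e-12, max_steps=10000):
    """Iterate until the probability left in transient states 4 and 5 is < tol."""
    vec = list(start)
    for _ in range(max_steps):
        if vec[3] + vec[4] < tol:
            break
        vec = propagate(vec, mat)
    return vec


# HYPOTHETICAL rates in arbitrary units, chosen only for illustration.
rates = dict(k_off=1.0, k_pol=0.01, k_exo=10.0, k_off_t=0.1, k_res=5.0)
start = [0.0, 0.0, 0.0, 1.0, 0.0]  # every polymerase begins in state 4 (site T-1)

bands = final_distribution(jump_matrix(**rates), start)
ratio = bands[2] / bands[0]  # band at T+1 relative to band at T-1
print("bands at T-1, T, T+1:", [round(b, 6) for b in bands[:3]])
print("I(T+1)/I(T-1) =", round(ratio, 6))
```

For a scheme this small the same result follows from a closed-form expression, but the iterative form carries over unchanged to models with more states.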
The model analysis in the presence of proofreading is strongly influenced by experimental conditions. At a concentration of rescue dNTP high enough to compete effectively with polymerase dissociation at T, $(k_{res} + k_{exo}) \gg k_{off}$, the band at T was not detected above background (Fig. 3b and 4, b and d). The probability of making a transition from states 4 to 1 contributing to $I_{T-1}$ is proportional to $k_{off}(k_{exo} + k_{res})$, and the probability to go from states 4 to 3 contributing to $I_{T+1}$ is proportional to $k_{pol}k_{res}$. Thus, the ratio of T + 1 to T - 1 is
$$\frac{I_{T+1}}{I_{T-1}} = \frac{k_{pol}}{k_{off}} \cdot \frac{k_{res}}{k_{res} + k_{exo}}$$
The ratio $k_{pol}/k_{off}$ is multiplied by a factor that gives the probability that the polymerase adds a rescue nucleotide to go from T to T + 1 prior to exonucleolytic excision of the inserted nucleotide to go from T to T - 1.
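The competition between rescue and excision can be made concrete with a small numerical sketch. It assumes the band at T is negligible, and the parameter names (`k_pol`, `k_off`, `k_res`, `k_exo` for insertion, dissociation, rescue and excision rate constants) are labels chosen here for illustration; the rate values are arbitrary.

```python
def band_ratio(k_pol, k_off, k_res, k_exo):
    """I(T+1)/I(T-1) when dissociation at T is negligible.

    k_pol/k_off is the ratio expected without proofreading; it is scaled
    by the probability that rescue wins the race against excision.
    """
    rescue_probability = k_res / (k_res + k_exo)
    return (k_pol / k_off) * rescue_probability


# Arbitrary rates: raising the rescue rate (more rescue dNTP) suppresses
# proofreading and moves the ratio toward k_pol/k_off.
for k_res in (1.0, 10.0, 1000.0):
    print(k_res, band_ratio(k_pol=2.0, k_off=1.0, k_res=k_res, k_exo=3.0))
```

As the rescue rate grows, the printed ratio approaches the proofreading-free value of 2.0, which is the next nucleotide effect in this simplified picture.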
If the binding of the nucleotide is explicitly included in the model for the exonuclease (Fig. 8c), the apparent nucleotide incorporation rate ($k_{pol}$) for the model is
$$k_{pol} = \frac{k_{res}}{k_{res} + k_{exo}} \cdot \frac{V_{max}[\mathrm{dNTP}]}{K_m + [\mathrm{dNTP}]}$$
where $V_{max}$ is the $V_{max}$ for the enzyme without an exonuclease, and dNTP is the nucleotide inserted at the target site T (see ``Appendix''). The exonuclease reduces the apparent $V_{max}$ by the factor $k_{res}/(k_{res} + k_{exo})$ leaving $K_m$ unchanged. The misincorporation efficiency in the presence of proofreading is determined by varying the concentration of target dNTP (for right and wrong nucleotides separately) at a constant concentration of rescue dNTP. Values of $V_{max}/K_m$ are obtained from the linear portion of a plot of $I_{T+1}/I_{T-1}$ versus dNTP concentration.
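A short sketch of the statement that the exonuclease scales the apparent maximum velocity while leaving the Michaelis constant unchanged is given below. The function and parameter names are hypothetical, the numbers are arbitrary, and the form of the rate law is an assumption based on the description in the text rather than a transcription of the published equation.

```python
def apparent_rate(dntp, v_max, k_m, k_res, k_exo):
    """Michaelis-Menten rate scaled by the rescue-vs-excision probability."""
    rescue_probability = k_res / (k_res + k_exo)
    return rescue_probability * v_max * dntp / (k_m + dntp)


# Arbitrary illustrative parameters.
v_max, k_m, k_res, k_exo = 8.0, 10.0, 1.0, 3.0
apparent_v_max = v_max * k_res / (k_res + k_exo)   # 2.0

# Half of the apparent V_max is still reached at [dNTP] = K_m.
half_rate = apparent_rate(k_m, v_max, k_m, k_res, k_exo)
print(apparent_v_max, half_rate)

# In the linear region the slope approximates (apparent V_max) / K_m.
low = 1e-6
print(apparent_rate(low, v_max, k_m, k_res, k_exo) / low)
```

The second print illustrates why measurements are confined to low dNTP concentrations: there the rate is proportional to concentration, and the slope gives the apparent efficiency directly.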
The addition of a band at T to that at T + 1 (and beyond) is not likely to alter the kinetics of correct incorporation significantly because less than 20% of correctly inserted nucleotides are typically excised (10). In the case of incorporation of dAMP opposite T, no significant differences were observed comparing the polymerization kinetics carried out either in the absence or presence of rescue dNTP (Fig. 3a, right-hand side). In contrast, the presence of a rescue dNTP markedly affected the kinetics for incorrect incorporations (Fig. 5c).
Mutations occur nonrandomly in DNA. The types and magnitudes of base substitution and frameshift errors can vary widely depending on polymerase properties (1, 2) and local behavior of P/T DNA termini (22, 33, 35). Previously, we introduced a gel kinetic assay to measure DNA polymerase fidelity at arbitrary template sites in the absence of exonuclease proofreading (24, 25). Misincorporations of normal nucleotides are rare events that are difficult to measure in assays in which right and wrong nucleotides compete directly for insertion into DNA, even in the absence of proofreading. A gel kinetic assay, in which incorporation of wrong and right nucleotides is measured in separate reactions, is designed to avoid many of the problems inherent in a direct competition assay.
A greater latitude in experimental design and gel band interpretation is possible using proofreading-proficient polymerases in contrast to the straightforward design and interpretation of fidelity measurements in the absence of proofreading (2). In this paper, we have generalized the gel assay to measure fidelity in the presence of 3' → 5' exonuclease activity using wild-type T4 DNA polymerase. A Markov model is used to convert experimentally observed primer extension gel band intensities, arising from insertion or excision of right and wrong nucleotides, into nucleotide incorporation velocities (see Markov Model Analysis). If primer extension bands corresponding to correct and incorrect nucleotide incorporations at defined template sites are detectable by polyacrylamide gel electrophoresis (Figs. 2-6), then incorporation efficiencies, $V_{max}/K_m$ values, can be determined and the relative misincorporation efficiency (the reciprocal of incorporation fidelity) can be evaluated as the ratio of $V_{max}/K_m$ for wrong compared with right incorporations (Fig. 1).
Using T4 polymerase, containing an active proofreading exonuclease, we measured misincorporation efficiencies at several P/T sites in the presence of saturating concentrations of rescue dNTP. Compared with dTMP incorporation opposite A, dGMP·A (Fig. 4b, Rescue) and dCMP·A (Fig. 4c) misincorporation efficiencies were
approximately 3
10
and
10
, respectively. Misincorporation of dAMP opposite
A was not detected (Fig. 4d), f
(dAMP
A) < 10
. A value of f
(dGMP
A)
10
was
measured at a target A site 9 nucleotides downstream from the first A
site (Fig. 4b). A value of f
1.6
10
was determined for dTMP·T misincorporation (Fig. 3, a and b). In contrast to the high misincorporation of dTMP opposite T, misincorporation of dGMP opposite T was at least a factor of 100 less efficient at this site, using either wild-type or exonuclease-deficient mutant T4 polymerases (data not shown). It has been observed that other proofreading-deficient polymerases have considerably less difficulty making and extending G·T compared with T·T mismatches (13, 36, 37).
A defining hallmark of the next nucleotide effect is that primer extension competes with nucleotide excision, and thus proofreading should be reduced as levels of rescue dNTP are increased, leading to increased misincorporation rates (10, 31) (see Fig. 3b). Values of $V_{max}/K_m$ for misincorporation are predicted to increase quadratically with low increasing concentrations of rescue dNTP when the identities of the misincorporated and rescue nucleotides are the same (31). The kinetics of dCMP·A mispair formation (Fig. 5c) was found to agree with this prediction (data not shown). When misincorporated and rescue nucleotides differ in identity, misincorporation efficiencies are expected to increase linearly at low increasing concentrations of rescue dNTP (10, 31), as observed in Fig. 5c.
An important advantage of the Markov analysis is that
it is straightforward in principle and practice to introduce
modifications into the model. Modifications are made directly in the
transition matrix P (see and for the
proofreading model used in this paper) to incorporate changes in the
model with respect to number of states and transitions between states.
The final state populations are arrived at by simple operations
performed using P. This approach is considerably less
cumbersome, much less time consuming, and much more intuitively
understandable than formulating and solving a new series of rate
equations for each alteration made to the model. To cite a single
example, we have ignored the possibility of excision of the rescue
nucleotide in the basic model (Fig. 8b). Although there
is unlikely to be excision of the rescue nucleotide following
incorporation of a correct nucleotide opposite the target template site T, the situation could be different following a misincorporation
at T. The presence of an unstable base pair at T could
lead to excision of a ``stable'' base pair at T +1. To allow for this possibility, one would allow for a transition
from state 3 to state 5 (rate constant k), and since state 3 would now
become a transient state, the polymerase would dissociate from state 3 (k
) to a newly defined absorbing
state to create a band at T +1, or it could add another
dNMP to create a band at T +2. The remainder of the
analysis to compute the matrix NR () required to
express integrated gel band intensities in terms of the insertion and
excision parameters would be carried out as described under
``Appendix.''
As the model becomes increasingly complex, additional experiments would be required to obtain values for the newly defined rate constants, e.g. measuring integrated band intensities at several concentrations of rescue dNTP and determining polymerase dissociation rates at various template sites (see Ref. 2). However, the data reported in this paper suggest that the simple kinetic model (Fig. 8b), using gel band intensity ratios to characterize wrong and right incorporation, may be sufficient to measure polymerase fidelity in the presence of proofreading at arbitrarily chosen template locations. Two other model-based approximations, an analysis of multiple hits and exonuclease cycling, have been dealt with in the final two sections under ``Appendix.'' Multiple hits, which increase the apparent rate of synthesis by allowing primer templates that have been hit once to be re-engaged by a polymerase and further extended, have no effect on fidelity, provided that the average number of hits is similar for right and wrong insertions. The kinetics experiments can easily be designed to satisfy this condition. Exonuclease cycling occurs when a nucleotide is inserted and excised multiple times during a single hit. If a high concentration of rescue dNTP is present in the assay, then cycling is unlikely to occur for insertion of a correct nucleotide. Practically speaking, if the concentrations of the target dNTP are chosen so that the amount of incorporation for either right or wrong nucleotides remains in the linear region, then the effect of cycling on determinations of $V_{max}/K_m$ is negligible (see ``Appendix'').
The measurements carried out with T4 polymerase containing a highly active 3' to 5' exonuclease suggest that the gel assay may be useful for measuring the fidelity of a wide variety of proofreading and nonproofreading polymerases. It should be possible to use the assay in a running-start mode to measure fidelities for polymerases with subunits that greatly enhance processivity, e.g. T7 polymerase, T4, or E. coli pol III holoenzyme. Measuring the fidelity of highly processive polymerases by the gel kinetic assay will require that a careful balance be maintained between the concentrations of target and rescue dNTPs so that measurable bands at the T + 1 and T - 1 template positions can be observed.
The ``Appendix'' contains six sections. The first section describes the formal steps in the solution of a Markov model. In the second section, the solution is used to express nucleotide misincorporation efficiency, $f$, in the presence of proofreading, in terms of measured gel band intensity ratios. The third section contains a Markov model derivation of polymerase incorporation velocity in the presence of proofreading and explicit binding of the target nucleotide for the model in Fig. 8c. The fourth section presents a general model for relating the experimental $V_{max}/K_m$ values to fidelity. The fifth section analyzes the effects of multiple hits on the determination of fidelity by the kinetic assay. The sixth section contains a brief analysis of the effects of dNTP → dNMP cycling.
The main elements of a Markov model are (i) the ensemble of systems, (ii) the states of the systems of the ensemble, and (iii) the transitions between these states. Since the Markov model deals with an ensemble of systems, the actual description of the ensemble is in terms of a set of probabilities of finding any randomly selected system in any given state. This set of probabilities is given by the state vector $V(t) = [p_1(t) \ldots p_n(t)]^T$, where T denotes the transpose of the vector, n is the total number of states in the system, and $p_i(t)$ is the probability of finding a randomly chosen system in state i at time t.
The transitions are assumed to occur randomly over any given time interval dt. Transitions are between two given states, and the probability of a transfer from state i to state j in the time interval dt (assumed to be very small) is given as $P_{ij}(dt)$. Since there are n states in the system, there are $n^2$ possible transitions between these states (note that the probability of a transition from a state back to itself must be specified). These probabilities are organized into the time-independent matrix P(dt)
$$P(dt) = [P_{ij}(dt)]$$
where i and j are arbitrary states. The probabilities must meet the condition
$$\sum_{j=1}^{n} P_{ij}(dt) = 1$$
which ensures that the total probability that a system in state i makes any transition is 1. Typically, states are divided up into transient and absorbing states on the basis of the number of transitions out of the state. If there are no transitions out of a state j, then $P_{jj}(dt) = 1$, and the state is classified as an absorbing state. Otherwise, the state is classified as a transient state. The major distinction between transient and absorbing states is that in the limit of infinite time, any system will be found only in an absorbing state. The states of the systems are enumerated such that all absorbing states are listed before any transient states. Thus, if there are three states in the system and two are absorbing, the states 1 and 2 are the absorbing ones, and 3 is the transient one.
The time evolution of the state vector is given by the difference equation
$$V^T(t + dt) = V^T(t)\,P(dt)$$
In the single completed hit approximation, we are interested in the infinite time limit for the behavior of the polymerase on the template. That is, when we specify a completed hit, this is formally equivalent to determining the state vector at infinite time, where all transitions out of transient states (states in which the polymerase is still ``on the DNA'') have occurred. The formal steps to arrive at the infinite time state vector are as follows.
1) Construct the partitioned transition matrix P.
$$P = \begin{bmatrix} I & 0 \\ R(dt) & Q(dt) \end{bmatrix}$$
P has been subdivided into the identity matrix I representing transitions of absorbing states to absorbing states, the zero matrix 0 representing transitions from absorbing states to transient states, the matrix R(dt) representing transitions from transient states to absorbing states, and the matrix Q(dt) representing transitions among transient states.
2) Calculate the final state vector, i.e. the vector containing the final distribution of states for the system. The final distribution is determined by the application of an infinite number of infinitesimal transition operations to the initial state vector,
$$V^T(\infty) = \lim_{n \to \infty} V^T(0)\,P^n(dt)$$
where $P^n(dt)$ is the matrix product of P(dt) with itself n times. The value of the $P^n$ matrix after an infinite number of multiplications is given by (40)
$$P^\infty = \begin{bmatrix} I & 0 \\ NR & 0 \end{bmatrix}, \qquad N = (I - Q)^{-1} = I + Q + Q^2 + \ldots$$
The matrices R and Q are obtained from the transition matrix P. Q represents transitions between transient states, and $Q^n$ approaches 0 in the limit of infinite time, as it must to ensure the convergence of N. The matrix NR is the only ``nontrivial'' part of $P^\infty$, and represents the transitions from the initial to the final states.
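The formal steps above can be sketched in a few lines of code. The following is a minimal illustration, assuming the convention that rows index the starting state; it computes NR = (I - Q)^-1 R exactly with rational arithmetic. The helper names `invert` and `absorption_probabilities` and the toy chain are hypothetical and are not taken from the paper.

```python
from fractions import Fraction


def invert(matrix):
    """Gauss-Jordan inverse of a small square matrix of Fractions."""
    n = len(matrix)
    # Augment with the identity matrix.
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a nonzero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def absorption_probabilities(R, Q):
    """Return NR; row i gives the probabilities that a system starting in
    transient state i ends in each absorbing state."""
    n = len(Q)
    i_minus_q = [[Fraction(int(i == j)) - Q[i][j] for j in range(n)]
                 for i in range(n)]
    N = invert(i_minus_q)
    return [[sum(N[i][k] * R[k][j] for k in range(n))
             for j in range(len(R[0]))]
            for i in range(n)]


# Toy chain with two transient and two absorbing states (arbitrary numbers).
half = Fraction(1, 2)
Q = [[Fraction(0), half], [half, Fraction(0)]]
R = [[half, Fraction(0)], [Fraction(0), half]]
print(absorption_probabilities(R, Q))
```

Each row of the result sums to 1, reflecting the fact that every system eventually reaches an absorbing state.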
These concepts are illustrated using the model presented in Fig. 8a. We take as our system our earlier model for
polymerase action in the absence of exonucleolytic
proofreading(2) . The system is assumed to have three states:
state 1, polymerase dissociated from unextended DNA; state 2, DNA
extended by the polymerase; and state 3, polymerase bound to unextended
DNA. There are two transitions in this model: the ``off''
transition between states 3 and 1, where the polymerase dissociates
from unextended DNA, and the ``pol'' transition between
states 3 and 2, where the polymerase adds a base to the DNA (further
dissociation or polymerization is not considered). Thus state 3 is a
transient state and states 1 and 2 are absorbing. The rate of the off transition is $k_{off}$, and its probability in a small time interval is $k_{off}\,dt$. Similarly, the rate and probability of the polymerase transition are $k_{pol}$ and $k_{pol}\,dt$.
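Initially, the ensemble is set up so that all systems are in state 3 ($V(0) = [0\ 0\ 1]^T$). The P matrix for the system is
$$P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ k_{off}\,dt & k_{pol}\,dt & 1 - (k_{off} + k_{pol})\,dt \end{bmatrix}$$
The submatrices of this matrix are
$$R = \begin{bmatrix} k_{off}\,dt & k_{pol}\,dt \end{bmatrix}, \qquad Q = \begin{bmatrix} 1 - (k_{off} + k_{pol})\,dt \end{bmatrix}$$
The matrices N and NR are thus
$$N = \frac{1}{(k_{off} + k_{pol})\,dt}, \qquad NR = \begin{bmatrix} \dfrac{k_{off}}{k_{off} + k_{pol}} & \dfrac{k_{pol}}{k_{off} + k_{pol}} \end{bmatrix}$$
Finally, the infinite time state vector $V(\infty)$ is
$$V(\infty) = \begin{bmatrix} \dfrac{k_{off}}{k_{off} + k_{pol}} & \dfrac{k_{pol}}{k_{off} + k_{pol}} & 0 \end{bmatrix}^T$$
Thus, in the limit of infinite time, the ratio of extended to unextended DNA (states 2 and 1) is just $k_{pol}/k_{off}$, in agreement with our previous results (2).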
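The infinite time limit for this three-state example can be checked by brute force, applying the small-time-step transitions repeatedly instead of using the matrix solution. This is only a sketch; the function name, step size, and rate values are arbitrary choices made here for illustration.

```python
def final_populations(k_off, k_pol, dt=1e-3, steps=20000):
    """Iterate the three-state model starting with all systems in state 3.

    Returns the populations of state 1 (dissociated, unextended),
    state 2 (extended) and state 3 (polymerase still bound).
    """
    unextended, extended, bound = 0.0, 0.0, 1.0
    for _ in range(steps):
        # Transfers out of the transient state during one time step.
        unextended += bound * k_off * dt
        extended += bound * k_pol * dt
        bound *= 1.0 - (k_off + k_pol) * dt
    return unextended, extended, bound


unextended, extended, bound = final_populations(k_off=1.0, k_pol=3.0)
print(extended / unextended)   # close to k_pol / k_off
print(bound)                   # essentially zero after a long time
```

The ratio of extended to unextended populations does not depend on the step size, which is why dt drops out of the final expressions.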
In the model, the polymerase enters state 4 by incorporating a running-start nucleotide opposite template site T - 1. The enzyme has two choices while in state 4; it can either dissociate from the DNA with rate constant $k_{off}$ to enter state 1, resulting in a labeled gel band at position T - 1, or it can insert a nucleotide (right or wrong) opposite the target site T to enter state 5. The insertion rate, $k_{pol}$, depends on dNTP concentration in accordance with Michaelis-Menten kinetics. The enzyme has a choice of three transitions while in state 5; it can dissociate ($k_{off}$), resulting in a gel band at T (entering state 2), it can excise the newly incorporated nucleotide and return to state 4 (transition rate constant $k_{exo}$), or it can add the next correct rescue nucleotide ($k_{res}$) resulting in a gel band at T + 1 (entering state 3).
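The transition matrix for this system is
$$P = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ k_{off}\,dt & 0 & 0 & 1 - (k_{off} + k_{pol})\,dt & k_{pol}\,dt \\ 0 & k_{off}\,dt & k_{res}\,dt & k_{exo}\,dt & 1 - (k_{off} + k_{exo} + k_{res})\,dt \end{bmatrix}$$
and the matrix $N = (I - Q)^{-1}$ is
$$N = \frac{dt}{\det(I - Q)} \begin{bmatrix} k_{off} + k_{exo} + k_{res} & k_{pol} \\ k_{exo} & k_{off} + k_{pol} \end{bmatrix}$$
where the normalization factor det(I - Q) is
$$\det(I - Q) = [(k_{off} + k_{pol})(k_{off} + k_{exo} + k_{res}) - k_{pol}k_{exo}]\,dt^2$$
The final step of the analysis is to obtain the matrix NR, which contains the transitions from initial to final states.
$$NR = \frac{dt^2}{\det(I - Q)} \begin{bmatrix} k_{off}(k_{off} + k_{exo} + k_{res}) & k_{pol}k_{off} & k_{pol}k_{res} \\ k_{exo}k_{off} & (k_{off} + k_{pol})k_{off} & (k_{off} + k_{pol})k_{res} \end{bmatrix}$$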
The object of the analysis is to determine the misincorporation efficiency at a template target site T, $f$ (see ``Results''), from the integrated gel band intensities corresponding to primers extended to T - 1, T, and T + 1. The first row of NR is the probability that a primer extended by incorporation of a running-start nucleotide opposite template site T - 1 (system initially in transient state 4, Fig. 8b) reaches the states giving rise to gel bands T - 1, T, and T + 1, respectively. The final state vector is given by
$$V(\infty) = \frac{dt^2}{\det(I - Q)} \begin{bmatrix} k_{off}(k_{off} + k_{exo} + k_{res}) & k_{pol}k_{off} & k_{pol}k_{res} & 0 & 0 \end{bmatrix}^T$$
The first three elements of $V(\infty)$, representing the absorbing state populations, are proportional to the integrated band intensities $I_{T-1}$, $I_T$, and $I_{T+1}$, respectively. In the absence of proofreading and when there is no rescue dNTP present in the reaction, $k_{exo} = k_{res} = 0$, the ratio of target to previous band intensities is given by $I_T/I_{T-1} = k_{pol}/k_{off}$ (see also under ``Results''). The relationship of the gel bands at T and T - 1 to polymerase misinsertion efficiency in the absence of exonuclease is, for the most part, independent of the model used for analysis (2).
In the presence of proofreading and in the presence of rescue dNTP, the condition that the band at T be small, $I_T \ll I_{T+1}$, implies that $k_{off} \ll k_{res}$. When this condition is satisfied,
$$\frac{I_{T+1}}{I_{T-1}} = \frac{k_{pol}}{k_{off}} \cdot \frac{k_{res}}{k_{res} + k_{exo}}$$
where $k_{res}/(k_{res} + k_{exo})$ is the probability that a primer band extended to the target site is rescued by the addition of the next correct nucleotide before proofreading occurs at site T.
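The limiting cases discussed above can be checked with a small exact calculation of the probabilities of ending in each band, starting from state 4. This is a sketch under stated assumptions: the parameter names are labels chosen here, and dissociation at T is given its own parameter `k_off_t` (a hypothetical name) so that it can be set equal to, or different from, dissociation at T - 1. The rate values are arbitrary.

```python
from fractions import Fraction


def band_probabilities(k_off, k_pol, k_exo, k_res, k_off_t):
    """Probabilities of finishing in bands T-1, T and T+1 from state 4."""
    det = (k_off + k_pol) * (k_off_t + k_exo + k_res) - k_pol * k_exo
    return (k_off * (k_off_t + k_exo + k_res) / det,
            k_pol * k_off_t / det,
            k_pol * k_res / det)


F = Fraction
# No proofreading and no rescue: I(T)/I(T-1) equals k_pol/k_off.
p_prev, p_target, p_next = band_probabilities(F(1), F(2), F(0), F(0), F(1))
print(p_target / p_prev)

# Proofreading plus rescue, with slow dissociation at T: the T band is
# small and I(T+1)/I(T-1) approaches (k_pol/k_off) * k_res/(k_res + k_exo).
p_prev, p_target, p_next = band_probabilities(F(1), F(2), F(3), F(4), F(1, 1000))
print(float(p_next / p_prev), float(F(2) * F(4) / F(7)))
```

Using exact fractions avoids rounding noise when comparing the full expression with its limiting form.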
Application of the steps described in the previous section leads to an expression for the infinite time population vector
where the first and second elements are proportional to the integrated gel band intensities at sites T -1 and T +1, respectively,
Thus, the polymerization rate is
This is the expression given in (see ``Results'') showing that the effect of the exonuclease in the simple model described by Fig. 8b is to cause a reduction in the apparent maximum velocity of the incorporation reaction in the absence of exonuclease activity, leaving the Michaelis constant unchanged.
Figure 9:
General model for incorporation of a
single substrate molecule. This is a general model for chemical
processes in which the incorporation of a single substrate molecule,
not bound in a cooperative fashion, distinguishes initial material from
the final product. The initial states are represented by the leftmost two circles and the squiggly arrows between them, with the arrows representing any number of states and transitions between the
input material and state (the state in which initial material can
bind the substrate before incorporating it). The final states are
represented by the rightmost two circles and
the arrows between them. The state
is the state
immediately following the first irreversible step after the
incorporation of the substrate. The gray rectangle between states
and
represents the states between the
binding of the substrate and the irreversible step following its
incorporation.
Consider a series of chemical reactions in which initial material, e.g. P/T DNA (Fig. 9, far left circle), is converted to a final product (the two right circles) by a series of reactions in which the bottleneck is the addition of a critical substrate (denoted by the gray rectangle). Connecting the two left and two right circles are squiggly arrows that denote any number of transitions, possibly branching, between
the states denoted by the circles. State is defined as the
population of P/T DNA that have been extended to the T -1 position and remain bound to a polymerase molecule, ready to
receive a nucleotide to insert at position T. State
is
represented in the model by transient state 4 (Fig. 8b). State
is the population of P/T DNA
that have been extended to the T +1 position by
incorporation of a rescue nucleotide and are assumed to be refractory
to exonucleolytic attack. State
represents initial material that
has been irreversibly converted to final product (polymerase bound to site T after releasing the PP$_i$, assuming [PP$_i$] = 0, in the polymerase-only model, Fig. 8a, and polymerase bound to site T + 1 after incorporating the rescue base in the polymerase-exonuclease model, Fig. 8b). The irreversible step, e.g. release of PP$_i$ or DNA product, need not be the same for incorporation of right and wrong substrates.
The arrow labeled k denotes a series of
steps occurring between the binding of the incoming substrate and the
irreversible step after its incorporation. These steps occurring within
the gray rectangle (corresponding to transient state
5, Fig. 8b) include, but are not limited to,
conformational changes occurring in the insertion pathway that might
differ substantially for right and wrong substrates (32, 41, 42, 43, 44) , and
branching reactions such as exonucleolytic proofreading(26) .
At low concentrations of the substrate, k
will be the slowest step in the
entire system and will have a rate
[S], since (assuming no cooperative
interactions between the incoming substrates) the rate can always be
made to fall within the linear region (which indicates that
substrate/enzyme association rate constants dominate the reaction) by
suitable choice of substrate concentration. The arrow labeled k
denotes steps that convert final
material back to initial material and is assumed to have a rate of zero
in the absence of accumulated byproducts.
Thus, two critical
assumptions are made in the model (Fig. 9). First the
concentration of the substrate is low enough that the formation of
is a linear function of [S]. The initial
velocity is in the $V_{max}/K_m$ domain when binding of the substrate is rate-limiting. Second,
the concentration of the byproducts of the incorporation of the
substrate is zero, removing the possibility of conversion of any
formed back to
. Given these conditions, there will be a time
profile for the concentration of initial material in the
state.
If the rate of conversion from
to
is very low, then the
dynamics of
(t) will only depend on the steps occurring prior to the formation of
. The dynamics will be the same
no matter what the nature of the substrate is and is denoted
(t). The formation of
is given by the
following equations.
Note that since the value of (t`)dt` will
be the same for right and wrong dNTP substrates, the ratio of the
incorporation efficiencies is
In the model, we denote
(t)/[S] as V
/K
, which gives
where V/K
is an
experimentally observed quantity, and
is the actual rate of
conversion of the bound substrate into product.
[S] is the rate of going from
states
to
(), and
is
proportional to the probability of a single nucleotide being
incorporated per nucleotide binding event.
/
is the ratio of probabilities for
incorporation of right versus wrong nucleotides per binding
event (i.e. the fidelity); thus, measurement of $V_{max}/K_m$ ratios will give the fidelity in this general model.
Thus, the conditions of the previous section are met if it is assumed that polymerase molecules that hit a template always extend to at least position T - 1, that no band is observed at position T, that primers that are extended to T + 1 and beyond never go back, and that in a single hit, negligible amounts of T + 1 are observed. Note that if reactions for the right and wrong nucleotides are run for the same time (i.e. same number of multiple hits), using the results of the previous section, it can be seen that determination of the ratio of $V_{max}/K_m$ (from the concentration behavior of $I_{T+1}$ and $I_{T-1}$) for the correct and incorrect nucleotide in this multiple hit situation will give the misincorporation efficiency, since the assumptions behind Fig. 9 are satisfied. Only the band following the initial misincorporation (T + 1 band) can be analyzed using multiple hits since for misincorporations subsequent to the first, the
rate of reaching the
-state does depend on the identity of the
right or wrong nucleotide. This restriction, however, does not apply in the single completed hit approximation where bands downstream from T + 1 are amenable to analysis.
A simplified model of the cycling can be derived by considering Fig. 8b. Initially, the polymerase is in state 4, where it may add the target base with rate $k_{pol}$ or dissociate with rate $k_{off}$. If a nucleotide is added, then the polymerase will be at state 5 and will either excise the newly inserted nucleotide with rate $k_{exo}$ or add the next base or dissociate with combined rate $(k_{res} + k_{off})$. If the exonuclease does act, the polymerase will generate a dNMP molecule and return to state 4.
The number of polymerase cycles from state 4 to 5 and back to 4 can be counted as follows (45). First define the probability that the polymerase makes a transition from 4 to 5 as $p_{45} = k_{pol}/(k_{pol} + k_{off})$ and the probability that the polymerase goes from state 5 to 4 as $p_{54} = k_{exo}/(k_{exo} + k_{res} + k_{off})$. The probability that a polymerase makes at least n cycles during its time on the template is then $(p_{45}p_{54})^n$. The average number of cycles completed by the polymerase is the sum of the product of the number of cycles and the probability of making that number of cycles, or
$$\sum_{n} n\,(1 - p_{45}p_{54})(p_{45}p_{54})^n$$
which reduces to $p_{45}p_{54}/(1 - p_{45}p_{54})$. The ideal situation of zero cycles during a single hit is approached if either $p_{45}$ or $p_{54}$ approaches zero. The value of $p_{54}$ can be made small only if $(k_{res} + k_{off}) \gg k_{exo}$, which is hard to guarantee since one does not know what the relative values are of $k_{res}$ and $k_{exo}$. However, $p_{45} = k_{pol}/(k_{pol} + k_{off})$ can be made as small as necessary by reducing the concentration of the target nucleotide.
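The closed form for the average number of cycles can be checked against a direct summation of the series. This is a minimal sketch; the function names and rate values are hypothetical, and the same dissociation rate is assumed at both positions.

```python
def mean_cycles(k_pol, k_off, k_exo, k_res):
    """Average number of insert-and-excise cycles during a single hit."""
    p_forward = k_pol / (k_pol + k_off)          # state 4 -> state 5
    p_back = k_exo / (k_exo + k_res + k_off)     # state 5 -> state 4
    x = p_forward * p_back                       # probability of one full cycle
    return x / (1.0 - x)


def mean_cycles_by_series(x, terms=2000):
    """Sum n * P(exactly n cycles), with P(exactly n) = (1 - x) * x**n."""
    return sum(n * (1.0 - x) * x ** n for n in range(terms))


# Arbitrary rates giving a cycle probability of 0.25.
print(mean_cycles(k_pol=1.0, k_off=1.0, k_exo=2.0, k_res=1.0))
print(mean_cycles_by_series(0.25))

# Lowering the target dNTP concentration lowers k_pol and hence cycling.
print(mean_cycles(k_pol=0.01, k_off=1.0, k_exo=2.0, k_res=1.0))
```

The last line illustrates the practical point made in the text: cycling is suppressed by working at low target dNTP concentration even when the excision rate is not known.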
Thus, the rule is: to approximate the competitive conditions in a noncompetition experiment, make all measurements at low dNTP substrate concentrations. While this condition leads to a conclusion similar to that of the previous section (namely, carry out kinetic measurements only in the $V_{max}/K_m$ region of substrate concentration), in this case, the motivation is to make the series of
states between
and
the same in the noncompetition
experiment as would be the case in an experiment where right and wrong dNTP substrates are in direct competition for the polymerase.