(Received for publication, March 23, 1995; and in revised form, June 7, 1995)
We have analyzed two human immunodeficiency virus (HIV-1) reverse transcriptase mutants of helix H in the thumb subdomain suggested by x-ray crystallography to interact with the primer strand of the template-primer. These enzymes, G262A and W266A, were previously shown to have greatly elevated dissociation rate constants for template-primer and to be much less sensitive to inhibition by 3′-azidodeoxythymidine 5′-triphosphate. Here we describe their processivity and error specificity. The results reveal that: (i) both enzymes have reduced processivity and lower fidelity for template-primer slippage errors, (ii) they differ from each other in sequence-dependent termination of processive synthesis and in error specificity, and (iii) the magnitude of the mutator effect relative to wild-type enzyme for deletions in homopolymeric sequences decreases as the length of the run increases. Thus amino acid substitutions in a subdomain thought to interact with the duplex template-primer confer a strand slippage mutator phenotype to a replicative DNA polymerase. This suggests that interactions between specific amino acids and the primer stem at positions well removed from the active site are critical determinants of processivity and fidelity. These effects, obtained in aqueous solution during catalytic cycling, are consistent with and support the existing crystallographic structural model.
The human immunodeficiency virus (HIV-1) exhibits
enormous genomic variation, allowing the virus to evade the
host's immune system and to generate variants resistant to drugs
and vaccines. One suspected source of variation is low replication
fidelity by the viral reverse transcriptase (RT). Numerous studies have
shown that HIV-1 RT, which lacks proofreading exonuclease activity
(Preston et al., 1988; Roberts et al., 1988), is
inaccurate (for review, see Bebenek and Kunkel (1993)). The RT is
unusually inaccurate for single base deletions, additions, and
substitutions in homopolymeric sequences (Bebenek et al.,
1989, 1993). These errors all likely involve template-primer slippage
rather than direct base miscoding.
Analyses with a DNA template revealed a correlation between sites of termination of processive synthesis and hot spots for frameshift errors, suggesting that the association-dissociation phase of the polymerization cycle could be a critical determinant of frameshift fidelity (Bebenek et al., 1989). This idea was supported by processivity and fidelity studies with DNA substrates differing by a single base in either the single-stranded template or the double-stranded template-primer stem. Sequence changes in the duplex template-primer region up to 6 bases from the terminus influenced both termination probability (Abbotts et al., 1993) and frameshift fidelity with the wild-type RT (Bebenek et al., 1993), suggesting that important interactions occurred between this region of the template-primer and amino acid residues in the reverse transcriptase.
HIV-1 reverse transcriptase is a heterodimer of 66- and 51-kDa subunits. Crystallographic studies have revealed that both subunits consist of common subdomains designated palm, fingers, thumb, and connection subdomains (Kohlstaedt et al., 1992). The subdomains in the p66 subunit form a cleft to accommodate the nucleic acid, and amino acid residues 250-300 of the thumb subdomain interact with the double-stranded template-primer (Jacobo-Molina et al., 1993). In an attempt to identify amino acid residues important for various steps in catalysis, including DNA binding, processive synthesis, and fidelity, we are performing mutational analysis of defined regions of the thumb subdomain. The first study (Beard et al., 1994) analyzed mutants containing alanine substituted for individual residues from 253 to 271, which includes residues of helix H suggested to interact with the primer strand (Jacobo-Molina et al., 1993). Two mutants from this collection had reduced fidelity during DNA-dependent DNA synthesis. Given their other properties (Beard et al., 1994) and our earlier studies of processivity and template-primer slippage-initiated infidelity with the wild-type RT, we present here an analysis of the processivity and error specificity of the G262A and W266A mutant RTs.
Gap filling synthesis by several mutant derivatives of HIV-1
RT revealed two that gave elevated lacZ α-complementation
mutant frequencies (Beard et al., 1994). These RTs contained
an alanine substituted for either a glycine at residue 262 or a
tryptophan at residue 266. Multiple determinations demonstrated that
average lacZ mutant frequencies for products generated by the
G262A and W266A RTs are 4- and 3-fold higher than with the wild-type
enzyme, respectively (Table 1A).
Figure 1: Circular dichroism spectra of wild type and mutant RTs. Top panel, G262A; center panel, W266A; and bottom panel, wild-type RT.
In addition, fluorescence spectroscopy was applied to assess any change in spectral line shape of the overall (Tyr + Trp) RT emission, as well as the quantum yield and emission maximum of the indole fluorophores upon Trp photoselection at 297 nm. All RT sequences were found to be dominated by a Trp contribution that accounted for 93-96% of the overall fluorescence (maximum emission at 340.5 ± 1 nm) and was typical of a partially solvent-exposed Trp population peaking at 342 ± 1 nm, as described previously for the wild-type p66/p66 RT homodimer (Casas-Finet et al., 1992). This is in agreement with the similarity in CD spectra of wild-type and mutant RTs, suggesting an underlying structural match.
Primer extension reactions were performed using conditions that prevent reinitiation of synthesis on previously used template-primers (Bebenek et al., 1989, 1993), and products were resolved by denaturing PAGE. As shown in Fig. 2A, the length of DNA products synthesized by the mutant RTs is decreased relative to wild-type RT. This effect is most obvious with W266A RT; synthesis did not proceed beyond position 66, a distance of only 40 nucleotides. Although the G262A RT generated somewhat longer products, these too were shorter on average than for wild-type RT.
Figure 2: Analysis of processivity of wild-type, G262A and W266A RTs. A, products of 15-min reactions resolved in a 12% denaturing polyacrylamide gel. Lanes G, C, A, and T are sequencing markers. B, quantitation of termination probability from position 104 to 60. C, termination probabilities of mutant RTs relative to wild type (WT).
The amounts of product present at each template position from nucleotide 104 to 60 were determined for each of the three enzymes (Fig. 2B). The data are also expressed as a ratio of mutant:wild-type RT termination probability at each template nucleotide (Fig. 2C). These analyses reveal substantial and sequence-specific effects on processivity due to the single amino acid substitutions. The probability of cessation of processive synthesis increases by more than 10-fold at numerous template nucleotides (Fig. 2C), whereas effects of less than 2-fold are seen at other sites. The effects are stronger for the W266A RT than for the G262A RT. Thus, a 10-fold or greater increase was observed at 23 of 41 template sites with W266A RT, but at only 8 of 45 template sites with G262A RT. Also, although termination probability at nucleotides 71, 79, and 84 is increased less than 2-fold for the G262A RT, the increase for the W266A RT is 14-, 31-, and 16-fold, respectively, at these same three positions. Although both mutant RTs exhibit a significant increase in termination probability at the first five positions from the primer terminus, the increase at position 102 is 11-fold for the G262A RT and over 60-fold for the W266A RT.
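The quantities plotted in Fig. 2, B and C, can be illustrated with a short calculation. The sketch below assumes a definition that the text here does not spell out: the termination probability at a site is the amount of product ending at that site divided by that amount plus all longer products. The function names and band intensities are hypothetical placeholders, not data from this study.

```python
def termination_probabilities(band_intensities):
    """Return per-site termination probabilities.

    band_intensities: product amounts ordered from the shortest product
    (nearest the primer) to the longest. Assumed definition: amount at the
    site divided by the amount at the site plus all longer products.
    """
    probabilities = []
    remaining = float(sum(band_intensities))
    for amount in band_intensities:
        probabilities.append(amount / remaining if remaining > 0 else 0.0)
        remaining -= amount
    return probabilities


def relative_termination(mutant, wild_type):
    """Ratio of mutant to wild-type termination probability at each site."""
    return [m / w if w > 0 else float("inf") for m, w in zip(mutant, wild_type)]


# Hypothetical band intensities (arbitrary units), NOT values from Fig. 2.
wild_type_bands = [10, 10, 80]
mutant_bands = [50, 25, 25]

wt_p = termination_probabilities(wild_type_bands)
mut_p = termination_probabilities(mutant_bands)
ratios = relative_termination(mut_p, wt_p)
```

With this definition the final site always has a probability of 1, so ratios are informative only at sites where longer products remain for both enzymes.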
Figure 3: Error rates for wild type (WT), G262A, and W266A RTs. Error rates per detectable nucleotide polymerized are based on the mutant frequencies and sequencing data from Table 1 (and multiple mutants not shown). Calculations were performed as described (Bebenek and Kunkel, 1995). C (inset) presents error rates for deletion of a C at a CCC run at position 106-108.
The frequency of base substitutions with the G262A
and W266A mutant RTs is 3-fold higher than with wild-type RT (Table 1B). Previous studies of the wild-type enzyme (Bebenek et al., 1989, 1993) suggested that although certain
substitutions result from direct misinsertion, HIV-1 RT produces many
base substitutions by a dislocation mechanism. This involves
template-primer slippage in a homopolymeric run followed by correct
incorporation of the next nucleotide, then realignment to generate a
mispair. The best example of this is the hot spot for T → C
substitutions at position -36 (Bebenek et al., 1989).
Since direct miscoding and dislocation represent two different
mechanisms for generating base substitutions, the T → C
substitutions at position -36 were considered separately. When
these were subtracted from the analysis, the overall error rates for
the remaining substitutions were similar for the wild type and W266A
RTs (Fig. 3A), suggesting that the W266A RT is not a
mutator for substitutions by direct miscoding. The substitution rate
for the G262A mutant is 2.5-fold higher than that of the wild-type RT.
Since many of these substitutions occurred at other locations that are
also consistent with the dislocation mechanism (e.g. see Fig. 2 in Bebenek et al. (1989)), the G262A RT is at most a
slight mutator for substitutions resulting from miscoding. However, an
8-fold increase in the error rate for T → C transitions at
position -36 was observed for G262A and W266A RTs (Fig. 3B), suggesting that both are mutators for
substitutions initiated by template-primer slippage.
The frequencies of addition and deletion errors with the G262A and W266A mutant RTs are 3.7- and 2-fold higher, respectively, than with wild-type RT (Table 1B). The majority of these frameshift errors, whether produced by the wild type or mutant RTs, occur in homopolymeric sequences. The overall rate for -1 base frameshifts in runs is 7.5-fold higher for G262A and 5.3-fold higher for W266A than for wild type RT (Fig. 3C). The effect is even greater when specific homopolymeric runs are considered. For example, at a template CCC run (nucleotides 106-108), the rate for loss of one C is 30- and 50-fold higher for G262A and W266A, respectively, than for wild type RT (inset to Fig. 3C). Interestingly, only G262A RT is a mutator for +1 frameshifts (Fig. 3D).
With both mutant enzymes, the largest increase in mutant frequency relative to wild-type was for the category listed as "others" in Table 1B. The majority of these contain two widely separated single base changes, with one or both of the changes at one of the several homopolymeric run hot spots. Also included are mutants with three or more single base changes, a few larger deletions, and complex mutations involving a deletion and the addition of bases.
Figure 4: 1-Base deletion error rates from the forward mutation assay. The error rates for -1 base frameshifts in homopolymeric runs were calculated using the number of nucleotides in runs of each length (Bebenek and Kunkel, 1995) and the sequence analysis of mutants (Table 1, including multiple mutants). The number of mutants in runs of length 2, 3, 4, or 5 and all other mutants, respectively, were: for wild-type (WT) RT, 5, 14, 11, 4, and 189 mutants; for G262A RT, 2, 23, 3, 1, and 78 mutants; for W266A RT, 2, 22, 2, 3, and 73 mutants. The spectra for the two mutants were significantly different from the wild-type RT, with p < 0.002 in both cases using the Fisher exact test for 2 × 5 tables (Mehta and Patel, 1980).
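The comparison of spectra described in this legend can be sketched in a few lines. The example below is an illustration only: it computes an exact p-value for a 2 × c table by brute-force enumeration of all tables with the observed margins, which is a simpler stand-in for the network algorithm of Mehta and Patel (1980), and the function name is hypothetical. The counts are those listed in the legend.

```python
from itertools import product
from math import comb


def exact_test_2xc(row_a, row_b):
    """Exact p-value for a 2 x c contingency table.

    Sums the hypergeometric probabilities of every table with the observed
    margins that is no more probable than the observed table.
    """
    col_totals = [a + b for a, b in zip(row_a, row_b)]
    n_a = sum(row_a)
    denominator = comb(sum(col_totals), n_a)

    def weight(cells):
        # Number of ways to draw these row-A cells from the column totals.
        w = 1
        for total, cell in zip(col_totals, cells):
            w *= comb(total, cell)
        return w

    observed = weight(row_a)
    extreme = 0
    # Enumerate all but the last cell of row A; the last is fixed by the row sum.
    for head in product(*(range(t + 1) for t in col_totals[:-1])):
        last = n_a - sum(head)
        if 0 <= last <= col_totals[-1]:
            w = weight(head + (last,))
            if w <= observed:
                extreme += w
    return extreme / denominator


# Mutant counts from the legend: runs of length 2, 3, 4, 5, and all others.
wild_type = [5, 14, 11, 4, 189]
g262a = [2, 23, 3, 1, 78]
w266a = [2, 22, 2, 3, 73]

p_g262a = exact_test_2xc(g262a, wild_type)
p_w266a = exact_test_2xc(w266a, wild_type)
```

Integer weights are compared rather than floating-point probabilities, so ties with the observed table are handled exactly. Brute-force enumeration is practical here only because the table is small.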
Given that frameshift error rates increase with increasing run length with wild-type HIV-1 RT, we were surprised to see that the pattern is different for the two mutant RTs. Thus, although the average error rates are higher for both mutant RTs at all run lengths, the increases relative to wild type are higher in 3-base runs than in 4- or 5-base runs (Fig. 4). For example, the G262A and W266A RTs are 14- and 10-fold mutators at 3-base runs, respectively, but only 2.3- and 1.2-fold mutators in 4-base runs. This unusual specificity suggests that the amino acid substitutions in the mutant RTs preferentially allow intermediates having an extra template base closer to the terminus to be processed into frameshift errors. In an attempt to understand this, we next modeled the interaction of residues 262 and 266 with the template-primer, using structural information on the wild-type RT.
Figure 5: Model of helix H with template-primer DNA. A, the model depicts the position of amino acids Trp266 and Gly262 in helix H in the RT thumb subdomain relative to the duplex DNA. The template strand is green, the primer strand is red, the carbon backbone of the H and I helices is white, and amino acids Gly262 and Trp266 are yellow. Pr1 through Pr5 indicate the first five nucleotides of the primer strand starting from the 3′-OH terminus. B, diagram of misaligned template-primers in runs of T·A base pairs. These are the sequences of the substrates used for the reversion analyses (Fig. 6). The asterisks indicate the positions of residues Trp266 and Gly262 relative to the template-primer.
Figure 6: 1-Base deletion mutant frequencies from reversion assays. A, mutant frequencies in runs of T·A base pairs of length 3, 4, 5, and 7. B, -fold increase in mutant frequencies for the G262A and W266A RTs relative to the wild-type (WT) enzyme for runs of increasing length.
This modeling is
consistent with possible interactions between the aromatic side chain
of residue 266 and the backbone of the primer strand that may reduce
the chance of formation or utilization of mutational intermediates with
an unpaired base located between the second and third base pairs from
the primer terminus (e.g. see Fig. 5B).
Altering these interactions by changing the aromatic side chain to the
methyl group side chain of alanine may lead to the strong mutator
phenotype in 3-base runs. Since template-primer slippage in longer runs (Fig. 5B) would result in intermediates with an extra
base further away from the primer terminus, these would no longer be in
the immediate proximity of the side chain at 266. In these instances,
altering the side chain would have a lesser impact on fidelity. A
similar logic would predict that the G262A mutator RT would enhance the
frameshift error rate even in longer runs, because Gly262 is
positioned further away from the terminus. Although alternative
explanations are of course possible (see "Discussion"),
this hypothesis that the magnitude of the frameshift mutator effect
varies as a function of the position of a particular amino acid residue
relative to the position of an extra base in a misaligned homopolymeric
run prompted a more controlled examination of the relationship between
error rate and run length.
Gap-filling reactions were performed with the three RTs and each of the four substrates. The reversion frequencies of the resulting products for each substrate/RT combination are presented in Fig. 6A. Both mutant RTs are frameshift mutators, having lower fidelity than does the wild type enzyme with all four reversion substrates. As with the forward assay, the G262A enzyme is less accurate than the W266A RT. With the wild-type and G262A RTs, fidelity decreased as the length of the run increased from 3 to 7. With the W266A RT, the mutant frequency for the 3-base run is slightly higher than with the 4-base run. The mutant frequency increased substantially with a 5-base run but no further with the 7-base run.
Consistent with the results from the forward assay, the increases in reversion frequencies with the mutant RTs relative to the wild-type RT are again highest in the 3- and 4-base runs (Fig. 6B). With the W266A RT, the mutator effect diminishes as the run length increases. With the G262A mutant, the mutator effect is highest with the 4-base run, slightly lower with the 3-base run and substantially less with the 5- and 7-base runs.
Termination of processive synthesis by the wild-type RT is
critically dependent on the DNA sequence of the first 6 base pairs of
the duplex primer stem and single base pair differences in this region
strongly affect both processivity (Abbotts et al., 1993) and
frameshift fidelity (Bebenek et al., 1993). The present study
reinforces the functional importance of these protein-template-primer
interactions by showing that RT mutants containing single amino acid
changes in helix H of the thumb domain and having greatly
elevated dissociation rate constants for template-primer (Beard et
al., 1994) also have reduced processivity (Fig. 2) and
increased strand slippage error rates (Figs. 3, 4, and 6) during
DNA-dependent DNA synthesis. These effects are observed during
catalysis in aqueous solution and support the structural model for the
RT·template-primer complex derived by x-ray crystallography
(Jacobo-Molina et al., 1993). They suggest that interactions
between specific amino acid residues and the duplex template-primer at
positions well removed from the active site (Fig. 5) are
critical determinants of DNA binding, fidelity, and processivity. In
support of the functional role of thumb subdomain residues, mutant
forms of T7 RNA polymerase bearing deletions in this subdomain have
reduced processivity during transcription (Bonner et al.,
1994).
Earlier observations with the wild-type RT revealed a correlation between frameshift fidelity and processivity. Homopolymeric runs that were frameshift error hot spots were sites where the probability of cessation of processive synthesis was high (Bebenek et al., 1989). Moreover, sequence changes that altered processivity concomitantly altered frameshift fidelity (Bebenek et al., 1993). The present study extends this correlation by examining the effects of alterations in the protein rather than the substrate. The results reinforce the relationship between processivity and fidelity, since both mutant RTs have concomitant reductions in both properties.
Differences between the two mutant RTs in the magnitude and specificity of effects on frameshift fidelity and processivity reinforce another interpretation of our earlier studies. In contrast to proteins that bind tightly to specific DNA sequences (e.g. Lac repressor), DNA polymerases have generally been considered sequence nonspecific binding proteins, as required for their roles in replication and repair of heterogeneous sequences. However, here as in earlier studies using altered template-primers (Abbotts et al., 1993; Bebenek et al., 1989, 1993), 10-100-fold differences in fidelity and processivity have been observed that depend on the template-primer sequence or on the amino acid sequence of the polymerase. Although these large differences may reflect interactions between the protein and the sugar-phosphate backbone more than between the protein and bases, they nonetheless illustrate that the RT does strongly respond to the structure of the substrate as defined by its base sequence. Thus, although HIV-1 RT is clearly not a sequence specific binding protein in the sense used for Lac repressor (for example), the fidelity and processivity specificity data demonstrate that DNA polymerases strongly respond to sequence differences and should not be considered sequence nonspecific binding proteins.
DNA
binding, fidelity, and processivity are affected to different extents
in the mutant RTs. The dissociation rate constant for template-primer
is elevated more for W266A than for G262A (Beard et al.,
1994), and the average processivity is reduced more for W266A than for
G262A (Fig. 2A). The probability of termination of
processive synthesis is a complex function of both the RT and the
sequence of the template-primer (see text under "Results"
and Fig. 2B). The G262A RT has generally lower
frameshift fidelity than does W266A (Table 1; Fig. 3C; Fig. 4, runs of 2, 3, and 4 bases; Fig. 6A) but not at all homopolymeric runs (Fig. 3C (inset), Fig. 4, run of 5
bases). We also note that G262A RT is a mutator for +1 base
errors, whereas W266A RT is not (Fig. 3D). For such
errors, the extra nucleotide is in the primer strand, as opposed to
being in the template strand for -1 errors. The idea that amino
acid substitutions in the H helix, which is in close proximity to the
primer strand, might selectively affect plus-one frameshift error rate
is obviously not sufficient to explain the observation. All these
specificities suggest that the interactions between the wild-type RT
and the template-primer occurring during polymerization that determine
frameshift error rate and processivity are complex. Moreover,
replacement of the large aromatic side chain of tryptophan 266 with a
methyl group likely represents a very different circumstance from
replacing the hydrogen of glycine 262 with a methyl group. The model
presented in Fig. 5 is consistent with the absence of important
side chain interactions when alanine is substituted for
Trp266, but the presence of unfavorable interactions when
alanine is substituted for glycine. These possibilities can be probed
by analyzing additional amino acid substitutions at these two
positions.
The RT is a dimeric enzyme so that the amino acid substitutions are present in both subunits of the homodimer. The DNA duplex in the co-crystal with the RT (Jacobo-Molina et al., 1993) was not long enough to determine whether helix H in the p51 subunit interacts with the DNA. Thus, it is formally possible that the effects could be due to changes in the p66 subunit that corresponds to p51 in the heterodimer. We do not favor this possibility given the strong mutator effects in short homopolymeric sequences, which are consistent with alterations in enzyme-template-primer interactions close to the primer terminus.
Both mutant RTs are mutators for
frameshift errors in homopolymeric runs. Their concomitant mutator
activity for T → C substitutions at position -36 (Fig. 3B) further supports the earlier suggestion that
most of the substitutions at this template position are initiated by
template-primer slippage rather than by misinsertion (Bebenek et
al., 1989). The substitution results because, after template
slippage followed by incorporation of one more correct nucleotide,
realignment occurs to generate the terminal mispair responsible for the
substitution. We are currently examining this dislocation hypothesis by
determining the base substitution specificity of the mutant RTs using
the altered substrates employed to support the model with the wild-type
RT (Bebenek et al., 1993). Excluding these T → C
substitutions, the data in Fig. 3A suggest that the
W266A and G262A mutant RTs are not substantially less accurate for base
substitutions thought to result from direct misinsertion. With Klenow
DNA polymerase, a misinsertion mutator phenotype (Carroll et
al., 1991) results from an amino acid substitution in the O helix
of the fingers subdomain. Similar base substitution mutators may
eventually be found by altering amino acids in other regions of HIV-1
RT.
Crystallographic analysis of the wild-type RT·duplex DNA
complex (Jacobo-Molina et al., 1993) revealed that the
template-primer conforms more closely to A-form DNA near the polymerase
active site and to B-form DNA near the RNase H active site.
Interestingly, there is a 40-45° bend in the DNA at the
transition point (Fig. 5). This is distributed over about 4 base
pairs in the vicinity of contacts with helix H in the p66 subunit. An
intriguing possibility is that this bend is related in some way to the
high frameshift infidelity of the wild-type HIV-1 RT. It may also
relate to the observation that the magnitude of the frameshift mutator
effect is larger in homopolymeric runs of three or four nucleotides
than in runs of five or seven nucleotides (Fig. 4 and Fig. 6). The bend in the substrate imposed by the RT may place
different constraints on unpaired nucleotides in short versus long runs.
A second, but not mutually exclusive, hypothesis is
that the magnitude of the frameshift mutator effect might vary as a
function of the position of a particular amino acid residue relative to
the position of the unpaired base in a misaligned homopolymeric run.
The logic is as follows. With wild-type HIV-1 RT (Fig. 4, top panel; Fig. 6), as with two other replicative DNA
polymerases (DNA polymerase (Kunkel, 1990) and T7 DNA polymerase
(Kunkel et al., 1994)), the frameshift error rate increases as
the length of the homopolymeric run increases. This trend suggests
that, of the many misaligned substrates possible within a run, the one
that is extended by the polymerase to seal the misalignment in place
contains the unpaired nucleotide as far back in the homopolymeric
duplex primer stem as possible (Fig. 5B). This will not
only be the most stable intermediate, but it will also place the
unpaired nucleotide the greatest distance from the polymerase active
site. Thus, template-primer slippage in a homopolymeric run of four
template T residues could yield an intermediate with an unpaired
base 3 base pairs away from the primer terminus (Fig. 5B). Interactions between the large side chain of
tryptophan 266 occurring with the sugar-phosphate backbone at the third
template-primer position (as suggested by the modeling in Fig. 5A) may reduce the chance of formation or
utilization of a misaligned intermediate with an unpaired base in this
region of the template-primer. This is consistent with the frameshift
mutator phenotype of the W266A RT, where replacement of the large
aromatic side chain with a methyl group may lessen constraints normally
imposed on a misaligned intermediate. However, slippage in a 7-base run
would yield an intermediate with an unpaired base 6 base pairs away
from the primer terminus, a location far enough away to be unaffected
by interactions between Trp266 and the template-primer. This
is consistent with the observations that the W266A mutant reduces
frameshift fidelity more in short runs than in long runs (Fig. 4 and Fig. 6). A similar logic suggests that the
G262A mutator RT might more strongly enhance slippage errors in longer
homopolymeric runs than would W266A, since its position in helix H is
further from the 3′-OH terminus of the primer (Fig. 5). In favor
of this possibility, we note that the G262A change leads to a slightly
greater mutator effect in a 4-base run than in a 3-base run (Fig. 6B), while the reverse is true for the W266A
change. However, this difference is subtle and alternative explanations
are possible. For example, glycine 262 may have a role in defining the
overall structure of helix H, such that replacement with any side chain
will have a more global effect on helix structure and/or helix-DNA
interactions. The importance of the bend and the hypothesis that the
magnitude of the frameshift mutator effect varies as a function of the
position of a particular amino acid residue relative to the position of
the unpaired base can be examined by determining frameshift fidelity
with additional substrates and/or mutant RTs containing amino acid
substitutions at positions thought to interact closer to or farther
from the primer terminus. It will also be interesting to determine if
amino acid substitutions in secondary structural elements of other DNA
polymerase thumbs affect processivity or frameshift fidelity. The H
helix has recently been proposed to be part of a "helix
clamp" common to a number of nucleic acid polymerases (Hermann et al., 1994).
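The positional argument above can be made concrete with a small sketch. It assumes, as a generalization of the two cases given in the text (a 4-base run gives an unpaired base 3 base pairs from the primer terminus, a 7-base run gives one 6 base pairs away), that the most stable intermediate in a run of n bases places the unpaired base n - 1 base pairs from the terminus. The contact position assigned to Trp266 is taken from the modeling in Fig. 5A. The names and the offset measure are hypothetical and carry no quantitative weight.

```python
def unpaired_base_distance(run_length):
    """Base pairs between the primer terminus and the unpaired template base
    when the misalignment sits as far back in the run as possible
    (assumed rule: run_length - 1)."""
    if run_length < 2:
        raise ValueError("a homopolymeric run needs at least 2 bases")
    return run_length - 1


# Assumption for illustration: Trp266 contacts the third template-primer
# position, as suggested by the modeling in Fig. 5A.
TRP266_CONTACT_POSITION = 3

# Offset (in base pairs) between the unpaired base and the assumed contact,
# for the run lengths used in the reversion assays.
offsets = {
    n: abs(unpaired_base_distance(n) - TRP266_CONTACT_POSITION)
    for n in (3, 4, 5, 7)
}
```

Under these assumptions the unpaired base lies at or next to the assumed Trp266 contact in 3- to 5-base runs and 3 base pairs beyond it in a 7-base run.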
Substitution of alanine for tryptophan at 266 and glycine at 262 had only modest effects on the rate of catalysis or dNTP binding (Beard et al., 1994), yet had substantial effects on DNA binding, processivity, and frameshift fidelity. These observations are consistent with the possibility that mutant DNA polymerases might arise in vivo that are sufficiently active to fulfill their roles in replication, recombination, or repair but do so with reduced frameshift fidelity. This offers one explanation for the instability of repetitive sequences that are associated with cancer and neurodegenerative diseases (for review, see Loeb (1994) and Willems (1994)).