©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
Reduced Frameshift Fidelity and Processivity of HIV-1 Reverse Transcriptase Mutants Containing Alanine Substitutions in Helix H of the Thumb Subdomain (*)

(Received for publication, March 23, 1995; and in revised form, June 7, 1995)

Katarzyna Bebenek (1) William A. Beard (3) Jose R. Casas-Finet (4)(§) Hyeung-Rak Kim (3) Thomas A. Darden (2) Samuel H. Wilson (3) Thomas A. Kunkel (1)(¶)

From the  (1)Laboratory of Molecular Genetics, (2)Laboratory of Quantitative and Computational Biology, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709, the (3)Sealy Center for Molecular Science, University of Texas Medical Branch, Galveston, Texas 77555-1068, and the (4)Structural Biochemistry Program, PRI/DynCorp, National Cancer Institute, Frederick Cancer Research and Development Center, Frederick, Maryland 21702-1201

ABSTRACT

We have analyzed two human immunodeficiency virus (HIV-1) reverse transcriptase mutants of helix H in the thumb subdomain suggested by x-ray crystallography to interact with the primer strand of the template-primer. These enzymes, G262A and W266A, were previously shown to have greatly elevated dissociation rate constants for template-primer and to be much less sensitive to inhibition by 3′-azidodeoxythymidine 5′-triphosphate. Here we describe their processivity and error specificity. The results reveal that: (i) both enzymes have reduced processivity and lower fidelity for template-primer slippage errors, (ii) they differ from each other in sequence-dependent termination of processive synthesis and in error specificity, and (iii) the magnitude of the mutator effect relative to wild-type enzyme for deletions in homopolymeric sequences decreases as the length of the run increases. Thus amino acid substitutions in a subdomain thought to interact with the duplex template-primer confer a strand slippage mutator phenotype to a replicative DNA polymerase. This suggests that interactions between specific amino acids and the primer stem at positions well removed from the active site are critical determinants of processivity and fidelity. These effects, obtained in aqueous solution during catalytic cycling, are consistent with and support the existing crystallographic structural model.


INTRODUCTION

The human immunodeficiency virus (HIV-1)(^1) exhibits enormous genomic variation, allowing the virus to evade the host's immune system and to generate variants resistant to drugs and vaccines. One suspected source of variation is low replication fidelity by the viral reverse transcriptase (RT). Numerous studies have shown that HIV-1 RT, which lacks proofreading exonuclease activity (Preston et al., 1988; Roberts et al., 1988), is inaccurate (for review, see Bebenek and Kunkel (1993)). The RT is unusually inaccurate for single base deletions, additions, and substitutions in homopolymeric sequences (Bebenek et al., 1989, 1993). These errors all likely involve template-primer slippage rather than direct base miscoding.

Analyses with a DNA template revealed a correlation between sites of termination of processive synthesis and hot spots for frameshift errors, suggesting that the association-dissociation phase of the polymerization cycle could be a critical determinant of frameshift fidelity (Bebenek et al., 1989). This idea was supported by processivity and fidelity studies with DNA substrates differing by a single base in either the single-stranded template or the double-stranded template-primer stem. Sequence changes in the duplex template-primer region up to 6 bases from the terminus influenced both termination probability (Abbotts et al., 1993) and frameshift fidelity with the wild-type RT (Bebenek et al., 1993), suggesting that important interactions occurred between this region of the template-primer and amino acid residues in the reverse transcriptase.

HIV-1 reverse transcriptase is a heterodimer of 66- and 51-kDa subunits. Crystallographic studies have revealed that both subunits consist of common subdomains designated palm, fingers, thumb, and connection subdomains (Kohlstaedt et al., 1992). The subdomains in the p66 subunit form a cleft to accommodate the nucleic acid, and amino acid residues 250-300 of the thumb subdomain interact with the double-stranded template-primer (Jacobo-Molina et al., 1993). In an attempt to identify amino acid residues important for various steps in catalysis, including DNA binding, processive synthesis, and fidelity, we are performing mutational analysis of defined regions of the thumb subdomain. The first study (Beard et al., 1994) analyzed mutants containing alanine substituted for individual residues from 253 to 271, which includes residues of helix H suggested to interact with the primer strand (Jacobo-Molina et al., 1993). Two mutants from this collection had reduced fidelity during DNA-dependent DNA synthesis. Given their other properties (Beard et al., 1994) and our earlier studies of processivity and template-primer slippage-initiated infidelity with the wild-type RT, we present here an analysis of the processivity and error specificity of the G262A and W266A mutant RTs.


EXPERIMENTAL PROCEDURES

Materials

Bacterial strains, phage, and other materials have been described (Bebenek and Kunkel, 1995), as have the wild type and mutant RTs (p66/p66 homodimers; Beard et al., 1994). These enzymes had at least 10-fold lower 3′→5′ exonuclease activity than does Klenow polymerase (Beard et al., 1994).

Fidelity Assays

The forward assay scores errors in the lacZ alpha gene in M13mp2 (Bebenek and Kunkel, 1995). Correct gap-filling polymerization produces DNA that yields blue plaques upon transfection and plating of an Escherichia coli strain. Errors are scored as lighter blue or colorless plaques. Reversion assays utilize substrates encoding colorless plaque phenotypes. Errors that restore alpha-complementation activity are detected as blue plaques. Here we used substrates that score 3n - 1 errors, primarily -1 base frameshifts (see below).(^2)

DNA Polymerase Reactions for Fidelity Studies

Reactions (25 µl) containing 20 mM Hepes (pH 7.8), 2 mM dithiothreitol, 10 mM MgCl2, all four dNTPs at 1 mM, 32 fmol of gapped DNA, and 1-4.6 pmol of RT were incubated for 1 h at 37 °C and terminated by adding EDTA to 15 mM. Aliquots (20 µl) were analyzed by agarose electrophoresis to ensure complete gap filling. All reactions generated products that migrated coincident with nicked, double-stranded DNA.

Termination Probability Analysis

Processivity was measured on an M13mp2 DNA template primed with a 32P-5′-end-labeled 15-mer (Bebenek et al., 1993). Reactions were as above except that they contained a severalfold excess of DNA over enzyme and were incubated at 37 °C for 5-30 min. Aliquots were removed and mixed with an equal volume of 99% formamide, 5 mM EDTA, 0.1% xylene cyanole, 0.1% bromphenol blue. Products were analyzed by electrophoresis in a 12% polyacrylamide gel, in parallel with sequencing markers generated with the same template. Product bands were quantitated by phosphorimagery, and the termination probability at each site was expressed as a percentage: the ratio of products at that site to the products at that site plus all longer products.
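The termination probability defined above can be sketched in a few lines. This is a minimal illustration, assuming band intensities are supplied in order from the shortest product to the longest; the function name and the intensity values are hypothetical and are not data from this study.

```python
def termination_probabilities(band_intensities):
    """Return the percent termination probability at each site.

    band_intensities: product amounts ordered from the shortest product
    (nearest the primer) to the longest. At each site the probability is
    the amount at that site divided by the amount at that site plus all
    longer products.
    """
    probabilities = []
    remaining = sum(band_intensities)  # products at this site plus all longer ones
    for amount in band_intensities:
        if remaining > 0:
            probabilities.append(100.0 * amount / remaining)
        else:
            probabilities.append(0.0)
        remaining -= amount
    return probabilities


# Hypothetical band intensities in arbitrary phosphorimager units.
bands = [10.0, 30.0, 20.0, 40.0]
probs = termination_probabilities(bands)
```

Under this definition the longest observed product always has a termination probability of 100%, because no longer products exist to be counted in the denominator.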

Circular Dichroism and Fluorescence Spectroscopy

CD spectra were acquired at room temperature on a Jasco J-720 spectropolarimeter. Protein solutions were between 0.4 and 2.2 mg/ml in 50 mM Tris-HCl (pH 7.5) containing 100 mM NaCl, 1 mM EDTA, and 1 mM dithiothreitol and were diluted to 0.08-0.21 mg/ml in 20 mM Tris-HCl (pH 8.1) containing 5 mM MgCl2, 1 mM EDTA, and 1 mM 2-mercaptoethanol. Near-UV (340-250 nm) and far-UV (260-190 nm) spectra were obtained in a single pass, with a 1-nm bandwidth, a 16-s time constant, and a data density of 10 points/nm, using a 50-µm path length demountable quartz Suprasil cuvette closed at both ends (Uvonic Instruments, Plainview, NY). Buffer background was measured in a sample cuvette and subtracted to generate net spectra. Spectra were acquired in triplicate and used when two scans were superimposable over the entire wavelength range, except for nonrandom spike features below 200 nm. Neither smoothing nor averaging of replicate data sets was applied. Since variability in the deep UV region among multiple determinations with one form of RT was as great as the variation between a mutant and the wild-type RT spectrum, no attempt was made to infer structural information from the observed spectral features in this region. Fluorescence spectra were acquired as described previously (Casas-Finet et al., 1991).

Model Building

Coordinates for the protein alpha-carbons and DNA phosphates were provided by E. Arnold (Rutgers University). To build a full DNA model, a succession of segments of 3 base pairs was built. For each segment, ideal A-form and B-form DNA was generated using the Nucgen module of the AMBER package (Weiner et al., 1986). These were superimposed on the experimentally determined phosphate positions and the closer fitting segment chosen. The first segment, in the RT active site, was better fit by A-form DNA, whereas the remainder was better fit by B-form DNA. After the full DNA model was built in this fashion, it was energy-minimized using AMBER while restraining the phosphate positions. To build the full protein model, the program WHATIF (Vriend, 1990) was used. This relies on a combination of data base fragments for backbone positions and a rotamer library for side chain positions. This method resulted in a number of severe steric conflicts between some side chain pairs that were distant in primary sequence. These were resolved manually using the MULTI program (Darden et al., 1991). The DNA and protein models were then energy-minimized together, using AMBER, while restraining the alpha-carbon and phosphate positions.


RESULTS

Gap filling synthesis by several mutant derivatives of HIV-1 RT revealed two that gave elevated lacZ alpha-complementation mutant frequencies (Beard et al., 1994). These RTs contained an alanine substituted for either a glycine at residue 262 or a tryptophan at residue 266. Multiple determinations demonstrated that average lacZ mutant frequencies for products generated by the G262A and W266A RTs are 4- and 3-fold higher than with the wild-type enzyme, respectively (Table 1A).



Structural Analyses by Circular Dichroism and Fluorescence Spectroscopy

To determine whether the reduced accuracy of the mutant RTs might result from altered structure, we examined their structure by circular dichroism. The CD spectrum of the G262A RT mutant (Fig. 1, top panel) exhibited a prominent band of negative ellipticity between 260 and 202 nm, peaking at 219 nm and showing a shoulder of comparable molar ellipticity at 211 nm, followed by a positive ellipticity narrow band that peaked at 197 nm. These spectral features are almost identical to those observed for the wild-type p66/p66 RT homodimer (Fig. 1, bottom panel; also see Goel et al. (1993)), including extremes at 218.5, 211.5, and 196.5 nm and crossovers at 202 and 187.5 nm. The CD spectrum of the W266A RT mutant (Fig. 1, center panel) also has a dominant negative ellipticity band, albeit marginally blue-shifted (peak at 217 nm, shoulder at 210.5 nm, crossover at 200 nm) compared with the G262A RT spectrum. It had a somewhat less intense positive ellipticity band peaking at 197 nm that trailed into the deep UV region (as seen for the wild-type RT). All three RTs had near-UV CD spectra dominated by a Cotton effect observed between 280 and 300 nm, assigned to Trp chromophores (not shown). Overall, these results suggest that the mutant RTs are similar to the wild-type enzyme in both secondary and tertiary structure.


Figure 1: Circular dichroism spectra of wild type and mutant RTs. Top panel, G262A; center panel, W266A; and bottom panel, wild-type RT.



In addition, fluorescence spectroscopy was applied to assess any change in spectral line shape of the overall (Tyr + Trp) RT emission, as well as the quantum yield and emission maximum of the indole fluorophores upon Trp photoselection at 297 nm. All RT sequences were found to be dominated by a Trp contribution that accounted for 93-96% of the overall fluorescence (maximum emission at 340.5 ± 1 nm) and was typical of a partially solvent-exposed Trp population peaking at 342 ± 1 nm, as described previously for the wild-type p66/p66 RT homodimer (Casas-Finet et al., 1992). This is in agreement with the similarity in CD spectra of wild-type and mutant RTs, suggesting an underlying structural match.

Analysis of Processivity of the Mutant RTs

We were interested in determining whether the two mutator RTs have altered processivity because both have greatly increased dissociation rate constants for template-primer (Beard et al., 1994). Moreover, the probability of termination of processive synthesis is affected by the sequence of the template-primer from one to six nucleotides away from the 3′ terminus (Abbotts et al., 1993), and this is the region of the DNA suggested (Jacobo-Molina et al., 1993) to interact with residues of alpha helix H of RT, containing amino acids 262 and 266. Finally, with the wild-type RT, a correlation exists between the probability of termination of processive synthesis and frameshift fidelity (Bebenek et al., 1989, 1993). Thus we performed a quantitative analysis of the probability of termination of processive synthesis with the lacZ template.

Primer extension reactions were performed using conditions that prevent reinitiation of synthesis on previously used template-primers (Bebenek et al., 1989, 1993), and products were resolved by denaturing PAGE. As shown in Fig. 2A, the length of DNA products synthesized by the mutant RTs is decreased relative to wild-type RT. This effect is most obvious with W266A RT; synthesis did not proceed beyond position 66, a distance of only 40 nucleotides. Although the G262A RT generated somewhat longer products, these too were shorter on average than for wild-type RT.


Figure 2: Analysis of processivity of wild-type, G262A and W266A RTs. A, products of 15-min reactions resolved in a 12% denaturing polyacrylamide gel. Lanes G, C, A, and T are sequencing markers. B, quantitation of termination probability from position 104 to 60. C, termination probabilities of mutant RTs relative to wild type (WT).



The amounts of products present at each template position from nucleotide 104 to 60 were determined for each of the three enzymes (Fig. 2B). The data are also expressed as a ratio of mutant:wild-type RT termination probability at each template nucleotide (Fig. 2C). These analyses reveal substantial and sequence-specific effects on processivity due to the single amino acid substitutions. The probability of cessation of processive synthesis increases by more than 10-fold at numerous template nucleotides (Fig. 2C), whereas effects of less than 2-fold are seen at other sites. The effects are stronger for the W266A RT than for the G262A RT. Thus, a 10-fold or greater increase was observed at 23 of 41 template sites with W266A RT, but at 8 of 45 template sites with G262A RT. Also, although termination probability at nucleotides 71, 79, and 84 is increased less than 2-fold for the G262A RT, the increase for the W266A RT is 14-, 31-, and 16-fold, respectively, at these same three positions. Although both mutant RTs exhibit a significant increase in termination probability at the first five positions from the primer terminus, the increase at position 102 is 11-fold for G262A enzyme and over 60-fold for the W266A RT.
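The mutant:wild-type comparison described above amounts to a per-position ratio followed by a count of sites exceeding a threshold. The sketch below illustrates this under the assumption that termination probabilities are held in dictionaries keyed by template position; the helper names, positions, and values are hypothetical and are not taken from Fig. 2.

```python
def fold_changes(mutant, wild_type):
    """Map template position -> mutant:wild-type termination probability ratio.

    Positions missing from either enzyme, or with a wild-type value of
    zero, are skipped because no ratio can be formed for them.
    """
    return {
        position: mutant[position] / wild_type[position]
        for position in mutant
        if position in wild_type and wild_type[position] > 0
    }


def count_at_least(ratios, threshold=10.0):
    """Count template positions whose ratio meets or exceeds the threshold."""
    return sum(1 for ratio in ratios.values() if ratio >= threshold)


# Hypothetical percent termination probabilities at three template positions.
wild_type_probs = {90: 1.0, 91: 4.0, 92: 0.5}
mutant_probs = {90: 12.0, 91: 6.0, 92: 10.0}

ratios = fold_changes(mutant_probs, wild_type_probs)
sites_10_fold = count_at_least(ratios, threshold=10.0)
```

Skipping positions with a zero wild-type value is a design choice of this sketch; it may be why the number of scorable sites can differ between enzymes.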

Analysis of Mutational Specificity

The mutant frequency data in Table 1A represent the spectrum of many types of errors at many sites. To determine whether the lower fidelity of the mutant RTs results from a greater difference in rate for a specific subset of errors, we examined the DNA sequence of nucleotides -84 through +180 of the lacZ gene. We analyzed 223, 107, and 102 M13mp2 mutants generated in reactions with the wild-type, G262A, and W266A RTs, respectively (Table 1B). With knowledge of the number of detectable sites in the reporter gene for errors (Bebenek and Kunkel, 1995), these data can be used to calculate error rates per detectable nucleotide polymerized for specific subsets of errors (Fig. 3).
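One way to picture the calculation of an error rate per detectable nucleotide is sketched below. It assumes that the rate for an error class is the overall mutant frequency multiplied by the fraction of sequenced mutants in that class, divided by the number of detectable sites and by a correction for the probability that an error is expressed after transfection. The function, its parameter names, the correction value, and all numbers are illustrative assumptions; the calculation actually used is described in Bebenek and Kunkel (1995).

```python
def error_rate(mutant_frequency, class_count, total_sequenced,
               detectable_sites, expression_fraction):
    """Estimate an error rate per detectable nucleotide polymerized.

    mutant_frequency:    overall mutant frequency for the reaction products
    class_count:         sequenced mutants that fall in the error class
    total_sequenced:     all sequenced mutants
    detectable_sites:    template sites at which this class can be scored
    expression_fraction: assumed probability that an error is expressed
    """
    if total_sequenced <= 0 or detectable_sites <= 0 or expression_fraction <= 0:
        raise ValueError("counts, sites, and expression fraction must be positive")
    class_fraction = class_count / total_sequenced
    return (mutant_frequency * class_fraction) / (detectable_sites * expression_fraction)


# Hypothetical inputs chosen only to make the arithmetic easy to follow.
rate = error_rate(
    mutant_frequency=0.012,   # 1.2% of plaques are mutant (assumed)
    class_count=25,           # mutants in the class of interest (assumed)
    total_sequenced=100,      # mutants sequenced (assumed)
    detectable_sites=50,      # scorable sites for this class (assumed)
    expression_fraction=0.6,  # assumed expression correction
)
```

With these made-up inputs the class accounts for a quarter of the mutants, so the class-specific frequency is 0.003, spread over 50 sites and corrected for expression.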


Figure 3: Error rates for wild type (WT), G262A, and W266A RTs. Error rates per detectable nucleotide polymerized are based on the mutant frequencies and sequencing data from Table 1 (and multiple mutants not shown). Calculations were performed as described (Bebenek and Kunkel, 1995). The inset in C presents error rates for deletion of a C at a CCC run at position 106-108.



The frequency of base substitutions with the G262A and W266A mutant RTs is 3-fold higher than with wild-type RT (Table 1B). Previous studies of the wild-type enzyme (Bebenek et al., 1989, 1993) suggested that although certain substitutions result from direct misinsertion, HIV-1 RT produces many base substitutions by a dislocation mechanism. This involves template-primer slippage in a homopolymeric run followed by correct incorporation of the next nucleotide, then realignment to generate a mispair. The best example of this is the hot spot for T→C substitutions at position -36 (Bebenek et al., 1989). Since direct miscoding and dislocation represent two different mechanisms for generating base substitutions, the T→C substitutions at position -36 were considered separately. When these were subtracted from the analysis, the overall error rates for the remaining substitutions were similar for the wild type and W266A RTs (Fig. 3A), suggesting that the W266A RT is not a mutator for substitutions by direct miscoding. The substitution rate for the G262A mutant is 2.5-fold higher than that of the wild-type RT. Since many of these substitutions occurred at other locations that are also consistent with the dislocation mechanism (e.g. see Fig. 2 in Bebenek et al. (1989)), the G262A RT is at most a slight mutator for substitutions resulting from miscoding. However, an 8-fold increase in the error rate for T→C transitions at position -36 was observed for G262A and W266A RTs (Fig. 3B), suggesting that both are mutators for substitutions initiated by template-primer slippage.

The frequencies of addition and deletion errors with the G262A and W266A mutant RTs are 3.7- and 2-fold higher, respectively, than with wild-type RT (Table 1B). The majority of these frameshift errors, whether produced by the wild type or mutant RTs, occur in homopolymeric sequences. The overall rate for -1 base frameshifts in runs is 7.5-fold higher for G262A and 5.3-fold higher for W266A than for wild type RT (Fig. 3C). The effect is even greater when specific homopolymeric runs are considered. For example, at a template CCC run (nucleotides 106-108), the rate for loss of one C is 30- and 50-fold higher for G262A and W266A, respectively, than for wild type RT (inset to Fig. 3C). Interestingly, only G262A RT is a mutator for +1 frameshifts (Fig. 3D).

With both mutant enzymes, the largest increase in mutant frequency relative to wild-type was for the category listed as “others” in Table 1B. The majority of these contain two widely separated single base changes, with one or both of the changes at one of the several homopolymeric run hot spots. Also included are mutants with three or more single base changes, a few larger deletions, and complex mutations involving a deletion and the addition of bases.

Mutator Effects versus Homopolymeric Run Length

We considered in detail the -1 base errors by the wild type and mutant RTs. The template contains homopolymeric sequences of length 2 (29 runs), 3 (9 runs), 4 (3 runs), and 5 (1 run) at which -1 base errors can be scored by diminished plaque color. This information, the data in Table 1, and the error distribution (not shown, but see legend to Fig. 4) can be used to calculate the average frameshift error rate per detectable nucleotide polymerized (see legend to Fig. 3), as a function of increasing run length. For the wild-type RT (Fig. 4, top panel), the frameshift error rate increases as the length of the run increases from 2 to 4 bases. This relationship is predicted by the template-primer slippage model (Streisinger et al., 1966), because the potential number of correct base pairs stabilizing misaligned intermediates and the number of potential misaligned intermediates that can form both increase as the length of the run increases. The data with the wild-type RT thus suggest that the extra nucleotide in the template strand may be positioned within the run as far away as possible from the 3′-OH terminus.


Figure 4: 1-Base deletion error rates from the forward mutation assay. The error rates for -1 base frameshifts in homopolymeric runs were calculated using the number of nucleotides in runs of each length (Bebenek and Kunkel, 1995) and the sequence analysis of mutants (Table 1, including multiple mutants). The number of mutants in runs of length 2, 3, 4, or 5 and all other mutants, respectively, were: for wild-type (WT) RT, 5, 14, 11, 4, and 189 mutants; for G262A RT, 2, 23, 3, 1, and 78 mutants; for W266A RT, 2, 22, 2, 3, and 73 mutants. The spectra for the two mutants were significantly different from the wild-type RT, with p < 0.002 in both cases using the Fisher exact test for 2 × 5 tables (Mehta and Patel, 1980).
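The comparison of mutant spectra in this legend can be illustrated with a direct enumeration of a Fisher exact test for a 2 × k table. The sketch below is not the network algorithm of Mehta and Patel (1980); it is a plain enumeration that assumes the two-sided p value is the summed probability of all tables with the observed margins that are no more probable than the observed table. The function name is hypothetical, and the counts are those listed in the legend.

```python
from math import comb


def fisher_exact_2xk(row_a, row_b):
    """Two-sided exact p value for a 2 x k contingency table.

    Enumerates every table sharing the observed row and column totals and
    sums the probabilities of those no more probable than the observed one.
    Integer weights are compared so that ties are handled exactly.
    """
    cols = [a + b for a, b in zip(row_a, row_b)]
    n_a = sum(row_a)
    total = sum(cols)

    # Unnormalised weight of the observed table: product of C(col_j, a_j).
    observed = 1
    for a, c in zip(row_a, cols):
        observed *= comb(c, a)

    # suffix[j] = total count available in columns j..k-1 (used for pruning).
    suffix = [0] * (len(cols) + 1)
    for j in range(len(cols) - 1, -1, -1):
        suffix[j] = suffix[j + 1] + cols[j]

    def walk(j, left, weight):
        # left = row-A counts still to be placed in columns j..k-1
        if j == len(cols) - 1:
            w = weight * comb(cols[j], left)
            return w if w <= observed else 0
        acc = 0
        lowest = max(0, left - suffix[j + 1])
        highest = min(cols[j], left)
        for a in range(lowest, highest + 1):
            acc += walk(j + 1, left - a, weight * comb(cols[j], a))
        return acc

    return walk(0, n_a, 1) / comb(total, n_a)


# Mutant counts from the legend: runs of 2, 3, 4, 5 bases, then all others.
wild_type = [5, 14, 11, 4, 189]
g262a = [2, 23, 3, 1, 78]
w266a = [2, 22, 2, 3, 73]

p_g262a = fisher_exact_2xk(wild_type, g262a)
p_w266a = fisher_exact_2xk(wild_type, w266a)
```

The resulting values can be compared with the p < 0.002 reported in the legend; small differences are possible if the original analysis used a different definition of the two-sided tail.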



Given that frameshift error rates increase with increasing run length with wild-type HIV-1 RT, we were surprised to see that the pattern is different for the two mutant RTs. Thus, although the average error rates are higher for both mutant RTs at all run lengths, the increases relative to wild type are higher in 3-base runs than in 4- or 5-base runs (Fig. 4). For example, the G262A and W266A RTs are 14- and 10-fold mutators at 3-base runs, respectively, but only 2.3- and 1.2-fold mutators in 4-base runs. This unusual specificity suggests that the amino acid substitutions in the mutant RTs preferentially allow intermediates having an extra template base closer to the terminus to be processed into frameshift errors. In an attempt to understand this, we next modeled the interaction of residues 262 and 266 with the template-primer, using structural information on the wild-type RT.

Structural Modeling

Crystallographic analysis of HIV-1 RT complexed with DNA indicates that the H helix interacts with the primer strand (Jacobo-Molina et al., 1993). Gly and Trp are on the side of the helix that faces the minor groove of the duplex (Beard et al., 1994). We built separate protein and DNA models, then energy-minimized these. The resulting model shows the positions of the nucleotides of the template and primer strands relative to the H and I helices (Fig. 5A). Glycine 262 and tryptophan 266 (including the hypothetical position of the side chain) are shown in yellow. The results suggest that the aromatic side chain of tryptophan 266 interacts with the sugar-phosphate backbone at the third position in the primer strand. The alpha-carbon of residue 262 comes very close to the fourth sugar of the primer strand (Fig. 5A, Calpha to C4′ position = 4.0 Å; Calpha to C5′ position = 3.9 Å).


Figure 5: Model of alpha helix H with template-primer DNA. A, the model depicts the position of amino acids Trp and Gly in alpha helix H in the RT thumb subdomain relative to the duplex DNA. The template strand is green, the primer strand is red, the alpha carbon backbone of the H and I helices is white, and amino acids Gly and Trp are yellow. Pr1 through Pr5 indicate the first five nucleotides of the primer strand starting from the 3′-OH terminus. B, diagram of misaligned template-primers in runs of T·A base pairs. These are the sequences of the substrates used for the reversion analyses (Fig. 6). The asterisks indicate the positions of residues Trp and Gly relative to the template-primer.




Figure 6: 1-Base deletion mutant frequencies from reversion assays. A, mutant frequencies in runs of T·A base pairs of length 3, 4, 5, and 7. B, -fold increase in mutant frequencies for the G262A and W266A RTs relative to the wild-type (WT) enzyme for runs of increasing length.



This modeling is consistent with possible interactions between the aromatic side chain of residue 266 and the backbone of the primer strand that may reduce the chance of formation or utilization of mutational intermediates with an unpaired base located between the second and third base pairs from the primer terminus (e.g. see Fig. 5B). Altering these interactions by changing the aromatic side chain to the methyl group side chain of alanine may lead to the strong mutator phenotype in 3-base runs. Since template-primer slippage in longer runs (Fig. 5B) would result in intermediates with an extra base further away from the primer terminus, these would no longer be in the immediate proximity of the side chain at 266. In these instances, altering the side chain would have a lesser impact on fidelity. A similar logic would predict that the G262A mutator RT would enhance the frameshift error rate even in longer runs, because Gly is positioned further away from the terminus. Although alternative explanations are of course possible (see “Discussion”), this hypothesis that the magnitude of the frameshift mutator effect varies as a function of the position of a particular amino acid residue relative to the position of an extra base in a misaligned homopolymeric run prompted a more controlled examination of the relationship between error rate and run length.

Frameshift Mutant Frequency versus Run Length Using a Set of Reversion Substrates

The results from the forward assay represent average error rates for homopolymeric sequences that not only have different lengths but also different base compositions and flanking sequences. The latter two variables are known to strongly affect frameshift error rates with wild-type HIV-1 RT (Bebenek et al., 1989, 1993) and other DNA polymerases (for review, see Kunkel (1990)). To reduce the influence of these variables, we examined the mutator effects as a function of run length using a defined set of frameshift reversion substrates. These score loss of a T·A base pair at runs of lengths 3, 4, 5, and 7 nucleotides.(^2)

Gap-filling reactions were performed with the three RTs and each of the four substrates. The reversion frequencies of the resulting products for each substrate/RT combination are presented in Fig. 6A. Both mutant RTs are frameshift mutators, having lower fidelity than does the wild type enzyme with all four reversion substrates. As with the forward assay, the G262A enzyme is less accurate than the W266A RT. With the wild-type and G262A RTs, fidelity decreased as the length of the run increased from 3 to 7. With the W266A RT, the mutant frequency for the 3-base run is slightly higher than with the 4-base run. The mutant frequency increased substantially with a 5-base run but no further with the 7-base run.

Consistent with the results from the forward assay, the increases in reversion frequencies with the mutant RTs relative to the wild-type RT are again highest in the 3- and 4-base runs (Fig. 6B). With the W266A RT, the mutator effect diminishes as the run length increases. With the G262A mutant, the mutator effect is highest with the 4-base run, slightly lower with the 3-base run and substantially less with the 5- and 7-base runs.


DISCUSSION

Termination of processive synthesis by the wild-type RT is critically dependent on the DNA sequence of the first 6 base pairs of the duplex primer stem, and single base pair differences in this region strongly affect both processivity (Abbotts et al., 1993) and frameshift fidelity (Bebenek et al., 1993). The present study reinforces the functional importance of these protein-template-primer interactions by showing that RT mutants containing single amino acid changes in alpha helix H of the thumb domain and having greatly elevated dissociation rate constants for template-primer (Beard et al., 1994) also have reduced processivity (Fig. 2) and increased strand slippage error rates (Figs. 3, 4, and 6) during DNA-dependent DNA synthesis. These effects are observed during catalysis in aqueous solution and support the structural model for the RT·template-primer complex derived by x-ray crystallography (Jacobo-Molina et al., 1993). They suggest that interactions between specific amino acid residues and the duplex template-primer at positions well removed from the active site (Fig. 5) are critical determinants of DNA binding, fidelity, and processivity. In support of the functional role of thumb subdomain residues, mutant forms of T7 RNA polymerase bearing deletions in this subdomain have reduced processivity during transcription (Bonner et al., 1994).

Earlier observations with the wild-type RT revealed a correlation between frameshift fidelity and processivity. Homopolymeric runs that were frameshift error hot spots were sites where the probability of cessation of processive synthesis was high (Bebenek et al., 1989). Moreover, sequence changes that altered processivity concomitantly altered frameshift fidelity (Bebenek et al., 1993). The present study extends this correlation by examining the effects of alterations in the protein rather than the substrate. The results reinforce the relationship between processivity and fidelity, since both mutant RTs have concomitant reductions in both properties.

Differences between the two mutant RTs in the magnitude and specificity of effects on frameshift fidelity and processivity reinforce another interpretation of our earlier studies. In contrast to proteins that bind tightly to specific DNA sequences (e.g. Lac repressor), DNA polymerases have generally been considered sequence nonspecific binding proteins, as required for their roles in replication and repair of heterogeneous sequences. However, here as in earlier studies using altered template-primers (Abbotts et al., 1993; Bebenek et al., 1989, 1993), 10-100-fold differences in fidelity and processivity have been observed that depend on the template-primer sequence or on the amino acid sequence of the polymerase. Although these large differences may reflect interactions between the protein and the sugar-phosphate backbone more than between the protein and bases, they nonetheless illustrate that the RT does strongly respond to the structure of the substrate as defined by its base sequence. Thus, although HIV-1 RT is clearly not a sequence specific binding protein in the sense used for Lac repressor (for example), the fidelity and processivity specificity data demonstrate that DNA polymerases strongly respond to sequence differences and should not be considered sequence nonspecific binding proteins.

DNA binding, fidelity, and processivity are affected to different extents in the mutant RTs. The dissociation rate constant for template-primer is elevated more for W266A than for G262A (Beard et al., 1994), and the average processivity is reduced more for W266A than for G262A (Fig. 2A). The probability of termination of processive synthesis is a complex function of both the RT and the sequence of the template-primer (see text under “Results” and Fig. 2B). The G262A RT has generally lower frameshift fidelity than does W266A (Table 1; Fig. 3C; Fig. 4, runs of 2, 3, and 4 bases; Fig. 6A) but not at all homopolymeric runs (Fig. 3C (inset), Fig. 4, run of 5 bases). We also note that G262A RT is a mutator for +1 base errors, whereas W266A RT is not (Fig. 3D). For such errors, the extra nucleotide is in the primer strand, as opposed to being in the template strand for -1 errors. The idea that amino acid substitutions in the H helix, which is in close proximity to the primer strand, might selectively affect plus-one frameshift error rate is obviously not sufficient to explain the observation. All these specificities suggest that the interactions between the wild-type RT and the template-primer occurring during polymerization that determine frameshift error rate and processivity are complex. Moreover, replacement of the large aromatic side chain of tryptophan 266 with a methyl group likely represents a very different circumstance from replacing the hydrogen of glycine 262 with a methyl group. The model presented in Fig. 5 is consistent with the absence of important side chain interactions when alanine is substituted for Trp, but the presence of unfavorable interactions when alanine is substituted for glycine. These possibilities can be probed by analyzing additional amino acid substitutions at these two positions.

The RT is a dimeric enzyme so that the amino acid substitutions are present in both subunits of the homodimer. The DNA duplex in the co-crystal with the RT (Jacobo-Molina et al., 1993) was not long enough to determine whether helix H in the p51 subunit interacts with the DNA. Thus, it is formally possible that the effects could be due to changes in the p66 subunit that corresponds to p51 in the heterodimer. We do not favor this possibility given the strong mutator effects in short homopolymeric sequences, which are consistent with alterations in enzyme-template-primer interactions close to the primer terminus.

Both mutant RTs are mutators for frameshift errors in homopolymeric runs. Their concomitant mutator activity for T → C substitutions at position -36 (Fig. 3B) further supports the earlier suggestion that most of the substitutions at this template position are initiated by template-primer slippage rather than by misinsertion (Bebenek et al., 1989). The substitution results because, after template slippage followed by incorporation of one more correct nucleotide, realignment occurs to generate the terminal mispair responsible for the substitution. We are currently examining this dislocation hypothesis by determining the base substitution specificity of the mutant RTs using the altered substrates employed to support the model with the wild-type RT (Bebenek et al., 1993). Excluding these T → C substitutions, the data in Fig. 3A suggest that the W266A and G262A mutant RTs are not substantially less accurate for base substitutions thought to result from direct misinsertion. With Klenow DNA polymerase, a misinsertion mutator phenotype (Carroll et al., 1991) results from an amino acid substitution in the O helix of the fingers subdomain. Similar base substitution mutators may eventually be found by altering amino acids in other regions of HIV-1 RT.
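The dislocation pathway described above (slippage within a run, incorporation of one correct nucleotide, then realignment) can be traced step by step in a minimal sketch. The template sequence used here is hypothetical and is not the actual sequence surrounding position -36; the function name and its arguments are likewise illustrative assumptions. The template is written in the order in which it is copied, and the change is scored in template-strand terms.

```python
# Watson-Crick complements for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def dislocation_substitution(template: str, run_start: int, run_length: int):
    """Trace one dislocation event at a homopolymeric template run.

    `template` lists template bases in the order they are copied.
    Returns (mispaired_index, original_base, apparent_new_base, primer),
    where `primer` is the nascent strand in order of synthesis.
    """
    run_base = template[run_start]
    neighbor_index = run_start + run_length
    if run_length < 2 or template[run_start:neighbor_index] != run_base * run_length:
        raise ValueError("expected a run of at least 2 identical template bases")
    if neighbor_index >= len(template) or template[neighbor_index] == run_base:
        raise ValueError("a different template base must follow the run")
    neighbor = template[neighbor_index]

    # Step 1: correct synthesis up to, but not including, the last base of the run.
    primer = [COMPLEMENT[base] for base in template[:neighbor_index - 1]]
    # Step 2: slippage leaves one run base unpaired, so the primer terminus
    # now abuts the base that follows the run.
    # Step 3: one correct nucleotide is incorporated opposite that neighbor.
    primer.append(COMPLEMENT[neighbor])
    # Step 4: realignment places the new nucleotide opposite the last run base,
    # creating a terminal mispair that is scored as run_base -> neighbor.
    return neighbor_index - 1, run_base, neighbor, "".join(primer)


# Hypothetical template with a run of four T residues followed by C.
index, old, new, primer = dislocation_substitution("GATTTTCA", run_start=2, run_length=4)
print(f"template index {index}: {old} -> {new}; primer after realignment: {primer}")
```

In this sketch the substituted base always takes the identity of the next template base, which is the signature that distinguishes a dislocation-initiated substitution from direct misinsertion.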

Crystallographic analysis of the wild-type RT·duplex DNA complex (Jacobo-Molina et al., 1993) revealed that the template-primer conforms more closely to A-form DNA near the polymerase active site and to B-form DNA near the RNase H active site. Interestingly, there is a 40-45° bend in the DNA at the transition point (Fig. 5). This is distributed over about 4 base pairs in the vicinity of contacts with helix H in the p66 subunit. An intriguing possibility is that this bend is related in some way to the high frameshift infidelity of the wild-type HIV-1 RT. It may also relate to the observation that the magnitude of the frameshift mutator effect is larger in homopolymeric runs of three or four nucleotides than in runs of five or seven nucleotides (Fig. 4 and Fig. 6). The bend in the substrate imposed by the RT may place different constraints on unpaired nucleotides in short versus long runs.

A second, but not mutually exclusive, hypothesis is that the magnitude of the frameshift mutator effect might vary as a function of the position of a particular amino acid residue relative to the position of the unpaired base in a misaligned homopolymeric run. The logic is as follows. With wild-type HIV-1 RT (Fig. 4, top panel; Fig. 6), as with two other replicative DNA polymerases (DNA polymerase alpha (Kunkel, 1990) and T7 DNA polymerase (Kunkel et al., 1994)), the frameshift error rate increases as the length of the homopolymeric run increases. This trend suggests that, of the many misaligned substrates possible within a run, the one that is extended by the polymerase to seal the misalignment in place contains the unpaired nucleotide as far back in the homopolymeric duplex primer stem as possible (Fig. 5B). This will not only be the most stable intermediate, but it will also place the unpaired nucleotide the greatest distance from the polymerase active site. Thus, template-primer slippage in a homopolymeric run of four template T residues could yield an intermediate with an unpaired base 3 base pairs away from the primer terminus (Fig. 5B). Interactions between the large side chain of tryptophan 266 occurring with the sugar-phosphate backbone at the third template-primer position (as suggested by the modeling in Fig. 5A) may reduce the chance of formation or utilization of a misaligned intermediate with an unpaired base in this region of the template-primer. This is consistent with the frameshift mutator phenotype of the W266A RT, where replacement of the large aromatic side chain with a methyl group may lessen constraints normally imposed on a misaligned intermediate. However, slippage in a 7-base run would yield an intermediate with an unpaired base 6 base pairs away from the primer terminus, a location far enough away to be unaffected by interactions between Trp and the template-primer.
This is consistent with the observations that the W266A mutant reduces frameshift fidelity more in short runs than in long runs (Fig. 4 and Fig. 6). A similar logic suggests that the G262A mutator RT might more strongly enhance slippage errors in longer homopolymeric runs than would W266A, since its position in helix H is further from the 3`-OH terminus of the primer (Fig. 5). In favor of this possibility, we note that the G262A change leads to a slightly greater mutator effect in a 4-base run than in a 3-base run (Fig. 6B), while the reverse is true for the W266A change. However, this difference is subtle and alternative explanations are possible. For example, glycine 262 may have a role in defining the overall structure of helix H, such that replacement with any side chain will have a more global effect on helix structure and/or helix-DNA interactions. The importance of the bend and the hypothesis that the magnitude of the frameshift mutator effect varies as a function of the position of a particular amino acid residue relative to the position of the unpaired base can be examined by determining frameshift fidelity with additional substrates and/or mutant RTs containing amino acid substitutions at positions thought to interact closer to or farther from the primer terminus. It will also be interesting to determine if amino acid substitutions in secondary structural elements of other DNA polymerase thumbs affect processivity or frameshift fidelity. The H helix has recently been proposed to be part of a ``helix clamp'' common to a number of nucleic acid polymerases (Hermann et al., 1994).
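The positional argument in this hypothesis reduces to simple arithmetic, sketched below. It assumes, as in the 4-base and 7-base examples given in the text, that the extended intermediate carries the unpaired nucleotide as far back in the run as possible, that is, (run length - 1) base pairs from the primer terminus. The function names are hypothetical, and the value of 3 for the Trp-266 contact is taken from the modeling cited in the text (third template-primer position); treating that position and the unpaired-base distance as directly comparable is a simplifying assumption.

```python
# Template-primer position contacted by Trp-266, per the modeling cited in the text.
TRP266_CONTACT_POSITION = 3


def unpaired_base_distance(run_length: int) -> int:
    """Base pairs between the primer terminus and the unpaired nucleotide,
    assuming the unpaired base sits as far back in the run as possible."""
    if run_length < 2:
        raise ValueError("slippage requires a run of at least 2 identical bases")
    return run_length - 1


def offset_from_contact(run_length: int, contact_position: int = TRP266_CONTACT_POSITION) -> int:
    """How many base pairs the unpaired nucleotide lies beyond the contact position
    (0 means at the contact, negative means closer to the primer terminus)."""
    return unpaired_base_distance(run_length) - contact_position


for run_length in (3, 4, 5, 7):
    print(
        f"run of {run_length}: unpaired base {unpaired_base_distance(run_length)} bp "
        f"from the primer terminus, offset {offset_from_contact(run_length)} bp "
        f"from the assumed Trp-266 contact"
    )
```

Under these assumptions the unpaired base in a 4-base run coincides with the assumed contact position, whereas in a 7-base run it lies 3 base pairs beyond it, which is the qualitative distinction the hypothesis draws between short and long runs.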

Substitution of alanine for tryptophan at 266 and glycine at 262 had only modest effects on the rate of catalysis or dNTP binding (Beard et al., 1994), yet had substantial effects on DNA binding, processivity, and frameshift fidelity. These observations are consistent with the possibility that mutant DNA polymerases might arise in vivo that are sufficiently active to fulfill their roles in replication, recombination, or repair but do so with reduced frameshift fidelity. This offers one explanation for the instability of repetitive sequences that are associated with cancer and neurodegenerative diseases (for review, see Loeb (1994) and Willems (1994)).


FOOTNOTES

*
This work was supported by a grant (to T. A. K.) from the National Institutes of Health Intramural AIDS Targeted Antiviral Program. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked ``advertisement'' in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

¶
To whom correspondence should be addressed. Tel.: 919-541-2644; Fax: 919-541-7613.

(^1)
The abbreviations used are: HIV-1, type 1 human immunodeficiency virus; RT, reverse transcriptase.

§
Present affiliation: AIDS Vaccine Program, SAIC, NCI, FCRDC, Frederick, MD 21702-1201.

(^2)
L. Kroutil, K. Register, K. Bebenek, and T. A. Kunkel, manuscript in preparation.


ACKNOWLEDGEMENTS

We thank William Copeland and Miriam Sander for their comments on the manuscript.


REFERENCES

  1. Abbotts, J., Bebenek, K., Kunkel, T. A. & Wilson, S. H. (1993) J. Biol. Chem. 268, 10312-10323
  2. Beard, W. A., Stahl, S. J., Kim, H.-R., Bebenek, K., Kumar, A., Strub, M.-P., Becerra, S. P., Kunkel, T. A. & Wilson, S. H. (1994) J. Biol. Chem. 269, 28091-28097
  3. Bebenek, K. & Kunkel, T. A. (1993) in Reverse Transcriptase (Goff, S., and Skalka, A. M., eds) pp. 85-102, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
  4. Bebenek, K. & Kunkel, T. A. (1995) Methods Enzymol. 262, 217-232
  5. Bebenek, K., Abbotts, J., Roberts, J. D., Wilson, S. H. & Kunkel, T. A. (1989) J. Biol. Chem. 264, 16948-16956
  6. Bebenek, K., Abbotts, J., Wilson, S. H. & Kunkel, T. A. (1993) J. Biol. Chem. 268, 10324-10334
  7. Bonner, G., Lafer, E. M. & Sousa, R. (1994) J. Biol. Chem. 269, 25129-25136
  8. Carroll, S. S., Cowart, M. & Benkovic, S. J. (1991) Biochemistry 30, 804-813
  9. Casas-Finet, J. R., Kumar, A., Morris, G., Wilson, S. H. & Karpel, R. L. (1991) J. Biol. Chem. 266, 19618-19625
  10. Casas-Finet, J. R., Kumar, A., Karpel, R. L. & Wilson, S. H. (1992) FASEB J. 6, A486
  11. Darden, T., Johnson, J. & Smith, H. (1991) J. Mol. Graph. 9, 18-23
  12. Goel, R., Beard, W. A., Kumar, A., Casas-Finet, J. R., Strub, M.-P., Stahl, S. J., Lewis, M. S., Bebenek, K., Becerra, S. P., Kunkel, T. A. & Wilson, S. H. (1993) Biochemistry 32, 13012-13018
  13. Hermann, T., Meier, T., Götte, M. & Heumann, H. (1994) Nucleic Acids Res. 22, 4625-4633
  14. Jacobo-Molina, A., Ding, J., Nanni, R. G., Clark, A. D., Jr., Lu, X., Tantillo, C. H., Williams, R. L., Kamer, G., Ferris, A. L., Clark, P., Hizi, A., Hughes, S. H. & Arnold, E. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 6320-6324
  15. Kohlstaedt, L. A., Wang, J., Friedman, J. M., Rice, P. A. & Steitz, T. A. (1992) Science 256, 1783-1790
  16. Kunkel, T. A. (1990) Biochemistry 29, 8003-8011
  17. Kunkel, T. A., Patel, S. & Johnson, K. A. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 6830-6834
  18. Loeb, L. A. (1994) Cancer Res. 54, 5059-5063
  19. Mehta, C. A. & Patel, N. R. (1980) Comm. Statist. Ser. B 9, 649-664
  20. Preston, B. D., Poiesz, B. J. & Loeb, L. A. (1988) Science 242, 1168-1171
  21. Roberts, J. D., Bebenek, K. & Kunkel, T. A. (1988) Science 242, 1171-1173
  22. Streisinger, G., Okada, Y., Emrich, J., Newton, J., Tsugita, A., Terzaghi, E. & Inouye, M. (1966) Cold Spring Harbor Symp. Quant. Biol. 31, 77-84
  23. Vriend, G. (1990) J. Mol. Graph. 8, 52-56
  24. Weiner, S. J., Kollman, P. A., Nguyen, D. T. & Case, D. A. (1986) J. Comput. Chem. 7, 230-252
  25. Willems, P. J. (1994) Nature Genet. 8, 213-215
