(Received for publication, September 18, 1995; and in revised form, November 14, 1995)
The specificity of the retroviral protease is determined by the
ability of substrate amino acid side chains to bind into eight
individual subsites within the enzyme. Although the subsites are able
to act somewhat independently in selection of amino acid side chains
that fit into each pocket, significant interactions exist between
individual subsites that substantially limit the number of cleavable
amino acid sequences. The substrate peptide binds within the enzyme in
an extended anti-parallel β sheet conformation with substrate amino
acid side chains adjacent in the linear sequence extending in opposite
directions in the enzyme-substrate complex. From this geometry, we have
defined both cis and trans steric interactions, which
have been characterized by a steady state kinetic analysis of human
immunodeficiency virus, type-1 protease using a series of peptide
substrates that are derivatives of the avian leukosis/sarcoma virus
nucleocapsid-protease cleavage site. These peptides contain both single
and double amino acid substitutions in seven positions of the minimum
length substrate required by the retroviral protease for specific and
efficient cleavage. Steady state kinetic data from the single amino
acid substituted peptides were used to predict effects on
protease-catalyzed cleavage of corresponding double substituted peptide
substrates. The calculated Gibbs' free energy changes were
compared with actual experimental values in order to determine how the
fit of a substrate amino acid in one subsite influences the fit of
amino acids in adjacent subsites. Analysis of these data shows that
substrate specificity is limited by steric interactions between pairs
of enzyme subsites. Moreover, certain enzyme subsites are relatively
tolerant of substitutions in the substrate and exert little effect on
adjacent subsites, whereas others are more restrictive and have marked
influence on adjacent cis and trans subsites.
The retrovirus protease (PR) is responsible for the
post-translational processing of viral gag and gag-pol polyprotein precursors (1). This proteolytic processing is
a necessary step in the replication of infectious virus and is a late
event occurring as particles bud from infected cells. Cleavage of the
viral polyproteins requires human immunodeficiency virus, type 1
(HIV-1), or avian myeloblastosis/Rous sarcoma virus (AMV/RSV) PR to act
on nine unique sequences, each 8 amino acids in length. Consistent with
this is the finding that the minimum length of a peptide substrate
required for specific cleavage by either PR is 6-8 amino acids,
depending upon the source of the enzyme (2, 3, 4). Substrates bind to HIV-1 PR in an
extended anti-parallel β strand conformation with substrate amino
acid side chains adjacent in the linear sequence extending in opposite
directions in the enzyme-substrate complex (see Fig. 1).
Interaction between substrate amino acid side chains and the
corresponding binding pockets in the enzyme determines enzyme
specificity. It has been shown previously that a variety of amino acid
residues can be accommodated in each of the enzyme subsites, when
single amino acid substitutions are placed in the context of an
efficiently cleaved substrate (5). Additionally, it was found
that individual enzyme subsites are capable of acting relatively
independently in recognition of amino acids in the corresponding
substrate position (6). If each of the eight subsites were able
to accept n different amino acid side chains, where n =
4-7 amino acids as found in the naturally occurring gag and pol polyprotein cleavage sites, and the subsites were
acting completely independently in substrate amino acid selection, then
PR would be able to cleave $n^{8}$
different substrate sequences.
However, in contrast to cellular proteases such as pepsin (7),
the retroviral PR displays a remarkably limited substrate range,
cleaving only a very select set of amino acid sequences.
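The size of the hypothetical substrate range under fully independent subsites can be made concrete with a short calculation. This is only an illustrative sketch: the function name is introduced here, and the range n = 4-7 is the one quoted in the text.

```python
# Upper bound on the number of cleavable octapeptide sequences if all eight
# subsites (S4 through S4') selected residues independently of one another.
SUBSITES = 8


def independent_site_count(n_per_subsite: int, subsites: int = SUBSITES) -> int:
    """Number of sequences when each subsite accepts n_per_subsite residues."""
    return n_per_subsite ** subsites


for n in range(4, 8):  # n = 4, 5, 6, 7
    print(n, independent_site_count(n))
```

Even the low end of this range is several orders of magnitude larger than the handful of sites actually cleaved in the viral polyproteins, which is the discrepancy the study sets out to explain.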
Figure 1: Schematic representation of the AMV/RSV NC-PR substrate, PAVSLAMT, from P4 to P4` in the S4 to S4` subsites of PR. The relative size of each subsite is illustrated by the area enclosed by the curved line around each substrate side chain. Protease residues forming the subsites are shown for those that differ between the AMV/RSV and HIV-1 PRs. The AMV/RSV PR residue is shown outside of the parentheses, and the HIV-1 PR residue is shown in the parentheses. Most of the residues contribute to more than one adjacent subsite, and this is indicated by the position of the label relative to the subsites.
In this
report, we define substrate parameters that limit the possible
combinations of amino acids that constitute a functional cleavage site.
The activity of HIV-1 PR was analyzed with the use of a library of
single and double substituted synthetic peptide substrates,
representing the cleavage junction between the naturally occurring RSV
nucleocapsid (NC) and PR proteins in the gag precursor
polypeptide. A steady state kinetic analysis was used to calculate
ΔΔG values representing the difference in the
Gibbs' free energy changes for the proteolysis reactions
resulting from amino acid substitutions in the wild type NC-PR-based
substrate. A comparison was made between the experimentally observed
ΔΔG values for the double substituted peptides and
the predicted ΔΔG values calculated using the data
derived from the single substituted peptides. This analysis indicates
that there are steric interactions between amino acids in adjacent and
alternate substrate positions that restrict the combinations of amino
acids that comprise a functional cleavage site.
$\{k_{cat}/K_m\}_{wt}$ represents the catalytic efficiency for the wild type NC-PR peptide; $\{k_{cat}/K_m\}_{1}$ and $\{k_{cat}/K_m\}_{2}$ represent the catalytic efficiencies for the two different single substituted peptides, respectively; and $\{k_{cat}/K_m\}_{1,2}$ represents the catalytic efficiency for the double substituted peptide.
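The equations to which these definitions belong are not reproduced in this text. As a hedged reconstruction, assuming the standard transition-state relation between catalytic efficiency and free energy, and using the subscripts wt, 1, 2, and 1,2 as labels chosen here for the wild type, the two single substituted, and the double substituted peptides, the relations can be written as:

```latex
\begin{align}
\Delta\Delta G_{i} &= -RT \,\ln \frac{\{k_{cat}/K_m\}_{i}}{\{k_{cat}/K_m\}_{wt}},
  \qquad i \in \{1,\; 2,\; (1,2)\} \\
\Delta\Delta G_{1,2}^{\,\mathrm{pred}} &= \Delta\Delta G_{1} + \Delta\Delta G_{2}
  = -RT \,\ln \frac{\{k_{cat}/K_m\}_{1}\,\{k_{cat}/K_m\}_{2}}{\{k_{cat}/K_m\}_{wt}^{\,2}}
\end{align}
```

With this sign convention a positive ΔΔG corresponds to a substrate cleaved less efficiently than wild type, and a difference between the observed ΔΔG for the double substituted peptide and the predicted sum indicates that the two positions do not act independently.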
The atomic coordinates
for the substituted residues were produced with the program
AMMP (15), and the energy of the HIV protease-substrate complex
was minimized. A modified version of the UFF potential set (16) was used. Infrared spectral data were not included in the
original UFF parameterization and have been used to improve the
parameters for proteins and nucleic acids. These
modifications do not significantly change the performance of the
potential set on small molecules but result in consistently smaller
root mean square deviations between minimized and observed protein and
nucleic acid structures. One of the strengths of the UFF potential is
that the new terms were easy to add in a manner that is consistent with
the rest of the potential set. The atomic charges from the AMBER all
atom set were used for the protein and water (18). Charges for
the nonstandard groups in the transition state were produced as
described in Harrison et al. (19).
No screening
dielectric term or bulk solvent correction was included. No cut-off was
applied for nonbonded and electrostatic terms, which were calculated
with an algorithm that amortizes or spreads the cost of calculation
over many simpler calculations, which results in lower average cost, as
described in Harrison and Weber (20). The atomic positions for
the protein and water molecules were initially tethered to those in the
crystal structure of HIV-1 protease in order to calculate and minimize
the hydrogen atom positions. The side chain atoms were removed down to
the Cβ atom, or the Cα for substitution of Gly, for the
substituted amino acids, and the new atomic positions were created by a
variation on distance geometry (19). The new atoms were
minimized with respect to bond, angle, torsion, and hybrid potentials.
The protease structure with nonhydrogen atoms from the crystal
structure and minimized hydrogen atoms was combined with each of the
different peptides with single or double amino acid substitutions.
Then, each of the side chain torsion angles for substituted residues in
the peptide substrate was rotated through 360° in steps of 15°
to search for alternate conformations. This torsion search finds
the angle(s) that have a minimum in the nonbonded energy. Finally, each
model of HIV protease with a different substrate was optimized by a
longer minimization using 100 steps of conjugate gradients followed by
eight cycles of alternating conjugate gradients (30 steps) and short
runs of molecular dynamics (20 fs steps at 300 K).
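The torsion search step can be sketched as a simple grid scan. The following is a minimal illustration for a single torsion angle, not the AMMP implementation: the function names are introduced here, and `toy_energy` is a hypothetical stand-in for the nonbonded energy of the real protease-substrate model.

```python
import math
from typing import Callable, List, Tuple


def torsion_scan(energy: Callable[[float], float],
                 step_deg: float = 15.0) -> List[Tuple[float, float]]:
    """Return (angle, energy) pairs for angles 0, step, ... up to but excluding 360."""
    n_steps = int(round(360.0 / step_deg))
    return [(i * step_deg, energy(i * step_deg)) for i in range(n_steps)]


def minimum_angles(scan: List[Tuple[float, float]], tol: float = 1e-9) -> List[float]:
    """Return every scanned angle whose energy lies within tol of the lowest energy."""
    lowest = min(e for _, e in scan)
    return [angle for angle, e in scan if e - lowest <= tol]


def toy_energy(angle_deg: float) -> float:
    """Hypothetical profile: a threefold torsion term plus a clash near 60 degrees."""
    threefold = 1.0 + math.cos(math.radians(3.0 * angle_deg))
    clash = 2.0 * math.exp(-((angle_deg - 60.0) / 20.0) ** 2)
    return threefold + clash


scan = torsion_scan(toy_energy)
print(minimum_angles(scan))
```

Returning a list rather than a single angle mirrors the wording of the text, which allows for more than one angle at the energy minimum. A side chain with several torsion angles would repeat the scan for each angle.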
We have used a simple method to determine the extent to which
amino acids in given substrate positions influence the fit of adjacent
and alternate substrate amino acids into their corresponding enzyme
subsites. This method involves a steady state kinetic analysis of HIV-1
PR on a series of peptide substrates based on the RSV NC-PR cleavage
sequence that have two residues altered from wild type. The amino acid
substitutions chosen were those that were analyzed as single
substitution mutations (Ref. 5 and Table 1) and were
designed to test primarily steric effects of the side chains in the
substituted pairs. Steady state data from the single and double
substituted peptides were used to calculate ΔΔG values
according to the equations listed under "Experimental
Procedures." Binding of substrate peptides to HIV PR has been
deduced from examination of crystal structures of HIV PR complexed with
various peptide-like inhibitors (21). Inhibitors and by analogy
substrates bind in an extended anti-parallel β sheet conformation
between the flaps and the active site. This is shown in Fig. 1,
which presents the NC-PR peptide substrate docked in the eight subsites
of a retrovirus PR. Because of this structural orientation, amino acid
side chains in adjacent substrate positions, such as P1 and P2, extend
in an opposite or trans configuration; amino acid side chains
in every other position, such as P1 and P3, extend in the same or cis configuration. In cases where amino acids in given
substrate positions have little influence on the fit of amino acids in
adjacent or alternate positions, ΔΔG values determined
for the double substituted peptide should equal the sum of the
ΔΔG values determined for the single substituted
peptides. In contrast, if interactions between the
tested amino acids in the substrate are important, there will be
discrepancies between the experimental and predicted ΔΔG values.
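The additivity test described above can be sketched numerically. This is an illustration under stated assumptions: it uses the standard relation ΔΔG = -RT ln(relative kcat/Km), an assumed temperature of 310 K (the assay temperature is not given in this text), and hypothetical relative efficiencies rather than values from Tables 1 and 2.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def ddg(rel_efficiency: float, temp_k: float = 310.0) -> float:
    """ddG (kcal/mol) from kcat/Km of a substituted peptide relative to wild type."""
    return -R_KCAL * temp_k * math.log(rel_efficiency)


def predicted_double(rel_a: float, rel_b: float, temp_k: float = 310.0) -> float:
    """Predicted ddG for a double substitution if the two positions are independent."""
    return ddg(rel_a, temp_k) + ddg(rel_b, temp_k)


# Hypothetical relative efficiencies (wild type = 1.0), for illustration only.
single_a, single_b, double_ab = 0.5, 0.2, 0.01

predicted = predicted_double(single_a, single_b)
observed = ddg(double_ab)
print(f"predicted {predicted:.2f}, observed {observed:.2f}, "
      f"difference {observed - predicted:.2f} kcal/mol")
```

In this made-up case the double substituted peptide is cleaved tenfold less efficiently than independence would predict, so the observed ΔΔG exceeds the predicted sum, which is the signature attributed to steric interference between subsites.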
Steady state kinetic
parameters for a series of single and double substituted NC-PR peptide
substrates with HIV-1 PR are presented in Table 1 and Table 2, respectively. The RSV NC-PR peptide was chosen as a
reference substrate because it is cleaved efficiently by both the HIV-1
and AMV PRs and because it contains many small amino acid residues that
present little or no steric interference to the other substrate
positions. The $k_{cat}$ and $K_m$ values are
presented relative to those for the wild type NC-PR peptide. In Table 2, double substitutions in P1-P3 and P2-P1` examine cis interactions, whereas double substitutions in P3-P2, P2-P1, and
P1-P1` examine trans interactions. A more limited data set for
the AMV PR acting on selected peptides is presented in Table 3.
Figure 2: Comparison of experimental and predicted ΔΔG values for cleavage by HIV-1 PR of the RSV NC-PR peptide substrates containing double amino acid substitutions in the cis orientation. The calculated and experimentally determined ΔΔG values listed in Table 2 for cleavage of a peptide substrate representing the RSV NC-PR cleavage site with the sequence of PAVS-LAMTMRR but containing substitutions in the P3 and P1 positions (A) and P2 and P1` (B) were plotted as a function of the amino acids substituted (at the top of the graphs). A ΔΔG value equal to zero indicates that a substituted peptide substrate has an activity equal to the wild type NC-PR peptide substrate. Positive ΔΔG values reflect substrates that are less efficiently cleaved, and negative ΔΔG values reflect substrates cleaved more efficiently than the wild type substrate. Predicted and observed ΔΔG values are plotted with distinct symbols.
Figure 3: Comparison of experimental and predicted ΔΔG values for cleavage by HIV-1 PR of the RSV NC-PR peptide substrates containing double amino acid substitutions in the trans orientation. The calculated and experimentally determined ΔΔG values listed in Table 2 for cleavage of a peptide substrate representing the RSV NC-PR cleavage site but containing substitutions in P3 and P2 (A), P2 and P1 (B), and P1 and P1` (C) were plotted as a function of the amino acids substituted (top of panels) as described in the legend to Fig. 2. Predicted and observed ΔΔG values are plotted with distinct symbols.
We propose that this lower activity is the result of steric
interference that places one or both of the amino acids in the cis subsites in an altered conformation not favorable for binding and
cleavage. This is suggested by analysis of the peptide substrates
modeled in the HIV-1 PR structure in Fig. 4, which shows the
relative positions of the P3 to P1 amino acids of the NC-PR peptide
substrate predicted by energy minimization. Shown are two peptides,
both of which have His substituted for Ala in P3 and one that also has
Trp substituted for Ser in P1 (Fig. 4, thick lines and balls). The bulky Trp in P1 appears to directly affect the
position of the His in P3, which is pushed back toward
Phe. The only exception to this steric argument observed
so far involves Gly in P3 with Trp in P1. Glycine, which does not have
a side chain to contribute to substrate binding energy, is a special
case and will be discussed later in the context of several Gly
substituted peptides.
Figure 4:
Stereo views of the NC-PR peptide
containing substitutions at P3 and P1 in the PR binding site.
Residues P3-P1 of the substrate with Trp at P1 and His at P3 are shown
in a ball and stick representation (thick
lines) compared with the single substituted substrate with His at
P3 and Ser at P1 (thin lines). PR residues 81` to 84` that
form the top of subsites S1 and S3, Phe, which lies to one
side of the S3 subsite, and V32, which forms the bottom of the subsite
S2, are shown as thin lines. Trp at P1 displaces His at P3
from its position in the single substituted peptide. The atomic
coordinates were obtained by molecular modeling as described under
"Experimental Procedures." Each of the substituted residues
was positioned in a minimum energy conformation by rotating the side
chain torsion angles, except for Gly and Ala. Then the HIV
protease-substrate models were minimized to ensure good bond and angle
geometry and to remove any close contacts between atoms by adjusting
the atomic positions. The minimized models showed that protease main
chain atoms had root mean square differences of 0.41-0.43 Å
compared with the starting HIV protease crystal structure. This is well
within the range of 0.16-0.79 Å for root mean square
differences between different crystal structures of the same
protein (24, 25).
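The root mean square difference quoted for the main chain atoms can be computed as sketched below. This is a minimal illustration that assumes the two coordinate sets list the same atoms in the same order and already share a reference frame (no superposition step is performed); the coordinates shown are made up.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(a: Sequence[Coord], b: Sequence[Coord]) -> float:
    """Root mean square difference (same units as input) between matched atoms."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(a, b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(a))


# Hypothetical coordinates in angstroms, for illustration only.
crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
minimized = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]
print(round(rmsd(crystal, minimized), 3))
```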
A similar steric interaction can be seen with cis substitutions in P2 and P1` (Fig. 2B).
When Gly or Ala is placed in P2 with Phe in P1`, the predicted and
experimental values agree. When the larger Leu is placed in P2, there
appears to be insufficient flexibility in the binding pockets to
accommodate the combination of the two large groups extending into the
same side of the enzyme. Peptides that fix Leu in P2 and vary the
amino acid in P1` also have been analyzed. However, these peptides,
which have Ala or Gly in P1`, were predicted to be cleaved poorly
because the single substituted substrates are cleaved with low
efficiency. Indeed, this was observed, and experimental ΔΔG values could not be calculated accurately because product did not
accumulate to any measurable extent under the assay conditions used. In
the above argument, the size of an amino acid relative to its given
subsite determines the magnitude of the steric effect. For instance,
Leu would be considered a large residue in the small S2 subsite,
although it would be a medium sized residue in the larger S1 or S3
subsites (see Fig. 1) (2).
In the data shown in Fig. 2B, it was surprising to find that the single substitution data are predictive for the peptide with
His in P2 and Phe in P1`. This is in contrast with what
was observed when His was in P3 and Trp in P1 (Fig. 2A). The disparity between these two results may
reflect differences between subsites that interact with substrate amino
acids that span across the scissile bond and those that do not. Each
half of the inhibitor forms a short β sheet with the two
anti-parallel β strands of the flap and residues 27-29 of one
subunit (22, 23, 24) (Fig. 5). There
is a set of hydrogen bond interactions between the carbonyl oxygens and
amides of the inhibitor and the main chain carbonyl oxygens and amides
of PR. The β sheets are interrupted near the nonhydrolyzable group
of the inhibitors, where there is a kink in the extended conformation
of the inhibitor. The interruption in the β sheet near the scissile
bond means that the P2-P1` side chain interactions are not the same as
those of P1-P3 or P1`-P3`. In fact, the side chains of P1 and P1` tend
to be directed away from each other so that P1 interacts more closely
with P3 than P1` with P2. Therefore, steric interactions involving P1
and P3 may be more pronounced than those involving P1` and P2.
Figure 5:
Stereo views of interactions of a peptide
inhibitor with HIV-1 PR. The inhibitor and PR are from the crystal
structure of Jaskolski et al. (17). The main chain
atoms of the inhibitor and residues 46-55 and 25-29 of each
subunit in the PR dimer are shown. The side chain atoms of the two
catalytic aspartic acid residues are also indicated. Each half of the
inhibitor forms a series of β sheet-like hydrogen bond interactions
with PR residues 27-29 near the catalytic aspartates and with the
two anti-parallel β strands of the flap of each subunit. In the
center, these interactions are interrupted near the nonhydrolyzable
group of the inhibitor, and interactions are formed with a conserved
water molecule.
Although
there appear to be limited interactions between amino acids in P3 and
P2, there are significant interactions observed when similar
substitutions are placed in the P2 and P1 trans positions.
With Trp fixed in P1, Ala substituted for the natural Val in P2 was
predictive, whereas the substitution of the larger His or Leu was not (Fig. 3B). The mechanism by which amino acids in the trans configuration interact is not clear. However, it is
likely that the presence of a large substrate amino acid in an enzyme
subsite will distort the position of the substrate peptide backbone in
a way that can be catalytically compensated for by placing a smaller
rather than a larger residue in the trans subsite. This can be
seen in the structural model shown in Fig. 6. Shown are two
peptides in which Leu has been substituted in P2. One of the two
peptides also has Trp substituted for Ser in P1. The presence of the
Trp in P1 results in substantial movement of the substrate peptide
backbone with Leu in P2 being pushed deeper into the S2 subsite
resulting in loss of activity. A similar carbon backbone
distortion could also contribute to the differences in the
ΔΔG values observed with the cis substituted
peptides.
Figure 6:
Stereo views of the NC-PR peptide
containing substitutions at P2 and P1 in the PR binding site.
Residues P3-P1 of the substrate with Trp at P1 and Leu at P2 are shown
in a ball and stick representation (thick
lines) compared with the single substituted substrate with Leu at
P2 and Ser at P1 (thin lines) as described in the legend to Fig. 4. PR residues 81` to 84`, which form the top of subsites
S1 and S3, Phe, which lies to one side of the S3 subsite,
and V32, which forms the bottom of subsite S2, are shown as thin
lines. Trp at P1 displaces Leu at P2 and Ala at P3 from their
positions with Leu at P2 in the single substituted peptide. In the
double substitution, P2 Leu is moved deeper into the S2
subsite.
Trp in P1 and His or Leu in P2 represent approximately the largest residues that can fit into S1 and S2, respectively, without substantial loss of catalytic efficiency (5). To determine whether steric interactions occur between other trans substrate pairs, substitutions were also placed in the P1 and P1` positions (Fig. 3C). In these instances, both favorable and unfavorable steric interactions were observed. With Trp or Leu fixed in P1, the presence of Ala in P1` produced a substrate that was more active than predicted. In contrast, the placement of the medium sized Val in P1` produced a substrate that was as active as predicted, whereas the placement of a larger Phe in P1` resulted in an efficiency of cleavage that was considerably less than predicted. Similar results are obtained if the bulky group is fixed in P1` and the size of the residue in P1 is varied (Fig. 3C).
As mentioned above, only two of the peptides tested so far were more active than predicted from the single substituted peptides. These are the P1 Trp-P1` Ala and the P1 Leu-P1` Ala peptides. These peptides were predicted to be poor substrates primarily because single P1` Ala substituted peptides are cleaved poorly. The presence of a larger Leu or Trp substituted for Ser in P1, however, seems to restore cleavage of the P1` Ala substituted substrates to levels similar to those observed with the unmodified reference peptide (Fig. 3C). The P1` position has a preference for a large hydrophobic amino acid side chain. Therefore, an Ala residue in this position is too small to fit well into the HIV-1 PR S1` subsite and form the requisite stabilizing van der Waals interactions with subsite amino acids. The presence of a bulky group in P1 may position the P1` Ala deeper into the S1` subsite, allowing it to form van der Waals interactions and thereby producing a substrate with more activity than predicted by the single substituted peptides. This interpretation is consistent with the observation that double substituted peptides containing Gly in P1` and Trp or Leu in P1 are also poor substrates. However, in this instance, an adjacent Trp or Leu in P1 does not "rescue" the cleavage defect. A Gly residue cannot provide a side chain for van der Waals interactions, even if the backbone of the peptide is shifted due to the larger residue in P1. Restoration of cleavage of a peptide containing Ala in P1` is predicted to be stronger when a bulky group is in a trans rather than the cis configuration. This is what is observed as shown in Fig. 2B. When Leu is placed into the adjacent cis position, as in the P2 Leu-P1` Ala peptide, there is very little cleavage of this peptide detected.
Of the three trans interactions examined, P3-P2, P2-P1, and P1-P1`, the two involving exclusively the S1, S1`, or S2 subsite showed interaction effects, whereas the pair involving S3 and S2 did not. This may reflect the fact that S2, S1, and S1` are internal subsites near the scissile bond, whereas S3 is found near the enzyme surface and therefore able to accommodate a variety of larger amino acids with little steric interaction with the adjacent subsite.
Of the double substituted peptide set analyzed in this study, about
half had experimentally determined ΔΔG values that
were predicted well from the single substituted data. These peptides
involve substitutions of at least one small sized residue relative to a
given subsite in one of the two substituted substrate positions. In
contrast, about 45% of the double substituted peptides were cleaved
with catalytic efficiencies that were significantly worse than
predicted. These discrepancies can be explained by steric interference.
Only two of the peptides tested so far had more activity than
predicted. Thus, substitutions of various amino acids into the
different substrate positions limit the number of amino acid
combinations that constitute a cleavable site. Although the data set
analyzed in this report has focused primarily on nonprime substrate
positions, we predict that similar relationships probably exist for the
prime side (see Fig. 1). Also, there may be effects of
substitutions of amino acid residues at substrate positions 3 or 4
amino acids apart in the linear sequence, on both sides of the scissile
bond, that would restrict further the choice of amino acids that would
constitute a cleavage site. These latter interactions, if they are
important, may be of smaller magnitude than those involved in adjacent and
alternate subsites reported in this study. Taken together, these
results indicate that although many different side chains can bind
effectively into each enzyme subsite, enzyme specificity is limited by
interactions between substrate amino acids bound in both cis and trans positions. Interactions between at least one
pair of subsites examined appear to be minimal. Clearly, an
understanding of these relationships will be very important to the
rational design of HIV-1 PR inhibitors as potential therapeutic agents
for AIDS. For a potential compound to bind effectively in the enzyme
subsites, it must not violate any of the adjacent occupancy rules
defined in this study.
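The three outcomes summarized in this section (predicted well, cleaved worse than predicted, cleaved better than predicted) amount to comparing observed and predicted ΔΔG values against some tolerance. The sketch below illustrates that bookkeeping; the function name and the 0.5 kcal/mol tolerance are assumptions made here for illustration, not criteria stated in the study.

```python
def classify(observed_ddg: float, predicted_ddg: float,
             tolerance: float = 0.5) -> str:
    """Label a double substituted peptide by how observed ddG compares with predicted.

    Positive ddG means less efficient cleavage than wild type, so an observed
    value above the prediction means the peptide is cleaved worse than expected.
    The tolerance (kcal/mol) is a hypothetical cutoff.
    """
    difference = observed_ddg - predicted_ddg
    if difference > tolerance:
        return "worse than predicted"
    if difference < -tolerance:
        return "better than predicted"
    return "predicted"


# Hypothetical (observed, predicted) pairs in kcal/mol.
examples = [(1.4, 1.3), (2.8, 1.4), (0.1, 1.5)]
for obs, pred in examples:
    print(obs, pred, classify(obs, pred))
```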