(Received for publication, November 11, 1995, and in revised form, January 19, 1997)
From the Istituto di Ricerche di Biologia Molecolare (IRBM) P. Angeletti, Pomezia, Rome, Italy
The substrate specificity of a purified protein encompassing the hepatitis C virus NS3 serine protease domain was investigated by introducing systematic modifications, including non-natural amino acids, into substrate peptides derived from the NS4A/NS4B cleavage site. Kinetic parameters were determined in the absence and presence of a peptide mimicking the protease co-factor NS4A (Pep4A). Based on this study we draw the following conclusions: (i) the NS3 protease domain has an absolute requirement for a small residue in the P1 position of substrates, thereby confirming previous modelling predictions. (ii) Optimization of the P1 binding site occupancy primarily influences transition state binding, whereas the occupancy of distal binding sites is a determinant for both ground state and transition state binding. (iii) Optimized contacts at distal binding sites may contribute synergistically to cleavage efficiency.
The N-terminal third of the hepatitis C virus (HCV)1 NS3 protein contains a trypsin-like serine protease that accomplishes four out of the five processing events that take place during maturation of the nonstructural portion of the HCV polyprotein, performing cleavages at the NS3-NS4A, NS4A-NS4B, NS4B-NS5A, and NS5A-NS5B junctions (1-6). It has been shown that cleavage between NS3 and NS4A is an intramolecular event, whereas all remaining junctions are processed in trans.
In vivo, NS3 appears to form a heterodimer whereby the protease domain associates with the viral protein NS4A. The latter is a 54-residue protein that has been shown to bind to the N-terminal region of the protease via a central hydrophobic domain spanning residues 21-34 (7-13). NS4A acts as a co-factor of the protease enhancing cleavage at all sites and being an absolute requirement for processing of the NS4B/NS5A junction ex vivo (7). Several studies have shown that a peptide encompassing the central hydrophobic domain of NS4A is sufficient for eliciting activation of the protease (12, 14-17).
Serine proteases contact the P1 residue of their substrates through characteristic specificity pockets. The residues flanking the specificity pocket are important determinants of substrate recognition. Homology modelling of the S1 specificity pocket of the NS3 protease has predicted the presence of a phenylalanine as a prominent feature, thus rendering the pocket rather small and hydrophobic (18). These characteristics have led to the prediction of the preference for small, hydrophobic residues, ideally cysteine residues, in the P1 position of NS3 substrates. Radiosequencing of the single cleavage products has subsequently confirmed these predictions, yielding the consensus sequence (D/E)XXXXC(A/S) for all trans cleavage sites, with X being any amino acid and the scissile bond being located between Cys and Ala or Ser (2, 18). The homology model has been used to successfully redesign the enzyme's specificity, thereby increasing its validity. Very recently, the three-dimensional structure of the protease has been solved by two different groups (20, 21), confirming the presence of a phenylalanine ring pointing into the pocket. The sequences of the four NS3 cleavage sites are listed in Table I, indicating that the intramolecular cleavage site between NS3 and NS4A differs from the consensus in having a Thr in the P1 position.
Substrate specificity of the NS3 protease has been investigated by several groups in a qualitative way using transient transfection (22, 23), in vitro translation (24), or intracellular processing of fusion proteins in Escherichia coli (25). Availability of quantitative data using peptidic substrates has been so far hampered by difficulties in expressing and purifying sufficient amounts of enzymatically active recombinant NS3 protease. We (17, 26) and others (15, 20, 21, 27-33) have recently described efficient heterologous expression and purification of the enzyme and were able to define optimized conditions for the determination of protease activity (26).
In the present work we investigate the substrate specificity of the NS3 protease domain by introducing systematic modifications in a peptide substrate derived from the sequence of the NS4A/NS4B junction. Activity on modified substrates has been determined both in the presence and absence of the NS4A co-factor. The results are discussed in the light of our previous homology model of the S1 pocket of the enzyme.
The protease domain of the HCV Bk strain
NS3 protein encompassing residues 1027-1206 of the viral polyprotein
was purified from E. coli as described previously (26). The
enzyme was homogeneous as judged from silver-stained SDS-polyacrylamide
gel electrophoresis and >95% pure as judged from reversed phase HPLC
performed using a 4.6 × 250-mm Vydac C4 column. The enzyme
preparations were routinely checked by mass spectrometry done on HPLC
purified samples using a Perkin Elmer API 100 instrument and N-terminal
sequence analysis carried out using Edman degradation on an Applied
Biosystems model 470A gas-phase sequencer. Both techniques indicated
that in more than 90% of the enzyme molecules the N-terminal
methionine and alanine had been removed, yielding an enzyme starting
with Pro2. Enzyme stocks were made 50% in glycerol
content, quantified by amino acid analysis or by determining their
absorbance at 280 nm (17), shock-frozen in liquid nitrogen, and
kept in aliquots at −80 °C until use. Control experiments had shown
that this freezing procedure does not affect specific activity of the
enzyme.
Peptide synthesis was performed on a NovaSyn Gem flow synthesizer by Fmoc/t-Bu chemistry. Protecting groups were as follows: Nα(Fmoc), Asp(Ot-Bu), Glu(Ot-Bu), Tyr(t-Bu), Ser(t-Bu), Thr(t-Bu), His(Trt), Cys(Trt), Gln(Trt), and Trp(Boc). All amino acids were activated by benzotriazole-1-yl-oxy-tris-pyrrolidino-phosphonium hexafluorophosphate, N-hydroxybenzotriazole, and diisopropylethylamine. All peptides were assembled on a NovaSyn PR 500 resin, yielding peptide amides upon cleavage with 88% trifluoroacetic acid, 5% phenol, 2% triisopropylsilane, and 5% water.
Crude peptides were purified by reversed-phase HPLC on a Nucleosyl C18 column (250 × 21 mm, 100 Å, 7 µm), using H2O, 0.1% trifluoroacetic acid and acetonitrile, 0.1% trifluoroacetic acid as eluents. Analytical HPLC was performed on an Ultrasphere C18 column (250 × 4.6 mm, 80 Å, 5 µm; Beckman). Purified peptides were characterized by mass spectrometry and amino acid analysis.
Concentration of stock solutions of peptides, prepared in
Me2SO and kept at −80 °C until use, was determined by
quantitative amino acid analysis performed on HCl-hydrolyzed
samples.
Cleavage assays were performed in 27 µl of 50 mM Tris, pH
7.5, 2% CHAPS, 50% glycerol, 10 mM dithiothreitol, to
which 3 µl of substrate peptide in Me2SO, leading to 10%
final Me2SO concentration, were added. Enzyme
concentrations (50 nM to 6 µM) and incubation times were chosen to obtain <20% substrate conversion. Where
indicated, Pep4A having the sequence GSVVIVGRIILSGR(NH2)
was added at a final concentration of 3 µM. Reactions
were stopped by addition of 70 µl of 0.1% trifluoroacetic acid.
Cleavage of peptide substrates was determined by HPLC using a
Merck-Hitachi chromatograph equipped with an autosampler. 90-µl
samples were injected on a Lichrospher C18 reversed phase cartridge
column (4 × 125 mm, 5 µm, Merck) or on a Beckman Ultrasphere
ODS column (4.6 × 250 mm) and fragments were separated using a
3-100% acetonitrile gradient at 2%/min. Peak detection was
accomplished by monitoring both the absorbance at 220 nm and
fluorescence (λex = 260 nm,
λem = 305 nm).
Cleavage products were quantitated by integration of chromatograms with respect to appropriate standards. Initial rates of cleavage were determined on samples having <20% substrate conversion. Kinetic parameters were calculated from a least-squares fit of initial rates as a function of substrate concentration using Kaleidagraph software, assuming Michaelis-Menten kinetics. Where individual determination of kcat and Km was not possible, kcat/Km values were calculated from the slope of the linear part of the Michaelis-Menten plot at substrate concentrations < Km. All experiments were repeated at least three times.
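The two fitting routes described above can be illustrated with a minimal sketch. It is not the Kaleidagraph procedure itself: the function names, the search bounds for Km, and the synthetic data in the usage note are assumptions made for illustration. The first function fits v = Vmax·[S]/(Km + [S]) by least squares (for each trial Km the best Vmax is obtained in closed form, and Km is located by a coarse logarithmic scan followed by a golden-section refinement). The second estimates kcat/Km from the slope of the initial linear part of the plot, as done when kcat and Km could not be resolved individually.

```python
import math


def fit_michaelis_menten(s, v, km_lo=1e-3, km_hi=1e5):
    """Least-squares fit of v = Vmax*S/(Km+S); returns (vmax, km).

    s and v are sequences of substrate concentrations and initial rates.
    km_lo and km_hi are assumed search bounds in the units of s.
    """
    def vmax_for(km):
        # For a fixed Km the model is linear in Vmax, so the
        # least-squares Vmax has a closed form.
        x = [si / (km + si) for si in s]
        return sum(vi * xi for vi, xi in zip(v, x)) / sum(xi * xi for xi in x)

    def sse(log_km):
        km = math.exp(log_km)
        vmax = vmax_for(km)
        return sum((vi - vmax * si / (km + si)) ** 2 for si, vi in zip(s, v))

    # Coarse scan on a logarithmic Km grid.
    n = 400
    lo, hi = math.log(km_lo), math.log(km_hi)
    grid = [lo + (hi - lo) * i / n for i in range(n + 1)]
    k = min(range(n + 1), key=lambda i: sse(grid[i]))
    a, b = grid[max(k - 1, 0)], grid[min(k + 1, n)]

    # Golden-section refinement around the best grid point.
    g = (math.sqrt(5) - 1) / 2
    for _ in range(100):
        c, d = b - g * (b - a), a + g * (b - a)
        if sse(c) < sse(d):
            b = d
        else:
            a = c
    km = math.exp((a + b) / 2)
    return vmax_for(km), km


def kcat_over_km_from_slope(s_um, v_um_per_s, enzyme_um):
    """kcat/Km in M^-1 s^-1 from rates measured at [S] well below Km.

    Uses a least-squares line through the origin: v = (kcat/Km)*[E]*[S].
    Concentrations are in micromolar, rates in micromolar per second.
    """
    slope = sum(si * vi for si, vi in zip(s_um, v_um_per_s)) / sum(si * si for si in s_um)
    return slope / enzyme_um * 1e6  # convert uM^-1 s^-1 to M^-1 s^-1
```

As a usage example with synthetic (not measured) data, rates generated from Vmax = 2.0 and Km = 50 at substrate concentrations of 5 to 400 are recovered by `fit_michaelis_menten`, and dividing the fitted Vmax by the enzyme concentration gives kcat.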
To determine Ki values, initial velocities were determined as a function of substrate concentration in the presence of three fixed competing inhibitor concentrations. Ki values were calculated from least-squares fit using a modified Michaelis-Menten equation (Equation 1),
[equation image not available] (Eq. 1)
[equation image not available] (Eq. 2)
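For orientation, the textbook form of the Michaelis-Menten rate law in the presence of a competitive inhibitor is given below. This is an assumption about the kind of model that was fitted, based only on the description of "competing inhibitor" concentrations above; it is not a reproduction of Equation 1 or Equation 2. Here v is the initial velocity, [S] and [I] are the substrate and inhibitor concentrations, and Vmax, Km, and Ki have their usual meanings.

```latex
v = \frac{V_{\max}\,[S]}{K_m\left(1 + \dfrac{[I]}{K_i}\right) + [S]}
```

In this form the inhibitor raises the apparent Km by the factor (1 + [I]/Ki) while leaving Vmax unchanged, which is why data collected at several fixed inhibitor concentrations can be fitted jointly for Ki.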
Estimation of Active Site Concentration
To determine the number of active molecules in the enzyme preparation, 100 µM of the fluorogenic ester substrate Ac-DED(Edans)EEAbu[COO]ASK(Dabcyl)-NH2 (34) was added to a 100 nM enzyme solution prepared from a 5 µM stock previously quantitated by amino acid analysis. 50-µl samples were withdrawn at timed intervals and immediately quenched by addition of 50 µl of 1% trifluoroacetic acid. A total of 48 data points were collected within 3 min of reaction. The pre-steady-state burst, expected to equal the concentration of active sites in the enzyme preparation, was determined by extrapolation of product formation to zero time.
We first addressed the question of whether the activity of our protease preparation was attributable to a fully active enzyme or only to a fraction of enzymatically active protease molecules. To this purpose we used the fact that serine proteases follow a two-step mechanism: acylation of the active site serine residue with concomitant release of the C-terminal fragment of the substrate is followed by deacylation and the release of the N-terminal fragment. The former acylation reaction is usually the rate-limiting step for amide bond scission, whereas for ester substrates the acylation reaction is usually fast as compared with deacylation. Thus, for ester substrates the pre-steady-state release of the C-terminal leaving group creates an initial burst that equals the number of active sites encountered by the substrate (35).
To determine the number of active sites in our enzyme preparation 100 µM fluorogenic ester substrate was added to 100 nM enzyme, the reaction was stopped at timed intervals and
samples were analyzed by HPLC (Fig. 1). Extrapolation of
the linear phase of the reaction indicated a burst of 94 ± 10 nM demonstrating that 94% of the protease molecules in our
preparation were enzymatically active.
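The extrapolation step can be sketched as an ordinary linear regression of product concentration against time over the linear (steady-state) phase, with the intercept at zero time taken as the burst amplitude. The function name and the data points in the usage note are hypothetical and serve only to illustrate the arithmetic; they are not the measured time course of Fig. 1.

```python
def burst_amplitude(times_s, product_nm):
    """Return (intercept, slope) of a least-squares line through the
    steady-state points. The intercept at t = 0 estimates the burst,
    i.e. the concentration of active sites; the slope is the
    steady-state rate."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_p = sum(product_nm) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (p - mean_p) for t, p in zip(times_s, product_nm))
    slope = sxy / sxx
    intercept = mean_p - slope * mean_t
    return intercept, slope


def active_fraction(burst_nm, total_enzyme_nm):
    """Fraction of enzyme molecules that are catalytically active."""
    return burst_nm / total_enzyme_nm
```

With a hypothetical linear phase that extrapolates to 94 nM product at zero time and a total enzyme concentration of 100 nM, `active_fraction` returns 0.94, matching the 94% quoted above.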
Substrate Specificity
We investigated the substrate specificity of the NS3 protease by introducing several modifications into a decamer peptide whose sequence was based on the NS4A-NS4B junction. The choice for this sequence was determined by the relative ease of synthesis (the other two NS3 trans-cleavage sites have two cysteine residues that are prone to oxidation), by the fact that the corresponding junction is cleaved with a relatively high efficiency in the context of the polyprotein, and taking into account that previous mutagenesis studies have shown that this junction is intermediate with respect to its sensitivity to mutations (22). All experiments were done in buffer containing 50% glycerol and 2% CHAPS. These conditions have been worked out to yield optimum activity (26). However, glycerol might exert differential effects on the binding of polar and non-polar substrates. Since we found that glycerol activation is a very complex phenomenon, affecting both Km and kcat values of substrates, and in addition having a dramatic effect on the dissociation constant of the NS3-Pep4A complex, no attempts were made to study the influence of this agent on the interaction of the enzyme with different substrates. Our data therefore have to be interpreted with the caveat that they have been obtained in the presence of this not entirely passive solvent.
P1 Substitutions
The consensus sequence of the NS3-dependent junctions within the HCV polyprotein points to conservation of negatively charged residues in the P6 position, a cysteine or a threonine residue in the P1, and an alanine or a serine residue in the P1′ positions (Table I). In an attempt to better define the P1 specificity of the enzyme a series of cysteine substitutes, with a special focus on both natural as well as non-natural small hydrophobic residues, were introduced in the P1 position using a decamer NS4A-NS4B substrate spanning from P6 to P4′. This length has been shown by earlier work to be the optimum compromise between number of residues, cleavage efficiency, and ease of detectability by HPLC (26). Most P1 substitutions resulted in uncleaved substrates (Table II). We have estimated that our cleavage detection limit was in the order of kcat/Km = 0.05 M⁻¹ s⁻¹. Among the natural amino acids only threonine was accepted in P1 in the absence of Pep4A while addition of the co-factor also allowed cleavage, albeit at barely detectable levels, of a substrate having a valine residue in P1. The best cysteine substitutes turned out to be homocysteine, having the side chain length increased by 1 -CH2- unit with respect to cysteine, and allylglycine, in which the sulfhydryl group of cysteine is replaced by an ethenyl group. Still, these P1 residues decreased cleavage efficiency by 5-10-fold, both in the absence and presence of Pep4A. Replacement of the SH group by an amino or a hydroxyl group (in diaminopropionic acid and serine, respectively) abolished cleavage, as did incorporation of the SH group in a thiophene ring or its carboxymethylation. Replacement of the cysteine sulfhydryl by a methyl group in aminobutyric acid was compatible with cleavage of the resulting decamer, although at the expense of a 15- (+Pep4A) to 50-fold (−Pep4A) decrease in the respective kcat/Km values. Increasing the length of the side chain by incorporating norvaline in the P1 position resulted in a peptide that was cleaved only in the presence of Pep4A.
To determine whether the inability of the protease to cleave substrates with certain P1 substituents was due to poor ground state binding or to their inability to proceed through the transition state, we determined the Ki values of some selected peptides using the decamer peptide with cysteine in P1 as substrate. For amide substrates of serine proteases the relationship Ki ~ Km ~ Kd usually holds, indicating that Ki values are very good approximations of the true dissociation constants of the enzyme-substrate complex (36, 37). We verified this experimentally by determining both parameters for the substrate having Abu as P1 residue and obtained (in the presence of Pep4A): Km = 97 ± 28 µM and Ki = 133 ± 15 µM. Next, we determined the Ki values of decamer peptides having alanine, proline, phenylalanine, or serine in P1, which were therefore not cleaved (Table III). Interestingly, the Ki values differ only by a factor of 2-8 from the Km value determined for the wild type substrate, indicating that the P1 residue makes relatively minor contributions to ground state binding. This is true even for a bulky substituent such as phenylalanine.
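To put the 2-8-fold differences in perspective, a ratio of dissociation constants can be converted into a difference in binding free energy through ΔΔG = RT ln(ratio). The sketch below does this conversion. The temperature of 298.15 K is an assumption, since the assay temperature is not stated in this section, and the function name is illustrative. Under that assumption, a 2-fold to 8-fold change corresponds to roughly 0.4 to 1.2 kcal/mol.

```python
import math

# Gas constant in kcal mol^-1 K^-1.
R_KCAL = 1.987204e-3


def ddg_binding(ratio, temp_k=298.15):
    """Free energy difference (kcal/mol) corresponding to a ratio of
    dissociation constants, e.g. Ki(variant) / Km(wild type).
    temp_k is an assumed temperature, not a value from the text."""
    return R_KCAL * temp_k * math.log(ratio)
```

For example, `ddg_binding(133 / 97)` applies the same conversion to the Ki and Km values reported above for the Abu substrate and gives a difference of well under 1 kcal/mol, consistent with Ki being a good approximation of Km for that peptide.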
To investigate the role of the side chain length of the P6 residue we substituted the aspartic acid residue present in the NS4A-NS4B sequence by a glutamic acid. This substitution, although affecting slightly kcat and Km individually, had no significant effect on the overall cleavage efficiency (Table IV). Neutralization of the negative charge by introduction of an asparagine residue decreased the cleavage efficiency by a factor of 5. This effect was attributable mainly to an increase in Km. When the charge in the P6 position was inverted through introduction of a lysine residue, a pronounced decrease in kcat/Km was observed, which was again attributable to an impairment in ground state binding of the resulting substrate, as judged from the increase of the respective Km values. All these effects were less pronounced in the presence of the Pep4A cofactor.
Substitution of the P1′ alanine residue by a serine residue, which is found in this position in other substrates, moderately (2-5-fold) decreased kcat/Km (Table IV). Conversely, introduction of a phenylalanine residue in P1′ decreased kcat/Km by 2 orders of magnitude in the absence of the co-factor and 25-fold in its presence.
To investigate the relative contribution to cleavage efficiency of positions other than P1, P6, and P1′ we performed an alanine scanning experiment. The experiment was repeated also in the presence of saturating amounts of Pep4A. Table V summarizes the results. Only the substitution of the P1 cysteine resulted in complete abolishment of cleavage both in the presence and absence of the co-factor. Introduction of alanine residues in other positions had only slight effects on cleavage efficiency. The largest effect (a 7-fold decrease in efficiency) was observed for the P3 position in the absence of Pep4A.
The picture that emerges from our data confirms the importance of the consensus P1 and P1′ residues, and to a lower extent also of the P6 residue, in determining cleavage efficiency of a given substrate. Still, there are remarkable differences in the kinetic behavior of the single cis cleavage sites which nevertheless contain the same consensus P6, P1, and P1′ residues (22, 23). We have recently shown that these differences can be reproduced using decamer peptide substrates (26). Thus, there must exist additional determinants. Failure to detect them in the above mentioned alanine scanning experiment might indicate that it is the sum of several minor contributions that modulates the recognition of substrates containing the consensus residues in P6, P1, and P1′. To start to address this issue we synthesized a polyalanine peptide containing the consensus P6, P1, and P1′ residues. Based on the results of the alanine scanning, pointing to the P3 residue as an important contribution to cleavage efficiency, we decided also to fix a glutamic acid in this position. We extended the peptide to the P6′ residue, which being a tyrosine facilitated HPLC detection of cleavage products via monitoring of tyrosine fluorescence. We further introduced a lysine residue in P7′.
The parent peptide Ac-DAAEACAAAAPYK was cleaved, but cleavage was slowed down 70-85-fold with respect to the wild type sequence. Inspection of the sequences of the natural cleavage sites reveals that a negatively charged residue is conserved in the P5 position of two out of four cleavage sites. Re-introduction of the wild type aspartate in our minimalist substrate indeed resulted in a modest (~2-fold) enhancement of cleavage efficiency (Table VI).
In the P4′ position of natural substrates there appears to be a preference for hydrophobic residues (Table I). As a matter of fact, Leu, Tyr, or Trp residues are found in this position. Introduction of Tyr into the P4′ position of the minimalist substrate had no detectable effect, whereas Leu modestly increased cleavage rates. Both P4′ substituted peptides were very insoluble, thus not permitting individual determinations of kcat and Km. When both P3 Glu and P4′ Leu were re-introduced in the sequence a 20-23-fold enhancement of cleavage efficiency was observed, yielding a substrate that was cleaved with 22-34% efficiency with respect to the wild type sequence.
The homology model of the specificity pocket of the NS3 protease predicts that both its shape and its physico-chemical environment be primarily determined by the presence of phenylalanine 213 (according to the chymotrypsin numbering). Furthermore, the pocket was predicted to be very hydrophobic and closed by the aromatic ring of the phenylalanine. While this article was in preparation the crystal structure of NS3 protease has been solved independently by two groups (20, 21). In the published structures the side chain of Phe213 is indeed pointing inside the S1 pocket, thereby confirming the model. As the sulfhydryl group of cysteine has been shown to favorably interact with the aromatic ring of phenylalanine, cysteine has been suggested to be the most reasonable P1 residue. Our kinetic data are in line with these predictions. In fact, P1 substitutions have shown the following order of preferences: Cys > hCys ~ Alg > Abu > Thr > NVal > Val. Attempts at substituting the thiol group of cysteine with a hydroxyl group in serine or an amino group in diaminopropionic acid resulted in uncleaved peptides. Most likely these side chains are too hydrophilic to favorably interact with the hydrophobic milieu of the S1 pocket. Increasing the side chain length of the P1 residue in homocysteine and in allylglycine resulted in substrates that were still reasonably well cleaved. In contrast, incorporation of NVal, having the same side chain length resulted in a more pronounced impairment of cleavage efficiency. This could be related to a favorable interaction of the SH or the allyl group with the phenylalanine ring in the pocket which is expected not to occur in the case of the methyl group in NVal.
The kinetics of Thr and Val substituted peptides, the only two branched
residues for which we could detect cleavage, deserve some further
comment. The peptide substrate containing Val in P1 was detectably
cleaved only in the presence of Pep4A and with a relative efficiency
that was 15-fold lower than that observed for Thr in the P1 position.
As a matter of fact the isopropyl branch seems detrimental to
productive transition state binding as judged by the preference of NVal
over Val. Fig. 2 shows a schematic view of the S1 pocket
together with a cysteine docked into the pocket and a comparison of the
conformation of this cysteine with the conformers of Thr and Val most
commonly found in proteins. In this conformation the Val side chain is
more likely to encounter steric hindrances in contacting the pocket
than would be expected for Thr (Fig. 2). Alternatively, it could be
assumed that for both Thr and Val only a methyl group of the side chain
will point into the S1 pocket, whereas the other branch (a hydroxyl
group in Thr and a methyl group in Val) will point out of the pocket. In this view, the fact that Thr is preferred over Val could indicate that its hydroxyl group will make some contacts outside the pocket that
cannot be made by the methyl group of Val.
It is interesting to compare our data obtained with peptidic substrates
to previous reports in which point mutations were introduced into
polyprotein substrates. Kolykhalov and co-workers (22) have shown that
susceptibility to mutations depends on the sequence context, the
NS4A/NS4B cleavage site being intermediate between the least sensitive
NS3/NS4A junction and the most sensitive NS5A/NS5B cleavage site. Among
several substitutions, only Arg and Asp in the P1 position of the
NS4A/NS4B junction resulted in complete abolishment of cleavage while a
gradient following the order Asn < Gly < Ser < Thr < Cys < Leu was observed under conditions of short
metabolic labeling pulses. Remarkably, in this experiment Leu proved to
be even superior to Cys as P1 residue. In similar experiments
Bartenschlager et al. (23) found that in the context of a
NS4A/NS4B junction cleavage was reduced but still well detectable upon
substitution of the P1 Cys with Phe, Ser, Thr, and Ala. Clearly, these
findings are at variance with our data. However, both the experimental
context and the nature of the substrates that have been used might
explain these differences. In fact, in transient transfection
experiments, even using short labeling times, the accumulation of
considerable amounts of substrate and cleavage products is inevitable
leading to deviation from true initial rates. This fact compresses
differences in cleavage efficiencies. Furthermore, it is possible that
NS3 is more active on polyprotein substrates than it is using a peptide
substrate, thereby being less discriminative against suboptimal P1
residues. Differences in specific activity using either polyprotein or
peptidic substrates have been reported for other proteases such as CMV protease (38) or tPa (39). Using in vitro translated
substrates based on the NS5A/NS5B junction we have estimated the
specific activity of added purified protease to be in the order of
kcat/Km = 200,000 M⁻¹ s⁻¹.² Nevertheless, we wanted
to rule out that the relatively low activities we were observing using
peptidic substrates were due to an only partially active enzyme. We
have shown that we were indeed working with an enzyme population that
was composed of more than 90% of enzymatically active molecules,
indicating that the activities we measured under our experimental
conditions were intrinsic features of the enzyme.
We found that optimized S1 pocket occupancy was a major determinant of kcat, whereas the P1 residue exerted a less pronounced effect on ground state binding of substrate peptides. Incorporation in the P1 position of residues that reduced cleavage of the resulting peptide to undetectable levels (at least 100-fold) resulted in unaltered or only up to 8-fold decreased affinities, as judged from the respective Ki values. Interestingly, this turned out to be true even for residues that were expected to cause significant steric or conformational perturbations such as phenylalanine or proline. This behavior sheds some light on the mechanisms of substrate recognition by the NS3 protease: apparently ground state binding of the substrate is mediated by multiple interactions involving distal residues, whereas the efficiency with which the bound substrate will proceed through the transition state is strongly influenced by the nature of the residue in the P1 position. This dual requirement is probably needed to endow the enzyme with the high degree of specificity necessary to accomplish its physiological role of mediating the generation of the mature HCV replication machinery.
In the P6 position a conserved negative charge is present in all cleavage sites. From our data it appears that, at least for the NS4A/NS4B junction there is a preference but no stringent requirement for this negative charge. This finding is in agreement with what has been found by others through introduction of point mutations in polyprotein substrates (22, 23). In fact, it has been reported that, in the context of different cleavage sites, extensive mutagenesis of the P6 position has little if any effect on cleavage efficiency. The absolute conservation of this residue especially in the light of the pronounced variability of the HCV genome might therefore indicate that it serves some more subtle function.
We attempted to identify additional crucial determinants of recognition by the protease by both "classical" alanine scanning and "inverse alanine scanning" using a polyalanine substrate with 4 non-alanine positions. Re-introduction of wild type sequence residues in the minimalist polyalanine substrate causes a gradual but apparently synergistic increase in cleavage efficiencies. Thus, re-introduction of the wild type P5 and P4′ residues singly into the minimalist substrate results in 2-2.5-fold cleavage enhancement, whereas pairwise re-introduction of both residues results in a more than 20-fold increase in kcat/Km.
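The argument for synergy rests on simple arithmetic: if two substitutions acted independently, their fold effects on kcat/Km would be expected to multiply. The sketch below makes that comparison explicit. The function name is illustrative, and the inputs in the usage note are the rounded fold changes quoted in the text (2 and 2.5 for the single re-introductions, 20 as the lower bound for the pair), not the tabulated kinetic constants.

```python
def coupling_factor(single_folds, combined_fold):
    """Compare an observed combined fold change with the product of the
    individual fold changes expected for independent contributions.

    Returns (expected_fold, coupling). A coupling above 1 indicates a
    more-than-multiplicative (synergistic) effect."""
    expected = 1.0
    for fold in single_folds:
        expected *= fold
    return expected, combined_fold / expected


expected, coupling = coupling_factor([2.0, 2.5], 20.0)
print(expected, coupling)  # prints 5.0 4.0
```

Independent contributions of 2-fold and 2.5-fold would predict a 5-fold enhancement, so an observed enhancement of more than 20-fold exceeds the multiplicative expectation by at least a factor of 4.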
The picture emerging from this study points to a rather permissive substrate binding site, where the only absolute requirement for cleavage is a small, hydrophobic P1 residue. Several minor contributions arising from contacts with distal residues then cooperate in modulating substrate recognition and cleavage. Our findings can be rationalized in the context of the recently solved structure of the NS3 protease (20, 21): the substrate binding channel is relatively solvent exposed, with all the contributing loops being shorter or absent in comparison with the other serine proteases of similar fold, like chymotrypsin or elastase. Moreover, substrate modelling in the active site suggested that major binding contributions should come from the P6-P2 peptide backbone, with an apparent lack of P5-P2 side chain to enzyme interactions (21). Overall, it appears that a deeper understanding of how substrate recognition by NS3 is fine-tuned, possibly by the use of combinatorial peptide libraries spanning several simultaneous mutations, will be necessary to guide the design of substrate-based inhibitors of NS3.
We are grateful to R. Cortese for continuous support of this work, V. Matassa for valuable discussions and careful revision of the manuscript, S. Acali for peptide synthesis, G. Biasiol for protein purification, R. Petruzzelli for N-terminal sequence analysis, and P. Pucci and F. Naimo for mass spectrometry.