(Received for publication, October 27, 1995; and in revised form, March 13, 1996)
Human cathepsin K is a recently identified protein with high
primary sequence homology to members of the papain cysteine protease
superfamily including cathepsins S, L, and B and is selectively
expressed in osteoclasts (Drake, F. H., Dodds, R., James, I., Connor,
J., Debouck, C., Richardson, S., Lee, E., Rieman, D., Barthlow, R.,
Hastings, G., and Gowen, M. (1996) J. Biol. Chem. 271,
12511-12516). To characterize its catalytic properties, cathepsin
K has been expressed in baculovirus-infected SF21 cells and the soluble
recombinant protein isolated from growth media was purified. Purified
protein includes an inhibitory pro-leader sequence common to this
family of proteases. Conditions for enzyme activation upon removal of
the pro-sequence have been identified. Fluorogenic peptides have been
identified as substrates for mature cathepsin K. In addition, two
protein components of bone matrix, collagen and osteonectin, have been
shown to be substrates of the activated protease. Cathepsin K is
inhibited by E-64 and leupeptin, but not by pepstatin, EDTA,
phenylmethylsulfonyl fluoride, or phenanthroline, consistent with its
classification within the cysteine protease class. Leupeptin has been
characterized as a slow binding inhibitor of cathepsin K (k_obs/[I] = 273,000 M⁻¹ s⁻¹). Cathepsin K
may represent the elusive protease implicated in degradation of protein
matrix during bone resorption and represents a novel molecular target
in treatment of disease states associated with excessive bone loss such
as osteoporosis.
Remodeling of the human skeleton is an ongoing cyclical process that involves phases of bone resorption and replacement. Resorption of bone is carried out by multinuclear cells of hematopoietic lineage known as osteoclasts, while osteoblasts are responsible for deposition of new bone matrix. Osteoclasts resorb bone by creating an extracellular compartment, which is maintained at low pH, on the bone surface. The acidic environment removes the mineral phase of the underlying bone, exposing the organic, proteinaceous matrix to proteolytic degradation. Following this cycle, the recruitment of osteoblasts to the site would begin the process of laying down a new protein matrix that is subsequently mineralized.
Several studies
have suggested the involvement of cysteine class proteases in matrix
remodeling, including demonstrations that prototypic class inhibitors
such as leupeptin, E-64, and cystatin (1, 2, 3, 4) are effective in models
of osteoclast-mediated bone resorption (5). These inhibition
results along with other circumstantial observations, such as the low
pH activity of the cysteine protease located within osteoclasts, have
most often been interpreted as evidence for the involvement of
cathepsins B, S, or L in degradation of protein components within the
bone matrix. Recently, a novel protein with high sequence homology to
cysteine proteases of the papain/cathepsin superfamily (6, 7, 8, 9, 10) has been
shown to be highly expressed within the osteoclast, but not in cells
from spleen, liver, kidney, muscle, or lung (11). In contrast,
relatively low levels of cathepsins S, L, and B were found within the
osteoclast (11). This unique and selective cellular
distribution has prompted reference to this protein as cathepsin
K (11).
Its selective presence within the osteoclast suggests that cathepsin K is the previously elusive cysteine protease involved in bone resorption, and consequently may represent a potential target for therapeutic intervention in disease states involving excessive bone loss. Biochemical, functional, and structural studies of human cathepsin K have been initiated to further develop this concept. This report describes our approaches to heterologous protein expression, purification, and processing, leading to demonstrations of catalysis by mature cathepsin K with peptide and protein substrates as well as its interactions with prototypic protease class inhibitors.
Prestained molecular weight markers were purchased from
Amersham Corp. and Novex. Precast SDS-PAGE gels (12% and
15%) were obtained from Bio-Rad. Rapid Coomassie Blue protein stain was
acquired from Diversified Biotech. Fluorogenic peptides and prototypic
protease inhibitors were obtained from Bachem, Sigma, or Novo Biochem.
[³H]Propionylated rat type I collagen was
purchased from DuPont NEN. Protein concentrations were estimated by the
BCA dye reagent (Pierce) or the Bradford method using the commercially
available Bio-Rad protein assay. Two of the potential substrates
(Cbz-Leu-Leu-AMC and Cbz-Leu-Leu-Leu-AMC) were prepared by standard
methods of peptide synthesis; details of their syntheses will be
presented separately.
For construction of recombinant viruses, SF21 cells were cotransfected with purified AcNPV linear DNA purchased from Pharmingen and pBacCatK vector using the liposome-mediated transfection technique as described (12) and then incubated at room temperature for 4 days. The supernatants from the transfection were collected and used to select for the appropriate recombinant viruses as described (12), amplified, and stored as virus stocks for subsequent experiments.
The time-dependent inhibition of cathepsin K
by leupeptin was evaluated using progress curve analysis. Product
progress curves were obtained in the absence and the presence of
inhibitor under conditions as described above. All reactions were
initiated by addition of enzyme to solutions of substrate and
inhibitor. Values for k_obs at each concentration of
inhibitor were computed for individual curves by direct fitting of
the data to the integrated rate equation for slow-binding inhibition, where [AMC] is the concentration of
product formed over time t, v_0 is the
initial reaction velocity, and v_s is the final
steady state rate.
A complete discussion of this kinetic treatment has been presented elsewhere (15).
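The progress curve fit described above can be sketched in code. This is an illustration only. It assumes the commonly used integrated rate equation for slow-binding inhibition, [AMC] = v_s·t + (v_0 − v_s)·(1 − exp(−k_obs·t))/k_obs, which is not reproduced in the text here, and it uses a grid search with golden-section refinement rather than whatever fitting software was actually used. The function names and the synthetic data are hypothetical.

```python
import math


def progress_curve(t, v0, vs, k_obs):
    """Product formed at time t under the assumed slow-binding model."""
    decay = (1.0 - math.exp(-k_obs * t)) / k_obs
    return vs * t + (v0 - vs) * decay


def fit_velocities(times, product, k_obs):
    """For a fixed k_obs the model is linear in v0 and vs.

    Solves the 2x2 normal equations and returns (v0, vs, sse).
    """
    saa = sab = sbb = say = sby = 0.0
    for t, p in zip(times, product):
        a = (1.0 - math.exp(-k_obs * t)) / k_obs  # coefficient of v0
        b = t - a                                 # coefficient of vs
        saa += a * a
        sab += a * b
        sbb += b * b
        say += a * p
        sby += b * p
    det = saa * sbb - sab * sab
    v0 = (say * sbb - sby * sab) / det
    vs = (sby * saa - say * sab) / det
    sse = sum((p - progress_curve(t, v0, vs, k_obs)) ** 2
              for t, p in zip(times, product))
    return v0, vs, sse


def fit_progress_curve(times, product, k_min=1e-5, k_max=1.0, n_grid=60):
    """Estimate (v0, vs, k_obs) by a 1-D search over log(k_obs)."""
    def sse_at(log_k):
        return fit_velocities(times, product, math.exp(log_k))[2]

    lo, hi = math.log(k_min), math.log(k_max)
    grid = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]
    best = min(range(n_grid), key=lambda i: sse_at(grid[i]))
    a = grid[max(best - 1, 0)]
    b = grid[min(best + 1, n_grid - 1)]

    # Golden-section refinement inside the bracket around the best grid point.
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(80):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse_at(c) < sse_at(d):
            b = d
        else:
            a = c
    k_obs = math.exp((a + b) / 2.0)
    v0, vs, _ = fit_velocities(times, product, k_obs)
    return v0, vs, k_obs


if __name__ == "__main__":
    # Synthetic, noise-free illustration data (NOT measurements from the paper).
    times = [10.0 * i for i in range(61)]  # seconds
    data = [progress_curve(t, 2.0, 0.2, 0.01) for t in times]
    v0, vs, k_obs = fit_progress_curve(times, data)
    print(f"v0={v0:.3f}  vs={vs:.3f}  k_obs={k_obs:.5f} s^-1")
```

Because the assumed model is linear in v_0 and v_s once k_obs is fixed, only k_obs needs a nonlinear search, which keeps the sketch free of external dependencies.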
The primary sequence alignment of the pre-proforms of cathepsin K and human cathepsins S and L is presented in Fig. 1. Of the members of this family, the greatest primary sequence homology for the full-length prepro-protein forms occurs between cathepsin K and human cathepsin S (56% identity and 71% similarity using BESTFIT). Progressively lower homology is found with human cathepsin L (51% identity, 68% similarity), cathepsin H (42% identity, 59% similarity), and cathepsin B (27% identity, 53% similarity). The sequence similarity (55%) and identity (41%) between papain and cathepsin K is comparable to that for cathepsin H. Even higher homology is observed between the mature, catalytically active forms of these proteins and that predicted for mature cathepsin K. As examples, the homologies between the mature cathepsin K resulting from removal of the signal and leader sequences and that of human cathepsins S and L are 59% and 60% identity with 73% and 76% similarity, respectively.
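As a rough illustration of how a percent identity figure like those above is derived from a pairwise alignment, the sketch below counts identical residues over the aligned (non-gap) columns. The choice of denominator is an assumption, since the exact conventions BESTFIT applied are not stated here, and the example strings in the usage note are toy sequences, not cathepsin sequences. Percent similarity would additionally require a substitution scoring scheme, which is not shown.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap.

    Both inputs must already be aligned (equal length, gaps as `gap`).
    The denominator (gap-free columns) is an assumed convention.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared
```

For example, `percent_identity("ACDE-F", "ACDQGF")` compares five gap-free columns, of which four match.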
Figure 1: Cathepsin sequence comparisons. Symbols mark the processing sites for maturation of cathepsins S and L (amino termini) and the active site cysteine, histidine, and asparagine (catalytic triad) involved in catalysis. +, conserved cysteines involved in intramolecular disulfide bonds.
As shown in Fig. 1, the full-length sequence of
cathepsin K would appear to include a 15-amino acid signal (pre-)
sequence followed by an additional (pro-) leader sequence of 99 amino
acids, which is analogous to that found for the other cathepsins. The
putative signal sequence contains a positively charged amino acid
(Lys) close to the initial methionine and a subsequent
stretch of hydrophobic amino acids terminated by a consensus alanine
(Ala). The proposed mature, catalytically active form of
cathepsin K was predicted to result from cleavage between Arg
and Ala according to the alignment of a P′ consensus sequence
defined by conserved Pro, Ser, Val, and Asp residues. The active site cysteine, histidine,
and asparagine catalytic triad involved in proteolytic catalysis of the
papain protease family members can be similarly identified from the
sequence alignment to be Cys, His, and Asn of cathepsin K. All six of the Cys residues within
the mature forms of papain and the cathepsins that are thought to be
involved in three structural intramolecular disulfide bonds for
stabilization of mature enzyme are conserved within the protein
sequence of cathepsin K (see Fig. 1). The highest degree of
overall homology among the family members can be found in the vicinity
of the active site cysteines, which are located in the highly conserved
amino-terminal region of the mature enzyme forms.
Figure 2: Western blot analysis of cathepsin K expression in baculovirus-infected cells. Panel A: lane 1, uninfected SF21 cells; lane 2, SF21 cells infected with recombinant virus expressing PDE-IV; lanes 3 and 4, SF21 cells infected with viruses expressing cathepsin K. Panel B: lane 1, supernatant from SF21 cells infected with virus expressing PDE-IV; lane 2, pellet from SF21 cells infected with virus expressing PDE-IV; lane 3, supernatant from infected SF21 cells expressing cathepsin K; lane 4, pellet of infected SF21 cells expressing human cathepsin K.
The baculovirus genome encodes an endogenous cathepsin-like protein (17); however, lysates prepared from uninfected cells or cells infected with a nonrelated recombinant virus (18) produced no detectable protein band by immunoblotting (Fig. 2A, lane 2) using cathepsin K-specific antisera. This observation eliminates the possibility that viral infection of the cells enhances levels of an endogenous cathepsin-like protein, discounting the possibility of a false positive signal with the cathepsin K antiserum. Thus, the immunoreactive protein bands detected either in media collected from infected cells or in the cell pellets were produced only upon infection with recombinant vBacCatK virus.
To determine the optimal time of expression and pattern of accumulation of recombinant human cathepsin K protein, SF21 cells were infected with the appropriate recombinant viruses and soluble protein samples from various times after infection were analyzed by Western blotting (data not shown). Very low levels of cathepsin K expression were detected 48 h after viral infection, while the recombinant 37-kDa protein was shown to accumulate through 96 h, consistent with the regulation of the strong late Polh promoter driving expression (19).
Figure 5: SDS-PAGE analysis of procathepsin K activation at 4 °C, pH 4.0, initiated with addition of preactivated enzyme. Lanes 1 and 2, molecular size markers. Lane 3, empty. Lane 4, purified procathepsin K. Lane 5, procathepsin K in pH 4.0 buffer at time = 0 h. Lanes 6-9, samples of procathepsin K after incubation at pH 4.0, 4 °C for 24, 48, 96, and 144 h, respectively. Samples from the time course correspond to the squares in Fig. 4.
Figure 4: Alternative conditions of initiation and activation of procathepsin K. Activation of procathepsin K was initiated by incubation at 37 °C for 10 min, pH 4.5; incubation at 50 °C for 10 min, pH 4.5; addition of preactivated cathepsin K, 4 °C, pH 4.5; or addition of preactivated cathepsin K, 4 °C, pH 4.0 (solid squares). Following the activation step, the samples were incubated at 4 °C to promote formation and accumulation of the mature, catalytically active enzyme. Proteolytic activity was evaluated at pH 5.5 using Cbz-Phe-Arg-AMC as substrate.
One of the more common methods used in evaluation of catalytic properties of the cathepsin cysteine proteases employs the cleavage of a fluorogenic molecule (such as AMC) from the carboxyl terminus of a small peptide with an amino acid (AA) sequence of n residues, where activity is monitored upon release of the signal molecule (AMC). The primary uncertainty regarding the recognition of such a peptide as a substrate by cathepsin K involves the identity and number of the amino acids within the sequence.
In initial attempts
to demonstrate catalytic activity with recombinant cathepsin K obtained
by baculovirus expression, samples of media containing the soluble
37-kDa protein were analyzed with the fluorogenic peptides (see below)
and rat tail type 1 [³H]propionylated collagen as
substrates. In none of these experiments was the level of proteolytic
activity greater in media containing cathepsin K than in media from cells expressing
phosphodiesterase IV or protein kinase C as comparator controls. At the
same time, the levels of proteolytic activity in media from
non-infected cells were significantly greater than from cells infected
with virus. From these experiments, it was concluded that purification
of the recombinant cathepsin K from the endogenous host proteases would
be required for success in demonstration of proteolytic catalysis.
Upon expression of the inactive prepro- or pro-forms of related
proteases, enzyme activation has been accomplished by treatment at low
pH under reducing conditions at elevated
temperatures (23, 24, 25, 26). This
approach has been most successful when the recombinant expressed
protein is soluble, either within the cell or as a component of the
media (24, 25, 27, 28). The initial
success at demonstrating cathepsin K proteolytic activity was achieved
with recombinant procathepsin K from baculovirus expression that had
been partially purified using two chromatographic steps to greater than
75% homogeneity. Using the fluorogenic peptide substrate
Cbz-Phe-Arg-AMC, significant proteolytic activity was detected at pH
5.5 in samples of the procathepsin K that had been preincubated at 60
°C in pH 4.0 buffer containing 20 mM cysteine (Fig. 3). No activity was detected without the
temperature treatment, suggesting that the conditions of preincubation
resulted in the processing and activation of inactive procathepsin K
precursor. A separate set of studies has shown that maximal enzyme
activity is achieved at assay conditions near pH 5.5 (data not shown).
Figure 3: Activity of cathepsin K following heat activation at 60 °C. Samples of partially purified procathepsin K were incubated at 60 °C in the presence of 20 mM cysteine at pH 4.0, 4.5, 5.0, 5.5, or 6.0. Samples were removed over a period of 30 min and assayed at pH 5.5 using Cbz-Phe-Arg-AMC as substrate.
Follow-up studies have demonstrated that the time course of procathepsin K activation and processing is temperature-dependent, with lower temperature both slowing the rate of processing and generating an activated enzyme preparation of significantly higher specific activity. To allow efficient formation of the mature catalytically active form of cathepsin K, several studies were conducted to identify preferred experimental conditions for enzyme activation. In one experiment, samples of procathepsin K were subjected to variable conditions for the initiation of activation, including short exposure to elevated temperatures or addition of a catalytic aliquot of preactivated cathepsin K, followed by incubation at pH 3.5 to 6.0 at 4 °C to encourage additional processing and accumulation of the mature enzyme. From these studies it was determined that maximal activity toward Cbz-Phe-Arg-AMC (assays conducted at pH 5.5) was obtained under incubation conditions of pH 4.0. As shown in Fig. 4, a maximum specific activity was achieved after 24 h of incubation at 4 °C with the sample that had been initiated by addition of heat-preactivated cathepsin K. The resulting specific activity using these conditions was more than 10-fold higher than that which had been achieved by the direct heat activation procedures, suggesting a significant improvement in the quality of the resulting activated mature enzyme population.
Analysis of samples from this study by SDS-PAGE with Coomassie staining for protein demonstrated that catalytic activity is associated with a significant accumulation of the 27-kDa protein. Fig. 5 shows a representation of protein processing corresponding to the best activation conditions from the set of studies outlined above (pH 4.0; Fig. 4, solid squares). Integration of the bands from this gel suggests an overall yield of >60% in conversion of the 37-kDa procathepsin K to the 27-kDa mature enzyme.
Amino-terminal analysis of the 27-kDa
protein demonstrated the presence of two amino-terminal sequences,
RAPDSVDYRKKGY and GRAPDSVDYRKKGY, indicating that processing was offset
by one and two residues toward the amino terminus from that predicted
(APDSVDYRKKGY) by the primary sequence alignments with cathepsins S and
L (Fig. 1). Further studies to characterize the temporal
dependence of NH₂-terminal processing upon activation are in
progress.
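The one- and two-residue offsets can be checked directly from the sequences quoted above. The short sketch below is only an illustration of that comparison; the helper name is hypothetical, and the sequences are the ones given in the text.

```python
# Sequences as quoted in the text (single-letter amino acid code).
PREDICTED_N_TERMINUS = "APDSVDYRKKGY"
OBSERVED_N_TERMINI = ["RAPDSVDYRKKGY", "GRAPDSVDYRKKGY"]


def n_terminal_extension(observed, predicted):
    """Number of extra residues that precede the predicted start in `observed`."""
    index = observed.find(predicted)
    if index < 0:
        raise ValueError("predicted sequence not found in observed sequence")
    return index


if __name__ == "__main__":
    for seq in OBSERVED_N_TERMINI:
        extra = n_terminal_extension(seq, PREDICTED_N_TERMINUS)
        print(f"{seq}: {extra} residue(s) upstream of the predicted site")
```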
Figure 6: Inhibition of cathepsin K by prototypic protease-class inhibitors. Assays were conducted at pH 5.5 using Cbz-Phe-Arg-AMC as substrate in the presence of variable concentrations of E-64, leupeptin, pepstatin, and phenylmethylsulfonyl fluoride.
Additional experiments have shown that leupeptin
is a time-dependent inhibitor of cathepsin K and is significantly more
potent than originally estimated in the profiling studies with the
prototypic class inhibitors. Representative progress curves showing the
release of AMC from peptide substrate are depicted in Fig. 7.
Product formation over the assay period in the absence of inhibitor was
shown to be linear with incubation time, while curvature of the
progress curves was observed in assays that included leupeptin. The
shape of these product progress curves is consistent with an increasing
loss of enzyme activity with time, which is characteristic of the slow
binding of inhibitor to enzyme as described (15). A plot of the observed rates of enzyme
inhibition (k_obs) from a series of progress curves versus the concentration of leupeptin appears linear (Fig. 7, inset), yielding from the slope of this replot
an approximate value of 273,000 M⁻¹ s⁻¹ for k_obs/[inhibitor], the apparent second
order rate constant of inactivation.
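The replot step can be sketched as an ordinary least-squares line through k_obs versus inhibitor concentration, with the slope taken as the apparent second order rate constant. In the sketch below the leupeptin concentrations come from the Fig. 7 legend, but the k_obs values are synthetic placeholders generated from the reported constant purely for illustration; they are not the measured values, and the function names are hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Leupeptin concentrations from the Fig. 7 legend, converted from nM to M.
LEUPEPTIN_M = [c * 1e-9 for c in (2, 4, 7.5, 10, 15, 20, 50)]


def synthetic_k_obs(conc_m, rate_constant=273_000.0):
    """Placeholder k_obs values (s^-1) lying exactly on a line through the origin.

    Generated for illustration only; NOT data from the paper.
    """
    return [rate_constant * c for c in conc_m]


if __name__ == "__main__":
    slope, intercept = linear_fit(LEUPEPTIN_M, synthetic_k_obs(LEUPEPTIN_M))
    print(f"k_obs/[I] = {slope:.0f} M^-1 s^-1, intercept = {intercept:.2e} s^-1")
```

With real data the intercept would not be forced to zero, so the fit reports it separately from the slope.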
Figure 7: Leupeptin inhibition of cathepsin K by progress curve analysis. Cathepsin K product formation (AMC) was monitored over time in the absence or presence of leupeptin. Representative curves as shown include 2, 4, 7.5, 10, 15, 20, or 50 nM leupeptin. Observed rates of inhibition (k_obs), obtained from fitting the data sets to the integrated rate equation, were replotted versus inhibitor concentrations (inset) to generate an apparent second order rate constant for inhibition of 273,000 M⁻¹ s⁻¹.
This relatively poor ability of cathepsin K to degrade fibrinogen led to the search for alternative matrix-associated substrates. The high levels of cathepsin K detected within osteoclasts (11) concentrated our focus on constituents within bone. Type 1 collagen represents the major structural protein in bone, comprising approximately 90% of the protein matrix. The remaining 10% of matrix in bone consists of several other elements, including osteocalcin, osteopontin, osteonectin, thrombospondin, fibronectin, and bone sialoprotein (29). While the exact roles of these non-collagenous proteins are not well understood, they appear to serve as cell adhesive proteins and may play a role in matrix mineralization (30).
Two of these proteins, collagen and
osteonectin, have been shown to be substrates for mature cathepsin K.
As in the case with fibrinogen, incubation of activated enzyme at
approximately equimolar concentrations to
[³H]propionylated collagen resulted in partial
release of radiolabel from this modified matrix protein (data not
shown). In contrast, proteolytic processing of osteonectin was achieved
using much lower (catalytic) concentrations of activated cathepsin K.
As depicted in Fig. 8, limited proteolysis of parent osteonectin
(42 kDa) resulted in the generation of several smaller protein
fragments in a time-dependent manner, with accumulation of three main
bands of 34, 14, and 10 kDa after 2 h of exposure. Different
amino-terminal sequences were obtained from the 34- and 14-kDa
fragments, with respective sequences of Gln-Glu-Ala-Leu and
Val-Lys-Lys-Ile, representing amino acids that must bind within the P′
domain of the cathepsin K catalytic site. Localization of these
fragments within full-length human osteonectin has shown that the
sequence from the 34-kDa protein corresponds to the amino terminus of
mature osteonectin missing the first three amino acids
(Ala-Pro-Gln ↓ Gln-Glu-Ala-Leu . . . ). The amino-terminal sequence
of the 14-kDa product (band B, Fig. 8) indicates that this
fragment is generated upon cleavage at a site internal to osteonectin
( . . . Gln-Lys-Leu-Arg ↓ Val-Lys-Lys-Ile-His . . . ). Most
interesting is that the amino terminus from the 14-kDa fragment
predicts the presence of the dipeptide Leu-Arg at positions
P2-P1, which is consistent with the sequence from
one of the better fluorogenic peptide substrates for cathepsin K (Table 1). Initial attempts to obtain amino-terminal sequence
data of the 10-kDa fragment (band C, Fig. 8) were unsuccessful
due to insufficient sample. Efforts to identify the cleavage site
defined by this product are continuing. It is anticipated that these
observations will lead to the identification of improved peptide-based
substrates and small molecule inhibitors for human osteoclast cathepsin
K.
Figure 8: Limited proteolysis of osteonectin by cathepsin K. Samples of osteonectin (lane 3) were incubated with activated cathepsin K at pH 5.5 for 30 (lane 5), 60 (lane 6), 90 (lane 7), or 120 (lane 8) min and analyzed by SDS-PAGE. A parallel gel was used for determination of amino-terminal sequences for the 34-, 14-, and 10-kDa bands after 120 min of incubation. Lane 1 contains molecular size standards; lanes 2 and 4 were empty.
To gain a better understanding of its functional role within the osteoclast in support of this proposal, we have undertaken the biochemical and functional characterization of human cathepsin K. Expression of pre-procathepsin K with the baculovirus system has been successful at generating soluble recombinant protein. The 37-kDa protein isolated from growth media of infected SF21 cells, which was purified to greater than 95% homogeneity, has been shown to have been processed with removal of the pre-leader sequence, a common characteristic of papain and the cathepsins. Conditions of low pH in the presence of cysteine have been identified that effect the conversion of procathepsin K to a mature, catalytically active 27-kDa protein. Catalytic activity of the mature enzyme toward peptide and protein substrates occurs at low pH, consistent with the published observations that have led to speculation that the protease in osteoclasts involved in bone resorption may be cathepsin B, S, or L (1, 2, 3, 4, 5). Taken with the observations of its selective expression at high concentrations within osteoclasts (10, 11), these initial kinetic and biochemical characteristics support the concept that cathepsin K is intimately involved in the process of bone resorption and represents a novel molecular target for treatment of disease states, such as osteoporosis, that are associated with excessive bone loss.