The chymotrypsin family of serine proteases is a paradigm for
enzymic substrate recognition. The family is subdivided on the basis of
four major classes of P1 residue substrate specificity:
basic, aromatic, aliphatic, and acidic. These specificities are usually
mutually exclusive; substrate discrimination is on the order of
10
to 10
in kcat/Km for trypsin
(Lys > Phe), chymotrypsin (Phe > Lys, Phe > Ala), elastase
(Ala > Tyr), and V8 protease (Glu > Ala)(2, 3) .
These distinct specificities arise from subtle modification of surface
loops surrounding a conserved double β-barrel core
structure (3). Sequence and structural similarity suggested a
classical model in which only a few critical residues determine
substrate specificity(4) . However, recent studies demonstrate
that the conversion of one protease into another is complex, requiring
the transplantation of several active site
loops(5, 6) . Thus, the evolutionary optimization of
this enzyme family may obscure important mechanistic and structural
commonalities regarding substrate specificity.
Serine collagenase 1
(EC 3.4.21.32) isolated from the hepatopancreas of the fiddler crab, Uca pugilator, is a serine protease capable of cleaving native
triple helical collagen(7) . The serine collagenases comprise a
large family of homologous, yet nonidentical enzymes of mostly
invertebrate origin(8) . These collagenases appear to serve
primarily a digestive function. Other serine collagenases have been
implicated in pulmonary, parasitic, and bacterial diseases (9, 10, 11) . The enzymology of crab
collagenase is unusual, as it possesses activities similar not only to
the matrix metallocollagenases, but also to the serine proteases
trypsin, chymotrypsin, and
elastase(12, 13, 14) . The collagen cleavage
sites of crab collagenase have recently been identified and are located
in the protease-sensitive region 3/4 of the length of the collagen
chain from the amino terminus (14). Because crab collagenase and the
metallocollagenases attack collagen at a similar location, crab
collagenase is an alternative model system for the elucidation of
protease-collagen interactions. Crab collagenase also presents the
opportunity to study, in a unified manner, the nature of hydrophobic
and basic substrate specificity in the chymotrypsin family of serine
proteases.
We present here the cloning, expression, and
characterization of crab serine collagenase 1. The collagenolytic
activity of the recombinant enzyme is identical to that isolated from
crab hepatopancreas. Quantitative structure-activity relationships are
determined for collagenase and compared to the serine protease homologs
trypsin, chymotrypsin, and elastase. These criteria show serine
collagenase 1 to be a novel member of the chymotrypsin protease family.
EXPERIMENTAL PROCEDURES
RNA Isolation and cDNA Library Construction
Live
fiddler crabs (U. pugilator) were obtained from Gulf Specimen
Marine Laboratory (Panacea, FL). The hepatopancreas was dissected,
immediately frozen in liquid nitrogen, and stored at -80 °C.
Total RNA was extracted from the frozen hepatopancreas using guanidine
thiocyanate and partially purified by ultracentrifugation through a
cesium trifluoroacetate gradient(15) . Poly(A)
RNA was isolated from total RNA by hybridization to biotinylated
oligo(dT), which was recovered from solution using streptavidin-coated
paramagnetic beads (Poly(A)Tract, Promega). All RNA was stored under
ethanol at -80 °C. A Lambda Zap II crab hepatopancreas
cDNA library was constructed and amplified by Clontech Laboratories
(Palo Alto, CA). The library contains 1.8
10
independent clones, with a cDNA insert size range of 1.0-5
kilobase pairs.
Isolation of the Crab Collagenase cDNA
The polymerase chain reaction (PCR) was used to amplify a
fragment of the crab collagenase cDNA from the U. pugilator hepatopancreas library. Two degenerate PCR primers denoted FCN1
and FCC1 were synthesized based on the amino and carboxyl termini of
the mature protease amino acid sequence (16) (FCN1,
5'-TGCTCTAGA-GTI-GA(A/G)-GCI-GTI-CCI-AA(T/C)-TCI-TGG-3'; FCC1,
5'-GATAAGCTTGA-TTA-IGG-IGT-IAT-ICC-IGT-(T/C)TG-IGT-(T/C)TG-IAT-CCA-3').
Inosine was used to reduce the degeneracy of the oligonucleotide pool by
broadening the base pairing potential at these positions. 5 µl of
library stock containing 3.5
10
phage were
subjected to PCR with the FCN1 and FCC1 oligonucleotides using standard
conditions (17) . The PCR reaction consisted of five cycles of
1 min of annealing at 44 °C, 2 min of polymerization at 72 °C,
and 1 min of denaturation at 95 °C; followed by 30 cycles with an
elevated annealing temperature of 50 °C. The single-band PCR
product was purified by agarose gel electrophoresis and Geneclean (Bio
101). The PCR product was sequenced by the dideoxy method, using
Sequenase T7 DNA polymerase (U. S. Biochemical Corp.) and the FCN1 and
FCC1 primers. The library was plated with Escherichia coli strain XL1-Blue, adsorbed in duplicate to nitrocellulose filters,
denatured, and fixed according to standard manufacturer's
instructions (Stratagene, Clontech). The probe
5'-CA-(G/A)AA-(G/A)TA-CAT-(G/A)TC-(G/A)TC-(G/A/T)AT-(G/A)AA-3' was a
degenerate oligodeoxynucleotide based on the FIDDMYFC (residues
34-42) motif of the crab collagenase protein
sequence (16). The 5' end of the degenerate probe was
radiolabeled using T4 polynucleotide kinase and
[γ-32P]ATP and hybridized to the plaque lifts
overnight at 42 °C as described (18). The filters were
washed at 47 °C and autoradiographed (18). Excision and
rescue of the Bluescript plasmid containing the cDNA insert was carried
out according to the manufacturer's instructions (Stratagene).
Both strands of the cDNA clones comprising the composite map were
sequenced by the dideoxy method using Sequenase.
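The degeneracy of a mixed-base oligonucleotide such as the FIDDMYFC probe follows directly from its written sequence: each parenthesized position multiplies the pool size by the number of alternative bases. The short Python sketch below illustrates that count for the probe as printed above. The function names are illustrative only and are not part of the original methods.

```python
import re

# Probe sequence as printed in the text; parenthesized groups are mixed-base positions.
PROBE = "CA-(G/A)AA-(G/A)TA-CAT-(G/A)TC-(G/A)TC-(G/A/T)AT-(G/A)AA"


def degeneracy(oligo: str) -> int:
    """Return the number of distinct sequences in a pool written with (X/Y) groups."""
    count = 1
    for group in re.findall(r"\(([^)]*)\)", oligo):
        count *= len(group.split("/"))
    return count


def length_nt(oligo: str) -> int:
    """Return the oligonucleotide length, counting each mixed position as one base."""
    collapsed = re.sub(r"\([^)]*\)", "N", oligo)  # one placeholder per mixed position
    return len(collapsed.replace("-", ""))


if __name__ == "__main__":
    print("degeneracy:", degeneracy(PROBE))
    print("length (nt):", length_nt(PROBE))
```

Five two-base positions and one three-base position give 2 x 2 x 2 x 2 x 2 x 3 = 96 sequences in the pool.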
Subsequent screens
of the library were carried out using homologous probes generated by
[α-32P]dCTP PCR from the collagenase clone
denoted FC1 (see below) (19). Either an EcoRI fragment
containing the entire FC1 cDNA or a 200-base pair EcoRI-NheI fragment of the 5' end of the cDNA was
used as the template. Under the conditions of limiting dCTP and high
template concentration, the reaction products resembled those of primer
extension rather than fragment amplification. These homologous probes
were hybridized overnight at 50 °C(18) . The filters were
then washed at 65 °C and autoradiographed as described (18) .
Amino Acid Alignment and Secondary Structure Modeling of
Crab Collagenase
The putative signal peptide of crab collagenase
was determined by the hydrophobic nature of the amino
acids(20) . The amino acid sequences of crab procollagenase and
shrimp chymotrypsinogen (EMBL accession no. X66415), rat anionic
trypsinogen 2 (Protein Identification Resource (PIR) code, TRRT2;
Protein Data Bank (PDB) code, 1BRA), bovine chymotrypsinogen A (PIR
code, KYBOA; PDB code, 7GCH), and porcine proelastase 1 (PIR code,
ELPG; PDB code, 3EST) were aligned using the PILEUP program of the GCG
software package (Genetics Computer Group, Madison, Wisconsin), and
consensus structural constraints, as derived from alignment of
proteases of known three-dimensional
structure(21, 22) .
Expression and Purification of the Recombinant Crab
Procollagenase in Yeast
The zymogen form of crab collagenase
(procollagenase) was cloned in frame with the α-factor leader of
the PsT vector (5). PCR with Pfu DNA polymerase
(Stratagene) was used to generate the necessary HindIII and SalI restriction endonuclease cleavage sites. This construct
was named PsFC. The full expression vector was created by subcloning
the PsFC SstI/SalI fragment containing the alcohol
dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase promoter,
α-factor leader and procollagenase into the PyT 1 µM circle yeast/E. coli shuttle vector (5), yielding
PyFC. The PyFC construct was electroporated into the AB110 or DM101
strain of Saccharomyces cerevisiae, and
transformants were selected by growth at 30 °C on SD (8% glucose)
plates lacking either uracil or leucine(23) . A small culture
was grown in SD-Leu (8% glucose) for 36 h at 30 °C
with gentle shaking. This culture was diluted 1:20 into YPD (2%
glucose) and grown for 60-72 h at 30 °C with gentle shaking.
The yeast cells were removed by centrifugation and the supernatant was
adjusted to pH 7.4 by addition of Tris base to a final concentration of
10 mM. DEAE chromatography was performed as described for the
enzyme isolated from the crab hepatopancreas(14) . Fractions
were assayed for procollagenase either by Western blot analysis or by
activation with trypsin. The activation assay contained 20 µl of
sample, 5 µl of 1 µM TPCK-treated bovine trypsin
(Sigma), and 200 µl of 400 µM Suc-AAP-Leu-pNA in 50
mM Tris, 100 mM NaCl, 20 mM CaCl2, pH 8.0. The reaction course was monitored at 405
nm at room temperature using a UV microtiter plate
reader (Molecular Devices). The fractions containing procollagenase
were pooled and adjusted to 50 mM Tris, 100 mM NaCl,
20 mM CaCl2, pH 8.0. Addition of a 0.5% volume of
TPCK-treated, agarose-immobilized bovine trypsin (Sigma) resulted in
complete activation of the zymogen after 2 h of gentle shaking at room
temperature, as monitored by increase in activity toward
Suc-AAP-Leu-pNA. The activated collagenase was further purified by
bovine pancreatic trypsin inhibitor affinity
chromatography(14) . An overall yield of 1 mg of recombinant
collagenase/liter of yeast culture was achieved.
Kinetic Analysis of Recombinant Collagenase, Trypsin,
Chymotrypsin, and Elastase
Collagenase was prepared from crab
hepatopancreas as described(14) . Recombinant rat trypsin was
purified as described(24) . Other reagents were purchased from
the following sources: p-tosyl-L-lysine chloromethyl
ketone-treated bovine chymotrypsin (Sigma), porcine elastase
(Calbiochem), bovine calf skin collagen (U. S. Biochemical Corp.),
Suc-AAP-Abu-pNA (Bachem, Torrance, CA) and Z-GPR-Sbzl (Enzyme Systems
Products). All other substrates were from Bachem Bioscience. All enzyme
active site titrations, substrate calibrations, kinetic assays, and
collagen digestions were carried out as
described (14, 25). Briefly, pNA kinetic assays were
monitored at 410 nm (E410 = 8,480 M⁻¹ cm⁻¹) in 50 mM Tris, 100 mM NaCl, 20 mM CaCl2, pH
8.0, at 25 °C. A total of 1-4% N,N-dimethylformamide
or 2% Me2SO was present in the final reaction buffer.
Benzylthioester kinetic assays were monitored at 324 nm (E324
= 19,800 M⁻¹ cm⁻¹) in the above buffer at 25 °C with the
inclusion of 250 µM dithiodipyridine (Chemical Dynamics)
and 2% N,N-dimethylformamide. 7-Amino-4-methylcoumarin
spectrofluorimetric assays were monitored at an excitation wavelength
of 380 nm and an emission wavelength of 460 nm, under conditions
identical to those for pNA. Assays were done in duplicate for five
substrate concentrations, except for Suc-AAP-Asp-pNA, for which kcat/Km
was determined using
three substrate concentrations in duplicate. The steady state kinetic
parameters were determined by non-linear regression fit to the
Michaelis-Menten equation. Standard deviation in kcat/Km
was generally less than
10%, though individual rate and binding constants varied to a greater
extent. In particular, error for elastase was 15% in kcat
versus Suc-AAP-Val-pNA and 25% in Km
versus Suc-AAP-Ile-pNA. Kinetic parameters were plotted versus P1 residue volume (26) and the hydrophobicity
constant, π (27).
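As an illustration of the data reduction described above, the following Python sketch converts an absorbance slope to a reaction velocity with the pNA extinction coefficient and then fits the Michaelis-Menten equation by non-linear least squares. It is a minimal sketch under stated assumptions, not the fitting program used in this work: the 1-cm path length, the substrate concentrations, the enzyme concentration, and the synthetic rates are all hypothetical, and the fitting strategy (a one-dimensional search over Km with Vmax solved in closed form at each step) is chosen here only because it needs no external libraries.

```python
import math

EPSILON_PNA = 8480.0  # M^-1 cm^-1 at 410 nm, as given in the text


def velocity_uM_per_min(dA_per_min, path_cm=1.0, epsilon=EPSILON_PNA):
    """Beer-Lambert conversion of an absorbance slope to a rate in µM/min.

    path_cm = 1.0 is an assumption; a microtiter well has a different path length.
    """
    return dA_per_min / (epsilon * path_cm) * 1e6


def fit_michaelis_menten(s, v):
    """Least-squares fit of v = Vmax * S / (Km + S); returns (Vmax, Km).

    For any trial Km the best Vmax has a closed form, so only Km is searched,
    here by golden-section search on log(Km).
    """
    def vmax_and_sse(km):
        f = [si / (km + si) for si in s]
        vmax = sum(vi * fi for vi, fi in zip(v, f)) / sum(fi * fi for fi in f)
        sse = sum((vi - vmax * fi) ** 2 for vi, fi in zip(v, f))
        return vmax, sse

    lo = math.log(min(s) / 100.0)   # search window: assumed to bracket Km
    hi = math.log(max(s) * 100.0)
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(200):
        a = hi - ratio * (hi - lo)
        b = lo + ratio * (hi - lo)
        if vmax_and_sse(math.exp(a))[1] < vmax_and_sse(math.exp(b))[1]:
            hi = b   # minimum lies in [lo, b]
        else:
            lo = a   # minimum lies in [a, hi]
    km = math.exp((lo + hi) / 2.0)
    vmax, _ = vmax_and_sse(km)
    return vmax, km


if __name__ == "__main__":
    substrate_uM = [25.0, 50.0, 100.0, 200.0, 400.0]   # hypothetical, five concentrations
    rates = [10.0 * s / (100.0 + s) for s in substrate_uM]  # synthetic, noise-free µM/min
    vmax, km = fit_michaelis_menten(substrate_uM, rates)
    enzyme_uM = 0.01                                    # hypothetical enzyme concentration
    kcat = vmax / enzyme_uM                             # per min
    print(f"Vmax = {vmax:.3f} µM/min, Km = {km:.1f} µM, kcat/Km = {kcat / km:.2f} /min/µM")
```

With real duplicate measurements, the same function would be called on the pooled (S, v) pairs; error estimates such as those quoted above would require an additional step not shown here.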
RESULTS
Detection and Isolation of Crab Collagenase Clones from
the Hepatopancreas cDNA Library
Crab collagenase clones were
detected in the cDNA library by two methods utilizing degenerate
oligonucleotides based on the amino acid sequence of the
protease(16) . In the first method, a set of oligonucleotides,
FCN1 and FCC1, complementary to the amino and carboxyl termini of
mature collagenase were used in the polymerase chain reaction to
amplify a DNA fragment from the cDNA library. A single, intense band of
approximately the size of the mature protease (670 base pairs) was
produced. Direct sequencing of the PCR DNA yielded
sequence around His, Gly, and Phe
(chymotrypsinogen numbering) of the collagenase. The
cDNA library was also screened with a degenerate oligonucleotide
complementary to the FIDDMYFC sequence of the collagenase (residues
34-42). This sequence was chosen for three reasons: 1) minimal
sequence identity to other serine proteases, 2) proximity to the 5' end
of the gene permitting isolation of more full-length clones from the
oligo(dT)-primed cDNA library, and 3) low amino acid coding degeneracy
(96-fold degenerate). 40,000 plaques were screened, yielding 10
primary, 7 secondary, and 3 tertiary isolates. The most complete clone,
denoted FC1, contains a 15-amino acid signal sequence, a 29-amino acid
zymogen peptide, and the entire 226-amino acid mature form of the
collagenase, as well as 143 bases of 5'- and 153 bases of
3'-untranslated sequence (see Fig. 1 and below). The likely
start codon of clone FC1 is a non-optimal AGG (Arg), rather than the
expected ATG (Met)(28) . Further screening of the library was
indicated, as no ATG start codon could be located in any reading frame
near the expected start site. Screening of an additional 30,000 plaques
with PCR fragments generated from the FC1 template yielded 15 primary,
9 secondary, and 6 tertiary isolates. Two clones, FC2 and FC3, yielded
necessary sequence data. Clone FC2 provided the requisite ATG start
codon, though uncharacterized recombination events rendered the
5'-untranslated region and the 3' third of the cDNA unusable. Clone FC3
encoded the complete collagenase zymogen minus the signal sequence and
5'-untranslated region, while the 3'-untranslated region extends into
the poly(A) tail. The cDNA presented in Fig. 1 is a composite of
FC1, the ATG start of FC2, and the poly(A) tail of FC3. The coding
sequences of all clones were identical.
Figure 1:
Composite U. pugilator serine collagenase 1 cDNA. Nucleotides 1-144 and
146-1042 are of clone FC1, 144-146 are of clone FC2, and
1043-1109 are of clone FC3. The 1.1-kilobase pair cDNA is
underwritten by the open reading frame corresponding to the putative
coding sequence. The predicted zymogen peptide begins at nucleotide 189
(Ser) and the mature collagenase begins at nucleotide 276 (Ile), as
indicated in bold.
Sequence Analysis of Recombinant Collagenase
The
published amino acid sequence (16) contained six changes
relative to the sequence predicted from the cDNA. These changes appear
to reflect errors in the original amino acid sequence determination,
rather than amino acid variation due to the cloning of an isozyme of
crab collagenase. The discrepancies and the possible
causes are: I106V, carryover of Val; S110V, weak
detection of Ser; S164N/N165S, acid-induced N→O acyl shift, weak
detection of Ser and Asn; N192D and N202D, acid-induced deamidation
(chymotrypsinogen numbering, where the first letter denotes the amino
acid predicted from the cDNA sequence and the second letter denotes the
amino acid from the original sequence determination). One of the errors
in the protein sequence, N192D, maps to the rim of the S1 site, and
must be considered regarding the possible effect of the negative charge
on substrate recognition. The other errors appear to map to the surface
of the enzyme and are most likely functionally inconsequential. The
amino acid sequence of mature crab collagenase is homologous to the
mammalian serine proteases trypsin, chymotrypsin, and elastase (35%
identity) and to shrimp chymotrypsin (75% identity), another serine
collagenase (Fig. 2)(16, 29) . Virtually all
major structural features of a chymotrypsin-like serine protease are
found in crab collagenase. Three disulfide bonds (residues 42:58,
168:182, and 191:220) are conserved. Conservation of the double
β-barrel core is strict, and the surface loops are similar in size to
those of the vertebrate paradigms. Some are of unique sequence and may
play a role in determining the broad substrate specificity of crab
collagenase. An unusual crab collagenase active site geometry of
Gly189 and Asp226, as compared to Asp189 and Gly226 in trypsin, is maintained in the
cDNA (16).
Figure 2:
Amino acid sequence alignment of crab
collagenase (FC), shrimp chymotrypsinogen (SK), rat
anionic trypsinogen 2 (TN), bovine chymotrypsinogen A (CT), and porcine elastase 1 (EL). CT#,
chymotrypsinogen numbering; SS, secondary structure. b, β sheet; a, α helix; t, turn (21, 22). Catalytic residues and cysteines are
highlighted in bold.
Comparison of the zymogen peptides of these
enzymes serves to further delineate the group, as they are of variable
length and share little identity (Fig. 2). Crab collagenase and
shrimp chymotrypsin possess zymogen peptides that are 2-3 times
longer than those of the vertebrate proteases. The purpose of these
large activation domains is unclear, as they are not required for
heterologous expression of vertebrate proteases such as
trypsin(30) . The activation site of procollagenase,
VKSSR-IVGG, is more similar to those of chymotrypsinogen, SGLSR-IVVG,
and proelastase, ETNAR-VVGG, which are activated by trypsin, than that
of trypsinogen, DDDDK-IVGG, which is activated by
enterokinase(31) . Crab collagenase may self-activate, or
another trypsin-like protease in the crab hepatopancreas may perform
this function(32) . The primary sequence alignment suggests
that crab collagenase and shrimp chymotrypsin are members of a novel
serine protease subfamily.
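The comparison of activation sites can be made concrete by counting identical residues on the propeptide side of the scissile bond (the five residues written before the hyphen). The Python sketch below does this for the four sequences quoted above. It is only an illustration of the comparison; the position-by-position identity count is an assumption introduced here, not a scoring scheme taken from the original analysis.

```python
# Activation-site sequences as quoted in the text; the hyphen marks the scissile bond.
ACTIVATION_SITES = {
    "procollagenase": "VKSSR-IVGG",
    "chymotrypsinogen": "SGLSR-IVVG",
    "proelastase": "ETNAR-VVGG",
    "trypsinogen": "DDDDK-IVGG",
}


def propeptide_side(site: str) -> str:
    """Return the residues preceding the scissile bond (P5 to P1)."""
    return site.split("-")[0]


def identities(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences share the same residue."""
    return sum(x == y for x, y in zip(a, b))


if __name__ == "__main__":
    reference = propeptide_side(ACTIVATION_SITES["procollagenase"])
    for name, site in ACTIVATION_SITES.items():
        if name == "procollagenase":
            continue
        other = propeptide_side(site)
        print(f"{name}: P1 = {other[-1]}, identities with procollagenase = "
              f"{identities(reference, other)} of {len(reference)}")
```

By this simple count, the trypsin-activated zymogens share the P1 Arg (and, for chymotrypsinogen, the P2 Ser) with procollagenase, whereas the acidic trypsinogen propeptide shares none of the five positions.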
Expression and Purification of Crab Collagenase in S. cerevisiae
Crab procollagenase was cloned into the PyT S. cerevisiae expression vector (5) as a fusion with the
α-factor signal sequence under the transcriptional control of the
alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase promoter
and alcohol dehydrogenase terminator, yielding the PyFC construct.
Yeast containing PyFC secrete a 30-kDa protein into the medium, which
cross-reacts with anti-crab collagenase antibodies on Western
blots.
The recombinant procollagenase is purified from
the yeast medium in much the same manner as the native collagenase from
crab hepatopancreas (14) . DEAE chromatography, trypsin
activation, and subsequent bovine pancreatic trypsin inhibitor affinity
chromatography are used to purify the recombinant enzyme to
homogeneity. The mature recombinant collagenase is identical in size to
that isolated from the hepatopancreas (Fig. 3a).
Figure 3:
Comparison of recombinant and
hepatopancreas crab collagenase. Panel a, Molecular weight
determination. Lane M, molecular weight markers; lane
H, 10 µg of hepatopancreas collagenase; lane R, 10
µg of recombinant collagenase. Panel b, collagen cleavage
assays. Reactions included bovine skin collagen in 50 mM Tris,
300 mM NaCl, 20 mM CaCl2, pH 8.0 at 25 °C, enzyme added in a 1:24 weight ratio as indicated. Lane
C, no enzyme; lanes 30', 60', and 120',
hepatopancreas or recombinant collagenase incubated for the indicated
time; lane 120'+I, recombinant collagenase + 1
mM 4-(2-aminoethyl)benzenesulfonyl fluoride, incubated for
120'; lane E, 5 µg of recombinant collagenase alone; lane M, molecular weight markers.
Activity of Recombinant Collagenase Versus Type I
Collagen
The collagenolytic activity of the recombinant
collagenase was compared directly to that of the enzyme isolated from
the crab hepatopancreas (Fig. 3b). The specificity and
rate of collagen cleavage are similar. The signature 3/4- and
1/4-length fragments are identical in morphology, including the
1/4-length triplet. Furthermore, the collagenolytic activity of the
recombinant enzyme is completely inhibited by the serine protease
inhibitor 4-(2-aminoethyl)benzenesulfonyl fluoride, as previously
demonstrated for the hepatopancreas collagenase(14) .
Activity of Recombinant Collagenase Versus Peptidyl pNA
Substrates
The Michaelis constants of the recombinant
collagenase were determined for a matched set of 15 Suc-AAP-Xaa-pNA
substrates, varying only in the P1 residue (Table 1). The
relative balance of specificities (kcat/Km) of the recombinant
enzyme is similar to that reported previously for the hepatopancreas
enzyme versus the Arg, Lys, Gln, Leu, and Phe substrates,
within an error of 15-30%(14) . The remaining 10
substrates Ala, Abu, Nva, Val, Nle, Ile, Met, Orn, Asp, and Glu were
selected to more fully map the specificity of crab collagenase for
hydrophobic, basic, and acidic residues. The substrate preference of
the collagenase is quite broad. The most striking aspect of the
specificity of the enzyme regards the amino acid residues it rejects (Fig. 4).
β-Branched and acidic side chains are extremely
poor substrates. Although the apparent binding constants (Km) for Val and Ile are similar to those of the
other hydrophobic substrates, kcat is as much as
10
-fold lower. Acidic residues are generally poor
substrates. There is no correlation in Km for the
various substrates (r = 0.57; see Equation 3 in Table 2), suggesting that there are several modes of ground state
binding. This implies the existence of several distinct S1 sites or a
single flexible site (14). A correlation (r =
0.76; Equation 1 in Table 2) for log kcat
versus P1 residue volume (Å³) is
observed, irrespective of hydrophobicity (26). The correlation
is improved and slope essentially unchanged (r = 0.95;
Equation 2 in Table 2) if only the hydrophobic residues Ala, Abu,
Nva, Nle, Leu, Met, and Phe are included. A weaker correlation of
log(kcat/Km) versus residue volume (r = 0.89; Equation 4 in Table 2) for this hydrophobic subset is found (Fig. 5).
These results suggest that the transition state may be stabilized in
part by hydrophobic interactions. It is unclear how the enzyme binds
the neutral hydrophilic and basic residues so as to minimize the
effects of charge or polarity in the transition state. Bias or
insensitivity in the data set may also affect the interpretation of the
correlations.
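The correlations quoted in this section are ordinary least-squares lines of a log kinetic parameter against P1 residue volume, each reported with a correlation coefficient r. The Python sketch below shows how a slope, intercept, and r of this kind can be computed. The residue volumes and the small perturbations in the example are hypothetical placeholders (the Table 1 and Table 2 values are not reproduced here); only the line y = 0.038(Å³) - 6.2 is taken from the chymotrypsin correlation reported in this paper.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept, with Pearson r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


if __name__ == "__main__":
    # Hypothetical P1 residue volumes (Å^3); not the values used in the paper.
    volumes = [90.0, 110.0, 130.0, 150.0, 170.0, 190.0]
    # Synthetic log(kcat/Km) values scattered around the reported chymotrypsin line.
    scatter = [0.10, -0.15, 0.05, -0.05, 0.12, -0.07]   # arbitrary perturbations
    log_eff = [0.038 * v - 6.2 + e for v, e in zip(volumes, scatter)]
    slope, intercept, r = linear_fit(volumes, log_eff)
    print(f"slope = {slope:.4f} per Å^3, intercept = {intercept:.2f}, r = {r:.3f}")
```

Dropping outliers before the fit, as is done for the β-branched residues, simply means calling the function on the reduced lists.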
Figure 4:
Substrate specificity of crab collagenase versus Suc-AAP-Xaa-pNA. Data are from Table 1. Black bars, kcat (/min); striped bars, Km (µM); gray bars, kcat/Km (/min/µM).
Figure 5:
Quantitative structure-activity relationships of serine protease substrate specificity.
Log(kcat/Km) of collagenase (FC, square), chymotrypsin (CT, circle), and elastase (EL, diamond) for the
Suc-AAP-Xaa-pNA series, where Xaa = Ala (A), Abu (O), Val (V), Nva (U), Nle (J), Ile (I), Leu (L), Met (M), and Phe (F),
are plotted versus P1 residue volume. Data are from Table 1, omitting Ile and Val for collagenase and chymotrypsin,
and Nva and Leu for elastase. Correlations are from Table 2 (collagenase: y = 0.016(Å³) - 2.2, r = 0.89;
chymotrypsin: y = 0.038(Å³) - 6.2, r = 0.99; elastase: y = -0.039(Å³) + 5.1, r = 0.95).
Correlations of Serine Protease Specificity
The
steady state kinetic parameters of chymotrypsin and elastase versus the Suc-AAP-Xaa-pNA substrate set were determined under conditions
identical to those for crab collagenase (Table 1). This was
necessary in order to accurately compare the activities of these
different enzymes. Strong positive (chymotrypsin) and negative
(elastase) correlations were found for log kcat or
log(kcat/Km) versus P1
residue volume (r ≥ 0.95; Equations 6, 8, 10, and 12 in Table 2; Fig. 5). Val and Ile were omitted for
chymotrypsin, while Nva and Leu were deleted for elastase, as these
points deviated significantly from the rest of the data (see
"Discussion"). A tight negative correlation of Km
versus volume was found for
chymotrypsin (r = 0.95; Equation 7 in Table 2),
while a much weaker positive correlation was seen for elastase (r = 0.68; Equation 11 in Table 2). The sensitivities of
chymotrypsin and elastase
log(kcat/Km) to residue volume
are equal in magnitude and twice that of collagenase (Fig. 5).
Chymotrypsin log(kcat/Km) also
correlated with π, the log of the octanol:water partition
coefficient of the residue minus the log of the coefficient for Gly (27) (m = 2.0, r = 0.98;
Equation 9 in Table 2). This result with tetrapeptide amides is
consistent with the correlation of
log(kcat/Km) for single-residue
esters with π, where a slope of 2.2 was found (33).
Collagenase log(kcat/Km) is
less sensitive to π (m = 0.80, r =
0.89; Equation 5 in Table 2), while elastase
log(kcat/Km) correlated well,
with a slope equal and opposite that for chymotrypsin (m = -2.0, r = 0.94; Equation 13 in Table 2).
Contribution of the P1 Residue to Catalytic
Efficiency
The relative contribution of the P1 residue to the
cleavage of peptidyl substrates was estimated by comparing the
catalytic efficiencies of collagenase, trypsin, chymotrypsin, and
elastase versus single-residue and tetrapeptide P1-Arg, Phe,
or Ala substrates (Fig. 6). While the kcat/Km values of all enzymes for the
peptidyl substrates are similar, within 2-20-fold, there is a 10-
to 10
-fold difference in kcat/Km for the single-residue
substrates. Trypsin derives the highest kcat/Km from its single-residue
Arg substrate, manifesting a 100-fold differential as compared to the
peptidyl Arg cognate. Chymotrypsin shows a 10,000-fold differential in
efficiency for single-residue Phe versus peptidyl Phe
substrates, while elastase kcat/Km
versus single-residue Ala is 100,000-fold less than that
for peptidyl Ala. Interestingly, collagenase demonstrates identical
100,000-fold differences in kcat/Km for both single-residue Arg and Phe substrates,
10-1,000-fold greater than chymotrypsin or trypsin and similar to
elastase. Collagenase and elastase show the most dependence on the
P2-P4 residues for catalytic efficiency, with the low activity on
single-residue substrates being a consequence of small P1 residue size
or non-optimal P1 residue binding.
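The fold differences in kcat/Km quoted above can be placed on a free energy scale with the standard transition state relation ΔΔG = RT ln(ratio), in which each factor of 10 corresponds to roughly 1.4 kcal/mol at 25 °C. The Python sketch below applies this conversion to the 100-, 10,000-, and 100,000-fold differentials. This is a textbook relation offered for orientation, not a calculation reported in this work, and the function name is illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def ddg_kcal_per_mol(fold_difference: float, temp_c: float = 25.0) -> float:
    """Free energy equivalent of a ratio of kcat/Km values: RT * ln(ratio)."""
    temp_k = temp_c + 273.15
    return R_KCAL * temp_k * math.log(fold_difference)


if __name__ == "__main__":
    # Differentials between tetrapeptide and single-residue substrates quoted in the text.
    for label, fold in (("trypsin, Arg", 1e2),
                        ("chymotrypsin, Phe", 1e4),
                        ("elastase, Ala", 1e5),
                        ("collagenase, Arg or Phe", 1e5)):
        print(f"{label}: {fold:.0e}-fold = {ddg_kcal_per_mol(fold):.1f} kcal/mol")
```

On this scale, the extended P2-P4 interactions are worth about 4 kcal/mol more to collagenase and elastase than to trypsin.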
Figure 6:
kcat/Km of single-residue and tetrapeptide substrates. Tetrapeptide data
(Suc-AAP-Xaa-pNA) are from Table 1, except for trypsin, which is
from (14). Single-residue substrates are Ac-Arg-pNA,
Suc-Phe-pNA, and Ac-Ala-pNA. Enzymes are grouped according to P1
residue. Conditions were 50 mM Tris, 100 mM NaCl, 20
mM CaCl2, pH 8.0 at 25 °C, as described under
"Experimental Procedures." Gray bars, single
residue; striped bars, tetrapeptide.
Structurally, the degree of
P2-P4 binding correlates with the length of the residue
215-220 domain (Fig. 2). This loop forms the lip of the
binding pocket and forms a β sheet with the P2-Pn substrate residues (3). Elastase and collagenase have the
longest loops, while chymotrypsin and trypsin are 1 and 2 residues
shorter, respectively.
Acylation Is Rate-limiting for Crab Collagenase, Versus
Deacylation for Trypsin and Chymotrypsin
The relationship
between broad specificity and catalysis was further investigated by
determining the steady-state Michaelis constants for collagenase,
trypsin, and chymotrypsin versus two series (P1-Arg or Phe) of
peptidyl amides and esters, varying only in leaving group (Table 3). The highly specific enzymes trypsin and chymotrypsin
maintain high levels of kcat independent of either
the activated amide 7-amino-4-methylcoumarin and pNA or the
benzylthioester leaving groups. Either deacylation or
product dissociation is rate-limiting for these enzymes (34).
In contrast, collagenase reacts with both sets of substrates and shows
an increase of up to 1,000-fold in kcat as the
leaving group is changed from 7-amino-4-methylcoumarin to the more
labile pNA and Sbzl moieties. Acylation is therefore the likely
rate-limiting step for collagenase-catalyzed cleavage of both the
P1-Arg and P1-Phe peptidyl amide substrates(34) .
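The leaving group argument rests on the standard acyl-enzyme scheme, E + S ⇌ ES → acyl-enzyme → E + product, for which kcat = k2k3/(k2 + k3), where k2 is the acylation and k3 the deacylation rate constant. Because only k2 depends on the leaving group, kcat is insensitive to the leaving group when k3 is limiting and tracks k2 when acylation is limiting. The Python sketch below illustrates the two regimes. All rate constants are hypothetical values chosen for illustration; none is a measured constant from Table 3.

```python
def kcat(k2: float, k3: float) -> float:
    """Steady-state kcat for the acyl-enzyme mechanism (k2 acylation, k3 deacylation)."""
    return k2 * k3 / (k2 + k3)


def km(ks: float, k2: float, k3: float) -> float:
    """Apparent Km for the same mechanism, given the dissociation constant Ks of ES."""
    return ks * k3 / (k2 + k3)


if __name__ == "__main__":
    # Hypothetical rate constants (per min). The leaving group changes only k2.
    # Deacylation-limited enzyme: k3 << k2 for every leaving group.
    poor, good = kcat(k2=1e3, k3=10.0), kcat(k2=1e5, k3=10.0)
    print(f"deacylation-limited: kcat ratio (good/poor leaving group) = {good / poor:.2f}")

    # Acylation-limited enzyme: k2 << k3 for the poorer leaving group.
    poor, good = kcat(k2=0.01, k3=1e3), kcat(k2=10.0, k3=1e3)
    print(f"acylation-limited:   kcat ratio (good/poor leaving group) = {good / poor:.0f}")
```

In the first regime a 100-fold change in k2 leaves kcat nearly unchanged, whereas in the second a 1,000-fold change in k2 is passed almost fully into kcat, which is the pattern that distinguishes collagenase from trypsin and chymotrypsin in the leaving group series.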
DISCUSSION
The cloning and expression of the crab serine collagenase 1
has resolved several issues regarding the molecular biology and
enzymology of this unusual enzyme. 1) The sequence was verified, and
minor errors were corrected. 2) Heterologous expression verified that
collagenolytic activity was intrinsic to this serine protease and
provided a source of reagent quantities of the enzyme. Serine
proteases, along with the matrix metalloproteases, can now be
considered true collagenases. The unique nature of the collagenase
active site justifies its classification as a major new branch of the
chymotrypsin family of serine proteases.
Crab Collagenase and Shrimp Chymotrypsin: Implications for
Collagen Recognition and Cleavage
High levels of identity
between the pre-pro forms of crab collagenase and shrimp chymotrypsin,
another serine collagenase(29) , suggest that a region
responsible for collagen recognition and cleavage may include the
S4-S'2 substrate binding sites of the enzyme. Most of these sites
are conserved between the crab collagenase and shrimp chymotrypsin,
including the acidic residues thought to be important in the
recognition of Arg in the P'1 position by the crab enzyme (14).
This suggests that the two enzymes bind collagen by a similar
mechanism. A notable structural dissimilarity between the two enzymes
occurs in the primary substrate binding (S1) site. A major determinant
of the trypsin-like (Arg, Lys) P1 specificity of the crab collagenase
is likely to be Asp226 (13, 14, 16). Shrimp
chymotrypsin lacks an Asp at this position, possessing an Ala instead.
Several other conservative substitutions at positions 189, 217a, and
218 may further perturb the P4-P1 specificity of the shrimp enzyme.
This suggests that shrimp chymotrypsin may cleave collagen at a subset
of the sites (Gln and Leu, but not Arg) recognized by crab
collagenase(14) .
The Active Site of Collagenase Is Less Hydrophobic than
That of Chymotrypsin and Larger than That of Elastase
Extensive
quantitative analysis of serine protease specificity has provided the
foundation for general theories concerning the interaction of enzymes
and substrates (see (27) and (35) for early reviews).
However, much of the groundbreaking work regarding the specificity of
the S1 site was carried out utilizing single-residue
esters(27) . As these compounds bear little structural or
chemical resemblance to the presumed physiological peptide substrates,
one might question their use in examining biological function. Partial
data sets for chymotrypsin and elastase versus the peptidyl
amides Suc-AAP-Xaa-pNA demonstrated the utility of this substrate
series in mapping specificity(36, 37, 38) .
Our results agreed well with those reported previously for
single-residue esters (33) and confirmed the assumption that,
at least for hydrophobic P1 substrates, S1 site specificity is largely
independent of the nature of the scissile bond, as well as
NH2-terminal groups (27). This allowed the accurate
comparative analysis of the recombinant crab collagenase. Correlations of P1 residue volume and log kcat
or log(kcat/Km) were
found for serine protease paradigms chymotrypsin and elastase. Although
these enzymes are commonly considered to be specific for aromatic or
small hydrophobic residues, respectively, these specificities represent
only the upper range of linear continuums that span more than 4 orders
of magnitude in kcat/Km. The
sensitivities of chymotrypsin and elastase to P1 side chain volume, as
reflected in the slopes of the correlations, are equal and opposite.
This is also the case for the hydrophobicity constant π, a measure
of the free energy of transfer of an amino acid side chain from octanol
to water. The slope of +2.0 found for chymotrypsin
log(kcat/Km) versus π
suggests that the free energy of transfer of a hydrophobic
amino acid side chain from the active site of chymotrypsin to water is
double the free energy of transfer from octanol to water
(-40 to -50 cal/Å²/mol versus -20 to -25 cal/Å²/mol, where
Å² refers to the solvent-accessible surface area of
the side chain) (39, 40, 41, 42).
This behavior is attributed to the favorable desolvation of both free
enzyme and free substrate in forming the hydrophobic enzyme-substrate
complex, equivalent to two transfers from water to
octanol(42) . Full desolvation of the complex occurs when the
hydrophobic surfaces of enzyme and substrate are complementary. The
relative slopes of the π and P1 residue volume correlations are
identical, suggesting that the interactions observed are either purely
hydrophobic or that steric and hydrophobic effects contribute equally
in this system. The inverse correlation of elastase
log(kcat/Km) with π may
represent increasing solvation of the complex as larger substrates are
bound to the enzyme, but is likely to also include unfavorable steric
effects.
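The relation between a correlation slope and a transfer free energy can be sketched numerically. The example below is only an illustration: the π values and log(k_cat/K_m) values are synthetic, generated from an assumed slope of +2.0 rather than taken from the measurements discussed here, and the temperature of 25 °C is likewise an assumption. It fits a least-squares line and converts the slope into a free energy change per unit of π, using ΔΔG = -2.303·RT·Δlog(k_cat/K_m).

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def free_energy_per_log_unit(temp_k=298.15):
    """Free energy change (kcal/mol) for a 10-fold gain in k_cat/K_m."""
    return -math.log(10) * R_KCAL * temp_k


# Synthetic, illustrative data only (not values from this study).
pi_values = [0.0, 0.5, 1.0, 1.5, 2.0]
log_kcat_km = [2.0 * p + 1.0 for p in pi_values]

slope, intercept = fit_line(pi_values, log_kcat_km)
ddg_per_pi = slope * free_energy_per_log_unit()

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"free energy change per unit of pi = {ddg_per_pi:.2f} kcal/mol")
```

Because π is itself a logarithmic octanol-water partition term, a slope of 1 would correspond to the octanol-water transfer free energy, and a slope of 2 to twice that value.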
Collagenase log(k_cat/K_m) is half as
sensitive to P1 residue volume and π as chymotrypsin and
elastase, which possess strongly hydrophobic S1 sites. According to the
desolvation model, the collagenase S1 site is less hydrophobic than
those of the other two enzymes. The positive slope of the correlation
also suggests an active site which is larger than that of elastase. The
collagenase S1 site increasingly, but never completely, desolvates
larger substrates. The S1 site may also be partially exposed to bulk
solvent. Hydrophilic residues, such as Asp, involved in
binding Arg, Lys, Orn, and Gln substrates, likely compromise the
hydrophobicity of the region.
Several amino acid residues were
consistent outliers in the correlations. The β-branched amino acids
Val and Ile are unexpectedly poor substrates for chymotrypsin and
collagenase, indicating a constriction in the S1 sites of these enzymes
around the β carbon. In contrast, Nva and Leu (and, to some extent,
Abu) are exceptionally good substrates for elastase, suggesting that
they may bind productively in a hydrophobic region not accessible to
other residues. A detailed analysis must await three-dimensional
structural verification.
Ground-state Substrate Binding Does Not Correlate with Transition State Catalysis
Although the serine protease kinetic
mechanism (34, 43) describes the formation
of a ground-state Michaelis complex (K_s) prior to
several steps of transition state catalysis (rate-determining step
k_cat), the tightness of the complex may not in
itself predict the rate of catalysis. Collagenase illustrates the
generality of this hypothesis, given its broad specificity for basic,
neutral hydrophilic, and hydrophobic residues. The value of k_cat
correlates well with P1 residue volume,
irrespective of chemical nature, suggesting size is a component of
transition state stabilization. In contrast, there is no correlation of K_m
with residue volume or k_cat
(assuming that acylation is rate-limiting for most substrates, K_m ≈ K_s). Similar k_cat
values are achieved for Gln, Arg, and Phe with K_m
values ranging 100-fold. This indicates that
ground-state binding is independent of transition state catalysis.
Elastase and chymotrypsin also show better correlations in k_cat
than K_m with P1 residue
volume, again suggesting that these enzymes are designed for transition
state catalysis rather than ground-state binding. Site-directed
mutagenesis studies of trypsin further support the hypothesis that
ground-state binding does not correlate with transition state catalysis (44).
The Coupling of Primary and Subsite Binding in Serine Protease Catalysis
One striking observation of this study is the
similar rate of catalysis and level of catalytic efficiency for all
enzymes versus their preferred tetrapeptide substrates,
despite the large differences in enzyme and substrate structure.
Trypsin, chymotrypsin, elastase, and collagenase cleave their preferred
tetrapeptide substrates with k_cat values within
2-fold of one another. This suggests that all serine proteases of the
chymotrypsin family reach a common maximal level of transition state
stabilization in the limit of full subsite-induced activation, given
the shared chemical mechanism and the similar nature of their
physiological oligopeptide substrates. A key component of high level
catalysis is the coupling of the S1 and S2-S4 . . . Sn sites (5, 45). The structural basis of this
productive substrate recognition is different for each enzyme, and is a
major contributor to substrate discrimination (5, 46).
This is illustrated by the 35,000-fold variation in k_cat/K_m for single-residue
substrates versus the 20-fold variation for the cognate
tetrapeptides. Clearly, there are several different compensatory
mechanisms of substrate binding for the chymotrypsin class of serine
proteases. The degree of productive P2-P4 binding correlates inversely
with the selectivity of the S1 site or the size of the preferred P1
residue. Collagenase, possessing the P1 specificities of both
chymotrypsin and trypsin, relies to a greater extent, up to 1,000-fold
in k_cat/K_m, on the S2-S4
sites than the more specific enzymes. Collagenase P1-Phe and P1-Arg k_cat/K_m are equally sensitive
to peptide binding, suggesting that nondiscriminating P2-P4 interactions
are a critical component of its broad specificity.
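To put the fold variations quoted above on an energetic scale, a ratio of k_cat/K_m values can be converted into a difference in transition state stabilization with ΔΔG = RT·ln(ratio). The short sketch below does this for the 35,000-fold and 20-fold spans; the temperature of 25 °C and the function name `fold_to_ddg` are assumptions made for illustration, not details taken from the experiments.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fold_to_ddg(fold, temp_k=298.15):
    """Free energy difference (kcal/mol) equivalent to a fold change in k_cat/K_m."""
    return R_KCAL * temp_k * math.log(fold)


single_residue_span = fold_to_ddg(35_000)  # single-residue substrates
tetrapeptide_span = fold_to_ddg(20)        # cognate tetrapeptide substrates

print(f"35,000-fold: {single_residue_span:.1f} kcal/mol")
print(f"20-fold:     {tetrapeptide_span:.1f} kcal/mol")
```

Under these assumptions the single-residue substrates span roughly 6 kcal/mol, whereas the cognate tetrapeptides span under 2 kcal/mol, which is one way of expressing how far subsite binding compensates for differences at S1.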
Mechanistic Consequences of Broad Specificity
The optimization of enzyme specificity can also be assessed
mechanistically. The serine proteases hydrolyze substrates by two
chemical steps after the formation of the Michaelis
complex (34, 43). The carbonyl carbon of
the amide or ester substrate is attacked (k_2) by
Ser, forming the acyl enzyme and free amine or alcohol.
This covalent intermediate is deacylated (k_3) by
water, generating the carboxylic acid product and free enzyme.
Acylation is generally rate-limiting for amides, and deacylation is
rate-limiting for esters, in part due to the higher pK_a
of the leaving group amine versus the
alcohol(43) . Although this is almost invariably true for
single-residue substrates, deacylation can be rate-limiting for longer
peptidyl amide substrates containing more potential binding
energy(5) . Trypsin and chymotrypsin are highly efficient,
specific proteases, and deacylation (or product dissociation preceding
or following deacylation) is likely rate-limiting for their preferred
peptide substrates(5, 6, 25, 47) .
In contrast, acylation remains the rate-limiting step for collagenase versus peptide substrates, apparently as a consequence of its
much broader activity. The fact that acylation is rate-limiting for
collagenase is advantageous for future work, especially in the area of
protein engineering. A key issue in mutagenesis studies is the shift in
rate-limiting step of variants relative to the wild-type enzyme. For
example, variant trypsins are often severely deficient in catalysis (48, 49, 50). Acylation rather than
deacylation is then
rate-limiting (5, 24, 47). This in turn
alters mechanistic definitions of k_cat and K_m,
preventing accurate
structure/function correlations. Corrective measures include the use of
single-residue substrates, the estimation of mechanistic constants from
steady-state parameters, or ultimately, pre-steady-state
kinetics (5, 6, 24). These results are
specific to the substrates examined here and should not be
extrapolated, as other mechanistic steps may be rate-limiting for
longer oligopeptide or natural substrates. In this regard, collagenase
may prove especially useful in exploring the interplay between
substrate binding and catalysis at a macromolecular level.
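The dependence of the steady-state parameters on the rate-limiting step can be made concrete with the standard three-step acyl-enzyme scheme, in which k_cat = k_2·k_3/(k_2 + k_3) and K_m = K_s·k_3/(k_2 + k_3). The sketch below evaluates these expressions for two hypothetical sets of constants; the numerical values are invented purely for illustration and are not estimates for any of the enzymes discussed here.

```python
def steady_state(ks, k2, k3):
    """Return (k_cat, K_m) for the acyl-enzyme mechanism.

    ks: dissociation constant of the Michaelis complex (M)
    k2: acylation rate constant (1/s)
    k3: deacylation rate constant (1/s)
    """
    kcat = k2 * k3 / (k2 + k3)
    km = ks * k3 / (k2 + k3)
    return kcat, km


KS = 1e-3  # hypothetical K_s in M

# Acylation rate-limiting (k2 << k3): k_cat ~ k2 and K_m ~ K_s.
kcat_acyl, km_acyl = steady_state(KS, k2=1.0, k3=1000.0)

# Deacylation rate-limiting (k3 << k2): k_cat ~ k3 and K_m << K_s.
kcat_deacyl, km_deacyl = steady_state(KS, k2=1000.0, k3=1.0)

print(f"acylation limiting:   k_cat = {kcat_acyl:.3f} 1/s, K_m = {km_acyl:.2e} M")
print(f"deacylation limiting: k_cat = {kcat_deacyl:.3f} 1/s, K_m = {km_deacyl:.2e} M")
```

In both cases k_cat/K_m reduces to k_2/K_s, so the specificity constant keeps the same mechanistic meaning, whereas k_cat and K_m taken individually do not once the rate-limiting step shifts.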