©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
Rapid Identification of Highly Active and Selective Substrates for Stromelysin and Matrilysin Using Bacteriophage Peptide Display Libraries (*)

(Received for publication, December 1, 1994; and in revised form, January 9, 1995)

Matthew M. Smith, Lihong Shi, Marc Navre (§)

From the Department of Biochemistry, Affymax Research Institute, Santa Clara, California 95051


ABSTRACT

The discovery of useful peptide substrates for proteases that recognize many amino acids in their active sites is often a slow process due to the lack of initial substrate data and the expense of analyzing large numbers of peptide substrates. To overcome these obstacles, we have made use of bacteriophage peptide display libraries. We prepared a random hexamer library in the fd-derived vector fAFF-1 and included a ``tether'' sequence that could be recognized by monoclonal antibodies. We chose the matrix metalloproteinases stromelysin and matrilysin as the targets for our studies, as they are known to require at least 6 amino acids in a peptide substrate for cleavage. The phage library was treated in solution with protease, and cleaved phage were separated from uncleaved phage using a mixture of tether-binding monoclonal antibodies and Protein A-bearing cells followed by precipitation. Clones were screened with a rapid assay that identified phage encoding peptide sequences susceptible to cleavage by the enzymes. The nucleotide sequences of the random hexamer regions of 43 such clones were determined for stromelysin and 23 for matrilysin. Synthetic peptides were prepared whose sequences were based on some of the positive clones, as well as on consensus sequences built from the positive clones. Many of the peptides have k(cat)/K(M) values as good as or better than those of previously reported substrates, and in fact, we were able to produce stromelysin and matrilysin substrates that are both the most active and the smallest reported to date. In addition, the phage data predicted selectivity in the P(2) and P`(1) positions of the two enzymes that was supported by the kinetic analysis of the peptides. This work demonstrates that phage selection techniques enable the rapid identification of highly active and selective protease substrates without making any a priori assumptions about the specificity or the ``physiological substrate'' of the protease under study.


INTRODUCTION

The selectivity of a protease is dictated, in part, by the sequence of amino acids it recognizes in its active site before cleaving the substrate. From a kinetic point of view, the higher the value of the specificity constant k(cat)/K(M) (1), the better a peptide is as a substrate for that protease. Highly selective proteases will have high k(cat)/K(M) values for only a small number of peptides, whereas a larger number of peptides will have similar k(cat)/K(M) values for broad specificity proteases. The identification of optimized peptide substrates is thus an important part of protease characterization, but finding an optimal substrate is also a time-consuming step. In the case of enzymes like the matrix metalloprotease (MMP) (^1) fibroblast collagenase or the protease of the human immunodeficiency virus (HIV protease), a peptide substrate can be prepared based on the cleavage site of the physiological substrate, collagen (2) or the HIV polyprotein (3, 4), respectively. These peptides can be dramatically improved by substitution of individual residues with other amino acids (5, 6, 7), so that the original substrate-derived peptide may not represent the optimal substrate. In some cases a true physiological substrate is unknown. This was true in the case of the MMP stromelysin. Investigators (8) found obtaining good peptide substrates so frustrating that they randomly screened commercially available peptides for active compounds. Even when substrate information is available, investigators (9, 10) have used proteins as substrates to further assess the sequence specificity of the protease under study.

To overcome these problems, we have made use of filamentous bacteriophage-based peptide display libraries to find optimal substrates. Phage display libraries have been used with great success to find epitopes for monoclonal antibodies and to improve the affinity of peptides for receptors (11, 12). Recently Matthews and Wells (13) have presented a method for the use of monovalent ``substrate phage'' libraries for discovering peptide substrates for proteases. These investigators screened a random pentamer library and isolated clones carrying substrates for Factor X and a mutant form of subtilisin. However, these sequences were not tested as solution phase peptide substrates, and hence the predictive nature of the method was not evaluated. We have been developing an analogous method using polyvalent phage. The approaches presented here have enabled us to screen a greater number of phage (and thus larger libraries), to characterize putative substrate clones more quickly, and to generate consensus sequences from these hits. The peptides prepared based on our screen are as good as or better than literature standards as substrates for the proteases being examined.

To critically assess the method, we have chosen the MMPs stromelysin and matrilysin as the focus of our studies. The MMPs represent a family of enzymes that recognize at least 6 amino acids in their subsites, as shown by the sensitivity of k(cat)/K(M) to the substitution of amino acids in positions P(3) to P`(3) (2, 5, 14, 15, 16, 17). In addition, when this work was initiated, only a limited amount of information was available for stromelysin (8, 18) or matrilysin (16). Using recombinant forms of these enzymes, we have screened a random hexamer library and have used the sequence information from positive clones to prepare new, highly active peptide substrates. The best of these peptides are the most active stromelysin and matrilysin substrates reported to date. In addition, previous investigators have always used peptides no smaller than heptamers, out of concern that shorter compounds would not be sufficiently active. With the availability of optimized amino acids at each position, we have demonstrated the opposite: that hexapeptides are superior MMP substrates, a finding not expected at the outset of this study. Finally, the phage data successfully predicted ways in which these substrates could be made selective toward matrilysin or stromelysin.

Most importantly, we have demonstrated that we can identify protease substrates without making any a priori assumptions about the specificity or the ``physiological substrate'' of the enzyme. This approach will be valuable for the study of proteases where peptide substrates are unavailable and sites of cleavage in vivo are unknown.


MATERIALS AND METHODS

Reagents

Library competent MC1061 (F) Escherichia coli and nitrocellulose were from Bio-Rad. Pansorbin (Protein A-bearing Staphylococcus aureus) cells were obtained from Calbiochem. Polyvinylidene fluoride membranes were from Millipore, Inc. K91 (F) and MC1061 (F) strains of E. coli (19) were obtained from Steve Cwirla of Affymax. mAb 179 recognizes an epitope (ACLEPYTACD) of the human placental alkaline phosphatase protein with subnanomolar affinity (^2). mAb 3-E7 (20) was from Gramsch Laboratories (Schwabhausen, Germany). Stromelysin was expressed in E. coli in a soluble form as a His(6)-NH(2) terminally tagged, COOH terminally truncated proenzyme (pro-sfSTR) and purified with some minor modifications (^3) of the protocol of Marcy et al. (37). A His(6)-NH(2) terminally tagged form of matrilysin was also expressed in E. coli and, after recovery from inclusion bodies, was purified by nickel-chelation chromatography, refolded, and further purified by Blue-Sepharose chromatography (^3). Purified activated enzymes were prepared with trypsin, treated with an excess of soybean trypsin inhibitor, and stored at -80 °C.

Construction of Vectors and Phage Libraries

fAFF-tether C (fTC) was constructed by inserting oligonucleotides (ON) 1200/1201 (5`-CTCCCACTCCTACGGAGGATTCTTAGGTGCATGCCTGGAACCGTACACCGCTTGCGACGTAGGCCTGGTACCGGAATTCGCTTGT-3` and 5`-GCGAATTCCGGTACCAGGCCTACGTCGCAAGCGGTGTACGGTTCCAGGCATGCACCTAAGAATCCTCCGTAGGAGTGGGAGTAGA-3`; see Fig. 2A) into the BstXI sites of fAFF-1 (19). Control substrate phage fTC-Good and fTC-Bad were constructed by inserting ON1280/1281 (5`-CGGTGGTGGTAGTCCGCTAGCCCTGTGGGCTGTAC-3` and complement) (Good substrate control; Fig. 2B) or ON1282/1283 (5`-CGGTGGTGGTAGTAACCCGGTTGAACCAGCTGTAC-3` and complement) (Bad substrate control; Fig. 2B) into StuI/KpnI-cut fAFF-tether C. fAFF-TC-Good-Kan^R (Kan^R = kanamycin resistance) was constructed by inserting the T4 DNA polymerase-treated 1.4-kilobase AvaI fragment containing the kanamycin resistance gene from pCR1000c (Invitrogen), in reverse orientation, between the T4 DNA polymerase-treated AvaI and NcoI sites of fTC-Good. fTC-LIB was constructed from fAFF-tether C by inserting oligonucleotides ON1526/1527 (5`-TCTGGAACCGTACACCGCATGCGACTCGAGCGAGACCGAAGACGTACTGGTAC-3` and complement) into the SphI/KpnI sites of fAFF-tether C. The fAFF-TC-LIB-N(6) library was constructed by cloning degenerate oligonucleotides (5`-CAGMNNMNNMNNMNNMNNMNNACCACTACCACCGC-3`, where N is A, C, G, or T (equimolar) and M is C or A (equimolar)) annealed with a complementary oligonucleotide containing 18 inosines (ON1439/1440; see Fig. 2C) into KpnI/XhoI-cut fAFF-TC-LIB at a 5:1 oligo/vector molar ratio and electroporating into E. coli MC1061 (F-).
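The degenerate library oligonucleotide is written as the antisense strand (MNN repeats). As a quick illustration, the short sketch below reverse-complements it to read the sense strand. The function name `reverse_complement` and the restriction to the IUPAC symbols that occur in this oligonucleotide (M = A or C, pairing with K = G or T) are assumptions made for the example, not part of the original protocol.

```python
# IUPAC complements for the symbols that occur in the library oligonucleotide.
# M (A or C) pairs with K (T or G); N pairs with N.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G",
              "N": "N", "M": "K", "K": "M"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a (possibly degenerate) DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Degenerate oligonucleotide as given in the text (5' to 3').
LIBRARY_OLIGO = "CAGMNNMNNMNNMNNMNNMNNACCACTACCACCGC"

coding = reverse_complement(LIBRARY_OLIGO)
print(coding)                 # sense strand, 5' to 3'
print(coding.count("NNK"))    # number of NNK triplets found
```

Read this way, the six MNN triplets become six NNK codons, which matches the (NNK)(6) description of the library given under ``Results.''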


Figure 2: Construction of fTC and its derivatives. A, fAFF-tether C: the oligo pair 1200/1201 (see ``Materials and Methods'') was hybridized and ligated into BstXI-cut fAFF-1 (19). The lower line shows the translated peptide sequence of the pIII protein starting from the predicted site of signal peptide cleavage, leaving YGGFL at the NH(2) terminus of the phage (19). The underlined peptide sequence (ACLEPYTACD) is the epitope for mAb 179 (^2). B, sequence of inserts of fTC-Good and Bad. These clones were derived from fTC as described under ``Materials and Methods.'' The sequences shown start at the end of the mAb 179 epitope (ACLEPYTACD). C, construction of the library. fTC-LIB (see ``Materials and Methods'') was cleaved and ligated with the degenerate oligo pair 1439/1440 (I indicates inosines, which are expected to hybridize to all bases (24), and M = A or C). The resultant library is shown, where X indicates any amino acid or stop codon. The sequences shown start at the end of the mAb 179 epitope (ACLEPYTACD) as in B.



Phage Selection

2 times 10 phage (20 µl of the fTC-LIB-N6 library) in TCB (20 mM Tris-HCl, pH 7.4, 5 mM CaCl(2), 0.05% Brij-35) in a 250-µl reaction were digested with an empirically determined amount of enzyme for 1 h at 37 °C. The reaction was stopped by adding EDTA to 5 mM, followed by the addition of bovine serum albumin to 0.1%, 100 µg of mAb 179, and 10 µg of mAb 3-E7. After 30 min on ice, 100 µl of Pansorbin cells were added, and the reaction was rotated at 4 °C for 1 h. The mixture was microfuged for 2 min, and the supernatant was recovered and subjected to a second Pansorbin adsorption. The final supernatant was amplified overnight in E. coli K91 cells. A small aliquot of the final supernatant solution was also used for titering on K91. Clones were selected from the titer plates and grown in 2-ml cultures for dot-blot analysis.

Phage Proteolysis Assay

To precipitate phage, 20 µl of 20% polyethylene glycol, 2.5 M NaCl was added to 100 µl of phage supernatant. After incubating on ice for 30 min, the precipitated phage were microfuged for 5 min. The supernatant was aspirated and the phage resuspended in 10 µl of TBS (50 mM Tris-HCl, pH 7.4, 150 mM NaCl). The phage were then distributed to wells of a flexible microtiter plate (Falcon). Protease/buffer mix (90 µl of 1.1 times TCB, 0.3 µl of enzyme) was added, and the plate was incubated for the appropriate time period at 37 °C. At various time points, 30-µl samples were removed and added to 70 µl of TBS + EDTA (to 5 mM final concentration) to stop the reaction. The samples were spotted onto a nitrocellulose filter with a dot-blotter (Bio-Rad), and the filter was blocked with 5% non-fat milk in TBS-T (TBS + 0.05% Tween 20) for 30 min to 1 h. The filters were washed three times with TBS-T and then incubated for 1 h with mAb 179 at 1.9 µg/ml. The washes were repeated, and the filters were probed with a 1:5000 dilution of goat anti-mouse IgG horseradish peroxidase conjugate. After 1 h, the washes were repeated and the filter was stained as directed using the Amersham Western Enhanced Chemiluminescence Kit.

Kinetic Analysis of Peptides

Synthetic substrates were prepared as peptides blocked at their amino termini by an acetyl group. Peptide 12 (2,4-dinitrophenyl-Pro-Leu-Gly-Leu-Trp-Ala-D-Arg-NH(2)) (14) was a gift of Dr. Robert Gray (University of Louisville), and peptide 11 ((7-methoxycoumarin-4-yl)acetyl-Pro-Leu-Gly-Leu-(3-(2,4-dinitrophenyl)-L-2,3-diaminopropionyl)-Ala-Arg-NH(2)) (21) was from Bachem. Peptides at various concentrations were treated with 20-50 nM protease at 37 °C for various times. The initial rates of hydrolysis were determined by measuring the rate of formation of free amino groups using fluorescamine (2). Kinetic parameters were derived by fitting the data to the equation $v = V_{\max}[S]/(K_M + [S])$ by non-linear regression analysis (Enzfitter), and k(cat) was determined by dividing V(max) by the concentration of enzyme used. The sites of hydrolysis of selected substrates were determined by fast atom bombardment mass spectrometry of the high performance liquid chromatography purified cleavage products.
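The fitting step can be illustrated with a minimal sketch. It is not the Enzfitter procedure; it is a plain least-squares fit of the Michaelis-Menten equation written for this example, which uses the fact that for a fixed K(M) the best V(max) has a closed form and then searches over K(M). All function names, the search bounds, and the data values are hypothetical.

```python
import math


def mm_rate(s, vmax, km):
    """Michaelis-Menten rate v = Vmax*[S] / (K_M + [S])."""
    return vmax * s / (km + s)


def _best_vmax(km, s_values, v_values):
    # For a fixed K_M the model is linear in Vmax, so least squares is closed form.
    x = [s / (km + s) for s in s_values]
    return sum(v * xi for v, xi in zip(v_values, x)) / sum(xi * xi for xi in x)


def _sse(log_km, s_values, v_values):
    # Sum of squared residuals at this K_M, using the best Vmax for that K_M.
    km = math.exp(log_km)
    vmax = _best_vmax(km, s_values, v_values)
    return sum((v - mm_rate(s, vmax, km)) ** 2 for s, v in zip(s_values, v_values))


def fit_michaelis_menten(s_values, v_values, km_min=1e-3, km_max=1e6, grid=400):
    """Return (Vmax, K_M) minimizing the sum of squared residuals."""
    lo, hi = math.log(km_min), math.log(km_max)
    step = (hi - lo) / (grid - 1)
    # Coarse scan over log K_M, then golden-section refinement around the best point.
    best = min(range(grid), key=lambda i: _sse(lo + i * step, s_values, v_values))
    a = lo + max(best - 1, 0) * step
    b = lo + min(best + 1, grid - 1) * step
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(100):
        c, d = b - phi * (b - a), a + phi * (b - a)
        if _sse(c, s_values, v_values) < _sse(d, s_values, v_values):
            b = d
        else:
            a = c
    km = math.exp((a + b) / 2)
    return _best_vmax(km, s_values, v_values), km


def specificity_constant(vmax, km, enzyme_conc):
    """k_cat/K_M, with k_cat = Vmax / [E]; units follow the inputs."""
    return (vmax / enzyme_conc) / km


# Hypothetical, noise-free data for illustration only (substrate in µM, rates in µM/s).
S = [25, 50, 100, 200, 400, 800]
V = [mm_rate(s, 0.5, 150.0) for s in S]
vmax, km = fit_michaelis_menten(S, V)
print(round(vmax, 3), round(km, 1))
print(specificity_constant(vmax, km, enzyme_conc=0.05))  # µM^-1 s^-1 at 50 nM enzyme
```

With real, noisy rate data a dedicated non-linear regression package would normally be preferred; the point here is only how V(max), K(M), and k(cat)/K(M) relate to one another.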


RESULTS

Phage peptide display vectors have been used successfully to identify peptides that bind to proteins such as antibodies. In contrast, the goal of this work is to use this methodology to identify peptides that are efficiently cleaved by a specific protease. In the most commonly used phage display format, the random peptide sequence is placed at or near the NH(2) terminus of the display protein, generally pIII(19, 22, 23, 24, 25) . In the method described here, we have added an additional functional group to the NH(2) terminus of pIII, a peptide ``tether.'' The tether is a peptide sequence that enables attachment of the phage through binding of the tether sequence to an immobile phase: the simplest example being a peptide epitope tether bound to a monoclonal antibody on an agarose bead. The strategy of the selection is to subject a population of tether phage to the protease in solution, and then separate the cleaved (substrate) from the uncleaved (non-substrate) phage by capturing the undigested phage with a tether-binding resin. A schematic of a variation of this approach, used in this work, is shown in Fig. 1.


Figure 1: Outline of the tether phage selection. Shown are diagrams of the fTC phage (not drawn to scale for clarity). The gene III protein extends from the phage body from the COOH to the NH(2) terminus. At the NH(2) termini are the peptide tether and the protease target domain. The phage are treated in solution with the protease. Phage carrying substrate sequences (left side) are cleaved in the target domain, whereas others (right) are not. The entire digest is treated with antibodies to the tether(s) and captured using a resin that carries protein A. The protein A-antibody-phage complexes are precipitated by centrifugation, and the phage that are not bound by the antibodies remain in solution and can be recovered for amplification.



We have designed and prepared the vector fAFF-tether C (fTC; see ``Materials and Methods'') from the fd-derived plasmid fAFF-1 (19). The salient features of this phage, shown in Figs. 1 and 2, include (i) the target region, which can consist of random amino acids or of predetermined sequences (i.e. GOOD and BAD; Fig. 2, B and C) for use as positive or negative controls (the design of the control sequences is described below), and (ii) the tether region. We have employed a dual tether design, in which the tether consists of the epitopes for the anti-dynorphin mAb 3-E7 (YGGFL) (19, 20) and the mAb 179 epitope ACLEPYTACD (see Fig. 2, A and C).

A requirement of this approach is that the phage molecule be cleaved only in the ``target'' random peptide region. Although this is anticipated, as filamentous phage are generally viewed as being protease resistant (26), we chose to test this proposal. We prepared two phage clones in fTC, one carrying a good substrate sequence for the MMP stromelysin (fTC-Good) and one carrying a poor substrate (fTC-Bad) (see ``Materials and Methods'' and Fig. 1). We were faced with two choices of known substrate sequences for stromelysin: those based on substance P (8, 17) or a sequence based on collagenase cleavage sites (18). We chose instead to build a generic MMP good substrate, Pro-Leu-Ala-Leu-Trp-Ala, based on the results of Netzel-Arnett et al. (16), that was suitable not only for stromelysin but for matrilysin as well. As shown later, this peptide has a k(cat)/K(M) value for sfSTR comparable with those of substance P (8) and a 2,4-dinitrophenyloctapeptide fluorogenic substrate (18). The control phage were incubated with sfSTR for various times up to 1 h. The digests were then analyzed by immunoblotting using the anti-tether mAb 179. As shown in Fig. 3A, after exposure to sfSTR, the pIII protein of fTC-Bad was essentially undisturbed, whereas the pIII of fTC-Good lost its tether. This indicates that only the target sequence of the pIII protein was specifically cleaved by sfSTR, and other regions of the protein were untouched. Similar results were observed for matrilysin (Fig. 3B), tissue plasminogen activator, and HIV protease (using appropriate positive controls; data not shown). In addition, titering of phage (fTC-Good and fTC-Bad) before and after digestion showed no effect of proteolysis (with any of the proteases tested) on infectivity (data not shown).


Figure 3: Immunoblot analysis of phage clones treated with protease. sfSTR (at 5 µg/ml, panel A) or matrilysin (at 5 ng/ml, panel B) was incubated with fTC-Good or fTC-Bad (Good or Bad) at 5 times 10^9 tu/ml (approximately 10 pM phage) at 37 °C; aliquots were removed at 0, 20, 40, and 60 min, added to 5 µl of SDS gel buffer, and boiled to quench the reaction. The samples were subjected to electrophoresis through 12% polyacrylamide gels containing SDS and transferred to nitrocellulose, which was probed with mAb 179 and detected as described under ``Materials and Methods.'' The positions of the protein standards on the gels are indicated by dots and are M(r) 106, 80, 49.5, and 32.5 (panel A only) times 10^3.
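The conversion from phage titer to molar concentration quoted in the legend can be checked with a few lines of arithmetic. The sketch assumes, for illustration, that one transducing unit corresponds to one phage particle; the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # particles per mole


def titer_to_molar(units_per_ml: float) -> float:
    """Convert a titer in particles (or tu) per ml to mol/l,
    assuming one transducing unit equals one particle."""
    return units_per_ml * 1000.0 / AVOGADRO


phage_molar = titer_to_molar(5e9)
print(f"{phage_molar * 1e12:.1f} pM")  # same order as the ~10 pM quoted in the legend
```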



Test with Mock Library

To allow us to easily distinguish fTC-Good and fTC-Bad, the tetracycline resistance gene in fTC-Good was replaced with a kanamycin resistance marker (Kan^R; see ``Materials and Methods''). A mock library for testing our selection procedure was prepared by spiking 10^9 fTC-Bad phage (in 100 µl) with 10^5 fTC-Good-Kan^R phage. The mock library was then treated with 1 µg/ml sfSTR for 1 h at 37 °C. The reaction was quenched with EDTA and subjected to one round of the selection procedure described under ``Materials and Methods.'' Various dilutions of the final supernatant solution were titered on both tetracycline (fTC-Bad) and kanamycin (fTC-Good-Kan^R). In many repetitions of this experiment, the recovery of fTC-Good was always nearly quantitative, whereas there was generally a 100-1000-fold loss of fTC-Bad, depending on experimental conditions. This result suggests that a single round of selection could enrich a rare good substrate over a large background of poor substrate. Of note is the fact that we use two antibodies simultaneously in our selections. Although the method does work if we use either antibody alone, we have found that the combination of mAbs yields consistently lower backgrounds and reduces the number of clones that are unreactive to mAb 179 in our screening assays (see below).
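The enrichment implied by these recoveries can be made explicit with simple arithmetic. The sketch below uses the input numbers from the mock library (10^9 fTC-Bad, 10^5 fTC-Good-Kan^R) and treats the recovery of good phage as exactly quantitative, which is an idealization of "nearly quantitative"; the function name and its parameters are introduced only for this example.

```python
def enrichment(good_in, bad_in, good_recovery, bad_fold_loss):
    """Return (fold enrichment, good:bad ratio after one round).

    good_recovery is the fraction of good phage recovered (1.0 = quantitative);
    bad_fold_loss is the fold reduction of poor-substrate phage.
    """
    good_out = good_in * good_recovery
    bad_out = bad_in / bad_fold_loss
    ratio_in = good_in / bad_in
    ratio_out = good_out / bad_out
    return ratio_out / ratio_in, ratio_out


for fold_loss in (100, 1000):
    fold, ratio = enrichment(1e5, 1e9, good_recovery=1.0, bad_fold_loss=fold_loss)
    print(f"{fold_loss}-fold loss of fTC-Bad: {fold:.0f}-fold enrichment, "
          f"good:bad = 1:{1 / ratio:.0f}")
```

Under these assumptions a single round moves the good:bad ratio from 1:10^4 to somewhere between 1:100 and 1:10.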

An important concept in phage display is affinity selection. This involves using limiting levels of receptor or phage proteins in order to isolate phage carrying peptides of high affinity (27, 28). The analogous approach for a protease selection would be the use of less protease in order to drive the selection toward those clones carrying better substrates (practically speaking, those cleaved at lower or limiting protease concentrations). As a first test of this proposal, we examined recovery of good phage from the mock library at decreasing protease concentrations. As shown in Fig. 4, when the mock library was treated with less sfSTR, the amount of good phage recovered decreased in an essentially linear fashion. This result indicates that lower concentrations of protease could be employed to gain more selectivity (that is, selection of clones that require lower protease concentrations to be cleaved). This is also a useful test for determining starting protease concentrations. Working at protease levels above that which gives 100% recovery means that more protease is being used than is needed. In contrast, use of too low a protease concentration could result in the recovery of too few clones.
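One way to rationalize this dose dependence is a pseudo-first-order model, sketched below. It assumes that the phage-borne substrate is far below K(M), so that the fraction of a clone cleaved in time t is 1 - exp(-(k(cat)/K(M))[E]t). This model, the function name, and the numerical value of k(cat)/K(M) are assumptions made for illustration and were not fitted to the data in Fig. 4.

```python
import math


def fraction_cleaved(k_spec, enzyme_molar, seconds):
    """Fraction of substrate phage cleaved, assuming [S] << K_M.

    k_spec is k_cat/K_M in M^-1 s^-1 (hypothetical value in this example).
    """
    return 1.0 - math.exp(-k_spec * enzyme_molar * seconds)


K_SPEC = 1.0e4      # M^-1 s^-1, hypothetical specificity constant
T = 3600.0          # 1 h digestion, as in the selection protocol

for enzyme in (1e-9, 2e-9, 4e-9, 1e-6):
    exact = fraction_cleaved(K_SPEC, enzyme, T)
    linear = K_SPEC * enzyme * T  # small-conversion (linear) approximation
    print(f"[E] = {enzyme:.0e} M: cleaved = {exact:.4f}, linear estimate = {linear:.4f}")
```

In this picture recovery scales almost linearly with protease concentration while conversion is low and saturates once nearly all substrate phage are cleaved, which is why protease in excess of the level giving full recovery adds no selectivity.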


Figure 4: Recovery of good phage with decreasing concentrations of sfSTR. A preparation of 10^9 fTC-Bad phage (in 100 µl) was spiked with 10^5 fTC-Good-Kan^R phage and treated with varying concentrations of sfSTR for 1 h at 37 °C. The reaction was quenched with EDTA and subjected to one round of the selection procedure described under ``Materials and Methods.'' Various dilutions of the final supernatant solution were titered on both tetracycline and kanamycin. The percentage recovery of fTC-Bad (circle) and fTC-Good-Kan^R (box) is shown.



Phage Proteolysis Assay

One purpose of the peptide display approach is to obtain as much information as possible from the peptides while they are on phage, before turning to the use of synthetically prepared substrates. We thus developed a simple and rapid method for determining whether the peptide sequence carried by an fTC clone is a good or bad substrate. We had previously determined that while phage would bind to nitrocellulose, the cleaved tether peptide would not (data not shown). Thus, by simply spotting reaction time points onto a nitrocellulose filter and probing with the anti-tether antibody, we could observe the time-dependent loss of the tether from the phage. As shown for the controls fTC-Good and fTC-Bad in Fig. 5, similar signal intensities are observed for the two phage clones when they are not treated with sfSTR. After 10 or 60 min of protease treatment, however, the intensity of the fTC-Good spots decreases, while no such loss is seen for fTC-Bad. This dot-blot assay is thus useful as a way of monitoring the proportion of positive sequences generated during each round of screening.


Figure 5: Proteolytic analysis of phage clones. Individual clones from the initial library or round 3 of screening were prepared, digested with sfSTR for 0, 10, or 60 min, dotted onto nitrocellulose, and detected as described under ``Materials and Methods.'' Also shown are identical treatments with fTC-Good and fTC-Bad, showing digestion of the former, but not the latter. Six clones from round three are hits based on this assay.



Screening the Library

A library of random hexamers, generated using the DNA sequence (NNK)(6) (19), was prepared in fTC-LIB as described under ``Materials and Methods.'' The library contained 2 times 10^8 independent recombinants. In the first round, 2 times 10 tu (about 1000 equivalents of each clone) were treated with sfSTR and subjected to the selection protocol, and the recovered phage were amplified. After each round, 12 clones were analyzed using the phage proteolysis assay. The results of the screening are shown in Table 1. In the first series of screening (rounds A1-A7), the output was not titered before the next round of screening; thus variable amounts of phage were used as input for the succeeding round. In contrast, for series C, the output phage were titered before the next round, and a constant input was used (Table 1). Fig. 5 shows such an analysis for 12 clones chosen randomly from the library and for clones obtained after three rounds of screening in series A. There are no positive phage among the randomly chosen clones (although one is missing or unreactive with the antibody). A number of the clones isolated in round 3, however, appear to be substrates for the enzyme. In later rounds of screening at reduced protease concentrations, a number of clones were found that appeared to be non-reactive with mAb 179 prior to sfSTR treatment (see Table 1). Sequence analysis of four of these clones indicated that the epitope-encoding sequence had not been altered, and blot analysis showed that the phage were still recognized by mAb 3-E7, indicating that the gene III protein had not been cleaved (data not shown). We believe that these non-reactive clones, which appear during more stringent selection, result from alterations in the secondary or tertiary structure of the NH(2) terminus of the pIII protein that mask the mAb 179 epitope.
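The properties of an (NNK)(6) library can be worked out from the standard genetic code, as in the sketch below. The codon table is the standard one; the Poisson estimate of sequence coverage assumes that all DNA sequences are equally likely and is an idealization introduced here, not a figure reported in this work.

```python
import math
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): AMINO[i] for i, c in enumerate(product(BASES, repeat=3))}

# NNK: any base at positions 1 and 2, G or T at position 3.
nnk_codons = [c for c in CODON_TABLE if c[2] in "GT"]
nnk_stops = [c for c in nnk_codons if CODON_TABLE[c] == "*"]
nnk_amino_acids = {CODON_TABLE[c] for c in nnk_codons} - {"*"}

n_dna = len(nnk_codons) ** 6                 # distinct (NNK)6 DNA sequences
n_peptides = len(nnk_amino_acids) ** 6       # distinct stop-free hexapeptides
p_no_stop = (1 - len(nnk_stops) / len(nnk_codons)) ** 6

LIBRARY_SIZE = 2e8  # independent recombinants, from the text
# Idealized Poisson estimate of the fraction of DNA sequences present at least once.
frac_dna_sampled = 1 - math.exp(-LIBRARY_SIZE / n_dna)
# Input corresponding to about 1000 equivalents of each clone.
input_for_1000_equivalents = 1000 * LIBRARY_SIZE

print(len(nnk_codons), len(nnk_amino_acids), nnk_stops)
print(n_dna, n_peptides, round(p_no_stop, 3), round(frac_dna_sampled, 2))
print(f"{input_for_1000_equivalents:.0e} tu")
```

NNK codons encode all 20 amino acids with a single stop codon (TAG), and 1000 equivalents of a library of 2 times 10^8 recombinants corresponds to 2 times 10^11 tu.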



The nucleotide sequences of a number of positive clones from round 3 were determined, and the deduced amino acids corresponding to the random target sequences are shown in Table 2. Also shown for comparison are sequences from clones chosen randomly from the library. No pattern is apparent for the negative clones, but some patterns did appear for the positive sequences. To further characterize these sequence trends, the selection with sfSTR was pursued for three more rounds. In addition, since the majority of the clones screened were judged to be hits at this point, the concentration of sfSTR used in the selection was dropped 5-fold for round A4 and again at round A6. In addition, a new selection series (series C) was initiated, this time starting with only 2 times 10 tu from the library and using an initial sfSTR concentration of 1 µg/ml (see Table 1). As can be seen in Table 2A, in both the A and C series, the number of non-reactive clones increases as the selective pressure is increased (that is, as protease activity is decreased). Positive sequences from the rounds of series A, as well as those from the most recent selection series (C), are also shown in Table 2A.



The trends from the 43 sequences in Table 2 are summarized in Fig. 6. Most invariant was the B position, where the bulk of clones carried a Pro. In all MMP substrates examined thus far, Pro appears to be favored in the P(3) position (2, 5, 14, 15, 16, 17), suggesting that we should ``lock in'' the B position as being equivalent to P(3) (this was confirmed by later analysis of the cleaved peptides; see Table 3). Looking at trends in other positions, we found that stromelysin favored large hydrophobic groups (Leu, Met, Phe, and Tyr) in position C (P(2)), Glu or Ala in D (P(1)), and Leu or Met in E (P`(1)). Positions A and F were not as selective: Ala and Val predominated at A (P(4)), whereas hydroxy (Ser and Thr) and small aliphatic residues were favored at F (P`(2)). Since the protease seemed to recognize the N(6) target site of the library as the P(4)-P`(2) sites, we have little information on trends in the P`(3) site.
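A position-by-position frequency analysis of this kind is straightforward to script. The sketch below tallies residues at positions A-F and reports the most frequent residue at each. The five hexamers are invented for illustration and are not the sequences of Table 2; the function names are likewise hypothetical.

```python
from collections import Counter

POSITIONS = "ABCDEF"  # library positions, read as P4, P3, P2, P1, P1', P2'


def position_frequencies(hexamers):
    """Return one Counter of residues per library position."""
    counts = [Counter() for _ in POSITIONS]
    for seq in hexamers:
        if len(seq) != len(POSITIONS):
            raise ValueError(f"expected a hexamer, got {seq!r}")
        for i, residue in enumerate(seq):
            counts[i][residue] += 1
    return counts


def consensus(counts):
    """Most frequent residue at each position (ties go to the first seen)."""
    return "".join(c.most_common(1)[0][0] for c in counts)


# Hypothetical example sequences (one-letter code), NOT the clones from Table 2.
EXAMPLE_HITS = ["APLELS", "VPMAMT", "APFELA", "RPLALS", "APYEMS"]

counts = position_frequencies(EXAMPLE_HITS)
for name, counter in zip(POSITIONS, counts):
    print(name, dict(counter))
print("consensus:", consensus(counts))
```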


Figure 6: Frequency analysis of substrate phage clones for stromelysin (left) and matrilysin (right) listed in Table 3. For each position in the N(6) library protease target sequence (A-F), the frequency of occurrence of each of the 20 natural amino acids is shown. The y axis indicates the number of times a particular residue occurred in that position. Note that the scale for each position ranges from 0 to 20 clones, except for position B (under stromelysin), which has been compressed to span 0-40 clones, and positions A, C, D, and F (under matrilysin), which have been expanded to span 0-10 clones.





Testing Positive Sequences as Peptides

To verify that the sequences generated were indeed substrates for stromelysin, peptides carrying sequences identical to those in the phage clones were prepared, and their kinetic parameters were determined. Shown in Table 3 are the k(cat)/K(M) values for the good and bad substrate controls (peptides 1 and 2). For reference, we also determined k(cat)/K(M) values for known stromelysin substrates. The substrates chosen were peptide 12 (2,4-dinitrophenyl-Pro-Leu-Gly-Leu-Trp-Ala-D-Arg-NH(2)) (14) and peptide 11 ((7-methoxycoumarin-4-yl)acetyl-Pro-Leu-Gly-Leu-Dpa-Ala-Arg-NH(2)) (21). The values obtained are comparable with those previously determined.

The first test peptides prepared were designed directly from phage clones. The k(cat)/K(M) values for these peptides (3-5) are comparable with, and in some cases better than, the value for the good control, indicating that the selection scheme yields high quality substrates. In addition, peptide 6 was prepared based on a consensus sequence of the original 12 hits and was also a good substrate for stromelysin. When an additional 31 positive clones were sequenced, we noted the residues favored in each position (Fig. 6) and made appropriate substitutions in preparing synthetic peptides 7-10. Converting P(4) to Ala from Arg (peptide 7) gave a 5-fold increase in k(cat)/K(M), whereas replacing Leu with Met in P`(1) (peptide 10) gave only a modest (about 30%) increase in activity. In the P(1) position, Glu dominated over Ala. When we substituted Glu for the P(1) Ala (peptide 9), we saw a 3-fold increase in activity. Interestingly, Niedzwiecki et al. (17) found Ala favored over Gln at P(1); they did not test Glu, however. As noted earlier, the P(3) position was dominated by Pro. The only other residue found repeatedly in the B position was Ala, although at much lower frequency. When Ala was substituted for Pro at P(3) (peptide 8), cleavage rates dropped about 3-fold, similar to the result seen previously (17). The results in Table 3 show that not only can we obtain new peptide substrate sequences from the positive clones, but that the consensus data obtained (Fig. 6A) can also yield valuable information in the design of substrates.

Effect of Peptide Length

Although previous studies of MMP substrate optimization generally found that peptides need only extend from P(3) to P`(3) to maintain activity, the peptides designated as optimized substrates were always heptapeptides or longer (16, 17). Since our peptides were highly optimized already, we decided to test our consensus substrates as hexapeptides. Peptide 13 is a truncation of peptide 7 (except for the substitution of Leu with Ala, since the Leu was from flanking phage sequence). As can be seen in Table 3, this truncation has only a modest effect on activity. By again substituting Glu for Ala in P(1), we see a doubling of activity, yielding the most active stromelysin substrate described to date (^4). This result clearly shows that MMP substrates do not have to be composed of 7 or more amino acids to have good activity.

Screening the Library with Matrilysin

After the successful screening of the phage library with sfSTR, we wanted to further test the system with a second protease. We chose the MMP matrilysin in order to see if the selection could identify sites where selectivity could be engineered into peptide substrates. The library was subjected to three rounds of screening, and the nucleotide sequences of 23 clones designated as positive by the phage proteolysis assay were determined. The deduced protein sequences are shown in Table 2B, and the corresponding frequency analysis is shown in Fig. 6B. Although the patterns are similar to those seen for sfSTR, there are some interesting differences. In particular, at position E, the predominant residues are Leu and Met for sfSTR. In contrast, while Leu remained predominant in that position for matrilysin, only 1 out of 23 clones carried a Met. Likewise, in position C, Leu and Met were predominant for both enzymes, but Phe and Tyr, while frequent for sfSTR, were completely absent in matrilysin. These biases predict that matrilysin will show selectivity versus stromelysin at the P(2) and P`(1) positions. These predictions are borne out by the kinetic analysis of the peptides with matrilysin. Comparison of peptides 6 and 10, which differ only in the P`(1) position, shows that the substitution of Met for Leu results in a modest increase in activity for sfSTR. In contrast, this change causes an 8-fold decrease in activity for matrilysin. Analogous results are seen for peptides 14 and 15. The substitution here of Phe for Leu results in a 3-fold decrease in the k(cat)/K(M) value for sfSTR, but a nearly 3-fold increase for matrilysin. Again, peptides 15 and 16 are the smallest and most active peptide substrates of matrilysin described to date.

Quantitative Proteolysis Assay

The phage proteolysis assay described above is useful for assessing qualitatively whether the peptide sequence carried by a phage is cleaved by the protease. By collecting the dot-blot assay data in a quantitative manner, it should be possible to rank order the various phage substrates. To obtain quantitative data, phage proteolysis assays were stopped and dotted at different time points and the blot scanned by laser densitometry. The relative ``phage concentration'' (the level of phage retaining the reporter tag) at each time point, {P}, was determined by comparison of dot intensity to a serial dilution of untreated phage.(^5) By plotting log {P} versus time for each of the substrate phage clones, we can obtain a set of slopes corresponding to first-order decay rates. Since we are observing loss of substrate at a constant enzyme concentration, knowledge of the initial substrate concentrations is not required (29). Thus, while we cannot determine absolute k(cat)/K(M) values, the comparison of the decay rates enables us to determine the relative k(cat)/K(M) values of the individual clones:
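The relationship referred to here can be sketched as follows. This is a hedged reconstruction of the standard first-order treatment, not a reproduction of the original equation. It assumes that the substrate concentration is far below K(M) and that the enzyme concentration [E] is constant. The symbol k_obs and the clone indices i and j are introduced here for illustration, and {P} is written as [P].

```latex
\begin{align}
-\frac{d[P]}{dt} &= \frac{k_{cat}}{K_M}\,[E]\,[P] \\
\ln [P]_t &= \ln [P]_0 - k_{obs}\,t, \qquad k_{obs} = \frac{k_{cat}}{K_M}\,[E] \\
\frac{k_{obs,i}}{k_{obs,j}} &= \frac{(k_{cat}/K_M)_i}{(k_{cat}/K_M)_j}
\end{align}
```

Because [E] is the same in every digest, it cancels in the ratio, and the initial concentration only sets the intercept. Plotting a base-10 logarithm rescales every slope by the same factor (1/2.303), so the ratio of slopes is unchanged.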

The data for three clones, as well as fTC-Good and fTC-Bad, are shown in Fig. 7. The decrease of log {P} with time is plotted. The slope for each curve was determined by linear regression and plotted versus the k(cat)/K(M) values (Fig. 7, inset) for the known synthetic peptides (from Table 3). The relationship is not an ideal line, suggesting that the structure of the protein flanking the 6 amino acids in the target region influences the relative k(cat)/K(M) values for the peptides on the phage. Thus, while the assay may not be able to discriminate between clones with similar k(cat)/K(M) values (fTC-Good, clones A3-3 and A3-9), the data do indicate that the assay is fairly predictive in distinguishing poor (fTC-Bad) from highly active substrates (clone A5-2).
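The slope determination can be illustrated with a minimal least-squares sketch. The time points, the densitometry readings, and the clone names below are hypothetical and are not the data of Fig. 7; the calculation simply fits log10 of the relative phage concentration against time and compares slopes.

```python
import math

def decay_slope(times_min, rel_conc):
    """Least-squares slope of log10(relative concentration) versus time."""
    ys = [math.log10(c) for c in rel_conc]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, ys))
    return sxy / sxx

# Hypothetical readings (fraction of untreated phage signal); not data from the paper.
times = [0, 15, 30, 60]
clones = {
    "fast_clone": [1.0, 0.5, 0.25, 0.0625],  # signal halves every 15 min
    "slow_clone": [1.0, 0.84, 0.71, 0.5],    # signal roughly halves in 60 min
}

slopes = {name: decay_slope(times, conc) for name, conc in clones.items()}

# At equal enzyme concentration, the ratio of slopes estimates the ratio of k(cat)/K(M).
relative = slopes["fast_clone"] / slopes["slow_clone"]
print(slopes, relative)
```

Only ratios of slopes are interpreted here, which is why absolute phage concentrations are not needed.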


Figure 7: Quantitative analysis of the proteolysis of phage clones. Preparations of phage clones A3-3, A3-9, A5-2, fTC-Good, and fTC-Bad were made and titered. Equivalent titers of each clone were treated with sfSTR at 5 µg/ml at 37 °C, and the reactions were stopped at the time points indicated, dotted onto nitrocellulose, and detected as in Fig. 5. The blot was then scanned by laser densitometry, and the relative phage concentration at each time point was determined by comparison to a serial 2-fold dilution of untreated phage. The log of {P} for each clone is shown at the different time points. Inset, the slope of each decay curve is plotted against the known k(cat)/K(M) values of the synthetic peptides (from Table 3).




DISCUSSION

The discovery of peptide substrates for proteases has traditionally been a slow and expensive exercise, requiring the synthesis and testing of large numbers of synthetic peptides. Recently, several groups have developed innovative methods for preparing and analyzing large numbers of peptides as protease substrates. A number of these methods utilize a pool of chemically synthesized peptides and are used to screen substitutions in one or two positions at a time (30, 31, 32, 33). A few methods use recombinant techniques and offer the opportunity to screen large numbers of peptides (34, 35, 36). None of these techniques offered a practical approach to screening very large numbers of substrates until the efforts of Matthews and Wells (13) made it possible to screen >10^7 pentameric substrates at once. We have here presented an analogous system that we have used to screen >10^8 hexameric sequences.
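To put these library sizes in context, the short sketch below computes the number of possible peptides of a given length and an idealized estimate of how much of that sequence space a library samples. The coverage formula is a Poisson approximation that assumes every sequence is equally likely. Real libraries encoded by degenerate codons are not uniform, so the figure is illustrative only, and the function names are hypothetical.

```python
import math

def sequence_space(length, alphabet=20):
    """Number of distinct peptides of the given length over the amino acid alphabet."""
    return alphabet ** length

def expected_coverage(n_clones, n_sequences):
    """Expected fraction of sequences present at least once (uniform sampling assumed)."""
    return 1.0 - math.exp(-n_clones / n_sequences)

pentamers = sequence_space(5)  # 3,200,000
hexamers = sequence_space(6)   # 64,000,000

# 1e8 is used as a stand-in for the ">10^8" hexameric sequences mentioned in the text.
coverage = expected_coverage(1e8, hexamers)
print(pentamers, hexamers, round(coverage, 2))
```

Under these idealized assumptions, a library of about 10^8 independent clones would be expected to contain most, but not all, of the 6.4 x 10^7 possible hexamers.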

In addition, we have introduced a method for assaying putative substrate clones. This assay is simple, rapid, and requires only very small amounts (100 µl) of culture supernatant. With the phage proteolysis assay, the degree of enrichment achieved in each round of selection is easily monitored, allowing adjustment of the selection conditions (e.g. protease concentration) for each subsequent round without delaying the selection process. Moreover, the use of solution phase digests allows us to have precise control over enzyme and substrate concentrations, which is not possible in assay systems where one or more of the assay components are immobilized on beads or microtiter plates.

We have tested our system with recombinant forms of the MMPs stromelysin and matrilysin, as we know that substitutions in any of the six positions from P(3) to P`(3) will affect the specificity constant of the peptide substrates. This is the only information we used to start our screen. Our goal was to find substrates as potent as the current literature standards (8, 16, 17, 18). We were able to find clones that were as good as or better than these standards (clones A5-2 and A3-3, Table 2 and Table 3). Due to the large number of sequences obtained, we could identify trends toward certain amino acids in some positions. Using the consensus sequence suggested by the positive phage clones available at the time, we also designed and tested five consensus peptides (6-10). We were pleased to find that these consensus peptides were better than the literature standards, and that after having prepared only four synthetic peptides based on consensus results, we had achieved a k(cat)/K(M) value nearly 20-fold better than our positive control. In fact, peptides 14 and 15 represent both the smallest and the most active peptide substrates of stromelysin and matrilysin, respectively, described to date.

The use of the phage system also gave us an unexpected insight into sites of selectivity between the two enzymes. While initial examination of the data reveals that these proteases have overlapping substrate specificity, closer inspection of the phage and peptide results indicates that certain subsites show distinct preferences: in particular, the opposing preferences for Phe and Leu at P(2) (peptides 14 and 15) and Leu and Met at P`(1) (peptides 6 and 10). Thus, while we learned of the similarity in substrate preferences for the two enzymes, there was sufficient data generated to allow the construction of substrates with differential sensitivities (see for example, peptide 3).

One important piece of information that cannot be derived from the peptides while they are on phage is the location of the site of peptide cleavage. We were only able to determine this once synthetic peptides were prepared and analyzed. It is thus interesting that the sequences obtained ``lined up'' so well in that the vast majority of the clones had Pro in their B position. Without this invariant residue, lining up the hits to build a consensus sequence might have been more difficult. It is unclear why the predicted P(3) position was nearly always located in B, and only rarely in A. Possibilities include biases due to the sequence flanking the random N(6) region, as well as those due to steric hindrance from the tethers and pIII protein.

Another advantage of the phage systems over the screening of pools of synthetic peptides is that we acquire discrete rather than averaged data. The synthetic screening methods described above yield only averaged results for each position, as summarized for the phage data in Fig. 6. The discrete results, shown in Table 2, allow us to search for trends which might indicate interactions between one or more subsites. For example, the most abundant amino acids in the D position (P1) were Glu, followed by Ala. Examination of Table 2 shows that when D = Glu, the residue in the C (P2) position is generally a bulky hydrophobic residue (Tyr, Leu, Met, and Phe). In contrast, when D = Ala, there is no marked preference. Another example can be shown for the eight cases where C = Met, where in six of these seven, E (P`1) is also Met. Yet when C instead is the closely related residue Leu (in nine of the hits), there is no such correlation: the most common residue in the E position in these cases is the hydrophilic Ser (three times) followed by Leu (twice) and then other residues. Although the significance of such correlations remains to be tested, they are intriguing and can lead to models that can be further tested using newly designed substrates. However, these correlations can be made only with techniques that yield discrete results.
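The kind of conditional tally described above (which residues occur at one position, given a particular residue at another) requires per-clone sequences and can be sketched as follows. The function name and the sequences are hypothetical and serve only to show the bookkeeping; they are not the clones of Table 2.

```python
from collections import Counter

def conditional_counts(hexamers, given_pos, given_res, target_pos):
    """Tally residues at target_pos among sequences with given_res at given_pos."""
    idx = {label: i for i, label in enumerate("ABCDEF")}
    return Counter(
        seq[idx[target_pos]]
        for seq in hexamers
        if seq[idx[given_pos]] == given_res
    )

# Hypothetical one-letter sequences for illustration only; not data from the paper.
clones = ["VPMEMM", "RPMAMR", "KPLESS", "TPLALT", "GPMELQ"]

e_given_c_met = conditional_counts(clones, "C", "M", "E")
e_given_c_leu = conditional_counts(clones, "C", "L", "E")
print(dict(e_given_c_met), dict(e_given_c_leu))
```

A pooled, position-averaged readout cannot support this query, because it discards which residues occurred together in the same clone.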

The differences in the approaches used for monovalent and polyvalent phage as protease substrate discovery tools invite comparison of the two systems. While the monovalent system has been shown to be quite useful, the polyvalent system possesses certain advantages: 1) all phage can act as substrate phage in the polyvalent system. In monovalent systems, it is estimated that only 10% of the phage particles carry one copy of the recombinant pIII protein (25). This increases the effective substrate concentration in the polyvalent system by at least 50-fold, thus increasing the sensitivity of the system (due to higher concentration of substrate at a given level of phage). Polyvalent phage preps give stronger signal on Western blots, making the phage proteolysis assay possible. 2) Since 90% of the monovalent phage do not carry pIII fusions, the non-recombinant phage lacking the tether must be removed prior to selection. This is accomplished by immobilizing the recombinant phage in microtiter plates coated with tether-binding protein and treating the phage with protease while immobilized. Polyvalent phage, being 100% recombinant, are digested in solution rather than immobilized on a solid surface. The advantages of this are: (i) there is little restriction on the number of phage that can be screened in solution, but the surface system limits the number of phage that can be routinely immobilized on microtiter plates. Scale-up of the solution phase system is thus very convenient when significantly larger libraries are prepared (e.g. in our first experiment, 10 phage were treated in a single reaction); (ii) protease resistance of tether binding protein (e.g. mAb) is not an issue; (iii) solution proteolysis offers more precise control of cleavage conditions. This has proven especially useful in the quantitative dot-blot assay.
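The 50-fold figure can be rationalized with a short worked estimate. It rests on an assumption not stated in this section: that each polyvalent phage particle displays about five copies of the pIII fusion, the commonly cited upper value for filamentous phage. The 10% figure for monovalent phage is taken from the text.

```latex
\[
\frac{\text{substrate copies per polyvalent phage}}{\text{substrate copies per monovalent phage}}
\approx \frac{1.00 \times 5}{0.10 \times 1} = 50
\]
```

If fewer than 10% of monovalent particles carry a fusion, the ratio is correspondingly larger, which is consistent with the phrase ``at least 50-fold''.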

The major disadvantage of the polyvalent system described here is the appearance of non-reactive phage clones, which does not occur in the monovalent system because of the pre-binding step which essentially eliminates clones with defective epitopes. Nothing about the polyvalent method, however, precludes our use of a binding step in later rounds to eliminate non-reactive clones.

In summary, we describe a system that can be used for the routine isolation of new substrates for poorly characterized endoproteases. The system is simple and rapid. Few assumptions about the nature of the selected protease need be made (what is the true physiological substrate; is it a serine or cysteine protease, etc.). In fact, the protease need not be pure; it should only be free of other proteolytic (or protease inhibiting) activities. Filamentous phage are valuable tools for studying proteases as they are generally protease resistant. We have found no nonspecific degradation due to stromelysin, matrilysin, HIV protease, or tissue-type plasminogen activator.(^6) If a protease is found that degrades the phage, one should be able to treat the phage vector with an excess of protease, select for mutant phage that have become resistant to proteolysis, and use this modified vector to prepare a new substrate library.


FOOTNOTES

*
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked ``advertisement'' in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

§
To whom correspondence should be addressed: 3410 Central Expressway, Santa Clara, CA 95051. Tel.: 408-522-5718; Fax: 408-481-0522; Marc_Navre{at}qmgates.affymax.com.

(^1)
The abbreviations used are: MMP, matrix metalloprotease; HIV, human immunodeficiency virus; mAb, monoclonal antibody; ON, oligonucleotides; Kan^R, kanamycin resistance; tu, transducing units; fTC, fAFF1-tetherC; fTC-Good, fTC carrying a good substrate for stromelysin; fTC-Bad, fTC carrying a bad substrate for stromelysin; fTC-LIB, fTC derivative used for preparing the library; DPA, 3-(2,4-dinitrophenyl)-L-2,3-diaminopropionyl; Ac, acetyl; sfSTR, COOH terminally truncated form of stromelysin; pro-sfSTR, the proenzyme form of sfSTR.

(^2)
R. Barrett, manuscript in preparation.

(^3)
R. J. Armstrong, L. Shi, and M. Navre, unpublished results.

(^4)
Niedzwiecki et al. (17) described the octapeptide Arg-Pro-Lys-Pro-Leu-Ala*Phe-Trp-NH(2) (* = site of cleavage), with a k(cat)/K(M) of 18,000 M^-1 s^-1, which was the best substrate described using only natural amino acids (interestingly, that substrate is very similar in sequence to clone C4-33). The best stromelysin substrate currently reported is N-(2,4-dinitrophenyl)-Arg-Pro-Lys-Pro-Leu-Ala*Nva-Trp-NH(2) (Nva = norvaline), also from Niedzwiecki et al. (17), which has a k(cat)/K(M) of 45,000 M^-1 s^-1.

(^5)
We have noted that the reactivity of the phage with the detecting antibody varies from clone to clone. Thus, no attempt was made to determine absolute phage concentrations.

(^6)
L. Ding, M. Smith, and M. Navre, unpublished results.


ACKNOWLEDGEMENTS

We would like to thank Lois Aldwin and Grace Hsu for providing us with synthetic peptides for these studies. mAb 179 was a gift of Ron Barrett. We would like to thank Bob Gray for peptide 11 as well as much helpful advice, Bill Fitch for the mass spectrometry analysis of the peptide products, Zhengyu Yuan for advice, and Bill Dower for valuable comments on the manuscript.

Note Added in Proof-A highly active peptide substrate for stromelysin with a k(cat)/K(M) value of 218,000 was recently identified (Nagase, H., Fields, C. G., and Fields, G. B. (1994) J. Biol. Chem. 269, 20952-20957). Although this substrate is more active than our best substrate, it does contain 11 amino acids (some of which are unnatural) and is thus much larger than our substrates.


REFERENCES

  1. Fersht, A. R. (1985) Enzyme Structure and Mechanism, pp. 105-106, Freeman, New York
  2. Fields, G. B., Netzel-Arnett, S. J., Windsor, L. J., Engler, J. A., Birkedal-Hansen, H., and Van Wart, H. E. (1990) Biochemistry 29, 6670-6677
  3. Hellen, C. U., Krausslich, H. G., and Wimmer, E. (1989) Biochemistry 28, 9881-9890
  4. Krausslich, H. G., Ingraham, R. H., Skoog, M. T., Wimmer, E., Pallai, P. V., and Carter, C. A. (1989) Proc. Natl. Acad. Sci. U. S. A. 86, 807-811
  5. Netzel-Arnett, S., Fields, G., Birkedal-Hansen, H., and Van Wart, H. E. (1991) J. Biol. Chem. 266, 6747-6755
  6. Griffiths, J. T., Phylip, L. H., Konvalinka, J., Strop, P., Gustchina, A., Wlodawer, A., Davenport, R. J., Briggs, R., Dunn, B. M., and Kay, J. (1992) Biochemistry 31, 5193-5200
  7. Tözsér, J., Weber, I. T., Gustchina, A., Bláha, I., Copeland, T. D., Louis, J. M., and Oroszlan, S. (1992) Biochemistry 31, 4793-4800
  8. Teahan, J., Harrison, R., Izquierdo, M., and Stein, R. L. (1989) Biochemistry 28, 8497-8501
  9. Poorman, R. A., Tomasselli, A. G., Heinrikson, R. L., and Kezdy, F. J. (1991) J. Biol. Chem. 266, 14554-14561
  10. Tomasselli, A. G., Hui, J. O., Adams, L., Chosay, J., Lowery, D., Greenberg, B., Yem, A., Deibel, M. R., Zurcher-Neely, H., and Heinrikson, R. L. (1991) J. Biol. Chem. 266, 14548-14553
  11. Hoess, R. H. (1993) Curr. Opin. Struct. Biol. 3, 572-579
  12. Scott, J. K. (1992) Trends Biochem. Sci. 17, 241-245
  13. Matthews, D. J., and Wells, J. A. (1993) Science 260, 1113-1117
  14. Stack, M. S., and Gray, R. D. (1989) J. Biol. Chem. 264, 4277-4281
  15. Seltzer, J. L., Akers, K. T., Weingarten, H., Grant, G. A., McCourt, D. W., and Eisen, A. Z. (1990) J. Biol. Chem. 265, 20409-20413
  16. Netzel-Arnett, S., Sang, Q. X., Moore, W. G. I., Navre, M., Birkedal-Hansen, H., and Van Wart, H. E. (1993) Biochemistry 32, 6427-6432
  17. Niedzwiecki, L., Teahan, J., Harrison, R. K., and Stein, R. L. (1992) Biochemistry 31, 12618-12623
  18. Netzel-Arnett, S., Mallya, S. K., Nagase, H., Birkedal-Hansen, H., and Van Wart, H. E. (1991) Anal. Biochem. 195, 86-92
  19. Cwirla, S. E., Peters, E. A., Barrett, R. W., and Dower, W. J. (1990) Proc. Natl. Acad. Sci. U. S. A. 87, 6378-6382
  20. Gramsch, C., Meo, T., Riethmüller, G., and Herz, A. (1983) J. Neurochemistry 40, 1220-1226
  21. Knight, C. G., Willenbrock, F., and Murphy, G. (1992) FEBS Lett. 296, 263-266
  22. Parmley, S. F., and Smith, G. P. (1988) Gene (Amst.) 73, 305-318
  23. Scott, J. K., and Smith, G. P. (1990) Science 249, 386-390
  24. Devlin, J. J., Panganiban, L. C., and Devlin, P. E. (1990) Science 249, 404-406
  25. Bass, R., Greene, R., and Wells, J. A. (1990) Proteins 8, 309-314
  26. Model, P., and Russel, M. (1988) in The Bacteriophages (Calendar, R., ed) pp. 375-456, Plenum Press, New York
  27. Lowman, H. B., Bass, S. H., Simpson, N., and Wells, J. A. (1991) Biochemistry 30, 10832-10838
  28. Barrett, R. W., Cwirla, S. E., Ackerman, M. S., Olson, A. M., Peters, E. A., and Dower, W. J. (1992) Anal. Biochem. 204, 357-364
  29. Schellenberger, V., Siegel, R. A., and Rutter, W. J. (1993) Biochemistry 32, 4344-4348
  30. Petithory, J. R., Masiarz, F. R., Kirsch, J. F., Santi, D. V., and Malcolm, B. A. (1991) Proc. Natl. Acad. Sci. U. S. A. 88, 11510-11514
  31. Birkett, A. J., Soler, D. F., Wolz, R. L., Bond, J. S., Wiseman, J., Berman, J., and Harris, R. B. (1991) Anal. Biochem. 196, 137-143
  32. Berman, J., Green, M., Sugg, E., Anderegg, R., Millington, D. S., Norwood, D. L., McGeehan, J., and Wiseman, J. (1992) J. Biol. Chem. 267, 1434-1437
  33. Schellenberger, V., Turck, C. W., Hedstrom, L., and Rutter, W. J. (1993) Biochemistry 32, 4349-4353
  34. Smith, T. A., and Kohorn, B. D. (1991) Proc. Natl. Acad. Sci. U. S. A. 88, 5159-5162
  35. Baum, E. Z., Bebernitz, G. A., and Gluzman, Y. (1990) Proc. Natl. Acad. Sci. U. S. A. 87, 10023-10027
  36. Dasmahapatra, B., Didomenico, B., Dwyer, S., Ma, J., Sadowski, I., and Schwartz, J. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 4159-4162
  37. Marcy, A. I., Eiberger, L. L., Harrison, R., Chan, H. K., Hutchinson, N. I., Hagmann, W. K., Cameron, P. M., Boulton, D. A., and Hermes, J. D. (1991) Biochemistry 30, 6476-6483
