Knowledge-based modeling of the serine protease triad into non-proteases

Prathima Iengar1 and C. Ramakrishnan2

Molecular Biophysics Unit, Indian Institute of Science, Bangalore-560012, India

1 Present address: Astra Biochemicals Pvt. Ltd, PB No. 359, 18th Cross, Malleswaram, Bangalore-560003, India


    Abstract
 
The Asp–His–Ser triad of serine proteases has been regarded, in the present study, as an independent catalytic motif, because in nature it has been incorporated at the active sites of enzymes as diverse as the serine proteases and the lipases. Incorporating this motif into non-protease scaffolds, by rational design and mutagenesis, might lead to the generation of novel catalysts. As an aid to such experiments, a knowledge-based computer modeling procedure has been developed to model the protease Asp–His–Ser triad into non-proteases. Catalytic triads from a set of trypsin family proteases have been analyzed and criteria that characterize the geometry of the triads have been obtained. Using these criteria, the modeling procedure first identifies sites in non-proteases that are suitable for modeling the protease triad. H-bonded Asp–His–Ser triads, that mimic the protease catalytic triad in geometry, are then modeled in at these sites, provided it is stereochemically possible to do so. Thus non-protease sites at which H-bonded Asp–His–Ser triads are successfully modeled in may be considered for mutagenesis experiments that aim at introducing the protease triad into non-proteases. The triad modeling procedure has been used to identify sites for introducing the protease triad in three binding proteins and an immunoglobulin. A scoring function, depending on inter-residue distances, solvent accessibility and the substitution potential of amino acid residues at the modeling sites in the host proteins, has been used to assess the quality of the model triads.

Keywords: catalytic triad/knowledge-based modeling/serine protease/triad modeling procedure


    Introduction
 
An Asp–His–Ser catalytic triad is present at the active site of every serine protease (Blow et al., 1969; Wright et al., 1969; Blow, 1976). Site-specific mutagenesis experiments have shown that the triad is indispensable for catalysis and that replacing Ser, His or Asp results in large decreases in turnover number (Craik et al., 1987; Carter and Wells, 1988; Corey and Craik, 1992). There are two families of serine proteases: the trypsin and subtilisin families (Kraut et al., 1972). Enzymes belonging to the two families are unrelated in sequence and three-dimensional structure, but employ the same Asp–His–Ser triad for catalysis. An Asp–His–Ser triad is also observed at the active sites of lipases (Blow, 1990, 1991; Brady et al., 1990; Winkler et al., 1990; Brzozowski et al., 1991) and a closely related Glu–His–Ser triad is observed at the active site of acetylcholinesterase (Sussman et al., 1991). Serine protease activity is also associated with Drosophila alcohol dehydrogenase (Fibla et al., 1993). Further, catalytic autoantibodies have been observed in patients with autoimmune disease; these exhibit peptide bond-cleaving activity by employing Ser and His residues in a serine protease-type mechanism (Paul et al., 1989; Gao et al., 1995). These examples suggest that the Asp–His–Ser triad is an independent catalytic motif that has, in nature, been coupled to different binding sites to perform different functions. One can therefore try, by rational design and protein engineering, to introduce the protease triad into non-proteases and thus attempt to design novel catalysts.

The combining sites of antibodies can be used as hosts for the Asp–His–Ser triad. Antibodies have been raised to analogs of the transition state intermediate in amide hydrolysis and have been shown to enhance the rate of hydrolysis of amide substrates considerably over the uncatalyzed reaction (Janda et al., 1988; Lerner et al., 1991). These catalytic antibodies, like serine proteases, are believed to process substrates via an acyl-antibody intermediate, perhaps involving a Ser residue. Baldwin and Schultz (1989) have introduced a catalytic imidazole at the combining site of one such catalytic antibody for ester hydrolysis. Their Y34H mutant antibody hydrolyzed ester substrates 45 times faster than the wild-type antibody. Thus, novel catalysts may be designed by introducing the protease triad into the combining sites of antibodies.

This paper describes a computer modeling procedure for incorporating a trypsin-type Asp–His–Ser triad into non-proteases of known structure. The procedure selects sites in non-proteases that are geometrically suitable for introducing the protease triad and models the triad at these locations. This study was motivated by the possibility that protein engineering approaches, guided by computer-aided rational design, can be used to incorporate the protease triad into non-protease scaffolds, thus facilitating the design of novel catalysts. Previously, metal binding sites have been engineered into proteins, based on computer modeling methods (Higaki et al., 1990; Iverson et al., 1990). A computational procedure for modeling ligand binding sites into proteins has also been developed (Hellinga and Richards, 1991) and, guided by this procedure, a copper binding site has been successfully grafted into thioredoxin (Hellinga et al., 1991). More recently, a catalytic Asp, His, Ser triad has been successfully grafted into cyclophilin, a cis–trans prolyl isomerase without hydrolytic activity (Quéméneur et al., 1998). The mutant cyclophilin catalyzes the hydrolysis of X–Pro bonds. The study clearly demonstrates that (i) the protease Asp–His–Ser triad may be regarded as an independent catalytic motif and (ii) the triad can be successfully grafted into non-protease scaffolds to introduce hydrolytic activity.


    Materials and methods
 
To develop a knowledge-based triad modeling procedure, the geometry of catalytic Asp–His–Ser triads taken from the crystal structures of serine proteases was first analyzed. A modeling procedure was then developed to introduce Asp–His–Ser triads, similar in geometry to the protease triads, into non-proteases.

Analysis of the geometry of protease Asp–His–Ser triads

The geometry of protease Asp–His–Ser triads was analyzed using catalytic triads taken from a set of nine crystal structures of trypsin family proteases. The structures were obtained from the Brookhaven Protein Data Bank (PDB) (Bernstein et al., 1977) and their PDB codes are: 2ALP, 1SGT, 1TGS, 5CHA, 2SGA, 1PPF, 3EST, 3RP2 and 1TPP. The catalytic triad in each structure is formed by the residues His57, Asp102 and Ser195.

Geometric features of the catalytic triads that could be used to develop a triad modeling procedure were analyzed. The (Cα–Cβ) triads, formed by the three Cα- and three Cβ-atoms of each catalytic triad, were analyzed by computing six virtual bond length parameters (Vl) and an interplanar angle parameter (θα–β). The six length parameters (Vl) define the sides of two triangles, one formed by the three Cα-atoms and the other by the three Cβ-atoms, and θα–β refers to the angle between the two triangles (Figure 1, Table I). The virtual bond lengths defined between pairs of Cβ-atoms are smaller than those defined between the corresponding pairs of Cα-atoms (Table I), an indication that the Asp, His and Ser side chains point towards each other in the catalytic triads. Further, the three Cβ-atoms of each catalytic triad lie on the same side of the plane formed by the three Cα-atoms and, as a result, the interplanar angle, θα–β, is small (Figure 2 and Table I). The narrow ranges and small standard deviations for the Vl and θα–β parameters in Table I suggest that the geometry of the protease (Cα–Cβ) triads is faithfully preserved in catalytic triads of different trypsin family proteases.



Fig. 1. The (Cα–Cβ) triad of serine proteases. The geometry of the triad was defined by six virtual bond lengths, Vl (three formed between the three Cα-atoms and three between the three Cβ-atoms) and the angle between the planes formed by the three Cα- and three Cβ-atoms (θα–β). The six Vl are indicated as Vl(1), Vl(2), Vl(3), etc.

 

Table I. The values for the six virtual bond length parameters (shown in Figure 1) and the interplanar angle parameter, in the protease (Cα–Cβ) triads
 


Fig. 2. Side view of the triangles formed by the three Cα- and three Cβ-atoms in a typical catalytic triad of the trypsin family. The three Cβ-atoms all lie on the same side of the plane formed by the three Cα-atoms.

 
(Cα–Cβ) triads in non-proteases (formed by any three Cα- and corresponding three Cβ-atoms), whose dimensions match those of the protease (Cα–Cβ) triad, may be expected to be of suitable geometry for positioning Asp, His and Ser side chains in the arrangement in which they appear in the catalytic triads. Thus the geometry of the protease (Cα–Cβ) triad was characterized by relaxing the extrema for each Vl parameter by 0.5 Å and by relaxing the extrema for the θα–β parameter moderately, to constitute the range 0–30°. These relaxed ranges, given in Table I, were used in the triad modeling procedure to identify sites in non-proteases that are suitable for introducing the protease triad.
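As an illustration of the parameters defined above, the following minimal sketch computes the three virtual bond lengths of a set of three atoms and the angle between the Cα and Cβ planes. It is not the authors' program. The function names are hypothetical, coordinates are assumed to be (x, y, z) tuples in Å given in the same residue order for both atom sets, and the interplanar angle is assumed to be the angle between the two plane normals (so it can range from 0 to 180°).

```python
import math
from itertools import combinations

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def virtual_bond_lengths(atoms):
    """Three pairwise distances for three atoms, in the pair order (0,1), (0,2), (1,2)."""
    return [norm(sub(atoms[i], atoms[j])) for i, j in combinations(range(3), 2)]

def interplanar_angle(ca, cb):
    """Angle in degrees between the Calpha plane and the Cbeta plane.

    Assumption: the angle is measured between the plane normals, each built
    from the atoms in the same residue order. Collinear atoms are not handled.
    """
    n_a = cross(sub(ca[1], ca[0]), sub(ca[2], ca[0]))
    n_b = cross(sub(cb[1], cb[0]), sub(cb[2], cb[0]))
    cos_t = dot(n_a, n_b) / (norm(n_a) * norm(n_b))
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding error
    return math.degrees(math.acos(cos_t))
```

Calling `virtual_bond_lengths` once on the Cα-atoms and once on the Cβ-atoms gives the six Vl values of one (Cα–Cβ) triad; these, together with `interplanar_angle`, can then be compared with the ranges of Table I.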

The protease Asp–His–Ser triads are also characterized by side chain H-bonds: His Nδ1 donates a proton and forms a strong H-bond with one of the carboxylate oxygens of Asp, and His Nε2 and Ser Oγ are within H-bonding distance of each other (Wright, 1972; Matthews et al., 1977; Cohen et al., 1981) (Figure 3). The occurrence or otherwise of the latter H-bond has been a matter of controversy, but, if present, it is expected to be of the –N···H–O type (Matthews et al., 1977; Tsukada and Blow, 1985). This H-bond has been shown to be long in the non-liganded enzyme structures, but short (indicating strong interaction) in the liganded complexes (Marquart et al., 1983). During catalysis, however, His is protonated at Nε2 by the transfer of the Ser hydroxyl proton (Kossiakoff and Spencer, 1981). The triad modeling procedure described here ensures that the Asp–His and His–Ser side chain H-bonds are incorporated into the Asp–His–Ser triads modeled into non-proteases.



Fig. 3. Schematic diagram of the protease Asp–His–Ser triad showing the side chain torsion angles, χ1, χ2, of the three residues and the Asp···His and His···Ser side chain H-bonds.

 
The triad modeling procedure

A computer modeling procedure was developed to model the protease Asp–His–Ser triad into non-proteases. This procedure was developed along the same lines as the disulfide modeling procedure (MODIP), which introduces –S–S– bridges into proteins (Sowdhamini et al., 1989). The triad modeling procedure first selects sites in non-proteases that are suitable for introducing the protease triad. Asp, His and Ser residues are then modeled at these locations, to reproduce the geometry of the protease catalytic triads. The modeling procedure consists of four steps, as shown in Figure 4 and described below.



Fig. 4. Flow chart of the triad modeling procedure.

 
In the first step, the protease (Cα–Cβ) triad (Figure 1) is used as a search motif to identify sites in non-proteases that are suitable for modeling the protease Asp–His–Ser triad. Each non-protease is searched for sets of three residues that are so positioned that the (Cα–Cβ) triad formed by the three Cα- and three Cβ-atoms of each set of three residues matches the protease (Cα–Cβ) triad in dimensions. In other words, the six Vl and θα–β parameters of each non-protease (Cα–Cβ) triad should lie within the corresponding Vl and θα–β ranges of the protease (Cα–Cβ) triads (Table I). However, simultaneously requiring all the parameters of each non-protease (Cα–Cβ) triad to occur in the protease ranges is too restrictive a condition. Therefore, to permit some leniency, one of the three (Cα···Cα) and one of the three (Cβ···Cβ) virtual bond lengths were allowed to occur marginally outside their respective protease ranges. That is, (Cα–Cβ) triads in which any one of the three (Cα···Cα) and/or any one of the three (Cβ···Cβ) Vl parameters lie ≤0.25 Å outside their protease limits are also accepted by the modeling procedure. Non-protease (Cα–Cβ) triads thus identified constitute sites that are of suitable geometry for modeling the protease triad. The procedure also imposes the restriction that the three non-protease residues that form each (Cα–Cβ) triad be separated in sequence by seven or more residues. This prevents neighboring residues from forming a triad modeling site. Only non-Gly residues form triad modeling sites, because Gly residues lack a Cβ-atom and are ignored by the modeling procedure. Replacing Gly by the bulkier Asp, His and Ser residues may, in any case, be expected to be sterically unfavorable.
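The acceptance rule of this first step can be sketched as follows. This is an illustrative reading of the text, not the authors' code: all names are hypothetical, the Vl ranges must be supplied by the caller (the Table I values are not reproduced here), "separated by seven or more residues" is assumed to mean a residue-number difference of at least 7, and all three residues are assumed to lie in a single chain.

```python
TOLERANCE = 0.25            # angstroms; leniency stated in the text
THETA_RANGE = (0.0, 30.0)   # degrees; relaxed interplanar range stated in the text
MIN_SEPARATION = 7          # assumed: minimum difference in residue number

def excess(value, low, high):
    """Amount by which value falls outside [low, high]; 0.0 if inside."""
    if value < low:
        return low - value
    if value > high:
        return value - high
    return 0.0

def lengths_ok(values, ranges, tolerance=TOLERANCE):
    """At most one of the three lengths may lie outside its range,
    and then by no more than the tolerance."""
    excesses = [excess(v, lo, hi) for v, (lo, hi) in zip(values, ranges)]
    outliers = [e for e in excesses if e > 0.0]
    return len(outliers) <= 1 and all(e <= tolerance for e in outliers)

def separation_ok(residue_numbers, min_separation=MIN_SEPARATION):
    """Sequence neighbours are excluded (single-chain assumption)."""
    ordered = sorted(residue_numbers)
    return all(b - a >= min_separation for a, b in zip(ordered, ordered[1:]))

def is_modeling_site(vl_ca, vl_cb, theta, ca_ranges, cb_ranges, residue_numbers):
    """Apply the length, interplanar angle and sequence separation criteria."""
    return (lengths_ok(vl_ca, ca_ranges)
            and lengths_ok(vl_cb, cb_ranges)
            and THETA_RANGE[0] <= theta <= THETA_RANGE[1]
            and separation_ok(residue_numbers))
```

The Cα and Cβ lengths are checked separately because the text allows one marginal outlier in each group rather than one outlier overall.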

In the second step of the modeling procedure, Asp, His and Ser side chains are generated at each non-protease (Cα–Cβ) site identified above, in place of the side chains of the non-protease residues. The bond length and bond angle parameters used for generating the Asp, His and Ser side chains were calculated as mean values from the set of protease catalytic triads. The procedure then models each generated Asp–His–Ser triad by side chain rotations and examines if conformations are possible, in which the Asp–His and His–Ser side chains are H-bonded, as in the protease catalytic triads. Different conformations of the model triad are generated by side chain rotations about five torsion angles: two of Asp, two of His and one of Ser (Figure 3). Previous analyses of amino acid side chain conformations in proteins (Janin et al., 1978) have shown that χ1 of Asp, His and Ser generally take up g (~60°), t (~180°) or g+ (~–60°) conformations, whereas χ2 of His and Asp show wide distributions (Janin et al., 1978, and our unpublished data). Thus, while generating different conformations of the model triad, χ1 was varied in steps of 10° in three ranges: (i) 30 to 90° or g, (ii) 150 to –150° or t and (iii) –90 to –30° or g+. χ2 of His was varied in steps of 10° over the complete range –180 to 180°, while χ2 of Asp was varied over the range 0 to 180°, because the twofold symmetry of the Asp side chain makes positions 180° apart in χ2 equivalent.
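The torsion angle grid described in this step can be enumerated as in the sketch below. The names are hypothetical, and one assumption goes beyond the text: the χ2 grids are taken as half-open (–180 to 170° for His, 0 to 170° for Asp) so that the equivalent end points are not sampled twice.

```python
from itertools import product

STEP = 10  # degrees, as stated in the text

def wrap(angle):
    """Map an angle in degrees onto the interval (-180, 180]."""
    wrapped = (angle + 180) % 360 - 180
    return 180 if wrapped == -180 else wrapped

def chi1_values(step=STEP):
    """chi1 grid: 30..90, 150..-150 (passing through 180) and -90..-30."""
    first = list(range(30, 91, step))
    trans = [wrap(a) for a in range(150, 211, step)]
    last = list(range(-90, -29, step))
    return first + trans + last

def chi2_his_values(step=STEP):
    # Assumption: -180 and 180 are the same rotation, so only one is kept.
    return list(range(-180, 180, step))

def chi2_asp_values(step=STEP):
    # Assumption: 0 and 180 are equivalent for the symmetric carboxylate.
    return list(range(0, 180, step))

def conformers(residue):
    """All torsion angle tuples sampled for one residue of the model triad."""
    if residue == "SER":
        return [(c1,) for c1 in chi1_values()]
    if residue == "HIS":
        return list(product(chi1_values(), chi2_his_values()))
    if residue == "ASP":
        return list(product(chi1_values(), chi2_asp_values()))
    raise ValueError("expected SER, HIS or ASP")
```

Under these assumptions the grid holds 21 χ1 values per residue, 756 His conformers, 21 Ser conformers and 378 Asp conformers, so an exhaustive search of His–Ser pairs examines 15 876 combinations per site.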

The third step of the modeling procedure incorporates H-bonds in the model triad. To begin with, rotations are performed for His and Ser, in the above ranges. Each His–Ser pair generated is checked for the His(Nε2)–Ser(Oγ) H-bond. If the N···O distance is in the range 2.5–3.2 Å and the H–N···O angle is ≤40°, the His–Ser pair is regarded as being H-bonded. [Two hydrogens are fixed for the His residue, one on Nδ1 and the other on Nε2. This corresponds to the protonated form of His observed during catalysis (Kossiakoff and Spencer, 1981).] Hence all conformations of the triad in which His and Ser are H-bonded are identified. Taking up each conformation of His in which the His–Ser H-bond has been formed, the Asp side chain is rotated. Conformations, if any, in which His Nδ1 is H-bonded to one of the Asp carboxylate oxygens are identified. Asp and His are regarded as being H-bonded if the N···O distance is in the range 2.5–3.2 Å and the H–N···O angle is ≤40°. In other words, a His–Ser H-bond is first incorporated into the model triad and then the conformation of the Asp side chain is varied, so that an Asp–His H-bond can also be incorporated. Thus a set of conformations of the model triad is obtained in which Asp–His and His–Ser are H-bonded. The stereochemistry of each of these conformations is examined and conformations in which bumps or short contacts (Ramachandran et al., 1963; Ramachandran and Sasisekharan, 1968) occur between non-bonded and non-H-bonded atoms of the triad are rejected.
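The H-bond criterion used in this step (N···O distance of 2.5–3.2 Å and H–N···O angle ≤40°) can be written as a small geometric test. The sketch below is illustrative only: the function name is hypothetical, and the coordinates of the donor nitrogen, its hydrogen and the acceptor oxygen are assumed to be available as (x, y, z) tuples in Å.

```python
import math

def is_h_bonded(n, h, o, d_min=2.5, d_max=3.2, max_angle=40.0):
    """Return True if donor N (with hydrogen H) and acceptor O satisfy
    the distance and angle criteria quoted in the text.

    The angle is measured at N, between the N->H and N->O directions.
    """
    n_to_o = (o[0] - n[0], o[1] - n[1], o[2] - n[2])
    n_to_h = (h[0] - n[0], h[1] - n[1], h[2] - n[2])
    d_no = math.sqrt(sum(c * c for c in n_to_o))
    d_nh = math.sqrt(sum(c * c for c in n_to_h))
    if not d_min <= d_no <= d_max:
        return False
    cos_a = sum(a * b for a, b in zip(n_to_h, n_to_o)) / (d_nh * d_no)
    cos_a = max(-1.0, min(1.0, cos_a))  # guard against rounding error
    return math.degrees(math.acos(cos_a)) <= max_angle
```

The same test would serve for both H-bonds of the triad: His Nε2 to Ser Oγ, and His Nδ1 to either Asp carboxylate oxygen.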

Each H-bonded, stereochemically allowed model triad conformation is superposed on an original protease catalytic triad (the triad of trypsin in the crystal structure 1TPP) and the root-mean-square (r.m.s.) deviation between the model and catalytic triads is calculated. Model triad conformations that superpose on the 1TPP triad with an r.m.s. deviation <2.0 Å may be regarded as well-modeled triads that reproduce satisfactorily the geometry of the protease catalytic triads.
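The final acceptance test can be sketched as below. This sketch covers only the r.m.s. deviation and the 2.0 Å cutoff; it assumes that the model triad has already been optimally superposed on the reference triad by some external fitting step, and that both coordinate lists contain the same atoms in the same order. The function names are hypothetical.

```python
import math

RMSD_CUTOFF = 2.0  # angstroms; cutoff stated in the text

def rmsd(model, reference):
    """R.m.s. deviation between two equally ordered, already superposed
    lists of (x, y, z) coordinates."""
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for a, b in zip(model, reference):
        total += (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2 + (a[2] - b[2]) ** 2
    return math.sqrt(total / len(model))

def is_well_modeled(model, reference, cutoff=RMSD_CUTOFF):
    """True if the r.m.s. deviation is strictly below the cutoff."""
    return rmsd(model, reference) < cutoff
```

Among the conformations that pass, the one with the lowest r.m.s. deviation would be the natural choice to report for a site.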

Assessment of the quality of the model triads

A scoring function has been used to assess the quality of the model triads. The function is based on: (i) the six Vl parameters of the model triads, (ii) the side chain accessibility values of the host protein residues that are to be mutated to Asp, His and Ser and (iii) the substitution potential of the host protein residues, which would decide the ease with which they can be changed to Asp, His and Ser.

The contribution to the scoring function of each of the six Vl parameters is denoted Sl, where l = 1, ..., 6. If a Vl for the model triad lies in its respective protease range (Table I), then Sl = 1. If a Vl lies outside its range, then the extent of deviation is incorporated as a penalty to its Sl value of 1. The contribution of the six Vl parameters of a model triad to the total score of the triad is ΣSl, where l = 1, ..., 6.

The side chain accessibilities for the host protein residues that are to be mutated to Asp, His and Ser were calculated using the program psa (A. Sali, unpublished results). The program reports the total side chain accessibility for each residue as a normalized percentage with respect to the extended conformation of the residue in a Gly–X–Gly peptide. These accessibility values (Ssc) were considered as contributions to the scoring function. The values for the wild-type residues were considered, rather than those for the mutated Asp, His and Ser residues, because the conformation of the latter in the mutant protein is difficult to predict. The contribution of the side chain accessibilities of the three host protein residues to the total score of the model triad is ΣSsc, where sc = 1, ..., 3.

The potential of the chosen host protein residues to be mutated to Asp, His and Ser was also used to score the model triads. The scoring was done using the substitution potential matrix generated by Johnson and Overington (1993) using a structure-based sequence alignment of homologous proteins in the PDB. The normalized matrix was used to assign penalties (Sm) for mutating each host protein residue, X, to Asp, His or Ser. The total penalty for introducing Asp, His and Ser in place of host protein residues is ΣSm, where m = 1, ..., 3.

The complete scoring function may be written as follows:

where l = 1, ..., 6; sc = 1, ..., 3; and m = 1, ..., 3.

Values of the scoring function were computed for the protease and model triads. The highest values were obtained for the protease triads. Model triads with values approaching those of the protease triads may be regarded as favored candidates for an experiment.


    Results
 
Modeling the protease triad into non-proteases

The triad modeling procedure is a knowledge-based computational modeling procedure for introducing the protease Asp–His–Ser triad into non-proteases of known structure. The procedure was used to model the triad into four non-proteases: 1FGV (immunoglobulin), 1HSL (histidine binding protein), 2OLB (oligopeptide binding protein) and 2LTN (pea lectin). In these four proteins, 16, 35, 28 and 24 (Cα–Cβ) triads or sites suitable for modeling the protease triad were identified, respectively. However, H-bonded Asp–His–Ser triads could not be modeled in at all these sites. Triads could be modeled at only two out of 16 (Cα–Cβ) locations in 1FGV, at three out of 35 (Cα–Cβ) locations in 1HSL and at eight and nine locations out of 28 and 24 in 2OLB and 2LTN, respectively. The (Cα–Cβ) locations in the four non-proteases at which H-bonded Asp–His–Ser triads could be successfully modeled in are listed in Table II. The best model triad (showing the lowest r.m.s. deviation with respect to the 1TPP triad) in each of the four non-proteases is shown superposed on the 1TPP triad in Figure 5. The model triads reproduce satisfactorily the geometry of the protease catalytic triad. The Asp–His–Ser triad modeled in at the 2LTN location, IleA52–ValB20–LeuB12 and the one modeled in at the 2OLB location, ValA164–TrpA97–TyrA112, best mimic the geometry of the trypsin triad (they show the lowest r.m.s. deviations with respect to the trypsin triad, 1.3 and 1.2 Å, respectively; Figure 5a and b). Table II also gives the score for each model triad. The scores for the protease catalytic triads lie in the range 8.2–8.9. The scores for the model triads lie in the range 6.3–8.4. Triads with scores ≥7.0 may be considered for mutagenesis experiments.


Table II. List of Asp–His–Ser triads modeled by the triad modeling procedure, in non-proteases
 


Fig. 5. Asp–His–Ser triads modeled into non-proteases. Model triads are shown superposed on the trypsin triad. (a) Triad modeled in 2LTN, at the location A52–B20–B12 (r.m.s. deviation = 1.2 Å). (b) Triad modeled in 2OLB at the location, A164–A97–A112 (r.m.s. deviation with respect to the trypsin triad = 1.3 Å). (c) Triad modeled in 1FGV at L102–L21–L111 (r.m.s. deviation = 1.8 Å). (d) Triad modeled in 1HSL at C239–A70–A52 (r.m.s. deviation = 1.6 Å).

 
The modeling procedure ensures that the model triads are of good stereochemical quality, in several ways. (i) The (Cα–Cβ) triads that are picked as triad modeling sites in a non-protease match the protease (Cα–Cβ) triad in dimensions. Thus, to begin with, the triad modeling sites are of suitable geometry for accommodating and positioning the Asp, His and Ser side chains in the form of a catalytic triad. (ii) The side chain torsion angles of each model triad are allowed to take up only stereochemically allowed conformations, i.e., χ1 of Asp, His and Ser are allowed to occur only in g, t or g+ conformations. (iii) The Asp–His and His–Ser H-bonds, characteristic of the protease catalytic triads, are incorporated into each model triad, thus ensuring that H-bonded Asp–His–Ser triads are obtained after modeling. (iv) Finally, the modeling procedure ensures that no short contacts occur between the three residues of each model triad. Further, the scoring function used to assess the quality of the model triads adds an appropriate penalty to account for possible unfavorable effects caused by mutation of protein residues to Asp, His and Ser. The function also weights the location in the protein at which the model triad is to be introduced, based on whether the side chain accessibility pattern of the host protein residues matches that of the protease catalytic triad residues.


    Discussion
 
The basis for using this modeling procedure as a guide to mutagenesis experiments lies in the assumption that the geometry of the model triad will be reproduced in the non-protease merely by making the required mutations, i.e. once introduced, the three residues will form the triad by making small side chain rotations. Side-chain rotations are recognized to be energetically feasible, as they have been observed in crystal structures of proteins in the form of large B-factors for side chain atoms and distinguishing sets of electron densities for different side chain orientations. Given that small side chain rearrangements occur in proteins, by selecting stereochemically favorable sites for introducing the protease triad, it would seem feasible to introduce the protease triad into non-proteases by site-specific mutagenesis.

Blow (1990) searched the Protein Data Bank for H-bonded Asp–His–Ser triads. One triad each was identified in taka-amylase and immunoglobulin Kol, but the geometry of both triads poorly resembled that of the protease triad and neither showed catalytic function. In a similar search performed by us (unpublished data), we found that when H-bonded Asp–His–Ser triads occur in proteins, they usually perform a catalytic function, as in the serine proteases and in the triglyceride lipase from Rhizomucor miehei (Brady et al., 1990; Winkler et al., 1990; Brzozowski et al., 1991). The formation of an H-bonded, protease type Asp–His–Ser triad appears to result in an enzymological motif that performs a catalytic function. Introducing the triad motif into non-proteases, therefore, may be expected to introduce catalytic function in the latter. In the present study, rather than search for and model imperfectly formed H-bonded Asp–His–Ser motifs occurring in non-proteases, we use the protease (Cα–Cβ) triad as a search motif to identify existing locations in proteins that might be stereochemically optimal for introducing the protease triad. If, at such sites, it is possible to model an H-bonded, serine protease type Asp–His–Ser triad, then the non-protease site is regarded as being suitable for a mutagenesis experiment.

If Asp–His–Ser triads have been successfully modeled in at several locations in a non-protease, then various considerations can be used to select one of these sites for a mutagenesis experiment. If Asp, His or Ser already occur in the non-protease at a model triad location, then such a location would be favored, because the number of mutations needed to complete the triad would be either one or two. Thus the triad location in 1HSL, formed by HisC39–SerA70–LeuA52 (Figure 5d, Table II) would be preferred for a mutagenesis experiment because His from the protein would readily form a part of a catalytic triad. Triad locations that suggest conserved mutations (e.g. Glu→Asp; Lys→His; Thr→Ser) are favorable for an experiment, because they perturb the electrostatic microenvironment in the protein minimally. Finally, triad locations that suggest isosteric mutations (e.g. Met or Gln→His; Cys, Thr or Asn→Asp; Ala→Ser) may also be used for an experiment (Mian et al., 1991).

The non-proteases that have been used in the present study as hosts for the model triad include an antibody and three binding proteins. The combining sites of antibodies are attractive choices as hosts for the rational introduction of the serine protease triad, because the problem of engineering a substrate binding site is solved by the immune system and only an adjoining catalytic triad needs to be introduced. Catalytic antibodies raised to catalyze the hydrolysis of ester or amide substrates are especially attractive targets for such experiments, because by introducing the catalytic triad, the catalytic efficiency of the antibody can be enhanced. Indeed, in one experiment, the efficiency of a catalytic antibody for ester hydrolysis was improved by a factor of 45 by the introduction of a catalytic imidazole (His) at the combining site (Janda et al., 1988; Baldwin and Schultz, 1989). Conceptually, binding proteins (such as the histidine, oligopeptide and carbohydrate binding proteins included in the present study) are also attractive targets for introducing the protease triad. The binding site of each of these proteins may be regarded as a potential active site. By suitably introducing an Asp–His–Ser triad, perhaps at the center of the binding site, catalytic activity may be introduced into the binding protein. In nature, at least, experiments of this kind have been successfully performed. The structure of the human rhinovirus 3C-protease, for example, shows that its overall 3D-polypeptide fold is similar to that of the trypsin-like serine proteases. However, the catalytic triad of the enzyme is Glu–His–Cys, which makes the enzyme a cysteine protease (Matthews et al., 1994).

In more recent times, Quéméneur et al. (1998) have successfully engineered a catalytic Asp, His, Ser triad (N106D, F104H, A91S) close to the peptide binding cleft of cyclophilin. The mutant enzyme hydrolyzes X–Pro peptide bonds. Their study definitively shows that it is possible to engineer rationally the catalytic triad and proteolytic activity into the binding sites of non-proteases. Interestingly, the side chain accessibility values for the engineered cyclophilin triad suggest that the three residues, N101D, F99H and A86S, are buried (the residue numbers are as indicated in the PDB coordinate set, 1LOP; side chain accessibility values are 0, 0 and 0.031, respectively). Thus buried residues close to the cyclophilin binding cleft appear to have formed an active triad. Either a buried triad is functional in the mutant cyclophilin or structural changes caused by the mutations might have resulted in the formation of a suitably solvent exposed triad. This example suggests that triad modeling sites formed by apolar residues (cf. Table II) may also be considered for experiments: (i) because not all of the apolar residues are buried and those that are buried may readjust due to structural changes resulting from mutation and (ii) because the resulting catalytic triad, although buried, may be active if the residues are suitably positioned within the protein binding site.

Using the protease (Cα–Cβ) triad as a search motif ensures that the non-protease (Cα–Cβ) sites selected are of suitable geometry for accommodating the Asp, His and Ser side chains in a triad-like conformation and that mutations at these non-protease sites perturb the local geometry of the protein minimally. However, if distortions do occur, it is expected that they will disappear as a result of overall relaxation of the protein. Finally, although the modeling procedure has been developed based on an analysis of the geometry of a set of trypsin family triads, it can be readily extended so as to model triads based on the geometry of subtilisin family triads. Catalytic triads of the two families are in general very similar in geometry.

Conclusions

The Asp–His–Ser triad may be regarded as an independent catalytic motif that has, in nature, been linked to different binding sites (as, for example, in the serine proteases and the lipases) to perform different hydrolytic functions. By using rational, computer-aided design and mutagenesis, an attempt can be made to introduce the protease triad into non-protease scaffolds. This would not only facilitate the design of novel catalysts, but would also provide insight into the structural determinants of enzyme function. A modeling procedure has been developed that permits the identification of locations in non-proteases that are stereochemically suitable for introducing the protease triad. The procedure has been used to identify such locations in three binding proteins and an immunoglobulin. The candidate locations have also been scored using a scoring function. Out of several sites provided by the modeling procedure, the site for mutagenesis can be chosen by considering (a) the score for the site and (b) the steric and chemical properties of the amino acids that have to be replaced. Immunoglobulins and binding proteins are suitable choices for an experiment because their binding sites can be exploited to design proteins with binding and catalytic functions.


    Acknowledgments
 
The authors thank Professor P.Balaram and Dr N.Srinivasan for helpful discussions during various stages of this work.


    Notes
 
2 To whom correspondence should be addressed. E-mail: ramki@crmbu2.mbu.iisc.ernet.in


Received September 15, 1997; revised March 9, 1999; accepted May 18, 1999.