Molecular Biophysics Unit, Indian Institute of Science, Bangalore-560012, India 1 Present address: Astra Biochemicals Pvt. Ltd, PB No. 359, 18th Cross, Malleswaram, Bangalore-560003, India
Abstract |
---|
Keywords: catalytic triad/knowledge-based modeling/serine protease/triad modeling procedure
Introduction |
---|
The combining sites of antibodies can be used as hosts for the Asp–His–Ser triad. Antibodies have been raised to analogs of the transition state intermediate in amide hydrolysis and have been shown to enhance the rate of hydrolysis of amide substrates considerably over the uncatalyzed reaction (Janda et al., 1988; Lerner et al., 1991). These catalytic antibodies, like serine proteases, are believed to process substrates via an acyl-antibody intermediate, perhaps involving a Ser residue. Baldwin and Schulz (1989) have introduced a catalytic imidazole at the combining site of one such catalytic antibody for ester hydrolysis. Their Y34H mutant antibody hydrolyzed ester substrates 45 times faster than the wild-type antibody. Thus, novel catalysts may be designed by introducing the protease triad into the combining sites of antibodies.
This paper describes a computer modeling procedure for incorporating a trypsin-type Asp–His–Ser triad into non-proteases of known structure. The procedure selects sites in non-proteases that are geometrically suitable for introducing the protease triad and models the triad at these locations. This study was motivated by the possibility that protein engineering approaches, guided by computer-aided rational design, can be used to incorporate the protease triad into non-protease scaffolds, thus facilitating the design of novel catalysts. Previously, metal binding sites have been engineered into proteins, based on computer modeling methods (Higaki et al., 1990; Iverson et al., 1990). A computational procedure for modeling ligand binding sites into proteins has also been developed (Hellinga and Richards, 1991) and, guided by this procedure, a copper binding site has been successfully grafted into thioredoxin (Hellinga et al., 1991). More recently, a catalytic Asp, His, Ser triad has been successfully grafted into cyclophilin, a cis–trans prolyl isomerase without hydrolytic activity (Quéméneur et al., 1998). The mutant cyclophilin catalyzes the hydrolysis of X–Pro bonds. The study clearly demonstrates that (i) the protease Asp–His–Ser triad may be regarded as an independent catalytic motif and (ii) the triad can be successfully grafted into non-protease scaffolds to introduce hydrolytic activity.
Materials and methods |
---|
Analysis of the geometry of protease Asp–His–Ser triads
The geometry of protease Asp–His–Ser triads was analyzed using catalytic triads taken from a set of nine crystal structures of trypsin family proteases. The structures were obtained from the Brookhaven Protein Data Bank (PDB) (Bernstein et al., 1977) and their PDB codes are: 2ALP, 1SGT, 1TGS, 5CHA, 2SGA, 1PPF, 3EST, 3RP2 and 1TPP. The catalytic triad in each structure is formed by the residues His57, Asp102 and Ser195.
Geometric features of the catalytic triads that could be used to develop a triad modeling procedure were analyzed. The (Cα–Cβ) triads, formed by the three Cα- and three Cβ-atoms of each catalytic triad, were analyzed by computing six virtual bond length parameters (Vl) and an interplanar angle parameter. The six length parameters (Vl) define the sides of two triangles, one formed by the three Cα-atoms and the other by the three Cβ-atoms, and the interplanar angle is the angle between the two triangles (Figure 1, Table I). The virtual bond lengths defined between pairs of Cβ-atoms are smaller than those defined between the corresponding pairs of Cα-atoms (Table I), an indication that the Asp, His and Ser side chains point towards each other in the catalytic triads. Further, the three Cβ-atoms of each catalytic triad lie on the same side of the plane formed by the three Cα-atoms and, as a result, the interplanar angle is small (Figure 2 and Table I). The narrow ranges and small standard deviations for the Vl and interplanar angle parameters in Table I suggest that the geometry of the protease (Cα–Cβ) triads is faithfully preserved in catalytic triads of different trypsin family proteases.
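The seven geometric parameters described above can be computed directly from atomic coordinates. The following is a minimal sketch, not the authors' program: the function names are hypothetical, the coordinates in the usage note are invented for illustration, and the interplanar angle is assumed to be the angle between the normals of the Cα and Cβ triangles, with the vertices taken in the same Asp, His, Ser order in both.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def triangle_sides(asp, his, ser):
    """Three virtual bond lengths: Asp-His, His-Ser and Ser-Asp."""
    return (math.dist(asp, his), math.dist(his, ser), math.dist(ser, asp))


def interplanar_angle(ca, cb):
    """Angle (degrees) between the C-alpha and C-beta triangle planes.

    Assumption: the angle is measured between the two plane normals,
    with both triangles listed in the same (Asp, His, Ser) order.
    """
    n_a = _cross(_sub(ca[1], ca[0]), _sub(ca[2], ca[0]))
    n_b = _cross(_sub(cb[1], cb[0]), _sub(cb[2], cb[0]))
    cos_t = _dot(n_a, n_b) / math.sqrt(_dot(n_a, n_a) * _dot(n_b, n_b))
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding error
    return math.degrees(math.acos(cos_t))


def triad_parameters(ca, cb):
    """Return (six virtual bond lengths, interplanar angle) for one triad."""
    return triangle_sides(*ca) + triangle_sides(*cb), interplanar_angle(ca, cb)
```

For example, with the invented coordinates `ca = [(0, 0, 0), (6, 0, 0), (0, 8, 0)]` and `cb = [(1, 1, 1), (5, 1, 1), (1, 6, 1)]`, the two triangles are parallel, so `triad_parameters(ca, cb)` gives an interplanar angle of zero and Cβ sides that are shorter than the corresponding Cα sides.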
The protease Asp–His–Ser triads are also characterized by side chain H-bonds: His Nδ1 donates a proton and forms a strong H-bond with one of the carboxylate oxygens of Asp, and His Nε2 and Ser Oγ are within H-bonding distance of each other (Wright, 1972; Matthews et al., 1977; Cohen et al., 1981) (Figure 3). The occurrence or otherwise of the latter H-bond has been a matter of controversy, but, if present, it is expected to be of the N···H–O type (Matthews et al., 1977; Tsukada and Blow, 1985). This H-bond has been shown to be long in the non-liganded enzyme structures, but short (indicating strong interaction) in the liganded complexes (Marquart et al., 1983). During catalysis, however, His is protonated at Nε2 by the transfer of the Ser hydroxyl proton (Kossiakoff and Spencer, 1981). The triad modeling procedure described here ensures that the Asp–His and His–Ser side chain H-bonds are incorporated into the Asp–His–Ser triads modeled into non-proteases.
A computer modeling procedure was developed to model the protease Asp–His–Ser triad into non-proteases. This procedure was developed along the same lines as the disulfide modeling procedure (MODIP), which introduces S–S bridges into proteins (Sowdhamini et al., 1989). The triad modeling procedure first selects sites in non-proteases that are suitable for introducing the protease triad. Asp, His and Ser residues are then modeled at these locations, to reproduce the geometry of the protease catalytic triads. The modeling procedure consists of four steps, as shown in Figure 4 and described below.
In the second step of the modeling procedure, Asp, His and Ser side chains are generated at each non-protease (Cα–Cβ) site identified above, in place of the side chains of the non-protease residues. The bond length and bond angle parameters used for generating the Asp, His and Ser side chains were calculated as mean values from the set of protease catalytic triads. The procedure then models each generated Asp–His–Ser triad by side chain rotations and examines whether conformations are possible in which the Asp–His and His–Ser side chains are H-bonded, as in the protease catalytic triads. Different conformations of the model triad are generated by side chain rotations about five torsion angles: two of Asp, two of His and one of Ser (Figure 3). Previous analyses of amino acid side chain conformations in proteins (Janin et al., 1978) have shown that χ1 of Asp, His and Ser generally takes up g− (~60°), t (~180°) or g+ (~−60°) conformations, whereas χ2 of His and Asp show wide distributions (Janin et al., 1978, and our unpublished data). Thus, while generating different conformations of the model triad, χ1 was varied in steps of 10° in three ranges: (i) 30 to 90° or g−, (ii) 150 to −150° or t and (iii) −90 to −30° or g+. χ2 of His was varied in steps of 10° over the complete range −180 to 180°, while χ2 of Asp was varied over the range 0 to 180°, because the twofold symmetry of the Asp side chain makes positions 180° apart in χ2 equivalent.
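The torsion-angle sampling described above can be illustrated with a short sketch. It is not the authors' code: the names are hypothetical, and it assumes that the t range runs from 150° through 180° to −150° and that the duplicate endpoints (180° versus −180° for His χ2, and 180° versus 0° for Asp χ2) are sampled only once.

```python
from itertools import product

STEP = 10  # sampling interval in degrees, as stated in the text


def angle_range(start, stop, step=STEP):
    """Inclusive list of integer angles from start to stop in degrees."""
    return list(range(start, stop + 1, step))


def wrap(angle):
    """Map an angle in degrees onto the interval (-180, 180]."""
    wrapped = (angle + 180) % 360 - 180
    return 180 if wrapped == -180 else wrapped


# chi1 of Asp, His and Ser: the three rotamer wells sampled in the text.
CHI1 = (angle_range(30, 90)                          # g-
        + [wrap(a) for a in angle_range(150, 210)]   # t: 150 .. 180 .. -150
        + angle_range(-90, -30))                     # g+

# chi2 of His: full circle; 180 is omitted because it duplicates -180 (assumption).
CHI2_HIS = angle_range(-180, 170)

# chi2 of Asp: half circle; 180 is omitted as equivalent to 0 by symmetry (assumption).
CHI2_ASP = angle_range(0, 170)


def his_ser_grid():
    """All (His chi1, His chi2, Ser chi1) combinations."""
    return product(CHI1, CHI2_HIS, CHI1)


def asp_grid():
    """All (Asp chi1, Asp chi2) combinations."""
    return product(CHI1, CHI2_ASP)
```

Keeping the His–Ser grid separate from the Asp grid means the Asp side chain needs to be rotated only for those His conformations that already satisfy the His–Ser H-bond, rather than for the full product of all five torsion angles.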
The third step of the modeling procedure incorporates H-bonds in the model triad. To begin with, rotations are performed for His and Ser, in the above ranges. Each His–Ser pair generated is checked for the His(Nε2)–Ser(Oγ) H-bond. If the N···O distance is in the range 2.5–3.2 Å and the H–N···O angle is ≤40°, the His–Ser pair is regarded as being H-bonded. [Two hydrogens are fixed for the His residue, one on Nδ1 and the other on Nε2. This corresponds to the protonated form of His observed during catalysis (Kossiakoff and Spencer, 1981).] Hence all conformations of the triad in which His and Ser are H-bonded are identified. For each conformation of His in which the His–Ser H-bond has been formed, the Asp side chain is rotated. Conformations, if any, in which His Nδ1 is H-bonded to one of the Asp carboxylate oxygens are identified. Asp and His are regarded as being H-bonded if the N···O distance is in the range 2.5–3.2 Å and the H–N···O angle is ≤40°. In other words, a His–Ser H-bond is first incorporated into the model triad and then the conformation of the Asp side chain is varied, so that an Asp–His H-bond can also be incorporated. Thus a set of conformations of the model triad is obtained in which Asp–His and His–Ser are H-bonded. The stereochemistry of each of these conformations is examined and conformations in which bumps or short contacts (Ramachandran et al., 1963; Ramachandran and Sasisekharan, 1968) occur between non-bonded and non-H-bonded atoms of the triad are rejected.
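The H-bond criterion used in this step (N···O distance of 2.5–3.2 Å and H–N···O angle of at most 40°) can be written as a small geometric test. This is an illustrative sketch with hypothetical names, not the authors' implementation. It assumes that the H–N···O angle is the angle at the nitrogen between the N–H and N···O directions, and that both limits are inclusive.

```python
import math


def _angle_deg(u, v):
    """Angle in degrees between two 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos_t = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_t))


def is_hbonded(n, h, o, d_min=2.5, d_max=3.2, max_angle=40.0):
    """Return True if donor N (with hydrogen H) is H-bonded to acceptor O.

    n, h, o are (x, y, z) coordinates in angstroms. The distance window
    and angle cutoff follow the text; treating them as inclusive bounds
    is an assumption of this sketch.
    """
    if not d_min <= math.dist(n, o) <= d_max:
        return False
    n_to_h = tuple(hc - nc for hc, nc in zip(h, n))
    n_to_o = tuple(oc - nc for oc, nc in zip(o, n))
    return _angle_deg(n_to_h, n_to_o) <= max_angle
```

The same function would serve for both checks: His Nε2 with Ser Oγ, and His Nδ1 with either Asp carboxylate oxygen.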
Each H-bonded, stereochemically allowed model triad conformation is superposed on an original protease catalytic triad, that of trypsin in the crystal structure 1TPP, and the root-mean-square (r.m.s.) deviation between the model and catalytic triads is calculated. Model triad conformations that superpose on the 1TPP triad with an r.m.s. deviation <2.0 Å may be regarded as well-modeled triads that satisfactorily reproduce the geometry of the protease catalytic triads.
Assessment of the quality of the model triads
A scoring function has been used to assess the quality of the model triads. The function is based on: (i) the six Vl parameters of the model triads, (ii) the side chain accessibility values of the host protein residues that are to be mutated to Asp, His and Ser and (iii) the substitution potential of the host protein residues, which would decide the ease with which they can be changed to Asp, His and Ser.
The contribution to the scoring function of each of the six Vl parameters is denoted Sl, where l = 1, …, 6. If a Vl for the model triad lies in its respective protease range (Table I), then Sl = 1. If a Vl lies outside its range, then the extent of deviation is incorporated as a penalty to its Sl value of 1. The contribution of the six Vl parameters of a model triad to the total score of the triad is ΣSl, where l = 1, …, 6.
The side chain accessibilities for the host protein residues that are to be mutated to Asp, His and Ser were calculated using the program psa (A. Sali, unpublished results). The program reports the total side chain accessibility for each residue as a normalized percentage with respect to the extended conformation of the residue in a Gly–X–Gly peptide. These accessibility values (Ssc) were considered as contributions to the scoring function. The values for the wild-type residues were considered, rather than those for the mutated Asp, His and Ser residues, because the conformation of the latter in the mutant protein is difficult to predict. The contribution of the side chain accessibilities of the three host protein residues to the total score of the model triad is ΣSsc, where sc = 1, 2, 3.
The potential of the chosen host protein residues to be mutated to Asp, His and Ser was also used to score the model triads. The scoring was done using the substitution potential matrix generated by Johnson and Overington (1993) using a structure-based sequence alignment of homologous proteins in the PDB. The normalized matrix was used to assign penalties (Sm) for mutating each host protein residue, X, to Asp, His or Ser. The total penalty for introducing Asp, His and Ser in place of host protein residues is ΣSm, where m = 1, 2, 3.
The complete scoring function combines the three contributions defined above, ΣSl, ΣSsc and ΣSm (equation not reproduced).
Values of the scoring function were computed for the protease and model triads. The highest values were obtained for the protease triads. Model triads with values approaching those of the protease triads may be regarded as favored candidates for an experiment.
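To make the structure of such a score concrete, the sketch below combines the three contributions described above. It is a hypothetical illustration only. The exact published form of the function is not available here, so the code assumes (i) a linear penalty equal to the distance in Å by which a Vl falls outside its range, (ii) accessibilities expressed as fractions and (iii) a total of ΣSl + ΣSsc − ΣSm. All names and numerical values are invented.

```python
def length_score(value, low, high):
    """S_l for one virtual bond length.

    Returns 1.0 inside [low, high]. Outside the range, the deviation
    (in angstroms) is subtracted from 1.0; this linear penalty is an
    assumption of the sketch, not the published definition.
    """
    if low <= value <= high:
        return 1.0
    deviation = low - value if value < low else value - high
    return 1.0 - deviation


def triad_score(lengths, ranges, accessibilities, penalties):
    """Assumed total score: sum(S_l) + sum(S_sc) - sum(S_m).

    lengths         -- six virtual bond lengths of the model triad
    ranges          -- six (low, high) protease ranges, one per length
    accessibilities -- three side chain accessibilities (as fractions)
    penalties       -- three substitution penalties for X -> Asp/His/Ser
    """
    if len(lengths) != 6 or len(ranges) != 6:
        raise ValueError("expected six lengths and six ranges")
    if len(accessibilities) != 3 or len(penalties) != 3:
        raise ValueError("expected three accessibilities and three penalties")
    s_l = sum(length_score(v, lo, hi) for v, (lo, hi) in zip(lengths, ranges))
    return s_l + sum(accessibilities) - sum(penalties)
```

With invented inputs, for example six identical ranges of (5.0, 6.0) Å, five lengths of 5.5 Å and one of 6.5 Å, accessibilities of 0.2, 0.3 and 0.1 and penalties of 0.1, 0.0 and 0.2, the length term loses 0.5 for the single out-of-range value before the other two terms are applied.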
Results |
---|
The triad modeling procedure is a knowledge-based computational modeling procedure for introducing the protease Asp–His–Ser triad into non-proteases of known structure. The procedure was used to model the triad into four non-proteases: 1FGV (immunoglobulin), 1HSL (histidine binding protein), 2OLB (oligopeptide binding protein) and 2LTN (pea lectin). In these four proteins, 16, 35, 28 and 24 (Cα–Cβ) triads or sites suitable for modeling the protease triad were identified, respectively. However, H-bonded Asp–His–Ser triads could not be modeled at all of these sites. Triads could be modeled at only two out of 16 (Cα–Cβ) locations in 1FGV, at three out of 35 (Cα–Cβ) locations in 1HSL and at eight and nine locations out of 28 and 24 in 2OLB and 2LTN, respectively. The (Cα–Cβ) locations in the four non-proteases at which H-bonded Asp–His–Ser triads could be successfully modeled are listed in Table II. The best model triad (showing the lowest r.m.s. deviation with respect to the 1TPP triad) in each of the four non-proteases is shown superposed on the 1TPP triad in Figure 5. The model triads satisfactorily reproduce the geometry of the protease catalytic triad. The Asp–His–Ser triad modeled at the 2LTN location, IleA52–ValB20–LeuB12, and the one modeled at the 2OLB location, ValA164–TrpA97–TyrA112, best mimic the geometry of the trypsin triad (they show the lowest r.m.s. deviations with respect to the trypsin triad, 1.3 and 1.2 Å, respectively; Figure 5a and b). Table II also gives the score for each model triad. The scores for the protease catalytic triads lie in the range 8.2–8.9. The scores for the model triads lie in the range 6.3–8.4. Triads with scores ≥7.0 may be considered for mutagenesis experiments.
Discussion |
---|
Blow (1990) searched the Protein Data Bank for H-bonded Asp–His–Ser triads. A triad each was identified in taka-amylase and immunoglobulin Kol, but the geometry of both triads poorly resembled that of the protease triad and neither showed catalytic function. In a similar search performed by us (unpublished data), we found that when H-bonded Asp–His–Ser triads occur in proteins, they usually perform a catalytic function, as in the serine proteases and in the triglyceride lipase from Rhizomucor miehei (Brady et al., 1990; Winkler et al., 1990; Brzozowski et al., 1991). The formation of an H-bonded, protease type Asp–His–Ser triad appears to result in an enzymological motif that performs a catalytic function. Introducing the triad motif into non-proteases, therefore, may be expected to introduce catalytic function in the latter. In the present study, rather than search for and model imperfectly formed H-bonded Asp–His–Ser motifs occurring in non-proteases, we use the protease (Cα–Cβ) triad as a search motif to identify existing locations in proteins that might be stereochemically optimal for introducing the protease triad. If, at such sites, it is possible to model an H-bonded, serine protease type Asp–His–Ser triad, then the non-protease site is regarded as being suitable for a mutagenesis experiment.
If Asp–His–Ser triads have been successfully modeled at several locations in a non-protease, various considerations can be used to select one of these sites for a mutagenesis experiment. If Asp, His or Ser already occurs in the non-protease at a model triad location, then such a location would be favored, because the number of mutations needed to complete the triad would be either one or two. Thus the triad location in 1HSL, formed by HisC39–SerA70–LeuA52 (Figure 5a, Table II), would be preferred for a mutagenesis experiment because His from the protein would readily form a part of a catalytic triad. Triad locations that suggest conservative mutations (e.g. Glu→Asp; Lys→His; Thr→Ser) are favorable for an experiment, because they perturb the electrostatic microenvironment in the protein minimally. Finally, triad locations that suggest isosteric mutations (e.g. Met or Gln→His; Cys, Thr or Asn→Asp; Ala→Ser) may also be used for an experiment (Mian et al., 1991).
The non-proteases that have been used in the present study as hosts for the model triad include an antibody and three binding proteins. The combining sites of antibodies are attractive choices as hosts for the rational introduction of the serine protease triad, because the problem of engineering a substrate binding site is solved by the immune system and only an adjoining catalytic triad needs to be introduced. Catalytic antibodies raised to catalyze the hydrolysis of ester or amide substrates are especially attractive targets for such experiments, because by introducing the catalytic triad, the catalytic efficiency of the antibody can be enhanced. Indeed, in one experiment, the efficiency of a catalytic antibody for ester hydrolysis was improved by a factor of 45 by the introduction of a catalytic imidazole (His) at the combining site (Janda et al., 1988; Baldwin and Schulz, 1989). Conceptually, binding proteins (such as the histidine, oligopeptide and carbohydrate binding proteins included in the present study) are also attractive targets for introducing the protease triad. The binding site of each of these proteins may be regarded as a potential active site. By suitably introducing an Asp–His–Ser triad, perhaps at the center of the binding site, catalytic activity may be introduced into the binding protein. In nature, at least, experiments of this kind have been successfully performed. The structure of the human rhinovirus 3C-protease, for example, shows that its overall 3D-polypeptide fold is similar to that of the trypsin-like serine proteases. However, the catalytic triad of the enzyme is Glu–His–Cys, which makes the enzyme a cysteine protease (Matthews et al., 1994).
In more recent times, Quéméneur et al. (1998) have successfully engineered a catalytic Asp, His, Ser triad (N106D, F104H, A91S) close to the peptide binding cleft of cyclophilin. The mutant enzyme hydrolyzes X–Pro peptide bonds. Their study definitively shows that it is possible to engineer rationally the catalytic triad and proteolytic activity into the binding sites of non-proteases. Interestingly, the side chain accessibility values for the engineered cyclophilin triad suggest that the three residues, N101D, F99H and A86S, are buried (the residue numbers are as indicated in the PDB coordinate set, 1LOP; side chain accessibility values are 0, 0 and 0.031, respectively). Thus buried residues close to the cyclophilin binding cleft appear to have formed an active triad. Either a buried triad is functional in the mutant cyclophilin or structural changes caused by the mutations might have resulted in the formation of a suitably solvent exposed triad. This example suggests that triad modeling sites formed by apolar residues (cf. Table II) may also be considered for experiments: (i) because not all of the apolar residues are buried and those that are buried may readjust due to structural changes resulting from mutation and (ii) because the resulting catalytic triad, although buried, may be active if the residues are suitably positioned within the protein binding site.
Using the protease (Cα–Cβ) triad as a search motif ensures that the non-protease (Cα–Cβ) sites selected are of suitable geometry for accommodating the Asp, His and Ser side chains in a triad-like conformation and that mutations at these non-protease sites perturb the local geometry of the protein minimally. However, if distortions do occur, it is expected that they will disappear as a result of overall relaxation of the protein. Finally, although the modeling procedure has been developed based on an analysis of the geometry of a set of trypsin family triads, it can be readily extended so as to model triads based on the geometry of subtilisin family triads. Catalytic triads of the two families are in general very similar in geometry.
Conclusions
The Asp–His–Ser triad may be regarded as an independent catalytic motif that has, in nature, been linked to different binding sites (as, for example, in the serine proteases and the lipases) to perform different hydrolytic functions. By using rational, computer-aided design and mutagenesis, an attempt can be made to introduce the protease triad into non-protease scaffolds. This would not only facilitate the design of novel catalysts, but would also provide insight into the structural determinants of enzyme function. A modeling procedure has been developed that permits the identification of locations in non-proteases that are stereochemically suitable for introducing the protease triad. The procedure has been used to identify such locations in three binding proteins and an immunoglobulin. The candidate locations have also been scored using a scoring function. Out of several sites provided by the modeling procedure, the site for mutagenesis can be chosen by considering (a) the score for the site and (b) the steric and chemical properties of the amino acids that have to be replaced. Immunoglobulins and binding proteins are suitable choices for an experiment because their binding sites can be exploited to design proteins with binding and catalytic functions.
Acknowledgments |
---|
Notes |
---|
References |
---|
Bernstein,F.C., Koetzle,T.F., Williams,G.J.B., Meyer,E.F.,Jr, Brice,M.D., Rodgers,J.R., Kennard,O., Shimanouchi,T. and Tasumi,M. (1977) J. Mol. Biol., 112, 535–542.
Blow,D.M. (1976) Acc. Chem. Res., 9, 145–152.
Blow,D. (1990) Nature, 343, 694–695.
Blow,D. (1991) Nature, 351, 444–445.
Blow,D.M., Birktoft,J.J. and Hartley,B.S. (1969) Nature, 221, 337–340.
Brady,L. et al. (1990) Nature, 343, 767–770.
Brzozowski,A.M. et al. (1991) Nature, 351, 491–494.
Carter,P. and Wells,J.A. (1988) Nature, 332, 564–568.
Cohen,G.H., Silverton,E.W. and Davies,D.R. (1981) J. Mol. Biol., 148, 449–479.
Corey,D.R. and Craik,C.S. (1992) J. Am. Chem. Soc., 114, 1784–1790.
Craik,C.S., Roczniak,S., Largman,C. and Rutter,W.J. (1987) Science, 237, 909–913.
Fibla,J., Atrian,S. and Gonzalez-Duarte,R. (1993) Eur. J. Biochem., 211, 357–365.
Gao,Q.-S., Sun,M., Rees,A.R. and Paul,S. (1995) J. Mol. Biol., 253, 658–664.
Hellinga,H.W. and Richards,F.M. (1991) J. Mol. Biol., 222, 763–785.
Hellinga,H.W., Caradonna,J.P. and Richards,F.M. (1991) J. Mol. Biol., 222, 787–803.
Higaki,J.N., Haymore,B.L., Chen,S., Fletterick,R.J. and Craik,C.S. (1990) Biochemistry, 29, 8582–8586.
Iverson,B.L., Iverson,S.A., Roberts,V.A., Getzoff,E.D., Tainer,J.A., Benkovic,S.J. and Lerner,R.A. (1990) Science, 249, 659–662.
Janda,K.D., Schloeder,D., Benkovic,S.J. and Lerner,R.A. (1988) Science, 241, 1188–1191.
Janin,J., Wodak,S., Levitt,M. and Maigret,B. (1978) J. Mol. Biol., 125, 357–386.
Johnson,M.S. and Overington,J.P. (1993) J. Mol. Biol., 233, 716–738.
Kossiakoff,A.A. and Spencer,S.A. (1981) Biochemistry, 20, 6462–6474.
Kraut,J., Robertus,J.D., Birktoft,J.J., Alden,R.A., Wilcox,P.E. and Powers,J.C. (1972) Cold Spring Harbor Symp. Quant. Biol., 36, 117–123.
Lerner,R.A., Benkovic,S.J. and Schultz,P.G. (1991) Science, 252, 659–667.
Marquart,M., Walter,J., Deisenhofer,J., Bode,W. and Huber,R. (1983) Acta Crystallogr., Sect. B, 39, 480–490.
Matthews,D.A., Alden,R.A., Birktoft,J.J., Freer,S.T. and Kraut,J. (1977) J. Biol. Chem., 252, 8875–8883.
Matthews,D.A. et al. (1994) Cell, 77, 761–771.
Mian,I.S., Bradwell,A.R. and Olson,A.J. (1991) J. Mol. Biol., 217, 133–151.
Paul,S., Volle,D.J., Beach,C.M., Johnson,D.R., Powell,M.J. and Massey,R.J. (1989) Science, 244, 1158–1162.
Quéméneur,E., Moutiez,M., Charbonnier,J.-B. and Ménez,A. (1998) Nature, 391, 301–304.
Ramachandran,G.N. and Sasisekharan,V. (1968) Adv. Protein Chem., 23, 283–437.
Ramachandran,G.N., Ramakrishnan,C. and Sasisekharan,V. (1963) J. Mol. Biol., 7, 95–99.
Sowdhamini,R., Srinivasan,N., Schoichet,B., Santi,D.V., Ramakrishnan,C. and Balaram,P. (1989) Protein Engng, 3, 95–103.
Sussman,J.L., Harel,M., Frolow,F., Oefner,C., Goldman,A., Toker,L. and Silman,I. (1991) Science, 253, 872–879.
Tsukada,H. and Blow,D.M. (1985) J. Mol. Biol., 184, 703–711.
Winkler,F.K., D'Arcy,A. and Hunziker,W. (1990) Nature, 343, 771–774.
Wright,C.S. (1972) J. Mol. Biol., 67, 151–163.
Wright,C.S., Alden,R.A. and Kraut,J. (1969) Nature, 221, 235–242.
Received September 15, 1997; revised March 9, 1999; accepted May 18, 1999.