Biomolecular Modelling Laboratory, Imperial Cancer Research Fund, 44 Lincoln's Inn Fields, London, WC2A 3PX, UK
Abstract
Keywords: protein conformational change/protein–protein docking
Introduction
This paper considers only heteroprotein complexes formed from folded proteins that are stable in isolation. A recent analysis of this system (Jones and Thornton, 1996) considered 10 enzyme–inhibitor, six antibody–antigen and five other types of complexes. There tends to be a uniformity in the static features of the complex interface despite a variety of shapes. The interface, which rarely has cavities, buries 980 ± 580 Å² of accessible surface area, with 1.13 ± 0.47 hydrogen bonds per 100 Å² of buried accessible surface area. Janin and Chothia (1990) characterized the interface as formed from around 55% non-polar, 25% polar and 20% charged residues.
When the structures of a complex and of its components in isolation have been determined, the workers report the conformational change on association (e.g. Hecht et al., 1991, 1992; Bhat et al., 1994; Chantalat et al., 1995). On the limited data sets available at the time, Huber (1979), Janin and Wodak (1983) and Bennett and Huber (1984) described general features of conformational changes in proteins. More recently, Stanfield and Wilson (1994) have reviewed conformational changes in antibody–antigen association, and in a series of papers by Lesk and Chothia (1988), Gerstein and Chothia (1991) and Gerstein et al. (1994), the nature of domain movements in proteins has been analysed. However, these studies are dominated by the conformational change induced by small molecules binding to proteins. Our aim is to quantify the extent of conformational changes in a single type of recognition: the formation of heteroprotein complexes.
The extent of conformational change on protein–protein association has implications for the development of algorithms to dock proteins starting from the coordinates of the unbound components. In general, the docking algorithms (for reviews see Janin, 1995; Shoichet and Kuntz, 1996; Sternberg et al., 1998) employ the rigid-body approximation and initially search for favourable associations of the unbound components. The conformational change on association is treated as a subsequent refinement step. The results of our analysis will provide a framework to guide the application and the development of these algorithms.
Materials and methods
Residues identified in the relevant paper or PDB file as having poor electron density were excluded from calculations of conformational change, as were those residues containing one or more atoms with B-factor greater than or equal to 50 Å². The conformation of these residues is expected to differ more than that of others because of uncertainty in their position, or high mobility.
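The exclusion rule above can be sketched as a simple filter. This is a minimal illustration and not the program used for this work; the function name, the dictionary layout and the residue identifiers are assumptions made for the example.

```python
# Sketch of the residue exclusion rule; names and data layout are hypothetical.
B_FACTOR_CUTOFF = 50.0  # Å², the threshold quoted in the text


def reliable_residues(b_factors_by_residue, poorly_defined=frozenset()):
    """Return the ids of residues kept for conformational-change calculations.

    b_factors_by_residue: dict mapping a residue id to the B-factors of its atoms.
    poorly_defined: ids reported as having poor electron density.
    """
    return {
        res_id
        for res_id, b_factors in b_factors_by_residue.items()
        if res_id not in poorly_defined
        # A single atom at or above the cut-off excludes the whole residue.
        and all(b < B_FACTOR_CUTOFF for b in b_factors)
    }


example = {"A10": [12.0, 49.9], "A11": [20.0, 50.0], "A12": [10.0]}
kept = reliable_residues(example, poorly_defined={"A12"})
```

Here only residue A10 is kept: A11 has one atom at exactly 50 Å², and A12 is flagged as poorly defined.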
Residues were defined as exposed if their total relative side-chain surface area, or total relative main-chain surface area in the case of glycine, was greater than 15%. All others were defined as buried. Surface area was calculated by the algorithm of Lee and Richards (1971), implemented by Suhail Islam (personal communication), with a probe radius of 1.4 Å. 'Relative areas' are relative to that of the particular residue in its extended conformation (Miller et al., 1987).
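As an illustration of the exposed/buried definition, the following sketch classifies a residue from relative accessibilities that are assumed to have been computed already (for example by a Lee and Richards style program). The function name and argument layout are hypothetical.

```python
# Sketch of the exposed/buried classification; inputs are assumed to be
# relative accessible surface areas in percent, computed elsewhere.
EXPOSURE_THRESHOLD = 15.0  # percent, as stated in the text


def is_exposed(residue_name, relative_side_chain, relative_main_chain):
    """Return True if the residue counts as exposed under the 15% rule."""
    # Glycine has no side chain, so its main-chain value is used instead.
    if residue_name == "GLY":
        relative_area = relative_main_chain
    else:
        relative_area = relative_side_chain
    return relative_area > EXPOSURE_THRESHOLD
```

A residue with a relative area of exactly 15% is classed as buried, because the text requires the value to be greater than 15%.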
Independently solved structures of identical proteins
To obtain a value for the amount of structural change that can be expected from experimental differences in the determination of crystal structures, pairs of independently solved crystal structures of identical proteins were investigated. A similar analysis has been performed by another group (Flores et al., 1993).
From the April 1996 release of the Structural Classification of Proteins (SCOP) database (Murzin et al., 1995), we searched for sets of non-complexed structures with 100% identical sequence, no non-water heteroatoms and no insertions or deletions. When more than one set was available for the same SCOP classification, sets of native structures were chosen in preference to sets of mutants. If any of these sets contained more than two structures, then the two structures with the best resolution were used. If there were still more than two structures in any set, the two most recently solved structures were chosen.
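One possible reading of the last two selection rules (best resolution first, most recently solved as the tie-break) is sketched below. The tuple layout and the structure codes are hypothetical, and the preference for native structures over mutants is not modelled.

```python
# Sketch of choosing two structures from a set; one reading of the text only.
def pick_pair(structures):
    """structures: list of (code, resolution_in_angstrom, year_solved) tuples.

    Returns the codes of the two structures with the best (lowest) resolution,
    preferring the more recently solved structure when resolutions tie.
    """
    ranked = sorted(structures, key=lambda s: (s[1], -s[2]))
    return [code for code, _resolution, _year in ranked[:2]]


# Hypothetical codes, used only to show the ordering.
candidates = [("a", 2.0, 1990), ("b", 1.8, 1988), ("c", 1.8, 1994), ("d", 1.6, 1985)]
chosen = pick_pair(candidates)
```

With these made-up entries the 1.6 Å structure is chosen first, and the tie between the two 1.8 Å structures is resolved in favour of the one solved in 1994.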
Twelve pairs were found (Table I). Members of each pair were solved in the same space group, except turkey lysozyme (PDB codes 135l and 2lz2). Refinement procedures were not the same for members of each pair, meaning that any different systematic errors caused by the different procedures will show up in this analysis. These differences are justified in the context of the comparisons made with pairs of complexed and unbound structures, where the space groups and refinement methods often differ.
Complexed and unbound structures
In the April 1996 release of the PDB, 92 different protein–protein complexes that satisfied the initial criteria were found. For each component of these complexes, classifications from the April 1996 release of SCOP were used to identify structures of unbound forms with identical classifications. This gave a set of 31 complexes with one or both of their components available in an unbound form (Table II).
Interface residues for each component of a complex were defined as those that have at least one atom 4 Å or nearer to the other component. The unbound forms of the proteins were then superposed on the bound forms by least squares fitting of Cα atoms of non-interface residues.
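The interface definition can be sketched with a brute-force distance check. This is an illustration under an assumed data layout (residue id mapped to a list of atom coordinates), not the program used here; a real implementation would normally use a spatial grid for speed.

```python
import math

INTERFACE_CUTOFF = 4.0  # Å, the distance quoted in the text


def interface_residues(component, partner, cutoff=INTERFACE_CUTOFF):
    """Return ids of residues in `component` with any atom within `cutoff` of `partner`.

    component, partner: dicts mapping a residue id to a list of (x, y, z) tuples.
    """
    partner_atoms = [atom for atoms in partner.values() for atom in atoms]
    found = set()
    for res_id, atoms in component.items():
        # "4 Å or nearer" is read as an inclusive comparison.
        if any(math.dist(a, p) <= cutoff for a in atoms for p in partner_atoms):
            found.add(res_id)
    return found
```

The function is applied once per component, swapping the arguments, to obtain the interface residues on both sides of the complex.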
Identical proteins in different complexes
We wished to investigate whether different bound forms of the same protein are more similar to each other than to the unbound form. If so, then where available they could be used as the starting structure in a docking simulation.
The set of bound and unbound proteins (Table II) was searched for cases where the same protein was present in different complexes, as well as in an unbound form, using SCOP classifications to identify identical proteins. Five different proteins were found to have this data available (Table III), not including lysozyme and neuraminidase. These were ignored because their partners in the complexes are antibodies. These do not necessarily bind in the same place, and consequently one would not expect changes in the interface to be common in all the complexes. Three of the five proteins are from the same family (eukaryotic proteases), and two of them are trypsins. This means that it is unreasonable to attempt to distinguish between movements of the five proteins, and also that any conclusions that are made from the five as a whole must be used cautiously, as they will be biased towards the eukaryotic protease family.
Calculations of conformational change
Pairs of proteins were superposed on the atoms mentioned above by the least squares fitting algorithm of McLachlan (1979), implemented by Suhail Islam (personal communication), and by the 'roughfit' option of the Structural Alignment of Multiple Proteins (STAMP) program of Russell and Barton (1992). Conformational changes, based on the resulting superpositions, were calculated using programs written specifically for the work presented in this paper.
Root mean square deviations were calculated over all atoms concerned. For side chains this is not the same as the average r.m.s. over all residues concerned, because different types of residues have different numbers of atoms in their side chains.
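The distinction between an r.m.s.d. over all atoms and an average of per-residue r.m.s.d. values can be shown with a small sketch. The data layout (pairs of already superposed coordinates grouped by residue) and the function names are assumptions made for the example.

```python
import math


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rmsd_over_atoms(pairs_by_residue):
    """R.m.s.d. pooled over every atom pair, as described in the text."""
    squares = [
        squared_distance(a, b)
        for pairs in pairs_by_residue.values()
        for a, b in pairs
    ]
    return math.sqrt(sum(squares) / len(squares))


def mean_of_residue_rmsds(pairs_by_residue):
    """Average of per-residue r.m.s.d. values, shown only for contrast."""
    per_residue = [
        math.sqrt(sum(squared_distance(a, b) for a, b in pairs) / len(pairs))
        for pairs in pairs_by_residue.values()
    ]
    return sum(per_residue) / len(per_residue)


origin = (0.0, 0.0, 0.0)
example = {
    "SER": [(origin, origin)] * 2,            # two side-chain atoms, unmoved
    "ARG": [(origin, (2.0, 0.0, 0.0))] * 7,   # seven side-chain atoms, each moved 2 Å
}
```

For this example the pooled value is sqrt(28/9), about 1.76 Å, whereas the mean of the two residue values is 1.0 Å, because the arginine contributes seven atoms to the pooled sum and the serine only two.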
Torsion angles that change minima are identified by looking for changes in their class. Torsion angles were considered to be of a particular class if they were 60° or less from the position of minimum energy of that class (Janin et al., 1978). In this way a change of 10°, for example, that does not involve crossing an energy maximum (i.e. a conformation that involves steric clash) is not counted, whereas one that does is. χ2 angles were only examined for change if the related χ1 angle did not change minima.
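The class-based test can be sketched as follows. The minima at -60°, 60° and 180° are an assumption appropriate for staggered torsions; the text does not list the positions used, and some torsions (for example the χ2 of aromatic residues) have different minima. Function names are hypothetical.

```python
# Assumed positions of the energy minima, in degrees (not given in the text).
MINIMA = (-60.0, 60.0, 180.0)


def angular_difference(a, b):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)


def torsion_class(angle):
    """Return the minimum nearest to `angle`; this is within 60° by construction.

    Angles exactly on a boundary (0°, 120°, -120°) go to the first minimum listed.
    """
    return min(MINIMA, key=lambda m: angular_difference(angle, m))


def changes_minimum(angle_a, angle_b):
    return torsion_class(angle_a) != torsion_class(angle_b)


def chi_changes(chi1_a, chi1_b, chi2_a, chi2_b):
    """Return (chi1_changed, chi2_changed); chi2 is None when chi1 changed."""
    chi1_changed = changes_minimum(chi1_a, chi1_b)
    chi2_changed = None if chi1_changed else changes_minimum(chi2_a, chi2_b)
    return chi1_changed, chi2_changed
```

Under these assumptions a change from 55° to 65° stays in one class and is not counted, whereas the same 10° change from 115° to 125° crosses the boundary between two classes and is counted.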
Certain residue types (Arg, Asp, Glu, Phe and Tyr) have portions of their side chains that are symmetrical, and others (Asn, Gln and His) can be considered to have symmetry due to difficulties in distinguishing some atom types in the electron density. For example, a rotation of 180° of the benzene ring of phenylalanine (about χ2) gives two identical conformations. Differences of this type between all pairs of structures were corrected for so that labelling differences in the PDB files do not show up as conformational changes in our calculations. A special case is leucine, which has no such symmetry but which has two different conformations, corresponding to a rotation of 180° about χ2, that are difficult to distinguish in electron density maps (Janin et al., 1978). We therefore do not calculate χ2 torsion angles for leucines.
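At the level of torsion angles, the symmetry correction can be sketched by treating a 180° flip of the terminal torsion as no change. This is an assumption about how such a correction might be expressed; the programs used for this work may instead have relabelled atoms. The table of torsions below is supplied for illustration and is not taken from the text.

```python
# Terminal torsions treated as two-fold symmetric (assumed table).
SYMMETRIC = {
    ("PHE", "chi2"), ("TYR", "chi2"), ("ASP", "chi2"),
    ("GLU", "chi3"), ("ARG", "chi5"),
}
# Torsions treated as symmetric because the atoms are hard to tell apart.
PSEUDO_SYMMETRIC = {("ASN", "chi2"), ("GLN", "chi3"), ("HIS", "chi2")}
# Torsions not calculated at all.
SKIPPED = {("LEU", "chi2")}


def angular_difference(a, b):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)


def corrected_difference(residue_name, torsion, angle_a, angle_b):
    """Torsion difference in degrees after the symmetry correction, or None if skipped."""
    key = (residue_name, torsion)
    if key in SKIPPED:
        return None
    diff = angular_difference(angle_a, angle_b)
    if key in SYMMETRIC or key in PSEUDO_SYMMETRIC:
        # A flip of 180° gives an equivalent conformation, so take the smaller value.
        diff = min(diff, 180.0 - diff)
    return diff
```

For a phenylalanine whose χ2 is recorded as 90° in one file and -90° in the other, the corrected difference is zero, while the same pair of values for a lysine χ2 remains a 180° difference.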
Controls: differences between independently solved structures of identical proteins
In the rest of the paper, any conformational changes that have magnitudes that are equal to or smaller than the differences found here cannot be distinguished from differences in the experimental determination of structures. The word 'control' is used to refer to the appropriate value.
Overall measures
Several measures were used to analyse the overall conformational differences between the members of each pair: Cα root mean square deviation (r.m.s.d.), side-chain r.m.s.d., and the percentage of χ1 and χ2 angles that occupy different minima. These were calculated separately for both exposed residues and all residues (Table IV). Unfortunately the data have heavy-tailed non-normal distributions, which make means and standard deviations inappropriate measures for comparisons with the other data sets examined in this paper. Therefore a cut-off was chosen for each measure such that 95% of all the control pairs have values below it. The effect of this is to remove one outlier (the largest value), as there are twelve pairs in total. These cut-offs are given in the last row of Table IV, and summarized below (see table legends for details of implementation).
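One way to obtain such a cut-off is sketched below. The exact implementation is described in the table legends and is not reproduced here, so the rounding rule (keep the lowest floor(0.95 n) values and report the largest of them) is an assumption chosen to match the statement that one of twelve values is removed. The numbers are invented for illustration.

```python
import math


def cutoff_95(values, fraction=0.95):
    """Largest value remaining after keeping the lowest `fraction` of the data."""
    ordered = sorted(values)
    keep = max(1, math.floor(fraction * len(ordered)))
    return ordered[keep - 1]


# Twelve invented control values with one heavy-tail outlier.
control_values = [0.2, 0.3, 0.25, 0.4, 0.35, 0.1, 0.15, 0.3, 0.22, 0.28, 0.31, 1.9]
limit = cutoff_95(control_values)
```

With twelve values, floor(0.95 × 12) is 11, so only the largest value is discarded and the cut-off is the second largest (0.4 in this invented example).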
The 95% cut-off for r.m.s. deviation of Cα atoms is 0.6 Å over exposed residues and 0.4 Å over all residues. The Cα r.m.s. deviation over all residues from a similar analysis (Flores et al., 1993) is higher at 1.0 Å. This reflects both the differences in the two data sets, and the fact that we ignore residues with poor electron density or B-factors greater than 50 Å², whereas they do not. The conformation of these residues is expected to differ more than that of others because of uncertainty in their position, or high mobility. The 95% cut-off for r.m.s. deviation of side-chain atoms is 1.7 Å over exposed residues and 1.6 Å over all residues.
Changes in side-chain torsion angles were also calculated for exposed residues and for all residues. For structure comparison, a particularly useful measure of torsion angle change is the percentage of side-chain torsion angles that occupy different minima (see Materials and methods). χ2 angles are only examined for change when their related χ1 angle does not change. The 95% cut-offs are 31% of χ1 angles and 23% of χ2 angles for exposed residues, and 24% and 21% for all residues.
For all χ1 angles, 87.1% occupy the same minima, and for all χ2 angles (where χ1 does not change) this value is 90.1%. These compare well with the equivalent values calculated by Flores et al. (1993) (81.7% for χ1 angles and 86.7% for χ2 angles), though our results suggest that torsion angles are more conserved. For exposed residues, 83.1% of χ1 angles and 87.9% of χ2 angles (where χ1 does not change) occupy the same minima.
The two structures of transforming growth factor β (TGF-β) have already been compared in detail by Daopin and Davies (1994), and our results confirm theirs. They also present four different methods for estimating the coordinate errors. Two of these use resolution and R-factor to estimate overall coordinate error (Luzzati, 1952; Srinivasan and Ramachandran, 1965), and the other two estimate local errors from temperature factors. One uses Cruickshank's equations (Cruickshank, 1949, 1954, 1967) and the other is an empirical method based on the observed relationship between the temperature factors and positional differences of a pair of trypsin structures (Chambers and Stroud, 1979). The values from these methods compare well with those from the direct comparisons of the two structures. However, as the authors point out, the estimations from the first three methods cannot give a value for systematic differences in the determination of structures; these can be found only by comparing independently solved structures. This caveat also applies to the work of Tickle et al. (1998), in which error estimates for two crystallin structures were calculated from full-matrix least-squares refinement. The estimation from the empirical method suffers by being based on only one pair of structures.
Movements of individual residues
For each of the 20 commonly occurring amino acids, we have calculated the Cα displacements and side-chain r.m.s.d.'s of every exposed residue of that type. The results are again given as a '95% cut-off' (Table V), because the data have heavy-tailed non-normal distributions that make means and standard deviations inappropriate. These 95% cut-offs include most residues, but exclude large outliers caused by N- or C-terminal residues and those caused by residues adjacent to ones poorly defined in the electron density (the poorly defined ones themselves are excluded from the calculations; see Materials and methods).
Differences between complexed and unbound structures
The Cα and side-chain r.m.s.d.'s were analysed for all the pairs of complexed and unbound structures listed in Table II. These calculations were performed separately for interface residues and for exposed non-interface residues. The results (Figure 1) show that, in many complexes, conformational change is no higher than differences from experimental error: for each of the measures, more than half of the pairs have values that are equal to or below the relevant control limit. Also, nearly half of all the pairs (19 out of 39) do not move more than the controls by any of these measures. Values higher than the control limits are mostly caused by large movements of a few residues, which are discussed later.
The Cα displacements and side-chain r.m.s.d.'s of individual residues were compared against the control values for the relevant amino acid type (Table V), and those that had values greater than the controls are described in the next two sections. Figure 1 shows counts of these residues for each pair of complexed and unbound structures, alongside the overall Cα or side-chain r.m.s.d. These numbers vary widely for those pairs with an overall measure above the appropriate control limit. This reflects either substantial changes of several residues, or the fact that large changes of individual residues can dominate the overall measures of small regions.
All large Cα displacements (above 3 Å) and large side-chain r.m.s. deviations (above 5.6 Å) of exposed non-interface residues can be explained by one of the following causes (though note that these limits are greater than the control limits):
Hence all large residue movements of exposed residues that are not in the interface can be explained by either their close proximity to the interface (a), or by structural disorder (b, c and d), which is also the cause of movements greater than the controls in the systems used to define them. They are not due to hinge-bending or shear movements between domains as sometimes seen when small molecules bind (Gerstein et al., 1994). However, smaller movements than these that are nevertheless greater than the controls occur to a large extent. They may be due to crystal packing differences, which are smaller in the controls because all but one have identical space groups and most also have very similar unit cell dimensions. An exception to these generalities is human growth hormone complexed with its receptor. This is a four-helix bundle with two long crossover connections and a short loop that move substantially (as already noted by Chantalat et al., 1995).
Changes in the interfaces occur for a variety of reasons: to form specific interactions required for the action of the protein; to avoid steric clash; or to improve shape complementarity and allow hydrogen bonding (Janin and Chothia, 1990). The largest changes of interface residues are discussed in more detail later.
Do interface regions move more than exposed non-interface regions?
To answer this question, it is only meaningful to look at those systems where measurements of movements of the interface and/or the exposed non-interface regions are greater than movement of exposed residues in the controls.
The results suggest that interfaces typically have greater conformational change than other exposed parts of the structures. This is probably due to the fact that changes in the interface occur for specific reasons, rather than simply as a result of flexibility or disorder (see above). The effect is more noticeable in side-chain movement. Three of the measures, side-chain r.m.s.d. (Figure 4b) and percentages of χ1 (Figure 4c) and χ2 (Figure 4d) angles that change minima, all indicate more movement in interface regions than in exposed non-interface regions. This is shown most strongly by the percentages of χ2's that change minima: all but one of the pairs have greater values for their interface regions than they do for their exposed non-interface regions. Surprisingly, Figure 4a shows that more pairs have greater movement of the main chain (measured by Cα r.m.s.d.) for exposed non-interface regions than they do for interface regions. However, the numbers are the same if two pairs are ignored: human growth hormone complexed with its receptor (discussed above), and amicyanin complexed with methylamine dehydrogenase. In this protein the first 15 N-terminal residues form an irregular outer β-strand connected to a loop of six residues that are poorly defined in the electron density (Durley et al., 1993). The loop itself is excluded from our calculations because of its poor definition, but it confers flexibility on the included N-terminal β-strand.
Changes in interfaces occur for a variety of reasons: to form specific interactions required for the action of the protein, to avoid steric clash, or to improve shape complementarity and allow hydrogen bonding (Janin and Chothia, 1990). The changes of interface residues discussed below include all those that are equal to or larger than those of the exposed non-interface residues that could be explained by structural disorder or proximity to the interface (see above).
Changes that allow the formation of specifically required interactions are the largest and most extensive seen in the structures examined. When chymotrypsinogen binds to human pancreatic secretory trypsin inhibitor (PDB code 1cgi), the specificity pocket and oxyanion hole necessary for inhibitor binding are formed by large movements of loops Ser189–Ser195 and Val213–Cys220 towards the inhibitor (Figure 5a). This change is the same as occurs when the zymogen is activated by hydrolysis. Smaller Cα shifts of inhibitor loop Tyr10–Arg21, along with side-chain movements towards the enzyme of some of these residues, alter the pattern of hydrogen bonding and allow binding to chymotrypsinogen. The changes are largely the same as those noted by Hecht et al. (1991, 1992).
Interactions that appear to be less necessary for function, because they simply alleviate minor steric clash or improve hydrogen bonding and van der Waals contacts, are noticeably less extensive. However, they can still involve large changes of a few residues. Figure 5b shows changes of this nature that occur when the interface between hen egg white lysozyme and the variable domain of antibody D1.3 (PDB code 1vfb) is formed. Gly102 of lysozyme moves with a Cα displacement of 7.5 Å, which brings it to within 2.1 Å of Arg99 on the heavy chain of the antibody. Movement of Arg99 was noted in a comparison of complexed and unbound antibody (Bhat et al., 1994), along with a decrease in its mobility as shown by a decrease in temperature factor. The two residues either side of lysozyme Gly102 (Asp101 and Asn103) are not classified as interface but also move significantly; they are part of a loop movement. Another large but isolated discrete change occurs with Arg125 of lysozyme (side-chain r.m.s.d. = 6.3 Å), with the possible creation of a hydrogen bond to Ser93 on the light chain of the antibody. In other complexes, discrete changes not directly related to function occur to improve electrostatic complementarity; for example, the movement of Lys73 of amicyanin on binding to methylamine dehydrogenase (PDB code 1mda; Figure 5c), or to positions that would be highly exposed to solvent if adopted in the unbound structure; for example, Phe39 of α-chymotrypsin (PDB code 1cho, Figure 5d).
Differences between different types of component
In Table II there are eight complexes (six enzyme–inhibitors and two antibody–antigens) which have both of their components solved in an unbound form. These data enable a comparison of the extent of conformational change in the different components (enzymes against inhibitors, and antibodies against antigens). The number of interface residues that have a side-chain r.m.s.d. larger than the relevant control is similar for the different components. The same is true for Cα displacement (except for subtilisin complexed with chymotrypsin inhibitor, and Fab D44.1 bound to lysozyme). This suggests that in many cases the extent of conformational change is the same in the different components. However, side-chain r.m.s.d.'s calculated over all interface residues give a different but incorrect result, suggesting that the interfaces of inhibitors and antigens are more mobile than those of their enzyme and antibody partners (Figure 6). This is incorrect because inhibitors and antigens have smaller interfaces than their partners in the complexes, with between 30 and 84% of the number of residues, and therefore a few large side-chain movements have more of an effect on the overall r.m.s.d.
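The size effect described above can be illustrated numerically. The sketch below is a deliberate simplification: each residue contributes a single displacement (rather than one per atom), and the displacement values and interface sizes are invented for the example.

```python
import math


def overall_rmsd(n_residues, n_moved, small=1.0, large=6.0):
    """R.m.s.d. over an interface in which `n_moved` residues shift by `large` Å
    and the remaining residues shift by `small` Å (one value per residue)."""
    total = (n_residues - n_moved) * small ** 2 + n_moved * large ** 2
    return math.sqrt(total / n_residues)


small_interface = overall_rmsd(10, 2)   # e.g. an inhibitor-sized interface
large_interface = overall_rmsd(25, 2)   # e.g. its larger partner
```

With the same two residues moving by 6 Å, the overall value is sqrt(8), about 2.8 Å, for the 10-residue interface but only sqrt(3.8), about 1.9 Å, for the 25-residue interface, even though the number of residues that move is identical.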
A comparison of the amount of conformational change in equivalent components of different types of complexes would also be useful. Enzymes are comparable with antibodies and inhibitors are comparable with antigens, in terms of their relative sizes in the complexes, as mentioned above, and in terms of conformational change, because the two types of complexes behave like each other (Janin and Chothia, 1990). A comparison of the inhibitors and antigens in our data set (Table II) is justified as there are six and seven of each, respectively, that have structures of both the complexed and unbound forms. The numbers of these that have values above the controls suggest that side-chain movement is more common in the interfaces of inhibitors than in those of antigens, as measured by both side-chain r.m.s.d. (Figure 1b) and the percentage of χ2's that change minima (Figure 3b). Once again, the differences are caused by large changes of a few residues. However, this does not invalidate the results because of the similar number of residues in the interfaces. The numbers for interface Cα r.m.s.d. (Figure 1a) are equal or very similar for both types of component. This is also true for the percentages of χ1's of interface residues that change minima (Figure 3a). There are not enough antibodies with both components solved in an unbound form to justify a comparison of them with the enzymes.
The other complexes, which are not enzyme–inhibitor or antibody–antigen, show mixed results and should be considered individually. β-Actin, in complex with profilin, has a similar number of residues in its interface when compared with inhibitors and antigens (though at the high end of the range), and a significant percentage of the χ2's of these residues change minima (Figure 3b). None of the other measures of interface movement are above the controls. Amicyanin complexed with methylamine dehydrogenase and human growth hormone complexed with its receptor both have large changes in their interfaces for all four measures: Cα r.m.s.d. (Figure 1a), side-chain r.m.s.d. (Figure 1b), and the percentages of χ1's and of χ2's that change minima (Figure 3a and b). Amicyanin has a small number of interface residues, so large changes of a few residues have a greater effect on these measures. Human growth hormone has double the number of interface residues that enzymes and inhibitors have (the receptor is a dimer, and the hormone effectively has two interfaces, one with each monomer). Therefore the large values seen for these measures are definitely significant, but there are also large changes of the whole molecule (discussed previously). The number of residues in the interface of subtilisin complexed with subtilisin prosegment is similar to the number in the growth hormone complex, but in this case only the percentage of χ2's that change minima is above the control (Figure 3b). The deoxyribonuclease I–actin and glycerol kinase–glucose-specific factor III (GSF III) complexes have little significant movement of their interfaces, except for the percentage of χ1's of the interface of GSF III that change minima (Figure 3a).
Differences in the structures of identical proteins in different complexes
Table III gives information on five proteins that are present in more than one complex in the main data set (Table II). Lysozyme and neuraminidase are not considered because their partners in the complexes are antibodies. These do not necessarily bind in the same place, and consequently one would not expect changes in the interface to be common in all the complexes. The only difference between comparing unbound structures with complexed and complexed with complexed is that the interface may be affected. Therefore it is appropriate to concentrate just on those residues that are common to the interface of all the complexes of a particular protein. We have examined the Cα displacements and side-chain r.m.s.d.'s of these residues.
From work described in previous sections, only one of the proteins, bovine pancreatic trypsin inhibitor (PTI), has interface side-chain r.m.s.d.'s between all structures of that protein in a complex and the unbound form that are larger than the control. These structures have only one common interface residue that changes its conformation by more than the control limits. This residue, Arg17, has a much more similar conformation in the complexes than it does in the unbound structure (Figure 5f). The change avoids steric hindrance that would occur with the unbound conformation. It is only in this protein that the interfaces of the complexes appear more similar to each other than to the same region in the unbound structure. Arginine 17 in the unbound structure appears to have been built in the most common structure, perhaps suggesting that it is mobile and was poorly defined in the electron density map. However, it has a slightly lower temperature factor than in the complexed structures, suggesting that it was not more mobile than in those structures.
In the subtilisin complexes there are several residues common to the interface that have differences greater than the controls. His64 in the unbound structure and in the protein bound to subtilisin prosegment has a large side-chain r.m.s.d. when compared with the other situations. However, in the unbound structure this residue has two possible positions. The one used in this analysis has an occupancy of 0.8. However, this corresponds to a structure with phenylmethylsulfonate (PMS) bound with an occupancy of 0.7. The 0.2 occupancy structure of His64, with no bound PMS, is much closer to the structures of the complexes with inhibitors, but not to that with prosegment. His64 in the complex with prosegment differs from the others because the bulk of the prosegment binds away from the active site, with only eight residues of the C-terminus extending into the active site. In the other complexes, steric hindrance by the inhibitor, which is different to that caused by PMS, favours the 0.2 occupancy conformation of His64. There are also small differences in the conformations of Ser101 and Tyr104, but the conformations in the complexes are not significantly more similar to each other than they are to the unbound conformation. All the other common interface residues have conformations that are the same to the level of the controls.
In all comparisons between the three examples of bovine chymotrypsin (one unbound and two complexed), Phe39 differs by a large side-chain r.m.s.d. (around 5 Å). The difference between the two complexed structures is slightly smaller than in comparisons with the unbound, reflecting that the conformational change occurs only after Cβ (i.e. involves a χ1 rotation), rather than from Cα onwards. Tyr146 differs slightly in all comparisons, but is at the end of a chain break. Ser218 differs most in comparisons with one of the complexes, and has a similar structure in the other complex and the unbound protein. All other common interface residues have conformations that are the same to the level of the controls.
In the bovine trypsin complexes, the conformations of only one of the common interface residues (Tyr39) differ by more than the controls, and in this case the conformations of the complexes are not more similar to each other than they are to that of the unbound. The same residue of rat trypsin differs between the unbound form and the two bound forms, but does not differ between the two bound forms. However, the differences are small.
The limitations of the data set make it difficult to draw firm conclusions. However, it appears that when the changes in the interface are small, the structures of the interfaces in the complexes are no more similar to each other than they are to the unbound structure. Larger changes are more likely to be common to all complexes, possibly indicating that they are more significant to binding.
Implications for comparative modelling and predictive docking
Martin et al. (1997) assessed the results of the comparative modelling section of the 1996 Critical Assessment of Structure Prediction (CASP2). Their control values were mainly Cα and all-atom r.m.s.d.'s, derived from a subset of three of the CASP2 targets which each had a pair of structures that were solved independently. This data set gave a value for Cα r.m.s.d. that was similar to ours (0.6 Å compared with 0.4 Å). They found that targets with high sequence identity (~85%) to a protein of known structure were modelled to within these limits.
To analyse how the changes found affect the ability of docking algorithms to predict correctly the structure of a protein–protein complex from the unbound structures of its components, we have looked at the results of the program FTDOCK (Gabb et al., 1997). This algorithm was developed and tested on a data set containing five of the complexes analysed by us (Table II), using exactly the same structural data for the bound and unbound forms. The algorithm performs a global rigid-body search of rotational and translational space, and scores each potential structure on shape and electrostatic complementarity. The best 4000 from this search are filtered using distance constraints from biochemical data, and then undergo local refinement scored by shape complementarity.
The algorithm performed best on the α-chymotrypsinogen–PTI complex, with a correct structure (i.e. one with an interface Cα r.m.s.d. of 2.5 Å or less when compared with the crystal structure of the complex) ranked first out of 133 predictions that remained after local refinement. This is somewhat surprising in the light of our analysis, as the interface regions of the two components show some of the largest Cα and side-chain r.m.s.d.'s observed (Figure 1), and percentages of side-chain angles that change minima that are mostly above the control levels (Figure 3). These large values are caused by sizeable movements of several individual interface residues, as discussed previously. It is interesting to note, however, that none of these residues would have caused bad steric clash had they stayed in their unbound conformation.
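The 2.5 Å criterion for a correct prediction can be made concrete with a short r.m.s.d. calculation. The sketch below assumes that the interface Cα coordinates of the predicted and crystal complexes are already paired atom for atom, and it uses the Kabsch SVD superposition as a stand-in for whatever fitting protocol was actually applied; the function names are hypothetical.

```python
import numpy as np

CORRECT_CUTOFF = 2.5  # Å, the interface C-alpha r.m.s.d. threshold quoted in the text


def superposed_rmsd(pred, ref):
    """R.m.s.d. between two paired (N, 3) coordinate sets after rigid superposition."""
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # Remove the translational difference by centring both sets.
    p = pred - pred.mean(axis=0)
    q = ref - ref.mean(axis=0)
    # Kabsch: the SVD of the covariance matrix gives the optimal rotation.
    u, _, vt = np.linalg.svd(p.T @ q)
    # Force a proper rotation (determinant +1) so mirror images are not matched.
    d = 1.0 if np.linalg.det(u @ vt) >= 0 else -1.0
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    diff = p @ rot - q
    return float(np.sqrt((diff ** 2).sum() / len(p)))


def is_correct(pred_interface_ca, ref_interface_ca, cutoff=CORRECT_CUTOFF):
    """True if the predicted interface lies within `cutoff` Å r.m.s.d. of the reference."""
    return superposed_rmsd(pred_interface_ca, ref_interface_ca) <= cutoff
```

Ranking then amounts to finding the position of the first prediction for which `is_correct` returns true in the scored list.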
Three of the other four complexes (kallikrein–PTI, subtilisin–chymotrypsin inhibitor and Fab D44.1–lysozyme) were all predicted with varying degrees of success. All have some large movements of interface residues which avoid potential steric clash. The final complex, subtilisin–subtilisin inhibitor, had no correct solution in the top 4000 predictions. This is puzzling at first glance. Although both components have some interface residues that show movement above the control, and would cause steric clash if the movements did not occur, these movements are no more severe than those seen in the previous three complexes. However, the unbound structure of subtilisin inhibitor has a region (Ala62–Met70) where only the approximate path of the main-chain could be traced, with associated uncertainties in the placement of the side-chains (see PDB file for code 2ssi). These residues were therefore excluded from our analysis, but unfortunately some of them are interface residues and would cause substantial steric clash if they remained in their unbound conformations.
We see here that conformational change which does not occur to avoid steric clash is coped with quite well, even when it is to the level seen in the α-chymotrypsinogen–PTI complex. There is sufficient shape complementarity to identify the correct complex, despite the large conformational change.
Conclusions |
---|
This analysis confirms the induced-fit model for protein–protein recognition. Often the largest movements are not from the functionally important residues, such as those forming the active sites, but from interface regions that are peripheral to these residues. The conformational change can alleviate steric clashes, improve van der Waals packing, or lead to the formation of hydrogen bonds or salt bridges. However, in several of the systems examined the extent of conformational change is less substantial than in the systems whose complexes were successfully predicted by FTDOCK (Gabb et al., 1997). For these systems, recognition in shape and charge can, as a first approximation, be treated as a lock and key.
There is still only a limited number of systems for which there is information about conformational change. As more structures of complexes and their unbound components are solved, the conclusions from this analysis may need to be revised. In particular, the extent of conformational change may vary between different biological systems. The enzyme–inhibitor complexes that dominate this study may generally exhibit less conformational change than complex formation involved in other processes, such as signalling. The high binding affinity seen in enzyme–inhibitor and antibody–antigen association may rule out large conformational changes, whereas conformational changes of other proteins may be fundamental to their mechanisms. For those systems with limited conformational change, predictive docking should prove a valuable method to obtain structural models from unbound components and thereby provide insights into biological recognition. Applications such as computational ligand design (Caflisch and Karplus, 1995) are less able to tolerate conformational changes as large as those presented here. However, the lower accuracy of structures of complexes generated by predictive docking can still provide information about the functional region of a protein, and could therefore suggest the types of molecules which should be screened for inhibition.
References |
---|
Bhat,T.N., Bentley,G.A., Boulot,G., Green,M.I., Tello,D., Dall'Acqua,W., Souchon,H., Schwarz,F.P., Mariuzza,R.A. and Poljak,R.J. (1994) Proc. Natl Acad. Sci. USA, 91, 1089–1093.
Caflisch,A. and Karplus,M. (1995) Perspectives Drug Discovery Des., 3, 51–84.
Chambers,J.L. and Stroud,R.M. (1979) Acta Crystallogr., B35, 1861–1874.
Chantalat,L., Jones,N.D., Korber,F., Navaza,J. and Pavlovsky,A.G. (1995) Protein Peptide Lett., 2, 333–340.
Chen,L., Durley,R., Poliks,B.J., Hamada,K., Chen,Z., Mathews,F.S., Davidson,V.L., Satow,Y., Huizinga,E., Vellieux,F.M.D. and Hol,W.G.J. (1992) Biochemistry, 31, 4959–4964.
Cruickshank,D.W.J. (1949) Acta Crystallogr., 2, 65–82.
Cruickshank,D.W.J. (1954) Acta Crystallogr., 7, 519.
Cruickshank,D.W.J. (1967) In Kasper,J. and Lonsdale,K. (eds), International Tables for X-ray Crystallography, Vol. 2. Kynoch Press, Birmingham (present distributor: Kluwer Academic Publishers, Dordrecht), pp. 319–340.
Daopin,S. and Davies,D.R. (1994) Acta Crystallogr., D50, 85–92.
Durley,R., Chen,L., Lim,L.W., Mathews,F.S. and Davidson,V.L. (1993) Protein Sci., 2, 739–752.
Flores,T.P., Orengo,C.A., Moss,D.S. and Thornton,J.M. (1993) Protein Sci., 2, 1811–1826.
Gabb,H.A., Jackson,R.M. and Sternberg,M.J.E. (1997) J. Mol. Biol., 272, 106–120.
Gerstein,M. and Chothia,C. (1991) J. Mol. Biol., 220, 133–149.
Gerstein,M., Lesk,A.M. and Chothia,C. (1994) Biochemistry, 33, 6739–6749.
Hecht,H.J., Szardenings,M., Collins,J. and Schomburg,D. (1991) J. Mol. Biol., 220, 711–722.
Hecht,H.J., Szardenings,M., Collins,J. and Schomburg,D. (1992) J. Mol. Biol., 225, 1095–1103.
Huber,R. (1979) Trends Biochem. Sci., 4, 271–276.
Jackson,R.M., Gabb,H.A. and Sternberg,M.J.E. (1998) J. Mol. Biol., 276, 265–285.
Janin,J. (1995) Prog. Biophys. Mol. Biol., 64, 145–166.
Janin,J. and Chothia,C. (1990) J. Biol. Chem., 265, 16027–16030.
Janin,J., Wodak,S., Levitt,M. and Maigret,B. (1978) J. Mol. Biol., 125, 357–386.
Janin,J. and Wodak,S.J. (1983) Prog. Biophys. Mol. Biol., 42, 21–78.
Jones,S. and Thornton,J.M. (1996) Proc. Natl Acad. Sci. USA, 93, 13–20.
Lee,B. and Richards,F.M. (1971) J. Mol. Biol., 55, 379–400.
Lesk,A.M. and Chothia,C. (1988) Nature, 335, 188–190.
Luzzati,V. (1952) Acta Crystallogr., 5, 802–810.
Martin,A.C.R., MacArthur,M.W. and Thornton,J.M. (1997) Proteins, S1, 14–28.
McLachlan,A.D. (1979) J. Mol. Biol., 128, 49–79.
Miller,S., Lesk,A.M., Janin,J. and Chothia,C. (1987) Nature, 328, 834–836.
Murzin,A., Brenner,S.E., Hubbard,T. and Chothia,C. (1995) J. Mol. Biol., 247, 536–540.
Russell,R.B. and Barton,G.J. (1992) Proteins, 14, 309–323.
Shoichet,B.K. and Kuntz,I.D. (1996) Chem. Biol., 3, 151–156.
Srinivasan,R. and Ramachandran,G.N. (1965) Acta Crystallogr., 19, 1008–1014.
Stanfield,R.L. and Wilson,I.A. (1994) Trends Biotech., 12, 275–279.
Sternberg,M.J.E., Gabb,H.A. and Jackson,R.M. (1998) Curr. Opin. Struct. Biol., 8, 250–256.
Tickle,I.J., Laskowski,R.A. and Moss,D.S. (1998) Acta Crystallogr., D54, 243–252.
Weng,Z., Vajda,S. and Delisi,C. (1996) Protein Sci., 5, 614–626.
Received June 26, 1998; revised January 7, 1999; accepted January 21, 1999.