A novel method for scoring of docked protein complexes using predicted protein–protein binding sites

Kay-Eberhard Gottschalk, Hani Neuvirth and Gideon Schreiber1

Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 76100, Israel

1 To whom correspondence should be addressed. e-mail: gideon.schreiber@weizmann.ac.il


    Abstract
 
Docking algorithms produce many possible structures of a protein–protein complex. In most cases some of them resemble the correct structure within an r.m.s.d. of <3 Å. A major challenge in the field of docking is to extract the correct structure out of this pool, the so-called ‘scoring’. Here, we introduce a new scoring function, which discriminates between the many wrong and few true conformations. The scoring function is based on measuring the tightness of fit of the two docked proteins at a predicted binding interface. The location of the binding interface is identified using the recently developed computer algorithm ProMate. The new scoring function does not rely on energy considerations. It is therefore tolerant to low-resolution descriptions of the interface. A linear relation between the score and the r.m.s.d. relative to the ‘true structure’ is found in most of the cases evaluated. The function was tested on the docking results of 21 complexes in their unbound form. It was found to be successful in 77% of the examined cases, defining success as scoring a ‘true’ result with a p value of better than 0.1.

Keywords: docking/interface prediction/protein–protein interaction/scoring function


    Introduction
 
Life is dependent on the specific formation of protein–protein complexes. However, the number of experimentally determined structures of transient protein–protein complexes is small. Only a combination of homology modeling, docking calculations and experimental structure determination will be able to put the increasing knowledge about protein partners forming complex networks on a structural basis (Marti-Renom et al., 2000; Chance et al., 2002; Sali et al., 2003). To this end, docking programs have been developed in recent years, which can reliably dock two unbound proteins to obtain the structure of the complex, albeit the structural changes accompanying complexation have to be minor (Katchalski-Katzir et al., 1992; Walls and Sternberg, 1992; Norel et al., 1994; Rosenfeld et al., 1995; Vakser, 1995; Sandak et al., 1998; Halperin et al., 2002; Lorber et al., 2002; McConkey et al., 2002; Smith and Sternberg, 2002).

The docking process can be divided into two steps (Halperin et al., 2002). First, a large number of potential structures with reasonable surface complementarity are generated. In a second step, these structures are ranked according to a score, which extracts the near-native structures out of the pool of non-native structures. Common approaches for scoring the results are based either on surface complementarity (Katchalski-Katzir et al., 1992; Walls and Sternberg, 1992; Norel et al., 1994, 1995, 1999), sometimes together with an electrostatic filter (Gabb et al., 1997; Norel et al., 2001; Heifetz et al., 2002) or on energy-based methods such as residue potential scores or elaborate free energy evaluations (Jackson and Sternberg, 1995; King et al., 1996; Jackson et al., 1998; Moont et al., 1999; Camacho et al., 2000; Lorber et al., 2002). The known scoring functions, especially the more refined free energy estimates, are dependent on a high-resolution description of the protein surfaces. As the side-chain and main-chain conformations may change upon complexation and are difficult to predict, the broadness of the approaches is limited. Moreover, homology models, even when starting from highly similar structures, can provide only a rough estimate of the surface shape of the modeled protein. Therefore, so far only very low-resolution docking has been attempted for homology models (Vakser, 1996; Tovchigrechko et al., 2002).

Recently, we presented a new structure-based prediction program, ProMate (Neuvirth and Schreiber, 2004), which calculates the potential location of a protein–protein interface. The algorithm is based on the analysis of the unique structural and biochemical characteristics of transient protein–protein binding sites. Here, we present a new scoring function that is based on the prediction of putative binding sites using ProMate. Our scoring procedure is not reliant on energy considerations, but measures the tightness of the fit of the interacting proteins at the predicted binding site. The function is independent of side-chain conformations and tolerant towards inaccuracies in the backbone conformation.


    Methods
 
Predicting the location of a protein-binding site with ProMate

ProMate (http://bioportal.weizmann.ac.il/promate) is based on a statistical analysis of several properties that were found to distinguish binding regions from non-binding ones. The properties were modeled using a database of 57 transient hetero-interactions of proteins, the structure of which is known both in the unbound and complex forms (excluding antigens). Histograms of the distributions of each property in the interface and non-interface regions were constructed and served as the basic model for the prediction stage. Properties used for prediction include the frequency of atoms, their characteristics, chemical character, secondary structure, hydrophobic patches, distribution of water molecules and evolutionary conservation. A tested protein is initially processed as an independent set of circles. For every circle each of the properties is examined and the likelihood of this circle belonging to the interface is determined. The score for a property is the observed frequency of the circle's value of that property in the interface of the training set divided by the sum of its observed frequencies in the interface and non-interface. In other words, denoting interface by I and surface by S, where O refers to the observed frequency in the training set, for an input circle c:

\[ \mathrm{score}(c) = \frac{O_I(c)}{O_I(c) + O_S(c)} \]

Each circle’s probability is multiplied by the probability of being an interface for a protein of a specific size. The combined score is the product of all the scores resulting from the different properties corrected according to the actual frequencies as they appear in the training set. To smooth the score further the frequencies of the adjacent dots in a 7 Å circle are taken into account for the final score. This procedure is repeated for a number of iterations (H.Neuvirth and G.Schreiber, unpublished work).
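The per-property score and its combination across properties can be sketched as follows. This is a minimal illustration and not ProMate's implementation: the function names and the example counts are hypothetical, the neutral value returned for an empty histogram bin is an assumption, and the protein-size prior, the frequency correction and the 7 Å smoothing described above are left out.

```python
from math import prod

def bin_score(obs_interface, obs_surface):
    """Score of one property for one circle: O_I / (O_I + O_S)."""
    total = obs_interface + obs_surface
    # Assumption: a bin never seen in training carries no information,
    # so it gets a neutral value of 0.5.
    return obs_interface / total if total > 0 else 0.5

def combined_score(bin_counts):
    """Product of the per-property scores of one circle.

    bin_counts holds one (O_I, O_S) pair per property, read from the
    histogram bin that the circle falls into for that property.
    """
    return prod(bin_score(o_i, o_s) for o_i, o_s in bin_counts)

# Hypothetical training-set counts for three properties of one circle.
circle = [(30, 10), (12, 12), (8, 24)]
print(combined_score(circle))  # 0.75 * 0.5 * 0.25
```

A product of ratios treats the properties as independent, which is why a single property with a low interface frequency can pull the combined score down sharply.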

Docking

An extensive set of 21 non-redundant enzyme–inhibitor complexes in their unbound state (Chen et al., 2003; Gray et al., 2003) was docked. The superimposition of the unbound structures on the bound complexes served as reference structures. All docking calculations were performed with a parallel version of the program package FT-Dock (Walls and Sternberg, 1992; Gabb et al., 1997; Moont et al., 1999) (http://www.bmm.icnet.uk/docking/) on a Mac G5 dual processor computer. The parallel version was generously provided by G.R.Smith and M.J.Sternberg. FT-Dock follows closely the shape complementarity algorithm introduced by Katchalski-Katzir et al. (1992). The docking was performed with the size of a single grid unit set to 2.0 Å, an angle step of 12°, a surface thickness of 4.0 Å and an internal deterrent value of –20. The molecular grid extended 1.8 Å outwards from the mobile molecule and 3.2 Å outwards from the static molecule. The 20 best surface complementarity translations were kept for each rotation. These settings have been found to be a good compromise between docking speed and accuracy (G.R.Smith, personal communication). The structures obtained were rescored using the residue potential scoring function implemented in FT-Dock (Moont et al., 1999); 10 000 structures were generated and further evaluated per docked complex. The docking calculations for all 21 complexes took 15 h on the dual G5 processor.

Scoring

The probabilities of residues being an interface were calculated as described elsewhere, using our software ProMate (http://bip.weizmann.ac.il/promate). Since a binding site covers ~10% of the total surface of the protein, just the top 10% scoring residues were taken as interface. These residues do not necessarily form a continuous patch. Two distances were calculated: first, the average minimum distance of the predicted interfacial Cα atoms of protein 1 to any of the Cα atoms of the binding partner, and second, the average minimum distance of all Cα atoms of protein 1 to any Cα atom of protein 2. From these two distances the ToF (tightness of fit) score was calculated according to:

\[ \mathrm{ToF} = \frac{d_{\mathrm{inter}}}{d_{\mathrm{all}}} \]

where

\[ d_{\mathrm{inter}} = \frac{1}{n}\sum_{i=1}^{n} D_{\mathrm{inter};i} \qquad \text{and} \qquad d_{\mathrm{all}} = \frac{1}{m}\sum_{j=1}^{m} D_{\mathrm{all};j} \]

\(D_{\mathrm{inter};i}\) is the minimum distance of the Cα of residue i, predicted to be interface, of protein 1 to any Cα of protein 2, and \(D_{\mathrm{all};j}\) is the minimum distance of the Cα atom of surface residue j, which is either interface or not, of protein 1 to any Cα atom of protein 2. There are n predicted interfacial residues and m surface residues altogether for the respective protein. \(d_{\mathrm{inter}}\) is therefore the average minimum distance of the predicted interfacial residues to the other protein, and \(d_{\mathrm{all}}\) is the average minimum distance of all surface residues to the other protein. Hence ToF measures the tightness of fit at the predicted binding site, normalized by the size of the protein.
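The calculation can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the code used for the paper: ToF is taken as the ratio of the two averages, the function names are hypothetical, coordinates are plain (x, y, z) tuples of Cα positions, and the toy coordinates and interface probabilities are invented for the example.

```python
from math import dist

def top_fraction(scores, fraction=0.1):
    """Indices of the highest-scoring residues (at least one residue)."""
    k = max(1, int(len(scores) * fraction))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

def mean_min_distance(coords_a, coords_b):
    """Average, over coords_a, of the minimum distance to any point in coords_b."""
    return sum(min(dist(a, b) for b in coords_b) for a in coords_a) / len(coords_a)

def tof(surface_ca, interface_idx, partner_ca):
    """Assumed form of the score: d_inter / d_all (lower means a tighter fit)."""
    d_inter = mean_min_distance([surface_ca[i] for i in interface_idx], partner_ca)
    d_all = mean_min_distance(surface_ca, partner_ca)
    return d_inter / d_all

# Toy protein 1: ten surface C-alpha atoms on a line, residue 0 predicted as interface.
surface = [(10.0 * i, 0.0, 0.0) for i in range(10)]
interface = top_fraction([0.9] + [0.1] * 9)
at_site = [(0.0, 4.0, 0.0)]    # partner docked at the predicted site
off_site = [(90.0, 4.0, 0.0)]  # partner docked at the far end
print(tof(surface, interface, at_site) < tof(surface, interface, off_site))
```

Because both averages are taken over the same docked pose, the ratio is dimensionless and equals 1 when the predicted interface is no closer to the partner than the surface as a whole.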

Evaluation of scoring performance

The scoring performance was evaluated by calculating the chance of obtaining a result as good as or better than that obtained from the scoring function by randomly picking complexes out of the pool of generated complexes. This probability is described by the hypergeometric distribution. Hence, the probability was calculated according to:

\[ p = \sum_{k=a}^{b} \frac{\binom{r}{k}\binom{m-r}{n-k}}{\binom{m}{n}} \]

where m is the total number of complexes (10 000), n the rank of the first near-native complex, r the total number of near-native complexes in the ensemble, a = 1 (at least one near-native structure is to be found), and, if n > r, b = r, otherwise b = n; b describes the upper limit of the possible number of near-native structures when picking n times. This calculates the probability of obtaining at least one near-native complex and at maximum all possible near-native complexes by chance when picking n times. Structures with an r.m.s.d. of ≤3.0 Å were considered as near-native. If no near-native conformation was found, b was set to 1 and n was set as the rank of the best structure in the ensemble.
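A short sketch of this probability, assuming the hypergeometric sum runs from a to b = min(n, r) as the definitions above state. The function names are hypothetical; the example numbers are the 1PPE figures quoted in the Results (rank 9, 299 near-native structures among 10 000).

```python
from math import comb

def first_near_native_rank(rmsds_by_rank, cutoff=3.0):
    """1-based rank of the first structure with r.m.s.d. <= cutoff, or None."""
    for rank, rmsd in enumerate(rmsds_by_rank, start=1):
        if rmsd <= cutoff:
            return rank
    return None

def random_pick_probability(m, n, r, a=1):
    """Chance of drawing between a and min(n, r) near-native structures
    when picking n of the m complexes at random without replacement."""
    b = min(n, r)
    favourable = sum(comb(r, k) * comb(m - r, n - k) for k in range(a, b + 1))
    return favourable / comb(m, n)

# 1PPE: first near-native structure at rank 9, 299 near-native structures in total.
p_1ppe = random_pick_probability(m=10_000, n=9, r=299)
print(round(p_1ppe, 2))
```

With a single near-native structure in the pool the sum reduces to n/m, so a rank of 11 out of 10 000 gives p = 0.0011, which is the second scenario discussed in the Results.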


    Results and discussion
 
Since the interface predictor ProMate has been optimized for transient hetero complexes, the scoring function is not reliable for either antigen–antibody complexes or homo–oligomeric complexes. We therefore tested it on an extensive set of 21 non-redundant enzyme–inhibitor complexes. The same benchmark set was used recently by Baker and co-workers (Gray et al., 2003) and has been assembled by Weng and co-workers (Chen et al., 2003). It consists of the unbound structures of these proteins superimposed on the bound structures.

ToF is based on calculating the minimum distance between the binding site predicted by ProMate and the protein partner. Three distance scores can be calculated by this method. The binding site can be predicted for the enzyme (E), not taking the inhibitor (I) into account [ToF(E → I)], it can be predicted for the inhibitor, not taking the enzyme into account [ToF(I → E)] and it can be predicted for both [ToF(E ↔ I)]. Preliminary tests on a reduced dataset clearly demonstrated that ToF(E → I) performs best (data not shown). A possible reason for this is that ProMate was not designed to work for small proteins (<85 amino acids), which is the size of many of the inhibitors.

Analysis of the ToF performance

The performance of ToF(E → I) for all 21 structures is summarized in Table I and Figure 1. The most intuitive criterion for success of the scoring function is to test whether near-native structures are found at low ranks. A near-native structure with an r.m.s.d. of <3.0 Å is found among the 10 top-ranking structures for seven complexes, and eight additional complexes have a structure with an r.m.s.d. of <5.1 Å among the 10 top-ranking structures. However, this description of success does not discriminate between the performance of the docking function and that of the scoring function. If, for example, 9990 near-native structures are found by the docking algorithm and the scoring function gives the first 10 ranks to only the non-native structures, the scoring function performs badly even though the lowest rank of a near-native structure is only 11. If, on the other hand, the docking algorithm generates just one near-native structure and the scoring function gives this structure the rank 11, the scoring function performs well. A way to quantify the performance of the scoring function given the performance of the docking algorithm is to calculate the probability of performing as well as or better than the scoring function by random picking. This probability can be described with a hypergeometric distribution. The difference between the probability and the lowest rank as a criterion is clearly demonstrated for 1PPE: although the rank of the first near-native structure is 9, which appears to represent good success of the scoring function, the existence of 299 near-native conformations (which is the success of the docking program) renders it probable that the same result could be obtained by random picking (with a probability of p = 0.23). As demonstrated here, the probability is the least biased single value that describes the performance of the scoring function. The average probability of being as good as or better than the scoring function by chance is p = 0.08. When excluding two outliers, 2SNI and 1BRS, the average probability drops to p = 0.05. Hence the scoring function performs much better than random picking. In four cases, the probability is <0.01 and in five cases it is >0.1. Out of these five cases, three have a probability of p > 0.2. These three or five cases can be considered as failures of the scoring function. This converts to a success rate of 77–85%.


Table I. Performance of scoring function
 


Fig. 1. Performance of scoring function for all 21 complexes. The relationship between r.m.s.d. and score is shown for all 10 000 docking results per complex. Only for 1BRS can no clear distinction between low- and high-r.m.s.d. structures be seen.

 
In 13 out of the 21 cases, a protein homologous to the enzyme used in the benchmark test was used during training of ProMate. Nevertheless, no correlation between success (measured by p) and sequence identity can be found (Table I). Furthermore, during the development of ProMate it was shown by cross-validation that the success rate of ProMate is of the order of 70%, using very stringent criteria of success (Neuvirth and Schreiber, 2004). This is comparable to the success rate reported here. It indicates that the benchmark set is a realistic test case for the performance of ToF and is not biased by the partial overlap between the benchmark set of ToF and the training set of ProMate.

Relationship between r.m.s.d. and ToF

A linear relationship between a score and the r.m.s.d. is desirable, since this would allow successful scoring even if no near-native structure is found. Indeed, an R² value of >0.5 is found in 16 cases (Table I and Figure 1). The worst correlation between ToF and r.m.s.d. is found for 1BRS (barnase–barstar) (Figure 1). This is slightly surprising, since the interface prediction works well for both proteins. Nevertheless, it has been observed earlier that the barnase–barstar complex features an unusually large number of interfacial water molecules. One can speculate that the interface between barnase and barstar is not as tight as it is for the other complexes. This would explain both the failure of our scoring scheme for this complex and the existence of water in the interface.
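For a straight-line fit of score against r.m.s.d., R² equals the squared Pearson correlation coefficient, so it can be computed without fitting the line explicitly. The sketch below assumes a simple least-squares fit with one predictor; the function name and the sample values are hypothetical.

```python
def r_squared(x, y):
    """R^2 of the least-squares line y = a*x + b (squared Pearson correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)

# Hypothetical r.m.s.d. values (in Å) and scores for four docked structures.
rmsd = [1.0, 2.0, 3.0, 4.0]
score = [2.0, 4.0, 6.0, 8.0]
print(r_squared(rmsd, score))  # perfectly linear data
```

The function is undefined when all scores or all r.m.s.d. values are identical, since the denominator is then zero.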

The apparently worst case in Table I is the scoring for 2SNI, with the first near-native structure having rank 2281 (Table I). This is also the only protein for which no near-native structure has been found in the 1000 top-scoring complexes. The lowest r.m.s.d. of 3.1 Å in the ensemble is slightly higher than our cutoff for near-native structures, indicating that not only the scoring but also the docking failed on this protein. Still, even for this protein the scoring function discriminates fairly well between low- and high-r.m.s.d. structures (Figure 1). However, many false-positive structures in the r.m.s.d. range 5–10 Å are found. This stems from the interface predictor, which has a significant offset to the real interface. Therefore, the highest scoring complex positions the inhibitor close to, but significantly offset from the real structure. Overall, the results shown in Table I and Figure 1 suggest that for none of the 21 complexes analyzed did the scoring fail completely. A complete failure would be an anti-correlation between the score and the r.m.s.d. or a probability of p > 0.5. The ability of ToF(E → I) to distinguish between near-native and non-native structures without taking into account a prediction of the binding site of the second protein apparently stems from a feature of protein–protein complexes. From all possible orientations at the binding site, the tightest fitting orientation appears to be the correct one. This leads to a lock-and-key mechanism of protein recognition, once the correct binding site is found in an early stage of complexation. Structural changes in the sense of an induced fit mechanism might have to occur to realize the tight fit at the binding site. Still, our scoring function might be improved by taking energy considerations into account.

Docking of homology models gives valuable information, even if the r.m.s.d. to the true complex structure is rather high. Energy-based functions rely on a folding funnel, in which the true conformation is at a low energy and all non-native conformations are of similar, but higher, energy (Gray et al., 2003). Therefore, the applicability of high-resolution energy functions for docking of homology models is problematic, as homology models describe the surface at low resolution. The funnel-like behavior might not be observed in these cases. As our function is not reliant on a high-resolution description of the surface, the linear relationship between the ToF score and r.m.s.d. should be retained even for homology models (provided that the interface is predicted correctly at a narrow region).

Examining the performance of ToF on selected protein complexes

Here we will show three specific cases, 1AVW, 1ACB and 1CSE, and discuss the performance of our new score in more detail. In all three cases, the location of the interface is correctly predicted by ProMate, albeit with a small offset for 1CSE (Figures 2–4). Still, the rank of the first near-native structure for the three is 1, 11 and 500, respectively, with the best r.m.s.d. within the first 10 results being 2.1, 6.7 and 6.4 Å, respectively. Clearly, we obtain the best result for 1AVW, with both proteins being properly oriented relative to each other. For the other two, the angular orientation of the inhibitor is rotated relative to the real complex structure, while the translational orientation is correct. It is interesting to compare the spread of the r.m.s.d. of the docking results versus the score (Figure 1) with the spread in the center of mass (Figures 2–4). For 1AVW the predicted interface is located around the correct center, although with a larger spread in comparison with 1ACB. For 1AVW, structures with an r.m.s.d. of up to 15 Å display low scores (Figure 1). This is caused by two factors: on the one hand the predicted interface is rather large, allowing also slightly offset conformations to score well; on the other hand, the binding partner is large, so that an error in the angular orientation results in a large r.m.s.d. Despite these two factors, the top-scoring conformation has an r.m.s.d. of 2.1 Å to the real structure, underlining the power of our scoring function. For 1ACB, the interface prediction is excellent. Still, the first near-native structure is only at rank 11. This is caused by tight fitting, yet wrongly rotated conformations at the predicted interface (Figure 3). Here, other scoring functions or biological data might help to distinguish between these possibilities. This also demonstrates that the quality of the interface prediction can be judged from the R² value of the linear fit: a strictly linear relationship as, for example, for 1ACB or 1CHO indicates a good prediction of the interface, while deviations from linearity as, for example, for 1AVW and 1MAH indicate a rather broad predicted interface. Combination with measuring the tightness of fit at the predicted interface nevertheless enables good results to be obtained for these predictions. For 1CSE, the binding site prediction is slightly offset. This, together with the existence of tightly fitting, yet wrongly rotated, conformations leads to a poor performance, with the first near-native structure at rank 500. However, since the binding site is identified nearly correctly, even the non-native results can improve our understanding of protein–protein interactions and can guide experiments.



Fig. 2. Performance of scoring function for 1AVW. (A) The geometric centers of the inhibitor (dark balls) and the surface of the enzyme (gray) of all docking results (left), of the 100 highest scoring results (middle) and of the correct structure (right, light gray) are shown. The scoring function correctly extracts complexes that are centered around the protein–protein interface, yet with large fluctuations around the true geometric center. (B) The superposition between the real structure (light gray) and the top-scoring structure (dark gray) is shown in side view (left) and top view (right). The r.m.s.d. is 2.1 Å, underlining the power of the introduced approach.

 


Fig. 4. Performance of scoring function for 1CSE. (A) The geometric centers of the inhibitor (gray balls) and the surface of the enzyme (gray) of all docking results (left) and of the 100 highest scoring results (right) are shown. The experimentally determined geometric center is shown in light gray (right). The scoring function extracts complexes that are slightly offset from the real structure. (B) The superposition between the real structure (light gray) and the best structure in the top 10 ranks (dark gray) is shown in side view (left) and top view (right). As seen already for 1ACB, the rotational orientation is wrong. The total Cα r.m.s.d. between the two structures is 6.3 Å. (C) The superposition between the real structure (light gray) and the first near-native structure (dark gray) is shown. Since the scoring function prefers structures which are slightly offset from the real structure, the first near-native structure appears only at rank 500. For this structure, both rotation and translation are nearly correct. The r.m.s.d. between the two structures is 2.4 Å. Nevertheless, most of the first 500 structures are located at the right interface, although with an offset and a wrong angular orientation.

 


Fig. 3. Performance of scoring function for 1ACB. (A) The geometric centers of the inhibitor (balls) and the surface of the enzyme of all docking results (left) and of the 100 highest scoring results (right) are shown. The experimentally determined geometric center is shown in light gray (right). The scoring function correctly extracts complexes that are centered around the protein–protein interface. (B) The superposition between the real structure (light gray) and the best structure in the top 10 ranks (dark gray) is shown in side view (left) and top view (right). While the structure is centered at the binding site, the rotational orientation is wrong. The total Cα r.m.s.d. between the two structures is 6.7 Å. (C) The superposition between the real structure (light gray) and the first near-native structure (dark gray) is shown. Both rotation and translation are nearly correct for the near-native structure. Since the scoring function prefers complexes at the correct binding site, the first near-native structure has the low rank of 11. The r.m.s.d. between the two structures is 2.3 Å.

 
Using the ToF score without a predicted binding site

A valid question to be asked is whether a score, which is not dependent on any predicted binding site, but only on the tightness of fit of the two proteins, would perform well. If this were true, only the correct binding site would allow a tight fit between the protein and its partner. In order to answer this question, a modified score has been calculated for the successfully docked protein complexes of decoy set I. The modified score does not take into account the binding site prediction, but calculates the normalized average minimum distance of the lowest 10% of all distances between the two proteins. This score cannot discriminate between near-native and non-native structures (Figure 5). While ToF clearly distinguishes between near-native and non-native structures and has a linear relation with the r.m.s.d., the normalized averaged 10% smallest distances between the two proteins are not related to the r.m.s.d. Therefore, the correct orientation of the two proteins is not the tightest fit possible between the two proteins, but the tightest fit possible at the binding site of the larger protein. This conclusion can be justified from the two-step mechanism suggested for protein complexation (Schreiber, 2002). In the first step, the protein surface is roughly scanned for patches that are suitable for binding. This preliminary scan leads to the formation of an encounter complex, where the relative orientation of the two proteins is already near-native, but short-range interactions are not yet formed. The second step scans the binding site for the best possible fit, leading to the final complex. As shown earlier by both our group and others (Lo Conte et al., 1999; Ma et al., 2003), certain characteristics, such as hydrophobicity, atom density and the potential to form specific salt bridges, determine potential binding sites. These are exactly the attributes that allow us to predict binding sites from unbound structures. Hence our procedure, which first scans the protein surface for potential binding sites and thereafter scans the binding site for the tightest fit, emulates the formation of protein–protein complexes.
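The site-independent control score can be illustrated with a toy example. The sketch below is an assumption-laden reading of the description above: the 10% smallest minimum Cα distances of protein 1 to protein 2 are averaged and divided by the average over all surface residues. The function names and coordinates are hypothetical.

```python
from math import dist

def min_distances(coords_a, coords_b):
    """Minimum distance from each point in coords_a to any point in coords_b."""
    return [min(dist(a, b) for b in coords_b) for a in coords_a]

def site_free_score(surface_ca, partner_ca, fraction=0.1):
    """Average of the smallest `fraction` of minimum distances, normalized
    by the average over all surface residues (assumed normalization)."""
    d = sorted(min_distances(surface_ca, partner_ca))
    k = max(1, int(len(d) * fraction))
    return (sum(d[:k]) / k) / (sum(d) / len(d))

# Toy protein 1: ten surface C-alpha atoms on a line.
surface = [(10.0 * i, 0.0, 0.0) for i in range(10)]
pose_a = [(0.0, 4.0, 0.0)]   # partner bound at one end
pose_b = [(90.0, 4.0, 0.0)]  # partner bound equally tightly at the other end
print(site_free_score(surface, pose_a) == site_free_score(surface, pose_b))
```

The two poses touch opposite ends of the toy surface equally tightly and receive the same score, which mirrors the observation that tightness alone, without a predicted site, does not identify where the partner binds.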



Fig. 5. Comparison of ToF with the normalized averaged 10% closest distances. Whereas ToF (squares) scales with the r.m.s.d. of the docked structure to the complex and thus distinguishes between low- and high-r.m.s.d. structures, no such relation can be found when considering just the 10% tightest fitting residues without regard to the predicted binding site (crosses).

 

    Acknowledgements
 
We thank Miri Eisenstein for helpful discussions, Graham Smith and Michael Sternberg for the parallel version of FT-Dock and Tal Peleg-Shulman for critically reading the manuscript. K.-E.G. is supported by a Minerva Fellowship.


    References
 
Camacho,C.J., Gatchell,D.W., Kimura,S.R. and Vajda,S. (2000) Proteins, 40, 525–537.

Chance,M.R. et al. (2002) Protein Sci., 11, 723–738.

Chen,R., Mintseris,J., Janin,J. and Weng,Z. (2003) Proteins, 52, 88–91.

Gabb,H.A., Jackson,R.M. and Sternberg,M.J. (1997) J. Mol. Biol., 272, 106–120.

Gray,J.J., Moughon,S., Wang,C., Schueler-Furman,O., Kuhlman,B., Rohl,C.A. and Baker,D. (2003) J. Mol. Biol., 331, 281–299.

Halperin,I., Ma,B., Wolfson,H. and Nussinov,R. (2002) Proteins, 47, 409–443.

Heifetz,A., Katchalski-Katzir,E. and Eisenstein,M. (2002) Protein Sci., 11, 571–587.

Jackson,R.M. and Sternberg,M.J. (1995) J. Mol. Biol., 250, 258–275.

Jackson,R.M., Gabb,H.A. and Sternberg,M.J. (1998) J. Mol. Biol., 276, 265–285.

Katchalski-Katzir,E., Shariv,I., Eisenstein,M., Friesem,A.A., Aflalo,C. and Vakser,I.A. (1992) Proc. Natl Acad. Sci. USA, 89, 2195–2199.

King,B.L., Vajda,S. and DeLisi,C. (1996) FEBS Lett., 384, 87–91.

Lo Conte,L., Chothia,C. and Janin,J. (1999) J. Mol. Biol., 285, 2177–2198.

Lorber,D.M., Udo,M.K. and Shoichet,B.K. (2002) Protein Sci., 11, 1393–1408.

Ma,B., Elkayam,T., Wolfson,H. and Nussinov,R. (2003) Proc. Natl Acad. Sci. USA, 100, 5772–5777.

Marti-Renom,M.A., Stuart,A.C., Fiser,A., Sanchez,R., Melo,F. and Sali,A. (2000) Annu. Rev. Biophys. Biomol. Struct., 29, 291–325.

McConkey,B.J., Sobolev,V. and Edelman,M. (2002) Curr. Sci., 83, 845–856.

Moont,G., Gabb,H.A. and Sternberg,M.J. (1999) Proteins, 35, 364–373.

Neuvirth,H. and Schreiber,G. (2004) J. Mol. Biol., in press.

Norel,R., Lin,S.L., Wolfson,H.J. and Nussinov,R. (1994) Biopolymers, 34, 933–940.

Norel,R., Lin,S.L., Wolfson,H.J. and Nussinov,R. (1995) J. Mol. Biol., 252, 263–273.

Norel,R., Petrey,D., Wolfson,H.J. and Nussinov,R. (1999) Proteins, 36, 307–317.

Norel,R., Sheinerman,F., Petrey,D. and Honig,B. (2001) Protein Sci., 10, 2147–2161.

Rosenfeld,R., Vajda,S. and DeLisi,C. (1995) Annu. Rev. Biophys. Biomol. Struct., 24, 677–700.

Sali,A., Glaeser,R., Earnest,T. and Baumeister,W. (2003) Nature, 422, 216–225.

Sandak,B., Nussinov,R. and Wolfson,H.J. (1998) J. Comput. Biol., 5, 631–654.

Schreiber,G. (2002) Curr. Opin. Struct. Biol., 12, 41–47.

Smith,G.R. and Sternberg,M.J. (2002) Curr. Opin. Struct. Biol., 12, 28–35.

Tovchigrechko,A., Wells,C.A. and Vakser,I.A. (2002) Protein Sci., 11, 1888–1896.

Vakser,I.A. (1995) Protein Eng., 8, 371–377.

Vakser,I.A. (1996) Biopolymers, 39, 455–464.

Walls,P.H. and Sternberg,M.J. (1992) J. Mol. Biol., 228, 277–297.

Received November 26, 2003; revised January 22, 2004; accepted January 22, 2004. Edited by Alan Fersht