Center for Biomedical Engineering, Beijing Polytechnic University, Beijing 100022, China
Abstract |
---|
Keywords: docking/entropy/intermolecular interactions/protein association
Introduction |
---|
![]() | (1) |
Subsequently, Zhang et al. put forward a binding free energy function based on the atomic contact energy (Zhang et al., 1997). The binding free energy is estimated by
![]() | (2) |
In addition, Xu et al. (1997) devised a function in terms of the hydrophilic number and the molecular surface:
![]() | (3) |
In general, entropy loss is an indispensable contribution to the binding free energy. The entropy calculation is difficult, however, since it depends on the complete phase space of a molecular system and is sensitive to the inclusion of correlations between motions along the many degrees of freedom (Karplus and Kushick, 1981; Di Nola et al., 1984). Pickett and Sternberg developed an empirical scale to estimate the side-chain conformational entropy loss (Pickett and Sternberg, 1993). In this entropy scale, the maximum conformational entropy, Sc, of each side chain was calculated by the classical expression
![]() | (4) |
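Equation (4) itself is not reproduced here. As a minimal sketch, assuming the "classical expression" is the standard Gibbs form Sc = -R Σ p_i ln p_i over the populations p_i of the side-chain rotameric states, the calculation could look like the following. The function name, the choice of units and the normalization step are illustrative assumptions, not taken from the paper.

```python
import math

# Gas constant in kcal/(mol*K); the unit choice is an assumption for illustration.
R_KCAL = 1.987e-3


def conformational_entropy(populations, gas_constant=R_KCAL):
    """Return S = -R * sum(p_i * ln p_i) for rotamer populations p_i.

    `populations` may be unnormalized weights; they are normalized here.
    """
    if any(p < 0 for p in populations):
        raise ValueError("populations must be non-negative")
    total = sum(populations)
    if total <= 0:
        raise ValueError("populations must sum to a positive value")
    probs = [p / total for p in populations]
    # States with zero population contribute nothing (p ln p -> 0 as p -> 0).
    return -gas_constant * sum(p * math.log(p) for p in probs if p > 0)
```

Under this form, a side chain with n equally populated rotamers has the maximum entropy R ln n, and a side chain locked into a single rotamer has zero conformational entropy.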
To avoid the complicated calculation of conformational entropy while still accounting for its effect on the binding free energy, we derived a simple and effective empirical scale for the conformational entropy and the binding free energy from an analysis of protein interfaces. In this study, we analyzed the binding interfaces of 20 protein complexes and extracted three variables describing the interface: the side-chain accessible number (Nb), the number of hydrophilic pairs (Npair) and the buried apolar solvent-accessible surface area of the complex interface (ASAapol). The empirical scale in terms of these three variables was then established by linear fitting to experimental free energy data. In addition, the scale was applied as a score function in the docking of 10 protein complexes. Finally, the feasibility and shortcomings of our empirical method are discussed.
Systems and methods |
---|
The side-chain accessible number, Nb, was taken as the number of contacting residues in the interface; a contacting residue was defined by the effective accessibility (RA) of its side chain, calculated by
![]() | (5) |
The number of hydrophilic pairs, Npair, was defined by the distance between the critical points of hydrophilic atoms, which lie approximately at the centers of their contact surfaces (Lin et al., 1994). If the distance between two hydrophilic atoms was <2.8 Å (the diameter of the solvent probe), the two atoms were treated as a hydrophilic pair.
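The distance criterion for a hydrophilic pair can be sketched as a simple pair count. This is only an illustration: it assumes the critical points have already been computed and are supplied as (x, y, z) tuples in Å, one list per protein, and it counts every receptor/ligand combination below the cutoff, so one atom may take part in several pairs. The paper does not say whether that is allowed, and the function name is hypothetical.

```python
import math

PAIR_CUTOFF = 2.8  # Å, the cutoff quoted in the text


def count_hydrophilic_pairs(receptor_points, ligand_points, cutoff=PAIR_CUTOFF):
    """Count receptor/ligand critical-point pairs closer than `cutoff`.

    Each point is an (x, y, z) tuple in Å. Multiple pairs per atom are
    allowed here, which is an assumption of this sketch.
    """
    count = 0
    for r in receptor_points:
        for l in ligand_points:
            if math.dist(r, l) < cutoff:
                count += 1
    return count
```

The double loop is quadratic in the number of points; for interfaces of the size considered here that is unlikely to matter, but a spatial grid would be the usual choice for larger inputs.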
To examine the model described above, 10 complexes with experimentally determined structures were selected as a test set for molecular docking. The soft protein-protein docking algorithm (C.H.Li et al., in preparation) developed in our group was used for the test; it is based on the simplified protein models of Janin's rigid-body protein-protein docking algorithm (Cherfils et al., 1991, 1994; Cherfils and Janin, 1993). A partial binding space, comprising part of the receptor surface and the complete surface of the ligand, was searched, yielding 3 × 10^4 different modes of contact between the two proteins for each case. After filtering and clustering analysis, about 300 binding modes were retained. The binding free energy was then used to score the retained binding modes.
Results and discussion |
---|
Conformational entropy can affect the binding free energy of a protein and its ligand, just as it can drive protein folding. A major unfavorable entropy effect arises from the reduction in the number of accessible conformations available to the protein backbone and side chains. As an approximation, we assume that the backbone has the same conformational entropy in all folded conformations. Therefore, only the entropy loss from a side chain is taken into account, and only when the accessibility of that side chain is more than 60% of the standard side-chain surface area. When the values of the side-chain accessible number, Nb, are fitted to the side-chain conformational entropy loss given by Pickett and Sternberg's empirical scale, the linear fitting function is given by
![]() | (6) |
Figure 1 shows a linear fit of the side-chain conformational entropy (TΔS) versus Nb. Nb correlates very well with the TΔS values. Therefore, Nb can be used to represent the side-chain conformational entropy loss in the protein-protein binding process.
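A one-variable fit of this kind can be sketched as an ordinary least-squares line, as below. The helper name is hypothetical, and the paper does not state which fitting procedure was used, so plain least squares is an assumption. No values from the paper are reproduced; callers supply their own Nb and TΔS lists.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r), where r is the Pearson correlation
    coefficient (NaN if all y values are identical).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy) if syy > 0 else float("nan")
    return slope, intercept, r
```

Here `xs` would hold the Nb value of each complex and `ys` the corresponding TΔS from the Pickett and Sternberg scale; a correlation coefficient close to 1 in magnitude is what "correlates very well" would mean in this setting.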
As mentioned above, Nb, Npair and ΔASAapol are related to the interface of protein complexes and correlate well with the conformational entropy change, the electrostatic interaction and the hydrophobic interaction, respectively. When the protein-protein binding free energy, ΔGcal, is written as a linear function of the three variables Nb, Npair and ΔASAapol, ΔGcal can be expressed as
![]() | (7) |
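The fitted coefficients of equation (7) are not reproduced here, so the following is only a sketch of how a linear scale in the three interface variables could be evaluated. The class name, the field names and the presence of a constant term are assumptions. The weights have no defaults, because the paper's fitted values are not available in this text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearScale:
    """Linear free energy scale; all coefficients must be supplied by the caller."""

    w_nb: float        # weight of the side-chain accessible number, Nb
    w_npair: float     # weight of the number of hydrophilic pairs, Npair
    w_asa: float       # weight of the buried apolar surface area
    intercept: float = 0.0  # constant term; its presence is an assumption

    def delta_g(self, nb, npair, asa_apol):
        """Evaluate the linear combination for one interface."""
        return (
            self.w_nb * nb
            + self.w_npair * npair
            + self.w_asa * asa_apol
            + self.intercept
        )
```

Keeping the coefficients in one immutable object makes it straightforward to score every retained docking mode with the same scale.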
Currently, the rescoring of docked conformations has made some progress and has been used to rescore conformations with lower root mean square deviation (r.m.s.d.) (Norel et al., 2001; Smith and Sternberg, 2002). The main terms used in the rescoring are the statistics of residue-residue contacts across the interfaces of complexes and electrostatics. As discussed above, we have presented an empirical method based on three variables extracted from the binding interface. The calculation of the free energy of protein-protein association with this method is quick and accurate. In particular, the conformational entropy is taken into account, and the analysis above supports the accuracy of this term. We therefore applied this approach as a scoring function to rank the putative docked structures in the protein-protein docking problem.
Table IV summarizes the docking results for the 10 protein-protein complexes, including the name of each complex, the ranking position of the first near-native structure using our scoring function and the corresponding r.m.s.d. from the X-ray crystallographic complex. For the first six cases, the complexes were reconstructed from the structures of the co-crystallized proteins. In these cases, the conformations of the two molecules are already adapted to each other. For this set of docking simulations, XX was added after the PDB code in the protein column. For the following two cases, the complexes were reconstructed from structures in which one protein is taken from the complex and the other from the free form. For this set of docking simulations, FX or XF was added after the PDB code, where F and X designate the free form and the co-crystallized form, respectively. If the complex was reconstructed from the free forms of both proteins, FF was added to the PDB code. A docked geometry is taken into account only if the r.m.s.d. of its backbone atoms from the X-ray structure is <4.0 Å. Native-like docked geometries are found for all 10 tested complexes, and for six of them within the 10 top-ranking solutions. This indicates that our scoring function is able to distinguish the true binding mode from the remaining false ones.
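The ranking statistic reported for each complex can be sketched as follows. This assumes each retained binding mode is given as a (score, r.m.s.d.) pair and that a lower, more negative, calculated free energy ranks first. The function name and the input layout are illustrative, not taken from the paper.

```python
def first_near_native(candidates, rmsd_cutoff=4.0):
    """Return (rank, rmsd) of the best-scored near-native binding mode.

    `candidates` is an iterable of (score, rmsd) pairs, with rmsd in Å.
    Modes are ranked by ascending score (assumption: lower free energy is
    better). Returns None if no mode has rmsd below `rmsd_cutoff`.
    """
    ranked = sorted(candidates, key=lambda c: c[0])
    for rank, (_score, rmsd) in enumerate(ranked, start=1):
        if rmsd < rmsd_cutoff:
            return rank, rmsd
    return None
```

With this helper, a complex counts toward the "within the 10 top-ranking solutions" tally when the returned rank is 10 or lower.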
Conclusions
The interface information for protein-protein complexes is important for understanding protein-protein interactions and recognition. In this work, we identified useful variables from the interfaces and developed a simple scale to calculate the binding free energy of protein-protein association. The scale was used as a scoring function in protein-protein docking calculations. As discussed above, the side-chain accessible number, Nb, is a reasonable descriptor of the loss of side-chain conformational entropy in the binding process. The interface information for complexes has great potential for describing protein-protein association, and the corresponding three variables can be used to calculate the binding free energy. The model is advantageous in terms of saving calculation time and ease of use. However, the binding free energy function presented here is based on an approximate treatment in which the molecule is treated as a rigid body. It is now necessary to develop both new docking methods for elucidating the details of specific interactions at the atomic level and computational tools for providing information on protein-protein association in various environments (Camacho and Vajda, 2002). The interface information for complexes may offer helpful hints on this subject and suggest ideas about specific associations. Work on improving the accuracy of the binding free energy and on the treatment of molecular flexibility is currently under way.
Notes |
---|
Acknowledgments |
---|
References |
---|
Bernstein,F.C., Koetzle,T.F., Williams,G.J.B., Meyer,E.F., Brice,M.D., Rodgers,J.R., Kennard,O., Shimanouchi,T. and Tasumi,M. (1977) J. Mol. Biol., 112, 535-542.
Camacho,C.J. and Vajda,S. (2002) Curr. Opin. Struct. Biol., 12, 36-40.
Camacho,C.J., Weng,Z., Vajda,S. and DeLisi,C. (1999) Biophys. J., 76, 1166-1178.
Cherfils,J. and Janin,J. (1993) Curr. Opin. Struct. Biol., 3, 265-269.
Cherfils,J., Duquerroy,S. and Janin,J. (1991) Proteins: Struct. Funct. Genet., 11, 271-280.
Cherfils,J., Bizebard,T., Knossow,M. and Janin,J. (1994) Proteins: Struct. Funct. Genet., 18, 8-18.
Di Nola,A., Berendsen,H.J.C. and Edholm,O. (1984) Macromolecules, 17, 2044-2050.
Goodsell,D.S. and Olson,A.J. (1990) Proteins: Struct. Funct. Genet., 8, 195-202.
Jackson,R.M. and Sternberg,M.J. (1995) J. Mol. Biol., 250, 258-275.
Karplus,M. and Kushick,J.N. (1981) Macromolecules, 14, 325-332.
Karplus,M. and Petsko,G.A. (1990) Nature, 347, 631-639.
King,B.L., Vajda,S. and DeLisi,C. (1996) FEBS Lett., 384, 87-91.
Lee,B. and Richards,F.M. (1971) J. Mol. Biol., 55, 379-400.
Lin,S.L., Nussinov,R., Fischer,D. and Wolfson,H.J. (1994) Proteins: Struct. Funct. Genet., 18, 94-101.
Mezei,M. and Beveridge,D.L. (1986) Ann. N. Y. Acad. Sci., 482, 1-23.
Miyamoto,S. and Kollman,P.A. (1993) Proteins: Struct. Funct. Genet., 16, 226-245.
Nauchitel,V., Villaverde,M.C. and Sussman,F. (1995) Protein Sci., 4, 1356-1364.
Norel,R., Sheinerman,F., Petrey,D. and Honig,B. (2001) Protein Sci., 10, 2147-2161.
Novotny,J., Bruccoleri,R.E. and Saul,F.A. (1989) Biochemistry, 28, 4735-4749.
Pickett,S.D. and Sternberg,M.J.E. (1993) J. Mol. Biol., 231, 825-839.
Reynolds,C.A., King,P.M. and Richards,W.G. (1992) Mol. Phys., 76, 251-275.
Sezerman,U., Vajda,S., Cornette,J. and DeLisi,C. (1993) Protein Sci., 2, 1827-1843.
Smith,G.R. and Sternberg,M.J.E. (2002) Curr. Opin. Struct. Biol., 12, 28-35.
Smith,K.C. and Honig,B. (1994) Proteins: Struct. Funct. Genet., 18, 119-132.
Stoddard,B.L. and Koshland,D.E.,Jr. (1993) Proc. Natl Acad. Sci. USA, 90, 1146-1153.
Takamatsu,Y. and Itai,A. (1998) Proteins: Struct. Funct. Genet., 33, 62-73.
Vajda,S., Weng,Z.P., Rosenfeld,R. and DeLisi,C. (1994) Biochemistry, 33, 13977-13988.
Vajda,S., Weng,Z.P. and DeLisi,C. (1995) Protein Sci., 8, 1081-1092.
Vajda,S., Sippl,M. and Novotny,J. (1997) Curr. Opin. Struct. Biol., 2, 222-228.
Weng,Z.P., DeLisi,C. and Vajda,S. (1997) Protein Sci., 6, 1976-1984.
Xu,D., Lin,S.L. and Nussinov,R. (1997) J. Mol. Biol., 265, 68-84.
Zhang,C., Vasmatzis,G., Cornette,J.L. and DeLisi,C. (1997) J. Mol. Biol., 267, 707-726.
Received January 30, 2002; revised April 26, 2002; accepted May 21, 2002.