SOFTDOCK: understanding of molecular recognition through a systematic docking study

Fan Jiang1,2,3, Wei Lin1 and Zihe Rao1

1 Department of Biological Sciences and Biotechnology, Laboratory of Protein Sciences MOE, Tsinghua University, Beijing 100084 and 2 Institute of Physics, Chinese Academy of Sciences, Beijing 100080, China


    Abstract
 
Molecular recognition and docking are essential to the biological functions of proteins. SOFTDOCK was one of the first molecular docking methods developed for protein–protein docking. Its ability to represent the molecular surface with different shapes and properties and to dock a variety of molecular complexes with certain conformational changes was demonstrated in a previous study. In the present work, we studied the effects of the docking parameters through statistical analysis. Seventy-one typical binary complexes of different categories in the PDB were also systematically docked as a test; 57 of them produced correct solutions with one set of docking parameters, whereas the other 14 complexes required adjustment of the docking parameters to decrease the softness of the recognition and hence the background noise. We found that these 14 complexes had special structural features. Our results suggest that a variety of mechanisms, rather than shape complementarity alone, may be involved in molecular recognition, a finding that should help in developing more powerful methods for predicting molecular recognition.

Keywords: interface packing/molecular complex/molecular recognition/SOFTDOCK/systematic docking


    Introduction
 
Biological functions of macromolecules depend on accurate molecular recognition. Molecular recognition processes are often precisely regulated and controlled, which in turn regulates biological processes and pathways in the cell. Broadly, the interaction between two molecules that recognize each other can form a molecular complex, provide the binding energies for enzymatic catalysis, inhibit or activate an enzyme, cause a conformational change in the effector molecule, transmit a signal across the membrane and so on. The molecular mechanism of molecular recognition has been extensively studied both experimentally and theoretically. Recent progress in theoretical studies has been very encouraging, especially in the area of molecular recognition between macromolecules. With the advent of more crystal structures of molecular complexes, detailed characterizations of protein–protein recognition have been conducted (Chothia and Janin, 1975; Argos, 1988; Janin and Chothia, 1990; Janin, 1995, 1996; Jones and Thornton, 1995, 1996, 1997; Laskowski et al., 1996; Tsai et al., 1996; Chothia, 1997; Lo Conte et al., 1999). It has been found that the stabilization of protein association is related to several factors: the size and chemical character of the protein surface that is buried at interfaces, the complementarity and the shape (or roughness) of the contacting surfaces and polar interactions through hydrogen bonds, ionic bonds and/or water molecules.

The principles derived from these studies have been applied to molecular docking of proteins. Methods of molecular docking have also developed rapidly in the last decade. It is generally true that representing a molecule by its atomic coordinates and docking it to another molecule using atomic force fields is not computationally feasible, especially when dealing with proteins. Therefore, various representations of the molecular surface and volume have had to be designed.

The method of representation is often closely related to the search algorithm of the solution space. The search algorithm is a combination of the generation of solutions and the evaluation of the generated solutions. The solution space is defined by the relative rotation and translation between the docked molecules and also the other degrees of freedom included to describe the conformational flexibility of the ligand and receptor. Kuntz and co-workers (Kuntz et al., 1982) first generated a molecular surface (Connolly, 1981, 1983) of a molecule and then represented the surface with a set of spheres as positive or negative images of the surface. The positive image of the ligand was then docked to the negative image of the receptor. The negative image of the receptor can also be used to match with the atomic coordinates of small ligands from a known database. Efficient searching and matching algorithms have been developed for macromolecular docking with the incorporation of volume overlap checking (Shoichet and Kuntz, 1991). Shape and chemical descriptors of the surface have also been developed with this basic representation (Meng et al., 1992; Shoichet et al., 1992; Shoichet and Kuntz, 1993).

Jiang and Kim represented the molecular surface by surface dots with a surface normal attached to each dot (Jiang and Kim, 1991). The surface and volume of a molecule are then digitized into grids called surface cubes and volume cubes. Searching and matching between two sets of molecular surface cubes and volume cubes are then achieved through exhaustive sampling of the rotation and translation space with a fast translation algorithm. In order to accommodate certain conformational flexibility, a soft docking method is used to evaluate the generated solutions. The softness is implemented through (1) varying the cube size and (2) the cone angle cutoff in calculating the local surface complementarity. The advantage of this cube representation is that it contains the most important characteristics of the surface (using surface dots with areas and normals) and the level of detail of the representation varies with the cube size. Moreover, this representation does not contain higher level abstraction and hence is not limited by the set of shape descriptors used in a particular representation in describing a complicated surface. Its disadvantage, however, is that the searching requires more computation.

One type of abstraction is the sparse critical point representation (Lin et al., 1994; Norel et al., 1994a,b, 1995). The high knobs and deep holes of a surface are selected by the extrema of a shape function and the normal vector for each critical point is also calculated, which was found to be crucial in the efficient matching and correct scoring of the generated solutions. Another elegant representation of the molecular surface is to use quadratic shape descriptors (Goldman and Wipke, 2000). The authors suggested that a shape-explicit docking algorithm, that is, when the shape information is used explicitly in generating the solutions, could be more efficient than a shape-implicit algorithm. Their docking results with small ligand and enzyme receptor systems were compared with other methods and found to be better. There are many other examples of molecular surface representations and search algorithms which are combinations of shape-explicit and shape-implicit representations (Lee and Rose, 1985; Katchalski-Katzir et al., 1992; Lawrence and Davis, 1992; Masek et al., 1993; Bohm, 1994; Helmer-Citterich and Tramontano, 1994; Jones et al., 1995; Perkins et al., 1995; Rarey et al., 1996; Sobolev et al., 1996; Given and Gilson, 1998; Stahl and Bohm, 1998; Hou et al., 1999; Palma et al., 2000).

As is well known, most docking algorithms have their shortcomings and cannot work universally on every complexed molecule. Therefore, it is necessary to study when the algorithms are effective and when they are not. So far, only one study has been performed with such a variety and large number of known crystal complexes (Vakser et al., 1999) that are now available in the Protein Data Bank (PDB). In this work, we used the SOFTDOCK program and tested it using different sets of parameters. Through statistical analysis, we selected a set of values for the cube size, the cutoff cone angle and the volume overlap cutoff for docking protein complexes in general. We also identified the most suitable of four candidate functions for calculating the interface complementarity. We then docked a series of 71 known protein–protein complexes found in the PDB (Bernstein et al., 1977). We found that the correct solutions for most complexes could be found by the SOFTDOCK program with a broad range of parameters. However, some complexes with small interface areas (relative to the complex size) and discontinuous interface surfaces required more stringent parameters for calculating the complementarity. We introduced the signal-to-noise ratio as a measure of the tightness of binding at an interface, which seems to be correlated with the shape of the interface and uncorrelated with the size of the interface. Systematic docking of a large set of known protein–protein complexes offered an opportunity to evaluate the SOFTDOCK program and to define an optimum set of parameters for docking unknown complexes and studying molecular recognition.


    Materials and methods
 
The surface and volume representation and the searching algorithms have been described previously (Jiang and Kim, 1991). Briefly, a molecular surface is calculated for a molecule according to Connolly (Connolly, 1981, 1983) with a probe radius of 1.4 Å and a dot density of 4.0/Å², which represents the surface by surface dots with a surface area and surface normal attached to each surface dot. For a given cube size, the surface dots within each surface cube are summed, where the summed surface normal is area-weighted and normalized again. Hence after the summation within each surface cube, the surface dots are represented by the surface cubes. The cubes occupied by the atoms of the molecule but not by surface dots are called volume cubes.
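The summation of surface dots into surface cubes can be sketched as follows. This is a minimal illustration, not the SOFTDOCK code: the function name `dots_to_surface_cubes`, the tuple layout of a dot and the use of integer grid indices as cube identifiers are all assumptions made for the example.

```python
import math
from collections import defaultdict


def dots_to_surface_cubes(dots, cube_size):
    """Sum surface dots into cubes (hypothetical helper, not from SOFTDOCK).

    dots: iterable of (position, normal, area), where position and normal
    are 3-tuples and area is the surface area carried by the dot.
    Returns {cube_index: (summed_area, unit_normal)}.
    """
    # Per cube: [summed area, area-weighted normal sum]
    acc = defaultdict(lambda: [0.0, [0.0, 0.0, 0.0]])
    for pos, normal, area in dots:
        key = tuple(math.floor(c / cube_size) for c in pos)
        cell = acc[key]
        cell[0] += area
        for i in range(3):
            cell[1][i] += area * normal[i]

    cubes = {}
    for key, (area, n) in acc.items():
        length = math.sqrt(sum(c * c for c in n))
        # Re-normalize the summed normal; leave it zero if the normals cancel.
        unit = tuple(c / length for c in n) if length > 0 else (0.0, 0.0, 0.0)
        cubes[key] = (area, unit)
    return cubes


dots = [
    ((0.5, 0.5, 0.5), (1.0, 0.0, 0.0), 1.0),
    ((1.0, 1.0, 1.0), (0.0, 1.0, 0.0), 1.0),
    ((3.0, 0.1, 0.1), (0.0, 0.0, 1.0), 0.5),
]
cubes = dots_to_surface_cubes(dots, cube_size=2.4)
```

With a cube size of 2.4 Å the first two dots fall into the same cube, so their areas add and their normals are averaged; a larger cube size merges more dots and smooths the surface accordingly.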

The algorithm for conducting a complete, exhaustive, six-dimensional search to match two sets of points with properties attached to each point was developed previously and described by Jiang and Kim (Jiang and Kim, 1991). Briefly, the rotational space is sampled uniformly with a given angular step. The sampled rotations, represented in the present study by polar angles, are searched exhaustively. For each rotation applied, a fast translation search is performed, which is also exhaustive. This fast translation algorithm first calculates the difference vector between a pair of points, each from one of the two sets, in Cartesian coordinates and the matching score according to the properties is assigned to this pair of points. Then, the matching score is stored in a three-dimensional matrix, called the difference vector space or the translation vector space, according to the difference vector. Hence all pairs of points from the two sets with the same difference vector are accumulated in the same position in the difference vector space. A local maximum of the accumulated scores in the difference vector space corresponds to a possible good match and its position in the difference vector space gives the corresponding translation vector that should be applied for this good match.
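The accumulation in difference vector space can be illustrated with a short sketch. It assumes that both molecules are already digitized onto the same integer cube grid and that a `pair_score` callable supplies the matching score for a pair of cubes; these names, and the use of a dictionary in place of a three-dimensional matrix, are choices made for the example. For brevity the sketch ranks positions by accumulated score instead of detecting true local maxima.

```python
from collections import defaultdict


def translation_search(probe, target, pair_score):
    """Accumulate pair scores by difference vector (illustrative only).

    probe, target: dicts mapping an integer cube index (i, j, k) to the
    properties attached to that cube.
    Returns {difference_vector: accumulated_score}.
    """
    space = defaultdict(float)
    for t_idx, t_prop in target.items():
        for p_idx, p_prop in probe.items():
            score = pair_score(p_prop, t_prop)
            if score != 0.0:
                diff = tuple(t - p for t, p in zip(t_idx, p_idx))
                space[diff] += score
    return space


def best_translations(space, n=20):
    """Return the n highest-scoring translation vectors (a simplification
    of the local-maximum search described in the text)."""
    return sorted(space.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Toy example: the target is the probe shifted by (2, 3, 1).
probe = {(0, 0, 0): None, (1, 0, 0): None, (0, 1, 0): None}
target = {(2, 3, 1): None, (3, 3, 1): None, (2, 4, 1): None}
space = translation_search(probe, target, lambda p, t: 1.0)
top = best_translations(space, n=1)
```

Every probe cube pairs with its shifted copy under the same difference vector, so the score piles up at (2, 3, 1), while all other difference vectors receive a single contribution each.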

Four function forms for calculating the local complementarity, i.e. the matching score, are shown in Figure 1. The effect of these functions should be similar, as can be seen from their forms. The difference is that the step function used in the previous study cannot be parallelized, whereas the Gaussian function used in the present study can, when such optimization is available. The other functions are also amenable to parallelization. The score function is the sum of the local complementarity scores minus the weight for volume overlap times the volume overlap.
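The structure of the score function can be sketched as below. The exact forms plotted in Figure 1 are not reproduced in the text, so the step and Gaussian functions here are assumed shapes chosen only to illustrate the idea: both take the angle between the probe normal and the reversed target normal, with the cone angle setting the width. All function names are hypothetical.

```python
import math


def normal_angle_deg(n_probe, n_target):
    """Angle in degrees between the probe normal and the reversed target
    normal; 0 means the two unit normals are exactly opposed."""
    dot = -sum(a * b for a, b in zip(n_probe, n_target))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(dot))


def step_score(angle, cone_angle):
    """Assumed step form: full credit inside the cone, none outside."""
    return 1.0 if angle <= cone_angle else 0.0


def gaussian_score(angle, cone_angle):
    """Assumed Gaussian form with the cone angle as its width."""
    return math.exp(-0.5 * (angle / cone_angle) ** 2)


def docking_score(local_scores, volume_overlap, overlap_weight):
    """Sum of local complementarity scores minus the weighted overlap."""
    return sum(local_scores) - overlap_weight * volume_overlap


angle = normal_angle_deg((0.0, 0.0, 1.0), (0.0, 0.0, -1.0))
total = docking_score([1.0, 0.5], volume_overlap=2, overlap_weight=0.25)
```

A smooth function such as the Gaussian avoids the branch in the step function, which is one plausible reason it lends itself better to vectorized evaluation.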



 
Fig. 1. The functions for calculating interface complementarity.

 
For each orientation of the probe, 20 docking solutions are output. Then, all the docking solutions for all the orientations sampled are pooled together. Using the probe geometry in the crystal complex as the reference geometry, all the docking solutions that are within 5 Å and 20° of the reference geometry are considered the correct solutions for this docking, which are extracted and saved to a file. The average and standard deviation of the docking scores for the saved solutions are calculated. The solution with the highest score is used as the peak score to calculate the signal-to-noise ratio for this docking, which is the peak score minus the average and then divided by the standard deviation.
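The selection of correct solutions and the signal-to-noise ratio can be written down in a few lines. This is a sketch under stated assumptions: each solution is taken to be a tuple of score, translational deviation (Å) and rotational deviation (degrees) from the reference geometry, "within" is read as inclusive, and the population standard deviation is used because the text does not say which estimator was applied.

```python
import statistics


def correct_solutions(solutions, max_trans=5.0, max_rot=20.0):
    """Keep solutions within 5 Å and 20° of the reference geometry.

    solutions: iterable of (score, translation_deviation, rotation_deviation).
    """
    return [s for s in solutions if s[1] <= max_trans and s[2] <= max_rot]


def signal_to_noise(scores):
    """(peak - mean) / standard deviation for a list of docking scores."""
    peak = max(scores)
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)  # assumption: population estimator
    if sd == 0:
        raise ValueError("all scores are identical; SNR is undefined")
    return (peak - mean) / sd


solutions = [
    (10.0, 1.2, 5.0),
    (4.0, 4.9, 19.0),
    (3.0, 2.0, 10.0),
    (2.0, 3.0, 12.0),
    (1.0, 0.5, 3.0),
    (9.0, 12.0, 45.0),  # too far from the reference geometry
]
kept = correct_solutions(solutions)
snr = signal_to_noise([s[0] for s in kept])
```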

We used the signal-to-noise ratio to evaluate the effect of the selected parameters. The signal-to-noise ratio of a crystal complex was calculated in the following way. All possible rotations within 30° were searched, that is, the polar angle phi ranged from 0 to 360°, psi from 0 to 180° and chi from 0 to 30° with 10° steps. In the translation search, the whole surfaces of the probe and the target molecules were included. After the search, the solutions close to the crystal complex were extracted and the maximum score was found and its signal-to-noise ratio was calculated, as described above. For reference, the number of raw solutions generated for each complex tested is about 20 000. The number of correct solutions found may vary depending on the docking parameters and the individual complexes.

We used the list of crystal complexes as given by Lo Conte et al. (Lo Conte et al., 1999) to retrieve the corresponding atomic coordinates when they were available in the PDB (www.rcsb.org) (Bernstein et al., 1977). Each binary complex was separated into a probe molecule and a target molecule for docking according to the biological interacting pairs.

The SOFTDOCK package consists of many programs. These programs can be run in batch mode with C shell and awk scripts in a Unix or Linux environment. For further understanding of docking rules, the atomic details of the complex interfaces were examined with short C and Perl programs. The numbers of buried main-chain atoms and total buried atoms of the interface were summed. %M (percentage of main-chain atoms among total atoms of interfaces) was calculated in the following way. The atoms of both component molecules were extracted if the distance between the probe atom and the target atom was <4 Å. Among these atoms, %M was defined as the percentage of main-chain atoms among total extracted atoms.
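The %M calculation can be sketched as follows. The original analysis used C and Perl programs that are not shown; this Python version is only an illustration, and it assumes that main-chain atoms are those named N, CA, C and O in PDB nomenclature, which the text does not state explicitly.

```python
import math

# Assumption: backbone atoms identified by their PDB atom names.
MAIN_CHAIN = {"N", "CA", "C", "O"}


def percent_main_chain(probe_atoms, target_atoms, cutoff=4.0):
    """Percentage of main-chain atoms among interface atoms.

    probe_atoms, target_atoms: lists of (atom_name, (x, y, z)).
    An atom belongs to the interface if it lies closer than `cutoff`
    (in Å) to any atom of the other molecule.
    """
    probe_hits, target_hits = set(), set()
    for i, (_, p_xyz) in enumerate(probe_atoms):
        for j, (_, t_xyz) in enumerate(target_atoms):
            if math.dist(p_xyz, t_xyz) < cutoff:
                probe_hits.add(i)
                target_hits.add(j)

    names = [probe_atoms[i][0] for i in probe_hits]
    names += [target_atoms[j][0] for j in target_hits]
    if not names:
        return 0.0
    return 100.0 * sum(name in MAIN_CHAIN for name in names) / len(names)


probe = [("CA", (0.0, 0.0, 0.0)), ("CB", (10.0, 0.0, 0.0))]
target = [("O", (3.0, 0.0, 0.0)), ("CG", (0.0, 3.5, 0.0)), ("N", (20.0, 0.0, 0.0))]
pct_m = percent_main_chain(probe, target)
```

The double loop is quadratic in the number of atoms, which is acceptable for a single complex; a grid or k-d tree lookup would be the usual choice for larger batches.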


    Results
 
Effects and summary of the docking parameters

We first used 10 crystal complexes to test the parameters systematically, namely 1AVW, 1BRS, 1IGC, 1NCA, 1TX4, 1VFB, 1YCS, 1YDR, 2TRC and 4HTC. They represent a variety of complexes with different biological functions. The results for testing the effect of the cube size, cone angle and scoring function type are shown in Table I. It can be seen that the signal-to-noise ratio (SNR), which is a measure of the interface complementarity, increases when the cube size and the cone angle decrease. When the cube size is large, the surface areas and normals are averaged within each cube and therefore the molecular surface is smoothed and the detailed features are removed. Hence it is not surprising that the calculated interface complementarity decreases.


 
Table I. Effect of different cube sizes and cone angles on the calculated interface complementarity
 
Small cube sizes and cone angles correspond to more stringent criteria for calculating the complementarity and hence give a more sensitive measure of the complementarity. In contrast, large cube sizes and cone angles correspond to more relaxed criteria. As a result, surfaces with only approximate complementarity are included in the calculation and contribute to the background noise, which can sometimes compete with the true signal. The balance between the background noise and the true signal is related to the function for calculating the interface complementarity and the cone angle. Since the complementarity between two interacting surfaces cannot be perfect, we should not conclude that the optimum cone angle is zero. In subsequent tests, we chose the cone angle to be 40°.

Next, we tested the effect of the different functions for calculating the interface complementarity. Four representative functions are shown in Figure 1. Function No. 1, the step function, was used in the previous study (Jiang and Kim, 1991). One of the motivations for changing the function form was to be able to parallelize the algorithm so that the SOFTDOCK program could run faster when vectorization is available on multi-processor machines. For a cube size of 2.4 Å and a volume cutoff of 1500, on a Pentium III 450 MHz 128 MB Linux system, a typical run for one of the 71 complexes takes about 5 min. The corresponding results are shown in Table II. The SNRs obtained for the four functions are similar without significant variation. This is understandable because these four functions are similar to each other in their forms and the Gaussian function (No. 4) is most similar to the original step function (No. 1). The Gaussian function was used in almost all of our tests.


 
Table II. Effect of different functions on the calculated interface complementarity
 
Systematic docking of known protein–protein complexes

Seventy-one protein–protein complexes were selected from Lo Conte et al. (Lo Conte et al., 1999), namely 1A2K, 1ACB, 1AGR, 1AK4, 1AIP, 1AO7, 1ATN, 1AVW, 1BRS, 1BTH, 1CBW, 1CHO, 1CSE, 1DAN, 1DHK, 1DFJ, 1DVF, 1DKG, 1EBP, 1EFN, 1EFU, 1FBI, 1FC2, 1FIN, 1FLE, 1FSS, 1GG2, 1GLA, 1GOT, 1GUA, 1HIA, 1HWG, 1IAI, 1IGC, 1JHL, 1KB5, 1MCT, 1MEL, 1MKW, 1MLC, 1NCA, 1NFD, 1NMB, 1NSN, 1OSP, 1PPF, 1SEB, 1STF, 1TBQ, 1TCO, 1TGS, 1TOC, 1TX4, 1UDI, 1VFB, 1YCS, 1YDR, 2BTF, 2JEL, 2KAI, 2PCC, 2PTC, 2SIC, 2SNI, 2TRC, 3HFL, 3HFM, 3SGB, 3TPI, 4CPA and 4HTC. Among them, the probe and target molecules for 1AO7, 1TX4 and 2PTC are ABC:DE (molecule IDs), B:A and I:E, respectively. Docking was performed for each complex using a cube size of 2.4 Å, a cone angle of 40°, the Gaussian function for calculating the interface complementarity and a volume overlap cutoff of 1500. As summarized above, this parameter set implements enough `softness' to work on most complexes tested. The volume overlap cutoff of 1500 was chosen by trial and is usually about the minimum size of the interface area of a generic complex. A lower value of the volume overlap cutoff may lead to the exclusion of correct solutions.

Among the 71 complexes, 57 had correct solutions and their corresponding SNRs were calculated. The correlation coefficient between the SNR and the ratio of the interface area to the total number of atoms of the complex is 0.36, which means that there is no significant correlation. The same is true for the SNR and the interface area, for which the correlation coefficient is 0.34. The interface area represents the absolute size of the interface. The ratio of the interface area to the total number of atoms of the complex represents the size of the interface relative to the size of the complex. Here we used the total number of atoms to approximate the total surface area of a complex. The SNR can be thought of as a measure of the degree of interface complementarity. Hence the degree of interface complementarity can vary from one complex to another independently of the size of the interface. In other words, the SNR reflects the shape characteristics of the interface more than its size. It could be thought of as a measure of the tightness of the binding of a complex.

Of the 71 complexes, 14 did not give correct solutions with the chosen parameter set (2.4, 40, 1500, Gaussian function). These 14 complexes were 1MKW, 1OSP, 1NCA, 1NMB, 1IAI, 1NFD, 1DFJ, 1GLA, 2PCC, 1AIP, 1AK4, 2BTF, 1SEB and 1YCS. We found that these 14 complexes had lower ratios of the interface area to the total number of atoms than the 57 complexes whose correct solutions could be found. A histogram of these two sets of complexes as a function of this ratio is shown in Figure 2. The histogram clearly shows two profiles with well-separated peaks for the two sets. Since we used the whole molecular surfaces of the probe and the target in our docking, the background noise will increase as the relative size of the interface to the whole complex decreases. To verify this explanation, we chose another set of parameters (1.6, 20, 1500, Gaussian function) with more stringent criteria for calculating the complementarity to reduce the background noise and enhance the true signal. We found the correct solutions for 10 of the 14 complexes, i.e. missing only four complexes, namely 1MKW, 1GLA, 1DKG and 2PCC (data not shown).



 
Fig. 2. Histogram of the ratio of the interface area to the total number of atoms in the complex versus the number of complexes tested.

 
To analyze the docking results further, we examined the complexes more carefully using computer graphics, especially the 14 complexes which did not give the correct solutions using the first set of parameters. First, we noticed that the 14 complexes were determined mostly at low to medium resolutions around 2.5 Å. For example, 2BTF was determined at 2.5, 1SEB at 2.7, 1IAI at 2.9, 1AIP at 3.0, 1DFJ at 2.5, 1AK4 at 2.4, 1NMB at 2.5, 1DKG at 2.8 and 1MKW at 2.3 Å. However, since 2.5 Å is close to the average resolution of the 71 complexes, the resolution of the structures may have only a minor effect on the docking results. Second, special structural features were noticed for these 14 complexes. For example, complexes 1OSP, 1NFD and 1AK4 use a single loop in the binding site whereas common molecular complexes use more segments with secondary structures (Jones and Thornton, 1996). Third, the packing in the interface for these 14 complexes is relatively loose and is often made of more than two patches of surfaces so that channels or cavities could be seen from side views of the interface on computer graphics. This characteristic was observed for 2BTF, 1DFJ, 1YCS, 1AK4, 1DKG, 1GLA and 1MKW. Complex 1MKW is one of the most significant examples, in which four short loops from one molecule bind to the surface of the other molecule as if fingers `grab' on to several patches of a surface leaving many holes or cavities between the fingers. To measure the packing properties quantitatively, we employ the percentage of buried atoms in the interface [%B, as defined and calculated in Lo Conte et al. (Lo Conte et al., 1999)] and the percentage of buried main-chain atoms (%M) as two measures of the tightness of interface packing. The results are listed in Table III. The average %B is 18.8 for the 14 complexes, which is significantly less (by 1σ, which is 9) than the average of 26 for all the complexes (including the 14 complexes). The average %M is 13.5% for the 14 complexes, which is also significantly less than the average of 17.3% for all.


 
Table III. Percentage of buried atoms and percentage of main-chain atoms in the interface
 
Further analyses of the chemical characteristics of the interface based on data from Lo Conte et al. (Lo Conte et al., 1999) were performed and the results are shown in Table III. Compared with the average over the 71 complexes, these 14 complexes have in general smaller interface areas, fewer non-polar atoms and thus fewer hydrophobic interactions, more charged atoms and thus more electrostatic interactions, fewer water molecules and fewer hydrogen bonds. This suggests that for these 14 complexes, electrostatic interaction is more important than hydrophobic interaction and hydrogen bonding.

It is worth noting that complex 2PCC (cytochrome peroxidase–cytochrome) is an electron transport system and the molecular recognition between the two component molecules is dominated by electrostatic interactions. Other examples of molecular recognition by electrostatic interactions have been found by Botti et al. (Botti et al., 1998). For this type of molecular recognition, the electrostatic complementarity will play the major role and the shape complementarity an auxiliary role. This is consistent with the failure of the current implementation of SOFTDOCK, which considers shape complementarity only, to dock this complex. In the future, the calculated electrostatic field or potential of a macromolecule could be represented in grid space and used in SOFTDOCK.


    Discussion
 
From our systematic docking tests, the effects of the docking parameters in SOFTDOCK have been studied. First, the selection of parameters (cube size and cone angle) will impact the quality of the solutions generated. From our testing on known crystal complexes, we find that the cube size and cone angle dictate the so-called `softness' of the molecular recognition. The smaller cube size and smaller cone angle require `harder' molecular recognition, that is, the shape complementarity must be perfected at the finer atomic detail level. In the cases when we do not know the softness of molecular recognition for a complex of interest, we find that a balance could be reached by selecting `optimum' parameters for docking. This optimum set of parameters has been validated with the successful docking of 57 of the 71 crystal complexes. These 57 complexes include a variety of biological complexes, which suggests that our method has general applicability for studying molecular recognition.

Second, a large cube size smears the molecular surface and removes the detailed features. However, the docking results showed that many complexes could be docked and hence recognized at a cube size of 1.4–6 Å. Similar observations have been suggested previously and applied to low-resolution docking (Vakser, 1996).

Third, in our original design of the soft docking algorithm, the softness is implemented through varying the cube size, the cone angle and the volume overlap cutoff. Increasing these three parameters will increase the softness of docking. However, the increased softness will lead to increased background noise. For some complexes, the recognition can occur at the `soft' level whereas for other complexes the recognition occurs at the `hard' level. In the latter case, the complementarity at the detailed level is essential for recognition, so conformational rearrangement will be necessary to determine whether a molecular complex can be formed. The results presented here show that the majority of the molecular complexes known to date could be found with the strategy of soft docking. It will be interesting to compare our results with those of other docking algorithms now that more protein–protein complexes are available.

The results also delineate what types of molecular complexes will be hard to dock with the current implementation of SOFTDOCK. Electrostatics are important for molecular recognition. In some cases they dominate the complementarity in molecular recognition whereas the shape complementarity takes a lesser role. On the other hand, in many cases, the shape complementarity is sufficient to achieve molecular recognition; here 57 of the 71 crystal complexes tested were docked successfully by considering the shape complementarity alone.

It is true that one of the main conclusions of our present work, i.e. that electrostatics are important for molecular recognition, has already been accepted by the scientific community. However, it is worth noting that most supporting evidence comes from the positive control, that is, the successful docking of a known complex by considering the distribution of the electrostatic potential only (we quoted only a couple of examples). In our case, we support this conclusion by a negative control, that is, a known complex for which docking failed using only the shape complementarity was shown to be a complex in which electrostatics play the dominant role. Therefore, our results support the same conclusion from a different point of view.

From our current understanding of molecular recognition, our results suggest that there should be a diversity of mechanisms involved in molecular recognition. For example, a diversity of molecular interfaces and conformational changes during complex formation have been observed (Lo Conte et al., 1999). Molecular recognition depends on the principle of complementarity, be it shape or electrostatics. However, using different parameters, we find that molecular recognition of different complexes requires different degrees of complementarity. We call this degree of complementarity the `softness' to reflect the special properties of our docking algorithm. It is biologically relevant to understand this type of variation in molecular recognition. For example, in enzyme–inhibitor, antibody–antigen and signaling complexes, the `softness' is different for different biological functions. Admittedly, the `softness' is somewhat related to our docking algorithm and may not have a generalized meaning in molecular recognition. For example, for a peptide fragment in an MHC binding site, the docking could be very `soft' as it involves large conformational changes but the binding is very tight. The peptide–MHC complex may be a special case in molecular recognition. We think that it is still too early to say that the effect of the docking parameters is irrelevant to biological understanding, but instead it is helpful to know the `softness' of molecular recognition for the biological complex that one is studying, at least for selecting the docking parameters.

The accuracy of the docking of two proteins is usually estimated by the root mean square deviation (r.m.s.d.) between the docked complex and the known crystal complex. The r.m.s.d. of the Cα atoms for the contact residues was not calculated for the best correct solution for each complex tested in the present work. However, it could be calculated as was done earlier (Jiang and Kim, 1991) and the resulting values should be comparable to those from other docking methods. We have not provided these values because we consider that we established the validity of our method in the previous study. Another reason for not calculating the r.m.s.d. in our present docking of 71 complexes is that we have not optimized the final docking complexes using energy minimization, so it is not a fair comparison with other studies such as the ab initio docking of a full-atom model of lysozyme to an antibody with 1.6 Å accuracy (Totrov and Abagyan, 1994).

Owing to the simplicity and versatility of our representation, several new features could be implemented in SOFTDOCK to encompass a variety of molecular recognition mechanisms and conformational changes. For example, we could easily include electrostatic potential (Knegtel et al., 1997; Lorber and Shoichet, 1998) and contact pair-wise potential (Miyazawa and Jernigan, 1985) in the representation and apply the ensemble docking method, which should bring about a significant improvement in the applicability of SOFTDOCK. Furthermore, we should extend the concept of the complementarity in the light of the current data on molecular recognition and complexes and explore other methods of calculating the complementarity and predicting the molecular recognition.

Finally, we have shown that for docking known crystal complexes, the correct solution has the highest score. When the binding sites on the probe and target (ligand and receptor) molecules are known, the correct solution is also the top solution. When docking structures of unbound molecules, our docking method also works but requires a filtering and clustering procedure. A paper describing the augmented procedure and the related results has been submitted elsewhere. Furthermore, it will be very interesting to test SOFTDOCK on structures from homology modeling. We think that it might work because, first, the binding sites are known, second, the correct solutions could be refined with some restraints on key interactions and third, many candidate solutions could be evaluated at the atomic level using molecular dynamics simulation. Our general goal is to develop our method into an automated procedure in which a scientist only has to evaluate a few top solutions for their biological relevance. For very difficult cases, we try to limit the number of solutions for evaluation to around 50. We believe that SOFTDOCK is close to this goal.


    Notes
 
3 To whom correspondence should be addressed, at Tsinghua University. E-mail: jiangf@tsinghua.edu.cn


    Acknowledgments
 
We acknowledge the support of Tsinghua University Research Grant 985, National Basic Research Fund 973 (Grants G1999075602, G1999011902, G1998051105) and the National Science Foundation of China (Grants 39870174, 39970155, 30170198).


    References
 
Argos,P. (1988) Protein Eng., 2, 101–113.

Bernstein,F.C., Koetzle,T.F., Williams,J.B., Meyer,E.F.,Jr, Brice,M.D., Rodgers,J.R., Kennard,O., Shimanouchi,T. and Tasumi,M. (1977) J. Mol. Biol., 112, 535–542.

Bohm,H.-J. (1994) J. Comput.-Aided Mol. Des., 8, 623–632.

Botti,S.A., Felder,C.E., Sussman,J.L. and Silman,I. (1998) Protein Eng., 11, 415–420.

Chothia,C. (1997) In McCrae,M.A., Saunders,J.R., Smyth,C.J. and Stow,N.D. (eds), Molecular Aspects of Host–Pathogen Interaction. Cambridge University Press, Cambridge.

Chothia,C. and Janin,J. (1975) Nature, 256, 705–708.

Connolly,M.L. (1981) QCPE Bull., 1, 18 (MS, QCPE 429).

Connolly,M.L. (1983) Science, 221, 709–713.

Given,J.A. and Gilson,M.K. (1998) Proteins, 33, 475–495.

Goldman,B.B. and Wipke,W.T. (2000) Proteins, 38, 79–94.

Helmer-Citterich,M. and Tramontano,A. (1994) J. Mol. Biol., 235, 1021–1031.

Hou,T., Wang,J., Chen,L. and Xu,X. (1999) Protein Eng., 8, 639–647.

Janin,J. (1995) Biochimie, 77, 497–505.

Janin,J. (1996) Prog. Biophys. Mol. Biol., 64, 145–165.

Janin,J. and Chothia,C. (1990) J. Biol. Chem., 265, 16027–16030.

Jiang,F. and Kim,S.-H. (1991) J. Mol. Biol., 219, 79–102.

Jones,S. and Thornton,J.M. (1995) Prog. Biophys. Mol. Biol., 63, 131–165.

Jones,S. and Thornton,J.M. (1996) Proc. Natl Acad. Sci. USA, 93, 13–20.

Jones,S. and Thornton,J.M. (1997) J. Mol. Biol., 272, 121–132.

Jones,G., Willett,P. and Glen,R.C. (1995) J. Comput.-Aided Mol. Des., 9, 532–549.

Katchalski-Katzir,E., Shariv,I., Eisenstein,M., Friesem,A.A., Aflalo,C. and Vakser,I.A. (1992) Proc. Natl Acad. Sci. USA, 89, 2195–2199.

Knegtel,R.M.A., Kuntz,I.D. and Oshiro,C.M. (1997) J. Mol. Biol., 266, 424–440.

Kuntz,I.D., Blaney,J.M., Oatley,S.J., Langridge,R. and Ferrin,T.E. (1982) J. Mol. Biol., 161, 269–288.

Laskowski,R.A., Luscombe,N.M., Swindells,M.B. and Thornton,J.M. (1996) Protein Sci., 5, 2438–2452.

Lawrence,M.C. and Davis,P.C. (1992) Proteins, 12, 3–41.

Lee,R.H. and Rose,G.D. (1985) Biopolymers, 24, 1613–1627.

Lin,S.L., Nussinov,R., Fischer,D. and Wolfson,H.J. (1994) Proteins, 18, 94–101.

Lo Conte,L., Chothia,C. and Janin,J. (1999) J. Mol. Biol., 285, 2177–2198.

Lorber,D.M. and Shoichet,B.K. (1998) Protein Sci., 7, 938–950.

Masek,B.B., Merchant,A. and Matthews,J.B. (1993) Proteins, 17, 193–202.

Meng,E.C., Shoichet,B. and Kuntz,I.D. (1992) J. Comput. Chem., 13, 505–524.

Miyazawa,S. and Jernigan,R.L. (1985) Macromolecules, 18, 534–552.

Norel,R., Lin,S.L., Wolfson,H.J. and Nussinov,R. (1994a) Biopolymers, 34, 933–940.

Norel,R., Lin,S.L., Wolfson,H.J. and Nussinov,R. (1994b) Protein Eng., 7, 39–46.

Norel,R., Lin,S.L., Wolfson,H.J. and Nussinov,R. (1995) J. Mol. Biol., 252, 263–273.

Palma,P.N., Krippahl,L., Wampler,J.E. and Moura,J.J.G. (2000) Proteins, 39, 372–384.

Perkins,T.D.H., Mills,J.E.J. and Dean,P.M. (1995) J. Comput.-Aided Mol. Des., 9, 479–490.

Rarey,M., Wefing,S. and Lengauer,T. (1996) J. Comput.-Aided Mol. Des., 10, 41–54.

Shoichet,B. and Kuntz,I.D. (1991) J. Mol. Biol., 221, 327–346.

Shoichet,B. and Kuntz,I.D. (1993) Protein Eng., 6, 723–732.

Shoichet,B., Bodian,D.L. and Kuntz,I.D. (1992) J. Comput. Chem., 13, 380–397.

Sobolev,V., Wade,R.C., Vriend,G. and Edelman,M. (1996) Proteins, 25, 120–129.

Stahl,M. and Bohm,H.-J. (1998) J. Mol. Graphics Modelling, 16, 121–132.

Totrov,M. and Abagyan,R. (1994) Nature Struct. Biol., 1, 259–263.

Tsai,J., Lin,S.L., Wolfson,H. and Nussinov,R. (1996) J. Mol. Biol., 260, 604–620.

Vakser,I.A. (1996) Protein Eng., 9, 37–41.

Vakser,I.A., Matar,O.G. and Lam,C.F. (1999) Proc. Natl Acad. Sci. USA, 96, 8477–8482.

Received March 21, 2001; revised December 14, 2001; accepted January 4, 2002.




