1 Department of Biological Chemistry and 2 Department of Chemical Services, The Weizmann Institute of Science, Rehovot 76100, Israel
3 To whom correspondence should be addressed. E-mail: miriam.eisenstein{at}weizmann.ac.il
Keywords: docking/false-positive solutions/flexibility/surface recognition/trimming
Introduction
One intuitively expects that in docking of unbound structures, which differ from the corresponding bound structures, clashes occur at the interface, thereby reducing the ability of rigid-body docking procedures to identify the correct solution. However, our previous study indicates that the success or failure of geometric docking does not correlate well with the degree of conformation change (Heifetz et al., 2002). It is therefore of interest to study in detail the effect of local shape modifications on rigid-body docking. It appears that the flexibility of exposed side chains is not uniform. The large polar or charged residues, arginine, lysine, glutamate and glutamine, are very flexible. In contrast, many of the smaller polar or charged residues, such as asparagine, aspartate and histidine, and the large aromatics, phenylalanine, tyrosine and tryptophan, are markedly inflexible (Zhao et al., 2001). At the binding interface, the degree of flexibility follows the order lysine > arginine > methionine, glutamine > glutamate, isoleucine, leucine > asparagine, threonine, tyrosine, serine, histidine, aspartate > cysteine, tryptophan, phenylalanine. Thus, the lysine side chains flex 25 times more often than do phenylalanine side chains (Najmanovich et al., 2000).
Based on the aforementioned studies, we chose to trim lysine, arginine and glutamine side chains by lowering their geometric weight, without additional softening of the molecular surface. Lysine, the most flexible residue at the binding interface, is also abundant on molecular surfaces (Zhao et al., 2001). The effect of trimming lysine side chains is compared with the effect of trimming the less flexible glutamine side chains. Next, it is compared with the effect of trimming arginines, which are less flexible and less abundant on the surface. Notably, serine is the most flexible surface residue and it is abundant on the molecular surface (Zhao et al., 2001). We do not trim serines because their side chains are short and conformation changes barely affect the shape of the molecule.
Our procedure does not assume prior knowledge of the binding site. Therefore, we trim all the surface lysine, glutamine or arginine side chains. We expect that this shape modification will reduce the effect of clashes formed by interface residues whose conformations are not adequate for binding. We find an additional prominent effect: an overall lowering of the complementarity scores of false-positive solutions and a large reduction in their number, which leads to a significant improvement in the rank of the nearly correct solution.
Methods
The geometric representation in the docking algorithm, which was previously presented by our group and named MolFit (Katchalski-Katzir et al., 1992; Eisenstein et al., 1997), was modified in the current study. Thus, the geometric weight of grid points derived from the most mobile atoms of the exposed side chains is lowered to reflect their mobility. As before, the atomic representation of the molecules to be docked is replaced by a three-dimensional (3D) grid representation. Grid points outside the molecule are given the value 0, points on the surface of the molecule are given the value g and those in the interior of the molecule are given either a negative value (-15) or a positive value (+1), for molecules a and b, respectively. The value of g in the unmodified representation is 1. In the current study g remains 1 for most of the surface grid points; it is <1 for grid points situated within the volume of the trimmed atoms, as specified below. The interior of molecule a remains unchanged by this modification.
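The grid values described above can be illustrated with a minimal sketch. It assumes spherical atoms on a cubic grid and a crude surface definition (inside points that have at least one outside neighbour); the function name `build_grid`, the tuple layout of `atoms` and the default parameters are hypothetical and are not taken from MolFit.

```python
import numpy as np

def build_grid(atoms, n=32, spacing=1.0, interior=-15.0, g_trimmed=0.0):
    """atoms: list of (x, y, z, radius, trimmed) tuples, coordinates in angstroms.

    Returns an n*n*n array: 0 outside, `interior` inside, 1 on the surface,
    and `g_trimmed` on surface points that lie within a trimmed atom.
    """
    axis = np.arange(n) * spacing
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    inside = np.zeros((n, n, n), dtype=bool)
    in_trimmed = np.zeros((n, n, n), dtype=bool)
    for ax, ay, az, r, trimmed in atoms:
        mask = (x - ax) ** 2 + (y - ay) ** 2 + (z - az) ** 2 <= r ** 2
        inside |= mask
        if trimmed:
            in_trimmed |= mask
    # Assumed surface layer: inside points with at least one outside neighbour.
    padded = np.pad(inside, 1, constant_values=False)
    all_neighbours_inside = np.ones_like(inside)
    for axis_idx in range(3):
        for shift in (-1, 1):
            rolled = np.roll(padded, shift, axis=axis_idx)[1:-1, 1:-1, 1:-1]
            all_neighbours_inside &= rolled
    surface = inside & ~all_neighbours_inside
    grid = np.zeros((n, n, n))
    grid[inside] = interior          # interior keeps its value, trimmed or not
    grid[surface] = 1.0              # unmodified surface weight g = 1
    grid[surface & in_trimmed] = g_trimmed  # lowered weight for trimmed atoms
    return grid
```

In this sketch only the surface weight changes when an atom is flagged as trimmed, so the set of interior points is the same with and without trimming, in line with the statement that the interior of molecule a is unchanged.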
As in our previous studies (Eisenstein and Katchalski-Katzir, 1998; Heifetz et al., 2002), the surface of each molecule, which corresponds to a solvent-accessible surface, is thickened to allow some interpenetration. The modification in this study is designed so that there is only a small additional softening of the surface. Thus, the lower g allows penetration of the most mobile ends of the flexible side chains into the interior of the other molecule. At the same time we keep the shape of the interior of molecule a unchanged, strongly restricting this penetration.
The modified grid representations of the two molecules are correlated using discrete Fourier transformations as described before (Katchalski-Katzir et al., 1992) and the procedure is repeated for many relative orientations of the two molecules. The modified geometric representation is combined with electrostatics as described previously (Heifetz et al., 2002). However, the electrostatic representation of the molecules, which is less sensitive to small conformation changes, is the same as for the untrimmed molecules.
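The correlation step for a single relative orientation can be sketched as follows. This is a generic cyclic cross-correlation computed with NumPy's FFT routines, offered as an illustration of the technique rather than the actual MolFit code; the function names are hypothetical, and the combination with the electrostatic term is not shown.

```python
import numpy as np

def correlate_translations(grid_a, grid_b):
    """Cyclic cross-correlation c[s] = sum_x a[x] * b[x + s] for every shift s."""
    fa = np.fft.fftn(grid_a)
    fb = np.fft.fftn(grid_b)
    # For real input grids, conj(FFT(a)) * FFT(b) is the transform of c.
    return np.real(np.fft.ifftn(np.conj(fa) * fb))

def best_translation(grid_a, grid_b):
    """Return (score, shift) of the highest-scoring translation for one orientation."""
    corr = correlate_translations(grid_a, grid_b)
    idx = np.unravel_index(np.argmax(corr), corr.shape)
    return float(corr[idx]), tuple(int(i) for i in idx)
```

Calling `best_translation` once per rotation of molecule b and keeping the returned score gives one solution per orientation, which matches the bookkeeping described in the scans below.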
In order to determine the value of g, we compared the results of geometric-electrostatic rotation/translation scans with three values of g: 1 (unmodified geometric representation), 0.5 and 0. In all but one case (2ptn/4pti), the gradual change in g produces a gradual change in the complementarity score of the nearly correct solution (see Table I). Thus, if g = 0.5 improves the unmodified geometric-electrostatic results, g = 0 further improves them. Therefore, in all the subsequent computations and analyses g = 0 for the mobile ends of the trimmed residues.
Only exposed side chains are trimmed. The solvent accessibility of each residue is calculated using the algorithm of Lee and Richards (Lee and Richards, 1971) as implemented in the MSI package (Accelrys, San Diego, CA). Residues with ≤30% exposed surface area are classified as buried and are not trimmed. All the exposed lysine or arginine side chains are trimmed, including interface and non-interacting surface residues.
As the study progressed, it became of interest to check if trimming of lysine side chains that extend away from the surface like comb teeth (protruding side chains) affects the docking results differently to trimming of side chains that lie on the surface (sticky side chains). We define as sticky the lysine side chains whose C atoms make at least one contact of <4 Å with other atoms in the molecule. All the other exposed lysine side chains are protruding.
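The sticky/protruding distinction can be sketched as a simple distance test. The sketch assumes that the caller supplies the coordinates of the relevant lysine side-chain atoms and that "other atoms" excludes the atoms of the lysine residue itself; both points, like the function name `classify_lysine`, are assumptions made for illustration.

```python
import math

def classify_lysine(probe_atoms, other_atoms, cutoff=4.0):
    """Return 'sticky' if any probe atom lies closer than `cutoff` angstroms
    to any other atom, otherwise 'protruding'.

    probe_atoms: (x, y, z) coordinates of the lysine side-chain atoms tested.
    other_atoms: (x, y, z) coordinates of atoms from the rest of the molecule
                 (assumption: the lysine's own atoms are excluded by the caller).
    """
    for p in probe_atoms:
        for q in other_atoms:
            if math.dist(p, q) < cutoff:
                return "sticky"
    return "protruding"
```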
Rotation/translation scans and determination of the ranks of the nearly correct solutions
Throughout this study, we use the same set of disassembled and unbound structures as in a previous study (Heifetz et al., 2002), allowing a detailed comparison of the results. The coordinates were taken from the Protein Data Bank (PDB) (Berman et al., 2000) and treated as before. All the water molecules in the experimental structures were omitted; missing side chains were modeled and for antibody molecules the grid points on the surface of the Fc domain were given a value of -15, preventing docking of the ligand to this domain.
Geometric and geometric-electrostatic rotation/translation scans were performed using a grid interval of 1.0–1.2 Å and a rotation interval of 12°. These parameters have been found adequate for unbound docking before (Katchalski-Katzir et al., 1992; Heifetz et al., 2002). They were not re-evaluated in this study, because the shape modification suggested here is subtle and not likely to affect them. Only one solution was saved for each orientation, resulting in 8760 putative binary complexes sorted by their complementarity scores. All the solutions were compared with the experimental structure of the complex by calculating the root mean square differences (r.m.s.d.s) between the positions of the common Cα atoms. The rank of the nearly correct solution was its position in the sorted list of solutions determined according to the same limits as in our previous study (Heifetz et al., 2002). Hence, we searched for the highest scoring solution with an r.m.s.d. smaller than 3 Å. As before, when the score of the nearly correct solution is identical with that of other solutions, its rank is given as a range of numbers representing the ranks of all these solutions.
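The ranking rule, including the range reported for tied scores, can be written as a short sketch. The function name and the `(score, rmsd)` pair layout are hypothetical; the 3 Å cutoff is the one stated above.

```python
def rank_of_nearly_correct(solutions, rmsd_cutoff=3.0):
    """solutions: list of (score, rmsd) pairs, one per orientation.

    Returns (first_rank, last_rank) of the highest-scoring solution whose
    r.m.s.d. is below the cutoff, or None if no solution qualifies. Ranks are
    1-based; the range covers all solutions that share the same score.
    """
    hits = [score for score, rmsd in solutions if rmsd < rmsd_cutoff]
    if not hits:
        return None
    best = max(hits)
    higher = sum(1 for score, _ in solutions if score > best)
    tied = sum(1 for score, _ in solutions if score == best)
    return higher + 1, higher + tied
```

For a list in which one solution scores higher and two false positives tie with the nearly correct solution, the function returns `(2, 4)`, corresponding to a rank reported as a range.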
The 8760 solutions from each scan were statistically analyzed. An extreme-value distribution function (Levitt and Gerstein, 1998) was fitted to the observed distribution of scores, providing estimates for the mean score, µ, and the standard deviation, σ. These values were used to calculate a uniqueness value Zi = (Si - µ)/σ for each docking solution, i, whose score was Si.
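As a rough illustration of the uniqueness value, the sketch below fits a Gumbel (type I extreme-value) distribution by moment matching and then standardizes each score. The fitting procedure actually used in the study is not specified here, so moment matching is an assumption, and the function names are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329

def fit_extreme_value(scores):
    """Moment-matching fit of a Gumbel (type I extreme-value) distribution.

    Returns (location, scale, mean, sigma). With moment matching, the mean and
    sigma of the fitted distribution equal the sample mean and standard deviation.
    """
    mean = statistics.fmean(scores)
    sigma = statistics.pstdev(scores)
    scale = sigma * math.sqrt(6.0) / math.pi
    location = mean - EULER_GAMMA * scale
    return location, scale, mean, sigma

def uniqueness(scores):
    """Z_i = (S_i - mu) / sigma for every solution score S_i."""
    _, _, mean, sigma = fit_extreme_value(scores)
    return [(s - mean) / sigma for s in scores]
```

A maximum-likelihood or histogram-based fit would generally give slightly different µ and σ; the standardization step is the same in either case.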
Results and discussion
Trimming lysine side chains
Trimming of the exposed lysine side chains significantly improves the rank of the nearly correct solution for 10 of the 16 systems in Tables I and II and in some cases the improvement is dramatic. For example, the rank of the nearly correct solution for the thrombin/BPTI (4htc/4pti) system is elevated from 508–531 to 5; for the acetylcholinesterase/fasciculin-II (2ace/1fcs) system it is elevated from 689–705 to 156–163 and for the Jel42/HPR (2jel/1poh) system the rank is improved from 286–292 to 40–42. No change or a minor deterioration of the rank of the nearly correct solution is observed for four systems but for two other systems (1ept/1ldt, 1ept/1avu) a severe negative effect due to lysine side chain trimming is observed.
Surprisingly, 4htc/4pti is the only system in which the score of the nearly correct solution increases when lysine side chains are trimmed. In the other 15 systems the scores are either unchanged (in five cases in which there are no lysine residues at the interface) or lower. Comparison of the bound and unbound conformations of all the interface lysines reveals large differences in the χ1 and χ2 torsion angles in six cases: 1ept/1avu, 4htc/4pti, 5cha/1ovo, 3hfm/1hel, Tem1/BLIP and 2ptn/4pti. However, in only three of these cases (4htc/4pti, 3hfm/1hel and 2ptn/4pti) are there clashes involving the interface lysine side chains. One system, 4htc/4pti, is discussed above. In 3hfm/1hel, lysine 97 of unbound lysozyme clashes with tryptophan H98 of the antibody and in 2ptn/4pti lysine 15 of the inhibitor clashes with cysteine 220 of the enzyme. These clashes are less severe than those observed for 4htc/4pti. Thus, the shortest distances are 1.5 and 2.2 Å in 3hfm/1hel and 2ptn/4pti, respectively, compared with 0.6 Å in 4htc/4pti. The effect of lysine side chain trimming for these three systems is inconsistent. In one case, 4htc/4pti, the score and rank of the nearly correct solution improve; in a second case, 2ptn/4pti, the score is lower but the rank improves, and in the third case, 3hfm/1hel, both deteriorate. It appears that although clashes occur, resulting from the different conformations of the lysine side chains, their effect on the docking results is dominant in only one case.
In three additional systems, in which the χ1 and χ2 torsion angles of interface lysines change on complex formation (1ept/1avu, 5cha/1ovo and Tem1/BLIP), the changes are well accommodated. Hence the lysine residues form good contacts in the nearly correct solution and are not involved in clashes. For example, inspection of the structure of 1cho reveals that lysine 55 of the ligand, whose χ1 angle changes from -76° in the bound structure to 63° in the unbound structure (1ovo), forms good contacts with the enzyme (Figure 1A). The situation is similar in the systems 1ept/1avu and Tem1/BLIP, where the interacting lysines are deeply buried at the interface, losing more than 50% of the accessible surface area upon complex formation. Trimming, in these three cases, causes loss of contacts and a reduction in the geometric complementarity score of the nearly correct solution. However, in 5cha/1ovo the rank of the nearly correct solution improves, in Tem1/BLIP it is almost unaffected, and in 1ept/1avu the rank strongly deteriorates. In the last system the complementarity at the interface in unbound docking is scarce and the few contacts formed by lysine 60 of the enzyme are important.
The analyses above indicate that the effect of lysine side chain trimming on the score of the nearly correct solution is inconsistent. In many cases, the unbound conformation is easily accommodated and some contacts are formed. This is more pronounced when the conformation of the interacting lysine residue does not change much, but is not confined to such cases. It appears that the notion of tight geometric complementarity between the interacting molecules needs revision. The occurrence of water molecules at the interface in almost every system suggests that the geometric fit is only approximate (unless water is considered a part of the system) and often conformation differences do not lead to clashes.
Effect of lysine side chain trimming on the false-positive solutions
Despite the inconsistent effect of conformation changes described above, we observe a consistent general trend in the results in Tables I and II. Thus, the complementarity scores of the nearly correct solutions are unchanged or lower when lysine trimming is in effect (except in 1bth), but their ranks improve in most cases. The improved ranking must originate from the elimination of false-positive solutions. This is most evident for the five systems in which there are no lysine residues at the interface and the complementarity scores of the nearly correct solutions are not influenced by lysine side chain trimming. However, the ranks of these solutions improve significantly, indicating that the number of the false-positive solutions is lowered. In addition, our statistical analyses show a systematic decrease of the mean scores in rotation/translation scans with trimmed lysine side chains for all 16 systems, by 5–11% (16–37 score units; see Table II).
To clarify further the effect described above, we calculated for each system the average difference between scores obtained with and without lysine trimming, for the set of false-positive solutions in the unmodified geometric scan. These averages are negative for all 16 systems tested and their absolute values are larger than the corresponding difference for the nearly correct solution in systems for which the rank of the nearly correct solution elevates when lysines are trimmed (with the exception of 4htc/4pti). Interestingly, the average reduction in the scores of the false-positive solutions is weakly correlated to the number of trimmed lysine side chains (Figure 2).
We further pursued the effect of lysine side chain trimming by trimming either protruding or sticky side chains, as defined in the Methods section. The test was performed for three systems, 2ptn/4pti, 2ptn/1pi2 and 2jel/1poh, for which trimming all the surface lysines had a different effect on the rank of the nearly correct solution (it improved in two cases and deteriorated for 2ptn/1pi2). According to Table IV, the changes in the ranks are approximately proportional to the number of trimmed lysine side chains in all three systems, and they do not depend on the conformations.
Trimming glutamine or arginine side chains
The effect of lysine trimming on the false-positive solutions can be attributed to the length and mobility of the lysine side chains, to their abundance on the non-interacting surface, or to both. Therefore, we also trimmed glutamine side chains, which are often as abundant as lysines, and arginine side chains, which are long but less flexible and less abundant than lysines.
Glutamine side chains were trimmed in two systems (1thm/2sec and 1scd/1tec), in which there are neither lysine nor glutamine residues at the interface and the numbers of such residues on the non-interacting surface are very close (ten lysines and nine glutamines in 1thm/2sec; six lysines and seven glutamines in 1scd/1tec). Trimming of glutamines improved the ranks of the nearly correct solutions for both systems, from 180–186 to 80–85 for 1thm/2sec and from 918–949 to 707–730 for 1scd/1tec. This result stems from a systematic reduction in the scores of the false-positive solutions, because the trimming of glutamine side chains on the non-interacting surface can only affect false solutions. Indeed, the mean scores are lower when glutamine trimming is in effect than in unmodified geometric scans by 21 and 10 score units, for 1thm/2sec and 1scd/1tec, respectively.
Trimming glutamine side chains has a smaller effect on the ranks of the nearly correct solutions than the trimming of lysines, although their numbers are very similar. Thus, the longer side chains of lysines are more frequently involved in formation of false-positive solutions than the side chains of glutamines. Trimming of lysine and glutamine side chains has a cumulative effect on the ranks of the nearly correct solutions, which change from 180–186 to 36–40 for 1thm/2sec and from 918–949 to 358–376 for 1scd/1tec, and on the mean scores, which are lower than in unmodified geometric docking by 38 and 27 score units for 1thm/2sec and 1scd/1tec, respectively.
Unlike for lysines and glutamines, the trimming of arginine side chains worsens the rank of the nearly correct solution for seven of the nine systems tested (see Table V). To understand this difference we compare the number of interface and non-interacting surface lysine and arginine residues (see Tables II and V). Arginines are more abundant than lysines at the interface, in the set of systems considered in this study. Moreover, in most cases the arginine residues are buried at the interface, losing more than 50% of their exposed surface area upon complex formation. In contrast, the number of arginine residues on the non-interacting surface is often smaller than the number of lysine or glutamine residues. These different distributions correspond to the different effects of lysine or arginine side chain trimming in docking. The larger number of arginines at the interface leads to a considerable loss of contacts when they are trimmed and to a decrease in the scores of the nearly correct solutions. At the same time, the smaller number of arginine residues on the non-interacting surface has less effect on the scores of the false solutions.
The trimming of mobile side chains often improves the ranking of the nearly correct solution in rigid-body docking. However, the mechanism of this improvement is rather unexpected. It appears that even when large differences in side chain conformations between bound and unbound molecules occur, good contacts are formed in the nearly correct solution. These contacts are often only partially correct; nevertheless, they add to the complementarity score and, therefore, trimming generally lowers the scores of the nearly correct solutions. However, trimming also lowers the scores of the false-positive solutions and this effect turns out to be more consistent and more dominant than the effect on the nearly correct solutions.
It is interesting to compare our modification of the surface with the modification presented by Palma et al. (Palma et al., 2000), where the core of the flexible side chains is modified, allowing more interpenetration and producing inferior docking results. Our modification of the surface restricts the interpenetration but at the same time does not reward contacts formed by the flexible residues. Therefore, it leads to a significant reduction in the scores of the false-positive solutions and improves the ranks of the nearly correct ones. Interestingly, the effectiveness of our side chain trimming does not depend on the conformation of the trimmed side chains, as observed for protruding and sticky lysines. It does, however, depend on the length of the side chains (lysines versus glutamines), on their abundance on the surface and on the relative number of these residues at the interface and on the non-interacting surface. Trimming is therefore likely to be beneficial when there are many mobile side chains on the surface and it is most advantageous when there are no mobile side chains at the interface. Hence, when the approximate position of the interaction site is known, only side chains on the non-interacting surface should be trimmed. Trimming is less advisable when there are only a few mobile side chains on the surface, in particular when one of the molecules is very small, because in such a case the interacting surface is small and it may strongly depend on the interactions of one lysine. Geometric-electrostatic docking with trimmed side chains is particularly useful because the electrostatic term moderates the negative effects of trimming.
References
Dixon,J.S. (1997) Proteins, Suppl., 198–204.
Eisenstein,M. and Katchalski-Katzir,E. (1998) Lett. Pept. Sci., 5, 365–369.
Eisenstein,M., Shariv,I., Koren,G., Friesem,A.A. and Katchalski-Katzir,E. (1997) J. Mol. Biol., 266, 135–143.
Gabb,H.A., Jackson,R.M. and Sternberg,M.J. (1997) J. Mol. Biol., 272, 106–120.
Heifetz,A., Katchalski-Katzir,E. and Eisenstein,M. (2002) Protein Sci., 11, 571–587.
Katchalski-Katzir,E., Shariv,I., Eisenstein,M., Friesem,A.A., Aflalo,C. and Vakser,I.A. (1992) Proc. Natl Acad. Sci. USA, 89, 2195–2199.
Lee,B. and Richards,F.M. (1971) J. Mol. Biol., 55, 379–400.
Levitt,M. and Gerstein,M. (1998) Proc. Natl Acad. Sci. USA, 95, 5913–5920.
Lorber,D.M., Udo,M.K. and Shoichet,B.K. (2002) Protein Sci., 11, 1393–1408.
Mandell,J.G., Roberts,V.A., Pique,M.E., Kotlovyi,V., Mitchell,J.C., Nelson,E., Tsigelny,I. and Ten Eyck,L.F. (2001) Protein Eng., 14, 105–113.
Najmanovich,R., Kuttner,J., Sobolev,V. and Edelman,M. (2000) Proteins, 39, 261–268.
Palma,P.N., Krippahl,L., Wampler,J.E. and Moura,J.J. (2000) Proteins, 39, 372–384.
Sandak,B., Wolfson,H.J. and Nussinov,R. (1998) Proteins, 32, 159–174.
Strynadka,N.C., Eisenstein,M., Katchalski-Katzir,E., Shoichet,B.K., Kuntz,I.D., Abagyan,R., Totrov,M., Janin,J., Cherfils,J. et al. (1996a) Nature Struct. Biol., 3, 233–239.
Strynadka,N.C., Jensen,S.E., Alzari,P.M. and James,M.N. (1996b) Nature Struct. Biol., 3, 290–297.
Zhao,S., Goodsell,D.S. and Olson,A.J. (2001) Proteins, 43, 271–279.
Received September 6, 2002; revised January 2, 2003; accepted January 7, 2003.