Department of Chemistry, University of Tromsø, N-9037 Tromsø, Norway. E-mail: arne.smalas{at}chem.uit.no
![]() |
Abstract |
---|
![]() ![]() ![]() ![]() ![]() ![]() |
---|
Keywords: free energy perturbation/molecular association/protein engineering/trypsin
![]() |
Introduction |
---|
![]() ![]() ![]() ![]() ![]() ![]() |
---|
Knowledge of structural and functional properties of enzymatic systems is of great importance if we are to design enzymes with specific properties as one may desire through protein engineering. Interplay between experimental and theoretical research has a central role in this context. Combining computer simulations based on empirical energy functions with site-directed mutagenesis and X-ray crystallography offers a unique opportunity to investigate the level of accuracy of theoretical modeling of highly complex processes, such as proteinprotein recognition. Molecular recognition in general and proteinprotein recognition in particular represent one of the most complex issues currently addressed by theoretical modeling techniques. Significant progress has been made in studies of proteinligand interactions (Kollman, 1993; Åqvist et al., 1994
; Hansson and Åqvist, 1995
), whereas much less has been achieved in studies of proteinprotein recognition. Formation of proteinprotein complexes is expected to obey the same physical principles as the proteinligand interactions, but in the former case a more delicate balance between entropic and enthalpic contributions is anticipated, which may not be easily reproduced by computer simulation approaches (Muegge et al., 1998
).
Despite the success in providing rational explanations and predictions of free energies, most current work based on free energy perturbation (FEP) simulations suffers from the same sort of problems. First, FEP calculations reside on statistical mechanics and, in theory, as the number configurations sampled increases, the free energy should converge towards the experimental. Second, truncation of non-bonded potentials, particularly long-range electrostatics, tends to introduce instabilities in the simulated trajectories. Correct treatment of long-range potentials has received considerable attention during recent years. Hansson and Åqvist (1995) showed that using a too small cut-off suppressed some of the solvent's resistance to polarization by charged groups and as a result the simulations did not converge. Recently, McCarrick and Kollman (1999) used a dual cut-off (8 Å/12 Å) which seemed to reduce the overall drift in the average coordinates. Problems related to truncation of the non-bonded potentials can be resolved by using a local reaction field approach (LRA) (Lee and Warshel, 1992), which uses a Taylor expansion that yields approximately the same results as infinte cut-offs at a modest cost. Still, the problem of sampling enough configurations to obtain a reliable, converged free energy is far more difficult. Therefore, in order to improve our calculated binding free energies, we use an average of several simulations of different length.
The present study was initiated in order to establish a protocol for free energy perturbation calculations that enable us to predict the energies and mechanisms of proteinprotein recognition. The model system is well suited since high-resolution structures of the proteinprotein complexes are available, along with experimental data for the association energies. The necessity of atomic resolved experimental structures of both the initial and final states was in particular addressed in the study. A series of perturbations, different protocols and simulation times were carried out for both a small (P1-Ala to Gly) and a large (P1-Met to Gly) perturbation of the complex. We find that the experimental association energies can be estimated in a quantitative manner only if the correct numbers and positions of water molecules at the proteinprotein interface are included. The quantitative estimates of binding free energy differences are dramatically improved when additional water molecules that are present in the crystal structure of the final state are taken into consideration. The protocol established here will therefore form the basis for our future calculations of binding free energies in a predictive manner on the model system.
![]() |
Materials and methods |
---|
![]() ![]() ![]() ![]() ![]() ![]() |
---|
Generation of initial structures was done by solvation of the recently reported X-ray structures (Helland et al., 1999) of bovine trypsin (BT) in complex with bovine pancreatic trypsin inhibitor (BPTI) P1 variants. X-ray structures of complexes between BT and BPTI P1-Met and P1-Gly were available, the latter complex solved at both room temperature (295 K) and cryogenic temperature (120 K). The BTBPTI P1-Ala complex was modeled from the coordinates of BTBPTI P1-Gly. In order to put all the free energy calculations on a common basis, the initial starting structures were solvated using an identical protocol. This was done by placing a sphere of equilibrated TIP3P (Jorgensen et al., 1983
) water molecules with a radius of 18 Å centered at the P1 C
atom and water molecules closer than 2.8 Å to protein atoms were deleted. Water molecules with a distance <2.3 Å to other water molecules, as judged by the oxygenoxygen distance, were also removed. These structures were then subjected to energy minimization using 500 steps of steepest decent and 2000 steps of conjugate gradient.The coordinates for BPTI free in solution were generated by removal of BT from the respective complex. The BPTI structures were then subjected to the same solvation and minimization procedures as used for the BTBPTI complexes. After minimization the structures were equilibrated by performing a short simulation (15 ps) with a weak harmonic positional restraint of 0.5 kcal/mol on all heavy atoms. Structures after equilibration are now referred to as initial structures.
Computational procedure
All the calculations were carried out using the AMBER 5.0 software package (Pearlman et al., 1995) and the Cornell et al. (1995) force field. Calculations of binding free energies for molecular systems were performed on the basis of statistical mechanics implemented into regular molecular mechanics force fields. A number of statistical mechanical procedures exist to determine free energy differences between two closely related states of a system based on molecular dynamics (MD) simulation techniques (Kollman, 1993
). The thermodynamic integration approach is now reasonably well established and the details of the methodology can be found in a number of excellent reviews (e.g. Kollman, 1993; Gilson et al., 1997); here we only briefly outline the method used in this work.
Free energy calculations applied to molecular association in solution resides on the equation depicted in Figure 1, where E, I and I' denote enzyme, inhibitor and modified inhibitor, respectively. The free energy of association of I and I' to E can be measured experimentally, while theoretical methods can `mutate' I into I' free in solution and when bound to E. Considering that the free energy is a state function, the difference between the experimentally measured and calculated free energies should be equal:
![]() |
![]() |
|
Average binding free energies were calculated from four free energy calculations of different lengths and to obtain an upper limit for the hysteresis, standard deviations for these four simulations were used. The hysteresis in the calculated binding free energies is normally obtained by taking the difference between forward and reverse runs (Miyamoto and Kollman, 1993; Wang et al., 1993
), but several procedures for estimation of an upper limit for the hysteresis have been suggested. Using several simulations also allows the convergence of the simulated free energies to be monitored. Assuming an average error of ±40% in the individual
Gs (Krowarsch et al., 1999
), which corresponds to about 0.2 kcal/mol, the hysteresis in the experimental
Gs becomes about 0.3 kcal/mol.
![]() |
Results and discussion |
---|
![]() ![]() ![]() ![]() ![]() ![]() |
---|
|
Alanine to glycine
The simplest substitution, from P1-Ala to Gly, was used to test the system and for deriving a general protocol for free energy calculations on the BTBPTI system, using P1-Gly as reference state. The BTBPTI P1-Ala complex was first modeled by using the room-temperature structure of BT P1-Gly. Coordinates for the sixth water molecule (Sol 6) were generated from superimposition of the cryogenic and the room-temperature structure. Both the initial (alanine) and final (glycine) states therefore had a water-mediated hydrogen bonding network formed by six water molecules in the S1 pocket. Individual Gs and also total
Gs are presented in Table Ia
. The total
Gs range from 0.28 to 1.83 kcal/mol, with an average of 0.79 kcal/mol. Only the simulation using 1.0 ps for equilibration and data collection time is in quantitative agreement with the experimental value of 1.71 kcal/mol. It was therefore of interest to perform four new calculations based solely on the room-temperature structure and these calculations reside on both an initial and a final structure that has an S1 pocket filled with five water molecules. Table Ib
shows that three of the total
Gs are in reasonable agreement with the experimental value of 1.71 kcal/mol, whereas the use of a 1.0 ps equilibration and data collection time estimates a free energy difference of 3.17 kcal/mol. The average binding free energy difference for mutation P1-Ala to Gly is 2.28 kcal/mol, with a standard deviation of 0.70 kcal/mol. From the individual
Gs it appears as the source of the variance in the total
Gs arise from the vdW term from the simulations of free BPTI in solution. Furthermore, the total
Gs is dominated by the vdW interaction term, which favours alanine by 2.57 kcal/mol (average value). Compared with the first four calculations (Table Ia
), the change in the calculated binding free energy arises from the vdW term for the simulations of the complex, as this term has on average increased by 1.55 kcal/mol. This indicates that the sixth water molecule creates unfavorable interactions within the BTBPTI P1-Ala binding pocket. In contrast, the electrostatic term is more or less unchanged from the calculations with six water molecules (Table Ia and b
).
|
Methionine to glycine
The structure of BT P1-Met showed that the P1 residue is bent, with the C atom rotated slightly out of the specificity pocket (Figure 2
). When methionine is entering the pocket, two solvent molecules (Sol 5 and Sol 6) are expelled. Correct modeling of water molecules can be of significant importance, as seemed to be case for mutation of P1-Ala to Gly. In order to address this issue further, two estimates for the binding free energy difference between P1-Met and P1-Gly were calculated. First, the presence of both Sol 5 and Sol 6 was discarded (Table IIa
), which means that only four water molecules were present in the S1 pocket in both the initial and final states. The total binding free energy ranges from 4.25 to 11.52 kcal/mol and is dominated by the vdW interaction term favouring binding of BPTI P1-Met. The average binding free energy becomes 8.78 kcal/mol, with a standard deviation of 3.16 kcal/mol. This correctly predicts a far stronger binding for the P1-Met variant of BPTI compared with P1-Gly, but is only in qualitative agreement with the experimental value of 4.62 kcal/mol. The discrepancy can be explained from the fact that when performing a mutation as large as methionine to glycine, a cavity of unoccupied space will be generated and the resulting free energy will overestimate the binding of methionine. The calculations were therefore repeated, where the P1-Met side-chain was mutated to glycine plus a water molecule in accordance with the five water molecule arrangement in the active site of the P1-Gly complex.
|
Methionine to alanine
The free energy calculations for mutation of P1-Ala to Gly indicated that the S1 pocket of BT in the complex with BPTI P1-Ala should contain five water molecules. To address the importance of an additional water molecule at the predicted position, we first carried out four calculations with a direct perturbation of P1-Met to Ala (Table IIIa). Again, the binding free energy difference is far from the experimental value, but has the correct sign. In order to model the presence of Sol 5, the procedure described previously for the mutation of P1-Met to Gly was used. The binding free energy difference is 9.79 kcal/mol when the side-chain of P1-Met is mutated directly to Ala without considering the additional Sol 5 (Table IIIa
). In contrast,
Gtotal is 2.93 kcal/mol when the effect of Sol 5 is considered (Table IIIb
). The standard deviation is 0.92 kcal/mol for mutation of P1-Met to Ala when Sol 5 is present in the final (Ala) state. Again, the total binding free energy is dominated by the vdW term which favors binding of BPTI-P1-Met by 4.33 kcal/mol (average). In terms of electrostatics, binding of the P1-Ala variant is favored by 1.40 kcal/mol (average).
|
Conclusion
The present study demonstrates that the water molecules at the proteinprotein interface are of crucial importance. In response to change in size of the P1 residue entering the specificity pocket, water molecules tend to redistribute within the S1 pocket in order to optimize the enzymeinhibitor interactions. A clear trend was observed for mutations of P1-Met to Gly/Ala, as the calculations performed by directly changing the P1 residue predicted the correct relative binding order, but are not in quantitative agreement with experiments. Only when the presence of the additional water molecules in the S1 pocket was modeled explicitly was the binding free energy difference in quantitative agreement with the experimental values. Inclusion of a fifth water molecule in the S1 pocket of the BTBPTI complex of the P1-Gly and Ala variants when mutated from P1-Met reduced the total G in the range 47 kcal/mol and become 5.08 and 2.93 kcal/mol for the two perturbations, respectively (Tables II and III
). The corresponding experimental values are 4.62 and 2.91 kcal/mol. This can be explained by the observation that a near removal of a large side-chain such as methionine creates a large cavity. Empty spaces or cavities in the protein interior are known to be destabilizing and consequently the calculated binding free energy difference favors the methionine state too strongly. P450cam in complex with 2-phenylimidazole contains a water molecule at the proteinligand interface that stabilizes the complex by 2.8 kcal/mol (Helms and Wade, 1995
), which is comparable to the results found in this work. The present study therefore shows that relatively precise values for binding free energy differences involving two proteins can be obtained by the free energy pertubation method, but detailed knowledge about the structure of both the initial and final states is required. It is possible, however, that much longer simulations would have allowed bulk water molecules to fill the cavity formed by removal of methionine. This is not desirable, however, as one wants to obtain an estimate of the binding free energy within a reasonable amount of time.
![]() |
Notes |
---|
![]() |
Acknowledgments |
---|
![]() |
References |
---|
![]() ![]() ![]() ![]() ![]() ![]() |
---|
Åqvist,J., Medina,C. and Samuelsson,J.E. (1994) Protein Engng, 7, 385391.[Abstract]
Bode,W. and Huber,R. (1992) Eur. J. Biochem., 204, 433451.[Abstract]
Cornell,W.D. et al. (1995) J. Am. Chem. Soc., 117, 51795197.[ISI]
Gilson,M.K., Given,J.A., Bush,B.L. and McCammon,J.A. (1997) Biophys. J., 72, 10471069.[Abstract]
Hansson,T. and Åqvist,J. (1995) Protein Engng, 8, 11371144.[Abstract]
Helland,R., Otlewski,J., Sundheim,O., Dadlez,M. and Smalås,A.O. (1999) J. Mol. Biol., 287, 923942.[ISI][Medline]
Helms,V. and Wade,R.C. (1995) Biophys. J., 69, 810824.[Abstract]
Huang,K., Lu,W., Anderson,S., Laskowski,M.,Jr and James,M.N. (1995) Protein Sci., 4, 19851997.
Jorgensen,W.L., Chandrasekhar,J., Madura,J.D., Impey,R.W. and Klein,M.L. (1983) J. Chem. Phys., 79, 926935.[ISI]
Kollman,P.A. (1993) Chem. Rev., 93, 23952417.[ISI]
Krowarsch,D., Dadlez,M., Buczek,O., Krokoszynska,I., Smalås,A.O. and Otlewski,J. (1999) J. Mol. Biol., 289, 175186.[ISI][Medline]
Lee,F.S. and Warshel,A. (1992) J. Chem. Phys., 97, 31003107.[ISI]
Lu,W. et al. (1997) J. Mol. Biol., 266, 441461.[ISI][Medline]
McCarrick,M.A. and Kollman,P.A. (1999) J. Comput.-Aided Mol. Des., 13, 109121.
Miyamoto,S. and Kollman,P.A. (1993) Proteins, 16, 226245.[ISI][Medline]
Muegge,I., Schweins,T. and Warshel,A. (1998) Proteins, 30, 407423.[Medline]
Pearlman,D.A., Case,D.A., Caldwell,J.W., Ross,W.S., Cheatham,T.E., Debolt,S., Ferguson,D., Seibel,G. and Kollman,P.A. (1995) Comp. Phys. Commun., 91, 141.[ISI]
Pearlman,D.A. and Kollman,P.A. (1989) J. Chem. Phys., 90, 24602470.[ISI]
Qasim,M.A., Ganz,P.J., Saunders,C.W., Bateman,K.S., James,M.N. and Laskowski,M.,Jr (1997) Biochemistry, 36, 15981607.[ISI][Medline]
Schechter,I. and Berger,A. (1967) Biochem. Biophys. Res. Commun., 27, 157162.[ISI][Medline]
Vlassi,M., Cesareni,G. and Kokkinidis,M. (1999) J. Mol. Biol., 285, 817827.[ISI][Medline]
Wang,C.X., Shi,Y.Y., Zhou,F. and Wang,L. (1993) Proteins, 15, 59.[ISI][Medline]
Xu,J., Baase,W.A., Baldwin,E. and Matthews,B.W. (1998) Protein Sci., 7, 158177.
Received November 8, 1999; revised January 24, 2000; accepted February 8, 2000.