1 European Molecular Biology Laboratory, 69117 Heidelberg, Germany, 2 Novo Nordisk A/S, 2880 Bagsværd, Denmark, 3 Columbia University,New York, NY 10032, USA, 4 Nonius BV, Delft, The Netherlands and 5 University of Marburg, 35032 Marburg, Germany
![]() |
Abstract |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
Keywords: electrostatics/hydrogen bond network/pKa calculations/structure correction
![]() |
Introduction |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
The programs most widely used [DelPhi (Nicholls and Honig, 1991), GRASP (Nicholls et al., 1991
) and UHBD (Madura et al., 1995
)] all solve the PoissonBoltzmann equation (PBE) for a proteinsolvent system. The PBE relates the charges present on the atoms in the system to the electrostatic potential at all points in space, taking into account the effects of the solvent. Numerical PBE solvers were developed as early as 1982 (Warwicker and Watson, 1982
) but until recently they were used in only a handful of laboratories. Currently, PBE solvers are available to most protein researchers and are widely used to analyse macromolecular interactions and interactions with ligands.
Electrostatic potentials computed by solution of the PBE are dependent on parameters assigned to describe the proteinsolvent system. In particular, partial atomic charges and van der Waals radii, the location of the proteinsolvent interface and the value of the protein dielectric constant must be assigned in a consistent manner in order to compute accurate values of quantities such as pKas. Considerable effort has been devoted to deriving appropriate parameters. The PARSE set of partial atomic charges and radii has been derived specifically for continuum electrostatic calculations and good agreement between calculated and experimental solvation free energies has been obtained for model compounds (Sitkoff et al., 1994). The protein dielectric constant is essentially an adjustable parameter.
Many studies have addressed the issue of the assignment of the protein dielectric constant for pKa calculations and values ranging from 2 to 80 have been used (Yang et al., 1993; Antosiewicz et al., 1994
; Demchuk and Wade, 1996
). The assignment of the protein dielectric constant is dependent on the model of the proteinwater system used, for example, whether protein flexibility and protein relaxation upon charging are modelled explicitly or not (Sham et al., 1998
) Indeed, the models for protein pKa calculations are still evolving, e.g. a modification to account for the entropy contribution of the first hydration shell was developed recently (Warwicker, 1997
).
Despite much work on parameterization, less effort has been devoted to the optimization of the coordinates of the macromolecules for electrostatic calculations. Sensitivity of computed pKas to the positioning of protons has been noted (Antosiewicz et al., 1996) and parameterization has been developed to minimize sensitivity to misplacement of protons on titratable groups (Demchuk and Wade, 1996
). Alexov and Gunner (1997) developed a Monte Carlo sampling procedure to optimize simultaneously proton positions and assign protonation states to compute pKas.
Here we studied the influence of optimization of the hydrogen bond network in the proteinsolvent system on the results of electrostatic calculations. We show that significant improvements in the calculation of electrostatic potentials and pKa values can be obtained by using the protocol described in this paper. For our calculations, we use a standard published set of protocols and parameter assignments, although these may not yield the most accurate values currently attainable with continuum solvation models. The improvements due to coordinate optimization will, however, be obtained regardless of the details of which atomic parameters, protein dielectric and pKa computation protocol are used.
Hydrogen atoms in macromolecules
Information on hydrogen atom positions is missing for the majority of X-ray structures of proteins. Several methods exist for constructing these hydrogen atom positions. The earliest methods place hydrogen atoms according to standard geometries for every amino acid type. More sophisticated methods take the local environment into account and optionally perform an energy minimization. These methods find a local energy minimum but cannot find the global energy minimum for the hydrogen bonding network. Identifying the global energy minimum of a hydrogen bonding network in a protein is complicated because the number of potential donors and acceptors is so large that the number of choices is astronomical even for small proteins (Hooft et al., 1996). An additional complication is that the electron density often does not allow crystallographers to distinguish between carbon, nitrogen and oxygen atoms, which makes it difficult to determine the
2,
2 and
3 of His, Asn and Gln, respectively. We call a His, Asn or Gln `flipped' if it has this dihedral angle off by 180°; 82% of all X-ray structures in the PDB (Bernstein et al., 1977
) require at least one flip of a His, Asn or Gln to achieve a consistent H-bonding scheme (Hooft et al., 1996
), thus illustrating that flipping side chains should be an essential aspect of the hydrogen bond optimization procedure (e.g. Bass et al., 1992
; McDonald and Thornton, 1995
; Hooft et al., 1996
).
Hooft et al. (1996) used a threshold-accepting algorithm (Dueck and Scheuer, 1990) combined with tree searching methods to optimize the global hydrogen bond network. The algorithm allows for His, Asn and Gln side chain flips and for changes in the protonation state of His, Asp and Glu. Here we show that building hydrogen atoms using this method significantly affects calculated electrostatic potentials. This has implications not only for electrostatic potential calculations themselves but also for calculations that are based on these, such as pKa calculations, Brownian dynamics (Madura et al., 1998
) or binding free energies.
![]() |
Materials and methods |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
Five different approaches to constructing hydrogen atom positions on proteins are analysed, as follows.
Standard coordinate method in WHAT IF (std.coord.WI):
Hydrogen atoms were added using the WHAT IF (Vriend, 1990) standard method. This method adds hydrogen atoms according to standard coordinates for each of the 20 amino acids.
Standard coordinate method in Insight97 (std.coord.In): Hydrogen atoms were placed using InsightII version 97 (MSI, San Diego, CA) using a pH value of 7. This algorithm places hydrogen atoms according to the unfilled valences of each atom. Further information can be obtained from http://www.msi.com.
Standard coordinate method followed by energy minimization of hydrogen atom positions (std.coord. + min.). Hydrogen atoms were placed with InsightII as described above. Hydrogen atoms energies were subsequently minimized for 500 steps with Discover 3 using a conjugate gradients (PolakRibiere) algorithm. All heavy atoms were kept fixed during the minimization.
Standard coordinate method followed by molecular dynamics of all non-backbone atoms (std.coord. + dyn.). Hydrogen atoms were placed with InsightII as described above. Molecular dynamics at 298 K over 5 ps were carried out using Discover 3 (version 97, MSI). A 500-step conjugate gradient (PolakRibiere) energy minimization was performed before and after the molecular dynamics run. All backbone atoms were kept at fixed positions in both energy minimizations and in the molecular dynamics run. This protocol was only applied to crambin leading to a final structure with an r.m.s.d. for the side chain atoms of 0.8 Å relative to the starting coordinates.
The global energy minimum of the hydrogen bonding network (global network).
Construction of hydrogen atom positions according to the global energy minimum of the hydrogen bonding network was performed using the algorithm of Hooft et al. (1996), as implemented in WHAT IF (Vriend, 1990).
Notes on the preparation of each individual structure before electrostatic analysis
Crambin (1crn; resolution 1.5 Å) (Hendrickson and Teeter, 1981).
No residues in 1crn occupy multiple conformations. Electrostatic potential calculations for 1crn were carried out in a single run on a 101 cubed grid with a resolution of 2 grid points/Å.
BPTI mutant (1aal; resolution 1.6 Å) (Eigenbrot et al., 1992).
The structure of this BPTI variant indicates the existence of two conformations for each of nine residues. Only the conformations denoted A in the PDB file were used in this study. Using the conformations denoted B did not significantly influence the results of this experiment. The inorganic phosphate ion was given a formal charge of 3e. The electrostatic potential calculations were carried out on a 65 cubed grid with a box fill percentage of 90, resulting in a resolution of 1.03 grid points/Å. No focusing runs (Gilson et al., 1987
) were done.
Superoxide dismutase (1cob; resolution 2.0 Å) (Djinovic et al., 1992).
No residues in 1cob occupy multiple conformations. The electrostatic potential calculations were carried out on a 65 cubed grid using one focusing run. The box fill percentage was 30 and 90 in these two runs, respectively. The resolution was 0.84 grid points/Å in the final run.
Hen egg white lysozyme (1lse; resolution 1.7 Å) (Kurinov,I.V. and Harrison,R.W., to be published). No residues in 1lse occupy multiple conformations. Electrostatic potential calculations for 1lse were carried out as described for 1cob. The resolution in the final run was 1.34 grid points/Å.
Calculating electrostatic potentials
The DelPhi program (Nicholls and Honig, 1991), interfaced to WHAT IF, was used to calculate electrostatic potentials by solving the linear form of the PBE. Manipulation of potential maps was done using the electrostatics module of WHAT IF. Van der Waals radii and atomic charges were taken from the PARSE parameter set (Sitkoff et al., 1994
). Program parameters were as follows: interior dielectric 2, exterior dielectric 80 and solvent probe radius 1.4 Å (Gilson and Honig, 1988
). Protonation states were in all cases determined by the program that constructed the hydrogen atom positions. All but the global network method use predefined protonation states for all residues. Crystallographic water molecules were in all cases included during the construction of the hydrogen atoms, but were excluded from the electrostatic potential calculations and the subsequent analyses. This was done to observe the electrostatic potential resulting from only protein atoms. Potentials were displayed on the `accessible surface' as calculated by GRASP (Nicholls and Honig, 1991
). The numerical analyses of the accessible surface potentials were done with WHAT IF using the map interpolation method as described by Rogers and Sternberg (1984).
Scanning the PDB for flipped His, Asn and Gln residues
PDB files were scanned for His, Asn and Gln side chain flips by interfacing the algorithm of Hooft et al. (as implemented in WHAT IF) to Python (http://www.python.org) scripts. These scripts are available at ftp://swift.embl-heidelberg.de/pub/flip/
pKa calculations
pKa calculations were carried out with the Yang et al. package (Yang et al., 1993) The program parameters were set according to the recommendations by the authors. The following parameters were used: number of PBE runs for focusing, 4; interior dielectric constant, 4; exterior dielectric constant, 80; ionic strength, 150 mM; ion radius, 2.0 Å; and solvent probe radius, 1.4 Å (Yang et al., 1993
). Titratable groups included in the pKa calculations were N-terminus, Asp, Glu, His, Lys, Arg and C-terminus. Only the transition from neutral to positively charged was included for histidines. The Yang et al. package uses two structures as input to the pKa calculations. One structure has all titratable groups in their neutral state and the other has all titratable groups in their charged state. This makes it possible to include structural changes upon titration. It was decided to use the same heavy atom positions for the charged and uncharged structures with the Yang et al. package. Construction of protons on the charged and neutral structures was performed using InsightII.
![]() |
Results |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
Crambin (1crn)
Crambin is a small hydrophobic protein that contains no histidines. According to the global network method, all Asn and Gln side chains are correctly positioned and no titratable group is predicted to have a different protonation state from what is normally observed at pH 7. This makes crambin an ideal test case for observing the results of differences in the positioning of hydrogen atoms.
Hydrogen atoms were constructed for crambin using the five methods described in materials and methods. Each resulting structure was submitted to DelPhi (Nicholls and Honig, 1991) and the electrostatic potential maps were compared to reveal differences. Differences in electrostatic potentials on the molecular surface were calculated in each case relative to the potential map obtained for the structure prepared by the global network method. The results are shown in Table I
. The absolute mean difference at the accessible surface is approximately 0.2 kT/e but is slightly larger in the case where molecular dynamics were involved in the optimization procedure. The std.coor. + dyn. method map displays an absolute mean difference of 0.27 kT/e. This larger difference is mainly caused by the fact that the r.m.s.d. between the initial and final steps in the molecular dynamics run is 0.80 Å for the side chain atoms. To check whether the hydrogen bonding was optimal in the std.coor. + dyn. structure we reconstructed the hydrogen atoms on the final MD structure using the global network method (see Table I
, column 5). The average difference in the electrostatic potentials between these two methods is 0.21 kT/e, indicating that the molecular dynamics run had not fully optimized the hydrogen bonding network.
|
The effect of incorrectly positioned Gln or Asn side chains was demonstrated using the PDB file 1aal (BPTI with mutations C30V and C51A). This X-ray structure contains two molecules in the asymmetric unit and both are supposed to have Gln 31 flipped. The position of Gln 31 is unambiguously defined by its interaction with Asn 24, which in turn is unambiguously defined by its interactions with the NH amide groups of Lys 26 and Ala 27 (see Figure 1). Two variants were produced of this structure: a global network structure and a global network structure with Gln 31 flipped back to the conformation found in the PDB file. Figure 2
shows the difference in surface potential displayed using GRASP. Although the global features of the electrostatic potential seem similar, the details around Gln31 are markedly different. Table II
shows differences between the two potential maps. The average change in the potential at the surface is 0.1 kT/e, but around Gln31 the differences are up to 2 kT/e.
|
|
|
The bovine superoxide dismutase crystal structure consists of 151 amino acids complexed with one divalent cobalt ion and one divalent copper ion. Differences in protonation state between the global network structure and the std.coord. + min. structure were observed for His A41, His A61, His B41, His B46, His B61 and for the N-terminus of chain B. The global network method assigned a formal charge of 1 to His A61 and His B61. Each of these two histidines sits between a Cu2+ and a Co2+ ion. Residues His19, Asn214 and Asn235 are flipped. It can be seen in Figure 3a-c that these changes result in an increased area of positive potential around the active site. Statistics of differences in potential values at the accessible surface of the protein are shown in Table III
.
|
|
Hen egg white lysozyme consists of 129 amino acids in a single peptide chain. A difference in protonation state between the global network structure and the std.coord. + min. structure was observed for His15. Eight side chains were flipped differently in the two structures: Asn19, Asn37, Gln41, Asn46, Gln57, Asn106, Asn113 and Gln121. The negative area around the active site is considerably larger in the global network structure (see Figure 3df). This is mainly due to the flips of Asn46 and Gln57 that both orient the side chain oxygen closer to the surface as a result of the flip. The protonation of His15 does not significantly influence the potential in the active site as it is located at the other side of the molecule.
pKa calculations
The PDBREPORT database (http://swift.embl-heidelberg.de/pdbreport/; version February 16, 1998) holds structure analysis reports for more than 7000 PDB files. We scanned all reports and counted 49 018 flipped residues. Using an active site-finding algorithm (J.E.Nielsen and G.Vriend, in preparation), we found flipped residues in 655 of 2619 PDB files containing an EC record. As an example we considered in detail the 48 hen egg white lysozyme structures (resolution ranging from 1.5 to 2.5 Å). For each structure, the flipped His, Asn and Gln side chains were identified in the presence and in the absence of crystal symmetry. The results of this analysis are summarized in Table IV. When symmetry contacts are taken into account, the total number of side chain flips for these 48 structures was 178, which is 3.7 per structure. Out of these 178 side chain flips, 19 were Asn46 that can form a hydrogen bond with the putative nucleophile Asp52. When symmetry contacts were ignored, the total number of flips was 161, which is an average of 3.3 per structure. The highest number of flips (24) was seen for Asn46.
|
|
|
![]() |
Discussion |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
More aspects can be considered to improve the accuracy of the calculations. For example, properly placing the tightly bound waters or the incorporation of protein dynamics will undoubtedly have a large effect.
Every scientist will, of course, perform electrostatic calculations on superoxide dismutase with His61 negatively charged, but the fact that only the global network method automatically assigns this residue a negative charge indicates that blind faith in automated electrostatic calculations can be dangerous.
In this study we started by assuming that the method of Hooft et al. always produces the best solution to the hydrogen bond optimization problem. This is corroborated by the improved pKa calculations for hen egg white lysozyme. Additionally, although the effects of hydrogen bonding optimization are sometimes insignificantly small, visual inspection of a large number of cases indicated that the globally optimized hydrogen bonding network is always at least as `good' as any other solution to the proton positioning problem, but often considerably better.
A global hydrogen bonding network-optimizing WWW-based server is available at http://swift.embl-heidelberg.de/servers2/.
![]() |
Acknowledgments |
---|
![]() |
Notes |
---|
![]() |
References |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
Antosiewicz,J., McCammon,J.A. and Gilson,M.K. (1994) J.Mol.Biol., 238, 415436.[ISI][Medline]
Antosiewicz,J., Briggs,J.M., Elcock,A.H., Gilson,M.K. and McCammon,J.A. (1996) J.Comput.Chem., 17, 16331644.
Bass,MB, Hopkins,D.F., Jaquysh,W.A. and Ornstein,R.L. (1992) Proteins, 12, 266277.[ISI][Medline]
Bernstein,F.C., Koetzle,T.F., Williams,G.J.B, Meyer,E.F.,Jr, Brice,M.D., Rodgers,J.R., Kennard,O., Shimanouchi,T. and Tasumi,M. (1977) J.Mol.Biol., 112, 535542.[ISI][Medline]
Demchuk,E. and Wade,R.C. (1996) J. Phys. Chem., 100, 1737317387.[ISI]
Djinovic,K., Coda,A., Antolini,L., Pelosi,G. Desideri,A., Falconi,M., Rotilio,G. and Bolognesi,M. (1992) J. Mol. Biol., 226, 227238.[ISI][Medline]
Dueck,G. and Scheuer,T. (1990) J.Comput.Phys., 90, 161175.
Eigenbrot,C., Randal,M. and Kossiakoff,A. (1992) Proteins, 14, 7587.[ISI][Medline]
Gilson,M.K. and Honig,B. (1988) Proteins, 4, 718.[ISI][Medline]
Gilson,M.K., Sharp,K.A. and Honig,B. (1987) J.Comput.Chem., 9, 327335.
Hendrickson,W.A and Teeter,M.M. (1981) Nature, 290, 107113.[ISI]
Honig,B. and Nicholls,A. (1995) Science, 268, 11441149.[ISI][Medline]
Hooft,R.W.W., Sander,C. and Vriend,G. (1996) Proteins, 26, 363376.[ISI][Medline]
Klapper,I., Hagstrom,R., Fine,R., Sharp,K. and Honig,B. (1986) Proteins, 1, 4759.[Medline]
Madura,J.D. et al. (1995) Comput. Phys. Commun., 91, 5795.[ISI]
Madura,J.D., Briggs,J.M., Wade,R.C. and Gabdoulline,R.R. (1998) Encyclopedia Comput. Chem., 1, 141154.
McDonald,I.K. and Thornton,J.M. (1995) Protein Engng, 8, 217224.[Abstract]
Nicholls,A. and Honig,B. (1991) J.Comput.Chem., 12, 435445.
Nicholls,A., Sharp,K. and Honig,B. (1991) Proteins, 11, 281296.[ISI][Medline]
Rogers,N.K. and Sternberg,M.J. (1984) J. Mol. Biol., 174, 527542.[ISI][Medline]
Schreiber,G. and Fersht,A.R. (1996) Nature Struct. Biol., 3, 427431.[ISI][Medline]
Sham,Y.Y., Muegge,I. and Warshel,A. (1998) Biophys. J., 74, 174453.
Sharp,K., Fine,R. and Honig,B. (1987) Science, 236, 14601463.[ISI][Medline]
Sitkoff,D., Sharp,K.A. and Honig,B. (1994) J. Phys. Chem., 98, 19781988.[ISI]
Vriend,G. (1990) J. Mol. Graphics, 8, 5256.[ISI][Medline]
Warwicker,J. (1997) Protein Engng, 10, 809814.[Abstract]
Warwicker,J. and Watson,H.C. (1982) J. Mol. Biol., 157, 671679.[ISI][Medline]
Yang,A.-S., Gunner,M.R., Sampogna,R., Sharp,K. and Honig,B. (1993) Proteins, 15, 252265.[ISI][Medline]
Received September 9, 1998; revised April 23, 1999; accepted May 18, 1999.