1 Biology Department, 402 Kell Hall, GSU, 24 Peachtree Center Avenue, and 4 Computer Science Department, Georgia State University, Atlanta, GA 30303, USA and 2 Department of Chemistry, Moscow State University, 119899 Moscow, Russia
3 To whom correspondence should be addressed, at the first address. E-mail: biotiy{at}suez.cs.gsu.edu
Abstract
Keywords: bioinformatics/carboxylates/hydrogen bonding/protein tertiary structure/side chain–side chain packing
Introduction
However, permanent covalent attachment of the hydrogen to the donor may not be an essential requirement for the formation of stable hydrogen bonds. One well-known example of a close pair of carboxylates is the catalytic center of aspartic proteases (Kashparov et al., 1998). Although it is generally agreed that only one of the catalytic aspartates is protonated most of the time, the location of the proton is less clear and different positions have been postulated. In fact, the location of the proton is likely to change during the enzyme reaction. The simplest model of the transition state of HIV-1 protease consists of a free proton located between the four oxygen atoms (Harrison and Weber, 1994). During molecular dynamics simulation the proton was kinetically stable and also had a stable energy minimum, despite the absence of a covalent bond. The average distance of the proton from the delta oxygen atoms of the aspartates was 2.2 ± 0.1 Å. The small variance of this distance suggested that the proton could form multicenter hydrogen bonds of the kind known in inorganic chemistry (Jeffrey and Sanger, 1991). Taking this shared proton into account, as described in our simplest model, also significantly improves the results of the molecular mechanics calculations on HIV-1 protease (Harrison and Weber, 1994). The close interaction between Asp30 and the carboxyl of the glutamate at P2' of a substrate analog of HIV-1 protease (Weber et al., 1997) is also likely to be mediated by a shared proton, and this interaction may be important for viral replication (Pettit et al., 1994; Weber et al., 1997). Another example is Glu664 in the erbB-2 receptor. NMR and FTIR spectroscopy provide direct evidence that the side chain carboxyl of Glu664 is protonated and forms strong hydrogen bonds, influencing activation of the receptor (Smith et al., 1996).
Thus, a stable interaction can be formed between a pair of Asp–Glu residues. At first glance, it seems unlikely that two amino acid residues of the same charge (such as aspartates or glutamates) could ever stably interact. However, a positively charged particle (such as a metal ion or a proton), placed between the two residues, is likely to stabilize the pair of negatively charged residues. Alternatively, the two negative residues may be forced into contact by the constraints of the structure. This paper analyzes unusually close Asp–Glu pairs of residues in a large set of protein chains from the PDB (Berman et al., 2000) and seeks evidence that a stabilizing interaction between two negative residues is possible through a shared proton.
Data and methods
We selected 3150 chains from crystal structures with resolution better than 2.5 Å and R-factor <0.2 using a non-redundant set of protein chains (Madej et al., 1995). This non-redundant dataset is updated on a monthly basis and is available online (http://www.ncbi.nlm.nih.gov/Structure/VAST/nrpdb.html); the most current version (as of March 2002) of the PDB and the non-redundant set was used for the present study. NMR structures were not analyzed. The chains in which the close carboxylate pairs were identified are presented in Table I.
The close pairs of carboxylates have been analyzed in relation to the simplest model, in which a free proton interacts with the four oxygen atoms of the two carboxylates. Modeling of the pair of carboxylates from the active site of the HIV-1 aspartic protease (Harrison and Weber, 1994) has shown that the maximal O–H distance is 2.3 Å. This distance cutoff was used as a criterion for identification of the residue pairs in the set of proteins analyzed here. For each two negative residues with the carboxyl oxygens closer than 5 Å, an approximate location of the proton was calculated using the geometry described by Harrison and Weber (1994) and shown in Figure 1a, then the O–H distances were measured. If at least three of the four O–H distances were <2.3 Å, the residue pair was identified as being close. It should be noted that this criterion, although based on a model for an aspartic protease, also identifies residue pairs with geometry very different from that in the aspartic protease (Harrison and Weber, 1994), so this model is valuable for analysis of large datasets of protein structures.
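The selection criterion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the proton position used in the paper follows the geometry of Harrison and Weber (1994) and Figure 1a, which is not reproduced here, so the sketch substitutes the centroid of the four carboxyl oxygens as a stand-in. The function names and the reading of the 5 Å pre-screen as "any oxygen pair closer than 5 Å" are likewise assumptions.

```python
import math
from itertools import product

OH_CUTOFF = 2.3     # maximal O-H distance (Angstrom), taken from the text
OO_PRESCREEN = 5.0  # pre-screen on carboxyl oxygen separation (Angstrom)


def estimate_proton(oxygens_1, oxygens_2):
    """ASSUMPTION: place the proton at the centroid of the four oxygens.

    The paper uses the Harrison and Weber (1994) geometry instead.
    """
    points = list(oxygens_1) + list(oxygens_2)
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def is_close_pair(oxygens_1, oxygens_2):
    """Return True if at least three of the four O-H distances are < 2.3 A.

    Each argument is a pair of (x, y, z) coordinates for the two carboxyl
    oxygens of one Asp or Glu residue.
    """
    # Pre-screen: skip residues whose carboxyl oxygens are all >= 5 A apart.
    nearest = min(math.dist(a, b) for a, b in product(oxygens_1, oxygens_2))
    if nearest >= OO_PRESCREEN:
        return False
    proton = estimate_proton(oxygens_1, oxygens_2)
    oxygens = list(oxygens_1) + list(oxygens_2)
    n_short = sum(math.dist(proton, o) < OH_CUTOFF for o in oxygens)
    return n_short >= 3
```

With a different proton-placement rule only `estimate_proton` would need to change; the counting of short O–H distances stays the same.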
Geometrical distinctions between the pairs of carboxylates were investigated in terms of multiplicity of the potential hydrogen bonds (three- or four-center interaction) and the minimal O–O distance. Application of the criteria can be used to identify both three- and four-center interactions. The geometrical configuration of each pair was described by four distances: d(1OX1, 2OX2), d(1OX2, 2OX1), d(1OX1, 2OX1) and d(1OX2, 2OX2) (Figure 1a). The maximal O–O distance in a pair at which hydrogen bonding is still possible [max d(O–O) in Figure 1b] was double the maximal acceptor–proton distance (4.6 Å, as shown in Figure 1b, the case when atoms 1OX1, H and 2OX2 are collinear). If all four O–O distances were less than the maximal distance, the interaction was labeled as four-center, and three-center otherwise. The minimal O–O distance was selected as the least of the four O–O distances [in Figure 1a the minimal distance is d(1OX2, 2OX2)]. The minimal distance may describe the degree of constraint of the residue pair, as O–O distances <2.8 Å also correspond to van der Waals repulsion in addition to electrostatic repulsion.
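As a minimal sketch of the classification just described, the following Python computes the four O–O distances from oxygen coordinates, labels the pair and reports the minimal distance. The dictionary layout and function names are hypothetical; only the 4.6 Å limit and the four-center/three-center rule come from the text.

```python
import math

MAX_OO = 2 * 2.3  # 4.6 Angstrom: twice the maximal acceptor-proton distance


def oo_distances(res1, res2):
    """Four O-O distances in the order used in the text.

    res1 and res2 are dicts with keys "OX1" and "OX2" mapping to (x, y, z)
    coordinates of the two carboxyl oxygens (hypothetical input layout).
    """
    return [
        math.dist(res1["OX1"], res2["OX2"]),  # d(1OX1, 2OX2)
        math.dist(res1["OX2"], res2["OX1"]),  # d(1OX2, 2OX1)
        math.dist(res1["OX1"], res2["OX1"]),  # d(1OX1, 2OX1)
        math.dist(res1["OX2"], res2["OX2"]),  # d(1OX2, 2OX2)
    ]


def classify_pair(res1, res2):
    """Return (label, minimal O-O distance) for an already selected pair."""
    distances = oo_distances(res1, res2)
    all_within = all(d < MAX_OO for d in distances)
    label = "four-center" if all_within else "three-center"
    return label, min(distances)
```

The function assumes the pair has already passed the O–H selection criterion; it does not repeat that check.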
Identification of the networks of charged residues and interactions with metal ions
A close pair of negatively charged residues may be stabilized not only by a proton, but also by a positive residue nearby and/or a metal ion. Charged residues may participate in more than one ion pair, thus forming charge networks (Barlow and Thornton, 1983). A pair of carboxylates may also form a metal binding site. Therefore, it was necessary to exclude these two cases to identify the pairs that can be stabilized by a proton. The interactions with the nearby charged residues and the metal ions were identified using the program INTG as described by Torshin (1999a). This program was also used to analyze the interactions of the residues with crystallographic water molecules. As no more than 21% of all the carboxylates in the 840 pairs had a water molecule at van der Waals distances, and potential hydrogen bonds of a water molecule with both of the residues were even rarer, these interactions were not analyzed in detail.
Analysis of surface accessibilities
An Asp–Glu residue was considered to be surface accessible if it had relative accessibility of at least ~0.04 (which corresponds to ~15% of the residue surface being accessible to solvent). In the calculations with SURFC (Torshin, 1999b) using a probe size of 1.4 Å, this 0.04 value corresponded to exposure of not only the two oxygens of the carboxylate group, but also almost the whole side chain. Secondary structure was identified with STRIDE (Frishman and Argos, 1995) and figures of protein fragments were prepared using Rasmol (Sayle and Milner-White, 1995).
Results and discussion
Minimal O–O distance and the quality of the X-ray experiments
It is possible that abnormally close carboxylates appear owing to the poor quality of the experimental data or crystallographic refinement. We analyzed the correlation between the minimal O–O distance in the pairs and the quality of the X-ray experiment for the structures in Table I. If the abnormally small O–O distances between the carboxyls were related to the quality of structure determination, the resolution, R-factor and average (for chain) B-factor would systematically increase as the minimal O–O distance decreases. However, the minimal O–O distance and the R- and B-factors were not correlated. Thus, the average quality of the structures does not seem to correlate with the minimal O–O distance.
Possible constraint by S–S bonds and crystal packing
Another possible source of abnormally close pairs of carboxylates may be a constraint put on the pair of negative residues by the rest of the molecule or by crystal packing. One of the most obvious constraints is the presence of S–S bonds in the vicinity of the pair of carboxylates. Although S–S bonds occur in ~14% of all the chains mentioned in Table I, the carboxylate pairs could be constrained by nearby S–S bonds in <3% of all the chains. Analysis of crystal packing (Torshin, 1999a) shows that the carboxylate pairs are usually far from the protein–protein contacts in the crystal (>5 residues along the chain, >10 Å in space) and the possible constraint by the crystal packing may occur in no more than 15% of all the chains in Table I. Moreover, close carboxylate pairs have also been found in many NMR experiments, although the NMR data were not analyzed in detail. Therefore, the close carboxylate pairs do not seem to occur as a result of crystal packing.
Geometrical classification of the residue pairs, three-center and four-center hydrogen bonds
Analysis of the pairwise packing of side chains (Singh and Thornton, 1992) has shown an almost continuous spectrum of orientations with no clear preferences in orientation of the Asp–Glu pairs. Multicenter hydrogen bonds may be three-center (Jeffrey and Sanger, 1991) or four-center (Harrison and Weber, 1994). Therefore, we analyzed the geometrical distinctions in terms of multiplicity of the hydrogen bonds (three- or four-center). Application of the criteria (Figure 1a) identifies both three- and four-center interactions. To distinguish between the two, one may use the calculated coordinates of the proton and describe the configuration by the four O–H distances. The geometrical configuration of each pair can also be described by the four O–O distances (Figure 1a), and in this case the classification includes only the experimental coordinates. Comparison with the analysis based on the calculated coordinates of the proton shows that both methods produce similar results in >90% of cases.
When one of the four O–O distances was less than the maximal O–O distance (Figure 1b), the oxygen atom was marked as involved in multicenter hydrogen bonding. The configuration of each pair of carboxylates (potential H-bonding subset, Table I) was encoded as a binary string (1 stands for an atom involved in the interaction, 0 otherwise). For example, in chain 1how_a the distances for residues Glu217 and Asp719 were 4.484, 3.910, 3.305 and 3.914 Å. All of them are less than 4.6 Å and the binary string for this residue pair is 1111. Thus, 1111 stands for a four-center hydrogen bond, 0011 and 1100 stand for a distinctly three-center interaction and the other strings correspond to intermediate cases. The results of the analysis are summarized in Table II.
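The encoding can be illustrated with the 1how_a example from the text. This is a sketch rather than the authors' program: it assumes one bit per O–O distance, in the order in which the four distances are listed in Data and methods, and the function names are hypothetical.

```python
MAX_OO = 4.6  # maximal O-O distance (Angstrom) at which H-bonding is possible


def encode_pair(distances, max_oo=MAX_OO):
    """Encode four O-O distances as a binary string (1 = below the limit)."""
    return "".join("1" if d < max_oo else "0" for d in distances)


def describe(code):
    """Map a binary string to the categories named in the text."""
    if code == "1111":
        return "four-center"
    if code in ("0011", "1100"):
        return "three-center"
    return "intermediate"


# Glu217 and Asp719 of chain 1how_a, distances as quoted in the text.
code = encode_pair([4.484, 3.910, 3.305, 3.914])
print(code, describe(code))  # prints "1111 four-center"
```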
The separation of the carboxylate pairs along the polypeptide chain was analyzed in order to determine whether their interaction involves mostly local or tertiary structure; 51% of the residue pairs are separated by less than 20 residues [size of a β-hairpin (Espinosa et al., 2001)] and hence can stabilize mostly the local structure. In the remaining 49% the residues are separated by 20–700 residues, which may be important for the tertiary structure stabilization. This is distinct from the ion pairs, which were suggested to stabilize mostly tertiary rather than secondary (local) structure (Barlow and Thornton, 1983).
Residues of the carboxylate pairs are preferably placed on loops. Over the set, 67% of the residues belong to loops, 24% to helices and 9% to β-strands. Loops are the most preferred elements both in the local and the tertiary residue pairs. This differs from the three-center hydrogen bonds formed by the main-chain atoms (Preissner et al., 1991), which systematically occur in α-helices and in β-strands. Loops are fairly flexible elements of secondary structure and thus the close packing of the two negative residues is not likely to be due to an induced constraint. In other words, the more flexible loops are less likely than the more rigid helices and/or β-strands to impose the constraints needed to sustain abnormally close distances between two negatively charged residues.
Temperature factors of the surface-accessible close pairs of carboxylates
Aspartate and glutamate side chains of proteins are often found on the surface and hence are likely to be disordered in the electron density maps. In order to check whether the carboxylates are disordered in the structures used for the analysis, we compared the B-factors of the carboxylate pairs comprised of the surface-accessible Asp and Glu residues and all of the Asp and Glu residues over the set. The results are presented in Figure 3. Two observations are important. First, the maximum of the distribution of the surface-accessible residues of carboxylate pairs is at 15 Å², whereas the distribution of all of the surface-accessible Asp and Glu residues peaks at 25 Å². The difference in the means between the two distributions in Figure 3 was estimated (Efron and Tibshirani, 1993) as being over 10 times higher than the estimated standard error for a random sample of the size of the smaller set. Second, the majority (93%) of the residues of the carboxylate pairs have B-factors <40 Å², while only 76% of all the surface-accessible Asp and Glu residues have B-factors lower than 40 Å².
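One way to express a difference in means in units of a resampling standard error is sketched below. It is an assumption about the procedure, not a reconstruction of it: random samples of the size of the smaller set are drawn with replacement from the larger set, the spread of their means is taken as the standard error, and the observed difference is divided by it. The function name, the number of resamples and the seed are hypothetical.

```python
import random
import statistics


def mean_difference_in_se_units(small, large, n_boot=2000, seed=0):
    """Return (difference / standard error, standard error).

    small: B-factors of the smaller set (e.g. residues of carboxylate pairs)
    large: B-factors of the larger reference set
    The standard error is the standard deviation of the means of n_boot
    random samples of len(small) values drawn with replacement from large.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    k = len(small)
    boot_means = [
        statistics.fmean(rng.choices(large, k=k)) for _ in range(n_boot)
    ]
    se = statistics.stdev(boot_means)
    diff = statistics.fmean(large) - statistics.fmean(small)
    return diff / se, se
```

A ratio well above the usual two or three standard errors indicates that the lower mean of the smaller set is unlikely to be a sampling effect.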
Provided that no structural constraints are placed on a pair of carboxylates, the close configuration may be due to (1) binding of a metal ion, (2) the presence of a charged residue in the vicinity of the [DE]–[DE] pair and (3) a proton, invisible in the electron density map. For some experiments (over 300 entries marked in Table I), data on pH were explicitly stated in the PDB files. Although these values were the pH of the crystallization solution, they may be used for a comparative analysis, as the crystal is in equilibrium with the solution.
The correlation between the pH and percentages of the [DE]–[DE] pairs stabilized by a metal ion, a charged residue and, potentially, by a proton was analyzed (Figure 4). The pH values were between 3 and 9; in the bulk of the experiments pH was >4. As a pair of negative residues can be stabilized only by an intermediate positive particle, physico-chemical reasoning suggests that at higher pH (>7), when protons are less likely to occur, the particle is more likely to be a metal ion or an atom of a positive residue, whereas at lower pH (<7) the stabilizing particle is more likely to be a proton. The results indeed support the idea that more pairs are stabilized by a proton at lower pH (3.1–6). At the same time, more pairs are stabilized by a metal ion or a positive residue at higher pH values (Figure 4).
If the close carboxylate pairs were indeed stabilized by a proton and not by constraints of the rest of the protein, then the minimal O–O distance would correlate with the pH. Although the shared proton in the active site of the HIV protease (Harrison and Weber, 1994) was not likely to be rapidly exchanged with solvent, the rate of such exchange would certainly depend on the surface accessibility of the two carboxylates. Therefore, the correlation between minimal O–O distance and pH should depend on the surface accessibility of the residues of the pair.
It was found that 71% of the residues in the pairs of the H-bonding subset (Table I) were accessible, with surface accessibilities of the separate residues between 0.04 and 0.25 (0.07–0.43 for residue pairs). However, a pH–min d(O–O) correlation is likely to be less statistically significant if such a wide range of surface accessibilities is taken. Therefore, it is necessary to take a narrow range of surface accessibilities, as similar surface accessibilities are likely to imply a comparable spatial environment of the close pairs of carboxylates. The narrowest accessibility ranges in the set that had at least 30 data points were 0.07–0.09, 0.09–0.11, 0.11–0.13 and 0.14–0.16. The rest of the relatively narrow ranges (e.g. 0.17–0.19, 0.20–0.22, etc.) contained too few data points. As there were no more than 15–20 points at a fixed value of accessibility (say, 0.09), these overlapping ranges were taken to ensure a significant number of data points.
The statistical significance of the correlation coefficients in these subsets was assessed using Student's t-test, based on transformation of the coefficient into a Gaussian random variable (Press et al., 1992; Efron and Tibshirani, 1993). The t-test gives the probability that a given correlation coefficient can be the result of choosing pairs of random numbers. The t-value also explicitly includes the dependence of the correlation coefficient on the number of data points. The results are summarized in Table III and an example of correlation between pH and the minimal O–O distance is given in Figure 5. In the tested ranges of relative surface accessibility the correlations are significant to at least the 0.95 confidence level.
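For readers who want to repeat this kind of test, the sketch below shows two standard textbook forms of a significance check for a correlation coefficient r computed from n points: the statistic t = r·sqrt((n − 2)/(1 − r²)) and a two-sided probability from Fisher's z-transformation, which maps r to an approximately Gaussian variable. Which exact variant was used for Table III is not stated here, so the choice of formulas and all function names are assumptions.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """Student's t for the null hypothesis of no correlation (n - 2 d.o.f.)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def fisher_z_p_value(r, n):
    """Two-sided p-value from Fisher's z-transformation (needs n > 3).

    atanh(r) * sqrt(n - 3) is approximately standard normal when the true
    correlation is zero.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2.0))
```

A correlation would be called significant at the 0.95 confidence level when the p-value is below 0.05; the t-value makes the dependence on the number of data points explicit through the factor n − 2.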
The plots of pH versus min d(O–O) (such as in Figure 5) were examined. The slopes of the regression lines were positive and were in a relatively specific range of 0.75–0.96 (Table III). The positive signs of the slopes (Figure 5) suggest plausible physical chemistry: in general, the lower the pH, the smaller the O–O distance will be (as at lower pH more protons are available to stabilize the carboxylate pairs).
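A regression slope of this kind can be obtained with an ordinary least-squares fit, sketched here in plain Python. The function name is hypothetical, and which quantity is placed on which axis is left to the caller, since only the sign and range of the slopes are reported in the text.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept
```

A positive slope means that the two quantities increase together, which is the behavior described above for pH and the minimal O–O distance.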
Conclusion
The packing of any pair of side chains in a protein is determined by a constraint provided by the rest of the molecule as well as by electrostatic attraction/repulsion plus van der Waals repulsion between the two side chains. For two negatively charged side chains, both electrostatic and van der Waals repulsion would prevent close packing. However, close pairs of negatively charged residues are often placed on loops (which are not likely to provide significant constraint) and these residue pairs are without other possible types of constraints (such as disulfide bonding or crystal packing). Moreover, such closely packed pairs are often surface accessible. These three observations suggest an environment in which side chains can move freely to avoid both electrostatic and van der Waals repulsion. In contrast, we observe that the oxygen atoms of the pairs are fairly tightly packed and the temperature factors of the surface-accessible carboxylate pairs were typically lower than those of the other surface-accessible carboxylates. These two observations are more consistent with an attractive rather than a repulsive interaction. In the absence of water molecules, metal ions or positive residues, a proton is the only intermediary that can provide stabilization of a pair. The correlation between pH and the minimal O–O distance is an additional argument that supports the presence of multicenter hydrogen bonds, which are formed by a loose proton shared by side chain carboxyls of the two negatively charged residues. The carboxylate pairs are likely to stabilize both the local and tertiary structure, particularly at low pH. The results suggest that the close pairs of carboxylates should be explicitly treated during molecular simulations of proteins by placing a proton between them.
Acknowledgments
The work was supported in part by the NCI grants CA 76259 and GM 62920.
References
Barlow,D.J. and Thornton,J.M. (1983) J. Mol. Biol., 168, 867–885.
Berman,H.M., Westbrook,J., Feng,Z., Gilliland,G., Bhat,T.N., Weissig,H., Shindyalov,I.N. and Bourne,P.E. (2000) Nucleic Acids Res., 28, 235–242.
Efron,B. and Tibshirani,R.J. (1993) An Introduction to the Bootstrap. Monographs on Statistics and Applied Probability. Chapman and Hall, New York, NY, pp. 28, 53–56, 156–159.
Espinosa,J.F., Munoz,V. and Gellman,S.H. (2001) J. Mol. Biol., 306, 397–402.
Frishman,D. and Argos,P. (1995) Proteins, 23, 566–579.
Harrison,R.W. and Weber,I.T. (1994) Protein Eng., 7, 1353–1363.
Huggins,M.L. (1971) Angew. Chem., Int. Ed. Engl., 10, 147–151.
Ippolito,J.A., Alexander,R.S. and Christianson,D.W. (1990) J. Mol. Biol., 215, 457–471.
Jeffrey,G.A. and Sanger,W. (1991) Hydrogen Bonding in Biological Structures. Springer, Berlin, Germany.
Kashparov,I.V., Popov,M.E. and Popov,E.M. (1998) Adv. Exp. Med. Biol., 436, 115–121.
Madej,T., Gibrat,J.F. and Bryant,S.H. (1995) Proteins, 23, 356–369.
McDonald,I.K. and Thornton,J.M. (1994) J. Mol. Biol., 238, 777–793.
Pettit,S.C., Moody,M.D., Wehbie,R.S., Kaplan,A.H., Nantermet,P.V., Klein,C.A. and Swanstrom,R. (1994) J. Virol., 68, 8017–8027.
Preissner,R., Egner,U. and Sanger,W. (1991) FEBS Lett., 288, 192–196.
Press,W.H., Teukolsky,S.A., Vetterling,W.T. and Flannery,B.P. (1992) Numerical Recipes in C, 2nd edn. Cambridge University Press, Cambridge, UK, pp. 636–639.
Sayle,R.A. and Milner-White,E.J. (1995) Trends Biochem. Sci., 20, 374–375.
Singh,J. and Thornton,J.M. (1992) Atlas of Protein Side-Chain Interactions, Vols I and II. IRL Press, Oxford, UK.
Smith,S.O., Smith,C.S. and Bormann,B.J. (1996) Nature Struct. Biol., 3, 252–258.
Stickle,D.F., Presta,L.G., Dill,K.A. and Rose,G.D. (1992) J. Mol. Biol., 226, 1143–1159.
Torshin,I. (1999a) Front. Biosci., 4, 557–570.
Torshin,I. (1999b) Front. Biosci., 4, 394–407.
Weber,I.T., Wu,J., Adomat,J., Harrison,R.W., Kimmel,A.R., Wondrak,E.M. and Louis,J.M. (1997) Eur. J. Biochem., 249, 523–530.
Received May 15, 2002; revised December 15, 2002; accepted January 17, 2003.