Carbonyl–carbonyl interactions stabilize the partially allowed Ramachandran conformations of asparagine and aspartic acid

Charlotte M. Deane1, Frank H. Allen2, Robin Taylor2 and Tom L. Blundell1,3

1 Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1GA, 2 Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge CB2 1EZ, UK


    Abstract
 
Asparagine and aspartate are known to adopt conformations in the left-handed α-helical region and other partially allowed regions of the Ramachandran plot more readily than any other non-glycyl amino acids. The reason for this preference has not been established. An examination of the local environments of asparagine and aspartic acid in protein structures with a resolution better than 1.5 Å revealed that their side-chain carbonyls are frequently within 4 Å of their own backbone carbonyl or the backbone carbonyl of the previous residue. Calculations using protein structures with a resolution better than 1.8 Å reveal that this close contact occurs in more than 80% of cases. This carbonyl–carbonyl interaction offers an energetic stabilization for the partially allowed conformations of asparagine and aspartic acid with respect to all other non-glycyl amino acids. The non-covalent attractive interaction between the dipoles of two carbonyls has recently been calculated to have an energy comparable to that of a hydrogen bond. The preponderance of asparagine in the left-handed α-helical region, and in general of aspartic acid and asparagine in the partially allowed regions of the Ramachandran plot, may be a consequence of this carbonyl–carbonyl stacking interaction.

Keywords: asparagine/aspartic acid/carbonyl stacking/propensity/Ramachandran plot


    Introduction
 
Conformational preferences of amino acid residues in proteins are important in relating sequence to structure. Ramachandran and Sasisekharan (1968) identified three types of region in conformational space, defined by the main chain φ/ψ angles: (a) fully allowed regions, such as the right-handed α-helical region, where all amino acids can occur whilst preserving ideal peptide geometry; (b) partially allowed regions, often bridges between allowed regions, where residues can occur but only at the cost of distorting from ideal peptide geometry to avoid clashes between atoms; and (c) disallowed regions in which amino acids can only occur with severe geometric distortions.


    Amino acid Ramachandran preferences
 
Although non-glycyl residues can, in principle, adopt conformations anywhere within the allowed regions, preferences within this space depend upon the amino acid residue type. Specific residue-dependent backbone conformational patterns are well known even in structurally variable regions (Sibanda and Thornton, 1985; Donate et al., 1996).

The empirical rules of amino acid conformational propensities calculated from known structures have been examined many times and with a large number of different structural data sets (Chou and Fasman, 1978; McGregor et al., 1987; Swindells et al., 1995). However, while many features are well documented, most have not been explained in terms of the interactions that stabilize them. In this context asparagine is interesting (Richardson, 1981) because it has a high preference for the partially allowed regions of the Ramachandran plot (Chou and Fasman, 1974) and in particular for the left-handed α-helical region (α-L) (Srinivasan et al., 1994). Aspartic acid also shows a high preference for the partially allowed regions of the Ramachandran plot. Its propensity for the α-L region is lower than that of asparagine but is still well above average for the amino acids (Chou and Fasman, 1978).

Why any amino acid residue is found in the partially allowed Ramachandran regions is a question of interest. One possibility is that an amino acid may tolerate small deviations from its ideal conformation in order to optimize stabilizing tertiary interactions in the protein, such as hydrogen-bonding patterns, the burial of hydrophobic residues or interactions with the substrate or ligand at the active site (Moult and Herzberg, 1991). Thus, the unfavourable energy of these small local distortions can be compensated by the energetic favourability of the overall structure. Amino acids with a high propensity for partially allowed Ramachandran regions should therefore display some form of stabilization with respect to all other amino acids in that conformation to explain this relative preponderance.

The α-L region is a small region of the Ramachandran plot with positive φ, labelled area E in Figure 2. For non-glycyl residues it has a higher energy than the larger right-handed α-helical region with negative φ. L-Amino acids in the α-L conformation exhibit a close approach of the backbone carbonyl oxygen to the β-carbon of the side chain, a situation which does not occur in the right-handed α-helical region. α-L has particularly unfavourable steric clashes for β-branched amino acids such as Thr, Val and Ile. Asparagine adopts conformations in α-L and other partially allowed regions more readily than any other non-glycyl residue. It has often been assumed that this preference is due to hydrogen bonding by the asparagine side chain back to a preceding main-chain carbonyl, but this hydrogen-bonding pattern does not occur very frequently (C.M.Deane, unpublished results).



 
Fig. 2. The five areas of the Ramachandran plot considered in this analysis.

 

    Carbonyl–carbonyl interactions
 
An alternative source of stabilization energy can arise through non-covalent attractive interactions between pairs of >C(δ+)=O(δ–) (carbonyl) dipoles. Taylor et al. (1990) and Gavezzotti (1990) have previously examined small-molecule crystallographic data and commented on the close approach of carbonyl groups and on the likely importance of these dipolar interactions. Recently, Allen et al. (1998) carried out a systematic study of such interactions between ketonic groups in the Cambridge Structural Database (CSD) (Allen et al., 1991) and characterized three main types of interaction motif: (a) a sheared anti-parallel motif with two short carbon–oxygen interactions; (b) a perpendicular motif with only one short carbon–oxygen interaction; and (c) a highly sheared parallel motif with only one short carbon–oxygen interaction. Their calculations, using the intermolecular perturbation theory of Hayes and Stone (1983) and the acetone dimer as a model system, gave an attractive interaction energy of –22.3 kJ/mol for the anti-parallel dimer, with carbon–oxygen separations of 3.02 Å. For a single carbonyl–carbonyl interaction in the perpendicular motif the energy was –7.6 kJ/mol at this separation. Hence some carbonyl–carbonyl interactions are clearly competitive with hydrogen bonds in terms of their stabilization energy.

MacCallum et al. (1995a,b) examined the importance of Coulombic interactions between backbone carbonyls in proteins as a stabilizing factor in α-helices, β-sheets and the right-handed twist often observed in β-strands. Their calculations indicate an attractive interaction energy of approximately –8 kJ/mol in specific cases and they remark that, within their computational model, these interactions are ~80% as strong as the backbone hydrogen bonds in proteins.

In this paper, we examine the possibility of asparagine and aspartic acid side-chain carbonyls interacting with neighbouring backbone carbonyls in this fashion. We show that such interactions can stabilize these two amino acid types in partially allowed conformations in a way that is not available to other amino acids.


    Results and discussion
 
Asparagine and aspartic acid show a preference for regions outside regular secondary structure, as shown in Figure 1A. As expected, Pro and Gly also both show a preference for regions outside regular secondary structure. Pro disrupts secondary structure and is often found as a capping residue. Gly is often found in loop regions, probably because a conformation with positive φ is often required to complete a turn.



 
Fig. 1. (A) The propensities of the 20 naturally occurring amino acids for regions outside regular secondary structure. The data set used is PDB chains with <35% identity and resolution better than 1.8 Å. (B) The five areas of the Ramachandran plot used are shown in Figure 2; A and B correspond to fully allowed regions and C, D and E to partially allowed regions. The data set used is PDB chains with <35% identity and resolution better than 1.8 Å. The average is calculated across all the 18 amino acids under consideration. Shown is the percentage of Asn/Asp/Glu/Gln involved in carbonyl stacking over their own backbone carbonyl or that of the preceding residue. 'Overall' is not an average of the areas, as some residues do not have φ/ψ within any of the five defined regions. The carbon–oxygen distance cut-off used to identify the interaction was 4 Å.

 
The five areas used in this analysis are shown in Figure 2. Areas A, B and E are generally similar to the original Efimov (1980) regions or the extension of these defined by Wilmot and Thornton (1990). Areas A and B correspond to the α and β regions, respectively, and E corresponds to an amalgamation of α-L and ω-L. Areas C and D are partially allowed regions which in the Wilmot and Thornton (1990) definition are incorporated into the large β and α areas; here they are used as individual regions in their own right. Asparagine and aspartic acid show a marked preference for the partially allowed regions of the Ramachandran map (confirmed to be statistically significant by the χ²-test).

Stabilizing interactions

An examination of the 45 examples of asparagine and the 14 examples of aspartic acid in area E, from test set structures having resolutions better than 1.5 Å, was carried out to search for possible stabilizing interactions that might account for the high propensity of asparagine in this region. This revealed that the asparagine/aspartic acid side chains are frequently solvent exposed or hydrogen bonded to distant parts of the protein chain (that is, not hydrogen bonded back to the neighbouring backbone atoms). If this distant hydrogen bonding were the only stabilizing interaction, glutamic acid and glutamine should have similarly high propensities, since they have the same hydrogen-bonding possibilities. However, an interaction was noted that might add extra stability to asparagine and aspartic acid residues in area E: a carbonyl–carbonyl interaction. This stabilization is local and theoretically available to four of the amino acids, asparagine, aspartic acid, glutamine and glutamic acid, as they all possess carbonyls in their side chains. The percentage of residues involved in an interaction between the side-chain carbonyl of residue X and the backbone carbonyl of X or X – 1 was calculated for each of the areas and overall for the residue types asparagine, aspartic acid, glutamic acid and glutamine, as shown in Figure 1B. Glutamine and glutamic acid show very little local carbonyl stacking, presumably owing to steric constraints. Of the asparagine residues in area E, 79% are involved in carbonyl–carbonyl interactions at a carbon–oxygen separation of ≤4 Å. The 4 Å cut-off was selected because an appreciable attractive interaction energy persists out to the sum of the van der Waals radii + 0.5 Å (Allen et al., 1998). The same ratio between the areas and the different amino acids is observed when the distance cut-off is varied between 3.5 and 4.2 Å.

Carbonyl–carbonyl stacking of the side chain over main chain

These data show that asparagine and aspartic acid have far greater percentages of carbonyl–carbonyl interactions per residue than glutamine and glutamic acid and that asparagine shows the greatest percentage of all. Carbonyl–carbonyl interactions can be seen in all the regions for asparagine and aspartic acid but are least prevalent in area A (Figure 1B).

Two types of carbonyl–carbonyl interaction were observed: type 0, stacking of the asparagine/aspartic acid side-chain carbonyl over its own backbone carbonyl; and type 1, stacking of the asparagine/aspartic acid side-chain carbonyl over the backbone carbonyl of the previous residue. An example of each is shown in Figure 3. These can generally be correlated with the two most common χ1 values of asparagine/aspartic acid. Type 0 is usually associated with χ1 in the trans (–180°) conformation and type 1 with χ1 in the g+ (–60°) conformation. The two types have slightly different percentages of occurrence in the five different areas. They also have very different average distances of interaction: using the shortest carbon–oxygen separation, type 0 is about 0.4 Å shorter than type 1 on average.
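The correlation with χ1 described above can be checked numerically. The following is a minimal sketch, assuming χ1 is the N–Cα–Cβ–Cγ torsion angle and that each rotamer is taken as a ±60° window around its nominal value; the function names, the window width and the toy coordinates are illustrative assumptions, not part of the original analysis. The label "g+" for –60° follows the convention used in this paper.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle p0-p1-p2-p3 in degrees (between -180 and 180)."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)   # normals of the two planes
    b1_len = math.sqrt(_dot(b1, b1))
    x = _dot(n1, n2)
    y = _dot(b1, _cross(n1, n2)) / b1_len
    return math.degrees(math.atan2(y, x))

def angular_difference(a, b):
    """Smallest absolute difference between two angles in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def chi1_label(chi1, window=60.0):
    """Label chi1 as 'trans' (near 180), 'g+' (near -60, paper's convention) or 'other'."""
    if angular_difference(chi1, 180.0) < window:
        return "trans"
    if angular_difference(chi1, -60.0) < window:
        return "g+"
    return "other"

# Toy coordinates (not from any PDB entry) built to give chi1 = -60 degrees.
n, ca, cb = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
cg = (math.cos(math.radians(-60.0)), math.sin(math.radians(-60.0)), 1.0)
chi1 = dihedral(n, ca, cb, cg)
print(round(chi1, 1), chi1_label(chi1))
```

Since the text says only that the interaction types are "usually" associated with these rotamers, the label is a hint about the likely type rather than a classification of it.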




 
Fig. 3. Two Asn side chains from area E involved in carbonyl stacking: carbons grey; oxygens red; nitrogens blue. The Asn side chain (with its Cα indicated by the black dot) and the backbone atoms of the neighbouring residues are displayed. (A) Asn 172 of 2eng, χ1 is trans and the carbonyl on the side chain of Asn is over its own backbone carbonyl with a separation of 2.9 Å. (B) Asn 74 of 1arb, χ1 is g+ and the carbonyl on the side chain of Asn is over the previous amino acid's backbone carbonyl with a separation of 3.9 Å.

 
An analysis of all pairs with a separation less than the 4 Å cut-off showed that 70% could be considered to be in the sheared parallel motif as described by Allen et al. (1998) and only 15% could not be classified into any of their three interaction motifs. The preference for the sheared parallel motif may arise because both of the O atoms in this motif are exposed, so that both can take part in additional hydrogen bonding or other interactions, which is not true for the perpendicular or sheared anti-parallel motif. The sheared parallel motif has an attractive energy of about –7.6 kJ/mol at a separation of 3.07 Å (Allen et al., 1998). This energy is comparable to that obtained by MacCallum et al. (1995a,b), on which they based their arguments for secondary structure stabilization by carbonyl–carbonyl interactions. Further, it is comparable to the energies of the stronger carbon–hydrogen to oxygen interactions which are known to play a significant role in the stabilization of small molecule structures (see, e.g., Desiraju, 1989).

The remaining question in this analysis is why asparagine has a much greater propensity than aspartic acid for area E and for the partially allowed regions in general, when both have the possibility of forming carbonyl–carbonyl interactions. The carbonyl–carbonyl interaction is seen less often for aspartic acid, particularly in area E. It would seem that the interaction is weaker for aspartic acid, probably because of greater repulsion between the charged oxygen atoms of the side chain when they are close to the backbone oxygen atoms.


    Conclusion
 
This analysis provides an explanation for the predominance of asparagine in the α-L conformation and also a reason for the high preference of asparagine and aspartic acid for the partially allowed regions of the Ramachandran map and their preference for regions outside regular secondary structure in general.

This predominance is explained by a stabilization that can only be achieved by asparagine and aspartic acid. This is a carbonyl–carbonyl interaction between the side-chain carbonyl and local backbone carbonyls, specifically the backbone carbonyl of the asparagine/aspartic acid residue in question or that of the preceding residue. The interaction energy of this dipole–dipole interaction has previously been calculated and shown to be of comparable strength to that of a hydrogen bond. This means that there is stabilization energy available to asparagine/aspartic acid for the partially allowed regions of the Ramachandran plot in all structural contexts that is not available to any other amino acid. Hence it is energetically more favourable for an aspartic acid or asparagine to be found in these partially allowed regions than any other non-glycyl amino acid.

This analysis emphasizes that close examination of unusual propensities in the Ramachandran plot can reveal possible stabilizing interactions for these features, which in turn can lead to a more general understanding of the dependence of protein tertiary structure on primary structure.


    Materials and methods
 
Amino acid propensities were calculated from known protein structures to describe the preferences of amino acids for regions in the Ramachandran plot using the equation

\[ P = \frac{n(X)_a / n(X)}{N_a / N} \]

where P = propensity, n(X)_a = number of amino acid type X in area a, n(X) = total number of amino acid type X, N_a = number of all amino acids in region a and N = total number of all amino acids.

A value P > 1 indicates a preference and a value P < 1 indicates disfavour for a region. The calculations were done with and without secondary structure as defined by Kabsch and Sander (1983), and also with and without Pro and Gly as they have unique conformational features (Kabsch and Sander, 1983). Propensities were also calculated for regions outside regular secondary structure replacing the number in an area of the plot by the number not involved in secondary structure.
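As a worked illustration of the propensity, the sketch below assumes that it is the usual ratio of two fractions implied by the symbol definitions: the fraction of residues of type X that fall in area a, divided by the fraction of all residues that fall in area a. The function name and the counts are hypothetical and are not taken from the data set used here.

```python
def propensity(n_x_a, n_x, n_a, n_total):
    """Propensity P = (n(X)_a / n(X)) / (N_a / N).

    n_x_a   -- number of residues of type X in area a
    n_x     -- total number of residues of type X
    n_a     -- number of residues of any type in area a
    n_total -- total number of residues
    """
    if n_x == 0 or n_a == 0:
        raise ValueError("n(X) and N_a must be non-zero")
    # Algebraically identical to the ratio of fractions, written as a
    # single division to limit floating-point rounding.
    return (n_x_a * n_total) / (n_x * n_a)

# Hypothetical counts: 30 of 200 residues of type X lie in area a,
# against 500 of 10000 residues overall.
p = propensity(n_x_a=30, n_x=200, n_a=500, n_total=10000)
print(p)  # a value above 1 indicates a preference for the area
```

With these made-up counts 15% of type X lies in the area against 5% of all residues, so the residue type is three times over-represented there.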

The Protein Data Bank (PDB) (Abola et al., 1987, 1997) files to be used for calculation of these propensities were selected using PDB_SELECT (Hobohm et al., 1992; Hobohm and Sander, 1994) to collect 253 PDB chains with an R factor below 30%, a resolution better than 1.8 Å and sequence identity less than 35%. The PDB chains were then 'cleaned' (Morris et al., 1992). These chains contain a total of 51151 residues with 3164 aspartic acid and 2525 asparagine residues; 43% of all residues are involved in secondary structure as defined by Kabsch and Sander (1983).

The overall environments of the individual asparagine and aspartic acid residues in these regions were then inspected for interesting features. Carbonyl–carbonyl interactions between the asparagine/aspartic acid/glutamine/glutamic acid side chain and the backbone carbonyl of the asparagine/aspartic acid/glutamine/glutamic acid residue or the preceding one were then calculated. A carbonyl–carbonyl interaction was presumed to occur if the shortest distance between the carbon/oxygen of the side chain to an oxygen/carbon of a backbone carbonyl was below a threshold. The distance cut-off for the analysis of these interactions was varied between 3.5 and 4.2 Å.
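The distance criterion can be sketched as follows. This is an illustrative reading of the text rather than the original program: the shortest carbon–oxygen separation is taken as the smaller of the side-chain C to backbone O distance and the side-chain O to backbone C distance, and a contact is counted when it falls below the cut-off. The function names and coordinates are made up for the example.

```python
import math

def shortest_co_separation(sc_c, sc_o, bb_c, bb_o):
    """Shortest carbon-oxygen distance (in Å) between two carbonyl groups."""
    return min(math.dist(sc_c, bb_o), math.dist(sc_o, bb_c))

def has_carbonyl_contact(sc_c, sc_o, bb_c, bb_o, cutoff=4.0):
    """True if the shortest C...O separation is below the cut-off."""
    return shortest_co_separation(sc_c, sc_o, bb_c, bb_o) < cutoff

# Hypothetical coordinates (Å): two carbonyls stacked 3.8 Å apart.
sc_c, sc_o = (0.0, 0.0, 0.0), (1.2, 0.0, 0.0)
bb_c, bb_o = (1.2, 3.8, 0.0), (0.0, 3.8, 0.0)

# Repeat the test over cut-offs spanning the 3.5-4.2 Å range.
cutoffs = [3.5, 3.8, 4.0, 4.2]
sweep = {c: has_carbonyl_contact(sc_c, sc_o, bb_c, bb_o, cutoff=c) for c in cutoffs}
print(sweep)
```

Counting the residues for which the test succeeds, per area and per residue type, gives percentages of the kind reported in Figure 1B.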

In the X-ray crystal structures of proteins defined at resolutions poorer than 1.5 Å (numerically >1.5 Å), the electron density map cannot distinguish between carbon, nitrogen and oxygen atoms and the hydrogen atoms can rarely be seen. For the majority of side chains the atoms can be uniquely identified from the shape of the electron density, but asparagine, glutamine and histidine side chains appear symmetrical in the electron density and some specific atoms can only be identified on the basis of their interactions, principally hydrogen bonds. In the cases of asparagine and glutamine, the problem is to distinguish between the side-chain nitrogen and oxygen atoms. This relates directly to the analysis of the carbonyl–carbonyl interaction described here. In this study the asparagine and glutamine side chains were allowed to occur in either orientation of the amide group, and the orientation that gave a carbonyl–carbonyl interaction was used in the calculation.
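The treatment of the ambiguous amide orientation can be sketched as below. This is a hedged illustration, not the original code: both assignments of the side-chain oxygen (the deposited OD1 position and the deposited ND2 position) are tested against the distance cut-off, and an assignment that gives a contact is kept. Choosing the shorter separation when both qualify is an assumption of this sketch, as are the function names and the toy coordinates.

```python
import math

def co_separation(sc_c, sc_o, bb_c, bb_o):
    """Shortest carbon-oxygen distance (in Å) between two carbonyl groups."""
    return min(math.dist(sc_c, bb_o), math.dist(sc_o, bb_c))

def best_amide_assignment(cg, od1, nd2, bb_c, bb_o, cutoff=4.0):
    """Try both amide orientations of an Asn side chain.

    Returns (label, separation) for an orientation whose carbonyl lies
    within the cut-off of the backbone carbonyl, or None if neither does.
    """
    hits = []
    for label, oxygen in (("as_deposited", od1), ("flipped", nd2)):
        d = co_separation(cg, oxygen, bb_c, bb_o)
        if d < cutoff:
            hits.append((d, label))
    if not hits:
        return None
    d, label = min(hits)  # assumption: prefer the shorter separation
    return label, d

# Made-up coordinates (Å) in which only the flipped amide gives a contact.
cg, od1, nd2 = (0.0, 0.0, 0.0), (0.0, -1.2, 0.0), (1.0, 0.7, 0.0)
bb_c, bb_o = (1.2, 3.0, 0.0), (0.0, 5.0, 0.0)
result = best_amide_assignment(cg, od1, nd2, bb_c, bb_o)
print(result)
```

Because the orientation is chosen to favour a contact, the resulting percentages for asparagine and glutamine are best read as upper bounds.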

The type of interaction motif was defined by calculation of the four angles A (C1–O1–C2), B (O1–C2–O2), C (C2–O2–C1) and D (O2–C1–O1) (subscript 1 indicating the side-chain carbonyl and 2 the backbone carbonyl).

The three ideal motifs as described by Allen et al. (1998) are as follows: (a) an anti-parallel motif with two short carbon–oxygen interactions and ideal angle pattern 90, 90, 90, 90°; (b) a perpendicular motif with only one short carbon–oxygen interaction and angle pattern 75, 15, 180, 90°; and (c) a parallel motif with only one short carbon–oxygen interaction and angle pattern 90, 90, 55, 55°.

If all angles were within 15° of these ideal patterns (carbonyl bond length 1.2 Å, carbon–oxygen contact 3.5 Å and no shearing) and at least the expected number of short carbon–oxygen interactions were present in any interacting pair, they were considered to form that motif. The angle pairs A,C and B,D are interchangeable for calculation of similarity to the angle pattern.
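The angle-based part of this motif assignment can be sketched as follows. The sketch makes two assumptions that should be checked against Allen et al. (1998): each angle is measured at the middle atom of the named triple, and "interchangeable" is read as swapping the labels of the two carbonyls, which exchanges A with C and B with D together. The additional check on the number of short carbon–oxygen contacts is left out, and all names and coordinates are illustrative.

```python
import math

def angle_deg(a, vertex, b):
    """Angle a-vertex-b in degrees."""
    u = [p - q for p, q in zip(a, vertex)]
    v = [p - q for p, q in zip(b, vertex)]
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(x * x for x in v))
    cos_angle = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cos_angle))

def motif_angles(c1, o1, c2, o2):
    """Angles A (C1-O1-C2), B (O1-C2-O2), C (C2-O2-C1) and D (O2-C1-O1)."""
    return (angle_deg(c1, o1, c2), angle_deg(o1, c2, o2),
            angle_deg(c2, o2, c1), angle_deg(o2, c1, o1))

# Ideal angle patterns (A, B, C, D) in degrees, as quoted in the text.
IDEAL_PATTERNS = {
    "anti-parallel": (90.0, 90.0, 90.0, 90.0),
    "perpendicular": (75.0, 15.0, 180.0, 90.0),
    "parallel": (90.0, 90.0, 55.0, 55.0),
}

def classify_motif(angles, tolerance=15.0):
    """Return the first motif whose ideal pattern matches within the tolerance."""
    a, b, c, d = angles
    variants = [(a, b, c, d), (c, d, a, b)]  # relabelling the two carbonyls
    for name, ideal in IDEAL_PATTERNS.items():
        for variant in variants:
            if all(abs(x - y) <= tolerance for x, y in zip(variant, ideal)):
                return name
    return None

# Idealized rectangle: C=O length 1.2 Å, C...O contact 3.5 Å, no shearing.
c1, o1 = (0.0, 0.0, 0.0), (1.2, 0.0, 0.0)
c2, o2 = (1.2, 3.5, 0.0), (0.0, 3.5, 0.0)
angles = motif_angles(c1, o1, c2, o2)
print(angles, classify_motif(angles))
```

Pairs that match none of the three patterns return None, which corresponds to the unclassified pairs mentioned in the results.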


    Notes
 
3 To whom correspondence should be addressed. E-mail: tom@cryst.bioc.cam.ac.uk


    References
 
Abola,E.E., Bernstein,F.C., Bryant,S.H., Koetzle,T.F. and Weng,J. (1987) Data Commission of the International Union of Crystallography, 107–132.

Abola,E.E., Sussman,J.L., Prilusky,J. and Manning,N.O. (1997) Methods Enzymol., 277, 556–571.

Allen,F.H. et al. (1991) J. Chem. Inf. Comput. Sci., 31, 187–204.

Allen,F.H., Baalham,C.A., Lommerse,J.P.M. and Raithby,P.R. (1998) Acta Crystallogr., B54, 320–329.

Chou,P.Y. and Fasman,G.D. (1974) Biochemistry, 13, 211–222.

Chou,P.Y. and Fasman,G.D. (1978) Adv. Enzymol., 47, 45–148.

Desiraju,G.R. (1989) Crystal Engineering and Design of Organic Solids. Academic Press, New York.

Donate,L.E., Rufino,S.D., Canard,L.H.J. and Blundell,T.L. (1996) Protein Sci., 5, 2600–2616.

Efimov,A.V. (1980) Mol. Biol. (Moscow), 20, 208–216.

Gavezzotti,A. (1990) Phys. Chem., 94, 4319–4325.

Hayes,I.C. and Stone,A.J. (1986) J. Mol. Phys., 53, 83–105.

Hobohm,U. and Sander,C. (1994) Protein Sci., 3, 552.

Hobohm,U., Scharf,M., Schneider,R. and Sander,C. (1992) Protein Sci., 1, 409–417.

Kabsch,W. and Sander,C. (1983) Biopolymers, 22, 2577–2637.

MacCallum,P.H., Poet,R. and Milner-White,E.J. (1995a) J. Mol. Biol., 248, 361–373.

MacCallum,P.H., Poet,R. and Milner-White,E.J. (1995b) J. Mol. Biol., 248, 374–384.

McGregor,M.J., Islam,S.A. and Sternberg,M.J.E. (1987) J. Mol. Biol., 198, 295–310.

Morris,A.L., MacArthur,M.W., Hutchinson,E.G. and Thornton,J.M. (1992) Proteins: Struct. Funct. Genet., 12, 345–364.

Moult,J. and Herzberg,O. (1991) Proteins: Struct. Funct. Genet., 11, 223–229.

Ramachandran,G.N. and Sasisekharan,V. (1968) Adv. Protein Chem., 28, 283–437.

Richardson,J.S. (1981) Adv. Protein Chem., 34, 167–339.

Sibanda,B.L. and Thornton,J.M. (1985) Nature, 316, 170–174.

Srinivasan,N., Anuradha,V.S., Ramakrishnan,C., Sowdhamini,R. and Balaram,P. (1994) Int. J. Pept. Protein Res., 44, 112–122.

Swindells,M.B., MacArthur,M.W. and Thornton,J.M. (1995) Nature Struct. Biol., 2, 596–603.

Taylor,R., Mullaley,A. and Mullier,G.W. (1990) Pestic. Sci., 29, 197–213.

Wilmot,C.M. and Thornton,J.M. (1990) Protein Engng, 3, 479–493.

Received February 3, 1999; revised March 19, 1999; accepted March 24, 1999.