Institute of Biochemistry, Charité, Medical Faculty of the Humboldt University, Monbijoustr 2, 10117 Berlin, Germany
Abstract |
---|
Keywords: classification/helix packing/mimicry/similarity
Introduction |
---|
Because of their aesthetics, graphs of α-helices in biochemical textbooks outlined an idealized picture of this structurally diverse secondary structural element and of its packing. Considerable work has been done in the last two decades to draw a more realistic portrait of the complex helical arrangements. Approaches exploring the propensities of charged amino acids at helix termini (Chou and Fasman, 1978) focused on the helical dipole (Hol et al., 1978), providing explanations for enzymatic functions (Hol, 1985). Analysis of typical helical sequence patterns exhibiting clear (heptad) periodicity of hydrophobicity (Eisenberg et al., 1982) not only revealed the helical wheel model but also allowed the prediction of helices from their sequence, including the relative arrangement (Cohen and Kuntz, 1987) or special types of termini (Bork and Preissner, 1990). A review of the work concerning α-helix-forming amino acid propensities was given by Creamer and Rose (1994).
Along with the growing number of highly resolved protein structures, it became obvious that α-helices are not just ideal cylinders fixed by linear hydrogen bonds: they are often curved (60%) or even kinked (Barlow and Thornton, 1988), and 90% of the hydrogen bonds are bifurcated (Preissner et al., 1990).
The packing of helices has been subjected to a number of detailed studies for particular proteins. Richmond and Richards (1978) examined helix packing in myoglobin in terms of interface size and Voronoi volumes. They pointed out the importance of the size of the contact surface area, which was later emphasized in connection with packing angle preferences (Bowie, 1997).
Efimov (1979) analyzed five proteins and found correlations between side chain rotamers of hydrophobic residues and helix packing (polar/apolar packing model). From a set of 10 proteins the `ridges into grooves' model was developed by Chothia et al. (1981). About 700 pairs of helices from well resolved proteins allowed a statistical analysis of interhelical angles and distances (Reddy and Blundell, 1993). Walther et al. (1996) continued the studies of helices from 220 proteins as lattices in a rigorous mathematical manner. Bowie (1997) found that different packing models do not adequately explain the distribution of the interhelical torsion angle, although Chothia et al. (1981) stated that a simple model would predict parallel helical axes as the most probable.
Experimental approaches to helix transplants between different proteins were made a decade ago (Du Bose and Hartl, 1989), but methods permitting the rational selection of candidates are missing.
The analysis given here concentrates on the molecular surface patches between α-helices as defined in Preissner et al. (1998). The idea is the use of such interfaces within proteins as a learning set for intermolecular recognition processes and for the prediction of contacts occurring during protein folding and association. Their extent is directly governed by the atomic contacts between the secondary structural elements.
Our approach was guided by a number of questions, as follows:
Materials and methods |
---|
This work is based on the Databank of Interfaces in Proteins (DIP); for details, see Preissner et al. (1998). A subset of 300 well-resolved protein structures from the Brookhaven Protein Data Bank (Bernstein et al., 1977) was selected to avoid redundancy. A database of about 150 000 surfaces was created, analyzing the inter-secondary contacts within proteins and their contact with solvent. A retrieval system allows fast scans for different criteria such as origin (helix, coil, sheet), extent (length, width, depth, number of atoms), shape or density.
Definition of secondary structures
The protein structure was dissected into structural elements on the basis of secondary structure. There is a variety of methods to assign the location of structural elements of proteins on the basis of hydrogen bond patterns, φ-, ψ-angles or backbone curvature (Colloc'h et al., 1993). In principle, the assessment of secondary structure is an unambiguous procedure, but known algorithms are not free of artifacts (Colloc'h et al., 1993). To be comparable to other studies we chose the common method of the DSSP program written by Kabsch and Sander (1983) to assign helical segments `H' (4-helix = α-helix) and extended β-strands participating in β-ladder `E'. All other segments are summarized under coil `C'. The most significant artifact of the Kabsch-Sander algorithm is the high frequency of short helices assigned in other methods as coil (Colloc'h et al., 1993). Therefore, only helices with more than four residues were taken into account in this analysis.
Calculation of atomic neighbors
To avoid artifacts in the interface definition, resulting e.g. from different refinement methods during structure determination, we did not rely on a single distance criterion. Instead, for every calculated atomic contact in the distance range from −0.5 to 2.5 Å, the corresponding distance between the van der Waals surfaces was stored.
We used the atomic radii of Stouten et al. (1983). Contacts at negative distances (meaning overlapping van der Waals spheres) are predominantly found for hydrogen-bonded atoms. The compositional data presented here (Tables I and II) refer to the strict criterion of cut-off = 0.0 Å. For larger cut-off values the atomic composition becomes more similar (for details, see Preissner et al., 1998).
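The contact criterion described above can be sketched as follows. This is a minimal illustration, not the DIP implementation: the radii in `VDW_RADIUS`, the atom representation and the function names are assumptions made for the example, and the radii of Stouten et al. are not reproduced here.

```python
import math

# Hypothetical van der Waals radii in Å, for illustration only; the paper
# uses the radii of Stouten et al., which are not reproduced here.
VDW_RADIUS = {"C": 1.7, "N": 1.55, "O": 1.5, "S": 1.8}


def surface_distance(atom_a, atom_b):
    """Distance between the van der Waals surfaces of two atoms.

    Each atom is (element, (x, y, z)). Negative values mean that the
    van der Waals spheres overlap.
    """
    (elem_a, pos_a), (elem_b, pos_b) = atom_a, atom_b
    return math.dist(pos_a, pos_b) - VDW_RADIUS[elem_a] - VDW_RADIUS[elem_b]


def contacts(patch_a, patch_b, cutoff=0.0):
    """All atom pairs whose surface distance does not exceed `cutoff`."""
    found = []
    for i, a in enumerate(patch_a):
        for j, b in enumerate(patch_b):
            d = surface_distance(a, b)
            if d <= cutoff:
                found.append((i, j, d))
    return found
```

Storing the surface distance with each pair, rather than a yes/no flag, is what allows the same contact list to be re-read later at any cut-off.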
Interior interfaces. An interface is built by two patches from neighboring secondary structural elements. The patches are constituted by those atoms fulfilling the above-mentioned contact criterion. This analysis is focused on patches of the following types: helix-helix (HH), helix-extended (HE), helix-coil (HC) and helix-solvent (Hout).
Exterior patches. For the external surface and the surface of larger holes in proteins the positions of neighboring atoms, e.g. the water molecules, are mostly unknown. In our approach the distance of protein atoms to a continuous solvent is defined using the molecular surface according to Connolly (1983a). This exterior surface is generated by the closest possible van der Waals surfaces of virtual solvent atoms and can be used as a neighbor for superficial structural elements. The distance of an atom to the solvent is defined as the minimal distance between the van der Waals and Connolly surfaces.
To verify that the different approaches for interior and exterior patches give similar results, largely inaccessible elements were stripped from their protein environment (e.g. the helix 161–177 from aldolase; PDB code 1ALD) and the cut-off dependence of atomic contacts was compared between the `original protein surrounded' and `virtual solvated' case. Three-quarters of the atoms are found to come into contact at equal cut-off values, some at 0.5 Å lower in the case of virtual solvent because of the assumed perfect atomic packing of the solvent around the protein (mean difference of cut-off values, 0.11 Å; standard deviation, 0.33 Å).
Computation of the Connolly surface
The molecular surface, called here the Connolly surface, serves as a basis for the reasonable handling of solvent patches compared with inter-secondary patches. The first part of the Connolly surface is made up of those parts of the van der Waals surface that are accessible to a solvent probe sphere (radius 1.4 Å) which is rolled over the molecule. Second, to generate a smooth and analytically describable outer-surface contour, parts of the (rolling) probe and tori are joined at circular arcs (Connolly, 1983b). For patches exposed to solvent the Connolly surface is used as a close virtual neighbor. The membership of the particular atoms to the solvent patch is estimated by their smallest distance to the Connolly surface (for details, see Preissner et al., 1998).
Amino acid propensities for molecular surface patches
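The generally used (helix) propensity, e.g. for Ala (<P>HA = 1.52), is calculated according to

$$\langle P\rangle_{H_A} = \frac{H_A / H_{total}}{A_{total} / AA_{total}}$$

The contact preference for a particular patch type, e.g. for Ala in helix-helix patches, is scaled to the occurrence of the residue in helices:

$$\langle P\rangle_{HH_A} = \frac{HH_A / HH_{total}}{H_A / H_{total}}$$
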
Variable | Explanation of the variable | Total number |
---|---|---|
HA | number of Ala residues in helices | 3197 |
Htotal | number of residues in helices | 24 694 |
AAtotal | number of amino acids in the entire database | 87 707 |
Atotal | number of Ala residues in the entire database | 7425 |
HHA | number of Ala residues in helix-helix patches | 2099 |
HHtotal | number of residues in helix-helix patches in the entire database | 17 050 |
Residues in patches are counted if at least one of their atoms is in contact with the neighboring secondary structure or solvent. In this respect one residue may occur in several patches. Therefore, given values are not propensities in a strict sense, but deviating effects for particular interfaces will be indicated.
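As a worked example, the counts listed above can be turned into the two ratios. The sketch assumes that a propensity is the frequency of a residue type in a subset divided by its frequency in a reference set, with helices as the reference for the helix-helix contact preference; the function name `propensity` is hypothetical.

```python
def propensity(count_in_subset, subset_total, count_in_reference, reference_total):
    """Frequency of a residue type in a subset relative to a reference set."""
    return (count_in_subset / subset_total) / (count_in_reference / reference_total)


# Counts for Ala taken from the table of variables above.
H_A, H_TOTAL = 3197, 24694      # Ala / all residues in helices
A_TOTAL, AA_TOTAL = 7425, 87707  # Ala / all residues in the database
HH_A, HH_TOTAL = 2099, 17050    # Ala / all residues in helix-helix patches

# Helix propensity: helices compared with the whole database.
helix_propensity_ala = propensity(H_A, H_TOTAL, A_TOTAL, AA_TOTAL)

# Contact preference: helix-helix patches compared with helices (assumed scaling).
hh_preference_ala = propensity(HH_A, HH_TOTAL, H_A, H_TOTAL)

print(f"<P>H  (Ala) = {helix_propensity_ala:.2f}")
print(f"<P>HH (Ala) = {hh_preference_ala:.2f}")
```

Under these assumptions the helix propensity of Ala comes out close to the value of 1.52 quoted in the text, and the helix-helix contact preference is slightly below 1.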
Atomic packing density
Volume and density calculations were carried out following the algorithm of Goede et al. (1997). Computing the volume occupied by the atoms and the local packing density in proteins, one is faced with the problem of intersecting spheres. To estimate both, the space between the atoms has to be divided according to the location of the atoms relative to each other. Various methods have been proposed for this purpose which are based on Voronoi's idea of approximating the atomic space by polyhedra. Comparing the known procedures concerned with the allocation of all space amongst distinct atoms, we observed different partitioning of space with deviations up to 60% for particular atoms. Instead of dividing planes between the atoms, we use curved surfaces defined as the set of those geometrical loci with equal orthogonal distance to the surfaces of the considered van der Waals spheres. The proposed dividing surface meets not only the intersection circle of the two van der Waals spheres but also the intersection circle of the two spheres enlarged by an arbitrary value (e.g. radius of water). This hyperbolic surface, enveloping the Voronoi cell, can be easily constructed and offers a number of advantages (Goede et al., 1997). The local packing density is estimated as the ratio of the van der Waals volume to the volume of the corresponding Voronoi cell.
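The geometric property of the dividing surface stated above can be checked numerically for a single pair of atoms. The following is a minimal sketch, not the algorithm of Goede et al.: it places atom 1 at the origin and atom 2 on the x-axis, and all function names and the example radii are assumptions made for illustration.

```python
import math


def axis_crossing(d, r1, r2):
    """Position (measured from atom 1) where the surface of equal distance
    to both van der Waals surfaces crosses the line between the centers."""
    return (d + r1 - r2) / 2.0


def enlarged_intersection_point(d, r1, r2, delta):
    """A point on the intersection circle of the two spheres after both
    radii are enlarged by `delta`. Atom 1 is at the origin, atom 2 at (d, 0, 0)."""
    big1, big2 = r1 + delta, r2 + delta
    a = (d * d + big1 * big1 - big2 * big2) / (2.0 * d)
    h_sq = big1 * big1 - a * a
    if h_sq < 0:
        raise ValueError("enlarged spheres do not intersect")
    return (a, math.sqrt(h_sq), 0.0)


def surface_distance_gap(point, d, r1, r2):
    """Difference of the distances from `point` to the two van der Waals
    surfaces; it is zero for points on the dividing surface."""
    to_first = math.dist(point, (0.0, 0.0, 0.0)) - r1
    to_second = math.dist(point, (d, 0.0, 0.0)) - r2
    return to_first - to_second


def packing_density(vdw_volume, cell_volume):
    """Local packing density: van der Waals volume over cell volume."""
    return vdw_volume / cell_volume
```

For two atoms of unequal radius the crossing point returned by `axis_crossing` is shifted away from the midpoint `d / 2` that a plain Voronoi plane would use, which is where the two partitioning schemes start to differ.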
Structural alignments
An automatic procedure was created to search in a given database (DIP) of interfaces for similar regions. The search can be restricted to patches of comparable size. This superposition approach is based on a normalization of the atomic sets according to the directions of least and largest dimension. These directions are independent of transformations of the coordinate system and stable for small alterations of the atomic positions. The normalization of the atomic sets is unique except for four possible rotations (original arrangement and rotations of 180° around the x-, y- or z-axis). Therefore, the degrees of freedom are drastically reduced and the assignment of pairs of atoms is straightforward for identical and slightly modified atomic sets.
In a first step the centers of mass of the two atomic sets are determined and superimposed followed by a rotation of one of them, such that the major directions (least and largest expansions) coincide. All four normalizations are used in a further step to determine the pairs of atoms between the two patches. Two atoms only form a pair if they are mutually the nearest atoms and their distance is smaller than a given cut-off value. For further calculations the normalization with the largest number of atomic pairs is chosen. For these pairs the root mean square deviation was calculated. This normalization is used in a further step for improvements of the alignment.
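The pairing step can be sketched as follows, assuming both atomic sets have already been normalized (centers of mass at the origin, major directions along the coordinate axes). The sketch covers only the choice among the four remaining orientations, the mutual-nearest-neighbor pairing and the r.m.s. deviation; the function names and the default cut-off are assumptions, and the later refinement of the alignment is not shown.

```python
import math

# The four orientations left open by the normalization: the original
# arrangement and rotations of 180 degrees around the x-, y- or z-axis.
FLIPS = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]


def nearest(point, others):
    """Index of the atom in `others` that is closest to `point`."""
    return min(range(len(others)), key=lambda j: math.dist(point, others[j]))


def mutual_pairs(set_a, set_b, cutoff):
    """Pairs (i, j) that are mutually nearest and closer than `cutoff`."""
    pairs = []
    for i, a in enumerate(set_a):
        j = nearest(a, set_b)
        if nearest(set_b[j], set_a) == i and math.dist(a, set_b[j]) < cutoff:
            pairs.append((i, j))
    return pairs


def rmsd(set_a, set_b, pairs):
    """Root mean square deviation over the paired atoms."""
    total = sum(math.dist(set_a[i], set_b[j]) ** 2 for i, j in pairs)
    return math.sqrt(total / len(pairs))


def best_orientation(set_a, set_b, cutoff=1.0):
    """Try the four orientations of set_b; keep the one with most pairs."""
    best = None
    for flip in FLIPS:
        flipped = [tuple(s * c for s, c in zip(flip, p)) for p in set_b]
        pairs = mutual_pairs(set_a, flipped, cutoff)
        if best is None or len(pairs) > len(best[1]):
            best = (flip, pairs, flipped)
    flip, pairs, flipped = best
    deviation = rmsd(set_a, flipped, pairs) if pairs else None
    return flip, pairs, deviation
```

Because the 180° rotations only change the signs of two coordinates, the four candidates can be generated without any matrix algebra.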
In its current implementation, the alignment procedure gives reliable results for patches of similar size. Thus, for similarity screening of the 24 HH patches constituted by 25 atoms the size was restricted to a range from 20 to 30 atoms (128 patches). In this manner the number of structural alignments required can be reduced from 3.6 × 10⁷ (6000 × 6000) to 0.72 × 10⁷ (6000 × 1200) for an all-against-all analysis.
Results and discussion |
---|
In the database, 2176 helices, 3822 extended strands and 6477 coils were included. Because of the different mean element lengths (LH = 11.4, LE = 6.5, LC = 5.3), this results in the relationship H:E:C = 1:1:2 for the participating amino acid residues. This relationship corresponds to the number of particular helix patches: between helices (HH) 5551, between helix and β-sheet (HE) 5548, helix and coil (HC) 11245. The number of neighboring secondary structural elements (about 10 for a helix of 11 residues length) increases with increase in the size of the helix. The size distribution of the interfaces depends on the type of interacting patches (see Figure 1). Small solvent-directed patches (less than 20 constituting atoms for Hout) rarely occur. While the number of HE and HC patches clearly falls off for larger sizes, HH patches are broadly distributed up to a size of more than 60 atoms.
According to the definition in the Materials and methods section, the propensities of amino acid residues for the helix <P>H should not be directly compared with the contact preferences for particular helical patches like <P>HH as given in Table I. The contact preferences express predilections additional to those of the particular secondary structural element, which means that the value 1.00 is scaled to the helix propensity of a given residue type. Thus for overall predictions the product of the two values is relevant: the contact preference for prolyl residues to occur in helix-extended patches would be below 0.3 (see Table I: <P>H × <P>HE = 0.43 × 0.66). The helix propensities <P>H given in the first row are very similar (correlation coefficient 0.975) to those from redundancy-excluding studies (Swindells et al., 1995). The standard deviation of the helix-helix contact preferences <P>HH from the mean (1.00) is clearly smaller (0.14) than that of the helix propensities themselves (0.31), but the following trends can be recognized. Charged residues (D, E, K, R) occur less frequently (mean 0.875), while aromatic residues (F, H, W, Y) are preferred (mean 1.14). Generally, the predilection of hydrophobic residues for this type of interface can be confirmed and is found to be even more pronounced for helix-extended interfaces.
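The product rule for overall predictions can be written out for the proline example quoted from Table I. This is a small arithmetic sketch; the function name `overall_preference` is hypothetical, and only the two values given in the text are used.

```python
def overall_preference(helix_propensity, contact_preference):
    """Overall preference of a residue type for a helical patch type:
    the helix propensity multiplied by the patch contact preference."""
    return helix_propensity * contact_preference


# Values for Pro quoted in the text: <P>H = 0.43 and <P>HE = 0.66.
pro_helix_extended = overall_preference(0.43, 0.66)
print(f"Pro in helix-extended patches: {pro_helix_extended:.2f}")
```

The result stays below 0.3, in line with the statement in the text.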
A rough comparison of the atomic composition as given in Table II groups the patch types into two pairs with respect to the content of main chain atoms and the number of polar atoms: HH and HE versus HC and Hout. The equally high content of main chain atoms in the latter results from a somewhat different main chain orientation, which is expressed in an appreciable accessibility of the Cα-atom for the helix-solvent patch and, on the other hand, by a pronounced dominance of the hydrogen bonding atoms in the helix-coil patch. In Tables I and II clear deviations between different parts of the helical surfaces are noticeable in terms of atomic and amino acid preferences. The broadening of the range of amino acid preferences (0.28–1.92 instead of 0.43–1.52) indicates their value for structure predictions.
Similarity screening
The results of the similarity screening are outlined exemplarily for helix-helix patches consisting of 45 atoms (see Figure 2). Plotting the number of aligned atoms against the r.m.s. values gives clearly bimodal distributions, which can be fitted by two Gaussians (Figure 2). The distribution with its mean value above 1.0 Å can be interpreted as noise, but the other distribution mainly contains reasonable superpositions.
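Separating a bimodal set of r.m.s. values into two Gaussians can be sketched as follows. The text does not state how the fit was performed; the sketch swaps in a simple expectation-maximization fit of a two-component mixture as a stand-in, and the function names, iteration count and any input values are assumptions made for illustration.

```python
import math


def gauss_pdf(x, mu, sigma):
    """Density of a normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(values, iterations=100, min_sigma=1e-3):
    """Fit a two-component Gaussian mixture by expectation-maximization.

    Returns two (weight, mean, sigma) tuples, sorted by mean.
    """
    n = len(values)
    mean = sum(values) / n
    spread = math.sqrt(sum((v - mean) ** 2 for v in values) / n) or min_sigma
    mus = [min(values), max(values)]  # start from the two extremes
    sigmas = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            p = [w * gauss_pdf(v, m, s) for w, m, s in zip(weights, mus, sigmas)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean and width of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk == 0:
                continue
            mus[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - mus[k]) ** 2 for r, v in zip(resp, values)) / nk
            sigmas[k] = max(math.sqrt(var), min_sigma)
            weights[k] = nk / n
    return sorted(zip(weights, mus, sigmas), key=lambda c: c[1])
```

With such a fit, a superposition could be assigned to the low-r.m.s. component when that component's responsibility for its r.m.s. value is the larger one; the noise component is the one whose mean lies above 1.0 Å.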
Exchangeability of patches
The question remained as to whether detected similar patches could be exchanged between different proteins, which would be of interest for the construction of proteins with maintained function but different antigenic properties. This problem was considered using the similar patches from different proteins given in Figure 3b. First, the helix 252–260 was removed from the structure of mandelate racemase (see Figure 3c, left). Then the corresponding residues (89–97) from citrate synthase were inserted according to the superposition of the two patches without consideration of the environment (see Figure 3c, right). This patch fits nearly perfectly into the artificial pocket, as expressed by a high local packing density, which shifts only slightly from 0.62 to 0.60; both are typical values for helix-helix packing (see Figure 4a). Visual inspection showed that minor deviations in chi-angles could compensate for small deviations from ideal non-bonded distances. In summary, a rational selection of candidates for replacement becomes possible.
We found that the mean packing density is independent of the size (5–85 atoms) of the helical interface and might serve as a quality criterion for an atomic fit.
Even the distribution of the local packing density between helices peaks sharply around the mean (mean 0.67; standard deviation 0.07; see Figure 4a). This distribution resembles that between helices and extended strands (mean 0.72; standard deviation 0.07; see Figure 4b), while the packing density between helices and coils is somewhat lower (mean 0.64; standard deviation 0.08; see Figure 4c). An explanation for this finding can be given by the separation of helix-coil interfaces that are adjacent in sequence, so-called helix caps. For these 4300 patches a markedly lower packing density was observed (mean 0.62).
Conclusions |
---|
The interface files for the proteins considered and a viewer including the superposition procedure are deposited at http://www.charite.de/ch/biochem/dip.
Notes |
---|
References |
---|
Bernstein,F.C., Koetzle,T.F., Williams,G.J.B., Meyer,E.F.,Jr, Brice,M.D., Rodgers,J.R., Kennard,O., Shimanouchi,T. and Tasumi,M. (1977) J. Mol. Biol., 112, 535–542.
Bork,P. and Preissner,R. (1990) Biochem. Biophys. Res. Commun., 180, 666–672.
Bowie,J.U. (1997) Nature Struct. Biol., 4, 915–917.
Chothia,C., Levitt,M. and Richardson,D. (1981) J. Mol. Biol., 145, 215–250.
Chou,P.Y. and Fasman,G.D. (1978) Adv. Enzymol. Relat. Areas Mol. Biol., 47, 45–148.
Cohen,F.E. and Kuntz,I.D. (1987) Proteins, 2, 162–166.
Colloc'h,N., Etchebest,C., Thoreau,E., Henrissat,B. and Mornon,J.-P. (1993) Protein Engng, 6, 377–382.
Connolly,M.L. (1983a) Science, 221, 709–713.
Connolly,M.L. (1983b) J. Appl. Crystallogr., 16, 548–558.
Creamer,T.P. and Rose,G.D. (1994) Proteins, 19, 85–97.
Crick,F.H.C. (1953) Acta Crystallogr., 6, 689–697.
DeLano,W.L. and Brünger,A.T. (1994) Proteins, 20, 105–123.
Du Bose,R.F. and Hartl,D.L. (1989) Proc. Natl Acad. Sci. USA, 86, 9966–9970.
Efimov,A.V. (1979) J. Mol. Biol., 134, 23–40.
Eisenberg,D., Weiss,R.M. and Terwilliger,T.C. (1982) Nature, 299, 371–374.
Goede,A., Preissner,R. and Frömmel,C. (1997) J. Comput. Chem., 18, 1113–1123.
Gogonea,V. and Osawa,E. (1994) Supramol. Chem., 3, 303–315.
Hol,W.G.J. (1985) Adv. Biophys., 19, 133–165.
Hol,W.G.J., van Duijnen,P.T. and Berendsen,H.J. (1978) Nature, 273, 443–446.
Kabsch,W. and Sander,C. (1983) Biopolymers, 22, 2577–2637.
Kendrew,J.C., Dickerson,R.E., Strandberg,B.E., Hart,R.G., Davies,D.R. and Shore,V.C. (1960) Nature, 185, 422–427.
Mumenthaler,Ch. and Braun,W. (1995) Protein Sci., 4, 863–871.
Offer,G. and Sessions,R. (1995) J. Mol. Biol., 249, 967–987.
Peng,Z.Y., Wu,L.C., Schulman,B.A. and Kim,P.S. (1995) Philos. Trans. R. Soc. London, Ser. B: Biol. Sci., 348, 43–47.
Preissner,R., Egner,U. and Saenger,W. (1990) FEBS Lett., 288, 192–196.
Preissner,R., Goede,A. and Frömmel,C. (1998) J. Mol. Biol., 280, 535–550.
Preissner,R., Goede,A. and Frömmel,C. (1999) Bioinformatics, in press.
Reddy,B.V.B. and Blundell,T.L. (1993) J. Mol. Biol., 233, 464–479.
Richmond,T.J. and Richards,F.M. (1978) J. Mol. Biol., 119, 537–555.
Stouten,P.F.W., Frömmel,C., Nakamura,H. and Sander,C. (1983) Mol. Simul., 10, 97–120.
Swindells,M.B., MacArthur,M.W. and Thornton,J.M. (1995) Nature Struct. Biol., 2, 596–603.
Walther,D., Eisenhaber,F. and Argos,P. (1996) J. Mol. Biol., 255, 536–553.
Received February 10, 1999; accepted July 6, 1999.