Analysis and prediction of inter-strand packing distances between β-sheets of globular proteins

Hampapathalu A. Nagarajaram1, Boojala V.B. Reddy2 and Tom L. Blundell1,3

1 Department of Biochemistry, 80, Tennis Court Road, Old Addenbrooks Site, Cambridge CB2 1GA, UK and 2 Centre for Cellular and Molecular Biology, Uppal Road, Hyderabad 500 007, India


    Abstract
 
Any two β-strands belonging to two different β-sheets in a protein structure are considered to pack interactively if each β-strand has at least one residue that undergoes a loss of one tenth or more of its solvent contact surface area upon packing. A data set of protein 3-D structures (determined at 2.5 Å resolution or better), corresponding to 428 protein chains, contains 1986 non-identical pairs of β-strands involved in interactive packing. The inter-axial distance between these is significantly correlated to the weighted sum of the volumes of the interacting residues at the packing interface. This correlation can be used to predict the changes in the inter-sheet distances in equivalent β-sheets in homologous proteins and, therefore, is of value in comparative modelling of proteins.

Keywords: β-strand packing/comparative modelling/helix packing/protein data analysis/structure prediction


    Introduction
 
Most proteins are composed predominantly of α-helices and β-sheets (Levitt and Chothia, 1976; Richardson, 1981), with a core region formed by close packing of these secondary structural elements (SSEs). In a family of homologous proteins these SSEs are arranged in a similar way. Although the amino acid replacements, insertions and deletions in homologous families of proteins occur most frequently at the surface, small changes occurring in the densely packed core are also accommodated (Lesk and Chothia, 1980, 1986; Bajaj and Blundell, 1984; Chothia and Lesk, 1987). These observations are central to comparative modelling techniques (Browne et al., 1969; Greer, 1981; Chothia et al., 1986; Blundell et al., 1987, 1988; Sutcliffe et al., 1987a,b; Havel and Snow, 1991; Sali and Blundell, 1993; Srinivasan et al., 1993; Johnson et al., 1994; Sali, 1995; Sanchez and Sali, 1997) in which a three-dimensional (3-D) model structure for a protein of known sequence is extrapolated from one or more experimentally determined structures of homologous proteins. Only if the target sequence is close to the basis structures used and there are few amino acid substitutions in the core can the model obtained be comparable to a medium-resolution X-ray structure (Srinivasan and Blundell, 1993; Sali et al., 1995).

In the case of homologous proteins of <40% sequence similarity, amino acid differences lead to changes in the residue volumes in the core. In order to accommodate these changes, the equivalent SSEs in the common core undergo significant relative shifts and rotations (Lesk and Chothia, 1980, 1982, 1986; Chothia and Lesk, 1982, 1986). As a consequence the root mean square (r.m.s.) differences of the structures increase, leading to a decrease in the number of topologically equivalent residues. This limits the predictive capability of comparative modelling procedures. In order to develop a method to predict relative shifts in the secondary structural elements, it is essential to understand the precise quantitative relationships of the distances between SSEs and the properties of the residues involved in their packing.

Many studies covering various aspects relating to conformation, geometry and packing of β-sheets have been reported. For example, the conformation of β-sheets with respect to β-strand connectivity has been analysed by Sternberg and Thornton (1977a,b). The β-sheet geometry in proteins, emphasizing the allowed conformational flexibility among the parallel, anti-parallel and mixed β-sheets, has been discussed by Salemme (1983). The relative rigid body shifts and rotations in equivalent β-sheets in the immunoglobulin and the plastocyanin–azurin families have been reported (Chothia and Lesk, 1982; Lesk and Chothia, 1982). Chothia and Janin (1981, 1982) have reported the details of orthogonal and aligned packing of β-sheets. Principles governing β-sandwich structures have been analysed (Cohen et al., 1981). Efimov (1997a,b) has described construction of structural trees for protein superfamilies of β-proteins having root structures characterized by aligned packing and orthogonal packing of β-sheets. There are reports on coiling (Chothia, 1983), energetics (Chou et al., 1986) and folding pattern (Chothia and Finkelstein, 1990) in β-sheet proteins. The propeller assembly of β-sheets, their preferred assembly with sevenfold symmetry and principles determining β-sheet barrels have been analysed (Murzin, 1992; Murzin et al., 1994a,b). Recently, Chothia et al. (1997) described all known folds of β-proteins. However, little attention has been paid to quantitative relationships that might allow the prediction of inter-sheet distances.

Blundell and co-workers (Reddy and Blundell, 1993; Reddy et al., 1999) have reported quantitative packing relationships in three cases of SSE packing, viz. helix to helix, helix to β-strand and helix to β-sheet, and showed their potential use in comparative modelling of α and α–β classes of proteins. In this paper, we analyse packing between two β-strands belonging to two different β-sheets in a large number of protein structures and show that the inter-axial distance between the two interactively packed β-strands is significantly correlated with the weighted sum of the volumes of the interacting residues at their packing interface. We investigate the most common factors that influence packing. We also demonstrate the usefulness of the distance–volume relationship in the prediction of inter-axial distances in homologous proteins.


    Materials and methods
 
Data set used

The three-dimensional coordinates of protein structures deposited in the Protein Data Bank (PDB) (Bernstein et al., 1977) were used to identify the β-strands with a length of at least five residues using the definition of Kabsch and Sander (1983) as implemented by Smith (1989) in his SSTRUC program. Two non-redundant sets of β-strand pairs were considered. The first set (A) included pairs from proteins with sequence identities <25% and defined at 2 Å (Hobohm et al., 1992). However, this set omits pairs from homologous proteins where differences in packing are significant and can include those that are identical. The second set (B) was derived from a set of protein structures solved at 2.5 Å or better resolution, corresponding to 6531 protein chains. Where more than one identical pair was identified from homologous proteins, only the pair from the structure defined at the best resolution was considered for analysis. The results presented in the following sections have been obtained using set B unless stated otherwise.

Calculation of amino acid residue-dependent parameters of the packing interface

To identify the interactively packed pairs in a protein structure, the solvent contact surface area (SCSA) of every SSE was calculated in the presence and absence of every other SSE. The fractional loss in SCSA of a residue j (denoted by ndaj) due to packing was calculated as ndaj = (aij – apj)/astdj, where aij and apj are the SCSA values of the residue j of an SSE in the absence and presence of the other SSE, respectively, and astdj is the standard state value of SCSA for the residue type j (see Table I). The residues with ndaj values ≥0.1 are considered as the interacting residues and the packing interface between the two SSEs is constituted by the side chains of all such interacting residues. The total fractional loss of solvent contact areas at the packing interface is calculated as nda = Σ ndaj (j = 1 to nint), where nint is the total number of interacting residues. The SCSAs of the SSEs were determined with a spherical probe of radius 1.4 Å using the algorithm of Richmond and Richards (1978) as implemented by Sali (1991) in his program PSA.


Table I. Properties of 20 amino acid residues
 
The total residue volume, V, at the packing interface is calculated as the sum of volumes of every interacting residue (Table I). Several functions using V, nda and nint have been calculated (see Table IV) to test their correlations with inter-SSE distances.
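The interface parameters defined above can be sketched in a few lines of Python. This is only an illustration of the stated definitions: the function name, the dictionary keys and all numerical values are hypothetical placeholders, not values from Table I or output from PSA.

```python
def interface_parameters(residues, threshold=0.1):
    """Return (nint, nda, V) for one SSE pair.

    Each residue is a dict with (hypothetical) keys:
      a_i    SCSA in the absence of the partner SSE
      a_p    SCSA in the presence of the partner SSE
      a_std  standard-state SCSA of the residue type
      volume residue volume
    """
    interacting = []
    for res in residues:
        # Fractional loss in SCSA: ndaj = (aij - apj) / astdj
        nda_j = (res["a_i"] - res["a_p"]) / res["a_std"]
        if nda_j >= threshold:
            interacting.append((res, nda_j))
    n_int = len(interacting)
    nda = sum(value for _, value in interacting)
    volume = sum(res["volume"] for res, _ in interacting)
    return n_int, nda, volume


# Illustrative numbers only (not taken from the paper).
example = [
    {"name": "L61", "a_i": 40.0, "a_p": 10.0, "a_std": 100.0, "volume": 160.0},
    {"name": "A63", "a_i": 20.0, "a_p": 14.0, "a_std": 50.0, "volume": 90.0},
    {"name": "G64", "a_i": 12.0, "a_p": 11.0, "a_std": 40.0, "volume": 60.0},
]
n_int, nda, volume = interface_parameters(example)
```

In this toy input the third residue loses only 2.5% of its standard-state SCSA, so it is excluded from the interface and contributes to neither nda nor V.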


Table IV. Correlation coefficients (r) for various volume-dependent functions (VDFs) versus the inter-axial distances dip (first row) and dcl (second row)
 
Packing geometry

The packing geometry is characterized by the distance between the interactively packed SSEs and their mutual angle of orientation, each defined with respect to the linear axes of the SSEs. In the case of β-strands we quantify the distance in two ways. The first, referred to as dip, is the distance between the midpoints on the projected interaction regions on the axes of the secondary structural elements (distance between the midpoints of b11–b12 and b21–b22 in Figure 1) and the second, referred to as dcl, is the shortest of the distances between the projected Cα positions on the two axes of the secondary structural elements (distance between the projections of the two residues A63 and V7 in Figure 1). Both dip and dcl have been used to explore the packing relationship. The angle of mutual orientation is calculated as the angle between the N- to C-terminal directional axis vectors (the vectors AB and A'B' in Figure 1) projected on to the plane perpendicular to the line joining their midpoints. The axis of a β-strand is calculated using a method suggested by Blundell et al. (1983) for helices. A probe β-strand of length of four residues is superposed on to the real β-strand, the superposed Cα atoms of the real β-strand are projected on to the probe axis (the projected Cα positions are referred to as 'real-axis points') and a straight line is fitted using the projected Cα positions.
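The angle of mutual orientation described above can be illustrated with a small Python sketch. It assumes the two axes are already available as end points (A, B) and (A', B'); the function name is hypothetical, and the sign convention (positive for a counter-clockwise rotation when viewed against the midpoint-joining vector) is an assumption, since the text does not specify one.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def orientation_angle(a1, b1, a2, b2):
    """Signed angle (degrees) between axis vectors a1->b1 and a2->b2,
    both projected on to the plane perpendicular to the line joining
    the midpoints of the two axes."""
    m1 = tuple((p + q) / 2.0 for p, q in zip(a1, b1))
    m2 = tuple((p + q) / 2.0 for p, q in zip(a2, b2))
    link = _sub(m2, m1)
    length = math.sqrt(_dot(link, link))
    n = tuple(c / length for c in link)  # unit vector joining the midpoints
    v1, v2 = _sub(b1, a1), _sub(b2, a2)
    # Remove the component along n to project each axis on to the plane.
    p1 = _sub(v1, tuple(_dot(v1, n) * c for c in n))
    p2 = _sub(v2, tuple(_dot(v2, n) * c for c in n))
    return math.degrees(math.atan2(_dot(_cross(p1, p2), n), _dot(p1, p2)))


parallel = orientation_angle((0, 0, 0), (2, 0, 0), (0, 0, 10), (2, 0, 10))
crossed = orientation_angle((0, 0, 0), (2, 0, 0), (1, -1, 10), (1, 1, 10))
antiparallel = orientation_angle((0, 0, 0), (2, 0, 0), (2, 0, 10), (0, 0, 10))
```

Using atan2 rather than acos keeps the sign of the angle, which is needed to separate regions such as –30° and +150° in a distribution of orientations.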



Fig. 1. Interactive packing between two β-strands (occurring in 1apn). The solvent contact surfaces of the interacting residues are shown by dotted spheres. The axis of each β-strand is shown by an arrow which points in the N- to C-terminal direction. The interaction region on the first β-strand (strand 1; left-hand side; 60–67) lies between the points b11 and b12 (marked on its axis), whereas for the second β-strand (strand 2; right-hand side; 4–10) it lies between b21 and b22 (marked on its axis). The inter-axial distances dcl and dip are shown. This figure and Figures 2, 5 and 6 were prepared using the molecular graphics display package SETOR (Evans, 1993).

 

    Results and discussion
 
Interactive packing between two SSEs can be defined in two ways. Lesk and Chothia (1980) considered SSEs to be packed if the distance between any two atoms of the residues in each of the SSEs is ≤0.6 Å between their van der Waals surfaces. An alternative definition of packing has been given by Reddy and Blundell (1993), where two SSEs are involved in the interactive packing if each SSE has at least one residue that loses one tenth or more of its SCSA in the presence of the other SSE. We adopt this definition as this involves a surface region of the SSEs, which is probably more appropriate for the packing analysis and probably accounts for many aspects of interactions (hydrophilic and hydrophobic) in the protein cores. It is also supported by free energy considerations, which show that the loss of every Å² of SCSA contributes about 80 cal to the free energy of hydrophobic association of SSEs (Chothia, 1974; Richmond and Richards, 1978). An example of interactive packing between two β-strands is shown in Figure 1, where the packing interface comprises six interacting residues, three from each β-strand (L61, A63 and V65 from strand 1 and V5, V7 and L9 from strand 2).

β-Strand–β-strand packing

In the present analysis we focus only on the interactive packing of β-strands that occurs due to the aligned packing of two β-sheets (Chothia and Janin, 1981). However, interactive packing between two β-strands can also occur due to the folding of a β-sheet on to itself, giving rise to orthogonal packing of the β-sheet (Chothia and Janin, 1982). This is not discussed in this paper.

Of the 3500 interactively packed β-strand pairs, about 40% were interacting only through the residues at one of the termini. Figure 2 illustrates different modes of interactions in such pairs. As such interactions are very unlikely to determine the packing geometry between the β-strands, such pairs have been removed from the data set, leaving 1986 pairs (occurring in 428 protein chains) which were used for further analysis.





Fig. 2. Representative examples of interactively packed β-strands that interact through the residues at one of the two termini. (a) 2aai; chain B; 158–162; 202–206; (b) 1gof; 84–90; 214–219; (c) 1hsg; chain A; 18–24; 52–66.

 
Of 1986 interacting pairs, about 50% were found in the immunoglobulin domains but significant numbers of pairs were also found in sialidases, con-A-like lectins/glucanases and cupredoxins (Table II).


Table II. Composition of the final set of 1986 interactively packed β-strands
 
Packing geometry

The minimum, maximum and average values of the two inter-axial distances dip and dcl are given in Table III. Although the average values are about 10 Å, the distances vary from 4 to 15 Å. Distances in most of the pairs, derived from β-sandwich structures, are distributed in the range 9–11 Å, which is similar to that reported earlier (Cohen et al., 1981).


Table III. Residue properties V, nda and nint and inter-axial distances at the packing interface of the interactively packed pairs of β-strands
 
The distributions of angle of orientation shown in Figure 3 agree well with the earlier observations (Chothia and Janin, 1981). The distributions show two distinct, well populated regions, one centred around –30° (near-parallel packing) and the other centred around +150° (near-anti-parallel packing).



Fig. 3. Distribution of interactively packed β-strand pairs with respect to their angle of orientation (°) in the five major families. The profile was obtained by computing the number of pairs occurring within a moving window of the size of 30°. The two peaks are (i) –30 ± 30° (near-parallel orientation) and (ii) +150 ± 30° (near-anti-parallel orientation) pairs (a = immunoglobulin domains; b = sialidases; c = ConA-like lectins/glucanases; d = cupredoxins; e = others).

 
Packing interfaces

In a majority (80%) of pairs, the interface is occupied by three to six interacting residues, although some pairs have as few as two residues and some others have as many as 10 residues (see Table III). The number of interacting residues is, roughly, inversely proportional to the inter-axial distance, i.e. the interfaces with shorter inter-axial distances contain a greater number of residues than those with longer inter-axial distances. The total volume V varies from 227 to 1612 Å³ (Table III).

Packing relationship: inter-axial distances and interacting residue volumes

An inspection of the plot of V versus the distances dcl and dip for interacting pairs of β-strands did not reveal a clear correlation (Table IV). We further examined the correlation between distances and the functions of the form (V/nda) discussed in Reddy and Blundell (1993) which gave a very good correlation for helix–helix packing. The corresponding values of the correlation coefficient (r) are given in Table IV and these are better than those obtained for V versus inter-axial distances, implying that the actual contribution of the total residue volume to the packing depends on the extent of SCSA of the interacting residues buried. The best correlation is obtained for F6 versus dcl in most of the families (see Table IV). In further discussion here we consider only the relationship F6 versus dcl and refer to this as the packing relationship. We also tested the relationship with set A and obtained a slightly improved correlation (Table V).


Table V. Correlation coefficients (r) for the packing relationship F6 versus dcl in different sets of interacting pairs
 
We next investigated the interactive packing by considering only the side chain (Cα atom included) contributions for volumes and SCSAs, i.e. if the side chains of the component residues undergo a loss of one tenth of their SCSA upon packing. The inter-axial distance dcl was computed with respect to the interacting side chains and the quantities V and nda were taken as equal to the sum of the interacting side chain volumes and sum of the fractional losses in SCSAs of the interacting side chains.

The correlation coefficients were found to be slightly less when the full residue contributions were not considered (Table V). This means that inclusion of main-chain atomic contributions to the packing improves the overall packing relationship.

The interacting pairs were also examined for side chain–side chain contacts between them. The side chain–side chain contacts were defined as those with inter-atomic distances less than or equal to the sum of the van der Waals radii of the corresponding atoms plus 0.6 Å (Lesk and Chothia, 1980). The van der Waals radii for the atoms (C = 1.7, O = 1.5, N = 1.6 and S = 1.8 Å) were taken from Bondi (1964).

Of 1986 pairs, only 60% had one or more contacts between them. The correlation for F6 versus dcl is relatively poor (see Table V) in the majority of the families, suggesting residues not in contact also contribute to the packing in those families. In the following discussion we consider only the packing relationship obtained from full-residue contributions.

The slopes and intercepts of the regression lines best fitting the F6–dcl relationship in various families of proteins are given in Table VI. In order to obtain an unbiased, general regression line that best represents the packing relationship for a pair of β-strands, the average values of slopes and intercepts were calculated (see Table VI). Figure 4 shows the scatter plot for the packing relationship superposed with the general regression line.
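The contact criterion above translates directly into a distance test. The following Python sketch uses the four radii quoted in the text; the function names and the (element, coordinates) atom representation are assumptions made for illustration.

```python
import math

# van der Waals radii in Å as quoted in the text (Bondi, 1964).
VDW_RADII = {"C": 1.7, "O": 1.5, "N": 1.6, "S": 1.8}


def in_contact(atom_a, atom_b, tolerance=0.6):
    """True if two atoms are within the sum of their radii plus 0.6 Å.

    Each atom is a (element, (x, y, z)) tuple; this layout is hypothetical.
    """
    (elem_a, xyz_a), (elem_b, xyz_b) = atom_a, atom_b
    cutoff = VDW_RADII[elem_a] + VDW_RADII[elem_b] + tolerance
    return math.dist(xyz_a, xyz_b) <= cutoff


def count_contacts(side_chain_atoms_1, side_chain_atoms_2):
    """Number of side chain-side chain atom pairs in contact."""
    return sum(in_contact(a, b)
               for a in side_chain_atoms_1
               for b in side_chain_atoms_2)


# Two carbons have a cutoff of 1.7 + 1.7 + 0.6 = 4.0 Å.
near = in_contact(("C", (0.0, 0.0, 0.0)), ("C", (3.9, 0.0, 0.0)))
far = in_contact(("C", (0.0, 0.0, 0.0)), ("C", (4.1, 0.0, 0.0)))
```

A pair of strands would then count as having a contact if `count_contacts` is at least one for its interacting side chains.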
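The averaging of per-family regression lines can be sketched as follows. The sketch assumes an ordinary least-squares fit with the volume-dependent function on the x axis and dcl on the y axis, which the text does not state explicitly; the family labels and data points are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def general_line(family_data):
    """Average the per-family slopes and intercepts.

    Each family contributes equally, regardless of how many pairs it holds.
    """
    fits = [fit_line(xs, ys) for xs, ys in family_data.values()]
    slope = sum(m for m, _ in fits) / len(fits)
    intercept = sum(c for _, c in fits) / len(fits)
    return slope, intercept


# Invented toy data: (function values, distances) per family.
toy = {
    "family_a": ([1.0, 2.0, 3.0], [3.0, 5.0, 7.0]),   # fits m = 2, c = 1
    "family_b": ([1.0, 2.0, 3.0], [2.0, 5.0, 8.0]),   # fits m = 3, c = -1
}
m_general, c_general = general_line(toy)
```

Averaging the fitted parameters, rather than pooling all points into one fit, prevents a heavily populated family such as the immunoglobulin domains from dominating the general line.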


Table VI. Values of slope (m) and intercept (c) of regression line and standard deviation (σ) for dispersion of data about that line in various families of proteins (the last row gives their average values, which represent the general regression line)
 


Fig. 4. Scatter plot of F6 versus dcl in interactively packed strand pairs along with the general regression line.

 
Angular dependence in packing relationship

In general, r values for parallel and anti-parallel pairs are ~0.8 and ~0.7, respectively, indicating a better correlation for parallel pairs than anti-parallel pairs (Table VII). Among the anti-parallel pairs, those comprising β-strands contiguous in the sequence and connected by linkers of three to nine residues [β-arcs (Efimov, 1987)] showed a poor correlation, indicating that such β-strand pairs are probably constrained by the short linkers.


Table VII. Values of correlation coefficient (r), slope (m) and intercept (c) of the regression line for the packing relationship F6 versus dcl in parallel (pl) and anti-parallel (apl) pairs of interactively packed β-strands
 
Deviations from packing relationship

Although most points are evenly spread about the regression line (σ ≈ 1.1 Å) (Figure 4), about 40 points lie well below it, having dcl < 6 Å. In these pairs one or both β-strands are severely distorted (bent, coiled or twisted) (see Figure 5a for an example) and therefore the inter-axial distances are uncorrelated with the volume-dependent function.





Fig. 5. Illustrative examples of interactively packed pairs that lie well outside the regression line, where (a) the β-strand is severely distorted (7nn9; 115–125; 439–449); (b) a β-strand pair whose packing interface contains an Ala and a Gly which generally give rise to an outlier below the regression line (BRL) (1loe; chain A; 30–35; 41–46); (c) the packing interface is constituted by large hydrophobics and the residues are perfectly aligned. Such pairs generally give rise to outliers above the regression line (ARL) (1lcl; 35–41; 89–95).

 
The difference between the observed distance and the value expected from the regression line [i.e. Δdcl = dcl(observed) – dcl(expected)] gives the deviation of a pair from the packing relationship. β-Strand pairs which show a deviation of |Δdcl| > 0.5 Å were considered as outliers. One group of 553 lies below the regression line (BRL outliers) and the other group of 625 lies above the regression line (ARL outliers).
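The outlier assignment can be written as a short Python function. It is a sketch of the rule as stated; the function name, the label for non-outliers and the slope and intercept used in the example are hypothetical, not the values of Table VI.

```python
def classify_pair(d_obs, f_value, slope, intercept, cutoff=0.5):
    """Label a strand pair relative to the regression line.

    delta = dcl(observed) - dcl(expected); pairs with |delta| > cutoff (Å)
    are outliers above (ARL) or below (BRL) the regression line.
    """
    expected = slope * f_value + intercept
    delta = d_obs - expected
    if delta > cutoff:
        return "ARL", delta
    if delta < -cutoff:
        return "BRL", delta
    return "within", delta


# Hypothetical line: expected distance = 1.0 * F + 2.0, so F = 8 gives 10 Å.
above = classify_pair(10.8, 8.0, slope=1.0, intercept=2.0)
below = classify_pair(9.2, 8.0, slope=1.0, intercept=2.0)
close = classify_pair(10.2, 8.0, slope=1.0, intercept=2.0)
```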

Residues at the packing interface of outliers

Visual inspection indicated that many pairs in the BRL group (Figure 5b) involve small amino acids such as Ala and Gly, while those in the ARL (Figure 5c) involve large hydrophobics, such as Phe and Leu. The propensity of each amino acid residue i to occur in each group was calculated as

pi = fi/Fi

where fi is the % occurrence of a residue i in that outlier group and Fi is the % occurrence of a residue i in all pairs, regardless of the group to which it belongs; pi values were calculated using set A. The following residues appeared as preferred and non-preferred residues for the two groups of outliers:

              Preferred              Non-preferred

BRL outliers  A, G, M, N, S, T       F, I, L, V, W, Y

ARL outliers  F, I, L, M, V, W, Y    A

Side chains of the polar residues in the BRL group point away from the packing interface, resulting in a lesser residue volume contribution than expected. For ARL pairs the side chains of apolar residues align at the packing interface. A similar residue-based effect on the packing of helices in the globin family has been reported (Efimov, 1979). In this family the presence of small and polar residues at the interfaces gives rise to small inter-helix distances, whereas the presence of apolar residues gives rise to large inter-helix distances.

Distortion of β-strands

Ideally a regular β-strand is an extended structure with a small right-handed twist about its axis. However, in proteins β-strands are often bent or coiled (Chothia, 1973, 1984). Such a structural distortion can affect the packing relationship in an interacting pair. The structural distortion was quantified by the r.m.s. deviation (denoted by Δax) of the real-axis points from the linear axis. Δax varies from 0.3 to 4.0 Å, and the number of pairs showing an absolute deviation |Δdcl| > 0.5 Å from the regression line for increasing values of Δax is given in Table VIII. More than half of them are outliers and their percentage increases with the value of Δax. Thus, a key factor responsible for the deviation from the packing relationship is the structural distortion associated with one of or both the β-strands in the pairs.
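A distortion measure of this kind can be sketched as the r.m.s. perpendicular distance of a set of points from a straight line. The sketch assumes the fitted axis is given by a point on the line and a direction vector; the function name and the zig-zag test points are illustrative only.

```python
import math


def rms_from_axis(points, origin, direction):
    """R.m.s. perpendicular distance (Å) of points from a straight line.

    The line passes through `origin` with direction `direction`
    (the direction need not be normalized).
    """
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    total = 0.0
    for point in points:
        rel = [p - o for p, o in zip(point, origin)]
        along = sum(r * u for r, u in zip(rel, unit))
        # Pythagoras: squared perpendicular distance from the line.
        perp_sq = sum(r * r for r in rel) - along * along
        total += max(perp_sq, 0.0)  # guard against tiny negative round-off
    return math.sqrt(total / len(points))


# Points zig-zagging 1 Å either side of the x axis.
zigzag = [(0.0, 1.0, 0.0), (1.0, -1.0, 0.0), (2.0, 1.0, 0.0), (3.0, -1.0, 0.0)]
delta_ax = rms_from_axis(zigzag, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
```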


Table VIII. The number of β-strand pairs with different extents of structural distortion (quantified by Δax) and the number of outliers in them
 
Packing between two β-sheet units

We also investigated interactive units consisting of two hydrogen-bonded (parallel or anti-parallel) β-strands (β-sheet units) and belonging to two different β-sheets (see Figure 6). The inter-unit distances in the pairs where each β-strand has at least one residue that undergoes a loss of one tenth of its SCSA upon packing were calculated as the distance between the ortho-centres of each of the units (in Figure 6, O and O' are the ortho-centres for the β-sheet units 1 and 2, respectively). The positions of ortho-centres were computed using the four Cα projections of the first and the last interacting residues from each β-strand on to their respective axes (A, B, C, D, A', B', C' and D' in Figure 6). In 457 pairs of unique sequence there were significant correlations (r values 0.69 and 0.61) between the inter-sheet distance (dOO') and the volume-related functions F5 and F6. However, the correlations were not as good as those for pairs of β-strands. Of 427 pairs 93% belong to anti-parallel–anti-parallel class, 5.3% belong to anti-parallel–parallel class and only 1.3% belong to parallel–parallel class. The value of the correlation coefficient in the anti-parallel–anti-parallel class, 0.70, is better than those for all the pairs of β-sheet units.



Fig. 6. Illustration of packing between two β-sheet units [taken from 1dvf; chain D; sheet unit 1 (ABCD) comprises strands 68–72 and 77–83; sheet unit 2 (A'B'C'D') comprises strands 33–39 and 45–52]. The interacting residues are encapsulated by dotted spheres. The distance between the two ortho-centres O and O' has been used as a measure of the packing distance (dOO').

 
Helix–helix packing: a reinvestigation

Reddy and Blundell (1993) investigated helix–helix packing in X-ray structures solved at 3.0 Å or better resolution, analysing about 1000 helix pairs. The present data set of 6531 protein chains contains 6081 non-identical interacting pairs [excluding those interacting only through the terminal residues (three residues) at either the N- or C-termini]. The corresponding r values are given in Table IX. Functions F3 and F5 give rise to a very good correlation with dip (r = 0.79).


Table IX. Values of correlation coefficient (r) for volume-dependent functions (VDFs) versus inter-axial distances in 6081 interactively packed pairs of helices
 
Prediction of inter-axial distances

Prediction of the inter-axial distance between two β-strands belonging to two β-sheets in a protein ('target') using the packing relationship requires information about the possible interacting residues in those β-strands. This can be obtained by sequence alignment of target with its homologue of known 3-D structure ('template'). Using the residues of the target equivalent to the interacting residues in the template, the value of the function F6 can be calculated and the inter-axial distance can be predicted using an appropriate regression line depending on the angle of packing (parallel or anti-parallel) in the template pair (see Table VII). The total residue volume V can be calculated using the values given by Chothia (1975) (Table I) for 20 amino acid residues. The nda value can either be derived from the representative values of residues nda_avj (Table I) calculated as the average values from the interacting residues in all the interacting pairs analysed in this study or taken from the equivalent interacting residues in the template.

The usefulness of the packing relationship to predict the inter-axial distance between two β-strands was tested using seven families of homologous proteins with average sequence identities varying from 15 to 51% (Table X). The information regarding possible interacting residues was obtained using the structure-based sequence alignments, obtained using COMPARER (Sali and Blundell, 1990) and deposited in the in-house database HOMSTRAD (http://cryst-bioc.cam.ac.uk/~homstrad) (Mizuguchi et al., 1998). In every family each protein was considered as a target and a distance prediction was made using every other protein as template in that family (938 target–template β-strand pairs). For each of the target β-strand pairs two values of F6 were computed using the two values of nda and the inter-axial distances dp1 and dp2 were predicted.
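The prediction procedure can be outlined in Python under several stated assumptions. The definition of F6 is given only in Table IV, which is not reproduced here, so the function is passed in as an argument and the example uses a plain V/nda ratio as a stand-in. All residue volumes, nda values, slopes and intercepts below are hypothetical placeholders rather than entries from Tables I or VII, and the pairing of the two nda options with the two returned distances is likewise an assumption.

```python
def predict_distances(target_residues, template_ndas, volumes, nda_av,
                      vdf, line):
    """Predict an inter-axial distance with both nda options.

    target_residues: residue types in the target equivalent to the
                     template's interacting residues
    template_ndas:   nda_j values of those interacting template residues
    volumes, nda_av: per-residue-type lookup tables (cf. Table I)
    vdf:             volume-dependent function of (V, nda, nint)
    line:            (slope, intercept) for the template's packing
                     orientation (parallel or anti-parallel)
    """
    slope, intercept = line
    n_int = len(target_residues)
    total_volume = sum(volumes[r] for r in target_residues)
    nda_representative = sum(nda_av[r] for r in target_residues)
    nda_template = sum(template_ndas)
    d_rep = slope * vdf(total_volume, nda_representative, n_int) + intercept
    d_tmpl = slope * vdf(total_volume, nda_template, n_int) + intercept
    return d_rep, d_tmpl


def stand_in_vdf(volume, nda, n_int):
    # Stand-in for F6: a simple V/nda ratio (the real form is in Table IV).
    return volume / nda


d_rep, d_tmpl = predict_distances(
    target_residues=["V", "L"],
    template_ndas=[0.5, 0.5],
    volumes={"V": 140.0, "L": 170.0},   # placeholder volumes
    nda_av={"V": 0.2, "L": 0.3},        # placeholder average nda values
    vdf=stand_in_vdf,
    line=(0.01, 4.0),                   # placeholder slope and intercept
)
```

Passing the volume-dependent function as a parameter keeps the sketch usable with any of the functions tabulated in the paper once their definitions are supplied.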


Table X. Comparison of errors associated with the predicted distances (Edp2) and the template distances (Edt)
 
The scatter plots of dp1 versus the observed distances (dobs) (Figure 7a) and dp2 versus dobs (Figure 7b) show good agreement between the observed and the predicted distances. A large majority of points in both the plots lie close to the diagonal. The values of σ for scatter of points about the diagonal are 0.68 (dp1 versus dobs) and 0.53 (dp2 versus dobs). The distributions of errors associated with the predicted distances dp1 (Edp1 = dp1 – dobs) and dp2 (Edp2 = dp2 – dobs) are shown in Figure 7c. About 85% of the Edp1 and 94% of Edp2 occur within the range –1.0 to 1.0 Å, implying that dp2 gives a better prediction of inter-axial distance than dp1 in the families considered here.
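The error statistics quoted above can be reproduced in outline with the following Python sketch. It assumes that the scatter about the diagonal is measured as the root mean square of the signed errors, which the text does not state explicitly; the function name and the distances are invented for illustration.

```python
import math


def error_summary(predicted, observed, limit=1.0):
    """Signed errors, r.m.s. scatter about the diagonal and the fraction
    of errors lying within [-limit, +limit] Å."""
    errors = [p - o for p, o in zip(predicted, observed)]
    rms = math.sqrt(sum(e * e for e in errors) / len(errors))
    within = sum(1 for e in errors if -limit <= e <= limit) / len(errors)
    return errors, rms, within


# Invented distances in Å.
errors, rms, within = error_summary(
    predicted=[10.0, 9.0, 12.0, 8.0],
    observed=[10.0, 10.0, 10.0, 8.0],
)
```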




Fig. 7. Scatter plots demonstrating close agreement between the predicted distances and the observed distances: (a) dp1 versus dobs and (b) dp2 versus dobs. The distributions of error associated with the two distances are shown in (c): broken line (dp1) and solid line (dp2).

 
In order that the packing relationship is useful in comparative modelling of proteins, predicted distances should be closer to the observed distances than the template distances. To investigate this we compared the errors associated with the predicted distances (Edp2) and the template distances (Edt = dt – dobs; dt = template distance) (see Table X). For 43% of the 938 predictions made, Edp2 was found smaller than Edt (dp2 closer to observed distance than dt). In the remaining cases Edp2 was either greater than (17%) or equal to (40%) Edt. We also compared the predicted distances with the distances from the closest homologues. The percentages of predictions in the three situations Edp2 < Edt, Edp2 = Edt and Edp2 > Edt are 31, 49 and 20%, respectively. This means that the predicted distance is often found closer to the observed distance than the template distance, indicating an advantage in using predicted distances over the template distances in comparative modelling.

Hence this investigation shows that the predicted inter-axial distances are more useful than those taken from templates. For comparative modelling one can obtain as many predicted distances as the number of homologues used as templates. Therefore, a weighted average of the predicted distances is used in modelling of the target. The weights are taken as equal to the inverse of square of sequence identities between the template and the target (Srinivasan and Blundell, 1993).

Conclusions

We have investigated the interactive packing in β-strands and have shown that the distance between two β-strands is significantly correlated with the weighted sum of the volumes of all the interacting residues at the packing interface. The same is also shown in the packing of β-sheet units and helices. Two factors seem to influence packing of β-strands: structural distortions in the interacting pairs and the presence of certain types of amino acid residues at the packing interface. We have also shown the usefulness of the packing relationship in the prediction of inter-axial distances between two equivalent β-strands in homologous proteins. The predicted distances are often found closer to the observed distances than the template distances, indicating an advantage in using predicted distances over the template distances in comparative modelling of proteins.


    Acknowledgments
 
H.A.N. is the recipient of a fellowship funded by a grant from Oxford Molecular Ltd.


    Notes
 
3 To whom correspondence should be addressed. E-mail: tom@cryst.bioc.cam.ac.uk


    References
 
Bajaj,M. and Blundell,T.L. (1984) Annu. Rev. Biochem., 13, 453–492.

Bernstein,F.C., Koetzle,T.F., Williams,G.J.B., Meyer,E.F.,Jr, Brice,M.D., Rodgers,J.R., Kennard,O., Shimanouchi,T. and Tasumi,M. (1977) J. Mol. Biol., 112, 535–542.

Blundell,T.L., Barlow,D., Borkakoti,N. and Thornton,J. (1983) Nature, 306, 281–283.

Blundell,T.L., Sibanda,B.L., Sternberg,M.J.E. and Thornton,J.M. (1987) Nature, 323, 347–352.

Blundell,T.L. et al. (1988) Eur. J. Biochem., 172, 513–520.

Bondi,A. (1964) J. Phys. Chem., 68, 441–451.

Browne,W.J., North,A.C.T., Phillips,D.C., Brew,K., Vanaman,T.C. and Hill,R.L. (1969) J. Mol. Biol., 42, 65–86.

Chothia,C. (1973) J. Mol. Biol., 75, 295–302.

Chothia,C. (1974) Nature, 248, 338–339.

Chothia,C. (1975) Nature, 254, 705–708.

Chothia,C. (1983) J. Mol. Biol., 163, 107–117.

Chothia,C. (1984) Annu. Rev. Biochem., 53, 537–572.

Chothia,C. and Finkelstein,A.V. (1990) Annu. Rev. Biochem., 59, 1007–1039.

Chothia,C. and Janin,J. (1981) Proc. Natl Acad. Sci. USA, 78, 4146–4150.

Chothia,C. and Janin,J. (1982) Biochemistry, 21, 3955–3965.

Chothia,C. and Lesk,A.M. (1982) J. Mol. Biol., 160, 325–342.

Chothia,C. and Lesk,A.M. (1986) EMBO J., 5, 823–826.

Chothia,C. and Lesk,A.M. (1987) Cold Spring Harbor Symp. Quant. Biol., 52, 399–405.

Chothia,C., Lesk,A.M., Levitt,M., Amit,A.G., Mariuzza,R.A., Phillips,S.E.V. and Poljak,R.J. (1986) Science, 233, 755–758.

Chothia,C., Hubbard,T., Brenner,S., Barns,H. and Murzin,A. (1997) Annu. Rev. Biophys. Biomol. Struct., 26, 597–627.

Chou,K.C., Nemethy,G., Rumsey,S., Tuttle,R.W. and Scheraga,H.A. (1986) J. Mol. Biol., 188, 641–649.

Cohen,F.E., Sternberg,M.J.E. and Taylor,W.R. (1981) J. Mol. Biol., 148, 253–272.

Efimov,A.V. (1979) J. Mol. Biol., 134, 23–40.

Efimov,A.V. (1987) FEBS Lett., 224, 372–376.

Efimov,A.V. (1997a) Proteins, 28, 241–260.

Efimov,A.V. (1997b) FEBS Lett., 407, 37–41.

Evans,S.V. (1993) J. Mol. Graphics, 11, 134–138.

Greer,J. (1981) J. Mol. Biol., 153, 1027–1042.

Havel,T.F. and Snow,M.E. (1991) J. Mol. Biol., 217, 1–7.

Hobohm,U., Scharf,M., Schneider,R. and Sander,C. (1992) Protein Sci., 1, 409–417.

Johnson,M.S., Srinivasan,N., Sowdhamini,R. and Blundell,T.L. (1994) CRC Crit. Rev. Biochem. Mol. Biol., 29, 1–68.

Kabsch,W. and Sander,C. (1983) Biopolymers, 22, 2577–2637.

Lesk,A.M. and Chothia,C. (1980) J. Mol. Biol., 136, 225–270.

Lesk,A.M. and Chothia,C. (1982) J. Mol. Biol., 160, 325–342.

Lesk,A.M. and Chothia,C. (1986) Phil. Trans. R. Soc. Lond., 317, 345–356.

Levitt,M. and Chothia,C. (1976) Nature, 261, 552–558.

Mizuguchi,K., Deane,C.M., Blundell,T.L. and Overington,J.P. (1998) Protein Sci., 7, 2469–2471.

Murzin,A.G. (1992) Proteins, 14, 191–201.

Murzin,A.G., Lesk,A.M. and Chothia,C. (1994a) J. Mol. Biol., 236, 1369–1381.

Murzin,A.G., Lesk,A.M. and Chothia,C. (1994b) J. Mol. Biol., 236, 1382–1400.

Reddy,B.V.B. and Blundell,T.L. (1993) J. Mol. Biol., 233, 464–479.

Reddy,B.V.B., Nagarajaram,H.A. and Blundell,T.L. (1999) Protein Sci., 8, 573–586.

Richardson,J.S. (1981) Adv. Protein Chem., 34, 167–339.

Richmond,T.J. and Richards,F.M. (1978) J. Mol. Biol., 119, 537–555.

Salemme,F.R. (1983) Prog. Biophys. Mol. Biol., 42, 95–133.

Sali,A. (1991) PhD Thesis, University of London.

Sali,A. (1995) Curr. Opin. Biotechnol., 6, 437–451.

Sali,A. and Blundell,T.L. (1990) J. Mol. Biol., 212, 403–428.

Sali,A. and Blundell,T.L. (1993) J. Mol. Biol., 234, 779–815.

Sali,A., Potterton,L., Yuan,F., Van Vlijmen,H. and Karplus,M. (1995) Proteins, 23, 318–326.

Sanchez,R. and Sali,A. (1997) Curr. Opin. Struct. Biol., 7, 206–214.

Smith,D. (1989) SSTRUC: A Program to Calculate Secondary Structural Summary. Department of Crystallography, Birkbeck College, University of London.

Srinivasan,N. and Blundell,T.L. (1993) Protein Engng, 6, 501–512.

Srinivasan,S., March,C.J. and Sudarshanam,S. (1993) Protein Sci., 2, 277–289.

Sternberg,M.J.E. and Thornton,J.M. (1977a) J. Mol. Biol., 110, 269–283.

Sternberg,M.J.E. and Thornton,J.M. (1977b) J. Mol. Biol., 110, 285–296.

Sutcliffe,M.J., Haneef,I., Carney,D. and Blundell,T.L. (1987a) Protein Engng, 1, 377–384.

Sutcliffe,M.J., Hayes,F.R.F. and Blundell,T.L. (1987b) Protein Engng, 1, 385–392.

Received April 9, 1999; revised August 13, 1999; accepted August 21, 1999.