Equipe Systèmes Moléculaires et Biologie Structurale, LMCP, CNRS UMR7590, Universités Paris 6 et Paris 7, Case 115, 75252 Paris cedex 05, France
Abstract |
---|
Keywords: helical surface/modeling/protein sheet/twist
Introduction |
---|
Studies of β-sheets, and in particular of β-barrels, have already been carried out by many workers (Cohen et al., 1980, 1982; Lasters et al., 1988; Lasters, 1990; Flower, 1994; King et al., 1994; Murzin et al., 1994; Wang et al., 1996). It was shown that β-barrels could be fitted by hyperboloids (Lasters et al., 1988; Lasters, 1990; Flower, 1994; King et al., 1994; Murzin et al., 1994). As a result, barrels can be classified by means of two parameters: the number of strands in the β-sheet, n, and the 'shear number', S, which is a measure of the stagger of the strands in the β-barrel (Murzin et al., 1994). Briefly, S is a measure of the inclination of the strands relative to the axis of the barrel. Only a small number of combinations of n and S are observed in nature (Murzin et al., 1994).
In this work, we thoroughly investigated the topology of small β-sheets, typically three or four strands with five to nine residues each. The influence of the number, the length and the direction of strands on the twist was analyzed. Strands were treated as independent units, essentially as rigid bodies, and the loops connecting them were discarded in this approach (Efimov, 1993; Gellman, 1998).
The main chains of β-sheets form a two-dimensional curved surface whose shape depends on interactions between strands. It should be noted that we were not able to find a planar sheet in our analysis, as curvature is very common. The geometry of a β-sheet has been fully defined by two parameters determining the twist and the coiling (Salemme, 1983; Daffner, 1994). Different definitions of twist are given in the literature. It has been defined as the average twist of the strands, which is derived from the backbone dihedral angles φ and ψ (Salemme, 1981; Chou et al., 1982; Yang and Honig, 1995). Another definition uses the four Cα atoms of the two terminal residues from each of the two most external strands of a sheet (Wang et al., 1996). Here the twist depends on the particular positions of the sheet's four 'corner' Cα atoms. A right-handed twist is indeed an intrinsic feature of a β-sheet (Wang et al., 1996). Coiling is defined as a curvature along the strands or in the direction perpendicular to the strands of a β-sheet. It depends on the nature of the residues in the protein and in particular on their relationships with each other. Since small sheets usually have weak constraints on the positions of the border strands, compared with a barrel, there is no strong bending factor contributing to sheet coiling. Therefore, in the present study, coiling was neglected. Hereafter we shall consider twist as the main parameter to describe small sheets.
Materials and methods |
---|
Once this three-dimensional model, based on the helical surface, has been determined, the Cα atom positions of an experimental sheet can be fitted to this ideal sheet by varying only H, since r is a function of the strand number alone. These comparisons were performed with the FIT program (Lu Guogang, personal communication) and the H value giving the smallest r.m.s. deviation was chosen. We shall call T the 'twist' of a sheet, defined as follows:
![]() |
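The one-parameter fit described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the original text: the ideal sheet is represented as a plain helicoid, H is interpreted as the pitch of that helicoid, the experimental and model points are already expressed in the same frame (the FIT program presumably also handles superposition), and the search is a simple scan over candidate values. All function and variable names are hypothetical.

```python
import math

def helicoid_point(r, z, pitch):
    """Point on a helicoid at signed distance r from the axis and height z.

    Assumption: the surface makes one full turn per `pitch` along the z-axis.
    """
    angle = 2.0 * math.pi * z / pitch
    return (r * math.cos(angle), r * math.sin(angle), z)

def rmsd(points_a, points_b):
    """Root-mean-square deviation between two equally long point lists."""
    sq = sum((a - b) ** 2
             for p, q in zip(points_a, points_b)
             for a, b in zip(p, q))
    return math.sqrt(sq / len(points_a))

def fit_pitch(observed, params, candidates):
    """Scan candidate pitch values and return (pitch, rmsd) with the lowest rmsd.

    `params` holds the fixed (r, z) surface coordinates of each atom;
    only the pitch is varied, mirroring the one-parameter fit in the text.
    """
    best = None
    for pitch in candidates:
        model = [helicoid_point(r, z, pitch) for r, z in params]
        score = rmsd(observed, model)
        if best is None or score < best[1]:
            best = (pitch, score)
    return best

# Hypothetical three-strand sheet: strands at r = -4.8, 0 and 4.8,
# five atoms per strand, generated with a pitch of 80.
params = [(r, z) for r in (-4.8, 0.0, 4.8) for z in (-6.0, -3.0, 0.0, 3.0, 6.0)]
observed = [helicoid_point(r, z, 80.0) for r, z in params]
best_pitch, best_rmsd = fit_pitch(observed, params,
                                  [float(h) for h in range(40, 161, 5)])
```

A coarse scan is used here only for readability; any one-dimensional minimizer would serve the same purpose.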
In order to provide a value of the twist for every possible combination of strand position and direction within this 'ideal sheet' method, all sheets of three and four strands from the PDB were exhaustively analyzed. Two databases were derived from the PDB, with a maximum of 50 and 95% sequence identity among any pair of members, called 50% PDB and 95% PDB, respectively (Hobohm et al., 1992; Hobohm and Sander, 1994). The strands were first identified and assigned to sheets according to the PDB header description. This was checked with DSSP (Kabsch and Sander, 1983) and, in case of disagreement, the DSSP results were preferred. Once retrieved from the databank, the sheets were sorted according to their structural configurations, defined by the position and direction of the strands within a sheet. We wished to determine the configurations of strands yielding the same value for the twist, thus establishing a taxonomy of strands based upon simple geometrical considerations. Originally, it was assumed that the twist of a sheet should also depend on its size, and therefore on the number of residues it contains. For this reason, several classes were constituted according to the number of residues in the sheet. This number was divided by the number of strands in the sheet, yielding an average number of residues per strand and therefore allowing comparison of sheets with different numbers of strands. The first class contains sheets having on average fewer than five residues per strand, the second class contains sheets with five to six residues per strand, and so on. The number of classes was chosen so as to keep a sufficient population of sheets in each class, for reasons of statistical consistency.
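The size classes described above can be sketched as follows. This is only an illustration: the half-open class boundaries (a sheet averaging exactly six residues per strand falls in the third class) and the optional cap on the number of classes are assumptions, and the function name is hypothetical.

```python
import math

def size_class(total_residues, n_strands, max_class=None):
    """Return the 1-based size class of a sheet.

    Class 1: fewer than 5 residues per strand on average.
    Class 2: 5 <= average < 6, class 3: 6 <= average < 7, and so on.
    `max_class` (hypothetical) merges all larger sheets into the last class,
    which keeps sparsely populated classes from being created.
    """
    average = total_residues / n_strands
    if average < 5:
        return 1
    index = int(math.floor(average)) - 3
    if max_class is not None:
        index = min(index, max_class)
    return index

# Hypothetical sheets: (total residues, number of strands)
examples = [(12, 3), (22, 4), (18, 3), (40, 4)]
classes = [size_class(total, n) for total, n in examples]
```

Dividing by the number of strands is what allows three- and four-strand sheets to share the same class scale.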
For every experimental sheet, an 'ideal sheet' was derived. The twist of this model was adjusted to yield the smallest r.m.s. deviation between the modeled and the real sheets. For each group, both the mean value of the twist and the associated confidence interval were calculated. Experimental sheets often suffer multiple distortions, owing to their environment and their functional constraints. For instance, sheets which form a barrel are coiled and can be described simply by the hyperboloid model (Lasters et al., 1988; Lasters, 1990; Flower, 1994; King et al., 1994; Murzin et al., 1994). Therefore, we defined additional conditions to select sheets that can be described by the helical model. The first condition (i) relies on the pseudo-dihedral angles between consecutive Cα atoms, limited to the range 180 ± 45° for β assignment, eliminating overly coiled strands. In the program PSEA (Labesse et al., 1997), the range for β assignment, derived from a statistical analysis, was 170 ± 45°, taking into account the fact that all experimental sheets are slightly coiled. In order to keep β-strands close to planar conformations, we chose 180° to avoid a bias in angles, but kept the same angular spread as that established by Labesse et al. Another condition (ii), on connectivity between strands of the same sheet, helped in choosing the sheets that are less bent along the z-axis. We considered hydrogen bonds between strands of a sheet in the 'neighborhood' of the central Cα of the longest strand of a sheet (Figure 2). If there was a path with a hydrogen bond between adjacent residues, connecting all strands near the central residue, then the sheet was considered as connected. Otherwise, the sheet was defined as unconnected and was omitted from the analysis. However, no restriction on the length of strands in a sheet was imposed.
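Condition (i) can be illustrated with a short sketch that computes the pseudo-dihedral angle defined by four consecutive Cα positions and checks that it lies within 180 ± 45°. The standard atan2 formula for a dihedral angle is used; the function names and the toy coordinates are hypothetical and are not taken from PSEA or from the original analysis.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def pseudo_dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four consecutive points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def within_range(angle, center=180.0, spread=45.0):
    """True if `angle` lies within center +/- spread, handling the +/-180 wrap."""
    deviation = abs((angle - center + 180.0) % 360.0 - 180.0)
    return deviation <= spread

def strand_is_extended(ca_coords, center=180.0, spread=45.0):
    """True if every consecutive C-alpha quadruple satisfies the angular range."""
    return all(
        within_range(pseudo_dihedral(*ca_coords[i:i + 4]), center, spread)
        for i in range(len(ca_coords) - 3)
    )

# Toy coordinates: a flat zig-zag (all angles 180 degrees) and a cis-like turn (0 degrees).
zigzag = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 0.0, 0.0),
          (3.0, 1.0, 0.0), (4.0, 0.0, 0.0)]
turn = [(0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
```

Passing `center=170.0` would reproduce the range quoted for PSEA, under the same assumptions.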
The hypothesis of a normal distribution of twist was verified with the χ² test for each class containing >30 elements. To perform this test we defined four subclasses and compared the observed values with the values expected on the assumption of normality. As the mean µ and standard deviation σ of the normal distribution were unknown, we calculated the mean m and standard deviation s of the sample. We defined our subclasses for the x variable as:
![]() |
![]() |
![]() |
![]() |
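A minimal sketch of such a chi-square test of normality with four subclasses is given below. The subclass boundaries used here (m − s, m and m + s) are an assumption made for illustration, since the original definitions are not reproduced in this text; the function names are hypothetical. With four subclasses and two estimated parameters, one degree of freedom remains, for which the 5% critical value is 3.841.

```python
import math
import statistics

CRITICAL_CHI2_DF1_ALPHA05 = 3.841  # chi-square critical value, 1 d.f., 5% level

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal law."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def chi2_normality(sample):
    """Return (chi2, observed, expected) for four subclasses.

    Assumed subclasses: x < m - s, m - s <= x < m, m <= x < m + s, x >= m + s,
    where m and s are the sample mean and standard deviation.
    """
    n = len(sample)
    m = statistics.mean(sample)
    s = statistics.stdev(sample)
    cuts = [m - s, m, m + s]
    observed = [0, 0, 0, 0]
    for x in sample:
        # Index of the subclass = number of boundaries at or below x.
        observed[sum(1 for c in cuts if x >= c)] += 1
    cdf = [0.0] + [normal_cdf(c, m, s) for c in cuts] + [1.0]
    expected = [n * (cdf[i + 1] - cdf[i]) for i in range(4)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, observed, expected

# Hypothetical, strongly bimodal sample: normality should be rejected.
bimodal = [0.0] * 50 + [10.0] * 50
chi2, observed, expected = chi2_normality(bimodal)
reject_normality = chi2 > CRITICAL_CHI2_DF1_ALPHA05
```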
The hypothesis of equivalence of two populations was verified by the Wilcoxon rank test (Wonnacott and Wonnacott, 1990). We used this test to determine whether the twist depends on the size of the β-sheet and, more precisely, whether there is a range of β-sheet sizes over which the twist is conserved. For this purpose, we compared the mean values of twist in several classes of strands.
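As an illustration of the rank-based comparison of two classes, the sketch below implements the Wilcoxon rank-sum test with the usual normal approximation. It is not the authors' implementation: ties receive average ranks but no tie correction is applied to the variance, the two-sided p-value comes from the normal approximation, and the sample values are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks of `values`, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def wilcoxon_rank_sum(x, y):
    """Return (W, z, p): rank sum of x, its z-score and the two-sided p-value."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return w, z, p

# Hypothetical twist values for two size classes that overlap completely.
w_same, z_same, p_same = wilcoxon_rank_sum([1, 3, 5, 7], [2, 4, 6, 8])
# Hypothetical, fully separated classes.
w_diff, z_diff, p_diff = wilcoxon_rank_sum(list(range(1, 11)), list(range(11, 21)))
```

Equivalence of the two classes is accepted at the 5% level when the p-value exceeds 0.05.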
Results |
---|
Four-strand sheets
While exploring the 50% PDB database, among the 1058 sheets containing four strands, we found 780 sheets satisfying conditions (i), (ii) and (iii) (dihedral conformation, connectivity, limited r.m.s.). Thus 26% of the four-strand sheets were not treated in our study. All possible combinatorial positions of strands were found among the selected sheets but, as shown in Figure 3, their occurrences are far from equal. For the 50% PDB, 73.5% of the selected sheets are antiparallel, 12.8% are parallel, 3% are sheets with one inverted strand in the center, 2.3% are sheets with two inverted strands in the center and 0.3% have two pairs of parallel strands (see Figure 3). Statistical studies were performed only on antiparallel, parallel and mixed sheets, since the other groups had fewer than 20 representatives. The redundancy of the 95% PDB increases the size of the most populated group, the antiparallel one.
The Wilcoxon test (Wonnacott and Wonnacott, 1990) was performed to verify the hypothesis of equivalence of average twists for parallel sheet groups of different sizes. This hypothesis of equivalence could be accepted at a level α = 0.05. Therefore, an average value of twist for all parallel sheets was calculated, yielding 4.27 ± 0.21°/Å for the 95% PDB database and 4.20 ± 0.26°/Å for the 50% PDB database.
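Values such as those quoted above have the form mean ± confidence interval. A minimal sketch of how such a pair could be computed is shown below, assuming a 95% interval based on the normal approximation; the confidence level actually used is not stated in the text, and the twist values in the example are hypothetical.

```python
import math
import statistics

def mean_with_ci(values, z=1.96):
    """Return (mean, half_width) of a normal-approximation confidence interval.

    z = 1.96 corresponds to a 95% interval; this level is an assumption.
    """
    mean = statistics.mean(values)
    half_width = z * statistics.stdev(values) / math.sqrt(len(values))
    return mean, half_width

# Hypothetical twist values in degrees per angstrom.
mean_twist, half_width = mean_with_ci([4.0, 4.2, 4.4])
```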
A study of the more redundant 95% PDB database, in which 1739 sheets satisfying conditions (i), (ii) and (iii) were selected from 2332 sheets, yielded results close to those found for the 50% PDB database. The same dependence of average and mean twist values on the number of residues for different sheet configurations is found in the more redundant database.
Three-strand sheets
The same procedure was applied to explore the properties of sheets containing three strands. For the 50% PDB database, 766 sheets containing three strands and satisfying conditions (i), (ii) and (iii) (dihedral conformation, connectivity and small r.m.s.) were selected among 940 unfiltered ones. Thus 19% of the three-strand sheets were excluded from this study. All possible sheet topologies were also found, as shown in Figure 3. As in the case of four-strand sheets, the antiparallel sheets represent the majority of the sample (78.1% in the 50% PDB). The parallel and mixed groups are distributed almost equally and represent 10.3 and 11.6% of the 50% PDB, respectively. Because three-strand sheets have only three possible topologies (antiparallel, parallel and mixed), all groups contain enough members for reliable statistics. The group of antiparallel sheets was divided into seven classes, the mixed group into three classes and the group of parallel sheets into two classes.
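The count of three topologies for three-strand sheets can be checked by enumeration. The sketch below encodes each strand as +1 or −1 according to its direction, listed in spatial order across the sheet, and treats two patterns as equivalent when they differ only by reversing the strand order or by inverting all directions. These symmetry rules and all names are assumptions made for illustration; the sequential order of strands along the chain is ignored.

```python
from itertools import product

def canonical(directions):
    """Canonical representative of a direction pattern.

    Equivalent patterns: the pattern itself, its reversal (sheet viewed from
    the other side) and the all-inverted versions of both (sheet turned upside down).
    """
    d = tuple(directions)
    variants = []
    for seq in (d, d[::-1]):
        variants.append(seq)
        variants.append(tuple(-x for x in seq))
    return max(variants)

def label(directions):
    """Coarse label: parallel, antiparallel or mixed."""
    pairs = list(zip(directions, directions[1:]))
    if all(a == b for a, b in pairs):
        return "parallel"
    if all(a != b for a, b in pairs):
        return "antiparallel"
    return "mixed"

def distinct_topologies(n_strands):
    """All inequivalent direction patterns for a sheet of n_strands strands."""
    patterns = product((1, -1), repeat=n_strands)
    return sorted({canonical(p) for p in patterns}, reverse=True)

three = distinct_topologies(3)
four = distinct_topologies(4)
labels_three = sorted(label(t) for t in three)
```

Under these assumptions the enumeration gives three patterns for three strands and six for four strands; the coarse `label` function groups every pattern that is neither purely parallel nor purely antiparallel under "mixed".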
As for the four-strand sheets, it was also found that the twist of a sheet depends on its size in terms of mean number of residues and on its topology, as reported in Figure 5. This dependence is similar to that of the four-strand sheets, i.e. the twist decreases as the size of the sheet increases, except for the parallel sheet, but it has a higher variance.
The 95% PDB database, reduced to 1581 sheets out of 1846 after applying the conditions of selection, gave similar results for antiparallel and mixed groups. For the parallel sheet group, the hypothesis of similarity of the distribution of twist upon size was rejected at a level of 5%.
Discussion |
---|
By simple geometric considerations, we proposed to model a sheet as a piece of helical surface, with the Cα atoms of the backbone lying on this surface. This has the main advantage of being very rapid to compute and easy to handle. Although it was not created in order to describe all available sheets, this helical model has proven to be applicable to the majority of three- and four-strand sheets. It was found that, among all considered sheets, antiparallel sheet topologies represent more than 70% of the set, while the sheets satisfying the three conditions required to apply this method represent more than 70% of all three- and four-strand sheets found in the PDB databases. It is therefore important to emphasize that the results presented in this study concern the 'small' and 'regular' sheets which do not belong to a barrel.
This observation also illustrates the fact that the antiparallel topologies are more energetically favorable, which is also true for the three-strand sheets. The short antiparallel sheets of four strands, with fewer than six residues per strand on average, present the highest twist. This may also be a consequence of a lower energy state compared with the parallel sheets. The low-energy state of the antiparallel sheets may favor greater flexibility.
The Wilcoxon test, which is used to decide whether the mean twists of several classes can be considered equal, was applied to the parallel sheets. For the four-strand sheets it was found that the twist values for parallel sheet groups of different sizes have the same distribution, at a level α = 0.05. This result holds for both the 95% and 50% PDB databases. For the three-strand sheets the same result was obtained with the 50% PDB database. However, it did not hold for the 95% PDB database, where the average twists for the sheets with <24 and those with >24 residues per sheet could not be accepted as equivalent. This may be a consequence of the high degree of redundancy of the 95% database. These results indicate that the value of the parallel sheet twist can be considered as independent of the size of the strands that constitute the sheet.
The results of the study performed on the 50% PDB database are consistent with those of the 95% PDB database. The study of the two databases was necessary for a comprehensive analysis of the variation of twist of β-sheets. The results obtained from the 50% PDB database, being less influenced by redundancy, were favored over those of the 95% PDB database. Hence we obtained a set of values of twist for every structural configuration of three- and four-strand β-sheets.
One targeted output of this study was to obtain a simple but accurate model of small β-sheets in order to perform ab initio predictions of protein 3D folds. Indeed, together with helices modeled by rigid cylinders, the β-sheet helical surfaces proposed here will be used to generate a tight packing of regular secondary structures through an appropriate procedure that will be presented elsewhere.
Conclusion
The aim of this work was to estimate the dependence of the twist of a β-sheet on the topology of strands and on their mean length. It was found that the twist of a β-sheet is strongly connected to its strand architecture and to the size of the β-sheet. Parallel sheets, of both four and three strands, present a nearly constant twist, while all other kinds of β-sheets behave differently: the smaller the sheet, the more it is twisted. Therefore, parallel β-sheet twists are less dependent on sheet size than those of β-sheets with other strand topologies. The most comprehensive study was performed on the antiparallel sheets because they represent the majority (>70%) of β-sheets found in the PDB and, therefore, offer the largest sample size. The results of this study will be used in the algorithm RUSSIA (Rigid Secondary Structure Iterative Assembling), to be presented elsewhere.
Notes |
---|
Acknowledgments |
---|
References |
---|
Chou,K.C., Pottle,M., Nemethy,G., Ueda,Y. and Scheraga,H.A. (1982) J. Mol. Biol., 162, 89–112.
Cohen,F.E., Sternberg,M.J.E. and Taylor,W.R. (1980) Nature, 285, 378–382.
Cohen,F.E., Sternberg,M.J.E. and Taylor,W.R. (1982) J. Mol. Biol., 156, 821–860.
Daffner,C. (1994) Protein Sci., 3, 876–882.
Efimov,A.V. (1993) FEBS Lett., 334, 253–256.
Flower,D.R. (1994) Protein Engng, 7, 1305–1310.
Gellman,S.H. (1998) Curr. Opin. Chem. Biol., 2, 717–725.
Hobohm,U. and Sander,C. (1994) Protein Sci., 3, 522.
Hobohm,U., Scharf,M., Schneider,R. and Sander,C. (1992) Protein Sci., 1, 409–417.
Kabsch,W. and Sander,C. (1983) Biopolymers, 22, 2577–2637.
King,R.D., Clark,D.A., Shirazi,J. and Sternberg,M.J.E. (1994) Protein Engng, 7, 1295–1303.
Labesse,G., Colloc'h,N., Pothier,J. and Mornon,J.-P. (1997) CABIOS, 13, 291–295.
Lasters,I. (1990) Protein Engng, 4, 133–135.
Lasters,I., Wodak,S.J., Philippe,A. and van Cutsem,E. (1988) Proc. Natl Acad. Sci. USA, 85, 3338–3342.
Murzin,A.G., Lesk,A.M. and Chothia,C. (1994) J. Mol. Biol., 236, 1369–1381.
Salemme,F.R. (1981) J. Mol. Biol., 146, 143–156.
Salemme,F.R. (1983) Prog. Biophys. Mol. Biol., 42, 95–133.
Wang,L., O'Connell,T., Tropsha,A. and Hermans,J. (1996) J. Mol. Biol., 262, 283–293.
Wonnacott,T.H. and Wonnacott,R.J. (1990) Introductory Statistics for Business and Economics. 4th edn. Wiley, New York.
Yang,A.S. and Honig,B. (1995) J. Mol. Biol., 252, 366–376.
Received December 17, 1999; revised March 9, 2000; accepted March 14, 2000.