Length preferences and periodicity in β-strands. Antiparallel edge β-sheets are more likely to finish in non-hydrogen bonded rings

Simon Penel1,2, R.Gwilym Morrison1, Paul D. Dobson1, Russell J. Mortishire-Smith3 and Andrew J. Doig1,4

1Department of Biomolecular Sciences, UMIST, PO Box 88, Manchester M60 1QD and 3Merck, Sharp & Dohme Research Laboratories, Neuroscience Research Centre, Terlings Park, Eastwick Road, Harlow, Essex CM20 2QR, UK. 2Present address: Laboratoire de Biometrie et Biologie Evolutive, Bât 711, CNRS UMR 5558, Université Lyon 1, 43 bd du 11 novembre 1918, 69622 Villeurbanne Cedex, France

4To whom correspondence should be addressed. E-mail: andrew.doig@umist.ac.uk


    Abstract
 
We analysed the length distributions of different types of β-strand in a high resolution, non-homologous set of 500 protein structures, finding differences in their mean lengths. Antiparallel edge strands in strand–turn–strand motifs show a preference for an even number of residues. This propensity is enhanced if the length is corrected for β-bulges, which insert an extra residue into the strand. Residues in antiparallel edge β-strands alternate between being in hydrogen bonded and non-hydrogen bonded rings. Antiparallel edges with an even number of residues are more likely to have their final β residue in a non-hydrogen bonded ring. This suggests that non-hydrogen bonded rings are intrinsically more stable than hydrogen bonded rings, perhaps because their side chain packing is closer. Therefore, we suggest that a simple way to increase β-hairpin stability, or the stability of an antiparallel edge strand, is to have a non-hydrogen bonded ring at the end of the strand.

Keywords: antiparallel/β-bulge/β-sheet/β-turn/secondary structure


    Introduction
 
α-Helices are typically 4–12 residues in length, with few longer than 25 residues (Srinavasan, 1976; Barlow and Thornton, 1988; Kumar and Bansal, 1998; Penel et al., 1999). The upper limit may arise from the maximum diameter of a protein domain. In contrast, β-strands tend to have fewer residues, with lengths of two to six being most common and few having more than 12 (Kabsch and Sander, 1983; Penel et al., 1999). While this may be a result of the β-sheet being more extended than the helix, stability measurements on β-hairpin peptides of varying lengths suggest that, unlike the stability of isolated helices, β-sheet stability may decrease beyond seven to nine residues (Stanger et al., 2001).

β-Sheets are complex structures, in which hydrogen bonding between adjacent strands can be parallel or antiparallel, and residues in edge strands can have their backbone NH and CO groups either hydrogen bonding within the sheet or oriented toward solvent. Figure 1 shows the following distinct structural locations for residues in β-sheets, arranged in a hierarchy: mixed, antiparallel centre, parallel centre, hydrogen bonded antiparallel edge, non-hydrogen bonded antiparallel edge, hydrogen bonded parallel edge and non-hydrogen bonded parallel edge. We further distinguish between coherent and non-coherent β-strands. Coherent strands are bonded to the same β-strand throughout the strand edge. Non-coherent strands are bonded to more than one different strand. Non-coherent strands often cannot be classified simply, as they may, for example, be edge at one end and centre at the other.



 
Fig. 1. β-Sheet substructure hierarchy showing the seven distinct structures found in β-sheets and the parent structures associated with them.

 
Kabsch and Sander (1983) have previously highlighted small differences in the length distribution of parallel and antiparallel β-strands. Here we examine the length preferences of the different strand types in more detail. Most interestingly, we find that particular types of strand are more likely to have either an odd or an even number of residues.


    Materials and methods
 
We surveyed the September 2001 pdb_select list (Hobohm and Sander, 1994) to select, from the Brookhaven Protein Data Bank (PDB) (Bernstein et al., 1977), protein structures solved at a resolution better than 2.5 Å, with an R-factor lower than 25% and with less than 25% sequence identity to one another. We selected only one member from each SCOP superfamily (Murzin et al., 1995) to remove structural homology, giving a data set of 500 protein structures. We used the program SSTRUC (Smith and Thornton, 1989) to assign the secondary structure of each protein and the program HBPLUS (McDonald and Thornton, 1994) to generate all the hydrogen bonds in the structures.
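The selection criteria above can be illustrated with a minimal sketch. It is not the procedure actually used: the `Entry` record, its field names and the example identifiers are hypothetical, the 25% identity filter is assumed to have been applied already by pdb_select, and keeping the best-resolved member of each superfamily is an assumption, since the text does not say how the single representative was chosen.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entry:
    pdb_id: str        # hypothetical identifier
    resolution: float  # in Angstrom; smaller is better
    r_factor: float    # as a fraction, e.g. 0.19 for 19%
    superfamily: str   # SCOP superfamily label (hypothetical strings here)

def select_entries(entries, max_resolution=2.5, max_r_factor=0.25):
    """Keep one entry per superfamily among those passing both cut-offs."""
    best = {}
    for e in entries:
        # Reject structures that fail the resolution or R-factor cut-off.
        if e.resolution >= max_resolution or e.r_factor >= max_r_factor:
            continue
        current = best.get(e.superfamily)
        # Assumption: prefer the best-resolved member of each superfamily.
        if current is None or e.resolution < current.resolution:
            best[e.superfamily] = e
    return sorted(best.values(), key=lambda e: e.pdb_id)

candidates = [
    Entry("aaaa", 1.8, 0.19, "sf1"),
    Entry("bbbb", 2.2, 0.21, "sf1"),  # same superfamily, worse resolution
    Entry("cccc", 2.7, 0.20, "sf2"),  # fails the resolution cut-off
    Entry("dddd", 2.0, 0.27, "sf3"),  # fails the R-factor cut-off
    Entry("eeee", 2.4, 0.24, "sf4"),
]
selected_ids = [e.pdb_id for e in select_entries(candidates)]
```

With these made-up candidates, only one member of `sf1` survives and the entries failing either cut-off are dropped.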


    Results
 
Table I gives the length distributions for each different strand type. Coherent strands can be subdivided into parallel, antiparallel and mixed or into edge and centre (Figure 1). Coherent edge strands are subdivided into parallel and antiparallel. Centre strands are subdivided into parallel, antiparallel or mixed.


 
Table I. Length distributions of different types of β-strand
 
Coherent strands are more abundant than non-coherent strands, though their mean length is lower. A non-coherent strand is more likely to be long because, by definition, it interacts with more than one other strand, and this becomes more probable as strand length increases. Within the coherent set, antiparallel strands are more abundant than parallel strands (76% versus 22%) and mixed strands are rare (1%). The majority of coherent strands are edge (93%) rather than centre (7%). Most of the central β-strands are non-coherent (80%). Edge antiparallel strands (78%) are more abundant than parallel edges (22%). Within the coherent centre subset, antiparallel strands (49%) are slightly favoured over parallel strands (34%), with mixed being the least frequent (16%). There is a general trend for antiparallel strands to have longer mean lengths than parallel strands, with the rare mixed strands having a short mean length. Edge strands have slightly longer mean lengths than centre strands. Within the coherent set, more abundant strand types tend to have longer mean lengths, as high abundance and length are both signs of intrinsic stability. (Similarly, 3₁₀-helices have both a lower abundance and a shorter length than α-helices.)

We previously found that there is an approximately 3.6 residue periodicity in the frequencies of α-helix lengths, corresponding to the 3.6 residues per turn found in the helix (Penel et al., 1999). Favoured helix lengths have their N and C ends on the same face of the helix. In view of this, we investigated whether a similar periodicity was present in β-strand lengths. As side chains in strands alternate between the two sides of the sheet, we investigated whether there is a 2-fold periodicity in strand length so that either an odd or an even number of residues per strand is favoured. Table I gives the fractions of strands with odd or even lengths for each strand type. A small variation is seen, though no strand type deviates from 50% by more than 8%.

More interesting results were found when examining a β-hairpin subset of the antiparallel edge group, specifically antiparallel edges immediately following a reverse turn joining the edge strand to another strand (a strand–turn–strand β-hairpin motif, Figure 2a). These were found by searching for the secondary structure motif ETTEnX, where n is the length of the antiparallel edge, T is a turn residue, E is a β-sheet residue and X is a non-sheet residue. Table II gives the results for this strand type. A preference for an even number of residues in the strand is now apparent, with 215/342 = 63% of strands having an even length. However, this preference is present only for short strands (fewer than six residues).
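As an illustration of the ETTEnX search, the sketch below scans a one-letter-per-residue secondary structure string with a regular expression. It is not the code used for the survey: the string encoding (E for sheet, T for turn, any other letter for other states), the function names and the example strings are assumptions. It also checks only the letter pattern, not whether the strand is an antiparallel edge, which would need the hydrogen bonding information.

```python
import re

# A lookbehind is used for the preceding strand residue so that it is not
# consumed by the match; back-to-back hairpins are then both found.
# The lookahead requires a non-sheet residue (X) directly after the E-run.
_ETTENX = re.compile(r"(?<=E)TT(E+)(?=[^E])")

def find_ettenx(ss):
    """Return (start_index, n) for every E-run that completes an ETTEnX motif."""
    return [(m.start(1), len(m.group(1))) for m in _ETTENX.finditer(ss)]

def even_fraction(lengths):
    """Fraction of the given lengths that are even; None if there are none."""
    lengths = list(lengths)
    if not lengths:
        return None
    return sum(1 for n in lengths if n % 2 == 0) / len(lengths)

# Hypothetical assignment string; 'C' stands for any non-sheet, non-turn state.
example = "CCEEEETTEEEECC"
hits = find_ettenx(example)
strand_lengths = [n for _, n in hits]
```

Applying `even_fraction` to the lengths collected over a whole data set gives the kind of proportion quoted above (215/342, about 63%).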




 
Fig. 2. (a) Example of an ETTE3X motif, analysed in Table II, with an antiparallel edge strand length of four. (b) Example of an XE3TTE motif, analysed in Table III, with an antiparallel edge strand length of four.

 

 
Table II. Length distributions for antiparallel edge β-strands following a reverse turn
 
There are two types of ring within a pair of adjacent antiparallel β-strands: hydrogen bonded and non-hydrogen bonded (Salemme, 1983) (Figure 3). Table II shows that 76% of strands that have a hydrogen bonded ring at their C-terminal positions (i.e. the last residue in a β conformation) have an odd number of residues, while 80% of strands that have a non-hydrogen bonded ring at their C-terminal positions have an even number of residues. Figure 2 shows that this is to be expected since a hydrogen bonded ring is adjacent to the turn in a β-hairpin. Similar trends are found in antiparallel edges immediately preceding a reverse turn joining the edge strand to another strand in an XEnTTE motif (Table III). Here, 57% of the strands have an even number of residues.



 
Fig. 3. Hydrogen bonded and non-hydrogen bonded rings in antiparallel β-sheets.

 

 
Table III. Length distributions for antiparallel edge β-strands preceding a reverse turn
 
The odd/even preference is switched if a β-bulge is present. A β-bulge occurs when two residues in one β-strand are found opposite one residue in the other strand (Richardson et al., 1978; Richardson, 1981; Milner-White, 1987; Chan et al., 1993). This insertion of an extra residue into the strand will switch the preference from even to odd, since the Kabsch and Sander algorithm, used to define secondary structure, does not distinguish between a β-strand residue and a β-bulge residue. As the length of the strand increases, the probability of a β-bulge within the strand increases, so the 2-fold periodicity that is clear for short strands becomes lost. Therefore, we refined the results by subtracting the number of β-bulge residues from the length of each antiparallel edge strand (Tables II and III). The preferences for odd or even lengths are now enhanced. In ETTEnX motifs, 67% of strands have an even number of residues (increased from 63%); in XEnTTE motifs 63% of strands have an even number of residues (increased from 57%). A preference for antiparallel edge β-strands to have a non-hydrogen bonded ring at their ends is therefore clear.

Distortions from the regular patterns of hydrogen bonding in antiparallel edge strands, shown in Figures 2 and 3, are occasionally present. It is expected that strands that finish with a hydrogen bonded ring have an odd number of residues, while strands that finish with a non-hydrogen bonded ring should have an even number of residues. This is the case 79 and 84% of the time, respectively. When corrected for the number of β-bulges, in ETTEnX motifs 93% of strands that finish with a hydrogen bonded ring have an odd number of residues (increased from 76%) and 94% of strands that finish with a non-hydrogen bonded ring have an even number of residues (increased from 80%, Table II). In XEnTTE motifs, 95% of strands that finish with a hydrogen bonded ring have an odd number of residues (increased from 79%) and 98% of strands that finish with a non-hydrogen bonded ring have an even number of residues (increased from 84%), after the β-bulge correction (Table III). These data show that deviations from the expected antiparallel edge strand structure are found in ~5% of cases. These may arise from distortions so that a hydrogen bond is missing, for example.
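The expected relationship between strand length, β-bulge count and the ring at the end of the strand can be written as a short sketch. The function names are hypothetical, and the rule encodes only the idealized expectation for a regular antiparallel edge strand adjacent to a reverse turn; as noted above, a small fraction of observed strands deviate from it.

```python
def corrected_length(length, bulge_residues=0):
    """Strand length after subtracting the residues inserted by beta-bulges."""
    if length < 1:
        raise ValueError("strand length must be at least 1")
    if not 0 <= bulge_residues < length:
        raise ValueError("bulge residue count must be >= 0 and < strand length")
    return length - bulge_residues

def expected_end_ring(length, bulge_residues=0):
    """Idealized ring type at the strand end furthest from the turn.

    An even corrected length is expected to finish in a non-hydrogen bonded
    ring, and an odd corrected length in a hydrogen bonded ring.
    """
    if corrected_length(length, bulge_residues) % 2 == 0:
        return "non-hydrogen bonded"
    return "hydrogen bonded"

# Each bulge residue flips the parity, and hence the expected ring.
no_bulge = [expected_end_ring(n) for n in (3, 4)]
one_bulge = [expected_end_ring(n, 1) for n in (4, 5)]
```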

We analysed the complete set of antiparallel edge strands, not just those that are preceded or followed by a β-turn. We found that 2045/4507 = 45% of antiparallel edges have hydrogen bonded rings at their C-terminal ends and 2169/4507 = 48% of antiparallel edges have hydrogen bonded rings at their N-terminal ends. Non-hydrogen bonded rings are therefore more abundant at the N and C ends in all edge β-strands, further supporting our conclusion.
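The arithmetic behind these proportions is sketched below using the counts quoted in the text. The complementary percentages assume that every strand end is classified as either a hydrogen bonded or a non-hydrogen bonded ring; that binary split is an assumption made for this illustration.

```python
def percent(count, total):
    """Percentage of total represented by count."""
    return 100.0 * count / total

TOTAL_EDGES = 4507   # antiparallel edge strands (from the text)
HB_C_TERM = 2045     # hydrogen bonded ring at the C-terminal end
HB_N_TERM = 2169     # hydrogen bonded ring at the N-terminal end

hb_c = percent(HB_C_TERM, TOTAL_EDGES)
hb_n = percent(HB_N_TERM, TOTAL_EDGES)

# Assumed complement: ends not in a hydrogen bonded ring are counted as
# being in a non-hydrogen bonded ring.
non_hb_c = 100.0 - hb_c
non_hb_n = 100.0 - hb_n

print(f"C-terminal: {hb_c:.0f}% hydrogen bonded, {non_hb_c:.0f}% non-hydrogen bonded")
print(f"N-terminal: {hb_n:.0f}% hydrogen bonded, {non_hb_n:.0f}% non-hydrogen bonded")
```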


    Discussion
 
The preference for antiparallel edge strands adjacent to a reverse turn to have an even number of residues can be rationalized by considering the structure of the end β residue in the strand. Antiparallel edge strands adjacent to a reverse turn with an even number of residues and no β-bulges, or those with an odd number of residues and one β-bulge in the strand, finish with a non-hydrogen bonded ring. Conversely, antiparallel edge strands adjacent to a reverse turn with an odd number of residues and no β-bulges, or those with an even number of residues and one β-bulge in the strand, finish with a hydrogen bonded ring. This suggests that an explanation for our length preferences is that non-hydrogen bonded rings are more stable than hydrogen bonded rings. This is plausible because the Cβ atoms in a non-hydrogen bonded ring are closer than in a hydrogen bonded ring, by ~1 Å, giving a van der Waals interaction between these CH2 or CH groups. This may make side chain interactions across non-hydrogen bonded rings more favourable, in most cases. Hydrogen bonded and non-hydrogen bonded rings may also differ in flexibility, leading to a difference in stability, though it is not clear which would be more flexible. While non-hydrogen bonded rings do not have bonded central CO and NH groups, a potential gain in flexibility from this could be offset by rigidity imposed by shorter side chain contacts.

Wouters and Curmi (1995) found differences in the amino acid preference for hydrogen bonded and non-hydrogen bonded sites in antiparallel β-sheets. Hutchinson et al. (1998) compared side chain pair frequencies and structures across the two types of ring and found some clear differences in preferred interactions. Smith and Regan (1995) measured some side chain interactions within a hydrogen bonded ring in protein G. While these studies show that there are differences in preference between the two sites, they do not show whether one site is intrinsically more stable than the other. This could, in principle, be tested in a β-hairpin peptide. β-Hairpin structures will finish in either hydrogen bonded or non-hydrogen bonded rings. We suggest that β-hairpins with a non-hydrogen bonded ring at their ends are more likely to be stable. Hence, a systematic investigation of hairpin stability as a function of length would be expected to show a periodicity in stability. To our knowledge, such a study has not been performed; all the peptides made by Stanger et al. (2001) in their work on the length dependence of β-hairpin stability finished with a hydrogen bonded ring. A range of β-sheet peptides has been shown to fold successfully (for recent reviews see Serrano, 2000; Cheng et al., 2001; Searle, 2001; Venkatraman et al., 2001). Our work suggests that a simple way to increase hairpin stability, or the stability of an antiparallel edge strand, is to have a non-hydrogen bonded ring at the end of the β-strand. This can be achieved if the number of residues in the β-hairpin, with the two strands linked by a β-turn and with no β-bulges, is 4N + 2, where N is an integer. An alternative explanation for the periodicity we observe is that the end residues in antiparallel edge strands are more stable with a peptide NH group facing solvent than with a peptide CO group facing solvent, though we know of no reason why this might be the case.
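One reading of the 4N + 2 rule, assumed here rather than derived in the text, is a hairpin made of two strands of equal even length 2N joined by a two-residue turn (the TT of the motifs above), giving 2N + 2 + 2N = 4N + 2 residues. The sketch below checks this arithmetic; the function names and the default turn length are assumptions.

```python
def hairpin_residue_count(strand_length, turn_length=2):
    """Total residues in a hairpin of two equal strands plus a turn.

    Assumes no beta-bulges and a turn contributing turn_length residues.
    """
    if strand_length < 1 or turn_length < 0:
        raise ValueError("lengths must be positive")
    return 2 * strand_length + turn_length

def is_4n_plus_2(total):
    """True if total can be written as 4N + 2 for a non-negative integer N."""
    return total >= 2 and total % 4 == 2

# Even strand lengths (2N) give totals of the form 4N + 2; odd lengths do not.
even_strand_totals = [hairpin_residue_count(n) for n in (2, 4, 6)]
odd_strand_totals = [hairpin_residue_count(n) for n in (3, 5)]
```

Under these assumptions, a peptide length series stepping through 6, 10 and 14 residues would keep a non-hydrogen bonded ring at the strand ends, while 8 and 12 would not.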


    Acknowledgements
 
We thank Sam Gellman and Dek Woolfson for helpful discussions. This work was funded by the Wellcome Trust (grant number 051374). R.G.M. was supported by a BBSRC CASE Award with Merck, Sharp & Dohme. P.D.D. thanks the BBSRC Engineering and Biological Systems committee for a studentship.


    References
 
Barlow,D.J. and Thornton,J.M. (1988) J. Mol. Biol., 201, 601–619.

Bernstein,F.C., Koetzle,T.F., Williams,G.J., Meyer,E.F.,Jr, Brice,M.D., Rodgers,J.R., Kennard,O., Shimanouchi,T. and Tasumi,M. (1977) J. Mol. Biol., 112, 535–542.

Chan,A.W.E., Hutchinson,E.G., Harris,D. and Thornton,J.M. (1993) Protein Sci., 2, 1574–1590.

Cheng,R.P., Gellman,S.H. and DeGrado,W.F. (2001) Chem. Rev., 101, 3219–3232.

Hobohm,U. and Sander,C. (1994) Protein Sci., 3, 522–524.

Hutchinson,E.G., Sessions,R.B., Thornton,J.M. and Woolfson,D.N. (1998) Protein Sci., 7, 2287–2300.

Kabsch,W. and Sander,C. (1983) Biopolymers, 22, 2577–2637.

Kumar,S. and Bansal,M. (1998) Biophys. J., 75, 1935–1944.

McDonald,I.K. and Thornton,J.M. (1994) J. Mol. Biol., 238, 777–793.

Milner-White,E.J. (1987) Biochim. Biophys. Acta, 911, 261–265.

Murzin,A.G., Brenner,S.E., Hubbard,T. and Chothia,C. (1995) J. Mol. Biol., 247, 536–540.

Penel,S., Morrison,R.G., Mortishire-Smith,R.J. and Doig,A.J. (1999) J. Mol. Biol., 293, 1211–1219.

Richardson,J.S. (1981) Adv. Protein Chem., 34, 167–339.

Richardson,J.S., Getzoff,E.D. and Richardson,D.C. (1978) Proc. Natl Acad. Sci. USA, 75, 2574–2578.

Salemme,F.R. (1983) Prog. Biophys. Mol. Biol., 42, 95–113.

Searle,M.S. (2001) J. Chem. Soc. Perkin Trans. 2, 7, 1011–1020.

Serrano,L. (2000) Adv. Protein Chem., 53, 49–85.

Smith,C.K. and Regan,L. (1995) Science, 270, 980–982.

Smith,D.K. and Thornton,J.M. (1989) SSTRUC Computer Program. Department of Biochemistry and Molecular Biology, University College, London.

Srinavasan,R. (1976) Indian J. Biochem. Biophys., 13, 192–193.

Stanger,H.E., Syud,F.A., Espinosa,J.F., Giriat,I., Muir,T. and Gellman,S.H. (2001) Proc. Natl Acad. Sci. USA, 98, 12015–12020.

Venkatraman,J., Shankaramma,S.C. and Balaram,P. (2001) Chem. Rev., 101, 3131–3152.

Wouters,M.A. and Curmi,P.M. (1995) Proteins, 22, 119–131.

Received June 19, 2003; revised October 25, 2003; accepted November 7, 2003




