Novel protein structural motifs containing two-turn and longer 310-helices

Lipika Pal and Gautam Basu1

Department of Biophysics, Bose Institute, P-1/12 CIT Scheme VIIM, Calcutta 700 054, India


    Abstract
 
The 310-helix constitutes a small but significant fraction of secondary structural elements in proteins. Protein database surveys have shown these helices to be present as α-helical extensions, in loops and as connectors between β-strands. The present work focuses on two-turn and longer 310-helices and establishes that, unlike the more abundant single-turn 310-helices, they frequently occur independent of any other contiguous secondary structural element. More importantly, a large fraction of these independent two-turn and longer 310-helices, along with α-helices and β-strands, are found to form novel super-secondary structural motifs in several proteins, with possible implications for protein folding, local conformational relaxation and biological function.

Keywords: 310-helix/α-helix/secondary structure/structural motif


    Introduction
 
Although the α-helix is the most ubiquitous helical secondary structure in proteins, several recent studies have demonstrated that amino acid residues in proteins can also significantly populate the non-canonical 310-helical conformation (Baker and Hubbard, 1984; Barlow and Thornton, 1988; Karpen et al., 1992; Blundell and Zhu, 1995; Doig et al., 1997). The 310-helix, suggested to be the intermediate between a nascent helix and an α-helix (Millhauser, 1995) and observed in simulation studies of α-helix melting (Fan et al., 1991; Soman et al., 1991), has also been demonstrated to have functional roles in several proteins (McPhalen et al., 1992; Kavavaugh et al., 1993; Kostrikis et al., 1994; De Guzman et al., 1998; Hashimoto et al., 1998). The general consensus from analyses of protein helices present in the Protein Data Bank (PDB; Bernstein et al., 1977) is that 310-helices comprise only 3–4% of all residues and are short. They have also been reported to occur frequently at the termini of α-helices (Baker and Hubbard, 1984; Barlow and Thornton, 1988) and as connectors between two β-strands (Barlow and Thornton, 1988).

Since the majority of 310-helices in proteins are very short, comprising only three residues (one turn), the results of PDB analyses of 310-helices have almost always been determined by these single-turn helices. However, with the increasing number of structures in the PDB, the number of longer (two-turn or more) 310-helices has grown to a small but finite size. Therefore, as the number of known protein structures grows, there is a need to re-examine the role of 310-helices in proteins, especially the longer ones, in the context of their immediate structural environment. The present analysis, which focuses exclusively on two-turn and longer 310-helices in proteins, establishes that, unlike the single-turn 310-helices, they mostly occur independent of any contiguous α-helix or other secondary structural element (SSE), and often as part of a super-secondary structural motif (SSSM), defined as a set of sequence-contiguous SSEs that pack into well-defined three-dimensional super-secondary structural or folding units (Efimov, 1984).


    Materials and methods
 
The December 1997 pdb_select data set (Hobohm and Sander, 1994), with less than 25% sequence identity, was used in this analysis; this set was further reduced by the following screening criteria: (a) resolution ≤2.0 Å and (b) R-factor ≤20%. Although these stringent criteria result in a small data set, they improve the quality of the helices studied, as required in this study. Secondary structure assignments were made using the DSSP program (Kabsch and Sander, 1983). This procedure yielded a total of 267 low sequence-identity and high-resolution protein (chain) structures, which contained 40 two-turn or longer 310-helices. The organization of structural motifs containing the 310-helices was identified by visual inspection along with the DSSP classification of residues. To confirm the results on the small data set and to calculate the amino acid composition of 310-helices, a larger data set, the March 1999 culledpdb data set (Hobohm et al., 1993), with less stringent screening criteria (<30% sequence identity and ≤2.5 Å resolution), was also used. Over-representation of amino acids in the helices was estimated (5% significance level) as described elsewhere (Karpen et al., 1992).
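The selection steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure actually used in this work: the function names are hypothetical, the R-factor is assumed to be stored as a fraction, and a two-turn helix is assumed to correspond to a run of at least six consecutive residues assigned the DSSP code `G` (3-10 helix), in line with the three-residue single turn mentioned in the Introduction.

```python
from typing import List, Tuple


def passes_quality_filter(resolution: float, r_factor: float) -> bool:
    """Screening criteria from the text: resolution <= 2.0 A and R-factor <= 20%.

    Assumption: r_factor is given as a fraction (0.20 means 20%).
    """
    return resolution <= 2.0 and r_factor <= 0.20


def find_310_helices(dssp: str, min_len: int = 6) -> List[Tuple[int, int]]:
    """Return (start_index, length) for every run of 'G' of at least min_len.

    `dssp` is a per-residue DSSP secondary structure string; 'G' is the
    DSSP code for a 3-10 helix. min_len = 6 is the assumed two-turn cutoff.
    """
    helices = []
    start = None
    # A trailing sentinel character closes a run that reaches the chain end.
    for i, code in enumerate(dssp + " "):
        if code == "G":
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                helices.append((start, i - start))
            start = None
    return helices


if __name__ == "__main__":
    # Toy DSSP string (not real data): one six-residue and one three-residue run.
    toy = "EEEGGGGGGTTHHHHGGG"
    print(find_310_helices(toy))             # two-turn and longer only
    print(find_310_helices(toy, min_len=3))  # includes single-turn helices
```

With the default cutoff only the six-residue run is reported; lowering `min_len` to 3 also returns the single-turn helix at the end of the toy string.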


    Results and discussion
 
Quality of the helices

The average backbone dihedral angles of the 40 two-turn and longer 310-helices were found to be φ = –69.3 (±20.9) and ψ = –18.1 (±19.7). This compares well with a recent analysis of a larger dataset (Smith et al., 1996), where the corresponding values were φ = –62.8 (±38.0) and ψ = –16.5 (±34.7). The average backbone dihedral angles for the N-cap and the C-cap residues were [φ = –90.1 (±60.3), ψ = 82.3 (±73.5)] and [φ = –68.5 (±63.6), ψ = 67.5 (±87.2)], respectively. The average pitch of the helices was found to be 5.88 (±0.29) Å. These average values, as well as an inspection of the individual backbone dihedral angles, show that these are indeed well-formed 310-helices and not merely distorted α-helices.
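As a rough plausibility check of the pitch argument, the observed average pitch can be compared with idealized helix geometries. The sketch below assumes textbook values that are not taken from this study (3.0 residues per turn with a rise of 2.0 Å per residue for an ideal 310-helix, and 3.6 residues per turn with a rise of 1.5 Å per residue for an ideal α-helix); the function names are hypothetical.

```python
# Idealized textbook helix geometries (assumed values, not from this study).
IDEAL = {
    "3-10 helix": {"residues_per_turn": 3.0, "rise_per_residue": 2.0},
    "alpha helix": {"residues_per_turn": 3.6, "rise_per_residue": 1.5},
}


def pitch(residues_per_turn: float, rise_per_residue: float) -> float:
    """Pitch (in Angstrom) = residues per turn x rise per residue."""
    return residues_per_turn * rise_per_residue


def closest_ideal(observed_pitch: float) -> str:
    """Name of the idealized helix whose pitch is nearest the observed value."""
    return min(IDEAL, key=lambda name: abs(pitch(**IDEAL[name]) - observed_pitch))


if __name__ == "__main__":
    observed = 5.88  # average pitch reported in the text, in Angstrom
    for name, geometry in IDEAL.items():
        print(f"{name}: ideal pitch {pitch(**geometry):.2f} A")
    print("closest to observed:", closest_ideal(observed))
```

Under these assumed ideal values the reported pitch lies about 0.1 Å from the ideal 310-helix and about 0.5 Å from the ideal α-helix, which is consistent with the conclusion drawn from the dihedral angles.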

Amino acid composition of 310-helices

The larger culledpdb data set, consisting of 1085 protein chains, was used to determine the amino acid composition of 310-helices. The numbers of three-, four-, five-, six-, seven-, eight-, nine- and 10-residue 310-helices present in this set are 2365, 396, 108, 94, 21, 8, 4 and 5, respectively. Here we summarize the results for amino acid over-representation, with the helices classified as short (3–5 residues) or long (6 or more residues). N-cap (N0): Asp, Asn, His, Pro (short) and Asp, Ser (long). N1: Pro, Ala, Gly, Trp (short), Pro, Met (long). N2: Glu, Asp, Ser, Asn, His, Trp (short), Asp, Glu, Gly, Pro (long). N3: Asp, Glu, Phe, His, Leu, Lys, Asn, Gln, Tyr, Ser, Trp, Ala (short), Trp, Tyr, Asp, Glu, Met (long). C-cap (C0): Leu, Ile, Phe, Val, Cys, Gly (short), Phe, Leu, Ile (long). The results for the short 310-helices are largely in accordance with a previous study (Karpen et al., 1992) performed exclusively on 77 three-residue 310-helices. An amino acid composition study is statistical in nature, so its application to a limited number of 310-helices is prone to errors. Nevertheless, our results confirm the reported amino acid composition trend of the more abundant short 310-helices (Karpen et al., 1992) and point to an interesting variation of amino acid composition as a function of helix length. This variation should be examined more carefully when a larger dataset becomes available, before any definite conclusion is drawn.
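The short/long classification can be checked directly against the length distribution quoted above. The following is a minimal sketch; the dictionary and function names are hypothetical, while the counts are those reported in the text.

```python
# Number of 310-helices of each length (in residues), as reported in the text.
LENGTH_COUNTS = {3: 2365, 4: 396, 5: 108, 6: 94, 7: 21, 8: 8, 9: 4, 10: 5}


def split_short_long(counts, long_min=6):
    """Sum helix counts into short (< long_min residues) and long (>= long_min)."""
    short = sum(n for length, n in counts.items() if length < long_min)
    long_ = sum(n for length, n in counts.items() if length >= long_min)
    return short, long_


if __name__ == "__main__":
    short_total, long_total = split_short_long(LENGTH_COUNTS)
    print("short (3-5 residues):", short_total)  # 2869
    print("long (6+ residues):", long_total)     # 132
```

The long class sums to 132 helices, which matches the number of two-turn and longer 310-helices quoted for the larger dataset later in the paper.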

Two-turn and longer 310-helices present in SSSM

When the 310-helices from the small dataset were examined carefully, in more than 60% of cases (25 of the 40 helices) they showed an unprecedented tendency to occur as part of an SSSM. A summary of the observed SSSMs is shown schematically in Figure 1, and a few representative 310-helices are shown in Figure 2 in the context of their immediate structural environments. The remaining 15 helices were found to occur as termini of α-helices (6), termini of the protein chain (1) or part of a long loop (8). Table I summarizes all the 310-helices and proteins studied in this work.
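The breakdown of structural environments can be expressed as fractions of the 40 helices with a short calculation. This is only a sketch; the category labels and function name are hypothetical, and the counts are those given in the paragraph above.

```python
# Structural environment of the 40 two-turn and longer 310-helices (from the text).
ENVIRONMENT_COUNTS = {
    "part of an SSSM": 25,
    "terminus of an alpha-helix": 6,
    "terminus of the protein chain": 1,
    "part of a long loop": 8,
}


def percentages(counts):
    """Convert raw counts to percentages of the total."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    for name, pct in percentages(ENVIRONMENT_COUNTS).items():
        print(f"{name}: {pct:.1f}%")
```

The four categories account for all 40 helices, with the SSSM class at 62.5%.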



Fig. 1. A schematic representation of all the SSSMs containing a 310-helix found in this work. α- and 310-helices are depicted by dark and light barrels, respectively.

 


Fig. 2. Cartoon representation of a select set of two-turn 310-helices in the context of their structural environment, generated by MOLSCRIPT (Kraulis, 1994). 310-helices are darker than the α-helices and all hetero atoms are shown as CPK models: (a) unusual 310-helical terminus of an α-helix in peroxidase coordinating the heme Fe atom; the lone hetero atom is calcium and the side chain of the distal His coordinated to heme is shown as a wireframe model; (b) co-planar heme–310-helix (present as motif II.1) interaction in myoglobin (ferric); (c) the interaction of motif IV.1 in aconitase with the Fe4-S4 cluster and nitroisocitrate; and (d) motif VIII.1 in xanthine-guanine phosphoribosyltransferase with a sulphate and a magnesium ion (with five co-ordinated water molecules).

 

Table I. Structural environment around two-turn and longer 310-helices
 
310-Helices contiguous with an SSE and present in long loops

The 310-helix was found contiguous with an α-helix in five proteins. In peroxidase, the contiguous α-helical and 310-helical stretches in the entire helix are of equal length. The 310-helix contributes the distal histidine coordinating the heme, suggesting an important indirect structural role related to protein function, as shown in Figure 2a. In acetohydroxy acid isomeroreductase only one turn of the α-helix was contiguous with the two-turn 310-helix. This is unusual, as more often the α-helix is longer than the 310-helix. The three other cases of α-helical termini were more typical: a long α-helix with a short 310-helical terminus. In contrast to these cases, in phosphoribosyl anthranilate isomerase the two-turn 310-helix is contiguous with one more turn of 310-helix. A C-terminal 310-helix was found in only one protein, 2Fe-2S ferredoxin.

The occurrence of 310-helices in long loops, as independent SSEs, has been mentioned earlier (Martin et al., 1995). We found eight cases where 310-helices were present in long loops. In half of these, the 310-helix acts as a connector between two SSSMs, as in glutathione synthetase. When not a connector, the loops are often part of an SSSM and exhibit a functional role, like the metal binding site in leucine aminopeptidase.

SSSMs containing a 310-helix

The simplest SSSM containing a 310-helix is the helix–helix motif comprising one α- and one 310-helix. These occur either as a 310/α corner (motif II.1) or a 310/α hairpin (motif II.2). The II.1 motif is mostly present in heme proteins with a globin-like fold, exhibiting strong co-planar interaction with the heme. A typical case, that of myoglobin, is shown in Figure 2b. The only non-heme protein containing the II.1 motif is glycogen phosphorylase where, unlike in the globin counterparts, the 310-helix is attached to the N-terminal end of the α-helix. In the four cases where motif II.2 was found, no SSEs were found close to the motif (in sequence or in space), suggesting inherent stability of the motif as an independent folding unit. Only in one case, that of endoglucanase V, did a disulphide bond covalently connect the two helices.

A single 310-helix also forms an SSSM with two other SSEs. The 310-helix occurs as an anti-parallel β-strand connector (motif III.1) in three proteins. In streptavidin, the anti-parallel β-strands are part of a β-barrel, with Trp120 from the 310-helix being part of the binding site. In T-cell surface glycoprotein, the anti-parallel β-strands are part of a larger β-sandwich super structure, and in pseudoazurin, His81, part of the 310-helix, and Cys78 and Met86, flanking the helix, bind the copper ion. Two proteins contain the 310-helix as the 'turn' in an α-β-hairpin motif (motif III.2): rubisco and triosephosphate isomerase, both sharing the β/α (TIM) barrel fold. The other motif, 310-helix as a parallel β-strand connector (motif III.3), occurs in endo-β-N-acetylglucosaminidase where, strictly speaking, the 310-helix is part of the long loop described earlier.

A 310-helix was also found to occur in SSSMs containing more than three SSEs, as described here. In old yellow enzyme and aconitase, a pair of 310- and α-helices pack against a pair of parallel β-strands (motif IV.1). In both cases, a loop connecting the SSEs in the motif exhibits strong binding activity or is part of the active site. Figure 2c shows the interaction of motif IV.1 of aconitase with the Fe4-S4 cluster. In cellulase CelC, an α-helix–310-helix hairpin motif forms a single-layer SSSM with a pair of short anti-parallel β-strands (motif IV.2). Cellulase CelC exhibits yet another two-layer motif (motif V.1), containing three parallel β-strands and a pair of 310- and α-helices, which recurs in Klebsiella aerogenes urease as well. In both cases several residues that are part of the motif's loops and turns participate in the active site.

Deoxyribonuclease I exhibits a two-layer motif (motif VI.1) containing two pairs of anti-parallel β-strands and a pair of 310- and α-helices, where several loop residues and a β-strand residue are involved in oligopeptide binding. Four parallel β-strands along with three helices, two α-helices and one 310-helix, form a two-layer motif (motif VII.1) in dienelactone hydrolase where, as in the previous motifs, two residues in the loop of the motif form part of the active site. The same set of SSEs, with different connectivity, forms a three-layer motif (motif VII.2) in flavodoxin, where a part of the motif, away from the 310-helix, is involved in FMN binding. Three pairs of anti-parallel β-strands form a pseudo-barrel SSSM (motif VII.3) along with a 310-helix in ascorbate oxidase, where three His residues, contributed by a β-strand and a turn, are part of the active trinuclear copper site. In xanthine-guanine phosphoribosyltransferase, three parallel β-strands, each bent by 90° via a turn, form a motif (motif VIII.1) along with a pair of 310- and α-helices, where Asp89 in the turn is the magnesium ion interaction site, as shown in Figure 2d.

Implications for SSSMs containing a 310-helix

Although we report only a small number of SSSMs containing two-turn and longer 310-helices in this work, the number of SSSMs containing 310-helices grew when the larger culledpdb dataset, with less stringent structural quality, was analyzed. The larger dataset contained 132 two-turn and longer 310-helices, of which 63% occur as part of an SSSM. The remaining 310-helices were found to occur as termini of α-helices (18%), termini of the protein chain (4%) or part of a long loop (15%). This clearly establishes the high propensity of two-turn and longer 310-helices to be present in SSSMs. It should be noted that these SSSMs, especially III–VIII, are not meant to represent typical and frequently observed motifs containing a 310-helix. Rather, they represent structural motifs within which the independent 310-helices occur and therefore, even for single occurrences, they demonstrate the diversity of structural environments in which two-turn and longer 310-helices may occur. On the other hand, SSSM II can be considered a 'conserved' SSSM that occurs again and again in unrelated proteins. A simple sequence and structural analysis of these helices did not yield any clear, statistically significant trend dictating their occurrence or stability, probably due to the small dataset used. Specific interactions, from within the protein and from solvent and ligand molecules, possibly account for the stability of these 310-helices. Only a protein-specific, case-by-case examination at the atomic level would reveal whether the presence of the 310-helices imposes any crucial structural constraints that ultimately translate into a functional role.

These motifs are novel in that they contain a 310-helix in place of the more expected α-helix. It is remarkable that in the majority of the representative SSSMs illustrated here, either the 310-helix itself (direct) or some part of the motif (indirect) was found to have some functional role. A natural question that arises at this point is: 'Why did a particular motif substitute a 310-helix for an α-helix?' As demonstrated by computer simulations (Tirado-Rives et al., 1993; Basu et al., 1994; Huston and Marshall, 1994), the two helical forms are energetically very close, both in terms of the inter-helical free energy difference ΔG° and the free energy of activation ΔG‡. Experimental data also suggest that they are inter-convertible under suitable conditions (Li et al., 1997; Rigby et al., 1997) or that the 310-helix may be an intermediate in α-helix formation (Sundaralingam and Sekharudu, 1989). At the same time, structural motifs are thought to be independent folding units (Efimov, 1984, 1997), because identical structural motifs are often found to occur in unrelated proteins. Recent NMR studies of protein folding intermediates also lend support to such a hypothesis (Barber et al., 1996). In the context of folding and stability, these motifs seem to have stabilized some conformational intermediate containing a 310-helix during their folding pathway. Further, since the motifs often exhibit direct or indirect functional roles, they could also be special in that they can undergo local conformational relaxation (310-helix ⇌ α-helix) under structural constraints arising from their environment, such as binding a ligand. It will be interesting to see, perhaps from computer simulations, the conformational pathways of folding of these motifs and whether these motifs indeed show a 310-helix ⇌ α-helix transition under suitable conditions.


    Acknowledgments
 
We would like to thank D.Pal, P.Chakrabarti, K.Kinoshita and K.Mizuguchi for their critical comments. K.Kinoshita kindly prepared Figure 2. Part of this work was done at the Bioinformatics Centre, Bose Institute, with financial support from CSIR, India.


    Notes
 
1 To whom correspondence should be addressed; email: gautam@boseinst.ernet.in


    References
 
Baker,E.N. and Hubbard,R.E. (1984) Prog. Biophys. Mol. Biol., 44, 97–179.

Basu,G., Kitao,A., Hirata,F. and Go,N. (1994) J. Am. Chem. Soc., 116, 6307–6315.

Barber,E., Barany,G. and Woodward,C. (1996) Folding Des., 1, 65–76.

Barlow,D.J. and Thornton,J.M. (1988) J. Mol. Biol., 201, 601–619.

Bernstein,F.C., Koetzle,T.F., Williams,G.J.B., Meyer,E.F., Brice,M.D., Rodgers,J.R., Kennard,O., Shimanouchi,T. and Tasumi,N. (1977) J. Mol. Biol., 112, 535–542.

Blundell,T.L. and Zhu,Z.-Y. (1995) Biophysical Chem., 55, 167–184.

De Guzman,R.N., Wu,Z.R., Stalling,C.C., Pappalardo,L., Borer,P.N. and Summers,M.F. (1998) Science, 279, 384–388.

Doig,A.J., Macarthur,M.W., Stapley,B.J. and Thornton,J.M. (1997) Protein Sci., 6, 147–155.

Efimov,A.V. (1984) FEBS Lett., 166, 33–38.

Efimov,A.V. (1997) Proteins Struct. Funct. Genet., 28, 241–260.

Fan,P., Kominos,D., Kitchen,D.B. and Levy,R.M. (1991) Chem. Phys., 158, 295–301.

Hashimoto,Y., Kohri,K., Kaneko,Y., Morisaki,H., Kato,T., Ikeda,K. and Nakanishi,M. (1998) J. Biol. Chem., 26, 16544–16550.

Hobohm,U., Scharf,M. and Schneider,R. (1993) Protein Sci., 1, 409–417.

Hobohm,U. and Sander,C. (1994) Protein Sci., 3, 522–524.

Huston,S.E. and Marshall,G.R. (1994) Biopolymers, 34, 75–90.

Kabsch,W. and Sander,C. (1983) Biopolymers, 22, 2577–2637.

Kavavaugh,J.S., Moo-Penn,W.F. and Arnone,A. (1993) Biochemistry, 32, 2509–2513.

Karpen,M.E., DeHaseth,P.L. and Neet,K.E. (1992) Protein Sci., 1, 1333–1342.

Kostrikis,L.G., Liu,D.J. and Day,L.A. (1994) Biochemistry, 33, 1694–1703.

Kraulis,P.J. (1994) J. Appl. Crystallogr., 24, 946–950.

Li,T., Horan,T., Osslund,T., Stearns,G. and Arakawa,T. (1997) Biochemistry, 36, 8849–8857.

Martin,A.C.R., Toda,K., Stirk,H.J. and Thornton,J.M. (1995) Protein Engng, 8, 1093–1101.

McPhalen,C.A., Vincent,M.G., Picot,D., Jansonius,J.N., Lesk,A.M. and Chothia,C. (1992) J. Mol. Biol., 227, 197–213.

Millhauser,G.L. (1995) Biochemistry, 34, 3873–3877.

Rigby,A.C., Baleja,J.D., Li,L., Pedersen,L.G., Furie,B.C. and Furie,B. (1997) Biochemistry, 16, 15677–15684.

Smith,L.J., Bolin,K.A., Schwalbe,H., MacArthur,M.W., Thornton,J.A. and Dobson,C.M. (1996) J. Mol. Biol., 255, 494–506.

Soman,K.V., Karimi,A. and Case,D.A. (1991) Biopolymers, 31, 1351–1361.

Sundaralingam,M. and Sekharudu,Y.C. (1989) Science, 244, 1333–1337.

Tirado-Rives,J., Maxwell,D.S. and Jorgensen,W.J. (1993) J. Am. Chem. Soc., 115, 11590–11593.

Received April 23, 1999; revised July 2, 1999; accepted July 2, 1999.