Department of Surgery, Duke University Medical Center, Durham, NC 27710, USA
Abstract
Keywords: α-helix/intermediate/protein folding/protein structure
Introduction
The potential to form amphiphilic α-helix is a property that one might expect to find exclusively in proteins that contain amphiphilic α-helix. However, all predominantly β-sheet proteins contain several regions that (a) have a level of amphiphilic α-helical potential that does not occur in randomly generated sequences and that is an indicator of native α-helix in helical proteins (Parker and Song, 1990; Parker and Stezowski, 1996), (b) are not apparently associated with any particular type of native secondary structure (Parker and Stezowski, 1996) and (c) might form α-helix under certain conditions (Jirgensons, 1982; Parker and Song, 1992). The presence of this strong amphiphilic α-helical potential, which is not involved in the native fold, suggests that α-helix may have been present at some point before the native structure was reached. These observations, along with the idea that amphiphilic α-helical segments may be ideally suited for rapid hydrophobic collapse of a protein during folding (Lim, 1978), strongly suggest that non-native amphiphilic α-helices may be involved as intermediates in protein folding.
In this work, we found that the location of regions with amphiphilic α-helical potential in β-sheet proteins does not show a particular relation to native structure, even within non-homologous representatives of a given fold. In addition, there was no apparent association of amphiphilic α-helical potential with any particular type of secondary structure. This confirms the idea that these regions are not critical to the maintenance of native structure and suggests that, if these regions are critical for any particular process or processes, that process is highly adaptable and accommodating in terms of the structure required to accomplish it. Further, we evaluated the three-dimensional arrangement of regions with amphiphilic α-helical potential in β-sheet proteins. Surprisingly, the average region-to-region distance along the shortest path connecting all regions in the native structure of the proteins studied was relatively consistent at about 13 Å, ranging from 11 to 15 Å. No such consistently spaced intervals were observed when the location of regions was randomly assigned, indicating that the consistent spacing did not occur by chance (p = 0.0056), but rather that the intervals were specific in their organization. It is proposed that the arrays of regions with amphiphilic α-helical potential form `dormant domains' that have previously been described as elliptical in shape (Parker and Stezowski, 1996) and that these dormant domains may be involved in (a) folding intermediates containing amphiphilic α-helical bundles and/or (b) a ubiquitous but specific template for docking with chaperones.
Methods
Regions with amphiphilic α-helical potential were identified as described previously (Parker and Song, 1990). The helical hydrophobic moment (<µH>), a quantitative measure of helical amphiphilicity, was calculated according to Eisenberg et al. (Eisenberg et al., 1982) using the following equation:
$$\langle\mu_H\rangle = \frac{1}{N}\left\{\left[\sum_{n=1}^{N} H_n \sin(\delta n)\right]^2 + \left[\sum_{n=1}^{N} H_n \cos(\delta n)\right]^2\right\}^{1/2}$$
where n is a specific amino acid in a segment of N residues, δ is the angle between residues as viewed down the helical axis (100° for an α-helix) and Hn is the hydrophobic value assigned to residue n. Hydrophobic values were assigned to each amino acid using the method of Eisenberg (Eisenberg et al., 1984). A segment length (N) of 11 residues was selected for all calculations and a moving window algorithm was used. Regions with high amphiphilic α-helical potential were considered to be regions with at least three consecutive <µH> values > 0.35. This cutoff value was used since it was previously found to be useful in predicting the occurrence of native α-helix in α-helical proteins (Parker and Song, 1990).
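The moving-window calculation described above can be sketched as follows. This is a minimal illustration, not the authors' program: the function names are hypothetical, the per-residue hydrophobic values are passed in by the caller rather than taken from a built-in table, and reporting regions as runs of window start indices is an assumption, since the text does not say how window positions were mapped back to residues.

```python
import math

def mean_hydrophobic_moment(hydrophobicities, delta_deg=100.0):
    """<muH> for one window: length of the summed moment vector divided by N."""
    delta = math.radians(delta_deg)
    s = sum(h * math.sin(delta * n) for n, h in enumerate(hydrophobicities, start=1))
    c = sum(h * math.cos(delta * n) for n, h in enumerate(hydrophobicities, start=1))
    return math.hypot(s, c) / len(hydrophobicities)

def moment_profile(values, window=11, delta_deg=100.0):
    """<muH> for every full window of `window` residues along the sequence."""
    return [mean_hydrophobic_moment(values[i:i + window], delta_deg)
            for i in range(len(values) - window + 1)]

def high_potential_regions(profile, cutoff=0.35, min_run=3):
    """Runs of at least `min_run` consecutive values > cutoff.

    Returns inclusive (first, last) index pairs into `profile`
    (assumption: indices are window start positions).
    """
    regions = []
    run_start = None
    # A sentinel below any cutoff closes a run that reaches the end.
    for i, value in enumerate(list(profile) + [float("-inf")]):
        if value > cutoff:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run:
                regions.append((run_start, i - 1))
            run_start = None
    return regions
```

In use, a sequence would first be converted to a list of hydrophobic values with whatever scale is chosen, passed to `moment_profile`, and the resulting profile passed to `high_potential_regions`.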
Selection of protein structures for study
A basis set of protein structures was selected to evaluate the spatial distribution of regions with amphiphilic α-helical potential. Structures comprising a complete fold as defined by the SCOP database (Murzin et al., 1995) were chosen such that each type of β-sheet fold was represented at least once, with the following exceptions: membrane-associated folds (membrane penetration domain of minor coat protein), structures having an extensive region(s) outside the central globular portion of the protein (polio virus coat proteins; outer surface protein A, chain o) and folds not forming an intact hydrophobic core (bacteriochlorophyll A; near-membrane domain of outer membrane proteins).
Ninety-one non-homologous structures representing 56 different types of β-sheet folds were thus chosen and the amphiphilic α-helical potential of each was evaluated as described above. Structures were selected for study if they contained ≥6 regions with amphiphilic α-helical potential and if at least 60% of those regions were found to adopt a predominantly non-α-helical conformation in the native fold. It should be noted that although a minimum of six regions was chosen as a reasonable number for studying the geometry and shape of the structures, there is no reason why six is a minimum for any biological role and indeed most proteins evaluated had < 6 regions with amphiphilic α-helical potential. Secondary structure as defined by DSSP (Kabsch and Sander, 1983) was used for the evaluation. The 23 structures which met the selection criteria (Table I) were used as a `basis set' for further evaluation.
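The two selection criteria can be expressed as a simple filter. The sketch below is illustrative only. It assumes a per-residue DSSP string in which `H` marks α-helix, regions given as half-open `(start, end)` residue ranges, and it reads "predominantly non-α-helical" as "fewer than half of the residues are `H`". That reading is an assumption, not a definition taken from the text, and the function names are hypothetical.

```python
def is_mostly_nonhelical(dssp, start, end):
    """True if fewer than half of the residues in dssp[start:end] are 'H'.

    Assumption: 'predominantly non-alpha-helical' means < 50% helix.
    """
    segment = dssp[start:end]
    helical = sum(1 for state in segment if state == "H")
    return helical < len(segment) / 2

def meets_selection_criteria(dssp, regions, min_regions=6, min_fraction=0.60):
    """Apply both criteria: enough regions, and enough of them non-helical."""
    if len(regions) < min_regions:
        return False
    nonhelical = sum(is_mostly_nonhelical(dssp, s, e) for s, e in regions)
    return nonhelical / len(regions) >= min_fraction
```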
A measure of the average distance between regions with amphiphilic α-helical potential was used to probe the spatial arrangement of those regions. Specifically, the average region-to-region distance along the shortest path connecting all regions was used. To calculate this distance, the position of each region with amphiphilic α-helical potential was approximated as the mean x, mean y and mean z values of all of the α-carbons of that region. Calculation of positions of regions and generation of a matrix including all region-to-region distances was carried out using the program Matlab. The shortest path connecting all regions was found using a program written in QuickBasic that tabulated the length of all possible paths connecting all regions and selected the shortest one. Once the shortest path connecting all regions was found, the average region-to-region distance along that path was calculated by dividing the length of the path by n − 1, where n is the number of regions with amphiphilic α-helical potential in that protein.
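The exhaustive search described above can be sketched in a few lines. This is not the original Matlab or QuickBasic code. It interprets "path connecting all regions" as an open path that visits each region position exactly once, which is consistent with dividing by n − 1 but is still an interpretation, and the function names are hypothetical.

```python
import math
from itertools import permutations

def centroid(ca_coords):
    """Mean x, y, z of a region's alpha-carbon coordinates."""
    n = len(ca_coords)
    return tuple(sum(c[i] for c in ca_coords) / n for i in range(3))

def shortest_connecting_path(points):
    """Try every visiting order and keep the shortest open path."""
    best_length, best_order = math.inf, None
    for order in permutations(range(len(points))):
        # A path and its reverse have the same length; skip one of them.
        if order[0] > order[-1]:
            continue
        length = sum(math.dist(points[a], points[b])
                     for a, b in zip(order, order[1:]))
        if length < best_length:
            best_length, best_order = length, order
    return best_length, best_order

def average_interval(points):
    """Path length divided by n - 1, where n is the number of regions."""
    if len(points) < 2:
        raise ValueError("need at least two regions")
    length, _ = shortest_connecting_path(points)
    return length / (len(points) - 1)
```

The number of candidate paths grows as n!, so a brute-force tabulation of this kind is practical only for small numbers of regions.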
Structures containing randomly assigned distributions of regions with amphiphilic α-helical potential
To determine whether results obtained using the basis set of 23 structures could occur by chance, the results were compared with those obtained using basis sets in which the location of regions with amphiphilic α-helical potential was randomly assigned. Non-overlapping control regions were randomly assigned to a given structure such that the number and length of regions was the same as that found in the native structure. One hundred randomly assigned structures were constructed for each of the 23 structures in the basis set. Sets of 23 such control structures were then generated by randomly selecting one of the randomly generated structures (out of 100) for each structure in the basis set. In this manner, 10 000–50 000 sets of randomly generated structures were generated and used for comparison with the basis set containing the 23 naturally occurring structures. When evaluated, differences between results obtained using 10 000 sets of randomly generated structures and 50 000 sets of such structures were negligible, indicating that 10 000 sets of structures were sufficient.
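One way to generate such control structures is sketched below. The text does not state how the random placement was performed, so the scheme here is an assumption: the native region lengths are shuffled and the leftover residues are distributed as random gaps, which makes every non-overlapping placement equally likely. All names are hypothetical.

```python
import random

def random_region_placement(seq_length, region_lengths, rng=random):
    """Place non-overlapping regions of the given lengths at random.

    Returns half-open (start, end) residue ranges in sequence order.
    """
    lengths = list(region_lengths)
    rng.shuffle(lengths)  # assumption: order of lengths is also randomized
    k = len(lengths)
    free = seq_length - sum(lengths)
    if free < 0:
        raise ValueError("regions do not fit in the sequence")
    # Choose k of (free + k) slots; the unchosen slots become gap residues.
    slots = sorted(rng.sample(range(free + k), k))
    regions, offset = [], 0
    for i, (slot, length) in enumerate(zip(slots, lengths)):
        start = slot - i + offset
        regions.append((start, start + length))
        offset += length
    return regions

def draw_control_set(pools, rng=random):
    """Pick one random control structure from each structure's pool."""
    return [rng.choice(pool) for pool in pools]
```

For the procedure described above, each structure would have a pool of 100 placements, and `draw_control_set` would be called once per control set.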
Superposition of protein structures
Proteins were superimposed by first selecting topologically equivalent atoms (Holm and Sander, 1994) within the core of the fold and generating translation and rotation matrices to move one set of coordinates on to another. Coordinates for the entire protein were then aligned using the appropriate translation and rotation operations. The program Supr, which uses Diamond's method as translated from his superpos.f subroutine by Richardson, was used for this process. Structures were visualized using the MAGE display program (Richardson and Richardson, 1992).
Results and discussion
It was proposed by Lim (Lim, 1978) that amphiphilic α-helices participate in the rapid condensation of all proteins during the initial stages of folding. This idea has received direct experimental confirmation for the case of predominantly α-helical proteins, but there is only sparse evidence (Hamada and Goto, 1997; Arai et al., 1998) that such a mechanism might be generally applicable to β-sheet proteins. If indeed all proteins fold by way of collapse of amphiphilic α-helical bundles, then some `residual' from this process may still remain in the native structure of β-sheet proteins in the form of distinct patterns of amphiphilic α-helical potential. Although such a pattern, if it did exist, might be distorted by folding processes that occur subsequent to the initial condensation, the pattern might emerge if a sufficient number of structures were analyzed. The documented existence of amphiphilic α-helical potential in non-helical regions of β-sheet proteins (Parker and Song, 1990) substantiates such a possibility.
To test this idea further, the shortest distance connecting regions of amphiphilic α-helical potential was determined for a basis set of 23 all-β-sheet structures, as described in Methods. The shortest connecting distance was strongly correlated with the number of regions with amphiphilic α-helical potential (r = 0.953) (Figure 1). To determine whether this degree of correlation is indeed greater than that expected by chance, it was compared with the correlation found using structures with randomly assigned location of regions with amphiphilic α-helical potential. In 10 000 trials, linear correlations between shortest connecting distance and number of randomly assigned regions had an average r value of 0.895 (range 0.746–0.974) and were less than that observed for the naturally occurring regions in 99.44% of the 10 000 trials. This strongly suggests (p = 0.0056) that the correlation between the shortest connecting distance and number of regions is stronger than might be expected by random chance.
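The comparison above amounts to an empirical p-value: the fraction of control sets whose correlation is at least as strong as the observed one. The sketch below shows that calculation under stated assumptions. The function names are hypothetical, and counting ties as "at least as strong" is a choice the text does not specify.

```python
import math

def pearson_r(xs, ys):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def empirical_p(observed_r, control_rs):
    """Fraction of control sets with r >= the observed r (ties counted)."""
    return sum(1 for r in control_rs if r >= observed_r) / len(control_rs)
```

With 10 000 control sets, an observed correlation exceeded or matched by 56 of them gives 56/10 000 = 0.0056, matching the 99.44% figure reported above.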
Although differences in the organization of naturally occurring and randomly assigned regions of amphiphilic α-helical potential might be associated with some process such as protein folding, these differences might also reflect some trivial property of naturally occurring regions of amphiphilic potential. Specifically, the relatively consistent average interval between regions (13.03 ± 1.26 Å) might reflect the tendency of these regions to occur within certain types of native secondary structure that are regularly spaced within a protein because of constraints imposed by the native fold. However, as addressed below, no such constraints with respect to the native fold are observed and thus no trivial explanation is evident at present.
Relationship between native structure and regions with amphiphilic α-helical potential
One possible explanation for the presence of amphiphilic α-helical potential in β-sheet proteins is that, rather than playing a role in the folding of the protein, the potential plays an as yet undetermined role in maintaining the native structure of β-sheet proteins or is associated in some other way with particular native structure. To provide some insight into this possibility, the conservation of regions with amphiphilic α-helical potential in proteins that have the same fold but which are in different superfamilies as defined by SCOP (Murzin et al., 1995) was evaluated. As shown in Figure 2, the location of regions with amphiphilic α-helical potential was not conserved among various superfamilies of the seven-strand Greek key (immunoglobulin-like) fold. Further, regions with amphiphilic α-helical potential were not particularly associated with strands, turns or aperiodic structure, but were distributed among all three types of secondary structure. Similar results were obtained for all other folds evaluated (OB-fold, nine-strand Greek key and β-trefoil). Thus, the existence of multiple regions with amphiphilic α-helical potential is conserved, but their locations are not conserved in the native structure, even within non-homologous representatives of a given fold. This strongly suggests that these regions are not involved in the maintenance of native structure, but rather are involved in another process. If indeed these non-conserved regions are involved in folding, these findings might explain the experimental finding that non-homologous proteins with the same fold can exhibit dramatically different folding behavior (Burns et al., 1998).
Intervals separating amphiphilic α-helices in α-helical bundles
Given that regions with amphiphilic α-helical potential in β-sheet proteins are not associated with any particular secondary or tertiary structure, the average separation of about 13 Å between these regions (along the shortest line connecting all regions) is of interest. It might be asked whether this distance is appropriate for the formation of helical bundles. To address this question, the average distance between amphiphilic α-helices (along the shortest line connecting all amphiphilic α-helices) in five structurally distinct proteins that contain native α-helical bundles was evaluated. The methods used were the same as those used for calculating the average distance separating regions with amphiphilic α-helical potential along the shortest line connecting all regions in β-sheet proteins (methods described above). In the five structures analyzed (gln1, agr1, abk2, rgp1 and bvp1), the average distance separating amphiphilic α-helices along the shortest line connecting all of the helices was found to be 13.3 ± 1.5 Å (mean ± S.D.) and ranged from 11.5 Å (gln1, five amphiphilic α-helices) to 15.5 Å (bvp1, eight amphiphilic α-helices). Although only a limited number of α-helical proteins were evaluated, these findings do show that the average distance separating regions with amphiphilic α-helical potential in β-sheet proteins is similar to the distances associated with formation of native α-helical bundles.
Conclusions
Lim originally postulated that there were at least four reasons why β-sheet proteins might use amphiphilic α-helices to initiate folding. First, in the denatured state, the α-helix is easier to form than β-sheets because it relies on shorter range (i, i + 3, i + 4) interactions than do β-sheets. Second, the amphiphilic α-helix has both hydrophobic and hydrophilic domains and is therefore ideal for selective interhelical interactions. Third, an amphiphilic helix of average length (10–15 residues) approximates a sphere in solution, making it ideal for rapid diffusion. Fourth, the time required for formation of α-helix is short compared with that of β-sheets. Not only might amphiphilic α-helices be ideal as initiators of protein folding, but also amphiphilic β-sheets may be unacceptable as an alternative. Evidence suggests that amphiphilic sequences with periodicity consistent with β-strands (periodicity of 2.0) tend to form amyloid-like structures (West et al., 1999) and are avoided in nature (Broome and Hecht, 2000).
The results presented here indicate that β-sheet proteins contain specifically ordered arrangements of regions with amphiphilic α-helical potential. The overall organization of these regions has been described previously as elliptical in shape (Parker and Stezowski, 1996). It is hypothesized that these arrays of regions with amphiphilic α-helical potential represent `dormant' α-helical domains within β-sheet proteins. Consistent with this idea, regions with amphiphilic α-helical potential in β-sheet proteins can apparently assume an α-helical structure, at least in some amphiphilic environments (Jirgensons, 1982; Parker and Song, 1992).
Whether dormant α-helical domains are important in the folding process remains unproven. Current experimental evidence regarding the folding of β-sheet proteins tends to point toward α-helical intermediates only in some cases. However, if intermediates such as helical bundles are less stable than subsequent intermediates, their breakdown could be very fast, leaving us blind to their existence since current techniques for finding rapid intermediates (principally NMR) are limited to detecting structures also present in the native structure. Consistent with this idea, it is anticipated that these dormant domains are unable to form stable helical structures, since the formation of stable helical structures would result in `traps', hindering the folding process.
The findings presented here support the hypothesis that dormant domains within β-sheet proteins exist and that they participate in processes other than maintenance of native structure. Further, these findings suggest that β-sheet proteins may exist as `α-helical-like' proteins before taking on the native character. If this is correct, the avenues leading to native structure may be similar for all proteins, an idea that may aid greatly in our future study of protein folding.
References
Broome,B.M. and Hecht,M.H. (2000) J. Mol. Biol., 296, 961–968.
Burns,L.L., Dalessio,P.M. and Ropson,I.J. (1998) Proteins, 33, 107–118.
Eisenberg,D., Weiss,R.M. and Terwilliger,T.C. (1982) Nature, 299, 371–374.
Eisenberg,D., Schwarz,E., Komaromy,M. and Wall,R. (1984) J. Mol. Biol., 179, 125–142.
Hamada,D. and Goto,Y. (1997) J. Mol. Biol., 269, 479–487.
Holm,L. and Sander,C. (1994) Nucleic Acids Res., 22, 3600–3609.
Jirgensons,B. (1982) J. Protein Chem., 1, 71–84.
Kabsch,W. and Sander,C. (1983) Biopolymers, 22, 2577–2637.
Lim,V.I. (1978) FEBS Lett., 89, 10–14.
Murzin,A.G., Brenner,S.E., Hubbard,T. and Chothia,C. (1995) J. Mol. Biol., 247, 536–540.
Parker,W. and Song,P.S. (1990) J. Biol. Chem., 265, 17568–17575.
Parker,W. and Song,P.S. (1992) Biophys. J., 61, 1435–1439.
Parker,W. and Stezowski,J.J. (1996) Proteins, 25, 253–260.
Richardson,D.C. and Richardson,J.S. (1992) Protein Sci., 1, 3–9.
West,M.W., Wang,W., Patterson,J., Mancias,J.D., Beasley,J.R. and Hecht,M.H. (1999) Proc. Natl Acad. Sci. USA, 96, 11211–11216.
Received October 24, 2000; revised February 27, 2001; accepted March 5, 2001.