Identification of amino acids involved in protein structural uniqueness: implication for de novo protein design

Yasuhiro Isogai1,2, Motonori Ota3, Anna Ishii4,5, Manabu Ishida1,6 and Ken Nishikawa3

1 The Institute of Physical and Chemical Research (RIKEN), 2-1 Hirosawa, Wako, Saitama 351-0198, 3 National Institute of Genetics, Mishima, Shizuoka 411-8540, 4 Department of Physics, Faculty of Science, Gakushuin University, Toshima-ku, Tokyo 170-0031 and 6 Department of Life Sciences, Faculty of Science, Himeji Institute of Technology, Ako-gun, Hyogo 678-1297, Japan


    Abstract
Structural uniqueness is characteristic of native proteins and is essential for the expression of their biological functions. The major factors that bring about this uniqueness are specific interactions between hydrophobic residues and their unique packing in the protein core. To find the origin of the uniqueness in amino acid sequences, we analyzed the distribution of the side chain rotational isomers (rotamers) of hydrophobic amino acids in protein tertiary structures and derived ΔS_contact, the conformational-entropy change of a side chain caused by residue–residue contacts in each secondary structure. The ΔS_contact values indicate distinct tendencies of the residue pairs to restrict side chain conformation by inter-residue contacts. Of the hydrophobic residues in α-helices, aliphatic residues (Leu, Val, Ile) strongly restrict the side chain conformations of each other. In β-sheets, Met is most strongly restricted by contact with Ile, whereas Leu, Val and Ile are less affected by contacting residues than they are in α-helices. In designed and native protein variants, ΔS_contact was found to correlate with the folding–unfolding cooperativity. Thus, it can be used as a specificity parameter for designing artificial proteins with a unique structure.

Keywords: conformational entropy/folding cooperativity/hydrophobic core/rotamer/side chain packing


    Introduction
Two aspects of the thermodynamics of protein folding are essential to elucidate the principles of protein architecture: stability and structural specificity. The latter has been highlighted by recent progress in de novo protein design, which attempts to create artificial amino acid sequences that fold into a given three-dimensional (3D) structure independently of the native protein sequences (Bryson et al., 1995; Desjarlais and Handel, 1995; Cordes et al., 1996). Many designed proteins retain the targeted overall topology with significant secondary structure and high stability, but they have low structural specificity, i.e. a less unique structure without fixed side chain conformations (Hecht et al., 1990; Handel et al., 1993; Choma et al., 1994; Tanaka et al., 1994; Gibney et al., 1997; Isogai et al., 1999). Specificity and stability are clearly distinguishable by comparing the conformation dependence of free energy between native and designed proteins, as shown in Figure 1. The depth of the energy well is indicative of stability whereas its narrowness is indicative of specificity. Native proteins typically have a sharp single valley with a relatively small depth (Figure 1A), i.e. unique and less stable structures. In the energy profile characteristic of designed proteins (Figure 1B), however, the deep multiple valleys and the large width indicate low structural specificity with high stability. It has become a common understanding that a unique structure is more challenging to design than a stable structure.



Fig. 1. Schematic energy profiles characteristic of native (A) and artificial (B) proteins. The wavy upper plateaus indicate the energy levels of the unfolded states. The sharp single valley in (A) gives the unique structure of a native protein, and the broad multiple valleys in (B) give the multi-conformational folded states of an artificial protein.

 
A large number of studies have focused on protein stability and its relationship with amino acid sequences. However, the relationships of structural specificity with stability and with the sequences remain almost unexplored. An unfolded protein has a vast number of accessible conformations, particularly in the side chains of its residues. When a native protein folds, the side chains in the hydrophobic core are generally restricted to a single conformation. Entropy is related to the number of accessible conformations, and ΔS_folding, the conformational entropy change upon folding, measures the contribution of the restriction of side chain conformations to the thermodynamics of protein folding (Doig and Sternberg, 1995). Thus, ΔS_folding has been used for assessing the structural uniqueness of natural and artificial proteins (Furukawa et al., 1996; Jiang et al., 2000).

The unique packing of side chains in the protein core is essential to attain the overall structural uniqueness of proteins and requires the restriction of side chain conformations by interactions among core amino acid residues. The conformations of side chains relative to the main chains are expressed by a pair of side chain dihedral angles, χ1 and χ2, for the Cα–Cβ and Cβ–Cγ bonds, respectively (Dunbrack and Karplus, 1993). There are three typical values (60°, 180°, 300°) for χ1 and χ2, as expected from the sp3-hybridized orbitals of the carbon atoms (Ponder and Richards, 1987). Thus, the number of rotamers, i.e. typical rotational isomers of each amino acid residue, is nine at maximum and the rotamer library consists of a total of 112 templates for the 20 amino acids (Dunbrack and Karplus, 1993). The conformers of most amino acid residues in folded proteins, however, are limited to a smaller number that depends on the secondary structure. These backbone-dependent rotamer preferences have been explained by interactions between side chains and main chains, and have been used for the prediction of side chain conformations from main chain structures (Dunbrack and Karplus, 1993; Tanimura et al., 1994; Lasters et al., 1995). On the other hand, side chain–side chain interactions should also contribute to the conformational restriction of residues and should depend on the secondary structure. In the present study, we investigated the rotamer distribution of seven hydrophobic amino acids (Leu, Ile, Val, Met, Phe, Tyr, Trp) in protein 3D structures and analyzed the effects of inter-residue contacts on the rotamer distribution in each secondary structure. The results reveal the amino acid residues involved in protein structural uniqueness and give a specificity parameter for designing artificial proteins with a unique structure.


    Materials and methods
Rotamer library

Six hundred and eighty-three protein structures determined at <2.5 Å resolution, whose sequences were mutually dissimilar (<30% identity), were chosen from the Protein Data Bank (PDB) (Berman et al., 2000; Ota et al., 2001). Side chain conformations of Leu, Ile, Val, Met, Phe, Tyr and Trp at interior sites buried at hydration class 6 or above (Ota and Nishikawa, 1997) in these structures, and their residue–residue contacts, were analyzed as follows. Each of the seven hydrophobic amino acids was classified into at most nine rotamers based on a pair of dihedral angles, χ1 and χ2, according to Dunbrack and Karplus (1993). For example, leucine and valine were classified into rotamers L1–L9 and V1–V3, respectively. Secondary structures were classified into α-helix, β-sheet and others (coil), according to Kabsch and Sander (1983). Two amino acid residues in a protein tertiary structure were defined to be in contact, or to interact, with each other when the minimum distance between their non-hydrogen side chain atoms was <5 Å. Residues adjacent to each other in the amino acid sequence (i ± 1) were excluded from the contacting residue pairs. The numbers of rotamers in contact or in no contact with a certain amino acid residue were counted separately for each secondary structure in the 683 proteins and were used for further analyses.
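The classification and contact rules above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' program. The rotamer numbering (three 120°-wide bins per dihedral, χ1 varying slowest) is an assumption inferred from the angle ranges quoted for I8, I9, M5 and M8 in Results and discussion; the V1–V3 numbering by χ1 bin, the function names and the coordinate format are likewise hypothetical.

```python
import math

def chi_bin(angle_deg):
    """Map a dihedral angle to one of three 120-degree-wide bins (0, 1, 2)."""
    # min() guards against a float rounding up to exactly 360.0
    return min(2, int((angle_deg % 360.0) // 120.0))

def rotamer_index(chi1_deg, chi2_deg=None):
    """Rotamer number: 1-9 for (chi1, chi2), 1-3 when only chi1 exists (e.g. Val).

    Assumed numbering: index = 3 * bin(chi1) + bin(chi2) + 1.
    """
    if chi2_deg is None:
        return chi_bin(chi1_deg) + 1
    return 3 * chi_bin(chi1_deg) + chi_bin(chi2_deg) + 1

def in_contact(atoms_a, atoms_b, seq_a, seq_b, cutoff=5.0):
    """True if any pair of non-hydrogen side chain atoms is closer than cutoff (Å).

    atoms_a, atoms_b: lists of (x, y, z) tuples for heavy side chain atoms.
    seq_a, seq_b: residue positions; sequence neighbours (i ± 1) are excluded.
    """
    if abs(seq_a - seq_b) <= 1:
        return False
    return any(math.dist(p, q) < cutoff for p in atoms_a for q in atoms_b)

# Example: an Ile side chain with chi1 = 300°, chi2 = 180° falls in rotamer 8.
print(rotamer_index(300.0, 180.0))
```

Counting how often each rotamer index occurs, with and without a given contact partner, yields the per-secondary-structure tallies used in the subsequent analyses.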

Conformational entropy of side chains

Side chain conformational entropy (S_conf) is expressed as follows:

$$S_{\mathrm{conf}} = R \ln W \qquad (1)$$

where R is the gas constant and W is the number of accessible rotamers in the structural state. Thus, the change in S_conf upon folding is expressed as ΔS_folding = R ln(W_f/W_u), where W_f and W_u are the numbers of rotamers in the folded and unfolded states, respectively. For residues in the interior of a native protein, W_f is often assumed to be one, i.e. ΔS_folding = –R ln W_u, because these side chains are restricted to almost a single rotamer (Doig and Sternberg, 1995). In the present study, however, it is not reasonable to assume W_f to be one, because the calculation aims to estimate the potential contribution of each amino acid to structural uniqueness. W_f is smaller than W_u but larger than one in the mixed conformational ‘gemisch’ states of artificial proteins with no structural uniqueness (Dill et al., 1995). Here, we estimated W_f as the number of allowable rotamers weighted by the probability with which each rotamer is populated in each secondary structure of the folded state, as follows:


$$W_{f} = \exp\Bigl(-\sum_{i} p_{i} \ln p_{i}\Bigr) \qquad (2)$$

where p_i is the fractional population of rotamer state i in the folded state. W_f is one when the side chain conformations are restricted to a single rotamer in the structure, whereas W_f equals the total rotamer number when all rotamers are present with equal probability. In general, W_f ranges from one to the total rotamer number according to the values of p_i and can be regarded as the effective number of rotamers in the folded state. On the other hand, W_u is difficult to measure directly. ΔS_folding can therefore be calculated only by assuming that all rotamers are equally populated in the unfolded state, so that W_u is the total number of possible rotamers as defined (Doig and Sternberg, 1995). However, ΔS_contact as defined here can be obtained without this uncertain assumption (see below).
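As a hedged illustration of the effective rotamer number, the sketch below assumes the Shannon-entropy form W_f = exp(–Σ p_i ln p_i), which reproduces both limiting cases described above (W_f = 1 for a single rotamer, W_f = total rotamer number for equal populations). The function names are hypothetical, and the input may be raw rotamer counts or fractions.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1

def effective_rotamer_number(populations):
    """W_f = exp(-sum p_i ln p_i); rotamers with zero population contribute nothing."""
    total = sum(populations)
    probs = [p / total for p in populations if p > 0]
    return math.exp(-sum(p * math.log(p) for p in probs))

def conformational_entropy(populations):
    """S_conf = R ln W_f, in J mol^-1 K^-1 (Equation 1 with W = W_f)."""
    return R * math.log(effective_rotamer_number(populations))

# Limiting cases: a single populated rotamer versus nine equally populated ones.
print(effective_rotamer_number([1, 0, 0]))   # one rotamer only
print(effective_rotamer_number([1] * 9))     # nine equally likely rotamers
```

With this form, R ln W_f equals –R Σ p_i ln p_i, so each rotamer contributes an additive term –R p_i ln p_i to S_conf.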


    Results and discussion
Effects of inter-residue contacts on conformational entropy

The effects of side chain–side chain contacts on rotamer distribution were estimated with ΔS^q_contact(a, b), the entropy change of residue a in a secondary structure q (q is ‘α-helix’, ‘β-sheet’ or ‘coil’) caused by contact with residue b in the folded state. A given residue a in a given structural environment may contact several residues, e.g. b, c and d. If a contacts at least one residue of type b, a is defined to be in contact with b. Otherwise, a is defined to be in no contact with b. Thus, the number of rotamers a_i (rotamer state i of residue a) in no contact with b equals the total number of a_i minus the number of a_i in contact with b. For every contact counterpart b, all rotamers were categorized as either in contact or in no contact. ΔS^q_contact(a, b) is then defined by


$$\Delta S^{q}_{\mathrm{contact}}(a, b) = S^{q}_{c}(a, b) - S^{q}_{nc}(a, b) \qquad (3)$$

where S^q_c(a, b) and S^q_nc(a, b) are the side chain conformational entropies (S_conf) of residue a in a secondary structure q in contact and in no contact with residue b, respectively. S^q_c(a, b) is not the entropy of residue a in contact with residue b alone, but that of residue a in contact with a set of residues that includes b; S^q_nc(a, b) is the entropy of residue a in contact with surrounding residues other than b. Both were estimated from the rotamer distribution of residue a in contact or in no contact with residue b [see Equations (1) and (2) in Materials and methods].

To correct for statistical errors due to small data sets of rotamer numbers, p_i^c and p_i^nc, the fractional populations of rotamer state i in contact and in no contact with a certain residue, respectively, in the folded state were transformed to p_i^c′ and p_i^nc′ as follows:

$$p_i^{c\prime} = \frac{p_i^{a}}{1 + N_c\sigma} + \frac{p_i^{c}\,N_c\sigma}{1 + N_c\sigma}, \qquad p_i^{nc\prime} = \frac{p_i^{a}}{1 + N_{nc}\sigma} + \frac{p_i^{nc}\,N_{nc}\sigma}{1 + N_{nc}\sigma}$$

where p_i^a is the fractional population of rotamer state i regardless of contact; N_c and N_nc are the observed numbers of rotamers in contact and in no contact, respectively; and the correction factor σ = 0.01 (Sippl, 1990). We calculated W_f^c and W_f^nc, the effective numbers of rotamers in contact and in no contact, by Equation (2) with p_i^c′ and p_i^nc′, respectively. S^q_c and S^q_nc in Equation (3) were calculated with these W_f^c and W_f^nc, respectively, to obtain ΔS^q_contact.
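The sparse-data correction and the resulting contact entropy can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: it assumes the entropy is evaluated as –R Σ p ln p (i.e. R ln W_f with the Shannon-type effective rotamer number), and the function names and the toy counts in the example are hypothetical.

```python
import math

R = 8.314     # gas constant, J mol^-1 K^-1
SIGMA = 0.01  # correction factor quoted in the text (Sippl, 1990)

def corrected_populations(counts_subset, counts_all, sigma=SIGMA):
    """Blend the subset frequencies with the overall frequencies.

    p' = p_all / (1 + N*sigma) + p_subset * N*sigma / (1 + N*sigma),
    where N is the number of observations in the subset.
    """
    n = sum(counts_subset)
    n_all = sum(counts_all)
    w = n * sigma
    corrected = []
    for c, c_all in zip(counts_subset, counts_all):
        p_all = c_all / n_all
        p_sub = c / n if n else 0.0
        corrected.append(p_all / (1 + w) + p_sub * w / (1 + w))
    return corrected

def entropy(probs):
    """S_conf = -R * sum p ln p (assumed Shannon form of R ln W_f)."""
    return -R * sum(p * math.log(p) for p in probs if p > 0)

def delta_s_contact(counts_contact, counts_total):
    """Entropy of residue a in contact with b minus that in no contact with b."""
    counts_nc = [t - c for t, c in zip(counts_total, counts_contact)]
    p_c = corrected_populations(counts_contact, counts_total)
    p_nc = corrected_populations(counts_nc, counts_total)
    return entropy(p_c) - entropy(p_nc)

# Toy two-rotamer example (made-up counts): contact skews the distribution,
# so the result is negative.
print(delta_s_contact([90, 10], [200, 200]))
```

For small subsets (N·σ much less than 1) the corrected populations stay close to the contact-independent ones, which pulls ΔS_contact towards zero for poorly sampled pairs.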

The ΔS^q_contact(a, b) values are plotted against the hydrophobic residues, Ala (A), Leu (L), Ile (I), Val (V), Met (M), Phe (F), Tyr (Y) and Trp (W), in Figure 2. Each value indicates the degree of conformational restriction of residue a by contact with residue b, i.e. how much residue b affects the side chain conformation of a contacting residue a. The 10 residue pairs with the best (most negative) ΔS_contact values are listed in Table I. The sets of ΔS_contact values for the residue pairs clearly differ among α-helix, β-sheet and coil, and are characteristic of each secondary structure (Figure 2 and Table I). Of the hydrophobic residues in α-helices, aliphatic residues (Leu, Val, Ile, Met) strongly restrict the side chain conformations of each other. In particular, Ile is most restricted by contact with other Ile (ΔS^α_contact = –1.33 J mol⁻¹ K⁻¹), Met (–1.30 J mol⁻¹ K⁻¹) and Val (–0.83 J mol⁻¹ K⁻¹). The effects of the rotamer-distribution changes on ΔS_contact for the Ile–Ile pair in α-helices are shown in Figure 3A. Ile rotamers I8 and I9, which have the side chain conformations 240° < χ1 < 360°, 120° < χ2 < 240° and 240° < χ1 < 360°, 240° < χ2 < 360°, respectively, are dominant in α-helices and increase from 83.8 to 89.2% on contact with Ile. This can be partly explained by the fact that only certain conformations of β-branched side chains are compatible with the α-helical main chain conformation. In β-sheets, Met is most strongly restricted by contact with Ile (ΔS^β_contact = –1.75 J mol⁻¹ K⁻¹) (Figure 3B) and Leu (–0.70 J mol⁻¹ K⁻¹), whereas Leu, Val and Ile are less affected by contacting residues than they are in α-helices. Met rotamers M5 and M8, which have the side chain conformations 120° < χ1 < 240°, 120° < χ2 < 240° and 240° < χ1 < 360°, 120° < χ2 < 240°, respectively, are dominant in β-sheets and increase from 58.0 to 71.5% on contact with Ile. In coils, conformational restrictions by inter-residue contacts are smaller than those in the other secondary structures (Figure 2C and Table I), possibly because of the variety of main chain conformations.



Fig. 2. Effects of inter-residue contacts on the restriction of side chain conformation at interior sites in α-helix (A), β-sheet (B) and coil (C). ΔS_contact(a, b) is shown as a density map, with residue a on the vertical axis (‘Amino acid L, V, I, ...’) and its contact counterpart b on the horizontal axis (‘Contact counterpart A, L, V, ...’).

 

Table I. Hydrophobic residue pairs involved in structural uniqueness
 


Fig. 3. Conformational entropy S_conf of Ile in α-helix in contact and in no contact with Ile (A) and of Met in β-sheet in contact and in no contact with Ile (B). The stacked bars indicate the contribution of each rotamer to the S_conf values. The contributions were calculated from the rotamer distribution using Equations (1) and (2) in Materials and methods, where the effective rotamer number W_f was obtained for each single rotamer. The S_conf values were corrected for errors due to small data sets of observed rotamer numbers (see text).

 
To examine the reliability of the ΔS^q_contact values, we analyzed the statistical dependence of the rotamer distributions on residue–residue contact. Under the null hypothesis that the rotamer distribution is independent of contact, the probability that the observed difference arises by chance was estimated for each residue pair with the χ² test. This probability, P_χ², ranges between 0 and 1, and smaller P_χ² values indicate residue pairs whose rotamer distributions are more strongly affected by contact. Of the 10 residue pairs with the lowest ΔS^q_contact values in α-helices, the P_χ² values for II, IV, VL and LV are <0.05 (Table I), indicating that the rotamer distributions of Ile, Val and Leu in α-helices depend significantly on contact with the counterpart residues. Of the 10 pairs with the lowest ΔS^q_contact values in β-sheets, the P_χ² values for MI and FA are <0.01, indicating that the rotamer distributions of Met and Phe in β-sheets depend strongly on contact with Ile and Ala, respectively. None of the 10 residue pairs in coils has a P_χ² value <0.1. Thus, the ΔS^q_contact data for the residue pairs II, IV, VL and LV in α-helices and MI and FA in β-sheets are statistically reliable, and these residue pairs are indeed involved in structural uniqueness. The ΔS^q_contact values with P_χ² <0.05 are listed in Table II, which shows residue pairs whose rotamer distribution is significantly affected by inter-residue contact. Positive values of ΔS^q_contact indicate that the rotamer distribution is made more uniform by the inter-residue contact, and amino acid sequences that form residue pairs with positive ΔS^q_contact should be avoided in protein design.
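One way to carry out such a test is sketched below. The article does not spell out the test layout, so this assumes a 2 × k contingency table (contact versus no contact, k rotamer states) and Pearson's χ² statistic with k – 1 degrees of freedom. The function names are hypothetical, and the upper-tail probability is computed with a standard series for the regularized incomplete gamma function so that no external library is needed.

```python
import math

def chi2_sf(x, dof):
    """Upper-tail probability of the chi-square distribution.

    Uses the series for the regularized lower incomplete gamma function P(a, x/2).
    """
    if x <= 0:
        return 1.0
    a, half = dof / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while term > 1e-15 * total:
        n += 1
        term *= half / (a + n)
        total += term
    lower = total * math.exp(-half + a * math.log(half) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def contact_independence_p(counts_contact, counts_no_contact):
    """P value for the hypothesis that rotamer distribution is independent of contact."""
    # Drop rotamer states that were never observed in either group.
    cols = [(c, n) for c, n in zip(counts_contact, counts_no_contact) if c + n > 0]
    n_c = sum(c for c, _ in cols)
    n_nc = sum(n for _, n in cols)
    if n_c == 0 or n_nc == 0 or len(cols) < 2:
        return 1.0  # nothing to compare
    grand = n_c + n_nc
    stat = 0.0
    for c, n in cols:
        for observed, row_total in ((c, n_c), (n, n_nc)):
            expected = row_total * (c + n) / grand
            stat += (observed - expected) ** 2 / expected
    return chi2_sf(stat, len(cols) - 1)

# Made-up counts: identical proportions give P = 1, skewed ones give a small P.
print(contact_independence_p([50, 50], [100, 100]))
print(contact_independence_p([90, 10], [110, 190]))
```

A pair would then be retained for a table such as Table II only when the returned value falls below the chosen threshold (0.05 in the text).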


Table II. ΔS^q_contact of hydrophobic residue pairs with P_χ² <0.05
 
We assume that steric interactions between residues are mainly responsible for the entropy decreases. The β-branched groups of Ile or Val at a position i in an α-helix easily bump into neighboring residues at positions i ± 3 or i ± 4 on the helix (Nakamura et al., 1996). On the other hand, the long linear side chain of Met in a β-strand interacts strongly with the neighboring residues on the adjacent strand, whose Cα–Cβ vectors point in directions parallel to each other, so that the side chains can contact at several positions. This is in contrast to residues contacting each other in an α-helix, whose Cα–Cβ vectors project in directions that differ by typically 40° or 60°. Thus, the contact in an α-helix is at a point near the Cβ atom whereas the contact in a β-sheet is along a line following the side chain. As in these cases, the differences in the entropy changes between the secondary structures are due to differences in the spatial relationship of the side chains in contact.

Assessment of ΔS_contact as a specificity parameter

To examine the validity of ΔS_contact as a parameter for structural specificity, ΔΔS_XY, the total change in ΔS_contact upon substitution of residue X with residue Y, was estimated for a series of artificial four-helix bundles (Gibney et al., 1999; Skalicky et al., 1999) and also for a set of single alanine substitution mutants of the natural Arc repressor (Milla et al., 1994). ΔΔS_XY is expressed as:


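$$\Delta\Delta S_{XY} = \sum_{a \in \{a\}_Y} \left[\Delta S^{q}_{\mathrm{contact}}(a, Y) + \Delta S^{q}_{\mathrm{contact}}(Y, a)\right] - \sum_{a \in \{a\}_X} \left[\Delta S^{q}_{\mathrm{contact}}(a, X) + \Delta S^{q}_{\mathrm{contact}}(X, a)\right] \qquad (4)$$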

where {a}_Y and {a}_X are the sets of all hydrophobic residues in contact with residues Y and X, respectively. The calculation was performed based on the 3D coordinates of the native and designed proteins. These designed and natural protein variants showed a variety of structural specificities as well as stabilities. The ΔΔS_XY values between the prototype (wild-type) and the variants are plotted in Figure 4 against the differences in m values obtained from denaturation experiments with guanidine hydrochloride. The parameter m measures the cooperativity of the folding–unfolding transition, i.e. the structural specificity of a protein, and correlates with the signal quality (dispersion and resolution) of NMR spectra (Murphy et al., 1992; Gibney et al., 1999). The ΔΔS_XY values correlate with the Δm values and scatter around the expected line, which decreases along the x-axis through the coordinate origin (0, 0) corresponding to the prototype four-helix bundle (LLL) or wild-type Arc. The correlation coefficients for the four-helix bundle variants were –0.63 (single mutations), –0.65 (single and double mutations) and –0.53 (all data of single, double and triple mutations). For the single mutants of the Arc repressor, the coefficient was –0.11, weaker than those for the four-helix bundles. For the combined data of the four-helix bundles and Arc mutants, the correlation is clearer, with a total correlation coefficient of –0.63. Although a linear inverse correlation between Δm and ΔΔS_XY has not been theoretically proved, the correlation observed here suggests that ΔS_contact, like m, can serve as a measure of folding cooperativity.
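The substitution score can be illustrated with a short Python sketch. It assumes that ΔΔS_XY is the per-site total for the new residue Y minus that for the old residue X, each total being the sum of ΔS_contact(a, ·) + ΔS_contact(·, a) over the hydrophobic neighbours a, in line with the total defined in the caption of Figure 5. The lookup table below holds only the three α-helix values for Ile quoted in the text; every missing pair is treated as 0.0 purely for illustration, and the function names and the example neighbour list are hypothetical. A fuller version would also key the table on the secondary structure q.

```python
# (a, b) -> ΔS_contact(a, b) in α-helices, J mol^-1 K^-1.
# Only these three entries are quoted in the article text; all other pairs
# default to 0.0 below, which is a placeholder and NOT a published value.
HELIX_TABLE = {
    ("I", "I"): -1.33,
    ("I", "M"): -1.30,
    ("I", "V"): -0.83,
}

def site_total(table, residue, neighbours):
    """Sum ΔS_contact(a, X) + ΔS_contact(X, a) over hydrophobic neighbours a of X."""
    return sum(
        table.get((a, residue), 0.0) + table.get((residue, a), 0.0)
        for a in neighbours
    )

def delta_delta_s(table, old, new, neighbours_old, neighbours_new=None):
    """ΔΔS_XY for replacing residue `old` (X) with residue `new` (Y) at one site."""
    if neighbours_new is None:
        # Assume the contact partners are unchanged by the substitution.
        neighbours_new = neighbours_old
    return site_total(table, new, neighbours_new) - site_total(table, old, neighbours_old)

# Hypothetical site: a Leu contacting one Ile and one Val is replaced with Ile.
print(delta_delta_s(HELIX_TABLE, "L", "I", ["I", "V"]))
```

For variants with several substitutions, the per-site values would simply be added, as described for the double and triple mutants in Figure 4.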



Fig. 4. Plot of total ΔS^q_contact calculated from the values in Figure 2 against experimental folding cooperativity m for designed four-helix bundles (Gibney et al., 1999) and mutants of the Arc repressor (Milla et al., 1994). The data are expressed as ΔΔS_XY and Δm, the differences of ΔS^q_contact and m, respectively, from those of the prototype four-helix bundle LLL or wild-type Arc according to Equation (4) (see text). The four-helix bundle variants and Arc repressor mutants are indicated by their codes, such as LIL and V25A, respectively, near the data points. The four-helix bundles are [(Ac-CGGGEXWKL·HEEXLKK·FEEXLKL·HEERLKK·L-CONH2)2]2, where X is either L, I, V or F, and are coded with the three amino acids at positions 6, 13 and 20. For the four-helix bundles with double and triple substitutions, ΔΔS_XY for each substitution is simply added up. The ΔΔS_XY values of the four-helix bundles and of the Arc mutants were calculated based on a solution structure of IFL, one of the native-like variants (Skalicky et al., 1999), and on the crystal structure of an Arc repressor mutant (1MYK) (Schildbach et al., 1995), respectively. (α), (β) and (c) in the notations of the Arc mutants indicate α-helix, β-sheet and coil, respectively, in which the substituted amino acids are positioned. The data of four-helix bundle variants with single (○), double (•) and triple (△) substitutions on the prototype LLL are shown. ▲, single mutants of the Arc repressor. One of the helix bundle data points, FFL, is excluded because it is the only variant that unfolds in a more cooperative manner than the prototype LLL but exhibits broader and less resolved NMR signals (Gibney et al., 1999). The 3D models of proteins were constructed and their inter-residue contacts were measured using InsightII98 (Accelrys) on an SGI Octane workstation.

 
The ΔΔS_XY values in Figure 4 were calculated based on the structural coordinates and the definition of residue–residue contacts described above, and thus were obtained by counting the interactions within a secondary structure and also those between secondary structures (helix–helix, helix–sheet or sheet–sheet). When only the interactions within each helix were considered, i.e. those between residues at positions i and i ± 3 or i ± 4, the plot of ΔΔS_XY against Δm for the four-helix bundle variants showed a weaker correlation (correlation coefficient, –0.11) than that of Figure 4. This indicates that the interactions across the interface between secondary structures, as well as those within local structures, are important for realizing structural uniqueness.

Assuming that all possible rotamers of each residue are equally populated in the unfolded state, the total conformational-entropy change upon folding (ΔS_folding) was calculated for the four-helix bundle and Arc variants without considering inter-residue contacts (data not shown). The expected correlation of ΔS_folding with m was not observed; the correlation coefficient was positive (0.13) for the combined data of the four-helix bundles and Arc mutants. This may be partly because ΔS_folding depends strongly on the total rotamer number of each residue, i.e. residues with longer side chains tend to have more negative ΔS_folding, which makes it unsuitable for estimating the structural specificity of proteins in the folded state. It may also be due to the uncertainty in W_u, which is needed to calculate ΔS_folding (see Materials and methods).
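The dependence of ΔS_folding on the total rotamer number can be seen in a small numerical sketch. It uses ΔS_folding = R ln(W_f/W_u) from Materials and methods with W_u set to the total rotamer number (three for Val, nine for Leu, as in the rotamer classification), and assumes W_f = 1 for both residues purely for illustration. The function name is hypothetical.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1

def delta_s_folding(w_folded, w_unfolded):
    """ΔS_folding = R ln(W_f / W_u), in J mol^-1 K^-1."""
    return R * math.log(w_folded / w_unfolded)

# Both side chains are assumed fully fixed in the folded state (W_f = 1).
val = delta_s_folding(1, 3)  # Val: three rotamers, about -9.1
leu = delta_s_folding(1, 9)  # Leu: nine rotamers, about -18.3
print(round(val, 2), round(leu, 2))
```

Under these assumptions Leu scores twice as negative as Val simply because it has more rotamers to lose, regardless of how its neighbours restrict it, which is the bias noted above.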

Application to designed globins

We have proposed a new computational method for designing an entire amino acid sequence that can adopt a given tertiary structure of sizeable molecular mass by using a knowledge-based 3D–1D compatibility function (Ota et al., 1997; Isogai et al., 1999). Based on this method, an artificial sequence of 153 amino acids was designed to fit the main chain framework of sperm whale myoglobin (Mb). The synthesized artificial globin (DG1) was well folded and bound one heme per protein molecule as designed. DG1 exhibited much higher thermodynamic stability than natural apoMb but lacked structural uniqueness at the side chain level. Several Leu and Met residues in DG1 were then replaced with the β-branched amino acids Ile and Val to generate DG2–4 (Isogai et al., 2000). These residue replacements significantly affected both stability and structural specificity, although they resulted in no significant changes in compactness or α-helical content in the absence of denaturant. Among DG1–4, DG3, in which 11 Leu residues of DG1 were replaced with seven Ile and four Val residues and one Met residue was replaced with Val, displayed the lowest stability but the most cooperative folding–unfolding transition and the best NMR spectrum with the smallest linewidth. These results indicate that the replacement of Leu (Met) with Ile and Val at appropriate sites reduces the freedom of side chain conformation, and they are consistent with the present analyses.

By using the computational models of the designed globins (Isogai et al., 2000), the total ΔS_contact values for the 17 replaced sites of DG1, DG2, DG3 and DG4 were calculated to be –4.2, –8.8, –12.8 and –13.7 J mol⁻¹ K⁻¹, respectively (Figure 5). In the denaturation experiments, the DGs unfolded with two distinct transitions, i.e. F↔I↔U, and thus gave two m values, m1 and m2, for F↔I and I↔U, respectively. The parameter m1, rather than m2 or m1 + m2, was indicative of the overall structural specificity detected by NMR spectroscopy (Isogai et al., 2000), and the m1 values are compared with the total ΔS_contact values in Figure 5. The m1 value, or the quality of NMR signals, increases with the –ΔS_contact value for DG1–3, whereas the two are inconsistent for DG4. This may reflect that too many residue replacements change the main chain structure and/or have unexpected effects on structural properties, such as induction of non-specific interactions between protein molecules, which could not be detected by NMR measurement.



Fig. 5. Comparison of total ΔS_contact values of the substituted sites in designed globins with their measured m1 values. The total ΔS_contact was calculated as Σ_X Σ_{a ∈ {a}_X} [ΔS^q_contact(a, X) + ΔS^q_contact(X, a)], where X is a residue at a substituted site and {a}_X is the set of all hydrophobic residues in contact with residue X. The substituted sites were positions 28, 29, 39, 46, 68, 69, 76, 86, 108, 110, 112, 114, 115, 131, 137, 138 and 142, which were all in α-helices, of the 153 sites in each designed globin (Isogai et al., 2000). The calculation was performed based on the computational models of DG1–4, which were constructed by mounting the sequences on the backbone of the crystal structure of sperm whale Mb (1MBD) (Phillips, 1980) in computer graphics followed by the molecular-mechanics calculation of side chain conformations (Isogai et al., 2000).

 
In this study, we extracted the average effects of inter-residue contacts on side chain conformation in each secondary structure from analyses of many proteins. The conformational entropy changes obtained here cannot be applied directly to the folding reaction of a single molecule, because the side chain at each site is restricted not to the averaged conformation but to its own unique conformation. However, as shown in Figure 4, the parameter correlates well with the folding cooperativity of native and designed proteins as a whole, presumably because the averaging effects of each parameter emerge through Equation (4). In designing a protein, it is necessary to select amino acids at many sites, and several candidate proteins can be synthesized on a trial basis. Thus, the parameter is useful for designing proteins in combination with other parameters such as a stability-evaluating function.


    Notes
 
5 Present address: Mitsubishi Research Institute, Inc., Chiyoda-ku, Tokyo 100-8141, Japan

2 To whom correspondence should be addressed. E-mail: yisogai@postman.riken.go.jp


    Acknowledgments
 
We are grateful to Professor Brian R.Gibney, Columbia University, for sending us structural coordinates of a designed four-helix bundle. We also thank Hiroyoki Izuno for his help in the preliminary analyses of the rotamer distribution. This work was supported in part by the Bioarchitect Research Program of RIKEN to Y.I., and by Grants-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science and Technology of Japan to Y.I., M.O. and K.N.


    References
Berman,H., Westbrook,J., Feng,Z., Gilliland,G., Bhat,T., Weissig,H., Shindyalov,I. and Bourne,P. (2000) Nucleic Acids Res., 28, 235–242.

Bryson,J.W., Betz,S.F., Lu,H.S., Suich,D.J., Zhou,H.X., O’Neil,K.T. and DeGrado,W.F. (1995) Science, 270, 935–941.

Choma,C.T., Lear,J.D., Nelson,M.J., Dutton,P.L., Robertson,D.E. and DeGrado,W.F. (1994) J. Am. Chem. Soc., 116, 856–865.

Cordes,M.H., Davidson,A.R. and Sauer,R.T. (1996) Curr. Opin. Struct. Biol., 6, 3–10.

Desjarlais,J.R. and Handel,T.M. (1995) Curr. Opin. Biotechnol., 6, 460–466.

Dill,K.A., Bromberg,S., Yue,K., Fiebig,K.M., Yee,D.P., Thomas,P.D. and Chan,H.S. (1995) Protein Sci., 4, 561–602.

Doig,A.J. and Sternberg,M.J.E. (1995) Protein Sci., 4, 2247–2251.

Dunbrack,R.L. and Karplus,M. (1993) J. Mol. Biol., 230, 543–574.

Furukawa,K., Oda,M. and Nakamura,H. (1996) Proc. Natl Acad. Sci. USA, 93, 13583–13588.

Gibney,B.R., Johansson,J.S., Rabanal,F., Skalicky,J.J., Wand,A.J. and Dutton,P.L. (1997) Biochemistry, 36, 2798–2806.

Gibney,B.R., Rabanal,F., Skalicky,J.J., Wand,A.J. and Dutton,P.L. (1999) J. Am. Chem. Soc., 121, 4952–4960.

Handel,T.M., Williams,S.A. and DeGrado,W.F. (1993) Science, 261, 879–885.

Hecht,M.H., Richardson,J.S., Richardson,D.C. and Ogden,R.C. (1990) Science, 249, 884–891.

Isogai,Y., Ota,M., Fujisawa,T., Izuno,H., Mukai,M., Nakamura,H., Iizuka,T. and Nishikawa,K. (1999) Biochemistry, 38, 7431–7443.

Isogai,Y., Ishii,A., Fujisawa,T., Ota,M. and Nishikawa,K. (2000) Biochemistry, 39, 5683–5690.

Jiang,X., Farid,H., Pistor,E. and Farid,R.S. (2000) Protein Sci., 9, 403–416.

Kabsch,W. and Sander,C. (1983) Biopolymers, 22, 2577–2637.

Lasters,I., De Maeyer,M. and Desmet,J. (1995) Protein Eng., 8, 815–822.

Milla,M.E., Brown,B.M. and Sauer,R.T. (1994) Nature Struct. Biol., 1, 518–523.

Murphy,K.P., Bhakuni,V., Xie,D. and Freire,E. (1992) J. Mol. Biol., 227, 293–306.

Nakamura,H., Tanimura,R. and Kidera,A. (1996) Proc. Jpn Acad., 72B, 149–152.

Ota,M. and Nishikawa,K. (1997) Protein Eng., 10, 339–351.

Ota,M., Isogai,Y. and Nishikawa,K. (2001) Protein Eng., 14, 557–564.

Phillips,S.E. (1980) J. Mol. Biol., 142, 531–554.

Ponder,J.W. and Richards,F.M. (1987) J. Mol. Biol., 193, 775–791.

Schildbach,J.F., Milla,M.E., Jeffrey,P.D., Raumann,B.E. and Sauer,R.T. (1995) Biochemistry, 34, 1405–1412.

Sippl,M.J. (1990) J. Mol. Biol., 213, 859–883.

Skalicky,J.J., Gibney,B.R., Rabanal,F., Urbauer,R.J.B., Dutton,P.L. and Wand,A.J. (1999) J. Am. Chem. Soc., 121, 4941–4951.

Tanaka,T., Kimura,H., Hayashi,M., Fujiyoshi,Y., Fukuhara,K. and Nakamura,H. (1994) Protein Sci., 3, 419–427.

Tanimura,R., Kidera,A. and Nakamura,H. (1994) Protein Sci., 3, 2358–2365.

Received November 16, 2001; revised March 8, 2002; accepted March 31, 2002.




