1 The Institute of Physical and Chemical Research (RIKEN), 2-1 Hirosawa, Wako, Saitama 351-0198, 3 National Institute of Genetics, Mishima, Shizuoka 411-8540, 4 Department of Physics, Faculty of Science, Gakushuin University, Toshima-ku, Tokyo 170-0031 and 6 Department of Life Sciences, Faculty of Science, Himeji Institute of Technology, Ako-gun, Hyogo 678-1297, Japan
Abstract |
---|
Keywords: conformational entropy/folding cooperativity/hydrophobic core/rotamer/side chain packing
Introduction |
---|
The unique packing of side chains in the protein core is essential to attain the overall structural uniqueness of proteins and requires the restriction of side chain conformations by interactions among core amino acid residues. The conformations of side chains relative to the main chains are expressed by a pair of side chain dihedral angles, χ1 and χ2, for the Cα–Cβ and Cβ–Cγ bonds, respectively (Dunbrack and Karplus, 1993). There are three typical values (60°, 180°, 300°) for χ1 and χ2, as expected from the sp³-hybridized orbitals of the carbon atoms (Ponder and Richards, 1987). Thus, the number of rotamers, i.e. typical rotational isomers of each amino acid residue, is nine at maximum and the rotamer library consists of a total of 112 templates for the 20 amino acids (Dunbrack and Karplus, 1993). The conformers of most amino acid residues in folded proteins, however, are limited to a smaller number that depends on the secondary structure. These backbone-dependent rotamer preferences have been explained by interactions between side chains and main chains, and have been used to predict side chain conformations from main chain structures (Dunbrack and Karplus, 1993; Tanimura et al., 1994; Lasters et al., 1995). On the other hand, side chain–side chain interactions should also contribute to the conformational restriction of residues and depend on the secondary structures. In the present study, we investigated the rotamer distribution of seven hydrophobic amino acids (Leu, Ile, Val, Met, Phe, Tyr, Trp) in protein 3D structures and analyzed the effects of inter-residue contacts on the rotamer distribution in each secondary structure. The results reveal the amino acid residues involved in protein structural uniqueness and give a specificity parameter for designing artificial proteins with a unique structure.
Materials and methods |
---|
Six hundred and eighty-three protein structures determined at a resolution better than 2.5 Å, whose sequences shared <30% identity with each other, were chosen from the Protein Data Bank (PDB) (Berman et al., 2000; Ota et al., 2001). Side chain conformations of Leu, Ile, Val, Met, Phe, Tyr and Trp in the interior sites buried at hydration class 6 or above (Ota and Nishikawa, 1997) in these structures and their residue–residue contacts were analyzed as follows. Each of the seven hydrophobic amino acids was classified into at most nine rotamers based on a pair of dihedral angles, χ1 and χ2, according to Dunbrack and Karplus (Dunbrack and Karplus, 1993). For example, leucine and valine were classified into rotamers L1–L9 and V1–V3, respectively. Secondary structures were classified into α-helix, β-sheet and others (coil), according to Kabsch and Sander (Kabsch and Sander, 1983). Two amino acid residues in a protein tertiary structure were defined to contact, or interact with, each other when the minimum distance between their non-hydrogen side chain atoms was <5 Å. Two residues adjacent to each other in the amino acid sequence (i ± 1) were excluded from the contacting residue pairs. The numbers of rotamers in contact or in no contact with a certain amino acid residue were counted separately for each secondary structure in the 683 proteins and were used for further analyses.
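The classification and contact rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rotamer numbering `3 * bin(χ1) + bin(χ2) + 1` is inferred from the angle ranges quoted for I8, I9, M5 and M8 in Results and discussion, the numbering for residues with a single dihedral angle (e.g. V1–V3) is assumed to follow the χ1 bin, and the function names, the coordinate format and the same-chain sequence numbering are hypothetical.

```python
import math


def angle_bin(chi_deg):
    """Return 0, 1 or 2 for the ranges [0, 120), [120, 240), [240, 360) degrees.

    The bins are centred on the typical values 60, 180 and 300 degrees.
    """
    # min() guards against a floating-point wrap to exactly 360.0
    return min(int((chi_deg % 360.0) // 120.0), 2)


def rotamer_index(chi1_deg, chi2_deg=None):
    """Rotamer number 1-9 from (chi1, chi2), or 1-3 from chi1 alone.

    Assumed numbering: 3 * bin(chi1) + bin(chi2) + 1.
    """
    b1 = angle_bin(chi1_deg)
    if chi2_deg is None:
        return b1 + 1
    return 3 * b1 + angle_bin(chi2_deg) + 1


def in_contact(atoms_a, atoms_b, seq_a, seq_b, cutoff=5.0):
    """True if two residues of one chain are in contact.

    atoms_a, atoms_b: lists of (x, y, z) coordinates in angstroms of the
    non-hydrogen side chain atoms. Sequence neighbours (i +/- 1) are excluded.
    """
    if abs(seq_a - seq_b) <= 1:
        return False
    nearest = min(math.dist(p, q) for p in atoms_a for q in atoms_b)
    return nearest < cutoff
```

For instance, `rotamer_index(300, 180)` gives 8, matching the range quoted for I8 and M8, and `rotamer_index(180, 180)` gives 5, matching M5.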
Conformational entropy of side chains
Side chain conformational entropy (Sconf) is expressed as follows:
$$S_{\mathrm{conf}} = R \ln W \qquad (1)$$
where R is the gas constant and W is the number of accessible rotamers in the structural state. Thus, the change in Sconf upon folding is expressed as ΔSfolding = R ln(Wf/Wu), where Wf and Wu are the numbers of rotamers in the folded and unfolded states, respectively. For residues in the native protein interior, Wf is often assumed to be one, i.e. ΔSfolding = −R ln Wu, because these side chains are restricted to adopt almost a single rotamer (Doig and Sternberg, 1995). In the present study, however, it is not reasonable to assume Wf to be one, because the calculation aims to estimate the potential contribution of each amino acid to the structural uniqueness. Wf is smaller than Wu but more than one in the mixed conformational gemisch states of artificial proteins with no structural uniqueness (Dill et al., 1995). Here, we estimated Wf to be the number of allowable rotamers weighted by the probability of each rotamer populated in each secondary structure of a folded state as follows:
$$W_f = \exp\left(-\sum_i p_i \ln p_i\right) \qquad (2)$$
where pi is the fractional population of each rotamer state i in the folded state. Wf is calculated to be one when the side chain conformations are strongly restricted to a single rotamer in the structure, whereas Wf equals the total rotamer number when every rotamer is present at an equal probability. In general, Wf ranges from one to the total rotamer number according to the values of pi and can be regarded as the effective number of rotamers in the folded state. On the other hand, Wu is difficult to measure directly. ΔSfolding can therefore be calculated only by assuming that each rotamer is populated at an equal probability in the unfolded state and that Wu is the total number of possible rotamers as defined (Doig and Sternberg, 1995). However, ΔScontact defined here can be obtained without using this uncertain assumption (see below).
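As a minimal sketch of the effective rotamer number, the following Python code assumes that Equation (2) takes the Shannon-entropy form ln Wf = −Σ pi ln pi, which reproduces the two limiting cases described above (Wf = 1 for a single populated rotamer, Wf = total rotamer number for equal populations). The function names and the use of raw rotamer counts as input are illustrative choices, not part of the original method description.

```python
import math

R = 8.314  # gas constant in J mol^-1 K^-1


def effective_rotamers(counts):
    """Effective number of rotamers W_f from observed rotamer counts.

    Assumes ln W_f = -sum(p_i * ln p_i); empty rotamer states contribute zero.
    """
    total = sum(counts)
    probs = [n / total for n in counts if n > 0]
    return math.exp(-sum(p * math.log(p) for p in probs))


def conformational_entropy(counts):
    """S_conf = R ln W, Equation (1), using the effective rotamer number."""
    return R * math.log(effective_rotamers(counts))
```

With this form, a residue observed in only one rotamer has zero conformational entropy, and nine equally populated rotamers give R ln 9.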
Results and discussion |
---|
The effects of side chain–side chain contacts on rotamer distribution were estimated with ΔSqcontact(a, b), the entropy change of residue a in a secondary structure q (q is α-helix, β-sheet or coil) caused by contact with residue b in the folded state. A given residue a in a given structural environment may be in contact with several residues, e.g. b, c and d. If a contacts a residue of type b at least once, a is defined to be in contact with b. Otherwise, a is defined to be in no contact with b. Thus, the number of rotamers ai (rotamer state i of residue a) in no contact with b equals the total number of ai minus the number of ai in contact with b. For every contact counterpart b, all rotamers were categorized as either in contact or in no contact.
ΔSqcontact(a, b) is then defined by
$$\Delta S^{q}_{\mathrm{contact}}(a, b) = S^{q}_{c}(a, b) - S^{q}_{nc}(a, b) \qquad (3)$$
where Sqc(a, b) and Sqnc(a, b) are the side chain conformational entropies (Sconf) of residue a in a secondary structure q in contact and in no contact with residue b, respectively. Sqc(a, b) is not the entropy of residue a in contact with residue b alone, but that of residue a in contact with a set of residues that includes b, and Sqnc(a, b) is the entropy of residue a in contact with surrounding residues other than b. They were estimated from the rotamer distribution of residue a in contact or in no contact with residue b [see Equations (1) and (2) in Materials and methods].
To correct for statistical errors due to small data sets of rotamer numbers, pic and pinc, the fractional populations of each rotamer state i in contact and in no contact with a certain residue, respectively, in the folded state, were transformed to pic′ and pinc′ as follows:

$$p_i^{c\prime} = \frac{p_i^{a}}{1 + N_c\sigma} + \frac{N_c\sigma}{1 + N_c\sigma}\,p_i^{c}, \qquad p_i^{nc\prime} = \frac{p_i^{a}}{1 + N_{nc}\sigma} + \frac{N_{nc}\sigma}{1 + N_{nc}\sigma}\,p_i^{nc}$$

where pia is the fractional population of a rotamer state i regardless of the contact; Nc and Nnc are the observed numbers of rotamers in contact and in no contact, respectively; and the correction factor σ = 0.01 (Sippl, 1990). Here, we calculated Wfc and Wfnc, the effective numbers of rotamers in contact and in no contact, by Equation (2) with pic′ and pinc′, respectively. Sqc and Sqnc in Equation (3) were calculated with these Wfc and Wfnc, respectively, to obtain ΔSqcontact.
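The calculation described above can be sketched as follows. This is an illustration rather than the original implementation: it assumes the sparse-data correction has the Sippl-style form p′ = pa/(1 + Nσ) + p·Nσ/(1 + Nσ) with σ = 0.01, that the effective rotamer number follows ln Wf = −Σ pi ln pi, and that the entropy change is the in-contact entropy minus the no-contact entropy. The function names and the count-list input format are hypothetical.

```python
import math

R = 8.314     # gas constant in J mol^-1 K^-1
SIGMA = 0.01  # correction factor quoted in the text (Sippl, 1990)


def _fractions(counts):
    """Fractional populations; all zeros if nothing was observed."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [n / total for n in counts]


def _entropy(probs):
    """S = R ln W with ln W = -sum(p ln p) (assumed form of Equation 2)."""
    return -R * sum(p * math.log(p) for p in probs if p > 0)


def corrected(p_all, p_obs, n_obs, sigma=SIGMA):
    """Blend the observed fractions with the contact-independent ones."""
    w = 1.0 / (1.0 + n_obs * sigma)  # weight of p_all; 1 - w = N*sigma/(1 + N*sigma)
    return [w * pa + (1.0 - w) * po for pa, po in zip(p_all, p_obs)]


def delta_s_contact(contact_counts, no_contact_counts, sigma=SIGMA):
    """Entropy change of residue a on contact with residue b.

    Both arguments list, per rotamer state, how often residue a was seen
    in contact and in no contact with b in one secondary structure class.
    """
    all_counts = [c + n for c, n in zip(contact_counts, no_contact_counts)]
    p_all = _fractions(all_counts)
    n_c, n_nc = sum(contact_counts), sum(no_contact_counts)
    p_c = corrected(p_all, _fractions(contact_counts), n_c, sigma)
    p_nc = corrected(p_all, _fractions(no_contact_counts), n_nc, sigma)
    return _entropy(p_c) - _entropy(p_nc)
```

A negative return value means that contact with b narrows the rotamer distribution of a. When few contacts are observed, the corrected fractions stay close to the contact-independent distribution, so the result tends towards zero instead of being dominated by sampling noise.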
The ΔSqcontact(a, b) values are plotted against the hydrophobic residues, Ala (A), Leu (L), Ile (I), Val (V), Met (M), Phe (F), Tyr (Y) and Trp (W), in Figure 2. These values indicate the degree of conformational restriction of residue a by contact with residue b, i.e. how much residue b affects the side chain conformation of a residue a in contact with it. The 10 residue pairs with the best (most negative) ΔScontact values are also listed in Table I. The sets of ΔScontact values for the residue pairs clearly differ among α-helix, β-sheet and coil, and are characteristic of each secondary structure (Figure 2 and Table I). Of the hydrophobic residues in α-helices, aliphatic residues (Leu, Val, Ile, Met) strongly restrict the side chain conformations of each other. In particular, Ile is most restricted by contact with other Ile (ΔSαcontact = −1.33 J mol⁻¹ K⁻¹), Met (−1.30 J mol⁻¹ K⁻¹) and Val (−0.83 J mol⁻¹ K⁻¹). The effects of the rotamer-distribution changes on ΔScontact for the Ile–Ile pair in α-helices are shown in Figure 3A. The Ile rotamers I8 and I9, which have the side chain conformations of 240° < χ1 < 360°, 120° < χ2 < 240° and 240° < χ1 < 360°, 240° < χ2 < 360°, respectively, are dominant in α-helices and together increase from 83.8 to 89.2% on contact with Ile. This can be partly explained by the fact that only certain conformations of the β-branched side chains are compatible with the main chain α-helical conformation. In β-sheets, Met is most strongly restricted by contact with Ile (ΔSβcontact = −1.75 J mol⁻¹ K⁻¹) (Figure 3B) and Leu (−0.70 J mol⁻¹ K⁻¹), whereas Leu, Val and Ile are less affected by other residues in contact than they are in α-helices. The Met rotamers M5 and M8, which have the side chain conformations of 120° < χ1 < 240°, 120° < χ2 < 240° and 240° < χ1 < 360°, 120° < χ2 < 240°, respectively, are dominant in β-sheets and together increase from 58.0 to 71.5% on contact with Ile. In coils, conformational restrictions by inter-residue contacts are smaller than those in the other secondary structures (Figure 2C and Table I), possibly due to the various conformations of the main chain.
Assessment of ΔScontact as a specificity parameter
To examine the validity of ΔScontact as a parameter for structural specificity, ΔΔSX→Y, the total change in ΔScontact by substitution of residue X with residue Y, was estimated for a series of artificial four-helix bundles (Gibney et al., 1999; Skalicky et al., 1999) and also for a set of single alanine substitution mutants of the natural Arc repressor (Milla et al., 1994). ΔΔSX→Y is expressed as:
![]() | (4) |
where {a}Y and {a}X are all the hydrophobic residues in contact with residues Y and X, respectively. The calculation was performed based on the 3D coordinates of the native and designed proteins. These designed and natural protein variants showed a variety of structural specificities as well as stabilities. The ΔΔSX→Y values between the prototype (wild-type) and the variants are plotted against the differences in m values (Δm) obtained from denaturation experiments with guanidine hydrochloride in Figure 4. The parameter m measures the cooperativity of the folding–unfolding transitions or structural specificity of proteins, and correlates with the signal quality (dispersion and resolution) of NMR spectra (Murphy et al., 1992; Gibney et al., 1999). The data show a correlation of ΔΔSX→Y values with Δm values and scatter around the expected line, which decreases along the x-axis through the coordinate origin (0, 0) corresponding to the prototype four-helix bundle (LLL) or wild-type Arc. The correlation coefficients for the four-helix bundle variants were −0.63 (single mutations), −0.65 (single and double mutations) and −0.53 (all data of single, double and triple mutations). For the single mutants of the Arc repressor, the correlation was weaker than those for the four-helix bundles, with a coefficient of −0.11. For the combined data of the four-helix bundles and Arc mutants, the correlation is clearer with a total correlation coefficient of −0.63. Although the linear inverse correlation between Δm and ΔΔSX→Y has not been theoretically proved, the correlation observed here is evident and strongly suggests that ΔScontact can be a measure of the folding cooperativity as well as m.
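For readers who wish to repeat this kind of comparison, a correlation coefficient of the sort quoted above can be computed as in the sketch below. It assumes the coefficients are ordinary Pearson coefficients, which the text does not state explicitly, and the function name is illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

Here `xs` would hold the differences in m between each variant and the prototype, and `ys` the corresponding total changes in contact entropy; an inverse relationship gives a negative coefficient.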
Assuming that all the possible rotamers of each residue are populated at an equal probability in the unfolded state, the total conformational-entropy change upon folding (ΔSfolding) was calculated for the four-helix bundle and Arc variants without considering inter-residue contacts (data not shown). The expected correlation of ΔSfolding with m was not observed; the correlation coefficient was positive (0.13) for the combined data of the four-helix bundles and Arc mutants. This may be partly because ΔSfolding depends strongly on the total rotamer number of each residue, i.e. residues with longer side chains tend to have more negative ΔSfolding, which makes it unsuitable for estimating the structural specificity of proteins in the folded state. It may also be due to the uncertainty in Wu, which is necessary to calculate ΔSfolding (see Materials and methods).
Application to designed globins
We have proposed a new computational method for designing an entire amino acid sequence that can adopt a given tertiary structure with sizeable molecular mass by using a knowledge-based 3D–1D compatibility function (Ota et al., 1997; Isogai et al., 1999). Based on this method, an artificial sequence of 153 amino acids was designed to fit the main chain framework of sperm whale myoglobin (Mb). The synthesized artificial globin (DG1) was well folded and bound one heme per protein molecule as designed. DG1 exhibited much higher thermodynamic stability than natural apoMb but lacked structural uniqueness at the side chain level. Several Leu and Met residues in DG1 were then replaced with the β-branched amino acids Ile and Val to generate DG2–4 (Isogai et al., 2000). These residue replacements significantly affected both stability and structural specificity, although they resulted in no significant changes in compactness or α-helical content in the absence of denaturant. Among DG1–4, DG3, in which 11 Leu residues of DG1 were replaced with seven Ile and four Val residues and one Met residue was replaced with Val, displayed the lowest stability but the most cooperative folding–unfolding transition and the best NMR spectrum with the smallest linewidth. These results indicate that the replacements of Leu (Met) with Ile and Val at appropriate sites reduce the freedom of side chain conformation and are consistent with the present analyses.
By using the computational models of the designed globins (Isogai et al., 2000), the total ΔScontact values for the 17 replaced sites of DG1, DG2, DG3 and DG4 were calculated to be −4.2, −8.8, −12.8 and −13.7 J mol⁻¹ K⁻¹, respectively (Figure 5). In the denaturation experiments, the DGs unfolded with two distinct transitions, i.e. F → I → U, and thus gave two m values, m1 and m2, for F → I and I → U, respectively. The parameter m1, rather than m2 or m1 + m2, was indicative of the overall structural specificity detected by NMR spectroscopy (Isogai et al., 2000), and the m1 values are compared with the total ΔScontact values in Figure 5. The m1 value, or the quality of NMR signals, increases with the magnitude of the total ΔScontact value for DG1–3, whereas the two are inconsistent for DG4. This may reflect that too many residue replacements change the main chain structure and/or give unexpected effects on structural properties, such as induction of non-specific interactions between protein molecules, which could not be detected by NMR measurement.
Notes |
---|
2 To whom correspondence should be addressed. E-mail: yisogai{at}postman.riken.go.jp
Acknowledgments |
---|
References |
---|
Bryson,J.W., Betz,S.F., Lu,H.S., Suich,D.J., Zhou,H.X., O'Neil,K.T. and DeGrado,W.F. (1995) Science, 270, 935–941.
Choma,C.T., Lear,J.D., Nelson,M.J., Dutton,P.L., Robertson,D.E. and DeGrado,W.F. (1994) J. Am. Chem. Soc., 116, 856–865.
Cordes,M.H., Davidson,A.R. and Sauer,R.T. (1996) Curr. Opin. Struct. Biol., 6, 3–10.
Desjarlais,J.R. and Handel,T.M. (1995) Curr. Opin. Biotechnol., 6, 460–466.
Dill,K.A., Bromberg,S., Yue,K., Fiebig,K.M., Yee,D.P., Thomas,P.D. and Chan,H.S. (1995) Protein Sci., 4, 561–602.
Doig,A.J. and Sternberg,M.J.E. (1995) Protein Sci., 4, 2247–2251.
Dunbrack,R.L. and Karplus,M. (1993) J. Mol. Biol., 230, 543–574.
Furukawa,K., Oda,M. and Nakamura,H. (1996) Proc. Natl Acad. Sci. USA, 93, 13583–13588.
Gibney,B.R., Johansson,J.S., Rabanal,F., Skalicky,J.J., Wand,A.J. and Dutton,P.L. (1997) Biochemistry, 36, 2798–2806.
Gibney,B.R., Rabanal,F., Skalicky,J.J., Wand,A.J. and Dutton,P.L. (1999) J. Am. Chem. Soc., 121, 4952–4960.
Handel,T.M., Williams,S.A. and DeGrado,W.F. (1993) Science, 261, 879–885.
Hecht,M.H., Richardson,J.S., Richardson,D.C. and Ogden,R.C. (1990) Science, 249, 884–891.
Isogai,Y., Ota,M., Fujisawa,T., Izuno,H., Mukai,M., Nakamura,H., Iizuka,T. and Nishikawa,K. (1999) Biochemistry, 38, 7431–7443.
Isogai,Y., Ishii,A., Fujisawa,T., Ota,M. and Nishikawa,K. (2000) Biochemistry, 39, 5683–5690.
Jiang,X., Farid,H., Pistor,E. and Farid,R.S. (2000) Protein Sci., 9, 403–416.
Kabsch,W. and Sander,C. (1983) Biopolymers, 22, 2577–2637.
Lasters,I., De Maeyer,M. and Desmet,J. (1995) Protein Eng., 8, 815–822.
Milla,M.E., Brown,B.M. and Sauer,R.T. (1994) Nature Struct. Biol., 1, 518–523.
Murphy,K.P., Bhakuni,V., Xie,D. and Freire,E. (1992) J. Mol. Biol., 227, 293–306.
Nakamura,H., Tanimura,R. and Kidera,A. (1996) Proc. Jpn Acad., 72B, 149–152.
Ota,M. and Nishikawa,K. (1997) Protein Eng., 10, 339–351.
Ota,M., Isogai,Y. and Nishikawa,K. (2001) Protein Eng., 14, 557–564.
Phillips,S.E. (1980) J. Mol. Biol., 142, 531–554.
Ponder,J.W. and Richards,F.M. (1987) J. Mol. Biol., 193, 775–791.
Schildbach,J.F., Milla,M.E., Jeffrey,P.D., Raumann,B.E. and Sauer,R.T. (1995) Biochemistry, 34, 1405–1412.
Sippl,M.J. (1990) J. Mol. Biol., 213, 859–883.
Skalicky,J.J., Gibney,B.R., Rabanal,F., Urbauer,R.J.B., Dutton,P.L. and Wand,A.J. (1999) J. Am. Chem. Soc., 121, 4941–4951.
Tanaka,T., Kimura,H., Hayashi,M., Fujiyoshi,Y., Fukuhara,K. and Nakamura,H. (1994) Protein Sci., 3, 419–427.
Tanimura,R., Kidera,A. and Nakamura,H. (1994) Protein Sci., 3, 2358–2365.
Received November 16, 2001; revised March 8, 2002; accepted March 31, 2002.