Laboratoire de Physique Quantique, UMR 5626 of CNRS, IRSAMC, Université Paul Sabatier, 118 route de Narbonne, 31062 Toulouse Cedex, France
Abstract
Keywords: collective motion/degree of collectivity/hinge bending/protein model/shear motion
Introduction
One of the best suited theoretical methods for studying collective motions in proteins is normal mode analysis (NMA), which expresses the dynamics as a superposition of collective variables, namely the normal mode coordinates. These coordinates have been used to analyze molecular dynamics trajectories through the quasi-harmonic approximation (Karplus and Kushick, 1981; Levy et al., 1984; Teeter and Case, 1990; Hayward and Go, 1995), to integrate the equations of atomic motion with large time steps (Elezgaray and Sanejouand, 1998, 2000) and to sample larger portions of the configurational space (Amadei et al., 1993). However, the idea that normal mode theory may be an accurate tool for studying protein conformational changes comes from the fact that in several cases, e.g. hexokinase (Harrison, 1984), lysozyme (Brooks and Karplus, 1985; Gibrat and Go, 1990) and citrate synthase (Marques and Sanejouand, 1995), the largest amplitude motion obtained with this theory, that is, the one with the lowest frequency, was found to compare well with the conformational change observed by crystallographers in these proteins upon ligand binding. In the case of hemoglobin, it is the second lowest frequency mode of the T form which was found to be very similar to the observed difference between the T and R forms (Perahia and Mouawad, 1995).
The reasons underlying these successes of NMA are not fully understood. First, proteins are known to fold and function in a water environment, within a narrow range of pH, temperature, ionic strength, etc., while standard NMA is performed in vacuo. Second, standard NMA requires a preliminary energy minimization which drifts the atoms of the protein up to a few ångströms away from their position in the crystallographic structure. As a consequence, the structure studied with standard NMA is always a distorted one. Finally, NMA is based on a severe small displacements approximation, which amounts to supposing that a protein behaves like a solid does at low temperature, whereas it is well known that a protein is a somewhat flexible polymer, undergoing many local conformational transitions at room temperature.
Recent results have shed some light on this paradox. It was shown that using a single-parameter Hookean potential for taking into account pairwise interactions between neighboring atoms yields results in good agreement with those obtained when NMA is performed with standard semi-empirical potentials, as far as low-frequency normal modes are concerned (Tirion, 1996; Hinsen, 1998). The use of the same kind of highly simplified potential, but including only one point mass per residue in the model, yields low-frequency normal modes also in good agreement with those obtained using standard NMA (Hinsen, 1998). Moreover, when the interactions between closely located α-carbon pairs are described by an elastic network model, protein crystallographic temperature B factors are found to be accurately predicted (Bahar et al., 1997). Again, this means that low-frequency normal modes are well described with such a model, since it is well known that modes with frequencies under 30 cm$^{-1}$ are responsible for most of the amplitude of atomic displacements, as they can be estimated from the knowledge of B factors (Levy et al., 1982; Swaminathan et al., 1982).
Hence, results obtained with NMA in the field of low-frequency protein dynamics seem to be of very good quality even when most atomic details are simply ignored. However, up to now, low-frequency normal modes obtained with simple models have mainly been compared with those obtained with standard NMA as a whole, and not on a one-to-one basis. Detailed comparisons of such modes with large amplitude conformational changes of proteins remain to be performed. To do so, a study of proteins of various sizes undergoing conformational changes of various types and amplitudes was undertaken (see Table I). The results obtained confirm that a lot of information on the nature of the conformational change can be found in a single low-frequency normal mode of the open form of a protein. Moreover, it is shown that simple protein models are adequate for performing this kind of analysis.
Methods
For this analysis, 20 proteins were chosen for which at least two significantly different X-ray conformations are known. They were picked from the 'Database of Macromolecular Movements', where conformational changes are classified into five main types: predominantly shear, predominantly hinge, not hinge or shear, involving partial refolding of the structure, or unclassified (Gerstein and Krebs, 1998). Except for the Che Y case, all proteins considered in the present study undergo conformational changes of one of the first two kinds (see Table I for the pdb codes of the corresponding pairs of conformers), mainly because these are more frequent and/or better characterized.
Normal mode calculations
Normal mode theory. The small displacements of atomic coordinate $i$, $r_i(t)$, in the vicinity of a stationary point of the potential energy surface are given by (Goldstein, 1950):

$$r_i(t) = \frac{1}{\sqrt{m_i}} \sum_{j=1}^{3N} C_j \, a_{ij} \cos(\omega_j t + \phi_j)$$

where $m_i$ is the mass of the corresponding atom, $a_{ij}$ the $i$th coordinate of normal mode $j$ and $\nu_j = \omega_j / 2\pi$ the corresponding frequency. $C_j$, $a_{ij}$, $\omega_j^2$ and $\phi_j$ are obtained as follows: $\omega_j^2$ is the $j$th eigenvalue of the $3N \times 3N$ mass-weighted matrix of second derivatives of the potential energy, and $a_{ij}$ is the $i$th coordinate of eigenvector $j$. $C_j$ and $\phi_j$, the amplitude and phase of mode $j$, respectively, are determined once the coordinates and the velocities of the system at $t = 0$ are known.
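As an illustration of the superposition described above, the following minimal Python sketch evaluates the displacement of one coordinate at time t from a set of modes that are assumed to be already known. The function name `displacement` and the list-based layout of its arguments are illustrative assumptions, not part of the original study.

```python
import math

def displacement(i, t, masses, modes, amplitudes, omegas, phases):
    """Displacement of coordinate i at time t as a sum over normal modes.

    masses[i]     : mass attached to coordinate i
    modes[j][i]   : i-th coordinate of normal mode j (eigenvector j)
    amplitudes[j] : amplitude of mode j
    omegas[j]     : angular frequency of mode j (2*pi times its frequency)
    phases[j]     : phase of mode j
    """
    total = sum(
        c_j * mode[i] * math.cos(w_j * t + phi_j)
        for mode, c_j, w_j, phi_j in zip(modes, amplitudes, omegas, phases)
    )
    # The mass-weighting is undone by dividing by the square root of the mass.
    return total / math.sqrt(masses[i])
```

For a single coordinate of mass 4 driven by one mode of amplitude 2 and zero phase, the displacement at t = 0 is 2 / sqrt(4) = 1.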
Simplified potential. Within the frame of the approach proposed by Tirion (1996), the standard detailed potential energy function is replaced by

$$E_p = \sum_{d_{ij}^0 < R_c} C \, (d_{ij} - d_{ij}^0)^2 \qquad (1)$$

where $d_{ij}$ is the distance between atoms $i$ and $j$, $d_{ij}^0$ being the distance between these two atoms in the given studied crystallographic structure. $C$, the strength of the potential, is a phenomenological constant assumed to be the same for all interacting pairs. Note that this energy function was designed in such a way that, for any chosen configuration of any system, that configuration is a minimum of the total potential energy, $E_p$. Thus, with such an approach, by definition, NMA does not require any prior energy minimization.
Note that in Equation 1, the sum is restricted to atom pairs separated by less than $R_c$, which is an arbitrary cut-off parameter. In all normal mode calculations hereafter, a cut-off of 8 Å has been used and, as proposed by Bahar et al. (1997), only the Cα atoms have been taken into account. Such a model is adequate for studying backbone motions, which in turn is sufficient for characterizing low-frequency normal modes of large proteins. Moreover, it allows the study of proteins of large size on common workstations, using small amounts of CPU time, since with this simple model the matrix to be diagonalized is a $3N \times 3N$ matrix, where $N$ is the number of residues of the protein.
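The matrix to be diagonalized in such a one-point-per-residue spring model can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes a pair energy of the form C (d - d0)^2, so that each spring contributes 2C times the outer product of its unit vector, and it assumes unit masses so that no mass-weighting is applied. The function name `enm_hessian` is hypothetical, and the diagonalization itself, which would normally be delegated to a numerical library, is not shown.

```python
def enm_hessian(coords, cutoff=8.0, c=1.0):
    """Second-derivative matrix of a spring network at its reference structure.

    coords : list of (x, y, z) positions, one per residue (e.g. C-alpha atoms)
    cutoff : pairs closer than this distance are linked by a spring
    c      : spring strength, identical for all pairs
    Returns a 3N x 3N matrix as a list of lists.
    """
    n = len(coords)
    hess = [[0.0] * (3 * n) for _ in range(3 * n)]
    for i in range(n):
        for j in range(i + 1, n):
            delta = [coords[j][k] - coords[i][k] for k in range(3)]
            d0_sq = sum(x * x for x in delta)
            if d0_sq == 0.0 or d0_sq >= cutoff * cutoff:
                continue  # no spring between overlapping or distant points
            for a in range(3):
                for b in range(3):
                    # Assumed energy c * (d - d0)**2 gives a stiffness of 2c
                    # along the unit vector joining the two points.
                    k_ab = 2.0 * c * delta[a] * delta[b] / d0_sq
                    hess[3 * i + a][3 * i + b] += k_ab
                    hess[3 * j + a][3 * j + b] += k_ab
                    hess[3 * i + a][3 * j + b] -= k_ab
                    hess[3 * j + a][3 * i + b] -= k_ab
    return hess
```

Because the springs depend only on interatomic distances, every row of the resulting matrix sums to zero along each Cartesian direction, which reflects the zero-frequency rigid-body translations.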
Comparison with experiment
Overlap. $I_j$, the overlap between $\Delta r = \{\Delta r_1, \ldots, \Delta r_i, \ldots, \Delta r_{3N}\}$, the conformational change observed by crystallographers, and the $j$th normal mode of the protein, is a measure of the similarity between the direction of the conformational change and the one given by mode $j$. It is obtained as follows (Marques and Sanejouand, 1995):

$$I_j = \frac{\left| \sum_{i=1}^{3N} a_{ij} \, \Delta r_i \right|}{\left[ \sum_{i=1}^{3N} a_{ij}^2 \; \sum_{i=1}^{3N} \Delta r_i^2 \right]^{1/2}} \qquad (2)$$

where $\Delta r_i = r_i^o - r_i^c$, $r_i^o$ and $r_i^c$ being the $i$th atomic coordinate of the protein in the 'open' and 'closed' crystallographic structure, respectively. A value of one for the overlap means that the direction given by normal mode $j$ is identical with $\Delta r$. From a practical point of view, $\Delta r$ is calculated after both crystallographic conformations of the protein have been superimposed, using standard fitting procedures. These pairs of conformations are referred to as 'open' or 'closed' because many conformational changes considered in the present study occur in enzymes, in which an active pocket site is being closed as a consequence of substrate binding.
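The overlap of Equation 2 is the absolute value of the cosine of the angle between two 3N-dimensional vectors, and can be sketched in a few lines of Python. The function name `overlap` is an illustrative choice, and the two structures are assumed to have been superimposed beforehand.

```python
import math

def overlap(mode, delta_r):
    """Absolute cosine between a normal mode and a conformational change.

    mode    : the 3N coordinates of one normal mode
    delta_r : the 3N coordinate differences (open minus closed structure)
    """
    dot = sum(a * d for a, d in zip(mode, delta_r))
    norm = math.sqrt(sum(a * a for a in mode) * sum(d * d for d in delta_r))
    return abs(dot) / norm
```

Since both vectors are normalized inside the function, the result does not depend on the amplitude of the mode or on the sign convention chosen for the eigenvector.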
Correlation. The correlation coefficient $c_j$ measures the similarity of the patterns of the atomic displacements in the conformational change and in mode $j$. It is obtained as follows:

$$c_j = \frac{1}{N} \, \frac{\sum_{i=1}^{N} (A_{ij} - \bar{A}_j)(\Delta R_i - \overline{\Delta R})}{\sigma(A_j) \, \sigma(\Delta R)} \qquad (3)$$

where $A_{ij}$ and $\Delta R_i$ are, respectively, the amplitudes of the displacement of atom $i$ in mode $j$ and in the conformational change, $\bar{A}_j$ and $\overline{\Delta R}$ being the corresponding average displacements, while $\sigma(A_j)$ and $\sigma(\Delta R)$ are the corresponding root mean square deviations. A value of one for $c_j$ means that both patterns of atomic displacements are identical.
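A minimal sketch of the correlation of Equation 3 is given below. It assumes that the spread terms in the denominator are root mean square deviations about the mean, so that the quantity is the usual linear correlation coefficient. The helper `atom_amplitudes`, which turns a 3N-dimensional vector into N per-atom displacement amplitudes, and the function name `correlation` are illustrative assumptions.

```python
import math

def atom_amplitudes(vector_3n):
    """Per-atom displacement amplitudes from a flat (x1, y1, z1, x2, ...) vector."""
    return [
        math.sqrt(sum(x * x for x in vector_3n[k:k + 3]))
        for k in range(0, len(vector_3n), 3)
    ]

def correlation(mode_amplitudes, change_amplitudes):
    """Linear correlation between two per-atom displacement patterns."""
    n = len(mode_amplitudes)
    mean_a = sum(mode_amplitudes) / n
    mean_r = sum(change_amplitudes) / n
    sigma_a = math.sqrt(sum((a - mean_a) ** 2 for a in mode_amplitudes) / n)
    sigma_r = math.sqrt(sum((r - mean_r) ** 2 for r in change_amplitudes) / n)
    if sigma_a == 0.0 or sigma_r == 0.0:
        raise ValueError("correlation is undefined for a uniform pattern")
    covariance = sum(
        (a - mean_a) * (r - mean_r)
        for a, r in zip(mode_amplitudes, change_amplitudes)
    ) / n
    return covariance / (sigma_a * sigma_r)
```

Unlike the overlap, this quantity only compares how large the displacements are along the chain and carries no information about their direction.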
The degree of collectivity of a protein motion. A measure of how collective a protein motion is was proposed by Bruschweiler (1995). In the present study, it was used in order to estimate the degree of collectivity of each conformational change considered, reflecting the number of atoms which are significantly affected during the conformational change. This degree of collectivity, $\kappa$, is defined as being proportional to the exponential of the 'information entropy' embedded in vector $\Delta R = \{\Delta R_1, \ldots, \Delta R_i, \ldots, \Delta R_N\}$:

$$\kappa = \frac{1}{N} \exp\left( -\sum_{i=1}^{N} \alpha \, \Delta R_i^2 \, \log \alpha \, \Delta R_i^2 \right) \qquad (4)$$

where the sum is over the Cα atoms of the protein and where $\alpha$ is a normalization factor chosen so that $\sum_{i=1}^{N} \alpha \, \Delta R_i^2 = 1$. $\kappa$ is analogous to Boltzmann's $W$ in $S = k \log W$ and, once multiplied by $N$, gives the effective number of non-zero $\Delta R_i^2$. It is confined to the interval between $N^{-1}$ and 1. If $\kappa = 1$, the conformational change is maximally collective and all $\Delta R_i^2$ are identical. In the limit of extreme local motion, where the conformational change involves only a few atoms, $\kappa$ is minimal ($\kappa = 1/N$ when only one atom is involved in the conformational change). $\kappa$ values have been calculated for the conformational changes of all proteins considered in the present study (see Tables IV and V).
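The degree of collectivity of Equation 4 can be sketched as follows, taking as input the per-atom displacement amplitudes of the conformational change. The function name `collectivity` is an illustrative choice, and terms with zero displacement are skipped on the assumption that x log x tends to zero as x tends to zero.

```python
import math

def collectivity(displacements):
    """Degree of collectivity of a motion, between 1/N and 1.

    displacements : per-atom displacement amplitudes, one per residue
    """
    squares = [d * d for d in displacements]
    total = sum(squares)
    if total == 0.0:
        raise ValueError("collectivity is undefined when nothing moves")
    # Normalized squared displacements play the role of probabilities.
    entropy = -sum(
        (s / total) * math.log(s / total) for s in squares if s > 0.0
    )
    return math.exp(entropy) / len(squares)
```

With four atoms moving by the same amount the function returns a value of one, whereas if only one of the four atoms moves it returns 1/4, in line with the two limits described above.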
Results and discussion
In the seminal work of Tirion, in which simplified potentials such as the one that we use were introduced, it was mentioned that, using an all-atom model, tests performed on a periplasmic maltodextrin binding protein indicate that the slowest modes do indeed closely map the open form into the closed form (Tirion, 1996), but the corresponding details were not given. In further work by Hinsen (1998), a simplified potential was used together with a simplified protein model, with one point mass per residue as in ours, and, in the case of crambin, lysozyme and aspartate transcarbamylase, the normal modes thus obtained were found to compare well with those obtained using standard detailed potentials and models. Next, using the same simple potential and model, it was shown that in the case of citrate synthase and aspartate transcarbamylase, low-frequency normal modes lead to essentially the same domain identification as when the corresponding pair of crystallographic conformations are compared (Hinsen et al., 1999).
Although all these results strongly suggest that simple potentials and models yield low-frequency normal modes as accurate as those obtained using standard detailed potentials and models, no direct comparison between such approximate low-frequency normal modes and an experimentally known conformational change has yet been published. In Table II, the overlap (see Equation 2) of the mode found to be the most involved in the conformational change, that is, the one with the largest overlap value (Ma and Karplus, 1997), is given in the case of five proteins, when the modes are calculated with the model described in the Methods section (see Equation 1 and Figure 1), or when they are calculated using standard detailed potentials and models, as done within the frame of a previous methodological study (Tama et al., 2000).
The above results are in line with the hypothesis that, as far as the calculation of dynamic properties of proteins with NMA is concerned, simple potentials and protein models perform as well as detailed semi-empirical models (Tirion, 1996; Bahar et al., 1997; Hinsen, 1998). Hereafter, advantage is taken of the ease of use of simple models in order to address other issues in a quantitative manner, namely through the study of a significant number of proteins.
NMA performs better with open forms
First, a potential such as the one we use (see Equation 1) is a description of a protein as a set of harmonic springs linked together, as illustrated in Figure 1 in the case of the lysine-arginine-ornithine (LAO) binding protein. The fact that NMA performs well using such a description (see Figure 2 and Tables II and III) suggests that the property captured by NMA may for the most part be a property of the shape of the protein itself. If this point is correct, NMA should perform better with 'open' than with 'closed' forms, since in the former the domains of the protein are, by definition, more separated, that is, better defined as far as their shape is concerned (see Figure 1). Indeed, in Table III it is shown that the normal mode most involved in the conformational change, as obtained when studying an open conformation of a protein, almost always yields a better description of the direction of the observed conformational change than that obtained when studying a closed form, the corresponding overlap being significantly larger in eight cases out of 10. Using standard detailed potentials and models, such a result had already been noted in the case of citrate synthase, when the normal modes of this protein are calculated for the open (Tama et al., 2000) and for the closed (Marques and Sanejouand, 1995) forms, and also in the case of the open (see Table II) and closed forms of adenylate kinase. In the latter case, the overlap of the mode most involved in the conformational change is 0.37. Here again, this value is found to be close to that obtained with the simplified model used in the present study, that is, 0.38 (F. Tama and Y.-H. Sanejouand, unpublished results).
NMA performs better with highly collective motions
In Table IV, values of $\kappa$, the degree of collectivity of atomic motions (see Equation 4), are given for each conformational change studied, together with the overlap of the most involved mode found for the open forms of the studied proteins. It appears that for a degree of collectivity larger than 0.18, there is always one normal mode with a large overlap value with the conformational change (larger than 0.5). This result makes sense since normal modes of low frequency are known to be highly collective motions. In the case of rather localized motions, that is, when $\kappa$ is <0.18, the direction of the conformational change is rarely well described with a single normal mode. Indeed, in the four cases with $\kappa$ <0.14, no mode is found with an overlap with the conformational change >0.33. Such results suggest that only highly collective conformational changes may occur along a direction well described by a single normal mode.
NMA also performs well with more localized motions
However, it appears that large correlation values can be obtained (see Equation 3), even for motions with a low degree of collectivity, as in the case of triglyceride lipase (see Table IV). In Figure 3, the experimental conformational change of triglyceride lipase is shown. As expected from the small $\kappa$ value of 0.07, this motion is localized in a small stretch of residues (~10). However, while no mode with a large overlap value was found either in our study (see Table IV) or in a study performed with standard NMA (Jaaskelainen et al., 1998), the motion corresponding to the normal mode with the best correlation does indeed describe correctly the conformational change of triglyceride lipase, important displacements being observed in the same stretch of residues. Similar results are obtained in the case of triose phosphate isomerase, tyrosine phosphatase and seryl-tRNA synthetase. Thus, in these cases also, there is some information on the conformational change of the protein lying in the low-frequency normal modes of the open form, but this information is about the amplitudes of the displacements of the atoms and not about the direction of their motion. This result can be understood since when atomic displacements in a small stretch of residues are as large as ~10 Å, as in the case of triglyceride lipase, the corresponding motion is not expected to be as linear as it is in a domain motion. So, in such cases, normal modes describing correctly where the atoms will be at the end of the conformational change should be rare, for the simple reason that the direction of motion may vary all along the conformational change. Such an explanation may also help in understanding why in the case of calmodulin, whose conformational change has a very large degree of collectivity, namely 0.68, the most involved mode has a rather poor overlap of 0.5 with the conformational change. Indeed, calmodulin is a special case, since it has a dumbbell-like shape in the open form, and a rather globular one in the closed form. Thus, atomic motions during this conformational change are likely to follow a curvilinear path, the direction of the motion varying as the conformational change proceeds.
For 10 out of the 20 proteins studied, a lot of information on the nature of the conformational change is found in a single normal mode, namely its direction and the pattern of the atomic displacements in the protein, since the same normal mode presents both the best overlap and the best correlation with the conformational change. However, most often, this single normal mode is not the lowest frequency mode, as found, for instance, when the closed form of citrate synthase was studied using standard semi-empirical potentials (Marques and Sanejouand, 1995), but is often one of the three lowest frequency ones (see Tables IV and V).
Conclusion
Large conformational transitions are important for a variety of protein functions, including catalysis and regulation of activity. Most of these motions have been probed by X-ray crystallography. However, it is often difficult to obtain the crystallographic structures of the two forms of a protein, that is, both the free form and the form of the protein-ligand (or substrate) complex. As a consequence, conformational changes are more often suspected than they are described at the residue level. Thus, theoretical tools able to give information on the kind of conformational change a protein can undergo would be welcome. What we have shown in the present study is that a lot of information on the nature of the conformational change is often carried by a single low-frequency normal mode of the open form of the protein, as is obtained when normal mode analysis is performed using very simple protein models. Thus, seeking such a normal mode could help in checking a hypothesis about the kind of conformational change a given protein undergoes in order to perform its function. At a more fundamental level, our results raise the possibility that protein sequences may have been designed, through evolutionary processes, so as to allow the protein to follow the direction of a single normal mode, at least at the very beginning of its conformational change. In other words, proteins may take advantage of one of the collective motions they are able to perform, because of the shape they happen to have, as a starting point for their large amplitude functional motions.
References
Bahar,I., Atilgan,A. and Erman,B. (1997) Fold. Des., 2, 173-181.
Bennet,W. and Steitz,T. (1980) J. Mol. Biol., 140, 210-230.
Brooks,B.R. and Karplus,M. (1985) Proc. Natl Acad. Sci. USA, 82, 4995-4999.
Bruschweiler,R. (1995) J. Chem. Phys., 102, 3396-3403.
Derewenda,U., Brzozowski,A., Lawson,D. and Derewenda,Z. (1992a) Biochemistry, 31, 1532-1541.
Derewenda,Z., Derewenda,U. and Dodson,G. (1992b) J. Mol. Biol., 227, 818-839.
Elezgaray,J. and Sanejouand,Y.-H. (1998) Biopolymers, 46, 493-501.
Elezgaray,J. and Sanejouand,Y.-H. (2000) J. Comput. Chem., 21, 1274-1282.
Faber,H. and Matthews,B. (1990) Nature, 348, 263-266.
Gerstein,M. and Krebs,W. (1998) Nucleic Acids Res., 26, 4280-4290.
Gibrat,J. and Go,N. (1990) Proteins, 8, 258-279.
Goldstein,H. (1950) Classical Mechanics. Addison-Wesley, Reading, MA.
Harrison,W. (1984) Biopolymers, 23, 2943-2949.
Hayward,S. and Go,N. (1995) Annu. Rev. Phys. Chem., 46, 223-250.
Hinsen,K. (1998) Proteins, 33, 417-429.
Hinsen,K., Thomas,A. and Field,M.J. (1999) Proteins, 34, 369-382.
Hubert,R. and Bennett,W. (1983) Biopolymers, 22, 261-279.
Jaaskelainen,S., Verma,C.S., Hubbard,R.E., Linko,P. and Caves,L.S. (1998) Protein Sci., 7, 1359-1367.
Karplus,M. and Kushick,J. (1981) Macromolecules, 14, 325.
Kraulis,P. (1991) J. Appl. Crystallogr., 24, 946-950.
Levy,R., Perahia,D. and Karplus,M. (1982) Proc. Natl Acad. Sci. USA, 79, 1346-1350.
Levy,R., Karplus,M., Kushick,J. and Perahia,D. (1984) Macromolecules, 17, 1370.
Ma,J. and Karplus,M. (1997) J. Mol. Biol., 274, 114-131.
Marques,O. and Sanejouand,Y.-H. (1995) Proteins, 23, 557-560.
Oh,B., Pandit,J., Kang,C., Nikaido,K., Gokcen,S., Ames,G. and Kim,S. (1993) J. Biol. Chem., 268, 11348-11355.
Perahia,D. and Mouawad,L. (1995) Comput. Chem., 19, 241-246.
Remington,S., Weigand,G. and Huber,R. (1982) J. Mol. Biol., 158, 111-152.
Swaminathan,S., Ichiye,T., Van Gunsteren,W. and Karplus,M. (1982) Biochemistry, 21, 5230-5241.
Tama,F., Gadea,F., Marques,O. and Sanejouand,Y.-H. (2000) Proteins, 41, 1-7.
Teeter,M. and Case,D. (1990) J. Phys. Chem., 94, 8091-8097.
Tirion,M. (1996) Phys. Rev. Lett., 77, 1905-1908.
Wiegand,G. and Remington,S. (1986) Annu. Rev. Biophys. Biophys. Chem., 15, 97-117.
Received April 10, 2000; revised October 24, 2000; accepted November 9, 2000.