Discrete structure of van der Waals domains in globular proteins

Igor N. Berezovsky

Department of Structural Biology, The Weizmann Institute of Science, P.O.B. 26, Rehovot 76100, Israel. Present address: Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street M-105, Cambridge, MA 02138, USA. E-mail: inberez@fas.harvard.edu


    Abstract
Most globular proteins are divisible into domains, distinct substructures of the globule. The notion of a hierarchy of domains was introduced earlier via van der Waals energy profiles that allow one to subdivide proteins into domains (subdomains). The question remained open as to what structural features underlie these energy profiles. The recent discovery of the loop-n-lock elements in globular proteins suggests such a structural connection. A direct comparison of the segmentation by van der Waals energy criteria with the maps of the locked loops of nearly standard size reveals a striking correlation: domains in general appear to consist of one to several such loops. In addition, it is demonstrated that the variety of subdivisions of the same protein into domains is just a regrouping of the loop-n-lock elements.

Keywords: closed loops/formation and alteration of domains/hierarchy of domain structure/loop-n-lock structure/protein folding/protein structure


    Introduction
Proteins consist of distinct, semi-independent, stable structural fragments (domains) that were elucidated from the results of limited proteolysis (Porter, 1959) even before the first protein X-ray structure was determined (Kendrew et al., 1958). A variety of computational methods for identifying domains have been developed (Berezovskii and Tumanyan, 1995; Islam et al., 1995; Siddiqui and Barton, 1995; Berezovskii et al., 1997; Berezovsky et al., 1997, 1999; Jones et al., 1998; Wernisch et al., 1999) and various factors in the formation of the domains have been discussed (Doolittle, 1995). Hierarchic organization of the protein globule was first described by Crippen (Crippen, 1978) and Rose (Rose, 1979) and recently discussed in the model of hierarchic protein folding (Baldwin and Rose, 1999a,b). Van der Waals energy calculations (Berezovskii and Tumanyan, 1995; Berezovskii et al., 1997; Berezovsky et al., 1997, 1999, 2000a) allow the detection of the domains and changes in the domain structure at different energy levels. Domains can be further subdivided into energetically distinct segments and combinations thereof. The energy-based dissection of globular proteins into domains (Berezovskii and Tumanyan, 1995; Berezovskii et al., 1997; Berezovsky et al., 1997, 1999, 2000b) appears to be the most natural way to separate independent elements and to evaluate interactions between them.

It was recently discovered that globular proteins are universally built of nearly standard size closed loops (loop-n-lock elements), in other words, returns of the chain trajectory with tight end-to-end contacts (Berezovsky et al., 2000c; Berezovsky and Trifonov, 2001a,b). In order to establish the possible connection between energy-derived domains and subdomains on the one hand and the structurally defined loop-n-lock elements on the other, we compared the results obtained by these two independent approaches. The detailed comparisons of the energy maps with positions of the primary closed loops, as described below, show that the domains are essentially made of the loops and that the hierarchy of domain structure is defined by interactions between the loops and by the loop regrouping.


    Materials and methods
Structural data

Major representatives (Orengo et al., 1994) of the protein superfolds [Globin (1thb), Trefoil (1i1b), Up-Down (256b), Immunoglobulin (2rhe), α/β Sandwich (1aps), Jelly Roll (2stv), Doubly Wound (4fxn), UB α/β Roll (1ubq), TIM-Barrel (7tim)] were analyzed. X-ray data from the Protein Data Bank were supplemented with coordinates of H-atoms (Berezovsky et al., 1999).

Hierarchy of protein domain structure by the van der Waals energy approach

The algorithm is based on the segmentation of the globule into parts with a high concentration of van der Waals energy and further detailed analysis of interactions between these segments. Van der Waals energies were calculated for all pairs of contacting atoms. Only contact distances between 2.5 and 5.0 Å were considered (Berezovsky et al., 1999). The Lennard–Jones 6–12 potential and the standard Scheraga parameters for different types of atoms were used (Dunfield et al., 1978; Nemethy et al., 1983). The van der Waals energies were calculated for atoms belonging to residues separated by at least two amino acids along the polypeptide chain (Berezovskii and Tumanyan, 1995). Figure 1A demonstrates an example of van der Waals ‘energy walks’ for the TIM-Barrel fold (7tim). Each point on the curve is the interaction energy between the two parts of the globule separated by the given amino acid residue.
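
The calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the program used in this work: a single placeholder well depth `eps` and optimal distance `r_min` stand in for the atom-type-specific Scheraga parameters, the condition ‘separated by at least two amino acids’ is read as a sequence separation of at least three, and all function names are hypothetical.

```python
import math

def lj_6_12(r, eps, r_min):
    """Lennard-Jones 6-12 energy; eps and r_min are placeholder parameters."""
    x = (r_min / r) ** 6
    return eps * (x * x - 2.0 * x)

def residue_energy_matrix(atoms, eps=0.1, r_min=3.8,
                          d_lo=2.5, d_hi=5.0, min_sep=3):
    """atoms: list of (residue_index, (x, y, z)) with 0-based residue indices.
    Returns a symmetric residue-residue matrix of summed atom-atom energies."""
    n = max(res for res, _ in atoms) + 1
    e = [[0.0] * n for _ in range(n)]
    for a in range(len(atoms)):
        ra, pa = atoms[a]
        for b in range(a + 1, len(atoms)):
            rb, pb = atoms[b]
            if abs(ra - rb) < min_sep:      # assumed reading of the separation rule
                continue
            r = math.dist(pa, pb)
            if d_lo <= r <= d_hi:           # contact distances 2.5-5.0 A only
                v = lj_6_12(r, eps, r_min)
                e[ra][rb] += v
                e[rb][ra] += v
    return e

def energy_walk(e):
    """walk[k] = sum of e[i][j] over i < k < j, i.e. the interaction energy
    between the parts of the chain on either side of residue k."""
    n = len(e)
    return [sum(e[i][j] for i in range(k) for j in range(k + 1, n))
            for k in range(n)]
```

With this definition the walk is zero at the first and last residues, because one of the two parts is then empty.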




Figure 1. Van der Waals interaction energy curves: ‘energy walks’ for triosephosphate isomerase (7tim). (A) Initial energy curve. Position 190 corresponds to the minimum of interaction energy (E0). Positions 64 and 124 are maxima of the interaction energy that correspond to segmentation with potential barrier 0.25E0 (see graph B). (B) Energy curves for segments 1–63, 65–123 and 125–248 (value of ‘potential barrier’ is 0.25E0–0.2E0). (C) The same curve for the value of barrier 0.15E0: segments 1–63, 65–123, 125–160 and 162–248. (D) Barrier 0.1E0: segments 1–63, 65–91, 93–123, 125–160, 162–210 and 212–248. (E) Barrier 0.05E0: segments 1–63, 65–91, 93–103, 105–123, 125–160, 162–173, 175–210 and 212–248.

 
The procedure of domain detection and setting of the levels of energy to establish the hierarchy of the domains consists of the following steps:

  1. Calculation of the interaction energy between parts of the native globule separated by each given residue. The minimal energy value (E0) is found on the curve of interaction energy between parts of the globule. Every local maximum on the curve of interaction energy between parts of the globule corresponds to a point of separation. The null value of the interaction energy means complete energy independence of the adjacent regions from each other.
  2. Setting of ‘potential barriers’, e.g. 0.3 E0, 0.25 E0, 0.2 E0, 0.15 E0, 0.1 E0, 0.05 E0, and analysis of the initial curve of interactions between parts of the globule at different levels of the ‘potential barrier’. Any maximum on the initial curve is considered to be a point of structural separation if the differences between this maximum and neighboring deep minima exceed the value of a chosen potential barrier. This generates sets of structural segments corresponding to the values of the barriers. Thus, alternative domains can be defined at different levels of the barrier. These segments are characterized as follows: internal energy of isolated segments eii, integral energy of external interaction of each segment with others and interaction energies for each pair of segments eij (i,j = 1, ..., K, where K is the number of segments).
  3. Analysis of the interaction energy within the structural segments separated at the previous step, and between the segments. Any isolated segment is considered as a candidate domain if (here and below, we compare absolute values of energy). Any candidate domain will be classified as a domain if . Any two potential domains i and k will be combined into one independent domain if and simultaneously. Two potential domains will also be combined if more than 70% of the external energy of one domain pertains to the interaction with the other domain. Any isolated segment with will be joined with the segment or potential domain for which the first segment has maximal external interaction energy.
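
Step 2 of the procedure can be sketched as below. The text does not spell out how the ‘neighboring deep minima’ are located, so the sketch assumes one possible reading: on each side of a local maximum, the deepest value reached before the curve rises above that maximum again (or the chain ends). The function name `separation_points` is hypothetical.

```python
def separation_points(walk, fraction):
    """Return indices of local maxima of the energy walk whose drop to the
    deepest minimum on BOTH sides exceeds fraction * |E0|, where E0 is the
    minimum of the curve. The minimum search on each side stops where the
    curve rises above the maximum again (an assumed interpretation)."""
    barrier = fraction * abs(min(walk))
    points = []
    for k in range(1, len(walk) - 1):
        if not (walk[k] > walk[k - 1] and walk[k] >= walk[k + 1]):
            continue  # not a local maximum
        left = walk[k]
        i = k - 1
        while i >= 0 and walk[i] <= walk[k]:
            left = min(left, walk[i])
            i -= 1
        right = walk[k]
        j = k + 1
        while j < len(walk) and walk[j] <= walk[k]:
            right = min(right, walk[j])
            j += 1
        if walk[k] - left > barrier and walk[k] - right > barrier:
            points.append(k)
    return points

# Toy curve (arbitrary units): lowering the barrier reveals more borders.
toy = [0, -6, -2, -10, -7, -9, 0]
print(separation_points(toy, 0.25))  # one border
print(separation_points(toy, 0.15))  # two borders
```

Lowering the barrier fraction can only add separation points under this reading, which is what produces nested levels of segmentation.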

The procedure is explained in further detail by the illustration in Figure 1, which contains a full set of van der Waals curves for triosephosphate isomerase (TIM-Barrel fold, 7tim). Graph A represents the initial curve of a van der Waals energy walk and graphs B–E the energy curves at different levels of hierarchy. In Figure 1, accordingly:

  1. Position 190 of the initial curve (see graph A) gives a minimal value E0 of interactions.
  2. Different types of structure splitting are observed for the following levels of a potential barrier: 0.25E0 (Figure 1B), 0.15E0 (Figure 1C), 0.1E0 (Figure 1D) and 0.05E0 (Figure 1E). Figure 1A demonstrates maxima at positions 64 and 124. Each of these maxima is accompanied by two minima with respective energy differences larger than 0.25E0. This suggests segmentation 1–63, 65–123 and 125–248 at the potential barrier 0.25E0 (see also Figure 1B). Figure 1C–E demonstrate interaction energies within the segments (see legend to Figure 1) corresponding to the barrier values 0.15E0, 0.1E0 and 0.05E0, respectively.
  3. This analysis therefore suggests four levels of hierarchy in triosephosphate isomerase: level 1 (0.3E0), single-domain structure; level 2 (0.25E0–0.15E0), domain 1 residues 1–63 and 125–248, domain 2 residues 65–123; level 3 (0.1E0), domain 1 residues 1–63 and 212–248, domain 2 residues 65–123, domain 3 residues 125–210; and level 4 (0.05E0), domain 1 residues 1–63 and 125–248, domain 2 residues 65–123.
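
The relation between separation points and segments used in the list above can be written down directly: each boundary residue is left out and the segments are the runs between boundaries. The helper below is a small illustrative sketch (the name `segments_from_boundaries` is hypothetical) applied to the 7tim boundaries quoted in the text and in the legend to Figure 1.

```python
def segments_from_boundaries(boundaries, n_residues):
    """Convert boundary residues (1-based) into inclusive segments,
    omitting the boundary residues themselves."""
    segments = []
    start = 1
    for b in sorted(boundaries):
        segments.append((start, b - 1))
        start = b + 1
    segments.append((start, n_residues))
    return segments

# 7tim, 248 residues: maxima at 64 and 124 (barrier 0.25E0)
print(segments_from_boundaries([64, 124], 248))
# [(1, 63), (65, 123), (125, 248)]

# Adding the boundary at 161 gives the segmentation quoted for 0.15E0
print(segments_from_boundaries([64, 124, 161], 248))
# [(1, 63), (65, 123), (125, 160), (162, 248)]
```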

Comparative analysis of the domains detected by different computational approaches

We compared the domain assignments by our method with other methods and authors’ definitions. If the assigned domain boundary is located in the interval l ± 2 (residue l is the domain boundary assigned by the other method), we shall consider the domain boundary assignments identical. This is related to the accuracy of the method, since we take into account the interaction between atoms belonging to the residues separated by at least two residues (see above). The accuracy score is calculated as follows:


where Nicor is the number of residues assigned to the same domain both by our program and another method or author definition, Ntot is the total number of residues in the protein chain, m is the number of domain boundaries that were similarly assigned both by the program and by the other methods and M is the number of domains under comparison. If the number of domains assigned by our method is not equal to the number of domains assigned by others, then M is the maximal number of domains in the compared assignments.
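
The boundary-matching criterion (l ± 2) that enters the count m can be illustrated as follows. This sketch covers only the tolerance test for boundaries; it does not reproduce the accuracy score itself, and the function name and example values are hypothetical.

```python
def matching_boundaries(ours, theirs, tol=2):
    """Count boundaries in `theirs` for which `ours` contains a boundary
    within +/- tol residues (tol = 2 as in the text). A simple sketch:
    it does not enforce a one-to-one pairing of boundaries."""
    return sum(1 for l in theirs if any(abs(b - l) <= tol for b in ours))

# Hypothetical assignments: 64 vs 62 agree (within 2), 124 vs 130 do not.
print(matching_boundaries([64, 124], [62, 130]))  # 1
```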

Detection of the closed loops and their characterization

Closed loops are defined as continuous sub-trajectories of the folded chains with small Cα–Cα distance between their ends (up to 10 Å). These are not loops in the traditional definition as connectors between elements of secondary structure (Leszczynski and Rose, 1986; Martin et al., 1995; Kwasigroch et al., 1996; Oliva et al., 1997) or so-called U-turns (Kolinski et al., 1997), which do not include loop closure points. The closed loops (Berezovsky et al., 2000c; Berezovsky and Trifonov, 2001a) connect points distantly positioned along the polypeptide chain, providing the formation of locally compact structural subunits. The Cα–Cα contacts with immediate neighbors along the sequence are not considered. Five residues are taken as the cut-off value. For the (anti)parallel α- and β-structures forming several short Cα–Cα contacts the shortest one is taken. According to the loop size distribution (Berezovsky et al., 2000c; Berezovsky and Trifonov, 2001a), the loops accepted into the mapping procedure have sizes from 15 to 50 amino acid residues.

The mapping procedure sequentially selects the tightest loops; at each step the sequence region corresponding to the mapped loop is excluded from further calculations. In the case of partial overlap, the tighter of the two loops is accepted. When two loops share fewer than five amino acid residues, both are accepted.
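
A minimal sketch of loop detection and mapping, assuming only the rules stated above (end-to-end Cα–Cα distance up to 10 Å, loop size 15–50 residues, greedy selection of the tightest loops, overlaps of fewer than five residues tolerated), is given below. Function names and the toy input are hypothetical, and indices are 0-based.

```python
import math

def candidate_loops(ca, max_dist=10.0, min_len=15, max_len=50):
    """ca: list of C-alpha coordinates (x, y, z).
    Returns (distance, start, end) tuples sorted tightest first;
    loop size is end - start + 1 residues."""
    loops = []
    n = len(ca)
    for i in range(n):
        for j in range(i + min_len - 1, min(n, i + max_len)):
            d = math.dist(ca[i], ca[j])
            if d <= max_dist:
                loops.append((d, i, j))
    return sorted(loops)

def map_loops(loops, max_shared=4):
    """Greedy mapping: accept loops from tightest to loosest, rejecting any
    loop that shares more than max_shared residues with an accepted one
    (i.e. fewer than five common residues are allowed)."""
    accepted = []
    for d, i, j in loops:
        shared_ok = all(min(j, b) - max(i, a) + 1 <= max_shared
                        for _, a, b in accepted)
        if shared_ok:
            accepted.append((d, i, j))
    return sorted((i, j) for _, i, j in accepted)

# Toy input: three candidate loops given directly as (distance, start, end)
print(map_loops([(4.0, 0, 20), (5.0, 10, 30), (6.0, 18, 40)]))
# [(0, 20), (18, 40)]
```

In the toy input the second candidate is rejected because it shares 11 residues with the tightest loop, while the third shares only three and is kept.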

‘Van der Waals walks’, i.e. interaction energy between the parts of the native globule, are plotted in Figure 2 as described previously (Berezovskii et al., 1997; Berezovsky et al., 1999). The top curves on the plots in Figure 2 correspond to the smallest value of the ‘potential barrier’ [for Jelly Roll fold (2stv) only this curve is presented]. Values of the barrier are as follows: 0.01E0 for α/β Sandwich (1aps) and 0.05E0 for other superfolds. A total of 54 sections of the van der Waals plots around the loop ends (left and right) of the nine superfolds were aligned and summed together in Figure 3.
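
The alignment and summation of curve sections around loop ends can be sketched as below. The window half-width is an assumption (it is not stated in the text); the exclusion of loop ends lying within 30 residues of the chain termini follows the legend to Figure 3. The background subtraction mentioned in that legend is not included, and the function name is hypothetical.

```python
def summed_profile(walks_and_ends, half_window=10, end_margin=30):
    """walks_and_ends: list of (walk, loop_end_positions), 0-based positions.
    Sums the sections walk[p - half_window : p + half_window + 1] over all
    loop ends p that are at least end_margin residues from both termini.
    Returns (summed_section, number_of_sections_used)."""
    if half_window > end_margin:
        raise ValueError("half_window must not exceed end_margin")
    total = [0.0] * (2 * half_window + 1)
    used = 0
    for walk, ends in walks_and_ends:
        n = len(walk)
        for p in ends:
            if p < end_margin or p > n - 1 - end_margin:
                continue  # too close to a chain terminus
            for off in range(-half_window, half_window + 1):
                total[off + half_window] += walk[p + off]
            used += 1
    return total, used
```

Position zero of the summed section (index `half_window`) corresponds to the aligned loop ends.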



Figure 2. Superposition of nearly standard size loops (1) and the ‘van der Waals walks’. (A) UB α/β Roll (1ubq). (B) α/β Sandwich (1aps). (C) Up-Down (256b). (D) Immunoglobulin fold (2rhe). (E) Doubly Wound (4fxn). (F) Globin (1thb). (G) Trefoil (1i1b). (H) Jelly Roll (2stv). (I) TIM-Barrel (7tim). Maxima on the plots correspond to the points that separate energy-independent parts of the molecule and, thus, show boundaries between domains (subdomains), segments, etc. A domain is defined as a part of the molecule with strong internal van der Waals interactions between corresponding parts and weak external ones. Domains may consist of one continuous part of the globule or can be an association of several discontinuous elements suggested by the plot. According to the procedure of domain detection [see Materials and methods for explanation, and also earlier work (Berezovskii et al., 1997; Berezovsky et al., 1999)], five proteins [α/β Sandwich (1aps), Globin fold (1thb), UB α/β Roll (1ubq), Jelly Roll (2stv) and Immunoglobulin fold (2rhe)] yield a unique single-domain structure, whereas other proteins [Trefoil fold (1i1b), TIM-Barrel (7tim), Up-Down (256b) and Doubly Wound (4fxn)] reveal a hierarchy of domain structure with distinct domains in the same protein at different levels of hierarchy. Trefoil fold (1i1b) and Doubly Wound (4fxn) can have a single-domain structure as well as a two-domain structure with the formation of one complex domain [in Trefoil fold, residues 3–41 and 104–153 (domain 1); residues 43–102 (domain 2); in Doubly Wound, residues 1–33 and 122–138 (domain 1); residues 35–120 (domain 2)]. Up-Down fold (256b) yields three levels of hierarchy with single-domain structure and distinct two-domain organizations [residues 1–75 (domain 1), residues 77–106 (domain 2) or residues 1–42 (domain 1), residues 44–106 (domain 2)]. TIM-Barrel fold reveals four levels of hierarchy with single-, two- and three-domain organizations (see legend to Figure 4 and Materials and methods).

 


Figure 3. Sum of synchronized van der Waals plots around the ends of the mapped loops. To eliminate the sequence end effects the loop ends closer than 30 amino acid residues to the ends of the sequences were excluded from the calculation. A total of 54 sections of energy curves are summed together. The background of -10 kcal/mol that corresponds to the interaction energy at the loop ends is subtracted.

 

    Results
Van der Waals segmentation and loop structure of the major superfolds

As demonstrated earlier (Berezovskii et al., 1997; Berezovsky et al., 1999), calculation of the van der Waals segments of the protein globule allows one to define boundaries and hierarchy of the domains at different energy levels. This calculation generates curves with zero levels at the start and end points. The energy profiles (see Figures 1 and 2) are rather ragged, showing numerous maxima and minima. The maxima correspond to the borders between the energy-defined independent segments. For example, in Figure 1A the profile for the whole molecule shows several maxima that split the molecule into several segments. The energy-justified splitting can be made as detailed as the number of maxima that are treated as borders. The energy profiles are then calculated separately for each segment so that other parts of the molecule do not contribute to the neighboring segments (e.g. plots in Figure 1B–E). The subdivision starts with the highest maxima observed and the procedure allows one to reveal additional maxima, which appear in the original plot as changes in the slope (shoulders) rather than maxima. Selected structural units are characterized by substantially higher internal versus external interactions.
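
The comparison of internal and external interactions of segments can be sketched as follows, given a residue-residue energy matrix and a list of segments. The second function implements only the 70% rule stated in Materials and methods; the external energy of a segment is assumed here to be the sum of absolute pairwise energies with all other segments. Function names and numerical values are hypothetical.

```python
def segment_energies(e, segments):
    """e: symmetric residue-residue energy matrix (0-based indices);
    segments: list of (start, end), 1-based inclusive.
    Returns a K x K matrix with internal energies e_ii on the diagonal
    and pairwise interaction energies e_ij off the diagonal."""
    k = len(segments)
    out = [[0.0] * k for _ in range(k)]
    for a, (sa, ea) in enumerate(segments):
        for b in range(a, k):
            sb, eb = segments[b]
            s = 0.0
            for i in range(sa - 1, ea):
                for j in range(sb - 1, eb):
                    if a != b or i < j:   # count each internal pair once
                        s += e[i][j]
            out[a][b] = out[b][a] = s
    return out

def should_merge(seg_e, i, k, share=0.7):
    """70% rule: merge i and k if more than `share` of the external energy
    of either one pertains to its interaction with the other
    (absolute values of energy are compared)."""
    for a, b in ((i, k), (k, i)):
        external = sum(abs(seg_e[a][c]) for c in range(len(seg_e)) if c != a)
        if external > 0 and abs(seg_e[a][b]) > share * external:
            return True
    return False

# Hypothetical segment energies (arbitrary units)
seg_e = [[-50.0, -8.0, -1.0],
         [-8.0, -40.0, -2.0],
         [-1.0, -2.0, -30.0]]
print(should_merge(seg_e, 0, 1))  # True: 8 of 9 units of segment 0's external energy
print(should_merge(seg_e, 0, 2))  # False
```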

The top curves in Figure 2A–I demonstrate more detailed segmentation of the globules for the smallest values of the ‘potential barriers’ (the notion of the ‘potential barrier’ is explained in detail and exemplified in Materials and methods). Inspection of Figure 2A–I shows the typical size of these segments: 10–50 amino acid residues. Similar sizes are characteristic of closed loops (Berezovsky et al., 2000c; Berezovsky and Trifonov, 2001a), that is, returns of the chain trajectory with tight Cα–Cα contacts.

A comparison of maps of domain boundaries with borders between primary closed loops (indicated by bars above the energy curves in Figure 2) shows that these two maps are rather similar. That is, the majority of the loop ends correspond to the peaks on the energy curves (loop mapping error bars: ±3 amino acid residues). Table I contains closed loops mapped in nine major superfolds and van der Waals segments of respective structures selected at the lowest level of the ‘potential barrier’ (see Materials and methods). Sets of the loops and the segments contain 38 and 48 entities, respectively. Among 58 internal (not at the ends of the protein) loop ends there are 36 located at the respective borders of van der Waals segments (bold in Table I).
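
The kind of comparison described above can be illustrated with a simple tolerance test: a loop end is counted as coinciding with a segment border if it lies within ±3 residues of it. The sketch below uses the 7tim loops quoted in the text and the borders implied by the 0.05E0 segmentation in the legend to Figure 1; the matching criterion and the function name are assumptions, so the resulting count need not coincide with the entries marked in Table I.

```python
def ends_at_borders(loop_ends, borders, tol=3):
    """Return the loop ends lying within +/- tol residues of any border."""
    return [p for p in loop_ends if any(abs(p - b) <= tol for b in borders)]

# 7tim loops as listed in the text
loops = [(9, 40), (41, 62), (62, 90), (95, 126),
         (128, 166), (177, 210), (207, 228), (229, 243)]
# Borders implied by segments 1-63, 65-91, 93-103, 105-123,
# 125-160, 162-173, 175-210 and 212-248
borders = [64, 92, 104, 124, 161, 174, 211]

ends = [p for loop in loops for p in loop]
matched = ends_at_borders(ends, borders)
print(len(matched), "of", len(ends), "loop ends lie within 3 residues of a border")
```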


Table I. Location of the closed loops and van der Waals segmentation in nine major superfolds
 
Quantitative agreement of the loop borders with energy plots is further demonstrated by Figure 3, where the sections of energy curves around the loop ends of the nine superfolds are aligned and summed together. Thus, Figure 3 represents an averaged interaction energy in the vicinity of the loop end (position zero, see Materials and methods). The average interaction energy shows a clear maximum at the aligned loop ends, confirming the match between the loop ends and the borders of van der Waals segments. Together with the similarity of the sizes, this gives a basis to consider closed loops as common elementary units of protein structure and stability. In some cases (for the smallest values of the potential barrier), there are several small segments inside the closed loop. These correspond to local, rather compact pieces with high energy content.

Hierarchy of domain structure

There are many cases where a fold can be dissected in alternative ways (different techniques and/or authors). A principle of the systematic comparison of the domain assignments by different techniques (see Materials and methods) was developed earlier (Berezovsky et al., 1999). In this work, the procedure was applied to the domain assignments of the major superfolds. Single-domain assignments made by the van der Waals technique are in full accordance with the same conclusions by different techniques for the following superfolds: single-domain structure in Trefoil fold (1i1b) coincides with the result of the DOMAK program (Siddiqui and Barton, 1995); single-domain structures in α/β Sandwich (1aps), Doubly Wound (4fxn) and TIM-Barrel (7tim) have also been detected (Islam et al., 1995); both the DOMAK program and the algorithm developed by Islam et al. (Islam et al., 1995) show a single domain for Up-Down (256b) and Jelly Roll (2stv) folds. Three-domain assignment for TIM-Barrel fold coincides with that made by the DOMAK program (accuracy score 90%). In addition, the van der Waals approach demonstrates other variants of domains in these structures.

Inspection of known alternative van der Waals domain structures for the major superfolds reveals that their domains and subdomains also consist of closed loops. For example, formation of a two-domain structure in the Trefoil fold (1i1b) is achieved by the contribution of loops 43–63 and 70–99 to the two-loop domain 43–102 while the rest of the structure is a complex domain of two parts: the upstream part (residues 3–41) contains loop 19–40 and region 104–153 contains loops 101–122 and 122–144. In the Doubly Wound fold (4fxn), segment 1–33 (loop 1–30), being in strong interaction with the last loop 112–134, forms a complex domain of residues 1–30 and 122–138. At the same time, the second domain (residues 32–120) is made of two other loops (35–66 and 78–107). The Up-Down fold (256b) yields two variants of a two-domain structure (in addition to a single-domain description): domain 1 (residues 1–75; loops 10–33 and 41–62) and domain 2 (residues 77–106; loop 68–94) or, alternatively, domain 1 (residues 1–42; loop 10–33) and domain 2 (residues 44–106; loops 41–62 and 68–94). Finally, the domain structure of the TIM-Barrel fold can vary from a single-domain to a two- or three-domain organization. Loops 9–40 (black, Figure 4B) and 41–62 (light grey, Figure 4B) (segment 1–63) and loops 128–166 (grey, Figure 4D), 177–210 (light grey, Figure 4D), 207–228 (grey, Figure 4B) and 229–243 (light grey, Figure 4B) (segment 125–248) form the first domain of the two-domain structure, and loops 62–90 (black, Figure 4C) and 95–126 (light grey, Figure 4C) (segment 65–123) form the second domain. A three-domain structure is generated by strong interaction between loops in the following order: loops 9–40 (black), 41–62 (light grey), 207–228 (grey) and 229–243 (light grey) are in the first domain (segments 1–63 and 212–248; Figure 4B); loops 62–90 (black) and 95–126 (light grey) form a second domain (residues 65–123; Figure 4C), while the third domain (residues 125–210; Figure 4D) consists of the loops 128–166 (grey) and 177–210 (light grey). Hence there is an obvious correlation of maxima of van der Waals plots with positions of the loop ends. The interaction of the loops and their closure results in the formation/alteration of domains at different levels of the energy hierarchy. Whichever domain is considered, it consists of one, two or any number of nearly standard size loop-n-lock elements.
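
The regrouping of loops into alternative domains can be illustrated by assigning each loop to the domain whose residue ranges it overlaps most. Maximum overlap is an assumption made for this sketch, not a rule stated in the text; with the 7tim loops and domain ranges quoted above it returns the same groupings as described for the two- and three-domain structures. Function names are hypothetical.

```python
def overlap(loop, segment):
    """Number of residues shared by two inclusive residue ranges."""
    return max(0, min(loop[1], segment[1]) - max(loop[0], segment[0]) + 1)

def group_loops(loops, domains):
    """domains: dict mapping a domain label to a list of (start, end) ranges.
    Each loop is assigned to the domain with the largest total overlap."""
    groups = {name: [] for name in domains}
    for loop in loops:
        best = max(domains,
                   key=lambda name: sum(overlap(loop, s) for s in domains[name]))
        groups[best].append(loop)
    return groups

loops = [(9, 40), (41, 62), (62, 90), (95, 126),
         (128, 166), (177, 210), (207, 228), (229, 243)]

three_domains = {1: [(1, 63), (212, 248)], 2: [(65, 123)], 3: [(125, 210)]}
two_domains = {1: [(1, 63), (125, 248)], 2: [(65, 123)]}

print(group_loops(loops, three_domains))
print(group_loops(loops, two_domains))
```

The same eight loops are thus distributed differently over the domains at the two levels of hierarchy, without any change to the loops themselves.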



Figure 4. Domain structure of triosephosphate isomerase (7tim). (A) Three domains of the TIM-Barrel fold: domain 1 (residues 1–63, 212–248, dark grey), domain 2 (residues 65–123, grey), domain 3 (residues 125–210, light grey). (B–D) Loop representation of the three domains of the TIM-Barrel fold. (B) Domain 1 (residues 1–63 and 212–248): loops 9–40 (black), 41–62 (light grey), 207–228 (grey) and 229–243 (light grey). (C) Domain 2 (residues 65–123): loops 62–90 (black) and 95–126 (light grey). (D) Domain 3 (residues 125–210): loops 126–166 (grey) and 177–210 (light grey).

 

    Discussion
Hierarchical subdivisions of the van der Waals domains provide common ground for the reconciliation of traditional definitions of domains. An important advantage of this approach is the possibility of detecting structural domains that involve any number of continuous or discontinuous segments of the polypeptide chain. Moreover, van der Waals segmentation eventually leads to the elucidation of the levels of energy hierarchy, which correspond to distinct sets of structural domains (Berezovsky et al., 1999). We demonstrated here the existence of several levels of energy hierarchy in β-lactamase (4blm, Figure 5) and triosephosphate isomerase (7tim, Figure 4) with distinct sets of domains. In both cases clear levels of hierarchy revealed by the van der Waals approach show single-, two- and three-domain structures. Where analytical approaches were used, alternative conclusions on domain structure in β-lactamase 4blm were made (Islam et al., 1995; Siddiqui and Barton, 1995). Islam et al. suggested single-domain assignment, whereas the technique of Siddiqui and Barton showed a two-domain structure. The van der Waals approach detects both variants of domain structure as the first and third levels of energy hierarchy (with accuracy scores equal to 100 and 67%, respectively). The latter case is an example of a multidomain protein with a discontinuous domain structure: discontinuous sections 31–60 and 216–291 make a separate domain (Siddiqui and Barton, 1995), confirmed also by our energy calculations (Berezovsky et al., 1999). In addition, another variant of two-domain structure is suggested. In the case of triosephosphate isomerase (7tim, TIM-Barrel fold), the van der Waals approach also selects single-, two- and three-domain structures at different levels of hierarchy (Figure 4). Here, a single-domain structure defined in the X-ray experiment and a three-domain structure revealed by an analytical approach (Siddiqui and Barton, 1995) belong to different levels of energy hierarchy detected by the van der Waals approach (accuracy scores are 100 and 90%, respectively). Thus, the hierarchical energy calculations allow one to single out domains of interest with regard to function, structure, energy or other properties by essentially regrouping the energy-defined segments.



Figure 5. Domain structure of β-lactamase (4blm). (A) Residues 31–60 (light grey), 62–214 (grey) and 216–291 (black). (B) Two-domain structure (variant 1): domain 1 (residues 31–214, light grey), domain 2 (residues 216–291, black). (C) Two-domain structure (variant 2): domain 1 (residues 62–214, light grey), domain 2 (residues 31–60, 216–291, black).

 
Domain boundaries defined by the van der Waals energy approach match the closed loop boundaries well. Considering the inaccuracies in loop mapping and in energy calculations, this match is rather surprising. The correlation is best seen, for example, in the case of a Trefoil fold (1i1b) or a TIM-Barrel fold (7tim) in Figure 2G and I, respectively. Detailed inspection of domains and subdomains of different levels shows that it is not merely a correlation: the domains are actually made of the closed loops. Figures 4 and 5 illustrate all the variants of domain structures in β-lactamase (4blm) and triosephosphate isomerase (7tim), with the loops regrouped accordingly. In β-lactamase (4blm), for example, the alteration of domain structure depends on the status of the loop 31–60 (light grey, Figure 5A). A reassignment of this loop results in alternative two-domain structures: domains 31–214 (light grey, Figure 5B) and 216–291 (black, Figure 5B) or domains 31–60, 216–291 (black, Figure 5C) and 62–214 (light grey, Figure 5C). A division of the triosephosphate isomerase (7tim) into eight loops provides variants from one to three domains (Figure 4). Alteration of domains at different levels of hierarchy is also defined in this case by the regrouping of the loops and interactions between them. Van der Waals locking of closed loops and additional secondary locks with the formation of domains (subdomains) prove that van der Waals interactions play a major role in the formation of mature globular structures, loops and domains equally. Directed interactions (hydrogen bonds or ion pairs) play an additional role in the local stabilization of compact structural elements (e.g. α-helices) or modulate interactions between enthalpy-driven stable structures (loops, subdomains, domains). The exceptional role traditionally ascribed to directed interactions is an overestimation of their marginal role in the overall globule stability. Directed interactions are always saturated either by interaction between respective groups inside the globular structure or by the contacts with water and counter-ions. Therefore, they could only provide a small advantage for a folded versus an unfolded structure. Van der Waals closure of the loop ends and additional (secondary) distant van der Waals contacts serve as major folding enthalpy contributors. A closed loop can therefore be considered as an elementary unit of domain structure, and interactions between loops provide the diversity of domain structures in globular proteins.


    Acknowledgments
 
Professor E.N.Trifonov’s stimulating discussions and thoughtful comments and Professor M.D.Frank-Kamenetskii’s critical reading of the manuscript and fruitful discussions are greatly appreciated. I am grateful to Mrs. A.Weinberg for editing of the text. I.N.B. is a Post-Doctoral Fellow of the Feinberg Graduate School at the Weizmann Institute of Science.


    References
Baldwin,R.L. and Rose,G.D. (1999a) Trends Biochem. Sci., 24, 26–33.

Baldwin,R.L. and Rose,G.D. (1999b) Trends Biochem. Sci., 24, 77–83.

Berezovskii,I.N. and Tumanyan,V.G. (1995) Biophysics, 40, 1181–1187.

Berezovskii,I.N., Esipova,N.G. and Tumanyan,V.G. (1997) Biophysics, 42, 557–565.

Berezovsky,I.N. and Trifonov,E.N. (2001a) Protein Eng., 14, 403–407.

Berezovsky,I.N. and Trifonov,E.N. (2001b) J. Mol. Biol., 307, 1419–1426.

Berezovsky,I.N., Tumanyan,V.G. and Esipova,N.G. (1997) FEBS Lett., 418, 43–46.

Berezovsky,I.N., Namiot,V.A., Tumanyan,V.G. and Esipova,N.G. (1999) J. Biomol. Struct. Dyn., 17, 133–155.

Berezovsky,I.N., Esipova,N.G., Tumanyan,V.G. and Namiot,V.A. (2000a) J. Biomol. Struct. Dyn., 17, 799–809.

Berezovsky,I.N., Esipova,N.G. and Tumanyan,V.G. (2000b) J. Comput. Biol., 7, 183–192.

Berezovsky,I.N., Grosberg,A.Y. and Trifonov,E.N. (2000c) FEBS Lett., 466, 283–286.

Crippen,G.M. (1978) J. Mol. Biol., 126, 315–332.

Doolittle,R.F. (1995) Annu. Rev. Biochem., 64, 287–314.

Dunfield,L.G., Burgess,A.W. and Scheraga,H.A. (1978) J. Phys. Chem., 82, 2609–2616.

Islam,S.A., Luo,J. and Sternberg,M.J.E. (1995) Protein Eng., 8, 513–525.

Jones,S., Stewart,M., Michie,A., Swindells,M.B., Orengo,C. and Thornton,J.M. (1998) Protein Sci., 7, 233–242.

Kendrew,J.C., Bodo,G., Dintzis,H.M., Parrish,R.G., Wyckoff,H. and Phillips,D.C. (1958) Nature, 181, 662–666.

Kolinski,A., Skolnick,J., Godzik,A. and Hu,W.-P. (1997) Proteins, 27, 290–308.

Kwasigroch,J.M., Chomilier,J. and Mornon,J.P. (1996) J. Mol. Biol., 259, 855–872.

Leszczynski,J.F. and Rose,G.D. (1986) Science, 234, 849–855.

Martin,A.C.R., Toda,K., Stirk,H.J. and Thornton,J.M. (1995) Protein Eng., 8, 1093–1101.

Nemethy,G., Pottle,M.S. and Scheraga,H.A. (1983) J. Phys. Chem., 87, 1883–1887.

Oliva,B., Bates,P.A., Querol,E., Aviles,F.X. and Sternberg,M.J.E. (1997) J. Mol. Biol., 266, 814–830.

Orengo,C.A., Jones,D.T. and Thornton,J.M. (1994) Nature, 372, 631–634.

Porter,R.R. (1959) Biochem. J., 73, 119–126.

Rose,G.D. (1979) J. Mol. Biol., 134, 447–470.

Siddiqui,A.S. and Barton,G.J. (1995) Protein Sci., 4, 872–884.

Wernisch,L., Hunting,M. and Wodak,S. (1999) Proteins, 35, 338–352.

Received March 18, 2002; revised December 13, 2002; accepted January 24, 2003.