Systèmes Moléculaires et Biologie Structurale, LMCP, Universités Paris 6 et Paris 7, CNRS UMR 7590, Case 115, 75252 Paris cedex 05, France
1 To whom correspondence should be addressed. e-mail: jacques.chomilier{at}lmcp.jussieu.fr or jean-paul.mornon{at}lmcp.jussieu.fr
Abstract |
---|
Keywords: homology modeling/hydrophobic core/minimal surface/protein folding/secondary structure
Introduction |
---|
RUSSIA uses the secondary structure prediction from existing programs, such as PHD (Rost et al., 1994) or HMM (Eddy, 1998). In the present model, β-sheets (instead of β-strands) and α-helices are considered as secondary structure elements (SSE), the basic building blocks of a structure. The α-helix geometry is invariant, in the sense that it has no internal degree of freedom, whereas the architecture of β-sheets can be altered by the relative displacements (shifts) of the strands involved in the sheet, giving rise to different initial conformations for one sheet of known composition and topology. Each SSE is reduced to a single point, the geometric center of its hydrophobic residues. The global 3D structure is produced by successive displacements of these points towards their common geometric center. To avoid steric hindrance during these displacements, a repulsive potential, based on an analysis of mean inter-residue distances, is used together with the constraint that each loop must accommodate a given number of residues. From this simple force field, several structures are proposed and, in the final step, a choice is made among them.
The RUSSIA methodology |
---|
To represent the residues ri of a sequence, we defined the vector ri = ri(a,x,y,z), where a is the nature of amino acid i and (x, y, z) are the coordinates of its α-carbon. We chose to model the residues by material points, placed at their Cα atom positions, as these positions are well determined in the SSE and do not depend on rotamers. Within a given SSE, the spatial positions of residues are fixed during one simulation.
The topology of SSE
The topology of β-sheet surfaces emerges from the study of the geometry of the border between two media (Osserman, 1986; Nitsche, 1989; Safran, 1994) and, in particular, from the study of soap films (Fomenko, 1986). Let us consider two immiscible liquids in a container. When the system is in equilibrium, the border between the two media is a smooth two-dimensional (2D) surface whose geometry is governed by this equilibrium. Here, we introduce, from differential geometry, the notion of the average curvature of a surface (Frankel, 1997). To calculate the average curvature at a given point of a smooth 2D surface embedded in 3D Euclidean space, one considers the curvatures (second-order derivatives) of all curves produced by intersecting the surface with planes containing the vector normal to the surface at this point. The sum of the largest and the smallest of these curvatures is called the average curvature of the surface at this point. From the Poisson theorem we know that the average curvature of an inter-media border in equilibrium is constant at all regular points.
If we consider a soap film introduced between two media with pressures p1 and p2, its average curvature will be λ(p1 − p2), where 1/λ is the surface tension (a constant, depending on the nature of the film). If p1 = p2, the average curvature of the surface is equal to zero. Such surfaces are called minimal surfaces. The term minimal means that any small distortion of the surface in the neighborhood of a given point does not decrease its area.
The β-sheets are approximated by a minimal surface, the helicoid (Znamenskiy et al., 2000). Another minimal surface is the catenoid, produced by rotation around the x-axis of the curve y = a cosh(x/a), where a is a constant (this curve also describes the shape of a chain hanging between two fixed points). This surface can be used to model β-sheet barrels (Lasters et al., 1988; Lasters, 1990; Flower, 1994; King et al., 1994; Murzin et al., 1994).
From the Cα coordinates of an SSE, a vector field v can be constructed. The vector at each Cα position can be defined as the sum of the two vectors that connect Cα(i) with Cα(i − 1) and Cα(i + 1) (Figure 1a). Between the Cα positions, the vector field can be smoothly interpolated. The integral curve C of the field v can then be traced to model the backbone chain for β-strands (Figure 1b) as well as for α-helices (Figure 1c).
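The construction of the field at the Cα positions can be sketched as follows. This is only an illustration under stated assumptions: the two bond vectors are taken head to tail (from Cα(i − 1) to Cα(i) and from Cα(i) to Cα(i + 1)), so that their sum follows the chain direction, and the interpolation between atoms is taken to be linear. The function names `tangent_field` and `interpolate` are hypothetical.

```python
def tangent_field(ca_coords):
    """Return one vector per interior C-alpha atom.

    Assumption: the two bond vectors are oriented head to tail,
    (C[i] - C[i-1]) + (C[i+1] - C[i]), so their sum follows the chain.
    """
    field = []
    for i in range(1, len(ca_coords) - 1):
        prev, cur, nxt = ca_coords[i - 1], ca_coords[i], ca_coords[i + 1]
        field.append(tuple((c - p) + (n - c) for p, c, n in zip(prev, cur, nxt)))
    return field


def interpolate(v_start, v_end, t):
    """Linear blend between two neighbouring field vectors, 0 <= t <= 1.

    Linear interpolation is an assumption; the text only asks for smoothness.
    """
    return tuple((1.0 - t) * a + t * b for a, b in zip(v_start, v_end))


# Three C-alpha atoms forming a small zigzag in the xy-plane.
zigzag = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 0.0, 0.0)]
print(tangent_field(zigzag))
```

With this orientation the zigzag of a strand cancels out and the field points along the strand axis, which is what an integral curve tracing the backbone needs.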
β-Sheet modeling.
To decrease the number of degrees of freedom of the model, we have previously shown that gathering the β-strands of a sheet onto a single helicoid surface (Figure 4) is a pertinent description of this regular supersecondary structure (Znamenskiy et al., 2000). The algorithm assumes the twisted architecture of β-sheets (Ho and Curmi, 2002), which is often observed in the PDB (Berman et al., 2000) and is due to the lone pair of the backbone nitrogens, which tends to eclipse the Cα–C bond, resulting in a decrease in the corresponding backbone dihedral angle (Shamovsky et al., 2000). Nevertheless, owing to the good convergence of the algorithm, slightly bent β-sheets can also be considered, at the cost of a lower precision for the resulting structure. For three- and four-stranded β-sheets, the average root mean square (r.m.s.) deviation between the twisted models of β-sheets and all the β-sheets from the PDB, including the bent β-sheets, is 1.5 Å, making the twisted model representative for these classes of β-sheets.
In the present version of RUSSIA, we focused on proteins whose folding relies on the compactness of SSE. Hence it is not designed to propose models of folds made of single sheets, as in the case of some toxins (see, for instance, bucandin, PDB code 1f94), where the fold is maintained by disulfide bridges rather than by SSE interactions. Our method starts with sheets of three strands, hence smaller sheets are not considered. Even when more strands are involved, bulged or bent sheets are so far beyond the reach of this method.
In practice, the cylinder radii correspond to the distances between each strand and the z-axis. The step of the helicoid surface is given by 2πH and each strand is located by its angle of rotation around the z-axis. The conformation of a particular sheet is given by the twist T = 360°/2πH, related to a fixed value of H (Znamenskiy et al., 2000). The strands are obtained as the intersections of cylinders of radius r with the helicoid, whose equations are given in Figure 4. The Cα atoms of a strand are positioned along these intersection curves, alternately on either side of the helicoid surface at a distance of 0.9 Å from it, on the cylinder surface and 3.8 Å from each other, as shown in the inset of Figure 4. The radius is invariant for a given strand and the distance between two adjacent strands is 4.7 Å. The β-sheet architecture is defined by the number of residues in a sheet with fixed relative topology, i.e. strands cannot move from one end to the other in the sheet. To overcome this limitation, one can enumerate all possible topologies as starting conformations and run the program as many times as necessary, taking advantage of a few privileged topologies (Znamenskiy et al., 2000). Furthermore, it was demonstrated that the number of allowed topologies for both β-sandwich and α-helix packing can be decreased because of the restrictions due to steric and connectivity constraints (Cohen et al., 1980).
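The placement of Cα atoms on a strand can be illustrated with a short sketch. It assumes the usual helicoid parametrization (ρ cos θ, ρ sin θ, Hθ), which may differ from the exact equations of Figure 4, and it approximates the 3.8 Å Cα–Cα distance by choosing the spacing along the intersection curve so that atoms offset by ±0.9 Å stay about 3.8 Å apart. The radius and H used in the usage example are hypothetical values, and all function names are illustrative.

```python
import math

CA_CA = 3.8    # distance between consecutive C-alpha atoms (from the text), in angstroms
OFFSET = 0.9   # distance of each C-alpha from the helicoid surface (from the text)


def curve_point(r, h, theta):
    """Point of the cylinder/helicoid intersection curve (assumed parametrization)."""
    return (r * math.cos(theta), r * math.sin(theta), h * theta)


def unit_normal(r, h, theta):
    """Unit normal of the helicoid at radius r; it is tangent to the cylinder."""
    norm = math.hypot(r, h)
    return (h * math.sin(theta) / norm, -h * math.cos(theta) / norm, r / norm)


def strand_angles(r, h, n_res):
    """Angles of the residues, spaced so that neighbours end up ~3.8 A apart."""
    along = math.sqrt(CA_CA ** 2 - (2 * OFFSET) ** 2)  # spacing along the curve
    d_theta = along / math.hypot(r, h)                 # arc length per radian is sqrt(r^2 + h^2)
    return [i * d_theta for i in range(n_res)]


def strand_ca(r, h, n_res):
    """C-alpha coordinates alternating on either side of the helicoid surface."""
    atoms = []
    for i, theta in enumerate(strand_angles(r, h, n_res)):
        side = OFFSET if i % 2 == 0 else -OFFSET
        p, n = curve_point(r, h, theta), unit_normal(r, h, theta)
        atoms.append(tuple(pc + side * nc for pc, nc in zip(p, n)))
    return atoms


# Hypothetical strand: radius 5 A, H = 10 A per radian, six residues.
for atom in strand_ca(5.0, 10.0, 6):
    print(tuple(round(c, 2) for c in atom))
```

Because the helicoid normal has no radial component, the offset atoms remain close to the cylinder surface, in line with the description above.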
This algorithm is limited to a small number of strands (4–5), hence full enumeration of the topologies is feasible. For a four-stranded sheet the total number of topologies is only 16, as long as one ignores the sequential order, and 96 when loops are considered, i.e. when connectivity is included. Furthermore, it is known that not all topologies are equally probable: when the numbers of observed topologies are compared with the numbers of theoretical topologies for three- to eight-stranded β-sheets, it is clear that the current structural database represents only a small fraction of the possible topologies (Zhang and Kim, 2000; Znamenskiy et al., 2000). Hence one can drastically reduce the number of investigations and limit it to the known folds. As the program is fairly rapid, one can hope in the near future to generate from scratch all already indexed topologies and then select the best ones among the computed models.
Building of the SSE as rigid blocks and choice of their initial positions
In globular proteins, most SSE have one side exposed to the solvent and the other facing the inner part of the protein. Statistically, hydrophobic residues are less frequent on the solvent-exposed side of SSE. Therefore, we can distinguish a region with a high density of hydrophobic residues on the SSE surface, which we call the hydrophobic face (HF), characterized by its hydrophobic face center (HFC). Residues V, I, M, W, Y, L, F and C were considered as hydrophobic following the results of several compilations of experimental data (Callebaut et al., 1997; Soyer et al., 2000; Hennetin et al., 2003). The RUSSIA algorithm considers each HFC as a point of attraction, guiding its SSE during the folding process. Once the initial relative positioning of these HFCs has been performed (the choice of initial positions is described later), the various HFCs drive the SSE towards the geometric center of the protein, until an equilibrium position is reached. Hence, the HFC of a given SSE represents the attraction point for other SSE in the dynamic part of the algorithm and the motor of the algorithm is the collapse of these HFCs.
α-Helix hydrophobic face definition.
To determine the HFC of the α-helices, only the angular positions of hydrophobic residues were considered. In a canonical α-helix with 3.6 residues per turn, a Cα atom can occupy one of 18 discrete angular positions in a plane perpendicular to the axis of the local cylindrical coordinate system (see Figure 3).
A connected region was defined as successive angular positions occupied by hydrophobic residues on the base circle, allowing the insertion of at most one non-hydrophobic residue between a pair of hydrophobic ones (see Figure 5). The hydrophobic face of the α-helix was defined as the set of hydrophobic residues belonging to the longest connected region. The angular coordinate of the HFC of an α-helix was defined as the weighted sum of the Cα angular coordinates of the hydrophobic residues belonging to the hydrophobic face. For helices longer than 18 amino acids, one angular position can be occupied by more than one residue. To choose the weights for each hydrophobic residue in the sum, we considered a training set of five four-helix bundle proteins: ferritin (PDB code 1fha), granulocyte-macrophage colony-stimulating factor (1gmf), hemerythrin (1hmd), apolipoprotein E3 (1lpe) and myohemerythrin (2mhr). Using the simulated annealing algorithm (Press et al., 1991), we derived the weights for hydrophobic residues that determined the HFC producing the best orientation of an α-helix towards the center of a protein. Four hydrophobic amino acid species were considered to derive the HFC of the helix: Cys (with a weight of 4.4), Leu (2.5), Met (1.3) and Phe (1.4). This choice relies on the fact that, in our test set, when one of these four amino acids is present in the HF of a helix, its position is very close to the HFC. This was done on a trial and error basis and, as it gave a reasonably successful packing of the structure, this rule of thumb has not been further refined in the present state of the algorithm. If a helix has none of these four residues, then Ile, Val, Trp and Tyr are considered with equivalent weights. Once the angular coordinate has been set, the HFC is placed at the middle point along the α-helix axis. This choice of the HFC determination produces the most compact structures.
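The determination of the angular coordinate of a helix HFC can be sketched as below. Several points are assumptions rather than statements of the original method: residue i is assigned the angular slot 100° × i (18 slots of 20°), the tolerated insertion is read as a single empty slot between two occupied ones, and the "weighted sum" of angles is computed as a weighted circular mean. The residue set and the four weights are those quoted in the text; the function names are hypothetical.

```python
import math

HYDROPHOBIC = set("VIMWYLFC")
PRIMARY_WEIGHTS = {"C": 4.4, "L": 2.5, "M": 1.3, "F": 1.4}  # weights quoted in the text
N_POSITIONS = 18        # 18 angular slots of 20 degrees on the base circle
SLOTS_PER_RESIDUE = 5   # 100 degrees per residue = 5 slots


def longest_region(occupied):
    """Longest circular run of occupied slots, tolerating single empty slots."""
    best = []
    for start in range(N_POSITIONS):
        if not occupied[start]:
            continue
        run, gap = [start], 0
        for step in range(1, N_POSITIONS):
            slot = (start + step) % N_POSITIONS
            if occupied[slot]:
                run.append(slot)
                gap = 0
            else:
                gap += 1
                if gap > 1:   # two empty slots in a row end the region
                    break
        if len(run) > len(best):
            best = run
    return best


def helix_hfc_angle(sequence):
    """Angular coordinate (degrees) of the HFC, or None without hydrophobic residues."""
    slots = [(i * SLOTS_PER_RESIDUE) % N_POSITIONS for i in range(len(sequence))]
    occupied = [False] * N_POSITIONS
    for aa, slot in zip(sequence, slots):
        if aa in HYDROPHOBIC:
            occupied[slot] = True
    face = set(longest_region(occupied))
    members = [(aa, slot) for aa, slot in zip(sequence, slots)
               if aa in HYDROPHOBIC and slot in face]
    if not members:
        return None
    weighted = [(PRIMARY_WEIGHTS[aa], slot) for aa, slot in members if aa in PRIMARY_WEIGHTS]
    if not weighted:  # no Cys/Leu/Met/Phe: fall back to equal weights for I, V, W, Y
        weighted = [(1.0, slot) for _, slot in members]
    x = sum(w * math.cos(math.radians(20 * s)) for w, s in weighted)
    y = sum(w * math.sin(math.radians(20 * s)) for w, s in weighted)
    return math.degrees(math.atan2(y, x)) % 360.0


print(helix_hfc_angle("LAAAL"))
```

A circular mean is used here instead of a plain arithmetic sum so that a face straddling the 0°/360° boundary is handled consistently; this is a design choice of the sketch.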
For β-sheets, the HFCs are calculated as the unweighted geometrical centers of the Cα atoms of the hydrophobic residues on each side. As the hydrophobic face of a sheet contains more hydrophobic residues than that of a helix, we found no statistical correspondence between the HFC and the nature of the amino acids. Two independent HFCs are obtained, one for each side of a β-sheet (Figure 6).
For the HELIX model with three helices, the HFCs are placed at the vertices of an equilateral triangle and, in the case of four helices, at the vertices of a square. The initial position of helix i is determined by the angle of rotation φi around the axis passing through the HFC of this helix and C, the geometric center of all HFCs (Figure 7). We found that the optimal number of angular positions φ covering the whole conformational space of an α-helix is 36, yielding a 10° grid. In other words, the initial conformations correspond to 36 possible positions for each helix, obtained by rotating each helix around this axis. Hence, for a three-helix bundle we obtained 36³ initial positions and 36⁴ for a four-helix bundle. The actual number of initial positions, once the loop connectivity conditions are applied, is much smaller.
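The size of this grid of initial orientations is easy to check by enumeration. The sketch below is illustrative only: `initial_orientations` is a hypothetical name, and the `allowed` argument is a placeholder for the loop-connectivity filter, whose actual form is not reproduced here.

```python
from itertools import product


def initial_orientations(n_helices, step_deg=10, allowed=lambda angles: True):
    """Yield every tuple of initial rotation angles (degrees) on a regular grid.

    `allowed` is a hypothetical hook standing in for the loop-connectivity
    conditions that discard prohibited starting positions.
    """
    grid = range(0, 360, step_deg)  # 36 positions for a 10 degree step
    return (angles for angles in product(grid, repeat=n_helices) if allowed(angles))


# Number of starting conformations for a three-helix bundle before filtering.
print(sum(1 for _ in initial_orientations(3)))
```

Returning a generator keeps memory use flat, which matters for the four-helix case where the unfiltered grid already holds 36⁴ combinations.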
For the MIXED model, an α-helix is placed in front of a β-sheet. Its HFC is placed along the normal to the β-sheet plane passing through the HFC of the sheet side facing the helix. A second α-helix can be set on the other side of the β-sheet. Both are at an initial distance h = 8 Å from the sheet, along the normal to the sheet passing through the HFC facing the considered helix (Figure 9). Among the MIXED models, the simplest case is a structure containing one α-helix facing one β-sheet and, at the other extreme, there can be several helices on each side of the β-sheet. The maximum number of α-helices that one can position on one side of a sheet depends on the numbers of residues involved in these SSE.
As the bulk of the RUSSIA algorithm relies on geometry, the first step was to produce a force field able to attract the SSE, each reduced to a single point, its HFC. In addition to this attraction, a repulsion must be introduced in order to prevent the rigid blocks from inter-penetrating during the simulation. It was derived from the actual mean equilibrium distances between SSE in a set of structures. This training set consisted of 32 895 residue pairs, extracted from the June 2000 release of the PDB (Berman et al., 2000), limited to a maximum of 25% sequence identity between any pair. SSE assignment was provided by DSSP (Kabsch and Sander, 1983). For each residue ri of a given protein, we measured the distances dij between the Cα of this residue and the Cα of any residue rj in the sequence, provided that residues ri and rj belong to different SSE (Vendruscolo et al., 2000). Recall that RUSSIA assembles SSE as rigid blocks and that loop residues are not treated explicitly. Therefore, we did not consider a distance dij if any residue in the pair (ri, rj) belonged to a loop. To be coherent with the fact that all strands of a sheet constitute one rigid block, and for ease of computing, we did not consider distances between residues if both belonged to β-strands.
From the set of distances dij, the closest distance di was determined for each residue ri, giving a set D of 32 895 closest distances. These distances were then grouped according to the chemical nature of the amino acid pair (ai, aj), giving a matrix of 400 cells, where each cell is a list of closest distances:

$$D(a_i,a_j) = (d_{ij,1}, \ldots, d_{ij,n_{ij}}) \qquad (1)$$

where nij is the number of minimal distances found in the whole data set for the amino acid types ai and aj. We analyzed the distribution of distances for each of the 400 D(ai,aj). The number of distance classes lij was chosen from the sample size nij according to the Sturges equation (see, for instance, Wonnacott and Wonnacott, 1995):

$$l_{ij} \approx 1 + 1.44 \ln n_{ij} \qquad (2)$$

The representative distance dij* = d*(ai,aj) for each pair of amino acids was assumed to be the mode of the distribution D(ai,aj), i.e. the value for which the density of the distribution is maximal. In the case of multimodal distributions, the mode corresponding to the smallest value of dij* was selected. This is consistent with the fact that the algorithm is driven by the maximal compactness of the hydrophobic core of proteins. Hence we obtained a 20 × 20 symmetric matrix giving a value of the representative distance between any pair of amino acids (Bryant and Lawrence, 1993), presented in Table I.
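The binning rule and the choice of a representative distance can be sketched for a single amino acid pair as follows. The sketch is a simplification: the number of classes is rounded to the nearest integer, the mode is taken as the center of the most populated class, and the multimodal rule is reduced to breaking ties in favor of the smallest distance. The function names and the sample distances are hypothetical.

```python
import math


def sturges_classes(n):
    """Number of histogram classes for a sample of size n (equation 2, rounded)."""
    return max(1, round(1 + 1.44 * math.log(n)))


def representative_distance(distances):
    """Center of the most populated class; ties go to the smallest distance."""
    n_classes = sturges_classes(len(distances))
    low, high = min(distances), max(distances)
    width = (high - low) / n_classes or 1.0   # guard against identical distances
    counts = [0] * n_classes
    for d in distances:
        k = min(int((d - low) / width), n_classes - 1)  # the maximum falls in the last class
        counts[k] += 1
    k_mode = counts.index(max(counts))  # index() returns the first, i.e. smallest, class
    return low + (k_mode + 0.5) * width


# Hypothetical closest distances (in angstroms) for one pair of amino acid types.
sample = [4.0, 4.1, 4.2, 4.1, 4.0, 6.0, 6.1, 5.0]
print(sturges_classes(len(sample)), representative_distance(sample))
```

Applying `representative_distance` to each of the 400 lists would fill a 20 × 20 table of the kind described above.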
|
g(dij) = C f(dij)    (3)
where C is a constant. The function g(dij) represents a potential for each pair of interacting residues and is similar to that of van der Waals (Figure 10b).
To decrease the number of calculations, we used a simplified version of the constraint function g* for a pair of residues (Sippl, 1990):
The function g* so constructed reaches its minimum at the representative distance dij*. A shorter distance between residues is not prohibited (as in a real protein) but is less favorable. This constraint g is used to avoid the overlapping of the SSE when they are brought closer in order to produce a compact structure. It also accounts for the steric hindrance of the residues, each reduced to a single point, its Cα, in these calculations. Thus for n residues, one gets a set of n(n − 1) inter-residue conditions with g*(dij) = 0. This set of conditions is denoted B1.
The goal function
In order to produce a compact structure from a set of independent and unconnected SSE, the distances between their HFCs were used to define a goal function F. Its minimum must be reached if one wishes to fulfil the conditions of compactness of a protein. Minimization is accomplished by regular displacements of the SSE along the axis passing through the HFC of the particular SSE and the geometric center of HFCs of all the SSEs. When SSE come into contact, they are rotated in a plane perpendicular to their previous displacement.
For the HELIX model, the goal function was the greatest distance between any HFC and C, the geometric center of all the HFCs (Figure 7), such that, for a protein with k helices:

$$F = \max_{k} h_k \qquad (5)$$

where hk is the distance between the HFC of helix number k and C. By choosing this type of very simple goal function, containing as few parameters as possible, we assumed that minimizing the distance hk for every SSE would lead to a compact 3D structure. Furthermore, we noticed that minimizing the average h for all SSE could produce locally compact structures which, however, were not globally optimal.
For the MIXED model, the goal function F has the same formalism as for the HELIX model. In this case, the hk are the distances from the helix HFCs to the HFCs of each side of the facing ß-sheet (Figure 9). Minimization of F is performed independently for each side of the sheet. Flexibility has been introduced in the sheets by means of shifts of strands inside one sheet. To ensure that the native structure is present among all generated models, the helix is moved from one side of the sheet to the other under the initial conditions of the simulation.
For the BETA model, the distance h in the goal function F was defined as the distance between the HFCs of the two sheets (Figure 11). One of the two sheets is permitted to rotate around the axis passing through the HFCs, the second one remaining fixed. The parameters allowed to vary are the shifts for each ß-strand in each ß-sheet and the relative rotation of ß-sheets around the axis passing through their HFCs.
The RUSSIA algorithm does not consider loops as rigid SSE, because they present a wide range of conformations. The size of a loop is taken as the greatest distance between its first and last Cα atoms and is used in the algorithm as a constraint applied to the SSE. This greatest distance lm is derived from the study of loop conformations by Wojcik et al. (1999), where all loops of a given length m are classified with respect to the distance separating the first and last Cα of each loop.
Then, a constraint function skm(l) (Figure 12) is defined by
|
The parameter of this function was fixed at 0.5 Å by trial and error tests, for all values of m. The parameter l is the distance between the two SSE connected by the loop. These constraints allow the elimination of a large number of prohibited initial positions of the SSE at the very beginning of the algorithm, thus saving computing time. This set of conditions is denoted B2.
Minima search techniques
Having modeled the regular SSE, we wished to bring them closer to form compact structures. Therefore, we had to minimize the goal function F that has been previously defined:
$$F(x_1, \ldots, x_k,\ y_1, \ldots, y_k,\ z_1, \ldots, z_k,\ \varphi_1, \ldots, \varphi_k,\ \theta_1, \ldots, \theta_k,\ \psi_1, \ldots, \psi_k) \qquad (7)$$
The parameter p = (x1, ..., xk, y1, ..., yk, z1, ..., zk) determines the Cartesian coordinates of the k SSE, considered as rigid blocks and thus each reduced to a single point. The parameter q = (φ1, ..., φk, θ1, ..., θk, ψ1, ..., ψk) determines the angular positions of all the SSE, also by reference to their HFCs. When the distance between the SSE decreases in order to minimize F, the SSE can come into contact. To avoid steric hindrance, a set of inter-residue contact conditions, called B1, has been defined, based upon inter-residue distance analysis. To account for the presence of the loops linking the k SSE, another set of k − 1 boundary conditions, called B2, has been introduced. Hence the problem of producing a model is given by the following set of equations:

$$F(p, q) \to \min, \qquad B_1:\ g^*(d_{ij}) = 0, \qquad B_2:\ s = 0 \qquad (8)$$

where F is the goal function and B1, B2 are the sets of boundary conditions for k SSE.
To reach the minimum of F, the SSE are moved by steps towards the center of the protein (Figure 13). Recall that F itself is defined as the largest distance hk between the helices and the center C. The minimization process can be understood as follows: one searches for the largest distance hk, then this distance is reduced, which in turn produces a new position for the center C, thus giving rise to a new largest distance hk (possibly for another helix), and so on. Once the function g (residue contact constraint) becomes greater than zero, the SSE are slightly moved (of the order of 5% of the largest F) and rotated (typically by 1°) around their HFC until g = 0. The operation is repeated as long as the values of F decrease and s = 0.
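The translation part of this loop can be sketched on bare HFC points. This is a deliberately reduced illustration, not the RUSSIA procedure itself: the contact constraint g is replaced by a hypothetical minimum separation between points, the rotation of the SSE and the loop constraint s are omitted, and the search stops at the first contact. The names `collapse` and `min_separation` are assumptions.

```python
import math


def centroid(points):
    """Geometric center C of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def goal_function(hfcs):
    """F: the largest distance between an HFC and the center of all HFCs."""
    c = centroid(hfcs)
    return max(math.dist(p, c) for p in hfcs)


def collapse(hfcs, min_separation, step_fraction=0.05, max_steps=1000):
    """Repeatedly move the farthest HFC towards C until a contact would occur.

    `min_separation` is a hypothetical stand-in for the residue contact
    constraint g; rotations and loop constraints are not modelled.
    """
    points = [tuple(p) for p in hfcs]
    for _ in range(max_steps):
        c = centroid(points)
        dists = [math.dist(p, c) for p in points]
        f = max(dists)
        if f == 0.0:
            break
        k = dists.index(f)                # SSE with the largest h_k
        step = step_fraction * f          # step of the order of 5% of F
        trial = tuple(x + step * (cx - x) / f for x, cx in zip(points[k], c))
        if any(math.dist(trial, q) < min_separation
               for j, q in enumerate(points) if j != k):
            break                         # contact: stop instead of rotating
        points[k] = trial                 # C is recomputed on the next pass
    return points


# Four hypothetical HFCs at the vertices of a square of side 20 A.
start = [(10.0, 10.0, 0.0), (-10.0, 10.0, 0.0), (-10.0, -10.0, 0.0), (10.0, -10.0, 0.0)]
final = collapse(start, min_separation=8.0)
print(round(goal_function(start), 2), round(goal_function(final), 2))
```

Note that a single move can shift C enough to raise F slightly before the other points follow, which is why the sketch does not require F to decrease at every individual step.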
The transfer of an α-helix from one side of the β-sheet to the other in the MIXED model and the swap of α-helices (change of relative position of one helix in the bundle) in the HELIX model yield a fairly complete set of initial positions for the α-helices, although proper comparison of very different structures is difficult. Nevertheless, wrong structures will be eliminated at the end of the process. In a sheet, every strand adopts six possible positions, because of the shifts (Figure 10). These positions cover all possible β-sheet conformations for the considered class of β-sheets (Znamenskiy et al., 2000). Swapping the β-strands within a β-sheet, while respecting the connectivity condition, allows a complete description of β-sheet conformations, at the cost of several runs of the program. Hence the algorithm explores all possible SSE conformations and a simulated annealing algorithm (Press et al., 1991) is used to search for the minima.
Sorting out solutions
For each protein model, the conformational space was explored and the minimization algorithm was applied to different initial positions of the SSE, corresponding to the various angles of rotation of the SSE around the direction of their displacement (see earlier). Each one leads to a tertiary structure. As the SSE also rotate when one of their amino acids comes into contact with another SSE, very similar structures can be produced from different initial conditions. In the particular case of sheets, initial conditions correspond to one given value of the shift b for each strand of the sheet, in addition to the global angular orientation of the sheet relative to its direction of displacement. After some structures have been eliminated by the loop length condition, several fulfil the conditions of the goal function and remain possible candidates for the final structure. To sort out these structures and to determine the best ones, we introduced the potential function Φ:
where Φ represents the average distance between the Cα of the hydrophobic residues belonging to different SSE. The initial angles for the h helices are φ = (φ1, ..., φh) and the initial shifts for the s strands are b = (b1, ..., bs), for a total number of SSE given by k = h + s. The total number of hydrophobic residues is nphob. The function Φ, although fairly simple, is smoother than the function F and more suitable for comparing structures. Conceptually, it corresponds to the fact that the folding is accomplished by allowing the local clusters of hydrophobic residues attached to each SSE to come into contact. It is a consequence of the empirical observation that hydrophobic clusters correspond to regular secondary structures (Woodcock et al., 1992; Callebaut et al., 1997; Hennetin et al., 2003). Of course, if any additional information about the 3D structure of a protein is available, a more precise and specific function could be used. From the analysis performed on a test set of 12 sample structures (described in the following Part II), we noticed that the shape of Φ is similar to that of the r.m.s. deviation between the RUSSIA-generated and the native structures. This latter r.m.s. is only calculated over the α-carbons of the residues involved in helices and strands. As an example, we considered a three-helix bundle from engrailed homeodomain, PDB code 1enh. Owing to the small number of SSE, a simple graphical representation can easily be made (e.g. Figure 14). Initial angles φ1 for the first helix are on the first line, in degrees between 0 and 180°, and for the second helix below, φ2 is limited to the range 140–200°. The large regions corresponding to the prohibited initial angles were discarded in order to focus more clearly on the important areas. At every 10° step of φ2, the φ1 distribution is reported over 180°. The ordinate is the initial angle φ3 of the third helix of 1enh, from 60 to 160°. The light-shaded areas represent the smoothed values of Φ (top, Figure 14) and r.m.s. (bottom, Figure 14) and the dark areas correspond to the lowest values. Regions where the combination of these initial angles is impossible are left white. As this was also the case for the whole test set, we introduced the smoothed Φ function and r.m.s. over 20°:
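$$\overline{\Phi}(\varphi_i) = [\Phi(\varphi_i) + \Phi(\varphi_i + 10°) + \Phi(\varphi_i - 10°)]/3 \qquad (10)$$

$$\overline{\mathrm{r.m.s.}}(\varphi_i) = [\mathrm{r.m.s.}(\varphi_i) + \mathrm{r.m.s.}(\varphi_i + 10°) + \mathrm{r.m.s.}(\varphi_i - 10°)]/3 \qquad (11)$$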
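The three-point smoothing and the correlation between the two smoothed quantities can be sketched as follows. The sketch assumes that values are stored per initial angle on the 10° grid and that a smoothed value is only defined where both neighbors exist (prohibited initial angles are simply absent); it uses a plain Pearson coefficient, and the numerical values in the usage example are invented for illustration.

```python
import math


def smooth(values, step=10):
    """Three-point average of a quantity sampled on a regular angular grid.

    `values` maps an initial angle (degrees) to the sorting function or to
    the r.m.s.; angles whose neighbours are missing are left out.
    """
    out = {}
    for angle, v in values.items():
        lower, upper = values.get(angle - step), values.get(angle + step)
        if lower is not None and upper is not None:
            out[angle] = (lower + v + upper) / 3.0
    return out


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Invented values of the sorting function and of the r.m.s. on a 10 degree grid.
potential = {0: 9.1, 10: 8.4, 20: 8.9, 30: 7.6, 40: 8.2, 50: 9.5}
rmsd = {0: 4.0, 10: 3.1, 20: 3.8, 30: 2.5, 40: 3.0, 50: 4.4}
p_s, r_s = smooth(potential), smooth(rmsd)
common = sorted(set(p_s) & set(r_s))
print(pearson([p_s[a] for a in common], [r_s[a] for a in common]))
```

Comparing the coefficient obtained from the raw dictionaries with the one obtained after `smooth` is one way to reproduce the kind of check described in the text.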
Indeed, the correlation coefficient between Φ and the r.m.s. is clearly higher when these functions are smoothed (Figure 14). The fact that smoothing Φ and the r.m.s. over a range of ±20° improves their correlation is an a posteriori proof that the step of 10° used to generate the structures does not need to be smaller. A correlation, however, does not always imply a coincidence of minima. The fact that the smoothed Φ and r.m.s. correlate means that one should not restrict oneself to the smallest value of Φ in order to select the best model, but scan the various models over a range of 20° around this minimal solution. From here on and throughout this paper and Part II, we will be concerned with these smoothed values of Φ and r.m.s. To select a structure of low r.m.s. relative to the native one, we used two search techniques. The first approach is to sort all the generated structures by their Φ values and select a small number of structures with low Φ values. The second approach is to choose the structure with the lowest Φ value and to explore the structures generated from close initial positions; the convergence of the algorithm leads to structures of small r.m.s. when starting from neighboring initial structures. For the HELIX model the second approach seemed to be more appropriate, whereas for the BETA and MIXED models the first approach was preferred.
In the example in Figure 14, a correlation coefficient of 0.95 was obtained between the smoothed Φ and r.m.s., revealing a quasi-linear relation between these two functions. However, the generated structure with the minimal value of Φ does not necessarily provide the lowest r.m.s. compared with the native structure. Nevertheless, one can find a reasonably close structure in the neighborhood of ±20° around the corresponding initial angles. Our tests showed that this search range was sufficient.
We also studied, as an example of the MIXED model, pollen allergen 5 (PDB code 1bbg), containing three β-strands and one α-helix. The conformational space has four dimensions: the initial angle φ of the helix and the three shifts b1, b2, b3 of the corresponding strands in the sheet (Figure 15). We also found a phase shift of Φ relative to the r.m.s. between the generated and the native structures. The correlation coefficient between the averaged function Φ and the r.m.s. is 0.95. The lowest 10% of the values of Φ and r.m.s. are colored in black and the remaining 90% in gray. White areas represent the initial angles and shifts of the SSE for which the algorithm could not generate a tertiary structure. The minimal r.m.s. deviation is 1.73 Å.
In a subsequent study (see the following Part II), we applied the RUSSIA procedure to a representative set of small- to medium-sized globular domains (up to 166 amino acids) of the all-α, all-β and α/β protein classes of CATH (Orengo et al., 1998). We show that, with limited computing power (PC Pentium 233), it routinely allows one to obtain good folds (Cα r.m.s. in the range 1.4–3.7 Å) for the core of the protein, limited to the SSE considered as rigid blocks.
Acknowledgements |
---|
References |
---|
Bryant,S.H. and Lawrence,C.E. (1993) Proteins, 16, 92–112.
Callebaut,I., Labesse,G., Durand,P., Poupon,A., Canard,L., Chomilier,J., Henrissat,B. and Mornon,J.P. (1997) Cell. Mol. Life Sci., 53, 621–645.
Cohen,F.E., Richmond,T.J. and Richards,F.M. (1979) J. Mol. Biol., 132, 275–288.
Cohen,F.E., Sternberg,M.J.E. and Taylor,W.R. (1980) Nature, 285, 378–382.
Cohen,F.E., Sternberg,M.J.E. and Taylor,W.R. (1981) J. Mol. Biol., 148, 253–272.
Eddy,S.R. (1998) Bioinformatics, 14, 755–763.
Flower,D.R. (1994) Protein Eng., 7, 1305–1310.
Fomenko,A.T. (1986) Comput. Math. Appl. B, 12, 825–834.
Frankel,T. (1997) The Geometry of Physics. Cambridge University Press, Cambridge.
Hennetin,J., Le Tuan,K., Canard,L., Colloch,N., Mornon,J.-P. and Callebaut,I. (2003) Proteins, 51, 236–244.
Ho,B.K. and Curmi,P.M.G. (2002) J. Mol. Biol., 317, 291–308.
Kabsch,W. and Sander,C. (1983) FEBS Lett., 155, 179–182.
King,R.D., Clark,D.A., Shirazi,J. and Sternberg,M.J. (1994) Protein Eng., 7, 1295–1303.
Lasters,I. (1990) Protein Eng., 4, 133–135.
Lasters,I., Wodak,S., Philippe,A. and van Gunsteren,E. (1988) Proc. Natl Acad. Sci. USA, 85, 3338–3342.
Murzin,A.G., Lesk,A.M. and Chothia,C. (1994) J. Mol. Biol., 236, 1369–1381.
Nitsche,J.C.C. (1989) Lectures on Minimal Surfaces. Cambridge University Press, Cambridge.
Orengo,C.A., Martin,A.M., Hutchinson,G., Jones,S., Jones,D.T., Michie,A.D., Swindells,M.B. and Thornton,J.M. (1998) Acta Crystallogr. D, 54, 1155–1167.
Osserman,R.A. (1986) A Survey of Minimal Surfaces. Dover, New York.
Press,W.H., Flannery,B.P., Teukolsky,S.A. and Vetterling,W.T. (1991) Numerical Recipes in C. Cambridge University Press, Cambridge.
Rost,B., Schneider,R. and Sander,C. (1994) Comput. Applic. Biosci., 10, 53–60.
Russell,R.B. and Barton,G.J. (1994) J. Mol. Biol., 244, 332–350.
Safran,S.A. (1994) Statistical Thermodynamics of Surfaces, Interfaces and Membranes. Addison-Wesley, Boston.
Shamovsky,I.L., Ross,G.M. and Riopelle,R.J. (2000) J. Phys. Chem. B, 104, 11296–11307.
Simons,K.T., Ruczinski,I., Kooperberg,C., Fox,B.A., Bystroff,C. and Baker,D. (1999) Proteins, 34, 82–95.
Sippl,M. (1990) J. Mol. Biol., 213, 859–883.
Soyer,A., Chomilier,J., Mornon,J.-P., Jullien,R. and Sadoc,J.-F. (2000) Phys. Rev. Lett., 85, 3532–3535.
Vendruscolo,M., Najmanovich,R. and Domany,E. (2000) Proteins, 38, 134–148.
Wojcik,J., Mornon,J.-P. and Chomilier,J. (1999) J. Mol. Biol., 289, 1469–1490.
Wonnacott,T.H. and Wonnacott,R.J. (1995) Introductory Statistics for Business and Economics. Wiley, New York.
Woodcock,S., Mornon,J.P. and Henrissat,B. (1992) Protein Eng., 5, 629–635.
Zhang,C. and Kim,S.H. (2000) J. Mol. Biol., 299, 1075–1089.
Znamenskiy,D., Le Tuan,K., Poupon,A., Chomilier,J. and Mornon,J.-P. (2000) Protein Eng., 13, 407–412.
Received July 25, 2003; revised October 25, 2003; accepted October 30, 2003