Factors enhancing protein thermostability

Sandeep Kumar1, Chung-Jung Tsai2 and Ruth Nussinov1,3,4

1 Intramural Research Support Program, SAIC Frederick, 2 Laboratory of Experimental and Computational Biology, National Cancer Institute, Frederick Cancer Research and Development Center, Bldg 469, Rm 151, Frederick, MD 21702, USA and 3 Sackler Institute of Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel


    Abstract
 
Several sequence and structural factors have been proposed to contribute toward greater stability of thermophilic proteins. Here we present a statistical examination of structural and sequence parameters in representatives of 18 non-redundant families of thermophilic and mesophilic proteins. Our aim was to look for systematic differences between thermophilic and mesophilic proteins across the families. We observe that thermophilic and mesophilic proteins have similar hydrophobicities, compactness, oligomeric states, polar and non-polar contributions to surface areas, and main-chain and side-chain hydrogen bonds. Insertions/deletions and proline substitutions do not show consistent trends between the thermophilic and mesophilic members of the families. On the other hand, salt bridges and side chain–side chain hydrogen bonds increase in the majority of the thermophilic proteins. Additionally, comparisons of the sequences of the thermophile–mesophile homologous protein pairs indicate that Arg and Tyr are significantly more frequent, while Cys and Ser are less frequent, in thermophilic proteins. Thermophiles have a larger fraction of their residues in the α-helical conformation, and they also avoid Pro in their α-helices to a greater extent than the mesophiles. These results indicate that thermostable proteins adopt dual strategies to withstand high temperatures. Our intention has been to explore factors contributing to the stability of proteins from thermophiles with respect to the melting temperatures (Tm), the best descriptor of thermal stability. Unfortunately, Tm values are available for only a few proteins in our high resolution dataset. Currently, this limits our ability to examine correlations in a meaningful way.

Keywords: melting temperature/sequence/structure/thermophiles/thermostability


    Introduction
 
Several organisms, mainly archaea, thrive under extreme environmental conditions, e.g. high pressure in deep sea vents, high temperature and non-physiological pH in submarine hydrothermal areas, continental solfataras and man-made geothermal power plants, low temperatures in Antarctica, and high salt concentration in the Dead Sea and the Great Salt Lake. There has been a growing interest in understanding the stabilization of proteins from these organisms. Such an understanding, especially of the thermophilic proteins, is not only essential for a theoretical description of the physico-chemical principles behind protein folding and stability, but is also critical for designing efficient enzymes that can work at high temperatures. Such enzymes may be useful for several industrial applications, such as detergent manufacturing, food and starch processing, production of high fructose corn syrup and PCR (Adams and Kelly, 1995). It has also been noticed that thermophilic enzymes are more resistant to proteolysis than their mesophilic homologues (Daniel et al., 1982), probably owing to their greater rigidity.

Thermostable proteins maintain their activities and are stable at high temperatures. Identifying and understanding the factors contributing to the stability of proteins from organisms living under extreme conditions has been a long-standing problem. The first high resolution crystal structure of thermolysin was reported in 1974 (Matthews et al., 1974). Perutz and Raidt (1975) commented on the stereochemical basis of thermostability of ferredoxins and hemoglobin A2. Since these pioneering efforts, several investigators have focused on the molecular basis of protein thermostability. The greater stability of the thermophilic proteins has been attributed to several factors (Querol et al., 1996; Jaenicke and Bohm, 1998; Ladenstein and Antranikian, 1998). Among the most prominent ones are greater hydrophobicity (Haney et al., 1997), better packing, deletion or shortening of loops (Russell et al., 1997), smaller and less numerous cavities, increased surface area buried upon oligomerization (Salminen et al., 1996), amino acid substitutions within and outside the secondary structures (Zuber, 1988; Haney et al., 1997; Russell et al., 1998), increased occurrence of proline residues (Haney et al., 1997; Watanabe et al., 1997; Bogin et al., 1998), decreased occurrence of thermolabile residues (Russell et al., 1997), increased helical content, increased polar surface area (Haney et al., 1997; Vogt and Argos, 1997; Vogt et al., 1997), increased hydrogen bonding (Vogt and Argos, 1997; Vogt et al., 1997) and salt bridges (Yip et al., 1995, 1998; Haney et al., 1997; Russell et al., 1997, 1998; Elcock, 1998; Xiao and Honig, 1999; Kumar et al., 2000).

Here we present a statistical analysis of parameters thought to contribute toward protein thermostability. We have carried out structural comparisons to cluster the thermophile–mesophile protein families, creating a non-redundant dataset of 18 families from the Protein Data Bank (PDB) (Bernstein et al., 1977). These families span an entire spectrum, containing proteins from moderately thermophilic to hyperthermophilic organisms and their mesophilic homologs. Not all the differences observed between the thermophilic and mesophilic proteins are due to thermostability. Here we select one pair from each family. We choose the structurally most similar thermophile–mesophile pair having the best resolution, so that the observed differences can be expected to be mostly due to thermostability. In our dataset, no two thermophilic proteins from different families have similar three-dimensional structures, ensuring a bias-free sample. Between each thermophile–mesophile pair, we have compared several structural properties such as oligomeric state, insertion/deletion of residues, compactness, hydrophobicity, helical content, hydrogen bonds and salt bridges. We find that most of these do not show consistent trends across the families, indicating versatile protein stabilization strategies adopted by the individual families. However, there are a few global trends across a large number of families. Salt bridges and side-chain hydrogen bonds increase in most of the thermophilic proteins. Interestingly, the overall amino acid distributions in the thermophilic and the mesophilic proteins are significantly different, in spite of the high sequence homologies between the protein structural pairs. The proportions of the thermolabile residue Cys and of Ser decrease significantly, while those of Arg and Tyr increase significantly in the thermophilic proteins as compared with their mesophilic homologs. Pro is observed to occur less frequently in α-helices of the thermophilic proteins. On the whole, a higher proportion of amino acids in the thermophilic proteins adopt the α-helical conformation. Our results indicate a two-pronged strategy adopted by the thermophiles. Thermophilic proteins appear to disfavor potentially destabilizing factors along with favoring the potentially stabilizing ones. Furthermore, here we compare our results with those obtained from an analysis of a database of 165 non-homologous proteins.

Our intention was to carry out the analysis with respect to the melting temperatures of the corresponding proteins, from both the thermophiles and the mesophiles. Melting temperatures (Tm) are the best descriptor of thermal stability. To be able to draw reliable conclusions, we wished to focus on cases where (i) high resolution crystal structures are available for both the thermophilic protein and its mesophilic homolog; and (ii) melting temperatures for the thermophilic and mesophilic proteins have been measured and reported. The more meaningful cases are those where the difference between the melting temperatures of the thermophilic–mesophilic protein pair is not too small and the protein is large enough. Too small a difference in the melting temperatures corresponds to a small difference in energy between the pair of proteins, whereas if the protein is small, the differences in structural parameters might be difficult to gauge accurately. Unfortunately, only a few cases are currently available in the literature. In these cases, the difference in the number of salt bridges between the thermophile and its mesophile homologue appears to correlate with the Tm of the thermophilic protein. While other structural factors, such as compactness and hydrophobicity, contribute to thermostability, no consistent correlation with the Tm is observed. However, we are unable to obtain statistically reliable results due to the sparse data. On the other hand, we point out that none of the structural factors correlates with the living temperatures of the thermophilic organisms.


    Materials and methods
 
Construction of the families of thermophilic and mesophilic proteins

An index file, called source.idx, in the Protein Data Bank (PDB) (Bernstein et al., 1977) contains the names of the organisms for all protein crystal structures available in the PDB. The January 7, 1998 update of this file was searched for the keywords THERM and PYRO. This search yielded 167 (out of 6751) PDB entries containing different proteins from thermophilic organisms. The entries in which protein structures had been determined by nuclear magnetic resonance (NMR) and/or theoretical modeling (R = –1.0 Å in the cmpd_res file) were discarded, leaving us with 145 PDB entries. From this set of entries containing proteins whose structures were determined by X-ray crystallography, 113 entries containing high resolution (R ≤ 2.5 Å) structures for 55 different thermophilic proteins were selected for further study. For each of the thermophilic proteins in the list, the PDB entry with the best resolution was picked. Three-dimensional structures of the thermophilic proteins were compared all against all using a sequence order independent structural comparison technique (Tsai et al., 1996). This computer vision-based technique superimposes spatially equivalent regions in two proteins without regard to their sequential connectivity, or to the number of residues in the protein. Since the mesophilic and thermophilic proteins have different sizes and may have different oligomeric states, this technique allows us to superimpose the conserved regions of the proteins independently of these factors. Two proteins are considered to be dissimilar if (i) the backbone Cα atom superposition for the two structures yields an r.m.s.d. ≥ 2.00 Å; and (ii) the sequence identity (ID) for the two proteins is ≤ 20%. Finally, thermophilic proteins were retained in the database if they have dissimilar structures and if there is at least one high resolution crystal structure for their corresponding mesophilic homologs. This step ensures non-redundancy in the database.

Eighteen different thermophilic proteins were obtained. The structure of each of the 18 proteins was compared with those of its homologous PDB entries. Two structures were considered to be similar if they did not satisfy both of the above conditions. At this stage, many families contain several mesophilic proteins. Application of a 2.5 Å resolution cut-off substantially decreases their number. Finally, the PDB entry which has the best resolution and contains the structure that is most similar to the thermophilic protein is selected. As far as possible, we have tried to select wild-type thermophile–mesophile pairs. Attention was also paid to the presence (absence) of substrates in the thermophilic and mesophilic proteins. Choosing one thermophile–mesophile pair per family, such that the pair contains the best resolved structures along with the largest sequence and structure homology among the available alternatives, has several advantages. First, since the two proteins are most similar, the observed differences can be correlated with thermostability with a greater degree of confidence. Second, the variability, or the consistency, of the results can be judged from the behavior of all 18 families. Third, the behavior of the parameters is a function of two factors: the extent of structural similarity between the two molecules and the sequence similarity. The non-polar buried surface area, compactness, etc. obtained in comparisons of members of the same family would need to be calibrated against the sequence differences, and it is unclear how best to do this in practice. In an extensive recent analysis, Vogt et al. (1997) have used multiple mesophilic homologs for comparison with the thermophilic proteins. They have calibrated specific protein structural properties per 10°C rise in living temperature of the organisms in a given family. The statistical trends obtained by Vogt et al. (1997) and by us are similar, indicating the equivalence of the two approaches.
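The dissimilarity criterion used in constructing the families can be stated compactly in code. The following is only a minimal sketch of the two thresholds quoted above; the function names are hypothetical, and the r.m.s.d. and sequence identity are assumed to have been obtained beforehand from the structural superposition, which is not reproduced here.

```python
def is_dissimilar(rmsd_angstrom: float, seq_identity_percent: float) -> bool:
    """Return True when BOTH conditions from the text hold:
    (i) backbone C-alpha r.m.s.d. >= 2.00 A and (ii) sequence identity <= 20%."""
    return rmsd_angstrom >= 2.00 and seq_identity_percent <= 20.0


def is_similar(rmsd_angstrom: float, seq_identity_percent: float) -> bool:
    """Two structures are treated as similar when they do not satisfy
    both dissimilarity conditions."""
    return not is_dissimilar(rmsd_angstrom, seq_identity_percent)


if __name__ == "__main__":
    # Illustrative values only (not taken from Table I).
    print(is_dissimilar(2.5, 15.0))  # both conditions hold
    print(is_similar(1.2, 45.0))     # close superposition, high identity
```

Under this reading, a pair that fails either threshold (for example, a large r.m.s.d. but a sequence identity above 20%) is still classed as similar.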

The properties of these 18 pairs of thermophilic and mesophilic proteins are summarized in Table I. The best matching protein chains in each family are indicated in the footnotes of Table I. One PDB entry for the mesophilic protein elongation factor EF-TU-EF-TS complex (PDB entry 1EFU) from Escherichia coli is an A2B2 type tetramer with chains of type A and B being highly dissimilar. This particular protein complex has two different homologs in the thermophilic proteins, namely, EF-TU (PDB entry 1EFT) and EF-TS (PDB entry 1TFE). Furthermore, 1TFE, a dimer, matches with a single chain, 1EFU-B. The asymmetric unit of lactate dehydrogenase crystals from Bacillus stearothermophilus (PDB entry 1LDN) contains two copies of the molecule. The first copy has been used in this analysis. In all the families, the spatially overlapping regions in the superposition of the thermophilic and mesophilic proteins are very extensive. For example, in the citrate synthase family, where the similarity between the thermophilic and mesophilic proteins is relatively poor as compared with most other families, 332 residues in each chain overlap spatially. A chain of thermophilic citrate synthase (1AJ8-B) has 370 residues while a chain of mesophilic citrate synthase (1CSH) contains 435 residues. A few of the PDB entries used in this analysis have missing atoms, residues or small fragments due to poor diffraction data. Additionally, the crystal structures in several cases may have been determined at low temperatures to obtain better diffraction data. However, these factors do not substantially affect the overall three-dimensional structures of the proteins. No systematic errors are expected on this count.


Table I. Families of thermophilic and mesophilic proteins
 
Sequence composition analysis

Distributions (numbers, N) and frequencies (percent, %) of all 20 amino acids were computed for the thermophilic and mesophilic proteins. In addition, we have computed their distributions in the α-helices. The amino acid distributions were compared using the χ²-test. Hamming distance was computed between percent (%) amino acid compositions. The change in proportion test was used to identify the amino acids whose proportions change significantly. These calculations follow Kumar and Bansal (1998a).
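The composition comparison can be illustrated with a short sketch. It is not the procedure of Kumar and Bansal (1998a), whose details are not given here: the distance between percent compositions is assumed to be the sum of absolute differences, the χ² statistic is the standard one for a 2 × 20 contingency table of residue counts, and all function names are hypothetical. Both sequences are assumed to be non-empty and to contain only the 20 standard one-letter codes.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def residue_counts(seq: str) -> dict:
    """Number (N) of each of the 20 standard amino acids in a sequence."""
    counts = Counter(seq.upper())
    return {aa: counts[aa] for aa in AMINO_ACIDS}


def percent_composition(seq: str) -> dict:
    """Frequency (%) of each amino acid, relative to the standard residues."""
    counts = residue_counts(seq)
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.items()}


def composition_distance(seq_a: str, seq_b: str) -> float:
    """ASSUMPTION: distance taken as the sum of absolute differences
    between the two percent compositions."""
    pa, pb = percent_composition(seq_a), percent_composition(seq_b)
    return sum(abs(pa[aa] - pb[aa]) for aa in AMINO_ACIDS)


def chi_square_statistic(seq_a: str, seq_b: str) -> float:
    """Chi-square statistic for the 2 x 20 table of residue counts.
    Amino acids absent from both sequences are skipped."""
    ca, cb = residue_counts(seq_a), residue_counts(seq_b)
    na, nb = sum(ca.values()), sum(cb.values())
    grand_total = na + nb
    stat = 0.0
    for aa in AMINO_ACIDS:
        column_total = ca[aa] + cb[aa]
        if column_total == 0:
            continue
        for observed, row_total in ((ca[aa], na), (cb[aa], nb)):
            expected = row_total * column_total / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


if __name__ == "__main__":
    thermo, meso = "MKRRYEELLA", "MKSSCEELLA"  # toy sequences, not real data
    print(composition_distance(thermo, meso))
    print(chi_square_statistic(thermo, meso))
```

The statistic would then be compared with a χ² distribution whose degrees of freedom equal the number of amino acid types present minus one; that lookup is omitted here.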

Structural properties

Oligomeric state. For a given protein, the PDB file contains coordinates for the structure observed in a crystallographic asymmetric unit. This may not reflect the biochemically relevant oligomeric state of the protein. In our dataset, the oligomeric states of the thermophilic and mesophilic proteins were tabulated from the biochemical data in the relevant literature on these proteins, indicators within the PDB files and the pointers in the PDB3DB browser.

Hydrophobicity. The hydrophobicity of a protein was calculated as the fraction of the buried non-polar area out of the total non-polar area, computed by using the methods described earlier (Tsai and Nussinov, 1997a,b; Tsai et al., 1997).

Compactness. The compactness (Zehfus and Rose, 1986) of a protein was defined as the ratio of the solvent accessible area (Lee and Richards, 1971; Tsai et al., 1997) of the protein to the surface area of a sphere with the same volume as the protein (Tsai and Nussinov, 1997a,b).
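The two ratios defined above, hydrophobicity and compactness, are simple to compute once the areas and volume are known. The sketch below assumes that the buried and total non-polar areas, the accessible surface area and the molecular volume have already been obtained from a separate surface calculation; the function names are hypothetical and no particular surface program is implied.

```python
import math


def hydrophobicity(buried_nonpolar_area: float, total_nonpolar_area: float) -> float:
    """Fraction of the non-polar area that is buried."""
    return buried_nonpolar_area / total_nonpolar_area


def compactness(accessible_area: float, volume: float) -> float:
    """Ratio of the accessible surface area to the surface area of a
    sphere with the same volume; smaller values mean tighter packing."""
    sphere_radius = (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)
    sphere_area = 4.0 * math.pi * sphere_radius ** 2
    return accessible_area / sphere_area


if __name__ == "__main__":
    # Sanity check with a perfect sphere of radius 10 A: the ratio is 1.
    r = 10.0
    print(compactness(4.0 * math.pi * r ** 2, 4.0 / 3.0 * math.pi * r ** 3))
```

Because a sphere has the smallest surface area for a given volume, the compactness of any real protein is expected to exceed 1.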

Hydrogen bonds and salt bridges. Whenever two heavy (non-hydrogen) atoms with opposite partial charges [donor (D)–acceptor (A) pairs] were found to be within a distance of 3.5 Å, a hydrogen bond was inferred. The geometrical goodness of the hydrogen bond was assessed by computing the values of the following angles.

A hydrogen bond was taken to have good geometry if both these angles lie in the range 90–150°. Only those hydrogen bonds which have a good geometry were included in our studies.

The presence of salt bridges was inferred when Asp or Glu side-chain carboxyl oxygen atoms were found to be within a distance of 4.0 Å from the nitrogen atoms in Arg, Lys and His side chains.
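The distance criterion for salt bridges can be sketched as follows. This is an illustration under stated assumptions rather than the program used in this work: atoms are supplied as hypothetical tuples of (residue name, residue number, atom name, coordinates), the atom names are the standard PDB side-chain names, and a salt bridge is counted once per residue pair, which is a counting convention assumed here.

```python
import math

# Standard PDB names of the side-chain atoms involved.
ACIDIC_OXYGENS = {"ASP": {"OD1", "OD2"}, "GLU": {"OE1", "OE2"}}
BASIC_NITROGENS = {"ARG": {"NE", "NH1", "NH2"}, "LYS": {"NZ"}, "HIS": {"ND1", "NE2"}}


def find_salt_bridges(atoms, cutoff=4.0):
    """Return sorted (acidic residue number, basic residue number) pairs
    having at least one side-chain O...N distance within the cutoff (in A).

    `atoms` is an iterable of (res_name, res_number, atom_name, (x, y, z)).
    """
    atoms = list(atoms)
    oxygens = [a for a in atoms if a[2] in ACIDIC_OXYGENS.get(a[0], ())]
    nitrogens = [a for a in atoms if a[2] in BASIC_NITROGENS.get(a[0], ())]
    pairs = set()
    for _, acid_number, _, o_xyz in oxygens:
        for _, base_number, _, n_xyz in nitrogens:
            if math.dist(o_xyz, n_xyz) <= cutoff:
                pairs.add((acid_number, base_number))
    return sorted(pairs)


if __name__ == "__main__":
    toy_atoms = [  # made-up coordinates, for illustration only
        ("ASP", 10, "OD1", (0.0, 0.0, 0.0)),
        ("LYS", 20, "NZ", (3.0, 0.0, 0.0)),
        ("ARG", 30, "NH1", (10.0, 0.0, 0.0)),
    ]
    print(find_salt_bridges(toy_atoms))
```

The double loop is quadratic in the number of charged atoms, which is adequate for single proteins; a spatial grid would be preferable for very large assemblies.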

Helical content. The helical content of a protein refers to the percentage (%) of residues that have α-helical conformation in the protein. The corresponding Dictionary of Protein Secondary Structure (DSSP) (Kabsch and Sander, 1983) file was used to identify the residues in α-helical conformation in each protein. Overall geometries of α-helices in the thermophilic and mesophilic protein chains were characterized using HELANAL (Kumar and Bansal, 1996; Kumar and Bansal, 1998b). This program is available at http://www-lecb.ncifcrf.gov/~kumarsan/
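Given per-residue DSSP assignments, the helical content reduces to a simple count. The sketch below assumes the secondary structure codes have already been extracted from the DSSP file into a string with one character per residue (parsing of the file itself is not shown), and it counts only the code H, which DSSP uses for α-helix; the function name is hypothetical.

```python
def helical_content(dssp_codes: str) -> float:
    """Percentage of residues assigned to alpha-helix (DSSP code 'H').

    `dssp_codes` holds one DSSP secondary structure character per residue.
    """
    if not dssp_codes:
        raise ValueError("empty secondary structure string")
    return 100.0 * dssp_codes.count("H") / len(dssp_codes)


if __name__ == "__main__":
    # Toy assignment: 4 helical residues out of 10.
    print(helical_content("HHHH----EE"))
```

Other helix types reported by DSSP (G for 3-10 helix, I for π-helix) are deliberately excluded, since the text refers to the α-helical conformation only.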

Buried and exposed surface areas. Buried and accessible surface areas (Lee and Richards, 1971; Tsai and Nussinov, 1997a,b) have been computed for thermophilic and mesophilic protein chains as well as for 165 dissimilar monomers. Four different fractions have been computed from these areas, in each case:

Measurement of percent change in various properties

For the purpose of a comparison between a thermophilic–mesophilic pair, the numbers of hydrogen bonds and salt bridges in the two proteins were normalized by their respective number of residues. Percent changes were computed as the difference between the normalized values of hydrogen bonds and salt bridges in the two proteins in each family, divided by the corresponding normalized values for the mesophilic proteins.

Changes in protein size can occur due to insertion/deletion and/or oligomerization. Percent change in protein size in each family was computed by dividing the difference in the number of residues between the thermophilic and mesophilic proteins by the number of residues in the mesophilic protein.

Percent change in hydrophobicity in each family was computed by dividing the difference in hydrophobicity for the thermophilic and mesophilic proteins by the hydrophobicity for the mesophilic protein. Percent change in compactness was also computed in the same way.
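The percent changes described in this section all follow the same pattern, with the mesophilic value as the reference. A minimal sketch is given below; the function names are hypothetical, and the only real numbers used are the citrate synthase chain lengths quoted earlier (370 and 435 residues), applied here purely as an illustration of the size calculation.

```python
def percent_change(thermophilic: float, mesophilic: float) -> float:
    """Percent change of a property, relative to the mesophilic protein."""
    return 100.0 * (thermophilic - mesophilic) / mesophilic


def normalized_percent_change(count_thermo: int, residues_thermo: int,
                              count_meso: int, residues_meso: int) -> float:
    """Percent change in hydrogen bond or salt bridge content after
    normalizing each count by the number of residues in its protein."""
    return percent_change(count_thermo / residues_thermo,
                          count_meso / residues_meso)


if __name__ == "__main__":
    # Chain lengths of citrate synthase quoted in the text (1AJ8-B vs 1CSH).
    print(round(percent_change(370, 435), 1))
```

A positive value means the thermophilic protein has the higher value of the property, and a negative value means the mesophilic protein does.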

Database of 165 dissimilar monomers

A database of 165 proteins, which (i) have been solved to high resolution (R ≤ 2.5 Å) by X-ray crystallography and contain at least 50 amino acids, (ii) have dissimilar 3D structures, as determined by the sequence order independent structure comparison technique (Tsai et al., 1996), and (iii) exist as monomers in solution as indicated in their PDB files, relevant biochemical literature and pointers in the PDB3DB browser to other databases such as SWISS-PROT, was generated from the PDB. This database was used as a control for studying structural features, such as compactness, hydrophobicity, and polar and non-polar contributions to buried and exposed surfaces in thermophilic and mesophilic protein chains.

Cases of high resolution structural pairs where the melting temperatures are currently available

(i) 3-Phosphoglycerate kinase (PGK) (Davies et al., 1993): Tm = 67°C for the thermophilic enzyme from Bacillus stearothermophilus and 53°C for its mesophilic counterpart from Saccharomyces cerevisiae. The thermophilic PGK is a monomer while the mesophilic PGK is a dimer. The energy difference between the two enzymes is ΔΔG ≈ 5 kcal/mol.
(ii) Adenylate kinase (Glaser et al., 1992): Tm = 74.5°C for the thermophilic enzyme from Bacillus stearothermophilus and 48°C for the mesophilic enzyme from Saccharomyces cerevisiae. Both the thermophilic and the mesophilic enzymes are monomers.
(iii) CheY, the bacterial chemotaxis protein (Usher et al., 1998): Tm = 95°C for the thermophilic protein from Thermotoga maritima. Both the thermophilic and the mesophilic proteins are monomers.
(iv) Glutamate dehydrogenase (Yip et al., 1995): Tm = 113°C for the thermophilic protein from Pyrococcus furiosus and 55°C for the mesophilic protein from Clostridium symbiosum (Yip et al., 1995). Both the thermophilic and the mesophilic enzymes are hexamers.
(v) Rubredoxin, a small redox protein (Day et al., 1992): there are several estimates of Tm for rubredoxin from Pyrococcus furiosus. The one used here is from Hiller et al. (1997), determined by the hydrogen exchange technique. Tm for thermophilic rubredoxin = 176–195°C. Both the thermophilic and the mesophilic rubredoxins are monomers.

For PGK the melting temperatures of the thermophilic and mesophilic proteins are close (ΔTm = 67 – 53 = 14°C), and the energy difference between the thermophilic and mesophilic enzymes is only about 5 kcal/mol. Moreover, the oligomeric states of the two PGKs are different. The thermophilic rubredoxin has a very high Tm. However, it is a very small protein, consisting of only about 50 amino acids, and the existence of more than one estimate of its Tm further complicates the matter.


    Results
 
We have selected a non-redundant dataset of 18 families consisting of thermophilic and mesophilic proteins whose high resolution (R ≤ 2.5 Å) structures are available in the PDB (Table I). The corresponding thermophilic and mesophilic proteins within these families are highly similar, with sequence identities varying in a range of 24–73% and backbone r.m.s.d. values between 0.69 and 1.68 Å. At the same time, the thermophilic proteins across the 18 families are highly dissimilar among themselves (sequence identities being <10% and backbone r.m.s.d. > 2 Å). The mesophilic proteins are also highly dissimilar among themselves.

Packing

Reasons proposed for the higher stability of thermophilic proteins include better packing (Russell et al., 1997, 1998) and hence, smaller and less numerous cavities. To study packing in a protein one can compute its compactness (Zehfus and Rose, 1986). Compactness has been defined as the ratio of the accessible surface area (ASA) (Lee and Richards, 1971) of a given protein to the surface area of a sphere with the same volume as the protein. Assuming that most proteins are more or less globular in shape, a better packed protein will have a smaller ratio value. We have already used this formulation to study hydrophobic folding units (Tsai and Nussinov, 1997a,b). Figure 1 plots the compactness versus the number of residues in thermophilic and mesophilic protein chains (one chain per protein), along with the values calculated for the 165 structurally dissimilar monomeric protein chains selected from the PDB. The compactness values for the thermophilic protein chains are very similar to those calculated for the mesophilic protein chains. They are also within the range of the compactness values obtained for the 165 dissimilar monomers. However, the overall packing of an oligomeric protein may involve two components: (i) packing of atoms within individual subunits, and (ii) the association, or packing, of the subunits with respect to each other. Consequently, we have computed the compactness for the thermophilic and mesophilic proteins in their biochemically relevant oligomeric states. The results are presented in Table II. Again, the compactness values for thermophilic and mesophilic proteins are highly similar. Hence, there is no consistent pattern in the contribution of packing to the differences in stabilities between thermophilic and mesophilic protein pairs. Recently, Karshikoff and Ladenstein (1998) have reached similar conclusions upon computing cavity volumes for a large number of thermophilic and mesophilic proteins.



Fig. 1. Distribution of compactness as a function of chain size (number of residues), for thermophilic (*) and mesophilic (○) protein chains. The x-axis denotes the number of residues (N) in the protein chains and the y-axis denotes compactness (Z). For comparison, 165 monomers with dissimilar structures (•) obtained from the PDB are also depicted.

 

Table II. Values of hydrophobicity and compactness in thermophilic and mesophilic proteins
 
Hydrophobicity

With the rapid increase in the structural information available for proteins, it is becoming increasingly clear that the hydrophobic effect is the dominant driving force in protein folding (Dill, 1990). Hence, it has been suggested that thermophilic proteins are substantially more hydrophobic (Haney et al., 1997) and have more surface area buried upon oligomerization (Salminen et al., 1996) as compared with their mesophilic counterparts. As with packing, the hydrophobic effect can manifest itself at two levels: (i) hydrophobicities of the individual protein chains, and (ii) hydrophobicity due to the association of the chains. We have computed the hydrophobicity as the fraction of buried non-polar surface area out of the total non-polar surface area (Tsai and Nussinov, 1997a,b), for the thermophilic and mesophilic protein chains as well as their biochemically relevant oligomeric forms. Figure 2 presents a plot of the hydrophobicity versus the number of residues in thermophilic and mesophilic protein chains, along with those for the 165 dissimilar monomeric chains. The figure illustrates that thermophilic and mesophilic protein chains have very similar hydrophobicities. The values lie within the same range as those for the hydrophobicities of 165 dissimilar monomers. The hydrophobicities computed for the thermophilic and mesophilic proteins in their biochemically relevant oligomeric states are presented in Table II. Again, the hydrophobicities of the thermophilic and mesophilic protein oligomers are very similar.



Fig. 2. Distribution of hydrophobicity as a function of chain size (number of residues), for thermophilic (*) and mesophilic (○) protein chains. The x-axis denotes the number of residues (N) in the protein chains and the y-axis denotes percent hydrophobicity. For comparison, 165 monomers with dissimilar structures (•) obtained from the PDB are also depicted.

 
Polar and non-polar surface areas

It has been suggested that increased polar surface area contributes to the greater stability of the thermophilic proteins (Haney et al., 1997; Vogt and Argos, 1997; Vogt et al., 1997). Here, we have divided protein surfaces into buried and exposed parts and evaluated the contribution of polar and non-polar atoms. These calculations have been performed for all thermophilic and mesophilic protein chains (one polypeptide chain per protein) and compared with those for 165 dissimilar monomers. The calculations have been done in two different ways. In the first set, all atoms including the backbone were considered. In the second set, the backbone atoms were excluded. Table III presents the results. The distributions of buried and exposed, polar and non-polar surface areas are quite uniform for the 165 dissimilar monomers as well as for the thermophilic and mesophilic protein chains.


Table III. Polar and non-polar contributions to buried and accessible surface areas
 
The above observations on packing, hydrophobicity and surface areas indicate that the basic protein core is similar in thermophiles and mesophiles.

Salt bridges and hydrogen bonds

Along with oligomerization, chain length, hydrophobicity and compactness, hydrogen bonds and salt bridges have also been compared between the thermophilic and the mesophilic proteins. The hydrogen bonds were divided into three classes: main chain–main chain (MM H-bonds), main chain–side chain (MS H-bonds) and side chain–side chain hydrogen bonds (SS H-bonds). Figure 3 shows plots of SS H-bonds and salt bridge content changes in the families of thermophilic and mesophilic proteins in their biochemically relevant oligomeric states, and at their interfaces. As the figure shows, side chain–side chain H-bonds and salt bridge content increase in the monomers of most thermophilic proteins and at their interfaces.



Fig. 3. Plots depicting changes in side chain–side chain hydrogen bonds (SS H-bonds) and salt bridges in biochemically relevant forms of proteins and at interfaces in various families of thermophilic and mesophilic proteins. A positive change indicates that the thermophilic protein has a higher content as compared with its mesophilic homolog, while a negative change indicates that the thermophilic protein has a lower content than its mesophilic homolog. For the majority of the families, SS H-bond and salt bridge content increases for thermophilic proteins. For each subplot, the x-axis denotes the family number while the y-axis represents the percent change in the property indicated at the top of the subplot. Data on interfaces are available for only eight families.

 
The most significant change in the number of salt bridges was observed in the glutamate dehydrogenase family. This family contains glutamate dehydrogenase enzymes from the hyperthermophile Pyrococcus furiosus and the mesophile Clostridium symbiosum. Both thermophilic and mesophilic glutamate dehydrogenases are homohexamers and share good sequence and structural similarities (Table I). The difference between their melting temperatures is approximately 60°C (see Materials and methods). Pyrococcus furiosus glutamate dehydrogenase contains 168 salt bridges while Clostridium symbiosum glutamate dehydrogenase contains 107 salt bridges. Thus, the salt bridge frequency increases by ~70% for the thermophilic protein. The changes in other structural parameters between this thermophile–mesophile pair are insignificant (Table II; Yip et al., 1995). Thus salt bridges and their networks have been implicated in the thermostability of this protein (Yip et al., 1995). Recently, we have computed the electrostatic strengths of salt bridges in monomers of this pair (Kumar et al., 2000). We have observed that salt bridges in Pyrococcus furiosus glutamate dehydrogenase, which form extensive networks, are highly stabilizing. On the other hand, salt bridges in Clostridium symbiosum glutamate dehydrogenase, which form considerably fewer networks, are only marginally stabilizing (Kumar et al., 2000).

Insertions, deletions and oligomerization

It has been suggested that deletion or shortening of loops may increase protein thermal stability (Russell et al., 1997, 1998). Oligomerization can be another contributing factor. These factors reflect a change in protein size, and its effect on thermal stability. Figure 4 shows changes in hydrogen bonds, salt bridges, compactness and hydrophobicity plotted against the change in the number of residues between thermophilic and mesophilic proteins in each family. Mostly there is no correlation with a change in protein size, either due to insertions/deletions or due to oligomerization. This is further corroborated by the observation that in 14 out of 18 families in our database, thermophilic and mesophilic proteins have the same oligomeric states. In two families the oligomeric states of thermophilic proteins are found to be higher than those of their mesophilic homologs. However, the oligomeric states of mesophilic proteins are higher than those of their thermophilic homologs in the other two families.



Fig. 4. Change in hydrogen bonds, salt bridges, compactness and hydrophobicity plotted with respect to change in protein size (number of residues in the protein). For each subplot, the x-axis denotes the percent change in the protein size and the y-axis represents the percent change in the property indicated at top of the subplot. Most structural properties of proteins are not correlated with insertion/deletion or oligomerization.

 
Living temperatures of the thermophilic organisms and structural factors involved in protein thermostability

In the literature, the stability of thermophilic proteins has been described in a number of ways, such as in terms of the temperature at which a protein is active (activity temperature) or stable (stability temperature), or by its half-life over a certain duration of time. Much less frequently, a protein is described in terms of its melting, or mid-point transition, temperature (Tm). Perhaps due to this heterogeneity in the available data, a recent database analysis study (Vogt and Argos, 1997; Vogt et al., 1997) used the living temperatures of the organisms from which the proteins were isolated as a parameter for studying thermostability. Figure 5 plots changes in the oligomeric state, chain length, hydrophobicity, compactness, main chain–main chain, main chain–side chain and side chain–side chain hydrogen bonds and salt bridges as a function of living and of melting temperatures. Figure 5a shows that structural factors involved in protein thermostability do not correlate with the living temperatures of the thermophilic organisms. The trends observed in Figure 5b are clearer. However, there are only five data points, two of which (the first and the last) are unreliable for reasons summarized in the Materials and methods section. If we ignore these points, we observe that among the various factors, only the salt bridges tend to correlate with the melting temperature. Unfortunately, this observation is unreliable, as it is based on only three proteins. However, it is consistent with the study by Yip et al. (1998), who observed a correlation between ion pairs and thermostability for glutamate dehydrogenases from different organisms. Clearly, this phenomenon needs to be investigated further before any conclusions are drawn.




Fig. 5. Change in various structural properties (oligomerization, chain length, hydrophobicity, compactness, hydrogen bonds involving main chain–main chain atoms, main chain–side chain atoms and side chain–side chain atoms, and salt bridges) plotted against (a) living temperature (TL) of thermophilic organisms and (b) melting temperature (Tm) of the thermophilic proteins. Trends for various properties are clearer in the plots with Tm. Salt bridges show a correlation with melting temperature. However, the correlation is not statistically reliable. For each subplot, the x-axis represents the temperature, while the y-axis represents the percent change in the property indicated at the top of the subplot. In each panel of (b), the first point (smallest Tm) corresponds to phosphoglycerate kinase, the second to adenylate kinase, the third to CheY, the fourth to glutamate dehydrogenase and the fifth (greatest Tm) to rubredoxin.

 
Distribution of amino acids

The overall distributions of amino acids in the 18 non-redundant families of thermophilic and mesophilic protein chains are presented in Table IV. Figure 6 presents a comparison between the residue compositions of the thermophilic and mesophilic proteins. Despite the high sequence homology, a χ² test (Kumar and Bansal, 1998a) indicates that the differences between the two distributions are highly significant (χ² = 86.2). For a system with 19 degrees of freedom, such as the amino acid distribution, the χ² value must exceed 30.14 to reject, at the 95% level of confidence (P ≤ 0.05), the null hypothesis that the two distributions are similar. This result is further corroborated by the observation that the Hamming distance in 20 dimensional amino acid composition (%) space (Kumar and Bansal, 1998a) between thermophilic and mesophilic chains is large (8.1 distance units).
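The χ² comparison can be illustrated with a short sketch. It assumes a standard test of homogeneity on a 2 × 20 table of residue counts, which gives (2 − 1)(20 − 1) = 19 degrees of freedom; whether this is exactly the procedure of Kumar and Bansal (1998a) is an assumption. The function names are hypothetical, and only the critical value of 30.14 is taken from the text.

```python
CRITICAL_CHI2_19_DF_95 = 30.14  # critical value quoted in the text


def chi_square_homogeneity(counts_a, counts_b):
    """Chi-square statistic for two rows of category counts (2 x k table)."""
    if len(counts_a) != len(counts_b):
        raise ValueError("both rows must have the same number of categories")
    total_a = sum(counts_a)
    total_b = sum(counts_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each row must contain at least one observation")
    grand_total = total_a + total_b
    chi2 = 0.0
    for a, b in zip(counts_a, counts_b):
        column_total = a + b
        if column_total == 0:
            # A category absent from both rows contributes nothing
            # (and would reduce the degrees of freedom by one).
            continue
        expected_a = total_a * column_total / grand_total
        expected_b = total_b * column_total / grand_total
        chi2 += (a - expected_a) ** 2 / expected_a
        chi2 += (b - expected_b) ** 2 / expected_b
    return chi2


def rejects_null(chi2, critical=CRITICAL_CHI2_19_DF_95):
    """True if the statistic exceeds the critical value."""
    return chi2 > critical
```

With the reported statistic, `rejects_null(86.2)` is true, in line with the conclusion that the two compositions differ significantly.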


Table IV. Distribution of amino acid residues in the 18 non-redundant families of thermophilic and mesophilic proteins
 


Fig. 6. Bar diagram showing a comparison between amino acid compositions of thermophilic and mesophilic protein chains. For each residue indicated by single letter code on the x-axis, the white bar represents frequency of occurrence (y-axis) of the residue in mesophilic protein chains and the black bar represents the same in thermophilic protein chains. Change in proportion tests show that differences in frequencies of Cys, Ser, Arg and Tyr are significant at a 95% level of confidence.

 
Proline substitutions

It has been suggested that Pro has an increased occurrence in thermophilic proteins, especially in loops (Haney et al., 1997; Watanabe et al., 1997; Bogin et al., 1998). A total of 75 Pro substitutions are observed in loop regions of thermophilic and mesophilic chains. In 39 cases, the thermophilic chains contain a Pro residue instead of other residues found in their mesophilic homologs at equivalent loop positions. However, in 36 cases, another residue is present in the thermophilic chains instead of Pro in the mesophilic homologs. Thus, there is no consistent pattern for Pro substitutions in loops. In our database, the frequency of occurrence of Pro is the same (4.2%) in thermophilic and mesophilic proteins (Figure 6).
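One way to make the 39 versus 36 comparison concrete is an exact two-sided sign test, sketched below. This test is not used in the text; it is offered only as an illustration, under the assumption that each of the 75 substitutions is independent and equally likely to go in either direction under the null hypothesis. The function name is hypothetical.

```python
from math import comb


def two_sided_sign_test(successes, trials):
    """Exact two-sided binomial P-value for a fair-coin null (p = 0.5)."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    # Tail counts are kept as exact integers until the final division.
    lower_tail = sum(comb(trials, i) for i in range(0, successes + 1))
    upper_tail = sum(comb(trials, i) for i in range(successes, trials + 1))
    p_value = 2 * min(lower_tail, upper_tail) / 2 ** trials
    return min(1.0, p_value)


# 39 of the 75 loop substitutions introduce Pro in the thermophile.
p_pro = two_sided_sign_test(39, 75)
```

The resulting P-value is far above 0.05, which agrees with the statement that Pro substitutions in loops show no consistent direction.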

Preferred and avoided residues in thermophilic proteins

A change in proportion test (Kumar and Bansal, 1998a) is used to identify amino acids whose proportions change significantly, that is, by >2 standard deviations, between thermophilic and mesophilic chains. Changes in the proportions of Cys (0.6% in thermophilic and 1.0% in mesophilic chains), Arg (4.6% in thermophilic and 3.6% in mesophilic chains), Ser (4.0% in thermophilic and 5.5% in mesophilic chains) and Tyr (4.5% in thermophilic and 3.7% in mesophilic chains) are found to be significant (Figure 6).
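A change in proportion test of this kind can be sketched as a two-sample z statistic. The version below uses a pooled standard error, which is an assumption, since the exact form used by Kumar and Bansal (1998a) is not given here. The total residue counts are not stated in this section, so the value of 5000 residues per group is a hypothetical placeholder; only the Ser proportions come from the text.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """z statistic for the difference p1 - p2 using a pooled standard error."""
    if n1 <= 0 or n2 <= 0:
        raise ValueError("sample sizes must be positive")
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    if standard_error == 0:
        raise ValueError("standard error is zero; proportions are degenerate")
    return (p1 - p2) / standard_error


# Hypothetical group size; the real residue totals are not given here.
N_RESIDUES = 5000

# Ser: 4.0% in thermophilic chains, 5.5% in mesophilic chains.
z_ser = two_proportion_z(0.040, N_RESIDUES, 0.055, N_RESIDUES)
is_significant = abs(z_ser) > 2.0  # the ">2 standard deviations" criterion
```

Because the statistic scales with the square root of the sample size, the same difference in proportions may or may not pass the two standard deviation criterion depending on how many residues are in the database.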

Of the 20 amino acids, Asn, Gln, Met and Cys can be classified as thermolabile due to their tendency to undergo deamidation or oxidation at high temperatures (Russell et al., 1997). Table IV and Figure 6 indicate that the frequencies of occurrence for Gln (2.8% in thermophiles and 2.9% in mesophiles) and Met (2.3% in thermophiles and 2.4% in mesophiles) are similar. Cys (0.6% in thermophilic chains and 1.0% in mesophilic) and Asn (4.4% in thermophilic and 5.1% in mesophilic) change by appreciable amounts. However, only the change in the frequency of Cys is significant.

The above observations raise questions about the possible roles of Arg, Tyr and Ser, whose proportions change significantly. It has been suggested that thermophilic proteins have increased hydrogen bonding and salt bridge formation (Yip et al., 1995; Querol et al., 1996; Vogt and Argos, 1997; Vogt et al., 1997; Russell et al., 1997, 1998). Due to their large side chains, Arg and Tyr may be useful both in short range local interactions and in long range interactions. The guanidinium group in Arg can form salt bridges. On the other hand, due to its short side chain, Ser forms mostly local interactions (Jeffrey and Saenger, 1991). Interestingly, it has recently been observed that hot spots for binding in protein interfaces are also rich in Arg, Tyr and Trp (Clackson and Wells, 1995; Bogan and Thorn, 1998). Hence, it appears that in both binding and folding at high temperatures, Arg and Tyr play a similar role, contributing toward protein stability. Trp, however, occurs in similar proportions in thermophilic and mesophilic chains (Table IV and Figure 6). In contrast to Arg and Tyr, Trp is a hydrophobic residue with a bulky double ring side chain, usually occurring with low frequencies in proteins. Alternatively, it is possible that the absence of a noticeable trend for Trp, a rare residue, is due to its low counts in our sample.

Thermophilic and mesophilic α-helices

It has been suggested that thermophilic proteins have a higher helical content (Querol et al., 1996). In our database, we find that in nine out of the 18 families, thermophilic and mesophilic chains have similar values for the fraction of residues in helical conformation (fH), as identified using DSSP (Kabsch and Sander, 1983). However, on the whole, thermophilic proteins have a higher occurrence of residues in helical conformation. The value of fH is 32.0% for the thermophilic chains as compared with 25.4% for the mesophilic chains. α-Helices in the thermophilic and mesophilic proteins adopt similar overall geometries as characterized using HELANAL (Kumar and Bansal, 1996, 1998b).
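The quantity fH can be computed directly from a per-residue DSSP assignment string. The sketch below assumes that only the DSSP code `H` (α-helix) is counted and that the overall values are pooled over all residues rather than averaged per chain; both choices are assumptions, and the function names are hypothetical.

```python
def helix_fraction(dssp_string, helix_codes=("H",)):
    """Fraction of residues whose DSSP code is in helix_codes."""
    if not dssp_string:
        raise ValueError("DSSP string must not be empty")
    helical = sum(1 for code in dssp_string if code in helix_codes)
    return helical / len(dssp_string)


def pooled_helix_fraction(dssp_strings, helix_codes=("H",)):
    """Helical fraction over all residues of several chains taken together."""
    total = sum(len(s) for s in dssp_strings)
    if total == 0:
        raise ValueError("no residues supplied")
    helical = sum(
        1 for s in dssp_strings for code in s if code in helix_codes
    )
    return helical / total
```

Passing `helix_codes=("H", "G", "I")` would also count 3-10 and π-helices, which gives a broader definition of helical content than the one assumed above.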

Table V presents the amino acid distributions in α-helices of thermophilic and mesophilic chains. A χ² test shows that the amino acid distribution in α-helices of thermophilic proteins is significantly different from that in α-helices of mesophilic proteins. The Hamming distance (Kumar and Bansal, 1998a) between the two distributions is 15.1 distance units in the 20 dimensional amino acid composition space. The proportions of Cys (0.1% in thermophilic and 0.8% in mesophilic helices), His (2.0% in thermophilic and 3.3% in mesophilic helices) and Arg (5.5% in thermophilic and 3.9% in mesophilic helices) change significantly. Thermophilic helices favor Arg and avoid His and Cys as compared with mesophilic helices. A recent database analysis study on α-helices shows Arg to be a helix-favoring residue with its propensity to occur in the middle region of α-helices being 1.33, while Cys (propensity = 0.87 in the middle of α-helices) and His (propensity = 0.76 in the middle of α-helices) are helix-disfavoring residues (Kumar and Bansal, 1998a). Thermostability has also been attributed to enhanced secondary structure propensity (Querol et al., 1996). This might rationalize the increase in the proportion of Arg, a helix-favoring residue, in thermophilic protein helices, while the helix-disfavoring residues Cys and His decrease. A previous analysis of the composition of α-helices in thermophilic proteins (Warren and Petsko, 1995) has also noted a significant decrease in Cys and His. The proportion of Arg increases and that of Cys decreases significantly in the thermophilic proteins as a whole as well. Furthermore, proline occurs with a frequency of 0.7% in α-helices of thermophilic proteins as compared with 1.3% in α-helices of mesophilic proteins. Proline is the most avoided residue in the middle of α-helices (Kumar and Bansal, 1998a), since it may cause kinks (Woolfson and Williams, 1990; Kumar and Bansal, 1996, 1998a,b).
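The distance between two compositions can be sketched as follows. The code assumes that the Hamming distance in composition space is the sum of absolute differences between the percentage compositions of the 20 amino acids (a city-block distance); this reading of the definition in Kumar and Bansal (1998a) is an assumption, and the function names are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_percent(sequence):
    """Percentage of each of the 20 standard residues in a sequence."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for residue in sequence.upper():
        if residue in counts:  # non-standard symbols are ignored
            counts[residue] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return {aa: 100.0 * n / total for aa, n in counts.items()}


def city_block_distance(comp_a, comp_b):
    """Sum of absolute differences between two percentage compositions."""
    return sum(abs(comp_a[aa] - comp_b[aa]) for aa in AMINO_ACIDS)
```

To compare helices only, the sequences passed to `composition_percent` would be the concatenated helical segments of the thermophilic and of the mesophilic chains, respectively.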


Table V. Distribution of amino acid residues in the α-helices in thermophilic and mesophilic proteins
 
The sequence composition comparison between thermophiles and mesophiles suggests that thermophiles favor those factors that can enhance their stability and avoid those that can destabilize them. The lower occurrence of thermolabile residues in the thermophilic chains, along with the lower occurrence of Cys, His and Pro in thermophilic helices, illustrates a clear trend in this direction.


    Discussion and conclusions
 
In this extensive study we have examined structural and sequence factors involved in protein thermostability. Thermophilic proteins optimize their stabilities via different mechanisms. Sequence and structural factors, such as packing, oligomerization, insertions and deletions, proline substitutions, helical content, helical propensities, polar surface area, hydrogen bonds and salt bridges, have been proposed to contribute to the greater stability of thermophilic proteins. We have analyzed all these factors in a database of 18 thermophile–mesophile families. There are two major concerns in analyses such as the one presented here. First, protein stabilization strategies that may be observed in individual families may not show consistent trends across several families. Second, not all differences between the thermophiles and mesophiles may be attributable to protein thermostability. Some may be due to phylogenetic differences between the thermophiles and mesophiles. In the available data, we observe that no single factor proposed to contribute toward protein thermostability is 100% consistent in our set of proteins. It is particularly interesting to note that hydrophobicity, packing and fractional polar and non-polar surface areas show little quantitative difference between thermophiles and mesophiles. While insertions/deletions, oligomerization and proline substitutions can stabilize individual thermophilic proteins, they do not show consistent trends across the families. It is also possible that the observed differences are due to phylogenetic differences between thermophiles and mesophiles. It should also be mentioned that more than one factor may be responsible for the greater stability of the thermophilic protein in a given family.

The most consistent trend is shown by salt bridges and side chain–side chain hydrogen bonds. These increase in the majority of the thermophilic proteins. In recent years, the role of salt bridges in protein stability has been controversial (Hendsch and Tidor, 1994; Kumar and Nussinov, 1999). However, in the case of the thermophilic proteins, salt bridges have been shown to be stabilizing (Elcock, 1998; Xiao and Honig, 1999; Kumar et al., 2000). Recently, we have calculated the electrostatic strengths of salt bridges in the glutamate dehydrogenase family (Kumar et al., 2000). Network formation stabilizes individual salt bridges in Pyrococcus furiosus glutamate dehydrogenase (Kumar et al., 2000). Salt bridges are major contributors toward the thermostability of Pyrococcus furiosus glutamate dehydrogenase as compared with the mesophilic Clostridium symbiosum glutamate dehydrogenase (Yip et al., 1995). In a large database analysis study, we have observed that salt bridges with 'good geometries', such as those in the present study, make mostly, but not always, stabilizing electrostatic contributions to protein stability (Kumar and Nussinov, 1999). Thermophilic proteins are not only stable, but are also optimally active at high temperatures. An increase in the number of salt bridges and hydrogen bonds may rigidify a thermophilic protein and expose it to the danger of becoming inactive. Still, while a thermophilic protein may be rigid at room temperature, it is likely to be flexible at high temperatures (Jaenicke and Bohm, 1998). Recently, we have also observed that Pyrococcus furiosus glutamate dehydrogenase contains a greater number of salt bridges and their networks around the active site as compared with the mesophilic Clostridium symbiosum glutamate dehydrogenase. The salt bridges around the active site may help to keep the active site region together by opposing disorder due to greater atomic mobility at high temperatures (Kumar et al., 2000).

Examination of the sequences shows that despite high sequence homology, the differences in amino acid distributions in the thermophilic and mesophilic proteins are highly significant. While some of the differences in the amino acid distributions are likely to be the outcome of phylogenetic differences between thermophiles and mesophiles, others correlate with protein thermostability. For example, the proportions of the thermolabile amino acid Cys, and of Ser, which usually forms local interactions, decrease significantly, while those of Arg and Tyr, which are capable of both short range and long range interactions, increase significantly in the thermophilic proteins. The stability of the constituent α-helices also appears to contribute to protein thermal stability. Thermophilic proteins have a higher proportion of residues in helical conformation. The helix-favoring residue Arg occurs more frequently in α-helices of thermophilic proteins, whereas the helix-disfavoring residues Cys, His and Pro have lower frequencies of occurrence in thermophilic helices. Refraining from using some residues and opting for others in sequences of thermophilic proteins suggests a dual strategy employed by these proteins to enhance their stability. On the one hand, thermophilic proteins prefer residues with larger side chains that can form salt bridges, long range or local electrostatic and hydrophobic interactions, and which stabilize secondary structure elements. On the other hand, thermophilic proteins avoid thermolabile residues and residues that can destabilize secondary structure elements.

Our analysis shows that the organisms' living temperatures are not good descriptors of protein thermostability. Melting temperatures may be more appropriate to measure protein thermostability. When explored with respect to the melting temperatures, salt bridges appear to show a correlation with the Tm's. We note, however, that while high quality crystal structures are available, unfortunately, the Tm's have been determined only for a few of these proteins. Hence, currently we are unable to examine a correlation of salt bridges and the respective melting temperatures of the thermophiles in a statistically meaningful way. However, we observe that structural factors involved in the stability of the thermophilic proteins do not correlate with the living temperatures of their source organisms.

From the point of view of designing a thermophilic protein, this study suggests the inclusion of a larger proportion of salt bridges. Additionally, it suggests a larger fraction of residues in α-helical conformation, and a higher frequency of Arg, both to form salt bridges and to stabilize α-helices. It would be preferable to avoid Pro, Cys and His in α-helices, and to avoid thermolabile residues, particularly Cys.
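As an illustration of how the last suggestion might be checked on a candidate sequence, the sketch below lists Pro, Cys and His residues that fall in helical positions. It assumes a sequence and a per-residue DSSP string of equal length, with `H` marking α-helix; the function name and the default residue set are hypothetical choices made for this example, not a procedure from the study.

```python
def flag_helix_residues(sequence, dssp_string, avoid=("P", "C", "H"), helix_code="H"):
    """Return (position, residue) pairs for avoided residues found in helices.

    Positions are 1-based. `avoid` holds one-letter residue codes and
    `helix_code` is the secondary structure symbol that marks a helix.
    """
    if len(sequence) != len(dssp_string):
        raise ValueError("sequence and DSSP string must have the same length")
    flagged = []
    for index, (residue, structure) in enumerate(zip(sequence, dssp_string)):
        if structure == helix_code and residue in avoid:
            flagged.append((index + 1, residue))
    return flagged
```

Note that the one-letter code for His and the DSSP code for α-helix are both `H`; the two are kept in separate arguments so that they cannot be confused.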


    Acknowledgments
 
We thank Drs Buyong Ma and Neeti Sinha and, in particular, Dr Jacob V.Maizel for helpful discussions. The personnel at FCRDC are thanked for their assistance. The research of R.Nussinov in Israel has been supported in part by grant no. 95-00208 from BSF, Israel, by a grant from the Ministry of Science, by the Center of Excellence, administered by the Israel Academy of Sciences, by the Magnet grant, and by the Tel Aviv University Basic Research and Adams Brain Center grants. This project has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under contract No. NO1-CO-56000. The content of this publication does not necessarily reflect the view or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products or organization imply endorsement by the U.S. Government.


    Notes
 
4 To whom correspondence should be addressed. Email: ruthn@ncifcrf.gov


    References
 
Adams,M.W.W. and Kelly,R.M. (1995) Chem. Engng News, 73, 32–42.

Auerbach,G., Jacob,U., Grottinger,M., Schurig,M. and Jaenicke,R. (1997) Biol. Chem., 378, 327–329.

Bernstein,F., Koetzle,T., Williams,G., Meyer,E.J., Brice,M., Rodgers,J., Kennard,O., Shimanouchi,T. and Tasumi,M. (1977) J. Mol. Biol., 112, 535–542.

Bogan,A.A. and Thorn,K.S. (1998) J. Mol. Biol., 280, 1–9.

Bogin,O., Peretz,M., Hacham,Y., Korkhin,Y., Frolow,F., Kalb(Gilboa),A.J. and Burstein,Y. (1998) Protein Sci., 7, 1156–1163.

Clackson,T. and Wells,J.A. (1995) Science, 267, 383–386.

Daniel,R.M., Cowan,D.A., Morgan,H.W. and Curran,M.P. (1982) Biochem. J., 207, 641–644.

Davies,G.J., Gamblin,S.J., Littlechild,J.A. and Watson,H.C. (1993) Proteins, 15, 283–289.

Day,M.W., Hsu,B.T., Joshua-Tor,L., Park,J.B., Zhou,Z.H., Adams,M.W.W. and Rees,D.C. (1992) Protein Sci., 1, 1494–1507.

Dill,K.A. (1990) Biochemistry, 31, 7134–7155.

Elcock,A.H. (1998) J. Mol. Biol., 284, 489–502.

Fukuyama,K., Nagahara,Y., Tsukihara,T., Katsube,Y., Hase,T. and Matsubara,H. (1988) J. Mol. Biol., 199, 183–193.

Glaser,P., Presecan,E., Delepierre,M., Surewicz,W.K., Mantsch,H.H., Barzu,O. and Giles,A.M. (1992) Biochemistry, 31, 3038–3043.

Gomes,J., Gomes,I., Kreiner,W., Esterbauer,H., Sinner,M. and Steiner,W. (1993) J. Biotech., 30, 283–297.

Haney,P., Konisky,J., Koretke,K.K., Luthey-Schulten,Z. and Wolynes,P.G. (1997) Proteins, 28, 117–130.

Hendsch,Z.S. and Tidor,B. (1994) Protein Sci., 3, 211–226.

Hiller,R., Zhou,Z.H., Adams,M.W.W. and Englander,S.W. (1997) Proc. Natl Acad. Sci. USA, 94, 11329–11332.

Holland,D.R., Hausrath,A.C., Juers,D. and Matthews,B.W. (1995) Protein Sci., 4, 1955–1965.

Jaenicke,R. and Bohm,G. (1998) Curr. Opin. Struct. Biol., 8, 738–748.

Jeffrey,G.A. and Saenger,W. (1991) Hydrogen Bonding in Biological Structures. Springer-Verlag, Berlin.

Jiang,Y., Nock,S., Nesper,M., Sprinzl,M. and Sigler,P.B. (1996) Biochemistry, 35, 10269–10278.

Kabsch,W. and Sander,C. (1983) Biopolymers, 22, 2577–2637.

Karshikoff,A. and Ladenstein,R. (1998) Protein Engng, 1, 867–872.

Kelly,C.A., Nishiyama,M., Ohnishi,Y., Beppu,T. and Birktoft,J.J. (1993) Biochemistry, 32, 3913–3922.

Kjeldgaard,M., Nissen,P., Thirup,S. and Nyborg,J. (1993) Structure, 1, 35–50.

Klump,H.H., DiRuggiero,J., Kessel,M., Park,J.B., Adams,M.W.W. and Robb,F.T. (1992) J. Biol. Chem., 267, 22681–22685.

Knegtel,R.M.A., Wind,R.D., Rozeboom,H.J., Kalk,K.H., Buitelaar,R.M., Dijkhuizen,L. and Dijkstra,B.W. (1996) J. Mol. Biol., 256, 611–622.

Kumar,S. and Bansal,M. (1996) Biophys. J., 71, 1574–1586.

Kumar,S. and Bansal,M. (1998a) Proteins, 31, 460–476.

Kumar,S. and Bansal,M. (1998b) Biophys. J., 75, 1935–1944.

Kumar,S. and Nussinov,R. (1999) J. Mol. Biol., 293, 1241–1255.

Kumar,S., Ma,B., Tsai,C.J. and Nussinov,R. (2000) Proteins, 38, 368–383.

Ladenstein,R. and Antranikian,G. (1998) Adv. Biochem. Engng Biotechnol., 61, 37–85.

Lee,B.K. and Richards,F.M. (1971) J. Mol. Biol., 55, 379–400.

Matthews,B.W., Weaver,L.H. and Kester,W.H. (1974) J. Biol. Chem., 249, 8030–8044.

Obmolova,G., Kuranova,I. and Teplyakov,A. (1993) J. Mol. Biol., 232, 312–313.

Perutz,M. and Raidt,H. (1975) Nature, 255, 256–259.

Querol,E., Perez-Pons,J.A. and Mozo-Villarias,A. (1996) Protein Engng, 9, 256–271.

Russell,R.J.M., Ferguson,J.M.C., Hough,D.W., Danson,M.J. and Taylor,G.L. (1997) Biochemistry, 36, 9983–9994.

Russell,R.J.M., Gerike,U., Danson,M.J., Hough,D.W. and Taylor,G.L. (1998) Structure, 6, 351–361.

Rypniewski,W.R. and Evans,P.R. (1989) J. Mol. Biol., 207, 805–821.

Salminen,T., Teplyakov,A., Kankare,J., Cooperman,B.S., Lahti,R. and Goldman,A. (1996) Protein Sci., 5, 1014–1025.

Singleton,P. and Sainsbury,D. (1978) Dictionary of Microbiology and Molecular Biology, 2nd Edn. John Wiley, New York.

Tsai,C.J., Lin,S.L., Wolfson,H. and Nussinov,R. (1996) J. Mol. Biol., 260, 604–620.

Tsai,C.J. and Nussinov,R. (1997a) Protein Sci., 6, 24–42.

Tsai,C.J. and Nussinov,R. (1997b) Protein Sci., 6, 1426–1437.

Tsai,C.J., Xu,D. and Nussinov,R. (1997) Protein Sci., 6, 1–13.

Tsunasawa,S., Izu,Y., Miyagi,M. and Kato,I. (1997) J. Biochem., 122, 843–850.

Usher,K.C., De la Cruz,A.F.A., Dahlquist,F.A., Swanson,R.V., Simon,M.I. and Remington,S.J. (1998) Protein Sci., 7, 403–412.

Vogt,G. and Argos,P. (1997) Fold. Des., 2, S40–S46.

Vogt,G., Woell,S. and Argos,P. (1997) J. Mol. Biol., 269, 631–643.

Warren,G.L. and Petsko,G.A. (1995) Protein Engng, 8, 905–913.

Watanabe,K., Hata,Y., Kizaki,H., Katsube,Y. and Suzuki,Y. (1997) J. Mol. Biol., 269, 142–153.

Wigley,D.B., Gamblin,S.J., Turkenburg,J.P., Dodson,E.J., Piontek,K., Muirhead,H. and Holbrook,J.J. (1992) J. Mol. Biol., 223, 317–335.

Woolfson,D.N. and Williams,D.H. (1990) FEBS Lett., 277, 185–188.

Xiao,L. and Honig,B. (1999) J. Mol. Biol., 289, 1435–1444.

Yip,K.S.P. et al. (1995) Structure, 3, 1147–1158.

Yip,K.S.P., Britton,K.L., Stillman,T.J., Lebbink,J., De Vos,W.M., Robb,F.T., Vetriani,C., Maeder,D. and Rice,D.W. (1998) Eur. J. Biochem., 255, 336–346.

Zehfus,M.H. and Rose,G.D. (1986) Biochemistry, 25, 5759–5765.

Zuber,H. (1988) Biophys. Chem., 29, 171–179.

Received June 30, 1999; revised October 26, 1999; accepted November 29, 1999.