Structure-based substitutions for increased solubility of a designed protein

Leila K. Mosavi and Zheng-yu Peng1

Department of Biochemistry, University of Connecticut Health Center, Farmington, CT 06032, USA

1 To whom correspondence should be addressed. e-mail: peng@sun.uchc.edu


    Abstract
 
Manipulation of protein solubility is important for many aspects of protein design and engineering. Previously, we designed a series of consensus ankyrin repeat proteins containing one, two, three and four identical repeats (1ANK, 2ANK, 3ANK and 4ANK). These proteins, particularly 4ANK, are intended for use as a universal scaffold on which specific binding sites can be constructed. Despite being well folded and extremely stable, 4ANK is soluble only under acidic conditions. Designing interactions with naturally occurring proteins requires the designed protein to be soluble at physiological pH. Substitution of six leucines with arginine on exposed hydrophobic patches on the surface of 4ANK resulted in increased solubility over a large pH range. Study of the pH dependence of stability demonstrated that 4ANK is one of the most stable ankyrin repeat proteins known. In addition, analogous leucine to arginine substitutions on the surface of 2ANK allowed the partially folded protein to assume a fully folded conformation. Our studies indicate that replacement of surface-exposed hydrophobic residues with positively charged residues can significantly improve protein solubility at physiological pH.

Keywords: ankyrin repeat/arginine substitution/pH dependence of stability/protein design/protein solubility


    Introduction
 
The recent completion of whole genome sequences has resulted in a flood of new proteins and biochemical systems to be characterized. The concentration of a single protein inside the cell is usually fairly low. The transition from relatively dilute in vivo conditions to more concentrated conditions used for in vitro structure–function studies often results in aggregation, poor solubility and misfolding of proteins. Most biochemical studies, particularly quantitative binding assays, and also high-resolution structural studies, require the target protein to be highly soluble. Often, achieving sufficient protein solubility requires detailed and time-consuming screening of solution conditions such as buffer, pH and salt. To date, many of the biophysically well characterized proteins in the literature have been extensively studied, in part, due to their high solubility and ease of purification. There are many proteins, however, that are not amenable to biophysical studies because of low solubility and a tendency to misfold or aggregate. In addition, many industrial applications of enzymes, such as detergents, animal feed and textile products, can benefit from improved protein solubility [for a review, see Kirk et al. (2002)]. In the field of protein design, a detailed understanding of the generalized solution properties of proteins will help to produce well folded stable molecules.

Various strategies exist to alter the solubility of a protein. The most straightforward is to screen for conditions such as different buffers, salts or additives to increase solubility. This strategy has the advantage of using the native protein sequence; however, it is time consuming and the proper conditions for high solubility may never be found. Minor modifications to the protein include the addition of a fusion protein such as maltose-binding protein, protein G (B1 domain) or thioredoxin upstream or downstream of the target protein in order to improve solubility and expression (LaVallie and McCoy, 1995; Kapust and Waugh, 1999; Zhou et al., 2001). In addition, recent advances in peptide chemistry have allowed the synthesis of proteins glycosylated at specific amino acids, which results in altered solution behavior (Kochendoerfer et al., 2003). Alternatively, one can modify the amino acid sequence of the protein in order to achieve the desired properties. Directed molecular evolution combined with selection techniques using fusion proteins such as green fluorescent protein or chloramphenicol acetyltransferase as reporters for proper folding and solubility of the target protein has met with some success (Maxwell et al., 1999; Waldo et al., 1999). The drawback of such experiments is that there is no way to control the function of the target protein. Many groups have relied on rational design of substitutions to engineer increased protein solubility. Knowledge of the high-resolution structure and the functionally important residues is used to guide the design process. This requires an understanding of the role of various amino acids in protein folding, stability and solubility. Despite efforts aimed at correlating amino acid sequences with solubility (Wilkinson and Harrison, 1991), a comprehensive survey relating these properties has not been established.

Previously, we have designed a series of consensus ankyrin repeat proteins containing one, two, three and four identical repeats (Mosavi et al., 2002a). The ankyrin repeat is a 33-residue sequence motif that forms a helix–loop–helix followed by a β-hairpin positioned at a 90° angle (Sedgwick and Smerdon, 1999). The ankyrin repeat, like the tetratricopeptide repeat (TPR), ARM/HEAT repeat and leucine-rich repeat (LRR), is typically tandemly repeated in a protein and forms an elongated surface capable of binding other proteins. Having no specific binding partner or enzymatic function, these repeats are found in a variety of proteins including transcription factors, cell cycle regulators, cytoskeletal proteins and signal transduction molecules. Typically, a protein contains four to six ankyrin repeats, although as many as 29 repeats have been observed in a single protein (Walker et al., 2000).

We designed our consensus ankyrin repeat with the ultimate goal of using it as a generalized scaffold to engineer protein–protein interactions. Our four-repeat construct (4ANK) is folded, stable, soluble and monomeric only under acidic conditions. In order to use this consensus ankyrin repeat sequence as a building block to create specific binding surfaces, it is important that the protein be soluble and monomeric at physiological pH. Inspection of the surface of 4ANK revealed two large hydrophobic patches. In order to increase the number of charged residues on the protein surface, we replaced six exposed leucines with arginines on the N- and C-terminal repeats of 4ANK. The arginine-substituted proteins are designated ‘TALR’, which stands for ‘terminal all L to R’. The resulting protein, 4ANK TALR, is soluble over a large pH range. Furthermore, the leucine to arginine substitutions in the same relative positions on our consensus repeat protein containing two repeats (2ANK) caused the protein to adopt a monomeric and fully folded conformation. This work demonstrates that substitution of hydrophobic residues with arginine on the protein surface is an effective way to increase protein solubility under various conditions.


    Materials and methods
 
Gene construction

The genes for 4ANK and 2ANK were constructed as described previously (Mosavi et al., 2002a). Genes for 2ANK TALR and 4ANK TALR were constructed by ligating together separate DNA cassettes, each containing a 99-bp single ankyrin repeat gene with the desired sequence/substitutions. These cassettes were created using a recursive polymerase chain reaction (PCR) (Dillon and Rosen, 1990; Prodromou and Pearl, 1992; Chen et al., 1994) with four complementary oligonucleotides. Each cassette contains an MluI restriction site at the 5'-end and a BssHII restriction site at the 3'-end. When cleaved, these two restriction sites have complementary overhang sequences and can be ligated to one another, eliminating both cutting sites at the junction. This results in a gene containing two ankyrin repeats with a 5'-MluI site and a 3'-BssHII site but no restriction site in between the two cassettes. This construct can then be ligated to another cassette if desired, using the restriction site at either end. The 4ANK TALR gene consists of amino acid positions 1–33 in the first three repeats and positions 1–26 in the fourth repeat, and the 2ANK TALR gene consists of positions 1–33 of the first repeat followed by positions 1–26 of the second repeat. Both genes contain an additional C-terminal tyrosine for concentration determination. The ligation product corresponding to the expected size for the desired number of repeats was cloned into the pAED4 expression vector (Studier et al., 1990; Doering, 1992) downstream from a Trp-LE leader sequence, which forces the protein into inclusion bodies (Staley and Kim, 1994; Mosavi et al., 2002a). The genes encoding 2ANK TALR and 4ANK TALR were identified by restriction digestion and confirmed by automated DNA sequencing analysis.
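
The cassette-joining logic described above can be illustrated with a minimal top-strand sketch. It relies on the known recognition sites of MluI (A^CGCGT) and BssHII (G^CGCGC), which both leave a 5'-CGCG overhang. The 99-bp insert used here is a hypothetical placeholder, not the actual repeat gene, and the helper names `make_cassette` and `ligate` are invented for illustration.

```python
# Simplified top-strand model of joining two cassettes through compatible overhangs.
MLUI = "ACGCGT"    # MluI cuts A^CGCGT, leaving a 5'-CGCG overhang
BSSHII = "GCGCGC"  # BssHII cuts G^CGCGC, leaving the same 5'-CGCG overhang


def make_cassette(repeat_gene: str) -> str:
    """Flank a repeat gene with a 5' MluI site and a 3' BssHII site."""
    return MLUI + repeat_gene + BSSHII


def ligate(upstream: str, downstream: str) -> str:
    """Join the BssHII-cut 3' end of `upstream` to the MluI-cut 5' end of `downstream`."""
    left = upstream[: len(upstream) - 5]  # keep the G that precedes the BssHII cut
    right = downstream[1:]                # drop the A that precedes the MluI cut
    return left + right


# Hypothetical 99-bp placeholder insert (assumption, not the real sequence).
placeholder = "GCT" * 33
two_repeats = ligate(make_cassette(placeholder), make_cassette(placeholder))

# The six bases spanning the ligation junction.
junction = two_repeats[6 + 99 : 6 + 99 + 6]
print(junction, two_repeats.count(MLUI), two_repeats.count(BSSHII))
```

In this sketch the junction reads GCGCGT, which matches neither recognition site, so the joined product keeps only the outer 5'-MluI and 3'-BssHII sites and can be extended again from either end.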

Protein expression and purification

The proteins were expressed in Escherichia coli BL21(DE3) pLysS and purified as described previously (Mosavi et al., 2002a). The predicted molecular masses for 4ANK TALR and 2ANK TALR were 13 681 and 6674 Da and the measured molecular masses, as determined by electrospray mass spectrometry, were 13 683 and 6674 Da, respectively. Protein concentration was determined by measuring the absorbance at 280 nm in 6 M guanidine hydrochloride, assuming a molar extinction coefficient of 1280 for all constructs (Edelhoch, 1967).
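
As a worked illustration of the concentration determination, the Beer-Lambert relation c = A / (ε·l) can be sketched as follows. The units of M^-1 cm^-1 for the extinction coefficient, the 1 cm pathlength and the absorbance reading of 0.128 are assumptions chosen for the example; the function name is hypothetical.

```python
def concentration_from_a280(absorbance: float,
                            ext_coeff: float = 1280.0,
                            path_cm: float = 1.0) -> float:
    """Return molar concentration from A280 using Beer-Lambert: c = A / (epsilon * l).

    ext_coeff is assumed to be in M^-1 cm^-1 (single C-terminal tyrosine),
    path_cm is the cuvette pathlength in cm.
    """
    if ext_coeff <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and pathlength must be positive")
    return absorbance / (ext_coeff * path_cm)


# Hypothetical reading: A280 = 0.128 in a 1 cm cuvette.
conc_uM = concentration_from_a280(0.128) * 1e6
print(f"{conc_uM:.1f} uM")
```

With a single tyrosine as the only chromophore, the signal per micromolar of protein is small, so readings at the concentrations used in this work sit near the low end of the usable absorbance range.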

Circular dichroism (CD)

All CD experiments were carried out using a JASCO J-715 spectropolarimeter with a thermoelectric temperature controller. The CD spectra were recorded using 10 µM protein in a 0.1 cm pathlength cuvette with scan rate 10 nm/min, bandwidth 2 nm and response time 8 s at 20°C. Samples of 4ANK TALR contained 10 mM succinate for pH 4 and 5 or 10 mM phosphate for pH 6 and 7, 1 mM EDTA and 500 mM NaCl. Samples of 4ANK contained 10 mM succinate, pH 4.25, or phosphate, pH 7.0, 50 mM NaCl and 1 mM EDTA. Samples of 2ANK TALR contained 10 mM phosphate, pH 7, 1 mM EDTA and 500 mM Na2SO4. The results shown are the averages of three scans.

Urea and guanidine denaturation studies were performed in the buffers described above with various concentrations of urea or guanidine hydrochloride. The concentrations of the urea and guanidine hydrochloride stock solution were determined by using a refractometer (Shirley, 1995). Each data point was measured in a 0.1 cm pathlength cuvette, with resolution 2 s, response time 1 s and averaged for 2 min. The chemical denaturation data were fitted with a six-parameter equation assuming a two-state folding transition (Pace, 1986):
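
A standard six-parameter form of this two-state model, written here as an assumption consistent with the symbol definitions that follow (linear extrapolation of the unfolding free energy, with R the gas constant and T the absolute temperature), is:

```latex
\theta(x) = \frac{\theta_f(x) + \theta_u(x)\,\exp\!\left[-\dfrac{\Delta G - m x}{RT}\right]}
                 {1 + \exp\!\left[-\dfrac{\Delta G - m x}{RT}\right]}
```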

where θ represents the CD signal, x is the urea or guanidine concentration, ΔG is the free energy of unfolding and m is the cooperativity factor. $\theta_f(x) = \theta_{0f} + \theta_{1f}x$ and $\theta_u(x) = \theta_{0u} + \theta_{1u}x$ are the folded and unfolded baselines. For 4ANK TALR at pH 7.0, it was impossible to obtain an unfolded baseline. Therefore, an unfolded baseline was estimated using the pH 6.0 data in order to reduce the error of fitting.
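
The two-state model with linear baselines can be sketched in code as below. This is an illustrative implementation under the linear extrapolation assumption, not the authors' fitting routine; the function name, argument names and the parameter values in the usage lines are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def two_state_signal(x, dG, m, f0, f1, u0, u1, temp_k=293.15):
    """CD signal at denaturant concentration x for a two-state transition.

    dG: unfolding free energy in water (kcal/mol); m: cooperativity factor
    (kcal/mol per M); f0, f1 and u0, u1: intercept and slope of the folded
    and unfolded baselines; temp_k: absolute temperature (20 C by default).
    """
    folded = f0 + f1 * x
    unfolded = u0 + u1 * x
    # Equilibrium constant for unfolding at this denaturant concentration.
    k_unfold = math.exp(-(dG - m * x) / (R_KCAL * temp_k))
    return (folded + unfolded * k_unfold) / (1.0 + k_unfold)


# Hypothetical parameters: at the midpoint x = dG / m the two states are
# equally populated, so the signal is the mean of the two baselines.
mid_x = 6.0 / 2.0
signal_mid = two_state_signal(mid_x, dG=6.0, m=2.0, f0=-20.0, f1=0.0, u0=-2.0, u1=0.0)
print(signal_mid)
```

A function of this shape can be passed to a nonlinear least-squares routine to estimate the six parameters; when one baseline is poorly defined, as for the pH 7.0 data, the corresponding baseline parameters would have to be fixed rather than fitted.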

Thermal denaturation studies on 2ANK TALR were performed in the same buffer except that 1 M NaCl was used instead of 500 mM Na2SO4 in order to obtain a fully reversible melt. The protein concentration was 2 µM in a 1 cm pathlength cuvette. The CD signal was monitored at 222 nm, with bandwidth 2 nm, response time 8 s and a temperature increase of 25°C/h. The forward and reverse curves were superimposable within 1°C when the sample was heated to 56°C and then allowed to cool to 4°C at the same rate (25°C/h). The thermal denaturation data were fitted with the following equation (Becktel and Schellman, 1987; Privalov, 1989):
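
A commonly used form of this two-state thermal model, given here as an assumption consistent with the symbols defined next (a Gibbs–Helmholtz free energy with temperature-independent ΔCp, with ΔH taken at Tm, R the gas constant and T the absolute temperature), is:

```latex
\Delta G(T) = \Delta H\left(1 - \frac{T}{T_m}\right)
            - \Delta C_p\left[(T_m - T) + T\ln\frac{T}{T_m}\right]

\theta(T) = \frac{\theta_f(T) + \theta_u(T)\,\exp\!\left[-\dfrac{\Delta G(T)}{RT}\right]}
                 {1 + \exp\!\left[-\dfrac{\Delta G(T)}{RT}\right]}
```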

where θ represents the CD signal, Tm is the melting temperature, ΔH is the change in enthalpy and ΔCp is the change in specific heat upon unfolding. $\theta_f(T) = \theta_{0f} + \theta_{1f}T$ and $\theta_u(T) = \theta_{0u} + \theta_{1u}T$ are the folded and unfolded baselines, which were assumed to be linearly dependent on temperature.

Size-exclusion chromatography–multiple angle laser light scattering (SEC–MALLS)

SEC–MALLS studies were performed using a Varian HPLC system connected to a miniDAWN light scattering detector and an Optilab refractive index detector (both from Wyatt Technology). Buffers, described in the CD experiments, were filtered and degassed prior to use. The system was equilibrated overnight. Samples (200 µl) were injected at 100 µM concentration and eluted at a flow rate of 0.5 ml/min for 4ANK TALR on a Pharmacia Superdex 75 size-exclusion column and 0.3 ml/min for 2ANK TALR on a Pharmacia Superdex peptide column. Scattering data were collected and analyzed using ASTRA software (Wyatt Technology). The apparent molecular weights and polydispersity factors were calculated by using the Zimm method (Folta-Stogniew and Williams, 1999) at the peak width corresponding to the peak half-height.


    Results
 
Design of arginine substitutions

In order to choose the surface positions on 4ANK that were good candidates to improve solubility, we calculated the solvent accessibility of each residue in the 4ANK structure (PDB ID: 1N0R) using the program MOLMOL (Koradi et al., 1996). Residues L8, A9, V17, L20, L21 and A24 on the N-terminal repeat and L105, A109, L114, V117 and L121 on the C-terminal repeat are hydrophobic amino acids with solvent accessibility ranging from 10 to 43%. Mapping these residues on to the X-ray crystal structure of 4ANK showed that they formed two mostly contiguous patches on the surface of the protein (Figure 1). We chose to substitute the three leucine residues in the N-terminal repeat and the three leucine residues in the C-terminal repeat with arginine, reasoning that the arginine side-chain methylene groups could still form hydrophobic contacts with existing valine and alanine, while the positively charged guanidino group would increase the solubility and pI of the protein. 4ANK TALR has substitutions L8R, L20R and L21R in the N-terminal repeat and substitutions L105R, L114R and L121R in the C-terminal repeat; 2ANK TALR has L8R, L20R and L21R in the N-terminal repeat and L39R, L48R and L55R in the C-terminal repeat. These substitutions correspond to positions L8, L20, L21 and L6, L15, L22, respectively, in our designed consensus ankyrin repeat sequence, which follows the canonical numbering scheme (Mosavi et al., 2002a).
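
The correspondence between full-length residue numbers and positions within the 33-residue consensus repeat can be checked with a short sketch. It assumes that numbering starts at position 1 of the first repeat with no extra leading residues; the function name is hypothetical.

```python
REPEAT_LENGTH = 33  # residues per consensus ankyrin repeat


def to_repeat_position(residue_number: int) -> tuple[int, int]:
    """Map a full-length residue number to (repeat number, position in repeat), both 1-based."""
    if residue_number < 1:
        raise ValueError("residue numbers start at 1")
    repeat_index, offset = divmod(residue_number - 1, REPEAT_LENGTH)
    return repeat_index + 1, offset + 1


# C-terminal substitutions in 4ANK TALR and 2ANK TALR, as listed in the text.
for name, sites in {"4ANK TALR": [105, 114, 121], "2ANK TALR": [39, 48, 55]}.items():
    print(name, [to_repeat_position(n) for n in sites])
```

Under this assumption, residues 105, 114 and 121 fall in repeat 4 and residues 39, 48 and 55 in repeat 2, in both cases at consensus positions 6, 15 and 22.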



 
Fig. 1. Hydrophobic patches on the surface of 4ANK. The hydrophobic patch is colored blue with the amino acids identified on the N-terminal (a) and C-terminal (b) repeat. The orientation of the molecule is with the β-hairpins facing to the right in (a) and to the left in (b). The figure was made using the program MOLMOL (Koradi et al., 1996).

 
Secondary structure and oligomerization state of 4ANK TALR

Far-UV CD scans of 4ANK TALR at pH 4–7 were compared with CD scans of 4ANK at pH 4.25 and pH 7.0 (Figure 2a). With the exception of 4ANK at pH 7.0, the spectra show well defined minima at 222 and 208 nm, characteristic of a mostly α-helical, well folded protein. Both the shape and the amplitude of the spectra of 4ANK TALR are essentially invariant at different pH values and agree well with 4ANK at pH 4.25. In contrast, at pH 5 or higher, the CD spectra of 4ANK are distorted (Figure 2a), which may indicate the presence of soluble aggregates. The α-helical content of 4ANK TALR agrees well with that of 4ANK at pH 4.25, indicating that the arginine substitutions have not disrupted secondary structure elements in the protein.



 
Fig. 2. Solution characterization of 4ANK TALR at various pH values. (a) Far-UV CD spectra of 4ANK TALR at pH 4 (closed circles), 5 (closed squares), 6 (open circles) and 7 (open squares) and 4ANK at pH 4.25 (open diamonds) and 7.0 (closed diamonds) at 20°C. (b) Guanidine hydrochloride denaturation curves of 4ANK TALR at pH 4 (closed circles), 5 (closed squares), 6 (open circles) and 7 (open squares) monitored by CD at 222 nm at 20°C. The data were fitted with a two-state unfolding model (gray line).

 
The oligomerization state of 4ANK TALR was analyzed using SEC–MALLS studies (Table I; Figure 3). At 100 µM protein concentration and salt concentrations >500 mM, 4ANK TALR elutes as a single peak at pH 4–7 with an observed molecular weight within 5% of that expected for a monomer (13 681 Da). At low salt concentrations, 4ANK TALR tends to self-associate. This is different from the solution conditions necessary for 4ANK, which is folded and monomeric only from pH 3 to 4.25. At pH 5 and 6, 4ANK has very low solubility, and at pH 7, it forms soluble aggregates that bind to the filter used to remove particulate matter from the HPLC sample. Therefore, HPLC spectra of 4ANK at pH 7 showed little or no signal corresponding to the injected protein. Salt was only tolerated by 4ANK at more acidic pH. For example, 4ANK (50 µM) is monomeric at pH 4.0 in 500 mM NaCl but aggregates at salt concentrations >50 mM NaCl at pH 4.25 at 100 µM.


 
Table I. SEC–MALLS data for TALR mutants under various conditions
 


 
Fig. 3. Oligomerization state of 4ANK TALR shown by SEC–MALLS. Protein was injected at 100 µM concentration in 500 mM NaCl at pH 4 (a), pH 5 (b), pH 6 (c) and pH 7 (d). The light scattering signal (solid circles) shown is from the 138° angle detector and molar mass (open squares) was calculated using ASTRA software.

 
Stability of 4ANK TALR

Previously, 4ANK displayed an increase in the free energy of unfolding at higher pH, but this was difficult to analyze given the small pH range at which the protein was folded and soluble. Since 4ANK TALR is soluble at pH 4–7, we were able to characterize the pH dependence of the stability of this protein. Figure 2b shows the guanidine hydrochloride denaturation curves for 4ANK TALR at pH 4, 5, 6 and 7. At pH 7, even at the highest guanidine hydrochloride concentration, 4ANK TALR is not completely unfolded. The ΔG values of unfolding at pH 4, 5, 6 and 7 are 4.5, 7.6, 8.6 and 12.3 kcal/mol, respectively (Table II). At neutral pH, 4ANK TALR is one of the most stable ankyrin repeat proteins characterized to date. Ankyrin repeat proteins usually show an increase in ΔG with growing number of repeats. Among naturally occurring proteins, Drosophila Notch, with seven repeats, has a ΔG of 8.0 kcal/mol (Zweifel and Barrick, 2001), whereas rat myotrophin, a 3.5-repeat cardiomyogenic hormone, has a ΔG of 5.1 kcal/mol (Mosavi et al., 2002b). The designed proteins created by Kohl et al., with non-identical consensus ankyrin repeats, have ΔG values ranging from 9.5 to 11.4 kcal/mol for four repeats, 9.6 to 14.8 kcal/mol for five repeats and 21.1 kcal/mol for six repeats (Kohl et al., 2003).


 
Table II. Calculated ΔG(H2O) and m values for 4ANK TALR guanidine hydrochloride denaturation data at various pHs and 2ANK TALR urea denaturation
 
2ANK TALR

Previously, 2ANK appeared to be partially folded under acidic conditions (pH 3–5) and form soluble aggregates at higher pH (data not shown). With the TALR substitution, 2ANK TALR forms a fully folded monomeric protein. The far-UV CD spectra of 2ANK TALR show a folded α-helical protein with the percentage of α-helical structure in agreement with 4ANK TALR (Figure 4a). In addition, we used SEC–MALLS to verify that 2ANK TALR is a monomer in solution at 100 µM concentration (Table I). In the absence of salt, 2ANK TALR is partially structured (data not shown) and increasing concentrations of salt greatly enhance 2ANK TALR’s stability.



 
Fig. 4. Solution characterization of 2ANK TALR. (a) Far-UV CD spectrum at pH 7.0, 20°C. (b) Thermal denaturation at pH 7.0 monitored by far-UV CD at 222 nm.

 
Thermal and chemical denaturation studies suggest that 2ANK TALR has a unique tertiary structure. Thermal denaturation of 2ANK TALR monitored by far-UV CD at 222 nm in 1 M NaCl shows a reversible, cooperative transition with a calculated midpoint of transition (Tm) of 34.7°C (Figure 4b). We monitored the unfolding of 2ANK TALR by urea denaturation in 0.5 M Na2SO4 and obtained a ΔG value of 0.8 kcal/mol (Table II). This value probably represents the lower limit of unfolding free energy because the lack of folded baseline does not allow us to fit the baseline independently. Although the Tm and ΔG values for 2ANK TALR are relatively small, this protein is folded with a packed hydrophobic core as indicated by the cooperative thermal and chemical denaturation curves. Previously, a truncated version of the tumor suppressor p16INK4A, named p16C, was the only ankyrin repeat protein containing two repeats to be characterized biophysically with a ΔG value of 1.7 kcal/mol (Zhang and Peng, 2000).
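
To put the small ΔG of 2ANK TALR in perspective, the fraction of folded molecules implied by a two-state equilibrium can be estimated as in the sketch below. The temperature of 20°C and the function name are assumptions for illustration, and because the reported ΔG is a lower limit, the result should be read as a lower bound as well.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_folded(dG_unfold: float, temp_c: float = 20.0) -> float:
    """Fraction of folded molecules for a two-state equilibrium.

    dG_unfold is the unfolding free energy in kcal/mol (positive = stable).
    """
    rt = R_KCAL * (temp_c + 273.15)
    k_unfold = math.exp(-dG_unfold / rt)  # [unfolded] / [folded]
    return 1.0 / (1.0 + k_unfold)


f_2ank_talr = fraction_folded(0.8)
print(f"{f_2ank_talr:.2f}")
```

Under these assumptions a ΔG of 0.8 kcal/mol corresponds to roughly four out of five molecules being folded in the absence of denaturant, which is consistent with the missing folded baseline in the urea denaturation data.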


    Discussion
 
Engineering soluble monomeric proteins is important for the field of protein biochemistry and also for many industrial applications of proteins. The use of our consensus ankyrin repeat protein, 4ANK, as a general scaffold to design protein–protein interactions, requires that it be soluble and monomeric under physiological conditions. The motivation behind our TALR substitutions was to improve the solubility of 4ANK from acidic pH to more neutral conditions for such studies. We chose to replace six surface leucine residues with arginine to increase the net positive charge of the protein. Previously, several studies have demonstrated the role of arginine in protein solubility and stability at high pH. Comparisons of subtilisin-family proteases from alkaliphilic bacteria with their less alkaliphilic or mesophilic homologs show an increase in the number of arginine residues (Masui et al., 1994; Shirai et al., 1997). In addition, random mutagenesis and rational design efforts incorporating arginine residues to engineer greater alkaline stability and enzymatic activity have been successful (Cunningham and Wells, 1987; Turunen et al., 2002). In these studies, alkaline adaptation has been attributed to an increase in the isoelectric point of the proteins, either through a decrease in acidic residues such as glutamate or aspartate, replacement of lysines with arginines or an increase in the number of arginines. Arginine is also known to be effective in preventing dilution-induced and thermal unfolding-induced aggregation for a number of proteins (Shiraki et al., 2002). The results reported here agree with past experimental data. The TALR substitutions increase the theoretical pI of 4ANK from 9.3 to 10.5. 4ANK is soluble, folded and monomeric from pH 3 to 4.25 at 100 µM. However, 4ANK TALR is soluble, folded and monomeric from pH 4 to 7 at concentrations >100 µM. The leucine to arginine substitutions replace the hydrophobic patches on the N- and C-termini of 4ANK with charged residues, resulting in greater interaction with the solvent and reduction in self-association.

pH dependence of stability

4ANK TALR, with its increased solubility, allowed us to examine the pH dependence of stability. Previously, 4ANK showed an increase in stability with increase in pH, although this was difficult to characterize since the protein was soluble and monomeric over only a small pH range. 4ANK TALR shows an approximately linear pH dependence of stability ranging from 4.5 kcal/mol at pH 4 to 12.3 kcal/mol at pH 7 (Table II). The classical Linderstrom–Lang model postulates that a protein surface is an average of the total side-chain charges, which are all repulsive (Linderstrom-Lang, 1924). Therefore, the protein is most stable at its isoelectric point and the stability will decrease as the pH increases or decreases from the pI. Although it appears that the stability of 4ANK TALR increases as the theoretical pI is approached, we were unable to determine the pH value at which the free energy of unfolding is maximized. It has been shown that this classical model does not apply universally to all proteins. Several studies have shown the maximum stability to be significantly different from the protein’s isoelectric point. For example, analysis of variants of RNase SA in which the net charge of the protein was altered using lysine mutations shows that, while the minimum of protein solubility coincides with the pI, the pH of maximum stability does not (Shaw et al., 2001). Similar results were found in studies of T4 lysozyme and barnase (Becktel and Baase, 1987; Pace et al., 1992). For both proteins, the deviation in the pH of maximum stability was attributed to a single His residue in each protein with unusually high pKa values (Anderson et al., 1990; Pace et al., 1992). Indeed, the pH dependence of stability can be solely attributed to the presence of ionizable groups in the protein that have different pKa values in the folded state and unfolded state (Pace et al., 1990, 1992; Yang and Honig, 1993). For example, titratable groups that have a lower pKa in the native state than in the unfolded state will result in an increase in stability as pH increases.
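
The approximately linear pH dependence can be quantified with an ordinary least-squares line through the four reported ΔG values. This is an illustrative calculation rather than an analysis from the original work; the helper `linear_fit` is hypothetical, and the fit is not meant to be extrapolated beyond pH 4–7.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Reported unfolding free energies of 4ANK TALR (kcal/mol) at each pH.
ph_values = [4.0, 5.0, 6.0, 7.0]
dg_values = [4.5, 7.6, 8.6, 12.3]

slope, intercept = linear_fit(ph_values, dg_values)
print(f"slope = {slope:.2f} kcal/mol per pH unit, intercept = {intercept:.2f} kcal/mol")
```

The fitted slope is about 2.4 kcal/mol per pH unit over this range, which summarizes how steeply stability rises as the solution approaches neutral pH.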

Interestingly, the 4ANK TALR stability data display a pH-dependent decrease in the m values from pH 4 to 6 (the m values at pH 6 and 7 are identical within experimental error) (Table II). This relationship between pH and m values has been observed previously in other proteins such as RNase A and barnase (Pace et al., 1990, 1992). The m value represents the dependence of ΔG on denaturant concentration and a larger m value signifies a larger difference in solvent accessibility between the native and unfolded states of the protein. It is generally accepted that higher m values at low pH result from repulsion of positive charges in the unfolded state. Under acidic conditions, the unfolded state is more expanded and therefore more accessible to denaturant binding.

Salt dependence

The ionic strength of the solution affects 4ANK and 4ANK TALR differently. 4ANK, at pH 4.25, forms aggregates at NaCl concentrations >50 mM. In fact, the tolerance to salt in solution decreases as the pH of the solution increases. In the presence of >500 mM NaCl, 4ANK TALR is a monomer at pH 4–7. However, at 50 mM NaCl, 4ANK TALR forms higher order soluble aggregates. This difference in the solution salt requirement may correspond to a shift between salting-in and salting-out effects. The aggregation of 4ANK resulting from increasing salt concentration can be attributed to the Hofmeister effect [for reviews, see Baldwin (1996), Cacace et al. (1997) and Record et al. (1998)]. Preferential hydration of the protein, in response to ions in the solute, results in aggregation or salting-out by increasing the hydrophobic effect. If one considers the exposed hydrophobic patches on 4ANK, this explanation seems reasonable for the observed aggregation. 4ANK TALR, having six more positive charges, is a protein that may have preferential interaction with the ions in solution rather than preferential hydration (Wright et al., 2002). This can be considered as a ‘salting-in’ effect in which the solubility of the protein increases as the ionic strength of the solution increases. Although the Hofmeister effect has been discussed in detail, little is known about the balance between salting-in and salting-out effects (Arakawa and Timasheff, 1985). The mechanism of the observed self-association of 4ANK TALR at low ionic strength is not completely clear. A previous survey of protein–protein interface ‘hot spots’ found that the interfaces are enriched in arginine (Bogan and Thorn, 1998), suggesting that the presence of arginines on the surface of a protein may provoke intermolecular interactions.

Salt may also help to stabilize the overall structure of 4ANK TALR because of the high density of positive charges on the surface. It is likely that the proximity of the three arginines to one another on each terminus of the protein results in destabilizing repulsive interactions. The presence of ions in solution may partially screen out the unfavorable electrostatic effects, thereby stabilizing the protein. At pH 4.0, 4ANK TALR is marginally stable and therefore the destabilizing effects of the charge repulsion may be particularly prominent. At higher pH values, however, it is likely these repulsive effects contribute less to overall stability, since the ΔG of unfolding is much higher.

2ANK TALR

Substitution of six leucine residues with arginine on the surface of 2ANK results in a well folded, monomeric protein, 2ANK TALR, displaying cooperative thermal denaturation. Previously, 2ANK was partially folded at pH 5.0 and formed soluble aggregates with folded CD spectra at higher pH. 2ANK TALR, under low ionic strength conditions, was partially folded at all pH values tested. However, with the addition of 500 mM NaCl, 2ANK TALR was fully folded and monomeric at pH 7. A possible explanation for this change in the solution properties lies in the surface to volume ratio of hydrophobic residues. Owing to the identical nature of the repeats, 2ANK has approximately the same number of exposed and buried hydrophobic residues. This is in contrast to 4ANK, which has the same exposed hydrophobic patches on the surface of the protein, but contains three times more buried hydrophobic residues at the inter-repeat interfaces compared with 2ANK. Our previous studies involving the truncation of p16INK4A suggest that two repeats are the minimum folding unit of the ankyrin repeat motif (Zhang and Peng, 2000). Taken together, these results suggest that two repeats can fold only if a sufficient number of buried hydrophobic residues exist to form an energetically favorable core.

Effect of TALR mutations on stability

The TALR mutations, although located on the surface of the protein, have a profound effect on global stability. 4ANK is soluble and monomeric at pH 4 with 500 mM NaCl, allowing us to compare the guanidine hydrochloride denaturation curves of 4ANK and 4ANK TALR under these conditions. 4ANK has a ΔG of 10.1 ± 0.4 kcal/mol and m value of 2.7 ± 0.1 kcal/mol·M (data not shown). The stability of 4ANK is more than double that of 4ANK TALR (4.5 kcal/mol). This difference may be attributed to the destabilizing effect of unpaired ions on the surface of the protein that can result in desolvation (Yang and Honig, 1993). It is generally accepted that protein stability increases as the solution pH approaches the pI. Additional positively charged residues on the protein surface will increase the theoretical pI. Therefore, at the same pH, the difference between pH and pI for 4ANK is less than that of 4ANK TALR and consequently 4ANK should be a more stable molecule. Given these results, it is likely that if 4ANK were soluble and monomeric at pH 7, the stability would be higher than that of 4ANK TALR.
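
The comparison above can be made concrete with two small calculations: the stability ratio at pH 4 and the denaturation midpoint implied by the linear extrapolation relation ΔG(x) = ΔG(H2O) − m·x. The midpoint is a derived illustration that assumes the linear extrapolation model and ignores the reported uncertainties; the function name is hypothetical.

```python
def denaturation_midpoint(dG_water: float, m_value: float) -> float:
    """Denaturant concentration (M) at which dG(x) = dG_water - m * x reaches zero."""
    if m_value <= 0:
        raise ValueError("m value must be positive")
    return dG_water / m_value


DG_4ANK = 10.1       # kcal/mol, 4ANK at pH 4 with 500 mM NaCl
M_4ANK = 2.7         # kcal/(mol*M), 4ANK under the same conditions
DG_4ANK_TALR = 4.5   # kcal/mol, 4ANK TALR at pH 4

cm_4ank = denaturation_midpoint(DG_4ANK, M_4ANK)
stability_ratio = DG_4ANK / DG_4ANK_TALR
print(f"Cm(4ANK) ~ {cm_4ank:.2f} M, stability ratio ~ {stability_ratio:.2f}")
```

Under these assumptions the midpoint for 4ANK falls near 3.7 M guanidine hydrochloride, and the ratio of about 2.2 matches the statement that 4ANK is more than twice as stable as 4ANK TALR at pH 4.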

In summary, we have designed substitutions on the surface of 4ANK in order to increase the solubility at physiological pH. Substitutions of arginine for leucine on the exposed hydrophobic surfaces resulted in 4ANK TALR, which is soluble, folded and monomeric from pH 4 to 7. These substitutions altered the salt requirements of the protein in solution. We have characterized the pH dependence of stability for 4ANK TALR, which, although less stable than 4ANK under the same conditions, is one of the most stable ankyrin repeat proteins at pH 7. In addition, 2ANK, a partially folded molecule with a tendency to aggregate, formed a folded monomeric protein after introduction of the TALR substitutions. Taken together, our results indicate that substitution of exposed hydrophobic residues to arginine can be an effective strategy to increase protein solubility. This information is particularly important for protein design studies, in addition to contributing to our general understanding of the effect of mutations and charges on protein solubility and salt dependence.


    Acknowledgements
 
We thank Ping Bai for help with cloning, Jeff Hoch for helpful discussions and Tobin Cammett, Daniel Desrosiers and Zandra Sutter for critical reading of the manuscript. This work was supported by grant GMC-103045 from the American Cancer Society.


    References
 
Anderson,D.E., Becktel,W.J. and Dahlquist,F.W. (1990) Biochemistry, 29, 2403–2408.

Arakawa,T. and Timasheff,S.N. (1985) Methods Enzymol., 114, 49–77.

Baldwin,R.L. (1996) Biophys. J., 71, 2056–2063.

Becktel,W.J. and Baase,W.A. (1987) Biopolymers, 26, 619–623.

Becktel,W.J. and Schellman,J.A. (1987) Biopolymers, 26, 1859–1877.

Bogan,A.A. and Thorn,K.S. (1998) J. Mol. Biol., 280, 1–9.

Cacace,M.G., Landau,E.M. and Ramsden,J.J. (1997) Q. Rev. Biophys., 30, 241–277.

Chen,G.-q., Choi,I., Ramachandran,B. and Gouaux,J.E. (1994) J. Am. Chem. Soc., 116, 8799–8800.

Cunningham,B.C. and Wells,J.A. (1987) Protein Eng., 1, 319–325.

Dillon,P.J. and Rosen,C.A. (1990) Biotechniques, 9, 298–300.

Doering,D.S. (1992) PhD Thesis, Massachusetts Institute of Technology, Cambridge, MA.

Edelhoch,H. (1967) Biochemistry, 6, 1948–1954.

Folta-Stogniew,E. and Williams,K.R. (1999) J. Biomol. Technol., 10, 51–63.

Kapust,R.B. and Waugh,D.S. (1999) Protein Sci., 8, 1668–1674.

Kirk,O., Borchert,T.V. and Fuglsang,C.C. (2002) Curr. Opin. Biotechnol., 13, 345–351.

Kochendoerfer,G.G. et al. (2003) Science, 299, 884–887.

Kohl,A., Binz,H.K., Forrer,P., Stumpp,M.T., Pluckthun,A. and Grutter,M.G. (2003) Proc. Natl Acad. Sci. USA, 100, 1700–1705.

Koradi,R., Billeter,M. and Wuthrich,K. (1996) J. Mol. Graph., 14, 51–55.

LaVallie,E.R. and McCoy,J.M. (1995) Curr. Opin. Biotechnol., 6, 501–506.

Linderstrom-Lang,K. (1924) C. R. Trav. Lab. Carlsberg, Ser. Chim., 15, 29.

Masui,A., Fujiwara,N. and Imanaka,T. (1994) Appl. Environ. Microbiol., 60, 3579–3584.

Maxwell,K.L., Mittermaier,A.K., Forman-Kay,J.D. and Davidson,A.R. (1999) Protein Sci., 8, 1908–1911.

Mosavi,L.K., Minor,D.L.,Jr and Peng,Z.Y. (2002a) Proc. Natl Acad. Sci. USA, 99, 16029–16034.

Mosavi,L.K., Williams,S. and Peng,Z.Y. (2002b) J. Mol. Biol., 320, 165–170.

Pace,C.N. (1986) Methods Enzymol., 131, 266–280.

Pace,C.N., Laurents,D.V. and Thomson,J.A. (1990) Biochemistry, 29, 2564–2572.

Pace,C.N., Laurents,D.V. and Erickson,R.E. (1992) Biochemistry, 31, 2728–2734.

Privalov,P.L. (1989) Annu. Rev. Biophys. Biophys. Chem., 18, 47–69.

Prodromou,C. and Pearl,L.H. (1992) Protein Eng., 5, 827–829.

Record,M.T.,Jr, Zhang,W. and Anderson,C.F. (1998) Adv. Protein Chem., 51, 281–353.

Sedgwick,S.G. and Smerdon,S.J. (1999) Trends Biochem. Sci., 24, 311–316.

Shaw,K.L., Grimsley,G.R., Yakovlev,G.I., Makarov,A.A. and Pace,C.N. (2001) Protein Sci., 10, 1206–1215.

Shirai,T., Suzuki,A., Yamane,T., Ashida,T., Kobayashi,T., Hitomi,J. and Ito,S. (1997) Protein Eng., 10, 627–634.

Shiraki,K., Kudou,M., Fujiwara,S., Imanaka,T. and Takagi,M. (2002) J. Biochem. (Tokyo), 132, 591–595.

Shirley,B.A. (1995) Methods Mol. Biol., 40, 177–190.

Staley,J.P. and Kim,P.S. (1994) Protein Sci., 3, 1822–1832.

Studier,F.W., Rosenberg,A.H., Dunn,J.J. and Dubendorff,J.W. (1990) Methods Enzymol., 185, 60–89.

Turunen,O., Vuorio,M., Fenel,F. and Leisola,M. (2002) Protein Eng., 15, 141–145.

Waldo,G.S., Standish,B.M., Berendzen,J. and Terwilliger,T.C. (1999) Nat. Biotechnol., 17, 691–695.

Walker,R.G., Willingham,A.T. and Zuker,C.S. (2000) Science, 287, 2229–2234.

Wilkinson,D.L. and Harrison,R.G. (1991) Biotechnology (NY), 9, 443–448.

Wright,D.B., Banks,D.D., Lohman,J.R., Hilsenbeck,J.L. and Gloss,L.M. (2002) J. Mol. Biol., 323, 327–344.

Yang,A.S. and Honig,B. (1993) J. Mol. Biol., 231, 459–474.

Zhang,B. and Peng,Z. (2000) J. Mol. Biol., 299, 1121–1132.

Zhou,P., Lugovskoy,A.A. and Wagner,G. (2001) J. Biomol. NMR, 20, 11–14.

Zweifel,M.E. and Barrick,D. (2001) Biochemistry, 40, 14357–14367.

Received June 23, 2003; revised August 22, 2003; accepted August 28, 2003.




