A simple dual selection for functionally active mutants of Plasmodium falciparum dihydrofolate reductase with improved solubility

D. Japrung1,3, S. Chusacultanachai1, J. Yuvaniyama2,3, P. Wilairat3 and Y. Yuthavong1,4

1National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, Science Park, Pathumthani 12120, 2Center for Excellence in Protein Structure and Function and 3Department of Biochemistry, Faculty of Science, Mahidol University, Rama 6 Road, Bangkok 10400, Thailand

4 To whom correspondence should be addressed. E-mail: yongyuth@nstda.or.th


    Abstract
 
Sufficient solubility of the active protein in aqueous solution is a prerequisite for crystallization and other structural studies of proteins. In this study, we have developed a simple and effective in vivo screening system to select for functionally active proteins with increased solubility, using Plasmodium falciparum dihydrofolate reductase (pfDHFR), a well-known malarial drug target, as a model. Prior to the dual selection process, pfDHFR was fused to green fluorescent protein (GFP), which served as a reporter for solubility. The fusion gene was used as a template for construction of mutated DNA libraries of pfDHFR. Two amino acids with large hydrophobic side chains (Y35 and F37) located on the surface of pfDHFR were selected for site-specific mutagenesis. Additionally, the entire pfDHFR gene was randomly mutated using error-prone PCR. During the first step of the dual selection, mutants with functionally active pfDHFR were selected from the two libraries using a bacterial complementation assay. Fluorescence signals of active mutants were subsequently measured and five mutants with increased GFP signal, namely Y35Q + F37R, Y35L + F37T, Y35G + F37L and Y35L + F37R from the site-specific mutant library and K27E from the random mutant library, were recovered. The mutants were expressed, purified and characterized as monofunctional pfDHFR following excision of GFP. Our studies indicated that all mutant pfDHFRs exhibited kinetic properties similar to those of the wild-type protein. For comparison of protein solubility, the maximum concentrations of mutant enzymes prior to aggregation were determined. All mutants selected in this study exhibited 3- to 6-fold increases in protein solubility compared with the wild-type protein, which readily aggregated at 2 mg/ml. The dual selection system we have developed should be useful for engineering functionally active protein mutants with sufficient solubility for functional/structural studies and other applications.

Keywords: dihydrofolate reductase/error-prone PCR/green fluorescent protein/Plasmodium falciparum/protein solubility


    Introduction
 
Solubility of a protein is one of the key properties determining its choice in functional and structural studies, as well as in medical and industrial uses. These applications often require a much higher concentration of a protein than its natural concentration in the cell; as a result, many proteins, when over-expressed, fail to form native soluble structures, but rather aggregate into inclusion bodies. Although optimizing culture conditions occasionally reduces inclusion body formation and sometimes the target protein can be expressed and purified in a large amount, a purified protein with low intrinsic solubility still tends to form aggregates at the working concentration required for various biological studies.

Intrinsic solubility of a protein is governed by its amino acid sequence and folding pattern. Sufficient interactions of hydrophilic residues on the surface of the protein with water provide adequate solvation, which is a key element of high protein solubility in aqueous solution. In contrast, exposure of hydrophobic residues on the protein surface disrupts hydrogen-bonding networks with water molecules (D'Arcy et al., 1999), and consequently, random intermolecular interactions between surface hydrophobic patches can occur, resulting in inclusion body formation in cells or aggregation at low concentration of the purified protein.

One of the most common strategies to overcome this problem is to disrupt surface hydrophobic patches by replacing exposed hydrophobic amino acids on the protein surface with less hydrophobic ones. After the mutations are generated, each mutant is expressed and its function is assayed in order to identify those proteins with improved solubility (Dale et al., 1994; Nieba et al., 1997; Daujotyte et al., 2003; Mosavi and Peng, 2003). Alternatively, directed evolution of proteins with improved expression/folding can be accelerated by random mutagenesis, followed either by a genetic selection or by selection using a reporter protein with a simple assay. Examples of such reporters include chloramphenicol acetyltransferase (Maxwell et al., 1999) and green fluorescent protein (GFP) (Waldo et al., 1999 and reviewed in Waldo, 2003a,b). Generally, those selection systems allow the production of a large number of protein variants and the selection of soluble proteins in a high throughput manner; however, these systems are designed to monitor cellular expression and folding of proteins, but not the function of target proteins of interest, which can be disturbed during mutagenesis.

To address this issue, we have developed a simple and effective dual selection system that allows simultaneous selection of functionally active mutants with improved solubility from large libraries. The feasibility of our dual selection system was demonstrated by using Plasmodium falciparum dihydrofolate reductase (pfDHFR) as a model protein.

pfDHFR is a well-characterized target for antimalarial antifolate drugs, such as pyrimethamine (Pyr) and cycloguanil (Cyc). DHFR of Plasmodium and other protozoa is expressed on the same polypeptide chain as its accompanying enzyme, thymidylate synthase (TS) (Sirawaraporn et al., 1993). TS and DHFR catalyze consecutive reactions in the biosynthesis of thymidylate, which is required for DNA synthesis. In the past few years, considerable efforts have been made towards the understanding of drug-resistant mutations in the pfDHFR gene through structural studies of the target protein in order to gain insights into the design of more effective antimalarial compounds (Lemcke et al., 1999; Yuthavong et al., 2000; Delfino et al., 2002; Kamchonwongpaisan et al., 2004).

The monofunctional DHFR domain can fold and function independently without the TS domain. In previous studies, heterologous expression of pfDHFR either in Escherichia coli (Sirawaraporn et al., 1993; Sano et al., 1994) or yeast (Wooden et al., 1997; Ferlan et al., 2001) yielded a sufficient amount of protein for functional studies (Cowman et al., 1988; Peterson et al., 1988; Zolg et al., 1989; Hyde, 1990; Foote et al., 1990; Peterson et al., 1990; Basco et al., 1995) and for drug testing (Hall et al., 1991; Sirawaraporn et al., 1993; Sano et al., 1994). Unfortunately, the purified pfDHFR domain exhibits low solubility with a tendency to aggregate at a low concentration. This makes it difficult to use the purified pfDHFR in structural and other studies requiring high protein concentration. In sharp contrast to the monofunctional enzyme, the full-length pfDHFR-TS protein suffers from a markedly low level of expression. However, it exhibits sufficient solubility, allowing successful purification and crystallization using a large culture volume to compensate for the very low protein yield (Yuvaniyama et al., 2003; Chitnumsub et al., 2004).

Based on amino acid sequence alignment and crystal structures of pfDHFR-TS (Yuvaniyama et al., 2003), the pfDHFR domain is shown to contain a unique N-terminal protrusion of six amino acids and two additional inserts with respect to DHFR of human and other organisms, designated Insert-I (residues 20–36) and Insert-II (residues 64–99). In the crystal structures of pfDHFR-TS, the N-terminal region forms a short α-helix that interacts with the surface of the DHFR domain and anchors the Insert-II helix to the protein core structure, while Insert-I is involved in inter-domain interaction with the TS unit. Upon deletion of the TS domain, those inter-domain interactions are inevitably abolished, exposing the remaining interaction partners. When two or more molecules of monofunctional pfDHFR are brought into close proximity, such as in a concentrated protein solution, it is possible that these remaining interaction surfaces with exposed hydrophobic amino acids make random contacts, leading to aggregation of pfDHFR mono-domains. In our previous attempts to engineer monofunctional pfDHFR with improved solubility, N-terminal truncation and Insert-I deletion mutants were constructed. Unfortunately, DHFR activity of these mutants was completely abolished (Wattanarangsan et al., 2003).

In this study, we have built surface electrostatic, hydrophobicity and accessibility maps of the pfDHFR monodomain based on the structure of wild-type pfDHFR-TS, and located two amino acids with large hydrophobic side chains, tyrosine (Y) 35 and phenylalanine (F) 37, in the hydrophobic patches at the stem of Insert-I (Figure 1). Using degenerate oligonucleotides Pfu (DOP) mutagenesis (Chusacultanachai et al., 1999), site-specific saturated mutagenesis of these two amino acids was performed simultaneously and a library carrying different mutations at these two positions was generated. In addition to the site-specific mutagenesis, a random library of the pfDHFR gene was also constructed using error-prone PCR. In both libraries, the pfDHFR gene was fused with the gene encoding GFP, which was used as a reporter in the subsequent selection step.



Fig. 1. Structure and surface properties of the pfDHFR domain. (A) Overall structure of pfDHFR-TS showing that dimerization of the bifunctional protein is formed solely by the pfTS domains (blue and light blue). Attachment of the pfDHFR domains (green and light green) to the pfTS domains is facilitated by interactions from Insert-I (orange). (B) Drawing of the pfDHFR domain (residues 1–231) showing positions of the three mutated residues in Insert-I (orange): K27 (magenta), Y35 and F37 (red). (C) The pfDHFR domain with coloring scheme from hydrophobic (yellow) to hydrophilic (green) amino acids. Surface with brighter colors represents the interacting area with the junction region and the pfTS domain. (D) The pfDHFR domain drawn with surface electrostatic potentials calculated at 300 K, coloring from negative (≤–0.5 V, red) to neutral (white) to positive (≥0.5 V, blue) values. All figures were generated with the program MolMol (Koradi et al., 1996).

 
To enable selection of functional pfDHFR mutants with improved solubility from these libraries, a simple two-step selection was employed. First, the functional mutants were selected from the libraries using a bacterial complementation assay, which is based on the ability of the pfDHFR mutants to rescue growth and survival of DHFR-deficient E. coli cells. Under the condition in which the bacterial DHFR was selectively inhibited, only cells expressing functional pfDHFR could survive on the selective media plates. The surviving cells were harvested and fluorescent signals of the mutants were monitored using either a fluorescence-activated cell sorter (FACS) or a fluorometer, in order to select those mutants with improved solubility.


    Materials and methods
 
Molecular cloning of pfDHFR–GFP

Cloning of pfDHFR–GFP, using pET17b-pfDHFR carrying a synthetic pfDHFR gene (Sirawaraporn et al., 1993), was carried out in three steps. Firstly, the stop codon of the pfDHFR gene was replaced with an NheI site by QuikChange site-directed mutagenesis (Stratagene), employing two complementary primers (5'-AAGAAGACCAACAACGCTACAAGCCGGCCGGCTT-3' and 5'-AAGCTTGCGGCCGGCTAGCGTTGTTGGTGTTCTT-3'), resulting in the pET17b-pfDHFR-Nhe plasmid. Secondly, a short linker encoding 12 amino acids, AlaSerAlaGlySerAlaAlaGlySerGlyGlySer (Waldo et al., 1999), was inserted at the 3' end of the pfDHFR gene. Two complementary oligonucleotides (Linker 5', 5'-CTAGCGCTGGCTCCGCTGCTGGTTCTGGCG-3' and Linker 3', 5'-GATCCGCCAGAACCAGCAGCGGAGCCAGCG-3') encoding the designated amino acid sequence were used. When annealed, they form a double-stranded fragment with NheI and BamHI sticky overhangs at the 5' and 3' ends, respectively. The annealed oligonucleotides were ligated to the 3' end of the pfDHFR gene in pET17b-pfDHFR-Nhe at the NheI and BamHI sites. The ligation products were then transformed into E. coli strain DH5α and plated on LB agar plates containing 100 µg/ml of ampicillin. After overnight incubation at 37°C, colonies were picked and plasmid DNA was extracted. Finally, the GFP gene (720 bp), amplified from the p159ngfp plasmid (Matthysse et al., 1996), was cloned at the 3' end of the linker. The amplified product was digested with restriction endonucleases to generate a BamHI site at the 5' end and an EcoRI site at the 3' end, and subsequently ligated into the plasmid encoding the pfDHFR-Linker cut with the same restriction enzymes. The positive clone carrying the pfDHFR–GFP gene was selected and verified by DNA sequencing.
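
The linker design can be checked with a short script. The sketch below is illustrative only and is not part of the original protocol: it takes the two linker oligonucleotides exactly as printed above, confirms that they anneal to leave CTAG (NheI-compatible) and GATC (BamHI-compatible) 5' overhangs, and translates the ligated junction. The flanking bases restored on ligation (a leading G from the NheI site and a trailing GATCC from the BamHI site) and the reading frame starting at the first base of the NheI site are assumptions made for this example.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons of seq, reading from the first base."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Oligonucleotides as given in the text (5' to 3').
LINKER_TOP = "CTAGCGCTGGCTCCGCTGCTGGTTCTGGCG"      # "Linker 5'"
LINKER_BOTTOM = "GATCCGCCAGAACCAGCAGCGGAGCCAGCG"   # "Linker 3'"

# The first four bases of each oligo stay single-stranded after annealing.
top_overhang = LINKER_TOP[:4]
bottom_overhang = LINKER_BOTTOM[:4]
strands_pair = reverse_complement(LINKER_BOTTOM)[:-4] == LINKER_TOP[4:]

# Assumed ligated top strand: G (rest of NheI site) + linker + GATCC (BamHI site).
ligated = "G" + LINKER_TOP + "GATCC"
linker_peptide = translate(ligated)

print(strands_pair, top_overhang, bottom_overhang, linker_peptide)
```

Under these assumptions the translated junction is 12 residues long, so it can be compared directly with the linker sequence quoted in the text.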

Site-specific saturated mutagenesis of residues 35 and 37 of pfDHFR

Saturated mutagenesis of residues 35 and 37 of pfDHFR was simultaneously performed by DOP mutagenesis (Chusacultanachai et al., 1999). In brief, the reaction exploited two oligonucleotide primers, 5'-AAACGAAGTCTTCAACAAC(NNN)ACC(NNN)CGCGGTTCTGGGCAAC-3' and 5'-GTTGTTCAGACCGCG(NNN)GGT(NNN)GTTGTTGAAGACTTCCGTTT-3', in which codons 35 and 37 were doped with an equal ratio of the four nucleotides, A, G, C and T (designated as N), during the oligonucleotide synthesis. In the presence of the pET17b-pfDHFR–GFP template, the primers were integrated into newly synthesized plasmid DNA by a primer extension reaction using Pfu DNA polymerase. After template DNA was eliminated by DpnI digestion, the newly synthesized plasmid DNA library was transformed into the E. coli strain BL-21(DE3) pLysS by electroporation and the cells were plated on LB agar plates containing 100 µg/ml of ampicillin.

Random mutagenesis of pfDHFR using error-prone PCR

The pfDHFR–GFP gene was amplified in error-prone PCR (Chusacultanachai et al., 2002) using two oligonucleotide primers, 5'-GAAGGAGATATACATATGATGCAACAG-3' and 5'-AGCGGAGCCAGCGCTAGC-3'. The amplified PCR product (~700 bp) was purified through a filter membrane (0.025 µm; Millipore), digested with NdeI and NheI at 37°C for 16 h and subsequently ligated into the corresponding sites of pET17b pfDHFR–GFP. The ligation product was transformed into E. coli DH5α and cells were plated on Luria–Bertani (LB) agar plates containing 100 µg/ml of ampicillin. All colonies on the plate were pooled. Plasmid DNA was extracted and used as a DNA library to transform E. coli strain BL-21(DE3) pLysS.

Screening of active pfDHFR by bacterial complementation

Colonies of E. coli BL-21(DE3) pLysS carrying either site-specific or random libraries of pfDHFR were harvested and plated onto minimal media (MM) plates containing 100 µg/ml of ampicillin, 35 µg/ml of chloramphenicol and 4 µM trimethoprim (TMP), which selectively inhibits endogenous DHFR of bacterial host cells (Chusacultanachai et al., 2002). Surviving colonies expressing active mutant pfDHFRs were harvested and subjected to further selection.

Selection of pfDHFR–GFP with high GFP signal

Approximately 400 colonies of the site-specific library were randomly selected from an MM plate containing TMP and cultured in 96-well plates. Each well contained 200 µl of LB broth supplemented with 100 µg/ml of ampicillin, 34 µg/ml of chloramphenicol and 0.4 mM isopropyl β-D-thiogalactopyranoside (IPTG). After production of the protein was induced for 4–6 h by IPTG, the fluorescence signal of each clone was measured using a CytoFluor Multi-well series 4000 plate reader (PerSeptive Biosystems) (excitation at 488 nm, emission at 530 nm). The fluorescence signal was normalized by dividing it by the cell density measured at OD595. Clones with a GFP signal higher than that of the wild-type pfDHFR–GFP were selected. Plasmid DNA from individual clones was isolated and sequenced.
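
The normalization and selection rule described above can be sketched as follows. This is a minimal illustration rather than the software used in the study: the function names, the dictionary layout and the example readings are all hypothetical, and the only logic taken from the text is "fluorescence divided by OD595, keep clones brighter than the wild-type fusion".

```python
def normalized_fluorescence(fluorescence: float, od595: float) -> float:
    """Fluorescence per unit of cell density (fluorescence / OD595)."""
    if od595 <= 0:
        raise ValueError("OD595 must be positive")
    return fluorescence / od595


def select_brighter_clones(readings: dict, wild_type_key: str = "WT") -> dict:
    """Return {clone: fold over wild type} for clones brighter than wild type.

    readings maps a clone name to a (fluorescence, OD595) tuple.
    """
    wt_signal = normalized_fluorescence(*readings[wild_type_key])
    selected = {}
    for name, (fluorescence, od595) in readings.items():
        if name == wild_type_key:
            continue
        fold = normalized_fluorescence(fluorescence, od595) / wt_signal
        if fold > 1.0:
            selected[name] = fold
    return selected


# Hypothetical plate-reader values, for illustration only.
example_readings = {
    "WT": (1000.0, 0.50),
    "clone_A": (3000.0, 0.75),
    "clone_B": (900.0, 0.50),
}
selected = select_brighter_clones(example_readings)
print(selected)
```

Dividing by OD595 before comparing keeps a dense but dim culture from outranking a sparse but bright one.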

All colonies from random mutagenesis that grew on MM plates were pooled and grown in LB media containing 100 µg/ml of ampicillin and 34 µg/ml of chloramphenicol. The bacterial culture was allowed to grow until the OD595 reached 0.6–0.8, and then the expression of pfDHFR was induced by adding 0.4 mM IPTG. After 4–6 h of incubation, the cells (10^5) were subjected to FACS analysis (Vantage model, Becton Dickinson). The GFP signal of individual cells was detected at 488 nm excitation and 530 nm emission. Cells within the highest 10% of GFP signal were sorted and collected. The selected cells were plated on LB agar plates containing 100 µg/ml of ampicillin and 34 µg/ml of chloramphenicol, and the mutations were verified by sequencing of plasmid DNA.
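
The "highest 10%" sorting criterion amounts to a rank-based gate on the per-cell fluorescence values. The sketch below shows one way to compute such a gate; it is an assumption-laden illustration (the cytometer's own gating software was presumably used in practice), and the function name and synthetic signals are hypothetical.

```python
def top_fraction_gate(signals: list, fraction: float = 0.10) -> float:
    """Return the lowest signal that still falls within the top fraction of cells."""
    if not signals:
        raise ValueError("no cells to gate")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(signals, reverse=True)
    n_keep = max(1, round(len(ranked) * fraction))
    return ranked[n_keep - 1]


# Synthetic per-cell signals (arbitrary units), for illustration only.
cell_signals = list(range(1, 101))
threshold = top_fraction_gate(cell_signals, fraction=0.10)
sorted_cells = [s for s in cell_signals if s >= threshold]
print(threshold, len(sorted_cells))
```

Because the gate is defined by rank rather than by an absolute intensity, it always recovers cells, including wild-type-like ones when few improved variants are present, so the sorted clones still need to be sequenced and re-measured individually.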

Construction of pfDHFR monofunctional domain using pfDHFR–GFP as a template

To assess the solubility and kinetic properties of the selected mutants, the mutants were cloned and expressed as monofunctional proteins without the GFP partner. Construction of the monofunctional pfDHFR mutants was accomplished by replacing the NheI site at the 3' end of pfDHFR of the fusion gene with two stop codons. Two oligonucleotide primers (5'-ATCTACAAGAAGACCAACAACTAGTAGGCTGGCTCCGCTGCTGGT-3' and 5'-ACCAGCAGCGGAGCCAGCCTACTAGTTGTTGGTCTTCTTGTAAT-3') were used in the QuikChange reaction, with plasmids carrying the selected pfDHFR–GFP mutant genes as templates. After template DNA was eliminated by DpnI digestion, the QuikChange product was transformed into E. coli DH5α and plated on LB agar plates containing 100 µg/ml of ampicillin. Plasmid DNA was isolated and the presence of the stop codons in a positive clone was verified by DNA sequencing.

pfDHFR expression and purification

Single colonies of E. coli BL-21(DE3) pLysS harboring the recombinant pET-pfDHFR plasmid or pET-pfDHFR–GFP plasmid were cultured in 120 ml of LB broth containing 100 µg/ml of ampicillin and 35 µg/ml of chloramphenicol for 3 h at 37°C with shaking at 250 r.p.m. The cells were then added to 12 l of LB broth containing 100 µg/ml of ampicillin and 34 µg/ml of chloramphenicol and the cultures were incubated at 37°C with shaking until the OD595 reached 0.6. IPTG was added to induce DHFR or DHFR–GFP production and the cells were incubated at 18°C for 24 h with shaking before being harvested by sedimentation at ~7000 g.

The cell suspensions were subjected to disruption by a French press (Vibra cell™, Sonics & Materials Inc.) at 1500 p.s.i. or sonication for two cycles with a pulse cycle of 5 s on, 9 s off, amplitude of 35% and interval time of 30–60 s between cycles. The cell lysates were centrifuged at ~26 000 g for 1 h. Proteins were purified by affinity chromatography using a methotrexate–Sepharose CL-6B column (Sirawaraporn et al., 1993). Protein concentrations were measured by the Bradford method (Lowry et al., 1951).

Kinetic studies

Activity of DHFR was determined spectrophotometrically by measuring the rate of decrease in absorbance at 340 nm as reported previously (Sirawaraporn et al., 1993). Enzyme activity was calculated using ε340 of 12 300 M⁻¹ cm⁻¹. One unit of activity is defined as the amount of enzyme that catalyzes the reduction of 1 µmol of the substrate per minute at 25°C. The standard assay reaction (1 ml) was performed in a 1 cm path length cuvette using 50 mM TES, pH 5.0, 75 mM 2-mercaptoethanol, 100 µM dihydrofolate (DHF), 100 µM NADPH, 1 mg/ml of BSA and ~0.01 U of the enzyme. The reaction was initiated by adding the enzyme. The blank reaction contained all components except the enzyme. To determine the Km value for DHF, the enzyme activity was assayed with DHF in the concentration range of 0.8–100 µM, while keeping the concentration of NADPH constant at 100 µM. Likewise, the Km value for NADPH was determined by varying NADPH concentrations from 0.8 to 100 µM while keeping the concentration of DHF constant at 100 µM. In each assay ~0.001–0.005 U of the enzyme was used.
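
The conversion from an absorbance slope to enzyme units follows from the Beer–Lambert law. The sketch below is an illustration under the stated assay conditions (ε340 of 12 300 M⁻¹ cm⁻¹, 1 ml reaction, 1 cm path length); the function name and the example slope are hypothetical, and the slope is entered as the magnitude of the blank-corrected decrease in A340 per minute.

```python
EPSILON_340 = 12_300.0  # M^-1 cm^-1, extinction coefficient quoted in the text


def dhfr_units(delta_a340_per_min: float,
               volume_ml: float = 1.0,
               path_cm: float = 1.0) -> float:
    """Convert an A340 decrease per minute into units (µmol substrate per min)."""
    if delta_a340_per_min < 0:
        raise ValueError("pass the magnitude of the absorbance decrease")
    molar_per_min = delta_a340_per_min / (EPSILON_340 * path_cm)  # mol/l per min
    micromolar_per_min = molar_per_min * 1e6                      # µmol/l per min
    return micromolar_per_min * (volume_ml / 1000.0)              # µmol per min


# Hypothetical slope: a decrease of 0.123 absorbance units per minute.
units = dhfr_units(0.123)
print(f"{units:.4f} U")
```

With these numbers a slope of 0.123 per minute corresponds to about 0.01 U, the amount of enzyme used in the standard assay.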

Measurement of enzyme solubility

In this study, protein solubility is defined as the maximum concentration of the protein prior to aggregation. To measure the solubility of the wild-type and mutant pfDHFR, 100 µl aliquots of the purified enzyme of increasing concentrations were prepared and transferred to 96-well plates. To detect protein aggregation, the optical density of the solution was measured at 595 nm using a microplate spectrophotometer (Labsystems iEMS Reader MS). Additionally, aggregate formation in each solution was observed under a stereomicroscope (Olympus model SZ4045 TR).


    Results
 
Identification of hydrophobic amino acids on the surface of the pfDHFR molecule

The crystal structure of wild-type pfDHFR-TS (PDB ID 1J3I, Yuvaniyama et al., 2003) reveals a dimeric assembly of the functional enzymes, in which the TS domains form the dimer interface with the N-terminal pfDHFR domains positioned further apart (Figure 1). This full-length pfDHFR-TS construct possesses a junction region of 89 amino acids linking the N-terminal pfDHFR and the C-terminal pfTS domains. The junction region contains a flexible part of unknown function that is not completely visible in the structure, leading to ambiguity in the determination of monomeric domain connectivity. Nevertheless, the interacting surface of a pfDHFR domain can be identified (Figure 1C).

It has been deduced that the aggregation of monomeric pfDHFR is mainly due to hydrophobic interaction, as aggregation is induced by increasing ionic strength but is suppressed by addition of glycerol (data not shown). We hypothesized that amino acid residues involved in such aggregation would be located on or near the surface of interaction uncovered by truncation of the junction region and pfTS, assuming that the native pfDHFR folding was preserved. Based on the published hydropathy index (Engelman et al., 1986), hydrophobic amino acids on the surface of the pfDHFR domain, especially on the interacting surface, were distinguished from polar amino acids using different colors (Figure 1C). In addition, surface electrostatic potential (Figure 1D) was taken into account in order to identify hydrophobic residues with neutral electrostatic potential, which could be responsible for the aggregation. At least five surface patches of such candidates could be located. However, in order to narrow the scope of our study, only two amino acids, Y35 and F37, positioned at the stem of Insert-I were chosen for site-directed mutagenesis (Figure 1) because this feature is absent in all known monofunctional, soluble DHFR from other organisms. Nevertheless, random mutagenesis was also employed in a parallel study, so that the protein engineering approach was not limited by this site-directed mutation strategy.

pfDHFR–GFP fusion protein

Owing to its intrinsic fluorescence, robust heterologous expression, high solubility and stability, GFP is a widely used reporter for protein expression (Chalfie et al., 1994) and localization (Jun et al., 2004), and has more recently been used for folding and solubility studies (Waldo et al., 1999). It has been shown that, when GFP is fused to proteins of interest and over-expressed in E. coli, the absence of GFP signal is a good indication of aggregation of the fusion partner (Waldo et al., 1999). In this study, we fused GFP to the C-terminus of the pfDHFR, replacing the junction region and pfTS counterpart. The pfDHFR-TS structures show that the junction region linking the N-terminal pfDHFR to the C-terminal pfTS contains a rigid helical part besides the flexible region. This helix is attached to the pfDHFR domain on the opposite side of the functional dimer of pfDHFR-TS, indicating that the junction region is important for domain positioning in protein dimerization. The two domains of the engineered pfDHFR–GFP were then linked by a short linker of 12 amino acids, in order to mimic the junction region linking the pfDHFR and its accompanying partner pfTS in the natural full-length form. This would allow some space between the two domains to keep them from making unfavorable inter-domain interactions or from unintentional steric clash.

To test whether the fusion protein exhibited comparable activity to that of DHFR, pfDHFR–GFP fusion protein was expressed in BL-21(DE3) pLysS, purified by affinity chromatography and its kinetic parameters were determined. As shown in Table I, the fusion protein exhibited similar Km values for substrate DHF and cofactor NADPH to those of the wild-type pfDHFR monofunctional protein (4.6 and 6.3 versus 4.2 and 5.8, respectively). Furthermore, the inhibition constant (Ki) of the fusion protein for standard anti-malarial antifolates, Pyr and Cyc, was also similar to that of the wild-type pfDHFR.


Table I. Kinetic properties and inhibition by cycloguanil (Cyc) and pyrimethamine (Pyr) of purified wild-type pfDHFR and fusion pfDHFR–GFP

 
The pfDHFR–GFP fusion gene was then used as a template in subsequent experiments to construct a site-specific and a random mutant library of pfDHFR.

Construction of pfDHFR mutant libraries

Two surface residues with large hydrophobic side chains, Y35 and F37 (Figure 1), were selected for replacement studies using DOP mutagenesis (Chusacultanachai et al., 1999, 2002). In this method, a pair of degenerate oligonucleotides encoding all 20 amino acids at those positions was synthesized and used in the primer extension by Pfu DNA polymerase. During the primer extension cycles, different codons are randomly integrated into newly synthesized DNA products. After elimination of the wild-type template, the DNA library was transformed into BL-21(DE3) pLysS. Theoretically, the pool size of this library is 400 clones (20 × 20 amino acid combinations). After the transformation 2000 colonies were obtained, indicating that the library was sampled at 5-fold coverage.
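
The pool size and coverage quoted above follow from simple counting, as the sketch below illustrates. The completeness estimate in the last step is not from the original study: it assumes that every amino acid pair is equally likely in each colony, which NNN codons do not strictly guarantee because amino acids differ in their number of codons, so it should be read only as a rough guide.

```python
N_AMINO_ACIDS = 20


def library_coverage(n_positions: int, n_colonies: int) -> tuple:
    """Return (number of amino acid variants, colonies per variant)."""
    variants = N_AMINO_ACIDS ** n_positions
    return variants, n_colonies / variants


def chance_variant_sampled(variants: int, n_colonies: int) -> float:
    """Probability that a given variant appears at least once.

    Assumes (hypothetically) that all variants are equally frequent.
    """
    return 1.0 - (1.0 - 1.0 / variants) ** n_colonies


variants, coverage = library_coverage(n_positions=2, n_colonies=2000)
p_seen = chance_variant_sampled(variants, 2000)
print(variants, coverage, round(p_seen, 3))
```

Two saturated positions give 400 amino acid combinations, and 2000 colonies correspond to five colonies per combination on average.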

Aside from site-specific mutations of the surface residues, a random DNA library of the entire pfDHFR gene was also constructed by using error-prone PCR (Chusacultanachai et al., 2002). We elected to perform a single round of error-prone PCR in a condition generating low mutation frequency to avoid the possible accumulation of multiple mutations in pfDHFR. After the error-prone PCR products were inserted into the plasmid backbone, the mutation rate was assessed by sequencing 10 randomly selected colonies. The DNA sequence analysis of ~5000 bp indicated a mutation rate of 1.2%, which is equivalent to seven base changes per 700 bp of pfDHFR gene.

Dual selection of active pfDHFR mutants exhibiting increased green fluorescence signal

The first step of the dual selection was to select for functionally active pfDHFR mutants by using a bacterial complementation system. In the selection, BL-21(DE3) pLysS cells containing site-specific mutant DNA library or random mutant DNA library were plated on the MM plate containing 100 µg/ml of ampicillin and 34 µg/ml of chloramphenicol, supplemented with 4 µM TMP. Endogenous DHFR was selectively inhibited by TMP, and hence, only cells expressing functionally active pfDHFR mutants could survive and form colonies on the TMP plates. After overnight incubation at 37°C, these colonies were harvested for the second selection step.

To select mutants with improved solubility, fluorescence signals of the mutants were measured and compared with that of the wild-type pfDHFR–GFP fusion protein. For the site-specific library, each surviving colony on TMP plates was transferred to a well of a 96-well microtiter plate and cultured separately in 200 µl of MM containing 100 µg/ml of ampicillin and 34 µg/ml of chloramphenicol. After 6 h of IPTG induction at 20°C, the fluorescence signal of each well was measured using a fluorometer. Fluorescence intensity of each clone was normalized with cell density measured at OD595 and was reported as fluorescence intensity per OD595 of cells. Four clones exhibited 1- to 3-fold higher fluorescence intensity than that of the wild-type pfDHFR–GFP. DNA sequencing of inserts in plasmids obtained from those clones revealed that the selected clones contained different mutations, namely Y35Q + F37R, Y35L + F37R, Y35G + F37L and Y35L + F37T (Figure 2).



Fig. 2. Relative fluorescence of selected mutant pfDHFR–GFP fusion proteins. Excitation wavelength was 488 nm and emission wavelength 530 nm.

 
For the random library, the surviving colonies were harvested from TMP plates and subjected to FACS analysis. After the fluorescence intensity of each cell was measured, mutants that exhibited the highest 10% fluorescence intensity were sorted and recovered by plating onto LB medium containing 100 µg/ml of ampicillin and 34 µg/ml of chloramphenicol. Eight clones were selected randomly and their plasmid DNA inserts were subjected to DNA sequencing. Sequencing data from six out of the eight clones were obtained. Four of the six clones carried the K27E mutation and the other two clones had the wild-type pfDHFR–GFP sequence. To verify the enhanced fluorescence of K27E pfDHFR–GFP, the mutant was grown separately and its fluorescence signal was measured in a fluorometer. The K27E pfDHFR–GFP mutant exhibited 1.8-fold higher fluorescence intensity than that of the wild-type pfDHFR–GFP (Figure 2).

Kinetic property and solubility of the five selected mutants

As the ultimate goal of this study was to obtain active pfDHFR mutants with increased solubility, the mutants were expressed and characterized as monofunctional pfDHFR, without GFP. Mutant pfDHFR enzymes were purified to homogeneity using affinity and gel filtration chromatography (Sirawaraporn et al., 1993). Kinetic parameters of the purified proteins including Km for DHF and NADPH, and Ki for standard inhibitors were determined. All mutants exhibited the same range of kinetic parameters as those of the wild-type protein (Table II).


Table II. Kinetic properties and inhibition by cycloguanil (Cyc) and pyrimethamine (Pyr) of purified mutant pfDHFR without GFP fusion

 
Our previous studies showed that the monofunctional form of wild-type pfDHFR exhibited low solubility in aqueous solution and aggregated at ~2 mg/ml (data not shown). To determine whether these selected mutants exhibited sufficient solubility for crystallization and other functional studies, solutions of purified proteins of increasing concentrations were prepared and examined for aggregation. Solubility of the wild type was measured using solutions of the purified protein at concentrations of 0.76, 1.12, 1.82 and 2.31 mg/ml. At concentrations ≤1.82 mg/ml, the solution of wild-type pfDHFR was clear when visualized under a stereomicroscope, and the OD595 was consistently <0.1. However, at a protein concentration of 2.31 mg/ml, aggregation was clearly observed under the stereomicroscope. Moreover, the OD595 increased to 0.14 (Table III). Thus the solubility of the wild-type protein in this study was defined as 1.82 mg/ml. Using the same approach, the solubility of Y35L + F37R, Y35G + F37L, Y35Q + F37R, Y35L + F37T and K27E pfDHFR mutants was determined to be 6.83, 11.86, >6.09, 4.94 and >13.6 mg/ml, respectively (Table IV). From these results it was shown that all selected mutants exhibited higher solubility than the wild-type pfDHFR, in particular mutants K27E and Y35Q + F37R, which showed 3- to 6-fold increases in solubility.
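
The rule used here, taking the highest concentration that remains clear before the first aggregated sample, can be written as a small function. The sketch below is illustrative only: the OD595 cutoff of 0.1 is an assumption inferred from the wild-type readings described above, the OD595 values for the three clear samples are hypothetical placeholders (the text gives them only as <0.1), and the stereomicroscope inspection that accompanied the measurement is not modeled.

```python
def solubility_limit(readings: list, od_cutoff: float = 0.1):
    """Return the highest concentration measured before aggregation appears.

    readings is a list of (concentration in mg/ml, OD595) tuples.
    Returns None if even the lowest concentration is above the cutoff.
    """
    limit = None
    for concentration, od595 in sorted(readings):
        if od595 >= od_cutoff:
            break  # first aggregated sample: stop at the previous concentration
        limit = concentration
    return limit


# Wild-type concentrations from the text; only the 0.14 reading is reported,
# the other OD595 values are placeholders below the assumed cutoff.
wild_type_readings = [
    (0.76, 0.05),
    (1.12, 0.06),
    (1.82, 0.08),
    (2.31, 0.14),
]
wt_solubility = solubility_limit(wild_type_readings)
print(wt_solubility)
```

When no sample in a series aggregates, the function returns the highest concentration tested, which is then only a lower bound on solubility, as with the values reported with a ">" sign.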


Table III. Observed aggregation and OD at 595 nm of purified wild-type pfDHFR

 

Table IV. Solubility of the wild-type and pfDHFR mutants without GFP fusion

 

    Discussion
Current strategies to engineer functionally active proteins with sufficient solubility fall into two broad categories: rational design and random selection. Rational design, or structure-based substitution, often involves identifying hydrophobic patches on the protein surface and replacing amino acids that have large hydrophobic side chains with more hydrophilic ones. Because such site-specific mutagenesis approaches generate only a few mutant proteins, these mutants are often characterized individually by low-throughput functional studies or by direct detection of the expressed proteins (D'Arcy, 1994; Malissard and Berger, 2001). This strategy requires structural information on the target protein, and its success depends upon the accuracy of the prediction or identification of the culprit amino acids.

On the other hand, random mutagenesis with specific selections, or directed evolution of proteins, does not require prior structural knowledge of the protein target. Generally, random libraries of the gene encoding a protein of interest are generated and the mutant proteins with increased solubility are selected by monitoring the signal of a reporter protein. Since the reporter protein is fused to one end of the target protein, the signal of the reporter reflects the expression level and folding efficiency of the fusion protein. In directed evolution, mutants selected from the first library are subjected to iterations of mutagenesis and selection until a satisfactory signal is obtained (Waldo et al., 1999). Using a reporter combined with an easy assay, a library of millions of mutants can be screened at a single time. Nonetheless, the signals of those reporters may neither reflect the intrinsic solubility of the mutant proteins nor represent functional target proteins, since function can be altered by the mutagenesis process, especially when many rounds of mutagenesis are performed (Waldo, 2003a,b).

We therefore incorporated a simple functional selection with the reporter assay in order to select functionally active mutants with improved expression and solubility, using pfDHFR as a target protein. GFP was fused to the C-terminal end of pfDHFR and the fusion protein was purified and characterized. Kinetic parameters of the pfDHFR–GFP fusion protein, including Km for substrate and co-factor, and Ki for inhibitors, were similar to those of the pfDHFR monofunctional protein (Table I). It is noteworthy that the maximum excitation and emission wavelengths and fluorescence spectrum of the pfDHFR–GFP fusion protein were similar to those of the wild-type GFP (data not shown), indicating that the two proteins could fold and function independently and did not interfere with each other's activity. Proper choice of the peptide linker sequence that allows the two proteins to fold and function independently, avoiding steric clash between the two fusion partners, will be critical in determining the success of the selection process.

It has been previously reported that the N-terminal sequence of a protein contains a signature sequence guiding it to a proper folding pathway (Korepanova et al., 2001). Therefore, fusion of a protein with robust folding, such as maltose binding protein, glutathione S-transferase, thioredoxin or GFP, to the N-terminus of the target protein often leads to more efficient folding and a higher proportion of soluble fusion protein (di Guan et al., 1988). Nevertheless, we found that fusion of GFP to the C-terminal end of pfDHFR also improved solubility of the fusion protein. The monodomain of the wild-type pfDHFR starts to aggregate at ~2 mg/ml, whereas the purified pfDHFR–GFP fusion protein could be concentrated to >30 mg/ml (data not shown). It is possible that the presence of GFP, although it may not interact directly with pfDHFR or cover the hydrophobic patches that become exposed when its natural partner, TS, is removed, provides steric hindrance that prevents premature aggregation of the fusion protein.

DHFR from TMP-resistant type S1 Staphylococcus aureus forms inclusion bodies when expressed in E.coli (Dale et al., 1994). Solubility of the S.aureus DHFR can be improved over 264-fold by replacing two amino acids in hydrophobic patches on the protein surface with hydrophilic amino acids (Dale et al., 1994). Using a similar strategy, two amino acids at positions 35 (Y) and 37 (F), found in hydrophobic patches on the surface of the pfDHFR molecule, were selected for site-specific mutagenesis. Additionally, random mutations of the entire pfDHFR gene were generated using error-prone PCR. Using these two distinct mutagenesis approaches, large and diverse pools of pfDHFR mutants were created.
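To give a sense of the sequence space covered by mutagenesis at two positions, the following Python sketch enumerates all amino acid pairs at residues 35 and 37. It assumes, purely for illustration, that both positions are fully randomized over the 20 standard amino acids; the section does not state the actual randomization scheme, and the names used here are hypothetical.

```python
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Wild-type residues at positions 35 and 37 of pfDHFR (Y35 and F37).
WILD_TYPE = ("Y", "F")

# Double mutants recovered from the site-specific library, as (res35, res37).
RECOVERED = [("Q", "R"), ("L", "T"), ("G", "L"), ("L", "R")]


def double_site_variants(wild_type=WILD_TYPE):
    """All (residue 35, residue 37) pairs other than the wild-type pair.
    Assumes full randomization of both positions (an illustrative assumption)."""
    return [pair for pair in product(AMINO_ACIDS, repeat=2) if pair != wild_type]


variants = double_site_variants()
print(len(variants), "possible non-wild-type pairs at positions 35 and 37")
print(all(pair in variants for pair in RECOVERED))
```

Under this assumption the protein-level diversity of the site-specific library is small compared with an error-prone PCR library of the whole gene, which is why the two approaches complement each other.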

The libraries were first subjected to a bacterial complementation assay, and only active mutants with sufficient DHFR activity to rescue DHFR-deficient E.coli cells were harvested. Mutants with increased GFP signal can then be selected effectively either by moderate-throughput selection using a fluorometer or by high-throughput screening using FACS, which allows selection from pools of millions of cells at a time. Five active pfDHFR mutants exhibiting up to 3-fold higher fluorescence signal, compared with wild-type pfDHFR–GFP (Figure 2), were identified.
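The logic of the two-step selection can be summarized as a filter followed by a ranking. The Python sketch below is a schematic illustration only: the `Clone` record, the clone names and the fluorescence values are hypothetical placeholders, not data from this study.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Clone:
    name: str
    complements: bool    # rescued DHFR-deficient E. coli in the complementation assay
    fluorescence: float  # GFP signal of the pfDHFR-GFP fusion, arbitrary units


def dual_select(clones: List[Clone], wild_type_signal: float,
                min_fold: float = 1.0) -> List[Clone]:
    """Step 1: keep clones that pass functional complementation.
    Step 2: keep those whose GFP signal exceeds min_fold x wild type,
    returned brightest first."""
    active = [c for c in clones if c.complements]
    brighter = [c for c in active if c.fluorescence / wild_type_signal > min_fold]
    return sorted(brighter, key=lambda c: c.fluorescence, reverse=True)


# Hypothetical example values, for illustration only.
EXAMPLE_CLONES = [
    Clone("clone_A", True, 250.0),
    Clone("clone_B", False, 400.0),  # bright but inactive: removed in step 1
    Clone("clone_C", True, 90.0),    # active but dimmer than wild type
    Clone("clone_D", True, 130.0),
]

for clone in dual_select(EXAMPLE_CLONES, wild_type_signal=100.0):
    print(clone.name, clone.fluorescence / 100.0)
```

Applying the functional filter first means that a bright but inactive variant, such as the hypothetical `clone_B`, never reaches the fluorescence step.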

Although GFP is a convenient marker of the solubility of the fused protein, it can perturb the effect of mutations on the solubility. To assess intrinsic solubility and other properties, the selected mutants were therefore expressed and purified without the fused GFP. The pfDHFR activities of these mutants were detectable using the bacterial complementation system, which is sensitive to as little as 0.1% of wild-type pfDHFR activity (Chusacultanachai et al., 2002), so complementation alone does not guarantee wild-type levels of activity. However, kinetic parameters, including Km and Ki, indicated that all of the selected mutants exhibited activity comparable to that of the wild-type pfDHFR. This is probably because our mutagenesis strategy was aimed at a very low mutation rate, in order to retain the wild-type DHFR activity.

In other studies, solubility of a protein of interest was indirectly analyzed by comparison of soluble and insoluble fractions of the protein in SDS–PAGE (Fox et al., 2001). However, the proportion of soluble and insoluble fractions of a protein is governed not only by solubility of the protein, but also by the expression ability and folding capability of the target protein in host cells, which depend on various cellular and environmental factors. To directly determine the intrinsic solubility of the mutants, which is more relevant to biochemical and structural studies, and other applications, we elected to measure and compare the maximum concentration of each purified protein in aqueous solution before aggregation occurred. Aggregation of a protein can be simply detected either by measurement of OD595 or by direct observation under a microscope (Table III). We found that the purified wild-type pfDHFR monofunctional domain was still soluble at 1.8 mg/ml, but aggregation was observed at a higher concentration (Table III), consistent with our previous report.

Using those two methods, all five mutants exhibited higher solubility than the wild-type pfDHFR (Table IV). Among the five mutants, four (Y35Q + F37R, Y35L + F37T, Y35G + F37L and Y35L + F37R) were from site-specific mutagenesis of two amino acids in Insert-I. It is worth noting that the fifth mutation (K27E), obtained through random mutagenesis, is also located in Insert-I, indicating that Insert-I of pfDHFR plays an important role in determining the solubility of the monofunctional pfDHFR, as we had predicted. As Insert-I forms a loop structure that anchors the pfDHFR to the pfTS domain (Figure 1), in the absence of the pfTS it becomes flexible and can freely adopt various conformations, some of which may be able to complement molecular surfaces of other molecules. With its exposed hydrophobic amino acids, such intermolecular interaction could result in hydrophobic aggregation. Y35Q, F37R and F37T mutations would change the large hydrophobic side chain of tyrosine or phenylalanine to a positively charged group of arginine or a polar side chain of glutamine or threonine, providing better interactions with water molecules and thus increasing the protein solubility. A leucine side chain, despite its hydrophobicity, is shorter than both tyrosine and phenylalanine; therefore, Y35L and F37L mutations would contribute to the increase in protein solubility through reduction of unfavorable interactions with water. Y35G mutation, on the other hand, might provide greater flexibility to the Insert-I region as this residue is located at the stem of the Insert-I loop (Figure 1). The more flexible Insert-I loop might sterically hinder hydrophobic aggregation of the mutant enzyme. In the K27E mutant, lysine (a basic amino acid) is changed to glutamate (an acidic amino acid). Although both residues are hydrophilic, the side chain of glutamate exhibits lower hydrophobicity than that of lysine (Black and Mould, 1991). In addition, at physiological pH, the change from positive to negative charge at this position can affect ionic interactions involving this amino acid residue. Such a change in the charge at residue 27 may cause the mutant pfDHFR molecules to repel one another, preventing aggregation and thus increasing solubility.

In this study, we report a simple and effective dual screening system to select for active protein mutants with increased solubility, by combining the bacterial complementation technique with the already widely used GFP reporter. This is the most convenient strategy when an in vivo activity screen is available. The bacterial complementation selection is a simple and efficient functional assay for essential genes and can be implemented using chemical knockout or readily available knockout E.coli strains, yeast or other organisms. However, other genetic selections can be performed according to the function(s) of target proteins, including protein-fragment complementation assays (Remy and Michnick, 1999), ubiquitin-based assay (Lyapina et al., 1998) and two-hybrid system (Fields and Song, 1989) for protein–protein interaction; challenge phage assay (Pini and Bracci, 2000) and three-hybrid system (Licitra and Liu, 1996) for DNA- and RNA-binding proteins; and G protein-coupled receptor (GPCR) (Pao and Benovic, 2005) for kinases and proteins in the signal transduction pathway. When functional screens are available, the combination of a simple genetic selection and GFP reporter should find wide applications in obtaining biologically active proteins with sufficient solubility for structural and functional studies, as well as for industrial and medical applications.


    Acknowledgements
This work was supported by grants (to Y.Y.) from Medicines for Malaria Ventures, Wellcome Trust and European Union (INCO-Dev). D.J. was supported by Thailand Graduate Institute of Science and Technology, and P.W. is a Thailand Research Fund Senior Fellow.


    References
Basco,L.K., Eldin de Pecoulas,P., Wilson,C.M., Le Bras,J. and Mazabraud,A. (1995) Mol. Biochem. Parasitol., 69, 135–138.

Black,S.D. and Mould,D.R. (1991) Anal. Biochem., 193, 72–82.

Chalfie,M., Tu,Y., Euskirchen,G., Ward,W.W. and Prasher,D.C. (1994) Science, 263, 802–805.

Chitnumsub,P., Yuvaniyama,J., Vanichtanankul,J., Kamchonwongpaisan,S., Walkinshaw,M.D. and Yuthavong,Y. (2004) Acta Crystallogr. D, 60, 780–783.

Chusacultanachai,S., Glenn,K.A., Rodriguez,A.O., Read,E.K., Gardner,J.F., Katzenellenbogen,B.S. and Shapiro,D.J. (1999) J. Biol. Chem., 274, 23591–23598.

Chusacultanachai,S., Thiensathit,P., Tarnchompoo,B., Sirawaraporn,W. and Yuthavong,Y. (2002) Mol. Biochem. Parasitol., 120, 61–72.

Cowman,A.F., Morry,M.J., Biggs,B.A., Cross,G.A. and Foote,S.J. (1988) Proc. Natl Acad. Sci. USA, 85, 9109–9113.

Dale,G.E., Broger,C., Langen,H., D'Arcy,A. and Stuber,D. (1994) Protein Eng., 7, 933–939.

D'Arcy,A. (1994) Acta Crystallogr. D, 50, 469–471.

D'Arcy,A., Stihle,M., Kostrewa,D. and Dale,G. (1999) Acta Crystallogr. D, 55(Pt 9), 1623–1625.

Daujotyte,D., Vilkaitis,G., Manelyte,L., Skalicky,J., Szyperski,T. and Klimasauskas,S. (2003) Protein Eng., 16, 295–301.

Delfino,R.T., Santos-Filho,O.A. and Figueroa-Villar,J.D. (2002) Biophys. Chem., 98, 287–300.

di Guan,C., Li,P., Riggs,P.D. and Inouye,H. (1988) Gene, 67, 21–30.

Engelman,D.M., Steitz,T.A. and Goldman,A. (1986) Annu. Rev. Biophys. Biophys. Chem., 15, 321–353.

Ferlan,J.T., Mookherjee,S., Okezie,I.N., Fulgence,L. and Sibley,C.H. (2001) Mol. Biochem. Parasitol., 113, 139–150.

Fields,S. and Song,O. (1989) Nature, 340, 242–246.

Foote,S.J., Galatis,D. and Cowman,A.F. (1990) Proc. Natl Acad. Sci. USA, 87, 3014–3017.

Fox,J.D., Kapust,R.B. and Waugh,D.S. (2001) Protein Sci., 10, 622–630.

Hall,S.J., Sims,P.F. and Hyde,J.E. (1991) Mol. Biochem. Parasitol., 45, 317–330.

Hyde,J.E. (1990) Pharmacol. Ther., 48, 45–59.

Jun,L., Saiki,R., Tatsumi,K., Nakagawa,T. and Kawamukai,M. (2004) Plant Cell Physiol., 45, 1882–1888.

Kamchonwongpaisan,S., Quarrell,R., Charoensetakul,N., Ponsinet,R., Vilaivan,T., Vanichtanankul,J., Tarnchompoo,B., Sirawaraporn,W., Lowe,G. and Yuthavong,Y. (2004) J. Med. Chem., 47, 673–680.

Koradi,R., Billeter,M. and Wuthrich,K. (1996) J. Mol. Graph., 14, 51–55, 29–32.

Korepanova,A., Douglas,C., Leyngold,I. and Logan,T.M. (2001) Protein Sci., 10, 1905–1910.

Lemcke,T., Christensen,I.T. and Jorgensen,F.S. (1999) Bioorg. Med. Chem., 7, 1003–1011.

Licitra,E.J. and Liu,J.O. (1996) Proc. Natl Acad. Sci. USA, 93, 12817–12821.

Lowry,O.H., Rosebrough,N.J., Farr,A.L. and Randall,R.J. (1951) J. Biol. Chem., 193, 265–275.

Lyapina,S.A., Correll,C.C., Kipreos,E.T. and Deshaies,R.J. (1998) Proc. Natl Acad. Sci. USA, 95, 7451–7456.

Malissard,M. and Berger,E.G. (2001) Eur. J. Biochem., 268, 4352–4358.

Matthysse,A.G., Stretton,S., Dandie,C., McClure,N.C. and Goodman,A.E. (1996) FEMS Microbiol. Lett., 145, 87–94.

Maxwell,K.L., Mittermaier,A.K., Forman-Kay,J.D. and Davidson,A.R. (1999) Protein Sci., 8, 1908–1911.

Mosavi,L.K. and Peng,Z.Y. (2003) Protein Eng., 16, 739–745.

Nieba,L., Honegger,A., Krebber,C. and Pluckthun,A. (1997) Protein Eng., 10, 435–444.

Pao,C.S. and Benovic,J.L. (2005) J. Biol. Chem., 280, 11052–11058.

Peterson,D.S., Walliker,D. and Wellems,T.E. (1988) Proc. Natl Acad. Sci. USA, 85, 9114–9118.

Peterson,D.S., Milhous,W.K. and Wellems,T.E. (1990) Proc. Natl Acad. Sci. USA, 87, 3018–3022.

Pini,A. and Bracci,L. (2000) Curr. Protein Pept. Sci., 1, 155–169.

Remy,I. and Michnick,S.W. (1999) Proc. Natl Acad. Sci. USA, 96, 5394–5399.

Sano,G., Morimatsu,K. and Horii,T. (1994) Mol. Biochem. Parasitol., 63, 265–273.

Sirawaraporn,W., Prapunwattana,P., Sirawaraporn,R., Yuthavong,Y. and Santi,D.V. (1993) J. Biol. Chem., 268, 21637–21644.

Waldo,G.S. (2003a) Curr. Opin. Chem. Biol., 7, 33–38.

Waldo,G.S. (2003b) Methods Mol. Biol., 230, 343–359.

Waldo,G.S., Standish,B.M., Berendzen,J. and Terwilliger,T.C. (1999) Nat. Biotechnol., 17, 691–695.

Wattanarangsan,J., Chusacultanachai,S., Yuvaniyama,J., Kamchonwongpaisan,S. and Yuthavong,Y. (2003) Mol. Biochem. Parasitol., 126, 97–102.

Wooden,J.M., Hartwell,L.H., Vasquez,B. and Sibley,C.H. (1997) Mol. Biochem. Parasitol., 85, 25–40.

Yuthavong,Y., Vilaivan,T., Chareonsethakul,N., Kamchonwongpaisan,S., Sirawaraporn,W., Quarrell,R. and Lowe,G. (2000) J. Med. Chem., 43, 2738–2744.

Yuvaniyama,J., Chitnumsub,P., Kamchonwongpaisan,S., Vanichtanankul,J., Sirawaraporn,W., Taylor,P., Walkinshaw,M.D. and Yuthavong,Y. (2003) Nat. Struct. Biol., 10, 357–365.

Zolg,J.W., Plitt,J.R., Chen,G.X. and Palmer,S. (1989) Mol. Biochem. Parasitol., 36, 253–262.

Received March 10, 2005; revised June 16, 2005; accepted June 17, 2005.

Edited by Frances Arnold




