Department of Biochemistry, Uppsala University, Biomedical Center, Box 576, SE-751 23 Uppsala, Sweden
2 Previously Lars O. Hansson
1 To whom correspondence should be addressed. e-mail: Bengt.Mannervik@biokem.uu.se
Abstract
Keywords: alkyltransferase/cluster analysis/directed evolution/glutathione transferase/molecular quasi-species
Introduction
In nature the evolution of a protein requires the simultaneous optimization of several functional parameters. For example, improving the catalytic efficiency of an enzyme may concomitantly affect thermal stability, pH optimum and solubility. In the optimization of a catalyst there may also be a choice between increasing substrate selectivity or developing improved activity with several alternative substrates. The present investigation demonstrates how the multidimensional functional space of a library of mutant enzymes can be explored and how subsets of variants with similar activity profiles can be identified. Multivariate statistical methods (Krzanowski, 2000), including principal components, dendrograms and K-means cluster analyses, were used in the characterization of consecutive glutathione transferase (GST) libraries. One of the mutant enzymes was purified to homogeneity and compared with the wild-type human GST T1-1 (hGST T1-1). Its catalytic efficiency in the targeted alkyltransferase reaction was 65-fold higher than that of the wild-type enzyme.
Our paper presents a rational approach based on multidimensional factor analysis, which should be of value to the engineering not only of proteins but also of nucleic acids (Gold et al., 1995; Joyce, 1998; Wilson and Szostak, 1999) and other molecules (Erlanson et al., 2000; Houghten, 2000) for novel functional properties.
Materials and methods
cDNAs encoding the Theta class GSTs, hGST T1-1 and rat (r) GST T2-2, were randomly fragmented and recombined to create a mutant library. The construction of this T1/T2 library (the F1 generation) has been described in detail (Broo et al., 2002).
Preparation of lysates
Colonies of the T1/T2 mutant library were randomly picked and the bacteria were grown to saturation overnight in 2 ml of 2TY medium supplemented with ampicillin (100 µg/ml). After a 100-fold dilution, the cultures were grown for 2 h before the expression of GSTs was induced by addition of isopropyl β-D-thiogalactopyranoside (IPTG) to a final concentration of 0.2 mM. The bacteria were harvested by centrifugation 18 h after the induction. Each pellet was resuspended in 0.1 M sodium phosphate pH 6.5 and the bacteria were lysed by four cycles of freezing at -80°C and thawing at 37°C. After the final thawing, the suspensions of lysed bacteria were centrifuged and the resulting supernatants were stored at -80°C until the activity measurements were performed. Lysates of wild-type hGST T1-1 and rGST T2-2 were used in parallel as controls.
Measurement of GST activity
The GST activities of the prepared lysates were measured with six alternative substrates: dichloromethane (DCM), 1,2-epoxy-3-(4-nitrophenoxy)propane (EPNP), 4-nitrobenzyl chloride (NBC), 4-nitrophenethyl bromide (NPB), 1-chloro-2,4-dinitrobenzene (CDNB) and 1-menaphthyl sulfate (MS). All activity measurements were performed in microplates on a SPECTRAmaxPLUS384 microplate spectrophotometer (Molecular Devices, Sunnyvale, CA). For all substrates except DCM, the formation of the GSH conjugate was monitored continuously. The activity with DCM was determined by an end-point assay measuring the amount of formaldehyde formed after 40 min. For details of the GST assays see Habig and Jakoby (1981) and Broo et al. (2002).
Amplification of DNA to create a new library
Five clones with high alkyltransferase activity in the F1 generation were selected and used as starting material for a new generation, i.e. the F2 library. DNA preparations from the five selected mutants were used separately as templates for PCR amplification. The PCR primers used were pKK for (5'-AAT TGT GAG CGG ATA ACA AT-3') and Eco RGNB (5'-AAG CTG AAA ATC TTC-3').
Cloning of mouse and rat GST T1 sequences
In order to increase the diversity of the library, mouse and rat GST T1 sequences were introduced in the DNA shuffling procedure. Mouse GST T1 was cloned from Mouse Liver QUICK-clone cDNA (Clontech, Palo Alto, CA) using the primers mT1 start (5'-ATA TGA ATT CAT GGT TCT GGA GCT GTA C-3') and mT1 stop (5'-ATA TAA GCT TTT ATT ACT GGA TCA TTG CCA G-3'). The PCR product was digested with EcoRI and HindIII and ligated into the EcoRI and HindIII cloning sites of pKK-D (Björnestedt et al., 1992).
For the rat GST T1 sequence, the primers rT1 Pst I for (5'-TGA CCA CTG GTA CCC CCA AGA CCT GCA-3') and rT1 stop (5'-CCC AGA GTG CTG ACC ATG ATC CAG TAA TAA AAG CTT ATA T-3') were used to amplify nucleotides 243-720 of the coding sequence from Rat Liver QUICK-Clone cDNA (Clontech).
Digestion of DNA
A mixture of the amplified DNA from the five mutants, consisting in total of 1 µg, was digested with 0.2-0.8 U DNase I (Roche, Mannheim, Germany) in 20 mM Tris-HCl pH 8.0, 1 mM MgCl2 at room temperature. The digestion was performed in several rounds of 2-3 min each. After each round a small sample of the reaction mixture was run on a 2% (w/v) agarose gel. Freezing the sample on dry ice stopped the digestion during the analysis. When the size of the fragments was 100 bp or less, DNase I was inactivated by heating the reaction mixture to 70°C for 10 min. After phenol and chisam extractions the DNA was separated by electrophoresis on a 2% (w/v) agarose gel and all fragments smaller than 100 bp were recovered from the gel. DNase digestions of mGST T1 and rGST T1 cDNA were performed separately in the same way as for the mixed mutants.
Reassembly of GST cDNA sequences
All the recovered DNA from the digestion reactions, including fragments of T1/T2 mutants, mGST T1 and rGST T1, were mixed together and used in a reassembly PCR. In addition to the DNA, this reassembly reaction contained 0.2 mM dNTPs, 40 U/ml Pfu DNA polymerase and buffer as recommended by the manufacturer (Stratagene, La Jolla, CA). The PCR conditions were 3 min at 95°C, followed by 40 cycles of 1 min at 94°C, 2 min at 50°C and 2 min at 72°C, completed by 10 min at 72°C.
Amplification of full-length GST coding sequences and construction of the F2 library
A small amount of the product from the reassembly reaction was used as a template in a PCR to amplify full-length coding sequences. In addition, the PCR mixture contained 0.8 mM each of flanking primers pKK for and Eco RGNB, 0.2 mM dNTPs, 12 U/ml Pfu DNA polymerase and buffer as recommended by the manufacturer. The temperature cycle, 1 min at 94°C, 2 min at 55°C and 2 min at 72°C, was run 35 times, followed by 10 min at 72°C. The amplified product was digested with EcoRI and HindIII, purified on gel and ligated into the EcoRI and HindIII cloning sites of pKK-D. The product of the ligation reaction was used to transform electrocompetent Escherichia coli XL1-Blue cells (Stratagene).
Preparation of lysates and catalytic activity measurements
Clones from the F2 library were randomly picked from agar plates and grown in 350 µl of 2TY supplemented with 100 µg/ml ampicillin. After growth overnight at 37°C in 96-well microplates the cultures were diluted 100-fold into fresh 2TY supplemented as above. Two hours after the dilution, the expression of GST was induced by addition of IPTG to a final concentration of 0.2 mM. The bacteria were grown for an additional 16 h before being harvested by centrifugation. Each pellet was resuspended in 0.1 M sodium phosphate pH 6.5. Lysis of the bacteria was performed by four rounds of freezing-thawing. Following centrifugation, the supernatants were transferred to a new 96-well microplate and stored at -80°C for subsequent measurements of enzymatic activities. The GST activities of lysates from mutants of the F2 library were tested with NPB and EPNP as alternative substrates. Each microplate with lysates of F2 mutants contained lysates of hGST T1-1 and mGST T1-1 as controls.
Sequencing
From the F2 library, 13 clones with increased alkyltransferase activity were selected for DNA sequencing. The sequencing was performed with Big Dye v2.0 (Applied Biosystems, Foster City, CA) using an ABI Model 310 DNA sequencer (Applied Biosystems). DNA sequences were analyzed using Chromas version 1.45 (Technelysium Pty Ltd, Helensvale, Australia).
Linking a histidine tag to the F2:1215 mutant
An N-terminal histidine tag sequence (5'-CAT CAC CAT CAT CAT CAC-3') was attached to mutant F2:1215 from the F2 library by PCR. The PCR products were digested with EcoRI and HindIII and ligated to the EcoRI and HindIII cloning sites of pKK-D. The ligation mixture was used to transform electrocompetent E.coli XL1-Blue cells.
Purification of mutant F2:1215
Mutant F2:1215 was expressed as described for wild-type hGST T1-1 (Jemth and Mannervik, 1997). The harvested bacteria were resuspended in 20 mM sodium phosphate pH 7.4, 0.5 M NaCl, 1 mM 2-mercaptoethanol, 0.1 M imidazole and lysed by sonication and addition of lysozyme (0.2 mg/ml). After centrifugation the resulting supernatant was purified on Ni-IMAC (Amersham Biosciences, Uppsala, Sweden). The protein was eluted with 0.5 M imidazole and thereafter dialyzed against a buffer containing 10 mM Tris-HCl pH 7.8, 20% (v/v) glycerol, 1 mM 2-mercaptoethanol and 0.02% (w/v) sodium azide.
Kinetic analysis of mutant F2:1215
Steady-state kinetic properties of the purified mutant F2:1215 were determined with NPB and EPNP as alternative substrates. The substrate concentrations varied between 10 and 200 µM for NPB and between 25 and 500 µM for EPNP. The concentration of GSH was 10 mM. Steady-state kinetic parameters were determined by non-linear regression analysis; KM and kcat values expressed per enzyme subunit were determined by fitting the Michaelis-Menten equation to the experimental data.
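The fitting step can be illustrated with a minimal sketch. The source does not state which software or algorithm was used for the non-linear regression, so the Gauss-Newton iteration below, the function names and the synthetic rate data are all assumptions made for illustration only.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate at substrate concentration s."""
    return vmax * s / (km + s)


def sum_sq(s_values, v_values, vmax, km):
    """Sum of squared residuals for the given parameters."""
    return sum((v - michaelis_menten(s, vmax, km)) ** 2
               for s, v in zip(s_values, v_values))


def fit_michaelis_menten(s_values, v_values, iterations=100):
    """Least-squares estimate of (vmax, km) by Gauss-Newton with step halving."""
    vmax = max(v_values)                       # crude starting guesses
    km = sorted(s_values)[len(s_values) // 2]
    for _ in range(iterations):
        # Build the 2x2 normal equations from the partial derivatives.
        a11 = a12 = a22 = b1 = b2 = 0.0
        for s, v in zip(s_values, v_values):
            r = v - michaelis_menten(s, vmax, km)
            d_vmax = s / (km + s)              # dv/dVmax
            d_km = -vmax * s / (km + s) ** 2   # dv/dKM
            a11 += d_vmax * d_vmax
            a12 += d_vmax * d_km
            a22 += d_km * d_km
            b1 += d_vmax * r
            b2 += d_km * r
        det = a11 * a22 - a12 * a12
        if det == 0.0:
            break
        step_vmax = (b1 * a22 - b2 * a12) / det
        step_km = (a11 * b2 - a12 * b1) / det
        current = sum_sq(s_values, v_values, vmax, km)
        scale = 1.0
        while scale > 1e-8:
            new_vmax = vmax + scale * step_vmax
            new_km = km + scale * step_km
            if new_km > 0 and sum_sq(s_values, v_values, new_vmax, new_km) < current:
                vmax, km = new_vmax, new_km
                break
            scale /= 2.0
        else:
            break  # no improving step found, treat as converged
    return vmax, km


# Synthetic, noise-free example over the NPB concentration range (invented values).
substrate_um = [10.0, 25.0, 50.0, 100.0, 200.0]
rates = [michaelis_menten(s, 2.0, 50.0) for s in substrate_um]
vmax_fit, km_fit = fit_michaelis_menten(substrate_um, rates)
```

Dividing the fitted maximal rate by the molar concentration of enzyme subunits would give kcat, and kcat/KM the catalytic efficiency.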
Results
hGST T1-1 was targeted for directed enzyme evolution. Potential improvements of hGST T1-1 in the substrate-activity space were probed by recursive DNA recombination and screening procedures. The mutant library T1/T2 was created as a first generation of variants (F1) by recombination of DNA from hGST T1-1 and rGST T2-2, and the mutant enzymes were functionally characterized by use of six alternative electrophilic substrates (Broo et al., 2002). The six-dimensional substrate-activity space was explored by means of multivariate analyses to define a subgroup of mutants suited to parent the next generation. The cycle was then repeated by recombining the DNA of the selected parents with cDNA of rGST T1-1 and mouse GST T1-1 (mGST T1-1), thereby creating the second generation of variants (F2). Finally, mutants with further improved alkyltransferase activity were identified.
The T1/T2 library harbors functionally diverse mutants
The members of the T1/T2 library (the F1 generation) had previously been shown to be structurally and functionally divergent. Lysates of bacterial clones expressing isolated GST T1/T2 mutants were assayed for activity with six different substrates representing alternative substitution and addition reactions (Figure 1). In order to make the diverse activities comparable, the activity values were normalized to the activities characterizing the parental enzymes. The DCM and EPNP values were related to hGST T1-1 and the remaining four substrates were related to rGST T2-2 (Broo et al., 2002). Normalized activities (per cent) of 94 GST T1/T2 mutants and of the parental GSTs are shown in Figure 2. The activity values in the lysates are dependent on the amount of expressed enzyme protein, which varies among the different clones. Nevertheless, it is clear that the relative activities with the different substrates change from clone to clone. The most marked differences are displayed with DCM and MS, which distinguish the parental enzymes hGST T1-1 and rGST T2-2, respectively (Jemth and Mannervik, 1997). Like the wild-type enzymes, none of the mutant enzymes displayed significant activity with both of these discriminating substrates. Of special interest for the present study was the alkyltransferase activity as monitored with DCM and NPB. The activities with DCM and NPB were highly correlated (Figure 3) and five clones displayed high alkyltransferase activity.
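The normalization described above can be sketched as follows. The assignment of each substrate to its reference parent follows the text; the function name and all activity numbers are hypothetical and serve only to show the arithmetic.

```python
# Reference parent for each substrate, as stated in the text.
REFERENCE_PARENT = {
    "DCM": "hGST T1-1", "EPNP": "hGST T1-1",
    "NBC": "rGST T2-2", "NPB": "rGST T2-2",
    "CDNB": "rGST T2-2", "MS": "rGST T2-2",
}


def normalize_profile(raw, parents):
    """Express each lysate activity as per cent of the reference parent activity."""
    return {
        substrate: 100.0 * raw[substrate] / parents[parent][substrate]
        for substrate, parent in REFERENCE_PARENT.items()
    }


# Invented lysate activities (arbitrary units), not data from the study.
parents = {
    "hGST T1-1": {"DCM": 0.8, "EPNP": 0.5},
    "rGST T2-2": {"NBC": 1.2, "NPB": 0.4, "CDNB": 0.3, "MS": 2.0},
}
mutant = {"DCM": 1.6, "EPNP": 0.25, "NBC": 0.6, "NPB": 1.2, "CDNB": 0.03, "MS": 0.0}
profile = normalize_profile(mutant, parents)
```

The resulting six values form the activity fingerprint of one clone on a common percentage scale.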
The six-tuple of activities with the different substrates serves as a fingerprint for a given GST mutant, which can be regarded as a point in six-dimensional substrate-activity space. Figure 3 shows the points representing the mutants characterized in the three-dimensional subspace spanned by the substrates DCM, NPB and EPNP. From a statistical point of view, it can be argued that the variance of the activity data might be accounted for by fewer than six independent variables. This possibility was investigated by a principal component analysis, and a scree plot of the eigenvalues demonstrated that only three principal components are needed to account for 95% of the variance (Figure 4). Thus, the dimensionality of the functional space harboring the analyzed GST mutants is not significantly higher than 3 based on the six substrates used. In fact, the orthogonal principal components 1 and 2 account for 75% of the variance. However, it should be noted that the principal component analysis, like the other statistical procedures used in the present study, is dependent on the standardization of the data and possible weighting schemes (Bardsley, 2002). If one of the independent variables (substrate-activity values) dominates in the analysis, its contribution can be attenuated by proper weighting. In this manner, qualitatively more significant factors can be evaluated without being overwhelmed by quantitatively more influential ones.
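The eigenvalue calculation behind a scree plot can be sketched as below. This is an illustration under stated assumptions, not the procedure of the study: the data are standardized to zero mean and unit variance (one of several possible standardizations), the eigenvalues of the correlation matrix are obtained with a cyclic Jacobi method, and the small data matrix is invented.

```python
import math


def standardize(rows):
    """Scale every column to zero mean and unit sample standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def correlation_matrix(z):
    """Correlation matrix of already standardized rows."""
    n, k = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(k)]
            for i in range(k)]


def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-20):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Angle that zeroes the off-diagonal element a[p][q].
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):               # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):               # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def components_needed(eigenvalues, fraction):
    """Smallest number of leading components explaining the given variance fraction."""
    total, running = sum(eigenvalues), 0.0
    for count, value in enumerate(eigenvalues, start=1):
        running += value
        if running / total >= fraction:
            return count
    return len(eigenvalues)


# Invented activities: column 2 is proportional to column 1, column 3 is uncorrelated.
activities = [[1.0, 2.0, 2.0], [2.0, 4.0, 1.0], [3.0, 6.0, 4.0],
              [4.0, 8.0, 1.0], [5.0, 10.0, 2.0]]
eigenvalues = jacobi_eigenvalues(correlation_matrix(standardize(activities)))
n_components = components_needed(eigenvalues, 0.95)
```

Because two of the three invented variables are perfectly correlated, one eigenvalue vanishes and two components carry all the variance, which mirrors on a small scale the reduction from six substrates to about three components.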
Discrete clusters of clones can be identified in the T1/T2 library
Distances in n-dimensional space can be used in order to form subsets of points that are close to one another. Thus, proximity in the six-dimensional substrate-activity space can be used for subgrouping of the GST mutants. Figure 5 shows a dendrogram based on the Euclidean distances among the GST mutants in factor space. The diagram provides information for each individual clone about its functionally most closely related neighbors. Clone 88 is far removed from the other clones in accordance with its separate location in the principal component analysis (Figure 4). The other clones form branches based on their similarities in catalytic activities. A group consisting of clones 18, 20, 57, 61 and 63 is distinguished by displaying high alkyltransferase activities with DCM and NPB.
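How a dendrogram is built from Euclidean distances can be sketched with a simple agglomerative procedure. The linkage rule used in the study is not stated; average linkage is assumed here, and the clone names and two-dimensional profiles are invented for brevity.

```python
import math
from itertools import combinations


def cluster_distance(a, b, profiles):
    """Average Euclidean distance between members of two clusters (average linkage)."""
    total = sum(math.dist(profiles[x], profiles[y]) for x in a for y in b)
    return total / (len(a) * len(b))


def agglomerate(profiles):
    """Merge the two closest clusters repeatedly; return the merge history."""
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: cluster_distance(clusters[pair[0]], clusters[pair[1]], profiles),
        )
        distance = cluster_distance(clusters[i], clusters[j], profiles)
        merges.append((clusters[i], clusters[j], distance))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical activity profiles; "far" plays the role of an outlying clone.
profiles = {
    "m1": (1.0, 1.0), "m2": (1.1, 1.0),
    "m3": (5.0, 5.0), "m4": (5.2, 5.0),
    "far": (20.0, 0.0),
}
merges = agglomerate(profiles)
```

Each entry of the merge history corresponds to one junction of the dendrogram, with the merge distance giving its height; an outlying point joins the tree last.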
Multivariate analysis of substrate-activity profiles is not dependent on information about amino acid and nucleotide sequences or variations in protein structures. Nevertheless, for an understanding of the relationship between functional plasticity and structure it is valuable to examine sequence variations that accompany the functional evolution. The parental DNA sequences of hGST T1-1 and rGST T2-2 are 63% identical, and the mutants analyzed in the F1 generation are predominantly based on one sequence or the other. Mutant 88 was a T2 sequence with a functional N49D modification and four silent codon mutations. The five mutants selected for prominent alkyltransferase activity were all highly similar in the 5' end of the coding DNA sequence and contained at least 59 nucleotides identical with the T2 sequence. The major portions of the mutant sequences, through the 3' end, were essentially T1 sequences with very few mutations. The common denominator of the five selected clones is the presence of Ser in position 14; none of the other mutants sequenced was found to contain this residue. Mutants 18 and 20 contained 74 and 141 nucleotides, respectively, identical with the T2 5' end, resulting in one and six non-silent mutations, respectively, in addition to the C14S modification in mutants 57, 61 and 63.
Further evolution of GSTs with enhanced alkyltransferase activity
The five selected clones from the F1 generation were subjected to DNA shuffling (Stemmer, 1994) together with cDNA encoding mouse and rat GST T1-1 wild-type sequences. The addition of the latter two sequences was made in order to increase the genetic diversity and reduce the risk of inbreeding in the creation of the F2 generation from the GST T1/T2 library.
The F2 generation of GST mutants was expressed in E.coli and lysates of bacterial clones were assayed for activity with NPB as well as with the alternative substrate EPNP. Both these activities can be monitored spectrophotometrically in real time and give more accurate values than the end-point assay of DCM. Dendrogram and K-means cluster analyses of 1031 mutants from the F2 generation identified a subset characterized by elevated alkyltransferase activity with the two alternative substrates (Figure 7).
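A K-means analysis of two-substrate activity data can be sketched as follows. The number of clusters, the starting centroids and the data points are assumptions chosen for illustration; the source does not report these settings.

```python
import math


def k_means(points, centroids, iterations=100):
    """Plain K-means: assign points to the nearest centroid, then update centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Invented (NPB, EPNP) activities: three low-activity and three high-activity clones.
points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (8.0, 9.0), (9.0, 8.0), (8.5, 8.5)]
labels, centroids = k_means(points, centroids=[points[0], points[3]])
```

Clones sharing the label of the centroid with the highest coordinates would form the candidate subset with elevated activity on both substrates.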
F2:1215 is a GST variant with increased alkyltransferase activity
The purified F2:1215 protein demonstrated significantly enhanced catalytic activity with NPB and EPNP in comparison with the parental hGST T1-1 (Table I) and rGST T2-2 (Jemth et al., 1996). The targeted alkyltransferase activity showed a 65-fold elevation of catalytic efficiency, kcat/Km, as measured with NPB. The epoxide addition, assayed with EPNP, was enhanced 7-fold in catalytic efficiency. With both NPB and EPNP an increased kcat value was the main contributor to the improved activity.
In the sequence analysis the additional silent mutations served as evidence for the three separate origins of the F2:1215 sequence. However, the synonymous codons also had functional consequences at the RNA level, since F2:1215 was expressed at a >10-fold higher level than the parental hGST T1-1.
From a general evolutionary point of view, it is noteworthy that the substitution of three amino acids leading to enhanced alkyltransferase activity of hGST T1-1 is not dependent on the recombination of DNA fragments from different mammalian species, but could have arisen by four separate point mutations in the human gene. It is therefore evident that DNA encoding the improved enzyme is only four steps removed in the nucleotide-sequence space from the DNA encoding wild-type hGST T1-1.
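The notion of a number of steps in nucleotide-sequence space corresponds to the Hamming distance between two aligned sequences of equal length. The sketch below uses short invented sequences, not the GST coding sequences, purely to show the count.

```python
def hamming_distance(seq_a, seq_b):
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


# Hypothetical 15-nucleotide fragments differing by four point mutations.
wild_type = "ATGGGCCTGGAGCTG"
variant = "ATGGGTCTAGAACTC"
steps = hamming_distance(wild_type, variant)
```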
Discussion
The directed evolution of protein function to a large extent involves isolation of mutants with improved properties from a population of variants obtained by mutagenesis (Voigt et al., 2000; Glieder et al., 2002; Lin and Cornish, 2002; Santoro and Schultz, 2002; Tao and Cornish, 2002). If the targeted protein can be made a selectable marker, improved variants can be isolated from host cells grown under stringent conditions. In some cases an evolving binding affinity can be the basis for the selection of improved variants (Hansson et al., 1997; Keefe and Szostak, 2001). However, in general the isolation of mutants with valuable properties has to be based on screening of individuals in the mutant population. As in nature, the directed evolution of optimized functions requires recursive changes of the genetic material through several generations. Conventional wisdom may suggest that the individual showing the highest improvement of the desired properties would be the best progenitor for the following generation. However, the breeding of animals and plants has demonstrated that too narrow a genetic background will lead to inbreeding and consequent degeneration of the progeny. At the level of molecular evolution the degenerative alterations may involve improper folding of the polypeptide chain, loss of thermal stability, etc.
Eigen and co-workers have developed the concept of the molecular quasi-species as a descriptor of the subpopulation from which the next improved generation derives (Eigen et al., 1988). The quasi-species can be regarded as a stochastic variable with a distribution in multidimensional factor space. Similarly, the directed evolution of molecules entails the crucial problem of selecting the group of variants that are suitable progenitors for a following new generation. In practice, this amounts to collecting a number of individuals with enhanced function from a population of mutants. By screening for several independent properties the desired genetic variability can be obtained in addition to the targeted property. On the other hand, excessive mutational diversity will lead to an error threshold, which must not be exceeded (Eigen et al., 1988). In the tailoring of proteins, nucleic acids (Gold et al., 1995; Joyce, 1998; Wilson and Szostak, 1999) and other molecules (Erlanson et al., 2000; Houghten, 2000) for novel functions, multivariate analysis and cluster analysis can be used to strategically guide the sampling of mutants.
The present study illustrates the use of multivariate analysis to monitor and guide directed enzyme evolution for enhanced activity. hGST T1-1 was evolved to 65-fold increased alkyltransferase activity by shuffling of DNA from mutant GST T1-1 clones selected on the basis of targeted properties. In order to broaden the genetic background, cDNA from rat and mouse GSTs was included in the creation of new recombinants of the selected mutants. A panel of six alternative substrates was used to explore the functional space, and the selection of clones was based on multidimensional analysis of activity profiles with the distinguishing substrates. The reactions monitored represent nucleophilic substitution reactions involving alkylhalides, i.e. DCM and NPB, aralkyl compounds, i.e. NBC and MS, and an aryl halide CDNB, as well as an epoxide addition reaction (EPNP). The principal component analysis of the kinetic data showed that the variance in the catalytic properties examined can essentially be accounted for by three orthogonal variables and it is clear that some of the activities are highly correlated. However, other choices of the numerous GST substrates (Mannervik and Danielson, 1988) may expand the substrate-activity space to dimensions higher than three.
In the present case, nucleophilic substitution in alkyl transfer reactions was the activity screened for. Alkylhalides have toxicological interest in view of their occurrence as environmental and occupational pollutants (Wheeler et al., 2001) as well as their similarity to cytostatic drugs used in cancer chemotherapy. However, GSTs have activity with different electrophilic functional groups, and directed evolution may selectively improve the catalytic efficiency with any of a variety of alternative substrates. Novel evolutionary pathways could target, for example, addition reactions. For this purpose, multivariate cluster analysis is a powerful approach to optimizing the selection of a suitable subset of molecular species to serve as progenitors of catalysts with desired properties.
References
Björnestedt,R., Widersten,M., Board,P.G. and Mannervik,B. (1992) Biochem. J., 282, 505-510.
Broo,K., Larsson,A.-K., Jemth,P. and Mannervik,B. (2002) J. Mol. Biol., 318, 59-70.
Eigen,M., McCaskill,J. and Schuster,P. (1988) J. Phys. Chem., 92, 6881-6891.
Erlanson,D.A., Braisted,A.C., Raphael,D.R., Randal,M., Stroud,R.M., Gordon,E.M. and Wells,J.A. (2000) Proc. Natl Acad. Sci. USA, 97, 9367-9372.
Flanagan,J.U., Rossjohn,J., Parker,M.W., Board,P.G. and Chelvanayagam,G. (1998) Proteins, 33, 444-454.
Glieder,A., Farinas,E.T. and Arnold,F.H. (2002) Nat. Biotechnol., 20, 1135-1139.
Gold,L., Polisky,B., Uhlenbeck,O. and Yarus,M. (1995) Annu. Rev. Biochem., 64, 763-797.
Habig,W.H. and Jakoby,W.B. (1981) Methods Enzymol., 77, 398-405.
Hansson,L.O., Widersten,M. and Mannervik,B. (1997) Biochemistry, 36, 11252-11260.
Houghten,R.A. (2000) Annu. Rev. Pharmacol. Toxicol., 40, 273-282.
Jemth,P. and Mannervik,B. (1997) Arch. Biochem. Biophys., 348, 247-254.
Jemth,P., Stenberg,G., Chaga,G. and Mannervik,B. (1996) Biochem. J., 316, 131-136.
Joyce,G.F. (1998) Proc. Natl Acad. Sci. USA, 95, 5845-5847.
Keefe,A.D. and Szostak,J.W. (2001) Nature, 410, 715-718.
Krzanowski,W.J. (2000) Principles of Multivariate Analysis. A User's Perspective. Oxford University Press, New York.
Lin,H. and Cornish,V.W. (2002) Angew. Chem. Int. Ed. Engl., 41, 4402-4425.
Mannervik,B. and Danielson,U.H. (1988) CRC Crit. Rev. Biochem., 23, 283-337.
Santoro,S.W. and Schultz,P.G. (2002) Proc. Natl Acad. Sci. USA, 99, 4185-4190.
Stemmer,W.P.C. (1994) Proc. Natl Acad. Sci. USA, 91, 10747-10751.
Stemmer,W.P.C. (2002) J. Mol. Catal. B Enzymol., 19-20, 3-12.
Tao,H. and Cornish,V.W. (2002) Curr. Opin. Chem. Biol., 6, 858-864.
Wheeler,J.B., Stourman,N.V., Thier,R., Dommermuth,A., Vuilleumier,S., Rose,J.A., Armstrong,R.N. and Guengerich,F.P. (2001) Chem. Res. Toxicol., 14, 1118-1127.
Wilson,D.S. and Szostak,J.W. (1999) Annu. Rev. Biochem., 68, 611-647.
Voigt,C.A., Kauffman,S. and Wang,Z.G. (2000) Adv. Protein Chem., 55, 79-160.
Received October 13, 2003; accepted October 16, 2003. Edited by Alan Fersht