Biased mutation-assembling: an efficient method for rapid directed evolution through simultaneous mutation accumulation

Norio Hamamatsu1, Takuyo Aita2,3, Yukiko Nomiya1, Hidefumi Uchiyama1, Motowo Nakajima1, Yuzuru Husimi2 and Yasuhiko Shibanaka1,4

1Tsukuba Research Institute, Novartis Pharma KK, Ohkubo 8, Tsukuba 300-2611 and 2Department of Functional Materials Science, Saitama University, Saitama 338-8570, Japan. 3Present address: Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, 2–43 Aomi, Koto-ku, Tokyo 135-0064, Japan

4 To whom correspondence should be addressed. E-mail: yasuhiko.shibanaka{at}pharma.novartis.com


    Abstract
 
We have developed an efficient optimization technique, ‘biased mutation-assembling’, for improving protein properties such as thermostability. In this strategy, a mutant library is constructed using the overlap extension polymerase chain reaction technique with DNA fragments from wild-type and phenotypically advantageous mutant genes, in which the number of mutations assembled in the wild-type gene is stochastically controlled by the mixing ratio of the mutant DNA fragments to wild-type fragments. A high mixing ratio results in a mutant composition biased to favor multiple-point mutants. We applied this strategy to improve the thermostability of prolyl endopeptidase from Flavobacterium meningosepticum as a case study and found that the proportion of thermostable mutants in a library increased as the mixing ratio was increased. If the proportion of thermostable mutants increases, the screening effort needed to find them should be reduced. Indeed, we isolated a mutant with a 1200-fold longer activity half-life at 60°C than that of wild-type prolyl endopeptidase after screening only 2000 mutants from a library prepared with a high mixing ratio. Our results indicate that an aggressive accumulation of advantageous mutations leads to an increase in the quality of the mutant library and a reduction in the screening effort required to find superior mutants.

Keywords: additivity principles/directed evolution/prolyl endopeptidase/thermostability


    Introduction
 
Directed evolution is a useful approach for improving a protein's physicochemical fitness, e.g. its thermostability. Various enzymatic industrial catalysts and pharmaceutical products currently in use were developed using directed evolution techniques (Neylon, 2004), and the DNA shuffling method in particular has found widespread application (Stemmer, 1994a,b). However, because wild-type fragments outnumber mutant fragments in a recombination reaction, reconstituted genes in a DNA shuffling library carry only a few mutations, which limits the diversity of the mutant library. Ness et al. (2002) developed the synthetic shuffling method, which increases mutant diversity in a DNA shuffling library. In this approach, instead of double-stranded DNA fragments produced by random DNase digestion, ingeniously designed single-stranded DNAs are used for the production of mutant genes. Kikuchi et al. (2000) also reported an effective family shuffling technique that uses single-stranded DNAs randomly digested by DNase. We reported ‘mutation scrambling’ as our original optimization strategy (Uchiyama, 1999), which is based on the overlap extension PCR technique (Pogulis et al., 1996). The resulting mutant library contains every conceivable combination of advantageous mutations at a stochastically even frequency of occurrence. Thus, the directed evolution strategies developed recently focus on expanding mutant diversity in a library by exploring as large a region of sequence space as possible, with an equal probability of mutation incorporation (Zha et al., 2003).

Wells proposed that mutational effects on the properties of biopolymers are almost independent and additive (Wells, 1990). Dill called this concept the additivity principle (Dill, 1997). Many studies, including our own (Aita et al., 2000, 2001, 2002), support these principles (Takeda et al., 1989; Houghten et al., 1991; Lowman and Wells, 1993). Therefore, it is tempting to design mutants that incorporate all known advantageous mutations. However, certain mutations that are advantageous when present alone in a protein diminish each other's physicochemical effects when combined (the epistatic effect; Kauffman and Macready, 1995; Aita et al., 2002). It is therefore desirable to assemble many component mutations while excluding the few combinations that are epistatic. Here, we report an original directed evolution strategy that biases the creation of mutant libraries to favor the presence of multiple advantageous mutations within a mutant. This strategy, biased mutation-assembling, uses the technical framework of mutation scrambling (Uchiyama, 1999) and the theoretical framework developed previously (Aita and Husimi, 2000).

A schematic of biased mutation-assembling is depicted in Figure 1. The method consists of the following steps:

  1. Initially, for a given gene, advantageous single-point mutations are identified using a random mutagenesis technique. These are then combined to produce new mutants containing multiple mutations as described below.
  2. The gene sequence is mapped into blocks, such that a block contains, by definition, one mutation. Blocks are also designed so that the ends of each block overlap the ends of adjacent blocks. DNA fragments, corresponding to the block sequences, are amplified from both wild-type and mutant genes. Note that each block ideally incorporates a single mutation; however, it is inevitable that more than one mutation will be present if the mutations are close in sequence.
  3. All DNA fragments are mixed in a reaction tube and recombinant genes are produced using overlap extension PCR without primers. This reaction produces full-length genes and a library composed of mutants with all conceivable mutation combinations. This library is designated the ‘assembling library’, because the individual mutations are stochastically assembled into the full-length gene products. It is the ratio of mutant to wild-type fragments in the reaction mixture that determines the probability of mutation incorporation.
  4. Mutants with the desired properties, owing to incorporation of multiple mutations and without epistatic mutations, are then identified.



 
Fig. 1. A schematic example of biased mutation-assembling, assuming a basis set of three mutations. The circle, triangle and square each represent one mutation. A block represents a portion of the gene containing one mutation and represents a recombination unit. The double-headed arrows represent overlapping sequences between adjacent blocks and these overlapping sequences hybridize during PCR recombination.

 
This protocol defines the optimization algorithm for biased mutation-assembling in binary n-dimensional sequence space. The probability of incorporating a mutation into a site is the bias, r, which is the fraction of all mutant DNA fragments containing the mutation that defines the block to the total DNA concentration present during recombination:

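$$r = \frac{[\text{mutant DNA fragments}]}{[\text{mutant DNA fragments}] + [\text{wild-type DNA fragments}]} \qquad (1)$$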
where [mutant DNA fragments] and [wild-type DNA fragments] are the concentrations of the DNA fragments with and without the defining mutation in the recombination mixture, respectively. The extent to which mutations accumulate is stochastically controlled by the r value (0 ≤ r ≤ 1), because a complete gene will be reconstituted only at overlapping ends of adjacent DNA fragments. Let n be the number of component mutations. A library constructed with an r value of 0 or 1.0 contains only the wild-type gene or only the mutant gene with all n mutations, respectively. Libraries constructed with intermediate r values (0 < r < 1) will contain mutant genes with an average of approximately rn mutations. The number of accumulated mutations, d, obeys the binomial distribution $\binom{n}{d} r^{d}(1-r)^{n-d}$. In constructing assembling libraries, the r value should be large (near one, but not equal to one), so that mutants that incorporate many advantageous mutations, but not epistatic mutations, can be found. Because libraries with large r values contain few mutants that have a small number of mutations, the screening effort required to find mutants with the most desirable properties is reduced (Aita and Husimi, 2000).
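
To make the effect of the bias concrete, the binomial model described above can be evaluated numerically. The following is a minimal sketch and not part of the original protocol: the function names are hypothetical, and the only inputs are n and r as defined in the text.

```python
from math import comb


def mutation_count_distribution(n: int, r: float) -> list:
    """Return [P(d) for d = 0..n], with P(d) = C(n, d) * r**d * (1 - r)**(n - d)."""
    return [comb(n, d) * r**d * (1 - r) ** (n - d) for d in range(n + 1)]


def mean_mutation_count(n: int, r: float) -> float:
    """Mean number of mutations per gene; equals r * n for a binomial distribution."""
    return sum(d * p for d, p in enumerate(mutation_count_distribution(n, r)))


if __name__ == "__main__":
    n = 14  # size of the PEP basis set described in the text
    for r in (0.3, 0.5, 0.7, 0.8, 0.9):
        dist = mutation_count_distribution(n, r)
        many = sum(dist[12:])  # fraction of genes carrying 12 or more mutations
        print(f"r = {r}: mean = {mean_mutation_count(n, r):.1f}, "
              f"P(d >= 12) = {many:.4f}")
```

Running the script prints, for each r value, the mean mutation count and the fraction of the library carrying at least 12 of the 14 mutations, which shows how a larger r shifts the library toward multiple-point mutants.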

Prolyl endopeptidase (PEP) (prolyl oligopeptidase, EC 3.4.21.26) is the only endopeptidase known to show substrate specificity for proline and therefore it is expected to be applicable as a catalyst for pharmaceutical production, e.g. coupling reactions of biologically active peptides, such as C-terminal amidation of luteinizing hormone-releasing hormone (LH-RH) (Togame et al., 1994). However, PEP is susceptible to inhibition and inactivation under conditions for coupling reactions, such as the presence of organic solvents and high temperatures (Kreig and Wolf, 1995). Uchiyama et al. (2000) suggested that thermostabilization (prolongation of activity half-life at high temperature) had a beneficial effect on stability in the coupling reaction. Here, we describe the biased mutation-assembling protocol and the application of the strategy to the prolongation of the activity half-life of PEP from Flavobacterium meningosepticum at high temperature (thermostabilization of PEP) as a case study. We developed a basis set of 14 mutations that individually improved the thermostability of PEP and used these mutations to prepare assembling libraries with r values of 0.3, 0.5, 0.7, 0.8, 0.9 and 1.0. We found that the r value of a library directly correlates with the number of thermostable mutants in the library. After assaying only 2000 mutants, we found one, in the library with an r value of 0.9, that has a 1200-fold longer activity half-life than the wild-type enzyme at 60°C. This mutant contains 12 mutations and lacks an epistatic pair.


    Materials and methods
 
Development of a mutation basis set to improve PEP thermostability

We previously constructed the vector pUK-FPEPb to express PEP in Escherichia coli (Uchiyama et al., 2000). This vector contains the PEP gene, which is inserted between EcoRI and PstI restriction sites (Figure 2). The gene was amplified using error-prone polymerase chain reaction (PCR) as described previously (Uchiyama et al., 2000), with experimental conditions optimized so that only one or at most a few nucleotide substitutions were introduced (Leung et al., 1989). Amplified DNAs were digested with EcoRI and PstI and then ligated into pUK-FPEPb vector, which was digested with the same enzymes. The vectors were transformed into E. coli JM109 and randomly chosen transformants were screened for PEP thermostability using an active-staining method (Uchiyama et al., 2000). Mutations that were responsible for the thermostabilization were identified by DNA sequencing.



 
Fig. 2. Block and fragment design for the PEP assembling library. The open circles represent the 14 mutations that form the basis set. The PEP gene is divided into six blocks and the corresponding sets of DNA fragments are numbered I–VI. Fragments I–V include the mutants S19, S110, S475, N542 and K615 near the sequence midpoint, respectively. Arrows represent the primers used to prepare the fragments. The open triangles on the arrows represent mutations located on the primers. Primer names, except for EcoRI-396-f, 550-f, PstI+505-r and 555-r, indicate the corresponding mutations contained in the primers; for example, 67Q/70L-f is the forward primer encoding the 67Q and 70L mutations.

 
Construction of PEP biased mutation-assembling libraries

We found 14 single-point mutations, S19T, E67Q, F70L, S110F, N387K, A388V, S475G, I493V, E496K, N542I, K615R, G652V, S653A and Q656R, which individually improved PEP thermostability at 57.5°C. These mutations form the basis set for the assembling libraries reported here. Since E67Q–F70L, N387K–A388V, I493V–E496K and G652V–S653A–Q656R are each close in sequence, we constructed libraries by modifying the protocol shown in Figure 1. The PEP gene was divided into six blocks, labeled I–VI, and sets of corresponding DNA fragments were labeled in the same manner (Figure 2). Blocks I–V (and therefore the corresponding sets of DNA fragments) were designed so that each block carried the closely located mutations in its overlapping regions and one of the remaining mutations near its midpoint. Figure 2 shows the detailed design of the blocks. Incorporating proximal mutations into the primers for amplifying DNA fragments allowed us to control the number of mutations present in each DNA fragment and, consequently, in the reconstituted genes. For example, fragment I was amplified from both the wild-type and the S19T mutant gene with the forward primer designated EcoRI-396-f and four reverse primers designated 67Q/70L-r, 67Q/70F-r, 67E/70L-r and 67E/70F-r, each of which coded for one of the four combinations of residues at positions 67 and 70. Fragment I therefore formed a set of eight ($2^3$) individually prepared DNA fragments: one wild-type fragment, three fragments with a single mutation, three fragments with two mutations and one fragment with three mutations, as shown in Table I. Figure 2 also gives detailed information on the primers, which are named according to the mutations encoded and the direction (sense or anti-sense); for example, primer 67Q/70L-r is the reverse primer coding for both the E67Q and F70L mutations.


 
Table I. The mixing fraction for each DNA fragment in the fragment I set, assuming an r-value of 0.9

 
When a block can be designed to include only one mutation in the sequence, only two DNA fragments, the wild-type and the mutant, must be amplified as shown in Figure 1. To obtain a desired r value for the mutation, the two fragments are mixed at a molar ratio of (1 – r):r (wild-type:mutant). The mixture is used as the block-corresponding fragment for a recombination reaction. In this study, however, each block included at least three mutations; therefore, each block-corresponding fragment was composed of at least eight component DNA fragments. For any given r value, the molar fraction of a component fragment with i mutations is $r^{i}(1-r)^{m-i}$ (for i = 0, 1, 2, ..., m), where m is the total number of mutations in the block. For blocks I, II, III, IV, V and VI, m = 3, 5, 5, 3, 4 and 3, respectively. Table I lists the molar ratio of each component fragment in fragment I for an r value of 0.9. All block-corresponding fragments were prepared by mixing component fragments in the same manner. The corresponding fragments, I–VI, were present at equimolar concentrations in the recombination reaction. In the recombination process, primers were not present during the first five thermal cycles of the overlap extension PCR. After five cycles, the primers EcoRI-396-f and PstI-505-r were added, so that the full-length PEP gene was amplified, and the thermal reaction cycle was repeated 20 more times. The full-length gene products were digested with EcoRI and PstI and then ligated into pUK-FPEPb vector, thereby producing an assembling library. Finally, E. coli JM109 was transformed with the assembling library. The transformants were cultured only on Luria broth (LB) agar, not in LB liquid medium, containing ampicillin. This minimized any bias towards transformants that grew rapidly. In this study, assembling libraries were prepared with r values of 0.3, 0.5, 0.7, 0.8, 0.9 and 1.0.
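
The mixing rule for a multi-mutation block can be sketched in a few lines. This is an illustrative calculation under the formula stated above, not a reproduction of Table I; the function name and the fragment labels are hypothetical, and the three mutations used are those assigned to block I in the text.

```python
from itertools import product


def mixing_fractions(mutations, r):
    """Map every presence/absence combination of the block's mutations to its
    molar fraction r**i * (1 - r)**(m - i), where i mutations are present."""
    m = len(mutations)
    fractions = {}
    for present in product((False, True), repeat=m):
        i = sum(present)  # number of mutations carried by this fragment
        label = "+".join(name for name, on in zip(mutations, present) if on)
        fractions[label or "wild-type"] = r**i * (1 - r) ** (m - i)
    return fractions


if __name__ == "__main__":
    # Block I carries S19T, E67Q and F70L (m = 3), giving 2**3 = 8 fragments.
    for label, fraction in mixing_fractions(["S19T", "E67Q", "F70L"], 0.9).items():
        print(f"{label:>16}: {fraction:.3f}")
```

Because the fractions over all $2^m$ component fragments sum to one, they can be used directly as the molar proportions when pooling the individually amplified fragments.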

Fluorescent measurement using a Pico Green double-stranded DNA quantitation kit (Molecular Probes) was employed for quantification of DNA fragments.

Preparation of periplasmic lysates containing PEP mutants

In our expression system of PEP using pUK-FPEPb and E. coli JM109, PEP is found not only in the cytoplasmic compartment of E. coli, but also in the periplasmic compartment (Uchiyama et al., 2000). As shown by Beacham (1979) and Swamy and Goldberg (1982), the periplasmic space of E. coli has a limited number of proteases, only seven out of the 25 known cellular proteases, and comprises only 4–8% of the total cell protein. Also, we have reported the isolation of thermostable PEP mutants using whole cell lysate of E. coli JM109 (Uchiyama et al., 2000). In addition, we could not find any remarkable decrease in the PEP activity in the periplasmic lysate over 15 min even at 50°C. For these reasons, we considered that periplasmic lysates could be used for the evaluation of PEP thermostability with less proteolytic degradation effect. We prepared periplasmic lysates using a modified method reported by French et al. (1996) as follows: transformants, which formed colonies on LB agar, were randomly selected and then cultured in 200 µl of LB liquid medium containing 100 µg/ml ampicillin in 96 deep-well culture plates at 37°C for 14 h with shaking at 200 r.p.m. The transformants were harvested by centrifugation at 10 000 g for 5 min and washed with 200 µl of 20 mM potassium phosphate buffer (pH 7.0). The cells were centrifuged again, resuspended in 75 µl of osmotic shock buffer [100 mM Tris–HCl (pH 8.0), 0.5 M sucrose, 5 mM EDTA, 0.16 mg/ml lysozyme] and incubated on ice for 2 min. Next, 75 µl of cold distilled water were added and the incubation was continued on ice for 30 min. After centrifugation at 10 000 g for 5 min, the supernatant was used as the periplasmic lysate for evaluation of PEP activity half-life.

Evaluation of PEP mutant thermostability

Volumes of 50 µl of each periplasmic lysate were kept at 57.5 or 60°C for 5, 10 or 15 min, then PEP activity was assayed and, assuming first-order kinetics, a PEP activity half-life (t1/2) was calculated. The screening of mutants was conducted at 57.5°C, at which nearly 90% of the wild-type activity was lost during a 15 min heat treatment. Since highly thermostable mutants had very long half-lives at 57.5°C, which could be measured only with a large error, we evaluated such mutants at a higher temperature, 60°C, to measure their half-lives precisely. For the PEP activity assay, 10 µl of a periplasmic lysate were added to 140 µl of PEP assay buffer [100 mM potassium phosphate (pH 7.0), 100 µg/ml bovine serum albumin, 1 mM dithiothreitol, 0.2 mM Z-Gly-Pro-pNA; Bachem] and the absorbance at 410 nm was monitored at 30°C. The PEP activity half-life is independent of the total protein concentration of the lysate and of the initial lysate PEP activity (data not shown); consequently, it was not necessary to standardize or correct for protein concentration in the assay mixtures.
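
One way to obtain a half-life from such time-course data under the first-order assumption is a least-squares fit of ln(activity) against heating time. The sketch below is an assumption about the fitting procedure, which the text does not specify; the function name is hypothetical, the inclusion of an untreated (0 min) sample is assumed, and the activity values are synthetic rather than measured.

```python
import math


def half_life(times_min, activities):
    """Fit ln(A) = ln(A0) - k * t by ordinary least squares and
    return the half-life t1/2 = ln(2) / k in minutes."""
    ys = [math.log(a) for a in activities]
    n = len(times_min)
    t_mean = sum(times_min) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in times_min)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_min, ys))
    k = -sxy / sxx  # first-order inactivation rate constant (per minute)
    if k <= 0:
        raise ValueError("no measurable loss of activity; half-life undefined")
    return math.log(2) / k


if __name__ == "__main__":
    times = [0, 5, 10, 15]  # heating times; the 0 min point is an assumption
    # Synthetic residual activities generated for a 5 min half-life.
    residual = [100 * 0.5 ** (t / 5) for t in times]
    print(f"t1/2 = {half_life(times, residual):.2f} min")
```

Because only the slope of ln(activity) enters the result, the estimate does not depend on the absolute activity of the lysate, in line with the observation that no correction for protein concentration was needed.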


    Results
 
Identification of advantageous mutations that improved PEP thermostability

Initially, using an error-prone PCR technique and active-staining method (Uchiyama et al., 2000), we found 14 mutations that individually improved PEP thermostability, namely S19T, E67Q, F70L, S110F, N387K, A388V, S475G, I493V, E496K, N542I, K615R, G652V, S653A and Q656R. The PEP activity half-lives at 57.5°C for mutants which have one of these mutations are longer than that of the wild-type enzyme (Aita et al., 2002). These 14 mutations form the basis set for the biased mutation-assembling libraries. As shown in Figure 3, the mutations were mapped to the corresponding positions of the three-dimensional porcine muscle PEP structure (Fülöp et al., 1998). With the exception of S19T, for which a homologous site does not exist, many of these mutations are spatially located near the N- and C-termini. Mutations near the termini might decrease thermal motion, thereby stabilizing the catalytic domain.



 
Fig. 3. Mapping of the basis set mutations on to the porcine PEP crystal structure. The PEP crystal structure is that of porcine muscle (prolyl oligopeptidase, EC 3.4.21.26; Fülöp et al., 1998; Protein Data Bank ID 1QFM). The sequences of F. meningosepticum PEP and porcine PEP were aligned and a 38.3% sequence homology was found. Using the alignment results, 13 basis set mutations were mapped on to the crystal porcine PEP structure. The blue region is the catalytic domain (residues 1–72 and 428–710). The gray region is the propeller domain (residues 73–427). The mutations, with amino acid type and position identified, are colored red. The corresponding amino acids and their positions for porcine PEP are given in parentheses. The F. meningosepticum PEP Ser19 residue is not shown because no homologous porcine residue exists. The active-site serine is colored yellow.

 
Evaluation of PEP thermostability for assembling library mutants

Approximately 150–300 transformants were randomly picked from each assembling library prepared with individual r values of 0.3, 0.5, 0.7, 0.8 and 0.9. The PEP activity half-lives at 57.5°C for the periplasmic lysates of these transformants were determined. Between 10 and 20% of the transformants in each library showed markedly low PEP activity and were excluded from the evaluation. The distribution of the PEP activity half-lives for each library is presented as scatter plots in Figure 4a, in which each point in a column represents a mutant in the corresponding assembling library. Figure 4b shows the maximum activity half-lives for the mutants in each library. The half-lives of wild-type PEP and the mutant with all 14 mutations are included in Figure 4a and b. Clearly, a large r value correlates directly with increased production of mutants with greatly improved thermostability. The most thermostable mutant among the 264 samples in the library prepared with r = 0.9 shows an ~1600-fold longer activity half-life at 57.5°C than wild-type PEP. However, when r = 1.0, the mutant with all 14 mutations is less thermostable than the most thermostable one and shows at most a 38-fold longer half-life than wild-type PEP. That the mutant containing all 14 mutations does not have a greatly increased thermostability is probably a consequence of the epistatic mutation pair. As the r value increases, the activity half-lives of the majority of mutants converge on approximately that of the mutant with 14 mutations. These results suggest that the mutational effect among the 14 mutations is almost additive, but that there are a few significantly epistatic pairs that cause a diminishing return in PEP thermostability.



 
Fig. 4. The thermostability distribution for PEP mutants in assembling libraries of different r values. (a) The distribution of PEP half-lives versus the r value, where r is defined in Equation 1. The PEP activity half-life was determined at 57.5°C for ~150–300 transformants selected randomly from each library as described in the Materials and methods section. The open circles represent individual mutants, with N the total number of mutants. The half-life of wild-type PEP, at 57.5°C, is 0.7 min, so that ln t1/2 is –0.36. (b) The half-life of the most thermostable mutant in each column in (a) and its value relative to that of the wild-type PEP. The half-life at an r value of 1.0 represents that of the mutant with all 14 mutations.

 
Estimation of screening size required to find the fittest mutant

Figure 4 indicates that a biased library with a large r value should reduce the number of samples that must be assayed when the fittest mutants in binary n-dimensional sequence space are to be found. Since assembling libraries have a binomial distribution of mutant genes, we can estimate the minimum number of assays required to find the mutant with the best mutation combination. The probability, P, that a particular mutant with d point mutations exists among N samples picked randomly from an assembling library with a given r value is

$$P = 1 - \left[1 - r^{d}(1-r)^{n-d}\right]^{N} \qquad (2)$$
where n is the number of mutations in the basis set and is 14 for the PEP assembling libraries. If a numerical value is assigned to P, N can be calculated according to this equation. When d = 12 and P = 0.999, N is determined as 113 200, 5500, 2500 and 2400 for r values of 0.5, 0.7, 0.8 and 0.9, respectively. For these calculations, we assumed that the most thermostable mutant would have 12 mutations because our previous studies indicated that the fittest mutant contained ~80–90% of the basis set mutations. Our calculations suggest that the number of assays needed to find mutants with maximum thermostability decreases dramatically as the r value increases. To confirm this prediction, we screened 2000 and 40 000 transformants selected randomly from assembling libraries with r values of 0.9 and 0.5, respectively. Table II lists the top four thermostable PEP mutants found in the library with an r value of 0.9 and the top two thermostable mutants found in that with an r value of 0.5. The PEP activities of these mutants under physiological conditions (37°C) are similar to that of wild-type PEP. Mutant 22, the most thermostable mutant found in the library with an r value of 0.9, shows a 1200-fold longer activity half-life at 60°C than wild-type PEP. The activity half-life of mutant 22 is significantly longer than that of mutant 7, which is the most thermostable mutant found in the library with an r value of 0.5, in spite of the fact that 38 000 more transformants were screened in the latter library. The top four mutants found in the library with an r value of 0.9 contain almost all component mutations, but do not contain Q656R. It is known that if E67Q and Q656R are simultaneously incorporated in a sequence, this pair, at least, causes an epistatic effect and reduces the thermostability drastically (Uchiyama et al., 2000; Aita et al., 2002). These results demonstrate that our strategy both reduced the required screening effort and found what may be the best combination of mutations, while avoiding epistatic effects. The reduced screening effort is especially beneficial when the number of conceivable combinations of mutations is large.
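
The screening-size estimate can be sketched numerically. The code below assumes that a specific mutant carrying d of the n mutations occurs with probability $r^{d}(1-r)^{n-d}$ per random pick and that picks are independent, so that solving for N gives $N = \ln(1-P)/\ln\left[1 - r^{d}(1-r)^{n-d}\right]$; the function names are hypothetical.

```python
import math


def genotype_probability(n: int, d: int, r: float) -> float:
    """Probability that one random pick is a specific mutant carrying d of the
    n basis-set mutations, each incorporated independently with probability r."""
    return r**d * (1 - r) ** (n - d)


def required_screening_size(n: int, d: int, r: float, P: float) -> float:
    """Number of random picks N such that the specific mutant appears at least
    once with probability P, from P = 1 - (1 - p)**N."""
    p = genotype_probability(n, d, r)
    return math.log(1 - P) / math.log1p(-p)  # log1p keeps precision for tiny p


if __name__ == "__main__":
    for r in (0.5, 0.7, 0.8, 0.9):
        N = required_screening_size(n=14, d=12, r=r, P=0.999)
        print(f"r = {r}: N = {N:,.0f}")
```

With n = 14, d = 12 and P = 0.999, the unrounded values returned for r = 0.5, 0.7, 0.8 and 0.9 should lie close to the rounded figures quoted in the text.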


 
Table II. The mutation composition and PEP activity half-life at 60°C for the most thermostable PEP mutant found in assembling libraries with r-values of 0.9 and 0.5

 

    Discussion
 
Biased mutation-assembling and other methods, e.g. DNA shuffling (Stemmer, 1994a,b), are similar in concept: novel mutants, containing several mutations, are produced by recombination of single- or double-stranded DNA fragments derived from wild-type and mutant or homologous genes. DNA shuffling is a simple and effective method for preparing a chimera gene library; however, mutant diversity in shuffling libraries tends to be small. Various modified methods expand the mutant population, while leveling the concentrations of individual mutants by more thoroughly exploring sequence space. Our method of biased mutation-assembling also produces a library with a large diversity; however, the method biases the content to favor multiple-point mutants. Biased mutation-assembling depends on the assumption that additivity principles are valid and the results in Table II are consistent with this assumption.

For biased mutation-assembling, the incorporation frequency for each mutation is stochastically controlled by the mixing fraction (which is determined by the r value) of the mutant DNA fragments present during recombination. Our previously reported mutation scrambling corresponds to the case with an r value of 0.5, which does not exploit the additivity principles. Therefore, the library contains every conceivable combination of advantageous mutations at a stochastically even frequency of occurrence (Uchiyama, 1999). In this study, the library of PEP mutants, for which all mixing fractions were calculated using an r value of 0.9, contained many thermostable PEP mutants and consequently the screening effort needed to find such mutants was significantly reduced. For other methods, such as DNA shuffling, assuming that all mutations are introduced independently in a binary manner, the r value is the reciprocal of the number of mutations, n: r = 1/n. If we had used such a method to find thermostable PEP mutants, the r value would have been, at most, 0.07 (1/14) and the screening size $4.5 \times 10^{14}$. However, when DNA fragments are prepared by DNase digestion, mutations, especially those close in sequence, usually remain linked and recombine together; therefore, the r value would almost certainly have been smaller and the number of assays required larger. Synthetic shuffling (Ness et al., 2002), which is one of the modified DNA shuffling methods, can also produce a biased library, but each mutation was introduced with an approximately equal probability of incorporation. It will be interesting to integrate additivity principles into this method. Additionally, for biased mutation-assembling, the incorporation probability of an individual mutation can be controlled by adjusting the value of the mixing fraction for each fragment containing a given mutation, although, in this study, the same r value was used for all 14 mutations. For example, if a mutation has a large (or a small) beneficial effect on a desired property, the r value in the equation mixing fraction = $r^{i}(1-r)^{m-i}$ should be set near 1.0 (or near 0.5). We have used this strategy to improve the substrate specificity of glucose dehydrogenase-B (GDHB) (EC 1.1.5.2) from Acinetobacter calcoaceticus for glucose. This enzyme shows broad substrate specificity, and its activity against maltose in particular greatly diminishes its utility. A set of mutations was identified that decreased the activity against maltose. Some of these mutations also decreased the activity against glucose, whereas others did not. Using an r value of 0.5 for the first subset of mutations and 0.9 for the second subset, we were able to decrease specifically the activity against maltose by 4% of wild-type GDHB without significantly affecting the activity against glucose (retaining 40% of wild-type GDHB). Consequently, we found what is probably the best combination of mutations resulting in improved specificity for glucose (manuscript in preparation).
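
Per-mutation control of the bias can be illustrated with a small generalization of the uniform-r model. This is a sketch under the assumption that each mutation j is incorporated independently with its own probability r_j; the function names and the five-mutation example are hypothetical and do not describe the GDHB basis set.

```python
def genotype_probability(r_values, genotype):
    """Probability of one specific genotype when mutation j is incorporated
    independently with probability r_values[j]; genotype[j] is True if present."""
    p = 1.0
    for r, present in zip(r_values, genotype):
        p *= r if present else (1 - r)
    return p


def expected_mutation_count(r_values):
    """Mean number of mutations per gene: the sum of the individual r values."""
    return sum(r_values)


if __name__ == "__main__":
    # Hypothetical basis set: two mutations with side effects (r = 0.5)
    # and three purely beneficial mutations (r = 0.9).
    r_values = [0.5, 0.5, 0.9, 0.9, 0.9]
    target = [False, False, True, True, True]  # only the three beneficial ones
    print(f"expected mutations per gene: {expected_mutation_count(r_values):.1f}")
    print(f"P(target genotype) = {genotype_probability(r_values, target):.4f}")
```

When every r_j is equal, the genotype probability reduces to the single-r expression $r^{d}(1-r)^{n-d}$ used for the PEP libraries.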

On the other hand, when many mutations are closely located, as described here, biased mutation-assembling requires many primers and PCR reactions to prepare the fragments and therefore more work than other methods, such as those based on DNase digestion. However, when the component mutations are scattered widely enough that each block can be designed to contain only one mutation, as shown in Figure 1, biased mutation-assembling is as simple as ordinary overlap extension PCR.

A biased library is also produced if the wild-type backcross strategy, for which a mutant with all mutations is recombined with wild-type DNA, is used (Zhao and Arnold, 1997). However, as discussed above, mutations, especially those close in sequence, are usually linked during recombination, which makes the control of the incorporation probability of individual mutations virtually impossible. Additionally, this strategy requires the advance preparation of a gene containing all desired mutations. For these reasons, the wild-type backcross strategy is a less efficient way to prepare a biased library than is biased mutation-assembling.

A crucial element for the successful design of biased mutation-assembling libraries is the composition of the mutation basis set. We found the basis set for the PEP assembling library using error-prone PCR. However, when error-prone PCR is the chosen method, codon degeneracy limits the range of amino acid substitutions that can occur and, therefore, not all desirable mutations may be found. This limitation can be overcome if other methods are used to produce point mutations. For example, Murakami et al. developed a method that introduces mutations into a gene without the limitations imposed by codon degeneracy (Murakami et al., 2002). This powerful method exhaustively identifies single-point mutations that improve the physicochemical properties of a protein. Combining this method with biased mutation-assembling may make directed evolution experiments more efficient.

In our study, an r value of 0.9 most effectively improved the thermostability of PEP; however, in general, the optimal r value depends on how rugged the fitness landscape is for a particular protein (Aita et al., 2000). Previously, we roughly estimated that the optimal r value is between 0.8 and 0.9 for most proteins (Aita and Husimi, 2000; Aita et al., 2000, 2001, 2002). Additional characterization of protein fitness landscapes should improve the estimate of the statistically optimal r value.

In summary, by applying biased mutation-assembling to the improvement of the thermostability of PEP as a case study, we have shown that it is an efficient optimization technique for improving protein properties. To show that biased mutation-assembling is a generally applicable method, we have produced libraries for several other enzymes and found mutants with improved thermostability and substrate specificity. Additional studies of other proteins will be needed to validate this method further. The statistical and theoretical nature of protein fitness landscapes is the foundation for the concept of biased mutation-assembling. We believe that if the knowledge gained from protein fitness landscape calculations is incorporated into new, rapid protein evolution methods, such methods will be robust and reliable.


    Acknowledgements
 
This work was performed as part of the R&D Project of the Industrial Science and Technology Frontier Program supported by the New Energy and Industrial Technology Development Organization, Japan.


    References
 
Aita,T. and Husimi,Y. (2000) J. Theor. Biol., 207, 543–556.

Aita,T., Uchiyama,H., Inaoka,T., Nakajima,M., Kokubo,T. and Husimi,Y. (2000) Biopolymers, 54, 64–79.

Aita,T., Iwakura,M. and Husimi,Y. (2001) Protein Eng., 14, 633–638.

Aita,T., Hamamatsu,N., Nomiya,Y., Uchiyama,H., Shibanaka,Y. and Husimi,Y. (2002) Biopolymers, 64, 95–105.

Beacham,I.R. (1979) Int. J. Biochem., 10, 877–883.

Dill,K.A. (1997) J. Biol. Chem., 272, 701–704.

French,C., Keshavarz-Moore,E. and Ward,J.M. (1996) Enzyme Microb. Technol., 19, 332–338.

Fülöp,V., Böcskei,Z. and Polgar,L. (1998) Cell, 94, 161–170.

Houghten,R.A., Pinilla,C., Blondelle,S.E., Appel,J.R., Dooley,C.T. and Cuervo,J.H. (1991) Nature, 354, 84–86.

Kauffman,S.A. and Macready,W.G. (1995) J. Theor. Biol., 173, 427–440.

Kikuchi,M., Ohnishi,K. and Harayama,S. (2000) Gene, 243, 133–137.

Kreig,F. and Wolf,N. (1995) Appl. Microbiol. Biotechnol., 42, 844–852.

Leung,D.W., Chen,E. and Goeddel,D.V. (1989) Technique J. Methods Cell Mol. Biol., 1, 11–15.

Lowman,H.B. and Wells,J.A. (1993) J. Mol. Biol., 234, 564–578.

Murakami,H., Hohsaka,T. and Sisido,M. (2002) Nat. Biotechnol., 20, 76–81.

Ness,J.E., Kim,S., Gottman,A., Pak,R., Krebber,A., Borchert,T.V., Govindarajan,S., Mundorff,E.C. and Minshull,J. (2002) Nat. Biotechnol., 20, 1251–1255.

Neylon,C. (2004) Nucleic Acids Res., 32, 1448–1459.

Pogulis,R.J., Vallejo,A.N. and Pease,L.R. (1996) Methods Mol. Biol., 57, 167–176.

Stemmer,W.P.C. (1994a) Nature, 370, 389–391.

Stemmer,W.P.C. (1994b) Proc. Natl Acad. Sci. USA, 91, 10747–10751.

Swamy,K.H.S. and Goldberg,A.L. (1982) J. Bacteriol., 149, 1027–1033.

Takeda,Y., Sarai,A. and Rivera,V.M. (1989) Proc. Natl Acad. Sci. USA, 86, 439–443.

Togame,H., Inaoka,T. and Kokubo,T. (1994) J. Chem. Soc., Chem. Commun., 1107–1108.

Uchiyama,H. (1999) Bio Industry, 16, 27–35 (in Japanese).

Uchiyama,H., Inaoka,T., Ohkuma-Soyejima,T., Togame,H., Shibanaka,Y., Yoshimoto,T. and Kokubo,T. (2000) J. Biochem., 128, 441–447.

Wells,J.A. (1990) Biochemistry, 29, 8509–8517.

Zha,D., Eipper,A. and Reetz,M.T. (2003) Chembiochem, 4, 34–39.

Zhao,H. and Arnold,F.H. (1997) Proc. Natl Acad. Sci. USA, 94, 7997–8000.

Received October 22, 2004; revised March 30, 2005; accepted April 11, 2005.

Edited by Alan Fersht




