1 Division of Parasitology, 2 Division of Protein Structure and 3 Division of Virology, National Institute for Medical Research, Mill Hill, London NW7 1AA, UK
Abstract |
---|
Keywords: gene synthesis/Pichia pastoris/Plasmodium falciparum/protein expression/subtilisin-like protease
Introduction |
---|
Novel strategies are urgently needed to combat malaria. Over two-thirds of the world's population reside in malaria endemic regions and there are thought to be up to 500 million clinical cases of the disease annually. Serious complications following infection with P. falciparum are frequent and falciparum malaria is estimated to cause between 1.5 and 2.7 million deaths per year (Anon., 1994). There is no widely available vaccine against malaria and drug-resistant P. falciparum is a widespread and growing problem. Clinical malaria is caused by growth of the parasite in circulating red blood cells. The invasive merozoite stage of the parasite enters red blood cells, replicates within the cell, then is released to invade new red cells and repeat the cycle. Red cell invasion is known to require the activity of parasite serine proteases (Blackman et al., 1993; McKerrow et al., 1993). Work in this laboratory has recently identified a P. falciparum gene (pfsub-1) encoding a member of the subtilisin-like serine protease superfamily (Blackman et al., 1998). The primary gene product is processed in two consecutive steps during transport through the parasite secretory system and the putative mature protease is concentrated in secretory organelles within merozoites, indicating that it may play a role in red cell invasion (Blackman et al., 1998). Inhibition of the enzyme may therefore block invasion.

We have been interested in achieving high-level heterologous expression of catalytically active PfSUB-1 for structural and enzymological studies. Attempts to express the entire pfsub-1 gene, or domains of it, in E.coli resulted in extremely low levels of expression of insoluble protein (M.Sajid and M.J.Blackman, unpublished data); expression of the malarial gene in P. pastoris or baculovirus was equally unsuccessful, probably owing to the unfavourable codon bias of the 72% A+T-rich pfsub-1 coding sequence. To solve this problem, it was decided to synthesize the pfsub-1 gene, adapting the codon usage for optimum expression in P. pastoris.

Recent advances in gene synthesis technology have led to an increased number of available gene synthesis methods (Grantham et al., 1980; Prapunwattana et al., 1996; Mehta et al., 1997; Au et al., 1998), the most attractive ones relying on the use of the polymerase chain reaction (PCR) (Prodromou and Pearl, 1992; Graham et al., 1993; Stemmer et al., 1995; Casimiro et al., 1997; Brocca et al., 1998). Here we have adapted the assembly PCR method of Stemmer et al. (1995) and optimized it in terms of error rate. In this procedure, a DNA polymerase is used in a primary PCR (called the assembly process) to build increasingly long DNA fragments from a pool of overlapping oligonucleotides (oligos). A second PCR is then performed to amplify specifically the previously assembled synthetic product. Any point mutations in the final amplified product are corrected by subsequent subcloning steps. We present new PCR parameter settings which, combined with exclusive use of the proof-reading Pfu DNA polymerase, have allowed us to synthesize rapidly the 2.1 kb pfsub-1 gene in the absence of a functional screen and with unprecedented accuracy. The final synthetic pfsub-1 gene has been successfully expressed not only in P. pastoris, achieving levels of recombinant protein of 0.2–0.5 g/l, but also in recombinant baculovirus-infected insect cells. We invite researchers working on organisms with A+T-rich genomes to consider gene synthesis as a feasible, systematic approach to high-level protein expression for protein engineering and structure–function studies.
Materials and methods |
---|
The oligos, one 53-mer and 103 40-mers, were synthesized on a 40 nmol scale with no extra purification and dissolved in water to a final concentration of 25 µM each (Oswell DNA Service, UK). All restriction enzymes were obtained from Boehringer Mannheim or from Gibco/BRL Life Technologies. Cloned Pyrococcus furiosus (Pfu) DNA polymerase was purchased from Stratagene (La Jolla, CA, USA). The pMosBlue blunt-ended cloning kit was supplied by Amersham Life Science and T4 DNA ligase by New England Biolabs, UK. DH5α™ E.coli competent cells were obtained from Gibco/BRL Life Technologies. Plasmid DNA purification was performed with MiniSnap (Invitrogen, San Diego, CA, USA). Automated DNA sequencing was done on a Perkin-Elmer ABI Prism 377 DNA sequencer, using dye terminator cycle sequencing. Sequencing analysis was performed using the AutoAssembler (Factura) package. Nucleotide sequence comparisons were performed using Lasergene DNAStar software. The P. pastoris expression kit, including the pPIC9K vector and media, was obtained from Invitrogen. The pVL1393 transfer vector, AcMNPV linear baculovirus DNA and Sf9 and High Five™ insect cells were supplied by Invitrogen and the insect cell media by Gibco/BRL Life Technologies and Expression Systems, LLC (Woodland, CA, USA).
Gene design
The sequence of the synthetic pfsub-1 gene was designed according to P. pastoris codon usage (Bennetzen and Hall, 1982; Sreekrishna et al., 1993) with the aid of the CODOP program (available from EPC, l-carpen{at}nimr.mrc.ac.uk). CODOP is a Unix perl script which provides a number of molecular biology functions, including codon optimization with host organism preference as proposed by Hale and Thompson (1998). CODOP reads a codon usage table, assessing the frequency of each codon per 1000 codons (F). It then calculates the codon preference N:
|
Gene assembly and amplification
Gene assembly. Equal volumes of solutions of each oligo (25 µM each) were combined and the mixture was diluted 10-fold in 50 µl of a PCR mixture [20 mM Tris–HCl, pH 8.8, 10 mM KCl, 10 mM (NH4)2SO4, 3 mM MgSO4, 0.1% Triton X-100, 0.1 mg/ml BSA, 0.2 mM each dNTP, 2.5 U of Pfu polymerase]. The PCR program consisted of one denaturation step at 94°C for 60 s, followed by 25 cycles at 94°C for 30 s, 52°C for 30 s and 72°C for 2 min.
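The concentration of each individual oligo in the assembly reaction follows from the pooling and dilution described above. The short sketch below works it through, assuming that all 104 oligos (one 53-mer plus 103 40-mers) went into a single pool; the function name is illustrative only.

```python
def per_oligo_concentration_nM(n_oligos, stock_uM=25.0, dilution=10.0):
    """Approximate concentration of each oligo in the assembly PCR, in nM.

    Assumes equal volumes of n_oligos stocks are pooled (so each oligo is
    diluted n_oligos-fold) and the pool is then diluted `dilution`-fold
    into the PCR mixture.
    """
    pooled_uM = stock_uM / n_oligos          # after combining equal volumes
    return pooled_uM / dilution * 1000.0     # after dilution, converted to nM


# Single pool of 104 oligos at 25 µM each, diluted 10-fold
full_gene_nM = per_oligo_concentration_nM(104)
print(f"{full_gene_nM:.1f} nM per oligo")
```

Under these assumptions each oligo is present at roughly 24 nM, far below the 1 µM used for the outer primers in the amplification step.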
Gene amplification. An aliquot of the gene assembly mixture (5 µl) was diluted 10-fold in 50 µl of PCR mixture [20 mM Tris–HCl, pH 8.8, 10 mM KCl, 10 mM (NH4)2SO4, 3 mM MgSO4, 0.1% Triton X-100, 0.1 mg/ml BSA, 0.2 mM each dNTP, 2.5 U of Pfu polymerase and the two outermost primers at 1 µM each]. The outer primers were the same as the two external ones used in the assembly process. The PCR program consisted of a denaturation step at 94°C for 60 s, then 25 cycles at 94°C for 45 s, 68°C for 45 s and 72°C for 5 min, and a final incubation at 72°C for 10 min. The PCR product was desalted using a PCR purification kit (Qiagen).
The best results in terms of mutation frequency were obtained when the pfsub-1 gene was constructed and cloned in two separate sections of 1.1 and 1 kb. Assembly PCR of the two sections was achieved using the same conditions as described above, with 56 oligos being assembled in the first section and 50 oligos in the second section. For the amplification PCR step, the extension time was reduced to 2.5 min.
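For planning purposes, the cycling programs given in this section can be written down as data and their programmed hold times added up. This is only a sketch: the dictionary layout and function name are illustrative, and the totals ignore thermocycler ramp times, which the text does not specify.

```python
# Each program: (initial denaturation s, cycles, per-cycle step durations s, final extension s)
PROGRAMS = {
    "assembly": (60, 25, (30, 30, 120), 0),
    "amplification, full-length gene": (60, 25, (45, 45, 300), 600),
    "amplification, half-gene section": (60, 25, (45, 45, 150), 600),
}


def hold_time_min(initial, cycles, steps, final):
    """Sum of programmed hold times in minutes (ramp times not included)."""
    return (initial + cycles * sum(steps) + final) / 60.0


for name, program in PROGRAMS.items():
    print(f"{name}: {hold_time_min(*program):.1f} min")
```

Even for the full-length gene, assembly followed by amplification amounts to about four hours of programmed hold time.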
Cloning steps and DNA sequencing
Blunt-ended PCR products were cloned into pMosBlue using the EcoRV site. The ligation products were used to transform DH5α™ E.coli competent cells and selection was performed on L agar supplemented with 50 µg/ml ampicillin and 15 µg/ml tetracycline. Plasmids isolated from white colonies were screened for the presence of insert by restriction analysis. For sequencing inserts, primers P6, P15, P24, P33, P42, P58, P67, P76, P85 and P94 were selected from those used in the gene synthesis. Subcloning steps to correct point errors introduced during gene synthesis were performed in pMosBlue using unique restriction sites within the gene and vector.
Expression in P. pastoris
The synthetic gene was cloned into the P. pastoris pPIC9K vector using the SnaBI and EcoRI sites of the polylinker region. The construct was linearized prior to transformation of the P. pastoris GS115 (his4) strain by electroporation (pulse conditions were 1.5 kV at 400 Ω and 25 µF). Several colonies resistant to 1 mg/ml G418 were selected for expression trials in 250 ml shake flasks containing 50 ml of rich medium (BMGY). The best candidate was grown in a 4 l fermenter, adding 10 µg/ml tunicamycin when switching to induction with BMMY medium. Three days following induction, 20 g of cells were lysed using a cell disrupter and resuspended in a denaturing buffer (8 M urea, 20 mM imidazole, 0.1% v/v Nonidet P40, 0.1 M NaH2PO4, 10 mM Tris–HCl, pH 8.2). The recombinant protein was then purified by metal chelate chromatography using Ni–NTA agarose (Qiagen). Bound protein was washed on the column with an 8–0 M gradient of decreasing urea concentration and finally eluted in a fully soluble form with 50 mM EDTA. Eluted protein was visualized on 10% SDS–PAGE gels by Coomassie Brilliant Blue staining. The band corresponding to PfSUB-1 was confirmed by Western blotting using an antiserum raised against an E.coli-derived recombinant PfSUB-1 fragment (Blackman et al., 1998).
Expression in baculovirus-infected High Five™ insect cells
The full-length synthetic pfsub-1 gene was subcloned into the EcoRI site of the baculovirus transfer vector pVL1393 downstream of the polyhedrin promoter. Sf9 cells were cultured at 27°C in complete TC100 medium and co-transfected with the plasmid construct and the viral DNA according to the Invitrogen guidelines. Recombinant baculoviruses were plaque-purified and single plaques were picked for amplification. High Five™ cells were cultured in ESF 921 protein-free medium. Cells were grown in roller bottles at a density of 10⁶ cells/ml and then infected with recombinant baculovirus at various multiplicities of infection. Tunicamycin was added to the High Five™ cells to a final concentration of 0.5 µg/ml at the time of infection. The medium was harvested 72 h post-infection, clarified by centrifugation at 5000 g for 30 min and filtered through a 0.22 µm filter (Whatman). The secreted protein was detected in the culture supernatant by Western blot.
Results |
---|
The design of the oligos used for synthesis of the 2.1 kb pfsub-1 gene necessitated great attention to detail, owing to the requirement for a large number to be mixed in one PCR. The nucleotide sequence of the gene was designed according to the P. pastoris codon usage preference (Bennetzen and Hall, 1982; Sreekrishna et al., 1993). In addition, the panel of oligos was rigorously screened and matched in order to meet the following criteria: (i) a decrease in the overall A+T content with the elimination of potential transcription termination signals; (ii) elimination of palindromic sequences conducive to stable intramolecular hairpins; (iii) minimization of tandem or inverted repeats (<10 bp in length) which are likely to give rise to non-specific priming; and (iv) optimization of the 20 nucleotide overlap between each 40-mer primer, to give a melting temperature in the range 58–62°C, in order to allow subsequent use of the primers for DNA sequencing. A Kozak consensus translation initiation sequence was incorporated in the extreme 5' oligo for efficient expression of the gene in P. pastoris and an additional five histidine codons were introduced just prior to the stop codon in the extreme 3' oligo. A number of unique restriction sites were introduced at strategic positions throughout the synthetic gene to facilitate subsequent gene manipulation and mutagenesis. Oligo design was performed with the aid of the Unix codon optimization program CODOP (see Materials and methods). This program translates a given DNA sequence into a protein sequence and then, using a user-defined codon usage table, back-translates the protein sequence with an improved codon usage. The program rejects codons with abundances below a cut-off value, then assigns a high-abundance codon to each residue in the protein sequence, using high-abundance codons in proportion to their use in the codon usage table.
Both strands of the sequence are then divided into overlapping oligos of 40 bases in length, melting temperatures are calculated for all the overlaps and restriction sites generated along the sequence are displayed. The resulting panel of oligos was then analysed using the Genetics Computer Group software package (GCG Version 8-Unix) for the presence of undesirable repeats, inverted repeats, stem–loop structures and regions of complementarity which could potentially lead to non-specific intermolecular hybridization. In most cases these sequences were readily eliminated whilst maintaining the codon preference. Non-optimum codons were resorted to only if required to create unique restriction sites or at repetitive sequences. Systematic, reiterative use of these two programs resulted in the final selection of 104 unique oligos for gene synthesis. Table I shows a comparison of the codon composition of the synthetic gene with that of the wild-type P. falciparum gene. Codons not present in highly expressed yeast genes have been drastically decreased in frequency and a number of very rare codons eliminated. For example, 31 ATA (Ile) codons and 49 AAA (Lys) codons present in the native gene have been completely removed. The overall A+T composition has been reduced from 72% in the native gene to 53% in the synthetic product. The final, codon-optimized sequence of the synthetic pfsub-1 gene and the relative positions of the 104 oligos are shown in Figure 1, together with the predicted amino acid sequence.
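The back-translation and tiling steps described above can be illustrated with a minimal Python sketch. It is not CODOP and does not reproduce its interface: the codon frequencies are invented placeholders covering three residues only, the cut-off value is hypothetical, and a weighted random draw is just one way to use codons "in proportion to their use". The tiling assumes 40-mers with 20-nucleotide overlaps, as in the design criteria.

```python
import random

# Hypothetical codon usage (frequency per 1000 codons); illustrative values
# only, NOT taken from a real P. pastoris codon usage table.
USAGE = {
    "K": {"AAG": 34.0, "AAA": 8.0},
    "I": {"ATT": 31.0, "ATC": 20.0, "ATA": 4.0},
    "E": {"GAG": 29.0, "GAA": 37.0},
}


def back_translate(protein, usage, cutoff=10.0, seed=0):
    """Pick a codon per residue, rejecting codons below `cutoff` and
    sampling the remaining ones in proportion to their frequency."""
    rng = random.Random(seed)
    codons = []
    for aa in protein:
        allowed = {c: f for c, f in usage[aa].items() if f >= cutoff}
        if not allowed:  # every codon is rare: fall back to the most frequent
            best = max(usage[aa], key=usage[aa].get)
            allowed = {best: usage[aa][best]}
        names = list(allowed)
        weights = [allowed[c] for c in names]
        codons.append(rng.choices(names, weights=weights)[0])
    return "".join(codons)


COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def tile_oligos(seq, length=40):
    """Split both strands into oligos of `length`; bottom-strand oligos are
    offset by length // 2 so each one bridges two top-strand oligos."""
    offset = length // 2
    top = [seq[i:i + length] for i in range(0, len(seq), length)]
    bottom = [reverse_complement(seq[i:i + length])
              for i in range(offset, len(seq), length)]
    return top, bottom


def at_fraction(seq):
    """Fraction of A and T bases in a sequence."""
    return (seq.count("A") + seq.count("T")) / len(seq)


gene = back_translate("KIEK" * 15, USAGE)
top, bottom = tile_oligos(gene)
print(len(top), len(bottom), round(at_fraction(gene), 2))
```

With the placeholder table, the rare AAA and ATA codons fall below the cut-off and never appear in the output. A real design would add the melting-temperature, repeat and restriction-site checks described in the text.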
Expression from the synthetic pfsub-1 gene was initially assessed in P. pastoris. The PfSUB-1 signal sequence was replaced by the pre-pro domain of the S. cerevisiae α-mating factor by cloning the gene into SnaBI/EcoRI-digested pPIC9K, and the linearized vector was used to transform P. pastoris. Transformants containing multiple chromosomal insertions of the recombinant vector were selected and, following preliminary inductions to select the highest producing clones, a single clone was induced in a 4 l fermenter. Since N-glycosylation of blood-stage P. falciparum proteins is rare (Gowda et al., 1997), induction was performed in the presence of tunicamycin. No recombinant protein was secreted by the clone. Examination of total cell extracts by Western blotting showed that the induced recombinant product accumulated intracellularly in an insoluble form. Taking advantage of the C-terminal hexahistidine tag, the recombinant protein was purified under denaturing conditions from extracts of the induced clone by nickel chelate chromatography (Holzinger et al., 1996) (Figure 4). From these purification data, the expression level of the recombinant PfSUB-1 was estimated at 0.2–0.5 g/l. N-terminal amino acid sequencing of the purified protein showed that the α-factor N-terminal secretory signal sequence had been removed whereas the α-factor pro domain was still present, suggesting that the protein had undergone translocation into the yeast ER but had not been further processed.
Discussion |
---|
The complete gene synthesis process, including assembly and amplification, allowed the production of the final synthetic product in one day. Including the subsequent steps of DNA sequencing and subcloning, the constructs used in the P. pastoris expression experiments were completed within a matter of weeks. Although expensive in terms of initial capital outlay (the total cost of the oligos used here, at £0.6 per nucleotide, was of the order of £2500), the simplicity and accuracy of the PCR-based gene synthesis described here render it feasible to consider the routine, complete synthesis of malarial genes of interest prior to attempting expression in any heterologous system. Our preliminary expression results are extremely encouraging; PfSUB-1 is expressed intracellularly in P. pastoris in the range 0.2–0.5 g/l, providing enough material for extensive refolding assays. The protein is also expressed and secreted in baculovirus-infected High Five™ cells in a correctly processed form. The significant amount of protein produced will readily allow enzymological and structural studies.
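The quoted oligo cost can be checked against the oligo set listed in Materials and methods (one 53-mer and 103 40-mers) at £0.6 per nucleotide. The few lines below are only a worked check of that arithmetic; the function name is illustrative.

```python
def oligo_cost_gbp(lengths, price_per_nt=0.6):
    """Total nucleotides and cost in GBP for a list of oligo lengths."""
    total_nt = sum(lengths)
    return total_nt, total_nt * price_per_nt


# One 53-mer plus 103 40-mers, as listed in Materials and methods
total_nt, cost = oligo_cost_gbp([53] + [40] * 103)
print(total_nt, round(cost, 2))
```

This gives 4173 nucleotides in total, consistent with a cost of the order of £2500.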
New approaches to malaria control will require an improved understanding of mechanisms of drug action and resistance, the identification of new drug targets, improved diagnostic tools and the development of an effective vaccine. All of these aims will be facilitated by the ongoing malaria genome project, which was initiated in 1996 and is progressing rapidly, as evidenced by the recent completion of the entire sequence of chromosome 2 of the 14-chromosome, ~30 megabase haploid P. falciparum nuclear genome (Gardner et al., 1998). The genetic information generated by this project will provide access to the complete array of P. falciparum open reading frames, allowing researchers readily to identify and study potential chemotherapeutic targets and vaccine candidate antigens. There is little doubt that the rate-limiting step in fully realizing the potential of these advances in genomics will be that of heterologous expression of correctly folded, functionally active parasite gene products for characterization at the molecular structural level; the A+T bias of the parasite genome will constitute a permanent problem in this regard. The systematic PCR-based approach to gene redesign described here will bridge the existing technological gap between identification of putative targets at the nucleotide level and their expression for structure–function studies. In this Institute, the gene synthesis method described here has already been applied successfully to the synthesis of two other malarial genes (C.Withers-Martinez, unpublished data).
References |
---|
Au,L., Yang,F., Yang,W., Lo,S. and Kao,C. (1998) Biochem. Biophys. Res. Commun., 248, 200–203.
Bennetzen,J.L. and Hall,B.D. (1982) J. Biol. Chem., 257, 3026–3031.
Blackman,M.J., Chappel,J.A., Shai,S. and Holder,A.A. (1993) Mol. Biochem. Parasitol., 62, 103–114.
Blackman,M.J., Fujioka,H., Stafford,W.H.L., Sajid,M., Clough,B., Fleck,S.L., Aikawa,M., Grainger,M. and Hackett,F. (1998) J. Biol. Chem., 273, 23398–23409.
Brocca,S., Schmidt-Dannert,C., Lotti,M., Alberghina,L. and Schmid,R.D. (1998) Protein Sci., 7, 1415–1422.
Casimiro,D.R., Wright,P.E. and Dyson,H.J. (1997) Structure, 5, 1407–1412.
Cline,J., Braman,J.C. and Hogrefe,H.H. (1996) Nucleic Acids Res., 24, 3546–3551.
de Bruin,D., Lanzer,M. and Ravetch,J.V. (1992) Genomics, 14, 332–339.
Gardner,M.J. et al. (1998) Science, 282, 1126–1132.
Gowda,D.C., Gupta,P. and Davidson,E.A. (1997) J. Biol. Chem., 272, 6428–6439.
Graham,R.W., Atkinson,T., Kilburn,D.G., Miller,R.C.,Jr and Warren,R.A.J. (1993) Nucleic Acids Res., 21, 4923–4928.
Grantham,R., Gautier,C., Gouy,M., Mercier,R. and Pave,A. (1980) Nucleic Acids Res., 8, r49–r62.
Hale,R.S. and Thompson,G. (1998) Protein Express. Purif., 12, 185–188.
Hernan,R.A., Hui,H.L., Andracki,M.E., Noble,R.W., Sligar,S.G., Walder,J.A. and Walder,R.Y. (1992) Biochemistry, 31, 8619–8628.
Holzinger,A., Phillips,K.S. and Weaver,T.E. (1996) BioTechniques, 20, 804.
Makoff,A.J., Oxer,M.D., Romanos,M.A., Fairweather,N.F. and Ballantine,S. (1989) Nucleic Acids Res., 17, 10191–10202.
Martin,S.L., Vrhovski,B. and Weiss,A.S. (1995) Gene, 154, 159–166.
McKerrow,J.H., Sun,E., Rosenthal,P.J. and Bouvier,J. (1993) Annu. Rev. Microbiol., 47, 821–853.
Mehta,D.V., DiGate,R.J., Banville,D.L. and Guiles,R.D. (1997) Protein Express. Purif., 11, 86–94.
Prapunwattana,P., Sirawaraporn,W., Yuthavong,Y. and Santi,D.V. (1996) Mol. Biochem. Parasitol., 83, 93–106.
Prodromou,C. and Pearl,L.H. (1992) Protein Engng, 5, 827–829.
Ranjan,A. and Hasnain,S.E. (1994) Virus Genes, 9, 149–153.
Romanos,M.A., Makoff,A., Fairweather,N.F., Beesley,K.M., Slater,D.E., Rayment,F.B., Payne,M.M. and Clare,J.J. (1991) Nucleic Acids Res., 19, 1461–1467.
Schmidt-Dannert,C., Pleiss,J. and Schmid,R.D. (1998) Ann. N. Y. Acad. Sci., 864, 14–22.
Sreekrishna,K. (1993) In Baltz,R.H., Hegeman,G.D. and Skatrud,P.L. (eds) Industrial Microorganisms: Basic and Applied Molecular Genetics. American Society of Microbiology, Washington, DC. Chapter 16, pp. 119–126.
Sreekrishna,K., Brankamp,R.G., Kropp,K.E., Blankenship,D.T., Tsay,J.T., Smith,P.L., Wierschke,J.D., Subramaniam,A. and Birkenberger,L.A. (1997) Gene, 190, 55–62.
Stemmer,W.P.C., Crameri,A., Ha,K.D., Brennan,T.M. and Heyneker,H.L. (1995) Gene, 164, 49–53.
Triglia,T. and Kemp,D.J. (1991) Mol. Biochem. Parasitol., 44, 207–212.
Wilson,R.J. et al. (1996) J. Mol. Biol., 261, 155–172.
Received July 7, 1999; revised September 1, 1999; accepted September 15, 1999.