From the Structural Genomics of Pathogenic Protozoa (SGPP) Consortium and Center for Human Genetics and Molecular Pediatric Disease, University of Rochester Medical Center, Rochester, NY 14642;
Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY 14642; ¶ Departments of Genome Sciences and Medicine, University of Washington, Seattle, WA 98195; || Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195; and ** Department of Biochemistry, University of Washington, Seattle, WA 98195
ABSTRACT
We have developed a facile and general method for protein co-expression in Escherichia coli that can utilize sets of ORFs in identical expression plasmids with the simple requirement that the starting plasmid contains a 61-nucleotide sequence called LINK. This method takes advantage of our demonstration that two otherwise identical plasmids bearing different ORFs can be joined "head to tail" in a single tandem plasmid and propagated in E. coli (7). The LINK sequence expedites the joining of two plasmids using methods from ligation-independent cloning (LIC)¹ (8) and generalizes the method for any pair of ORFs. We demonstrate that the resulting tandem plasmid, with two identical replication origins and antibiotic resistance markers, efficiently propagates in E. coli, and that the two proteins are readily co-expressed in quantities that would satisfy the most demanding structural biology applications. The method is simple and rapid and does not require sequencing of the ORFs in the resulting tandem plasmid.
EXPERIMENTAL PROCEDURES
Plasmids and Plasmid Construction
BG1861 is a LIC vector that expresses proteins as N-terminal His6-ORF fusion proteins. It was constructed from pET14b by replacing pET14b sequences between the NcoI site and the BamHI site with the sequence 5'-CCATGGCTCACCACCACCACCACCATATGACGCGTTAACCACGTGAGTAAGATAGGATCC-3', which contains, in 5' to 3' order, NcoI, NdeI, MluI, BbrPI, and BamHI sites. Replacement was accomplished in two steps by inserting complementary oligonucleotides of sequence 5'-CATGGCTCACCACCACCACCACCATATGACGCGTC-3' and 5'-TCGAGACGCGTCATATGGTGGTGGTGGTGGTGAGC-3' into the NcoI-XhoI sites in pET14b to create pET14b-LIC1, followed by insertion of oligonucleotides of sequence 5'-CGCGTTAACCACGTGAGTAAGATAG-3' and 5'-GATCCTATCTTACTCACGTGGTTAA-3' into the MluI-BamHI sites of pET14b-LIC1.
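Because the restriction sites in the replacement sequence are identified only by name, it can help to locate them explicitly. The following is a minimal sketch, not part of the original procedure: it scans the sequence given above for the standard recognition sequences of the five named enzymes. The names `LINKER`, `RECOGNITION_SITES`, and `find_sites` are illustrative only.

```python
# Linker sequence copied from the text; recognition sequences are the
# standard ones for each enzyme named there.
LINKER = "CCATGGCTCACCACCACCACCACCATATGACGCGTTAACCACGTGAGTAAGATAGGATCC"

RECOGNITION_SITES = {
    "NcoI": "CCATGG",
    "NdeI": "CATATG",
    "MluI": "ACGCGT",
    "BbrPI": "CACGTG",
    "BamHI": "GGATCC",
}


def find_sites(seq, sites):
    """Map each enzyme name to the 0-based start positions of its site in seq."""
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        hits[enzyme] = positions
    return hits


site_positions = find_sites(LINKER, RECOGNITION_SITES)
for enzyme, positions in site_positions.items():
    print(f"{enzyme:6s} {RECOGNITION_SITES[enzyme]}  start(s): {positions}")
```

All five recognition sequences are palindromic, so scanning one strand is sufficient here.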
The LINK sequence was originally cloned into a vector AVA0220, a derivative of the pET14b vector, by insertion of the LINK sequence into EagI-SalI sites using oligos FLIP_Top (5'-GGCCGTAACAACACCATTTAAATGGAGTGGTTACAAATGGAGTGGTTAATTAACAACACCATTTG-3') and FLIP_Bottom (5'-TCGACAAATGGTGTTGTTAATTAACCACTCCATTTGTAACCACTCCATTTAAATGGTGTTGTTAC-3') resulting in AVA0229 vector. The LINK sequence was moved to additional vectors (see below) by PCR amplification, digestion with appropriate restriction enzymes, and standard cloning procedures.
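As a quick consistency check on the FLIP_Top and FLIP_Bottom oligonucleotides, the sketch below tests that they are complementary apart from 4-nucleotide 5' overhangs and reports the length of the duplex. It is an illustration under stated assumptions, not the authors' procedure: the helper names `revcomp` and `anneal` are hypothetical, and equating the duplex region with the 61-nucleotide LINK sequence is our reading of the text.

```python
FLIP_TOP = "GGCCGTAACAACACCATTTAAATGGAGTGGTTACAAATGGAGTGGTTAATTAACAACACCATTTG"
FLIP_BOTTOM = "TCGACAAATGGTGTTGTTAATTAACCACTCCATTTGTAACCACTCCATTTAAATGGTGTTGTTAC"


def revcomp(seq):
    """Reverse complement of a DNA sequence (upper-case A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def anneal(top, bottom, overhang=4):
    """Split two oligos into (top 5' overhang, duplex, bottom 5' overhang).

    Assumes each oligo carries a 5' overhang of `overhang` nucleotides and
    that the remainder of each oligo pairs fully with the other.
    """
    top_core = top[overhang:]
    bottom_core = bottom[overhang:]
    if revcomp(bottom_core) != top_core:
        raise ValueError("oligos are not complementary outside the overhangs")
    return top[:overhang], top_core, bottom[:overhang]


top_overhang, duplex, bottom_overhang = anneal(FLIP_TOP, FLIP_BOTTOM)
print("top 5' overhang:   ", top_overhang)     # pairs with an EagI-cut end
print("bottom 5' overhang:", bottom_overhang)  # pairs with a SalI-cut end
print("duplex length:     ", len(duplex))
```

The overhangs GGCC and TCGA are the 5' extensions left by EagI and SalI digestion, which is consistent with insertion into the EagI-SalI sites.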
AVA0469 vector was constructed from BG1861 by insertion of a fragment that contained the LINK sequence into the NcoI-EagI sites. In this construction, the LINK sequence is upstream of the T7 promoter and does not interfere with expression or LIC of ORFs. The sequence of this vector is reported in the supplemental materials.
AVA0306 vector is a LIC vector containing a LINK sequence that expresses proteins as N-terminal His6-MBP-3C protease site-ORF fusion proteins. It was derived from H-MBP-3C vector (9) in three steps: insertion of sequences necessary for LIC of ORFs, deletion of an endogenous SwaI site in the plasmid, and insertion of the LINK sequence into an M13 origin of replication, concomitantly destroying the M13 origin. First, to convert H-MBP-3C vector into a LIC vector, the multiple cloning site of the vector was replaced with the LIC-site-containing DNA fragment (5'-GAATTCCTGGAAGTTCTGTTCCAGGGTCCTGGTTCGCGAATATTCTAGCTTTGTTTAAACAGCACGAACAAGTTCTGCAG-3'), which was inserted into the EcoRI-PstI sites of the H-MBP-3C vector (9) to add NruI and PmeI sites as well as sequences for LIC. Second, an existing SwaI site present in the H-MBP-3C vector was deleted as follows: primers complementary to the sequences flanking the SwaI site (but not including it) were used to copy the plasmid by PCR, and the plasmid was rejoined via an added BamHI site encoded in the oligonucleotides. The primer pair was Swa_to_Bam_F (5'-GCGGGATCCGTAAACGTTAATATTTTGTTAAAATTCGC-3') and Swa_to_Bam_R (5'-GCGGGATCCCAATCTTCCTGTTTTTGGGGC-3'). Third, the DNA fragment containing the LINK site was amplified from the vector AVA0229 using the primers BamHI_FLIP_F (5'-GCGGGATCCGCAACGCGGGCATCCC-3') and pMal_FLIP_R (5'-GAGGCCGTTGAGCACCGCACTACGTGATTCCTTCTG-3') and inserted into the BamHI-DraIII sites of the vector resulting from step two. The sequence of vector AVA0306 can be found in the supplemental materials.
Saccharomyces cerevisiae ORFs TRM8 (YDL201w) and TRM82 (YDR165w) were PCR-amplified from yeast genomic DNA using primer pairs 201_F_SG (5'-GGGTCCTGGTTCGATGAAAGCCAAGCCACTAAGCC-3'), 201_R_SG (5'-CTTGTTCGTGCTGTTTATTACAATATGGCTGGCGTTGGTAATC-3'), and 165_F_SG (5'-GGGTCCTGGTTCGATGAGCGTCATTCATCCTTTGCAG-3'), 165_R_SG (5'-CTTGTTCGTGCTGTTTACGCCGCCTTCAGCTAGAAACAGAG-3') and inserted into NruI- and PmeI-digested vector AVA0306 using standard LIC procedures (8).
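LIC relies on primer 5' tails that match the vector sequence flanking the insertion site. The sketch below is a rough illustration rather than the authors' design procedure. For each TRM8/TRM82 primer it finds the longest 5' prefix that also occurs in the LIC-site-containing fragment described for AVA0306, on either strand. The function `longest_vector_prefix` and the variable names are hypothetical, and treating that longest shared prefix as the "LIC tail" is an assumption.

```python
# LIC-site fragment inserted into H-MBP-3C, and the four primers, from the text.
LIC_FRAGMENT = ("GAATTCCTGGAAGTTCTGTTCCAGGGTCCTGGTTCGCGAATATTCTAGC"
                "TTTGTTTAAACAGCACGAACAAGTTCTGCAG")

PRIMERS = {
    "201_F_SG": "GGGTCCTGGTTCGATGAAAGCCAAGCCACTAAGCC",
    "201_R_SG": "CTTGTTCGTGCTGTTTATTACAATATGGCTGGCGTTGGTAATC",
    "165_F_SG": "GGGTCCTGGTTCGATGAGCGTCATTCATCCTTTGCAG",
    "165_R_SG": "CTTGTTCGTGCTGTTTACGCCGCCTTCAGCTAGAAACAGAG",
}


def revcomp(seq):
    """Reverse complement of a DNA sequence (upper-case A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def longest_vector_prefix(primer, vector):
    """Longest 5' prefix of primer that occurs on either strand of vector."""
    strands = (vector, revcomp(vector))
    best = ""
    for end in range(1, len(primer) + 1):
        prefix = primer[:end]
        if any(prefix in strand for strand in strands):
            best = prefix
        else:
            break
    return best


tails = {name: longest_vector_prefix(seq, LIC_FRAGMENT)
         for name, seq in PRIMERS.items()}
for name, tail in tails.items():
    print(f"{name}: vector-matching tail {tail} ({len(tail)} nt)")
```

Under this reading, both forward primers share one vector-derived tail and both reverse primers share another, with the gene-specific sequence following each tail.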
Fragments of Plasmodium falciparum ORFs chr3.gen_223 and chr5.gen_178 were PCR-amplified from interacting clones obtained in a yeast two-hybrid screen, using primers Lic2F (5'-CTCACCACCACCACCACCATATGACATACTCATTGTTCAGCCG-3') and Lic2R (5'-ATCCTATCTTACTCACTTATCCACCTCGAGAACGCGTTTGTCG-3'), and inserted into NdeI- and PmlI-digested vector AVA0469 using LIC (8). Likewise, Leishmania major ORFs, indicated in Fig. 2, were PCR-amplified from genomic DNA and cloned into AVA0469 using LIC. Sequences of these ORFs can be found at depts.washington.edu/sgpp/.
Cell Growth
Plasmid-transformed BL21 (DE3) pLysS cells were grown from a single colony overnight at 37 °C in 6.2 ml of Luria broth medium supplemented with 100 µg/ml ampicillin, then propagated for 16 more generations using serial dilutions (equivalent to 406 liters of culture), and induced with 1 mM isopropyl-β-D-1-thiogalactopyranoside (IPTG) at 18 °C for 16 h. Cells were harvested, lysed with SDS, and proteins were analyzed by SDS-PAGE.
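The 406-liter equivalence follows from simple doubling arithmetic. The short sketch below reproduces it, assuming that each generation doubles the culture and that no cells are discarded. The variable names are illustrative.

```python
start_volume_ml = 6.2   # overnight culture volume from the text
generations = 16        # additional generations by serial dilution

# Each generation doubles the culture, so the undiluted equivalent volume
# grows by a factor of 2**generations.
equivalent_volume_l = start_volume_ml * 2 ** generations / 1000

print(f"fold expansion: {2 ** generations}")
print(f"equivalent culture volume: {equivalent_volume_l:.0f} liters")
```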
Co-purification of Protein Complexes
The ubiquitin-conjugating enzyme (E2)/ubiquitin-conjugating enzyme variant (UEV) protein pair from P. falciparum was purified using IMAC and gel filtration. Frozen cells were lysed by sonication in buffer A (20 mM HEPES, pH 7.5, 1 M NaCl, 2 mM 2-mercaptoethanol (BME), and 5% glycerol) and centrifuged to generate crude extract. Extract was applied to TALON resin in buffer B (20 mM HEPES, pH 7.5, 0.5 M NaCl, 2 mM BME, and 5% glycerol), resin was washed with buffer, and proteins were eluted using a stepwise gradient of 5, 10, and 250 mM imidazole in buffer B. The protein pair was further purified using gel filtration chromatography on HiLoad Superdex 200 26/60 column (Amersham Pharmacia Biotech, Piscataway, NJ) and dialysed against buffer C (20 mM HEPES, pH 7.5, 0.5 M NaCl, 2 mM DTT, and 50% glycerol).
S. cerevisiae Trm8/Trm82 protein complex was purified in buffer containing 20 mM HEPES, pH 7.5, 10% glycerol, and 2 mM BME, using IMAC, followed by protease 3C cleavage overnight at 4 °C to release the bound protein and gel filtration on a HiLoad Superdex 200 16/60 column, followed by dialysis.
RESULTS
The tandem plasmids propagate efficiently to allow large scale co-expression of proteins. To demonstrate this, we monitored plasmid maintenance and protein expression after extended growth. Starting from a single colony of each transformant, we first grew a 6.2-ml overnight culture and then propagated each strain for 16 more generations by serial dilution of cultures, equivalent to the growth of 406 liters of culture. Then we assessed the plasmid content of the cells and induced the cells to express protein. Plasmids analyzed at this point were all the same size as before transformation into expression cells (compare Fig. 2, b and c), illustrating that the tandem plasmids propagate as a unit during this growth period, with little observed recombination between the monomeric components. We speculate that the efficient propagation of tandem plasmids made from pBR322-derived vectors such as pET and pMal is due to the absence of a cer sequence, which is necessary for efficient recombinational resolution of dimers of ColE1 plasmids (10, 11). Following induction of expression after this prolonged growth, each strain containing a tandem plasmid expressed the corresponding pair of proteins at high levels, as judged by SDS-PAGE analysis of whole-cell lysates (Fig. 2d, compare lanes 1-4 with lanes 5-10). We note that tandem plasmids are more likely than other methods to yield comparable expression of ORF pairs, because the relative stoichiometry of the two ORFs is fixed and each ORF uses the same promoter.
Proteins expected to be members of a complex are readily co-purified after co-expression from a tandem plasmid made with LINK. Fig. 2e illustrates this for two different protein pairs, each in a different LINK-containing vector. Lanes 1-4 show the analysis of the predicted complex of E2 and UEV from P. falciparum, identified in a yeast two-hybrid screen. Each ORF was cloned into a LINK version of the pET-derived LIC vector BG1861 (AVA0469) to produce the corresponding His6-ORF fusion protein when expressed alone (lanes 1 and 2). When combined as a tandem plasmid using the LINK sequence, both proteins are expressed (lane 3), and the E2/UEV protein pair is readily purified (lane 4). Similarly, lanes 6-9 illustrate this for the Trm8/Trm82 protein complex of S. cerevisiae (7), using a different LINK-containing vector (AVA0306), derived from a pMal vector, that expresses proteins as His6-MBP-3C protease site-ORF fusion proteins.
DISCUSSION
First, the LINK method has enormous potential as a high-throughput co-expression method for large sets of ORF-expressing plasmids containing LINK: it is not PCR-based and does not require any DNA purification steps, and thus can be easily automated. Moreover, because the manipulations of each plasmid occur at a site remote from the ORF, the method does not require sequencing of ORFs in the tandem plasmid. We note that the LINK sequence can be introduced into any nonessential region of a plasmid for subsequent use. Thus, a plasmid containing the LINK sequence can be used for routine expression of individual proteins, because explicit testing showed that the sequence has no obvious negative effect on expression (data not shown).
Second, the method can be used for co-expression of virtually any pair of proteins because either of the two octameric restriction sites can be used for cleavage of a given plasmid, as long as the other site is used for the second plasmid. This feature allows the generation of tandem plasmids containing more than 99% of typical protein pairs (Supplemental Table I), the exact percentage depending on GC-content, average protein length, and distribution of sites within ORFs. In yeast, for example, 99.4% of all possible protein pairs can be obtained by this method. Indeed, the only pairs that cannot be joined are those exceedingly rare combinations of ORFs with the same octameric restriction site within their sequences or those with both octameric recognition sequences in one ORF.
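The compatibility rule described above can be written as a small check. This is a sketch under stated assumptions, not the authors' software. It treats ATTTAAAT and TTAATTAA, the two palindromic octamers present in the FLIP_Top oligonucleotide, as the two octameric LINK sites. It also assumes that a plasmid can be cut at a given LINK site only if its ORF lacks that site. The function names are hypothetical.

```python
# Assumed octameric LINK sites (both palindromic, so one strand suffices).
SITE_1 = "ATTTAAAT"
SITE_2 = "TTAATTAA"


def usable_sites(orf):
    """LINK sites that do not occur inside the ORF."""
    orf = orf.upper()
    return {site for site in (SITE_1, SITE_2) if site not in orf}


def can_join(orf_a, orf_b):
    """True if the two plasmids can be cut at different LINK sites."""
    sites_a = usable_sites(orf_a)
    sites_b = usable_sites(orf_b)
    return any(a != b for a in sites_a for b in sites_b)


# Toy ORFs for illustration only (not real gene sequences).
clean = "ATGGCTGCTGCTTAA"
has_site_1 = "ATGGCT" + SITE_1 + "GCTTAA"
has_site_2 = "ATGGCT" + SITE_2 + "GCTTAA"
has_both = "ATGGCT" + SITE_1 + "GCT" + SITE_2 + "TAA"

print(can_join(clean, clean))            # neither ORF restricts the choice
print(can_join(has_site_1, has_site_2))  # each plasmid uses the other site
print(can_join(has_site_1, has_site_1))  # both would need the same site
print(can_join(has_both, clean))         # first plasmid has no usable site
```

The two failing cases correspond to the two rare combinations named in the text: ORFs sharing the same octameric site, and a single ORF containing both.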
Finally, we note that the LINK method could be used to generate a random library of plasmids containing nearly every possible combination of protein pairs in a given genome for functional screening or selection. Applied to yeast, this would result in a library of 1.8 × 10⁷ possible protein pairs, well within the transformation capability of E. coli.
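The library size quoted for yeast is consistent with counting unordered pairs of distinct ORFs. The sketch below assumes roughly 6,000 yeast ORFs. That number is our assumption for illustration and is not stated in the text.

```python
import math

n_orfs = 6000  # assumed approximate number of yeast ORFs (not from the text)

# Unordered pairs of distinct ORFs: n * (n - 1) / 2
n_pairs = math.comb(n_orfs, 2)

print(f"possible protein pairs: {n_pairs:,} (about {n_pairs:.1e})")
```

A different ORF count, or counting ordered pairs, would change the figure accordingly.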
FOOTNOTES
Published, MCP Papers in Press, July 6, 2004, DOI 10.1074/mcp.T400008-MCP200
1 The abbreviations used are: LIC, ligation-independent cloning; IMAC, immobilized metal-affinity chromatography; MBP, maltose-binding protein; LB, Luria broth; IPTG, isopropyl-β-D-1-thiogalactopyranoside; BME, 2-mercaptoethanol; E2, ubiquitin-conjugating enzyme; UEV, ubiquitin-conjugating enzyme variant.
* This work was supported by National Institutes of Health Grant P50 GM64655 (to W. G. H.). W. G. H. and S. F. are investigators of the Howard Hughes Medical Institute. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
S The on-line version of this manuscript (available at http://www.mcponline.org) contains supplemental material.
To whom correspondence should be addressed: Center for Human Genetics and Molecular Pediatric Disease, University of Rochester Medical Center, Rochester, NY 14642. Tel.: 585-275-2765; E-mail: elizabeth_grayhack{at}urmc.rochester.edu
REFERENCES