Department of Biosciences and Informatics, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama 223-8522, Japan
1 To whom correspondence should be addressed. E-mail: hyana{at}bio.keio.ac.jp
Abstract |
---|
Keywords: genetic code/protein design/protein evolution/reduced amino acid alphabet/synthetic library
Introduction |
---|
Although modern proteins consist of 20 amino acids, it has been proposed that the origin and early evolution of protein synthesis involved a reduced alphabet that was gradually extended through co-evolution of the genetic code and the primordial biochemical system for amino acid synthesis (Crick, 1968; Wong, 1975; Brooks et al., 2002). If this is so, the properties of random-sequence proteins with a reduced alphabet may differ from those of the 20-alphabet random-sequence proteins previously reported. Davidson and co-workers constructed and characterized random-sequence proteins consisting of only Gln, Leu and Arg (Davidson and Sauer, 1994; Davidson et al., 1995). These QLR proteins showed remarkable helical structures, but their solubility was fairly low. In a computational study using inverse folding techniques, Babajide et al. demonstrated that native-like folded structures of tested proteins were maintained with restricted alphabets containing primitive amino acids such as Ala and Gly, but were not maintained with a non-primitive QLR alphabet (Babajide et al., 1997). In this paper, we describe the first attempt to construct and characterize random-sequence proteins using a restricted set of primitive amino acids.
Materials and methods |
---|
Cloning, expression and purification of the random-sequence proteins
The DNA library was cloned, and randomly selected clones were sequenced with an ABI PRISM 3100 (Applied Biosystems). The random-sequence regions of the eight in-frame genes were digested with BglII and XhoI and then subcloned into a derivative of a pET vector (Novagen) containing the N-terminal T7·tag sequence and the C-terminal His6 tag sequence. Escherichia coli BL21-CodonPlus(DE3) cells (Stratagene) transformed with individual recombinant plasmids were grown in LB broth containing 100 µg/ml ampicillin and 40 µg/ml chloramphenicol at 37°C. When the culture reached an optical density of 0.6–0.7 at 600 nm, isopropylthio-β-D-galactoside was added to a final concentration of 0.1 mM. After an additional 3 h of incubation, the cells were harvested by centrifugation and lysed in BugBuster (Novagen) containing a protease inhibitor cocktail (Sigma). The centrifuged supernatants were used as the soluble fractions. The pellets were resuspended in a buffer containing 8 M urea and the centrifuged supernatants were used as the insoluble fractions. These fractions were analyzed by 16.5% Tricine SDS–PAGE (Schägger and von Jagow, 1987). The proteins were detected by CBB (Coomassie Brilliant Blue R250) staining and by Western blotting with anti-T7·tag antibody. The soluble fractions were loaded on an affinity column of nickel–NTA agarose resin (Qiagen) and the recombinant proteins were eluted with an imidazole gradient. The protein molar concentration was determined from the UV absorption at 280 nm and the molar absorption coefficient was calculated from ε = 5690 M⁻¹ cm⁻¹ for Trp (Gill and von Hippel, 1989).
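As a rough illustration of the concentration estimate described above, the following Python sketch applies the Beer–Lambert law, A = ε·c·l, with ε taken as 5690 M⁻¹ cm⁻¹ per Trp residue. The function name, the number of Trp residues, the absorbance reading and the 1 cm pathlength are hypothetical values chosen for illustration and are not taken from this study. The sketch also assumes that Trp is the only residue contributing to the absorbance at 280 nm.

```python
# Molar absorption coefficient per Trp residue at 280 nm
# (Gill and von Hippel, 1989), in M^-1 cm^-1.
EPSILON_TRP_280 = 5690.0


def molar_concentration(a280, n_trp, pathlength_cm=1.0):
    """Return the protein concentration in mol/L from the absorbance at 280 nm.

    Assumes Trp is the only chromophore absorbing at 280 nm.
    """
    if n_trp <= 0:
        raise ValueError("at least one Trp residue is needed for this estimate")
    epsilon = n_trp * EPSILON_TRP_280  # coefficient for the whole protein
    return a280 / (epsilon * pathlength_cm)


# Hypothetical reading: A280 = 0.1138 for a protein with two Trp residues.
conc = molar_concentration(0.1138, n_trp=2)
print(f"{conc * 1e6:.1f} uM")
```

With these placeholder inputs the estimate is about 10 µM; a real measurement would substitute the observed absorbance and the actual number of Trp residues in each construct.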
CD and fluorescence measurements
CD spectra of purified proteins in the absence and presence of 1–5 M Gdn·HCl were measured on a J-820 spectropolarimeter (JASCO) at 25°C. The protein concentration was 3 µM and the light pathlength used was 1 mm. The results were expressed as mean residue molar ellipticity [θ].
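The conversion from a raw CD signal to mean residue molar ellipticity can be sketched as follows. This is a minimal example assuming the common convention [θ] = θ_obs(mdeg) / (10 · c · l · n), with c in mol/L, l in cm and n the number of residues; some laboratories divide by the number of peptide bonds (n − 1) instead, and the text does not say which was used here. The observed signal and the chain length below are hypothetical, while the concentration and pathlength are those stated above.

```python
def mean_residue_ellipticity(theta_mdeg, conc_molar, pathlength_cm, n_residues):
    """Convert an observed ellipticity in millidegrees to [theta]
    in deg cm^2 dmol^-1 per residue."""
    if min(conc_molar, pathlength_cm, n_residues) <= 0:
        raise ValueError("concentration, pathlength and residue count must be positive")
    return theta_mdeg / (10.0 * conc_molar * pathlength_cm * n_residues)


# Conditions from the text: 3 uM protein, 1 mm (0.1 cm) pathlength.
# Hypothetical inputs: an observed signal of -3.0 mdeg, 100 residues.
mre = mean_residue_ellipticity(-3.0, 3e-6, 0.1, 100)
print(f"{mre:.0f} deg cm^2 dmol^-1")
```

At 3 µM and a 1 mm pathlength, each millidegree of raw signal corresponds to several thousand units of [θ] for a protein of this size, so baseline subtraction has a large effect on the reported values.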
Fluorescence measurements were performed at 25°C on a Shimadzu RF-1500 spectrofluorometer. The emission spectra of Trp residues of 1 µM proteins were measured at an excitation wavelength of 280 nm and the fluorescence spectra of 50 µM ANS (1-anilinonaphthalene-8-sulfonic acid) (Molecular Probes) in the absence and presence of 1 µM protein were measured with excitation at 371 nm.
Size-exclusion chromatography
Gel-filtration experiments on purified proteins were performed using a Shodex KW-803 column (Showa Denko) on a Vision Workstation (Applied Biosystems). The column was calibrated with a low molecular weight gel filtration calibration kit (Amersham Pharmacia Biotech). The Stokes radii of purified proteins and the control proteins (BSA, ovalbumin, chymotrypsinogen and ribonuclease) were calculated from their elution volumes as described previously (Uversky, 1993).
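A calibration of this kind can be sketched as a straight-line fit. The example below assumes that the Stokes radius of the standards varies linearly with 1000/V_el, where V_el is the elution volume; whether this is exactly the form used in the cited procedure is an assumption. The elution volumes are invented placeholders, and the Stokes radii of the standards are approximate literature values that should be checked against the calibration kit documentation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# name: (elution volume in ml [placeholder], Stokes radius in angstrom [approximate])
standards = {
    "BSA": (8.8, 35.5),
    "ovalbumin": (9.3, 30.5),
    "chymotrypsinogen": (10.6, 20.9),
    "ribonuclease": (11.3, 16.4),
}

xs = [1000.0 / v for v, _ in standards.values()]
ys = [r for _, r in standards.values()]
slope, intercept = fit_line(xs, ys)


def stokes_radius(elution_volume_ml):
    """Estimate a Stokes radius (angstrom) from an elution volume (ml)."""
    return slope * (1000.0 / elution_volume_ml) + intercept


# Hypothetical sample eluting at 10.0 ml.
print(f"{stokes_radius(10.0):.1f} angstrom")
```

A sample whose estimated radius lies above the value expected for a globular protein of the same molecular weight would be read as either extended or oligomeric, which is the comparison made in the Results.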
Results |
---|
The most abundant amino acids in the prebiotic environment, as inferred from the results of spark-discharge experiments, were Ala, Gly, Asp and Val (Miller, 1953; Eigen, 1978), whereas those deduced from analysis of the Murchison meteorite were Gly, Ala, Glu and Val (Kvenvolden et al., 1970). Interestingly, the codons for all these amino acids have guanosine (G) at the first position (Figure 1) and thus the codons GNC and GNN, where N denotes U, C, A or G, were proposed to have formed the early genetic code (Eigen, 1978; Kuhn and Waser, 1994). We chose the five amino acids Ala, Gly, Val, Asp and Glu as a primitive alphabet.
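The relationship between the GNC/GNN codons and the five chosen amino acids can be checked directly against the standard genetic code. The following Python sketch enumerates all 16 GNN codons and translates them; the function and variable names are illustrative only.

```python
BASES = "UCAG"


def translate_g_codon(codon):
    """Translate a codon of the form GNN using the standard genetic code."""
    if len(codon) != 3 or codon[0] != "G" or any(b not in BASES for b in codon):
        raise ValueError("only GNN codons over U, C, A, G are covered in this sketch")
    second, third = codon[1], codon[2]
    if second == "U":
        return "Val"
    if second == "C":
        return "Ala"
    if second == "G":
        return "Gly"
    # second == "A": the third base decides between Asp and Glu
    return "Asp" if third in "UC" else "Glu"


# All 16 GNN codons and the subset ending in C (GNC).
gnn = {f"G{n2}{n3}": translate_g_codon(f"G{n2}{n3}") for n2 in BASES for n3 in BASES}
gnc = {codon: aa for codon, aa in gnn.items() if codon.endswith("C")}

print(sorted(set(gnn.values())))  # ['Ala', 'Asp', 'Glu', 'Gly', 'Val']
print(gnc)  # {'GUC': 'Val', 'GCC': 'Ala', 'GAC': 'Asp', 'GGC': 'Gly'}
```

The GNC codons alone specify Val, Ala, Asp and Gly, and extending the third position to any base adds Glu, which gives the five-letter VADEG alphabet.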
As shown in Figure 4, we examined the solubility of eight of the random-sequence VADEG proteins and found that all of them were expressed in the soluble fraction. In the case of G4, the protein was also detected in the insoluble fraction. In previous studies, only 5 of 25 random-sequence proteins with the 20-amino-acid alphabet (Prijambada et al., 1996) and only 2 of 11 random-sequence QLR proteins (Davidson et al., 1995) were found to be soluble. The VADEG proteins presented here thus seem to possess remarkably high solubility.
Structural characterization of the VADEG proteins
Previous reports indicated that 20-alphabet random-sequence proteins did not show any marked secondary structure (Doi et al., 1998; Yamauchi et al., 1998), whereas the three-alphabet QLR proteins had remarkably high levels of helical structure (fractional helicity 32–70%), and it was suggested that proteins with a limited range of amino acids have a greater tendency to form secondary structure (Davidson et al., 1995). However, the five-alphabet VADEG proteins revealed no marked secondary structure in CD measurements (Figure 5 and Table I), in spite of the abundance of Glu and Ala, which are known to be strong helix formers (Chou and Fasman, 1978). One reason for the low helical content of the VADEG proteins may be the presence of the strong helix breaker Gly within the polypeptides. The highly structured QLR proteins contain the strong helix former Leu and have no helix breaker in their sequences.
For tertiary structure analysis, the emission spectra of the Trp residues that had been introduced into the random-sequence proteins were measured at an excitation wavelength of 280 nm. The emission maxima ranged from 348 to 352 nm in an aqueous buffer, suggesting that almost all the Trp residues are exposed to the solvent (Teale, 1960). However, the intensity at the emission maximum of the G8 random-sequence protein was decreased by half in buffer containing 5 M urea (data not shown). Hence some of the Trp side chains of the G8 protein appear to be located in a hydrophobic environment under native conditions, and this environment appears to be disrupted by denaturation in 5 M urea. The presence of hydrophobic clusters in the random-sequence polypeptide was also supported by the results of ANS binding experiments. The fluorescence emission of ANS is known to be enhanced when the dye binds to hydrophobic regions of proteins (Stryer, 1965). As shown in Figure 6, ANS fluorescence increased in the presence of the G4 random-sequence protein. A relatively small increase was also observed for the G1, G3, G7 and G8 proteins (Table I).
Next, the oligomeric state of the VADEG proteins was analyzed by means of gel filtration experiments. Each VADEG protein, except for the G4 protein, was eluted as a single peak and the Stokes radius calculated from the elution position of each VADEG protein was slightly larger than that deduced from data for monomeric, globular proteins with similar molecular weights (Figure 7). This result suggests that the VADEG proteins with high solubility tended to exist as monomers with a slightly extended shape, whereas the QLR proteins (Davidson et al., 1995) and the 20-alphabet random-sequence proteins (Yamauchi et al., 1998) with poor solubility tended to form multimeric structures.
Discussion |
---|
Why are the random-sequence VADEG proteins highly soluble? |
---|
Origin and early evolution of proteins through functional selection |
---|
Keefe and Szostak estimated the frequency of functional proteins in a library of 20-alphabet random-sequence proteins as 1 in 10¹¹ by using mRNA display technology (Keefe and Szostak, 2001). It would be interesting to study whether a library of simple-alphabet random-sequence proteins with higher solubility contains functional proteins in the reduced sequence space. Such studies are in progress using in vitro display technologies developed in our laboratory (Nemoto et al., 1997; Doi and Yanagawa, 1999; Yonezawa et al., 2003; Horisawa et al., 2004). The acidic residues, Asp and Glu, can bind to metals such as Mg²⁺ and Ca²⁺, so primitive enzymes with the VADEG alphabet might act as metalloenzymes, like ribozymes in the RNA world. Recent studies in vitro (Riddle et al., 1997; Silverman et al., 2001; Akanuma et al., 2002) and in silico (Chan, 1999; Wang and Wang, 1999; Murphy et al., 2000; Fan and Wang, 2003) provide insight into the extent to which native protein structure and function can be achieved with reduced alphabets, such as IKEAG (Riddle et al., 1997). In order to acquire artificial proteins with various functions by laboratory evolution, it might be effective to dope basic amino acids such as Lys and Arg into a set of reduced-alphabet random-sequence libraries.
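To give a sense of scale for the reduced sequence space mentioned in the preceding paragraph, the following Python sketch compares the number of possible sequences for a 5-letter and a 20-letter alphabet. The chain length of 80 residues is a hypothetical value chosen for illustration, since the length of the random region is not restated in this section.

```python
import math


def log10_sequence_space(alphabet_size, length):
    """Return log10 of the number of distinct sequences of a given length."""
    if alphabet_size < 1 or length < 0:
        raise ValueError("alphabet size must be >= 1 and length must be >= 0")
    return length * math.log10(alphabet_size)


length = 80  # hypothetical length of the random region
full = log10_sequence_space(20, length)    # 20-letter alphabet
reduced = log10_sequence_space(5, length)  # 5-letter alphabet such as VADEG

print(f"20-letter alphabet: about 10^{full:.1f} sequences")
print(f"5-letter alphabet:  about 10^{reduced:.1f} sequences")
print(f"ratio:              about 10^{full - reduced:.1f}")
```

Under this assumption the five-letter space is smaller by a factor of 4 raised to the chain length, yet it remains far larger than any library that can be sampled experimentally.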
As an alternative approach to laboratory protein evolution and selection, Hecht et al. constructed a binary code library by designing a binary pattern of polar and non-polar amino acids that would favor proteins containing abundant secondary structure (Kamtekar et al., 1993; Wei and Hecht, 2004). The quality of the library is high; almost all of the proteins in the library are soluble and have a well-ordered structure, although their structural diversity is limited to a predesigned unique fold, such as a four-helix bundle structure. Libraries of modest quality and diversity containing polypeptides with various folds and abundant secondary structure, but no well-ordered structure, can be constructed by randomly combining naturally occurring polypeptide segments such as repetitive peptide motifs (Shiba et al., 1997, 2003), secondary structure units (Tsuji et al., 1999, 2001) and random cDNA fragments (Fischer et al., 2004). In general, there would be a trade-off between the quality and the diversity of combinatorial polypeptide libraries. However, a random-sequence library with a reduced alphabet can achieve relatively high quality (e.g. high solubility) with enormous diversity.
Acknowledgements |
---|
References |
---|
Babajide,A., Hofacker,I.L., Sippl,M.J. and Stadler,P.F. (1997) Fold. Des., 2, 261–269.
Brooks,D.J., Fresco,J.R., Lesk,A.M. and Singh,M. (2002) Mol. Biol. Evol., 19, 1645–1655.
Chan,H.S. (1999) Nat. Struct. Biol., 6, 994–996.
Chaput,J.C. and Szostak,J.W. (2004) Chem. Biol., 11, 865–874.
Cho,G., Keefe,A.D., Liu,R., Wilson,D.S. and Szostak,J.W. (2000) J. Mol. Biol., 297, 309–319.
Chou,P.Y. and Fasman,G.D. (1978) Annu. Rev. Biochem., 47, 251–276.
Crick,F.H.C. (1968) J. Mol. Biol., 38, 367–379.
Davidson,A.R. and Sauer,R.T. (1994) Proc. Natl Acad. Sci. USA, 91, 2146–2150.
Davidson,A.R., Lumb,K.J. and Sauer,R.T. (1995) Nat. Struct. Biol., 2, 856–864.
Doi,N. and Yanagawa,H. (1998) Cell. Mol. Life Sci., 54, 394–404.
Doi,N. and Yanagawa,H. (1999) FEBS Lett., 457, 227–230.
Doi,N., Itaya,M., Yomo,T., Tokura,S. and Yanagawa,H. (1997) FEBS Lett., 402, 177–180.
Doi,N., Yomo,T., Itaya,M. and Yanagawa,H. (1998) FEBS Lett., 427, 51–54.
Eigen,M. (1978) Naturwissenschaften, 65, 341–369.
Fan,K. and Wang,W. (2003) J. Mol. Biol., 328, 921–926.
Fischer,N., Riechmann,L. and Winter,G. (2004) Protein Eng. Des. Sel., 17, 13–20.
Gill,S.C. and von Hippel,P.H. (1989) Anal. Biochem., 182, 319–326.
Horisawa,K., Tateyama,S., Ishizaka,M., Matsumura,N., Takashima,H., Miyamoto-Sato,E., Doi,N. and Yanagawa,H. (2004) Nucleic Acids Res., 32, e169.
Ito,Y., Kawama,T., Urabe,I. and Yomo,T. (2004) J. Mol. Evol., 58, 196–202.
James,L.C. and Tawfik,D.S. (2003) Trends Biochem. Sci., 28, 361–368.
Kamtekar,S., Schiffer,J.M., Xiong,H., Babik,J.M. and Hecht,M.H. (1993) Science, 262, 1680–1685.
Kauffman,S. and Ellington,A.D. (1999) Curr. Opin. Chem. Biol., 3, 256–259.
Keefe,A.D. and Szostak,J.W. (2001) Nature, 410, 715–718.
Kuhn,H. and Waser,J. (1994) FEBS Lett., 352, 259–264.
Kvenvolden,K., Lawless,J., Pering,K., Peterson,E., Flores,J., Ponnamperuma,C., Kaplan,I.R. and Moore,C. (1970) Nature, 228, 923–926.
Kyte,J. and Doolittle,R.F. (1982) J. Mol. Biol., 157, 105–132.
LaBean,T.H., Kauffman,S.A. and Butt,T.R. (1995) Mol. Divers., 1, 29–38.
Lo Surdo,P., Walsh,M.A. and Sollazzo,M. (2004) Nat. Struct. Mol. Biol., 11, 382–383.
Mandecki,W. (1990) Protein Eng., 3, 221–226.
Miller,S.L. (1953) Science, 117, 528–529.
Murphy,L.R., Wallqvist,A. and Levy,R.M. (2000) Protein Eng., 13, 149–152.
Nemoto,N., Miyamoto-Sato,E., Husimi,Y. and Yanagawa,H. (1997) FEBS Lett., 414, 405–408.
Prijambada,I.D., Yomo,T., Tanaka,F., Kawama,T., Yamamoto,K., Hasegawa,A., Shima,Y., Negoro,S. and Urabe,I. (1996) FEBS Lett., 382, 21–25.
Riddle,D.S., Santiago,J.V., Bray-Hall,S.T., Doshi,N., Grantcharova,V.P., Yi,Q. and Baker,D. (1997) Nat. Struct. Biol., 4, 805–809.
Saven,J.G. (2002) Curr. Opin. Struct. Biol., 12, 453–458.
Schägger,H. and von Jagow,G. (1987) Anal. Biochem., 166, 368–379.
Shiba,K., Takahashi,Y. and Noda,T. (1997) Proc. Natl Acad. Sci. USA, 94, 3805–3810.
Shiba,K., Shirai,T., Honma,T. and Noda,T. (2003) Protein Eng., 16, 57–63.
Silverman,J.A., Balakrishnan,R. and Harbury,P.B. (2001) Proc. Natl Acad. Sci. USA, 98, 3092–3097.
Stryer,L. (1965) J. Mol. Biol., 13, 482–495.
Teale,F.W. (1960) Biochem. J., 76, 381–388.
Tsuji,T., Yoshida,K., Satoh,A., Kohno,T., Kobayashi,K. and Yanagawa,H. (1999) J. Mol. Biol., 286, 1581–1596.
Tsuji,T., Onimaru,M. and Yanagawa,H. (2001) Nucleic Acids Res., 29, e97.
Uversky,V.N. (1993) Biochemistry, 32, 13288–13298.
Wang,J. and Wang,W. (1999) Nat. Struct. Biol., 6, 1033–1038.
Watters,A.L. and Baker,D. (2004) Eur. J. Biochem., 271, 1615–1622.
Wei,Y. and Hecht,M.H. (2004) Protein Eng. Des. Sel., 17, 67–75.
Wilkinson,D.L. and Harrison,R.G. (1991) Biotechnology, 9, 443–448.
Wong,J.T. (1975) Proc. Natl Acad. Sci. USA, 72, 1909–1912.
Wright,P.E. and Dyson,H.J. (1999) J. Mol. Biol., 293, 321–331.
Yamauchi,A. et al. (1998) FEBS Lett., 421, 147–151.
Yonezawa,M., Doi,N., Kawahashi,Y., Higashinakagawa,T. and Yanagawa,H. (2003) Nucleic Acids Res., 31, e118.
Received March 18, 2005; revised April 29, 2005; accepted May 5, 2005.
Edited by Dan Tawfik