Translated products of tandem microgene repeats exhibit diverse properties also seen in natural proteins

Kiyotaka Shiba1,2,5, Tsuyoshi Shirai3, Takako Honma2 and Tetsuo Noda2,4

1 Department of Protein Engineering, 2 Department of Cell Biology, Cancer Institute, Japanese Foundation for Cancer Research, Toshima, Tokyo 170-8455, 3 Department of Biotechnology, Graduate School of Engineering Nagoya University, Chikusa, Nagoya 464-8603 and 4 Department of Molecular Genetics, University of Tohoku School of Medicine, Aoba, Sendai 980-8575, Japan

5 To whom correspondence should be addressed. E-mail: kshiba@jfcr.or.jp


    Abstract
 
Repetitiousness is often observed in the primary and tertiary structures of proteins. We are intrigued by the potential role played by periodicity in the evolution of proteins and have created artificial repetitious proteins from repeats of short DNA sequences (microgenes). In this paper we characterize the physicochemical properties of six such artificially created proteins, which are the translated products of repeats of three microgenes. Three of the six proteins contain β-sheet-like structures and are rather hydrophobic in nature. These proteins form macroscopic membranous structures in the presence of monovalent cations, suggesting they have the capacity to promote strong intermolecular interactions. Of the other three proteins, one is composed of α-helices and two have disordered structures. Small angle X-ray scattering analysis indicates that the artificial proteins do not fold as tightly as natural proteins, but are more compact than if completely denatured. One α-helical protein whose microgene unit was designed from coiled coil proteins was crystallized, demonstrating that repetitious artificial proteins can undergo transition to a more ordered state under appropriate conditions. Application of this approach to the development of a novel protein engineering system is discussed.

Keywords: artificial repetitious protein/de novo protein synthesis/in vitro protein evolution/protein engineering


    Introduction
 
Soon after the development of methodologies used to resolve the molecular structures of biomacromolecules, it became apparent that these structures were remarkably repetitive in nature, and the possibility that such repetition was related to the origin and evolution of proteins and genes was proposed (Rossmann et al., 1974; Barker et al., 1978; Ohno, 1981; Ohno and Epplen, 1983). Recent innovations in genome sequencing and structural biology have only served to increase our appreciation of the scope of this periodicity. Genome structures are now known to contain a multitude of periodicities, ranging from those on the level of one to several nucleotides (Sutherland and Richards, 1995) to those on the sub-chromosomal level (Wolfe and Shields, 1997). Indeed various repeated sequences account for more than 50% of the human genome (Lander et al., 2001). Likewise, protein structures also exhibit a variety of repeats; ‘domain’, ‘module’ and ‘element’ are terms often used to describe units that repeat within the tertiary structure of many proteins (Baron et al., 1991; Yura et al., 1993; Kobe and Deisenhofer, 1995; Kobe and Kajava, 2000). Within the primary structure of proteins, sequences ranging from 1 to >300 amino acids in length are repeated (Sutherland and Richards, 1995; Li et al., 1997) from 2 to >300 times (Wilkin et al., 2000). Such peptide sequence repeats may exist as simple contiguous tandem repeats (Eckert and Green, 1986), as non-tandem and scattered repeats, or as complicated hierarchical patterns (Hayashi and Lewis, 2000). Thus, biomacromolecules exhibit numerous modes of reiteration, varying widely with respect to length, number, conservation and spacing.

Some patterns of repetition are apparently related to the origin or evolution of molecules. Proteins of relatively recent origin (e.g. the circumsporozoite antigen of parasitic protozoa and antifreeze proteins of fish) often consist of near exact copies of oligopeptide sequences (Ohno, 1984). In the case of porcine ribonuclease inhibitor, the near-perfect periodicity observed in its tertiary structure implies that the protein evolved from the reiteration of a DNA sequence 87 bp in length or less (Kobe and Deisenhofer, 1993). We have focused on the hypothesis that new genes arise in the form of repeats of a short DNA block, and have created artificial repetitious polypeptides through polymerization of microgenes lacking a stop codon (Shiba et al., 1997). Our previous work has shown that periodic DNA produces ordered proteins at very high rates (Shiba et al., 2002). In this paper, we report on the characteristics of several artificial proteins created from reiteration of short DNA sequences.


    Materials and methods
 
Expression of microgene polymer products and their purification

The expression and purification of artificial proteins has been described previously (Shiba et al., 2002). Briefly, proteins were expressed in E. coli and purified under denaturing conditions using TALON resin (Clontech, Palo Alto, CA). Eluted proteins were dialyzed against a buffer containing 50 mM Tris–acetate pH 4.0, 100 mM NaCl and 1 mM EDTA to remove the urea.

Small angle X-ray scattering (SAXS)

Measurement of SAXS was carried out at a solution scattering station installed at BL-10C, Photon Factory, Tsukuba, Japan (Ueki et al., 1985). Proteins were dissolved in 10 mM phosphate buffer pH 6.0 containing 100 mM NaCl and placed in a cell (1 mm X-ray path length with 20 µm thick quartz windows) where the measurements were made. The wavelength of the incident X-rays was 1.488 Å, and the beam cross-section was 0.5 × 3.0 mm. The distance from the sample to the detector was 202 cm, calibrated with meridional diffraction of dried chicken collagen. The same solution without protein was measured as background. All measurements were performed at ambient temperature.
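As a rough illustration of the geometry described above, the following Python sketch converts a radial position on the detector into a scattering vector q = 4π sin θ/λ, using the wavelength (1.488 Å) and sample-to-detector distance (202 cm) reported here. It assumes a flat detector perpendicular to the beam. The function name and the example radii are hypothetical and are not taken from the original analysis.

```python
import math

WAVELENGTH_A = 1.488             # incident X-ray wavelength (Å), from the text
SAMPLE_TO_DETECTOR_MM = 2020.0   # 202 cm sample-to-detector distance, from the text


def q_from_radius(radius_mm, distance_mm=SAMPLE_TO_DETECTOR_MM,
                  wavelength_a=WAVELENGTH_A):
    """Return q (Å^-1) for a detector point radius_mm from the beam centre.

    Assumes a flat detector normal to the beam, so tan(2θ) = radius / distance.
    """
    two_theta = math.atan2(radius_mm, distance_mm)
    return 4.0 * math.pi * math.sin(two_theta / 2.0) / wavelength_a


# Illustrative radii in mm (assumed values, not from the paper)
for r in (5.0, 10.0, 20.0):
    print(f"r = {r:5.1f} mm -> q = {q_from_radius(r):.4f} Å^-1")
```

In practice the distance would be refined against a calibrant, as was done here with dried chicken collagen, rather than taken from a nominal value.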

Radii of gyration (Rg) of natural proteins in native and denatured states

Reported values for the following proteins were used: ubiquitin (UBQ) and β-lactalbumin (BLA) (Kamatari et al., 1999); Streptomyces subtilisin inhibitor (SSI) (Konno et al., 1995); cytochrome c (CYT) (Kamatari et al., 1996); ribonuclease A (RNA) (Sosnick and Trewhella, 1992); α-lactalbumin (ALA) (Kataoka et al., 1997); lysozyme (LYS) (Kamatari et al., 1998); staphylococcal nuclease (SNC) (Panick et al., 1999); apo-myoglobin (MYO) (Kataoka et al., 1995); aspergillopepsin II (PEP) (Kojima et al., 2000); Osp A (OSP) (Koide et al., 1999); and the α-subunit of tryptophan synthase (TRP) (Gualfetti et al., 1999).

Crystallization conditions

Protein no. 320 (α+) was crystallized using the hanging drop vapor-diffusion method. One millilitre of 50 mM HEPES buffer pH 6.5 containing 25% (w/v) PEG 4000 and 20% (v/v) glycerol served as the reservoir. A drop containing 5 µl of 1% (w/v) protein solution in 50 mM phosphate buffer and 5 µl of reservoir solution was equilibrated against the reservoir at 18 °C. Numerous microcrystals appeared within a few days; the crystal presented was one of the largest. A sample of the crystals was rinsed with reservoir solution and confirmed to be composed of protein no. 320 (α+) by SDS–PAGE.

X-ray diffraction analysis

The microcrystals were scooped up from a hanging drop with a cryo-loop (Hampton Research, Laguna Niguel, CA) and flash frozen in liquid nitrogen. The X-ray diffraction experiment was carried out using an RU-300 Cu-rotating anode X-ray generator (Rigaku, Tokyo) operating at 50 kV × 100 mA (λ = 1.542 Å). During the experiment, the frozen drop was kept in a 100 K nitrogen stream from a cryosystem (Oxford Cryosystem, Oxford, UK). The diffraction image was recorded for 10 h using a Raxis IV imaging plate detector (Rigaku).


    Results
 
Creation of artificial proteins

The primary structures of the six proteins described in this report are shown in Figure 1. They are the translated products of artificial open reading frames that were created by tandem polymerization of short DNA sequences having no stop codon (microgenes) described previously (Shiba et al., 2002). For polymerization of microgenes, we used a method called microgene polymerization reaction (MPR), which randomly inserts or deletes nucleotides at end-joining junctions of microgene units (Shiba, 1998). The translated proteins, therefore, were not simple repeats of unique peptide sequences but combinatorial polymers of 2 × 3 open reading frames.
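The effect of a junction insertion on the reading frame can be sketched in a few lines of Python. The 42-nt unit below is a hypothetical back-translation of the MG-15 (α) frame-1 peptide RKVLQGRMENLQAE (given later in this section) using arbitrarily chosen codons; it is not the published microgene sequence, and unlike a real designer microgene it has not been checked for stop codons in its other frames. The helper names are likewise illustrative.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, with codons enumerated in TCAG order
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(dna):
    """Return the 'minus' strand read 5' to 3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna, frame=0):
    """Translate dna starting at offset frame (0, 1 or 2); '*' marks a stop."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# Hypothetical unit encoding RKVLQGRMENLQAE (NOT the real MG-15 sequence)
unit = "CGTAAAGTTCTGCAGGGTCGTATGGAAAACCTGCAGGCTGAA"

exact_dimer = unit + unit          # perfect head-to-tail repeat
shifted_dimer = unit + "A" + unit  # one nucleotide inserted at the junction

print(translate(exact_dimer))      # the same peptide twice
print(translate(shifted_dimer))    # second unit is read in a different frame

# The 2 x 3 frames available to a polymer: three per strand
for strand_name, strand in (("plus", exact_dimer),
                            ("minus", reverse_complement(exact_dimer))):
    for frame in range(3):
        print(strand_name, frame, translate(strand, frame))
```

A single inserted or deleted nucleotide at a junction therefore switches the downstream repeats to another of the three frames on the same strand, which is how one microgene can give rise to a combinatorial set of polypeptides.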



 
Fig. 1. Primary structure of products translated from microgene polymers. Numbers represent clone names and are followed by the name of the microgene unit and the direction of translation (+ and - denote sense and antisense microgenes, respectively). Red, black and blue letters represent charged, polar and non-polar residues, respectively (Shiba et al., 2002). Yellow, aqua and pink boxes represent residues calculated to form α-helices, β-strands and β-turns, respectively, using a secondary structure prediction program (Geourjon and Deleage, 1994). Bars represent deleted regions.

 
Proteins no. 287 (Cu+), no. 291 (Cu-) and no. 292 (Cu+) were all derived from the 54 bp designer microgene MG-14 (Cu) (Shiba et al., 2002). One of the reading frames (frame-1) of this microgene encodes the peptide GAGMYAESYGRKICSPHQ, which with head-to-tail concatenation can recreate part of the type-1 copper-binding motif of plastocyanin (H...[46aa]...CSPHQGAGM) (Adman, 1991). Protein no. 287 (Cu+) contains part of frame-1 within its C-terminal region (KICSPHQG), while the remainder of the protein is composed of frame-2. Protein no. 291 (Cu-) is translated from the reading frames of the ‘minus’ strand of MG-14 (frames-3, -4 and -5), and is therefore not intentionally related to any existing protein. The sequence of protein no. 291 (Cu-) has a rather complicated structure due to mutations occurring at end-joining junctions and elsewhere in the sequence; the latter appear to have been introduced by the thermocycle reaction or within E. coli cells. Internal deletions also produced complicated sequences in protein no. 292 (Cu+), which was translated from the three reading frames of the ‘plus’ strand of MG-14 within which the copper-binding motif appears five times.

The 42 bp microgene MG-15 (α) was designed so that it encodes the polypeptide RKVLQGRMENLQAE in its first reading frame. This peptide sequence was derived from part of the coiled coil α-helix of seryl-tRNA synthetase (Cusack et al., 1990) and was expected to provide α-helical structure to the reconstituted proteins. Protein no. 320 (α+) contains 5.5 repeats of the α-helix forming frame of the microgene, which accounts for about 80% of the protein. Analogously, the 66 bp MG-16 (β) was designed so that one of its reading frames codes for the β-strand forming peptide GVWVDESGNRMDSNNWIGSSAN. This peptide sequence was designed to be a 22 amino acid consensus sequence of the repeating units of the parallel β-helix protein pectate lyase from Erwinia chrysanthemi (Yoder et al., 1993). Protein no. 416 (β+) has 1.5 and 2.5 repeats of this β-forming peptide at its N- and C-termini, respectively; the repeats located in the middle portion of the protein were translated from frame-3. The 72 bp microgene MG-18 (tRNA) was designed based on the sequence of Bacillus subtilis glycine tRNA (Vold and Green, 1988), and none of the products translated from its six reading frames were consciously related to any natural protein. Eigen (Eigen and Winkler-Oswatitsch, 1981) suggested that tRNA might have served as an early gene, and this microgene was created with that in mind. Protein no. 377 (tRNA) has mutations at one microgene junction and at two positions elsewhere in the sequence that changed the reading frame three times. The length, amino acid composition and other characteristics of these six proteins are summarized in Table I.


 
Table I. Summary of six artificial proteins
 
An earlier study showed that microgene polymers yield proteins with secondary structures at a high rate (more than 10% of the library) (Shiba et al., 2002). Among the six proteins in Figure 1, CD analyses showed the presence of secondary structures in four: no. 291 (Cu-), no. 320 (α+), no. 377 (tRNA) and no. 416 (β+). As expected from the microgene designs and the reading frames used, proteins no. 320 (α+) and no. 416 (β+) contained α-helix and β-strand, respectively. The microgenes whose designs were not intentionally associated with specific structures also gave rise to structured proteins. Protein no. 291 (Cu-) was translated from polymers of the ‘minus’ strand of MG-14 (Cu), so none of its sequences were related to natural proteins [this was confirmed by a Blast search (Altschul et al., 1990) of a public database]. Similarly, protein no. 377 (tRNA) was composed of repeats of tRNA-derived sequences and was not associated with any natural proteins. Nevertheless, the CD spectra of these proteins showed the presence of β-structure at pH 6.0, with the amount of orderly secondary structure diminishing as the buffer pH declined to 4.0 and 2.0 (Shiba et al., 2002). Predictions of secondary structure based on the sequences of these proteins agree that β-structure should be present in these proteins (Figure 1 and Table I). Although secondary structure was also predicted for proteins no. 287 and no. 292 (Table I), the CD measurement yielded spectra typical of no regular secondary structure (data not shown).

Absorption spectra for proteins no. 287 (Cu+) and no. 292 (Cu+) in the presence of copper ion

The presence of α-helix and β-sheet-like structures in proteins no. 320 (α+) and no. 416 (β+), respectively, suggests that structures intentionally embedded in microgenes are reconstituted in their translated products. The sequence of protein no. 292 (Cu+) contained four repeats of the copper-binding motif embedded in the first frame of MG-14 (Cu). Although the copper-binding motif was not fully expressed at the C-terminal fragment of protein no. 287 (Cu+), a Met residue necessary for the complete motif recurred in the N-terminal domain of the protein (Figure 1). When purifying the six proteins, we found that solutions of proteins no. 287 (Cu+) and no. 292 (Cu+) were pink in color, probably because we used an immobilized Co2+ affinity column for their purification, resulting in the association of Co2+ with these two proteins. The possible interaction of metal ions and MG-14 (Cu) polymer proteins was further investigated by refolding denatured protein no. 287 (Cu+) in a solution containing 50 mM 2-(N-morpholino)-ethanesulfonic acid, 100 mM NaCl and 0.5 mM CuSO4. This yielded a protein with a pale blue color and an absorption maximum at 630 nm, indicating association of Cu2+. Although the absorption maximum differed from that of plastocyanin (597 nm) (Yoshizaki et al., 1981) and the analyses were qualitative, these observations nevertheless suggest that proteins no. 287 (Cu+) and no. 292 (Cu+) associate with metal ions, possibly via an embedded copper-binding motif. Moreover, they demonstrate that a function encrypted within a microgene can be reconstituted in an artificial protein.

Spontaneous formation of macroscopic structures

Various levels of repetition are often observed in proteins that assemble into intracellular or extracellular macrostructures. Instances include spectrin, a component of the cytoskeleton (Yan et al., 1993); fibronectin, a component of extracellular matrix (Leahy et al., 1996); involucrin, a component of the keratinocyte envelope (Eckert and Green, 1986); and Flag, a component of spider flagelliform silk (Hayashi and Lewis, 2000). In addition, amyloid formation is believed to be related to oligopeptide repeats (Liu and Lindquist, 1999). The repetitious nature of these proteins is believed to favor intermolecular interactions leading to the formation of macroscopic structures.

We evaluated the intermolecular interactions of our repetitious proteins by examining their ability to form a macroscopic membranous structure when poured into an alkaline monovalent cation solution (i.e. 0.2 M LiCl). This protocol was previously used by others to form membranes from a self-complementary oligopeptide [(AEAEAKAK)2] that exhibited the typical CD spectrum of β-sheet (Zhang et al., 1993). Among the six proteins, no. 291 (Cu-), no. 377 (tRNA+) and no. 416 (β+) formed a membranous structure in buffer containing 0.2 M LiCl (Figure 2). Interestingly, these membrane-forming proteins share two important features: (i) a rather high composition of non-polar amino acid residues (Table I) and (ii) a β-sheet-like structure (Shiba et al., 2002). This suggests that by adjusting the hydrophobicity and β-sheet forming ability of the polymerized products, microgenes could be designed so that they code for proteins capable of strong and orderly intermolecular interactions, thereby resulting in the formation of macroscopic structures.



 
Fig. 2. Macroscopic structures spontaneously formed by microgene polymer proteins. 3 µl of protein solution (8 µg/µl in 50 mM Tris–acetate pH 4.0, 100 mM NaCl, 1 mM EDTA) was poured into 200 µl of 50 mM sodium phosphate buffer pH 8.0 containing 0.2 M LiCl at ambient temperature; the emergent structures were observed under a stereomicroscope (Wild M30, Leica). Bar = 1 mm.

 
Small angle X-ray scattering (SAXS) analysis

The membrane formation described above suggests that proteins no. 291 (Cu-), no. 377 (tRNA+) and no. 416 (β+) tend to form multimeric assemblages, even under physiological conditions. This possibility is supported by the observations that proteins no. 291 (Cu-) and no. 377 (tRNA+) behaved as huge proteins with molecular weights of more than 440 kDa in gel filtration experiments. Moreover, 6.0 M urea was required for filtration of protein no. 416 (β+), suggesting intermolecular interaction was particularly strong for that protein (data not shown). By contrast, proteins no. 287 (Cu+) and no. 320 (α+) behaved as though they were relatively small (the calculated Stokes radii were 16 Å and 33 Å, respectively, in size-exclusion chromatography experiments; data not shown) and, in the presence of 6 M urea, no. 292 (Cu+) ran as a protein of approximately 100 kDa in size-exclusion chromatography (data not shown). Thus, proteins no. 287 (Cu+) and no. 320 (α+) might exist as tightly packed mono-disperse molecules in physiological solutions.

To address that question further, we performed small angle X-ray scattering (SAXS) analyses with proteins no. 287 (Cu+), no. 320 (α+) and no. 292 (Cu+) to determine their radii of gyration (Rg) in solution. The Rgs obtained from Guinier plots (Glatter and Kratky, 1982) were 26.6 Å, 26.6 Å and 29.0 Å for proteins no. 320 (α+), no. 292 (Cu+) and no. 287 (Cu+), respectively. Comparison of the apparent Rgs and monomeric molecular weights of the artificial proteins with those of natural proteins in their native and denatured states indicated that the former do not fold as tightly as the latter (Figure 3A). The Rgs of the artificial proteins fell in the vicinity of the lower boundary of the Rgs of denatured natural proteins. However, the fact that one of the natural proteins, aspergillopepsin II (cf. PEP in Figure 3A), retained its intramolecular disulfide bonds and therefore showed a relatively small Rg in the denatured state (Kojima et al., 2000) implies that the artificial proteins, especially no. 320 (α+) and no. 292 (Cu+), might not be in a completely random state. The same implication was obtained from Kratky plots (Figure 3B), which show a peak when a molecule has a globular shape and a plateau when it is a random coil (Glatter and Kratky, 1982). The plots for the artificial proteins suggested a tangled shape for no. 320 (α+) and no. 292 (Cu+) and a typical extended coil for no. 287 (Cu+) (Figure 3B). Consequently, the structures of proteins no. 320 (α+) and no. 292 (Cu+) are rather more compact than if completely denatured. It is not known why protein no. 287 (Cu+) behaved as a compact molecule in size-exclusion chromatography.
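For readers unfamiliar with Guinier analysis, the sketch below shows how an Rg can be extracted from low-angle data using the standard Guinier approximation, ln I(q) ≈ ln I(0) − (Rg²/3)q², which is usually taken to hold for roughly q·Rg < 1.3. The data are synthetic and noise-free, generated from the Guinier law itself with Rg set to 26.6 Å purely for illustration; the q range, function name and fitting procedure are assumptions and do not reproduce the authors' actual data treatment.

```python
import math


def guinier_fit(q_values, intensities):
    """Least-squares fit of ln I(q) = ln I(0) - (Rg**2 / 3) * q**2.

    Returns (Rg, I0). With q in Å^-1, Rg is returned in Å.
    """
    x = [q * q for q in q_values]
    y = [math.log(i) for i in intensities]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("Guinier slope must be negative")
    intercept = mean_y - slope * mean_x
    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Synthetic, noise-free data (assumed values, not measurements)
RG_TRUE = 26.6                                      # Å
q_values = [0.010 + 0.002 * k for k in range(20)]   # Å^-1, q * Rg stays below 1.3
intensities = [math.exp(-(q * RG_TRUE) ** 2 / 3.0) for q in q_values]

rg_fit, i0_fit = guinier_fit(q_values, intensities)
print(f"fitted Rg = {rg_fit:.1f} Å, I(0) = {i0_fit:.3f}")
```

With real data the fit would be restricted to the linear low-q region after background subtraction, and aggregation would show up as upward curvature at the smallest angles.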



 
Fig. 3. (A) Radii of gyration (Rg) of microgene polymer proteins (red circles with error bars) and natural proteins in their native (blue circles) and denatured states (aqua triangles) plotted against molecular weight. The values of the microgene polymer proteins were deduced from Guinier plots (Glatter and Kratky, 1982). Those of the natural proteins were obtained from the literature. UBQ, ubiquitin; BLA, β-lactalbumin; SSI, SSI-dimer; CYT, cytochrome c; RNA, ribonuclease A; ALA, α-lactalbumin; LYS, lysozyme; SNC, staphylococcal nuclease; MYO, apo-myoglobin; PEP, aspergillopepsin II; OSP, Osp A; and TRP, the α-subunit of tryptophan synthase. Probable Rgs for Gaussian peptide chains (green rectangles) were estimated as $(92n/6)^{1/2}$, where n is the number of residues in the proteins. (B) Kratky plots (Glatter and Kratky, 1982) for proteins no. 320 (α+) (black), no. 292 (Cu+) (red) and no. 287 (Cu+) (blue). The plots show the relative scattering intensity $q^{2}I(q)/I(0)$ (Å$^{-2}$) against the scattering vector $q$ (Å$^{-1}$). The scattering vector q is defined as $4\pi\sin\theta/\lambda$, where 2θ is the scattering angle, and λ is the wavelength of the incident X-ray (1.488 Å). I(q) and I(0) are the observed intensities at scattering vector q and at the zero angle, respectively.

 
Crystallization

SAXS analyses showed that proteins no. 320 (α+), no. 292 (Cu+) and no. 287 (Cu+) do not fold into as compact a structure as natural proteins; instead they appear to be in a compact denatured or ‘gemisch’ state (Dill et al., 1995). This finding was predictable since the side chain packing that leads to interchain organization within proteins was not explicitly embedded in the design of the microgenes. Still, it is possible that these artificial proteins could undergo a transition to a more ordered state under appropriate conditions. In the case of some natural repetitious proteins, for example, local interactions among repeating units are sufficient to support a compactly folded structure (Kobe and Kajava, 2000). We tested this possibility by screening the crystallization conditions for proteins no. 320 (α+), no. 292 (Cu+) and no. 287 (Cu+) using reagents and methods usually applied to natural proteins. A protein crystal was obtained for no. 320 (α+) (Figure 4), and although it did not diffract well, a powder diffraction-like pattern was observed after a prolonged X-ray exposure. The three distinct diffractions observed suggest two possibilities: that the unit cell is 3D with dimensions 30 × 18 × 15 Å, or that it is 2D with dimensions 30 × 18 Å. In either case, formation of a crystal shows that this microgene polymer protein can generate an ordered structure under appropriate conditions.
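To give a sense of how close to the beam such low-resolution rings fall, the following Python sketch applies Bragg's law (λ = 2d sin θ, first order) to the three spacings quoted above, using the 1.542 Å wavelength stated in Materials and methods. The function name is hypothetical and the calculation is an illustration only, not a reconstruction of how the authors indexed the pattern.

```python
import math

WAVELENGTH_A = 1.542  # X-ray wavelength (Å) given in Materials and methods


def two_theta_deg(d_spacing_a, wavelength_a=WAVELENGTH_A):
    """Scattering angle 2θ in degrees for spacing d, from λ = 2 d sin θ."""
    return 2.0 * math.degrees(math.asin(wavelength_a / (2.0 * d_spacing_a)))


# Spacings (Å) of the three rings reported for the crystal of protein no. 320
for d in (30.0, 18.0, 15.0):
    print(f"d = {d:4.1f} Å -> 2θ = {two_theta_deg(d):.2f} degrees")
```

All three rings lie within a few degrees of the direct beam, which is consistent with the description of the crystal as poorly diffracting.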



 
Fig. 4. Photomicrograph of a crystal of protein no. 320 (α+) formed using the hanging drop method. Bar = 10 µm. The inset shows the powder diffraction image from the crystal. The three ring diffractions correspond respectively to 30 Å, 18 Å and 15 Å resolutions from the center to the periphery.

 

    Discussion
 
We have characterized six artificial proteins created using the microgene polymerization method described in a previous report (Shiba et al., 2002). With that method, a single designer microgene is polymerized in a head-to-tail manner using the MPR technique (Shiba et al., 1997). Because MPR randomly adds or deletes nucleotides at the junctions of repeating units, the translated products of the microgene repeats behave as a combinatorial library of three reading frames derived from a single microgene. By designing the microgene so that it contains no termination codon, we can prepare a gene library that contains long open reading frames.

Although the microgene polymerization method provides a molecular diversity library, it differs from the ‘random sequence library’ that has been used so successfully to evolve functional RNAs, DNAs, peptides and even proteins (Cwirla et al., 1990; Devlin et al., 1990; Ellington and Szostak, 1990; Scott and Smith, 1990; Tuerk and Gold, 1990; Keefe and Szostak, 2001). A random sequence library is usually prepared by combinatorial polymerization of four nucleotides and is often called a ‘naïve library’, which means the molecules represent a random sampling of a huge sequence space. By contrast, microgene polymers represent a ‘sparse sampling of a local sequence space’ that is considerably biased toward the peptide sequences coded by the microgene unit (Shiba, 1998). In this respect, the microgene polymerization method is more like protein engineering than in vitro evolution. In the former, the aim is to obtain a rational de novo design for a novel protein (Ulmer, 1983), while in the latter a neo-Darwinian selection from a large pool of random sequences is emphasized (Szostak, 1992). As we observed previously, the rationally embedded secondary structures in microgenes MG-15 (α) and MG-16 (β) were successfully expressed in the translated products of the polymers, as exemplified by proteins no. 320 (α+) and no. 416 (β+). In addition, the fact that products of MG-14 (Cu) polymers were blue in color when in solution with copper ion [no. 287 (Cu+) and no. 292 (Cu+)] suggests that functionality that is correlated with a specific peptide sequence can also be embedded into designer microgenes and can be reconstituted in proteins translated from microgene polymers. Finally, the fact that formation of a macroscopic membranous structure (Figure 2) was correlated with hydrophobicity and β-sheet-like structure suggests it may be possible to endow proteins with broadly specified physicochemical properties by adjusting the compositions and arrangements of the amino acids coded by a microgene.

In addition to allowing rational design, the microgene polymerization method also generates molecular diversity. For instance, two α-helix-rich proteins, no. 320 (α+) and no. 334 (α+), were translated from different combinations of frames of MG-15 (α), i.e. (5.5 × frame-1)–(1 × frame-3) for no. 320 (α+) and (9.5 × frame-1) for no. 334 (α+). At acidic pH, protein no. 320 (α+) lost its secondary structure, indicating α-helix formation to be pH-dependent in that protein (Shiba et al., 2002). Protein no. 334 (α+), on the other hand, retained its secondary structure at pH 2.0, but precipitated at pH 8.0 where protein no. 320 (α+) remained soluble (data not shown). Thus, variation in reading frame and/or the number of repeats can modulate the biochemical characteristics of microgene polymer proteins.

One of the unique biophysical properties of natural proteins is that they fold into specific and compact structures. By contrast, most artificial proteins adopt ensembles of expanded conformations that result in larger molecular radii and poor chemical shift dispersion in their 1H-NMR spectra (Davidson et al., 1995; Dill et al., 1995). Proteins created by microgene polymerization are no exception, with proteins no. 320 (α+), no. 292 (Cu+) and no. 287 (Cu+) showing rather expanded molecular structures in small angle X-ray scattering analyses (Figure 3), and proteins no. 291 (Cu-), no. 320 (α+), no. 377 (tRNA) and no. 416 (β+) showing limited dispersion in their 1H-NMR chemical shifts (K.Shiba, S.Yuzawa, H.Hatanaka and F.Inagaki, unpublished results). This so-called ‘gemisch’ state of artificial proteins is believed to be due to insufficient or improper side chain packing between polypeptide chains. Although it may be inherently difficult to rationally embed within microgene sequences interactions between side chains that are distant from one another, the packing of repeating units with their neighbors can be rationally designed. This implies that a type of ‘solenoid’ protein (Kobe and Kajava, 2000), such as those having LRR (leucine-rich repeats), β-helix and TPR (tetratricopeptide repeats) motifs, could be created using the microgene polymerization method. Furthermore, it is noteworthy that protein no. 320 (α+), whose microgene unit was designed from a natural coiled coil peptide, could be crystallized (Figure 4), demonstrating that under the appropriate conditions an artificial protein can be in an ordered state.

The ability to encrypt potential functions and structures in designer microgenes and then to construct a molecular pool having diverse physicochemical properties through polymerization of these microgenes could form the basis of a new approach to protein engineering that could be used to rationally reconstitute biological functions in artificial proteins.


    Acknowledgments
 
We are grateful to Dr Y. Muroga for useful discussions and comments. This work was supported in part by HFSP Research grant (K.S.) and by Grants-in-Aid from the Ministry (T. S.).


    References
 
Adman,E.T. (1991) Curr. Opin. Struct. Biol., 1, 895–904.

Altschul,S.F., Gish,W., Miller,W., Myers,E.W. and Lipman,D.J. (1990) J. Mol. Biol., 215, 403–410.

Barker,W.C., Ketcham,L.K. and Dayhoff,M.O. (1978) J. Mol. Evol., 10, 265–281.

Baron,M., Norman,D.G. and Campbell,I.D. (1991) Trends Biochem. Sci., 16, 13–17.

Cusack,S., Berthet-Colominas,C., Härtlein,M., Nassar,N. and Leberman,R. (1990) Nature, 347, 249–255.

Cwirla,S.E., Peters,E.A., Barrett,R.W. and Dower,W.J. (1990) Proc. Natl Acad. Sci. USA, 87, 6378–6382.

Davidson,A.R., Lumb,K.J. and Sauer,R.T. (1995) Nature Struct. Biol., 2, 856–864.

Devlin,J.J., Panganiban,L.C. and Devlin,P.E. (1990) Science, 249, 404–406.

Dill,K.A., Bromberg,S., Yue,K., Fiebig,K.M., Yee,D.P., Thomas,P.D. and Chan,H.S. (1995) Protein Sci., 4, 561–602.

Eckert,R.L. and Green,H. (1986) Cell, 46, 583–589.

Eigen,M. and Winkler-Oswatitsch,R. (1981) Naturwissenschaften, 68, 282–292.

Ellington,A.D. and Szostak,J.W. (1990) Nature, 346, 818–822.

Geourjon,C. and Deleage,G. (1994) Protein Eng., 7, 157–164.

Glatter,O. and Kratky,O. (1982) Small Angle X-ray Scattering. Academic Press, London.

Gualfetti,P.J., Iwakura,M., Lee,J.C., Kihara,H., Bilsel,O., Zitzewitz,J.A. and Matthews,C.R. (1999) Biochemistry, 38, 13367–13378.

Hayashi,C.Y. and Lewis,R.V. (2000) Science, 287, 1477–1479.

Kamatari,Y.O., Konno,T., Kataoka,M. and Akasaka,K. (1996) J. Mol. Biol., 259, 512–523.

Kamatari,Y.O., Konno,T., Kataoka,M. and Akasaka,K. (1998) Protein Sci., 7, 681–688.

Kamatari,Y.O., Ohji,S., Konno,T., Seki,Y., Soda,K., Kataoka,M. and Akasaka,K. (1999) Protein Sci., 8, 873–882.

Kataoka,M., Kuwajima,K., Tokunaga,F. and Goto,Y. (1997) Protein Sci., 6, 422–430.

Kataoka,M., Nishii,I., Fujisawa,T., Ueki,T., Tokunaga,F. and Goto,Y. (1995) J. Mol. Biol., 249, 215–228.

Keefe,A.D. and Szostak,J.W. (2001) Nature, 410, 715–718.

Kobe,B. and Deisenhofer,J. (1993) Nature, 366, 751–756.

Kobe,B. and Deisenhofer,J. (1995) Curr. Opin. Struct. Biol., 5, 409–416.

Kobe,B. and Kajava,A.V. (2000) Trends Biochem. Sci., 25, 509–515.

Koide,S., Bu,Z., Risal,D., Pham,T.N., Nakagawa,T., Tamura,A. and Engelman,D.M. (1999) Biochemistry, 38, 4757–4767.

Kojima,M., Tanokura,M., Maeda,M., Kimura,K., Amemiya,Y., Kihara,H. and Takahashi,K. (2000) Biochemistry, 39, 1364–1372.

Konno,T., Kataoka,M., Kamatari,Y., Kanaori,K., Nosaka,A. and Akasaka,K. (1995) J. Mol. Biol., 251, 95–103.

Lander,E.S. et al. (2001) Nature, 409, 860–921.

Leahy,D.J., Aukhil,I. and Erickson,H.P. (1996) Cell, 84, 155–164.

Li,L., Hong,R. and Hastings,J.W. (1997) Proc. Natl Acad. Sci. USA, 94, 8954–8958.

Liu,J.J. and Lindquist,S. (1999) Nature, 400, 573–576.

Ohno,S. (1981) Proc. Natl Acad. Sci. USA, 78, 7657–7661.

Ohno,S. (1984) J. Mol. Evol., 20, 313–321.

Ohno,S. and Epplen,J.T. (1983) Proc. Natl Acad. Sci. USA, 80, 3391–3395.

Panick,G., Vidugiris,G.J., Malessa,R., Rapp,G., Winter,R. and Royer,C.A. (1999) Biochemistry, 38, 4157–4164.

Rossmann,M.G., Moras,D. and Olsen,K.W. (1974) Nature, 250, 194–199.

Scott,J.K. and Smith,G.P. (1990) Science, 249, 386–390.

Shiba,K. (1998) J. Biochem. Mol. Biol., 31, 209–220.

Shiba,K., Takahashi,Y. and Noda,T. (1997) Proc. Natl Acad. Sci. USA, 94, 3805–3810.

Shiba,K., Takahashi,Y. and Noda,T. (2002) J. Mol. Biol., 320, 833–840.

Sosnick,T.R. and Trewhella,J. (1992) Biochemistry, 31, 8329–8335.

Sutherland,G.R. and Richards,R.I. (1995) Proc. Natl Acad. Sci. USA, 92, 3636–3641.

Szostak,J.W. (1992) Trends Biochem. Sci., 17, 89–93.

Tuerk,C. and Gold,L. (1990) Science, 249, 505–510.

Ueki,T., Hiragi,Y., Kataoka,M., Inoko,Y., Amemiya,Y., Izumi,Y., Tagawa,H. and Muroga,Y. (1985) Biophys. Chem., 23, 115–124.

Ulmer,K.M. (1983) Science, 219, 666–671.

Vold,B.S. and Green,C.J. (1988) J. Biol. Chem., 263, 14390–14396.

Wilkin,M.B., Becker,M.N., Mulvey,D., Phan,I., Chao,A., Cooper,K., Chung,H.J., Campbell,I.D., Baron,M. and MacIntyre,R. (2000) Curr. Biol., 10, 559–567.

Wolfe,K.H. and Shields,D.C. (1997) Nature, 387, 708–713.

Yan,Y., Winograd,E., Viel,A., Cronin,T., Harrison,S.C. and Branton,D. (1993) Science, 262, 2027–2030.

Yoder,M.D., Keen,N.T. and Jurnak,F. (1993) Science, 260, 1503–1507.

Yoshizaki,F., Sugimura,Y. and Shimokoriyama,M. (1981) J. Biochem. (Tokyo), 89, 1533–1539.

Yura,K., Tomoda,S. and Go,M. (1993) Protein Eng., 6, 621–628.

Zhang,S., Holmes,T., Lockshin,C. and Rich,A. (1993) Proc. Natl Acad. Sci. USA, 90, 3334–3338.

Received September 13, 2002; revised October 28, 2002; accepted October 31, 2002.




