©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
The Disulfide Folding Pathway of Human Epidermal Growth Factor (*)

Jui-Yoa Chang (1)(§), Patrick Schindler (1), Ueli Ramseier (1), Por-Hsiung Lai (2)

From the (1) Pharmaceuticals Research Laboratories, Ciba-Geigy Ltd., Basel CH-4002, Switzerland and the (2) Protein Institute Inc., Broomall, Pennsylvania 19008

ABSTRACT

Human epidermal growth factor (EGF) contains three disulfides and 53 amino acids. Reduced/denatured EGF refolds spontaneously in vitro to acquire its native structure. The mechanism of this folding process has been elucidated by structural analysis of both acid and iodoacetate trapped intermediates. The results reveal that the folding is accompanied by a sequential flow of unfolded EGF (0-disulfide) through three groups of folding intermediates, namely 1-disulfide, 2-disulfide, and 3-disulfide (scrambled) EGF isomers, to reach the native structure. Equilibrium occurs among isomers of each class of disulfide species, and the composition of intermediates appears to be highly heterogeneous. Together, at least 27 fractions of folding intermediates have been identified, but there exist only limited numbers of well populated species which constitute more than 80% of the total intermediates found during EGF folding.

Six species of such well populated intermediates have been isolated, comprising two 1-S-S, two 2-S-S, and two 3-S-S scrambled species. Their disulfide structures have been identified here. Both 1-S-S isomers are found to contain non-native disulfides. One of the 2-S-S species consists of two non-native disulfides and the other contains two native disulfides. Among the six disulfides of the two scrambled species, only one is native. Together, native disulfides constitute 25% of the total disulfides found in these six well populated intermediates. These results contrast sharply with those observed with bovine pancreatic trypsin inhibitor, for which it has been shown that well populated folding intermediates consist exclusively of native disulfides (Weissman, J. S., and Kim, P. S. (1991) Science 253, 1386-1393). We propose that well populated folding intermediates, regardless of whether they contain native or non-native disulfides, do not necessarily represent the productive species or specify the folding pathway.

Furthermore, conditions influencing the efficiency of EGF folding have been investigated. It is demonstrated here that under optimized compositions of redox agents, including the use of cysteine/cystine and protein disulfide isomerase, the in vitro folding of EGF could be achieved quantitatively within 1 min.


INTRODUCTION

Intermediates that occur in the folding pathway of disulfide-containing proteins were recognized in the pioneering work on bovine pancreatic trypsin inhibitor (BPTI) (Creighton, 1978, 1990) and ribonuclease A (Creighton, 1979; Konishi et al., 1981; Scheraga et al., 1984). In the case of BPTI, eight (including native) disulfide-bonded intermediates out of 75 possible species were initially described (Creighton, 1978; Creighton and Goldenberg, 1984). Some of those well populated 1- and 2-disulfide intermediates appeared to contain non-native disulfides and were proposed to be involved in the process of folding. This original BPTI model was recently re-examined using modern separation and analytical methodologies. In that study (Weissman and Kim, 1991), it was concluded that all well populated folding intermediates consisted of only native disulfide bonds. Vigorous debate ensued as a consequence of these discrepancies (Creighton, 1992; Weissman and Kim, 1992), and the discussion has focused mainly upon the importance of intermediates containing non-native disulfides. In those studies, however, no non-native 3-disulfide intermediates have been described.
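The figure of 75 possible species follows from counting the ways in which six cysteines can be paired. The short sketch below reproduces that count; the function name is hypothetical and is introduced only for illustration. Because EGF also contains three disulfides (six cysteines), the same numbers apply to it.

```python
from math import comb, factorial


def disulfide_isomers(n_cys: int, n_bonds: int) -> int:
    """Count the distinct ways to form n_bonds disulfides among n_cys cysteines."""
    paired = 2 * n_bonds
    if paired > n_cys:
        return 0
    # Perfect matchings among the chosen cysteines: (2k)! / (k! * 2^k)
    matchings = factorial(paired) // (factorial(n_bonds) * 2 ** n_bonds)
    # Choose which cysteines take part, then pair them up
    return comb(n_cys, paired) * matchings


# Six cysteines, as in BPTI and EGF
counts = {k: disulfide_isomers(6, k) for k in (1, 2, 3)}
total = sum(counts.values())
print(counts, total)
```

The 15 possible 3-disulfide species comprise the native structure plus 14 scrambled isomers.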

In contrast, kinetically trapped non-native 3-disulfide (scrambled) intermediates were detected and characterized in recent studies of recombinant hirudin (Chatrenet and Chang, 1992, 1993) and potato carboxypeptidase inhibitor (Chang et al., 1994). They were found reproducibly in high concentrations and were observed under a wide range of folding conditions, including favorable conditions which permit regeneration of the native protein to be completed within 30 s. Furthermore, the level of accumulation of the scrambled species has been shown to depend upon the redox potential applied and thus could be experimentally manipulated (Chang, 1994). In one case, more than 98% of the total sample was found to be trapped as scrambled intermediates before even a trace amount of the native structure appeared (Chang et al., 1994). These findings indicate that scrambled proteins may play an essential role along the pathway of productive folding. This proposal, however, contradicts conventional wisdom, which considers scrambled species to be abortive structures of "off-pathway" folding. One may further suggest that the presence of scrambled intermediates is not a general phenomenon and represents only isolated, unusual cases for hirudin and potato carboxypeptidase inhibitor.

In order to clarify these uncertainties, studies of the disulfide folding pathway using other comparable proteins are required. In this report, we use recombinant human epidermal growth factor (EGF), which is also a small, compact protein (53 amino acid residues) and, like hirudin, contains only antiparallel β-sheet and no α-helix (Carver et al., 1986; Montelione et al., 1987), as an example to study the behavior of folding intermediates and confirm the formation of non-native 3-disulfide intermediates during folding. We also examine conditions that enhance the efficiency of EGF folding.


EXPERIMENTAL PROCEDURES

Materials

Recombinant human epidermal growth factor (EGF) was derived from Escherichia coli cells and was supplied by Protein Institute Inc., Broomall, PA. The purity was greater than 98% as judged by SDS-polyacrylamide gel electrophoresis and N-terminal sequence analysis. The recombinant EGF is fully biologically active when compared with standards. Reduced glutathione (GSH), oxidized glutathione (GSSG), cysteine (Cys), cystine (Cys-Cys), thermolysin (P-1512), and Glu-C protease were obtained from Sigma. Protein disulfide isomerase (number 7318) was purchased from Takara, Kyoto, Japan.

Control Folding Experiments

Control foldings are those performed either in Tris-HCl buffer alone (control -) or in the same buffer containing 0.25 mM 2-mercaptoethanol (control +). Results of control foldings serve as standards for measuring the efficiencies of EGF folding in the presence of various redox agents.

EGF (1.5 mg) was dissolved in 0.5 ml of Tris-HCl buffer (0.5 M, pH 8.5) containing 5 M GdmCl and 30 mM dithiothreitol. Reduction and denaturation of EGF were carried out at 22 °C for 90 min. To initiate the folding, the sample was passed through a PD-10 column (Pharmacia) equilibrated in 0.1 M Tris-HCl buffer, pH 8.5. Desalting took about 1 min and unfolded EGF was recovered in 1.1 ml, which was immediately diluted with the same Tris-HCl buffer to a final protein concentration of 1 mg/ml, both in the absence (control -) and presence (control +) of 0.25 mM 2-mercaptoethanol. Folding intermediates were trapped in a time course manner by mixing aliquots of the sample with an equal volume of (a) 4% trifluoroacetic acid in water (reversible trapping) or (b) 0.4 M iodoacetic acid in the Tris-HCl buffer (0.5 M, pH 8.5) (irreversible trapping). In the case of iodoacetate trapping, carboxymethylation was performed at 22 °C for 30 min, followed by desalting using the PD-10 column. Trapped folding intermediates were separated by HPLC.
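The dilution and trapping steps imply a few simple concentrations that are worth making explicit. The sketch below works them out under the assumption, not stated in the text, that all 1.5 mg of EGF is recovered from the desalting column; the function names are hypothetical.

```python
def folding_volume_ml(protein_mg: float, target_mg_per_ml: float) -> float:
    """Total volume needed to reach the target protein concentration."""
    return protein_mg / target_mg_per_ml


def after_equal_volume_mix(concentration: float) -> float:
    """Concentration of any component after mixing with an equal volume."""
    return concentration / 2.0


protein_mg = 1.5   # assumed fully recovered from the PD-10 column
eluate_ml = 1.1    # volume in which unfolded EGF was recovered

total_ml = folding_volume_ml(protein_mg, 1.0)   # folding at 1 mg/ml
buffer_to_add_ml = total_ml - eluate_ml         # Tris-HCl buffer to add

tfa_final_percent = after_equal_volume_mix(4.0)      # acid trapping, % TFA
iodoacetate_final_m = after_equal_volume_mix(0.4)    # iodoacetate trapping, M
protein_final_mg_ml = after_equal_volume_mix(1.0)    # EGF during trapping

print(total_ml, round(buffer_to_add_ml, 2), tfa_final_percent,
      iodoacetate_final_m, protein_final_mg_ml)
```

Under that assumption, roughly 0.4 ml of buffer is added after desalting, and trapping takes place at about 2% trifluoroacetic acid or 0.2 M iodoacetic acid.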

Folding of EGF in the Presence of Redox Agents or Denaturants

The procedures of unfolding and refolding were as described for the control folding experiments. Selected concentrations of redox agents or denaturants were introduced immediately after unfolded EGF was desalted through the PD-10 column. Folding intermediates were trapped reversibly or irreversibly as described above.

Enzyme Digestion of Purified Well Populated Folding Intermediates

Six fractions of well populated intermediates (I-A, I-B, II-A, II-B, III-A, and III-B), derived from the "control -" folding and trapped by iodoacetate, were isolated for structural analysis. 1-Disulfide intermediates (I-A and I-B) (3 µg) were digested with 0.3 µg of Glu-C protease or trypsin in 30 µl of ammonium bicarbonate solution (50 mM, pH 8.0) for 16 h at 23 °C. In this case, fully reduced carboxymethylated EGF was processed in parallel as a control. The samples were then acidified with an equal volume of 4% trifluoroacetic acid and directly subjected to automatic sequencing. 2- and 3-disulfide intermediates (30 µg) were treated with 3 µg of thermolysin in 100 µl of N-ethylmorpholine/acetate buffer (50 mM, pH 6.4). Digestion was carried out at 23 °C for 16 h. Peptides were then isolated by HPLC and analyzed by amino acid sequencing and mass spectrometry.

Amino Acid Analysis, Amino Acid Sequencing, and MALDI Mass Spectrometry

Amino acid analysis was performed with the dabsyl chloride precolumn derivatization method (Chang and Knecht, 1991), which permits direct evaluation of the disulfide (cystine) content. Amino acid sequencing was done with either an Applied Biosystems 470A sequencer or a Hewlett-Packard G-1000A sequencer. The digests of I-A and I-B were mainly analyzed by the HP sequencer, because it gives more reliable quantitation of the recovery of PTH-Cys(Cm). Cystine-containing peptides were mostly analyzed by the ABI instrument. An internal standard, 2-nitroacetophenone, which eluted between PTH-His and PTH-Tyr, was introduced in order to ensure precise quantitation of PTH derivatives (Ramseier and Chang, 1994). It was predissolved in the solvent (2 µM) which transfers PTH derivatives from the conversion flask to the HPLC. During the analysis of cystine-containing peptides, a unique signal, di-PTH-cystine, appeared when both half-cystines were recovered in the same degradation cycle (Haniu et al., 1994). di-PTH-cystine is eluted near PTH-Tyr, but can be easily distinguished from the tyrosine derivative by an additional absorbance at 313 nm.

The MALDI mass spectrometer was a home-built time-of-flight instrument with a nitrogen laser of 337-nm wavelength and 3-ns pulse width. The apparatus has been described in detail elsewhere (Boernsen et al., 1990). The calibration was performed either externally or internally, by using standard proteins (hypertensin, Mr 1031.19; synacthen, Mr 2934.50; and calcitonin, Mr 3418.91). Analysis of the iodoacetate-trapped folding intermediates is further explained in the legend of Fig. 1.


Figure 1: Molecular mass of the iodoacetate-trapped folding intermediates of EGF. Time course trapped folding intermediates of EGF contain various concentrations of 0-disulfide (R), 1-disulfide (I), 2-disulfide (II), and 3-disulfide (III and N) species. As a result of carboxymethylation, these disulfide species can be well identified by MALDI mass spectrometry. Each additional pair of carboxymethylated cysteines increases the molecular mass by 118. Therefore, R, I, II, and III (N) exhibit molecular masses of 6570, 6452, 6334, and 6216, respectively. Each spectrum represents an accumulation of 100-150 shots. Peak response reflects the concentration of disulfide species in the folding intermediates.
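The mass ladder in the legend of Fig. 1 can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names and the matching tolerance are assumptions, while the masses and the increment of 118 are taken from the legend. The increment is consistent with two carboxymethyl groups (58 each) plus the two hydrogens gained on reducing a disulfide.

```python
NATIVE_MASS = 6216      # 3-disulfide species (III and N), from the legend
SHIFT_PER_PAIR = 118    # per pair of carboxymethylated cysteines, from the legend
LABELS = {0: "R", 1: "I", 2: "II", 3: "III/N"}


def trapped_mass(n_disulfides: int, total_disulfides: int = 3) -> int:
    """Expected mass of an iodoacetate-trapped species with n_disulfides intact."""
    free_pairs = total_disulfides - n_disulfides
    return NATIVE_MASS + SHIFT_PER_PAIR * free_pairs


def assign_species(observed_mass: float, tolerance: float = 20.0):
    """Return the label of the closest expected species, or None if none is close.

    The tolerance is a hypothetical value chosen for this example.
    """
    best = min(LABELS, key=lambda n: abs(trapped_mass(n) - observed_mass))
    if abs(trapped_mass(best) - observed_mass) > tolerance:
        return None
    return LABELS[best]


expected = [trapped_mass(n) for n in range(4)]
print(expected)
print(assign_species(6450))
```

Because scrambled and native 3-disulfide species carry no carboxymethyl groups, they share the same mass and can only be told apart chromatographically.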



Biological Assay of EGF

The biological activity of recombinant EGF was compared with the standard recombinant EGF using the assay method described previously (Savage et al., 1973). The ED50, as determined by the dose-dependent stimulation of thymidine uptake by Balb/c 3T3 cells, is 2.0 ng/ml. Refolded EGF was compared with a standard sample by an HPLC stability-indicating assay. The fully biologically active EGF samples and standard were assayed by the reversed phase HPLC method described in the legend of Fig. 2. Refolded native EGF was identified by comparing its HPLC profile with that of the standard.


Figure 2: Analysis of iodoacetate-trapped ( left column) and acid-trapped ( right column) folding intermediates of EGF by HPLC. Samples were obtained from the folding carried out in the Tris-HCl buffer alone (control -). Both sets of intermediates were analyzed by the same HPLC conditions. The column was Vydac C-18, 10 µm, for peptides and proteins. Solvent A was water containing 0.1% trifluoroacetic acid. Solvent B was acetonitrile/water (9:1, v/v) containing 0.1% trifluoroacetic acid. The gradient was 14-34% solvent B linear in 15 min, 34-56% solvent B linear from 15 to 50 min. The flow rate was 1 ml/min. Native EGF ( N) was eluted at 23.5 min. Iodoacetate- and acid-trapped starting materials ( R) were eluted at 36.9 and 43.4 min, respectively. For further analysis of these intermediates, see Figs. 3 and 4. It should be noted that iodoacetate and acid trapped intermediates did not behave identically under the same HPLC conditions. For instance, the majority of iodoacetate-trapped 2-disulfide intermediates were eluted within three different fractions (see 30-min sample, left column), whereas acid-trapped 2-disulfide intermediates were accumulated within one fraction (marked as II, right column). The patterns of 24-h samples were not affected by the methods of trapping because these samples contained only 3-disulfide species, both the scrambled (fractions 4 and 5, etc.) and the native EGF.
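The two-segment gradient given in the legend of Fig. 2 can be written as a piecewise linear function, which makes it easy to estimate the programmed solvent composition at a given retention time. This is a sketch under a stated simplification: it ignores any delay between the pump and the column, so the values are nominal rather than the composition actually reaching the column. The function names are hypothetical.

```python
def percent_b(t_min: float) -> float:
    """Programmed % solvent B at time t_min for the gradient of Fig. 2."""
    if t_min < 0 or t_min > 50:
        raise ValueError("gradient is defined from 0 to 50 min")
    if t_min <= 15:
        # 14-34% B, linear over the first 15 min
        return 14.0 + (34.0 - 14.0) * t_min / 15.0
    # 34-56% B, linear from 15 to 50 min
    return 34.0 + (56.0 - 34.0) * (t_min - 15.0) / 35.0


def percent_acetonitrile(t_min: float) -> float:
    """Solvent B is acetonitrile/water (9:1, v/v), so scale % B by 0.9."""
    return 0.9 * percent_b(t_min)


# Retention times quoted in the legend: native EGF (N) and the two forms of R
for label, t in (("N", 23.5), ("R, iodoacetate", 36.9), ("R, acid", 43.4)):
    print(label, round(percent_b(t), 1), round(percent_acetonitrile(t), 1))
```

On this nominal scale, native EGF elutes at roughly 39% solvent B, well before either form of the fully reduced starting material.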




RESULTS

Disulfide Content and Disulfide Species of Iodoacetate-trapped Folding Intermediates

Folding intermediates of EGF were first analyzed for their disulfide contents in order to evaluate the rate of disulfide formation during the folding. This was done with amino acid composition analysis (Chang and Knecht, 1991). Two sets of samples obtained from the folding experiments performed in the absence (control -) and presence (control +) of 2-mercaptoethanol (0.25 mM) were analyzed. The results showed that: (a) the decrease of cysteine (detected in the form of carboxymethylcysteine) was quantitatively accounted for by the recovery of disulfide, and (b) the rate of total disulfide recovery remained indistinguishable regardless of whether the folding was carried out in the absence or presence of 2-mercaptoethanol. In both experiments, three intact disulfides formed after 24 h of folding (data not shown).

The data for Cys/Cys-Cys composition played a crucial role in the identification of scrambled EGF. It was subsequently revealed by HPLC analysis that the yield of native EGF was indeed dependent upon the presence of 2-mercaptoethanol. In the presence of 2-mercaptoethanol, Cys, or reduced glutathione, the formation of three disulfide bonds was accompanied by the quantitative recovery of native EGF. Without 2-mercaptoethanol, about 45% of the 3-disulfide EGF was trapped as species distinguishable from the native one (see Fig. 2, 24-h samples). These trapped EGF species are scrambled non-native 3-disulfide species.

Folding intermediates of EGF were further characterized by MALDI mass spectrometry in order to determine the concentrations of disulfide species presented in the intermediates. The results were obtained from samples folded in the buffer alone (control -) and trapped by iodoacetic acid. The data (Fig. 1) demonstrate a sequential flow of unfolded EGF through 1- and 2-disulfide intermediates to the 3-disulfide species. The high level of accumulation of 2-disulfide intermediates indicated that the conversion of 2-disulfide species to 3-disulfide species constituted one of the major rate-limiting steps of EGF folding. The 24-h folded sample was shown to contain virtually only 3-disulfide species which further confirmed that the non-native species trapped in the 24-h sample (Fig. 2) are the scrambled EGF.

Characterization of the Heterogeneity of Folding Intermediates by HPLC

Folding intermediates of EGF were analyzed by HPLC. The raw data, presented in Fig. 2, were obtained from the samples of control - experiment trapped either by acid ( right column) or by iodoacetic acid ( left column). In order to be able to interpret these chromatograms, structural information of the fractionated intermediates was required. Therefore, 18 fractions of intermediates were first isolated from the 30-min iodoacetate-trapped sample (Fig. 3) and analyzed by mass spectrometry. Concentrations of disulfide species presented in each of those fractions are given in Fig. 4. The results revealed that this sample comprised a minimum of seven 1-disulfide isomers and 13 2-disulfide isomers. Most 1-disulfide species eluted at fractions 16 (I-A) and 17 (I-B) and 2-disulfide species mostly accumulated within fractions 3 (II-A) and 6 (II-B). Similar analysis of the 7-h and 48-h samples (Fig. 3) showed that fractions 4 (III-A) and 5 (III-B) contained predominantly (>92%) 3-disulfide scrambled species. The three groups of intermediates were extensively overlapped, but predominant fractions of these three disulfide species were fortunately well separated. It was also apparent that along the folding process, equilibrium existed among isomers of each disulfide species. For instance, the concentration of 2-disulfide intermediates ascended and then descended as folding progressed, but the relative ratio of fractions 3, 6, and 7 (which contained exclusively 2-disulfide species) remained constant. Scrambled 3-disulfide species and 1-disulfide species behaved similarly during the folding.


Figure 3: Heterogeneity of the folding intermediates of EGF exemplified by three time course trapped samples. The intermediates were trapped by iodoacetic acid and analyzed by HPLC using the conditions described in the legend of Fig. 2. The 30-min trapped sample contained primarily the unfolded EGF (R), 1-disulfide (I), and 2-disulfide (II) intermediates. Eighteen fractions of this sample were isolated and characterized by mass spectrometry. Contents of various disulfide species within each fraction are given in Fig. 4. The majority of 1-disulfide species were eluted within fractions 16 (I-A) and 17 (I-B), whereas most 2-disulfide species were eluted within fractions 3 (II-A) and 6 (II-B). The 7-h sample comprised 2-disulfide species (II, fractions 3, 6, and 7, etc.), 3-disulfide scrambled species (III, fractions 4 and 5, etc.), and the 3-disulfide native species (N, fraction 1). The 24-h sample contains only 3-disulfide species, of which about 45% are native EGF and 55% are scrambled EGF. Two well populated species of scrambled EGF are eluted at fractions 4 (III-A) and 5 (III-B).




Figure 4: Analysis by mass spectrometry of folding intermediates of EGF isolated by HPLC. Eighteen fractions were isolated from the 30-min sample (trapped by iodoacetate) (see Fig. 3) and analyzed by MALDI mass spectrometry in order to determine the disulfide species contained in each fraction. The content of disulfide species is determined by the mass peak height and expressed as percentage in each fraction. The data should be allowed a standard deviation of ± 10%. Fractions 4 and 5 contain only minute amounts of intermediates and are shown to be comprised of about 50% each of 2-disulfide and 3-disulfide (scrambled) species. A separate analysis of the 7-h trapped sample shows that fractions 4 and 5 contain predominantly (>90%) 3-disulfide species.
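The percentages in Fig. 4 are described as being derived from mass peak heights within each fraction. A minimal sketch of that normalization is given below; the function name and the example peak heights are invented for illustration and are not data from this study.

```python
def composition_percent(peak_heights: dict) -> dict:
    """Convert mass peak heights for one HPLC fraction into percentages."""
    total = sum(peak_heights.values())
    if total <= 0:
        raise ValueError("no signal in this fraction")
    return {species: 100.0 * height / total
            for species, height in peak_heights.items()}


# Hypothetical peak heights (arbitrary units) for a single fraction
example_fraction = {"R": 1.0, "I": 1.0, "II": 2.0}
example_percent = composition_percent(example_fraction)
print(example_percent)
```

Given the stated uncertainty of about ± 10%, values obtained in this way are best read as approximate proportions rather than precise concentrations.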



These data demonstrated that the folding pathway of EGF was characterized by a sequential flow of unfolded EGF (R) through three groups of equilibrated intermediates, namely, 1-disulfide, 2-disulfide, and 3-disulfide (scrambled) isomers. With the control - folding experiment, 45% of the folding intermediates were stuck as scrambled species (see Fig. 2, 24-h sample), unable to convert to the native EGF due to the lack of free thiols to catalyze their disulfide reshuffling. This problem was overcome by including 2-mercaptoethanol (data not shown) in the folding buffer, in which recoveries of native EGF were found to be greater than 96% after 24 h of folding.

The HPLC profiles of acid-trapped intermediates (Fig. 2, right column) did not fully resemble those of their iodoacetate-trapped counterparts. Notably, most acid-trapped 2-disulfide species were eluted within the same fraction (the peak marked as II). Thus, interpretation of EGF folding based on the analysis of acid-trapped samples requires caution. The predominance of a single peak containing 2-disulfide species can easily lead to the oversimplified conclusion that folding of EGF proceeds through only one species of 2-disulfide intermediate. The pattern of the 24-h acid-trapped sample is indistinguishable from that of the iodoacetate-trapped sample, because both contained only 3-disulfide EGF.

Determination of the Disulfide Linkages of Well Populated 1-Disulfide Intermediates (I-A and I-B)

This can be achieved by a number of strategies. The most common one is "peptide mapping." This requires isolation and analysis of every enzymatically fragmented peptide, as will be shown in the following section. Alternatively, it can be done by selective labeling of disulfide bonds (after reduction) with a colored (Chang, 1993) or fluorescent thiol-specific reagent (Weissman and Kim, 1991). Both methods need microgram amounts of the intermediates, HPLC separation of peptides, and numerous rounds of sequence analysis.

The most sensitive and effective method, however, is to take advantage of modern Edman chemistry and the known sequence of EGF by direct sequencing of the peptide mixture of 1-disulfide intermediates. This strategy is sketched in Fig. 5 and described as follows: 1) select an enzyme that will produce a mixture of peptides, with all cysteines located at different positions in the peptide sequences; 2) subject the peptide mixture to automatic sequencing and quantitate recoveries of PTH-Cys(Cm) at expected cycles of Edman degradation. Cysteines which are not involved in disulfide pairings will be recovered as PTH-Cys(Cm), and those engaged in disulfide linkage will generate a blank gap; 3) compare the results obtained from the folding intermediates with those of the control sample (fully reduced carboxymethylated EGF). This method requires low picomoles (nanograms) of sample, no HPLC separation of peptides, and basically only one sequence analysis for each intermediate.
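The comparison in step 3 can be expressed as a simple ratio test. The sketch below is illustrative only: the six cycle numbers are those listed in the legend of Fig. 5, whereas the recovery values, the cutoff, and the function name are invented for the example and do not reproduce the data of Fig. 6.

```python
# Edman cycles at which the six half-cystines are expected (legend of Fig. 5)
CYS_CYCLES = (2, 6, 7, 9, 14, 20)


def paired_cycles(control: dict, sample: dict, cutoff: float = 0.5) -> list:
    """Return cycles where Cys(Cm) recovery drops below cutoff x control.

    Normalizing to the control at the same cycle compensates for the
    cycle-dependent yield of Edman degradation. The cutoff is a
    hypothetical value chosen for this example.
    """
    gaps = []
    for cycle in CYS_CYCLES:
        ratio = sample[cycle] / control[cycle]
        if ratio < cutoff:
            gaps.append(cycle)
    return gaps


# Hypothetical PTH-Cys(Cm) recoveries (pmol); not measured values
control = {2: 100.0, 6: 90.0, 7: 88.0, 9: 85.0, 14: 70.0, 20: 60.0}
sample = {2: 95.0, 6: 8.0, 7: 84.0, 9: 80.0, 14: 5.0, 20: 55.0}

gaps = paired_cycles(control, sample)
print(gaps)
```

For a 1-disulfide intermediate exactly two cycles should show a gap, and the cysteines expected at those two cycles form the disulfide.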


Figure 5: Peptides of EGF derived from Glu-C protease digestion. Cleavages at the three indicated positions are specific and quantitative. Glu-Cys was not digested by Glu-C protease at all. When these four peptides are collectively sequenced, the six half-cystines are recovered at six different cycles of Edman degradation. Cys, cycle 2; Cys, cycle 6; Cys, cycle 7; Cys, cycle 9; Cys, cycle 14; Cys, cycle 20. Their recoveries as free cysteine (carboxymethylated) are used to identify the disulfide structures of 1-disulfide intermediates (see Fig. 6).



For EGF, such peptide mixtures could be generated by either trypsin or Glu-C protease digestion (Fig. 5). The sequencing data obtained from the analysis of Glu-C digests are given in Fig. 6. They show unambiguously that I-A and I-B contain Cys-Cys and Cys-Cys, respectively; both are non-native disulfides (Fig. 8). The results obtained from trypsin digests are equally conclusive (data not shown).


Figure 6: Identification of the disulfide structures of 1-disulfide folding intermediates of EGF. Folding intermediates ( I-A and I-B) were digested by Glu-C protease and peptides (Fig. 5) collectively sequenced by automatic Edman degradation. Recoveries of Cys(Cm) were quantitated at expected cycles and were compared with those obtained from the control sample (fully reduced carboxymethylated EGF). Cysteines which are involved in the disulfide pairing will not be recovered as Cys(Cm).




Figure 8: Disulfide structures of six well populated folding intermediates of EGF. The arrows do not imply the direct conversion between the indicated species.



Assignments of the Disulfide Pairings of 2-Disulfide (II-A and II-B) and 3-Disulfide (III-A and III-B) Intermediates

The choice of methods for elucidating the disulfide structures of 2- and 3-disulfide intermediates is limited to the technique of peptide mapping. In this approach, selection of enzymes is critical. The digestion should be carried out at neutral or acidic pH and allow at least partial cleavage at peptide bonds between all neighboring cysteines. Thermolysin has been found to be an ideal enzyme for this purpose. Peptides were separated by HPLC (Fig. 7). Cystine-containing and non-cystine-containing peptides can generally be distinguished: those which do not appear consistently in all mappings most likely contain disulfides. All peptides were analyzed by amino acid sequencing and mass spectrometry. Crucial data which permit assignments of disulfide pairings are presented in Table I. Two cystine peptides, with nearly equal recoveries and corresponding to two native disulfides, Cys-Cys and Cys-Cys, were found in II-A-7 and II-A-12, respectively. Despite the shoulder peak of II-A-12, sequence and mass analysis have revealed no contaminants of minor sequences. The two disulfide bonds of species II-B were also found in two major peaks. II-B-15 consisted of three peptides linked by two disulfide bridges, which could be oriented in a combination of either Cys-Cys/Cys-Cys or Cys-Cys/Cys-Cys. The finding of Cys-Cys in peak II-B-7 confirms that the former structure is the correct one. In this intermediate, both disulfides are non-native.


Figure 7: Mappings of thermolytic peptides derived from the well populated 2- and 3-disulfide intermediates. Peptides were analyzed by amino acid sequencing and mass spectrometry. Data obtained from major cystine-containing peptides (numbered) were used to construct the disulfide structures of II-A, II-B, III-A, and III-B (see Fig. 8). Chromatographic conditions are similar to those described in the legend of Fig. 2, except for using a different gradient, 5%-22% solvent B linear in 32 min and 22%-50%B from 32 to 45 min.



Scrambled EGFs are 3-disulfide species. For III-A, the disulfides were detected in five peaks. Cys-Cys and Cys-Cys were identified in III-A-5 and III-A-10. Cys-Cys was found in three different peaks (III-A-13, III-A-15, and III-A-17), due to nonspecific cleavages by thermolysin. In III-A, all disulfides are non-native. III-B is the most predominant scrambled species. Its three disulfides were recovered in four major peaks. Cys-Cys was found in III-B-2 and Cys-Cys was detected in III-B-8. The third disulfide of III-B, Cys-Cys, was found in III-B-14 as well as the tailing shoulder (right-hand) of III-B-11. For all four well populated intermediates, there is no evidence of contamination by minor species (<10%). The results of their disulfide structures are summarized in Fig. 8.

The Efficiency of EGF Folding Is Regulated by the Applied Redox Potential

Two systems of redox agents, GSH/GSSG and Cys/Cys-Cys, were evaluated here. The effect of GSH was found to be similar to that of 2-mercaptoethanol, which was to promote the conversion of scrambled EGF to the native EGF. In achieving this, it neither accelerated the flow of intermediates between unfolded species and scrambled species nor altered the pattern of intermediate composition. The only obvious differences between foldings performed with and without GSH were the level of accumulation of scrambled species and the recovery of native EGF. Without GSH (control -), about 50% of EGF was trapped as scrambled species. In the presence of GSH, the yield of native EGF was nearly quantitative after 24 h of folding.

GSSG played a different role. It enhanced the flow of intermediates between 0-disulfide EGF and scrambled species and as a consequence also accelerated the recovery of native EGF during the early phase of folding. In the presence of 0.5 mM GSSG, the only detectable intermediates after 3 h of folding were scrambled species, and a substantial portion of scrambled EGF also became trapped, unable to convert to the native EGF even after 24 h of folding under these conditions. By including a mixture of GSH/GSSG in the folding solution, both the flow of intermediates and the conversion of scrambled EGF to the native EGF were accelerated. Under these conditions, folding of EGF was achieved quantitatively within 4 h. Cys/Cys-Cys also regulated the folding of EGF through a similar mechanism, except that it was more potent than the GSH/GSSG system. Direct comparison of Cys-Cys and GSSG indicated that the former was about 5-10-fold more effective (on an equal molar basis) in promoting the flow of intermediates to the 3-disulfide states (Fig. 9). In another experiment, it was demonstrated that trapped scrambled EGF species were able to reshuffle their mismatched disulfides to acquire the native structure within 1 h when 1 mM Cys was introduced. Along this process of reorganization (consolidation), scrambled species remained in equilibrium.


Figure 9: Effect of GSSG and Cys-Cys on promoting the disulfide formation during the folding of EGF. Unfolded EGF was allowed to refold in the Tris-HCl buffer containing the indicated concentrations of GSSG or Cys-Cys. Folding intermediates were trapped by acid and analyzed by HPLC. The recoveries of five different species of EGF, namely 0-disulfide (R), 1-disulfide (I), 2-disulfide (II), 3-disulfide scrambled (III), and native (N) species, are evaluated quantitatively from each time course trapped sample. Mixed disulfide species, which can be as much as 5-12% of the total sample when folding was performed in the presence of 0.5 mM GSSG or Cys-Cys, are not included in the calculation.



In Vitro Folding of EGF Can Be Achieved Quantitatively within 1 Min

The above findings suggested that both the speed of EGF folding and the recovery of native EGF could be greatly improved under optimized compositions of redox agents. To demonstrate this potential, unfolded EGF was refolded in the Tris-HCl buffer containing 4 M sodium chloride and in the presence of the following redox systems: (a) GSH/GSSG (4 mM/2 mM); (b) Cys/Cys-Cys (4 mM/2 mM); and (c) Cys/Cys-Cys (4 mM/2 mM) plus protein disulfide isomerase (40 µM). Selection of these conditions was intended to (a) allow direct comparison of the potencies of the GSH/GSSG and Cys/Cys-Cys systems and (b) assess the efficacy of protein disulfide isomerase (Epstein et al., 1963; Freedman, 1984; Bulleid, 1993). The outcome was judged by the rate of recovery of native EGF. It revealed that Cys/Cys-Cys was 10-fold more effective than GSH/GSSG in promoting the formation of native EGF. The improvement was multiplied by another 7-fold when 40 µM protein disulfide isomerase was added. Under these optimized conditions, folding of EGF was completed within 1 min (Fig. 10).


Figure 10: HPLC chromatograms of the accelerated folding pathway of EGF. The folding was carried out in the Tris-HCl buffer (0.1 M, pH 8.5) containing 4 M NaCl and in the presence of the following redox agents: A, Cys/Cys-Cys (4 mM/2 mM) plus protein disulfide isomerase (40 µM); B, Cys/Cys-Cys (4 mM/2 mM). The folding intermediates were trapped by acid. The elution position of the fully reduced EGF (R) is shown by an open arrow. III and II indicate the two most dominant fractions of 3-disulfide (scrambled) and 2-disulfide EGF. Under conditions in A, folding of EGF is completed within 1 min. Without protein disulfide isomerase (conditions in B), quantitative folding is achieved in 15-20 min. Significant concentrations of species containing Cys mixed disulfide were shown to be present along the folding pathway (the three peaks marked by arrows and eluted between N and II). Those mixed-disulfide species apparently exist in equilibrium with scrambled 3-disulfide intermediates, as their concentrations are dependent upon the amounts of Cys-Cys applied.



Folding of EGF in the Presence of Denaturants

Denaturants (8 M urea or 5 M GdmCl) were included in the folding buffer in order to examine their effects on the folding mechanism of EGF. EGF was allowed to refold in the presence of 8 M urea without or with 0.25 mM 2-mercaptoethanol (8 M urea - and 8 M urea +). These two experiments were repeated in the presence of 5 M GdmCl (5 M GdmCl - and 5 M GdmCl +). Folding intermediates were trapped by iodoacetic acid and analyzed by HPLC. The results were compared to those obtained from control experiments (control - and control +).

Reduced/denatured EGF is able to refold in the presence of denaturant to form the native structure. The recovery depends both upon the potency of the denaturant and upon the presence of supplementary free thiols (). The data clearly demonstrate that the forces which guide correct folding of EGF have not been nullified in the presence of either 8 M urea or 5 M GdmCl. Analysis of the folding intermediates further reveals the effect of denaturant on the kinetics of EGF folding (Fig. 11). The results are summarized as follows. (a) Denaturant exerts only a minimal influence on the apparent compositions of 1- and 3-disulfide intermediates. However, it does affect the 2-disulfide species. The concentration of the major 2-disulfide fraction (fraction 3, see Fig. 11, left column) is reduced by 70% in the presence of denaturant. This suggests that species eluted within fraction 3 adopt favored conformations which are partially abrogated by the denaturant. Interestingly, this 2-disulfide species (II-A) has been shown to contain two native disulfides (Fig. 8). This finding is also consistent with the observation that the flow from the 1-disulfide species to the 2-disulfide species slows down considerably in the presence of denaturant (Fig. 11). (b) The most significant effect of denaturant is to disrupt the process of consolidation and diminish recovery of the native EGF. The yield of native EGF was only 8-9% when folding was performed in the presence of 5 M GdmCl without 2-mercaptoethanol ().


Figure 11: Folding of EGF in the absence (Control -) and presence (8 M urea -) of denaturant. Reduced EGF was allowed to refold in the Tris-HCl buffer alone (Control -) or in the same buffer containing 8 M urea (8 M urea -) (``-'' indicates that folding was performed in the absence of 2-mercaptoethanol). Folding intermediates were trapped by iodoacetate. Predominant fractions of 1-disulfide (peaks 16 and 17), 2-disulfide (peaks 3 and 6), and 3-disulfide scrambled (peaks 4 and 5) intermediates are indicated.




DISCUSSION

Comparison of the Folding Mechanisms of EGF and Hirudin

The folding mechanism of EGF described here is fundamentally indistinguishable from that observed with hirudin (Chatrenet and Chang, 1992, 1993; Chang, 1994). In addition to the degree of complexity of folding intermediates and the mode of their progression along the pathway, both foldings are governed by the redox potential in identical ways. Even the effect of denaturant on the efficiency of EGF and hirudin folding is hardly distinguishable (). The most striking similarity, however, is the formation of scrambled species and the mechanism by which they accumulate in the control - folding experiments. Scrambled EGF species are unable to reshuffle their non-native disulfides and convert to the native structure unless free thiols are available as catalyst. When folding is carried out in the buffer alone, free cysteines of 0-, 1-, and 2-disulfide species function as thiol catalyst during the early phase of folding. As the folding advances, more cysteines become involved in disulfide pairing and fewer are available as thiol catalyst. Therefore, scrambled species accumulate and become trapped. The remarkable outcome is that, for both proteins, the same percentage (40-50%) of the starting material ends up being trapped in the scrambled states. Even under extremely favorable folding conditions that involve optimized redox potentials, high concentrations of scrambled EGF and hirudins (Chang, 1994) were still observed.
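The depletion of thiol catalyst described above follows from simple bookkeeping: each disulfide formed consumes two cysteines. A minimal sketch, assuming that only intramolecular disulfides are counted (mixed disulfides with external thiols are ignored) and using a hypothetical helper name:

```python
TOTAL_CYS = 6  # EGF has three disulfides, hence six cysteines

def free_thiols(n_disulfides: int) -> int:
    """Number of unpaired cysteines in an intermediate with n disulfides."""
    if not 0 <= n_disulfides <= TOTAL_CYS // 2:
        raise ValueError("EGF can hold between 0 and 3 disulfides")
    return TOTAL_CYS - 2 * n_disulfides

# Free cysteines available as thiol catalyst in each class of intermediate.
thiols_by_class = {n: free_thiols(n) for n in range(4)}
for n, thiols in thiols_by_class.items():
    print(f"{n}-disulfide species: {thiols} free cysteines")
```

The 3-disulfide scrambled species carry no free cysteine, which is why they cannot reshuffle once the pool of 0-, 1-, and 2-disulfide species is exhausted.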

The problem is how to interpret the role of scrambled species as folding intermediates. From the standpoint of a strictly kinetic analysis of disulfide bond formation, scrambled species are destined to be dead-end products, since their conversion to the native structure requires disulfide reshuffling and in practice they must return to the 2- or 1-disulfide species. However, disulfide formation is a signal used to trace, not to define, the mechanism of protein folding. From the viewpoint of thermodynamics (Anfinsen, 1973), the presence of scrambled species as folding intermediates may become more understandable. Folding of EGF, as well as hirudin and potato carboxypeptidase inhibitor (Chang et al., 1994), undergoes an initial stage of nonspecific packing, which leads to the formation of scrambled species as folding intermediates. This is followed by reorganization and consolidation of scrambled species to reach the native structure. Thermodynamically, scrambled species simply represent a state of more advanced packing and lower free energy than that of 2-disulfide intermediates. Their conversion to the native structure, although accompanied by disulfide reshuffling, does not necessarily require substantial unfolding of the compactness that they have already attained.

There are nonetheless some important differences between the properties of the folding intermediates of EGF and hirudin. One is displayed by their behavior on reversed phase HPLC. In the case of hirudin, all intermediates, including the scrambled species, are clustered near the unfolded hirudin and far apart from the native hirudin (see Fig. 2 of Chatrenet and Chang (1993)), which implies that there exists a wide gap of hydrophobicity between native hirudin and all species of intermediates. The folding intermediates of EGF, on the other hand, are evenly distributed between the unfolded and the native species (Fig. 3). The second dissimilarity is the kinetics of flow of intermediates from 2-disulfide to 3-disulfide species. Unlike hirudin, conversion of 2-disulfide EGF to scrambled 3-disulfide species represents one of the major rate determining steps of EGF folding. This is best illustrated by comparing their foldings in the presence of Cys-Cys (0.5 mM). In the case of hirudin, more than 98% of the intermediates accumulated as scrambled species after 5 min of folding (Chang, 1994). By contrast, only 70% of the folding intermediates of EGF reached the scrambled species, with the remaining intermediates retained as 2-disulfide species under the same conditions (Fig. 9). The high level of accumulation of 2-disulfide EGF, in part, reflects their stability. Indeed, their stability is likely attributable to the species eluted within fraction 3 (II-A) (Fig. 3). This is also supported by the observation that the concentration of fraction 3 is drastically reduced in the presence of denaturant (Fig. 11). For hirudin, denaturant affects only the ratio of 3-disulfide scrambled species and exerts no visible influence on the compositions of 1- and 2-disulfide intermediates (Chatrenet and Chang, 1993).

Comparison of the Folding Mechanisms of EGF and BPTI

The mechanism of EGF folding also displays a number of intriguing similarities and dissimilarities to that of the BPTI model (Weissman and Kim, 1991). BPTI is a protein consisting of 58 amino acids and three disulfides, and a model of protein folding which has been characterized in detail (Creighton, 1978, 1990; Creighton and Goldenberg, 1984). One conspicuous feature is that the HPLC pattern of the folding intermediates of EGF (Fig. 3) closely resembles that of BPTI (Weissman and Kim, 1991). Their similarities are as follows: 1) the fully reduced species (R) and the correctly folded native species (N) are widely separated on reversed phase HPLC, with all folding intermediates eluted in between; 2) despite the heterogeneity of minor species, there exist limited numbers (about five to six) of well populated folding intermediates; 3) well populated intermediates can be classified into three distinct groups (I, II, and III). Group I is eluted near R. Groups II and III are close to N. They progress along the folding pathway sequentially to reach the native structure. Thus, during the folding, the decline of group I is accompanied by the emergence of group II, which then gradually disappears when group III begins to build up. For different reasons, group III can become trapped and unable to convert to the native structure; 4) for both EGF and BPTI, group I has been shown to comprise 1-disulfide species, and group II contains 2-disulfide species. One major distinction between these two models is the nature of the species found in group III. They are identified as 3-disulfide scrambled species in EGF, but have been characterized as 2-disulfide species containing two native disulfides in the case of BPTI (Weissman and Kim, 1991). Most importantly, well populated folding intermediates of BPTI have been shown to contain exclusively native disulfides (Weissman and Kim, 1991). This finding has far reaching implications. It suggests that the same interactions which stabilize the native structure also guide the entire process (pathway) of folding (Rose, 1979; Oas and Kim, 1988; Kim and Baldwin, 1990). If this phenomenon were a general rule governing protein folding, one would expect it to apply to EGF as well. The results with EGF clearly indicate that there are exceptions.

However, interpretation of our results hinges upon the definition of ``folding pathway.'' The crucial question is what type of intermediates actually constitute the folding pathway: the well populated species, or the productive species which account for the flow of intermediates? Unlike a conventional ``biochemical'' pathway which tracks intermediates having defined covalent structures, the intermediates of the ``disulfide folding pathway'' are composed of isomers existing in a state of dynamic equilibrium. Under these circumstances, well populated intermediates cannot be presumed to be productive species, even if they do contain native disulfides. Thermodynamically, well populated intermediates are favored in an equilibrium because of their lower free energy and better stability. They are thus likely to be more complacent and less productive (Chang, 1993). Indeed, well populated intermediates may only serve as ``parking lots'' of productive species along the pathway. This pitfall has also been pointed out by Creighton (1992).
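The thermodynamic argument above can be illustrated with a Boltzmann distribution, in which the equilibrium population of each isomer is proportional to exp(-G/RT). The following is a minimal sketch only: the free energy values, the temperature, and the isomer labels are hypothetical and were not measured in this work.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)
T = 298.0     # temperature in K (assumed for illustration)

def equilibrium_fractions(free_energies):
    """Boltzmann populations for isomers with the given free energies (kJ/mol)."""
    weights = {name: math.exp(-g / (R * T)) for name, g in free_energies.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Hypothetical relative free energies of three isomers within one disulfide class.
hypothetical = {"major": 0.0, "minor_1": 6.0, "minor_2": 12.0}
fractions = equilibrium_fractions(hypothetical)
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")
```

With these assumed values the lowest-energy isomer holds most of the population, while an isomer only 12 kJ/mol higher falls below 1%. The populations by themselves say nothing about which isomer carries the flow toward the native structure.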

If productive species are chosen to specify the folding pathway of EGF (which we think is the correct definition), these species still remain to be identified. They cannot be simply deduced from the kinetic analysis of well populated species described here. One way to identify the productive intermediates is to perform stop/go folding experiments on all species of intermediates, both major and minor, that are present at the same stage of equilibrium. For EGF, as well as for BPTI, there are 15 possible 1-disulfide isomers and 45 possible 2-disulfide isomers. All these species have to be trapped alive (e.g. by acid), purified to homogeneity, structurally characterized, and kinetically analyzed by stop/go folding experiments in order to fish out the productive species (Chang, 1993). The dilemma faced by this approach is that productive intermediates may exist as minor species with concentrations that are less than 1% of the well populated intermediates. Finding all these minor species will be a daunting, if not impossible, task. The predicament can be further complicated by the argument that the undetected is not necessarily non-existent. Alternatively, one may chemically synthesize all possible isomers. Theoretically, this is feasible and can be achieved by selective and stepwise deblocking of desired disulfide pairs. Again, this will be a formidable challenge.
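The isomer counts quoted above follow from the number of ways of pairing six cysteines. A short enumeration makes this explicit; it is a sketch in which the cysteines are labelled 1 to 6 by order in the sequence (labels chosen for illustration) and the function name is hypothetical.

```python
from itertools import combinations

CYSTEINES = range(1, 7)  # six cysteines, labelled 1..6 for illustration

def disulfide_isomers(n_disulfides):
    """All sets of n disjoint cysteine pairs drawn from six cysteines."""
    pairs = list(combinations(CYSTEINES, 2))
    isomers = []
    for combo in combinations(pairs, n_disulfides):
        used = [cys for pair in combo for cys in pair]
        if len(used) == len(set(used)):  # no cysteine may be used twice
            isomers.append(combo)
    return isomers

counts = {n: len(disulfide_isomers(n)) for n in (1, 2, 3)}
print(counts)  # {1: 15, 2: 45, 3: 15}
```

The same enumeration gives 15 possible 3-disulfide isomers, of which one is the native pairing and the remaining 14 are scrambled.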

Even if well populated intermediates are selected to construct the folding pathway of EGF, there are still serious deficiencies (Fig. 8). Both well populated 1-disulfide species are non-native (Cys-Cys and Cys-Cys), and they are not found in the major 2-disulfide species. The only common characteristic shared by these two non-native disulfides is that they are the smallest disulfide loops, aside from Cys-Cys. Therefore, some unidentified 1-disulfide species most likely act as productive species that account for the flow between 1- and 2-disulfide intermediates. Of the 2-disulfide species, one (II-A) contains two native disulfides, and the other (II-B) contains two non-native disulfides that are found in one of the scrambled species (III-A) as well. Here, one may suggest a concise two-pathway model in which II-A converts to the native species (on-pathway) and II-B goes to III-A (off-pathway), which subsequently equilibrates with III-B and other minor scrambled species (Fig. 8). This is a tempting conclusion, but there is no proof of it, for two reasons. First, there is no evidence that II-A transforms to N directly without undergoing additional disulfide rearrangements. There are at least 15 fractions of minor 2-disulfide species, and they exist in equilibrium with II-A and II-B. Second, scrambled EGF species form reproducibly in significant quantity during the folding, regardless of whether folding is carried out under favorable or unfavorable conditions. They cannot be simply dismissed as abortive structures of ``off-pathway'' folding. In our opinion, scrambled species are legitimate intermediates and passages to the native structure. Thermodynamically (Anfinsen, 1973; Anfinsen et al., 1961), their presence as folding intermediates is perfectly logical.
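The proportion of native disulfides among the six well populated intermediates can be tallied directly. In the sketch below, the counts for the 1- and 2-disulfide species follow the paragraph above, and the two scrambled species are pooled because the Abstract states only that one of their six disulfides is native. The dictionary layout is illustrative.

```python
# (native, non-native) disulfide counts for the well populated intermediates.
# The two scrambled species are pooled: one native disulfide out of six.
intermediates = {
    "I-A": (0, 1),
    "I-B": (0, 1),
    "II-A": (2, 0),
    "II-B": (0, 2),
    "III-A + III-B": (1, 5),
}

native = sum(n for n, _ in intermediates.values())
total = sum(n + m for n, m in intermediates.values())
native_fraction = native / total
print(f"{native} of {total} disulfides are native ({native_fraction:.0%})")
```

This reproduces the 25% figure given in the Abstract: 3 native disulfides out of 12.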

Nonetheless, the finding that intermediate II-A contains two native disulfides is highly interesting and appears to be comparable with the BPTI model (Creighton, 1978; Weissman and Kim, 1991). The rapid accumulation of II-A during EGF folding and its sensitivity to the denaturant (Chang et al., 1995) indicate that it is a favored intermediate stabilized by noncovalent interactions. It is likely that these noncovalent interactions are native-like interactions (Montelione et al., 1987), but this remains to be elucidated. We believe that native disulfides cannot be equated with native-like structure without further evidence. Even the sensitivity of II-A to denaturants cannot be regarded as unequivocal evidence. In the case of hirudin, 6 out of 11 scrambled species are sensitive to denaturant and most of them do not contain native disulfides at all.() Similarly, one cannot rule out the possibility that I-A and I-B, although composed of non-native disulfides, may adopt native-like structures. The safest statement one can make from the analysis of trapped disulfide species concerns the degree of heterogeneity of folding intermediates. Taking these arguments into consideration, it would be premature to conclude that the folding of BPTI and that of EGF are guided by divergent principles, despite sharp differences in the disulfide structures of their well populated folding intermediates.

  
Table: Structures of the disulfide containing peptides derived from the well populated folding intermediates of EGF


  
Table: Recoveries of native EGF and hirudin under different folding conditions



FOOTNOTES

*
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked ``advertisement'' in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

§
To whom correspondence should be addressed: K-121, 104 Ciba-Geigy Ltd., Basel CH-4002, Switzerland. Fax/Tel.: 4161-6968313.

The abbreviations used are: BPTI, bovine pancreatic trypsin inhibitor; EGF, recombinant human epidermal growth factor; GSH, reduced glutathione; GSSG, oxidized glutathione; Cys, cysteine; Cys-Cys, cystine; MALDI, matrix-assisted laser desorption ionization; GdmCl, guanidinium chloride; HPLC, high performance liquid chromatography; PTH, phenylthiohydantoin.

J.-Y. Chang, unpublished data.


REFERENCES
  1. Anfinsen, C. B. (1973) Science 181, 223-230
  2. Anfinsen, C. B., Haber, E., Sela, M., and White, F. H., Jr. (1961) Proc. Natl. Acad. Sci. U. S. A. 47, 1309-1314
  3. Boernsen, K. O., Schaer, M., and Widmer, M. (1990) Chimia 44, 412-416
  4. Bulleid, N. J. (1993) Adv. Prot. Chem. 44, 125-150
  5. Carver, J. A., Cooke, R. M., Esposito, G., Campbell, I. D., Gregory, H., and Sheard, B. (1986) FEBS Lett. 205, 77-81
  6. Chang, J.-Y. (1993) J. Biol. Chem. 268, 4043-4049
  7. Chang, J.-Y. (1994) Biochem. J. 300, 643-650
  8. Chang, J.-Y., and Knecht, R. (1991) Anal. Biochem. 197, 52-58
  9. Chang, J.-Y., Canals, F., Schindler, P., Querol, E., and Aviles, F. X. (1994) J. Biol. Chem. 269, 22087-22094
  10. Chatrenet, B., and Chang, J.-Y. (1992) J. Biol. Chem. 267, 3038-3043
  11. Chatrenet, B., and Chang, J.-Y. (1993) J. Biol. Chem. 268, 20988-20996
  12. Creighton, T. E. (1978) Prog. Biophys. Mol. Biol. 33, 231-297
  13. Creighton, T. E. (1979) J. Mol. Biol. 129, 411-431
  14. Creighton, T. E. (1990) Biochem. J. 270, 1-16
  15. Creighton, T. E. (1992) Science 256, 111-112
  16. Creighton, T. E., and Goldenberg, D. P. (1984) J. Mol. Biol. 179, 497-524
  17. Epstein, C. J., Goldberger, R. F., and Anfinsen, C. B. (1963) Cold Spring Harbor Symp. Quant. Biol. 28, 439-449
  18. Freedman, R. B. (1984) Trends Biochem. Sci. 9, 438-441
  19. Haniu, M., Acklin, C., Kenney, W., and Rohde, M. F. (1994) Int. J. Peptide Protein Res. 43, 81-86
  20. Kim, P. S., and Baldwin, R. L. (1990) Annu. Rev. Biochem. 59, 631-660
  21. Konishi, Y., Ooi, T., and Scheraga, H. A. (1981) Biochemistry 20, 3945-3955
  22. Montelione, G. T., Wuethrich, K., Nice, E. C., Burgess, A. W., and Scheraga, H. A. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 5226-5230
  23. Oas, T. G., and Kim, P. S. (1988) Nature 336, 42-48
  24. Ramseier, U., and Chang, J.-Y. (1994) Anal. Biochem. 221, 231-233
  25. Rose, G. D. (1979) J. Mol. Biol. 134, 447-486
  26. Savage, C. R., Jr., Hash, J. H., and Cohen, S. (1973) J. Biol. Chem. 248, 7669-7672
  27. Scheraga, H. A., Konishi, Y., and Ooi, T. (1984) Adv. Biophys. 18, 21-41
  28. Weissman, J. S., and Kim, P. S. (1991) Science 253, 1386-1393
  29. Weissman, J. S., and Kim, P. S. (1992) Science 256, 112-114
