Human epidermal growth factor (EGF) contains three disulfides
and 53 amino acids. Reduced/denatured EGF refolds spontaneously in
vitro to acquire its native structure. The mechanism of this
folding process has been elucidated by structural analysis of both acid
and iodoacetate trapped intermediates. The results reveal that the
folding is accompanied by a sequential flow of unfolded EGF
(0-disulfide) through three groups of folding intermediates, namely
1-disulfide, 2-disulfide, and 3-disulfide (scrambled) EGF isomers, to
reach the native structure. Equilibrium occurs among isomers of each
class of disulfide species, and the composition of intermediates
appears to be highly heterogeneous. Together, at least 27 fractions of
folding intermediates have been identified, but there exist only
limited numbers of well populated species which constitute more than
80% of the total intermediates found during EGF folding.
Six species
of such well populated intermediates have been isolated, which included
two 1-S-S, two 2-S-S, and two 3-S-S scrambled species. Their disulfide
structures have been identified here. Both 1-S-S isomers are found to
contain non-native disulfides. One of the 2-S-S species consists of two
non-native disulfides and the other contains two native disulfides. Among
the six disulfides of the two scrambled species, only one is native.
Together, native disulfides constitute 25% of the total disulfides
found in these six well populated intermediates. These results contrast
sharply with those observed with bovine pancreatic trypsin inhibitor,
for which well populated folding intermediates were shown to consist
exclusively of native disulfides (Weissman, J. S., and Kim, P. S. (1991)
Science 253, 1386-1393). We propose that well populated
folding intermediates, regardless of whether they contain native or
non-native disulfides, do not necessarily represent the productive
species or specify the folding pathway.
Furthermore, conditions
influencing the efficiency of EGF folding have been investigated. It is
demonstrated here that under optimized compositions of redox agents,
including the use of cysteine/cystine and protein disulfide isomerase,
the in vitro folding of EGF could be achieved quantitatively
within 1 min.
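The 25% figure quoted above can be checked by simple bookkeeping. The following is a minimal sketch that only restates the disulfide counts given in the text; the labels are illustrative, and the two scrambled species are pooled because the text does not say which of them carries the single native disulfide.

```python
# Disulfide bookkeeping for the six well populated intermediates.
# Each entry: (illustrative label, total disulfides, native disulfides).
# Counts are taken from the text; labels are hypothetical shorthand.
intermediates = [
    ("1-S-S species, first", 1, 0),
    ("1-S-S species, second", 1, 0),
    ("2-S-S species, non-native", 2, 0),
    ("2-S-S species, native", 2, 2),
    ("3-S-S scrambled species, pooled pair", 6, 1),
]

total_disulfides = sum(total for _, total, _ in intermediates)
native_disulfides = sum(native for _, _, native in intermediates)
native_fraction = native_disulfides / total_disulfides

print(f"{native_disulfides} of {total_disulfides} disulfides are native "
      f"({native_fraction:.0%})")
```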
Intermediates that occur in the folding pathway of
disulfide-containing proteins were recognized in the pioneering work on
bovine pancreatic trypsin inhibitor (BPTI)
On the contrary, kinetically trapped non-native
3-disulfide (scrambled) intermediates were detected and characterized
in recent studies of recombinant hirudin (Chatrenet and Chang, 1992,
1993) and potato carboxypeptidase inhibitor (Chang et al.,
1994). They were found reproducibly in high concentrations and were
observed under a wide range of folding conditions, including those
favorable conditions which permit regeneration of the native protein to
be completed within 30 s. Furthermore, the level of accumulation of the
scrambled species has been shown to depend upon the redox potential
applied and thus could be experimentally manipulated (Chang, 1994). In
one case, more than 98% of the total sample was found to be trapped as
scrambled intermediates before even a trace amount of the native
structure appeared (Chang et al., 1994). These findings indicate
that scrambled proteins may play an essential role along the pathway of
productive folding. This proposal, however, contradicts conventional
wisdom, which considers scrambled species to be abortive structures of
"off-pathway" folding. One may further suggest that the
presence of scrambled intermediates is not a general phenomenon and
represents only isolated, unusual cases for hirudin and potato
carboxypeptidase inhibitor.
In order to clarify these uncertainties,
studies of disulfide folding pathways using other comparable proteins
are required. In this report, we use recombinant human epidermal growth
factor (EGF), which is also a small, compact protein (53 amino acid
residues), and like hirudin, contains only antiparallel
EGF (1.5 mg) was dissolved in 0.5 ml of Tris-HCl buffer (0.5 M, pH 8.5)
containing 5 M GdmCl and 30 mM dithiothreitol. Reduction and
denaturation of EGF were carried out at 22 °C for 90 min. To initiate
the folding, the sample was passed through a PD-10 column (Pharmacia)
equilibrated in 0.1 M Tris-HCl buffer, pH 8.5. Desalting took about
1 min and unfolded EGF was recovered in 1.1 ml, which was immediately
diluted with the same Tris-HCl buffer to a final protein concentration
of 1 mg/ml, both in the absence (control -) and presence (control +) of
0.25 mM 2-mercaptoethanol. Folding intermediates were trapped in a time
course manner by mixing aliquots of the sample with an equal volume of
(a) 4% trifluoroacetic acid in water (reversible trapping) or (b) 0.4 M
iodoacetic acid in the Tris-HCl buffer (0.5 M, pH 8.5) (irreversible
trapping). In the case of iodoacetate trapping, carboxymethylation was
performed at 22 °C for 30 min, followed by desalting using the PD-10
column. Trapped folding intermediates were separated by HPLC.
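The dilution and trapping steps above involve a few simple concentration calculations. The sketch below works through them under one stated assumption that is not in the text: that all 1.5 mg of EGF is recovered in the 1.1 ml eluted from the PD-10 column. The function names are illustrative only.

```python
def final_volume_ml(protein_mg: float, target_mg_per_ml: float) -> float:
    """Volume needed to bring a given protein mass to a target concentration."""
    return protein_mg / target_mg_per_ml


def after_equal_volume_mix(concentration: float) -> float:
    """Concentration of a component after mixing 1:1 with a solution lacking it."""
    return concentration / 2.0


# Assumption: complete recovery of the 1.5 mg of EGF from the desalting column.
protein_mg = 1.5
recovered_ml = 1.1

total_ml = final_volume_ml(protein_mg, target_mg_per_ml=1.0)
buffer_to_add_ml = total_ml - recovered_ml

# Concentrations in the trapping mixture (aliquot + equal volume of trapping agent).
tfa_percent_in_mix = after_equal_volume_mix(4.0)        # acid trapping
iodoacetate_molar_in_mix = after_equal_volume_mix(0.4)  # iodoacetate trapping
protein_mg_per_ml_in_mix = after_equal_volume_mix(1.0)

print(f"Dilute to {total_ml:.1f} ml (add {buffer_to_add_ml:.1f} ml of buffer)")
print(f"Trapping mixture: {tfa_percent_in_mix:.0f}% TFA or "
      f"{iodoacetate_molar_in_mix:.1f} M iodoacetate, "
      f"{protein_mg_per_ml_in_mix:.1f} mg/ml EGF")
```

If recovery from the column is incomplete, the buffer volume required to reach 1 mg/ml would be correspondingly smaller.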
The MALDI mass spectrometer was a
home-built time of flight instrument with a nitrogen laser of 337-nm
wavelength and 3-ns pulse width. The apparatus has been described in
detail elsewhere (Boernsen et al., 1990). The calibration was
performed either externally or internally, by using standard proteins
(hypertensin, Mr
The data for Cys/Cys-Cys composition played a crucial
role in the identification of scrambled EGF. It was subsequently
revealed by HPLC analysis that the yield of native EGF was indeed
dependent upon the presence of 2-mercaptoethanol. In the presence of
2-mercaptoethanol, or Cys, or reduced glutathione, the formation of
three disulfide bonds was accompanied by the quantitative recovery of
native EGF. Without 2-mercaptoethanol, about 45% of the 3-disulfide
EGF was trapped as species distinguishable from the native one (see
Fig. 2, 24-h samples). These trapped EGF species are scrambled
non-native 3-disulfide species.
Folding intermediates of EGF were
further characterized by MALDI mass spectrometry in order to determine
the concentrations of disulfide species present in the intermediates.
The results were obtained from samples folded in the buffer alone
(control -) and trapped by iodoacetic acid. The data
(Fig. 1) demonstrate a sequential flow of unfolded EGF through 1-
and 2-disulfide intermediates to the 3-disulfide species. The high
level of accumulation of 2-disulfide intermediates indicated that the
conversion of 2-disulfide species to 3-disulfide species constituted
one of the major rate-limiting steps of EGF folding. The 24-h folded
sample was shown to contain virtually only 3-disulfide species, which
further confirmed that the non-native species trapped in the 24-h
sample (Fig. 2) are scrambled EGF.
The
HPLC profiles of acid-trapped intermediates (Fig. 2, right
column) did not fully resemble those of iodoacetate-trapped
counterparts. Notably, most acid-trapped 2-disulfide species were
eluted in the same fraction (the peak marked as II). Thus,
interpretation of EGF folding based on the analysis of acid-trapped
samples requires caution. The predominance of a single peak
containing 2-disulfide species can easily lead to the oversimplified
conclusion that folding of EGF proceeds through only one species of
2-disulfide intermediate. The pattern of the 24-h acid-trapped sample is
indistinguishable from that of the iodoacetate-trapped sample, because
both contained only 3-disulfide EGF.
The most sensitive and
effective method, however, is to take advantage of modern Edman
chemistry and the known sequence of EGF by direct sequencing of the
peptide mixture of 1-disulfide intermediates. This strategy is sketched
in Fig. 5 and described as follows: 1) select an enzyme that will
produce a mixture of peptides, with all cysteines located at different
positions in the peptide sequences; 2) subject the peptide mixture to
automatic sequencing and quantitate recoveries of PTH-Cys(Cm) at the
expected cycles of Edman degradation. Cysteines which are not involved
in disulfide pairings will be recovered as PTH-Cys(Cm), and those
engaged in disulfide linkages will generate a blank gap; 3) compare
the results obtained from the folding intermediates with those of the
control sample (fully reduced carboxymethylated EGF). This method
requires low picomoles (nanograms) of sample, no HPLC separation of
peptides, and basically only one sequence analysis for each intermediate.
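The comparison in step 3 amounts to looking for the two cycles at which PTH-Cys(Cm) recovery collapses relative to the control. The sketch below illustrates that logic only. The cycle numbers are those listed in the legend of Fig. 5, but the picomole values and the 0.5 cutoff are hypothetical and chosen purely for illustration; they are not data from this study.

```python
def blank_cycles(control, sample, cutoff=0.5):
    """Return Edman cycles whose PTH-Cys(Cm) recovery, relative to the
    fully reduced carboxymethylated control, falls below the cutoff.

    control, sample: dicts mapping cycle number -> recovery (pmol).
    cutoff: hypothetical threshold on the sample/control ratio.
    """
    blanks = []
    for cycle in sorted(control):
        ratio = sample.get(cycle, 0.0) / control[cycle]
        if ratio < cutoff:
            blanks.append(cycle)
    return blanks


# Hypothetical recoveries (pmol) at the six expected cysteine cycles.
control = {2: 10.0, 6: 12.0, 7: 11.0, 9: 9.0, 14: 6.0, 20: 4.0}
sample = {2: 9.5, 6: 0.8, 7: 10.2, 9: 8.1, 14: 0.5, 20: 3.6}

paired = blank_cycles(control, sample)
print("Cycles with blank gaps:", paired)
if len(paired) == 2:
    print("The cysteines at these two cycles form the single disulfide.")
```

For a 1-disulfide intermediate exactly two cycles should appear blank; any other count would suggest a mixed fraction or incomplete carboxymethylation.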
GSSG
played a different role. It enhanced the flow of intermediates between
0-disulfide EGF and scrambled species and as a consequence also
accelerated the recovery of native EGF during the early phase of
folding. In the presence of 0.5 mM GSSG, the only
detectable intermediates after 3 h of folding were scrambled species,
and a substantial portion of scrambled EGF also became trapped, unable
to convert to the native EGF even after 24 h of folding under these
conditions. By including a mixture of GSH/GSSG in the folding solution,
both the flow of intermediates and the conversion of scrambled EGF to
the native EGF were accelerated. Under these conditions, folding of EGF
was achieved quantitatively within 4 h. Cys/Cys-Cys also regulated the
folding of EGF through a similar mechanism, except that it is more
potent than the GSH/GSSG system. Direct comparison of Cys-Cys and
GSSG indicated that the former was about 5-10-fold more effective
(on an equimolar basis) in promoting the flow of intermediates to the
3-disulfide states (Fig. 9). In another experiment, it was
demonstrated that trapped scrambled EGF species were able to reshuffle
their mismatched disulfides to acquire the native structure within 1 h
when 1 mM Cys was introduced. During this process of
reorganization (consolidation), scrambled species remained in
equilibrium.
Reduced/denatured EGF is able to refold in the presence of
denaturant to form the native structure. The recovery is dependent both
upon the potency of the denaturant and upon the presence of
supplementary free thiols. The data clearly demonstrate
that forces which guide correct folding of EGF have not been nullified
in the presence of either 8 M urea or 5 M GdmCl.
Analysis of the folding intermediates further reveals the effect of
denaturant on the kinetics of EGF folding (Fig. 11). The results
are summarized as follows. (a) Denaturant exerts only a
minimal influence on the apparent compositions of 1- and 3-disulfide
intermediates. However, it does affect the 2-disulfide species.
The concentration of the major 2-disulfide fraction (fraction 3, see
Fig. 11, left column) decreases by 70% in the presence of
denaturant. This suggests that species eluted within fraction 3 adopt
favored conformations which are partially abrogated by the denaturant.
Interestingly, this 2-disulfide species (II-A) has been shown to
contain two native disulfides (Fig. 8). This finding is also
consistent with the observation that the flow from the 1-disulfide
species to the 2-disulfide species slows down considerably in the
presence of denaturant (Fig. 11). (b) The most
significant effect of denaturant is to disrupt the process of
consolidation and diminish recovery of the native EGF. The yield of
native EGF was only 8-9% when folding was performed in the
presence of 5 M GdmCl without 2-mercaptoethanol.
The problem is how to interpret the role of scrambled
species as folding intermediates. From the standpoint of strictly
kinetic analysis of disulfide bond formation, scrambled species are
destined to be dead-end products, since their conversion to the native
structure must involve disulfide reshuffling and in practice they must
return to the 2- or 1-disulfide species. However, disulfide
formation is a signal used to trace, not to define, the mechanism of
protein folding. From the viewpoint of thermodynamics (Anfinsen, 1973),
the presence of scrambled species as folding intermediates may become
more understandable. Folding of EGF, as well as hirudin and potato
carboxypeptidase inhibitor (Chang et al., 1994), undergoes an
initial stage of nonspecific packing, which leads to the formation of
scrambled species as folding intermediates. This is followed by
reorganization and consolidation of scrambled species to reach the
native structure. Thermodynamically, scrambled species simply represent
a state of more advanced packing and lower free energy than that of
2-disulfide intermediates. Their conversion to the native structure,
although accompanied by disulfide reshufflings, does not necessarily
require substantial unfolding of the compactness that they have already
attained.
There are nonetheless some important differences between
the properties of the folding intermediates of EGF and hirudin. One is
displayed by their behavior on reversed phase HPLC. In the case of
hirudin, all intermediates, including the scrambled species, are
clustered near the unfolded hirudin and far apart from the native
hirudin (see Fig. 2 of Chatrenet and Chang (1993)), which implies
that there exists a wide gap of hydrophobicity between native hirudin
and all species of intermediates. The folding intermediates of EGF, on
the other hand, are evenly distributed in between the unfolded and the
native species (Fig. 3). The second dissimilarity is the kinetics of
flow of intermediates from 2-disulfide to 3-disulfide species. Unlike
hirudin, conversion of 2-disulfide EGF to scrambled 3-disulfide species
represents one of the major rate-determining steps of EGF folding. This
is best illustrated by comparing their folding in the presence of
Cys-Cys (0.5 mM). In the case of hirudin, more than 98% of the
intermediates accumulated as scrambled species after 5 min of folding
(Chang, 1994). By contrast, only 70% of the folding intermediates of
EGF reached the scrambled species, with the remaining intermediates
retained as 2-disulfide species under the same conditions
(Fig. 9). The high level of accumulation of 2-disulfide EGF, in
part, reflects their stability. Indeed, this stability is likely
attributable to the species eluted within fraction 3 (II-A)
(Fig. 3). This is also supported by the observation that the
concentration of fraction 3 decreases drastically in the presence of
denaturant (Fig. 11). For hirudin, denaturant affects only the
ratio of 3-disulfide scrambled species and exerts no visible influence
on the compositions of 1- and 2-disulfide intermediates (Chatrenet and
Chang, 1993).
However, interpretation of our results hinges upon the
definition of "folding pathway." The crucial question is
what type of intermediates actually constitute the folding pathway: the
well populated species or the productive species which account for the
flow of intermediates? Unlike a conventional "biochemical"
pathway which tracks intermediates having defined covalent
structures, the intermediates of the "disulfide folding
pathway" are composed of isomers existing in a state of dynamic
equilibrium. Under these circumstances, well populated intermediates
cannot be presumed to be productive species, even if they do contain
native disulfides. Thermodynamically, well populated intermediates are
favored in an equilibrium because of their lower free energy and better
stability. They are thus likely to be more complacent and less
productive (Chang, 1993). Indeed, well populated intermediates may only
serve as "parking lots" for productive species along the
pathway. This pitfall has also been pointed out by Creighton (1992).
In the case that productive species are chosen to specify the
folding pathway of EGF (which we think is the correct definition), then
these species still remain to be identified. They cannot be simply
deduced from the kinetic analysis of well populated species described
here. One way to identify the productive intermediates is to perform
stop/go folding experiments on all species of intermediates, both major
and minor, that are present at the same stage of equilibrium. For EGF, as
well as for BPTI, there are 15 possible 1-disulfide isomers and 45
possible 2-disulfide isomers. All these species have to be trapped
alive (e.g. by acid), purified to homogeneity, structurally
characterized, and kinetically analyzed by stop/go folding experiments
in order to fish out the productive species (Chang, 1993). The dilemma
faced by this approach is that productive intermediates may exist as
minor species with concentrations that are less than 1% of the
well populated intermediates. Finding all these minor species will be a
daunting, if not impossible, task. The predicament can be further
complicated by the argument that the undetected is not necessarily
nonexistent. Alternatively, one may chemically synthesize all possible
isomers. Theoretically, this is feasible and can be achieved by
selective and stepwise deblocking of desired disulfide pairs. Again,
this will be a formidable challenge.
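The isomer counts quoted above follow from counting the ways to pair up six cysteines. The sketch below enumerates them explicitly; cysteines are labeled 1 to 6 by order of appearance rather than by residue number, and the function name is illustrative.

```python
def disulfide_isomers(cysteines, n_bonds):
    """Yield every way to form n_bonds disjoint pairs among the cysteines.

    Each isomer is a tuple of (i, j) pairs. The first available cysteine
    is either paired with a later one or left free, so every isomer is
    generated exactly once.
    """
    if n_bonds == 0:
        yield ()
        return
    if len(cysteines) < 2 * n_bonds:
        return
    first, rest = cysteines[0], cysteines[1:]
    # Case 1: the first cysteine is bonded to one of the later cysteines.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in disulfide_isomers(remaining, n_bonds - 1):
            yield ((first, partner),) + tail
    # Case 2: the first cysteine stays free.
    yield from disulfide_isomers(rest, n_bonds)


cys = (1, 2, 3, 4, 5, 6)  # positional labels, not residue numbers
counts = {n: len(list(disulfide_isomers(cys, n))) for n in (1, 2, 3)}

for n, count in counts.items():
    print(f"{n}-disulfide isomers: {count}")
print("Total disulfide-bonded species:", sum(counts.values()))
```

The 15 three-disulfide species comprise the native structure plus 14 scrambled isomers, and the total of 75 matches the number of possible disulfide-bonded species cited for BPTI.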
Even if well populated
intermediates are selected to construct the folding pathway of EGF,
there are still serious deficiencies (Fig. 8). Both well
populated 1-disulfide species are non-native
(Cys
Nonetheless, the finding that intermediate II-A contains
two native disulfides is highly interesting and appears to be
comparable with the BPTI model (Creighton, 1978; Weissman and Kim,
1991). The rapid accumulation of II-A during EGF folding and its
sensitivity to the denaturant (Chang et al., 1995) indicate
that it is a favored intermediate stabilized by noncovalent
interactions. It is likely that these non-covalent interactions are
native-like interactions (Montelione et al., 1987), but this
remains to be elucidated. We believe that the equivalence of native
disulfides and native-like structure cannot be taken for granted. Even
the sensitivity of II-A to denaturants cannot be regarded as unequivocal
evidence. In the case of hirudin, 6 out of 11 scrambled species are
sensitive to denaturant and most of them do not contain native
disulfides at all.
(Creighton, 1978, 1990) and ribonuclease A (Creighton,
1979; Konishi et al., 1981; Scheraga et al., 1984).
In the case of BPTI, eight (including native) disulfide-bonded
intermediates out of 75 possible species were initially described
(Creighton, 1978; Creighton and Goldenberg, 1984). Some of those well
populated 1- and 2-disulfide intermediates appeared to contain
non-native disulfides and were proposed to be involved in the process
of folding. This original BPTI model was recently re-examined using
modern separation and analytical methodologies. In that study (Weissman
and Kim, 1991), it was concluded that all well populated folding
intermediates consisted of only native disulfide bonds. Raging debates
ensued as a consequence of these discrepancies (Creighton, 1992;
Weissman and Kim, 1992), and discussions are focused mainly upon the
importance of intermediates containing non-native disulfides. In those
studies, however, no non-native 3-disulfide intermediates have been
described.
β-sheet and no α-helix (Carver et al., 1986; Montelione et
al., 1987), as an example to study the behavior of folding
intermediates and confirm the formation of non-native 3-disulfide
intermediates during folding. We also examine conditions that enhance
the efficiency of EGF folding.
Materials
Recombinant human epidermal growth
factor (EGF) was derived from Escherichia coli cells and was
supplied by Protein Institute Inc., Broomall, PA. The purity was
greater than 98% as judged by SDS-polyacrylamide gel electrophoresis
and N-terminal sequence analysis. The recombinant EGF is fully
biologically active when compared with standards. Reduced glutathione
(GSH), oxidized glutathione (GSSG), cysteine (Cys), cystine (Cys-Cys),
thermolysin (P-1512), and Glu-C protease were obtained from Sigma.
Protein disulfide isomerase (number 7318) was purchased from Takara,
Kyoto, Japan.
Control Folding Experiments
Control foldings are
those performed either in Tris-HCl buffer alone (control -) or in
the same buffer containing 0.25 mM 2-mercaptoethanol
(control +). Results of control foldings serve as standards for
measuring the efficiencies of EGF folding in the presence of various
redox agents.
Folding of EGF in the Presence of Redox Agents or
Denaturants
The procedures of unfolding and refolding are as
described in the control folding experiments. Selected
concentrations of redox agents or denaturants were introduced
immediately after unfolded EGF was desalted through the PD-10 column.
Folding intermediates were trapped reversibly or irreversibly as
described above.
Enzyme Digestion of Purified Well Populated Folding
Intermediates
Six fractions of well populated intermediates
(I-A, I-B, II-A, II-B, III-A, and III-B), derived from the
"control -" folding and trapped by iodoacetate, were
isolated for structural analysis. 1-Disulfide intermediates (I-A and
I-B) (3 µg) were digested with 0.3 µg of Glu-C protease or
trypsin in 30 µl of ammonium bicarbonate solution (50 mM,
pH 8.0) for 16 h at 23 °C. In this case, fully reduced
carboxymethylated EGF was processed in parallel as a control. The
samples were then acidified with an equal volume of 4% trifluoroacetic
acid and directly subjected to automatic sequencing. 2- and 3-disulfide
intermediates (30 µg) were treated with 3 µg of
thermolysin in 100 µl of N-ethylmorpholine/acetate buffer
(50 mM, pH 6.4). Digestion was carried out at 23 °C for 16
h. Peptides were then isolated by HPLC and analyzed by amino acid
sequencing and mass spectrometry.
Amino Acid Analysis, Amino Acid Sequencing, and MALDI
Mass Spectrometry
Amino acid analysis was performed with the dabsyl
chloride precolumn derivatization method (Chang and Knecht, 1991),
which permits direct evaluation of the disulfide (cystine) content.
Amino acid sequencing was done with either an Applied Biosystems 470A
sequencer or a Hewlett-Packard G-1000A sequencer. The digests of I-A
and I-B were mainly analyzed by the HP sequencer, because it gives more
reliable quantitation of the recovery of PTH-Cys(Cm).
Cystine-containing peptides were mostly analyzed by the ABI instrument.
An internal standard, 2-nitroacetophenone, which eluted between
PTH-His and PTH-Tyr, was introduced in order to ensure precise
quantitation of PTH derivatives (Ramseier and Chang, 1994). It was
predissolved in the solvent (2 µM) which transfers PTH
derivatives from the conversion flask to the HPLC. During the analysis
of cystine-containing peptides, a unique signal, di-PTH-cystine, appeared
when both half-cystines were recovered in the same degradation cycle
(Haniu et al., 1994). di-PTH-cystine is eluted near PTH-Tyr,
but can be easily distinguished from the tyrosine derivative by an
additional absorbance at 313 nm.
1031.19; synacthen, Mr 2934.50; and calcitonin, Mr 3418.91).
Analysis of the iodoacetate-trapped folding
intermediates is further explained in the legend of Fig. 1.
Figure 1:
Molecular mass of the iodoacetate-trapped folding intermediates of EGF.
Time course trapped folding intermediates of EGF contain various
concentrations of 0-disulfide (R), 1-disulfide (I),
2-disulfide (II), and 3-disulfide (III and N) species. As a result of
carboxymethylation, these disulfide species can be readily identified by
MALDI mass spectrometry. Each additional pair of carboxymethylated
cysteines increases the molecular mass by 118. Therefore, R, I, II, and
III (N) exhibit molecular masses of 6570, 6452, 6334, and 6216,
respectively. Each spectrum represents an accumulation of 100-150 shots.
Peak response reflects the concentration of disulfide species in the
folding intermediates.
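The mass ladder in this legend can be reproduced with a short calculation. The following sketch uses only the two numbers given in the legend (6216 for the 3-disulfide species and 118 per carboxymethylated cysteine pair); the matching tolerance of 20 mass units is a hypothetical value introduced for illustration and is not taken from the text.

```python
MASS_THREE_DISULFIDE = 6216  # III and N, from the legend of Fig. 1
MASS_PER_CM_PAIR = 118       # increment per carboxymethylated cysteine pair
TOTAL_CYS_PAIRS = 3          # six cysteines in EGF


def expected_mass(n_disulfides: int) -> int:
    """Expected mass of an iodoacetate-trapped species with n disulfides."""
    free_pairs = TOTAL_CYS_PAIRS - n_disulfides
    return MASS_THREE_DISULFIDE + free_pairs * MASS_PER_CM_PAIR


def assign_species(observed_mass: float, tolerance: float = 20.0):
    """Return the disulfide count whose expected mass is closest to the
    observed mass, or None if nothing lies within the (hypothetical)
    tolerance."""
    best = min(range(TOTAL_CYS_PAIRS + 1),
               key=lambda n: abs(expected_mass(n) - observed_mass))
    if abs(expected_mass(best) - observed_mass) > tolerance:
        return None
    return best


for n, label in enumerate(["R", "I", "II", "III/N"]):
    print(f"{label}: {expected_mass(n)}")
```

Because the spacing between neighboring species is 118, any tolerance well below half that value keeps the assignment unambiguous.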
Biological Assay of EGF
The biological activity of
recombinant EGF was compared with the standard recombinant EGF using
the assay method described (Savage et al., 1973). The
ED50, as determined by the dose-dependent stimulation of
thymidine uptake by Balb/c 3T3 cells, is 2.0 ng/ml. Refolded EGF was
compared with a standard sample by an HPLC stability-indicating assay.
The fully biologically active EGF samples and standard were assayed by
reversed phase HPLC as described in the legend of Fig. 2. Refolded
native EGF is assessed by comparing its HPLC profile with that of the
standard.
Figure 2:
Analysis of iodoacetate-trapped (left column) and acid-trapped (right
column) folding intermediates of EGF by HPLC. Samples were obtained from
the folding carried out in the Tris-HCl buffer alone (control -). Both
sets of intermediates were analyzed by the same HPLC conditions. The
column was Vydac C-18, 10 µm, for peptides and proteins. Solvent A was
water containing 0.1% trifluoroacetic acid. Solvent B was
acetonitrile/water (9:1, v/v) containing 0.1% trifluoroacetic acid. The
gradient was 14-34% solvent B linear in 15 min, 34-56%
solvent B linear from 15 to 50 min. The flow rate was 1 ml/min. Native
EGF (N) was eluted at 23.5 min. Iodoacetate- and acid-trapped
starting materials (R) were eluted at 36.9 and 43.4 min,
respectively. For further analysis of these intermediates, see Figs. 3
and 4. It should be noted that iodoacetate- and acid-trapped
intermediates did not behave identically under the same HPLC
conditions. For instance, the majority of iodoacetate-trapped
2-disulfide intermediates were eluted within three different fractions
(see 30-min sample, left column), whereas acid-trapped
2-disulfide intermediates were accumulated within one fraction (marked
as II, right column). The patterns of 24-h samples
were not affected by the methods of trapping because these samples
contained only 3-disulfide species, both the scrambled (fractions 4 and
5, etc.) and the native EGF.
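The two-segment gradient in this legend is piecewise linear, so the programmed solvent composition at any retention time can be interpolated. The sketch below does this for the quoted elution times. It reports the nominal programmed composition only and assumes no correction for system dwell volume or column dead time, neither of which is given in the text.

```python
# Programmed gradient breakpoints from the legend of Fig. 2:
# (time in min, % solvent B).
GRADIENT = [(0.0, 14.0), (15.0, 34.0), (50.0, 56.0)]
ACETONITRILE_FRACTION_IN_B = 0.9  # solvent B is acetonitrile/water 9:1 (v/v)


def percent_b(t_min: float) -> float:
    """Nominal % solvent B at time t_min by linear interpolation."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the programmed gradient")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def percent_acetonitrile(t_min: float) -> float:
    """Nominal % acetonitrile (v/v) in the mobile phase at time t_min."""
    return percent_b(t_min) * ACETONITRILE_FRACTION_IN_B


for label, t in [("native EGF (N)", 23.5),
                 ("iodoacetate-trapped R", 36.9),
                 ("acid-trapped R", 43.4)]:
    print(f"{label}: {percent_b(t):.1f}% B, "
          f"{percent_acetonitrile(t):.1f}% acetonitrile")
```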
Disulfide Content and Disulfide Species of
Iodoacetate-trapped Folding Intermediates
Folding intermediates
of EGF were first analyzed for their disulfide contents in order to
evaluate the rate of disulfide formation during the folding. This was
done with amino acid composition analysis (Chang and Knecht, 1991). Two
sets of samples obtained from the folding experiments performed in the
absence (control -) and presence (control +) of
2-mercaptoethanol (0.25 mM) were analyzed. The results showed
that: (a) the decrease of cysteine (detected in the form of
carboxymethylcysteine) was quantitatively accounted for by the recovery
of disulfide, and (b) the rate of total disulfide recovery
remained indistinguishable regardless of whether the folding was
carried out in the absence or presence of 2-mercaptoethanol. In both
experiments, three intact disulfides formed after 24 h of folding (data
not shown).
Characterization of the Heterogeneity of Folding
Intermediates by HPLC
Folding intermediates of EGF were analyzed
by HPLC. The raw data, presented in Fig. 2, were obtained from
the samples of the control - experiment trapped either by acid
(right column) or by iodoacetic acid (left column).
In order to be able to interpret these chromatograms, structural
information on the fractionated intermediates was required. Therefore,
18 fractions of intermediates were first isolated from the 30-min
iodoacetate-trapped sample (Fig. 3) and analyzed by mass
spectrometry. Concentrations of disulfide species present in each of
those fractions are given in Fig. 4. The results revealed that
this sample comprised a minimum of seven 1-disulfide isomers and 13
2-disulfide isomers. Most 1-disulfide species eluted at fractions 16
(I-A) and 17 (I-B) and 2-disulfide species mostly accumulated within
fractions 3 (II-A) and 6 (II-B). Similar analysis of the 7-h and 48-h
samples (Fig. 3) showed that fractions 4 (III-A) and 5 (III-B)
contained predominantly (>92%) 3-disulfide scrambled species. The
three groups of intermediates were extensively overlapped, but
predominant fractions of these three disulfide species were fortunately
well separated. It was also apparent that along the folding process,
equilibrium existed among isomers of each disulfide species. For
instance, the concentration of 2-disulfide intermediates ascended and
then descended as folding progressed, but the relative ratio of
fractions 3, 6, and 7 (which contained exclusively 2-disulfide species)
remained constant. Scrambled 3-disulfide species and 1-disulfide
species behaved similarly during the folding.
Figure 3:
Heterogeneity of the folding intermediates
of EGF exemplified by three time course trapped samples. The
intermediates were trapped by iodoacetic acid and analyzed by HPLC
using the conditions described in the legend of Fig. 2. The 30-min
trapped sample contained primarily the unfolded EGF (R),
1-disulfide (I), and 2-disulfide (II) intermediates.
Eighteen fractions of this sample were isolated and characterized by
mass spectrometry. Contents of various disulfide species within each
fraction are given in Fig. 4. The majority of 1-disulfide species were
eluted within fractions 16 (I-A) and 17 (I-B),
whereas most 2-disulfide species were eluted within fractions 3
(II-A) and 6 (II-B). The 7-h sample was comprised of
2-disulfide species (II, fractions 3, 6, and 7, etc.),
3-disulfide scrambled species (III, fractions 4 and 5, etc.),
and the 3-disulfide native species (N, fraction 1). The 24-h
sample contains only 3-disulfide species, in which about 45% are native
EGF and 55% are scrambled EGF. Two well populated species of scrambled
EGF are eluted at fractions 4 (III-A) and 5 (III-B).
Figure 4:
Analysis by mass spectrometry of folding
intermediates of EGF isolated by HPLC. Eighteen fractions were isolated
from the 30-min sample (trapped by iodoacetate) (see Fig. 3) and
analyzed by MALDI mass spectrometry in order to determine the disulfide
species contained in each fraction. The content of disulfide species is
determined by the mass peak height and expressed as percentage in each
fraction. A standard deviation of ±10% should be allowed for these
data. Fractions 4 and 5 contain only minute amounts of intermediates and
are shown to be comprised of about 50% each of 2-disulfide and
3-disulfide (scrambled) species. A separate analysis of the 7-h trapped
sample shows that fractions 4 and 5 contain predominantly (>90%)
3-disulfide species.
These data
demonstrated that the folding pathway of EGF was characterized by a
sequential flow of unfolded EGF (R) through three groups of
equilibrated intermediates, namely, 1-disulfide, 2-disulfide, and
3-disulfide (scrambled) isomers. With the control - folding
experiment, 45% of the folding intermediates were stuck as scrambled
species (see Fig. 2, 24-h sample), unable to convert to the
native EGF due to the lack of free thiols to catalyze their disulfide
reshuffling. This problem was overcome by including 2-mercaptoethanol
(data not shown) in the folding buffer, in which recoveries of native
EGF were found to be greater than 96% after 24 h of folding.
Determination of the Disulfide Linkages of Well Populated
1-Disulfide Intermediates (I-A and I-B)
This can be achieved by
a number of strategies. The most common one is "peptide
mapping." This requires isolation and analysis of every
enzyme-fragmented peptide, as will be shown in the following section.
Alternatively, it can be done by selective labeling of disulfide bonds
(after reduction) with colored (Chang, 1993) or fluorescent
thiol-specific reagents (Weissman and Kim, 1991). Both methods need
microgram amounts of the intermediates, HPLC separation of peptides,
and numerous rounds of sequence analysis.
Figure 5:
Peptides of EGF derived from Glu-C protease digestion. Cleavages at the
three indicated positions are specific and quantitative. Glu-Cys was
not digested by Glu-C protease at all. When these four peptides are
collectively sequenced, the six half-cystines are recovered at six
different cycles of Edman degradation: Cys, cycle 2; Cys, cycle 6;
Cys, cycle 7; Cys, cycle 9; Cys, cycle 14; Cys, cycle 20. Their
recoveries as free cysteine (carboxymethylated) are used to identify
the disulfide structures of 1-disulfide intermediates (see Fig. 6).
For EGF,
such peptide mixtures could be generated by either trypsin or Glu-C
protease digestion (Fig. 5). The sequencing data obtained from
the analysis of Glu-C digests are given in Fig. 6. They show
unambiguously that I-A and I-B contain Cys-Cys and Cys-Cys,
respectively, both of which are
non-native disulfides (Fig. 8). The results obtained from trypsin
digests are equally conclusive (data not shown).
Figure 6:
Identification of the disulfide structures
of 1-disulfide folding intermediates of EGF. Folding intermediates
(I-A and I-B) were digested by Glu-C protease, and the
peptides (Fig. 5) were collectively sequenced by automatic Edman
degradation. Recoveries of Cys(Cm) were quantitated at expected cycles
and were compared with those obtained from the control sample (fully
reduced carboxymethylated EGF). Cysteines which are involved in the
disulfide pairing will not be recovered as
Cys(Cm).
Figure 8:
Disulfide structures of six well populated
folding intermediates of EGF. The arrows do not imply the
direct conversion between the indicated
species.
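The logic of the collective sequencing experiment (Figs. 5 and 6) can be sketched in a few lines of Python. The cycle numbers are those listed in the legend of Fig. 5; the recovery values, the 0.5 cutoff, and the function name are invented for illustration and are not data or parameters from this study. In a 1-disulfide intermediate, the two cysteines engaged in the disulfide are not carboxymethylated and therefore show depleted Cys(Cm) recovery relative to the fully reduced, carboxymethylated control.

```python
# Edman cycles at which the six half-cystines are recovered (Fig. 5 legend).
CYS_CYCLES = (2, 6, 7, 9, 14, 20)


def paired_cycles(sample, control, threshold=0.5):
    """Return the two cycles whose Cys(Cm) recovery is depleted.

    sample, control: dicts mapping cycle -> Cys(Cm) recovery (same units).
    threshold: hypothetical cutoff on the sample/control ratio.
    """
    depleted = [c for c in CYS_CYCLES if sample[c] / control[c] < threshold]
    if len(depleted) != 2:
        raise ValueError(
            f"expected 2 depleted cycles for a 1-disulfide species, got {depleted}"
        )
    return tuple(depleted)


# Invented recoveries (pmol), for illustration only; not the data of Fig. 6.
control = {2: 100, 6: 90, 7: 85, 9: 80, 14: 60, 20: 45}
sample = {2: 95, 6: 8, 7: 80, 9: 5, 14: 58, 20: 40}
print(paired_cycles(sample, control))  # (6, 9)
```

Normalizing each cycle to the control compensates for the progressive loss of yield at later Edman cycles, which would otherwise make late-cycle cysteines appear depleted.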
Assignments of the Disulfide Pairings of 2-Disulfide
(II-A and II-B) and 3-Disulfide (III-A and III-B)
Intermediates
The choice of methods for elucidating the
disulfide structures of 2- and 3-disulfide intermediates is limited to
the technique of peptide mapping. In this approach, selection of
enzymes is critical. The digestion should be carried out at neutral or
acidic pH and allow at least partial cleavage at peptide bonds between
all neighboring cysteines. Thermolysin has been found to be an ideal
enzyme for this purpose. Peptides were separated by HPLC
(Fig. 7). Cystine-containing peptides can generally be distinguished
from non-cystine-containing peptides: peaks that do not appear
consistently in all mappings most likely contain
disulfides. All peptides were analyzed by amino acid sequencing and
mass spectrometry. Crucial data which permit assignments of disulfide
pairings are presented in Table I. Two cystine peptides, with nearly
equal recoveries and corresponding to two native disulfides,
Cys-Cys and Cys-Cys, were found in II-A-7 and
II-A-12, respectively. Despite the shoulder peak of II-A-12, sequence
and mass analysis have revealed no contaminants of minor sequences. The
two disulfide bonds of species II-B were also found in two major peaks.
II-B-15 consisted of three peptides linked by two disulfide bridges,
which could be oriented in a combination of either
Cys-Cys/Cys-Cys or Cys-Cys/Cys-Cys.
The finding of Cys-Cys in peak II-B-7
confirms that the former structure is the correct one. In this
intermediate, both disulfides are non-native.
Figure 7:
Mappings of thermolytic peptides derived
from the well populated 2- and 3-disulfide intermediates. Peptides were
analyzed by amino acid sequencing and mass spectrometry. Data obtained
from major cystine-containing peptides (numbered) were used to
construct the disulfide structures of II-A, II-B, III-A, and III-B (see
Fig. 8). Chromatographic conditions are similar to those described in
the legend of Fig. 2, except that a different gradient was used: 5%-22%
solvent B (linear) in 32 min and 22%-50% B from 32 to 45 min.
Scrambled EGFs are
3-disulfide species. For III-A, the disulfides were detected in five
peaks. Cys-Cys and Cys-Cys were identified in III-A-5 and
III-A-10. Cys-Cys was
found in three different peaks (III-A-13, III-A-15, and III-A-17), due
to nonspecific cleavages by thermolysin. In III-A, all disulfides are
non-native. III-B is the most predominant scrambled species. Its three
disulfides were recovered in four major peaks.
Cys-Cys was found in III-B-2 and
Cys-Cys was detected in III-B-8. The third
disulfide of III-B, Cys-Cys, was found in
III-B-14 as well as the tailing shoulder (right-hand) of III-B-11.
For all four well populated intermediates, there is no
evidence of contamination of minor species (<10%). The results of
their disulfide structures are summarized in Fig. 8.
The Efficiency of EGF Folding Is Regulated by the Applied
Redox Potential
Two systems of redox agents, GSH/GSSG and
Cys/Cys-Cys, were evaluated here. The effect of GSH was found to be
similar to that of 2-mercaptoethanol, namely to promote the
conversion of scrambled EGF to native EGF. In achieving this, it
neither accelerated the flow of intermediates between unfolded species
and scrambled species nor altered the composition of the
intermediates. The only obvious differences between folding performed with
and without GSH were the level of accumulation of scrambled species and
the recovery of native EGF. Without GSH (control -), about 50% of
EGF was trapped as scrambled species. In the presence of GSH, the yield
of native EGF was nearly quantitative after 24 h of folding.
Figure 9:
Effect of GSSG and Cys-Cys on promoting
the disulfide formation during the folding of EGF. Unfolded EGF was
allowed to refold in the Tris-HCl buffer containing indicated
concentrations of GSSG or Cys-Cys. Folding intermediates were trapped
by acid and analyzed by HPLC. The recoveries of five different species
of EGF, namely 0-disulfide (R), 1-disulfide (I),
2-disulfide (II), 3-disulfide scrambled (III), and
native (N) species, are evaluated quantitatively from each
time-course trapped sample. Mixed disulfide species, which can be as
much as 5-12% of the total sample when folding was performed in
the presence of 0.5 mM GSSG or Cys-Cys, are not included in
the calculation.
In Vitro Folding of EGF Can Be Achieved Quantitatively
within 1 Min
The above findings suggested that both the speed of
EGF folding and the recovery of native EGF could be greatly improved
under optimized compositions of redox agents. To demonstrate this
potential, unfolded EGF was refolded in the Tris-HCl buffer containing
4 M sodium chloride and in the presence of the following redox
systems: (a) GSH/GSSG (4 mM/2 mM);
(b) Cys/Cys-Cys (4 mM/2 mM); and
(c) Cys/Cys-Cys (4 mM/2 mM) plus protein
disulfide isomerase (40 µM). Selection of these conditions
was intended to (a) allow head-to-head comparison of the potencies
of the GSH/GSSG and Cys/Cys-Cys systems and (b) assess
the efficacy of protein disulfide isomerase (Epstein et al.,
1963; Freedman, 1984; Bulleid, 1993). The outcome was judged by the
rate of recovery of native EGF. Cys/Cys-Cys was found to be
10-fold more effective than GSH/GSSG in promoting the formation of
native EGF. The improvement was multiplied by another 7-fold when 40
µM protein disulfide isomerase was added. Under these
optimized conditions, folding of EGF was complete within 1 min
(Fig. 10).
Figure 10:
HPLC chromatograms of the accelerated
folding pathway of EGF. The folding was carried out in the Tris-HCl
buffer (0.1 M, pH 8.5) containing 4 M NaCl and in the
presence of the following redox agents: A, Cys/Cys-Cys (4 mM/2 mM)
plus protein disulfide isomerase (40 µM);
B, Cys/Cys-Cys (4 mM/2 mM). The folding
intermediates were trapped by acid. The elution position of the fully
reduced EGF (R) is shown by an open arrow. III and II indicate the two
most dominant fractions of
3-disulfide (scrambled) and 2-disulfide EGF. Under conditions in
A, folding of EGF is complete within 1 min. Without protein
disulfide isomerase (conditions in B), quantitative folding is
achieved in 15-20 min. Significant concentrations of species
containing Cys mixed disulfides were shown to be present along the folding
pathway (the three peaks marked by arrows and eluted between
N and II). Those mixed-disulfide species apparently
exist in equilibrium with scrambled 3-disulfide intermediates, as their
concentrations are dependent upon the amounts of Cys-Cys applied.
Folding of EGF in the Presence of
Denaturants
Denaturants (8 M urea or 5 M GdmCl) were included in the folding
buffer in order to examine
their effects on the folding mechanism of EGF. EGF was allowed to
refold in the presence of 8 M urea without or with 0.25 mM
2-mercaptoethanol (8 M urea - and 8 M urea +). These two experiments
were repeated in the presence
of 5 M GdmCl (5 M GdmCl - and 5 M GdmCl +). Folding intermediates
were trapped by iodoacetic
acid and analyzed by HPLC. The results were compared to those obtained
from control experiments (control - and control +).
Figure 11:
Folding of
EGF in the absence (Control -) and presence (8 M urea -) of
denaturant. Reduced EGF was allowed to
refold in the Tris-HCl buffer alone (Control -) or in
the same buffer containing 8 M urea (8 M urea -)
("-" indicates that folding
was performed in the absence of 2-mercaptoethanol). Folding
intermediates were trapped by iodoacetate. Predominant fractions of
1-disulfide (peaks 16 and 17), 2-disulfide (peaks
3 and 6), and 3-disulfide scrambled (peaks 4 and
5) intermediates are indicated.
Comparison of the Folding Mechanisms of EGF and
Hirudin
The folding mechanism of EGF described here is
fundamentally indistinguishable from that observed with hirudin
(Chatrenet and Chang, 1992, 1993; Chang, 1994). Aside from the degree
of complexity of folding intermediates and the mode of their
progression along the pathway, both folding processes are also governed by the
redox potential in identical ways. Even the effect of denaturant on the
efficiency of EGF and hirudin folding is hardly distinguishable.
The most striking similarity, however, is the
formation of scrambled species and the mechanism by which they
accumulate in the control - folding experiments.
Scrambled EGF species are unable to reshuffle their non-native disulfides and
convert to the native structure unless free thiols are present as
catalysts. When folding is carried out in the buffer alone, free
cysteines of 0-, 1-, and 2-disulfide species function as thiol catalysts
during the early phase of folding. As the folding advances, more
cysteines become involved in disulfide pairing and fewer are
available as thiol catalysts. Therefore, scrambled species accumulate
and become trapped. The remarkable outcome is that, for both proteins,
the same percentage (40-50%) of the starting material ends up
being trapped in the scrambled states. Even under extremely favorable
folding conditions that involve optimized redox potentials, high
concentrations of scrambled EGF and hirudins (Chang, 1994) were still
observed.
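The depletion of thiol catalyst described above can be made concrete by counting free cysteines. This is a minimal sketch, assuming only that each disulfide consumes two of the six cysteines of EGF; the function and constant names are illustrative.

```python
TOTAL_CYSTEINES = 6  # EGF contains three disulfides, i.e. six cysteines


def free_thiols(n_disulfides, total=TOTAL_CYSTEINES):
    """Number of free cysteines left in a species with n_disulfides bonds."""
    if not 0 <= 2 * n_disulfides <= total:
        raise ValueError("disulfide count out of range")
    return total - 2 * n_disulfides


for n in range(4):
    print(f"{n}-disulfide species: {free_thiols(n)} free thiols")
# 0-disulfide species: 6 free thiols
# 1-disulfide species: 4 free thiols
# 2-disulfide species: 2 free thiols
# 3-disulfide species: 0 free thiols
```

A scrambled 3-disulfide species carries no free thiol of its own, so once the 0-, 1-, and 2-disulfide species are consumed, reshuffling depends on an external thiol such as 2-mercaptoethanol, GSH, or Cys.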
Comparison of the Folding Mechanisms of EGF and
BPTI
The mechanism of EGF folding also displays a number of
intriguing similarities and dissimilarities to that of the BPTI model
(Weissman and Kim, 1991). BPTI is a protein that consists of 58 amino acids and
three disulfides and is a model of protein folding that has been
characterized in detail (Creighton, 1978, 1990; Creighton and
Goldenberg, 1984). One conspicuous feature is that the HPLC pattern of
the folding intermediates of EGF (Fig. 3) closely resembles that
of BPTI (Weissman and Kim, 1991). Their similarities are as
follows: 1) the fully reduced species (R) and the correctly
folded native species (N) are widely separated on reverse phase HPLC,
with all folding intermediates eluted in between; 2) despite the
heterogeneity of minor species, there exist limited numbers (about five
to six) of well populated folding intermediates; 3) well populated
intermediates can be classified into three distinct groups (I, II, and
III). Group I is eluted near R. Groups II and III are close to N.
They progress along the folding pathway sequentially to reach the
native structure. Thus, during the folding, the diminishing of group I
is accompanied by the emergence of group II, which then gradually
disappears when group III begins to build up. For different reasons,
group III can become trapped and unable to convert to the native
structure; 4) for both EGF and BPTI, group I has been shown to comprise
1-disulfide species, and group II contains 2-disulfide species. One
major distinction between these two models is the nature of species
found in group III. They are identified as 3-disulfide scrambled
species in EGF, but have been characterized as 2-disulfide species
containing two native disulfides in the case of BPTI (Weissman and Kim,
1991). Most importantly, well populated folding intermediates of BPTI
have been shown to contain exclusively native disulfides (Weissman and
Kim, 1991). This finding has far reaching implications. It suggests
that the same interactions which stabilize the native structure also
guide the entire process (pathway) of folding (Rose, 1979; Oas and Kim,
1988; Kim and Baldwin, 1990). If this phenomenon were a general
rule that governs protein folding, one would expect it also to apply to
EGF. The results with EGF clearly indicate that there are exceptions.
-Cys and Cys-Cys), and they are not found in the
major 2-disulfide species. The only common characteristic shared by
these two non-native disulfides is that they are the smallest disulfide
loops, aside from Cys-Cys. Therefore,
some unidentified 1-disulfide species most likely act as productive
species that account for the flow between 1- and 2-disulfide
intermediates. Of the 2-disulfide species, one (II-A) contains two native
disulfides, and the other (II-B) contains two non-native disulfides
that are found in one of the scrambled species (III-A) as well. Here,
one may suggest a concise two-pathway model in which II-A converts to
the native species (on-pathway) and II-B goes to III-A (off-pathway),
which subsequently equilibrates with III-B and other minor scrambled
species (Fig. 8). This is a tempting conclusion, but there is no
proof for it, for two reasons. First, there is no evidence that II-A
transforms to N directly without undergoing additional disulfide
rearrangements. There are at least 15 fractions of minor 2-disulfide
species, and they exist in equilibrium with II-A and II-B. Second,
scrambled EGF species form reproducibly in significant quantities during
folding, regardless of whether folding is carried out under favorable
or unfavorable conditions. They cannot be simply dismissed as abortive
structures of "off-pathway" folding. In our opinion,
scrambled species are legitimate intermediates and passages to the
native structure. Thermodynamically (Anfinsen, 1973; Anfinsen et
al., 1961), their presence as folding intermediates is perfectly
logical.
Similarly, one cannot rule
out the possibility that I-A and I-B, although composed of non-native
disulfides, may adopt native-like structures. The safest statement one
can make from the analysis of trapped disulfide species concerns the degree
of heterogeneity of folding intermediates. Taking these arguments into
consideration, it would be premature to conclude that the folding of BPTI
and EGF is guided by divergent principles, despite sharp differences
in the disulfide structures of their well populated folding
intermediates.
Table:
Structures of the disulfide containing
peptides derived from the well populated folding intermediates of EGF
Table:
Recoveries of native EGF and hirudin under
different folding conditions
©1995 by The American Society for Biochemistry and Molecular Biology, Inc.