©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
The Structure of a 19-Residue Fragment from the C-loop of the Fourth Epidermal Growth Factor-like Domain of Thrombomodulin (*)

(Received for publication, April 24, 1995; and in revised form, June 12, 1995)

Marc Adler (§), Marian H. Seto, Danute E. Nitecki, Jiing-Huey Lin, David R. Light, John Morser

From Berlex Bioscience, Inc., Richmond, California 94804-0099


ABSTRACT

The solution structure has been determined for a 19-residue peptide that is fully folded at room temperature. The sequence of this peptide is based on the C-loop, residues 371-389, of the fourth epidermal growth factor-like domain of thrombomodulin, a protein that acts as a cofactor for the thrombin activation of protein C. Despite its small size, the peptide forms a compact structure with almost no repeating secondary structure. The results indicate the structure is held together by hydrophobic interactions, which in turn stabilize the two beta-turns in the structure. The first beta-turn in the C-loop represents a conserved motif that is found in the published structures of five other epidermal growth factor-like proteins. The critical role of Phe in the stabilization of the first beta-turn is consistent with mutagenesis data with soluble thrombomodulin. The results also show that a small subdomain of a larger protein can fold independently, and therefore it could act as an initiation site for further folding.


INTRODUCTION

Thrombomodulin (TM),(^1) an endothelial cell surface glycoprotein, binds thrombin and alters its specificity away from fibrinogen cleavage and toward the activation of protein C. The activation of protein C by thrombin is accelerated >1000-fold when TM is present as a cofactor. Generation of activated protein C, which inactivates factor Va and factor VIIIa, is an important anticoagulant mechanism of the endothelial cell surface (1, 2, 3).

TM is a multidomain protein that spans the endothelial cell membrane. Full cofactor activity is present in the soluble ectodomain produced by elastase (4). Several studies performed with mutagenesis or with peptides derived from the ectodomain of human TM have defined the domains required for activity. The smallest fragment with full cofactor activity for the activation of protein C by thrombin contains the last three consecutive EGF-like domains, EGF4-6 (5, 6). Recent studies suggest that the region of thrombomodulin that binds tightly to thrombin is distinct from the region that modulates the active site of thrombin. Significantly, a construct made from the fourth and fifth EGF-like domains, EGF45, retains approximately 10% of the cofactor activity, although binding to thrombin is drastically reduced (5).

Deletion mutants near EGF4 affect k/K for the thrombin-TM complex with protein C, and removal of the fourth domain results in a complete loss of cofactor activity. However, deletion mutants that include the C-loop of EGF6 have a normal k/K for protein C but decreased affinity for thrombin, as measured by the K for thrombin (6). A cyclic peptide based on the C-loop of EGF5 and the interdomain loop between EGF5 and EGF6 binds with high affinity to thrombin at the anion exosite, a positively charged groove on the surface of thrombin important for TM, fibrinogen, and hirudin binding (5, 7, 8). The results suggest that EGF56 contains the high affinity binding site for thrombin and EGF4 contains residues that are absolutely required for activity.

Site-specific mutants around EGF4, which result in low activity analogs, have defined some residues important for cofactor activity in this domain (9, 10). These include Asp in the interdomain region between EGF3 and EGF4, Glu and Tyr in the B-loop, and Phe in the C-loop of EGF4 (the numbering of the residues in this paper is consistent with the sequence of thrombomodulin given by Suzuki et al. (11)). Met in the interdomain region between EGF4 and EGF5 adjacent to the C-loop of EGF4 can be oxidized to the low activity methionine sulfoxide analog (12). Perhaps of more interest are EGF4 mutants that result in an increase in cofactor activity. Replacement of Met by leucine results in an analog with twice the cofactor activity of wild-type TM (13). When a second mutation, His → Gly, is introduced in the C-loop and combined with the Met → Leu mutation, the resultant TM analog has four times the activity of wild-type TM.

Clearly, one way these mutants could modulate cofactor activity is by altering the conformation of TM. To test this hypothesis, a set of cyclic peptides was synthesized based on the sequence of the individual loops of TM. Each loop contained a single disulfide. NMR was used to measure structures of these peptides in solution. It was hoped that a comparison between several peptides with single site mutations would shed light on the relationship between structure and function.

Of course, these structural comparisons could only be made if the peptides were folded. In our experience, most peptides of this size do not fold in aqueous solutions. Indeed, there are relatively few structures of peptides of this length listed in the Brookhaven Protein Data Bank. There are seven solution structures for peptides of less than 30 residues (Spring, 1994). All of these compounds contain at least two internal cross-links, which form the core of the structures. Most of the remaining residues form hairpin loops that are wrapped around a densely packed central core. During the course of this study, NMR was used to investigate the structure of nine different peptides. Each peptide contained approximately 20 residues and a single disulfide cross-link. The sequences were based on a single loop found either in TM or a homologous EGF-like protein. For most of the peptides, the spacing between the cysteines was much greater than in the loops found in the small peptides of the Protein Data Bank. Therefore, there must be a significant loss of conformational entropy during refolding. The experimental results indicated that the only peptides that formed a compact structure were based on the sequence of the C-loop of TM-EGF4 (Table 1). This was an unexpected result, since the peptides contained a long stretch of 13 amino acids between the cysteines. The experimental results were used to evaluate the structural homology between TM-EGF4 and other EGF-like proteins or domains, which in turn provided a working hypothesis that explains some of the functional data.




EXPERIMENTAL PROCEDURES

Materials

All peptides used in this study (Table 1) were synthesized by solid phase synthesis methods on a Biosearch 9600 Peptide Synthesis instrument using Fmoc (N-(9-fluorenyl)methyloxycarbonyl) as the temporary protection group and BOP (benzotriazolyl-L-oxy-tri(dimethylamino)phosphonium hexafluorophosphate) as the coupling agent; side chains were protected by groups that were appropriate for this chemistry. The peptides were cleaved by trifluoroacetic acid containing small amounts of anisole, thioanisole, and ethanedithiol and were purified by preparative high pressure liquid chromatography on a reverse phase C18 column using 0.1% trifluoroacetic acid in an acetonitrile/water gradient. The cyclization was achieved in dilute water/dimethyl sulfoxide solution, adjusted to pH 8, in 4 days or until Ellman's reaction was negative. The cyclized material was purified on a preparative Polymer Labs PLRP-S 300Å column. Mass spectrometry was used to analyze the purity of the peptides before and after cyclization.

Expression Plasmids and Site-directed Mutagenesis

Construction of Escherichia coli expression plasmids and procedures used for site-directed mutagenesis to construct plasmids coding for TM-M388L mutants were previously described (10). Construction and expression of TM(E) mutants in Chinese hamster ovary cells were performed as previously described (12, 14).

Preparation of Periplasmic Extracts, TM Cofactor Activity Assay, and ELISA Assay

Preparation of E. coli periplasmic extracts and the TM cofactor activity assay (APC assay) were previously described (10). The ELISA was performed essentially according to Clarke et al. (13) with minor modifications. In the current assay, the capture monoclonal antibody 43B has been replaced by monoclonal antibody 531, which was previously used as the detection antibody (13). The detection antibody is a rabbit polyclonal anti-TM(E) (14). The relative specific activities of the mutants and the wild-type were determined under these conditions by the ratio of cofactor activity determined by the APC assay to mass determined by ELISA. Each clone of each mutant was assayed for specific activity between 4 and 17 times. All data were included in the determination of the significance of difference using a paired Student's t-test following guidelines provided with the program Statview.

NMR Data

All NMR measurements were made on a Bruker AMX 500. A 90° pulse width of approximately 9.5 μs was used in the experiments. The peptides were dissolved in 95% H2O and 5% D2O. The pH was adjusted to 6.8 ± 0.1. The sample concentrations ranged from 12 mM for the [Met]Mso peptide to 2 mM for the Met → Leu peptide. The sample concentration for the double mutant was 9 mM. No buffers or salts were used in any of the solutions. Water suppression was done using selective presaturation. A SCUBA pulse sequence (15) was added to both the DQF-COSY (16, 17, 18) and NOESY (19, 20) experiments. Except where noted, NOESY spectra were collected using a 200-ms mixing time. A separate spectrum, described below, was used to quantitate the NOE peak intensity. Clean total correlation spectra (21) were collected using a 60-ms MLEV-17 (22) spin lock. Phase-sensitive detection along ω1 was achieved using time-proportional phase incrementation as described by Marion and Wüthrich (23). For most experiments, 2048 complex data points were collected in the t2 direction, and 768 real points were collected along t1.

The sequential proton assignments of the four peptides were obtained using standard homonuclear techniques (24). The spectra were processed using UXNMR (Bruker Instruments), and the results were analyzed using a software package developed from the algorithms outlined by Adler and Wagner (25). Chemical shifts were somewhat arbitrarily assigned to be consistent with known values for both proteins and water. The chemical shift of water was assigned at 4.89 ppm at 3 °C and 4.66 ppm at 23 °C. Unless otherwise noted, the chemical shifts are reported for 3 °C at pH 6.8.

Structural Constraints

The HN-CαH coupling constants were measured directly from the DQF-COSY spectra without any correction for line width. Dihedral angle constraints were obtained for six φ angles (Fig. 1) using the following method. It was assumed that the observed coupling constants were accurate to ±1.5 Hz and that the maximum coupling should not exceed 10.5 Hz. A computer program was written that used the Karplus equation as parameterized by Pardi et al. (26) to calculate all values of φ that would match the coupling constant within experimental error. The maximum and minimum values of φ were then used as dihedral angle constraints. This method produces a continuous function that nearly matches the values used by Kline et al. (27) for 8.0 and 10 Hz. The method sets a range of -77° to -53° for the φ angle of Glu, which had a coupling constant of 4.9 Hz. The coupling constants of both Leu and Phe, 7.4 and 7.5 Hz, respectively, were also used for calculating restraints using the same program. A restriction was added to the calculation that rejected any positive values of φ. The intraresidue HN to CαH NOE peak for both of these residues indicated an interproton distance of approximately 3 Å, which is inconsistent with a positive value of φ. The resulting constraints were -165° ≤ φ ≤ -75° for Phe and 1° wider for Leu.
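The bracketing procedure described above can be sketched in a few lines. This is a minimal illustration, not the authors' program: it assumes the commonly quoted Pardi form of the Karplus relation, J(φ) = 6.4 cos²(φ - 60°) - 1.4 cos(φ - 60°) + 1.9, a 1° search grid, and a scan window restricted to negative φ. The function names and the window parameters are hypothetical.

```python
import math

# Karplus coefficients for 3J(HN-CaH), Pardi-style parameterization (assumed):
# J(phi) = A*cos^2(phi - 60) - B*cos(phi - 60) + C
A, B, C = 6.4, 1.4, 1.9


def karplus_j(phi_deg):
    """Predicted 3J(HN-CaH) in Hz for a backbone phi angle in degrees."""
    c = math.cos(math.radians(phi_deg - 60.0))
    return A * c * c - B * c + C


def phi_bounds(j_obs, err=1.5, j_cap=10.5, phi_lo=-180, phi_hi=0):
    """Return (min, max) phi on a 1-degree grid inside [phi_lo, phi_hi]
    whose predicted J lies within j_obs +/- err.

    The upper J limit is capped at j_cap. The default window rejects
    positive phi, as was done in the text for Leu and Phe.
    """
    lower = j_obs - err
    upper = min(j_obs + err, j_cap)
    hits = [phi for phi in range(phi_lo, phi_hi + 1)
            if lower <= karplus_j(phi) <= upper]
    if not hits:
        return None
    return min(hits), max(hits)


print("Phe, 7.5 Hz:", phi_bounds(7.5))
print("Leu, 7.4 Hz:", phi_bounds(7.4))
```

Under these assumptions the sketch gives the Phe range quoted in the text and a Leu range 1° wider on each side. For the 4.9 Hz coupling of Glu, a scan over all negative φ also admits a second branch near -180°. The quoted -77° to -53° range is recovered only if the window is narrowed, for example with `phi_lo=-120`. How that branch was chosen is not stated in the text.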


Figure 1: A graph that shows the sequential NOEs that were used in making the proton assignments. The right column identifies the type of NOEs. d_XY signifies an NOE between the X proton in the first residue and the Y proton in the second residue. The NOEs are listed for sequential residues except where noted. The widths of the black lines are proportional to the NOE intensity. For the three prolines, NOEs to the CH were used in place of NOEs to the HN. The symbol () is used to denote NOEs that could not be observed for either practical or theoretical reasons, such as potential overlap or residues that are missing a proton. The bottom line of the graph lists some of the observed ³J coupling constants. Symbols are used as follows: alpha indicates ³J ≤ 5.5 Hz, b indicates 8.0 Hz ≤ ³J < 10.0 Hz, and beta indicates 10.0 Hz ≤ ³J. The use of the symbols alpha and beta does not imply that the residue is part of an alpha-helix or a beta-sheet.



NOE peak intensities were quantified from a NOESY spectrum of the double mutant (H381G,M388L). The recycle delay between pulses was 3.6 s. A 100-ms mixing time was used to limit the artifacts caused by spin diffusion. The spectrum was processed using a 75°-shifted sine bell. The base line was flattened with a 5th order polynomial subroutine in both directions on the fully transformed data set using the Bruker processing software UXNMR. This subroutine has an automatic selection of base-line points. Correction along the F2 axis was performed in eight uneven sections to minimize the distortions caused by the dispersive water signal.

The distance constraints were calculated from the peak intensity. A correction factor was included, which controlled for variation of peak width, based on the relative width of a resonance in the F2 direction compared with the width of the peaks used for calibration. Eight well-resolved methylene pairs were used to calculate the scaling factor between the peak volumes and the target distances. The variation in intensity between these peaks was 20%. The standard equation for translating peak volumes into target distances was modified so that the distances were lengthened to compensate for any experimental uncertainty. First, the volume of each peak was divided by two to compensate for variations in both peak width and intensity. Also, it was assumed that the volumes of the weaker peaks were less accurate than the more intense ones. This uncertainty was handled mathematically by using a modified exponent in place of the standard 1/6 power to calculate distances from peak volumes. The final effect on the experimental data was that the target distances of 2.2, 2.5, 3.0, 3.5, and 4.0 Å were lengthened to 2.6, 3.1, 3.8, 4.6, and 5.0 Å, respectively (an upper limit of 5.0 Å was used for all observed NOEs). The accuracy of the modified distance function was verified by examining the target values for both intraresidue and sequential HN to CH NOEs. The ranges of distances calculated from NOE peaks were roughly 20% larger than the expected values. The calculated distances obtained from the 100-ms NOESY spectrum ranged from 2.5 to 4.7 Å. An additional 0.7 Å was added to all NOEs involving methyl groups (27). Lower bound constraints were set to the van der Waals contact radii.
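The lengthening of target distances can be illustrated with a short sketch. The exponent and the calibration distance are not legible in this text, so both values below are assumptions. A 1/5 power and a 1.8 Å methylene proton separation were chosen only because, combined with the halved volume and the 5.0 Å cap, they happen to reproduce the five quoted distances. All names are hypothetical.

```python
D_REF = 1.8       # assumed methylene H-H calibration distance, in angstroms
EXPONENT = 1 / 5  # assumed softened exponent (the usual relation uses 1/6)
UPPER_CAP = 5.0   # upper limit applied to all observed NOEs (from the text)


def ideal_volume(distance, v_ref=1.0):
    """Peak volume expected at a given distance under an r^-6 dependence,
    relative to a calibration peak of volume v_ref at D_REF."""
    return v_ref * (D_REF / distance) ** 6


def upper_bound(volume, v_ref=1.0):
    """Conservative upper-bound distance from an NOE peak volume."""
    halved = volume / 2.0                      # volume divided by two
    d = D_REF * (v_ref / halved) ** EXPONENT   # softened distance relation
    return min(d, UPPER_CAP)                   # cap at 5.0 angstroms


for target in (2.2, 2.5, 3.0, 3.5, 4.0):
    print(target, "->", round(upper_bound(ideal_volume(target)), 1))
```

Because the exponent is softer than 1/6, weaker peaks are stretched proportionally more than intense ones, which matches the stated intent of treating weak peaks as less accurate.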

No stereospecific assignments were performed for the methylene protons. If a pair of NOE peaks was observed between the two protons of a diastereotopic pair and a third proton, the weaker NOE peak was used to calculate the distance constraint. When only one NOE was observed, the distance constraint was referenced back to the nearest heavy atom that was equidistant from both protons. The distance constraint was lengthened by the fixed distance between the protons and the heavy atom. In general, the preliminary structures were not used to further interpret the spectra due to the inherent uncertainty involved with these techniques. However, some of the NOEs to the CH positions of the two phenylalanines were stereospecifically assigned when the preliminary structures indicated a separation of greater than 7 Å between protons that had NOEs to the same Phe CH.

The issue of spin diffusion was not explicitly addressed during the preparation of the constraints. NOEs that involved a pair of methylene protons were examined for evidence of spin diffusion. We specifically looked at pairs of NOEs where one of the two NOEs was very intense and could be a source of spin diffusion. In all cases, the inter-proton distance calculated for the weaker NOE was confirmed by an independent structural constraint.

Structure Calculations

All structures were generated using the distance geometry package, DGII (version 2.2.0), of InsightII, which was generously provided by BIOSYM, Inc. After smoothing the bound matrix using the triangle inequality, the individual structures were embedded and subjected to a maximum of 10,000 steps of simulated annealing at a maximum temperature of 200 K. The resultant structures were minimized by conjugate gradient minimization using a maximum of 250 steps. Of 100 structures, 34 were selected for making comparisons. These structures had residual penalty functions below the mean structure and had no NOE violations greater than 0.2 Å. All peptides examined had the same fold. The structures with higher residual penalty functions showed greater deviations from the mean.
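The selection step can be sketched as a simple filter. This is an illustration only. It interprets "below the mean" as below the ensemble-average penalty, which is one possible reading of the text. The record layout and the example values are made up and do not come from DGII output.

```python
from statistics import mean


def select_structures(structures, max_violation=0.2):
    """Keep structures whose residual penalty is below the ensemble mean
    and whose largest NOE violation does not exceed max_violation (in
    angstroms)."""
    cutoff = mean(s["penalty"] for s in structures)
    return [s for s in structures
            if s["penalty"] < cutoff
            and s["max_noe_violation"] <= max_violation]


# Made-up ensemble for illustration (not data from the study).
ensemble = [
    {"id": 1, "penalty": 0.8, "max_noe_violation": 0.10},
    {"id": 2, "penalty": 2.5, "max_noe_violation": 0.15},
    {"id": 3, "penalty": 0.9, "max_noe_violation": 0.35},
    {"id": 4, "penalty": 1.0, "max_noe_violation": 0.18},
]

print([s["id"] for s in select_structures(ensemble)])
```

In this toy ensemble, structure 2 is rejected on penalty and structure 3 on its NOE violation, leaving structures 1 and 4.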


RESULTS

New Mutants in the EGF4 Domain

It was previously shown (10) that there was a drastic loss of activity when alanine was substituted for the following residues in EGF4 of TM: Glu and Tyr in the B-loop and Phe in the C-loop. Working with a construct made from EGF4-6 of TM, TM-M388L, it was shown that full function was restored by conservative substitution of the two aromatic residues. Thus, both Tyr → Phe and Phe → Tyr were fully active. Conservative substitution of Glu partially restored function. Glu → Gln had only 1 ± 2% of native activity, and Glu → Asp restored 48 ± 15% of the function (p < 0.05). Thus, the functional assay indicates there is little tolerance for variation at Glu, and major changes are not tolerated at positions Tyr and Phe.

A fourth set of substitutions was made in an attempt to increase activity. His is only found in human TM. Glycine is found in this position in both mouse and hamster TM. Comparison with homologous proteins indicates that there was probably a beta-turn at this position. Experiments demonstrated that His → Gly doubled the specific activity of human TM-M388L. The double mutant, H381G,M388L, has 400% of the activity of soluble TM analogs with the native human sequence. The substitution of His → Ala (10) had no effect. His → Pro actually caused a slight decrease in activity (60 ± 28%).

NMR Structural Studies

Three separate peptides were synthesized based on the sequence of TM-EGF4: the A-loop, residues 345-361; the B-loop, residues 352-371; and the C-loop, residues 371-389. Each peptide was cyclized by forming a single native disulfide bond. For the A- and the B-loops, 2-aminobutyric acid was substituted for the unpaired cysteine. The NMR results indicated that the A- and B-loops did not fold into compact structures. The two-dimensional NMR spectra of these peptides showed that the chemical shifts of the protons were similar to the random coil values for the same amino acids (28). Also, the ³J coupling constants were all within 1 Hz of the random coil values with the single exception of 11 Hz for Val.

Exhaustive analysis of the NOESY spectra of both the A- and B-loops revealed only eight NOEs that connected residues separated by at least one amino acid; two bridged across the disulfide bonds, one connected the HNs of Asn to Tyr, and the remaining five involved residues separated by a single amino acid. Only two of the eight NOEs involved backbone-backbone interactions. Structure calculations utilizing the combined experimental constraints from both peptides failed to converge on a unique structure.

The solution structures of three other loops from EGF-like proteins were examined: the C-loop of TM-EGF5, the C-loop of transforming growth factor-alpha, and the B-loop of the human urokinase EGF domain. Inspection of the two-dimensional TOCSY spectra indicated that there was little chemical shift dispersion of the protons beyond what was expected based on the random coil values (28). No further analysis was attempted. The results from the C-loop of TM-EGF5 were confirmed by a recent report (8). Although the peptide is unfolded in solution, it forms a unique conformation upon binding to the anion exosite of thrombin.

Two-dimensional spectra of the peptide based on the C-loop of TM-EGF4 indicated that the peptide did form a compact structure. In particular, the chemical shifts of the amide protons ranged from 7.5 to 9.3 ppm (Table 2). The comparable range for the same protons in unfolded peptides would be 8.2-8.4 ppm (28). To probe further the relationship between structure and function, a total of four peptides derived from the C-loop of EGF4 were synthesized (Table 1). Although the peptides folded into compact structures, the isolated peptides had no measurable effect on modulating the activation of protein C by thrombin when tested alone as a cofactor for thrombin or as a competitive inhibitor of the action of thrombomodulin on thrombin, even at concentrations as high as 5 mM.(^2) The activity measurements shown in Table 1 were performed by incorporating the sequence of each of these peptides back into a truncated but fully active form of thrombomodulin containing the fourth, fifth, and sixth EGF-like domains, TM-M388L. The activity, as a percentage of the specific activity of the TM native sequence, ranged from 400 to 10%. The most detailed structural work was performed on the double mutant, H381G,M388L, since this represents the most active sequence. The two-dimensional spectra indicated that all four peptides had the same overall fold (see below for details).



Proton Chemical Shift Assignments

The proton chemical shifts for the native sequence appear in Table 2. Selected residues from the other peptides are also listed. Assignments have been made for 113 of 115 slowly exchanging and nonexchanging protons in the native sequence. Assignments of the remaining protons were probably obscured by degenerate protons within the same residue. Fig. 1 shows the sequential NOEs that were used in making the assignments.

Structure of the Peptide Based on the Double Mutant, H381G,M388L

The structure reported here is obtained from the peptide based on the double mutant at 3 °C (pH 6.8). A total of 213 NOEs was assigned for the double mutant. Approximately 21 of the 48 intraresidue NOEs and 20 of the 85 sequential NOEs had upper bound distances that were greater than the distance allowed by the covalent geometry. This left a total of 172 useful NOEs, which had the following distribution: 27 intraresidue, 65 sequential, 26 medium range (i to i + 2-5), and 54 long range (i to i + >5) NOEs. Torsion angle constraints for the φ angle were derived from eight HN-CαH coupling constants. This gives an average of 9.5 useful constraints per residue. The structure itself is shown in Fig. 2A.
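The constraint bookkeeping in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted above plus the 19-residue length from the title. The variable names are illustrative.

```python
# Counts quoted in the text for the double mutant.
assigned_noes = 213
intraresidue_total, intraresidue_uninformative = 48, 21
sequential_total, sequential_uninformative = 85, 20

# NOEs whose upper bound exceeds what covalent geometry already allows
# carry no structural information and are dropped.
useful_noes = assigned_noes - intraresidue_uninformative - sequential_uninformative

by_range = {
    "intraresidue": intraresidue_total - intraresidue_uninformative,
    "sequential": sequential_total - sequential_uninformative,
    "medium": 26,
    "long": 54,
}

dihedral_constraints = 8
residues = 19

constraints_per_residue = (useful_noes + dihedral_constraints) / residues
print(useful_noes, by_range, round(constraints_per_residue, 1))
```

The category counts sum to the 172 useful NOEs, and adding the eight torsion constraints gives roughly 9.5 constraints per residue, in agreement with the figure stated in the text.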


Figure 2: A, a stereo view of the double mutant showing the heavy atoms, without the carbonyl oxygens. The structure presented was judged the best based on residual value of the penalty function and how close the coordinates were to the average structure. B, a superposition of the 20 best structures. The side chain of Val and the guanidinium group of Arg have been omitted for clarity.



The root mean square deviation of the well determined backbone atoms is 0.6 Å to the average structure and 0.9 Å for the pair-wise comparisons (Fig. 2B). This figure excludes the N-terminal Val because there is almost no structural information for this residue. The root mean square deviation for all heavy atoms to the average structure is 1.3 Å (1.8 Å for the pair-wise comparisons). The side chain conformation has been accurately determined for Phe, Ile, His, and Gln. Constraints on the protein backbone also determine the locations of the side chains of Ala, Ala, Pro, Pro, and Pro. Less information is available for the other side chains.

The Structure of the C-loop of EGF4 in Solution

The final structure is depicted in Fig. 2A. The molecule forms a loop-like structure that is bracketed on either side by the two well defined beta-turns. The first turn, which extends through residues Ala, Glu, Gly, and Phe, is a type II beta-turn. There is a well defined hydrophobic pocket surrounding residue Phe. The phenylalanine side chain is flanked by, and has NOEs to, Cys, Ala, Ala, Cys, and Leu (Met in the native sequence). These residues limit the solvent exposure of the aromatic ring to the outer edge.

The second bend includes residues Ile, Pro, Gly, and Glu. Both type I and II beta-turns are compatible with the experimental data for the double mutant. The intensity of intraresidue NOEs between the HN and CHs of Gly would clearly resolve the ambiguity if there were stereospecific assignments available for the CHs. Unfortunately, the stereospecific assignments could not be determined in a reliable fashion. The other three peptides, including the native sequence, all have histidine at the third position of this turn, and all three exhibit a type I beta-turn. This beta-turn is part of a five-residue insertion, Pro to His, in the sequence of this EGF-like domain (Table 3).



The beta-turn is stabilized by hydrophobic interactions that are centered on Ile. Only the outer edges of the methyl groups are exposed to the solvent. The side chain of the Ile residue is covered by the methylene side chains of residues Pro, Glu, Arg, and Gln. The close interaction between these side chains probably adds to the stability of the protein.

A third, less well defined type I beta-turn appears between residues Glu, Pro, His, and Arg. The chain itself forms a right angle turn through this bend. This geometry distorts the conformation of Glu and weakens the hydrogen bond between the CO of Glu to the HN of Arg.

The three prolines, Pro, Pro, and Pro, all have trans peptide bonds. There was no detectable amount of any folded species that contained a cis peptide bond. All three prolines are located at bends in the protein backbone. The prolines are all involved in delineating the second beta-turn. This is part of a five-residue insertion in the sequence of the C-loop (Table 3).

There are three hydrogen bonds that can be easily identified in the structure. Two of the hydrogen bonds are found in the beta-turns: CO Ala to HN Phe and CO Ile to HN Glu. A third hydrogen bond is found between the CO of Ala and the HN of Gln. This hydrogen bond appears where the protein backbone crosses back upon itself (Fig. 2A). Other potential hydrogen bonds may exist in this structure but cannot be identified given the resolution of the structures. Indeed, the peptide appears to have only a few internal hydrogen bonds that could contribute to the stability of the structure.

Structural Homology

Table 3 lists the sequences of the C-loops from the five homologous proteins or domains whose atomic coordinates were available. The sequence alignment is based upon the structural comparison depicted in Fig. 3. Ten residues were identified as playing the same role in all six peptides. They are in TM-EGF4 (the peptide used in this study), the first cysteine, the first beta-turn, the next two residues after it (Cys-Pro), the second cysteine and its preceding residue (Arg and Cys), and finally Met, which covers one side of Phe. The pair-wise root mean square deviation of the backbone atoms for these 10 residues is 1.0 Å. The comparable figure for the uncertainty in our own structures is 0.7 Å. Furthermore, the structural homology extends to the orientation of the peptide bonds and the C. It should be noted that the conserved bend is a type II beta-turn with an X-X-Gly-(Phe/Tyr) sequence; the first two positions are variable, the third position is glycine, and the fourth position is either phenylalanine or tyrosine. This sequence is found in many proteins that are homologous to EGF.


Figure 3: The superimposed structures of the C-loop from TM-EGF4 (double mutant) with the C-loop from the five other EGF-like proteins shown in Table 3. The two loops that extend out on the left side are labeled for TM-EGF4 and FXa-C, respectively. The superposition was done using the backbone atoms from structurally homologous residues shown in bold in Table 3. Side chains are shown for the residues that interact with the first beta-turn.



The sequence of the structurally conserved residues can be described as Cys-X-X-Gly-(Phe/Tyr)-X-X . . . X-Cys . . . X. The first gap contains between one and six residues; the second gap contains either one or two. The structural similarity in the last position is surprising. The charge, hydrophobicity, and size of this residue vary between the proteins listed in Table 3. Also, TM-EGF4 and FXa-C, the two longest C-loops, exhibit a one-residue deletion prior to this residue. The side chain of this residue is in close contact with the conserved aromatic residue in the first beta-turn and appears to limit the ring's exposure to solvent. This interaction must be important to the stability of the protein, since the interaction is maintained despite the large variation of sequence in this position. In fact, there is little conservation of the sequence for six out of ten conserved positions (Table 3).

All six C-loops exhibit a second chain reversal shown on the left side of Fig. 3. Four of the proteins share the same overall length of the C-loop. The fold of this chain reversal is conserved in each peptide. FXa-C and TM-EGF4 have four- and five-residue insertions, respectively, between the cysteines. Both proteins accommodate this insertion in roughly the same manner (Fig. 3). The results show that this bend in the structure accommodates considerable variations in both sequence and structure.

Temperature Studies

A series of one-dimensional NMR spectra was obtained for the double mutant from 8 to 65 °C (data not shown). The peptide retained sufficient structure to protect the amide protons from exchange with solvent up to 40 °C. The HN resonances disappeared in a cooperative fashion above 40 °C and were nearly invisible at 50 °C. The chemical shifts of all resonances changed in a continuous manner from 8 to 65 °C. This indicates that there was a rapid exchange between the folded and unfolded conformations. Once the temperature was lowered from 65 °C back to 8 °C, the protein completely refolded and showed no sign of any degradation.

A more detailed study of temperature effects was carried out using two-dimensional NMR. A NOESY spectrum of the double mutant was collected at 23 °C and visually compared to the corresponding data obtained at 3 °C. Although the intensities of the cross-peaks were attenuated at the higher temperature, there was no evidence of any detectable change in conformation.

DQF-COSY spectra were obtained for all four peptides (Table 1) at both 3 and 23 °C. The similarity of chemical shifts again indicated that the conformation remained intact. There were, however, consistent changes in CH of residues 376-378 and 386-388 (Fig. 4). These residues form an antiparallel structure. Some of the chemical shift perturbations extended to the side chains that participate in the hydrophobic pocket surrounding residue Phe (Fig. 4). The elevated temperature had little effect on the aliphatic protons located near beta-turns, indicating that the beta-turns were stable at the higher temperature and there was no global unfolding of the peptide.


Figure 4: The structure of the double mutant showing the side chain heavy atoms. Each atom has been depicted in gray scale to show the absolute value of the change in chemical shift when the temperature was raised from 3 to 23 °C. The darkest gray represents no change; white represents a change of greater than 0.13 ppm. The gray scale of the heavy atoms is determined by the average of their attached proton(s) or by the average of the nearest assigned proton(s). The ribbon is shaded using the changes in the CH chemical shifts.



Single Site Mutations

Table 1 shows the sequences and relative activities of the four peptides used in this study. A preliminary examination of the chemical shift data and the 200-ms NOESY spectra indicated that all four peptides had the same structure. These results were unexpected, since there is a 40-fold difference in activity when these sequences are incorporated back into the parent protein (Table 1).

A detailed comparison was made between the 200-ms NOESY spectra (3 °C) of the double mutant and the peptide with oxidized Met ([Met]Mso). These peptides represent the most and least active sequences. Of the original 213 NOEs assigned for the double mutant, only two were found missing for the [Met]Mso peptide. Both NOEs involved the backbone protons of Phe. The remaining 17 NOEs to Phe were found in both peptides. The [Met]Mso peptide also had some new NOEs not found for the double mutant. Observation of the new NOEs probably stemmed from the higher sample concentration used for the [Met]Mso peptide. NOESY spectra of the native peptide and the Met → Leu peptide were also very similar to the corresponding data for the double mutant. Each spectrum contains nearly the same set of NOEs for the side chains of residues 388 and 389. However, lower sample concentration precluded a more quantitative examination of the data.
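The bookkeeping behind such a comparison of NOE lists amounts to a set comparison. The sketch below is only an illustration of that idea: the representation of an NOE as a pair of proton labels, the function names, and the placeholder labels in the usage note are hypothetical and do not reproduce the assignment tables of this study.

```python
def normalize(noe):
    """Order the two proton labels so that (a, b) and (b, a) compare equal."""
    return tuple(sorted(noe))


def compare_noes(reference, other):
    """Split two NOE lists into shared, missing, and new cross-peaks.

    `reference` and `other` are iterables of (proton_label, proton_label)
    pairs. "missing" holds NOEs seen only in the reference spectrum;
    "new" holds NOEs seen only in the other spectrum.
    """
    ref = {normalize(noe) for noe in reference}
    oth = {normalize(noe) for noe in other}
    return {"shared": ref & oth, "missing": ref - oth, "new": oth - ref}
```

For example, `compare_noes([("A", "B"), ("C", "D")], [("B", "A"), ("E", "F")])` reports the A/B cross-peak as shared, C/D as missing, and E/F as new, where the single-letter labels stand in for real proton assignments.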

Similarity in the structures of the four peptides is also demonstrated by comparing the chemical shifts. The His → Gly substitution did not significantly affect (±0.08 ppm) the chemical shifts of the protons beyond a 5-Å radius of the site of the modification. The oxidation of the Met also had very minor effects.
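The chemical shift criterion can be expressed as a simple filter. The following is a minimal sketch, assuming each proton is described by its coordinates in the folded structure and by its chemical shift in the two peptides being compared; the data layout and the function name are illustrative assumptions, and only the ±0.08 ppm tolerance and the 5-Å radius come from the text.

```python
import math

SHIFT_TOLERANCE_PPM = 0.08  # tolerance quoted in the text
RADIUS_ANGSTROM = 5.0       # radius quoted in the text


def perturbed_beyond_radius(protons, site_xyz):
    """Return protons outside the radius whose shift differs significantly.

    `protons` maps a proton label to (xyz, shift_a, shift_b), where xyz is
    an (x, y, z) tuple in angstroms and the shifts are in ppm for the two
    peptides. `site_xyz` is the position of the modified residue.
    """
    flagged = []
    for name, (xyz, shift_a, shift_b) in protons.items():
        distance = math.dist(xyz, site_xyz)
        shift_change = abs(shift_a - shift_b)
        if distance > RADIUS_ANGSTROM and shift_change > SHIFT_TOLERANCE_PPM:
            flagged.append(name)
    return flagged
```

An empty list returned by such a filter would correspond to the observation that no proton beyond 5 Å of the modification shifted by more than 0.08 ppm.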

It is worth noting that the spectra of the oxidized [Met]Mso peptide indicated the presence of two closely related peptides, even though the compound was pure, as judged by high performance liquid chromatography and mass spectrometry. At 3 °C, there was measurable splitting of all the resonances of the methionine sulfoxide, Mso, and in the backbone protons of Gln and Phe. Within the accuracy of the data, the intensities of both sets of peaks were the same. At 23 °C, this splitting became more pronounced and affected additional residues in the hydrophobic pocket around Phe. The mono-oxidation of S of methionine introduces a chiral center at the sulfoxide. Since the peptide was made with a synthetically prepared derivative of methionine, it contained a racemic mixture of both R and S forms of methionine sulfoxide at the S position. These results indicate that each enantiomer has a slightly different conformation.


DISCUSSION

Protein Folding and Stability

The structure presented here (Fig. 2A) is in some ways typical for small peptides. It contains only a few hydrogen bonds, and its structure is dominated by beta-turns. However, the protein is uncharacteristically flat and extended. Phe is the only residue that has close interactions with more than one other strand of the protein. The small amount of interior volume in the protein is defined by the packing of the side chains and not by the backbone. This protein lacks the tightly coiled structure that characterizes the other small proteins that appear in the Protein Data Bank. It is hard to judge whether this flat structure will be found in other isolated peptides that contain a single disulfide loop. However, five of the six loops examined in this study failed to form a compact structure.

As discussed in the results section, the two beta-turns are stabilized by the formation of hydrophobic pockets. It is worth noting that the side chains of two other hydrophobic residues, Val and Phe, do not interact with the hydrophobic pocket surrounding Phe, even though their backbone residues are close to this pocket (Fig. 2A). A possible explanation for these results can be found by examining the structure of the homologous proteins. If Val and Phe are compared to homologous residues in other EGF-like proteins (Table 3), the corresponding amino acids do not participate in stabilizing this hydrophobic cluster. The results imply that the structural constraints that control folding of the intact protein are somehow encoded in the isolated C-loop.

Finally, the structure of this peptide has some interesting implications for protein folding. It clearly shows that a subdomain of a larger protein can act as an autonomous folding unit. The temperature shift data imply that the two beta-turns are more stable structures and, therefore, may guide the folding of the peptide. However, this peptide is small enough that it could find the correct structure by a random search of conformational space, and folding may take place in a single cooperative step. The folded C-loop may then act as a template that guides the subsequent steps in protein folding. The results suggest that the folding of the backbone and the side chains can take place concurrently. Overall folding of the protein may consist of a series of precise events with intermediates that have a well defined structure.

Structure and Function

When the sequences of the four peptides are incorporated back into TM, there is a 40-fold difference in activity (Table 1 and (13)). The NMR work presented here indicates that there is no detectable difference between the structures that can be correlated with the function. Therefore, if we hope to explain the functional data, we must examine parts of the molecule that lie outside the C-loop.

Previous work has identified five residues in or near TM-EGF4, whose substitution by alanine decreases the activity of TM by more than a factor of four: Asp, Glu, Tyr, Phe, and Met (9, 10, 13). Asp is found in the interdomain loop N-terminal to EGF4. Residues Glu and Tyr in the A-loop are located in the three-residue loop between the second and third cysteine. Phe is in the C-loop, and Met is in the comparatively short three-residue interdomain loop C-terminal to EGF4.

A potential explanation for these functional data can be found by examining the structure of the homologous protein domain, FXa-C, the C-terminal EGF domain of factor Xa (29). Of the five proteins available for structural comparisons (Table 3), this protein comes closest to matching TM-EGF4 in the size of the critical loops, including matching the spacing between the second and third cysteine. The work presented here has shown that Phe and Met in TM-EGF4 are directly homologous to Tyr and Pro in FXa-C (the numbering of residues for FXa-C is the same as that used by Padmanabhan et al. (29)). Based on the relative location of the second and third cysteines, Glu and Tyr should be directly homologous to residues Asp and Gln in FXa-C. These four residues form a contiguous patch on the surface of FXa-C. This patch accounts for roughly half of the contact area between FXa-C and the serine protease domain. A similar interface involving an EGF-like domain is also found in prostaglandin H2 synthase-1 (34). It is quite possible that TM-EGF4 forms a ternary complex with thrombin and/or protein C using a similar binding motif. Therefore, mutations in residues Glu, Tyr, Phe, and Met would perturb the formation of this complex. However, without more direct experimental information, this model must be treated as a working hypothesis.


FOOTNOTES

*
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The atomic coordinates and structure factors (code 1tmr) have been deposited in the Protein Data Bank, Brookhaven National Laboratory, Upton, NY.

§
To whom correspondence should be addressed: Berlex Bioscience, Inc., 15049 San Pablo Ave., Richmond, CA 94804-0099. Tel.: 510-669-4023; Fax: 510-262-7844.

(^1)
The abbreviations used are: TM, thrombomodulin; DQF-COSY, double quantum-filtered correlation spectroscopy; EGF, epidermal growth factor; FXa-C, C-terminal EGF-like domain of human factor Xa; Mso, methionine sulfoxide; NOE, nuclear Overhauser effect (an NOE peak between two protons indicates a separation of less than 5 Å); NOESY, two-dimensional nuclear Overhauser enhancement spectroscopy; TM(E),M388L, a deletion mutant of soluble thrombomodulin consisting of all six EGF-like domains; TM-EGF4, the fourth EGF-like domain in thrombomodulin; TM-M388L, a deletion mutant of soluble thrombomodulin consisting of the last three EGF-like domains, the preceding interdomain loop, and the Met → Leu point mutation; ELISA, enzyme-linked immunosorbent assay.

(^2)
G. Rumennik and D. R. Light, unpublished data.


ACKNOWLEDGEMENTS

We thank Sara Biancalana for help in the synthesis of the peptides. We also thank Dr. Wolfram Bode for an advance copy of the coordinates of factor Xa; Brian Sykes, Rick Harkins, Doris Hollander, and Laurie Adler for useful discussion and help in preparing the manuscript; and Galina Rumennik and Manping Wang for sharing the assay results.


REFERENCES

  1. Esmon, N. L., Owen, W. G. & Esmon, C. T. (1982) J. Biol. Chem. 257, 859-864
  2. Dittman, W. A. & Majerus, P. W. (1990) Blood 75, 329-336
  3. Esmon, C. T. (1992) Arterioscler. Thromb. 12, 135-145
  4. Kurosawa, S., Galvin, J. B., Esmon, N. L. & Esmon, C. T. (1987) J. Biol. Chem. 262, 2206-2212
  5. Hayashi, T., Zushi, M., Yamamoto, S. & Suzuki, K. (1990) J. Biol. Chem. 265, 20156-20159
  6. Parkinson, J. F., Nagashima, M., Kuhn, I., Leonard, J. C. & Morser, J. (1992) Biochem. Biophys. Res. Commun. 185, 567-576
  7. Mathews, I. I., Padmanabhan, K. P., Tulinsky, A. & Sadler, J. E. (1994) Biochemistry 33, 13547-13552
  8. Srinivasan, J., Hu, S., Hrabel, R., Zhu, Y., Komives, E. A. & Ni, F. (1994) Biochemistry 33, 13553-13560
  9. Zushi, M., Gomi, K., Honda, G., Kondo, S., Yamamoto, S., Hayashi, T. & Suzuki, K. (1991) J. Biol. Chem. 266, 19886-19889
  10. Nagashima, M., Lundh, E., Leonard, J. C., Morser, J. & Parkinson, J. F. (1993) J. Biol. Chem. 268, 2888-2892
  11. Suzuki, K., Kusumoto, H., Deyashiki, Y., Nishioka, J., Maruyana, I., Zushi, M., Kawahara, S., Honda, G., Yamamoto, S. & Horiguchi, S. (1987) EMBO J. 6, 1891-1897
  12. Glaser, C. B., Morser, J., Clarke, J. H., Blasko, E., McLean, K., Kuhn, I., Chang, R.-J., Lin, J.-H., Vilander, L., Andrews, W. H. & Light, D. R. (1992) J. Clin. Invest. 90, 2565-2573
  13. Clarke, J. H., Light, D. R., Blasko, E., Parkinson, J. F., Nagashima, M., McLean, K., Vilander, L., Andrews, W. H., Morser, J. & Glaser, C. B. (1993) J. Biol. Chem. 268, 6309-6315
  14. Lin, J.-H., McLean, K., Morser, J., Young, T. A., Wydro, R. M., Andrews, W. H. & Light, D. R. (1994) J. Biol. Chem. 269, 25021-25030
  15. Brown, S. C., Weber, P. & Mueller, L. (1988) J. Magn. Reson. 77, 166-169
  16. Piantini, U., Sørensen, O. W. & Ernst, R. R. (1982) J. Am. Chem. Soc. 104, 6800-6801
  17. Shaka, A. J. & Freeman, R. (1983) J. Magn. Reson. 51, 169-173
  18. Rance, M., Sørensen, O. W., Bodenhausen, G., Wagner, G., Ernst, R. R. & Wüthrich, K. (1983) Biochem. Biophys. Res. Commun. 117, 479-485
  19. Jeener, J., Meier, B. H., Bachmann, P. & Ernst, R. R. (1979) J. Chem. Phys. 71, 4546-4553
  20. Kumar, A., Ernst, R. R. & Wüthrich, K. (1980) Biochem. Biophys. Res. Commun. 95, 1-6
  21. Griesinger, C., Otting, G., Wüthrich, K. & Ernst, R. R. (1988) J. Am. Chem. Soc. 110, 7870-7872
  22. Bax, A. & Davis, D. G. (1985) J. Magn. Reson. 65, 355-360
  23. Marion, D. & Wüthrich, K. (1983) Biochem. Biophys. Res. Commun. 113, 967-974
  24. Wüthrich, K. (1986) NMR of Proteins and Nucleic Acids, pp. 130-161, John Wiley & Sons, Inc., New York
  25. Adler, M. & Wagner, G. (1992) Biochemistry 31, 1031-1039
  26. Pardi, A., Billeter, M. & Wüthrich, K. (1984) J. Mol. Biol. 180, 741-751
  27. Kline, A. D., Braun, W. & Wüthrich, K. (1988) J. Mol. Biol. 204, 675-724
  28. Bundi, A. & Wüthrich, K. (1979) Biopolymers 18, 285-298
  29. Padmanabhan, K., Padmanabhan, K. P., Tulinsky, A., Park, C. H., Bode, W., Huber, R., Blankenship, D. T., Cardin, A. D. & Kisiel, W. (1993) J. Mol. Biol. 232, 947-966
  30. Ullner, M., Selander, M., Persson, E., Stenflo, J., Drakenberg, T. & Teleman, O. (1992) Biochemistry 31, 5974-5983
  31. Baron, M., Norman, D. G., Harvey, T. S., Handford, P. A., Mayhew, M., Tse, A. G. D., Brownlee, G. G. & Campbell, I. D. C. (1992) Protein Sci. 1, 81-90
  32. Montelione, G. T., Wüthrich, K., Burgess, A. W., Nice, E. C., Wagner, G., Gibson, K. D. & Scheraga, H. A. (1992) Biochemistry 31, 236-249
  33. Harvey, T. S., Wilkinson, A. J., Tappin, M. J., Cooke, R. M. & Campbell, I. D. (1991) Eur. J. Biochem. 198, 555-562
  34. Picot, D., Loll, P. J. & Garavito, R. M. (1994) Nature 367, 243-249
