Three-dimensional Models of Proteases Involved in Patterning of the Drosophila Embryo

CRUCIAL ROLE OF PREDICTED CATION BINDING SITES*

Thierry Rose‡, Ellen K. LeMosy§, Angelene M. Cantwell‡, Dolly Banerjee-Roy‡, James B. Skeath, and Enrico Di Cera‡||

From the ‡Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri 63110, the §Department of Cellular Biology and Anatomy, Medical College of Georgia, Augusta, Georgia 30912, and the Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110

Received for publication, November 20, 2002, and in revised form, December 18, 2002

    ABSTRACT

Three-dimensional models of the catalytic domains of Nudel (Ndl), Gastrulation Defective (Gd), Snake (Snk), and Easter (Ea), and their complexes with substrate suggest a possible organization of the enzyme cascade controlling the dorsoventral fate of the fruit fly embryo. The models predict that Gd activates Snk, which in turn activates Ea. Gd can be activated either autoproteolytically or by Ndl. The three-dimensional models of each enzyme-substrate complex in the cascade rationalize existing mutagenesis data and the associated phenotypes. The models also predict unanticipated features like a Ca2+ binding site in Ea and a Na+ binding site in Ndl and Gd. These binding sites are likely to play a crucial role in vivo as suggested by mutant enzymes introduced into embryos as mRNAs. The mutations in Gd that eliminate Na+ binding cause an apparent increase in activity, whereas mutations in Ea that abrogate Ca2+ binding result in complete loss of activity. A mutation in Ea predicted to introduce Na+ binding results in apparently increased activity with ventralization of the embryo, an effect not observed with wild-type Ea mRNA.

    INTRODUCTION

Several genes in the dorsal group (1) are involved in extracellular events that lead to dorsoventral polarization of the Drosophila melanogaster embryo. nudel, pipe, and windbeutel are expressed by somatic follicle cells during mid-oogenesis, whereas easter, gastrulation defective, snake, and spätzle are expressed by the nurse cells and oocyte. These genes were identified in several large scale genetic screens for maternal effect mutations that cause homozygous mutant females to produce embryos with abnormal cell fates (2). Among the dorsal group genes, nudel, gastrulation defective, snake, and easter encode proteins containing serine protease domains of the trypsin family. Easter (Ea),1 Snk, Gd, and Ndl are expressed and secreted during oogenesis as inactive zymogens into a thin, fluid-filled perivitelline space that lies between the eggshell and the oocyte. Genetic and molecular studies suggest that these proteins act in a proteolytic cascade many hours later in the early embryo (1, 3-5). The cascade resembles in its general organization those controlling the innate immune response and blood coagulation (6). Ovulation of the egg in some way triggers the self-activation of Ndl into Ndl*. Gd can be activated either by Ndl* or by self-activation in the presence of Snk. Subsequently, Gd* activates diffusible Snk and Snk* activates diffusible Ea. The result of this cascade is cleavage by Ea* of the diffusible dimeric nerve growth factor-like Spz (7). The processed Spz appears to function as a dimer to activate the transmembrane receptor Toll only on the embryo surface that will become ventralized through the Toll signaling pathway.

In contrast to the significant knowledge garnered from previous in vivo studies, quantitative information on activity and specificity of various members of the cascade has so far eluded characterization involving purified proteins. Several questions remain regarding the activation of Ndl and Gd (1, 3-5) and the specificity of Gd* and Snk*. Elucidation of these timely and important questions would benefit from the knowledge of the structural organization of the enzymes involved in the cascade. However, none of the members of the cascade has been crystallized so far or even expressed successfully for detailed in vitro characterization. Hence, we felt that the construction of three-dimensional models of Ndl*, Gd*, Snk*, and Ea* in complex with their targets could fill a critical structure-function gap in the field as recently shown for thrombin interactions with the platelet receptors (8) and fibrinogen (9). The value of these models stems from their timeliness and the new insight offered for future mutagenesis studies, as illustrated in the present work by the effect on embryo polarization when putative cation binding sites of examined proteases were mutated.

    MATERIALS AND METHODS

Sequence Alignment and Comparative Modeling-- The fly sequences came from the Berkeley strain in the Flybase (FB) and Swiss Protein (SP) databases: Ea (FBgn0000533, SP-P13582), Gd (FBgn0000808, SP-O62589), Ndl (FBgn0002926, SP-P98159), and Snk (FBgn0003450, SP-P05049). These sequences were aligned with 1800 serine proteases of the trypsin family. The latter were retrieved with the BLAST program, using trypsin homologues as seeds, from the non-redundant database at the National Center for Biotechnology Information (NCBI) (National Library of Medicine, National Institutes of Health, Bethesda, MD) and from the Flybase image at the NCBI, and were aligned together with ClustalX as described recently (10). Sequences were clustered into 100 groups from a neighbor-joining tree built with 500 bootstraps in ClustalX. One hundred sequences were selected, one per cluster. Three-dimensional models of 70 structures were built by comparative modeling based on 12 of 20 crystal structures of serine proteases used in the sequence core. These models were used to refine the alignment of the 100-sequence core (10). The theoretical three-dimensional models of activated protease domains Ndl* (central or Ndl1*-(1146-1385) and C-terminal or Ndl2*-(2017-2616)); Gd*-(256-528); Snk*-(191-430); and Ea*-(127-392) were constructed by comparative modeling using the program Modeller 4 (11). The following crystal structures of serine proteases downloaded from the Protein Data Bank (PDB) (12) were used as templates: trypsin (PDB code 1tld, 1.50 Å resolution); chymotrypsin (PDB code 4cha, 1.68 Å); tPA (PDB code 1rtf, 2.30 Å); plasmin (PDB code 1bui, 2.65 Å); plasma kallikrein (PDB code 2pka, 2.05 Å); thrombin (PDB code 1ppb, 1.92 Å); factor Xa (PDB code 1hcg, 2.20 Å); factor IXa (PDB code 1rfn, 2.80 Å); factor VIIa (PDB code 1dan, 2.00 Å); and activated protein C (PDB code 1aut, 2.80 Å). These proteases were chosen because they span the breadth of diversity of trypsin-related domains and regulatory cation binding sites. Alignments extracted from the 100-sequence core were optimized manually during preliminary comparative modeling according to the distance violations from templates reported in the Modeller output files.

Two hundred models were built for each protease with different frameshifts of alignment in the poorly conserved loops and different seeds for the random number generator. Models were then checked and ranked for stereochemistry, structural topology features, and amino acid spatial distribution with Procheck (13), WhatCheck (14), and Verify3D (15). Conformers were pooled into the same cluster when the root mean square deviation of their backbones was <2.5 Å. The best conformer was kept for each cluster and optimized by molecular dynamics (fast annealing 50-600 K in 3 ps, slow cooling 600-50 K from 5 to 15 ps) and then minimized (200 steepest descent and then 500 conjugate gradient cycles) using the program Discover (Accelrys, San Diego, CA). The following parameters were used in all of the procedures: force field CFF91, dielectric constant set at 2, and a cut-off for non-bonded interactions set at 14 Å during dynamics and to infinity during minimizations. The highest ranked models were used for analysis. The in-and-out side chain distributions and the sequence-structure compatibility analyzed with Verify3D gave the following current score/expected score/threshold for the final three-dimensional models: Ndl1* (118/116/52), Gd* (130/130/58), Snk* (112/112/50), and Ea* (117/120/54). These values are comparable with those obtained with the crystal structures used as templates. Procheck placed the φ-ψ dihedral angle pairs of most residues in the favored and allowed regions of the Ramachandran plot: Ndl1* (92.4%), Gd* (86.0%), Snk* (94.2%), and Ea* (91.4%). The stereochemistry of the three-dimensional models satisfied Procheck and WhatCheck requirements found for crystals of proteases solved at a resolution lower than 2.5 Å. Solvent-accessible surfaces were displayed with Insight II (Accelrys) using Connolly's algorithm with a 1.4-Å probe radius.
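The pooling of conformers described above can be sketched as follows. This is a minimal illustration only, assuming that the models have already been superposed on a common frame, that a higher quality score is better, and that a simple greedy scheme is acceptable. The text does not state which clustering procedure was used, and the names `rmsd` and `pick_representatives` are hypothetical.

```python
import math

def rmsd(a, b):
    """RMSD (in Å) between two equally sized, already superposed coordinate lists."""
    sq = sum((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
             for (ax, ay, az), (bx, by, bz) in zip(a, b))
    return math.sqrt(sq / len(a))

def pick_representatives(models, cutoff=2.5):
    """Greedy pooling (an assumption, not the published procedure).

    models: list of (score, backbone_coords); a higher score is better.
    A model founds a new cluster only if it lies at least `cutoff` Å from
    every representative kept so far; otherwise it is pooled and dropped.
    """
    reps = []
    for score, coords in sorted(models, key=lambda m: m[0], reverse=True):
        if all(rmsd(coords, rep_coords) >= cutoff for _, rep_coords in reps):
            reps.append((score, coords))
    return reps
```

Because models are visited in order of decreasing score, each retained representative is the best-scoring member of its cluster, which mirrors the "best conformer per cluster" selection in the text.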

We screened the three-dimensional models of catalytic domains for putative sites of cleavage by proteases. Only Arg, Lys, Ile, Leu, Phe, and Val residues not followed by Pro were selected, provided that their side chain was at least 50% exposed compared with the same residue in the tripeptide GXG. Every selected position i in the model was scored for the solvent accessibility of neighbor residues from i - 4 to i + 2. We defined a protease cleavage site when $[(r_{sc,i-4} + r_{sc,i-3} + r_{sc,i-2})/3 + r_i + (r_{sc,i+1} + r_{sc,i+2})/2]/3 > 0.50$, where $r_i$ is the relative overall solvent accessibility of residue i and $r_{sc,i}$ is its relative side chain solvent accessibility.
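The accessibility criterion above can be written as a short sketch. It assumes that relative accessibilities are supplied as fractions between 0 and 1, one per residue, and it uses 0-based indices. The function names and input layout are hypothetical and are not taken from the programs cited in the text.

```python
P1_CANDIDATES = set("RKILFV")  # residues considered at the cleavage position

def cleavage_score(r_total, r_sc, i):
    """Accessibility score for position i (0-based), following the formula in the text.

    r_total: relative overall accessibility per residue (fractions, 0-1).
    r_sc:    relative side chain accessibility per residue (fractions, 0-1).
    """
    upstream = (r_sc[i - 4] + r_sc[i - 3] + r_sc[i - 2]) / 3.0
    downstream = (r_sc[i + 1] + r_sc[i + 2]) / 2.0
    return (upstream + r_total[i] + downstream) / 3.0

def putative_sites(seq, r_total, r_sc, threshold=0.50):
    """Return 0-based indices of residues that pass every filter in the text."""
    sites = []
    for i in range(4, len(seq) - 2):      # needs neighbors i-4 .. i+2
        if seq[i] not in P1_CANDIDATES:
            continue
        if seq[i + 1] == "P":             # not followed by Pro
            continue
        if r_sc[i] < 0.50:                # side chain at least 50% exposed
            continue
        if cleavage_score(r_total, r_sc, i) > threshold:
            sites.append(i)
    return sites
```

Positions closer than four residues to the N terminus or two residues to the C terminus are skipped here because the score is undefined for them. How the original screen handled such positions is not stated.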

Ca2+ and Na+ binding sites were identified with the program VALE (16) using a grid of 0.1 Å, water molecule radius of 1.4 Å, and a minimum threshold for the sum of oxygen-cation bond-strength contributions of 0.8.

Modeling of Enzyme-Substrate Complex-- Three-dimensional models of protease-substrate complexes were built by comparative modeling in a thorough or quick mode. In the thorough mode, the protease-fragment complexes were threaded over thrombin-peptide crystal structures (8, 9). We used the following templates: peptide-Ac-DFLAEGGGVR from PDB (1bbr and 1ucy); PPACK from PDB (1ppb); hirugen peptide-NGDFEEIPEEYL from PDB (1hah); and peptide-LDPR from PDB (1nrs). A total of 50 three-dimensional models were built and ranked in terms of stereochemistry quality and lowest potential binding energies. The accepted computer-generated models of protease-peptide substrate complexes had root mean square deviations of <1.5 Å for the protease backbone and peptide residues <10 Å from protease residues. Models containing a ligand with a root mean square deviation of <2 Å from a higher ranked model were discarded. We selected the best ten models, extracted the ligand, docked it on the best free protease three-dimensional model as a starting point for a new modeling process, and optimized the best complex as described above for the free proteases. The thorough mode screened 1 of 50 three-dimensional models of enzyme-target peptide complexes and was used to screen every putative activation cleavage site of zymogens with every selected protease to assess activator-activated pairs.

The quick mode was used to screen possible cleavage sites all along sequence targets in and out of the catalytic domain. We used only one of seven peptide three-dimensional models as a template for positions P1-P11, chosen according to the length of the loop between P1 and the closest hydrophobic side chain from P4 to P10 (seven possibilities). Five three-dimensional models of the complex were provided by Modeller runs, and then the best one was minimized as in the thorough mode.

Binding Free Energy Calculations-- We examined the relative binding free energies of substrates on proteases by applying an empirical method on bound and free components. We used the potential energy of the system as an enthalpy term (force field CFF91), a conformational entropy term based on solvent-accessible surface area (SASA) of residues and a hydration free energy term based on finite difference approximation of the Poisson-Boltzmann equation.

The predicted free energy of association between receptor (R) and peptide (P), $\Delta G$, was calculated considering that free R and P have the same conformation as in the complex RP, from $\Delta G = \Delta G_{RP} - \Delta G_R - \Delta G_P$ with $\Delta G_x = \Delta G_{x,gas}(\varepsilon = 1) + \Delta G_{x,hyd}(\varepsilon = 80)$ and x = RP, R, or P. The value of $\Delta G_{x,gas}$ was calculated from its enthalpic and entropic contributions expressed as $\Delta G_{x,gas} = \Delta H_{x,gas} - T\Delta S_{x,gas}$ with $\Delta H_{x,gas} = E_{x,vdw} + E_{x,coul}$ and $\Delta S_{x,gas} = \Delta S_{x,conf,gas} + \Delta S_{x,rt,gas} + \Delta S_{x,vib,gas}$. The enthalpy $\Delta H_{x,gas}$ is a function of the van der Waals ($E_{vdw}$) and coulombic ($E_{coul}$) components, whereas $\Delta S_{x,gas}$ is defined in terms of the conformational, rotational-translational, and vibrational components. $E_{vdw}$ and $E_{coul}$ were computed from the CFF91 force field without cut-off with $\varepsilon = 2$.

The value of the conformational entropy $\Delta S_{x,conf,gas}$ was computed from the loss of side and main chain rotation freedom using the definition shown in Equation 1,

$$T\Delta S_{x,conf,gas} = T\Delta S_{x,confsc,gas} + T\Delta S_{x,confmc,gas} = \sum_i \Delta f_1(r_{sc,i})\, T\Delta S_i + \sum_i RT \ln\left[\frac{\Delta f_2\left((r_{i-1} + r_i + r_{i+1})/3\right)}{\rho_i}\right] \quad \text{(Eq. 1)}$$

where $f_1(r_{sc,i}) = r_{sc,i}^8/(r_{sc,i}^8 + 0.5)$ and $r_{sc,i}$ is the relative accessibility of the ith residue side chain, $r_{sc,i} = \mathrm{SASA}_{sc,i}/\mathrm{SASA}_{sc,i,GXG}$. $\mathrm{SASA}_{sc,i,GXG}$ refers to the side chain solvent-accessible surface area of amino acid X in the tripeptide Gly-X-Gly. The empirical scales of side chain rotation freedom, $\Delta S_i$, were taken from Pickett and Sternberg (17). The function $f_1$ decreases the entropy values when the accessibility of the side chain is <50%. The loss of freedom of the main chain dihedral angles φ and ψ of residue i was roughly considered as a function of the steric hindrance around residues i - 1 to i + 1, which affects access to the allowed and core regions of the Ramachandran graph. $r_i$ is the smaller of $\mathrm{SASA}_{mc,i}/\mathrm{SASA}_{mc,i,GXG}$ (accessibility of the main chain only) and $\mathrm{SASA}_i/\mathrm{SASA}_{i,GXG}$ (overall accessibility), weighted by the attenuation function $f_2(x) = x^7/(x^7 + 0.5)$. The accessible area fraction, $\rho_i$, was fixed for each residue dihedral pair φ and ψ of X from the tripeptide Ala-X-Ala in the Ramachandran graph: 0.28 for X = Pro; 0.56 for X = Gly; and 0.40 for all other amino acids according to the allowed and core regions in Procheck graphs (13).
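The two attenuation functions and the fixed accessible area fractions can be evaluated with a few lines of code. This is a sketch that implements the expressions exactly as printed in the text. The names `f1`, `f2`, and `rho` mirror the symbols above, and the one-letter residue codes used as dictionary keys are an assumption of this example.

```python
def f1(r_sc):
    """Side chain attenuation, f1(r) = r^8 / (r^8 + 0.5), as printed in the text."""
    return r_sc ** 8 / (r_sc ** 8 + 0.5)

def f2(x):
    """Main chain attenuation, f2(x) = x^7 / (x^7 + 0.5), as printed in the text."""
    return x ** 7 / (x ** 7 + 0.5)

# Accessible area fraction of the Ramachandran graph for residue X in Ala-X-Ala.
_RHO = {"P": 0.28, "G": 0.56}

def rho(residue):
    """Return 0.28 for Pro, 0.56 for Gly, and 0.40 for every other amino acid."""
    return _RHO.get(residue, 0.40)

def main_chain_argument(r_prev, r_i, r_next):
    """f2 evaluated on the mean accessibility of residues i-1, i, and i+1."""
    return f2((r_prev + r_i + r_next) / 3.0)
```

For a fully exposed side chain, `f1(1.0)` evaluates to 1/1.5, and both functions fall steeply toward zero as the relative accessibility decreases.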

The size of the different ligands is very similar, and the difference between ligands in the loss of rotational and translational entropy upon binding, $\Delta S_{rt,gas}$, is negligible. For 25-residue peptides associated to proteases modeled by quick and thorough mode, $T\Delta S_{rt,gas}$ was ~18-20 kcal/mol at 298 K (18). $\Delta S_{vib,gas}$ was not computed in the absence of experimental data on the examined structures or their normal mode vibrations. The main modes of vibration are weakly affected for peptides of similar length targeting the same site of a protease in a slightly different conformation. The values of $\Delta G_{x,hyd}$ were calculated from their electrostatic energy $G_e$ and non-polar energy of hydration $G_n$ as $\Delta G_{hyd} = \Delta G_e + \Delta G_n$. The electrostatic energies $\Delta G_e$ were computed using the finite difference Poisson-Boltzmann method implemented in the program DelPhi (19), averaged over eight 1-Å resolution grids shifted by 0.5 Å in one, two, or three of the x, y, and z directions. The choice of the grid position and resolution affects final values (the S.D. about the mean is 0.8-1.8 kcal/mol). $G_e$ values were computed for the transfer of the solute from $\varepsilon = 2.0$ to water ($\varepsilon = 80.0$), $G_e(80.0, 2.0)$, and from $\varepsilon = 2.0$ to gas ($\varepsilon = 1.0$), $G_e(1.0, 2.0)$, giving $\Delta G_e = G_e(80.0, 2.0) - G_e(1.0, 2.0)$. The radius was fixed to 1.4 Å for solvent molecules and 2 Å for ions. Ionic strength was set at 145 mM, and the protonation state and partial charge distribution were assigned by the program Biopolymer according to the pH fixed at 7.0. The non-polar contribution $G_n$ was considered as linearly dependent on the molecule solvent-accessible surface area using a surface tension coefficient of 25 cal/mol/Å² (20), i.e. $\Delta G_n = 25\,\Delta\mathrm{SASA}$.
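The hydration term can be assembled from its two parts as in the sketch below. The electrostatic inputs are assumed to come from an external Poisson-Boltzmann calculation and are passed in as plain numbers in kcal/mol. The function names are hypothetical, and the conversion from cal/mol to kcal/mol is made explicit.

```python
SURFACE_TENSION = 25.0  # cal/mol/Å^2, the coefficient quoted in the text

def delta_g_electrostatic(g_e_water, g_e_gas):
    """Delta G_e = G_e(80.0, 2.0) - G_e(1.0, 2.0), both in kcal/mol."""
    return g_e_water - g_e_gas

def delta_g_nonpolar(delta_sasa):
    """Delta G_n = 25 * Delta SASA, with SASA in Å^2; result converted to kcal/mol."""
    return SURFACE_TENSION * delta_sasa / 1000.0

def delta_g_hyd(g_e_water, g_e_gas, delta_sasa):
    """Delta G_hyd = Delta G_e + Delta G_n, in kcal/mol."""
    return delta_g_electrostatic(g_e_water, g_e_gas) + delta_g_nonpolar(delta_sasa)
```

As an illustration with made-up numbers, burying 400 Å² of surface gives `delta_g_nonpolar(-400.0)`, i.e. -10 kcal/mol.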

Based on the above definitions, the free energy for the receptor-peptide complex becomes $\Delta G = \Delta H_{gas} - T\Delta S_{rt,gas} - T\Delta S_{vib,gas} - T\Delta S_{conf,gas} + \Delta G_{hyd}$.

Some of the terms cancel if we compare the association of same length peptides bound to the same protease. The approximation of the relative binding free energy is given by $\Delta\Delta G \approx \Delta\Delta H_{gas} - T\Delta\Delta S_{conf,gas} + \Delta\Delta G_{hyd}$.
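The bookkeeping behind the relative binding free energy can be sketched as follows. The class and function names are hypothetical, all quantities are assumed to be in kcal/mol, and the numbers in the usage note are invented purely to show the arithmetic.

```python
from dataclasses import dataclass

@dataclass
class Terms:
    """Energy terms for one species (complex RP, free R, or free P), in kcal/mol."""
    h_gas: float    # E_vdw + E_coul
    ts_conf: float  # T * S_conf
    g_hyd: float    # G_e + G_n

    def free_energy(self):
        # Rotational-translational and vibrational entropy are omitted because
        # they are assumed to cancel between peptides of equal length.
        return self.h_gas - self.ts_conf + self.g_hyd

def binding_free_energy(complex_rp, receptor, peptide):
    """Delta G = G_RP - G_R - G_P, restricted to the retained terms."""
    return (complex_rp.free_energy()
            - receptor.free_energy()
            - peptide.free_energy())

def relative_binding(delta_g, delta_g_reference):
    """Delta Delta G of one complex relative to a reference complex."""
    return delta_g - delta_g_reference
```

For example, `binding_free_energy(Terms(-100.0, 10.0, -50.0), Terms(-60.0, 8.0, -40.0), Terms(-10.0, 5.0, -20.0))` gives -17.0 for these invented inputs, and subtracting the value of a reference complex yields the kind of relative scale used in Table I.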

This approach does not allow comparison of the binding of a peptide to two different proteases unless the vibrational entropy variation upon binding is comparable.

The $\Delta\Delta G$ values refer to selected conformations and are affected by the choice of the "best model" according to the global potential energy of the system and the quality of its stereochemistry. The S.D. is ~2.4 kcal/mol among the $\Delta\Delta G$ values of the 10 best models of Snk* bound to the activation site of Ea. Lower deviations were estimated as 1.2 kcal/mol for Ea* with Spz peptide, 1.7 kcal/mol for Ndl1* with Gd peptide, and 1.4 kcal/mol for Gd* with Snk peptide.

mRNA Preparation of Mutated gd and ea-- The plasmid pNB-GD2 containing a full-length gd cDNA was obtained from J. L. Marsh (University of California, Irvine, CA) (21). The plasmid pGEM7Zf(+) containing a full-length ea cDNA was obtained from K. V. Anderson (Sloan-Kettering Institute) (22). Mutations were introduced using the QuikChange Exchange kit (Stratagene). We mutated Phe-225 to Ala and Pro in the putative Na+ binding site of Gd. We also mutated separately Phe-225 to Ile, Ser, and Tyr to create the putative Na+ binding site of Ea, and we mutated Glu-70 to Ala and Lys in the putative Ca2+ binding site of Ea. mRNAs encoding wild-type and mutant Gd and Ea were transcribed from plasmids by using the SP6 mMessage mMachine kit (Ambion, Austin, TX) and were dissolved in water in a range of concentration from 0.06 to 1 mg/ml as estimated by UV absorbance (4).

Fly Stocks and Embryo Injection-- The mutations and allelic combinations used here were described previously: gd7/gd7 (23) and ea4/ea5022rx1 (24). Embryos (0.5-1.5 h post-fertilization) were injected centrally at 40-60% egg-length after the removal of the outer eggshell layer according to a standard procedure (2). Injected embryos were visually examined during gastrulation, and their cuticles were prepared for examination as described previously (25, 26). The injection of mRNAs encoding wild-type Gd or Ea was used as positive controls (4).

    RESULTS

Identification of Cleavage Sites by Alignment of Primary Sequences-- Zymogens Gd (528 amino acids), Snk (430 amino acids), and Ea (392 amino acids) are organized in three domains: an N-terminal signal that is cleaved during protein secretion and a zymogen that gives rise to A (N-terminal) and catalytic B (C-terminal) chains (Fig. 1). The A and B chains remain covalently linked through disulfide bridges after proteolytic activation. The topology of Ndl is more complex and unusual because it carries two S1a protease domains. The first catalytic domain (Ndl1*-(1145-1385)) is central, and the second (Ndl2*-(2017-2616)) is C-terminal. Eleven low density lipoprotein (LDL) receptor-binding repeats intercalate the two protease domains (27). Four LDL receptor repeats are inserted in the second protease catalytic domain.


Fig. 1.   Sequence topology of Ndl, Gd, Snk, Ea, and Spz. The sites of cleavage of signal peptide (S) and predicted activation (scissors) are indicated. Positions of putative secondary target sites are indicated in pink (chymotrypsin-like target) or blue (trypsin-like target) triangles. The secondary target sites were predicted from high binding free energy scores and are >50% accessible based on the three-dimensional models. Disulfide bonds predicted by sequence alignment and from three-dimensional models are shown in black. Also shown in the S1a domain in orange are residues of the catalytic triad (red circles), residues at the bottom of the S1 pocket (green circles), and residues involved in Ca2+ or Na+ binding (black circle). The Ndl1 and Ndl2 protease domains of Ndl are shown separately. The topologies of bovine β-trypsin (Try) and bovine α-chymotrypsin (Chy) are reported for comparison. Low density lipoprotein (LDL) marks the position of low density lipoprotein receptor-like domain repeats (40 residues, Cys6-7) with the number of repeats shown in parentheses. Ea and Snk share a disulfide-knot or "Clip" motif (1).

To locate cleavage positions (↓) of signal peptides, we used the program SignalP (28). This yielded the following sites of cleavage: Ndl 1-47↓48-2616 (VYH↓GL, score 0.54, threshold 0.48); Gd 1-19↓20-528 (TKA↓VA, score 0.85, threshold 0.48); Snk 1-27↓28-430 (LEA↓LD, score 0.75, threshold 0.48); Ea 1-21↓22-392 (SAG↓QF, score 0.82, threshold 0.48); and Spz 1-25↓26-326 (YEA↓KE, score 0.93, threshold 0.48). Alignment of Ea, Snk, Gd, and Ndl with other serine proteases suggests the following cleavage sites for zymogen activation: Ndl1* 48-1144↓1145-2616 (GDGR↓IVGG; trypsin-like cleavage); Snk* 28-183↓184-430 (SVPL↓IVGG; chymotrypsin-like cleavage); and Ea* 22-127↓128-392 (LSNR↓IYGG; trypsin-like cleavage) (Fig. 1). The predicted underivatized Ea* A chain (106 amino acids, theoretical mass 12,086 Da), Ea* B chain (265 amino acids, theoretical mass 28,951 Da), Snk* A chain (156 amino acids, theoretical mass 17,372 Da), and Snk* B chain (247 amino acids, theoretical mass 27,319 Da) agree with Western blots described by Dissing et al. (5). In the case of Gd, no basic or hydrophobic residue occupies the canonical position, and the closest putative cleavage site is either 30 or 22 residues upstream, specifically Gd* 20-211↓212-528 (GEPK↓SSDG; trypsin-like cleavage) or Gd* 20-220↓221-528 (TSPV↓FVDD; chymotrypsin-like cleavage). The fragment 212-528 expressed in S2 insect cells is an active protease (3). DeLotto (29) proposed the cleavage site Gd* 20-136↓137-528 (EHIR↓KLSF; trypsin-like cleavage) located 83 residues upstream of the canonical activation site. The proposed cleavage site is 12 residues upstream of a type A von Willebrand repeat motif (LLLDXXEXXVRXXD) as described for complement factor B and C2. With such a cleavage, the predicted underivatized Gd* A chain (116 amino acids, theoretical mass 13,396 Da) and Gd* B chain (390 amino acids, theoretical mass 43,579 Da) agree with the Western blots described by Dissing et al. (5). The alignment of Spz with the protease activation sites proposes 26-220↓221-326 (VSSR↓VGGS; trypsin-like cleavage) as the best site of cleavage to produce the fragments documented by SDS-polyacrylamide gel electrophoresis (5). Ndl* could also be processed to release only the central catalytic domain Ndl1* (241 amino acids, theoretical mass 26,862 Da) by a second cleavage 1145-1385↓1386-2616 (TTPR↓LLPK; trypsin-like cleavage) as shown by LeMosy et al. (30). The cleavage site 1386-2016↓2017-2616 (NLMR↓LLNV; trypsin-like cleavage) is also detectable in the C-terminal domain Ndl2 (600 amino acids).

Ea and Snk show 25% identity overall, 33% within the B chain, and feature the same potential disulfide bridges (Fig. 1). Alignment with other proteases suggests that only one disulfide bridge, 1-122 in the chymotrypsin numbering,2 links the A and B chains. The disulfide bonds 42-58, 168-182, and 191-220 are highly conserved in the catalytic B chain of serine proteases. Gd is proposed to retain the disulfide bonds 42-58 and 168-182 as well as 1-122 linking the A and B chains (Fig. 1). Ndl could have five disulfide bonds in Ndl1* (1-122 between the A and B chains and 42-58, 136-201, 168-182, and 191-220 within the B chain). Ndl2* features only the 42-58 disulfide bond within the B chain and 1-122 between the A and B chains (Fig. 1).

Comparative Modeling of the Catalytic B Chains-- Ndl1*-(1145-1385), Ndl2*-(2195-2616), Gd*-(212-528), Snk*-(184-430), and Ea*-(128-392) were folded using the trypsin scaffold (CATH 2.40.10.20; SCOP B.47.1.1) with two orthogonal six-stranded β-barrels flanking the active site groove hosting the catalytic triad (Fig. 2). Insertions or deletions relative to chymotrypsin occur in loops at the protein surface and outside the active site. Although the identity between Ndl2* and other trypsin-like proteases is low, it spreads uniformly among all domains and especially at the level of the two β-barrels. Ndl2* features an unusual catalytic triad, where the nucleophile Ser-195 is coupled to Glu-102 and Ser-57 that replace the canonical Asp-102 and His-57. There is no other example of a Ser-Glu-Ser catalytic triad among 1800 other serine proteases in the NCBI (www.ncbi.nlm.nih.gov) and MEROPS (www.merops.co.uk) databases, which suggests that Ndl2* may not participate in the cascade as an active protease. The LDL domain is inserted away from the potential active site in the 186-loop, where insertions of various lengths also exist in thrombin and tissue plasminogen activator.


Fig. 2.   Three-dimensional homology models of Ndl1* (A), Gd* (B), Snk* (C), and Ea* (D). Enzyme residues are numbered according to chymotrypsin for ease of comparison. The models are shown as ribbons on the left side or as solvent-accessible surface areas on the right side, color-coded according to amino acid properties (Asp and Glu in red; Lys and Arg in blue; His in purple; Ala, Ile, Leu, Met, and Val in yellow; Phe, Trp, and Tyr in green; Asn, Cys, Gln, Gly, Pro, Ser, and Thr in white). The Na+ (orange ball) and Ca2+ (purple ball) binding sites are detailed in the insets. Also shown are main residue contacts between primary targets (listed vertically in black as 25-residue peptides) and enzymes (with individual residues color-coded). Cleavage sites are indicated by scissors.

Fig. 2 displays the water-accessible surfaces of Ndl1*, Gd*, Snk*, and Ea*. The overall architecture of the active site is similar in all models, but their surfaces show notable differences in amino acid composition. The four proteases feature the catalytic triad His-57, Asp-102, Ser-195, and the important ancillary residues Cys-42 and Cys-58 (SS-linked), Gly-193, Gly-196, Gly-211, and Ser-214. The Cys-168/Cys-182 disulfide bond stabilizes the intervening loop that forms part of the binding site and is conserved in all four proteases. The Cys-191/Cys-220 disulfide bond is present in Ndl1*, Snk*, and Ea* but not in Gd*. This bond bridges the 186-loop and 220-loop that shape the bottom of the primary specificity pocket. Binding site pockets around residue 189 and the hydrophobic core around residues 99, 174, and 215 differ among the four proteases.

The presence of Asp-189 in the S1 (31) pocket shows that specificity is unambiguously trypsin-like for Ndl1* and Ea*, whereas Ser-189 suggests a chymotrypsin-like specificity for Gd*. The presence of Gly-189 in Snk* makes the prediction of specificity ambiguous. The shape and volume of the S1 pocket in Snk* could accommodate a variety of side chains. Leukocyte elastase, which carries Gly-189 and cleaves after Val in P1, shows 23% identity with Snk* in the catalytic B chain. The structure of elastase (PDB code 1ppg) complexed with the tetrapeptide AAPV (32) shows that Val-190 defines the S1 specificity toward hydrophobic P1 residues. In the model of Snk*, the unusual His-190 (His-371) points out of the S1 pocket and interacts with Asp-194 (Asp-375), thereby leaving the S1 pocket free to interact with a variety of side chains besides hydrophobic residues.

Preferred Cleavage Sites from Enzyme-Substrate Three-dimensional Models-- We predicted the position of potential protease targets in every protease sequence based on 25-residue peptide binding energy to each protease active site (Table I) or the accessibility of the peptide within the catalytic domain (Fig. 1). We computed the theoretical binding energy of all 25-residue fragments with Arg or Lys at position 11 derived from the sequence of Ndl1 (37 fragments), Ndl2 (78 fragments), Gd (60 fragments), Snk (42 fragments), Ea (38 fragments), and Spz (36 fragments) after docking them using the quick mode on Ndl1*, Snk*, and Ea*. The same procedure was used for fragments containing Leu, Ile, Val, or Phe at position 11 derived from the sequence of Ndl1 (88 fragments), Gd (142 fragments), Snk (105 fragments), Ea (92 fragments), and Spz (74 fragments) and docked on Gd* and Snk*. All of the fragments containing Pro at position 12 (~3% of total) and cleavage sites inaccessible in protein three-dimensional models (score <0.5) were discarded. We used an empirical relative binding free energy $\Delta\Delta G$ as a criterion to select optimal cleavage sites. Values were computed for all possible cross-activation to test protease-target specificity (Table I). The best cleavage sites for Ndl1* are Arg-1144 in Ndl and Lys-211 in Gd. The best cleavage site for Gd* is Leu-183 in Snk. The best cleavage site for Snk* is Arg-127 in Ea. The best cleavage site for Ea* is Arg-220 in Spz (Fig. 1, Table I). Interestingly, the other fragments are predicted to bind with high scores and can be regarded as secondary target sites. Fig. 1 reports potential cleavage sites within catalytic domains sorted by trypsin-like cleavage sites (blue, Arg or Lys in P1) or chymotrypsin-like cleavage sites (pink, Leu, Val, Ile, and Phe in P1) when the accessibility score of the corresponding site is >0.50. For example, Ndl1* can cleave Ndl at Arg-1385 and can separate Ndl2 from Ndl1. Ndl1* can also cleave at Arg-1094, yielding a fragment that may correspond to the non-diffusible 38-kDa fragment reported by LeMosy et al. (30). Ea* and Snk* are predicted to cleave the prodomain of Gd at Arg-187 and can generate the 50-kDa fragment described by LeMosy et al. (3). Furthermore, Ea* can cleave the prodomain of Snk at Arg-100, thereby producing the 50-kDa fragment observed when Ea* and Snk are coexpressed (3).
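The enumeration of candidate fragments described above can be sketched as follows. This illustration assumes that only windows lying fully within the sequence are kept, since the handling of residues near the termini is not stated in the text. The function name and its parameters are hypothetical.

```python
def candidate_fragments(seq, p1_residues="RK", length=25, p1_pos=11):
    """List (P1 position, fragment) pairs for windows with a P1 candidate at p1_pos.

    seq:         one-letter amino acid sequence.
    p1_residues: residues accepted at P1 ("RK" for trypsin-like screens,
                 "LIVF" for chymotrypsin-like screens).
    p1_pos:      1-based position of P1 inside the fragment (11 in the text).
    Fragments with Pro at the following position (P1') are discarded.
    """
    fragments = []
    for i, residue in enumerate(seq):
        if residue not in p1_residues:
            continue
        start = i - (p1_pos - 1)
        end = start + length
        if start < 0 or end > len(seq):   # window would run off the sequence
            continue
        fragment = seq[start:end]
        if fragment[p1_pos] == "P":       # position 12 (1-based) is P1'
            continue
        fragments.append((i + 1, fragment))  # report P1 as a 1-based position
    return fragments
```

Applied to the 25-residue Ndl peptide SDSKEIVGDGRIVGGSHTSALQWPF quoted in the text, the function returns that peptide once, with P1 at position 11; the Lys at position 4 is skipped because its window would start before the first residue.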


                              
Table I
Relative binding free energies $\Delta\Delta G$ (in kcal/mol) of 25-residue peptide-protease complexes calculated from three-dimensional models
Enzymes are listed on the top row, and target proteases are listed in the left column. Cleavage sites are reported within the central eight residues of each peptide from the putative activation region of corresponding zymogens. For each enzyme (column), one complex was used as reference ($\Delta\Delta G$ = 0.0) to facilitate comparison with other enzyme-target complexes.

The four proteases in complex with their primary targets were examined further using the thorough mode. Relevant contacts of the best targets with each enzyme are shown in Fig. 2, and the structure of the peptide and the epitope of recognition are displayed in Fig. 3. The Ndl fragment 1134SDSKEIVGDGRdown-arrow IVGGSHTSALQWPF1158 and the two Gd fragments 201ESLHVAIGEPKdown-arrow SSDGITSPVFVDDD225 (cleavage 30 residues upstream of the standard activation site) and 126FMTQIQLEHIRdown-arrow KLSFIPDKKSSLLL150 (C2/factor B-type cleavage site 83 residues upstream of the standard activation site) were docked on the active site of Ndl1*. Several single mutations have been made in ndl, and their associated phenotypes have been reported previously (33). We introduced these mutations in the three-dimensional model of Ndl1* and optimized the structure by 500 cycle-conjugated gradient minimization on the residues within 6 Å from the site of mutation. Mutations are displayed in Fig. 3 where residues are visible at the protein surface in its front view. The mutant C1114S (C1S) loses the disulfide bond with Cys-1252 (Cys-122) that connects the A and B chains, but it is processed and secreted normally and retains partial activity. The A chain is usually inconsequential to function in serine proteases (34). The mutants G1280S (G140S) and G1282R (G142R) affect the position of the highly conserved Trp-1281 (Trp-141) that lines part of the S1' specificity site. Either mutation may change the backbone and side chain orientation of Trp-141 with resulting poor substrate binding and loss of protease activity as seen experimentally (33). The same argument holds for the mutant V1278M (V138M) whose bulkier side chain is expected to perturb Trp-141. The mutant G1334R (G197R) has a protrusion into the S2' pocket that may cause steric hindrance with the substrate backbone around the scissile bond, thereby explaining the loss of activity seen experimentally (33). 
The mutant H1355L (H215L) perturbs the hydrophobic core next to the active site (Fig. 3A). The His residue at this position replaces the highly conserved Trp seen in almost all of the serine proteases. The hydrophobic residue Ile-207 of Gd or Val-1140 of Ndl can make contacts with Leu-1355 (Leu-215). However, non-conservative replacements of residue 215 in thrombin cause a drastic drop in activity (35), which may explain the total loss of activity of the Ndl1* mutant (33). The mutant A1360T (A221T) perturbs the backbone of the 220-loop responsible for Na+ binding (36). Mutations at the same position in thrombin result in the loss of Na+ binding and decreased protease activity (36), which again explains the results seen for the Ndl1* mutant (33).
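The selection of residues within 6 Å of a mutation site, which defines the region subjected to local minimization above, can be illustrated with a minimal sketch. It assumes that a residue is selected when any of its atoms lies within the cutoff of any atom of the mutated residue; the text does not state the exact criterion. The function name `residues_within` and the toy coordinates are hypothetical and are not taken from the Ndl1* model.

```python
import math

def residues_within(atoms, center_residue, cutoff=6.0):
    """Return sorted residue numbers having at least one atom within
    `cutoff` angstroms of any atom of `center_residue`.

    atoms: list of (residue_number, (x, y, z)) tuples.
    Assumption: an any-atom-to-any-atom distance criterion.
    """
    center_atoms = [xyz for res, xyz in atoms if res == center_residue]
    selected = set()
    for res, xyz in atoms:
        if any(math.dist(xyz, c) <= cutoff for c in center_atoms):
            selected.add(res)
    return sorted(selected)

# Toy coordinates (hypothetical, one atom per residue for brevity).
toy_atoms = [
    (1334, (0.0, 0.0, 0.0)),   # mutated residue
    (1335, (3.0, 0.0, 0.0)),   # 3.0 A away: selected
    (1281, (0.0, 5.9, 0.0)),   # 5.9 A away: selected
    (1200, (10.0, 0.0, 0.0)),  # 10.0 A away: excluded
]
selection = residues_within(toy_atoms, 1334)
```

The mutated residue itself is part of the selection, since its own atoms are at distance zero.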


Fig. 3.   Three-dimensional homology models of the protease-substrate complexes Ndl1*-Gd-(201-225) (A), Gd*-Snk-(173-197) (B), Snk*-Ea-(116-141) (C), and Ea*-Spz-(210-234) (D). The models of catalytic domains are shown as solvent-accessible surfaces. Peptide substrates are displayed as sticks (green for carbons, blue for nitrogens, red for oxygens, and yellow for sulfurs). The surfaces of enzyme residues making contact (<6 Å) with the corresponding substrates are colored orange. Mutated residues described in the text are colored magenta. Some mutated residues are not visible, either because they are hidden in the displayed view or because they are not accessible to solvent.

The Gd* active site is characterized by very hydrophobic properties of both primed and unprimed subsites (Fig. 2B). Residue Ile-468 (Ile-194) replaces the canonical Asp in the S1 pocket and contributes to the enhanced hydrophobicity of this site together with Ile-463 (Ile-190) and the unusual Ile-511 (Ile-226) that replaces a highly conserved Gly. This largely hydrophobic architecture of the S1 pocket is unusual in serine proteases and probably compensates for the unusual Ala-488 (Ala-215) that replaces the highly conserved Trp at this position. Residue Val-220 of Gd is potentially a good cleavage site for Gd*. To test the possibility of Gd activation by Gd*, the Gd fragment 210PKSSDGITSPV↓FVDDDEDDVLEHQF234 was docked onto the active site of Gd*. A potential activation site with chymotrypsin-like specificity is 128TQIQLEHIRKL↓SFIPDKKSSLLLDP152 located near the C2/factor B-type cleavage site (Fig. 1). Gd* carries two insertions in the 60- and 149-loops relative to chymotrypsin, a feature also observed in thrombin (34) where the insertions contribute to the narrow substrate specificity. The 60-loop in Gd* covers the substrate residues at P1 and P1'. The 149-loop is quite flexible, judging from the various conformations obtained in the fifty best models, and interacts loosely with substrate residues at P3'-P8' (Fig. 3B). Mutant alleles of gd have been identified and grouped in three complementation groups (37). Mutations in the catalytic domain G466D (G193D), G469E (G196E), and G484D (G211D) are all in the same group, which complements the group with mutations in the propeptide domain. G466D places the acidic side chain at the bottom of the S4' site but does not disturb the aromatic cluster formed by Phe-275, Trp-388, and Phe-390. The hypomorphic effect of this mutation (gd6) is moderate (37).
On the other hand, G469E introduces a charged side chain into a hydrophobic cluster formed by Ile-265 (Ile-33), Phe-294 (Phe-59), Val-306 (Val-64), Val-328 (Val-90), and Ile-331 (Ile-88). The unfavorable steric hindrance is partially compensated by backbone torsions in the structure core in the vicinity of the catalytic Ser-468 (Ser-195) and His-292 (His-57). A similar effect is observed for the mutant G484D where the charged side chain perturbs the hydrophobic cluster formed by Tyr-377 (Tyr-130), Leu-472 (Leu-201), Phe-445 (Phe-181), Tyr-513 (Tyr-228), and Ala-514 (Ala-229) with resulting rearrangement of the backbone structure of the S3-S4 sites and of the 186-loop and 220-loop shaping the Na+ binding site. G469E (gd10 allele) and G484D (gd7 allele) dramatically compromise the activity of Gd and yield completely dorsalized embryos.

The Snk fragment 173SGKQCVPSVPL↓IVGGPTRHG-LFPH197 was docked onto the active site of Gd* with the P1 residue Leu-183 into the chymotrypsin-like S1 pocket in contact with Ser-468 (Ser-189), Ile-463 (Ile-190), and Ile-467 (Ile-194) (contacts detailed in Fig. 2B and structure detailed in Fig. 3B). The small side chain of Ala-488 (Ala-215) in Gd* opens a cavity bordered by Leu-342 (Leu-97) and Tyr-344 (Tyr-99) that accommodates the P6 residue Val-178 of Snk. Of the other Snk residues, Val-181 at P3 contacts Ala-489 (Ala-216) and Leu-490 (Leu-217), and Phe-195 at P11' stacks favorably against Trp-388 (Trp-141), Phe-390 (Phe-143), and Phe-275 (Phe-34), whereas Leu-194 makes close contacts with Leu-401 (Leu-149d) and Phe-390 (Phe-143).
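The subsite labels used above follow the nomenclature of Schechter and Berger (31): substrate residues are numbered P1, P2, ... moving away from the scissile bond toward the N terminus, and P1', P2', ... toward the C terminus. A minimal sketch of this bookkeeping is given below. It assumes contiguous residue numbering with no alignment gaps between the residue and P1; the function name `subsite_label` is hypothetical.

```python
def subsite_label(residue_number, p1_number):
    """Return the Schechter-Berger label of a substrate residue.

    p1_number is the residue immediately N-terminal to the scissile bond.
    Assumption: residue numbers are contiguous (no gaps) in this region.
    """
    offset = p1_number - residue_number
    if offset >= 0:
        # Unprimed side: P1 is the residue before the scissile bond.
        return f"P{offset + 1}"
    # Primed side: P1' is the residue after the scissile bond.
    return f"P{-offset}'"

# Snk cleavage by Gd* occurs after Leu-183 (P1).
labels = {n: subsite_label(n, 183) for n in (178, 181, 183, 184)}
```

With Leu-183 at P1, Val-178 maps to P6 and Val-181 to P3, in agreement with the assignments in the text.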

The Ea fragment 116LPGQCGNILSNR↓IYGGMKTKIDEFPW141 was docked on Snk* (contacts detailed in Fig. 2C and structure detailed in Fig. 3C). Snk* carries Gly-370 (Gly-189) in the S1 pocket, consistent with either trypsin or chymotrypsin activity. Residue His-371 (His-190) makes the S1 pocket more prone to interact with hydrophilic rather than hydrophobic side chains. Ea residue Arg-127 fills the S1 pocket of Snk*, making two H-bonds with the backbone carbonyls of His-371 (His-190) and Phe-402 (Phe-218).

The Spz fragment 210NDLQPTDVSSR↓VGGSDERFL-CRSIR234 (FBgn0003495, SP-48607) was docked onto the Ea* active site (contacts detailed in Fig. 2D and structure detailed in Fig. 3D) with Arg-220 at P1 bound to Asp-332 (Asp-189). Several naturally occurring mutations of Ea* have been identified that lead to dominant or recessive phenotypes of dorsoventral differentiation (24). Dominant alleles are A325V (A183V), P373S (P225S), R335C (R192C), G336S (G193S), G371R (G223R), G283S (G142S), V360M (V213M), and G131E (G19E). Recessive alleles are G339R (G196R), G363E (G216E), S172L (S56L), and C324Y (C182Y). We introduced these mutations in Ea* and optimized the three-dimensional models by 500 cycles of conjugate gradient minimization of residues within 6 Å of the site of mutation. The positions of the mutated residues are colored on the Ea* surface in Fig. 3D. The mutant A325V (A183V) carries a bigger side chain in a densely packed region. The mutant P373S (P225S) perturbs the backbone of the 220-loop that is crucial for Na+ binding and substrate recognition (36, 38). The substitution may promote weak Na+ binding and enhanced catalytic activity, thereby explaining the gain-of-function phenotype observed experimentally (24). The mutant R335C (R192C) lacks one ion-pair interaction with the bound Spz. The mutant G336S (G193S) introduces a side chain into the S1' pocket that may lead to an incorrect orientation of the scissile bond and loss of catalytic activity. The mutant G371R (G223R) perturbs the 220-loop backbone. The mutant G283S (G142S) may displace Trp-141 nearby by constraining the backbone of the S' sites. The mutant V360M (V213M) reduces the accessibility of the S1 pocket and impairs the binding of substrates carrying Arg or Lys at P1.
Other Ea* mutants that are expected to compromise substrate binding are G131E (G19E) that places an acidic side chain in a hydrophobic environment, G339R (G196R) and G363E (G216E) that occlude the S1 pocket, S172L (S56L) that places a bulkier side chain in a densely packed region, and C324Y (C182Y) that removes the disulfide bond and stability of the hydrophobic core next to the active site.

Putative Cation Binding Sites and Their Alteration in Vivo-- Many vertebrate serine proteases contain functional cation binding sites that allosterically regulate activity and stability of the enzymes (39, 40), but such sites have not previously been described in invertebrate serine proteases. Inspection of the primary sequences and screening of the three-dimensional models of the dorsoventral proteases with the program VALE (16) identified binding sites for Na+ in Ndl1* and Gd* and for Ca2+ in Ea*, each corresponding to the positions of similar sites in the vertebrate proteases. The Na+ binding sites of Ndl1* (Fig. 2A) and Gd* (Fig. 2B) have an architecture similar to that described for thrombin (36, 38). Two carbonyl O atoms from residues 221 and 224 contribute together with four buried water molecules to the octahedral coordination of the cation: Arg-1361 (Arg-221) and Glu-1364 (Glu-224) for Ndl1*, and Cys-506 (Cys-221) and Gln-509 (Gln-224) for Gd*. The Ca2+ binding site of Ea* (Fig. 2D) is similar to that of trypsin (40), with two carboxylic side chains from Glu-193 (Glu-70) and Glu-203 (Glu-80) contributing to the octahedral coordination. In Ea*, three additional oxygens, from the backbone carbonyls of Thr-196 (Thr-73) and Thr-198 (Thr-75) and from the side chain of Asn-199 (Asn-76), contribute to Ca2+ binding. A water molecule could provide the sixth oxygen in the coordination shell.
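The ligand count behind the proposed octahedral Ca2+ site in Ea* (two Glu carboxylates, two backbone carbonyls, one Asn side chain oxygen, and one water) can be tallied with a minimal sketch. This is not the VALE algorithm (16); it is a simple distance count under stated assumptions. The 2.8 Å cutoff, the function name `coordinating_oxygens`, and the idealized coordinates (ligands placed 2.4 Å from the ion along the axes) are hypothetical and are not taken from the Ea* model. The sketch counts ligands only and does not check the geometry of the shell.

```python
import math

def coordinating_oxygens(ion_xyz, oxygens, cutoff=2.8):
    """Return names of oxygen atoms within `cutoff` angstroms of the ion.

    oxygens: list of (name, (x, y, z)) tuples.
    Assumption: a single distance cutoff defines the first shell.
    """
    return [name for name, xyz in oxygens
            if math.dist(ion_xyz, xyz) <= cutoff]

# Idealized octahedron (hypothetical coordinates, chymotrypsin numbering).
ion = (0.0, 0.0, 0.0)
candidate_oxygens = [
    ("Glu-70 carboxylate O", (2.4, 0.0, 0.0)),
    ("Glu-80 carboxylate O", (-2.4, 0.0, 0.0)),
    ("Thr-73 backbone O", (0.0, 2.4, 0.0)),
    ("Thr-75 backbone O", (0.0, -2.4, 0.0)),
    ("Asn-76 side chain O", (0.0, 0.0, 2.4)),
    ("water O", (0.0, 0.0, -2.4)),
    ("distant O", (5.0, 5.0, 0.0)),  # outside the first shell
]
shell = coordinating_oxygens(ion, candidate_oxygens)
six_ligands = len(shell) == 6
```

For the Na+ sites of Ndl1* and Gd* the same count would comprise the two backbone carbonyl oxygens of residues 221 and 224 and four water molecules.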

To determine whether these putative cation binding sites influence protease function in vivo, we mutagenized key residues in Gd and in Ea and then compared the ability of wild-type and mutant proteases to rescue embryos lacking maternal function for the respective proteins (Table II) (Figs. 4 and 5). In previous studies using the same wild-type mRNAs and recipient embryos, Gd has been shown to act in a dose-dependent manner to cause an abnormal expansion of ventral pattern elements ("ventralization"), whereas Ea rescues to wild type but cannot ventralize the embryo (4, 24). Similar studies could not be undertaken for the putative Na+ binding site in Ndl, as wild-type Ndl is unable to rescue in embryo RNA microinjection assays, presumably because of the complex activation mechanism and early action of this protease.3


                              
Table II
Rescue of ea- or gd-null embryos by wild-type and mutant mRNA injections
Ea and Gd mRNAs were injected into the corresponding null embryos and scored at gastrulation (criteria described in Fig. 4 legend). This scoring could be correlated with the cuticle phenotypes seen at the end of embryogenesis (Fig. 5).


Fig. 4.   Representative gastrulation patterns of injected embryos. Recipient embryos in A-F are of the genotype ea4/ea5022rx1, whereas those in G and H are gd7/gd7. All of the embryos are oriented with their anterior ends to the left and dorsal surface up. Injection of Ea wild type (WT) (A) results in a wild-type gastrulation pattern in which cells on the ventral side invaginate to form the ventral furrow, posterior cells migrate anteriorly on the dorsal side (arrowhead), and a headfold is visible only faintly along the lateral surface (asterisk). The Ea S195A (B), Ea E70A (C), and Ea E70K (data not shown) mutants are completely inactive in dorsoventral patterning, giving a complete dorsalization in which there is no ventral furrow or headfold, cells at the posterior do not migrate, and multiple symmetric folds appear along the anterior-posterior axis of the embryo. The Ea P225I mutant gives a weak partial rescue (D) in which the posterior cells migrate forward, but no ventral furrow forms and there are still multiple infoldings along the embryo anterior-posterior axis. Ea P225S (E) and Ea P225Y (F) mutants cause an intermediate ventralization of the embryo with the headfold prominent on the dorsal side (asterisk) and some anteriorward displacement of posterior cells along the dorsal side. Ea P225S typically gave more anteriorward movement of these cells than did Ea P225Y, consistent with a milder ventralization (supported by analysis of cuticle elements in Fig. 5), but these could not be readily distinguished in scoring gastrulation so they were grouped together in Table II. A more extreme ventralization could be seen with the injection of Gd wild type (WT) (G), Gd Y225P (H), or Gd Y225A (data not shown) in which there is a very prominent ventral furrow, no anterior displacement of posterior cells, and only a small headfold visible on the dorsal side of the embryo.


Fig. 5.   Representative cuticle patterns of injected embryos. Recipient embryos in A-E are of the genotype ea4/ea5022rx1, whereas those in F and G are gd7/gd7. When evident, embryos are oriented approximately with their anterior ends to the left and dorsal surface up. The external cuticle develops late in embryogenesis, but its elements reflect the earlier patterning along the dorsoventral axis. A, injection of Ea wild type (WT) results in a hatching embryo with bare dorsal cuticle, laterally derived filzkörper (arrowhead) and head skeleton (asterisk), and rows of ventral denticles (inverted V). B, an uninjected embryo does not develop either lateral or ventral structures and is considered completely dorsalized. The Ea S195A, Ea E70A, and Ea E70K mutants showed a similar phenotype, but their weak cuticles did not usually survive additional processing required for injected embryos. C, Ea P225S typically produced mildly ventralized embryos that retained filzkörper and a partial head skeleton, but their ventral denticles extend more dorsally than WT (this embryo was injected with 0.5 mg/ml mRNA). D, Ea P225I produced moderately dorsalized embryos with the rescue of filzkörper but rarely head skeleton and never ventral denticles. E, Ea P225Y typically produced moderately ventralized embryos lacking lateral filzkörper or head skeletons and having moderate expansion of disorganized ventral denticles around the embryo circumference (this embryo was injected with 0.5 mg/ml mRNA). F, a moderately to strongly ventralized embryo injected with Gd Y225P mRNA (0.06 mg/ml), showing strong expansion of ventral denticles. At high doses (0.6 mg/ml), Gd WT (G) and the Gd Tyr-225 mutants often gave rise to embryos with completely circumferential, disorganized ventral denticles.

For Gd, we injected synthetic mRNAs encoding wild-type Gd protein or mutations Y510A (Y225A) and Y510P (Y225P), expecting to disrupt Na+ binding (38, 39). At high doses (0.6 mg/ml), all three RNAs gave a strong ventralization phenotype in which excess Gd activity causes too much signaling through the Toll pathway (4), indicating that the mutant proteins are active. At a 10-fold lower dose (0.06 mg/ml), the wild-type RNA provides a broad range of phenotypes from strong ventralization to moderate dorsalization (partial rescue in Table II) while the mutant RNAs still show predominantly strongly and moderately ventralized embryos. This finding suggests that mutants predicted to lack Na+ binding have increased catalytic activity compared with wild type, implying that Na+ binding to Gd* may actually result in decreased catalytic activity.

For Ea, we compared the activity of wild type and two mutants of Glu-193 (Glu-70): E193A (E70A) predicted to abrogate Ca2+ binding, and E193K (E70K) predicted to eliminate Ca2+ binding but with the charged Lys partially substituting for Ca2+ (41, 42). These mutants resulted in a complete loss of Ea activity equivalent to the injection of the S337A (S195A) mutant lacking the catalytic serine, indicating that the Glu-193 (Glu-70) residue and possibly Ca2+ binding are critical for Ea function.

Engineering a Na+ Binding Site in Ea-- Most invertebrate proteases contain Pro-225, which is incompatible with Na+ binding rather than Tyr-225 or Phe-225, which are compatible with such binding (39). This usage dichotomy at residue 225 has profound structural (38, 39) and evolutionary (6) implications. One exciting possibility raised by the role of residue 225 in serine proteases (39) is the rational engineering of Na+ binding with the P225Y substitution. We surmised that Ea would be an excellent candidate to engineer a Na+ site, because it had already been shown that the P373S (P225S) substitution by an EMS mutation resulted in increased activity and ventralized phenotype (24). Ser is an intermediate in the genetic code between Pro and Tyr, and a saturation mutagenesis study has shown that Ser is also intermediate in catalytic activity between Pro and Tyr at position 225 in thrombin (38). We compared the activities of Ea proteins containing the mutations P373Y (P225Y) and P373S (P225S) with those of wild-type Ea. We found that P373Y (P225Y), similar to P373S (P225S), was capable of ventralizing easter-mutant embryos, something that the wild-type Ea protein is not able to do even when injected at high levels (Table II) (24). The P373Y (P225Y) mutant had significantly stronger ventralizing capacity than did P373S (P225S) and only rarely resulted in wild-type gastrulation or embryo hatching, even when titrated to a level (0.2 mg/ml) in which incomplete rescue was commonly seen together with weak ventralization. This behavior differs from that of previously described ventralizing easter alleles (24), which can be titrated to give a significant level of wild-type rescue, and suggests that this mutant enzyme may be less influenced by normal regulatory controls (43). A P373I (P225I) substitution resulted in a significant loss of activity with only weak partial rescue seen. 
The corresponding mutation in thrombin drastically reduced catalytic activity and did not confer Na+ binding, as judged from the crystal structure of the mutant thrombin (38).
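The statement that Ser is an intermediate in the genetic code between Pro and Tyr can be checked with a short sketch based on the standard genetic code: no Pro codon can be converted into a Tyr codon by a single nucleotide change, whereas Pro to Ser and Ser to Tyr are each accessible in one step (for example, CCU to UCU to UAU). The function name `min_codon_distance` is hypothetical.

```python
# Codons of the standard genetic code (RNA alphabet) for the three residues.
CODONS = {
    "Pro": ["CCU", "CCC", "CCA", "CCG"],
    "Ser": ["UCU", "UCC", "UCA", "UCG", "AGU", "AGC"],
    "Tyr": ["UAU", "UAC"],
}

def min_codon_distance(aa_from, aa_to):
    """Smallest number of nucleotide substitutions needed to turn
    any codon of aa_from into any codon of aa_to."""
    return min(
        sum(x != y for x, y in zip(codon_a, codon_b))
        for codon_a in CODONS[aa_from]
        for codon_b in CODONS[aa_to]
    )

pro_to_tyr = min_codon_distance("Pro", "Tyr")
pro_to_ser = min_codon_distance("Pro", "Ser")
ser_to_tyr = min_codon_distance("Ser", "Tyr")
```

Here `pro_to_tyr` evaluates to 2, and `pro_to_ser` and `ser_to_tyr` each evaluate to 1, which is why a single-nucleotide EMS mutation can produce P225S but not P225Y.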

    DISCUSSION

The primary cleavage sites of Ndl, Snk, Ea, and Spz have been proposed previously (3, 5, 7, 30) and are confirmed in the present study. Ndl is secreted into the perivitelline space and is required for the ventralization process upstream of Snk activation (30). The activation of Gd remains controversial, but our models propose that Ndl1* has trypsin-like activity and may bind Gd at its activation site Lys-211 better than Snk, Ea, or Spz. We believe that the second protease domain of Ndl*, Ndl2*, is inactive and plays no role in the cascade. Therefore, although Gd can be activated at Val-220 by Gd*, the predicted binding energies favor Ndl1* as the more likely activator of Gd (3). Gd autoactivation might be detected when Gd is overexpressed either in embryos (4) or in cell culture (3, 5), as suggested by the predicted low affinity of Gd for its own activation site, but this leaky autoactivation might not be as effective at physiologic expression levels of Gd. Hence, the proposed three-dimensional models of Ndl1*, Gd*, Snk*, and Ea* are consistent with the overall organization of the enzymatic cascade defining dorsoventral polarity in the fruit fly as recently described from cell culture and embryo studies (3, 5). The cascade is initiated by Gd activation, more probably by Ndl1* as suggested in vivo (4), and alternatively more weakly by Gd or Gd* as also proposed in vivo previously (5). Gd* then activates Snk, and Snk* activates Ea. Ea* then processes Spz for signaling via the receptor Toll. The three-dimensional models are also consistent with previous mutagenesis studies of Ndl1* (33) and Ea* (24) and offer a structural explanation of the observed mutant phenotypes.

Notably, the three-dimensional models reveal new structural features that can be exploited in future in vivo studies. Of particular importance is the unanticipated identification of a Ca2+ binding site in Ea* and a Na+ binding site in Gd* and Ndl1*. The binding of Ca2+ in trypsin stabilizes the fold of the protease domain (40), and Na+ binding to thrombin and many other serine proteases increases the catalytic activity toward synthetic and natural substrates (36, 38, 39). Based on the results presented here, it is highly likely that Ca2+ binding to Ea* plays a key role in the function of this enzyme in vivo. Likewise, Na+ binding to Gd* and possibly Ndl1* has functional significance. Interestingly, Na+ binding to Gd* may actually result in the inhibition of the catalytic activity of the enzyme in contrast to the effect observed in all other Na+-dependent allosteric serine proteases studied to date (39). The current knowledge on the role of residue 225 in serine proteases (39) predicts that Na+ binding can be introduced in proteases carrying Pro-225 using the Pro → Tyr replacement. However, the P225Y substitution in tissue plasminogen activator is not sufficient to introduce Na+ binding and actually results in reduced catalytic activity (44). The introduction of Na+ binding in this protease requires substitution of a large number of residues in addition to Pro-225.4 Therefore, it is remarkable that the P225Y substitution in Ea* has such a profound effect on its catalytic activity, consistent with a gain of function that likely results from Na+ binding. This observation motivates the analysis of this protease in terms of kinetic and direct structural studies and offers new and important insights into ongoing efforts to engineer Na+ binding and enhanced catalytic activity in serine proteases of medical and biotechnological relevance.

    ACKNOWLEDGEMENTS

We thank Dr. Carl Hashimoto for offering decisive guidance at early stages of this project. E. K. L. is grateful to Lora LeMosy for computing assistance and to Fu-Shin Yu and Ke-Ping Xu for the use of their phase-contrast photomicroscope in Fig. 5.

    FOOTNOTES

* This work was supported in part by National Institutes of Health Research Grants HL49413 and HL58141 (to E. D. C.) and NS36570 (to J. B. S.). The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The on-line version of this article (available at http://www.jbc.org) contains supplemental Table III and Fig. 6.

E. D. C. dedicates this article to Professor Eraldo Antonini on the occasion of the 20th anniversary of his untimely death on March 19, 1983.

|| To whom correspondence should be addressed. Dept. of Biochemistry and Molecular Biophysics, Washington University School of Medicine, 660 S. Euclid Ave., Box 8231, St. Louis, MO 63110. Tel.: 314-362-4185; Fax: 314-747-5354; E-mail: enrico@biochem.wustl.edu.

Published, JBC Papers in Press, December 18, 2002, DOI 10.1074/jbc.M211820200

2 Underlined numbers refer to positions aligned with chymotrypsin(ogen). Non-underlined positions refer to the corresponding zymogen precursor sequence.

3 E. LeMosy, unpublished data.

4 E. Di Cera, unpublished results.

    ABBREVIATIONS

The abbreviations used are: Ea, Easter zymogen; Chy, chymotrypsin; Ea*, activated Easter; Gd, Gastrulation Defective zymogen; Gd*, activated Gastrulation Defective; Ndl, Nudel zymogen; Ndl*, activated Nudel; Snk, Snake zymogen; Snk*, activated Snake; Spz, Spätzle; Thr, thrombin; Try, trypsin; PDB, Protein Data Bank; FB, Flybase; SP, Swiss Protein.

    REFERENCES

1. LeMosy, E. K., Hong, C. C., and Hashimoto, C. (1999) Trends Cell Biol. 9, 102-107
2. Anderson, K. V., and Nüsslein-Volhard, C. (1984) Nature 311, 223-227
3. LeMosy, E. K., Tan, Y. Q., and Hashimoto, C. (2001) Proc. Natl. Acad. Sci. U. S. A. 98, 5055-5060
4. Han, J. H., Lee, S. H., Tan, Y. Q., LeMosy, E. K., and Hashimoto, C. (2000) Proc. Natl. Acad. Sci. U. S. A. 97, 9093-9097
5. Dissing, M., Giordano, H., and DeLotto, R. (2001) EMBO J. 20, 2387-2393
6. Krem, M. M., and Di Cera, E. (2002) Trends Biochem. Sci. 27, 67-74
7. DeLotto, Y., and DeLotto, R. (1998) Mech. Dev. 72, 141-148
8. Ayala, Y. M., Cantwell, A. M., Rose, T., Bush, L. A., Arosio, D., and Di Cera, E. (2001) Proteins 45, 107-116
9. Rose, T., and Di Cera, E. (2002) J. Biol. Chem. 277, 18875-18880
10. Rose, T., and Di Cera, E. (2002) J. Biol. Chem. 277, 19243-19246
11. Sali, A., and Blundell, T. L. (1993) J. Mol. Biol. 234, 779-815
12. Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov, I. N., and Bourne, P. E. (2000) Nucleic Acids Res. 28, 235-242
13. Laskowski, R. A., Moss, D. S., and Thornton, J. M. (1993) J. Mol. Biol. 231, 1049-1067
14. Hooft, R. W. W., Vriend, G., Sander, C., and Abola, E. E. (1996) Nature 381, 272-272
15. Luthy, R., Bowie, J. U., and Eisenberg, D. (1992) Nature 356, 83-85
16. Nayal, M., and Di Cera, E. (1994) Proc. Natl. Acad. Sci. U. S. A. 88, 817-821
17. Pickett, S. D., and Sternberg, M. J. E. (1993) J. Mol. Biol. 231, 825-839
18. Finkelstein, A. V., and Janin, J. (1989) Protein Eng. 3, 1-3
19. Nicholls, A., Sharp, K. A., and Honig, B. (1991) Proteins 11, 281-296
20. Sharp, K. A. (1998) Proteins 33, 39-48
21. Konrad, K. D., Goralski, T. J., Mahowald, A. P., and Marsh, J. L. (1998) Proc. Natl. Acad. Sci. U. S. A. 95, 6819-6824
22. Chasan, R., and Anderson, K. V. (1989) Cell 56, 291-400
23. Konrad, K. D., Goralski, T. J., and Mahowald, A. P. (1988) Dev. Biol. 127, 133-142
24. Jin, Y. S., and Anderson, K. V. (1990) Cell 60, 873-881
25. Anderson, K. V., Jürgens, G., and Nüsslein-Volhard, C. (1985) Cell 42, 779-789
26. Jürgens, G., and Nüsslein-Volhard, C. (1986) in Drosophila: A Practical Approach (Roberts, D. B., ed), pp. 199-227, IRL Press at Oxford University Press, Oxford, United Kingdom
27. Willnow, T. E., Orth, K., and Herz, J. (1994) J. Biol. Chem. 269, 15827-15832
28. Nielsen, H., Engelbrecht, J., Brunak, S., and von Heijne, G. (1997) Int. J. Neural. Syst. 8, 581-599
29. DeLotto, R. (2001) EMBO Rep. 2, 721-726
30. LeMosy, E. K., Kemler, D., and Hashimoto, C. (1998) Development 125, 4045-4053
31. Schechter, I., and Berger, A. (1967) Biochem. Biophys. Res. Commun. 27, 157-162
32. Wei, A. Z., Mayr, I., and Bode, W. (1988) FEBS Lett. 234, 367-373
33. LeMosy, E. K., Leclerc, C. L., and Hashimoto, C. (2000) Genetics 154, 247-257
34. Bode, W., Turk, D., and Karshikov, A. (1992) Protein Sci. 1, 426-471
35. Arosio, D., Ayala, Y. M., and Di Cera, E. (2000) Biochemistry 39, 8095-8101
36. Di Cera, E., Guinto, E. R., Vindigni, A., Dang, Q. D., Ayala, Y. M., Wuyi, M., and Tulinsky, A. (1995) J. Biol. Chem. 270, 22089-22092
37. Ponomareff, G., Giordano, H., DeLotto, Y., and DeLotto, R. (2001) Genetics 159, 635-645
38. Guinto, E. R., Caccia, S., Rose, T., Futterer, K., Waksman, G., and Di Cera, E. (1999) Proc. Natl. Acad. Sci. U. S. A. 96, 1852-1857
39. Dang, Q. D., and Di Cera, E. (1996) Proc. Natl. Acad. Sci. U. S. A. 93, 10653-10656
40. Bode, W., and Schwager, P. (1975) J. Mol. Biol. 98, 693-717
41. Mathur, A., Zhong, D., Sabharwal, A. K., Smith, K. J., and Bajaj, S. P. (1997) J. Biol. Chem. 272, 23418-23426
42. Lai, T.-S., Slaughter, T. F., Peoples, K. A., and Greenberg, C. S. (1999) J. Biol. Chem. 274, 24953-24958
43. Chang, A. J., and Morisato, D. (2002) Development 129, 5635-5645
44. Vindigni, A., and Di Cera, E. (1998) Protein Sci. 7, 1728-1737


Copyright © 2003 by The American Society for Biochemistry and Molecular Biology, Inc.


