©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
A Novel Mutagenesis Strategy Identifies Distantly Spaced Amino Acid Sequences That Are Required for the Phosphorylation of Both the Oligosaccharides of Procathepsin D by N-Acetylglucosamine 1-Phosphotransferase (*)

(Received for publication, October 7, 1994; and in revised form, October 27, 1994)

Michael L. Dustin (§) Thomas J. Baranski (¶) Deepak Sampath Stuart Kornfeld (**)

From the Department of Medicine, Washington University School of Medicine, St. Louis, Missouri 63110


ABSTRACT

A novel combinatorial mutagenesis strategy (shuffle mutagenesis) was developed to identify sequences in the propiece and amino lobe of cathepsin D which direct oligosaccharide phosphorylation by UDP-GlcNAc:lysosomal enzyme N-acetylglucosamine 1-phosphotransferase. Propiece restriction fragments and oligonucleotide cassettes corresponding to 13 regions of the cathepsin D and glycopepsinogen amino lobes were randomly shuffled together to generate a large library of chimeric molecules. The library was inserted into an expression vector encoding the carboxyl lobe of cathepsin D with a carboxyl-terminal myc epitope and a CD8 transmembrane extension. Transfected COS1 cells expressing the membrane-anchored forms of the cathepsin D/glycopepsinogen chimeras at the cell surface were selected with solid phase mannose 6-phosphate receptor or an antibody to the myc epitope. Plasmids were rescued in Escherichia coli and sequenced by hybridization to the original oligonucleotide cassettes. Two regions of the cathepsin D amino lobe (segments 7 and 12) were found to contribute to proper folding, surface expression, and selective phosphorylation of the carboxyl lobe oligosaccharide. Two different cathepsin D regions (the propiece and segment 5) cooperated with a previously identified recognition element in the carboxyl lobe to allow efficient phosphorylation of both the amino and carboxyl lobe oligosaccharides.

Three general models for extending the catalytic reach of N-acetylglucosamine 1-phosphotransferase to widely spaced oligosaccharides are presented.


INTRODUCTION

A key step in the targeting of newly synthesized acid hydrolases to lysosomes is the recognition by the enzyme UDP-GlcNAc:lysosomal enzyme N-acetylglucosamine 1-phosphotransferase (abbreviated phosphotransferase) (^1) of a protein determinant shared among at least 40 different lysosomal hydrolases (1, 2). This interaction results in the phosphorylation of mannose residues on the Asn-linked high mannose oligosaccharides of the lysosomal hydrolases. These phosphomannosyl residues serve as high affinity ligands for binding of the hydrolases to mannose 6-phosphate receptors present in the Golgi and their subsequent translocation to the lysosome (3).

A molecular dissection of the phosphotransferase recognition marker on lysosomal hydrolases was undertaken in this laboratory using a pair of aspartyl proteases, the lysosomal hydrolase cathepsin D and the secretory protein pepsinogen, as sequence donors for chimeric proteins that were tested for their ability to act as phosphotransferase substrates (4). Even though cathepsin D and pepsinogen are 45% identical in amino acid sequence and have similar secondary and tertiary structures (5, 6), they differ in that cathepsin D is well phosphorylated by phosphotransferase, whereas a glycosylated form of pepsinogen is not. These studies led to the identification of a phosphotransferase recognition patch in the carboxyl lobe of cathepsin D formed by two noncontinuous primary sequences (lysine 203 and amino acids 265-292). Although these residues were sufficient to confer recognition by phosphotransferase when substituted into homologous positions in pepsinogen, it was observed that the presence of multiple regions of the amino lobe of cathepsin D enhanced phosphorylation of the chimeric proteins. It was concluded that these elements may be part of an extended carboxyl lobe recognition domain or comprise a second independent recognition domain. It was also observed that although the presence of a phosphotransferase recognition domain located on either lobe of a cathepsin D/glycopepsinogen chimeric molecule is sufficient to allow phosphorylation of oligosaccharides on both lobes, the oligosaccharide on the lobe that contained the recognition domain was the preferred substrate (7, 8).

The identification of specific sequences in the amino lobe of cathepsin D which contribute to phosphotransferase recognition was attempted by generating six chimeric proteins using two shared restriction sites in the 550-nucleotide amino lobe region (7). Chimeric proteins containing only a single cathepsin D amino lobe region were not well phosphorylated, but those with the pairwise combinations were phosphorylated to some degree. This suggested that multiple regions are required for generation of the recognition site or are required for proper folding.

To characterize further the cathepsin D amino lobe elements that contribute to phosphorylation, we have taken an approach that redefines and extends the limits of the chimeric methodology applied in our original studies. Rather than making a small number of constructs manually and examining each individually, we sought to generate a large library of chimeric molecules and analyze the entire library by a phenotypic selection. Short segments of cathepsin D and pepsinogen encoded by oligonucleotide cassettes were shuffled together through a directed subcloning strategy that ensured high fidelity and representation. The library was expressed in COS cells, and chimeric proteins that folded correctly and were phosphorylated were selected and analyzed. This approach allowed the analysis of 2 orders of magnitude more constructs than had been screened in our prior studies over several years.

Our experiments have defined two regions of the amino lobe of cathepsin D which contribute to proper folding and surface expression and two distinct regions that promote phosphorylation of the amino lobe oligosaccharide (CHO 70). Three models are proposed to account for these results.


EXPERIMENTAL PROCEDURES

Cell Lines, Plasmids, and Reagents

COS1 cells (9) and the 9E10 hybridoma producing an antibody to c-myc peptide (10) were obtained from ATCC (Rockville, MD). The CDM* and CDMK plasmids were a gift of Dr. Hugh Pelham (11). A human CD8 cDNA clone was a gift of Dr. Paula Kavathas, Yale University (12). The vector AprM8, a derivative of CDM8 (13) in which the supF marker was replaced with an ampicillin resistance gene, was a gift of Dr. Lloyd Klickstein (Center for Blood Research, Boston). The cathepsin D and glycopepsinogen cDNAs were described previously (4, 14). Recombinant endoglycosidase H fused to maltose-binding protein, recombinant peptide N-glycosidase F, and restriction enzymes were obtained from New England Biolabs and used in buffers supplied with the enzymes. T4 DNA ligase was obtained from Life Technologies, Inc. and used in a standard ligation buffer (15). Human plasma fibronectin was obtained from the New York Blood Center. [2-^3H]Mannose and [^35S]methionine/cysteine mixture were obtained from DuPont NEN. Crude [gamma-^32P]ATP was obtained from ICN (Irvine, CA). Carrier-free ^125I was obtained from Amersham Corp. Chromatography paper was obtained from Whatman. Dimethyl sulfoxide was obtained from Taylor Chemical Co. (St. Louis, MO). Bovine mannose 6-phosphate/IGF-II receptor (Man-6-P/IGF-II receptor) was purified from fresh or frozen calf liver as described previously except that the receptor was eluted from the affinity column with 10 mM Man-6-P in 1% octyl beta-D-glucopyranoside rather than in Triton X-100. Concanavalin A-Sepharose was from Pharmacia Biotech Inc. Fetal bovine serum and bovine calf serum were obtained from HyClone (Logan, UT). NuSerum was from Collaborative Research. Fluorescein isothiocyanate (FITC) was obtained from Molecular Probes (Eugene, OR). Phycoerythrin-labeled goat anti-mouse IgG was purchased from BioMeda (Foster City, CA). Other reagents were of the highest purity available and were purchased from Sigma or Fisher.

Oligonucleotides

Oligonucleotides were synthesized on an Applied Biosystems 380A solid-phase synthesizer and purified by ethanol precipitation. The sequences of the oligonucleotides used to generate the myc peptide linker, the new restriction sites in cathepsin D and glycopepsinogen, and the cassettes used to construct the chimeric library will be made available on request.

Construction of Membrane-anchored Cathepsin D and Shuffle Library

Membrane-anchored cathepsin D (CD-MCD8) was generated by ligating the SfiI-MaeI fragment containing nucleotides 990-1287 of cathepsin D, a preannealed oligonucleotide cassette corresponding to the myc peptide recognized by the 9E10 mAb (SMEQKLISEEDLN) with MaeI and DraIII cohesive ends, and a DraIII-NotI fragment containing nucleotides 466-1322 of the human CD8 cDNA into the 4-kilobase fragment of Bluescript containing human cathepsin D cDNA (HindIII to BglII/BamHI) that had been cut with SfiI and NotI. The products of the ligation were sequenced through the oligonucleotide cassette. The HindIII-NotI fragment of the CD-MCD8 was subcloned into HindIII-NotI-cut pCDM8 and tested for expression in COS cells (see below). Later the construct was subcloned into pAprM8 for routine COS cell expression. This construct contains the entire sequence of cathepsin D; the myc peptide; and the hinge, transmembrane, and cytoplasmic domains of CD8 (see Fig. 3A).


Figure 3: Binding of Man-6-P/IGF-II receptor and 9E10 antibody to COS cells transiently expressing membrane-anchored chimeric proteins. The Man-6-P/IGF-II receptor was directly labeled with FITC (horizontal axis); the 9E10 antibody was detected with phycoerythrin goat anti-mouse IgG (vertical axis). Ninety-eight percent of mock transfected cells were below 10 fluorescent units on each axis. The primary sequence schematics show cathepsin D as dark gray, pepsinogen as light gray, the myc peptide as hatched, and CD8 sequences as white and black (transmembrane region). Panel A, CD-MCD8; panel B, CP71-MCD8 (GP p1-319, CD 320-348); panel C, CP70-MCD8 (CD p1-187, GP 188-319, CD 320-348); and panel D, CP55-MCD8 (CD p1-187, GP 188-230, CD 231-348).



Thirteen regions were selected which spanned from amino acids 37 to 186 of cathepsin D. The numbering conventions for the amino acid sequence of cathepsin D and glycopepsinogen are outlined in Fig. 1. XhoI and KpnI sites were engineered into the cDNA of cathepsin D and CP3 (which contains the amino lobe of glycopepsinogen up to the HincII site at nucleotide 931) by site-directed mutagenesis in the vector pGBT such that these restriction sites flanked the 5' and 3' boundaries of the 13 consecutive segments. The HindIII-XhoI regions of the cathepsin D and CP3 cDNAs were sequenced and used as a source of region 1 for subsequent steps. The 13 regions between the XhoI and KpnI sites were numbered regions 2-14 (Fig. 1). Oligonucleotide cassettes corresponding to regions 2-14 for cathepsin D and glycopepsinogen were synthesized to allow a three-step subcloning strategy outlined in Fig. 2. The oligonucleotides introduce SphI, SacI, BamHI, StyI, EcoRI, and SalI sites into the cathepsin D cDNA and allow complete assembly without altering the sequence of cathepsin D. The introduction of restriction sites SacI (Asp-63 to Glu), StyI (Gln-93 to Ser and Val-94 to Leu), and SalI (Ser-161 to Asp) and the requirement to otherwise maintain compatibility with the cathepsin D sequences in the junctions between regions 3 and 4 (Arg-58 to Lys, Phe-59 to Tyr), regions 7 and 8 (Ile-114 to Val), regions 8 and 9 (Ser-124 to Ile), and regions 13 and 14 (Ser-177 to Pro) resulted in introduction of the indicated changes to the glycopepsinogen sequence (Fig. 1).


Figure 1: Alignment of cathepsin D and glycopepsinogen amino lobes with junctions for shuffle mutagenesis. Numbering is for cathepsin D sequence with p1-p44 for the propiece and 1-191 from cathepsin D shown. Boxed residues are identical between cathepsin D and glycopepsinogen. The solid vertical bars represent restriction sites remaining in the completed chimeric molecules defining one class of junctions between regions. The dashed vertical/horizontal partitions represent a second class of junctions between regions based on annealing of overhangs between two oligonucleotide cassettes without leaving a unique restriction site. The circles indicate residues that were changed from the glycopepsinogen residue (shown) to another residue (as indicated under "Experimental Procedures") to splice to the cathepsin D sequence. The solid underline starting at residue 70 marks the site of Asn-linked glycan addition to the amino lobe. The single underlined residues at p34 and 77 are implicated in phosphotransferase recognition as described under "Results."




Figure 2: Construction of shuffle library. The numbers at the bottom of the plasmids are the number of potential products after each ligation. In round 1 oligonucleotide cassettes were annealed and ligated into pSP64 as indicated. At this point each potential product was independently prepared and sequenced. The indicated restriction sites were then used to prepare fragments used in subsequent steps. The final shuffle library encoding membrane-anchored chimeric proteins was subcloned into the vector pAprM8. After selection, a subset of constructs was subcloned into a modified version of pCDM for expression as secreted protein.



The initial round of subcloning into pSP64 cut with HindIII and EcoRI resulted in 30 possible ligation products corresponding to every combination of cathepsin D and glycopepsinogen segments for the 13 consecutive regions excluding the propieces. These were sequenced, and validated copies of each ligation product were prepared as purified plasmid and quantified by A260. In the second subcloning step the versions of each subregion corresponding to each possible cathepsin D/glycopepsinogen combination were mixed in equal ratio based on OD260, and the pools were cut with the indicated restriction enzymes. The resulting fragments were then copurified and used in ligations as outlined in Fig. 2. The number of possible combinations at each step is indicated in Fig. 2. The number of colonies pooled at each step was sufficient to obtain greater than 95% of possible combinations (15).
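The relationship between the number of colonies pooled and the chance of capturing a given combination can be sketched as follows. This is only an illustration, assuming that every combination is equally likely to appear in each colony (the standard library-sampling approximation); the function names and the example pool of 16 combinations are hypothetical and are not taken from Fig. 2.

```python
import math


def colonies_for_coverage(n_combinations: int, probability: float) -> int:
    """Colonies needed so that any one given combination is present with the
    requested probability, assuming all combinations are equally likely."""
    if n_combinations < 1 or not 0.0 < probability < 1.0:
        raise ValueError("need n_combinations >= 1 and 0 < probability < 1")
    if n_combinations == 1:
        return 1
    # P(missing after N colonies) = (1 - 1/n)^N, solved for N.
    needed = math.log(1.0 - probability) / math.log(1.0 - 1.0 / n_combinations)
    return math.ceil(needed)


def expected_fraction_covered(n_combinations: int, n_colonies: int) -> float:
    """Expected fraction of combinations seen at least once in n_colonies."""
    return 1.0 - (1.0 - 1.0 / n_combinations) ** n_colonies


if __name__ == "__main__":
    # Hypothetical intermediate pool with 16 possible combinations.
    n = colonies_for_coverage(16, 0.95)
    print(n, round(expected_fraction_covered(16, n), 4))
```

Under this assumption the required colony number grows roughly in proportion to the number of combinations, which is why a staged assembly with small pools at each step is easier to cover than a single large ligation.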

Nomenclature for New Chimeric Constructs

Previously generated constructs are referred to by the original CP (chimeric protein) nomenclature (4). New constructs generated by the synthetic gene approach can be described in two ways based on the binary logic intrinsic to this system. A base 2 numbering system can be used to describe each construct using 1 = cathepsin D and 0 = glycopepsinogen at each of the 14 regions, with region 1 as the leftmost digit. In this system a construct with cathepsin D regions 1, 5, 7, and 11 is 10001010001000; a construct with only cathepsin D regions 7 and 12 is 00000010000100. Abbreviated series numbers with the prefix S (shuffle) were generated by converting the base 2 representations above to base 10 such that the two constructs above become S8840 and S132, respectively; the construct with all cathepsin D regions is S16383, which is identical to CP55, whereas the construct with all glycopepsinogen regions is S0, which is CP3 with the changes outlined above. Thus each potential chimeric construct generated from the random cassette mutagenesis strategy has a unique designation. Unless otherwise indicated the constructs also contain glycopepsinogen amino acids 187-230 and cathepsin D amino acids 231-348 plus the myc peptide and a stop codon. The membrane-anchored versions are indicated with the suffix MCD8 to indicate the myc peptide and CD8 hinge, transmembrane, and cytoplasmic tail regions.
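The conversion between region composition and series number can be illustrated with a short sketch. It assumes 14 regions with region 1 as the most significant (leftmost) binary digit, which is the reading that reproduces S8840, S132, and S16383; the function names are hypothetical and introduced only for illustration.

```python
N_REGIONS = 14  # regions 1-14 of the shuffled amino lobe


def binary_designation(cathepsin_d_regions) -> str:
    """Base 2 string: 1 = cathepsin D, 0 = glycopepsinogen, region 1 leftmost."""
    regions = set(cathepsin_d_regions)
    if any(r < 1 or r > N_REGIONS for r in regions):
        raise ValueError("region numbers must lie between 1 and 14")
    return "".join("1" if i in regions else "0" for i in range(1, N_REGIONS + 1))


def series_name(cathepsin_d_regions) -> str:
    """Series number: the base 2 designation converted to base 10, prefixed with S."""
    return "S" + str(int(binary_designation(cathepsin_d_regions), 2))


def regions_from_series(name: str) -> list:
    """Inverse mapping: list the cathepsin D regions encoded by a series name."""
    value = int(name[1:])  # drop the leading 'S'
    if not 0 <= value < 2 ** N_REGIONS:
        raise ValueError("series number out of range for 14 regions")
    bits = format(value, "0{}b".format(N_REGIONS))
    return [i + 1 for i, bit in enumerate(bits) if bit == "1"]


if __name__ == "__main__":
    print(binary_designation([1, 5, 7, 11]), series_name([1, 5, 7, 11]))
    print(binary_designation([7, 12]), series_name([7, 12]))
    print(regions_from_series("S132"))
```

Because the mapping is a bijection between the 2^14 region patterns and the integers 0 to 16383, each series name identifies exactly one construct.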

COS Cell Transfections

COS1 cells were maintained in Dulbecco's modified Eagle's medium plus 10% bovine calf serum by 1:2 dilution every 2-3 days. COS1 cells were transfected with DNA purified by alkaline lysis and polyethylene glycol precipitation using the DEAE-dextran method (13). High DNA transfections were performed with 10 µg of DNA/2 ml of RPMI 1640, 10% NuSerum (Collaborative Research), 50 µg/ml gentamicin that was mixed with 2 ml of the same media containing 200 µg/ml DEAE-dextran and 200 µM chloroquine for each 100-mm plate of 50% confluent COS1 cells. Low DNA transfections used 10 ng of plasmid DNA and 10 µg of sheared salmon sperm DNA as a carrier with the same volumes and concentrations of other reagents. Twenty 100-mm plates were transfected for each selection. Selections were performed on 60-mm Petri plates coated with 9E10 mAb (2 ml of 10 µg/ml in PBS overnight at 24 °C followed by 1 mg/ml bovine serum albumin block in PBS and washing) or Man-6-P/IGF-II receptor-coated plates (2 ml of 5 µg/ml bovine liver Man-6-P/IGF-II receptor in 0.1% octyl glucoside detergent in PBS overnight at 24 °C followed by 1 mg/ml bovine serum albumin block in PBS and washing). Cells were split 1:1 24 h after transfection and used for selection at 48-72 h. The COS1 cells were released from the plates in Hanks' balanced salt solution, 10 mM Hepes, pH 7.4, no Ca^2+ or Mg^2+, 5% fetal bovine serum, 0.5 mM EDTA. The cells were washed with the same media without EDTA and incubated on the selection plates (one 100-mm plate of COS cells/60-mm selection plate) in 2 ml of media for 60 min at 4 °C. The plates were gently washed four times with PBS containing no added protein. Plasmid DNA from adherent cells was recovered by adding 0.4 ml of 1% SDS, 10 mM EDTA to the plate and transferring the lysate to a 1.5-ml tube on ice followed by the addition of 0.1 ml of 5 M NaCl (16). The tube was gently mixed, left on ice overnight, and centrifuged at 40,000 × g. The supernatant was phenol extracted, ethanol precipitated with 10 µg of tRNA carrier, and used to transform Escherichia coli strain DH10 by electroporation; the cells were plated on selective agar. The colonies were enumerated, and DNA was purified directly from a slurry of the colonies or after several hours of culture in liquid LB medium to increase the number of cells. Size selection on products isolated after one round of high DNA transfections/selection and two rounds of low DNA transfections/selection was carried out by cutting the plasmids with HindIII, isolating the correct band in the 7-kilobase range by excision and adsorption to ceramic particles (GeneClean, Bio101), and then ligation at 1 ng/ml DNA with T4 DNA ligase.

Labeling of Man-6-P/IGF-II Receptor with Fluorescein Isothiocyanate

Man-6-P/IGF-II receptor micelles were prepared by concentrating 500 µg of receptor in 1% octyl beta-D-glucopyranoside to 50 µl in a Centricon 30 (Amicon, Danvers, MA), diluting it to 2 ml with cold PBS in the upper chamber of the Centricon unit, and concentrating it back to 50 µl. The dilution-concentration cycle was repeated twice more with PBS (17). The receptor concentration in the final concentrated sample of 50 µl was determined by Coomassie Blue dye binding (18). The receptor was labeled with FITC at a ratio of 0.025 mg of FITC/mg of receptor for 1 h at room temperature in 0.1 M NaHCO3, pH 8.4. Labeled Man-6-P/IGF-II receptor was separated from free FITC on a Sephadex G-25 column. The FITC-Man-6-P/IGF-II receptor was stored at 500 µg/ml in PBS, 0.1% bovine serum albumin.

Immunofluorescence Staining of COS Cells

COS1 cells transfected with 10 µg of plasmid DNA/100-mm plate were treated with trypsin and replated after 24 h and then removed from the plates with Hanks' balanced salt solution, 10 mM Hepes, 5% fetal bovine serum, 0.5 mM EDTA. Approximately 10^6 cells were stained with 2.5 µg of FITC-Man-6-P/IGF-II receptor and 50 ng of 9E10 mAb in 50 µl of the same buffer. After 1 h at 4 °C, the cells were washed, and 2.5 µg of phycoerythrin-labeled goat anti-mouse IgG was incubated with the cells for 30 min at 4 °C. The cells were washed again and analyzed on a FACSCAN (Becton Dickinson) using a Consort 30 analysis package within 12 h.

Labeling of COS Cells

COS1 cells transfected with 20 µg of DNA for two 100-mm dishes were replated at 24 h onto a single 100-mm dish coated with 25 µg/ml human plasma fibronectin. At 48 h the cell layers were washed twice with Dulbecco's modified Eagle's medium without glucose. The washed COS cells were incubated with 2 ml of Dulbecco's modified Eagle's medium, 10 mM Hepes, 10% dialyzed fetal bovine serum, 0.5 mM D-glucose, 400 µCi of [2-^3H]mannose for 4 h at 37 °C, at which point additional glucose was added to 5 mM, effectively stopping mannose uptake into the cells. The media were harvested after an additional 12-16 h and were dialyzed against 25 mM Tris·HCl, pH 8.0, 150 mM NaCl, 0.02% NaN3 to remove free [2-^3H]mannose. When cellular material was collected, the cell layer was washed twice with imidazole-buffered saline and lysed in imidazole-buffered saline, pH 7.0, plus 1% Triton X-100, 50 trypsin inhibitory units/ml aprotinin, 1 mM phenylmethylsulfonyl fluoride, 5 mM iodoacetamide, and 0.5 mM EDTA at 4 °C. After 30 min the lysate was collected and centrifuged at 13,000 × g in a microcentrifuge for 15 min.

Immunoprecipitation and Oligosaccharide Analysis

Lysates or dialyzed cell media samples were immunoprecipitated with 15 µl of 9E10 antibody coupled to CNBr-activated 4% beaded agarose. After rotating the antibody beads with the lysate or media sample, the beads were washed five times with 100 mM Tris, pH 8.0, 150 mM NaCl, 1% Triton X-100, 5 mM EDTA, 10 trypsin inhibitory units/ml aprotinin and once with 10 mM Tris, pH 6.8. The immunoprecipitated protein was eluted from the beads in 50 µl of 0.5% SDS, 10 mM beta-mercaptoethanol by heating to ≥80 °C for 10 min and rinsing the beads with 50 µl of water. Oligosaccharide analysis was carried out by a modification of standard methods (8, 19). The eluted proteins were treated with 10 milliunits of Endo H overnight at 37 °C. The released high mannose oligosaccharides, which include the phosphorylated molecules, were recovered in the flow-through of a Centricon 30. The retentate was then treated with 10 mg/ml Pronase or 10 milliunits/ml N-glycanase to generate glycopeptides or oligosaccharides, respectively, for analysis of complex-type oligosaccharides using concanavalin A-Sepharose and to determine the proportion of high mannose oligosaccharides released with Endo H. The Endo H-released oligosaccharides were treated with 20 mM HCl at 100 °C for 30 min to remove any GlcNAc residues still linked to phosphates and sialic acid from hybrid oligosaccharides. The samples were then analyzed by QAE-Sephadex chromatography to determine the degree of phosphorylation (19). The samples were loaded onto a column of QAE-Sephadex in 2 mM Tris with ≤5 mM total salts and buffers, and the flow-through containing neutral high mannose oligosaccharides was collected; the eluate with 70 mM NaCl contained high mannose oligosaccharides with one phosphate, and the eluate with 140 mM NaCl contained high mannose oligosaccharides with two phosphates.

The Endo H-resistant material that had been treated with Pronase or N-glycanase was loaded onto concanavalin A-Sepharose in Tris/CaCl2 buffer, and three fractions were collected: the flow-through containing tri- and tetraantennary complex oligosaccharides, the 10 mM alpha-methylglucoside eluate containing biantennary complex oligosaccharides, and the 100 mM alpha-methylmannoside eluate containing Endo H-resistant high mannose oligosaccharides. In some experiments the level of phosphorylation of the Endo H-resistant, N-glycanase-released high mannose oligosaccharides eluted from concanavalin A-Sepharose with 100 mM alpha-methylmannoside was determined. This material was found to be significantly underphosphorylated compared with the Endo H-released material. Therefore, no attempt was made to systematically compensate for Endo H-resistant high mannose oligosaccharides in estimating phosphorylation.

The percent phosphorylation was calculated as 100 × (cpm recovered in Endo H-released oligosaccharides with one or two phosphates) / [total cpm in Endo H-released oligosaccharides (phosphorylated plus neutral) + 2 × (cpm in Endo H-resistant complex oligosaccharides)]. The values for the complex oligosaccharides were multiplied by 2 to correct for the fact that they contain 3 mannose residues versus an average of 6 mannose residues per high mannose oligosaccharide. The ratio of oligosaccharides with two and one phosphate was determined directly from the QAE-Sephadex chromatography data.
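One reading of this calculation is sketched below: phosphorylated counts are divided by the sum of all Endo H-released counts plus twice the complex oligosaccharide counts. The function name and the cpm values in the example are hypothetical and serve only to show the arithmetic; they are not data from this study.

```python
def percent_phosphorylation(one_phosphate_cpm: float,
                            two_phosphate_cpm: float,
                            neutral_cpm: float,
                            complex_cpm: float,
                            complex_correction: float = 2.0) -> float:
    """Percent of oligosaccharides phosphorylated, from [3H]mannose counts.

    complex_correction scales the complex-type counts to account for their
    3 mannose residues versus an average of 6 per high mannose chain.
    """
    phosphorylated = one_phosphate_cpm + two_phosphate_cpm
    endo_h_released_total = phosphorylated + neutral_cpm
    denominator = endo_h_released_total + complex_correction * complex_cpm
    if denominator <= 0:
        raise ValueError("no counts recovered")
    return 100.0 * phosphorylated / denominator


if __name__ == "__main__":
    # Hypothetical counts: 300 cpm one-phosphate, 100 cpm two-phosphate,
    # 1600 cpm neutral high mannose, 1000 cpm complex-type.
    print(percent_phosphorylation(300, 100, 1600, 1000))
```

With these illustrative counts the denominator is 400 + 1600 + 2 × 1000 = 4000 cpm, so the result is 10%.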

In Vitro Phosphotransferase Assays

UDP-[alpha-^32P]GlcNAc was prepared as described previously (20). Partially purified rat liver phosphotransferase was prepared by sucrose gradient enrichment of Golgi membranes followed by removal of endogenous acceptors by low detergent extraction in the presence of 10 mM Man-6-P, detergent solubilization, and DEAE-cellulose chromatography (20). Typically, assays were performed in 10-µl reactions containing 2 mM ATP, 0.5-2.0 µg of acceptor, 5.0 units of phosphotransferase (25,000 units/mg), and 5 × 10^6 cpm of UDP-[alpha-^32P]GlcNAc at 5,000 cpm/pmol. After 2 h at 37 °C, the reactions were terminated by the addition of SDS-polyacrylamide gel electrophoresis sample buffer, and the samples were analyzed on a 10% reducing SDS-polyacrylamide gel. The gels were stained with Coomassie Brilliant Blue and destained extensively prior to autoradiography to remove free radioactivity. The Coomassie Blue-stained gels and autoradiographs were analyzed by a laser densitometer (Molecular Dynamics) to determine the amount of protein in the appropriate size bands and the amount of ^32P transferred. The autoradiography results were confirmed by direct counting of bands cut from the gel.
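The reaction conditions above imply a donor substrate amount and concentration that follow from the stated specific activity. The sketch below works through that arithmetic; the function name is hypothetical, and the only inputs are the figures quoted in the text.

```python
def reaction_summary(total_cpm: float, cpm_per_pmol: float, volume_ul: float,
                     enzyme_units: float, units_per_mg: float):
    """Return (donor pmol, donor concentration in µM, enzyme protein in µg)."""
    donor_pmol = total_cpm / cpm_per_pmol
    donor_um = donor_pmol / volume_ul  # 1 pmol/µl equals 1 µmol/liter
    enzyme_ug = enzyme_units / units_per_mg * 1000.0  # mg converted to µg
    return donor_pmol, donor_um, enzyme_ug


if __name__ == "__main__":
    # 5 x 10^6 cpm at 5,000 cpm/pmol in 10 µl; 5.0 units at 25,000 units/mg.
    pmol, micromolar, micrograms = reaction_summary(5e6, 5000, 10, 5.0, 25000)
    print(pmol, micromolar, micrograms)
```

On these figures each reaction contains 1,000 pmol of UDP-GlcNAc (100 µM in 10 µl) and about 0.2 µg of partially purified enzyme protein.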


RESULTS

Construction and Testing of a Membrane-anchored "Selection" Vector for Aspartyl Proteases

A membrane-anchored form of cathepsin D was generated to allow cathepsin D and cathepsin D/glycopepsinogen chimeras to be expressed at the cell surface. Plasma membrane expression facilitates phenotypic selection and keeps the products of a particular cathepsin cDNA construct tightly associated with the cell, thereby allowing rescue of cDNAs from cells carrying constructs with selected properties. The CD-MCD8 construct contains the entire coding sequence of cathepsin D, including the propiece, linked to the hinge, transmembrane, and cytoplasmic tail of the lymphocyte glycoprotein CD8 through an oligonucleotide cassette encoding the c-myc peptide recognized by the 9E10 monoclonal antibody (Fig. 3A). Both the cathepsin D and glycopepsinogen (CP71-MCD8) versions were highly expressed at the cell surface when transiently transfected into COS1 cells as determined by binding of phycoerythrin-tagged 9E10 and immunofluorescence flow cytometry (Fig. 3, A and B). The observed heterogeneity in the level of expression is characteristic of this transient expression system. Simultaneous binding of fluorescein-labeled Man-6-P/IGF-II receptor shows that the CD-MCD8 expressed on the surface of COS cells is phosphorylated, whereas no surface Man-6-P/IGF-II receptor binding was detected in COS cells transfected with the glycopepsinogen construct, CP71-MCD8. It is not clear how the phosphorylated form of CD-MCD8 evades targeting to lysosomes. One possible explanation is that the mannose 6-phosphate receptors become saturated due to the high level of expression of the construct. Alternatively, CD-MCD8 molecules delivered to endosomes by the receptor may cycle to the cell surface rather than be targeted to lysosomes.

Analysis of the amino lobe recognition determinants for phosphotransferase required selection of a cathepsin D/glycopepsinogen chimera in which the cathepsin D amino lobe contributes to the level of phosphorylation. Glycopepsinogen sequences could then be substituted into the amino lobe of this positive construct to determine the minimal cathepsin D sequences required for phosphorylation. CP70-MCD8 is a chimera with the propiece and amino lobe of cathepsin D and the carboxyl lobe of glycopepsinogen, except for the carboxyl-terminal 29 amino acids (Fig. 3C). Although a similar construct (CP1) is moderately well phosphorylated in frog oocytes (4), CP70-MCD8 did not bind fluorescent Man-6-P/IGF-II receptor when expressed in COS cells despite good surface expression (Fig. 3C), and a soluble form of this chimeric protein was only phosphorylated 0.5-1% (not shown). However, CP55-MCD8, which differs from CP70-MCD8 by having more cathepsin D carboxyl lobe sequence (residues 231-348), was phosphorylated at a level intermediate between CD-MCD8 and CP71-MCD8 (Fig. 3D). The additional cathepsin D carboxyl lobe sequences contain the beta-loop (residues 265-292) that is required for efficient phosphorylation (4). However, the presence of the cathepsin D beta-loop alone does not result in phosphorylation (4). These results show that neither the cathepsin D propiece and amino lobe nor the carboxyl lobe beta-loop (residues 265-292) contains sufficient information for recognition by phosphotransferase, but together these regions cooperate to confer good phosphorylation of the membrane-anchored form. Therefore, the construct selected to accept the amino lobe chimeric library contained glycopepsinogen sequence from amino acids 187-230 and cathepsin D sequence from amino acids 231-348. The regions to be intermixed in the chimeric library are between residue p1 of the propiece and amino acid 186.

A panning procedure was developed for selecting COS cells expressing phosphorylated chimeric molecules on their surface. Transfected COS cells were allowed to adhere to Petri plates containing attached Man-6-P/IGF-II receptor. The unbound cells were collected by gentle washing and the adherent cells eluted with Man-6-P. As shown in Fig. 4, cells expressing CD-MCD8 (panels A-C) and CP55-MCD8 (panels D-F) were readily selected by this method as indicated by the nearly complete depletion of 9E10-positive cells from the nonbound fraction. Cells expressing CP3-MCD8 (panels G-I) were less efficiently bound, as indicated by the failure to remove 9E10-positive cells from the nonbound population, and cells expressing CP71-MCD8 (panels J-L) did not bind at all. A variant of this assay involved attaching the 9E10 mAb to Petri plates, which allowed panning for cells expressing a chimeric protein on their surface regardless of phosphorylation (not shown). With these tools a large library of constructs could be screened for both surface expression (folding) and degree of phosphorylation.


Figure 4: Selection of COS cells transiently expressing membrane-anchored chimeric molecules on plates coated with Man-6-P/IGF-II receptor. Cells were stained with 9E10 mAb and FITC goat anti-mouse IgG before (input) or after selection on Man-6-P/IGF-II receptor plates (1 µg/cm^2). After selection the sample was split into nonbound and bound populations. The latter was then eluted from the plates with 10 mM Man-6-P. Ninety-eight percent of mock transfected cells gave less than 10 fluorescence units. Panels A-C, CD-MCD8; panels D-F, CP55-MCD8; panels G-I, CP3-MCD8; panels J-L, CP71-MCD8.



Construction of the Chimeric Library

A strategy for building a cathepsin D/glycopepsinogen chimeric amino lobe from oligonucleotide cassettes was developed (Fig. 2). The process was planned using the following parameters. 1) Each cassette was joined to the next in a region of primary sequence identity or similarity between cathepsin D and glycopepsinogen. 2) Each cassette spanned a region of primary sequence divergence which might contain elements determining functional differences. 3) During assembly no more than four ends were joined in any one ligation step. 4) After an initial round of subcloning the synthetic DNA cassettes into plasmid vectors, each possible cathepsin D/glycopepsinogen chimeric segment was subjected to DNA sequencing, and verified copies were used to generate restriction fragments for subsequent subcloning steps. This approach allowed the generation of 13 oligonucleotide cassettes spanning 150 amino acids in the primary structure of cathepsin D (Fig. 1). A number of changes in the glycopepsinogen synthetic cDNA fragments were required to allow coassembly with the cathepsin D synthetic cDNAs (Fig. 1). These changes, which were either conservative (two changes) or involved changes to the cathepsin D residue (eight changes), are discussed under "Experimental Procedures." The propieces plus the highly conserved first 36 residues of the mature proteins were introduced as restriction fragments during the ordered subcloning strategy (Fig. 2). Thus these constructs were designed to contain 14 regions in the amino lobe, each of which would have a 0.5 probability of being from cathepsin D or glycopepsinogen. This strategy produces 16,384 potential chimeric constructs. After the initial round of subcloning each combination of cathepsin D and glycopepsinogen regions was cloned and sequenced directly. The number of colonies at the intermediate subcloning steps was sufficient to generate all possible combinations (15). The final subcloning step resulted in the introduction of the shuffle library into the vector AprM8 for expression in COS cells and rescue in E. coli. A final library with 10^7 independent transformants was generated, and plasmid DNA from these colonies was used for COS cell transfection.

The quality of the library was preliminarily assessed by analyzing 22 colonies randomly picked from the initial plating. Hybridization sequencing established that each region was represented at close to a 50:50 ratio (cathepsin D/glycopepsinogen) except region 2, which was 15% cathepsin D (Table 1). The finding of a few inserts with hybridization signals for both cathepsin D and glycopepsinogen sequence at a single region suggests that some ``triplet'' ligation events (head-tail/tail-head/head-tail) occurred during the generation of the library. Binding of the 9E10 mAb and the Man-6-P/IGF-II receptor to COS cells transfected with the individual plasmids showed that 50% of the plasmids encoded chimeric proteins that were expressed at the cell surface and phosphorylated to various extents (data not shown). This indicated that the library was of adequate complexity (5 × 10^6 expressible constructs) and that constructs with variable phosphorylation levels were present in the library. When transfected COS cells were permeabilized and analyzed by immunofluorescence with the 9E10 antibody, some of the cells with low surface expression showed endoplasmic reticulum staining, suggesting that particular combinations of segments resulted in chimeric proteins that fail to fold well enough to exit this compartment (data not shown).
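The library arithmetic quoted above can be checked with a short sketch. The function and variable names are illustrative and do not come from the paper. The inputs (14 two-way regions, 10^7 independent transformants, about 50% surface-expressed constructs) are the figures given in the text; applying the 50% estimate from individually tested plasmids to the whole library is an assumption.

```python
# Sketch of the library-size arithmetic; all names are illustrative.
N_REGIONS = 14          # propiece fragment plus 13 oligonucleotide cassettes
SOURCES_PER_REGION = 2  # each region is from cathepsin D or glycopepsinogen


def potential_constructs(n_regions=N_REGIONS, choices=SOURCES_PER_REGION):
    """Number of distinct chimeras when every region has `choices` sources."""
    return choices ** n_regions


def expressible_transformants(transformants, expressed_fraction):
    """Transformants expected to encode surface-expressed proteins."""
    return transformants * expressed_fraction


if __name__ == "__main__":
    total = potential_constructs()
    print(total)                                 # 16384
    print(expressible_transformants(1e7, 0.5))   # 5000000.0
    # Average number of independent transformants per potential construct.
    print(round(1e7 / total))                    # 610
```

Under these assumptions the library holds several hundred independent transformants per potential construct, in line with the statement that its complexity was adequate.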



Selection of Individual Constructs

Three parallel rounds of selection on either 9E10 or Man-6-P/IGF-II receptor plates were used to generate two sublibraries from which individual colonies were picked for further analysis. In the first round a relatively high concentration of DNA (10 µg) was used for transfection to allow a survey of the entire library in approximately 5 × 10^7 COS cells. The second and third rounds used very low DNA concentrations (10 ng) in the transfection to reduce the number of plasmids introduced into each transfected cell toward 1 (21). After the third round the plasmids from the pooled colonies were cut with SfiI, liberally size selected to take only the size range corresponding to complete, nonrearranged plasmids, religated under dilute conditions, and transformed into E. coli. After size selection ≥95% of the picked colonies yielded plasmids of the correct size which encoded surface expressed constructs as confirmed by 9E10 binding (data not shown). Without the size selection, many of the plasmids after the third round corresponded to lower molecular weight recombinants lacking the cathepsin D cDNA insert.

Sequencing of 64 constructs randomly selected from the Man-6-P/IGF-II receptor binding sublibrary (32 constructs) and the 9E10 binding sublibrary (32 constructs) revealed a striking overrepresentation of cathepsin D regions 1 (+34%), 7 (+41%) and 12 (+28%) and an underrepresentation of cathepsin D region 9 (-42%) (Table 2). It is likely that these deviations from normal representation following selection are due to folding constraints, although it is not clear for any region whether an enriched segment was selected for or the depleted segment selected against. Fortunately, none of the enriched regions was absolutely required for secretion of soluble constructs. It was therefore possible to measure the phosphorylation levels in constructs lacking each enriched region and to determine the contribution of the enriched segments to phosphotransferase recognition in apparently well folded molecules (see below).



The conclusion from the analysis of the expression and phosphorylation of the membrane-anchored chimeric proteins was that both the Man-6-P/IGF-II receptor and 9E10 antibody selections were similarly effective in enriching for surface-expressed molecules with variable levels of phosphorylation. Further quantitative analysis of phosphorylation was performed on soluble versions of selected constructs in which we purposefully skewed the analysis toward constructs with relatively low contents of cathepsin D segments. Constructs encoding soluble proteins were used for the final analysis to allow comparison with earlier results.

Analysis of Soluble Forms of the Chimeric Proteins

The level of phosphorylation of a number of soluble chimeric proteins secreted from [2-^3H]mannose-labeled COS cells in the presence of a weak base was measured. The secreted molecules were immunoprecipitated and their [^3H]mannose-labeled oligosaccharides analyzed for the level of phosphorylation by ion exchange chromatography on QAE-Sephadex as described previously (8). The construct with the fewest cathepsin D amino lobe regions obtained from the selection (S132) had regions 7 and 12 only. Constructs with regions 1 (S8192), 7 (S128), or 12 (S4) alone were generated by subcloning. Constructs S128 and S4 encoded soluble chimeric proteins that were well secreted from COS cells, whereas S8192 encoded a protein that did not leave the ER and was degraded intracellularly. Thus, although the propiece of cathepsin D (region 1) does not rescue expression of S0 on its own, as observed with cathepsin D regions 7 or 12, it must improve surface expression of constructs containing cathepsin D regions 7 or 12 to account for its overrepresentation in the selected sublibraries.

Sequences of chimeric proteins showing the greatest increase in phosphorylation with the addition of the least number of cathepsin D regions are compared in Fig. 5. Although the combination of segments 7 and 12 results in a small increase in phosphorylation over either alone, larger gains in total phosphorylation and in the fraction of oligosaccharides with two phosphates are made by including segments 1, 5, 8, 10, and 13 in different combinations. The complexity of these results led us to seek some simplification by examining whether the two oligosaccharides of the chimeric proteins might be phosphorylated differentially based on inclusion of different regions of the amino lobe of cathepsin D. Mutations eliminating one or the other of the glycosylation sites were introduced by subcloning or cassette mutagenesis (22). Construct S206, which contains cathepsin D segments 7, 8, 11, 12, and 13, showed efficient phosphorylation of the carboxyl lobe oligosaccharide at position 199 (CHO 199) but almost no phosphorylation of the amino lobe oligosaccharide at position 70 (CHO 70) (Fig. 6). Also shown in Fig. 6 are the results obtained with S206 containing both oligosaccharides. The extent of total oligosaccharide phosphorylation and the ratio of oligosaccharides containing one phosphate to those containing two phosphates were the same for the average of the two constructs containing a single oligosaccharide and S206 containing both oligosaccharides. Similar results were obtained with S132 (cathepsin D segments 7 and 12) and with S198 (cathepsin D segments 7, 8, 12, and 13). In contrast, construct S8782, which incorporates additional cathepsin D regions (segments 1, 5, 8, 11, 12, and 13), showed significant phosphorylation of both CHO 199 and CHO 70 (Fig. 6). Thus, regions 1 and 5, which are present in S8782 but not in S206, appear to contain a determinant for efficient phosphorylation of CHO 70.
Of note, however, is the finding that CHO 70 rarely acquires two phosphates, whereas CHO 199 receives two phosphates about 50% of the time.
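The construct numbers quoted in this section are consistent with a simple binary code in which cathepsin D region k (k = 1 to 14) contributes 2^(14-k) to the number, so that S0 would carry only glycopepsinogen regions in the amino lobe. This reading is inferred from the construct/segment pairs listed above and is not stated in this section. The sketch below, with hypothetical function names, decodes a construct number under that assumption and checks it against the constructs described so far.

```python
# Assumed encoding (inferred, not stated in the text): region k sets bit 14 - k.
N_REGIONS = 14


def regions_from_number(number):
    """Return the 1-based cathepsin D regions encoded by a construct number."""
    return [k for k in range(1, N_REGIONS + 1)
            if number & (1 << (N_REGIONS - k))]


def number_from_regions(regions):
    """Return the construct number for a collection of cathepsin D regions."""
    return sum(1 << (N_REGIONS - k) for k in set(regions))


# Construct/segment pairs as described in the text above.
DESCRIBED = {
    "S4": [12],
    "S128": [7],
    "S8192": [1],
    "S132": [7, 12],
    "S198": [7, 8, 12, 13],
    "S206": [7, 8, 11, 12, 13],
    "S8782": [1, 5, 8, 11, 12, 13],
}

if __name__ == "__main__":
    for name, regions in DESCRIBED.items():
        decoded = regions_from_number(int(name[1:]))
        print(name, decoded, decoded == regions)
```

If the inferred code holds, the regions present in any other construct can be read directly from its number.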


Figure 5: Phosphorylation of soluble chimeric proteins combining cathepsin D residues 231-348 with regions from the amino lobe of cathepsin D. COS cells transiently expressing soluble chimeric proteins bearing both Asn-linked oligosaccharides were labeled with [2-^3H]mannose, and the myc tagged chimeric proteins were isolated from the media. The labeled oligosaccharides were analyzed to determine percentage of phosphorylated Asn-linked oligosaccharides (%P) and the ratio of oligosaccharides with two and one phosphate (2:1). Each chimeric protein was analyzed at least three times, and standard deviations of percent phosphorylation are given.




Figure 6: Distribution of phosphorylation between the two oligosaccharides in two chimeric proteins. Chimeric proteins S206 and S8782 were mutated to delete one or the other Asn-linked glycosylation sites. Chimeric proteins with both oligosaccharides, only CHO 70, or only CHO 199 were analyzed for percent phosphorylation as described in the legend to Fig. 5. The total phosphorylation of the chimeric protein with both oligosaccharides should be the sum of the phosphorylation of the two individual oligosaccharides divided by 2. This calculation was performed using the percent phosphorylation from chimeric proteins with one or the other oligosaccharide in the column labeled (70 + 199)/2.
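The consistency check described in this legend is a simple average. The sketch below follows the legend's premise that the two sites contribute equally to the labeled oligosaccharide pool. The function name is hypothetical, and the input values are placeholders chosen only to show the calculation; they are not data from Fig. 6.

```python
def expected_total_phosphorylation(pct_cho70, pct_cho199):
    """Expected %P for a protein bearing both sites: (70 + 199)/2."""
    return (pct_cho70 + pct_cho199) / 2.0


if __name__ == "__main__":
    # Placeholder inputs, not measured values.
    print(expected_total_phosphorylation(10.0, 30.0))  # 20.0
```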



A panel of 18 soluble chimeric constructs bearing only CHO 70 and containing different permutations of the cathepsin D regions found in S8782 were assayed directly for phosphorylation of CHO 70 to test the hypothesis that cathepsin D regions 1 and 5 are required for phosphorylation of this oligosaccharide (Fig. 7). A very strong correlation between efficient CHO 70 phosphorylation and the presence of both regions 1 and 5 of cathepsin D was established. Region 1 contains the entire propiece of cathepsin D plus 36 amino acids of the mature protein, whereas region 5 (amino acids 63-81) is a component of the beta-flap 68-88, which contains CHO 70 and overlaps a portion of the propiece. Examination of Fig. 7 shows that neither of the highly enriched cathepsin D segments 7 or 12 is essential for efficient phosphorylation of CHO 70 since S8840, which lacks region 12, and S8782, which lacks region 7, are both relatively well phosphorylated on CHO 70.


Figure 7: Phosphorylation of CHO 70 in soluble chimeric proteins combining cathepsin D residues 231-348 with regions from the amino lobe of cathepsin D. Chimeric proteins bearing only CHO 70 were expressed in COS cells, and phosphorylation was analyzed as indicated in the legend to Fig. 5. Since the proportion of oligosaccharides with two phosphates is negligible at CHO 70 this column was omitted. In cases in which the experiment was performed three or more times the standard deviation is reported; in other cases the percent phosphorylation (%P) is an average of two values. S206 was analyzed four times and was always less than 1%.



Site-directed Mutagenesis of Regions 1 and 5

A number of charged residues in regions 1 and 5 were mutated to alanine, and the resultant constructs were tested for effects on phosphorylation of CHO 70. As shown in Fig. 8, mutation at two of these sites (Lys at position 34 in the propiece (Lys-p34) and His-77 in the tip of the beta-flap) decreased phosphorylation of CHO 70 from 12.6% (in S8840) to 3.3 and 4.8%, respectively. His-77 could be changed to Arg without loss of activity, but conversion to Lys resulted in a partial decrease in phosphorylation compared with the Ala mutation.
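The size of these effects relative to the parent construct can be expressed as the fraction of S8840 phosphorylation that remains after each mutation. The sketch uses only the three percentages quoted above; the function name is illustrative.

```python
def percent_remaining(mutant_pct, parent_pct):
    """Mutant CHO 70 phosphorylation as a percentage of the parent level."""
    return 100.0 * mutant_pct / parent_pct


PARENT_S8840 = 12.6  # % phosphorylation of CHO 70 in S8840
MUTANTS = {"Lys-p34 to Ala": 3.3, "His-77 to Ala": 4.8}

if __name__ == "__main__":
    for name, pct in MUTANTS.items():
        remaining = percent_remaining(pct, PARENT_S8840)
        print(f"{name}: {remaining:.0f}% of the S8840 level")
```

By this measure the two alanine substitutions leave roughly one quarter and somewhat over one third of the parent level, respectively.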


Figure 8: Site-directed mutagenesis to identify residues in construct S8840 required for phosphorylation of CHO 70. The phosphorylation of S8840 containing only CHO 70 was determined as in the legend to Fig. 5. Standard deviations are based on at least three determinations for each mutated form.



Mutation of Lys 58 to Ala had no effect on CHO 70 phosphorylation in S8840. Thus this residue, which is located in the junction between regions 3 and 4, does not appear to have a significant role in the phosphorylation of CHO 70. Similarly, when Lys-p8, GluAsp-p24-25, Lys-p29, Glu-p44, Lys-69, and Asp-75 were individually changed to alanines, there was no effect on the level of phosphorylation of CHO 70.

Phosphorylation of Chimeric Proteins by Phosphotransferase in Vitro

To establish that the observed differences in phosphorylation of CHO 70 among the constructs were due to interactions with phosphotransferase and were not the result of variability in the rate of transport of the various chimeric proteins through the Golgi, we tested several of the proteins for their ability to serve as substrates for rat liver phosphotransferase in an in vitro assay. Microgram quantities of S8840 (cathepsin D segments 1, 5, 7, and 11) and S198 (cathepsin D segments 7, 8, 12, and 13), each bearing only CHO 70, were purified by anti-myc monoclonal antibody affinity chromatography under conditions maintaining the presence of the propiece. These constructs were expressed with carboxyl-terminal KDEL extensions so that they would be retained in the ER (11). As a consequence, the constructs contained a large proportion of nonphosphorylated high mannose oligosaccharides with 6-8 mannose residues, which are good substrates for phosphotransferase. In contrast, secreted chimeric molecules contain a mixture of phosphorylated high mannose oligosaccharides and complex-type oligosaccharides and therefore are poorer substrates for phosphotransferase. As shown in Fig. 9, the rate of transfer of GlcNAc-P to CHO 70 of S8840 (18 fmol/h/µg) was 6-fold greater than the rate of transfer to this oligosaccharide on S198 (3 fmol/h/µg). The inset compares the phosphorylation of S198 containing both CHO 70 and 199 to S198 containing only CHO 70. It is apparent that the oligosaccharide at position 199 is phosphorylated much better than the oligosaccharide at position 70. These results are consistent with the conclusion that the combination of cathepsin D regions 1 and 5 directs phosphorylation of CHO 70 by phosphotransferase.


Figure 9: In vitro phosphorylation of soluble chimeric proteins purified from COS cells. Transfer of [P]GlcNAc from UDP[alpha-P]GlcNAc to Asn-linked glycans of chimeric proteins with both oligosaccharides or only CHO 70 was determined as described under ``Experimental Procedures.'' , S8840 with CHO 70; circle, S198 with CHO 70. Inset: bullet, S198 with CHO 70 and CHO 199; circle, S198 with CHO 70.




DISCUSSION

The data presented in this paper demonstrate that the amino lobe of cathepsin D contains two types of elements that influence oligosaccharide phosphorylation. One class of elements promoted progression through the secretory pathway apparently by maintaining compatibility of the two lobes of the chimeric bilobed aspartyl protease and thereby facilitating proper folding of the chimeric protein. This same set of elements in concert with a portion of the previously described carboxyl lobe phosphotransferase recognition marker also had a positive effect on phosphorylation of the oligosaccharide located at position 199 (CHO 199). A distinct combination of cathepsin D regions in the amino lobe, when combined with the carboxyl lobe recognition element, was found to be required for phosphorylation of the oligosaccharide located at position 70 (CHO 70). The chimeric constructs that led to these findings were obtained through a novel combinatorial mutagenesis strategy (shuffle mutagenesis) in which a propiece restriction fragment and oligonucleotide cassettes corresponding to 13 regions of the cathepsin D and glycopepsinogen amino lobes were shuffled together randomly to generate a large library of chimeric molecules.

The shuffle mutagenesis approach provided two advantages over previous attacks on the same problem with conventional chimeric mutagenesis (7). The most immediate outcome of the initial selection was the generation of a sublibrary of chimeric molecules that were well folded based on high level expression at the cell surface. The heterogeneity of expression in COS cells and the unexpectedly high basal activity of the well folded carboxyl lobe beta-loop for phosphorylation of CHO 199 (construct S132) prevented us from using the selected constructs to identify directly determinants for phosphorylation mediated by recognition elements in the amino lobe. However, further analysis allowed resolution of the cathepsin D regions required for phosphorylation of CHO 70 from the cathepsin D regions required for efficient folding. Therefore, the shuffle mutagenesis approach was successful in overcoming problems that can plague mutagenic approaches: resolution of folding effects from specific recognition phenomena and reconstructing complex determinants involving widely spaced regions in the primary sequence which are brought together through protein folding.

The initial screening of the membrane-anchored chimeric library by expression in COS cells and selection with Man-6-P/IGF-II receptor or an antibody to a myc peptide epitope incorporated into the construct enriched for chimeric proteins that localized efficiently at the cell surface. The sequencing of these constructs identified two cathepsin D regions, segments 7 and 12, which rescue the expression of the basal shuffle construct with all glycopepsinogen segments in the amino lobe and cathepsin D sequence after amino acid 230 of the carboxyl lobe. Aspartyl proteases are bilobed proteins in which the lobes are thought to have significant rigid body behavior in flexing during binding to substrates and inhibitors (23). Therefore, the interactions between the lobes are limited. Segment 12 is part of a beta-pleated sheet that is continuous through both the amino and carboxyl lobes, and this region contacts a strand from the carboxyl-terminal region which is contributed by cathepsin D in all the chimeras analyzed here. The requirement to contact a cathepsin D segment from the amino lobe probably accounts for the ability of cathepsin D region 12 to correct folding problems that make constructs such as CP3 and S0 unable to escape the ER. Segment 7 is a short strand that runs through the core of the amino lobe, and although contacting no portion of the carboxyl lobe directly, it may influence the stability of the aforementioned beta-sheet through contacts in regions 9, 10, and 11 of the amino lobe, which in turn contact the beta-sheet. The stability of this beta-sheet may be particularly important for the membrane-anchored constructs since the carboxyl-terminal 20 amino acids of cathepsin D contribute the carboxyl lobe portion of the beta-sheet, and the attachment of the substantial carboxyl-terminal extension may place additional strain on this structure.

The identification of specific cathepsin D regions that influence phosphorylation required consideration of each oligosaccharide as an individual acceptor for phosphotransferase. Combinations of cathepsin D segments 7, 12, and neighboring regions resulted in significant increases in total phosphorylation, but virtually all of this phosphate was added to CHO 199 based on the low phosphorylation of chimeric proteins bearing only CHO 70. It is possible that the effect of these regions on the overall stability of the chimeric protein and the specific manner in which beta-loop 265-292 is presented may result in the enhanced phosphorylation of CHO 199.

A panel of constructs with CHO 70 and cathepsin D segments 7 or 12 plus other cathepsin D amino lobe sequences was tested, and a subset containing both regions 1 and 5 from cathepsin D was found to have significantly better phosphorylation than the constructs lacking one or both of these segments. Apparently elements present in segment 1 (propiece plus 36 amino acids of mature protein) and segment 5 (a component of the beta-flap 68-88), in combination with the beta-loop 265-292 element, serve to direct phosphotransferase to CHO 70. It should be noted that constructs with efficient phosphorylation of CHO 70 retain efficient phosphorylation of CHO 199 when it is present, such that an improvement in CHO 70 phosphorylation does not appear to occur at the expense of CHO 199 phosphorylation. In other words, phosphorylation of either oligosaccharide is independent of the phosphorylation of the other oligosaccharide. Also, the cathepsin D sequences in the carboxyl lobe are essential for phosphorylation of CHO 70 since replacement of cathepsin D amino acids 231-320 with glycopepsinogen sequence did not impair the ability of the chimeric protein to fold and be secreted but reduced phosphorylation to 0.5-1% (soluble form of CP70-MCD8).

Within cathepsin D segments 1 and 5, Lys-34 of the propiece (Lys-p34) and His-77 were found to be important for phosphorylation of CHO 70. In the cathepsin D structure, Lys-p34 is coordinated by the active site aspartic acids, suggesting that this residue acts through a conformational effect since it would not be available for direct interaction with phosphotransferase (5, 24). However, constructs with this mutation fold correctly based on their ability to leave the ER, bind to pepstatin-agarose in a pH-dependent manner (4), and are properly phosphorylated at CHO 199 (not shown). His-77, on the other hand, is exposed to solvent in the mature cathepsin D crystal structure and is predicted to be exposed to solvent in a procathepsin D model based on the pepsinogen and cathepsin D crystal structures.(^2) Thus, His-77 is a good candidate to be a contact residue involved in promotion of CHO 70 phosphorylation. Interestingly, His-77 is very close to the region of the propiece that would be directly perturbed by the Lys-p34 mutation, suggesting that noncharged residues in this region of the propiece may have to be presented in a specific conformation to cooperate with His-77.

The identification of a histidine as a participant in phosphorylation of the amino lobe oligosaccharide is of interest since the ionization state of histidine changes in the pH range encountered by lysosomal enzyme precursors as they traverse the secretory pathway. The exact pH of compartments containing phosphotransferase is not known, but there is a general opinion that a pH gradient exists between the ER (neutral) and trans-Golgi (mildly acidic). Since arginine replaces histidine at this position, it is likely that the charged form of histidine is the active form and that this form may be favored by acidification.

The mechanism by which the amino lobe propiece and beta-flap regions of cathepsin D direct enhanced phosphorylation of CHO 70 in concert with the distant beta-loop 265-292 in the carboxyl lobe is an open question. Models that would address this question must account for a number of observations. First, the amino lobe determinants do not function autonomously but must cooperate with the carboxyl lobe beta-loop element when expressed in mammalian cells. The carboxyl lobe beta-loop element, on the other hand, does appear to function autonomously when matched with a compatible amino lobe which may provide a more appropriate conformation. CHO 70 only receives one phosphate even in intact cathepsin D, suggesting that there is a fundamental difference in its accessibility to phosphotransferase compared with CHO 199, which usually receives two phosphates in intact cathepsin D and many chimeric molecules (7).

Three models could account for the ability of surface determinants of the amino lobe to direct phosphotransferase to CHO 70 and also may be applied to the more general problem of phosphorylation of multiple oligosaccharides by phosphotransferase. All three models hold that the most important phosphotransferase contact is a properly presented carboxyl lobe beta-loop, whereas the elements in the amino lobe can act in one of three ways: 1) directly contact phosphotransferase simultaneously with the carboxyl lobe beta-loop (bivalent model); 2) directly contact phosphotransferase as an exclusive contact but only after the lysosomal enzyme is concentrated in the vicinity of phosphotransferase through interaction with the carboxyl lobe beta-loop (rebinding model); or 3) influence the orientation in space of CHO 70 such that it is either favorably (or unfavorably) oriented to allow access to the phosphotransferase catalytic site (oligosaccharide conformation model). In this third case the contacts with phosphotransferase are mediated exclusively by the carboxyl lobe elements.

The first two models, which hold that the amino lobe determinants are directly involved in binding to phosphotransferase, are supported by the observation that in frog oocytes elements in the amino lobe of cathepsin D can function independently of elements in the carboxyl lobe to direct efficient phosphorylation of CHO 70 (8). This result suggests the presence of a distinct recognition site for phosphotransferase in the amino lobe. The reason that the carboxyl lobe elements are essential for phosphorylation in mammalian cells is not clear.

The bivalent model proposes that phosphotransferase contacts the lysosomal enzyme at multiple sites simultaneously, with an increase in avidity due to suppression of dissociation. To account for the observations, the bivalent binding must result in a different orientation of CHO 70 compared with CHO 199 such that the second contact is required to allow phosphorylation of CHO 70. The concept that phosphotransferase could be large enough to contact both sides of cathepsin D simultaneously is plausible because there is evidence that phosphotransferase is a large multimeric protein (25). This hierarchic model of the strong carboxyl lobe and weak amino lobe binding site interacting with possibly similar surfaces on a receptor molecule (phosphotransferase) is reminiscent of the interaction of growth hormone with its homodimeric receptor in which the homologous surfaces of two identical receptor subunits contact very different surfaces on the hormone molecule with very different contact areas and binding affinities (26).

The rebinding model is based on the concept that phenomena such as receptor or binding site clusters can rebind dissociating ligand molecules with enhanced efficiency (27). In detail this model suggests that the independently functioning cathepsin D carboxyl lobe binding site is bound by a phosphotransferase cluster allowing phosphorylation of the nearby CHO 199. The cathepsin D then dissociates and undergoes rotational and short range translational diffusion in the vicinity of the phosphotransferase cluster with a high probability of rebinding through the weak amino lobe site allowing phosphorylation of CHO 70. Another way to view this is that the interaction with the carboxyl lobe site increases the local concentration of the amino lobe site to a point where it can function. In the absence of a carboxyl lobe site, as occurs in the CP1 construct, the local concentration of the amino lobe near phosphotransferase clusters does not occur, and the phosphorylation of CHO 70 does not occur. The strong point of this model is that the binding is never multivalent so there is no expectation of long occupancy on phosphotransferase, and the relative positioning of the multiple sites required to phosphorylate multiple Asn-linked glycans is not as constrained since phosphotransferase never has to coordinate contacting two sites simultaneously.

The oligosaccharide conformation model proposes that the amino lobe determinants do not contact phosphotransferase directly but participate in an intramolecular binding of CHO 70 to position it for optimal access to the phosphotransferase catalytic site when phosphotransferase is bound to the carboxyl lobe recognition marker. Asn-linked oligosaccharides are long flexible appendages on globular proteins which essentially occupy a large ``cloud'' around the site of anchorage (28, 29). Restraining an oligosaccharide or even a single branch such that it would be limited to a subregion of this cloud closest to the phosphotransferase catalytic site in the bound proenzyme could greatly enhance the chances of phosphorylation. CHO 70 is ordered in the crystal structure of mature cathepsin D. In fact, it has been suggested that the phosphate on CHO 70 may coordinate with Lys-203 (5, 24). The fact that a phosphorylated branch of CHO 70 can reach Lys-203 suggests that CHO 70 can be extended to reach the vicinity of CHO 199. Although the beta-flap is not in the path of CHO 70 as it reaches from Asn-70 to Lys-203, the conformation of CHO 70 may change substantially after propiece cleavage and phosphorylation itself.

The new regions of the amino lobe identified here, whether they work by intermolecular interaction with phosphotransferase or intramolecular interaction with CHO 70, are specific recognition sites in that they support a clear biological goal in ensuring that cathepsin D has the minimum number of Man-6-P groups required for efficient targeting. The evolutionary pressure for ensuring that multiple oligosaccharides contain Man-6-P is likely to arise from the nature of the two Man-6-P receptors, each of which contains two low affinity binding sites for Man-6-P which allow multivalent binding with a higher effective avidity (3). The high avidity binding enhances the efficiency of the targeting of the acid hydrolases to lysosomes. Cathepsin D ensures this multivalency with 2-fold redundancy. CHO 199 usually carries two phosphates providing one chance to generate a high affinity ligand, and the combination of any single phosphate on CHO 199 with the single phosphate on CHO 70, which is generated under the direction of the amino lobe determinants defined in this paper, provides a second chance to generate a high affinity ligand allowing targeting to lysosomes.


FOOTNOTES

*
This investigation was supported in part by United States Public Health Service Grant CA 08759 and by National Institutes of Health Research Award GM-07200. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked ``advertisement'' in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

§
Fellow of the Jane Coffin Childs Fund. Present address: Dept. of Pathology, Jewish Hospital of St. Louis and Washington University School of Medicine, St. Louis, MO 63110.

¶
Medical Scientist from the National Institute of General Medical Sciences. Current address: University of California, School of Medicine, San Francisco, CA.

**
To whom correspondence should be addressed: Dept. of Medicine, Washington University School of Medicine, Box 8125, 660 S. Euclid Ave., St. Louis, MO 63110. Fax: 314-362-8826.

(^1)
The abbreviations used are: phosphotransferase, N-acetylglucosamine 1-phosphotransferase; CD-MCD8, membrane-anchored cathepsin D; IGF-II, insulin-like growth factor II; FITC, fluorescein isothiocyanate; mAb, monoclonal antibody; CP, chimeric protein; PBS, phosphate-buffered saline; Endo H, endo-beta-N-acetylglucosaminidase H; ER, endoplasmic reticulum.

(^2)
G. Koelsch and M. Fusek, personal communication.


ACKNOWLEDGEMENTS

We thank the Protein Chemistry facility at Washington University School of Medicine for timely synthesis of many oligonucleotides. We thank Hugh Pelham, Lloyd Klickstein, Paula Kavathas, Alan Cantor, and Bill Canfield for plasmids and advice. We thank Gerald Koelsch for advice on the disposition of critical residues in the cathepsin D and pepsinogen crystal structures. We also thank Ronald Germain for pointing out the work on effects of surface receptor clustering on rebinding of ligands.


REFERENCES

  1. von Figura, K., and Hasilik, A. (1986) Annu. Rev. Biochem. 55, 167-193
  2. Kornfeld, S., and Mellman, I. (1989) Annu. Rev. Cell Biol. 5, 483-525
  3. Kornfeld, S. (1992) Annu. Rev. Biochem. 61, 307-330
  4. Baranski, T. J., Faust, P. L., and Kornfeld, S. (1990) Cell 63, 281-291
  5. Metcalf, P., and Fusek, M. (1993) EMBO J. 12, 1293-1302
  6. Hartsuck, J. A., Koelsch, G., and Remington, S. J. (1992) Proteins Struct. Funct. Genet. 13, 1-25
  7. Baranski, T. J., Cantor, A. B., and Kornfeld, S. (1992) J. Biol. Chem. 267, 23342-23348
  8. Cantor, A. B., Baranski, T. J., and Kornfeld, S. (1992) J. Biol. Chem. 267, 23349-23356
  9. Gluzman, Y. (1981) Cell 23, 175-182
  10. Evans, G. I., and Bishop, J. M. (1985) Mol. Cell. Biol. 4, 2843-2850
  11. Pelham, H. R. B. (1988) EMBO J. 7, 913-918
  12. Margolskee, R. F., Kavathas, P., and Berg, P. (1988) Mol. Cell. Biol. 8, 2837-2847
  13. Peterson, A., and Seed, B. (1987) Nature 329, 842-846
  14. Faust, P. L., Kornfeld, S., and Chirgwin, J. M. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 4910-4914
  15. Maniatis, T., Fritsch, E. F., and Sambrook, J. (1987) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
  16. Hirt, B. (1967) J. Mol. Biol. 26, 365-369
  17. Dustin, M. L., Olive, D., and Springer, T. A. (1989) J. Exp. Med. 169, 503-517
  18. Bradford, M. M. (1976) Anal. Biochem. 72, 248-254
  19. Varki, A., and Kornfeld, S. (1983) J. Biol. Chem. 258, 2808-2818
  20. Reitman, M. L., Lang, L., and Kornfeld, S. (1984) Methods Enzymol. 107, 163-173
  21. Munro, S., and Maniatis, T. (1989) Proc. Natl. Acad. Sci. U. S. A. 86, 9248-9252
  22. Cantor, A. B., and Kornfeld, S. (1992) J. Biol. Chem. 267, 23357-23363
  23. Sali, A., Veerapandian, B., Cooper, J. B., Moss, D. S., Hofmann, T., and Blundell, T. L. (1992) Proteins Struct. Funct. Genet. 12, 158-170
  24. Baldwin, E. T., Bhat, T. N., Gulnik, S., Hosur, M. V., Sowder, R. C., Cachau, R. E., Collins, J., Silva, A. M., and Erickson, J. W. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 6796-6800
  25. Ketchem, C. M., and Kornfeld, S. (1992) J. Biol. Chem. 267, 11645-11653
  26. de Vos, A. M., Ultsch, M., and Kossiakoff, A. A. (1992) Science 255, 306-312
  27. Posner, R. G., Lee, B., Conrad, D. H., Holowka, D., Baird, B., and Goldstein, B. (1992) Biochemistry 31, 5350-5356
  28. Montreuil, J. (1980) Adv. Carbohydr. Chem. Biochem. 37, 157-223
  29. Rademacher, T. W., Parekh, R. B., and Dwek, R. A. (1988) Annu. Rev. Biochem. 57, 785-838