(Received for publication, October 7, 1994; and in revised form, October 27, 1994)
A novel combinatorial mutagenesis strategy (shuffle mutagenesis) was developed to identify sequences in the propiece and amino lobe of cathepsin D which direct oligosaccharide phosphorylation by UDP-GlcNAc:lysosomal enzyme N-acetylglucosamine 1-phosphotransferase. Propiece restriction fragments and oligonucleotide cassettes corresponding to 13 regions of the cathepsin D and glycopepsinogen amino lobes were randomly shuffled together to generate a large library of chimeric molecules. The library was inserted into an expression vector encoding the carboxyl lobe of cathepsin D with a carboxyl-terminal myc epitope and a CD8 transmembrane extension. Transfected COS1 cells expressing the membrane-anchored forms of the cathepsin D/glycopepsinogen chimeras at the cell surface were selected with solid phase mannose 6-phosphate receptor or an antibody to the myc epitope. Plasmids were rescued in Escherichia coli and sequenced by hybridization to the original oligonucleotide cassettes. Two regions of the cathepsin D amino lobe (segments 7 and 12) were found to contribute to proper folding, surface expression, and selective phosphorylation of the carboxyl lobe oligosaccharide. Two different cathepsin D regions (the propiece and segment 5) cooperated with a previously identified recognition element in the carboxyl lobe to allow efficient phosphorylation of both the amino and carboxyl lobe oligosaccharides.
Three general models for extending the catalytic reach of N-acetylglucosamine 1-phosphotransferase to widely spaced oligosaccharides are presented.
A key step in the targeting of newly synthesized acid hydrolases
to lysosomes is the recognition by the enzyme UDP-GlcNAc:lysosomal
enzyme N-acetylglucosamine 1-phosphotransferase (abbreviated
phosphotransferase) of a protein determinant shared among
at least 40 different lysosomal hydrolases (1, 2).
This interaction results in the phosphorylation of mannose residues on
the Asn-linked high mannose oligosaccharides of the lysosomal
hydrolases. These phosphomannosyl residues serve as high affinity
ligands for binding of the hydrolases to mannose 6-phosphate receptors
present in the Golgi and their subsequent translocation to the lysosome (3).
A molecular dissection of the phosphotransferase recognition marker on lysosomal hydrolases was undertaken in this laboratory using a pair of aspartyl proteases, the lysosomal hydrolase cathepsin D and the secretory protein pepsinogen, as sequence donors for chimeric proteins that were tested for their ability to act as phosphotransferase substrates (4). Even though cathepsin D and pepsinogen are 45% identical in amino acid sequence and have similar secondary and tertiary structures (5, 6), they differ in that cathepsin D is well phosphorylated by phosphotransferase, whereas a glycosylated form of pepsinogen is not. These studies led to the identification of a phosphotransferase recognition patch in the carboxyl lobe of cathepsin D formed by two noncontinuous primary sequences (lysine 203 and amino acids 265-292). Although these residues were sufficient to confer recognition by phosphotransferase when substituted into homologous positions in pepsinogen, it was observed that the presence of multiple regions of the amino lobe of cathepsin D enhanced phosphorylation of the chimeric proteins. It was concluded that these elements may be part of an extended carboxyl lobe recognition domain or comprise a second independent recognition domain. It was also observed that although the presence of a phosphotransferase recognition domain located on either lobe of a cathepsin D/glycopepsinogen chimeric molecule is sufficient to allow phosphorylation of oligosaccharides on both lobes, the oligosaccharide on the lobe that contained the recognition domain was the preferred substrate (7, 8).
The identification of specific sequences in the amino lobe of cathepsin D which contribute to phosphotransferase recognition was attempted by generating six chimeric proteins using two shared restriction sites in the 550-nucleotide amino lobe region to generate nine constructs(7) . Chimeric proteins containing only a single cathepsin D amino lobe region were not well phosphorylated, but those with the pairwise combinations were phosphorylated to some degree. This suggested that multiple regions are required for generation of the recognition site or are required for proper folding.
To characterize further the cathepsin D amino lobe elements that contribute to phosphorylation, we have taken an approach that redefines and extends the limits of the chimeric methodology applied in our original studies. Rather than making a small number of constructs manually and examining each individually, we sought to generate a large library of chimeric molecules and analyze the entire library by a phenotypic selection. Short segments of cathepsin D and pepsinogen encoded by oligonucleotide cassettes were shuffled together through a directed subcloning strategy that ensured high fidelity and representation. The library was expressed in COS cells, and chimeric proteins that folded correctly and were phosphorylated were selected and analyzed. This approach allowed the analysis of 2 orders of magnitude more constructs than had been screened in our prior studies over several years.
Our experiments have defined two regions of the amino lobe of cathepsin D which contribute to proper folding and surface expression and two distinct regions that promote phosphorylation of the amino lobe oligosaccharide (CHO 70). Three models are proposed to account for these results.
Figure 3: Binding of Man-6-P/IGF-II receptor and 9E10 antibody to COS cells transiently expressing membrane-anchored chimeric proteins. The Man-6-P/IGF-II receptor was directly labeled with FITC (horizontal axis); the 9E10 antibody was detected with phycoerythrin goat anti-mouse IgG (vertical axis). Ninety-eight percent of mock transfected cells were below 10 fluorescent units on each axis. The primary sequence schematics show cathepsin D as dark gray, pepsinogen as light gray, the myc peptide as hatched, and CD8 sequences as white and black (transmembrane region). Panel A, CD-MCD8; panel B, CP71-MCD8 (GP p1-319, CD 320-348); panel C, CP70-MCD8 (CD p1-187, GP 188-319, CD 320-348); and panel D, CP55-MCD8 (CD p1-187, GP 199-230, CD 231-348).
Thirteen regions were selected which spanned from amino acids 37 to 186 of cathepsin D. The numbering conventions for the amino acid sequence of cathepsin D and glycopepsinogen are outlined in Fig. 1. XhoI and KpnI sites were engineered into the cDNA of cathepsin D and CP3 (which contains the amino lobe of glycopepsinogen up to the HincII site at nucleotide 931) by site-directed mutagenesis in the vector pGBT such that these restriction sites flanked the 5′ and 3′ boundaries of the 13 consecutive segments. The HindIII-XhoI regions of the cathepsin D and CP3 cDNAs were sequenced and used as a source of region 1 for subsequent steps. The 13 regions between the XhoI and KpnI sites were numbered regions 2-14 (Fig. 1). Oligonucleotide cassettes corresponding to regions 2-14 for cathepsin D and glycopepsinogen were synthesized to allow a three-step subcloning strategy outlined in Fig. 2. The oligonucleotides introduce SphI, SacI, BamHI, StyI, EcoRI, and SalI sites into the cathepsin D cDNA and allow complete assembly without altering the sequence of cathepsin D. The introduction of restriction sites SacI (Asp-63 to Glu), StyI (Gln-93 to Ser and Val-94 to Leu), and SalI (Ser-161 to Asp) and the requirement to otherwise maintain compatibility with the cathepsin D sequences in the junctions between regions 3 and 4 (Arg-58 to Lys, Phe-59 to Tyr), regions 7 and 8 (Ile-114 to Val), regions 8 and 9 (Ser-124 to Ile), and regions 13 and 14 (Ser-177 to Pro) resulted in introduction of the indicated changes to the glycopepsinogen sequence (Fig. 1).
Figure 1: Alignment of cathepsin D and glycopepsinogen amino lobes with junctions for shuffle mutagenesis. Numbering is for cathepsin D sequence with p1-p44 for the propiece and 1-191 from cathepsin D shown. Boxed residues are identical between cathepsin D and glycopepsinogen. The solid vertical bars represent restriction sites remaining in the completed chimeric molecules defining one class of junctions between regions. The dashed vertical/horizontal partitions represent a second class of junctions between regions based on annealing of overhangs between two oligonucleotide cassettes without leaving a unique restriction site. The circles indicate residues that were changed from the glycopepsinogen residue (shown) to another residue (as indicated under "Experimental Procedures") to splice to the cathepsin D sequence. The solid underline starting at residue 70 marks the site of Asn-linked glycan addition to the amino lobe. The single underlined residues at p34 and 77 are implicated in phosphotransferase recognition as described under "Results."
Figure 2: Construction of shuffle library. The numbers at the bottom of the plasmids are the number of potential products after each ligation. In round 1 oligonucleotide cassettes were annealed and ligated into pSP64 as indicated. At this point each potential product was independently prepared and sequenced. The indicated restriction sites were then used to prepare fragments used in subsequent steps. The final shuffle library encoding membrane-anchored chimeric proteins was subcloned into the vector pAprM8. After selection, a subset of constructs was subcloned into a modified version of pCDM for expression as secreted protein.
The initial round of subcloning into pSP64 cut with HindIII and EcoRI resulted in 30 possible ligation
products corresponding to every combination of cathepsin D and
glycopepsinogen segments for the 13 consecutive regions excluding the
propieces. These were sequenced, and validated plasmid copies of each
ligation product were prepared as purified plasmid and quantified by A260. In the second subcloning step the versions of
each subregion corresponding to each possible cathepsin
D/glycopepsinogen combination were mixed in equal ratio based on OD,
and the pools were cut with the indicated restriction enzymes. The
resulting fragments were then copurified and used in ligations as
outlined in Fig. 2. The number of possible combinations at each
step is indicated in Fig. 2. The number of colonies pooled at
each step was sufficient to obtain greater than 95% of possible
combinations (15).
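The coverage criterion can be illustrated with a short calculation. The sketch below assumes that every combination is equally likely in the pool and that each colony is an independent draw; neither assumption is stated in the text, and the function names and example pool sizes are hypothetical rather than taken from Fig. 2.

```python
import math

def expected_coverage(n_combinations: int, n_colonies: int) -> float:
    """Expected fraction of distinct combinations represented at least once,
    assuming each colony is an independent, uniform draw (an assumption)."""
    return 1.0 - (1.0 - 1.0 / n_combinations) ** n_colonies

def colonies_needed(n_combinations: int, coverage: float = 0.95) -> int:
    """Smallest colony count whose expected coverage reaches `coverage`."""
    if n_combinations == 1:
        return 1  # a single colony already covers the only combination
    per_draw_miss = 1.0 - 1.0 / n_combinations
    return math.ceil(math.log(1.0 - coverage) / math.log(per_draw_miss))

if __name__ == "__main__":
    # Illustrative pool sizes only; they are not the values shown in Fig. 2.
    for n in (16, 256, 16384):
        print(n, colonies_needed(n))
```

Under these assumptions, reaching 95% coverage takes roughly three times as many colonies as there are combinations when the pool is large, because ln(1/0.05) is close to 3.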
The Endo H-resistant material
that had been treated with Pronase or N-glycanase was loaded
onto concanavalin A-Sepharose in Tris/CaCl2 buffer, and
three fractions were collected: the flow-through containing tri- and
tetraantennary complex oligosaccharides, the 10 mM
α-methylglucoside eluate containing biantennary complex
oligosaccharides, and the 100 mM α-methylmannoside eluate
containing Endo H-resistant high mannose oligosaccharides. In some
experiments the level of phosphorylation of the Endo H-resistant, N-glycanase-released high mannose oligosaccharides eluted from
concanavalin A-Sepharose with 100 mM α-methylmannoside was
determined. This material was found to be significantly
underphosphorylated compared with the Endo H-released material.
Therefore, no attempt was made to systematically compensate for Endo
H-resistant high mannose oligosaccharides in estimating
phosphorylation.
The percent phosphorylation was calculated as 100 × (cpm
recovered in Endo H-released oligosaccharides with one or two
phosphates) / [total cpm in Endo H-released oligosaccharides
(phosphorylated plus neutral) + 2 × (cpm in Endo H-resistant complex
oligosaccharides)]. The values for the complex oligosaccharides were
multiplied by 2 to correct for the fact that they contain 3 mannose
residues versus an average of 6 mannose residues per high
mannose oligosaccharide. The ratio of oligosaccharides with two and one
phosphate was determined directly from the QAE-Sephadex chromatography
data.
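The calculation described above can be written as a small function. This is a minimal sketch: the function and parameter names are hypothetical, and the cpm values in the usage line are invented for illustration and are not data from this study.

```python
def percent_phosphorylation(phosphorylated_cpm: float,
                            neutral_cpm: float,
                            complex_cpm: float,
                            complex_weight: float = 2.0) -> float:
    """Percent of Asn-linked oligosaccharides that carry phosphate.

    phosphorylated_cpm: cpm in Endo H-released oligosaccharides with one or
                        two phosphates.
    neutral_cpm:        cpm in Endo H-released neutral oligosaccharides.
    complex_cpm:        cpm in Endo H-resistant complex oligosaccharides.
    complex_weight:     correction for 3 mannose residues in complex chains
                        versus an average of 6 in high mannose chains.
    """
    released_total = phosphorylated_cpm + neutral_cpm
    denominator = released_total + complex_weight * complex_cpm
    if denominator <= 0:
        raise ValueError("total counts must be positive")
    return 100.0 * phosphorylated_cpm / denominator

if __name__ == "__main__":
    # Invented counts: 300 cpm phosphorylated, 500 cpm neutral, 100 cpm complex.
    print(percent_phosphorylation(300.0, 500.0, 100.0))
```

Keeping the mannose correction as a named parameter makes the assumption behind the factor of 2 explicit and easy to revisit.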
Analysis of the amino lobe recognition determinants for
phosphotransferase required selection of a cathepsin D/glycopepsinogen
chimera in which the cathepsin D amino lobe contributes to the level of
phosphorylation. Glycopepsinogen sequences could then be substituted
into the amino lobe of this positive construct to determine the minimal
cathepsin D sequences required for phosphorylation. CP70-MCD8 is a
chimera with the propiece and amino lobe of cathepsin D and the
carboxyl lobe of glycopepsinogen, except for the carboxyl-terminal 29
amino acids (Fig. 3C). Although a similar construct
(CP1) is moderately well phosphorylated in frog oocytes (4),
CP70-MCD8 did not bind fluorescent Man-6-P/IGF-II receptor when
expressed in COS cells despite good surface expression (Fig. 3C), and a soluble form of this chimeric protein
was only phosphorylated 0.5-1% (not shown). However, CP55-MCD8,
which differs from CP70-MCD8 by having more cathepsin D carboxyl lobe
sequence (residues 231-348), was phosphorylated at a level
intermediate between CD-MCD8 and CP71-MCD8 (Fig. 3D).
The additional cathepsin D carboxyl lobe sequences contain the
β-loop (residues 265-292) that is required for efficient
phosphorylation (4). However, the presence of the cathepsin D
β-loop alone does not result in phosphorylation (4).
Together, these results show that neither the cathepsin D propiece and
amino lobe nor the carboxyl lobe β-loop (residues 265-292)
contains sufficient information for recognition by phosphotransferase,
but together these regions cooperate to confer good phosphorylation of
the membrane-anchored form. Therefore, the construct selected to accept
the amino lobe chimeric library contained glycopepsinogen sequence from
amino acids 187-230 and cathepsin D sequence from amino acids
231-348. The regions to be intermixed in the chimeric library are
between residue p1 of the propiece and amino acid 186.
A panning procedure was developed for selecting COS cells expressing phosphorylated chimeric molecules on their surface. Transfected COS cells were allowed to adhere to Petri plates containing attached Man-6-P/IGF-II receptor. The unbound cells were collected by gentle washing and the adherent cells eluted with Man-6-P. As shown in Fig. 4, cells expressing CD-MCD8 (panels A-C) and CP55-MCD8 (panels D-F) were readily selected by this method, as indicated by the nearly complete depletion of 9E10-positive cells from the nonbound fraction. Cells expressing CP3-MCD8 (panels G-I) were less efficiently bound, as indicated by the failure to remove 9E10-positive cells from the nonbound population, and cells expressing CP71-MCD8 (panels J-L) did not bind at all. A variant of this assay involved attaching the 9E10 mAb to Petri plates, which allowed panning for cells expressing a chimeric protein on their surface regardless of phosphorylation (not shown). With these tools a large library of constructs could be screened for both surface expression (folding) and degree of phosphorylation.
Figure 4:
Selection of COS cells transiently
expressing membrane-anchored chimeric molecules on plates coated with
Man-6-P/IGF-II receptor. Cells were stained with 9E10 mAb and FITC goat
anti-mouse IgG before (input) or after selection on Man-6-P/IGF-II
receptor plates (1 µg/cm²). After selection the sample
was split into nonbound and bound populations. The latter was then
eluted from the plates with 10 mM Man-6-P. Ninety-eight
percent of mock transfected cells gave less than 10 fluorescence units. Panels A-C, CD-MCD8; panels D-F,
CP55-MCD8; panels G-I, CP3-MCD8; panels
J-L, CP71-MCD8.
The quality of the library was
preliminarily assessed by analyzing 22 colonies randomly picked from
the initial plating. Hybridization sequencing established that each
region was represented at close to a 50:50 ratio (cathepsin
D/glycopepsinogen) except region 2, which was 15% cathepsin D (Table 1). The finding of a few inserts with hybridization
signals for both cathepsin D and glycopepsinogen sequence at a single
region suggests that some "triplet" ligation events
(head-tail/tail-head/head-tail) occurred during the generation of the
library. Binding of the 9E10 mAb and the Man-6-P/IGF-II receptor to COS
cells transfected with the individual plasmids showed that 50% of the
plasmids encoded chimeric proteins that were expressed at the cell
surface and phosphorylated to various extents (data not shown). This
indicated that the library was of adequate complexity (5
10
expressable constructs) and that constructs with
variable phosphorylation levels were present in the library. When
transfected COS cells were permeabilized and analyzed by
immunofluorescence with the 9E10 antibody, some of the cells with low
surface expression showed endoplasmic reticulum staining, suggesting
that particular combinations of segments resulted in chimeric proteins
that fail to fold well enough to exit this compartment (data not
shown).
Sequencing of 64 constructs randomly selected from the Man-6-P/IGF-II receptor binding sublibrary (32 constructs) and the 9E10 binding sublibrary (32 constructs) revealed a striking overrepresentation of cathepsin D regions 1 (+34%), 7 (+41%) and 12 (+28%) and an underrepresentation of cathepsin D region 9 (-42%) (Table 2). It is likely that these deviations from normal representation following selection are due to folding constraints, although it is not clear for any region whether an enriched segment was selected for or the depleted segment selected against. Fortunately, none of the enriched regions was absolutely required for secretion of soluble constructs. It was therefore possible to measure the phosphorylation levels in constructs lacking each enriched region and to determine the contribution of the enriched segments to phosphotransferase recognition in apparently well folded molecules (see below).
The conclusion from the analysis of the expression and phosphorylation of the membrane-anchored chimeric proteins was that both the Man-6-P/IGF-II receptor and 9E10 antibody selections were similarly effective in enriching for surface-expressed molecules with variable levels of phosphorylation. Further quantitative analysis of phosphorylation was performed on soluble versions of selected constructs in which we purposefully skewed the analysis toward constructs with relatively low contents of cathepsin D segments. Constructs encoding soluble proteins were used for the final analysis to allow comparison with earlier results.
Sequences of chimeric proteins showing the greatest increase in phosphorylation with the addition of the least number of cathepsin D regions are compared in Fig. 5. Although the combination of segments 7 and 12 results in a small increase in phosphorylation over either alone, larger gains in total phosphorylation and in the fraction of oligosaccharides with two phosphates are made by including segments 1, 5, 8, 10, and 13 in different combinations. The complexity of these results led us to seek some simplification by examining whether the two oligosaccharides of the chimeric proteins might be phosphorylated differentially based on inclusion of different regions of the amino lobe of cathepsin D. Mutations eliminating one or the other of the glycosylation sites were introduced by subcloning or cassette mutagenesis (22). Construct S206, which contains cathepsin D segments 7, 8, 11, 12, and 13, showed efficient phosphorylation of the carboxyl lobe oligosaccharide at position 199 (CHO 199) but almost no phosphorylation of the amino lobe oligosaccharide at position 70 (CHO 70) (Fig. 6). Also shown in Fig. 6 are the results obtained with S206 containing both oligosaccharides. The extent of total oligosaccharide phosphorylation and the ratio of oligosaccharides containing one phosphate to those containing two phosphates were the same for the average of the two constructs containing a single oligosaccharide and S206 containing both oligosaccharides. Similar results were obtained with S132 (cathepsin D segments 7 and 12) and with S198 (cathepsin D segments 7, 8, 12, and 13). In contrast, construct S8782, which incorporates additional cathepsin D regions (segments 1, 5, 8, 11, 12, and 13), showed significant phosphorylation of both CHO 199 and CHO 70 (Fig. 6). Thus, regions 1 and 5, which are present in S8782 but not in S206, appear to contain a determinant for efficient phosphorylation of CHO 70.
Of note, however, is the finding that CHO 70 rarely acquires two phosphates, whereas CHO 199 receives two phosphates about 50% of the time.
Figure 5:
Phosphorylation of soluble chimeric
proteins combining cathepsin D residues 231-348 with regions from
the amino lobe of cathepsin D. COS cells transiently expressing soluble
chimeric proteins bearing both Asn-linked oligosaccharides were labeled
with [2-³H]mannose, and the myc-tagged chimeric
proteins were isolated from the media. The labeled oligosaccharides
were analyzed to determine percentage of phosphorylated Asn-linked
oligosaccharides (%P) and the ratio of oligosaccharides with
two and one phosphate (2:1). Each chimeric protein was analyzed at
least three times, and standard deviations of percent phosphorylation
are given.
Figure 6: Distribution of phosphorylation between the two oligosaccharides in two chimeric proteins. Chimeric proteins S206 and S8782 were mutated to delete one or the other Asn-linked glycosylation sites. Chimeric proteins with both oligosaccharides, only CHO 70, or only CHO 199 were analyzed for percent phosphorylation as described in the legend to Fig. 5. The total phosphorylation of the chimeric protein with both oligosaccharides should be the sum of the phosphorylation of the two individual oligosaccharides divided by 2. This calculation was performed using the percent phosphorylation from chimeric proteins with one or the other oligosaccharide in the column labeled (70 + 199)/2.
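The (70 + 199)/2 column in the legend to Fig. 6 is a simple average of the two single-site values. The sketch below assumes that both sites are glycosylated and labeled to the same extent, so that each contributes half of the oligosaccharides; the function name is hypothetical and the input percentages are illustrative only, not values from Fig. 6.

```python
def expected_combined_phosphorylation(percent_cho70: float,
                                      percent_cho199: float) -> float:
    """Expected percent phosphorylation of a protein carrying both glycans,
    assuming each site contributes half of the labeled oligosaccharides."""
    for value in (percent_cho70, percent_cho199):
        if not 0.0 <= value <= 100.0:
            raise ValueError("percentages must lie between 0 and 100")
    return (percent_cho70 + percent_cho199) / 2.0

if __name__ == "__main__":
    # Illustrative inputs: a poorly phosphorylated CHO 70 and a well
    # phosphorylated CHO 199.
    print(expected_combined_phosphorylation(2.0, 40.0))
```

Agreement between this average and the value measured on the protein with both glycans is what indicates that the two sites are phosphorylated independently.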
A panel of 18
soluble chimeric constructs bearing only CHO 70 and containing
different permutations of the cathepsin D regions found in S8782 were
assayed directly for phosphorylation of CHO 70 to test the hypothesis
that cathepsin D regions 1 and 5 are required for phosphorylation of
this oligosaccharide (Fig. 7). A very strong correlation between
efficient CHO 70 phosphorylation and the presence of both regions 1 and
5 of cathepsin D was established. Region 1 contains the entire propiece
of cathepsin D plus 36 amino acids of the mature protein, whereas
region 5 (amino acids 63-81) is a component of the β-flap
68-88, which contains CHO 70 and overlaps a portion of the
propiece. Examination of Fig. 7 shows that neither of the highly
enriched cathepsin D segments 7 or 12 is essential for efficient
phosphorylation of CHO 70 since S8840, which lacks region 12, and
S8782, which lacks region 7, are both relatively well phosphorylated on
CHO 70.
Figure 7: Phosphorylation of CHO 70 in soluble chimeric proteins combining cathepsin D residues 231-348 with regions from the amino lobe of cathepsin D. Chimeric proteins bearing only CHO 70 were expressed in COS cells, and phosphorylation was analyzed as indicated in the legend to Fig. 5. Since the proportion of oligosaccharides with two phosphates is negligible at CHO 70 this column was omitted. In cases in which the experiment was performed three or more times the standard deviation is reported; in other cases the percent phosphorylation (%P) is an average of two values. S206 was analyzed four times and was always less than 1%.
Figure 8: Site-directed mutagenesis to identify residues in construct S8840 required for phosphorylation of CHO 70. The phosphorylation of S8840 containing only CHO 70 was determined as in the legend to Fig. 5. Standard deviations are based on at least three determinations for each mutated form.
Mutation of Lys 58 to Ala had no effect on CHO 70 phosphorylation in S8840. Thus this residue, which is located in the junction between regions 3 and 4, does not appear to have a significant role in the phosphorylation of CHO 70. Similarly, when Lys-p8, GluAsp-p24-25, Lys-p29, Glu-p44, Lys-69, and Asp-75 were individually changed to alanines, there was no effect on the level of phosphorylation of CHO 70.
Figure 9: In vitro phosphorylation of soluble chimeric proteins purified from COS cells. Transfer of [³²P]GlcNAc from UDP-[β-³²P]GlcNAc to Asn-linked glycans of chimeric proteins with both oligosaccharides or only CHO 70 was determined as described under "Experimental Procedures." Main panel: S8840 with CHO 70 and S198 with CHO 70. Inset: S198 with CHO 70 and CHO 199, and S198 with CHO 70.
The data presented in this paper demonstrate that the amino lobe of cathepsin D contains two types of elements that influence oligosaccharide phosphorylation. One class of elements promoted progression through the secretory pathway, apparently by maintaining compatibility of the two lobes of the chimeric bilobed aspartyl protease and thereby facilitating proper folding of the chimeric protein. This same set of elements, in concert with a portion of the previously described carboxyl lobe phosphotransferase recognition marker, also had a positive effect on phosphorylation of the oligosaccharide located at position 199 (CHO 199). A distinct combination of cathepsin D regions in the amino lobe, when combined with the carboxyl lobe recognition element, was found to be required for phosphorylation of the oligosaccharide located at position 70 (CHO 70). The chimeric constructs that led to these findings were obtained through a novel combinatorial mutagenesis strategy (shuffle mutagenesis) in which a propiece restriction fragment and oligonucleotide cassettes corresponding to 13 regions of the cathepsin D and glycopepsinogen amino lobes were shuffled together randomly to generate a large library of chimeric molecules.
The shuffle
mutagenesis approach provided two advantages over previous attacks on
the same problem with conventional chimeric mutagenesis (7).
The most immediate outcome of the initial selection was the generation
of a sublibrary of chimeric molecules that were well folded based on
high level expression at the cell surface. The heterogeneity of
expression in COS cells and the unexpectedly high basal activity of the
well folded carboxyl lobe β-loop for phosphorylation of CHO 199
(construct S132) prevented us from using the selected constructs to
identify directly determinants for phosphorylation mediated by
recognition elements in the amino lobe. However, further analysis
allowed resolution of the cathepsin D regions required for
phosphorylation of CHO 70 from the cathepsin D regions required for
efficient folding. Therefore, the shuffle mutagenesis approach was
successful in overcoming problems that can plague mutagenic approaches:
resolution of folding effects from specific recognition phenomena and
reconstructing complex determinants involving widely spaced regions in
the primary sequence which are brought together through protein
folding.
The initial screening of the membrane-anchored chimeric
library by expression in COS cells and selection with Man-6-P/IGF-II
receptor or an antibody to a myc peptide epitope incorporated into the
construct enriched for chimeric proteins that localized efficiently at
the cell surface. The sequencing of these constructs identified two
cathepsin D regions, segments 7 and 12, which rescue the expression of
the basal shuffle construct with all glycopepsinogen segments in the
amino lobe and cathepsin D sequence after amino acid 230 of the
carboxyl lobe. Aspartyl proteases are bilobed proteins in which the
lobes are thought to have significant rigid body behavior in flexing
during binding to substrates and inhibitors (23). Therefore,
the interactions between the lobes are limited. Segment 12 is part of a
β-pleated sheet that is continuous through both the amino and
carboxyl lobes, and this region contacts a strand from the
carboxyl-terminal region which is contributed by cathepsin D in all the
chimeras analyzed here. The requirement to contact a cathepsin D
segment from the amino lobe probably accounts for the ability of
cathepsin D region 12 to correct folding problems that make constructs
such as CP3 and S0 unable to escape the ER. Segment 7 is a short strand
that runs through the core of the amino lobe, and although contacting
no portion of the carboxyl lobe directly, it may influence the
stability of the aforementioned β-sheet through contacts in regions
9, 10, and 11 of the amino lobe, which in turn contact the β-sheet.
The stability of this β-sheet may be particularly important for the
membrane-anchored constructs since the carboxyl-terminal 20 amino acids
of cathepsin D contribute the carboxyl lobe portion of the β-sheet,
and the attachment of the substantial carboxyl-terminal extension may
place additional strain on this structure.
The identification of
specific cathepsin D regions that influence phosphorylation required
consideration of each oligosaccharide as an individual acceptor for
phosphotransferase. Combinations of cathepsin D segments 7, 12, and
neighboring regions resulted in significant increases in total
phosphorylation, but virtually all of this phosphate was added to CHO
199 based on the low phosphorylation of chimeric proteins bearing only
CHO 70. It is possible that the effect of these regions on the overall
stability of the chimeric protein and the specific manner in which
β-loop 265-292 is presented may result in the enhanced
phosphorylation of CHO 199.
A panel of constructs with CHO 70 and
cathepsin D segments 7 or 12 plus other cathepsin D amino lobe
sequences was tested, and a subset containing both regions 1 and 5 from
cathepsin D was found to have significantly better phosphorylation than
the constructs lacking one or both of these segments. Apparently
elements present in segment 1 (propiece plus 36 amino acids of mature
protein) and segment 5 (a component of the β-flap 68-88), in
combination with the β-loop 265-292 element, serve to direct
phosphotransferase to CHO 70. It should be noted that constructs with
efficient phosphorylation of CHO 70 retain efficient phosphorylation of
CHO 199 when it is present, such that an improvement in CHO 70
phosphorylation does not appear to occur at the expense of CHO 199
phosphorylation. In other words, phosphorylation of either
oligosaccharide is independent of the phosphorylation of the other
oligosaccharide. Also, the cathepsin D sequences in the carboxyl lobe
are essential for phosphorylation of CHO 70 since replacement of
cathepsin D amino acids 231-320 with glycopepsinogen sequence did
not impair the ability of the chimeric protein to fold and be secreted
but reduced phosphorylation to 0.5-1% (soluble form of
CP70-MCD8).
Within cathepsin D segments 1 and 5, Lys-34 of the
propiece (Lys-p34) and His-77 were found to be important for
phosphorylation of CHO 70. In the cathepsin D structure, Lys-p34 is
coordinated by the active site aspartic acids, suggesting that this
residue acts through a conformational effect since it would not be
available for direct interaction with
phosphotransferase (5, 24). However, constructs with
this mutation fold correctly based on their ability to leave the ER,
bind to pepstatin-agarose in a pH-dependent manner (4), and be
properly phosphorylated at CHO 199 (not shown). His-77, on the other
hand, is exposed to solvent in the mature cathepsin D crystal structure
and is predicted to be exposed to solvent in a procathepsin D model
based on the pepsinogen and cathepsin D crystal structures. Thus, His-77 is a good candidate to be a contact residue
involved in promotion of CHO 70 phosphorylation. Interestingly, His-77
is very close to the region of the propiece that would be directly
perturbed by the Lys-p34 mutation, suggesting that noncharged residues
in this region of the propiece may have to be presented in a specific
conformation to cooperate with His-77.
The identification of a histidine as a participant in phosphorylation of the amino lobe oligosaccharide is of interest since the ionization state of histidine changes in the pH range encountered by lysosomal enzyme precursors as they traverse the secretory pathway. The exact pH of compartments containing phosphotransferase is not known, but there is a general opinion that a pH gradient exists between the ER (neutral) and trans-Golgi (mildly acidic). Since arginine replaces histidine at this position, it is likely that the charged form of histidine is the active form and that this form may be favored by acidification.
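The pH dependence of histidine ionization can be made concrete with the Henderson-Hasselbalch relation. The sketch below is illustrative only: it assumes a side-chain pKa of about 6.0, the approximate textbook value for free histidine, whereas the pKa of His-77 in the folded protein is not known and could differ substantially. The compartment pH values are likewise assumed for illustration, and the function name is hypothetical.

```python
def fraction_protonated(ph: float, pka: float = 6.0) -> float:
    """Fraction of an ionizable group in its protonated (charged, for
    histidine) form, from the Henderson-Hasselbalch relation."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

if __name__ == "__main__":
    # Assumed, illustrative pH values; the true pH of the compartments that
    # contain phosphotransferase is not known.
    for label, ph in (("near neutral", 7.2), ("mildly acidic", 6.2)):
        print(f"{label} (pH {ph}): {fraction_protonated(ph):.2f} protonated")
```

With these assumed numbers, a drop of one pH unit raises the charged fraction severalfold, which is consistent with the suggestion that acidification favors the active form.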
The mechanism by which the amino lobe propiece and β-flap
regions of cathepsin D direct enhanced phosphorylation of CHO 70 in
concert with the distant β-loop 265-292 in the carboxyl lobe
is an open question. Models that would address this question must
account for a number of observations. First, the amino lobe
determinants do not function autonomously but must cooperate with the
carboxyl lobe β-loop element when expressed in mammalian cells. Second, the
carboxyl lobe β-loop element does appear to
function autonomously when matched with a compatible amino lobe, which
may provide a more appropriate conformation. Third, CHO 70 receives only one
phosphate even in intact cathepsin D, suggesting that there is a
fundamental difference in its accessibility to phosphotransferase
compared with CHO 199, which usually receives two phosphates in intact
cathepsin D and many chimeric molecules (7).
Three models
could account for the ability of surface determinants of the amino lobe
to direct phosphotransferase to CHO 70 and also may be applied to the
more general problem of phosphorylation of multiple oligosaccharides by
phosphotransferase. All three models hold that the most important
phosphotransferase contact is a properly presented carboxyl lobe
β-loop, whereas the elements in the amino lobe can act in one of
three ways: 1) directly contact phosphotransferase simultaneously with
the carboxyl lobe β-loop (bivalent model); 2) directly contact
phosphotransferase as an exclusive contact but only after the lysosomal
enzyme is concentrated in the vicinity of phosphotransferase through
interaction with the carboxyl lobe β-loop (rebinding model); or 3)
influence the orientation in space of CHO 70 such that it is
favorably (or unfavorably) oriented to allow access to the
phosphotransferase catalytic site (oligosaccharide conformation model).
In the third model the contacts with phosphotransferase are mediated
exclusively by the carboxyl lobe elements.
The first two models, which hold that the amino lobe determinants are directly involved in binding to phosphotransferase, are supported by the observation that in frog oocytes elements in the amino lobe of cathepsin D can function independently of elements in the carboxyl lobe to direct efficient phosphorylation of CHO 70 (8). This result suggests the presence of a distinct recognition site for phosphotransferase in the amino lobe. The reason that the carboxyl lobe elements are essential for phosphorylation in mammalian cells is not clear.
The bivalent model proposes that phosphotransferase contacts the lysosomal enzyme at multiple sites simultaneously, with an increase in avidity due to suppression of dissociation. To account for the observations, the bivalent binding must result in a different orientation of CHO 70 compared with CHO 199 such that the second contact is required to allow phosphorylation of CHO 70. The concept that phosphotransferase could be large enough to contact both sides of cathepsin D simultaneously is plausible because there is evidence that phosphotransferase is a large multimeric protein (25). This hierarchic model of the strong carboxyl lobe and weak amino lobe binding site interacting with possibly similar surfaces on a receptor molecule (phosphotransferase) is reminiscent of the interaction of growth hormone with its homodimeric receptor in which the homologous surfaces of two identical receptor subunits contact very different surfaces on the hormone molecule with very different contact areas and binding affinities (26).
The rebinding model is based on the concept that receptor or binding site clusters can rebind dissociating ligand molecules with enhanced efficiency (27). In detail, this model suggests that the independently functioning cathepsin D carboxyl lobe binding site is bound by a phosphotransferase cluster, allowing phosphorylation of the nearby CHO 199. The cathepsin D then dissociates and undergoes rotational and short range translational diffusion in the vicinity of the phosphotransferase cluster, with a high probability of rebinding through the weak amino lobe site, allowing phosphorylation of CHO 70. Another way to view this is that the interaction with the carboxyl lobe site increases the local concentration of the amino lobe site to a point where it can function. In the absence of a carboxyl lobe site, as occurs in the CP1 construct, the amino lobe is not concentrated near phosphotransferase clusters, and CHO 70 is not phosphorylated. The strong point of this model is that the binding is never multivalent, so there is no expectation of long occupancy on phosphotransferase, and the relative positioning of the multiple sites required to phosphorylate multiple Asn-linked glycans is not as constrained since phosphotransferase never has to contact two sites simultaneously.
The oligosaccharide conformation model
proposes that the amino lobe determinants do not contact
phosphotransferase directly but participate in an intramolecular
binding of CHO 70 to position it for optimal access to the
phosphotransferase catalytic site when phosphotransferase is bound to
the carboxyl lobe recognition marker. Asn-linked oligosaccharides are
long flexible appendages on globular proteins which essentially occupy
a large "cloud" around the site of
anchorage (28, 29). Restraining an oligosaccharide or
even a single branch such that it would be limited to a subregion of
this cloud closest to the phosphotransferase catalytic site in the
bound proenzyme could greatly enhance the chances of phosphorylation.
CHO 70 is ordered in the crystal structure of mature cathepsin D. In
fact, it has been suggested that the phosphate on CHO 70 may coordinate
with Lys-203 (5, 24). The fact that a phosphorylated
branch of CHO 70 can reach Lys-203 suggests that CHO 70 can be extended
to reach the vicinity of CHO 199. Although the β-flap is not in the
path of CHO 70 as it reaches from Asn-70 to Lys-203, the conformation
of CHO 70 may change substantially after propiece cleavage and
phosphorylation itself.
The new regions of the amino lobe identified here, whether they work by intermolecular interaction with phosphotransferase or intramolecular interaction with CHO 70, are specific recognition sites in that they support a clear biological goal: ensuring that cathepsin D has the minimum number of Man-6-P groups required for efficient targeting. The evolutionary pressure for ensuring that multiple oligosaccharides contain Man-6-P is likely to arise from the nature of the two Man-6-P receptors, each of which contains two low affinity binding sites for Man-6-P which allow multivalent binding with a higher effective avidity (3). The high avidity binding enhances the efficiency of the targeting of the acid hydrolases to lysosomes. Cathepsin D ensures this multivalency with 2-fold redundancy. CHO 199 usually carries two phosphates, providing one chance to generate a high affinity ligand, and the combination of any single phosphate on CHO 199 with the single phosphate on CHO 70, which is generated under the direction of the amino lobe determinants defined in this paper, provides a second chance to generate a high affinity ligand allowing targeting to lysosomes.