(Received for publication, September 10, 1996, and in revised form, October 17, 1996)
From the Department of Chemistry, University of
British Columbia, Vancouver, British Columbia V6T 1Z1, Canada, the
§ Research Institute, The Hospital for Sick Children,
Toronto, Ontario M5G 1X8, Canada, and the ¶ Departments of
Biochemistry and Pediatrics, University of Toronto,
Toronto, Ontario M5G 1X8, Canada
Human lysosomal β-galactosidase catalyzes the
hydrolysis of β-galactosides via a double displacement mechanism
involving a covalent glycosyl enzyme intermediate. By use of the slow
substrate 2,4-dinitrophenyl-2-deoxy-2-fluoro-β-D-galactopyranoside,
a glycosyl enzyme intermediate has been trapped on the enzyme. This has
allowed the catalytic nucleophile to be identified as Glu-268 by peptic and tryptic digestion of the inactivated enzyme followed by high performance liquid chromatography-electrospray ionization tandem mass
spectrometry of the peptide mixture. This glutamic acid is fully
conserved in a sequence-related family of enzymes (Family 35),
consistent with its essential role.
Lysosomal β-galactosidase (EC 3.2.1.23) is an essential
catabolic enzyme that catalyzes the hydrolysis of terminal nonreducing
β-galactosyl residues from a variety of substrates, including ganglioside GM1,
lactosylceramide, lactose, and some galactose-containing
oligosaccharides. The inherited deficiency of the enzyme in humans
results in GM1 gangliosidosis, a severe neurological
disease, or Morquio syndrome B, characterized by storage of keratan
sulfate. The enzyme, synthesized as an 88-kDa glycoprotein precursor
(677 amino acids), is processed to the 64-kDa mature enzyme by
proteolytic cleavage of the C-terminal end (1, 2, 3). The precursor is
kinetically and functionally equivalent to the mature enzyme: it has the
same pH optimum and kinetic parameters, undergoes endocytosis, and is
localized to the lysosomes of GM1 gangliosidosis fibroblasts, where it
corrects the enzymic deficiency (2). The enzyme is a retaining glycosidase, catalyzing the
hydrolysis of galactoside substrates with net retention of anomeric
configuration (2) through a double displacement mechanism involving a
covalent β-galactosyl enzyme intermediate. Such enzymes employ a pair
of carboxylic acids in the active site, one functioning as a catalytic
nucleophile and the other presumably as an acid/base catalyst (4,
5).
Although a number of mutations in lysosomal β-galactosidase resulting
in partial losses of in vitro activity have been associated with different disease phenotypes, no individual amino acid residues have been established as essential to the catalytic mechanism. One
approach to this problem involves the use of mechanism-based inactivators to covalently label key active site residues. The most
specific glycosidase inactivators available to date are
2-deoxy-2-fluoroglycosides with good leaving groups, which specifically
label the catalytic nucleophile of retaining glycosidases by trapping a
stabilized 2-deoxy-2-fluoroglycosyl enzyme rendered kinetically
accessible by the presence of the activated leaving group. This good
leaving group accelerates only the first step, whereas the C-2 fluorine
slows both steps so that the resulting 2-deoxy-2-fluoroglycosyl enzyme accumulates (6, 7). Previously, we have
inactivated human β-galactosidase precursor with such a
mechanism-based inactivator,
2,4-dinitrophenyl-2-deoxy-2-fluoro-β-D-galactopyranoside (2FDNPGal), which showed saturable binding and protection from inactivation by the competitive inhibitor
isopropyl-β-D-thiogalactopyranoside (2).
In this paper we report the use of this inactivator to identify,
without radiolabels, the catalytic nucleophile of the human lysosomal
β-galactosidase precursor through a variety of mass spectrometric
techniques.
Activity of human lysosomal β-galactosidase was assayed with
2,4-dinitrophenyl-β-D-galactopyranoside in 50 mM citrate, 100 mM phosphate, pH 4.3, at
37 °C using a Pye-Unicam spectrometer equipped with a circulating
water bath as described previously (2).
The precursor form of β-galactosidase was
purified from the medium above permanently transfected Chinese hamster
ovary cells as described by Zhang et al. (2). The final
specific activity of the preparation was 0.73 mmol/h/mg of protein. The
enzyme was homogeneous by SDS-polyacrylamide gel electrophoresis and
gave a single band with a Mr of 72,000 after
digestion with N-glycanase. Lysosomal
β-galactosidase (50 µl, 5 mg/ml, 50 mM citrate, 100 mM phosphate,
pH 4.3) was incubated with 2FDNPGal (1.58 mM, 10 µl) at
37 °C for 15 min, resulting in inactive labeled enzyme. Enzyme
(native or inactivated, 60 µl, 5 mg/ml) was incubated with trichloroacetic acid (120 µl, 100 mM) in pH 2 buffer for
5 min at room temperature. Pepsin (Boehringer Mannheim, 60 µl, 0.5 mg/ml in the above buffer) was added, and the mixture was incubated at
room temperature for 12 h. For further tryptic proteolysis of the
peptic peptides, solid ammonium bicarbonate (10 mg) was added to 200 µl of this digest, and trypsin (Sigma, Type XIII, L-1-tosylamido-2-phenylethyl chloromethyl
ketone-treated, 5 µl, 5 mg/ml, in 50 mM phosphate buffer,
pH 6.8) was immediately added. The digest was incubated at room
temperature for 15 min and then reacidified with 50% trifluoroacetic
acid (20 µl).
Mass spectra were recorded on a PE-Sciex API III triple quadrupole mass spectrometer (Sciex, Thornhill, Ontario, Canada) equipped with an ion spray ion source. Peptides were separated by reverse phase high performance liquid chromatography (HPLC) on an Ultrafast Microprotein Analyzer (Michrom BioResources, Inc., Pleasanton, CA) directly interfaced with the mass spectrometer. In each of the mass spectrometry (MS) experiments, the proteolytic digest was loaded onto a C18 column (Reliasil, Michrom BioResources Inc., Pleasanton, CA; 1 × 150 mm) and then eluted with a gradient of 0-60% HPLC solvent B over 20 min followed by 100% HPLC solvent B over 2 min at a flow rate of 50 µl/min (HPLC solvent A: 0.05% trifluoroacetic acid, 2% acetonitrile in water; HPLC solvent B: 0.045% trifluoroacetic acid, 80% acetonitrile in water). A postcolumn splitter was present in all experiments, splitting off 85% of the sample into a fraction collector and sending 15% into the mass spectrometer. Spectra were obtained in either the single quadrupole scan mode (liquid chromatography (LC)/MS), the tandem MS neutral loss mode, or the tandem MS daughter scan mode (LC/MS/MS).
In the single quadrupole mode (LC/MS) the quadrupole mass analyzer was scanned over a m/z range of 300-2400 Da with a step size of 0.5 Da and a dwell time of 1 ms/step. The ion source voltage was set at 5 kV and the orifice energy was 80 V.
In the neutral loss scanning mode, MS/MS spectra were obtained by
searching for a mass loss of m/z 165, corresponding to the loss of a 2-deoxy-2-fluorogalactosyl moiety from
a peptide ion in the singly charged state. To detect loss of the
label from doubly charged peptides, a mass loss of one-half the mass of the label (m/z 82.5) was probed. The settings were as follows:
scan range: m/z 300-2400; step size: 0.5 Da;
dwell time: 1 ms/step; ion source voltage: 5 kV; orifice energy: 80 V;
RE1 = 115; DM1 = 0.16; R1 = 0 V; R2 = 50 V;
RE3 = 115; DM3 = 0.16; collision gas thickness: 3.2-3.6 × 10¹⁴ molecules/cm² (CGT = 320-360). To
maximize the sensitivity of neutral loss detection, the
resolution (RE and DM) is normally compromised, but not to the point of generating artifact
neutral loss peaks. Neutral loss peaks shown in chromatograms consist
of multiple data points detected in several consecutive scans and were
reproducible from run to run. Occasional instrument spikes in a given
run consisting of a single data point detected in a single scan are not
reproduced in displayed chromatograms.
In the tandem MS daughter scan mode, the spectrum was obtained by
selectively introducing the m/z 688 peptide from
the first quadrupole into the collision cell and observing the daughter ions in the third quadrupole. Thus, the first quadrupole was locked on
m/z 688; third quadrupole scan range: 200-800;
step size: 0.5 Da; dwell time: 1 ms; ion source voltage: 5 kV; orifice
energy: 80 V; RE1 = 112; DM1 = 0.18; R1 = 0 V; R2 = 50 V; RE3 = 112; DM3 = 0.18; collision gas thickness:
4.5 × 10¹⁴ molecules/cm² (CGT = 450).
Samples for analysis were collected as they eluted from the postcolumn flow splitter by pooling fractions eluting approximately 1 min prior to and 1 min following the detection of the peak. Pooled fractions (~100 µl) were concentrated on a Speed-Vac to ~5 or 10 µl prior to liquid secondary ionization mass spectrometry (LSIMS) analysis. Low resolution (2,500) and high resolution (12,000) LSIMS was performed on a Kratos (Manchester, UK) Concept II HQ mass spectrometer using a thioglycerol matrix with an 8-kV ion source and a 12-kV LSIMS gun using cesium ions. The mass reference for high resolution spectra was potassium iodide/polyethylene glycol. The measurement was carried out by peak matching.
Aminolysis of labeled peptides was carried out with 15 µl of the labeled peptide digest (~1 mg/ml) to which was added 30% ammonium hydroxide (7.5 µl). The mixture was incubated at 50 °C for 15 min, reacidified with 50% trifluoroacetic acid (5 µl), and resubjected to electrospray mass spectrometry.
Inactivation of β-galactosidase by 2FDNPGal
occurred in a rapid, time-dependent manner (2). The
2-deoxy-2-fluorogalactosyl enzyme intermediate so formed was
catalytically competent as expected, turning over upon incubation in
buffer at 37 °C after removal of excess inactivator according to a
pseudo first order rate constant of kreac = 0.002 min⁻¹ (6, 7). The rate of reactivation was increased
14-fold by addition of GlcNAc (100 mM) to the mixture, the
sugar presumably facilitating turnover of the intermediate by
transglycosylation, as seen previously with other glycosidases (7, 8).
This establishes the trapped species as the catalytically relevant intermediate since GlcNAc is a preferred "aglycone" in the natural substrate. The residue so labeled is therefore the catalytic
nucleophile.
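To put the turnover rate in more intuitive terms, a first-order rate constant can be converted to a half-life as ln 2 / k. The sketch below assumes simple first-order behavior and applies the reported 14-fold factor directly to the rate constant; the function and variable names are illustrative only.

```python
import math


def half_life_min(k_per_min: float) -> float:
    """Half-life (min) of a first-order process with rate constant k (min^-1)."""
    return math.log(2) / k_per_min


K_REAC = 0.002     # min^-1, turnover of the trapped intermediate in buffer alone
ACCELERATION = 14  # fold increase reported on adding 100 mM GlcNAc

t_half_buffer = half_life_min(K_REAC)
t_half_glcnac = half_life_min(K_REAC * ACCELERATION)

print(f"buffer only: t1/2 = {t_half_buffer:.0f} min")
print(f"with GlcNAc: t1/2 = {t_half_glcnac:.0f} min")
```

Under these assumptions the trapped intermediate has a half-life of several hours in buffer, which is why it survives the 12-h peptic digestion only because that step is carried out on denatured enzyme at low pH.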
Peptic digestion of the 2-deoxy-2-fluorogalactosyl
enzyme after treatment with trichloroacetic acid resulted in a mixture of peptides that were separated by HPLC using the mass spectrometer as
detector. When scanned in LC/MS mode, the total ion current (TIC)
displays a complex mixture of peaks arising from every peptide in the
digest (Fig. 1A). The labeled peptides were
identified in a second run using tandem mass spectrometry in neutral
loss mode, in which the ions are subjected to limited fragmentation by
an inert gas in a collision cell. The ester bond between the inhibitor
and the peptide is one of the more labile bonds present and would be
expected to readily undergo homolytic cleavage resulting in loss of a
neutral sugar. The two quadrupoles are therefore scanned in a linked
mode so that only ions differing by the mass of the lost label
are detected. For a singly charged peptide, this
m/z difference is the mass of the label (165 Da); for a doubly charged peptide, the m/z difference
is one-half of the mass of the label (82.5 Da), and so on.
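The relationship between charge state and the scanned m/z offset can be written out as a one-line calculation. This is a minimal sketch using the nominal label mass quoted in the text; the helper name is hypothetical.

```python
LABEL_MASS = 165.0  # Da, nominal mass lost with the 2-deoxy-2-fluorogalactosyl label


def neutral_loss_offset(label_mass: float, charge: int) -> float:
    """m/z offset to scan for when a neutral label is lost from an ion of given charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return label_mass / charge


# Offsets for singly, doubly, and triply charged peptide ions
offsets = {z: neutral_loss_offset(LABEL_MASS, z) for z in (1, 2, 3)}
for z, offset in offsets.items():
    print(f"charge {z}+: scan for m/z difference of {offset}")
```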
Scanning in neutral loss mode for the mass loss m/z 165 from a singly charged peptide revealed no significant peaks. However, scanning for mass loss m/z 82.5 revealed a large peak at 18 min (Fig. 1B) that was absent in an unlabeled control digest (Fig. 1C). This peak was due to three co-eluting doubly charged peptides measured at m/z 950.5, 1058.5, and 1114.5 ± 1 (Fig. 1D). These peptides therefore correspond to unlabeled singly charged peptides of masses 1736, 1952, and 2064 ± 2. These peptides, presumably containing overlapping sequences that include the catalytic nucleophile, are selectively detected because each has undergone loss of the inhibitor mass. The significant background observed in the control is likely due to losses of phenylalanine residues, which have the same mass, from several different peptides. Inspection of the protein sequence revealed 26, 35, and 31 candidate peptides of masses 1736, 1952, and 2064 ± 2, respectively. However, only 7 different glutamic or aspartic acids, residues that are known to act as catalytic nucleophiles in glycosidases, were present in candidate peptides of all three masses. Increasing the collision gas energy and scanning in daughter ion mode in an attempt to cause further amide bond fragmentation and thereby identify the labeled peptides was unsuccessful. This resulted only in loss of the inhibitor with no significant peptide fragmentation, presumably because the relatively large size of these peptides precludes observable fragmentation in the collision cell.
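The conversion from the observed doubly charged, labeled ions to the masses of the corresponding unlabeled, singly charged peptides can be reproduced with nominal masses. The sketch below assumes that the label adds a net 164 Da to the peptide (a 165-Da glycosyl group replacing one hydrogen) and uses a nominal proton mass of 1 Da, which is adequate at the ±1 accuracy of these measurements. The function name is illustrative.

```python
PROTON = 1.0       # Da, nominal mass of the charge-carrying proton
LABEL_NET = 164.0  # Da, net mass added by the label (165 Da group replacing one H)


def unlabeled_singly_charged(mz: float, charge: int) -> float:
    """Convert the m/z of a labeled ion to the m/z of the unlabeled [M+H]+ ion."""
    neutral_labeled = mz * charge - charge * PROTON  # remove the charge-carrying protons
    neutral_unlabeled = neutral_labeled - LABEL_NET  # remove the net label mass
    return neutral_unlabeled + PROTON                # add one proton back


observed = [950.5, 1058.5, 1114.5]  # doubly charged labeled peptides from the peptic digest
converted = [unlabeled_singly_charged(mz, 2) for mz in observed]
print(converted)
```

The same function applied to a singly charged ion (`charge=1`) reduces to subtracting 165 and adding 1.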
The mixture of peptic peptides was therefore treated with a second
protease in an attempt to cleave these large peptides into smaller
species that would undergo fragmentation. Solid ammonium bicarbonate
was added to a peptic digest of the labeled enzyme prepared as above to
increase the pH to ~8, and the mixture was treated with trypsin for
15 min. The mixture was then reacidified with trifluoroacetic acid to
pH ~ 2 since the inhibitor-peptide ester bond is much more
susceptible to hydrolysis above pH 7 than at acidic pH (9). A complex
mixture of peptides was once again observed upon scanning in LC/MS mode
(Fig. 2A). In neutral loss mode a search for
a mass loss of m/z 165 revealed a single peak (Fig. 2B) that was absent in a control, identically treated
unlabeled digest (Fig. 2C). This singly charged labeled
peptide had a mass of 1040.5 ± 1 (Fig. 2D),
corresponding to a singly charged peptide of mass 876.5 ± 1 (1040.5 - 165 + 1). Thirteen candidate peptides had a mass of
876 ± 1 but only 1 glutamic or aspartic acid was present in
candidate peptides of mass 876 ± 1 and of masses 1736, 1952, and 2064 ± 2. This residue is Glu-268, located within a GPLINSEF (262-269) sequence of mass 876 ± 1, an EPKGPLINSEFYTGW (259-273) sequence of mass 1736 ± 2, a CEPKGPLINSEFYTGWL
(258-274) sequence of mass 1952 ± 2, and a KGPLINSEFYTGWLDHW
(261-277), a PLINSEFYTGWLDHWGQ (263-279), or a LINSEFYTGWLDHWGQP
(264-280) sequence of mass 2064 ± 2. Since the GPLINSEF peptide
is the sole peptide detected in the neutral loss scan of the
peptic/tryptic digest, the latter two candidate peptides of mass
2064 ± 2 may be excluded because they do not contain the
N-terminal glycyl and prolyl residues present in the GPLINSEF peptide,
which must be derived from tryptic digestion of the larger peptic
peptides.
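The overlap argument can be illustrated by intersecting the positions of acidic residues in the peptides named above. This sketch considers only the four retained sequences, not the full candidate lists searched in the analysis, and the helper name is hypothetical.

```python
ACIDIC = {"D", "E"}  # one-letter codes for aspartic and glutamic acid


def acidic_positions(sequence: str, start: int) -> set[int]:
    """Residue numbers of Asp/Glu in a peptide whose first residue is numbered `start`."""
    return {start + i for i, aa in enumerate(sequence) if aa in ACIDIC}


# (sequence, number of first residue) for the peptides retained in the text
peptides = [
    ("GPLINSEF", 262),
    ("EPKGPLINSEFYTGW", 259),
    ("CEPKGPLINSEFYTGWL", 258),
    ("KGPLINSEFYTGWLDHW", 261),
]

common = set.intersection(*(acidic_positions(seq, start) for seq, start in peptides))
print(sorted(common))
```

Only one acidic position is shared by all four peptides, which is the residue assigned as the nucleophile.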
Daughter ion scans of peak 1040.5 at increased collision gas energy
resulted in loss of the label, giving rise to the peak at 876.5 (Fig.
3). Further peptide bond fragmentation was also observed, with peaks at 711, 581, 496, 381, and 268, corresponding to
C-terminal losses of phenylalanine and of EF, SEF, NSEF, and INSEF
species from the parent peptide, confirming that the labeled peptide
does indeed have the sequence GPLINSEF.
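The fragment series can be checked against theory by summing residue masses from the N terminus. The sketch below assumes that the observed fragments are b-type ions and uses standard monoisotopic residue masses, neither of which is stated in the text. Under those assumptions the calculated values fall within about 1.5 Da of the reported peaks.

```python
# Standard monoisotopic residue masses (Da); values assumed, not taken from the text
RESIDUE_MASS = {
    "G": 57.02146, "P": 97.05276, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "S": 87.03203, "E": 129.04259, "F": 147.06841,
}
PROTON = 1.00728  # Da


def b_ions(sequence: str) -> dict[int, float]:
    """Singly charged b-ion m/z values, keyed by number of N-terminal residues retained."""
    ions = {}
    running = 0.0
    for n, aa in enumerate(sequence, start=1):
        running += RESIDUE_MASS[aa]
        ions[n] = running + PROTON
    return ions


series = b_ions("GPLINSEF")
reported = {7: 711, 6: 581, 5: 496, 4: 381, 3: 268}  # peaks quoted in the text
for n, peak in reported.items():
    print(f"b{n}: calculated {series[n]:.2f}, reported {peak}")
```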
Further confirmation was obtained by high resolution LSIMS in which the
masses of the labeled 1040.5 and 876.5 peaks were measured at 1040.495 and 876.447, respectively. No possible peptides that have
masses within 5 ppm of the measured 1040.495 mass, regardless of
protease specificity, may be generated from the native protein, indicating that this peak can only be due to a labeled peptide. Indeed,
the 2-deoxy-2-fluorogalactosyl-labeled GPLINSEF species (calculated
mass, 1040.495; difference in mass (Δm) = 0.18 ppm) is
consistent with this mass. Only two possible peptides have masses
differing by less than 5 ppm from the measured 876.44696 mass:
GPLINSEF, with a calculated mass of 876.447 (Δm = 0.28 ppm), and RNATQRM (25-31), with a calculated mass of 876.447 (Δm = 0.50 ppm). However, the latter candidate
peptide contains no glutamic or aspartic acids, is inconsistent with
the neutral loss and MS/MS results above, and may therefore be
excluded.
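For reference, the ppm criterion used here is the absolute mass difference divided by the calculated mass, scaled by 10⁶. The sketch below shows that calculation and the width of a 5 ppm window at the measured mass. Because the calculated masses are quoted above to only three decimal places, the reported Δm values cannot be reproduced from them, so the example pair of masses is made up for illustration.

```python
def ppm_error(measured: float, calculated: float) -> float:
    """Absolute mass error in parts per million relative to the calculated mass."""
    return abs(measured - calculated) / calculated * 1e6


def ppm_window(mass: float, ppm: float) -> float:
    """Half-width (Da) of a tolerance window of `ppm` parts per million at `mass`."""
    return mass * ppm * 1e-6


# Width of the 5 ppm acceptance window at the measured mass from the text
window = ppm_window(876.44696, 5.0)
print(f"5 ppm at 876.44696 corresponds to +/- {window:.4f} Da")

# Hypothetical pair of masses, chosen only to show the calculation
example = ppm_error(1000.005, 1000.000)
print(f"example error: {example:.2f} ppm")
```

A window of only a few millidaltons is what allows candidate peptides of the same nominal mass to be narrowed to two.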
The attachment site of the 2-fluorogalactosyl moiety, and thus the
nucleophile, was confirmed by aminolysis of the labeled peptide. A
labeled digest treated with ammonium hydroxide was examined by mass
spectrometry and compared to an untreated labeled control digest (Fig.
4). In the aminolyzed sample, the peak at 1040.5 and
retention time 19.8 min present in the untreated digest was absent and
was replaced by a new peak of mass 875.5 and retention time 19.5 min,
consistent with aminolysis of a glycosyl ester to glutamine, giving a
GPLINSQF peptide 1 mass unit less than the parent unlabeled peptide. A
peak of mass 876.5 and retention time of 15 min present in both treated
and untreated samples was also considerably increased in magnitude in
the aminolyzed digest and is presumably the unlabeled peptide resulting
from hydrolysis of the 2-fluoroglycosyl ester under the aminolysis
conditions (Fig. 4).
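The 1-Da shift expected on aminolysis follows from the difference between glutamine and glutamic acid residue masses. The sketch below uses standard monoisotopic residue masses, which are assumed values not given in the text, and the observed nominal peak at 876.5.

```python
# Standard monoisotopic residue masses (Da); assumed values, not from the text
GLU_RESIDUE = 129.04259  # glutamic acid, C5H7NO3
GLN_RESIDUE = 128.05858  # glutamine, C5H8N2O2

# Mass change when the glycosylated Glu side chain is converted to Gln
shift = GLN_RESIDUE - GLU_RESIDUE

unlabeled_peak = 876.5  # observed [M+H]+ of unlabeled GPLINSEF
aminolyzed_peak = unlabeled_peak + shift

print(f"Glu -> Gln shift: {shift:.3f} Da")
print(f"expected GPLINSQF peak: {aminolyzed_peak:.1f}")
```

A shift of this size is consistent with the new peak arising from amide formation at the labeled side chain rather than from hydrolysis, which would regenerate the 876.5 species.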
Conclusions
Glu-268 and much of the surrounding sequence are
conserved between the human and mouse lysosomal β-galactosidases,
both of which belong to Family 35 of glycosyl hydrolases (Fig.
5). Several other β-galactosidases and related
proteins show significant homology to human β-galactosidase. In each
case, the corresponding glutamic acid is absolutely conserved, as would
be expected of an essential catalytic residue. Significantly, this
residue has recently been predicted to be the catalytic nucleophile on
the basis of hydrophobic cluster analysis of proteins of similar
protein fold (10). Eight other carboxylic acids are also absolutely
conserved in this glycosidase family and may be important to catalysis.
Jenkins et al. (11) have noted the presence of a conserved
Asn-Glu sequence in which Glu is the presumptive acid/base catalyst,
about 100 residues away from the catalytic nucleophile in hydrolases in
a variety of families (namely Families 1, 2, 5, 10, and 17). Because
human β-galactosidase is aligned with these families, these
considerations suggest that at least one of these conserved residues is
involved in acid/base catalysis with Glu-188 being the most likely
candidate.
Deficiencies in this enzyme resulting in GM1 gangliosidosis or Morquio syndrome B are characterized by a number of known genotypes, among them R49C, I51T, G123R, R201C, W273L, Y316C, R457Q, R482H, and W509C (12). Fibroblasts from most patients suffering from these conditions show markedly reduced levels of mature enzyme protein. It has been assumed that these mutations are responsible for defects in protein folding or trafficking leading to rapid degradation of the mature protein, thus explaining the fibroblast results. However, some enzymic deficits may be due to more direct effects of certain mutations on binding interactions of the catalytic residues or other residues in the active site. The deleterious effect of the W273L mutation (8% of normal activity (13)) in particular, which to date has been observed only in Morquio B patients, is possibly due to altered secondary structure within the enzyme's active site in close proximity to the catalytic nucleophile. Other mutant residues, although perhaps distant in primary sequence, may be in close spatial proximity to the active site and may have similar effects. The identification of Glu-268 as the catalytic nucleophile, arguably the most critically important residue to catalysis, should facilitate a greater understanding of the molecular basis of GM1 gangliosidosis and Morquio B disease.