©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
Cloning and Expression of the β-N-Acetylglucosaminidase Gene from Streptococcus pneumoniae
GENERATION OF TRUNCATED ENZYMES WITH MODIFIED AGLYCON SPECIFICITY (*)

Valerie A. Clarke (1), Nick Platt (2), Terry D. Butters (1)(§)

From the (1) Glycobiology Institute, Department of Biochemistry, University of Oxford, Oxford OX1 3QU, and the (2) Sir William Dunn School of Pathology, University of Oxford, OX1 3QX, United Kingdom

ABSTRACT

The gene encoding a β-N-acetylglucosaminidase from Streptococcus pneumoniae has been obtained by screening an expression library for β-N-acetylglucosaminidase activity. Clones of different nucleotide sizes, each having arylglycoside activity, were obtained, and DNA sequencing revealed a gene of 3933 base pairs possessing typical bacterial transcription initiation and termination sequences and terminating in an ochre stop codon. Computer analysis of the translated protein of 1311 amino acids (144,210 Da) identified a tandem repeat within which lies a sequence homologous with six other hexosaminidase gene products from a wide variety of species ranging from bacteria to humans. Also found were an amino-terminal putative secretion signal peptide and a carboxyl-terminal cell sorting/anchorage motif typically found in over 20 other Gram-positive surface proteins. The expression of an almost complete DNA clone in Escherichia coli produced a functional and authentic β-N-acetylglucosaminidase with aglycon specificity identical to the wild-type enzyme. However, enzymes produced from truncated DNA clones show more restricted aglycon specificity and are unable to hydrolyze terminal β1-2GlcNAc residues from N-glycans containing a bisecting N-acetylglucosamine. The availability of these clones allows structural analyses to be made of catalytic and oligosaccharide recognition protein domains that enhance functional activity.


INTRODUCTION

β-N-Acetylglucosaminidase, found in both the culture medium and in association with undisrupted cells, is one of six extracellular Streptococcus pneumoniae glycosidases purified to date (1, 2, 3). This repertoire of enzymes, in addition to the β-N-acetylglucosaminidase, includes a β-galactosidase, three endoglycosidases, and a neuraminidase that are thought to aid the organism in the breakdown of oligosaccharides in its surrounding environment for use as a carbon source. In particular, the neuraminidase has been implicated in the process of pathogenesis. By cleaving the terminal sialic acids on cell-surface glycolipids, it is thought that the neuraminidase can expose the carbohydrate ligand by which S. pneumoniae attaches to the host (4, 5). S. pneumoniae also synthesizes and secretes a hyaluronidase that is capable of degrading a component of the extracellular matrix and may aid the organism to invade underlying host tissue (6). Since GlcNAcβ1-linked residues are common components of several cell-surface molecules in host tissues, it is possible that β-N-acetylglucosaminidase also has a pathogenic role.

All of these glycosidases are important reagents for oligosaccharide analysis. β-N-Acetylglucosaminidase has been demonstrated to have broad specificity using GlcNAcβ-Gal, hydrolyzing both β1-3 and β1-6 linkages that are commonly found in mucins. By contrast, the hydrolysis of N-linked sugars at low enzyme concentrations is restricted to GlcNAcβ1-2Man linkages except when the Manα1-6 arm is substituted with a GlcNAc at both C-2 and C-6 positions, or the Man β-linked to the chitobiose core is substituted at the C-4 position by a bisecting GlcNAc (7). This limited activity can be a useful tool in the sequencing of oligosaccharides, giving specific structural information about the cleaved bond and the surrounding sugar residues. It also raises the issue of defining the intrinsic properties of the enzyme that govern this phenomenon. The aim of this study was to use an expression cloning strategy to obtain the gene for β-N-acetylglucosaminidase and to investigate the interaction between the enzyme and its N-linked carbohydrate substrates at the molecular level. By the expression of truncated portions of the β-N-acetylglucosaminidase gene, we have been able to examine defined regions of the protein that determine glycosidase specificity.


EXPERIMENTAL PROCEDURES

Materials

Oligonucleotides were synthesized by British Biotechnology (Oxford, United Kingdom). Restriction and DNA-modifying enzymes were purchased from Boehringer Mannheim and New England Biolabs. Radionucleotides were obtained from Amersham Corp. and ICN-Flow (High Wycombe, Bucks, UK). Prime-It random primer DNA labeling kit was purchased from Stratagene.

Bacterial Strains

XL1-Blue Escherichia coli strain (recA- (recA1, lac-, endA1, gyrA96, thi, hsdR17, supE44, relA1, (F′ proAB, lacIq, lacZΔM15, Tn10))) was purchased from Stratagene. NM554 E. coli strain (recA13, araD139, Δ(ara-leu)7696, Δ(lac)17A, galU, galK, hsdR, rpsL(Strr), mcrA, mcrB) was obtained from Stratagene.

Purification of β-N-Acetylglucosaminidase

A homogeneous preparation of the cellular form of β-N-acetylglucosaminidase was obtained from S. pneumoniae (ATCC 12213) using previously published methods (3). A limited trypsin digest of β-N-acetylglucosaminidase followed by amino-terminal amino acid sequencing was used to obtain two tryptic peptide sequences, a major sequence, EGADIPIIGGMVA, and a minor sequence, LQPMAFND. Samples for SDS-polyacrylamide gel electrophoresis were prepared by methods previously described (3).

Construction and Screening of Genomic Expression Library

Genomic DNA was prepared from frozen S. pneumoniae cells (1 ml of packed cells) following a method used for tissue samples (8). Extracted genomic DNA was partially digested with AluI and separated by agarose gel electrophoresis. Fragments ranging between 1 and 7 kb were gently extracted from the agarose by adsorption to glass beads using the Geneclean II Kit (Bio101, Inc., Stratech Scientific, Luton, UK) to avoid shearing. The ends of the size-selected pieces were then blunt-ended using Klenow and T4 DNA polymerase (8). EcoRI linkers were ligated to blunt-ended genomic DNA fragments with T4 DNA ligase, and these inserts were ligated into EcoRI-digested λ-ZapII vector arms (Stratagene) and packaged with GigapackII packaging extracts (Stratagene) to generate a bacteriophage expression library. The number of primary recombinants was determined by plating the library on the XL1-Blue E. coli host strain. The plated expression library was screened for β-N-acetylglucosaminidase activity by incorporation of 50 µM 4-methylumbelliferyl N-acetylglucosaminide in the top agarose and visualization of hydrolysis by UV light. Positive clones were plaque-purified by successive plating, and pure plaques were subcloned into plasmid Bluescript SK- by the in vivo excision protocol that accompanies the vector (Stratagene).

Rescreening of Library for Overlapping Clones

The library was plated and screened by plaque hybridization (Colony/Plaque Screen nylon discs, DuPont NEN) using DNA probes that were [α-³²P]dATP-labeled using the Prime-It random primer labeling kit (Stratagene) and purified over Select-D, G-25 spin columns (5 Prime 3 Prime, Inc., CP Labs, Bishop's Stortford, Herts, UK). DNA probes were prepared from the 5′ end of pBStrH7 by restriction digesting with BglII and EcoRI and from the 3′ end of pBStrH17 by restriction digesting with NcoI and EcoRI to generate probes of 570 and 730 bp, respectively.

Plasmid DNA Purification

Clones were amplified and purified using the maxi-prep pZ523 spin column plasmid purification kit (5 Prime 3 Prime, Inc.). DNA was also purified using Magic Mini Preps (Promega).

Southern Blot Analysis of Clones

Southern transfer of DNA to nylon membranes (Hybond, Amersham Corp.) was made according to published methods (8, 9) . Oligonucleotide probes designed from the amino acid sequence of the major tryptic peptide from the purified protein, GAA GG(T/A) GC(T/A) GAT AT(T/C) CC(A/T) AT(T/C) AT(T/C) GG(T/A) GG(T/A) ATG GT, were labeled with [-P]dATP using polynucleotide kinase (8) . After hybridization at 56 °C overnight, filters were washed and subjected to radioautography.
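
The design of the degenerate probe can be checked with a short script. The following is a minimal sketch, not the authors' procedure: it expands the probe exactly as printed above into its pool of concrete oligonucleotides and confirms that every member encodes the start of the major tryptic peptide. The function names and the reduced codon table are illustrative assumptions; the codon assignments themselves follow the standard genetic code.

```python
import itertools
import re

# Probe exactly as printed in the text; (X/Y) marks a two-fold degenerate base.
PROBE = ("GAA GG(T/A) GC(T/A) GAT AT(T/C) CC(A/T) AT(T/C) "
         "AT(T/C) GG(T/A) GG(T/A) ATG GT")
MAJOR_PEPTIDE = "EGADIPIIGGMVA"  # major tryptic peptide from the text

# Standard genetic code, restricted to the codons this probe can contain.
CODONS = {
    "GAA": "E", "GGT": "G", "GGA": "G", "GCT": "A", "GCA": "A",
    "GAT": "D", "ATT": "I", "ATC": "I", "CCA": "P", "CCT": "P",
    "ATG": "M",
}


def parse_positions(probe):
    """Return the allowed bases at each position, e.g. 'G' or 'TA'."""
    compact = probe.replace(" ", "")
    tokens = re.findall(r"\(([ACGT])/([ACGT])\)|([ACGT])", compact)
    return [first + second if first else single
            for first, second, single in tokens]


def expand(probe):
    """Yield every concrete oligonucleotide in the degenerate pool."""
    for combo in itertools.product(*parse_positions(probe)):
        yield "".join(combo)


def translate(oligo):
    """Translate complete codons only; a trailing partial codon is ignored."""
    usable = len(oligo) - len(oligo) % 3
    return "".join(CODONS[oligo[i:i + 3]] for i in range(0, usable, 3))


pool = list(expand(PROBE))
peptides = {translate(oligo) for oligo in pool}
print(len(pool[0]), "nt;", len(pool), "sequences; encodes", peptides)
```

The eight two-fold degenerate positions give a pool of 2^8 = 256 sequences of 35 nucleotides, all encoding EGADIPIIGGM; the final GT is the first two bases of the valine codon.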

DNA Sequencing

Nucleotide sequences were determined by the dideoxy chain termination method (10) using the Sequenase DNA sequencing kit (U. S. Biochemical Corp., Cambridge BioScience, Cambridge, UK). Both strands of DNA were sequenced in their entirety by a combination of specific oligonucleotide primers and the transposon-facilitated DNA sequencing TN1000 nested set kit (Gold BioTechnology, Inc., St. Louis, MO). Sequence analysis was accomplished using MacVector and Assembly-Lign software (IBI, Cambridge, UK) and on-line NCBI data bases.

Northern Blot Analysis

Total RNA was prepared from S. pneumoniae using guanidinium thiocyanate (11). RNA (10 µg) was separated on a denaturing formaldehyde agarose gel with RNA molecular sizing fragments ranging from 0.24 to 9.5 kb and Northern blotted. The nylon filter was probed with the 570-bp ³²P-labeled BglII-EcoRI fragment used to rescreen the genomic library.

Subcloning into the pGEX Expression Vector

The vector pGEX-3X (Pharmacia, Milton Keynes, UK) was digested with EcoRI and dephosphorylated. The inserts from clones pBStrH7, -8, and -17 were prepared by digestion with EcoRI and gel purification. The insert and vector were ligated overnight and transformed into NM554 E. coli by electroporation. Colonies were ampicillin-selected on LB/amp plates, and 20 were randomly chosen for analysis. Mini-prepped DNA was digested with EcoRI and separated on an agarose gel for verification of vector and insert sizes.

pGEX Expression and Affinity Purification

The expression and purification were carried out according to published methods (12). 50-ml cultures of E. coli transformed with pGEX clones were induced by the addition of isopropyl-1-thio-β-D-galactopyranoside (1 mM), and the cells were sonicated. Affinity beads were mixed with bacterial cell lysates for 2 h at 4 °C and washed 3 times with buffer, and the fusion protein was eluted from the affinity beads by the addition of fresh 5 mM reduced glutathione. Where appropriate, factor Xa cleavage was conducted by incubating factor Xa (Denzyme, Aarhus, Denmark) at 1 mg/ml with beads coupled to fusion protein at room temperature overnight. Supernatants were assayed for enzyme activity after centrifuging the beads.

Substrate Specificity Assays

Tritium-labeled biantennary, triantennary, tetraantennary, bisected-biantennary, and bisected-hybrid oligosaccharide alditols were obtained from Oxford GlycoSystems (Abingdon, UK). (GlcNAcβ1,4GlcNAc)3 was obtained from a partial chitin hydrolysis, and a degalactosylated biantennary native oligosaccharide was purified from human asialotransferrin (13).

Native oligosaccharide substrate was incubated at different concentrations (0.2-1.0 mM) with 2.5 milliunits/ml of wild-type or recombinant β-N-acetylglucosaminidase in 50 mM citric acid/sodium phosphate buffer, pH 5.0, containing 1 mg/ml bovine serum albumin at 37 °C for 1 h. The reactants were desalted, and hydrolysis was monitored by Dionex HPAEC (Dionex BioLC system) using a CarboPac PA-1 column eluted at 1 ml/min with 150 mM NaOH, 30 mM NaOAc, and the reaction products were detected using triple-pulsed amperometric detection with the following pulse potentials and durations: E1 = 0.01 V (t1 = 120 ms), E2 = 0.6 V (t2 = 120 ms), and E3 = -0.93 V (t3 = 130 ms). The extent of hydrolysis was calculated from empirically derived response factors for substrate and reaction products, and the data were plotted using a weighted nonlinear regression analysis (Multifit 2.0, Day Computing, Cambridge, UK). Radiolabeled oligosaccharide alditols were separated using an isocratic eluant of 200 mM NaOH and the fractions taken for radioactivity determination by scintillation counting. Bio-Gel P-4 chromatography (Oxford GlycoSystems) was also used to separate the reaction products after enzyme digestion, and the radioactivity in each fraction was determined as above.
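
As an illustration of how Km and Vmax can be recovered from rate measurements across this substrate range, the sketch below fits the Michaelis-Menten equation, v = Vmax[S]/(Km + [S]), by a Hanes-Woolf linearization. This is a deliberately simpler stand-in for the weighted nonlinear regression named above, not a reproduction of it. The data are synthetic and noise-free: the Km of 132 µM is the wild-type value reported in Results, while the Vmax of 1.0 and the five substrate concentrations are arbitrary assumptions.

```python
def michaelis_menten(s, km, vmax):
    """Initial rate at substrate concentration s."""
    return vmax * s / (km + s)


def hanes_woolf_fit(substrate, rates):
    """Fit a line to ([S], [S]/v): slope = 1/Vmax, intercept = Km/Vmax."""
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


# Synthetic example: 0.2-1.0 mM substrate expressed in µM (assumed spacing).
substrate_um = [200.0, 400.0, 600.0, 800.0, 1000.0]
rates = [michaelis_menten(s, km=132.0, vmax=1.0) for s in substrate_um]

km_fit, vmax_fit = hanes_woolf_fit(substrate_um, rates)
print(round(km_fit, 3), round(vmax_fit, 3))
```

With real, noisy data the linearization distorts the error structure, which is one reason a weighted nonlinear fit is generally preferred.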


RESULTS

Expression Cloning of StrH

An expression cloning strategy was adapted from published procedures (14) to obtain the gene encoding S. pneumoniae β-N-acetylglucosaminidase. A S. pneumoniae expression library of 5 10primary recombinants, generated in the vector λ-ZapII (Stratagene), was amplified in the E. coli host strain XL1-Blue, and 300,000 clones were screened for β-N-acetylglucosaminidase activity using the substrate 4-methylumbelliferyl N-acetylglucosaminide (4-MU-GlcNAc). 32 positive clones were identified by fluorescent halos encircling each plaque when visualized under 366-nm ultraviolet light. Hydrolysis was confirmed to be specific for N-acetylglucosaminide by the inclusion of a control substrate, 4-methylumbelliferyl xyloside, in the screening protocol. No hydrolysis of this substrate was seen. From the group of positive clones, 20 were selected at random and subcloned into pBluescript plasmids via in vivo excision according to the manufacturer's procedures and analyzed by restriction digestion to determine the sizes of the genomic DNA inserts. Restriction-digested DNA of the four largest inserts, ranging in size from 1584 to 2504 bp, was Southern blotted and probed with a degenerate ³²P-labeled oligonucleotide designed from the amino acid sequence of the major tryptic peptide (Fig. 1). Hybridization of the radiolabeled probe to each clone selected by the enzyme activity of its translated product confirmed that they encoded the same protein that had been biochemically purified from the S. pneumoniae cells (Fig. 1). These clones (Fig. 2 a) were nucleotide sequenced and were found to compose a 2.7-kb continuous open reading frame with no start or stop codons, indicating that further screening of the genomic library was required to locate the missing gene sequences. ³²P-Labeled DNA probes from the 5′ and 3′ ends of this 2.7-kb partial gene were used to identify two clones that contained the 5′ and 3′ ends of the strH gene (Fig. 2 b).
Complete sequence analysis of these three contiguous clones revealed a single, continuous open reading frame of 3933 bp designated strH (str, streptococcus; H, hexosaminidase) (Fig. 3). The ATG (methionine) start codon, designated +1, was identified by the positions of three consensus sequences: the AGGAGG Shine-Dalgarno ribosome binding sequence located 7 bases upstream from the ATG translation initiation codon, and the two putative promoter sequences TTGACT resembling a "−35" transcription initiation sequence beginning at -87 and TATAAT "−10" transcription initiation sequence beginning at -43. The ochre stop codon located at base 3934 was followed by a hairpin-loop sequence resembling a rho-independent transcription termination sequence.
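
The reported figures are mutually consistent, as the following arithmetic sketch shows. It uses only numbers stated in this paper (the 3933-bp open reading frame and the 144,210-Da predicted mass); the variable names are illustrative, and the mean residue mass is a derived quantity, not a value reported by the authors.

```python
ORF_BP = 3933                 # open reading frame length from the text
PREDICTED_MASS_DA = 144_210   # predicted molecular mass from the text
OCHRE = "TAA"                 # ochre is the UAA stop codon, written as DNA

# A complete reading frame must be a whole number of codons.
codons, leftover = divmod(ORF_BP, 3)

# With the A of ATG at +1, the stop codon begins one base after the ORF.
stop_first_base = ORF_BP + 1
stop_last_base = ORF_BP + len(OCHRE)

# Derived value: average mass contributed by each residue.
mean_residue_mass = PREDICTED_MASS_DA / codons

print(codons, leftover, stop_first_base, stop_last_base,
      round(mean_residue_mass, 1))
```

The 1311 codons match the translated protein length, and the stop codon position agrees with base 3934 given above.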


Figure 1: Southern blot of strH clones. Clones 6, 7, 9, and 17 were restriction-digested with the following panel of enzymes: EcoRI, EcoRV, HindIII, ClaI, SacI. DNA (1 µg) was applied to a 0.7% Tris borate/EDTA buffer agarose gel in the order of enzymes listed above. Panel a shows the EthBr-stained gel. Lane 1, BRL 1-kb DNA ladder; lanes 2-6, pBStrH6; lanes 7-11, pBStrH7; lanes 12-16, pBStrH9; and lanes 17-21, pBStrH17. In panel b the gel was transferred to a nylon membrane by Southern blotting and hybridized with a ³²P-labeled degenerate oligonucleotide probe designed from the amino acid sequence of the major tryptic peptide. Filters were then washed twice in 6 × SSC at 37 °C, once in 4 × SSC at 37, 42, 47, and 52 °C, and once in 2 × SSC at 52 °C with cautious monitoring before subjecting to autoradiography.




Figure 2: Alignment of overlapping clones. a, the clones, selected by screening for β-N-acetylglucosaminidase activity, were sequenced and aligned using MacVector and Assembly-Lign software. The relative position of the 90-bp tandem repeating regions encoding 30-amino acid consensus sequences to other hexosaminidases (Table II) is shaded. b, the relation is illustrated between the contiguous 2701-bp partial gene composed of pBStrH6, -7, -8, and -17 obtained by activity screening of the genomic library and the pBStrH5′ and 3′ clones identified by screening the library with DNA probes derived from the ends of the contiguous sequence.




Figure 3: strH DNA and amino acid sequence and flanking regions. EcoRI and AluI sites flank both ends of the genomic sequence. Both tryptic peptide sequences are marked in boldface. Transcription initiation and termination sequences are double underlined. Shine-Dalgarno sequence is underlined. The start and stop codons are marked by a dotted underline, and the A of the ATG start codon is designated +1. The two 30-amino acid tandem repeating regions homologous with other hexosaminidases (Table II) are underlined. This sequence has been deposited in the GenBank data base (accession number L36923).



Protein Sequence of β-N-Acetylglucosaminidase

The strH DNA was computer translated to give a protein of 1311 amino acids with a predicted molecular mass of 144,210 Da. Both tryptic peptide sequences obtained from a purified preparation of β-N-acetylglucosaminidase were found (Fig. 3), and a comparison with all other published sequences in computer data bases did not reveal significant homology, confirming that the strH gene was unique. However, two amino acid sequence motifs common to other Gram-positive surface proteins were discovered (15). The first was a putative secretion signal peptide at the amino terminus revealed by Kyte-Doolittle hydrophobicity analysis (Fig. 4 a). This sequence motif is characterized by a positively charged amino terminus followed by a hydrophobic core and a string of polar residues (16). The proposed cleavage site of this signal peptide is between the two alanine residues at sites 33 and 34 in strH between the hydrophobic core region and the polar residues (17). Within this region is a KQQRFSIRKXXXGAASLIG consensus sequence that appears to be highly conserved in many of the cell surface proteins of Streptococcus and Staphylococcus (Table I). The carboxyl terminus contained a sequence homologous to a membrane spanning/cell wall anchorage sorting signal found in more than 20 Gram-positive surface proteins to date (18) (Table I), identified by the consensus amino acid sequence LPXTGX, where L, P, T, and G are the conserved amino acids. In the translated sequence, we obtained LPETGT, followed by a hydrophobic stretch of amino acids proposed to be the membrane spanning region, and a short charged tail (Fig. 4 b). The LPETGT sequence is directly preceded by two lysine residues that may be of importance in anchoring the enzyme to the cell wall. It has been suggested (18) that the two lysines preceding the LPXTGX motif of protein A may be covalently attached via a transpeptide linkage to the peptidoglycan of the cell wall of Staphylococcus aureus.
The authors based this proposal on the finding that the murein of the peptidoglycan in E. coli is covalently attached to lipoprotein via lysine and arginine residues (19). Another striking feature found within the amino acid sequence of the β-N-acetylglucosaminidase is a tandem repeat of approximately 335 amino acids from site 180 to 522 and again between sites 625 and 956. This repeat is apparent in both nucleotide sequence and amino acid sequence self-alignments, although greater homology is conserved when comparing the amino acid sequences than is found between the nucleotide sequences in these repeat regions. Within each of these tandem repeat regions lies a sequence spanning 30 amino acids which, when compared to each other, are 67% identical and 93% similar when considering conservative amino acid substitutions. Interestingly, both of these 30 amino acid sequences have considerable homology to protein sequences found in six other hexosaminidases isolated from a wide variety of species including bacteria and humans (Table II).
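
The identity and similarity percentages quoted for the two 30-amino acid sequences correspond to 20 of 30 identical positions and 28 of 30 identical or conservatively substituted positions. The sketch below shows one way such figures can be computed for a gapless alignment. Both peptides are hypothetical, constructed only to reproduce those counts; they are not the strH sequences. The grouping of residues into conservative classes is likewise an assumption, since the scheme used for the comparison is not stated here.

```python
# Assumed conservative-substitution classes (illustrative, not from the paper).
GROUPS = ["AVLIM", "FWY", "STNQ", "KRH", "DE"]

# Hypothetical 30-residue peptides: 20 identical positions, then 8
# conservative substitutions, then 2 non-conservative substitutions.
REPEAT_A = "HLGGDEVPKSCWESNPKIQA" + "LKDSFVNR" + "GP"
REPEAT_B = "HLGGDEVPKSCWESNPKIQA" + "IRETYAQK" + "WD"


def is_conservative(a, b):
    """True when two different residues fall in the same assumed class."""
    return a != b and any(a in group and b in group for group in GROUPS)


def compare(seq_a, seq_b):
    """Return (percent identity, percent similarity) for a gapless alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    pairs = list(zip(seq_a, seq_b))
    identical = sum(a == b for a, b in pairs)
    similar = identical + sum(is_conservative(a, b) for a, b in pairs)
    n = len(pairs)
    return 100.0 * identical / n, 100.0 * similar / n


identity, similarity = compare(REPEAT_A, REPEAT_B)
print(round(identity), round(similarity))
```

Different substitution classes or a scoring matrix would shift the similarity figure, so the 93% value depends on the scheme chosen.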


Figure 4: a, hydrophilicity plot of the amino terminus of strH. The strH translation product was analyzed by a Kyte-Doolittle hydrophilicity plot. The first 50 amino acids of the amino terminus were analyzed for features of a signal peptide motif found in other prokaryotes. The consensus motifs are labeled. b, hydrophilicity plot of the carboxyl terminus of strH. The last 35 amino acids of the strH carboxyl terminus were scanned for features of a cytoplasmic membrane/cell wall anchorage motif common in over 20 other Gram-positive bacterial surface proteins. The features of the sorting signal are labeled.
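
A scan for the LPXTGX sorting signal can be expressed as a simple pattern match. The sketch below is an illustration only: the carboxyl-terminal sequence is a hypothetical toy string built to contain the features described in the text (two lysines, LPETGT, a hydrophobic stretch, and a short charged tail), not the actual strH sequence.

```python
import re

# LPXTGX: L, P, T and G are conserved; X is any residue.
SORTING_MOTIF = re.compile(r"LP.TG.")

# Hypothetical toy carboxyl terminus (not the real strH sequence).
toy_c_terminus = "TSEEKKLPETGTAVAGLLGIALLSVVGLRRKKD"

match = SORTING_MOTIF.search(toy_c_terminus)
motif = match.group(0)
# Two residues immediately preceding the motif.
upstream = toy_c_terminus[match.start() - 2:match.start()]
# Everything after the motif: membrane-spanning stretch plus charged tail.
downstream = toy_c_terminus[match.end():]

print(motif, upstream, downstream)
```

On a full-length protein the same pattern could match elsewhere by chance, so a hit is usually accepted only when it lies near the carboxyl terminus and is followed by a hydrophobic stretch and charged tail.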



Northern Blot Analysis

A total RNA preparation from S. pneumoniae was used to identify the transcript of β-N-acetylglucosaminidase. A major band hybridizing at 4.0 kb (results not shown) was consistent with the size of the 4.0-kb DNA obtained from sequencing.

Kinetic Analysis of Recombinant and Wild-type β-N-Acetylglucosaminidase

Clones pBStrH8 and -17, as expected, showed arylglycoside hydrolytic activity, but to determine that these had the same activity as wild-type β-N-acetylglucosaminidase using a native biantennary N-glycan, they were subcloned into the pGEX-3X expression vector (Pharmacia Biotech Inc.) for the high-level production of recombinant enzyme. The Km and Vmax values were determined for the two recombinant enzymes from clones pGXStrH17, the longest clone, and pGXStrH8, the most truncated clone. Wild-type enzyme demonstrated the highest affinity for the natural oligosaccharide with a Km of 132 µM compared with the Km values of the smaller length enzymes from clones pGXStrH17 and pGXStrH8, which were 2-2.5 times greater (Table III). The rates of hydrolysis, however, followed an opposite trend, where the Vmax measured for the smallest enzyme clone, pGXStrH8, was 4 times faster than the wild-type enzyme and 3 times faster than the largest recombinant enzyme from clone pGXStrH17.

The Km and Vmax were also determined for the factor Xa-cleaved enzymes from clones pGXStrH8 and pGXStrH17 (Table III). The Km of the fusion protein, from clone pGXStrH17, and the Xa-cleaved enzyme, StrH17, were very similar. By contrast, the Km values of the clone pGXStrH8 and StrH8 recombinant enzymes differed by a factor of two. However, the standard error of the Km of the StrH8 enzyme was quite large (Table III).

Aglycon Specificity of Recombinant β-N-Acetylglucosaminidase

Affinity-purified fusion proteins were assayed for their ability to hydrolyze a panel of radiolabeled oligosaccharide alditol substrates to compare aglycon specificity with the wild-type purified β-N-acetylglucosaminidase (Fig. 5). The reaction products from enzyme digestions were separated by HPAEC. Because this technique shows only a shift in retention times between the substrate and hydrolysis product indicating cleavage has occurred, the structures of hydrolysis products were verified using Bio-Gel P-4 analysis, which gave actual sizes of the reaction products in glucose units. The minimum enzyme concentration (units/ml, using 4-MU-GlcNAc) for each enzyme that was chosen showed no hydrolysis of (GlcNAcβ1-4GlcNAc)3 and thus retained exclusive β1-2 GlcNAc hydrolysis. Recombinant pGXStrH8 enzyme, from the most truncated clone, at 0.025 units/ml and wild-type β-N-acetylglucosaminidase at 0.012 units/ml, hydrolyzed the biantennary alditol to completion (Fig. 5 a). Only the β1-2-linked GlcNAc residues were removed from a triantennary glycan (Fig. 5 b). However, only partial hydrolysis of a tetraantennary alditol was observed using the recombinant pGXStrH8 enzyme compared with wild-type enzyme (Fig. 5 c), and no GlcNAcs were removed from either bisected-biantennary or bisected-hybrid oligosaccharide substrates (Figs. 5, d and e). Concentrations of 0.25 units/ml of recombinant pGXStrH8 enzyme were required to hydrolyze completely the susceptible β1-2-linked GlcNAc of tetraantennary structures, but even at 45 units/ml, bisected oligosaccharides remained refractory to digestion. By contrast, the recombinant enzymes from the longest and intermediate clones pGXStrH17 and pGXStrH7, respectively, at low concentrations (0.025 units/ml) showed the same specificity as the wild-type enzyme, but only partially hydrolyzed those oligosaccharides with a bisecting GlcNAc.
At a concentration of 1.5 units/ml, recombinant enzyme from pGXStrH17 exhibited the same activity as the wild-type enzyme when assayed using both bisected-biantennary and bisected-hybrid substrates. At higher enzyme concentrations (5 units/ml), this recombinant enzyme removed both terminal GlcNAc residues and the bisecting GlcNAc from bisected-biantennary substrates, an activity also achievable at high concentrations (0.1 units/ml) of the wild-type enzyme. The recombinant enzyme from clone pGXStrH7 exhibited partial activity against the bisected substrates at the highest enzyme concentration tested (17 units/ml). Thus the recombinant enzyme with the greatest number of amino acids, pGXStrH17 (Fig. 2, 1067 amino acids including the fusion tag, see Table III), shared the same aglycon specificity as the wild-type β-N-acetylglucosaminidase, whereas the smallest length clone, pGXStrH8 (612 amino acids, Table III), was completely unable to hydrolyze β1-2-linked GlcNAc residues from bisected oligosaccharide alditols at extremely high enzyme concentrations (the maximum tested was 120 units/ml of factor Xa-cleaved enzyme). The intermediate length clone pGXStrH7 (641 amino acids, see Fig. 2) was partially active in hydrolyzing bisected substrates. A summary of these results is shown in Fig. 6.
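
The construct lengths quoted above include the glutathione S-transferase tag, which the kinetic table legend gives as 232 amino acids. A short calculation separates the StrH-derived portion of each fusion protein. This is a sketch using only figures stated in the paper; the variable names are illustrative, and pGXStrH7 is omitted because the text does not say whether its quoted length includes the tag.

```python
GST_TAG_AA = 232          # fusion tag length from the kinetic table legend
FULL_LENGTH_AA = 1311     # translated strH product

# Fusion protein lengths as quoted in the text (tag included).
fusion_lengths = {"pGXStrH17": 1067, "pGXStrH8": 612}

# Residues contributed by strH once the tag is subtracted.
strh_portion = {clone: total - GST_TAG_AA
                for clone, total in fusion_lengths.items()}

# Share of the full-length protein represented by each clone.
coverage = {clone: residues / FULL_LENGTH_AA
            for clone, residues in strh_portion.items()}

for clone, residues in strh_portion.items():
    print(clone, residues, f"{coverage[clone]:.0%}")
```

On these figures the longest clone carries 835 StrH residues and the shortest 380, so neither represents the complete 1311-residue protein.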


Figure 5: Aglycon specificity of β-N-acetylglucosaminidase from S. pneumoniae. Both substrate and hydrolysis product elution positions are shown for each oligosaccharide. N, GlcNAc; M, Man. The ³H-labeled oligosaccharide alditol substrates (2.5 10cpm, ) were incubated with 12.5 milliunits/ml purified wild-type β-N-acetylglucosaminidase () and 25 milliunits/ml recombinant enzyme from clone pGXStrH8 () in 50 mM citric acid/sodium phosphate buffer, pH 5.0, containing 1 mg/ml bovine serum albumin at 37 °C for 18 h. Samples were desalted, and the hydrolysis products were separated by HPAEC using an eluant of 200 mM NaOH. Fractions (1 ml) were collected and scintillation counted for radioactivity.




Figure 6: Oligosaccharide structures hydrolyzed by S. pneumoniae β-N-acetylglucosaminidase. A summary of the data presented on the aglycon specificity of wild-type and recombinant β-N-acetylglucosaminidase. Solid arrows, complete hydrolysis at appropriate enzyme concentration; dashed arrows, partial hydrolysis; crossed arrows, no hydrolysis at all enzyme concentrations tested. See text and Fig. 5 for full details. N, GlcNAc; M, Man.



Recombinant enzymes that had the glutathione S-transferase fusion tag removed by cleavage with factor Xa were checked for activity against the five substrates used in Fig. 5 to confirm that the fusion tag did not affect the activity of the recombinant fusion proteins assayed above. At the same enzyme concentration, the factor Xa-cleaved recombinant enzymes showed identical aglycon specificities to the uncleaved enzymes (results not shown), demonstrating that the fusion tag did not change the substrate specificity of the recombinant enzymes.


DISCUSSION

The peculiar aglycon specificity exhibited by β-N-acetylglucosaminidase from S. pneumoniae not only makes it a useful tool for oligosaccharide sequencing, it also makes it an interesting candidate for the investigation of substrate-enzyme interactions. At low enzyme concentrations, this enzyme will only hydrolyze β1-2-linked GlcNAc residues and is restricted by further N-acetylglucosamine substitutions of the α1-6 mannose arm of N-glycans and by bisecting GlcNAc residues of the core mannose. Little is known about the factors that might govern this restricted activity, though it might partly be explained by the potential steric hindrance created by the flexibility of 1-6-linked sugars, which have the additional angle of rotation. However, because these restricted β1-2-linkages are cleaved by higher concentrations of β-N-acetylglucosaminidase as well as by other hexosaminidases, the intrinsic properties of the enzyme itself, which correlate with this narrow substrate specificity, remain to be clearly defined. In this paper we have shown that S. pneumoniae β-N-acetylglucosaminidase is encoded by a unique 3933-bp gene, strH, which terminates in an ochre stop codon and possesses typical Gram-positive bacterial transcription initiation and termination sequences. The computer-translated amino acid sequence revealed two consensus motifs, one at the amino terminus and one at the carboxyl terminus, that are common to other Gram-positive surface proteins. A tandem repeat was identified in the strH gene, and within each repeat lies a stretch of 30 amino acids homologous to sequences in six other hexosaminidases found in a wide variety of species, suggesting that these amino acids may be important for the catalytic function of the enzyme. The tandem repeat regions of protein may be correlated with the ability of the clones to hydrolyze substrate.
Substrate specificity experiments revealed that the longest enzyme clone, pGXStrH17, was able to hydrolyze β1-2-linked GlcNAc residues of bisected oligosaccharides similar to the wild-type enzyme. The intermediate sized cloned enzyme from pGXStrH7 demonstrated only a partial ability to hydrolyze these substrates, whereas the enzyme from the shortest clone, pGXStrH8, was completely deficient in this activity. At the protein level, the differences between these clones are clearly defined by the presence or absence of the second tandem repeat region. Recombinant enzyme from pGXStrH17 contains both tandem repeats; pGXStrH7 contains all of the first and 95 amino acids (of 330) of the second tandem repeat region; and pGXStrH8 contains only the first tandem repeat. Sequence alignment revealed that these tandem repeat regions each contain a 30-amino acid consensus sequence present in six other hexosaminidases from divergent species, which leads us to speculate that this region may be part of an extended site of the enzyme, which is required for substrate orientation around the active site. In situations where this portion is partially missing, as with the shortest clone, efficient association with some substrates may be affected and hydrolysis impaired. Further insights into the function of these tandem repeat regions may be gained by the separate expression of the second tandem repeat region followed by experiments addressing the substrate specificity. However, it is noteworthy that expression cloning failed to identify a clone that only contained the second repeat region and possibly indicates that the presence of this domain alone is not sufficient to maintain catalytic activity. The weaker affinity and increased rate of hydrolysis of biantennary oligosaccharides, together with changes in aglycon specificity, correlate with an amino acid-specific extension of the polypeptide carboxyl terminus.
This suggests that there is either a minimum amino acid residue requirement for active site conformation, or a requirement for protein domains outside the active site to recognize and direct substrates into the enzyme's catalytic center.

A better understanding of the cellular location of the Streptococcal β-N-acetylglucosaminidase can be derived from the sequence information of the gene. The identification of a characteristic Streptococcal secretion signal at the amino terminus and a carboxyl-terminal sorting motif found in a variety of Gram-positive cell-surface proteins predicts a membrane-bound enzyme with an active site extending into the extracellular space. This structural model explains the origin of the β-N-acetylglucosaminidase purified from the media. Several lines of evidence support the hypothesis that this form of the enzyme originates from the cell surface and is released during protease-assisted autolysis. First, the cell-associated enzyme was found to be present in much greater quantities than the enzyme found in the medium over an 8-h growth period (3). Purification of the wild-type enzyme from Streptococcal cells revealed multiple amino termini, and the presence of a number of isoforms when examined by native-polyacrylamide gel electrophoresis (3). The major protein and activity stained band of this preparation analyzed by SDS-polyacrylamide gel electrophoresis migrates with a molecular mass of 120 kDa, 24 kDa smaller than the 144-kDa translated gene, providing further evidence that proteolysis, additional to signal peptide cleavage, occurs during the release of the cell-associated enzyme.

Second, only a single gene was obtained from expression cloning. If more than one β-N-acetylglucosaminidase was expressed by this organism, then more than one gene might have been isolated by enzyme activity. Although 32 clones were selected by their β-N-acetylglucosaminidase activity, further analysis of five of these selected at random revealed that they were overlapping fragments of the same gene.
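
How strongly five randomly chosen clones argue against a second gene can be gauged with a simple sampling calculation. The sketch below is not part of the original analysis. It assumes the five clones were drawn without replacement from the 32 active clones, and the number of clones attributed to a second gene is a purely hypothetical parameter.

```python
from math import comb

TOTAL_CLONES = 32   # clones selected by activity (from the text)
SEQUENCED = 5       # clones analyzed further, chosen at random (from the text)


def prob_all_from_one_gene(second_gene_clones: int) -> float:
    """Probability that none of the sequenced clones derives from a second gene,
    given a hypothetical number of second-gene clones among the 32."""
    first_gene_clones = TOTAL_CLONES - second_gene_clones
    return comb(first_gene_clones, SEQUENCED) / comb(TOTAL_CLONES, SEQUENCED)


for k in (4, 8, 16):  # hypothetical counts of second-gene clones
    print(f"{k} second-gene clones: P(all five from one gene) = "
          f"{prob_all_from_one_gene(k):.3f}")
```

Under these assumptions, the sampling argues most strongly against a second gene that contributes a large share of the active clones.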

This evidence, together with the substrate specificity data (3), strongly implies that both forms of the β-N-acetylglucosaminidase are derived from the same protein located at the cell surface.

The contribution to substrate configuration made by amino acid residues outside the active site in glycohydrolytic enzymes is a largely unexplored phenomenon. The availability of truncated enzymes that have altered substrate specificities as described here provides a unique opportunity to study the mechanism by which glycosidases bind and hydrolyze complex oligosaccharides.

  
Table: Aligned consensus sequences at the amino and carboxy termini of proteins from Streptococcus and Staphylococcus species

β-GlcNAc'ase is the abbreviation for β-N-acetylglucosaminidase. Residues in other Streptococcal and Staphylococcal proteins identical to β-N-acetylglucosaminidase are in boldface and conservative amino acid substitutions are underlined.


  
Table: Hexosaminidase consensus sequence comparison between six different species

Boldface residues indicate perfect matching with the S. pneumoniae β-N-acetylglucosaminidase. Underlined residues indicate conservative amino acid substitutions relative to the S. pneumoniae β-N-acetylglucosaminidase.


  
Table: Kinetic analysis of wild-type and recombinant S. pneumoniae β-N-acetylglucosaminidase

A desialylated, degalactosylated biantennary glycan was incubated with recombinant enzymes from clones pGXStrH8 and -17, the corresponding factor Xa-cleaved enzymes, StrH8 and -17, or wild-type β-N-acetylglucosaminidase. Kinetic measurements were made as described in the text. The length of the pGX constructs includes 232 amino acids from the 26-kDa glutathione S-transferase fusion protein tag.
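
The tag contribution noted above can be checked with simple arithmetic. The following is a minimal sketch, assuming a mean residue mass of 110 Da (the value implied by 144,210 Da for 1311 residues) and assuming that factor Xa cleavage removes all 232 tag residues. The function names and the example fusion length are hypothetical.

```python
MEAN_RESIDUE_MASS_DA = 110.0  # assumption: 144,210 Da / 1311 residues
GST_TAG_LENGTH_AA = 232       # tag residues included in the pGX construct lengths


def tag_mass_kda() -> float:
    """Estimated mass of the glutathione S-transferase tag in kDa."""
    return GST_TAG_LENGTH_AA * MEAN_RESIDUE_MASS_DA / 1000.0


def enzyme_length_after_cleavage(fusion_length_aa: int) -> int:
    """Length of the enzyme once the tag is removed (assumes complete removal)."""
    if fusion_length_aa <= GST_TAG_LENGTH_AA:
        raise ValueError("fusion must be longer than the tag")
    return fusion_length_aa - GST_TAG_LENGTH_AA


print(f"estimated tag mass: {tag_mass_kda():.1f} kDa")
# 1000 is a hypothetical fusion length, not a value from the table.
print(enzyme_length_after_cleavage(1000))
```

The estimate of about 25.5 kDa is in line with the 26-kDa figure quoted for the tag.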



FOOTNOTES

*
The Glycobiology Institute is supported by Monsanto/Searle Company. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank/EMBL Data Bank with accession number(s) L36923.

§
To whom correspondence should be addressed. Tel.: 44-865-275725; Fax: 44-865-275216.

The abbreviations used are: kb, kilobase pair(s); StrH, S. pneumoniae hexosaminidase; 4-MU-GlcNAc, 4-methylumbelliferyl N-acetylglucosaminide; bp, base pair(s); HPAEC, high performance anion-exchange chromatography.


ACKNOWLEDGEMENTS

We thank Professor Raymond Dwek for support. We also thank Ned Siegel and Christine Smith, Monsanto Co. St. Louis, for providing amino-terminal amino acid analysis.


REFERENCES
  1. Hughes, R. C., and Jeanloz, R. W. (1964) Biochemistry 3, 1543-1548
  2. Glasgow, L. R., Paulson, J. C., and Hill, R. L. (1977) J. Biol. Chem. 252, 8615-8623
  3. Clarke, V. A., Willenbrock, F. W., Scudder, P., Rotsaert, J., Jacob, G. S., and Butters, T. D. (1994) Int. J. Biochrom. 1, 151-158
  4. Krivan, H. C., Roberts, D. D., and Ginsburg, V. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 6157-6161
  5. Linder, T. E., Daniels, R. L., Lim, D. J., and Demaria, T. F. (1994) Microb. Pathog. 16, 435-441
  6. Boulnois, G. J. (1992) J. Gen. Microbiol. 138, 249-259
  7. Yamashita, K., Ohkura, T., Yoshima, H., and Kobata, A. (1981) Biochem. Biophys. Res. Commun. 100, 226-232
  8. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
  9. Southern, E. (1975) J. Mol. Biol. 98, 503-517
  10. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
  11. Chomczynski, P., and Sacchi, N. (1987) Anal. Biochem. 162, 156-159
  12. Smith, D. B., and Johnson, K. S. (1988) Gene (Amst.) 67, 31-40
  13. Butters, T. D., Scudder, P., Rotsaert, J., Petursson, S., Fleet, G. W. J., Willenbrock, F. W., and Jacob, G. S. (1991) Biochem. J. 279, 189-195
  14. Robbins, P. W., Overbye, K., Albright, C., Benfield, B., and Pero, J. (1992) Gene (Amst.) 111, 69-76
  15. Goward, C. R., Scawen, M. D., Murphy, J. P., and Atkinson, T. (1993) Trends Biochem. Sci. 18, 136-140
  16. Abrahamsen, L., Moks, T., Nilsson, B., Hellman, U., and Uhlen, M. (1985) EMBO J. 4, 3901-3906
  17. von Heijne, G. (1984) J. Mol. Biol. 173, 243-251
  18. Schneewind, O., Model, P., and Fischetti, V. A. (1992) Cell 70, 267-281
  19. Braun, V., and Sieglin, U. (1970) Eur. J. Biochem. 13, 336-346
  20. Michel, J. L., Madoff, L. C., Olson, K., Kling, D. E., Kasper, D. L., and Ausubel, F. M. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 10060-10064
  21. Jerlstroem, P. G., Chhatwal, G. S., and Timmis, K. N. (1991) Mol. Microbiol. 5, 843-849
  22. Haanes, E. J., and Cleary, P. P. (1989) J. Bacteriol. 171, 6397-6408
  23. Frithz, E., Heden, L. O., and Lindahl, G. (1989) Mol. Microbiol. 3, 1111-1119
  24. Smith, H. E., Vecht, U., Gielkens, A. L., and Smits, M. A. (1992) Infect. Immun. 60, 2361-2367
  25. Signäs, C., Raucci, G., Jönsson, K., Lindgren, P. E., Anantharamaiah, G. M., Höök, M., and Lindberg, M. (1989) Proc. Natl. Acad. Sci. U. S. A. 86, 699-703
  26. Loefdahl, S., Guss, B., Uhlen, M., and Lindberg, M. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 697-701
  27. Uhlen, M., Guss, B., Nilsson, B., Gatenbeck, S., Philipson, L., and Lindberg, M. (1984) J. Biol. Chem. 259, 1695-1702
  28. Farrell, A. M., Foster, T. J., and Holland, K. T. (1993) J. Gen. Microbiol. 139, 267-277
  29. Soto-Gil, R. W., and Zyskind, J. W. (1989) J. Biol. Chem. 264, 14778-14783
  30. Somerville, C. C., and Colwell, R. R. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 6751-6755
  31. Myerowitz, R., Piekarz, R., Neufeld, E. F., Shows, T. B., Jr., and Suzuki, K. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 7830-7834
  32. Beccari, T., Hoade, J., Orlacchio, A., and Stirling, J. L. (1992) Biochem. J. 285, 593-596
  33. Graham, T. R., Zassenhaus, H. P., and Kaplan, A. (1988) J. Biol. Chem. 263, 16823-16829
  34. Cannon, R. D., Niimi, K., Jenkinson, H. F., and Shepherd, M. G. (1994) J. Bacteriol. 176, 2640-2647
