(Received for publication, May 31, 1995)
A UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase (GalNAc-transferase) from human placenta was purified to apparent homogeneity using a synthetic acceptor peptide as affinity ligand. The purified GalNAc-transferase migrated as a single band with an approximate molecular weight of 52,000 by reducing sodium dodecyl sulfate-polyacrylamide gel electrophoresis. Based on a partial amino acid sequence, the cDNA encoding the transferase was cloned and sequenced from a cDNA library of a human cancer cell line. The cDNA sequence has a 571-amino acid coding region indicating a protein of 64.7 kDa with a type II domain structure. The deduced protein sequence showed significant similarity to a recently cloned bovine polypeptide GalNAc-transferase (Homa, F. L., Hollanders, T., Lehman, D. J., Thomsen, D. R., and Elhammer, Å. P.(1993) J. Biol. Chem. 268, 12609-12616). A polymerase chain reaction construct was expressed in insect cells using a baculovirus vector. Northern analysis of eight human tissues differed clearly from that of the bovine GalNAc-transferase. Polymerase chain reaction cloning and sequencing of the human version of the bovine transferase are presented, and 98% similarity at the amino acid level was found. The data suggest that the purified human GalNAc-transferase is a novel member of a family of polypeptide GalNAc-transferases, and a nomenclature GalNAc-T1 and GalNAc-T2 is introduced to distinguish the members.
Mucin-type O-linked glycosylation is one of the
dominant forms of glycosylation of glycoproteins. Mucin-type
glycosylation is initiated by the addition of the monosaccharide N-acetylgalactosamine to the hydroxyl group of serine and
threonine residues (GalNAcα1-O-Ser/Thr). GalNAc O-glycosylation is found on a variety of glycoproteins but is
more prominent on high molecular weight secretory glycoproteins such as
mucins, where O-glycans may constitute up to 80% of the total mass. O-Linked glycosylation also appears to have a role in the
conformation and protease resistance of ``stalk regions'' of
membrane proteins necessary for correct exposure and accessibility of a
functional domain of membrane-bound protein (Jentoft, 1990; Sadler,
1984; Lis and Sharon, 1993; Varki, 1993). Recently, O-glycans
have been selectively implicated in carbohydrate-lectin cell adhesion
phenomena (Springer, 1994).
A polypeptide GalNAc-transferase initiating mucin-type glycosylation was recently purified
to apparent homogeneity from bovine colostrum and porcine submaxillary
glands (Elhammer and Kornfeld, 1986; Wang et al., 1992;
O'Connell et al., 1991). The purification strategy
relied on affinity chromatography using either ovine submaxillary
apomucin (acceptor substrate) or 5-mercury-UDP-GalNAc (donor
substrate). Using apomucin affinity chromatography, Homa et
al. (1993) succeeded in obtaining amino-terminal sequence
information of a bovine colostrum GalNAc-transferase sufficient for
cDNA cloning from a bovine small intestine library. More recently,
O'Connell et al. (1991) isolated the same
GalNAc-transferase from bovine colostrum by 5-mercury-UDP-GalNAc
affinity chromatography (Hagen et al., 1993).
One pertinent question has been whether one or several GalNAc-transferases are involved in the initiation of mucin-type O-linked glycosylation. The main evidence supporting multiple transferases has been the lack of activity toward serine substrates in purified transferase preparations (Wang et al., 1992; O'Connell et al., 1991). Elhammer et al. (1993) have, however, recently identified serine transferase activity, albeit apparently 35-fold lower than threonine transferase activity, in purified bovine colostrum transferase using the peptide sequence PPASSSAPG. Furthermore, Wang et al. (1993) showed both activities in a purified porcine GalNAc-transferase preparation. Further evidence for multiple transferases has been the identification of an apparently unique GalNAc-transferase activity associated with fetal and tumor tissue responsible for the synthesis of the oncofetal fibronectin epitope defined by monoclonal antibodies FDC6 and 5C10 (Matsuura et al., 1988, 1989). However, we have not been able to demonstrate GalNAc-transferase activity toward the reported fibronectin-derived acceptor peptide VTHPGY (Mandel et al., 1994). There is thus no clear evidence suggestive of more than one transferase to date.
Here we report the purification of a novel human placenta GalNAc-transferase using a defined synthetic acceptor peptide as affinity ligand. Based on partial amino acid sequences, the cDNA was cloned and sequenced. The human GalNAc-transferase cloned in the present paper possesses domains homologous with the bovine GalNAc-transferase and its 98% similar human counterpart. As will be described in detail in the accompanying paper, affinity chromatography with a defined synthetic peptide in fact separated two distinct transferase activities, in agreement with the potential existence of multiple GalNAc-transferases (Sørensen et al., 1995).
Placenta tissue stored frozen at -20 °C was
thawed 1-3 days at 4 °C and homogenized in a Waring blender
(Waring Products Division) in 3-fold deionized water twice with
intermittent centrifugation at 10,000 × g. The pellets
were collected and extracted overnight in buffer A containing 1.5%
Triton X-100. Supernatants obtained after centrifugation at
10,000 × g for 1 h were processed as follows.
EBHC23: 5'-AACGG(G/C/A/T)GA(G/A)GA(G/A)AA(G/A)GC(G/C/T)CA-3'.
EBHC24: 5'-AA(G/A)AA(G/A)AA(G/A)GA(C/T)IIICA(T/C)CA-3'.
EBHC25: 5'-C(C/G)ACGTA(G/T/A/C)GCCTC(C/T)TG(A/G)TT(A/G)AA-3'.
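The degeneracy of each primer, that is, the number of distinct oligonucleotides in the synthesized pool, follows directly from the bracketed alternatives in the sequences above. The following is a minimal sketch, not part of the original procedures; the function names are hypothetical, and it assumes that each inosine (I) counts as a single position rather than being expanded into four bases.

```python
import re
from math import prod

# One token is either a bracketed set of alternatives, e.g. "(G/A)",
# or a single base. "I" (inosine) is treated as one fixed position (assumption).
TOKEN = re.compile(r"\(([ACGTI](?:/[ACGTI])*)\)|([ACGTI])")

PRIMERS = {
    "EBHC23": "AACGG(G/C/A/T)GA(G/A)GA(G/A)AA(G/A)GC(G/C/T)CA",
    "EBHC24": "AA(G/A)AA(G/A)AA(G/A)GA(C/T)IIICA(T/C)CA",
    "EBHC25": "C(C/G)ACGTA(G/T/A/C)GCCTC(C/T)TG(A/G)TT(A/G)AA",
}


def parse_primer(primer):
    """Return a list with one set of allowed bases per primer position."""
    positions = []
    cursor = 0
    for match in TOKEN.finditer(primer):
        if match.start() != cursor:
            raise ValueError(f"unexpected character at index {cursor}")
        cursor = match.end()
        if match.group(1):
            positions.append(frozenset(match.group(1).split("/")))
        else:
            positions.append(frozenset(match.group(2)))
    if cursor != len(primer):
        raise ValueError(f"unexpected character at index {cursor}")
    return positions


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    return prod(len(bases) for bases in parse_primer(primer))


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, "length", len(parse_primer(seq)), "degeneracy", degeneracy(seq))
```

Under these assumptions the three pools contain 96, 32, and 64 sequences, respectively.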
Total RNA from
the gastric tumor cell line MKN45 was extracted by standard procedures
(Chomczynski and Sacchi, 1987), and polyadenylated RNA was purified
using a Promega poly(A) tract mRNA purification kit. cDNA was
synthesized using EBHC25 and Moloney murine leukemia virus-reverse
transcriptase. PCR was performed on 5 µl of the reverse
transcriptase reaction in a 25-µl solution containing 0.5
µM EBHC25 and EBHC24, 2 mM MgCl2, 50
mM KCl, 125 µM dNTP, and 2.5 units of Taq polymerase (Cetus) and subjected to the following thermocycle
conditions: 95 °C, 45 s; 43 °C, 5 s; 72 °C, 15 s, for 40
cycles. Generated products were digested with EcoRI and
ligated into the EcoRI site of pT7T3U19 (Pharmacia). Competent Escherichia coli cells, SURE (Stratagene), were transformed
with the ligated constructs and plated onto LB AMP TET isopropyl
1-thio-β-D-galactopyranoside/5-bromo-4-chloro-3-indolyl
β-D-galactoside plates for colorimetric selection.
Isolated white colonies were selected and sequenced. Plasmid from one
clone, TEB1, was selected and used as probe.
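As a quick check on the reaction conditions described above, the amounts of primer and dNTP in the 25-µl reaction and the total programmed hold time can be worked out from the stated values. This is an illustrative sketch only; the helper names are hypothetical, ramp times between temperatures are ignored, and the text does not say whether 125 µM refers to each dNTP or to the total.

```python
REACTION_VOLUME_UL = 25.0

# (temperature in °C, hold time in seconds) for one cycle, as stated in the text.
CYCLE_STEPS = [(95, 45), (43, 5), (72, 15)]
CYCLES = 40


def picomoles(concentration_um, volume_ul):
    """Amount in pmol: 1 µM x 1 µl = 1 pmol."""
    return concentration_um * volume_ul


def total_hold_seconds(steps, cycles):
    """Sum of programmed hold times over all cycles (ramp times excluded)."""
    return cycles * sum(seconds for _, seconds in steps)


if __name__ == "__main__":
    print("each primer:", picomoles(0.5, REACTION_VOLUME_UL), "pmol")
    print("dNTP:", picomoles(125.0, REACTION_VOLUME_UL), "pmol")
    seconds = total_hold_seconds(CYCLE_STEPS, CYCLES)
    print("programmed hold time:", seconds, "s =", round(seconds / 60, 1), "min")
```

The very short 5-s annealing step at 43 °C means that the programmed hold time is dominated by denaturation (45 of the 65 s per cycle).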
Figure 3: Nucleotide sequence and predicted amino acid sequence of the two cDNA clones 2782 and 5551. For further details, see ``Experimental Procedures.'' The amino acid sequence is shown in single-letter code. The hydrophobic segment representing the putative transmembrane domain is double underlined. Peptide sequences corresponding to the sequences shown in Table 2 are indicated by underlining. A line missing immediately under individual amino acids indicates discrepancies between the residues predicted from amino acid sequencing (upper case letter; x denotes not identifiable) and those derived from the nucleotide sequence.
Further binding to the Muc2 column could be achieved when a detergent exchange from Triton X-100 to n-octyl glucoside was performed. Detergent exchange was performed on S-Sepharose with the pass-through of the first Muc2 chromatography (step 4), and this resulted in more than 50% of the residual transferase activity binding to another Muc2 peptide column run in the n-octyl glucoside detergent. Importantly, however, the enzyme eluted in n-octyl glucoside was not active when the detergent was removed, in contrast to the enzyme activity eluted from the Triton X-100 Muc2 chromatography. Thus, n-octyl glucoside-purified enzyme could only be further purified on Mono S chromatography in the presence of detergent. Omission of detergent at this step resulted in a complete loss of enzyme activity in eluted fractions. Furthermore, S-12 gel filtration of the enzyme activity eluted from the Mono S column run in n-octyl glucoside resulted in a spread of activity throughout the eluate, suggesting tight interaction between protein and detergent (not shown). The purification and protein analyses reported were all performed on the enzyme purified by Muc2 chromatography run in Triton X-100 even though this constituted a minor fraction.
Figure 1:
Panel A, NaCl gradient elution of
GalNAc-transferase from Mono S cation exchange column (step 5). The
eluate from the Muc2 peptide column (step 4), diluted and pH-adjusted,
was applied to a cation exchange (Mono S PC 1.6/5) column. The column
was equilibrated in buffer F and washed with 1 ml of the same, followed
by elution with a gradient of 0-1 M NaCl over 30 min at
a flow rate of 50 µl/min. Elution was monitored at A (and 280 nm, not shown), and fractions of
50 µl were assayed for transferase activity using the Muc2 peptide
substrate. Fractions are 1 min, and fraction 14 starts at 79.4 min. Panel B, SDS-PAGE of fractions 14-20 stained by
Coomassie Blue. Prestained molecular weight markers (std) are
shown with molecular weights indicated in the margin (phosphorylase b, 106,000; serum albumin, 80,000; ovalbumin, 49,500; carbonic
anhydrase, 32,500; trypsin inhibitor, 27,500; lysozyme, 18,500).
Fractions 17 and 18 contained the peak of the transferase activity and
one major protein band with an apparent molecular weight of
52,000.
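The fraction timing and gradient parameters given in the legend to Fig. 1 can be related by simple arithmetic. The sketch below is illustrative only and uses hypothetical helper names; it assumes a linear gradient and contiguous 1-min fractions, and it does not attempt to assign a NaCl concentration to a fraction because the gradient start time is not stated.

```python
FLOW_UL_PER_MIN = 50.0       # flow rate from the legend
FRACTION_MIN = 1.0           # each fraction spans 1 min
GRADIENT_MIN = 30.0          # gradient duration
GRADIENT_SPAN_MM = 1000.0    # 0-1 M NaCl expressed in mM
REF_FRACTION = 14            # fraction 14 starts at 79.4 min
REF_START_MIN = 79.4


def fraction_start_min(fraction):
    """Start time of a fraction, assuming contiguous 1-min fractions."""
    return REF_START_MIN + (fraction - REF_FRACTION) * FRACTION_MIN


def fraction_volume_ul():
    """Volume collected per fraction."""
    return FLOW_UL_PER_MIN * FRACTION_MIN


def gradient_volume_ul():
    """Total volume over which the linear gradient is delivered."""
    return FLOW_UL_PER_MIN * GRADIENT_MIN


def gradient_slope_mm_per_fraction():
    """Increase in NaCl concentration across one fraction (linear gradient)."""
    return GRADIENT_SPAN_MM / GRADIENT_MIN * FRACTION_MIN


if __name__ == "__main__":
    for n in (17, 18):
        print("fraction", n, "starts at", round(fraction_start_min(n), 1), "min")
    print("fraction volume:", fraction_volume_ul(), "µl")
    print("gradient volume:", gradient_volume_ul(), "µl")
    print("slope:", round(gradient_slope_mm_per_fraction(), 1), "mM per fraction")
```

On these assumptions the two peak fractions (17 and 18) begin at 82.4 and 83.4 min, and each fraction spans roughly 33 mM of the NaCl gradient.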
Figure 2:
S-12
gel filtration chromatography of the GalNAc-transferase. Ten µl of
the peak fraction from a Mono S chromatography containing approximately
1 µg of protein with a specific activity of 0.76 unit/mg was
applied to an S-12 gel filtration column (S-12 3.2/30) and run in
phosphate-buffered saline at 40 µl/min. Elution was monitored at A (and 280 nm, not shown). Fractions (100
µl) were collected and assayed for transferase activity using the
Muc2 substrate as well as the hCG-β peptide. SDS-PAGE
analysis of fractions revealed a faint band with an apparent
molecular weight of 52,000 corresponding to the peak transferase
activity (not shown).
Two clones were selected for complete sequence analysis (clones 2782 and 5551). Clone 2782 contained 2.5 kb including 3'- and 5'-untranslated sequences (approximately 900 and 50 bp, respectively). Clone 5551 contained 2.3 kb including the 5' end of the coding region but was 100 bp short of the 3' end. The combined nucleotide sequence of the GalNAc-transferase cDNA clones is shown in Fig. 3. Overlapping parts of the two isolated clones were identical in sequence apart from one nucleotide ``insertion'' in the 3' end of clone 2782 (position 1534, Fig. 3). Since internal peptide sequences apparently were located out of frame past the 3' termination introduced by the insertion in clone 2782, the purified GalNAc-transferase was longer than predicted by this clone. The apparent insertion in clone 2782 was concluded to be an artifact in the library as clone 5551 lacked this insertion, and more importantly, RT-PCR of this area using a variety of total RNA sources including the MKN45 RNA only yielded the sequence identified in clone 5551 (not shown). Furthermore, genomic sequencing of cloned PCR products covering this position only yielded the sequence found in clone 5551. The remaining nine cDNA clones selected were partially sequenced, and all were found to contain 5' intron sequences and were limited to coding sequences in the 5'-TEB1 probe area.
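The reasoning about the single-nucleotide insertion can be illustrated with a toy example: one extra base shifts the reading frame downstream of the insertion and can bring a stop codon into frame prematurely. The sequence below is hypothetical and is not taken from clone 2782 or 5551; the function name is likewise an assumption made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon_index(seq):
    """Return the 0-based codon index of the first in-frame stop, or None."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


if __name__ == "__main__":
    # Hypothetical open reading frame: ATG GCT GAA ACC TAA
    wild_type = "ATGGCTGAAACCTAA"
    # Insert a single T after nucleotide 5: ATG GCT TGA ...
    with_insertion = wild_type[:5] + "T" + wild_type[5:]

    print("wild type stop at codon", first_stop_codon_index(wild_type))
    print("insertion stop at codon", first_stop_codon_index(with_insertion))
```

In this toy case the insertion moves the first in-frame stop from codon index 4 to codon index 2, so any peptide encoded beyond that point would fall outside the predicted reading frame, which mirrors the argument made for clone 2782.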
GalNAc-T2 is predicted to be a type II transmembrane protein with a strongly hydrophobic domain (amino acids 7-24) that is not flanked by charged amino acids (Prosis software, Hitachi).
As shown in Fig. 3, all of the peptide sequences obtained (Table 2) are accounted for in the coding sequence, with minor discrepancies at weakly assigned amino acid residues. In agreement with the proposed soluble nature of the isolated transferase protein, the amino-terminal sequence obtained from this protein is found at amino acid position 51, carboxyl-terminal to the putative transmembrane-anchoring domain. The predicted amino acid sequence of the human GalNAc-T2 has one consensus sequence (-Asn-Xaa-Ser/Thr-) for N-glycosylation (amino acid 516). N-Glycanase digestion of the soluble transferase did not alter SDS-PAGE mobility (not shown), and as the amino acid sequence of one of the A. lyticus protease I-released peptides included this site and gave positive identification of asparagine, it may be suggested that this site is not utilized or is only partially utilized. The molecular mass of the deduced amino acid sequence of the soluble transferase is 59,205 Da, which is slightly higher than the experimentally determined molecular weight but within the limits of accuracy of the SDS-PAGE and gel filtration systems used.
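A search for the N-glycosylation consensus sequence (-Asn-Xaa-Ser/Thr-) of the kind described above can be sketched as a simple pattern scan. The example is illustrative: the test sequence is hypothetical rather than the GalNAc-T2 sequence, the function name is an assumption, and the exclusion of proline at the Xaa position is a commonly applied refinement that the text itself does not state.

```python
import re

# Lookahead so that overlapping sequons are all reported.
# Xaa is taken as any residue except proline (assumption, see lead-in).
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein):
    """Return 1-based positions of Asn residues in Asn-Xaa-Ser/Thr motifs."""
    return [match.start() + 1 for match in SEQUON.finditer(protein)]


if __name__ == "__main__":
    # Hypothetical peptide: NGS and NVT are sequons, NPT is excluded.
    toy_sequence = "MKNGSANPTLNVTQ"
    print(find_sequons(toy_sequence))
```

A consensus site found this way only indicates potential glycosylation; as the N-glycanase and sequencing results above show, occupancy has to be established experimentally.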
Figure 4: Multiple sequence alignment analysis (DNASIS, Hitachi) of human (GalNAc-T2, top) and bovine (GalNAc-T1, bottom) GalNAc-transferases. Subscript letters indicate changes in the human version of GalNAc-T1 (bovine/human GalNAc-T1 similarity: 99% at the amino acid level and 95% at the nucleotide level).
Figure 5:
Northern blot analysis of MKN45. Twenty
µg of total RNA or 10 µg of poly(A)+ mRNA
isolated from the MKN45 cell line was probed with
32P-labeled PvuII/HindIII 2782 probe
fragment, TEB2.
Analysis of a human multiple tissue blot from Clontech showed hybridization to a 4.5-kb mRNA in all tissues (Fig. 6). Similar to the MKN45 cell line, several of the tissues also expressed the smaller size 2-3-kb mRNAs. The same Northern blot from Clontech was analyzed by Homa et al. (1993) using the bovine GalNAc-T1. A 4.2-kb mRNA was detected using the GalNAc-T1 probe in all tissues but kidney, and the level of expression differed significantly from that found in Fig. 6.
Figure 6:
Northern blot analysis of human tissues.
Multiple human tissue blot from Clontech as labeled was probed with 32P-labeled TEB2 probe.
Mucin-type GalNAc-transferase activity has previously been purified to apparent homogeneity from bovine colostrum and porcine submaxillary glands using either apomucin acceptor substrate chromatography or 5-mercury-UDP-GalNAc donor substrate chromatography as the principal affinity purification step (Elhammer and Kornfeld, 1986; Wang et al., 1992; O'Connell et al., 1991). Here we report the purification of a unique GalNAc-transferase from human placenta using an acceptor substrate chromatography that is based on a defined synthetic peptide derived from the human intestinal mucin Muc2 (Gum et al., 1989). The rationale for selecting this strategy was a hypothesis that multiple GalNAc-transferases would exist. Initially, we attempted to use acceptor peptides with few threonine or serine sites to limit the number of different substrate sites, but to date only the Muc2 sequence with multiple threonine residues yielded significant purification. As shown in the following paper, our affinity chromatography apparently resulted in the separation of at least two distinct GalNAc-transferase activities (Sørensen et al., 1995).
The identity of the
isolated putative GalNAc-transferase cDNA was established by functional
expression using the baculovirus system (Table 3). The cloned
GalNAc-transferase is predicted to be a type II transmembrane protein
in concordance with all other cloned mammalian glycosyltransferases
(Paulson and Colley, 1989; Kleene and Berger, 1993). The purified
enzyme using Triton X-100 in the affinity chromatography steps was
found to be soluble by gel filtration analysis, and this was confirmed
by comparing the amino-terminal amino acid sequence of the purified
protein with the coding region predicted from the cloned cDNA (Fig. 3). Purification of other glycosyltransferases using the
detergent Triton X-100 in the affinity chromatography step has also
resulted in selective isolation of the soluble forms of the enzymes
(Weinstein et al., 1987; Clausen et al., 1990; Sarkar et al., 1991). An exception to this is the purification of the
Galβ1-3GalNAc α2-3-sialyltransferase, where both
the membrane and soluble forms were isolated (Gillespie et
al., 1992). The soluble forms of glycosyltransferases like
GalNAc-T2 appear to be in the monomer form. Recently, the apparent
membrane form of the
β1-4-galactosyltransferase was purified
and found to be a high molecular weight, probably multimeric, complex
(Bendiak et al., 1993).
We found here that a detergent exchange to n-octyl glucoside prior to the final affinity chromatography (step 4) apparently resulted in purification of the membrane-bound form of the GalNAc-transferase. Initially, we purified the placenta transferase by this method because significantly more activity was recovered. However, validation of the purification was impaired by difficulties with further purification steps including ion exchange and/or gel filtration. Fractions isolated by SDS-PAGE with a molecular weight of approximately 60,000 were found to be amino-terminally blocked for sequencing, and gel filtration failed to yield distinct peaks (not shown). Because of the possibility of multiple copurified GalNAc-transferases, we chose to focus on the soluble enzyme. Leaving placenta tissue thawing for 3-4 days at 4 °C prior to extraction appeared to increase the relative amount of soluble enzyme.
Most of the glycosyltransferases characterized to
date have been found to be N-linked glycoproteins (Kleene and
Berger, 1993), although the
β1-2-N-acetylglucosaminyltransferases I and II
appear to be exceptions (Sarkar et al., 1991; Kumar et
al., 1990; Kleene and Berger, 1993). GalNAc-T1 has three N-linked glycosylation consensus sites (human GalNAc-T1 has
four), and some of these appear to be utilized (Homa et al.,
1993). GalNAc-T2 has one consensus site, but both N-glycanase
digestion (not shown) and amino acid sequencing (Table 2)
indicate that the soluble enzyme lacks N-linked glycosylation,
or at least that the site is only partially utilized.
The isolated
and cloned human GalNAc-transferase is predicted to be a novel member
of a family of polypeptide GalNAc-transferases. The similarity between
the human and bovine GalNAc-transferases (44% at the amino acid level)
is close to that found for the different members of the
sialyltransferase family (Wen et al., 1992). Species
similarities of glycosyltransferases at the amino acid level have
generally been found to be within 95-98% (Kleene and Berger,
1993), and here we show that the human counterpart of the bovine
GalNAc-transferase (Homa et al., 1993) was 99% similar at the
amino acid level (Fig. 4). We therefore suggest that a
nomenclature be introduced to identify the GalNAc-transferases:
GalNAc-T1 and GalNAc-T2, where the latter is the novel human
transferase cloned in the present paper. Comparison of the two amino
acid sequences shows two to three regions of high similarity (80%)
which could be targets for a PCR cloning strategy to identify
potentially additional members of this family of transferases. Such an
approach has been applied with success to the sialyltransferase family
comprising to date six members with a highly conserved 55-amino acid
segment in the putative catalytic domain of the transferases
(Livingston and Paulson, 1993). Our preliminary results indicate that
this strategy will be successful for GalNAc-transferases as well.
Homology among different glycosyltransferases is generally very
limited with the exception of members of the α1-3/4
fucosyltransferases (FUT 3, 5, 6) (Weston et al., 1992) and
the human blood group A/B transferases with the
α1-3 galactosyltransferase
(Joziasse et al., 1991). Drickamer (1993) showed that despite
lacking homology among three members of the sialyltransferase family,
cysteine residues were conserved. The present findings that 12 of 13
cysteine residues in GalNAc-T2 align with GalNAc-T1 and that these are
distributed throughout the proteins suggest that the enzymes have a
similar overall structure. Potential disulfide bonding is expected to
occur intramolecularly as evidenced by the molecular weight of
GalNAc-T2 estimated by gel filtration (Fig. 2). Interestingly,
GalNAc-transferase activity appears to be increased 2-3-fold in
the presence of reducing agents (Wang et al., 1992), possibly
suggesting an advantage of some ``opening'' of the predicted
globular catalytic domain for substrate accessibility and catalytic
activity at least as measured in vitro using peptide
substrates.
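The two comparisons used in this discussion, percentage of matching residues between aligned sequences and the number of cysteine residues that align, can be computed from a pairwise alignment as sketched below. This is an illustration only: the aligned strings are hypothetical and are not the GalNAc-T1/GalNAc-T2 alignment of Fig. 4, the function names are assumptions, the calculation counts strict identities (which is narrower than a similarity score that also credits conservative substitutions), and gap-containing columns are excluded from the denominator as one of several possible conventions.

```python
GAP = "-"


def percent_identity(aligned_a, aligned_b):
    """Percent of ungapped alignment columns with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != GAP and b != GAP]
    if not columns:
        return 0.0
    identical = sum(1 for a, b in columns if a == b)
    return 100.0 * identical / len(columns)


def conserved_cysteines(aligned_a, aligned_b):
    """Return (cysteines in A aligned to a cysteine in B, total cysteines in A)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    total = aligned_a.count("C")
    conserved = sum(1 for a, b in zip(aligned_a, aligned_b)
                    if a == "C" and b == "C")
    return conserved, total


if __name__ == "__main__":
    # Hypothetical aligned fragments with one gap column.
    seq_a = "MKC-AYC"
    seq_b = "MRCGAYS"
    print(round(percent_identity(seq_a, seq_b), 1), "% identity")
    print(conserved_cysteines(seq_a, seq_b), "conserved/total cysteines")
```

For the toy alignment, four of six ungapped columns are identical and one of the two cysteines in the first sequence aligns with a cysteine in the second.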
The finding that multiple GalNAc-transferases are involved in O-glycosylation is important for the understanding of this predominant post-translational modification. Detailed studies of the fine specificity of the individual transferase members are necessary for defining the role of each enzyme, and a preliminary study of this, presented in the accompanying paper (Sørensen et al., 1995), clearly shows that distinct differences in substrate specificity may be expected. Here we show that Northern blotting with our human cDNA sequence (Fig. 5) revealed a slightly larger mRNA and a different organ distribution than that published by Homa et al. (1993) for GalNAc-T1, although most organs express both transcripts. Differential cell/organ expression of different GalNAc-transferases may result in different GalNAc O-glycosylation processing between cells and species. This could have a significant impact on interpretations of the peptide specificity of O-glycosylation inferred from identified in vivo glycosylation (O'Connell et al., 1991; Elhammer et al., 1993) and may eventually result in the identification of more defined consensus sequences of the acceptor peptide sites of individual transferases.
In conclusion, the present data provide evidence that mucin-type GalNAc O-glycosylation is controlled by at least two distinctly different GalNAc-transferases that are expressed differentially in cells and organs. Further studies of the substrate specificity of these and potentially additional enzymes are required, but it may be anticipated that this understanding will be significant for our insight into O-glycosylation processing and of practical use, for example, for designing appropriate mammalian expression systems for recombinant glycoproteins in drug use.
The nucleotide sequences reported in this paper have been submitted to the GenBank(TM)/EMBL Data Bank with accession numbers X85018 (GalNAc-T1) and X85019 (GalNAc-T2).