(Received for publication, January 21, 1997, and in revised form, April 30, 1997)
From the Central Laboratories for Key Technology, Kirin Brewery Co., Ltd., 1-13-5, Fukuura, Kanazawa-ku, Yokohama 236, Japan
The consensus primary amino acid sequence for
mucin-type O-glycosylation sites has not been identified.
To determine the shortest motif sequence required for high level
mucin-type O-glycosylation, we prepared more than 100 synthetic peptides and assayed in vitro O-GalNAc transfer
to serine or threonine in these peptides using a bovine colostrum
UDP-N-acetylgalactosamine:polypeptide
N-acetylgalactosaminyl transferase (O-GalNAcT).
We chose the sequence PDAASAAP from human erythropoietin (hEPO) for
further systematic substitutions because it accepted GalNAc and was a
fairly simple sequence consisting only of four kinds of amino acids.
Several substitutions showed that threonine is ~40-fold better than
serine as the glycosylated amino acid and a proline at position +3 on
the C-terminal side is very important. To define the effect of proline
residues around the glycosylation site, we analyzed a series of
peptides containing one to three proline residues in a parent peptide
AAATAAA. The results clearly indicated that prolines at positions +1
and +3 had a positive effect. The O-GalNAc transfer level
of AAATPAP was increased approximately 90-fold from AAATAAA. The
deletion of amino acids from the N-terminal side of the glycosylation
site suggested that five amino acids from position −1 to +3 were
especially important for glycosylation. Moreover, the influence of all
20 amino acids at positions −1, +2, and +4 was analyzed. Uncharged amino acids were preferred at position −1,
and small or positively charged amino acids were preferred at position +2. No preference was
observed at position +4. We propose a mucin-type O-glycosylation motif,
XTPXP, which may be suitable as a signal for
protein O-glycosylation. The features observed in this
study also appear to be very useful for prediction of mucin-type
O-glycosylation sites in glycoproteins.
Glycosylation is an important post-translational modification of
proteins in eukaryotic cells, and is divided broadly into three
categories of N-linked, O-linked, and
glycosylphosphatidylinositol-anchored types (1). Mucin-type sugar
chains are the principal structure in various O-linked
oligosaccharides. The initial transfer of an
N-acetylgalactosamine (GalNAc)
from UDP-GalNAc to the hydroxyl group of serine or threonine is
catalyzed by UDP-GalNAc:polypeptide
1,O-N-acetylgalactosaminyl transferase
(O-GalNAcT; EC 2.4.1.41) in the cis-Golgi
compartment (2). To date, at least three isozymes for the enzyme have
been found by purification (3-8, 11) and cDNA cloning (7-13).
Although the consensus amino acid sequence Asn-Xaa-Ser/Thr (Xaa ≠
Pro) is well known in the case of N-glycosylation
sites, such a consensus primary amino acid sequence has not been
identified among the mucin-type O-glycosylation sites.
However, because mucin-type O-glycosylation occurs only at
specific serine or threonine residues, it seems that a certain primary
sequence or three-dimensional structure must be required for the
recognition by O-GalNAcT.
Proline residues are found with considerable frequency around
glycosylation sites and may play an important role in recognition by
O-GalNAcT. Statistical studies by Wilson et al.
(14), O'Connell et al. (15), and Hansen et al.
(16) suggested that prolines occur most commonly at positions −1 and +3
from the glycosylated residue (a number with − or + refers to the
position of an amino acid that is N-terminal or C-terminal, respectively, to the
glycosylated amino acid). Elhammer et al. (17) suggested
that all proline residues from −4 to +4 may be acceptable for
O-glycosylation. Pisano et al. (18) proposed four
O-glycosylation motifs from the identification of sites in
human glycophorin A. One of them, XPXX (at least
one X is a glycosylated threonine residue), indicates that
prolines at positions −1, −2, and +1 may be important. However, these
findings have not been fully supported by biochemical studies of the
effect of proline residues at each position around the
O-glycosylation sites.
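The position numbering used throughout (0 for the glycosylated residue, negative toward the N terminus, positive toward the C terminus) can be made concrete with a minimal sketch. The function names `relative_positions` and `proline_positions` are hypothetical helpers introduced only for illustration, and the 0-based `site_index` argument is an assumption of this sketch, not part of the original work.

```python
def relative_positions(peptide: str, site_index: int) -> dict:
    """Map relative position -> residue, with the glycosylated residue at 0.

    site_index is the 0-based index of the glycosylated Ser/Thr (an
    assumption of this sketch). Negative positions are N-terminal and
    positive positions are C-terminal to that residue.
    """
    if not 0 <= site_index < len(peptide):
        raise ValueError("site_index must point inside the peptide")
    return {i - site_index: aa for i, aa in enumerate(peptide)}


def proline_positions(peptide: str, site_index: int) -> list:
    """Return the relative positions occupied by proline, N to C terminus."""
    positions = relative_positions(peptide, site_index)
    return [pos for pos, aa in positions.items() if aa == "P"]


if __name__ == "__main__":
    # AAATPAP with the threonine (index 3) as the glycosylated residue
    print(proline_positions("AAATPAP", 3))
    # hEPO-derived PPDAASAAP with the serine (index 5) as the glycosylated residue
    print(proline_positions("PPDAASAAP", 5))
```

For AAATPAP this places the two prolines at +1 and +3, and for PPDAASAAP it places them at −5, −4, and +3.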
Some attempts to analyze the influence of the primary sequence using
synthetic peptides in vitro have been reported. This method
is useful for identifying sequences capable of being glycosylated and
for determining the efficiency of O-GalNAc transfer to a
certain sequence. By using synthetic peptides derived from bovine
myelin, Young et al. (19) showed that the sequences
containing TPPP are good substrates and suggested that a triproline
sequence C-terminal to threonine might be sufficient for mucin-type
O-glycosylation. However, it has also been reported that
peptides not containing the triproline sequence are substrates for
O-GalNAcT (15, 17, 20). Tabak's group analyzed in vitro GalNAc transfer
using more than 50 peptides derived from human von Willebrand factor (hVF) (15, 20).
Although they reported that a proline at position +3 is essential and a
second proline placed at −1, +1, +2, or +4 is particularly effective,
single substitutions from the native sequence are not enough to
understand the effect of proline residues in general (20). Similar
experiments using more than forty synthetic peptides derived from human
Muc1 mucin were reported by Nishimori et al. (21). Their
findings suggested that a proline at position +3 enhances the
glycosylation but is not sufficient, and that the minimal size of the peptide
required for efficient glycosylation is six amino acids from
−1 to +4, whereas Young et al. (19) described TPPP as the minimal
substrate.
These results show that the surrounding sequences of the O-glycosylated amino acid affect the GalNAc transfer remarkably; however, further experiments are needed to clarify the general requirements for mucin-type O-glycosylation.
Previously, most studies were complicated by the fact that no distinction was made between single and multiple glycosylations or between serine and threonine as the glycosylated residue. In this report we describe the features of the amino acid sequence around single mucin-type O-glycosylation sites by in vitro GalNAc transfer analysis using a series of synthetic peptides. Bovine colostrum O-GalNAcT was able to transfer GalNAc to a short peptide PPDAASAAP (the single serine is the glycosylated residue) derived from an inherent mucin-type O-glycosylation site of human erythropoietin (hEPO). Subsequent substitutions of the sequence showed the minimal peptide length required for high level GalNAc transfer, the preferred O-glycosylated amino acid, and two influential proline residues at positions +1 and +3. From the results observed in this study, we propose that XTPXP is an effective mucin-type O-glycosylation motif.
UDP-[3H]GalNAc (3.7 MBq/ml) and
Atomlight were purchased from NEN Life Science Products. UDP-GalNAc and
bovine submaxillary mucin were from Sigma. DEAE-Sephacel and
CNBr-activated Sepharose 4B were from Pharmacia Biotech Inc. AG1X-8
(100-200 mesh, Cl− form) was from Bio-Rad. C18 Vydac
columns were from Separations Group Inc.
Apomucin was prepared from bovine submaxillary mucin as described by Hagopian and Eylar (22).
The peptides were synthesized by the Fmoc (N-(9-fluorenyl)methoxycarbonyl) solid-phase method using an automated peptide synthesizer PS3 (Protein Technologies, Inc.) on a 0.1-mmol scale or purchased from Kurabo Co. Ltd. The crude peptides were purified by reverse-phase high performance liquid chromatography (HPLC) on a C18 Vydac column (10 × 250 mm) by elution with a water/acetonitrile gradient containing 0.1% trifluoroacetic acid. Each purified product displayed a single peak by reverse-phase HPLC analysis on a C18 Vydac column (4.6 × 250 mm).
To quantify peptide amounts and verify their amino acid content, the peptides were hydrolyzed under vapor phase HCl containing 1% phenol at 130 °C for 3 h, and the hydrolyzed materials were then analyzed by a JLC-300 amino acid analyzer (JEOL Ltd.). The purity of each peptide was analyzed on an API III Sciex electron ion spray mass spectrometer (Perkin-Elmer).
Purification of UDP-N-acetylgalactosamine:Polypeptide N-Acetylgalactosaminyl Transferase from Bovine Colostrum
O-GalNAcT was purified from 4.8 liters of bovine colostrum by DEAE-Sephacel column chromatography and two steps of apomucin affinity column chromatography essentially as described by Elhammer et al. (5). The enzyme activity of the purified fraction was determined by monitoring the transfer from UDP-[3H]GalNAc to an apomucin as described by Sugiura et al. (3).
In Vitro O-GalNAc Transfer Assay for Peptide Acceptors
The standard reaction mixture contained 50 mM imidazole-HCl
(pH 7.2), 10 mM MnCl2, 0.5% Triton X-100, 150 µM UDP-[3H]GalNAc (approximately 133,000 dpm), 2 mM synthetic peptide, and enzyme solution in a
final volume of 50 µl. The initial velocity of GalNAc transfer to
peptide was measured by incubating the mixture at 37 °C for 30 to
480 min, depending on the reactivity of the peptide. The reaction was
terminated by adding 50 µl of 250 mM EDTA. The
glycosylated peptide was separated from unreacted
UDP-[3H]GalNAc on a 1-ml AG1X-8 (Cl− form)
column with water as eluent. The 2.6-ml run-through fraction was
collected directly in a glass scintillation vial and supplemented with
Atomlight liquid scintillation mixture. The radioactivity was measured
by a SE-100 scintillation counter (Packard) for 2 min. Each assay was
calibrated with PPASTSAPG and/or AAATPAP as standard.
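The radioactivity measured in the run-through fraction can be related to the amount of GalNAc transferred through the specific activity of the donor in the standard mixture. The following is a minimal sketch under stated assumptions: the whole run-through fraction is counted, all glycosylated peptide is recovered in it, and the background subtraction is a hypothetical step not described in the text. Function and constant names are illustrative only.

```python
# Values taken from the standard reaction mixture described above.
UDP_GALNAC_UM = 150.0        # donor concentration, micromolar
REACTION_VOLUME_UL = 50.0    # final reaction volume, microliters
TOTAL_DPM = 133_000.0        # approximate radioactivity per reaction


def donor_pmol(conc_um: float = UDP_GALNAC_UM,
               volume_ul: float = REACTION_VOLUME_UL) -> float:
    """Total donor per reaction in pmol (1 uM equals 1 pmol/ul)."""
    return conc_um * volume_ul


def specific_activity(total_dpm: float = TOTAL_DPM) -> float:
    """Specific activity of UDP-[3H]GalNAc in dpm per pmol."""
    return total_dpm / donor_pmol()


def transferred_pmol(sample_dpm: float, background_dpm: float = 0.0) -> float:
    """Convert counted dpm into pmol of GalNAc transferred.

    background_dpm is a hypothetical no-acceptor blank (an assumption of
    this sketch); negative net counts are clamped to zero.
    """
    net_dpm = max(sample_dpm - background_dpm, 0.0)
    return net_dpm / specific_activity()


if __name__ == "__main__":
    print(f"donor per reaction: {donor_pmol():.0f} pmol")
    print(f"specific activity: {specific_activity():.1f} dpm/pmol")
    print(f"13,300 dpm corresponds to {transferred_pmol(13_300):.0f} pmol")
```

With the stated mixture, 150 µM donor in 50 µl is 7,500 pmol, so the specific activity is roughly 17.7 dpm/pmol, and a sample carrying one tenth of the input counts corresponds to about 750 pmol of transferred GalNAc.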
O-GalNAc transfer was assayed using synthetic
peptides derived from inherent mucin-type glycoproteins, porcine
submaxillary mucin (23), ovine submaxillary mucin (24), human
granulocyte-colony stimulating factor (hG-CSF) (25), bovine κ-casein
(26), and hEPO (27, 28). The peptide RTPPP derived from
bovine myelin, which is not a natural mucin-type glycoprotein (19), and
the peptide PPASTSAPG, which was predicted to be a good
substrate from statistical analysis of the mucin-type
O-glycosylation sites by Elhammer et al. (17),
were also assayed as substrates to allow comparison with
previous results. The results are summarized in Fig.
1A. The transfer of GalNAc was observed in
peptides from hG-CSF, bovine κ-casein, hEPO, bovine myelin, and the
sequence proposed by Elhammer et al., whereas no transfer was detected in
the two peptides from porcine and ovine submaxillary mucins. Because the peptide PPASTSAPG was the best substrate, the amount of
GalNAc transferred to each peptide is expressed relative to
this control peptide. The peptide PPDAASAAP from hEPO was
more highly glycosylated than other peptides derived from natural
mucin-type glycoproteins, and the amount of transferred GalNAc was 8%
of the control peptide.
Differences of the Glycosylation Efficiency between Serine and Threonine as a Glycosylated Amino Acid
The difference between Ser and Thr as a glycosylated amino acid was measured using the sequences from bovine myelin and hEPO. The glycosylation level of the peptide containing Thr was ~40-fold higher than that of the peptide containing Ser in both cases (Fig. 1B). This value is in good agreement with the previously published results for the purified and the recombinant bovine colostrum O-GalNAcT1, respectively (8, 17). Because the sequence PPDAATAAP showed higher glycosylation than PPASTSAPG, which showed the highest glycosylation in Fig. 1A, it is likely that the sequence surrounding the O-glycosylation site of hEPO contains characteristics suitable for glycosylation by the bovine colostrum enzyme.
Effect of Simplification in the hEPO Sequence
To find a
parent peptide for further investigation of the amino acid preference
at each position around the glycosylation site, the effect of
simplification in the hEPO sequence was analyzed (Fig. 1C).
All alanine substitutions and a deletion of Pro at position −4
decreased O-GalNAc transfer, with Pro at position +3 appearing
to be critical. This result suggested that certain proline residues
that are often observed around mucin-type O-glycosylation sites are very important. The most simplified peptide,
AAATAAA, was glycosylated at a low level and was chosen as the
parent peptide.
The effective positions
of a proline residue were analyzed, and the result is shown in Fig.
2A. Proline residues at positions +3 and
+1 increased the transfer level 33- and 7-fold, respectively, in comparison to the
parent peptide. A 3-fold increase was caused by Pro at
position −1; however, proline residues at other positions did not
promote glycosylation.
Subsequently, effects of a second proline were measured by adding it to
AAATAAP, which was the best sequence in Fig. 2A.
As shown in Fig. 2B, a second proline at position +1
acted synergistically with Pro at +3 and increased the transfer
level to 90-fold that of the parent peptide. While second prolines at positions −3 and
−2 gave a 2-fold increase over AAATAAP, other
second proline residues did not show any effects.
In addition, a series of peptides, containing a third proline which was
added to two prolines at position +1 and +3, were analyzed (Fig.
2C). No increase over AAATPAP was observed in
any case, and a marked decrease was caused by Pro at position −1.
The kinetic data of several representative peptides are shown in Table I. Substitution of either proline of the sequence AAATPAP by alanine decreased Vmax significantly, while Km values were comparable. Interestingly, the peptide AAATPPP, containing the third proline at position +2, has a Vmax higher than that of AAATPAP but also a higher Km. Hence, its catalytic efficiency is the same as that of AAATPAP. The GalNAc transfer to AAATAAA was measurable but too low to determine the kinetic parameters under the conditions used.
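The comparison of Vmax, Km, and catalytic efficiency can be illustrated with the Michaelis-Menten rate law. This is a minimal sketch only: the numerical values are hypothetical, in arbitrary units, and are not the values of Table I. They are chosen so that a variant with twice the Vmax and twice the Km of a reference peptide ends up with the same Vmax/Km ratio, which is the situation described qualitatively for AAATPPP versus AAATPAP.

```python
def initial_velocity(vmax: float, km: float, substrate: float) -> float:
    """Michaelis-Menten initial velocity: v = Vmax * [S] / (Km + [S])."""
    if km <= 0 or substrate < 0:
        raise ValueError("km must be positive and substrate non-negative")
    return vmax * substrate / (km + substrate)


def catalytic_efficiency(vmax: float, km: float) -> float:
    """Vmax/Km, the ratio compared between peptides in the text."""
    if km <= 0:
        raise ValueError("km must be positive")
    return vmax / km


if __name__ == "__main__":
    # Hypothetical parameters (arbitrary units), NOT the data of Table I.
    reference = {"vmax": 1.0, "km": 1.0}
    variant = {"vmax": 2.0, "km": 2.0}   # higher Vmax, but also higher Km

    for name, p in (("reference", reference), ("variant", variant)):
        eff = catalytic_efficiency(p["vmax"], p["km"])
        v = initial_velocity(p["vmax"], p["km"], substrate=2.0)
        print(f"{name}: Vmax/Km = {eff:.2f}, v at [S] = 2.0 is {v:.3f}")
```

In this sketch the two peptides share the same Vmax/Km even though their velocities at a fixed, non-limiting substrate concentration differ, which is why both parameters need to be reported.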
The positive effect caused by two proline residues at position +1 and +3 was analyzed in the case of serine as the glycosylated amino acid. As shown in Fig. 2D, these two proline residues were effective on the GalNAc transfer to serine, and the importance of proline at each position was similar to the result observed in the transfer to threonine as shown in Fig. 2 (A-C).
Deletion Analysis of N-terminal Amino Acids of the Glycosylation Site to Determine the Minimal Length
The influence of deletions in
the N-terminal amino acids was analyzed to determine the minimal
length required for high level GalNAc transfer. The deletion of
alanines at positions −3 and −2 reduced the glycosylation gradually
but did not abolish it (Fig. 3). However, the lack of Ala at
−1 caused a significant loss of transfer, suggesting that the amino
acid at position −1 is required for efficient
O-glycosylation. Because Young et al. (29)
suggested that acetylation (Ac) of the N-terminal amino group of the
glycosylated threonine might compensate for this deletion, we also
analyzed the influence of acetylation of the N-terminal amino group as well as amidation (Am) of the C-terminal carboxyl group. The results summarized in Fig. 3 showed no effect of acetylation in the
comparison of (Ac)-TPAP with TPAP. In contrast,
the peptides containing Ala at position −1, such as ATPAP
and ATPAP-(Am), showed approximately 10-fold higher
transfer of GalNAc than TPAP. Amidation of the C terminus
had no remarkable effect. These observations suggested that the minimal
length required for high level O-glycosylation is five amino
acids from −1 to +3.
Influences of Amino Acid Diversity at Positions −1, +2, and +4
The influences of amino acid diversity at positions −1 and +2,
at which proline did not show a significant effect (Fig. 2,
A-C), were analyzed. The glycosylation was affected
considerably by the change of amino acids at positions −1 and +2 (Fig.
4, A and B).
The bovine colostrum enzyme highly transferred GalNAc to peptides
containing Tyr, Ala, Trp, Phe, Thr, Ile, Ser, Gly, or Val, but not to
peptides containing Lys, Asp, Glu, or Arg at position −1 (Fig.
4A). Therefore, at position −1, it is likely that uncharged amino acids are acceptable whereas charged amino acids are unfavorable. It was noteworthy that Tyr, Trp, and Phe, which have a bulky
aromatic ring, were as favorable as Ala, Ser, and Gly, which have a
small side chain.
High level transfer was seen when the amino acid at +2 was Pro, Ala, Cys, Lys, Arg, Ser, or His, but low level transfer was detected when it was Asp, Asn, Phe, Trp, Tyr, or Gly (Fig. 4B). From these results, we presumed that amino acids with a positive charge or a small side chain are advantageous at position +2. An exception is Gly at position +2, which showed approximately 20% of the glycosylation of AAATPAP.
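The residue preferences listed in the text for positions −1 and +2 can be collected into a small lookup. This is a minimal sketch that encodes only the residues named above; residues the text does not mention are reported as "not listed" rather than guessed, and the function name `flank_preferences` and the 0-based `site_index` argument are hypothetical.

```python
# Residues named in the text for each position (from the Fig. 4A/4B summary).
MINUS1_HIGH = set("YAWFTISGV")   # Tyr, Ala, Trp, Phe, Thr, Ile, Ser, Gly, Val
MINUS1_LOW = set("KDER")         # Lys, Asp, Glu, Arg
PLUS2_HIGH = set("PACKRSH")      # Pro, Ala, Cys, Lys, Arg, Ser, His
PLUS2_LOW = set("DNFWYG")        # Asp, Asn, Phe, Trp, Tyr, Gly


def _classify(residue, high, low):
    """Label one residue against the high/low sets for a position."""
    if residue is None:
        return "absent"
    if residue in high:
        return "high"
    if residue in low:
        return "low"
    return "not listed"


def flank_preferences(peptide: str, site_index: int) -> dict:
    """Return the reported transfer level for the residues at -1 and +2.

    site_index is the 0-based index of the glycosylated residue
    (an assumption of this sketch).
    """
    minus1 = peptide[site_index - 1] if site_index >= 1 else None
    plus2 = peptide[site_index + 2] if site_index + 2 < len(peptide) else None
    return {
        -1: _classify(minus1, MINUS1_HIGH, MINUS1_LOW),
        2: _classify(plus2, PLUS2_HIGH, PLUS2_LOW),
    }


if __name__ == "__main__":
    print(flank_preferences("AAATPAP", 3))   # Ala at -1, Ala at +2
    print(flank_preferences("AAKTPGP", 3))   # Lys at -1, Gly at +2
```

The lookup is purely descriptive of the listed residues; it does not model magnitudes or interactions between positions.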
The influence of various amino acids at position +4 was assayed. In
contrast to the results for amino acids at positions −1 and +2, no
remarkable difference in the transfer was observed (Fig.
4C). However, histidine decreased the transfer level to 24%
of AAATPAP.
During the course of this study, several O-GalNAcTs were purified (3-8, 11) and cloned (7-13). The cDNA for the bovine colostrum O-GalNAcT used in this paper was cloned from bovine small intestine (7) and named O-GalNAcT1 cDNA (12). Corresponding cDNAs with 95% homology were also cloned from various sources including bovine placenta (8), porcine lung (9), and rat sublingual gland (10), suggesting that O-GalNAcT1 is common across species and organs. Besides the O-GalNAcT1 cDNA, cDNAs of O-GalNAcT2 and T3 have also been cloned from human gastric tumor cell line MKN45 (11) and human salivary gland (12), respectively. There is 45% homology among human O-GalNAcT1, T2, and T3. In contrast to O-GalNAcT1, O-GalNAcT2 was purified from a minor fraction containing 5-10% of total O-GalNAcT activity (11), and Northern blot analysis showed that O-GalNAcT3 is expressed only in some tissues such as pancreas and testis (12). These O-GalNAcTs are therefore expected to have different substrate specificities, and detailed analyses of O-GalNAcT2 and T3 will be necessary.
We report here the features of primary peptide sequences required for high level O-glycosylation by bovine colostrum O-GalNAcT1 in vitro. These results appear to be very useful for the improvement of mucin-type O-glycosylation efficiency and prediction of the glycosylation sites in glycoproteins. In summary, we propose several motifs for high level O-glycosylation: XTPXP, XTXXP, XTPXX, and XSPXP. They potentially correspond to single mucin-type O-glycosylation sites for O-GalNAcT1. XTPXP is the best motif among them.
The substitution analysis of the hEPO-derived sequence on GalNAc transfer suggested that mutations of the amino acid sequence around the inherent O-glycosylation site might improve the glycosylation efficiency of intact hEPO in mammalian cells. Indeed, Elliott et al. (30) reported that hEPO mutants of Thr-126 (PPDAATAAPLR) and Pro-127 (PPDAASPAPLR) were almost completely O-glycosylated in COS-1 cells, while wild type glycoprotein (PPDAASAAPLR) was not fully O-glycosylated. This indicates that the results from in vitro GalNAc transfer analysis using short peptides can be used to convert an incompletely O-glycosylated glycoprotein to a fully glycosylated glycoprotein.
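The four proposed motifs (XTPXP, XTXXP, XTPXX, and XSPXP) lend themselves to a simple sequence scan. The sketch below is one possible reading, not a validated predictor: X is treated as any residue, a residue must be present at every position from −1 to +3, and each site is labeled with the most specific motif it satisfies. The function names are hypothetical.

```python
from typing import List, Optional, Tuple


def classify_site(peptide: str, i: int) -> Optional[str]:
    """Return the proposed motif matched by the Ser/Thr at index i, if any."""
    residue = peptide[i]
    if residue not in ("S", "T"):
        return None
    # The motifs span positions -1 to +3, so all five residues must exist.
    if i < 1 or i + 3 >= len(peptide):
        return None
    pro_plus1 = peptide[i + 1] == "P"
    pro_plus3 = peptide[i + 3] == "P"
    if residue == "T":
        if pro_plus1 and pro_plus3:
            return "XTPXP"
        if pro_plus3:
            return "XTXXP"
        if pro_plus1:
            return "XTPXX"
        return None
    # Serine: only the double-proline motif is among those proposed.
    return "XSPXP" if pro_plus1 and pro_plus3 else None


def scan_motifs(peptide: str) -> List[Tuple[int, str]]:
    """List (0-based index, motif) for every Ser/Thr matching a motif."""
    hits = []
    for i in range(len(peptide)):
        motif = classify_site(peptide, i)
        if motif is not None:
            hits.append((i, motif))
    return hits


if __name__ == "__main__":
    for seq in ("PPDAASAAPLR", "PPDAATAAPLR", "PPDAASPAPLR", "AAATPAP"):
        print(seq, scan_motifs(seq))
```

Applied to the hEPO-derived sequences quoted in the text, the wild-type PPDAASAAPLR matches none of the four motifs, whereas the Thr-126 and Pro-127 variants match XTXXP and XSPXP, respectively, and AAATPAP matches XTPXP.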
Our results showed that bovine colostrum O-GalNAcT could
transfer GalNAc to serine at a low level (Figs. 1A and
2D), while the GalNAc transfer to threonine of this enzyme
was apparently higher than that to serine by 15- to 40-fold in the same
surrounding sequences (Figs. 1B and 2, A-D).
This preference for threonine over serine was in good agreement with the
previous results of O-GalNAcT1 using sequences from hEPO,
bovine myelin, and porcine submaxillary mucin (8, 17, 31). It was
interesting that the peptide derived from hEPO was more highly
glycosylated than peptides from bovine κ-casein and hG-CSF, even
though the sequence contained a serine as the glycosylated amino acid.
This observation suggested that the surrounding sequence of the hEPO
glycosylation site is more suitable than those of bovine κ-casein and
hG-CSF.
Surveys of the effective proline residues surrounding the glycosylation
site clearly showed that two specific proline residues at position +3
and +1 facilitate the glycosylation, but other prolines do not have
such significant roles (Fig. 2, A-C). These effects were
observed similarly when serine was selected as the glycosylated amino
acid in place of threonine (Fig. 2D). The kinetic studies
showed that the effect of these proline residues is on Vmax rather than Km (Table
I). This indicates that proline residues do not enhance the binding
energy between substrate and O-GalNAcT. The significance of
a proline at +3 in the surrounding sequence of hEPO was consistent with
those in hVF and hMuc1 mucin (15, 20, 21). However, the influence of
proline at −1 was not so significant, although statistical analyses by
some researchers have suggested that Pro at −1 and +3 might be
important (14-16). Although O'Connell et al. (20) reported
that a second proline placed at −1, +1, +2, or +4 is particularly
effective, our data indicated that second prolines at −1 and +2 have
little effect. The positive effects of Pro at −1 and +2 observed in
their results seem to be caused by the replacement of an unfavorable
amino acid with a favorable one, i.e. from Gly to Pro at
position +2 (Fig. 4B). The peptide AAPTPAP showed
an apparent decrease in GalNAc transfer relative to AAATPAP.
While the exact reason is not clear, it might result from an
unfavorable conformation caused by the introduction of the third proline at
position −1.
The minimal length required for the GalNAc transfer by bovine colostrum
O-GalNAcT was five amino acids from position −1 to +3.
While Young et al. (29) previously indicated, from the comparison of (Ac)-TPPP and
RTPPP, that an amino acid at position −1 might not be required for in vitro
glycosylation, our results clearly showed that Ala at −1 has a
significantly larger effect than N-terminal acetylation or Arg at
−1 (Figs. 3 and 4A). A deletion experiment of N-terminal
amino acids indicated that not only the amino acid at position −1 but
also those at −3 and −2 may affect the O-GalNAc transfer
(Fig. 3), although no significant difference in the glycosylation
level was observed on substitution of Ala with Pro at positions −3, −2, and
−1 (Fig. 2, A-C).
The in vitro glycosylation of peptides was greatly affected
by the substitution of amino acids at positions −1 and +2, and a clear
preference was observed at each position (Fig. 4, A and B). At position −1,
uncharged amino acids were good for the
glycosylation and the size of the side chain was not a factor. The
preferable amino acids at position +2 were those with a positive charge
or small side chain. The positive charge seemed to be very advantageous in this position, because the side chains of lysine, arginine, and
histidine are not so small. Glycine at +2 showed relatively low level
glycosylation, even though its side chain is apparently small. The
reason for this phenomenon may lie in the specific φ-ψ angles
adopted by glycine. In contrast to positions −1 and +2, no remarkable
difference in the glycosylation was observed on substitution of the
amino acid at position +4 (Fig. 4C), suggesting that this
position is not so important in the recognition by bovine colostrum
O-GalNAcT.
We proposed several motifs for mucin-type O-glycosylation by in vitro analysis using a series of systematic synthetic peptides. Because mucin-type O-glycans on glycoproteins have various biological and physicochemical roles, the creation of a new O-glycosylation site in a protein seems to be very useful to improve its delivery and stability. We are currently investigating whether these motifs can work as a signal for mucin-type O-glycosylation when they are introduced into a protein.
We acknowledge Koiwai Daily Products Co., Ltd. for providing bovine colostrum. We thank Dr. M. D. Feese for critical reading of the manuscript.