(Received for publication, June 21, 1995; and in revised form, December 6, 1995)
The elements that regulate O-glycosylation are poorly understood. We have developed a novel in vivo system to analyze the role of flanking sequence on the modification of a single well characterized O-glycosylation site derived from human von Willebrand factor (PHMAQVTVGPGL). A secreted chimeric reporter protein, containing the human von Willebrand factor sequence, an antibody recognition epitope, and a heart muscle kinase site, was engineered and expressed in COS7 and MCF-7 cells. Glycosylated and non-glycosylated forms of the immunoprecipitated reporter were resolved electrophoretically and their relative amounts quantitated. Using mutational analysis, we find that the glycosylation apparatus of COS7 cells can accommodate a broad range of changes in the flanking sequence without compromising glycosylation, but that the distribution of charged amino acids flanking the O-glycosylation site can have a profound influence on glycosylation, with position -1 relative to the glycosylation site being particularly sensitive. A combination of acidic residues at positions -1 and +3 almost completely eliminates glycosylation of the reporter in both COS7 and MCF-7 cells. The overall density of charged amino acids is less important, since substitution of acidic residues at positions -2, +1, and +2 had no effect on the level of glycosylation observed.
The acquisition of carbohydrate side chains in O-glycosidic linkage to either Thr or Ser has a profound structural impact on a polypeptide backbone and thus underlies many unique physicochemical properties of heavily O-glycosylated molecules such as mucin-glycoproteins(1). In addition, O-glycans function as ligands for receptors modulating such diverse actions as lymphocyte trafficking(2), sperm-egg binding(3), and tumor cell adhesion(4). The first committed step of O-glycosylation is catalyzed by UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase (ppGaNTase).
Since not all hydroxyamino acids are O-glycosylated, signals must exist to specify which Ser and Thr residues acquire O-glycans. From surveys of known O-glycosylation sites, it is apparent that charged residues are rarely found at positions flanking glycosylated Thr or Ser, while Pro and clusters of Ser and Thr often flank glycosylated residues(5, 6, 7). A large number of synthetic substrates have been examined, which confirm that the rate of GalNAc addition is sensitive to the sequence surrounding Ser or Thr in vitro(7, 8, 9, 10). However, despite intensive efforts, no consensus sequence has emerged(11, 12). This is due, in part, to the broad range of residues that the binding site of ppGaNTase can accommodate(7, 8, 13) and to the existence of multiple isoforms of ppGaNTase, which may have overlapping specificities(8, 9, 14, 15).
Recent methodological advances in the quantitation of carbohydrate present on a single Ser or Thr during solid phase protein sequencing have provided an assessment of the occupancy level of specific O-glycosylation sites of native proteins(11, 16, 17). It is not known whether the observed site heterogeneity reflects inter-animal variation or whether some sites are indeed variably occupied. Another limitation of this form of analysis is that it does not lend itself to the study of mutated forms of the glycoprotein.
In the present study we have examined a small chimeric reporter protein to determine the role played by flanking amino acids on in vivo O-glycosylation.
Figure 1: Schematic and amino acid sequence of the human von Willebrand factor chimeric reporter protein rHVF. The pro-chimeric protein contains the insulin secretory signal (INS) fused to a 12-amino acid domain from the human von Willebrand factor (rHVF) containing a single threonine glycosylation site (Galβ1,3GalNAcα1), followed in turn by a metal binding site (MBS), a bovine heart muscle kinase site (P), and the FLAG antibody recognition octapeptide. The secreted form of the protein is cleaved immediately prior to the Phe residue shown to the right of the arrow.
All molecular biological manipulations were carried out essentially as described(18). The expression vector pKN4, which encodes the chimeric reporter, was created as follows. Oligonucleotides KN4.1 (5'-CCCACCCGAGCCTTCGTTAACCCACATATGGCTCAAGTTACTGTGGGC-3') and KN4.2 (5'-CCGGAATTCCCAATGATGCATACAGGAGGCCTGGGCCCACAGTAACTTGAGCC-3') were annealed and extended. The reaction products were cut with AvaI and EcoRI and cloned into the vector pGIR199, which contains the coding region for the insulin secretory signal and was a generous gift of K. Drickamer(19). The product vector (pKN1) was then cut with NsiI and EcoRI, and the extended products of oligonucleotides KN4.3 (5'-GTTATGCATCATTGGCATCACCACGCAAGAAGAGCATCTCACGACGTGCATGACTACAAAGACGATG-3') and KN4.4 (5'-GGCGAATTCGTTGTAAAACGACTCATTTATCGTCATCGTCTTTGTAGTCATGC-3') were cut and cloned into these sites, generating pKN3. pKN3 was then cut with NheI, filled with Klenow, ligated to EcoRI linkers, and cut with EcoRI, and the linker-ligated fragment containing the reporter construct was cloned into the EcoRI site of the eukaryotic expression vector pcDL-SRα296 ((20); DNAX Research Institute of Molecular and Cellular Biology, Inc.) to create pKN4.
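The anneal-and-extend step for KN4.1 and KN4.2 can be checked in silico. The sketch below is only an illustration, not the authors' procedure: it finds the 3' overlap between the two oligonucleotides as printed above, builds the top strand of the extended product, and translates it. The function names are hypothetical, the space inside the printed KN4.2 sequence is assumed to be a typesetting artifact, and the reading frame (frame 0) is an inference from which frame yields the von Willebrand factor peptide.

```python
# Oligonucleotide sequences as printed in the text (internal whitespace removed;
# treating that whitespace as an artifact is an assumption).
KN4_1 = "CCCACCCGAGCCTTCGTTAACCCACATATGGCTCAAGTTACTGTGGGC"
KN4_2 = "CCGGAATTCCCAATGATGCATACAGGAGGCCTGGGCCCACAGTAA CTTGAGCC".replace(" ", "")

# Standard genetic code, built from the conventional TCAG-ordered amino acid string.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_overlap(top, bottom_rc):
    """Length of the longest suffix of `top` that is also a prefix of `bottom_rc`."""
    for n in range(min(len(top), len(bottom_rc)), 0, -1):
        if top.endswith(bottom_rc[:n]):
            return n
    return 0


def extend(top, bottom):
    """Top strand of the duplex formed by annealing the 3' ends and filling in."""
    bottom_rc = reverse_complement(bottom)
    n = longest_overlap(top, bottom_rc)
    if n == 0:
        raise ValueError("oligonucleotides do not anneal at their 3' ends")
    return top + bottom_rc[n:], n


def translate(seq, frame=0):
    """Translate complete codons of `seq`, starting at offset `frame`."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(frame, len(seq) - 2, 3))


product, overlap = extend(KN4_1, KN4_2)
print(overlap, len(product))
print(translate(product))
```

With the sequences as printed, the two oligonucleotides share a 19-nucleotide 3' overlap, and the frame-0 translation of the extended product contains the PHMAQVTVGPGL site quoted in the abstract.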
The PstI-KpnI fragment of pKN4 was cloned into PhagescriptSK m13 (Stratagene) to create m13SKN4. The dut⁻ ung⁻ strain RZ1032 was used to produce single-stranded uracil-containing m13SKN4 DNA for site-directed mutagenesis. Oligonucleotides for each of the single amino acid substitutions were degenerate and coded for either alanine and glutamic acid or proline and arginine. The oligonucleotide 5'-GGAAGATACTGTTGACGGGAAACG-3' is complementary to nucleotides 3465-3489 of pcDL-SRα296 and was used to screen by dideoxy sequencing for the desired mutants in m13SKN4, the PstI-KpnI fragments of which were then recloned into pcDL-SRα296. An infrared dye-labeled m13 -29 primer (LiCor) was used for confirming the sequence of double-stranded pKN4 mutant DNA on a LiCor automated sequencer.
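To illustrate how a single degenerate oligonucleotide can encode two alternative substitutions, the sketch below expands an IUPAC-degenerate codon into the residues it specifies. The example codons GMA and CSA are hypothetical choices made for illustration; the mutagenic oligonucleotide sequences actually used are not given in the text.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, built from the conventional TCAG-ordered amino acid string.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def encoded_residues(degenerate_codon):
    """Set of one-letter amino acids encoded by a degenerate codon."""
    options = [IUPAC[base] for base in degenerate_codon.upper()]
    return {CODON_TABLE["".join(codon)] for codon in product(*options)}


# Hypothetical degenerate codons (not taken from the paper):
print(sorted(encoded_residues("GMA")))  # GCA / GAA
print(sorted(encoded_residues("CSA")))  # CCA / CGA
```

A codon with a mixed second base (M = A or C, S = C or G) is enough to yield the Ala/Glu or Pro/Arg pair from one synthesis, so each mutagenesis reaction can produce two mutants that are then told apart by sequencing.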
For the pulse-chase analysis, COS7 cells transfected with either wild-type or mutant expression vector were pulsed 18 h post-transfection for 10 min with 200 µCi/ml [³⁵S]Met in 0.5 ml of Met-free Dulbecco's modified Eagle's medium, washed twice with phosphate-buffered saline, then incubated in 1 ml of complete Dulbecco's modified Eagle's medium containing 5% fetal bovine serum at 37 °C for periods ranging from 0 to 2 h. The cells were then washed twice with ice-cold phosphate-buffered saline and lysed in 0.5 ml of lysis buffer (1% Tween 20, 0.1% SDS, 50 mM Tris-Cl, pH 7.5, 100 mM NaCl, 1 mM phenylmethylsulfonyl fluoride). The lysates were spun at 14,000 × g for 10 min and diluted 2-fold, and the reporter was immunoprecipitated from the cleared lysate as described above.
Figure 2: The chimeric reporter protein rHVF is O-glycosylated normally during secretion from COS7 cells. Sialidase-digested material migrated more slowly than did acid-hydrolyzed rHVF (lane 4) due to incomplete enzymatic digestion, which may have been caused by the acetylation of a sialic acid residue or by a linkage to Gal or GalNAc that is a poor substrate. A labeled, purified mutant reporter protein containing a Thr → Gly substitution (lane 1) migrated similarly to the fully deglycosylated wild-type rHVF (lanes 5 and 8). In a mock transfection, no product the size of either the wild-type rHVF or the Thr → Gly mutant was observed (data not shown).
As an additional test for the presence of O-glycans, COS7 culture media containing the wild-type rHVF or Thr → Gly mutant proteins were ultrafiltered through a Centricon-30 column and incubated with 100 µl of jacalin-agarose, which recognizes oligosaccharides with the core structure Galβ1,3GalNAc in an α-anomeric linkage to Ser or Thr(22). Material failing to bind to the jacalin-agarose was recovered by immunoprecipitation, while the bound species were desorbed by elution with 0.1 M D-galactose, then immunoprecipitated and analyzed as described above.
Figure 3: Wild-type rHVF (lane 1), but not a Thr → Gly mutant protein (lane 4), is recognized by the lectin jacalin. Most rHVF bound to a jacalin-affinity matrix and was recovered by desorption with D-galactose (lane 3). A small quantity of rHVF failed to bind to the jacalin matrix (lane 2). In contrast, all of the Thr → Gly mutant failed to bind to jacalin (lane 5).
Figure 4: O-Glycosylation of the rHVF is insensitive to most changes in the sequence flanking the glycan linkage site. A and B, site-directed mutagenesis was used to introduce single amino acid changes in the sequence from -3 to +3 flanking the Thr in the rHVF. Each amino acid was replaced by either Ala, Glu, Pro, or Arg (except the +3 position, where the Ala change was not obtained). In addition, Thr was mutated to both Gly and Ser. The occupancy of each mutated glycosylation site was then examined following secretion from COS7 cells(17). All of the reporter proteins shown, except the Thr → Gly mutant, consisted mainly of glycosylated protein and exhibited a distinct shift in mobility following mild acid hydrolysis (data not shown). The overall level of glycosylation was most sensitive to substitutions of a charged residue at the -1 position and, to a lesser extent, to substitution of Arg at the +3 position or Ser for Thr, as indicated by the lower, less intense bands in those samples. The labeling intensities in different lanes do not necessarily reflect the relative abundance of each mutant protein, due to the low concentration of ATP in the labeling reactions and to experimental deviation in the amount of material recovered from separate transfections(17). The values determined for the fraction of each mutant protein glycosylated are shown in Table 1.
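The position numbering used for the scanning mutants can be made concrete with a short sketch. It maps offsets -3 to +3 onto the PHMAQVTVGPGL site quoted in the abstract and applies substitutions by offset. The helper names and the mutant labels (for example "-1E") are hypothetical conveniences. The generated panel is idealized: it skips substitutions identical to the wild-type residue and does not reproduce the set actually obtained (the +3 Ala change, for instance, was not recovered).

```python
VWF_SITE = "PHMAQVTVGPGL"            # site sequence quoted in the abstract
SITE_INDEX = VWF_SITE.index("T")     # the single Thr acceptor is position 0


def flanking_residues(seq=VWF_SITE, center=SITE_INDEX, span=3):
    """Map each offset from -span to +span (excluding 0) to its wild-type residue."""
    return {off: seq[center + off] for off in range(-span, span + 1) if off != 0}


def apply_substitutions(subs, seq=VWF_SITE, center=SITE_INDEX):
    """Return the sequence with residues replaced at the given offsets."""
    residues = list(seq)
    for off, aa in subs.items():
        residues[center + off] = aa
    return "".join(residues)


def single_site_panel(replacements="AEPR", span=3):
    """Idealized single-substitution panel, skipping wild-type identities."""
    panel = {}
    for off, wild_type in flanking_residues(span=span).items():
        for aa in replacements:
            if aa != wild_type:
                panel[f"{off:+d}{aa}"] = apply_substitutions({off: aa})
    return panel


print(flanking_residues())
print(apply_substitutions({0: "G"}))           # Thr -> Gly control
print(apply_substitutions({-1: "E", 3: "E"}))  # -1/+3 double Glu mutant
```

Under this numbering the wild-type flanking residues are Ala (-3), Gln (-2), Val (-1), Val (+1), Gly (+2), and Pro (+3).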
Figure 5: The influence of charge distribution, charge density, and the rate of secretion on O-glycosylation in vivo. A, changes in the -1 position to Glu were combined with a scanning Glu mutation at all other positions from -3 to +3 flanking Thr, and the proteins were expressed in COS7 cells and analyzed(17). The +3 position was the most sensitive to change in the single-site mutant background, while, in contrast, changing the -3 position to Glu restored a wild-type level of glycosylation. B, changes in the -2 position to Glu were combined with a scanning Glu mutation, then expressed and analyzed(17). All of these double mutants were glycosylated normally, as was a triple charge mutant comprising three of the four residues immediately surrounding Thr. C, COS7 cells grown at 23 °C produced more fully glycosylated mutant rHVF relative to cells grown at 37 °C.
To demonstrate that the effect of charged residues at positions -1 and +3 on O-glycosylation is not unique to the COS7 cell background, we compared the glycosylation of rHVF with that of the -1/+3E mutant in a human mammary carcinoma cell line, MCF-7. The occupancy of the wild-type glycosylation site was >95%, whereas the level for the -1/+3E mutant was 22% (Fig. 6).
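Percent occupancy of this kind is presumably derived from the relative amounts of the glycosylated and non-glycosylated bands. The sketch below shows that arithmetic under the assumption that occupancy is the glycosylated signal divided by the total signal; the text does not spell out the formula, and the band intensities in the example are hypothetical values, not measured data.

```python
def percent_occupancy(glycosylated, nonglycosylated):
    """Percent of reporter carrying the O-glycan, from two band intensities.

    Assumes occupancy = glycosylated / (glycosylated + nonglycosylated);
    this formula is an illustration, not taken from the paper.
    """
    if glycosylated < 0 or nonglycosylated < 0:
        raise ValueError("band intensities must be non-negative")
    total = glycosylated + nonglycosylated
    if total == 0:
        raise ValueError("no signal in either band")
    return 100.0 * glycosylated / total


# Hypothetical background-corrected intensities in arbitrary units.
print(percent_occupancy(22.0, 78.0))
```

Because the value is a ratio within a single lane, it is independent of differences in total labeling between lanes.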
Figure 6: The human mammary carcinoma cell line MCF-7(29) was transfected with plasmid DNA encoding the wild-type rHVF (lane 1) or the -1/+3E mutant (lane 2). The percent occupancy was >95% for the wild-type and 22% for the mutant protein.
Our findings indicate that although a wide range of flanking sequences is accommodated by the ppGaNTase, charged residues at specific positions severely impair O-glycosylation of single acceptor sites in vivo in at least two different cell backgrounds. Moreover, current studies with a diverse spectrum of naturally occurring single-site acceptors suggest that the importance of the -1 and +3 positions is widespread.