(Received for publication, November 8, 1995; and in revised form, December 28, 1995)
From the
N-Linked glycosylation is a common form of protein processing that can profoundly affect protein expression, structure, and function. N-Linked glycosylation generally occurs at the sequon Asn-X-Ser/Thr, where X is any amino acid except Pro. To assess the impact of the X amino acid on core glycosylation, rabies virus glycoprotein variants were generated by site-directed mutagenesis with each of the 20 common amino acids substituted at the X position of an Asn-X-Ser sequon. The efficiency of core glycosylation at the sequon in each variant was quantified in a rabbit reticulocyte lysate cell-free translation system supplemented with canine pancreas microsomes. The presence of Pro at the X position completely blocked core glycosylation, whereas Trp, Asp, Glu, and Leu were associated with inefficient core glycosylation. The other variants were more efficiently glycosylated, and several were fully glycosylated. These findings demonstrate that the X amino acid is an important determinant of N-linked core-glycosylation efficiency.
One of the most common types of protein modification is N-linked glycosylation, in which oligosaccharides are added to specific Asn residues (1, 2). N-Linked glycosylation plays a critical role in the expression of most cell-surface and secreted proteins and is often required for protein stability, antigenicity, and biological function (1, 3, 4, 5, 6). The effects of N-linked glycosylation often depend on the number and position of N-linked oligosaccharides added to a protein chain (5, 7, 8, 9, 10, 11). This is determined during core glycosylation, in which the oligosaccharide Glc3Man9GlcNAc2 is transferred to a protein by the enzyme oligosaccharyltransferase (2, 12, 13). Oligosaccharyltransferase is integral to the endoplasmic reticulum membrane, and the active site of the enzyme resides near the endoplasmic reticulum membrane on the lumenal side (13, 14, 15). Core glycosylation usually occurs co-translationally as the glycosylation site on a nascent protein enters the endoplasmic reticulum lumen (14, 16, 17, 18).
Despite the importance of N-linked glycosylation, little is known about the protein signals that control the efficiency of oligosaccharide addition at specific Asn residues. N-Linked glycosylation generally occurs at the sequon Asn-X-Ser or Asn-X-Thr, where X is any amino acid except proline (Asn-X-Ser/Thr) (15, 19, 20). However, because many Asn-X-Ser/Thr sequons in proteins are glycosylated inefficiently (8, 21, 22, 23, 24, 25, 26) or not at all (20, 27, 28), other protein signals must also control this process.
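The sequon rule stated above (Asn-X-Ser/Thr, with X not Pro) can be expressed as a simple pattern search. The following Python sketch is illustrative only and is not part of the methods used in this study; the function name `find_sequons` and the example peptide are hypothetical. A match identifies only a candidate sequon and says nothing about whether it is actually glycosylated.

```python
import re

# Asn-X-Ser/Thr in one-letter code, where X is any residue except Pro.
# The lookahead makes the match zero-width, so overlapping sequons
# (for example "NNTS") are each reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein: str) -> list[tuple[int, str]]:
    """Return (1-based Asn position, tripeptide) for each candidate sequon."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein.upper())]


# Hypothetical peptide for illustration only (not the RGP sequence).
example = "MKNLSAANPTGGNCSNNTS"
print(find_sequons(example))
# [(3, 'NLS'), (13, 'NCS'), (16, 'NNT'), (17, 'NTS')]
```

The tripeptide NPT at position 8 of the example is skipped because Pro occupies the X position.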
We have used rabies virus glycoprotein (RGP) as a model system to study the regulation of N-linked core glycosylation (8, 29, 30). Using a rabbit reticulocyte lysate cell-free translation system supplemented with canine pancreas microsomes, we can examine the effects of specific amino acid substitutions on the core-glycosylation efficiency (CGE) of individual sequons in RGP (29). Our results in the cell-free system are similar to those obtained when RGP variants are expressed in transfected Chinese hamster ovary cells (8, 29). In this report we examine the impact of the X amino acid on CGE. To do this we generated a set of RGP variants by site-directed mutagenesis in which each of the 20 common amino acids was substituted at the X position of the sequon Asn37-Leu38-Ser39. We then quantified the CGE at the sequon in each variant using the cell-free system described. Our results demonstrate that the amino acid at the X position is an important determinant of CGE.
Figure 2: RGP variants with amino acid substitutions at the X position of sequon 1 were generated by oligonucleotide cassette mutagenesis. A, the DNA and amino acid sequences near sequon 1 in RGP(1- -) are shown; the sequon is underlined. B, a cloning vector for cassette mutagenesis, pRGP(1- -)ES, was generated from pRGP(1- -) by replacing the sequence for amino acids 32-46 with a novel sequence (uppercase letters) containing EcoRV and SacI restriction sites (underlined). C, for cassette mutagenesis the cloning vector was digested with EcoRV and SacI restriction enzymes. D, an oligonucleotide duplex was generated for cassette mutagenesis. The oligonucleotides in the duplex are complementary to one another except at the bases corresponding to the X amino acid in sequon 1 (XXX); both oligonucleotides are fully degenerate at that position. E, ligation of the duplex (uppercase letters) into the cloning vector (lowercase letters) restores the amino acid coding sequence of RGP(1- -), except at the position corresponding to the X amino acid in sequon 1 (underlined). At that position each plasmid encodes one of the 20 common amino acids.
Figure 1: Structure of RGP and RGP variants. The extracellular domain of RGP(WT) contains three sequons at Asn37, Asn247, and Asn319. In RGP(1- -) the sequons at Asn247 and Asn319 were deleted by site-directed mutagenesis in which the Thr residue in each sequon was replaced with Ala. RGP(1- -) contains a single sequon with the sequence Asn37-Leu38-Ser39 (sequon 1). RGP(1- -)X38 variants were derived from RGP(1- -) by oligonucleotide cassette mutagenesis; each variant contains one of the 20 common amino acids at the X position of sequon 1. Black bar, sequon present; white bar, sequon deleted; stippled bar, transmembrane domain; hatched bar, cytoplasmic tail.
To simplify construction of RGP variants with amino acid substitutions at the X position of sequon 1, pRGP(1- -) was further modified to generate a cloning vector for cassette mutagenesis. This involved the introduction of unique EcoRV and SacI restriction sites on either side of sequon 1 (Fig. 2 and "Materials and Methods"). An oligonucleotide cassette mutagenesis approach was used to generate a set of plasmids encoding variants of RGP(1- -) with each of the 20 common amino acids at the X position of sequon 1 (collectively referred to as RGP(1- -)X38 variants) (Fig. 1 and "Materials and Methods"). A variant with Leu at position 38 (corresponding to the sequence normally present in RGP) was among the variants isolated using that approach. DNA sequencing was performed to confirm that each RGP(1- -)X38 plasmid encoded a protein identical to RGP(1- -) except for the amino acid at position 38.
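To illustrate what full degeneracy at the X codon implies, the Python sketch below enumerates the 64 possible codons under the standard genetic code and counts how many encode each amino acid. It is a hypothetical aid, not part of the reported procedure: the names `CODON_TABLE` and `codons_per_residue` are introduced here for illustration, and equal representation of all 64 codons in the duplex pool is an assumption.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code listed in TCAG order (first base varies slowest,
# third base fastest); '*' marks the three stop codons.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_per_residue() -> Counter:
    """Count how many of the 64 NNN codons encode each residue (or stop)."""
    return Counter(CODON_TABLE.values())


counts = codons_per_residue()
sense = {aa: n for aa, n in counts.items() if aa != "*"}
print(len(sense), "amino acids from", sum(sense.values()), "sense codons")
print("Leu:", counts["L"], "Trp:", counts["W"], "stop:", counts["*"])
```

Under these assumptions some residues (Leu, Ser, Arg) are encoded by six codons while others (Trp, Met) are encoded by one, and three codons are stops, which is one reason each isolated plasmid must be identified individually by sequencing.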
RGP variants with each of the 20 common amino acids at the X position of sequon 1 were translated in parallel in the cell-free system described. Translation of the variants in the absence of microsomes confirmed that the amino acid substitutions at the X position did not alter the electrophoretic migration of the proteins (data not shown). The variants were then translated in the presence of microsomes to examine the effect of the X amino acid on core glycosylation. The amount of microsomes added to each reaction was optimized to maximize incorporation of RGP into microsomes while maintaining adequate translational activity. Under these conditions each translation reaction contains a small amount of protein that is not targeted to microsomes(8, 30) . These untargeted proteins are not glycosylated, retain the 19-amino acid N-terminal signal sequence, and migrate between the nonglycosylated and glycosylated forms of RGP synthesized on microsomes in our gel system(8, 30) . Because these untargeted proteins can interfere with the quantification of CGE, they were removed from translation reactions by proteinase K digestion prior to gel analysis(36) . The extracellular domain of RGP variants is translocated into the microsomal lumen during protein synthesis where it is protected from proteinase K digestion. In contrast, the 44-amino acid cytoplasmic tail (Fig. 1) remains outside of the microsome and is removed by this treatment. Removal of the cytoplasmic tail produces a small shift in the electrophoretic mobility of RGP proteins (data not shown) but does not interfere with the quantification of CGE. Following proteinase K treatment, radiolabeled translation products were analyzed directly (without immunoprecipitation) by gel electrophoresis and autoradiography. A gel autoradiograph showing the translation products of all 20 RGP variants is shown in Fig. 3. 
The positions of the nonglycosylated protein (N) and the protein glycosylated with a single core oligosaccharide (G) are shown. The total amount of protein produced in each translation can vary from tube to tube, reflecting differences in the amount of RNA in each sample. For this reason, glycosylation efficiency is determined by comparing the amounts of glycosylated and nonglycosylated protein produced in a single reaction for each variant.
Figure 3: Core glycosylation of RGP variants in the cell-free system. RNA encoding each RGP variant was generated by in vitro transcription and expressed in the cell-free system in the presence of canine pancreas microsomes and [35S]methionine. Translation products were treated with proteinase K and analyzed by SDS-polyacrylamide gel electrophoresis and autoradiography. The amino acid at the X position of sequon 1 in each variant is indicated. The migration positions of the nonglycosylated protein (N) and the protein glycosylated with a single core oligosaccharide (G) are indicated.
To quantify the CGE at the sequon in each RGP variant, the variants were expressed in the cell-free system in three independent experiments, and autoradiographs from each experiment were analyzed by densitometric scanning. The densities of bands representing glycosylated (G) and nonglycosylated (N) proteins were quantified for each variant, and the CGE was calculated as follows: CGE = G/(N + G) × 100% (30). The mean CGE ± 1 S.D. was then determined for each variant (Fig. 4). This analysis revealed that the CGE observed for each RGP variant was highly reproducible in this system.
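The calculation described above can be sketched in Python as follows. This is a minimal illustration rather than the analysis software used in this study: the function names and the band densities are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not state which form of S.D. was computed.

```python
from statistics import mean, stdev


def cge(glycosylated: float, nonglycosylated: float) -> float:
    """Core-glycosylation efficiency in percent: G / (N + G) x 100."""
    total = glycosylated + nonglycosylated
    if total <= 0:
        raise ValueError("band densities must sum to a positive value")
    return 100.0 * glycosylated / total


def summarize(experiments: list[tuple[float, float]]) -> tuple[float, float]:
    """Return (mean CGE, sample S.D.) over independent experiments.

    Each tuple holds the (G, N) band densities from one experiment.
    """
    values = [cge(g, n) for g, n in experiments]
    return mean(values), stdev(values)


# Hypothetical (G, N) densities from three experiments for one variant;
# these are not data from this report.
densities = [(43.0, 57.0), (40.0, 60.0), (46.0, 54.0)]
mean_cge, sd_cge = summarize(densities)
print(f"CGE = {mean_cge:.0f} ± {sd_cge:.0f}%")
```

Because the CGE is a ratio of two bands within the same lane, it is insensitive to lane-to-lane differences in the total amount of protein synthesized.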
Figure 4: CGE of RGP variants with amino acid substitutions at the X position of sequon 1. The 20 RGP variants generated by cassette mutagenesis were analyzed in the cell-free system as described for Fig. 3 in three independent experiments. Gel autoradiographs from each experiment were exposed in the linear range and analyzed by densitometric scanning. The CGE of each variant was calculated as described in the text for each experiment, and the mean CGE ± 1 S.D. from the three experiments was determined (shown).
The experiments presented demonstrate that the amino acid at the X position of an Asn-X-Ser sequon can have a profound effect on CGE. These studies confirm that the presence of Pro at the X position completely blocks glycosylation (15, 20, 27, 37). Also, consistent with our findings from earlier studies (8, 29, 30), these data demonstrate that the sequon Asn37-Leu38-Ser39 is glycosylated at an intermediate level (mean CGE = 43%). Remarkably, we find that substitution of Leu38 with Trp, Asp, or Glu dramatically reduces the efficiency of core glycosylation (mean CGE = 5, 19, and 24%, respectively), whereas substitution of Leu38 with other amino acids increases CGE to varying degrees (mean CGE ranges from Phe = 70% to Ser = 97%). These results provide the first direct demonstration that amino acids at the X position of an Asn-X-Ser/Thr sequon can influence the efficiency of co-translational core glycosylation.
This report extends previous studies by providing the first comprehensive direct analysis of the impact of the X amino acids on CGE. We demonstrate that the CGE at an Asn-X-Ser sequon in RGP ranges from no glycosylation to full glycosylation, depending on which amino acid is present at the X position. This demonstrates that the X amino acid is an important determinant of CGE.
Because the structure and enzymatic mechanism of oligosaccharyltransferase are not well characterized, it is currently not possible to determine the mechanism by which individual amino acids influence core glycosylation. Several studies suggest that the spatial relationship of the Asn and Ser/Thr residues in a sequon may be critical for oligosaccharide transfer (37, 38, 39, 40, 41, 42, 43). Large hydrophobic amino acids (e.g. Trp, Leu, Phe, and Tyr) may inhibit core glycosylation by producing an unfavorable local protein conformation. In contrast, Gly, which is small and does not constrain protein conformation, is associated with efficient core glycosylation. Other factors also appear to be important. The negatively charged amino acids (Asp and Glu) inhibit glycosylation, whereas the positively charged amino acids (Lys, Arg, and His) are favorable. The charge of the X amino acid may influence the ability of oligosaccharyltransferase to bind simultaneously to the sequon and the negatively charged dolichol-PP-oligosaccharide precursor (41, 44). Interestingly, the X amino acids with hydroxy groups (Ser and Thr) and Cys are associated with highly efficient core glycosylation, whereas those with amide groups (Asn and Gln) are associated with suboptimal core glycosylation. Further characterization of oligosaccharyltransferase may help clarify the role that individual amino acids play in oligosaccharide addition.
The general nature of these findings is supported by studies comparing the X amino acid in glycosylated and nonglycosylated sequons in native glycoproteins. Those studies reveal that Cys, Trp (15), Asp (19), and Glu (27) are uncommon at the X position in core-glycosylated sequons. Studies of synthetic peptides in membrane preparations also find an inhibitory effect of Asp at the X position (41, 44). The current report provides direct confirmation that Trp, Asp, and Glu at the X position inhibit core glycosylation. Interestingly, the sequon Asn37-Cys38-Ser39 in RGP is fully core-glycosylated. The lack of core glycosylation at Asn-Cys-Ser/Thr sequons in other proteins may reflect the potential of certain Cys residues to participate in disulfide bonding (40, 45).
It is important to note that factors other than the X amino acid also influence core glycosylation. For example, the presence of Pro immediately following a sequon can inhibit core glycosylation (20, 37), and the presence of Thr rather than Ser at the hydroxy position favors efficient glycosylation (15, 29, 42). Our previous studies demonstrate that the inhibitory effect of Leu in the sequon Asn37-Leu38-Ser39 in RGP can be overcome by replacing Ser39 with Thr (29). Studies are currently under way to compare the impact of other X amino acids in Asn-X-Thr versus Asn-X-Ser sequons. Core glycosylation can also be affected by factors that influence the accessibility of a sequon to oligosaccharyltransferase, such as the position of the sequon in a protein (14, 20, 30, 46, 47) and the folding of the nascent protein chain (45, 48). Core glycosylation is clearly a complex process influenced by a variety of factors. Further characterization of the protein signals that regulate core glycosylation will enhance our understanding of glycoprotein expression and facilitate the design of novel recombinant glycoproteins for research and clinical applications.