(Received for publication, June 13, 1995)
c-Myc is a helix-loop-helix leucine zipper phosphoprotein that
heterodimerizes with Max and regulates gene transcription in cell
proliferation, cell differentiation, and programmed cell death.
Previously, we demonstrated that c-Myc is modified by O-linked N-acetylglucosamine (O-GlcNAc) within or near the
N-terminal transcriptional activation domain (Chou, T.-Y., Dang, C. V.,
and Hart, G. W. (1995) Proc. Natl. Acad. Sci. U.S.A. 92,
4417-4421). In this paper, we identified the O-GlcNAc
attachment site(s) on c-Myc. c-Myc purified from sf9 insect cells was
trypsinized, and its GlcNAc moieties were enzymatically labeled with
[³H]galactose. The [³H]galactose-labeled glycopeptides were isolated
by reverse phase high performance liquid chromatography and then
subjected to gas-phase sequencing, manual Edman degradation, and laser
desorption/ionization mass spectrometry. These analyses show that
threonine 58, an in vivo phosphorylation site in the
transactivation domain, is the major O-GlcNAc glycosylation
site of c-Myc. Mutation of threonine 58, frequently found in retroviral
v-Myc proteins and in human Burkitt and AIDS-related lymphomas, is
associated with enhanced transforming activity and tumorigenicity. The
reciprocal glycosylation and phosphorylation at this biologically
significant amino acid residue may play an important role in the
regulation of the functions of c-Myc.
c-Myc, the product of the c-myc protooncogene, is a
nuclear phosphoprotein of 439 amino acids that plays a critical role in
the regulation of gene transcription in normal and neoplastic cells.
Mutations of c-myc are associated with different types of
tumors in humans and other species (1). c-Myc has several
structural features conserved among many transcription factors. A basic
helix-loop-helix leucine zipper motif in the C-terminal region mediates
heterodimerization with Max (2) and DNA binding to a specific
E-box sequence, CACGTG or EMS (E-box myc site)(3) .
The N-terminal transcriptional activation domain (TAD) (amino acids 1-143) (4) has a proline-rich
element spanning amino acids 41 to 103 (1),
which contains several potential sites for O-linked N-acetylglucosamine (O-GlcNAc). The TAD is required
for neoplastic transformation (5), inhibition of cellular
differentiation (6), and induction of apoptosis (7) mediated by c-Myc.
c-Myc can be phosphorylated by casein kinase II (8), MAP kinase (9), or glycogen synthase kinase 3 (10). Phosphorylation at Thr-58 and/or Ser-62 in the TAD of c-Myc has been suggested to modulate transactivation (11) and cellular transformation (12) by c-Myc. Our previous study (13) showed that the TAD of c-Myc is also modified by O-GlcNAc, a form of protein glycosylation composed of a single monosaccharide, GlcNAc, linked to the side chain hydroxyl of serine or threonine (14). O-GlcNAc has been found almost exclusively in the nucleus and cytoplasm of eukaryotic cells. The known O-GlcNAc-bearing proteins share two common features: all of them are phosphoproteins, and all form reversible multimeric complexes. Although the role of O-GlcNAc in altering protein function remains unknown, experiments to date suggest that the O-GlcNAc modification is dynamic and has a reciprocal relationship with protein phosphorylation (15).
In this report, using a variety of analytical techniques on c-Myc overexpressed in insect cells, we provide evidence that O-GlcNAc occurs at c-Myc threonine 58, a known phosphorylation site and a frequently mutated hot spot in human lymphomas.
Immunological assays have shown that c-Myc protein levels are very low in
both normal and transformed cells (24). There are approximately 750 molecules of c-Myc per
cell in serum-starved fibroblasts. After serum stimulation, the number
increases to 6,300 per cell. HeLa cells, which are transformed, contain
97,000 molecules of c-Myc per cell. The levels of cellular c-Myc
polypeptide appear to be constant throughout the cell
cycle(25) . The low abundance of c-Myc and the inherent
limitation of the sensitivity of tritium labeling render the detection
of carbohydrate moieties on c-Myc and the subsequent mapping of
carbohydrate sites on c-Myc extremely difficult. In our previous
work (13), we developed sensitive methods to detect O-GlcNAc on in vitro translated c-Myc and identified
an N-terminal region of c-Myc modified by O-GlcNAc. We also
overexpressed the c-Myc protein in either sf9 insect cells or Chinese
hamster ovary cells and demonstrated that c-Myc expressed in these
cells is modified by O-GlcNAc. In the present study, we used
c-Myc overexpressed in sf9 insect cells to map the O-GlcNAc
addition sites of c-Myc. It is estimated that sf9 insect cells infected
with recombinant baculovirus Ac373/hc-myc (17) express more
than 1 million molecules of c-Myc per cell (data not shown), at
least 10-fold more than the number of c-Myc molecules in HeLa cells.
Studies have shown that sf9 insect cells, like vertebrate cells, are
able to process post-translational protein modifications, including O-GlcNAc(26, 27) . A recent study on human
cytomegalovirus tegument basic phosphoprotein indicated that the O-GlcNAc sites of the recombinant baculoviral protein
faithfully correspond to those of the native virion
protein(28) .
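The abundance comparison in this paragraph is simple arithmetic on the quoted copy numbers, and it can be checked with a minimal sketch. The dictionary keys and the helper name `fold_over` are illustrative choices, not anything from the original work, and the sf9 value is only the stated lower bound of 1 million molecules per cell.

```python
# Copy numbers per cell as quoted in the text; the sf9 entry is a lower
# bound ("more than 1 million"), so every ratio involving it is a minimum.
COPIES_PER_CELL = {
    "serum-starved fibroblasts": 750,
    "serum-stimulated fibroblasts": 6_300,
    "HeLa cells": 97_000,
    "infected sf9 cells (lower bound)": 1_000_000,
}


def fold_over(reference: str, sample: str) -> float:
    """Return how many times more c-Myc the sample has than the reference."""
    return COPIES_PER_CELL[sample] / COPIES_PER_CELL[reference]


if __name__ == "__main__":
    for name in COPIES_PER_CELL:
        ratio = fold_over("HeLa cells", name)
        print(f"{name}: {ratio:.3f} x HeLa")
```

With these numbers, the lower bound for infected sf9 cells works out to a little over 10 times the HeLa value, which is consistent with the "at least 10-fold" statement.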
c-Myc was purified from 5 liters of infected
sf9 cells, harvested at 2
10
cells/ml. During
the initial lysis procedure, 50 mM GlcNAc was added to the
buffers to partially inhibit the endogenous hexosaminidase activity.
Most of the overexpressed c-Myc protein appeared to be bound to DNA and
could only be extracted by 5 M urea or SDS(19) .
Protein samples from sequential steps of purification were subjected to
SDS-PAGE for silver staining and Western blot analysis (Fig.1).
Proteins co-electrophoresing with c-Myc comprised more than 95% of the
total after purification by Mono Q chromatography. When O-GlcNAc in this purified c-Myc preparation was labeled with
[³H]galactose by galactosyltransferase, c-Myc was
the major radioactive signal (13).
Figure 1: Purification of recombinant c-Myc from sf9 insect cells. Protein samples from sequential steps of the purification procedures were resolved on two identical 10% SDS-PAGE gels for silver staining (top panel) and Western blot analysis (bottom panel) as described under "Experimental Procedures." lane 1, low salt lysis, total lysate; lane 2, low salt lysis, supernatant; lane 3, low salt lysis, pellet; lane 4, high salt lysis, supernatant; lane 5, high salt lysis, pellet, 5 M urea extract; lane 6, after DEAE CL-6B column; lane 7, after Mono Q column. Molecular mass markers of 70 kDa (70) and 43 kDa (43) are shown on the left.
Purified c-Myc was first
trypsinized to gain better accessibility to O-GlcNAc residues
for subsequent galactosyltransferase labeling (20). The O-GlcNAc-modified glycopeptides, labeled with
[³H]galactose by galactosyltransferase, were
separated by RP-HPLC at pH 7.0 on a C18 column in the first dimension.
As illustrated in Fig. 2a, only one major radioactive
peak was detected, which contained 48 pmol of labeled
glycopeptide. The fraction corresponding to the radioactive peak
(fraction 65) was applied to a second dimension RP-HPLC at pH 2.0, and
the radioactivity eluted as a single peak (Fig. 2b).
When the fraction containing this radioactive peak was applied to a
third dimension RP-HPLC at pH 4.0, a single radioactive peak was eluted (Fig. 2c). The yield of radiolabeled glycopeptide after
these multiple HPLC analyses was 12.5% due to the
"sticky" nature of this glycopeptide. After the third round
of HPLC, the UV absorbance profile suggested that this peptide was
fairly homogeneous. Therefore, further fractionation (e.g. ion-exchange chromatography) was not performed (however, see
below).
Figure 2: Isolation of [³H]galactose-labeled tryptic glycopeptides.
Purified c-Myc was digested with trypsin, labeled with
galactosyltransferase, and separated by RP-HPLC as described under
"Experimental Procedures." a, first
dimension; b, second dimension; c,
third dimension. Top panels, absorbance profile of eluted
peptides; bottom panels, tritium profile of eluted peptides.
The straight line in the top panel represents the
acetonitrile gradient. %B is the percentage of 60% (v/v)
acetonitrile.
An aliquot containing 5.5 pmol of
[³H]galactose-labeled glycopeptide from the third
dimension RP-HPLC was subjected to gas-phase sequencing. The relative
abundance of amino acids recovered in each sequencing cycle
surprisingly suggested the presence of three major co-purified peptides (Fig. 3): DTHKSEIAHRFK(DLGEEHFK) (15 pmol), SFFAL(R) or
SFFAL(R)DQIPEL(ENNEK) (10 pmol), and FELLP(T)PPL(SPSR) (5
pmol). The two less abundant peptides were from c-Myc tryptic fragments
of amino acids 373-378, SFFALR (or 373-389,
SFFALRDQIPELENNEK), in the first helix (and loop) region and amino
acids 53-65, FELLPTPPLSPSR, in the transactivation domain. A
protein data bank search (BLAST, National Institutes of Health)
revealed that the most abundant peptide, DTHKSEIAHRFKDLGEEHFK, was a
tryptic fragment of bovine serum albumin. Fetal bovine serum (10%)
added to the insect cell culture medium is likely the source of this
contaminating peptide. Apparently, this serum-derived peptide either
exactly co-migrates with the c-Myc peptides or binds to them with high
affinity during HPLC purification. The presence of this contaminant is
surprising since we have typically found this iterative RP-HPLC method
to provide more than adequate purification of glycopeptides for
sequencing(20, 28) .
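The residue ranges quoted above for the two c-Myc tryptic peptides follow directly from their start positions and lengths. The sketch below checks that bookkeeping and lists the serine and threonine residues of each fragment in c-Myc numbering. The function names are hypothetical helpers introduced only for illustration; the sequences and start positions are taken from the text.

```python
def residue_range(start: int, sequence: str) -> tuple[int, int]:
    """Return (first, last) residue numbers for a peptide starting at `start`."""
    return start, start + len(sequence) - 1


def hydroxyl_residues(start: int, sequence: str) -> list[str]:
    """List Ser/Thr residues (potential O-GlcNAc or phosphate acceptors)
    using protein numbering, e.g. 'T58'."""
    return [
        f"{aa}{start + offset}"
        for offset, aa in enumerate(sequence)
        if aa in "ST"
    ]


# c-Myc tryptic fragments and their first residue numbers, as given in the text.
FRAGMENTS = {
    "SFFALR": 373,
    "SFFALRDQIPELENNEK": 373,
    "FELLPTPPLSPSR": 53,
}

if __name__ == "__main__":
    for seq, start in FRAGMENTS.items():
        first, last = residue_range(start, seq)
        sites = ", ".join(hydroxyl_residues(start, seq))
        print(f"{seq}: residues {first}-{last}; Ser/Thr: {sites}")
```

For FELLPTPPLSPSR this numbering places the only threonine at position 58 and the two serines at positions 62 and 64, matching the Thr-58 and Ser-62 sites discussed in the introduction.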
Figure 3: Amino acids and peptide sequences derived
from gas-phase sequencing. [³H]Galactose-labeled
tryptic glycopeptides from third dimension RP-HPLC were subjected to
gas-phase sequencing. Phenylthiohydantoin-amino acids released in each
cycle in order of abundance (high > low) are listed in the top
panel. Peptide sequences deduced from the
phenylthiohydantoin-amino acids released and tryptic maps of c-Myc and
bovine serum albumin (as described under "Results and
Discussion") are listed in the bottom panel.
Since the three major peptides detected by gas-phase sequencing contain serine or threonine at different positions, sequential manual Edman degradation, followed by scintillation counting to detect the released radiolabeled saccharide, provides an unambiguous assignment of the site of O-GlcNAc modification. The result of sequential manual Edman degradation (Fig. 4) indicates that the threonine residue in FELLPTPPLSPSR (threonine 58) is glycosylated: the radiolabel is released in the sixth cycle, and only FELLPTPPLSPSR has a threonine or serine at the sixth position. The assignment of O-GlcNAc glycosylation to threonine 58 is also supported by two other observations. First, the major radioactive peaks from the first, second, and third rounds of RP-HPLC eluted at 22.5, 26.8, and 26.0% acetonitrile, respectively, consistent with the predicted retention times for the peptide FELLPTPPLSPSR (41). Second, the gas-phase sequencing data showed that the amount of released phenylthiohydantoin-threonine was smaller than that of other internal residues of the peptide, F(5.4)E(1.5)L(4.1)L(2.4)P(2.7)T(<0.5)P(1.1)P(0.5)L(4.3)S(<0.5)P(<0.5)S(<0.5)R (repetitive yields are in parentheses following each amino acid), suggesting that threonine 58 has been modified. This low recovery of threonine 58 also rules out the possibility that the radioactivity released in the sixth sequencing cycle came from a small amount of contaminating peptide.
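The logic of the Edman assignment can be sketched as a lookup: for a given degradation cycle, which of the co-purified peptides presents a serine or threonine at that position? The snippet below is only an illustration of that reasoning, using the peptide sequences reported in the text; the labels and the function name `candidates_for_cycle` are assumptions made for this example.

```python
# Peptides detected by gas-phase sequencing, as listed in the text.
# Both possible forms of the SFFALR-containing fragment are included.
PEPTIDES = {
    "albumin DTHKSEIAHRFKDLGEEHFK": "DTHKSEIAHRFKDLGEEHFK",
    "c-Myc 373-378": "SFFALR",
    "c-Myc 373-389": "SFFALRDQIPELENNEK",
    "c-Myc 53-65": "FELLPTPPLSPSR",
}


def candidates_for_cycle(cycle: int) -> list[str]:
    """Return the peptides whose residue released in Edman cycle `cycle`
    (1-based) is Ser or Thr, i.e. could carry the labeled saccharide."""
    return [
        label
        for label, seq in PEPTIDES.items()
        if len(seq) >= cycle and seq[cycle - 1] in "ST"
    ]


if __name__ == "__main__":
    for cycle in range(1, 13):
        hits = candidates_for_cycle(cycle) or ["none"]
        print(f"cycle {cycle:2d}: {', '.join(hits)}")
```

Only the c-Myc 53-65 peptide has a hydroxy amino acid at cycle 6, so a release of label in that cycle points to its threonine, whereas label on the albumin peptide or on the SFFALR fragments would be expected in other cycles.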
Figure 4: Determination of the O-GlcNAc site on
a c-Myc tryptic glycopeptide by sequential manual Edman degradation.
[³H]Galactose-labeled tryptic glycopeptides from
third dimension RP-HPLC were subjected to manual sequential Edman
degradation. Counts released in each cycle (numbers) and
counts from the disc after all 12 cycles (DISC) were
plotted.
The conclusion that threonine 58 is the site
of O-GlcNAc was further confirmed by analyzing the peptides
by MALDI-MS (Fig. 5). A peak with m/z 1867.3
was assigned to the mass of FELLPTPPLSPSR (1453.71) plus
[³H]Galβ1-4GlcNAc (367.33) plus two
sodium (2 × 22.99) (total = 1867.02). We also noted a peak
with m/z 1493.9 corresponding to FELLPTPPLSPSR
(1453.71) plus sodium (22.99) plus H₂O (18.02) (total
= 1494.72) and a peak with m/z 1696.5
corresponding to FELLPTPPLSPSR (1453.71) plus GlcNAc (203.18) plus
sodium (22.99) plus H₂O (18.02) (total = 1697.9).
Since both glycosylated and non-glycosylated forms of c-Myc are present
in cells and the galactosyltransferase labeling is not completely
efficient (20), the co-existence of FELLPTPPLSPSR,
FELLPTPPLSPSR with GlcNAc, and FELLPTPPLSPSR with
[³H]galactosylated GlcNAc is expected. In
addition, RP-HPLC generally does not resolve unmodified, O-GlcNAc-modified, or galactosylated O-GlcNAc-modified peptides (20). Furthermore, we have
recently found that O-GlcNAc saccharides are rapidly and
selectively lost during ionization in electrospray mass spectrometry.
Also found in the mass spectrum were a peak with m/z 865.6 corresponding to SFFALR (739.88) plus
phosphate (79.98) plus two sodium (2 × 22.99) (total =
865.84) and a peak with m/z 2717.7 corresponding to
DTHKSEIAHRFKDLGEEHFK (2405.13) plus copper (63.55) plus hexose (162.14)
plus three sodium (3 × 22.99) plus H₂O (18.02) (total
= 2717.54). The peptide DTHKSEIAHRFKDLGEEHFK contains a copper
chelating site at its histidine residues. A non-enzymatic glycation at
lysine 12 of the peptide DTHKSEIAHRFKDLGEEHFK has been
reported(29) . From the data of gas-phase sequencing,
sequential Edman degradation, and mass spectrometry we conclude that
threonine 58 is the major O-GlcNAc site of c-Myc.
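Each mass assignment above is a sum of a peptide mass and its adducts, so the totals can be re-derived from the listed components. The following sketch does that and reports the difference from the observed m/z. All component values are copied from the text; the data layout and the names `ASSIGNMENTS` and `calculated_mass` are illustrative only.

```python
SODIUM = 22.99
WATER = 18.02

# observed m/z -> (description, list of component masses from the text)
ASSIGNMENTS = {
    1867.3: ("FELLPTPPLSPSR + [3H]Gal-GlcNAc + 2 Na",
             [1453.71, 367.33, 2 * SODIUM]),
    1493.9: ("FELLPTPPLSPSR + Na + H2O",
             [1453.71, SODIUM, WATER]),
    1696.5: ("FELLPTPPLSPSR + GlcNAc + Na + H2O",
             [1453.71, 203.18, SODIUM, WATER]),
    865.6: ("SFFALR + phosphate + 2 Na",
            [739.88, 79.98, 2 * SODIUM]),
    2717.7: ("DTHKSEIAHRFKDLGEEHFK + Cu + hexose + 3 Na + H2O",
             [2405.13, 63.55, 162.14, 3 * SODIUM, WATER]),
}


def calculated_mass(observed_mz: float) -> float:
    """Sum the listed components for one assignment, rounded to 0.01."""
    _, components = ASSIGNMENTS[observed_mz]
    return round(sum(components), 2)


if __name__ == "__main__":
    for mz, (label, _) in ASSIGNMENTS.items():
        calc = calculated_mass(mz)
        print(f"{label}: calculated {calc:.2f}, observed {mz}, "
              f"difference {mz - calc:+.2f}")
```

The first four sums reproduce the totals stated in the text. For the albumin peptide, the components as listed add up to 2717.81 rather than the stated 2717.54; either figure lies within about 0.2 mass units of the observed peak at m/z 2717.7.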
Figure 5: Identification of [³H]galactose-labeled FELLPTPPLSPSR by mass
spectrometry. [³H]Galactose-labeled tryptic
glycopeptides from third dimension RP-HPLC were subjected to MALDI-MS.
The m/z of 1867.3 represents the molecular mass of
FELLPTPPLSPSR (molecular mass, 1453.71) plus
[³H]Galβ1-4GlcNAc (367.33) plus two
sodium (2 × 22.99) (total = 1867.02). For the assignment
of the other peaks, see "Results and Discussion."
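The peptide masses used in these assignments (1453.71 for FELLPTPPLSPSR and 739.88 for SFFALR) are consistent with average molecular masses calculated from the amino acid composition. The sketch below is not the authors' calculation; it assumes standard average residue masses, which are not given in the text, and includes only the residues that occur in these two peptides.

```python
# Standard average residue masses (Da); an assumption of this sketch.
# Only the residues present in the two c-Myc peptides are listed.
AVERAGE_RESIDUE_MASS = {
    "A": 71.0788,
    "S": 87.0782,
    "P": 97.1167,
    "T": 101.1051,
    "L": 113.1594,
    "E": 129.1155,
    "F": 147.1766,
    "R": 156.1875,
}
WATER_AVERAGE_MASS = 18.0153  # one water per free peptide (N-terminal H, C-terminal OH)


def average_peptide_mass(sequence: str) -> float:
    """Average mass of an unmodified peptide: residue masses plus one water.
    Raises KeyError for residues outside the small table above."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER_AVERAGE_MASS


if __name__ == "__main__":
    for seq in ("FELLPTPPLSPSR", "SFFALR"):
        print(f"{seq}: {average_peptide_mass(seq):.2f}")
```

Both computed values agree with the masses quoted in the text to within about 0.01 mass units, which suggests that average rather than monoisotopic masses were used for the assignments.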
Threonine 58 is in the TAD of c-Myc, within the region where we previously localized O-GlcNAc by more indirect methods (13). Threonine 58 is a phosphorylation site in vivo (30) and can be phosphorylated in vitro by glycogen synthase kinase 3 (10). Threonine 58 is altered to a methionine in MC29 and HB1 v-Myc and to an alanine in OK10 and MH2 v-Myc (31). These mutations of threonine 58 in v-Myc enhance the transforming activity of the Myc protein (32, 33). Conversely, a v-Myc protein with a threonine at amino acid 58 has a reduced capability to induce growth of non-transformed embryo fibroblasts in soft agar (34, 35). Comparison of c-Myc and v-Myc by a variety of transformation assays also revealed that c-Myc has a reduced ability to induce tumor formation (33). These results suggest that threonine 58 has a key role in transducing a negative growth signal of c-Myc through its post-translational modifications. This working hypothesis is supported by the observations that mutations of c-Myc at or near threonine 58 are frequently found in Burkitt or AIDS-related lymphomas, and that threonine 58 is the most frequently mutated amino acid of c-Myc in these tumors (36, 37, 38, 39, 40).
Since mutations altering threonine 58 augment c-Myc transforming ability, we speculate that reciprocal phosphorylation and O-GlcNAc glycosylation at this residue modulate the activity of c-Myc. Given that c-Myc polypeptide levels remain relatively constant throughout the cell cycle (25), we also propose that these reciprocal post-translational modifications of threonine 58 differentially regulate c-Myc functions in different stages of the cell cycle.