(Received for publication, November 26, 1996, and in revised form, January 25, 1997)
From the Department of Biochemistry and Molecular Genetics Schools
of Medicine/Dentistry, University of Alabama at Birmingham Station,
Birmingham, Alabama 35294 and the Department of
Biological Chemistry, The Johns Hopkins University School of
Medicine, Baltimore, Maryland 21205
O-Linked N-acetylglucosamine (O-GlcNAc) glycosylation is a dynamic modification of eukaryotic nuclear and cytosolic proteins analogous to protein phosphorylation. We have cloned and characterized a novel gene for an O-GlcNAc transferase (OGT) that shares no sequence homology or structural similarities with other glycosyltransferases. The OGT gene is highly conserved (up to 80% identity) in all eukaryotes examined. Unlike previously described glycosyltransferases, OGT is localized to the cytosol and nucleus. The OGT protein contains multiple tandem repeats of the tetratricopeptide repeat motif. The presence of tetratricopeptide repeats, which can mediate protein-protein interactions, suggests that OGT may be regulated by protein interactions that are independent of the enzyme's catalytic site. The OGT is also modified by tyrosine phosphorylation, indicating that tyrosine kinase signal transduction cascades may play a role in modulating OGT activity.
Unlike other forms of protein glycosylation, serine (threonine)-O-linked N-acetylglucosamine (O-GlcNAc) is found primarily in the cytoplasm and nucleus, and is not modified or elongated to more complex structures (1, 2). Since it was first described in 1984 (3), the O-GlcNAc modification (termed O-GlcNAcylation) has been found on a myriad of eukaryotic nuclear and cytosolic proteins, including RNA polymerase II and its transcription factors, nuclear pore proteins, viral proteins, cytoskeletal proteins, tumor suppressor proteins, and oncoproteins (reviewed in Refs. 1 and 2).
Direct evidence is rapidly accumulating in support of the role of
O-GlcNAcylation as a regulatory modification.
O-GlcNAc appears to be as abundant as phosphorylation, and
several of the known sites of attachment are similar to those used by
proline-directed kinases (4, 5). O-GlcNAcylation is a
dynamic modification exhibiting properties more like phosphorylation
than typical O- and N-linked glycosylation (1,
2). The turnover rate of the O-GlcNAc moiety on cytokeratins
(6) and the small heat shock protein αB-crystallin (7) is much higher
than the turnover rate of the protein. O-GlcNAcylation also
has been shown to regulate a number of cellular functions. For example,
recent studies have shown the following. 1) O-GlcNAcylation
modulates the DNA binding activity of the p53 tumor suppressor (8). 2)
O-GlcNAcylation of p67 regulates protein synthesis by
controlling the phosphorylation state of the eukaryotic initiation
factor 2 (eIF-2α) (9, 10). 3) O-GlcNAcylation of the head
domain of neurofilament-H appears to regulate neurofilament assembly
(11). 4) The O-GlcNAc and phosphate modifications of the RNA
polymerase II COOH-terminal domain are reciprocal and are likely to
regulate transcription (12, 13). 5) O-GlcNAc has a
reciprocal relationship with phosphorylation at the site on the c-Myc
protein, which has been implicated in modulating its oncogenic activity
(14).
Consistent with the dynamic nature of O-GlcNAcylation, both
a UDP-N-acetylglucosamine:peptide
N-acetylglucosaminyl-transferase (O-GlcNAc
transferase) (15), specific for the attachment of O-GlcNAc
to proteins, and a soluble
N-acetyl-β-D-glucosaminidase with a neutral pH
optimum (O-GlcNAcase) (16), specific for the removal of
O-GlcNAc from proteins, have been purified and characterized. These two enzymes appear to regulate the attachment and removal of
O-GlcNAc in much the same way that kinases and phosphatases regulate protein phosphorylation. Taken together, these observations suggest that O-GlcNAc may play a role in modulating either
the phosphorylation state or the assembly and disassembly of
multimeric protein complexes in several key cellular systems,
including transcription, nuclear transport, and cytoskeletal
organization.
The O-GlcNAc transferase (OGT)1 (EC 2.4.1) purified from rat liver cytosol appears to be a heterotrimer composed of two catalytic 110-kDa (p110) subunits and one 78-kDa (p78) subunit (17). Here we describe the cloning and characterization of the gene encoding the catalytic p110 subunit. The gene is highly conserved throughout evolution, consistent with the ubiquitous nature of the O-GlcNAc modification. We also find that, like many other regulatory proteins, OGT contains several tandem repeats of the tetratricopeptide repeat (TPR) motif (reviewed in Refs. 18 and 19), suggesting that OGT can interact with other proteins via the TPR domain, to form a regulatory complex. Examination of the posttranslational modifications of OGT shows that the enzyme is modified by both O-GlcNAc and tyrosine phosphorylation. The subcellular localization of the cloned gene is consistent with O-GlcNAcylation as a nuclear and cytosolic modification.
The O-GlcNAc transferase was purified and concentrated on a Q-Sepharose column as described previously (17). The purified protein was separated by SDS-PAGE (20). The material corresponding to the 110-kDa subunit was visualized, excised, and subjected to in-gel protease digestion with trypsin (Boehringer Mannheim sequencing grade or Worthington tosylphenylalanyl chloromethyl ketone-treated trypsin) by the method of Rosenfeld et al. (21). The resulting peptides were separated by reverse phase high performance liquid chromatography (RP-HPLC). In addition, a second sample was electroblotted to Immobilon-Psq (Millipore) after SDS-PAGE separation and the protein stained with Amido Black 10B (Sigma) as per manufacturer's recommendations. The band corresponding to the 110-kDa subunit was excised and submitted to the Harvard Microchem protein sequencing facility (Boston, MA) for both NH2-terminal sequencing and internal sequence analysis.
RP-HPLC. The tryptic peptides were purified by several rounds of RP-HPLC. A Vydac 5-µm C18 column (4.6 × 250 mm) or a Rainin Microsorb-MV 5-µm C18 column (4.6 × 250 mm) was used for first- and second-dimension RP-HPLC. A Vydac narrow-bore C18 column (2.1 × 150 mm) was used for third-dimension RP-HPLC. The peptides were bound to the column in either 0.05% trifluoroacetic acid or 0.1% phosphoric acid containing 100 mM sodium perchlorate at pH 2 or pH 7 (22), and eluted with a linear gradient from 0 to 60% acetonitrile.
Peptide Sequencing. The RP-HPLC-purified peptides were sequenced by gas phase automated Edman degradation on either a Porton Instruments (Tarzana, CA) model PI 2090E microsequencing system, or an Applied Biosystems Inc. (Foster City, CA) model 470A gas phase sequenator.
Rabbit Antiserum. Polyclonal antibodies to the OGT protein were generated as follows. His-tagged protein was expressed in Escherichia coli using the pTrcHis vector, and the protein was purified as described below. The purified protein was separated by SDS-PAGE, visualized, and excised from the gel as described below. The gel slices were homogenized and used directly as immunogens by Hazelton Research Products (Denver, PA) to produce polyclonal antisera in two rabbits designated AL-24 and AL-25. Immunoglobulin G (IgG) was purified by passing the rabbit antiserum over a protein A-Sepharose column (Pharmacia) as per manufacturer's recommendations.
Western Blot Analysis. Crude protein extracts were prepared either from the dissected tissue of 3-6-month-old Harlan Sprague Dawley rats or from transfected HEK293 cells by following the first two steps of the purification protocol as described previously (17). The proteins were separated by SDS-PAGE and transferred to polyvinylidene difluoride membrane (23). Purified rabbit polyclonal IgG AL-24 (1:5000), or monoclonal anti-phosphotyrosine (Sigma, 1:2000) was used as a primary antibody with anti-rabbit or anti-mouse IgG coupled to horseradish peroxidase (Amersham) as the secondary antibody (1:20,000 dilution). Detection of the horseradish peroxidase activity was by enhanced chemiluminescence (ECL) and fluorography as described by the manufacturer (Amersham).
Carbohydrate Characterization. Purified OGT from rat liver
(through step 8, as described by Haltiwanger et al. (17))
was probed for terminal GlcNAc using Galβ(1-4)galactosyltransferase
and UDP-[3H]galactose as described (24). The subunits
were then resolved on a 7.5% SDS-acrylamide gel, and proteins tagged
with [3H]galactose were detected by fluorography of the
gel treated with 1 M sodium salicylate. The p110 subunit of
OGT was excised, and the O-linkage of the labeled sugar on
p110 was demonstrated by its sensitivity to β-elimination and
reduction (24). The released O-linked sugar was desalted
over Dowex AG 50W-X8 (hydrogen form) and Dowex AG 1-X8 (formate form) columns run
in series. Identification of the β-eliminated products was performed
by high pH anion-exchange chromatography with pulsed amperometric
detection (HPAE-PAD) by isocratic elution in 200 mM NaOH at
0.4 ml/min for 25 min on a Dionex Bio-LC equipped with a CarboPac-MA1
column.
Partially purified OGT from rat liver (through step 4, and desalted as described; Ref. 17) was incubated with AL-25 or preimmune IgG (see above) on ice for 3 h. The IgGs were then precipitated with protein A-Sepharose CL4-B (Pharmacia) in Tris-buffered saline plus 0.1% Tween 20 (TBST). The resin was washed extensively in TBST, followed by a desalting wash in 20 mM Tris, pH 7.8, 20% glycerol (desalt buffer), while the supernatant was desalted over a 1-ml Sephadex G-50 (Sigma) column in desalt buffer. The resin and supernatant were then resuspended to equivalent volumes in either SDS-PAGE sample buffer for Western blot analysis, or in desalt buffer for OGT activity assay as described previously (17).
General Recombinant DNA Techniques. Restriction endonuclease digestions and ligations were carried out as described (25). Plasmids were isolated using Wizard Prep Kits (Promega) according to the manufacturer's directions.
Polymerase Chain Reaction (PCR). Two oligonucleotides were
synthesized (ATGGGAAATACTTTGAAA = forward, ATGGATTATATATCACT = reverse), and PCR was carried out in 50-µl reactions containing
10⁷ λ-ZAPII rat liver cDNA library (no. 936513, Stratagene) phage, 2.5 mM MgCl2, 50 mM KCl, 10 mM Tris, pH 9.0, 0.1% Triton X-100, 200 µM dNTPs, 1.5 µM each primer, and 2.5 units
of Taq DNA polymerase (Promega). The λ-ZAPII phage were
first denatured by boiling for 10 min in dH2O and then
added to the PCR reactions and amplified in a DNA Thermal Cycler (MJ
Research Inc.) using a step gradient annealing temperature cycle
starting at 48 °C (1 min) and decreasing 0.5 °C/cycle for 20 cycles;
an additional 20 cycles were performed at a 38 °C (1 min) annealing
temperature. A 65 °C (2 min) elongation step followed by a 92 °C
(1 min) denaturing step was used in all cycles. The PCR products were
resolved on a 1% agarose gel containing ethidium bromide (0.5 µg/ml), gel-purified using a silica suspension (26), and subcloned
into the pGEM-T cloning vector (Promega).
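The step gradient annealing program described above can be written out cycle by cycle. The following is a minimal sketch, assuming that the first cycle anneals at 48 °C and that the 0.5 °C decrement is applied after each of the first 20 cycles; the function and parameter names are illustrative and do not come from the thermal cycler software.

```python
def annealing_schedule(start_c=48.0, step_c=0.5, gradient_cycles=20,
                       final_c=38.0, final_cycles=20):
    """Return the annealing temperature (in °C) used in each PCR cycle.

    Assumption: cycle 1 anneals at start_c, and the temperature drops by
    step_c after every cycle of the gradient phase.
    """
    # Gradient phase: 48.0, 47.5, ..., 38.5
    gradient = [start_c - step_c * i for i in range(gradient_cycles)]
    # Plateau phase: constant annealing temperature
    plateau = [final_c] * final_cycles
    return gradient + plateau


schedule = annealing_schedule()
print(len(schedule))                # total number of cycles
print(schedule[0], schedule[19])    # first and last gradient cycles
print(schedule[20], schedule[-1])   # first and last plateau cycles
```

Under this reading, the gradient phase ends at 38.5 °C, so the plateau at 38 °C continues the same 0.5 °C step without a jump.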
Subcloned PCR products and excised cDNA λ-ZAPII clones
were subjected to double-stranded DNA sequencing
using deoxyadenosine 5′-[α-[35S]thio]triphosphate and
Sequenase II (U. S. Biochemical Corp.), as described by the
manufacturer. Additional sequence information of the cDNA clones
was obtained by automated DNA sequencing on an Applied Biosystems model
373A automated DNA sequencer.
Two rat liver cDNA λ-ZAPII libraries (nos. 936513 and 936507, Stratagene) were plated on
XL-1 Blue host cells (Stratagene) and screened by hybridization in 50%
formamide (25), with the PCR-22b probe (see above) labeled to a
specific activity of 8 × 10⁸ dpm/µg using the
Ready·To·Go DNA Labeling Kit (Pharmacia). Hybridization was
performed at 42 °C, and the filters were washed in 0.2 × SSC at 70 °C. Nine positive clones were isolated, ranging in size from
1 to 3.2 kb (designated H1-H9). The longest, H1 (Fig. 1A),
contained a putative start site, 310 base pairs of upstream
untranslated DNA, and an open reading frame encoding 958 residues;
however, no in-frame stop codon was present. A rat hippocampus
λ-ZAPII library kindly provided by Anthony Lanahan (Johns Hopkins
School of Medicine, Baltimore, MD) was screened as above, except that a
gel-purified SacI restriction enzyme fragment representing
the 3′ end of the partial clone was used as probe (Fig. 1A);
all other conditions were identical. Five clones were isolated
(designated LTP1-LTP5), ranging in size from 2.5 to 4.6 kb. All of
these clones overlapped the H1 clone by 0.3-2.2 kb, and all contained
a poly(A) tail. A total of 8 × 10⁵ recombinant phage
from each library were screened. Positive clones were subjected to
the in vivo excision protocol according to the manufacturer's directions, to recover a pBlueScript plasmid containing the cDNA clone of interest for further analysis. A full-length cDNA
clone was constructed by ligation of the 5′-end of the H1 clone to the
3′-end of LTP3 at a convenient XhoI site (Fig.
1A).
DNA and RNA Blot Analysis. Genomic
DNA analysis was performed on a prepared Zoo-Blot (Clontech) as per manufacturer's directions with a PCR-22b probe labeled as above. In addition, total genomic DNA was isolated from HEK293 cells (25), digested to completion with EcoRI, resolved on a 0.8% agarose gel, and transferred to a Nytran (Schleicher & Schuell) filter according to manufacturer's directions. This blot was probed with a PCR-22b probe as described above. Total RNA was isolated from the dissected tissue of 3-6-month-old Harlan Sprague Dawley rats using RNeasy Total RNA kit (Qiagen) according to manufacturer's directions. 25 µg of each RNA was resolved on 1% agarose gels, transferred to a Nytran (Schleicher & Schuell) filter according to manufacturer's directions, and probed with the PCR-22b fragment (see above) as described (25).
Expression of the Ogt cDNA. The coding region of the OGT
was assembled from two cDNA clone fragments at a unique
XhoI site. The 5′-untranslated region was removed and
replaced with a linker restoring the start codon, and a
BamHI site was added 5′ to the start (Fig. 1B).
The coding region was then subcloned as a
BamHI/HindIII fragment into the polylinker of the
pTrcHis Xpress vector (Invitrogen), designated pLK51, for expression in
E. coli, or into pGW1 (a kind gift of Mike Lee) designated
pLK61, for transient expression in mammalian cells. pLK51 was
transformed into the XL-1 Blue (Stratagene) strain of E. coli and grown to midlog phase, and protein expression was induced
by the addition of isopropyl-1-thio-β-D-galactopyranoside
to a final concentration of 1 mM. Cells were
harvested 6 h after induction and protein was purified on Hi-Trap
Chelating Columns (Pharmacia) under urea denaturing conditions as per
manufacturer's instructions. HEK293 cells were grown in six-well
plates for protein expression assays, or on glass coverslips for
immunolocalization in Dulbecco's modified Eagle's medium, 10% fetal
bovine serum until 50% confluence. The cells were then transfected
with pLK61 by calcium phosphate-mediated transfection (27) and grown
for an additional 24-48 h to allow protein expression.
Transfected HEK293 cells (see above) or CHO cells, grown on glass coverslips in Dulbecco's modified Eagle's medium/Ham's F-12 (1:1) supplemented with 10% fetal bovine serum until 50% confluence, were washed twice in serum-free medium and fixed in 4% formaldehyde for 30 min. The cells were then washed four times in phosphate-buffered saline, pH 7.5 (PBS), and permeabilized in 0.5% Triton X-100 in PBS for 5 min. Cells were washed in PBS and blocked with goat serum and 3% bovine serum albumin in PBS (1:3) for 15 min at 37 °C. The cells were then incubated in primary antibody for 30 min at 37 °C. Excess primary antibody was washed away with four 10-min incubations in PBS, and the cells were incubated in secondary antibody at room temperature for 30 min in the dark. The cells were rinsed as for the primary antibody and mounted onto slides in 0.1% paraphenylenediamine in 90% glycerol. Secondary antibody alone gave no signal, and no signal was observed in non-permeabilized cells. Primary antibodies, AL-25 or preimmune, were used at a 1:500 dilution. The secondary antibody, FITC-conjugated goat anti-rabbit IgG (Jackson ImmunoResearch Laboratories), was diluted 1:200; DAPI stain was used at a final concentration of 0.1 µg/ml. All antibodies were diluted in 3% bovine serum albumin in PBS.
Fourteen unique peptide sequences were obtained from protease digests of the p110 subunit of OGT purified from rat liver (underlined in Fig. 1A). Polymerase chain reaction amplification and standard cDNA library screening techniques were combined to clone the gene encoding the p110 subunit (see "Experimental Procedures"). Two overlapping clones were isolated and the open reading frame was reconstructed at a convenient XhoI restriction site (Fig. 1B). The full-length cDNA contains a single open reading frame encoding a protein of 1037 residues, designated p110OGT.
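As a rough consistency check, the size of the polypeptide encoded by the 1037-residue open reading frame can be estimated from its length. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from this paper; the constant and function names are illustrative.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein.
# This value is an assumption for illustration, not a figure from the paper.
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues):
    """Estimate polypeptide mass in kDa from the number of residues."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


p110_estimate = approx_mass_kda(1037)
print(f"{p110_estimate:.0f} kDa")
```

An estimate of this size is in line with the 110-kDa apparent mass of the subunit on SDS-PAGE, although the true mass depends on the actual amino acid composition and on posttranslational modifications.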
Computer searches of the standard GenBank™ data bases using the BLAST
algorithm (28) revealed 61% sequence identity between p110OGT and a hypothetical 1194 residue protein
encoded at locus K04G7.3 of Caenorhabditis elegans
(accession number U21320). A matrix plot of the predicted C. elegans protein and p110OGT is shown in Fig.
2A. The homology extends through the entire clone with regions as long as 350 residues sharing >80% identity. A
recently cloned gene in Arabidopsis, SPINDLY, involved in
gibberellin signal transduction (29) also shares extensive homology
with p110OGT throughout the entire coding region. Thus, both
K04G7.3 and SPINDLY are likely to encode homologues of p110OGT.
In addition, searches of the dbEST data base of expressed sequence tags
(30) revealed homology between p110OGT and the conceptual
translation products of expressed sequence tags from human (R7594; 93%
identity over 414 residues), Schistosoma mansoni (T14553;
65% identity over 442 residues), and rice (D24403; 67% identity over
326 residues).
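The identity figures quoted above are percentages of matching positions within an aligned region. A minimal sketch of that calculation follows, assuming the two sequences have already been aligned and padded with gap characters, and counting identities over all alignment columns. The function name and the two short sequences are hypothetical and are not taken from p110OGT or its homologues.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the
    same residue. Gap columns count toward the length, never as matches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b)
        if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical 7-column alignment: 5 identical columns, 1 mismatch, 1 gap.
identity = percent_identity("MKT-LLA", "MRTALLA")
print(round(identity, 1))
```

Whether gap columns are included in the denominator changes the result slightly, so values computed this way are comparable only when the same convention is used throughout.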
The amino-terminal portion of p110OGT shares homology with a diverse group of proteins all containing a common motif designated the TPR motif (18, 19), while the carboxyl terminus shares no significant homology to any known protein in the data bases. Thus, the p110OGT appears to consist of two distinct domains: the amino-terminal 463 residues containing 11.5 tandem repeats of the TPR motif (Fig. 1C), and the carboxyl-terminal 563 residues representing a novel polypeptide perhaps encoding the catalytic activity.
Ogt Is Not a Member of a Multigene Family and Is Present in Many Organisms. Southern blot analysis was used to determine if the OGT clone represents a family of O-GlcNAc transferase genes. Rat genomic DNA was digested with several restriction enzymes (BamHI, EcoRI, HindIII, PstI and SacI) and probed with an 850-base pair fragment (PCR-22b, see Fig. 1B). Only one or two bands of equal intensity are seen in most lanes (Fig. 2B). The EcoRI lane has three bands, which is consistent with the expected restriction pattern of the cloned cDNA. The absence of several bands of varying intensity in each lane indicates that the Ogt gene is not a member of a closely related multigene family.
To further examine the level of conservation among higher eukaryotes, genomic DNA from rat, mouse, dog, cow, rabbit, and human was probed with PCR-22b (see above). A specific signal is seen in all lanes (Fig. 2C), demonstrating that a single related gene is present in many higher eukaryotes.
OGT Activity Is Immunoprecipitated from Rat Liver Extracts by an Antibody against Recombinant p110OGT Expressed in E. coli. Polyclonal rabbit antibody (designated AL-25) was prepared
against purified, recombinant p110OGT overproduced in E. coli (see "Experimental Procedures"). AL-25 immunoglobulin G
(IgG) is highly specific for the OGT and shows no cross-reactivity to
other proteins present in a partially purified preparation of rat liver
OGT designated the Q-Sepharose pool (17) (Fig.
3A, compare lanes 1 and
2). Preimmune IgG shows no reactivity (data not shown). On
Western blots, AL-25 antibody recognizes both the p110 and the p78
subunits of the rat liver OGT (lane 2), suggesting that p110
and p78 are related at the polypeptide level. Similar results were
obtained with antibodies raised against synthetic peptides derived from
the p110 subunit sequence (data not shown).
Both the p110 and p78 subunits of the native OGT are immunoprecipitated from the Q-Sepharose pool using AL-25, while no protein is precipitated using preimmune IgG (lanes 3 and 4). The pellets and supernatants from the immunoprecipitation were assayed for OGT enzyme activity (17). OGT enzymatic activity is precipitated from the Q-Sepharose pool using AL-25, while no activity is precipitated by buffer alone or preimmune IgG (Fig. 3B). These studies demonstrate that the cloned cDNA indeed represents the p110 subunit of the rat OGT.
Levels of OGT RNA, Protein, and Activity Vary in Different Tissues. Northern blot analysis (Fig.
4A) indicates that there are four transcripts
of ~8.0, 6.0, 4.2, and 1.7 kb present in all rat tissues examined
thus far. The 6.0-kb transcript is closest in size to the cloned
cDNA (5.7 kb). The larger 8.0-kb transcript may be an alternate
splicing product containing additional exons, as is seen in C. elegans, which has two distinct cDNAs representing alternative
splicing events at the 5′ end of the message. The longer message
produces a 130-kDa protein as predicted (by Wilson et al.,
31) (accession number U21320), while the smaller message would produce
a 112-kDa protein.2
The pattern of protein expression was examined by Western blot analysis of 30% ammonium sulfate cytosolic pellets (30% pellet) from rat tissues using the AL-25 antibody (Fig. 4B). The p110 band is clearly detectable in all tissues except the kidney, while the p78 band is detectable only in kidney, liver, and muscle, and an 80-kDa band is present only in muscle. Additional bands are visible upon longer exposure (data not shown), including a faint 110-kDa band in kidney as well as a 190-kDa band in liver, indicating that there are several Ogt-derived proteins present in most tissues.
The 30% pellets were also assayed for enzyme activity (17); the results are shown in Fig. 4C. All the extracts contained enzymatically active OGT, with brain and thymus having the highest specific activities and liver the lowest. However, the amount of enzyme activity did not always correlate well with the amount of OGT protein present in each extract, or with the transcript levels seen in each tissue.
OGT Is Modified by Tyrosine Phosphorylation and O-GlcNAcylation. Western blot analysis of the Q-Sepharose pool
using an anti-phosphotyrosine antibody shows that the p110 and p78
subunits are immunoreactive. This reactivity is blocked by the addition of 10 mM phosphotyrosine (Fig.
5A, compare lanes 1 and
2), but not 10 mM tyrosine (data not shown).
Similar experiments using antibodies against phosphoserine and
phosphothreonine showed no immunoreactivity (data not shown).
Examination of the p110OGT amino acid sequence indicates that
there is only one well conserved receptor protein-tyrosine kinase
phosphorylation site, Tyr979 (32, 33) (outlined in Fig.
1A).
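A consensus-site search of this kind can be sketched as a pattern scan over the protein sequence. The pattern used below, [RK]-x(2)-[DE]-x(3)-Y or [RK]-x(3)-[DE]-x(2)-Y, is the form catalogued in PROSITE for tyrosine kinase phosphorylation sites and is an assumption here, since the text does not state which consensus definition was applied. The test peptide is made up and is not part of the p110OGT sequence.

```python
import re

# Assumed consensus: a basic residue, then an acidic residue, then Tyr,
# with 2+3 or 3+2 arbitrary residues in between (8 residues in total).
# The lookahead lets overlapping candidate sites be reported.
TYR_SITE = re.compile(r"(?=([RK].{2}[DE].{3}Y|[RK].{3}[DE].{2}Y))")


def find_tyr_sites(sequence):
    """Return 1-based positions of tyrosines that end a consensus match."""
    positions = []
    for match in TYR_SITE.finditer(sequence):
        # Both alternatives span 8 residues, with Tyr as the last one.
        positions.append(match.start() + len(match.group(1)))
    return positions


# Hypothetical peptide: K at position 3, D at position 6, Y at position 10.
sites = find_tyr_sites("AAKLMDPQVYGG")
print(sites)
```

A match to such a pattern only flags a candidate site; whether the residue is actually phosphorylated has to be established experimentally.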
To determine if OGT was itself modified by O-GlcNAc, highly
purified OGT was probed with galactosyltransferase (see "Experimental Procedures"). Galactosyltransferase is a specific probe for terminal GlcNAc residues (24, 34) that is commonly used to detect
O-GlcNAc by covalently labeling the GlcNAc with
UDP-[3H]galactose. Both the p110 and the p78 subunits are
labeled with [3H]galactose (Fig. 5A,
lane 3), indicating that they are modified by GlcNAc. The
labeled p110 band was excised from the gel and subjected to alkaline
β-elimination. The label was released by this treatment, indicating
that the sugar was an O-linked glycan (data not shown). The
released sugars were then analyzed by HPAE-PAD chromatography (see
"Experimental Procedures"). The radioactivity was seen to migrate
with the disaccharide alditol of Galβ1,4GlcNAc (Fig. 5B),
indicating that p110 is modified by O-GlcNAc.
Overexpression of the Ogt cDNA in
HEK293 cells by transient transfection produces a protein that
co-migrates with the p110 subunit of purified liver OGT and is
recognized by AL-25 IgG (Fig. 6A, compare
lane 1 to lanes 4 and 5). The
endogenous OGT is not visible at the short exposure time shown
(lanes 2 and 3); a faint p110 band is seen in the
vector alone controls upon longer exposure (data not shown). The
protein level of p110OGT increases dramatically over time
(compare lanes 4 and 5); however, overexpression
of the p110OGT in HEK293 cells gives only a modest (20-30%)
increase in OGT activity over control cells transfected with vector
alone (Fig. 6B). To address the possibility that
mislocalization of the overexpressed p110OGT was preventing
high levels of activity in transfected cells, we examined the
localization of OGT in transfected HEK293 cells (see below). There are
differences in the pattern of expression that may account for the
unusually small increase in enzyme activity of the transfected cells.
It is also possible that additional factors not in abundance in HEK293
cells are required for the activity of OGT, or that these cells are
down-regulating the activity of the overexpressed OGT by some as yet
unknown mechanism.
OGT Is Present in Both the Cytosol and Nucleus
Immunolocalization of p110OGT in CHO cells using
AL-25 IgG shows that the p110OGT is located specifically in the
nucleus and cytosol (Fig. 7, panels a-c)
where previous studies have shown the activity is located (15, 35). The
nucleus, where most of the O-GlcNAcylated proteins are found
(36), stains evenly and brightly, while the cytosol shows a weak,
diffuse, punctate staining pattern. No reactivity is seen in CHO cells
using preimmune IgG (panels d-f). HEK293 cells
overproducing p110OGT during a transient transfection show a
somewhat different pattern of expression (panels g and
h). The cytosolic staining in these cells is also punctate
but it is significantly more intense. In addition, the level of nuclear
staining is much reduced (compare panels b and
h). Several non-transfected cells in the same field have no
significant signal (panels g-i).
The OGT cloned in the present study displays several features that
are unique and also provides clues with respect to the general
functional significance of the O-GlcNAc modification. 1) The
high evolutionary conservation of the enzyme suggests that it has a
fundamental cellular function. 2) The enzyme's nuclear and cytosolic
localization is consistent with its action on a myriad of proteins in
both compartments. 3) The tyrosine phosphorylation of the enzyme
implies that it may be regulated by one or more of the receptor
tyrosine kinases, linking O-GlcNAcylation to signal transduction cascades (37, 38). 4) The presence of multiple TPR repeats
suggests a two-binding site model for the regulation of the
O-GlcNAcylation of proteins (Fig. 8).
The Ogt gene described above represents a novel
glycosyltransferase that has no structural or sequence similarities to
any previously described glycosyltransferase (39). It has been highly conserved among higher eukaryotes such as rats, nematodes, and plants.
This level of conservation indicates strong evolutionary pressure and
suggests that Ogt encodes a protein with an essential cellular function. However, aside from the common TPR motif domain, p110OGT does not share any significant homology at the primary
sequence level with any protein in the Saccharomyces
cerevisiae data bases. OGT also shares no sequence homology with
the α-toxin from Clostridium novyi that catalyzes the
incorporation of O-GlcNAc into the Rho family of proteins in
a manner very analogous to OGT (40). However, the α-toxin does share
some sequence homology with an uncharacterized open reading frame from
yeast (accession no. Z73530). Thus, a protein unrelated to
p110OGT at the amino acid level could perform a similar
enzymatic function in yeast or other eukaryotes.
Southern blot analysis and sequence comparisons indicate that Ogt is not a member of a closely related gene family. However, O-GlcNAc is found on a diverse group of proteins at a multitude of glycosylation sites. Thus, it seems unlikely that one enzyme could be responsible for the specific addition of O-GlcNAc to all these proteins. We cannot rule out the possibility that a family of OGT proteins exists, which, like the Golgi glycosyltransferases, share no significant sequence homology, only structural similarity (39, 41). Alternatively, there may exist a mechanism for the regulation of a single OGT enzyme, which could confer both temporal and substrate specificity in response to cellular signals.
While there is only one Ogt gene, there are multiple transcripts and proteins related to p110OGT, some of which are tissue-specific. These related proteins likely arise from one gene by a combination of alternative RNA splicing and specific proteolysis. The presence of the p110OGT subunit in nearly every tissue examined leads us to postulate that this form of the enzyme provides the majority of the basal cellular OGT activity. However, the levels of activity in the various tissues do not always correlate with the p110 protein and RNA levels, indicating that additional factors regulate the activity of OGT. These additional factors may be limiting when the p110OGT is overexpressed in mammalian cells. Thus we do not see a proportional increase in OGT activity with protein expression.
The expression of tissue-specific forms of OGT is one mechanism by which the substrate specificity of the enzyme could be regulated. Posttranslational modifications of OGT present another mechanism for modulating activity or specificity. We have shown that OGT is modified by O-GlcNAc. In addition, we have shown that OGT is modified by tyrosine phosphorylation. A single receptor protein-tyrosine kinase phosphorylation consensus site is present in the putative catalytic domain, where it could act as a regulatory modification modulating the activity of OGT in response to signal transduction cascades (38).
The presence of TPR motifs in p110OGT is interesting, as TPRs have been found in a large number of proteins of diverse function and are believed to play a role in modulating a variety of cellular processes, including cell cycle (42-44), transcription regulation (45-47), and protein transport (48). Direct evidence for TPR-mediated protein-protein interactions regulating cellular functions is seen for the yeast transcription factor Cyc8. Cyc8 contains 10 TPR domains that are directly involved in recruitment of the Cyc8-Tup1 co-repressor regulating transcription from a distinct set of genes (47). Other examples are the yeast Cdc proteins: Cdc16p, Cdc23p, and Cdc27p, which directly interact with each other via their TPRs in a sequence-dependent manner during mitosis. Mutational analysis of the TPR domains of the Cdc proteins shows that a given TPR modulates a specific protein interaction (43). The ability of TPRs to regulate cellular processes via protein-protein interactions suggests that the TPR motifs of p110OGT could mediate specific protein interactions with accessory proteins, thereby modulating the activity or specificity of OGT. The tyrosine phosphorylation and O-GlcNAc modifications of OGT would provide an additional level of regulation.
A model for OGT regulation, combining the TPR accessory proteins and the posttranslational modifications, is presented in Fig. 8. In this model OGT has a basal level of activity for a narrow range of substrates. Binding of TPR accessory proteins would allow O-GlcNAcylation of additional specific substrates. The basal activity of OGT is up-regulated by changes in the phosphorylation or O-GlcNAcylation state of the protein. This model is not without precedent, as both the activity and specificity of RNA polymerase II are regulated by a large array of transcription factors as well as posttranslational modifications. Although RNA polymerase II does not contain TPR motifs, many of the transcription factors required for activation bind directly to the protein (reviewed in Refs. 49-51). In addition, both O-GlcNAcylation (12, 13, 52) and phosphorylation (52) of the COOH-terminal domain of RNA polymerase II have been documented. Additional study of OGT will allow further elucidation of the mechanisms regulating O-GlcNAcylation and will facilitate the direct evaluation of O-GlcNAc's functions in cellular metabolism.
The nucleotide sequence reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number U76557.
We thank Bill Kelly for work on the characterization of the C. elegans transcripts, Anthony Lanahan from the Department of Neuroscience at Johns Hopkins Medical School for generously providing the rat hippocampus library, Betty Jean Earles for peptide sequencing and synthesis, and all the members of the Hart laboratory for helpful discussions.