(Received for publication, November 30, 1995)
A cDNA encoding a novel sialyltransferase has been isolated
employing the polymerase chain reaction using degenerate primers to
conserved regions of the sialylmotif that is present in all eukaryotic
members of the sialyltransferase gene family examined to date. The cDNA
sequence revealed an open reading frame coding for 305 amino acids,
making it the shortest sialyltransferase cloned to date. This open
reading frame predicts all the characteristic structural features of
other sialyltransferases including a type II membrane protein topology
and both sialylmotifs, one centrally located and the second in the
carboxyl-terminal portion of the cDNA. When compared with all other
sialyltransferase cDNAs, the predicted amino acid sequence displays the
lowest homology in the sialyltransferase gene family. Northern analysis
shows this sialyltransferase to be developmentally regulated in brain
with expression persisting through adulthood in spleen, kidney, and
lung. Stable transfection of the full-length cDNA in the human kidney
carcinoma cell line 293 produced an active sialyltransferase with marked
specificity for the sialoside Neu5Acα2,3Galβ1,3GalNAc and
glycoconjugates carrying the same sequence, such as GM1b and
fetuin. The disialylated tetrasaccharide formed by reacting the
sialyltransferase with the aforementioned sialoside was analyzed by
one- and two-dimensional ¹H and ¹³C NMR spectroscopy and was shown to
be the Neu5Acα2,3Galβ1,3(Neu5Acα2,6)GalNAc sialoside. This
indicates that the enzyme is a GalNAc α2,6-sialyltransferase. Since
two other ST6GalNAc sialyltransferase cDNAs have been isolated, this
sialyltransferase has been designated ST6GalNAc III. Of these three,
ST6GalNAc III displays the most restricted acceptor specificity and is
the only sialyltransferase cloned to date capable of forming the
developmentally regulated ganglioside GD1α from GM1b.
Positioned in the protective and interactive glycocalyx of the cell surface, sialylated glycoconjugates are optimally situated to mediate initial communication events between two cells. Examples demonstrating the significance of biological events mediated by these interactions range from neurodevelopment, where sialosides confer migratory properties on cells, to the trafficking of leukocytes from blood to sites of inflammation and lymphoid organs (1, 2, 3, 4). These interactions are controlled by the structural diversity of sialosides, which undergo dramatic alterations throughout ontogeny, and also by the site-specific expression of sialic acid recognition molecules such as the selectins and I-type lectins (1, 5, 6, 7, 8, 9, 10, 55, 60).
Sialosides are generated by a family of glycosyltransferases, termed
the sialyltransferases, that transfer sialic acid from its high-energy
donor, CMP-sialic acid, to the nonreducing terminus of
oligosaccharides (33, 34). Sialyltransferases
generate considerable structural diversity by transferring sialic acid
with remarkable specificity for the underlying oligosaccharide
substrate (33, 34). The regulated expression of
sialosides is dependent on many factors including the availability of
sugar nucleotide to the Golgi lumen; competing glycosyltransferases;
co-localization of appropriate acceptors, transferases, and sugar
nucleotide transporters within a particular Golgi cisterna; and
transit time of acceptors through the Golgi
apparatus (56, 57, 58, 59). However,
the greatest determinant of sialoside expression is probably the
site-specific expression observed for each member of the
sialyltransferase gene family(30) . Based on known structures
for sialylated glycoconjugates, the sialyltransferase gene family has
been estimated to consist of 10-12 independent gene products,
although it is becoming evident that the final number of
sialyltransferases may ultimately prove to be larger. To date, 11
enzymatically distinct mammalian sialyltransferases have been cloned by
direct and indirect
methods(10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 27, 28, 29, 42) .
Analysis of their amino acid sequences has revealed two conserved
motifs. The longer is a centrally located region of 48-50 amino acids,
and the shorter motif consists of a
20-24-amino acid stretch (14, 16, 26).
These have been designated L-sialylmotif and S-sialylmotif,
respectively (13, 20). Site-directed mutagenesis of
L-sialylmotif indicates that it plays a role in the recognition of the
sugar nucleotide donor common to all sialyltransferases,
CMP-Neu5Ac (61). When the eukaryotic sialyltransferase cDNAs
are compared, the greatest conservation of amino acids is found at
opposing ends of L-sialylmotif, enabling the use of PCR to
clone additional members of this gene family.
Using degenerate
oligonucleotide primers derived from the most conserved regions of
L-sialylmotif, we have cloned a novel developmentally regulated
sialyltransferase that forms the Neu5Acα2,6GalNAc linkage and
utilizes sialylated glycoproteins as well as the ganglioside GM1b
as acceptors, generating the sequence
Neu5Acα2,3Galβ1,3(Neu5Acα2,6)GalNAc.
NMR spectra were
recorded at 23 °C on Bruker AM-500 and AMX-600 spectrometers,
operating at frequencies of 500 and 600 MHz, respectively, for ¹H NMR.
In all experiments, low power presaturation was
applied to the residual HDO signal. ¹H chemical shifts
(δ) are expressed in ppm downfield from sodium
4,4-dimethyl-4-silapentane-1-sulfonate, with an accuracy of 0.002 ppm,
and were measured relative to internal acetone at δ 2.225 ppm.
¹³C chemical shifts are expressed in ppm downfield from
sodium 4,4-dimethyl-4-silapentane-1-sulfonate, with an accuracy of 0.02
ppm, and were measured from indirect ¹³C detection
experiments with the RF carrier at a position calibrated to be 80.0 ppm
(relative to the methyl signal of internal acetone at δ 32.90 ppm).
One-dimensional TOCSY (43, 46) and ROESY (45) experiments were performed with selective excitation of
accessible structural-reporter-group signals by DANTE pulse
trains(50) . The one-dimensional TOCSY pulse program contained
a 100-ms DIPSI-2 mixing sequence(52) . The one-dimensional
ROESY experiments used a 240 ms, 2.2 kHz, CW spin-lock pulse train
flanked by two 90° pulses for offset compensation(47) .
Two-dimensional double quantum filtered COSY(51) ,
HSQC(44) , and HMQC-TOCSY (48) datasets were collected
in phase-sensitive mode using the time-proportional phase
incrementation method(49) . For the two-dimensional double
quantum filtered COSY experiments, 800 FIDs of 2048 complex data points
were collected; 16 scans/FID were acquired, the spectral width was set
to 3623 Hz, and the RF carrier was placed at 4.0 ppm. For the HSQC
spectra, 256 FIDs of 1024 complex points were acquired with 16 and 64
scans/FID for the tri- and tetrasaccharides, respectively. The
HMQC-TOCSY experiment used a 40-ms MLEV-17 mixing
sequence (43). The GARP-1 sequence (53) was used for ¹³C decoupling
during ¹H acquisition. The
spectral width in the ¹³C dimension was set to 60 ppm, with
the carrier at 80.00 ppm referenced to internal acetone at 32.90 ppm,
with respect to sodium 4,4-dimethyl-4-silapentane-1-sulfonate. The
two-dimensional datasets were typically processed with a
Lorentzian-to-Gaussian weighting function applied in the t2
dimension, and a shifted squared sine bell
function and zero-filling applied in the t1
dimension. Processing was performed with the Felix software
package, version 2.3 (BioSym Technologies, Inc.) on a Sun Sparc
workstation.
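As a rough illustration of the acquisition parameters above, the 3623-Hz spectral width of the double quantum filtered COSY experiments can be converted to ppm. This is a minimal sketch only: the text does not state which spectrometer was used for each experiment, so both nominal ¹H frequencies (500 and 600 MHz) are evaluated, and the function name is hypothetical.

```python
def spectral_window(sw_hz, spectrometer_mhz, carrier_ppm):
    """Return (width_ppm, low_ppm, high_ppm) for a window centered on the carrier.

    Assumes the nominal spectrometer frequency; 1 ppm then equals
    spectrometer_mhz Hz for the observed nucleus.
    """
    width_ppm = sw_hz / spectrometer_mhz
    half = width_ppm / 2.0
    return width_ppm, carrier_ppm - half, carrier_ppm + half


# COSY parameters quoted in the text: 3623 Hz width, carrier at 4.0 ppm.
for mhz in (500.0, 600.0):
    width, low, high = spectral_window(3623.0, mhz, 4.0)
    print(f"{mhz:.0f} MHz: {width:.3f} ppm wide, {low:.3f} to {high:.3f} ppm")
```

At either field strength the window comfortably covers the carbohydrate ring-proton and anomeric region centered on 4.0 ppm.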
Figure 1: Comparison of sialylmotif L (SMY) and sialylmotif S of ST6GalNAc III with those of 11 previously cloned sialyltransferases. The 12 sialyltransferases compared are the rat ST6Gal I (11), the rat ST3Gal I (12), the mouse ST3Gal II (13), the rat ST3Gal III (14), the human ST3Gal IV (15, 23), the chick ST6GalNAc I (16), the chick ST6GalNAc II (17), the rat ST6GalNAc III (this publication), the mouse ST8Sia I (18, 22, 24), the rat ST8Sia II (19), the mouse ST8Sia III (20), and the hamster ST8Sia IV (21). The sialyltransferase motifs are grouped by the linkage that they form. Consensus sequences were designated using the following rules: 1) 7 of 12 sialylmotifs must contain the indicated amino acid at a particular aligned site; 2) if only 2 amino acids are found at a particular aligned site, the designated amino acid(s) must be present in at least three sialylmotifs.
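The two consensus rules in the legend above can be sketched in code. This is one possible reading of the rules, not the authors' procedure: rule 1 designates any residue present in at least 7 of 12 sequences, and rule 2 is taken to mean that when exactly two residues occur at a site, each one present in at least three sequences is designated. The alignment below is a hypothetical toy example, not sialylmotif data.

```python
from collections import Counter


def consensus_column(residues, majority=7, pair_minimum=3):
    """Designate consensus residue(s) for one aligned site ('-' if none)."""
    counts = Counter(residues)
    # Rule 1 (as read here): residue present in at least `majority` sequences.
    designated = {aa for aa, n in counts.items() if n >= majority}
    # Rule 2 (as read here): exactly two residues at the site, each kept
    # if present in at least `pair_minimum` sequences.
    if len(counts) == 2:
        designated |= {aa for aa, n in counts.items() if n >= pair_minimum}
    return "".join(sorted(designated)) or "-"


def consensus(alignment, **kwargs):
    """Apply the column rule across equal-length aligned sequences."""
    return [consensus_column(col, **kwargs) for col in zip(*alignment)]


# Hypothetical 12-sequence, 3-column alignment for illustration only.
toy_alignment = ["CLA"] * 4 + ["CLG"] * 2 + ["CVG"] * 2 + ["CVS"] * 4
print(consensus(toy_alignment))
```

In the toy alignment the first site is invariant, the second holds two residues in six sequences each, and the third holds three residues with none reaching the majority threshold.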
Figure 2: Nucleotide and deduced amino acid sequences of the ST6GalNAc III protein. The putative signal-anchor domain of the ST6GalNAc III sialyltransferase is boxed, the potential N-glycosylation sites are marked with asterisks, and the ATG codons upstream from the sialyltransferase translation initiation are underlined. The sequences are numbered relative to the translation initiation site, which begins at the second in-frame ATG codon.
Comparison of the primary structure of STY protein and the 11 other cloned sialyltransferases indicates that it is the shortest of the 12 enzymes, which range from 305 to 566 amino acids in length, and that there is no significant similarity except in two regions, the so-called sialylmotifs(14, 16, 26) . These results indicate that this protein belongs to the sialyltransferase gene family, although outside of the sialylmotifs the homology is the lowest of all sialyltransferases cloned to date.
To confirm that Neu5Acα2,3Galβ1,3GalNAc is
required for activity with STY, we analyzed several sialylated
oligosaccharides at a concentration of 0.1 mM to test for
their ability to act as STY acceptors. As shown in Table 2,
the best oligosaccharide acceptor was Neu5Acα2,3Galβ1,3GalNAc,
followed by LSTa (Neu5Acα2,3Galβ1,3GlcNAc-lactose), to which
STY transferred sialic acid at only approximately 3% of the rate of
transfer to Neu5Acα2,3Galβ1,3GalNAc. As was the case with the
glycoconjugate acceptors (Table 1), the nonsialylated
Galβ1,3GalNAc sequence is not an acceptor for STY. From the above
data, the acceptor specificity of STY is restricted to sialylated
glycoconjugates carrying the Neu5Acα2,3Galβ1,3GalNAc sialoside
in which the GalNAc is either α-linked to serine or threonine or
β-linked to galactose (GM1b).
Based on the acceptor
specificity of STY and previously described sialyltransferase
activities, we reasoned that STY formed either
Neu5Acα2,3Galβ1,3(Neu5Acα2,6)GalNAc, commonly found O-linked to
glycoproteins as well as on the ganglioside
GD1α, or possibly the sialoside
Neu5Acα2,8Neu5Acα2,3Galβ1,3GalNAc found on the ganglioside
G
(39) . Differential sialidase and mild periodate
experiments failed to elucidate the linkage formed by STY. To resolve
this issue, the STY sialoside was extensively characterized by one- and
two-dimensional ¹H and ¹³C NMR experiments as
described below.
Figure 3: 600-MHz one-dimensional ¹H NMR spectra of sialosides. One-dimensional ¹H NMR spectra of the acceptor trisaccharide (a) and the product tetrasaccharide (b) in D₂O at pH 6.7, recorded at 600 MHz. The structural reporter group signals are marked.
Our initial attempt to assign the ¹H spectra of the tri- and
tetrasaccharide involved tracing
J scalar coupling connectivities by
a two-dimensional COSY experiment in conjunction with one-dimensional
selective TOCSY experiments (results not shown). Thus, for example,
starting from the already assigned Neu5Ac H-3ax and H-3eq signals,
those of the Neu5Ac H-4, H-5, and H-6 atoms were located (see Table 3).
Analogously, starting from the Gal H-1 and GalNAc H-1
signals, respectively, the resonances of H-2, H-3, and H-4 in each of
these rings were identified. However, due to the small Neu5Ac
J6,7 and Gal(NAc) J4,5 coupling constant values, the
COSY and TOCSY experiments failed to lead us from Neu5Ac H-6 to H-7 and
also from Gal(NAc) H-4 to H-5. These problems were alleviated, in part,
by performing one-dimensional selective ROESY experiments on Gal and
GalNAc H-1. Owing to the occurrence of triaxial H-1/H-3/H-5 spin
systems in β-linked glycosyl residues, in which the three protons
are well within 4 Å of each other, the ROESY spectra provided
through-space connectivities compatible with the assignments of Gal
H-1, H-3, and H-5 (at δ 3.61), and similarly of GalNAc H-1, H-3 (at
δ 3.80 and 3.78 for the tri- and tetrasaccharide, respectively),
and H-5 (at δ 3.68 in the trisaccharide, δ 3.76 in the
tetrasaccharide) (see Table 3). Having obtained the partial
assignment of the ¹H NMR spectra of the two saccharides, we
then attempted to assign their ¹³C NMR spectra.
Fig. 4 shows the pertinent portion of the two-dimensional ¹H,¹³C
HSQC spectrum of the tetrasaccharide. An
HSQC experiment as conducted here is a two-dimensional
¹H,¹³C correlation experiment that connects
signals belonging to ¹H and ¹³C nuclei that are
directly (through one bond) attached to each other in the chemical
structure. First of all, such an HSQC spectrum provides the assignments
of the ¹³C signals from already assigned ¹H
signals through one-on-one connectivities. Furthermore, HSQC spectra
can provide the missing entries to complete the assignment of the
¹H spectra, by revealing the C-6 (or C-9, in case of Neu5Ac)
methylene protons. Not only are methylene carbons coupled to two
protons each, it also happens that the carbon signals of carbohydrate
CH₂ groups in ¹³C NMR spectra (55 < δ < 70 ppm) are well separated
from the CH signals (70 < δ < 110
ppm). Thus, the HSQC experiment provides a relatively convenient way of
assigning the CH₂ protons in the ¹H spectra. The
HSQC CH₂ correlations are marked in Fig. 4. The
spectrum of the acceptor trisaccharide shows three CH₂
signals (at δ 63.40, 63.41, and 64.94), while that of the
product tetrasaccharide shows four of them (δ 63.56, 65.08, 65.27,
and 66.11). Their ¹J-coupled protons in the
¹H spectra were found at the positions listed in Table 3.
Figure 4: Two-dimensional HSQC spectra of the disialyl tetrasaccharide. Portions of the two-dimensional ¹H,¹³C HSQC spectrum of the STY disialyl tetrasaccharide, recorded at 600 MHz in D₂O at pH 6.7. The CH₂ connectivities are marked.
To assign the CH₂ signals to specific
glycosyl residues, the ¹³C signals need to be
"linked" to ¹H signals already assigned by COSY,
TOCSY, and/or ROESY experiments. Two-dimensional HMQC-TOCSY experiments
were conducted to provide these links for two of the observed
¹³C CH₂ signals, namely to the Gal and GalNAc H-5
signals (results not shown). The HMQC-TOCSY spectra show partial TOCSY
spectra in ¹H rows superimposed on ¹H,¹³C correlations. For example, the
tetrasaccharide CH₂ signal at δ 66.11 ppm showed an HMQC
connectivity to two C-6 protons (at δ 3.66 and 3.98 ppm) that in
turn are TOCSY-correlated to a proton signal at 3.76 ppm; the latter
had been assigned to GalNAc H-5 by a one-dimensional ROESY experiment
(see Table 3). Analogously, the C-6 signals of Gal in the tri-
and tetrasaccharide and C-6 of GalNAc in the trisaccharide were
assigned. By default, the Neu5Ac C-9 signals were identified as the
remaining CH₂ signals in the ¹³C spectra. These
CH₂ signals are displayed in Table 4.
Thus, it became obvious
that the crucial difference between the ¹³C NMR spectra of
the tri- and tetrasaccharide was the chemical shift increment of
2.7 ppm shown by the GalNAc C-6 signal, which is
typical for glycosylation at that position. Additionally, the C-5 of the
GalNAc unit is shielded by 1.5 ppm, which is characteristic of
glycosylation at the C-6 position of GalNAc (69). No other
¹³C signals in the spectrum of the acceptor trisaccharide
underwent a similar chemical shift increment (see Table 3). It
was deduced that the newly introduced sialic acid residue in the
tetrasaccharide is attached to the C-6 position of GalNAc. Therefore,
the product disialoside formed by incubation of the acceptor
trisaccharide and CMP-Neu5Ac in the presence of STY was identified by
NMR spectroscopy as
Neu5Acα2,3Galβ1,3(Neu5Acα2,6)GalNAc
1,0-benzyl.
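The glycosylation-shift argument can be checked with simple arithmetic on the CH₂ chemical shifts quoted earlier in the text. The sketch below is illustrative only: the text here does not state which of the three trisaccharide CH₂ signals is GalNAc C-6 (that assignment is given in Table 4), so the code merely tests which acceptor signals are consistent with the reported 2.7-ppm increment to the tetrasaccharide GalNAc C-6 signal at 66.11 ppm. The 0.05-ppm tolerance and all names are assumptions.

```python
# CH2 carbon shifts (ppm) quoted in the text.
TRISACCHARIDE_CH2 = (63.40, 63.41, 64.94)   # acceptor; per-residue assignment not stated here
TETRASACCHARIDE_GALNAC_C6 = 66.11           # product GalNAc C-6 (assigned via HMQC-TOCSY)
REPORTED_INCREMENT = 2.7                    # downfield shift reported for GalNAc C-6
TOLERANCE = 0.05                            # assumed matching tolerance, ppm


def downfield_shifts(product_ppm, acceptor_signals):
    """Shift difference (product minus acceptor) for each acceptor signal."""
    return [round(product_ppm - s, 2) for s in acceptor_signals]


shifts = downfield_shifts(TETRASACCHARIDE_GALNAC_C6, TRISACCHARIDE_CH2)
consistent = [
    signal
    for signal, shift in zip(TRISACCHARIDE_CH2, shifts)
    if abs(shift - REPORTED_INCREMENT) <= TOLERANCE
]
print(shifts)
print(consistent)
```

Only the two acceptor signals near 63.4 ppm are compatible with a 2.7-ppm increment; the signal at 64.94 ppm would correspond to a much smaller shift.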
Since two other α2,6-GalNAc sialyltransferase cDNAs have been
isolated, we have designated STY as ST6GalNAc III to conform with current
sialyltransferase nomenclature.
The newly isolated
sialyltransferase cDNA will be referred to as ST6GalNAc III throughout
the remainder of the manuscript.
Figure 5: Differential expression of ST6GalNAc III in various rat tissues. Northern blots with mRNA from various adult and newborn rat tissues were hybridized with a probe for ST6GalNAc III, as described under ``Experimental Procedures.''
With the isolation of ST6GalNAc III cDNA, 12 enzymatically
distinct sialyltransferases have been isolated. Three of these,
including ST6GalNAc III, encode N-acetylgalactosaminide
α2,6-sialyltransferases (16, 17). As summarized in Table 5, each
ST6GalNAc sialyltransferase utilizes the
Neu5Acα2,3Galβ1,3GalNAcα1,O-Thr/Ser glycoconjugate as an
acceptor. However, the acceptor specificity of ST6GalNAc III is
considerably more restricted than that of ST6GalNAc I and II. While
ST6GalNAc I and II utilize various asialo O-linked structures
as acceptors, ST6GalNAc III displays an absolute requirement for the
sialylated structure (16, 17). Indeed, the only
acceptor for ST6GalNAc III to be identified in the current study is
Neu5Acα2,3Galβ1,3GalNAc, both in its free oligosaccharide form
and attached to glycoconjugates. The next best oligosaccharide
acceptor, LSTa (see Table 2 for structure), has an incorporation
rate of only 3% of that of Neu5Acα2,3Galβ1,3GalNAc, suggesting
that transfer to glycoconjugates carrying the LSTa sequence is unlikely
to be of physiological relevance. The acceptor specificity of ST6GalNAc
III most closely resembles a sialyltransferase activity described in
fetal liver by Bergh et al. (66) and in adult rat
brain by Baubichon-Cortay et al. (67). Both of these tissues
express an α2,6-GalNAc sialyltransferase
activity that, like ST6GalNAc III, utilizes the
Neu5Acα2,3Galβ1,3GalNAcα1,O-Thr/Ser sialoside of fetuin as
an acceptor but not its asialo derivative.
The most striking
enzymatic difference between ST6GalNAc III and other members of the
N-acetylgalactosamine sialyltransferase subfamily is that
ST6GalNAc III is the only sialyltransferase cloned to date capable of
forming the developmentally regulated ganglioside GD1α from GM1b.
ST6GalNAc I and II do not utilize GM1b
as an acceptor (17). Thus, ST6GalNAc I and II
transfer sialic acid to α-linked GalNAc (GalNAcα1,O-Thr/Ser)
but not β-linked GalNAc such as is found in GM1b
(see Table 1 for structure). In contrast, ST6GalNAc III does not
discriminate between α- and β-linked GalNAc. Although common
gangliosides such as GD1a and GT1b carry the
Neu5Acα2,3Galβ1,3GalNAc moiety, unlike GM1b they
are not acceptors for ST6GalNAc III. The only difference between
GM1b and GD1a is a sialic acid residue linked to
the internal galactose of GD1a. Thus, this sialic acid
residue abolishes the catalytic activity of ST6GalNAc III, perhaps by
sterically hindering the access of the sialyltransferase to the C-6
position of GalNAc. It has recently been shown that an α2,6-GalNAc
sialyltransferase activity exists in rat liver that utilizes GD1a
and GT1b as acceptors, forming respectively
GT1aα and GQ1bα (36). Since
ST6GalNAc III does not utilize GD1a or GT1b as
acceptors, it is clear that another ST6GalNAc sialyltransferase exists
in addition to the three cDNAs already isolated. At this point, it is
unclear if gangliosides such as G
are acceptors for
ST6GalNAc III(39) .
Previously, linkage analysis of sialosides generated by
novel cDNAs relied on indirect methods such as sialidase treatments.
In the current studies, we were unable to confidently
characterize the sialoside generated by ST6GalNAc III with strictly
sialidase treatments, forcing us to synthesize and purify
quantities of sialoside sufficient for NMR analysis. NMR analysis
confirmed that the acceptor monosialoside utilized in these studies was
Neu5Acα2,3Galβ1,3GalNAc
1,0-benzyl. Furthermore, the
product tetrasaccharide was identified by a combination of
one-dimensional and two-dimensional ¹H and ¹³C
NMR experiments as
Neu5Acα2,3Galβ1,3(Neu5Acα2,6)GalNAc
1,0-benzyl.
Careful comparison of the HSQC spectra of the tri- and
tetrasaccharide revealed that one of the CH₂ signals in the
¹³C spectrum of the trisaccharide had undergone a chemical
shift increment typical for glycosylation at that site. That CH₂
group was attributed to GalNAc based on an HMQC-TOCSY experiment.
Thus, the linkage position of the second Neu5Ac residue was identified,
not (as would usually be the case) by an HMBC experiment (as Neu5Ac
does not have an anomeric proton), but by a ¹³C-edited TOCSY
experiment. Synthesis of quantities of ST6GalNAc III sialoside
sufficient for these experiments was made possible by employing an
expression vector in which the entire open reading frame of ST6GalNAc
III was placed under control of the cytomegalovirus promoter. Stable
transfection of this construct into 293 cells yielded remarkably high
levels of ST6GalNAc III sialyltransferase activity that allowed for the
enzymatic synthesis of milligram quantities of the ST6GalNAc III
sialoside using only detergent lysates as an enzyme source. Indeed,
levels of ST6GalNAc III were high enough to render endogenous sialidase
and sialyltransferase activities insignificant relative to the
recombinant sialyltransferase activity. This expression system may be
of future utility for expression of sialyltransferases particularly
when relatively high levels of enzyme activity are required.
In
certain instances, sialyltransferases share sequence identity outside
the sialylmotifs, greatly enhancing the probability that they form
identical linkages. This is apparent when the sequences of four
α2,8-sialyltransferase cDNAs are compared with one another.
Throughout their open reading frames, they share 28-60%
identity, with the closest identity occurring between ST8Sia IV and
ST8Sia II (19, 21, 42). Two distinct
ST6GalNAc transferases (ST6GalNAc I and II) differing slightly in their
substrate specificity share 32% sequence identity throughout their
coding region. The identity increases to 48% when the sequences are
compared from their respective sialylmotif L to the carboxyl
terminus (17). However, it is difficult to predict the linkage
that a novel sialyltransferase cDNA will form if no homology with
previously characterized sialyltransferase cDNAs is observed outside of
the sialylmotif. For instance, outside of the sialylmotifs, ST6GalNAc
III shares no amino acid identity with any previously isolated
sialyltransferase cDNA, including ST6GalNAc I and II. Thus, even though
ST6GalNAc I-III each form the Neu5Acα2,6-GalNAc linkage, it
was impossible to predict the linkage that ST6GalNAc III formed by
analysis of its primary sequence.
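The identity percentages discussed above depend on how identity is counted over an alignment. The following minimal sketch shows one common convention, assumed here rather than taken from the text: identical residues divided by the number of aligned positions in which neither sequence has a gap. The two sequences are hypothetical and are not sialyltransferase data.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned, gap-free positions of two sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip positions where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments for illustration only.
fragment_1 = "CPLS-GNT"
fragment_2 = "CALSQGDT"
print(round(percent_identity(fragment_1, fragment_2), 1))
```

Restricting the comparison to a sub-region, such as from sialylmotif L to the carboxyl terminus, amounts to slicing both aligned sequences before calling the function, which is why the quoted identity can rise from 32% to 48% for the same pair of enzymes.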
The spatial and temporal
expression of ST6GalNAc III correlates well with that of
GD1α. The expression of GD1α is highest
in embryonic brain, decreasing to low levels in adults (35). Of
the other tissues examined, GD1α is found in spleen and
lung. It is expressed on all T-cells and enriched in Th-1 cells,
explaining its high expression levels in spleen (38). While
ST6GalNAc III is abundantly expressed in newborn brain and kidney, a
survey of adult tissues reveals that its expression is restricted to
spleen, kidney, and to a small extent lung. Interestingly, ST6GalNAc
III message was below the detectable limits of Northern analysis in
adult brain. However, the ST6GalNAc III cDNA was isolated from an adult
rat brain cDNA library; thus, minute levels of ST6GalNAc III message are
present in this tissue, corresponding to the low levels of
GD1α detected in adult brain tissue (35). Since
ST6GalNAc III forms GD1α in vitro and the
tissue-specific expression of the sialyltransferase and ganglioside
correlate well, it is likely that one potential function of ST6GalNAc
III is to synthesize GD1α in vivo. Since
ST6GalNAc III only utilizes the Neu5Acα2,3Galβ1,3GalNAc
sequence as an acceptor, it must be co-expressed with ST3Gal I, ST3Gal
II, or ST3Gal IV in the same cell to synthesize GD1α
in vivo. Without such co-expression, the activity of
ST6GalNAc III would be functionally null.
GD1α has
been implicated as a molecular component of a variety of important
biological processes. These include metastasis of highly virulent
lymphomas and motor learning as elaborated by Purkinje
cells (37, 35). In the future, it will be important to
determine if ST6GalNAc III is co-expressed with GD1α in
particular cell types and, if so, to genetically manipulate ST6GalNAc III
in different systems to ultimately determine the biological relevance
of GD1α.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number(s) L29554.