(Received for publication, November 27, 1995; and in revised form, January 16, 1996)
The asialoglycoprotein receptors and many other C-type
(Ca²⁺-dependent) animal lectins specifically recognize
galactose- or N-acetylgalactosamine-terminated
oligosaccharides. Analogous binding specificity can be engineered into
the homologous rat mannose-binding protein A by changing three amino
acids and inserting a glycine-rich loop (Iobst, S. T., and Drickamer,
K. (1994) J. Biol. Chem. 269, 15512-15519). Crystal
structures of this mutant complexed with
-methyl galactoside and N-acetylgalactosamine (GalNAc) reveal that as with wild-type
mannose-binding proteins, the 3- and 4-OH groups of the sugar directly
coordinate Ca²⁺ and form hydrogen bonds with amino
acids that also serve as Ca²⁺ ligands. The different
stereochemistry of the 3- and 4-OH groups in mannose and galactose,
combined with a fixed Ca²⁺ coordination geometry, leads
to different pyranose ring locations in the two cases. The glycine-rich
loop provides selectivity against mannose by holding a critical
tryptophan in a position optimal for packing with the apolar face of
galactose but incompatible with mannose binding. The 2-acetamido
substituent of GalNAc is in the vicinity of amino acid positions
identified by site-directed mutagenesis (Iobst, S. T., and Drickamer,
K. (1996) J. Biol. Chem. 271, 6686-6693) as being
important for the formation of a GalNAc-selective binding site.
Ca²⁺-dependent (C-type) animal lectins are a
family of proteins whose members contain one or more homologous
carbohydrate-recognition domains (CRDs). The majority of
C-type lectins bind to D-mannose, D-glucose, and
related sugars (Man-type ligands), or to D-galactose and its
derivatives (Gal-type ligands). The mammalian hepatocyte
asialoglycoprotein receptors, which play a part in serum glycoprotein
homeostasis (1, 2), are the best known of the
Gal-binding C-type lectins. C-type lectins with high affinity for
glycoconjugates bearing terminal galactose residues have also been
identified on the surfaces of peritoneal macrophages and Kupffer cells (3, 4), and appear to mediate recognition of tumor
cells (5, 6). C-type CRDs with lower affinity for
Gal-type ligands are found in proteoglycan core proteins of cartilage
and other tissues and are presumed to contribute to the organization of
the extracellular matrix (7).
Previous crystallographic
analyses of rat mannose-binding proteins (MBPs) A and C have shown that
Man-binding C-type lectins recognize their sugar ligands by formation
of direct coordination bonds between a Ca²⁺ (designated
site 2) and a lone pair of electrons from each of two vicinal hydroxyl
groups possessing the same stereochemical arrangement as the equatorial
3- and 4-OH groups of D-mannose (8, 9). The
Ca²⁺ is 8-coordinated in a pentagonal bipyramidal
arrangement, with the two sugar hydroxyls bisecting one of the apical
positions (8) (see Fig. 1a). In addition, the
same OH groups form hydrogen bonds with amino acid side chains that are
Ca²⁺ site 2 ligands, producing an intimately linked
ternary complex of protein, Ca²⁺, and sugar (see Fig. 1a). Only one other contact, an apolar van der
Waals contact between a ring carbon and the C of
residue 189, contributes significantly to
binding (8, 10).
Figure 1: Galactose binding to QPDWG. a, mannose binding to wild-type MBP-A as observed in a ManGlcNAc Asn-MBP-A complex (8). Carbon, nitrogen, and oxygen atoms are shown as white, gray, and black spheres, respectively; Ca²⁺ 2 is shown as a larger white sphere. Coordination and hydrogen bonds are represented by long and short dashed lines, respectively. Carbon atoms of the sugars are numbered. b, stereo pair of the final 2Fo - Fc electron density map in the binding site of the MeGal-QPDWG complex, contoured at 1.2σ. c, galactose binding to QPDWG. Symbols are as described for a. Parts a and c were made with MOLSCRIPT (26).
Studies with derivatized sugars
have shown that free 3- and 4-OH groups are essential for binding to
mammalian asialoglycoprotein receptors as well as Man-binding C-type
lectins, whereas substitutions at other ring positions have little or
no effect on binding (11). However, the 3- and 4-OH groups of
galactose have an equatorial/axial arrangement, so the mechanism of
Gal- and Man-type ligand recognition must be different. Sequence
analysis reveals that of the Ca²⁺ site 2 ligands, positions
equivalent to Glu, Asn, and Asp
of MBP-A are highly conserved among C-type lectins regardless of
specificity. In contrast, positions 185 and 187 are found to be Glu and
Asn in Man-binding family members, whereas Gal-binding C-type lectins
have Gln and Asp at these positions. The Glu185 → Gln/Asn187 → Asp mutant of MBP-A, designated
"QPD", binds to galactose in preference to mannose by a
factor of 3 but with relatively low affinity for either
sugar (12). Position 189 of MBP-A (Fig. 1a) is
not conserved among Man-binding C-type lectins but is always either Trp
or Phe in Gal-binding family members. Replacement of His189
of MBP-A with Trp in the QPD mutant to make "QPDW"
gives a protein with affinity for Gal comparable with natural
Gal-binding C-type lectins but that still does not discriminate well
between Gal and Man (13). However, insertion of a glycine-rich
loop found in the major form of the rat asialoglycoprotein receptor,
rat hepatic lectin-1 (RHL-1), and other Gal-binding C-type lectins that
display strong discrimination against mannose results in a mutant
("QPDWG") with galactose affinity and selectivity
comparable with RHL-1 (13). The affinity for galactose is
comparable in QPDW and QPDWG, indicating that the determinants of
affinity and selectivity are somewhat distinct.
NMR measurements
reveal similar modes of galactose binding by QPDWG and
RHL-1 (13), demonstrating that galactose specificity in C-type
lectins is determined by a few residues and can be studied in the well
characterized MBP-A background. Here we describe the structure of a
trimeric fragment of QPDWG containing the neck and COOH-terminal CRD (14), both alone and complexed with -methyl galactoside
(MeGal) and N-acetylgalactosamine (GalNAc). The
structures reveal the molecular basis of selective galactose
recognition by C-type lectins. The structure of the QPDWG-GalNAc
complex is consistent with results of site-directed mutagenesis
experiments that have identified amino acid positions that contribute
to the preferential binding of GalNAc over Gal by certain C-type
lectins.
A trimeric fragment of QPDWG containing the neck and
COOH-terminal CRD (14) was crystallized, and the structure was
solved by molecular replacement, both alone and complexed with
MeGal and GalNAc (Table 1 and Table 2). The structures
were refined to resolutions of 2.0 Å or better (Table 1 and Table 2). Apart from the His189 → Trp change and
the glycine-rich insertion at the carbohydrate-binding site, the
structures of wild-type MBP-A and the QPDWG mutant are identical to
within the coordinate error. In particular, the Ca²⁺
site 2 ligands of the two structures superimpose, with the side
chain amide nitrogen of Gln185 and the carbonyl oxygen of
Asp187 of QPDWG in the same positions as the carbonyl oxygen
of Glu185 and the amide nitrogen of Asn187 in the
wild-type protein.
Despite the different stereochemistry of the 3-
and 4-OH groups, the mechanism of MeGal and GalNAc binding to
QPDWG is similar to that of Man-type ligands to wild-type MBPs, with
the full noncovalent bonding potential of 3- and 4-OH groups used for
Ca²⁺ coordination and hydrogen bond formation with
Ca²⁺ ligands (8, 9) (Fig. 1, a and c). However, maintenance of the pentagonal
bipyramidal Ca²⁺ coordination geometry forces the
pyranose ring into a very different orientation from that observed in
mannose binding to wild-type MBPs (8, 9). The apolar
patch formed by the 3, 4, 5, and 6 carbons of
MeGal and GalNAc
packs against the side chain of Trp189, an interaction
observed in all galactose-lectin interactions studied to date (23) (Fig. 1c). The angle between the least
squares plane through the pyranose ring of galactose and the plane of
the Trp189 indole ring falls within the range found in
other galactose-binding lectins (Table 3). This interaction is
especially noteworthy given that no aromatic residues interact with the
sugar ligand in Man-binding C-type lectins, which in fact make few
nonpolar contacts with the sugar ligand (8, 9).
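An interplanar angle of the kind compared in Table 3 can be estimated directly from atomic coordinates. The following is a minimal sketch in Python, assuming each ring is supplied as a list of (x, y, z) positions; the function names and the synthetic hexagon coordinates are illustrative only and are not taken from the deposited structures. The least squares plane normal is taken as the direction of smallest variance of the centered ring atoms, found here by power iteration on a shifted scatter matrix.

```python
import math

def _unit(v):
    """Scale a 3-vector to unit length (assumes it is not the zero vector)."""
    length = math.sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)

def _cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def plane_normal(points, iterations=100):
    """Unit normal of the least squares plane through a set of ring atoms."""
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    centered = [tuple(p[i] - centroid[i] for i in range(3)) for p in points]
    # 3x3 scatter matrix of the centered coordinates.
    scatter = [[sum(q[i] * q[j] for q in centered) for j in range(3)]
               for i in range(3)]
    trace = scatter[0][0] + scatter[1][1] + scatter[2][2]
    # trace*I - scatter has its largest eigenvalue along the plane normal,
    # so plain power iteration converges to the normal.
    shifted = [[(trace if i == j else 0.0) - scatter[i][j] for j in range(3)]
               for i in range(3)]
    # Starting guess: cross product of two centered atom vectors
    # (assumes the first two atoms are not collinear with the centroid).
    v = _unit(_cross(centered[0], centered[1]))
    for _ in range(iterations):
        v = _unit(tuple(sum(shifted[i][j] * v[j] for j in range(3))
                        for i in range(3)))
    return v

def interplanar_angle(ring_a, ring_b):
    """Angle in degrees (0 to 90) between the least squares planes of two rings."""
    na, nb = plane_normal(ring_a), plane_normal(ring_b)
    cos_angle = abs(sum(x * y for x, y in zip(na, nb)))
    return math.degrees(math.acos(min(1.0, cos_angle)))

# Synthetic example: a flat hexagon and a copy tilted about the x axis.
flat_ring = [(math.cos(math.radians(60 * k)), math.sin(math.radians(60 * k)), 0.0)
             for k in range(6)]
tilt = math.radians(30)
tilted_ring = [(x, y * math.cos(tilt), y * math.sin(tilt)) for x, y, _ in flat_ring]
print(interplanar_angle(flat_ring, tilted_ring))
```

With real data, the two lists would hold the six pyranose ring atoms and the indole ring atoms read from a coordinate file; the choice of which atoms define each plane is left to the user.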
Interaction of the Trp side chain with the Gly-rich
loop is critical to the selectivity of QPDWG for galactose and
discrimination against mannose. The loop is a rigid structure with a
somewhat unusual conformation (Fig. 2, a and b). His
tucks into the loop and stabilizes the
structure by forming hydrogen bonds with a main chain amide and a
carbonyl oxygen; Leu is on the outside of the loop and
packs against Ala, thereby holding the loop down against
the lower part of the protein (Fig. 2, a and b). The Cα
of Gly packs against Trp189 and holds it in a slightly unfavorable
rotamer (+60°). Modelling indicates that neither of the
most favored rotamers of Trp (±90°) can
be accommodated on the mutant protein, nor can other
rotamers. The Gly-rich loop thus serves as a
"doorstop" that prevents Trp189 from adopting a
more favorable conformation. Mutagenesis data show that changes in many
of the loop residues are tolerated with only small effects on galactose
selectivity (13) , consistent with the notion that the loop
serves as a rigid unit that restricts the conformation of Trp189
rather than providing specific interactions with the sugar or
other residues of the protein. Superposition of mannose bound as
observed in a Man oligosaccharide-MBP-A complex (8) (Fig. 1a) on QPDWG reveals that the
exocyclic C6 clashes with Trp189 (Fig. 2c).
Man-type ligands bind to the homologous MBP-C in an orientation
reversed 180° from that shown in Fig. 1a, such
that the positions of the 3- and 4-OH groups are exchanged (9), and
preliminary data indicate that MBP-A can also bind to monosaccharides
in this manner. In this orientation, the anomeric oxygen
in the
configuration sterically clashes with Trp189.
Thus, the position of Trp189 imposed by the Gly-rich loop
excludes Man-type ligands from the site and explains the essential role
of this loop in galactose selectivity.
Figure 2: Galactose selectivity imposed by the glycine-rich loop. a, ribbon diagram of wild-type and QPDWG mutant of MBP-A in the vicinity of the binding site. The Ca²⁺ 2 ligands, as well as the residue at position 189, are indicated along with the positions of Ca²⁺ 1 and 2 (spheres). The loop following residue 189, which differs between wild-type and QPDWG, is highlighted in black. The van der Waals contact between Leu of the Gly-rich loop and Ala is shown as a dashed line. b, stereo pair showing the detailed structure of the glycine-rich loop and the packing of Trp189 against Gly. Symbols are as described in the legend to Fig. 1. c, superposition of mannose observed in the wild-type MBP-A binding site on the mutant binding site. The steric clash between mannose and Trp189 is emphasized. The figure was made with MOLSCRIPT (26).
No significant differences in
the protein are observed between unliganded and sugar complex
structures. The sugar complexes were prepared using sugar
concentrations in approximately 100-fold excess over the K (13), and the average temperature
factors of the sugar and its liganding residues are quite similar.
Thus, the sugars appear to be fully occupied in the binding sites,
although the correlation of temperature factor and occupancy at the
resolutions used in this study precludes refinement of the sugar
occupancy. In two of the three crystallographically independent copies
(protomers 1 and 3; Table 2), the glycine-rich loops in both the
unliganded and complexed structures have similar temperature factors.
The glycine-rich loop of protomer 2 of each structure has consistently
higher temperature factors, most likely as a consequence of
participating in relatively few lattice contacts. The average
temperature factors of the loop in protomer 2 differ by approximately
25 Å² between the unliganded and complexed structures.
It is possible that lattice contacts immobilize the loop so that any
effect of sugar binding on loop mobility would be detectable only in
the copy with no lattice contacts. However, the entire protomer 2 of
the unliganded structure has significantly higher temperature factors
than the equivalent protomer in the complexed structures (Table 2), so it cannot be concluded that sugar binding
significantly affects the mobility of the binding site region.
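The expectation of essentially full occupancy at a 100-fold excess of sugar can be checked with a simple binding isotherm. The sketch below is an illustration only: it assumes independent 1:1 binding sites, treats the K quoted above as a dissociation-type constant, and assumes that the free sugar concentration equals the total concentration because the sugar is in large excess. The function name is hypothetical.

```python
def fractional_occupancy(ligand_conc, k_d):
    """Fraction of sites occupied for simple 1:1 binding: [L] / ([L] + K)."""
    if ligand_conc < 0 or k_d <= 0:
        raise ValueError("ligand_conc must be >= 0 and k_d must be > 0")
    return ligand_conc / (ligand_conc + k_d)

# Sugar concentration expressed in units of K (100-fold excess, as in the text).
occupancy = fractional_occupancy(100.0, 1.0)
print(f"{occupancy:.3f}")  # prints 0.990
```

Under these assumptions roughly 99% of the sites would be occupied, in line with the conclusion drawn from the temperature factors.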
The
QPDWG structures explain binding, mutagenesis, and spectroscopic data
obtained from several galactose-binding mutants of MBP-A. Proton NMR
spectra of MeGal in the presence of QPDWG show upfield shifts of
the H5, H6, and H6′ protons of Gal consistent with their interaction
with the delocalized π electron system of the Trp ring observed in
the crystal structure (13). The line widths of the aromatic
protons of Trp189 are broadened upon Gal binding to QPDW,
whereas they are broad in the absence or the presence of Gal in QPDWG,
consistent with the notion that the Gly-rich loop immobilizes
Trp189 in a position optimal for interaction with
Gal (13). The proteoglycan core protein CRDs have Phe instead
of Trp at position 189 and exhibit relatively poor selectivity against
Man-type ligands. The corresponding MBP-A mutant QPDFG, which includes
Phe189, binds to Gal-type ligands only 6-fold more strongly
than Man-type ligands, as opposed to the 40-fold selectivity for
Gal-type ligands shown by QPDWG (13). These properties are
explained by exclusion of Man by the 6-membered portion of the
Trp189 ring (Fig. 2c), which extends farther
out than the side chain of Phe.
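The 6-fold and 40-fold selectivities can be put on a free energy scale with the relation ΔΔG = RT ln(selectivity). The following sketch assumes that the selectivity ratios behave as ratios of equilibrium binding constants and assumes a temperature of 298.15 K, which is not stated in the text; the function name is illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def selectivity_to_ddg(fold, temperature_k=298.15):
    """Free energy difference (kcal/mol) for a fold-selectivity ratio."""
    if fold <= 0:
        raise ValueError("fold selectivity must be positive")
    return R_KCAL * temperature_k * math.log(fold)

# Gal-over-Man selectivities quoted in the text for the two mutants.
for name, fold in (("QPDFG", 6.0), ("QPDWG", 40.0)):
    print(name, round(selectivity_to_ddg(fold), 2), "kcal/mol")
```

Under these assumptions the two selectivities correspond to about 1.1 and 2.2 kcal/mol, so the Phe to Trp difference at position 189 would account for roughly 1 kcal/mol of discrimination against Man-type ligands.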
Several Gal-binding C-type lectins,
including RHL-1, display strong preference for GalNAc over Gal, whereas
others do not discriminate between these two sugars. An example of the
latter is the macrophage galactose receptor (MGR), and the QPDWG mutant
of MBP-A mimics MGR in this respect. Site-directed mutagenesis of MGR
based on sequence comparisons with the asialoglycoprotein receptors has
identified residues in four regions of the sequence that provide
selectivity for GalNAc over Gal (24). Of these regions, the
residue equivalent to Ser154 of MBP-A provides 20-fold of
the observed 60-fold selectivity for GalNAc over Gal by
RHL-1 (24). Moreover, a histidine equivalent to Thr
of QPDWG is found in both RHL-1 and MGR and must be present in
order to observe the enhancement provided by the residue at 154. The
structure of QPDWG complexed with GalNAc shows that the 2-acetamido
substituent is in the vicinity of Thr, which in turn lies
near Ser154 (Fig. 3), and is thus consistent with the
formation of a GalNAc-specific binding site by residues in these
positions in RHL-1.
Figure 3: N-Acetylgalactosamine binding to QPDWG. a, stereo pair of the final 2Fo - Fc electron density map in the binding site of the GalNAc-QPDWG complex, contoured at 1.2σ. b, stereo ribbon diagram (MOLSCRIPT (26)) showing location of GalNAc with respect to the CRD.
The present structures leave unclear how the
Glu185 → Gln/Asn187 → Asp differences
lead to specificity for Man- or Gal-type ligands. The residues in the
binding sites of wild-type and mutant MBP-A superimpose closely, so it
is not obvious why galactosides do not bind to wild-type MBPs in the
orientation observed in the present structures. Indeed, free galactose
binds to MBP-C through the 1- and 2-OH groups, emphasizing the
selectivity of the wild-type site for equatorial OH groups having the
same stereochemical arrangement as the 3- and 4-OH of
mannose (9). These OH groups are related by a 2-fold rotation
axis that bisects the pyranose ring and form hydrogen bonds with side
chain carbonyl oxygen and amide nitrogen atoms that conform
approximately to this symmetry in the wild-type site but not in the QPD
site (Fig. 4). Although the mechanism is not obvious, this
difference in symmetry may be related to the weaker affinity of QPD for
either Gal- or Man-type ligands (12). The absolute affinity of
wild-type MBP-A for Man is similar to that of QPDW or QPDWG for Gal,
which implies that the binding energy of Man to the wild-type
Ca²⁺ site is greater than that of Gal to the QPD mutant
site. Thus the favorable interaction with the aromatic residue at
position 189 can be viewed as compensating for the loss of symmetry in
the mutant site to provide affinity for Gal comparable with that of
wild-type MBP-A for Man. In the absence of the glycine-rich loop,
mannose is not excluded from the site but interacts with lower affinity
due to the asymmetric arrangement of its hydrogen-bonding partners.
Figure 4: Symmetry of hydrogen bonding partners in wild-type and mutant MBP-As. Arrangement of the four side chain groups that form hydrogen bonds with the 3- and 4-OH groups of the sugar ligand in wild-type MBP with Man and QPDWG mutant with Gal. The symbols are as described in the legend to Fig. 1. The figure was made with MOLSCRIPT (26).
Another potential source of the different specificities of wild-type
and QPD sites is the displacement of ordered water molecules upon sugar
binding. High resolution structures of MBP-C show that the 3- and 4-OH
of Man-type ligands replace two water molecules that form the same set
of hydrogen and Ca²⁺ coordination bonds (9).
Unfortunately, the amount of visible, ordered water structure in the
uncomplexed QPDWG site varies among the three crystallographically
independent copies, making it difficult to draw firm conclusions. In
the best ordered site, two water molecules that form hydrogen bonds
with the Ca²⁺ site 2 ligands at 185, 187, 198, and 210
equivalent to those formed by Gal can be discerned. These water
molecules are in approximately the same position as the 3- and 4-OH
groups of Gal but only one of them appears to be close enough to
Ca²⁺ 2 to form a coordination bond. Only one water
molecule is observed in another copy, and no water molecules can be
placed with confidence in the third site. Higher resolution structures
of the uncomplexed QPDWG site will be required to assess whether or not
there is a change in Ca²⁺ coordination number upon
ligand binding.
The different locations of the bound pyranose ring
seen in the present structures and the structures of wild-type MBPs
complexed with Man-type ligands are a consequence of Ca²⁺ coordination geometry. This observation and the fact that few
other contacts are made with the protein demonstrate the dominant role
that Ca²⁺ coordination plays in sugar recognition by
C-type lectins. The different pyranose ring locations dictated by
Ca²⁺ coordination geometry form the basis of selective
recognition of galactose by steric exclusion of Man-type ligands
provided by Trp189 and the glycine-rich loop.
The atomic coordinates and structure factors (codes 1AFA
(MeGal complex), 1AFB (GalNAc complex), and 1AFD (unliganded
QPDWG)) have been deposited in the Protein Data Bank, Brookhaven
National Laboratory, Upton, NY.