(Received for publication, October 17, 1994; and in revised form, December 7, 1994)
A protein of 35 kDa which has the characteristic properties of
galectins (S-type lectins) was cloned from a rat liver cDNA expression
library. Since names for galectins 1-7 were already assigned,
this new protein was named galectin-8. Three lines of evidence
demonstrate that galectin-8 is indeed a novel galectin: (i) its deduced
amino acid sequence contains two domains with conserved motifs that are
implicated in the carbohydrate binding of galectins, (ii) in vitro translation products of galectin-8 cDNA or bacterially expressed
recombinant galectin-8 are biologically active and possess sugar
binding and hemagglutination activity, and (iii) a protein of the
expected size (34 kDa) that binds to lactosyl-Sepharose and reacts with
galectin-8-specific antibodies is present in rat liver and comprises
0.025% of the total Triton X-100-soluble hepatic proteins.
Overall, galectin-8 is structurally related (34% identity) to
galectin-4, a soluble rat galectin with two carbohydrate-binding
domains in the same polypeptide chain, joined by a link peptide.
Nonetheless, several important features distinguish these two
galectins: (i) Northern blot analysis revealed that, unlike galectin-4,
which is confined to the intestine and stomach, galectin-8 is expressed
in liver, kidney, cardiac muscle, lung, and brain; (ii) unlike
galectin-4, but similar to galectins-1 and -2, galectin-8 contains 4
Cys residues; (iii) the link peptide of galectin-8 is unique and bears
no similarity to any known protein; (iv) the N-terminal
carbohydrate-binding region of galectin-8 contains a unique WG-E-I
motif instead of the consensus WG-E-R/K motif implicated as playing an
essential role in sugar-binding of all galectins. Together with
galectin-4, galectin-8 therefore represents a subfamily of galectins
consisting of a tandem repeat of structurally different carbohydrate
recognition domains within a single polypeptide chain.
Lectins are involved in a wide variety of cellular functions,
many of which are related to their only common feature, the ability to
bind carbohydrates specifically and reversibly and to agglutinate cells
(reviewed in (1) and (2)). Animal lectins are
classified as C-type lectins, which are Ca²⁺-dependent and
are structurally related to the asialoglycoprotein receptor, and
galectins, previously known as S-type lectins, which are
thiol-dependent and specifically bind β-galactoside residues. In
mammals, four galectin types have been sequenced and characterized, and
there is evidence for the existence of other
relatives (3, 4). All known members of this family
lack a signal peptide (5), are found in the cytosol, and are
isolated as soluble proteins. However, there is evidence that some
members are externalized by an atypical secretory
mechanism (5, 6, 7).
To be classified as a galectin, a protein must fulfill two criteria:
affinity for β-galactosides and significant sequence similarity in the
carbohydrate recognition domain (CRD) (8, 9), the relevant amino acid
residues of which have been determined by x-ray
crystallography (10, 11). Galectin-1 and -2 are
homodimers, with subunit molecular mass of 14 kDa, that are not
subjected to post-translational
modifications (12, 13). Galectin-1 is found in the
extracellular matrix (14, 15) and has been shown to
interact with laminin (16, 17). The function of
galectin-1 and -2 is not yet fully understood, although there is
evidence that they might be involved in regulation of cell
growth (18, 19), cell adhesion (17), cell
transformation (20), and embryogenesis (21).
Larger galectins, such as galectin-3 (previously known as CBP-35, Mac-2, and RL-29), also exist ((22) and references therein). These are monomeric 29-35-kDa mosaic proteins, composed of an N-terminal half made of tandem repeats characteristic of the collagen gene superfamily and a C-terminal half homologous to galectin-1 and -2 (22). Galectin-3 also binds laminin (23) and is implicated as a component of growth regulatory systems (24), a mediator of cell-cell and cell-matrix interactions (2, 25), a modulator of the immune response (26), a marker of neoplastic transformation, and an indicator of the metastatic potential of melanoma cells (27).
Galectin-4 was cloned from rat
intestine (28), and a homologous protein was cloned from
a nematode (29). Galectin-4 is a monomer with a molecular mass of
36 kDa. It contains tandem domains of 140 amino acids each,
homologous to galectin-1 and -2, that are separated by a link region (28). The function of galectin-4 is presently unknown. Here we
describe the cloning of a cDNA encoding a novel protein that we
term galectin-8. Galectin-8 has the characteristic properties of other
galectins (3, 4), and it is structurally related (34%
identity) to rat galectin-4.
To express GST-galectin-8, bacteria were cultured in 0.5 liter of LB
medium until the absorbance at 600 nm was 0.5. Expression of
GST-galectin-8 was then induced with 5 mM isopropyl-1-thio-β-D-galactopyranoside for 4 h. To
isolate the recombinant protein, a bacterial pellet was collected by
centrifugation, resuspended in 30 ml of buffer I (phosphate-buffered
saline containing 4 mM β-mercaptoethanol, 2 mM EDTA, 10 µg/ml soybean
trypsin inhibitor, 2 mM benzamidine, and 1 mM phenylmethylsulfonyl fluoride, pH
7.5), and lysed by sonication. Debris was removed by centrifugation at
38,000 × g at 4 °C for 45 min, and 30 ml of the
soluble extract were passed over 5 ml of lactosyl-Sepharose. Unbound
proteins were washed off with buffer I, and the lectin was subsequently
eluted with buffer I containing 100 mM lactose. A similar
procedure was used to express r-galectin-8 from the pET-3a expression
plasmid, except that the bacteria were harvested by centrifugation when the
absorbance at 600 nm was 0.3, without addition of
isopropyl-1-thio-β-D-galactopyranoside. Recombinant
galectin-8 was isolated under reducing conditions, since in their
absence the protein underwent denaturation even when maintained at 4 °C.
Figure 1: cDNA sequence of galectin-8 and deduced protein sequence. The cDNA sequence of 1247 base pairs contains an open reading frame spanning nucleotides 121 to 1069, which encodes a protein of 316 amino acids.
Figure 2:
Galectin-8 cDNA encodes a galectin with two
homologous carbohydrate-binding regions. A schematic structure of
galectin-8 is presented (top). Each box represents a
putative carbohydrate-binding domain, linked by a 32-amino acid long
peptide. Shown are invariant amino acids preserved in most galectins
analyzed so far. The Arg residue, indispensable for sugar
binding(9) , located at the C-terminal CRD, and its
corresponding Ile residue, localized to the N-terminal CRD, are shown
in bold. Amino acid sequences of different galectins are
presented for comparison (bottom). These include human
galectin-1 (Galec-1) (44), human galectin-2 (Galec-2) (45), the carbohydrate-binding domain (amino
acids 128-263) of rat galectin-3 (Galec-3) (26),
N-terminal (Galec-4-Nt) and C-terminal (Galec-4-Ct)
halves of galectin-4 (28), N-terminal (CE-Nt) and
C-terminal (CE-Ct) halves of a 32-kDa
β-galactoside-binding protein from C.
elegans (29), and N-terminal (Galec-8-Nt) and
C-terminal (Galec-8-Ct) halves of galectin-8. Residues with
shared identity are boxed. Residues with shared similarity are shaded.
Figure 3: Northern blot analysis of RNA from rat tissues probed with galectin-8 cDNA. Top, 30 µg of total RNA from the indicated tissues was electrophoresed, blotted, and probed with labeled galectin-8 PCR product as described under ``Experimental Procedures.'' The migration positions of the 18 S and 28 S rRNAs are marked. Bottom, the same blot was stripped and reprobed with cDNA encoding glyceraldehyde-3-phosphate dehydrogenase (GAPDH).
Figure 4:
Immunoprecipitation of the in vitro translation product of galectin-8 by lp-lec8 antibodies. Fifty
µl of the ³⁵S-labeled galectin-8, expressed as an in
vitro translation product (see ``Experimental
Procedures''), were immunoprecipitated by lp-lec8 antibodies as
described under ``Experimental Procedures.'' Five µl of
the total ³⁵S-labeled galectin-8 (Total), 5 µl
of the fraction not precipitated by the antibodies (Sup), and
50 µl of the immunoprecipitated fraction (IP) were
subjected to 12% SDS-PAGE and autoradiography.
In a different approach, tag-free r-galectin-8 was expressed from a pET-3a expression plasmid (Novagen) in the pLysS bacterial host. Unlike recombinant intestinal galectin-4, which precipitates and cannot be extracted with buffers that preserve its lectin activity (28), r-galectin-8 could be readily extracted from bacteria in a soluble form. r-galectin-8 was not subjected to major proteolytic cleavage, as it migrated at the expected size of 34 kDa. Most important, r-galectin-8 retained its sugar-binding activity, and 1.2 mg of protein/liter of bacteria were obtained following its purification over a lactosyl-Sepharose column (Fig. 5).
Figure 5: Binding of tag-free r-galectin-8 to lactosyl-Sepharose. Tag-free r-galectin-8 was expressed in pLysS as described under ``Experimental Procedures.'' After centrifugation, 30 ml of the soluble bacterial proteins were purified over 5 ml of lactosyl-Sepharose. r-galectin-8 was eluted with 100 mM lactose in buffer I, and 1-ml fractions were collected. Ten µl of the total and effluent fractions and 50 µl from each elution fraction were resolved by 12% SDS-PAGE, transferred to nitrocellulose, and Western immunoblotted with lp-lec8 antibodies.
Figure 6:
Binding of rat hepatic galectin-8 to
lactosyl-Sepharose. Five g of rat liver were homogenized in buffer I as
described under ``Experimental Procedures,'' and cytosolic
extracts (25 ml) were applied over 5 ml of lactosyl-Sepharose. After
extensive washing, the bound proteins were eluted with 100 mM lactose in buffer I. One-ml fractions were collected and frozen
for a period of 16 h at -20 °C. Eluted fractions were thawed
and centrifuged for 15 min at 12,000 × g, and the
pellets were resuspended in 50 µl of sample buffer (34).
Ten µg of protein of total (A) and effluent (B)
fractions as well as 50 µl of the supernatant (C) and
resuspended pellet (D) of the eluted fractions (nos.
3-5) were resolved by 12% SDS-PAGE, transferred to
nitrocellulose, and Western immunoblotted with lp-lec8 antibodies (top) or subjected to Coomassie staining (bottom).
To estimate the amounts of galectin-8 in rat liver, Triton X-100-soluble
liver extracts were prepared and resolved by SDS-PAGE. Known
amounts of recombinant galectin-8 were run in parallel. All samples
were then subjected to Western immunoblotting, using anti-galectin-8
antibodies. Assuming that the immunoreactivities of r-galectin-8 and the
endogenous hepatic protein are comparable, we calculated that 25
ng of galectin-8 are present in 100 µg of Triton X-100-soluble
liver extracts. These findings suggest that galectin-8 comprises
0.025% of total Triton X-100-soluble hepatic proteins.
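The conversion from the immunoblot estimate to a percentage is a simple unit calculation. The sketch below only restates that arithmetic; the function name `percent_of_total` is illustrative and not part of the original work.

```python
def percent_of_total(target_ng: float, total_ug: float) -> float:
    """Return the target protein as a percentage of total protein by mass.

    target_ng: mass of the protein of interest, in nanograms
    total_ug:  mass of total extract protein, in micrograms
    """
    total_ng = total_ug * 1000.0  # 1 microgram = 1000 nanograms
    return 100.0 * target_ng / total_ng


# Values reported in the text: 25 ng of galectin-8 per 100 µg of extract.
abundance = percent_of_total(target_ng=25.0, total_ug=100.0)
print(f"galectin-8: {abundance:.3f}% of Triton X-100-soluble protein")
```

With the reported values this gives 0.025%, in agreement with the figure stated above. The estimate is only as good as the stated assumption that the recombinant standard and the endogenous protein react equally with the antibodies.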
In the present study we describe a novel, widely expressed protein with key features of mammalian galectins. Since the names galectins 1-7 were already assigned (3, 4), we termed this new protein galectin-8. Three lines of evidence support the notion that galectin-8 has the characteristics of other mammalian galectins: (i) its deduced amino acid sequence contains two domains with conserved motifs that were implicated in carbohydrate binding of galectins; (ii) in vitro translation products of galectin-8 cDNA or recombinant galectin-8 retain biological activity, in that they bind specifically to a column of lactosyl-Sepharose and possess hemagglutination activity; and (iii) a protein of the expected size (34 kDa) that binds to lactosyl-Sepharose and reacts with antibodies directed against a unique sequence of galectin-8 is present in rat liver.
Galectin-8 was cloned when a λ-ZAP rat liver cDNA library
was screened with affinity-purified antibodies directed against a
14-amino acid peptide located at the C-terminal end of
IRS-1 (39). Since galectin-8 bears no sequence similarity
either to IRS-1 or to the peptide used as immunogen, we suspect that
the reactivity toward IRS-1 antibodies was due to a false positive
reaction. This conclusion is supported by the fact that the
anti-peptide antibodies used for screening failed to react with
purified recombinant galectin-8 either by immunoprecipitation
or by immunoblotting.
The primary structure of galectin-8
resembles that of galectin-4, namely, two homologous (38% identity)
carbohydrate-binding regions (CRDs) linked by a short 30-amino
acid linking peptide. This architecture is shared so far only by
two galectins: rat galectin-4 (28) and its C. elegans homologue (29). Other galectin types that contain a single
CRD exist and function as noncovalent dimers, which provides them with
the potential to aggregate or agglutinate glycoconjugates (37).
Since galectin-4 exists as a monomer (28), it remains to be
determined whether galectin-8 shares a similar property. Hepatic
galectin-8 (Fig. 6) has a mobility on SDS-PAGE similar to that of its
recombinant counterpart (Fig. 5). This suggests, although it does not
prove, that hepatic galectin-8 is neither heavily glycosylated nor
subjected to extensive post-translational modifications (e.g. phosphorylation).
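The agreement between observed mobility and the size expected from the sequence can be checked with a back-of-the-envelope estimate. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from this paper; the function name is illustrative, and the result is only a rough guide, since an exact mass would require the actual amino acid composition.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumption: rule-of-thumb average residue mass


def estimate_mass_kda(n_residues: int,
                      avg_residue_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Rough polypeptide mass in kDa from its length in residues."""
    return n_residues * avg_residue_da / 1000.0


# Length of the deduced galectin-8 protein reported in Fig. 1.
predicted_kda = estimate_mass_kda(316)
print(f"predicted mass: about {predicted_kda:.1f} kDa")
```

For 316 residues the estimate is close to 35 kDa, which is in the range of the 34-35 kDa observed on SDS-PAGE and is compatible with the absence of bulky modifications.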
Although galectin-8 contains two putative
CRDs, potential differences in sugar binding between the domains are
predicted from a critical difference in their sequences (WG-E-I versus WG-E-R at the N- and C-terminal CRDs of
galectin-8, respectively (cf. Fig. 2)). The (bold) Arg
residue has been implicated as playing an important role in the
interactions between galectins and the glucose moiety of
lactose (10). Furthermore, site-directed mutagenesis studies (9) indicate that this conserved Arg is indispensable for sugar
binding. The presence of Ile (instead of Arg) at the
N-terminal CRD of galectin-8 suggests that this domain might have a
different sugar-binding specificity. In that respect galectin-8
resembles galectin-4, whose CRDs are distinct both in structure and
in sugar-binding specificity (28). Galectin-8 also resembles a
C-type lectin that functions as the macrophage mannose receptor (40) and contains eight domains with sequence homology to other
C-type CRDs, of which only one has mannose-binding activity by
itself when expressed in isolation (2). The presence of two
CRDs with potentially different sugar-binding specificities might be
required to achieve high affinity binding to multivalent glycoprotein
ligands possessing different sugar moieties.
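The motif comparison can be expressed as a simple pattern search. The sketch below is illustrative only: it assumes that each hyphen in WG-E-R/K stands for exactly one unspecified residue, and the two test sequences are hypothetical toy strings, not the galectin-8 sequence.

```python
import re

# Assumption: each hyphen in the written motif is one arbitrary residue.
CONSENSUS_MOTIF = re.compile(r"WG.E.[RK]")  # WG-E-R/K
VARIANT_MOTIF = re.compile(r"WG.E.I")       # WG-E-I


def classify_crd(sequence: str) -> str:
    """Label a CRD sequence by which form of the motif it contains."""
    if CONSENSUS_MOTIF.search(sequence):
        return "consensus"
    if VARIANT_MOTIF.search(sequence):
        return "variant"
    return "none"


# Hypothetical fragments for demonstration only.
toy_c_terminal = "AAWGAEERAA"
toy_n_terminal = "AAWGAEEIAA"
print(classify_crd(toy_c_terminal), classify_crd(toy_n_terminal))
```

Applied to real sequences, a search of this kind would flag the N-terminal CRD of galectin-8 as carrying the Ile variant and the C-terminal CRD as carrying the consensus Arg, provided the spacing assumption holds.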
Like other galectins (5), galectin-8 lacks a classical signal sequence or a transmembrane segment. Indeed, galectin-8 was isolated from the cytosolic fraction of rat liver. These findings do not exclude the possibility that galectin-8, like other galectins (5, 6, 7), could be externalized by an atypical secretory mechanism. Immunohistochemical studies revealed that secreted galectins are concentrated in evaginations of the plasma membrane, which pinch off to form labile lectin-rich extracellular vesicles that may interact with cell surface proteins (17, 41). This property of galectins is not unique, as other cytoplasmic proteins, such as thymosin, interleukin-1 (42), and fibroblast growth factor (43), lack a signal sequence, yet are externalized and function extracellularly. Expression of galectin-8 seems to be developmentally regulated. Very low levels of expression were noted in whole embryos, while high levels of expression were noted in adult tissues. In that respect galectin-8 might resemble other galectins that were implicated as regulators of cell growth (18, 19) and embryogenesis (21).
Galectin-8 is a novel member of the rapidly growing family of galectins. Although its overall organization resembles that of galectin-4, several important features distinguish the two mammalian proteins. First, unlike galectin-4, which is specifically expressed in intestine and stomach, galectin-8 is expressed in several tissues including lung, liver, kidney, brain, hind limb, and cardiac muscle. Second, unlike galectin-4 (28, 29), but similar to galectins-1 and -2, galectin-8 contains 4 Cys residues. Third, the link peptide of galectin-8 is unique; it bears no similarity to the link peptide of galectin-4 or to the proline- and glycine-rich sequences within the N-terminal half of galectin-3, which contain Tyr or Phe residues at similar intervals (28). Fourth, the N-terminal CRD of galectin-8 contains a unique WG-E-I motif instead of the consensus WG-E-R/K motif present in both CRDs of galectin-4. Hence, galectin-8 may represent a new ubiquitous subfamily of galectins, consisting of a tandem repeat of CRDs joined by a linking peptide. Further studies are required to unravel the function of this newcomer to the galectin family.
The nucleotide sequence reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number U09824.