Galectins are a family of proteins first identified as galactoside-binding lectins in extracts of vertebrate tissue (Barondes et al., 1994a; Various authors, 1997). Sequencing of such proteins isolated from amphibians, birds, fish, and mammals revealed extensive sequence similarity, and in 1994, the galectin family was formally defined (Barondes et al., 1994b) on the basis of both shared sequence and galactoside binding. Four human galectins, which had been discovered in various contexts, and which bore multiple names, were renamed galectin-1 through -4. Since that time, advances in molecular genetics have led to the identification of many new members of the galectin gene family, which have been discovered on the basis of sequence similarity. Here we call attention to these newly established galectins and to additional genes, whose sequence suggests that they too are galectin family members.
All galectins share a core sequence consisting of about 130 amino acids, many of which are highly conserved. Crystallography has been used to determine the structure of several galectins, most recently for galectin-7 (Leonidas et al., 1998). The portion of the core sequence which represents the carbohydrate recognition domain (CRD) (Figure
Since the formal naming of the galectin family, seven more mammalian galectins (-5 through -11) have been discovered (Table I), sharing the basic structural features and galactoside binding of the original four. Four of these have one CRD (Ackerman et al., 1993; Gitt et al., 1995; Magnaldo et al., 1995; Madsen et al., 1995; Ogden et al., 1998), and the other three have two tandem CRDs (Hadari et al., 1995; Leal-Pinto et al., 1997; Tureci et al., 1997; Wada and Kanwar, 1997; Gitt et al., 1998; Matsumoto et al., 1998) like galectin-4. Unlike the original members of this family, which were discovered on the basis of their lectin activity, the new galectins have primarily been identified in other ways, some of them independently in several contexts. Only when sequenced were they found to be members of the galectin family.
Galectin family features
Using search algorithms based on the structure of these known galectins, we have screened the GenBank databases and identified seven additional mammalian candidates for membership in this family (Figure
Figure 1. Amino acid sequences of galectin CRDs are aligned to maximize sequence similarity, allowing variation in the size of some loops between [beta] strands. Highly conserved residues (specific amino acids or certain limited sets of similar amino acids) are capitalized with the most highly conserved residues also shaded. Beta strand positions are indicated at the top, with residues known to directly interact with carbohydrate marked by asterisks. Candidate galectin genes (i.e., not yet shown to bind galactosides) are indicated by an asterisk preceding their GenBank accession number. The mammalian galectins and candidates are all human, except for rat galectin-5 and mouse galectin-6. Further documentation for each of these genes is available at URL: http://www.sacs.ucsf.edu/home/cooper/galectins.htm.
Similar hunting for novel galectins in other genomes is also productive. In the worm, Caenorhabditis elegans, two galectins have been isolated and shown to bind galactosides (Hirabayashi et al., 1996; Arata et al., 1997). By searching the GenBank databases we have identified 26 more candidate galectin genes (not shown) for a tentative total of 28 galectins among the ~20,000 genes in the C. elegans genome. Candidate galectins are also apparent in the genomes of other important model organisms (Table II), including Drosophila (LP06039), zebrafish (AI384777 and G47571), and Arabidopsis (AC000348, T7N9.14). The galectin-like sequence in Arabidopsis represents the first evidence for galectins in plants, where the whole class of lectin proteins was first discovered. Candidate galectin genes are even evident in two viruses, an adenovirus (Perillo et al., 1998; U25120) and a lymphocystis disease virus (L63545, 26549-27313 = 053R).
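The kind of sequence-based screen described above can be illustrated with a minimal sketch. It is not the search procedure used for this survey, which is not specified in detail here. The signature below (an H-x-N-P-R motif followed at some distance by W-x-x-E-x-R) is a simplified assumption chosen for illustration, the function name `find_candidates` is hypothetical, and the two toy sequences are made up rather than taken from GenBank.

```python
import re

# ASSUMPTION: a deliberately simplified CRD-like signature, not taken from
# the article and far less sensitive than a real profile or alignment search.
CRD_SIGNATURE = re.compile(r"H.NPR.{8,40}W..E.R")


def find_candidates(records):
    """Return (name, start, end) for each sequence that matches the signature.

    `records` maps a sequence name to a one-letter amino acid string.
    `start` and `end` are 0-based, end-exclusive positions of the match.
    """
    hits = []
    for name, seq in records.items():
        match = CRD_SIGNATURE.search(seq.upper())
        if match:
            hits.append((name, match.start(), match.end()))
    return hits


# Made-up toy sequences: the first carries both motifs, the second lacks
# the W..E.R half of the signature.
toy_records = {
    "toy_hit": "MAAA" + "HFNPRF" + "A" * 12 + "WGTEQR" + "KK",
    "toy_miss": "MAAA" + "HFNPRF" + "A" * 12 + "WGTAQA" + "KK",
}

if __name__ == "__main__":
    for name, start, end in find_candidates(toy_records):
        print(f"{name}: signature at {start}-{end}")
```

A pattern match of this sort only flags candidates. As the text stresses, a candidate becomes an established galectin only once galactoside binding has been shown.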
Table I.
Galectin | Gene | Message | Structure
1 | 22q13.1: gene J05303; cosmid Z83844*: 49650-49659, 50934-51010, 52430-52603, 53552-53695 | Common; Unigene HS.129924 | 1 CRD, dimer
2 | 22q13.1: gene M87860; cosmid AL022315*: 86175-86169, 78200-78120, 77006-76846, 76682-76536 | Intestine; Unigene HS.13987; ESTs: gall bladder, kidney | 1 CRD, dimer
3 | 14q21.3: gene AF 031421-5; sts G22378 | Common; Unigene HS.621 | 1 CRD + N-term. repeat
4 | 19q13.1-13.3 | Intestine, esp. colon; Unigene HS.5302 | 2 CRDs
5 | Human not yet identified; rat gene L36862 | Rat erythrocytes | 1 CRD, monomer
6 | Human not yet identified; mouse gene tandem to gene for galectin-4 | Intestine | 2 CRDs
7 | 19q13.1; sts G38734 | Stratified epithelia; Unigene HS.99923 | 1 CRD, monomer
8 | 1q41-44; sts G22174 | Common; Unigene HS.4082 | 2 CRDs
9 | Human not yet mapped, mouse chr. 11 sts Z36627 | Lymphocytes, intestine specific isoform; Unigene HS.81337 | 2 CRDs
10 | 19q13.1: gene U68398; cosmid AC005393: 17049-17035, 14171-14095, 13591-13381, 10598-10473 | Granulocytes; Unigene HS.889 | 1 CRD, dimer
Candidate | Gene | Message | Structure
GRIFIN | 7; cosmid AC004840: 118481-118470, 118100-118018, 117926-117768, 117407-117241, 116973-116985 | Rat lens | 1 CRD, dimer
AA311108 | 11q23; cosmid U73641: 28867-28853, 28206-28120, 24394-24242, 23714-23568 | ESTs: breast, infant adrenal, Jurkat T-cell | 1 CRD
H50956 | 11q23; cosmid U73641: 33498-33427, 31401-31315, 31181-30963, 30179-30060, 29969-29972 | ESTs: spleen, gastric carcinoma, Jurkat T-cell | 1 CRD
N90645 | 2p; sts G30627 | Unigene HS.114771; ESTs: fetal heart, fetal liver/spleen | 1 CRD?
N30757 | 19q13.1; cosmid AC006133: ????-8278, 10331-10398, 10913-11123, 12932-13303 | Unigene HS.24236; ESTs: placenta, aorta, embryo, fetal liver/spleen | 1 CRD
R31311 | 19q13.1; cosmid AC005205: 14526-14541, 16559-16633, 17136-17348, 19180-19293 | Unigene HS.23671; ESTs: placenta, fetal liver/spleen | 1 CRD
AI138230 | 19q13.1; cosmid AC005515: 3464-3468, 3925-4001, 4502-4712, 6437-6559 | Unigene HS.143557; EST AI138230, placenta; EST AI148582, placenta | 1 CRD
AC005515-II | 19q13.1; cosmid AC005515: 26891-26896, 27476-27551, 28040-28172, 30073-30186; mouse gene U67985 | | 1 CRD
N40740 | 19q13.1; cosmid AC005176: 22584-22658, 23164-23374; cosmid AC005515: 1866-18989 | Unigene HS.146477; EST N40740, placenta; EST AI128445, placenta | 1 interrupted CRD
AC000052 | 22q11.2; cosmid AC000052*: 41727-42113 | | 1 interrupted CRD
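The coordinate ranges listed under each cosmid in Table I lend themselves to a quick consistency check. The sketch below assumes that each range is an inclusive pair of positions on the named cosmid and that a descending range (start greater than end) indicates the reverse strand. Both readings are assumptions, since the table itself does not say so, and the helper name `summarize_segments` is hypothetical. The input values are the ranges given for galectin-1 (cosmid Z83844) and galectin-2 (cosmid AL022315).

```python
def summarize_segments(ranges):
    """Return (strand, lengths, total) for a list of 'start-end' strings.

    ASSUMPTIONS: coordinates are inclusive, and start > end means the
    segment lies on the reverse strand of the cosmid.
    """
    pairs = [tuple(int(x) for x in r.split("-")) for r in ranges]
    reverse = [start > end for start, end in pairs]
    if all(reverse):
        strand = "-"
    elif not any(reverse):
        strand = "+"
    else:
        strand = "mixed"
    # Inclusive length of each segment, independent of orientation.
    lengths = [abs(end - start) + 1 for start, end in pairs]
    return strand, lengths, sum(lengths)


# Ranges copied from Table I.
GALECTIN_1 = ["49650-49659", "50934-51010", "52430-52603", "53552-53695"]
GALECTIN_2 = ["86175-86169", "78200-78120", "77006-76846", "76682-76536"]

if __name__ == "__main__":
    for label, ranges in (("galectin-1", GALECTIN_1), ("galectin-2", GALECTIN_2)):
        strand, lengths, total = summarize_segments(ranges)
        print(label, strand, lengths, total, total / 3)
```

Under these assumptions the galectin-1 segments total 405 bases on the forward strand and the galectin-2 segments total 396 bases on the reverse strand. Both totals are multiples of three (135 and 132 codons), which is in line with the core of about 130 amino acids described earlier.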
Table II.
Species | Galectin | Gene or message | Structure
Amphibian | | |
Xenopus laevis | | M88105 | 1 CRD, dimer
Bufo arenarum | | P56217 | 1 CRD, dimer
Fish | | |
Electrophorus electricus | | A28302 | 1 CRD, dimer
Conger myriaster | Con I | AB010276 | 1 CRD, dimer
 | Con II | AB010277 | 1 CRD, dimer
Danio rerio | | *G47571 | 1 CRD
 | Gal-3 ? | *AI384777 | 1 CRD + N-terminal Pro/Gly rich repeat
Fugu rubripes | | *FRa073apsB12 | 1 CRD
 | | *FRa075apsH3 | 1 CRD
 | Gal-3 ? | *AL020875: 199D14, 111H07 | 1 CRD + N-terminal Pro/Gly rich repeat
Bird | | |
Gallus gallus | C-14 | D00308-11, M11674 | 1 CRD, monomer
 | C-16 | M57240 | 1 CRD, dimer
 | Gal-3 ? | *U50339 | 1 CRD + N-terminal Pro/Gly rich repeat
Insect | | |
Anopheles gambiae | | Z69982 | 1 CRD
Drosophila melanogaster | | *LP06039.5prime | 1 CRD
Worm | | |
Caenorhabditis elegans | 16 | D63575; cosmid Y55B1 on ch.III-28-26 | 1 CRD, dimer
 | 32 | AB000802; cosmid Z82081, ch. II 7753-7767, alt. splices = W09H1.6a,6b =yk148a7.5, yk469b7.5 | 2 CRDs
Sponge | | |
Geodia cydonium | I | X93925 | 1 CRD, multimer
 | II | X70849 | 1 CRD, multimer
Fungus | | |
Coprinus cinerea | CGL-I | L03301 | 1 CRD, dimer
 | CGL-II | U64676 | 1 CRD, dimer
Virus | | |
Mastadenovirus | | *U25120 | 2 CRDs
Lymphocystis disease | | *L63545 26549-27313 = 053R | 1 CRD
Plant | | |
Arabidopsis thaliana | | *AC000348: T7N9.14 | 1 CRD
The presence of galectins in so many evolutionarily divergent species suggests that they participate in basic cellular functions. On the other hand, the evidence that there may be dozens of galectins within a single species suggests that they have evolved to participate in a variety of more specific functions. Indeed, there is abundant evidence that members of this family interact with glycoconjugates on or around cells and influence adhesion, migration, chemotaxis, proliferation, apoptosis, and neurite elongation (Barondes et al., 1994a; Puche et al., 1996; Hughes, 1997; Various authors, 1997; Matsumoto et al., 1998).
Even a single galectin can apparently affect cells in a variety of ways depending on the cell type and circumstances. For instance, galectin-1 can either stimulate or inhibit cell proliferation (Wells and Mallucci, 1991; Adams et al., 1996; Yamaoka et al., 1996) and can either stimulate or inhibit cell adhesion to extracellular matrix (Cooper et al., 1991; Van Den Brule et al., 1995). There is also evidence that galectins can simultaneously have distinct intracellular and extracellular functions. For instance, both galectin-1 and galectin-3 have been implicated in pre-mRNA splicing (Dagher et al., 1995; Vyakarnam et al., 1997).
Several recent studies have focused attention on possible galectin functions in regulating immune responses. For example, it has been found that galectin-1 or galectin-9 can induce apoptosis of activated T-cells by binding to cell surface oligosaccharides (Perillo et al., 1995; Wada et al., 1997; Allione et al., 1998; Vespa et al., 1998; Rabinovich et al., 1998; Novelli et al., 1999), galectin-3 can activate neutrophils (Yamaoka et al., 1995; Karlsson et al., 1998), and galectin-9 is a potent and specific chemoattractant for eosinophils (Matsumoto et al., 1998).
Another approach to studying galectin function is to knock out expression of individual galectin genes. Mice lacking galectin-1 have so far been shown to have intriguing deficits in olfactory axon pathfinding (Puche et al., 1996), and mice lacking galectin-3 have abnormalities in neutrophil accumulation during inflammation (Colnot et al., 1998). Although this approach can fail to detect normal biological functions of the missing protein, apparently because many functions can be performed by alternative or redundant systems, these initial positive results are very encouraging for further analysis of these mice and of mice engineered to lack other galectin family members.
This work was supported in part by Grant R01-HL56199 from the USPHS to D.N.W.C.
Last modification: 14 Oct 1999
Copyright©Oxford University Press, 1999.