Fucosyltransferases can add fucose in [alpha]-2-linkage to the terminal galactose (EC 2.4.1.69), in [alpha]-3-linkage (EC 2.4.1.152) or in both [alpha]-3- and [alpha]-4-linkage (EC 2.4.1.65) to the subterminal GlcNAc of N-acetyllactosamine, or in [alpha]-6-linkage (EC 2.4.1.68) to the Asn-linked GlcNAc of chitobiose.
Thirty-one vertebrate fucosyltransferase genes had been cloned when we wrote our last review two years ago. These genes shared substantial sequence homology, which suggested a divergent model for their evolution (Costache et al., 1997). In a separate study including eukaryotic and some prokaryotic sequences, it was suggested that the [alpha]-2- and [alpha]-6-fucosyltransferases could be related proteins, and a highly conserved peptide motif was identified in the catalytic domain of these enzymes, which constitutes a signature for this large family. The [alpha]-3-fucosyltransferases constitute a distinct family since they lack this consensus peptide. However, two other well-conserved regions were identified, which can serve as two characteristic motifs for this second family (Breton et al., 1998).
A new extensive search of GenBank/EBI and protein data banks, by sequence homology and with the three fucosyltransferase peptide motifs, detected 19 new fucosyltransferase genes submitted to the databanks after the publication of the above-mentioned review (Table I). The updated phylogenetic tree made with the cds of all the vertebrate genes known today is in good agreement with the previously proposed divergent model of evolution of vertebrate fucosyltransferase genes (Costache et al., 1997) and illustrates the existence of three main families of [alpha]-2-, [alpha]-3-, and [alpha]-6-fucosyltransferase genes (Figure
In addition, the present study brought up 30 new putative genes from invertebrates and bacteria. Alignment of these new sequences with the known vertebrate fucosyltransferases, and peptide sequence comparisons using the sensitive hydrophobic cluster analysis (HCA), allowed us to define three new conserved peptide motifs, which led in turn to the finding of new putative fucosyltransferase genes in the banks. The final results of this iterative search, plus the published data, led to a total of 6 fucosyltransferase conserved peptide motifs and 78 fucosyltransferase genes. Overall, the analysis of the 30 new invertebrate and bacterial genes and the 48 vertebrate genes brings extra support to the general model for the evolution of fucosyltransferases, by successive duplications followed by divergent evolution, from one or two ancestral fucosyltransferase genes.
Table I.
Enzyme | Species | GenBank/EBI | Gene | Size (aa) | References |
[alpha]-2-Fucosyltransferase | |||||
Chimpanzee | AF080603 | FUT1 | 366 | Apoil et al., unpubl. observations | |
Gorilla | AF080605 | FUT1 | 366 | | |
Gibbon | AF045545 | FUT1 | 365 | | |
Rhesus monkey | AF080607 | FUT1 | 366 | | |
Lemur fulvus | AF045546 | FUT1 | 365 | | |
Rat FUTA | AB015637 | FUT1 | 376 | Soejima et al., unpubl. observations | |
Green monkey | D87932 | FUT1 | 366 | Koda et al., unpubl. observations | |
Gibbon | AB006609 | Sec1 | 348 | Koda et al., unpubl. observations | |
Orangutan | AB006610 | Sec1 | 348 | | |
Gorilla | AB006611 | Sec1 | -a | | |
Chimpanzee | AB006612 | Sec1 | -a | | |
Green monkey | D87933 | Sec1 | 348 | Koda et al., unpubl. observations | |
Rhesus monkey | AF080608 | Sec1 | 348 | Apoil et al., unpubl. observations | |
Rat FUTB | AB006138 | FUT2 | 354 | Soejima et al., unpubl. observations | |
Green monkey | D87934 | FUT2 | 343 | Koda et al., unpubl. observations | |
Chimpanzee | AB015634 | FUT2 | 343 | | |
Gorilla | AB015635 | FUT2 | 343 | | |
Orangutan | AB015636 | FUT2 | 343 | | |
[alpha]-3-Fucosyltransferase | |||||
Hamster | U78737 | FUTh | 362 | Zhang et al., unpubl. observations |
Table II.
GDB locus | GenBank accession | OMIM identity | Chromosome location | Size (aa) | Enzyme name | References |
FUT1 | M35531 | 211100 | 19q13.3 | 365 | H | Larsen et al., 1990 |
FUT2 | U17894 | 182100 | 19q13.3 | 343 | Se | Kelly et al., 1995 |
FUT3 | X53578 | 111100 | 19p13.3 | 361 | Fuc-TIII | Kukowska-Latallo et al., 1990 |
FUT4 | M58596 | 104230 | 11q21 | 405 | Fuc-TIV | Goelz et al., 1990 |
FUT5 | M81485 | 136835 | 19p13.3 | 374 | Fuc-TV | Weston et al., 1992a |
FUT6 | L01698 | 136836 | 19p13.3 | 359 | Fuc-TVI | Weston et al., 1992b |
FUT7 | X78031 | 602030 | 9q34.3 | 342 | Fuc-TVII | Sasaki et al., 1994 |
FUT8 | D89289 | 602589 | 14q23 | 575 | [alpha]-6-FT | Yanagidani et al., 1997 |
Sec1 | U17895 | 182100 | 19q13.3 | - | - | Kelly et al., 1995 |
The nine human fucosyltransferase genes described in Table II constituted the vertebrate reference genes for the search of related sequences in the databases (Table III). However, the nucleotide sequences corresponding to the stem segments of fucosyltransferase genes are rich in G and C, and low but significant Fasta homology scores are frequently observed with irrelevant but GC-rich DNA sequences. This problem was overcome when Fasta searches were conducted with peptide sequences. For each retrieved sequence, apparently homologous to one group of fucosyltransferases ([alpha]-2-, [alpha]-3-, or [alpha]-6-fucosyltransferases), we first checked for the presence of the previously identified peptide motifs (Breton et al., 1998) and then we searched for other conserved regions. The last step was to compare each of the conserved motifs found to the others, in order to detect common conserved motifs shared by different fucosyltransferase groups.
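The motif-checking step of this search can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two patterns below are invented placeholders, not the real fucosyltransferase motifs (which appear only in the figures), and the function name `screen` and its gap limits are hypothetical.

```python
import re

# Placeholder patterns for illustration only. They are NOT the conserved
# fucosyltransferase motifs, which are not spelled out in the text.
MOTIFS = {
    "motif-I": re.compile(r"W.GH"),
    "motif-II": re.compile(r"CC.K"),
}

def screen(peptide, motifs=MOTIFS, min_gap=0, max_gap=200):
    """Return (hits, gaps) if every motif occurs in order, else None.

    hits is a list of (name, start, end); gaps holds the number of
    residues between consecutive motifs.
    """
    hits = []
    pos = 0
    for name, pattern in motifs.items():
        match = pattern.search(peptide, pos)  # search only after the previous motif
        if match is None:
            return None
        hits.append((name, match.start(), match.end()))
        pos = match.end()
    gaps = [hits[i + 1][1] - hits[i][2] for i in range(len(hits) - 1)]
    if any(g < min_gap or g > max_gap for g in gaps):
        return None
    return hits, gaps
```

A candidate sequence is retained only when all motifs are present in the expected order and the inter-motif spacing falls within the chosen range.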
Figure 1. Update of the phylogenetic tree of vertebrate fucosyltransferase genes previously presented in Costache et al. (1997). The genetic distances were calculated from ClustalW multiple alignments made with the whole cds. * identifies pseudogenes. Boldface italic characters are human genes. The six primate [alpha]-2-fucosyltransferase gene sequences submitted to GenBank/EBI by P.A. Apoil et al. (Table I) were not available and were not included in this tree, but the cognate enzymes share between 93% and 98% amino acid sequence identity with the corresponding human proteins.
[alpha]-3-Fucosyltransferase conserved motifs
A search among peptide sequences of low Fasta homology scores gave several proteins containing the conserved [alpha]-3-motifs. Four new putative [alpha]-3-fucosyltransferase enzymes were from Caenorhabditis elegans (named A, B, C, and E), four were from Helicobacter pylori (A, B, C, D) and one from Vibrio cholerae. The final selection criteria for their retention as putative [alpha]-3-fucosyltransferases were the presence of the two [alpha]-3-motifs (Breton et al., 1998) with similar spacing (from 78 to 86 amino acids for Caenorhabditis elegans and from 49 to 50 amino acids for bacteria; Figure
A Caenorhabditis elegans [alpha]-3-fucosyltransferase (CEFT-1) with 20-23% overall sequence identity to the five human [alpha]-3-fucosyltransferases has recently been cloned, expressed in COS7 cells and shown to be able to make Lex (DeBose-Boyd et al., 1998). It is the Caenorhabditis elegans A sequence (Z66497). Three of the Caenorhabditis elegans [alpha]-3-fucosyltransferase gene sequences encode similar proteins of 414 (E), 429 (B), and 451 (A) amino acids, but the fourth putative enzyme (C) is about a hundred amino acids shorter and lacks the transmembrane domain (Table III), suggesting that the 5[prime] terminus of the GenBank DNA sequence U40028 lacks the initial portion of the cds of this locus.
The bacterium Helicobacter pylori is known to synthesize Lex epitopes (Chan et al., 1995). Two of the four Helicobacter pylori sequences (A and B) have been recently cloned and expressed, and they were shown to be [alpha]-3-fucosyltransferases, able to synthesize the Lex epitope (Ge et al., 1997; Martin et al., 1997). The four Helicobacter pylori enzymes share about 90% sequence identity, but the A sequence lacks the N-terminal fragment of 121 amino acids, which does not seem to be necessary for enzyme activity (Ge et al., 1997). A tandem repeat of seven amino acids (DDLRINY)n was found close to the carboxy terminus of these four Helicobacter pylori sequences (7 repeats for A; 10 for B and D; and 2 for C). The number of repeats does not seem to be characteristic of the strain, since C and D are from the same strain 26695 and have 2 and 10 repeats, respectively, while B and D have 10 repeats each and were found in two different strains, 11639 and 26695, respectively.
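Counting such a tandem repeat is straightforward once the protein sequence is in hand. The following Python sketch assumes exact, uninterrupted copies of the heptad; real sequences may contain imperfect copies, which this simple function (a hypothetical helper, not part of the original analysis) would count as breaks in the run.

```python
import re

def longest_tandem_run(sequence, unit="DDLRINY"):
    """Return the largest number of back-to-back exact copies of `unit`."""
    # Each match is one uninterrupted stretch of repeated units.
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)
```

Applied to a synthetic peptide built as `"MKT" + "DDLRINY" * 7 + "GG"`, the function reports 7 copies; the flanking residues here are invented for the example.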
A Schistosoma mansoni fucosyltransferase gene (AF016899) has been recently published (Marques et al., 1998) and proposed to be homologous to vertebrate FUT7. Unfortunately, it is probably a new splice variant of the mouse FUT7, because besides the 5[prime] untranslated sequence and the first translated 39 bp, the remaining 1017 bp representing 96% of the open reading frame have 100% sequence identity with the second exon of mouse FUT7 (U45980; Smith et al., 1996).
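The 96% figure can be checked with simple arithmetic, assuming that the open reading frame consists only of the first 39 translated bp plus the 1017 bp shared with mouse FUT7 (an assumption made for this illustration; the function name is hypothetical).

```python
def shared_fraction(first_exon_bp, shared_bp):
    """Fraction of the open reading frame covered by the shared segment."""
    orf_bp = first_exon_bp + shared_bp  # assumes the ORF is just these two parts
    return shared_bp / orf_bp

fraction = shared_fraction(39, 1017)
print(f"ORF length: {39 + 1017} bp, shared: {fraction:.1%}")
```

Under this assumption the open reading frame is 1056 bp, of which about 96% is identical to the second exon of mouse FUT7, in agreement with the value quoted above.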
About 30% sequence difference is found between the same fucosyltransferase genes from birds and mammals (divergence estimated at 120 million years ago); about 20% difference is detected between the same fucosyltransferase genes among different mammalian species (the great mammalian radiation occurred about 80 million years ago); about 10% difference is found between FUT3, FUT5, and FUT6 genes, which diverged more than 10 million years ago; and about 2% difference is found between human and chimpanzee fucosyltransferase genes (divergence estimated at about 5 million years ago; Costache et al., 1997). Therefore, from a phylogenetic point of view, the sequence identity of the mouse FUT7 with the described parasite sequence suggests that this gene did not follow the worm line from its divergence several hundred million years ago, but stayed in the main evolutionary stream and followed the mammalian, rodent, and mouse branches, and can only have moved to the schistosome very recently and directly from the mouse. There are at least two possibilities to explain this unusually high degree of sequence identity between a schistosome and a mouse fucosyltransferase gene. The easiest interpretation is that of an "in vitro" contamination of the schistosome DNA with mouse DNA, since the adult parasite lives in the mouse vascular tree. However, the presence of the same gene in adult parasites harvested from hamster or in larval forms harvested from snails (Marques et al., 1998) suggests that a horizontal transfer of mouse DNA might have taken place "in vivo" and that the parasite might have incorporated a fragment of mouse DNA in its genome. If this was the case, the mechanism of this transmission of host to parasite DNA would be extremely interesting. DNA mobile elements were originally described in plants and prokaryotes, but transposon-like elements shared by parasites and mammalian cells have also been described (Kimmel et al., 1987).
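The divergence figures quoted at the start of this paragraph can be turned into rough rates, as in the Python sketch below. It is a back-of-the-envelope illustration only: raw percent differences do not grow linearly with time, the FUT3/FUT5/FUT6 value is an upper bound because the text gives "more than 10" million years, and the names `CALIBRATION` and `rate_per_myr` are introduced here for the example.

```python
# (comparison, approximate % sequence difference, approximate divergence in Myr)
CALIBRATION = [
    ("birds vs mammals", 30, 120),
    ("among mammals", 20, 80),
    ("FUT3/FUT5/FUT6", 10, 10),  # "more than 10" Myr, so the rate is an upper bound
    ("human vs chimpanzee", 2, 5),
]

def rate_per_myr(percent_difference, divergence_myr):
    """Percent sequence difference accumulated per million years."""
    return percent_difference / divergence_myr

for label, diff, myr in CALIBRATION:
    print(f"{label}: {rate_per_myr(diff, myr):.2f} % per Myr")
```

With rates of this order, two lineages separated for several hundred million years would be expected to differ extensively, which is why 100% identity between a schistosome and a mouse sequence is so difficult to reconcile with vertical inheritance.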
Also in favor of the idea that the gene is of mouse origin, is the fact that the products of FUT7 genes (Natsuka et al., 1994; Smith et al., 1996) can use only sialyllactosamine as acceptor in order to make sialyl-Lex structures and no sialic acid has been convincingly detected in schistosomes (Cummings and Nyame, 1996). Alternatively, neutral [alpha]-2 and [alpha]-3-fucosylated structures (Khoo et al., 1997) and an [alpha]-3-fucosyltransferase enzyme have been described in Schistosoma mansoni (DeBose-Boyd et al., 1996) and similar [alpha]-2 and [alpha]-3-fucosyltransferase enzymes and oligosaccharide structures have been found in avian schistosomes (Hokke et al., 1998), suggesting that fucose might be a major constituent of schistosome glycoconjugates, but the corresponding fucosyltransferase genes have not been cloned as yet. In good agreement with this idea, an EST of Schistosoma japonicum containing the [alpha]-3-motif-II (Figure
Table III.
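Enzyme / species | Accession | Size (aa) | References |
[alpha]-3-Fucosyltransferases | |||
C.elegans A | Z66497_3 | 451 | DeBose-Boyd et al., 1998 |
C.elegans B | U40028a | 429 | Wilson et al., 1994 |
C.elegans C | U40028a | >319 | |
C.elegans E | AF003386_13 | 414 | |
H.pylori A | AF006039_2 | 333 | Martin et al., 1997 |
H.pylori B | AF008596 | 478 | Ge et al., 1997 |
H.pylori C | AE000554_8 | 425 | Tomb et al., 1997 |
H.pylori D | AE000578_16 | 476 | |
V.cholerae | Y07786_6 | 338 | Stroeher et al., 1997 |
S.japonicum | AA269150 | 105 | EST |
[alpha]-2-Fucosyltransferases | |||
C.elegans F | Z92830_5 | 363 | Wilson et al., 1994 |
C.elegans G | U80026_1 | 355 | |
C.elegans H | U80026_2 | 339 | |
C.elegans I | AF024500_4 | 335 | |
C.elegans J | AF000198_4 | 381 | Sulston et al., 1992 |
C.elegans K | L16559_6 | 363 | Wilson et al., 1994 |
C.elegans L | AF016654_3 | 348 | |
C.elegans M | Z78018_3 | 434 | |
C.elegans N | AF024503_7 | 395 | |
C.elegans O | AF024503_5 | 388 | |
C.elegans P | AF039051_3 | 371 | |
C.elegans Q | Z81537_4 | 329 | |
C.elegans R | Z81066_5 | 537 | |
L.major | AC003011_11 | 348 | Miller et al., unpubl. obs. |
L.lactis cremoris | U93364_11 | 309 | Van Kranenburg et al., 1997 |
Y.enterocolitica | U46859_13 | 283 | Zhang et al., 1997 |
H.pylori E | AE000531b | 301 | Saunders et al., 1998 |
[alpha]-6-Fucosyltransferases | |||
C.elegans D | AF022968_5 | 818 | Wilson et al., 1994 |
Rhizobium (nodZ) A | L18897_8 | 328 | Mergaert et al., 1996 |
Rhizobium (nodZ) B | L22756_5 | 369 | Stacey et al., 1994 |
Rhizobium (nodZ) C | AE000064_6 | 322 | Freiberg et al., 1997 |
Mus musculus | AA981143 | 112 | EST |
Mus musculus | AA867506 | 156 | |
Mus musculus | AA162775 | 142 | |
Mus musculus | AA710426 | 161 | |
Mus musculus | AA474517 | 158 | |
Mus musculus | W74827 | 159 | |
Rattus sp. | H31816 | 100 | |
D.melanogaster | AA140819 | 148 | |
D.melanogaster | AA698745 | 146 | |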
[alpha]-2-Fucosyltransferase conserved motifs
A similar Fasta search and HCA analysis of protein sequences homologous to [alpha]-2-fucosyltransferases led to 13 Caenorhabditis elegans (F to R), one Leishmania major and three bacterial new sequences (Lactococcus lactis cremoris, Yersinia enterocolitica, and Helicobacter pylori E). The Helicobacter pylori E (Saunders et al., 1998) and the Yersinia enterocolitica (Breton et al., 1998) enzymes have already been proposed as being [alpha]-2-fucosyltransferases, based on sequence homology data, but all these sequences display low sequence identity with the vertebrate [alpha]-2-fucosyltransferases (less than 30%). However, an [alpha]-2-fucosyltransferase activity on type 1 chains has recently been reported in extracts of Caenorhabditis elegans (DeBose-Boyd et al., 1998) and positive histochemical reactions were found with Ulex europaeus I, a lectin specific for the [alpha]-2-fucosylated H-type-2 (Mollicone et al., 1996), in the yolk of Caenorhabditis elegans (Borgonie et al., 1994).
Figure 2. Conserved [alpha]-3-motifs found in all the [alpha]-3-fucosyltransferases and in a Schistosoma japonicum EST. White letters on black background represent identical or related amino acids found in all sequences and black letters on gray background represent other conserved amino acids. Figures inside < > indicate the number of residues between two motifs.
In addition to the previously reported [alpha]-2-motif-I (Breton et al., 1998), two additional motifs were found in all vertebrate, invertebrate, and bacterial enzymes. The [alpha]-2-motif-II was in general less well preserved than the [alpha]-2-motif-III, but both could be clearly identified by HCA. The three conserved [alpha]-2-motifs are spaced from 26 to 58 amino acids for motifs I and II and from 35 to 64 amino acids for motifs II and III (Figure
Figure 3. Conserved [alpha]-2-motifs found in all the [alpha]-2-fucosyltransferases. Other codes are as in Figure 2.
[alpha]-6-Fucosyltransferase conserved motifs
The Fasta search and HCA comparison of [alpha]-6-fucosyltransferases led to one Caenorhabditis elegans molecule (D) and three bacterial proteins, NodZ from Rhizobium species (Azorhizobium caulinodans A, Bradyrhizobium japonicum B, and Rhizobium sp. C). The Caenorhabditis elegans putative [alpha]-6-fucosyltransferase gene (818 amino acids) contains an extra C-terminal domain of 343 amino acids, but the N-terminal 560 amino acids display 37% sequence identity with the entire vertebrate FUT8 sequence (575 amino acids).
Although the three rhizobial NodZ proteins are shorter than the other [alpha]-6-fucosyltransferases, and have very low overall sequence identity with them (< 20%), three conserved [alpha]-6-motifs (I, II, and III, Figure
Figure 4. Conserved [alpha]-6-motifs found in all the [alpha]-6-fucosyltransferases and in contigs of EST. Other codes are as in Figure 2.
The consensus contigs of five human retina EST fragments (Costache et al., 1997) and four mouse EST fragments (Table III), all homologous to FUT8, contained the [alpha]-6-motif-III. The consensus contig of two Drosophila melanogaster EST, also homologous to FUT8, contained the [alpha]-6-motifs I and II. Finally, a rat EST of 301 bp and another mouse contig of two EST with more than 90% sequence identity to FUT8 were found (Table III), but they were outside the area containing the fucosyltransferase conserved motifs. The [alpha]-6-motifs were spaced from 34 to 37 amino acids for motifs I and II and from 35 to 38 amino acids for motifs II and III (Figure
Peptide motifs shared by [alpha]-2-fucosyltransferases and [alpha]-6-fucosyltransferases
The presence of a highly conserved peptide motif in the [alpha]-2- and [alpha]-6-fucosyltransferases from prokaryotic and eukaryotic origin was described previously (Breton et al., 1998). This motif corresponds to the [alpha]-2-motif-I and [alpha]-6-motif-I of the present study. A careful comparison of all HCA plots demonstrated that another motif is shared by all these enzymes, which corresponds to the [alpha]-2-motif-II and [alpha]-6-motif-II. The HCA alignment of the two shared motifs is displayed in Figure
Figure 5. HCA graphical display of the conserved peptide motifs I and II shared by [alpha]-2- and [alpha]-6-fucosyltransferases. 1, 2, and 3 are peptides from [alpha]-2-fucosyltransferases; 4, 5, and 6 are peptides from [alpha]-6-fucosyltransferases. 1, H [alpha]-2-fucosyltransferase (FUT1); 2, Caenorhabditis elegans I; 3, Yersinia enterocolitica; 4, pig [alpha]-6-fucosyltransferase (FUT8); 5, Caenorhabditis elegans D; and 6, NodZ A from Azorhizobium caulinodans. The one-letter code for amino acids is used except for Gly, Pro, Ser, and Thr, which are represented by diamonds, stars, squares with solid dots, and open squares, respectively. The conserved hydrophobic clusters are shaded and the conserved nonhydrophobic residues are indicated with white letters on a black background.
On the other hand, the new [alpha]-2-motif-III and the new [alpha]-6-motif-III seem to be specific to each of these two groups of fucosyltransferases. A scheme of the relative positions of the conserved peptide motifs and the ranges of numbers of inter-motif amino acids and protein sizes is illustrated in Figure
Figure 6. Schematic representations of vertebrate, invertebrate and bacterial fucosyltransferases showing the location of the conserved peptide motifs. The black rectangles I, II, and III identify motifs specific to each group of fucosyltransferases. The gray rectangles I and II identify the motifs shared by both [alpha]-2- and [alpha]-6-fucosyltransferases. Figures inside rectangles indicate the range of pre-, post-, and inter-motif peptide lengths.
The majority of the cds of vertebrate fucosyltransferase genes are monoexonic, with the exception of human (Natsuka et al., 1994) and mouse (Smith et al., 1996) FUT7, which have one intron in the cds, and FUT8, which is multi-exonic. The human FUT8 has more than nine exons, but only two exons have been submitted to GenBank/EBI (exon 8, AF038280; exon 9, AF038281). In this respect, the 18 Caenorhabditis elegans putative fucosyltransferase genes resemble the vertebrate FUT8, since they are all multi-exonic (5-15 exons), whereas the putative fucosyltransferase gene of the protozoan Leishmania major is monoexonic, like the bacterial genes (Table IV).
The eukaryotic fucosyltransferases have, in general, the typical topology of type II membrane proteins with the transmembrane hydrophobic domain in their N-terminus, whereas the prokaryotic enzymes apparently lack such a domain (Table IV).
These general properties, plus a large variation in the overall size of the cds, illustrate that any attempt to build phylogenetic trees with full cds sequences is highly hazardous and the results are difficult to interpret. Therefore, we decided to make this phylogenetic analysis only on the conserved peptide motifs.
The common motifs between [alpha]-2 and [alpha]-6-fucosyltransferases suggest a common genetic origin for these two families of enzymes. Furthermore, the phylogenetic tree made with the peptide sequences containing the [alpha]-2-motifs and the [alpha]-6-motifs (Figure
Table IV.
Figure 7. Phylogenetic tree of the conserved [alpha]-2- and [alpha]-6-motifs. H and Se are the human [alpha]-2-fucosyltransferases encoded by FUT1 and FUT2 and Sec1 is the orangutan enzyme. Ce identifies Caenorhabditis elegans putative fucosyltransferase genes. Bootstrap values >70% of 100 replicates are shown at the divergence points.
Figure 8. The hypothetical model of divergent evolution for the known fucosyltransferase genes. The rectangular labels identify genes with significant overall sequence identity (>30%), the ovoid labels identify sequences with common peptide motifs, and the circle identifies a hypothetical common ancestor of the [alpha]-2/6-fucosyltransferase and the [alpha]-3-fucosyltransferase families. The black symbols correspond to enzymes expected to use chitobiose as acceptor substrate, and the gray symbols correspond to enzymes expected to use N-acetyllactosamine as acceptor substrate. *FUT3 and *FUT5 are the genes of the only animal enzymes able to use both type 1 (Gal[beta]1->3GlcNAc) and type 2 (Gal[beta]1->4GlcNAc) acceptors.
At the other extreme of the tree, at the level of the present terminal leaves, a Leishmania major and two subfamilies of Caenorhabditis elegans putative [alpha]-2-fucosyltransferases can be distinguished (CeK/CeF/CeP/CeN/CeM/CeO/CeI/CeL and CeJ/CeH/CeG/CeQ/CeR), but they have a similar genetic distance to each of the three vertebrate subfamilies of [alpha]-2-fucosyltransferases (H, Se, and Sec1), suggesting that the present invertebrate putative [alpha]-2-fucosyltransferases are homologous to an ancestor of the present vertebrate enzymes. The same is true for the three bacterial [alpha]-2-fucosyltransferases. In the same way, similar genetic distances between each of the three NodZ [alpha]-6-fucosyltransferases and the two vertebrate and the invertebrate [alpha]-6-fucosyltransferases suggest that the present NodZ proteins are homologous to an ancestor of the eukaryotic [alpha]-6-fucosyltransferases (Figure
The same kind of phylogenetic analysis of the peptide sequences containing the conserved [alpha]-3-motifs I and II shows that each of the [alpha]-3-fucosyltransferase enzymes from bacteria and Caenorhabditis elegans has a similar genetic distance to each of the five vertebrate [alpha]-3-fucosyltransferase enzymes (Fuc-TIII to Fuc-TVII) and is consequently homologous to an ancestor of these five [alpha]-3-fucosyltransferases found today in vertebrates (Figure
Figure 9. Phylogenetic tree of the conserved peptide [alpha]-3-motifs. Fuc-TIII to Fuc-TVII are the human [alpha]-3/4-fucosyltransferases encoded by FUT3 to FUT7 genes. Ce identifies Caenorhabditis elegans putative fucosyltransferase genes. Bootstrap values >70% of 100 replicates are shown at the divergence points.
These phylogenetic data suggest that we cannot expect to find, among invertebrates or bacteria, fucosyltransferase genes homologous to any of the present FUT1 to FUT7 human genes, which have appeared by relatively recent duplication events, probably in vertebrates; we can only find genes homologous to the ancestors of the present FUT1 to FUT7 genes (Figure
All evolutionary models postulate the existence of nonavailable ancestor molecules, based on the sequences observed today. This implies certain assumptions which are not always confirmed. The limitations of divergent and convergent evolutionary models are well illustrated in the ABO histo-blood group locus, where first the analysis of mutations found in exon 7 suggested a divergent model of evolution (Martinko et al., 1993), and 4 years later a study of mutations in intron sequences, made by the same team, suggested a convergent model of evolution for the same ABO locus (OhUigin et al., 1997), which has been recently confirmed by the sequence of the macaque ABO locus (Doxiadis et al., 1998).
The very high values of sequence identity among the three main vertebrate [alpha]-2-fucosyltransferases on one side, and among the five main vertebrate [alpha]-3-fucosyltransferases on the other side, strongly favor the hypothesis that these two families of genes have appeared by successive duplications and have later mutated and diverged from their respective ancestral [alpha]-2- and [alpha]-3-fucosyltransferase genes (Costache et al., 1997). The same applies to the Caenorhabditis elegans D sequence and the pig (Uozumi et al., 1996) and human (Yanagidani et al., 1997) [alpha]-6-fucosyltransferase genes (FUT8) (Figure
The two main families of [alpha]-2- and [alpha]-3-fucosyltransferase enzymes have similar sizes and use the same donor substrate (GDP-fucose). However, since less than 20% overall sequence identity was detected between these two families of proteins (Breton et al., 1996, 1998), the question arises of whether a common ancestor exists for the two families. We previously suggested that the motif-I in [alpha]-3-fucosyltransferases could be the counterpart of the motif-I in [alpha]-2- and [alpha]-6-fucosyltransferases, since they display intriguing HCA similarities (Breton et al., 1998), and it has recently been shown that 2-4% of [alpha]-2-fucosyltransferase activity is expressed by human recombinant FUT3 (Gallet et al., 1998) and by purified human Lewis enzyme (Fuc-TIII) (Chandrasekaran et al., 1995) [alpha]-3/4-fucosyltransferases. All these data suggest the existence of common functional features among members of these two families of [alpha]-2- and [alpha]-3-fucosyltransferases.
We propose now that a unique ancestor might have existed and some of the present [alpha]-3-fucosyltransferases of plants or insects (Staudacher, 1996; Staudacher and Marz, 1998; Wilson and Altman, 1998) have the properties expected for this hypothetical common ancestor (Figure
Figure 10. The two main types of acceptor substrates. Chitobiose (black square) and N-acetyllactosamine of type 1 or type 2 (gray square) and the corresponding fucosyltransferase enzymes able to add fucose in [alpha]-2-, [alpha]-3-, and [alpha]-6-linkages.
Irrespective of this prediction, we can already say that not all aspects of fucosyltransferase evolution have been divergent. Indeed, there are good reasons to believe that the Lea epitopes on the terminal branches of N-glycans in plants (Fitchette-Laine et al., 1997) are made by an [alpha]-4-fucosyltransferase (Crawley et al., 1989), different from the human Lewis [alpha]-3/4-fucosyltransferases encoded by FUT3 or FUT5. Among animals, the capacity to use the type 1 as acceptor, in order to make the Lea epitope has appeared after the last known fucosyltransferase gene duplication, that is, only in the cognate enzymes of FUT3 and FUT5, the two vertebrate enzymes able to use both type 1 (Gal[beta]1->3GlcNAc) and type 2 (Gal[beta]1->4GlcNAc) acceptors to make Lea and Lex, respectively (Oulmouden et al., 1997). In fact, these two FUT3 and FUT5 genes must be relatively recent, since they have only been found in chimpanzee and man (Costache et al., 1997). Therefore, we have to accept that the [alpha]-4-fucosyltransferase activity has appeared twice independently, first in plants in an [alpha]-4-fucosyltransferase working on type 1 acceptors and then in primates in an [alpha]-3/4-fucosyltransferase working on both type 1 and type 2 acceptors, thus suggesting convergent evolution for this particular case of fucosyltransferases making the Lea oligosaccharide epitope found in plants and primates.
The most striking finding of the present study was the discovery of 18 different putative fucosyltransferase genes, including members of the three families of [alpha]-2-, [alpha]-3-, and [alpha]-6-fucosyltransferase genes, in such a small worm as Caenorhabditis elegans, which has only 959 cells and 13,000 genes overall (Felix, 1997). In favor of the existence of the corresponding gene products, [alpha]-2- and [alpha]-3-fucosyltransferase activities have recently been detected in extracts of Caenorhabditis elegans and one [alpha]-3-fucosyltransferase gene has been cloned and expressed in COS7 cells (DeBose-Boyd et al., 1998). Furthermore, this soil free-living worm, like parasitic schistosomes, does not seem to express sialic acid (Bacic et al., 1990; Borgonie et al., 1994; Nyame et al., 1998). For unknown reasons, these nematodes have favored, through evolution, fucosylation instead of sialylation of their terminal nonreducing oligosaccharide epitopes or glycotopes, and since sialic acid and fucose are usually in competition for the same acceptors, the lack of all forms of sialic acid in Caenorhabditis elegans fits well with a large expression of different fucosyltransferase genes, making this animal species an ideal model for evolutionary studies of fucosyltransferases. In good agreement with this concept, no sialyl motifs (Drickamer, 1993; Livingston and Paulson, 1993; Geremia et al., 1997) were found in any of the putative proteins predicted from the Caenorhabditis elegans genome sequence databases (neither in GenBank/EBI nor in ACeDB).
Sequence search and analysis
Homologous fucosyltransferase sequences were searched in GenBank/EBI and protein data banks with Fasta (Pearson and Lipman, 1988). The Pongo pygmaeus (orangutan) Sec1 gene (AB006610) was used for the search of conserved peptide motifs because the human Sec1 (U17895) is a pseudogene (Kelly et al., 1995). Local alignments were made with Lalign (Huang et al., 1992). Consensus sequences from overlapping EST were obtained by Hucap (Huang, 1992). Conserved ambiguous peptide motifs were searched with the Pattern program (Cockwell and Giles, 1989). Multiple alignments of DNA and protein sequences were performed with the ClustalW 1.7 program (Thompson et al., 1994). Genetic distances of whole cds and of the conserved peptide regions, after gap stripping, were calculated with the Phylip phylogeny package (Felsenstein, 1993), and phylogenetic trees were made by Maximum Parsimony and Neighbor Joining methods with Seaview and Phylo_win programs (Galtier et al., 1996), on an Indy R6400 Silicon Graphics workstation. Both methods gave similar results. All software is available from Cis Infobiogen (E-mail: bioinfo{at}infobiogen.fr; WEB: http://www.infobiogen.fr).
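To make the gap-stripping step concrete, the following Python sketch removes every alignment column that contains a gap and then computes a simple proportion of differing sites (p-distance). It is an illustration only: the distances in this study were computed with Phylip, whose distance models are more elaborate, and the function names and toy alignment are assumptions introduced for the example.

```python
def strip_gaps(aligned):
    """Drop every alignment column that has a gap ('-') in any sequence."""
    keep = [i for i, column in enumerate(zip(*aligned)) if "-" not in column]
    return ["".join(seq[i] for i in keep) for seq in aligned]

def p_distance(a, b):
    """Proportion of positions at which two equal-length sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Toy alignment (invented residues, not taken from the fucosyltransferases).
stripped = strip_gaps(["MK-LV", "MKALI", "MR-LV"])
print(stripped, p_distance(stripped[0], stripped[1]))
```

Stripping gapped columns before computing distances ensures that every pairwise comparison is made over the same set of positions.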
Hydrophobic cluster analysis (HCA)
HCA is a graphical protein sequence comparison method, based on the detection of hydrophobic clusters containing the amino acids Val, Ile, Leu, Phe, Met, Trp, and Tyr, which are presumed to correspond to the regular secondary structure elements constituting the hydrophobic core of globular proteins (Gaboriaud et al., 1987; Lemesle-Varloot et al., 1990). Protein sequences are represented in a duplicated [alpha]-helical net, and the shapes of the clusters of contiguous hydrophobic amino acids are drawn. Plots were obtained from the Drawhca server on the Internet (http://www.lmcp.jussieu.fr/~soyer/www-hca/hca-form.html). Graphical manipulations of the HCA plots were performed with IslandDraw V3.0 (Island Graphics Corp. Hoofddorp, The Netherlands).
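The idea of a hydrophobic cluster can be illustrated with a strongly simplified, one-dimensional Python sketch. It does not reproduce the two-dimensional helical net used by HCA; it only groups the seven hydrophobic residues listed above along the linear sequence, and it assumes that a cluster ends at a proline or after four consecutive non-hydrophobic residues. That break rule and the function name are assumptions of this sketch.

```python
HYDROPHOBIC = set("VILFMWY")  # Val, Ile, Leu, Phe, Met, Trp, Tyr

def hydrophobic_clusters(seq, min_gap=4):
    """Return (start, end, segment) for each linear hydrophobic cluster.

    A cluster runs from its first to its last hydrophobic residue and is
    closed by a proline or by `min_gap` consecutive non-hydrophobic residues.
    """
    clusters = []
    start = last = None
    for i, aa in enumerate(seq):
        if aa in HYDROPHOBIC:
            if start is None:
                start = i
            last = i
        elif start is not None and (aa == "P" or i - last >= min_gap):
            clusters.append((start, last, seq[start:last + 1]))
            start = last = None
    if start is not None:  # close a cluster that reaches the end of the sequence
        clusters.append((start, last, seq[start:last + 1]))
    return clusters
```

In real HCA plots it is the two-dimensional shape of each cluster, not just its linear extent, that is compared between sequences, so this sketch only conveys how residues are grouped.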
After submission of this article, a new mouse gene (AB015426) was published, encoding an enzyme (mFuc-TIX) that has the two [alpha]-3-fucosyltransferase conserved peptide motifs and makes the Lex glycotope in brain. Five Caenorhabditis elegans genes (Z81132_3, Z81132_5, Z81132_6, Z99710_6, and Z92813_2) and one Vibrio cholerae gene (AB012957_18), all encoding putative enzymes with the three [alpha]-2-fucosyltransferase conserved peptide motifs, plus another Vibrio cholerae gene (AB012957_17) and a Dictyostelium discoideum gene (AF076599) encoding new putative enzymes with the two [alpha]-3-fucosyltransferase conserved peptide motifs, were submitted to GenBank/EBI.
This work has been performed within the French network "GT-rec" supported by grants from MENRT (ACC SV no. 9514111) and CNRS (Program PCV); and the concerted action 3026PL950004 and the shared cost Xenotransplantation Contract IO4-CT97-2242 from the Immunology Biotechnology program DG XII, from the European Union (EU). R.O. and R.M. are full-time investigators of the Centre National de la Recherche Scientifique (CNRS) and C.B. of the Institut National de la Recherche Agronomique (INRA), France.
bp, base pairs; cds, coding sequence; EST, expressed sequence tag; Fuc-TIII, FUT3 encoded Lewis [alpha]-3/4-fucosyltransferase; Fuc-TIV, FUT4 encoded myeloid [alpha]-3-fucosyltransferase; Fuc-TV, FUT5 encoded [alpha]-3/4-fucosyltransferase; Fuc-TVI, FUT6 encoded plasma [alpha]-3-fucosyltransferase; Fuc-TVII, FUT7 encoded leukocyte [alpha]-3-fucosyltransferase; [alpha]-6-FT, FUT8-encoded [alpha]-6-fucosyltransferase; FUT1 to FUT8, Genome Data Base (GDB) names of cloned human fucosyltransferases; HCA, hydrophobic cluster analysis; PCR, polymerase chain reaction.
General properties of fucosyltransferase genes and enzymes
Possibility of a common ancestor for eukaryotic and prokaryotic fucosyltransferase genes
78 total fucosyltransferase genes | Conserved motifs | Protein size (aa) | Introns in cds | Transmembrane domain |
48 from vertebrates | ||||
Sec1, FUT1, FUT2 | [alpha]-2 | 325-377 | 0 | + |
FUT3 to FUT6 | [alpha]-3 | 356-433 | 0 | + |
FUT7 | [alpha]-3 | 342-389 | 1 | + |
FUT8 | [alpha]-6 | 575 | >8 | + |
18 from C.elegans | ||||
F, G, H, I, J, K, L, M, N, O, P, Q, R | [alpha]-2 | 329-537 | 4-12 | + |
A, B, C, E | [alpha]-3 | 319-451 | 6-9 | + |
D (homologous to FUT8) | [alpha]-6 | 818 | 14 | + |
1 from L.major | [alpha]-2 | 348 | 0 | + |
11 from bacteria | ||||
L.lactis cremoris | [alpha]-2 | 309 | 0 | - |
Y.enterocolitica | [alpha]-2 | 283 | 0 | - |
H.pylori: E | [alpha]-2 | 301 | 0 | - |
H.pylori: A, B, C, D | [alpha]-3 | 333-478 | 0 | - |
V.cholerae | [alpha]-3 | 338 | 0 | - |
Rhizobium NodZ: A, B, C | [alpha]-6 | 322-369 | 0 | - |
The divergent model of fucosyltransferase genes in evolution
One or two ancestors for the fucosyltransferase gene family?
An exception to the divergent model of fucosyltransferase gene evolution
Why are there so many fucosyltransferase genes in Caenorhabditis elegans?
Materials and methods
Note added in proof
Acknowledgments
Abbreviations
To whom correspondence should be addressed at: INSERM U504, 16 Avenue Paul Vaillant-Couturier, 94807 Villejuif Cedex, France