C-type lectin-like domains in Caenorhabditis elegans: predictions from the complete genome sequence

Kurt Drickamer and Roger B. Dodd

Glycobiology Institute, Department of Biochemistry, University of Oxford, Oxford OX1 3QU, UK

Received on April 16, 1999; revised on June 3, 1999; accepted on June 4, 1999.


    Abstract
Protein modules related to the C-type carbohydrate-recognition domains of animal lectins are found in at least 125 proteins encoded in the Caenorhabditis elegans genome. Within these proteins, 183 C-type lectin-like domains (CTLDs) have been identified. The proteins have been classified based on the overall arrangement of modules within the polypeptides and based on sequence similarity between the CTLDs. The C. elegans proteins generally have different domain organization from known mammalian proteins containing CTLDs. Most of the CTLDs are divergent in sequence from those in mammalian proteins. However, 19 show conservation of most of the amino acid residues that ligate Ca2+ to form a carbohydrate-binding site in vertebrate C-type carbohydrate-recognition domains. Seven of these domains are particularly similar in sequence to mannose- and N-acetylglucosamine-binding domains in the vicinity of this Ca2+ site.


    Introduction
The recent determination of the complete genome sequence of the roundworm Caenorhabditis elegans provides an opportunity to gain a global picture of the role of protein modules in a simple multicellular organism (The C. elegans Genome Consortium, 1998). Of particular interest are proteins that have evolved to meet the special needs of a multicellular organism, both for formation of differentiated tissues and for coordinating the activities of cells in these different tissues. Genes encoding such proteins account for a significant amount of extra coding capacity of the C. elegans genome compared with the eukaryotic but single-celled yeast (Mewes et al., 1997). Particularly common are motifs typical of plasma membrane receptors for extracellular signals such as hormones.

Among the types of protein modules found in substantial numbers in C. elegans, but absent from yeast, are those that fit the profile of the C-type lectin-like domains (CTLDs) of higher eukaryotes. These types of protein modules were originally identified as carbohydrate-recognition domains (CRDs) in a family of Ca2+-dependent animal lectins, including the asialoglycoprotein receptor, its chicken homologue and serum and liver mannose-binding proteins (Drickamer, 1988). Similar domains have since been described in other vertebrate and invertebrate carbohydrate-binding proteins (Drickamer, 1993). Less closely related but still definitely homologous domains have been identified in a variety of proteins that do not appear to have carbohydrate-binding activity. Many such CTLDs are found in receptors on the surface of natural killer lymphocytes (Weis et al., 1998), while others include ligand-binding domains in proteins that bind to various blood coagulation factors (Fuhlendorff et al., 1987; Atoda et al., 1991), receptors for phospholipases (Inoue et al., 1991), Ca2+-binding proteins associated with pancreatic disease (Drickamer, 1993) and antifreeze proteins from arctic fishes (Davies and Sykes, 1997).

All of the domains in the CTLD group show distinct evidence of sequence similarity and are thus believed to have descended from a common ancestor by a process of divergent evolution. Additional groups of protein domains, such as link protein modules (Kohda et al., 1996) and endostatin (Hohenester et al., 1998), share topological folding characteristics with the CTLDs but sequence comparisons show no evidence of homology. It is quite likely that these domains have achieved a similar folding topology through a process of convergent evolution. Relationships between different protein modules that display the C-type lectin fold are summarized in Figure 1.



Fig. 1. Evolution of CTLDs and proteins with related folds. The C-type lectin fold appears both in CTLD-containing proteins and in proteins that are not related in amino acid sequence. This fold has probably arisen several times in a process of convergent evolution. Divergent evolution of a precursor to the CTLDs has resulted in several groups of domains with distinct functions. The CRDs show carbohydrate-binding activity, while other CTLDs bind proteins or other ligands.

 
The CTLDs share a common sequence motif that almost invariably includes two disulfide bonds and a set of hydrophobic residues at characteristic spacings (Drickamer, 1993). X-ray crystallography of CRDs from animal lectins (Weis et al., 1991; Graves et al., 1994; Ng et al., 1996) and more recently of other CTLDs (Mizuno et al., 1997; Nielsen et al., 1997; Boyington et al., 1999) reveals that many of these residues form the hydrophobic cores of the domains and determine their overall structure. In addition, CTLDs that display Ca2+-dependent carbohydrate-binding activity (CRDs) contain amino acid side chains at certain key positions that chelate Ca2+ and form a carbohydrate-binding site (Weis et al., 1992). Thus, analysis of the sequences of CTLDs can provide important clues about their potential ligand-binding activities.

In the present work, C. elegans proteins containing the CTLD motif have been compared to reveal their overall domain organization, to establish evolutionary relationships amongst the CTLDs and to determine the degree of conservation of amino acid residues that form Ca2+- and carbohydrate-binding sites in vertebrate CRDs. The results raise the possibility that a small subset of these proteins have carbohydrate recognition functions analogous to those of vertebrate CRDs.


    Results
Context of CTLDs in overall protein architecture
The C. elegans database was screened initially for proteins that are annotated with the designation "C-type lectin" and then by sequence comparison searches using various CRD sequences as probes. A total of 125 proteins containing CTLDs were identified by this combination of approaches. These sequences were examined for the presence of the CTL or C_TYPE_LECTIN motifs in the SwissProt library of protein profiles and the LECTIN_C motif in the PfamA library in order to establish the number and location of the CTLDs. A total of 183 CTLDs were identified in this way. It is likely that more sensitive structure-based scanning methods would reveal the presence of additional members of the CTLD superfamily and even more proteins that share the C-type lectin fold but are not necessarily homologous. However, the primary aim of this analysis was to identify proteins for which functional predictions could be made. Therefore, it was useful to concentrate on proteins with discernible sequence similarity to the known CTLDs.

An initial classification of proteins was made based on the number of CTLDs present. The profile scans also revealed the presence and location of other types of protein motif that match known profiles. The presence of these other types of protein modules led to the classification shown in Figure 2. This analysis resulted in nine major groups: (A) single CTLDs, (B) single CTLDs and von Willebrand Factor (VWF) domains, (C) CTLDs and CUB domains (a family of domains originally identified in complement proteins and bone morphogenetic protein 1; Bork and Beckmann, 1993), (D) two CTLDs, (E) three CTLDs, (F) CTLDs with low density lipoprotein (LDL) receptor motifs, (G) CTLDs and VWF domains along with epidermal growth factor (EGF)-like domains, (H) CTLDs and sushi domains (short consensus repeats; Bentley, 1988), and (I) CTLDs with seven-transmembrane-type hormone receptor domains (Bockaert and Pin, 1999).



Fig. 2. Overall organization of proteins from C. elegans containing CTLDs. The key at the bottom summarizes the different domains in these proteins. Linkers denotes segments containing a high proportion of relatively few types of amino acid residues, often in repeating patterns, and generally shorter than 50 residues in length. Segments designated unknown modules are longer (between 50 and 200 amino acid residues) and lack evident repeated motifs. These domains could not be matched with vertebrate proteins in the SwissProt database. Shading of CTLDs indicates subgroups in which some of the CTLDs contain at least three potential ligands for a Ca2+-binding site analogous to site 2 in mannose-binding protein. The numbers of members of each group are indicated in parentheses.

 
Hydropathy plots were used to confirm the presence of signal sequences at the N-terminal end of most of the polypeptides, indicating that they are likely to be soluble, secretory proteins. Proteins in three subgroups (A10, D5, and D6) lack such sequences but contain internal membrane-anchor type sequences characteristic of type II transmembrane proteins with relatively short cytoplasmic N-terminal domains. One protein, constituting subgroup A6, contains a hydrophobic sequence at the C-terminus, which is a likely signal for addition of a glycolipid anchor. None of the proteins contain internal hydrophobic sequences likely to form stop transfer sequences leading to simple type I transmembrane orientation, although the single protein in group I contains the hallmarks of a receptor with a 7-transmembrane helix domain. Thus, proteins in the A6, A10, D5, D6, and I subgroups are the only CTLD-containing proteins expected to be membrane-associated.
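
A hydropathy scan of the kind described above can be sketched as a sliding-window average over per-residue hydrophobicity values. The sketch below assumes the Kyte-Doolittle scale, a 19-residue window, and a threshold of 1.6; the text does not state which scale or parameters were used, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not specified in the text).
KD = {
    'A': 1.8, 'R': -4.5, 'N': -3.5, 'D': -3.5, 'C': 2.5,
    'Q': -3.5, 'E': -3.5, 'G': -0.4, 'H': -3.2, 'I': 4.5,
    'L': 3.8, 'K': -3.9, 'M': 1.9, 'F': 2.8, 'P': -1.6,
    'S': -0.8, 'T': -0.7, 'W': -0.9, 'Y': -1.3, 'V': 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KD[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def hydrophobic_segments(seq, window=19, threshold=1.6):
    """Return (start, end) half-open ranges covered by consecutive
    windows whose mean hydropathy reaches the threshold."""
    profile = hydropathy_profile(seq, window)
    segments = []
    start = None
    for i, value in enumerate(profile):
        if value >= threshold and start is None:
            start = i
        elif value < threshold and start is not None:
            # The last qualifying window started at i - 1.
            segments.append((start, i - 1 + window))
            start = None
    if start is not None:
        segments.append((start, len(seq)))
    return segments
```

Under these assumptions, a segment beginning at or near residue 0 would be a candidate signal sequence, an internal segment a candidate membrane anchor, and a segment ending at the C-terminus a candidate glycolipid-anchor signal; each call would still need inspection by eye.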

Several types of linking regions could also be identified and used in further categorisation of the CTLD-containing proteins. Mucin-type domains are characterized by the presence of repeated stretches of serine and threonine residues. This density of hydroxyl-containing amino acids and the presence of a high proportion of proline residues are consistent with the suggestion that these segments are likely sites of O-linked glycosylation (Elhammer et al., 1993). Other repetitive sequences are rich in glycine, cysteine, or proline. Subgroups of groups A and D were defined based on the presence and positions of these linking domains.
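
As a rough illustration of how a mucin-type linker might be flagged from composition alone, the sketch below computes the fraction of serine/threonine and of proline residues in a segment. The cut-off values and function names are hypothetical and are not taken from the analysis described here.

```python
def residue_fraction(segment, residues):
    """Fraction of positions in `segment` occupied by any of `residues`."""
    if not segment:
        return 0.0
    return sum(1 for aa in segment if aa in residues) / len(segment)


def looks_mucin_like(segment, min_ser_thr=0.4, min_pro=0.1):
    """Flag segments rich in Ser/Thr and Pro (illustrative thresholds only)."""
    return (residue_fraction(segment, "ST") >= min_ser_thr
            and residue_fraction(segment, "P") >= min_pro)
```

The same `residue_fraction` helper could be applied with "G", "C", or "P" to look for the other repetitive linker types mentioned above.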

In addition to the CTLDs and other identified domains, several regions of 100–200 amino acids that lack simple repetitive motifs are likely to be globular domains but are not recognized by the available profile databases. These sequences were screened against the complete SwissProt protein database. Several were found to give partial matches to proteins with CTLDs and CUB domains, but examination of the alignments indicated that these domains could be aligned only to part of the typical domains in these families. Thus, some of these unidentified segments may represent truncated modules. Other segments were found to match domains in a number of C. elegans proteins but there were no systematic similarities to domains in proteins from other organisms. All of these domains were treated as unknown, and their presence was used to distinguish subgroups as shown in Figure 2.

For the most part, the organization of C. elegans proteins summarized in Figure 2 differs from those that have been identified to date in other invertebrates and in vertebrates. Most vertebrate C-type lectins contain one CRD per polypeptide chain (Drickamer, 1993), but the only examples that resemble C. elegans proteins in overall organization are like the subgroup A1 examples. These are simple CTLDs found in the absence of other protein domains, which have been identified as Group VII of the mammalian CTLD-containing proteins (Drickamer, 1993). Proteins consisting of isolated CTLDs, some with carbohydrate-binding properties and others with different binding specificities, have been characterized in snake venom and in a number of invertebrates (Hirabayashi et al., 1991).

The only domains found in association with CTLDs in both C. elegans and vertebrate proteins are the CUB, EGF-like, and sushi modules. CUB domains are also found in proteins that bind to CTLD-containing vertebrate proteins, such as mannose-binding protein-associated proteases. Also, although no vertebrate proteins are known to contain both VWF domains and CTLDs, several mammalian and reptilian proteins interact with proteins of the coagulation pathway that contain VWF domains. Examples include human serum tetranectin (Fuhlendorff et al., 1987) and factor IX and X binding proteins from snake venom (Atoda et al., 1991).

Even in those cases where proteins contain the same modules in both C. elegans and vertebrates, the arrangement of these domains is different. These results suggest that there have been extensive and distinct domain shuffling events in the lineages leading to present day vertebrates and invertebrates. However, it is also possible that proteins with domain organization similar to that in at least some of the C. elegans groups will eventually be found in vertebrates.

Classification of CTLDs based on sequence comparisons
As an alternative approach to analyzing the relationships between CTLDs in C. elegans, the sequences of individual CTLDs were compared. The CTLDs identified in the database searches described above were abstracted from the surrounding sequences in a uniform manner. As discussed in detail below, most of the CTLDs contain conserved cysteine residues near the N- and C-termini. These cysteine residues were used as markers to truncate the sequences at equivalent positions, with two residues included before and after the cysteines. In the few cases where one of these residues is not present, truncation was effected following pairwise alignment with the most similar available CTLD sequence that does contain the missing cysteine residue.
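
The truncation rule described above (cut at the conserved N- and C-terminal cysteines, keeping two flanking residues on each side) can be sketched as follows. The function name is hypothetical, and the positions of the two marker cysteines are assumed to be supplied by the caller, for example from a profile match or alignment.

```python
def trim_ctld(seq, n_cys_index, c_cys_index, flank=2):
    """Excise a CTLD from `seq` using its conserved marker cysteines.

    `n_cys_index` and `c_cys_index` are 0-based positions of the
    N- and C-terminal marker cysteines (assumed to be known already).
    `flank` residues are kept outside each cysteine.
    """
    if seq[n_cys_index] != "C" or seq[c_cys_index] != "C":
        raise ValueError("marker positions must both be cysteine")
    if n_cys_index >= c_cys_index:
        raise ValueError("N-terminal marker must precede C-terminal marker")
    start = max(0, n_cys_index - flank)
    end = min(len(seq), c_cys_index + 1 + flank)
    return seq[start:end]
```

For a domain missing one marker cysteine, the equivalent position would first have to be taken from a pairwise alignment with the closest CTLD, as the text describes; that step is not covered by this sketch.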

Using the truncated sequences as input, dendrograms were constructed with a variety of different parameters for calculating pairwise comparison scores. Variables included the scoring matrix and gap penalties. In addition, inclusion of various distantly related CTLDs from other species was used to root the trees in different ways. The major clusters from a dendrogram created with the C. elegans sequences themselves are shown in Figure 3. Eight clusters, labeled I through VIII in this diagram, were found to be stable under all of the different analysis conditions. The robust nature of these clusters suggests that, within each, the CTLDs are descended from a single progenitor module. The relationships between these clusters were not well defined, and varied depending on the parameters employed in constructing the dendrogram. Therefore, no attempt was made to define the descent of the different clusters from an original CTLD progenitor.
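
One simple way to express "stable under all analysis conditions" is to keep only those clusters whose membership is identical in every run. The sketch below takes this strict reading, which is an assumption; the text does not specify how stability was judged, and the function name and input format are hypothetical.

```python
def stable_clusters(clusterings):
    """Return clusters that appear with identical membership in every run.

    `clusterings` is a list of runs; each run is a list of clusters,
    and each cluster is an iterable of domain names.
    """
    runs = [{frozenset(cluster) for cluster in run} for run in clusterings]
    if not runs:
        return set()
    return set.intersection(*runs)
```

For example, if one parameter set groups domains a and b together and c and d together, while another keeps a and b together but separates c from d, only the a/b cluster is reported as stable.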



Fig. 3. Sequence relationships between CTLDs in C. elegans. Dendrograms were created using the Blosum comparison matrix, a gap opening penalty of 10 and a gap extension penalty of 0.1 for pairwise sequence comparisons. The length of the segment connecting any two CTLDs reflects their degree of sequence similarity. Major evolutionary categories are denoted I to VIII at the left. In proteins with multiple CTLDs, the domains are denoted a, b, c, and d from N- to C-terminus. CTLDs containing most of the Ca2+-binding site 2 ligands are marked as follows: M, similar to mannose-binding protein; N, similar to the chicken hepatic lectin (N-acetylglucosamine binding); G, similar to galactose-binding CRDs; S, not similar to specific CRDs.

 
Comparison of these results from sequence alignments with the overall domain arrangements is summarized in Table I. In most cases, members of a structural subgroup cluster together in the dendrogram, indicating that the members of each such subgroup are descended from a common progenitor gene in which the overall domain organisation was already established. This conclusion is borne out by the finding that, in proteins containing multiple CTLDs, the first and second copies usually fall into different clusters. This result reflects the fact that domain duplication within the gene preceded the overall duplication of the gene that led to formation of the different members of a group. However, some of the domain organizations described by the different structural groups appear to have arisen multiple times, since proteins with similar domain structures in some cases appear in different evolutionary clusters.


Table I. Comparison of CRD classifications
 
In order to investigate the relationship of the C. elegans CTLDs to those in other invertebrates and in vertebrates, a variety of additional CTLDs were included in the dendrograms, which were again generated using multiple sets of parameters. CTLDs from other organisms consistently segregate from the C. elegans CTLDs and thus do not fall within any of the clusters established in Figure 3. A vertebrate CTLD associated with any of the clusters was always the most outlying member and association with a given cluster was not robust under different analysis conditions. Thus, it is not possible to correlate any of the C. elegans groups with specific proteins in other organisms.

Arrangement of disulfide bonds in CTLDs
In CTLDs analyzed to date, six different disulfide bonds have been described. The positions of these bonds are summarized in Figure 4. Chemical evidence for the presence of each of these bonds, except number 4, has been provided in at least one CTLD (Fuhlendorff et al., 1987; Usami et al., 1993). The positions of disulfide bonds designated 1, 2, and 3 have been demonstrated by X-ray crystallography as well (Weis et al., 1991; Nielsen et al., 1997), while homology modeling of CTLDs containing disulfide bonds 5 and 6 shows that they could readily be accommodated into the C-type lectin fold.



Fig. 4. Disulfide bonds in CTLDs. Secondary structure shared by most CTLDs is summarized, with coils representing α-helices, jagged lines denoting β-strands and loops shown as curved segments. The numbering of these elements corresponds to the secondary structure organisation of rat serum mannose-binding protein (Weis et al., 1991). Potential disulfide bonds within the CTLD are numbered 1 through 6 and cysteines that participate in interchain disulfide bonds are numbered 7 through 9.

 
The patterns of cysteine residues in the CTLDs from C. elegans are consistent with the presence of disulfide bonds in each of the arrangements shown in Figure 4 except for bond type 4. No additional pairs of cysteine residues within the CTLDs are consistently evident for any of the subgroups, indicating that the cysteine residues are mostly involved in disulfide bonds of the types already characterized in vertebrate homologues. CTLDs lacking one of a pair of cysteine residues almost invariably also lack the cysteine side chain to which the first residue would be linked.

Like the CTLDs from other organisms, those from C. elegans each contain a subset of the possible disulfide bonds. As summarized in Table I, CTLDs in a given subgroup generally show the same disulfide bonds, although a few domains contain extra unique pairs of cysteine residues that might form disulfides. Thus, the similarity in disulfide bond structure in each subgroup reflects the overall similarity in sequence of the CTLDs as detected in the dendrogram shown in Figure 3.

Single cysteine residues appear in several of these groups at positions 7 and 8 as well as at a unique position 9. The turn between β-strands 3 and 4 is exposed on the surface of the domain, so it is expected that cysteine residues at position 9 would be accessible for formation of disulfide bonds. It is possible that such bonds could form with other cysteine residues within the same polypeptide but outside the CTLDs. However, no likely pairing partner is evident for any of these residues, suggesting that they are more likely to form interchain disulfide bonds. Homo- and hetero-dimer formation through cysteine residues at positions 7 and 8 has been particularly well documented in snake venom proteins containing CTLDs (Atoda et al., 1991; Usami et al., 1993).

Analysis of potential Ca2+-binding sites
The results discussed so far are based on overall sequence characteristics of CTLDs and thus reflect primarily a similarity in the basic fold of these domains in C. elegans compared with those in other organisms. The failure of any of the C. elegans domains to show particularly close sequence similarity to any of the subgroups of the mammalian CTLDs precludes total sequence comparison as a means of making predictions about likely functions of the C. elegans proteins. Overall sequence comparison is a relatively insensitive approach to detecting similarity in function, because many residues in the CTLD are not directly involved in function. Mutations will be readily accepted at such positions. At relatively large evolutionary distances, such as between mammals and C. elegans, a large number of sequence differences will accumulate at such nonessential positions. The high level of noise created by this process obscures the conservation of key residues essential to function.

A more sensitive approach to functional analysis is to look for conservation of amino acid side chains known to be important in CTLDs with known binding properties. Among the CTLDs, the best characterized is the CRD of mannose-binding protein. In this case, the key residues are those that surround one of the two Ca2+-binding sites, Ca2+ site 2, since this portion of the protein forms the carbohydrate-binding site (Weis et al., 1992; Ng et al., 1996). Five acid and amide side chains, which are highlighted in Figure 5, form this site. In E-selectin, the position of this Ca2+ site is conserved and is also an essential part of the ligand-binding site (Graves et al., 1994). However, one of these five amino acid side chains, from the glutamic acid residue located between loop 4 and β-strand 3, is present but takes up a different position and does not form part of the site. It is replaced by a water molecule that is hydrogen bonded to the asparagine residue marked with an asterisk in Figure 5.



Fig. 5. Sequence comparisons with C. elegans CTLDs that may bind Ca2+ and carbohydrate. Residues that match the sequence motif characteristic of CRDs are highlighted light blue, except the cysteine residues, which are highlighted yellow. The motif is indicated at the top of each section: H, hydrophobic; F, aromatic; A, aliphatic; O, oxygen-containing; C, E, G and W follow the one-letter amino acid code. Secondary structure elements are indicated as β-strands (S), α-helices (H), and loops (L). Ligands for Ca2+ 1 and 2 in rat mannose-binding proteins are denoted with 1 and 2, while potential ligands for the alternative Ca2+-binding site in CRD-4 of the mannose receptor are denoted 3. The residue that forms part of Ca2+ site 1 in the mannose-binding proteins, but is part of site 2 in E-selectin, is marked *. Potential site 1 ligands are highlighted in green and potential site 2 ligands are highlighted in red. Hydrophobic residues that make contact with carbohydrate ligands bound to the mannose-binding proteins are indicated with +, while the aromatic residues that confer additional selectivity for N-acetylglucosamine in the chicken hepatic lectin are marked with o. These residues are highlighted dark blue. Specific similarity to CRDs is indicated as in Figure 3. RHL-1, rat hepatic lectin 1; CHL, chicken hepatic lectin; MBP-A, rat serum mannose-binding protein; MMR-CRD-4, CRD-4 from human macrophage mannose receptor.

 
Multiple sequence alignments were used to identify CTLDs in C. elegans in which some or all of the residues that make up Ca2+ site 2 are conserved. CTLDs in which at least three of these amino acid residues are present as acidic or amide-containing side chains are shown in Figure 5. Few of the remaining CTLDs display conservation of more than one of these amino acid residues, suggesting that this set of residues has been selected as a group. However, it is not possible, from the available evidence, to determine whether these residues have been conserved during descent of the vertebrate CRDs and the C. elegans domains from a common precursor CTLD or whether they have been inserted on more than one occasion into the CTLD framework.
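
The selection criterion used above (at least three of the five site 2 positions occupied by an acidic or amide side chain) can be sketched as a count over alignment columns. The column indices passed in are assumed to come from an alignment against mannose-binding protein; the function names and example sequences are hypothetical.

```python
# Acidic (Asp, Glu) and amide (Asn, Gln) side chains, one-letter codes.
SITE2_LIGAND_TYPES = frozenset("DENQ")


def count_site2_ligands(aligned_seq, site2_columns):
    """Count site 2 columns occupied by an acidic or amide residue.

    Gap characters and other residues at those columns are not counted.
    """
    return sum(1 for col in site2_columns
               if aligned_seq[col] in SITE2_LIGAND_TYPES)


def has_potential_site2(aligned_seq, site2_columns, minimum=3):
    """Apply the 'at least three of five ligands' criterion."""
    return count_site2_ligands(aligned_seq, site2_columns) >= minimum
```

This treats acid and amide residues as interchangeable at every position, as the criterion in the text does; it says nothing about whether the particular arrangement matches a known sugar-binding site.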

Amino acids that form Ca2+ site 1 ligands in mannose-binding protein are less conserved in the CTLDs. No more than two of the four ligands are conserved in any of the sequences shown in Figure 5. As noted above, one of these ligands can also form part of site 2. The canonical Ca2+ site 1 seen originally in rat serum mannose-binding protein is absent from several other mammalian CRDs such as E-selectin (Graves et al., 1994). An alternative Ca2+-binding site has been detected in one of the CRDs of the macrophage mannose receptor (Mullin et al., 1997). Examination of the sequences of all of the CRDs reveals that these alternative Ca2+ ligands are also absent except in the second CTLDs of subgroup D3 proteins, in which acid or amide-containing side chains are present at up to three of the four positions. Thus, while the comparisons suggest that these CTLDs may contain the more generally conserved Ca2+ site 2, the residues necessary to form a second site are not present.

In order to form a functional Ca2+-binding site analogous to site 2 of the vertebrate CRDs, the conserved acid and amide side chains must be presented in the proper context. Several key residues help to establish the appropriate geometry for the binding site in mannose-binding protein. In strand β4, the two liganding residues are preceded by a tryptophan residue. The indole side chain of this residue projects into the hydrophobic core of the protein and packs with a conserved tryptophan residue in loop L3 (Weis et al., 1991). This latter residue is completely conserved in all the CTLDs in Figure 5, but the tryptophan residue in strand β4 is present in only some cases. In the others, it is replaced by a large, aliphatic side chain (leucine, isoleucine, or methionine) which would be expected to make some of the same contacts in the hydrophobic core. An additional key determinant of the positions of the site 2 Ca2+ ligands is a cis proline residue located between two of the ligands, which forms a turn between loops L3 and L4. This proline residue is a completely conserved feature of the CTLDs in Figure 5. The presence of this residue and the conserved tryptophan side chain in the preceding loop L3 makes it quite likely that the flanking side chains would be positioned much as they are in the crystal structures of vertebrate CRDs. These tryptophan and proline residues are each found in fewer than half of the CTLDs not shown in Figure 5.

As noted above, the arrangement of domains in C. elegans proteins that contain CTLDs is generally different from that in the vertebrate proteins containing C-type CRDs, although the subgroup A1 proteins resemble some of the simplest mammalian proteins (group VII) since they are CTLDs that lack accessory domains (Drickamer, 1993). In keeping with the general trends amongst the C. elegans proteins, all of the proteins shown in Figure 5 are predicted to be soluble and secreted. Beyond this common feature, there is little similarity in domain organisation between the different subgroups that contain CTLDs with potential Ca2+ sites 2.

In the case of proteins with two CTLDs, only one domain shows conservation of the motifs characteristic of Ca2+-binding site 2. This finding is consistent with the suggestion that the overall architecture of these proteins was established relatively early in evolution and that the two CTLDs within a polypeptide do not represent a recent duplication. This result also suggests that the two CTLDs may have diverged to bind two different types of ligands. As indicated in Figure 3, the CTLDs with potential Ca2+-binding site 2 fall into two clusters that are widely spaced in the overall dendrogram. This finding might suggest that the progenitor of all CTLDs contained these residues, which have been lost in most of the members of the superfamily. However, it is also possible that the constellation of ligands that may form a Ca2+-binding site analogous to site 2 in mannose-binding protein appeared independently in these two groups of CTLDs in C. elegans.

Potential carbohydrate-binding sites
The presence of potential Ca2+-binding site 2 ligands in several of the CTLDs in Figure 5 suggests that some of these sites might be arranged like the carbohydrate-binding sites in vertebrate CRDs, in which the relative positions of acid or amide side chain Ca2+ ligands are important determinants of carbohydrate-binding selectivity. For example, in mannose-binding protein the first and second Ca2+ site 2 ligands form cooperative hydrogen bonds with one hydroxyl group of a hexose residue while the third and fourth ligands form very similar bonds with an adjacent hydroxyl group (Weis et al., 1992). The sequence Glu-Pro-Asn in the turn between loops L3 and L4, containing the first and second Ca2+ ligands, is associated with binding of saccharides containing mannose and structurally related sugar residues, while the sequence Gln-Pro-Asp is associated with binding of ligands that contain galactose and related sugar residues (Drickamer, 1992). Interestingly, all but two of the CTLDs in Figure 5 conform to one or the other of these patterns.
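
The association between the L3/L4 turn sequence and sugar selectivity described above amounts to a simple lookup on the tripeptide. In the sketch below the function name and return labels are hypothetical, and the tripeptide is assumed to have been extracted from an alignment beforehand.

```python
def classify_turn_motif(tripeptide):
    """Map the L3/L4 turn tripeptide to the associated sugar class."""
    if tripeptide == "EPN":   # Glu-Pro-Asn
        return "mannose-type"
    if tripeptide == "QPD":   # Gln-Pro-Asp
        return "galactose-type"
    return "other"
```

A match to either pattern is only suggestive: as the following paragraphs show, the remaining site 2 ligands and nearby hydrophobic residues also need to be examined before a binding preference is proposed.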

Four members of subgroups A1/2/3, one member of subgroup A8 and all of the second CTLDs in subgroups D1 and D4 contain the Glu-Pro-Asn sequence. Of the CTLDs in subgroups A1/2/3, one contains a large deletion and another lacks several Ca2+ site 2 ligands. The remaining two subgroup A1/2/3 proteins, designated S in Figure 5, have potential Ca2+ site 2 ligands, but they differ in their arrangement compared to known mannose- or galactose-binding CRDs. Whereas the second and third residues are glutamic acid and asparagine in the mammalian CRDs, in these two proteins they are glutamine and aspartic acid. Although this switch is analogous to the change in the first and second ligands between the mannose- and galactose-binding CRDs, these sites would not be directly analogous to any known sugar-binding CRDs.

The CTLD in the subgroup A8 protein and all of the subgroup D4 proteins show perfect conservation of all five Ca2+ site 2 ligands in rat serum mannose-binding protein. The proteins in subgroup D1 differ only in the presence of asparagine instead of aspartic acid as the fifth Ca2+ site 2 ligand in β-strand S4. The structures of CRDs from serum and liver mannose-binding protein suggest that this substitution could be tolerated with little effect on the ligand-binding site, as this side chain provides an axial ligand for Ca2+ and does not interact directly with bound carbohydrate. Thus, all of these proteins share the characteristics expected of CRDs that bind saccharides containing hexose or hexosamine residues related to mannose in the disposition of hydroxyl groups 3 and 4.

Nonpolar residues at two additional positions, marked with + and highlighted in dark blue in Figure 5, also play a role in carbohydrate binding to the mannose-binding CRDs by making packing interactions with the ligand. The presence of valine or isoleucine residues at the second of these positions in β-strand 4 in the subgroup A8, D1, and D4 proteins corresponds to the presence of these two amino acid side chains in serum and liver mannose-binding protein and in CRD-4 of the macrophage mannose receptor (Weis et al., 1992; Ng et al., 1996; Hitchen et al., 1998). The other hydrophobic residue, in loop L4, is more variable in the vertebrate CRDs that bind mannose, as it can be either aromatic or aliphatic. The aliphatic character of this residue in the subgroup D1 and D4 CTLDs is consistent with the formation of a binding site for mannose and related hexose and hexosamine residues in these proteins. Overall, the second CTLDs in subgroup D1 and D4 proteins are very similar to liver mannose-binding protein at Ca2+ site 2.

In contrast to mannose-binding proteins, which bind saccharides containing mannose, N-acetylglucosamine and fucose, the chicken hepatic lectin binds selectively to N-acetylglucosamine. This narrowed selectivity is believed to result from the presence of additional contacts with the 2-substituent, which are mediated in part by two aromatic residues flanking the cysteine residue that follows β-strand 4 (Burrows et al., 1997). These positions, marked with o and highlighted in dark blue in Figure 5, are occupied by aromatic amino acids in one of the group D4 proteins as well as in one of the group A8 proteins. The designation N is used in Figure 5 to indicate the two CTLDs that share these features of the N-acetylglucosamine-binding site, while M is used to denote the remaining domains that are similar to mannose-binding CRDs.

The remaining five CTLDs in Figure 5 from subgroups A1/2/3 as well as the three CTLDs in subgroup D3 contain the sequence Gln-Pro-Asp at the turn between loops L3 and L4. This arrangement is similar to vertebrate CRDs that bind galactose and structurally related sugars (Drickamer, 1992). However, the positions of acid and amide groups are not conserved at some of the other positions associated with Ca2+ site 2. As noted above, the change of the fifth Ca2+ ligand from aspartic acid to asparagine, seen in the subgroup D3 proteins, might be expected to have little effect on carbohydrate-binding activity. Thus, the CTLDs in this subgroup are the most like known galactose-binding domains and are designated G in Figure 5. However, galactose bound to vertebrate lectins such as asialoglycoprotein receptors and aggrecan packs against an aromatic residue at the first conserved hydrophobic position in loop L4, marked with + in Figure 5 (Iobst and Drickamer, 1994), and no aromatic residues are present at the corresponding positions in the subgroup D3 CTLDs. The remaining CTLDs in subgroups A1/2/3 are marked S since they are not closely similar to specific vertebrate CRDs.
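The tripeptide distinction described above (Glu-Pro-Asn in mannose-type sites, Gln-Pro-Asp in galactose-type sites) can be illustrated with a minimal Python sketch that scans a one-letter amino acid sequence for either motif. This is not the procedure used in this work: the function name, the hint labels and the toy sequence are hypothetical, and a motif hit alone would not justify an M, N, G or S designation, which also depends on the other Ca2+ site 2 ligands and the hydrophobic positions discussed above.

```python
import re

# Tripeptides at the turn between loops L3 and L4, as named in the text.
# The hint strings are illustrative labels, not designations from Figure 5.
MOTIF_HINTS = {
    "EPN": "mannose-like (Glu-Pro-Asn)",
    "QPD": "galactose-like (Gln-Pro-Asp)",
}


def find_site2_motifs(sequence):
    """Return (1-based position, motif, hint) for each EPN or QPD occurrence."""
    hits = []
    # A lookahead is used so that every start position is examined.
    for match in re.finditer(r"(?=(EPN|QPD))", sequence.upper()):
        motif = match.group(1)
        hits.append((match.start() + 1, motif, MOTIF_HINTS[motif]))
    return hits


if __name__ == "__main__":
    toy_sequence = "ACDWEPNNGKCQPDW"  # hypothetical fragment, not a real CTLD
    for position, motif, hint in find_site2_motifs(toy_sequence):
        print(position, motif, hint)
```

A hit of this kind would only flag a domain for closer inspection of the remaining ligand positions.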

Potential for oligomer formation
An important aspect of carbohydrate recognition by many mammalian C-type lectins is the presentation of multiple sites with weak affinity for simple monosaccharides, which results in enhanced affinity and selectivity for multivalent oligosaccharides. The necessary clustering of CRDs is usually achieved by oligomerization of polypeptides containing single CRDs, which is often brought about by formation of coiled-coils of α-helices in domains adjacent to the CRDs (Beavil et al., 1992; Drickamer, 1999). None of the C.elegans proteins that contain CTLDs appear to have such oligomerization sequences. However, some of the relatively simple group A proteins may oligomerize by direct domain-domain contacts (Drickamer, 1999) or through other mechanisms.

Other potential carbohydrate-binding proteins in C.elegans
In addition to the Ca2+-dependent C-type lectins, there are several other groups of animal lectins that mediate intracellular and extracellular recognition events. A brief screen for members of these other lectin families in C.elegans was undertaken. Examination of the database reveals 13 members of the galectin family of soluble, galactose-binding proteins. A detailed examination of these sequences lies outside the scope of this review. However, it can be noted that some of these proteins contain the conserved residues that confer lactose- and N-acetyllactosamine-binding activity on the known mammalian proteins (Barondes et al., 1994). Several of these proteins have previously been isolated and shown to bind carbohydrates like their mammalian homologues. Like the mammalian proteins, some are single carbohydrate-binding domains and others are pairs of such domains. In general, it appears that it is possible to identify some of these proteins as true orthologues of their mammalian counterparts. However, there is also a subset of proteins that are clearly homologous but lack key carbohydrate-binding residues, suggesting that proteins in this structural group have also diverged to serve multiple functions.

Homologues for two types of lectins involved in sorting events within luminal compartments of cells have previously been identified. These include a calnexin precursor (Trombetta and Helenius, 1998) and two proteins homologous to the L-type lectins, ERGIC-53 and VIP-36 (Fiedler and Simons, 1994; Drickamer, 1995). Since these latter proteins are distantly related to the legume lectins, it is clear that the legume lectins represent a very old lineage of carbohydrate-binding proteins. The wide distribution of these sorting proteins suggests that they have played an important role in eukaryotic cells over a long evolutionary period. In contrast, searches with motifs and individual examples of mannose 6-phosphate receptors failed to identify homologues in C.elegans. The absence of mannose 6-phosphate receptors (P-type lectins) is consistent with the fact that, although various forms of mannose 6-phosphate are present on invertebrate proteins such as those from Dictyostelium, the receptor has not been identified (Mehta et al., 1996).


    Discussion
 
The presence of such a large number of members of the CTLD family of protein domains in C.elegans and the diversity of the protein contexts in which these domains are found suggest that this type of domain can serve a variety of different functions. It is reasonable to conclude, from the presence of a CTLD motif, that the encoded domain is likely to have a fold similar to that of the C-type lectin CRDs. However, the presence of a CTLD alone is not sufficient to make conclusions about whether a particular protein will bind to carbohydrate in a Ca2+-dependent manner. Sequences must be analyzed for the presence of potential Ca2+ ligands in order to generate an informed hypothesis about the possibility of carbohydrate-binding activity of the type that has been shown to be biologically significant for C-type CRDs. Knowledge of structure–function relationships in CRDs is sufficient to identify CTLDs that may resemble vertebrate CRDs near Ca2+ site 2, which would allow sequence-based predictions about potential ligands for a subset of CTLDs such as those shown in Figure 5. However, in the face of increasing evidence about novel ways that carbohydrates can be ligated around Ca2+ site 2, it will be necessary to refine these predictions.

As noted in Figure 2, all of the CTLDs that resemble CRDs are found in proteins that are likely to be secreted from cells, suggesting that they may function by recognition of extracellular glycoconjugates. Our present understanding of the structures of glycoproteins in C.elegans is limited, but analysis of N-acetylglucosaminyl and fucosyl transferases suggests that complex as well as high mannose N-linked oligosaccharides are likely to be present (DeBose-Boyd et al., 1998; Chen et al., 1999). In addition, there is evidence for expression of multiple N-acetylgalactosaminyl transferases capable of initiating O-linked glycan synthesis (Hagen and Nehrke, 1998). Although it is possible that some of the potential CRDs identified in this work might bind to endogenous glycoproteins, the fact that none of these proteins have membrane anchors suggests that they do not function in the endocytic and cell adhesion functions mediated by many of the C-type vertebrate lectins. By analogy to the soluble mannose-binding proteins of vertebrates, possible roles in the innate immune response based on recognition of exogenous carbohydrates must also be considered. Such functions might be particularly important in the absence of an adaptive immune system.

Further structural studies of CTLDs are likely to reveal some of the mechanisms through which these domains interact with noncarbohydrate ligands. For example, the recently determined structure of a CTLD from CD94, a natural killer cell receptor, provides insights into how this family of proteins interacts with histocompatibility antigens in a carbohydrate- and Ca2+-independent manner (Boyington et al., 1999). The CD94 structure also demonstrates that absence of the key Ca2+ site 2 ligands is indeed correlated with the absence of this Ca2+-binding site, which forms an essential part of the Ca2+-dependent carbohydrate-binding site. As more such structures emerge, it will become possible to search for residues that would predict alternative types of ligand-binding activity.

In the absence of Ca2+ site 2, CTLDs might bind carbohydrates through mechanisms other than those that have been demonstrated to date. However, there is no experimental evidence that CTLDs lacking this Ca2+ site bind carbohydrate in a biologically significant manner. This observation, combined with the increasing number of examples of CTLDs lacking Ca2+ site 2 that bind protein ligands in a carbohydrate-independent manner, suggests that such CTLDs are no more likely than any other type of protein module to bind saccharide ligands. Evidence presented here indicating that only a small subset of CTLDs in C.elegans are likely to display Ca2+-dependent carbohydrate-binding activity suggests that any effort to demonstrate such activity should be focused on this subset. In addition, sound predictions about carbohydrate-binding activity should underpin speculation about the roles of carbohydrate in processes mediated by proteins containing CTLDs.


    Materials and methods
 
Database searching
Sequences were downloaded from the C.elegans website maintained at the Sanger Centre at Hinxton, UK (http://www.sanger.ac.uk/Projects/C_elegans). The annotated list of open reading frames was first searched for any comment lines containing "C-type," "lectin," "mannose," or "galactose." The total protein sequence database was then screened using the blastp server. Search sequences included the CRD from rat serum mannose-binding protein A, the major subunit of the rat hepatic asialoglycoprotein receptor and E-selectin.
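The initial keyword screen of comment lines could be reproduced along the following lines. This is only a sketch under stated assumptions: the record layout (a pair of open reading frame name and comment text), the function name, the example identifiers and the case-insensitive matching are all hypothetical and are not taken from the annotated list itself.

```python
# Keywords quoted in the text for the first-pass screen of annotation comments.
KEYWORDS = ("C-type", "lectin", "mannose", "galactose")


def screen_comments(records, keywords=KEYWORDS):
    """Return names of records whose comment mentions any keyword.

    `records` is an iterable of (orf_name, comment) pairs; matching is
    case-insensitive, which is an assumption of this sketch.
    """
    lowered = [keyword.lower() for keyword in keywords]
    selected = []
    for orf_name, comment in records:
        text = comment.lower()
        if any(keyword in text for keyword in lowered):
            selected.append(orf_name)
    return selected


if __name__ == "__main__":
    # Hypothetical annotation lines, for illustration only.
    example_records = [
        ("ORF_0001", "similar to C-type lectin"),
        ("ORF_0002", "protein kinase"),
        ("ORF_0003", "weak similarity to mannose receptor"),
    ]
    print(screen_comments(example_records))
```

A text screen of this kind finds only annotated candidates, which is why the sequence database was also searched directly with blastp.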

Identification of CTLDs and other protein domains
Domain organization was verified using the ProfileScan server at the Swiss Institute for Experimental Cancer Research (http://www.isrec.isb-sib.ch/software/PFSCAN_form.html) to compare each sequence against the profiles available in the SwissProt (Swiss Institute for Experimental Cancer Research) and PfamA (Sanger Centre) databases. Sequence comparisons for unknown domains were made using the FastA algorithm to search the entire SwissProt database through the European Bioinformatics Institute server (http://www2.ebi.ac.uk/fasta3/).

Sequence comparisons
Pairwise sequence comparisons, dendrogram construction and multiple sequence alignments were performed using the AlignX software package in the BioSuite package from InforMax (North Bethesda, MD) running on a personal computer equipped with a 266 MHz Pentium II microprocessor. Scoring of alignments was based on the Blosum and PAM250 matrices. Gap opening penalties were varied between 5 and 20 and gap extension penalties were varied between 0.1 and 0.5. Dendrograms were constructed using the Clustal W algorithm implemented in the same package. Mammalian CRD sequences used for comparison included those from human aggrecan (group I), chicken hepatic lectin (group II), the major subunit of the rat hepatic asialoglycoprotein receptor or rat hepatic lectin 1 (group II), rat serum mannose-binding protein (group III), human E-selectin (group IV), and CRDs 4 and 6 of the human macrophage mannose receptor (group VI). Mammalian and non-mammalian CTLDs tested against the C.elegans proteins included several natural killer cell receptors (Weis et al., 1998), rattlesnake (Hirabayashi et al., 1991), barnacle (Muramoto and Kamiya, 1990), tunicate (Suzuki et al., 1990) and flesh fly lectins (Takahashi et al., 1985), fish antifreeze proteins (Davies and Sykes, 1997), phospholipase inhibitors (Inoue et al., 1991) and coagulation factor binding proteins (Fuhlendorff et al., 1987; Atoda et al., 1991).
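To give a sense of how much the stated ranges of gap penalties change the cost of a gap, the following Python sketch tabulates an affine gap cost over a small grid of opening and extension penalties. It does not reproduce the AlignX or Clustal W scoring: the formula (opening penalty plus extension penalty times gap length) is one common convention and is assumed here, the intermediate opening value of 10 is chosen only for illustration, and all names are hypothetical.

```python
from itertools import product

# Endpoints are the ranges quoted in the text; the value 10 is an
# illustrative intermediate point, not a setting reported in the paper.
OPEN_VALUES = (5, 10, 20)
EXTEND_VALUES = (0.1, 0.5)


def affine_gap_cost(length, gap_open, gap_extend):
    """Cost of one gap of `length` residues under an assumed affine model."""
    if length <= 0:
        return 0.0
    return gap_open + gap_extend * length


def gap_cost_table(length):
    """Map each (gap_open, gap_extend) pair in the grid to the gap cost."""
    return {
        (gap_open, gap_extend): affine_gap_cost(length, gap_open, gap_extend)
        for gap_open, gap_extend in product(OPEN_VALUES, EXTEND_VALUES)
    }


if __name__ == "__main__":
    for (gap_open, gap_extend), cost in sorted(gap_cost_table(10).items()):
        print(f"open={gap_open:>2} extend={gap_extend}: cost={cost:.1f}")
```

Under this assumed model, a 10-residue gap costs roughly four times more at the stringent end of the grid than at the permissive end, which is why alignments of divergent CTLDs are worth comparing across several settings.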


    Acknowledgments
 
This work was supported by Grant 48104 from the Wellcome Trust. We thank Maureen Taylor for helpful discussions and comments on the manuscript.


    Abbreviations
 
CRD, carbohydrate-recognition domain; CTLD, C-type lectin-like domain; EGF, epidermal growth factor; LDL, low density lipoprotein; VWF, von Willebrand factor.


    Footnotes
 
a To whom correspondence should be addressed at: Department of Biochemistry, University of Oxford, South Parks Road, Oxford OX1 3QU, United Kingdom


    References
 
1 Atoda,H., Hyuga,M. and Morita,T. (1991) The primary structure of coagulation factor IX/factor X-binding protein isolated from the venom of Trimeresurus flavoviridis: homology with asialoglycoprotein receptors, proteoglycan core protein, tetranectin and lymphocyte Fcε receptor for immunoglobulin E. J. Biol. Chem., 266, 14903–14911.

2 Barondes,S.H., Cooper,D.N.W., Gitt,M.A. and Leffler,H. (1994) Galectins: structure and function of a large family of animal lectins. J. Biol. Chem., 269, 20807–20810.

3 Beavil,A.J., Edmeades,R.L., Gould,H.J. and Sutton,B.J. (1992) α-Helical coiled-coil stalks in the low-affinity receptor for IgE (FcεRII/CD23) and related C-type lectins. Proc. Natl. Acad. Sci. USA, 89, 753–757.

4 Bentley,D.R. (1988) Structural superfamilies of the complement system. Exp. Clin. Immunogenet., 5, 69–80.

5 Bockaert,J. and Pin,J.P. (1999) Molecular tinkering of G protein-coupled receptors: an evolutionary success. EMBO J., 18, 1723–1729.

6 Bork,P. and Beckmann,G. (1993) The CUB domain: a widespread module in developmentally regulated proteins. J. Mol. Biol., 231, 539–545.

7 Boyington,J.C., Riaz,A.N., Patamawenu,A., Coligan,J.E., Brooks,A.G. and Sun,P.D. (1999) Structure of CD94 reveals a novel C-type lectin fold: implications for the NK cell-associated CD94/NKG2 receptors. Immunity, 10, 75–82.

8 Burrows,L., Iobst,S.T. and Drickamer,K. (1997) Selective binding of N-acetylglucosamine to the chicken hepatic lectin. Biochem. J., 324, 673–680.

9 Chen,S., Zhou,S., Sakar,M., Spence,A.M. and Schachter,H. (1999) Expression of three Caenorhabditis elegans N-acetylglucosaminyltransferase I genes during development. J. Biol. Chem., 274, 288–297.

10 C.elegans Sequencing Consortium (1998) Genome sequence of the nematode C.elegans: a platform for investigating biology. Science, 282, 2012–2018.

11 Davies,P.L. and Sykes,B.D. (1997) Antifreeze proteins. Curr. Opin. Struct. Biol., 7, 828–834.

12 DeBose-Boyd,R.A., Nyame,A.K. and Cummings,R.D. (1998) Molecular cloning and characterization of an α1,3 fucosyltransferase, CEFT-1, from Caenorhabditis elegans. Glycobiology, 9, 905–917.

13 Drickamer,K. (1988) Two distinct classes of carbohydrate-recognition domains in animal lectins. J. Biol. Chem., 263, 9557–9560.

14 Drickamer,K. (1992) Engineering galactose-binding activity into a C-type mannose-binding protein. Nature, 360, 183–186.

15 Drickamer,K. (1993) Ca2+-dependent carbohydrate-recognition domains in animal proteins. Curr. Opin. Struct. Biol., 3, 393–400.

16 Drickamer,K. (1995) Increasing diversity of animal lectin structures. Curr. Opin. Struct. Biol., 5, 612–616.

17 Drickamer,K. (1999) C-type lectin-like domains. Curr. Opin. Struct. Biol., 9, in press.

18 Elhammer,A.P., Poorman,R.A., Brown,E., Maggiora,L.L., Hoogerheide,J.G. and Kezdy,F.J. (1993) The specificity of UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase as inferred from a database of in vivo substrates and the in vitro glycosylation of proteins and peptides. J. Biol. Chem., 268, 10029–10038.

19 Fiedler,K. and Simons,K. (1994) A putative novel class of animal lectins in the secretory pathway homologous to leguminous lectins. Cell, 77, 625–626.

20 Fuhlendorff,J., Clemmensen,I. and Magnusson,S. (1987) Primary structure of tetranectin, a plasminogen kringle 4 binding plasma protein: homology with asialoglycoprotein receptor and cartilage proteoglycan core protein. Biochemistry, 26, 6757–6764.

21 Graves,B.J., Crowther,R.L., Chandran,C., Rumberger,J.M., Li,S., Huang,K.-S., Presky,D.H., Familletti,P.C., Wolitzky,B.A. and Burns,D.K. (1994) Insight into E-selectin/ligand interaction from the crystal structure and mutagenesis of the lec/EGF domains. Nature, 367, 532–538.

22 Hagen,F.K. and Nehrke,K. (1998) cDNA cloning and expression of a family of UDP-N-acetylgalactosamine:polypeptide N-acetylgalactosaminyltransferase sequence homologs from Caenorhabditis elegans. J. Biol. Chem., 273, 8268–8277.

23 Hirabayashi,J., Kusunoki,T. and Kasai,K. (1991) Complete primary structure of a galactose-specific lectin from the venom of the rattlesnake Crotalus atrox: homologies with Ca2+-dependent-type lectins. J. Biol. Chem., 266, 2320–2326.

24 Hitchen,P.G., Mullin,N.P. and Taylor,M.E. (1998) Orientation of sugars bound to the principal C-type carbohydrate-recognition domain of the macrophage mannose receptor. Biochem. J., 333, 601–608.

25 Hohenester,E., Sasaki,T., Olsen,B.R. and Timpl,R. (1998) Crystal structure of the angiogenesis inhibitor endostatin at 1.5 Å resolution. EMBO J., 17, 1656–1664.

26 Inoue,S., Kogaki,H., Ikeda,K., Samejima,Y. and Omori-Satoh,T. (1991) Amino acid sequences of the two subunits of a phospholipase A2 inhibitor from the blood plasma of Trimeresurus flavoviridis: sequence homologies with pulmonary surfactant apoprotein and animal lectins. J. Biol. Chem., 266, 1001–1007.

27 Iobst,S.T. and Drickamer,K. (1994) Binding of sugar ligands to Ca2+-dependent animal lectins. II. Generation of high affinity galactose binding by site-directed mutagenesis. J. Biol. Chem., 269, 15512–15519.

28 Kohda,D., Morton,C.J., Parkar,A.A., Hatanaka,H., Inagaki,F.M., Campbell,I.D. and Day,A.J. (1996) Solution structure of the link module: a hyaluronan-binding domain involved in extracellular matrix stability and cell migration. Cell, 86, 767–775.

29 Mehta,D.P., Ichikawa,M., Salimath,P.V., Etchison,J.R., Haak,R., Manzi,A. and Freeze,H.H. (1996) A lysosomal cysteine proteinase from Dictyostelium discoideum contains N-acetylglucosamine-1-phosphate bound to serine but not mannose-6-phosphate on N-linked oligosaccharides. J. Biol. Chem., 271, 10897–10903.

30 Mewes,H.W., Albermann,K., Bahr,M., Frishman,D., Gleissner,A., Hani,J., Heumann,K., Kleine,K., Maierl,A., Oliver,S.G., Pfeiffer,F. and Zollner,A. (1997) Overview of the yeast genome. Nature, 387, S7–S8.

31 Mizuno,H., Fujimoto,Z., Koizumi,M., Kano,H., Atoda,H. and Morita,T. (1997) Structure of coagulation factors IX/X-binding protein, a heterodimer of C-type lectin domains. Nature Struct. Biol., 4, 438–441.

32 Mullin,N.P., Hitchen,P.G. and Taylor,M.E. (1997) Mechanism of Ca2+ and monosaccharide binding to a C-type carbohydrate recognition domain of the macrophage mannose receptor. J. Biol. Chem., 272, 5668–5681.

33 Muramoto,K. and Kamiya,H. (1990) The amino-acid sequence of multiple lectins from the acorn barnacle Megabalanus rosa and its homology with animal lectins. Biochim. Biophys. Acta, 1039, 42–51.

34 Ng,K.K.-S., Drickamer,K. and Weis,W.I. (1996) Structural analysis of monosaccharide recognition by rat liver mannose-binding protein. J. Biol. Chem., 271, 663–674.

35 Nielsen,B.B., Kastrup,J.S.,H.,R., Holtet,T.L., Graversen,J.H., Etzerodt,M., Thogersen,H.C. and Larsen,I.K. (1997) Crystal structure of tetranectin, a trimeric plasminogen-binding protein with an alpha-helical coiled coil. FEBS Lett., 412, 388–396.

36 Suzuki,T., Takagi,T., Furukohri,T., Kawamure,K. and Nakauchi,M. (1990) A calcium-dependent galactose-binding lectin from the tunicate Polyandrocarpa misakiensis: isolation, characterization and amino acid sequence. J. Biol. Chem., 265, 1274–1281.

37 Takahashi,H., Komano,H., Kawagushi,N., Kitamura,N., Nakanishi,S. and Natori,S. (1985) Cloning and sequencing of cDNA of Sarcophaga peregrina humoral lectin induced on injury of the body wall. J. Biol. Chem., 260, 12228–12233.

38 Trombetta,E.S. and Helenius,A. (1998) Lectins as chaperones in glycoprotein folding. Curr. Opin. Struct. Biol., 8, 587–592.

39 Usami,Y., Fujimura,Y., Suzuki,M., Ozeki,Y., Nishio,K., Fukui,H. and Titani,K. (1993) Primary structure of two-chain botrocetin, a von Willebrand factor modulator purified from the venom of Bothrops jararaca. Proc. Natl. Acad. Sci. USA, 90, 928–932.

40 Weis,W.I., Kahn,R., Fourme,R., Drickamer,K. and Hendrickson,W.A. (1991) Structure of the calcium-dependent lectin domain from a rat mannose-binding protein determined by MAD phasing. Science, 254, 1608–1615.

41 Weis,W.I., Drickamer,K. and Hendrickson,W.A. (1992) Structure of a C-type mannose-binding protein complexed with an oligosaccharide. Nature, 360, 127–134.

42 Weis,W.I., Taylor,M.E. and Drickamer,K. (1998) The C-type lectin superfamily in the immune system. Immunol. Rev., 163, 19–34.