Glycobiology Institute, Department of Biochemistry, University of Oxford, Oxford OX1 3QU, UK
Received on April 16, 1999; revised on June 3, 1999; accepted on June 4, 1999.
Introduction
Among the types of protein modules found in substantial numbers in C. elegans, but absent from yeast, are those that fit the profile of the C-type lectin-like domains (CTLDs) of higher eukaryotes. These types of protein modules were originally identified as carbohydrate-recognition domains (CRDs) in a family of Ca2+-dependent animal lectins, including the asialoglycoprotein receptor, its chicken homologue and serum and liver mannose-binding proteins (Drickamer, 1988). Similar domains have since been described in other vertebrate and invertebrate carbohydrate-binding proteins (Drickamer, 1993). Less closely related but still definitely homologous domains have been identified in a variety of proteins that do not appear to have carbohydrate-binding activity. Many such CTLDs are found in receptors on the surface of natural killer lymphocytes (Weis et al., 1998), while others include ligand-binding domains in proteins that bind to various blood coagulation factors (Fuhlendorff et al., 1987; Atoda et al., 1991), receptors for phospholipases (Inoue et al., 1991), Ca2+-binding proteins associated with pancreatic disease (Drickamer, 1993) and antifreeze proteins from arctic fishes (Davies and Sykes, 1997).
All of the domains in the CTLD group show distinct evidence of sequence similarity and are thus believed to have descended from a common ancestor by a process of divergent evolution. Additional groups of protein domains, such as link protein modules (Kohda et al., 1996) and endostatin (Hohenester et al., 1998), share topological folding characteristics with the CTLDs but sequence comparisons show no evidence of homology. It is quite likely that these domains have achieved a similar folding topology through a process of convergent evolution. Relationships between different protein modules that display the C-type lectin fold are summarized in Figure 1.
In the present work, C. elegans proteins containing the CTLD motif have been compared to reveal their overall domain organization, to establish evolutionary relationships amongst the CTLDs and to determine the degree of conservation of amino acid residues that form Ca2+- and carbohydrate-binding sites in vertebrate CRDs. The results raise the possibility that a small subset of these proteins have carbohydrate recognition functions analogous to those of vertebrate CRDs.
Results
An initial classification of proteins was made based on the number of CTLDs present. The profile scans also revealed the presence and location of other types of protein motif that match known profiles. The presence of these other types of protein modules led to the classification shown in Figure 2. This analysis resulted in nine major groups: (A) single CTLDs, (B) single CTLDs and von Willebrand Factor (VWF) domains, (C) CTLDs and CUB domains (a family of domains originally identified in complement proteins and bone morphogenetic protein 1; Bork and Beckmann, 1993), (D) two CTLDs, (E) three CTLDs, (F) CTLDs with low density lipoprotein (LDL) receptor motifs, (G) CTLDs and VWF domains along with epidermal growth factor (EGF)-like domains, (H) CTLDs and sushi domains (short consensus repeats; Bentley, 1988), and (I) CTLDs with seven-transmembrane-type hormone receptor domains (Bockaert and Pin, 1999).
Several types of linking regions could also be identified and used in further categorisation of the CTLD-containing proteins. Mucin-type domains are characterized by the presence of repeated stretches of serine and threonine residues. This density of hydroxyl-containing amino acids and the presence of a high proportion of proline residues are consistent with the suggestion that these segments are likely sites of O-linked glycosylation (Elhammer et al., 1993). Other repetitive sequences are rich in glycine, cysteine, or proline. Subgroups of groups A and D were defined based on the presence and positions of these linking domains.
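The idea of flagging mucin-type linkers by their density of serine, threonine and proline residues can be sketched as a sliding-window scan. This is only an illustration and not the procedure used in this work: the function names, the window size and the threshold are all assumptions chosen for the example.

```python
def ser_thr_pro_fraction(window: str) -> float:
    """Fraction of residues in the window that are Ser (S), Thr (T) or Pro (P)."""
    if not window:
        return 0.0
    return sum(residue in "STP" for residue in window) / len(window)


def find_mucin_like_windows(sequence: str, window_size: int = 20,
                            threshold: float = 0.5) -> list[int]:
    """Return 0-based start positions of windows rich in Ser/Thr/Pro.

    window_size and threshold are hypothetical defaults, not values
    taken from the paper.
    """
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - window_size + 1):
        window = sequence[start:start + window_size]
        if ser_thr_pro_fraction(window) >= threshold:
            hits.append(start)
    return hits
```

Overlapping hits would normally be merged into a single candidate linker region before being used to define subgroups; that step is omitted here for brevity.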
In addition to the CTLDs and other identified domains, several regions of 100–200 amino acids that lack simple repetitive motifs are likely to be globular domains but are not recognized by the available profile databases. These sequences were screened against the complete SwissProt protein database. Several were found to give partial matches to proteins with CTLDs and CUB domains, but examination of the alignments indicated that these domains could be aligned only to part of the typical domains in these families. Thus, some of these unidentified segments may represent truncated modules. Other segments were found to match domains in a number of C. elegans proteins but there were no systematic similarities to domains in proteins from other organisms. All of these domains were treated as unknown, and their presence was used to distinguish subgroups as shown in Figure 2.
For the most part, the organization of C. elegans proteins summarized in Figure 2 differs from those that have been identified to date in other invertebrates and in vertebrates. Most vertebrate C-type lectins contain one CRD per polypeptide chain (Drickamer, 1993), but the only examples that resemble C. elegans proteins in overall organization are like the subgroup A1 examples. These are simple CTLDs found in the absence of other protein domains, which have been identified as group VII of the mammalian CTLD-containing proteins (Drickamer, 1993). Proteins consisting of isolated CTLDs, some with carbohydrate-binding properties and others with different binding specificities, have been characterized in snake venom and in a number of invertebrates (Hirabayashi et al., 1991).
The only domains found in association with CTLDs in both C. elegans and vertebrate proteins are the CUB, EGF-like, and sushi modules. CUB domains are also found in proteins that bind to CTLD-containing vertebrate proteins, such as mannose-binding protein-associated proteases. Also, although no vertebrate proteins are known to contain both VWF domains and CTLDs, several mammalian and reptilian proteins interact with proteins of the coagulation pathway that contain VWF domains. Examples include human serum tetranectin (Fuhlendorff et al., 1987) and factor IX and X binding proteins from snake venom (Atoda et al., 1991).
Even in those cases where proteins contain the same modules in both C. elegans and vertebrates, the arrangement of these domains is different. These results suggest that there have been extensive and distinct domain shuffling events in the lineages leading to present day vertebrates and invertebrates. However, it is also possible that proteins with domain organization similar to that in at least some of the C. elegans groups will eventually be found in vertebrates.
Classification of CTLDs based on sequence comparisons
As an alternative approach to analyzing the relationships between CTLDs in C. elegans, the sequences of individual CTLDs were compared. The CTLDs identified in the database searches described above were abstracted from the surrounding sequences in a uniform manner. As discussed in detail below, most of the CTLDs contain conserved cysteine residues near the N- and C-termini. These cysteine residues were used as markers to truncate the sequences at equivalent positions, with two residues included before and after the cysteines. In the few cases where one of these residues is not present, truncation was effected following pairwise alignment with the most similar available CTLD sequence that does contain the missing cysteine residue.
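The truncation step described above can be sketched in a few lines. The sketch makes a simplifying assumption that is not stated in the text: it treats the first and last cysteine in the supplied segment as the conserved N- and C-terminal markers. The function name is hypothetical, and the fallback used in this work when a marker cysteine is missing (pairwise alignment to a similar CTLD) is not implemented; the sketch simply raises an error in that case.

```python
def truncate_at_cysteines(domain: str, flank: int = 2) -> str:
    """Trim a domain so it spans its first and last cysteine plus `flank`
    residues on each side.

    Assumption: the first and last Cys in `domain` are the conserved
    marker residues. Sequences lacking one marker need alignment-based
    handling, which is outside this sketch.
    """
    domain = domain.upper()
    first = domain.find("C")
    last = domain.rfind("C")
    if first == -1 or first == last:
        raise ValueError("need at least two cysteine residues to define both ends")
    start = max(first - flank, 0)               # do not run off the N-terminus
    end = min(last + 1 + flank, len(domain))    # do not run off the C-terminus
    return domain[start:end]
```

For example, `truncate_at_cysteines("MKLAACWGGGCEKPF")` keeps the two residues before the first cysteine and the two after the last one.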
Using the truncated sequences as input, dendrograms were constructed with a variety of different parameters for calculating pairwise comparison scores. Variables included the scoring matrix and gap penalties. In addition, inclusion of various distantly related CTLDs from other species was used to root the trees in different ways. The major clusters from a dendrogram created with the C. elegans sequences themselves are shown in Figure 3. Eight clusters, labeled I through VIII in this diagram, were found to be stable under all of the different analysis conditions. The robust nature of these clusters suggests that, within each, the CTLDs are descended from a single progenitor module. The relationships between these clusters were not well defined, and varied depending on the parameters employed in constructing the dendrogram. Therefore, no attempt was made to define the descent of the different clusters from an original CTLD progenitor.
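One simple way to express the notion of a cluster that is stable under all analysis conditions is to keep only those groups of sequences that appear, with identical membership, in every clustering. The sketch below assumes each dendrogram has already been cut into clusters by some other means; the function name and the sequence labels in the usage example are hypothetical.

```python
def stable_clusters(clusterings):
    """Return the clusters (as frozensets of sequence names) that occur
    with identical membership in every clustering.

    `clusterings` is a list with one entry per analysis condition; each
    entry is an iterable of clusters, and each cluster is an iterable of
    sequence names.
    """
    if not clusterings:
        return set()
    as_sets = [{frozenset(cluster) for cluster in clustering}
               for clustering in clusterings]
    return set.intersection(*as_sets)


# Usage with made-up labels: only the {"a", "b"} group survives both runs.
run_1 = [["a", "b"], ["c", "d", "e"]]
run_2 = [["b", "a"], ["c", "d"], ["e"]]
shared = stable_clusters([run_1, run_2])
```

Requiring exact membership is a strict criterion; a more tolerant comparison would need an explicit overlap measure, which is not attempted here.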
Arrangement of disulfide bonds in CTLDs
In CTLDs analyzed to date, six different disulfide bonds have been described. The positions of these bonds are summarized in Figure 4. Chemical evidence for the presence of each of these bonds, except number 4, has been provided in at least one CTLD (Fuhlendorff et al., 1987; Usami et al., 1993). The positions of disulfide bonds designated 1, 2, and 3 have been demonstrated by X-ray crystallography as well (Weis et al., 1991; Nielsen et al., 1997), while homology modeling of CTLDs containing disulfide bonds 5 and 7 shows that they could readily be accommodated into the C-type lectin fold.
Like the CTLDs from other organisms, those from C. elegans each contain a subset of the possible disulfide bonds. As summarized in Table , CTLDs in a given subgroup generally show the same disulfide bonds, although a few domains contain extra unique pairs of cysteine residues that might form disulfides. Thus, the similarity in disulfide bond structure in each subgroup reflects the overall similarity in sequence of the CTLDs as detected in the dendrogram shown in Figure 3.
Single cysteine residues appear in several of these groups at positions 7 and 8 as well as at a unique position 9. The turn between β-strands 3 and 4 is exposed on the surface of the domain, so it is expected that cysteine residues at position 9 would be accessible for formation of disulfide bonds. It is possible that such bonds could form with other cysteine residues within the same polypeptide but outside the CTLDs. However, no likely pairing partner is evident for any of these residues, suggesting that they are more likely to form interchain disulfide bonds. Homo- and hetero-dimer formation through cysteine residues at positions 7 and 8 has been particularly well documented in snake venom proteins containing CTLDs (Atoda et al., 1991; Usami et al., 1993).
Analysis of potential Ca2+-binding sites
The results discussed so far are based on overall sequence characteristics of CTLDs and thus reflect primarily a similarity in the basic fold of these domains in C. elegans compared with those in other organisms. The failure of any of the C. elegans domains to show particularly close sequence similarity to any of the subgroups of the mammalian CTLDs precludes total sequence comparison as a means of making predictions about likely functions of the C. elegans proteins. Overall sequence comparison is a relatively insensitive approach to detecting similarity in function, because many residues in the CTLD are not directly involved in function. Mutations will be readily accepted at such positions. At relatively large evolutionary distances, such as between mammals and C. elegans, a large number of sequence differences will accumulate at such nonessential positions. The high level of noise created by this process obscures the conservation of key residues essential to function.
A more sensitive approach to functional analysis is to look for conservation of amino acid side chains known to be important in CTLDs with known binding properties. Among the CTLDs, the best characterized is the CRD of mannose-binding protein. In this case, the key residues are those that surround one of the two Ca2+-binding sites, Ca2+ site 2, since this portion of the protein forms the carbohydrate-binding site (Weis et al., 1992; Ng et al., 1996). Five acid and amide side chains, which are highlighted in Figure 5, form this site. In E-selectin, the position of this Ca2+ site is conserved and is also an essential part of the ligand-binding site (Graves et al., 1994). However, one of these five amino acid side chains, from the glutamic acid residue located between loop 4 and β-strand 3, is present but takes up a different position and does not form part of the site. It is replaced by a water molecule that is hydrogen bonded to the asparagine residue marked with an asterisk in Figure 5.
Amino acids that form Ca2+ site 1 ligands in mannose-binding protein are less conserved in the CTLDs. No more than two of the four ligands are conserved in any of the sequences shown in Figure 5. As noted above, one of these ligands can also form part of site 2. The canonical Ca2+ site 1 seen originally in rat serum mannose-binding protein is absent from several other mammalian CRDs such as E-selectin (Graves et al., 1994). An alternative Ca2+-binding site has been detected in one of the CRDs of the macrophage mannose receptor (Mullin et al., 1997). Examination of the sequences of all of the CRDs reveals that these alternative Ca2+ ligands are also absent except in the second CTLDs of subgroup D3 proteins, in which acid or amide-containing side chains are present at up to three of the four positions. Thus, while the comparisons suggest that these CTLDs may contain the more generally conserved Ca2+ site 2, the residues necessary to form a second site are not present.
In order to form a functional Ca2+-binding site analogous to site 2 of the vertebrate CRDs, the conserved acid and amide side chains must be presented in the proper context. Several key residues help to establish the appropriate geometry for the binding site in mannose-binding protein. In strand β4, the two liganding residues are preceded by a tryptophan residue. The indole side chain of this residue projects into the hydrophobic core of the protein and packs with a conserved tryptophan residue in loop L3 (Weis et al., 1991). This latter residue is completely conserved in all the CTLDs in Figure 5, but the tryptophan residue in strand β4 is present in only some cases. In the others, it is replaced by a large, aliphatic side chain (leucine, isoleucine, or methionine) which would be expected to make some of the same contacts in the hydrophobic core. An additional key determinant of the positions of the site 2 Ca2+ ligands is a cis proline residue located between two of the ligands, which forms a turn between loops L3 and L4. This proline residue is a completely conserved feature of the CTLDs in Figure 5. The presence of this residue and the conserved tryptophan side chain in the preceding loop L3 makes it quite likely that the flanking side chains would be positioned much as they are in the crystal structures of vertebrate CRDs. These tryptophan and proline residues are each found in fewer than half of the CTLDs not shown in Figure 5.
As noted above, the arrangement of domains in C. elegans proteins that contain CTLDs is generally different from that in the vertebrate proteins containing C-type CRDs, although the subgroup A1 proteins resemble some of the simplest mammalian proteins (group VII) since they are CTLDs that lack accessory domains (Drickamer, 1993). In keeping with the general trends amongst the C. elegans proteins, all of the proteins shown in Figure 5 are predicted to be soluble and secreted. Beyond this common feature, there is little similarity in domain organisation between the different subgroups that contain CTLDs with a potential Ca2+ site 2.
In the case of proteins with two CTLDs, only one domain shows conservation of the motifs characteristic of Ca2+-binding site 2. This finding is consistent with the suggestion that the overall architecture of these proteins was established relatively early in evolution and that the two CTLDs within a polypeptide do not represent a recent duplication. This result also suggests that the two CTLDs may have diverged to bind two different types of ligands. As indicated in Figure 3, the CTLDs with potential Ca2+-binding site 2 fall into two clusters that are widely spaced in the overall dendrogram. This finding might suggest that the progenitor of all CTLDs contained these residues, which have been lost in most of the members of the superfamily. However, it is also possible that the constellation of ligands that may form a Ca2+-binding site analogous to site 2 in mannose-binding protein appeared independently in these two groups of CTLDs in C. elegans.
Potential carbohydrate-binding sites
The presence of potential Ca2+-binding site 2 ligands in several of the CTLDs in Figure 5 suggests that some of these sites might be arranged like the carbohydrate-binding sites in vertebrate CRDs, in which the relative positions of acid or amide side chain Ca2+ ligands are important determinants of carbohydrate-binding selectivity. For example, in mannose-binding protein the first and second Ca2+ site 2 ligands form cooperative hydrogen bonds with one hydroxyl group of a hexose residue while the third and fourth ligands form very similar bonds with an adjacent hydroxyl group (Weis et al., 1992). The sequence Glu-Pro-Asn in the turn between loops L3 and L4, containing the first and second Ca2+ ligands, is associated with binding of saccharides containing mannose and structurally related sugar residues, while the sequence Gln-Pro-Asp is associated with binding of ligands that contain galactose and related sugar residues (Drickamer, 1992). Interestingly, all but two of the CTLDs in Figure 5 conform to one or the other of these patterns.
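The association between the Glu-Pro-Asn and Gln-Pro-Asp tripeptides and sugar preference lends itself to a simple lookup, sketched below in one-letter code (EPN and QPD). This is a deliberately naive illustration: it reports every occurrence of either tripeptide anywhere in a sequence, whereas the analysis described here depends on the motif lying at the turn between loops L3 and L4 in an aligned CTLD. The function name and class labels are assumptions made for the example.

```python
import re

# One-letter tripeptides mapped to the ligand class they are associated with.
MOTIF_TO_SUGAR_CLASS = {
    "EPN": "mannose-type",    # Glu-Pro-Asn
    "QPD": "galactose-type",  # Gln-Pro-Asp
}


def classify_by_tripeptide(sequence: str):
    """Return (position, motif, class) for each EPN or QPD occurrence.

    Positions are 0-based. No check is made that a match sits in the
    L3/L4 turn; that requires an alignment and is outside this sketch.
    """
    return [
        (match.start(), match.group(), MOTIF_TO_SUGAR_CLASS[match.group()])
        for match in re.finditer(r"EPN|QPD", sequence.upper())
    ]
```

A match found this way is at most a starting point; the other Ca2+ site 2 ligands and the flanking hydrophobic residues would still need to be examined before any prediction about binding is made.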
Four members of subgroups A1/2/3, one member of subgroup A8 and all of the second CTLDs in subgroups D1 and D4 contain the Glu-Pro-Asn sequence. Of the CTLDs in subgroups A1/2/3, one contains a large deletion and another lacks several Ca2+ site 2 ligands. The remaining two subgroup A1/2/3 proteins, designated S in Figure 5, have potential Ca2+ site 2 ligands, but they differ in their arrangement compared to known mannose- or galactose-binding CRDs. Whereas the second and third residues are glutamic acid and asparagine in the mammalian CRDs, in these two proteins they are glutamine and aspartic acid. Although this switch is analogous to the change in the first and second ligands between the mannose- and galactose-binding CRDs, these sites would not be directly analogous to any known sugar-binding CRDs.
The CTLD in the subgroup A8 protein and all of the subgroup D4 proteins show perfect conservation of all five Ca2+ site 2 ligands in rat serum mannose-binding protein. The proteins in subgroup D1 differ only in the presence of asparagine instead of aspartic acid as the fifth Ca2+ site 2 ligand in β-strand S4. The structures of CRDs from serum and liver mannose-binding protein suggest that this substitution could be tolerated with little effect on the ligand-binding site, as this side chain provides an axial ligand for Ca2+ and does not interact directly with bound carbohydrate. Thus, all of these proteins share the characteristics expected of CRDs that bind saccharides containing hexose or hexosamine residues related to mannose in the disposition of hydroxyl groups 3 and 4.
Nonpolar residues at two additional positions, marked with + and highlighted in dark blue in Figure 5, also play a role in carbohydrate binding to the mannose-binding CRDs by making packing interactions with the ligand. The presence of valine or isoleucine residues at the second of these positions in β-strand 4 in the subgroup A8, D1, and D4 proteins corresponds to the presence of these two amino acid side chains in serum and liver mannose-binding protein and in CRD-4 of the macrophage mannose receptor (Weis et al., 1992; Ng et al., 1996; Hitchen et al., 1998). The other hydrophobic residue, in loop L4, is more variable in the vertebrate CRDs that bind mannose, as it can be either aromatic or aliphatic. The aliphatic character of this residue in the subgroup D1 and D4 CTLDs is consistent with the formation of a binding site for mannose and related hexose and hexosamine residues in these proteins. Overall, the second CTLDs in subgroup D1 and D4 proteins are very similar to liver mannose-binding protein at Ca2+ site 2.
In contrast to mannose-binding proteins, which bind saccharides containing mannose, N-acetylglucosamine and fucose, the chicken hepatic lectin binds selectively to N-acetylglucosamine. This narrowed selectivity is believed to result from the presence of additional contacts with the 2-substituent, which are mediated in part by two aromatic residues flanking the cysteine residue that follows β-strand 4 (Burrows et al., 1997). These positions, marked with o and highlighted in dark blue in Figure 5, are occupied by aromatic amino acids in one of the group D4 proteins as well as in one of the group A8 proteins. The designation N is used in Figure 5 to indicate the two CTLDs that share these features of the N-acetylglucosamine-binding site, while M is used to denote the remaining domains that are similar to mannose-binding CRDs.
The remaining five CTLDs in Figure 5 from subgroups A1/2/3 as well as the three CTLDs in subgroup D3 contain the sequence Gln-Pro-Asp at the turn between loops L3 and L4. This arrangement is similar to vertebrate CRDs that bind galactose and structurally related sugars (Drickamer, 1992). However, the positions of acid and amide groups are not conserved at some of the other positions associated with Ca2+ site 2. As noted above, the change of the fifth Ca2+ ligand from aspartic acid to asparagine, seen in the subgroup D3 proteins, might be expected to have little effect on carbohydrate-binding activity. Thus, the CTLDs in this subgroup are the most like known galactose-binding domains and are designated G in Figure 5. There is one notable difference, however: galactose bound to vertebrate lectins such as asialoglycoprotein receptors and aggrecan packs against an aromatic residue at the first conserved hydrophobic position in loop L4, marked with + in Figure 5 (Iobst and Drickamer, 1994), and no aromatic residues are present at corresponding positions in the subgroup D3 CTLDs. The remaining CTLDs in subgroups A1/2/3 are marked S since they are not closely similar to specific vertebrate CRDs.
Potential for oligomer formation
An important aspect of carbohydrate recognition by many mammalian C-type lectins is the presentation of multiple sites with weak affinity for simple monosaccharides, which results in enhanced affinity and selectivity for multivalent oligosaccharides. The necessary clustering of CRDs is usually achieved by oligomerization of polypeptides containing single CRDs, which is often brought about by formation of coiled-coils of α-helices in domains adjacent to the CRDs (Beavil et al., 1992; Drickamer, 1999). None of the C. elegans proteins that contain CTLDs appear to have such oligomerization sequences. However, some of the relatively simple group A proteins may oligomerize by direct domain-domain contacts (Drickamer, 1999) or through other mechanisms.
Other potential carbohydrate-binding proteins in C. elegans
In addition to the Ca2+-dependent C-type lectins, there are several other groups of animal lectins that mediate intracellular and extracellular recognition events. A brief screen for members of these other lectin families in C. elegans was undertaken. Examination of the database reveals 13 members of the galectin family of soluble, galactose-binding proteins. A detailed examination of these sequences lies outside the scope of this review. However, it can be noted that some of these proteins contain the conserved residues that confer lactose- and N-acetyllactosamine-binding activity on the known mammalian proteins (Barondes et al., 1994). Several of these proteins have previously been isolated and shown to bind carbohydrates like their mammalian homologues. Like the mammalian proteins, some are single carbohydrate-binding domains and others are pairs of such domains. In general, it appears that it is possible to identify some of these proteins as true orthologues of their mammalian counterparts. However, there is also a subset of proteins that are clearly homologous but lack key carbohydrate-binding residues, suggesting that proteins in this structural group have also diverged to serve multiple functions.
Homologues for two types of lectins involved in sorting events within luminal compartments of cells have previously been identified. These include a calnexin precursor (Trombetta and Helenius, 1998) and two proteins homologous to the L-type lectins, ERGIC-53 and VIP-36 (Fiedler and Simons, 1994; Drickamer, 1995). Since these latter proteins are distantly related to the legume lectins, it is clear that the legume lectins represent a very old lineage of carbohydrate-binding proteins. The wide distribution of these sorting proteins suggests that they have played an important role in eukaryotic cells over a long evolutionary period. In contrast, searches with motifs and individual examples of mannose 6-phosphate receptors failed to identify homologues in C. elegans. The absence of mannose 6-phosphate receptors (P-type lectins) is consistent with the fact that, although various forms of mannose 6-phosphate are present on invertebrate proteins such as those from Dictyostelium, the receptor has not been identified (Mehta et al., 1996).
Discussion
As noted in Figure 2, all of the CTLDs that resemble CRDs are found in proteins that are likely to be secreted from cells, suggesting that they may function by recognition of extracellular glycoconjugates. Our present understanding of the structures of glycoproteins in C. elegans is limited, but analysis of N-acetylglucosaminyl and fucosyl transferases suggests that complex as well as high mannose N-linked oligosaccharides are likely to be present (DeBose-Boyd et al., 1998; Chen et al., 1999). In addition, there is evidence for expression of multiple N-acetylgalactosaminyl transferases capable of initiating O-linked glycan synthesis (Hagen and Nehrke, 1998). Although it is possible that some of the potential CRDs identified in this work might bind to endogenous glycoproteins, the fact that none of these proteins have membrane anchors suggests that they do not function in the endocytic and cell adhesion functions mediated by many of the C-type vertebrate lectins. By analogy to the soluble mannose-binding proteins of vertebrates, possible roles in the innate immune response based on recognition of exogenous carbohydrates must also be considered. Such functions might be particularly important in the absence of an adaptive immune system.
Further structural studies of CTLDs are likely to reveal some of the mechanisms through which these domains interact with noncarbohydrate ligands. For example, the recently determined structure of a CTLD from CD94, a natural killer cell receptor, provides insights into how this family of proteins interacts with histocompatibility antigens in a carbohydrate- and Ca2+-independent manner (Boyington et al., 1999). The CD94 structure also demonstrates that absence of the key Ca2+ site 2 ligands is indeed correlated with the absence of this Ca2+-binding site, which forms an essential part of the Ca2+-dependent carbohydrate-binding site. As more such structures emerge, it will become possible to search for residues that would predict alternative types of ligand-binding activity.
In the absence of Ca2+ site 2, CTLDs might bind carbohydrates through mechanisms other than those that have been demonstrated to date. However, there is no experimental evidence that CTLDs lacking this Ca2+ site bind carbohydrate in a biologically significant manner. This observation, combined with the increasing number of examples of CTLDs lacking Ca2+ site 2 that bind protein ligands in a carbohydrate-independent manner, suggests that such CTLDs are no more likely than any other type of protein module to bind saccharide ligands. Evidence presented here indicating that only a small subset of CTLDs in C. elegans are likely to display Ca2+-dependent carbohydrate-binding activity suggests that any effort to demonstrate such activity should be focused on this subset. In addition, sound predictions about carbohydrate-binding activity should underpin speculation about the roles of carbohydrate in processes mediated by proteins containing CTLDs.
Materials and methods
Identification of CTLDs and other protein domains
Domain organization was verified using the ProfileScan server at the Swiss Institute for Experimental Cancer Research (http://www.isrec.isb-sib.ch/software/PFSCAN_form.html) to compare each sequence against the profiles available in the SwissProt (Swiss Institute for Experimental Cancer Research) and PfamA (Sanger Centre) databases. Sequence comparisons for unknown domains were made using the FastA algorithm to search the entire SwissProt database through the European Bioinformatics Institute server (http://www2.ebi.ac.uk/fasta3/).
Sequence comparisons
Pairwise sequence comparisons, dendrogram construction and multiple sequence alignments were performed using the AlignX software package in the BioSuite package from InforMax (North Bethesda, MD) running on a personal computer equipped with a 266 MHz Pentium II microprocessor. Scoring of alignments was based on the Blosum and PAM250 matrices. Gap opening penalties were varied between 5 and 20 and gap extension penalties were varied between 0.1 and 0.5. Dendrograms were constructed using the Clustal W algorithm implemented in the same package. Mammalian CRD sequences used for comparison included those from human aggrecan (group I), chicken hepatic lectin (group II), the major subunit of the rat hepatic asialoglycoprotein receptor or rat hepatic lectin 1 (group II), rat serum mannose-binding protein (group III), human E-selectin (group IV), and CRDs 4 and 6 of the human macrophage mannose receptor (group VI). Mammalian and non-mammalian CTLDs tested against the C. elegans proteins included several natural killer cell receptors (Weis et al., 1998), rattlesnake (Hirabayashi et al., 1991), barnacle (Muramoto and Kamiya, 1990), tunicate (Suzuki et al., 1990) and flesh fly lectins (Takahashi et al., 1985), fish antifreeze proteins (Davies and Sykes, 1997), and phospholipase inhibitors (Inoue et al., 1991) and coagulation factor binding proteins (Fuhlendorff et al., 1987; Atoda et al., 1991).
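Varying the scoring matrix and gap penalties amounts to sweeping a small parameter grid and repeating the analysis for each combination. The sketch below enumerates such a grid. Only the two matrix families and the penalty ranges come from the text above; the particular sampled values, the function name and the dictionary keys are assumptions for illustration, and the call that would actually build each dendrogram is left out.

```python
from itertools import product


def parameter_grid(matrices, gap_open_values, gap_extend_values):
    """Return one settings dictionary per combination of scoring matrix,
    gap opening penalty and gap extension penalty."""
    return [
        {"matrix": matrix, "gap_open": gap_open, "gap_extend": gap_extend}
        for matrix, gap_open, gap_extend
        in product(matrices, gap_open_values, gap_extend_values)
    ]


# Hypothetical sampling of the stated ranges (5 to 20 and 0.1 to 0.5).
conditions = parameter_grid(
    matrices=("Blosum", "PAM250"),
    gap_open_values=(5, 10, 20),
    gap_extend_values=(0.1, 0.5),
)
```

Each entry of `conditions` would then be passed to whatever alignment and tree-building tool is in use, and the resulting clusters compared across runs.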
References
2 Barondes,S.H.,Cooper,D.N.W., Gitt,M.A. and Leffler,H. (1994) Galectins: structureand function of a large family of animal lectins. J. Biol.Chem., 269, 2080720810.
3 Beavil,A.J.,Edmeades,R.L., Gould,H.J. and Sutton,B.J. (1992) -Helical coiled-coil stalks in thelow-affinity receptor for IgE (Fc
RII/CD23)and related C-type lectins. Proc. Natl. Acad. Sci. USA, 89, 753757.[Abstract]
4 Bentley,D.R. (1988)Structural superfamilies of the complement system. Exp. Clin.Immunogenet., 5, 6980.
5 Bockaert,J. andPin,J.P. (1999) Molecular tinkering of G protein-coupled receptors:an evolutionary success. EMBO J., 18, 17231729.
6 Bork,P. andBeckmann,G. (1993) The CUB domain: a widespread modulein developmentally regulated proteins. J. Mol. Biol., 231, 539545.[ISI][Medline]
7 Boyington,J.C.,Riaz,A.N., Patamawenu,A., Coligan,J.E., Brooks,A.G. and Sun,P.D.(1999) Structure of CD94 reveals a novel C-type lectinfold: implications for the NK cell-associated CD94/NKG2receptors. Immunity, 10, 7582.[ISI][Medline]
8 Burrows,L.,Iobst,S.T. and Drickamer,K. (1997) Selective bindingof N-acetylglucosamine to the chicken hepatic lectin. Biochem.J., 324, 673680.[ISI][Medline]
9 Chen,S., Zhou,S., Sakar,M., Spence,A.M. and Schachter,H. (1999) Expression of three Caenorhabditis elegans N-acetylglucosaminyltransferase I genes during development. J. Biol. Chem., 274, 288–297.
10 C.elegans Sequencing Consortium (1998) Genome sequence of the nematode C.elegans: a platform for investigating biology. Science, 282, 2012–2018.
11 Davies,P.L. and Sykes,B.D. (1997) Antifreeze proteins. Curr. Opin. Struct. Biol., 7, 828–834.
12 DeBose-Boyd,R.A., Nyame,A.K. and Cummings,R.D. (1998) Molecular cloning and characterization of an α1,3 fucosyltransferase, CEFT-1, from Caenorhabditis elegans. Glycobiology, 9, 905–917.
13 Drickamer,K. (1988) Two distinct classes of carbohydrate-recognition domains in animal lectins. J. Biol. Chem., 263, 9557–9560.
14 Drickamer,K. (1992) Engineering galactose-binding activity into a C-type mannose-binding protein. Nature, 360, 183–186.
15 Drickamer,K. (1993) Ca2+-dependent carbohydrate-recognition domains in animal proteins. Curr. Opin. Struct. Biol., 3, 393–400.
16 Drickamer,K. (1995) Increasing diversity of animal lectin structures. Curr. Opin. Struct. Biol., 5, 612–616.
17 Drickamer,K. (1999) C-type lectin-like domains. Curr. Opin. Struct. Biol., 9, in press.
18 Elhammer,A.P., Poorman,R.A., Brown,E., Maggiora,L.L., Hoogerheide,J.G. and Kezdy,F.J. (1993) The specificity of UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase as inferred from a database of in vivo substrates and the in vitro glycosylation of proteins and peptides. J. Biol. Chem., 268, 10029–10038.
19 Fiedler,K. and Simons,K. (1994) A putative novel class of animal lectins in the secretory pathway homologous to leguminous lectins. Cell, 77, 625–626.
20 Fuhlendorff,J., Clemmensen,I. and Magnusson,S. (1987) Primary structure of tetranectin, a plasminogen kringle 4 binding plasma protein: homology with asialoglycoprotein receptor and cartilage proteoglycan core protein. Biochemistry, 26, 6757–6764.
21 Graves,B.J., Crowther,R.L., Chandran,C., Rumberger,J.M., Li,S., Huang,K.-S., Presky,D.H., Familletti,P.C., Wolitzky,B.A. and Burns,D.K. (1994) Insight into E-selectin/ligand interaction from the crystal structure and mutagenesis of the lec/EGF domains. Nature, 367, 532–538.
22 Hagen,F.K. and Nehrke,K. (1998) cDNA cloning and expression of a family of UDP-N-acetylgalactosamine:polypeptide N-acetylgalactosaminyltransferase sequence homologs from Caenorhabditis elegans. J. Biol. Chem., 273, 8268–8277.
23 Hirabayashi,J., Kusunoki,T. and Kasai,K. (1991) Complete primary structure of a galactose-specific lectin from the venom of the rattlesnake Crotalus atrox: homologies with Ca2+-dependent-type lectins. J. Biol. Chem., 266, 2320–2326.
24 Hitchen,P.G., Mullin,N.P. and Taylor,M.E. (1998) Orientation of sugars bound to the principal C-type carbohydrate-recognition domain of the macrophage mannose receptor. Biochem. J., 333, 601–608.
25 Hohenester,E., Sasaki,T., Olsen,B.R. and Timpl,R. (1998) Crystal structure of the angiogenesis inhibitor endostatin at 1.5 Å resolution. EMBO J., 17, 1656–1664.
26 Inoue,S., Kogaki,H., Ikeda,K., Samejima,Y. and Omori-Satoh,T. (1991) Amino acid sequences of the two subunits of a phospholipase A2 inhibitor from the blood plasma of Trimeresurus flavoviridis: sequence homologies with pulmonary surfactant apoprotein and animal lectins. J. Biol. Chem., 266, 1001–1007.
27 Iobst,S.T. and Drickamer,K. (1994) Binding of sugar ligands to Ca2+-dependent animal lectins. II. Generation of high affinity galactose binding by site-directed mutagenesis. J. Biol. Chem., 269, 15512–15519.
28 Kohda,D., Morton,C.J., Parkar,A.A., Hatanaka,H., Inagaki,F.M., Campbell,I.D. and Day,A.J. (1996) Solution structure of the link module: a hyaluronan-binding domain involved in extracellular matrix stability and cell migration. Cell, 86, 767–775.
29 Mehta,D.P., Ichikawa,M., Salimath,P.V., Etchison,J.R., Haak,R., Manzi,A. and Freeze,H.H. (1996) A lysosomal cysteine proteinase from Dictyostelium discoideum contains N-acetylglucosamine-1-phosphate bound to serine but not mannose-6-phosphate on N-linked oligosaccharides. J. Biol. Chem., 271, 10897–10903.
30 Mewes,H.W., Albermann,K., Bahr,M., Frishman,D., Gleissner,A., Hani,J., Heumann,K., Kleine,K., Maierl,A., Oliver,S.G., Pfeiffer,F. and Zollner,A. (1997) Overview of the yeast genome. Nature, 387, S7–S8.
31 Mizuno,H., Fujimoto,Z., Koizumi,M., Kano,H., Atoda,H. and Morita,T. (1997) Structure of coagulation factors IX/X-binding protein, a heterodimer of C-type lectin domains. Nature Struct. Biol., 4, 438–441.
32 Mullin,N.P., Hitchen,P.G. and Taylor,M.E. (1997) Mechanism of Ca2+ and monosaccharide binding to a C-type carbohydrate recognition domain of the macrophage mannose receptor. J. Biol. Chem., 272, 5668–5681.
33 Muramoto,K. and Kamiya,H. (1990) The amino-acid sequence of multiple lectins from the acorn barnacle Megabalanus rosa and its homology with animal lectins. Biochim. Biophys. Acta, 1039, 42–51.
34 Ng,K.K.-S., Drickamer,K. and Weis,W.I. (1996) Structural analysis of monosaccharide recognition by rat liver mannose-binding protein. J. Biol. Chem., 271, 663–674.
35 Nielsen,B.B., Kastrup,J.S., H.,R., Holtet,T.L., Graversen,J.H., Etzerodt,M., Thogersen,H.C. and Larsen,I.K. (1997) Crystal structure of tetranectin, a trimeric plasminogen-binding protein with an alpha-helical coiled coil. FEBS Lett., 412, 388–396.
36 Suzuki,T., Takagi,T., Furukohri,T., Kawamure,K. and Nakauchi,M. (1990) A calcium-dependent galactose-binding lectin from the tunicate Polyandrocarpa misakiensis: isolation, characterization and amino acid sequence. J. Biol. Chem., 265, 1274–1281.
37 Takahashi,H., Komano,H., Kawagushi,N., Kitamura,N., Nakanishi,S. and Natori,S. (1985) Cloning and sequencing of cDNA of Sarcophaga peregrina humoral lectin induced on injury of the body wall. J. Biol. Chem., 260, 12228–12233.
38 Trombetta,E.S. and Helenius,A. (1998) Lectins as chaperones in glycoprotein folding. Curr. Opin. Struct. Biol., 8, 587–592.
39 Usami,Y., Fujimura,Y., Suzuki,M., Ozeki,Y., Nishio,K., Fukui,H. and Titani,K. (1993) Primary structure of two-chain botrocetin, a von Willebrand factor modulator purified from the venom of Bothrops jararaca. Proc. Natl. Acad. Sci. USA, 90, 928–932.
40 Weis,W.I., Kahn,R., Fourme,R., Drickamer,K. and Hendrickson,W.A. (1991) Structure of the calcium-dependent lectin domain from a rat mannose-binding protein determined by MAD phasing. Science, 254, 1608–1615.
41 Weis,W.I., Drickamer,K. and Hendrickson,W.A. (1992) Structure of a C-type mannose-binding protein complexed with an oligosaccharide. Nature, 360, 127–134.
42 Weis,W.I., Taylor,M.E. and Drickamer,K. (1998) The C-type lectin superfamily in the immune system. Immunol. Rev., 163, 19–34.