Correspondence to: Richard O. Hynes, E17-227, Center for Cancer Research, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139. Tel: (617) 253-6422. Fax: (617) 253-8357. E-mail: rohynes{at}mit.edu.
The Essence of Being Metazoan
Multicellular organisms clearly require mechanisms for intercellular communication and, perhaps even more basically, for intercellular cohesion. The most primitive sponges and coelenterates depend on cell adhesion for their organismal organization; so do insects, nematodes and vertebrates. What molecules and mechanisms are common among these different phyla and which ones differ and why?
The availability of the euchromatic genomic sequences of Drosophila melanogaster (
We review here the results of our analyses and discuss some of the implications. Overall, we identified ~500 Drosophila genes that are candidates for involvement in cell adhesion (~4% of the genome). The molecules mediating cell-cell and cell-matrix adhesion exemplify both extreme conservation among diverse organisms and considerable diversification in different phyla, presumably to meet different biological needs.
Cell-Cell Adhesion
Many of the major classes of cell-cell adhesion molecules were already known to be shared among vertebrates and invertebrates and the genome sequences confirm this picture in great detail. However, they also reveal some interesting differences.
The two major classical groups of cell-cell adhesion receptors, cadherins and immunoglobulin superfamily (Ig-SF)1 proteins, are both well represented in Drosophila. We found 17 convincing cadherin homologues, five of which were previously known (Fig 1; Table S1, all supplemental tables S1-S8 are available at http://www.jcb.org/cgi/content/full/150/2/F89/DC1). As the cadherin superfamily has grown, the nomenclature has become somewhat confused. In this article we will define "classical cadherins" by their cytoplasmic domain homology and restrict use of protocadherin to refer to homologues of the clustered "protocadherins" (or CNRs) of vertebrates (
Thus, Drosophila and Caenorhabditis have similar numbers and spectrum of cadherin homologues (17 and 13, respectively), but vertebrates have many more. Clearly this family of Ca++-dependent cell-cell adhesion molecules arose early in metazoan evolution and evolved early into several distinct variant subtypes (classical, fat-like, and flamingo-like) that are conserved to this day. Additional subtypes (protocadherins, desmocollins, desmogleins) arose later in chordates but not in the two sequenced invertebrates (see also below).
The Ig-SF of adhesion receptors is larger than the cadherin superfamily in all three phyla. Drosophila has ~150 genes containing Ig domains (more than Caenorhabditis, which has ~70). They can be sorted roughly into several groups (Tables S2-S4). There are around 50 Ig-SF genes encoding 1-2 Ig domains, most without obvious transmembrane (TM) domains (Table S2). They could be involved in cell adhesion or, as secreted proteins, may participate in intercellular communication or in binding to pathogens. A second group has three or more (up to nine) Ig domains but no other recognizable domains. Many of these, but not all, have predicted TM domains and are likely involved in cell adhesion (Table S3). A third group contains one or more Ig domains in tandem with other domains: EGF, TSP-1, LRR, collagen, sushi, and, most frequently, Fn3 domains (Table S4, A and B). These are likely to be (or known to be) involved in cell adhesion or as receptors for ligands such as netrins (e.g., CT20824, unc5-like) and homologues with similar structures are known in nematodes and vertebrates. Finally, several Ig-alone proteins have associated protein tyrosine kinase domains and are all presumably signaling proteins (Table S4 A).
In addition to their presence in Ig/Fn3 adhesion receptors (Table S4 B), Fn3 domains also exist in around a dozen other genes (Table S5). Some are clearly signaling receptors with tyrosine kinase or tyrosine phosphatase domains. Similar Ig and/or Fn3 kinase and phosphatase receptors exist in C. elegans and in vertebrates. The other Drosophila Fn3 proteins (Table S5) are presumably adhesion receptors or ECM molecules. Interestingly, Fn3 repeats do not appear in tandem arrays with EGF or other disulfide-bonded domains as is common in vertebrate ECM molecules (see more below). Also absent from Drosophila are the extremely repeated Fn3 (myotactin) or Ig (hemicentin) domain proteins found in C. elegans. (Note: Ig and Fn3 domains also exist in all three phyla in large intracellular muscle proteins such as titin, twitchin and projectin, presumably not involved in cell adhesion and beyond the scope of this article).
Thus, Drosophila, in common with nematodes and vertebrates, makes extensive use of Ig and/or Fn3 domains for cell surface adhesion and/or signaling receptors. Like cadherins, these cell interaction molecules must have arisen early in metazoan evolution and diverged to perform differing functions before separation of the arthropod, nematode, and deuterostome lineages, and have been conserved since.
Other proteins likely to be involved in cell adhesion and shared among the three phyla include many EGF family proteins; >100 in each of D. melanogaster (Table S6) and C. elegans, leucine-rich repeat (LRR) proteins (~50 in C. elegans and twice as many in D. melanogaster; see Table S7) and C-type lectins of which the worm has many more (165) than the fly (37) (see http://www.ebi.ac.uk/proteome/ and http://www.mpimf-heidelberg.mpg.de/ewgdn/ for lists). There are 32 TM4-superfamily (tetraspanin) proteins in D. melanogaster, 11 of them linked in a cluster on chromosome 2R (see http://www.ebi.ac.uk/proteome/), and 7 ADAMs (disintegrin-metalloproteinase) family members (see Tables S6 and S7). Again, all of these families must have evolved early, before divergence of the three lineages. It appears that once cells evolved some basic mechanisms for sticking together sensibly they did not let go, either of each other or of those adhesion receptors.
Cell-Matrix Adhesion
The second arm of cell adhesion, attachment to basement membranes, appears equally ancient and also exquisitely conserved. Coelenterates have basement membranes, as do all more complex animals. The basic constituents of basement membranes are type IV collagen, laminin, nidogen/entactin and proteoglycans of the perlecan type; these molecules are all highly conserved (Fig 2). Drosophila has a laminin comprising three subunits (α, β, γ; formerly A, B1, B2). These have been known for some time and are clearly related to laminins in vertebrates (of which there are many) and in C. elegans (which has 2 α's, 1 β and 1 γ). D. melanogaster, like C. elegans, has a second α subunit. Similarly, both the fly and the worm have a single pair of type IV collagen genes. In vertebrates, which have three pairs, it is notable that each pair is organized in an antiparallel head-to-head arrangement with a common promoter region between (
Two other common constituents of vertebrate basement membranes are the proteoglycan perlecan and the glycoprotein nidogen/entactin. Both have homologues in C. elegans and D. melanogaster. The perlecans of worms, flies and vertebrates all comprise tandem arrays of LDLR-A, LM-EGF, Ig and LM-G domains (see Fig 2;
Therefore, it seems clear that these four complex proteins (type IV collagen, laminin, nidogen/entactin, and perlecan) formed the basis of an early basement membrane that has been preserved in molecular detail ever since.
Several other ECM proteins are well conserved among the three phyla, including collagen XV/XVIII (CT14872, see Fig 2), SPARC/osteonectin (CT19876), netrins (CT27014 and CT29512, see Table S6), and the anosmin/Kallmann syndrome protein (CT19368, see Table S4). All these were first identified in vertebrates and have good homologues in C. elegans as well as Drosophila. Netrins are well established neural guidance molecules as are slit (CT21700 and CT37068, see Table S6) and semaphorins (see Table S7), which also occur in all three phyla. The functions of collagen XV/XVIII, SPARC/osteonectin, and anosmin/Kallmann syndrome protein are less clear but, given their strong evolutionary conservation, are likely to be fundamental and well worthy of further study. As we will discuss below, many other ECM molecules show much less conservation.
Extracellular matrix is clearly important but equally significant are the receptors by which cells attach to ECM; these too are well conserved. The major ECM receptors are integrins, αβ heterodimeric receptors linking ECM to the cytoskeleton (
, CT40473 and CT5192, both previously known) and five α subunits of which three (αPS1-3) were known (
subunits are most closely related to αPS3 and one (which we call αPS4) is closely linked to αPS3 (chromosome 2R, 51E-F). αPS5 is also on 2R although not so closely linked (59E) but it is also similar in structure. Given this homology, it is likely that all five α subunits complex with βPS to form five PS integrins (already known for αPS1-3); βν so far has no known α partner. It is clear that αPS1βPS and αPS2βPS are, respectively, receptors for laminin and RGD-containing ECM proteins (
subunits is most homologous with a set of functionally related vertebrate α subunits (laminin-specific: α3, α6, α7, αPS1; or RGD-specific: α5, α8, αv, αIIb, αPS2; see Fig 2). It is notable that C. elegans also has an orthologue of each of these subfamilies (
It seems evident that some early metazoan evolved two integrins, one laminin-specific and one recognizing RGD or something like it, and these two families have been preserved ever since. Since αPS1βPS and αPS2βPS are frequently expressed by apposed tissues separated by extracellular matrix (
αPS3, αPS4, and αPS5) not closely homologous with orthologues in other phyla. A similar phenomenon has been noted before for echinoderm integrin β subunits (
αPS3 affect short-term memory in flies (
subunits known to date). Around half of the vertebrate α subunits include an extra inserted I domain (homologous with von Willebrand A domains). I domains are found in many integrins that bind to collagens and in leukocyte integrins but no I domains occur in fly or worm integrin α subunits. Indeed, we could not detect vWF-A domain homologues in Drosophila adhesion molecules except, notably, for βν. Perhaps βν functions alone or as a homodimer. We will return later to the issue of differential evolution of adhesion molecules.
Cytoskeletal Connections
A key feature of cell adhesion is the linkage of cell adhesion receptors to the cytoskeleton. This affects not only the intracellular consequences of cell adhesion (cell shape and polarity, cytoplasmic organization and cell motility) but also intracellular signal transduction and even the efficacy of the adhesive interactions at the extracellular surface. The cytoskeletal connections of cadherins and integrins have been extensively studied in vertebrates and appear to be conserved in many details in Drosophila, although some key features appear to be absent.
Classical vertebrate cadherins link via β-catenin to α-catenin and thence to the actin cytoskeleton. The fly homologue of β-catenin is armadillo (CG11579; three alternatively spliced forms). Two β-catenin-like molecules are known in vertebrates (β-catenin and plakoglobin or γ-catenin). Drosophila also has a homologue of vertebrate α-catenins (CT39986). Thus, this cytoskeletal connection is well conserved.
In contrast with this conservation of classical catenin-binding via cadherins, other classes of cadherin known in vertebrates are missing. Desmocollins and desmogleins are cadherin homologues found in vertebrate desmosomes. They have characteristic cytoplasmic domains that link via desmoplakins to intermediate filaments. Since Drosophila lacks intermediate filaments (
The more typical integrin-actin microfilament connection is well conserved in Drosophila, which has single copy genes for the cytoskeleton linker/adapter proteins of integrins (talin, α-actinin, vinculin, paxillin, tensin) as well as the integrin-linked signal transduction molecules FAK, ILK, p95PKL and p130CAS (Fig 4). Many of these proteins occur in multiple copies in vertebrates. Their occurrence as single genes in Drosophila (and C. elegans?) will facilitate genetic and other analyses of their functions in this evidently ancient ECM-integrin-cytoskeleton connection.
Another well analyzed transmembrane ECM-cytoskeleton linkage is the laminin-dystroglycan/sarcoglycan-dystrophin linkage. There are single Drosophila homologues of dystroglycan (CT41273) and /
sarcoglycan (CT34621), two transmembrane proteins linking laminin to dystrophin in vertebrates. As mentioned above, laminin exists in Drosophila, as does dystrophin together with dystrobrevin and syntrophins (
Variations on Basic Themes
In contrast with the high degree of evolutionary conservation in cell-cell and cell-matrix adhesion discussed above, other aspects of cell adhesion and, in particular, extracellular matrix proteins show considerable variation among flies, worms and vertebrates. We have already mentioned the abundance of C-type lectins in nematodes as compared with fruit flies (other lectins are also very numerous in C. elegans). Why is this? Drosophila has made much more use of Ig and LRR domains than has C. elegans. Again, why?
Both species have elaborated large, complex, extracellular matrix molecules. It is far from clear what the advantage of very large ECM molecules might be. One can clearly make stable polymers from small proteins (intermediate filaments, bacterial flagella) so structural arguments are not in general terms compelling. Even for some of the best understood ECM proteins, we can only assign functions to a small fraction of the repeated domains. Yet the others are equally well conserved. What are they doing? Clearly, many of the domains that have been used to elaborate matrix and other adhesive proteins are good at binding other proteins; that is what many of the well defined domains do. Presumably the others do something similar. They may bind other ECM proteins. They may engage multiple cell surface receptors to trigger complex intracellular responses, and, as discussed earlier, there are certainly a large number of likely adhesion receptors with unassigned functions. Another possibility is that ECM proteins act as docking sites for diffusible factors. That is known to happen and may be more prevalent than we know. Classical mathematical models that attempt to explain morphogenetic gradients typically invoke both freely diffusible and more stably anchored gradients of morphogens. Binding to ECM proteins could well be one way to establish the more stable, slowly changing gradients. Perhaps that is what explains the exuberant elaboration of domains in many ECM proteins. In the absence of a clear explanation for the multiplicity of domains in any one matrix protein, it is even harder to understand why C. elegans should proliferate Fn3 domains in myotactin or Ig domains in hemicentin, while D. melanogaster concatenates von Willebrand D, trypsin inhibitor and other domains in hemolectin (CT21553) and why both species have proteins with multiply repeated EGF and CUB domains (Table S6).
It is a little easier to offer rationalizations for the extreme expansion of the set of collagen genes in C. elegans. In contrast with the rather limited set of collagens found in Drosophila, C. elegans has around 170 collagen genes, many of them encoding cuticular collagens. The collagenous cuticle provides an exoskeleton for C. elegans. Vertebrates have also used a wide variety of collagens, particularly extended fibrillar collagens to construct endoskeletons (cartilage, bones) and the connections to them (tendons) as well as the interstitial connective tissue that provides structural strength to vertebrate tissues. Neither flies nor nematodes appear to have elaborated such fibrillar collagens. Indeed, apart from the basement membrane collagens mentioned earlier, Drosophila has only a few genes encoding short collagen segments.
Has Drosophila evolved ECM proteins specialized, for example, to attach muscle cells to the chitin exoskeleton? A number of Drosophila ECM proteins are concentrated at muscle attachment sites. One of these, tiggrin, is composed of 16 repeats of 75 ± 2 amino acids (CT36389;
What's Missing and Why?
We have already mentioned the striking absence of fibrillar collagens and of intermediate filaments in Drosophila. It must be noted that ablation of intermediate filaments from vertebrate cells frequently has remarkably subtle cellular effects; the defects lie rather at the tissue structural level (e.g., skin blisters). Perhaps flies do not need the mechanical strength provided by IF and fibrillar collagens because of the existence of a chitinous exoskeleton.
Many well known vertebrate ECM proteins appear to be missing from the Drosophila genome. These include fibronectin, vitronectin, elastin, fibulins, osteopontin, von Willebrand factor, thrombospondins, tenascins, and fibrinogen. In many cases some of the characteristic domains are present, such as the Fn3 domains characteristic of fibronectin and tenascin, and vWF-C and D domains. However, in each of these examples, other domains are missing. We were unable to detect any FN type I and II domains in the Drosophila or C. elegans genomic sequences and only a few vWF-A domains, none associated with other vWF domains. Although both EGF and Fn3 domains are prevalent in Drosophila, we could not detect any genes that contained these two domains together (as in tenascins) although that had been claimed for ten-m (
Many of these vertebrate ECM proteins are not essential for life (
It seems that an entire set of genes necessary to construct blood vessels and contain the pressure of circulating blood was elaborated during the evolution of vertebrates. This entailed novel assemblages of ancient domains (Fn3, EGF, vWA, vWD, TSP-1) and the development of new domains (e.g., Fn1, Fn2, TB) and novel proteins (e.g., elastin). Several of the other missing proteins (vitronectin, fibrinogen, von Willebrand factor) are prevalent in, or unique to, blood. Further pursuit of this idea led us to search for VEGFs and angiopoietins. We found no convincing matches for VEGF and, although the fibrinogen C domain proteins somewhat resemble angiopoietins, the homology is limited to the FB-C domain. Furthermore, we have found no tyrosine kinase receptors with Ig, EGF and Fn3 domains like the tie2 receptor for angiopoietins. It appears that these crucial genes involved in vascular development in vertebrates are absent from Drosophila. There are parallels between vascular and insect tracheal development (
Returning to the earlier discussion of additional vertebrate integrins and the paucity of integrin I and vWF-A domains in Drosophila and C. elegans, it is worth noting that several A domains in vWF bind collagen and that I domains bind collagen in several vertebrate integrins. Perhaps the elaboration of A/I domains in vertebrates accompanied the proliferation of collagen genes. Other I domain integrins are selectively expressed on white blood cells. Those integrins are absent from the two invertebrates, as are selectins, another class of adhesion receptors involved in adhesion of white blood cells. Selectins rely on a C-type lectin plus an EGF domain and, while C-type lectins are present in both C. elegans and Drosophila, we found no CL/EGF pairs in the fruit fly. Genes labeled selectin-like in the worm and fly lack EGF domains, although some fly genes do contain C-type lectins together with sushi domains, also found in selectins. Thus, two major classes of receptors key in adhesion functions of blood cells appear to be vertebrate inventions.
It appears to us that a large number of genes involved in the development, maintenance and function of the vasculature in vertebrates evolved only in the chordate lineage (Table 1). This is not a great surprise but it points to the value of further genomic and genetic analyses of vertebrate systems.
Another organ system that is much more elaborate in vertebrates is the nervous system and it appears that, there too, vertebrates have elaborated genes encoding adhesion molecules that are not found in Drosophila or C. elegans. Vertebrates have many more cadherins than do flies and worms, and many of them are expressed in the brain (
In conclusion, although the analysis that has been possible to date represents only the beginning in extracting insights from comparative analyses of the genomes of flies, worms and vertebrates, some clear messages are apparent. They are not particularly surprising in outline but the details are stimulating and informative. The fact that one is able to look at essentially the entire blueprint for the organism adds strength to the hypotheses that can be formulated, and the sequences open the route to testing those hypotheses.
Examination of the set of genes encoding adhesion proteins reveals both the great conservation of some basic processes and the elaboration of new genes and processes during evolution. The detailed molecular conservation of basic cell-cell and cell-matrix adhesions and of basement membrane structure is remarkable and confirms yet again the evolution from a common ancestor of arthropods, nematodes and mammals.
For metazoans to evolve from single cells they had to invent cell adhesion. This apparently involved evolution of new protein domains. Ig, EGF, TSP-1, LDLR-A, C-type lectins, cadherins, and collagen triple-helix domains are all absent from yeast, as are laminins, tyrosine kinases, integrins, band 4.1 proteins and many others involved in cell-cell interactions. But, once these domains and genes evolved, they have been used over and over again. The complex proteins elaborated early in metazoan evolution to assemble basement membranes and attach cells to them and to one another have been conserved in great detail ever since. Many adhesive routines appear to be the same in flies, worms and people, although often duplicated and replicated in vertebrates. Such processes can be very effectively analyzed in invertebrates. However, it is equally clear from browsing the genomes and proteomes that vertebrates have evolved some new tricks not found in flies and worms. In many cases new proteins have been assembled from new arrangements of old domains. However, new domains and entirely new proteins have also evolved that have no close counterparts in invertebrates. This is particularly evident in vascular biology and in some aspects of neurobiology and is likely to be true for some other uniquely vertebrate functions such as neural crest migration. It will be fascinating to be able to look at the entire set of human genes in the near future and ask what additional new adhesive tricks have been elaborated during our evolution from the common ancestor of protostomes and deuterostomes.
The genomic analysis of Drosophila reinforces the conclusion that it and C. elegans are wonderful models for some aspects of vertebrate life but it also shows that mice and zebrafish and the human genomic sequence will offer insights that we cannot hope to gain from invertebrates. That is particularly so for the multicellular processes in which cell adhesion plays an important part.
Footnotes
The online version of this article contains supplemental material.
1 Abbreviations used in this paper: ADAM, disintegrin metalloproteinase; CNR, cadherin-related neural receptor; CT, COOH-terminal; ECM, extracellular matrix; FB-C, fibrinogen COOH-terminal domain; Fn3, fibronectin type III repeat; Ig-SF, immunoglobulin superfamily; LDLR, LDL receptor; LM, laminin; LM-G, laminin G domain; LRR, leucine-rich repeat; PI, phosphatidylinositol-linked; PS, position-specific; RGD, arginine-glycine-aspartate; TB, TGFβ-binding domain; TM, transmembrane; TM4, tetraspanin; TSP-1, thrombospondin type 1 domain; vWA, von Willebrand A domain; vWD, von Willebrand D domain.
Acknowledgements
We thank Rolf Apweiler, Larry Goldstein, Tom Maniatis, and Masatoshi Takeichi for helpful inputs during the analysis, Denisa Wagner for critical review of the manuscript and Charlie Whittaker for help with Fig 3. Richard Hynes is an Investigator of the Howard Hughes Medical Institute.
Submitted: 1 June 2000
Revised: 27 June 2000
Accepted: 27 June 2000