Origins, Lineage-Specific Expansions, and Multiple Losses of Tyrosine Kinases in Eukaryotes

Shin-Han Shiu and Wen-Hsiung Li

Department of Ecology and Evolution, University of Chicago

Correspondence: E-mail: whli{at}uchicago.edu.


    Abstract
 
Tyrosine kinases are important components of metazoan signaling pathways, and their mutant forms are implicated in various malignancies. Searching the sequences from the genomes of 28 eukaryotes and GenBank, we found tyrosine kinases not only in metazoans but also in the green alga Chlamydomonas reinhardtii, the potato late blight pathogen Phytophthora infestans, and the protozoan pathogen Entamoeba histolytica, contrary to the current view that tyrosine kinases are animal-specific. Based on a phylogenetic analysis, we divided this gene family into 43 subfamilies and found that at least 19 tyrosine kinases were likely present in the common ancestor of chordates, arthropods, and nematodes. Interestingly, most of the subfamilies have conserved domain organizations among subfamily members but have undergone different degrees of expansion during the evolution of metazoans. In particular, a large number of duplications occurred in the lineage leading to the common ancestor of Takifugu and mammals after its split from the Ciona lineage about 450 to 550 MYA. The timing of expansion coincides with a proposed large-scale duplication event in the chordate lineage. Furthermore, gene losses have occurred in most subfamilies. Interestingly, different subfamilies have similar net gain rates in the chordates studied. However, the tyrosine kinases in mouse and human or in fruit fly and mosquito mostly have a one-to-one relationship between species, indicating that static periods of 90 Myr or longer in tyrosine kinase evolution have followed large expansion events.

Key Words: tyrosine kinase • gene family • gene duplication • gene loss


    Introduction
 
The ability to perceive and process information via cell surface receptors is a basic property of all organisms. Signal perception through these receptors initiates a series of events that leads to the elicitation of proper cellular responses. In animals, the tyrosine kinase family is one of the most important gene families in mediating such signal transduction events. Members of this gene family are involved in aspects of animal development, tissue differentiation, immune responses, and cell death (van der Geer, Hunter, and Lindberg 1994; Hunter 1998; Hubbard and Till 2000). Signal transduction involving tyrosine kinases is recognized as one of the few conserved pathways in animal development (Pires-daSilva and Sommer 2003). As a consequence, mutations in tyrosine kinase genes result in many disease states such as various forms of cancer and in congenital syndromes such as dwarfism and hereditary lymphedema (Robertson, Tynan, and Donoghue 2000; Blume-Jensen and Hunter 2001).

Tyrosine kinases form a monophyletic group within the large protein kinase superfamily (Hanks and Hunter 1995), which also includes plant receptor-like kinases (RLK), Pelle kinases, and Raf kinases (Shiu and Bleecker 2001). In this study, this relationship is used to distinguish the tyrosine kinase family from other kinase families capable of phosphorylating tyrosine, such as MAP kinase kinase and casein kinase II. Members of the tyrosine kinase family contain a canonical kinase domain that is capable of phosphorylating tyrosines on substrate proteins. Outside of the kinase domain, these proteins possess a diverse array of sequence motifs responsible for interaction with other components in signal transduction pathways (Hubbard and Till 2000). On the basis of these sequence motifs and sequence similarity, human tyrosine kinases have been classified into multiple subfamilies (Robinson, Wu, and Lin 2000; Blume-Jensen and Hunter 2001; Manning et al. 2002). Interestingly, tyrosine kinases have so far been isolated only from animals, including sponge and hydra (Gamulin et al. 1997; Suga et al. 1999; Steele, Stover, and Sakaguchi 1999), and from the choanoflagellate Monosiga brevicollis, a unicellular relative of metazoans (King and Carroll 2001). Saccharomyces cerevisiae (budding yeast) is devoid of this gene family altogether (Hunter and Plowman 1997), and it has been reported that receptor tyrosine kinases are absent from the Arabidopsis genome (Arabidopsis Genome Initiative 2000). Therefore, it is generally believed that tyrosine kinases are specific to metazoans.

Similar to many other multigene families, the tyrosine kinase family underwent differential expansion during the course of metazoan evolution. Several Caenorhabditis elegans–specific tyrosine kinases were found to have undergone expansions (Plowman et al. 1999; Popovici et al. 1999). In a comparison of all kinases from budding yeast, Drosophila melanogaster (fruit fly), and C. elegans (Manning et al. 2002), it was noted that the tyrosine kinase subfamilies in fruit fly and C. elegans were of different sizes. In addition, the orthologous relationships for quite a few tyrosine kinases could not be readily established. Based on surveys of GenBank sequences and targeted sequencing of selected tyrosine kinase subfamilies, it was proposed that two major episodes of duplications have occurred in this gene family (Iwabe, Kuma, and Miyata 1996; Suga et al. 1997; Suga et al. 1999). The first seems to have occurred before the split between poriferans and the other metazoans. The second occurred around the time of the split between the cyclostomes, such as lamprey, and the gnathostomes, including jawed fishes and tetrapods. However, these assertions were based on limited numbers of tyrosine kinases identified using degenerate primers, raising the question of whether the pattern observed is representative of this gene family across different taxa.

In this study, we sought to address several key questions in tyrosine kinase evolution. In eukaryotes, only in budding yeast has the absence of tyrosine kinases been rigorously examined on the basis of whole-genome analyses. Therefore, we searched for tyrosine kinases in 28 eukaryotic genomes including seven metazoans, four fungi, one microsporidian, one green alga, two flowering plants, and nine unicellular eukaryotes to examine the notion that tyrosine kinases are metazoan-specific. This search was further extended to GenBank polypeptide and expressed sequence tag (EST) databases. Previous studies on the expansion of tyrosine kinases did not use completely or nearly completely sequenced genomes. We constructed a phylogeny for all tyrosine kinases found in the genomes analyzed to determine the extents of family expansion in different evolutionary lineages. We also examined the domain organizations of tyrosine kinases in an attempt to determine the timing of their establishment and to identify domain recruitment events. Finally, gene duplications and losses are two counteracting mechanisms in controlling gene family size. To understand the relative contribution and timing of these events in different tyrosine kinase subfamilies, we determined the numbers of duplications and losses of tyrosine kinases among the organisms analyzed.


    Methods
 
Searching for Members of the Kinase Superfamily
The protein kinase sequences were obtained from the genomes listed in table 1 with the following procedure. The amino acid sequences of representative kinases (Shiu and Bleecker 2001) were used as queries to conduct BLAST (Altschul et al. 1997) searches against known and predicted genes of 28 eukaryotic genomes (for Chlamydomonas, the protein sequences were predicted using Arabidopsis gene models with GenScan; Burge and Karlin 1998). All sequences with E values less than 1 were further analyzed with SMART (Schultz et al. 2000) and Pfam (Sonnhammer et al. 1998) HMM models in the SMART database. Sequences with kinase domains predicted by the SMART database were regarded as members of the protein kinase superfamily. The kinase domain amino acid sequences of all kinases identified in each organism were aligned with representative kinases using ClustalW (Higgins, Thompson, and Gibson 1996). A phylogenetic tree was then generated using the Neighbor-Joining method with bootstrap replicates implemented in MEGA2 (Kumar et al. 2001). The multiple substitutions were corrected using the Poisson distance. The alignment gaps were treated as missing characters. The phylogeny was rooted with the aminoglycoside kinase from Staphylococcus, a divergent kinase related to eukaryotic protein kinases (accession number P00554; Leonard, Aravind, and Koonin 1998). This procedure was repeated for each of the 28 genomes analyzed.
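The Poisson correction mentioned above converts the observed proportion of differing sites, p, into a distance d = -ln(1 - p) that allows for multiple substitutions at the same site. The following is a minimal sketch of that calculation for one pair of aligned sequences. It is not MEGA2's code: the function name, the gap characters, and the choice to skip gapped columns pair by pair are assumptions made for illustration.

```python
import math


def poisson_distance(seq_a, seq_b, gap_chars="-?"):
    """Poisson-corrected distance between two aligned protein sequences.

    Columns in which either sequence has a gap are skipped for this pair,
    which is one way (assumed here) of treating gaps as missing characters.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    different = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # missing data: ignore this column for this pair
        compared += 1
        if a != b:
            different += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = different / compared  # observed proportion of differing sites
    if p >= 1.0:
        return math.inf  # the correction is undefined when every site differs
    return -math.log(1.0 - p)


# Hypothetical toy alignment: 7 comparable sites, 1 difference.
d = poisson_distance("ACDEFG-H", "ACDQFGKH")
```

A matrix of such pairwise distances is the usual input to Neighbor-Joining tree construction.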


 
Table 1 Genomes Analyzed and the Size of Tyrosine Kinase Family.

 
Delineation of the Tyrosine Kinase Family in the Genomes Analyzed
The tyrosine kinase family was defined by two approaches. The first one was aimed at delineating the tyrosine kinase family in the context of all kinases within each genome. The kinases from each organism were aligned with representative sequences used in the previous section. The kinase phylogeny for each organism was generated based on the alignments. Sequences in the same clade as the representatives from the receptor kinase group (including the tyrosine kinase, the RLK/Pelle, and the Raf families; Shiu and Bleecker 2001) were regarded as "candidate tyrosine kinases." After eliminating potentially alternatively spliced forms, these candidates were aligned with kinase representatives. The alignments were used for generating a phylogenetic tree as described in the previous section. "Tyrosine kinases" were the sequences in a large clade that formed a sister group to the RLK/Pelle and the Raf family and included tyrosine kinase representatives. To uncover tyrosine kinases that may have been missed using the first approach, a BLAST search was conducted to obtain kinases that were more similar to tyrosine kinases (defined using the first approach) than to any other kinases. Using this criterion, 12 sequences were identified and their affiliation to tyrosine kinases was verified by constructing phylogenetic trees with kinase representatives as described earlier. Ten sequences were found to be in the same clade as the tyrosine kinase representatives and were combined with the tyrosine kinase set defined in the first approach. Together with the human and mouse sequences that were missed in Ensembl (see next section), the sequence names for tyrosine kinases used in this analysis are listed in Supplement A of the Supplementary Material online.

Obtaining Tyrosine Kinases from GenBank Polypeptide and EST Databases
The kinase domain sequences of the tyrosine kinase set obtained above were used to BLAST against the GenBank polypeptide and EST sequence release 132 with the cutoff threshold of 1. These candidate kinase sequences were used to search against the combined kinase set from Arabidopsis, human, and all four fungi using BLAST. Sequences with a known human tyrosine kinase as the top match were regarded as tyrosine kinase candidates and those candidates from non-metazoans were verified further by determining their affiliation with kinase representatives. Seven human and three mouse sequences in these GenBank-derived sequences were not found in the Ensembl sequence release used. These sequences were incorporated into the tyrosine kinase set described in the previous section. For a list of tyrosine kinases from organisms other than the genomes analyzed, see Supplement B of the Supplementary Material online.
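The top-match criterion described above can be sketched as follows. This is an illustration only, not the authors' pipeline: the row layout (query, subject, bit score), the function names, and all sequence identifiers are hypothetical, and ranking hits by bit score is an assumption.

```python
def top_hits(blast_rows):
    """Return {query: (subject, bit_score)} with the best-scoring subject per query.

    blast_rows is an iterable of (query, subject, bit_score) tuples.
    When two subjects tie, the one seen first is kept.
    """
    best = {}
    for query, subject, bit_score in blast_rows:
        if query not in best or bit_score > best[query][1]:
            best[query] = (subject, bit_score)
    return best


def tyrosine_kinase_candidates(blast_rows, known_human_tks):
    """Queries whose top match is a known human tyrosine kinase, sorted by name."""
    best = top_hits(blast_rows)
    return sorted(q for q, (subject, _) in best.items() if subject in known_human_tks)


# Hypothetical hits against a combined Arabidopsis/human/fungal kinase set.
rows = [
    ("est_1", "Hs_INSR", 250.0),
    ("est_1", "At_RLK1", 90.0),
    ("est_2", "At_RLK1", 180.0),
    ("est_2", "Hs_SRC", 120.0),
]
candidates = tyrosine_kinase_candidates(rows, {"Hs_INSR", "Hs_SRC"})
```

Here only est_1 is kept, because the best match for est_2 is a plant kinase. As in the text, candidates from non-metazoans would still need phylogenetic verification.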

Classification of Tyrosine Kinase Subfamilies and Inference of Duplications and Losses
A phylogeny of the kinase domain protein sequences of all tyrosine kinases identified from the genomes listed in table 1 was constructed as described earlier but rooted with human cyclin-dependent kinase 3 (CDK3, NP_001249). The phylogeny was then compared to the published classification scheme (Blume-Jensen and Hunter 2001), and all subfamilies except JAK had more than 40% support, whereas the majority had more than 80% support (for the full phylogeny of tyrosine kinases, see Supplement C of the Supplementary Material online). Therefore, a cutoff of 40% bootstrap support was used for defining subfamilies. The subfamilies were further joined or divided based on the presence or inferred presence of any C. elegans tyrosine kinase. For each clade with two basal bifurcating branches, those that had more than 40% bootstrap support and contained at least one C. elegans tyrosine kinase in one branch but not the other were defined as subfamilies. Any clade with higher than 40% support but without a C. elegans tyrosine kinase was also regarded as a subfamily only if its sister group was a subfamily with a C. elegans tyrosine kinase. The other clades with more than two sequences and more than 40% bootstrap support were also classified as subfamilies. The rest were defined as singletons.

The kinase domain protein sequences from each of the 43 subfamilies were aligned and manually inspected according to the kinase subdomain signatures (Hanks and Hunter 1995). The phylogeny for each subfamily was generated as described above, but with 1,000 bootstrap replicates. The rooted phylogenies were superimposed on species trees to infer gene duplications and losses using the program GeneTree (Page 1998).
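The idea behind superimposing a gene tree on a species tree can be illustrated with the standard least-common-ancestor (LCA) mapping: each gene-tree node is mapped to the smallest species-tree clade that contains all species below it, and a node is scored as a duplication when it maps to the same clade as one of its children. The sketch below is not GeneTree's implementation. The nested-tuple tree encoding, the function names, the gene names, and the toy trees are all assumptions, and losses are not counted.

```python
def leaves(tree):
    """Set of leaf names under a node of a nested-tuple binary tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    left, right = tree
    return leaves(left) | leaves(right)


def species_clades(species_tree):
    """Leaf sets of every species-tree node, smallest first."""
    clades = []

    def walk(node):
        clades.append(leaves(node))
        if not isinstance(node, str):
            for child in node:
                walk(child)

    walk(species_tree)
    return sorted(clades, key=len)


def lca(clades, species_set):
    """Smallest species-tree clade that contains every species in species_set."""
    for clade in clades:
        if species_set <= clade:
            return clade
    raise ValueError("species missing from the species tree")


def count_duplications(gene_tree, species_tree, species_of):
    """Count gene-tree nodes that map to the same clade as one of their children."""
    clades = species_clades(species_tree)
    duplications = 0

    def walk(node):
        nonlocal duplications
        if isinstance(node, str):
            return frozenset([species_of(node)])
        sp_left, sp_right = walk(node[0]), walk(node[1])
        here = lca(clades, sp_left | sp_right)
        if here in (lca(clades, sp_left), lca(clades, sp_right)):
            duplications += 1
        return sp_left | sp_right

    walk(gene_tree)
    return duplications


# Hypothetical example: two paralogs (A and B) in mouse and human, one gene in Ciona.
species_tree = ("ciona", ("fugu", ("mouse", "human")))
gene_tree = ((("hs_A", "mm_A"), ("hs_B", "mm_B")), "ci_A")
prefix_to_species = {"hs": "human", "mm": "mouse", "ci": "ciona"}
n_dup = count_duplications(
    gene_tree, species_tree, lambda gene: prefix_to_species[gene.split("_")[0]]
)
```

In this toy case one duplication is inferred, at the node joining the A and B paralogs. A full reconciliation would additionally infer losses, for example of the absent fugu copies.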


    Results
 
Distribution of Tyrosine Kinases in Eukaryotes
To determine whether this gene family is restricted to the animal kingdom, a survey was conducted to uncover tyrosine kinases in 28 eukaryotic genomes (table 1). Among annotated genomes, the number of protein kinases found varies from 30 in the microsporidian Encephalitozoon cuniculi to more than 1,000 and 1,600 in the flowering plants Arabidopsis thaliana and rice, respectively (table 1). As expected, all metazoan genomes contain various numbers of tyrosine kinases (table 1; see also Supplement A of the Supplementary Material online at www.mbe.oupjournals.org). Tyrosine kinases are absent from fungi, the fungal relative Encephalitozoon, flowering plants, and most unicellular eukaryotes. Surprisingly, however, several non-metazoan sequences were found to cluster with known tyrosine kinases, including three sequences from Chlamydomonas, one from Phytophthora, and two from Entamoeba (fig. 1A; shaded boxes a, b, and c, respectively). Although the bootstrap support is low at 44%, the sequences are enclosed within the previously defined tyrosine kinase family (Manning et al. 2002).



 
FIG. 1. Phylogenetic distribution of the tyrosine kinase gene family. A. Phylogeny of protein kinase representatives including putative tyrosine kinases from three non-metazoan eukaryotes with black backgrounds labeled a through c. The kinase representative sequences included are as described previously (Shiu and Bleecker 2001). The phylogeny was constructed using the Neighbor-Joining method with 1,000 bootstrap replicates. The bootstrap values in percent support are shown at each branch; branches with less than 30% are collapsed. The sequences are named with species abbreviations followed by gene names. The species abbreviations: At, Arabidopsis thaliana; Ce, Caenorhabditis elegans; Cr, Chlamydomonas reinhardtii (a); Dd, Dictyostelium discoideum; Dm, Drosophila melanogaster; Eh, Entamoeba histolytica (c); Hs, Homo sapiens; Pi, Phytophthora infestans (b). Kinase family abbreviations: TK, tyrosine kinase; RSK, animal receptor serine kinase; RLK, plant receptor-like kinases. B, The phylogenetic relationships between some of the major taxonomic groups of eukaryotes (Baldauf et al. 2000; Hedges 2002) are shown on the left. The tyrosine kinases that were obtained from GenBank protein and EST sequences and were found to be present (filled circles) in metazoans and a choanoflagellate but absent (empty circles) in other taxa, are indicated in the first column on the right. The tyrosine kinases were also defined using the full kinase complements consolidated from the protein-coding genes of 28 eukaryotic genomes shown in table 1. Tyrosine kinases were only found in metazoan genomes and 3 other eukaryote genomes, as indicated in the second column. The non-metazoan eukaryotes with putative tyrosine kinases are labeled a–c according to figure 1A.

 
To extend the analysis to organisms whose genomes had not been completely sequenced, another survey was conducted to retrieve all tyrosine kinases from GenBank polypeptide and EST databases. Sequences that are more similar to known tyrosine kinases than to kinases in any other kinase families could be found in several basal animal taxa and their relatives, including cnidarians, poriferans, and choanoflagellates (fig. 1B; for the putative tyrosine kinases identified in these organisms, see Supplement B of the Supplementary Material online at www.mbe.oupjournals.org). No tyrosine kinase was found in non-metazoans except for one EST (accession number BI094600) from a fungus, Laccaria bicolor. It shares 97% identity at the nucleotide level with the human insulin receptor. However, its presence in L. bicolor could not be confirmed by polymerase chain reaction (PCR) (Gopi K. Podila, personal communication), and it may represent contamination from another source. These observations suggest that the tyrosine kinase family is mostly restricted to the metazoans and the closely related choanoflagellates. However, Chlamydomonas, Phytophthora, and Entamoeba likely also contain tyrosine kinases, arguing against the notion that they are metazoan-specific.

Differential Lineage Expansion in Tyrosine Kinase Subfamilies
The size of the tyrosine kinase family in animal genomes ranges from ~30 in arthropods to 115 in Takifugu. This family can be divided into subfamilies with a wide range of domain organizations and functions (Robinson, Wu, and Lin 2000; Blume-Jensen and Hunter 2001). In this study, we used the kinase domain sequences to construct a phylogeny to define 43 subfamilies for further comparative analysis as outlined in Methods (fig. 2; for the full phylogeny, see Supplement C of the Supplementary Material online). Under the assumption that C. elegans is the earliest diverging taxon among the metazoans examined, 19 subfamilies were defined with at least one C. elegans tyrosine kinase. These subfamilies are likely to have been present in the common ancestor of these metazoans. An additional 11 subfamilies were defined because they are sister groups to the 19 C. elegans-containing subfamilies (fig. 2A, arrowheads). Many of these subfamilies were also likely present in the common ancestor, and the absence of C. elegans sequences might be due to gene loss. These inferences suggest that the number of tyrosine kinase genes in the common ancestor of the metazoans examined was at least 19 and could be up to 30 or more. Thirteen additional subfamilies were defined, although they did not have any C. elegans sequence and they were not a sister group to any subfamily with C. elegans sequences (fig. 2A, arrows). In addition, 19 singletons do not fall into the subfamilies defined (for their identity, see Supplement A of the Supplementary Material online).



 
FIG. 2. Kinase phylogeny and conservation of domain organization. The kinase domains of tyrosine kinases were aligned and used for phylogenetic reconstruction. The phylogeny is shown on the left. The branches were labeled according to the taxa of the source sequences. Subfamilies were classified based on the scheme outlined in Materials and Methods, except the MUSK subfamily. For subfamilies with domains or motifs identified in addition to kinases, their organizations were shown in the middle. In subfamilies with more than one domain configuration, they are enclosed in gray background. The conservation of domain organization is shown on the right. The species name abbreviations are shown on the top. The circle indicates that the organism has at least one tyrosine kinase that belongs to this subfamily. The circles were color-coded based on the resemblance of subfamily members in an organism to the domain organization shown in the middle: black, same domain composition and arrangement; blue, similar but with fewer or more non-kinase domains; red, no identical domain outside of the kinase domain; white, not determined

 
Thirteen of these subfamilies are shared among C. elegans, arthropod (fly or mosquito), and at least one of the four chordates studied. Some other subfamilies are either organism-specific or lineage-specific. For example, the TIE, AATYK, AXL, and SRM subfamilies are chordate-specific. The remaining subfamilies (e.g., PDGFR and HGFR) are shared between insects and chordates or between C. elegans and chordates. The subfamilies in C. elegans, fruit fly, and mosquito usually have only one to two genes, except for the CeKIN-15 and CeFER subfamilies in C. elegans (fig. 2B). Ciona, a basal chordate species, has slightly more genes in each subfamily than do C. elegans, fruit fly, and mosquito. It is of interest that subfamily sizes are consistently larger in Takifugu, mouse, and human than in Ciona. On the other hand, very little difference in terms of subfamily size is seen between mouse and human and between fruit fly and mosquito. In fact, the only difference between mouse and human is in the SRC subfamily, where they differ by one gene, YES2, a reported pseudogene (Semba et al. 1988) that is misannotated as YES1 in Ensembl. Taken together, the subfamily size differences among organisms suggest that this gene family experienced a period of stasis until the separation of the vertebrate lineage from Ciona. The expansion in the vertebrate lineage seems to be followed by a stasis in mammals.

Establishment of Domain Configurations in Tyrosine Kinases
In addition to gene duplication, one mechanism to generate molecular novelties is the recruitment of additional protein domains through domain shuffling. Multiple domains are found in tyrosine kinases. Therefore, an intriguing question is whether domain addition in the tyrosine kinase family occurred frequently after the divergence of the animals analyzed. Tyrosine kinases with known domains outside the kinase regions were compared among organisms to uncover the pattern of conservation or divergence in domain organization.

Interestingly, the domain organization and composition of subfamily members are in general similar among organisms (fig. 3). Among the 13 subfamilies shared among C. elegans, arthropod (fruit fly or mosquito), and chordates (at least one of the four species), the domain organizations are essentially the same in 11 subfamilies, with minor differences such as the number of repeated motifs. In the subfamilies shared among the four chordates, most subfamily members have the same organization. In particular, no difference is found between mouse and human or between fruit fly and mosquito. Nevertheless, differences in domain organization are found in some subfamilies. For example, in the HGFR subfamily, all chordate members have large, homologous extracellular domains, which were not found in the C. elegans ortholog. Because the N-terminal region of the C. elegans HGFR member did not have a sequence resembling the chordate HGFR extracellular domain, this absence may indicate a domain loss in the C. elegans lineage or a domain recruitment before the divergence of chordates. Taken together, the domain configurations in most subfamilies were established before the divergence of the metazoan genomes analyzed. Few clear changes in domain configurations were identified, and conservation of organization seems to be the rule.



 
FIG. 3. Differential expansion of tyrosine kinase subfamilies. Subfamilies of tyrosine kinases were delineated based on the phylogeny of the kinase domain sequences and their domain organizations. A, Phylogeny of tyrosine kinase subfamily representatives indicating a monophyly of the tyrosine kinase family and subfamily relationships. Bootstrap values are shown next to the nodes. Outgroups are boldfaced and italicized. Arrowheads indicate subfamilies defined based on their sister group relationships to subfamilies with C. elegans tyrosine kinases. Arrows indicate clades that do not have C. elegans genes and do not have sufficient support for their affiliation with subfamilies with C. elegans sequences. B, The number of subfamily members in different metazoan genomes. Species abbreviations are as defined in figure 1. Subfamilies with more than 15 members were truncated with the gene number indicated. The full names for gene acronyms can be found in Supplement A of the Supplementary Material online

 
Patterns of Tyrosine Kinase Duplication and Loss
Gene numbers are determined not only by gene duplication but also by gene loss, which may obscure the duplication pattern. To determine the contributions of gene duplication and loss to each subfamily, subfamily phylogenies were constructed and evaluated using the concept of reconciled tree (Page and Charleston 1997), where a gene tree was superimposed onto the species tree to infer gene duplications and losses by the parsimony principle.

Two alternative species trees were used. The first assumed a closer relationship between deuterostomes and arthropods (fig. 4A), and the second assumed that nematodes and arthropods belong to Ecdysozoa (Aguinaldo et al. 1997; fig. 4B). Both topologies were evaluated, but the details of only the first are shown in figure 4C. The numbers of duplications and losses were scored on each branch of the trees. Interestingly, there are more inferred duplications in branch 6, right after the split of the vertebrate lineage from the Ciona lineage, than in any other branch (fig. 4A). For most subfamilies, the numbers of duplications on branch 6 are higher than on most other branches (fig. 4C). This finding indicates that the tyrosine kinase family underwent an episodic expansion in the lineage leading to the common ancestor of Takifugu, mouse, and human after its split from the Ciona lineage. In addition, the number of duplications in branch 9 is also large compared to the sister lineages that lead to either mouse or human (branches 7, 10, 11), consistent with the proposed genome duplication in the ray-finned fish lineage. Very few or no duplications were inferred in the arthropod lineage (branches 2, 3, 4) and in the lineages leading to either mouse or human (branches 7, 10, 11). In the two alternative topologies, most branches have similar numbers or the same number of inferred duplications. The only large difference is seen in branch 0, which represents the common ancestor of the species analyzed. More duplication events were inferred when assuming the presence of Ecdysozoa in the species tree.



 
FIG. 4. Patterns of gene duplication and loss in tyrosine kinase subfamilies. A, B. Two alternative topologies indicating the relationships between the metazoans analyzed (Adoutte et al. 2000). The first characters in the species names in table 1 are taken as species abbreviations. Numbers within rectangles located on the branches were used to facilitate scoring of events. Numbers above each rectangle indicate the number of duplications and losses (duplication/loss) in all subfamilies. C, The number of duplications and losses for species topology shown in (A). The subfamily order is the same as in figure 2. Species abbreviations are the same as those in figure 1. The presence of subfamily member(s) is indicated with a filled circle. The inferred events are numbered according to the numbers within the rectangles in (A). Subfamilies with more than 10 inferred events were truncated, with the number of events indicated. In cases where no events were inferred in all subfamilies, the columns were not shown

 
Substantial numbers of inferred losses are found in nearly all nodes. However, some branches, such as the lineages leading to arthropods (branch 5), C. elegans (branch 1), and mammals (branch 7), have more inferred losses than others. The large number of inferred losses in the lineage leading to mammals may indicate the loss of genes duplicated during the episodic expansion. As with duplications, very few or no gene losses could be inferred in the fruit fly, mosquito, mouse, and human lineages. In addition, more losses were inferred under the Ecdysozoa hypothesis.

Duplications and Losses After the Split Between Ciona and the Other Chordates
Because different tyrosine kinase subfamilies play substantially different roles in animals, some may be retained at a higher rate than the others. To determine the retention rates, we analyzed the relationships between duplication and loss in tyrosine kinase subfamilies. Because the frequency of duplication was highest in the lineage leading to the vertebrates after its separation from the Ciona lineage (branch 6; fig. 4A and fig. 4C), the number of duplications was compared to the number of losses in the chordate lineages excluding Ciona.

Nearly all subfamilies have undergone more gene duplications than losses, as indicated by their deviations from a one-to-one relationship between duplication and loss (fig. 5). The deviation is most pronounced in the EPHR subfamily. Interestingly, the relationships between the numbers of duplications and losses fit a linear model well (fig. 5, solid line), suggesting that, despite the differences in duplication rate among subfamilies, the rates of net gene gain are quite similar. The slope indicates one gene loss per 2.16 duplication events. Therefore, about one out of every two tyrosine kinase duplicates has been retained during the period examined.
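The step from the fitted slope to the retention estimate is simple arithmetic: if one loss occurs per 2.16 duplications, the fraction of duplicates lost is 1/2.16 and the fraction retained is 1 - 1/2.16. The sketch below works this through. The function name is hypothetical, and treating the fitted line as passing through the origin is a simplifying assumption, because the intercept is not given in the text.

```python
def retained_fraction(duplications_per_loss):
    """Fraction of duplicates retained when one loss occurs per given number of duplications."""
    if duplications_per_loss <= 0:
        raise ValueError("duplications_per_loss must be positive")
    return 1.0 - 1.0 / duplications_per_loss


fraction = retained_fraction(2.16)
print(f"{fraction:.2f}")  # prints 0.54
```

A retained fraction of about 0.54 corresponds to the statement that roughly one out of every two duplicates was kept.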



 
FIG. 5. Comparison of inferred duplication and loss in different subfamilies after the split between Ciona and the other chordates. The sum of inferred duplications in branches 6–7 and 9–11 is compared to the sum of inferred losses in branches 7 and 9–11. The dotted line represents a one-to-one relationship between duplication and loss. The solid line indicates the linear relationship between duplication and loss in different subfamilies. The equation and the correlation coefficient for the solid line are shown on top

 

    Discussion
 
Taxonomic Distribution and Origin of the Tyrosine Kinase Family
Tyrosine kinases are important regulators of growth and development in animals. In two previous genome-wide analyses, the budding yeast and Arabidopsis were found to be devoid of tyrosine kinases (Hunter and Plowman 1997; Arabidopsis Genome Initiative 2000). In this study, we analyzed GenBank polypeptide and EST sequences and 28 eukaryotic genomes and found tyrosine kinases, as expected, in various metazoans and the choanoflagellate Monosiga brevicollis (King and Carroll 2001). In addition, we found putative tyrosine kinases in three non-metazoan eukaryotes: the green alga Chlamydomonas, the potato late blight pathogen Phytophthora, and the protist pathogen Entamoeba. However, these three eukaryotes are only distantly related, and some organisms with closer affiliations to these three eukaryotes do not have sequences that clustered with tyrosine kinases (fig. 1B). One possibility is that a tyrosine kinase was present in the common ancestor of the eukaryotes examined. The absence of tyrosine kinases in other non-metazoan eukaryotes may then be explained by multiple losses in eukaryote lineages. However, the absence of tyrosine kinases in several non-metazoan eukaryotes can also be attributed to the limitation of the methodology used. That is, these eukaryotes may contain tyrosine kinases that are too divergent to be uncovered by the sequence analysis. Alternatively, the three organisms may have acquired tyrosine kinases via horizontal gene transfer. Further research is needed to resolve this issue, and biochemical studies are necessary to establish whether these kinases can phosphorylate tyrosine.

Within the protein kinase superfamily, tyrosine kinases are related to Raf kinases and the receptor-like kinase (RLK)/Pelle family, which together define the "receptor kinase group" (Shiu and Bleecker 2001). Because tyrosine kinases are able to phosphorylate tyrosine, this gene family may have been derived from another kinase family that exhibits similar specificity, such as MAP kinase kinases (Dhanasekaran and Premkumar Reddy 1998), casein kinase II (Litchfield 2003), or Dictyostelium dual-specificity kinases. Among these kinase families, only the Dictyostelium dual-specificity kinases are closely related to the receptor kinase group (data not shown). Interestingly, Raf kinases and nearly all RLK/Pelle family members tested to date are serine/threonine kinases. The kinase domains in guanylyl cyclases are also closely related to tyrosine kinases (fig. 2A, RETGC) but autophosphorylate on serine residues only (Aparicio and Applebury 1996). Therefore, the ability to phosphorylate tyrosine might have evolved several times independently in the protein kinase superfamily, and the tyrosine kinase family may be descended from Dictyostelium dual-specificity kinase-like genes.

Conservation of Domain Organization at the Subfamily Level
The domain organizations are similar among members of each tyrosine kinase subfamily (see Supplement C of the Supplementary Material online), providing additional support for our classification of subfamilies. As expected, this conservation is most pronounced between more closely related organisms, such as mouse and human. Interestingly, the conservation of domain organization between distantly related organisms such as C. elegans and human is still quite high. These findings indicate that some of the additions of domains to tyrosine kinases occurred before the divergence among the organisms studied 600 to 1,000 MYA (Ayala and Rzhetsky 1998; Hedges 2002). Based on the analysis of a limited number of tyrosine kinase subfamilies and other gene families in metazoans, Iwabe, Kuma, and Miyata (1996) suggested that duplications leading to different subfamilies with distinct domain organizations occurred before the protostome-deuterostome split. Our findings based on the whole tyrosine kinase family from multiple genomes are mostly in line with this notion. Nevertheless, there are some interesting exceptions. In addition to the HGFR subfamily, the vertebrate AXL members contain a large extracellular domain, whereas the C. elegans counterpart F11E6.8 is devoid of the extracellular part. The immunoglobulin domains in vertebrate MUSKs and RORs are not found in their arthropod relatives.

In contrast to the proposed increase of molecular complexity through domain recruitment in the human lineage (Lander et al. 2001; Venter et al. 2001), we found a relative stasis in domain configuration that has lasted for hundreds of millions of years. This observation suggests that most domain additions may be selected against because they interfere with normal developmental programs. Indeed, tyrosine kinases have been implicated in several malignancies because of their fusion to other proteins, as occurs in chronic myelogenous leukemia (Groffen et al. 1984; Shtivelman et al. 1985), anaplastic large-cell lymphoma (Morris et al. 1994), and congenital fibrosarcoma (Knezevich et al. 1998). These examples suggest that most, if not all, tyrosine kinase fusions significantly decrease the fitness of affected individuals. This strong negative effect may contribute substantially to the lack of changes in domain organization among the metazoans examined.

Similarity in domain organization alone may not be sufficient for establishing homology between genes. In previous studies, three C. elegans sequences were designated as VEGFR orthologs because they contained similar numbers of immunoglobulin repeats (Plowman et al. 1999; Popovici et al. 1999). Our study indicates that their kinase sequences are more closely related to kinases in the CeKIN15 subfamily than to VEGFR. This raises the possibility that the similar extracellular domains in VEGFRs and the C. elegans sequences were recruited independently by different tyrosine kinases. In addition, members of the C. elegans FER subfamily were designated as mammalian FES/FER relatives (Plowman et al. 1999). However, our phylogeny of tyrosine kinases does not lend support to this designation. Further analysis is needed to resolve these conflicting results.

One limitation of our analysis is that certain sequences were not clustered with other sequences because of their high sequence divergence; these are the singletons that did not fall into any subfamily. The extracellular domains of C. elegans singletons C16D9.2 and C25F6.4 are similar to those of ROS and DDR, respectively. In addition, a mosquito singleton contains a WSC domain and a fibronectin 3 domain in its extracellular region, representing a novel domain configuration not found in other organisms except fruit fly. It is not known whether sequence divergence, domain shuffling, and/or gene conversion confounded the inference of relationships between singletons and subfamilies. Nonetheless, most tyrosine kinases can be readily placed in subfamilies, and these singletons represent interesting exceptions that require further examination.

Expansion of the Tyrosine Kinase Family
The tyrosine kinase family was probably small in the common ancestor of metazoans and choanoflagellates, but it had expanded to approximately 30 members before the divergence of the metazoans examined. The expansion continued after this divergence, especially in C. elegans and vertebrates.

Interestingly, the subfamily sizes are generally similar among C. elegans, fruit fly, and mosquito, with few exceptions. Most subfamilies shared among chordates are larger in Takifugu, mouse, and human than in Ciona, which diverged from the three other chordates analyzed ~550 MYA (Dehal et al. 2002). Furthermore, an episodic increase of duplications was found in a number of subfamilies in the ancestral lineage of Takifugu, mouse, and human after its split from the Ciona lineage. Similar patterns of duplications have been reported in several tyrosine kinase subfamilies (Suga et al. 1997). This pattern of duplication is also found in other gene families, and its timing coincides with the large-scale duplication in early vertebrate evolution (McLysaght, Hokamp, and Wolfe 2002; Gu, Wang, and Gu 2002). The increase of gene duplications during this time period is regarded as evidence for a whole-genome duplication by some investigators, but this is hotly debated (Wolfe 2001; Hokamp, McLysaght, and Wolfe 2003; Hughes and Friedman 2003). Another explanation for the observed increase in duplication events is that multiple, independent segmental duplications of chromosomes occurred within a relatively short time frame. In any case, our study does not support or refute either possibility. In mouse and human, we also found that one-third of tyrosine kinases are located within tandem clusters (Shiu and Li, unpublished data). It is likely that the combined action of tandem duplication and large-scale duplication contributed to the expansion.

One intriguing finding is that no duplication event was inferred in human and mouse after the split of their common ancestor from the Takifugu lineage. Few duplications were inferred in fruit fly and mosquito after their split from all the other metazoans examined. In addition, all human genes have mouse orthologs, and all but three fruit fly tyrosine kinases have mosquito counterparts. In both cases, the one-to-one relationships are rather remarkable considering that reciprocal best matches covered only 80% of the genes in mouse and human (Mouse Genome Sequencing Consortium 2002) and only ~45% of the genes in fruit fly and mosquito (Zdobnov et al. 2002). These findings indicate that gain or loss events that abolish a one-to-one relationship occurred rarely during the 90 Myr since the divergence between mouse and human and the ~260 Myr since the divergence between fruit fly and mosquito. Because large-scale duplications seem to be the major mechanism for the expansion of the tyrosine kinase family, the low number of gains in these lineages may simply reflect a low rate of gene duplication. On the other hand, the low number of losses may be due to the deleterious consequences of loss-of-function mutations in tyrosine kinases (Robertson, Tynan, and Donoghue 2000). Interestingly, among the 74 tyrosine kinase knockouts generated in mice, five do not have obvious phenotypes: the knockouts of BLK (Texido et al. 2000), SRM (Kohmura et al. 1994), BMX (Rajantie et al. 2001), CTK/Matk (Hamaguchi et al. 1996), and EphB6 (Shimoyama et al. 2002).
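The reciprocal-best-match criterion mentioned above can be illustrated with a minimal Python sketch. It is not the procedure used in this study, which relied on phylogenetic analysis. The function names, gene labels (h1, m1, and so on), and similarity scores are all hypothetical and serve only to show how one-to-one pairs are picked out.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring subject gene.

    scores: dict of {(query, subject): similarity score}.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


# Hypothetical scores: h3 has no gene in species B that prefers it back,
# so it does not enter a one-to-one pair.
a_vs_b = {("h1", "m1"): 900, ("h1", "m2"): 400,
          ("h2", "m2"): 850, ("h3", "m2"): 700}
b_vs_a = {("m1", "h1"): 910, ("m2", "h2"): 860, ("m2", "h3"): 690}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy input, two of the three genes from the first species fall into reciprocal pairs. A lineage-specific duplication or loss is the kind of event that leaves a gene such as h3 without a one-to-one partner.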

One rather unexpected finding is the linear correlation between duplications and losses. Because genes in different subfamilies have different functions, one might expect that some subfamily members have been retained more than the others. Surprisingly, the numbers of duplications and losses in different subfamilies are strongly correlated, with a duplication-to-loss ratio of 2.16. That is, approximately every other duplicated tyrosine kinase was retained. This finding indicates the presence of a general trend for the retention of tyrosine kinases from different subfamilies. However, there is likely substantial variation in the net gain rate among different vertebrates. The common ancestor of Takifugu and mammals might have had approximately 90 genes. If a whole-genome duplication occurred in the ray-finned fish after its divergence from lobe-finned fish, as proposed (Amores et al. 1998; Gates et al. 1999; Postlethwait et al. 2000), we expect to see 135 tyrosine kinases in Takifugu, assuming a gene-loss rate of 50%. Instead, there are only 105. In a comparative analysis between zebrafish and Takifugu, it was shown that Takifugu has lost many duplicates retained in zebrafish (Taylor et al. 2003). This may partially explain the smaller than expected number of tyrosine kinase genes in Takifugu. In addition, we cannot rule out the possibility that a fraction of tyrosine kinases is not present in the current release of the Takifugu genome.
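The arithmetic behind the two figures in this paragraph can be laid out explicitly. The following Python sketch is an illustration only: the function names are hypothetical, and it assumes that each loss removes one duplicated copy and that the 50% gene-loss rate applies only to the copies created by the whole-genome duplication.

```python
def retained_fraction(dup_to_loss_ratio):
    """Fraction of duplicates retained for a given duplication-to-loss ratio.

    Assumption: every loss removes one duplicated copy, so the number of
    losses per duplication is 1 / ratio.
    """
    return 1.0 - 1.0 / dup_to_loss_ratio


def expected_genes_after_wgd(ancestral_genes, duplicate_loss_rate):
    """Expected gene count after one whole-genome duplication.

    Assumption: the duplication adds one copy per ancestral gene, and only
    these new copies are subject to loss at the given rate.
    """
    new_copies = ancestral_genes
    return ancestral_genes + new_copies * (1.0 - duplicate_loss_rate)


retained = retained_fraction(2.16)             # about 0.54, roughly every other duplicate
expected = expected_genes_after_wgd(90, 0.5)   # 90 + 45 = 135
shortfall = expected - 105                     # 30 fewer genes observed than expected

print(round(retained, 3), expected, shortfall)
```

Under these assumptions, a ratio of 2.16 corresponds to retaining slightly more than half of the duplicates, and the 105 genes observed fall 30 short of the 135 expected.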

Patterns of Gains and Implications for Functional Divergence
The expansion of the tyrosine kinase family in animals implies that these tyrosine kinases are somehow beneficial and might have been retained by some mechanisms. One possible mechanism is expression divergence in time or tissues (Ferris and Whitt 1979; Force et al. 1999). For example, the two PDGFR isoforms have distinct expression patterns (Ataliotis and Mercola 1997). Moreover, although their kinase domains are interchangeable under certain conditions (Klinghoffer et al. 2001), the two PDGFR isoforms have different ligand affinities. Thus, another retention mechanism may involve the evolution of divergent ligand-binding capacities. Intuitively, these isoforms bind to the same ligands but with different signal outputs, providing additional regulatory mechanisms in the signaling network, and they are therefore retained. This model is not restricted to the extracellular ligand-binding domain because a change in activity modulation may be selected for.

Another possible mechanism for retention may be a gene dosage effect. The AXL subfamily members AXL, MER, and Tyro3 are involved in spermatogenesis, and mice lacking all three genes have various defects, including infertility (Lu et al. 1999). However, various double mutant combinations have only limited phenotypes and are fertile, and individual gene knockouts are phenotypically the same as the wild type. In light of the mutant phenotypes and differences in severity, it is possible that the increased number of AXL members contributed to an increase in fecundity and was selected for.

With the wealth of sequence information and our current understanding of tyrosine kinase functions, we conducted this analysis to uncover the history and mechanisms of tyrosine kinase family expansion. We found that this gene family is not animal-specific and that the expansion was preceded and followed by periods of stasis. Its presence in the Chlamydomonas, Phytophthora, and Entamoeba genomes provides the only known examples outside of animals and choanoflagellates. We speculate that whole-genome duplications were the major means for the expansion of this gene family. The domain architectures are conserved among members of most subfamilies, suggesting that most, if not all, domain acquisition events occurred prior to the divergence of the metazoans examined. We also found that gene gains are often followed by gene losses. Given the important roles of tyrosine kinases in essentially all stages of metazoan life, it is likely that tyrosine kinase duplicates were retained because they increased the capacity for cell-cell communication and intracellular signal transduction. Nevertheless, many gene losses have occurred. It is difficult to explain why certain duplicates were retained or lost simply based on the generalized functions of each subfamily. Further molecular genetic or biochemical analyses of members within a subfamily in multiple organisms may shed light on the relative contributions of various retention mechanisms in the tyrosine kinase family.


    Acknowledgements
 
We thank Melissa D. Lehti-Shiu and Marsha R. Rosner for reading the manuscript; Wojciech M. Karlowski, Klaus F. X. Mayer, and the Munich Information Centre for Protein Sequences (MIPS) database for providing rice gene predictions; and The Institute for Genomic Research (TIGR) and respective funding agencies for allowing the use of unfinished eukaryote genomes. The work was supported by a National Institutes of Health (NIH) National Research Service Award (5F32GM066554-02) to S.-H.S. and NIH grants to W.-H.L.


    Footnotes
 
Laura A. Katz, Associate Editor


    Literature Cited
 

    Adoutte, A., G. Balavoine, N. Lartillot, O. Lespinet, B. Prud'homme, and R. de Rosa. 2000. The new animal phylogeny: reliability and implications. Proc. Natl. Acad. Sci. USA 97:4453-4456.

    Aguinaldo, A. M., J. M. Turbeville, L. S. Linford, M. C. Rivera, J. R. Garey, R. A. Raff, and J. A. Lake. 1997. Evidence for a clade of nematodes, arthropods and other moulting animals. Nature 387:489-493.

    Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389-3402.

    Amores, A., A. Force, and Y. L. Yan, et al. (13 co-authors). 1998. Zebrafish hox clusters and vertebrate genome evolution. Science 282:1711-1714.

    Aparicio, J. G., and M. L. Applebury. 1996. The photoreceptor guanylate cyclase is an autophosphorylating protein kinase. J. Biol. Chem. 271:27083-27089.

    Arabidopsis Genome Initiative. 2000. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408:796-815.

    Ataliotis, P., and M. Mercola. 1997. Distribution and functions of platelet-derived growth factors and their receptors during embryogenesis. Int. Rev. Cytol. 172:95-127.

    Ayala, F. J., and A. Rzhetsky. 1998. Origin of the metazoan phyla: molecular clocks confirm paleontological estimates. Proc. Natl. Acad. Sci. USA 95:606-611.

    Baldauf, S. L., A. J. Roger, I. Wenk-Siefert, and W. F. Doolittle. 2000. A kingdom-level phylogeny of eukaryotes based on combined protein data. Science 290:972-977.

    Blume-Jensen, P., and T. Hunter. 2001. Oncogenic kinase signalling. Nature 411:355-365.

    Burge, C. B., and S. Karlin. 1998. Finding the genes in genomic DNA. Curr. Opin. Struct. Biol. 8:346-354.

    Dehal, P., Y. Satou, and R. K. Campbell, et al. (87 co-authors). 2002. The draft genome of Ciona intestinalis: insights into chordate and vertebrate origins. Science 298:2157-2167.

    Dhanasekaran, N., and E. Premkumar Reddy. 1998. Signaling by dual specificity kinases. Oncogene 17:1447-1455.

    Ferris, S. D., and G. S. Whitt. 1979. Evolution of the differential regulation of duplicate genes after polyploidization. J. Mol. Evol. 12:267-317.

    Force, A., M. Lynch, F. B. Pickett, A. Amores, Y. L. Yan, and J. Postlethwait. 1999. Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151:1531-1545.

    Gamulin, V., A. Skorokhod, V. Kavsan, I. M. Muller, and W. E. Muller. 1997. Experimental indication in favor of the introns-late theory: the receptor tyrosine kinase gene from the sponge Geodia cydonium. J. Mol. Evol. 44:242-252.

    Gates, M. A., L. Kim, E. S. Egan, T. Cardozo, H. I. Sirotkin, S. T. Dougan, D. Lashkari, R. Abagyan, A. F. Shier, and W. S. Talbot. 1999. A genetic linkage map for zebrafish: comparative analysis and localization of genes and expressed sequences. Genome Res. 9:334-347.

    Groffen, J., J. R. Stephenson, N. Heisterkamp, A. de Klein, C. R. Bartram, and G. Grosveld. 1984. Philadelphia chromosomal breakpoints are clustered within a limited region, bcr, on chromosome 22. Cell 36:93-99.

    Gu, X., Y. Wang, and J. Gu. 2002. Age distribution of human gene families shows significant roles of both large- and small-scale duplications in vertebrate evolution. Nat. Genet. 31:205-209.

    Hamaguchi, I., N. Yamaguchi, J. Suda, A. Iwama, A. Hirao, M. Hashiyama, S. Aizawa, and T. Suda. 1996. Analysis of CSK homologous kinase (CHK/HYL) in hematopoiesis by utilizing gene knockout mice. Biochem. Biophys. Res. Commun. 224:172-179.

    Hanks, S. K., and T. Hunter. 1995. Protein kinases 6. The eukaryotic protein kinase superfamily: kinase (catalytic) domain structure and classification. FASEB J. 9:576-596.

    Hedges, S. B. 2002. The origin and evolution of model organisms. Nat. Rev. Genet. 3:838-849.

    Higgins, D. G., J. D. Thompson, and T. J. Gibson. 1996. Using CLUSTAL for multiple sequence alignments. Methods Enzymol. 266:383-402.

    Hokamp, K., A. McLysaght, and K. H. Wolfe. 2003. The 2R hypothesis and the human genome sequence. J. Struct. Funct. Genomics 3:95-110.

    Hubbard, S. R., and J. H. Till. 2000. Protein tyrosine kinase structure and function. Annu. Rev. Biochem. 69:373-398.

    Hughes, A. L., and R. Friedman. 2003. 2R or not 2R: testing hypotheses of genome duplication in early vertebrates. J. Struct. Funct. Genomics 3:85-93.

    Hunter, T. 1998. The Croonian Lecture 1997. The phosphorylation of proteins on tyrosine: its role in cell growth and disease. Phil. Trans. R. Soc. Lond. Ser. B Biol. Sci. 353:583-605.

    Hunter, T., and G. D. Plowman. 1997. The protein kinases of budding yeast: six score and more. Trends Biochem. Sci. 22:18-22.

    Iwabe, N., K. Kuma, and T. Miyata. 1996. Evolution of gene families and relationship with organismal evolution: rapid divergence of tissue-specific genes in the early evolution of chordates. Mol. Biol. Evol. 13:483-493.

    King, N., and S. B. Carroll. 2001. A receptor tyrosine kinase from choanoflagellates: molecular insights into early animal evolution. Proc. Natl. Acad. Sci. USA 98:15032-15037.

    Klinghoffer, R. A., P. F. Mueting-Nelsen, A. Faerman, M. Shani, and P. Soriano. 2001. The two PDGF receptors maintain conserved signaling in vivo despite divergent embryological functions. Mol. Cell 7:343-354.

    Knezevich, S. R., D. E. McFadden, W. Tao, J. F. Lim, and P. H. Sorensen. 1998. A novel ETV6-NTRK3 gene fusion in congenital fibrosarcoma. Nat. Genet. 18:184-187.

    Kohmura, N., T. Yagi, Y. Tomooka, M. Oyanagi, R. Kominami, N. Takeda, J. Chiba, Y. Ikawa, and S. Aizawa. 1994. A novel nonreceptor tyrosine kinase, Srm: cloning and targeted disruption. Mol. Cell. Biol. 14:6915-6925.

    Kumar, S., K. Tamura, I. B. Jakobsen, and M. Nei. 2001. MEGA2: molecular evolutionary genetics analysis software. Bioinformatics 17:1244-1245.

    Lander, E. S., L. M. Linton, and B. Birren, et al. (>100 co-authors; International Human Genome Sequencing Consortium). 2001. Initial sequencing and analysis of the human genome. Nature 409:860-921.

    Leonard, C. J., L. Aravind, and E. V. Koonin. 1998. Novel families of putative protein kinases in bacteria and archaea: evolution of the "eukaryotic" protein kinase superfamily. Genome Res. 8:1038-1047.

    Litchfield, D. W. 2003. Protein kinase CK2: structure, regulation and role in cellular decisions of life and death. Biochem. J. 369:1-15.

    Lu, Q., M. Gore, and Q. Zhang, et al. (13 co-authors). 1999. Tyro-3 family receptors are essential regulators of mammalian spermatogenesis. Nature 398:723-728.

    Manning, G., D. B. Whyte, R. Martinez, T. Hunter, and S. Sudarsanam. 2002. The protein kinase complement of the human genome. Science 298:1912-1934.

    McLysaght, A., K. Hokamp, and K. H. Wolfe. 2002. Extensive genomic duplication during early chordate evolution. Nat. Genet. 31:200-204.

    Morris, S. W., M. N. Kirstein, M. B. Valentine, K. G. Dittmer, D. N. Shapiro, D. L. Saltman, and A. T. Look. 1994. Fusion of a kinase gene, ALK, to a nucleolar protein gene, NPM, in non-Hodgkin's lymphoma. Science 263:1281-1284.

    Mouse Genome Sequencing Consortium. 2002. Initial sequencing and comparative analysis of the mouse genome. Nature 420:520-562.

    Page, R. D. 1998. GeneTree: comparing gene and species phylogenies using reconciled trees. Bioinformatics 14:819-820.

    Page, R. D., and M. A. Charleston. 1997. From gene to organismal phylogeny: reconciled trees and the gene tree/species tree problem. Mol. Phylogenet. Evol. 7:231-240.

    Pires-daSilva, A., and R. J. Sommer. 2003. The evolution of signalling pathways in animal development. Nat. Rev. Genet. 4:39-49.

    Plowman, G. D., S. Sudarsanam, J. Bingham, D. Whyte, and T. Hunter. 1999. The protein kinases of Caenorhabditis elegans: a model for signal transduction in multicellular organisms. Proc. Natl. Acad. Sci. USA 96:13603-13610.

    Popovici, C., R. Roubin, F. Coulier, P. Pontarotti, and D. Birnbaum. 1999. The family of Caenorhabditis elegans tyrosine kinase receptors: similarities and differences with mammalian receptors. Genome Res. 9:1026-1039.

    Postlethwait, J. H., I. G. Woods, P. Ngo-Hazelett, Y. L. Yan, P. D. Kelly, F. Chu, H. Huang, A. Hill-Force, and W. S. Talbot. 2000. Zebrafish comparative genomics and the origins of vertebrate chromosomes. Genome Res. 10:1890-1902.

    Rajantie, I., N. Ekman, K. Iljin, E. Arighi, Y. Gunji, J. Kaukonen, A. Palotie, M. Dewerchin, P. Carmeliet, and K. Alitalo. 2001. Bmx tyrosine kinase has a redundant function downstream of angiopoietin and vascular endothelial growth factor receptors in arterial endothelium. Mol. Cell. Biol. 21:4647-4655.

    Robertson, S. C., J. A. Tynan, and D. J. Donoghue. 2000. RTK mutations and human syndromes: when good receptors turn bad. Trends Genet. 16:265-271.

    Robinson, D. R., Y. M. Wu, and S. F. Lin. 2000. The protein tyrosine kinase family of the human genome. Oncogene 19:5548-5557.

    Schultz, J., R. R. Copley, T. Doerks, C. P. Ponting, and P. Bork. 2000. SMART: a web-based tool for the study of genetically mobile domains. Nucleic Acids Res. 28:231-234.

    Semba, K., M. Nishizawa, H. Satoh, S. Fukushige, M. C. Yoshida, M. Sasaki, K. Matsubara, T. Yamamoto, and K. Toyoshima. 1988. Nucleotide sequence and chromosomal mapping of the human c-yes-2 gene. Jpn. J. Cancer Res. 79:710-717.

    Shimoyama, M., H. Matsuoka, A. Nagata, N. Iwata, A. Tamekane, A. Okamura, H. Gomyo, M. Ito, K. Jishage, and N. Kamada, et al. 2002. Developmental expression of EphB6 in the thymus: lessons from EphB6 knockout mice. Biochem. Biophys. Res. Commun. 298:87-94.

    Shiu, S. H., and A. B. Bleecker. 2001. Receptor-like kinases from Arabidopsis form a monophyletic gene family related to animal receptor kinases. Proc. Natl. Acad. Sci. USA 98:10763-10768.

    Shtivelman, E., B. Lifshitz, R. P. Gale, and E. Canaani. 1985. Fused transcript of abl and bcr genes in chronic myelogenous leukaemia. Nature 315:550-554.

    Sonnhammer, E. L. L., S. R. Eddy, E. Birney, A. Bateman, and R. Durbin. 1998. Pfam: multiple sequence alignments and HMM-profiles of protein domains. Nucleic Acids Res. 26:320-322.

    Steele, R. E., N. A. Stover, and M. Sakaguchi. 1999. Appearance and disappearance of Syk family protein-tyrosine kinase genes during metazoan evolution. Gene 239:91-97.

    Suga, H., M. Koyanagi, D. Hoshiyama, K. Ono, N. Iwabe, K. Kuma, and T. Miyata. 1999. Extensive gene duplication in the early evolution of animals before the parazoan-eumetazoan split demonstrated by G proteins and protein tyrosine kinases from sponge and hydra. J. Mol. Evol. 48:646-653.

    Suga, H., K. Kuma, N. Iwabe, N. Nikoh, K. Ono, M. Koyanagi, D. Hoshiyama, and T. Miyata. 1997. Intermittent divergence of the protein tyrosine kinase family during animal evolution. FEBS Lett. 412:540-546.

    Taylor, J. S., I. Braasch, T. Frickey, A. Meyer, and Y. Van de Peer. 2003. Genome duplication, a trait shared by 22000 species of ray-finned fish. Genome Res. 13:382-390.

    Texido, G., I. H. Su, I. Mecklenbrauker, K. Saijo, S. N. Malek, S. Desiderio, K. Rajewsky, and A. Tarakhovsky. 2000. The B-cell-specific Src-family kinase Blk is dispensable for B-cell development and activation. Mol. Cell. Biol. 20:1227-1233.

    van der Geer, P., T. Hunter, and R. A. Lindberg. 1994. Receptor protein-tyrosine kinases and their signal transduction pathways. Annu. Rev. Cell Biol. 10:251-337.

    Venter, J. C., M. D. Adams, and E. W. Myers, et al. (>100 co-authors). 2001. The sequence of the human genome. Science 291:1304-1351.

    Wolfe, K. H. 2001. Yesterday's polyploids and the mystery of diploidization. Nat. Rev. Genet. 2:333-341.

    Zdobnov, E. M., C. von Mering, and I. Letunic, et al. (36 co-authors). 2002. Comparative genome and proteome analysis of Anopheles gambiae and Drosophila melanogaster. Science 298:149-159.

Accepted for publication December 15, 2003.