Paralogy and Orthology of Tyrosine Kinases that Can Extend the Life Span of Caenorhabditis elegans

Brad A. Rikke, Shin Murakami, and Thomas E. Johnson

Institute for Behavioral Genetics, University of Colorado

Abstract

Modification of any one of three transmembrane protein tyrosine kinase (PTK) genes, old-1, old-2 (formerly tkr-1 and tkr-2, respectively), and daf-2 can extend the mean and maximum life span of the nematode Caenorhabditis elegans. To identify paralogs and orthologs, we delineated relationships between these three PTKs and all known transmembrane PTKs and all known mammalian nontransmembrane PTKs using molecular phylogenetics. The tree includes a number of invertebrate receptor PTKs and a novel mammalian receptor PTK (inferred from the expressed-sequence tag database) that have not previously been analyzed. old-1 and old-2 were found to be members of a surprisingly large C. elegans PTK family having 16 members. Interestingly, only four members of this transmembrane family appeared to have receptor domains (immunoglobulin-like in each case). The C-terminal domain of this family was found to have a unique sequence motif that could be important for downstream signaling. Among mammalian PTKs, the old-1/old-2 family appeared to be most closely related to the Pdgfr, Fgfr, Ret, and Tie/Tek families. However, these families appeared to have split too early from the old-1/old-2 family to be orthologs, suggesting that a mammalian ortholog could yet be discovered. An extensive search of the expressed-sequence tag database suggested no additional candidate orthologs. In contrast to old-1 and old-2, daf-2 had no C. elegans paralogs. Although daf-2 was most closely related to the mammalian insulin receptor family, a hydra insulin receptor–like sequence suggested that daf-2 might not be an ortholog of the insulin receptor family. Among PTKs, the old-1/old-2 family and daf-2 were not particularly closely related, raising the possibility that other PTK families might extend life span. On a more general note, our survey of the expressed-sequence tag database suggested that few, if any, additional mammalian PTK families are likely to be discovered. 
The one novel family that was discovered could represent a novel oncogene family, given the prevalence of oncogenes among PTKs. Finally, the PTK tree was consistent with nematodes and fruit flies being as divergent as nematodes and mammals, suggesting that life extension mechanisms shared by nematodes and fruit flies would be reasonable candidates for extending mammalian life spans.

Introduction

Protein tyrosine kinases (PTKs) orchestrate many of the cell-cell interactions necessary for growth, development, and repair (reviewed by van der Geer, Hunter, and Lindberg 1994). PTKs may even have played a crucial role in establishing the first metazoans, as PTKs are well represented in sponges and hydra (Ottilie et al. 1992; Schacke et al. 1994; Cetkovic et al. 1998; Muller 1998), but no conventional PTKs are found in yeast or plants (Walker 1994; Hunter and Plowman 1997; Suga et al. 1997; Chervitz et al. 1998; Satterlee and Sussman 1998). In mammals, multicellular organization is still highly dependent on the proper functioning of PTKs, as many PTKs become oncogenic when their activity is altered. Because many PTKs are proto-oncogenes, their identification and characterization have been intensely pursued (Wilks 1991), with approximately 100 PTKs having now been identified in humans (Hunter 1998).

Variants of three transmembrane PTK genes have been found to extend the life span of Caenorhabditis elegans. daf-2 partial-loss-of-function mutations of the receptor and catalytic (kinase) domains extend mean and maximum life spans by about 100% (Kenyon et al. 1993; Kimura et al. 1997). daf-2 is most similar in sequence to the mammalian genes for the insulin receptor family (Kimura et al. 1997). old-1 and old-2 (overexpression longevity determinant), formerly tkr-1 and tkr-2, extend mean and maximum life spans by about 65% and 20%, respectively, when their wild-type alleles are overexpressed (Murakami and Johnson 1998). old-1 and old-2 are about 90% identical and most similar in sequence to mammalian genes for mast/stem cell growth factor receptors (Kit) and fibroblast growth factor receptors (Fgfrs) (Murakami and Johnson 1998).

Although old-1 and daf-2 do not share sequence similarity outside their catalytic domains, they appear to extend life span by a shared signaling pathway. Life extension by both old-1 and daf-2 is dependent on daf-16 (Kenyon et al. 1993; unpublished data), encoding a transcription factor homologous to the forkhead family of fruit flies and mammals (Lin et al. 1997; Ogg et al. 1997). In addition, old-1, daf-2, and most other variants that extend C. elegans and Drosophila life spans confer resistance to multiple kinds of stress, such as oxidation, heat, and ultraviolet radiation (reviewed by Lithgow 1996; Murakami and Johnson 1996, 1998; Gems et al. 1998; Lin, Seroude, and Benzer 1998). This mechanistic conservation raises the possibility that additional PTKs, particularly paralogs and orthologs, might be capable of extending life span.

This study specifically addresses a number of questions regarding the molecular evolution of old-1, old-2, and daf-2: (1) Are there paralogs of old-1, old-2, and daf-2 (and if so, what are they)? (2) Are the mammalian genes that are most similar in sequence to old-1, old-2, and daf-2 likely to be orthologs? (3) Are there additional candidate orthologs? (4) How closely related are old-1, old-2, and daf-2? The term "paralogs" in this study refers to other members of a gene family within a species (arising by gene duplication). "Orthologs" refers to homologous genes created by speciation; thus, their divergence should be congruent with that of the species being compared.

Because the number of PTK families is quite large and many invertebrate genes and mammalian expressed-sequence tag (EST) sequences have only recently been added to databases, this study secondarily touches on several more general questions: (1) How congruent is the PTK tree with previous PTK trees? (2) Is it likely that many more mammalian PTK families will be discovered? (3) Is the divergence between nematodes and mammals greater than the divergence between nematodes and fruit flies, as previously concluded?

Materials and Methods

Databases searched to identify PTK families are shown in table 1. BLAST (Altschul et al. 1990) searches were conducted using the inferred amino acid sequences of old-1, old-2, the old-1 kinase domain, and an old-1/old-2 family consensus sequence. Final searches were current as of November 1998. Representatives from mammalian PTK families with no apparent C. elegans and Drosophila orthologs were used to BLAST search the European Molecular Biology Laboratory (EMBL) invertebrate database, C. elegans databases, and the Berkeley Drosophila Genome Project (BDGP) to help ensure that potential orthologs were not missed. Likewise, invertebrate PTKs with no apparent mammalian orthologs were used to BLAST search the SwissProt-translated EMBL database (SPTR).


Table 1 Listing of Databases Used to Search for Protein Tyrosine Kinase Families

 
The amino acids of unfinished sequences (DS02309, DS03465, C30F8, Y50D4), Caenorhabditis briggsae sequences (G40L08a, G40L08b, G40L08c), and ESTs (R61934, AI386314) were inferred using BLAST alignments and GCG FRAMEALIGN. The inferred amino acid sequences of several old-1/old-2 family members were also modified based on BLAST alignments and analysis of intron/exon junctions. Additional alterations not shown in the old-1/old-2 family sequence figure are as follows: (1) VRFFWGKSGANFQFLKKSEFPK was deleted from M01B2.1, (2) W04G5.6a and W04G5.6b were derived from W04G5.6, and (3) T17A3.1/F40G9.13 was constructed from the end of cosmid T17A3 and the beginning of cosmid F40G9 with the overlapping sequence GDRQRKSSS removed. Outside the old-1/old-2 family, the only change made was that LVKDHANQLEFNDFVEEIKLMKGIGYHKNIVX was added to the beginning of Q25196 (hydra).

Amino acid sequences were multiply aligned using the default settings of CLUSTAL W (Thompson, Higgins, and Gibson 1994). No improvement was achieved using secondary structure information (obtained from SwissProt). The only regions conserved among all PTK sequences were the kinase 1 and 2 domains, as defined by Hubbard et al. (1994). The alignment began with the ATP-binding site of the kinase 1 domain (LGXGXGF motif) and ended at the C-terminal domain. The kinase insert domain was removed. The final alignment covered a region of about 250 amino acids (excluding gaps).

Some alignment gaps were adjusted manually. Private gaps of more than three amino acids were deleted. Privacy was confirmed by BLAST searches of the SPTR database. For regions in which closely spaced gaps were separated by a short region of poor alignment, the entire region was deleted and ambiguous characters were added back (Q15516_human: deleted TAWNCT ... IPKTG and added 21 ambiguous characters; Q21477_m03a1: deleted NTTMGRGGGGG; Q24315_drome: deleted DAERM ... GKRHA and added 18 ambiguous characters; Q25177_hydat: deleted LALCP ... DHDTF and added 10 ambiguous characters; Q26566_fluke: deleted YNWHRGAQ and ILND ... SYSE).
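The private-gap rule above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used: it assumes that a "private gap" is a run of alignment columns in which exactly one sequence has residues and every other sequence has a gap character, and it deletes such runs when they are longer than three columns. The function name, the `-` gap character, and the toy sequences are hypothetical.

```python
def drop_private_insertions(alignment, min_run=4, gap="-"):
    """Delete runs of columns that are occupied by a single sequence.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    A column is treated as private when exactly one sequence has a residue.
    Runs of at least min_run consecutive private columns owned by the same
    sequence are removed from every row (min_run=4 means "more than three").
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")

    # owner[i] is the only sequence with a residue in column i, else None
    owner = []
    for i in range(length):
        holders = [n for n in names if alignment[n][i] != gap]
        owner.append(holders[0] if len(holders) == 1 else None)

    # Collect the column indices of sufficiently long private runs
    drop = set()
    start = 0
    while start < length:
        end = start
        while end < length and owner[end] == owner[start]:
            end += 1
        if owner[start] is not None and end - start >= min_run:
            drop.update(range(start, end))
        start = end

    return {n: "".join(c for i, c in enumerate(alignment[n]) if i not in drop)
            for n in names}


# Toy alignment (made-up sequences): seq_b carries a four-residue insertion
toy = {"seq_a": "MKT----LV", "seq_b": "MKTAAAALV", "seq_c": "MRT----LV"}
trimmed = drop_private_insertions(toy)
```

In this toy case the four columns occupied only by `seq_b` are removed, whereas a private run of three columns would be left in place.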

The final PTK tree was constructed using the pairwise distance matrix program FITCH (PHYLIP package; Felsenstein 1993). The invertebrate nonreceptor PTKs P18106 (FPS), P08630 (SRC2), P00528 (SRC1), P03949 (ABL1), C30F8 from WUGSC, Q24592 (HOP), Q24316 (PR2), and P53356 (TK16) were included in the tree construction to help subdivide long branches; however, they were excluded from the final figure to conserve space. The algorithm assumes an additive-tree model, uses the weighted least-squares criterion of Fitch and Margoliash (1967), and does not assume a molecular clock. The input order of sequences was jumbled 10 times. The distance matrix was constructed using PROTDIST (PHYLIP) and the Dayhoff PAM substitution matrix. No negative branch lengths were obtained. The tree was bootstrapped 100 times (input order jumbled once each time) using SEQBOOT (PHYLIP), PROTDIST, FITCH, and CONSENSE (PHYLIP). The C. elegans sequences T22B11.3, F59F5.3, F54F7.5, B0198.3, and Y38H6C.20 were not included in this bootstrapping because these sequences were added later (see Acknowledgments). These sequences were bootstrapped in the absence of the nonreceptor PTK sequences to reduce computing time, and the effect on the previous bootstrap results was negligible. The tree was rooted using two human mixed-lineage kinases (SPTR accessions P80192 and Q16584), two human LIM domain kinases (P53667 and P53671), and two RIP serine/threonine kinases (human Q13546 and murine Q60855). Trees were displayed and the final tree figure was drawn with the help of TREEVIEW (Page 1996).
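For readers unfamiliar with the two ideas in this paragraph, the following Python sketch illustrates them in miniature. It is not PHYLIP code and does not search tree space. The first function evaluates a weighted least-squares misfit between observed pairwise distances and the path-length distances implied by a candidate tree, assuming the 1/D^2 weighting associated with Fitch and Margoliash (1967). The second draws one bootstrap pseudoreplicate by resampling alignment columns with replacement, which is the general idea behind SEQBOOT. All function names and the toy data are hypothetical.

```python
import random


def fitch_margoliash_score(observed, patristic, power=2.0):
    """Weighted least-squares misfit: sum of (D - d)^2 / D^power over pairs.

    observed, patristic: dicts mapping frozenset({a, b}) -> distance, where
    observed holds the matrix distances (D) and patristic holds the distances
    measured along a candidate tree (d).
    """
    total = 0.0
    for pair, d_obs in observed.items():
        if d_obs == 0:
            continue  # sketch-level choice: skip zero distances to avoid dividing by zero
        total += (d_obs - patristic[pair]) ** 2 / d_obs ** power
    return total


def bootstrap_columns(alignment, rng):
    """Return one pseudoreplicate built by sampling columns with replacement."""
    length = len(next(iter(alignment.values())))
    picks = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in picks)
            for name, seq in alignment.items()}


# Toy distances (made-up values): the tree fits A-B exactly and misses A-C by 0.1
observed = {frozenset("AB"): 0.2, frozenset("AC"): 0.4}
patristic = {frozenset("AB"): 0.2, frozenset("AC"): 0.5}
score = fitch_margoliash_score(observed, patristic)

# One pseudoreplicate of a made-up two-sequence alignment
replicate = bootstrap_columns({"x": "ACDE", "y": "ACDF"}, random.Random(0))
```

A lower score means that the tree reproduces the distance matrix more closely. In a full bootstrap analysis, each pseudoreplicate would be passed through distance estimation and tree building, and the resulting trees summarized as a consensus.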

Deletion analysis of the PTK tree to investigate bootstrap support for grouping the old-1/old-2 family with mammalian Pdgfr, Fgfr, and Ret families was carried out by excluding the root sequences, C08H9.8, T01G5.1, T22B11.3, F59F5.3, F54F7.5, B0198.3, Y38H6C.20, Tie1, Tie2, R61934 (sea urchin EST), O76167 (planarian), P11475 (Drosophila torso), O76809 (hydra), Q25196m (hydra), Q25199 (hydra), AI386314 (murine EST), B0252.1, C24G6.2A, F09A5.2, F08F1.1, F09G2.1, Q24315 (Drosophila), T14E8.1, C25F6.4, C16D9.2, Q24592 (Drosophila jak), and ZK1067.1 (let-23). The tree was bootstrapped 100 times.

A maximum-likelihood tree of the old-1/old-2 family was constructed using PUZZLE 4.0 (Strimmer, Goldman, and von Haeseler 1997), with eight classes of substitution rate heterogeneity and the BLOSUM 62 substitution matrix (Henikoff and Henikoff 1992). The sequence alignment began with the first residue that is reverse highlighted (P) in the sequence figure and ended with the last residue that is reverse highlighted (W). The tree thus included the two kinase domains, the kinase insert domain, and the C-terminal domain. The tree was rooted using the Fgfr family. Because PUZZLE uses a heuristic approach (quartet puzzling and neighbor joining) to identify the best tree, an exhaustive search was also conducted using the maximum-likelihood program PROTML (Adachi and Hasegawa 1996). There was no appreciable difference between the best trees found using PUZZLE and those found using PROTML. There was also no appreciable difference between trees constructed using maximum-likelihood and those constructed using GCG PAUP (version 4.0.0d55 for UNIX; Swofford 1993) and PROTPARS (PHYLIP).

The genes flanking old-1/old-2 family members were identified from the GeneFinder sequence annotations of the Sanger Centre and WUGSC databases. A BLAST search of the SPTR and GenBank databases was also conducted using the first 5 kb of sequence upstream and downstream of each old-1/old-2 family member.

Nonkinase intra- and extracellular PTK domains were identified using SwissProt and Pfam (www.sanger.ac.uk/Software/Pfam/search.shtml). Some domains were inferred based on sequence similarity to a known domain of a related sequence (e.g., the IG domain of O76167 was based on similarity to P18460).

We searched for sequence motifs using Meta-Meme, version 2.0.1 (Grundy et al. 1997; http://metameme.sdsc.edu/mhmm-links.html). Meta-Meme was also used to conduct homology searches of the National Center for Biotechnology Information's nonredundant peptide sequence database using the old-1/old-2 family C-terminal domain motif.

We searched for ESTs of the old-1/old-2 family using the C. elegans EST database at http://www.ddbj.nig.ac.jp/c-elegans/html/CE_BLAST.html (May 1999).

The EMBL/GenBank EST database was searched using the amino acid sequences of old-1, the last 25 residues of old-1, an old-1/old-2 family consensus, the last 31 residues of the old-1/old-2 family kinase domain, and an old-1/old-2 family consensus of the C-terminal domain. When last searched, the EST database contained 2.1 million entries (January 15, 1999). The only known mammalian PTK families not detected were Ros/Sevenless and AATK, whose ESTs did not overlap significantly with the PTK catalytic domain. Hits from the full-length consensus sequence query (all kinase domain hits) having expected probabilities of <5.0 (approximately 1,000 ESTs) were used as BLAST queries of the EMBL nucleotide database. ESTs that were less than 85% identical to a known sequence or whose hits covered less than 190 nt were analyzed further by using all six reading frames to BLAST query the SPTR or Entrez protein databases. ESTs whose best protein hits were PTKs but whose best sequence identity with a PTK kinase domain was less than 80% were included in the tree as possible representatives of novel PTK families. (The 80% cutoff was a conservative estimate designed to include false positives, given that the identity between all known transmembrane PTK families is <70%.) Only three ESTs (one murine, two invertebrate) appeared to represent potentially novel families. The murine EST was compared with the EST consensus sequences compiled by The Institute for Genomic Research (TIGR, http://www.tigr.org/tigr_home/tdb/), but no contiguous sequence data were available.
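The two-stage EST triage described above can be summarized as a pair of predicates. The sketch below is only an illustration of the stated cutoffs (85% nucleotide identity, 190 nt of hit coverage, and 80% kinase-domain identity). The record layout, field names, and example values are hypothetical and are not taken from the actual search output.

```python
from dataclasses import dataclass


@dataclass
class EstHit:
    est_id: str                # placeholder identifier
    nt_identity: float         # % identity to the best known nucleotide sequence
    nt_hit_length: int         # nucleotides covered by that hit
    best_protein_is_ptk: bool  # True if the best protein hit is a PTK
    kinase_identity: float     # % identity to the best PTK kinase domain


def needs_protein_search(hit: EstHit) -> bool:
    """Stage 1: the EST is not accounted for by a known nucleotide sequence."""
    return hit.nt_identity < 85.0 or hit.nt_hit_length < 190


def is_candidate_novel_family(hit: EstHit) -> bool:
    """Stage 2: the EST looks like a PTK but is under the 80% identity cutoff."""
    return (needs_protein_search(hit)
            and hit.best_protein_is_ptk
            and hit.kinase_identity < 80.0)


# Made-up records covering each outcome
hits = [
    EstHit("est_known", 98.0, 400, True, 97.0),   # matches a known sequence
    EstHit("est_short", 99.0, 120, True, 95.0),   # short hit, but a known PTK
    EstHit("est_novel", 70.0, 300, True, 62.0),   # divergent PTK
    EstHit("est_other", 60.0, 300, False, 0.0),   # best protein hit is not a PTK
]
candidates = [h.est_id for h in hits if is_candidate_novel_family(h)]
```

Only the divergent PTK record passes both stages in this toy list. Because the 80% cutoff is deliberately permissive, such candidates still have to be placed in the tree before they can be interpreted.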

The GenBank EST database was also searched using GCG FRAMESEARCH and GCG PROFILESEARCH (weighted matrix) with the old-1/old-2 family C-terminal domains as queries. The significance of the two best hits was determined by comparison against a C-terminal domain consensus sequence using GCG FRAMEALIGN. Hits with Z scores of >5 were also examined using Entrez and BLAST.

Access to most of the databases, programs, and program packages (BLAST, BLOCKS, CLUSTAL W, GCG, MPSEARCH, PHYLIP, ProDom, PROTML, and PUZZLE) was obtained using the U.K. Human Genome Mapping Project online computer service (Rysavy et al. 1992).

Results

Paralog and Ortholog Searches Using BLAST Queries
A BLAST search of the SwissProt database using full-length old-1 and old-2 amino acid sequences resulted in two outstanding hits, kin-15 and kin-16, both from C. elegans. BLAST searches of the TrEMBL database and Sanger Centre/WUGSC C. elegans database, covering virtually the entire C. elegans genome (C. elegans Sequencing Consortium 1998), resulted in eight additional outstanding hits from cosmids R09D1, F59F3, W04G5, M01B2, T17A3/F40G9, and yeast artificial chromosome Y50D4 (expected probabilities <10^-50 vs. >10^-30 for all other hits). To help ensure that potential paralogs were not overlooked, the top 60 hits from the C. elegans database were used as BLAST queries of the SPTR and GenBank databases. Those queries whose best hits were transmembrane PTKs were then included in the molecular phylogenetic tree.

The top seven old-1 nonnematode hits from an ungapped BLAST search of the SPTR database were all Kits (human, horse, pig, bovine, guinea pig, cat, and rat), with expected probabilities of <10^-57. However, a gapped BLAST search (Altschul et al. 1997) resulted in 22 top nonnematode hits that were all Fgfrs (human, mouse, rat, chicken, frog, newt, and fruit fly). The Kit and Fgfr hits were then used to query the EMBL invertebrate database. Similar to previous strategies for finding orthologs (Tatusov, Koonin, and Lipman 1997; Chervitz et al. 1998), the expectation was that if Kits and/or Fgfrs were orthologs of old-1 and old-2, they would reciprocate by finding old-1 and old-2 as their best C. elegans hits. However, the best C. elegans hit for each Kit and Fgfr was egl-15, not old-1 or old-2 (or any other member of the old-1/old-2 family). Therefore, no PTKs stood out as probable orthologs of old-1 and old-2.
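The reciprocity test used here is commonly called a reciprocal-best-hit check, and its logic can be sketched as follows. This is a minimal illustration rather than the search procedure itself. The function names are hypothetical, and the E-values in the example are placeholders chosen only to reproduce the qualitative outcome reported above (the best C. elegans hit for Kit and Fgfr being egl-15 rather than old-1).

```python
def best_hit(hits):
    """Return the subject with the lowest E-value, or None if there are no hits."""
    return min(hits, key=hits.get) if hits else None


def reciprocal_best_hits(forward, reverse):
    """Find query/subject pairs that pick each other as best hit.

    forward: dict mapping query -> {subject: E-value} for searches of species B.
    reverse: dict mapping subject -> {query: E-value} for searches of species A.
    """
    pairs = []
    for query, hits in forward.items():
        subject = best_hit(hits)
        if subject is not None and best_hit(reverse.get(subject, {})) == query:
            pairs.append((query, subject))
    return pairs


# Placeholder E-values (NOT measured values), chosen for illustration only
forward = {
    "old-1": {"Kit": 1e-57, "Fgfr": 1e-50},
    "egl-15": {"Fgfr": 1e-80, "Kit": 1e-40},
}
reverse = {
    "Kit": {"egl-15": 1e-60, "old-1": 1e-55},
    "Fgfr": {"egl-15": 1e-80, "old-1": 1e-50},
}
pairs = reciprocal_best_hits(forward, reverse)
```

With these placeholder scores, old-1 selects Kit but Kit selects egl-15, so the old-1 pairing fails the reciprocity test while egl-15 and Fgfr pass it.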

A more extensive survey (see Materials and Methods) was then conducted to (1) identify all of the potential orthologs of old-1 and old-2, (2) identify all of the potential orthologs of daf-2, (3) identify all of the PTK families that may have evolved from the last common ancestor of old-1, old-2, and daf-2, (4) identify as many PTK families as possible having putative mammalian and nematode orthologs, and (5) include as many PTK families as necessary to subdivide long branches of the PTK tree. The end result was that approximately 250 different (i.e., >5% divergent) vertebrate and invertebrate PTKs were identified, representing all known PTK families.

In aggregate, approximately 200 different PTK catalytic domains (noncatalytic domains were not widely conserved and thus were excluded) were examined by molecular phylogenetic analysis. Most trees were constructed using the Fitch-Margoliash distance matrix program FITCH, which does not assume a molecular clock. Parsimony methods (PAUP and PHYLIP PROTPARS) were also used to examine approximately 100 PTKs from families shown in figure 1 as being most closely related to old-1, old-2, and daf-2. The best parsimony trees (not shown) were not appreciably different from those produced using FITCH, but parsimony trees required much more computing time. These exploratory analyses suggested that old-1, old-2, and daf-2 were more closely related to transmembrane PTKs than to nontransmembrane PTKs, and that nontransmembrane PTK families tended to cluster together. Therefore, most of the invertebrate nontransmembrane PTKs were excluded to reduce computing time (some were retained to subdivide long branches). In addition, most vertebrate PTKs whose catalytic domains were more than 70% identical to that of a human PTK were excluded because these did not contribute significantly to subdividing long branches or distinguishing families. The final tree of 144 PTKs still contained representatives from all known vertebrate nontransmembrane and transmembrane PTK families (fig. 1).
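The 70% identity filter can be illustrated with a short Python sketch. The text does not state how percent identity was computed, so the denominator used here (columns in which both aligned sequences have a residue) is an assumption, as are the function names and the made-up sequences.

```python
def percent_identity(a, b, gap="-"):
    """Percent identity over columns where both aligned sequences have residues."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not compared:
        return 0.0
    return 100.0 * sum(x == y for x, y in compared) / len(compared)


def filter_redundant(candidates, references, cutoff=70.0):
    """Keep candidates that are at most cutoff% identical to every reference."""
    return {
        name: seq
        for name, seq in candidates.items()
        if all(percent_identity(seq, ref) <= cutoff for ref in references.values())
    }


# Made-up aligned fragments: one near-duplicate and one divergent sequence
references = {"human_ptk": "ACDEFGHIKL"}
candidates = {"close_homolog": "ACDEFGHIKV", "divergent": "ACDWWWWWWW"}
kept = filter_redundant(candidates, references)
```

Here the near-duplicate (90% identical) is dropped and the divergent sequence (30% identical) is retained, mirroring the intent of removing sequences that add little to subdividing long branches.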



  Fig. 1.—Protein tyrosine kinase tree. The tree is drawn to scale. The unit on the scale bar represents 10% divergence after correcting for multiple substitutions. Branches are labeled according to SPTR accession numbers followed by a gene name, descriptive name abbreviation, or cosmid coding sequence name. An accession number followed by an "m" indicates that the SPTR sequence was modified. Numbers at nodes indicate bootstrap support. The tree was bootstrapped 100 times. Black branches indicate human genes unless specified otherwise. Red branches indicate Caenorhabditis elegans genes. Blue branches indicate Drosophila melanogaster genes and one Drosophila virilis (sevenless) gene. Green branches indicate other invertebrate species as indicated. The yellow column indicates the nematode-mammal split. The tree was rooted using mixed lineage and serine/threonine kinase genes. To simplify the diagram, some nonsignificant bifurcations were removed. Roman numerals indicate family divisions according to Hanks and Hunter (1995). Family names without Roman numerals are families mentioned in the text but were not included in Hanks and Hunter's classification. Squares and names within brackets indicate extracellular domain homologies: black = immunoglobulin [IG]; white = fibronectin type III [FN3], blue = receptor L and furin-like cysteine rich; [EGF] = epidermal growth factor. Circles indicate intracellular domains: black = Src homology 2, white = Src homology 3; blue = sterile alpha motif

 
An old-1/old-2 Family
As shown in figure 1, old-1 and old-2 were found to be members of a large gene family in C. elegans which will be referred to as the old-1/old-2 family and coincides with the Kin15/Kin16 family (Hanks and Hunter 1995). This family included 14 other C. elegans genes, 4 more than suggested by BLAST searches and 5 more than previously suggested for the Kin15/Kin16 family (Ruvkun and Hobert 1998). Ignoring F54F7.5 and B0198.3 (which occasionally clustered within the old-1/old-2 family), bootstrap support clustering the 16 genes together was 100%. Only two family members, T01G5.1 and F59F3.1, were represented in the C. elegans EST database. Three genes from the nematode C. briggsae also clustered with the old-1/old-2 family and will be discussed below.

Within the old-1/old-2 family, only kin-15 and kin-16 have been functionally characterized. Expression patterns suggested that both genes are involved in postembryonic development of the hypodermis (Morgan and Greenwald 1993). In spite of their apparent functional similarity, however, a kin-15 + kin-16 clade was not strongly supported. Interestingly, kin-16 had a greater tendency (51 bootstrap trees vs. 7) to group with clade T17A3.1/F40G9.13 + T17A3.8 + F59F3.1 + F59F3.5 than with kin-15.

It was also interesting that T17A3.1/F40G9.13, T17A3.8, F59F3.1, and F59F3.5 were the only old-1/old-2 family members to have recognizable receptor domain sequences. These domains were similar to the immunoglobulin-type domains (IG domains) that are found in many different proteins, including many receptor PTKs (fig. 1). These IG domains thus suggested a receptor function for this branch of the old-1/old-2 family rather than the adapter function proposed by Morgan and Greenwald (1993) for kin-15 and kin-16. F59F3.1 and F59F3.5 appeared to have seven IG-like domains, a number that among PTKs is found only in Vegfrs.

The tree also suggested several vertebrate gene families as being most closely related to the old-1/old-2 family. One of these families includes the macrophage colony stimulating factor receptor (Fms), Kit, FMS-like tyrosine kinase 3 (Flt3), platelet-derived growth factor receptors (Pdgfrs), and Vegfrs and will be referred to hereafter as the Pdgfr family according to the classification of Hanks and Hunter (1995). The other gene families were Fgfr, the ret proto-oncogene (Ret), and the "tyrosine kinase with IG and EGF homology/tunica interna endothelial cell tyrosine kinase" receptors (Tie/Tek). The bootstrap support grouping the old-1/old-2 family with these mammalian families was low, largely due to variability in the positioning of other invertebrate sequences and due to 28% of bootstrap trees placing the old-1/old-2 family as an outgroup of all the other PTK families. Ignoring these, 48 of 72 trees (67%) grouped the old-1/old-2 family with the Pdgfr, Fgfr, Ret, and Tie/Tek families, 12 trees grouped the old-1/old-2 family with the Egfr family, and only 1–3 trees supported other groupings. Similarly, deleting some of the more divergent sequences from the tree (see Materials and Methods) increased the bootstrap support grouping the old-1/old-2 family with the Pdgfr, Fgfr, and Ret families to 69%. Although the bootstrap support was still not high, the Pdgfr, Fgfr, Ret, and Tie/Tek families stood out as the best candidates for orthology with the old-1/old-2 family. As described below, shared intron sites and shared C-terminal domain sequences provided additional support. The second best candidate for orthology with the old-1/old-2 family appeared to be the Egfr family.
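The conditional percentage quoted above follows from simple arithmetic on the bootstrap counts, which the short sketch below makes explicit. The helper name is hypothetical. The inputs (100 bootstrap trees, 28 outgroup placements, 48 and 12 supporting trees) are taken from the text, and the percentage for the Egfr grouping is derived here on the same basis rather than quoted from it.

```python
def conditional_support(supporting, total, ignored):
    """Percent of the non-ignored bootstrap trees that support a grouping."""
    remaining = total - ignored
    return 100.0 * supporting / remaining


total_trees = 100      # bootstrap replicates
outgroup_trees = 28    # 28% placed the family as an outgroup of all other PTKs

remaining_trees = total_trees - outgroup_trees                        # 72
pdgfr_group = conditional_support(48, total_trees, outgroup_trees)    # about 67
egfr_group = conditional_support(12, total_trees, outgroup_trees)     # about 17
```

Expressed against all 100 replicates rather than the 72 retained ones, the same 48 trees would correspond to 48% support, which is why the choice of denominator matters when reading these values.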

Like the old-1/old-2 family, the Pdgfr, Fgfr, Ret, and Tie/Tek families encode transmembrane PTKs. With the exception of Ret, the receptor domains of these families also contain IG domains. However, the PTK tree of figure 1 suggested that each of these mammalian families was more closely related to egl-15 than to the old-1/old-2 family, which was consistent with the BLAST results. The relatedness between these mammalian PTK families and egl-15 also agrees with previous molecular phylogenetic analyses (Rousset et al. 1995; Coulier et al. 1997).

The PTK tree was then used to investigate whether the Pdgfr, Fgfr, Ret, and Tie/Tek families might be considered probable orthologs of the old-1/old-2 family by comparing their split with the range of splits observed between other vertebrate PTKs and their probable orthologs. As shown in figure 1 (yellow column), the range of orthologous C. elegans–mammal splits also corresponded closely with other orthologous invertebrate-mammal splits. However, the split between the old-1/old-2 family and the Pdgfr, Fgfr, Ret, and Tie/Tek families preceded the probable range of orthologous invertebrate-mammal splits (likely to include some nonorthologous splits as well). Thus, the Pdgfr, Fgfr, Ret, and Tie/Tek families appear to have diverged too early from the old-1/old-2 family to be orthologs. Grouping the old-1/old-2 family with the Egfr family also resulted in a split that preceded the range of probable orthologous invertebrate-mammal splits.

It was curious that the old-1/old-2 family did not group more closely with the Pdgfr family given sequence similarities in both the kinase and the IG domains and given the absence of a Pdgfr ortholog in C. elegans. Because most members of the old-1/old-2 family did not have an IG domain and appeared to have evolved faster than the IG branch, we asked whether the IG branch might be more representative of the old-1/old-2 family progenitor. However, when the non-IG branch (with or without kin-16) was excluded from the tree, the genetic distance between the IG branch and the Pdgfr family increased rather than decreased, and the IG branch became an outgroup of all of the other PTK families. (When the IG branch, the C. briggsae branches, and the four longest C. elegans branches were excluded, the non-IG branch gave the same topology as the old-1/old-2 family as a whole.) Therefore, the kinase domain sequences of just the IG branch of the old-1/old-2 family also did not suggest an orthologous relationship with the Pdgfr family (or any other family).

We also noticed that in BLAST searches using an old-1/old-2 IG branch receptor domain consensus sequence, the only significant PTK hits (several non-PTKs gave similar alignment scores) in the SwissProt database were from the Pdgfr family (probability by chance <10^-6 vs. <2 for Fgfrs). However, the alignment scores were quite low (<51 bits), and significance was due to the detection of extended regions of low identity (about 16%) with Vegfrs. In the region of about 180 residues where the F59F3.1 receptor domain overlapped both the human Vegfr1 (the best Pdgfr hit) and the human Fgfr4, the number of identities with Vegfr1 and Fgfr4 was actually the same (50). The IG-like receptor domains of the old-1/old-2 family and Pdgfrs also did not reciprocate as each other's best BLAST hits in species comparisons. In contrast, in the region of about 200 residues in which the egl-15 receptor domain overlapped both the rat Vegfr1 (the best Pdgfr hit) and the rat Fgfr1, the number of identities with Fgfr1 was 22% greater than the number of identities with Vegfr1 (62 vs. 51). In addition, egl-15 and Fgfr receptor domains reciprocated as each other's best BLAST hits in species comparisons.

old-1/old-2 Family Genomic Organization
We also examined the genomic organization of the old-1/old-2 family and found that most members were arranged as tandem gene pairs, and many members were flanked by chitinase genes (fig. 2A). This organization suggested that some old-1/old-2 subfamilies might have arisen as trans duplications, perhaps similar to those of the Pdgfr family (Rousset et al. 1995). None of the subfamily arrangements, however, appeared to correspond to the subfamily arrangements of the Pdgfr family. Furthermore, Pdgfr family genes are not known to be flanked by chitinase-like genes, and the known mammalian chitinase-like genes (Jin et al. 1998) do not colocalize with any known PTK. The PTK tree clearly indicated that the gene pairs old-1 and old-2, W04G5.6a and W04G5.6b, and F59F3.5 and F59F3.1 arose as a result of cis duplications.



Fig. 2.—old-1/old-2 family genomic organization. Arrows point in the direction of transcription. Large arrows indicate old-1/old-2 family members. Small arrows indicate the genes that most closely flank the old-1/old-2 family (inferred by GeneFinder and/or BLAST). chi = chitinase gene

 
Given that old-1 and old-2 were two of the old-1/old-2 family members flanked by chitinase genes, we also reasoned that the paralogs most closely related to old-1/old-2 might be flanked by chitinase genes. It was found that three of the five other genes flanked by chitinase genes (R09D1.12, M01B2.1, and C08H9.8) were also suggested by the PTK tree to be most closely related to old-1 and old-2.

old-1/old-2 Family Intron Sites
To further elucidate relationships within the old-1/old-2 family, we examined intron/exon junctions (fig. 3). Preceding the kinase domain, the intron/exon junctions were not well conserved. Nevertheless, the N-terminal domains of old-1 and old-2 shared an intron site with R09D1.12 and kin-16. Given the close relationship suggested by the tree between R09D1.12 and M01B2.1, it was curious that M01B2.1 did not share this site. Moreover, it was curious that the 5' end of M01B2.1 seemed to terminate prematurely. Therefore, we searched and found that M01B2.1 could indeed have an intron at exactly the same shared site; however, an initiating methionine codon for the putative upstream exon was not found.



  Fig. 3.—old-1/old-2 family amino acid sequence alignment, modifications, conserved residues, and intron/exon junctions. Equal signs indicate gaps used to improve the alignment. Periods indicate residues identical to old-1. A caret, a less-than symbol, or a greater-than symbol indicates an intron between codons, within the codon to the left, or within the codon to the right, respectively. The boxed region indicates an immunoglobulin-like domain. Reverse highlighting in the N-terminal domain indicates a putative RVxRAxExDD motif. Reverse highlighting elsewhere indicates residues that are present in >=50% of the sequences. Sequence modifications and additions based on our analysis of the nucleotide sequences are shown in lowercase letters. The @ symbol preceding sequences 13–16 and the asterisk (*) following sequences 14 and 15 indicate additional residues not shown. Domain boundaries are approximate and based on Hubbard et al. (1994)

 
In the juxtamembrane domain, kin-16 shared an intron site with the IG branch genes, consistent with kin-16 being more closely related to the IG branch than to kin-15 as suggested by the tree. Also in the juxtamembrane domain, old-1 and old-2 again shared an intron site with R09D1.12 and M01B2.1.

Within the kinase and C-terminal domains, intron sites were highly conserved. All members of the C. elegans old-1/old-2 family shared six of six intron sites, often with the intron/exon junctions occurring within or between the same codons. Finding that these intron sites were highly conserved, we predicted and found six additional exons (one partial) that were not identified in the C. elegans database (fig. 3, residues in lowercase letters). These included the C-terminal exons of R09D1.12, C08H9.8, W04G5.6a, and W04G5.6b. We also found that the old-1/old-2 family shared four intron sites with Pdgfr subfamilies, three with human Fgfr4, one with egl-15, and one with the C. elegans Ys3j family.
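
The comparison of intron sites between two rows of an alignment can be sketched as follows. This is an illustration only, not the procedure used in this study: it assumes that each row is a plain string annotated with the marker symbols described in the legend of figure 3 (a caret, a less-than symbol, or a greater-than symbol), and the two toy rows are hypothetical sequences rather than real family members.

```python
# Intron markers as described in the fig. 3 legend (assumed encoding):
#   '^' intron between codons
#   '<' intron within the codon to the left
#   '>' intron within the codon to the right
# Any other character (residue or '=' gap) occupies one alignment column.
MARKERS = "^<>"


def intron_sites(annotated_row):
    """Return the set of (alignment_column, marker) pairs in one row.

    alignment_column is the number of residue/gap characters to the left
    of the marker, so sites are comparable across rows of one alignment.
    """
    sites = set()
    column = 0
    for ch in annotated_row:
        if ch in MARKERS:
            sites.add((column, ch))
        else:
            column += 1
    return sites


def shared_sites(row_a, row_b):
    """Intron sites that occur at the same column with the same marker."""
    return intron_sites(row_a) & intron_sites(row_b)


# Hypothetical toy rows (not real old-1/old-2 family sequences).
row_a = "MKT^LLV=G<AR"
row_b = "MRT^LIVSG>AR"
print(shared_sites(row_a, row_b))
```

Requiring both the column and the marker to match is the strictest reading of a "shared" site; a looser comparison could ignore the marker and compare columns only.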

old-1/old-2 Family Sequence Features Outside the Kinase Domain
Shared sequence features among old-1/old-2 family members in regions outside the kinase domain were also examined. The N-terminal domain (the receptor domain of transmembrane PTKs) has previously been noted to be unusually short for kin-15 and kin-16 and thus more likely to function as an adapter rather than as a receptor (Morgan and Greenwald 1993). As shown in figure 3, such short domains (20–80 residues) appeared to be characteristic of the non-IG members of the old-1/old-2 family. No other transmembrane PTK family, including Ryk (Hovens et al. 1992), had N-terminal domains that were nearly as short. Although short, the domain lengths appeared to be quite variable. No similarities to known domains or sequence motifs were found using Meta-Meme, BLOCKs, Pfam, or BLAST. However, a sequence motif (R-[VI]-x-R-[AS]-x-E-x-D-D) was found among the non-IG family members that appeared to be homologous to a motif within the IG domains of all four IG members (fig. 3). Among non-IG members (excluding kin-16), this sequence motif was shared by old-1, old-2, R09D1.12, and M01B2.1, again supporting R09D1.12 and M01B2.1 as being most closely related to old-1 and old-2.

The N-terminal domains of the IG members of the old-1/old-2 family tended to be in the size range typically associated with PTK receptor domains. T17A3.8, however, had only one IG domain and thus appeared to have a short N-terminal domain as well.

The transmembrane domain of 15–20 hydrophobic residues was well conserved among the old-1/old-2 family members. Although this domain appeared to be missing in W04G5.6a, Y50D4, and T01G5.1, their 5' ends were not well defined. The juxtamembrane domain sequence was not well conserved except for its C-terminal end, which might actually be part of the kinase 1 domain. Similarly, the kinase insert domain was only well conserved at its N-terminal end, which could be part of the kinase 1 domain.

Interestingly, almost all of the C-terminal domain sequence was conserved among the old-1/old-2 family members. The first half of this domain was similar to Pdgfrs and Fgfrs. The last half of this domain, however, was unique, with a conserved 16-residue sequence motif of [KR]-[LI]-x-x-E-x-[NE]-[EN]-Q-x-x-L-x-[ND]-W-I (fig. 3). Because the C-terminal domain of PTKs is typically involved in downstream signaling, this motif may have important functional implications that are unique to the old-1/old-2 family. A Meta-Meme search suggested a potential hit (11:1 odds) for this motif within the C-terminal domain of insect mitochondrial cytochrome oxidase II subunits, but this motif is not associated with a known function.
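
For readers who wish to scan other sequences for the N-terminal and C-terminal motifs given above, the motif notation can be translated into a regular expression. The following is a minimal sketch and not the Meta-Meme search used here: the helper names `prosite_to_regex` and `find_motif` are our own, only the three element types that appear in the two motifs are supported, and the test sequences are hypothetical strings built to contain a match.

```python
import re

# The two motifs exactly as written in the text.
N_TERMINAL_MOTIF = "R-[VI]-x-R-[AS]-x-E-x-D-D"
C_TERMINAL_MOTIF = "[KR]-[LI]-x-x-E-x-[NE]-[EN]-Q-x-x-L-x-[ND]-W-I"


def prosite_to_regex(pattern):
    """Compile a simple dash-separated motif into a regular expression.

    Supported elements: 'x' (any residue), a single residue letter,
    and a bracketed set of alternatives such as '[VI]'.
    """
    parts = []
    for element in pattern.split("-"):
        if element == "x":
            parts.append("[A-Z]")
        elif len(element) == 1 and element.isalpha():
            parts.append(element)
        elif (element.startswith("[") and element.endswith("]")
              and element[1:-1].isalpha()):
            parts.append(element)
        else:
            raise ValueError(f"unsupported motif element: {element!r}")
    return re.compile("".join(parts))


def find_motif(pattern, sequence):
    """Return (start_index, matched_text) for each non-overlapping match."""
    regex = prosite_to_regex(pattern)
    return [(m.start(), m.group()) for m in regex.finditer(sequence.upper())]


# Hypothetical sequences constructed to contain one match each.
print(find_motif(N_TERMINAL_MOTIF, "MSTRVARAQEVDDLLK"))
print(find_motif(C_TERMINAL_MOTIF, "GGKLAAEANEQTTLSNWIPP"))
```

An exact-match scan of this kind is stricter than a profile or hidden Markov model search, so it would miss diverged copies of a motif that the latter might detect.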

Because conserved regions of the old-1/old-2 family extended beyond the kinase domain, we asked whether including these regions might improve the resolution of the old-1/old-2 family tree. Maximum-likelihood and parsimony trees (see Materials and Methods) were constructed that included the conserved region just upstream of the kinase domain, the kinase insert domain, and the C-terminal domain. These trees (not shown) supported the same old-1/old-2 subfamilies as the large PTK tree, but the relationships between subfamilies were no better resolved.

EST Search
Because none of the known mammalian PTK families appeared to be probable orthologs of the old-1/old-2 family, additional candidate orthologs were searched for in the EMBL/GenBank EST database (see Materials and Methods). Approximately 360 mammalian PTK BLAST hits were obtained (almost entirely from mice and humans), representing all but two of the known PTK families. Only one EST appeared to represent a novel mammalian PTK family. This murine EST (accession number AI386314) covered 139 residues of the PTK kinase domain and was 100% identical to another, slightly smaller, EST (accession number AA0980224). On the tree, this EST grouped most closely with the mammalian Pdgfr, Fgfr, Ret, and Tie/Tek families (fig. 1). The most closely related C. elegans genes were B0252.1 and the Ys3j family. Although B0252.1 had a long branch length, the EST still grouped best with the C. elegans Ys3j family when B0252.1 was removed. No bootstrap trees grouped the EST with the old-1/old-2 family, making it seem unlikely that this EST is an old-1/old-2 ortholog. As a representative of a novel PTK family, this EST could be useful for discovering additional proto-oncogenes.

There were also two invertebrate ESTs that appeared to represent novel PTKs. One, from the sea urchin (accession number R61934), belonged to the Tie/Tek family (fig. 1). The other, from the nematode Onchocerca (accession number AI087777), appeared to be a nontransmembrane PTK and thus was not included in the tree.

daf-2
daf-2 did not appear to have any C. elegans paralogs (fig. 1). With regard to orthology, daf-2 received high bootstrap support for being most closely related to the mammalian Insr family, which is consistent with previous sequence comparisons (Kimura et al. 1997). Based on the range of probable nematode-mammal splits, the divergence between daf-2 and the Insr family was also congruent with orthology. It was curious, however, that a hydra gene (TK7) was more closely related to the Insr family than daf-2 was, with a very high bootstrap score of 99. This hydra gene thus suggests that daf-2 and the mammalian Insr family might not be orthologous.
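
A bootstrap score for a grouping is the percentage of bootstrap replicate trees in which that grouping appears. The tally can be sketched as follows. This is an illustration under stated assumptions, not the software used for our trees: each replicate tree is assumed to have already been inferred and reduced to a set of clades (sets of taxon names), the function name `clade_support` is our own, and the four toy replicates are hypothetical and do not reproduce the score of 99 reported above.

```python
def clade_support(replicates, clade):
    """Percentage of bootstrap replicate trees that contain `clade`.

    replicates: a list of trees, each given as a set of frozensets,
                one frozenset of taxon names per clade in that tree.
    clade:      an iterable of taxon names to look for.
    """
    if not replicates:
        raise ValueError("at least one replicate tree is required")
    target = frozenset(clade)
    hits = sum(1 for tree in replicates if target in tree)
    return 100.0 * hits / len(replicates)


# Hypothetical replicates: three group TK7 with Insr, one groups daf-2 with Insr.
all_three = frozenset({"TK7", "Insr", "daf-2"})
replicates = [
    {frozenset({"TK7", "Insr"}), all_three},
    {frozenset({"TK7", "Insr"}), all_three},
    {frozenset({"TK7", "Insr"}), all_three},
    {frozenset({"daf-2", "Insr"}), all_three},
]

print(clade_support(replicates, {"TK7", "Insr"}))
print(clade_support(replicates, {"daf-2", "Insr"}))
```

Each replicate tree would itself be built from an alignment whose columns have been resampled with replacement; that step and the tree inference are outside this sketch.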

Among PTKs, daf-2 and the old-1/old-2 family were not particularly closely related (fig. 1). In fact, no PTKs could be excluded as having evolved from the common ancestor of daf-2 and the old-1/old-2 family.

Discussion

This study is distinct from previous large-scale PTK studies in that it includes a number of invertebrate receptor PTKs and a novel mammalian PTK (EST AI386314) that have not previously been analyzed. It also represents a complete survey of the EMBL/GenBank EST database, containing over 2 million entries. The results are largely congruent with previous trees (Hanks and Hunter 1995; Suga et al. 1997), although one difference is noted.

In this study, the Kin15/Kin16 family was found to be more closely related to the Pdgfr, Fgfr, Ret, and Tie/Tek families than to the Hgfr family as previously suggested by Hanks and Hunter (1995). This difference is almost certainly due to our including additional Kin15/Kin16 family genes and the C. elegans F11E6.8 gene. The F11E6.8 gene appears to be the most probable ortholog of the Hgfr family and was not available at the time Hanks and Hunter (1995) constructed their tree.

Nematode–Fruit Fly–Mammal Divergence
Our tree also differs from the view that nematodes are more closely related to fruit flies than they are to mammals. Although it has been recognized for some time that most protein-based trees place the C. elegans split before the Drosophila-mammal split, this phylogeny has recently lost favor, based largely on analyses of slow-evolving nematode 18S ribosomal RNA sequences (reviewed by Adoutte et al. 1999). Mushegian et al. (1998) also argued that slow-evolving C. elegans proteins support a later split between C. elegans and Drosophila, but 3 of 6 and 7 of 12 of their slowest evolving proteins actually support a later split between Drosophila and mammals. More recently, Wang, Kumar, and Hedges (1999) compiled a set of 18 C. elegans genes that also did not exhibit a significant divergence rate bias. By combining these sequences, they obtained a confidence probability greater than 99% that favored a later split between Drosophila and mammals. Our results are similar in that all four (five, if the Insr family is included) transmembrane PTK families having mammal, Drosophila, and C. elegans representatives suggested a later split between Drosophila and mammals (Fgfr, Ryk, Ltk/Alk, and Egfr). The well-characterized Fgfr and Egfr families in particular had bootstrap values of 83 and 98, respectively. Most nontransmembrane PTKs, having shorter branch lengths, also suggested a later Drosophila-mammal split (not shown). Similar to observations made regarding developmental control genes (reviewed by Ruvkun and Hobert 1998), we also observed that a Drosophila PTK gene is more likely to have a mammalian ortholog and no C. elegans ortholog (DS03465, Ret, Ror or Nrk, and Trk) than it is to have a C. elegans ortholog and no mammalian ortholog (none), again consistent with Drosophila and mammals having split later and also independent of substitution rate bias. Therefore, the total molecular evidence would still seem to favor a later split between Drosophila and mammals. At the very least, the divergence between C. elegans and Drosophila is as great as the divergence between C. elegans and mammals (Wang, Kumar, and Hedges 1999). This conclusion has particularly important implications for genetic studies of aging, given that such studies are long-term and expensive to conduct using mammalian models. In particular, it suggests that mechanisms, such as stress resistance, that can extend both C. elegans and Drosophila life spans would be reasonable candidates for extending mammalian life spans.

old-1 and old-2 Are Members of a Large Gene Family
Although ours is not the first PTK tree, this study is the first to carefully analyze PTKs known to extend life span. One of our major observations is that old-1 and old-2 are members of a surprisingly large multigene family in nematodes. In C. elegans, this family represents about one fourth of all PTKs and about half of all transmembrane PTKs. With 16 members, the old-1/old-2 family surpasses the mammalian Eph/Elk/Eck family (14 members) as the largest known transmembrane PTK family for a single species. Although not shown, C. elegans also has a large nontransmembrane family of about 20 members that appears to be most closely related to the Fes family. These two large families contrast with the perception that PTK duplications are unique to the evolution of vertebrates (Suga et al. 1997; Gibson and Spring 1998). On the other hand, large PTK families in C. elegans are the exception rather than the rule, as most of the other C. elegans PTKs are not members of multigene families.

Because of its large size, the old-1/old-2 family could provide a unique opportunity to address questions regarding when and how old-1 acquired its capacity to extend life. Did this capacity develop early or late in the evolution of C. elegans? Did it arise little by little or all at once? Was there coevolution with stress resistance? Were particular residue or transcriptional control changes responsible? For example, it can be predicted that if no other members of the old-1/old-2 family can extend life span, then old-1's ability was acquired mostly after the relatively recent split with old-2. The molecular changes likely to be responsible could then be identified using maximum-likelihood/parsimony molecular phylogenetic analyses. In vitro mutagenesis could be used to determine which changes actually extend life span and whether those changes coevolved directly with an ability to increase stress resistance.

As suggested by the tree, the paralogs most closely related to old-1 and old-2, and thus the best candidate genes to also extend the life span of C. elegans, appear to be R09D1.12 and M01B2.1. The closeness of this relationship was further supported by a shared N-terminal sequence motif, shared intron/exon junctions, and shared flanking chitinase genes. However, because it is not clear how these features should be weighted relative to shared amino acid changes, we have not attempted to quantify the additional support.

A Mammalian Ortholog of old-1 and old-2 Appears to Be Unlikely
A second major observation is that the old-1/old-2 family does not appear to have a mammalian ortholog. Three explanations are possible. The first is that a mammalian ortholog has not yet been cloned. Given that old-1 is expressed at very low levels (unpublished data), it could be that a mammalian ortholog has also eluded EST cloning. Or, even if an EST has been cloned, it may not be recognizable because the sequence lacks significant overlap with the kinase or C-terminal domain. A second possibility is that the old-1/old-2 family is really orthologous to the Pdgfr family in spite of what the tree shows. Although the old-1/old-2 family appears to be Pdgfr-like in terms of IG domain similarities and intron/exon junctions, an orthologous relationship seems unlikely for a number of reasons: (1) no bootstrap trees grouped the old-1/old-2 family only with the Pdgfr family (or with the Vegfr family), (2) probable Drosophila and hydra orthologs of the Pdgfr family were readily recognizable, (3) the IG branch of the old-1/old-2 family alone did not group more closely to the Pdgfr family, (4) the genomic organization of old-1/old-2 subfamilies does not correspond to that of Pdgfr subfamilies, and (5) the Pdgfr family does not exhibit sequence similarity to the old-1/old-2 family C-terminal motif. The third possibility is that the gene lineage leading to the mammalian ortholog is extinct. This possibility has merit for a number of reasons: (1) the identification of human PTK families has been intensively pursued, yet no new mammalian family has been reported during the 2-year course of this study; (2) only one novel mammalian PTK family was detected in the EST database; (3) most ESTs come from normalized libraries, making it less likely that rare transcripts would be missed (Soares et al. 1994); and (4) PTK-specific degenerate primers to amplify genomic DNA did not suggest any new mammalian PTK families (Oates et al. 1998).

Although no mammalian orthologs of the old-1/old-2 family were suggested, three potential orthologs were suggested in C. briggsae, which diverged from C. elegans 20–60 MYA (Heschl and Baillie 1990; Lee et al. 1992; Kennedy et al. 1993). These C. briggsae genes lacked an IG domain, matched well with the highly conserved intron-exon junctions (fig. 3), exhibited similarity to the C-terminal motif (fig. 3), and were flanked by a chitinase gene (fig. 2B). Nevertheless, these genes are probably not orthologous to the old-1/old-2 family. Based on a sample of 10 kinases (including two transmembrane PTKs), kinase domain divergence between C. elegans and C. briggsae orthologs is only 5%–15%. However, the divergence between the C. briggsae and C. elegans old-1/old-2 family kinase domains is about 50% (uncorrected for multiple substitutions). Therefore, it seems likely that better C. briggsae ortholog candidates can be found (only about 15% of the C. briggsae genome was sequenced at the time of this analysis), perhaps even a candidate orthologous to just old-1 and old-2 or to just old-1.
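
The divergence values quoted above are uncorrected proportions of differing residues between aligned kinase domains. The calculation can be sketched as follows; this is a minimal illustration and not the code used in this study. The function names are our own, gaps are assumed to be written as "-" or "=", the toy sequences are hypothetical, and the Poisson correction is shown only as one common way to account for multiple substitutions, which the values above deliberately do not include.

```python
import math


def p_distance(seq_a, seq_b, gap_chars="-="):
    """Uncorrected proportion of differing residues between aligned sequences.

    Columns in which either sequence has a gap character are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a in gap_chars or b in gap_chars:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


def poisson_distance(p):
    """Poisson-corrected distance d = -ln(1 - p) for a proportion p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.log(1.0 - p)


# Hypothetical aligned fragments differing at 1 of 10 ungapped sites.
p = p_distance("ACDEFGHIKL", "ACDEFGHIKV")
print(p, poisson_distance(p))

# An uncorrected divergence of about 50% corresponds to a larger corrected value.
print(poisson_distance(0.5))
```

Because the correction grows faster than the uncorrected proportion, the contrast between 5%–15% and about 50% would be wider, not narrower, after correcting for multiple substitutions.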

daf-2 Has No Paralogs, Orthology Is Unclear
A third observation is that daf-2 does not have paralogs in C. elegans. This is curious, given that C. elegans has 10 paralogs of an insulin-like gene (Duret et al. 1998). It is also curious that the hydra TK7 gene is more closely related to the mammalian insulin receptor family than daf-2 is. Can daf-2 be considered an insulin receptor ortholog? Initial evidence in support of orthology was based on the finding of significant similarity in both the kinase and receptor domains of INSR, IRR, and IG1R (Kimura et al. 1997). However, in the absence of a molecular phylogenetic tree, it is difficult to know whether the divergence coincides with that expected for the species split (Abouheif et al. 1997; Tatusov, Koonin, and Lipman 1997). For example, the criterion of finding significant similarity in both the kinase and the receptor domains, when applied to old-1/old-2 family members F59F3.5 and F59F3.1, would have erroneously led to the conclusion that these genes are probably orthologous to the Vegfrs. In this study, we did find that the split between daf-2 and the mammalian insulin receptor family is congruent with that expected for a nematode-mammal split if TK7 is ignored. Recently, it has also been shown that daf-2 signaling through age-1 phosphoinositide-3-OH kinase, daf-18 phosphatase, pdk-1 3-phosphoinositide-dependent kinase, akt-1 and akt-2 kinases, and daf-16 transcription factor parallels the insulin receptor pathway in mammals via PI3K kinase, PTEN phosphatase, PDK-1 kinase, Akt/PKB kinase, and HNF-3 transcription factor (Paradis et al. 1999). Therefore, perhaps hydra TK7 has an unusually slow divergence rate. Resolving this will require finding additional insulin receptor–like genes from other early metazoan phyla (e.g., sponge).

Finally, among PTKs the old-1/old-2 family and daf-2 do not share a particularly close relationship, which is due in part to most PTK families having arisen at about the same time (Suga et al. 1997). Therefore, given that old-1 and daf-2 appear to interact with the same life extension pathway and that the split between the old-1/old-2 family and daf-2 occurred prior to the nematode-mammal split, it is tempting to speculate that PTKs from other families might also extend life span. Theoretically, there is perhaps little reason to think that interactions between PTKs and life extension pathways would be widely conserved given the declining force of natural selection with age (reviewed by Rose 1991). Empirically, however, it is intriguing that the Drosophila insulin receptor–like gene appears to extend the life span of Drosophila (M. Tatar, personal communication of preliminary results) and that a chimeric gene carrying the human Fgfr-1 kinase domain, substituting for the kinase domain of old-1, can extend the life span of C. elegans (unpublished data). Regardless, an understanding of the paralogy and orthology of life-extending PTKs is particularly important for interpreting and designing experiments along these lines.

Acknowledgements

Sequences F59F5.3, T22B11.3, F54F7.5, B0198.3, and Y38H6C.20 were suggested to us as probable receptor PTKs by Tony Hunter during the review of this manuscript. We are grateful to Steve Hardies, Sam Henderson, and Chris Link for helpful comments and discussions. Funding was provided by National Institutes of Health grants R01-AG08322, R01-AG16219, P01-AG08761, and AA00195, as well as grants from the American Federation for Aging Research and the Ellison Medical Foundation.

Footnotes

Thomas H. Eickbush, Reviewing Editor

1 Keywords: aging, senescence, tyrosine kinases, evolution, taxonomy, longevity, signal transduction

2 Address for correspondence and reprints: Brad A. Rikke, Institute for Behavioral Genetics, University of Colorado at Boulder, Campus Box 447, Boulder, Colorado 80309-0447. E-mail: rikke@colorado.edu

Literature Cited

    Abouheif, E., M. Akam, W. J. Dickinson, P. W. H. Holland, A. Meyer, N. H. Patel, R. A. Raff, V. L. Roth, and G. A. Wray. 1997. Homology and developmental genes (letter). Trends Genet. 13:432–433.

    Adachi, J., and M. Hasegawa. 1996. MOLPHY version 2.3: programs for molecular phylogenetics based on maximum likelihood. Comput. Sci. Monogr. 28:1–150.

    Adoutte, A., G. Balavoine, N. Lartillot, and R. De Rosa. 1999. Animal evolution: the end of the intermediate taxa? Trends Genet. 15:104–108.

    Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403–410.

    Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389–3402.

    Bairoch, A., and R. Apweiler. 1998. The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 1998. Nucleic Acids Res. 26:38–42.

    C. ELEGANS Sequencing Consortium. 1998. Genome sequence of the nematode C. elegans: a platform for investigating biology. Science 282:2012–2018.

    Cetkovic, H., I. M. Muller, W. E. Muller, and V. Gamulin. 1998. Characterization and phylogenetic analysis of a cDNA encoding the Fes/FER related, non-receptor protein-tyrosine kinase in the marine sponge Sycon raphanus. Gene 216:77–84.

    Chervitz, S. A., L. Aravind, G. Sherlock et al. (13 co-authors). 1998. Comparison of the complete protein sets of worm and yeast: orthology and divergence. Science 282:2022–2028.

    Coulier, F., P. Pontarotti, R. Roubin, H. Hartung, M. Goldfarb, and D. Birnbaum. 1997. Of worms and men: an evolutionary perspective on the fibroblast growth factor (FGF) and FGF receptor families. J. Mol. Evol. 44:43–56.

    Duret, L., N. Guex, M. C. Peitsch, and A. Bairoch. 1998. New insulin-like proteins with atypical disulfide bond pattern characterized in Caenorhabditis elegans by comparative sequence analysis and homology modeling. Genome Res. 8:348–353.

    Duret, L., D. Mouchiroud, and M. Gouy. 1994. HOVERGEN, a database of homologous vertebrate genes. Nucleic Acids Res. 22:2360–2365.

    Felsenstein, J. 1993. PHYLIP (phylogeny inference package). Version 3.5c. Distributed by the author, Department of Genetics, University of Washington, Seattle.

    Fitch, W. M., and E. Margoliash. 1967. Construction of phylogenetic trees. Science 155:279–284.

    Gems, D., A. J. Sutton, M. L. Sundermeyer, P. S. Albert, K. V. King, M. L. Edgley, P. L. Larsen, and D. L. Riddle. 1998. Two pleiotropic classes of daf-2 mutation affect larval arrest, adult behavior, reproduction and longevity in Caenorhabditis elegans. Genetics 150:129–155.

    Gibson, T. J., and J. Spring. 1998. Genetic redundancy in vertebrates: polyploidy and persistence of genes encoding multidomain proteins. Trends Genet. 14:46–49.

    Grundy, W. N., T. L. Bailey, C. P. Elkan, and M. E. Baker. 1997. Meta-MEME: motif-based hidden Markov models of protein families. Comput. Appl. Biosci. 13:397–406.

    Hanks, S. K., and T. Hunter. 1995. The eukaryotic protein kinase superfamily: kinase (catalytic) domain structure and classification. FASEB J. 9:576–596.

    Henikoff, S., and J. G. Henikoff. 1992. Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA 89:10915–10919.

    ———. 1994. Protein family classification based on searching a database of blocks. Genomics 19:97–107.

    Heschl, M. F. P., and D. L. Baillie. 1990. Functional elements and domains inferred from sequence comparisons of a heat shock gene in two nematodes. J. Mol. Evol. 31:3–9.

    Hillier, L. D., G. Lennon, M. Becker et al. (25 co-authors). 1996. Generation and analysis of 280,000 human expressed sequence tags. Genome Res. 6:807–828.

    Hovens, C. M., S. A. Stacker, A. C. Andres, A. G. Harpur, A. Ziemiecki, and A. F. Wilks. 1992. RYK, a receptor tyrosine kinase-related molecule with unusual kinase domain motifs. Proc. Natl. Acad. Sci. USA 89:11818–11822.

    Hubbard, S. R., L. Wei, L. Ellis, and W. A. Hendrickson. 1994. Crystal structure of the tyrosine kinase domain of the human insulin receptor. Nature 372:746–754.

    Hunter, T. 1998. The phosphorylation of proteins on tyrosine: its role in cell growth and disease. Philos. Trans. R. Soc. Lond. B Biol. Sci. 353:583–605.

    Hunter, T., and G. D. Plowman. 1997. The protein kinases of budding yeast: six score and more. Trends Biochem. Sci. 22:18–22.

    Jin, H. M., N. G. Copeland, D. J. Gilbert, N. A. Jenkins, R. B. Kirkpatrick, and M. Rosenberg. 1998. Genetic characterization of the murine Ym1 gene and identification of a cluster of highly homologous genes. Genomics 54:316–322.

    Kennedy, B. P., E. J. Aamodt, F. L. Allen, M. A. Chung, M. F. Heschl, and J. D. McGhee. 1993. The gut esterase gene (ges-1) from the nematodes Caenorhabditis elegans and Caenorhabditis briggsae. J. Mol. Biol. 229:890–908.

    Kenyon, C., J. Chang, E. Gensch, A. Rudner, and R. Tabtiang. 1993. A C. elegans mutant that lives twice as long as wild type. Nature 366:461–464.

    Kimura, K. D., H. A. Tissenbaum, Y. Liu, and G. Ruvkun. 1997. daf-2, an insulin receptor-like gene that regulates longevity and diapause in Caenorhabditis elegans. Science 277:942–946.

    Lee, Y. H., X.-Y. Huang, D. Hirsh, G. E. Fox, and R. M. Hecht. 1992. Conservation of gene organization and trans-splicing in the glyceraldehyde-3-phosphate dehydrogenase-encoding genes of Caenorhabditis briggsae. Gene 121:227–235.

    Lin, K., J. B. Dorman, A. Rodan, and C. Kenyon. 1997. daf-16: an HNF-3/forkhead family member that can function to double the life-span of Caenorhabditis elegans. Science 278:1319–1322.

    Lin, Y. J., L. Seroude, and S. Benzer. 1998. Extended life-span and stress resistance in the Drosophila mutant methuselah. Science 282:943–946.

    Lithgow, G. J. 1996. Invertebrate gerontology: the age mutations of Caenorhabditis elegans. BioEssays 18:809–815.

    Morgan, W. R., and I. Greenwald. 1993. Two novel transmembrane protein tyrosine kinases expressed during Caenorhabditis elegans hypodermal development. Mol. Cell. Biol. 13:7133–7143.

    Muller, W. E. 1998. Origin of Metazoa: sponges as living fossils. Naturwissenschaften 85:11–25.

    Murakami, S., and T. E. Johnson. 1996. A genetic pathway conferring life extension and resistance to UV stress in Caenorhabditis elegans. Genetics 143:1207–1218.

    ———. 1998. Life extension and stress resistance in Caenorhabditis elegans modulated by the tkr-1 gene. Curr. Biol. 8:1091–1094.

    Mushegian, A. R., J. R. Garey, J. Martin, and L. X. Liu. 1998. Large-scale taxonomic profiling of eukaryotic model organisms: a comparison of orthologous proteins encoded by the human, fly, nematode, and yeast genomes. Genome Res. 8:590–598.

    Oates, A. C., P. Wollberg, M. G. Achen, and A. F. Wilks. 1998. Sampling the genomic pool of protein tyrosine kinase genes using the polymerase chain reaction with genomic DNA. Biochem. Biophys. Res. Commun. 249:660–667.

    Ogg, S., S. Paradis, S. Gottlieb, G. I. Patterson, L. Lee, H. A. Tissenbaum, and G. Ruvkun. 1997. The fork head transcription factor daf-16 transduces insulin-like metabolic and longevity signals in C. elegans. Nature 389:994–999.

    Ottilie, S., F. Raulf, A. Barnekow, G. Hannig, and M. Schartl. 1992. Multiple src-related kinase genes, srk1–4, in the fresh water sponge Spongilla lacustris. Oncogene 7:1625–1630.

    Page, R. D. M. 1996. TREEVIEW: an application to display phylogenetic trees on personal computers. Comput. Appl. Biosci. 12:357–358.

    Paradis, S., M. Ailion, A. Toker, J. H. Thomas, and G. Ruvkun. 1999. A PDK1 homolog is necessary and sufficient to transduce AGE-1 PI3 kinase signals that regulate diapause in Caenorhabditis elegans. Genes Dev. 13:1438–1452.

    Rose, M. R. 1991. Evolutionary biology of aging. Oxford University Press, Oxford, England.

    Rousset, D., F. Agnes, P. Lachaume, C. Andre, and F. Galibert. 1995. Molecular evolution of the genes encoding receptor tyrosine kinase with immunoglobulinlike domains. J. Mol. Evol. 41:421–429.

    Ruvkun, G., and O. Hobert. 1998. The taxonomy of developmental control in Caenorhabditis elegans. Science 282:2033–2041.

    Rysavy, F. R., M. J. Bishop, G. P. Gibbs, and G. W. Williams. 1992. The UK Human Genome Mapping Project online computing service. Comput. Appl. Biosci. 8:149–154.

    Satterlee, J. S., and M. R. Sussman. 1998. Unusual membrane-associated protein kinases in higher plants. J. Membr. Biol. 164:205–213.

    Schacke, H., H. C. Schroder, V. Gamulin, B. Rinkevich, I. M. Muller, and W. E. Muller. 1994. Molecular cloning of a tyrosine kinase gene from the marine sponge Geodia cydonium: a new member belonging to the receptor tyrosine kinase class II family. Mol. Membr. Biol. 11:101–107.

    Smith, C. M., I. N. Shindyalov, S. Veretnik, M. Gribskov, S. S. Taylor, L. F. Ten Eyck, and P. E. Bourne. 1997. The protein kinase resource. Trends Biochem. Sci. 22:444–446.

    Soares, M. B., M. F. Bonaldo, P. Jelene, L. Su, L. Lawton, and A. Efstratiadis. 1994. Construction and characterization of a normalized cDNA library. Proc. Natl. Acad. Sci. USA 91:9228–9232.

    Strimmer, K., N. Goldman, and A. von Haeseler. 1997. Bayesian probabilities and quartet puzzling. Mol. Biol. Evol. 14:210–211.

    Suga, H., K. Kuma, N. Iwabe, N. Nikoh, K. Ono, M. Koyanagi, D. Hoshiyama, and T. Miyata. 1997. Intermittent divergence of the protein tyrosine kinase family during animal evolution. FEBS Lett. 412:540–546.

    Swofford, D. L. 1993. PAUP—a computer-program for phylogenetic inference using maximum parsimony. J. Gen. Physiol. 102:A9.

    Tatusov, R. L., E. V. Koonin, and D. J. Lipman. 1997. A genomic perspective on protein families. Science 278:631–637.

    Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680.

    van der Geer, P., T. Hunter, and R. A. Lindberg. 1994. Receptor protein-tyrosine kinases and their signal transduction pathways. Annu. Rev. Cell Biol. 10:251–337.

    Walker, J. C. 1994. Structure and function of the receptor-like protein kinases of higher plants. Plant Mol. Biol. 26:1599–1609.

    Wang, D. Y.-C., S. Kumar, and S. B. Hedges. 1999. Divergence time estimates for the early history of animal phyla and the origin of plants, animals and fungi. Proc. R. Soc. Lond. B Biol. Sci. 266:163–171.

    Wilks, A. F. 1991. Cloning members of protein-tyrosine kinase family using polymerase chain reaction. Methods Enzymol. 200:533–546.

Accepted for publication January 4, 2000.