Max-Planck-Institut für Biologie, Abteilung Immungenetik, Tübingen, Germany
Abstract
Key Words: molecular phylogeny, lamprey, hagfish, cartilaginous fish, bony fish
Introduction
The published molecular studies are inconclusive primarily because analysis of a single locus or small numbers of loci suffers from a large sampling error and from lack of high statistical support. Further, different tree-drawing methods often generate different phylogenies (Delarbre et al. 2000). Even in cases in which some methods have given high statistical support for a particular phylogeny, other methods have supported this phylogeny poorly (e.g., Mallatt, Sullivan, and Winchell 2001). The choice of a too distant outgroup and the use of only a few vertebrate lineages may have distorted the phylogenetic relationships among the lineages (e.g., Goodman, Miyamoto, and Czelusniak 1987; Kuraku et al. 1999; Hedges 2001). A number of authors have pointed out that mitochondrial genes are unsuitable for resolving phylogenetic relationships among major vertebrate groups (Takezaki and Gojobori 1999; Hedges 2001). Similarly, the use of rRNA is prone to potential problems such as the choice of alignment or difference in base frequencies in different lineages (Mallatt, Sullivan, and Winchell 2001). To overcome these problems, we obtained nucleotide sequences of the coding regions of 35 nuclear genes from each of the following groups: cephalochordates (used as an outgroup), hagfishes, lampreys, cartilaginous fishes (in this group only 31/35 genes were available), bony fishes (represented by teleost fishes), and tetrapods (represented by human).
Materials and Methods
Phylogenetic Analysis
Phylogenetic trees were drawn by the maximum parsimony (MP) method with PAUP 4.0b10 (Swofford 2002), the NJ method, the minimum evolution (ME) method (Rzhetsky and Nei 1992), and the maximum likelihood (ML) method with PAML 3.0c (Yang 1999) and MOLPHY 2.3b3 (Adachi and Hasegawa 1996). In all the analyses, amino acid sequences were used after excluding positions with indels. Phylogenetic trees were drawn for each protein and for concatenated sequences of all the proteins. The NJ and ME trees were drawn using Poisson-correction and gamma distances (Nei and Kumar 2000). The gamma parameter was estimated by the ML method by using the tree topologies in figures 1 and 2 and the JTT model (Jones, Taylor, and Thornton 1992). The ML trees were drawn using JTT, Dayhoff, and Poisson models with or without assuming gamma distribution for rate variation across amino acid positions (G option) and with or without using observed amino acid frequencies in the data (F option). The MP trees were obtained by branch and bound search. In the ME and ML methods all the possible tree topologies (15 for five taxa and 105 for six taxa) were examined. Two thousand bootstrap iterations were conducted for the MP, NJ, and ME trees; 10,000 bootstrap replications were generated for the ML trees by the RELL method for all the possible tree topologies. The bootstrap probabilities for each branch on the ML trees (figs. 1 and 2) were calculated by summing up the RELL bootstrap values for tree topologies that contain the same branch. This way of calculating bootstrap values for branches is essentially the same procedure taken by the standard bootstrap test, except that the likelihood values for different tree topologies in bootstrap replications are computed by the RELL method rather than by carrying out the optimization process in the ML method each time. In the ML method, the results for individual proteins were combined by the TOTALML approach (Adachi and Hasegawa 1996). 
The bootstrap test by this approach was conducted by the TOTALML program in MOLPHY (Adachi and Hasegawa 1996). In TOTALML, the G option was not available.
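The topology counts and the RELL branch support described above can be illustrated with a minimal sketch. It is not the MOLPHY or PAML implementation: the function names, the topology labels, and the per-site log-likelihood values are hypothetical, and in a real analysis the per-site log-likelihoods would come from the ML program for each of the 15 or 105 topologies.

```python
import random

def n_unrooted_topologies(n_taxa):
    """Number of unrooted bifurcating topologies, (2n - 5)!! for n >= 3."""
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count

def rell_bootstrap(site_lnl, n_reps=10000, seed=0):
    """RELL bootstrap probability of each topology.

    site_lnl maps a topology label to its per-site log-likelihoods
    (all lists the same length). Sites are resampled with replacement
    and the resampled log-likelihoods are summed without re-optimizing.
    """
    rng = random.Random(seed)
    names = list(site_lnl)
    n_sites = len(site_lnl[names[0]])
    wins = {name: 0 for name in names}
    for _ in range(n_reps):
        sample = [rng.randrange(n_sites) for _ in range(n_sites)]
        totals = {name: sum(site_lnl[name][i] for i in sample) for name in names}
        wins[max(names, key=totals.get)] += 1
    return {name: wins[name] / n_reps for name in names}

def branch_support(topology_bp, topologies_with_branch):
    """Sum the RELL values of all topologies that contain a given branch."""
    return sum(topology_bp[t] for t in topologies_with_branch)

print(n_unrooted_topologies(5), n_unrooted_topologies(6))  # 15 105

# Toy per-site log-likelihoods (invented values, three topologies only).
toy = {
    "((hagfish,lamprey),(bony fish,tetrapod),amphioxus)": [-2.1, -3.0, -1.2, -4.4, -2.0, -1.9, -3.3, -2.5],
    "(((tetrapod,bony fish),lamprey),hagfish,amphioxus)": [-2.3, -2.9, -1.6, -4.6, -2.2, -1.8, -3.6, -2.7],
    "((hagfish,lamprey),(bony fish,amphioxus),tetrapod)": [-2.4, -3.2, -1.3, -4.5, -2.4, -2.2, -3.4, -2.9],
}
bp = rell_bootstrap(toy, n_reps=2000)
agnathan = [t for t in toy if "(hagfish,lamprey)" in t]
print(branch_support(bp, agnathan))
```

Selecting topologies by substring, as in the last lines, works only for these hand-written labels; a real analysis would compare bipartitions.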
All trees drawn, including those using LogDet distance (Gu and Li 1996), gave the same topologies as those in figures 1 and 2. The NJ and ME trees with LogDet distance are not discussed because there was no sign of amino acid frequency changes in different lineages.
Results and Discussion
For each of the 35 loci, orthology of the genes in the different groups was determined by phylogenetic analysis of as many related sequences as were available. In cases of doubt, an effort was made to identify the cDNAs of the true orthologs, and if that search failed, the locus was excluded from the final analysis. The assembled loci represent a diverse group of nuclear protein-encoding genes which includes housekeeping as well as regulatory and large as well as small genes of different functions (table 1). About half of the genes encode ribosomal proteins which are known to evolve slowly and to be relatively short. In our analyses, they yielded trees differing in their topologies from those produced by the combined analyses of the whole set somewhat more often than the other sequences, probably because of the relative paucity of phylogenetic information. The loci included in our data set were relatively conserved. The number of amino acid substitutions per position between tetrapod and amphioxus sequences was 0.4 on average for the different proteins; the largest number was 1.3 (table 1).
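As an illustration of how the number of amino acid substitutions per position can be estimated from a pair of aligned sequences, the following sketch computes the Poisson-correction distance, d = -ln(1 - p), and the gamma distance, d = a[(1 - p)^(-1/a) - 1], following Nei and Kumar (2000). The function names and the example sequences are hypothetical. Gap positions are dropped pairwise here, whereas the analyses above excluded positions with indels across all sequences.

```python
import math

def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing amino acids, ignoring positions with a gap."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(1 for a, b in pairs if a != b) / len(pairs)

def poisson_distance(p):
    """Poisson-correction distance: d = -ln(1 - p)."""
    return -math.log(1.0 - p)

def gamma_distance(p, alpha):
    """Gamma distance with shape parameter alpha: d = alpha * ((1 - p)**(-1/alpha) - 1)."""
    return alpha * ((1.0 - p) ** (-1.0 / alpha) - 1.0)

# Hypothetical aligned fragments; the last column contains a gap and is skipped.
p = p_distance("ACDE-", "ACDFG")
print(p)                         # 0.25
print(poisson_distance(p))
print(gamma_distance(p, 0.38))   # 0.38 is the gamma parameter reported in the text
```

For a given p, the gamma distance exceeds the Poisson-correction distance, and the gap widens as the gamma parameter decreases, that is, as rate variation across positions increases.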
Monophyly of the Agnathans
Figure 1 shows the phylogenetic trees drawn for the concatenated sequences of 35 loci by the MP, NJ, and ML methods. The set consisted of a total of 9,098 shared amino acid positions (6,786 positions when cartilaginous fishes were included; table 1). All the methods applied to the concatenated sequences using different substitution models and distance measures and the TOTALML approach generated the same tree topology shown in figure 1, in which the agnathans form a monophyletic cluster. The monophyly of the agnathans was supported by all the methods used with a high statistical confidence. The bootstrap probabilities (BP) for the two interior branches were 99% to 100% for different substitution models by the ML method including the TOTALML approach (not shown). In all the tree-drawing methods the second best tree was (((tetrapod, bony fish), lamprey), hagfish, amphioxus) and the best tree was significantly supported by the tests of tree comparison (for the MP tree P = 0.002 by Kishino-Hasegawa's and Templeton's nonparametric tests; for the ME trees P = 0.0001 using Poisson correction distance and P = 0.02 using the gamma distance; for the ML trees P = 0.0001 by the Kishino-Hasegawa test using all the substitution models examined; and for the TOTALML approach P < 0.0016 by the Kishino-Hasegawa test using all the substitution models). In the MP method the tree length of the second best tree was larger than that of the best tree by 50. The gamma parameter, inversely related to the extent of rate variation across amino acid positions, estimated for the concatenated sequences by the ML method was 0.38.
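The logic of the Kishino-Hasegawa comparison of two topologies can be sketched as follows. The difference in total log-likelihood is divided by a standard error estimated from the per-site differences, and the ratio is referred to a normal distribution. This is a simplified illustration rather than the procedure of any particular program: the function name and the input values are hypothetical, and the use of a one-sided P value is an assumption, since the text does not state which tail convention was applied.

```python
import math

def kishino_hasegawa(site_lnl_best, site_lnl_alt):
    """Compare two topologies from their per-site log-likelihoods.

    Returns (delta, se, z, p_one_sided), where delta is the difference in
    total log-likelihood and se is its estimated standard error.
    """
    n = len(site_lnl_best)
    diffs = [a - b for a, b in zip(site_lnl_best, site_lnl_alt)]
    delta = sum(diffs)
    mean = delta / n
    # Variance of the summed difference, estimated from site-to-site variation.
    var_delta = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    if var_delta == 0.0:
        raise ValueError("per-site differences are constant; standard error is zero")
    se = math.sqrt(var_delta)
    z = delta / se
    # One-sided normal tail probability (assumed convention).
    p_one_sided = 0.5 * math.erfc(z / math.sqrt(2.0))
    return delta, se, z, p_one_sided

# Invented per-site log-likelihoods for a best and a second best topology.
best = [-1.0, -1.0, -1.0, -1.0]
second = [-2.0, -3.0, -4.0, -3.0]
delta, se, z, p = kishino_hasegawa(best, second)
print(delta, se, z, p)
```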
The same tree topology as that generated from the concatenated sequences and the TOTALML approach (fig. 1) was obtained for 19 genes (54%) by the NJ method and for 14 genes (45%) with the MP and ML methods (see Supplementary Material). By considering only whether or not the cluster of hagfish and lamprey appeared in the trees drawn, the monophyly of the agnathans was supported by the majority of the trees (57% to 62% depending on the methods) of the individual proteins. The alternative agnathan phylogeny in which lampreys are more closely related to gnathostomes than to hagfishes was supported by only 10% to 27% of the trees.
Phylogenetic Relationship of Cartilaginous Fish and Bony Fish and the Impact of the Inclusion of the Former on Agnathan Monophyly
Figure 2 shows the phylogenetic trees drawn from concatenated sequences of 31 nuclear protein-encoding genes for which sequences of cartilaginous fishes are available. All the tree-drawing methods used for the concatenated sequences and the TOTALML approach produced the tree topology shown in figure 2, in which the agnathans form a monophyletic cluster and cartilaginous fishes are positioned outside the cluster of tetrapod and bony fish. With this data set the monophyly of the agnathans was also strongly supported. Bootstrap probabilities for the ancestral branches of hagfish and lamprey were 99% to 100% by all the methods. The position of cartilaginous fishes outside the cluster of bony fishes and tetrapods is significantly supported by most of the methods applied, but with slightly lower confidence than the agnathan monophyly. Bootstrap probabilities for the ancestral branch of tetrapods and bony fishes were 97% to 100% by the NJ method using all the distance measures, and by the ML methods using different substitution models including the TOTALML approach. The BP (94%) of the MP method for the clustering of tetrapods and bony fishes (fig. 2A) fell slightly short of the 95% required for significance at the 5% level. The second best tree was ((((tetrapod, bony fish), cartilaginous fish), lamprey), hagfish, amphioxus) by the MP method and by the TOTALML approach using the Poisson model with the F option regardless of whether the G option was specified; and ((tetrapod, (bony fish, cartilaginous fish)), (lamprey, hagfish), amphioxus) by the ME method and by the ML method except for the TOTALML approach mentioned earlier. Tests of tree comparisons were significant by the ME method using the Poisson-correction distance (P = 0.001), by the ML method (P = 0.02 to 0.004), and by the TOTALML approach (P = 0.017 to 0.0001), but not significant by the MP method (P = 0.12) and by the ME method using the gamma distance (P = 0.26). In the MP method, the difference in the tree lengths between the best and the second best trees was 18. The gamma parameter estimated for the concatenated sequences by the ML method was 0.38.
The same tree topology as that generated from the concatenated sequences and the TOTALML approach (fig. 2) was obtained for eight, six, and five genes by the NJ, MP, and ML methods, respectively. The hagfish-lamprey cluster was observed in the majority (50% to 65%) of genes, as in the case of the dataset without cartilaginous fishes mentioned in the previous section. By contrast, the branching pattern ((tetrapod, bony fish), cartilaginous fish) shown in figure 2 appeared in the trees of only about 30% of genes. The alternative branching patterns for the relationship of the cartilaginous and bony fish were observed in even lower frequencies: the pattern (tetrapod, (bony fish, cartilaginous fish)) was observed in about 20% of the genes by all the methods, and the pattern ((tetrapod, cartilaginous fish), bony fish) appeared in 19% of the NJ trees and 6% of the MP and ML trees.
Is the Agnathan Monophyly a Result of Long Branch Attraction?
Because hagfishes and lampreys are represented by long branches on the phylogenetic trees (figs. 1 and 2), it could be argued that their monophyly is the result of long branch attraction (Nei and Kumar 2000). Distortions of phylogenetic reconstructions by long branch attraction occur especially when grossly simplifying models of the evolutionary process and the parsimony method are used (Yang 1996). To avoid this problem, we used other methods of phylogenetic reconstruction in addition to those based on the maximum parsimony principle, and in the analyses we took into account different substitution rates across sites, genes, and lineages. We therefore do not think that long branch attraction is the cause of the observed agnathan monophyly. It has been argued that potential artifactual attractions between branches can be avoided by including more taxa in the data set (e.g., Graybeal 1998). Others have pointed out, however, that adding taxa to break long branches increases the probability of distorting phylogenetic relationships in certain cases (Poe and Swofford 1999) and that a better way of dealing with this problem is to increase the number of amino acid positions in the analyses (Poe and Swofford 1999; Rosenberg and Kumar 2001). This is the way we have chosen in the present study.
Conclusion
The phylogenetic analysis leads to two important conclusions. First, the hagfishes and lampreys form a monophyletic group which constitutes a sister group of the gnathostomes, with cephalochordates as an outgroup. And second, the cartilaginous fishes diverged from the common ancestor of bony fishes and tetrapods before the latter two lineages diverged from each other. The first conclusion is well supported statistically by all tree-drawing methods applied to the dataset of the 35 proteins. It is also supported by the majority (57% to 62%, depending on the method) of the trees of the individual proteins. The second conclusion is supported by a high statistical confidence except by the maximum parsimony method and the minimum evolution method (Rzhetsky and Nei 1992) with gamma distance (Nei and Kumar 2000). The lower statistical support by the latter methods is probably due to a smaller number of amino acid positions in the data set that included cartilaginous fishes and to the relative shortness of the time interval during which the cartilaginous fishes and the bony fishes separated from the tetrapod lineage. The placement of cartilaginous fishes outside the cluster formed by bony fishes and tetrapods is in agreement with the traditional interpretation of this part of vertebrate phylogeny and in contradiction to the recent claim by Rasmussen and Arnason (1999) placing cartilaginous fishes inside the bony fish clade. This latter phylogeny is based on mtDNA sequences and supported by high statistical significance. As mentioned earlier, however, sequence data of mitochondrial protein-encoding genes are not suitable for deep phylogenetic reconstructions (Hedges 2001).
The origin and nature of the morphological characters previously claimed to support the close relationship of lampreys to jawed fishes will have to be re-examined in view of the strong support of the molecular data for the agnathan monophyly. Many of these characters may not be homologous (Yalden 1985; Carroll 1987; Mallatt, Sullivan, and Winchell 2001). Some may have arisen independently by convergent evolution in the lamprey and gnathostome lineages, while others still may have been present in the agnathan ancestors and lost in the hagfish (Yalden 1985). A good example of the uncertainties surrounding the nature of some of these characters is the agnathan "tongue." Originally, the resemblance between the "rasping tongue" of the lampreys and the "laterally biting jaws" of the hagfishes was one of the reasons for grouping the two lineages into Cyclostomata. Later, the resemblance was proclaimed to be superficial and the structures were interpreted as having been acquired independently by the two lineages (e.g., Janvier 1981). Later still, however, Yalden (1985) pointed out that at least 11 synapomorphous anatomical features of the feeding apparatus were shared by lampreys and hagfishes, but not by gnathostomes. Now the molecular data clearly side with the view that the "tongue" of hagfishes and lampreys is indeed a homologous organ.
Acknowledgements
Footnotes
Naruya Saitou, Associate Editor
Literature Cited
Adachi, J., and M. Hasegawa. 1996. MOLPHY2.3b3: programs for molecular phylogenetics based on maximum likelihood. Institute of Statistical Mathematics, Tokyo.
Carroll, R. L. 1987. Vertebrate paleontology and evolution. W. H. Freeman, New York.
Delarbre, C., H. Escriva, C. Gallut, V. Barriel, P. Kourilsky, P. Janvier, V. Laudet, and G. Gachelin. 2000. The complete nucleotide sequence of the mitochondrial DNA of the agnathan Lampetra fluviatilis: bearings on the phylogeny of cyclostomes. Mol. Biol. Evol. 17:519-529.
Delarbre, C., C. Gallut, V. Barriel, P. Janvier, and G. Gachelin. 2002. Complete mitochondrial DNA of the hagfish, Eptatretus burgeri: the comparative analysis of mitochondrial DNA sequences strongly supports the cyclostome monophyly. Mol. Phylogenet. Evol. 22:184-192.
Goodman, M., M. M. Miyamoto, and J. Czelusniak. 1987. Pattern and process in vertebrate phylogeny revealed by coevolution of molecules and morphologies. Pp. 141-176 in C. Patterson, ed. Molecules and morphology in evolution: conflict or compromise? Cambridge University Press, New York.
Graybeal, A. 1998. Is it better to add taxa or characters to a difficult phylogenetic problem? Syst. Biol. 47:9-17.
Gu, X., and W.-H. Li. 1996. Bias-corrected paralinear and LogDet distances and tests of molecular clocks and phylogenies under nonstationary nucleotide frequencies. Mol. Biol. Evol. 13:1375-1383.
Gürsoy, H. C., D. Koper, and B. J. Benecke. 2000. The vertebrate 7S K RNA separates hagfish (Myxine glutinosa) and lamprey (Lampetra fluviatilis). J. Mol. Evol. 50:456-464.
Hardisty, M. W. 1982. Lampreys and hagfishes: analysis of cyclostome relationships. Pp. 165-259 in M. W. Hardisty and I. C. Potter, eds. The biology of lampreys. Academic Press, London.
Hedges, S. B. 2001. Molecular evidence for the early history of living vertebrates. Pp. 119-134 in P. E. Ahlberg, ed. Major events in early vertebrate evolution: paleontology, phylogeny, genetics, and development. Taylor & Francis, London.
Janvier, P. 1981. The phylogeny of the Craniata, with particular reference to the significance of fossil "agnathans." J. Vert. Paleontol. 1:121-159.
Janvier, P. 1996. The dawn of the vertebrates: characters versus common ascent in the rise of current vertebrate phylogenies. Palaeontology 39:259-287.
Jefferies, R. P. S. 1986. The ancestry of the vertebrates. British Museum, London.
Jones, D. T., W. R. Taylor, and J. M. Thornton. 1992. The rapid generation of mutation data matrices from protein sequences. Comput. Appl. Biosci. 8:275-282.
Kishino, H., and M. Hasegawa. 1989. Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. J. Mol. Evol. 29:170-179.
Kuraku, S., D. Hoshiyama, K. Katoh, H. Suga, and T. Miyata. 1999. Monophyly of lampreys and hagfishes supported by nuclear DNA-coded genes. J. Mol. Evol. 49:729-735.
Lanfranchi, G., A. Pallavicini, P. Laveder, and G. Valle. 1994. Ancestral hemoglobin switching in lampreys. Dev. Biol. 164:402-408.
Lipscomb, D. L., J. S. Farris, M. Källersjö, and A. Tehler. 1998. Support, ribosomal sequences, and the phylogeny of the eukaryotes. Cladistics 14:303-338.
Løvtrup, S. 1977. The phylogeny of vertebrata. Wiley, New York.
Mallatt, J., and J. Sullivan. 1998. 28S and 18S rDNA sequences support the monophyly of lampreys and hagfishes. Mol. Biol. Evol. 15:1706-1718.
Mallatt, J., J. Sullivan, and C. J. Winchell. 2001. The relationship of lampreys to hagfishes: a spectral analysis of ribosomal DNA sequences. Pp. 106-118 in P. E. Ahlberg, ed. Major events in early vertebrate evolution: paleontology, phylogeny, genetics and development. Taylor & Francis, London.
Nei, M., and S. Kumar. 2000. Molecular evolution and phylogenetics. Oxford University Press, New York.
Philippe, H., A. Chenuil, and A. Adoutte. 1994. Can the Cambrian explosion be inferred through molecular phylogeny? Development Suppl:15-25.
Poe, S., and D. L. Swofford. 1999. Taxon sampling revisited. Nature 398:299-300.
Rasmussen, A. S., and U. Arnason. 1999. Molecular studies suggest that cartilaginous fishes have a terminal position in the piscine tree. Proc. Natl. Acad. Sci. USA 96:2177-2182.
Rasmussen, A. S., A. Janke, and U. Arnason. 1998. The mitochondrial DNA molecule of the hagfish (Myxine glutinosa) and vertebrate phylogeny. J. Mol. Evol. 46:382-388.
Romer, A. S. 1966. Vertebrate paleontology. University of Chicago Press, Chicago.
Rosenberg, M. S., and S. Kumar. 2001. Incomplete sampling is not a problem for phylogenetic inference. Proc. Natl. Acad. Sci. USA 98:10751-10756.
Rzhetsky, A., and M. Nei. 1992. A simple method for estimating and testing minimum-evolution trees. Mol. Biol. Evol. 9:954-967.
Stock, D. W., and G. S. Whitt. 1992. Evidence from 18S ribosomal RNA sequences that lampreys and hagfishes form a natural group. Science 257:787-789.
Suga, H., D. Hoshiyama, S. Kuraku, K. Katoh, K. Kubokawa, and T. Miyata. 1999. Protein tyrosine kinase cDNAs from amphioxus, hagfish, and lamprey: isoform duplications around the divergence of cyclostomes and gnathostomes. J. Mol. Evol. 49:601-608.
Suzuki, M., K. Kubokawa, H. Nagasawa, and A. Urano. 1995. Sequence analysis of vasotocin cDNAs of the lamprey, Lampetra japonica, and the hagfish, Eptatretus burgeri: evolution of cyclostome vasotocin precursors. J. Mol. Endocrinol. 14:67-77.
Swofford, D. L. 2002. PAUP*: phylogenetic analysis using parsimony (*and other methods). Sinauer Associates, Sunderland, Mass.
Takezaki, N., and T. Gojobori. 1999. Correct and incorrect vertebrate phylogenies obtained by the entire mitochondrial DNA sequences. Mol. Biol. Evol. 16:590-601.
Templeton, A. 1983. Phylogenetic inference from restriction endonuclease cleavage site maps with particular reference to the evolution of humans and the apes. Evolution 37:221-244.
Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. ClustalW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.
Turbeville, J. M., J. R. Schulz, and R. A. Raff. 1994. Deuterostome phylogeny and the sister group of the chordates: evidence from molecules and morphology. Mol. Biol. Evol. 11:648-655.
Yalden, D. W. 1985. Feeding mechanisms as evidence for cyclostome monophyly. Zool. J. Linn. Soc. 84:291-300.
Yang, Z. 1996. Phylogenetic analysis using parsimony and likelihood methods. J. Mol. Evol. 42:294-307.
Yang, Z. 1999. PAML: a program package for phylogenetic analysis by maximum likelihood. Version 3.0c. University College London, London.