* Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
Laboratoire de Paléontologie, Paléobiologie et Phylogénie, Institut des Sciences de l'Evolution, Université Montpellier II, Montpellier, France
Abstract |
---|
Key Words: Bayesian, bootstrap, Markov chain Monte Carlo, maximum likelihood, phylogeny, posterior probability
Introduction |
---|
Bayesian inference of phylogeny combines the prior probability of a phylogeny with the tree likelihood to produce a posterior probability distribution on trees (Huelsenbeck et al. 2001). The best estimate of the phylogeny can be selected as the tree with the highest posterior probability (i.e., the MAximum Posterior probability [MAP] tree) (Rannala and Yang 1996). Topologies and branch lengths are not treated as parameters, as in maximum likelihood (ML) methods (Felsenstein 1981), but as random variables. Because posterior probabilities cannot be obtained analytically, they are approximated by numerical methods known as Markov chain Monte Carlo (MCMC) or Metropolis coupled MCMC (MCMCMC). These chains are designed to explore the posterior probability surface by integration over the space of model parameters. Trees are sampled at fixed intervals, and the posterior probability of a given tree is approximated by the proportion of time that the chains visited it (Yang and Rannala 1997). A consensus tree can be obtained from these sampled trees, and Bayesian posterior probabilities (PP) of individual clades, as expressed by the consensus indices, may be viewed as clade credibility values. Thus, Bayesian analysis of the initial matrix of taxa and characters produces both a MAP tree and estimates of the uncertainty of its nodes, directly assessing substitution model, branch length, and topological variables, as well as clade reliability values, all in a reasonable computation time.
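As a rough illustration of how posterior probabilities are read off the sampled trees, the sketch below counts how often each topology and each clade appears in a post-burn-in sample. It assumes a simplified representation in which a tree is a set of clades and a clade is a frozenset of taxon names; the function names and the four-taxon sample are hypothetical and are not taken from any phylogenetics package.

```python
from collections import Counter

def clade_posteriors(sampled_trees):
    """Estimate each clade's PP as the fraction of sampled trees containing it.

    Each tree is a set of clades; each clade is a frozenset of taxon names.
    """
    n = len(sampled_trees)
    counts = Counter(clade for tree in sampled_trees for clade in tree)
    return {clade: c / n for clade, c in counts.items()}

def map_topology(sampled_trees):
    """Return the most frequently sampled topology and its estimated PP."""
    counts = Counter(frozenset(tree) for tree in sampled_trees)
    topology, c = counts.most_common(1)[0]
    return topology, c / len(sampled_trees)

# Hypothetical post-burn-in sample for taxa A, B, C, D (illustrative only).
AB, CD, AC, BD = (frozenset(x) for x in ("AB", "CD", "AC", "BD"))
sample = [{AB, CD}] * 7 + [{AC, BD}] * 3

pp = clade_posteriors(sample)
best_tree, best_pp = map_topology(sample)
```

In this toy sample the clade (A, B) appears in 7 of 10 trees, so its estimated PP is 0.7, and the topology containing it is the MAP estimate.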
Reliability of nodes in phylogenetic trees is classically evaluated in two ways. First, from the initial matrix of characters, a strength of grouping value is measured, that is, the least decrease in log-likelihood associated with the breaking of the clade defined by that node (e.g., Meireles et al. 1999). The statistical significance of this decrease can be estimated with nonparametric or parametric tests (e.g., Goldman, Anderson, and Rodrigo 2000). With Bayesian methods, reliability of MAP tree nodes derives directly from corresponding posterior probabilities. In the second way, the initial character matrix is redrawn with replacement, and bootstrap percentages (BP) are calculated, for example under the ML criterion (BPML), and interpreted as a measure of experiment repeatability (Felsenstein 1985) or phylogenetic accuracy (Felsenstein and Kishino 1993).
The Bayesian approach is presumed to perform roughly as bootstrapped ML (Huelsenbeck et al. 2001) but runs much faster (Larget and Simon 1999; Huelsenbeck et al. 2001). However, Bayesian phylogenetics has its currently unsolved problems, and "perhaps the most vexing mystery is the observed discrepancy between Bayesian posterior probabilities and nonparametric bootstrap support values" (Huelsenbeck et al. 2002). Recent analyses have aimed at comparing Bayesian and ML supports by studying the correlation between PP and BPML (Leaché and Reeder 2002; Whittingham et al. 2002). A compilation of literature values (Karol et al. 2001; Murphy et al. 2001; Buckley et al. 2002; Leaché and Reeder 2002; Whittingham et al. 2002; Wilcox et al. 2002) reveals that plotting PP as a function of BPML can show a significant correlation (P < 0.02), but that the strength of this correlation is highly variable and sometimes very low (r² between 0.33 and 0.99; median 0.73). Moreover, the slope (S) of the regression line (S between 0.29 and 1.08; median 0.79) indicates that BPML values are generally lower than PP values. This trend was already noticed by Rannala and Yang (1996) in their pioneering work, where PP values appeared systematically higher than resampling estimated log-likelihood (RELL) bootstrap support values.
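The slope S and r² quoted above come from regressing PP on BPML across the nodes of a tree. A minimal sketch of that calculation, assuming ordinary least squares and using made-up support values (the function name `regress` and the numbers are illustrative, not data from the cited studies):

```python
def regress(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept, r2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r2

# Hypothetical per-node supports (percent): BPML on x, PP on y.
bp_ml = [50.0, 60.0, 80.0, 100.0]
pp = [75.0, 80.0, 90.0, 100.0]

slope, intercept, r2 = regress(bp_ml, pp)
```

With these invented values the points fall exactly on a line of slope 0.5 and intercept 50, so every PP is at least as large as the matching BPML; a slope below 1 together with a positive intercept is the pattern described in the text.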
As more phylogenetic results relying strictly on Bayesian analyses are published (Arkhipova and Morrison 2001; Henze et al. 2001; Lutzoni, Pagel, and Reeb 2001), a better understanding of the relation between PP and BPML becomes essential. Wilcox et al. (2002) explored this relation by performing simulations on their original data set. They conclude that, under the conditions of their study, PP and BP are both overconservative measures of phylogenetic accuracy but that Bayesian support values provide closer estimates of the true probabilities of recovering clades. Thus they advocate the preferential use of PP rather than BP (Wilcox et al. 2002). However, cases where conflicting hypotheses are supported by high posterior probabilities have been reported (Buckley et al. 2002; Douady et al. in press). This suggests that, at least in certain cases, PP place excessive confidence in a given phylogenetic hypothesis, and drawing conclusions from this sole measure of support might be misleading.
To better understand the relationship between PP and BP, we applied standard (i.e., nonparametric) bootstrap resampling procedures to the Bayesian approach, studying the correlation between PP, BPML, and BPBay (that is, posterior probabilities estimated after bootstrapping of the data) for eight empirical data sets spanning different kinds of characters, types of sequences, genomic compartments, and taxonomic groups. Even when the correlation between PP and BPML was weak (r² < 0.52), it became very strong (r² > 0.96) when Bayesian posterior probabilities were computed on bootstrapped data matrices. Moreover, albeit less clearly, simulations seem to confirm this trend. These simulations also tend to predict that PP exceed BP support for both true and false nodes. We discuss the effect of the bootstrapped approach in the case of apparent conflicts between data sets and consider its practical implications for measuring phylogenetic reliability.
Material and Methods |
---|
Bootstrapped Bayesian Analyses
We generated 100 bootstrap pseudoreplicates for each of the eight data sets using the program SEQBOOT 3.6a2.1 (Felsenstein 2001). For each pseudoreplicate, Bayesian posterior probabilities were estimated as previously described (i.e., the tree sampling and burn-in value were fixed as for the standard Bayesian approach). Bootstrapped Bayesian support was computed for each node in three ways: (1) the bootstrapped posterior probabilities (BPBay) obtained from the consensus of the 500 × 100 = 50,000 trees generated from the 100 bootstrapped pseudoreplicates, (2) the Bayesian bootstrap percentages obtained from the consensus of the 100 MAP trees (i.e., a "consensus of consensus" procedure), and (3) the average of each node's PP for the 100 MAP trees. Given the tedious aspect of preparing files for bootstrapped Bayesian analyses, a Perl script was custom made and is available upon request.
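The procedure above can be sketched in Python as follows. This is not the Perl script mentioned in the text, and it rests on several assumptions that the text does not settle: trees are represented as sets of clades (frozensets of taxon names), each pseudoreplicate's summary tree is approximated by its clades with PP above a hypothetical `cutoff`, and measure (3) averages a clade's PP only over the replicates whose summary tree contains it. All function names are illustrative.

```python
import random
from collections import Counter

def bootstrap_columns(alignment, rng):
    """Resample alignment columns with replacement (one pseudoreplicate).

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    """
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in cols)
            for taxon, seq in alignment.items()}

def clade_freqs(trees):
    """Frequency of each clade in a list of trees (tree = set of clades)."""
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade: k / len(trees) for clade, k in counts.items()}

def bootstrapped_support(replicates, cutoff=0.5):
    """replicates: one list of sampled trees per bootstrap pseudoreplicate."""
    # (1) BPBay: clade frequencies over all sampled trees pooled together.
    pooled = clade_freqs([tree for rep in replicates for tree in rep])
    # Per-replicate PP and summary tree. ASSUMPTION: clades with PP above
    # `cutoff` stand in for that replicate's summary (MAP/consensus) tree.
    per_rep = [clade_freqs(rep) for rep in replicates]
    summaries = [{c for c, p in freqs.items() if p > cutoff}
                 for freqs in per_rep]
    # (2) "Consensus of consensus": share of summary trees with the clade.
    consensus = clade_freqs(summaries)
    # (3) Mean PP over the summary trees that contain the clade (ASSUMPTION).
    mean_pp = {}
    for clade in consensus:
        values = [f[clade] for f, s in zip(per_rep, summaries) if clade in s]
        mean_pp[clade] = sum(values) / len(values)
    return pooled, consensus, mean_pp

# Toy example: two pseudoreplicates with five sampled trees each.
AB, AC = frozenset("AB"), frozenset("AC")
reps = [[{AB}] * 4 + [{AC}], [{AB}] * 2 + [{AC}] * 3]
pooled, consensus, mean_pp = bootstrapped_support(reps)
pseudo = bootstrap_columns({"t1": "ACGT", "t2": "acgt"}, random.Random(0))
```

In the toy example, clade (A, B) gets a pooled support of 0.6, appears in half of the summary trees, and has a mean PP of 0.8 in the one summary tree that contains it, which shows how the three measures can diverge.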
Simulation Studies
We also explored the relation between PP and BP using a simulation design. Monte Carlo simulation of 100 data sets of 1,000 characters for seven taxa each was performed using SEQ-GEN 1.2.5 (Rambaut and Grassly 1997), under a model topology and associated branch lengths taken from the armadillo subset of the VWF xenarthran data. The K2P model of nucleotide substitution (Kimura 1980) was chosen with a transition:transversion ratio of 2.00 and a Γ8 distribution with α = 1.00. BPML and PP supports were obtained for these 100 simulated data sets following the same procedure as described above. For computing time reasons (i.e., running MrBayes 2,500 times), BPBay were only computed for the 25 data sets showing the greatest contrast between BPML and PP.
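For readers unfamiliar with the K2P model, the sketch below computes its substitution probabilities along a single branch. It assumes that the transition:transversion ratio is the expected number of transitions divided by the expected number of transversions, that branch length is measured in expected substitutions per site, and it omits among-site rate heterogeneity entirely. The function name is hypothetical, and this is not how SEQ-GEN is invoked.

```python
import math

def k2p_probabilities(d, tstv=2.0):
    """K2P probabilities along a branch of length d (substitutions per site).

    Returns (p_same, p_transition, p_transversion), where p_transversion is
    the probability of each of the two possible transversions.
    ASSUMPTION: tstv = alpha / (2 * beta), with alpha the transition rate and
    beta the rate of each transversion, so that d = (alpha + 2 * beta) * t.
    """
    beta_t = d / (2.0 * tstv + 2.0)
    alpha_t = 2.0 * tstv * beta_t
    e1 = math.exp(-4.0 * beta_t)
    e2 = math.exp(-2.0 * (alpha_t + beta_t))
    p_same = 0.25 + 0.25 * e1 + 0.5 * e2
    p_transition = 0.25 + 0.25 * e1 - 0.5 * e2
    p_transversion = 0.25 - 0.25 * e1
    return p_same, p_transition, p_transversion

p_same, p_ts, p_tv = k2p_probabilities(0.3, tstv=2.0)
```

The four outcomes (no change, one transition, two transversions) sum to 1, and for very short branches the ratio of transitions to total transversions approaches the chosen ratio of 2.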
Results and Discussion |
---|
Such a correlation between BPML and BPBay is to be expected, since the use of uniform priors in the Bayesian analyses implies that the posterior probability density is strongly dependent upon the likelihood function. However, this correlation is not trivial either, because the ML and the MAP trees obtained from each bootstrap pseudoreplicate are not always identical. For example, in the case of the 21 shark and xenarthran data sets, ML and MAP trees are different in 38% and 27% of the replicates, respectively. Therefore, the very strong correlation between BPBay and BPML (r² > 0.95) cannot be expected a priori.
A Simulation Study to Compare Maximum Likelihood and Bayesian Node Reliability
Nonparametric bootstrapping may be an overconservative estimator of node reliability (Hillis and Bull 1993; Wilcox et al. 2002; but see Felsenstein and Kishino 1993; Efron, Halloran, and Holmes 1996), but it remains the most commonly used way to characterize it. From the statistical point of view, posterior probabilities have the advantage of a straightforward interpretation, as they represent the probability that the corresponding clade is true, given the model, the priors, and the data (Huelsenbeck et al. 2002). However, as we showed, they are not tightly correlated with ML bootstrap percentages. Thus, these estimators seem rather different, as PP need to be calculated on bootstrapped data to behave like BPML supports. Recently, Wilcox et al. (2002), based on a simulation study, concluded that PP and BP are both overconservative measures of node support but that PP provide closer estimates of the true probabilities of recovering clades.
Results from our simulations seem to confirm that PP is less conservative than BP. Indeed, when considering true nodes (those that were present in the model topology), PP are generally higher than BPML and BPBay (fig. 2A, upper right quarter). However, PP is also higher when looking at strong support for false nodes, that is, those that were absent from the model tree (fig. 2B, upper right quarter). Below 50% of PP and BP (fig. 2B, lower left quarter), that is, for values that are usually not interpreted for phylogenetic inference, there is a large dispersion of points, with a trend of low BP to overestimate accuracy, as noted by Hillis and Bull (1993). As a whole, these simulation results imply that, at least in certain cases, high PP falsely interpret signal and may end up strongly supporting incorrect phylogenetic relationships. Thus, the more conservative BPML and BPBay seem less prone to strongly supporting a node when it is actually false.
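One simple way to quantify this behavior in a simulation, where the model tree is known, is to tally how many strongly supported nodes are false. The sketch below is only an illustration of that bookkeeping: the 95% threshold, the function name, and the support values are hypothetical and are not taken from the simulations reported here.

```python
def false_fraction_among_supported(nodes, threshold=95.0):
    """Fraction of nodes with support >= threshold that are absent from the
    model tree. `nodes` is a list of (support_percent, is_true) pairs.
    Returns None when no node reaches the threshold.
    """
    supported = [is_true for support, is_true in nodes if support >= threshold]
    if not supported:
        return None
    return sum(1 for is_true in supported if not is_true) / len(supported)

# Hypothetical nodes recovered from simulated data sets (illustrative only).
pp_nodes = [(99.0, True), (97.0, False), (80.0, True), (96.0, True)]
bp_nodes = [(96.0, True), (70.0, False), (60.0, True), (88.0, True)]

pp_false = false_fraction_among_supported(pp_nodes)
bp_false = false_fraction_among_supported(bp_nodes)
```

In this invented example one of the three nodes with PP of at least 95% is false, whereas the single node with BP of at least 95% is true.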
Conclusions |
---|
Nevertheless, Bayesian inference, with and without bootstrap, remains a very efficient way to simultaneously estimate substitution model parameters, branch lengths, and topology under complex models of evolutionary change (Huelsenbeck 2002). If we take the chondrichthyan 12S-16S data sets with 23 taxa as an example (fig. 3B), a standard Bayesian search (or one Bayesian bootstrap replicate) runs roughly 80 times faster on a 1.80 GHz Pentium 4 than a single PAUP* replicate of BPML with simultaneous estimation of all parameters. Bayesian search on bootstrap data is much faster than ML if the user wants parameters to be estimated as the search goes, and it gives very similar results (fig. 1). However, in the vast majority of cases, an ML (or BPML) search with simultaneous estimation of the parameters is not necessary, and a priori approximations allow the identification of the optimal trees and bootstrap supports. The Bayesian approach also provides a unique way to analyze amino acid data with simultaneous parameter estimation, whereas this option is only available for DNA in popular phylogenetic packages such as PAUP or PHYLIP.
Both PP and bootstrap supports are of great interest to phylogeny as potential upper and lower bounds of node support, but they are surely not interchangeable and cannot be directly compared. In that context, users may prefer computing PP and BPBay or BPML to better explore the range of node support estimates, especially when potential conflicts between data sets are explored.
Acknowledgements |
---|
Note Added in Proof: Suzuki, Glazko, and Nei recently showed by simulation that posterior probabilities in Bayesian analysis can be excessively liberal, whereas bootstrap probabilities in Neighbor-Joining and maximum likelihood analyses are generally slightly conservative (2002, Proc. Natl. Acad. Sci. USA 99:16138-16143).
Footnotes |
---|
Literature Cited |
---|
Arkhipova, I. R., and H. G. Morrison. 2001. Three retrotransposon families in the genome of Giardia lamblia: two telomeric, one dead. Proc. Natl. Acad. Sci. USA 98:14497-14502.
Boucher, Y., H. Huber, S. L'Haridon, K. O. Stetter, and W. F. Doolittle. 2001. Bacterial origin for the isoprenoid biosynthesis enzyme HMG-CoA reductase of the archaeal orders Thermoplasmatales and Archaeoglobales. Mol. Biol. Evol. 18:1378-1388.
Buckley, T. R. 2002. Model misspecification and probabilistic tests of topology: evidence from empirical data sets. Syst. Biol. 51:509-523.
Buckley, T. R., P. Arensburger, C. Simon, and G. K. Chambers. 2002. Combined data, Bayesian phylogenetics, and the origin of the New Zealand cicada genera. Syst. Biol. 51:4-18.
Buckley, T. R., and C. W. Cunningham. 2002. The effect of nucleotide substitution model assumptions on estimates of nonparametric bootstrap support. Mol. Biol. Evol. 19:394-405.
Delsuc, F., M. Scally, O. Madsen, M. J. Stanhope, W. W. De Jong, F. M. Catzeflis, M. S. Springer, and E. J. P. Douzery. 2002. Molecular phylogeny of living xenarthrans and the impact of character and taxon sampling on the placental tree rooting. Mol. Biol. Evol. 19:1656-1671.
Douady, C. J., M. Dosay, M. S. Shivji, and M. J. Stanhope. Molecular phylogenetic evidence refuting the hypothesis of Batoidea (rays and skates) as derived sharks. Mol. Phylogenet. Evol. (in press).
Douzery, E. J., A. M. Pridgeon, P. Kores, H. P. Linder, H. Kurzweil, and M. W. Chase. 1999. Molecular phylogenetics of Diseae (Orchidaceae): a contribution from nuclear ribosomal ITS sequences. Am. J. Bot. 86:887-899.
Efron, B., E. Halloran, and S. Holmes. 1996. Bootstrap confidence levels for phylogenetic trees. Proc. Natl. Acad. Sci. USA 93:13429-13434.
Felsenstein, J. 1981. Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17:368-376.
Felsenstein, J. 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783-791.
Felsenstein, J. 2001. PHYLIP (PHYLogeny Inference Package). Version 3.6a2.1. Department of Genome Sciences. University of Washington. Seattle.
Felsenstein, J., and H. Kishino. 1993. Is there something wrong with the bootstrap on phylogenies? A reply. Syst. Biol. 42:193-200.
Goldman, N., J. P. Anderson, and A. G. Rodrigo. 2000. Likelihood-based tests of topologies in phylogenetics. Syst. Biol. 49:652-670.
Henze, K., D. S. Horner, S. Suguri, D. V. Moore, L. B. Sanchez, M. Muller, and T. M. Embley. 2001. Unique phylogenetic relationships of glucokinase and glucosephosphate isomerase of the amitochondriate eukaryotes Giardia intestinalis, Spironucleus barkhanus, and Trichomonas vaginalis. Gene 281:123-131.
Hillis, D. M., and J. J. Bull. 1993. An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Syst. Biol. 42:182-192.
Huelsenbeck, J. P. 2002. Testing a covariotide model of DNA substitution. Mol. Biol. Evol. 19:698-707.
Huelsenbeck, J. P., B. Larget, R. E. Miller, and F. Ronquist. 2002. Potential applications and pitfalls of Bayesian inference of phylogeny. Syst. Biol. 51:673-688.
Huelsenbeck, J. P., and F. Ronquist. 2001. MrBayes: Bayesian inference of phylogenetic trees. Bioinformatics 17:754-755.
Huelsenbeck, J. P., F. Ronquist, R. Nielsen, and J. P. Bollback. 2001. Bayesian inference of phylogeny and its impact on evolutionary biology. Science 294:2310-2314.
Karol, K. G., R. M. McCourt, M. T. Cimino, and C. F. Delwiche. 2001. The closest living relatives of land plants. Science 294:2351-2353.
Kimura, M. 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.
Larget, B., and D. L. Simon. 1999. Markov chain Monte Carlo algorithms for the Bayesian analysis of phylogenetic trees. Mol. Biol. Evol. 16:750-759.
Leaché, A. D., and T. W. Reeder. 2002. Molecular systematics of the Eastern Fence Lizard (Sceloporus undulatus): a comparison of parsimony, likelihood, and Bayesian approaches. Syst. Biol. 51:44-68.
Lutzoni, F., M. Pagel, and V. Reeb. 2001. Major fungal lineages are derived from lichen symbiotic ancestors. Nature 411:937-940.
Maddison, W. P. 1997. Gene trees in species trees. Syst. Biol. 46:523-536.
Meireles, C. M., J. Czelusniak, M. P. Schneider, J. A. Muniz, M. C. Brigido, H. S. Ferreira, and M. Goodman. 1999. Molecular phylogeny of ateline New World monkeys (Platyrrhini, Atelinae) based on gamma-globin gene sequences: evidence that Brachyteles is the sister group of Lagothrix. Mol. Phylogenet. Evol. 12:10-30.
Murphy, W. J., E. Eizirik, S. J. O'Brien, et al. (11 co-authors). 2001. Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science 294:2348-2351.
Posada, D., and K. A. Crandall. 1998. Modeltest: testing the model of DNA substitution. Bioinformatics 14:817-818.
Rambaut, A., and N. C. Grassly. 1997. Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees. Comput. Appl. Biosci. 13:235-238.
Rannala, B., and Z. Yang. 1996. Probability distribution of molecular evolutionary trees: a new method of phylogenetic inference. J. Mol. Evol. 43:304-311.
Strimmer, K., and A. von Haeseler. 1996. Quartet puzzling: a quartet maximum-likelihood method for reconstructing tree topologies. Mol. Biol. Evol. 13:964-969.
Swofford, D. L. 2002. PAUP*. Phylogenetic analysis using parsimony (*and other methods). Version 4.0b10. Sinauer Associates, Sunderland, Mass.
Waddell, P. J., H. Kishino, and R. Ota. 2001. A phylogenetic foundation for comparative mammalian genomics. Genome Informatics Series 12:141-155.
Whittingham, L. A., B. Slikas, D. W. Winkler, and F. H. Sheldon. 2002. Phylogeny of the tree swallow genus, Tachycineta (Aves: Hirundinidae), by Bayesian analysis of mitochondrial DNA sequences. Mol. Phylogenet. Evol. 22:430-441.
Wilcox, T. P., D. J. Zwickl, T. A. Heath, and D. M. Hillis. 2002. Phylogenetic relationships of the dwarf boas and a comparison of Bayesian and bootstrap measures of phylogenetic support. Mol. Phylogenet. Evol. 25:361-371.
Yang, Z., and B. Rannala. 1997. Bayesian phylogenetic inference using DNA sequences: a Markov Chain Monte Carlo method. Mol. Biol. Evol. 14:717-724.