*Department of Computer Science, University of Manchester;
School of Biological Sciences, University of Manchester
![]() |
Abstract |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
![]() |
Introduction |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
In contrast, recent analyses using complete mitochondrial genomes have been unable to establish the higher level relationship among mammals with any confidence and, in some cases, apparently high support is given to clades which are not consistent with these more recent studies (Cao et al. 2000
; Nikaido et al. 2000
). It has been argued that mitochondrial data are inherently less informative than nuclear data for obtaining deep-level mammalian phylogenies and that data sets obtained using only mitochondrial genomes will therefore provide less phylogenetic resolution per nucleotide (Springer et al. 2001
). However, with the current trend toward using ever larger data sets, researchers may be losing sight of the fact that a smaller data set will often provide more accurate results if it fits better with the evolutionary assumptions of the phylogenetic inference method (Swofford et al. 2001
).
We have carried out a phylogenetic analysis using the complete set of mitochondrial tRNA and rRNA sequences from 54 mammals, using an evolutionary model specifically suited to RNA molecules. We use Bayesian phylogenetic inference techniques which arguably provide the most principled and efficient use of data for problems in which there are very many highly probable alternative trees. Our results show support for the same four primary clades observed in the studies of Madsen et al. (2001)
and Murphy et al. (2001a,
2001b
).
Modern approaches to phylogeny are increasingly based on principled probabilistic foundations and make use of explicit evolutionary models. In particular, maximum likelihood and Bayesian methods have a stochastic model of sequence evolution at their heart in which substitutions are modeled as a time-homogeneous Poisson process. Standard models which consider substitutions at the DNA, amino acid, or codon level have been developed (see, for example, Swofford et al. 1996
; Lewis 2001
). Many RNA molecules are subject to functional constraints, resulting in highly conserved secondary structure over long evolutionary times. Mutations at different sites in a molecule are correlated because of compensatory mutations in the helices which are required to preserve base pairing. Models of DNA evolution which consider base paired sites as independent are therefore unsuitable for modeling RNA coding genes.
A number of substitution models for the helical regions of RNA have been proposed that consider pairs of sites as the fundamental unit rather than the single site. In principle, there are 16 possible pairs that can be formed with the four bases; hence, we need a 16 x 16 substitution rate matrix to describe the evolutionary process. In practice, however, there are only six frequently occurring pairs (AU, GU, GC, UA, UG, and CG), whereas the other 10 mismatch pairs together account for only 2%3% of pairs in conserved structural regions (Higgs 2000
). Some models use only a 6 x 6 matrix and neglect mismatches completely, whereas other models group all the mismatches into a single MM state and hence use a 7 x 7 matrix. Savill, Hoyle, and Higgs (2001)
compared many different evolutionary models using likelihood methods to distinguish which features of the rate models are important to give good explanations of real sequence evolution. The rate of double substitutions (e.g., AU to GC) is apparently high, and models that allow double substitutions as a single step fit the data significantly better than those that allow only single substitutions at a time (e.g., AU to GU, then GU to GC). This suggests that the double substitution process is occurring via the compensatory mutation mechanism explained by population genetics theory (Higgs 1998
). The consensus sequence of the population changes at two sites simultaneously, even though the two mutations almost certainly did not occur simultaneously. It was also found that the six-state and seven-state models appear to fit the data as well as the 16-state ones, and there appears to be little benefit from detailed treatment of the mismatches as separate states. Within the six-state and seven-state model families, the most general time-reversible models give significantly higher likelihood values than any of the alternative models with fewer parameters.
We have developed software implementing the most common six-state and seven-state RNA models for use with maximum likelihood and Bayesian phylogenetic inference (PHASE, available from http://www.bioinf.man.ac.uk/resources/). In this article we describe our Bayesian inference method which is based on Markov chain Monte Carlo (MCMC) techniques (Metropolis et al. 1953
; Hastings 1970
). Although MCMC is a well-established method, it has only recently been applied to the problem of phylogenetic inference (Li 1996
; Mau 1996
; Mau and Newton 1997
; Yang and Rannala 1997
; Larget and Simon 1999
; Mau, Newton, and Larget 1999
; Li, Pearl, and Doss 2000
). Two recent reviews suggest that this method of phylogenetic inference has a very promising future (Huelsenbeck et al. 2001
; Lewis 2001
). The main strength of the MCMC approach is the ability to assign a level of support, the posterior probability, to basically any phylogenetic hypothesis under investigation. None of the currently available packages (Larget and Simon 1999
; Huelsenbeck and Ronquist 2001
; McGiure, Denham, and Balding 2001
) include the RNA evolutionary model used here, which is a major limitation given the extensive use of RNA genes in phylogenetic analyses. Here, we apply our MCMC algorithm to the phylogeny of placental mammals and discuss some of the statistical issues that are important when applying Bayesian inference to RNA data sets.
![]() |
Materials and Methods |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
|
|
|
![]() |
We also allow for substitution rate variation over sites by using the discrete-gamma model of Yang (1994)
. This model approximates the distribution of rates across different sites, using a number of discrete categories which are chosen to approximate a Gamma distribution. A single parameter determines the shape of this Gamma distribution and we use four site categories here, a choice which was shown to provide good all-round performance for the cases considered by Yang (1994)
.
Bayesian Phylogenetics
In Bayesian inference we are interested in computing the posterior probability of the hypothesis under investigation. This quantity is the probability of the hypothesis given the available data and including any prior assumptions we wish to include. The symbol i labels the ith tree topology,
i are the branch lengths associated with this topology, and
are the parameters of our evolutionary model (i.e., the rate parameters
ij, base pair frequencies
i, and the Gamma distribution parameter). We can calculate the posterior probability density of the combined state
= {
i,
i,
} given sequence data X using Bayes' theorem,
|
The sum and integrals in the denominator of equation (4) are intractable for realistic sized problems. MCMC is therefore used to generate a large sample from the posterior probability distribution of states without explicit enumeration of these sums and integrals. To calculate the posterior probability of any phylogenetic hypothesis, we simply determine the fraction of samples in support of the hypothesis. For example, the posterior probability of a particular topology is simply given by the fraction of times we see this topology in our MCMC sample.
We use the standard Metropolis-Hastings MCMC algorithm which involves a two-stage process to construct a Markov chain in some state space (Metropolis et al. 1953
; Hastings 1970
). Firstly, a new state
' is proposed according to some proposal mechanism defined by a probability density function f(
'|
) conditional on the current state
. Secondly, the proposed state is accepted with some probability which depends on the posterior probability of the state as well as the proposal distribution. In particular, we define the transition probability between states to be,
|
Equation (5)
defines a first order Markov chain, and it can be shown that under quite weak conditions this chain converges to an equilibrium in which states are distributed according to the posterior density p(|X). The chain therefore provides us with samples from the posterior probability of topologies, model parameters, or any other quantity of interest. Our only condition is that the Markov chain be ergodic, i.e., that there is a nonzero probability of reaching any point in the state-space starting from any other point in a finite number of steps. Unfortunately, we do not always know when the Markov chain has converged sufficiently to give accurate results. We choose to complete a number of independent runs, which at least allows us to determine the consistency of our results.
We have no strong evidence for any particular prior distribution of trees, and we therefore choose a simple factorized prior p() = P(
i)p(
)p(
i). We set a uniform prior over topologies P(
i) = 1/Ns. We choose a flat Dirichlet distribution prior for base pair frequency parameters, i.e., all sets of base pair frequencies summing to one are equally likely. We choose a uniform positive prior for substitution rate ratio parameters, branch lengths, and the Gamma distribution parameter in the variable-rate model of Yang (1994)
. In the case of uniform priors one must set an upper limit to ensure a normalisable prior distribution, but in practice the particular choice of this upper limit does not usually affect our results unless an extreme value is chosen.
We split up the proposal process by having a number of different schemes for different groups of variables which we apply iteratively. For the frequency parameters we adopt the technique of Larget and Simon (1999)
who use a Dirichlet proposal distribution centered at the current frequency vector (see also the description of software implementation at http://www.bioinf.man.ac.uk/resources/). For the Gamma distribution parameter and substitution rates we use a normal proposal distribution centered at the current value for each rate, with a reflecting boundary at zero and at some user-defined upper limit. For the substitution rates the proposal is actually done for a parameter which is the ratio between each rate and one reference rate. This ensures that the Hastings ratio is one for this proposal irrespective of the normalization in equation (2)
.
We use three different proposals for branch length and topology changes which are illustrated in figure 1
. The continuous change proposal (top of fig. 1
) mainly results in branch length changes but can also induce local topology changes. This proposal changes the length of a randomly chosen branch. If the initial length is x, the new length of the branch is set to x + , where
is chosen from a normal distribution with mean zero. A special rule is applied when x +
becomes negative. If the branch is an internal branch (such as that shown at the top of fig. 1
) then the topology is changed to one of the two nearest neighbor topologies with each having a probability of 0.5, and the length of the new internal branch is set to y = |x +
|. If the branch changed is an external branch, then there is no alternative topology to change to if the length becomes negative. In this case the topology of the tree remains the same, and the length of the branch is set to |x +
|. We impose an upper limit on the branch length to ensure that the prior constraint on branch lengths is satisfied, although in practice we have not observed the branch lengths approaching such an upper limit in our simulations.
|
For the results described later trees are sampled after every 10 MCMC cycles, where one cycle corresponds to an update of all model parameters and one continuous branch length change. The other two topology proposals occur with equal probability every 10 cycles. An initial burn-in period was neglected, after which 20,000 trees were sampled and stored for every run.
![]() |
Results |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
|
|
|
|
The monophyly of Afrotheria has previously been obtained using mitochondrial genes (Stanhope et al. 1998
; Mouchaty et al. 2000b
), but its position has been poorly resolved, and it has typically been positioned close to the fereuungulates, which are found in group IV here. We find that it typically branches off earlier than this. Xenarthra has long been identified as one of the oldest placental groups, and our analysis is consistent with this; however, it is only more recently that Afrotheria has been found close to the root of the placentals. This leads us to question the common view of an origin of placental mammals in Asia, spreading to the Northern Hemisphere and then migrating to the Southern Hemisphere.
Group III contains many of the most studied mammals, including the primates, rodents, rabbits, and a tree shrew. This group is well supported, with a posterior probability of 97%, but there are many alternative arrangements within the group that are less well resolved. For example, within the primates there is a posterior probability of 83% attached to the sister relationship of the gorilla and human, whereas the more usually accepted arrangement which places chimps as sister group to humans has only 12% support. We attach little significance to this, as our method is not ideal for very closely related species. Our choice of sites only consists of those which are reliably aligned over the full range of mammals. Rapidly varying sites which would give useful information about closely related species have been excluded. We note that the relationship of the higher primates was also unresolved by Madsen et al. (2001)
with a larger data set. Within the primates the Anthropoidea group (old and new world monkeys, and apes) is well supported. Our results support the sister relationship of the loris (a member of Strepsirhini) with the tarsier which is in agreement with other molecular studies (Madsen et al. 2001
) but is in disagreement with morphological evidence (Ross, Williams, and Kay 1998
) that places the tarsier as a sister group to the Anthropoidea. There is no resolution for the relative positioning of these species, and the arrangement with highest support (68%) has them separated from the other primates. This is perhaps explained by a compositional bias which makes the positioning of the tarsier using mitochondrial genes particularly problematic (Schmitz, Ohme, and Zischler 2002
).
According to our studies there is low support for the monophyly of rodents (26%), and therefore a monophyletic rodent group does not appear in the consensus tree shown in figure 2
. Support for the most likely arrangement which separates Muridae from the other rodents is also quite weak (61%). The monophyly of rodents has been questioned by recent complete mitochondrial genome studies (Reyes et al. 2000
). The latest studies using large numbers of predominantly nuclear genes show stronger support for monophyly (Adkins et al. 2001
; Madsen et al. 2001
; Murphy et al. 2001a,
2001b
), whereas studies using smaller nuclear data sets failed to find reliable support (Adkins et al. 2001
). We find some support (79%) for the tree shrew forming a clade with the rodents and lagomorphs, and there is very low support for the traditional positioning of the tree shrew at the root of the primates.
Group IV is supported with 100% posterior probability, but there are low support values for many clades within this group because of the large number of alternative arrangements that occur within the MCMC runs. This lack of resolution is largely caused by the alpaca, whose position within the tree is particularly variable for this data set. The carnivores are monophyletic with 100% support, and the Perissodactyla form a well-supported monophyletic group with 93% posterior probability. Although the Cetartiodactyla are poorly resolved with weak support for monophyly (28%), the generally accepted relationship for all species apart from the pig and alpaca is strongly supported (97%). The arrangement of species from Eulipotyphla and Chiroptera is poorly resolved for the most part, and the groups appear to mix. There appears to be surprisingly strong support for a close relationship between the hedgehogs and the bats, although we note that the hedgehogs are particularly problematic in all phylogenies using mitochondrial genes. Indeed, the hedgehog has been placed at the root of all placentals (Stanhope et al. 1998
; Mouchaty et al. 2000a,
2000b;
Nikaido et al. 2000
) and widely separated from the mole, which is usually thought to be closely related.
![]() |
Discussion |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
A major advantage of the Bayesian approach is that we do not limit ourselves to a single best tree. It is therefore possible to find strong support for a specific clade or clade arrangement of interest even when there may be a huge number of alternative topologies with comparable likelihood values. An alternative approach would be to use bootstrap resampling of maximum likelihood phylogenies to get an estimate of confidence. But, this would require much greater computational resources than the MCMC approach used here, and the interpretation of bootstrap confidence intervals can be problematic because they seem to provide a conservative estimate of confidence (Swofford et al. 1996
; however, see Efron, Halloran, and Holmes 1996
, for an alternative viewpoint). There may also be bias in bootstrap samples because of the particular optimization heuristic used to obtain an estimate of the maximum likelihood tree.
The resolution at some levels of our phylogeny was not good and in particular there were a number of alternative arrangements within the main groups identified which could not be excluded. This indicates that there is not sufficient evidence within the current set of sequence data to distinguish between these alternatives. It does not indicate a problem with the MCMC algorithm. The variation in posterior probability estimates from different MCMC runs is small for all strongly supported groups in figure 2 , indicating that we have good consistency. Only the relatively weakly supported groups have standard deviations greater than 10% in the mean posterior probability.
It is particularly important to use appropriate substitution models when applying Bayesian inference techniques to RNA data because ignoring covariation between base paired sites will generally invalidate our estimates of posterior probabilities. The usual four-state DNA substitution models consider the nucleotides which are base paired in a helical region of RNA to be independent, however, typically 90% of these bases form stable Watson-Crick base pairs, whereas the majority of other bases form the less stable GU-UG pairing. If we treated these bases as independent, then we would significantly overestimate our posterior probability for a hypothesis which is well supported. To illustrate the argument, consider an idealized situation in which we apply the HKY model (Hasegawa, Kishino, and Yano 1985
) to an RNA helical region in which all of the base pairs form Watson-Crick pairs. Let L(
1) and L(
2) denote likelihoods for the data on one side of the helical region for two different phylogenetic hypotheses
1 and
2. If we ignore correlations, then the likelihood for the data combined from both sides will be L(
1)2 and L(
2)2 by symmetry of the substitution model. If the prior probabilities associated with
1 and
2 are equal, we see that the ratio of posterior probabilities is L(
1)/L(
2) in the first case and L(
1)2/L(
2)2 in the second case. So, for example, if
1 is 100 times more likely given only information from one side of the helix, this will correspond to being 10,000 times more likely given data on both sides of the helix under a four-state DNA substitution model. Ignoring correlations therefore results in an inflated confidence in this hypothesis. Similar considerations would also invalidate bootstrap support values as well as the method of Kishino and Hasegawa (1989)
which is often used to construct confidence intervals for phylogenetic inferences using maximum likelihood or parsimony methods. In practice one could overcome the problem of correlated substitution by simply applying a standard DNA model to data from one side of the helical regions in an RNA molecule, however, in doing this we would lose important information from the 10% of base pairs not forming Watson-Crick pairs.
In figures 3 and 4 we present results using the general time-reversible four-state model for single DNA sites applied to the same paired parts of the RNA genes but treating the sites as independent. Four variable-rate categories are used. In figure 3 , we compare the number of occurrences of each unique topology in a single run (ranked in order of number of occurrences) with results obtained using the seven-state model. We see that runs using the four-state DNA model result in significantly higher posterior probabilities being attached to the top topologies compared with runs using the RNA model. Similarly, much lower posterior probabilities are attached to the less frequent topologies when using a DNA model. These results are consistent with our argument against using a DNA model on base paired regions of RNA.
|
|
Thus, on the basis of the topology of the consensus tree obtained, there seems little to distinguish between the four-state and seven-state models because each tree contains some aspects that appear more reasonable than the other. From a statistical point of view, the seven-state model is preferable because correlations make the posterior probabilities using the four-state model invalid. One thing that is clear is that this analysis, based on a fairly restricted number of sites, seems to give much more reliable results than several previous methods using large sets of mitochondrial genes (Cao et al. 2000
; Nikaido et al. 2000
). It is therefore clear that the paired regions of RNA are highly informative. One reason for this may be the lack of ambiguity in their alignment.
For tRNA genes, most of the informative sites are in the stem regions. The anticodon loop is highly conserved in most mitochondrial tRNAs and hence contains little phylogenetic information. The other two loops are very variable in both length and sequence in most cases. Highly variable and gappy regions of alignments are usually excluded from phylogenetic analyses. In the rRNA alignments, however, there are many reliably aligned sites in unpaired regions, and it would be desirable to include these sites in the analysis. We intend to develop our program so that it can deal with two evolutionary models simultaneously: a seven-state model for the paired sites and a four-state model for the unpaired sites. At present, our program deals with only one or other of these models and therefore we have limited our analysis to the paired sites.
Table 2 shows the posterior average of the estimated frequency parameter over two different MCMC runs in comparison with the empirical frequencies of the states measured directly from the sequences. In column 2 we show the estimates from the runs used to create the tree in figure 2 , using a variable-rate model with four rate categories. In column 3 we show estimates from similar runs but using a constant rate of evolution at every site. The frequency estimates in columns 2 and 3 are fairly close to the empirical frequencies except for a notable difference in the MM frequency, which is 11% in column 2 but is closer to 4% in columns 1 and 3. The transition rate matrices shown in tables 3 and 4 for the variable and constant rate models, respectively, are also significantly different from one another. The problem appears to be that sites with the highest substitution rates are also those with the highest frequency of MM states. This makes sense because the rapidly evolving sites are presumably also those under the weakest evolutionary constraints for conserved structure. The rapidly changing sites contribute more to the estimate of the frequency parameters than other sites because they provide less correlated and therefore more informative samples from the distribution of states. A possible way of eliminating this problem would be to allow independent estimates of the frequency parameters in each rate category of the variable-rate model.
|
We emphasize that the tree in figure 2 was obtained using a variable-rate model, and we estimate that there is a significant variation in evolutionary rates over sites in our sequences. Allowing for variable rates influences the resulting tree a great deal. When the MCMC analysis is performed with a constant rate over sites, we obtain trees differing markedly from figure 2 that contain some biologically very implausible clades. Therefore we do not show these results here.
![]() |
Conclusion |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
![]() |
Acknowledgements |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
![]() |
Footnotes |
---|
Keywords: Bayesian phylogeny
mitochondrial RNA
mammalian evolution
RNA substitution model
Markov chain Monte Carlo
Address for correspondence and reprints: M. Rattray, Department of Computer Science, University of Manchester, Manchester, UK. E-mail: magnus{at}cs.man.ac.uk
.
![]() |
References |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
Adkins R. M., E. L. Gelke, D. Rowe, R. L. Honeycutt, 2001 Molecular phylogeny and divergence time estimates for major rodent groups: evidence from multiple genes Mol. Biol. Evol 18:777-791
Cannone J. J., S. Subramanian, M. N. Schnare, et al. (14 co-authors) 2002 The comparative RNA web (CRW) site: an online database of comparative sequence and structure information for ribosomal, intron, and other RNAs BioMed. Central Bioinformat 3:2.
Cao Y., M. Fujiwara, M. Nikaido, N. Okada, M. Hasegawa, 2000 Interordinal relationships and timescale of Eutherian evolution as inferred from mitochondrial genome data Gene 259:149-158[ISI][Medline]
Efron B., E. Halloran, S. Holmes, 1996 Bootstrap confidence levels for phylogenetic trees Proc. Natl. Acad. Sci. USA 93:7085-7090
Eizirik E., W. J. Murphy, S. J. O'Brien, 2001 Molecular dating and biogeography of the early placental mammals J. Hered 92:212-219
Felsenstein J. P., 1981 Evolutionary trees from DNA sequences: a maximum likelihood approach J. Mol. Evol 17:368-376[ISI][Medline]
. 1989 PHYLIP (phylogeny inference package). Version 3.2 Cladistics 5:164-166
Hasegawa M., H. Kishino, T. Yano, 1985 Dating of the human-ape splitting by a molecular clock of mitochondrial DNA J. Mol. Evol 21:160-174
Hastings W. K., 1970 Monte Carlo sampling methods using Markov chains and their applications Biometrika 57:97-109[ISI]
Helm M., H. Brulé, D. Friede, R. Giege, D. Pûtz, C. Florentz, 2000 Search for characteristic structural features of mammalian mitochondrial tRNAs RNA 6:1356-1379
Higgs P. G., 1998 Compensatory neutral mutations and the evolution of RNA Genetica 102:91-101
. 2000 RNA secondary structure: physical and computational aspects Q. Rev. Biophys 33:199-253[ISI][Medline]
Huelsenbeck J. P., F. Ronquist, 2001 Mrbayes: Bayesian inference of phylogenetic trees Bioinformatics 17:754-755
Huelsenbeck J. P., F. Ronquist, R. Nielsen, J. P. Bollback, 2001 Bayesian inference of phylogeny and its impact on evolutionary biology Science 294:2310-2314
Kishino H., M. Hasegawa, 1989 Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in hominoides J. Mol. Evol 29:170-179[ISI][Medline]
Larget B., D. L. Simon, 1999 Markov chain Monte Carlo algorithms for the Bayesian analysis of phylogenetic trees Mol. Biol. Evol 16:750-759
Lewis P. O., 2001 Phylogenetic systematics turns over a new leaf Trends Ecol. Evol 16:30-37[ISI][Medline]
Li S., 1996 Phylogenetic tree construction using Markov chain Monte Carlo Doctoral dissertation, Ohio State University, Columbus
Li S., D. K. Pearl, H. Doss, 2000 Phylogenetic tree construction using Markov chain Monte Carlo J. Am. Stat. Soc 95:493-508
Madsen O., M. Scally, C. J. Douady, D. J. Kao, R. W. DeBry, R. Adkins, H. M. Amrine, M. J. Stanhope, W. W. de Jong, M. S. Springer, 2001 Molecular phylogenetics and the origins of placental mammals Nature 409:614-618[ISI][Medline]
Mau B., 1996 Bayesian phylogenetic inference via Markov chain Monte Carlo methods Doctoral dissertation, University of Wisconsin, Madison
Mau B., M. A. Newton, 1997 Phylogenetic inference for binary data on dendograms using Markov chain Monte Carlo J. Comput. Graph. Stat 6:122-131[ISI]
Mau B., M. Newton, B. Larget, 1999 Bayesian phylogenetic inference via Markov chain Monte Carlo methods Biometrics 55:1-12[ISI][Medline]
McGiure G., M. C. Denham, D. J. Balding, 2001 Mac5: Bayesian inference of phylogenetic trees from DNA sequences incorporating gaps Bioinformatics 17:479-480[Abstract]
Metropolis N., A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, E. Teller, 1953 Equations of state calculations for fast computing machines J. Chem. Phys 21:1087-1091[ISI]
Mouchaty S. K., A. Gullberg, A. Janke, U. Arnason, 2000a. The phylogenetic position of the Talpidae within Eutheria based on analysis of complete mitochondrial sequences Mol. Biol. Evol 17:60-67
. 2000b. Phylogenetic position of the tenrecs (Mammalia: Tenrecidae) of Madagascar based on analysis of the complete mitochondrial genome sequence of Echinops telfairi Zool. Scr 29:307-317[ISI]
Murphy W. J., E. Eizirik, W. E. Johnson, Y. P. Zhang, O. A. Ryder, S. J. O'Brien, 2001a. Molecular phylogenetics and the origins of placental mammals Nature 409:614-618[ISI][Medline]
Murphy W. J., E. Eizirik, S. J. O'Brien, et al. (11 co-authors) 2001b. Resolution of the early placental mammal radiation using Bayesian phylogenetics Science 294:2348-2351
Nikaido M., M. Harada, Y. Cao, M. Hasegawa, N. Okada, 2000 Monophyletic origin of the order Chiroptera and its phylogenetic position among Mammalia, as inferred from the complete sequence of the mitochondrial DNA of a Japanese megabat, the Ryukyu flying fox J. Mol. Evol 51:318-328[ISI][Medline]
Page R. D. M., E. Holmes, 1998 Molecular evolution, a phylogenetic approach Blackwell Science
Reyes A., C. Gissi, G. Pesole, F. M. Catzeflis, C. Saccone, 2000 Where do rodents fit? Evidence from the complete mitochondrial genome of Sciurus vulgaris Mol. Biol. Evol 48:1-5
Ross C., B. Williams, R. F. Kay, 1998 Phylogenetic analysis of anthropoid relationships J. Hum. Evol 35:221-306[ISI][Medline]
Savill N. J., D. C. Hoyle, P. G. Higgs, 2001 RNA sequence evolution with secondary structure constraints: comparison of substitution rate models using maximum likelihood methods Genetics 157:399-411
Schmitz J., M. Ohme, H. Zischler, 2002 The complete mitochondrial sequence of Tarsius bancanus: evidence for the extensive nucleotide compositional plasticity of primate mitochondrial DNA Mol. Biol. Evol 19:544-553
Springer M. S., R. W. DeBry, C. Douady, H. Amrine, O. Madsen, W. W. de Jong, M. J. Stanhope, 2001 Mitochondrial versus nuclear gene sequences in deep-level mammalian phylogeny reconstruction Mol. Biol. Evol 18:132-143
Stanhope M. J., V. G. Waddell, O. Madsen, W. D. Jong, S. B. Hedges, G. C. Cleven, D. Kao, M. S. Springer, 1998 Molecular evidence for multiple origins of insectivora and for a new order of endemic African insectivore mammals Proc. Natl. Acad. Sci. USA 95:9967-9972
Swofford D. L., G. J. Olsen, P. J. Waddell, D. M. Hillis, 1996 Phylogenetic inference Pp. 407515 in D. M. Hillis, ed. Molecular systematics. 2nd edition. Sinauer Associates, Sunderland, Mass
Swofford D. L., P. J. Waddell, J. P. Huelsenbeck, P. G. Foster, P. O. Lewis, J. S. Rogers, 2001 Bias in phylogenetic estimation and its relevance to the choice between parsimony and likelihood methods Syst. Biol 50:525-539[ISI][Medline]
Waddell P. J., N. Okada, M. Hasegawa, 1999 Towards resolving the interordinal relationships of placental mammals Syst. Biol 48:1-5[ISI][Medline]
Yang Z., 1994 Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods J. Mol. Evol 39:306-314[ISI][Medline]
Yang Z., Rannala B., 1997 Bayesian phylogenetic inference using DNA sequences: a Markov chain Monte Carlo method Mol. Biol. Evol 14:717-724[Abstract]