*Department of Biology, University of Utah; and Institute of Biological Anthropology, University of Oxford
Abstract |
---|
Introduction |
---|
The increased efficiency of sequencing technology holds the promise that in the future, much longer sequences will routinely be generated. The potential gain in precision is already apparent for mitochondrial DNA (mtDNA), for which the existence of entire genomic sequences for a substantial number of taxa (e.g., Xu, Janke, and Arnason 1996; Arnason, Gullberg, and Janke 1997; Penny and Hasegawa 1997) has begun to yield more informative phylogenies, superseding those based on single mtDNA genes. The reliability to be gained by using the entire mtDNA genome has yet to be comprehensively evaluated, since there have been few statistically robust analyses comparing the accuracy of branch lengths estimated from the whole genome with that of branch lengths estimated from smaller mtDNA segments (but see Hillis et al. 1992; Graybeal 1994).
It is often not appreciated that maximum-likelihood (ML) criteria include both branch length and topology estimates, even though poor estimation of either can interfere with efforts to build a reliable tree. We show that trees that have identical topologies but are estimated from different sequences can differ significantly from one another simply because branch length estimates differ. Hence, the estimates of the timing and extent of evolutionary divergence may be significantly better for some mtDNA sequences than for others. Furthermore, some sequences that recover incorrect topologies provide significantly better branch length estimates than do sequences that recover the correct topology.
In addition, to detect statistically significant differences among trees, it is important that the test of the hypothesis is set up properly. So far, most analyses of the phylogenetic utility of shorter sequences (Cao et al. 1994, 1998; Zardoya and Meyer 1996) have not used statistically informative comparisons. These earlier studies fit the expected tree to the individual gene sequence and generally failed to show that ML gene trees are significantly different from the expected tree. Single gene sequences lack sufficient phylogenetic information to provide a powerful contrast, and hence will fail to discriminate among the alternative hypotheses. Ranking genes using tests that lack the power to detect the significance of differences among ML scores is meaningless, since the differences may be attributable to random error. Instead, the more appropriate procedure is to fit trees defined by single-gene (and longer) sequences to the full sequence of all 13 genes (11 kb). When this is done, we can demonstrate that trees built from single genes are significantly poorer than the expected tree generated from the entire mtDNA protein-coding sequence.
Nineteen mammalian sequences, including six hominoid primates (gibbon, orangutan, gorilla, human, and two chimpanzee species), six ungulates, and three carnivores, were chosen because there is general consensus about the phylogenetic relationships within (although not among) these orders. Phylogenies for modern eutherians can be problematic, because most families within modern orders appeared rapidly during the early Paleocene (Flynn and Galiano 1982; Alroy 1999) or during a subsequent mid-Cenozoic radiation (Simpson 1945; Radinsky 1982; Novacek 1990). The difficulty arises because sequences must contain enough change to reveal closely spaced branching events without accumulating homoplasies that obscure the true branching pattern; thus, sequences must be relatively conserved and they also must be relatively long. Even the entire mtDNA sequence may be inadequate for resolving phylogenetic trees among very divergent taxa (Sullivan and Swofford 1997; Naylor and Brown 1998; this study). On the other hand, the subordinal relationships of taxa included in this study are known. Consequently, these modern eutherians that have an evolutionary history of not much more than 70 Myr represent an ideal set of taxa to evaluate the performance of mtDNA genes.
Materials and Methods |
---|
Data Analysis
ML trees were constructed from each of the 13 protein-coding gene sequences, from the full concatenated sequence, and from seven different combinations of genes. Several criteria were used to combine the seven multiple-gene sequences, which ranged in length from about 2 to 3.5 kb. For example, cytochrome b, widely used for mammalian phylogenies, was combined with the nearby ND5 gene, for which primers are readily available (CYTB/ND5). We used sequence similarity plots (GCG 1997) to empirically determine the relative rates of sequence evolution and then assembled two combinations of sequences with similar levels of conservation (COI/COII, conserved; ND3/ND4/ND4L, intermediate) and one that mixed fast, intermediate, and conserved genes (ND2/ND1/COI, respectively). Finally, some sequences comprising genes that recovered topologies similar to that of the expected tree (ND1/ND2, ND5 [first and second positions only]/ND4) were assembled and compared with sequences of similar lengths with poorly performing genes (ND3/ND4L/COII/COIII).
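Assembling a multiple-gene sequence amounts to joining per-gene alignments taxon by taxon, optionally after dropping third codon positions. The following is a minimal sketch of that bookkeeping, not the procedure used in the study; the function names, the dictionary layout, and the toy sequences are hypothetical.

```python
def concatenate_genes(alignments, gene_order):
    """Join per-gene alignments into one multi-gene alignment.

    alignments: dict of gene name -> dict of taxon -> aligned sequence
                (a hypothetical in-memory layout).
    gene_order: genes to join, in order, e.g. ["CYTB", "ND5"].
    """
    taxa = sorted(alignments[gene_order[0]])
    for gene in gene_order:
        # Every gene must cover the same taxa and be aligned to one length.
        if sorted(alignments[gene]) != taxa:
            raise ValueError(f"{gene} has a different taxon set")
        lengths = {len(seq) for seq in alignments[gene].values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene} sequences are not aligned to one length")
    return {t: "".join(alignments[g][t] for g in gene_order) for t in taxa}


def drop_third_positions(seq):
    """Keep first and second codon positions of an in-frame coding sequence."""
    return "".join(base for i, base in enumerate(seq) if i % 3 != 2)


# Toy data only; real alignments would be thousands of sites long.
toy = {
    "CYTB": {"human": "ATGACC", "gorilla": "ATGACT"},
    "ND5": {"human": "ATGCTA", "gorilla": "ATGTTA"},
}
combined = concatenate_genes(toy, ["CYTB", "ND5"])
```

Checking the taxon sets and alignment lengths before joining guards against silently shifting sites between taxa, which would corrupt every downstream likelihood.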
While earlier studies focus on the utility of genes for resolving very deep divergences (Cao et al. 1994, 1998; Graybeal 1994; Zardoya and Meyer 1996; Naylor and Brown 1997), we increased the density of subordinal lineages dating from the mid-Cenozoic and assumed that our results would be more applicable to recent mammalian divergences. The time depth for which each gene is appropriate differs, because the rate of sequence evolution differs among genes (Simon et al. 1994); less conserved genes (e.g., ND2) may perform better for more recent splits than the highly conserved genes (COI, COII, and COIII). To determine whether a different set of genes is phylogenetically informative for subordinal splits, we used the same set of parameters (DNAML default) (Felsenstein 1993) used by Zardoya and Meyer (1996) for higher-level relationships. Subsequently, we checked our results using a more complex model (see below).
The default DNAML model, hereinafter referred to as the Felsenstein and Hasegawa-Kishino-Yano model (F84/HKY; Swofford et al. 1996), accommodates unequal base frequencies (empirical) and a 2:1 transition/transversion ratio (the ML estimate for these data and for this model, which assumes rate homogeneity; details available from P.S.C.). Each tree was constructed using a global search strategy for 10 runs, each with a randomly different order of species input. We estimated the degree to which the data support each clade with majority-rule consensus trees built from ML trees generated from 100 bootstrap replicates (Felsenstein 1985) of the sequence data. Sequences were analyzed with and without the third codon positions.
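The bootstrap step can be illustrated with a short sketch: alignment columns are resampled with replacement, a tree is built from each replicate, and a clade enters the majority-rule consensus when it appears in more than half of the replicate trees. Tree building itself is not shown; the function names, toy alignment, and replicate clade sets below are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    # The same column indices are used for every taxon, so sites stay intact.
    return {taxon: "".join(seq[c] for c in cols) for taxon, seq in alignment.items()}


def clade_support(replicate_clades, clade):
    """Fraction of replicate trees that contain the clade (a frozenset of taxa)."""
    hits = sum(1 for clades in replicate_clades if clade in clades)
    return hits / len(replicate_clades)


toy = {"human": "ACGTAC", "chimp": "ACGTAT", "gorilla": "ACGAAT"}
replicate = bootstrap_alignment(toy, random.Random(1))

# Hypothetical clade sets read from four replicate trees.
human_chimp = frozenset({"human", "chimp"})
replicate_clades = [{human_chimp}, set(), {human_chimp}, {human_chimp}]
support = clade_support(replicate_clades, human_chimp)  # 3 of 4 replicates
in_majority_rule_consensus = support > 0.5
```

In the analysis described above, 100 replicates would be drawn and an ML tree estimated from each before clades are tallied.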
To ensure that conclusions from this study were not artifacts of a poor model (e.g., Saccone, Reyes, and Pesole 1998), we expanded the model to include across-site rate heterogeneity parameters and invariant sites (Hasegawa-Kishino-Yano+invariant+gamma [HKY+I+G]; Swofford et al. 1996) using ML methods (Swofford 1999). Since rate heterogeneity and invariant-sites parameters have been shown to significantly increase the likelihood for mammalian trees (Huelsenbeck 1997; Sullivan and Swofford 1997; unpublished data), we estimated ML parameters (transition/transversion ratio, base composition bias, the proportion of invariant sites, and gamma rate parameters) separately for each sequence. Subsequently, we used the more complex models to verify the results from the simpler (F84/HKY) model. For each sequence, model choice was based on nested ML tests (Posada and Crandall 1998) of 12 models ranging in complexity from the 4-parameter F81 to the 10-parameter general-time-reversible (GTR+I+G) model (Swofford et al. 1996).
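A nested ML test of this kind is conventionally a likelihood ratio test: twice the gain in log-likelihood from the added parameters is compared with a chi-square critical value whose degrees of freedom equal the number of added parameters. The sketch below illustrates that comparison at the 5% level; the function name and the log-likelihood values are hypothetical, and it does not reproduce the full 12-model hierarchy.

```python
# Upper 5% critical values of the chi-square distribution, keyed by degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070, 6: 12.592}


def likelihood_ratio_test(lnl_simple, lnl_complex, extra_params):
    """Return (statistic, reject_simple) for two nested substitution models.

    lnl_simple, lnl_complex: maximized log-likelihoods of the nested pair.
    extra_params: number of parameters the complex model adds (1 to 6 here).
    """
    if lnl_complex < lnl_simple:
        raise ValueError("a nested complex model cannot fit worse at its ML estimate")
    stat = 2.0 * (lnl_complex - lnl_simple)
    return stat, stat > CHI2_CRIT_05[extra_params]


# Hypothetical scores: a rate-homogeneous model versus the same model with a
# gamma shape parameter added (one extra parameter).
stat, prefer_complex = likelihood_ratio_test(-52340.2, -50110.7, extra_params=1)
```

When the simpler model is rejected, the test proceeds up the hierarchy with the more complex model as the new null.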
To verify that results held across methods, we constructed neighbor-joining (NJ) trees for the combined sequence data and for individual genes from Tajima-Nei distance matrices (GCG 1997), and using parsimony criteria (Maddison and Maddison 1993), we compared the fit of the 13 gene tree topologies with the expected tree topology and reasonable alternatives.
The fit of each gene or multiple-gene tree to the full sequence was compared with that of the expected tree using the KH test (Kishino and Hasegawa 1989), in which the mean difference across sites in log-likelihoods (lnL) was compared with the standard error (SE) of the differences. If the log-likelihood statistic (-lnL difference/SE) was greater than 1.96, then the alternative trees were considered significantly different from one another. For ML methods, the KH test does not (and cannot) explicitly compare alternative topologies (branching pattern), but, rather, compares the branch length estimates conditioned on each topology. The topology that fits the optimal branch length estimate is the ML tree topology (Felsenstein 1981). Thus, alternative topologies have different branch length estimates, and these differences may or may not be statistically significant. Similarly, two trees that have exactly the same topology but are estimated from different genes can have significantly different branch length estimates. For this reason, for each gene sequence, we used a nested set of likelihood scores to evaluate deviations in branch length estimates not attributable to topology differences. In a manner analogous to the analysis of variance (ANOVA), which partitions the total log-likelihood statistic into individual components of variation, we separated the component of the log-likelihood difference due to changes in topology alone from the additional component due to branch length differences unrelated to topology.
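The KH statistic described above can be sketched from per-site log-likelihoods. This is a minimal illustration, assuming the per-site log-likelihoods of the two trees on the full sequence are already available as lists (for example, exported from an ML program); the function name and the toy values are hypothetical.

```python
import math


def kh_statistic(site_lnl_a, site_lnl_b):
    """Mean per-site lnL difference divided by the standard error of that mean."""
    if len(site_lnl_a) != len(site_lnl_b) or len(site_lnl_a) < 2:
        raise ValueError("need two equal-length lists with at least two sites")
    n = len(site_lnl_a)
    diffs = [a - b for a, b in zip(site_lnl_a, site_lnl_b)]
    mean = sum(diffs) / n
    # Sample variance of the per-site differences (n - 1 in the denominator).
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    se_of_mean = math.sqrt(var / n)
    return mean / se_of_mean


# Four toy sites; a real comparison would use every site in the 11-kb alignment.
expected_tree = [-1.0, -2.0, -3.0, -4.0]
gene_tree = [-1.5, -2.0, -3.5, -4.0]
stat = kh_statistic(expected_tree, gene_tree)
significant = abs(stat) > 1.96
```

Dividing the summed difference by the standard error of the sum gives the same ratio, so either form can be compared with the 1.96 threshold.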
Partitioning the likelihood score is straightforward using DNAML (Felsenstein 1993) and PAUP (Swofford 1999), which allow the user to specify a set of trees for the KH test and to either specify (constrain) or re-estimate (unconstrain) the branch lengths for each tree. For the first set of tests, we fit the topology and branch lengths estimated from the single genes to the full sequence and compared the resulting log-likelihood score with that of the expected tree. The full log-likelihood statistic is the difference between log-likelihood scores for the ML tree derived from the full sequence (the expected tree) and the ML tree estimated from the single-gene sequence. Therefore, the size of the full log-likelihood statistic reflects deviations of the gene tree from the expected tree that are attributable to topological differences (if any) and to other branch length differences.
Second, we fit each gene tree topology to the full sequence with the branch lengths unconstrained. Thus, the branch lengths were re-estimated conditioned on the gene tree topology, and the resulting change in log-likelihood deviation reflects only the change in topology. By constraining the topology but allowing the branch lengths to be reoptimized to those that best fit the full-sequence data, we were able to separate the component of the log-likelihood deviations attributable to the fit of the gene tree topology from the total deviation.
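The two fits described above can be written as a simple decomposition. The notation here is introduced only for illustration and is not the authors': every log-likelihood is evaluated on the full sequence, $\hat{\tau}_F$ and $\hat{b}_F$ are the topology and branch lengths of the expected tree, $\tau_g$ and $b_g$ are those estimated from gene $g$, and $\hat{b}_{F\mid\tau_g}$ are branch lengths re-estimated from the full sequence on the gene tree topology.

```latex
\begin{aligned}
\Delta_{\text{total}}    &= \ln L(\hat{\tau}_F, \hat{b}_F) - \ln L(\tau_g, b_g), \\
\Delta_{\text{topology}} &= \ln L(\hat{\tau}_F, \hat{b}_F) - \ln L(\tau_g, \hat{b}_{F\mid\tau_g}), \\
\Delta_{\text{branch}}   &= \Delta_{\text{total}} - \Delta_{\text{topology}}
                          = \ln L(\tau_g, \hat{b}_{F\mid\tau_g}) - \ln L(\tau_g, b_g).
\end{aligned}
```

Each component is nonnegative: the expected tree maximizes the likelihood of the full sequence, and the re-estimated branch lengths maximize it for the fixed gene tree topology. When the gene tree and the expected tree share a topology, $\Delta_{\text{topology}} = 0$ and the whole deviation is attributable to branch lengths.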
To explore the performance of genes in estimating specific branches of the expected tree, we used PHYLIP-generated estimates to compare the accuracy, across genes, of individual branch lengths by determining which estimates fell within the relatively narrow 95% confidence intervals (CIs) estimated for the expected tree. These tests of individual branch length fit cannot provide reliable measures of statistical significance and are meant to be exploratory rather than inferential. Work is now in progress to statistically evaluate hypotheses suggested by this analysis.
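The exploratory check described above reduces to asking, branch by branch, whether a gene-based estimate lies inside the interval reported for the expected tree. A minimal sketch, assuming the 95% CI bounds have already been read from the program output; the function name, branch labels, and all numbers are hypothetical.

```python
def branches_within_ci(gene_lengths, expected_ci):
    """Return the branches whose gene-based estimate lies inside the expected-tree CI.

    gene_lengths: dict of branch label -> branch length estimated from one gene.
    expected_ci:  dict of branch label -> (lower, upper) 95% CI from the expected tree.
    """
    return [
        branch
        for branch, length in gene_lengths.items()
        if expected_ci[branch][0] <= length <= expected_ci[branch][1]
    ]


# Toy values in substitutions per site; not taken from the study.
expected_ci = {
    "human-chimp": (0.040, 0.052),
    "gorilla": (0.055, 0.070),
    "orangutan": (0.110, 0.135),
}
gene_lengths = {"human-chimp": 0.047, "gorilla": 0.081, "orangutan": 0.112}
inside = branches_within_ci(gene_lengths, expected_ci)
```

As the text notes, a tally of this kind is descriptive only and carries no significance level of its own.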
Results |
---|
ML Topologies
No single gene sequence recovers the expected topology (fig. 1) regardless of method (ML and NJ, with or without the third-position sites). Among single genes, only the ND1 and ND2 topologies provide a reasonable fit (F84, HKY+I+G, ΔlnL/SE = 1.27 and 1.78, respectively) to the full 11-kb sequence (table 1). All other single-gene trees, regardless of model, are significantly worse than the expected tree (ΔlnL/SE > 1.96 in each case). In contrast, three combinations of genes (Cytb/ND5, ND5 [third positions excluded]/ND4, and ND3/ND4/ND4L) recover the same topology as the full sequence does (table 2). The sequence comprising ND2, ND1, and COI recovers a slightly different tree topology, but it is not significantly different from the expected tree (F84, ΔlnL/SE = 1.55; HKY+I+G, ΔlnL/SE = 0.03; the armadillo joins just outside the ferungulate clade, with a monophyletic rather than paraphyletic ungulate clade). Three combinations of gene sequences (COI/COII, ND3/ND4L/COII/COIII, and ND1/ND2) fail to recover the expected tree, yielding significantly worse topologies than the expected tree (table 2; ΔlnL/SE > 1.96 in each case). However, by accommodating rate heterogeneity, it can be seen that the ND3/ND4L/COII/COIII and ND1/ND2 differences are spurious (ΔlnL/SE = 1.37 and 1.64, respectively). Among the combinations of genes studied, only the COI/COII topology differs significantly from the expected topology.
Figure 2 plots branch lengths estimated from each of the seven gene combinations against the expected branch lengths estimated from the full sequence. In principle, with perfect agreement between the two sequences, all points will lie on the diagonal line, while large deviations indicate large branch length differences. For example, both the ND1/ND2/COI and the Cytb/ND5 estimates are significantly closer to the expected branch lengths than are those of ND4/ND5 (table 3; ΔlnL/SE = 5.70 and 5.18, respectively). Although the ND1/ND2/COI tree topology differs slightly from the expected tree, the branch length deviations are so small that the overall fit (F84, ΔlnL/SE = 3.01; HKY+I+G, ΔlnL/SE = 5.70) is better than that for any other gene tree. Similarly, the ND3/ND4L/ND4 tree is significantly better than the topologically identical ND4/ND5 tree (ΔlnL/SE = 4.40), even though ND3/ND4L/ND4 is shorter by 555 bp. Although very important, sequence length alone is not responsible for poor branch length estimation or for large likelihood deviations from the expected branch lengths. Regardless of length, some sequences provide a better fit than others do.
Discussion |
---|
Regardless of the model (HKY or HKY+I+G), the Cytb/ND5 and ND1/ND2/COI trees fit the phylogenetic information in the full 11.2-kb sequence significantly better than do those of any other sequence tested. In particular, ND1/ND2/COI recovered a tree with the smallest overall log-likelihood deviation from the expected tree (table 2), suggesting that these three genes might be especially useful for resolving other mammalian phylogenies with post-Eocene or later divergence times. This is when the many modern eutherian groups radiated (Radinsky 1982; Prothero 1994), and the depth and rapidity of the divergences render them difficult to detect with fossil data (Simpson 1945) or with short sequences (Waits 1996; Dragoo and Honeycutt 1997). It is for this time frame, roughly the past 50 Myr, that our results are probably most useful.
We have less confidence that any of these sequences should be used to resolve ancient divergences (see also Graybeal 1994). Indeed, it is doubtful that any combination of mtDNA genes will be adequate, since even the entire 11.2-kb sequence failed to fully resolve the relationships among ferungulate orders. Given the continuing controversy regarding relationships among the mammalian orders (Arnason, Gullberg, and Janke 1997; Sullivan and Swofford 1997; Saccone, Reyes, and Pesole 1998; Waddell, Okada, and Hasegawa 1999) and the relative lack of discrimination of the entire genome for these evolutionary depths, it is likely that resolution will require addition of nuclear sequences (Graybeal 1994), as well as denser taxon sampling at the interordinal level (Graybeal 1998; Poe 1998).
On the other hand, for subordinal divergences, ML trees from the full sequence and from some multiple-gene sequences contain all of the expected topological relationships regardless of the model. This may be because ML techniques are relatively robust to violations of model assumptions (Felsenstein 1993; Swofford et al. 1996) or because the effects of adding rate parameters are smaller for shorter branches (Yang 1996). Our preliminary results (unpublished data) suggest that with the inclusion of rate parameters, branch length estimates for divergences more recent than 20 MYA are nearly identical to those estimated with the rate homogeneity model and are about 20% longer for divergences occurring about 50 MYA. Only for divergences occurring during the late Cretaceous to early Tertiary (just 2 of 34 branches) are branch lengths dramatically underestimated by the rate homogeneity model. With rate parameters included, these two branches are twice as long as those estimated without the extra parameters.
The cost of the additional parameters is reduced power to distinguish alternative trees from the ML tree, as well as increased CI size on the branch lengths (unpublished data). However, it is important to ensure that the simple model of sequence evolution used in analysis is not biased in favor of an incorrect tree. Our preliminary site likelihood comparisons and skew tests (unpublished data) suggest that characters that distinguish one topology from another using the F84/HKY model are simply spurious differences corrected by inclusion of across-sites rate heterogeneity parameters. Similarly, Sullivan and Swofford (1997) identified a larger set of alternative trees than did others using the same data (e.g., Saccone, Reyes, and Pesole 1998), possibly because of reduced power, but more likely because their more realistic model of sequence evolution discarded spurious differences. Regardless, the lack of power certainly does not warrant using the wrong model, as some imply (Saccone, Reyes, and Pesole 1998). Such a practice merely provides increased confidence in the wrong tree. Even if a larger set of alternative trees is obtained, the extra parameters render the branch length estimates less biased and therefore more useful for correctly inferring evolutionary history. The next logical step, now underway, is to identify the set of parameters that balance optimal branch length estimation against loss of power for phylogenies dating from the mid-Cenozoic to Recent.
Acknowledgements |
---|
Footnotes |
---|
1 Keywords: maximum likelihood, mammalian phylogeny, complete mitochondrial genome, branch length estimates, mtDNA protein-coding genes.
2 Address for correspondence and reprints: Patrice Showers Corneli, Department of Biology, University of Utah, Salt Lake City, Utah 84112. E-mail: corneli{at}biology.utah.edu
Literature Cited |
---|
Alroy, J. 1999. The fossil record of North American mammals: evidence for a Paleocene evolutionary radiation. Syst. Biol. 48:107–118.
Anderson, S., M. H. de Bruijn, A. R. Coulson, I. C. Eperon, F. Sanger, and I. G. Young. 1982. Complete sequence of the mammalian mitochondrial genome. J. Mol. Biol. 156:683–716.
Arnason, U., and A. Gullberg. 1993. Comparison between the complete mtDNA sequence of the blue and fin whale, two species that can hybridize in nature. J. Mol. Evol. 37:312–322.
Arnason, U., A. Gullberg, and A. Janke. 1997. Phylogenetic analyses of mitochondrial DNA suggest a sister group relationship between Xenarthra (Edentata) and ferungulates. Mol. Biol. Evol. 14:762–768.
Arnason, U., A. Gullberg, A. Janke, and X. Xiufeng. 1996. Pattern and timing of evolutionary divergences among Hominoids based on analysis of complete mtDNAs. J. Mol. Evol. 43:650–661.
Arnason, U., A. Gullberg, E. Johnsson, and C. Ledje. 1993. The nucleotide sequence of the mitochondrial DNA molecule of the grey seal, Halichoerus grypus, and a comparison with mitochondrial sequences of other true seals. J. Mol. Evol. 37:323–330.
Arnason, U., A. Gullberg, and B. Widegren. 1991. The complete nucleotide sequence of the mitochondrial DNA of the fin whale, Balaenoptera physalus. J. Mol. Evol. 33:556–568.
Arnason, U., and E. Johnsson. 1992. The complete mitochondrial sequence of the harbor seal, Phoca vitulina. J. Mol. Evol. 34:493–505.
Arnason, U., X. Xu, and A. Gullberg. 1996. Comparison between the complete mitochondrial DNA sequence of Homo and the common chimpanzee based on nonchimeric sequence. J. Mol. Evol. 42:145–152.
Cao, Y., J. Adachi, A. Janke, S. Pääbo, and M. Hasegawa. 1994. Phylogenetic relationships among Eutherian orders estimated from inferred sequences of mitochondrial proteins: instability of a tree based on a single gene. J. Mol. Evol. 39:519–527.
Cao, Y., A. Janke, P. J. Waddell, M. Westerman, O. Takenaka, S. Murata, N. Okada, S. Pääbo, and M. Hasegawa. 1998. Conflict among individual mitochondrial proteins in resolving the phylogeny of eutherian orders. J. Mol. Evol. 47:307–322.
Dragoo, J. W., and R. L. Honeycutt. 1997. Systematics of mustelid-like carnivores. J. Mammal. 78:426–443.
Felsenstein, J. 1981. Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17:368–376.
Felsenstein, J. 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783–791.
Felsenstein, J. 1988. Phylogenies from molecular sequences: inferences and reliability. Annu. Rev. Genet. 22:521–565.
Felsenstein, J. 1993. PHYLIP (phylogenetic inference package). Version 3.5c. Distributed by the author, Department of Genetics, University of Washington, Seattle.
Flynn, J. M., and H. Galiano. 1982. Phylogeny of early Tertiary Carnivora with a description of a new species of Protictis from the Middle Eocene of northwestern Wyoming. Am. Mus. Novit. 2725:1–64.
Gadeleta, G., G. Pepe, G. DeCandia, C. Quadgliariello, E. Sbisa, and C. Saccone. 1989. The complete mitochondrial genome: cryptic signals revealed by comparative analysis between vertebrates. J. Mol. Evol. 28:497–516.
GCG. 1997. Wisconsin package. Version 9.1. Genetics Computer Group, Madison, Wis.
Graybeal, A. 1994. Evaluating the phylogenetic utility of genes: a search for genes informative about deep divergences among vertebrates. Syst. Biol. 43:174–193.
Graybeal, A. 1998. Is it better to add taxa or characters to a difficult phylogenetic problem? Syst. Biol. 47:9–17.
Hendy, M. D., and D. Penny. 1993. Spectral analysis of phylogenetic data. J. Classif. 10:5–24.
Hillis, D. M., J. J. Bull, M. E. White, M. R. Badgett, and I. J. Molineux. 1992. Experimental phylogenetics: generation of a known phylogeny. Science 255(5044):589–591.
Horai, S., K. Hayasaka, R. Kondo, K. Tsigane, and N. Takahata. 1995. Recent African origin of the modern humans revealed by complete sequences of hominoid mitochondrial DNAs. Proc. Natl. Acad. Sci. USA 92:532–536.
Horai, S., Y. Satta, K. Hayasaka, R. Kondo, T. Inoue, T. Ishida, S. Hayashi, and N. Takahata. 1992. Man's place in Hominoidea revealed by mitochondrial DNA genealogy. J. Mol. Evol. 35:32–43.
Huelsenbeck, J. P. 1997. Is the Felsenstein Zone a flytrap? Syst. Biol. 46:69–74.
Janke, A. 1997. The complete mitochondrial genome of the wallaroo (Macropus robustus) and the phylogenetic relationship among Monotremata, Marsupialia, and Eutheria. Proc. Natl. Acad. Sci. USA 94:1276–1281.
Kishino, H., and M. Hasegawa. 1989. Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. J. Mol. Evol. 29:170–179.
Kretteck, A., A. Gullberg, and U. Arnason. 1995. Sequence analysis of the complete mitochondrial DNA molecule of the hedgehog, Erinaceus europaeus, and the phylogenetic position of the Lipotyphla. J. Mol. Evol. 41:952–957.
Lopez, J. V., S. Cevario, and S. J. O'Brien. 1996. Complete nucleotide sequence of the domestic cat (Felis catus) mitochondrial genome and a transposed mtDNA tandem repeat (Numt) in the nuclear genome. Genomics 33:229–246.
Maddison, W. P., and D. R. Maddison. 1993. MacClade 3.0. Sinauer, Sunderland, Mass.
Martens, P. A., and D. A. Clayton. 1979. Mechanism of mitochondrial DNA replication of the light-stranded origin of replication. J. Mol. Evol. 135:327–351.
Naylor, G. J. P., and W. M. Brown. 1997. Structural biology and phylogenetic estimation. Nature 388:527–528.
Naylor, G. J. P., and W. M. Brown. 1998. Amphioxus mitochondrial DNA, chordate phylogeny, and the limits of inference based on comparisons of sequences. Syst. Biol. 47:61–76.
Novacek, M. J. 1990. Morphology, paleontology, and the higher clades of mammals. Pp. 507–543 in H. H. Genoways, ed. Current mammalogy. Vol. 2. Plenum Press, New York.
Penny, D., and M. Hasegawa. 1997. The platypus put in its place. Nature 387:549–550.
Penny, D. L., M. Hasegawa, P. J. Waddell, and M. D. Hendy. 1999. Mammalian evolution: timing and implications from using the LogDeterminant transform for proteins of differing amino acid composition. Syst. Biol. 48:76–93.
Poe, S. 1998. Sensitivity of phylogeny estimation to taxon sampling. Syst. Biol. 47:18–31.
Posada, D., and K. A. Crandall. 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14:817–818.
Prothero, D. R. 1994. The Eocene-Oligocene transition: paradise lost. Columbia University Press, New York.
Radinsky, L. B. 1982. Evolution of skull shape in carnivores 3. The origin and early radiation of the modern carnivore families. Paleobiology 8:177–195.
Saccone, C., A. Reyes, and G. Pesole. 1998. Complete mitochondrial DNA sequence of the fat dormouse, Glis glis: further evidence of rodent paraphyly. Mol. Biol. Evol. 15:499–505.
Simon, C., F. Frati, A. Beckenbach, B. Crespi, H. Liu, and P. Flook. 1994. Evolution, weighting, and phylogenetic utility of mitochondrial gene sequences and a compilation of conserved polymerase chain reaction primers. Ann. Entomol. Soc. Am. 87:651–701.
Simpson, G. G. 1945. The principles of classification and a classification of the mammals. Bull. Am. Mus. Nat. Hist. 85:1–350.
Sullivan, J., and D. L. Swofford. 1997. Are guinea pigs rodents? The importance of adequate models in molecular phylogenetics. J. Mamm. Evol. 4:77–86.
Swofford, D. L. 1999. PAUP*. Phylogenetic analysis using parsimony (*and other methods). Version 4b2. Sinauer, Sunderland, Mass.
Swofford, D. L., G. J. Olsen, P. J. Waddell, and D. M. Hillis. 1996. Phylogenetic inference. Pp. 407–514 in D. M. Hillis, C. Moritz, and B. K. Mable, eds. Molecular systematics. Sinauer, Sunderland, Mass.
Waddell, P. J., Y. Cao, J. Hauf, and M. Hasegawa. 1999. Using novel phylogenetic methods to evaluate mammalian mtDNA, including amino acid-invariant sites-log det plus site stripping, to detect internal conflicts in the data, with special reference to the positions of hedgehog, armadillo, and elephant. Syst. Biol. 48:31–53.
Waddell, P. J., N. Okada, and M. Hasegawa. 1999. Towards resolving the interordinal relationships of placental mammals. Syst. Biol. 48:1–5.
Waits, L. 1996. A comprehensive molecular study of the evolution and genetic variation of bears. Ph.D. thesis, University of Utah, Salt Lake City.
Xu, S., A. Janke, and U. Arnason. 1996. The complete mitochondrial DNA sequence of the greater Indian rhinoceros, Rhinoceros unicornis, and the phylogenetic relationship among Carnivora, Perissodactyla and Artiodactyla (+Cetacea). Mol. Biol. Evol. 13:1167–1173.
Xu, X., and U. Arnason. 1994. The complete mitochondrial DNA sequence of the horse, Equus caballus: extensive heteroplasmy of the control region. Gene 148:357–362.
Xu, X., and U. Arnason. 1996. A complete sequence of the mitochondrial genome of the western lowland gorilla. Mol. Biol. Evol. 13:691–698.
Xu, X., and U. Arnason. 1997. The complete mitochondrial DNA sequence of the white rhinoceros, Ceratotherium simum, and comparison with the mtDNA sequence of the Indian rhinoceros, Rhinoceros unicornis. Mol. Phylogenet. Evol. 7:189–194.
Yang, Z. 1996. Among site variation and its impact on phylogenetic analysis. Trends Ecol. Evol. 11(9):367–372.
Zardoya, R., and A. Meyer. 1996. Phylogenetic performance of mitochondrial protein-coding genes in resolving relationships among vertebrates. Mol. Biol. Evol. 13:933–942.