* National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland
Equipes Méthodes et Algorithmes pour la Bioinformatique, LIRMM, Montpellier, France
Correspondence: E-mail: gascuel{at}lirmm.fr.
![]() |
Abstract |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
Key Words: minimum evolution least-squares distance-based phylogenetic inference consistency method comparison using simulations
![]() |
Introduction |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
The current work is divided into two parts: an investigation of the theoretical underpinnings of the balanced minimum evolution (BME) approach, and a discussion of extensive simulations comparing the BME approach to three popular distance methods. First, we demonstrate that the balanced minimum evolution branch lengths represent, in fact, a special type of weighted least-squares tree fitting, where the variances for each leaf-to-leaf distance estimate are assumed to be exponentially related to the topological distance in the tree between the pair of leaves. Next, we demonstrate that this approach is consistent: as distance estimates converge to true evolutionary distances, the FASTME tree converges to the true tree. Our proof is modeled on the proof of Rzhetsky and Nei (1993) that demonstrated consistency for a minimum evolution approach when branch lengths were assigned by ordinary least-squares fitting. Next, we also note a feature of FASTME trees: whereas many distance algorithms produce branches with confusing negative branch lengths, FASTME only produces positive branch lengths.
The second major section of the paper is an expanded simulation over a generalized model of tree topology selection, branch length assignment, and DNA evolution. We used the Aldous (1996) model for random tree topology selection, a generalization of the two most common random distributions on tree topologies: the uniform distribution and the biologically relevant Yule-Harding (Yule 1925; Harding 1971) distribution. We created a departure from the molecular clock using another random factor, and evolved 600 base-pair DNA sequences for each tree topology, using a covarion model analogous to Galtier (2001) to determine the evolutionary rate changes of the sites. The resulting 5,000 data sets cover a wide variety of tree topologies, model parameters, and evolutionary conditions. We used FASTME, NJ, WEIGHBOR, and PAUP's heuristic weighted least-squares topology search to determine a tree for each data set. Our simulations demonstrate the superiority of FASTME at estimating the original topology, as measured by counting the number of branches in the output tree that correspond to branches of the model tree.
![]() |
Theoretical Foundation of the BME Approach |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
Using this notation, we observe that D = SL, and the branch lengths are estimated by minimizing the difference between the observation and D. The ordinary least-squares (OLS) approach involves selecting branch lengths
minimizing the squared Euclidean fit between
and D, i.e., (D -
)t(D -
). This yields
= (StS)-1St
. However, this approach implicitly assumes that each estimate
ij has the same variance and is independent, a supposition not generally true because large distances are much more variable than short distances, and because the sequences in question share a common evolutionary history. To address this problem, Fitch and Margoliash (1967), Felsenstein (1997), and others have proposed using a weighted least-squares approachi.e., minimizing (D -
)tV -1(D -
), where V is the diagonal matrix containing the variances of the
ij estimates. This approach yields
= (StV -1S)-1StV -1
. The weighted least-squares (WLS) approach accounts for the variable variances of the distance estimates, but not for their dependencies. The generalized least-squares approach (Bulmer 1991; Susko 2003) uses the full variance-covariance matrix V, and then accounts for dependencies, but it is rarely used because the covariances of the distance estimates are usually poorly known, and because this approach requires a lot of computing time.
In the minimum-evolution framework, we select the topology with the shortest estimated length. The tree length of T is equal to the sum of its branch lengths, and it may be written as l(T ) = 1tL, where 1 is a vector of 1's. The standard WLS tree length estimate is then equal to:
|
Pauplin (2000) followed another approach, modifying the OLS analytical formulae. Consider the two possible branch configurations in figure 1.
|
|
|
|
Finally, we demonstrated (Desper and Gascuel 2002) that in the BME framework, computing the tree length, as expressed by equations 2, 3, and 4, is unnecessary for tree inference. Indeed, our tree building and swapping algorithms only exploit the difference in tree length corresponding to a "nearest neighbor interchange" (NNI). Assume that T is the tree of figure 1a, and that T' is obtained from T by exchanging subtrees B and C. We then have:
|
From Balanced Tree Length to Minimum Variance Tree Length Estimation
Fitch and Margoliash (1967) assumed that the variances of the ij estimates are proportional to
, the default option in both the FITCH (Felsenstein 1997) and PAUP* (Swofford 1996) programs. Another common approximation (e.g., Gascuel 1997b) is simply to set the variance of
ij to be proportional to
ij. Better approximations have been found (Nei, Stephens, and Saitou 1985; Nei and Jin 1989; Bulmer 1991), basically indicating that the variance grows exponentially as a function of
ij. WEIGHBOR (Bruno, Socci, and Halpern 2000), for example, uses this latter approximation.
We assume here that the variance of ij is proportional to 2
, where pij is the topological distance between i and j. In other words, we have:
|
Theorem 1: Let T be the tree being studied and use the notation above defined. Assuming that the variances of the ij estimates are defined by equation 6, and assuming the WLS framework (the covariances are null), the balanced tree length estimation of T in equation 4 is then: (1) the minimum variance tree length estimator of T; (2) identical to the length defined by equation (1).
The proof of Theorem 1 is given in Appendix 2. The first part of this theorem implies that, assuming equation 6, the tree length given by BME is as reliable as possible. Because we select the shortest tree, reliability in tree length estimation is of great importance and tends to minimize the probability of selecting a wrong tree. Moreover, it is well known in statistics that rough variance values such as our approximation (eq. 6) are usually sufficient. We thus expect that BME computations will provide quite reliable branch and tree length estimates.
The second part of Theorem 1 indicates that FASTME should be close to FITCH and PAUP*, which use equation 1 to define the tree length, and, to a lesser extent, to WEIGHBOR, which is based on a different tree-building strategy but is also a WLS approach. However, because not all these programs use the same approximations for the variances nor the same criteria and algorithms, some differences between them can still be expected.
Consistency of the BME Principle of Phylogenetic Inference
Statistical consistency is a central issue in phylogenetic inference. In the case of distance-based methods, it is defined as follows: Let T be the correct tree, D the associated tree distance matrix, and the matrix of estimated distances. Assuming that
is a consistent estimate of D, the more data we have (e.g., the longer the sequences used to estimate the pairwise distances), the closer
is to D. Statistical consistency of tree inference then means that T is obtained with certainty as soon as
is sufficiently close to D. In other words, assuming that the model used to estimate the pairwise distance matrix is satisfied, the more data we have, the higher the probability to recover the correct tree. This property is essential and has been discussed at length in the past (e.g., Felsenstein 1978). Consistent methods contrast with inconsistent ones (e.g., parsimony in some cases), which may converge toward a wrong tree when the amount of data increases.
The ordinary least-squares version of the minimum evolution principle was shown to be consistent by Rzhetsky and Nei (1993), a result generalized by Denis and Gascuel (2003). However, as explained in the previous section, BME is a weighted least-squares version of the minimum evolution principle, and it was demonstrated (Gascuel, Bryant, and Denis 2001) that in some cases, depending on the variance matrix, this version might be inconsistent. So our aim in this section is to verify the consistency of BME.
A direct consequence of statistical consistency is that when = D the correct tree T has the shortest length among all possible tree topologies. This shortest length property is necessary for consistency, but also sufficient. Indeed, the length associated with a tree topology relative to a distance matrix is a continuous function of this matrix. Therefore, when
is sufficiently close to D, the estimated tree lengths relative to
and to D become close, and T becomes the shortest tree for
as it already is for D. When this occurs, T is then inferred with certainty from
. To this end, we prove the following theorem:
Theorem 2: Let T be the correct tree and D the corresponding distance matrix, and assume that the matrix of distance estimates is equal to D. Let W be any tree topology, and define its length estimate
(W) by equation 4 or, equivalently, by combining equations 1 and 6. Then,
(W) >
(T) = l(T), whenever W
T.
Theorem 2 demonstrates the consistency of BME. The equality (T) = l(T) simply results from the consistency of equation 1 or equation 4 in estimating the tree length. The rest of the proof (
(W) > l(T)) is given in Appendix 3, and it follows a line similar to Rzhetsky and Nei's (1993) for the OLS version of ME. A difference between the two proofs is that ours is very closely tied to our balanced nearest-neighbor interchange (BNNI) algorithm: we show that when W
T we can apply to W a nearest-neighbor interchange (NNI) that makes its length shorter. This seems to indicate that our simple BNNI algorithm is itself consistent, as confirmed by numerous computer simulations (not shown), but a complete formal proof remains to be done. See Appendix 3 for more details and technical comments.
Branch Length Positivity
Neighbor-Joining (Saitou and Nei 1987) and related algorithms often output trees with negative branch length estimates, which are biologically meaningless (Swofford 1996) and have to be interpreted as irresolutions (in the case of internal branches). In contrast, the FITCH and PAUP* weighted least-squares algorithms impose positivity (as the default option), but this constraint often yields branches with zero length, which also correspond to irresolutions. Moreover, when the positivity constraint is removed, the topological accuracy of these algorithms tends to be lower (Kuhner and Felsenstein 1994).
Our BNNI algorithm is used in FASTME software to improve a starting tree by performing NNIs, based on equation 5, until no more NNI decreases the length of the current tree. The following theorem indicates that, after running BNNI, all branches in the output tree have positive length. Thus BNNI trees tend to be better resolved than NJ or FITCH trees, and this fact partially explains the good performance of FASTME, as we shall see in the simulation section.
Theorem 3: Let be any evolutionary distance matrix, and suppose T is a tree that is a local minimum for a BNNI topology search. Moreover, assume that the branch lengths in T are obtained from equation 2 or 3. Then, the length estimate of every branch in T is positive.
The proof is in Appendix 4. We have assumed that is a distance; i.e., it satisfies the triangle inequality. This should be the case for any practical data set, except when some sites are unknown for some of the sequences, or correspond to gaps. Then, depending on the distance computation options, the matrix
might (slightly) violate the triangle inequality and FASTME might output trees with negative external branches. In any case the internal branches are always positive.
![]() |
Simulation Results |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
Random Tree Generation
The Yule-Harding branching process (Yule 1925; Harding 1971) is a standard, biologically relevant method for generating phylogenetic trees. However, the uniform distribution on phylogenies is another natural approach for generating topologies when comparing tree inference methods. It has been argued (Gascuel 2000, Nakhleh et al. 2001) that method performance could vary depending on the tree-generation scheme. We thus chose the Aldous (1996) beta-splitting model, which generalizes both of the aforementioned distributions. In this model, topology generation is directed by a parameter denoted as ß: ß = -1.5 corresponds to the uniform distribution, whereas ß = 0 defines the Yule-Harding distribution. In our experiments ß was uniformly drawn from the interval [-1.5, 0] for each new tree, which was then generated using this value. One advantage of this approach is that it provides a much larger variety of trees than solely using either a Yule-Harding or a uniform distribution, as the Yule-Harding distribution diverges strongly from the uniform distribution when considering trees with 100 taxa (Gascuel 2000). Notably, Yule-Harding trees have a moderate topological diameter (about 22 branches in average), while uniform tree diameter is much larger (about 38 branches in average).
After topology generation each branch was assigned a length. We first used the standard coalescent model (Kuhner and Felsenstein 1994) to assign branch lengths, yielding a molecular clock on the tree. We then perturbed this molecular clock by multiplying every branch length (independently) by (1 + X ), where X was an exponential variable with parameter . The factor (1 + X ) was used (as opposed to, say, X ) to avoid an excessive number of very small branches. The parameter
was identical within each tree, but it varied from tree to tree. It was selected as
= 0.3/(0.01 + U), where U was uniformly drawn from the interval [0, 1]. If U = 0,
becomes very large and then X
0; i.e., the tree remains close to the molecular clock. If U = 1, the variance of X is large, and the tree tends to clearly depart from a molecular clock. The observed departure from the molecular clock, as measured by the ratio between the longest and shortest root-to-leaf lineages, was in the range [1.4, 6.0] (fig. 4), with a median value of 3.1 (where 1.0 represents the perfect molecular clock). Finally, trees were rescaled so that their total length would be uniform between 0.5 and 8.0. The maximum pairwise divergence was then in the range [0.1, 1.1] (fig. 5), with a median value of 0.55.
|
|
We used nucleotide sequences with 600 sites, evolving under a model analogous to Galtier's (2001). Four evolutionary rates were considered, defined by a discrete gamma distribution (Yang 1994) with parameter 1.0, which corresponds to moderate rate heterogeneity. The rate of each site was drawn at the root of each tree with equal probability, and then changed at each new speciation event (independently) with probability as one proceeded from root to each leaf. When a rate changed, it changed to each of the other three rates with equal probability. The parameter
was identical within each tree, but it differed from one tree to another. Its value was uniformly drawn from the interval [0, 1/98], where the number 98 was chosen to correspond to the number of speciation events (other than the root) in a tree with 100 taxa. When
= 0, the data set corresponds to the standard four-rate model of Yang (1994), with a gamma parameter equal to 1.0. When
= 1/98, the expected number of rate changes per site is equal to 1.0; this means that some sites would witness no rate changes, while other sites (among the 600 in the simulation) could witness up to 56 rate changes in the whole tree.
Once site rates have been determined as explained above, making branches shorter or longer depending on the site considered, every site evolved under the standard Kimura (1981) two-parameter model with a transition/transversion ratio of 2.0. The number of substitutions that effectively occurred along every branch was stored; we obtained this way the "observed" tree, whose topology was identical to that of the true tree, but whose branch lengths equaled the actual number of substitutions along each branch, divided by the sequence length.
Finally, a distance matrix was computed from the sequences using the Nei and Jin (1989) estimate for gamma distributed rates with parameter 1.0. When = 0 this estimate is almost unbiased (we only neglect the sparseness of rates); but when
0, the covarion effect is not taken into account by the distance estimation (and no distance correction allows for this). Thus, we can measure the robustness of each method to model violation: as the value of
grows, the violation of the model grows worse, leading to increasing problems with distance estimation, and (we expect) less accuracy with any tree-reconstruction method.
Topological Error Measures
To quantify the topological gap between the inferred tree and the correct tree, and also to compare tree inference methods, most authors use the Robinson and Foulds (1981) topological distance (RF). Let T be the observed tree, the inferred tree, S (X,
) the set of internal branches of tree X whose lengths are greater than or equal to
, n the number of taxa, and s the sequence length. In the context of comparing tree topologies, the Type I error is the number of inferred branches that do not belong to the correct tree, and the Type II error is the number of branches in the correct tree that are missing from the inferred tree. The standard RF distance between T and
is the sum of the Type I and Type II errors, which are denoted and computed as follows:
|
However, branches that are not supported by any substitution in observed tree T cannot be recovered except by chance. We define the irresolution of T, I(T), to be the number of such branches. We should then have approximately:
|
Consider an internal branch e in . If the length estimate of e is small or negative,
does not indicate any substitution on e, and e should then be considered to be an irresolution. The difference with T is that branch lengths in
are not restricted to sparse values of the form m/s, where m is an integer. So, we have to fix a threshold and the simplest choice, analogous to interval estimation in statistics, is to decide that branches with length less than 1/2s are not supported by any substitution, whereas longer branches are likely to have undergone one or more substitutions. The irresolution of
, I(
), is defined to be the number of non-supported branches. We cannot exclude the possibility that branches with length less than 1/2s are true branches (any threshold is imperfect), but we should still have, to some extent, an approximation of the form:
|
|
![]() |
Results |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
Table 1 summarizes the results for the entire data set of 5,000 trees, using the error measures defined above. The irresolution of the observed tree, I(T), which corresponds to the number of branches not supported by any substitution, is quite high (25.15) and explains almost half of the RF topological error. The Type I error (E1) is always more than the Type II error (E2), indicating that a more severe threshold could be used to discard branches in the inferred tree. This phenomenon is particularly sensitive for FASTME (E1 = 18.64, E2 = 9.25), which is explained by the fact that this program provides trees better resolved than other methods we considered. However, regarding all error measures FASTME is best. Although the WEIGHBOR trees are approximately as good as the FASTME trees when comparing by E1, FASTME has a clear advantage over all other algorithms with respect to E2.
|
The first three parameters control the topology and the branch lengths of the tree, and the fourth controls the substitution process. We sorted the 5,000 data sets with respect to each of the four variables. We used this sorted list to create nine subsets of the data set for each parameter: For i = 0, ... , 8, we considered the subset of the data defined by those data sets whose parameter values lay ordinally in the interval of the form [500i + 500i + 1,000]. Figures 36 show the error rates E1 and E2 for each algorithm over each data subset.
|
|
In figure 4, we see the relationship between the departure from the molecular clock and the error measures. As the data sets diverge from a molecular clock, reconstruction of the observed tree becomes more difficult. However the difference remains slight, at about 3.0 branches for E1 and 1.5 branches for E2 between two extreme subintervals for all methods.
Figure 5 shows the relationship between the topological error measures and the maximum pairwise divergence. Unsurprisingly, the error increases with divergence, because of saturation. This parameter is very sensitive, the difference between extreme subintervals being about 10 for all methods, considering both E1 and E2.
Figure 6 shows the values of the error measures for each interval when the data sets are sorted according to the parameter of the covarion model. Quite surprisingly, the presence of a high covarion effect (the expected number of rate changes per site is equal to 1.0) has almost no influence on topological accuracy. The difference between extreme subintervals is below 1.5 branches for all methods and for both E1 and E2, even when the general tendency is, as expected, that the trees with a large covarion effect are (slightly) more difficult to recover than those without such an effect. This finding is quite reassuring given the fact that (DNA, RNA, or protein) sites were certainly subjected to different evolutionary pressures in different parts of the Tree of Life, and thus observed changes in the substitution rate during the course of evolution. According to our results, distance-based reconstructions then seem to be robust regarding this phenomenon.
In all of figures 36, we see that the BME algorithm outperforms the WLS approach, which outperforms the NJ algorithm. This ordering is true for all of the intervals looked at, over all of the parameters selected, and for both types of error. The WEIGHBOR algorithm is superior to WLS by E1, and in some intervals it approaches FASTME; but with respect to E2, WEIGHBOR is worse than WLS and FASTME.
It must be underscored that the good topological accuracy of FASTME does not correspond to an increase in computing times relative to the other algorithms, but rather a decrease. Indeed, the decrease is quite sharp in comparison to all of the other algorithms except for NJ. Using a two-way 2.2 GHz DELL PE2650 running Linux 2.4 to handle 1,000 (as commonly used in bootstrap studies) data sets with 100 taxa and 600 sites (as those above described), FASTME requires 34.2 s, NJ requires 2 min, WEIGHBOR requires 451 min, and (PAUP*) WLS requires 560 min. DNADIST from the PHYLIP package (Felsenstein 1989) requires 44 min to compute the distance matrices. (For more comparisons, see Desper and Gascuel 2002.)
![]() |
Discussion |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
The trees produced by the FASTME program, using BNNI postprocessing, make strides towards all of these goals. Earlier work has demonstrated that the running time for FASTME far outstrips that of the leading traditional algorithms, even those of the NJ family. The current simulations show that BME trees are more accurate than even traditional WLS trees. The finding that the BME principle is actually a novel form of the WLS approach, with biologically realistic weights, may explain this advantage. It is unsurprising that the FASTME approach to tree reconstruction is consistent, but the discovery that FASTME will never output negative branch lengths represents another advantage over popular and fast methods such as NJ, BIONJ, and WEIGHBOR.
![]() |
Appendix 1 |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
|
|
|
![]() |
Appendix 2 |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
|
|
|
Each row of St corresponds to a branch of T. Letting e be any branch of T and [i, j] be the path from i to j in T, the constraints may then be written as:
|
|
The balanced tree length estimation of T is defined by fij = 2, and it has been demonstrated (Semple and Steel 2003) that this formula consistently estimates the length of T. Using the property of equation 9, this implies that the fijs so defined satisfy the second set of constraints. Now, assuming equation 6, we have 2vij fij = 4k, and the first set of equations becomes:
|
Let us now turn our attention to part (2). We demonstrate that coefficients from equation (1) also satisfy the linear system (equation 12) and are then identical to BME coefficients. First, it is well known that equation 1 consistently estimates the tree length; corresponding coefficients then satisfy the second set of constraints, as expressed by:
|
|
|
Given part (1), part (2) of Theorem 1 seems natural. Indeed, equation 1 defines the minimum variance branch length estimators. But those estimators are non-independent, and it is not trivial that the minimum tree length variance estimator is equal to the sum of minimum variance branch length estimators. Moreover, it is easily seen from above equations that this result extends to any diagonal V matrix.
![]() |
Appendix 3 |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
We first introduce some more notation and definitions. Removing any branch in T induces a split (or bipartition) of the taxon set, which is denoted as X|Y, where X and Y are the two induced subsets. The set of splits induced by T is denoted as S(T). Letting X|Y be any split, we define the split metric DX|Y = () by:
= 1 when i
j and |{i, j}
X| = 1, and
= 0 otherwise. DX|Y is the metric that is obtained from T by having all branch lengths equal to zero, except for the branch corresponding to X|Y, which has length 1. Let D be the metric corresponding to T and l(X|Y) be the length in T of the branch inducing X|Y. It is easily seen (Bandelt and Dress 1992) that:
|
|
|
We color the leaves of W according to X|Y: let X be colored white and Y be colored black. Following the proof of Rzhetsky and Nei, we shall change the tree topology W to a tree topology W' that splits X from Y via a series of topological transformations, W = W0 W1
...Wt = W'. Each transformation Wi
Wi+1 (1) merges two disjoint monochromatic clusters into one (or four into two), and (2) decreases the estimated length of the corresponding tree. Because of (1), the number of clusters decreases until we have one branch with all the black leaves on one side and all the white ones on the other side; i.e., the corresponding topology splits X from Y and has length 1 (see above). Because of (2), we have a guarantee that the estimated length of W is larger than 1.
There are two types of transformations that we shall use as we move from W to W'. For consistency with Rzhetsky and Nei (1993), we shall refer to them as transformations of Type I (fig. 7) and Type II (fig. 8), respectively. It is easily seen that any black-and-white leaf coloring of any binary tree either splits the black-and-white leaves with one branch, or contains a Type I or Type II configuration.
|
|
|
Now consider the Type II transformation. Because the second step corresponds to two Type I transformation (swapping B1 and A2, and then C and A2), we need only to show that the first step detailed in figure 8 decreases tree size. Without loss of generality, we assume that C is not monochromatically black (otherwise, we could perform a Type I transformation at this juncture). Let pC be the distance from C to either of the white subtrees; i.e., pC = =
. Because C is not monochromatically black, we note that pC < 1. Consider Wi and
(fig. 8) and apply equation 5 to the swap between B1 and C. We obtain:
|
Equations 16 and 17 demonstrate that the lengths of the trees Wi are monotonically decreasing as i = 0, ... , t, and thus (W) >
(W'). The split X|Y was chosen arbitrarily from S(T) - S(W), and thus Inequality (15) holds for all of the splits in this set. This concludes the proof of Theorem 2. It has to be noted that only NNIs are used in this proof; thus, we have also demonstrated the consistency of our BNNI algorithm when restricted to split metrics.
![]() |
Appendix 4 |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
|
A similar argument regarding the swap of B and D proves that:
|
It follows from the two above inequalities and from equation 2 that (e) is positive for any internal branch. Now let e be an external branch and consider equation 8. The inner bracket is positive as long as
satisfies the triangle inequality, i.e., is a distance. This concludes the proof of Theorem 3.
![]() |
Acknowledgements |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
![]() |
Footnotes |
---|
![]() |
Literature Cited |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
Aldous, D. 1996. Probability distributions of cladograms. Pp. 118 in D. Aldous and R. Pemantle, eds. Random Discrete Structures, Springer-Verlag, New York.
Bandelt, H.-J., and A. W. M. Dress. 1992. A canonical decomposition theory for metrics on a finite set. Adv. Math. 92:47-105.[ISI]
Bruno, W. M., N. D. Socci, and A. L. Halpern. 2000. Weighted Neighbor Joining: a likelihood-based approach to distance-based phylogeny reconstruction. Mol. Biol. Evol. 17:189-197.
Bryant, D., and P. Waddell. 1998. Rapid evaluation of least-squares and minimum-evolution criteria on phylogenetic trees. Mol. Biol. Evol. 15:1346-1359.
Bulmer, M. 1991. Use of the method of generalized least squares in reconstructing phylogenies from sequence data. Mol. Biol. Evol. 8:868-883.
Denis, F., and O. Gascuel. 2003. On the consistency of the minimum evolution principle of phylogenetic inference. Discrete Appl. Math. 127:63-77.[CrossRef][ISI]
Desper, R., and O. Gascuel. 2002. Fast and accurate phylogeny reconstruction algorithms based on the minimum evolution principle. J. Comp. Biol. 9:687-705.[CrossRef][ISI]
Felsenstein, J. 1978. Cases in which parsimony and compatibility methods will be positively misleading. Syst. Zool. 27:401-410.[ISI]
Felsenstein, J. 1989. PHYLIPPhylogeny Inference Package (Version 3.2). Cladistics 5:164-166.
Felsenstein, J. 1997. An alternating least-squares approach to inferring phylogenies from pairwise distances. Syst. Biol. 46:101-111.[ISI][Medline]
Fitch, W. M. 1971. Rate of change of concomitantly variable codons. J. Mol. Evol. 1:84-96.[Medline]
Fitch, W. M., and E. Margoliash. 1967. Construction of phylogenetic trees. Science 155:279-284.[ISI][Medline]
Galtier, N. 2001. Maximum-likelihood phylogenetic analysis under a covarion-like model. Mol. Biol. Evol. 18:866-873.
Gascuel, O. 1997a. BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Mol. Biol. Evol. 14:685-695.[Abstract]
Gascuel, O. 1997b. Concerning the NJ algorithm and its unweighted version, UNJ. Pp. 149170 in B. Mirkin, F. R. McMorris, F. S. Roberts, and A. Rzhetsky, eds. Mathematical hierarchies. American Mathematical Society, Providence, R. I.
Gascuel, O. 2000. Evidence for a relationship between algorithmic scheme and shape of inferred trees. Pp. 157168 in W. Gaul, O. Opitz, and M. Schader, eds. Data analysis, scientific Modeling and practical applications. Springer-Verlag, Berlin.
Gascuel, O., D. Bryant, and F. Denis. 2001. Strengths and limitations of the minimum evolution principle. Syst. Biol. 50:621-627.[CrossRef][ISI][Medline]
Harding, E. 1971. The probabilities of rooted tree-shapes generated by random bifurcation. Adv. Appl. Probab. 3:44-77.
Huelsenbeck J. P. 2002. Testing a covariotide model of DNA substitution. Mol. Biol. Evol. 19:698-707.
Kimura, M. 1981. Estimation of evolutionary distances between homologous nucleotide sequences. Proc. Natl. Acad. Sci. USA 78:454-458.[Abstract]
Kuhner M. K., and J. Felsenstein. 1994. A simulation comparison of phylogeny algorithms under equal and unequal evolutionary rates. Mol. Biol. Evol. 11:459-468.[Abstract]
Kumar, S. 1996. A stepwise algorithm for finding minimum evolution trees. Mol. Biol. Evol. 13:584-593.[Abstract]
Lopez, P., D. Casane, and H. Philippe. 2002. Heterotachy, an important process of protein evolution. Mol. Biol. Evol. 19:1-7.
Nakhleh L., U. Roshan, K. St John, J. Sun, and T. Warnow. 2001. Designing fast converging phylogenetic methods. Bioinformatics 17:(Suppl 1): 190-198.
Nei, M., and L. Jin. 1989. Variances of the average numbers of nucleotide substitutions within and between populations. Mol. Biol. Evol. 6:290-300.[Abstract]
Nei, M., J. C. Stephens, and N. Saitou. 1985. Methods for computing the standard errors branching points in an evolutionary tree and their application to molecular data from humans and apes. Mol. Biol. Evol. 2:66-85.[Abstract]
Pauplin, Y. 2000. Direct calculation of a tree length using a distance matrix. J. Mol. Evol. 51:41-47.[ISI][Medline]
Robinson, D., and L. Foulds. 1981. Comparison of phylogenetic trees. Math. Biosci. 53:131-147.[CrossRef][ISI]
Rzhetsky, A., and M. Nei. 1993. Theoretical foundation of the minimum-evolution method of phylogenetic inference. Mol. Biol. Evol. 10:1073-1095.[Abstract]
Saitou, N., and M. Nei. 1987. The Neighbor-Joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:1073-1095.
Semple, C., and M. Steel. 2003. Cyclic permutations and evolutionary trees. Adv. Appl. Math. In Press.
Susko, E. 2003. Confidence regions and hypothesis tests for topologies using generalized least squares. Mol. Biol. Evol. 20:862-868.
Swofford, D. 1996. PAUPphylogenetic analysis using parsimony (and other methods), Version 4.0. Sinauer Associates, Sunderland, Mass.
Vach, W. 1989. Least squares approximation of additive trees. Pp. 230238 in O. Opitz, ed. Conceptual and numerical analysis of data. Springer-Verlag, Berlin.
Yang Z. 1994. Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods. J. Mol. Evol. 39:306-314.[ISI][Medline]
Yule, G. 1925. A mathematical theory of evolution, based on the conclusions of Dr. J.C. Willis. Philos. Trans. R. Soc. London Ser. B, Biol. Sci. 213:21-87.