Uninformative Characters and Apparent Conflict Between Molecules and Morphology

Michael S. Y. Lee

Department of Zoology, University of Queensland, Brisbane, Queensland, Australia

Incongruence between different data sets remains one of the central issues in systematics. Numerous tree-based and character-based tests to identify incongruence have been developed, and each has strengths and weaknesses (for reviews, see Cunningham 1997a; Johnson and Soltis 1998). Currently, one of the most widely used methods for evaluating incongruence within a parsimony framework is the homogeneity test of Farris et al. (1995), usually termed the incongruence length difference (ILD) test (Cunningham 1997a; but see Johnson and Soltis 1998) or the partition homogeneity test (Swofford 2000). The test has been argued to produce more accurate results than other tests (Cunningham 1997a) and is also easy to implement using PAUP* (Swofford 2000).

The ILD test evaluates whether the level of incongruence between two or more data sets is greater than that expected by chance alone, given the level of incongruence within each data set. Briefly, it is performed as follows (Farris et al. 1995; Johnson and Soltis 1998). Separate parsimony analyses of the original data sets are undertaken, and the tree lengths are summed; this value captures the total within-data-set incongruence. Then, a simultaneous parsimony analysis of all data sets is performed and the resultant tree length obtained; this value captures the total incongruence (both within and between data sets) and is always equal to or greater than the sum of the separate tree lengths. The difference between the two values represents the extra incongruence generated by combination of data sets, i.e., the between-data-sets incongruence. Next, the combined data set is randomly partitioned into divisions of equal size to the original data sets, and the amount of between-data-sets incongruence between these random partitions is obtained using the method just described. The randomization procedure is repeated numerous times (at least 100) to generate a distribution for the expected level of between-data-sets incongruence. If the between-data-sets incongruence in only 5% of the randomized partitions exceeds that in the real data sets, it can be concluded that the observed incongruence is statistically significant at P = 0.05. PAUP* outputs the sum of tree lengths rather than the between-data-sets incongruence for all randomizations. However, the between-data-sets incongruence can be readily obtained by subtracting the outputted sum of tree lengths from the total tree length of the combined analysis.
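The subtraction described above can be sketched in a few lines of Python. This is only an illustration: the function name and the numbers in the usage lines are hypothetical, and counting ties (replicates whose incongruence equals the observed value) as at least as extreme is an assumption, since the exact convention depends on the implementation. Note that the combined tree length is the same for every replicate, because the pooled character set never changes.

```python
def ild_from_sums(combined_length, observed_sum, replicate_sums):
    """Convert sums of separate tree lengths into ILD values and a P value.

    combined_length: tree length from the simultaneous analysis
    observed_sum:    summed tree lengths of the real partitions
    replicate_sums:  summed tree lengths of each randomized partitioning
    """
    observed = combined_length - observed_sum
    null = [combined_length - s for s in replicate_sums]
    # Assumption: ties count as "at least as incongruent" as the real data.
    p_value = sum(d >= observed for d in null) / len(null)
    return observed, null, p_value


# Hypothetical numbers, for illustration only.
observed, null, p_value = ild_from_sums(30, 26, [30, 28, 26, 30, 24])
print(observed, null, p_value)  # 4 [0, 2, 4, 0, 6] 0.4
```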

Cunningham (1997b, p. 466) advocated removal of invariant characters before applying the ILD test, especially "when the original data partitions differ in the percentage of variable characters." However, there was no further discussion of this matter in the paper, nor in a related paper in which the same statement was made (Cunningham 1997a). Also, there was no mention of whether autapomorphic characters should be included or excluded. Some other studies have excluded invariant, but not autapomorphic, characters (Shaffer, Meylan, and McKnight 1997; Carranza et al. 2000; Mitchell, Mitter, and Regier 2000), and a few have excluded both types (Bremer and Struwe 1992; Bruneau, Dickson, and Knapp 1995). However, none of these studies discussed the specific effects of such deletions on the performance of the ILD test. Currently, many studies employing the test delete neither invariant nor autapomorphic characters (e.g., Wayne et al. 1997; Messenger and McGuire 1998; Montgelard, Ducrocq, and Douzery 1998). Here, it is demonstrated that unless all such uninformative (i.e., both invariant and autapomorphic) characters are excluded, the ILD test may overestimate the amount of incongruence, especially when morphological and molecular data sets are compared. In this letter, a hypothetical example is presented to demonstrate how uninformative characters can cause the test to overestimate the significance of the observed incongruence. Then, some empirical studies similar to the hypothetical example are reevaluated to demonstrate this bias in real data.

Consider data sets 1 and 2 (table 1), which deal with three taxa (A, B, C) plus an outgroup (X). The data sets each consist of 10 characters, all phylogenetically informative. Data set 1 consists of seven characters supporting AB and three supporting BC, while data set 2 consists of three characters supporting AB and seven supporting BC. An intuitive assessment suggests that, given the extensive conflict within each data set, the between-data-sets conflict is not very significant, and the ILD test confirms this prediction (10,000 replicates, P = 0.175). However, if 40 uninformative characters are added to data set 2, so that this data set consists of only 20% informative characters (10/50), the ILD test now suggests that the observed incongruence is significant (P = 0.033). If 90 uninformative characters are added to data set 2, so that the data set consists of only 10% informative characters (10/100), the observed incongruence becomes even more significant (P = 0.007) (fig. 1A). If further uninformative characters are added to data set 2, the significance level becomes greater still. Identical results are obtained whether the additional uninformative characters are autapomorphies of A or B or C, invariant characters, or a mixture of all four types. However, if equal numbers of uninformative characters are added to each data set, the ILD test continues to return insignificant results in all cases, although the P values drop marginally: with only informative characters, P = 0.175 (as before); with 40 uninformative characters added to each data set, P = 0.121; and with 90 uninformative characters added to each data set, P = 0.118 (fig. 1B).
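The hypothetical example is small enough to simulate directly. The following Python sketch is not the PAUP* implementation; it is a simplified stand-in resting on several assumptions: each character is represented only by the grouping it supports ("AB", "BC", or "AC") or by "U" for uninformative; an informative character costs one step on the tree containing its group and two steps on the other two trees; uninformative characters are treated as invariant (zero steps), which does not alter the length difference because autapomorphies would add the same number of steps to the separate and combined analyses; and ties are counted as at least as extreme. All function names are hypothetical. Because the randomization is stochastic, the P values it prints should be close to, but not identical with, those reported above.

```python
import random
from collections import Counter


def min_length(chars):
    """Length of the shortest of the three possible trees for taxa A, B, C plus outgroup."""
    counts = Counter(c for c in chars if c != "U")
    n = sum(counts.values())
    if n == 0:
        return 0
    # The best tree is the one supported by the most characters:
    # those cost 1 step each, all other informative characters cost 2.
    return 2 * n - max(counts.values())


def ild(part1, part2):
    """Between-data-sets incongruence: combined length minus summed separate lengths."""
    return min_length(part1 + part2) - (min_length(part1) + min_length(part2))


def ild_test(part1, part2, replicates=1000, seed=0):
    """Return the observed ILD and the proportion of random partitions reaching it."""
    rng = random.Random(seed)
    observed = ild(part1, part2)
    pooled = part1 + part2
    hits = 0
    for _ in range(replicates):
        rng.shuffle(pooled)
        r1, r2 = pooled[:len(part1)], pooled[len(part1):]
        if ild(r1, r2) >= observed:  # assumption: ties count as extreme
            hits += 1
    return observed, hits / replicates


set1 = ["AB"] * 7 + ["BC"] * 3
set2 = ["AB"] * 3 + ["BC"] * 7
for extra in (0, 40, 90):
    observed, p = ild_test(set1, set2 + ["U"] * extra, replicates=2000)
    print(f"{extra} uninformative characters added to data set 2: ILD = {observed}, P = {p:.3f}")
```

Adding the same `["U"] * extra` list to both `set1` and `set2` gives the corresponding sketch of the equal-addition case.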


Table 1 Hypothetical Character-by-Taxon Matrix, Consisting of Three Taxa (A, B, C) and an Outgroup (X) and Two Data Sets of 10 Characters Each

 


Fig. 1.—A, Levels of between-data-sets incongruence in randomized partitions for the matrix in table 1, with 0 (dashed line), 40 (plain line), and 90 (bold line) uninformative characters added to data set 2. The incongruence between the real data partitions is shown by the vertical line. B, Levels of between-data-sets incongruence in randomized partitions for the matrix in table 1, with 0 (dashed line), 40 (plain line), and 90 (bold line) uninformative characters added to each data set. The incongruence between the real data partitions is shown by the vertical line. C, Levels of between-data-sets incongruence in randomized partitions for the frog data set (Cannatella et al. 1998) in table 2. The dashed line shows the results when uninformative characters are excluded, and the solid line shows the results when all characters are included. The incongruence between the real data partitions is shown by the vertical line. D, The same variables for Abronia lizards (Chippindale, Ammerman, and Campbell 1998). E, The same variables for seabirds (Paterson, Wallis, and Gray 1995). F, The same variables for Meroles lizards (Harris, Arnold, and Thomas 1998).

 
The reason for these results is as follows. The randomization procedure of the ILD test pools all of the characters from both data sets and then randomly assigns them to partitions of equal size to the original data sets. In the above example where 90 uninformative characters are added to data set 2 only, pooling data sets 1 (10 informative characters) and 2 (10 informative and 90 uninformative characters) would yield a combined data set of 110 characters, 20 (18%) of which would be informative. The ILD test would then randomly divide this combined data into two partitions of 10 and 100 characters. In each randomization, there would thus be an average of 1.8 informative characters in the smaller partition and an average of 18.2 in the larger partition. This contrasts with the original division, which had 10 informative characters in each data set. The phylogenetic signal in the randomized data partitions is therefore not representative of the phylogenetic signals in the original data sets. The amount of phylogenetic signal in the smaller data set has been reduced to almost zero, while the amount of phylogenetic signal in the larger data set is almost doubled. The result of this distortion is that the average between-data-sets incongruence in the randomized partitions will be unrealistically low. Since the smaller of the two randomized partitions will usually contain virtually no phylogenetic signal, it will usually take very few extra steps to reconcile it with the larger data set. In this example, for instance, the observed between-data-sets incongruence is four steps (it takes four extra steps to reconcile the two real data sets in a combined analysis). However, the smaller of the randomized data sets contains (on average) only 1.8 informative characters; thus, the amount of between-data-sets incongruence will usually be two steps or fewer. 
The randomization procedure unrealistically assigns most of the informative characters to the larger data partition, and consequently most of the incongruence gets transferred to within that data set. The null distribution of between-data-sets incongruence inferred from the randomized data partitions is therefore unrealistically low, and the moderate incongruence in the real data sets therefore emerges as "highly significant." This trend becomes more pronounced as more and more uninformative characters are added to one data set (fig. 1A). If the proportions of uninformative characters are similar in both data sets, however, this bias should not occur. The proportions of uninformative characters in the pooled data set and thus in each randomized data set remain very similar to those in both real data sets (fig. 1B).
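The arithmetic behind this explanation can be checked without simulation. The Python sketch below is an illustration under stated assumptions, not part of the original analysis: it computes the expected number of informative characters per randomized partition and enumerates the exact null distribution for the hypothetical example. It relies on the observation that, with only AB and BC characters and three possible trees, the between-data-sets incongruence of a random partition equals max(a, b) + max(A - a, B - b) - max(A, B), where a and b are the numbers of AB and BC characters falling in the smaller partition and A and B are the pooled totals; uninformative characters are treated as adding no length difference, and ties count as extreme. Function names are hypothetical. With no uninformative characters, the exact tail probability is about 0.179, close to the P = 0.175 obtained from 10,000 replicates.

```python
from math import comb


def expected_informative(partition_size, n_informative, n_total):
    """Mean number of informative characters drawn into a random partition."""
    return partition_size * n_informative / n_total


def exact_ild_null(n_ab, n_bc, n_uninformative, small_size):
    """Exact null distribution {ILD: probability} for the four-taxon example."""
    total = n_ab + n_bc + n_uninformative
    denom = comb(total, small_size)
    best_combined = max(n_ab, n_bc)
    dist = {}
    for a in range(min(n_ab, small_size) + 1):
        for b in range(min(n_bc, small_size - a) + 1):
            u = small_size - a - b  # uninformative characters in the small partition
            if u > n_uninformative:
                continue
            ways = comb(n_ab, a) * comb(n_bc, b) * comb(n_uninformative, u)
            ild = max(a, b) + max(n_ab - a, n_bc - b) - best_combined
            dist[ild] = dist.get(ild, 0.0) + ways / denom
    return dist


def tail(dist, observed):
    """Probability that a random partition is at least as incongruent as observed."""
    return sum(p for d, p in dist.items() if d >= observed)


print(round(expected_informative(10, 20, 110), 1))   # 1.8
print(round(expected_informative(100, 20, 110), 1))  # 18.2
for extra in (0, 40, 90):
    print(extra, round(tail(exact_ild_null(10, 10, extra, 10), 4), 3))
```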

To correct for the bias when uninformative characters are unequally distributed, they should be deleted (Cunningham [1997a, 1997b] suggested that invariant characters should be deleted in this situation, but the suggestion should extend to autapomorphies as well). In the example discussed above (10 informative characters in data set 1 and 10 informative plus 90 uninformative characters in data set 2), this would leave 10 characters (all informative) in each data set and thus a pooled data set of 20 characters (all informative). The random partitions of the ILD test would then repeatedly divide the data into two data sets of 10 informative characters, exactly matching the number of informative characters in the original data sets. The distribution of incongruence levels generated by these partitions would then more accurately reflect the null distribution expected in the original data sets: given that the randomized partitions each contain 10 informative characters, it is likely that the conflict between these partitions will often be of four characters or more. Thus, the "expected," or null, distribution of between-data-sets incongruence is quite high. To be deemed significantly higher than expected, the observed levels must be even higher.
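Deleting uninformative characters requires a rule for recognizing them. A minimal Python sketch follows, assuming the usual parsimony criterion (a character is informative if at least two states each occur in at least two taxa), a matrix stored as a dictionary of equal-length state strings, and "?" and "-" as the codes for missing data and gaps. The function names and the four-character matrix are hypothetical, and polymorphic or ambiguity codes are not handled.

```python
from collections import Counter

MISSING = {"?", "-"}  # assumed codes for missing data and gaps


def is_parsimony_informative(column):
    """True if at least two states each occur in at least two taxa."""
    counts = Counter(state for state in column if state not in MISSING)
    shared_states = [state for state, n in counts.items() if n >= 2]
    return len(shared_states) >= 2


def informative_only(matrix):
    """Return the matrix restricted to informative characters, plus the kept indices."""
    taxa = list(matrix)
    n_chars = len(matrix[taxa[0]])
    keep = [i for i in range(n_chars)
            if is_parsimony_informative([matrix[t][i] for t in taxa])]
    reduced = {t: "".join(matrix[t][i] for i in keep) for t in taxa}
    return reduced, keep


# Hypothetical matrix: character 1 supports AB, character 2 is invariant,
# character 3 is an autapomorphy of A, character 4 supports BC.
matrix = {"X": "0000", "A": "1010", "B": "1001", "C": "0001"}
reduced, keep = informative_only(matrix)
print(keep)     # [0, 3]
print(reduced)  # {'X': '00', 'A': '10', 'B': '11', 'C': '01'}
```

Applying such a filter to every partition before pooling leaves only informative characters to be shuffled, which is the correction proposed above.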

The root of the above problem is, of course, that informative characters are much more common in the smaller real data set than in the larger one but are assigned in equal proportion (on average) to the randomized partitions. This situation commonly occurs in combined morphological and molecular studies, in which the morphological data sets are often largely, or entirely, composed of informative characters, and the molecular data sets consist largely of uninformative characters (Cunningham 1997b). The conditions in the hypothetical example (the smaller data set containing 100% informative characters, the larger data set containing only 20% or 10% informative characters) are quite representative of real morphological and molecular data sets (table 2). The problem should not occur if the proportions of informative and uninformative characters are similar in both of the original data sets, for instance, in a combined analysis of multiple molecular data sets which all contain similar (low) proportions of informative characters, or a combined analysis of morphological and behavioral data sets which all contain mostly informative characters.


Table 2 The Effects of Uninformative Characters on the ILD Test for Incongruence

 
In combined morphological and molecular analyses, therefore, ILD tests that include all characters are likely to overestimate the significance of the observed incongruence. To test this hypothesis, 13 such data sets for tetrapods were reanalyzed (table 2). Data sets were obtained from journal web sites or directly from the authors; for the iguanian and ratite data sets, new matrices were used which incorporated minor corrections to the published versions (R. Macey, personal communication; J. Cracraft, personal communication). In all data sets, the morphological data contained a much higher percentage of informative characters than the molecular data (table 2). For simplicity and consistency, all morphological and molecular characters were treated as unordered and assigned equal weight (regardless of how they were treated in the original analyses); however, unalignable positions were excluded as discussed in the original studies. Two ILD tests were performed for each data set, one with all characters and one with only informative characters. The implementation of the test in PAUP* (partition homogeneity test; Swofford 2000) was used with 1,000 replicates, each employing either a branch-and-bound search or a heuristic search with at least 10 random-addition sequences. The results are shown in table 2 and figure 1.

In seven of these studies (salamanders, frogs, iguanians, phrynosomatids, Abronia, cetaceans, and suiforms), the effects were substantial and in the predicted direction: when uninformative characters were deleted, the inferred significance of the incongruence decreased (i.e., P values went up). For the Abronia data, the result changed from highly significant (P = 0.006) to insignificant (P = 0.070). In three of the six remaining studies (ratites, canids, and primates), deletion of the uninformative characters superficially appeared to have a negligible effect on the results of the ILD test, since incongruence remained highly significant. For all of these data sets, the incongruence in the permuted data partitions never reached that in the real data sets in any of the replicates, for both the all-characters and the informative-only analyses. However, the maximum amount of incongruence generated in the permuted data partitions was always greater in the informative-only analyses and thus always approached the real levels more closely in these analyses. This suggests that, if sufficient replicates were performed, the incongruence would prove less significant (i.e., P values would be higher) in the informative-only analyses than in the analyses which included all characters, although both would remain highly significant. For ratites, when all characters were included, the maximum incongruence in 100,000 randomizations was nine steps less than the real incongruence, but when uninformative characters were excluded, this difference dropped to four steps. For canids, the corresponding numbers were 10 and 6 (1,000 replicates), and for primates, they were 10 and 8 (1,000 replicates). In all cases, the change was in the predicted direction: when uninformative characters were excluded, the randomized levels of incongruence tended to be greater, implying that the observed levels were less significant (fig. 1C–E).
The remaining three studies, however, did not conform to the prediction. For iguanines and oplurids, exclusion of uninformative characters did not change the significance of the ILD test. For Meroles, the change was in the opposite direction of that expected (fig. 1F): with uninformative characters deleted, the randomized levels of incongruence tended to be slightly lower, and the significance of the ILD test actually increased (albeit slightly). Overall, however, the results largely conformed to predictions: of the 13 data sets analyzed, 10 showed the predicted effects, two showed no effects, and only one showed the opposite effect. Although deletion of uninformative characters changed the result of the ILD test from significant to insignificant in only one instance (Abronia), this might reflect the fact that most of the examples considered here were either very significant or not even close to significant when all characters were used.

The hypothetical examples above suggest that uninformative characters will artificially inflate the significance reported by the ILD test if they are unequally distributed among data sets, as in most combined morphological and molecular analyses. The empirical examples confirm that this is usually the case, although in one data set the opposite trend was observed, and the reasons for this exception should be further investigated (fig. 1F). Deletion of uninformative characters before employing the ILD test will ensure that each randomized data partition contains the same number of phylogenetically informative characters (not just the same number of characters) as the corresponding real data set. The incongruence between these randomized data partitions will then be a better estimate of the null level of incongruence expected between the real data sets.

Acknowledgements

This research was supported by a senior (Queen Elizabeth II) research fellowship from the Australian Research Council. I thank Christine Lambkin, Andrew Hugall, David Yeates, and two anonymous referees for helpful comments, and Tod Reeder, Anne Yoder, Adrian Paterson, Joel Cracraft, James Schulte II, Robert Macey, Jack Sites, Tom Titus, Paul Chippindale, James Harris, Nicholas Arnold, and Claudine Montgelard for generously providing information and/or data matrices.

Footnotes

Manolo Gouy, Reviewing Editor

1 Keywords: incongruence, character conflict, incongruence length difference test, partition homogeneity test

2 Address for correspondence and reprints: Michael S. Y. Lee, Department of Zoology, University of Queensland, Saint Lucia, Brisbane QLD 4072, Australia. mlee@zoology.uq.edu.au

Literature Cited

    Bremer, B., and L. Struwe. 1992. Phylogeny of the Rubiaceae and the Loganiaceae: congruence or conflict between morphological and molecular data. Am. J. Bot. 79:1171–1184

    Bruneau, A., E. E. Dickson, and S. Knapp. 1995. Congruence of chloroplast DNA restriction site characters with morphological and isozyme data in Solanum sect. Lasiocarpa. Can. J. Bot. 73:1151–1167

    Cannatella, D. C., D. M. Hillis, P. T. Chippindale, L. Weigt, A. S. Rand, and M. J. Ryan. 1998. Phylogeny of frogs of the Physalaemus pustulosus species group with an examination of data incongruence. Syst. Biol. 47:311–335

    Carranza, S., E. N. Arnold, J. A. Mateo, and L. F. Lopez-Jurado. 2000. Long distance colonization and radiation in gekkonid lizards, Tarentola (Reptilia: Gekkonidae), revealed by mitochondrial DNA sequences. Proc. R. Soc. Lond. B Biol. Sci. 267:637–649

    Chippindale, P. T., L. K. Ammerman, and J. A. Campbell. 1998. Molecular approaches to phylogeny of Abronia (Anguidae: Gerrhonotinae), with emphasis on relationships in subgenus Auriculabronia. Copeia 1998:883–892

    Cunningham, C. W. 1997a. Can three incongruence tests predict when data should be combined? Mol. Biol. Evol. 14:733–740

    ———. 1997b. Is congruence between data partitions a reliable predictor of phylogenetic accuracy? Empirically testing an iterative procedure for choosing among phylogenetic methods. Syst. Biol. 46:464–478

    Farris, J. S., M. Källersjö, A. G. Kluge, and C. Bult. 1995. Testing significance of incongruence. Cladistics 10:315–319

    Harris, D. J., E. N. Arnold, and R. H. Thomas. 1998. Rapid speciation, morphological evolution, and adaptation to extreme environments in South African sand lizards (Meroles) as revealed by mitochondrial gene sequences. Mol. Phylogenet. Evol. 10:37–48

    Johnson, L. A., and D. E. Soltis. 1998. Assessing congruence: empirical examples from molecular data. Pp. 297–348 in D. E. Soltis, P. S. Soltis, and J. J. Doyle, eds. Molecular systematics of plants 2: DNA sequencing. Kluwer, Boston

    Lee, K., J. Feinstein, and J. Cracraft. 1997. The phylogeny of ratite birds: resolving conflicts between molecular and morphological data sets. Pp. 173–211 in D. Mindell, ed. Avian molecular evolution and systematics. Academic Press, San Diego

    Messenger, S. L., and J. A. McGuire. 1998. Morphology, molecules and the phylogenetics of cetaceans. Syst. Biol. 47:90–124

    Mitchell, A., C. Mitter, and J. C. Regier. 2000. More taxa or more characters revisited: combining data from nuclear protein-encoding genes for phylogenetic analyses of Noctuoidea (Insecta: Lepidoptera). Syst. Biol. 49:202–224

    Montgelard, C., S. Ducrocq, and E. Douzery. 1998. What is a Suiforme (Artiodactyla)? Mol. Phylogenet. Evol. 9:528–532

    Paterson, A. M., G. P. Wallis, and R. D. Gray. 1995. Penguins, petrels and parsimony: does cladistic analysis of behaviour reflect seabird phylogeny? Evolution 49:974–989

    Reeder, T. W., and J. J. Wiens. 1996. Evolution of the lizard family Phrynosomatidae as inferred from diverse types of data. Herpetol. Monogr. 10:43–84

    Schulte, J. A. II, J. R. Macey, A. Larson, and T. J. Papenfuss. 1998. Molecular tests of phylogenetic taxonomies: a general procedure and example using four subfamilies of the lizard family Iguanidae. Mol. Phylogenet. Evol. 10:367–376

    Shaffer, H. B., P. Meylan, and M. L. McKnight. 1997. Tests of turtle phylogeny: molecular, morphological, and paleontological approaches. Syst. Biol. 46:235–268

    Sites, J. W. Jr., S. K. Davis, T. Guerra, J. B. Iverson, and H. L. Snell. 1996. Character congruence and phylogenetic signal in molecular and morphological data sets: a case study in the living iguanas (Squamata, Iguanidae). Mol. Biol. Evol. 13:1087–1105

    Swofford, D. L. 2000. PAUP*. Phylogenetic analysis using parsimony (*and other methods). Version 4. Sinauer, Sunderland, Mass

    Titus, T. A., and D. R. Frost. 1996. Molecular homology assessment and phylogeny in the lizard family Opluridae (Squamata: Iguania). Mol. Phylogenet. Evol. 6:49–62

    Titus, T. A., and A. Larson. 1996. Molecular phylogenetics of desmognathine salamanders (Caudata; Plethodontidae): a reevaluation of evolution in the ecology, life history, and morphology. Syst. Biol. 45:451–472

    Wayne, R. K., E. Geffen, D. J. Girman, K. P. Koepfli, L. M. Lau, and C. R. Marshall. 1997. Molecular systematics of the Canidae. Syst. Biol. 46:622–653

    Yoder, A. D., M. Cartmill, M. Ruvolo, K. Smith, and R. Vilgalys. 1996. Ancient single origin for Malagasy primates. Proc. Natl. Acad. Sci. USA 93:5122–5126

Accepted for publication November 14, 2000.