*Max-Planck-Institut für Biologie, Abteilung Immungenetik, Corrensstrasse 42, Tübingen, Germany;
Department of Biosystems Science, The Graduate University for Advanced Studies, Hayama, Kanagawa, Japan
Abstract |
---|
Introduction |
---|
The two main causes commonly invoked to explain why different portions of the genome provide different answers regarding the phylogenetic relationships within a group of taxa are assortment of ancestral polymorphism and homoplasy. In the former case, an ancestral population of species H, C, and G may contain two alleles, a and b, at locus 1 and two other alleles, x and y, at locus 2. If, for example, at locus 1 the a allele is subsequently fixed in species G, whereas allele b is fixed in species C and H, C will be judged as the closest relative of H by the analysis of this locus. If, on the other hand, allele x at locus 2 is fixed in species C, whereas the y allele is fixed in species G and H, G rather than C will appear to be the closest relative of H. Similarly, at the nucleotide level, two sites within a single gene may yield contradictory phylogenetic information if recombination takes place between them and their polymorphism is differentially resolved among the species. The second major cause of phylogenetic ambiguity, homoplasy (i.e., independently attained similarity at a site), is commonly differentiated into parallel evolution (similarity acquired from the same ancestral condition) and evolutionary convergence (similarity attained from different ancestral conditions). Thus, for example, if a changes to b independently in G and H, whereas it remains unaltered in C, species G will appear to be more closely related to H than C, although in reality it may have diverged earlier than C from the lineage leading to H.
The extent to which ancestral polymorphism and homoplasy contribute to the obfuscation of a phylogenetic relationship is not known. In most molecular phylogenetic reconstructions, attempts are made to take homoplasy into account by correcting the observed sequence for presumed hidden substitutions with the help of one of the correction formulas available (Nei and Kumar 2000, pp. 33–50). The underlying assumptions of all these formulas are stochasticity of the evolutionary process at the molecular level and neutrality of the substitutions. The formulas differ in the extent to which they take into account various factors that may influence the stochasticity of the process, such as the ratio of transitions to transversions or the four-nucleotide content of the sequence.
Here we attempt to actually measure the extent to which ancestral polymorphism and homoplasy influence phylogenetic reconstruction. To this end, we use a large collection of primate sequences, one half of which we obtained in our Tübingen laboratory and the other half from databases. The collection was assembled for a variety of purposes, the estimate of the relative influence of ancestral polymorphism and homoplasy on phylogenetic reconstruction being one of them. The data set includes sequences of human, chimpanzee, gorilla, and orangutan, as well as representative species of Old World monkeys (OWM) and New World monkeys (NWM). It thus covers a range of divergence times extending from 5 MYA (the human-chimpanzee split; White, Suwa, and Asfaw 1998) to nearly 50 MYA (the Platyrrhini-Catarrhini split dated by Kumar and Hedges [1998] to 47.6 ± 8.3 MYA). It is this wide span of evolutionary time that allows us to use the data set for the present purpose. The expectation is that the degree to which ancestral polymorphism and homoplasy obscure phylogenetic relationships depends on the particular time frame of the evolutionary process. To understand the reason for this dependence, consider two time intervals, one encompassing the period during which three closely related species lineages diverged from one another (e.g., G from [H, C], followed by the divergence of H from C) and the second covering the period from the first divergence (i.e., G from [H, C]) to the present time. In the first case, we must take into account that the resolution of ancestral neutral polymorphisms in a population consisting of 10^5 breeding individuals may take up to 3 Myr (Takahata 1993; Takahata and Satta 1997). Hence, if the interval between the first and the second divergence was <3 Myr (as it probably was in the case of the H, C, and G lineages), then the resolution of the ancestral polymorphism can be expected to have confounded the phylogeny of the three lineages. On the other hand, if the interval between the first and the second divergences was >3 Myr (as was the case for the divergences of the OWM, NWM, and ape lineages), the resolution of the ancestral polymorphism should not have had any confounding effect. As for the interval from the first divergence to the present, the length of the divergence time determines how much homoplasy can be expected. Because homoplasy at the molecular level generally involves more than one substitution at a site and because in stochastic processes two hits at a site are more probable in a long time interval than in a short one, in the short interval, the frequency of homoplasy can be expected to be negligibly low, unless the substitution rate is very high (Takahata 1995
). Therefore, homoplasy may have confounded the phylogenetic relationship among some of the NWM, OWM, and ape genes, but it might not have influenced the phylogeny of the human, chimpanzee, and gorilla genes. The objectives of the present study were to estimate the degree of phylogenetic incompatibility for clades in which only homoplasy could be the cause, to infer the mechanisms by which homoplasy arises, and to determine the importance of homoplasy in phylogenetic reconstruction.
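The dependence of ancestral-polymorphism sorting on the length of the internode interval can be illustrated with the standard neutral-coalescent approximation for three lineages, in which a gene tree disagrees with the species tree with probability (2/3)·exp(−t/2N), where t is the interval in generations. This is a minimal sketch rather than part of the analysis reported here: the function name is hypothetical, the effective size of 10^5 is the figure mentioned above, and the generation time of 15 years is an assumption made only for illustration.

```python
import math

def discordance_probability(interval_years, n_e, generation_years):
    """P(gene tree differs from species tree) for three lineages.

    Neutral-coalescent approximation: two lineages fail to coalesce within
    the internode interval with probability exp(-t / (2 * n_e)), and two of
    the three equally likely resolutions are then discordant.
    """
    t_generations = interval_years / generation_years
    return (2.0 / 3.0) * math.exp(-t_generations / (2.0 * n_e))

# Assumed values: n_e = 1e5 (from the text), 15-year generations (assumption).
for myr in (1, 3, 6):
    p = discordance_probability(myr * 1e6, n_e=1e5, generation_years=15)
    print(f"interval {myr} Myr: P(discordant) = {p:.3f}")
```

Under these assumptions the probability falls off exponentially with the interval, which is the qualitative behavior the argument above relies on; the absolute values depend strongly on the assumed population size and generation time.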
Materials and Methods |
---|
To explain the system of partitioning, consider a site occupied by nucleotides g, c, c, a, a, and a in the OTUs H, C, G, O, M, and T, respectively. (To avoid confusion between nucleotide and OTU designations, here and subsequently we use italicized lowercase letters for the former and roman type uppercase letters for the latter.) The site contains one singleton because the nucleotide g occurs only in H and in no other OTU. The site also contains a doubleton because it is occupied by a c in both C and G but in no other OTU. Finally, the site contains one tripleton because it is occupied by an a in O, M, and T but by a different nucleotide in the remaining three OTUs. If we did not know the root of the tree, this partitioning pattern would suggest that C and G share a common ancestor, as do O, M, and T (so that C, G, and H would share a common ancestor as well). But because we know the root, we can explain the observed pattern by assuming a single mutation in the stem of the C, G, and H branches. The partitioning of sites is then taken one step further. We notice that the chosen site has a g in H but that it is occupied by two different nucleotides in the other OTUs, c in C and G but a in O, M, and T. Altogether, the site is occupied by three different nucleotides in the six OTUs, and so it is classified as a three-base site. The sole singleton is classified as a three-base singleton. Similarly, the site contains a three-base doubleton and a three-base tripleton. If the site were occupied by g, a, a, a, a, and a in H, C, G, O, M, and T, respectively, it would consist of a two-base singleton, no doubletons, and no tripletons.
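The classification just described can be sketched in a few lines. The function and variable names below are hypothetical, and the sketch assumes that each site is given as a string of six nucleotides in the OTU order H, C, G, O, M, T.

```python
from collections import defaultdict

OTUS = ("H", "C", "G", "O", "M", "T")  # OTU order used in the text

def classify_site(column, otus=OTUS):
    """Return (n_bases, partitions) for one alignment column.

    partitions maps 'singleton', 'doubleton', and 'tripleton' to the OTU
    groups sharing a nucleotide found in exactly 1, 2, or 3 OTUs.
    """
    groups = defaultdict(list)
    for otu, base in zip(otus, column):
        groups[base].append(otu)
    names = {1: "singleton", 2: "doubleton", 3: "tripleton"}
    partitions = {name: [] for name in names.values()}
    if len(groups) > 1:  # invariant sites carry no partition
        for members in groups.values():
            if len(members) in names:
                partitions[names[len(members)]].append(tuple(members))
    return len(groups), partitions

print(classify_site("gccaaa"))  # the three-base example from the text
print(classify_site("gaaaaa"))  # the two-base singleton example
```

The first call reproduces the worked example (one singleton, one doubleton, and one tripleton at a three-base site); the second yields a two-base site with a single singleton and no doubletons or tripletons.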
The classification was used to provide support for or against the arrangements of the OTUs into specific clades. Singletons are not phylogenetically informative and do not provide support for or against particular clades. Doubletons and tripletons that support the separation of OTUs into clades consistent with the consensus phylogeny (the one depicted in fig. 1 ) are called compatible. Doubletons and tripletons that support separation into clades inconsistent with the consensus phylogeny are called incompatible. A site can contain more than one doubleton or tripleton and so can be informative about more than one clade in the phylogeny. Because the number of variable sites in a data set is a function of the evolutionary rate and the total branch length of the phylogeny, the number of homoplasies is expected to increase with an increase in these two parameters. The estimated degree of homoplasy can therefore be expected to vary according to the species chosen as an out-group and the interval from the first divergence to the present time.
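One way to express the compatibility test is as a check of whether a doubleton or tripleton, or its complement, matches a clade of the consensus tree. The sketch below is an assumption-laden illustration: it takes the consensus phylogeny of figure 1 to be the ladder-like tree (((((H, C), G), O), M), T), which is implied by the out-group choices described in the text but is not reproduced here, and the names `CLADES` and `is_compatible` are hypothetical.

```python
# Clades of the assumed consensus tree (((((H, C), G), O), M), T).
CLADES = [frozenset("HC"), frozenset("HCG"), frozenset("HCGO"), frozenset("HCGOM")]
ALL_OTUS = frozenset("HCGOMT")

def is_compatible(group, clades=CLADES, all_otus=ALL_OTUS):
    """True if the OTU group, or its complement, is a clade of the tree."""
    group = frozenset(group)
    return group in clades or (all_otus - group) in clades

for group in ("HC", "CG", "HG", "OMT", "MT"):
    label = "compatible" if is_compatible(group) else "incompatible"
    print(group, label)
```

Treating a group and its complement alike is a simplification; it ignores which nucleotide is ancestral, information that a rooted analysis could use.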
Results and Discussion |
---|
To estimate the extent to which the interspecies comparisons might be influenced by either intraspecies polymorphism or by errors in sequence determination (either during PCR amplification or during sequencing), nine randomly chosen gene segments were reamplified and resequenced from all or nearly all the nonhuman OTUs. Comparison of the "old" and the "new" sequences (a total of about 30 kb in length) revealed 44 differences. The number of differences varied from gene to gene, being highest in APOA1 (12 differences in a total of 4.8 kb of sequence from four OTUs) and lowest in POMC (two differences in a total of 2.4 kb of sequence from four OTUs). The mean was 1.5 differences per kilobasepair of sequence, all the differences being singletons (i.e., they were not shared with any other sequence in the alignment). Significantly, all the incompatible sites included in the resequenced set could be confirmed.
Partitioning Analysis to Identify the Nearest Living Relative of the Human Species
In a partitioning analysis, the sites at which differences occur between the studied OTUs are considered individually in terms of their support or the lack thereof for a particular phylogeny. Initially, the differential sites are classified into singletons, doubletons, or tripletons for each of the six OTUs separately or for the various combinations of the OTUs, as described in Materials and Methods (table 2). In the next step, the 41 partitions that are theoretically possible with a set of six OTUs are further classified into two-, three-, or four-base categories (see Materials and Methods). Phylogenetically informative partitions are then identified, their disposition regarding individual phylogenies is noted, and the sites supporting a particular phylogeny are tallied. Because a larger number of phylogenies are possible for clades with a greater number of OTUs, only those partitions that informatively group one OTU of the (H, C, G), (H, C, G, O), or (H, C, G, O, M) phylogenies with the appropriate out-group (O, M, or T, respectively) are considered in estimating the extent of incompatibility for each clade.
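The figure of 41 partitions can be recovered by counting, on the assumption that it refers to all possible singletons, doubletons, and tripletons among six OTUs (6 + 15 + 20). A short check:

```python
from itertools import combinations

OTUS = "HCGOMT"

# Number of ways to choose 1, 2, or 3 OTUs out of six.
counts = {k: len(list(combinations(OTUS, k))) for k in (1, 2, 3)}
total = sum(counts.values())
print(counts, total)
```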
Altogether, 1,402 sites were found to be phylogenetically informative, and of these, approximately 90% provided information regarding the grouping of the OTUs under consideration here. The remaining 10% gave information on groupings not relevant to the study; for example, the grouping of O with M or of C, G, and M. Of the 89 sites that are informative about the H, C, and G relationship, 46 sites (52%) were found to support the (H, C) clade excluding all the other OTUs, a value similar to that found by Satta, Klein, and Takahata (2000)
. Of these, 43 sites were of the two-base type, and 3 sites were of the three-base type. Fourteen sites (11 of the two-base type and 3 of the three-base type) supported the (H, G) clade, and 27 sites (24 of the two-base and 3 of the three-base type) supported the (C, G) clade. Thus, the results of the partitioning analysis uphold the conclusion reached in an earlier study with a different data set (Satta, Klein, and Takahata 2000
), namely, that the chimpanzee is the nearest living relative of Homo sapiens. At the same time, however, the high proportion of incompatibility between the phylogenetically informative sites (with 48% of the sites supporting alternative phylogenies) indicates that sorting out of ancestral polymorphisms, homoplasy, or both have blurred the phylogenetic signals that might have otherwise indicated clearly the disjunction of the H, C, and G lineages.
Dissociation of Ancestral Polymorphism from Homoplasy
The results described in the preceding section indicate that the gorilla lineage diverged from the lineage leading to the common ancestor of human and chimpanzee before these last two species (lineages) diverged from each other. The interval between these two divergences was apparently relatively short, probably not more than 1–3 Myr, well within the range of persistence of ancestral polymorphism in a large population (Takahata 1995). To estimate to what degree homoplasy might have contributed to the blurring of phylogenetic signals during this interval, it is necessary to extend the partitioning analysis by including more distantly related lineages (O, M, T). Both the paleontological (Martin 1993) and molecular (Sarich and Wilson 1967; Sibley, Comstock, and Ahlquist 1990; Horai et al. 1992) data indicate that the orangutan lineage diverged from the lineage leading to the common ancestor of human, chimpanzee, and gorilla 12–15 MYA. Because the human and chimpanzee lineages diverged from each other 5 MYA, and the gorilla lineage diverged from the lineage of the (H, C) ancestor not more than 8 MYA (Horai et al. 1992), an interval of >4 Myr separated the divergence of the orangutan lineage from that of the (H, C, G) lineage. This interval is too long for any ancestral polymorphism (except that maintained by balancing selection; see Klein et al. 1998) to survive, and so any incompatibilities between phylogenetically informative sites in the analysis of the (H, C, G, O) phylogeny should be attributable to homoplasy.
The inclusion of the O OTU in the partitioning analysis revealed the existence of 332 sites that support the (H, C, G) clade excluding O with M as the out-group (table 3 ). Forty-five sites are inconsistent with this grouping in that they include O in the clade and exclude H (10 sites), C (14 sites), or G (21 sites). Thus, the (H, C, G) clade is supported by 88% of the informative sites, with the remaining 12% of sites supporting alternative phylogenies. The incompatibilities of the latter sites are presumably the result of homoplasy in the lineages leading to M on the one hand and to the (H, C, G, O) lineage on the other. The increase in the proportion of sites that support the standard phylogeny from 52% for the (H, C) clade to 88% for the (H, C, G) clade is presumably a reflection of the corresponding decrease in the contribution of ancestral polymorphism to the evolution of the two clades.
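The percentages quoted for the (H, C) and (H, C, G) clades follow directly from the site counts. The helper below is a hypothetical convenience function, and the counts are those given in the text.

```python
def support_share(compatible, informative):
    """Percentage of informative sites that are compatible with a clade."""
    return 100.0 * compatible / informative

# (H, C, G) clade: 332 compatible sites plus 45 that place O inside the clade.
incompatible_hcg = {"H excluded": 10, "C excluded": 14, "G excluded": 21}
n_incompatible = sum(incompatible_hcg.values())
hcg_share = support_share(332, 332 + n_incompatible)

# (H, C) clade: 46 of the 89 sites informative about H, C, and G.
hc_share = support_share(46, 89)

print(f"(H, C, G): {hcg_share:.0f}% compatible")
print(f"(H, C):    {hc_share:.0f}% compatible")
```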
Possible Reasons for the High Level of Homoplasy
The observation that 19% and 12% of the phylogenetically informative sites have undergone homoplasious substitutions during the time interval between the divergence of the Platyrrhini from the Catarrhini lineage and of the OWM from the ape lineage, respectively, is surprising and unexpected. The common perception, supported by computer simulations based on the standard models of molecular evolution (see below), is that homoplasies in intervals of these lengths are rare, on the order of a few percent at most. Even in cases in which intense selection is known or suspected to drive the substitution process (Takahata 1995
) or in which functional convergences at the molecular level are postulated (Swanson, Irwin, and Wilson 1991
; Irwin, White, and Wilson 1993
; Lawn, Schwartz, and Patthy 1997
), homoplasy is believed to be an exception rather than a rule. The following question therefore arises: What might be the reason for the high homoplasy found in the primate lineages? In what follows, we consider three possible answers to this question.
The first possibility is that the observed homoplasy is a manifestation of selection pressure for convergence in function. Many of the studied genes (ABO, RHAG, PRM2, RNASE3, SRY) are known or postulated to be under moderate-to-strong selection pressure (O'hUigin, Sato, and Klein 1997
; Zhang, Rosenberg, and Nei 1998
; Wyckoff, Wang, and Wu 2000
). Could this pressure be responsible for the high homoplasy? To test this possibility, we divided the data set into coding and noncoding subsets and carried out the partitioning analysis separately on the two subsets (table 3, parts B and C ). Of the 29,451 coding sites of the 51 loci, 3,018 (10%) were found to be variable, and of these, 545 (18%) sites are phylogenetically informative. Of the 545 sites, 27, 148, and 312 are informative about the (H, C, G), (H, C, G, O), and (H, C, G, O, M) phylogenies, respectively. Fifteen of 27 (56%) relevant informative sites support the (H, C) clade, whereas 129 of 148 (87%) sites support the (H, C, G) clade, and 255 of 312 (82%) sites are compatible with the (H, C, G, O) clade. Similarly, of the 32,921 noncoding sites, 5,424 (16%) have been found to be variable, and of these, 855 (16%) are phylogenetically informative. Of these, 62, 229, and 476 sites are informative about the (H, C, G), (H, C, G, O), and (H, C, G, O, M) phylogenies, respectively. Thirty-one of the 62 (50%) relevant informative sites support the (H, C) clade, 203 of 229 (89%) the (H, C, G) clade, and 382 of 476 (80%) the (H, C, G, O) clade. Thus, the differences in the proportion of incompatibilities between the coding and noncoding regions are small, and no strong tendency for homoplasy arising preferentially in the coding regions is apparent. Selection therefore does not appear to play a dominant part in the generation of homoplasy.
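The coding and noncoding tallies can be set side by side to make the comparison explicit. The sketch below only re-expresses the counts quoted above as percentages of incompatible sites; it performs no significance test, and the dictionary layout and function name are hypothetical.

```python
# (compatible, informative) site counts quoted in the text.
counts = {
    "coding":    {"(H,C)": (15, 27), "(H,C,G)": (129, 148), "(H,C,G,O)": (255, 312)},
    "noncoding": {"(H,C)": (31, 62), "(H,C,G)": (203, 229), "(H,C,G,O)": (382, 476)},
}

def incompatible_percent(compatible, informative):
    """Percentage of informative sites incompatible with the clade."""
    return 100.0 * (informative - compatible) / informative

for clade in ("(H,C)", "(H,C,G)", "(H,C,G,O)"):
    cod = incompatible_percent(*counts["coding"][clade])
    non = incompatible_percent(*counts["noncoding"][clade])
    print(f"{clade}: coding {cod:.0f}%, noncoding {non:.0f}%, difference {cod - non:+.1f}")
```

Pooling the two subsets gives 332 compatible sites out of 377 for the (H, C, G) clade and 637 out of 788 for the (H, C, G, O) clade, corresponding to about 12% and 19% incompatible sites, respectively.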
The second possibility is that the observed high level of homoplasy is caused by a bias in nucleotide composition. The mechanisms producing bias in equilibrium nucleotide frequencies in different genomic regions are unclear (Wolfe, Sharp, and Li 1989
). Because maintenance of compositional bias increases the probability of like substitutions, such bias could be expected to indirectly influence the extent of homoplasy at certain sites in a gene and in certain regions of a genome. This effect should be most pronounced in the third positions of codons and noncoding regions where mutational bias is the primary determinant of nucleotide composition. The absence of a significant increase in homoplasy in noncoding regions noted above therefore provides an argument against this explanation. Further evidence emerges from an examination of the nucleotide composition of the individual genes (table 4 ). Compositional bias was measured by using the method of Kornegay et al. (1993)
. It was then related to the number of variable sites, number of phylogenetically informative sites, and number of sites compatible or incompatible with the consensus phylogeny (table 4
). The measurement of incompatibility was limited to the (H, C, G, O) and (H, C, G, O, M) groups where no contribution of ancestral polymorphism should occur. Most (45 of 57 segments) of the genes in the data set showed some degree of incompatibility which ranged from 0% to 72%. The relatively high percentages of incompatibility found in some of the short genes (ZFX, ACAT2, DRD4, PROC) may be caused by stochastic effects associated with a low number of informative sites. The longer genes containing more informative sites probably provide a more reliable estimate of incompatibility. The gist of the comparison is that compositional bias in either coding or noncoding regions does not appear to have a strong effect on the degree of homoplasy found in the individual genes. Genes showing the highest levels of compositional bias often show a below average (e.g., ADBR3, MSH2) or no (e.g., ZNFN1A1, APOB) homoplasy. By contrast, genes with high levels of homoplasy may have a low (e.g., RNASE3, PROC) or moderate (e.g., PRM2, LCAT) degree of compositional bias.
Search for the Cause of High Homoplasy by Computer Simulation
Partitioning analysis excluded selection pressure, but not nucleotide composition from being responsible for the high homoplasy of the primate genes, and provided an indication that variation in the mutation rate of different sites might be a factor. To test whether a combination of the two factors, nucleotide composition bias and variability of mutation rates, might explain the data, a computer simulation was carried out. The influence of nucleotide composition was simulated by using either the four-state or the two-state model of molecular evolution. The effect of the mutation rate was assessed by letting the genes evolve with a rate of d5 = 0.05 and then again with a 10-fold higher rate of d5 = 0.5. (As mentioned earlier, the strong transition bias model is covered by the two-state model.) The first of these two values was obtained by taking the average divergence at synonymous sites of the 51 studied loci calculated from the comparison of the T sequences with the sequences of the other five OTUs (i.e., d5 = 0.09862/2 = 0.05). Thus, the combination of the two variants of each of the two factors tested four different conditions under which the genes evolved. The simulated evolutionary process was aimed at producing six OTUs related to one another in the manner depicted in figure 1
(for further details see Materials and Methods). The simulation was repeated 10,000 times for each of the four sets of conditions, and the results were summarized in a graphic form separately for the (H, C), (H, C, G), and (H, C, G, O) clades (fig. 2
; panels A, B, and C, respectively). Plotted on the abscissa of the graph are the incompatibility values or the proportions of informative sites incompatible with the particular clade in the set of artificially generated sequences. The ordinate indicates the frequency with which each particular proportion occurred in the 10,000 replicates; it can also be interpreted as the probability of obtaining a particular proportion of incompatible sites in one simulation experiment or at one locus.
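A simulation of this general kind can be sketched as follows. This is not the procedure described in Materials and Methods, whose details are not reproduced here: the node ages of 5, 8, 13, and 48 Myr are rounded from dates quoted in the text, the 30 Myr age for the OWM split is a placeholder assumption, substitutions replace a nucleotide by any other state with equal probability, and sites are pooled into one tally for the (H, C) clade instead of being summarized per replicate. All names are hypothetical.

```python
import random

# (tip, age in Myr of the node at which it leaves the lineage leading to H).
# 48, 13, and 8 follow dates quoted in the text; 30 for M is a placeholder.
LADDER = [("T", 48.0), ("M", 30.0), ("O", 13.0), ("G", 8.0)]
HC_AGE = 5.0  # human-chimpanzee split quoted in the text

def evolve(base, expected_subs, states, rng):
    """Apply substitutions along a branch with the given expected count."""
    t = rng.expovariate(1.0)
    while t < expected_subs:
        base = rng.choice([s for s in states if s != base])
        t += rng.expovariate(1.0)
    return base

def simulate_site(root_to_tip, states, rng):
    """Return {OTU: nucleotide} for one site evolved down the ladder tree."""
    per_myr = root_to_tip / LADDER[0][1]
    node_base = rng.choice(states)
    ages = [age for _, age in LADDER] + [HC_AGE]
    site = {}
    for (tip, age), next_age in zip(LADDER, ages[1:]):
        site[tip] = evolve(node_base, per_myr * age, states, rng)
        node_base = evolve(node_base, per_myr * (age - next_age), states, rng)
    for tip in ("H", "C"):
        site[tip] = evolve(node_base, per_myr * HC_AGE, states, rng)
    return site

def hc_incompatibility(n_sites, root_to_tip, states, rng):
    """Share of H/C/G doubleton sites that do not support the (H, C) clade."""
    compatible = incompatible = 0
    for _ in range(n_sites):
        site = simulate_site(root_to_tip, states, rng)
        for pair in ("HC", "HG", "CG"):
            rest = [site[o] for o in "HCGOMT" if o not in pair]
            if site[pair[0]] == site[pair[1]] and site[pair[0]] not in rest:
                if pair == "HC":
                    compatible += 1
                else:
                    incompatible += 1
    total = compatible + incompatible
    return incompatible / total if total else 0.0

rng = random.Random(1)  # fixed seed so that the run is repeatable
for states in ("acgt", "ag"):  # four-state and two-state models
    for d5 in (0.05, 0.5):     # root-to-tip divergence, as in the text
        share = hc_incompatibility(5000, d5, states, rng)
        print(f"{len(states)}-state, d5={d5}: incompatible share {share:.2f}")
```

Because the sketch contains no ancestral polymorphism, any incompatibility it produces arises from multiple substitutions at a site, which is the quantity the simulations in the text are designed to isolate.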
Higher mutation rates increase the proportion of incompatibility in each of the three clades in both the four- and two-state models. In the case of the (H, C) clade, the proportion of replicates without incompatibility falls below 10% under the four-state model and to 0% under the two-state model. Incompatibilities are distributed in varying proportions through most replicates, the variation again being the result of sampling effects in the shallow phylogeny. In the (H, C, G) clade, no replicate under either the four- or two-state model is without incompatibility. The distribution of incompatibility shows a peak at 25% under the four-state model and a broad-shouldered peak at about 35% under the two-state model. Finally, in the (H, C, G, O) clade, the peak under the four-state model moves to about 30% and that under the two-state model to an equilibrium value of 80%. In the two-state model, four incompatibilities arise for every compatibility generated.
From these observations the following conclusions can be drawn. First, the level of homoplasy is insignificant when the mutation rate is uniformly low at all the sites and when the nucleotide composition is unbiased. Second, the two-state model does not markedly affect the extent of homoplasy in comparison with the four-state model when the range of sequence divergence is low (<10%). Third, a high mutation rate greatly increases the extent of homoplasy: even for a pair of OTUs with a short divergence time, the proportion of loci with compatible sites only is reduced to <10%. The reduction takes place under both models, but it is more pronounced under the two-state than under the four-state model.
Simulation Based on a Mixed Rate Model
Taking these results into consideration and taking into account the possibility that mutation rates may vary from site to site and between different regions of a gene or genome, a mixed rate model was constructed and used in another set of computer simulations. To simulate the situation encountered with the actual data set more realistically, the number of replicates was reduced from 10,000 to 51, corresponding to the number of genes in the set. And to provide for the heterogeneity of the mutation rate, we allowed 900 of the 1,000 sites at each gene to mutate at the low (1µ) rate and the remaining 100 sites at the 10 times higher (10µ) rate. In all other respects the simulation was carried out as in the first experiment. The observed compatibility values (62%, 85%, and 85% for the (H, C), (H, C, G), and (H, C, G, O) clades, respectively) were in reasonably good agreement with the actual data.
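Why a small class of fast sites can dominate homoplasy may be seen from a simple Poisson calculation of the probability that a site is hit two or more times. The sketch below is illustrative only: it assumes Poisson-distributed substitutions and an expected 0.05 substitutions per slow site (borrowing the d5 value used earlier), and the function name is hypothetical.

```python
import math

def p_multiple_hits(expected_subs):
    """Poisson probability of two or more substitutions at one site."""
    return 1.0 - math.exp(-expected_subs) * (1.0 + expected_subs)

slow_sites, fast_sites = 900, 100  # site classes of the mixed rate model
mu_t = 0.05                        # assumed expected substitutions per slow site

slow = slow_sites * p_multiple_hits(mu_t)        # sites mutating at 1µ
fast = fast_sites * p_multiple_hits(10 * mu_t)   # sites mutating at 10µ

print(f"expected multiply-hit sites: slow class {slow:.2f}, fast class {fast:.2f}")
print(f"share contributed by the fast class: {fast / (slow + fast):.0%}")
```

Under these assumptions the 100 fast sites account for most of the expected multiple hits, even though they make up only a tenth of the sequence.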
We have thus far considered compatible or incompatible sites for a particular clade irrespective of their occurrences within or between loci. However, a locus can be incompatible with a particular clade in two different ways: in one, all the informative (either compatible or incompatible) sites at the locus are incompatible (interlocus incompatibility), and in the other, the locus contains some sites incompatible with each other (intralocus incompatibility). In the experimental data, of the 57 sequence segments (table 4 ), 34 were informative for the (H, C) clade. Of these, six loci or segments (18%) showed intralocus incompatibility and contained 21 incompatible and 20 compatible sites within these loci. The relative extent of intralocus incompatibility for the (H, C) clade (21 vs. 20) was much larger than that for the (H, C, G) clade (44 vs. 244) and for the (H, C, G, O) clade (150 vs. 562). The remaining 28 segments supported either the (H, C) (18 segments, 53%) or the (H, G) and (C, G) (10 segments, 29%, of interlocus incompatibility) grouping unambiguously. Hence, in total, 82% of segments supported a single phylogeny. This proportion reduced to 28/53 = 53% for the (H, C, G) clade and 15/56 = 27% for the (H, C, G, O) clade, each with only one segment being incompatible with either of these clades. Similarly, in terms of the numbers of interlocus incompatible versus compatible sites, there were 22 versus 26 for the (H, C) clade but 1 versus 88 for the (H, C, G) clade and 1 versus 76 for the (H, C, G, O) clade. Thus, the actual data showed that compared with the (H, C, G) and (H, C, G, O) clades, both intra- and interlocus incompatibilities were notably high for the (H, C) clade.
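The distinction between the two kinds of incompatibility can be written as a rule applied to the per-locus site counts. The function below is a simplified reading of the definitions above, in which a locus carrying both compatible and incompatible sites is treated as showing intralocus incompatibility; its name and labels are hypothetical. The segment counts for the (H, C) clade are those quoted in the text.

```python
def classify_locus(compatible, incompatible):
    """Label a locus by its compatible and incompatible site counts."""
    if compatible == 0 and incompatible == 0:
        return "uninformative"
    if compatible and incompatible:
        return "intralocus incompatibility"
    if incompatible:
        return "interlocus incompatibility"
    return "compatible"

# (H, C) clade: segment counts quoted in the text.
hc_segments = {
    "intralocus incompatibility": 6,
    "compatible": 18,
    "interlocus incompatibility": 10,
}
informative = sum(hc_segments.values())
for label, n in hc_segments.items():
    print(f"{label}: {n}/{informative} = {n / informative:.0%}")
```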
Although the simulation result was in good agreement with the observed low extent of interlocus incompatibility for the (H, C, G) and (H, C, G, O) clades, it failed to account for the observed high extent of interlocus incompatibility for the (H, C) clade. Although by using a mixed rate model it was possible to generate a high degree of intralocus incompatibility, the extent of intralocus incompatibility tended to increase because the clade included distantly related OTUs. This simulation result was again inconsistent with the observed high extent of intralocus incompatibility for the (H, C) clade and the low extent for the (H, C, G) and (H, C, G, O) clades. Thus, the simulation result suggested that the relatively high extent of intra- and interlocus incompatibility observed in the (H, C) clade cannot be accounted for by high mutation rates at particular sites or by homoplasy. It must therefore have a different cause, namely, ancestral polymorphism.
Conclusions |
---|
The inference that a category of rapidly evolving sites must exist in primate DNA has implications for phylogenetic studies. Such sites might be expected to contribute inordinately to phylogenetic information obtained on lineages diverging in rapid succession because few other sites will have undergone substitution within the short time interval. Examples of rapidly evolving sites in specific genes are known (Green et al. 1990
), but the extent of their occurrence in the genome has not been determined. In cases in which rapidly evolving sites are the major source of information about phylogenetic relationships, they may have undergone several substitutions before more slowly evolving sites could contribute to phylogenetic resolution. In such cases, a phylogeny may be built primarily on sites that show a high degree of incompatibility and is likely to be incorrect.
Acknowledgements |
---|
Footnotes |
---|
1 Present address: Department of Genetics, Trinity College, Dublin, Ireland
Abbreviations: C, chimpanzee; G, gorilla; H, human; M, macaque; NCBI, National Center for Biotechnology Information; NWM, New World monkeys; O, orangutan; OTU, operational taxonomic unit; OWM, Old World monkeys; PCR, polymerase chain reaction; T, tamarin.
Keywords: homoplasy, parallelism, convergent evolution, trichotomy, primates, polymorphism
Address for correspondence and reprints: Colm O'hUigin, Max-Planck-Institut für Biologie, Abteilung Immungenetik, Corrensstrasse 42, D-72076 Tübingen, Germany. E-mail: ohuiganc{at}tcd.ie.
References |
---|
Bailey W. J., 1993 Hominoid trichotomy: a molecular overview Evol. Anthropol 2:100-108
Green P. M., A. J. Montandon, D. R. Bentley, R. Ljung, I. M. Nilsson, F. Giannelli, 1990 The incidence and distribution of CpG→TpG transitions in the coagulation factor IX gene Nucleic Acids Res 18:3227-3231
Horai S., Y. Satta, K. Hayasaka, R. Kondo, T. Inoue, T. Ishida, S. Hayashi, N. Takahata, 1992 Man's place in hominoidea revealed by mitochondrial DNA genealogy J. Mol. Evol 35:32-43
Irwin D. M., R. T. White, A. C. Wilson, 1993 Characterization of the cow stomach lysozyme genes: repetitive DNA and concerted evolution J. Mol. Evol 37:355-366
Jukes T. H., C. R. Cantor, 1969 Evolution of protein molecules Pp. 21-132 in H. N. Munro, ed. Mammalian protein metabolism III. Academic Press, New York
Kimura M., 1980 A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences J. Mol. Evol 16:111-120
Klein J., A. Sato, S. Nagl, C. O'hUigin, 1998 Molecular trans-species polymorphism Annu. Rev. Ecol. Syst 29:1-21
Kominato Y., P. D. McNeill, M. Yamamoto, M. Russel, S.-I. Hakomori, F. Yamamoto, 1992 Animal histo-blood group ABO genes Biochem. Biophys. Res. Commun 189:154-165
Kornegay J. R., T. D. Kocher, L. A. Williams, A. C. Wilson, 1993 Pathways of lysozyme evolution inferred from the sequences of cytochrome b in birds J. Mol. Evol 37:367-379
Kriener K., C. O'hUigin, H. Tichy, J. Klein, 2000 Convergent evolution of major histocompatibility complex molecules in humans and New World monkeys Immunogenetics 51:169-178
Kumar S., S. B. Hedges, 1998 A molecular timescale for vertebrate evolution Nature 392:917-920
Lawn R. M., K. Schwartz, L. Patthy, 1997 Convergent evolution of apolipoprotein (a) in primates and hedgehog Proc. Natl. Acad. Sci. USA 94:11992-11997
Martin R. D., 1993 Primate origins: plugging the gaps Nature 363:223-234
Miyamoto M., B. F. Koop, J. L. Slightom, M. Goodman, M. Tennant, 1988 Molecular systematics of higher primates: genealogical relations and classification Proc. Natl. Acad. Sci. USA 85:7627-7631
Nei M., S. Kumar, 2000 Molecular evolution and phylogenetics Oxford University Press, Oxford
O'hUigin C., 1995 Quantifying the degree of convergence in primate Mhc-DRB genes Immunol. Rev 143:123-140
O'hUigin C., A. Sato, J. Klein, 1997 Evidence for convergent evolution of A and B blood group antigens in primates Hum. Genet 101:141-148
Rogers J., 1993 The phylogenetic relationships among Homo, Pan and Gorilla: a population genetics perspective J. Hum. Evol 25:201-215
Ruvolo M., 1997 Molecular phylogeny of the hominoids: inferences from multiple independent DNA sequence data sets Mol. Biol. Evol 14:248-265
Sarich V. M., A. C. Wilson, 1967 Rates of albumin evolution in primates Proc. Natl. Acad. Sci. USA 58:142-148
Satta Y., J. Klein, N. Takahata, 2000 DNA archives and our nearest relative: the trichotomy problem revisited Mol. Phylogenet. Evol 14:259-275
Schwartz J. H., 1987 The red ape Orang-utans & human origins. Houghton Mifflin Company, Boston
Sibley C. G., J. A. Comstock, J. E. Ahlquist, 1990 DNA hybridization evidence of hominoid phylogeny: a reanalysis of the data J. Mol. Evol 30:202-236
Swanson K. W., D. M. Irwin, A. C. Wilson, 1991 Stomach lysozyme gene of the langur monkey: tests for convergence and positive selection J. Mol. Evol 33:418-425
Takahata N., 1993 Allelic genealogy and human evolution Mol. Biol. Evol 10:2-22
Takahata N., 1995 Mhc diversity and selection Immunol. Rev 143:225-247
Takahata N., Y. Satta, 1997 Evolution of the primate lineage leading to modern humans: phylogenetic and demographic inferences from DNA sequences Proc. Natl. Acad. Sci. USA 94:4811-4815
Teumer J., H. Green, 1989 Divergent evolution of part of the involucrin gene in the hominoids: unique intragenic duplications in the gorilla and human Proc. Natl. Acad. Sci. USA 86:1283-1286
White T. D., G. Suwa, B. Asfaw, 1998 Australopithecus ramidus, a new species of early hominid from Aramis, Ethiopia Nature 371:306-312
Wolfe K. H., P. M. Sharp, W. H. Li, 1989 Mutation rates differ among regions of the mammalian genome Nature 337:283-285
Woodward E. R., A. Buchberger, S. C. Clifford, L. D. Hurst, N. A. Affara, E. R. Maher, 2000 Comparative sequence analysis of the VHL tumor suppressor gene Genomics 65:253-265
Wyckoff G. J., W. Wang, C. I. Wu, 2000 Rapid evolution of male reproductive genes in the descent of man Nature 403:304-309
Zhang J., H. Rosenberg, M. Nei, 1998 Positive Darwinian selection after gene duplication in primate ribonuclease genes Proc. Natl. Acad. Sci. USA 95:3708-3713