* Division of Population Genetics, National Institute of Genetics, Mishima, Japan
Department of Biological Sciences, Graduate School of Science, University of Tokyo, Tokyo, Japan
Correspondence: E-mail: nsaitou{at}genes.nig.ac.jp.
Abstract |
---|
Key Words: humanness, positive selection, hominoids, gene tree
Introduction |
---|
We are interested in identifying amino acid changes that occurred in the human lineage after the last common ancestor diverged from the chimpanzee lineage. Such changes are candidates for the genetic basis of human-specific characteristics. Gene trees of human and great apes are necessary for extracting the genetic changes that occurred in the human lineage. There are three possible gene trees for human, chimpanzee, and gorilla (see figure 1). Because the speciation period of human and chimpanzee is difficult to infer, the "human lineage" in this paper is defined as the branch connecting the present-day human and the last branching point, designated by a circle in figure 1.
Most genes evolve in a neutral fashion, and natural selection plays mainly a conservative role as negative (purifying) selection (Kimura 1983; Nei 1987). Nevertheless, a small proportion of genes are under positive selection, and evidence of positive selection at the molecular level has accumulated through comparisons of synonymous and nonsynonymous substitutions since it was first found for MHC genes (Hughes and Nei 1988, 1989). Even if we restrict our attention to primates, 15 genes have so far been shown to have experienced positive selection (table 1). We therefore also compared synonymous and nonsynonymous substitutions to identify human lineage-specific positive selection.
Materials and Methods |
---|
Sequence Data Retrieval and Analyses
Eighty-five protein-coding gene sequences were retrieved from the DDBJ/EMBL/GenBank International Nucleotide Sequence Database (Supplementary Material online). This data set contains human, chimpanzee, gorilla, and orangutan sequences longer than 100 bp. To retrieve those sequences, we used orangutan sequences as queries for BLAST searches to obtain homologous sequences for human, chimpanzee, and gorilla. For human sequences, we used sequences that were cited in the NCBI Reference Sequences (RefSeq) as representative ones. When more than two sequences were found from one ape species, we used the sequence that showed the shortest branch in the neighbor-joining tree (Saitou and Nei 1987). ClustalW version 1.8 (Thompson, Gibson, and Higgins 1994) was used for multiple alignments. Tree topologies were determined by counting numbers of informative sites (Supplementary Material online). Tree α shows the human-chimpanzee cluster, tree β shows the chimpanzee-gorilla cluster, and tree γ shows the human-gorilla cluster. Genes with unclear topology (trichotomy, or the same number of informative sites supporting different trees) were categorized into group δ. Genes belonging to group δ were recategorized into δ-α, δ-β, and δ-γ by using the UPGMA method (Sneath and Sokal 1973). The pamp program in the PAML package (Yang 1997) was used to reconstruct the sequences at internal nodes. The ODEN package (Ina 1994) was used to estimate the numbers of synonymous and nonsynonymous substitutions (Nei and Gojobori 1986).
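The informative-site counting described above can be sketched as follows. This is a minimal illustration, not the authors' actual procedure: the function names, the gap handling, and the toy sequences are assumptions. With orangutan as the outgroup, a site is treated as informative when two nucleotides are each shared by exactly two species, and the sharing pattern indicates which pair is clustered.

```python
def count_informative_sites(human, chimp, gorilla, orang):
    """Count aligned sites supporting each of the three possible clusters.

    Keys name the clustered pair: "HC" (human-chimpanzee),
    "CG" (chimpanzee-gorilla), "HG" (human-gorilla).
    Sites containing an alignment gap are skipped (an assumption).
    """
    counts = {"HC": 0, "CG": 0, "HG": 0}
    for h, c, g, o in zip(human, chimp, gorilla, orang):
        if "-" in (h, c, g, o):
            continue
        if h == c and g == o and h != g:
            counts["HC"] += 1
        elif c == g and h == o and c != h:
            counts["CG"] += 1
        elif h == g and c == o and h != c:
            counts["HG"] += 1
    return counts


def classify_topology(counts):
    """Return the best-supported cluster, or "unresolved" for ties or no support."""
    best = max(counts.values())
    winners = [name for name, n in counts.items() if n == best]
    if best == 0 or len(winners) > 1:
        return "unresolved"
    return winners[0]


# Hypothetical toy alignment (not real data).
toy_counts = count_informative_sites("ACGTAA", "ACATC-", "GTATAA", "GTGTCA")
toy_topology = classify_topology(toy_counts)
```

Genes returned as "unresolved" here correspond to the unclear-topology group that the text passes on to UPGMA.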
Statistical Tests
We used two kinds of statistical tests for detecting human-specific natural selection. One test is the acceleration index test for nonsynonymous substitutions of human and apes. This test is analogous to the test of Zhang, Webb, and Podlaha (2002), in which an acceleration index for the human lineage in comparison to the mammalian lineage before the human-chimpanzee split is defined by the equation (h/5.5)/[m/(2 × 90 − 5.5)] = 31.7h/m. The variables h and m are the numbers of amino acid substitutions in the human lineage and the mouse lineage, respectively. Zhang, Webb, and Podlaha (2002) used divergence times between human and chimpanzee (5.5 MYA) and between primates and rodents (90 MYA). They also computed the tail probability in a binomial distribution of B(h+m, 0.03056) for testing the statistical significance of rate enhancement in the human lineage. The value 0.03056 is 5.5/180, the time span for the human branch relative to that for the primate and rodent branches. We applied this test as B(n-Human + n-Ape, 0.13268) to determine the statistical significance of rate enhancement in the human lineage in contrast to the ape lineages (n-Human and n-Ape are the numbers of nonsynonymous changes in the human lineage and the ape lineages, respectively). In our test, 0.13268 = 5.4/40.7 is the ratio of the time span for the human branch (5.4 MYA) to the total time span for the hominoid lineages (40.7 MYA). We used divergence times estimated by Chen and Li (2001). Taking the orangutan speciation date as approximately 12 to 16 MYA (midpoint is 14 MYA) (Goodman et al. 1998), they obtained an estimate of 4.6 to 6.2 MYA (midpoint is 5.4 MYA) for the human and chimpanzee divergence and an estimate of 6.2 to 8.4 MYA (midpoint is 7.3 MYA) for the gorilla speciation date, suggesting that the gorilla lineage branched off approximately 1.6 to 2.2 MYA (midpoint is 1.9 MYA) earlier than did the human and chimpanzee divergence.
The time span between the ancestor of human-chimpanzee-gorilla-orangutan and the ancestor of human-chimpanzee-gorilla can be estimated as 6.7 MYA (= 14 MYA − 7.3 MYA). For simplicity, we took midpoint values and assumed the species tree. Therefore, the total divergence time of hominoid lineages is 40.7 MYA = 5.4 MYA + 5.4 MYA + 1.9 MYA + 7.3 MYA + 6.7 MYA + 14.0 MYA.
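The arithmetic behind the binomial parameter, and the test itself, can be sketched in a few lines. This is an illustrative sketch only: the branch labels, function names, and substitution counts are hypothetical, and the use of the upper tail P(X ≥ n-Human) is an assumption about what "tail probability" means here.

```python
from math import comb

# Branch time spans in million years (midpoints quoted in the text).
# The branch labels are illustrative names, not taken from the paper.
BRANCH_SPANS = {
    "human": 5.4,
    "chimpanzee": 5.4,
    "human-chimpanzee ancestor": 1.9,
    "gorilla": 7.3,
    "human-chimpanzee-gorilla ancestor": 6.7,
    "orangutan": 14.0,
}


def human_time_fraction(spans=BRANCH_SPANS):
    """Human branch span divided by the total span of all branches."""
    return spans["human"] / sum(spans.values())


def acceleration_index(h, m, split=5.5, primate_rodent=90.0):
    """Index of Zhang, Webb, and Podlaha (2002): (h/5.5)/[m/(2*90 - 5.5)]."""
    return (h / split) / (m / (2 * primate_rodent - split))


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def acceleration_test(n_human, n_ape, p=None):
    """Upper-tail probability under B(n_human + n_ape, p)."""
    if p is None:
        p = human_time_fraction()
    return binomial_upper_tail(n_human, n_human + n_ape, p)


# Hypothetical counts for illustration (not from the paper's tables).
example_p = acceleration_test(n_human=5, n_ape=10)
```

A small tail probability means that more nonsynonymous changes fell on the human branch than its share of evolutionary time would predict.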
We also used Fisher's exact test (two tails) for synonymous and nonsynonymous substitutions of human and apes. This test is analogous to the test of McDonald and Kreitman (1991), in which silent and amino acid replacement changes for polymorphic and fixed differences were compared. We compared silent and amino acid replacement changes for human and ape branches in this study.
Results and Discussion |
---|
We divided the 103 protein-coding genes into four groups by their tree topology using the parsimony method. Human and chimpanzee are clustered in tree α (fig. 1A), which is the same as the species tree. Chimpanzee and gorilla are clustered in tree β (fig. 1B), and human and gorilla are clustered in tree γ (fig. 1C). Genes with unclear topology (trichotomy, or the same number of informative sites supporting different trees) were categorized into group δ. Numbers of nucleotide sites supporting each tree for each gene are shown in Supplementary Material online. Thirty-four genes were in group α, 10 genes were in group β, 14 genes were in group γ, and the remaining 45 genes were in group δ (table 2). The 45 genes with a trifurcating tree (group δ) were further classified by using UPGMA, under the assumption of approximate constancy of the evolutionary rate.
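For three ingroup species, the first clustering step of UPGMA already fixes the topology: the pair with the smallest pairwise distance is joined first. The sketch below illustrates only that first step. The function name and the distance values are hypothetical, and the kind of distance used is not specified here.

```python
def upgma_first_join(distances):
    """Return the species pair with the smallest pairwise distance.

    `distances` maps a 2-tuple of species names to a distance.
    UPGMA joins this pair first; with three ingroup species that
    single step determines which two species are clustered.
    """
    return min(distances, key=distances.get)


# Hypothetical pairwise distances (not real data).
toy_distances = {
    ("human", "chimpanzee"): 0.012,
    ("chimpanzee", "gorilla"): 0.015,
    ("human", "gorilla"): 0.016,
}
first_pair = upgma_first_join(toy_distances)
```

The approach relies on the rate-constancy assumption stated in the text, because UPGMA interprets the smallest distance as the most recent divergence.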
We also classified the total number (182) of informative sites into those supporting the three possible trees: 55% of sites supported tree α, 21% supported tree β, and 24% supported tree γ. These proportions are similar to those estimated from numbers of genes. O'hUigin et al. (2002) estimated that 53% of the informative nucleotide sites supported tree α, 31% supported tree β, and 16% supported tree γ from 87 informative sites found in 51 genes. The result of the present study showed a more uniform distribution of informative sites between the two alternative trees.
If we compare different gene trees, the branch length of the human lineage for tree β is expected to be longer than those of tree α and tree γ under the assumption of the molecular clock. We thus compared the number of synonymous substitutions (dS) for each branch of the three gene trees (table 3). As expected, dS values of the human branches for tree β were larger than those of tree α and tree γ, with clear statistical significance. This is consistent with the topological difference between tree β and the remaining two trees. Human forms a cluster with chimpanzee or gorilla in trees α and γ, whereas human is an outgroup to the chimpanzee-gorilla clade in tree β (see figure 1). Similarly, the branch length of the chimpanzee lineage for tree γ is expected to be longer than those of tree α and tree β, and the branch length of the gorilla lineage for tree α is expected to be longer than those of tree β and tree γ. However, clear results were not obtained. This finding may be caused by the smaller number of genes compared.
Synonymous and Nonsynonymous Changes on the Human Branch
We compared numbers of synonymous and nonsynonymous substitutions for human and ape branches (see table 4). The ape branch denotes the sum of all branch lengths of the tree except for the human branch. We applied the acceleration index test (Zhang, Webb, and Podlaha 2002) to nonsynonymous substitutions of human and apes, and six genes (APOE, BRCA1, FOXP2, HCR, PRM2, and ZFY) showed acceleration in the human lineage with statistical significance at the 5% level (table 5). Zhang, Webb, and Podlaha (2002) applied this test to amino acid changes of 120 genes among human, chimpanzee, and mouse, and identified FOXP2 and PRM2 as having significantly enhanced evolutionary rates in the human lineage (table 5). The FOXP2 and PRM2 genes were also reported to be under positive selection on the human branch using different tests by Enard et al. (2002) and Wyckoff, Wang, and Wu (2000), respectively.
Zhang, Webb, and Podlaha (2002) used human, chimpanzee, and mouse sequence data to analyze human-specific selection of genes. Outgroup species are necessary to estimate human lineage-specific changes; however, mouse may be too distantly related to be used as an outgroup. Recently, Clark et al. (2003) compared coding regions of mouse, human, and chimpanzee but did not find many genes with significantly higher nonsynonymous substitutions in the human lineage. More closely related species, such as gorilla and orangutan, are appropriate as outgroups for the kind of analysis we conducted in the present study.
The BRCA1 gene was determined to be under positive selection on the human and chimpanzee branches (Huttley et al. 2000). APOE encodes an apolipoprotein involved in cholesterol metabolism. Three major isoforms are known for human APOE (Weisgraber, Rall, and Mahley 1981), and these alleles differ in their association with hyperlipoproteinemia (Rall et al. 1982) and Alzheimer disease risk (Corder et al. 1993). It is possible that cholesterol metabolism underwent different evolutionary pressures on the human and the ape branches. The HCR gene is located near the HLA-C locus and is a candidate gene for psoriasis (Asumalahti et al. 2000). However, there is so far no report of positive selection on this gene. The ZFY gene encodes a zinc finger-containing protein that may function as a transcription factor. Differential rates of evolution of the ZFY-related genes were recently observed in mouse species (Tucker, Adkins, and Rest 2003).
We also applied Fisher's exact test (two-tailed) to synonymous and nonsynonymous substitutions of human and apes. Three genes (BRCA1, FOXP2, and DAF) showed statistical significance at the 5% level (table 5). The BRCA1 and FOXP2 genes showed significant enhancement of nonsynonymous substitutions in the human lineage, as also found by using the acceleration test, whereas the DAF (decay accelerating factor) gene showed a significant reduction in the human lineage. Generally speaking, P(n/s) values are higher than P(A.I.-ape) values except for the FOXP2 gene.
There are two equally parsimonious trees for the DAF gene. Figure 2 shows these two trees (shown with bold lines in A and B) on the phylogenetic network. This gene was categorized into group -
, corresponding to tree A of figure 2. Four synonymous and two nonsynonymous substitutions on the human branch and six synonymous and 28 nonsynonymous substitutions on ape branches were assumed for tree A. The number of nonsynonymous substitutions on the human branch is significantly smaller than those of other branches (table 5). When the alternative maximum-parsimonious tree (fig. 2B) is considered, however, two synonymous and five nonsynonymous substitutions on the human branch and six synonymous and 27 nonsynonymous substitutions on other branches were observed. In this case, the difference in substitution numbers between the human and other branches is not statistically significant. The DAF gene encodes a glycoprotein and is related to the Cromer blood group system (CR) (Reid et al. 1996). Kuttner-Kondo et al. (2000) mentioned that the number of amino acid changes differs from region to region in primate DAF genes. This gene may have a human-specific nucleotide substitution pattern, but this depends on the topology analyzed. More detailed analyses might be needed for this gene.
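The DAF counts quoted above can be checked with a two-tailed Fisher's exact test. The following is a minimal standard-library sketch, not the software used in the study. The function name is hypothetical, and the two-tailed rule used (summing all tables at most as probable as the observed one) is one common convention that is assumed here.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are branches (human, ape); columns are substitution types
    (synonymous, nonsynonymous). Sums the hypergeometric probabilities
    of all tables that are no more probable than the observed one.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    denom = comb(n, row1)

    def prob(x):
        # Probability that the top-left cell equals x with margins fixed.
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    p_obs = prob(a)
    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Tree A: human 4 synonymous / 2 nonsynonymous, apes 6 / 28.
p_tree_a = fisher_exact_two_tailed(4, 2, 6, 28)  # about 0.026
# Tree B: human 2 synonymous / 5 nonsynonymous, others 6 / 27.
p_tree_b = fisher_exact_two_tailed(2, 5, 6, 27)  # about 0.61
```

Under these assumptions the tree A counts fall below the 5% level and the tree B counts do not, which matches the dependence on topology described in the text.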
Differences of Nonsynonymous Substitutions for Human and Ape Branch
Comparison of synonymous and nonsynonymous substitutions for each coding region is a standard way of detecting the pattern of natural selection. However, the number of synonymous substitutions is subject to stochastic variation, and it can be rather small for one gene but large for another. We therefore decided to compare the number of human and ape nonsynonymous substitutions (dN) with the number of synonymous substitutions (dS) for each gene. Human and ape dNs were positively correlated with high statistical significance (R² = 0.33, P = 2.00 × 10⁻¹⁰). This result is compatible with that of Wildman et al. (2003). Figure 3 shows a plot of human dN − ape dN for each gene. We multiplied human dN by 6.54 because of the difference between the human divergence time (5.4 MYA) and the ape divergence times (35.3 MYA). If dN is constant for human and ape branches, human dN − ape dN is expected to be zero. In fact, the majority of the genes are located around the zero line in figure 3. Substitution rates (dS) varied among genes. For example, PRM2 showed the highest substitution rate by dS. However, there was no correlation between the difference human dN − ape dN and total dS.
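The scaling step can be written out explicitly. In this sketch the factor 6.54 is recovered as 35.3/5.4, and the quantity plotted for each gene is taken to be the scaled human dN minus the ape dN. The function name and the dN values are hypothetical and serve only as illustration.

```python
HUMAN_SPAN = 5.4   # million years, human branch
APE_SPAN = 35.3    # million years, all other branches combined
SCALE = APE_SPAN / HUMAN_SPAN  # about 6.54


def scaled_dn_difference(human_dn, ape_dn, scale=SCALE):
    """Scaled human dN minus ape dN; about zero if the rate is constant."""
    return human_dn * scale - ape_dn


# Hypothetical dN values per gene: (human dN, ape dN). Not real data.
toy_genes = {
    "geneA": (0.0010, 0.0065),  # close to the zero line
    "geneB": (0.0040, 0.0080),  # excess of human nonsynonymous change
}
toy_differences = {
    name: scaled_dn_difference(h, a) for name, (h, a) in toy_genes.items()
}
```

Genes with a large positive difference are the ones that stand out above the zero line in a plot of this kind.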
In conclusion, we conducted a systematic analysis of 103 protein-coding genes for human, chimpanzee, gorilla, and orangutan. We showed that gene genealogies differ from gene to gene, because the time span between the human-chimpanzee common ancestor and gorilla speciation is short. We conducted three types of analyses for detecting the human-specific pattern in nonsynonymous changes. Comparison of each coding region is a standard way of detecting the pattern of natural selection. However, it is sometimes difficult to detect the pattern of natural selection because of the small number of changes. We therefore compared dNs by using a large number of genes. This kind of analysis may help to find candidate genes that caused human-specific phenotypic changes.
Acknowledgements |
---|
Footnotes |
---|
Pekka Pamilo, Associate Editor
Literature Cited |
---|
Asumalahti, K., T. Laitinen, R. Itkonen-Vatjus, M.-L. Lokki, S. Suomela, E. Snellman, U. Saarialho-Kere, and J. Kere. 2000. A candidate gene for psoriasis near HLA-C, HCR (Pg8), is highly polymorphic with a disease-associated susceptibility allele. Hum. Mol. Genet. 9:1533-1542.
Chen, F.-C., and W.-H. Li. 2001. Genomic divergence between humans and other hominoids and the effective population size of the common ancestor of humans and chimpanzees. Am. J. Hum. Genet. 68:444-456.
Clark, A. G., S. Glanowski, and R. Nielsen, et al. (17 co-authors). 2003. Inferring nonneutral evolution from human-chimp-mouse orthologous gene trios. Science 302:1960-1963.
Corder, E. H., A. M. Saunders, W. J. Strittmatter, D. E. Schmechel, P. C. Gaskell, G. W. Small, A. D. Roses, J. L. Haines, and M. A. Pericak-Vance. 1993. Gene dose of apolipoprotein E type 4 allele and the risk of Alzheimer's disease in late onset families. Science 261:921-923.
Enard, W., M. Przeworski, S. E. Fisher, C. S. Lai, V. Wiebe, T. Kitano, A. P. Monaco, and S. Pääbo. 2002. Molecular evolution of FOXP2, a gene involved in speech and language. Nature 418:869-872.
Fujiyama, A., H. Watanabe, and A. Toyoda, et al. (14 co-authors). 2002. Construction and analysis of a human-chimpanzee comparative clone map. Science 295:131-134.
Goodman, M., C. A. Porter, J. Czelusniak, S. L. Page, H. Schneider, J. Shoshani, G. Gunnell, and C. P. Groves. 1998. Toward a phylogenetic classification of Primates based on DNA evidence complemented by fossil evidence. Mol. Phylogenet. Evol. 9:585-598.
Horai, S., K. Hayasaka, R. Kondo, K. Tsugane, and N. Takahata. 1995. Recent African origin of modern humans revealed by complete sequences of hominoid mitochondrial DNAs. Proc. Natl. Acad. Sci. USA 92:532-536.
Hughes, A. L., and M. Nei. 1988. Pattern of nucleotide substitution at major histocompatibility complex class I loci reveals overdominant selection. Nature 335:167-170.
Hughes, A. L., and M. Nei. 1989. Nucleotide substitution at major histocompatibility complex class II loci: evidence for overdominant selection. Proc. Natl. Acad. Sci. USA 86:958-962.
Huttley, G. A., S. Easteal, M. C. Southey, A. Tesoriero, G. G. Giles, M. R. E. McCredie, J. L. Hopper, and D. J. Venter. 2000. Adaptive evolution of the tumour suppressor BRCA1 in humans and chimpanzees. Nat. Genet. 25:410-413.
Ina, Y. 1994. ODEN: a program package for molecular evolutionary analysis and database search of DNA and amino acid sequences. Comput. Appl. Biosci. 10:11-12.
International Human Genome Sequencing Consortium. 2001. Initial sequencing and analysis of the human genome. Nature 409:860-921.
Johnson, M. E., L. Viggiano, J. A. Bailey, M. Abdul-Rauf, G. Goodwin, M. Rocchi, and E. E. Eichler. 2001. Positive selection of a gene family during the emergence of humans and African apes. Nature 413:514-519.
Kimura, M. 1983. The neutral theory of molecular evolution. Cambridge University Press, Cambridge, UK.
King, M. C., and A. C. Wilson. 1975. Evolution at two levels in humans and chimpanzees. Science 188:107-116.
Kitano, T., and N. Saitou. 1999. Evolution of the Rh blood group genes has experienced gene conversions and positive selection. J. Mol. Evol. 49:615-626.
Kitano, T., K. Sumiyama, T. Shiroishi, and N. Saitou. 1998. Conserved evolution of the Rh50 gene compared to its homologous Rh blood group gene. Biochem. Biophys. Res. Comm. 249:78-85.
Kuttner-Kondo, L., V. B. Subramanian, J. P. Atkinson, J. Yu, and M. E. Medof. 2000. Conservation in decay accelerating factor (DAF) structure among primates. Dev. Comp. Immunol. 24:815-827.
McDonald, J. H., and M. Kreitman. 1991. Adaptive protein evolution at the Adh locus in Drosophila. Nature 351:652-654.
Messier, W., and C. B. Stewart. 1997. Episodic adaptive evolution of primate lysozymes. Nature 385:151-154.
Nei, M. 1987. Molecular evolutionary genetics. Columbia University Press, New York.
Nei, M., and T. Gojobori. 1986. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3:418-426.
Ohno, S. 1972. So much "junk" DNA in our genome. Brookhaven Symp. Biol. 23:366-370.
O'hUigin, C., Y. Satta, N. Takahata, and J. Klein. 2002. Contribution of homoplasy and of ancestral polymorphism to the evolution of genes in anthropoid primates. Mol. Biol. Evol. 19:1501-1513.
Rall, S. C., Jr., K. H. Weisgraber, T. L. Innerarity, and R. W. Mahley. 1982. Structural basis for receptor binding heterogeneity of apolipoprotein E from type III hyperlipoproteinemic subjects. Proc. Natl. Acad. Sci. USA 79:4696-4700.
Reid, M. E., V. Chandrasekaran, L. Sausais, J. Pierre, and R. Bullock. 1996. Disappearance of antibodies to Cromer blood group system antigens during mid pregnancy. Vox Sang. 71:48-50.
Saitou, N. 1991. Reconstruction of molecular phylogeny of extant hominoids from DNA sequence data. Am. J. Phys. Anthropol. 84:75-85.
Saitou, N., and M. Nei. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.
Satta, Y., J. Klein, and N. Takahata. 2000. DNA archives and our nearest relative: the trichotomy problem revisited. Mol. Phylogenet. Evol. 14:259-275.
Sibley, C. G., and J. E. Ahlquist. 1984. The phylogeny of the hominoid primates, as indicated by DNA-DNA hybridization. J. Mol. Evol. 20:2-15.
Sneath, P. H. A., and R. R. Sokal. 1973. Numerical taxonomy. Freeman, San Francisco.
Sumiyama, K., N. Saitou, and S. Ueda. 2002. Adaptive evolution of the IgA hinge region in primates. Mol. Biol. Evol. 19:1093-1099.
Thompson, J. D., T. J. Gibson, and D. G. Higgins. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, positions-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.
Tucker, P. K., R. M. Adkins, and J. S. Rest. 2003. Differential rates of evolution for the ZFY-related zinc finger genes, Zfy, Zfx, and Zfa in the mouse genus Mus. Mol. Biol. Evol. 20:999-1005.
Weisgraber, K. H., S. C. Rall, and R. W. Mahley. 1981. Human E apoprotein heterogeneity: cysteine-arginine interchanges in the amino acid sequence of the apo-E isoforms. J. Biol. Chem. 256:9077-9083.
Wildman, D. E., M. Uddin, G. Liu, L. I. Grossman, and M. Goodman. 2003. Implications of natural selection in shaping 99.4% nonsynonymous DNA identity between humans and chimpanzees: enlarging genus Homo. Proc. Natl. Acad. Sci. USA 100:7181-7188.
Wyckoff, G. J., W. Wang, and C.-I. Wu. 2000. Rapid evolution of male reproductive genes in the descent of man. Nature 403:304-309.
Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. CABIOS 13:555-556.
Zhang, J., H. F. Rosenberg, and M. Nei. 1998. Positive Darwinian selection after gene duplication in primate ribonuclease genes. Proc. Natl. Acad. Sci. USA 95:3708-3713.
Zhang, J., D. M. Webb, and O. Podlaha. 2002. Accelerated protein evolution and origins of human-specific features: Foxp2 as an example. Genetics 162:1825-1835.