Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
Correspondence: E-mail: przewors{at}eva.mpg.de.
Abstract
Key Words: chimpanzees, chimpanzee subspecies, diversity, population structure, linkage disequilibrium
Introduction
Chimpanzees are classified into two species: the common chimpanzee, Pan troglodytes, and the pygmy chimpanzee or bonobo, Pan paniscus. Among common chimpanzees, three "subspecies" are recognized on the basis of their geographic distribution (Napier and Napier 1967): Pan troglodytes verus in western Africa, P. t. troglodytes in central Africa, and P. t. schweinfurthii in eastern Africa. Recently, a fourth subspecies, P. t. vellerosus, has been suggested in western Africa based on mitochondrial DNA divergence (Gonder et al. 1997). Very little is known about the distribution of variation among these subspecies or between species (cf. Morin et al. [1994]).
Indeed, while we now have polymorphism data for extensive regions of the human genome, including coding as well as intergenic regions (Ptak and Przeworski 2002 and references therein), there are few surveys of genetic variation in chimpanzees. Several studies have examined the phylogenetic relationship of humans to other primates (Ruvolo 1997; Gagneux et al. 1999; Chen and Li 2001; Jensen-Seaman, Deinard, and Kidd 2001) and estimated the sequence divergence (Chen et al. 2001; Ebersberger et al. 2002). However, intraspecific variation among chimpanzees has been studied in few regions: mitochondrial DNA (Ferris et al. 1981; Morin et al. 1994; Wise et al. 1997), the HOXB6 intergenic region (Deinard and Kidd 1999), a nongenic region at Xq13.3 (Kaessmann, Wiebe, and Pääbo 1999; Kaessmann et al. 2001), four genes of the renin-angiotensin system (Dufour et al. 2000), four autosomal and two X chromosomal genes in two individuals (Satta 2001), the male-specific part of the Y chromosome (Stone et al. 2002), and 10 genes on the X chromosome (Kitano et al. 2003). The main conclusions to emerge from these studies are that the mutation rate is similar in humans and chimpanzees, but levels of diversity are twofold to sixfold higher in chimpanzees, reflective of a larger effective population size (e.g., Kaessmann, Wiebe, and Pääbo [1999]).
Many of the regions surveyed in chimpanzees are genic or lie in regions of low or no recombination. Thus, they are more likely affected by natural selection (directly or at linked sites). To study nonselective processes shaping patterns of variation, surveys of intergenic regions in areas of normal to high recombination may be more informative. Recently, short fragments from 50 intergenic regions were sequenced in 17 chimpanzees, including five known to be central and six known to be western (Yu et al. 2003). The authors reported that, in contrast to previous studies, levels of diversity were only 50% higher in chimpanzees than in worldwide samples of humans.
We collected DNA sequence variation from nine intergenic, autosomal regions in 14 central chimpanzees. In each region, we sequenced approximately 1 kb from each end of a 10-kb segment to characterize patterns of linkage disequilibrium as well as of diversity. These regions have previously been sequenced in three human populations (Frisse et al. 2001) and partially in 16 western chimpanzees (Gilad et al. 2003). These data suggest that central chimpanzees are 2-fold to 2.5-fold more diverse than western chimpanzees and worldwide samples of humans. Consistent with this, levels of linkage disequilibrium are lower in central chimpanzees than in humans. Interestingly, there appears to be much stronger genetic differentiation between western and central subspecies than between human populations. Together with the high proportion of rare alleles observed in central but not western samples, this finding suggests a complex demographic history of chimpanzees.
Materials and Methods
Genomic Regions and Characteristics
We resequenced nine of 10 genomic regions chosen in a previous study (Frisse et al. 2001) in the 14 central chimpanzees. The 10 regions were selected to have large-scale recombination rates and GC contents close to the genome average (Frisse et al. 2001). One region (region 9) was not amplified in chimpanzees because of a 400-bp complex repeat, which made sequencing difficult. Sequences are available from GenBank under accession numbers AY458900 to AY459151.
The GC content for each region was determined using a Unix script, kindly provided by Ingo Ebersberger. We checked for conserved regions between mouse and human sequences using the Berkeley genome pipeline (http://pipeline.lbl.gov/).
To estimate the recombination rates (c) of the nine regions studied, we made the assumption that large-scale rates are similar in humans and chimpanzees and used the human recombination rates reported in Kong et al. (2002). This study provides average recombination rates for 3 Mb windows centered on a marker. We assigned each region to the closest marker on this map; these were located 4 kb to 370 kb away from the 10-kb region of interest.
DNA Extraction
DNA was extracted from 50 ml cell cultures containing between 5 × 10⁶ and 50 × 10⁶ cells. These cell cultures were obtained from lymphocytes transformed with Epstein-Barr virus. The cell pellets were first resuspended in 5 ml TEN (10 mM Tris/HCl, pH 8.2; 400 mM NaCl; 2 mM EDTA, pH 8.2). Ten milliliters of TEN, 200 µl 10% SDS (sodium dodecyl sulfate), and 100 µl proteinase K were then added. After an overnight incubation at 37°C and cooling for 1 h in the refrigerator, 5 ml of saturated NaCl was added. The solutions were shaken well and centrifuged for 15 min at 4°C (3,500 rpm). The supernatants were transferred into twice their volume of 100% ethanol for DNA precipitation. The DNAs were then washed with 70% ethanol. The dried pellets were redissolved in 1 ml water. This DNA was then treated for 1 h with RNase A (10 µg/µl). The RNase and nucleotides were removed by a phenol-chloroform extraction. The remaining pellets were dissolved in 1 ml water and aliquoted to a concentration of 100 ng/ml for further use.
PCR Amplification
For each of the nine 10-kb regions, approximately 1 kb was amplified at each end. Polymerase chain reaction (PCR) primers were designed using the human sequence. Amplification reactions were performed in a 96-well microtiter plate thermal cycler (Applied Biosystems). The PCR mixture (100 µl) contained a standard buffer (10 mM Tris-HCl, 5 mM MgCl2), the four deoxynucleotide triphosphates (0.25 mM each), primers (0.5 pM each), AmpliTaq Gold (PerkinElmer), and genomic DNA (70 ng). Thermal cycling was performed by touchdown PCR with an initial denaturation step at 95°C for 5 min, followed by 16 cycles of denaturation at 94°C for 45 s, primer annealing for 45 s with the temperature decreasing by 1°C every cycle from 65°C to 50°C, and primer extension at 72°C for 2 min. After these 16 cycles, 24 cycles with annealing at 50°C were added under otherwise identical conditions, and a final extension was carried out at 72°C for 3 min.
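The annealing schedule described above can be written out explicitly as a quick consistency check. This is a minimal sketch; the function name `touchdown_schedule` and its parameters are illustrative and not part of the original protocol.

```python
def touchdown_schedule(start=65, end=50, plateau_cycles=24):
    """Return the annealing temperature (in °C) for every PCR cycle.

    One touchdown cycle is run per 1 °C step from `start` down to `end`
    (inclusive), followed by `plateau_cycles` cycles held at `end`.
    """
    touchdown = list(range(start, end - 1, -1))  # 65, 64, ..., 50
    plateau = [end] * plateau_cycles             # 24 further cycles at 50
    return touchdown + plateau


schedule = touchdown_schedule()
print(len(schedule), schedule[:3], schedule[15:18])
```

Stepping from 65°C to 50°C in 1°C decrements gives exactly the 16 touchdown cycles stated in the text, for 40 cycles in total.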
DNA Sequencing
Sequencing primers were designed to anneal approximately every 400 bp, for complete coverage in both orientations of each locus pair. After DNA amplification, PCR products were purified with the QIAquick PCR Purification Kit (QIAGEN), and the amount of purified DNA was estimated by electrophoresis in a 1% agarose gel and by measurement with a spectrophotometer. Approximately 10 ng of the purified sample (1 to 4 µl) was used as sequencing template. These templates were diluted in water to a volume of 7 µl, and 1 µl primer (5 pM) and 2 µl dye-terminator were added. Cycle sequencing was performed according to the manufacturer's instructions using the BigDye Terminator Cycle Sequencing kit and ABI Prism 3700 (PerkinElmer Biosystems).
Chromatograms were imported to Sequencher version 4.0 (Gene Codes Corporation) for the assembly into contigs and the identification of polymorphic sites. Diploid sequence was determined by inspection of each nucleotide position in high-quality chromatograms.
Hardy-Weinberg equilibrium was tested for each site using Arlequin version 2.0 (http://lgb.unige.ch/arlequin/ ). Overall, there were no more departures than expected by chance (P < 0.05 for 12 variable sites out of 167 across the different regions). Nonetheless, at some sites that appeared out of Hardy-Weinberg equilibrium, there were individuals homozygous for the rare allele, raising the possibility of allelic dropout. We assessed whether these anomalies are caused by allelic dropout by designing new primers, performing a new PCR, and resequencing the product. Eight of the 12 sites identified as out of Hardy-Weinberg turned out to be heterozygotes. This method was also used on 22 additional homozygote sites, to test whether any additional heterozygous sites had been missed. None of them were found to be heterozygotes.
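To make the per-site test concrete, the following is a minimal sketch of an exact Hardy-Weinberg test for one biallelic site, computed by enumerating every heterozygote count compatible with the observed allele counts. It is an illustration only: Arlequin's own procedure may differ in detail, and the function name and the genotype counts in the usage line are hypothetical.

```python
from math import factorial


def hwe_exact_p(n_aa, n_ab, n_bb):
    """Exact Hardy-Weinberg P-value for one biallelic site.

    Sums the probabilities of all genotype configurations (given the
    observed allele counts) that are no more likely than the observed one.
    """
    n = n_aa + n_ab + n_bb          # number of diploid individuals
    n_a = 2 * n_aa + n_ab           # copies of allele A
    n_b = 2 * n - n_a               # copies of allele B

    def prob(het):
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        num = (2 ** het) * factorial(n) * factorial(n_a) * factorial(n_b)
        den = (factorial(hom_a) * factorial(het) * factorial(hom_b)
               * factorial(2 * n))
        return num / den

    # The heterozygote count must have the same parity as the allele counts.
    het_counts = range(n_a % 2, min(n_a, n_b) + 1, 2)
    probs = {h: prob(h) for h in het_counts}
    p_obs = probs[n_ab]
    return sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))


# Hypothetical site in 14 individuals: two homozygotes for the rare allele
# and no heterozygotes, the pattern that raised suspicion of allelic dropout.
print(hwe_exact_p(12, 0, 2))
```

A site like the hypothetical one above yields a very small P-value because, with only four copies of the rare allele, most of them are expected to occur in heterozygotes.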
For insertion-deletions that made direct sequencing difficult, we carried out the following procedure: After the PCR was performed, the products were cloned using the TOPO TA-cloning kit (Invitrogen). Once the bacteria had grown on agar plates, 10 white colonies were picked and a new PCR was performed with 40 cycles beginning with a denaturing temperature of 94°C for 1 min, followed by primer annealing at 57°C for 1 min, and an elongation temperature of 72°C for 2 min. Fifteen minutes of additional elongation were added at the end of the program. The PCR products were then sequenced as above for every individual.
Statistical Analyses
We used the program DnaSP (Rozas and Rozas 1999) to obtain a number of commonly used statistics. To summarize diversity levels, we calculated the average pairwise difference, π (Nei and Li 1979), and θ_W (Watterson 1975). Under the standard neutral model of a random-mating population of constant size with no selection, both are estimates of the population mutation rate, θ = 4N_eµ (where µ is the mutation rate per generation and N_e the effective population size) (Tajima 1989). We also calculated two summaries of allele frequencies, Tajima's D statistic (Tajima 1989) and Fu and Li's D test (Fu and Li 1993).
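As an illustration of how these summaries are obtained from aligned sequences, here is a minimal sketch that computes the average pairwise difference per site, Watterson's estimator per site, and Tajima's D using the standard formulas of Tajima (1989). It is not the DnaSP implementation; the function name and the toy alignment are hypothetical, and sites are assumed to carry no missing data.

```python
from itertools import combinations
from math import sqrt


def diversity_stats(haplotypes, n_sites):
    """Return (pi per site, Watterson's theta per site, Tajima's D)."""
    n = len(haplotypes)
    columns = list(zip(*haplotypes))
    s = sum(1 for col in columns if len(set(col)) > 1)  # segregating sites

    # Mean number of pairwise differences between sequences.
    n_pairs = n * (n - 1) / 2
    k = sum(sum(x != y for x, y in zip(h1, h2))
            for h1, h2 in combinations(haplotypes, 2)) / n_pairs

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    pi = k / n_sites
    theta_w = s / a1 / n_sites
    if s == 0:
        return pi, theta_w, float("nan")

    # Variance constants from Tajima (1989).
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    d = (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
    return pi, theta_w, d


# Toy alignment: three singleton variants among four sequences.
toy = ["AAAA", "AAAT", "AACA", "AGAA"]
print(diversity_stats(toy, n_sites=4))
```

In the toy alignment every variant is a singleton, so the Watterson estimate exceeds the pairwise estimate and D is negative, which is the direction of the skew discussed for rare alleles.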
We tested the goodness of fit of a standard neutral model of a random-mating population of constant size by asking how often the mean Tajima's D statistic across the nine loci or the mean Fu and Li's D test are as low or lower than observed in 10,000 simulations. We also assessed whether there was significant heterogeneity in the ratio of polymorphism to divergence across loci, using a multilocus Hudson-Kreitman-Aguadé (HKA) test (Hudson, Kreitman, and Aguade 1987). All three tests were implemented using the program HKA, available from Jody Hey's home page (http://lifesci.rutgers.edu/heylab/ ). The program assumes an infinite-site mutation model and no recombination within locus pairs but free recombination between them.
To estimate the chimpanzee mutation rate, we reconstructed the sequence for the ancestor of humans and chimpanzees, with the orangutan as an outgroup (using PAML [Yang 1997]). This allowed us to estimate the number of substitutions separately for each lineage. We obtained an estimate of the mutation rate per generation, µ, by dividing this number by the number of generations to the most recent common ancestor, assuming 6 Myr to the common ancestor and 20 years per generation (L. Vigilant, personal communication). Given this estimate of the mutation rate, the diploid effective population size N_e of autosomal loci can be estimated by π/(4µ) or θ_W/(4µ) under the standard neutral model (where π and θ_W are average values across the nine locus pairs).
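The arithmetic behind these estimates can be sketched as follows. The diversity values (0.17% and 0.24%) and the human-chimpanzee divergence (1.16%) are the averages reported in the Results. As a loudly labeled assumption, the substitutions on the chimpanzee lineage are approximated here as half of the total divergence, which stands in for the lineage-specific counts obtained with PAML and will not reproduce them exactly. Function names are illustrative.

```python
def mutation_rate_per_generation(subs_per_site, years_to_ancestor=6e6,
                                 years_per_generation=20):
    """Per-site mutation rate per generation along one lineage."""
    generations = years_to_ancestor / years_per_generation
    return subs_per_site / generations


def effective_size(theta_estimate, mu):
    """Diploid N_e from an estimate of theta = 4 * N_e * mu."""
    return theta_estimate / (4 * mu)


# ASSUMPTION: half of the 1.16% human-chimpanzee divergence accumulated on
# the chimpanzee lineage (a stand-in for the PAML lineage-specific estimate).
mu = mutation_rate_per_generation(0.0116 / 2)
ne_from_pi = effective_size(0.0017, mu)       # pi averaged over locus pairs
ne_from_theta_w = effective_size(0.0024, mu)  # theta_W averaged over locus pairs
print(mu, ne_from_pi, ne_from_theta_w)
```

Under these assumptions the two estimators give roughly 22,000 and 31,000; the gap between them mirrors the gap between the pairwise and Watterson diversity estimates.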
Comparison of Central and Western Chimpanzees
Seven single fragments of 670 to 910 bp from the nine locus pairs were previously sequenced in 16 western chimpanzees, totaling 5.4 kb (Gilad et al. 2003). On the basis of these data and polymorphism data collected for the same 5.4 kb in central chimpanzees, we built a neighbor-joining tree, with the orangutan as an outgroup, for each of the regions (using MEGA version 2.1 and the Kimura two-parameter method [Kimura 1980]). To summarize differences in allele frequencies between the central and western chimpanzee samples, we calculated the ratio of the estimated variance component due to differences between populations over the estimated total variance, Φ_ST (Excoffier, Smouse, and Quattro 1992), using the program Arlequin (http://lgb.unige.ch/arlequin/ ). We also tabulated the number of polymorphisms shared between western and central population samples, found only in one sample, or fixed between samples. Assuming a simple model in which an ancestral population splits into two descendant populations of constant (but possibly unequal) size, with no subsequent migration, these summaries can be used to estimate the split time and the effective population sizes. We did so using a method of moments developed by Wakeley and Hey (1997), using the mutation rate that we estimated for chimpanzees.
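The tabulation of shared, exclusive, and fixed variants can be sketched as below. This is a minimal illustration that assumes aligned sequences of equal length, biallelic sites, and no missing data; the function name and the toy sequences are hypothetical.

```python
def classify_sites(sample_1, sample_2):
    """Count sites that are shared, exclusive to one sample, or fixed."""
    counts = {"shared": 0, "exclusive_1": 0, "exclusive_2": 0, "fixed": 0}
    for col_1, col_2 in zip(zip(*sample_1), zip(*sample_2)):
        alleles_1, alleles_2 = set(col_1), set(col_2)
        poly_1, poly_2 = len(alleles_1) > 1, len(alleles_2) > 1
        if poly_1 and poly_2:
            counts["shared"] += 1        # polymorphic in both samples
        elif poly_1:
            counts["exclusive_1"] += 1   # polymorphic in sample 1 only
        elif poly_2:
            counts["exclusive_2"] += 1   # polymorphic in sample 2 only
        elif alleles_1 != alleles_2:
            counts["fixed"] += 1         # monomorphic for different alleles
    return counts


western = ["AAGT", "AAGT", "ACGT"]
central = ["ACTT", "AATT", "AATA"]
print(classify_sites(western, central))
```

These four counts are the summaries that a method-of-moments approach such as that of Wakeley and Hey (1997) takes as input; the estimation step itself is not shown here.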
Measures of Linkage Disequilibrium (LD)
Haplotypes were inferred with the program PHASE (Stephens, Smith, and Donnelly 2001), after exclusion of singletons. Pairwise LD was then calculated using the inferred haplotypes (using DnaSP). To summarize levels of pairwise LD, we used r², a commonly used statistic (Hill and Robertson 1968). An alternative way to quantify levels of LD is to estimate the population crossing-over parameter ρ = 4Nr. The parameter ρ is a key determinant of LD patterns, with the strength of LD decreasing when ρ increases. The relation of ρ to the mean r² is known under simple models (McVean 2002). Under the standard neutral model, E(r²) is given as a function of ρ by equation 1.
In humans, there exist two approaches to estimating ρ. In the first, ρ is estimated as ρ_map = 4N_e c, where c is an estimate of the recombination rate per bp, based on a comparison of physical and genetic maps, and N_e is an estimate of the effective population size, based on diversity and divergence data. In the second method, ρ is estimated from patterns of LD under the standard neutral model. One way to think of this second estimate is as the population recombination rate needed to produce the observed levels of LD under the standard neutral model. Thus, larger values of ρ correspond to lower levels of LD. If the underlying assumptions are valid, the two estimates of ρ should be similar.
We used the approach of Hudson (2001b) to obtain the maximum composite likelihood estimate of ρ, referred to as ρ_cl. Specifically, we used an extension of the method that allows the simultaneous estimation of ρ and f, where f is the ratio of gene conversion to crossing-over events (Frisse et al. 2001). The program to do so was kindly provided by R. Hudson. The method assumes that gene conversion and crossing-over are alternative resolutions of a Holliday junction and that the conversion-tract length is geometrically distributed with mean length L. Given this model of gene conversion, an effective population recombination parameter ρ_e can be calculated (Wiuf and Hein 2000). This ρ_e can then also be used to estimate E(r²) by substituting ρ_e for ρ in equation 1. Following Frisse et al. (2001), we assumed that ρ and f are fixed across loci and estimated the two parameters from all loci jointly. To make our results comparable to those in Frisse et al. (2001), we further assumed that L = 500 bp; similar results are obtained if L = 1,000 bp (results not shown). To test the hypothesis that f = 0, we compared the observed value of Λ = CLik(f = 0)/CLik(f unconstrained) with a distribution of Λ values obtained from 100 simulations of the null model, where f = 0 and ρ = ρ_cl (CLik denotes the composite likelihood). A small ratio indicates that the data are more likely under the alternative where f is not constrained to be 0.
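The qualitative effect of gene conversion on the expected decay of LD can be illustrated with a short sketch. Two forms are assumed here and may differ from the expressions used in the original analysis: the classical approximation E(r²) ≈ (10 + ρ)/(22 + 13ρ + ρ²) of Ohta and Kimura (1971), with no sample-size correction, and an effective rate between sites d bp apart of ρ[d + 2fL(1 − e^(−d/L))], a form commonly used for geometrically distributed tracts. The function names are illustrative; the per-bp rate of 0.0021 is the map-based estimate quoted in the Results.

```python
from math import exp


def effective_rho(rho_per_bp, distance_bp, f=0.0, tract_bp=500.0):
    """ASSUMED form of the effective population recombination parameter
    between two sites, combining crossing-over and gene conversion."""
    conversion = 2 * f * tract_bp * (1 - exp(-distance_bp / tract_bp))
    return rho_per_bp * (distance_bp + conversion)


def expected_r2(rho):
    """ASSUMED approximation to E(r^2) (Ohta and Kimura 1971)."""
    return (10 + rho) / (22 + 13 * rho + rho ** 2)


for d in (1000, 5000, 10000):
    no_gc = expected_r2(effective_rho(0.0021, d, f=0))
    with_gc = expected_r2(effective_rho(0.0021, d, f=2))
    print(d, round(no_gc, 3), round(with_gc, 3))
```

Because the conversion term saturates at 2fL once the distance greatly exceeds the tract length, gene conversion lowers expected LD most strongly, in relative terms, over short distances.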
Results
Divergence, Diversity, and Effective Population Size
A total of 19,000 bases were sequenced per individual. The average divergence (table 1) between chimpanzees and humans is 1.16% (range, 0.74% to 1.59%) and between chimpanzees and orangutan 3.09% (range, 2.08% to 3.80%). This is in line with previous estimates (Chen and Li 2001; Ebersberger et al. 2002) of the average intergenic divergence of chimpanzee to human and to orangutan (e.g., 1.24% and 3.12%, respectively, in Chen and Li [2001]). In total, 167 variable positions are found among the 28 central chimpanzee chromosomes. The average number of pairwise differences (π) is 0.17% and ranges from 0.08% to 0.27% across the nine locus pairs. A summary of diversity based on the number of polymorphic sites in the sample, θ_W (Watterson 1975), is 0.24% when averaged across all regions and ranges from 0.14% to 0.37% (table 1).
Selection and Demography
To assess the evidence for differences in selective constraints among the nine regions in the chimpanzees, we used a multilocus HKA test with the orangutan to estimate divergence (see Materials and Methods). By this approach, there is no evidence for heterogeneity among the regions (P = 0.964, ignoring intralocus recombination). Further, the regions were not unusually conserved in a comparison of human to mouse (data not shown). Thus, there is no reason to assume that selective constraints have influenced the variation observed in these regions.
To assess the fit of allele frequencies to the expectations of the standard neutral model, we used Tajima's D (see Materials and Methods). This statistic considers the approximately normalized difference between π and θ_W. Under the standard neutral model, the expectations of π and θ_W are equal, so the mean Tajima's D statistic is roughly 0. Because rare alleles contribute more to θ_W than to π, a negative value of Tajima's D statistic reflects a relative excess of low-frequency polymorphisms. As table 1 shows, Tajima's D statistic is negative in all nine regions. Moreover, although only region 7 exhibits a D statistic value significantly different from 0 at the 5% level, the mean D statistic value across loci is significantly negative (−1.02; P < 10⁻³ [see Materials and Methods]), reflecting an excess of rare variants relative to standard neutral expectations. The skew in the allele frequency spectrum explains the discrepancy between estimates of the effective population size obtained from π and from θ_W and suggests that one or more assumptions of the standard neutral model may be invalid. Because natural selection is unlikely to have affected these sequences, a plausible explanation is a demographic departure from model assumptions such as past population expansions, an old population bottleneck, or fine-scale population structure (Slatkin and Hudson 1991; Fay and Wu 1999; Wakeley and Aliacar 2001).
To assess whether the relative excess of rare variants is caused primarily by singletons, we used Fu and Li's D test, with the orangutan to infer ancestral and derived states (see Materials and Methods). Fu and Li's D test is based on the difference between η, the total number of variable sites, and η_s, the number of derived singletons (i.e., nonancestral mutations appearing only once in the sample). Fu and Li's D test is negative in eight out of the nine regions but is only significant at the 5% level for region 7 (table 1). The mean Fu and Li's D test across locus pairs is not significantly different from the neutral expectation 0 at the 5% level, although the P-value is low (P = 0.07). Furthermore, visual inspection of the entire frequency spectrum suggests that there is an overall excess of low-frequency alleles, rather than an excess of singletons alone (results not shown). Because the main effect of a simple growth model is expected to be on the proportion of singletons (Slatkin and Hudson 1991; Hey and Harris 1999) and the use of Fu and Li's D test as a test statistic should be more powerful than that of Tajima's D statistic (Fu 1997; Pluzhnikov, Di Rienzo, and Hudson 2002), a more likely explanation for the high proportion of rare alleles observed in central chimpanzees may be an old bottleneck (Simonsen, Churchill, and Aquadro 1995; Fay and Wu 1999) or fine-scale population structure (Wakeley and Aliacar 2001).
Chimpanzee Subspecies and Human Populations
To examine how DNA sequences in the two groups are evolutionarily related to each other, we built neighbor-joining trees for each locus pair, using human as an outgroup. In these trees, the DNA sequences from the western chimpanzees are always monophyletic, suggesting an old split time between populations (results not shown). Moreover, a large proportion of the genetic variance is caused by differences between the populations (Φ_ST = 0.62).
Figure 1 presents the polymorphic sites partitioned into variants that are shared between samples, fixed between the samples, or exclusive to one sample. Strikingly, only six out of 71 variable sites are shared between western and central chimpanzee samples, whereas four are fixed between samples. In contrast, in humans, 57 out of 139 polymorphisms are shared between Hausa and Chinese samples and none are fixed (Di Rienzo, personal communication); for other population pairs, the proportion of shared polymorphisms is the same or higher (results not shown). Thus, human populations appear to have much more of their evolutionary history in common than do subspecies of chimpanzees. If we assume a simple split model for the evolution of chimpanzee subspecies and estimate the split time from the locus pair data in figure 1 (see Materials and Methods), we obtain an ancestral effective population size estimate of N_A ≈ 51,000 and a population split time estimate of 650,000 years.
Bonobos and chimpanzees occupy ranges that are separated by the Congo-Lualaba river system, thought to have formed approximately 1.5 MYA (Beadle 1981); thus, a sudden split may not be an unreasonable model for the evolution of these two species. Although there are also rivers delimiting the range of chimpanzee subspecies, the historical boundaries are much less clear (Morin et al. 1994; Gagneux 2002), and there may have been ongoing migration between western and central subspecies. If so, application of a sudden split model would tend to lead to an overestimate of the ancestral effective population size and to an underestimate of the split time (cf. Wall [2003]).
Linkage Disequilibrium in Central Chimpanzees
Levels of linkage disequilibrium (LD) vary across the genome by chance as well as because of local differences in recombination and mutation (cf. Pritchard and Przeworski [2001]). Levels of LD are often summarized by a pairwise summary, r2, which measures the correlation between alleles (cf. Hartl and Clark [1997]). If we consider r2 values for all pairs of single-nucleotide polymorphisms with minor allele frequencies above 0.05 in the central chimpanzee data (fig. 2), there is a clear decay with physical distance, such that the mean r2 for sites separated by 1 kb or less is 0.174, whereas it is only 0.066 for sites separated by 8 to 10 kb.
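For reference, the pairwise summary can be computed from phased haplotypes as in the sketch below. It assumes two biallelic, polymorphic sites with no missing data; the function name and the toy haplotypes are hypothetical and this is not the DnaSP implementation.

```python
def r_squared(haplotypes, i, j):
    """Squared allelic correlation between sites i and j of phased haplotypes."""
    n = len(haplotypes)
    allele_a = haplotypes[0][i]   # arbitrary reference allele at site i
    allele_b = haplotypes[0][j]   # arbitrary reference allele at site j
    p_a = sum(h[i] == allele_a for h in haplotypes) / n
    p_b = sum(h[j] == allele_b for h in haplotypes) / n
    p_ab = sum(h[i] == allele_a and h[j] == allele_b for h in haplotypes) / n
    d = p_ab - p_a * p_b          # coefficient of linkage disequilibrium
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


print(r_squared(["AC", "AC", "GT", "GT"], 0, 1))  # alleles fully associated
print(r_squared(["AC", "AT", "GC", "GT"], 0, 1))  # alleles independent
```

The choice of reference allele at each site does not affect the result, because r² is symmetric with respect to allele labels.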
Over short distances, homologous gene conversion is likely to make an important contribution to the rate of genetic exchange and, hence, to the decay of LD (Andolfatto and Nordborg 1998). However, over larger distances, the effects of gene conversion will be negligible. Because genetic maps are based on markers 1 Mb or so apart, estimates of the recombination rate based on these markers (such as ρ_map) will essentially be estimates of the crossover rate alone (Przeworski and Wall 2001). In contrast, estimates based on patterns of LD over a small scale will be affected by both crossing-over and gene conversion.
To examine whether this difference might account for the poor fit of the predictions to our data, we coestimated ρ and the ratio of gene conversion to crossing-over events (f). Assuming that ρ and f are fixed across loci and that the mean gene conversion tract length is 500 bp (see Materials and Methods), we obtain ρ_cl = 0.0027 and f = 2, respectively. This estimate of the crossover rate is close to our a priori estimate (i.e., ρ_map = 0.0021). Furthermore, assuming the standard neutral model, we can marginally reject the null hypothesis of no gene conversion (Λ is smaller in four out of 100 simulations [see Materials and Methods]). The use of the standard neutral model is inappropriate, as there are more rare alleles in the data than expected under this model (and these are less informative about recombination rates). This said, if we try to predict the decay of E(r²) with distance from ρ_map, visual inspection of the data suggests a better fit with the inclusion of gene conversion (this is illustrated in figure 2 for f = 2). Given that gene conversion is a general feature of the recombination process across taxa (cf. Pittman and Schimenti [1998]) and that data from the same regions in humans also support the occurrence of gene conversion (Frisse et al. 2001), it seems a plausible explanation for the lower than expected levels of LD in central chimpanzees.
Estimates of recombination parameters obtained from the human and chimpanzee polymorphism data can be compared to assess the evidence for a large change in rates. The analysis of the same nine regions in a Hausa sample leads to ρ_cl = 0.0010 and f = 5. Simulations suggest that the values of f in Hausa and central chimpanzees are not significantly different from one another (Ptak et al. 2004; results not shown). Moreover, the ratio of ρ = 4N_e r estimates (2.7-fold) is roughly as expected from the ratio of N_e estimates (2.3-fold). Thus, on the basis of this small data set, there is no evidence that rates of recombination have changed between humans and chimpanzees. It should be noted, however, that this analysis is limited by the assumption that ρ is the same across loci. Although these loci were chosen to have similar large-scale recombination rates in humans, local heterogeneity in the recombination rate renders the interpolation of these rates to smaller scales problematic. Unfortunately, there is not enough information in these data to obtain reliable estimates of ρ for each locus pair separately (Ptak et al. 2004).
Discussion
As summarized in table 2, higher diversity in central than in western chimpanzees has been reported previously for an approximately 10-kb X-linked region (Kaessmann, Wiebe, and Pääbo 1999) and a number of intergenic regions (Yu et al. 2003). In these respects, available autosomal and X-linked data are consistent (with the exception of a 1-kb region near HOXB6). This said, levels of diversity vary substantially across loci; in particular, our diversity estimates are larger than found by Yu et al. (2003). Because there is a high proportion of rare alleles in central chimpanzees, one explanation may be the smaller sample of the Yu et al. (2003) study (10 chromosomes versus 28). For western chimpanzees, the discrepancy is harder to explain. One possibility is that it reflects differences in the origin of the western chimpanzees in the two studies; further sampling of western chimpanzees is needed to resolve this point.
In contrast to central chimpanzee samples, western samples do not harbor a high proportion of rare alleles. In fact, the mean D statistic value in western chimpanzees is very close to the standard neutral model expectation of 0. As can be seen in table 2, this observation is consistent with patterns at other autosomal loci but not with data from Xq13.3. Because Xq13.3 is only one locus, the discrepancy may be the result of chance; alternatively, it may reflect a difference between autosomal and sex-linked loci.
The difference in allele frequencies observed in samples of central and western chimpanzees points to divergent evolutionary histories of the two subspecies. Under a simple model of population history (see Materials and Methods), we estimate a split time of 430,000 to 650,000 years (depending on which data are used [see Results]) for western and central populations, substantially less than the 1.3 Myr estimated previously from the mtDNA (Morin et al. 1994). Based on the data collected by Yu et al. (2003), the split time for the Pan species does not appear much older (800,000 years).
Our estimate of the bonobo/chimpanzee split is more recent than previously published estimates: 930,000 years for Xq13.3 (Kaessmann, Wiebe, and Pääbo 1999), 1.5 Myr for data from the Y chromosome (Stone et al. 2002), 1.6 Myr for mtDNA data (Morin et al. 1994), and 1.8 Myr for the same data (Yu et al. 2003). However, with the exception of Stone et al. (2002), these estimates are based on pairwise differences between species, which reflect not only differences accumulated since the split but also ancestral polymorphism. In other words, these are actually estimates of the coalescence time of a chimpanzee and a bonobo sequence and not of the split time of the two species. If the split time was recent or the ancestral population size was large, as appears to be the case here, these two times will be substantially different (Rosenberg and Feldman 2002). As an illustration, when we reanalyzed the data of Yu et al. (2003), we obtained estimates of 43,000 for the ancestral effective population size and 800,000 years for the split time, less than half their estimate of the coalescence time. Thus, the discrepancy between estimates reflects, in part, a difference in what is being estimated. Nonetheless, there remains substantial variability in the split time estimates based on different regions (e.g., between ours and those obtained by Stone et al. [2002] for the NRY). More data (and methods that use more of the data) are clearly needed for an accurate estimate of these parameters.
On the basis of existing data, it appears that divergence levels between bonobo and common chimpanzee at random loci are not much higher than what is observed between subspecies of chimpanzees. In contrast, phenotypic differences are much greater between the two species, consistent with their taxonomic designations. For example, while the species can be distinguished in captivity (by humans), the subspecies cannot. Moreover, morphometric studies of craniofacial variation find much larger differences between species than between subspecies (Guy et al. 2003; Taylor and Groves 2003). In this respect, it is interesting to note that in humans, where there is much less genetic differentiation between populations than in chimpanzees, ancestry from different continents can often be reasonably well predicted from craniofacial features (e.g., Lynch, Wood, and Luboga 1996; Relethford 2002). This uncoupling of phenotypic differentiation and genetic differentiation at random markers may reflect greater genetic drift in humans and bonobos relative to chimpanzee subspecies or natural selection associated with the more diverse habitats exploited by humans (Akey et al. 2002; Kayser, Brauer, and Stoneking 2003) as well as possibly by bonobos (Myers-Thompson 2003).
Outlook
With the imminent publication of the chimpanzee genome, there is great interest in contrasting patterns of variation in humans and chimpanzees, in particular to estimate recombination rates and identify targets of natural selection. Both of these aims rely on a demographic model for the history of chimpanzees. These data suggest that the standard model for population genetic analyses is a poor description of the demographic history of chimpanzees. Indeed, given the complex demographic history apparent in these data, it appears that an understanding of the population history of chimpanzees will require extensive data collected with careful attention to geography.
Acknowledgements
![]() |
Footnotes |
---|
David B. Goldstein, Associate Editor
Literature Cited |
---|
Akey, J. M., G. Zhang, K. Zhang, L. Jin, and M. D. Shriver. 2002. Interrogating a high-density SNP map for signatures of natural selection. Genome Res. 12:1805-1814.
Andolfatto, P., and M. Nordborg. 1998. The effect of gene conversion on intralocus associations. Genetics 148:1397-1399.
Beadle, L. C. 1981. The inland waters of tropical Africa. Longman, London.
Chen, F. C., and W. H. Li. 2001. Genomic divergences between humans and other hominoids and the effective population size of the common ancestor of humans and chimpanzees. Am. J. Hum. Genet. 68:444-456.
Chen, F. C., E. J. Vallender, H. Wang, C. S. Tzeng, and W. H. Li. 2001. Genomic divergence between human and chimpanzee estimated from large-scale alignments of genomic sequences. J. Hered. 92:481-489.
Deinard, A., and K. Kidd. 1999. Evolution of a HOXB6 intergenic region within the great apes and humans. J. Hum. Evol. 36:687-703.
Dufour, C., D. Casane, D. Denton, J. Wickings, P. Corvol, and X. Jeunemaitre. 2000. Human-chimpanzee DNA sequence variation in the four major genes of the renin angiotensin system. Genomics 69:14-26.
Ebersberger, I., D. Metzler, C. Schwarz, and S. Pääbo. 2002. Genomewide comparison of DNA sequences between humans and chimpanzees. Am. J. Hum. Genet. 70:1490-1497.
Enard, W., M. Przeworski, S. E. Fisher, C. S. Lai, V. Wiebe, T. Kitano, A. P. Monaco, and S. Pääbo. 2002. Molecular evolution of FOXP2, a gene involved in speech and language. Nature 418:869-872.
Excoffier, L., P. E. Smouse, and J. M. Quattro. 1992. Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 131:479-491.
Fay, J. C., and C. I. Wu. 1999. A human population bottleneck can account for the discordance between patterns of mitochondrial versus nuclear DNA variation. Mol. Biol. Evol. 16:1003-1005.
Ferris, S. D., W. M. Brown, W. S. Davidson, and A. C. Wilson. 1981. Extensive polymorphism in the mitochondrial DNA of apes. Proc. Natl. Acad. Sci. USA 78:6319-6323.
Frisse, L., R. R. Hudson, A. Bartoszewicz, J. D. Wall, J. Donfack, and A. Di Rienzo. 2001. Gene conversion and different population histories may explain the contrast between polymorphism and linkage disequilibrium levels. Am. J. Hum. Genet. 69:831-843.
Fu, Y. X. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915-925.
Fu, Y. X., and W. H. Li. 1993. Statistical tests of neutrality of mutations. Genetics 133:693-709.
Gagneux, P. 2002. The genus Pan: population genetics of an endangered outgroup. Trends Genet. 18:327-330.
Gagneux, P., C. Wills, U. Gerloff, D. Tautz, P. A. Morin, C. Boesch, B. Fruth, G. Hohmann, O. A. Ryder, and D. S. Woodruff. 1999. Mitochondrial sequences show diverse evolutionary histories of African hominoids. Proc. Natl. Acad. Sci. USA 96:5077-5082.
Gilad, Y., C. D. Bustamante, D. Lancet, and S. Pääbo. 2003. Natural selection on the olfactory receptor gene family in humans and chimpanzees. Am. J. Hum. Genet. 73:489-501.
Gonder, M. K., J. F. Oates, T. R. Disotell, M. R. Forstner, J. C. Morales, and D. J. Melnick. 1997. A new West African chimpanzee subspecies? Nature 388:337.
Guy, F., M. Brunet, M. Schmittbuhl, and L. Viriot. 2003. New approaches in hominoid taxonomy: morphometrics. Am. J. Phys. Anthropol. 121:198-218.
Hartl, D. L., and A. G. Clark. 1997. Principles of population genetics. Sinauer Associates, Sunderland, MA.
Hey, J., and E. Harris. 1999. Population bottlenecks and patterns of human polymorphism. Mol. Biol. Evol. 16:1423-1426.
Hill, W. G., and A. Robertson. 1968. The effects of inbreeding at loci with heterozygote advantage. Genetics 60:615-628.
Hudson, R. R. 2001a. Linkage disequilibrium and recombination. Pp. 309-318 in D. J. Balding, M. Bishop, and C. Cannings, eds. Handbook of statistical genetics. John Wiley & Sons, Ltd, Chichester, UK.
Hudson, R. R. 2001b. Two-locus sampling distributions and their application. Genetics 159:1805-1817.
Hudson, R. R., M. Kreitman, and M. Aguade. 1987. A test of neutral molecular evolution based on nucleotide data. Genetics 116:153-159.
Huttley, G. A., S. Easteal, M. C. Southey, A. Tesoriero, G. G. Giles, M. R. McCredie, J. L. Hopper, and D. J. Venter. 2000. Adaptive evolution of the tumour suppressor BRCA1 in humans and chimpanzees. Australian Breast Cancer Family Study. Nat. Genet. 25:410-413.
Ingman, M., H. Kaessmann, S. Pääbo, and U. Gyllensten. 2000. Mitochondrial genome variation and the origin of modern humans. Nature 408:708-713.
Jensen-Seaman, M. I., A. S. Deinard, and K. K. Kidd. 2001. Modern African ape populations as genetic and demographic models of the last common ancestor of humans, chimpanzees, and gorillas. J. Hered. 92:475-480.
Kaessmann, H., V. Wiebe, G. Weiss, and S. Pääbo. 2001. Great ape DNA sequences reveal a reduced diversity and an expansion in humans. Nat. Genet. 27:155-156.
Kaessmann, H., V. Wiebe, and S. Pääbo. 1999. Extensive nuclear DNA sequence diversity among chimpanzees. Science 286:1159-1162.
Kayser, M., S. Brauer, and M. Stoneking. 2003. A genome scan to detect candidate regions influenced by local natural selection in human populations. Mol. Biol. Evol. 20:893-900.
Kimura, M. 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.
Kitano, T., C. Schwarz, B. Nickel, and S. Pääbo. 2003. Gene diversity patterns at 10 X-chromosomal loci in humans and chimpanzees. Mol. Biol. Evol. 20:1281-1289.
Kong, A., D. F. Gudbjartsson, and J. Sainz, et al. (13 co-authors). 2002. A high-resolution recombination map of the human genome. Nat. Genet. 31:241-247.
Lander, E. S., L. M. Linton, and B. Birren, et al. (255 co-authors). 2001. Initial sequencing and analysis of the human genome. Nature 409:860-921.
Lynch, J. M., C. G. Wood, and S. A. Luboga. 1996. Geometric morphometrics in primatology: craniofacial variation in Homo sapiens and Pan troglodytes. Folia Primatol. (Basel) 67:15-39.
McVean, G. A. 2002. A genealogical interpretation of linkage disequilibrium. Genetics 162:987-991.
Morin, P. A., J. J. Moore, R. Chakraborty, L. Jin, J. Goodall, and D. S. Woodruff. 1994. Kin selection, social structure, gene flow, and the evolution of chimpanzees. Science 265:1193-1201.
Myers-Thompson, J. A. 2003. A model of the biogeographical journey from Proto-Pan to Pan paniscus. Primates 44(2):191-197.
Nachman, M. W., and S. L. Crowell. 2000. Estimate of the mutation rate per nucleotide in humans. Genetics 156:297-304.
Napier, J. R., and P. H. Napier. 1967. A handbook of living primates. Academic Press, New York.
Nei, M., and W. H. Li. 1979. Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc. Natl. Acad. Sci. USA 76:5269-5273.
Olson, M. V., and A. Varki. 2003. Sequencing the chimpanzee genome: insights into human evolution and disease. Nat. Rev. Genet. 4:20-28.
Pittman, D. L., and J. C. Schimenti. 1998. Recombination in the mammalian germ line. Curr. Topics Dev. Biol. 37:1-35.
Pluzhnikov, A., A. Di Rienzo, and R. R. Hudson. 2002. Inferences about human demography based on multilocus analyses of noncoding sequences. Genetics 161:1209-1218.
Pritchard, J. K., and M. Przeworski. 2001. Linkage disequilibrium in humans: models and data. Am. J. Hum. Genet. 69:1-14.
Przeworski, M., R. R. Hudson, and A. Di Rienzo. 2000. Adjusting the focus on human variation. Trends Genet. 16:296-302.
Przeworski, M., and J. D. Wall. 2001. Why is there so little intragenic linkage disequilibrium in humans? Genet. Res. 77:143-151.
Ptak, S. E., and M. Przeworski. 2002. Evidence for population growth in humans is confounded by fine-scale population structure. Trends Genet. 18:559-563.
Ptak, S. E., K. Voelpel, and M. Przeworski. 2004. Insights into recombination from patterns of linkage disequilibrium in humans. Genetics (in press).
Relethford, J. H. 2002. Apportionment of global human genetic diversity based on craniometrics and skin color. Am. J. Phys. Anthropol. 118:393-398.
Rosenberg, N. A., and M. W. Feldman. 2002. The relationship between coalescence times and population divergence times. Oxford University Press, Oxford, UK.
Rozas, J., and R. Rozas. 1999. DnaSP version 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15:174-175.
Ruvolo, M. 1997. Molecular phylogeny of the hominoids: inferences from multiple independent DNA sequence data sets. Mol. Biol. Evol. 14:248-265.
Satta, Y. 2001. Comparison of DNA and protein polymorphisms between humans and chimpanzees. Genes Genet. Syst. 76:159-168.
Simonsen, K. L., G. A. Churchill, and C. F. Aquadro. 1995. Properties of statistical tests of neutrality for DNA polymorphism data. Genetics 141:413-429.
Slatkin, M., and R. R. Hudson. 1991. Pairwise comparisons of mitochondrial DNA sequences in stable and exponentially growing populations. Genetics 129:555-562.
Stephens, J. C., J. A. Schneider, and D. A. Tanguay, et al. (25 co-authors). 2001. Haplotype variation and linkage disequilibrium in 313 human genes. Science 293:489-493.
Stephens, M., N. J. Smith, and P. Donnelly. 2001. A new statistical method for haplotype reconstruction from population data. Am. J. Hum. Genet. 68:978-989.
Stone, A. C., R. C. Griffiths, S. L. Zegura, and M. F. Hammer. 2002. High levels of Y-chromosome nucleotide diversity in the genus Pan. Proc. Natl. Acad. Sci. USA 99:43-48.
Tajima, F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585-595.
Taylor, A. B., and C. P. Groves. 2003. Patterns of mandibular variation in Pan and Gorilla and implications for African ape taxonomy. J. Hum. Evol. 44:529-561.
Teleki, G. 1989. Population status of wild chimpanzees (Pan troglodytes) and threats to survival. Harvard University Press, Cambridge, Mass.
Thomson, R., J. K. Pritchard, P. Shen, P. J. Oefner, and M. W. Feldman. 2000. Recent common ancestry of human Y chromosomes: evidence from DNA sequence data. Proc. Natl. Acad. Sci. USA 97:7360-7365.
Wakeley, J., and N. Aliacar. 2001. Gene genealogies in a metapopulation. Genetics 159:893-905.
Wakeley, J., and J. Hey. 1997. Estimating ancestral population parameters. Genetics 145:847-855.
Wall, J. D. 2003. Estimating ancestral population sizes and divergence times. Genetics 163:395-404.
Wall, J. D., L. A. Frisse, R. R. Hudson, and A. Di Rienzo. 2003. Comparative linkage disequilibrium analysis of the β-globin hotspot in primates. Am. J. Hum. Genet. 73:1330-1340.
Watterson, G. 1975. On the number of segregating sites in genetical models without recombination. Theor. Pop. Biol. 7:256-276.
Whiten, A., J. Goodall, W. C. McGrew, T. Nishida, V. Reynolds, Y. Sugiyama, C. E. Tutin, R. W. Wrangham, and C. Boesch. 1999. Cultures in chimpanzees. Nature 399:682-685.
Wise, C. A., M. Sraml, D. C. Rubinsztein, and S. Easteal. 1997. Comparative nuclear and mitochondrial genome diversity in humans and chimpanzees. Mol. Biol. Evol. 14:707-716.
Wiuf, C., and J. Hein. 2000. The coalescent with gene conversion. Genetics 155:451-462.
Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13:555-556.
Yi, S., D. L. Ellsworth, and W. H. Li. 2002. Slow molecular clocks in Old World monkeys, apes, and humans. Mol. Biol. Evol. 19:2191-2198.
Yu, N., M. I. Jensen-Seaman, L. Chemnick, J. R. Kidd, A. S. Deinard, O. Ryder, K. K. Kidd, and W. H. Li. 2003. Low nucleotide diversity in chimpanzees and bonobos. Genetics 164:1511-1518.
Zhao, Z., L. Jin, and Y. X. Fu, et al. (10 co-authors). 2000. Worldwide DNA sequence variation in a 10-kilobase noncoding region on human chromosome 22. Proc. Natl. Acad. Sci. USA 97:11354-11358.