McDonald Institute for Archaeological Research, University of Cambridge, Cambridge, England;
Dipartimento di Genetica e Microbiologia, Università di Pavia, Pavia, Italy; and
Mathematisches Seminar, Universität Hamburg, Hamburg, Germany
Abstract |
---|
Introduction |
---|
With this overall picture of mitochondrial prehistory coming into focus, geneticists soon shifted their attention to the question of whether modern European, Asian, Papuan, and Australian mtDNA types derive from an uninterrupted demographic expansion of the out-of-Africa founders (strong Garden of Eden model), or whether an initial expansion was followed by the formation of regional gene pools, which, after a period of isolation and drift, expanded demographically and geographically to form the present mtDNA variation in different continents and regions (weak Garden of Eden model). One early methodology with which to explore these alternative scenarios was that of pairwise sequence difference distributions, or "mismatch distributions," which were identified with global demographic expansions 80,000 to 40,000 years ago, thought to be in agreement with the weak Garden of Eden model (Harpending et al. 1993; Sherry et al. 1994). However, the mismatch distribution approach as used by Harpending et al. (1993) and Sherry et al. (1994) relied on implicit assumptions concerning the underlying mtDNA tree (Bandelt and Forster 1997). Nevertheless, independent phylogenetic studies on mtDNA restriction fragment length polymorphisms (RFLPs) in Europeans (Torroni et al. 1996), Asians (Ballinger et al. 1992a, 1992b; Torroni et al. 1993b, 1994c; Starikovskaya et al. 1998; Schurr et al. 1999), Papuans (Stoneking et al. 1990), Americans (Torroni et al. 1993a, 1994a), and Indians (Kivisild et al. 1999) implicitly confirmed the weak Garden of Eden model by finding distinct and phylogenetically deep mtDNA branches in each continent or region.
In this study, we present a chronology for the out-of-Africa migration and the onset of demographic expansions in Papua New Guinea and Asia. To achieve this aim, we first identified starlike mtDNA clusters diagnostic for demographic expansions by applying a new phylogenetic star contraction algorithm on high-resolution (14-enzyme) mtDNA RFLP data for 826 Asians and Papuans (fig. 1). The star contraction method identifies and distinguishes starlike phylogenetic clusters from nonstarlike branches according to a parameter specifying mutational time depth, one that is akin to cutting through a bush at a fixed height with a trimmer and then examining the diameters of the branches. An increased diameter, i.e., a greater molecule census, of any single cluster may arguably be the result of local circumstance; however, we observed that the clusters grouped together in geographically and temporally distinct sets, with each set therefore indicating a general demographic expansion. Genetic dating of these sets of clusters then allowed us to identify the relative and absolute times of different demographic expansions in Asia, which ultimately led to the peopling not only of the whole of Asia and Papua New Guinea, but also of America and Polynesia. To compare our RFLP-based results with published mtDNA control region sequences, we took advantage of the hitherto unpublished correspondence table (appendix) linking the samples RFLP-typed by Cann, Stoneking, and Wilson (1987) and Stoneking et al. (1990) with the overlapping sample set sequenced for the mtDNA control region by Vigilant et al. (1991).
Molecules, Humans, and Populations
Two radically different approaches are currently popular in the endeavor to reconstruct human prehistory by means of molecular genetics (Pritchard and Feldman 1996; Risch, Kidd, and Tishkoff 1996; Stumpf and Goldstein 2001). There are those geneticists who set out from the DNA molecule as the basic unit of investigation (Forster et al. 1996; Torroni et al. 1998; Quintana-Murci et al. 1999; Richards et al. 2000), and there are those who set out from a (suitably defined) population as the basic unit (e.g., Relethford and Jorde 1999). Both approaches then attempt to reconstruct the prehistory of their respective units, delivering results which are not intended to be comparable (Harpending et al. 1998). For example, an increase in one type of molecule does not necessarily entail a net increase of humans in the "population." While the molecular approach aspires to reconstruct the evolution of a genetic locus with the long-term aim of assembling many independent locus histories into a history of (human) evolution (Templeton 1998), the population approach attempts to take a short cut by implicitly or explicitly postulating a complete population model which is then tested on the basis of the data. To prevent confusion over terminology, we briefly summarize in this section some concepts of the molecular approach we adopt in this paper. In molecular mtDNA usage, an mtDNA type is said to have "geographically expanded" to a greater or lesser degree according to its observed greater or lesser geographic distribution and the distribution of its descendant types, i.e., the distribution of the clade (haplogroup). (If an mtDNA clade is very widespread, e.g., the L2/L3 clade which is found in most of Africa and which is ancestral to more than 99% of non-African mtDNA types, then it goes without saying that the geographic expansion can only have been effected by concomitant "demographic expansion" of women carrying that mtDNA type.) While the geographic expansion of a molecule can be directly ascertained if no sweeping extinctions have occurred, a demographic expansion of a molecule is usually indirectly inferred from the more starlike or less starlike phylogenetic structure of its clade. The degree of "starlikeness" can be measured by the pairwise difference distribution within the clade (Watson et al. 1997) or by a "star index" (Slatkin 1996; Mateu et al. 1997; Torroni et al. 1998). Incidentally, in contrast to population terminology, the expression "constant-sized" has no application here: every observed molecule type has expanded from nonexistence to existence, whereas populations are usually defined as initially consisting of a number of individuals, which can increase, remain constant, or decrease until the time level of observation.
In this paper, we attempt a more direct approach to classifying molecules into demographically "more expanded" and "less expanded" types, namely, by taking a direct census of the number of molecules descended from a strongly expanded ancestral molecule. (Strongly expanded molecules, relative to other molecules, are first determined using our star density measure. This measure includes the feature that a descendent molecule which itself initiates a major expansion is classified separately as an expansion type.) Although taking a census of molecules to evaluate expansion may sound trivial, this approach is not equivalent to its theoretical counterpart of counting humans: an increase in the number of humans may be obscured by subsequent population declines in a quite unpredictable and unreconstructable manner. This is the strength of the molecular approach: population increase, decline, or mixing can systematically change neither the relative number of molecules descended from an ancestral molecule, nor their average mutational distance, nor therefore the time estimate of coalescence to that ancestral molecule or the starlike signature of expansion.
Materials and Methods |
---|
Data Revision
Australians and Asians of Cann, Stoneking, and Wilson (1987)
The Cann, Stoneking, and Wilson (1987) worldwide data consist of 46 Caucasians (Caucasoids), 34 Asians, 26 Papua New Guineans, 21 Australians, 18 African Americans, and 2 African-born Africans. The Papua New Guinean individuals are a subset of the Stoneking et al. (1990) Papuans (see below). We eliminated four sites, as they could not be experimentally confirmed (see table 1 for the letter codes used to specify the enzymes): "8j," "1484e," "7750c," and "13031g" (M. Stoneking, personal communication). Furthermore, "1403a" was corrected to 10397a, as pointed out by Ballinger et al. (1992a). We added site +185l in Cann types 8 and 9 in accordance with the corresponding Vigilant sequences (appendix). The publication by Cann, Stoneking, and Wilson (1987) does not specify length variability of the 9-bp duplication at nucleotide positions (nps) 8272 through 8289, but this information is available: individuals 44, 62, 71, 73, 74 (Asians), and 72 (African American) have a 9-bp deletion (Wrishnik et al. 1987). Even with these corrections, the Cann data set was the only RFLP set unable to produce a low-dimensional phylogenetic network (not shown), indicating undetected data problems (Bandelt et al. 1995). For this reason, we discarded the Cann data from further analyses.
Southeast Asians of Ballinger et al. (1992a, 1992b)
All 153 southeast Asians (14 Malaysian Chinese, 14 Malays, 32 Malay Aborigines "Orang Asli," 32 Sabah Borneo Aborigines, 20 Taiwanese Han, 28 Vietnamese, and 13 South Koreans) were incorporated into the present study. We transferred site -4711i from mtDNA type 68 to mtDNA type 76, as explained in the corrigendum by Ballinger et al. (1992b). The -4685a variant was transferred from type 69 to type 77 in our file (A. Torroni, personal communication). The site 6618e in Ballinger et al.'s (1992a) appendix B is inconsistent with clade G in the parsimony tree in their figure 2; original documentation, however, is not available. Sites "13284a" and "13284e" are artifacts (T. Schurr, personal communication) and were deleted here. Site "+4735k" in KN99 was changed to 4732k to agree with Tib142 (Torroni et al. 1994c) and Ev20 (Torroni et al. 1993b). Site "15431e" in KN102 was renamed 15437e to agree with RFLP type SIB40 of Schurr et al. (1999) and Starikovskaya et al. (1998): KN102 and SIB40 (found in one Chukchi, four Siberian Eskimos, and two Koryaks) are mtDNA clade D and, moreover, share the absence of site 10180l. The ambiguous sequence region according to Anderson et al. (1981) is as follows: 15430-C GCC CTC GGC TAC-15442. The corresponding amino acid sequence is ALGL. Site 15431e implies a change from A to G, and site 15437e implies a change from L to H. The following sites were corrected mainly by consulting laboratory notes: "1063e" is 1062e; "3569j" is 3659j; "7672g" is 7672j; "8270k" is probably 8269k; the purported single site change "8569c" in type 110 is a double site, +8569c/-8572e; "9329f" is 9327f; "10256j" is 10256r; "11557b" is 11577b; "15660c" is 15460c; and "+16096g" is -16096k. The following sites are incorrect or misplaced, but documentation is not available: "160f," "3659o," "6534e" (presumably an "abbreviation" of 16534e: both "6534e" and 16534e are recorded for VN47), "9386e," "13180j," "15595e," and "16512l." These incorrect or misplaced sites were retained, as they do not create phylogenetic conflicts.
Tibetans of Torroni et al. (1994c)
The complete set of 54 Tibetans was used in the present study. Tibetan 118 was corrected as having +10394c and +10397a (Bandelt, Forster, and Röhl 1999). We corrected the following positions: "5259b" is 5260b, "6331b" is 6332b, "14773c" is 14774c, and "16388e" is 16398e.
Chukchi and Siberian Eskimos of Starikovskaya et al. (1998)
The complete data set of 145 individuals (79 Siberian Eskimos and 66 Chukchi) was used in this study. We note that variation at nucleotide position (np) 16311 was not detected by 16310k.
Kamchatkans of Schurr et al. (1999)
The complete data set of 202 Kamchatkans (56 Aluitor Koryaks, 44 Karagin Koryaks, 55 Palan Koryaks, and 47 Itel'men) was used in this study. We note that variation at np 16311 was not detected by 16310k.
Data Harmonization
The five published RFLP studies use enzyme sets which overlap but are not quite congruent (see table 1), necessitating harmonization. In the harmonized data we used for the analysis, the additional enzymes p, q, r, and s employed only by Ballinger et al. (1992a) were deleted. These enzymes recognize longer recognition sites and therefore cut infrequently: in the diverse Ballinger et al. (1992a) data, p and s never cut, q cuts once but can be expressed as o, and r cuts in only three mtDNA types. Furthermore, due to overlapping recognition sequences, enzymes m and n used in the Ballinger et al. (1992a) and Torroni et al. (1993b, 1994c) studies can always be expressed by j and f, respectively, in the studies used here. Enzyme h could usually be expressed as enzyme o in the four studies in which both were used (table 1). The only exceptions were in the Ballinger et al. (1992a) data, for which we retained the independent 12406h variant of Asian 19, in the Schurr et al. (1999) data, for which we retained 1004h in Koryak 66, and in the Cann, Stoneking, and Wilson (1987) data, for which we retained 3592h. We omitted enzyme FnuDII, employed only by Cann, Stoneking, and Wilson (1987) and Stoneking et al. (1990); this enzyme detects variation in only one of the 119 Papuan individuals. We expressed sites +12345k and 64i by the alternative sites +12528k and 16494i, respectively, to harmonize the Cann, Stoneking, and Wilson (1987)/Stoneking et al. (1990) data with the Ballinger et al. (1992a) data. We consistently scored variation at 16517e as gains (some publications scored it as losses). The harmonization hence loses a minimum of information with respect to the standard 14-enzyme system, permitting direct comparisons with the RFLP mutation rate estimates based on European and Amerind mtDNA (Torroni et al. 1998), and with the African RFLP data of Chen et al. (1995). A file (asiapng.tor) of the revised and harmonized Asian data (excluding the Asians and Australians of Cann, Stoneking, and Wilson [1987]) is available at http://www.fluxus-engineering.com.
Star Contraction Algorithm
The aim of the star contraction algorithm is to identify starlike clusters of sequences in a given sequence set and contract these clusters to single representative sequences. The resulting reduced sequence set can be entered into a phylogenetic algorithm to generate a tree or network. There are two potential applications for this method. First, large population data sets (several hundred to over a thousand) are rapidly becoming the norm in population genetic studies, and it is becoming increasingly difficult to display the corresponding phylogeny as a figure in a publication or even to visually analyze it on a computer screen. The star contraction algorithm in conjunction with a phylogenetic analysis can display the much smaller "skeleton" of the tree, with the clusters indicated as single nodes. The second application of the star contraction method is to rigorously define dense starlike clusters which are potentially diagnostic for demographic expansions. The time to coalescence of such phylogenetic clusters can then be dated via the molecular clock and compared with historic or prehistoric records of other disciplines.
The following algorithm specifies a certain time depth or mutational distance radius (range 0 to ∞) up to which clusters are to be identified, with the founding sequence of a cluster being reconstructed if it is absent in the original data (step 2). Any node ancestral to a cluster is excluded from contraction into the cluster (step 9). Potential clusters are evaluated on the basis of a star density measure (step 3). Sequences which are equidistant to two or more potential stars are preferentially assigned to existing sequences rather than to inferred nodes (steps 4-6). The following algorithm can be executed several times in succession (step 11) to progressively contract the network, but the conditions in step 3 ensure that successive contraction rounds converge on a skeleton phylogeny rather than on a single node.
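The idea of contracting everything within a fixed mutational radius of a star center can be illustrated with a deliberately simplified toy. This is not the published star contraction algorithm: it does not reconstruct absent founders, does not exclude ancestral nodes, and replaces the star density measure with a hypothetical rule (the most frequent remaining type becomes the next center). Site patterns are assumed to be equal-length 0/1 strings, and all names are illustrative.

```python
from collections import Counter

def hamming(a: str, b: str) -> int:
    """Number of sites at which two equal-length site strings differ."""
    return sum(x != y for x, y in zip(a, b))

def contract_once(samples: list[str], radius: int) -> dict[str, int]:
    """Toy contraction round (assumed rule, not the paper's steps):
    repeatedly take the most frequent remaining type as a center and
    absorb every remaining type within `radius` mutations of it.
    Returns a mapping center -> number of sampled molecules absorbed."""
    counts = Counter(samples)
    clusters: dict[str, int] = {}
    while counts:
        # Ties are broken lexically so the result is deterministic.
        center = min(counts, key=lambda t: (-counts[t], t))
        members = [t for t in counts if hamming(t, center) <= radius]
        clusters[center] = sum(counts[t] for t in members)
        for t in members:
            del counts[t]
    return clusters

# Hypothetical sample: one dense star around 0000, one small star around 1111.
samples = ["0000"] * 5 + ["1000", "0100", "0010"] + ["1111"] * 2 + ["1110"]
print(contract_once(samples, radius=1))
```

With a radius of 0 nothing is contracted and every type remains its own node; larger radii merge progressively deeper clusters, which is the role the time-depth parameter plays in the text.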
Phylogenetic and Demographic Analysis
Prior to the star contraction analysis, we first deleted three RFLP sites from the data set: sites 16303k and 16310k, which were difficult to detect in some studies for technical reasons, and site 16517e due to its erratic, possibly directional, mutation mechanism (Chen et al. 1995; Forster et al. 1997). For the star contraction analysis, all the RFLP data sets described above except for the Cann, Stoneking, and Wilson (1987) data were entered in the star contraction option of Network 3.0. Three rounds of star contraction were run, with the mutational distance radius set at 3 mutations, corresponding to the coalescence time of the two African mtDNA founders for modern Eurasians and Papuans (see Results). The star contraction output file was then entered into the reduced median network (RM) option (Bandelt et al. 1995) using the default settings except for the weighting of three sites as follows: sites 10394c/10397a were weighted half to counter known recurrent double-site losses through a single point mutation (Bandelt, Forster, and Röhl 1999), and site 1715c was weighted half due to its known hypervariability (Macaulay et al. 1999). Finally, the RM output file was entered into the median-joining (MJ) network option (Bandelt, Forster, and Röhl 1999), using default parameters and the same weighting system, to eliminate nonparsimonious links (Forster et al. 2000). In this final network, any node containing more than 1% of the total number of individuals (i.e., at least 9 given the sample size of 826) was defined as a demographic expansion cluster. This threshold was chosen for the practical reason that statistical analysis of samples of size <9 does not appear sensible. In other words, for the detection of smaller demographic expansions, much larger sample sizes would be necessary. (Note that our procedure only distinguishes dense from less dense phylogenetic nodes and stars, indicating greater or lesser demographic expansion. We do not seek to draw an artificial line between "stationary" and "expanding" lineages.) The resulting expansion clusters were then analyzed for geographic distribution and for expansion age using the RFLP mutation rate.
RFLP Mutation Rate and Genetic Dating
The mutation rate of the 14-enzyme RFLP typing system for the whole mtDNA molecule has previously been calibrated (Torroni et al. 1998) using Amerindian samples which had been both sequenced for the first hypervariable region (np16090-np16365) of the mtDNA control region and RFLP-typed for sites throughout the molecule. A comparison of the control region tree with the RFLP tree yielded a ratio of 1.21 control region mutations for 1 RFLP mutation. Macaulay et al. (1999) performed a similar comparison in a European sample but obtained a ratio of only 0.82 control region mutations for 1 RFLP mutation. The apparently large discrepancy was only partly mitigated when the authors confined their comparison to shallow clusters in their trees (to avoid overlooking saturated recurrent mutations). A major reason for the discrepancy was that 10 of the RFLP mutations had in fact been detected by control region sequencing, namely, seven 16310k mutations, two 16303k mutations, and one 16208k mutation (V. Macaulay, personal communication), and one RFLP mutation (12308g) had been detected by a mismatched primer. When considering only the shallow clusters in their phylogeny (H, pre-HV, J, T, K, U1, U5, R1, and X) this still leaves seven RFLP mutations which would not have been detected by conventional RFLP typing in most of the RFLP publications used here. After subtracting these mutations, we obtained a control region/RFLP mutation ratio for their tree of 0.95. For the purpose of this paper, we adopted the average ratio between 0.95 and 1.21, namely 1.08, which translates into 1 RFLP mutation every 21,800 years using the control region calibration of Forster et al. (1996) and Saillard et al. (2000). This point estimate is similar to the calibration of Horai et al. (1995) using primate mtDNA. As a time measure for dating nodes in the phylogeny, we use the demographically unbiased parameter ρ, which is the average mutational distance to the node of interest (Morral et al. 1994; Forster et al. 1996). For estimating the standard error σ of ρ, we employed the method of Saillard et al. (2000). The values for ρ and σ are converted into years by multiplication with the mutation rate. The standard error σ does not include uncertainty in the mutation rate. However, any future improved calibration for the mutation rate can directly be multiplied with the ρ and σ values presented throughout this paper to obtain improved absolute time estimates. Thus, for example, if the out-of-Africa migration date were doubled from 55,000 years to 110,000 years (e.g., to accommodate the Skhul/Qafzeh remains as our ancestors), then the mtDNA date for the migration into the Americas would correspondingly increase from 25,000 to 50,000 years.
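The rate conversion and the dating step described above amount to simple arithmetic, sketched here for illustration. The control-region figure of roughly 20,180 years per mutation is an assumption taken to represent the cited Forster et al. (1996) calibration (it is not stated in this text), the function names are hypothetical, and the distances in the usage line are made up. The standard error calculation of Saillard et al. (2000) is not reproduced.

```python
def rflp_years_per_mutation(cr_per_rflp_ratios, cr_years_per_mutation):
    """Average the control-region/RFLP ratios, then scale the
    control-region clock to obtain years per RFLP mutation."""
    mean_ratio = sum(cr_per_rflp_ratios) / len(cr_per_rflp_ratios)
    return mean_ratio * cr_years_per_mutation

def mean_distance_to_node(distances):
    """Average mutational distance from sampled molecules to the
    ancestral node of interest (the time measure used for dating)."""
    return sum(distances) / len(distances)

# Assumed control-region calibration (years per mutation); see lead-in.
CR_YEARS_PER_MUTATION = 20_180

rate = rflp_years_per_mutation([0.95, 1.21], CR_YEARS_PER_MUTATION)
print(round(rate, -2))  # years per RFLP mutation, rounded to the nearest 100

# Hypothetical cluster: mutational distances of six molecules to their founder.
age_years = mean_distance_to_node([0, 1, 1, 2, 1, 1]) * rate
print(round(age_years))
```

Because the age is a plain product of the average distance and the rate, any recalibrated rate rescales every date in proportion, which is the point made by the doubling example in the text.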
Results |
---|
At this point, we compressed the infinite-sites model into the finite-sites model by assigning several independent mutation events to a single nucleotide position (parallel mutations and reversions); nevertheless, the historical order of mutations was stored in the computer, allowing later comparisons of the real molecular tree with the reconstructed phylogenies.
The simulation spanned a period of 60,000 years (2,400 generations) with a nearly stagnant census punctuated by the following phases of increase and decline: expansion to 1,500 molecules (start at generation 20, end at generation 130); expansion to 2,500 molecules (start at generation 400, end at generation 450); contraction to 1,000 molecules (start at generation 700, end at generation 750); expansion to 5,000 molecules (start at generation 850, end at generation 950); contraction to 3,500 molecules (start at generation 1200, end at generation 1220); contraction to 3,000 molecules (start at generation 1600, end at generation 1605); expansion to 5,000 molecules (start at generation 2000, end at generation 2030).
The expansion and contraction phases were simulated according to a sinelike function in the interval [-π/2, π/2]. Within the specified restraints, the simulation allowed a maximum random change of 2% of the overall number of molecules from one generation to the next. In other words, minor expansions and contractions were permitted even during the stagnation phases. For example, in the simulation described here, a minor expansion occurred in the last 400 generations, yielding a molecule census of 5,371 at the end of the simulation. In the course of the simulation, two of the original four founding sequence types went extinct, the first one after only 5,000 years (200 generations), and the second after about 20,000 years (800 generations). Finally, from the resulting data set, 200 sequences were randomly sampled and submitted to the star contraction algorithm.
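The census schedule can be sketched as a deterministic function of the generation, with each phase following a sine curve over [-π/2, π/2]. This is only an illustration under assumptions: the starting census of 500 molecules is hypothetical (the initial size is not given here), the exact sinelike form used in the simulation may differ, and the random change of up to 2% per generation is left out.

```python
import math

# (target census, start generation, end generation), as listed in the text.
PHASES = [
    (1500, 20, 130), (2500, 400, 450), (1000, 700, 750), (5000, 850, 950),
    (3500, 1200, 1220), (3000, 1600, 1605), (5000, 2000, 2030),
]

def census(generation: int, initial: int = 500) -> float:
    """Deterministic census at a given generation.
    `initial` is a made-up starting size, not a value from the paper."""
    level = float(initial)
    for target, start, end in PHASES:
        if generation >= end:
            level = float(target)  # phase already completed
        elif generation > start:
            # Map progress through the phase onto [-pi/2, pi/2].
            x = -math.pi / 2 + math.pi * (generation - start) / (end - start)
            return level + (target - level) * (math.sin(x) + 1) / 2
        else:
            break  # this phase and all later ones have not begun
    return level

for g in (0, 75, 130, 725, 2400):
    print(g, round(census(g)))
```

The sine mapping makes each transition start and end with zero slope, so the census leaves one plateau and settles on the next without a jump.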
Figure 2 displays the true tree and within it the star-contracted tree of the random sample. It should be pointed out that the expectation of exactly reconstructing the true tree (fig. 2 ) is not justified. For one thing, parallel mutations may make certain sequence types look identical, as in the case of sequence types 1895 and 2388 and sequence types 2140 and 4815. Moreover, a reversal has occurred at nucleotide position 264 on the link between type 1190 and type 3954 which no parsimony method would identify. In the first round, the star contraction algorithm reduced this data set from 50 sequence types with 43 variable positions to 23 sequence types with 23 variable positions, hence a reduction by 75.4% (number of sequence types multiplied by number of variable positions). In this first round, no type was incorrectly assigned, and two median vectors were reconstructed. In the second star contraction round, the data set was reduced by another 11 sequence types and 8 variable positions, and the third round caused another reduction by one sequence type and one position (not shown).
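The 75.4% figure follows from treating the data set size as the number of sequence types multiplied by the number of variable positions. A minimal check of that arithmetic (the function name is illustrative):

```python
def matrix_reduction(types_before, sites_before, types_after, sites_after):
    """Fractional reduction in data-set size, where size is the number of
    sequence types multiplied by the number of variable positions."""
    before = types_before * sites_before
    after = types_after * sites_after
    return 1 - after / before

first_round = matrix_reduction(50, 43, 23, 23)
print(f"{first_round:.1%}")
```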
To summarize, the SC algorithm in this simulation was found to assign, without any errors, numerous sequence types to ancestral nodes, in spite of the high level of homoplasy. This not only contracts the data set by over 70% after a single round of star contraction, it also accurately identifies and contracts numerous parallel and reverse mutations, thus simplifying subsequent phylogenetic analyses, e.g., by applying a tree-building or a network algorithm to the star-contracted data set.
Star Contraction Applied to Real mtDNA Coding-Region RFLP Evolution
Threefold application of the star contraction algorithm to the 826 east Asian and Papuan RFLP sequences (encompassing 247 types) contracted the number of types to 115 after the first round, 85 after the second round, and 83 after the third round. The reduction efficiency was thus similar to the frequency >1 option of Network 2.0 (654 sequences occurred more than once, yielding 75 different types, which yielded a structurally similar RM network not shown here). The reduced data set was weighted as described in the Materials and Methods and submitted to a reduced median network analysis which yielded a three-dimensional phylogenetic network containing both parsimonious and nonparsimonious links (not shown). To eliminate nonparsimonious links, the RM output was submitted to an MJ network analysis, and the resulting simpler network is shown in figure 3. Inspection of the Asian/Papuan network reveals that most Papuans fall into branches which are distinct from those for mainland Asians, except for mtDNA clades B, E, and F, which were found in coastal Papua New Guinea, and one Malay Malaysian (MM90) who was potentially maternally related to one of the highland Papuan clusters. To investigate the Papuan sample, we therefore constructed a phylogenetic network (fig. 4) of the Papuans including the Malay. The Papuan network was constructed by combining the RM with the MJ algorithms as for the network of figure 3. We dispensed with the star contraction algorithm for the Papua New Guinea network because the expansion clusters were already defined in the overall network and we here wished to focus on individual correspondences between control-region and RFLP types (see Discussion). Intraspecific robustness tests such as bootstrapping are not available for shallow intraspecies phylogenies (Bandelt et al. 1995). However, confidence can be gained from the fact that most links in the Asian part of the phylogeny correspond to the published phylogenies of the individual data sets (for a reanalysis of the Tibetan data, see Bandelt, Forster, and Röhl [1999]) if variation at 16517e is disregarded. A separate tree of the Papuans has not previously been published, but published trees combining the Papuans with others are available (Stoneking et al. 1990; Ballinger et al. 1992a). In those analyses, most of the Papuans are grouped into the same two highland clusters and into the coastal B cluster as in our analysis, and the deep mutations within these clusters are the same. Only the tips are slightly different, but this is mainly due to 16310k and 16517e, which we weighted half and zero, respectively, unlike in the published analyses.
Discussion |
---|
The oldest expansion in Eurasia occurred 65,000 ± 23,000 years ago (table 2) and is witnessed by mitochondrial descendants preserved in Papua New Guinea; the age estimate can be narrowed somewhat by considering that the Papuan node is derived from a Eurasian founder, so it should not be older than 54,000 ± 12,000 years. This is still about 20,000 years older than any mainland Asian cluster, although both the Papuans and the Asians are derived from the same two Eurasian founders. On the basis of this time difference, we tentatively propose the following scenario to account for the obvious phenotypic differences between Papuans and Asians despite their sharing a common mitochondrial ancestry: The M and N founders derive from a single African migration but split at an early stage (possibly before reaching Europe, which lacks M) into proto-Papuan and proto-Eurasian. The proto-Papuan M and N immediately expanded demographically and geographically along a southern route until reaching Papua New Guinea, thus allowing Papuans to retain their overall genetic similarity to Africans (Stoneking et al. 1997). Meanwhile, proto-Eurasians spent 20 or more millennia genetically drifting to their present distinct European, Indian, and east Asian M and N types, as well as phenotypes (compare the common Papuan/Eurasian melanocortin receptor variants in table 1 of Harding et al. [2000]), long before expanding. The Papuan network in figure 4 shows that it may still be possible to trace the proposed southern route taken by proto-Papuan mtDNA. As had already been noted by Ballinger et al. (1992a), one Malay Malaysian (MM90) has two diagnostic sites, 15606a and 207o, in common with the Papuan P clade. Comparison of Papuan RFLP clade P with the corresponding control-region sequences (SH17 and WE17 in the appendix) demonstrates that the following control-region motif (relative to CRS) is ancestral to the Papuan cluster: 16223.C (as in CRS), 16357.C, 73.G, 212.C (corresponding to 207o), and 263.G. Screening for these positions and 15606a in relevant populations (Indians, Andaman islanders) may help to confirm the proposed southern route into Papua New Guinea. If Malaysian MM90 turns out to be representative of southeast Asia, then 12528k (fig. 4) would have mutated in or near Papua New Guinea, yielding a minimum age of 33,000 ± 8,000 years (table 2) for the settlement of Papua New Guinea, and a maximum age of 51,000 ± 17,000 years (i.e., the age of the node ancestral to MM90). Another interesting point to resolve would be the genetic relationship of Australians to Papuans and Eurasians. Unfortunately, the Australian RFLP data had to be discarded for this study (see Materials and Methods), but other studies on mtDNA control-region sequences (Redd and Stoneking 1999) and Y STRs (Forster et al. 1998) have shown that Papuans and Australians are not closely related as far as these loci are concerned. The oldest Australian human remains are found at Lake Mungo and dated to 62,000 ± 6,000 years (Thorne et al. 1999). Ancient DNA extracted from these remains (Adcock et al. 2001) may be genuine (although experimental reproduction and details are lacking), as it appears to be related to an ancient mtDNA type found to be inserted in nuclear DNA of modern humans (Zischler et al. 1995). The absence of this mtDNA lineage in any mitochondria in over 17,000 published modern human mtDNA sequences (Röhl et al. 2001), including those of aboriginal Australians, would mean that the Lake Mungo mtDNA lineage was replaced by modern mtDNA (Adcock et al. 2001) at some time in the past 60,000 years.
At the next time level in table 2, six expansion nodes cluster closely at 27,000 to 35,000 years. If we take the mean value of 31,000 years as a lower limit for the arrival in east Asia of the African founders (a distance of about 8,000 km) and 54,000 years as the starting date, then the minimal eastward migration speed would have amounted to about 300 m/year. This rate appears plausible, as it is on the same order of magnitude as the minimal southward migration speed of Amerinds from Beringia to Chile of about 1 km per year, assuming an Alaskan entry date of 25,000 years and an arrival in Monte Verde by at least 14,000 calendar years ago (Forster et al. 1996). Similarly, the arrival of the east African L2/L3 expansion (60,000 years old) in west Africa by 30,000 years ago (Watson et al. 1997) implies a westward migration speed of at least 200 m/year.
Inspection of the geographic distributions of the six oldest East Asian expansion clusters reveals that they are mainly located south of the permafrost boundary of the Last Glacial Maximum 20,000 years ago (fig. 5). A link with the Ice Age is strengthened by the gap in table 2 of 10,000 years during the Last Glacial Maximum before the next demographic expansion clusters occurred, all younger than 17,000 years and thus postglacial maximum. Most of these postglacial clusters are found today in northern Asia (fig. 6), excepting a few southeast Asian expansion clusters, notably, the Polynesian mtDNA clade B expansion to coastal Papua New Guinea (Stoneking et al. 1990). Taken together, this evidence strongly suggests that northern Asia was depopulated during the Last Glacial Maximum (fig. 7). The early expansions starting about 30,000 years ago in Asia did, however, reach America before they were swept back again, as is seen in the widespread presence of mtDNA clade B in America and central Asia, but not in northern Asia (Shields et al. 1993; Torroni et al. 1993a, 1993b). According to the time estimates in table 2 and the geographic distributions in figures 5 and 6, northern Asia was resettled partly from central Asia (from approximately the latitude of Korea according to fig. 5) and partly from the Beringian glacial refuge from whence the Na Dene and Eskimo derive their mtDNA clade A2 types (Forster et al. 1996). The reexpansions from these two refugia may have contributed to the geographic patterns of the second principal component of autosomal variation in Asia (fig. 4.17.2 in Cavalli-Sforza, Menozzi, and Piazza 1994). Any attempt to search for the Amerind ancestors in modern Mongolians (Neel, Biggar, and Sukernik 1994) or elsewhere is thus going to overlook the actual ancestral Amerind population which crossed the Bering land bridge more than 20,000 years ago, presumably from northeast Asia, and then had most of its genetic traces in Asia obliterated by the ensuing glacial conditions.
Acknowledgements |
---|
Footnotes |
---|
2 Keywords: phylogenetic network, Ice Age, refugium, Mongoloid, race
3 Address for correspondence and reprints: Peter Forster, McDonald Institute for Archaeological Research, University of Cambridge, Cambridge CB2 3ER, United Kingdom. pf223{at}cam.ac.uk.
References |
---|
Adcock G. J., E. S. Dennis, S. Easteal, G. A. Huttley, L. S. Jermiin, W. J. Peacock, A. Thorne, 2001 Mitochondrial DNA sequences in ancient Australians: implications for modern human origins Proc. Natl. Acad. Sci. USA 98:537-542
Anderson S., A. T. Bankier, B. G. Barrell, et al. (14 co-authors) 1981 Sequence and organisation of the human mitochondrial genome Nature 290:457-465
Andrews R. M., I. Kubacka, P. F. Chinnery, R. N. Lightowlers, D. M. Turnbull, N. Howell, 1999 Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA Nat. Genet 23:147
Aris-Brosou S., L. Excoffier, 1996 The impact of population expansion and mutation rate heterogeneity on DNA sequence polymorphism Mol. Biol. Evol 13:494-504
Armour J. A. L., T. Anttinen, C. A. May, E. E. Vega, A. Sajantila, J. R. Kidd, K. K. Kidd, J. Bertranpetit, S. Pääbo, A. J. Jeffreys, 1996 Minisatellite diversity supports a recent African origin for modern humans Nat. Genet 13:154-160
Ballinger S. W., T. G. Schurr, A. Torroni, Y. Y. Gan, J. A. Hodge, K. Hassan, K.-H. Chen, D. C. Wallace, 1992a. Southeast Asian mitochondrial DNA analysis reveals genetic continuity of ancient Mongoloid migrations Genetics 130:139-152
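Ballinger S. W., T. G. Schurr, A. Torroni, Y. Y. Gan, J. A. Hodge, K. Hassan, K.-H. Chen, D. C. Wallace, 1992b. Corrigendum Genetics 130:957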
Bandelt H.-J., P. Forster, 1997 The myth of bumpy hunter-gatherer mismatch distributions Am. J. Hum. Genet 61:980-983
Bandelt H.-J., P. Forster, A. Röhl, 1999 Median-joining networks for inferring intraspecific phylogenies Mol. Biol. Evol 16:37-48
Bandelt H.-J., P. Forster, B. C. Sykes, M. B. Richards, 1995 Mitochondrial portraits of human populations using median networks Genetics 141:743-753
Bräuer G., 1989 The evolution of modern humans: a comparison of the African and non-African evidence Pp. 123-154 in P. Mellars and C. Stringer, eds. The human revolution: behavioural and biological perspectives in the origins of modern humans. Edinburgh University Press, Edinburgh, Scotland
Cann R. L., M. Stoneking, A. C. Wilson, 1987 Mitochondrial DNA and human evolution Nature 325:31-36
Cavalli-Sforza L. L., P. Menozzi, A. Piazza, 1994 The history and geography of human genes Princeton University Press, Princeton, N.J
Chen Y.-S., A. Torroni, L. Excoffier, A. S. Santachiara-Benerecetti, D. C. Wallace, 1995 Analysis of mtDNA variation in African populations reveals the most ancient of all human continent-specific haplogroups Am. J. Hum. Genet 57:133-149
Day M. H., C. B. Stringer, 1982 A reconsideration of the Omo Kibish remains and the erectus-sapiens transition Pp. 814-846 in H. De Lumley, ed. L'Homo erectus et la place de l'homme de Tautavel parmi les hominidés fossiles. Centre National de la Recherche Scientifique/Louis-Jean Scientific and Literary, Nice, France
Forster P., R. Harding, A. Torroni, H.-J. Bandelt, 1996 Origin and evolution of native American mtDNA variation: a reappraisal Am. J. Hum. Genet 59:935-945
Forster P., R. Harding, A. Torroni, H.-J. Bandelt, 1997 Reply to Bianchi and Bailliet Am. J. Hum. Genet 61:246-247
Forster P., M. Kayser, E. Meyer, L. Roewer, H. Pfeiffer, H. Benkmann, B. Brinkmann, 1998 Phylogenetic resolution of complex mutational features at Y-STR DYS390 in aboriginal Australians and Papuans Mol. Biol. Evol 15:1108-1114
Forster P., A. Röhl, P. Lünnemann, C. Brinkmann, T. Zerjal, C. Tyler-Smith, B. Brinkmann, 2000 A short tandem repeat-based phylogeny for the human Y chromosome Am. J. Hum. Genet 67:182-196
Frenzel B., M. Pécsi, A. Velichko, 1992 Atlas of Paleoclimates and paleoenvironments of the Northern Hemisphere Late Pleistocene-Holocene. Hungarian Academy of Sciences, Gustav-Fischer-Verlag, Budapest/Stuttgart
Harding R. M., S. M. Fullerton, R. C. Griffiths, J. Bond, M. J. Cox, J. A. Schneider, D. S. Moulin, J. B. Clegg, 1997 Archaic African and Asian lineages in the genetic ancestry of modern humans Am. J. Hum. Genet 60:772-789
Harding R. M., E. Healy, A. J. Ray, et al. (11 co-authors) 2000 Evidence for variable selective pressures at MC1R Am. J. Hum. Genet 66:1351-1361
Harpending H. C., M. A. Batzer, M. Gurven, L. B. Jorde, A. R. Rogers, S. T. Sherry, 1998 Genetic traces of ancient demography Proc. Natl. Acad. Sci. USA 95:1961-1967
Harpending H. C., S. T. Sherry, A. R. Rogers, M. Stoneking, 1993 The genetic structure of ancient human populations Curr. Anthropol 34:483-496
Horai S., K. Hayasaka, R. Kondo, K. Tsugane, N. Takahata, 1995 Recent African origin of modern humans revealed by complete sequences of hominoid mitochondrial DNAs Proc. Natl. Acad. Sci. USA 92:532-536
Ingman M., H. Kaessmann, S. Pääbo, U. Gyllensten, 2000 Mitochondrial genome variation and the origin of modern humans Nature 408:708-713
Jorde L. B., M. Bamshad, 2000 Questioning evidence for recombination in human mitochondrial DNA www.sciencemag.org/cgi/content/full/288/5474/1931a
Kivisild T., M. J. Bamshad, K. Kaldma, et al. (15 co-authors) 1999 Deep common ancestry of Indian and western-Eurasian mitochondrial DNA lineages Curr. Biol 9:1331-1334
Kivisild T., R. Villems, 2000 Questioning evidence for recombination in human mitochondrial DNA http://www.sciencemag.org/cgi/content/full/288/5474/1931a
Krings M., A. Stone, R. W. Schmitz, H. Krainitzki, M. Stoneking, S. Pääbo, 1997 Neandertal DNA sequences and the origin of modern humans Cell 90:19-30
Kumar S., P. Hedrick, T. Dowling, M. Stoneking, 2000 Questioning evidence for recombination in human mitochondrial DNA www.sciencemag.org/cgi/content/full/288/5474/1931a
Macaulay V., M. Richards, E. Hickey, E. Vega, F. Cruciani, V. Guida, R. Scozzari, B. Bonné-Tamir, B. Sykes, A. Torroni, 1999 The emerging tree of west Eurasian mtDNAs: a synthesis of control-region sequences and RFLPs Am. J. Hum. Genet 64:232-249
Mateu E., D. Comas, F. Calafell, A. Pérez-Lezaun, A. Abade, J. Bertranpetit, 1997 A tale of two islands: population history and mitochondrial DNA sequence variation of Bioko and São Tomé, Gulf of Guinea Ann. Hum. Genet 61:507-518
Melton T., R. Peterson, A. J. Redd, N. Saha, A. S. M. Sofro, J. Martinson, M. Stoneking, 1995 Polynesian genetic affinities with southeast Asian populations as identified by mtDNA analysis Am. J. Hum. Genet 57:403-414
Morral N., J. Bertranpetit, X. Estivill, et al. (31 co-authors) 1994 The origin of the major cystic fibrosis mutation (delta F508) in European populations Nat. Genet 7:169-175
Neel J. V., R. J. Biggar, R. I. Sukernik, 1994 Virologic and genetic studies relate Amerind origins to the indigenous people of the Mongolia/Manchuria/southeastern Siberia region Proc. Natl. Acad. Sci. USA 91:10737-10741
Oota H., N. Saitou, T. Matsushita, S. Ueda, 1999 Molecular genetic analysis of remains of a 2000-year-old human population in China and its relevance for the origin of the modern Japanese population Am. J. Hum. Genet 64:250-258
Ovchinnikov I. V., A. Götherström, G. P. Romanova, V. M. Kharitonov, K. Lidén, W. Goodwin, 2000 Molecular analysis of Neanderthal DNA from the northern Caucasus Nature 404:490-493
Parsons T. J., J. A. Irwin, 2000 Questioning evidence for recombination in human mitochondrial DNA www.sciencemag.org/cgi/content/full/288/5474/1931a
Passarino G., O. Semino, L. F. Bernini, A. S. Santachiara-Benerecetti, 1996 Pre-Caucasoid and Caucasoid genetic features of the Indian population, revealed by mtDNA polymorphisms Am. J. Hum. Genet 59:927-934
Penny D., M. Steel, P. J. Waddell, M. D. Hendy, 1995 Improved analyses of human mtDNA sequences support a recent African origin for Homo sapiens Mol. Biol. Evol 12:863-882
Pritchard J. K., M. W. Feldman, 1996 Genetic data and the African origin of humans Science 274:1548
Quintana-Murci L., O. Semino, H.-J. Bandelt, G. Passarino, K. McElreavey, A. S. Santachiara-Benerecetti, 1999 Genetic evidence of an early exit of Homo sapiens sapiens from Africa through eastern Africa Nat. Genet 23:437-441
Redd A. J., M. Stoneking, 1999 Peopling of Sahul: mtDNA variation in aboriginal Australian and Papua New Guinean populations Am. J. Hum. Genet 65:808-828
Relethford J. H., L. B. Jorde, 1999 Genetic evidence for larger African population size during recent human evolution Am. J. Phys. Anthropol 108:251-260
Richards M. B., H. Côrte-Real, P. Forster, V. Macaulay, H. Wilkinson-Herbots, A. Demaine, S. Papiha, R. Hedges, H.-J. Bandelt, B. C. Sykes, 1996 Palaeolithic and neolithic lineages in the European mitochondrial gene pool Am. J. Hum. Genet 59:185-203
Richards M., S. Oppenheimer, B. Sykes, 1998 mtDNA suggests Polynesian origins in eastern Indonesia Am. J. Hum. Genet 63:1234-1236
Richards M., V. Macaulay, E. Hickey, et al. (37 co-authors) 2000 Tracing European founder lineages in the near Eastern mitochondrial gene pool Am. J. Hum. Genet 67:1251-1276
Risch N., K. K. Kidd, S. A. Tishkoff, 1996 Genetic data and the African origin of humans Science 274:1548-1549
Röhl A., 1999 Phylogenetische Netzwerke Ph.D. thesis, University of Hamburg, Hamburg, Germany
Röhl A., B. Brinkmann, L. Forster, P. Forster, 2001 An annotated mtDNA database Int. J. Legal Med 115:29-39
Saillard J., P. Forster, N. Lynnerup, H.-J. Bandelt, S. Nørby, 2000 mtDNA variation among Greenland Eskimos: the edge of the Beringian expansion Am. J. Hum. Genet 67:718-726
Schurr T. G., R. I. Sukernik, Y. B. Starikovskaya, D. C. Wallace, 1999 Mitochondrial DNA variation in Koryaks and Itel'men: population replacement in the Okhotsk Sea-Bering Sea region during the Neolithic Am. J. Phys. Anthropol 108:1-39
Sherry S. T., A. R. Rogers, H. Harpending, T. Jenkins, M. Stoneking, 1994 Mismatch distributions of mtDNA reveal recent human population expansions Hum. Biol 66:761-775
Shields G. F., A. M. Schmiechen, B. L. Frazier, A. Redd, M. I. Voevoda, J. K. Reed, R. H. Ward, 1993 mtDNA sequences suggest a recent evolutionary divergence for Beringian and northern North American populations Am. J. Hum. Genet 53:549-562
Slatkin M., 1996 Gene genealogies within mutant allelic classes Genetics 143:579-587
Starikovskaya Y. B., R. I. Sukernik, T. G. Schurr, A. M. Kogelnik, D. C. Wallace, 1998 mtDNA diversity in Chukchi and Siberian Eskimos: implications for the genetic history of ancient Beringia and the peopling of the New World Am. J. Hum. Genet 63:1473-1491
Stoneking M., J. J. Fontius, S. L. Clifford, H. Soodyall, S. S. Arcot, N. Saha, T. Jenkins, M. A. Tahir, P. L. Deininger, M. A. Batzer, 1997 Alu insertion polymorphisms and human evolution: evidence for a larger population size in Africa Genome Res 7:1061-1071
Stoneking M., L. B. Jorde, K. Bhatia, A. C. Wilson, 1990 Geographic variation in human mitochondrial DNA from Papua New Guinea Genetics 124:717-733
Stumpf M. P. H., D. B. Goldstein, 2001 Genealogical and evolutionary inference with the human Y chromosome Science 291:1738-1742
Sykes B., A. Leiboff, J. Low-Beer, S. Tetzner, M. Richards, 1995 The origins of the Polynesians: an interpretation from mitochondrial lineage analysis Am. J. Hum. Genet 57:1463-1475
Templeton A. R., 1998 Nested cladistic analyses of phylogeographic data: testing hypotheses about gene flow and population history Mol. Ecol 7:381-397
Thorne A., R. Grün, G. Mortimer, N. A. Spooner, J. J. Simpson, M. McCulloch, L. Taylor, D. Curnoe, 1999 Australia's oldest human remains: age of the Lake Mungo 3 skeleton J. Hum. Evol 36:591-612
Tishkoff S. A., E. Dietzsch, W. Speed, et al. (15 co-authors) 1996 Global patterns of linkage disequilibrium at the CD4 locus and modern human origins Science 271:1380-1387
Torroni A., H.-J. Bandelt, L. D'Urbano, et al. (11 co-authors) 1998 mtDNA analysis reveals a major late Palaeolithic population expansion from southwestern to northeastern Europe Am. J. Hum. Genet 62:1137-1152
Torroni A., Y.-S. Chen, O. Semino, A. S. Santachiara-Benerecetti, C. R. Scott, M. T. Lott, M. Winter, D. C. Wallace, 1994a. mtDNA and Y-chromosome polymorphisms in four native American populations from southern Mexico Am. J. Hum. Genet 54:303-348
Torroni A., K. Huoponen, P. Francalacci, M. Petrozzi, L. Morelli, R. Scozzari, D. Obinu, M.-L. Savontaus, D. C. Wallace, 1996 Classification of European mtDNAs from an analysis of three European populations Genetics 144:1835-1850
Torroni A., M. T. Lott, M. F. Cabell, Y.-S. Chen, L. Lavergne, D. C. Wallace, 1994b. mtDNA and the origin of Caucasians: identification of ancient Caucasian-specific haplogroups, one of which is prone to a recurrent somatic duplication in the D-loop region Am. J. Hum. Genet 55:760-776
Torroni A., J. A. Miller, L. G. Moore, S. Zamudio, J. Zhuang, T. Droma, D. C. Wallace, 1994c. Mitochondrial DNA analysis in Tibet: implications for the origin of the Tibetan population and its adaption to high altitude Am. J. Phys. Anthropol 93:189-199
Torroni A., T. G. Schurr, M. F. Cabell, M. D. Brown, J. V. Neel, M. Larsen, D. G. Smith, C. M. Vullo, D. C. Wallace, 1993a. Asian affinities and continental radiation of the four founding native American mtDNAs Am. J. Hum. Genet 53:563-590
Torroni A., R. I. Sukernik, T. G. Schurr, Y. B. Starikovskaya, M. F. Cabell, M. H. Crawford, A. S. G. Comuzzie, D. C. Wallace, 1993b. mtDNA variation of aboriginal Siberians reveals distinct genetic affinities with Native Americans Am. J. Hum. Genet 53:591-608
Turner C. G., 1986 Dentochronological separation estimates for Pacific rim populations Science 232:1140-1142
Vigilant L., 1990 Control region sequences from African populations and the evolution of human mitochondrial-DNA Ph.D. thesis, University of California, Berkeley
Vigilant L., M. Stoneking, H. Harpending, K. Hawkes, A. C. Wilson, 1991 African populations and the evolution of human mitochondrial DNA Science 253:1503-1507
Wakeley J., J. Hey, 1997 Estimating ancestral population parameters Genetics 145:847-855
Wang L., H. Oota, N. Saitou, F. Jin, T. Matsushita, S. Ueda, 2000 Genetic structure of a 2,500-year-old human population in China and its spatiotemporal changes Mol. Biol. Evol 17:1396-1400
Watson E., P. Forster, M. Richards, H.-J. Bandelt, 1997 Mitochondrial footprints of human expansions in Africa Am. J. Hum. Genet 61:691-704
Wills C., 1995 Topiary pruning and weighting reinforce an African origin for the human mitochondrial DNA tree Evolution 50:977-989
Wilson A. C., R. L. Cann, S. M. Carr, et al. (11 co-authors) 1985 Mitochondrial DNA and two perspectives on evolutionary genetics Biol. J. Linn. Soc 26:375-400
Wrishnik L. A., R. G. Higuchi, M. Stoneking, H. A. Erlich, N. Arnheim, A. C. Wilson, 1987 Length mutations in human mitochondrial DNA: direct sequencing of enzymatically amplified DNA Nucleic Acids Res 15:529-542
Zischler H., H. Geisert, A. von Haeseler, S. Pääbo, 1995 A nuclear "fossil" of the mitochondrial D-loop and the origin of modern humans Nature 378:489-492