Laboratory of Molecular Systematics and Evolution
Department of Anthropology, University of Arizona; Department of Genetics, Universita degli Studi di Pavia, Pavia, Italy; and
SAMIR, University of Witwatersrand, Johannesburg, South Africa
Abstract
Introduction
Assuming a 1:1 sex ratio, autosomal and X-linked regions of the genome have four- and threefold higher effective sizes, respectively, than the nonrecombining portion of the Y chromosome (NRY) and the mitochondrial DNA molecule. Consequently, increased levels of population subdivision due to genetic drift are expected for these uniparentally inherited haploid regions of the genome. Until recently, the small number of known NRY polymorphisms has hindered a comprehensive assessment of the global structure of Y-chromosome diversity. Earlier studies indicated that Y-chromosome polymorphisms were geographically restricted and that FST values for the NRY were higher than those for mtDNA (Jobling and Tyler-Smith 1995; Cavalli-Sforza and Minch 1997; Underhill et al. 1997; Hammer et al. 1998; Perez-Lezaun et al. 1999). Indeed, the higher observed FST for the NRY compared with that for mtDNA led Seielstad, Minch, and Cavalli-Sforza (1998) to propose that females have had an eightfold higher migration rate than males. It is unclear, however, whether the suggested underlying cause of this higher mobility (i.e., local-scale patrilocality, defined anthropologically as the tendency for a wife to move into her husband's natal domicile) would lead to a higher global FST for the Y chromosome (Stoneking 1998). On the other hand, contrasting signals in nested cladistic analyses of NRY and mtDNA data sets led Hammer et al. (1998) to hypothesize that male migration rates may have been higher than those for females at the intercontinental level. Despite these observations, most human population genetics models assume panmixia. This study was designed to measure the degree of Y-chromosome structure on a global scale (i.e., to test the assumption of panmixia) and to test the global applicability of the patrilocality hypothesis.
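The four- and threefold ratios stated at the start of this section follow from standard effective-size formulas for a population with separate sexes. The Python sketch below is illustrative only: the function name and the breeding numbers (500 males, 500 females) are assumptions chosen for the example, not values from this study.

```python
def effective_sizes(n_males, n_females):
    """Textbook effective sizes for four genomic compartments.

    n_males and n_females are hypothetical numbers of breeding adults.
    """
    autosomal = 4 * n_males * n_females / (n_males + n_females)
    x_linked = 9 * n_males * n_females / (4 * n_males + 2 * n_females)
    nry = n_males / 2      # haploid, transmitted by fathers only
    mtdna = n_females / 2  # haploid, transmitted by mothers only
    return {"autosomal": autosomal, "x_linked": x_linked,
            "nry": nry, "mtdna": mtdna}


sizes = effective_sizes(500, 500)  # 1:1 sex ratio (assumed numbers)
print(sizes["autosomal"] / sizes["nry"])  # 4.0
print(sizes["x_linked"] / sizes["nry"])   # 3.0
print(sizes["mtdna"] / sizes["nry"])      # 1.0
```

When the numbers of breeding males and females differ, these ratios depart from 4:3:1:1, which is one reason sex-specific demography matters when NRY and mtDNA structure are compared.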
Materials and Methods
The SSCP method was used to screen a set of 20 sequence-tagged sites (STSs). The DHPLC method was used to screen for mutations in the following set of three clones that were previously used as probes to detect restriction fragment length polymorphism (RFLP) variation on the NRYs of humans and great apes (Allen and Ostrer 1994): clone 4-1 (DYS188), clone 3-11 (DYS190), and clone 3-8 (DYS194). Mutational variation within four STSs (DYS221, DYS257, DYS199, and DYS211) and two clones (3-8 and 4-1) was previously reported (Karafet et al. 1997; Hammer et al. 1998, 2000). Additional variation was found at sites within three STSs (DYS7, DYS265, and DYS257b) and two clones (3-8 and 3-11). Finally, Ya5 Alu elements within the 16E4 and 486,O,2 clones (GenBank accession numbers AC003094 and AC002531, respectively; http://www.ncbi.nlm.nih.gov/Genbank/index.html), as well as a 683-bp region of an arylsulfatase pseudogene (ARSEP, GenBank accession number AC002992) were screened for polymorphisms using DHPLC.
The DYS7 (GenBank accession number G12023), DYS265 (GenBank accession number G12016), and DYS257 (GenBank accession number G38358) STSs were amplified using the conditions and primers reported by Vollrath et al. (1992). The Y-specific clones 3-8 (DYS194) and 3-11 (DYS190) (Allen and Ostrer 1994) were sequenced by primer walking (GenBank accession numbers AF257064 and AF337053, respectively), and the sequence information was used to design primers to amplify shorter fragments for DHPLC analysis. DNA sequencing was performed by standard procedures to identify mutations which altered mobility on SSCP gels or DHPLC chromatograms.
As in previous mutation detection surveys (Underhill et al. 1997, 2000; Hammer et al. 1998; Karafet et al. 1999), we sequenced homologous DNA regions encompassing all sites found to be polymorphic on the human NRY in great ape species (e.g., one common chimpanzee, one bonobo, and one gorilla) to determine ancestral states, as well as the position of the root of the human NRY haplotype tree.
Allele-Specific Genotyping Assays
A total of 23 segregating sites were discovered using the two mutation detection methods. Of these 23, 15 were chosen for genotyping in the entire sample (10 new and 5 previously published polymorphisms). The other 8 polymorphisms were found to be so rare in a subset of the 2,858 chromosomes that they were excluded from subsequent analyses. After determining the location of a sample with respect to its position on the haplotype tree, no further genotyping was undertaken for that sample. This hierarchical genotyping protocol means that not every individual was typed for every marker, and hence it is possible that some recurrent mutations remained undetected using this strategy (Underhill et al. 2000). Nevertheless, because the homoplasy rate for single-nucleotide polymorphisms (SNPs) on the NRY is so low (Underhill et al. 2000), it is unlikely that undetected multiple "hits" would seriously affect either our phylogenetic or our diversity analyses. The remaining 20 previously published polymorphisms (from the entire battery of 43 polymorphisms) were also genotyped for all 2,858 chromosomes, with the aforementioned caveats.
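The logic of the hierarchical genotyping protocol described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the laboratory workflow actually used: the tree, the marker names (`M_a`, `M_b`, and so on), and the function name are all hypothetical.

```python
# Hypothetical haplotype tree. Each key is a marker whose derived state
# defines a branch; its value holds the markers nested within that branch.
TREE = {
    "M_a": {"M_a1": {}, "M_a2": {"M_a2x": {}}},
    "M_b": {"M_b1": {}},
}


def hierarchical_genotype(derived_markers, tree=TREE):
    """Place one chromosome on the tree, typing as few markers as possible.

    derived_markers: set of markers at which this chromosome carries the
    derived allele (in practice this is learned one assay at a time).
    Returns (path of derived markers from the root, number of assays run).
    """
    path, assays = [], 0
    branches = tree
    while branches:
        next_branches = None
        for marker, children in branches.items():
            assays += 1  # one allele-specific assay for this marker
            if marker in derived_markers:
                path.append(marker)
                next_branches = children
                break  # sibling branches are not typed once a match is found
        if next_branches is None:
            break  # no deeper branch is derived: the position is resolved
        branches = next_branches
    return path, assays


print(hierarchical_genotype({"M_a", "M_a2", "M_a2x"}))
print(hierarchical_genotype({"M_b", "M_a1"}))
```

In the second call the chromosome is placed on the `M_b` branch and `M_a1` is never assayed, so a recurrent mutation at `M_a1` on that background would go undetected, which is the caveat noted in the text.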
Variation at all previously unpublished polymorphic sites (table 1: mutations 2, 6, 9, 10, 12, 13, 20, 23, 32, and 38) was genotyped using allele-specific PCR (Sommer, Groszbach, and Bottema 1992). The PCR conditions and primer sequences employed in these allele-specific genotyping assays were deposited in the National Center for Biotechnology Information (NCBI) dbSNP database (http://www.ncbi.nlm.nih.gov/SNP). Mutations numbered 3, 7, 14, 16–20, 22, 24, 25, and 40–43 in table 1 were genotyped according to methods reported by Hammer and Horai (1995), Hammer et al. (1998, 2000), and Karafet et al. (1999). Other previously published mutations included mutation 8 (Jobling et al. 1996); mutations 1, 4, 5, 15, 21, 26, 27, 36, 37, and 39 (Underhill et al. 1997); mutation 31 (Zerjal et al. 1997); mutations 34 and 35 (Shinka et al. 1999); mutations 28 and 33 (Su et al. 1999); mutations 11 and 30 (Bao et al. 2000); and mutation 29 (Santos et al. 2000).
Statistical Analyses
Parsimony analysis of NRY haplotypes was aided by the use of PAUP, version 4.0b4 (Swofford 2000), with outgroup rooting. Measures of haplotype diversity, including the number of haplotypes (k), Nei's (1987) heterozygosity (h), and the mean number of pairwise differences among haplotypes (p), were calculated using the software package ARLEQUIN (Schneider et al. 1998). We also used ARLEQUIN to perform analysis of molecular variance (AMOVA). AMOVA produces estimates of variance components and Φ statistics (F statistic analogs) reflecting the correlation of haplotypic diversity at different levels of hierarchical subdivision (Excoffier, Smouse, and Quattro 1992). Because the assumptions of random sampling, "pure" genetic drift, and no migration are likely to be violated in all human populations, caution is needed when interpreting Φ statistics. Nevertheless, according to Excoffier, Smouse, and Quattro (1992), the resulting variance components can be viewed as convenient summaries of the partitioning of genetic variation within and among populations. We performed multidimensional scaling (MDS) (Kruskal 1964) on the ΦST distances generated in ARLEQUIN using the software package NTSYS (Rohlf 1998). Nested cladistic analyses (NCAs) were carried out using GeoDis, version 2.0 (Posada, Crandall, and Templeton 2000). This novel method attempts to explain statistically significant associations between haplotypes and geography in terms of population history and/or population structure considerations. Population structure processes operate over short time intervals and tend to establish migration-drift equilibria, whereas population history events are considered to be nonrecurrent phenomena that disrupt equilibria. Three conditions underlie the general applicability of NCA and its ability to discriminate among the various population structure processes (i.e., recurrent gene flow restricted by isolation by distance vs. long-distance dispersal) and/or population history events (i.e., contiguous range expansion, long-distance colonization, or fragmentation). These conditions are (1) adequate sampling across the geographic range of the species, (2) temporal polarity of the haplotype network, and (3) mutational resolution in the haplotype tree. Because our cladogram was rooted by outgroup comparisons, we were able to infer the geographical polarity of several of the signals detected by the NCA by considering the distribution of interior and tip clades (e.g., directionality was assumed to go from interior to tip), especially in cases where there was a clear geographic pattern of separation between ancestral and derived haplotypes. For a more extensive explanation of the NCA method, consult Templeton, Routman, and Phillips (1995), Hammer et al. (1998), and Posada, Crandall, and Templeton (2000).
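To make the AMOVA partition described above more concrete, the following Python sketch computes a single-level ΦST (populations within one group) from binary haplotypes, using the number of pairwise differences as the squared distance. It is a simplified illustration of the variance-component arithmetic, not ARLEQUIN's implementation; the function names and the toy haplotypes are assumptions, and no significance testing by permutation is attempted.

```python
from itertools import combinations


def n_diff(a, b):
    """Number of sites at which two equal-length haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def phi_st(populations):
    """Single-level AMOVA estimate of PhiST.

    populations: list of populations, each a list of haplotype strings.
    """
    pooled = [h for pop in populations for h in pop]
    n_total, n_pops = len(pooled), len(populations)

    # Sums of squared deviations, computed from pairwise distances.
    ssd_total = sum(n_diff(a, b) for a, b in combinations(pooled, 2)) / n_total
    ssd_within = sum(
        sum(n_diff(a, b) for a, b in combinations(pop, 2)) / len(pop)
        for pop in populations
    )
    ssd_among = ssd_total - ssd_within

    # Mean squared deviations and the weighted average sample size n0.
    msd_among = ssd_among / (n_pops - 1)
    msd_within = ssd_within / (n_total - n_pops)
    n0 = (n_total - sum(len(p) ** 2 for p in populations) / n_total) / (n_pops - 1)

    var_among = (msd_among - msd_within) / n0
    var_within = msd_within
    return var_among / (var_among + var_within)


# Toy data (hypothetical): two populations fixed for different haplotypes.
print(phi_st([["00", "00", "00"], ["11", "11", "11"]]))  # 1.0
```

Estimates from this formula can be negative when populations are no more different than random samples from one pool; such values are conventionally read as no detectable structure.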
Results
In addition to these 10 new markers, we surveyed 33 previously published polymorphisms (table 1). Mutational events at two sites were recurrent (SRY10831 and MSY2). The character states at all 41 mutational sites give rise to 44 possible NRY haplotypes, of which 39 were present in this survey. The frequencies of these 39 haplotypes (h1–h39) in each regional group are reported in table 1. Figure 1 displays a maximum-parsimony tree showing the evolutionary relationships of all 39 haplotypes. Haplotypes in figure 1 are color-coded by geography. The pie charts represent the frequencies of occurrence of the haplotypes within each of the 10 geographic regions listed in table 1, and the overall size of each circle represents a global haplotype frequency.
Haplotype Diversity
Diversity statistics for the 10 regional and 5 continental groups are presented in table 2. The number of regional haplotypes (k) ranged from 10 in South Asians to 18 in East Asians. Regional haplotype diversity values (h) ranged from 0.605 in Native Americans to 0.878 in Central Asians, while the mean number of pairwise differences (p) ranged from 1.31 in Native Americans to 3.93 in sub-Saharan Africans. Four distinct patterns appear when two diversity statistic values for the 10 regional groups in table 2 are compared: (1) low p/low h, (2) high p/high h, (3) high p/low h, and (4) moderate p/high h. The Americas represent the only region that exhibits the first pattern of concordantly low p and low h values. The low p value occurs here because 90% of Native American Y-chromosome lineages are one-step neighbors restricted to haplotypes 36–39 (magenta in fig. 1), while the low h value is due to the fact that 57% of the Native Americans in our study have haplotype 39 (table 1). The second pattern, where a high p value is accompanied by a concordantly high h value, is seen only in Europeans. This pattern reflects intermediate frequencies of relatively divergent haplotypes found in different parts of the tree (i.e., blue in fig. 1). The third pattern, where a high p value is discordantly combined with a low h value, characterizes both African regional groups. In sub-Saharan Africa, the extremely high p value is influenced by the marked divergence among the dark green haplotypes in figure 1. The contrasting relatively low h value may occur because 45% of sub-Saharan African Y chromosomes exhibit a single haplotype (h15). Likewise, 50% of the North Africans have a single (but different) haplotype (h14), resulting in a low h value, while the rather high North African p value is associated with the occurrence of a diverse set of lineages (light green in fig. 1). Finally, Asian and Oceanian populations exhibit the fourth pattern, moderate p and high h values. Although their p values are moderate, the Central and East Asians have the highest h values among the 10 regions. These high h values are probably due to the lack of any predominant Central or East Asian haplotype (red and orange, respectively, in fig. 1).
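The contrast between h and p drawn above can be reproduced with a small calculation. The Python sketch below implements Nei's haplotype diversity, h = n/(n − 1) × (1 − Σ x_i²), and the mean number of pairwise differences for a sample of binary haplotypes. The function names and the toy haplotype counts are hypothetical and are not taken from table 1 or table 2.

```python
from itertools import combinations


def haplotype_diversity(counts):
    """Nei's unbiased haplotype diversity h from haplotype counts."""
    n = sum(counts.values())
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


def mean_pairwise_differences(counts):
    """Mean number of differing sites over all pairs of chromosomes.

    counts maps each haplotype (a string of 0/1 character states) to the
    number of chromosomes carrying it.
    """
    n = sum(counts.values())
    total = 0
    for (hap1, c1), (hap2, c2) in combinations(counts.items(), 2):
        differences = sum(a != b for a, b in zip(hap1, hap2))
        total += c1 * c2 * differences
    return total / (n * (n - 1) / 2)


# Hypothetical samples of 20 chromosomes each.
# One predominant haplotype plus a few very divergent ones:
skewed = {"000000": 14, "111111": 3, "111000": 3}
# Many equally frequent haplotypes that are one-step neighbors:
even = {"000000": 5, "100000": 5, "010000": 5, "001000": 5}

for name, sample in (("skewed", skewed), ("even", even)):
    print(name, haplotype_diversity(sample), mean_pairwise_differences(sample))
```

In this toy comparison the skewed sample has the lower h but the higher p, which mirrors the discordant "high p/low h" pattern described for the African regional groups.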
Discussion
In contrast to other kinds of genetic data (Przeworski, Hudson, and Di Rienzo 2000), both our present NRY tree and that of Underhill et al. (2000) clearly indicate that haplotypes found outside of Africa are not a subset of those found within Africa. However, the NRY tree does show a branching pattern similar to that seen in the gene trees of several other loci: African-specific branches are found on both sides of the root of the tree and are separated from the remaining sets of African and non-African branches (Labuda, Zietkiewicz, and Yotova 2000).
Apportionment of NRY Biallelic Diversity
This study represents the most extensive Φ-statistic analysis utilizing Y-chromosome biallelic markers to date in terms of combined sample size, geographic coverage, and number of markers. Caution should be exercised when comparing variance partitions among studies because results will depend on which populations are sampled, how the populations are grouped (nested), and what underlying models of population structure are assumed (Urbanek, Goldman, and Long 1996). With this caveat in mind, a sample of previously reported total among-groups variation values (whether measured by ΦST, FST, or GST) for Y-specific RFLPs and/or SNPs was found to range from 0.230 to 0.645 (Hammer et al. 1997; Poloni et al. 1997; Seielstad, Minch, and Cavalli-Sforza 1998; Kittles et al. 1999; Jorde et al. 2000). Our global ΦST value of 0.360 is only slightly lower than the mean value (0.413) calculated from the five studies cited above. Poloni et al. (1997) cautioned that their FST value of 0.230 may be an underestimate in part because of recurrent mutation acting on the p49a,f/TaqI polymorphic system. It is not clear why the FST value of 0.645 reported by Seielstad, Minch, and Cavalli-Sforza (1998) based on the data presented in Underhill et al. (1997) is so much higher than the value reported here. When we analyzed the Underhill et al. (1997) data set using the 10 population groupings provided in their figure 2, we obtained a ΦST value of 0.540 and an FST value of 0.414.
In general, the within-populations variance component for Y-chromosome data is much smaller than the values reported for mtDNA (Excoffier, Smouse, and Quattro 1992; Seielstad, Minch, and Cavalli-Sforza 1998; Kittles et al. 1999; Jorde et al. 2000). On the other hand, the among-groups and the among-populations-within-groups component values for Y chromosomes usually exceed those for mtDNA. Seielstad, Minch, and Cavalli-Sforza (1998) appealed to a lower transgenerational migration rate for males as the major explanatory factor for why their Y-chromosome FST value (0.645) was so much higher than the mtDNA-based value they recalculated from Excoffier, Smouse, and Quattro's (1992) data (0.186). Their estimated eightfold higher female migration rate was attributed to patrilocality operating primarily at the local and perhaps regional levels (Seielstad, Minch, and Cavalli-Sforza 1998). Although the majority of human societies practice patrilocality (Murdock 1967), it is unclear whether this effect extends to intercontinental and global levels (Stoneking 1998). In order to investigate this proposition, we analyzed our Y-chromosome data according to the grouping design presented in Excoffier, Smouse, and Quattro (1992) (albeit with different samples), thereby permitting a comparison of the Φ statistics generated from our Y-chromosome data with those derived from a global mtDNA RFLP data set. Although the "patrilocality effect" was extremely clear at the interregional level within continents (ΦSC ratio = 4.4), it was much less apparent at the intercontinental level (ΦCT ratio = 1.4) or at the global level (ΦST ratio = 1.7). It is even possible that the latter two ratios would be closer to one (or less than one) without the high mutation rate and accompanying homoplasy known to affect mtDNA restriction sites (Excoffier, Smouse, and Quattro 1992) and thought to depress FST values (Jorde et al. 2000). For instance, the at-least-10-fold-higher mtDNA mutation rate would be expected to increase the within-groups variance component and thus decrease the ΦST values for mtDNA relative to the NRY, thereby artificially inflating the corresponding NRY/mtDNA ratio (Jin and Chakraborty 1995). Numerous possible explanations exist for the discrepancy between the ΦSC ratio and the other two values. For instance, increased intercontinental male migration, decreased intercontinental female gene flow, and sex-specific demographic factors may all contribute to the Φ-statistic patterns (Stoneking 1998; Fix 1999; Karafet et al. 1999).
Our AMOVA results support two tentative conclusions: (1) patrilocality effects are evident at local and regional scales rather than at the intercontinental and global levels of analysis, and (2) sole reliance on FST values based on Wright's (1969) island model of population structure may result in distorted pictures of the geographic extent of the patrilocality effect and of the possible markedly different sex-specific migration rates noted above.
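The dependence of the inferred sex-specific migration rates on the island model can be illustrated with the two FST values quoted above. The Python sketch below assumes the equilibrium relation FST ≈ 1/(1 + Nm) for a uniparentally inherited haploid locus, where N is the effective number of the transmitting sex and m is its migration rate. This relation, the helper name, and the interpretation of the ratio are assumptions made for illustration; the sketch is not a reconstruction of the original authors' calculation.

```python
def nm_from_fst(fst):
    """Invert FST = 1 / (1 + N*m) to obtain the product N*m.

    Assumes an island model at migration-drift equilibrium for a
    haploid, uniparentally inherited locus.
    """
    return 1 / fst - 1


fst_nry = 0.645    # Y-chromosome value quoted in the text
fst_mtdna = 0.186  # mtDNA value quoted in the text

nm_males = nm_from_fst(fst_nry)
nm_females = nm_from_fst(fst_mtdna)

# If male and female effective numbers are taken to be equal, the ratio
# of N*m values is an estimate of the ratio of migration rates.
print(round(nm_females / nm_males, 2))
```

Under these assumptions the ratio comes out close to eight, in line with the eightfold figure cited in the text. Because the estimate rests entirely on the equilibrium island model, violations such as recent colonization or unequal effective sizes would change it, which is the caution expressed in conclusion (2).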
Tripartite Division of Global NRY Variation
The multidimensional scaling plot in figure 2 underscores the distinctiveness of Native American and African populations with respect to Eurasian and Oceanian populations seen in the AMOVA results (table 3). The pattern of low Native American NRY diversity is also concordant with the position of the Americas as an outlier in figure 2. These results fit a scenario whereby a combination of relatively recent colonization, repeated founder effects, small population sizes, and extensive intergenerational genetic drift is responsible for the distinctiveness of Native Americans with respect to their Asian forebears, as well as the remainder of the world (Karafet et al. 1999). As in many other genetic studies (Vigilant et al. 1991; Nei and Roychoudhury 1993; Cavalli-Sforza, Menozzi, and Piazza 1994), African populations occupy a distinct region of the multidimensional space in figure 2. However, unlike those of the Americas, sub-Saharan African populations are characterized by a diverse set of ancient haplotypes that are not shared globally (e.g., basal haplotypes h1–h10 in fig. 1) in combination with a set of more derived haplotypes that are widely shared within Africa and, again, are not shared globally (e.g., h13 and h15). Therefore, the distinctiveness of African populations better fits a scenario of African-specific lineage admixture (Labuda, Zietkiewicz, and Yotova 2000).
Although North Africa occupies a position relatively close to sub-Saharan Africa in figure 2, when traditional FST and CHORD distance statistics were employed (data not shown), North Africa moved closer to the Middle Eastern and European portion of the central cluster, as might be expected from ethnohistoric connections between North Africa and the Middle East (Cavalli-Sforza, Menozzi, and Piazza 1994), thereby producing a pattern similar to that depicted in the maximum-likelihood network of Underhill et al. (2000).
What distinguishes our results in figure 2 from autosomal global genetic analyses is the particular subdivision pattern that emerges, wherein Africans and Native Americans occupy opposite ends of the plot, while populations from Europe, Asia, and Oceania form a large, central cluster. For instance, Cavalli-Sforza, Menozzi, and Piazza's (1994, p. 82) principal-components ordination shows Africa clearly differentiated from the rest of the world; however, the Americas fall within the northern Eurasian portion of their map, which is separated from a southern Asian/Oceanian cluster. In other surveys, Africa and Oceania are frequently positioned as the outliers (Nei and Roychoudhury 1993; Stoneking et al. 1997). One possible reason for the distinctiveness of the haploid NRY pattern compared with diploid autosomal patterns is the stronger effect of genetic drift because of the smaller effective population size of the NRY (i.e., NRY Ne = 1/4 autosomal Ne).
The very similar and extremely low ΦCT values for the European/non-European, Asian/non-Asian, and Oceanian/non-Oceanian comparisons in table 3, as well as the high ΦST and ΦCT values for the African/non-African, American/non-American, and Africa/Americas/"rest of the world" comparisons, coincide with the observed pattern of global Y-chromosome diversity portrayed in figure 2. When ΦCT values were calculated for the three intercluster comparisons in figure 2, the African/Native American comparison showed the largest between-groups differentiation (ΦCT = 0.549), while the central cluster was less differentiated from the Americas (ΦCT = 0.234) than from Africa (ΦCT = 0.350), in accord with ethnohistoric evidence (Cavalli-Sforza, Menozzi, and Piazza 1994; Hammer and Zegura 1996; Crawford 1998; Karafet et al. 1999; Cavalli-Sforza 2000).
Nested Cladistic Analysis as a Synthetic Explanatory Tool
In order to understand the causal mechanisms underlying the pattern of NRY variation reflected in the MDS plot and AMOVA results, the spatial distribution of our global NRY database was investigated by NCA (Templeton, Routman, and Phillips 1995; Hammer et al. 1998). In figure 4, inter- and intracontinental population history events (contiguous range expansions and long-distance colonizations) are depicted by solid arrows, while population structure processes (recurrent gene flow restricted by isolation by distance and long-distance dispersals) are indicated by dashed arrows (and one dashed line between Asia and Oceania, for which polarity could not be inferred). It is clear from figure 4 that both population structure and history have played important roles in shaping patterns of global NRY variation.
One of the most notable findings from the NCA was the predominance of intercontinental signals emanating from Asia (fig. 4). These multiple out-of-Asia signals included gene flow episodes to Europe and the Americas, along with range expansions to Oceania, Africa, and Europe. In contrast, the NCA detected only two out-of-Africa signals. These NCA inferences help to explain the MDS plot (i.e., Asia's membership in the central cluster), the AMOVA results (i.e., lack of significant differentiation of Eurasian and Oceanian populations), and the diversity statistics (i.e., similar patterns of diversity in Asia and Oceania). Contrary to previously published studies of mtDNA (Redd et al. 1999) and autosomal markers (Harding et al. 1997; Stoneking et al. 1997), the NRY results suggest a strong affinity between mainland Asian and Oceanian populations. This different pattern may be due to either ascertainment bias in our NRY database or higher rates of male migration between Asia and Oceania. Support for the latter conjecture comes from the two long-distance colonization events, as well as a gene flow signal detected between Asia and Oceania in the NCA.
The fact that Europe is primarily a receiver rather than a sender of signals in figure 4 underscores the importance of gene flow/population movements into this continent. It also helps to explain Europe's central position in the MDS plot, its high h and p diversity statistic values, the concordant pattern of the three statistics for the European/non-European and Asian/non-Asian comparisons, and the observation that Europe has the lowest continental ΦST value (see below). Interestingly, all incoming signals to Europe came from Asia. Two of these signals appear to have originated in Asia, one being a long-distance dispersal (from within nested clade 1-15) and the other being a contiguous range expansion (from within nested clade 1-11). The third was a gene flow signal that may have actually originated in Africa before moving to the Levant and eventually to Europe. This latter signal was postulated to result from the Neolithic demic diffusion of Levantine farmers into Europe (Hammer et al. 1998) and corresponds to Semino et al.'s (2000) Eu4 lineage. The former two signals may well correspond to the two proposed Paleolithic migratory episodes that contributed a major portion of the modern European paternal gene pool (Semino et al. 2000).
After an early out-of-Africa range expansion (widest arrow in fig. 4), the majority of signals involving Africa were intracontinental events and processes. To explain this in the context of the two different sets of NRY haplotypes in sub-Saharan Africa (i.e., an ancient set of haplotypes overlaid by derived shared haplotypes), one probably needs a layered temporal framework whereby, for instance, early subdivision with inferred gene flow between the Khoisan and Pygmies (e.g., table 4) is combined with later, more extensive gene flow and historical events such as the Bantu expansion (Cavalli-Sforza, Menozzi, and Piazza 1994). These inferences are compatible with the model put forward by Labuda, Zietkiewicz, and Yotova (2000) in which the gene pool of sub-Saharan Africans is seen to be composed of two clades that evolved separately and then eventually underwent hybridization.
Comparative Framework: NRY and mtDNA Patterns in Sub-Saharan Africa
Our results may help to inform the debate concerning the conflicting patterns observed in sub-Saharan African mitochondrial and nuclear DNA. A basic inconsistency has been noted concerning the relative branch lengths in population trees for sub-Saharan African and non-African populations (Jorde et al. 1995). In contrast to non-African populations, sub-Saharan African populations appear to be well differentiated in mtDNA-based trees, suggesting that they have been subdivided for an extended period (Mountain 1998). Population trees based on nuclear polymorphisms do not show this pattern: for example, sub-Saharan African populations appear to be more closely related, and non-African and sub-Saharan African branches are more comparable in length. This discrepancy has not been satisfactorily explained by models incorporating an ascertainment bias in nuclear polymorphisms, a higher substitution (and homoplasy) rate for mtDNA, limited sample sizes, or various population-level factors (e.g., size changes) (Jorde et al. 1995; Mountain 1998). Additional explanatory factors have been suggested, including lack of selective neutrality in mtDNA and differences in male versus female migration rates and/or effective sizes within sub-Saharan Africa (Jorde et al. 1995).
We examined this problem from the perspective of the NRY by undertaking two new analyses. We wanted to know (1) whether NRY-based genetic distances within sub-Saharan Africa were smaller than those for non-African locales, and (2) if NRY data showed less differentiation among sub-Saharan African populations than did mtDNA data. First, ΦST genetic distances were calculated for each continent separately. Asia had the highest ΦST value (0.271), followed by Africa (0.222), the Americas (0.188), Oceania (0.133), and Europe (0.128). Thus, at least one non-African locale (Asia) had larger genetic distances than Africa. When sub-Saharan Africa was analyzed separately, its ΦST of 0.251 was still smaller than that of Asia. In contrast, mtDNA data typically show much greater among-groups variation for sub-Saharan African populations than for non-African groups. For instance, Melton et al. (1997) reported a ΦST value of 0.339 for sub-Saharan Africa, compared with values of 0.045 and 0.007 for Asian and European populations, respectively. The second new analysis consisted of an MDS plot for all 50 populations (fig. 5). Here, sub-Saharan African populations were more tightly clustered than they were in Excoffier et al.'s (1996) mtDNA-based plot, and the sub-Saharan African populations were also more tightly clustered than non-African populations, the exact opposite of the mtDNA pattern. Moreover, the overall pattern of sub-Saharan African NRY phylogeography is closer to other nuclear system results (Cavalli-Sforza, Menozzi, and Piazza 1994; Stoneking et al. 1997) than to those for its haploid mtDNA counterpart.
Conclusions, Caveats, and Future Directions
Previous global nested cladistic analyses of human NRY variation (Hammer et al. 1998; Karafet et al. 1999) have demonstrated patterns of diversity unlike those provided by mtDNA (Templeton 1993, 1997, 1999) or autosomal systems (Harding et al. 1997). For instance, Templeton's (1993, 1997, 1999) nested cladistic analyses of human mtDNA data are all highlighted at the deepest level by pervasive gene flow restricted by isolation by distance throughout Africa and southern Eurasia for the entire time to the most recent common ancestor (TMRCA) of mtDNA. A similar extensive worldwide Late Pleistocene gene flow signal was detected at the β-globin locus (Harding et al. 1997; Templeton 1999), the only global autosomal data set analyzed by Templeton, Routman, and Phillips' (1995) nested cladistic procedures. In contrast, all three of our nested cladistic analyses detected a global contiguous range expansion out of Africa at the level of the entire cladogram. In the present NCA, the two deepest gene flow signals were only at the three-step level: one occurred globally, while the other was restricted to the continent of Africa. Our new results support a general scenario in which, after an early out-of-Africa range expansion, global-scale patterns of NRY variation were mainly influenced by migrations out of Asia. Moreover, the greater degree of contact detected by the NCA among Asia, Europe, and Oceania (via both population structure processes and population history events) helps to explain the observed pattern of global NRY diversity.
A major conclusion of the present work is that global human NRY variation is structured, with a significant amount of intergroup variation partitioned among African, Native American, and Eurasian/Oceanian populations. There was also a significant degree of among-populations variation at the intracontinental level; the degree of structure at lower levels of population subdivision remains to be determined.
It should be noted that the pattern of subdivision detected here could also be explained by models that involve natural selection or a combination of microevolutionary forces including selection, migration, genetic drift, and mutation. Additionally, various human social processes, such as polygyny and kin-structured migration, may affect variation on the NRY (Fix 1999). Support for a model involving selection comes from recent findings demonstrating an excess of rare alleles at sites on the NRY (Underhill et al. 1997; Pritchard et al. 1999; Shen et al. 2000; Thomson et al. 2000). Indeed, our mutation screening at the DYS188, DYS190, and DYS194 sites on a panel of 58 Y chromosomes from worldwide samples also yielded a significant excess of singleton polymorphisms. This excess of singletons (>twofold more than expected under the hypothesis of constant population size) resulted in a significantly negative Fu and Li's (1997) F* statistic of -2.67 (P < 0.05). These findings are consistent with models based on positive directional selection, expansion from a small population size, and/or ascertainment bias resulting from poor sampling of a subdivided population system. Obviously, more research needs to be focused on distinguishing the possible causes and implications of population subdivision in the human paternal gene pool.
It is important to note the limitations of the different methods employed in this study, as well as their complementary nature for inferring the underlying forces shaping NRY variation in human populations. The NCA is stronger at making inferences toward the interior of a cladogram and weaker at inferring processes/events at the tips. Therefore, as more polymorphisms are discovered (see, e.g., Underhill et al. 2000) and the NRY tree becomes more resolved, more inferences concerning regional variation will emerge. It is also possible that some of the inferences made here will change as more data are collected. Consequently, we have mainly focused on general patterns and have not tried to explain all of the specific signals detected by the NCA. Finally, these methods do not distinguish selection from demographic forces in shaping patterns of diversity.
Standard approaches for the description of population structure based on Wright's (1969) island model and/or F statistics (e.g., AMOVA) do not attempt to disentangle past events from contemporary processes and thus can be considered nonhistorical (Turner et al. 2000). The combination of nested cladistic and coalescence analyses can theoretically provide the temporal framework for making these crucial distinctions (Schaal and Olsen 2000). For instance, coalescence analysis could provide the missing dates needed to clarify the relative chronology of the many signals in figure 4. Our two previous coalescence analyses of Y-chromosome data (Hammer et al. 1998; Karafet et al. 1999) were performed without population growth or subdivision parameters in the model. Growth has been shown to decrease TMRCA estimates (Pritchard et al. 1999; Thomson et al. 2000). We are presently collaborating with R. C. Griffiths, who is developing coalescence analyses incorporating both population growth and subdivision. Population growth should decrease our previously published mutational ages and TMRCA estimates (Hammer et al. 1998; Karafet et al. 1999), while population subdivision should have the opposite effect.
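The direction of the growth effect can be illustrated with a small coalescent simulation. This is a minimal sketch under stated assumptions, not the analysis being developed with R. C. Griffiths: it assumes a single panmictic haploid population, exponential growth toward the present, and hypothetical values for the present-day size (`N0`) and growth rate (`GROWTH_RATE`). The same random draws are mapped onto a constant-size and a growing population, so the two TMRCA values are directly comparable. Subdivision is not modeled here.

```python
import math
import random


def sample_tmrca(n, n0, growth_rate, rng):
    """Draw one TMRCA (in generations) for n haploid lineages.

    Returns (t_constant, t_growth) built from the same exponential draws.
    Constant size: N(t) = n0. Growth: N(t) = n0 * exp(-growth_rate * t),
    with t measured backward from the present.
    """
    if n < 2 or n0 <= 0 or growth_rate <= 0:
        raise ValueError("need n >= 2, n0 > 0 and growth_rate > 0")
    scaled = 0.0  # total tree height in units of n0 generations
    for k in range(n, 1, -1):
        scaled += rng.expovariate(k * (k - 1) / 2.0)
    t_constant = n0 * scaled
    # Time change for exponential growth: (exp(g*t) - 1) / (g * n0) = scaled
    t_growth = math.log1p(growth_rate * n0 * scaled) / growth_rate
    return t_constant, t_growth


def mean_tmrca(n, n0, growth_rate, replicates, seed=1):
    """Average both TMRCA values over independent replicates."""
    rng = random.Random(seed)
    draws = [sample_tmrca(n, n0, growth_rate, rng) for _ in range(replicates)]
    mean_constant = sum(d[0] for d in draws) / replicates
    mean_growth = sum(d[1] for d in draws) / replicates
    return mean_constant, mean_growth


N_CHROMOSOMES = 58     # sample size borrowed from the text for illustration
N0 = 10_000            # hypothetical present-day number of Y chromosomes
GROWTH_RATE = 0.001    # hypothetical exponential growth rate per generation

const_mean, growth_mean = mean_tmrca(N_CHROMOSOMES, N0, GROWTH_RATE, 2000)
theory = 2 * N0 * (1 - 1 / N_CHROMOSOMES)  # constant-size expectation
print(f"constant size: simulated {const_mean:.0f}, expected {theory:.0f}")
print(f"with growth:   simulated {growth_mean:.0f}")
```

Because ln(1 + x) < x for every x > 0, each growth-model TMRCA is shorter than its constant-size counterpart, which is the direction of the effect described above.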
Acknowledgements |
---|
Footnotes |
---|
1 Keywords: subdivision, patrilocality, gene flow, male migrations
2 Address for correspondence and reprints: Michael F. Hammer, Laboratory of Molecular Systematics and Evolution, Biosciences West room 239, University of Arizona, Tucson, Arizona 85721. E-mail: mhammer{at}u.arizona.edu.
References |
---|
Allen B. S., H. Ostrer, 1994 Conservation of human Y chromosome sequences among male great apes: implications for the evolution of Y chromosomes J. Mol. Evol 39:13-21
Bao W., S. Zhu, A. Pandya, T. Zerjal, J. Xu, Q. Shu, R. Du, H. Yang, C. Tyler-Smith, 2000 MSY2: a slowly evolving minisatellite on the human Y chromosome which provides a useful polymorphic marker in Chinese populations Gene 244:29-33
Barbujani G., A. Magagni, E. Minch, L. L. Cavalli-Sforza, 1997 An apportionment of human DNA diversity Proc. Natl. Acad. Sci. USA 94:4516-4519
Batzer M. A., M. Stoneking, M. Alegria-Hartman, et al. (11 co-authors) 1994 African origin of human-specific polymorphic Alu insertions Proc. Natl. Acad. Sci. USA 91:12288-12292
Bowcock A., L. L. Cavalli-Sforza, 1991 The study of variation in the human genome Genomics 11:491-498
Carvajal-Carmona L. G., I. D. Soto, N. Pineda, et al. (11 co-authors) 2000 Strong Amerind/White sex bias and a possible Sephardic contribution among the founders of a population in northwest Colombia Am. J. Hum. Genet 67:1287-1295
Cavalli-Sforza L. L., 2000 Genes, peoples, and languages North Point Press, New York
Cavalli-Sforza L. L., P. Menozzi, A. Piazza, 1994 The history and geography of human genes Princeton University Press, Princeton, N.J
Cavalli-Sforza L. L., E. Minch, 1997 Paleolithic and Neolithic lineages in the European mitochondrial gene pool Am. J. Hum. Genet 61:247-254
Crawford M. H., 1998 The origins of Native Americans: evidence from anthropological genetics Cambridge University Press, Cambridge, England
Deka R., M. D. Shriver, L. M. Yu, R. E. Ferrell, R. Chakraborty, 1995 Intra- and inter-population diversity at short tandem repeat loci in diverse populations of the world Electrophoresis 16:1659-1664
Excoffier L., E. S. Poloni, S. Santachiara-Benerecetti, O. Semino, A. Langaney, 1996 The molecular diversity of the Niokholo Mandenkalu from Eastern Senegal: an insight into West Africa genetic history Pp. 141-155 in A. J. Boyce and C. G. N. Mascie-Taylor, eds. Molecular biology and human diversity. Cambridge University Press, Cambridge, England
Excoffier L., P. E. Smouse, J. M. Quattro, 1992 Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data Genetics 131:479-491
Fix A., 1999 Migration and colonization in human microevolution Cambridge University Press, New York
Fu Y., W.-H. Li, 1997 Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection Genetics 147:915-925
Hammer M. F., 1995 A recent common ancestry for human Y-chromosomes Nature 378:376-378
Hammer M. F., S. Horai, 1995 Y chromosomal DNA variation and the peopling of Japan Am. J. Hum. Genet 56:951-962
Hammer M. F., T. Karafet, A. Rasanayagam, E. T. Wood, T. K. Altheide, T. Jenkins, R. C. Griffiths, A. R. Templeton, S. L. Zegura, 1998 Out of Africa and back again: nested cladistic analysis of human Y chromosome variation Mol. Biol. Evol 15:427-441
Hammer M. F., A. J. Redd, E. T. Wood, et al. (12 co-authors) 2000 Jewish and middle eastern non-Jewish populations share a common pool of Y-chromosome biallelic haplotypes Proc. Natl. Acad. Sci. USA 97:6769-6774
Hammer M. F., A. B. Spurdle, T. Karafet, et al. (11 co-authors) 1997 The geographic distribution of human Y chromosome variation Genetics 145:787-805
Hammer M. F., S. L. Zegura, 1996 The role of the Y chromosome in human evolutionary studies Evol. Anthropol 5:116-134
Harding R. M., S. M. Fullerton, R. C. Griffiths, J. Bond, M. J. Cox, J. A. Schneider, D. S. Moulin, J. B. Clegg, 1997 Archaic African and Asian lineages in the genetic ancestry of modern humans Am. J. Hum. Genet 60:772-789
Hey J., 1997 Mitochondrial and nuclear genes present conflicting portraits of human origins Mol. Biol. Evol 14:166-172
Jin L., R. Chakraborty, 1995 Population structure, stepwise mutations, heterozygote deficiency and their implications in DNA forensics Heredity 74:274-285
Jobling M. A., V. Samara, A. Pandya, et al. (16 co-authors) 1996 Recurrent duplication and deletion polymorphisms on the long arm of the Y chromosome in normal males Hum. Mol. Genet 5:1767-1775
Jobling M. A., C. Tyler-Smith, 1995 Fathers and sons: the Y chromosome and human evolution Trends Genet 11:449-456
Jorde L. B., M. J. Bamshad, W. S. Watkins, R. Zenger, A. E. Fraley, P. A. Krakowiak, K. D. Carpenter, H. Soodyall, T. Jenkins, A. R. Rogers, 1995 Origins and affinities of modern humans: a comparison of mitochondrial and nuclear genetic data Am. J. Hum. Genet 57:523-538
Jorde L. B., W. S. Watkins, M. J. Bamshad, M. E. Dixon, C. E. Ricker, M. T. Seielstad, M. A. Batzer, 2000 The distribution of human genetic diversity: a comparison of mitochondrial, autosomal, and Y-chromosome data Am. J. Hum. Genet 66:979-988
Karafet T. M., S. L. Zegura, O. Posukh, et al. (14 co-authors) 1999 Ancestral Asian source(s) of New World Y-chromosome founder haplotypes Am. J. Hum. Genet 64:817-831
Karafet T., S. L. Zegura, J. Vuturo-Brady, et al. (14 co-authors) 1997 Y chromosome markers and trans-Bering Strait dispersals Am. J. Phys. Anthropol 102:301-314
Kittles R. A., A. W. Bergen, M. Urbanek, M. Virkkunen, M. Linnoila, D. Goldman, J. C. Long, 1999 Autosomal, mitochondrial, and Y chromosome DNA variation in Finland: evidence for a male-specific bottleneck Am. J. Phys. Anthropol 108:381-399
Kruskal J. B., 1964 Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis Psychometrika 29:1-27
Labuda D., E. Zietkiewicz, V. Yotova, 2000 Archaic lineages in the history of modern humans Genetics 156:799-808
Latter B. D. H., 1980 Genetic differences within and between populations of the major human subgroups Am. Nat 116:220-237
Lewontin R. C., 1972 The apportionment of human diversity Evol. Biol 6:381-398
Melton T., C. Ginther, G. Sensabaugh, H. Soodyall, M. Stoneking, 1997 Extent of heterogeneity in mitochondrial DNA of sub-Saharan African populations J. Forensic Sci 42:582-592
Mesa N. R., M. C. Mondragon, I. D. Soto, et al. (13 co-authors) 2000 Autosomal, mtDNA, and Y-chromosome diversity in Amerinds: pre- and post-Columbian patterns of gene flow in South America Am. J. Hum. Genet 67:1277-1286
Mountain J. L., 1998 Molecular evolution and modern human origins Evol. Anthropol 7:21-37
Murdock G. P., 1967 Ethnographic atlas University of Pittsburgh Press, Pittsburgh, Pa
Nei M., 1987 Molecular evolutionary genetics Columbia University Press, New York
Nei M., A. K. Roychoudhury, 1974 Genic variation within and between the three major races of man, Caucasoids, Negroids, and Mongoloids Am. J. Hum. Genet 26:421-443
Nei M., A. K. Roychoudhury, 1993 Evolutionary relationships of human populations on a global scale Mol. Biol. Evol 10:927-943
Perez-Lezaun A., F. Calafell, D. Comas, et al. (12 co-authors) 1999 Sex-specific migration patterns in Central Asian populations, revealed by analysis of Y-chromosome short tandem repeats and mtDNA Am. J. Hum. Genet 65:208-219
Poloni E. S., O. Semino, G. Passarino, A. S. Santachiara-Benerecetti, I. Dupanloup, A. Langaney, L. Excoffier, 1997 Human genetic affinities for Y-chromosome P49a,f/TaqI haplotypes show strong correspondence with linguistics Am. J. Hum. Genet 61:1015-1035
Posada D., K. A. Crandall, A. R. Templeton, 2000 GeoDis: a program for the cladistic nested analysis of the geographical distribution of genetic haplotypes Mol. Ecol 9:487-488
Pritchard J. K., M. T. Seielstad, A. Perez-Lezaun, M. W. Feldman, 1999 Population growth of human Y chromosomes: a study of Y chromosome microsatellites Mol. Biol. Evol 16:1791-1798
Przeworski M., R. R. Hudson, A. Di Rienzo, 2000 Adjusting the focus on human variation Trends Genet 16:296-302
Qamar R., Q. Ayub, S. Khaliq, A. Mansoor, T. Karafet, S. Q. Mehdi, M. F. Hammer, 1999 African and Levantine origins of Pakistani YAP+ Y chromosomes Hum. Biol 71:745-755
Redd A. J., M. Stoneking, 1999 Peopling of Sahul: mtDNA variation in aboriginal Australian and Papua New Guinean populations Am. J. Hum. Genet 65:808-828
Relethford J. H., 1995 Genetics and modern human origins Evol. Anthropol 4:53-63
Relethford J. H., H. C. Harpending, 1994 Craniometric variation, genetic theory, and modern human origins Am. J. Phys. Anthropol 95:249-270
Rohlf F. J., 1998 NTSYS-pc: numerical taxonomy and multivariate analysis system Release 2.02H. Exeter Software, Setauket, N.Y
Santos F. R., A. Pandya, M. Kayser, et al. (13 co-authors) 2000 A polymorphic L1 retroposon insertion in the centromere of the human Y chromosome Hum. Mol. Genet 9:421-430
Schaal B. A., K. M. Olsen, 2000 Gene genealogies and population variation in plants Proc. Natl. Acad. Sci. USA 97:7024-7029
Schneider S., J.-M. Kueffer, D. Roessli, L. Excoffier, 1998 Arlequin: a software for population genetic analysis Release 1.1. Genetics and Biometry Laboratory, University of Geneva, Geneva, Switzerland
Scozzari R., F. Cruciani, P. Santolamazza, et al. (17 co-authors) 1999 Combined use of biallelic and microsatellite Y-chromosome polymorphisms to infer affinities among African populations Am. J. Hum. Genet 65:829-846
Seielstad M. T., E. Minch, L. L. Cavalli-Sforza, 1998 Genetic evidence for a higher female migration rate in humans Nat. Genet 20:278-280
Semino O., G. Passarino, P. J. Oefner, et al. (17 co-authors) 2000 The genetic legacy of Paleolithic Homo sapiens sapiens in extant Europeans: a Y chromosome perspective Science 290:1155-1159
Sheffield V. C., J. S. Beck, A. E. Kwitek, D. W. Sandstrom, E. M. Stone, 1993 The sensitivity of single-strand conformation polymorphism analysis for the detection of single base changes Genomics 16:325-332
Shen P., F. Wang, P. A. Underhill, et al. (13 co-authors) 2000 Population genetic implications from sequence variation in four Y chromosome genes Proc. Natl. Acad. Sci. USA 97:7354-7359
Shinka T., K. Tomita, T. Toda, S. E. Kotliarova, J. Lee, Y. Kuroki, D. K. Jin, K. Tokunaga, H. Nakamura, Y. Nakahori, 1999 Genetic variations on the Y chromosome in the Japanese population and implications for modern human Y chromosome lineage J. Hum. Genet 44:240-245
Sommer S. S., A. R. Groszbach, C. D. Bottema, 1992 PCR amplification of specific alleles (PASA) is a general method for rapidly detecting known single-base changes Biotechniques 12:82-87
Stoneking M., 1998 Women on the move Nat. Genet 20:219-220
Stoneking M., J. J. Fontius, S. L. Clifford, H. Soodyall, S. S. Arcot, N. Saha, T. Jenkins, M. A. Tahir, P. L. Deininger, M. A. Batzer, 1997 Alu insertion polymorphisms and human evolution: evidence for a larger population size in Africa Genome Res 7:1061-1071
Su B., J. Xiao, P. Underhill, et al. (21 co-authors) 1999 Y-Chromosome evidence for a northward migration of modern humans into Eastern Asia during the last Ice Age Am. J. Hum. Genet 65:1718-1724
Swofford D., 2000 PAUP: phylogenetic analysis using parsimony Release 4.0b4. Sinauer, Sunderland, Mass
Templeton A. R., 1993 The "Eve" hypothesis: a genetic critique and reanalysis Am. Anthropol 95:51-72
Templeton A. R., 1997 Testing the out-of-Africa replacement hypothesis with mitochondrial DNA data Pp. 329-360 in G. A. Clark and C. Willermet, eds. Conceptual issues in modern human origins research. Aldine de Gruyter, Amsterdam
Templeton A. R., 1999 Human races: a genetic and evolutionary perspective Am. Anthropol 100:632-650
Templeton A. R., E. Boerwinkle, C. F. Sing, 1987 A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping. I. Basic theory and an analysis of alcohol dehydrogenase activity in Drosophila. Genetics 117:343-351
Templeton A. R., E. Routman, C. A. Phillips, 1995 Separating population structure from population history: a cladistic analysis of the geographical distribution of mitochondrial DNA haplotypes in the tiger salamander, Ambystoma tigrinum. Genetics 140:767-782
Templeton A. R., C. F. Sing, 1993 A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping. IV. Nested analyses with cladogram uncertainty and recombination Genetics 134:659-669
Thomson R., J. K. Pritchard, P. Shen, P. J. Oefner, M. W. Feldman, 2000 Recent common ancestry of human Y chromosomes: evidence from DNA sequence data Proc. Natl. Acad. Sci. USA 97:7360-7365
Turner T. F., J. C. Trexler, J. L. Harris, J. L. Haynes, 2000 Nested cladistic analysis indicates population fragmentation shapes genetic diversity in a freshwater mussel Genetics 154:777-785
Underhill P. A., L. Jin, A. A. Lin, S. Q. Mehdi, T. Jenkins, D. Vollrath, R. W. Davis, L. L. Cavalli-Sforza, P. J. Oefner, 1997 Detection of numerous Y chromosome biallelic polymorphisms by denaturing high-performance liquid chromatography Genome Res 7:996-1005
Underhill P. A., P. Shen, A. A. Lin, et al. (21 co-authors) 2000 Y chromosome sequence variation and the history of human populations Nat. Genet 26:358-361
Urbanek M., D. Goldman, J. C. Long, 1996 The apportionment of dinucleotide repeat diversity in Native Americans and Europeans: a new approach to measuring gene identity reveals asymmetric patterns of divergence Mol. Biol. Evol 13:943-953
Vigilant L., M. Stoneking, H. Harpending, K. Hawkes, A. C. Wilson, 1991 African populations and the evolution of human mitochondrial DNA Science 253:1503-1507
Vollrath D., S. Foote, A. Hilton, L. G. Brown, P. Beer-Romero, J. S. Bogan, D. C. Page, 1992 The human Y chromosome: a 43-interval map based on naturally occurring deletions Science 258:52-59
Wright S., 1969 Evolution and the genetics of populations 2. the theory of gene frequencies University of Chicago Press, Chicago
Zerjal T., B. Dashnyam, A. Pandya, et al. (18 co-authors) 1997 Genetic relationships of Asians and northern Europeans, revealed by Y-chromosome DNA analysis Am. J. Hum. Genet 60:1174-1183