Hubbard Center for Genome Studies and Graduate Program in Genetics, University of New Hampshire
Despite more than a century of interest in the evolution of humans from our close relatives the great apes, the genes responsible for phenotypic differences between humans and chimpanzees have remained elusive. Sequencing of the chimpanzee genome is expected to identify some 42 million nucleotide differences between humans and chimpanzees. How can we identify the small proportion of these differences that are the essential elements of being human? We have analyzed the draft human genome to find regions that may have experienced recent strong selection in the human line. Included in the identified regions are several genes for neural development and function, skeletal development, and fat metabolism. These observations provide a starting point in the search to identify the salient genetic differences between modern humans and our immediate hominid ancestors.
Strong directional selection for a favorable new allele can cause a "selective sweep." As the new mutant rises in frequency, adjacent chromosomal regions are also swept to fixation in a process sometimes called genetic hitchhiking (Maynard-Smith and Haigh 1974). These events can be recognized as regions of low nucleotide diversity. The size of such a region depends both on the length of time required to bring the mutation to fixation in the population and on the local recombination rate. With time, the record of the selective sweep is gradually erased by new mutations and the continual turnover of neutral variation. The average persistence time for a human single nucleotide polymorphism (SNP) is 4N generations, or about 1 Myr (assuming a population size [N] of 10,000 and a generation time of 25 years). So a region devoid of SNPs because of a selective sweep will regain 50% of its normal level of polymorphism in 1 Myr and 75% within 2 Myr. The current areas of low nucleotide diversity in the human genome may therefore reflect recent selective sweeps surrounding genes that were important for the evolution of modern humans some 200,000 years ago but are much less likely to be associated with the rise of Homo erectus 2 MYA.
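The arithmetic behind these figures can be sketched as follows. The 4N persistence time and the parameter values come from the text; the recovery curve is an assumption, chosen because a deficit that halves once per persistence time reproduces the stated 50% and 75% figures. The function names are illustrative only.

```python
def persistence_time_years(population_size, generation_years):
    """Average SNP persistence time of 4N generations, converted to years."""
    return 4 * population_size * generation_years


def recovered_fraction(years_since_sweep, persistence_years):
    """Fraction of the normal polymorphism level regained after a sweep.

    Assumption (not stated in the text): the remaining deficit halves
    once per persistence time.
    """
    return 1.0 - 0.5 ** (years_since_sweep / persistence_years)


# Values given in the text: N = 10,000 and 25 years per generation.
persistence = persistence_time_years(10_000, 25)

for years in (200_000, 1_000_000, 2_000_000):
    print(years, round(recovered_fraction(years, persistence), 3))
```

Under this assumed curve, a sweep dating to about 200,000 years ago would have regained only roughly an eighth of its normal polymorphism, which is why such a region should still stand out, whereas one dating to 2 MYA would be three-quarters recovered.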
We examined the data of the International SNP Map Working Group (2001) to identify areas of low nucleotide diversity in the human genome. In these data, the genome is divided into consecutive bins of 200,000 base pairs, and the density of SNPs is recorded from a panel of 24 ethnically diverse individuals. Over the whole genome, 2.5% of the bins had nucleotide diversity (π) less than 2.0 × 10⁻⁴. Regions of low SNP density are most prevalent on the sex chromosomes: 89% of bins in the nonrecombining region of the Y and 15% of bins on the X chromosome have π < 2.0 × 10⁻⁴.
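The thresholding step can be illustrated with a minimal sketch. It assumes each 200,000-bp bin is available as a (chromosome, bin index, π) tuple; that layout, the function name, and the toy values are hypothetical and are not the format or data of the SNP map.

```python
THRESHOLD = 2.0e-4  # nucleotide diversity cutoff used in the text


def low_diversity_fraction(bins):
    """Return {chromosome: fraction of its bins with pi below THRESHOLD}.

    bins is an iterable of (chromosome, bin_index, pi) tuples.
    """
    totals, low = {}, {}
    for chrom, _bin_index, pi in bins:
        totals[chrom] = totals.get(chrom, 0) + 1
        if pi < THRESHOLD:
            low[chrom] = low.get(chrom, 0) + 1
    return {chrom: low.get(chrom, 0) / totals[chrom] for chrom in totals}


# Toy values for illustration only; these are not measurements.
toy_bins = [
    ("16", 0, 7.5e-4), ("16", 1, 1.2e-4), ("16", 2, 1.9e-4), ("16", 3, 8.0e-4),
    ("X", 0, 1.0e-4), ("X", 1, 5.0e-4),
]
fractions = low_diversity_fraction(toy_bins)
print(fractions)
```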
Our analysis focused on the autosomes. While any bin where π < 2.0 × 10⁻⁴ is a likely place to look for a selective sweep, areas where we find two or more consecutive bins are of special interest because they may indicate an episode of strong selection. The 192 low-diversity autosomal bins are not distributed randomly. One would expect no more than five bins to be clustered with other bins by chance, but we found 31 bins clustered into 12 runs of two or more consecutive bins. This clumping is highly unlikely (Runs test, t = -10.57, P << 10⁻¹⁶; see table 1) (Sokal and Rohlf 1981, pp. 782-784). There are two runs of four consecutive low-diversity bins, three runs of three consecutive bins, and seven runs of two consecutive bins. None of these runs are on chromosomes containing "recombination deserts" described by Yu et al. (2001). Because of continuing changes in the assembly of the human genome, we used BLAST to match the sequences of the original bins to the current annotated assembly (February 2002) and scanned these chromosomal regions to identify genes that might be responsible for a selective sweep.
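A minimal sketch of the clustering analysis follows, assuming the bins along one chromosome are given as an ordered sequence of booleans (True for a low-diversity bin). The statistic uses the standard normal approximation for a two-state runs test; whether this matches the exact calculation from Sokal and Rohlf (1981) that was used here is an assumption, and the function names and toy sequence are hypothetical.

```python
import math


def find_runs(flags, min_length=2):
    """Return (start, length) for each run of at least min_length True flags."""
    runs, start = [], None
    # A trailing False acts as a sentinel that closes a run at the end.
    for i, flag in enumerate(list(flags) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_length:
                runs.append((start, i - start))
            start = None
    return runs


def runs_test_statistic(flags):
    """Standardized deviate of the observed number of runs.

    A negative value means fewer runs than expected, i.e. clumping.
    """
    flags = list(flags)
    n1 = sum(flags)          # low-diversity bins
    n2 = len(flags) - n1     # all other bins
    if n1 == 0 or n2 == 0:
        raise ValueError("both states must occur at least once")
    n = n1 + n2
    observed = 1 + sum(a != b for a, b in zip(flags, flags[1:]))
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return (observed - expected) / math.sqrt(variance)


# Toy sequence for illustration only: one run of four, one run of two,
# and one isolated low-diversity bin among 47 bins.
toy = ([False] * 10 + [True] * 4 + [False] * 10 + [True] * 2
       + [False] * 10 + [True] + [False] * 10)
runs = find_runs(toy)
statistic = runs_test_statistic(toy)
print(runs, round(statistic, 2))
```

Because runs cannot span chromosome boundaries, a genome-wide application would need to handle each chromosome separately or otherwise account for the breaks; this sketch treats a single ordered sequence.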
The run of four consecutive low-diversity bins on chromosome 16 contains ABCC11 and ABCC12 (also known as MRP8 and MRP9). These genes encode members of the multidrug resistance protein group of the ATP-binding cassette transporter superfamily. They are most closely related to ABCC5, which is a transporter of nucleosides in a variety of tissues (Dean, Rzhetsky, and Allikmets 2001). ABCC11 is expressed in most tissues, whereas expression of ABCC12 is limited to testis, ovary, and prostate. Their chromosomal location identifies them as candidates for two inherited forms of convulsive disorders (PKC and ICCA) (Tammur et al. 2001).
Eighteen of the 89 named genes found in the low SNP density bins are involved in neural development and function. Related in function to OLIG2 are GCMB, a binary switch between neural and glial cell determination, and EDN3, which induces reversion of melanocytes to their bipotential neural crest stem cell precursors, allowing them to develop into either glial cells or melanocytes (Dupin et al. 2000). In runs of two and three consecutive low-diversity bins on chromosome 21, in the area involved with Down syndrome, there are genes for central nervous system development and function (DSCR1, ITSN1). Also of note are nardilysin (NRD1), which in early mouse development is expressed almost exclusively in neural tissue (Fumagalli et al. 1998); members of the protocadherin beta family, which are involved in synapse formation and neuron-neuron recognition and interaction; and two glutamate receptors (GRM1/3) involved in neurotransmission.
At least fifteen other genes on our list are involved in general skeletal development, including a bone morphogenetic protein receptor (BMPR2) and a cartilage-derived morphogenetic protein (GDF5). Dysmorphic anatomical features arise from the Down syndrome genes in trisomy 21 and from mutations in a glucosidase gene (GCS1), a polydactyly gene (PAPA-1), and TRIM37 (the syndrome of mulibrey nanism [Perpheentupa et al. 1973]).
One long-noted basic anatomical difference between humans and chimpanzees is the human subcutaneous layer of fat, which is lacking in other primates (Wood-Jones 1929, p. 309). In one of our three-bin runs we find the gene AGRP, agouti-related protein homolog, which regulates body weight, obesity, and fat distribution, including the layer of subcutaneous fat. Other lipid-related genes are an intracellular lipid receptor (OSBPL9) and two apolipoprotein genes (APOL5/6).
Although multiple adjacent bins of low SNP density suggest recent selective sweeps, isolated bins may indicate relatively ancient selective sweeps or instances of weaker selection. In total, the 192 autosomal bins with low SNP density contain 18 genes related to neural development and function, 15 other genes relating to structural development or growth factors, and 56 other genes of miscellaneous function (see table 2). There are also 470 hypothetical genes whose functions may be important to human evolution.
Footnotes
Naruya Saitou, Reviewing Editor
Keywords: selective sweeps, human evolution, human genome, chimpanzee genome, genetic hitchhiking
Address for correspondence and reprints: Karl C. Diller, Hubbard Center for Genome Studies, Environmental Technology Building, 430, University of New Hampshire, Durham, New Hampshire 03824. E-mail: karl.diller@unh.edu.
References
Dean M., A. Rzhetsky, R. Allikmets. 2001. The human ATP-binding cassette (ABC) transporter superfamily. Genome Res. 11:1156-1166.
Dupin E., C. Glavieux, P. Vaigot, N. M. Le Douarin. 2000. Endothelin 3 induces the reversion of melanocytes to glia through a neural crest-derived glial-melanocytic progenitor. Proc. Natl. Acad. Sci. USA 97:7882-7887.
Fumagalli P., M. Accarino, A. Egeo, et al. (12 co-authors). 1998. Human NRD convertase: a highly conserved metalloendopeptidase expressed at specific sites during development and in adult tissues. Genomics 47:238-245.
Maynard-Smith J., J. Haigh. 1974. The hitchhiking effect of a favorable gene. Genet. Res. 23:23-35.
Perpheentupa J., S. Autio, S. Leisti, C. Raitta, L. Tuuteri. 1973. Mulibrey nanism, an autosomal recessive syndrome with pericardial constriction. Lancet 2:351-355.
Saito T., F. Guan, D. F. Papolos, S. Lau, M. Klein, C. S. Fann, H. M. Lachman. 2001. Mutation analysis of SYNJ1: a possible candidate gene for chromosome 21q22-linked bipolar disorder. Mol. Psychiatry 6:387-395.
Sokal R. R., F. J. Rohlf. 1981. Biometry: the principles and practice of statistics in biological research. W. H. Freeman and Co., San Francisco.
Tammur J., C. Prades, I. Arnould, et al. (12 co-authors). 2001. Two new genes from the human ATP-binding cassette transporter superfamily, ABCC11 and ABCC12, tandemly duplicated on chromosome 16q12. Gene 273:89-96.
The International SNP Map Working Group. 2001. A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature 409:928-933.
Wood-Jones F. 1929. Man's place among the mammals. Edward Arnold, London.
Yu A., C. Zhao, Y. Fan, et al. (11 co-authors). 2001. Comparison of human genetic and sequence-based physical maps. Nature 409:951-953.
Zhou Q., D. J. Anderson. 2002. The bHLH transcription factors OLIG2 and OLIG1 couple neuronal and glial subtype specification. Cell 109:61-73.