Genetic Diversity Patterns in the SR-BI/II Locus Can Be Explained by a Recent Selective Sweep

Mireille Le Jossec*, Tina Wambach*, Damian Labuda*,†, Daniel Sinnett*,† and Emile Levy*,‡

* Centre de Recherche, Hôpital Sainte-Justine, Montréal, Quebec, Canada
† Département de Pédiatrie
‡ Département de Nutrition, Université de Montréal, Québec, Canada

Correspondence: E-mail: damian.labuda@umontreal.ca.


    Abstract
The human scavenger receptor class B type I (SR-BI and splice variant SR-BII) plays a central role in HDL cholesterol metabolism and represents a candidate gene for a number of related diseases. We examined the genetic diversity of its coding and flanking regions in a sample of 178 chromosomes from individuals of European, African, East Asian (including Southeast Asian), Middle-Eastern, and Amerindian descent. Nine of the 14 polymorphisms observed are new. Four of the five variants causing amino acid replacements, G2S, S229G, R484W, and G499R, are likely to affect protein structure and function. SR-BI/II diversity is partitioned among 19 haplotypes, all but one of which are interconnected by a single mutation or recombination event. Such a tight haplotype network and the unusual geographic partitioning of this diversity, high not only in Africa but in East Asia as well, suggest its recent origin and a possible effect of selection. Coalescent analysis infers a relatively short time to the most recent common ancestor and points to population expansion in Africa and East Asia. These two continents differ significantly from each other in pairwise FST values, and each also differs from a single cluster formed by Europe, the Middle East, and America. In the context of findings for other similarly analyzed loci, we propose that a selective sweep at the origin of modern human populations could explain the low level of ancestral SR-BI/II diversity. The unusually deep split between Africa and Asia, well beyond the Upper Paleolithic when inferred under neutrality, is consistent with subsequent geographical and demographic expansion favoring the accumulation of new variants, especially in groups characterized by large effective population sizes, such as Asians and Africans. The relevance of such partitioning of SR-BI/II diversity remains to be investigated in genetic epidemiological studies, which can be guided by the present findings.

Key Words: human populations • genetic diversity • scavenger receptor gene • coalescence analysis


    Introduction
SR-BI is a plasma membrane glycoprotein identified as the first cell surface high-density-lipoprotein (HDL) receptor (Acton et al. 1994; 1996), which plays a central role in the clearance of HDL cholesterol and in the regulation of its plasma levels (Ji et al. 1997; Rigotti et al. 1997; Jian et al. 1998; Varban et al. 1998). It mediates the binding of HDL, the selective uptake of cholesteryl esters without degradation of HDL protein moieties, and the bi-directional transfer of free cholesterol between HDL and cells. SR-BI recognizes a broad spectrum of ligands, including native low-density lipoprotein (LDL), modified lipoproteins, maleylated albumin, and anionic phospholipids (Acton et al. 1994; Calvo et al. 1997; Rigotti et al. 1997). It is involved in the suppression of atherosclerosis development (Braun et al. 2002), in the regulation of hormone synthesis in steroidogenic tissues (Azhar and Reaven 2002), and in intestinal cholesterol absorption (Werder et al. 2001). Cholesterol metabolism during pregnancy and lactation (Smith et al. 1998), lipid delivery to the embryo (Hatzopoulos et al. 1998), and the recognition of senescent or apoptotic cells (Fukasawa et al. 1996; Murao et al. 1997; Shiratsuchi et al. 1999) were also found to be influenced by this locus. SR-BI (-/-) mice displayed marked hypercholesterolemia characterized by impaired HDL clearance and a dramatic decrease in sterol content in adrenal tissue (Rigotti et al. 1997). In contrast, hepatic overexpression of SR-BI resulted in diminished HDL cholesterol content and enhanced delivery of HDL-associated lipid to hepatocytes and bile (Kozarsky et al. 1997).

The SR-BI gene is composed of 13 exons that are spread over 86 kb on chromosome 12q24.31–32; its splice variant SR-BII, obtained by skipping the 129-nucleotide-long exon 12, is expressed in rodents and humans (Webb et al. 1997; 1998). Certain hormones have been shown to modulate its splicing (Graf, Roswell, and Smart 2001), consistent with the notion that these isoforms may represent part of the HDL-cholesterol metabolism regulation mechanism. In the context of current efforts to identify and characterize polymorphisms associated with disease risk or therapeutic response, genes implicated in cholesterol and lipid metabolism have drawn considerable attention (Clark et al. 1998; Fullerton et al. 2000; 2002). However, except for a few variants identified in individuals of European descent (Acton et al. 1999), there has been no systematic study of the SR-BI/II locus. We present data on allelic diversity in its coding region and the resulting haplotypes. Our analysis suggests the possible involvement of selection in the evolutionary history of this locus, a finding that could be of potential relevance for genetic epidemiology.


    Materials and Methods
DNA Samples
DNA samples were acquired from the Coriell Institute for Medical Research (Camden, N.J.) or obtained, on a non-nominative basis, from consenting adults who provided information about their ethnic, linguistic, and geographic origins (following a protocol approved by our Institutional Advisory Board). Individuals of European descent were represented by French-Canadians from Quebec (n = 9) and other European-Canadians (n = 9); those of Sub-Saharan African descent consisted of African Americans (n = 18) (Coriell NA17133–17150); Southeast Asians originated from Vietnam and Laos (they and East Asians from China and Korea will be collectively referred to as East Asians [n = 9]); the Middle East and North Africa were represented by Ashkenazi and Sephardic Jews, as well as by Arabs from Lebanon, Egypt, and Algeria (n = 23); and Amerindians were represented by individuals from North America (n = 10) and South America (n = 10) (Coriell NA17311–17320) plus one individual of mixed origin from Mexico.

Polymerase Chain Reaction Amplification
Polymerase chain reaction (PCR) primer pairs (table 1) were designed for exons 2–12 outside the exonic borders. For exon 1, where no downstream sequence (intron 1) was available, the reverse primer intrudes 22 nucleotides into the exon (GenBank entries Z22555, gi: 397606 and AC020773 for cDNA and genomic sequence, respectively). According to Cao et al. (1997), the nucleotide located 143 positions upstream of the ATG initiation codon was considered the transcription start site, and hence the beginning of exon 1. For exon 13 (932 bp), five overlapping fragments (A to E) were used to cover the sequence with shorter DNA fragments better suited for Single-Strand Conformation Polymorphism (SSCP) analysis (Orita et al. 1989).


Table 1 Characteristics of PCR Amplification of SR-BI/II Segments.

 
DNA segments were amplified by PCR in a 10-µl reaction using 5 ng of genomic sample, 0.75 U of Platinum Taq DNA polymerase (Invitrogen Canada, Inc.) in the Platinum PCR buffer containing 1.5 mM MgCl2 and 0.2 mM of each dNTP, and 2 µM of each primer in the presence of formamide or DMSO if so indicated in table 1. The PCR was performed for 35 cycles, each of 30 s at 94°C, 30 s at the annealing temperature (table 1), and 30 s at 72°C. For SSCP analysis, the PCR contained only 0.1 mM cold dCTP supplemented by 0.1 mM α-32P-dCTP, 3,000 Ci/mmol (Amersham Biosciences).

SSCP Analysis, Sequencing, and Genotyping
Conformational DNA variants were analyzed by electrophoresis in a 6% polyacrylamide gel containing 10% glycerol (acrylamide-bisacrylamide ratio of 29:1 and for exon 7 also 50:1) in TBE (0.089 M Tris-borate pH 8, 0.01 M EDTA). The PCR products were mixed (1:2) with 95% formamide, heated at 94°C for 5 min, and cooled on ice. Electrophoresis was carried out at room temperature for 16–21 h, at 400 V, in an S2 apparatus (Life Technologies). The dried gel was exposed using X-OMAT AR Kodak film. Gel mobility variants were analyzed by dideoxy-sequencing (Thermo Sequenase Radiolabeled Terminator Cycle Sequencing Kit, USB Corporation, Cleveland, Ohio) according to the manufacturer's protocol. Partial sequences in a subset of the analyzed fragments were determined in the common chimpanzee (DNA sample from BIOS Laboratories) in order to assist the assignment of the ancestral allele in polymorphisms 1–4, 9, 12, and 13. Because of characteristic gel mobility profiles, alleles were unambiguously assigned for both homozygote and heterozygote individuals. The genotypes obtained in this way were subsequently used to solve the underlying haplotypes.

Haplotype Solving and Networks
Of the 89 individuals examined, 60 were either homozygous or heterozygous at a single site, yielding unambiguous haplotype solutions. Furthermore, 27 individuals were heterozygous at more than one site and 2 had missing genotypes at position 4 or positions 1–3, respectively. All genotypes were analyzed using the software PHASE, version 1.0 (Stephens, Smith, and Donnelly 2001). Repeated analyses led to stable solutions except for four genotypes consisting of combinations of one known and a new or inferred haplotype. Preference was given to combinations involving known haplotypes of higher regional population frequency. The alternative solutions are reported later (table 3) to indicate that these minor haplotypes could not be inferred unambiguously from the data. A maximum parsimony network of haplotypes was constructed manually, first from the haplotypes connectable by mutations only, and then by addition of the recombinant haplotypes (recurrent mutations were considered unlikely).
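The distinction drawn above between unambiguous and ambiguous genotypes can be illustrated with a minimal Python sketch. It is not the PHASE algorithm and not the authors' code; the function names and the representation of a genotype as a list of allele pairs are assumptions made for illustration, and complete (non-missing) genotypes are assumed.

```python
from typing import List, Tuple

# Hypothetical representation: one (allele, allele) pair per polymorphic site.
Genotype = List[Tuple[str, str]]

def heterozygous_sites(genotype: Genotype) -> List[int]:
    """Return the indices of sites where the two alleles differ."""
    return [i for i, (a, b) in enumerate(genotype) if a != b]

def phase_is_unambiguous(genotype: Genotype) -> bool:
    """Phase is fully determined when at most one site is heterozygous."""
    return len(heterozygous_sites(genotype)) <= 1

def n_phase_configurations(genotype: Genotype) -> int:
    """Number of distinct haplotype pairs compatible with the genotype."""
    k = len(heterozygous_sites(genotype))
    return 1 if k <= 1 else 2 ** (k - 1)

def resolve_unambiguous(genotype: Genotype) -> Tuple[str, str]:
    """Read the two haplotypes directly off an unambiguous genotype."""
    if not phase_is_unambiguous(genotype):
        raise ValueError("more than one heterozygous site: phase must be inferred")
    hap1 = "".join(a for a, _ in genotype)
    hap2 = "".join(b for _, b in genotype)
    return hap1, hap2
```

A genotype heterozygous at k sites is compatible with 2^(k-1) haplotype pairs, which is why only multiply heterozygous individuals require statistical phase inference.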


Table 3 SR-BI/II Haplotypes and Their Geographic Distribution.

 
Statistical Analyses
The software Arlequin v.2.0 (Schneider, Roessli, and Excoffier 2000) was used to obtain summary statistics of genetic diversity (see Results) and to test for selective neutrality (Tajima 1989; Fu 1997). Genetic differentiation among population samples was quantified by the test statistic FST (Wright 1951), computed according to Weir and Cockerham (1984), as implemented in Arlequin, per site or locus, for pairs of continental populations or for the total sample subdivided into five continental groups. Pairwise FST's between the five groups were represented by multidimensional scaling using STATISTICA v. 6.0.
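For intuition about what FST measures, the following Python sketch computes a textbook heterozygosity-based FST for one biallelic site in two equally weighted populations. This is a deliberate simplification and not the Weir and Cockerham (1984) estimator implemented in Arlequin, which additionally corrects for sample size; the function names and the example frequencies are hypothetical.

```python
def heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1-p) at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)

def fst_two_populations(p1: float, p2: float) -> float:
    """Simplified FST = (HT - HS) / HT for two equally weighted populations."""
    h_s = (heterozygosity(p1) + heterozygosity(p2)) / 2.0  # mean within-population value
    p_bar = (p1 + p2) / 2.0                                # pooled allele frequency
    h_t = heterozygosity(p_bar)                            # total heterozygosity
    if h_t == 0.0:
        return 0.0  # site is monomorphic in the pooled sample
    return (h_t - h_s) / h_t

# Hypothetical frequencies: identical populations give 0, fixed differences give 1.
print(fst_two_populations(0.3, 0.3), fst_two_populations(0.0, 1.0))
```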

Coalescence analysis was carried out according to the method of Griffiths and Tavaré (1994) using genetree (v. 8.3) on the full data set (excluding recombinant haplotypes, i.e., 12 out of 178 chromosomes in the analyzed sample); maximum likelihood estimates of θ (i.e., θML), the time to the most recent common ancestor (TMRCA), and the ages of mutations were obtained conditional on the haplotype tree, assuming an infinite-sites model, constant population size, and random mating. The number of iterations per run was sufficiently large for estimates to remain constant over repeated runs, which only differed in the random number seed. Estimates of θML were also obtained under a model of exponential growth, concurrently yielding an estimate of growth rate β. Specifically, in this model the population size declines exponentially backward in time at rate β from a current size Ne(0), such that the size of the population at time t is N(t) = Ne(0) e^(−βt) (note that t = g/2Ne(0), where g is the number of generations ago). To calculate effective population sizes, we used the mutation rate estimate of Fan et al. (2002) of 1.04 × 10⁻⁸ per nucleotide per generation (or 5.2 × 10⁻¹⁰ per nucleotide per year), obtained from a comparison of 108-kb human expressed sequence tags (ESTs) with the corresponding great and lesser ape sequences. This rate compares well with the mutation rate of 10⁻⁹ per nucleotide per year based on the divergence of 0.15 between human and bovine SR-BI/II cDNAs, taking into account the slowing down of the mutation rate and extended generation time in primate lineages (Koop et al. 1989; Yi, Ellsworth, and Li 2002). Note that use of the value of 10⁻⁹, twice as large as the Fan et al. (2002) estimate, would result in a twofold decrease of the reported Ne and, consequently, the TMRCA and mutation age estimates. From the genetic map of Kong et al. (2002), we obtained an average recombination rate of about 3.1% per Mb per generation.
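The growth model and the rate conversion described above can be written out as a short Python sketch. It assumes the exponential form N(t) = Ne(0)·exp(−βt), which matches the description of a size that declines backward in time at rate β, with t measured in units of 2Ne(0) generations. The function names, the 20-year generation time (the value used later in Results), and the example values of Ne(0) and β are illustrative assumptions, not output of genetree.

```python
import math

def population_size(ne0: float, beta: float, t: float) -> float:
    """Population size at scaled time t in the past: N(t) = Ne(0) * exp(-beta * t)."""
    return ne0 * math.exp(-beta * t)

def generations_to_scaled_time(generations: float, ne0: float) -> float:
    """Convert generations ago (g) to coalescent time t = g / (2 * Ne(0))."""
    return generations / (2.0 * ne0)

def per_year_rate(rate_per_generation: float, generation_years: float) -> float:
    """Convert a per-generation mutation rate to a per-year rate."""
    return rate_per_generation / generation_years

# Hypothetical values for Ne(0) and beta, chosen only to show the shape of the model.
t = generations_to_scaled_time(generations=10_000, ne0=20_000)
print(population_size(ne0=20_000, beta=2.0, t=t))

# The per-generation rate quoted in the text, with an assumed 20-year generation.
print(per_year_rate(1.04e-8, 20.0))
```

With a 20-year generation, the per-generation rate of 1.04 × 10⁻⁸ corresponds to the per-year rate of 5.2 × 10⁻¹⁰ quoted in the text.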

Interspecies Comparative Analysis
The amino acid sequence of the human protein was compared to the corresponding sequences of rat, mouse, bovine, Chinese hamster, and pig (GenBank entries NP_113729, NP_058021, AAB70920.1, AAA61572, and AAL75567.1, respectively) to assess evolutionary conservation of amino acid substitution polymorphisms.


    Results
SR-BI/II Variants
We analyzed the genetic diversity of the quasi-totality of the SR-BI/II coding region (1,624 bp), both 5' and 3' untranslated regions (980 bp), as well as 627 bp of the flanking introns (fig. 1). Twelve transitions and two transversions were observed in a worldwide sample of 178 chromosomes. Eleven of these sites were in the coding regions, five of which caused amino acid replacements, one was in 3'-UTR, and two were intronic (table 4). Three polymorphisms were seen in at least four continents, four occurred in two continents, and seven occurred in a single continent only, suggesting their recent origin. Tajima's test statistic D (Tajima 1989) with P = 0.071 for the total (world) sample (table 2), bearing in mind its conservative assumption of no recombination, might be considered suggestive of non-neutrality or of population expansion. This would be consistent with the highly significant Fs statistics (table 2) in the world, in Africa, and in East Asia (Fu 1997; Ramos-Onsins and Rozas 2002).
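As a hedged illustration of the statistics mentioned above, the following Python sketch computes Watterson's estimator θS and Tajima's D from the number of segregating sites S, the sample size n, and the mean number of pairwise differences π (per analyzed segment, not per site), using the standard formulas of Tajima (1989). It is not the Arlequin implementation, and the value of π must be supplied by the user; none is taken from table 2.

```python
import math

def harmonic_sums(n: int):
    """Return a1 = sum(1/i) and a2 = sum(1/i^2) for i = 1 .. n-1."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    return a1, a2

def watterson_theta(s: int, n: int) -> float:
    """Watterson's estimator: theta_S = S / a1."""
    a1, _ = harmonic_sums(n)
    return s / a1

def tajimas_d(pi: float, s: int, n: int) -> float:
    """Tajima's D: (pi - S/a1) scaled by its estimated standard deviation."""
    a1, a2 = harmonic_sums(n)
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))

# S = 14 sites in n = 178 chromosomes, as reported in the text.
print(round(watterson_theta(14, 178), 2))
```

For S = 14 and n = 178 this gives θS of about 2.43 per segment. D is negative whenever π falls below S/a1, the pattern expected after population expansion or a selective sweep.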



FIG. 1. Schematic representation of the SR-BI/II gene. Horizontal bars correspond to 13 exons with extruding intronic fragments when present in the amplicons. Fragments 13A to 13E represent five amplicons covering exon 13. SR-BII differs from SR-BI by alternative splicing, omitting exon 12, which leads to a stop translation in exon 13 (Webb et al. 1998). Positions of the polymorphic sites numbered 1 to 14 are either given as a distance from the transcription start site within the mRNA sequence or, if in the intronic sequence (int), upstream/downstream of the corresponding exon. An estimate of the total gene length of about 86 kb was obtained by aligning the SR-BI mRNA's (GenBank gi: 397606) upstream and downstream portions with the genomic clone NCBI gi: 22003948, AC126309, and its adjacent and overlapping contig gi: 22058877, NT_035239.1, respectively.

 

Table 4 Frequencies and Geographic Distribution of SR-BI/II SNPs.

 

Table 2 Diversity Statistics in SR-BI/II.

 
Of the 19 haplotypes deduced, H1 to H19, 14 are due to mutations, whereas 5 haplotypes resulted from single crossovers (table 3). The number of haplotypes (14) observed that are due to mutation is unusually large—i.e., almost equal to the maximum number of 15 haplotypes (S + 1) that can arise from 14 biallelic sites, not all of which are expected to be found in a finite population at the same time. Therefore, our observation may indicate the effect of selection, or a dramatic reduction of genetic drift, due either to a very large population size or to its rapid growth (see Beheregaray et al. 2003). Finding all intermediate haplotypes in a relatively limited population sample argues against a very large population size, thus suggesting rapid population growth and/or selection. The 19 haplotypes were easily connected within a minimum-step spanning network (fig. 2). Two major haplotypes, H1 and H2, which differ by one mutation (C1193T), account for 71% of the global sample. The direction of the 1193 mutation is from C to T, i.e., from H2 to H1, because C, shared with the chimpanzee, can be considered ancestral. Although H1 and H2 occur at the same worldwide frequency, only two mutations, 6 and 11, appear to have occurred on the background of H1, whereas 10 arose on the background of H2. More haplotypes derive from H2 than from H1, including haplotype branch 13–3–4 that occurs locally in Sub-Saharan Africa and haplotype branch 5–12–7 from East Asia. Continentally shared haplotypes can be ascribed to gene flow either from East Asia to the Americas or from Africa through the Middle East to Europe (fig. 2).
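The S + 1 bound and the separation of mutational from recombinant haplotypes can be illustrated with a small Python sketch. The haplotype network in this study was built manually; the four-gamete test below is a standard check substituted here for illustration, and is not the authors' procedure. Haplotypes are assumed to be equal-length strings over two alleles per site (coded "0" and "1"), and the function names are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple

def max_mutation_only_haplotypes(n_sites: int) -> int:
    """Without recombination or recurrent mutation, S sites give at most S + 1 haplotypes."""
    return n_sites + 1

def four_gamete_violations(haplotypes: List[str]) -> List[Tuple[int, int]]:
    """Return site pairs showing all four allele combinations.

    Under the infinite-sites model, such a pair implies at least one
    recombination event (or a recurrent mutation) between the two sites.
    """
    n_sites = len(haplotypes[0])
    violations = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations

print(max_mutation_only_haplotypes(14))               # 15, as stated in the text
print(four_gamete_violations(["00", "01", "10"]))     # compatible with mutation only
print(four_gamete_violations(["00", "01", "10", "11"]))
```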



FIG. 2. SR-BI/II haplotype networks for the total sample and by groups of continental populations. Each haplotype is represented by a circle, whose square radius (surface) is proportional to its population frequency. H2 is the likely ancestral haplotype. Solid lines represent mutation events creating polymorphisms 1 to 14 (see table 4), and broken lines indicate recombinations leading to new haplotypes

 
FST and Population Differentiation
An overall FST estimate of 9.2% is in the lower range of values (10% to 15%) typically observed between continental groups with protein or autosomal DNA markers (e.g., Cavalli-Sforza, Menozzi, and Piazza 1994; Barbujani et al. 1997; Jorde et al. 2000; Fan et al. 2002). The contribution of individual polymorphic sites was uneven, with only half of the sites showing FST values significantly different from zero. In the overall apportionment of genetic diversity across continents, measured through pairwise FST's and represented by multidimensional scaling (fig. 3), Africa and East Asia appear to differ significantly from Europe, the Middle East, and the Americas, consistent with local networks of haplotypes shown in figure 2.



FIG. 3. Multidimensional scaling representation of pairwise FST's based on all polymorphic sites in the SR-BI/II locus. Distances between America (Am), Europe (Eu), and the Middle East (ME) are not significant, but they are between these three groups and either Africa (Af) or East Asia (EA)

 
Amino Acid Polymorphisms
Sites polymorphic at the human protein level (table 4) were compared to their orthologous positions in three rodent and two artiodactyl species (see Materials and Methods). G2, S229, and G499 were found in all aligned sequences, indicating evolutionary conservation across three (i.e., including Primates) mammalian orders. In contrast, V135 was found to be replaced by I in the Chinese hamster. At position 484, however, with R and derived W in the human protein, Q is found in the orthologous site in all nonprimate species compared. Indeed, the V to I substitution is known to be "conservative," i.e., it rarely affects the protein function due to structural similarity of these amino acids. Throughout evolution it occurs with a relative frequency of 33, as measured by the PAM (point accepted mutation) matrix. This is in contrast to the relative frequency of 16 and 21 for G to S and S to G replacements, respectively, of 10 for Q to R, 2 for R to W, and zero for both G to R and Q to W substitutions (Felsenstein 2004). The latter two thus represent "nonconservative" amino acid substitutions that are likely to affect the SR-BI/II structure and function. We will see that this might also be the case of less dramatic amino acid change at sites 2 and 229.

Age of Mutations
We used a coalescent model to infer the time scale of the origin and evolution of the variation within the SR-BI/II locus. The underlying gene tree used in this analysis (fig. 4) was rooted by assuming that haplotype H2 was ancestral (table 3); five recombinant haplotypes representing 12 of the 178 chromosomes analyzed were not included in this analysis (see Materials and Methods). Given θ = 4Neµ, and µ = 3.5 × 10⁻⁵ as the mutation rate per analyzed SR-BI/II segment per generation, we evaluated Ne at 8,500, 17,500, and 21,900 from θπ, θS, and θML, respectively, comparable to values reported in the literature (Li and Sadler 1991; Harding et al. 1997; Jaruzelska et al. 1999; Fan et al. 2002). For a generation time of 20 years, assuming θ to be known and setting it equal to θML, we obtained an estimate of TMRCA of 1.04 (standard deviation of 0.36) in coalescence units (or 2Ne generations; fig. 4), corresponding to a TMRCA of 910 ± 320 Kyr (thousands of years). Using θπ yields a TMRCA of 1.55 ± 0.5 coalescence units, which corresponds to a time depth of 530 ± 170 Kyr, and thus a shallower tree than the one based on θML. Time estimates based on θS are between those using θML and θπ (see θ values in table 2). From this point forward, we will only refer to the values obtained based on θML (table 2). The age of mutation 9 is 0.42 ± 0.25 in 2Ne generations (fig. 4) or 370 ± 220 Kyr. The estimates for mutations 13, 5, 1, and 12 were 250 ± 210, 250 ± 200, 140 ± 135, and 90 ± 75 Kyr, respectively. Other mutations appeared younger, most less than 50 or even 10 Kyr old (fig. 4, filled circles). On a relative scale, the oldest mutation, 9, which splits the diversity between H2-derived and H1-derived branches, occurs closer to the present than the half-time to the TMRCA. In contrast to a multiplicity of H2-derived branches, only two branches, those marked by mutations 6 and 11, follow mutation 9.
This suggests that mutation 9 could be younger than it appears, either because of recent rapid growth or because of a founder effect in its carrier population(s) of Europe, the Middle East, and the Americas (note that the estimated mutation age is a function of the new allele's frequency). In Africa and East Asia the prevalent H2 gave rise to much greater haplotype diversity. Are these latter continental populations relatively older or were the effective sizes of their human populations relatively larger?
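The conversions used in this section, from θ to Ne and from coalescence units to years, amount to simple arithmetic that can be checked with a minimal Python sketch. The function names are illustrative; the inputs (µ = 3.5 × 10⁻⁵ per segment per generation, a 20-year generation, and the rounded Ne values) are those quoted in the text.

```python
def effective_size(theta: float, mu: float) -> float:
    """Ne from theta = 4 * Ne * mu (mu per analyzed segment per generation)."""
    return theta / (4.0 * mu)

def coalescent_units_to_years(t: float, ne: float, generation_years: float = 20.0) -> float:
    """Convert a time in units of 2*Ne generations to years."""
    return t * 2.0 * ne * generation_years

# TMRCA under theta_ML (Ne = 21,900): about 910 Kyr
print(coalescent_units_to_years(1.04, 21_900) / 1000.0)
# TMRCA under theta_pi (Ne = 8,500): about 530 Kyr
print(coalescent_units_to_years(1.55, 8_500) / 1000.0)
# Age of mutation 9 under theta_ML: about 370 Kyr
print(coalescent_units_to_years(0.42, 21_900) / 1000.0)
```

Because the years-per-unit factor scales with Ne, a larger TMRCA in coalescence units (1.55 under θπ) can still translate into a shallower tree in years than a smaller one (1.04 under θML).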



FIG. 4. Gene trees of SR-BI/II haplotypes under growth (left time scale) and constant population size (right time scale). The mutations (table 4) are indicated on the branches (filled and open symbols for the constant and the growing population model, respectively) at distances proportional to their estimated average age in coalescent units of 2Ne generations, where Ne is the effective population size over the population history (right scale) or the present-day size Ne(0) under the population growth model (left scale). The corresponding haplotypes (see table 3), as well as their counts, are indicated below the tree. Twelve (of 178) chromosomes with haplotypes resulting from recombinations were excluded from the analysis.

 
In fact, genetree provides a better representation of the data under a growth scenario (see table 2 for joint estimates of β and θML) than under a constant population size model. This is seen in the world sample, as well as in Africa and East Asia, where a star-like phylogeny is suggested by the shallow shape of the tree, with all but one of the intermediary haplotypes present (fig. 2), and most of the mutations (and recombinations) positioned at the tip of the branches (Donnelly 1996; Harpending et al. 1998). Although the heights of the resulting trees (in 2Ne(0) units) are smaller for a growth model (fig. 4), the difference between TMRCAs in years is less dramatic because the jointly estimated θML values are also larger, as are the resulting Ne values. Essentially identical TMRCAs of 680 Kyr, 684 Kyr, and 734 Kyr were evaluated for the world, Africa, and East Asia, respectively. The ages of mutation 9 (415 Kyr) and all other polymorphisms, when evaluated in the world sample, were slightly higher than under a constant size model (fig. 4, open symbols).


    Discussion
In this study, 9 of 14 single-nucleotide polymorphisms (SNPs) found are previously unreported; five variants in the coding region caused amino acid replacements. The nature of these substitutions and their interspecies comparison suggest that some of these polymorphisms, and particularly at amino acid positions 484 and 499, might indeed modify protein structure, and thus, affect its function. In addition, a correlation was reported between the serine allele (variant G2S), an increased level of HDL and a decreased concentration of LDL cholesterol (Acton et al. 1999). We found this G2S polymorphism in all populations except for East Asia, consistent with the findings reported by Hong et al. (2002). Therefore, a possible role of this polymorphism in cholesterol-related disorders needs to be investigated further in appropriate cohorts. In contrast, V135I which is not expected to have any significant consequence on protein structure showed no association with plasma lipid and circulating lipoprotein levels, or with the body mass index (Acton et al. 1999). In this respect, G to S and S to G replacements, although less frequent than V to I replacements, represent relatively "permissive" substitutions. Still, both G2 and S229 were found to be conserved in three mammalian orders examined, which may explain the possible functional importance and the effect of G2S polymorphism above as observed by Acton et al. (1999). The importance of S229 may be ascribed to its sequence context. Because the amino acid sequence at positions 227–230 (NIR) defines the consensus sequence for N-glycosylation, G229 renders this putative glycosylation site nonfunctional. The absence of glycosylation in the predicted extracellular domain of the SR-BI/II protein might affect the selective uptake of HDL cholesteryl ester as well as the bi-directional free cholesterol flux. This relationship is to be examined in the East Asian population, where the G229 allele is the most prevalent.

The silent polymorphism C1193T in exon 8, which is the oldest SNP and effectively partitions the SR-BI/II variability almost in half (table 4), is associated with low plasma LDL cholesterol in healthy Caucasian women (Acton et al. 1999) as well as with high plasma HDL cholesterol and a lower risk of coronary artery disease in Koreans (Hong et al. 2002). As a result, all functional variability that arose on the background of H1 and H2 will be in linkage disequilibrium with the 1193T allele and the C allele, respectively. Hence, these associations do not necessarily indicate functional significance of this apparently silent T to C polymorphism, but rather point to its linkage with other polymorphisms such as the observed amino acid substitutions that are more likely to affect the protein function. Knowledge of the haplotype network can therefore assist in the design of an epidemiological survey (i.e., how to choose informative markers for genotyping) and in the subsequent functional interpretation of the underlying variation.

The nucleotide diversity of SR-BI/II is similar to that of other genomic segments, particularly coding or expressed regions (Cargill et al. 1999; Fan et al. 2002; Nakajima et al. 2002; Schneider et al. 2003). We observed two common haplotypes (H1 and H2) that dominate over a flat distribution of rare haplotypes due to novel mutations or recent recombinations. However, the total number of 19 haplotypes appears low for a gene spread over 86 kb on the physical map and 0.26 cM on the genetic map (Kong et al. 2002). With a recombination rate of 0.0026 (1/385) per gene segment per generation, a significant fraction of which is informative (estimated at 17% in the world sample; data not shown; see appendix in Zietkiewicz et al. 2003), a greater number of recombinant SR-BI/II haplotypes would be expected given the depth of the coalescence tree. In contrast, the observed recombinants can be considered relatively young given their frequency (fig. 2).
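The recombination figures quoted above follow from the map distances by simple arithmetic, sketched below in Python. The function names are illustrative, and the conversion of centimorgans to a per-generation crossover probability assumes the linear approximation that holds for short distances such as this one.

```python
def centimorgans_to_fraction(cm: float) -> float:
    """Approximate per-generation crossover probability for a short interval."""
    return cm / 100.0

def cm_per_megabase(cm: float, length_kb: float) -> float:
    """Local recombination rate in cM per Mb."""
    return cm / (length_kb / 1000.0)

r = centimorgans_to_fraction(0.26)   # 0.0026 per gene segment per generation
print(r, round(1.0 / r))             # roughly one crossover per 385 meioses
print(cm_per_megabase(0.26, 86.0))   # about 3.0 cM per Mb over the 86-kb gene
```

The resulting local rate of about 3 cM per Mb is consistent with the value of about 3.1% per Mb per generation given in Materials and Methods.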

Another particularity is the geographic distribution of the genetic diversity of the SR-BI/II locus. Nucleotide diversity among Asians is equal to or even exceeds that in Africa (0.059% and 0.044%, respectively), an observation rarely reported in other loci (Harding et al. 1997; Jaruzelska et al. 1999). At the haplotype level, both Africa and East Asia are significantly more variable than the other continental samples (table 2). All this could suggest that African and Asian lineages were similarly ancestral to present-day populations. The estimated ages of mutations 5 and 13, East Asian and African specific, respectively, exceed 200 Kyr, and, if taken literally, would indicate an unusually ancient split between the East Asian and African lineages (fig. 4). Continentally restricted recombinant haplotypes H15 and H9 further emphasize isolation of these continental groups. Genetic separation between Africa and East Asia is also reflected in the FST analysis (fig. 3). Even if the genetic history of SR-BI/II followed a trajectory different from that predicted by the recent out-of-Africa model (e.g., Harding et al. 1997; Hammer et al. 1998; Cruciani et al. 2002), one would still expect to observe a recent African influence in East Asia from the Upper Palaeolithic to the present day (Satta and Takahata 2002). In contrast, the deep split between East Asia and Africa seen in our data may actually be as recent as the late Middle or Upper Palaeolithic, and for several reasons.

First, the haplotypes observed here form a tight network with no missing intermediate forms (fig. 2). This could indicate that genetic drift played only a small role in shaping SR-BI/II genetic diversity as observed today (see, e.g., Beheregaray et al. 2003). Indeed, over time, and in a finite population, new haplotypes are created while others become extinct. Thus, at a given historical moment, those haplotypes that have arisen through multiple mutation and recombination events represent only a subset of all allelic combinations, as is seen in the PDHA1 locus (Harris and Hey 1999), in beta-globin (Harding et al. 1997), dystrophin (Labuda, Zietkiewicz, and Yotova 2000), CYP1A2 (Wooding et al. 2002), subterminal 16p (Alonso and Armour 2001), or APOE (Fullerton et al. 2000). Tight networks have been observed (1) in very short genomic segments (Jaruzelska et al. 1999; Jin et al. 1999), (2) in segments of low recombination rate (Kaessmann et al. 1999), or (3) in segments under selection (Smirnova et al. 2001); because SR-BI/II extends over 86 kb and has a recombination rate more than threefold the genomic average, explanations (1) and (2) hardly apply. Second, when evaluated through a simple model of constant population size, the overall time depth of the history of SR-BI/II (2.08 Ne generations; fig. 4) is about half that expected (4 Ne) for a neutrally evolving autosomal locus. The SR-BI/II tree appears even shallower than those of X-chromosome loci (Harris and Hey 1999; Jaruzelska et al. 1999; Verrelli et al. 2002). Third, population expansion provides a better description of the data than a constant population size model, as seen in the coalescent analysis and in statistics such as Fu's Fs and Tajima's D (table 2). However, it is noteworthy that slowly accumulating nucleotide variation in nuclear loci has only rarely provided evidence of expansion (Thomson et al. 2000; Alonso and Armour 2001), as the expansion period (Klein 1999) was relatively recent and presumably too short on an evolutionary scale for new variants to accumulate. Thus, the evidence of population expansion in the coalescent analysis may in fact be due to selection, as supported by Tajima's D and Fu's Fs. Fourth, there is a striking absence of significant nucleotide variability in SR-BI/II that is ancestral to the divergence of continental populations, as well as a shortage of haplotype diversity due to recombination, given the spread of genetic distances within the locus. Selection is therefore plausible, and if this is the case, the time estimates based on the neutral model are certainly misleading. Considering all findings and in comparison with other similarly analyzed loci, which were "shaped" by the same demogenetic processes, it is conceivable that selection affected the genetic history of the SR-BI/II locus in the lineage leading to modern humans. In summary, we propose that rather than supporting an ancient split between Africa and Asia, our data can be interpreted assuming that the SR-BI/II variability is relatively young and accrued during the period following the recent out-of-Africa expansion (Cavalli-Sforza, Menozzi, and Piazza 1994). A selective sweep at the origin of modern humans could have been responsible for the disappearance of its ancestral diversity. Because the local recombination rate is relatively high, it is possible that SR-BI/II itself was the target of selection rather than being affected by genetic hitchhiking resulting from selection on a linked gene.
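
As a rough numerical illustration of the second and third arguments above, the following Python sketch compares the inferred time depth with the neutral expectation and computes Tajima's D from summary statistics. It is a minimal sketch under stated assumptions, not the analysis performed in this study: the function names are hypothetical, the sample size of 178 chromosomes is taken from the abstract and may differ from the sample underlying fig. 4, and the expectation 4 Ne (1 - 1/n) assumes the standard neutral coalescent with constant population size.

```python
import math


def expected_tmrca_in_ne(n):
    """Expected time to the most recent common ancestor of n sampled
    chromosomes, in units of Ne generations, for a diploid autosomal locus
    under the standard neutral coalescent with constant population size."""
    if n < 2:
        raise ValueError("need at least two chromosomes")
    return 4.0 * (1.0 - 1.0 / n)


def tajimas_d(n, seg_sites, pi):
    """Tajima's D (Tajima 1989).

    n         -- number of sampled chromosomes (at least 4)
    seg_sites -- number of segregating sites S (at least 1)
    pi        -- mean number of pairwise differences for the whole
                 segment (not per site)
    """
    if n < 4 or seg_sites < 1:
        raise ValueError("need n >= 4 and at least one segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    # D contrasts two estimators of theta: pi and S / a1.
    return (pi - seg_sites / a1) / math.sqrt(variance)


# Assumption: n = 178 chromosomes, as stated in the abstract.
neutral_depth = expected_tmrca_in_ne(178)
# 2.08 Ne generations is the time depth reported in the text (fig. 4).
depth_ratio = 2.08 / neutral_depth
```

With n = 178 the neutral expectation is close to 4 Ne generations, so the inferred 2.08 Ne corresponds to roughly half of it, in line with the text. For `tajimas_d`, a negative value indicates an excess of rare variants relative to neutral, constant-size expectations, a pattern that both population expansion and a selective sweep can produce, which is why the statistic alone cannot separate the two explanations.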


    Acknowledgements
 
We are grateful to all individuals who kindly consented to provide DNA for this study. We thank Patrick Beaulieu, Dominik Gehl, and Vania Yotova for their help; Alan Lovell for the English corrections; Dominika Kozubska for her secretarial assistance; and our reviewers for their very helpful comments. D.S. is a scholar of the Fonds de la Recherche en Santé du Québec. This work was supported by the Fondation de Hôpital Ste-Justine and in part by Valorisation Recherche Québec and the Canadian Institutes of Health Research grant (MOP-12782 to D.L.).


    Footnotes
 
Our polymorphisms were deposited at the National Center for Biotechnology Information (NCBI) and received NCBI assay IDs ss14660101 to ss14660106 and ss14660108 to ss14660140.

David B. Goldstein, Associate Editor


    Literature Cited
 

    Acton, S., D. Osgood, and M. Donoghue, et al. (11 co-authors). 1999. Association of polymorphisms at the SR-BI gene locus with plasma lipid levels and body mass index in a white population. Arterioscler. Thromb. Vasc. Biol. 19:1734-1743.

    Acton, S., A. Rigotti, K. T. Landschulz, S. Xu, H. H. Hobbs, and M. Krieger. 1996. Identification of scavenger receptor SR-BI as a high density lipoprotein receptor. Science 271:518-520.

    Acton, S. L., P. E. Scherer, H. F. Lodish, and M. Krieger. 1994. Expression cloning of SR-BI, a CD36-related class B scavenger receptor. J. Biol. Chem. 269:21003-21009.

    Alonso, S., and J. A. Armour. 2001. A highly variable segment of human subterminal 16p reveals a history of population growth for modern humans outside Africa. Proc. Natl. Acad. Sci. USA 98:864-869.

    Azhar, S., and E. Reaven. 2002. Scavenger receptor class BI and selective cholesteryl ester uptake: partners in the regulation of steroidogenesis. Mol. Cell. Endocrinol. 195:1-26.

    Barbujani, G., A. Magagni, E. Minch, and L. L. Cavalli-Sforza. 1997. An apportionment of human DNA diversity. Proc. Natl. Acad. Sci. USA 94:4516-4519.

    Beheregaray, L. B., C. Ciofi, D. Geist, J. P. Gibbs, A. Caccone, and J. R. Powell. 2003. Genes record a prehistoric volcano eruption in the Galapagos. Science 302:75.

    Braun, A., B. L. Trigatti, M. J. Post, K. Sato, M. Simons, J. M. Edelberg, R. D. Rosenberg, M. Schrenzel, and M. Krieger. 2002. Loss of SR-BI expression leads to the early onset of occlusive atherosclerotic coronary artery disease, spontaneous myocardial infarctions, severe cardiac dysfunction, and premature death in apolipoprotein E–deficient mice. Circ. Res. 90:270-276.

    Calvo, D., D. Gomez-Coronado, M. A. Lasuncion, and M. A. Vega. 1997. CLA-1 is an 85-kD plasma membrane glycoprotein that acts as a high-affinity receptor for both native (HDL, LDL, and VLDL) and modified (OxLDL and AcLDL) lipoproteins. Arterioscler. Thromb. Vasc. Biol. 17:2341-2349.

    Cao, G., C. K. Garcia, K. L. Wyne, R. A. Schultz, K. L. Parker, and H. H. Hobbs. 1997. Structure and localization of the human gene encoding SR-BI/CLA-1. Evidence for transcriptional control by steroidogenic factor 1. J. Biol. Chem. 272:33068-33076.

    Cargill, M., D. Altshuler, and J. Ireland, et al. (16 co-authors). 1999. Characterization of single-nucleotide polymorphisms in coding regions of human genes. Nat. Genet. 22:231-238.

    Cavalli-Sforza, L. L., P. Menozzi, and A. Piazza. 1994. The history and geography of human genes. Princeton University Press, Princeton, N.J.

    Clark, A. G., K. M. Weiss, and D. A. Nickerson, et al. (11 co-authors). 1998. Haplotype structure and population genetic inferences from nucleotide-sequence variation in human lipoprotein lipase. Am. J. Hum. Genet. 63:595-612.

    Cruciani, F., P. Santolamazza, and P. Shen, et al. (11 co-authors). 2002. A back migration from Asia to sub-Saharan Africa is supported by high-resolution analysis of human Y-chromosome haplotypes. Am. J. Hum. Genet. 70:1197-1214.

    Donnelly, P. 1996. Interpreting genetic variability: the effects of shared evolutionary history. Pp. 25–50 in D. Chadwick and G. Cardew, eds. Variation in the human genome. John Wiley & Sons, Chichester, U.K.

    Fan, J. B., D. Gehl, and L. Hsie, et al. (11 co-authors). 2002. Assessing DNA sequence variations in human ESTs in a phylogenetic context using high-density oligonucleotide arrays. Genomics 80:351-360.

    Felsenstein, J. 2004. Inferring phylogenies. Sinauer Associates, Sunderland, Mass.

    Fu, Y. X. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915-925.

    Fukasawa, M., H. Adachi, K. Hirota, M. Tsujimoto, H. Arai, and K. Inoue. 1996. SRB1, a class B scavenger receptor, recognizes both negatively charged liposomes and apoptotic cells. Exp. Cell Res. 222:246-250.

    Fullerton, S. M., A. G. Clark, and K. M. Weiss, et al. (11 co-authors). 2000. Apolipoprotein E variation at the sequence haplotype level: implications for the origin and maintenance of a major human polymorphism. Am. J. Hum. Genet. 67:881-900.

    Fullerton, S. M., A. G. Clark, K. M. Weiss, S. L. Taylor, J. H. Stengard, V. Salomaa, E. Boerwinkle, and D. A. Nickerson. 2002. Sequence polymorphism at the human apolipoprotein AII gene (APOA2): unexpected deficit of variation in an African-American sample. Hum. Genet. 111:75-87.

    Graf, G. A., K. L. Roswell, and E. J. Smart. 2001. 17-beta-estradiol promotes the up-regulation of SR-BII in HepG2 cells and in rat livers. J. Lipid Res. 42:1444-1449.

    Griffiths, R. C., and S. Tavare. 1994. Sampling theory for neutral alleles in a varying environment. Phil. Trans. R. Soc. Lond. Ser. B Biol. Sci. 344:403-410.

    Hammer, M. F., T. Karafet, A. Rasanayagam, E. T. Wood, T. K. Altheide, T. Jenkins, R. C. Griffiths, A. R. Templeton, and S. L. Zegura. 1998. Out of Africa and back again: nested cladistic analysis of human Y chromosome variation. Mol. Biol. Evol. 15:427-441.

    Harding, R. M., S. M. Fullerton, R. C. Griffiths, J. Bond, M. J. Cox, J. A. Schneider, D. S. Moulin, and J. B. Clegg. 1997. Archaic African and Asian lineages in the genetic ancestry of modern humans. Am. J. Hum. Genet. 60:772-789.

    Harpending, H. C., M. A. Batzer, M. Gurven, L. B. Jorde, A. R. Rogers, and S. T. Sherry. 1998. Genetic traces of ancient demography. Proc. Natl. Acad. Sci. USA 95:1961-1967.

    Harris, E. E., and J. Hey. 1999. X chromosome evidence for ancient human histories. Proc. Natl. Acad. Sci. USA 96:3320-3324.

    Hatzopoulos, A. K., A. Rigotti, R. D. Rosenberg, and M. Krieger. 1998. Temporal and spatial pattern of expression of the HDL receptor SR-BI during murine embryogenesis. J. Lipid Res. 39:495-508.

    Hong, S. H., Y. R. Kim, Y. M. Yoon, W. K. Min, S. I. Chun, and J. Q. Kim. 2002. Association between HaeIII polymorphism of scavenger receptor class B type I gene and plasma HDL-cholesterol concentration. Ann. Clin. Biochem. 39:478-481.

    Jaruzelska, J., E. Zietkiewicz, M. Batzer, D. E. Cole, J. P. Moisan, R. Scozzari, S. Tavare, and D. Labuda. 1999. Spatial and temporal distribution of the neutral polymorphisms in the last ZFX intron: analysis of the haplotype structure and genealogy. Genetics 152:1091-1101.

    Ji, Y., B. Jian, N. Wang, Y. Sun, M. L. Moya, M. C. Phillips, G. H. Rothblat, J. B. Swaney, and A. R. Tall. 1997. Scavenger receptor BI promotes high density lipoprotein–mediated cellular cholesterol efflux. J. Biol. Chem. 272:20982-20985.

    Jian, B., M. de la Llera-Moya, Y. Ji, N. Wang, M. C. Phillips, J. B. Swaney, A. R. Tall, and G. H. Rothblat. 1998. Scavenger receptor class B type I as a mediator of cellular cholesterol efflux to lipoproteins and phospholipid acceptors. J. Biol. Chem. 273:5599-5606.

    Jin, L., P. A. Underhill, V. Doctor, R. W. Davis, P. Shen, L. L. Cavalli-Sforza, and P. J. Oefner. 1999. Distribution of haplotypes from a chromosome 21 region distinguishes multiple prehistoric human migrations. Proc. Natl. Acad. Sci. USA 96:3796-3800.

    Jorde, L. B., W. S. Watkins, M. J. Bamshad, M. E. Dixon, C. E. Ricker, M. T. Seielstad, and M. A. Batzer. 2000. The distribution of human genetic diversity: a comparison of mitochondrial, autosomal, and Y-chromosome data. Am. J. Hum. Genet. 66:979-988.

    Kaessmann, H., F. Heissig, A. von Haeseler, and S. Paabo. 1999. DNA sequence variation in a non-coding region of low recombination on the human X chromosome. Nat. Genet. 22:78-81.

    Klein, R. G. 1999. The human career. Human biological and cultural origins. The University of Chicago Press, Chicago.

    Kong, A., D. F. Gudbjartsson, J. Sainz, G. M. Jonsdottir, S. A. Gudjonsson, B. Richardsson, S. Sigurdardottir, J. Barnard, B. Hallbeck, and G. Masson. 2002. A high-resolution recombination map of the human genome. Nat. Genet. 31:241-247.

    Koop, B. F., D. A. Tagle, M. Goodman, and J. L. Slightom. 1989. A molecular view of primate phylogeny and important systematic and evolutionary questions. Mol. Biol. Evol. 6:580-612.

    Kozarsky, K. F., M. H. Donahee, A. Rigotti, S. N. Iqbal, E. R. Edelman, and M. Krieger. 1997. Overexpression of the HDL receptor SR-BI alters plasma HDL and bile cholesterol levels. Nature 387:414-417.

    Labuda, D., E. Zietkiewicz, and V. Yotova. 2000. Archaic lineages in the history of modern humans. Genetics 156:799-808.

    Li, W. H., and L. A. Sadler. 1991. Low nucleotide diversity in man. Genetics 129:513-523.

    Murao, K., V. Terpstra, S. R. Green, N. Kondratenko, D. Steinberg, and O. Quehenberger. 1997. Characterization of CLA-1, a human homologue of rodent scavenger receptor BI, as a receptor for high density lipoprotein and apoptotic thymocytes. J. Biol. Chem. 272:17551-17557.

    Nakajima, T., L. B. Jorde, T. Ishigami, S. Umemura, M. Emi, J. M. Lalouel, and I. Inoue. 2002. Nucleotide diversity and haplotype structure of the human angiotensinogen gene in two populations. Am. J. Hum. Genet. 70:108-123.

    Orita, M., H. Iwahana, H. Kanazawa, K. Hayashi, and T. Sekiya. 1989. Detection of polymorphisms of human DNA by gel electrophoresis as single-strand conformation polymorphisms. Proc. Natl. Acad. Sci. USA 86:2766-2770.

    Ramos-Onsins, S. E., and J. Rozas. 2002. Statistical properties of new neutrality tests against population growth. Mol. Biol. Evol. 19:2092-2100.

    Rigotti, A., B. L. Trigatti, M. Penman, H. Rayburn, J. Herz, and M. Krieger. 1997. A targeted mutation in the murine gene encoding the high density lipoprotein (HDL) receptor scavenger receptor class B type I reveals its key role in HDL metabolism. Proc. Natl. Acad. Sci. USA 94:12610-12615.

    Satta, Y., and N. Takahata. 2002. Out of Africa with regional interbreeding? Modern human origins. Bioessays 24:871-875.

    Schneider, J. A., M. S. Pungliya, J. Y. Choi, R. Jiang, X. J. Sun, B. A. Salisbury, and J. C. Stephens. 2003. DNA variability of human genes. Mech. Ageing Dev. 124:17-25.

    Schneider, S., D. Roessli, and L. Excoffier. 2000. Arlequin: a software for population genetics data analysis. Genetics and Biometry Laboratory, Department of Anthropology, University of Geneva, Switzerland.

    Shiratsuchi, A., Y. Kawasaki, M. Ikemoto, H. Arai, and Y. Nakanishi. 1999. Role of class B scavenger receptor type I in phagocytosis of apoptotic rat spermatogenic cells by Sertoli cells. J. Biol. Chem. 274:5901-5908.

    Smirnova, I., M. T. Hamblin, C. McBride, B. Beutler, and A. Di Rienzo. 2001. Excess of rare amino acid polymorphisms in the Toll-like receptor 4 in humans. Genetics 158:1657-1664.

    Smith, J. L., S. R. Lear, T. M. Forte, W. Ko, M. Massimi, and S. K. Erickson. 1998. Effect of pregnancy and lactation on lipoprotein and cholesterol metabolism in the rat. J. Lipid Res. 39:2237-2249.

    Stephens, M., N. J. Smith, and P. Donnelly. 2001. A new statistical method for haplotype reconstruction from population data. Am. J. Hum. Genet. 68:978-989.

    Tajima, F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585-595.

    Thomson, R., J. K. Pritchard, P. Shen, P. J. Oefner, and M. W. Feldman. 2000. Recent common ancestry of human Y chromosomes: evidence from DNA sequence data. Proc. Natl. Acad. Sci. USA 97:7360-7365.

    Varban, M. L., F. Rinninger, and N. Wang, et al. (12 co-authors). 1998. Targeted mutation reveals a central role for SR-BI in hepatic selective uptake of high density lipoprotein cholesterol. Proc. Natl. Acad. Sci. USA 95:4619-4624.

    Verrelli, B. C., J. H. McDonald, G. Argyropoulos, G. Destro-Bisol, A. Froment, A. Drousiotou, G. Lefranc, A. N. Helal, J. Loiselet, and S. A. Tishkoff. 2002. Evidence for balancing selection from nucleotide sequence analyses of human G6PD. Am. J. Hum. Genet. 71:1112-1128.

    Webb, N. R., P. M. Connell, G. A. Graf, E. J. Smart, W. J. de Villiers, F. C. de Beer, and D. R. van der Westhuyzen. 1998. SR-BII, an isoform of the scavenger receptor BI containing an alternate cytoplasmic tail, mediates lipid transfer between high density lipoprotein and cells. J. Biol. Chem. 273:15241-15248.

    Webb, N. R., W. J. de Villiers, P. M. Connell, F. C. de Beer, and D. R. van der Westhuyzen. 1997. Alternative forms of the scavenger receptor BI (SR-BI). J. Lipid Res. 38:1490-1495.

    Weir, B. S., and C. C. Cockerham. 1984. Estimating F-statistics for the analysis of population structure. Evolution 38:1358-1370.

    Werder, M., C. H. Han, E. Wehrli, D. Bimmler, G. Schulthess, and H. Hauser. 2001. Role of scavenger receptors SR-BI and CD36 in selective sterol uptake in the small intestine. Biochemistry 40:11643-11650.

    Wooding, S. P., W. S. Watkins, M. J. Bamshad, D. M. Dunn, R. B. Weiss, and L. B. Jorde. 2002. DNA sequence variation in a 3.7-kb noncoding sequence 5' of the CYP1A2 gene: implications for human population history and natural selection. Am. J. Hum. Genet. 71:528-542.

    Wright, S. 1951. The genetical structure of populations. Ann. Eugen. 15:323-354.

    Yi, S., D. L. Ellsworth, and W. H. Li. 2002. Slow molecular clocks in Old World monkeys, apes, and humans. Mol. Biol. Evol. 19:2191-2198.

    Zietkiewicz, E., V. Yotova, and D. Gehl, et al. (13 co-authors). 2003. Haplotypes in the dystrophin DNA segment point to a mosaic origin of modern humans' diversity. Am. J. Hum. Genet. 73:994-1015.

Accepted for publication December 5, 2003.




