The Complex History of a Gene Proposed to Participate in a Sexual Isolation Mechanism in House Mice

Robert C. Karn, Annie Orth, François Bonhomme and Pierre Boursot

*Department of Biological Sciences, Butler University;
†Laboratoire Génome Populations Interactions, CNRS UMR5000, Université Montpellier II, France


    Abstract
 
Previous behavioral experiments showed that mouse salivary androgen–binding protein (ABP) was involved in interindividual recognition and might play a role in sexual isolation between house mouse (Mus musculus) subspecies. The pattern of evolution of Abpa, the gene for the alpha subunit of ABP, was found to be consistent with this hypothesis. Abpa apparently diverged rapidly between species and subspecies with a large excess of nonsynonymous substitutions, a lack of exon polymorphism within each of the three subspecies, and a lack of intron polymorphism in the one subspecies studied (M. musculus domesticus). Here we characterized the intron and exon sequence variations of this gene in house mouse populations from central Eurasia, a region yet unsampled and thought to be close to the cradle of the radiation of the subspecies. We also determined the intron and exon sequences in seven other species of the genus Mus. We confirmed the general pattern of rapid evolution by essentially nonsynonymous substitutions, both inter- and intraspecifically, supporting the idea that Darwinian selection has driven the evolution of this gene. We also observed a uniform intron sequence in five samples of M. musculus musculus, suggesting that a selective sweep might have occurred for that allele. In contrast to previous results, however, we found extensive intron and exon polymorphism in some house mouse populations from central Eurasia. We also found evidence for secondary admixture of the subspecies-specific alleles in regions of transition between the subspecies in central Eurasia. Furthermore, an abnormal intron phylogeny suggested that interspecific exchanges had occurred between the house mouse subspecies and three other Palearctic species. These observations appear to be at variance with the simple hypothesis that Abpa is involved in reproductive isolation. Although we do not rule out a role in recognition, the situation appears to be more complex than previously thought. 
Thus the selective mechanism behind the evolution of Abpa remains to be resolved, and we suggest that it may have changed during the recent colonization history of the house mouse.


    Introduction
 
The house mouse, Mus musculus, comprises at least three relatively distinct parapatric gene pools given subspecies status by some and full species status by others (for reviews see Boursot et al. 1993; Sage, Atchley, and Capanna 1993). Over the past 10 years, evidence has been amassed to suggest that mouse salivary androgen–binding protein (ABP) may play a role as a signal involved in a prezygotic isolation mechanism (Karn and Dlouhy 1991; Laukaitis, Critser, and Karn 1997; Talley, Laukaitis, and Karn 2001) that could have been involved in the original divergence of these gene pools or in the maintenance of their integrity where subspecies made secondary contacts (or in both). The Abpa gene encodes one of the subunits of this small dimeric protein (Dlouhy, Taylor, and Karn 1987), and has three different alleles, each fixed in a different M. musculus subspecies: Abpa^a in M. musculus domesticus, Abpa^b in M. musculus musculus, and Abpa^c in M. musculus castaneus (Karn and Dlouhy 1991). Behavioral experiments (Laukaitis, Critser, and Karn 1997) and odor preference tests (Talley, Laukaitis, and Karn 2001) have demonstrated assortative choice correlated with the ABP genotype of the test subjects, and it has been shown that the recognition signal was present in the saliva (where ABP is excreted) but not in the urine (where it is not; Dlouhy, Nichols, and Karn 1986).

Molecular studies of Abpa have been consistent with the emerging picture of its role in sexual isolation. The elevated Ka:Ks ratio between the alleles, and the alternative fixation of different alleles in each subspecies (Hwang et al. 1997), suggest a mechanism of fixation of new variants by selective sweeps. Karn and Nachman (1999) produced supporting evidence for such directional selection by showing that in M. musculus domesticus Abpa displayed significantly reduced intron polymorphism in comparison with two X-linked genes.

The aforementioned subspecies have been defined at the periphery of the distribution range of the species: M. musculus domesticus from the Near East to western Europe through the Mediterranean Basin, M. musculus musculus from eastern Europe to northern China, north of the Himalayas, and M. musculus castaneus in South-East Asia. However, the situation in the central part of the range (from the Caucasus to eastern India, south of the Himalayas) is more complex. Populations from the northern Indian subcontinent display much more polymorphism than the peripheral subspecies for both nuclear protein genes (Din et al. 1996) and mtDNA (Boursot et al. 1996), and the authors cited have proposed that this is because they are closer to the cradle of the species, from which the recent radiation leading to the peripheral subspecies started. Furthermore, some areas of central Eurasia bordering the range of the peripheral subspecies have been shown to be regions of genetic admixture, for instance between M. musculus domesticus and M. musculus musculus south of the Caucasus (Orth et al. 1996; Mezhzherin, Kotenkova, and Mikhailenko 1998) and between M. musculus musculus and the polymorphic central populations in northern Iran (Boursot et al. 1996; Din et al. 1996).

We wished to know whether the peculiar pattern of rapid Abpa evolution and absence of polymorphism seen in the peripheral subspecies also prevailed in this central and more complex part of the range of the species. To further study the extent to which selection may have been involved in the evolution of Abpa, we also studied intron polymorphism in representatives of the major protein alleles and intron phylogeny between eight species of the genus Mus.


    Materials and Methods
 
Materials
Genomic DNA was prepared from mouse tissues in the Laboratory Génome Populations Interactions (LGPI), University Montpellier II, France. The DNA samples were provided to R.C.K., who amplified and sequenced segments of Abpa. For taxonomic identification of the species Mus famulus, the reader is referred to sample reference vouchers deposited at MVZ-Berkeley 182095 to 182992 and MNHN-Paris CG-1995-1315 to 1995-1320, under the taxon name Mus famulus Bonhote, 1898. Wild-derived strains ZYP (Mus spicilegus, Yugoslavia), XBS (Mus macedonicus, Bulgaria), and COK (Mus cookii, Thailand) are maintained in the mouse genetic repository in Montpellier. Other specimens used were from the wild and drawn from the DNA collection of LGPI: Mus cervicolor, individual 16162 from Khorat, Thailand; M. famulus, individual 10659 from the Nilgiri Hills, India; and for M. musculus, the samples listed in table 1. We wish to thank P. Delattre and J. P. Quéré for providing the sample from China.


 
Table 1 Distribution of Abpa Alleles Among Eurasian Mice

 
Polymerase Chain Reaction (PCR) and DNA Sequencing
PCR products were obtained from genomic DNA as described in Hwang et al. (1997) and sequenced directly using manual and automated sequencing as described previously (Karn and Nachman 1999). The Abpa gene comprises two exons separated by a 785-bp intron. The two exons encode 61 and 9 amino acids, respectively. Protein polymorphism was determined in a geographically diverse collection of individuals, listed in table 1, by sequencing the coding region of the first exon (183 bp). A 998-bp fragment including the intron and the coding part of the two exons (up to and including the stop codon) was also sequenced in a subset of these individuals as well as in several species of the genus Mus.

Data Analysis
Sequences were aligned with the DNASIS program (Hitachi). The program MEGA version 2.1 (S. Kumar et al., personal communication) was used to calculate numbers of synonymous and nonsynonymous substitutions as well as their rates (Ka and Ks), using the modified Nei and Gojobori (1986) method, with Jukes-Cantor correction for multiple hits and a transition-transversion ratio of 2. This program also served to generate neighbor-joining trees and the node supports by bootstrap resampling. Population genetics statistics and coalescent simulations were performed using DnaSP version 3 (Rozas and Rozas 1999).
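As an illustration of the Jukes-Cantor correction step only, the following minimal sketch converts observed proportions of differences into corrected rates and forms the Ka:Ks ratio. It is not the MEGA implementation, and it does not reproduce the site counting of the modified Nei and Gojobori method: the function names and the numbers of sites and differences in the example call are hypothetical values chosen for illustration.

```python
import math
from typing import Optional, Tuple


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance for an observed proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ka_ks(nonsyn_diffs: float, nonsyn_sites: float,
          syn_diffs: float, syn_sites: float) -> Tuple[float, float, Optional[float]]:
    """Return (Ka, Ks, Ka:Ks); the ratio is None when Ks is zero (undefined)."""
    ka = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ks = jukes_cantor(syn_diffs / syn_sites)
    ratio = ka / ks if ks > 0 else None
    return ka, ks, ratio


# Hypothetical counts for a 183-bp exon (NOT taken from the paper):
# 6 nonsynonymous differences over 140 nonsynonymous sites,
# 1 synonymous difference over 43 synonymous sites.
ka, ks, ratio = ka_ks(6, 140.0, 1, 43.0)
print(f"Ka = {ka:.4f}, Ks = {ks:.4f}, Ka:Ks = {ratio:.2f}")
```

Returning None when Ks is zero keeps comparisons with no synonymous differences from being reported as an infinite or zero ratio.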


    Results
 
Intraspecific Coding Sequence Variation in M. musculus
We sequenced the coding part of the first exon of the Abpa gene (61 codons out of the 70 in the region coding for the secreted protein) in the 68 individuals of various origins listed in table 1. A majority (51) of the sequenced mice appeared homozygous for one of the a, b, and c alleles originally described by Hwang et al. (1997). Six mice from a single locality appeared homozygous for a new allele (allele h in table 1). Four were heterozygous at a single nucleotide position, thus allowing haplotype reconstruction, which led to the definition of four new haplotypes, found heterozygous with either the c allele (alleles d, e, and f) or the h allele (allele i). The remaining seven individuals appeared heterozygous at several nucleotide positions, and we tried to explain their genotype by combinations of haplotypes found homozygous in other individuals, thus minimizing the total number of haplotypes necessary to explain the data. It was thus necessary to infer only one new haplotype (g), found heterozygous with a c allele. Overall, the genotypes of the 68 mice could thus be explained by combinations of a minimum total number of nine haplotypes, named a–i, the inferred sequences of which are shown in figure 1.
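The haplotype inference described above can be sketched as an exhaustive search: enumerate every possible phasing of each multiply heterozygous genotype and keep the combination that requires the fewest distinct haplotypes overall. The code below is a minimal illustration under that assumption, not the procedure actually used for the data; the function names and the three toy genotypes are hypothetical.

```python
from itertools import product


def phasings(genotype):
    """Yield each distinct haplotype pair for an unphased genotype.

    genotype: list of 2-character strings, one per site, e.g. ["AG", "CT", "GG"].
    """
    het = [i for i, g in enumerate(genotype) if g[0] != g[1]]
    # The first heterozygous site keeps a fixed orientation so that
    # mirror-image pairs are not counted twice.
    free = het[1:]
    for flips in product((False, True), repeat=len(free)):
        flip_at = dict(zip(free, flips))
        h1, h2 = [], []
        for i, g in enumerate(genotype):
            a, b = (g[1], g[0]) if flip_at.get(i, False) else (g[0], g[1])
            h1.append(a)
            h2.append(b)
        yield "".join(h1), "".join(h2)


def minimal_haplotype_set(genotypes):
    """Return (haplotype set, phasing per individual) minimizing the number of haplotypes."""
    names = list(genotypes)
    options = [list(phasings(genotypes[name])) for name in names]
    best = None
    for combo in product(*options):
        haps = {h for pair in combo for h in pair}
        if best is None or len(haps) < len(best[0]):
            best = (haps, dict(zip(names, combo)))
    return best


# Hypothetical genotypes at three variable sites (NOT data from the paper).
toy = {
    "mouse1": ["AA", "CC", "GG"],  # homozygous ACG
    "mouse2": ["GG", "TT", "GG"],  # homozygous GTG
    "mouse3": ["AG", "CT", "GG"],  # heterozygous at two sites
}
haps, resolved = minimal_haplotype_set(toy)
print(sorted(haps), resolved["mouse3"])
```

In this toy case the doubly heterozygous mouse is resolved as a combination of the two haplotypes already seen in homozygotes, so no new haplotype has to be inferred. The search grows exponentially with the number of heterozygous sites, which is acceptable only for small data sets such as the one described here.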



 
Fig. 1.—Alleles of Abpa inferred from the sequence of the first exon in the 68 M. musculus listed in table 1. Abpa^c is used as the reference and only differences from it are shown for the other sequences

 
A note is needed concerning the nomenclature of alleles. Using immunostaining to specifically detect ABP on electroblots of electrophoresis gels, Karn and Dlouhy (1991) described four alleles for Abpa, naming them Abpa^a to Abpa^d. They found the first three alleles in wild population samples of M. musculus but not the putative Abpa^d allele. Because additional genetic studies suggested that the electromorph designated ABP d is an epigenetic modification of the ABP c protein (R. C. Karn, unpublished data), we withdraw the allelic status of the d variant and reassign Abpa^d to represent one of the new alleles reported here.

It can be seen in figure 1 that most of the differences between the alleles are nonsynonymous. All pairs of alleles display zero, one, or two synonymous differences, with the exception of the pairs involving allele g, which all display three or four differences. Note though that the g allele was inferred from the genotype of an individual heterozygous at several nucleotide positions, by supposing that it was a c/g heterozygote, and this large number of differences may be an artifact caused by improper haplotype inference. The alleles found to segregate in Pakistan (c–g, table 1) differ from each other by one to three nonsynonymous differences. Alleles h and i, found in a southern Indian locality (Nilgiri), differ from these by four to seven nonsynonymous differences and from each other by three. The M. musculus musculus allele (b) differs from all the aforementioned alleles (c–i) by one to five nonsynonymous differences, whereas the M. musculus domesticus allele (a) appears the most different from all others (from 6 to 11 nonsynonymous differences). Thus, the predominance of nonsynonymous differences previously reported for the a, b, and c alleles (Hwang et al. 1997) also holds for the newly discovered alleles. Estimated rates of synonymous (Ks) and nonsynonymous (Ka) changes between all pairs of alleles in M. musculus are plotted in figure 2, showing the general excess of nonsynonymous changes in intraspecific comparisons.



 
Fig. 2.—Intraspecific comparisons of Ka and Ks in M. musculus (alleles a–i of fig. 1)

 
Geographical Distribution of Abpa-Coding Variants in M. musculus
The geographical distribution of the Abpa alleles inferred from the coding sequence data is plotted in figure 3. Previous studies were concerned only with the peripheral subspecies and showed that each of them had apparently fixed one of three alleles (fig. 3A, allele a in domesticus, b in musculus, and c in castaneus; Karn and Dlouhy 1991). Our study concentrated on the central region (from the Caucasus to India) and gave a somewhat more complex, although coherent, picture.



 
Fig. 3.—The geographical locations of the Abpa alleles listed in table 1. Panel A, The three alleles (a–c) found in the peripheral Mus musculus subspecies. The unnumbered locations are from Karn and Dlouhy (1991), and the numbered are from table 1. Panel B, The Abpa alleles found in the central region of Eurasia (location numbers from table 1). Panel C, Locations (table 1) south of the Caucasus Mountains

 
The first major result is that each of the subspecies-specific alleles (a, b, and c) was found in the regions closest to the subspecies to which it belongs. The castaneus allele (c) is present over the entire central region from Iran to northern India (fig. 3B), where it predominates. The furthest west that this allele was found is in Armenia (fig. 3C). The domesticus allele (a) was found southwest of the Caucasus, in Adjaria, the locality that is closest to the distribution area of M. musculus domesticus in our sampling. However, it was also found further east, in Armenia, and as far as Tehran in Iran. The musculus allele (b) was found in the regions closest to the distribution area of M. musculus musculus. It was predominant in the region south of the Caucasus (fig. 3C) and was also found in two localities of northern Iran (fig. 3B). This is the furthest east and south that this allele was found.

The second major result is that wherever admixture of the subspecies appears geographically possible, we found an admixture of subspecies-specific Abpa alleles: a and b in Adjaria, a, b, and c in Armenia, b and c in northern Iran, and a and c in Tehran.

The third major result concerns the new polymorphisms encountered. Whereas the samples from Delhi appeared fixed for the c allele, consistent with the apparent monomorphism for this allele further east in M. musculus castaneus, the Pakistani samples display significant polymorphism, with four of the 11 individuals sampled heterozygous for c and one rare allele each (alleles d–g, table 1). The sample from southern India also displays polymorphism, with two new alleles (h and i) found nowhere else.

Interspecific Coding Sequence Variation
A 998-bp fragment covering the 785-bp intron and the coding parts of the flanking exons (183 bp in the left exon and 30 in the right exon, including the stop codon) was sequenced in 12 of the 68 M. musculus mice described previously as well as in one mouse from each of six other species in the genus Mus. Considering only the coding regions, the sequences in the right exon showed no variation, and we will not consider this region further. The sequences of the left exon are shown in figure 4. The M. musculus sequenced were homozygous for one of the three major alleles, a, b, and c, reported by Hwang et al. (1997). We also obtained the same coding sequences reported by Hwang et al. (1997) for Mus spretus, M. spicilegus, and Mus caroli. The additional Mus species we report here all had unique coding sequences except for M. macedonicus, which was identical to M. spicilegus. In figure 5, we plotted Ka against Ks to show that the ratio Ka:Ks is greater than 1 in most interspecific comparisons and often even greater than 2, as already reported on a more limited set of species (Karn and Nachman 1999). The phylogenetic information in these coding sequences is poor because of the short length of the sequence and also because of the homoplasy on several variable sites. Part of this homoplasy presumably results from convergent evolution of this protein under strong selection pressure.



 
Fig. 4.—The Abpa coding sequences of the first exon from eight full species of Mus, including the three subspecies of M. musculus. The sequences of the three alleles found in the subspecies of M. musculus (Abpa^a in M. musculus domesticus, Abpa^b in M. musculus musculus, and Abpa^c in M. musculus castaneus) as well as those from M. spretus (spr), M. spicilegus (spi), and M. caroli (car) are the same as those reported by Hwang et al. (1997). The new species coding sequences reported here are from M. macedonicus (mac), M. famulus (fam), M. cervicolor (cer), and M. cookii (coo). As in figure 1, the c sequence from M. musculus castaneus is used as the reference

 


 
Fig. 5.—Interspecific comparisons of Ka and Ks, from the sequences in figure 4

 
Intron Polymorphism in M. musculus
As part of the 998-bp fragment described previously, we sequenced the 785-bp intron in 12 representatives of M. musculus, and the results are shown in the upper part of figure 6. Two individuals from southwest of the Caucasus (Adjaria) that were homozygous for the Abpa^a allele yielded intron sequences identical to each other and to the five monomorphic domesticus sequences reported by Karn and Nachman (1999), which originated from Egypt, Bulgaria, France, Italy, and Senegal. Five mice homozygous for Abpa^b, from four different localities in Transcaucasia, yielded sequences identical to each other but very different from that of the Abpa^a alleles (18 differences).



 
Fig. 6.—The Abpa intron sequences from eight full species of Mus, including the three subspecies of M. musculus. For the latter, several individuals with the a, b, or c protein allele were sequenced and are identified by their number, as in table 1 . Only the variable sites are shown and identified by their position in the complete sequence. Dots indicate identity with the first sequence, and dashes, deletions

 
Five mice homozygous for Abpa^c were also sequenced: three from three localities in Pakistan and two from Delhi. The five sequences were all different from each other and resembled the Abpa^a intron more than the Abpa^b intron. Differences between c allele introns were observed at four nucleotide positions, each of which displayed heterozygosity in some individuals (fig. 6). As some mice were heterozygous at more than one position, unambiguous resolution of the haplotypes was not possible. However, we chose the haplotypic resolution that minimizes the total number of different haplotypes among the five individuals. This identifies mouse 10348 as CCCG/TACA, 10352 as CCCG/CCCG, 10353 as TATA/TATA, 9835 as CCTG/CATA, and 9838 as CATA/CCCG at the four polymorphic sites. This solution implies that there are five putative haplotypes (CCCG, TACA, TATA, CCTG, and CATA), and we used these for further analysis. Nucleotide intron diversity among the five individuals sequenced is π = 0.0027. The parameter θ = 4Nµ can be estimated either from π directly, giving θ_π = 0.0027, or from the number of segregating sites (S = 4), giving θ_S = 0.0018 (Watterson 1975). With the estimation of the mutation rate µ per year derived subsequently and assuming three generations per year, this gives an estimated effective population size N ≅ 1.1 × 10^6 or 7.2 × 10^5, respectively. Coalescent simulations indicate that, given the number of segregating sites, there is an excess of nucleotide diversity (P = 0.03, assuming no recombination) and an excess of haplotypes if recombination is neglected (P < 10^-5). These discrepancies could, however, be caused partly by the violation of the underlying hypotheses of the models used (infinite site or infinite allele models). Relating the five inferred haplotypes in a phylogenetic tree implies at least one homoplasy.
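The two diversity estimates quoted in this paragraph can be checked from the resolved haplotypes alone. The sketch below is an independent recalculation, not the DnaSP procedure: it assumes that the ten haplotypes carried by the five diploid mice are treated as a sample of n = 10 sequences, and that the four polymorphic sites are the only variable positions in the 785-bp intron. The function names are illustrative.

```python
from itertools import combinations

INTRON_LENGTH = 785  # bp, intron length given in the text

# Haplotypes at the four polymorphic sites, two per diploid mouse,
# as resolved in the text.
haplotypes = [
    "CCCG", "TACA",  # mouse 10348
    "CCCG", "CCCG",  # mouse 10352
    "TATA", "TATA",  # mouse 10353
    "CCTG", "CATA",  # mouse 9835
    "CATA", "CCCG",  # mouse 9838
]


def nucleotide_diversity(haps, length):
    """Mean number of pairwise differences per site (pi)."""
    pairs = list(combinations(haps, 2))
    total = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs)
    return total / len(pairs) / length


def watterson_theta(haps, length):
    """Watterson's estimator per site: S / (a_n * length), a_n = sum of 1/i for i < n."""
    n = len(haps)
    segregating = sum(len(set(column)) > 1 for column in zip(*haps))
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating / a_n / length


pi = nucleotide_diversity(haplotypes, INTRON_LENGTH)
theta_s = watterson_theta(haplotypes, INTRON_LENGTH)
print(f"pi = {pi:.4f}, theta_S = {theta_s:.4f}")  # pi = 0.0027, theta_S = 0.0018
```

Under these assumptions the recalculated values agree with the reported π = 0.0027 and θ_S = 0.0018 to the precision given.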

It is possible to develop an estimate of the age of the c allele using the total length of the tree relating the five intron sequences inferred, in units of numbers of mutations (m). Making the conservative assumption that n alleles sampled have evolved along n independent lineages since the appearance of the original mutation that gave birth to the c allele at time t in the past, we expect:

m = nµlt

where µ is the mutation rate per base pair and l is the length of the sequence. Estimating µ from the divergence of the Palearctic species group (M. musculus, M. spretus, M. spicilegus, and M. macedonicus, see following discussion) or from the divergence between these and the Asian species (M. caroli, M. cervicolor, and M. cookii) and using the estimated divergence times from She et al. (1990; 1.1 and 2.4 Myr, respectively) gives the same estimate of µ = 1.3 × 10^-8 mutations/(site · year). A parsimonious (and thus conservative) value of m is 4 and l = 785. The number of alleles is 10 because five diploid individuals were sequenced. Given the level of inbreeding in mouse populations, taking n = 10 in this calculation is, however, probably extremely conservative. With these very conservative estimates of all the parameters, we find that the minimum age of the c allele would be 40,000 years.
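The arithmetic behind the minimum age can be reproduced in a few lines. This sketch assumes that the expected number of mutations takes the form m = n·µ·l·t, so that t = m / (n·µ·l); that form is an assumption made here because it recovers the stated figure of about 40,000 years from the parameter values given in the text. The function name is illustrative.

```python
def minimum_allele_age(m: float, n: int, mu: float, length: int) -> float:
    """Solve m = n * mu * length * t for t (years), assuming n independent lineages."""
    return m / (n * mu * length)


# Parameter values quoted in the text.
m = 4          # parsimonious number of mutations on the tree
n = 10         # alleles sampled (five diploid mice)
mu = 1.3e-8    # mutations per site per year
length = 785   # intron length in bp

age = minimum_allele_age(m, n, mu, length)
print(f"minimum age of the c allele: about {age:,.0f} years")

# Fewer independent lineages (for example because of inbreeding) give an older age.
print(f"with n = 5: about {minimum_allele_age(m, 5, mu, length):,.0f} years")
```

Because t is inversely proportional to n, treating all ten sampled alleles as independent lineages yields the youngest, and therefore most conservative, age.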

Interspecific Intron Phylogeny
Figure 6 shows the variations observed in the 785-bp intron sequences among the eight species of Mus, including the previously reported sequence of M. caroli (Karn and Nachman 1999) and the variations inside M. musculus described previously. Figure 7 shows the phylogenetic tree generated from these intron sequences, including the five c allele haplotypes inferred as described previously. The topology of the tree is unexpected because the house mouse M. musculus is not monophyletic and its subspecies stand as outgroups vis à vis the other Palearctic species (M. spretus, M. spicilegus, and M. macedonicus), whereas the reverse was expected based on previous studies (reviewed for instance in Boursot et al. 1993). In particular, the position of M. musculus musculus outside the other Palearctic species and subspecies is supported by a very high bootstrap score (94%). The divergence of the intron of the musculus subspecies (with coding allele b), especially relative to that of the c allele introns, is striking because the b and c protein alleles show relatively few differences compared with the more divergent domesticus allele. In fact, the Ka:Ks ratio for the b/c comparison was among the lowest (0.83) in all intra- and interspecific comparisons of alleles (figs. 2 and 5).



 
Fig. 7.—The phylogenetic tree generated from the 785-bp intron (fig. 6). The four-nucleotide sets represent the haplotypes determined from analysis of the Abpa^c intron sequences shown in figure 6 (see text). Bootstrap supports are indicated at the bases of the nodes (10,000 replicates)

 
In order to better understand the possible origin of this discrepancy in the intron phylogeny, we compared the levels of divergence of the Abpa intron at different nodes in the phylogeny of the genus Mus with those obtained for single copy nuclear DNA, as determined by DNA/DNA hybridization in She et al. (1990). The comparison is shown in figure 8, where we have retained for each data set only the branches that are strongly supported, collapsing the other branches into unresolved polytomies. Estimated nucleotide divergences at the nodes of interest are reported on these trees, in absolute values and in relation to the divergence between the Asian and Palearctic species, this standardization allowing comparisons between the two data sets. The striking point in this comparison is that when M. musculus musculus is excluded for Abpa, the relative divergence of the Palearctic species (M. musculus, M. spretus, M. spicilegus, and M. macedonicus) is much smaller for the Abpa intron than for scnDNA (0.16 compared with 0.46). When M. musculus musculus is taken into account despite its unexpected position in the Abpa phylogeny, the divergence of the Palearctic group appears more compatible between Abpa (0.32) and scnDNA (0.46).



 
Fig. 8.—Comparison of the phylogeny of the Abpa intron (left panel) with that inferred from single copy nuclear DNA/DNA hybridization (scnDNA, right panel, from She et al. 1990). Below the trees we indicate the average divergences measured at several nodes in the tree, in % nucleotide divergence for Abpa and in delta-Mode value for scnDNA. Their standard errors are indicated below in italics. For Abpa they are the errors caused by the sampling of sites, estimated by bootstrap resampling. For scnDNA, they represent the experimental error estimated in She et al. (1990). The bottom line indicates in bold the divergences at the same nodes, standardized in each case relative to the deepest divergence in the tree, between the Asian and Palearctic species. Branch lengths on the trees are not faithfully proportional to nucleotide divergences. Note also that, for simplicity, we use here M. musculus castaneus, rather than M. musculus ssp., to represent the c allele

 

    Discussion
 
Abpa Differentiation Followed Subspeciation Routes
We note that the distributions of the Abpa variants are remarkably compatible with the well-studied mtDNA distributions. This resemblance includes (1) one lineage specific to each peripheral subspecies (e.g., Boursot et al. 1996; Boissinot and Boursot 1997; Prager, Orrego, and Sage 1998 and references therein), (2) extensive phyletic diversity in the Indian subcontinent (with castaneus being a subset of this, same references), (3) admixture of the domesticus and musculus lineages in Transcaucasia (Orth et al. 1996), and (4) the presence of musculus and castaneus-like lineages in northern Iran (Boursot et al. 1996; Boissinot and Boursot 1997; Prager, Orrego, and Sage 1998). The picture is also compatible with the collective allozyme data: (1) there is more polymorphism in the center (India, Pakistan, and Iran), including alleles found nowhere else (Din et al. 1996), (2) there is evidence that the Caucasus region populations are a mixture of domesticus and musculus (Orth et al. 1996; Mezhzherin, Kotenkova, and Mikhailenko 1998), and (3) the Iranian populations appear to be intermediate between domesticus and the Indo-Pakistani populations (Din et al. 1996).

This is worth noting because another potential actor in sexual selection, the Y chromosome, shows a different phylogeographic pattern. Boissinot and Boursot (1997) recognized two lineages of Y chromosome, one being shared by M. musculus musculus and M. musculus castaneus, whereas the other prevailed in M. musculus domesticus as well as in central Eurasia. On the basis of the low molecular divergence between these major Y chromosome lineages, they suggested that this might be explained by secondary sweeps after the subspeciation process. Using a different geographic scenario for the subspeciation process and not taking into account the low Y chromosome divergence, Prager, Orrego, and Sage (1998) favored instead the hypothesis of the segregation of ancestral polymorphism.

Whichever hypothesis is correct, it remains true that, given the presently available data, the distribution of Abpa alleles appears more compatible with that of mtDNA than with that of the Y chromosome. The unique feature of Abpa, on the other hand, is that it is an autosomal gene that is monomorphic in the peripheral subspecies, and our new data substantially increase the resolution of its peripheral distributions.

Secondary Admixture of Peripheral Fixed Abpa Variants
In the case of mtDNA, because the phylogeny of alleles was precisely reconstructed, the occurrence of domesticus and musculus variants in Transcaucasia as well as in northern Iran could be interpreted with confidence as resulting from secondary admixture of the well-differentiated subspecies-specific lineages (Boursot et al. 1996; Orth et al. 1996; Boissinot and Boursot 1997; Prager, Orrego, and Sage 1998). Such conclusions were harder to reach in the case of allozymes because the phylogenies of the alleles were unknown (Din et al. 1996; Orth et al. 1996). In the case of Abpa, our intron sequence data clearly show that the a, b, and c alleles sampled in regions where they co-occur have not coexisted for a long time in the same homogeneous gene pool because they each belong to one of three well-differentiated lineages with no polymorphism (a and b alleles) or little polymorphism compared with the divergence between alleles (the c allele). Thus, it seems natural to consider that their co-occurrence south of the Caucasus and in northern Iran resulted from secondary introgression of the subspecies-specific a, b, and c alleles.

Apparently, whatever selective pressure led to the monomorphism of Abpa in the peripheral subspecies did not prevent secondary admixture there. We can contrast these results with those of the Y chromosome: in neither of the two regions of secondary admixture of mtDNA and Abpa (Transcaucasia and northern Iran) has any evidence of Y chromosome lineage admixture yet been found (Boursot et al. 1996; Orth et al. 1996; Boissinot and Boursot 1997; Prager, Orrego, and Sage 1998). Rather, only the musculus-like Y chromosome type has been found, up to now, in these regions. The absence of Y chromosome admixture could result from the sterility of hybrid males, a phenomenon that has been reported in some intersubspecific crosses (Forejt 1996). However, the generality of this phenomenon is far from established. The hybrid zone between domesticus and musculus has been studied in detail in Europe and has given evidence of very strong counterselection against Y chromosome introgression (Vanlerberghe et al. 1986; Tucker et al. 1992; Dod et al. 1993; Prager, Boursot, and Sage 1997). Similar data on Abpa will allow further comparison of the consequences of selection on these two parts of the genome and their relative role in species isolation.

Directional Selection on Abpa?
As summarized in the introduction, there are various arguments (absence of polymorphism, elevated Ka:Ks ratio, behavioral data) to support the idea that Abpa, at least in the peripheral subspecies, is evolving under some form of selection pressure that might have to do with recognition and mate choice. Karn and Nachman (1999) suggested that allelic sweeps could account for the significantly reduced intron polymorphism in Abpa^a. This picture is reinforced by our observation of a single 998-bp Abpa^b sequence, very different from that of Abpa^a, in five samples of the M. musculus musculus populations. Although we could not test whether selection fixed the b allele in M. musculus musculus populations, because no other region of the genome was available for comparison, the monomorphism we observed for this allele is at least consistent with the idea that selective sweeps can fix Abpa alleles in the M. musculus peripheral subspecies. Whether the apparent fixation of the c allele in the South-East Asian subspecies M. musculus castaneus is also accompanied by an absence of intron polymorphism should be the subject of future investigation.

High Abpa Intron and Exon Polymorphism in the Central Region
The co-occurrence in the central region of new Abpa variants not found at the periphery (alleles d–g in Pakistan and h and i in southern India) and the observation of substantial intron polymorphism are two major differences from previous observations on Abpa. As was the case for the three major alleles, the new ones were derived essentially by nonsynonymous substitutions, which is usually taken as evidence for selection. However, the protein polymorphism encountered in the Pakistani and southern Indian samples, as well as the intron polymorphism of the c allele from Pakistan and northern India, is not compatible with the hypothesis of directional selection having driven the evolution of the Abpa gene in these populations. The intron nucleotide diversity encountered in our sample of five individuals with c alleles (π = 0.27%) contrasts with the absence of polymorphism among seven M. musculus domesticus (Karn and Nachman 1999; this study) and five M. musculus musculus (this study). It is greater than the diversity encountered in M. musculus domesticus at the two X gene introns used by Karn and Nachman (1999) to reject neutrality at Abpa (0.135% and 0.16%). Estimates of effective population sizes from nucleotide diversity in the c introns (N ≅ 10^6, see Results) appear quite high for a vertebrate.

In fact, our observations on Abpa variation in the central region would better fit a hypothesis of diversifying selection maintaining this polymorphism. However, to retain this hypothesis, we would have to eliminate the alternative explanation that the observed polymorphism results from secondary admixture of differentiated populations. Clearly, a better understanding of the genetic makeup of this complex central region will be critical to properly understand the type of selection underlying the generation and maintenance of this polymorphism.
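
The nucleotide diversity and effective population size figures discussed in this section can be illustrated with a minimal sketch. It assumes the standard neutral-equilibrium relation π = 4Nμ for a diploid autosomal locus; the toy alignment and the mutation rate `mu` are hypothetical placeholders, not data or parameters from this study, and the function names are illustrative only. The estimate reported in the Results may rest on different assumptions.

```python
from itertools import combinations


def pairwise_differences(a: str, b: str) -> int:
    """Count the aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def nucleotide_diversity(seqs: list[str]) -> float:
    """Return pi: mean number of pairwise differences per site."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    pairs = list(combinations(seqs, 2))
    total = sum(pairwise_differences(a, b) for a, b in pairs)
    return total / (len(pairs) * len(seqs[0]))


def effective_size(pi: float, mu: float) -> float:
    """Solve pi = 4 * N * mu for N (assumes neutrality and equilibrium)."""
    return pi / (4.0 * mu)


if __name__ == "__main__":
    # Hypothetical toy alignment; NOT Abpa intron sequences.
    toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGTACGTAC", "ACGAACGTAC"]
    pi = nucleotide_diversity(toy)
    # Hypothetical per-site, per-generation mutation rate (an assumption).
    mu = 1e-9
    print(f"pi = {pi:.4f}, N = {effective_size(pi, mu):.3g}")
```

Because N scales inversely with the assumed mutation rate, any comparison of effective sizes across loci or taxa is only as good as the rate used. Gaps and ambiguous bases are not handled in this sketch.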

Is Selection on Abpa in Peripheral Subspecies Linked to Commensalism?
It seems that the recent extension of the three subspecies to their peripheral range in Eurasia is linked to the advent of commensalism with humans (reviewed in Boursot et al. 1993; Sage, Atchley, and Capanna 1993). Perhaps selection on Abpa alleles in the peripheral populations relative to the central ones changed as the result of this recent and rapid expansion. Mus musculus evolved for at least half a million years before it became commensal with humans and before it had radiated very extensively outside the region shown in figure 3B. It could be that during the peripheral expansion in relation to commensalism, ABP-mediated sexual selection became more important in the ecology of mouse populations.

Clearly, the c allele could not have been fixed by selection in relationship to commensalism because we estimate its origin to be far older (at least 40,000 years old) than commensalism (10,000 years; Auffray, Vanlerberghe, and Britton-Davidian 1990). However, it may be that the protein polymorphism observed in the center results from the secondary admixture of previously differentiated and monomorphic populations and that the c intron polymorphism results from recent recombination with the other divergent alleles. This would be another consequence of commensalism, which presumably promoted gene flow.

The suggestion that gene flow is responsible for Abpa polymorphism in the central region does not fit well with the proposed role for ABP as a recognition signal. However, it could be that this salivary protein has some other function, implying a variable role for selection depending on the environment (for example, Laukaitis, Critser, and Karn [1997] suggested that ABP's original function may have involved conditioning the mouse's pelt). Its role in sexual selection would be a side effect only emphasized under certain conditions when the pressure for sexual selection becomes predominant compared with that exerted by environmental factors. The changes of condition that presumably accompanied the rapid expansion of the peripheral subspecies to the whole continent could have created such a change in the selection regime. In the case of one of the best studied polymorphisms at the DNA level, the Drosophila melanogaster ADH locus, evidence is accumulating that the kinds of selection pressures operating on the gene have changed in space and time, especially since the recent worldwide spread of this commensal species (Veuille et al. 1998; Begun et al. 1999). A search for possible alternative functions of ABP should contribute to resolving the issue of the kinds of selection pressures influencing its evolution.

The Peculiar Phylogenetic History of Abpa
The phylogeny of the Abpa introns that we obtained shows several unexpected aspects, given what is known of the speciation and subspeciation history in the genus Mus (reviewed in Boursot et al. 1993; Sage, Atchley, and Capanna 1993). Considering only the topology of the tree in figure 7, it is striking that the house mouse sequences lie outside the other Palearctic species sequences. This external position is strongly supported in the case of M. musculus musculus. If we interpreted the result in this way, we would have to suppose that the M. musculus alleles (or at least the Abpa^b allele) are of alien origin and were captured by an interspecific transfer from an unknown species (we have sampled all species known to date). Setting aside this hypothesis, we can ask whether we sequenced orthologous copies of Abpa in all species. If a gene duplication produced two copies of Abpa (say copies I and II), it must have happened before the divergence of the Palearctic species to account for the position of the house mouse sequences (and especially the b allele) outside this group. It must also be hypothesized that our primers amplified, say, copy I but not copy II in all taxa except M. musculus, whereas the reverse pattern of amplification success (copy II but not copy I) occurred in M. musculus. This appears quite improbable. Supposing then that the primers picked orthologous exons in all species, we have to explain that the transfer of information between copies concerned only the intron. This could have happened by ectopic conversion of one copy by the other. Although ectopic conversion has been found to occur in mice (e.g., Murti, Bumbulis, and Schimenti 1994), it was also shown to be improbable for sequences this short (Cooper, Schimenti, and Schimenti 1998). Observations of its effects remain anecdotal, so that its likelihood appears hard to assess at present.

Furthermore, this explanation based on multiple gene copies fits poorly within the time frame inferred for the phylogeny of the Abpa introns in comparison with that of the average scnDNA, depicted in figure 8. Taking the divergence times into account, the phylogeny points to an unexpected resemblance between the Palearctic species (M. musculus musculus excluded), rather than to an abnormal divergence of M. musculus musculus from the other Palearctic species (spretus, spicilegus, and macedonicus). An explanation for this pattern would be that secondary genetic exchanges have occurred along the lineages leading to the Palearctic species.

This possibility must be taken seriously because there is evidence of a transfer of LINE-1 copies from M. spretus and M. spicilegus into laboratory strains of mice (Rikke et al. 1995; Zhao, Greene-Till, and Hardies 1998; Hardies et al. 2000) as well as of sporadic reciprocal exchanges of mtDNA and other genes between M. musculus domesticus and M. spretus in the wild (Orth et al. in press). Thus, interspecific genetic exchanges apparently occur between these species at a rate sufficient to be detected by limited sampling of the genome. In addition, their effects could have been emphasized at some genes for which selection favored the fixation of new alleles, and there is evidence for this in the case of Abpa. A similar case has been reported by Wang, Wakely, and Hey (1997), who have shown extensive introgression at the Adh locus between closely related species of the Drosophila pseudoobscura group.

Although secondary exchanges account for the abnormal Abpa resemblance of the Palearctic species, they do not explain how M. musculus musculus has retained an Abpa lineage, the divergence of which apparently predates the subspeciation of the house mouse, because the coalescence of Abpa in M. musculus occurs at 0.32, compared with only 0.15 for the average scnDNA (fig. 8). The variance of the coalescence times of different portions of the nuclear genome among these mouse subspecies should be studied before any sound conclusion can be drawn from this observation. In a recent paper, Ting, Tsaur, and Wu (2000) have shown that different genes can display very different coalescence patterns between closely related Drosophila species, presumably because of a combination of secondary introgression and segregation of ancestral polymorphism. They have also shown that a gene of hybrid sterility reflects much more clearly than other genes the events of reproductive divergence between these species. Although the initial picture of the evolution of the Abpa gene in the house mouse was suggestive of this kind of hypothesis, with one allele fixed in each subspecies, the picture that now emerges is more complex. On one hand, there is evidence that this gene plays a role in recognition and has evolved under strong positive Darwinian selection. On the other hand, it is difficult to reconcile the notion of a ubiquitous role for it in sexual isolation with the extensive polymorphism in some regions and the secondary interspecific exchanges that have apparently occurred. This does not necessarily rule out a role of ABP in recognition, but some other (or additional) function should be sought that would help account for the peculiar evolutionary history of this gene.


    Supplementary Material
The new sequences reported here have been deposited in GenBank and assigned the following accession numbers: Mus musculus musculus (Abpa b allele, 998 bp), AF413618; Mus musculus castaneus (Abpa c allele, 998 bp), AF413619; Mus spretus, AF413620; Mus spicilegus, AF413621; Mus macedonicus, AF413622; Mus famulus, AF413623; Mus cervicolor, AF413624; Mus cookii, AF413625; (Abpa d allele, 183 bp), AF414436; (Abpa e allele, 183 bp), AF414437; (Abpa f allele, 183 bp), AF414438; (Abpa g allele, 183 bp), AF414439; (Abpa h allele, 183 bp), AF414440; (Abpa i allele, 183 bp), AF414441.


    Acknowledgements
R.C.K. was supported in part by a grant from the Holcomb Research Institute and that funding is gratefully acknowledged. The authors thank C. M. Laukaitis for assistance in preparing the manuscript.


    Footnotes
 
Wolfgang Stephan, Reviewing Editor

Keywords: androgen-binding protein, Mus musculus, interspecific hybridization, sexual isolation, selection

Abbreviations: ABP, androgen-binding protein; Abpa, the gene for the alpha subunit of mouse salivary androgen–binding protein.

Address for correspondence and reprints: Robert C. Karn, Department of Biological Sciences, Butler University, 4600 Sunset Avenue, Indianapolis, Indiana 46208. rkarn@butler.edu.


    References

    Auffray J.-C., F. Vanlerberghe, J. Britton-Davidian. 1990. House mouse progression in Eurasia: a palaeontological and archaeozoological approach. Biol. J. Linn. Soc. 41:13-25.

    Begun D. J., A. J. Betancourt, C. H. Langley, W. Stephan. 1999. Is the fast/slow allozyme variation at the Adh locus of Drosophila melanogaster an ancient balanced polymorphism? Mol. Biol. Evol. 16:1816-1819.

    Boissinot S., P. Boursot. 1997. Discordant phylogeographic patterns between the Y chromosome and mitochondrial DNA in the house mouse: selection on the Y chromosome? Genetics 146:1019-1034.

    Boursot P., J.-C. Auffray, J. Britton-Davidian, F. Bonhomme. 1993. The evolution of house mice. Annu. Rev. Ecol. Syst. 24:119-152.

    Boursot P., W. Din, R. Anand, D. Darviche, B. Dod, F. von Deimling, G. P. Talwar, F. Bonhomme. 1996. Origin and radiation of the house mouse: mitochondrial DNA phylogeny. J. Evol. Biol. 9:391-415.

    Cooper D. M., K. J. Schimenti, J. C. Schimenti. 1998. Factors affecting ectopic gene conversion in mice. Mamm. Genome 9:355-360.

    Din W., R. Anand, P. Boursot, D. Darviche, B. Dod, E. Jouvin-Marche, A. Orth, G. P. Talwar, P.-A. Cazenave, F. Bonhomme. 1996. Origin and radiation of the house mouse: clues from nuclear genes. J. Evol. Biol. 9:519-539.

    Dlouhy S. R., W. C. Nichols, R. C. Karn. 1986. Production of an antibody to mouse salivary androgen binding protein (ABP) and its use in identifying a prostate protein produced by a gene distinct from Abp. Biochem. Genet. 24:443-463.

    Dlouhy S. R., B. A. Taylor, R. C. Karn. 1987. The genes for mouse salivary androgen-binding protein (ABP) subunits alpha and gamma are located on chromosome 7. Genetics 115:535-543.

    Dod B., L. S. Jermiin, P. Boursot, V. M. Chapman, J. T. Nielsen, F. Bonhomme. 1993. Counterselection on sex chromosomes in the Mus musculus European hybrid zone. J. Evol. Biol. 6:529-546.

    Forejt J. 1996. Hybrid sterility in the mouse. Trends Genet. 12:412-417.

    Hardies S. C., L. P. Wang, L. X. Zhou, Y. P. Zhao, N. C. Casavant, S. J. Huang. 2000. LINE-1 (L1) lineages in the mouse. Mol. Biol. Evol. 17:616-628.

    Hwang J. H., J. R. Hofstetter, F. Bonhomme, R. C. Karn. 1997. The microevolution of mouse salivary androgen-binding protein (ABP) paralleled subspeciation of Mus musculus. J. Hered. 88:93-97.

    Karn R. C., S. R. Dlouhy. 1991. Salivary androgen-binding protein variation in Mus and other rodents. J. Hered. 82:453-458.

    Karn R. C., M. W. Nachman. 1999. Reduced nucleotide variability at an androgen-binding protein locus (Abpa) in house mice; evidence for positive natural selection. Mol. Biol. Evol. 16:1192-1197.

    Laukaitis C. M., E. S. Critser, R. C. Karn. 1997. Salivary androgen-binding protein (ABP) mediates sexual isolation in Mus musculus. Evolution 51:2000-2005.

    Mezhzherin S. V., E. V. Kotenkova, A. G. Mikhailenko. 1998. The house mice, Mus musculus s. l., hybrid zone of Transcaucasus. Z. Saeugetierkd. 63:154-168.

    Murti J. R., M. Bumbulis, J. C. Schimenti. 1994. Gene conversion between unlinked sequences in the germline of mice. Genetics 137:837-843.

    Nei M., T. Gojobori. 1986. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3:418-426.

    Orth A., K. Belkhir, J. Britton-Davidian, P. Boursot, T. Benazzou, F. Bonhomme. 2002. Hybridation naturelle entre deux espèces sympatriques de souris M. musculus domesticus L. et M. spretus Lataste. C. R. Acad. Sci., Paris (in press).

    Orth A., E. Lyapunova, A. Kandaurov, S. Boissinot, P. Boursot, N. Vorontsov, F. Bonhomme. 1996. L'espèce polytypique Mus musculus en Transcaucasie. C. R. Acad. Sci., Paris 319:435-441.

    Prager E. M., P. Boursot, R. D. Sage. 1997. New assays for Y chromosome and p53 pseudogene clines among East Holstein house mice. Mamm. Genome 8:279-281.

    Prager E. M., C. Orrego, R. D. Sage. 1998. Genetic variation and phylogeography of Central Asian and other house mice, including a major new mitochondrial lineage in Yemen. Genetics 150:835-861.

    Rikke B. A., Y. Zhao, L. P. Daggett, R. Reyes, S. C. Hardies. 1995. Mus spretus LINE-1 sequences detected in the Mus musculus inbred strain C57BL/6J using LINE-1 DNA probes. Genetics 139:901-906.

    Rozas J., R. Rozas. 1999. DnaSP version 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15:174-175.

    Sage R. D., W. R. Atchley, E. Capanna. 1993. House mice as models in systematic biology. Syst. Biol. 42:523-561.

    She J. X., F. Bonhomme, P. Boursot, L. Thaler, F. Catzeflis. 1990. Molecular phylogenies in the genus Mus: comparative analysis of electrophoretic, scnDNA hybridisation, and mtDNA RFLP data. Biol. J. Linn. Soc. 41:83-103.

    Talley H. M., C. M. Laukaitis, R. C. Karn. 2001. Female preference for male saliva: implications for sexual isolation of Mus musculus subspecies. Evolution 55:631-634.

    Ting C.-T., S.-C. Tsaur, C.-I. Wu. 2000. The phylogeny of closely related species as revealed by the genealogy of a speciation gene, Odysseus. Proc. Natl. Acad. Sci. USA 97:5313-5316.

    Tucker P. K., R. D. Sage, J. Warner, A. C. Wilson, E. M. Eicher. 1992. Abrupt cline for sex chromosomes in a hybrid zone between two species of mice. Evolution 46:1146-1163.

    Vanlerberghe F., B. Dod, P. Boursot, M. Bellis, F. Bonhomme. 1986. Absence of Y-chromosome introgression across the hybrid zone between Mus musculus domesticus and Mus musculus musculus. Genet. Res. 48:191-197.

    Veuille M., V. Benassi, S. Aulard, F. Depaulis. 1998. Allele-specific population structure of Drosophila melanogaster alcohol dehydrogenase at the molecular level. Genetics 149:971-981.

    Wang R. L., J. Wakely, J. Hey. 1997. Gene flow and natural selection in the origin of Drosophila pseudoobscura and close relatives. Genetics 147:1091-1106.

    Watterson G. 1975. On the number of segregating sites in genetical models without recombination. Theor. Popul. Biol. 7:256-276.

    Zhao Y., R. Greene-Till, S. C. Hardies. 1998. Mus spretus LINE-1s in C57BL/6J map to at least two different chromosomes. Mamm. Genome 9:679-680.

Accepted for publication November 23, 2001.