Directional Evolution of Size Coupled with Ascertainment Bias for Variation in Drosophila Microsatellites

William Amos*, Carolyn M. Hutter†,‡, Malcolm D. Schug†,§ and Charles F. Aquadro†

* Department of Zoology, University of Cambridge, Cambridge, England
† Department of Molecular Biology and Genetics, Cornell University
‡ Department of Epidemiology, University of Washington School of Public Health and Community Medicine
§ Department of Biology, University of North Carolina Greensboro


    Abstract
 
Species-specific differences in microsatellite locus length and ascertainment bias have both been proposed to explain differences in microsatellite variability and length usually observed when loci isolated in one species are used to survey variation in a related species. Here we provide a simple algebraic approach to independently estimate the contributions of true species-specific length differences and ascertainment bias. We apply this approach to a reciprocal-isolation microsatellite study and show contributions of both ascertainment bias and a true longer average microsatellite length in Drosophila melanogaster compared with D. simulans.

Key Words: Drosophila • microsatellites • ascertainment bias • directional evolution


    Introduction
 
Microsatellites are arguably the most important class of genetic markers yet discovered, comprising arrays of short (2 to 5 bp) tandemly repeated motifs. A major benefit is that conservation of their flanking sequences allows homologous loci to be amplified in related species. However, many studies report significant differences in length and heterozygosity between species. Two explanations have been proffered: (1) directional evolution, trends created by the characteristics of the markers themselves, namely biased mutation coupled with variation in genome-wide mutation rates (e.g., Rubinsztein et al. 1995; Amos et al. 1996), and (2) ascertainment bias, trends artificially created by selection criteria during marker development (Ellegren, Primmer, and Sheldon 1995; Forbes et al. 1995).

There is general agreement that the best way to discriminate between these explanations is to conduct reciprocal studies, specifically cloning markers independently from two species and then making both "focal" (markers tested in the species from which they are derived) and "nonfocal" comparisons. Only a handful of studies have adopted this approach, and the results are mixed. Studies on sheep and cattle (Crawford et al. 1998) and humans and chimpanzees (Cooper, Rubinsztein, and Amos 1998) found that ascertainment bias alone could not explain the observed length differences. A second, much smaller, study on cattle and sheep found a large ascertainment bias effect (Ellegren et al. 1997). A recent study on Drosophila melanogaster and D. simulans found no length difference but a large ascertainment bias effect for heterozygosity (Hutter, Schug, and Aquadro 1998).

Hutter, Schug, and Aquadro (1998) reported that (1) heterozygosity is greater in the focal species, (2) the longest pure repeat stretch is significantly longer in the focal than in the nonfocal species, but (3) PCR product length does not differ significantly between focal and nonfocal comparisons. Since marker selection usually favors pure repeats over interrupted tracts, nonfocal species may show a higher proportion of interrupted repeats and reduced heterozygosity, as seen in the Drosophila data (Hutter, Schug, and Aquadro 1998). In this paper we reexamine the Drosophila data and show that significant differences in PCR product length do exist. We demonstrate that PCR products of D. melanogaster microsatellite loci are longer on average than those of D. simulans, suggesting a consistent difference in microsatellite length between these species. We also present a method to estimate the contributions of true species differences and ascertainment bias to the microsatellite length differences observed between species.


    Materials and Methods
 
We reanalyzed the data presented in table 2 and the Appendix of Hutter, Schug, and Aquadro (1998) for PCR product length for 18 D. melanogaster–derived and 18 D. simulans–derived microsatellite loci assayed in population samples of both species from the United States and Zimbabwe.

To examine the relative roles of ascertainment bias and species-dependent length differences, we used simple algebra to solve for each effect separately:

$$L_{sM} = L_{mM} + D_{(s-m)} - A_b$$

$$L_{mS} = L_{sS} - D_{(s-m)} - A_b$$

where LsM and LsS are the observed lengths in D. simulans of D. melanogaster–derived and D. simulans–derived loci, respectively; LmS and LmM are the observed lengths in D. melanogaster of D. simulans–derived and D. melanogaster–derived loci, respectively; Ab is the average size of the ascertainment bias; and D(s-m) is the average size of any inherent difference in length between the two species expressed as length in D. simulans minus the length in D. melanogaster.

Combining these two equations and rearranging yields the following estimate of ascertainment bias:

$$A_b = \frac{(L_{mM} - L_{sM}) + (L_{sS} - L_{mS})}{2}$$

which is the average difference between focally derived and nonfocally derived microsatellites. By rearranging the equation for LsM, we can estimate D(s-m), the inherent difference in locus size between the species, as

$$D_{(s-m)} = \frac{(L_{sM} - L_{mM}) + (L_{sS} - L_{mS})}{2}$$

which is the average difference between D. simulans and D. melanogaster for the D. melanogaster–derived and D. simulans–derived loci, respectively.
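
The two estimators described above can be checked numerically. The following is a minimal Python sketch, assuming an additive model in which a nonfocal length equals the focal length shifted by the species difference and reduced by the ascertainment bias. The function name and all input values are hypothetical illustrations and are not taken from Hutter, Schug, and Aquadro (1998).

```python
def estimate_bias_and_difference(l_mM, l_sM, l_sS, l_mS):
    """Return (Ab, D_s_minus_m) from four mean PCR product lengths in bp.

    l_mM: D. melanogaster-derived loci measured in D. melanogaster (focal)
    l_sM: D. melanogaster-derived loci measured in D. simulans (nonfocal)
    l_sS: D. simulans-derived loci measured in D. simulans (focal)
    l_mS: D. simulans-derived loci measured in D. melanogaster (nonfocal)
    """
    # Ascertainment bias: mean of the two focal-minus-nonfocal differences.
    ab = ((l_mM - l_sM) + (l_sS - l_mS)) / 2.0
    # Species effect: mean of the two simulans-minus-melanogaster differences.
    d = ((l_sM - l_mM) + (l_sS - l_mS)) / 2.0
    return ab, d


# Hypothetical means constructed with a known Ab = 2 bp and D(s-m) = -3 bp.
true_ab, true_d = 2.0, -3.0
l_mM = 150.0
l_sM = l_mM + true_d - true_ab  # assumed model for the nonfocal comparison
l_sS = 140.0
l_mS = l_sS - true_d - true_ab  # assumed model for the nonfocal comparison

ab, d = estimate_bias_and_difference(l_mM, l_sM, l_sS, l_mS)
print(ab, d)  # recovers the values used to build the example
```

Because the model is additive, the two effects separate exactly: the bias term cancels when the species differences are averaged, and the species term cancels when the focal-minus-nonfocal differences are averaged.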


    Results and Discussion
 
Hutter, Schug, and Aquadro (1998) noted that the D. simulans microsatellites were on average slightly shorter than those of D. melanogaster, regardless of the focal species. However, they did not directly compare species on all 36 loci together, but instead analyzed each species' microsatellites separately (18 from each species). When all 36 loci are analyzed together, D. simulans microsatellites are on average significantly shorter than their D. melanogaster homologs for both the United States and Zimbabwe samples (paired t-tests: t = 2.11, P = 0.0418, and t = 2.15, P = 0.0382, respectively). Differences are normally distributed (Kolmogorov-Smirnov normality test: P > 0.15 in both cases).
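
As an illustration of the paired comparison described above, the sketch below computes a paired t statistic using only the Python standard library. The locus lengths are hypothetical and the function name is an assumption made for this example. Obtaining a P value additionally requires the t distribution with n - 1 degrees of freedom, which a statistics library routine such as `scipy.stats.ttest_rel` can supply.

```python
import math
from statistics import mean, stdev


def paired_t(lengths_a, lengths_b):
    """Return (t, degrees_of_freedom) for a paired t-test on a - b."""
    if len(lengths_a) != len(lengths_b):
        raise ValueError("paired samples must have equal length")
    # Work on per-locus differences so that each locus is its own control.
    diffs = [a - b for a, b in zip(lengths_a, lengths_b)]
    n = len(diffs)
    # Standard error of the mean difference (sample standard deviation).
    se = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / se, n - 1


# Hypothetical mean PCR product lengths (bp) at five homologous loci.
mel = [152.0, 118.0, 201.0, 133.0, 176.0]
sim = [151.0, 116.0, 198.0, 129.0, 171.0]

t_stat, df = paired_t(mel, sim)
print(round(t_stat, 3), df)
```

Pairing by locus removes the large between-locus variance in product length, which is why pooling all loci in one paired test can detect a small but consistent species difference.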

Substituting the published estimates of the mean PCR fragment size in base pairs (from table 2 of Hutter, Schug, and Aquadro 1998) for the various species comparisons in the equations derived in the Materials and Methods yields the following estimate of ascertainment bias:


for the United States samples of D. melanogaster and D. simulans, and


for the Zimbabwe samples of both species.

The inherent difference in locus size between the species, D(s-m), is estimated for the United States samples of both species as


and for the Zimbabwe samples of both species as


This reanalysis reveals that the observed difference in PCR product length between the two species involves both an ascertainment bias component and a (larger) component due to a tendency for PCR product length to be on average 3.38 bp longer in D. melanogaster. If the average observed ascertainment bias is used to correct each original pairwise comparison (full data from the Appendix of Hutter, Schug, and Aquadro 1998), the best-fit interspecies length differences become -6.12 bp (U.S.) and -2.98 bp (Zimbabwe), both of which are significantly less than zero (Wilcoxon one-sample signed rank test for median not equal to zero: U.S. test statistic = 558, P = 0.0002; Zimbabwe test statistic = 466, P = 0.036).
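
The per-locus correction described above can be sketched as follows. This example assumes that ascertainment bias shortens every nonfocal measurement by the same amount, so that D. melanogaster-derived loci understate the D. simulans minus D. melanogaster difference and D. simulans-derived loci overstate it. The function names and data are hypothetical, and the rank sums shown here may not follow the same convention as the test statistics reported in the text.

```python
def corrected_differences(loci, ab):
    """Remove an assumed ascertainment bias from each sim - mel difference.

    loci: iterable of (source, length_in_sim, length_in_mel), where source
    is "mel" or "sim", the species from which the locus was cloned.
    """
    corrected = []
    for source, l_sim, l_mel in loci:
        diff = l_sim - l_mel
        # mel-derived loci: simulans is nonfocal, so add the bias back.
        # sim-derived loci: melanogaster is nonfocal, so subtract the bias.
        corrected.append(diff + ab if source == "mel" else diff - ab)
    return corrected


def signed_rank_sums(values):
    """Return (W_plus, W_minus), the rank sums of positive and negative values.

    Zeros are dropped and tied absolute values share their average rank.
    """
    nonzero = sorted((v for v in values if v != 0), key=abs)
    w_plus = w_minus = 0.0
    i = 0
    while i < len(nonzero):
        j = i
        while j < len(nonzero) and abs(nonzero[j]) == abs(nonzero[i]):
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 through j
        for v in nonzero[i:j]:
            if v > 0:
                w_plus += avg_rank
            else:
                w_minus += avg_rank
        i = j
    return w_plus, w_minus


# Hypothetical loci: (source species, length in D. simulans, length in D. melanogaster)
loci = [("mel", 145.0, 150.0), ("mel", 118.0, 119.0),
        ("sim", 140.0, 141.0), ("sim", 162.0, 160.0)]
corrected = corrected_differences(loci, ab=2.0)
print(corrected)                    # [-3.0, 1.0, -3.0, 0.0]
print(signed_rank_sums(corrected))  # (1.0, 5.0)
```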

A length ascertainment bias is thought to arise from selection for length during marker development. Selection for purity of repeat structure leads to microsatellites in nonfocal species carrying more interruptions, which in turn is likely to reduce mutation (slippage) rate, as reflected in a strong heterozygosity ascertainment bias. If the mutation process is unbiased, this difference in mutation rate will have no effect on relative length between species. However, if the mutation process is biased in favor of expansion, a lower mutation rate in nonfocal relative to focal species would contribute an additional length difference that would compound the length ascertainment bias (shorter in the nonfocal species). Similar effects have already been documented in the cow-sheep comparison (Crawford et al. 1998).
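
The argument above can be illustrated with a deliberately simple expected-value calculation. This sketch assumes a toy stepwise model in which each mutation adds one repeat unit with probability `p_expand` and removes one otherwise. All rates, generation counts, and the function name are hypothetical, and the model ignores interruptions, length dependence, and boundary effects.

```python
def expected_length_shift(mutation_rate, generations, p_expand, step_bp=2):
    """Expected net change in repeat-tract length (bp) under a toy stepwise model."""
    # Net repeat units gained per mutation: +1 with p_expand, -1 otherwise.
    net_units_per_mutation = p_expand - (1.0 - p_expand)
    return mutation_rate * generations * net_units_per_mutation * step_bp


generations = 1_000_000
focal_rate, nonfocal_rate = 1e-5, 5e-6  # hypothetical: interruptions halve the rate

# Unbiased mutation: a rate difference alone produces no expected length difference.
print(expected_length_shift(focal_rate, generations, 0.5),
      expected_length_shift(nonfocal_rate, generations, 0.5))

# Expansion-biased mutation: the faster-mutating focal locus gains more length.
focal = expected_length_shift(focal_rate, generations, 0.55)
nonfocal = expected_length_shift(nonfocal_rate, generations, 0.55)
print(focal > nonfocal)
```

Under these assumptions the expected shift scales with the product of mutation rate and mutational bias, so a rate difference matters only when the bias is nonzero.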

Longer microsatellites on average in D. melanogaster compared with D. simulans, as revealed by our reanalysis of the Hutter, Schug, and Aquadro (1998) data, could be explained if microsatellite slippage rates were higher in D. melanogaster compared with D. simulans under one model of microsatellite evolution (Kruglyak et al. 1998).

Just how and why a shift in the genome-wide rate of slippage could come about remains unclear. Amos et al. (1996) and Amos (1999) have argued that heterozygous sites mutate more than homozygous sites, and, hence, the increase in heterozygosity associated with population expansion could cause a parallel genome-wide increase in mutation rate. In our study, independent data from nucleotide variation studies suggest that D. simulans currently has the larger effective population size. This argues against the heterozygosity hypothesis (Hutter, Schug, and Aquadro 1998), even though we are still largely ignorant of both the time scales over which effects could be observed and the full impact of anthropogenic factors. Alternative explanations for a change in slippage rate might involve rapid evolution in the enzymes involved in DNA replication or changes in whatever forces may constrain genome size. Interestingly, Akashi (1996) has reported that average protein length is longer in D. melanogaster than in D. simulans. This result could simply be coincidence, or could reflect a genuine selective constraint on genome size that is more effective in D. simulans. Recent analyses of models of microsatellite evolution support the presence of mutation biases and/or selection on repeat length (e.g., Calabrese, Durrett, and Aquadro 2001). The role of differences in mismatch repair among species is also worth investigating (e.g., Harr, Todorova, and Schlötterer 2002). Studies of homologous genome regions across species (Webster, Smith, and Ellegren 2002) should shed additional light on alternative hypotheses to explain these genome-wide trends in sequence architecture.


    Acknowledgements
 
This work was partially supported by NIH grant GM36431 to C.F.A. We appreciate the statistical advice of A. G. Clark.


    Footnotes
 
E-mail: cfa1@cornell.edu.

David Rand, Associate Editor


    Literature Cited
 

    Akashi, H. 1996. Molecular evolution between Drosophila melanogaster and D. simulans: reduced codon bias, faster rates of amino acid substitution, and larger proteins in D. melanogaster. Genetics 144:1297-1307.

    Amos, W. 1999. A comparative approach to the study of microsatellite evolution. Pp. 67-79 in D. B. Goldstein and C. Schlötterer, eds. Microsatellites: evolution and implications. Oxford University Press, Oxford, England.

    Amos, W., S. J. Sawcer, R. Feakes, and D. C. Rubinsztein. 1996. Microsatellites show mutational bias and heterozygote instability. Nature Genet. 13:390-391.

    Calabrese, P. P., R. T. Durrett, and C. F. Aquadro. 2001. Dynamics of microsatellite divergence under stepwise and proportional slippage/point mutation models. Genetics 159:839-852.

    Crawford, A., S. M. Knappes, K. A. Peterson, M. J. deGotari, K. G. Dodds, R. T. Freking, R. T. Stone, and C. W. Beattie. 1998. Microsatellite evolution: testing the ascertainment bias hypothesis. J. Mol. Evol. 46:256-260.

    Cooper, G., D. C. Rubinsztein, and W. Amos. 1998. Ascertainment bias does not entirely account for human microsatellites being longer than their chimpanzee homologues. Hum. Mol. Genet. 7:1425-1429.

    Ellegren, H., S. Moore, N. Robinson, K. Byrne, W. Ward, and B. C. Sheldon. 1997. Microsatellite evolution—a reciprocal study of repeat lengths at homologous loci in cattle and sheep. Mol. Biol. Evol. 14:854-860.

    Ellegren, H., C. R. Primmer, and B. C. Sheldon. 1995. Microsatellite evolution: directionality or bias in locus selection. Nature Genet. 11:360-362.

    Forbes, S. H., J. T. Hogg, F. C. Buchanan, A. M. Crawford, and F. W. Allendorf. 1995. Microsatellite evolution in congeneric mammals: domestic and bighorn sheep. Mol. Biol. Evol. 12:1106-1113.

    Harr, B., J. Todorova, and C. Schlötterer. 2002. Mismatch repair-driven mutational bias in D. melanogaster. Mol. Cell 10:199-205.

    Hutter, C. M., M. D. Schug, and C. F. Aquadro. 1998. Molecular variation in Drosophila melanogaster and Drosophila simulans: a reciprocal test of the ascertainment bias hypothesis. Mol. Biol. Evol. 15:1620-1636.

    Kruglyak, S., R. Durrett, M. D. Schug, and C. F. Aquadro. 1998. Equilibrium distributions of microsatellite repeat length resulting from a balance between slippage events and point mutations. Proc. Natl. Acad. Sci. USA 95:10774-10778.

    Rubinsztein, D. C., W. Amos, J. Leggo, S. Goodburn, S. Jain, S. H. Li, R. L. Margolis, C. A. Ross, and M. A. Ferguson-Smith. 1995. Microsatellite evolution—evidence for directionality and variation in rate between species. Nature Genet. 10:337-343.

    Webster, M. T., N. G. C. Smith, and H. Ellegren. 2002. Microsatellite evolution inferred from human-chimpanzee genomic sequence alignments. Proc. Natl. Acad. Sci. USA 99:8748-8753.

Accepted for publication December 9, 2002.