Center for Population Biology, University of California, Davis
Correspondence: E-mail: adkern{at}ucdavis.edu.
Key Words: Drosophila, polymorphism, divergence, noncoding DNA, base composition, introns
Introduction
In this paper, we describe patterns of polymorphism and divergence in two types of noncoding DNA, nontranscribed intergenic regions and introns, from Drosophila melanogaster and Drosophila simulans. Additionally, we compare these data to synonymous site data from exons. Our goals are to investigate whether transcription is associated with different patterns of mutation or selection and to compare patterns of variation in nontranscribed DNA from two lineages. Two observations motivate these goals. First, Akashi (1996) used divergence data to infer that intron base composition is at equilibrium in D. melanogaster and D. simulans. Second, Takano-Shimizu (2001) suggested that noncoding sequences along the lineage between D. yakuba and D. melanogaster had undergone a shift in mutational bias towards A/T nucleotides. One interpretation of these results is that the mutation process differs between intron and intergenic sequences in Drosophila. A possible explanation for such differences is that introns are transcribed, whereas most intergenic regions, presumably, are not. Indeed, transcription appears to have a significant impact on patterns of base composition in mammals, most likely as a consequence of germline transcription-coupled repair (Green et al. 2003; Majewski 2003). Although there is no evidence of transcription-coupled repair in flies (de Cock et al. 1992; van der Helm et al. 1997; Sekelsky, Brodsky, and Burtis 2000), other properties or effects of transcription could affect mutation biases or rates.
Materials and Methods
Intron Sequences
To obtain a lineage-specific picture of polymorphism and divergence from intronic regions, loci for which polymorphism data existed from D. melanogaster and D. simulans and which also had at least one outgroup sequence from D. yakuba were assembled from GenBank. In total, 49 introns from 24 loci (5.5 kb) throughout the genome satisfied these criteria (tables 1 and 2). References for these sequences can be found in the publications listed in table 1. Most sampled introns were small, which is consistent with the heavily tailed distribution of intron sizes found in the D. melanogaster genome (Adams et al. 2000; Parsch 2003).
Many polymorphism and divergence statistics were estimated using software developed by A.D.K. Source code in Smalltalk is available upon request. To test the null hypothesis of neutral evolution, we performed multilocus Hudson-Kreitman-Aguadé tests (Hudson, Kreitman, and Aguadé 1987) on intergenic and intronic data separately. These tests were carried out using a version of Jody Hey's HKA program (Kliman et al. 2000) that we compiled to handle a larger number of loci. Critical values reported here for HKA tests were based on 10,000 neutral coalescent simulations rather than the standard chi-square distribution.
Parsimony and maximum-likelihood analyses were performed on a site-by-site basis to infer ancestral states for the common ancestor of D. melanogaster and D. simulans. Inferred ancestral states were used to identify lineage-specific fixations and to infer ancestral states for polymorphisms. For likelihood reconstruction, we used PAML (Yang 1997) under a number of substitution models. For these reconstructions, we assumed that within-species genealogies are star shaped. This could be a problem if one were indirectly inferring the number of polymorphisms within species because the assumption of a star-shaped genealogy would inflate the number of mutations inferred on terminal branches. However, because we are interested only in the internal node of the genealogy, which represents the most recent common ancestor of D. melanogaster and D. simulans, the star-phylogeny assumption is not expected to have a large effect on our inferences. Indeed, any error caused by this assumption appears to have had little effect on our conclusions, as we obtained qualitatively similar results independent of the mode of ancestral reconstruction employed. Furthermore, only minor differences were observed for different likelihood models (data not shown), most plausibly as a function of the short divergence times. Results reported here are from maximum-likelihood reconstructions under the general time reversible (GTR) model (Tavaré 1986), where both the transition/transversion rate and shape parameter for rate heterogeneity were estimated from the data. For parsimony reconstructions, only sites for which an unambiguous ancestral state could be inferred were considered. Although parsimony may lead to inaccurate ancestral inference over large evolutionary distances, it should provide a reliable criterion over the short divergence times between taxa considered here (Yang, Kumar, and Nei 1995). All parsimony analyses were done using software written by A.D.K. 
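The parsimony criterion described above can be illustrated for a single aligned site. The following is a minimal sketch, assuming one base per species (no within-species polymorphism at the site); the function names `infer_ancestor` and `fixed_lineage` are hypothetical and are not taken from the software written by A.D.K.

```python
from typing import Optional


def infer_ancestor(mel: str, sim: str, yak: str) -> Optional[str]:
    """Parsimony state at the D. melanogaster / D. simulans ancestor.

    Returns None when no unambiguous state can be inferred.
    """
    mel, sim, yak = mel.upper(), sim.upper(), yak.upper()
    if mel == sim:
        # Both ingroup species agree, so no ingroup change is required.
        return mel
    if yak in (mel, sim):
        # The outgroup matches one ingroup species; that base is ancestral.
        return yak
    # All three species differ: the ancestral state is ambiguous.
    return None


def fixed_lineage(mel: str, sim: str, yak: str) -> Optional[str]:
    """Name the lineage on which a fixed difference arose, if resolvable."""
    ancestor = infer_ancestor(mel, sim, yak)
    if ancestor is None or mel.upper() == sim.upper():
        return None
    return "melanogaster" if mel.upper() != ancestor else "simulans"
```

For example, a site with A in D. melanogaster, G in D. simulans, and A in D. yakuba is assigned to the D. simulans lineage, whereas a site with three different bases is discarded as ambiguous.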
Once ancestral states were inferred, polarized mutations were classified as A/T to G/C or G/C to A/T. To explicitly deal with effects of polymorphism on estimates of ML divergence, we calculated distances for all possible permutations of unrooted, three-taxon trees (D. melanogaster, D. simulans, and D. yakuba) using one sampled allele for each species and reported the mean divergence along each branch.
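The classification of polarized mutations can be sketched as follows. This is an illustrative helper only; the name `classify_mutation` and the category labels are assumptions made for the example, and the input pairs in the usage note are invented.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

AT = {"A", "T"}
GC = {"G", "C"}


def classify_mutation(ancestral: str, derived: str) -> Optional[str]:
    """Classify a polarized mutation by its effect on A/T content."""
    ancestral, derived = ancestral.upper(), derived.upper()
    if ancestral in AT and derived in GC:
        return "AT_to_GC"
    if ancestral in GC and derived in AT:
        return "GC_to_AT"
    # A<->T and G<->C changes leave A/T content unchanged.
    return None


def tally(mutations: Iterable[Tuple[str, str]]) -> Counter:
    """Count mutations per class, skipping composition-neutral changes."""
    counts = Counter()
    for ancestral, derived in mutations:
        label = classify_mutation(ancestral, derived)
        if label is not None:
            counts[label] += 1
    return counts
```

Calling `tally([("G", "A"), ("C", "T"), ("A", "G"), ("A", "T")])` counts two G/C to A/T changes and one A/T to G/C change; the A to T change is ignored.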
Most statistical hypothesis testing was implemented using the R language (http://www.r-project.org). Stationarity for base composition was assessed by comparing polymorphic versus fixed, A/T to G/C versus G/C to A/T mutations, following McDonald and Kreitman (1991). The nonparametric Wilcoxon rank sum test was used to test for significant differences in the frequency of segregating A/T to G/C and G/C to A/T polymorphisms.
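The 2 x 2 contrast of polymorphic versus fixed, A/T to G/C versus G/C to A/T mutations can be tested with a G-test of independence. The sketch below is a plain-Python illustration under stated assumptions: it applies no continuity or Williams correction, which the original analysis may or may not have used, and the function name `g_test_2x2` is hypothetical.

```python
import math
from typing import Sequence, Tuple


def g_test_2x2(table: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """G-test of independence for a 2 x 2 table of counts.

    Rows could be (polymorphic, fixed) and columns (A/T to G/C, G/C to A/T).
    Returns the G statistic and its P value (chi-square, 1 df).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    g = 0.0
    for i in range(2):
        for j in range(2):
            observed = table[i][j]
            expected = row_totals[i] * col_totals[j] / total
            if observed > 0:  # empty cells contribute nothing to G
                g += 2.0 * observed * math.log(observed / expected)
    g = max(g, 0.0)  # guard against tiny negative rounding error
    # For 1 degree of freedom the chi-square tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value
```

With invented counts such as `[[20, 5], [5, 20]]` the statistic is large and homogeneity is rejected, whereas a balanced table such as `[[10, 10], [10, 10]]` gives G = 0 and P = 1.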
Bayesian Analysis of Base-Composition Evolution
Under the assumption of selective neutrality, equilibrium base-composition bias can be estimated from present base composition (A/T%) and estimates of the per bp rate of mutations towards (v) and away from (u) A/T (Sueoka 1993; Eyre-Walker 1997). Let
![]() |
![]() |
![]() |
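The neutral equilibrium expectation referred to above (Sueoka 1993; Eyre-Walker 1997) can be sketched numerically. Under neutrality the equilibrium A/T frequency is v / (u + v). If new G/C to A/T mutations arise in proportion to (1 - f)v and new A/T to G/C mutations in proportion to fu, where f is the current A/T frequency, then the relative rates can be recovered from mutation counts. The code below is an illustration of that reasoning only and is not necessarily the likelihood function used in this paper; the function name and the counts in the usage note are hypothetical.

```python
def equilibrium_at(n_gc_to_at: int, n_at_to_gc: int, current_at: float) -> float:
    """Point estimate of equilibrium A/T frequency under neutrality.

    n_gc_to_at : count of new mutations toward A/T
    n_at_to_gc : count of new mutations away from A/T
    current_at : present A/T frequency, strictly between 0 and 1
    """
    if not 0.0 < current_at < 1.0:
        raise ValueError("current_at must lie strictly between 0 and 1")
    # Relative per-site rate toward A/T: mutations arise at G/C sites.
    v = n_gc_to_at / (1.0 - current_at)
    # Relative per-site rate away from A/T: mutations arise at A/T sites.
    u = n_at_to_gc / current_at
    return v / (u + v)
```

When the two mutation classes are equally frequent, the estimate equals the current composition (for example, `equilibrium_at(50, 50, 0.6)` returns 0.6), which is the stationarity expectation; an excess of G/C to A/T mutations pushes the estimate above the current A/T frequency.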
To estimate the equilibrium A/T content in a Bayesian setting, we implemented this likelihood function in a Markov chain Monte Carlo (MCMC) analysis using a noninformative Dirichlet prior. The search through likelihood space utilizes the Metropolis-Hastings algorithm (Metropolis and Ulam 1949; Hastings 1970). Multiple chains were run from different initial conditions, and convergence was monitored using the convergence statistic suggested by Gelman et al. (1995). Based on this criterion, chains converged very quickly (< 4,000 iterations), as might be expected from such a simple likelihood function. Markov chains were thinned to sampling every 1,000 iterations, and 10,000 samples were taken after an initial burn-in of 5,000 iterations.
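The general procedure (several chains from dispersed starting points, a burn-in, and a between-chain convergence check) can be sketched as follows. Everything specific here is an assumption made for illustration: the likelihood is a simple binomial stand-in rather than the authors' function, the proposal is a symmetric random walk (the Metropolis special case of Metropolis-Hastings), the convergence check is the potential scale reduction factor in its basic form, no thinning is applied, and the counts are invented.

```python
import math
import random


def log_posterior(p, k, n):
    """Binomial log-likelihood for k G/C-to-A/T mutations out of n,
    with a flat Dirichlet(1, 1) prior on (p, 1 - p)."""
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


def run_chain(k, n, start, n_iter=6000, step=0.05, seed=0):
    """Random-walk Metropolis sampler for p on the open interval (0, 1)."""
    rng = random.Random(seed)
    p = start
    samples = []
    for _ in range(n_iter):
        proposal = p + rng.gauss(0.0, step)
        if 0.0 < proposal < 1.0:  # proposals outside (0, 1) are rejected
            log_ratio = log_posterior(proposal, k, n) - log_posterior(p, k, n)
            if rng.random() < math.exp(min(0.0, log_ratio)):
                p = proposal
        samples.append(p)
    return samples


def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    between = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)
    within = sum(
        sum((x - mu) ** 2 for x in c) / (n - 1) for c, mu in zip(chains, means)
    ) / m
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


# Hypothetical counts: 60 of 90 polarized mutations are G/C to A/T.
starts = [0.1, 0.5, 0.9]
chains = [run_chain(60, 90, s, seed=i)[1000:] for i, s in enumerate(starts)]
r_hat = gelman_rubin(chains)
posterior_mean = sum(map(sum, chains)) / sum(map(len, chains))
```

Values of `r_hat` close to 1 indicate that the chains started from different points have mixed; each sampled `p` could then be transformed into an equilibrium A/T frequency given the current base composition.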
Results
Levels of intergenic DNA polymorphism in D. simulans (mean = 0.0102, median = 0.0091) were surprisingly low relative to intron and synonymous estimates. A Kruskal-Wallis rank sum test on heterozygosity for the three classes of silent polymorphisms in D. simulans (table 4) is nearly significant for both heterozygosity estimators (chi-square = 5.7789, P = 0.05533; chi-square = 5.7924, P = 0.05523). In D. melanogaster, intergenic heterozygosity is roughly similar to intronic and synonymous heterozygosity (mean intergenic = 0.0126; median intergenic = 0.0126). Kruskal-Wallis tests on the three categories of silent variation in D. melanogaster did not reject the null hypothesis of homogeneity for either estimator (chi-square = 4.228, P = 0.1211; chi-square = 5.5167, P = 0.0634).
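A Kruskal-Wallis comparison of three site classes can be sketched in plain Python. This is an illustration, not the R routine used for the analyses; the function names are hypothetical. Ties receive midranks and the usual tie correction is applied. With three groups the reference chi-square distribution has 2 degrees of freedom, for which the upper tail is simply exp(-H/2).

```python
import math
from typing import List, Sequence, Tuple


def midranks(values: Sequence[float]) -> Tuple[List[float], List[int]]:
    """Return midranks for the values and the size of every tie group."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def kruskal_wallis(groups: Sequence[Sequence[float]]) -> float:
    """Tie-corrected Kruskal-Wallis H statistic."""
    pooled = [x for group in groups for x in group]
    n = len(pooled)
    ranks, tie_sizes = midranks(pooled)
    h, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        h += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * h - 3.0 * (n + 1)
    correction = 1.0 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
    return h / correction


def p_value_three_groups(h: float) -> float:
    """Upper-tail chi-square probability for 2 df (three groups only)."""
    return math.exp(-h / 2.0)
```

As a check on the 2-df relationship, `p_value_three_groups(5.7924)` is approximately 0.0552.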
D. melanogaster versus D. simulans heterozygosity
Comparison of heterozygosities in D. melanogaster and D. simulans reveals significant differences for intronic and synonymous sites (Wilcoxon rank sum tests, intronic: P = 0.0006; synonymous: P = 0.0203) but not for intergenic sites (Wilcoxon rank sum test, intergenic: P = 0.3176). However, comparison of African D. melanogaster samples (n = 11 loci: mth, G6pd, mei-218, pgm, tpi, Adh, anon1A3, anon1E9, anon1G5, v, and acp53 [table 1 in Supplementary Material online]) versus D. simulans provides a slightly different result. Intron variation is significantly lower in D. melanogaster (Wilcoxon rank sum test, intronic: P = 0.00277), but intergenic and synonymous heterozygosity are roughly equal (African D. melanogaster: mean intergenic = 0.0126, mean intronic = 0.0097, median synonymous = 0.0176; D. simulans: mean intergenic = 0.0102, mean intronic = 0.0205, mean synonymous = 0.024). In other words, for African D. melanogaster versus D. simulans, the only obvious difference in heterozygosity is for introns.
Divergence
Unpolarized comparisons of divergence between D. melanogaster and D. simulans among silent sites reveal that the rank order of divergence is intergenic (mean Jukes-Cantor corrected divergence = 0.0578) < intronic (mean JC corrected = 0.0969) < synonymous (mean JC corrected = 0.1181). Thus, in pairwise comparisons, intergenic DNA is evolving roughly half as quickly as intronic or synonymous sites. A test for heterogeneity of divergences among classes of sites is highly significant (Kruskal-Wallis rank sum test: chi-square = 14.7973, P = 0.0006). Mean divergence for the two classes of noncoding sites (i.e., intergenic and intronic sequence) is also significantly different (Wilcoxon rank sum test, P = 0.0291).
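The Jukes-Cantor correction used for these divergence estimates converts an observed proportion of differing sites, p, into an expected number of substitutions per site, d = -(3/4) ln(1 - 4p/3). A minimal sketch follows; the function name is hypothetical and the input in the usage note is an arbitrary illustration rather than a value from the data.

```python
import math


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected divergence from the proportion of differences."""
    if not 0.0 <= p < 0.75:
        # The correction is undefined once p reaches saturation at 3/4.
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

For small distances the correction is modest: an observed proportion of 0.10 corresponds to a corrected divergence of about 0.107.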
Divergence that has been polarized to the D. melanogaster or D. simulans lineages (see Materials and Methods) provides a similar picture of evolution. Rank order mean divergence in D. melanogaster for silent classes was intergenic (0.0447) < synonymous (0.0571) < intronic (0.0730). These three classes are not significantly heterogeneous (Kruskal-Wallis rank sum test, chi-square = 2.3284, P = 0.3122). This rank order of mean divergence is slightly different from that observed in D. simulans, which mirrors the unpolarized analysis: intergenic (0.0301) < intronic (0.0451) < synonymous (0.0506). The three site classes were also not heterogeneous for D. simulans (Kruskal-Wallis rank sum test, chi-square = 4.2384, P = 0.1203).
Our analysis supports the idea that D. melanogaster evolves faster than D. simulans across all silent sites (Wilcoxon rank sum test, silent sites P = 0.0202; sign test, 38/56 P = 0.0029). This is consistent with previous reports from protein-coding regions (Akashi 1996).
Tests of Neutrality
We examined the frequency spectrum of segregating sites and compared levels of polymorphism and divergence to test for departures from neutral equilibrium expectations at silent sites. Tests of the frequency spectrum were carried out using Tajima's D statistic (Tajima 1989). The results from these tests can be found in tables 1, 2, and 3. Generally, there were few instances of significant D values. We carried out multilocus HKA tests of synonymous, intronic, and intergenic sites separately. Figures 1, 2, and 3 display locus-by-locus contributions to the overall test statistic. Observed values that were less than/greater than expected values are plotted as negative/positive values, respectively. Both synonymous and intronic sequences (figs. 1 and 2) yielded a significant HKA result (intronic: P = 0.0170; synonymous: P = 0.0002), consistent with too little polymorphism, given levels of divergence. Deviations for D. melanogaster polymorphism, D. simulans polymorphism, and divergence between species were highly correlated among intronic and synonymous sites from a given locus (Spearman's rank correlation: D. melanogaster polymorphism, rho = 0.4617, P = 0.02412; D. simulans polymorphism, rho = 0.4497, P = 0.02851; interspecific divergence, rho = 0.5904, P = 0.002837). To the extent that the significant heterogeneity as revealed by HKA tests reflects the impact of linked, beneficial mutations (Maynard Smith and Haigh 1974; Kaplan et al. 1989), the correlations of deviations suggest that hitchhiking effects often span gene-sized or larger genomic regions. Intergenic sequences, however, did not reject neutrality (P < 0.6920). The number of intergenic regions surveyed (n = 7) is much smaller than the number of introns surveyed (n = 49).
Thus, the different statistical results for intron + synonymous versus intergenic could reflect reduced power to reject the null in the intergenic sample or could result if most beneficial mutations occur in or near genes, rather than in intergenic DNA. However, if we restrict our HKA tests of synonymous and intronic sites to African D. melanogaster samples, we observe a slightly different result. In this case, synonymous sites still reject neutrality, but intronic sites do not reject the null (African samples, synonymous: P = 0.0430; intronic: P < 0.389). Once again it is feasible that this difference might be the result of reduced power to reject in our smaller sample of African loci/alleles. Alternatively, selection associated with migration out of Africa might explain the differences seen between our worldwide samples and our restricted African subsamples (e.g., Kauer, Dieringer, and Schlotterer 2003).
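The frequency-spectrum test mentioned above, Tajima's D (Tajima 1989), contrasts the mean number of pairwise differences with the number of segregating sites scaled by the harmonic number of the sample. The sketch below implements the standard formula; the function name and the example values are illustrative and do not come from the tables in this paper.

```python
import math


def tajimas_d(n: int, seg_sites: int, mean_pairwise_diff: float) -> float:
    """Tajima's D for n sequences.

    seg_sites          : number of segregating sites (S)
    mean_pairwise_diff : average number of pairwise differences
    """
    if n < 4 or seg_sites < 1:
        # The variance term is zero for n < 4, so D is undefined there.
        raise ValueError("need n >= 4 and at least one segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise_diff - seg_sites / a1) / math.sqrt(variance)
```

Negative values indicate an excess of low-frequency variants relative to the neutral equilibrium expectation; for instance, 10 sequences with 16 segregating sites and a mean of 3.89 pairwise differences give D of roughly -1.45.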
Contrasts of Polymorphism and Divergence
Comparisons of polymorphic and fixed G/C to A/T and A/T to G/C mutations reveal striking patterns of heterogeneity. Although the 2 x 2 table of polymorphic and fixed mutations is not significantly heterogeneous for D. simulans introns, D. simulans intergenic, or D. melanogaster intergenic DNA, the table for D. melanogaster introns is significantly heterogeneous, regardless of the method of ancestral reconstruction used (G-tests: ML P = 0.0013, parsimony P = 0.008). It appears that there is a large excess of G/C to A/T polymorphisms in D. melanogaster introns.
The 2 x 2 tables for synonymous site evolution (table 7) are highly significant in D. simulans and D. melanogaster. Previous analyses of D. simulans variation revealed a major excess of A/T polymorphisms and a smaller excess of A/T fixations; the polymorphism and divergence were heterogeneous in that the deviations from equilibrium expectation were much greater for the polymorphism (Akashi 1996; Begun 2001). Our results are consistent with these earlier analyses. Surprisingly, however, the 2 x 2 tables for synonymous site evolution (table 7) show that D. melanogaster is also significantly heterogeneous for polymorphic and fixed G/C versus A/T variants. Previous analyses of fewer D. melanogaster data showed no heterogeneity of polymorphic versus fixed A/T versus G/C mutations at synonymous sites (Akashi 1994, 1995).
Frequency Distribution
Given the above results from D. melanogaster, we investigated the frequency distribution of A/T and G/C polymorphisms from intronic and intergenic DNA in D. melanogaster. Comparisons using different ancestral reconstructions yield inconsistent results in this analysis. If we consider each site as an independent observation and use parsimony reconstructions, G/C mutations are at higher frequency than A/T mutations (Wilcoxon rank sum test: P = 0.03131). ML reconstructions show a qualitatively similar pattern; however, a hypothesis test fails to reject the null (Wilcoxon rank sum test: P = 0.1701). Thus, there is weak evidence that derived G/C and A/T mutations are at different frequencies in D. melanogaster populations. If we compare the average frequency per gene among silent classes (a more conservative test), we do not detect a significant difference between classes. Intergenic regions from D. melanogaster also show no hint of a frequency difference among polymorphism classes, both when comparing gene-averaged frequencies, or when treating segregating sites as independent observations. It is worth noting, however, that failure to reject the null in this setting may be the result of reduced power in the small population samples we have examined.
Bayesian Analysis of Base-Composition Evolution
Under the premise that excess A/T intron polymorphism in D. melanogaster results from a recent change in mutation bias, we can estimate the new expected mutation bias at equilibrium (Eyre-Walker 1997). Figure 4 presents the posterior distribution of the equilibrium A/T content, given the numbers of A/T and G/C polymorphisms and fixations from D. melanogaster and D. simulans introns from our Markov chain simulation. Table 8 presents results from our MCMC contrasting D. melanogaster and D. simulans at both intronic and intergenic sites. Current A/T content in D. melanogaster introns is 0.647. The MAP estimate of the equilibrium A/T content from our MCMC simulation was 0.77, with a 95% credible interval from 0.651 to 0.846. Thus, the data are compatible with a major evolutionary change of base composition in D. melanogaster introns. This is in contrast to our estimates from D. simulans introns, where the current A/T frequency, 0.614, is well within the 95% credible interval for our estimates of the equilibrium A/T content (table 8). Additionally, MCMC simulations for our intergenic data are compatible with no change in equilibrium base-composition bias relative to that currently observed in either species.
Discussion
First, intergenic DNA appears to evolve more slowly than intron DNA or synonymous sites in exons. This is consistent with recent reports by Halligan et al. (2004) on intergenic divergence in these species. However, much of their intergenic sequence was sampled from 5' and 3' flanking regions of genes. Such regions may be expected to harbor functionally important sites. Conversely, much of our sampled intergenic DNA comes from regions not near known or predicted genes. One interpretation of our data (and those of Halligan et al. [2004]) is that the neutral mutation rate in intergenic DNA is unexpectedly low as a result of functional constraint. The relatively low heterozygosity in these regions for D. simulans is consistent with this interpretation. However, the data are also completely consistent with a reduced mutation rate in nontranscribed sequences in Drosophila.
Second, D. melanogaster appears to be evolving significantly faster than D. simulans at all three classes of silent sites: synonymous, intergenic, and intronic. Previous results (Akashi 1996) suggested that rates of protein evolution are higher along the D. melanogaster than the D. simulans lineage. Indeed, analysis of the larger set of genes used here confirms the trend of faster protein evolution along the D. melanogaster lineage, although the difference is not significant (D. melanogaster: mean dN = 0.0131; D. simulans: mean dN = 0.0124). Thus, the data from D. melanogaster and D. simulans suggest that D. melanogaster evolves faster than D. simulans at all classes of sites examined. Akashi (1996) interpreted the higher rate of evolution in D. melanogaster replacement and synonymous sites (versus D. simulans) as resulting from reduced efficacy of natural selection against slightly deleterious mutations, perhaps as a consequence of reduced effective population size. If this explanation is correct, then our data suggest that all classes of sites are fixing slightly deleterious mutations in the D. melanogaster lineage. A corollary of this hypothesis is that the distribution of selection coefficients of new mutations is similar (i.e., nearly neutral) across these functionally disparate site types. Perhaps a more parsimonious interpretation is that the mutation rate has increased in the D. melanogaster lineage (relative to D. simulans). This would seem to be at odds with the observation that D. melanogaster is less polymorphic than D. simulans in both African and non-African populations (e.g., Begun and Whitley 2000; Andolfatto 2001). 
However, the neutral model posits that levels of variation depend on population size and neutral mutation rate, whereas divergence only depends on neutral mutation rate (Kimura 1983). Perhaps one attractive feature of the mutation rate-increase hypothesis is that it is amenable to direct empirical testing (e.g., Eeken, de Jong, and Green 1987).
Previous studies of D. melanogaster and D. simulans variation suggested a fundamental difference between species in the dynamics of synonymous variation. Contrasts of polymorphic versus fixed, A/T versus G/C mutations in D. melanogaster were homogeneous, yet A/T mutations greatly outnumbered G/C mutations for both polymorphic and fixed sites (Akashi 1995). The contrast of polymorphic versus fixed, A/T versus G/C mutations in D. simulans was highly significantly heterogeneous, consistent with an excess of A/T polymorphisms (Akashi 1995; Begun 2001). These results were interpreted as supporting a model of neutral evolution of higher A/T content in D. melanogaster (Akashi 1995, 1996). D. simulans data suggested that this lineage is also evolving higher A/T content (Begun 2001; McVean and Vieira 2001), although at a much slower rate than D. melanogaster, and that the polymorphism may be inconsistent with neutrality (Akashi 1996).
Our analysis of this larger D. melanogaster set suggests a different view of the polymorphism and divergence at synonymous sites, namely, that the contrasts of polymorphic and fixed A/T versus G/C mutations are significantly heterogeneous in both species (table 7). Such allele configurations (table 7) were previously interpreted as evidence for reduced selection against slightly deleterious A/T fixations after a population bottleneck and as evidence for nearly neutral dynamics for A/T polymorphisms (Akashi 1995, 1996). An alternative hypothesis for such synonymous site data is a recent change in mutation bias (Eyre-Walker 1997; Akashi 1997). Of course, rejection of homogeneity in the 2 x 2 contingency table is formally consistent with other interpretations, which complicates inferences regarding base-composition evolution at synonymous sites. For example, if the polymorphism data provide an accurate picture of historical mutational input, the data could be interpreted as a lineage-specific fixation bias of G/C mutations in D. melanogaster, perhaps as a result of natural selection or biased gene conversion (Holmquist 1992; Eyre-Walker 1993, 1999). In any case, our results suggest that patterns of base-composition variation at synonymous sites are qualitatively similar in the two species.
If the mutation bias hypothesis were the correct explanation for synonymous site variation, we might expect to observe similar patterns of base-composition evolution for intronic and intergenic sites. In fact, we did observe a similar pattern, but only for intronic and not intergenic sites, and only for D. melanogaster and not D. simulans (tables 5 and 6). Thus, the data from these two species are recalcitrant to a simple explanation. An analysis aimed at quantifying the magnitude of such a bias from D. melanogaster introns is consistent with a large change in base composition currently under way along that lineage (fig. 4).
One interpretation of the lineage and site-specific heterogeneity for contrasts of polymorphic and fixed, A/T versus G/C mutations is a recent change in mutation bias in transcribed D. melanogaster (but not in D. simulans) sites (i.e., introns and synonymous sites). In principle, this could explain both the much greater excess of A/T fixations in D. melanogaster versus D. simulans synonymous sites and the excess of A/T polymorphisms in D. melanogaster but not in D. simulans introns. Overall then, data from the two species may be consistent with mildly deleterious evolution of synonymous sites in both species, leading to an accumulation of unpreferred (i.e., A/T ending) codons and a recent change in mutation bias associated with transcribed DNA in D. melanogaster. This explanation is somewhat unsatisfying because it is essentially a restatement of the empirical observations. However, it does have the virtue of predicting that careful measurement of DNA metabolism in D. melanogaster and D. simulans will reveal interspecific differences. For example, the hypothesis of different patterns of mutation bias in D. melanogaster versus D. simulans could be investigated experimentally by direct assays of genomic DNA damage or genetic assays to detect differences in DNA repair efficiency between D. melanogaster and D. simulans. Genomic-scale polymorphism and divergence data from DNA sequences varying in levels of germline transcription would also be valuable for further exploration of these ideas.
Mammalian data suggest that germline transcription-coupled repair may have important consequences for base-composition evolution (Green et al. 2003; Majewski 2003). However, there is no evidence for transcription-coupled repair in flies (de Cock et al. 1992; van der Helm et al. 1997; Sekelsky, Brodsky, and Burtis 2000). Nevertheless, transcribed DNA is packaged into a more "open" chromatin conformation than nontranscribed DNA (reviewed in Kornberg and Lorch [1995]), which is thought to make transcribed DNA more accessible to endogenous DNA-damaging agents (Bartlett, Scicchitano, and Robison 1991; Ljungman and Hanawalt 1992). In principle, differential chromatin states may affect the distribution of new mutations in transcribed versus nontranscribed sequences, a phenomenon known as transcription-associated mutation (TAM).
A problem with the recent change in mutation-bias hypothesis is that it predicts G/C to A/T mutations are, on average, younger than A/T to G/C mutations and, thus, should be segregating at lower frequencies in D. melanogaster (Sawyer 1977). We observed no such heterogeneity of frequencies, although the relatively small sample sizes may provide little power to detect such a phenomenon.
A recent survey of synonymous sites at the rosy locus from 22 Drosophila species showed that many lineages appear to be evolving higher A/T content (Begun and Whitley 2003). In this respect, D. melanogaster and D. simulans appear to be like other flies: they are becoming more A/T rich at synonymous sites. One explanation for these data is that the ancestor of Drosophila was highly G/C rich but evolved a strong(er) A/T mutation bias before splitting of the extant Drosophila lineages. In this scenario, the accumulation of A/T mutations in many lineages reflects a slow approach to a new, higher A/T-content equilibrium. If intergenic and intron sequences are under weaker selection for base composition, perhaps they would have approached this equilibrium more quickly compared with synonymous sites, thereby explaining why the substitution data from intergenic and intron DNA, but not synonymous sites, support base-composition equilibrium.
Finally, one of the major questions of population genetics is how much of the genome is either directly or indirectly influenced by directional selection. Our comparisons of polymorphism and divergence from worldwide samples using the HKA test (Hudson, Kreitman, and Aguadé 1987) revealed that whereas intergenic data were consistent with neutrality, synonymous site and intronic data were not (although note the differences in our restricted African sample). If the significant HKA results are a consequence of hitchhiking effects (Maynard Smith and Haigh 1974; Kaplan et al. 1989), then we might infer that (1) directional selection tends to occur in genes and (2) the physical extent of hitchhiking effects is often "gene-sized" or larger, on the order of a few kilobases.
References
Adams, M. D., S. E. Celniker, R. A. Holt, C. A. Evans, et al. 2000. The genome sequence of Drosophila melanogaster. Science 287:2185-2195.
Akashi, H. 1994. Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics 136:927-935.
Akashi, H. 1995. Inferring weak selection from patterns of polymorphism and divergence at silent sites in Drosophila. Genetics 139:1067-1076.
Akashi, H. 1996. Molecular evolution between Drosophila melanogaster and D. simulans: reduced codon bias, faster rates of amino acid substitution and larger proteins in D. melanogaster. Genetics 144:1297-1307.
Akashi, H. 1997. Distinguishing the effects of mutational biases and natural selection on DNA sequence variation. Genetics 147:1983-1987.
Andolfatto, P. 2001. Contrasting patterns of X-linked and autosomal nucleotide variation in Drosophila melanogaster and Drosophila simulans. Mol. Biol. Evol. 18:279-290.
Bartlett, J. D., D. A. Scicchitano, and S. H. Robison. 1991. Two expressed human genes sustain slightly more DNA damage after alkylating agent treatment than an inactive gene. Mutat. Res. 255:247-256.
Begun, D. J. 2001. The frequency distribution of nucleotide variation in Drosophila simulans. Mol. Biol. Evol. 18:1343-1352.
Begun, D. J., and C. F. Aquadro. 1992. Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster. Nature 356:519-520.
Begun, D. J., and C. F. Aquadro. 1995. Molecular variation at the vermilion locus in geographically diverse populations of Drosophila melanogaster and Drosophila simulans. Genetics 140:1019-1032.
Begun, D. J., A. J. Betancourt, C. H. Langley, and W. Stephan. 1999. Is the fast/slow allozyme variation at the Adh locus of Drosophila melanogaster an ancient balanced polymorphism? Mol. Biol. Evol. 16:1816-1819.
Begun, D. J., and P. Whitley. 2000. Reduced X-linked nucleotide polymorphism in Drosophila simulans. Proc. Natl. Acad. Sci. USA 97:5960-5965.
Begun, D. J., and P. Whitley. 2003. Molecular population genetics of Xdh and the evolution of base composition in Drosophila. Genetics 162:1725-1735.
Begun, D. J., P. Whitley, B. L. Todd, H. M. Waldrip-Dail, and A. G. Clark. 2000. Molecular population genetics of male accessory gland proteins in Drosophila. Genetics 156:1879-1888.
Bergman, C. M., and M. Kreitman. 2001. Analysis of conserved noncoding DNA in Drosophila reveals similar constraints in intergenic and intronic sequences. Genome Res. 11:1335-1345.
Berry, A. J., J. W. Ajioka, and M. Kreitman. 1991. Lack of polymorphism on the Drosophila fourth chromosome resulting from selection. Genetics 129:1111-1117.
Clark, A. G., B. G. Leicht, and S. V. Muse. 1996. Length variation and secondary structure of introns in the Mlc1 gene in six species of Drosophila. Mol. Biol. Evol. 13:481-482.
Cooke, P. H., and J. G. Oakeshott. 1989. Amino acid polymorphisms for esterase-6 in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 86:1426-1430.
de Cock, J. G., E. C. Klink, W. Ferro, P. H. Lohman, and J. C. Eeken. 1992. Neither enhanced removal of cyclobutane pyrimidine dimers nor strand-specific repair is found after transcription induction of the beta 3-tubulin gene in a Drosophila embryonic cell line Kc. Mutat. Res. 293:11-20.
Eanes, W. F., M. Kirchner, and J. Yoon. 1993. Evidence for adaptive evolution of the G6pd gene in the Drosophila melanogaster and Drosophila simulans lineages. Proc. Natl. Acad. Sci. USA 90:7475-7479.
Eanes, W. F., M. Kirchner, J. Yoon, C. H. Biermann, I. N. Wang, M. A. McCartney, and B. C. Verrelli. 1996. Historical selection, amino acid polymorphism and lineage-specific divergence at the G6pd locus in Drosophila melanogaster and D. simulans. Genetics 144:1027-1041.
Eeken, J. C., A. W. de Jong, and M. M. Green. 1987. The spontaneous mutation rate in Drosophila simulans. Mutat. Res. 192:259-262.
Ewing, B., and P. Green. 1998. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8:186194.
Ewing, B., L. Hillier, M. C. Wendl, and P. Green. 1998. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8:175185.
Eyre-Walker, A. 1993. Recombination and mammalian genome evolution. Proc. R. Soc. Lond. B Biol. Sci. 252:237243.[ISI][Medline]
Eyre-Walker, A. 1997. Differentiating between selection and mutation bias. Genetics 147:1983-1987.
Eyre-Walker, A. 1999. Evidence of selection on silent base composition in mammals: potential implications for the evolution of isochores and junk DNA. Genetics 152:675-683.
Gelman, A., J. B. Carlin, H. S. Stern, and D. B. Rubin. 1995. Bayesian data analysis. Chapman & Hall, New York.
Gordon, D., C. Abajian, and P. Green. 1998. Consed: a graphical tool for sequence finishing. Genome Res. 8:195-202.
Green, P., B. Ewing, W. Miller, P. J. Thomas, E. D. Green, and NISC Comparative Sequencing Program. 2003. Transcription-associated mutational asymmetry in mammalian evolution. Nat. Genet. 33:514-517.
Halligan, D. L., A. Eyre-Walker, P. Andolfatto, and P. D. Keightley. 2004. Patterns of evolutionary constraints in intronic and intergenic DNA of Drosophila. Genome Res. 14:273-279.
Hasson, E., and W. F. Eanes. 1996. Contrasting histories of three gene regions associated with In(3L)Payne of Drosophila melanogaster. Genetics 144:1565-1575.
Hasson, E., I. N. Wang, L. W. Zeng, M. Kreitman, and W. F. Eanes. 1998. Nucleotide variation in the triosephosphate isomerase (Tpi) locus of Drosophila. Mol. Biol. Evol. 15:756-769.
Hastings, W. K. 1970. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57:97-109.
Hilton, H., R. M. Kliman, and J. Hey. 1994. Using hitchhiking genes to study adaptation and divergence during speciation with the Drosophila melanogaster complex. Evolution 48:1900-1913.
Holmquist, G. P. 1992. Chromosome bands, their chromatin flavors and their functional features. Am. J. Hum. Genet. 51:17-37.
Hudson, R. R., M. Kreitman, and M. Aguade. 1987. A test of the neutral molecular evolution based on nucleotide data. Genetics 116:153-159.
Kaplan, N. L., R. R. Hudson, and C. H. Langley. 1989. The "hitchhiking effect" revisited. Genetics 123:887-899.
Karotam, J., A. C. Delves, and J. G. Oakeshott. 1993. Conservation and change in structural and 5' flanking sequences of esterase 6 in sibling Drosophila species. Genetica 90:11-28.
Kauer, M. O., D. Dieringer, and C. Schlotterer. 2003. A microsatellite variability screen for positive selection associated with the "out of Africa" habitat expansion of Drosophila melanogaster. Genetics 165:1137-1148.
Kern, A. D., C. D. Jones, and D. J. Begun. 2004. Molecular population genetics of male accessory gland proteins in the Drosophila simulans complex. Genetics 167:725-735.
Kimura, M. 1983. The neutral theory of molecular evolution. Cambridge University Press, Cambridge.
Kliman, R. M., and J. Hey. 1993. Reduced natural selection associated with low recombination in Drosophila melanogaster. Mol. Biol. Evol. 10:1239-1258.
Kliman, R. M., P. Andolfatto, J. A. Coyne, F. Depaulis, M. Kreitman, A. J. Berry, J. McCarter, J. Wakeley, and J. Hey. 2000. The population genetics of the origin and divergence of the Drosophila simulans complex of species. Genetics 156:1913-1931.
Kornberg, R. D., and Y. Lorch. 1995. Interplay between chromatin structure and transcription. Curr. Opin. Cell. Biol. 7:371-375.
Kreitman, M. 1983. Nucleotide polymorphism at the alcohol dehydrogenase locus of Drosophila melanogaster. Nature 304:412-417.
Langley, C. H., J. MacDonald, N. Miyashita, and M. Aguadé. 1993. Lack of correlation between interspecific divergence and intraspecific polymorphism at the suppressor of forked region in Drosophila melanogaster and Drosophila simulans. Proc. Natl. Acad. Sci. USA 90:1800-1803.
Leicht, B. G., S. V. Muse, and A. G. Clark. 1995. Constraints on intron evolution in the gene encoding the myosin alkali light chain in Drosophila. Genetics 139:299-308.
Ljungman, M., and P. C. Hanawalt. 1992. Efficient protection against oxidative damage in chromatin. Mol. Carcinogen. 5:264-269.
Majewski, J. 2003. Dependence of mutational asymmetry on gene-expression levels in the human genome. Am. J. Hum. Genet. 73:688-692.
Maynard Smith, J., and J. Haigh. 1974. The hitch-hiking effect of a favourable gene. Genet. Res., Camb. 23:23-35.
McDonald, J. H., and M. Kreitman. 1991. Adaptive protein evolution at the Adh locus in Drosophila. Nature 351:652-654.
McVean, G., and J. Vieira. 2001. Inferring parameters of mutation, selection and demography from patterns of synonymous site evolution in Drosophila. Genetics 157:245-257.
Metropolis, N., and S. Ulam. 1949. The Monte Carlo method. J. Am. Stat. Assoc. 44:335-341.
Morgenstern, B. 1999. DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment. Bioinformatics 15:211-218.
Moriyama, E. N., and J. R. Powell. 1996. Intraspecific nuclear DNA variation in Drosophila. Mol. Biol. Evol. 13:261-277.
Parsch, J., C. D. Meiklejohn, and D. L. Hartl. 2001. Patterns of DNA sequence variation suggest the recent action of positive selection in the janus-ocnus region of Drosophila simulans. Genetics 159:647-657.
Parsch, J. 2003. Selective constraints on intron evolution in Drosophila. Genetics 165:1843-1851.
Sawyer, S. 1977. On the past history of an allele now known to have frequency P. J. Appl. Prob. 14:439-450.
Schmid, K. J., and D. Tautz. 1997. A screen for fast evolving genes from Drosophila. Proc. Natl. Acad. Sci. USA 94:9746-9750.
Schmidt, P. S., D. D. Duvernell, and W. F. Eanes. 2000. Adaptive evolution of a candidate gene for aging in Drosophila. Proc. Natl. Acad. Sci. USA 97:10861-10865.
Schlenke, T. A., and D. J. Begun. 2003. Natural selection drives Drosophila immune system evolution. Genetics 164:1471-1480.
Sekelsky, J. J., M. H. Brodsky, and K. C. Burtis. 2000. DNA repair in Drosophila: insights from the Drosophila genome sequence. J. Cell. Biol. 150:F31-36.
Sueoka, N. 1993. Directional mutation pressure, mutator mutations, and dynamics of molecular evolution. J. Mol. Evol. 37:137-153.
Tajima, F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585-595.
Takano, T. S. 1998. Rate variation of DNA sequence evolution in the Drosophila lineages. Genetics 149:959-970.
Takano-Shimizu, T. 2001. Local changes in the GC/AT substitution biases and in crossover frequencies on Drosophila chromosomes. Mol. Biol. Evol. 18:606-619.
Tavaré, S. 1986. Some probabilistic and statistical problems on the analysis of sequences. Lect. Math. Life Sci. 17:368-376.
Thackeray, J. R., and C. P. Kyriacou. 1990. Molecular evolution in the Drosophila yakuba period locus. J. Mol. Evol. 31:389-401.
Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. Clustal W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:383-402.
van der Helm, P. J., E. C. Klink, P. H. Lohman, and J. C. Eeken. 1997. The repair of UV-induced cyclobutane pyrimidine dimers in the individual genes Gart, Notch, and white from isolated brain tissue of Drosophila melanogaster. Mutat. Res. 383:113-124.
Verrelli, B. C., and W. F. Eanes. 2000. Extensive amino acid polymorphism at the pgm locus is consistent with adaptive protein evolution in Drosophila melanogaster. Genetics 156:1737-1752.
Webb, C. T., S. A. Shabalina, A. Y. Ogurtsov, and A. S. Kondrashov. 2002. Analysis of similarity within 142 pairs of orthologous intergenic regions of Caenorhabditis elegans and Caenorhabditis briggsae. Nucleic Acids Res. 30:1233-1239.
Yang, Z., S. Kumar, and M. Nei. 1995. A new method of inference of ancestral nucleotide and amino acid sequences. Genetics 141:1641-1650.
Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13:555-556.