Patterns of Polymorphism and Divergence from Noncoding Sequences of Drosophila melanogaster and D. simulans: Evidence for Nonequilibrium Processes

Andrew D. Kern and David J. Begun

Center for Population Biology, University of California, Davis

Correspondence: E-mail: adkern{at}ucdavis.edu.


    Abstract
 
Despite the fact that D. melanogaster and D. simulans have been the central model system for molecular population genetics, few data are available for noncoding regions. Here, we present an analysis of population genetic data from intergenic regions and comparisons of these data to previously collected data from introns and exons. Polymorphisms and fixations were categorized as A/T to G/C or G/C to A/T changes and were polarized by inferring the ancestral state using both parsimony and maximum likelihood. Noncoding fixations in both D. melanogaster and D. simulans were consistent with equilibrium base-composition evolution. However, polarized noncoding polymorphisms revealed a different pattern. Although A/T to G/C and G/C to A/T polymorphisms in D. simulans were consistent with equilibrium, we observed a highly significant dearth of A/T to G/C polymorphisms in D. melanogaster introns but not in intergenic sequences. Such data could be explained by recent evolution of mutational biases associated with transcription or by lineage-specific selection on base composition. These data reveal the complexity of evolutionary processes acting even on noncoding DNA in Drosophila.

Key Words: Drosophila • polymorphism • divergence • noncoding DNA • base composition • introns


    Introduction
 
Comparison of polymorphism and divergence in closely related species of Drosophila has been an effective strategy for testing population genetic hypotheses on the causes of variation (Hudson, Kreitman, and Aguadé 1987; Berry, Ajioka, and Kreitman 1991; Begun and Aquadro 1992; Langley et al. 1993; Akashi 1994). However, most such data come from protein-coding regions. Basic evolutionary questions about noncoding regions, including the relative importance of mutation and selection to the evolution of such regions in Drosophila, are poorly understood. Because noncoding DNA represents the great majority of most eukaryotic genomes, a population genetics–based understanding of such DNA would be a significant advance for evolutionary genomics. The investigation of noncoding regions also has potential practical implications. A current focus of genomics is to identify functional elements in the noncoding portions of genomes. Generally this goal relies on identifying conserved regions among distantly related genomes (e.g., Bergman and Kreitman 2001; Webb et al. 2002). The idea that such regions are functional follows from the neutral theory of molecular evolution (Kimura 1983), which posits that functional constraint and evolutionary rate are negatively correlated. However, such analyses disregard the potential contributions of mutational heterogeneity across genomic regions and lineages.

In this paper, we describe patterns of polymorphism and divergence in two types of noncoding DNA, nontranscribed intergenic regions and introns, from Drosophila melanogaster and Drosophila simulans. Additionally, we compare these data to synonymous site data from exons. Our goals are to investigate whether transcription is associated with different patterns of mutation or selection and to compare patterns of variation in nontranscribed DNA from two lineages. Two observations motivate these goals. First, Akashi (1996) used divergence data to infer that intron base composition is at equilibrium in D. melanogaster and D. simulans. Second, Takano-Shimizu (2001) suggested that noncoding sequences along the lineage between D. yakuba and D. melanogaster had undergone a shift in mutational bias towards A/T nucleotides. One interpretation of these results is that the mutation process differs between intron and intergenic sequences in Drosophila. A possible explanation for such differences is that introns are transcribed, whereas most intergenic regions, presumably, are not. Indeed, transcription appears to have a significant impact on patterns of base composition in mammals, most likely as a consequence of germline transcription-coupled repair (Green et al. 2003; Majewski 2003). Although there is no evidence of transcription-coupled repair in flies (de Cock et al. 1992; van der Helm et al. 1997; Sekelsky, Brodsky, and Burtis 2000), other properties or effects of transcription could affect mutation biases or rates.


    Materials and Methods
 
Fly Stocks/Sequencing
New D. simulans sequences reported here were collected from inbred lines derived from flies collected in the Wolfskill Orchard in Winters, California (Begun and Whitley 2000). New D. melanogaster intergenic sequences are from isofemale lines collected from Malawi and provided to us by W. O. Ballard. D. yakuba sequence is from one isofemale line, S180, from the Umea Stock Center. In all cases, PCR products were sequenced directly in both directions on an ABI 3700 sequencer. Raw data were analyzed using phred (Ewing et al. 1998) and phrap (Ewing and Green 1998). Sequences were then examined by eye using Consed (Gordon, Abajian, and Green 1998). Polymorphic sites in multiple alignments were verified using the MACE package of scripts (http://ludwig.ucdavis.edu/MACE/). The intergenic loci sampled were designed to be a random sample of nontranscribed, noncoding DNA and were taken from the GadFly version 3.1 annotation of the D. melanogaster genome. In total, we sequenced seven intergenic loci, which we have designated according to cytological location. Inter11C1FR1 corresponds to the scaffold AE003490 from positions 162425 to 162954. Inter11C1FR6 is from AE003490 at 165475 to 165587. Both of these regions are approximately 3 kb from the nearest annotated gene. Inter21F corresponds to positions 286303 to 287103 on the scaffold AE003587 (approximately 3 kb from the nearest annotated gene). Inter49F comes from AE003819 positions 187258 to 187939 (approximately 1 kb from the nearest annotated gene). Inter65A corresponds to AE003563 from positions 103969 to 104613 (approximately 1 kb from the nearest annotated gene). Inter84E is from AE003676 positions 101165 to 102088 (approximately 500 bp from the nearest annotated gene). Inter86C is from AE003688 positions 215272 to 215768 (approximately 200 bp from the nearest annotated gene).
Average local recombination rates for sampled intergenic loci were estimated by fitting a fourth-order polynomial to the comparison of genetic and physical maps (Kliman and Hey 1993; data not shown). Recombination rates for intergenic sequences (2.43 cM/Mb) were roughly equivalent to average recombination rates for the genes from which we have intronic/synonymous site data (2.11 cM/Mb). New sequences reported in this paper can be found in GenBank under accession numbers AY757957 to AY758069. Alignments are available from A.D.K. upon request.

Intron Sequences
To obtain a lineage-specific picture of polymorphism and divergence from intronic regions, loci for which polymorphism data existed from D. melanogaster and D. simulans and which also had at least one outgroup sequence from D. yakuba were assembled from GenBank. In total, 49 introns from 24 loci (~5.5 kb) throughout the genome satisfied these criteria (tables 1 and 2). References for these sequences can be found in the publications listed in table 1. Most sampled introns were small, which is consistent with the heavily tailed distribution of intron sizes found in the D. melanogaster genome (Adams et al. 2000; Parsch 2003).


Table 1 Summary Statistics of DNA Polymorphism for Introns from 24 Genes from D. melanogaster

 

Table 2 Summary Statistics of DNA Polymorphism for Introns from 24 Genes from D. simulans

 
Sequence Analysis
All alignments were done using ClustalW version 1.82 (Thompson et al. 1994), DIALIGN version 2.2 (Morgenstern 1999), or both, and were subsequently edited by hand. As introns are often difficult to align, we anchored each multiple alignment in flanking coding sequences. This provided much better sequence alignment for indel-rich introns. Intergenic sequences were easily aligned. Sites that showed indel polymorphism within or divergence between species were excluded from further analysis.

Many polymorphism and divergence statistics were estimated using software developed by A.D.K. Source code in Smalltalk is available upon request. To test the null hypothesis of neutral evolution, we performed multilocus Hudson-Kreitman-Aguadé tests (Hudson, Kreitman, and Aguadé 1987) on intergenic and intronic data separately. These tests were carried out using a version of Jody Hey's HKA program (Kliman et al. 2000) that we compiled to handle a larger number of loci. Critical values reported here for HKA tests were based on 10,000 neutral coalescent simulations rather than the standard chi-square distribution.
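
The two heterozygosity estimators reported throughout this paper can be illustrated with a short sketch. This is not the Smalltalk software mentioned above; it is a minimal Python illustration that assumes equal-length aligned sequences, skips any column containing a gap character, and uses a hypothetical function name (summary_stats) and a hypothetical three-allele alignment.

```python
from itertools import combinations


def summary_stats(seqs):
    """Return (S, theta_pi, theta_w) for aligned sequences, per site.

    Columns containing a gap ('-') are skipped, mirroring the exclusion
    of indel sites described in the text.
    """
    n = len(seqs)
    # Keep only alignment columns without gaps.
    columns = [col for col in zip(*seqs) if "-" not in col]
    length = len(columns)
    if n < 2 or length == 0:
        raise ValueError("need at least two sequences and one ungapped site")

    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for col in columns if len(set(col)) > 1)

    # theta_pi: mean number of pairwise differences, per site.
    n_pairs = n * (n - 1) / 2
    total_diffs = sum(
        sum(1 for a, b in combinations(col, 2) if a != b) for col in columns
    )
    theta_pi = total_diffs / n_pairs / length

    # theta_w: Watterson's estimator, S / a_n, per site.
    a_n = sum(1.0 / i for i in range(1, n))
    theta_w = seg_sites / a_n / length
    return seg_sites, theta_pi, theta_w


# Hypothetical three-allele alignment, for illustration only.
example = ["AAAA", "AAAT", "AATT"]
S, theta_pi, theta_w = summary_stats(example)
```

Both estimators are expressed per site so that loci of different lengths can be compared directly.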

Parsimony and maximum-likelihood analyses were performed on a site-by-site basis to infer ancestral states for the common ancestor of D. melanogaster and D. simulans. Inferred ancestral states were used to identify lineage-specific fixations and to infer ancestral states for polymorphisms. For likelihood reconstruction, we used PAML (Yang 1997) under a number of substitution models. For these reconstructions, we assumed that within-species genealogies are star shaped. This could be a problem if one were indirectly inferring the number of polymorphisms within species because the assumption of a star-shaped genealogy would inflate the number of mutations inferred on terminal branches. However, because we are interested only in the internal node of the genealogy, which represents the most recent common ancestor of D. melanogaster and D. simulans, the star-phylogeny assumption is not expected to have a large effect on our inferences. Indeed, any error caused by this assumption appears to have had little effect on our conclusions, as we obtained qualitatively similar results independent of the mode of ancestral reconstruction employed. Furthermore, only minor differences were observed for different likelihood models (data not shown), most plausibly as a function of the short divergence times. Results reported here are from maximum-likelihood reconstructions under the general time reversible (GTR) model (Tavaré 1986), where both the transition/transversion rate and shape parameter for rate heterogeneity were estimated from the data. For parsimony reconstructions, only sites for which an unambiguous ancestral state could be inferred were considered. Although parsimony may lead to inaccurate ancestral inference over large evolutionary distances, it should provide a reliable criterion over the short divergence times between taxa considered here (Yang, Kumar, and Nei 1995). All parsimony analyses were done using software written by A.D.K. 
Once ancestral states were inferred, polarized mutations were classified as A/T to G/C or G/C to A/T. To explicitly deal with effects of polymorphism on estimates of ML divergence, we calculated distances for all possible permutations of unrooted, three-taxon trees (D. melanogaster, D. simulans, and D. yakuba) using one sampled allele for each species and reported the mean divergence along each branch.
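
The parsimony step can be sketched for a single alignment column as follows. This is a simplified illustration, not the software written by A.D.K.: it assumes one outgroup base from D. yakuba, treats a site as unambiguous only when the outgroup matches the allele carried by the sister species, and uses hypothetical names (classify, polarize_site).

```python
AT = {"A", "T"}
GC = {"G", "C"}


def classify(ancestral, derived):
    """Label a polarized change as 'AT->GC', 'GC->AT', or None."""
    if ancestral in AT and derived in GC:
        return "AT->GC"
    if ancestral in GC and derived in AT:
        return "GC->AT"
    return None  # A<->T or G<->C: base composition is unchanged


def polarize_site(mel, sim, yak):
    """Polarize one column by three-taxon parsimony.

    mel and sim are the alleles seen in each ingroup sample; yak is the
    single outgroup base. Returns a list of (lineage, kind, change)
    tuples, or [] when no change is present or the ancestral state is
    ambiguous under this simple rule.
    """
    mel, sim = set(mel), set(sim)
    events = []
    if len(mel) == 1 and len(sim) == 1:
        m, s = next(iter(mel)), next(iter(sim))
        if m != s:
            if yak == s:    # outgroup agrees with sim: change on mel lineage
                events.append(("mel", "fixed", classify(s, m)))
            elif yak == m:  # outgroup agrees with mel: change on sim lineage
                events.append(("sim", "fixed", classify(m, s)))
        return events
    for name, poly, other in (("mel", mel, sim), ("sim", sim, mel)):
        if len(poly) == 2 and len(other) == 1:
            anc = next(iter(other))
            # Ancestral allele must be shared by sister species and outgroup.
            if anc in poly and yak == anc:
                derived = next(iter(poly - {anc}))
                events.append((name, "polymorphic", classify(anc, derived)))
    return events


# Hypothetical columns, for illustration only.
fixed_example = polarize_site({"A"}, {"G"}, "G")
poly_example = polarize_site({"A", "G"}, {"A"}, "A")
```

Sites that return an empty list are simply dropped, which corresponds to considering only sites with an unambiguous ancestral state.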

Most statistical hypothesis testing was implemented using the R language (http://www.r-project.org). Stationarity for base composition was assessed by comparing polymorphic versus fixed, A/T to G/C versus G/C to A/T mutations, following McDonald and Kreitman (1991). The nonparametric Wilcoxon rank sum test was used to test for significant differences in the frequency of segregating A/T to G/C and G/C to A/T polymorphisms.

Bayesian Analysis of Base-Composition Evolution
Under the assumption of selective neutrality, equilibrium base-composition bias can be estimated from present base composition (A/T%) and estimates of the per bp rate of mutations towards (v) and away from (u) A/T (Sueoka 1993; Eyre-Walker 1997). Let

\[ \omega = \frac{v}{u + v} \]

equal the mutation bias towards A/T. Eyre-Walker (1997) has shown that given a recent change in mutation bias, the expected equilibrium base-composition bias can be estimated by

\[ \hat{\omega} = \frac{zf}{zf + (1 - z)(1 - f)} \]

where f is the current frequency of A/T nucleotides, and z is the proportion of segregating mutations whose derived state is either A or T (that is, G/C to A/T mutations). One can use maximum likelihood to estimate ω under the assumption that the four types of observations in the analysis, sites that are fixed for an A/T, sites that are fixed for a G/C, sites that are polymorphic for an A/T to G/C mutation, and sites that are polymorphic for a G/C to A/T mutation, are multinomially distributed (Eyre-Walker 1997). Explicitly, if we denote the four types of observations to be x1, x2, x3, and x4, then the likelihood of observing the xi, given the true proportions of sites are yi, is given by

\[ L(x_1, x_2, x_3, x_4 \mid y_1, y_2, y_3, y_4) = \frac{(x_1 + x_2 + x_3 + x_4)!}{x_1! \, x_2! \, x_3! \, x_4!} \prod_{i=1}^{4} y_i^{x_i} \]

To estimate ω in a Bayesian setting, we implemented this likelihood function in a Markov chain Monte Carlo (MCMC) analysis using a noninformative Dirichlet prior. The search through likelihood space utilizes the Metropolis-Hastings algorithm (Metropolis and Ulam 1949; Hastings 1970). Multiple chains were run from different initial conditions, and convergence was monitored by looking at the potential scale reduction statistic of ω, as suggested by Gelman et al. (1995). Based on this criterion, chains converged very quickly (< 4,000 iterations), as might be expected from such a simple likelihood function. Markov chains were thinned to sampling every 1,000 iterations, and 10,000 samples were taken after an initial burn-in of 5,000 iterations.
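
A rough sketch of this kind of posterior calculation is given below. It is not the authors' Metropolis-Hastings sampler: because a Dirichlet prior is conjugate to a multinomial likelihood, the sketch samples the posterior directly instead. It further assumes that the equilibrium bias follows from f and z as zf / (zf + (1 - z)(1 - f)), that f can be approximated from the fixed sites alone, and that z is the G/C to A/T share of polymorphisms. The function names and the counts are hypothetical.

```python
import random


def equilibrium_bias(f, z):
    """Expected equilibrium A/T fraction, given the current A/T fraction f
    and the fraction z of segregating mutations that are G/C -> A/T."""
    return z * f / (z * f + (1.0 - z) * (1.0 - f))


def posterior_omega(counts, n_samples=10000, alpha=1.0, seed=1):
    """Draw sorted posterior samples of omega under a Dirichlet(alpha) prior.

    counts = (fixed A/T sites, fixed G/C sites,
              A/T->G/C polymorphisms, G/C->A/T polymorphisms).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        # A Dirichlet draw is a set of normalized independent gamma draws.
        g = [rng.gammavariate(x + alpha, 1.0) for x in counts]
        total = sum(g)
        y1, y2, y3, y4 = (v / total for v in g)
        f = y1 / (y1 + y2)  # A/T fraction among fixed sites (assumption)
        z = y4 / (y3 + y4)  # G/C -> A/T fraction among polymorphisms
        samples.append(equilibrium_bias(f, z))
    samples.sort()
    return samples


# Hypothetical counts, not the data analyzed in this paper.
samples = posterior_omega((647, 353, 35, 65))
median = samples[len(samples) // 2]
lower = samples[int(0.025 * len(samples))]
upper = samples[int(0.975 * len(samples))]
```

When z equals 0.5, equilibrium_bias returns f itself, which is the equilibrium expectation: equal numbers of new mutations in each direction imply that current base composition is already at its mutational equilibrium.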


    Results
 
Polymorphism
Tables 1 and 2 provide basic polymorphism and divergence statistics for intron sequences from D. melanogaster and D. simulans. Table 3 provides a polymorphism summary for intergenic loci from both D. melanogaster and D. simulans. Table 4 gives means across loci for summary statistics of polymorphism and divergence. In general, levels of intron polymorphism were correlated with levels of synonymous polymorphism from their corresponding genes in both D. melanogaster and D. simulans (D. melanogaster: R-squared = 0.5296, P < 0.0001; D. simulans: R-squared = 0.6033, P < 0.00001).


Table 3 Summary Statistics of DNA Polymorphism for Seven Intergenic Loci from D. melanogaster and D. simulans

 

Table 4 Mean Polymorphism and Divergence Summary

 
Mean intron heterozygosity was slightly lower than mean synonymous heterozygosity in both species (D. melanogaster: mean intron θπ = 0.0089, mean synonymous θπ = 0.0141; D. simulans: mean intron θπ = 0.0215, mean synonymous θπ = 0.0264), although the mean differences were not statistically significant. This is consistent with an earlier analysis from a smaller data set (Moriyama and Powell 1996). However, a sign test on intron versus synonymous heterozygosity across genes was significant in D. melanogaster (18/24; P = 0.008) but not in D. simulans (15/24; P = 0.0779). Thus, D. melanogaster intron heterozygosity was consistently lower than the synonymous heterozygosity in the associated gene.

Levels of intergenic DNA polymorphism in D. simulans (mean θπ = 0.0102, median θπ = 0.0091) were surprisingly low relative to intron and synonymous estimates. A Kruskal-Wallis rank sum test on heterozygosity for the three classes of silent polymorphisms in D. simulans (table 4) is nearly significant (θπ: chi-square = 5.7789, P = 0.05533; θω: chi-square = 5.7924, P = 0.05523). In D. melanogaster, intergenic heterozygosity is roughly similar to intronic and synonymous heterozygosity (mean intergenic θπ = 0.0126; median intergenic θπ = 0.0126). Kruskal-Wallis tests on the three categories of silent variation in D. melanogaster did not reject the null hypothesis of homogeneity (θπ: chi-square = 4.228, P = 0.1211; θω: chi-square = 5.5167, P = 0.0634).

D. melanogaster versus D. simulans heterozygosity
Comparison of heterozygosities in D. melanogaster and D. simulans reveals significant differences for intronic and synonymous (Wilcoxon rank sum tests, intronic θπ: P = 0.0006; synonymous θπ: P = 0.0203) but not for intergenic sites (Wilcoxon rank sum tests, intergenic θπ: P = 0.3176). However, comparison of African D. melanogaster samples (n = 11 loci: mth, G6pd, mei-218, pgm, tpi, Adh, anon1A3, anon1E9, anon1G5, v, and acp53 [table 1 in Supplementary Material online]) versus D. simulans provides a slightly different result. Intron variation is significantly lower in D. melanogaster (Wilcoxon rank sum test, intronic θπ: P = 0.00277), but intergenic and synonymous heterozygosity are roughly equal (African D. melanogaster: mean intergenic θπ = 0.0126, mean intronic θπ = 0.0097, median synonymous θπ = 0.0176; D. simulans: mean intergenic θπ = 0.0102, mean intronic θπ = 0.0205, mean synonymous θπ = 0.024). In other words, for African D. melanogaster versus D. simulans, the only obvious difference in heterozygosity is for introns.

Divergence
Unpolarized comparisons of divergence between D. melanogaster and D. simulans among silent sites reveal that the rank order of divergence is intergenic (mean Jukes-Cantor corrected divergence = 0.0578) < intronic (mean JC corrected = 0.0969) < synonymous (mean JC corrected = 0.1181). Thus, in pairwise comparisons, intergenic DNA is evolving roughly half as quickly as intronic or synonymous sites. A test for heterogeneity of divergences among classes of sites is highly significant (Kruskal-Wallis rank sum test: chi-square = 14.7973, P = 0.0006). Mean divergence for the two classes of noncoding sites (i.e., intergenic and intronic sequence) is also significantly different (Wilcoxon rank sum test, P = 0.0291).
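
For readers unfamiliar with the Jukes-Cantor correction used for these divergence values, the following sketch shows the standard formula, d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites. The function name and the input proportions are hypothetical and are not taken from the data above.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor distance from the observed proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to exist")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical observed proportions of differing sites.
for p in (0.01, 0.05, 0.10):
    print(f"p = {p:.2f}  ->  d = {jukes_cantor(p):.4f}")
```

At the low divergences typical of this species pair, the corrected distance is only slightly larger than the raw proportion, because few sites are expected to have been hit more than once.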

Divergence that has been polarized to the D. melanogaster or D. simulans lineages (see Materials and Methods) provides a similar picture of evolution. Rank order mean divergence in D. melanogaster for silent classes was intergenic (0.0447) < synonymous (0.0571) < intronic (0.0730). These three classes are not significantly heterogeneous (Kruskal-Wallis rank sum test, chi-square = 2.3284, P = 0.3122). This rank order of mean divergence is slightly different from that observed in D. simulans, which mirrors the unpolarized analysis: intergenic (0.0301) < intronic (0.0451) < synonymous (0.0506). The three site classes were also not heterogeneous for D. simulans (Kruskal-Wallis rank sum test, chi-square = 4.2384, P = 0.1203).

Our analysis supports the idea that D. melanogaster evolves faster than D. simulans across all silent sites (Wilcoxon rank sum test, silent sites P = 0.0202; sign test, 38/56 P = 0.0029). This is consistent with previous reports from protein-coding regions (Akashi 1996).

Tests of Neutrality
We examined the frequency spectrum of segregating sites and compared levels of polymorphism and divergence to test for departures from neutral equilibrium expectations at silent sites. Tests of the frequency spectrum were carried out using Tajima's D statistic (Tajima 1989). The results from these tests can be found in tables 1, 2, and 3. Generally, there were few instances of significant D values. We carried out multilocus HKA tests of synonymous, intronic, and intergenic sites separately. Figures 1, 2, and 3 display locus-by-locus contributions to the overall test statistic. Observed values that were less than/greater than expected values are plotted as negative/positive values, respectively. Both synonymous and intronic sequences (figs. 1 and 2) yielded a significant HKA result (intronic: P = 0.0170; synonymous: P = 0.0002), consistent with too little polymorphism, given levels of divergence. Deviations for D. melanogaster polymorphism, D. simulans polymorphism, and divergence between species were highly correlated among intronic and synonymous sites from a given locus (Spearman's rank correlation: D. melanogaster polymorphism, rho = 0.4617, P = 0.02412; D. simulans polymorphism, rho = 0.4497, P = 0.02851; interspecific divergence, rho = 0.5904, P = 0.002837). To the extent that the significant heterogeneity as revealed by HKA tests reflects the impact of linked, beneficial mutations (Maynard Smith and Haigh 1974; Kaplan et al. 1989), the correlations of deviations suggest that hitchhiking effects often span gene-sized or larger genomic regions. Intergenic sequences, however, did not reject neutrality (P < 0.6920). The number of intergenic regions surveyed (n = 7) is much smaller than the number of introns surveyed (n = 49).
Thus, the different statistical results for intron + synonymous versus intergenic could reflect reduced power to reject the null in the intergenic sample or could result if most beneficial mutations occur in or near genes, rather than in intergenic DNA. However, if we restrict our HKA tests of synonymous and intronic sites to African D. melanogaster samples, we observe a slightly different result. In this case, synonymous sites still reject neutrality, but intronic sites do not reject the null (African samples, synonymous: P = 0.0430; intronic: P < 0.389). Once again it is feasible that this difference might be the result of reduced power to reject in our smaller sample of African loci/alleles. Alternatively, selection associated with migration out of Africa might explain the differences seen between our worldwide samples and our restricted African subsamples (e.g., Kauer, Dieringer, and Schlotterer 2003).



FIG. 1.— Contributions from intronic loci to the multilocus HKA statistic. Values above (below) the x-axis indicate a positive (negative) deviation from the expected value.

 


FIG. 2.— Contributions from individual locus synonymous sites to the multilocus HKA statistic. Values above (below) the x-axis indicate a positive (negative) deviation from the expected value.

 


FIG. 3.— Contributions from intergenic loci to the multilocus HKA statistic. Values above (below) the x-axis indicate a positive (negative) deviation from the expected value.

 
Base-Composition Evolution
We used ancestral reconstructions to polarize G/C to A/T and A/T to G/C fixations and polymorphisms for the D. melanogaster and D. simulans lineages. As few mutations of any class were observed within individual loci, we combined data from across loci to make a more general statement of genomic evolution. Tables 5, 6, and 7 present these aggregate data in the form of a 2 x 2 contingency table for each species, for intronic, intergenic, and synonymous sites, respectively. Interestingly, if we restrict our attention to only African D. melanogaster samples, all of our base-composition results remain qualitatively unchanged.


Table 5 Intron Base-Composition Evolution

 

Table 6 Intergenic Base-Composition Evolution

 

Table 7 Synonymous Site Base-Composition Evolution

 
Substitutions
Sequences evolving at equilibrium for base composition are expected to fix equal numbers of G/C to A/T and A/T to G/C mutations. Intron and intergenic fixations are consistent with equilibrium for both species (D. melanogaster binomial probability: introns ML P = 0.523, parsimony P = 0.5145; intergenic ML P = 0.6254, parsimony P = 1.0 and D. simulans binomial probability: introns ML P = 1.0, parsimony P = 0.435; intergenic ML P = 0.5114, parsimony P = 0.7283). However, D. melanogaster and D. simulans synonymous sites are fixing significantly more A/T than G/C mutations (D. melanogaster binomial probability: synonymous sites ML P < 1 x 10^-17, parsimony P < 1 x 10^-22 and D. simulans binomial probability: synonymous sites ML P = 0.0441, parsimony P = 0.0403), a pattern which has been noted before (Akashi 1994; Begun 2001).
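
The equilibrium expectation of equal numbers of fixations in each direction corresponds to a binomial test with success probability 0.5. A minimal sketch of an exact two-sided version follows. The function name is hypothetical, and the sketch is an illustration of the test rather than the R code used for the values reported above.

```python
from math import comb


def binomial_two_sided(k, n):
    """Exact two-sided binomial P value for k successes in n trials, p = 0.5."""
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    tail = min(k, n - k)
    # By symmetry at p = 0.5, double the probability of the smaller tail.
    lower = sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, 2.0 * lower)


# Hypothetical counts: 5 A/T->G/C fixations out of 10 polarized fixations.
p_equal = binomial_two_sided(5, 10)
```

Here k would be the number of fixations in one direction (say, A/T to G/C) and n the total number of polarized fixations that change base composition. A perfectly even split gives P = 1.0.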

Contrasts of Polymorphism and Divergence
Comparisons of polymorphic and fixed G/C to A/T and A/T to G/C mutations reveal striking patterns of heterogeneity. Although the 2 x 2 table of polymorphic and fixed mutations is not significantly heterogeneous for D. simulans introns, D. simulans intergenic, or D. melanogaster intergenic DNA, the table for D. melanogaster introns is significantly heterogeneous, regardless of the method of ancestral reconstruction used (G-tests: ML P = 0.0013, parsimony P = 0.008). It appears that there is a large excess of G/C to A/T polymorphisms in D. melanogaster introns.
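
The G-test on such a 2 x 2 table (rows: fixed versus polymorphic; columns: A/T to G/C versus G/C to A/T) can be sketched as below. This is an illustrative implementation with a hypothetical function name and hypothetical counts. It applies no continuity or Williams correction, so its P values need not match those reported above.

```python
import math


def g_test_2x2(a, b, c, d):
    """G-test of independence for the 2 x 2 table [[a, b], [c, d]].

    Returns (G, P), with P taken from the chi-square distribution with 1 df.
    """
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    if n == 0 or 0 in rows or 0 in cols:
        raise ValueError("every row and column total must be positive")
    observed = ((a, b), (c, d))
    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = rows[i] * cols[j] / n
                g += 2.0 * o * math.log(o / expected)
    g = max(g, 0.0)  # guard against tiny negative rounding error
    # For 1 df, the chi-square upper tail equals erfc(sqrt(G / 2)).
    p = math.erfc(math.sqrt(g / 2.0))
    return g, p


# Hypothetical counts: fixed (30 AT->GC, 10 GC->AT), polymorphic (10, 30).
g_stat, p_value = g_test_2x2(30, 10, 10, 30)
```

A table whose rows have the same ratio of A/T to G/C and G/C to A/T changes gives G = 0, which is the homogeneity expected when polymorphism and divergence reflect the same stationary process.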

The 2 x 2 tables for synonymous site evolution (table 7) are highly significant in D. simulans and D. melanogaster. Previous analyses of D. simulans variation revealed a major excess of A/T polymorphisms and smaller excess of A/T fixations; the polymorphism and divergence was heterogeneous in that the deviations from equilibrium expectation were much greater for the polymorphism (Akashi 1996; Begun 2001). Our results are consistent with these earlier analyses. Surprisingly, however, the 2 x 2 tables for synonymous site evolution (table 7) show that D. melanogaster is also significantly heterogeneous for polymorphic and fixed G/C versus A/T variants. Previous analyses of fewer D. melanogaster data showed no heterogeneity of polymorphic versus fixed A/T versus G/C mutations at synonymous sites (Akashi 1994, 1995).

Frequency Distribution
Given the above results from D. melanogaster, we investigated the frequency distribution of A/T and G/C polymorphisms from intronic and intergenic DNA in D. melanogaster. Comparisons using different ancestral reconstructions yield inconsistent results in this analysis. If we consider each site as an independent observation and use parsimony reconstructions, G/C mutations are at higher frequency than A/T mutations (Wilcoxon rank sum test: P = 0.03131). ML reconstructions show a qualitatively similar pattern; however, a hypothesis test fails to reject the null (Wilcoxon rank sum test: P = 0.1701). Thus, there is weak evidence that derived G/C and A/T mutations are at different frequencies in D. melanogaster populations. If we compare the average frequency per gene among silent classes (a more conservative test), we do not detect a significant difference between classes. Intergenic regions from D. melanogaster also show no hint of a frequency difference among polymorphism classes, both when comparing gene-averaged frequencies, or when treating segregating sites as independent observations. It is worth noting, however, that failure to reject the null in this setting may be the result of reduced power in the small population samples we have examined.

Bayesian Analysis of Base-Composition Evolution
Under the premise that excess A/T intron polymorphism in D. melanogaster results from a recent change in mutation bias, we can estimate the new expected mutation bias at equilibrium ω (Eyre-Walker 1997). Figure 4 presents the posterior distribution of ω, given the numbers of A/T and G/C polymorphisms and fixations from D. melanogaster and D. simulans introns from our Markov chain simulation. Table 8 presents results from our MCMC contrasting D. melanogaster and D. simulans at both intronic and intergenic sites. Current A/T content in D. melanogaster introns is 0.647. The MAP estimate of ω from our MCMC simulation was 0.77, with a 95% credible interval from 0.651 to 0.846. Thus, the data are compatible with a major evolutionary change of base composition in D. melanogaster introns. This is in contrast to our estimates from D. simulans introns, where the current A/T frequency, 0.614, is well within the 95% credible interval for our estimates of ω (table 8). Additionally, MCMC simulations for our intergenic data are compatible with no change in equilibrium base-composition bias relative to that currently observed in either species.



FIG. 4.— The posterior distribution of ω, the expected equilibrium base-composition bias. The distribution is derived from four chains of 10,000 iterations each. Current A/T content in D. melanogaster introns is 0.647. The MAP estimate of ω from our simulations was 0.77, with a 95% credible interval from 0.651 to 0.846.

 

Table 8 Bayesian Analysis of Expected Equilibrium Base-Composition Bias

 

    Discussion
 
Though D. melanogaster and D. simulans have been important model systems in molecular population genetics for several years, most molecular population genetic data from these species have been from protein-coding regions. Our analysis of noncoding DNA revealed two properties of the substitution process.

First, intergenic DNA appears to evolve more slowly than intron DNA or synonymous sites in exons. This is consistent with recent reports by Halligan et al. (2004) on intergenic divergence in these species. However, much of their intergenic sequence was sampled from 5' and 3' flanking regions of genes. Such regions may be expected to harbor functionally important sites. Conversely, much of our sampled intergenic DNA comes from regions not near known or predicted genes. One interpretation of our data (and those of Halligan et al. [2004]) is that the neutral mutation rate in intergenic DNA is unexpectedly low as a result of functional constraint. The relatively low heterozygosity in these regions for D. simulans is consistent with this interpretation. However, the data are also completely consistent with a reduced mutation rate in nontranscribed sequences in Drosophila.

Second, D. melanogaster appears to be evolving significantly faster than D. simulans at all three classes of silent sites: synonymous, intergenic, and intronic. Previous results (Akashi 1996) suggested that rates of protein evolution are higher along the D. melanogaster than the D. simulans lineage. Indeed, analysis of the larger set of genes used here confirms the trend of faster protein evolution along the D. melanogaster lineage, although the difference is not significant (D. melanogaster: mean dN = 0.0131; D. simulans: mean dN = 0.0124). Thus, the data from D. melanogaster and D. simulans suggest that D. melanogaster evolves faster than D. simulans at all classes of sites examined. Akashi (1996) interpreted the higher rate of evolution in D. melanogaster replacement and synonymous sites (versus D. simulans) as resulting from reduced efficacy of natural selection against slightly deleterious mutations, perhaps as a consequence of reduced effective population size. If this explanation is correct, then our data suggest that all classes of sites are fixing slightly deleterious mutations in the D. melanogaster lineage. A corollary of this hypothesis is that the distribution of selection coefficients of new mutations is similar (i.e., nearly neutral) across these functionally disparate site types. Perhaps a more parsimonious interpretation is that the mutation rate has increased in the D. melanogaster lineage (relative to D. simulans). This would seem to be at odds with the observation that D. melanogaster is less polymorphic than D. simulans in both African and non-African populations (e.g., Begun and Whitley 2000; Andolfatto 2001). 
However, the neutral model posits that levels of variation depend on population size and neutral mutation rate, whereas divergence depends only on neutral mutation rate (Kimura 1983). Perhaps one attractive feature of the mutation rate–increase hypothesis is that it is amenable to direct empirical testing (e.g., Eeken, de Jong, and Green 1987).
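The distinction drawn here between variation and divergence can be illustrated with the standard neutral expectations, theta = 4Neμ for per-site diversity and K = μt for per-site divergence along one lineage. The following is only a sketch: the function names and all parameter values are hypothetical and are not estimates from the data analyzed in this paper.

```python
def neutral_diversity(n_e: float, mu: float) -> float:
    """Expected per-site diversity under neutrality: theta = 4 * Ne * mu."""
    return 4.0 * n_e * mu


def neutral_divergence(mu: float, generations: float) -> float:
    """Expected per-site divergence along one lineage: K = mu * t."""
    return mu * generations


# Hypothetical parameter values, chosen only to illustrate the argument.
lineages = {
    "higher mu, smaller Ne": {"n_e": 1.0e6, "mu": 2.0e-9},
    "lower mu, larger Ne": {"n_e": 2.0e6, "mu": 1.5e-9},
}
generations = 1.0e7  # hypothetical number of generations since the split

for label, p in lineages.items():
    theta = neutral_diversity(p["n_e"], p["mu"])
    k = neutral_divergence(p["mu"], generations)
    print(f"{label}: theta = {theta:.4f}, K = {k:.4f}")
```

Under these assumed values, the lineage with the higher mutation rate accumulates more divergence yet carries less polymorphism, because its smaller effective population size more than offsets the higher mutation rate.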

Previous studies of D. melanogaster and D. simulans variation suggested a fundamental difference between species in the dynamics of synonymous variation. Contrasts of polymorphic versus fixed, A/T versus G/C mutations in D. melanogaster were homogeneous, yet A/T mutations greatly outnumbered G/C mutations for both polymorphic and fixed sites (Akashi 1995). The contrast of polymorphic versus fixed, A/T versus G/C mutations in D. simulans was highly significantly heterogeneous, consistent with an excess of A/T polymorphisms (Akashi 1995; Begun 2001). These results were interpreted as supporting a model of neutral evolution of higher A/T content in D. melanogaster (Akashi 1995, 1996). D. simulans data suggested that this lineage is also evolving higher A/T content (Begun 2001; McVean and Vieira 2001), although at a much slower rate than D. melanogaster, and that the polymorphism may be inconsistent with neutrality (Akashi 1996).

Our analysis of this larger D. melanogaster set suggests a different view of the polymorphism and divergence at synonymous sites, namely, that the contrasts of polymorphic and fixed A/T versus G/C mutations are significantly heterogeneous in both species (table 7). Such allele configurations (table 7) were previously interpreted as evidence for reduced selection against slightly deleterious A/T fixations after a population bottleneck and as evidence for nearly neutral dynamics for A/T polymorphisms (Akashi 1995, 1996). An alternative hypothesis for such synonymous site data is a recent change in mutation bias (Eyre-Walker 1997; Akashi 1997). Of course, rejection of homogeneity in the 2 × 2 contingency table is formally consistent with other interpretations, which complicates inferences regarding base-composition evolution at synonymous sites. For example, if the polymorphism data provide an accurate picture of historical mutational input, the data could be interpreted as a lineage-specific fixation bias of G/C mutations in D. melanogaster, perhaps as a result of natural selection or biased gene conversion (Holmquist 1992; Eyre-Walker 1993, 1999). In any case, our results suggest that patterns of base-composition variation at synonymous sites are qualitatively similar in the two species.
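The homogeneity contrast discussed above compares polymorphic versus fixed counts across the two mutation classes in a 2 × 2 table. A minimal sketch of one way to test such a table, a two-sided Fisher's exact test, is shown below. The choice of test, the function name, and the counts are assumptions made for illustration; the counts are hypothetical and are not those of table 7.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is no more probable than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical counts; rows are polymorphic and fixed,
# columns are A/T to G/C and G/C to A/T changes.
polymorphic = (5, 20)
fixed = (20, 5)
p_value = fisher_exact_two_sided(*polymorphic, *fixed)
print(f"P = {p_value:.2e}")
```

A small P value rejects homogeneity of the two rows, but, as the text notes, the rejection by itself does not distinguish among mutation bias, selection, and biased gene conversion.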

If the mutation bias hypothesis were the correct explanation for synonymous site variation, we might expect to observe similar patterns of base-composition evolution for intronic and intergenic sites. In fact, we did observe a similar pattern, but only for intronic and not intergenic sites, and only for D. melanogaster and not D. simulans (tables 5 and 6). Thus, the data from these two species are recalcitrant to a simple explanation. An analysis aimed at quantifying the magnitude of such a bias from D. melanogaster introns is consistent with a large change in base composition currently under way along that lineage (fig. 4).

One interpretation of the lineage- and site-specific heterogeneity for contrasts of polymorphic and fixed, A/T versus G/C mutations is a recent change in mutation bias at transcribed sites (i.e., introns and synonymous sites) in D. melanogaster but not in D. simulans. In principle, this could explain both the much greater excess of A/T fixations in D. melanogaster versus D. simulans synonymous sites and the excess of A/T polymorphisms in D. melanogaster but not in D. simulans introns. Overall, then, data from the two species may be consistent with mildly deleterious evolution of synonymous sites in both species, leading to an accumulation of unpreferred (i.e., A/T ending) codons and a recent change in mutation bias associated with transcribed DNA in D. melanogaster. This explanation is somewhat unsatisfying because it is essentially a restatement of the empirical observations. However, it does have the virtue of predicting that careful measurement of DNA metabolism in D. melanogaster and D. simulans will reveal interspecific differences. For example, the hypothesis of different patterns of mutation bias in D. melanogaster versus D. simulans could be investigated experimentally by direct assays of genomic DNA damage or genetic assays to detect differences in DNA repair efficiency between D. melanogaster and D. simulans. Genomic-scale polymorphism and divergence data from DNA sequences varying in levels of germline transcription would also be valuable for further exploration of these ideas.

Mammalian data suggest that germline transcription-coupled repair may have important consequences for base-composition evolution (Green et al. 2003; Majewski 2003). However, there is no evidence for transcription-coupled repair in flies (de Cock et al. 1992; van der Helm et al. 1997; Sekelsky, Brodsky, and Burtis 2000). Nevertheless, transcribed DNA is packaged into a more "open" chromatin conformation than nontranscribed DNA (reviewed in Kornberg and Lorch [1995]), which is thought to make transcribed DNA more accessible to endogenous DNA-damaging agents (Bartlett, Scicchitano, and Robison 1991; Ljungman and Hanawalt 1992). In principle, differential chromatin states may affect the distribution of new mutations in transcribed versus nontranscribed sequences, a phenomenon known as transcription-associated mutation (TAM).

A problem with the hypothesis of a recent change in mutation bias is that it predicts that G/C to A/T mutations are, on average, younger than A/T to G/C mutations and, thus, should be segregating at lower frequencies in D. melanogaster (Sawyer 1977). We observed no such heterogeneity of frequencies, although the relatively small sample sizes may provide little power to detect such a phenomenon.

A recent survey of synonymous sites at the rosy locus from 22 Drosophila species showed that many lineages appear to be evolving higher A/T content (Begun and Whitley 2003). In this respect, D. melanogaster and D. simulans appear to be like other flies: they are becoming more A/T rich at synonymous sites. One explanation for these data is that the ancestor of Drosophila was highly G/C rich but evolved a strong(er) A/T mutation bias before splitting of the extant Drosophila lineages. In this scenario, the accumulation of A/T mutations in many lineages reflects a slow approach to a new, higher A/T–content equilibrium. If intergenic and intron sequences are under weaker selection for base composition, perhaps they would have approached this equilibrium more quickly compared with synonymous sites, thereby explaining why the substitution data from intergenic and intron DNA, but not synonymous sites, support base-composition equilibrium.
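The notion of a slow approach to a new base-composition equilibrium can be made concrete with a simple two-state model, which is an illustrative assumption and not a model fitted in this study. Let u be the rate of G/C to A/T change and v the rate of A/T to G/C change. The A/T fraction f then obeys df/dt = u(1 − f) − vf, with equilibrium u/(u + v) and an approach governed by the total rate u + v. The rates and starting composition below are hypothetical, and selection is not modeled explicitly; a site class with a higher effective rate of change simply stands in for one under weaker constraint.

```python
from math import exp


def equilibrium_at(u: float, v: float) -> float:
    """Equilibrium A/T fraction for G/C->A/T rate u and A/T->G/C rate v."""
    return u / (u + v)


def at_content(t: float, f0: float, u: float, v: float) -> float:
    """A/T fraction at time t, starting from f0, under the two-state model."""
    f_eq = equilibrium_at(u, v)
    # The deviation from equilibrium decays exponentially at rate u + v.
    return f_eq + (f0 - f_eq) * exp(-(u + v) * t)


# Hypothetical values: a G/C-rich ancestor (f0 = 0.4) and a 2:1 bias toward A/T.
f0 = 0.4
slow = {"u": 2.0, "v": 1.0}  # site class with a lower effective rate of change
fast = {"u": 4.0, "v": 2.0}  # same bias, higher effective rate of change

for t in (0.0, 0.25, 0.5, 1.0):
    print(t, round(at_content(t, f0, **slow), 3), round(at_content(t, f0, **fast), 3))
```

Both site classes share the same equilibrium in this sketch, but the class with the higher total rate reaches it sooner, which is the qualitative pattern the scenario in the text requires.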

Finally, one of the major questions of population genetics is how much of the genome is either directly or indirectly influenced by directional selection. Our comparisons of polymorphism and divergence from worldwide samples using the HKA test (Hudson, Kreitman, and Aguadé 1987) revealed that whereas intergenic data were consistent with neutrality, synonymous site and intronic data were not (although note the differences in our restricted African sample). If the significant HKA results are a consequence of hitchhiking effects (Maynard Smith and Haigh 1974; Kaplan et al. 1989), then we might infer that (1) directional selection tends to occur in genes and (2) the physical extent of hitchhiking effects is often "gene-sized" or larger, on the order of a few kilobases.


    Acknowledgements
 
We thank John Gillespie, Chuck Langley, Matt Hahn, Corbin Jones, Josh Chang Mell, and the UC Davis Evolutionary Discussion Group (EDG) for valuable comments and discussion. Michael Nachman and two anonymous reviewers improved this manuscript substantially. A.D.K. is a Howard Hughes Medical Institute predoctoral fellow. This work was supported by NSF DEB-0327049. M. Kerber provided technical assistance.


    Footnotes
 
Michael Nachman, Associate Editor


    References
 

    Adams, M. D., S. E. Celniker, R. A. Holt, C. A. Evans, et al. 2000. The genome sequence of Drosophila melanogaster. Science 287:2185–2195.

    Akashi, H. 1994. Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics 136:927–935.

    ———. 1995. Inferring weak selection from patterns of polymorphism and divergence at silent sites in Drosophila. Genetics 139:1067–1076.

    ———. 1996. Molecular evolution between Drosophila melanogaster and D. simulans: reduced codon bias, faster rates of amino acid substitution and larger proteins in D. melanogaster. Genetics 144:1297–1307.

    ———. 1997. Distinguishing the effects of mutational biases and natural selection on DNA sequence variation. Genetics 147:1989–1991.

    Andolfatto, P. 2001. Contrasting patterns of X-linked and autosomal nucleotide variation in Drosophila melanogaster and Drosophila simulans. Mol. Biol. Evol. 18:279–290.

    Bartlett, J. D., D. A. Scicchitano, and S. H. Robison. 1991. Two expressed human genes sustain slightly more DNA damage after alkylating agent treatment than an inactive gene. Mutat. Res. 255:247–256.

    Begun, D. J. 2001. The frequency distribution of nucleotide variation in Drosophila simulans. Mol. Biol. Evol. 18:1343–1352.

    Begun, D. J., and C. F. Aquadro. 1992. Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster. Nature 356:519–520.

    ———. 1995. Molecular variation at the vermilion locus in geographically diverse populations of Drosophila melanogaster and Drosophila simulans. Genetics 140:1019–1032.

    Begun, D. J., A. J. Betancourt, C. H. Langley, and W. Stephan. 1999. Is the fast/slow allozyme variation at the Adh locus of Drosophila melanogaster an ancient balanced polymorphism? Mol. Biol. Evol. 16:1816–1819.

    Begun, D. J., and P. Whitley. 2000. Reduced X-linked nucleotide polymorphism in Drosophila simulans. Proc. Natl. Acad. Sci. USA 97:5960–5965.

    ———. 2003. Molecular population genetics of Xdh and the evolution of base composition in Drosophila. Genetics 162:1725–1735.

    Begun, D. J., P. Whitley, B. L. Todd, H. M. Waldrip-Dail, and A. G. Clark. 2000. Molecular population genetics of male accessory gland proteins in Drosophila. Genetics 156:1879–1888.

    Bergman, C. M., and M. Kreitman. 2001. Analysis of conserved noncoding DNA in Drosophila reveals similar constraints in intergenic and intronic sequences. Genome Res. 11:1335–1345.

    Berry, A. J., J. W. Ajioka, and M. Kreitman. 1991. Lack of polymorphism on the Drosophila fourth chromosome resulting from selection. Genetics 129:1111–1117.

    Clark, A. G., B. G. Leicht, and S. V. Muse. 1996. Length variation and secondary structure of introns in the Mlc1 gene in six species of Drosophila. Mol. Biol. Evol. 13:481–482.

    Cooke, P. H., and J. G. Oakeshott. 1989. Amino acid polymorphisms for esterase-6 in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 86:1426–1430.

    de Cock, J. G., E. C. Klink, W. Ferro, P. H. Lohman, and J. C. Eeken. 1992. Neither enhanced removal of cyclobutane pyrimidine dimers nor strand-specific repair is found after transcription induction of the beta 3-tubulin gene in a Drosophila embryonic cell line Kc. Mutat. Res. 293:11–20.

    Eanes, W. F., M. Kirchner, and J. Yoon. 1993. Evidence for adaptive evolution of the G6pd gene in the Drosophila melanogaster and Drosophila simulans lineages. Proc. Natl. Acad. Sci. USA 90:7475–7479.

    Eanes, W. F., M. Kirchner, J. Yoon, C. H. Biermann, I. N. Wang, M. A. McCartney, and B. C. Verrelli. 1996. Historical selection, amino acid polymorphism and lineage-specific divergence at the G6pd locus in Drosophila melanogaster and D. simulans. Genetics 144:1027–1041.

    Eeken, J. C., A. W. de Jong, and M. M. Green. 1987. The spontaneous mutation rate in Drosophila simulans. Mutat. Res. 192:259–262.

    Ewing, B., and P. Green. 1998. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8:186–194.

    Ewing, B., L. Hillier, M. C. Wendl, and P. Green. 1998. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8:175–185.

    Eyre-Walker, A. 1993. Recombination and mammalian genome evolution. Proc. R. Soc. Lond. B Biol. Sci. 252:237–243.

    ———. 1997. Differentiating between selection and mutation bias. Genetics 147:1983–1987.

    ———. 1999. Evidence of selection on silent base composition in mammals: potential implications for the evolution of isochores and junk DNA. Genetics 152:675–683.

    Gelman, A., J. B. Carlin, H. S. Stern, and D. B. Rubin. 1995. Bayesian data analysis. Chapman & Hall, New York.

    Gordon, D., C. Abajian, and P. Green. 1998. Consed: a graphical tool for sequence finishing. Genome Res. 8:195–202.

    Green, P., B. Ewing, W. Miller, P. J. Thomas, E. D. Green, and NISC Comparative Sequencing Program. 2003. Transcription-associated mutational asymmetry in mammalian evolution. Nat. Genet. 33:514–517.

    Halligan, D. L., A. Eyre-Walker, P. Andolfatto, and P. D. Keightley. 2004. Patterns of evolutionary constraints in intronic and intergenic DNA of Drosophila. Genome Res. 14:273–279.

    Hasson, E., and W. F. Eanes. 1996. Contrasting histories of three gene regions associated with In(3L)Payne of Drosophila melanogaster. Genetics 144:1565–1575.

    Hasson, E., I. N. Wang, L. W. Zeng, M. Kreitman, and W. F. Eanes. 1998. Nucleotide variation in the triosephosphate isomerase (Tpi) locus of Drosophila. Mol. Biol. Evol. 15:756–769.

    Hastings, W. K. 1970. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57:97–109.

    Hilton, H., R. M. Kliman, and J. Hey. 1994. Using hitchhiking genes to study adaptation and divergence during speciation within the Drosophila melanogaster complex. Evolution 48:1900–1913.

    Holmquist, G. P. 1992. Chromosome bands, their chromatin flavors and their functional features. Am. J. Hum. Genet. 51:17–37.

    Hudson, R. R., M. Kreitman, and M. Aguadé. 1987. A test of neutral molecular evolution based on nucleotide data. Genetics 116:153–159.

    Kaplan, N. L., R. R. Hudson, and C. H. Langley. 1989. The "hitchhiking effect" revisited. Genetics 123:887–899.

    Karotam, J., A. C. Delves, and J. G. Oakeshott. 1993. Conservation and change in structural and 5' flanking sequences of esterase 6 in sibling Drosophila species. Genetica 90:11–28.

    Kauer, M. O., D. Dieringer, and C. Schlotterer. 2003. A microsatellite variability screen for positive selection associated with the "out of Africa" habitat expansion of Drosophila melanogaster. Genetics 165:1137–1148.

    Kern, A. D., C. D. Jones, and D. J. Begun. 2004. Molecular population genetics of male accessory gland proteins in the Drosophila simulans complex. Genetics 167:725–735.

    Kimura, M. 1983. The neutral theory of molecular evolution. Cambridge University Press, Cambridge.

    Kliman, R. M., and J. Hey. 1993. Reduced natural selection associated with low recombination in Drosophila melanogaster. Mol. Biol. Evol. 10:1239–1258.

    Kliman, R. M., P. Andolfatto, J. A. Coyne, F. Depaulis, M. Kreitman, A. J. Berry, J. McCarter, J. Wakeley, and J. Hey. 2000. The population genetics of the origin and divergence of the Drosophila simulans complex of species. Genetics 156:1913–1931.

    Kornberg, R. D., and Y. Lorch. 1995. Interplay between chromatin structure and transcription. Curr. Opin. Cell. Biol. 7:371–375.

    Kreitman, M. 1983. Nucleotide polymorphism at the alcohol dehydrogenase locus of Drosophila melanogaster. Nature 304:412–417.

    Langley, C. H., J. MacDonald, N. Miyashita, and M. Aguadé. 1993. Lack of correlation between interspecific divergence and intraspecific polymorphism at the suppressor of forked region in Drosophila melanogaster and Drosophila simulans. Proc. Natl. Acad. Sci. USA 90:1800–1803.

    Leicht, B. G., S. V. Muse, and A. G. Clark. 1995. Constraints on intron evolution in the gene encoding the myosin alkali light chain in Drosophila. Genetics 139:299–308.

    Ljungman, M., and P. C. Hanawalt. 1992. Efficient protection against oxidative damage in chromatin. Mol. Carcinogen. 5:264–269.

    Majewski, J. 2003. Dependence of mutational asymmetry on gene-expression levels in the human genome. Am. J. Hum. Genet. 73:688–692.

    Maynard Smith, J., and J. Haigh. 1974. The hitch-hiking effect of a favourable gene. Genet. Res., Camb. 23:23–35.

    McDonald, J. H., and M. Kreitman. 1991. Adaptive protein evolution at the Adh locus in Drosophila. Nature 351:652–654.

    McVean, G., and J. Vieira. 2001. Inferring parameters of mutation, selection and demography from patterns of synonymous site evolution in Drosophila. Genetics 157:245–257.

    Metropolis, N., and S. Ulam. 1949. The Monte Carlo method. J. Am. Stat. Assoc. 44:335–341.

    Morgenstern, B. 1999. DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment. Bioinformatics 15:211–218.

    Moriyama, E. N., and J. R. Powell. 1996. Intraspecific nuclear DNA variation in Drosophila. Mol. Biol. Evol. 13:261–277.

    Parsch, J., C. D. Meiklejohn, and D. L. Hartl. 2001. Patterns of DNA sequence variation suggest the recent action of positive selection in the janus-ocnus region of Drosophila simulans. Genetics 159:647–657.

    Parsch, J. 2003. Selective constraints on intron evolution in Drosophila. Genetics 165:1843–1851.

    Sawyer, S. 1977. On the past history of an allele now known to have frequency P. J. Appl. Prob. 14:439–450.

    Schmid, K. J., and D. Tautz. 1997. A screen for fast evolving genes from Drosophila. Proc. Natl. Acad. Sci. USA 94:9746–9750.

    Schmidt, P. S., D. D. Duvernell, and W. F. Eanes. 2000. Adaptive evolution of a candidate gene for aging in Drosophila. Proc. Natl. Acad. Sci. USA 97:10861–10865.

    Schlenke, T. A., and D. J. Begun. 2003. Natural selection drives Drosophila immune system evolution. Genetics 164:1471–1480.

    Sekelsky, J. J., M. H. Brodsky, and K. C. Burtis. 2000. DNA repair in Drosophila: insights from the Drosophila genome sequence. J. Cell. Biol. 150:F31–36.

    Sueoka, N. 1993. Directional mutation pressure, mutator mutations, and dynamics of molecular evolution. J. Mol. Evol. 37:137–153.

    Tajima, F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.

    Takano, T. S. 1998. Rate variation of DNA sequence evolution in the Drosophila lineages. Genetics 149:959–970.

    ———. 2001. Local changes in the GC/AT substitution biases and in crossover frequencies on Drosophila chromosomes. Mol. Biol. Evol. 18:606–619.

    Tavaré, S. 1986. Some probabilistic and statistical problems on the analysis of sequences. Lect. Math. Life Sci. 17:368–376.

    Thackeray, J. R., and C. P. Kyriacou. 1990. Molecular evolution in the Drosophila yakuba period locus. J. Mol. Evol. 31:389–401.

    Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. Clustal W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:383–402.

    van der Helm, P. J., E. C. Klink, P. H. Lohman, and J. C. Eeken. 1997. The repair of UV-induced cyclobutane pyrimidine dimers in the individual genes Gart, Notch, and white from isolated brain tissue of Drosophila melanogaster. Mutat. Res. 383:113–124.

    Verrelli, B. C., and W. F. Eanes. 2000. Extensive amino acid polymorphism at the pgm locus is consistent with adaptive protein evolution in Drosophila melanogaster. Genetics 156:1737–1752.

    Webb, C. T., S. A. Shabalina, A. Y. Ogurtsov, and A. S. Kondrashov. 2002. Analysis of similarity within 142 pairs of orthologous intergenic regions of Caenorhabditis elegans and Caenorhabditis briggsae. Nucleic Acids Res. 30:1233–1239.

    Yang, Z., S. Kumar, and M. Nei. 1995. A new method of inference of ancestral nucleotide and amino acid sequences. Genetics 141:1641–1650.

    Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13:555–556.

Accepted for publication August 30, 2004.