Nucleotide Polymorphism in the RpII215 Gene Region of the Insular Species Drosophila guanche: Reduced Efficacy of Weak Selection on Synonymous Variation

José A. Pérez*,†, Agustí Munté*, Julio Rozas*, Carmen Segarra* and Montserrat Aguadé*

* Departament de Genètica, Facultat de Biologia, Universitat de Barcelona, Barcelona, Spain
† Departamento de Parasitología, Ecología y Genética, Facultad de Farmacia/Biología, Universidad de La Laguna, La Laguna, Tenerife, Spain

Correspondence: E-mail: aguade@porthos.bio.ub.es.


    Abstract
 
An approximately 6.9-kb region encompassing the RpII215 gene was sequenced for 24 individuals of the island endemic species Drosophila guanche. The comparative analysis of synonymous polymorphism and divergence in D. guanche and D. subobscura, two species with pronounced differences in population size, makes it possible to test the nearly neutral character of synonymous mutations. In D. guanche, unlike in D. subobscura, (1) the ratio of preferred to unpreferred synonymous changes was similar for polymorphic and fixed changes, (2) the numbers of preferred and unpreferred changes, both polymorphic and fixed, could be explained by the mutational process, and (3) the estimated scaled selection coefficient for unpreferred mutations did not differ significantly from zero. Additionally, the comparative analysis revealed that both the ratio of preferred to unpreferred synonymous changes and the frequency spectrum of unpreferred polymorphic mutations differed significantly between species. All these results indicate that a large fraction of synonymous mutations in the RpII215 gene behave as effectively neutral in D. guanche, whereas they are weakly selected in D. subobscura. The reduced efficacy of selection in the insular species constitutes strong evidence of the nearly neutral character of synonymous mutations and, therefore, of the role of weak selection in maintaining codon bias.

Key Words: weak selection • codon bias • effective population size • Drosophila guanche • Drosophila subobscura


    Introduction
 
Many genes exhibit a nonrandom use of synonymous codons with some preferred (major or optimal) codons and other unpreferred (minor or nonoptimal) codons. This preferential use of a particular set of codons is known as codon bias. The initially detected association between the degree of codon bias and the level of gene expression pointed to the action of selection at the translational level in maintaining codon bias (Shields et al. 1988; Akashi 1994). However, it is still controversial (Akashi 1997; Eyre-Walker 1997) whether the differential use of codons just reflects the mutational process (i.e., a mutation-drift equilibrium) or is actively maintained by weak selection (i.e., a mutation-selection-drift equilibrium [Li 1987; Bulmer 1991]).

In Drosophila, the mutational process is biased toward A/T, with an expected A+T content at equilibrium of about 65% (Petrov and Hartl 1999). The mutational input could thus explain the A+T content of introns (Moriyama and Hartl 1993; Hey and Kliman 2002) but not the codon bias detected in this genus. Indeed, preferred codons (i.e., those found at high frequencies in highly biased genes) end in C or G in Drosophila melanogaster and in other Drosophila species (Akashi 1995; Kreitman and Antezana 2000). In addition, a significant positive correlation between the G+C content at third positions of fourfold degenerate codons and that of the adjacent introns was only detected for low bias genes (Kliman and Hey 1994). These observations argue against the mutational process being mainly responsible for codon bias, at least in Drosophila.

Several observations on synonymous variation in Drosophila are consistent with predictions of the mutation-selection-drift equilibrium model and would thus favor the role of weak selection in maintaining codon bias: (1) significant associations between codon bias and both the recombination rate and the length of the coding region were detected by the analysis of a large set of genes in D. melanogaster (Kliman and Hey 1993; Comeron, Kreitman, and Aguadé 1999; Duret and Mouchiroud 1999; Hey and Kliman 2002); (2) a parallel increase in codon bias and the rate of recombination was noticed in interspecies comparisons of particular genes that have changed their recombinational environment (Munté, Aguadé, and Segarra 1997, 2001; Takano-Shimizu 1999); (3) the analysis of synonymous polymorphism and divergence in small sets of genes revealed differences between preferred and unpreferred changes either in the ratio of polymorphism to divergence or in the frequency spectrum of derived variants (Akashi 1995, 1999; Akashi and Schaeffer 1997; Kliman 1999; Llopart and Aguadé 2000; Begun 2002); and (4) analysis of synonymous divergence between two species differing in their effective population size (Ne) showed an excess of fixed unpreferred mutations in the lineage with lower Ne (Llopart and Aguadé 1999).

According to the nearly neutral model of molecular evolution (Ohta and Kimura 1971; Ohta 1972), the fate of weakly selected mutations is expected to differ in species with strong differences in effective population size. A direct approach to testing the nearly neutral character of synonymous mutations would consist of comparing the distribution of preferred and unpreferred polymorphic variants between a pair of related species with similar generation times and strong (and sequence variation independent) support for important differences in their effective population size. Given the difficulties associated with assessing effective population size, pairs of species consisting of one continental species with a rather large distribution area and another species endemic to a volcanic island would be among the best candidates for such an approach. D. subobscura and D. guanche would constitute such a pair. Indeed, D. subobscura is widespread and abundant in the Palearctic region, and D. guanche is restricted to some isolated gorges on the island of Tenerife (Canary Archipelago, Spain). The endemic character of D. guanche, its association with the relict tertiary flora of the island, and the lack of important climatic fluctuation in the archipelago over evolutionary time would support the long-term low effective population size of D. guanche. Furthermore, the divergence time between the two species, although short on an evolutionary time scale, is long enough (about 1.8 to 2.8 MYA, Ramos-Onsins et al. 1998) to minimize the effect of shared ancestral polymorphisms.

Herein, we report a survey of nucleotide polymorphism in the RpII215 gene region of D. guanche, with the aim of testing the nearly neutral character of synonymous mutations and thus the role of weak selection in maintaining codon bias. This gene is located in a region of high recombination (section 10A of the X chromosome, Segarra and Aguadé 1992), and it has a long coding region (5,667 nt, 1,889 codons). Both characteristics lead us to expect that a sufficiently high number of synonymous polymorphic variants can be detected even in a species with a putatively low effective population size. The study of a single gene also has an advantage over studies in which several genes are jointly analyzed: it eliminates the overdispersion of the selection coefficients caused by differences in codon bias among genes. Most importantly, previous data on the divergence of this gene between D. subobscura and D. guanche (Llopart and Aguadé 2000) revealed an excess of unpreferred differences fixed in the D. guanche lineage, which would be consistent with its long-term reduced effective size if unpreferred mutations were under weak negative selection. Synonymous polymorphism in D. guanche should also reflect the reduced effective size of this species. Indeed, if weak selection were relaxed in the D. guanche lineage, there should be an excess of unpreferred mutations segregating in D. guanche relative to D. subobscura. In addition, these mutations would segregate at higher frequencies in the insular than in the continental species. The pattern of synonymous polymorphism detected in the RpII215 gene of D. guanche conforms to these expectations and indicates that most synonymous mutations in the D. guanche lineage behave as neutral. Therefore, the present data show that selection is relaxed in the D. guanche lineage and strongly support the nearly neutral character of synonymous mutations.


    Materials and Methods
 
Fly Samples
Drosophila guanche flies were collected from November 1999 to February 2000 in the localities Barranco del Infierno and Valle Brosque on the island of Tenerife (Canary Archipelago, Spain), which are separated by approximately 60 km of mountainous terrain. A total of 24 isofemale lines of D. guanche (12 lines from each locality) were used in this study. Polytene chromosomes of female third-instar larvae were examined to establish the putative presence of inversion polymorphism. A laboratory strain of D. pseudoobscura was used to extend the previously reported RpII215 sequence (Llopart and Aguadé 1999).

DNA Sequencing Strategy
Genomic DNA from a single male of each isofemale line was extracted using a standard small-scale procedure (Ashburner 1989). An approximately 6.9-kb DNA fragment that includes the RpII215 gene and its flanking regions was PCR amplified by means of three overlapping fragments. Oligonucleotides for PCR amplification and sequencing were designed from the available sequences of D. guanche and D. subobscura (Llopart and Aguadé 1999, 2000). After purification of PCR products with Microcon®-PCR Filter Columns (Millipore), both strands were completely sequenced using internal primers. An inverse-PCR and primer-walking strategy was used to extend the D. pseudoobscura sequence. Sequencing reactions were carried out with the ABI PRISM® BigDye™ Terminators version 2.0 Cycle Sequencing Kit (Applied Biosystems) following the manufacturer's instructions. The sequencing reaction products were separated, after ethanol precipitation, on an ABI PRISM 377 automated DNA sequencer (PerkinElmer), and sequences were assembled with the SeqEd version 1.03 program (Hagemann and Kwan 1997).

Newly reported DNA sequences from D. guanche have been deposited in the EMBL/GenBank Database under accession numbers AJ547806 and AJ548510 to AJ548532. The accession number of the partial sequence of D. pseudoobscura is AJ544770.

Data Analyses
DNA sequences were multiply aligned using the ClustalW program (Thompson, Higgins, and Gibson 1994) and further edited with the MacClade version 3.06 program (Maddison and Maddison 1992). Sites with alignment gaps were excluded from the analyses. The DnaSP version 3.98 program (Rozas and Rozas 1999) was used to perform most analyses. Genetic differentiation between populations was tested using the Ks* and Snn statistics (Hudson, Boos, and Kaplan 1992; Hudson 2000); statistical significance was obtained by the permutation test (10,000 replicates). The population recombination parameter R = 2Ner (where r is the recombination rate per generation between the most distant sites) was estimated from the minimum number of recombination events found in the sample, or RM (Hudson and Kaplan 1985). The minimum and maximum values of R compatible with the observed RM value (at the 5% level), RL and RU, were estimated following the method of Rozas et al. (2001). The global measures of linkage disequilibrium ZnS (Kelly 1997) and ZA (Rozas et al. 2001) were also obtained.

The gene genealogy was reconstructed by the neighbor-joining method (Saitou and Nei 1987), as implemented in the MEGA program (Kumar et al. 2001). The ancestral state of each variable site in the 24 D. guanche sequences and in the 11 previously reported D. subobscura sequences (Llopart and Aguadé 2000) was inferred using the RpII215 sequence of D. pseudoobscura (Llopart and Aguadé 1999; this study) and, occasionally (in six out of 130 cases), that of D. melanogaster (Jokerst et al. 1989) as the outgroup. Variable sites for which phylogenetic reconstruction did not result in a single most-parsimonious tree (i.e., ambiguous sites) were excluded from the analyses. Polarized synonymous changes were classified as preferred (from unpreferred to preferred codons) or unpreferred (from preferred to unpreferred codons) according to the D. melanogaster codon preferences (Akashi 1995). Although D. pseudoobscura is phylogenetically closer to D. subobscura than is D. melanogaster, we used the codon preferences of this latter species rather than those of D. pseudoobscura (Akashi and Schaeffer 1997), because they are based on a larger set of genes. Additionally, Kreitman and Antezana (2000) showed that codon preferences are similar in D. subobscura and D. melanogaster. Only changes from preferred to unpreferred codons and from unpreferred to preferred codons were analyzed. For the comparative analysis between species, previously published data on polymorphism in D. subobscura and divergence from D. guanche (Llopart and Aguadé 1999, 2000) were reanalyzed, since the D. pseudoobscura codon preferences had been used in those studies.
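The classification rule described above can be sketched in a few lines. This is a minimal illustration, not the procedure actually used: the function name and the tiny preference set are hypothetical, the caller is assumed to supply only polarized synonymous codon pairs, and a real analysis would load the full preference table of Akashi (1995).

```python
from typing import Optional, Set


def classify_synonymous_change(ancestral: str, derived: str,
                               preferred: Set[str]) -> Optional[str]:
    """Label a polarized synonymous change by codon preference.

    Returns "preferred" (unpreferred -> preferred codon), "unpreferred"
    (preferred -> unpreferred codon), or None when both codons belong to
    the same preference class (such changes are excluded from the analysis).
    The function does not check that the two codons are synonymous.
    """
    ancestral_is_preferred = ancestral.upper() in preferred
    derived_is_preferred = derived.upper() in preferred
    if ancestral_is_preferred == derived_is_preferred:
        return None
    return "preferred" if derived_is_preferred else "unpreferred"


# Hypothetical, deliberately tiny preference set, for illustration only.
EXAMPLE_PREFERRED = {"CTG", "GCC"}

if __name__ == "__main__":
    print(classify_synonymous_change("CTA", "CTG", EXAMPLE_PREFERRED))
    print(classify_synonymous_change("GCC", "GCT", EXAMPLE_PREFERRED))
    print(classify_synonymous_change("CTA", "CTT", EXAMPLE_PREFERRED))
```

Passing the preference set as an argument keeps the rule independent of the species whose codon preferences are adopted.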

The numbers of preferred and unpreferred changes expected from the mutational process (i.e., under a strictly neutral model) were obtained using the pattern of nucleotide mutation detected in unconstrained DNA sequences of Drosophila (Petrov and Hartl 1999). The mutational matrix was applied to the particular codon composition of the RpII215 sequence inferred for the ancestor of D. guanche and D. subobscura.

Tajima's test of neutrality (Tajima 1989), which is based on the frequency spectrum of polymorphisms, was applied to the frequency distributions of the different classes of derived variants (preferred and unpreferred synonymous changes, and changes in noncoding regions). The Mann-Whitney U test (z value from the fdMWU test, Akashi 1999) was applied to compare polymorphism frequency distributions using five frequency intervals of equal size.
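For readers unfamiliar with the statistic, the following sketch computes Tajima's D for one class of derived variants from the number of copies of the derived allele at each polymorphic site. It follows the standard formulas of Tajima (1989) and assumes biallelic sites with no missing data; the function name and input layout are illustrative, and significance testing (for instance by coalescent simulation with recombination) is not included.

```python
import math
from typing import Sequence


def tajimas_d(n: int, derived_counts: Sequence[int]) -> float:
    """Tajima's D for a sample of n sequences.

    derived_counts holds, for each segregating site, the number of
    sequences (1..n-1) that carry the derived variant.
    """
    s = len(derived_counts)
    if n < 4 or s == 0:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Mean number of pairwise differences, summed over sites.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in derived_counts)
    theta_w = s / a1  # Watterson's estimator from the number of sites
    return (pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    # An excess of rare variants gives D < 0; intermediate frequencies give D > 0.
    print(tajimas_d(24, [1] * 10))
    print(tajimas_d(24, [12] * 10))
```

Applying the function separately to preferred, unpreferred, and noncoding sites mirrors the per-class use of the test described above.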

The scaled selection coefficient (γ = Nes, where s is the selection coefficient) for the different kinds of changes was estimated using two different approaches based on the Poisson random field (PRF) theory (Sawyer and Hartl 1992), which assumes that the number of changes is an independent Poisson random variable. The first approach uses information on the frequency distribution (fd) of derived polymorphic variants (Hartl, Moriyama, and Sawyer 1994; Akashi and Schaeffer 1997). The PRFMLE software (Hartl, Moriyama, and Sawyer 1994) was used to estimate the γ values, their confidence intervals, and the statistical significance against the null hypothesis of no selection (γ = 0). The second approach uses information on the ratio of polymorphism to divergence (rpd) and requires comparison with a neutrally evolving region (Sawyer and Hartl 1992; Akashi 1995). In this case, confidence intervals of γ were obtained by parametric bootstrap (10,000 replicates); for each computer replicate, the numbers of preferred, unpreferred, and noncoding changes (both polymorphic and fixed) were randomly obtained from a Poisson distribution with the number of changes observed in each class as the mean.
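The parametric bootstrap described above can be sketched as follows. This is only an illustration under stated assumptions: the statistic resampled here is a plain ratio of polymorphism-to-divergence ratios rather than the rpd estimate of the scaled selection coefficient, the counts are placeholders and not data from this study, and all function names are hypothetical.

```python
import math
import random
from typing import Callable, Dict, Tuple


def poisson_draw(mean: float, rng: random.Random) -> int:
    """Draw from Poisson(mean) by Knuth's method (adequate for small means)."""
    limit = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def bootstrap_interval(observed: Dict[str, int],
                       statistic: Callable[[Dict[str, int]], float],
                       replicates: int = 10000,
                       alpha: float = 0.05,
                       seed: int = 1) -> Tuple[float, float]:
    """Percentile interval from Poisson resampling of each observed count."""
    rng = random.Random(seed)
    values = []
    for _ in range(replicates):
        sample = {name: poisson_draw(count, rng)
                  for name, count in observed.items()}
        try:
            values.append(statistic(sample))
        except ZeroDivisionError:
            continue  # statistic undefined for this replicate; skip it
    values.sort()
    lower = values[int(alpha / 2 * len(values))]
    upper = values[min(len(values) - 1, int((1 - alpha / 2) * len(values)))]
    return lower, upper


def rpd_ratio(counts: Dict[str, int]) -> float:
    """Stand-in statistic: (poly/fixed) for unpreferred over (poly/fixed) for noncoding."""
    unpreferred = counts["unpref_poly"] / counts["unpref_fixed"]
    noncoding = counts["nc_poly"] / counts["nc_fixed"]
    return unpreferred / noncoding


# Placeholder counts for illustration only (not taken from table 2).
EXAMPLE_COUNTS = {"unpref_poly": 20, "unpref_fixed": 23,
                  "nc_poly": 30, "nc_fixed": 40}

if __name__ == "__main__":
    print(bootstrap_interval(EXAMPLE_COUNTS, rpd_ratio, replicates=2000))
```

Replacing `rpd_ratio` with a function that solves for the scaled selection coefficient would give an interval for that parameter under the same resampling scheme.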


    Results
 
Level of Polymorphism in the RpII215 Region of Drosophila guanche
An approximately 6.9-kb region encompassing the RpII215 gene and its flanking regions was sequenced in 24 X chromosomes randomly obtained from two natural populations of D. guanche (Barranco del Infierno and Valle Brosque). Both populations were monomorphic at the chromosomal level. No genetic differentiation between populations was detected (Ks* = 2.63, P = 0.539; Snn = 0.41, P = 0.711). Thus, the two samples were pooled for further analyses. A total of 54 nucleotide polymorphisms and six length polymorphisms were detected. All nucleotide polymorphisms (fig. 1) were located in noncoding regions or at synonymous sites in the coding region (i.e., they were silent). All length polymorphisms affected tandem repeats in noncoding regions: A(3–4) and T(5–6) in the 5' flanking region; TGCAAAAA(2–3) and G(8–12) in intron 1; and GGA(1–6) and GGT(2–3) in the 3' flanking region. Estimated nucleotide diversity was similar in noncoding regions and at synonymous sites (table 1).



 
FIG. 1. Nucleotide polymorphism in the RpII215 gene region of D. guanche. Polymorphic nucleotide positions are numbered from the translation initiation site. The last two rows include the information for these polymorphic sites in D. subobscura (strain pp1 in Llopart and Aguadé 2000) and D. pseudoobscura (Llopart and Aguadé 1999; present study), respectively. Nucleotides identical to the first sequence are indicated by a dot. I = intron; BI = Barranco del Infierno; VB = Valle Brosque; Hap = haplotype; — = missing data; d = deletion; Dsub = D. subobscura; Dpse = D. pseudoobscura

 

 
Table 1 Nucleotide Variation in the RpII215 Region of D. guanche.

 
The high value of the recombination parameter estimate (R = 101 with RL = 40 and RU = 219) indicates that recombination has played a major role in the evolution of the RpII215 gene region. This is further supported by the low estimated values of two measures of global linkage disequilibrium (ZnS = 0.102 and ZA = 0.203).

Preferred and Unpreferred Changes in the D. guanche and D. subobscura Lineages
The 24 sequences of D. guanche and the 11 previously reported sequences of D. subobscura (Llopart and Aguadé 2000) were used to obtain the numbers of fixed and polymorphic changes separately in each lineage (table 2). The number of synonymous changes fixed in the D. guanche lineage largely exceeds the number of those fixed in the D. subobscura lineage, which is reflected in the gene genealogy shown in figure 2. The 130 (out of 142) synonymous changes (either polymorphic within species or fixed between species) that could be unambiguously polarized were subsequently classified according to the D. melanogaster codon usage preferences (Akashi 1995). There were 104 preferred or unpreferred changes, five changes from preferred to preferred codons, and 21 changes from unpreferred to unpreferred codons. Only preferred and unpreferred changes were considered in further analyses.


 
Table 2 Comparison of the Numbers of Synonymous Changes Between Lineages.

 


 
FIG. 2. Gene genealogy based on variation in the RpII215 gene. The tree was built using the number of synonymous substitutions per site (Jukes and Cantor 1969; Nei and Gojobori 1986) as the genetic distance. Open and black circles refer to the D. guanche sequences from Valle Brosque and Barranco del Infierno, respectively. Squares refer to D. subobscura sequences

 
The numbers of fixed preferred and unpreferred changes differed significantly between lineages (table 2), as previously detected when using a single D. subobscura and D. guanche sequence (Llopart and Aguadé 1999). Also, the numbers of polymorphic preferred and unpreferred changes differed significantly between species. In the D. guanche lineage, the ratio of preferred to unpreferred mutations was similar for polymorphic and fixed changes (χ² = 0.30, P = 0.58 [table 2 and fig. 3]). This observation is in contrast to the significant excess of unpreferred polymorphic changes previously detected in the D. subobscura lineage (Llopart and Aguadé 2000), when fixed differences were determined from the D. subobscura/D. madeirensis comparison. We detected a similar, although not significant, excess of unpreferred polymorphic changes in the D. subobscura lineage, when fixed differences were determined from the D. subobscura/D. guanche comparison (χ² = 2.70, P = 0.10 [table 2 and fig. 3]).



 
FIG. 3. Frequencies of preferred (dark shading) and unpreferred (light shading) mutations for polymorphic and fixed synonymous changes in the lineages of (a) D. guanche and (b) D. subobscura. Frequencies refer to the total number of synonymous changes detected in each lineage (from table 2)

 
Frequency Distribution of Silent Changes in D. guanche and D. subobscura
Natural selection acting on synonymous sites should affect the frequency spectrum of polymorphic variants. The Tajima (1989) and fdMWU (Akashi 1999) tests were used to detect putative deviations in the frequency distribution of polymorphisms. Tajima's test was conducted independently for the three categories of silent changes: preferred and unpreferred synonymous changes, and changes in noncoding sites. In D. guanche, none of Tajima's tests detected any deviation from neutral expectations using the conservative RL value (table 3). This is in contrast to the significant deviation from neutral predictions detected in D. subobscura for unpreferred changes and for variation in noncoding regions (table 3).


 
Table 3 Tajima's Test.

 
The fdMWU test (Akashi 1999) can be used to compare two frequency distributions, such as between two categories of changes in a single species or between two species for a single category of changes. The low number of preferred changes in D. guanche precluded their use in this kind of analysis. For the other categories of changes, the tests within species yielded contrasting results in the two species. In the insular species, there was no indication that unpreferred changes and changes in noncoding regions segregate at different frequencies (z = 0.62; P = 0.268). In D. subobscura, there was a close to significant difference between the frequency spectra of unpreferred changes and changes in noncoding regions (z = 1.38; P = 0.084), with the frequency spectrum of unpreferred changes more skewed toward low frequencies. Moreover, preferred and unpreferred changes segregated at significantly different frequencies in this species (Llopart and Aguadé 2000; present study). The comparison between species showed a significantly different frequency distribution in the two species only for unpreferred changes (z = 2.66; P = 0.004 [fig. 4]).



 
FIG. 4. Frequency distribution of unpreferred synonymous polymorphic changes in D. guanche and D. subobscura. Bars indicate observed values; diamonds (connected by lines) indicate expected values for neutral mutations obtained following Tajima (1989)
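The neutral expectation shown in figure 4 can be approximated with the short sketch below. It assumes the standard neutral result that sites carrying i copies of the derived variant occur in proportion to 1/i, scaled so that the classes sum to the observed number of segregating sites; whether the figure was computed in exactly this way is an assumption, and the function name is illustrative.

```python
from typing import List


def neutral_expected_spectrum(n: int, segregating_sites: int) -> List[float]:
    """Expected number of sites with i = 1..n-1 derived copies under neutrality.

    The expectations are conditioned on the observed number of
    segregating sites, so they sum to that number.
    """
    if n < 2:
        raise ValueError("need at least 2 sequences")
    a1 = sum(1.0 / i for i in range(1, n))
    return [segregating_sites / (i * a1) for i in range(1, n)]


if __name__ == "__main__":
    spectrum = neutral_expected_spectrum(24, 20)
    # Singletons are expected to be twice as common as doubletons.
    print(spectrum[0], spectrum[1])
```

Comparing these expectations with the observed bars makes any excess of low-frequency variants easy to read off.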

 
Synonymous Variation and Mutational Input
Although all previously described observations point to the nearly neutral character of synonymous mutations, we tested whether the mutational process alone could also account for the observed changes. The ancestral RpII215 sequence inferred by maximum parsimony from the 24 D. guanche and 11 D. subobscura sequences (using D. pseudoobscura as the outgroup) consisted of 658 unpreferred and 1,142 preferred codons (not considering Met and Trp codons), and its GC content at third codon positions was 0.732. The mutational matrix for unconstrained DNA sequences (Petrov and Hartl 1999) was applied to the particular codon composition of the RpII215 ancestral sequence to obtain the expected ratio of preferred to unpreferred changes (table 4). In the D. guanche lineage, the numbers of observed preferred and unpreferred changes did not differ significantly from those expected from the mutational process in either the fixed or the polymorphic class. In contrast, there was a significant excess of preferred changes in the D. subobscura lineage (table 4). All these observations are consistent with the hypothesis that a large fraction of synonymous changes behave as neutral in the D. guanche lineage, whereas they are weakly selected in D. subobscura. Indeed, the numbers of preferred and unpreferred mutations fixed in the D. subobscura lineage did not deviate from the equifrequency expected in mutation-selection-drift equilibrium (seven unpreferred and nine preferred changes; P = 0.617), whereas they clearly did in the D. guanche lineage (23 unpreferred and four preferred changes; P = 0.0003). This latter result indicates a change in the behavior of synonymous mutations in the D. guanche lineage after its split from D. subobscura.
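The two equifrequency comparisons can be checked with a chi-square goodness-of-fit test with one degree of freedom. The text does not name the test used; the sketch below is offered because it reproduces the reported P values, which suggests, but does not confirm, that this was the test applied. The function name is illustrative.

```python
import math
from typing import Tuple


def equifrequency_test(count_a: int, count_b: int) -> Tuple[float, float]:
    """Chi-square test (1 df) of two counts against a 1:1 expectation.

    Returns the chi-square statistic and its P value. For one degree of
    freedom the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    """
    expected = (count_a + count_b) / 2.0
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # D. subobscura lineage: seven unpreferred and nine preferred fixed changes.
    print(equifrequency_test(7, 9))
    # D. guanche lineage: 23 unpreferred and four preferred fixed changes.
    print(equifrequency_test(23, 4))
```

With these counts the statistic is 0.25 (P of about 0.617) for D. subobscura and about 13.37 (P of about 0.0003) for D. guanche.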


 
Table 4 Synonymous Changes and Mutational Input.

 
Estimates of Selection Coefficients
Scaled selection coefficients for unpreferred synonymous mutations and for mutations in noncoding regions were estimated both in the D. guanche and in the D. subobscura lineages, whereas those for preferred synonymous mutations were estimated only in the latter lineage due to their low number in D. guanche (table 5). Scaled selection coefficients for unpreferred changes estimated from the frequency distribution (fd) of polymorphisms clearly differed between lineages. Indeed, the estimated value was positive but not significantly different from zero in D. guanche (γ = 0.55), whereas it was negative and significantly lower than zero in D. subobscura (γ = –4.33). Unexpectedly, this method also yielded negative estimates for noncoding changes (γ = –2.79) in D. subobscura, suggesting that noncoding variation in this species could be under weak negative selection. The putatively slightly deleterious character of noncoding mutations in D. subobscura (maybe through their effect on the efficacy of transcription/splicing or on transcript stability) would preclude their use as a neutrally evolving region in the rpd approach. In D. guanche, the fd-based and rpd-based estimates of the scaled selection coefficient for unpreferred mutations (0.55 and –0.6, respectively) did not differ significantly from zero. Thus, present estimates would indicate, as predicted from theory if unpreferred changes are nearly neutral, that a large fraction of unpreferred mutations in the RpII215 gene behave as neutral in the insular species D. guanche, but they are slightly deleterious in the Palearctic species D. subobscura.


 
Table 5 Scaled Selection Coefficients Estimated by the fd and rpd Methods.

 

    Discussion
 
The nearly neutral model of molecular evolution predicts that changes in the effective population size affect nucleotide variation as a result of the differential contribution of weakly selected mutations (Ohta and Kimura 1971; Ohta 1972). In populations with a very reduced effective size, genetic drift overwhelms the effect of weak selection and, therefore, mildly selected mutations behave as neutral. If synonymous mutations in Drosophila were nearly neutral (as many studies indicate [see Introduction]), their pattern of variation might conform to neutral expectations in species with a very reduced effective population size. The pattern of synonymous variation observed in Drosophila guanche is consistent with predictions of the neutral model. Indeed, in this species lineage, (1) the value of Tajima's D statistic for unpreferred polymorphic mutations does not deviate significantly from zero; (2) the frequency spectra of unpreferred mutations and of mutations in noncoding sites do not differ significantly; (3) the numbers of preferred and unpreferred polymorphic variants do not deviate from those expected according to the mutational process; (4) the ratio of preferred to unpreferred synonymous changes does not differ significantly between polymorphic and fixed differences; and (5) the numbers of preferred and unpreferred fixed differences can be explained by the mutational input.
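The statement that drift overwhelms weak selection in small populations can be illustrated with the classical diffusion result for the fixation rate of a new mutation relative to a neutral one, S / (1 - exp(-S)). The sketch assumes an autosomal diploid locus with additive effects, for which S = 4Nes; the scaling for an X-linked gene such as RpII215 differs, so the numbers are illustrative only and are not meant to correspond to the estimates in table 5.

```python
import math


def relative_fixation_rate(scaled_s: float) -> float:
    """Fixation rate of a new mutation relative to a neutral mutation.

    scaled_s is the scaled selection coefficient S (4*Ne*s under the
    assumptions stated in the lead-in). The result tends to 1 as S
    approaches 0, which is the effectively neutral regime.
    """
    if abs(scaled_s) < 1e-8:
        return 1.0
    if scaled_s < 0:
        # Algebraically identical form that avoids overflow for large |S|.
        return scaled_s * math.exp(scaled_s) / math.expm1(scaled_s)
    return scaled_s / -math.expm1(-scaled_s)


if __name__ == "__main__":
    # The same deleterious s in populations of decreasing size (decreasing |S|).
    for s_value in (-10.0, -1.0, -0.1):
        print(s_value, relative_fixation_rate(s_value))
```

As the absolute value of S shrinks, the relative rate approaches 1, so a mutation that is efficiently removed in a large population fixes almost as readily as a neutral one in a small population.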

The pattern of synonymous variation in D. guanche contrasts with that detected in D. subobscura (Llopart and Aguadé 1999, 2000). As revealed in the present study, the frequency distribution of unpreferred polymorphic mutations differs significantly between species, which clearly indicates a different behavior of these mutations in both species. In fact, only in D. subobscura, there is an excess of unpreferred mutations segregating at low frequency, as reflected by the significantly negative value of Tajima's D statistic (Llopart and Aguadé 2000; present study). The contrasting results between D. guanche and D. subobscura can be explained according to the nearly neutral model of molecular evolution. Indeed, assuming that selection coefficients against unpreferred mutations are similar in both species, the detected differences are consistent with the much lower effective population size of D. guanche relative to D. subobscura. In D. guanche, a large fraction of unpreferred mutations would behave as neutral, as supported by the two estimated values of the scaled selection coefficient that do not differ significantly from zero (table 5). In contrast, in D. subobscura, selection against these mutations would be more effective, as indicated by the significantly negative fd estimate of the scaled selection coefficient. It could be argued that selection coefficients on synonymous mutations at RpII215 were smaller in D. guanche than in D. subobscura. Given the lower population size of the insular species, this alternative explanation seems rather unlikely.

Estimates of the scaled selection coefficient could be biased if the free recombination assumption of the PRF model (Sawyer and Hartl 1992) were severely violated. No major effect is expected in D. guanche since estimates of the recombination parameter and measures of linkage disequilibrium indicate that the rate of recombination at the RpII215 region is relatively high. Although in D. subobscura the RpII215 gene is also located in a region of high recombination, the presence of two chromosomal arrangements (Ast and A2) affecting this region might constitute a rather important departure from the free recombination assumption. Nevertheless, similar estimates of the scaled selection coefficients were obtained using all D. subobscura sequences (Nes = –4.33) and only sequences from the putatively ancestral Ast arrangement (Nes = –4.77). Furthermore, the magnitude (in absolute value) of these estimates in D. subobscura, although rather high for nearly neutral mutations, would be in good agreement with previous estimates reported for D. pseudoobscura (Nes = –4.6 for the Adh-Adhr genes) and D. simulans (Nes = –2.1, data from eight genes) (Akashi and Schaeffer 1997).

As an alternative to weak selection acting on synonymous variation, demographic factors differentially affecting the two species might also explain their different pattern of unpreferred mutations. Thus, a population expansion in D. subobscura but not in D. guanche might account for the observed results. Indeed, a skew towards rare variants and therefore negative Tajima's D values are expected after population expansions. However, demographic factors should affect different kinds of nucleotide variation similarly. Although both noncoding and unpreferred mutations in D. subobscura show negative Tajima's D values, this is not the case for preferred mutations. This differential behavior of preferred mutations in D. subobscura is therefore not consistent with demographic hypotheses.

The results obtained in the present study are thus consistent with an evolutionary scenario where codon bias was actively maintained by natural selection before the species split. The population size would have remained large in the continental species D. subobscura and, thus, codon bias was maintained by natural selection. The small population size of D. guanche would have caused, on the contrary, a relaxation of natural selection and, consequently, an important fraction of synonymous mutations that were slightly deleterious in the ancestral lineage (and in D. subobscura) would behave as neutral in the endemic species. This shift in the selective behavior of synonymous mutations would cause a progressive reduction of codon bias in D. guanche. Indeed, codon bias would move slowly to a new mutation-selection-drift equilibrium, given the minor contribution of weakly selected mutations.

Under strict neutrality, the expected heterozygosity (θ) in a stationary population at mutation-drift equilibrium is equal to 4Neu (3Neu for X-linked genes), where u is the mutation rate. According to this prediction, the expected heterozygosity should differ substantially between D. subobscura and D. guanche. Estimated nucleotide diversity in this gene is about twofold lower in D. guanche than in D. subobscura. This difference is qualitatively consistent with the lower effective population size of D. guanche relative to D. subobscura, but it is quantitatively much smaller than expected for neutral variation, given the putatively large disparity in effective sizes. However, if the effective sizes of the two species were much more similar, it would be difficult to explain, for instance, the significantly different numbers of unpreferred and preferred synonymous changes fixed in the two lineages or the significantly different distribution of unpreferred polymorphic mutations in the two species.
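
The neutral expectation can be made concrete with a small sketch. Because θ is proportional to Ne, the expected ratio of diversities between two species equals the ratio of their effective sizes when the mutation rate is shared. The effective sizes and mutation rate below are purely hypothetical placeholders, not estimates for D. subobscura or D. guanche.

```python
def expected_theta(ne: float, mu: float, x_linked: bool = False) -> float:
    """Neutral expectation of theta: 4*Ne*u, or 3*Ne*u for X-linked genes."""
    factor = 3.0 if x_linked else 4.0
    return factor * ne * mu


# Hypothetical values chosen only to illustrate the scaling
mu = 1e-9          # assumed per-site mutation rate
ne_large = 1e6     # assumed effective size of a large continental species
ne_small = 1e4     # assumed effective size of a small insular species

theta_large = expected_theta(ne_large, mu, x_linked=True)
theta_small = expected_theta(ne_small, mu, x_linked=True)
expected_ratio = theta_large / theta_small  # equals ne_large / ne_small
```

Under strict neutrality, any disparity in effective size of this kind would translate into an equally large disparity in diversity, far more than the roughly twofold difference observed.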

In addition, a lack of correlation between heterozygosity and effective size has been extensively reported (Lewontin 1974), which has led to the development of population genetic models that account for this discrepancy. Thus, the pseudohitchhiking model (Gillespie 2000, 2001), which considers the effect of strongly selected mutations on linked neutral variation, can explain the insensitivity of heterozygosity to changes in the population size and may even predict a reduction of variation with increasing population size. On the other hand, variation in the RpII215 gene region is mainly due to synonymous mutations. In D. subobscura, unlike in D. guanche, synonymous mutations (and probably also mutations at noncoding sites) are weakly selected and thus not strictly neutral. Since weak selection acting in the maintenance of codon bias reduces the sojourn time of these mutations in the population, the expected level of standing synonymous variation in the RpII215 gene would be lower in D. subobscura than in D. guanche. In addition, the interference model predicts a reduction of intraspecific variation due to linkage between mutations under weak selection (McVean and Charlesworth 2000; Comeron and Kreitman 2002). As this kind of selection is much more efficient in D. subobscura, the effect of interference should be stronger in this species than in D. guanche. However, both the pseudohitchhiking and the interference effects are based on linked selection, which is expected to be less efficient in regions of high recombination, such as the RpII215 region. Thus, the mere twofold reduction of variation in D. guanche relative to D. subobscura might be explained by the differential behavior of synonymous mutations in the two species and, likely to a minor extent, by the pseudohitchhiking and interference effects.

In conclusion, the comparative study of nucleotide variation at RpII215 in D. subobscura and D. guanche clearly shows that synonymous mutations behave differently in these species. This contrasting behavior would be caused by differences in their effective population size, assuming that synonymous mutations are under weak selection. The present results therefore provide strong evidence for the role of weak selection in the maintenance of codon bias. They also show that the D. subobscura and D. guanche species pair is a good model to further analyze the action of weak selection and, more specifically, how the effective population size affects the level and pattern of nucleotide variation.


    Acknowledgements
 
We thank Serveis Científico-Tècnics, Universitat de Barcelona, for automated sequencing facilities. J.A.P. was supported by a fellowship from the Dirección General de Universidades e Investigación, Gobierno de Canarias, Spain. This work was supported by grants PB97-0918 and BMC2001-2906 from Comisión Interdepartamental de Ciencia y Tecnología, Spain, and grant 1999SGR-25 from Comissió Interdepartamental de Recerca i Innovació Tecnològica, Generalitat de Catalunya, Spain, to M.A.


    Footnotes
 
Edward Holmes, Associate Editor


    Literature Cited
 

    Akashi, H. 1994. Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics 136:927-935.

    Akashi, H. 1995. Inferring weak selection from patterns of polymorphism and divergence at "silent" sites in Drosophila DNA. Genetics 139:1067-1076.

    Akashi, H. 1997. Distinguishing the effects of mutational biases and natural selection on DNA sequence variation. Genetics 147:1989-1991.

    Akashi, H. 1999. Inferring the fitness effects of DNA mutations from polymorphism and divergence data: statistical power to detect directional selection under stationarity and free recombination. Genetics 151:221-238.

    Akashi, H., and S. W. Schaeffer. 1997. Natural selection and the frequency distributions of "silent" DNA polymorphism in Drosophila. Genetics 146:295-307.

    Ashburner, M. 1989. Drosophila, a laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

    Begun, D. J. 2002. Protein variation in Drosophila simulans, and comparison of genes from centromeric versus noncentromeric regions of chromosome 3. Mol. Biol. Evol. 19:201-203.

    Bulmer, M. 1991. The selection-mutation-drift theory of synonymous codon usage. Genetics 129:897-907.

    Comeron, J. M., and M. Kreitman. 2002. Population, evolutionary and genomic consequences of interference selection. Genetics 161:389-410.

    Comeron, J. M., M. Kreitman, and M. Aguadé. 1999. Natural selection on synonymous sites is correlated with gene length and recombination in Drosophila. Genetics 151:239-249.

    Duret, L., and D. Mouchiroud. 1999. Expression pattern and, surprisingly, gene length shape codon usage in Caenorhabditis, Drosophila, and Arabidopsis. Proc. Natl. Acad. Sci. USA 96:4482-4487.

    Eyre-Walker, A. 1997. Differentiating between selection and mutation bias. Genetics 147:1983-1987.

    Gillespie, J. H. 2000. Genetic drift in an infinite population: the pseudohitchhiking model. Genetics 155:909-919.

    Gillespie, J. H. 2001. Is the population size of a species relevant to its evolution? Evolution 55:2161-2169.

    Hagemann, T. L., and S. P. Kwan. 1997. SeqEd: manipulation of sequence data and chromatograms from the ABI DNA sequencer analysis files. Methods Mol. Biol. 70:55-63.

    Hartl, D. L., E. N. Moriyama, and S. A. Sawyer. 1994. Selection intensity for codon bias. Genetics 138:227-234.

    Hey, J., and R. M. Kliman. 2002. Interactions between natural selection, recombination and gene density in the genes of Drosophila. Genetics 160:595-608.

    Hudson, R. R. 2000. A new statistic for detecting genetic differentiation. Genetics 155:2011-2014.

    Hudson, R. R., D. D. Boos, and N. L. Kaplan. 1992. A statistical test for detecting geographic subdivision. Mol. Biol. Evol. 9:138-151.

    Hudson, R. R., and N. L. Kaplan. 1985. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147-164.

    Jokerst, R. S., J. R. Weeks, W. A. Zehring, and A. L. Greenleaf. 1989. Analysis of the gene encoding the largest subunit of RNA polymerase II in Drosophila. Mol. Gen. Genet. 215:266-275.

    Jukes, T. H., and C. R. Cantor. 1969. Evolution of protein molecules. Pp. 21–123 in H. N. Munro, ed. Mammalian protein metabolism. Academic Press, New York.

    Kelly, J. K. 1997. A test of neutrality based on interlocus associations. Genetics 146:1197-1206.

    Kliman, R. M. 1999. Recent selection on synonymous codon usage in Drosophila. J. Mol. Evol. 49:343-351.

    Kliman, R. M., and J. Hey. 1993. Reduced natural selection associated with low recombination in Drosophila melanogaster. Mol. Biol. Evol. 10:1239-1258.

    Kliman, R. M., and J. Hey. 1994. The effects of mutation and natural selection on codon bias in the genes of Drosophila. Genetics 137:1049-1056.

    Kreitman, M., and M. Antezana. 2000. The population and evolutionary genetics of codon bias. Pp. 82–101 in R. S. Singh and C. B. Krimbas, eds. Evolutionary genetics: from molecules to morphology. Cambridge University Press, Cambridge.

    Kumar, S., K. Tamura, I. B. Jakobsen, and M. Nei. 2001. MEGA2: molecular evolutionary genetics analysis software. Bioinformatics 17:1244-1245.

    Lewontin, R. C. 1974. The genetic basis of evolutionary change. Columbia University Press, New York.

    Li, W.-H. 1987. Models of nearly neutral mutations with particular implications for nonrandom usage of synonymous codons. J. Mol. Evol. 24:337-345.

    Llopart, A., and M. Aguadé. 1999. Synonymous rates at the RpII215 gene of Drosophila: variation among species and across the coding region. Genetics 152:269-280.

    Llopart, A., and M. Aguadé. 2000. Nucleotide polymorphism at the RpII215 gene in Drosophila subobscura: weak selection on synonymous mutations. Genetics 155:1245-1252.

    Maddison, W. P., and D. R. Maddison. 1992. MacClade: analysis of phylogeny and character evolution. Version 3.0. Sinauer Associates, Sunderland, Mass.

    McVean, G. A., and B. Charlesworth. 2000. The effects of Hill-Robertson interference between weakly selected mutations on patterns of molecular evolution and variation. Genetics 155:929-944.

    Moriyama, E. N., and D. L. Hartl. 1993. Codon usage bias and base composition of nuclear genes in Drosophila. Genetics 134:847-858.

    Munté, A., M. Aguadé, and C. Segarra. 1997. Divergence of the yellow gene between Drosophila melanogaster and D. subobscura: recombination rate, codon bias and synonymous substitutions. Genetics 147:165-175.

    Munté, A., M. Aguadé, and C. Segarra. 2001. Changes in the recombinational environment affect divergence in the yellow gene of Drosophila. Mol. Biol. Evol. 18:1045-1056.

    Nei, M. 1987. Molecular evolutionary genetics. Columbia University Press, New York.

    Nei, M., and T. Gojobori. 1986. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3:418-426.

    Ohta, T. 1972. Evolutionary rate of cistrons and DNA divergence. J. Mol. Evol. 1:150-157.

    Ohta, T., and M. Kimura. 1971. On the constancy of the evolutionary rate of cistrons. J. Mol. Evol. 1:18-25.

    Petrov, D. A., and D. L. Hartl. 1999. Patterns of nucleotide substitution in Drosophila and mammalian genomes. Proc. Natl. Acad. Sci. USA 96:1475-1479.

    Ramos-Onsins, S., C. Segarra, J. Rozas, and M. Aguadé. 1998. Molecular and chromosomal phylogeny in the obscura group of Drosophila inferred from sequences of the rp49 gene region. Mol. Phylogenet. Evol. 9:33-41.

    Rozas, J., and R. Rozas. 1999. DnaSP: an integrated program for molecular population genetics and molecular evolution analysis. Version 3. Bioinformatics 15:174-175.

    Rozas, J., M. Gullaud, G. Blandin, and M. Aguadé. 2001. DNA variation at the rp49 gene region of Drosophila simulans: evolutionary inferences from an unusual haplotype structure. Genetics 158:1147-1155.

    Saitou, N., and M. Nei. 1987. The Neighbor-Joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.

    Sawyer, S. A., and D. L. Hartl. 1992. Population genetics of polymorphism and divergence. Genetics 132:1161-1176.

    Segarra, C., and M. Aguadé. 1992. Molecular organization of the X chromosome in different species of the obscura group of Drosophila. Genetics 130:513-521.

    Shields, D. C., P. M. Sharp, D. G. Higgins, and F. Wright. 1988. "Silent" sites in Drosophila genes are not neutral: evidence of selection among synonymous codons. Mol. Biol. Evol. 5:704-716.

    Tajima, F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585-595.

    Takano-Shimizu, T. 1999. Local recombination and mutation effects on molecular evolution in Drosophila. Genetics 153:1285-1296.

    Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.

Accepted for publication June 14, 2003.