*Laboratoire "Biométrie et biologie évolutive," UMR CNRS 5558, Université Claude Bernard Lyon 1, Villeurbanne, France;
Center for the Study of Evolution, University of Sussex, Brighton
Abstract |
---|
Introduction |
---|
The first model proposes that the positive correlation between codon bias and recombination rate is caused by Hill-Robertson interference (HRi) (Kliman and Hey 1993; Comeron, Kreitman, and Aguadé 1999; McVean and Charlesworth 2000). HRi leads to a decrease of selection efficacy. This is because the linkage disequilibrium between alleles at selected loci, generated by the stochastic nature of mutation and sampling in a finite population, interferes with the action of selection at other loci (Hill and Robertson 1966; Felsenstein 1974). Simulation studies suggest that the effect of genetic linkage should be particularly damaging in the case of weak selection, such as selection acting on codon usage (Li 1987; Comeron, Kreitman, and Aguadé 1999; McVean and Charlesworth 2000).
The second model proposes that the positive correlation between codon bias and recombination rate is a byproduct of mutational bias variations (MBV) associated with recombination (Marais, Mouchiroud, and Duret 2001). Consistent with this model, in D. melanogaster and C. elegans the G+C content of both noncoding DNA and synonymous sites correlates positively with recombination rate (Marais, Mouchiroud, and Duret 2001). In the D. melanogaster subgroup, local changes in crossing-over frequencies between species are correlated with changes in MBV (Takano-Shimizu 2001). Because most of the optimal codons end in G or C in both D. melanogaster and C. elegans (Shields et al. 1988; Stenico, Lloyd, and Sharp 1994; Duret and Mouchiroud 1999), the high frequency of optimal codons observed in regions of high recombination may be the result of MBV associated with recombination (Marais, Mouchiroud, and Duret 2001). A positive correlation between G+C content and recombination has also been observed in other organisms, such as yeast (Baudat and Nicolas 1997; Gerton et al. 2000), mouse (Perry and Ashworth 1999), and human (Eyre-Walker 1993; Eisenbarth et al. 2000; Fullerton, Bernardo Carvalho, and Clark 2001; Yu et al. 2001). In such eukaryotic organisms, the recombination machinery induces gene conversion between parental chromosomes during meiosis (Smith and Nicolas 1998). Experimental evidence in mammals suggests that gene conversion associated with recombination favors the copy of the most GC-rich sequence over the other (Brown and Jiricny 1988; Bill et al. 1998). Biased gene conversion might explain why MBV are associated with recombination in many organisms (Galtier et al. 2001).
Recently, both models have been tested in C. elegans and D. melanogaster by considering separately codons ending in G or C and codons ending in A or U (Marais, Mouchiroud, and Duret 2001). In both invertebrates, the frequency of GC-ending codons correlates positively with recombination rate, and the frequency of AU-ending codons correlates negatively with recombination rate, in agreement with the MBV model but not with the HRi model. Thus, the positive correlation between codon bias and recombination rate is mainly caused by MBV in C. elegans and D. melanogaster (Marais, Mouchiroud, and Duret 2001). An important question remains: is it possible to detect HRi on codon usage in C. elegans and D. melanogaster once the effect of MBV has been accounted for?
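The split into GC-ending and AU-ending codons can be illustrated with a minimal sketch. The function name and the decision to count every in-frame codon are assumptions made for illustration; the cited study may have restricted the count to particular codon families.

```python
def third_position_frequencies(cds):
    """Return the fractions of GC-ending and AU-ending codons in an in-frame CDS.

    Illustrative assumption: every complete codon is counted, including
    Met, Trp and stop codons, which a real analysis might exclude.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    # Third base of each codon; ambiguous bases (e.g. N) are skipped.
    ends = [seq[i + 2] for i in range(0, usable, 3) if seq[i + 2] in "ACGT"]
    if not ends:
        raise ValueError("no usable codons in sequence")
    gc_fraction = sum(base in "GC" for base in ends) / len(ends)
    return {"GC_ending": gc_fraction, "AU_ending": 1.0 - gc_fraction}
```

Under the MBV model, both fractions computed this way are expected to track recombination rate in opposite directions, whatever the optimal codons are.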
Introns are often considered good indicators of mutation patterns (Kliman and Hey 1993, 1994; Akashi, Kliman, and Eyre-Walker 1998). Thus, in our previous work, we used introns as indicators of MBV, but we failed to detect any HRi on codon usage (Marais, Mouchiroud, and Duret 2001). However, introns may be poor indicators of MBV affecting synonymous sites in such compact genomes as D. melanogaster and C. elegans. Because selection on codon usage is not expected to act on lowly expressed genes, we used the synonymous sites of lowly expressed genes to account for the effect of MBV on codon bias. We measured the differences between codon biases of highly expressed genes and their lowly expressed neighbors. This measure of codon bias should therefore be independent of the MBV occurring at synonymous sites. In D. melanogaster we find that HRi probably affects selection on codon usage of genes located in regions of very low recombination (<1 cM/Mb). Under the assumption that highly expressed genes are representative of the genes experiencing selection on codon usage, only 4% of genes are affected by less effective selection on codon usage because of HRi in this species. In C. elegans we do not find any evidence for the effect of recombination on selection for codon bias. We suggest that the correlation between codon bias and recombination rate is a consequence of MBV in this species. Computer simulations indicate that HRi only affects selection on codon usage when the local recombination rate is below the mutation rate. This prediction of the model is consistent with our data and the current estimate of the mutation rate in D. melanogaster. The case of C. elegans, which is highly self-fertilizing, is discussed. Finally, our results suggest that HRi is a minor determinant of variations in codon bias across the genome.
Materials and Methods |
---|
Random Sampling of the Data Set
To resolve the problem of the covariations of gene length and recombination rate, we forced the distribution of gene length to be the same for the different classes of recombination rate for both lowly and highly expressed genes. We chose the distribution of gene length of the recombination rate class with the smallest sample size among lowly and highly expressed genes to be the reference distribution of gene length for all other recombination rate classes for both lowly and highly expressed genes. For C. elegans, this distribution corresponds to 23% of genes with coding sequence (CDS) length <1,000 nucleotides, 18% of genes with CDS length of 1,000–1,750 nucleotides, and 59% of genes with CDS length >1,750 nucleotides. For D. melanogaster this distribution corresponds to 18% of genes with CDS length <800 nucleotides, 21% of genes with CDS length of 800–1,550 nucleotides, and 61% of genes with CDS length >1,550 nucleotides. We generated 10 new data sets by random sampling of genes in each class of sequence length for each class of recombination for both lowly and highly expressed genes. In D. melanogaster, n = 4,159 for each data set corrected for gene length variations; in C. elegans, n = 3,100 for each data set corrected for gene length variations.
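The resampling step can be sketched as follows. This is one possible reading of the procedure, not the code used in the study: the function names, the input layout and the rule for choosing the sample size (the largest size every length bin can supply at the reference proportions) are all assumptions. Proportions are given as integer percentages to keep the arithmetic exact.

```python
import random
from collections import defaultdict


def length_bin(cds_length, low, high):
    """Bin 0: CDS shorter than low; bin 1: low to high inclusive; bin 2: longer than high."""
    if cds_length < low:
        return 0
    if cds_length <= high:
        return 1
    return 2


def resample_to_reference(genes, percents, low, high, rng):
    """Subsample one recombination class so its length bins match percents.

    genes: list of (gene_id, cds_length) pairs (hypothetical layout).
    percents: reference share of each length bin, e.g. (23, 18, 59).
    """
    by_bin = defaultdict(list)
    for gene_id, cds_length in genes:
        by_bin[length_bin(cds_length, low, high)].append(gene_id)
    # Assumed rule: largest total that no bin is too small to supply.
    total = min(len(by_bin[b]) * 100 // pct for b, pct in enumerate(percents) if pct > 0)
    sample = []
    for b, pct in enumerate(percents):
        sample.extend(rng.sample(by_bin[b], total * pct // 100))
    return sample
```

Calling the function 10 times with different seeds, for example `resample_to_reference(genes, (23, 18, 59), 1000, 1750, random.Random(seed))`, would give 10 length-matched data sets for one recombination class with the C. elegans reference distribution.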
Computer Simulations
The simulation process is close to that of previous simulation studies of HRi (Li 1987; Comeron, Kreitman, and Aguadé 1999; McVean and Charlesworth 2000): we assumed that each individual is represented by L biallelic sites (e.g., optimal and nonoptimal codons). The haploid population size is N. If not specified, the mutation rate from nonoptimal toward optimal codons is u, the reverse mutation rate is v = 2u, leading to an equilibrium value of u/(u + v) = 0.33 without selection (Fop = 0.33 when codon usage is uniform), and the global mutation rate (number of mutations per site per generation) is m = u(1 - Fop) + vFop. The number of mutations follows a Poisson distribution of mean NLu and NLv. The number of crossing-overs per generation also follows a Poisson distribution of mean NLr, where r is the recombination rate (number of crossing-overs per site per generation). The N individuals of the next generation are randomly chosen by multinomial sampling among the N individuals of the present generation, given their relative fitness in the population. The absolute fitness of a sequence with i optimal sites is given by (1 + s)^i, which is equivalent to negative selection on nonoptimal codons, given a simple transformation of selection coefficient s (Piganeau et al. 2001). The process is run for 4/(u + v) generations to reach equilibrium. The mean and variance of the equilibrium optimal codon frequency are calculated from 100 values checked every 2N generations, and each simulation is run at least four times. Without linkage between selected sites, selection efficiency is known to depend on the scaled mutation rates Nu and Nv and selection coefficients Ns (Li 1987). In the rest of the text, the Fop value expected without linkage between selected sites is referred to as Fop-max. Under complete linkage, the selection efficiency depends on the scaled mutation rates NLu and NLv and Ns (McVean and Charlesworth 2000).
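A minimal sketch of one way to implement the process described above is given here. It is not the program used in the study. The function names, the order of the steps within a generation (mutation, crossing-over, then selection), the placement of mutation events at uniformly drawn sites, and the pairing of two random haplotypes for each crossing-over are all assumptions made for illustration.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's method (adequate for small means only)."""
    threshold = math.exp(-mean)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_fop(N, L, u, v, r, s, generations, seed=0):
    """Return the population-wide frequency of optimal sites after each generation."""
    rng = random.Random(seed)
    p0 = u / (u + v)  # neutral equilibrium frequency of the optimal state
    pop = [[1 if rng.random() < p0 else 0 for _ in range(L)] for _ in range(N)]
    trajectory = []
    for _ in range(generations):
        # Mutation: Poisson numbers of events with means N*L*u and N*L*v.
        # An event landing on a site already in the target state changes nothing,
        # so each nonoptimal site mutates at rate u and each optimal site at rate v.
        for new_state, rate in ((1, u), (0, v)):
            for _event in range(poisson(rng, N * L * rate)):
                pop[rng.randrange(N)][rng.randrange(L)] = new_state
        # Crossing-over: Poisson number of events with mean N*L*r; two distinct
        # haplotypes exchange everything to the right of a random breakpoint.
        for _event in range(poisson(rng, N * L * r)):
            a, b = rng.sample(range(N), 2)
            cut = rng.randrange(1, L)
            pop[a][cut:], pop[b][cut:] = pop[b][cut:], pop[a][cut:]
        # Selection and drift: multinomial sampling weighted by (1 + s)^i.
        weights = [(1 + s) ** sum(hap) for hap in pop]
        pop = [list(hap) for hap in rng.choices(pop, weights=weights, k=N)]
        trajectory.append(sum(map(sum, pop)) / (N * L))
    return trajectory
```

To follow the protocol in the text, a caller would discard the first 4/(u + v) generations of the returned trajectory and then average values taken every 2N generations. The sketch requires N ≥ 2 and L ≥ 2, and the simple Poisson sampler becomes unreliable once N·L·u, N·L·v or N·L·r reaches several hundred.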
Results and Discussion |
---|
We studied the complete genomes of C. elegans (The C. elegans Sequencing Consortium 1998) and D. melanogaster (Adams et al. 2000). We measured codon bias by the frequency of optimal codons (Fop) (Stenico, Lloyd, and Sharp 1994; Duret and Mouchiroud 1999). For each highly expressed gene, we measured the average difference between its Fop and the Fop of its lowly expressed neighbors over an interval of 100 kb centered on the midpoint of the highly expressed gene. In this way, we removed the local effect of MBV on Fop of highly expressed genes. In figure 1, we show the residuals of Fop after the removal of the MBV effect on codon usage (noted Fop-MBV for Fop corrected for MBV) according to recombination rate. The overall relationship between Fop-MBV and recombination rate is clearly not linear (see fig. 1). In D. melanogaster we observed a weak but significant increase of Fop-MBV with recombination rate for highly expressed genes located in regions of recombination rate of 0–1 cM/Mb (Spearman's rank correlation coefficient Rs = 0.129 with P = 0.0033) and no relationship between Fop-MBV and recombination rate for the other highly expressed genes (1 to > 3.9 cM/Mb, Rs = -0.019 with P = 0.32). This observation suggests that codon usage of highly expressed genes located in regions with recombination rate under 1 cM/Mb in D. melanogaster probably experiences HRi. The same is found for moderately expressed genes, although variations in Fop-MBV induced by HRi tend to be weaker (see fig. 1). For these genes, variations in Fop-MBV in regions of recombination rate of 0–1 cM/Mb are not significant (Rs = 0.021 with P = 0.46). Thus, we do not consider them in the rest of the analysis. In C. elegans the relationship between Fop-MBV and recombination rate for highly expressed genes is not convincing, although there is a global correlation between the two parameters (Rs = 0.064 with P < 0.0075). For moderately expressed genes, the relationship is not convincing, and there is no global correlation (see fig. 1).
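The Fop-MBV measure and the rank correlation used above can be sketched as follows. The function names and input layout are hypothetical, and two points are assumptions: a lowly expressed gene counts as a neighbor when its own midpoint falls inside the 100-kb window, and highly expressed genes without any such neighbor are skipped. The sketch returns Rs only and does not compute P values.

```python
import math


def fop_mbv(high_genes, low_genes, window=100_000):
    """Mean difference between each highly expressed gene's Fop and its neighbors' Fop.

    high_genes, low_genes: lists of (midpoint_bp, fop) on one chromosome.
    Returns a list of (midpoint_bp, fop_mbv) pairs.
    """
    half = window / 2
    result = []
    for mid, fop in high_genes:
        diffs = [fop - low_fop for low_mid, low_fop in low_genes
                 if abs(low_mid - mid) <= half]
        if diffs:  # assumed: no lowly expressed neighbor means no estimate
            result.append((mid, sum(diffs) / len(diffs)))
    return result


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

Restricting the input to genes in the 0–1 cM/Mb class before calling `spearman_rs` on recombination rate and Fop-MBV mirrors the class-wise comparison reported in the text.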
Acknowledgements |
---|
Footnotes |
---|
1 Both authors contributed equally to this work
Abbreviations: Fop, frequency of optimal codons; Rs, Spearman's rank correlation coefficient; MBV, mutational bias variations; HRi, Hill-Robertson interference; Fop-MBV, Fop corrected for MBV; r, recombination rate; m, mutation rate; N, effective population size; L, number of selected sites; s, selection coefficient; CDS, coding sequence; Fop-max, Fop value expected with independent selected sites; EST, expressed sequence tag.
Keywords: codon usage; recombination; mutation patterns; Hill-Robertson interference; Drosophila; Caenorhabditis
Address for correspondence and reprints: Gabriel Marais, Laboratoire "Biométrie et biologie évolutive," UMR CNRS 5558, Université Claude Bernard Lyon 1, 43 Bvd du 11 novembre 1918, 69622 Villeurbanne, France. E-mail: marais{at}biomserv.univ-lyon1.fr
References |
---|
Adams M. D., S. E. Celniker, R. A. Holt, et al. (95 co-authors) 2000 The genome sequence of Drosophila melanogaster Science 287:2185-2195
Akashi H., 1995 Inferring weak selection from patterns of polymorphism and divergence at "silent" sites in Drosophila DNA Genetics 139:1067-1076
Akashi H., R. M. Kliman, A. Eyre-Walker, 1998 Mutation pressure, natural selection, and the evolution of base composition in Drosophila Genetica 102/103:49-60
Andolfatto P., 2001 Contrasting patterns of X-linked and autosomal nucleotide variation in Drosophila melanogaster and Drosophila simulans Mol. Biol. Evol 18:279-290
Andolfatto P., M. Przeworski, 2000 A genome-wide departure from the standard neutral model in natural populations of Drosophila Genetics 156:257-268
Barnes T. M., Y. Kohara, A. Coulson, S. Hekimi, 1995 Meiotic recombination, noncoding DNA and genomic organization in Caenorhabditis elegans Genetics 141:159-179
Baudat F., A. Nicolas, 1997 Clustering of meiotic double-strand breaks on yeast chromosome III Proc. Natl. Acad. Sci. USA 94:5213-5218
Bill C. A., W. A. Duran, N. R. Miselis, J. A. Nickoloff, 1998 Efficient repair of all types of single-base mismatches in recombination intermediates in Chinese Hamster ovary cells. Competition between long-patch and G-T glycosylase-mediated repair of G-T mismatches Genetics 149:1935-1943
Brown T. C., J. Jiricny, 1988 Different base/base mispairs are corrected with different efficiencies and specificities in monkey kidney cells Cell 54:705-711
Bulmer M., 1991 The selection-mutation-drift theory of synonymous codon usage Genetics 129:897-907
The C. elegans Sequencing Consortium. 1998 Genome sequence of the nematode C. elegans: a platform for investigating biology Science 282:2012-2018
Chiapello H., F. Lisacek, M. Caboche, A. Henaut, 1998 Codon usage and gene function are related in sequences of Arabidopsis thaliana Gene 209:GC1-GC38
Comeron J. M., M. Kreitman, M. Aguadé, 1999 Natural selection on synonymous sites is correlated with gene length and recombination in Drosophila Genetics 151:239-249
Drake J. W., B. Charlesworth, D. Charlesworth, J. Crow, 1998 Rates of spontaneous mutation Genetics 148:1667-1686
Dunn K. A., J. P. Bielawski, Z. Yang, 2001 Substitution rates in Drosophila nuclear genes: implications for translational selection Genetics 157:295-305
Duret L., 2000 tRNA gene number and codon usage in C. elegans genome are co-adapted for optimal translation of highly expressed genes Trends Genet 16:287-289
Duret L., P. Bucher, 1997 Searching for regulatory elements in human noncoding sequences Curr. Opin. Struct. Biol 7:399-406
Duret L., L. D. Hurst, 2001 The elevated GC content at exonic third sites is not evidence against neutralist models of isochore evolution Mol. Biol. Evol 18:757-762
Duret L., D. Mouchiroud, 1999 Expression pattern and, surprisingly, gene length shape codon usage in Caenorhabditis, Drosophila, and Arabidopsis Proc. Natl. Acad. Sci. USA 96:4482-4487
Eisenbarth I., G. Vogel, W. Krone, W. Vogel, G. Assum, 2000 An isochore transition in the NF1 gene region coincides with a switch in the extent of linkage disequilibrium Am. J. Hum. Genet 67:873-880
Eyre-Walker A., 1993 Recombination and mammalian genome evolution Proc. R. Soc. Lond. B 252:237-243
Felsenstein J., 1974 The evolutionary advantage of recombination Genetics 78:737-756
Fields C., 1990 Information content of Caenorhabditis elegans splice site sequences varies with intron length Nucleic Acids Res 18:1509-1512
Fullerton S. M., A. Bernardo Carvalho, A. G. Clark, 2001 Local rates of recombination are positively correlated with GC content in the human genome Mol. Biol. Evol 18:1139-1142
Galtier N., G. Piganeau, D. Mouchiroud, L. Duret, 2001 GC-content evolution in mammalian genomes: the biased gene conversion hypothesis Genetics 159:907-911
Gerton J. L., J. DeRisi, R. Shroff, M. Lichten, P. O. Brown, T. D. Petes, 2000 Global mapping of meiotic recombination hotspots and coldspots in the yeast Saccharomyces cerevisiae Proc. Natl. Acad. Sci. USA 97:11383-11390
Haag E. S., J. Kimble, 2000 Regulatory elements required for development of Caenorhabditis elegans hermaphrodites are conserved in the tra-2 homologue of C. remanei, a male/female sister species Genetics 155:105-116
Hartl D. L., E. N. Moriyama, S. A. Sawyer, 1994 Selection intensity for codon bias Genetics 138:227-234
Hey J., R. M. Kliman, 2002 Interactions between natural selection, recombination and gene density in the genes of Drosophila Genetics 160:595-608
Hill W. G., A. Robertson, 1966 The effect of linkage on limits to artificial selection Genet. Res 8:269-294
Keightley P. D., A. Eyre-Walker, 1999 Deleterious mutations and the evolution of sex Science 290:331-333
Kliman R. M., J. Hey, 1993 Reduced natural selection associated with low recombination in Drosophila melanogaster Mol. Biol. Evol 10:1239-1258
Kliman R. M., J. Hey, 1994 The effects of mutation and natural selection on codon bias in the genes of Drosophila Genetics 137:1049-1056
Koch R., H. G. van Luenen, M. van der Horst, K. L. Thijssen, R. H. Plasterk, 2000 Single nucleotide polymorphisms in wild isolates of Caenorhabditis elegans Genome Res 10:1690-1696
Lerat E., C. Biémont, P. Capy, 2000 Codon usage and the origin of P elements Mol. Biol. Evol 17:467-468
Lerat E., P. Capy, C. Biémont, 2002 Codon usage by transposable elements and their host genes in five species J. Mol. Evol 54:625-637
Li W.-H., 1987 Models of nearly neutral mutations with particular implications for nonrandom usage of synonymous codons J. Mol. Evol 24:337-345
Marais G., L. Duret, 2001 Synonymous codon usage, accuracy of translation, and gene length in Caenorhabditis elegans J. Mol. Evol 52:275-280
Marais G., D. Mouchiroud, L. Duret, 2001 Does recombination improve selection on codon usage? Lessons from nematode and fly complete genomes Proc. Natl. Acad. Sci. USA 98:5688-5692
Maroni G., 1994 The organization of Drosophila genes DNA Seq 4:347-354
McVean G. A. T., B. Charlesworth, 2000 The effects of Hill-Robertson interference between weakly selected mutations on patterns of molecular evolution and variation Genetics 155:929-944
Moriyama E. N., J. R. Powell, 1997 Codon usage bias and tRNA abundance in Drosophila J. Mol. Evol 45:514-523
Moriyama E. N., J. R. Powell, 1998 Gene length and codon usage bias in Drosophila melanogaster, Saccharomyces cerevisiae and Escherichia coli Nucleic Acids Res 26:3188-3193
Mount S. M., C. Burks, G. Hertz, G. D. Stormo, O. White, C. Fields, 1992 Splicing signals in Drosophila: intron size, information content, and consensus sequences Nucleic Acids Res 20:4255-4262
Nordborg M., 2000 Linkage disequilibrium, gene trees and selfing: an ancestral recombination graph with partial self-fertilization Genetics 154:923-929
Perry J., A. Ashworth, 1999 Evolutionary rate of a gene affected by chromosomal position Curr. Biol 9:987-989
Piganeau G., R. Westrelin, B. Tourancheau, C. Gautier, 2001 Multiplicative versus additive selection in relation to genome evolution: a simulation study Genet. Res 78:171-175
Powell J. R., E. N. Moriyama, 1997 Evolution of codon usage bias in Drosophila Proc. Natl. Acad. Sci. USA 94:7784-7790
Sharp P. M., W. H. Li, 1989 On the rate of DNA sequence evolution in Drosophila J. Mol. Evol 28:398-402
Sharp P. M., M. Stenico, J. F. Peden, A. T. Lloyd, 1993 Codon usage: mutational bias, translational selection, or both? Biochem. Soc. Trans 21:835-841
Shields D. C., P. M. Sharp, 1989 Evidence that mutation patterns vary among Drosophila transposable elements J. Mol. Biol 207:843-846
Shields D. C., P. M. Sharp, D. G. Higgins, F. Wright, 1988 "Silent" sites in Drosophila genes are not neutral: evidence of selection among synonymous codons Mol. Biol. Evol 5:704-716
Smith K. N., A. Nicolas, 1998 Recombination at work for meiosis Curr. Opin. Genet. Dev 8:200-211
Stenico M., A. T. Lloyd, P. M. Sharp, 1994 Codon usage in Caenorhabditis elegans: delineation of translational selection and mutational biases Nucleic Acids Res 22:2437-2446
Takano-Shimizu T., 2001 Local changes in GC/AT substitution biases and in crossover frequencies on Drosophila chromosomes Mol. Biol. Evol 18:606-619
Yu A., C. Zhao, Y. Fan, et al. (11 co-authors) 2001 Comparison of human genetic and sequence-based physical maps Nature 409:951-953