1 Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, 7 MBL Street, Woods Hole, MA 02543, USA
2 Department of Biological Sciences, Stanford University, Stanford, CA 94305, USA
Correspondence
Jennifer J. Wernegreen
jwernegreen{at}mbl.edu
ABSTRACT |
---|
INTRODUCTION |
---|
Prokaryotic genomes are ideal for comparative analyses of the mutation-selection balance because diverse bacterial lifestyles influence mutation, selection and genetic drift, and can be linked to distinct patterns of genome evolution. Synonymous codon usage in the free-living bacteria Escherichia coli and Salmonella typhimurium is dictated by selection for translational efficiency, as evidenced by bias toward the use of 'optimal codons' at highly expressed genes (Sharp, 1991; Sharp & Li, 1987). In contrast, synonymous codon usage is shaped by mutational bias and genetic drift in many obligately host-associated bacteria, such as the aphid mutualist Buchnera aphidicola (Wernegreen & Moran, 1999) and the parasites Rickettsia prowazekii (Andersson & Sharp, 1996) and Micrococcus luteus (Ohama et al., 1990), among other species. Strong effects of mutation and drift on codon usage may reflect a reduced effective population size (Ne) caused by recurrent bottlenecks related to an intracellular lifestyle (Mira & Moran, 2002; Moran & Wernegreen, 2000). For example, independent population genetic studies of two divergent aphid species estimated the Ne of Buchnera to be approximately 10^7 (Abbot & Moran, 2002; Funk et al., 2001), whereas the Ne for the closely related bacterium E. coli has been estimated at 2·5×10^9 (Ochman & Wilson, 1987). Reduced Ne causes stronger genetic drift and decreases the efficiency of selection on preferred codons or amino acids. Among intracellular bacterial genomes, the irrevocable loss of DNA repair genes may contribute to elevated underlying mutation rates and biases. Such mutational bias can affect not only synonymous codon usage, but also amino acid usage (Sueoka, 1961), as seen in Buchnera (Palacios & Wernegreen, 2002), Rickettsia (Andersson & Sharp, 1996) and a broad array of other prokaryotic genomes (Singer & Hickey, 2000).
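As a rough illustration of why the difference in Ne matters, the sketch below compares the scaled selection coefficient Ne·s for the two population sizes cited above. The selection coefficient is a hypothetical value chosen only for illustration, and the |Ne·s| < 1 threshold is a common rule of thumb for haploid populations (the exact constant depends on the model), not a calculation from this study.

```python
# Illustrative only: compares the scaled selection coefficient Ne*s for the
# two effective population sizes cited in the text.

NE_BUCHNERA = 1e7       # approximate Ne cited for Buchnera
NE_ECOLI = 2.5e9        # Ne cited for E. coli
S_HYPOTHETICAL = 1e-8   # ASSUMPTION: a weak selective advantage of a preferred codon


def scaled_selection(ne, s):
    """Return Ne*s, the strength of selection relative to genetic drift."""
    return ne * s


def is_effectively_neutral(ne, s, threshold=1.0):
    """Rule of thumb (assumption): drift dominates when |Ne*s| < threshold."""
    return abs(scaled_selection(ne, s)) < threshold


if __name__ == "__main__":
    for name, ne in (("Buchnera", NE_BUCHNERA), ("E. coli", NE_ECOLI)):
        print(name, scaled_selection(ne, S_HYPOTHETICAL),
              is_effectively_neutral(ne, S_HYPOTHETICAL))
```

Under these assumed values the same weakly advantageous codon would behave as effectively neutral in the smaller population but remain visible to selection in the larger one, which is the qualitative argument made in the text.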
This study examines codon and amino acid usage in the full genome sequence of Wigglesworthia glossinidia brevipalpis, the γ-3 proteobacterial endosymbiont of the blood-feeding tsetse fly Glossina brevipalpis, to gain a more comprehensive understanding of the mutation-selection balance in this bacterial genome. The extremely reduced, 697 kb chromosome of Wigglesworthia includes just 621 coding regions (Akman et al., 2002), compared to the 4·6 Mb and nearly 4000 genes in the closely related E. coli genome (Blattner et al., 1997). Like many other intracellular bacteria, Wigglesworthia shows a heavily biased nucleotide composition (22 % GC; Akman et al., 2002). The tsetse–Wigglesworthia symbiosis is obligate and mutualistic, and it is believed that Wigglesworthia plays a role in vitamin synthesis for the host (Nogge, 1981). Tsetse flies cured of Wigglesworthia infection experience lower fecundity, although a supplementary diet of B-complex vitamins can restore the host's reproductive ability (Nogge, 1981). The retention of vitamin biosynthetic capabilities in the small Wigglesworthia genome may reflect host-level selection to supply these nutrients, which are lacking in the tsetse diet of vertebrate blood [analogous to aphid host-level selection on amino acid production by Buchnera (Shigenobu et al., 2000)].
Wigglesworthia provides an independent lineage in which to compare the effects of endosymbiosis on the mutation-selection balance in bacteria. Although Wigglesworthia is closely related to the relatively well characterized Buchnera, the two lineages apparently represent independent transitions to an endosymbiotic lifestyle within the γ-3 Proteobacteria (Wernegreen et al., 2003). Shared lifestyle characteristics of Wigglesworthia and Buchnera include an obligate association with insects, maternal transmission to host offspring and strict cospeciation with their hosts, and may account for the overall similarities between their AT-rich, small genomes. The Wigglesworthia–tsetse fly endosymbiosis is estimated to be approximately 40 million years old (Moran & Wernegreen, 2000), compared to the much older, 150–250 million year old Buchnera–aphid endosymbiosis (Moran et al., 1993). Here, we test whether these phylogenetically independent endosymbiotic bacteria of different ages experience similar evolutionary processes that affect genome-wide variation in predictable ways and with the same severity.
METHODS |
---|
Differences in amino acid usage between high- and low-expression genes.
We calculated differences in the relative amino acid usage (RAAU) of putative high- and low-expression genes for each amino acid using the index D{H,L} described previously (Palacios & Wernegreen, 2002). This statistic quantifies the difference in RAAU for each amino acid between high- and low-expression genes. For this calculation, we determined RAAU using General Codon Usage Analysis (McInerney, 1998b). We estimated the significance of D{H,L} values by a randomization test (Sokal & Rohlf, 1995) including 1000 permutations of gene-specific amino acid counts. We designated amino acids as AT-rich, GC-rich or unbiased (neither AT- nor GC-rich) based on the base composition of their associated codons, following accepted conventions (Foster et al., 1997). For brevity, in this paper we refer to amino acids encoded by relatively GC-rich codons as 'GC-rich amino acids', to those encoded by relatively AT-rich codons as 'AT-rich amino acids' and to amino acids that are neither GC- nor AT-rich as 'unbiased amino acids'.
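The randomization test described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the exact definition of D{H,L} is given in Palacios & Wernegreen (2002) and is not reproduced here, so the sketch assumes the simplest form (RAAU in high-expression genes minus RAAU in low-expression genes). The function names and the toy gene counts are hypothetical.

```python
import random


def raau(genes, aa):
    """Relative usage of amino acid `aa` pooled over a list of per-gene count dicts."""
    total = sum(sum(g.values()) for g in genes)
    return sum(g.get(aa, 0) for g in genes) / total


def d_hl(high, low, aa):
    """ASSUMED form of the index: RAAU(high-expression) - RAAU(low-expression)."""
    return raau(high, aa) - raau(low, aa)


def permutation_p(high, low, aa, n_perm=1000, seed=0):
    """Two-sided randomization test: shuffle gene labels and recompute the index."""
    rng = random.Random(seed)
    observed = d_hl(high, low, aa)
    pooled = list(high) + list(low)
    n_high = len(high)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        d = d_hl(pooled[:n_high], pooled[n_high:], aa)
        if abs(d) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated P value is never exactly zero.
    return observed, (extreme + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data (hypothetical): per-gene amino acid counts.
    high = [{"A": 8, "F": 2}] * 3
    low = [{"A": 2, "F": 8}] * 3
    print(permutation_p(high, low, "A"))
```

Shuffling whole genes rather than individual residues keeps each gene's amino acid counts intact, which matches the description of permuting gene-specific counts.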
Estimation of amino acid conservation levels.
We explored amino acid differences between Wigglesworthia and E. coli to help distinguish between two hypotheses to explain an observed abundance of GC-rich amino acids at highly expressed Wigglesworthia genes: selection against AT-rich amino acids at high-expression genes or selection for overall maintenance of amino acids since divergence from an ancestor with moderate base composition (Fig. 1). Endosymbionts within the γ-3 Proteobacteria, including Wigglesworthia, have shifted to extreme AT base compositional bias since their divergence from an ancestor with a relatively moderate base composition (Charles et al., 2001; Heddi et al., 1998). Given the close phylogenetic position of E. coli, its moderate base composition of 50·8 % GC (Blattner et al., 1997) and its slow rate of amino acid substitution, all relative to Wigglesworthia, many amino acid differences between E. coli and Wigglesworthia proteins most likely reflect changes in the lineage leading to Wigglesworthia. Therefore, if we consider E. coli protein sequences a proxy for ancestral states, we can develop a heuristic approach to study the relatively rapid evolution of extremely biased amino acid composition in Wigglesworthia.
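The heuristic above amounts to classifying each aligned site by the E. coli residue and counting how often the Wigglesworthia residue differs. The sketch below assumes a pre-computed pairwise protein alignment with '-' as the gap character, and it assumes the GARP (GC-rich) and FYMINK (AT-rich) groupings commonly attributed to Foster et al. (1997), which are consistent with the amino acids named in this paper. The function names and example sequences are hypothetical.

```python
# ASSUMPTION: amino acid classes follow the GARP / FYMINK grouping.
GC_RICH = set("GARP")
AT_RICH = set("FYMINK")
CLASSES = ("GC-rich", "AT-rich", "unbiased")


def aa_class(residue):
    """Classify an amino acid by the base composition of its codons."""
    if residue in GC_RICH:
        return "GC-rich"
    if residue in AT_RICH:
        return "AT-rich"
    return "unbiased"


def divergence_by_class(ecoli_aln, wigg_aln):
    """Uncorrected divergence per class, partitioned by the E. coli residue.

    Sites with a gap in either sequence are skipped. Returns None for a
    class with no comparable sites.
    """
    if len(ecoli_aln) != len(wigg_aln):
        raise ValueError("aligned sequences must have equal length")
    sites = dict.fromkeys(CLASSES, 0)
    diffs = dict.fromkeys(CLASSES, 0)
    for e, w in zip(ecoli_aln.upper(), wigg_aln.upper()):
        if e == "-" or w == "-":
            continue
        c = aa_class(e)
        sites[c] += 1
        if e != w:
            diffs[c] += 1
    return {c: (diffs[c] / sites[c] if sites[c] else None) for c in CLASSES}


if __name__ == "__main__":
    # Toy alignment (hypothetical sequences).
    print(divergence_by_class("GAFKLVG", "GKFNLI-"))
```

Because the divergence is uncorrected for multiple substitutions, values from such a count are best read as lower bounds on the true amount of change.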
In addition, we explored the relationship between protein conservation and gene expression by estimating pairwise non-synonymous divergences (dN) between all Wigglesworthia and E. coli orthologues using the CODEML program in the PAML version 3.13 software package (Yang, 2002). We tested for a significant difference between the mean dN values of the 56 putative high-expression genes and the mean dN values of all other orthologue pairs. We also compared pairwise dN estimates to CAI values, using the respective CAI value of each E. coli gene in all orthologue pairs.
RESULTS |
---|
If translational selection influences codon usage in Wigglesworthia, then codon bias is expected to differ between high- and low-expression genes, as observed in E. coli (Sharp, 1991) and other genomes (Andersson & Kurland, 1990). We tested this prediction by performing correspondence analysis (COA) on the relative synonymous codon usage (RSCU) of 564 Wigglesworthia genes, and by testing for distinct codon usage patterns between loci with predicted high versus low expression. Axes 1 and 2 account for just 10 and 6 % of the observed variation in RSCU, respectively, distributed across a total of 19 axes. This result indicates little variation among genes in codon usage patterns (Table 2). Genes plotted against the first two axes were distinguished neither by predicted expression levels (Fig. 2a) nor by protein GRAVY score (hydrophobicity) (Fig. 2b). Rather, a strong correlation between axis 1 and GC3 (Spearman's rank correlation coefficient, rs=−0·60, P<0·0001) suggests that local variation in synonymous base composition drives variation in codon usage patterns. The location of genes along axes 2 and 3 reveals no distinction based on expression level, hydrophobicity, aromaticity or GC content (not shown).
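For readers unfamiliar with the two per-gene quantities used here, the sketch below computes RSCU (observed codon count divided by the count expected if all synonymous codons for that amino acid were used equally) and GC3 (the G+C fraction at third codon positions) for a single coding sequence. It assumes the standard genetic code and an in-frame sequence, and it uses a simple GC3 variant that counts all sense codons. It does not perform the correspondence analysis itself, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split an in-frame coding sequence into complete codons."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def rscu(seq):
    """RSCU per codon; skips stop codons, single-codon and unused families."""
    counts = Counter(c for c in codons(seq) if c in CODON_TABLE)
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    result = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or len(family) < 2 or total == 0:
            continue
        for c in family:
            result[c] = counts[c] * len(family) / total
    return result


def gc3(seq):
    """Fraction of G or C at third positions of sense codons (None if no codons)."""
    third = [c[2] for c in codons(seq)
             if c in CODON_TABLE and CODON_TABLE[c] != "*"]
    if not third:
        return None
    return sum(b in "GC" for b in third) / len(third)


if __name__ == "__main__":
    toy = "AAAAAAAAGTTTTTC"  # hypothetical sequence: Lys Lys Lys Phe Phe
    print(rscu(toy)["AAA"], gc3(toy))
```

An RSCU of 1 means a codon is used as often as expected under equal usage, so a gene-by-codon table of these values is the natural input for an ordination such as COA.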
Our results show lower uncorrected amino acid divergence at high-expression genes (0·35) than at low-expression genes (0·54) (Table 4a), indicating fewer amino acid changes at high-expression loci. Likewise, uncorrected pairwise divergences for all amino acid partition types in E. coli (whether an amino acid site in E. coli has a GC-rich, AT-rich or unbiased amino acid) are significantly lower in high-expression genes than in low-expression genes [P<0·05 for all comparisons (Student's t distribution)] (Table 4a). The apparent conservation of amino acids that are AT-rich in E. coli may be partially explained by the lack of correction for multiple substitutions and the extreme AT bias of the Wigglesworthia genome. Consistent with the hypothesis of overall amino acid conservation (Fig. 1b), the GC-rich amino acids of E. coli that differ in Wigglesworthia show the same relative frequencies of GC-rich, AT-rich and unbiased amino acids in Wigglesworthia high- and low-expression genes (χ2=0·13, P=0·94; Table 4b). This similarity of configurations at high- and low-expression genes is predicted under the model of selection for overall amino acid conservation at high-expression loci, but not under the hypothesis of selection against AT-rich amino acids (Fig. 1a).
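The comparison of configurations reported above is a test of homogeneity on a 2×3 table (expression class by replacement amino acid class). The sketch below computes the Pearson chi-square statistic for such a table; for 2 degrees of freedom the P value has the closed form exp(−χ2/2), under which a statistic of 0·13 corresponds to P≈0·94. The counts in the example are hypothetical and are not the values in Table 4b.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def p_value_df2(stat):
    """Upper-tail P value of a chi-square statistic; valid for 2 d.f. only."""
    return math.exp(-stat / 2)


if __name__ == "__main__":
    # HYPOTHETICAL counts: rows = high- and low-expression genes,
    # columns = replacement residue is GC-rich, AT-rich or unbiased.
    counts = [[12, 40, 48], [60, 190, 250]]
    stat, df = chi_square_statistic(counts)
    print(stat, df, p_value_df2(stat))
```

A large P value in this setting means the two expression classes cannot be distinguished by the kinds of replacements they accumulate, which is the pattern expected under overall conservation.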
DISCUSSION |
---|
Synonymous codon usage
Adaptive codon bias shaped by translational selection is characterized by distinct synonymous codon usage at high- and low-expression genes. This type of selection shapes codon usage in many free-living microbial species, where the efficiency of gene translation is related to maximum growth rates (Bulmer, 1991). However, our COA of RSCU in the Wigglesworthia genome shows no distinction between high- and low-expression genes, and thus suggests that translational selection has little, if any, effect on codon bias (Fig. 2). Our identification of putative high- and low-expression genes was based on known expression patterns in E. coli and, in the case of ribosomal proteins, across all studied bacterial species (Srivastava & Schlessinger, 1990). This criterion has important limitations, since the acquisition of an endosymbiotic lifestyle may alter relative gene expression levels. However, experimental analyses of Wigglesworthia proteins show very high expression levels of the chaperonin groEL (Aksoy, 1995). Therefore, the observation that groEL did not show distinct codon usage compared to other genes, including those of putative lower expression, strengthens the evidence against adaptive codon usage in Wigglesworthia.
Mutational biases that may shape codon usage include strand-specific biases, in which the two DNA strands experience different configurations of mutations. For example, the relative abundance of Ts and Gs in the leading strand of replication may be explained by strand-specific mutational spectra during replication (Francino & Ochman, 1999; Rocha & Danchin, 2001). Such strand-specific biases can play a dominant role in shaping codon usage [e.g. Borrelia burgdorferi (McInerney, 1998a)]. However, we found no relationship between strand orientation and codon usage patterns in Wigglesworthia, consistent with the notable lack of strand-specific nucleotide bias, or skew, in this genome (Akman et al., 2002). Rather, the predominant use of A- and T-ending codons across the genome indicates that strong directional AT mutational bias (from GC to AT pairs) drives codon usage in Wigglesworthia. In addition, the observed correlation of Nc and GC3 implies that variation in codon bias among Wigglesworthia genes reflects slight differences in local base composition. Directional mutation also drives codon usage in many intracellular bacterial genomes such as the AT-rich genomes of Buchnera (Wernegreen & Moran, 1999) and Rickettsia prowazekii (Andersson & Sharp, 1996), and the GC-rich genome of Micrococcus luteus (Ohama et al., 1990).
Changes in population structure that accompany an endosymbiotic lifestyle may explain the lack of translational selection and strong effects of mutational bias in the Wigglesworthia genome. Although the population dynamics of Wigglesworthia are currently unclear, these bacteria may experience bottlenecks during transmission to new host offspring through the tsetse milk gland (Ma & Denlinger, 1974), analogous to the bottlenecks experienced by Buchnera when transmitted to aphid eggs or embryos (Mira & Moran, 2002). Such bottlenecks would reduce Ne, increase genetic drift and limit the ability of weak selection to maintain optimal codons in a gene (Bulmer, 1991). In this case, background mutational biases would tend to dominate over selection and eventually shape codon usage. In addition to the potential effects of genetic drift, the substantial loss of DNA repair functions, such as the nucleotide excision repair genes uvrABC, may also contribute to the strong mutational biases in Wigglesworthia and other endosymbiont genomes. Reduced tRNA gene number may also explain the lack of adaptive codon bias in certain prokaryote genomes, a hypothesis that has been considered in previous studies (Andersson & Sharp, 1996; Palacios & Wernegreen, 2002). Compared to 86 tRNA genes in E. coli (Blattner et al., 1997), the reduced genome of Wigglesworthia contains just 34 (Akman et al., 2002), resulting in only one or a few tRNA genes per amino acid or codon. However, even those amino acid families with only one corresponding tRNA gene (representing extremely biased tRNA pools) do not show distinct RSCU between high- and low-expression genes (data not shown).
Amino acid usage
Previous studies have demonstrated strong effects of mutational bias on amino acid composition of bacterial proteins (Singer & Hickey, 2000). Among bacterial endosymbionts, strong effects of mutational biases on synonymous codon and amino acid usage have been interpreted as shifts of the mutation-selection balance due to elevated levels of genetic drift (Clark et al., 1998, 2001; Wernegreen & Moran, 1999). The extremely AT-rich Buchnera exemplifies this evolutionary phenomenon, as it lacks adaptive codon usage and exhibits a genome-wide increased frequency of AT-rich amino acids (Clark et al., 1999). However, the strength of mutational bias on amino acid usage in Buchnera is attenuated in high-expression genes, where selection on amino acid composition is apparently strongest (Palacios & Wernegreen, 2002). Likewise, the full genome sequence of Wigglesworthia documented a predominance of amino acids with AT-rich codons (Akman et al., 2002). In this study, we found that Wigglesworthia genes of putative high and low expression form distinct groups on the first two axes of COA of RAAU (Fig. 3). A major force driving this distinction is the tendency of high-expression genes to use amino acids with relatively GC-rich codons. Of the three amino acids that are over-represented in high-expression genes (i.e. significant D{H,L} values), Ala and Arg are GC-rich while Val is considered unbiased. Likewise, Gly tends to be over-represented at high-expression genes and is also GC-rich. These results suggest that stronger selection on amino acid usage in high-expression genes counters the effects of a genome-wide mutational bias toward AT-rich amino acids.
We considered three possible explanations for the relative GC-richness of high-expression genes. First, this pattern may reflect selection against aromatic and energetically costly amino acids, two of which (Phe and Tyr) are also AT-rich (Akashi & Gojobori, 2002). This hypothesis predicts that aromatic amino acids will be under-represented at high-expression genes. For example, the proteomes of E. coli and Bacillus subtilis reflect selection to enhance metabolic efficiency by reducing the abundance of aromatic and other energetically expensive amino acids in high-expression genes (Akashi & Gojobori, 2002). This prediction partially holds in Wigglesworthia, since the aromatic amino acid Phe is significantly under-represented at high-expression genes (i.e. significantly negative D{H,L}). However, while the aromatic amino acids Trp and Tyr are less frequent in high-expression genes, this result is not significant (Table 3), suggesting that selection against aromaticity does not completely explain the distinct profiles at high-expression genes.
Second, the distinct RAAU patterns may result from selection against the use of AT-rich amino acids at high-expression genes. Consistent with that hypothesis, four of the six AT-rich amino acids (all but Lys and Asn) are under-represented at high-expression genes, and two have significant D{H,L} values. We compared specific amino acid differences between E. coli and Wigglesworthia to distinguish this hypothesis from a third: that relative GC-richness of high-expression genes reflects overall conservation of amino acids since divergence from a relatively GC-rich ancestor. In this case, high-expression genes are expected to show lower amino acid divergence compared to other loci. We found that amino acids of high-expression genes are significantly more conserved (mean dN=0·66 for high-expression genes, mean dN=1·25 for all genes; difference in means is significantly greater than zero; P=5·75×10^-15, Student's t distribution of differences). Consistent with this result, we found a significant negative correlation between amino acid divergence and gene expression level as estimated by the E. coli CAI (Fig. 6) (rs=−0·42, P<0·0001). This negative relationship has been shown before in E. coli and S. typhimurium (Sharp, 1991). Given the moderate 50·8 % GC nucleotide composition of E. coli (Blattner et al., 1997), the more frequent use of GC-rich amino acids in Wigglesworthia high-expression genes is likely to reflect the maintenance of ancestral amino acid composition, rather than selection against AT-rich amino acids per se.
Although the specific mechanism of transmission of Wigglesworthia to host offspring is unclear, transmission occurs maternally via the tsetse milk gland (Ma & Denlinger, 1974). The similar genome-wide patterns in Wigglesworthia and Buchnera observed in this and previous studies suggest that Wigglesworthia also experiences genetic drift, perhaps due to repeated bottlenecks associated with transmission through host generations. This may be corroborated as more information on the Wigglesworthia–tsetse endosymbiosis becomes available. In addition to small population size and genetic drift, other peculiarities of the bacterial intracellular lifestyle may drive extreme genome reduction and strong AT mutational bias. Effects of genetic drift may be enhanced by irreversible gene loss in endosymbionts that lack opportunities and mechanisms for recombination with genetically distinct strains (Moran & Wernegreen, 2000). Previous studies have suggested a universal AT mutational bias, because many types of spontaneous mutations (e.g. the deamination of cytosine) cause GC to AT changes (Birdsell, 2002). The effects of this mutational bias may be more pronounced in small genomes that are deficient in DNA repair and that experience genetic drift (e.g. mitochondria, Buchnera and many other small-genome bacteria). In addition, a possible relaxation of selection in the intracellular environment compared to free-living existence may allow more rapid gene loss and stronger mutational biases.
Host-level selection might also play a role in Wigglesworthia amino acid usage, as Wigglesworthia has very limited capability for amino acid biosynthesis but possesses numerous transporters that mediate the acquisition of amino acids from the tsetse host (Akman et al., 2002). Host-level selection on amino acid usage is predicted to occur when the host has certain amino acid deficiencies, whether due to limited biosynthetic capacity or restricted diet. Because the tsetse fly feeds on an amino-acid-rich diet of vertebrate blood, it is not clear which, if any, amino acids could limit host growth and drive such host-level selection. In fact, we found no evidence that host-level selection shapes amino acid usage in Wigglesworthia. That is, the distinct amino acid profile of high-expression Wigglesworthia genes reflects conservation of these proteins, not a decreased use of rare or energetically expensive amino acids. In striking contrast, Buchnera retains biosynthetic capacity for essential amino acids, which it supplies to the aphid host in exchange for non-essential amino acids (Shigenobu et al., 2000). For example, the aphid depends on Buchnera to supply Trp and Leu, which occur at particularly low concentrations in the aphids' plant sap diet (Sandström & Moran, 1999). The location of Trp and Leu biosynthetic genes on multicopy plasmids in some Buchnera species has been interpreted as evidence of host-level selection (Baumann et al., 1999; Lai et al., 1994; Wernegreen & Moran, 2001). The patterns of amino acid usage in Buchnera and Wigglesworthia are strikingly similar, except for Leu and Trp, which are significantly under-represented in Buchnera high-expression genes (Palacios & Wernegreen, 2002), but not significantly under-represented in Wigglesworthia high-expression genes.
The relative youth of the Wigglesworthia endosymbiosis is corroborated by genetic signatures of parasitism in the Wigglesworthia genome, such as maintenance of genes for a flagellum and a more robust cell membrane structure that are lacking in Buchnera (Akman et al., 2002). However, the effects of the intracellular lifestyle on genome size and nucleotide and amino acid composition of Wigglesworthia have been extreme, as shown previously (Akman et al., 2002) and in this study. Comparisons of diverse Buchnera lineages show that severe genome reduction and AT-biased amino acid changes occurred very early in the symbiosis, before the divergence of major aphid subfamilies (Clark et al., 1999; van Ham et al., 2003; Wernegreen et al., 2000). Such rapid genome changes apparently occur in the relatively young Wigglesworthia as well, as this endosymbiont also shows severe effects of AT mutational bias on both codon and amino acid usage.
ACKNOWLEDGEMENTS |
---|
REFERENCES |
---|
Akashi, H. (1994). Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics 136, 927–935.
Akashi, H. (1997). Codon bias evolution in Drosophila. Population genetics of mutation-selection drift. Gene 205, 269–278.
Akashi, H. & Gojobori, T. (2002). Metabolic efficiency and amino acid composition in the proteomes of Escherichia coli and Bacillus subtilis. Proc Natl Acad Sci U S A 99, 3695–3700.
Akman, L., Yamashita, A., Watanabe, H., Oshima, K., Shiba, T., Hattori, M. & Aksoy, S. (2002). Genome sequence of the endocellular obligate symbiont of tsetse flies, Wigglesworthia glossinidia. Nat Genet 32, 402–407.
Aksoy, S. (1995). Molecular analysis of the endosymbionts of tsetse flies: 16S rDNA locus and over-expression of a chaperonin. Insect Mol Biol 4, 23–29.
Andersson, S. G. & Kurland, C. G. (1990). Codon preferences in free-living microorganisms. Microbiol Rev 54, 198–210.
Andersson, S. G. & Sharp, P. M. (1996). Codon usage and base composition in Rickettsia prowazekii. J Mol Evol 42, 525–536.
Baumann, L., Baumann, P. & Clark, M. A. (1996). Levels of Buchnera aphidicola chaperonin groEL during growth of the aphid Schizaphis graminum. Curr Microbiol 32, 279–285.
Baumann, L., Baumann, P., Moran, N. A., Sandstrom, J. & Thao, M. L. (1999). Genetic characterization of plasmids containing genes encoding enzymes of leucine biosynthesis in endosymbionts (Buchnera) of aphids. J Mol Evol 48, 77–85.
Bernardi, G. (1985). Codon usage and genome composition. J Mol Evol 22, 363–365.
Birdsell, J. A. (2002). Integrating genomics, bioinformatics, and classical genetics to study the effects of recombination on genome evolution. Mol Biol Evol 19, 1181–1197.
Blattner, F. R., Plunkett, G. I., Bloch, C. A. & 14 other authors (1997). The complete genome sequence of Escherichia coli K12. Science 277, 1453–1474.
Bulmer, M. (1991). The selection-mutation-drift theory of synonymous codon usage. Genetics 129, 897–907.
Charles, H., Heddi, A. & Rahbe, Y. (2001). A putative insect intracellular endosymbiont stem clade, within the Enterobacteriaceae, inferred from phylogenetic analysis based on a heterogeneous model of DNA evolution. C R Acad Sci III 324, 489–494.
Clark, M. A., Baumann, L. & Baumann, P. (1998). Sequence analysis of a 34·7-kb DNA segment from the genome of Buchnera aphidicola (endosymbiont of aphids) containing groEL, dnaA, the atp operon, gidA, and rho. Curr Microbiol 36, 158–163.
Clark, M. A., Moran, N. A. & Baumann, P. (1999). Sequence evolution in bacterial endosymbionts having extreme base compositions. Mol Biol Evol 16, 1586–1598.
Clark, M. A., Baumann, L., Thao, M. L., Moran, N. A. & Baumann, P. (2001). Degenerative minimalism in the genome of a psyllid endosymbiont. J Bacteriol 183, 1853–1861.
de Miranda, A. B., Alvarez-Valin, F., Jabbari, K., Degrave, W. M. & Bernardi, G. (2000). Gene expression, amino acid conservation, and hydrophobicity are the main factors shaping codon preferences in Mycobacterium tuberculosis and Mycobacterium leprae. J Mol Evol 50, 45–55.
D'Onofrio, G., Mouchiroud, D., Aissani, B., Gautier, C. & Bernardi, G. (1991). Correlations between the compositional properties of human genes, codon usage, and amino acid composition of proteins. J Mol Evol 32, 504–510.
Foster, P. G., Jermiin, L. S. & Hickey, D. A. (1997). Nucleotide composition bias affects amino acid content in proteins coded by animal mitochondria. J Mol Evol 44, 282–288.
Francino, M. P. & Ochman, H. (1999). A comparative genomics approach to DNA asymmetry. Ann N Y Acad Sci 870, 428–431.
Funk, D. J., Wernegreen, J. J. & Moran, N. A. (2001). Intraspecific variation in symbiont genomes: bottlenecks and the aphid–Buchnera association. Genetics 157, 477–489.
Greenacre, M. (1984). Theory and Applications of Correspondence Analysis. London: Academic Press.
Heddi, A., Charles, H., Khatchadourian, C., Bonnot, G. & Nardon, P. (1998). Molecular characterization of the principal symbiotic bacteria of the weevil Sitophilus oryzae: a peculiar G+C content of an endocytobiotic DNA. J Mol Evol 47, 52–61.
Kreitman, M. & Antezana, M. (1999). The Population and Evolutionary Genetics of Codon Bias. Cambridge: Cambridge University Press.
Lafay, B., Atherton, J. C. & Sharp, P. M. (2000). Absence of translationally selected synonymous codon usage bias in Helicobacter pylori. Microbiology 146, 851–860.
Lai, C. Y., Baumann, L. & Baumann, P. (1994). Amplification of trpEG: adaptation of Buchnera aphidicola to an endosymbiotic association with aphids. Proc Natl Acad Sci U S A 91, 3819–3823.
Ma, W.-C. & Denlinger, D. L. (1974). Secretory discharge and microflora of milk gland in tsetse flies. Nature 247, 301–303.
Maddison, D. & Maddison, W. (2002). MacClade: Analysis of Phylogeny and Character Evolution. Sunderland, MA: Sinauer Associates.
McInerney, J. O. (1998a). Replicational and transcriptional selection on codon usage in Borrelia burgdorferi. Proc Natl Acad Sci U S A 95, 10698–10703.
McInerney, J. O. (1998b). GCUA (General Codon Usage Analysis). Bioinformatics 14, 372–373.
Mira, A. & Moran, N. A. (2002). Estimating population size and transmission bottlenecks in maternally transmitted endosymbiotic bacteria. Microb Ecol 44, 137–143.
Moran, N. A. & Wernegreen, J. J. (2000). Lifestyle evolution in symbiotic bacteria: insights from genomics. Trends Ecol Evol 15, 321–326.
Moran, N. A., Munson, M. A., Baumann, P. & Ishikawa, H. (1993). A molecular clock in endosymbiotic bacteria is calibrated using the insect hosts. Proc R Soc Lond B Biol Sci 253, 167–171.
Nogge, G. (1981). Significance of symbionts for the maintenance of an optimal nutritional state for successful reproduction in hematophagous arthropods. Parasitology 82, 101–104.
Ochman, H. & Wilson, A. C. (1987). Evolution in bacteria: evidence for a universal substitution rate in cellular genomes. J Mol Evol 26, 74–86.
Ohama, T., Muto, A. & Osawa, S. (1990). The role of GC-biased mutation pressure on synonymous codon choice in Micrococcus luteus, a bacterium with a high genomic GC-content. Nucleic Acids Res 18, 1565–1569.
Palacios, C. & Wernegreen, J. J. (2002). A strong effect of AT mutational bias on amino acid usage in Buchnera is mitigated at high expression genes. Mol Biol Evol 19, 1575–1584.
Powell, J. R. & Moriyama, E. N. (1997). Evolution of codon usage bias in Drosophila. Proc Natl Acad Sci U S A 94, 7784–7790.
Rocha, E. P. & Danchin, A. (2001). Ongoing evolution of strand composition in bacterial genomes. Mol Biol Evol 18, 1789–1799.
Sandström, J. & Moran, N. (1999). How nutritionally imbalanced is phloem sap for aphids? Entomol Exp Appl 91, 203–210.
Sharp, P. M. (1991). Determinants of DNA sequence divergence between Escherichia coli and Salmonella typhimurium: codon usage, map position, and concerted evolution. J Mol Evol 33, 23–33.
Sharp, P. M. & Li, W. H. (1987). The Codon Adaptation Index: a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res 15, 1281–1295.
Shigenobu, S., Watanabe, H., Hattori, M., Sakaki, Y. & Ishikawa, H. (2000). Genome sequence of the endocellular bacterial symbiont of aphids Buchnera sp. APS. Nature 407, 81–86.
Singer, G. A. & Hickey, D. A. (2000). Nucleotide bias causes a genomewide bias in the amino acid composition of proteins. Mol Biol Evol 17, 1581–1588.
Sokal, R. R. & Rohlf, F. J. (1995). Biometry. New York: W. H. Freeman.
Srivastava, A. K. & Schlessinger, D. (1990). Mechanism and regulation of bacterial ribosomal RNA processing. Annu Rev Microbiol 44, 105–129.
Sueoka, N. (1961). Compositional correlation between deoxyribonucleic acid and protein. Cold Spring Harb Symp Quant Biol 26, 35–43.
Swofford, D. L. (2002). PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Sunderland, MA: Sinauer Associates.
van Ham, R. C., Kamerbeek, J., Palacios, C. & 13 other authors (2003). Reductive genome evolution in Buchnera aphidicola. Proc Natl Acad Sci U S A 100, 581–586.
Wernegreen, J. J. & Moran, N. A. (1999). Evidence for genetic drift in endosymbionts (Buchnera): analyses of protein-coding genes. Mol Biol Evol 16, 83–97.
Wernegreen, J. J. & Moran, N. A. (2001). Vertical transmission of biosynthetic plasmids in aphid endosymbionts (Buchnera). J Bacteriol 183, 785–790.
Wernegreen, J. J., Ochman, H., Jones, I. B. & Moran, N. A. (2000). Decoupling of genome size and sequence divergence in a symbiotic bacterium. J Bacteriol 182, 3867–3869.
Wernegreen, J., Degnan, P., Lazarus, A., Palacios, C. & Bordenstein, S. (2003). Genome evolution in an insect cell: distinct features of an ant–bacterial partnership. Biol Bull 204, 221–231.
Wright, F. (1990). The 'effective number of codons' used in a gene. Gene 87, 23–29.
Yang, Z. (2002). Phylogenetic Analysis by Maximum Likelihood (PAML), version 3.12. London: University College London.
Received 31 March 2003;
revised 5 June 2003;
accepted 18 June 2003.