Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, Massachusetts
Abstract
Introduction
Vertically transmitted, obligate endosymbionts may have relatively small effective population size (Ne) caused by recurrent bottlenecks upon transmission between host generations (Moran 1996) and limited genetic recombination between endosymbionts of different hosts (Moran and Baumann 1994; Funk et al. 2000; Wernegreen and Moran 2001). Therefore, the efficacy of selection may be reduced in intracellular replicating genomes (Muller 1964; Ohta 1973; Moran 1996) compared with free-living, recombining organisms such as E. coli, which are thought to have large long-term Ne (Selander, Caugant, and Whittam 1987).
Analyses of synonymous codon usage and amino acid composition are useful tools to explore shifts in the mutation-selection balance across bacterial species with different lifestyles. For example, in contrast to adaptive codon usage in E. coli and other free-living bacteria (Ikemura 1981; Bennetzen and Hall 1982), synonymous codon usage of intracellular pathogens such as Mycoplasma genitalium and Rickettsia prowazekii corresponds with local base compositional biases, and selection seems to have little effect (Andersson and Sharp 1996; McInerney 1997). Likewise, Buchnera shows an extreme AT bias at synonymous codon positions and spacer regions (Shigenobu et al. 2000) and lacks the adaptive codon bias shown by E. coli (Wernegreen and Moran 1999). Analysis of several protein-coding genes in this endosymbiont shows that mutational bias and drift drive not only codon usage but also amino acid changes (Ohtaka and Ishikawa 1993; Moran 1996; Brynnel et al. 1998; Clark, Baumann, and Baumann 1998; Clark, Moran, and Baumann 1999) and may contribute to gene loss (Mira, Ochman, and Moran 2001; Silva, Latorre, and Moya 2001).
Recently, full genome sequence data have strengthened multivariate analyses to explore factors that drive variation in amino acid and codon usage among genes within a given genome (e.g., de Miranda et al. 2000; Romero, Zavala, and Musto 2000). In this article, we build upon previous genome-level studies by employing multivariate analysis to identify major factors shaping amino acid usage in the full genomes of Buchnera APS and E. coli K-12. Our results argue that mutation and selection have strong but distinct effects on protein evolution in these two bacterial species.
Material and Methods
Multivariate Analysis of Amino Acid Composition
We used correspondence analysis (COA; Greenacre 1984) to identify the major factors that shape variation in amino acid usage among proteins of Buchnera and E. coli, as implemented by CodonW v. 1.4.2 for UNIX (available from John Peden at http://www.molbiol.ox.ac.uk/cu/). Because COA vectors may be affected by unusual amino acid usage of Buchnera plasmid-encoded proteins, plasmid genes were added after COA, and their positions were calculated based on vectors obtained from chromosomal genes only. Major axes did not distinguish plasmid from chromosomal genes of Buchnera.
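As an illustration of what a correspondence analysis computes, the sketch below extracts only the first axis of a gene-by-amino-acid count table. It is a minimal, standard-library sketch under stated assumptions, not the CodonW implementation: the function name `coa_first_axis`, the use of raw counts, and the power-iteration shortcut are all choices made here for illustration, and CodonW may preprocess the data differently.

```python
import math

def coa_first_axis(counts, n_iter=200):
    """Return (row coordinates on axis 1, fraction of total inertia explained).

    counts: list of rows (one per gene), each a list of amino acid counts.
    Assumes every row and every column has a non-zero total.
    """
    n_rows, n_cols = len(counts), len(counts[0])
    total = float(sum(sum(row) for row in counts))
    row_mass = [sum(row) / total for row in counts]
    col_mass = [sum(row[j] for row in counts) / total for j in range(n_cols)]

    # Standardized residuals from the independence model.
    resid = [[(counts[i][j] / total - row_mass[i] * col_mass[j])
              / math.sqrt(row_mass[i] * col_mass[j])
              for j in range(n_cols)] for i in range(n_rows)]
    inertia = sum(x * x for row in resid for x in row)
    if inertia == 0.0:
        return [0.0] * n_rows, 0.0

    def project(v):
        # Multiply the residual matrix by a column-space vector.
        return [sum(r[j] * v[j] for j in range(n_cols)) for r in resid]

    # Power iteration on resid^T resid converges to the dominant
    # right singular vector (the first COA axis).
    v = [1.0 + 0.1 * j for j in range(n_cols)]
    for _ in range(n_iter):
        rv = project(v)
        w = [sum(resid[i][j] * rv[i] for i in range(n_rows))
             for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    rv = project(v)                       # singular value times left vector
    axis_inertia = sum(x * x for x in rv)  # squared singular value
    coords = [rv[i] / math.sqrt(row_mass[i]) for i in range(n_rows)]
    return coords, axis_inertia / inertia

# Hypothetical toy table: three "genes" by two "amino acids".
coords, fraction = coa_first_axis([[10, 2], [3, 9], [6, 6]])
```

The sign of an axis is arbitrary, so only the relative positions of genes along it are meaningful. Supplementary rows, such as the plasmid genes described above, could be projected onto the same vector after the analysis.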
Identifying Sources of Trends in Amino Acid Usage
We used nonparametric tests of association to test the significance of associations between the position of loci (or amino acids) on principal axes of COA and 39 parameters relating to properties of loci or amino acids (JMP v. 4; SAS Institute). Because multiple tests were performed, we adjusted the type I error rate (α) by means of the Bonferroni correction (Sokal and Rohlf 1995, pp. 236-240). Parameters of loci included several measures of nucleotide composition (e.g., AT skew, defined as (A - T)/(A + T), and base composition at each codon position), gene length, relative frequency of aromatic amino acids, and the overall hydropathicity score of a protein (GRAVY; Kyte and Doolittle 1982). Properties of amino acids included molecular weight, hydropathy level (i.e., degree of hydrophilicity or hydrophobicity), and AT-richness of codons scored as in Clark, Moran, and Baumann (1999). In this study, we selected the four amino acids at each extreme of that scale to define "GC-rich" and "AT-rich" categories (tables 1 and 2). We also included Leu in the AT-rich category, as this amino acid is encoded by TT[AG] and CTN.
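Several of the locus parameters and the test statistic named above are simple to compute. The following is a minimal sketch, assuming AT skew is read as (A - T)/(A + T) and aromaticity as the fraction of Phe, Trp, and Tyr residues; the function names are hypothetical, and the original analysis was run in JMP rather than with code like this.

```python
import math

# Kyte-Doolittle hydropathy values (Kyte and Doolittle 1982).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}

def at_skew(dna):
    """(A - T)/(A + T) for one strand of a gene."""
    dna = dna.upper()
    a, t = dna.count("A"), dna.count("T")
    return (a - t) / (a + t)

def aromaticity(protein):
    """Relative frequency of aromatic residues (Phe, Trp, Tyr)."""
    protein = protein.upper()
    return sum(protein.count(x) for x in "FWY") / len(protein)

def gravy(protein):
    """Mean Kyte-Doolittle hydropathy over all residues."""
    return sum(KD[aa] for aa in protein.upper()) / len(protein)

def average_ranks(values):
    """Ranks starting at 1, with ties given their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests

# With 39 parameters tested and a nominal alpha of 0.05 (an assumed value):
threshold = bonferroni_alpha(0.05, 39)
```

The nominal α of 0.05 in the usage line is an assumption; the text does not state the value that was used.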
Locating Genes Situated on Leading Versus Lagging Strands of Replication
Asymmetrical mutational bias between the two complementary DNA strands may contribute to variation in both codon and amino acid usage (Karlin, Campbell, and Mrazek 1998; McInerney 1998; Lafay et al. 1999). To consider effects of leading versus lagging strands of replication, we determined strand orientation of genes on the basis of their position relative to the origin and terminus of replication. The DnaA-box of the Buchnera genome is thought to mark the origin of replication of the chromosome (Shigenobu et al. 2000). But a shift of the GC skew in noncoding and synonymous third codon positions 13 kb upstream of this DnaA-box (Shigenobu et al. 2000) may instead correlate with the origin, as shown for other bacterial genomes (Lobry 1996; Blattner et al. 1997). To account for ambiguity in the location of the Buchnera origin, we considered this 13-kb region (from position 627,681 to position 1 of the sequenced genome) as an "origin window." Likewise, we defined the "terminus window" as the 13-kb region immediately opposite (180 degrees from) the origin region (307,340 to 320,340). We excluded genes in these windows from comparisons of leading versus lagging strands. In contrast, the origin and terminus of E. coli are well defined experimentally (e.g., Yoshikawa and Ogasawara 1991) so that all genes can be assigned to the leading or lagging strand.
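The window-based assignment described above can be sketched as follows. This is an illustrative sketch, not the authors' procedure: the genome length of 640,681 bp is an assumption inferred from the origin window (627,681 plus 13 kb), the function name `classify_gene` is hypothetical, and the rule that a gene transcribed in the direction of fork movement lies on the leading strand assumes bidirectional replication starting near position 1.

```python
GENOME_LENGTH = 640_681            # assumption: 627,681 + 13,000 bp
ORIGIN_WINDOW_START = 627_681      # origin window runs from here to position 1
TERMINUS_WINDOW = (307_340, 320_340)

def classify_gene(start, end, strand):
    """Classify a gene as 'leading', 'lagging', or 'excluded'.

    start, end: 1-based coordinates with start <= end (no wrap-around).
    strand: '+' if transcribed toward increasing coordinates, else '-'.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    in_origin = end >= ORIGIN_WINDOW_START or start <= 1
    in_terminus = start <= TERMINUS_WINDOW[1] and end >= TERMINUS_WINDOW[0]
    if in_origin or in_terminus:
        return "excluded"
    # Between position 1 and the terminus the fork moves toward increasing
    # coordinates; on the other half of the chromosome it moves the other way.
    fork_moves_forward = end < TERMINUS_WINDOW[0]
    co_oriented = (strand == "+") == fork_moves_forward
    return "leading" if co_oriented else "lagging"

# Hypothetical gene coordinates, for illustration only.
example = classify_gene(1_000, 2_000, "+")
```

For E. coli, where the origin and terminus are defined experimentally, the same logic would apply with zero-width windows so that no gene is excluded.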
Results
Axis 1
The first axis accounts for 23.0% of the total variation of the data. This axis correlates positively with GC content at first and second codon positions (Spearman's rho, rs = 0.89, P < 0.0001 and rs = 0.78, P < 0.0001, respectively) but, notably, not with GC content at third codon positions. Axis 1 also correlates negatively with aromaticity levels of each protein (rs = -0.67, P < 0.0001) and differentiates putative high- and low-expression genes in Buchnera (fig. 1a).
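The position-specific GC contents correlated with Axis 1 can be computed per gene as in this minimal sketch. The function name `gc_by_codon_position` is hypothetical, and the sketch assumes an in-frame coding sequence; any trailing partial codon is ignored.

```python
def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3): GC fraction at codon positions 1, 2, and 3."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3   # drop any trailing partial codon
    n_codons = usable // 3
    if n_codons == 0:
        raise ValueError("sequence is shorter than one codon")
    gc = [0, 0, 0]
    for i in range(usable):
        if cds[i] in "GC":
            gc[i % 3] += 1
    return tuple(count / n_codons for count in gc)

# Hypothetical three-codon sequence (ATG GCC AAA), for illustration only.
gc1, gc2, gc3 = gc_by_codon_position("ATGGCCAAA")
```

Because first and second positions largely determine the encoded amino acid while most third positions are synonymous, a correlation with GC1 and GC2 but not GC3 points to amino acid choice rather than synonymous codon choice.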
Other Axes
The third and fourth axes of COA in Buchnera account for 8.1% and 6.3% of the variation in the data, respectively. Axis 3 does not correlate significantly with any parameter considered. The fourth axis separates proteins that are rich in Cys, the rarest amino acid in Buchnera (table 1). The low dispersion observed in these and subsequent axes did not warrant further consideration.
Multivariate Analysis of Amino Acid Usage in E. coli
The first four axes of COA of the complete genome sequence of E. coli K-12 explain 49.2% of the total variation of the data (distributed along the 19 total axes). Axis 1 (19.1% of the total variation) correlates positively with the GRAVY score of proteins (rs = 0.83, P < 0.0001) and negatively with AT skew (rs = -0.67, P < 0.0001). This result agrees with a previous multivariate analysis of 999 E. coli genes, in which integral membrane proteins (IMPs) form a distinct group along Axis 1 (Lobry and Gautier 1994). Axis 2 (12.4% of the total variation) correlates significantly with gene expression, approximated using CAI (rs = 0.36, P < 0.0001), but not as strongly as previously reported (rs = 0.55, P < 0.0001; Lobry and Gautier 1994). A high correlation with C and A content at first codon positions (rs = -0.79, P < 0.0001, and rs = 0.68, P < 0.0001, respectively) does not coincide with that expected from the correlation with gene expression, i.e., an excess of guanine at first codon positions (Gutierrez, Marquez, and Marin 1996). Axis 3 (9.4% of the variation) correlates positively with aromaticity and correlates with CAI almost as well as does Axis 2 (rs = -0.31, P < 0.0001). Both Axes 2 and 3 differentiate high- and low-expression genes considered in this study for E. coli (data not shown), as predicted by their correlations with CAI. As in Buchnera, the distinction of proteins rich in Cys (also the rarest amino acid in E. coli; table 2) along Axis 4 (8.3% of the variation) suggests that the frequency of Cys is highly variable among loci.
In E. coli, D{H,L} values only partially account for variation at Axes 2 and 3, both of which correlate with gene expression. Amino acids that are underrepresented in high-expression genes (Trp, Leu, Cys, Gln, and Pro; fig. 2) appear at the extreme of Axis 2, but only Trp is extreme in Axis 3 (fig. 1c). Amino acids that are overrepresented in high-expression E. coli genes (Lys, Val, and Arg) are situated at the extreme of Axis 3, but only Lys is extreme in Axis 2.
Strand of Replication
Our COA of amino acid usage across the full genomes of Buchnera and E. coli did not clearly distinguish genes on the leading and lagging strands of replication in either species. But strand orientation and gene expression levels are not independent. As noted previously for E. coli and several other bacterial species (Francino and Ochman 1999), we found that putative high-expression genes tend to occur on the leading strand (78% of Buchnera and 96% of E. coli genes sampled here), perhaps because of selection to avoid collision between DNA and RNA polymerases (Brewer 1988).
Thus, we tested whether prevalence of high-expression genes on the leading strand, coupled with strand-specific mutational asymmetries, could account for the distinct amino acid profiles we observed at high- and low-expression genes. To distinguish the effects of strand orientation and gene expression level, we compared D{H,L} values calculated separately for genes on leading and lagging strands with D{H,L} values obtained when strand orientation was not considered (tables 1 and 2). In general, strand position did not affect the sign of significant D{H,L} values (i.e., had no effect on whether an amino acid is over- or underrepresented in high-expression genes). In Buchnera, switches in the sign of D{H,L} occurred only when this value was not significant; therefore, these sign changes may be attributed to random variation. The same result was found in E. coli, with the exception of Val (GTN), which is more abundant on the leading strand (see Discussion).
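The stratified comparison can be illustrated with a short sketch. It assumes that RAAU is the pooled frequency of an amino acid across a gene set and that D{H,L} is RAAU in high-expression genes minus RAAU in low-expression genes, so that a positive value means overrepresentation at high expression. The exact definition and the significance test are not given in this text, so both the formula and the function names are assumptions.

```python
from collections import Counter

def raau(proteins):
    """Relative amino acid usage pooled over a list of protein sequences."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq.upper())
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}

def d_high_low(high, low):
    """Assumed D{H,L}: RAAU(high) - RAAU(low) for each amino acid observed."""
    rh, rl = raau(high), raau(low)
    return {aa: rh.get(aa, 0.0) - rl.get(aa, 0.0)
            for aa in sorted(set(rh) | set(rl))}

def d_by_strand(genes):
    """Compute D{H,L} separately for leading- and lagging-strand genes.

    genes: list of (protein, expression, strand) tuples, where expression is
    'high' or 'low' and strand is 'leading' or 'lagging'.
    """
    result = {}
    for strand in ("leading", "lagging"):
        high = [p for p, e, s in genes if s == strand and e == "high"]
        low = [p for p, e, s in genes if s == strand and e == "low"]
        result[strand] = d_high_low(high, low)
    return result

# Hypothetical sequences, for illustration only.
d_all = d_high_low(["AAGK"], ["AKKW"])
```

An amino acid whose D{H,L} keeps the same sign on both strands and in the pooled comparison is one whose association with expression level is not explained by strand orientation alone.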
Hydrophobicity
We further explored the effect of hydrophobicity on amino acid usage in Buchnera and E. coli by comparing the inferred functions of genes at extreme positions of the axes that correlate with the hydropathy of proteins (i.e., loci positioned at >0.20 on Axis 2 of Buchnera and loci at >0.20 on Axis 1 for E. coli). In both genomes, these hydrophobic proteins tend to function as IMPs, with functions such as transport and anchoring of dehydrogenases. All but two Buchnera genes positioned at >0.20 on Axis 2 were homologous to E. coli genes at the extreme of Axis 1. These two exceptions, znuB and secY, also encode IMPs involved in transport. Moreover, secY is listed among E. coli genes at the extreme of Axis 1 in a previous COA of this species (Lobry and Gautier 1994) but is absent from E. coli K-12.
Discussion
The first axis of the COA clearly distinguishes putative high- and low-expression Buchnera genes. To confirm this strong effect of gene expression on amino acid usage, we compared the RAAU of each amino acid at high- and low-expression genes using the new statistic D{H,L}. Several amino acids show significant differences in their abundance at putative high- and low-expression Buchnera genes. Amino acids significantly overrepresented at high-expression Buchnera genes include Ala, Gly, Arg, and Val, whereas those significantly underrepresented include Trp, Leu, Phe, Tyr, Ile, and Asn (table 1 and fig. 2).
This distinct amino acid profile of high-expression Buchnera loci may be shaped by selection against aromatic amino acids (Trp, Phe, and Tyr), which are expensive to biosynthesize (Craig and Weber 1998; Akashi and Gojobori 2002). In addition, amino acids that are more abundant at high-expression Buchnera genes tend to use codons that are relatively GC-rich at first and second positions (but not at third positions). This relative GC-richness suggests that selection counteracting a genome-wide AT mutational pressure in Buchnera is stronger at high-expression genes compared with low-expression genes.
Because many aromatic amino acids are encoded by AT-rich codons, selection against aromatic residues and selection against AT-rich codons are complementary and overlapping. But results of this study show their independent effects. For example, the amino acids Asn and Ile are encoded by AT-rich codons but are not aromatic. Thus, the low frequencies of Asn and Ile in high-expression Buchnera genes argue for selection against AT-rich codons at high-expression genes that cannot be explained by selection against aromaticity. Likewise, the low frequency of the very aromatic Trp (encoded by TGG) in high-expression genes suggests selection against aromatic amino acids that cannot be explained by selection against AT-rich codons.
The specific function of Buchnera as a nutritional endosymbiont may influence certain patterns of amino acid usage in this genome. Interestingly, the essential amino acids Trp and Leu are generally overproduced by Buchnera to supplement its host diet (Douglas and Prosser 1992; Bracho et al. 1995; Baumann et al. 1998) and might be relatively abundant amino acids in the endosymbiont cell. Therefore, the paucity of Trp and Leu in high-expression Buchnera genes may be shaped by host-level selection for energetic efficiency (see Rispe and Moran 2000 for models of host and symbiont-level selection). In addition, host-level selection may also influence the usage of several nonessential amino acids that Buchnera cannot synthesize but must acquire from the aphid host (Shigenobu et al. 2000).
Although strong AT bias in Buchnera may drive distinct amino acid usage at high- and low-expression genes, this mutational bias may actually narrow that difference in some cases. Notably, Lys, encoded by the AT-rich codons AA[AG], is significantly overrepresented in ribosomal proteins of E. coli (RAAU of 0.0948 in ribosomal proteins vs. 0.0437 genome-wide). But Lys is the most common amino acid across the Buchnera genome (RAAU of 0.0988 genome-wide), consistent with the strong AT mutational bias. The slightly higher frequency of Lys in high-expression Buchnera genes is not significant given the genome-wide abundance of this amino acid.
Despite a strong effect of gene expression level on amino acid usage in Buchnera, we found no effect of gene expression on patterns of relative synonymous codon usage (RSCU) in this species. That is, in a multivariate analysis of RSCU of the 479 Buchnera loci included in this study, no major axis distinguished putative high- and low-expression genes (data not shown). This genome-wide analysis supports previous evidence that mutational bias and drift shape codon usage in Buchnera (e.g., Brynnel et al. 1998; Wernegreen and Moran 1999) and adds to the accumulating evidence that translational selection is not sufficiently strong or effective (or both) to counter the effects of genetic drift and mutational bias on synonymous codon usage. Therefore, gene expression levels influence amino acid usage, where selection may act more strongly, but not synonymous codon usage, where weak selection is apparently ineffective in small Buchnera populations. Another plausible explanation for the absence of translational selection on codon usage in Buchnera is the equal abundance of tRNAs in this genome, which contains only one or a few copies of each tRNA. The reduced tRNA population in the AT-rich genome of the parasite R. prowazekii was also postulated as a major factor in the absence of codon usage bias (Andersson and Sharp 1996). Interestingly, a more detailed analysis of codon bias in Buchnera suggests that different mutational biases on leading and lagging strands affect synonymous codon usage (Claude Rispe, personal communication).
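For readers unfamiliar with RSCU, the sketch below computes it with the usual definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. It is a minimal illustration with a hypothetical function name, not the CodonW routine used in the study. It uses the standard genetic code and skips stop codons, single-codon amino acids (Met, Trp), and amino acids absent from the input.

```python
from collections import Counter

BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def rscu(cds_list):
    """Relative synonymous codon usage pooled over in-frame coding sequences."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper()
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:      # skips codons with ambiguous bases
                counts[codon] += 1
    result = {}
    for aa in sorted(set(AMINO) - {"*"}):
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in family)
        if len(family) < 2 or total == 0:
            continue
        for codon in family:
            # Observed count over the count expected under equal usage.
            result[codon] = counts[codon] * len(family) / total
    return result

# Hypothetical sequence (AAA AAA AAG), for illustration only.
values = rscu(["AAAAAAAAG"])
```

An RSCU of 1 indicates no bias within a synonymous family. A table of per-gene RSCU values is the kind of input a multivariate analysis of codon usage would take.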
In E. coli, hydrophobicity is the primary factor shaping variation in amino acid usage among proteins, but the effects of gene expression (although secondary) are nonetheless apparent. Comparisons of putative high- and low-expression genes show many similarities between Buchnera and E. coli because no amino acid shows significantly different trends in the two genomes (fig. 2). The significant underrepresentation of Trp in high-expression genes of both genomes suggests selection against the use of this aromatic amino acid. On the basis of similar results for E. coli, Lobry and Gautier (1994) suggested that the energetic costs of aromatic amino acids may account for their low abundance in high-expression genes. Recently, a more detailed study of metabolic efficiency in bacteria analyzed the cost of each amino acid in terms of high-energy phosphate bonds (the most expensive amino acids being the aromatic Trp, Phe, and Tyr) (Akashi and Gojobori 2002). The authors found decreased abundance of costly amino acids in high-expression E. coli loci regardless of gene function. Interestingly, our comparison of high- and low-expression E. coli genes is based on a limited gene sample but yielded results entirely consistent with this previous genome-wide analysis. Only Phe, Pro, Ser, and Arg differ in whether they show significant changes with gene expression, but this could be attributed to our necessarily smaller gene sample. Consistent with our results, Akashi and Gojobori (2002) also found no relationship between gene expression and the abundance of amino acids encoded by AT-rich or GC-rich codons in E. coli. This pattern in E. coli contrasts with the striking correlation in Buchnera between GC-richness of amino acid codons and gene expression (as reflected in significant correlations between Axis 1 and GC content at first and second codon positions and the biased amino acid profiles at high- vs. low-expression genes [figs. 1b and 2]).
Previous work shows that tRNA pools match overall amino acid usage of proteins in several genomes (Yamao et al. 1991). In E. coli, the amino acid composition of high-expression genes correlates more strongly with tRNA abundances than does that of low-expression genes (Lobry and Gautier 1994). This trend has been interpreted as coadaptation between amino acid composition of proteins and tRNA pools to enhance translational efficiency (Lobry and Gautier 1994; Akashi and Eyre-Walker 1998). In this study, we plotted the two axes that correlated with gene expression in the E. coli COA against major tRNA abundances (Dong, Nilsson, and Kurland 1996). Axis 2 is not correlated with tRNA abundance of the corresponding amino acid, whereas Axis 3 correlates significantly with tRNA abundance (but only before applying the Bonferroni correction; rs = -0.66, P < 0.001; fig. 1d). These data support previous evidence that translational selection may shape amino acid usage in E. coli (Lobry and Gautier 1994). This pattern contrasts with Buchnera, in which equal abundances of tRNA molecules or reduced efficacy of selection may limit effects of translational selection on both codon usage (see above) and amino acid usage.
Other Processes that may Drive Intragenomic Variation in Amino Acid Composition of Buchnera and E. coli
We calculated D{H,L} on the leading and lagging strands separately to distinguish the effects of strand orientation and gene expression level. In Buchnera, strand orientation does not account for the observed differences between high- and low-expression genes (table 1). But in E. coli, switches in the sign of D{H,L} depending on strand orientation suggest that strand-specific mutational biases may affect amino acid usage (table 2). Namely, high-expression genes of E. coli may experience distinct mutational pressures by virtue of their prevalence on the leading strand of replication. For example, the strong bias of Val (encoded by GTN) on the leading strand in E. coli and other species, independent of gene expression level, is consistent with a G > C and T > A skew on the leading strand resulting from strand mutational asymmetries (Mackiewicz et al. 1999; Rocha, Danchin, and Viari 1999). Likewise, strand position apparently contributes to the overrepresentation of Val in high-expression E. coli genes in our study.
Strand-specific biases may also drive asymmetries between the coding and noncoding DNA strands, as a result of transcription-associated mutation or DNA repair (or both) (e.g., Francino and Ochman 1999). If this bias increases with transcription levels, then transcription-associated asymmetries may contribute to differences between high- and low-expression genes. For example, it is possible that a C→T mutational bias on the coding strand (Beletskii and Bhagwat 1996) contributes to the observed high frequencies of certain GT-rich amino acids (e.g., Gly [GGN] and Val [GTN]) at high-expression genes in Buchnera, and to the low frequencies of amino acids with CA-rich codons (Pro [CCN], Gln [CA(AG)]) at these genes in E. coli. But transcription-associated biases alone cannot account for distinct amino acid profiles in high- and low-expression genes. For example, several amino acids that use GT-rich codons are significantly underrepresented at high-expression genes of one or both species (Trp [TGG], Phe [TT(TC)], Cys [TG(TC)]). Nor can transcription-associated biases account for the observed reduction in the AT content of amino acid codons at high-expression genes, because a C→T mutational bias is expected to increase T-richness of high-expression genes.
In analyses of single genomes, apparent differences in amino acid usage at high- and low-expression genes may partially reflect distinct structural or functional requirements of the proteins selected. But gene-specific structural or functional constraints have a minimal effect on the conclusions of our study, which is based on a comparison of an identical set of genes in Buchnera and E. coli. Moreover, the consistency of our results with a previous genome-wide study of E. coli (Akashi and Gojobori 2002) suggests that, for the purposes of this study, our limited gene sample is largely characteristic of high- and low-expression genes. Paralogous genes also pose complications for analyses of individual genomes because amino acid profiles of paralogs may be similar because of common ancestry. But in this study the observed differences between E. coli and Buchnera cannot be explained by gene duplication in Buchnera, which basically represents a subset of the E. coli genome (Shigenobu et al. 2000).
Conclusions
Acknowledgements
Footnotes
Abbreviations: Ne, effective population size; COA, correspondence analysis; RAAU, relative amino acid usage; D{H,L}, difference in RAAU of an amino acid at high- and low-expression genes; rs, Spearman's Rho coefficient; RSCU, relative synonymous codon usage.
Keywords: amino acid usage; selection; endosymbiosis; mutational bias; multivariate analysis; gene expression; Escherichia coli
Address for correspondence and reprints: Jennifer J. Wernegreen, Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, 7 MBL Street, Woods Hole, Massachusetts 02543. E-mail: jwernegreen{at}mbl.edu
References
Akashi H., A. Eyre-Walker, 1998 Translational selection and molecular evolution Curr. Opin. Genet. Dev 8:688-693
Akashi H., T. Gojobori, 2002 Metabolic efficiency and amino acid composition in the proteomes of Escherichia coli and Bacillus subtilis Proc. Natl. Acad. Sci. USA 99:3695-3700
Aksoy S., 1995 Molecular analysis of the endosymbionts of tsetse flies: 16S rDNA locus and over-expression of a chaperonin Insect Mol. Biol 4:23-29
Andersson S. G. E., C. G. Kurland, 1998 Reductive evolution of resident genomes Trends Microbiol 6:263-268
Andersson S. G. E., P. M. Sharp, 1996 Codon usage and base composition in Rickettsia prowazekii J. Mol. Evol 42:525-536
Baumann P., L. Baumann, M. A. Clark, M. L. Thao, 1998 Genetic properties and adaptations of Buchnera aphidicola to an endosymbiotic association with aphids ASM News 64:203-208
Beletskii A., A. S. Bhagwat, 1996 Transcription-induced mutations: increase in C to T mutations in the nontranscribed strand during transcription in Escherichia coli Proc. Natl. Acad. Sci. USA 93:13919-13924
Bennetzen J. L., B. D. Hall, 1982 Codon selection in yeast J. Biol. Chem 257:3026-3031
Blattner F. R., G. Plunkett III, C. A. Bloch, et al. (17 co-authors) 1997 The complete genome sequence of Escherichia coli K-12 Science 277:1453-1474
Bracho A. M., D. Martinez-Torres, A. Moya, A. Latorre, 1995 Discovery and molecular characterization of a plasmid localized in Buchnera sp. bacterial endosymbiont of the aphid Rhopalosiphum padi J. Mol. Evol 41:67-73
Brewer B., 1988 When polymerases collide: replication and the transcriptional organization of the E. coli chromosome Cell 53:679-686
Brynnel E. U., C. G. Kurland, N. A. Moran, S. G. E. Andersson, 1998 Evolutionary rates for tuf genes in endosymbionts of aphids Mol. Biol. Evol 15:574-582
Buchner P., 1965 Endosymbiosis of animals with plant microorganisms Interscience-Wiley, New York
Clark M. A., L. Baumann, P. Baumann, 1998 Sequence analysis of a 34.7-kb DNA segment from the genome of Buchnera aphidicola (endosymbiont of aphids) containing groEL, dnaA, the atp operon, gidA, and rho Curr. Microbiol 36:158-163
Clark M. A., N. A. Moran, P. Baumann, 1999 Sequence evolution in bacterial endosymbionts having extreme base compositions Mol. Biol. Evol 16:1586-1598
Craig C. L., R. S. Weber, 1998 Selection costs of amino acid substitutions in ColE1 and ColIa gene clusters harbored by Escherichia coli Mol. Biol. Evol 15:774-776
de Miranda A. B., F. Alvarez-Valin, K. Jabbari, W. M. Degrave, G. Bernardi, 2000 Gene expression, amino acid conservation, and hydrophobicity are the main factors shaping codon preferences in Mycobacterium tuberculosis and Mycobacterium leprae J. Mol. Evol 50:45-55
Dong H., L. Nilsson, C. G. Kurland, 1996 Co-variation of tRNA abundance and codon usage in Escherichia coli at different growth rates J. Mol. Biol 260:649-663
Douglas A. E., 1995 The ecology of symbiotic micro-organisms Adv. Ecol. Res 26:69-103
Douglas A. E., W. A. Prosser, 1992 Synthesis of the essential amino acid tryptophan in the pea aphid (Acyrthosiphon pisum) symbiosis J. Insect Physiol 38:565-568
Francino M. P., H. Ochman, 1999 A comparative genomics approach to DNA asymmetry Ann. N. Y. Acad. Sci 870:428-431
Funk D. J., L. Helbling, J. J. Wernegreen, N. A. Moran, 2000 Intraspecific phylogenetic congruence among multiple symbiont genomes Proc. R. Soc. Lond. B 267:2517-2521
Greenacre M. J., 1984 Theory and applications of correspondence analysis Academic Press, London
Gutierrez G., L. Marquez, A. Marin, 1996 Preference for guanosine at first codon position in highly expressed Escherichia coli genes. A relationship with translational efficiency Nucleic Acids Res 24:2525-2527
Ikemura T., 1981 Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes: a proposal for a synonymous codon choice that is optimal for the E. coli translational system J. Mol. Biol 151:389-409
Ishikawa H., 1984 Characterization of the protein species synthesized in vivo and in vitro by an aphid endosymbiont Insect Biochem 14:417-425
Karlin S., A. M. Campbell, J. Mrazek, 1998 Comparative DNA analysis across diverse genomes Annu. Rev. Genet 32:185-225
Kyte J., R. F. Doolittle, 1982 A simple method for displaying the hydropathic character of a protein J. Mol. Biol 157:105-132
Lafay B., A. T. Lloyd, M. J. McLean, K. M. Devine, P. M. Sharp, K. H. Wolfe, 1999 Proteome composition and codon usage in spirochaetes: species-specific and DNA strand-specific mutational biases Nucleic Acids Res 27:1642-1649
Lobry J. R., 1996 Origin of replication of Mycoplasma genitalium Science 272:745-746
Lobry J. R., C. Gautier, 1994 Hydrophobicity, expressivity and aromaticity are the major trends of amino-acid usage in 999 Escherichia coli chromosome-encoded genes Nucleic Acids Res 22:3174-3180
Mackiewicz P., A. Gierlik, M. Kowalczuk, M. Dudek, S. Cebrat, 1999 How does replication-associated mutational pressure influence amino acid composition of proteins? Genome Res 9:409-416
McInerney J. O., 1997 Prokaryotic genome evolution as assessed by multivariate analysis of codon usage patterns Microb. Comparat. Genomics 2:89-97
McInerney J. O., 1998 Replication and transcriptional selection on codon usage in Borrelia burgdorferi Proc. Natl. Acad. Sci. USA 95:10698-10703
McLean D. L., E. J. Houk, 1973 Phase contrast and electron microscopy of the mycetocytes and symbiontes of the pea aphid Acyrtosiphon pisum J. Insect Physiol 19:625-633
Mira A., H. Ochman, N. A. Moran, 2001 Deletional bias and the evolution of bacterial genomes Trends Genet 17:589-596
Moran N., 1996 Accelerated evolution and Muller's ratchet in endosymbiotic bacteria Proc. Natl. Acad. Sci. USA 93:2873-2878
Moran N., P. Baumann, 1994 Phylogenetics of cytoplasmically inherited microorganisms of arthropods Trends Ecol. Evol 9:15-20
Moran N. A., M. A. Munson, P. Baumann, H. Ishikawa, 1993 A molecular clock in endosymbiotic bacteria is calibrated using the insect hosts Proc. R. Soc. Lond. B 253:167-171
Muller J., 1964 The relation of recombination to mutational advance Mutat. Res 1:2-9
Munson M. A., P. Baumann, M. A. Clark, L. Baumann, N. A. Moran, D. J. Voegtlin, B. C. Campbell, 1991 Evidence for the establishment of aphid-eubacterium endosymbiosis in an ancestor of four aphid families J. Bacteriol 173:6321-6324
Ohta T., 1973 Slightly deleterious mutant substitutions in evolution Nature 246:96-98
Ohtaka C., H. Ishikawa, 1993 Accumulation of adenine and thymine in a groE-homologous operon of an intracellular symbiont J. Mol. Evol 36:121-126
Rispe C., N. A. Moran, 2000 Accumulation of deleterious mutations in endosymbionts: Muller's ratchet with two levels of selection Am. Nat 156:425-441
Rocha E., A. Danchin, A. Viari, 1999 Universal replication biases in bacteria Mol. Microbiol 32:11.
Romero H., A. Zavala, H. Musto, 2000 Codon usage in Chlamydia trachomatis is the result of strand-specific mutational biases and a complex pattern of selective forces Nucleic Acids Res 28:2084-2090
Sato S., H. Ishikawa, 1997 Expression and control of an operon from an intracellular symbiont which is homologous to the groE operon J. Bacteriol 179:2300-2304
Selander R. K., D. A. Caugant, T. S. Whittam, 1987 Genetic structure and variation in natural populations of Escherichia coli Pp. 1625-1648 in F. C. Neidhardt, J. L. Ingraham, K. B. Low, G. Magasanik, M. Schaechter, and H. E. Umbarger, eds. Escherichia coli and Salmonella typhimurium: cellular and molecular biology. American Society of Microbiology, Washington, D.C.
Selosse M., B. Albert, B. Godelle, 2001 Reducing the genome size of organelles favours gene transfer to the nucleus Trends Ecol. Evol 16:135-141
Sharp P. M., W. H. Li, 1987 The codon Adaptation Index: a measure of directional synonymous codon usage bias, and its potential applications Nucleic Acids Res 15:1281-1295
Shigenobu S., H. Watanabe, M. Hattori, Y. Sakaki, H. Ishikawa, 2000 Genome sequence of the endocellular bacterial symbiont of aphids Buchnera APS Nature 407:81-86
Shigenobu S., H. Watanabe, Y. Sakaki, H. Ishikawa, 2001 Accumulation of species-specific amino acid replacements that cause loss of particular functions in Buchnera, an endocellular bacterial symbiont J. Mol. Evol 53:377-386
Silva F. J., A. Latorre, A. Moya, 2001 Genome size reduction through multiple events of gene disintegration in Buchnera APS Trends Genet 17:615-618
Singer G. A. C., D. A. Hickey, 2000 Nucleotide bias causes a genomewide bias in the amino acid composition of proteins Mol. Biol. Evol 17:1581-1588
Sokal R. R., F. J. Rohlf, 1995 Biometry W.H. Freeman and Co., New York
Srivastava A. K., D. Schlessinger, 1990 Mechanism and regulation of bacterial ribosomal RNA processing Annu. Rev. Microbiol 44:105-129
Unterman B. M., P. Baumann, D. L. McLean, 1989 Pea aphid symbiont relationships established by analysis of 16S rRNAs J. Bacteriol 171:2970-2974
Wernegreen J. J., N. A. Moran, 1999 Evidence for genetic drift in endosymbionts (Buchnera): analyses of protein-coding genes Mol. Biol. Evol 16:83-97
Wernegreen J. J., N. A. Moran, 2001 Vertical transmission of biosynthetic plasmids in aphid endosymbionts (Buchnera) J. Bacteriol 183:785-790
Yamao F., Y. Andachi, A. Muto, T. Ikemura, S. Osawa, 1991 Levels of tRNAs in bacterial cells as affected by amino acid usage in proteins Nucleic Acids Res 19:6119-6122
Yoshikawa H., N. Ogasawara, 1991 Structure and function of DnaA and the DnaA-box in eubacteria: evolutionary relationships of bacterial replication origins Mol. Microbiol 5:2589-2597