Department of Molecular Evolution, Evolutionary Biology Center, University of Uppsala, Uppsala, Sweden
Correspondence: E-mail: Siv.Andersson{at}ebc.uu.se.
![]() |
Abstract |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
Key Words: endosymbionts phylogeny substitution rate
![]() |
Introduction |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
An important but, yet, unresolved question concerns the evolutionary forces that drive the acceleration of nucleotide substitutions in endosymbiont populations. Factors such as enhanced mutation rates, reduced levels of purifying selection, and increased fixation rates for slightly deleterious mutations combined with the effects of Mullers ratchet on small, asexual populations have been postulated to account for the rate increase (Moran 1996; Andersson and Kurland 1998; Brynnel et al. 1998; Clark, Moran, and Baumann 1999). A recent study revealed a uniform rate increase for B. aphidicola genes relative to those of E. coli, suggesting that the elevated substitution rate is solely the result of a mutation rate increase (Itoh, Martin, and Nei 2002). However, this conclusion was drawn from an analysis that only included one-third of all B. aphidicola genes. These were selected on the basis that they produced phylogenetic trees in which B. aphidicola formed a sister group with E. coli to the exclusion of H. influenzae (Itoh, Martin, and Nei 2002). The majority of genes, two-thirds, were associated with deviant tree topologies, and these were excluded from the analyses based on the assumption that they could represent horizontal gene transfer events (Itoh, Martin, and Nei 2002).
A large number of conflicting tree topologies is a characteristic feature of whole-genomebased phylogenetic analyses (Sicheritz-Ponten and Andersson 2001; Eisen and Fraser 2003), and these incongruent topological features have been used as evidence for horizontal gene transfer events (Gogarten, Doolittle, and Lawrence 2002; Doolittle et al. 2003). Taken to the extreme, it is argued that the widespread occurrence of such anomalies challenges the concept of a universal tree of life (Doolittle 1999; Doolittle et al. 2003). Others, however, argue that a core set of genes with a common history can be identified that reflects the true species tree, despite the many conflicting tree topologies (Daubin, Gouy, and Perriere 2002; Daubin, Moran, and Ochman 2003; Rokas et al. 2003). Alternative explanations for unexpected placements of taxa in single-gene trees include long-branch artifacts, gene paralogy, and lack of resolution because of saturation effects or low levels of sequence divergence (Eisen and Fraser 2003; Daubin, Moran, and Ochman 2003).
Because the correct species tree is normally not known, there are few realistic scenarios under which the sources and implications of deviant tree topologies can be examined. The aphid endosymbiont lineages provide an interesting exception; because all genes in these genomes are thought to share the same evolutionary history, they represent an ideal reference system for phylogenomic analyses. Because it is of general interest to examine whether unexpected placements of taxa in single-gene trees represent horizontal gene transfer events (Doolittle et al. 2003; Gogarten, Doolittle, and Lawrence 2002; Daubin, Gouy, and Perriere 2002; Daubin, Moran, and Ochman 2003; Eisen and Fraser 2003; Kurland, Canbäck, and Berg 2003) and of specific interest to examine the causes for lineage-specific rate increases in endosymbiotic bacteria (Itoh, Martin, and Nei 2002), we have here used a phylogenomic approach to re-examine the congruence of gene trees from B. aphidicola. Our results suggest that distorted tree topologies for endosymbiont genes result from biased mutation pressures and enhanced substitution rates, rather than from horizontal gene transfer events. The analysis provides evidence for relative rate variations among functional classes of genes in endosymbiotic bacteria of aphids and tsetse flies, indicating that the acceleration of evolutionary rates is affected by purifying selection acting at the host level.
![]() |
Materials and Methods |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
For the first global phylogenetic survey, we reconstructed phylogenetic trees from protein alignments that contained all of the 545 inferred B. aphidicola (Sg) gene products and the homologous protein sequences from other proteobacterial species. Based on the results, we conducted a second study to test the hypothesis that different tree topologies were obtained as a result of rate heterogeneity and nucleotide composition bias in the data set. In this case, phylogenetic trees were built from a more restricted data set that consisted of 180 B. aphidicola genes aligned with their orthologs from free-living proteobacterial species for which complete genome data were available. Finally, to examine rate variation among endosymbiont genes, we determined the relative rate of sequence evolution for orthologous genes of B. aphidicola and W. glossinidia from the branch lengths of these lineages in 258 phylogenetic trees that also included four newly sequenced proteobacterial species (including W. glossinida).
Survey of the B. aphidicola Phylome: Selection of Species, Genes, and Methods
To perform a first survey of the B. aphidiocola phylome, two databases were created with all 545 putative protein sequences of B. aphidicola (Sg), along with all proteobacterial sequences found in the SPTR protein database (accessible as SWALL on the EBI SRS server, http://srs6.ebi.ac.uk/). At the time of the analysis (April 2002), the total number of sequences in the SPTR database amounted to 127,974. The 545 putative proteins encoded by the genome of the Buchnera (Sg) were searched for homologous proteins in the constructed proteobacterial database by using the Blast version 2.0 algorithm (Altschul et al. 1997) with a threshold E-value of 1 x 1010. The amino acid sequences for the 30 best hits were aligned together with the query sequence using ClustalW version 1.8 (Thompson, Higgins, and Gibson 1994).
As many as 518 alignments contained four or more taxa, from which minimum-evolution (ME) trees were reconstructed with raw amino acid fraction differences as distances and 500 bootstrap replicates by using PAUP* version 4.0b10 for Unix (Swofford 1998). In each reconstruction, out-groups were defined as the species with the lowest scoring Blast hit among the -Proteobacteria, ß-Proteobacteria ,
-Proteobacteria , or
-Proteobacteria (i.e., among taxa that are not members of the
-Proteobacteria). In 41 trees, only
-proteobacterial homologs were identified. Consequently, no out-group could be defined, and these genes were excluded from further analyses, as were also 121 trees that did not contain genes from either E. coli, H. influenzae, or both. Another 30 trees were unresolved at the nodes of interest, and these were also excluded from the analysis.
For the 326 remaining trees, the relationships between B. aphidicola, E. coli, and H. influenzae were analyzed. If two species shared a node exclusive of the third species, a relationship between these two species was assumed for that particular protein. The resulting tree topologies differed mainly with respect to the placement of B. aphidicola. To illustrate the difference among the most frequently observed tree topologies, nine or 10 proteins were selected for a more detailed investigation of topology structures. The selected genes supported one of the alternative topologies with high bootstrap support values. The sequences of the selected proteins were concatenated, aligned, and subjected to phylogenetic reconstructions using both neighbor-joining (Saitou and Nei 1987) and maximum-parsimony methods. Bootstrap trees were calculated with 500 replicates in both cases.
Hypothesis Testing: Selection of Species, Genes, and Methods
To examine the influence of nucleotide sequence heterogeneity on tree topologies, a database was created that contained all nucleotide and protein sequences available from completely sequenced proteobacterial genomes. Thus, a restricted set of species was included in the analysis (table 1) so as to produce phylogenetic trees with high consistency and, thereby, reduce species sampling effects and, most importantly, to allow simple comparisons among trees. In the few cases where two or more strains had been sequenced from the same species, only one strain was selected. All the 545 putative proteins encoded by the B. aphidicola (Sg) genome were searched for homologous proteins in the proteobacterial genome database using a threshold E-value of 1 x 105. Proteins without homologs in all of the defined -proteobacterial species (table 1) and the selected out-group taxa were discarded from further analysis. As out-groups we selected the
-Proteobacteria Rickettsia conorii and Rickettsia prowazekii and the ß-Proteobacteria Neisseria meningitidis and Ralstonia solanacearum. R. prowazekii and R. conorii were selected as out-groups because of their A+T-richness so as to also test for possible A+T artifacts with the A+T-rich B. aphidicola. After this screening process, 180 genes remained for an in-depth phylogenetic analysis.
|
First, the neighbor-joining method (Saitou and Nei 1987) was used with the Galtier and Gouy substitution model (Galtier and Gouy 1995) that accommodates asymmetric substitution frequencies (e.g., the substitution rate from C to G may not equal the rate from G to C), unequal base compositions, and different transversion/transition rates and is implemented in Phylo_win (Galtier, Gouy, and Gautier 1996). Second, maximum-likelihood trees were built with the aid of PAUP*. Parameters used in the likelihood analyses were estimated with Modeltest (Posada and Crandall 1998), which helps to determine whether to use a gamma distribution, the size of the associated shape parameter, and the proportion of invariant sites and selects an appropriate substitution model. Likelihood estimations are demanding from a computational perspective. Therefore, we restricted the search in PAUP* by using only one repetition in the heuristic search. For the bootstrapped trees, 10 replicates were generated using the FastStep search method with HKY85 as substitution model and a gamma distribution with a shape parameter estimated by Modeltest.
To illustrate the different topologies obtained in this analysis, we selected four sets of genes that supported each of four different topologies with high bootstrap support values in the minimum-evolution trees. Nucleotide alignments were created by the concatenation of the individual gene alignments. Phylogenetic relationships were inferred from the first and second codon positions with the methods described above. However, with the limited number of taxa, the likelihood method could now be used with heuristic searches with 10 repetitions. For the bootstrapped trees, 100 replicates were generated using FastStep search. In both cases, all estimated parameters from Modeltest were used.
Studies of Rate Variation Among Genes: Selection of Species, Genes, and Methods
For the comparative analysis of the relative rate increase in endosymbiont genomes, we compiled a third data set. In addition to the selected set of species described above, this data set also included the gene and protein sequences of (by the time) newly sequenced Proteobacteria W. glossinidia, Shigella flexneri, Shewanella oneidensis, and Brucella suis (table 1). In this analysis, we selected as out-groups the -Proteobacteria Caulobacter crescentus and Mesorhizobium loti, together with the ß-Proteobacteria N. meningitidis and R. solanacearum. The genomes of C. crescentus and M. loti have a relatively high gene content as compared with R. prowazekii and R. conorii, which allowed more genes to be kept for the analysis.
The resulting 258 sequences were aligned with ClustalW 1.8 (Thompson, Higgins, and Gibson 1994), with default settings. From the protein alignments, minimum-evolution (ME) trees were constructed with the aid of PAUP* 4.0b10 for Unix (Swofford 1998). Five-hundred bootstrap replicates were produced, with the raw amino acid fraction differences as distances. Phylogenetic trees were also reconstructed based on first and second codon positions using the neighbor-joining method and the Galtier and Gouy (1995) substitution model. Branch-length ratios between taxa (table 2) were calculated from trees derived from the Galtier and Gouy (1995) method by dividing the branch length from the common node to the external node (taxon) for one of the taxons with the corresponding length for the other taxon.
|
![]() |
Results |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
To illustrate the different topologies obtained, we concatenated three sets of approximately 10 genes in each set that were selected on the basis that they supported each of the conflicting tree topologies with high bootstrap support values (fig. 1). Gene order is conserved between E. coli and B. aphidicola in a shared set of 91 fragments that contain a total of 503 genes in B. aphidicola (Ap) (Moran and Mira 2001). Genes producing alternative tree topologies mapped onto these fragments in a random manner (data not shown), with no indications of clusters of genes having similar histories, as might be expected if longer DNA segments had been acquired by horizontal gene transfer. We were also unable to find any support for a correlation between tree topologies and functional categories in B. aphidicola, as would be the case if genes with different functions had different ancestries.
|
To examine the effect of biased rates and patterns of sequence evolution on the placement of B. aphidicola in individual gene trees, we compared topological features with substitution frequencies and genic G+C content values. Nonsynonymous substitution frequencies (dN) were estimated for a set of orthologous genes in B. aphidicola (Sg) and (Ap). To begin, we noted a strong correlation between substitution frequencies and genic G+C content values (fig. 2), indicative of a recent mutation bias towards A+T. Both of these parameters were found to correlate with tree topology structures (fig. 2). On the average, genes in the A-type category were found to be more slowly evolving and less biased towards A+T (dN = 0.13; G+C = 28.2% [n = 78 genes]) than genes in the B-type category (dN = 0.17; G+C = 26.2% [n = 248 genes]). Likewise, the topology favoring a recent divergence of B. aphidicola near to E. coli and Salmonella with 100% bootstrap support (fig. 1A) was associated with a set of genes that on the average are highly conserved and have low G+C content values (fig. 2A). In contrast, topologies providing evidence for deeper divergences of B. aphidicola within the -Proteobacteria with strong bootstrap support (fig. 1B, C) were obtained from sets of genes with higher substitution rates and higher A+T content values (fig. 2A).
|
Application of Substitution Models for Unequal Base Composition
To improve the phylogenetic signal and minimize problems caused by the A+T-richness of the B. aphidicola genes, we performed phylogenetic reconstructions using nucleotide first and second codon positions with the Galtier and Gouy substitution model (Galtier and Gouy 1995) applied to the neighbor-joining algorithm (NJGG). The Galtier and Gouy model was developed for estimating evolutionary distances for data sets with heterogeneous base composition and have previously been shown to compensate for such biases (Galtier and Gouy 1995). To eliminate species sampling effects and to make the individual gene trees as comparable as possible, we selected for this analysis a set of 180 genes, all of which are present in the -proteobacterial genomes examined, as well as in four out-group taxa (table 1). A comparison of figure 2A and B shows that this gene set spans the entire range of G+C content values and nonsynonymous substitution frequencies in B. aphidicola and is thus a good representation of the complete gene set.
As in the previous analysis, several different topologies were obtained when the minimum-evolution algorithm was applied to each of the 180 protein alignments (MEprot). To illustrate the topology characteristics, we concatenated alignments of sets of individual proteins that supported each of four different topologies (A1, A2, B1, and B2) with high support. In topology A1 (fig. 3A), B. aphidicola formed a monophyletic group with Y. pestis, E. coli, and Salmonella species (the yess cluster). Topology A2 (fig. 3B) is similar to topology A1 except that V. cholerae is positioned in between the yess cluster and B. aphidicola. In the B-type topologies, B. aphidicola was found to diverge before H. influenzae, Pasteurella multocida, V. cholera, and the yess cluster. In the B1-type of trees (fig. 3C), P. aeruginosa diverged immediately after B. aphidicola, whereas in the B2-type of trees (fig. 3D), B. aphidicola diverged before P. aeruginosa.
|
|
Acceleration of Genomic Evolution Is Counterbalanced by Host-Level Selection
To study the role of host selection on endosymbiont genome evolution, we compiled a data set of 258 genes, all of which are present in B. aphidicola and W. glossinidia, as well as in the free-living -proteobacterial species selected and the out-group taxa M. loti, C. crescentus, N. meningitidis, and R. solanacearum. Also for this data set, approximately 50% of the trees yielded the A-type topology with strong support when the NJGG method was applied to the analysis. From the relative branch lengths in these trees, the rate of sequence evolution at nonsynonymous sites were estimated to be approximately twofold higher on the average in B. aphidicola than in E. coli (Buch/Ecoli = 2.4), with a less than twofold variation among functional categories (data not shown). However, we were unable to infer the full range of the rate enhancement from the branch lengths of B. aphidicola and E. coli to a shared node, because the most rapidly evolving genes are associated with deviant tree topologies.
Another approach to search for variations in evolutionary rates among genes is to estimate the relative rate enhancement in aphid and tsetse fly endosymbionts. The individual gene trees supported a clustering of B. aphidicola and W. glossinidia with high bootstrap support (>70%), as illustrated in figure 5. On the average, we estimated genes from the endosymbionts of aphids to have evolved 1.4-fold slower than those of tsetse flies (Buch/Wigg = 0.70) (table 2). The rate variation among functional categories is within a factor of 2; however, it is interesting to note that many of the outliers represent putative host-selected functions. At the upper end, we find genes coding for proteins involved in the biosynthesis of fatty acids and phospholipids (Buch/Wigg = 1.16), cofactors (Buch/Wigg = 0.93), and cell envelope structures (Buch/Wigg = 0.88) (table 2 and fig. 5A). At the lower end, we find genes coding for amino acid biosynthetic functions (Buch/Wigg = 0.60) (table 2 and fig. 5B).
|
Taken together, our analysis suggests that the rate and mode of sequence evolution has not been uniform among genes in the endosymbiont genomes. The observation is that the rate acceleration has been counterbalanced by selection for genes in functional categories that contain a larger number of genes in one endosymbiont genome compared with the other. The two effects, gene retention and reduced substitution rates, are best explained as the result of purifying selection acting at the host level.
![]() |
Discussion |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
The finding that many of the individual gene trees supported topologies that deviated from the presumed species tree (figs. 1 and 3) is consistent with results from previous phylogenetic analyses of the -Proteobacteria (Itoh, Martin, and Nei 2002; Snel, Bork, and Huynen 2002). In contrast, a lack of phylogenetic conflict with the assumed species tree for 203 of 205
-proteobacterial genes was suggested by the Shimodaira-Hasegawa test (Lerat, Daubin, and Moran 2003). However, it is difficult to evaluate the outcome of this test, because it is dependent on the confidence level (Lerat, Daubin, and Moran 2003). The fraction of single-gene trees supporting the assumed species trees increased upon the use of substitution models that take asymmetric base composition and variable G+C content levels into account (fig. 4). The ML method failed to resolve the divergence node for B. aphidicola in most of the individual gene trees, although a single, fully resolved species tree was obtained with high support in analyses of concatenated gene sets (fig. 3). Likewise, incongruence among single-gene phylogenies for eight yeast species were found to be resolved using larger concatenated data sets (Rokas et al. 2003).
This confirms the suspicion that phylogenetic trees inferred from single-gene sequences may be seriously flawed because of too few sites and/or because the data set violates the assumptions of the models used for the phylogenetic reconstructions, as was also noted in previous studies of sequences with biased G+C content values (Galtier and Gouy 1995, 1998; Eyre-Walker 1998; Foster and Hickey 1999). In many cases, the assumed incorrect internal nodes are highly supported by bootstrap analysis (>95%), as also observed for dubious internal branches in other analyses where no correction was made for compositional biases (Galtier and Gouy 1995). Thus, the assumption that atypical gene trees correspond to horizontal gene transfers (Gogarten, Doolittle, and Lawrence 2002; Doolittle et al. 2003) should be taken with great caution, in particular when there is a compositional bias in the data set.
The elevated substitution rates in endosymbiont genomes relative to those of free-living species may be explained by the loss of genes involved in repair and recombination pathways, leading to higher mutation rates (Tamas et al. 2002; Wernegreen 2002; Dale et al. 2003). If the rate difference is solely the result of a mutation rate difference (Itoh, Martin, and Nei 2002), all proteins will be affected to similar extent. Also if the rate acceleration is influenced by slightly deleterious mutations or by a general relaxation of selection (Moran 1996; Andersson and Kurland 1998; Brynnel et al. 1998; Wernegreen and Moran 1999; Woolfit and Bromham 2003), the rate increase will be similar for all proteins. However, under this scenario, higher dN/dS ratios are expected for the endosymbiont populations. Because of selective constraints on synonymous substitutions in E. coli, these ratios are difficult to compare, and previous attempts to compare dN/dS ratios for endosymbionts and enterics have given discrepant results (Clark, Moran, and Baumann 1999; Ochman, Elwyn, and Moran 1999; Itoh, Martin, and Nei 2002).
If functional constraints are relaxed only for some genes in the endosymbiont populations, the rate acceleration is expected to differ among functional classes of genes. Genes coding for proteins involved in flagellar biosynthesis and protein export seem to evolve under relaxed functional constraints in the endosymbiont genomes, as indicated by atypically high levels of amino acid divergence for these genes (Tamas et al. 2002). A more direct approach to infer amino acid rate variation among genes is to compare branch lengths for endosymbionts and free-living lineages in individual gene trees (Itoh, Martin, and Nei 2002). However, until the correct species topology can be reconstructed for all genes in B. aphidicola, including those that are most rapidly evolving, no conclusive results can be obtained from these studies (Itoh, Martin, and Nei 2002).
Less focus has been placed on the role of the host, although simulation studies point at the population size of the host as the most important factor determining the fixation rate of mutations in endosymbiont genomes (Rispe and Moran 2000). According to the results of this simulation, fixation rates for mutations in endosymbiont genomes should decrease with increasing host population sizes (Rispe and Moran 2000). The interpretation is that selection on the host may counterbalance the accumulation of deleterious mutations through Mullers ratchet, thereby delaying fitness decline and extinction of endosymbiont populations (Rispe and Moran 2000). The relatively small increase of substitution rates for proteins involved in amino acid biosynthesis in B. aphidicola (Itoh, Martin, and Nei 2002) provides some support for this view. If the evolution of endosymbiont genes is indeed influenced by selective constraints imposed by the host, the relative rate enhancement is expected to be variable among endosymbionts with different host-interaction profiles. On the other hand, if the acceleration of genomic evolution is solely influenced by enhanced mutation rates, no such correlation is to be expected.
To examine the influence of host-selection on endosymbiont gene evolution, we compared the relative rate variation among genes for endosymbionts with different hosts. Because the test is based on a comparative analysis of branch lengths for the endosymbionts of aphids (B. aphidicola) with those of tsetse flies (W. glossinidia), it is important that the endosymbionts are correctly placed in the -proteobacterial tree. It was previously thought that the symbiosis event for the tsetse fly endosymbionts dates back approximately 40 Myr and was independent from that of the aphid endosymbionts (Moran and Wernegreen 2000), estimated to 150 to 250 MYA. However, a more recent analysis suggests that B. aphidicola and W. glossinidia are sister clades and have originated from a single symbiosis event (Lerat, Daubin, and Moran 2003), which is consistent with our phylogenetic analysis of both single and concatenated gene alignments (fig. 5).
The results of our analysis suggest that the relative rate increase differs among endosymbiont genes in a manner that is best explained by host-level selection. For example, the higher genomic fraction and the lower rate of sequence evolution for genes involved in amino acid biosynthesis in B. aphidicola is consistent with host selection for amino acid biosynthetic genes in aphid endosymbionts (Shigenebou et al. 2000; Tamas et al. 2002). Likewise, endosymbiont supply of B-vitamins to the tsetse flies is consistent with the retention of a greater proportion of genes involved in cofactor biosynthesis and a relatively lower rate of sequence evolution for these genes in the tsetse fly endosymbionts. Another difference is that the B. aphidicola genome only contains 50% of the genes coding cell-structure components present in the W. glossinidia genome, possibly because of a higher requirement for a functional cell wall during transmission to larvae in the tsetse fly endosymbionts (Akman et al. 2002). Also, for genes encoding cell wall components, the relative rate acceleration in the tsetse fly endosymbionts is lower than for other genes.
The inverse correlation between gene retention and substitution rates suggests that the rate of sequence evolution in endosymbiont genomes is influenced by selective constraints acting at the level of the host. Thus, we have shown here that host-beneficial genes accumulate mutations at reduced levels in relation to the metabolic dependencies of hosts and endosymbionts. To our knowledge, this is the first demonstration of species-specific, host-selected retardation of sequence evolution in endosymbiont genomes.
![]() |
Acknowledgements |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
![]() |
Footnotes |
---|
2 Present address: Center for Genomics and Bioinformatics, Karolinska Institute, Stockholm, Sweden.
Jonathan Eisen, Associate Editor
![]() |
Literature Cited |
---|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
---|
Akman, L., A. Yamshita, H. Watanabe, K. Oshima, T. Shiba, M. Hattori, and S. Aksoy. 2002. Genome sequence of the endocellullar obligate symbiont of tsetse flies, Wigglesworthia glossinidia. Nat. Genet. 32:402-407.[CrossRef][ISI][Medline]
Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389-3402.
Andersson, S. G. E., and C. G. Kurland. 1998. Reductive evolution of resident genomes. Trends Microbiol. 6:525-536.
Brynnel, E. U., C. G. Kurland, N. A. Moran, and S. G. E. Andersson. 1998. Evolutionary rates for tuf genes in endosymbionts of aphids. Mol. Biol. Evol. 15:574-582.[Abstract]
Clark, M. A., N. A. Moran, and P. Baumann. 1999. Sequence evolution in bacterial endosymbionts having extreme base compositions. Mol. Biol. Evol. 16:1586-1598.[Abstract]
Dale, C., B. Wang, N. Moran, and H. Ochman. 2003. Loss of DNA recombinational repair enzymes in the initial stages of genome degradation. Mol. Biol. Evol. 20:1188-1194.
Daubin, V., M. Gouy, and G. Perriere. 2002. A phylogenomic approach to bacterial phylogeny: evidence of a core of genes sharing common history. Genome Res. 12:1080-1090.
Daubin, V., N. A. Moran, and H. Ochman. 2003. Phylogenetics and the cohesion of bacterial genomes. Science 301:829-832.
Doolittle, W. F. 1999. Phylogenetic classification and the universal tree. Science 284:2124-2128.
Doolittle, W. F., Y. Boucher, C. L. Nesbo, C. J. Douady, J. O. Andersson, and A. J. Roger. 2003. How big is the iceberg of which organellar genes in nuclear genomes are but the tip? Philos. Trans. R. Soc. Lond. B Biol. Sci. 358:39-57.[CrossRef][ISI][Medline]
Eisen, J. A., and C. M. Fraser. 2003. Phylogenomics: intersection of evolution and genomics. Science 300:1706-1707.
Eyre-Walker, A. 1998. Problems with parsimony in sequences of biased base composition. J. Mol. Evol. 47:686-690.[ISI][Medline]
Foster, P. G., and D. Hickey. 1999. Compositional bias may affect both DNA-based and protein-based phylogenetic reconstructions. J. Mol. Evol. 48:284-290.[ISI][Medline]
Galtier, N., and M. Gouy. 1995. Inferring phylogenies from DNA sequences of unequal base compositions. Proc. Natl. Acad. Sci. USA 92:11317-11321.[Abstract]
Galtier, N., and M. Gouy. 1998. Inferring pattern and process: maximum-likelihood implementation of a non-homogeneous model of DNA sequence evolution for phylogenetic analysis. Mol. Biol. Evol. 15:871-879.[Abstract]
Galtier, N., M. Gouy, and C. Gautier. 1996. SEAVIEW and PHYLO_WIN: two graphic tools for sequence alignment and molecular phylogeny. Comput. Appl. Biosci. 12:543-548.[Abstract]
Gogarten, J. P., W. F. Doolittle, and J. G. Lawrence. 2002. Prokaryotic evolution in the light of gene transfer. Mol. Biol. Evol. 19:2226-2238.
Herbeck, J. T., D. P. Wall, and J. J. Wernegreen. 2003. Gene expression level influence amino acid usage, but not codon usage, in the tsetse fly endosymbiont Wigglesworthia. Microbiology 149:2585-2596.
Itoh, T., W. Martin, and M. Nei. 2002. Acceleration of genomic evolution caused by enhanced mutation rate in endocellular symbionts. Proc. Natl. Acad. Sci. USA 99:12944-12948.
Kurland, C. G., B. Canback, and O. G. Berg. 2003. Horizontal gene transfer: a critical view. Proc. Natl. Acad. Sci. USA 100:9658-9662.
Lerat, E., V. Daubin, and N. A. Moran. 2003. From gene trees to organismal phylogeny in prokaryotes: The case of the -Proteobacteria. PloS Biology 1:101-109.[CrossRef][ISI]
Moran, N. A. 1996. Accelerated evolution and Muller's ratchet in endosymbiotic bacteria. Proc. Natl. Acad. Sci. USA 93:2873-2878.
Moran, N. A., and A. Mira. 2001. The process of genome shrinkage in the obligate symbiont Buchnera aphidicola. Genome Biol. 2:1-12.
Moran, N. A., and J. J. Wernegreen. 2000. Lifestyle evolution in symbiotic bacteria: insights from genomics. Trends Ecol. Evol. 15:321-326.[CrossRef][ISI][Medline]
Ochman, H., S. Elwyn, and N. A. Moran. 1999. Calibrating bacterial evolution. Proc. Natl. Acad. Sci. USA 96:12638-12643.
Palacios, C., and J. J. Wernegreen. 2002. A strong effect of AT mutational bias on amino acid usage in Buchnera is mitigated at high-expression genes. Mol. Biol. Evol. 19:1575-1584.
Posada, D., and K. A. Crandall. 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14:817-818.[Abstract]
Rispe, C., F. Delmotte, R. C. H. J. van Ham, and A. Moya. 2004. Mutational and selective pressures on codon and amino acid usage in Buchnera, endosymbiotic bacteria of aphids. Genome Res. 14:44-53.
Rispe, C., and N. A. Moran. 2000. Accumulation of deleterious mutations in endosymbionts: Muller's ratchet with two levels of selection. Am. Nat. 156:425-441.[CrossRef][ISI]
Rokas, A., B. L. Williams, N. King, and S. B. Carroll. 2003. Genome-scale approaches to resolving incongruence in molecular phylogenies. Nature 425:798-804.[CrossRef][ISI][Medline]
Saitou, N., and M. Nei. 1987. The neghbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.[Abstract]
Shigenebou, S., H. Watanabee, M. Hattori, Y. Sakakai, and H. Ishikawa. 2000. Genome sequence of the endocellular bacterial symbiont of aphids Buchnera sp. APS. Nature 407:81-86.[CrossRef][ISI][Medline]
Sicheritz-Ponten, T., and S. G. E. Andersson. 2001. A phylogenomic approach to microbial evolution. Nucleic Acids Res. 29:545-552.
Snel, B., Bork. P., and M. A. Huynen. 2002. Genomes in flux: the evolution of archaeal and proteobacterial gene content. Genome Res. 12:17-25.
Swofford, D. L. 1998. PAUP*: phylogenetic analysis using parsimony (*and other methods). Version 4. Sinauer Associates, Sunderland, Mass.
Tamas, I., L. Klasson, B. Canbäck, A. K. Näslund, A.-S. Eriksson, J. J. Wernegreen, J. P. Sandström, N. A. Moran, and S. G. E. Andersson. 2002. 50 million years of genomic stasis in endosymbiotic bacteria. Science 296:2376-2379.
Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nuclei Acids Res. 22:4683-4680.
van Ham, R. C., J. Kamerbeck, and C. Palacios, et al. (15 co-authors). 2003. Reductive genome evolution in Buchnera aphidicola. Proc. Natl. Acad. Sci. USA 100:581-586.
Wernegreen, J. J. 2002. Genome evolution in bacterial endosymbionts of insects. Nat. Rev. Genet. 3:850-860.[CrossRef][ISI][Medline]
Wernegreen, J. J., and N. A. Moran. 1999. Evidence for genetic drift in endosymbionts (Buchnera): analyses of protein-coding genes. Mol. Biol. Evol. 16:83-97.[Abstract]
Woolfit, M., and L. Bromhall. 2003. Increased rates of sequence evolution in endosymbiotic bacteria and fungi with small effective population sizes. Mol. Biol. Evol. 20:1545-1555.
Yang, Z. 1997. PAML: a program for phylogenetic analysis by maximum likelihood. CABIOS 13:555-556.[Medline]