Nonhomogeneous Model of Sequence Evolution Indicates Independent Origins of Primary Endosymbionts Within the Enterobacteriales (γ-Proteobacteria)

Joshua T. Herbeck, Patrick H. Degnan, and Jennifer J. Wernegreen

Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, Massachusetts

Correspondence: E-mail: herbeck{at}u.washington.edu.


    Abstract
 
Standard methods of phylogenetic reconstruction are based on models that assume homogeneity of nucleotide composition among taxa. However, this assumption is often violated in biological data sets. In this study, we examine possible effects of nucleotide heterogeneity among lineages on the phylogenetic reconstruction of a bacterial group that spans a wide range of genomic nucleotide contents: obligately endosymbiotic bacteria and free-living or commensal species in the γ-Proteobacteria. We focus on AT-rich primary endosymbionts to better understand the origins of obligately intracellular lifestyles. Previous phylogenetic analyses of this bacterial group point to the importance of accounting for base compositional variation in estimating relationships, particularly between endosymbiotic and free-living taxa. Here, we develop an approach to compare susceptibility of various phylogenetic reconstruction methods to the effects of nucleotide heterogeneity. First, we identify candidate trees of γ-Proteobacteria groEL and 16S rRNA using approaches that assume homogeneous and stationary base composition, including Bayesian, maximum likelihood, parsimony, and distance methods. We then create permutations of the resulting candidate trees by varying the placement of the AT-rich endosymbiont Buchnera. These permutations are evaluated under the nonhomogeneous and nonstationary maximum likelihood model of Galtier and Gouy, which allows equilibrium base content to vary among examined lineages. Our results show that commonly used phylogenetic methods produce incongruent trees of the Enterobacteriales, and that the placement of Buchnera is especially unstable. However, under a nonhomogeneous model, various groEL and 16S rRNA phylogenies that separate Buchnera from other AT-rich endosymbionts (Blochmannia and Wigglesworthia) have consistently and significantly higher likelihood scores. Blochmannia and Wigglesworthia appear to have evolved from secondary endosymbionts, and represent an origin of primary endosymbiosis that is independent from Buchnera. This application of a nonhomogeneous model offers a computationally feasible way to test specific phylogenetic hypotheses for taxa with heterogeneous and nonstationary base composition.

Key Words: nucleotide composition • Buchnera • insect endosymbionts • Enterobacteriales • phylogeny


    Introduction
 
Phylogenetic inference can be confounded by various evolutionary factors, including unequal substitution rates among sites (Yang 1993), unequal transition and transversion rates (Kimura 1980), and substitution saturation among excessively divergent taxa. In addition, most nucleotide substitution models assume homogeneity of nucleotide composition among taxa (Felsenstein 1988), although this assumption is easily violated in nonsimulated data sets. While it is known that variable base composition among taxa can distort estimations of substitution rates (Tourasse and Li 1999), the effects of variable base composition on phylogenetic analysis are not entirely clear (Mooers and Holmes 2000; Conant and Lewis 2001). Simulation studies (Conant and Lewis 2001; Rosenberg and Kumar 2003) suggest that nucleotide heterogeneity among taxa does not negatively affect parsimony, distance, and likelihood methods. Conant and Lewis (2001) found that only extreme nucleotide bias and long branch lengths can lead parsimony to incorrect phylogenetic inference. Yet, analyses of biological data sets have shown that parsimony (Loomis and Smith 1990; Steel, Lockhart, and Penny 1995), distance (Lockhart et al. 1994; Galtier and Gouy 1995), and maximum likelihood methods (Chang and Campbell 2000) can mistakenly group unrelated species with similar GC contents. Thus, base composition heterogeneity is considered a potential problem for phylogenetic reconstruction.

Several approaches have been proposed to account for variable base content among taxa. Hasegawa (Hasegawa and Hashimoto 1993; Hasegawa et al. 1993) found that nucleotide heterogeneity in 16S rRNA biased the phylogeny of deep diverging eukaryotes and suggested that amino acid sequences are more reliable. However, amino acid composition is also affected by nucleotide compositional bias (Foster, Jermiin, and Hickey 1997; Singer and Hickey 2000). Attempts to correct for this bias include LogDet (Lockhart et al. 1994), a distance method that transforms the substitution matrix to produce additive distances. The LogDet method does not consider rate variation among sites, and, similar to other distance methods, it performs poorly in analyses of taxa with moderate amounts of substitution saturation (Mooers and Holmes 2000). Galtier and Gouy (1995, 1998) have developed a nonhomogeneous Markov model of nucleotide substitution that allows equilibrium base composition to vary among lineages. This method modifies Tamura's (1992) substitution model of unequal transition and transversion rates and unequal nucleotide content (GC and AT), to include variable substitution rates among sites and variable GC content among branches. This model has been used in a maximum likelihood framework to estimate ancestral GC content of thermophilic organisms (Galtier, Tourasse, and Gouy 1999), and to infer phylogenies of Drosophilidae taxa (Tarrio, Rodriguez-Trelles, and Ayala 2001) and weevil endosymbiotic bacteria (Lefèvre et al. 2004).
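As a rough illustration of the substitution model described above, the following sketch builds the Tamura (1992) rate matrix for a given transition/transversion ratio and equilibrium G+C content, then evaluates it with two different G+C values, as a nonhomogeneous model would on different branches. The function name, the parameter values (kappa of 2.0; G+C of 0.26 and 0.50), and the omission of among-site rate variation are assumptions made for illustration only; this is not the NHML implementation.

```python
def t92_rate_matrix(kappa, theta):
    """Unscaled Tamura (1992) rate matrix, rows and columns ordered A, C, G, T.

    kappa: transition/transversion rate ratio (assumed parameterization).
    theta: equilibrium G+C content, between 0 and 1.
    Returns (Q, pi), where pi is the equilibrium frequency vector.
    """
    bases = "ACGT"
    # T92 assumes freq(A) = freq(T) and freq(C) = freq(G).
    freq = {"A": (1 - theta) / 2, "T": (1 - theta) / 2,
            "C": theta / 2, "G": theta / 2}
    transitions = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}
    q = []
    for i in bases:
        row = []
        for j in bases:
            if i == j:
                row.append(0.0)  # placeholder, filled in below
            else:
                rate = kappa if (i, j) in transitions else 1.0
                row.append(rate * freq[j])
        # Diagonal entry makes each row sum to zero.
        row[bases.index(i)] = -sum(row)
        q.append(row)
    return q, [freq[b] for b in bases]


# Hypothetical G+C values for an AT-rich branch and a moderate branch.
for label, gc in (("AT-rich branch", 0.26), ("moderate branch", 0.50)):
    q_matrix, pi = t92_rate_matrix(kappa=2.0, theta=gc)
    print(label, [round(p, 3) for p in pi])
```

Because only theta changes between branches, the nonhomogeneous extension adds one G+C parameter per branch while the transition/transversion ratio is shared.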

Such nonhomogeneous models may be particularly important in inferring phylogenies for bacteria, which show an exceptionally wide range of base compositional biases (ranging from ~25% to 75% GC content; Sueoka 1962). Intracellular bacterial mutualists and pathogens have the lowest known genomic GC contents, and they represent several phylogenetically independent lineages (Moran and Wernegreen 2000; Shigenobu et al. 2000; Charles, Heddi, and Rahbe 2001; Moran 2002). Their AT-richness is most extreme at third codon positions and intergenic spacers, suggesting a strong effect of directional mutational pressure, or biased changes between GC and AT pairs (Muto and Osawa 1987; Sueoka 1988; Sueoka 1992). The Enterobacteriales includes free-living or gut-associated species (Escherichia coli, Salmonella typhimurium, Shigella flexneri, and Yersinia pestis) with moderate base compositions, as well as endosymbionts that form primary (obligate) and secondary (facultative, transient) associations with insects and that are relatively AT-biased (Moran and Telang 1998; Baumann, Moran, and Baumann 2000). The AT-bias of several primary endosymbionts within the Enterobacteriales is quite severe, at ~26% GC (Shigenobu et al. 2000) in the aphid endosymbiont Buchnera, ~22% GC in the tsetse fly endosymbiont Wigglesworthia (Akman et al. 2002), and 27% GC in the ant endosymbiont Blochmannia (Gil et al. 2003).

In addition to their extreme AT-bias, primary endosymbionts are characterized by severe genome reduction. Among Buchnera, Blochmannia, and Wigglesworthia, genome sizes range from 450 kb (Gil et al. 2002) to ~800 kb (Wernegreen, Lazarus, and Degnan 2002) compared to the 4.6 to 5.3 Mb for Escherichia coli genomes (Bergthorsson and Ochman 1995). Like other intracellular bacteria, these endosymbionts also experience fast rates of sequence evolution (Moran 1996; Woolfit and Bromham 2003), especially at nonsynonymous sites (Clark, Moran, and Baumann 1999; Wernegreen and Moran 1999), deleterious changes at the 16S rRNA gene (Lambert and Moran 1998), and AT-biased amino acid changes (Moran 1996; Clark, Moran, and Baumann 1999; Palacios and Wernegreen 2002; Herbeck, Wall, and Wernegreen 2003; Rispe et al. 2004). Mechanisms driving these shared features of endosymbiont genomes may include a combination of relaxed selection in the host intracellular environment, strong genetic drift resulting in part from the repeated vertical transmission through insect host generations and decreased effective population sizes (Moran 1996; Funk, Wernegreen, and Moran 2001; Abbot and Moran 2002; Mira and Moran 2002; Herbeck et al. 2003), and increased background mutation rates resulting from the loss of DNA repair loci during genome reduction (Mira, Ochman, and Moran 2001).

The phylogenies of the γ-Proteobacteria and Enterobacteriales specifically have received considerable attention, owing in part to their ecological importance, the medical relevance of several species, and the diverse lifestyles this group represents (Lawrence, Ochman, and Hartl 1991; Sproer et al. 1999; Wertz et al. 2003; Canbäck, Tamas, and Andersson 2004). Of particular interest for comparative genomic studies is the relative position of Buchnera, Blochmannia, and Wigglesworthia, as the full genome sequences of these taxa are now available (Shigenobu et al. 2000; Akman et al. 2002; Tamas et al. 2002; Van Ham et al. 2003; Gil et al. 2003). However, the phylogenetic position of these and other AT-rich endosymbionts has proved difficult to recover and often varies among studies. Published phylogenies that include Buchnera, Wigglesworthia, and/or Blochmannia often group them as sister taxa or within a clade that includes only endosymbionts (e.g., Schröder et al. 1996; Heddi et al. 1998; Spaulding and von Dohlen 1998; Gil et al. 2003; Lerat, Daubin, and Moran 2003; Woolfit and Bromham 2003; Canbäck, Tamas, and Andersson 2004). One exception is a phylogenetic study that considered heterogeneity in nucleotide biases in this group and that suggested a paraphyletic relationship among several primary endosymbionts, grouping Buchnera apart from Blochmannia and Wigglesworthia (Charles, Heddi, and Rahbe 2001). In addition, a tree estimated under a nonhomogeneous model (NJ-nh) in Lerat, Daubin, and Moran (2003; fig. 2) positioned Buchnera and Wigglesworthia in separate clades, prompting these authors to consider alternative topologies in which the two endosymbionts were not sister taxa. However, these candidate topologies were rejected as significantly less likely than topologies in which the two endosymbionts are sister taxa, based on an SH test (Shimodaira and Hasegawa 1999) implemented with a homogeneous model.



 
FIG. 2.— Phylogenies of the groEL gene (codon positions 1 and 2) for select γ-Proteobacteria inferred using standard phylogenetic methods. Support values for the MB phylogeny are given in posterior probabilities for each node, with only probabilities of 70% or greater shown. Support values for the ML-gtr, MP, and NJ-nh trees are given in bootstrap determined from 500 replicates, with only percentages of 70% or greater shown. Underlined taxa are endosymbiotic bacteria of insects, taxa in boldface are primary endosymbionts. The NJ-k2p and LogDet phylogenies are not shown due to their extremely low –ln(L)nh scores and topologies generally inconsistent with those found by all other methods; both trees group the endosymbionts of interest together.

 
Our goal in this study is to build upon these previous analyses of the Enterobacteriales by further exploring possible effects of base-compositional biases, and by developing a new approach to accounting for such biases in phylogenetic analysis. We focus primarily on the relationships among select AT-rich primary endosymbionts, to better understand the origins of endosymbiosis and the evolution of their genomic features. Although previous studies have employed a nonhomogeneous model, we use a nonhomogeneous model to systematically evaluate possible candidate trees that were generated by homogeneous methods. First, we develop phylogenetic hypotheses with commonly used methods, including Bayesian inference, maximum likelihood, parsimony, and distance approaches. We then evaluate the resulting phylogenies and multiple permutations of them using the nonhomogeneous maximum likelihood model of Galtier and Gouy (1995).

Our results consistently support topologies that group Buchnera with gut-associated enteric species, separate from a clade that includes Wigglesworthia, Blochmannia, and many secondary (transient) endosymbionts of insects. Although the two genes considered here (groEL and 16S rRNA) produce slightly different topologies of the Enterobacteriales, the phylogenetic independence of Buchnera is well supported. We suggest the combination of homogeneous models to identify candidate trees and nonhomogeneous models to evaluate those topologies offers a robust method to test phylogenetic hypotheses for groups that vary dramatically in base composition.


    Materials and Methods
 
Taxa and Sequences
We obtained sequences of γ-Proteobacteria (including Enterobacteriales and relatives) chaperonin groEL and 16S ribosomal RNA (rRNA) from GenBank. The groEL data set included 23 γ-Proteobacteria taxa representing nine endosymbionts and 14 free-living bacteria (table 1). We aligned groEL using ClustalW (Chenna et al. 2003) based on amino acid translations of nucleotide sequences and subsequently excluded third positions due to saturation. The groEL alignment was 1,572 bp in total, and 1,048 bp for first and second codon positions alone. The 16S rRNA analysis included 32 taxa (table 2). We used the Ribosomal Database Project II Sequence Aligner (Cole et al. 2003) to align the 16S rRNA sequences, and edited them by hand in MacClade v.4.05 (Maddison and Maddison 2002). The final 16S rRNA data set, after excluding all regions with ambiguous alignment, was 1,320 bp. We tested for heterogeneity in base composition among taxa using the chi-squared test implemented in PAUP* 4.0b10 (Swofford 2002), examining all nucleotide sites and only variable sites in both groEL and 16S rRNA data sets. Both sequence alignments are available in the online resources.
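The chi-squared test of compositional homogeneity mentioned above can be sketched as a standard contingency-table calculation on per-taxon base counts. The text does not describe PAUP*'s exact implementation, so the form below is an assumption, and the function name and the base counts are hypothetical placeholders rather than values from tables 1 and 2.

```python
def composition_chi2(counts):
    """Chi-squared statistic for homogeneity of base composition among taxa.

    counts: dict mapping taxon name -> [nA, nC, nG, nT].
    Returns (statistic, degrees_of_freedom).
    """
    rows = list(counts.values())
    col_totals = [sum(r[k] for r in rows) for k in range(4)]
    grand_total = sum(col_totals)
    stat = 0.0
    for r in rows:
        n_sites = sum(r)
        for k in range(4):
            # Expected count if every taxon shared the pooled composition.
            expected = n_sites * col_totals[k] / grand_total
            stat += (r[k] - expected) ** 2 / expected
    df = (len(rows) - 1) * 3  # (taxa - 1) * (bases - 1)
    return stat, df


# Hypothetical counts: one AT-rich taxon and two moderate taxa.
toy_counts = {
    "taxon_at_rich": [380, 130, 140, 350],
    "taxon_moderate_1": [250, 250, 260, 240],
    "taxon_moderate_2": [245, 255, 255, 245],
}
print(composition_chi2(toy_counts))
```

The statistic treats taxa as independent samples, so it ignores shared ancestry among sequences.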


 
Table 1 The 23-Taxa groEL Data Set Used for Phylogenetic Analysis, with GenBank Accession Numbers, and Nucleotide Content for First and Second Codon Positions (1048 bp) and for Variable Sites at First and Second Codon Positions (327 bp)

 

 
Table 2 The 32-Taxa 16S rRNA Data Set Used for Phylogenetic Analysis, with GenBank Accession Numbers and Nucleotide Content for All Sites (1320 bp) and Variable Sites (504 bp)

 
Phylogenetic Analysis
Our approach was to estimate candidate starting trees using standard methods of phylogeny reconstruction that search tree space extensively. We then varied the placement of Buchnera across each starting tree and evaluated all possible permutations under a nonhomogeneous maximum likelihood model of sequence evolution (Galtier and Gouy 1995) (fig. 1).



 
FIG. 1.— A schematic diagram of the phylogenetic approach used in this study. (1) We used two data sets, 32 taxa of 16S rRNA and 23 taxa of groEL. (2) Standard phylogenetic methods created starting trees. (3) We generated multiple novel trees by varying the placement of Buchnera on the starting trees. (4) The nonhomogeneous maximum likelihood model of Galtier and Gouy (1995) was used to evaluate all novel trees. (5) We ranked all novel trees by likelihood under ML-nh. (6) RELL analysis was used to establish significant differences among topologies.

 
Starting Phylogenies
We used six phylogenetic approaches to create alternative starting hypotheses of γ-Proteobacteria evolution for the groEL and 16S rRNA data sets: maximum parsimony (MP), GTR maximum likelihood (ML-gtr), Bayesian inference (MB), and Neighbor-Joining under three models of sequence evolution: the Kimura 2-parameter (NJ-k2p), the LogDet transformation, and the nonhomogeneous model T92+Γ+varGC (NJ-nh). The free-living Pseudomonas aeruginosa was designated the root taxon for all trees, after the initial phylogenetic analyses were completed.

Maximum likelihood analyses of both groEL and 16S rRNA were performed in PAUP* 4.0b10 (Swofford 2002) based on a general time reversible (GTR) model with a Γ distribution and a proportion of invariable sites estimated from the data, as selected using ModelTest version 3.06 (Posada and Crandall 1998) by the Akaike Information Criterion. Bootstrap support was calculated using 500 replicates, each using 10 random taxon addition replicates with tree bisection and reconnection (TBR) branch swapping. All maximum likelihood heuristic searches and bootstrap analyses were implemented in parallel on a Beowulf cluster utilizing the clusterpaup program (A. G. McArthur, jbpc.mbl.edu/computing). We conducted Bayesian analysis with MrBayes v.3.0 (Ronquist and Huelsenbeck 2003) using the GTR substitution model with base frequencies and substitution rate matrix estimated from the data. For groEL analysis we used 3 million Markov chain Monte Carlo (MCMC) generations with four parallel chains (one cold, three heated), with trees sampled every 50 generations and a 10,000 tree burn-in period. For 16S rRNA analysis, we included 10 million MCMC generations of four chains, with trees sampled every 50 generations, with a 50,000 tree burn-in. Heuristic parsimony analyses included TBR branch-swapping and all characters unordered and equally weighted.

We obtained a NJ-nh topology from a distance matrix estimated using Galtier and Gouy's (1995) T92+Γ+varGC nonhomogeneous model, implemented in the software Phylo_Win (Galtier, Gouy, and Gautier 1996). Using PAUP* 4.0b10 (Swofford 2002), we also estimated NJ trees based on the LogDet transformation (Lockhart et al. 1994), a method designed to be robust to variable mutational tendencies among lineages, and the Kimura 2-parameter model (NJ-k2p). Bootstrap support was calculated using 500 replications for all distance methods.

Permutation of Starting Trees by Varying Position of Buchnera
The five homogeneous analyses and NJ-nh generated alternative phylogenetic hypotheses that we used as starting trees. To focus on the relationship between Buchnera and other AT-rich primary endosymbionts, we created permutations of these starting trees for all possible placements of the Buchnera clade (lineages of Buchnera from the following aphids: A. pisum, B. pistaciae, and S. graminum). We focused on Buchnera because it is a relatively well characterized primary endosymbiont, and its placement was the most labile in the starting phylogenies. These permutations were produced in PAUP* by "generating all trees" constrained to the best tree for a given reconstruction method, altered so that the Buchnera clade was a basal polytomy (and thus free to vary in location across the constraint tree). All possible placements of Buchnera generated 234 candidate trees for the groEL data set (39 possible placements of Buchnera on each of six starting trees) and 342 candidate trees for the 16S data set (57 possible placements of Buchnera on each of six starting trees).
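The text does not spell out how the placement counts arise. One counting argument that reproduces them, offered here as an assumption, treats the three Buchnera lineages as a single unit that can be attached to any branch of a rooted, fully resolved backbone of the remaining taxa, or placed as sister to the whole backbone. The function name below is hypothetical.

```python
def placements_on_rooted_backbone(n_backbone_leaves):
    """Positions available for attaching one clade to a rooted binary tree."""
    n_branches = 2 * n_backbone_leaves - 2  # branches below the root
    return n_branches + 1                   # plus sister to the whole backbone


BUCHNERA_LINEAGES = 3   # A. pisum, B. pistaciae, and S. graminum hosts
N_STARTING_TREES = 6    # MP, ML-gtr, MB, NJ-k2p, LogDet, NJ-nh

for gene, n_taxa in (("groEL", 23), ("16S rRNA", 32)):
    per_tree = placements_on_rooted_backbone(n_taxa - BUCHNERA_LINEAGES)
    print(gene, per_tree, per_tree * N_STARTING_TREES)
```

Under these assumptions the backbone has 20 leaves for groEL and 29 for 16S rRNA, which gives 39 and 57 placements per starting tree and 234 and 342 candidate trees in total.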

Evaluation of Alternative Phylogenies under a Nonhomogeneous Model of Nucleotide Substitution
For each starting tree and all permutations of those trees, we used the eval_nh program from the NHML 3.0 package (Galtier, Tourasse, and Gouy 1999) to evaluate the likelihood under Galtier and Gouy's (1995) T92+Γ+varGC nonhomogeneous model (ML-nh). We then ranked all topologies based on the resulting likelihood scores. We noted the placement of Buchnera relative to the other (AT-rich) primary endosymbionts, particularly Blochmannia and Wigglesworthia, and categorized trees as "polyphyletic" (if Buchnera grouped apart from Blochmannia and Wigglesworthia) or "monophyletic" (if Buchnera grouped with Blochmannia and Wigglesworthia).

Essentially, this approach was an approximate tree-searching method to create phylogenetic hypotheses of interest for subsequent evaluation under the nonhomogeneous model. We chose this approach rather than the tree-searching algorithm available in the NHML program (star_nh) for three reasons: (1) tree-searching under star_nh was prohibitively computationally intensive, given the parameter-rich model and the number of taxa we considered; (2) initial phylogenies based on MP, ML-gtr, Bayesian, and NJ approaches, and permutations of those trees, offered a more comprehensive and stable set of possible topologies than did the branch-swapping method implemented by star_nh (unpublished data); (3) because our primary interest was the evolution of AT-rich primary endosymbionts within the Enterobacteriales, we wished to consider all possible placements of Buchnera across reasonable starting trees.

We evaluated the statistical significance of differences in likelihood values among competing phylogenies with Kishino and Hasegawa's (1989) resampling estimated log likelihood (RELL) procedure implemented by PAUP* 4.0b10. This test entails the bootstrapping of site-specific log-likelihood values estimated under a particular substitution model, in this case Galtier and Gouy's (1995) nonhomogeneous model (ML-nh). We also performed Mann-Whitney U-tests to identify significant differences in –ln(L)nh distributions between trees in which primary endosymbionts are polyphyletic versus monophyletic (as defined above). This allowed us to test whether trees that position Buchnera apart from other primary endosymbionts have consistently and significantly better likelihood scores than trees that group primary endosymbionts together. We tested the suitability of the ML-nh model for the groEL and 16S rRNA data sets using likelihood ratio tests (Felsenstein 1981) on nested substitution models, using maximum parsimony topologies.
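The resampling step behind a RELL comparison of two topologies can be sketched as follows. This is a simplified pairwise test in the spirit of Kishino and Hasegawa's procedure, not PAUP*'s implementation, which may differ in its details. The site-wise log-likelihoods are assumed to have been computed elsewhere under the chosen model, and the function name, replicate count, and seed are hypothetical.

```python
import random


def rell_kh_test(site_lnl_a, site_lnl_b, n_boot=1000, seed=0):
    """Pairwise RELL test on site-wise log-likelihoods of two trees.

    Returns (observed lnL difference, two-sided bootstrap P value).
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(site_lnl_a, site_lnl_b)]
    n_sites = len(diffs)
    observed = sum(diffs)
    boot = []
    for _ in range(n_boot):
        # Resample sites with replacement, reusing the fixed site
        # log-likelihoods instead of re-optimizing each replicate.
        boot.append(sum(diffs[rng.randrange(n_sites)] for _ in range(n_sites)))
    # Center the bootstrap distribution to mimic the null of no difference.
    mean_boot = sum(boot) / n_boot
    extreme = sum(1 for b in boot if abs(b - mean_boot) >= abs(observed))
    return observed, extreme / n_boot
```

Reusing the stored site log-likelihoods is what makes the procedure cheap enough to apply across hundreds of candidate topologies.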


    Results
 
Nucleotide Composition
For both groEL and 16S rRNA, base composition at first and second codon positions is significantly heterogeneous among taxa (although only at variable sites in 16S rRNA) (table 1, table 2). As expected, endosymbiotic bacteria are relatively AT-rich compared to the majority of free-living bacteria. Although the chi-squared test of homogeneity used here ignores potential phylogenetic structure and violates the assumption of independent comparisons, the result suggests that phylogenetic analysis should account for variable GC content among taxa. Consistent with this observed heterogeneity, likelihood ratio tests (table 3) show that ML-nh is a significantly better fit than other evolutionary models for both groEL and 16S rRNA.
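A likelihood ratio test between nested substitution models compares twice the difference in log-likelihood against a chi-squared distribution whose degrees of freedom equal the number of extra free parameters. The sketch below shows that calculation in general terms. The log-likelihoods and the parameter difference in the example are hypothetical and are not taken from table 3.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail probability of a chi-squared variable with integer df."""
    x = max(stat, 0.0) / 2.0
    half_df = df / 2.0
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(x))  # Q(1/2, x)
    else:
        a, q = 1.0, math.exp(-x)             # Q(1, x)
    # Recurrence Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1).
    while a < half_df - 1e-9:
        q += x ** a * math.exp(-x) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(lnl_null, lnl_alt, extra_params):
    """Return (test statistic, P value) for two nested models."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, extra_params)


# Hypothetical log-likelihoods for a simpler and a richer nested model.
print(likelihood_ratio_test(lnl_null=-5250.0, lnl_alt=-5230.0, extra_params=10))
```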


 
Table 3 Likelihood Ratio Test Results for groEL and 16S rRNA Data Sets

 
Phylogeny of groEL
The inferred phylogenies of groEL (codon positions 1 and 2) vary considerably among the reconstruction methods (fig. 2). The MB and NJ-nh topologies position Buchnera apart from other endosymbionts and as a sister to the enteric bacteria. In addition, consistent with most published phylogenies (e.g., Spaulding and von Dohlen 1998), the MB and NJ-nh phylogenies position the whitefly endosymbiont as a basal lineage. By contrast, the MP and ML-gtr trees group all AT-rich endosymbionts together.

We used eval_nh to calculate –ln(L)nh of the data set across 234 candidate groEL phylogenies (table 4). The original MB topology had the highest –ln(L)nh, and permutations of this tree had generally high scores compared to permutations of other starting trees. Permutations of the MP, ML-gtr, and NJ-nh trees have nearly equal distributions of –ln(L)nh scores. The starting trees under LogDet and NJ-k2p, and all permutations of these, have very low –ln(L)nh scores (all ≤5180) (see figures in the Supplementary Material online).


 
Table 4 Top Log Likelihood Scores, under the Galtier and Gouy Maximum Likelihood Model of Nonhomogeneous Nucleotide Substitution (ML-nh), of 234 Possible groEL Phylogenies

 
We then compared –ln(L)nh scores of trees developed under a given method, noting the placement of Buchnera as polyphyletic or monophyletic relative to the Blochmannia and Wigglesworthia clade. This comparison determined whether various placements of Buchnera on the starting topologies significantly improved –ln(L)nh. Separation of Buchnera from Blochmannia and Wigglesworthia improved –ln(L)nh scores for MP, ML-gtr, and NJ-nh relative to scores for the original trees (which group all endosymbionts together). Notably, the tree with the best –ln(L)nh (the original MB tree) is polyphyletic. We also explored whether alternative placements of Buchnera had distinct likelihood distributions, regardless of the underlying topology. For MB, MP, ML-gtr, and NJ-nh, the distribution of –ln(L)nh for monophyletic permutations was typically lower than that of polyphyletic permutations of the same starting tree, and it was significantly different for MB and NJ-nh (Mann-Whitney U-test, P < 0.001*). This indicates that, across various starting topologies, trees that position Buchnera apart from other primary endosymbionts have better likelihood scores under the nonhomogeneous model than do trees that group primary endosymbionts together.
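The comparison of likelihood distributions between the two categories of trees can be sketched with a basic Mann-Whitney U calculation. The version below uses the normal approximation without a tie correction, which is an assumption, since the text does not say how the published P values were computed. The function name and the scores in the example are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """U statistic for sample x and a two-sided normal-approximation P value."""
    u = 0.0
    for xi in x:
        for yi in y:
            if xi > yi:
                u += 1.0   # x value outranks y value
            elif xi == yi:
                u += 0.5   # ties count as half
    n, m = len(x), len(y)
    mean_u = n * m / 2.0
    sd_u = math.sqrt(n * m * (n + m + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p_two_sided


# Hypothetical log-likelihoods for two categories of candidate trees.
polyphyletic_scores = [-5101.2, -5103.8, -5104.1, -5106.5, -5107.0]
monophyletic_scores = [-5110.3, -5112.9, -5113.4, -5115.0, -5118.2]
print(mann_whitney_u(polyphyletic_scores, monophyletic_scores))
```

Because the test uses only ranks, it does not depend on the scale of the likelihood scores, only on how the two sets of trees interleave when sorted.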

Of the 234 groEL topologies considered, those with the top 10 –ln(L)nh scores included nine trees based on the initial MB phylogeny and one permutation of the NJ-nh topology. Resampling estimated log likelihood analysis shows no statistical support for the single top tree over the next nine topologies, but any one of the top 10 topologies is statistically supported over every 10th tree of the top 100 sampled. (The 10th tree is not significantly better than the 11th, but is significantly better than the 20th, 30th, etc.). The majority of topologies with the greatest nonhomogeneous likelihood place Buchnera separate from Blochmannia and Wigglesworthia, including 24 of the best-scoring 25 and 47 of the best-scoring 50 topologies. The majority-rule consensus of the top 10 trees under the NH model places the Buchnera clade sister to the free-living Erwinia herbicola, while Wigglesworthia and Blochmannia group with the Sitophilus oryzae primary endosymbiont and Sodalis (fig. 3). Consensus of groEL trees with the top 50 –ln(L)nh scores does not change the placement of Buchnera relative to Blochmannia and Wigglesworthia, and it differs only in the level of resolution given to the placement of Buchnera within the free-living enteric clade (see figure 3 for consensus of top 10 and top 25 trees).
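A majority-rule consensus of a set of top-scoring trees keeps each clade that appears in more than half of them. The sketch below assumes each tree has already been reduced to a set of clades, with every clade stored as a frozenset of taxon labels. The function name and the four-taxon toy trees are hypothetical and do not represent the topologies analyzed here.

```python
from collections import Counter


def majority_rule_clades(trees, threshold=0.5):
    """Return {clade: frequency} for clades found in more than `threshold` of trees.

    trees: list of sets; each set holds the clades (frozensets of taxa) of one tree.
    """
    counts = Counter(clade for tree in trees for clade in tree)
    n_trees = len(trees)
    return {clade: count / n_trees
            for clade, count in counts.items()
            if count / n_trees > threshold}


# Toy example with taxa labeled A to D.
toy_trees = [
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AB"), frozenset("ABD")},
    {frozenset("AC"), frozenset("ABC")},
]
for clade, freq in majority_rule_clades(toy_trees).items():
    print(sorted(clade), round(freq, 2))
```

The frequencies returned by such a function correspond to the kind of clade support values reported on a consensus tree.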



 
FIG. 3.— Majority-rule consensus tree of the top 10 and top 25 topologies with the highest –ln(L)nh score under the ML-nh model, for groEL. Underlined taxa are insect endosymbionts; boldface taxa are primary endosymbionts. Frequencies are shown first for the top 10 trees, then for the top 25 trees (10 trees/25 trees).

 
Phylogeny of 16S rRNA
The 16S rRNA analysis shows many of the same trends as groEL. First, the starting topologies estimated under the five homogeneous methods and NJ-nh vary considerably, and methods differ in their placement of Buchnera with or apart from Wigglesworthia and Blochmannia (fig. 4). Interestingly, some methods that group primary endosymbionts together at groEL (e.g., MP, ML-gtr), show polyphyletic relationships at 16S rRNA, and vice versa (e.g., MB). For 16S rRNA, the MB and ML-gtr topologies are identical except for the placement of Buchnera. The endosymbiont of Bemisia tabaci is basal in all 16S rRNA phylogenies, as expected.



 
FIG. 4.— Phylogenies of the 16S rRNA gene of select γ-Proteobacteria, inferred using standard phylogenetic methods. Support values for the MB phylogeny are given in posterior probabilities for each node, with only probabilities greater than 70% shown. Support values for the ML-gtr, MP, and NJ trees are given in bootstrap determined from 500 replicates, with only frequencies 70% and greater shown. Underlined taxa are endosymbiotic bacteria of insects, taxa in boldface are primary endosymbionts.

 
Among the various permutations of these starting trees, those based on NJ trees have the lowest –ln(L)nh scores (all permutations ≤9075). Even NJ based on a nonhomogeneous model (NJ-nh) had lower –ln(L)nh scores than did trees based on MB, ML-gtr, or MP. New positions of the Buchnera clade generally improve the –ln(L)nh scores of the starting tree, with the exception of the ML-gtr tree, for which the original tree has the greatest –ln(L)nh and, notably, Buchnera is polyphyletic. The original MB tree is significantly less likely in RELL analysis than the original ML-gtr tree because it groups Buchnera with other endosymbionts. For all methods, separation of Buchnera from other endosymbionts improves –ln(L)nh scores as compared to trees that group them together (see figures in the Supplementary Material online). This improvement is reflected in the significant difference in the –ln(L)nh distributions between polyphyletic and monophyletic trees (Mann-Whitney U-tests, P < 0.001* for each of MB, ML-gtr, MP, and NJ-nh).

The top 10 16S rRNA topologies under nonhomogeneous models place Buchnera polyphyletic relative to the other endosymbionts, and classify it as sister to the E. coli group (table 5). As for groEL, the RELL method does not distinguish among the top 10 16S trees, but any one of the top 10 trees is significantly more likely than any of every 10th lower-ranked topology (as described above). The majority-rule consensus of the 10 16S rRNA trees with the top –ln(L)nh scores separates Buchnera from Blochmannia, Wigglesworthia, and the other endosymbionts (fig. 5). Also similar to the groEL analysis, the majority of trees with greatest –ln(L)nh scores separate Buchnera from the other endosymbionts. These include 25 of the highest 25 trees and 45 of the highest 50.


 
Table 5 Top Log Likelihood Scores, under the Galtier and Gouy Maximum Likelihood Model of Nonhomogeneous Nucleotide Substitution (ML-nh), of All Possible Buchnera Placements on the 16S rRNA Phylogeny, of 342 Total Topologies

 


 
FIG. 5.— Majority-rule consensus tree of the top 10 and top 25 topologies with the highest –ln(L)nh scores for 16S rRNA. Underlined taxa are insect endosymbionts; boldface taxa are primary endosymbionts. Frequencies are shown first for the top 10 trees, then for the top 25 trees (10 trees/25 trees).

 
We also analyzed a 1,280-bp, 30-taxa, 16S rRNA data set that differed from the 32-taxa data set by lacking two endosymbionts and replacing one free-living taxon. Because the results from this 30-taxa data set only corroborate the results and interpretation of the 32-taxa data set described above, we have placed this information in an online supplement.


    Discussion
 
We have focused on the evolutionary relationships of AT-rich primary endosymbionts, using groEL and 16S rRNA genes and a nonhomogeneous substitution model that accounts for variable AT-bias among taxa. Our goal in this study was to develop an approach that takes advantage of (1) the tree-searching algorithms implemented by several standard phylogenetic methods, and (2) the utility of nonhomogeneous models to evaluate relationships among taxa with widely different base compositions. We first demonstrated that variable base composition strongly affects estimates of relationships among free-living and endosymbiotic Enterobacteriales, as the likelihood ratio tests show ML-nh is the best fit to both groEL and 16S rRNA data sets. Our main result is that a nonhomogeneous likelihood model supports the separation of Buchnera from Blochmannia and Wigglesworthia. Three lines of evidence presented here support that conclusion. First, homogeneous models gave incongruent topologies and sometimes placed Buchnera, Blochmannia, and Wigglesworthia as monophyletic, as found in previous studies. However, the nonhomogeneous likelihood scores (–ln(L)nh) of such "monophyletic" trees were always improved in a permutation that separated Buchnera from Blochmannia and Wigglesworthia. Second, across all standard phylogenetic methods, various permutations that separate Buchnera had overall better –ln(L)nh, an improvement that was often statistically significant. Third, for both 16S rRNA and groEL, the consensus trees of phylogenies with the best –ln(L)nh scores separate Buchnera from a clade that includes Wigglesworthia, Blochmannia, and other insect endosymbionts.

The results of this study also shed light on which standard phylogenetic methods may be most robust for data sets with base composition variation. The consensus trees under the nonhomogeneous model are most similar to the original phylogenies estimated by MB (for 16S) and ML-gtr (for groEL). This observed robustness of MB and ML-gtr is consistent with a recent simulation study showing that a maximum-likelihood approach is best able to handle heterogeneity (Rosenberg and Kumar 2003). Interestingly, NJ-nh separated Buchnera from Blochmannia and Wigglesworthia for both groEL and 16S rRNA, suggesting that this method successfully accounts for base compositional variation. However, other aspects of this tree were apparently quite poor, as the NJ-nh tree (and permutations thereof) had poorer –ln(L)nh scores than phylogenies based on most other methods. This result suggests that implementation of a nonhomogeneous model should not be limited to an NJ approach.
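To make "base composition variation" concrete, the sketch below computes the G+C fraction of each sequence in an alignment and reports the spread between the most AT-rich and most GC-rich taxa. The sequences are short, made-up strings rather than real groEL or 16S rRNA data, and the function names are hypothetical.

```python
def gc_content(seq):
    """Fraction of G+C among unambiguous bases; gaps and ambiguity codes are ignored."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / (gc + at)


def composition_spread(alignment):
    """Return (most AT-rich taxon, most GC-rich taxon, difference in GC fraction)."""
    gc = {taxon: gc_content(seq) for taxon, seq in alignment.items()}
    low = min(gc, key=gc.get)
    high = max(gc, key=gc.get)
    return low, high, gc[high] - gc[low]


# Toy alignment with invented sequences, for illustration only.
toy_alignment = {
    "endosymbiont_A": "ATTAAATTTAGCATTAAT--TA",
    "endosymbiont_B": "ATTAAGTTTAGCATTAATGGTA",
    "free_living_C": "GCTGACGTCAGCGTCGATGGCA",
}

low, high, spread = composition_spread(toy_alignment)
print(f"lowest GC: {low}; highest GC: {high}; spread = {spread:.2f}")
```

A large spread of this kind is the situation in which a model assuming a single equilibrium base composition for all lineages is most likely to be violated.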

Comparison with Previously Published Phylogenies
Our results do not provide a single best Enterobacteriales phylogeny, as the groEL and 16S rRNA consensus trees show slight differences (e.g., in the placement of Erwinia spp.) that may reflect the different taxon sets. The lability of Buchnera observed across our original phylogenies (figs. 2 and 4) is also reflected by the varying placements of Buchnera across published trees (Schröder et al. 1996; Charles, Heddi, and Rahbe 2001; Gil et al. 2003; Canbäck, Tamas, and Andersson 2004). Despite the variable position of Buchnera in our starting trees, evaluation of these placements (and many more, in permutations of these trees) under the nonhomogeneous models supports the grouping of Buchnera with a clade of enteric bacteria, apart from an endosymbiont clade that includes Blochmannia and Wigglesworthia.
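The idea of evaluating many alternative placements of one taxon can be sketched as a prune-and-regraft enumeration. This is only an illustration under stated assumptions: the authors' actual permutation procedure may differ, the tree is represented here as nested Python tuples, the toy topology is invented, and all function names are hypothetical. Each generated tree would then be scored under the nonhomogeneous model by a separate program.

```python
def prune(tree, leaf):
    """Remove `leaf` from a nested-tuple tree, collapsing any node left with one child."""
    if isinstance(tree, str):
        return None if tree == leaf else tree
    kept = [sub for sub in (prune(child, leaf) for child in tree) if sub is not None]
    if len(kept) == 1:
        return kept[0]
    return tuple(kept)


def regraft(tree, leaf):
    """Yield every rooted tree obtained by attaching `leaf` as sister to one subtree."""
    # Attach the leaf as sister to the whole current subtree.
    yield (tree, leaf)
    # Or attach it somewhere inside one of the children.
    if not isinstance(tree, str):
        for i, child in enumerate(tree):
            for new_child in regraft(child, leaf):
                yield tree[:i] + (new_child,) + tree[i + 1:]


def to_newick(tree):
    """Serialize a nested-tuple tree as a Newick string without branch lengths."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(to_newick(child) for child in tree) + ")"


# Invented starting topology, for illustration only.
start = ((("Buchnera", "Blochmannia"), "Wigglesworthia"), ("Escherichia", "Salmonella"))

backbone = prune(start, "Buchnera")
placements = list(regraft(backbone, "Buchnera"))
for candidate in placements:
    print(to_newick(candidate) + ";")
```

For a rooted binary backbone with n remaining taxa this yields 2n - 1 candidate trees, one per node of the backbone, which grows only linearly with the number of taxa.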

The fact that our results conflict with those of several previous studies warrants a comparison of the phylogenetic models and taxon sampling employed. First, previous studies suggesting monophyly of Buchnera, Wigglesworthia, and/or Blochmannia often employ homogeneous models. Similarly, we also found that the best trees of some homogeneous models group these endosymbionts closely together (e.g., ML-gtr and MP for 16S rRNA, and MP and MB for groEL). Previous studies that implemented nonhomogeneous models have typically used NJ-nh, and they either produced trees that separated Buchnera and Wigglesworthia but were statistically rejected by other methods (Lerat, Daubin, and Moran 2003), or they produced a tree in which these two endosymbionts were sister taxa (Canbäck, Tamas, and Andersson 2004). In contrast, our use of a nonhomogeneous model is likelihood-based rather than exclusively NJ-based.

Second, several previous analyses have taken advantage of the numerous loci available in full genome sequences, but generally they included fewer taxa in the analysis (16 or fewer). The inclusion of numerous genes or entire genomes has obvious benefits for studies of phylogenomics (Canbäck, Tamas, and Andersson 2004) and tests of lateral gene transfer (Daubin, Moran, and Ochman 2003; Lerat, Daubin, and Moran 2003). However, for phylogenetic reconstruction per se, the trade-off between more genes and more taxa is not always clear, especially when taxa evolve at different rates and exhibit different compositional biases. Increased taxon sampling has been shown to be of greater benefit to phylogenetic accuracy than increased sequence length (Graybeal 1998; Pollock et al. 2002; Hillis et al. 2003). That is, more genes per taxon may actually cause convergence upon, and increased confidence in, the wrong tree. This is particularly true for data sets subject to long branch attraction (Hillis 1998), a potential issue in analyses of rapidly evolving, AT-rich taxa. For this reason, we included many endosymbionts (both primary and secondary) that are closely related to free-living and commensal enterics. Of course, the choice of loci is critical when few genes are considered. Notably, the inclusion of a genome-wide sample allowed Canbäck, Tamas, and Andersson (2004) to examine the correlation of gene conservation to particular tree topologies. They show that relatively conserved (and relatively GC-rich) genes have more reliable phylogenetic signals and support the recent divergence of Buchnera from E. coli and Salmonella (Canbäck, Tamas, and Andersson 2004). This result supports our use of the highly conserved genes groEL and 16S rRNA. Our results further suggest that even highly conserved genes will benefit from the use of nonhomogeneous models in phylogeny reconstruction.

Implications for the Evolution of Endosymbiosis
Our results have two implications for the evolution of endosymbiosis within the Enterobacteriales. First, Blochmannia and Wigglesworthia apparently represent an origin of primary endosymbiosis that is independent from Buchnera. Genome sequence data may also support this independent transition to primary endosymbiosis, as the three fully sequenced endosymbiont genomes share only ~50% of their genes, or ~70% for any pairwise comparison between genera (Gil et al. 2003). Because transition to primary endosymbiosis is thought to impose immediate, severe genome reduction through large genome deletion events, such endosymbionts may rapidly become constrained to their particular host association (Moran and Mira 2001; Van Ham et al. 2003). This genome reduction may impose severe constraints on extracellular existence, and it may limit switching among hosts with different nutritional physiologies. The phylogenies presented here cannot distinguish whether Blochmannia and Wigglesworthia acquired the primary endosymbiotic lifestyle independently of each other. However, given that these two genomes share just ~70% of their genes and are highly specialized to the nutritional physiology of their respective hosts, independent acquisition of obligate endosymbiosis is the more likely possibility.

Second, Blochmannia and Wigglesworthia are part of a diverse clade consisting of secondary endosymbionts of insects, suggesting that primary endosymbionts may evolve from secondaries. Koga, Tsuchida, and Fukatsu (2003) provide experimental evidence that secondary endosymbionts may move into the symbiotic niche of primaries. Specifically, they showed that a facultative endosymbiotic {gamma}-Proteobacterium infected the cytoplasm of bacteriocytes of aphid hosts from which Buchnera had been eliminated, and that it thus compensated for the essential roles of Buchnera. A second example of the potential for primary endosymbionts to evolve from secondary endosymbionts is that of Sodalis glossinidius and the Sitophilus oryzae primary endosymbiont. The close phylogenetic association of this secondary endosymbiont of tsetse flies and this obligate mutualist of weevils shown here (figs. 3 and 5) is further corroborated by their shared maintenance and expression of a type III secretion system, which likely was acquired prior to their divergence (Dale et al. 2002). Although these endosymbionts are associated with distinct insect hosts, Sodalis is still capable of horizontal transmission (Aksoy, Chen, and Hypsa 1997; Dale and Maudlin 1999). Understanding specific routes by which diverse endosymbioses are established will require more extensive taxon sampling of facultative and primary endosymbiont lineages; however, the current study suggests that primary endosymbiosis has originated more often than previously thought, and it may represent the end of an evolutionary spectrum between the facultative and obligate intracellular lifestyles.


    Acknowledgements
 
The authors thank N. Galtier and M. Gouy for providing the NHML program for implementation of nonhomogeneous models. We are grateful to F. Rodriguez-Trelles for invaluable assistance applying these models and sharing modified source codes. We thank A.G. McArthur for providing clusterpaup and other phylogenetic analysis programs; D. M. Hillis for helpful discussion of taxon sampling and related issues; and two anonymous reviewers for useful comments on an earlier version of this manuscript. This work was supported by grants to J.J.W. from the National Institutes of Health (NIH R01 GM62626–01), the National Science Foundation (DEB 0089455), and the Josephine Bay Paul and C. Michael Paul Foundation.


    Footnotes
 
1 Present address: Department of Microbiology, University of Washington, Seattle, WA 98195.

Manolo Gouy, Associate Editor


    References
 

    Abbot, P., and N. A. Moran. 2002. Extremely low levels of genetic polymorphism in endosymbionts (Buchnera) of aphids (Pemphigus). Mol. Ecol. 11:2649–2660.

    Akman, L., A. Yamashita, H. Watanabe, K. Oshima, T. Shiba, M. Hattori, and S. Aksoy. 2002. Genome sequence of the endocellular obligate symbiont of tsetse flies, Wigglesworthia glossinidia. Nat. Genet. 32:402–407.

    Aksoy, S., X. Chen, and V. Hypsa. 1997. Phylogeny and potential transmission routes of midgut-associated endosymbionts of tsetse (Diptera: Glossinidae). Insect. Mol. Biol. 6:183–190.

    Baumann, P., N. A. Moran, and L. Baumann. 2000. Bacteriocyte-associated endosymbionts of insects. In M. Dworkin, ed. The prokaryotes, a handbook on the biology of bacteria; ecophysiology, isolation, identification, applications. Springer-Verlag, New York.

    Bergthorsson, U., and H. Ochman. 1995. Heterogeneity of genome sizes among natural isolates of Escherichia coli. J. Bacteriol. 177:5784–5789.

    Canbäck, B., I. Tamas, and S. G. E. Andersson. 2004. A phylogenomic study of endosymbiotic bacteria. Mol. Biol. Evol. 21:1110–1122.

    Chang, B. S., and D. L. Campbell. 2000. Bias in phylogenetic reconstruction of vertebrate rhodopsin sequences. Mol. Biol. Evol. 17:1220–1231.

    Charles, H., A. Heddi, and Y. Rahbe. 2001. A putative insect intracellular endosymbiont stem clade, within the Enterobacteriaceae, inferred from phylogenetic analysis based on a heterogeneous model of DNA evolution. C. R. Acad. Sci. III 324:489–494.

    Chenna, R., H. Sugawara, T. Koike, R. Lopez, T. J. Gibson, D. G. Higgins, and J. D. Thompson. 2003. Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Res. 31:3497–3500.

    Clark, M. A., N. A. Moran, and P. Baumann. 1999. Sequence evolution in bacterial endosymbionts having extreme base compositions. Mol. Biol. Evol. 16:1586–1598.

    Cole, J., B. Chai, T. Marsh, R. Farris, Q. Wang, S. Kulam, S. Chandra, D. McGarrell, T. Schmidt, G. Garrity, and J. Tiedje. 2003. The Ribosomal Database Project (RDP-II): previewing a new autoaligner that allows regular updates and the new prokaryotic taxonomy. Nucleic Acids Res. 31:442–443.

    Conant, G. C., and P. O. Lewis. 2001. Effects of nucleotide composition bias on the success of the parsimony criterion in phylogenetic inference. Mol. Biol. Evol. 18:1024–1033.

    Dale, C., and I. Maudlin. 1999. Sodalis gen. nov. and Sodalis glossinidius sp. nov., a microaerophilic secondary endosymbiont of the tsetse fly Glossina morsitans morsitans. Int. J. Syst. Bacteriol. 49(Pt 1):267–275.

    Dale, C., G. R. Plague, B. Wang, H. Ochman, and N. A. Moran. 2002. Type III secretion systems and the evolution of mutualistic endosymbiosis. Proc. Natl. Acad. Sci. USA 99:12397–12402.

    Daubin, V., N. A. Moran, and H. Ochman. 2003. Phylogenetics and the cohesion of bacterial genomes. Science 301:829–832.

    Felsenstein, J. 1981. Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17:368–376.

    ———. 1988. Phylogenies from molecular sequences: inference and reliability. Annu. Rev. Gen. 22:212–219.

    Foster, P. G., L. S. Jermiin, and D. A. Hickey. 1997. Nucleotide composition bias affects amino acid content in proteins coded by animal mitochondria. J. Mol. Evol. 44:282–288.

    Funk, D. J., J. J. Wernegreen, and N. A. Moran. 2001. Intraspecific variation in symbiont genomes: bottlenecks and the aphid-Buchnera association. Genetics 157:477–489.

    Galtier, N., and M. Gouy. 1995. Inferring phylogenies from DNA sequences of unequal base compositions. Proc. Natl. Acad. Sci. USA 92:11317–11321.

    ———. 1998. Inferring pattern and process: maximum-likelihood implementation of a nonhomogeneous model of DNA sequence evolution for phylogenetic analysis. Mol. Biol. Evol. 15:871–879.

    Galtier, N., M. Gouy, and C. Gautier. 1996. SEAVIEW and PHYLO_WIN: two graphic tools for sequence alignment and molecular phylogeny. Comput. Appl. Biosci. 12:543–548.

    Galtier, N., N. Tourasse, and M. Gouy. 1999. A nonhyperthermophilic common ancestor to extant life forms. Science 283:220–221.

    Gil, R., B. Sabater-Munoz, A. Latorre, F. J. Silva, and A. Moya. 2002. Extreme genome reduction in Buchnera spp.: toward the minimal genome needed for symbiotic life. Proc. Natl. Acad. Sci. USA 99:4454–4458.

    Gil, R., F. J. Silva, E. Zientz, F. Delmotte, F. Gonzalez-Candelas, A. Latorre, C. Rausell, J. Kamerbeek, J. Gadau, B. Hölldobler, R. C. van Ham, R. Gross, and A. Moya. 2003. The genome sequence of Blochmannia floridanus: comparative analysis of reduced genomes. Proc. Natl. Acad. Sci. USA 100:9388–9393.

    Graybeal, A. 1998. Is it better to add taxa or characters to a difficult phylogenetic problem? Syst. Biol. 47:9–17.

    Hasegawa, M., and T. Hashimoto. 1993. Ribosomal RNA trees misleading? Nature 361:23.

    Hasegawa, M., T. Hashimoto, J. Adachi, N. Iwabe, and T. Miyata. 1993. Early branchings in the evolution of eukaryotes: ancient divergence of Entamoeba that lacks mitochondria revealed by protein sequence data. J. Mol. Evol. 36:380–388.

    Heddi, A., H. Charles, C. Khatchadourian, G. Bonnot, and P. Nardon. 1998. Molecular characterization of the principal symbiotic bacteria of the weevil Sitophilus oryzae: a peculiar G + C content of an endocytobiotic DNA. J. Mol. Evol. 47:52–61.

    Herbeck, J. T., D. J. Funk, P. H. Degnan, and J. J. Wernegreen. 2003. A conservative test of genetic drift in the endosymbiotic bacterium Buchnera: slightly deleterious mutations in the chaperonin groEL. Genetics 165:1651–1660.

    Herbeck, J. T., D. P. Wall, and J. J. Wernegreen. 2003. Gene expression level influences amino acid usage, but not codon usage, in the tsetse fly endosymbiont Wigglesworthia. Microbiology 149:2585–2596.

    Hillis, D. M. 1998. Taxonomic sampling, phylogenetic accuracy, and investigator bias. Syst. Biol. 47:3–8.

    Hillis, D. M., D. D. Pollock, J. A. McGuire, and D. J. Zwickl. 2003. Is sparse taxon sampling a problem for phylogenetic inference? Syst. Biol. 52:124–126.

    Kimura, M. 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111–120.

    Kishino, H., and M. Hasegawa. 1989. Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. J. Mol. Evol. 29:170–179.

    Koga, R., T. Tsuchida, and T. Fukatsu. 2003. Changing partners in an obligate symbiosis: a facultative endosymbiont can compensate for loss of the essential endosymbiont Buchnera in an aphid. Proc. R. Soc. Lond. Ser. B Biol. Sci. 270:2543–2550.

    Lambert, J. D., and N. A. Moran. 1998. Deleterious mutations destabilize ribosomal RNA in endosymbiotic bacteria. Proc. Natl. Acad. Sci. USA 95:4458–4462.

    Lefèvre, C., H. Charles, A. Vallier, B. Delobel, B. Farrell, and A. Heddi. 2004. Endosymbiont phylogenesis in the Dryophthoridae weevils: evidence for bacterial replacement. Mol. Biol. Evol. 21:965–973.

    Lerat, E., V. Daubin, and N. A. Moran. 2003. From gene trees to organismal phylogeny in the Prokaryotes: the case of the gamma-Proteobacteria. PLoS Biol. Oct. 1:E19.

    Lawrence, J. G., H. Ochman, and D. L. Hartl. 1991. Molecular and evolutionary relationships among enteric bacteria. J. Gen. Microbiol. 137:1911–1921.

    Lockhart, P. J., M. A. Steel, M. D. Hendy, and D. Penny. 1994. Recovering evolutionary trees under a more realistic model of sequence evolution. Mol. Biol. Evol. 11:605–612.

    Loomis, W. F., and D. W. Smith. 1990. Molecular phylogeny of Dictyostelium discoideum by protein sequence comparison. Proc. Natl. Acad. Sci. USA 87:9093–9097.

    Maddison, D., and W. Maddison. 2002. MacClade: analysis of phylogeny and character evolution. Sinauer Associates, Sunderland, Mass.

    Mira, A., H. Ochman, and N. A. Moran. 2001. Deletional bias and the evolution of bacterial genomes. Trends Genet. 17:589–596.

    Mira, A., and N. A. Moran. 2002. Estimating transmission size and population bottlenecks in maternally transmitted endosymbiotic bacteria. Microb. Ecol. 44:137–143.

    Mooers, A. O., and E. C. Holmes. 2000. The evolution of base composition and phylogenetic inference. Trends Ecol. Evol. 15:365–369.

    Moran, N. A. 1996. Accelerated evolution and Muller's rachet in endosymbiotic bacteria. Proc. Natl. Acad. Sci. USA 93:2873–2878.

    ———. 2002. Microbial minimalism: genome reduction in bacterial pathogens. Cell 108:583–586.

    Moran, N., and A. Telang. 1998. Bacteriocyte-associated symbionts of insects. Bioscience 48:295–304.

    Moran, N. A., and A. Mira. 2001. The process of genome shrinkage in the obligate symbiont Buchnera aphidicola. Genome Biol. 2:RESEARCH0054.

    Moran, N. A., and J. J. Wernegreen. 2000. Lifestyle evolution in symbiotic bacteria: insights from genomics. Trends Ecol. Evol. 15:321–326.

    Muto, A., and S. Osawa. 1987. The guanine and cytosine content of genomic DNA and bacterial evolution. Proc. Natl. Acad. Sci. USA 84:166–169.

    Palacios, C., and J. J. Wernegreen. 2002. A strong effect of AT mutational bias on amino acid usage in Buchnera is mitigated at high expression genes. Mol. Biol. Evol. 19:1575–1584.

    Pollock, D. D., D. J. Zwickl, J. A. McGuire, and D. M. Hillis. 2002. Increased taxon sampling is advantageous for phylogenetic inference. Syst. Biol. 51:664–671.

    Posada, D., and K. A. Crandall. 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14:817–818.

    Rispe, C., F. Delmotte, R. C. van Ham, and A. Moya. 2004. Mutational and selective pressures on codon and amino acid usage in Buchnera, endosymbiotic bacteria of aphids. Genome Res. 14:44–53.

    Ronquist, F., and J. P. Huelsenbeck. 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19:1572–1574.

    Rosenberg, M. S., and S. Kumar. 2003. Heterogeneity of nucleotide frequencies among evolutionary lineages and phylogenetic inference. Mol. Biol. Evol. 20:610–621.

    Schröder, D., H. Deppisch, M. Obermayer, G. Krohne, E. Stackebrandt, B. Hölldobler, W. Goebel, and R. Gross. 1996. Intracellular endosymbiotic bacteria of Camponotus species (carpenter ants): systematics, evolution and ultrastructural characterization. Mol. Microbiol. 21:479–489.

    Shigenobu, S., H. Watanabe, M. Hattori, Y. Sakaki, and H. Ishikawa. 2000. Genome sequence of the endocellular bacterial symbiont of aphids Buchnera sp. APS. Nature 407:81–86.

    Shimodaira, H., and M. Hasegawa. 1999. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol. Biol. Evol. 16:1114–1116.

    Singer, G. A., and D. A. Hickey. 2000. Nucleotide bias causes a genomewide bias in the amino acid composition of proteins. Mol. Biol. Evol. 17:1581–1588.

    Spaulding, A. W., and C. D. von Dohlen. 1998. Phylogenetic characterization and molecular evolution of bacterial endosymbionts in psyllids (Hemiptera: Sternorrhyncha). Mol. Biol. Evol. 15:1506–1513.

    Sproer, C., U. Mendrock, J. Swiderski, E. Lang, and E. Stackebrandt. 1999. The phylogenetic position of Serratia, Buttiauxella and some other genera of the family Enterobacteriaceae. Int. J. Syst. Evol. Microbiol. 49:1433–1438.

    Steel, M. A., P. J. Lockhart, and D. Penny. 1995. A frequency-dependent significance test for parsimony. Mol. Phylogenet. Evol. 4:64–71.

    Sueoka, N. 1962. On the genetic basis of heterogeneity of DNA content. Proc. Natl. Acad. Sci. USA 48:582–592.

    ———. 1988. Directional mutation pressure and neutral molecular evolution. Proc. Natl. Acad. Sci. USA 85:2653–2657.

    ———. 1992. Directional mutation pressure, selective constraints, and genetic equilibria. J. Mol. Evol. 34:95–114.

    Swofford, D. L. 2002. PAUP*. Phylogenetic analysis using parsimony (*and other methods). Version 4. Sinauer Associates, Sunderland, Mass.

    Tamas, I., L. Klasson, B. Canbäck, A. K. Naslund, A. S. Eriksson, J. J. Wernegreen, J. P. Sandstrom, N. A. Moran, and S. G. Andersson. 2002. 50 million years of genomic stasis in endosymbiotic bacteria. Science 296:2376–2379.

    Tamura, K. 1992. Estimation of the number of nucleotide substitutions when there are strong transition-transversion and G+C content biases. Mol. Biol. Evol. 9:678–687.

    Tarrio, R., F. Rodriguez-Trelles, and F. J. Ayala. 2001. Shared nucleotide composition biases among species and their impact on phylogenetic reconstructions of the Drosophilidae. Mol. Biol. Evol. 18:1464–1473.

    Tourasse, N. J., and W. H. Li. 1999. Performance of the relative-rate test under nonstationary models of nucleotide substitution. Mol. Biol. Evol. 16:1068–1078.

    Van Ham, R. C., J. Kamerbeek, C. Palacios, C. Rausell, F. Abascal, U. Bastolla, J. M. Fernandez, L. Jimenez, M. Postigo, F. J. Silva et al. (16 co-authors). 2003. Reductive genome evolution in Buchnera aphidicola. Proc. Natl. Acad. Sci. USA 100:581–586.

    Wernegreen, J. J., A. B. Lazarus, and P. H. Degnan. 2002. Small genome of Candidatus Blochmannia, the bacterial endosymbiont of Camponotus, implies irreversible specialization to an intracellular lifestyle. Microbiology 148:2551–2556.

    Wernegreen, J. J., and N. A. Moran. 1999. Evidence for genetic drift in endosymbionts (Buchnera): analyses of protein-coding genes. Mol. Biol. Evol. 16:83–97.

    Wertz, J. E., C. Goldstone, D. M. Gordon, and M. A. Riley. 2003. A molecular phylogeny of enteric bacteria and implications for a bacterial species concept. J. Evol. Biol. 16:1236–1248.

    Woolfit, M., and L. Bromham. 2003. Increased rates of sequence evolution in endosymbiotic bacteria and fungi with small effective population sizes. Mol. Biol. Evol. 20:1545–155.

    Yang, Z. 1993. Maximum-likelihood estimation of phylogeny from DNA when substitution rates differ over sites. Mol. Biol. Evol. 10:1396–1401.

Accepted for publication October 4, 2004.





An erratum has been published