The Glutamine Synthetases of Rhizobia: Phylogenetics and Evolutionary Implications

Sarah L. Turner and J. Peter W. Young

Department of Biology, University of York, York, England


    Abstract
Glutamine synthetase exists in at least two related forms, GSI and GSII, the sequences of which have been used in evolutionary molecular clock studies. GSI has so far been found exclusively in bacteria, and GSII has been found predominantly in eukaryotes. To date, only a minority of bacteria, including rhizobia, have been shown to express both forms of GS. The sequences of equivalent internal fragments of the GSI and GSII genes for the type strains of 16 species of rhizobia have been determined and analyzed. The GSI and GSII data sets do not produce congruent phylogenies with either neighbor-joining or maximum-likelihood analyses. The GSI phylogeny is broadly congruent with the 16S rDNA phylogeny for the same bacteria; the GSII phylogeny is not. There are three striking rearrangements in the GSII phylograms, all of which might be explained by horizontal gene transfer to Bradyrhizobium (probably from Mesorhizobium), to Rhizobium galegae (from Rhizobium), and to Mesorhizobium huakuii (perhaps from Rhizobium). There is also evidence suggesting intrageneric DNA transfer within Mesorhizobium. Meta-analysis of both GS genes from the different genera of rhizobia and other reference organisms suggests that the divergence times of the different rhizobium genera predate the existence of legumes, their host plants.


    Introduction
Rhizobia are ubiquitous soil bacteria that are able to form nitrogen-fixing nodules with the roots of leguminous plants. The classification of rhizobia has been modified in recent years, with the division of the genus Rhizobium into Rhizobium, Sinorhizobium (Chen, Yan, and Li 1988; de Lajudie et al. 1994), and Mesorhizobium (Jarvis et al. 1997), and the addition of several new species names within these genera, e.g., Sinorhizobium saheli and Sinorhizobium terangae (de Lajudie et al. 1994) and Sinorhizobium medicae (Rome et al. 1996). These species and genera are based on physiological and genotypic data and appear to be robust (Young 1996). The 16S rDNA phylogenetic relationships among the rhizobial species used in this study are shown in figure 1 (after Young 1996). However, rhizobia have mechanisms for gene transfer that could potentially blur taxonomic distinctions. In many rhizobial species, the genes that determine symbiotic host range and efficacy are carried on plasmids. These plasmids can transfer between strains, species, and genera under laboratory conditions (Herrera-Cervera et al. 1998), and plasmid transfer has been shown to occur in the field (Rigottier-Gois et al. 1998). Mesorhizobium loti has also been shown to exchange symbiotic genes, although these genes are carried on a conjugative transposon that is located on the chromosome (Sullivan et al. 1995, 1996). More generally, rhizobia have relatively low levels of linkage disequilibrium for a range of symbiotic and nonsymbiotic genes (Souza et al. 1992; Maynard Smith et al. 1993; Souza and Eguiarte 1997), indicating significant levels of gene exchange within rhizobial populations.



Fig. 1.—Neighbor-joining phylogeny for the full-length 16S genes of rhizobial type strains (from published sequences). The tree was constructed using the K2P model and excluding gaps; bootstrap values (1,000 trials) are shown as percentages. The scale bar represents 0.01 substitutions per site

 
Bacterial species are defined using a range of techniques (including DNA-DNA hybridization and numerical taxonomy) that should avoid biases caused by localized recombination events. However, since the work of Woese (1987) was published, there has been an increasing reliance on 16S rRNA gene sequence data to identify and classify bacteria. There is evidence for chimeric 16S gene sequences within rhizobia (Eardly, Wang, and van Berkum 1996), and there is evidence for transfer of complete 16S rDNA sequences between species of mesorhizobia (Sullivan et al. 1996). These observations suggest that 16S rDNA–based phylogenies of rhizobia can be misleading. A more detailed understanding of the evolutionary and ecological relationships among rhizobia ought to be achieved through analysis of several gene sequences (as many as possible) to establish consensus relationships and determine which genera/species exchange DNA. Few studies have concentrated on establishing the inter- and intrageneric relationships among closely related species of bacteria. Two recent studies examining housekeeping genes in commensal and pathogenic species of Neisseria associated with humans (Feil et al. 1996; Zhou, Bowler, and Spratt 1997) have found strong evidence for interspecies recombination across short stretches of DNA. Neisseria are naturally transformable, i.e., able to take up naked DNA, and this process is probably responsible for the recombination patterns observed. On the other hand, rhizobia are not naturally transformable, although gene transfer can be achieved through conjugation of large (generally >100 kb) plasmids followed by recombination. This might lead to a different pattern of recombination.

The molecular-clock hypothesis, using molecular data in conjunction with fossil evidence, has been invoked to interpret phylogenetic relationships and to date speciation events. Iwabe et al. (1989) argued that gene paralogs, derived from a common ancestor following a gene duplication, are suitable for molecular-clock studies, since the resulting phylogeny can be rooted unambiguously at the duplication event. Genes subject to uniform selective pressures are useful for molecular-clock studies, since these genes are most likely to conform to the central assumptions of the theory that the evolutionary process is independent at all sites. Genes that most closely adhere to this model are pseudogenes and genes that are so central to metabolism that they are thought to be under little directional selective pressure due to the functional constraints imposed on their products. Glutamine synthetase (GS) (EC 6.3.1.2) is a key enzyme in nitrogen assimilation that is both duplicated and conserved in function. The gene duplication event appears to have occurred before the split of prokaryotes and eukaryotes (Kumada et al. 1993). One form, GSI, has been found only in prokaryotes, whereas another form, GSII, is found in all eukaryotes and has been detected in a minority of prokaryotes. A third form of GS, glnT, has been identified only in rhizobia; it is distantly related to the other two forms and may not be a functional homolog (Shatters, Lui, and Kahn 1993). The adherence or nonadherence (Brown et al. 1994) of GS genes to the molecular-clock theory has been discussed in detail. However, allowing for a number of ancient gene transfers, GS evolution appears reasonably clocklike when either only nonsynonymous mutations (Kumada et al. 1993) or only second codon positions (Pesole et al. 1991, 1995) are used. The X-ray structure of the Salmonella typhimurium GSI protein (GlnA) has been determined: it is a dodecameric homopolymer in which the active site is formed between subunits (Yamashita et al. 1989). The need for intimate contacts with several other subunits might explain why these sequences are so highly conserved.

Taboada et al. (1996) investigated the relative mobility of the GSI and GSII proteins from different species of rhizobia using two-dimensional gel electrophoresis. Their results suggest that the GSII enzyme is the more divergent, having a different mobility for each species screened, whereas the GSI enzyme mobilities were identical across all species. The aim of our work was to use GSI and GSII gene sequences from different species of rhizobia to investigate the evolutionary relationships for each gene, with the possibility of dating the major branch points within the family.


    Materials and Methods
Bacterial type strains used in this study are listed in table 1.


Table 1 Bacterial Strains Used

 
Molecular Methods: PCR and Sequencing
The PCR primer pairs GSI-1/2, GSI-3/4, GSII-1/2, and GSII-3/4 were designed to ensure amplification of either the GSI or the GSII genes, respectively. This was achieved by targeting one primer of each pair to a region specific to either GSI or GSII sequences (fig. 2). The GSI sequences used for primer design were Azospirillum brasilense (M26107), Azotobacter vinelandii (M57275), Escherichia coli (D83536), Mycobacterium tuberculosis (Z73902), Neisseria gonorrhoeae (M84113), Rhizobium leguminosarum (X04880), Rhodobacter capsulatus (U25953), Rhodobacter sphaeroides (X71659), Salmonella typhimurium (M14536), Sinorhizobium meliloti (U50385), Thiobacillus ferrooxidans (M16626), and Vibrio alginolyticus (L08499). The bacterial GSII sequences used for primer design were Bradyrhizobium japonicum (X04187), Frankia sp. (M58415), R. leguminosarum (X67296), and S. meliloti (X17523). Each GS data set was aligned using CLUSTAL X (Thompson et al. 1997), and highly conserved sections of sequence were investigated for primer design.



Fig. 2.—Protein sequence alignments of the rhizobial GSI sequences in this work compared with the Salmonella typhimurium sequence, the known secondary structure of which is indicated underneath. Numbers refer to the positions in the full-length S. typhimurium protein; active site residues are indicated by a # above the Sinorhizobium fredii sequence. Abbreviations are as follows: Sfre, S. fredii; Ster, Sinorhizobium terangae; Ssah, Sinorhizobium saheli; Smel, Sinorhizobium meliloti; Smed, Sinorhizobium medicae; Retl, Rhizobium etli; Rvic, Rhizobium leguminosarum bv. viciae; RtrA, Rhizobium tropici A; RtroB, R. tropici B; Rgal, Rhizobium galegae; Mtia, Mesorhizobium tianshanense; Mmed, Mesorhizobium mediterraneum; Mhua, Mesorhizobium huakuii; Mlot, Mesorhizobium loti; Mcic, Mesorhizobium ciceri; Acaul, Azorhizobium caulinodans; Styp, S. typhimurium.

 
The sequences of the primers (and their positions in the corresponding S. meliloti database sequence) are as follows: GSI-1—AAG GGC GGC TAY TTC CCG GT (532–551); GSI-2—GTC GAG ACC GGC CAT CAG CA (1143–1124); GSI-3—GAY CTG CGY TTY ACC GAC C (58–76); GSI-4—CTT CRT GGT GRT GCT TTT C (643–625); GSI-5—GCA AGC TGC AGC AYG TGA CG (83–102); GSII-1—AAC GCA GAT CAA GGA ATT CG (69–88); GSII-2—ATG CCC GAG CCG TTC CAG TC (686–667); GSII-3—AGR TYT TCG GCA AGG GYT C (542–560); GSII-4—GCG AAC GAT CTG GTA GGG GT (981–962). The conditions for amplification were 1× MgCl2-free buffer (Promega), 1.63 mM MgCl2, 200 µM dNTPs, 16 pmol of each primer, and 1 U Taq polymerase (Promega) per 50-µl reaction. The PCR cycles used were as follows: 97°C for 120 s; 30 cycles of 92°C for 40 s, either 49°C (primer pairs GSI-3/4 and GSII-3/4) or 55°C (primer pairs GSI-1/2 and GSII-1/2) for 40 s, and 72°C for 90 s; and a final 72°C for 10 min. The PCR products were cleaned using the QIAPREP plasmid purification system (QIAGEN) and sequenced directly (both strands) using the Ready Reaction Dye Terminator kit (Perkin Elmer) and an ABI 377 sequencer according to the manufacturers' instructions.
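
The primer list above can be checked mechanically. The following is a minimal sketch, not part of the original methods, that expands the degenerate positions (Y = C/T, R = A/G) and derives the product length implied by the outermost S. meliloti coordinates. The function names `expand` and `amplicon_length` are illustrative choices, and GSI-5 is omitted because it has no paired reverse primer in the list.

```python
from itertools import product

# IUPAC codes that occur in the primers listed in the text (Y = C/T, R = A/G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "R": "AG"}

# name: (sequence 5'->3', position of 5' end, position of 3' end),
# in S. meliloti database coordinates as given in the text.
PRIMERS = {
    "GSI-1": ("AAGGGCGGCTAYTTCCCGGT", 532, 551),
    "GSI-2": ("GTCGAGACCGGCCATCAGCA", 1143, 1124),
    "GSI-3": ("GAYCTGCGYTTYACCGACC", 58, 76),
    "GSI-4": ("CTTCRTGGTGRTGCTTTTC", 643, 625),
    "GSII-1": ("AACGCAGATCAAGGAATTCG", 69, 88),
    "GSII-2": ("ATGCCCGAGCCGTTCCAGTC", 686, 667),
    "GSII-3": ("AGRTYTTCGGCAAGGGYTC", 542, 560),
    "GSII-4": ("GCGAACGATCTGGTAGGGGT", 981, 962),
}


def expand(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


def amplicon_length(forward, reverse):
    """Product length implied by the 5' ends of a forward/reverse primer pair."""
    forward_5prime = PRIMERS[forward][1]
    reverse_5prime = PRIMERS[reverse][1]
    return reverse_5prime - forward_5prime + 1


if __name__ == "__main__":
    for fwd, rev in [("GSI-1", "GSI-2"), ("GSI-3", "GSI-4"),
                     ("GSII-1", "GSII-2"), ("GSII-3", "GSII-4")]:
        print(fwd, rev, amplicon_length(fwd, rev), "bp")
```

The calculation assumes that the S. meliloti coordinates apply unchanged to the other species, which will not hold exactly where indels occur.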

Sequence Data and Accession Numbers
Sequence data corresponding to residues 145–1086 (942 bp) and residues 121–948 (828 bp) of the S. meliloti GSI and GSII database sequences, respectively, were determined for both strands. Accession numbers are given in table 1.

Phylogenetic Analyses
The sequences obtained were aligned using CLUSTAL X (Thompson et al. 1997). The alignments were adjusted to match the alignment of Pesole et al. (1995) when analyses of both GSI and GSII sequences were undertaken. Neighbor-joining analyses (Saitou and Nei 1987) were undertaken using PAUP*, version 4.0b2a (Swofford 1998), and maximum-likelihood analyses using DNAml (which uses the F84 model) and DNAmlk in the PHYLIP 3.57c suite of programs (Felsenstein 1993). All alignments are available from S.L.T.
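
Neighbor-joining operates on a matrix of pairwise distances. As an illustration only, the sketch below computes the Kimura two-parameter (K2P) distance between two aligned sequences while skipping gapped columns, as for the 16S tree in figure 1. It is a hand-written example and not the PAUP* implementation; the function name `k2p_distance` is an assumption of this sketch.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Columns with a gap or an ambiguous base in either sequence are skipped.
    """
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # exclude gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q)
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For two 10-base sequences that differ by a single transition, `k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")` gives -0.5 ln(0.8), slightly more than the uncorrected proportion of 0.1.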


    Results
Amplification and Sequence Determination of GSI and GSII from Rhizobial Type Strains
The PCR primer pairs GSI-1/2 and GSII-1/2 generated single PCR amplification products with each of the rhizobial species used in this study, except for M. huakuii with the GSI-1/2 primer pair and A. caulinodans with either primer pair. The absence of a product for A. caulinodans with the GSII-1/2 primer pair was expected, since biochemical studies have shown that, unlike other members of the rhizobia, A. caulinodans has only the GSI gene (Donald and Ludwig 1984). The full-length A. caulinodans GSI sequence became available during the course of this work (accession number Y10213). Primer GSI-2 has two base mismatches with this sequence at its 3' end, which explains why an amplification product was not obtained with the GSI-1/2 primer set from this strain.

All of the PCR products were sequenced on both strands, and the sequence data were used in conjunction with the earlier alignments to design the primers GSI-3/4 and GSII-3/4 for specific amplification of the 5' region of the GSI and the 3' region of the GSII genes, respectively. PCR products of the expected size were obtained for all type strains screened with these primer sets, except for A. caulinodans with the GSII-3/4 primer pair. Bradyrhizobium japonicum generated a weak amplification product with the GSI-3/4 primer pair, and a reliable sequence could not be obtained by direct sequencing. However, from the poor-quality sequence obtained and published partial B. japonicum GSI sequences (accession numbers M26735 and M10926), a new forward primer, GSI-5, was designed that gave reliable amplification and unambiguous sequence data. A long M. huakuii GSI sequence was generated using primers GSI-3 and GSI-2 with a 49°C anneal and a 120-s extension, and this product was sequenced with all four GSI primers. Long GSI and GSII products can be amplified for all of the fast-growing rhizobia using the lower anneal temperature, an increased extension time, and the GSI-3/2 or GSII-1/4 primer pair, respectively.

Phylogenetic Analysis of GSI Sequences
The sequence data for the GSI-1/2 and GSI-3/4 primer sets were combined for each species, translated, and aligned using CLUSTAL X (fig. 2). Preliminary analyses included translation products of the R. leguminosarum bv. viciae strain RC1001 (X04880) and S. meliloti strain 2011 (U50385) sequences present in the database. These are the only full-length rhizobial GSI sequences available in the databases. The S. medicae sequence obtained in this work was identical to the S. meliloti 2011 sequence. The species S. medicae has only recently been recognized (Rome et al. 1996); it was formerly classified as S. meliloti type B, and so this finding is not unexpected. The R. leguminosarum sequence obtained in this work was most closely related to the R. leguminosarum RC1001 sequence. These sequences were not identical; there appears to be a short sequence rearrangement in the published amino acid sequence, in which residues 225–232 of figure 2 precede residues 220–224. The consensus sequence from all of the rhizobial sequences obtained during this work suggests that this rearrangement is peculiar to the published R. leguminosarum RC1001 sequence or may be a sequencing error: in addition to the rearrangement, there are some point mutations that appear to be unique to the published sequence. These two database sequences were omitted from all subsequent analyses.



Fig. 3.—Phylogenetic relationships of rhizobial type strains for GSI partial gene sequences (A) estimated using the neighbor-joining algorithm with the F81 model (percentage bootstrap values after 1,000 trials are shown) and (B) estimated using maximum likelihood with global rearrangements (Ln -6,366, 1,037 trees). Scale bars represent 0.1 substitutions per site.

 
The A. caulinodans sequence was included in GSI sequence analyses to serve as an outgroup. Neighbor-joining analyses using PAUP* included several different evolutionary models (GTR, K2P, J-C, F81, and F84) and several different codon position combinations (e.g., all, 1 and 2 only, or each individually). The model used and the codon positions analyzed did not affect the tree topology greatly; a representative tree derived using the F81 model and all codon positions is shown in figure 3A. The inclusion of all three codon positions reveals more clearly the intragenus relationships (see Discussion). Maximum-likelihood analysis using the F84 model and all three codon positions (fig. 3B) also produces a tree with very similar topology, as does an analysis confined to second codon positions (not shown). In all analyses, the B. japonicum and A. caulinodans sequences form a distinct outgroup, the sinorhizobia and mesorhizobia always form well-supported clades, and the R. galegae sequence does not group with any other sequences. When third codon positions are excluded, R. leguminosarum, R. etli, and the R. tropici species do not form a distinct clade, although all species are excluded from the Sinorhizobium and Mesorhizobium clades following bootstrap analysis of neighbor-joining trees. These relationships are in good agreement with the species relationships that have been inferred for these bacteria from 16S rDNA sequences (fig. 1).
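
The codon-position combinations described above (all positions, first and second only, or each individually) amount to selecting columns from an in-frame alignment. A minimal sketch follows, assuming that every aligned sequence starts at a first codon position and that gaps were inserted in whole codons. The helper names are hypothetical and do not correspond to PAUP* commands.

```python
def codon_positions(aligned_seq, positions=(1, 2, 3)):
    """Keep only the requested codon positions (1-based) of an in-frame sequence."""
    wanted = {p - 1 for p in positions}
    return "".join(base for i, base in enumerate(aligned_seq) if i % 3 in wanted)


def partition_alignment(alignment, positions):
    """Apply the same codon-position filter to every sequence in an alignment.

    `alignment` maps a taxon name to its aligned, in-frame DNA sequence.
    """
    return {name: codon_positions(seq, positions) for name, seq in alignment.items()}


if __name__ == "__main__":
    toy = {"taxon1": "ATGGCC", "taxon2": "ATGGCT"}  # made-up sequences
    print(partition_alignment(toy, (1, 2)))  # third positions excluded
    print(partition_alignment(toy, (2,)))    # second positions only
```

In the toy alignment the two sequences differ only at a third position, so they become identical once third positions are excluded.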

How Do the GSII Phylogenies Compare with the GSI Results?
The results of phylogenetic analysis of the GSII data are presented as radial phylograms (fig. 4) because they cannot be reliably rooted, since A. caulinodans does not have a GSII. The only other bacteria that have been shown to have GSII genes are the high-GC Gram-positives, and these sequences are too diverged to provide a useful outgroup.



Fig. 4.—Phylogenetic relationships of partial GSII sequences. A, Neighbor-joining gene phylogeny (F81 model). B, Neighbor-joining protein phylogeny (Kimura correction). C, Maximum-likelihood gene phylogeny (Ln -5,144, 1,120 trees). Percentage bootstrap values after 1,000 trials are shown. Scale bars indicate number of substitutions per site.

 
As with the GSI sequences, neighbor-joining analysis of the GSII data was done using PAUP*, invoking a number of different evolutionary models (GTR, K2P, J-C, F81, and F84) and using different codon positions. When all three codon positions were used, the tree topology was largely unaffected by the model used (e.g., fig. 4A), and when third codon positions were excluded, the tree topology more closely resembled the protein tree (fig. 4B). The topologies of these trees are different, and neither is congruent with the GSI phylogenies. The GSII tree topologies differ from the GSI and 16S trees in several ways. First, the R. galegae sequence is positioned within the Rhizobium cluster and suggests close relatedness to R. leguminosarum and R. etli, whereas this species is not closely related to the Rhizobium cluster in GSI or 16S analyses. Second, the B. japonicum sequences are more closely related to the fast-growing rhizobia than expected from the 16S and GSI data. The B. japonicum–R. leguminosarum distance is noticeably longer than the R. leguminosarum–S. fredii distance in the GSI and 16S trees (approximately 1.6- and 2.3-fold, respectively), whereas these distances are approximately equal in the GSII trees. Third, the M. huakuii sequences are more strongly associated with the Rhizobium cluster than with the Mesorhizobium cluster. This association is most pronounced on the protein tree, where M. huakuii is located inside the Rhizobium cluster and well supported by bootstrap values. Maximum-likelihood analysis (all three codon positions) does not resolve the position of M. huakuii, since the tree generated (fig. 4C) is different from both the neighbor-joining trees (fig. 4A and B). The topology of the DNAml tree generated using only second codon positions is similar to that of figure 4C, differing only in the branch order within the Sinorhizobium clade. Maximum-likelihood analyses support the anomalous positions of R. galegae and M. huakuii and the relatively short branch length of B. japonicum inferred from the neighbor-joining analyses. The M. loti sequence, however, occupies an anomalous position compared with the neighbor-joining DNA tree (fig. 4A), and both the M. loti and the M. huakuii sequences occupy anomalous positions compared with the protein tree (fig. 4B). The three tree topologies shown in figure 4 were compared directly using the user-defined trees option of DNAml, allowing branch length to vary. This program performs the Kishino-Hasegawa test (Kishino and Hasegawa 1989) to assess whether or not trees differ significantly. The results suggest that the DNA sequences (all positions) can accommodate all three topologies shown in figure 4 without a significant reduction in likelihood. The same method of analysis was used to compare the GSI and GSII tree topologies: neither data set can accommodate the other tree topology without a significant reduction in likelihood, i.e., the GSI and GSII phylogenies are different (P < 0.05).
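
The logic of the Kishino-Hasegawa comparison can be sketched from per-site log-likelihoods. The example below uses the usual normal approximation: the difference in total log-likelihood between two topologies is compared with its standard error, estimated from the variance of the per-site differences. It is an illustrative sketch, not the DNAml code, and the per-site values in the usage example are invented.

```python
import math


def kishino_hasegawa(site_lnl_a, site_lnl_b):
    """Compare two topologies from their per-site log-likelihoods.

    Returns (difference in total lnL, standard error, z score). Under the
    normal approximation, |z| > 1.96 indicates a difference significant at
    the 5% level (two-sided).
    """
    n = len(site_lnl_a)
    if n != len(site_lnl_b) or n < 2:
        raise ValueError("need two equal-length lists with at least two sites")
    diffs = [a - b for a, b in zip(site_lnl_a, site_lnl_b)]
    total = sum(diffs)
    mean = total / n
    # Variance of the summed difference, estimated from site-to-site variation.
    variance = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    se = math.sqrt(variance)
    z = total / se if se > 0 else float("nan")
    return total, se, z


if __name__ == "__main__":
    tree_a = [-1.0, -2.0, -1.5, -1.0]  # made-up per-site lnL values
    tree_b = [-1.2, -2.1, -1.9, -1.3]
    print(kishino_hasegawa(tree_a, tree_b))
```

A real alignment would supply several hundred sites per tree; four sites are used here only to keep the arithmetic easy to follow.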


    Discussion
How Do the GSI Sequence Data Fit with the Protein Structure?
The S. typhimurium and the R. leguminosarum protein sequences are both 468 aa long; the sequences are very similar and can be aligned easily with conservation of all secondary structures and active-site residues (fig. 2). The level of conservation is such that we felt it plausible to use the solved S. typhimurium structure (Yamashita et al. 1989) as a template to look at the variable sites among the rhizobial sequences. Using the program Rasmol, version 2.6 (www.umass.edu/microbio/rasmol/), and the structural data file (MMDB 2694), we estimated the probable three-dimensional locations and potential interactions of the residues that vary among the rhizobial sequences (see fig. 2). All of the variable sites appear to involve surface residues and/or residues that form part of loop structures of no known structural/functional importance, as might be expected.

When residues that are conserved within the rhizobia but differ from the S. typhimurium sequence are considered, the substitutions are usually conservative. The substitution frequency was calculated by comparison of the S. typhimurium and the S. fredii protein sequences (fig. 2). Residues 136–161 (including β-8, β-9, and α-4) have 20 substitutions, approximately twice the number expected from the frequency for the full-length fragments. Furthermore, this region has a disproportionately high number of substitutions when all possible pairwise comparisons are made between S. fredii, S. typhimurium, and N. gonorrhoeae. Yet, this 20-aa sequence is totally conserved within the 15 fast-growing rhizobia sampled here and within the 7 species of Neisseria sequenced by Zhou, Bowler, and Spratt (1997). The three-dimensional model indicates that the dodecameric homopolymer comprises two opposing hexamer rings. These rings are thought to be held together by a β-sheet structure formed by β-8 and β-9 of opposing monomers in each ring. If this is the case, then the β-8 and β-9 might be expected to be functionally constrained and, therefore, to have low levels of sequence variation.
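
The comparison of observed and expected substitutions in a region can be expressed as a short calculation: the expected count is the number of comparable sites in the window multiplied by the substitution frequency over the whole alignment. The sketch below assumes two aligned protein sequences with "-" as the gap character and 1-based alignment columns. The function name and the toy sequences are illustrative and are not the data of figure 2.

```python
def substitution_enrichment(seq_a, seq_b, start, end):
    """Observed and expected substitutions in alignment columns start..end.

    Coordinates are 1-based and inclusive. Gapped columns are ignored.
    The expected count assumes substitutions are spread evenly along the
    alignment at the overall pairwise frequency.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    total_sites = total_subs = window_sites = window_subs = 0
    for col, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a == "-" or b == "-":
            continue
        differs = int(a != b)
        total_sites += 1
        total_subs += differs
        if start <= col <= end:
            window_sites += 1
            window_subs += differs
    if total_sites == 0:
        raise ValueError("no comparable sites")
    expected = window_sites * total_subs / total_sites
    return window_subs, expected


if __name__ == "__main__":
    observed, expected = substitution_enrichment("AAAAAAAAAA", "AAAATTAAAT", 5, 6)
    print(observed, expected)  # observed exceeds expected in this toy window
```

A ratio of observed to expected near 2 would correspond to the roughly twofold excess reported for residues 136–161.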

Brown et al. (1994) proposed that there has been an insertion of approximately 25 amino acids in the GSIβ subset of sequences that includes the S. typhimurium and rhizobial proteins (see fig. 5). This insert would correspond to residues 144–166 of figure 2, including all of β-9 and α-4. Pesole et al. (1995) proposed that the insert corresponds to residues 138–162, including most of β-8, as well as β-9 and α-4, and corresponding almost exactly to the region of higher-than-expected substitutions between S. fredii and S. typhimurium. If these functionally important structures are the result of an insertion event, then determining the structure of the ancestral form without the insert, represented by either a GSII or a GSIα polymer, should help to identify the boundaries and nature of the insertion event.



Fig. 5.—Maximum-likelihood phylogeny, assuming constant rates, used to estimate divergence times. The phylogeny was constructed using only the regions of GSI and GSII sequenced in this study; only second codon positions were considered, and gaps were ignored (245 positions). The scale bar represents the number of substitutions per site. Other sequences taken from the database are as follows: GSI—Azospirillum brasilense (M26107), Neisseria gonorrhoeae (M84113), Escherichia coli (X05173), Salmonella typhimurium (M14536), Streptomyces viridochromogenes (X70924), Frankia alni (L10631), Bacillus subtilis (D00854), Lactobacillus delbrueckii (D10020), Pyrococcus furiosus (L12410); GSII—Rattus norvegicus (M91625), Homo sapiens (X59834), Gallus gallus (M29076), Xenopus laevis (D50062), Drosophila melanogaster (X52759), Saccharomyces cerevisiae (M65157), Medicago truncatula (Y10267), Brassica napus (X82997), Zea mays (X65926), Pinus sylvestris (X69822), Skeletonema costatum (AF064638), Frankia sp. (M58415), Streptomyces viridochromogenes (X52842).

 
What Do GS Sequences Tell Us About the Evolutionary Relationships of the Rhizobia?
The relationships among the different genera of rhizobia inferred from the GSI sequence comparisons are in good agreement with those inferred previously from 16S rDNA data. There is significant support for the three fast-growing genera (Rhizobium, Sinorhizobium, and Mesorhizobium) and also strong support for the distinct position of B. japonicum outside of these three. Both the 16S and the GSI genes also suggest that R. galegae is not strongly associated with any one of the three fast-growing genera. Eardly, Wang, and van Berkum (1996) have proposed that the 16S gene of R. galegae is chimeric, and its true taxonomic position within the rhizobia is under review.

The GSII relationships show several differences from GSI and 16S: R. galegae strongly groups within the Rhizobium cluster, B. japonicum is more closely related to the fast-growing rhizobia than predicted from the other two data sets, and the Mesorhizobium clade is not supported. The M. huakuii GSII sequence appears to be more closely related to Rhizobium sequences than it is to Mesorhizobium sequences. This finding was checked by sequencing GSI and GSII of the M. huakuii type strain (CCBAU 2609) held in the HAMBI collection. The sequences were identical to those obtained for the USDA 4779 strain. Sequences from other strains independently isolated from the same host plant species (Astragalus sinicus) also share the anomalous position of the M. huakuii sequence (our unpublished results). The B. japonicum GSII DNA sequence obtained in this work is most similar, although not identical, to the B. japonicum sequence in the database. The protein sequences are, however, identical: these results suggest that the unusual position of this type strain sequence is not an artifact.

The possibility that one or all of these anomalies might be the consequence of a recent recombination event was investigated using PLATO (Grassly and Holmes 1997). This program was designed to identify regions of a sequence alignment that produce phylogenies that are inconsistent with the maximum-likelihood phylogeny for the full-length alignment. PLATO, version 2.01f, did not identify any extended regions of the GSII sequence alignment that would generate anomalous phylogenies under the F81 model when all three codon positions were used. Neither did PhylPro (Weiller 1998), which uses a sliding-window approach to identify recombination sites. Thus, there is no evidence for a recombination break point within the GSII sequence.

PLATO was also used to assess the GSI data set (all positions, F84 model); an extensive region (72–509 bp of the alignment; a Z value of 4.99, where values >3.70 are significant) was highlighted by the program. Visual inspection of the GSI protein and DNA sequence alignments did not reveal any obvious sequence discontinuities. The maximum-likelihood phylogeny of this fragment (72–509 bp, not shown) is slightly different from that shown in figure 3B. The anomalous region can accommodate the full-length tree configuration without a significant reduction in log-likelihood when the user-defined trees option is invoked and branch lengths are allowed to vary. Furthermore, removal of single sequences (B. japonicum, S. meliloti, or R. etli) and reanalysis with PLATO results in loss of the anomalous region, suggesting that the anomaly is not dependent on any one sequence. Again, PhylPro did not produce results consistent with a recent recombination event, and we conclude that there is no clear evidence for a recombination break point within the GSI sequence.

Comparison of the intrageneric species relationships for each of the three gene sequences considered in this paper, 16S rDNA, GSI, and GSII (figs. 1, 3A, and 4A), suggests that the Mesorhizobium species relationships are the least consistent. For example, in the 16S phylogeny, the M. ciceri and M. loti sequences are most closely related (100% bootstrap support), whereas the M. ciceri and M. loti GSI sequences are less closely related, although they are definitely in the same clade. According to the GSII analyses, the M. loti sequence is not strongly grouped with the other mesorhizobia. In contrast, all three gene phylogenies support the Sinorhizobium clade. Within this clade, the S. medicae–S. meliloti association is strongly supported in all three phylogenies, although the relationships of the other three species in this clade are poorly resolved on both the GSI and the GSII trees. The R. leguminosarum–R. etli and R. tropici A–R. tropici B relationships are well supported in all three trees, and these species consistently group together. Recent evidence suggests exchange of chromosomal genes between Mesorhizobium species, which has not been so clearly documented for the other fast-growing rhizobia. Sullivan et al. (1996) have identified Mesorhizobium isolates that are different species according to DNA-DNA hybridization analysis but have identical or very similar 16S rDNA sequences. However, more research is required before any general conclusions about the levels and extent of gene exchange within and among the different genera of rhizobia can be made.

Can We Date Speciation Within the Rhizobia?
There is very little fossil evidence for bacteria, so divergence times are almost impossible to estimate. One way to circumvent this problem is to correlate bacterial splits with those of higher organisms that have more complete fossil records (Ochman and Wilson 1987). For example, the divergence times of aphid species were used to date the divergence of their maternally inherited, obligate intracellular symbiont (Buchnera aphidicola) lineages to between 100 and 250 MYA. However, the extrapolation of these dates to other bacterial lineages is now in doubt, because B. aphidicola appears to have an increased substitution rate (Moran, von Dohlen, and Baumann 1995; Brynnel et al. 1998). An alternative approach is to calibrate divergence in a bacterial gene with the homologous gene in eukaryotes for which a fossil record is available. As mentioned earlier, duplicated genes such as GS are ideally suited to molecular-clock studies, since the duplication event allows the resultant phylogenetic tree to be rooted unambiguously. This, in turn, allows the uniformity or nonuniformity of rates to be assessed. The most ancient duplication of GS genes is likely to have occurred before the split of prokaryotes and eukaryotes (Kumada et al. 1993). A molecular-clock approach should allow the divergence times of rhizobial GSII sequences to be estimated from the fossil record for higher organisms. These dates might then allow the divergence times of rhizobia and other bacterial lineages to be estimated using GSI sequences if the rates appear to be equivalent in the two halves of the tree.
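
The calibration step described above reduces to simple arithmetic under a strict clock: a pairwise distance between two lineages that split T million years ago accumulates along both branches, so the per-lineage rate is d / (2T), and an undated split is then dated by inverting the same relation. The sketch below illustrates this with invented numbers; the calibration distance, calibration age, and bacterial distance are hypothetical and are not values from this study.

```python
def substitution_rate(calibration_distance, calibration_age_mya):
    """Substitutions per site per million years along a single lineage.

    The pairwise distance is halved because it is accumulated along both
    lineages since their common ancestor.
    """
    return calibration_distance / (2.0 * calibration_age_mya)


def divergence_time(distance, rate):
    """Divergence time (MYA) implied by a pairwise distance at a given rate."""
    return distance / (2.0 * rate)


if __name__ == "__main__":
    # Hypothetical calibration: two eukaryote GSII sequences, distance 0.10,
    # with a fossil-dated split 400 MYA (numbers invented for illustration).
    rate = substitution_rate(0.10, 400.0)
    # Hypothetical bacterial pair with a distance of 0.05 for the same gene.
    print(divergence_time(0.05, rate), "MYA")
```

The result is only as reliable as the assumption that the calibration lineages and the bacterial lineages evolve at the same rate, which is the point examined with the constant-rate phylogeny of figure 5.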

Figure 5 shows the phylogeny of GSI and GSII sequences, including reference eukaryote sequences and some bacterial GSI sequences, calculated assuming constant rates. The alignment used was based on that of Pesole et al. (1995); only second codon positions were considered, and positions with gaps were ignored (245 sites). Pesole et al. (1995) showed that second codon positions obeyed stationarity for the taxa used: the plot of %GC of second codon positions against %GC of third codon positions showed that all taxa displayed the same base composition, within statistical variations, for second codon positions. This is also true for the sequences used in this study (data not shown). The likelihood ratio test was used to compare the DNAmlk phylogeny shown (ln L = -4,148) with the equivalent, unrooted DNAml phylogeny (ln L = -4,127) (not shown). The test statistic is twice the difference in log likelihood (χ² = 42, df = 30), which is below the 5% critical value of 43.77, so the DNAmlk phylogeny is not significantly less likely than the DNAml phylogeny (DNAmlk/PHYLIP notes).
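
The arithmetic of this likelihood ratio test can be checked with a short Python sketch. The two log likelihoods and the degrees of freedom are taken from the text; the function names are illustrative only and are not part of PHYLIP. The chi-square tail probability uses the closed form that holds for even degrees of freedom, which is an assumption that happens to fit df = 30.

```python
import math


def lrt_statistic(lnl_unconstrained, lnl_constrained):
    """Twice the log-likelihood difference between the two trees."""
    return 2.0 * (lnl_unconstrained - lnl_constrained)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i      # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


# Values quoted in the text: DNAml (no clock) vs. DNAmlk (clock enforced).
stat = lrt_statistic(-4127.0, -4148.0)
p_value = chi2_sf_even_df(stat, 30)
print(f"chi-square = {stat:.0f}, df = 30, P = {p_value:.3f}")
```

A P value above 0.05 here corresponds to the conclusion in the text that the clock-constrained tree is not significantly worse than the unconstrained tree.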

Only the type species of each genus (B. japonicum, R. leguminosarum, S. fredii, and M. loti) and R. galegae, because of its uncertain phylogenetic position, were included in the calculations for figure 5. The phylogeny confirms that the B. japonicum GSII sequence is not a reliable outgroup for the fast-growing rhizobia for this gene. The relative distances between the fast-growing rhizobia and B. japonicum differ greatly in the GSI and GSII halves of the tree, which would make any interpretation involving this species rather tenuous. However, the divergence times between the fast-growing rhizobia are remarkably similar for the GSI and GSII halves of the tree, suggesting that these sequences have behaved in a clocklike manner in the rhizobia. This was confirmed by a plot of GSI second-codon-position distances (F84 model) against those for GSII for the taxa in figure 5 that have both genes (the rhizobia and the high-GC Gram-positives). The results suggest that when B. japonicum is ignored, the rate of substitution is equivalent for GSI and GSII for all taxa (within 95% confidence limits).
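
One simple way to summarize such a distance-against-distance plot is the least-squares slope of a line forced through the origin: a slope near 1 is what equal substitution rates in the two paralogs would produce. The sketch below is an illustration only. The distance values are hypothetical placeholders, not data from this study, and the function name is an assumption. Pairwise distances share branches and are therefore not independent, so a proper confidence interval needs more care than this sketch provides.

```python
def slope_through_origin(x, y):
    """Least-squares slope b for the no-intercept model y = b * x."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    sxx = sum(a * a for a in x)
    if sxx == 0:
        raise ValueError("all x values are zero; slope is undefined")
    return sum(a * b for a, b in zip(x, y)) / sxx


# HYPOTHETICAL pairwise distances for the same taxon pairs (not from the paper).
gsi_distances = [0.02, 0.05, 0.08, 0.11]
gsii_distances = [0.021, 0.048, 0.083, 0.108]

slope = slope_through_origin(gsi_distances, gsii_distances)
print(f"GSII/GSI rate ratio (slope) = {slope:.2f}")
```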

Recent reports using many genes to assess the likely divergence times of some higher eukaryotes have produced results that are in good agreement with fossil evidence (Goremykin, Hansmann, and Martin 1997; Kumar and Hedges 1998; Wang, Kumar, and Hedges 1999). Kumar and Hedges (1998) used fossil evidence that dates the diapsid-synapsid (bird-mammal) split at 310 MYA to estimate the amphibian-bird/mammal (360 ± 14.7 MYA) and rodent-human (112 ± 3.2 MYA) divergence times. Their later work (Wang, Kumar, and Hedges 1999) places the chordate-arthropod split at 993 ± 46 MYA and suggests that the split between animals, plants, and fungi occurred 1,200–1,500 MYA. Goremykin, Hansmann, and Martin (1997) estimated divergence times for the monocot-dicot and angiosperm-gymnosperm splits of 160 and 348 MYA (±10%), respectively, using a calibration point of 450 MYA for the Marchantia–vascular plant divergence. If the bird-mammal split is used as a calibration point for the GS data, however, the divergence times of other splits do not agree with those estimated using multiple gene sequences (table 2). When the amphibian-bird/mammal split or the angiosperm-gymnosperm split is used for calibration, the divergence times for splits other than the bird-mammal split are in better agreement with those proposed by Kumar and Hedges (1998) and Goremykin, Hansmann, and Martin (1997) (table 2).


Table 2 Divergence Times Estimated Assuming Clocklike Behavior of GS Gene Sequences, Using Three Different Calibration Points (Indicated in Bold)

 
Using the divergence times estimated from these data, the fast-growing rhizobial genera would have diverged between 203 and 324 MYA (see table 2). This is earlier than both the monocot-dicot (156–171 MYA) and the brassica-legume (125–136 MYA) splits, i.e., before the existence of leguminous plants. The possibility that rhizobia started to diverge before the existence of legumes has been proposed before: Young and Johnston (1989) suggested that the last common ancestor of fast-growing rhizobia and B. japonicum predates the existence of angiosperms. This assertion is supported by the deep branch between these taxa in the GSI half of figure 5, which would date the divergence between bradyrhizobia and the fast-growing rhizobia at 507–553 MYA. This supports the idea that the nodulation genes (those specifically involved in forming the symbiotic nodule), which are thought to have a single, but unknown, origin, arose after the bacterial divergence and have spread between species and genera (Young and Johnston 1989; Dobert, Breil, and Triplett 1994).
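
The scaling that underlies dates of this kind can be sketched in a few lines of Python. Under a strict clock, the depth of a node in the tree is proportional to its age, so one node of known age fixes the rate for all others. In the sketch below, only the 360 MYA age for the amphibian-bird/mammal split comes from the text; both node depths are hypothetical placeholders rather than values read from figure 5 or table 2, and the function name is an assumption.

```python
def date_split(node_depth, calib_depth, calib_age_mya):
    """Convert a node depth to an age (MYA) under a strict molecular clock.

    node_depth and calib_depth are in substitutions per site, measured
    from the tips of a clock-constrained tree.
    """
    if node_depth < 0 or calib_depth <= 0 or calib_age_mya <= 0:
        raise ValueError("depths and calibration age must be positive")
    rate = calib_depth / calib_age_mya   # substitutions per site per MY
    return node_depth / rate


CALIB_AGE_MYA = 360.0    # amphibian-bird/mammal split (Kumar and Hedges 1998)
CALIB_DEPTH = 0.090      # HYPOTHETICAL depth of the calibration node
RHIZOBIA_DEPTH = 0.060   # HYPOTHETICAL depth of a rhizobial split

age = date_split(RHIZOBIA_DEPTH, CALIB_DEPTH, CALIB_AGE_MYA)
print(f"Estimated split age: about {age:.0f} MYA")
```

Repeating the calculation with each calibration point in turn gives a range of ages for the same node, which is how a single split can be reported as an interval.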

Because there is no informative fossil record, dating bacterial divergence times is very difficult. Some estimates have been made (Ochman and Wilson 1987; Doolittle et al. 1996) based on the assumption that eukaryotes and bacteria have comparable substitution rates for all genes. These date the E. coli–S. typhimurium split at 120–160 MYA based on 16S rDNA (Ochman and Wilson 1987), or at 100 MYA based on several protein sequences (Doolittle et al. 1996). The GSI data would place the E. coli–S. typhimurium split at 69–75 MYA, somewhat later than these estimates, which, unlike the GS data, do not have an internal assessment of clocklike behavior for the bacterial and eukaryotic sequences. The similar branch lengths for the fast-growing rhizobia in both the GSI and GSII halves of figure 5 suggest that these paralogs do behave as good molecular clocks.


    Acknowledgements
 
This work was funded by the EU as part of the INCO-DC Programme, contract IC18-CT96-0103, and by NERC grant GR3/10819. We are also grateful to A. Mould for technical support with sequencing.


    Footnotes
 
Richard H. Thomas,

1 Keywords: recombination, molecular clock, bacteria, glutamine synthetase, rhizobia

2 Address for correspondence and reprints: Sarah L. Turner, Department of Biology, University of York, P.O. Box 373, York, YO10 5YW, U.K. E-mail: slt4@york.ac.uk


    literature cited
 

    Brown, J. R., Y. Masuchi, F. T. Robb, and W. F. Doolittle. 1994. Evolutionary relationships of bacterial and archaeal glutamine synthetase genes. J. Mol. Evol. 38:566–576.

    Brynnel, E. U., C. G. Kurland, N. A. Moran, and S. G. E. Andersson. 1998. Evolutionary rates for tuf genes in endosymbionts of aphids. Mol. Biol. Evol. 15:574–582.

    Chen, W. X., G. H. Yan, and J. L. Li. 1988. Numerical taxonomy study of fast-growing soy bean rhizobia and a proposal that Rhizobium fredii be assigned to Sinorhizobium gen. nov. Int. J. Syst. Bacteriol. 38:392–397.

    de Lajudie, P., A. Willems, B. Pot, D. Dewittinck, G. Maestrojuan, M. Neyra, M. D. Collins, B. Dreyfus, K. Kersters, and M. Gillis. 1994. Polyphasic taxonomy of rhizobia: emendation of the genus Sinorhizobium and description of Sinorhizobium meliloti comb. nov., Sinorhizobium saheli sp. nov., and Sinorhizobium teranga sp. nov. Int. J. Syst. Bacteriol. 44:715–733.

    Dobert, R. C., B. T. Breil, and E. W. Triplett. 1994. DNA sequence of the common nodulation genes of Bradyrhizobium elkanii and their phylogenetic relationship to those of other nodulating bacteria. Mol. Plant Microbe Interact. 7:564–572.

    Donald, R. G. K., and R. A. Ludwig. 1984. Rhizobium sp. strain ORS571 ammonium assimilation and nitrogen fixation. J. Bacteriol. 158:1144–1151.

    Doolittle, R. F., D.-F. Feng, S. Tsang, G. Cho, and E. Little. 1996. Determining divergence times of the major kingdoms of living organisms with a protein clock. Science 271:470–477.

    Eardly, B. D., F. S. Wang, and P. van Berkum. 1996. Corresponding 16S rRNA gene segments on Rhizobiaceae and Aeromonas yield discordant phylogenies. Plant Soil 186:69–74.

    Feil, E., J. Zhou, J. Maynard Smith, and B. G. Spratt. 1996. A comparison of the nucleotide sequences of the adk and recA genes of pathogenic and commensal Neisseria species: evidence for extensive interspecies recombination within adk. J. Mol. Evol. 43:631–640.

    Felsenstein, J. 1993. PHYLIP (phylogeny inference package). Version 3.5c. Distributed by the author, Department of Genetics, University of Washington, Seattle.

    Goremykin, V. V., S. Hansmann, and W. F. Martin. 1997. Evolutionary analysis of 58 proteins encoded in six completely sequenced chloroplast genomes: revised molecular estimates of two seed plant divergence times. Plant Syst. Evol. 206:337–351.

    Grassly, N. C., and E. C. Holmes. 1997. A likelihood method for the detection of selection and recombination using nucleotide sequences. Mol. Biol. Evol. 14:239–247.

    Herrera-Cervera, J. A., J. M. Sanjuan-Pinilla, J. Olivares, and J. Sanjuan. 1998. Cloning and identification of conjugative transfer origins in the Rhizobium meliloti genome. J. Bacteriol. 180:4583–4590.

    Iwabe, N., K.-I. Kuma, M. Hasegawa, S. Osawa, and T. Miyata. 1989. Evolutionary relationship of archaebacteria, eubacteria, and eukaryotes inferred from phylogenetic trees of duplicated genes. Proc. Natl. Acad. Sci. USA 86:9355–9359.

    Jarvis, B. D. W., P. van Berkum, W. X. Chen, S. M. Nour, M. P. Fernandez, J. C. Cleyet-Marel, and M. Gillis. 1997. Transfer of Rhizobium loti, Rhizobium huakuii, Rhizobium ciceri, Rhizobium mediterraneum and Rhizobium tianshanense to Mesorhizobium gen. nov. Int. J. Syst. Bacteriol. 47:895–898.

    Kishino, H., and M. Hasegawa. 1989. Evaluation of the maximum-likelihood estimation of the evolutionary tree topologies from DNA-sequence data, and the branching order in the Hominoidea. J. Mol. Evol. 29:170–179.

    Kumada, Y., D. R. Benson, D. Hillemann, T. J. Hosted, D. A. Rochefort, C. J. Thompson, W. Wohlleben, and Y. Tateno. 1993. Evolution of the glutamine synthetase gene, one of the oldest existing and functioning genes. Proc. Natl. Acad. Sci. USA 90:3009–3013.

    Kumar, S., and S. B. Hedges. 1998. A molecular timescale for vertebrate evolution. Nature 392:917–920.

    Maynard Smith, J., N. H. Smith, M. O'Rourke, and B. G. Spratt. 1993. How clonal are bacteria? Proc. Natl. Acad. Sci. USA 90:4384–4388.

    Moran, N. A., C. D. von Dohlen, and P. Baumann. 1995. Faster evolutionary rates in endosymbiotic bacteria than in cospeciating insect hosts. J. Mol. Evol. 41:727–731.

    Ochman, H., and A. C. Wilson. 1987. Evolution in bacteria: evidence for a universal substitution rate in cellular genomes. J. Mol. Evol. 26:74–86.

    Pesole, G., M. P. Bozzetti, C. Lanave, G. Preparata, and C. Saccone. 1991. Glutamine synthetase gene evolution: a good molecular clock. Proc. Natl. Acad. Sci. USA 88:522–526.

    Pesole, G., C. Gissi, C. Lanave, and C. Saccone. 1995. Glutamine synthetase gene evolution in bacteria. Mol. Biol. Evol. 12:189–197.

    Rigottier-Gois, L., S. L. Turner, J. P. W. Young, and N. Amarger. 1998. Distribution of repC plasmid-replication sequences among plasmids and isolates of Rhizobium leguminosarum bv. viciae from field populations. Microbiology 144:771–780.

    Rome, S., M. P. Fernandez, B. Brunel, P. Normand, and J.-C. Cleyet-Marel. 1996. Sinorhizobium medicae sp. nov., isolated from annual Medicago spp. Int. J. Syst. Bacteriol. 46:972–980.

    Saitou, N., and M. Nei. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406–425.

    Shatters, R. G., Y. Liu, and M. L. Kahn. 1993. Isolation and characterisation of a novel glutamine synthetase from Rhizobium meliloti. J. Biol. Chem. 268:469–475.

    Souza, V., and L. E. Eguiarte. 1997. Bacteria gone native vs. bacteria gone awry?: plasmidic transfer and bacterial evolution. Proc. Natl. Acad. Sci. USA 94:5501–5503.

    Souza, V., T. T. Nguyen, R. R. Hudson, D. Piñero, and R. E. Lenski. 1992. Hierarchical analysis of linkage disequilibrium in Rhizobium populations: evidence for sex? Proc. Natl. Acad. Sci. USA 89:8389–8393.

    Sullivan, J. T., B. D. Eardly, P. van Berkum, and C. W. Ronson. 1996. Four unnamed species of nonsymbiotic rhizobia isolated from the rhizosphere of Lotus corniculatus. Appl. Environ. Microbiol. 62:2818–2825.

    Sullivan, J. T., H. N. Patrick, W. L. Lowther, D. B. Scott, and C. W. Ronson. 1995. Nodulating strains of Rhizobium loti arise through chromosomal symbiotic gene transfer in the environment. Proc. Natl. Acad. Sci. USA 92:8985–8989.

    Swofford, D. L. 1998. PAUP*. Phylogenetic analysis using parsimony (* and other methods). Version 4.0b2a. Sinauer, Sunderland, Mass.

    Taboada, H., S. Encarnacion, C. del Carmen Vargas, Y. Mora, E. Martínez-Romero, and J. Mora. 1996. Glutamine synthetase II constitutes a novel taxonomic marker in Rhizobium etli and other Rhizobium species. Int. J. Syst. Bacteriol. 46:485–591.

    Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and D. G. Higgins. 1997. The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25:4876–4882.

    Wang, D. Y.-C., S. Kumar, and S. B. Hedges. 1999. Divergence time estimates for the early history of animal phyla and the origin of plants, animals and fungi. Proc. R. Soc. Lond. B Biol. Sci. 266:163–171.

    Weiller, G. F. 1998. Phylogenetic profiles: a graphical method for detecting genetic recombinations in homologous sequences. Mol. Biol. Evol. 15:326–335.

    Woese, C. R. 1987. Bacterial evolution. Microbiol. Rev. 51:221–271.

    Yamashita, M. M., R. J. Almassy, C. A. Janson, D. Cascio, and D. Eisenberg. 1989. Refined atomic model of glutamine synthetase at 3.5 Å resolution. J. Biol. Chem. 264:17681–17690.

    Young, J. P. W. 1996. Phylogeny and taxonomy of rhizobia. Plant Soil 186:45–52.

    Young, J. P. W., and A. W. B. Johnston. 1989. The evolution of specificity in the legume-Rhizobium symbiosis. Trends Ecol. Evol. 4:341–349.

    Zhou, J. J., L. D. Bowler, and B. G. Spratt. 1997. Interspecies recombination, and phylogenetic distortions, within the glutamine synthetase and shikimate dehydrogenase genes of Neisseria meningitidis and commensal Neisseria species. Mol. Microbiol. 23:799–812.

Accepted for publication November 1, 1999.