Department of Microbiology (G08), University of Sydney, NSW 2006, Australia
Author for correspondence: Peter R. Reeves. Tel: +612 9351 2536. Fax: +612 9351 4571. e-mail: reeves{at}angis.usyd.edu.au
ABSTRACT
Keywords: rml genes, rhamnose pathway, Salmonella enterica, lateral gene transfer
Abbreviations: CPS, capsular polysaccharide
The GenBank accession numbers for the sequences reported in this paper are AF279615–AF279625 for the rml gene sets and AF279626–AF279648 for the rmlB gene fragments.
INTRODUCTION
The genes involved in the biosynthesis of O antigen are clustered at 45 min on the chromosome in S. enterica, flanked by the galF and gnd genes at the 5' and 3' ends, respectively. The G+C content of O antigen gene clusters is generally atypical, suggesting that they have transferred from other species relatively recently (Reeves, 1993). The 2435 serovars of S. enterica are classified into seven subspecies (I, II, IIIa, IIIb, IV, V, VI) based on their biochemical characteristics (Popoff & Le Minor, 1997). This classification has been confirmed by multilocus enzyme electrophoresis and nucleotide sequencing of several housekeeping genes: gapA (glyceraldehyde-3-phosphate dehydrogenase), mdh (malate dehydrogenase) and putP (proline permease) (Boyd et al., 1994; Nelson & Selander, 1992; Nelson et al., 1991). Both methods have given discrete groups for each subspecies, suggesting very little recombination between subspecies. However, O antigen genes appear to have transferred among subspecies, as the majority of S. enterica O antigens are found in at least two subspecies with a mean of 3·5 subspecies per O antigen (Popoff & Le Minor, 1997; Reeves, 1995). The relatively higher level of transfer of O antigen gene clusters within or between species is believed to reflect the advantage from time to time of an alternative O antigen for bacterial adaptation to new niches, followed by natural selection for recombinants (Reeves, 1997).
Although research in S. enterica, E. coli and other species makes it evident that O antigen gene clusters have been subject to gene transfer between species (Comstock et al., 1995; Jiang et al., 1991; Stevenson et al., 1994), there are few studies on the origin of the O antigen gene clusters and the extent of their transfer between species, due to the limited availability of O antigen and other polysaccharide gene cluster sequences.
The genes in O antigen gene clusters generally fall into three classes: pathway genes for the biosynthesis of nucleotide sugars; transferase genes, mostly glycosyl transferase genes, for the synthesis and modification of the O unit; and processing genes, such as wzx and wzy, for the polymerization and transport of O units (Reeves et al., 1996). Compared with transferase genes and processing genes, which are very heterogeneous among different O antigens due to the wide range of linkages involved, pathway genes are generally homologous throughout all species. Thus, pathway genes are good candidates for studying relationships and lateral gene transfer of O antigen gene clusters.
L-Rhamnose (rhamnose) is a 6-deoxyhexose sugar which is widely distributed in O antigens of Gram-negative bacteria and is also commonly present in the capsular polysaccharides (CPSs) of Gram-positive bacteria. dTDP-L-rhamnose is the activated precursor of the rhamnose moiety in O antigens and CPSs. rmlA, rmlB, rmlC and rmlD encode (in biosynthetic pathway order) glucose-1-phosphate thymidylyl-transferase, dTDP-D-glucose-4,6-dehydratase, dTDP-6-deoxy-D-glucose-3,5-epimerase and dTDP-6-deoxy-L-mannose dehydrogenase, respectively, which are responsible for the four-step biosynthesis of dTDP-L-rhamnose from glucose 1-phosphate. Some of the sugars commonly found in O antigens and CPSs are also involved in general metabolism, and the biosynthetic pathway genes for UDP-glucose and UDP-galactose, for example, are generally on the chromosome outside of the O antigen or CPS gene clusters. However, rhamnose is commonly present in bacteria only as a component of a surface polysaccharide and the four rml genes are generally arranged as a separate group within the O antigen or CPS gene cluster. The four rml genes have been identified in a range of species and are clearly homologous, although the gene order may vary from species to species (DeShazer et al., 1998; Guidolin et al., 1994; Koplin et al., 1993; Mitchison et al., 1997; Tsukioka et al., 1997). Therefore rml genes could be useful in studying the relationships of O antigen and other gene clusters.
Before investigating rml genes from a broad range of species, we first inspected the variation within a species and in this report focus on genetic variation in the four rml genes in S. enterica. Lüderitz et al. (1966) determined the major polysaccharide components of the then 37 known LPS forms in S. enterica and found that 12 have rhamnose in their O antigens. Among the nine O antigens which were found after Lüderitz's study, three (E4, D3 and O62) were found to include rhamnose by DNA hybridization (Xiang et al., 1993) or determination of their structure (Vinogradov et al., 1994). O67 is a variant of B (Lei & Reeves, unpublished), O61 has no rhamnose by structural study (Vinogradov et al., 1992) and nothing is known about the sugar constituents of O60, O63, O65 and O66.
Of the eight previously studied and closely related O antigens of S. enterica (A, B, C2, D1, D2, D3, E1 and E4) the rml gene set of O antigen B has been fully sequenced (Jiang et al., 1991) and that of O antigen E1 nearly fully sequenced (Wang et al., 1992). It was found that the rml genes for both O antigens are arranged in the order rmlB, rmlD, rmlA and rmlC at the 5' end of the O cluster. rmlB, rmlD and most of rmlA for O antigen E1 are very similar to those of O antigen B. However, similarity of rml genes of E1 with those of B falls sharply from near the end of rmlA to the end of rmlC (Wang et al., 1992). The four rml genes for O antigens A, C2, D1, D2 and D3 are almost identical to those of O antigen B, based on restriction mapping (Brown et al., 1992; Liu et al., 1991; Reeves, 1993; Xiang, 1995). Likewise, the gene clusters for E1 and E4 have the same hybridization patterns, even for glycosyl transferase genes (Xiang et al., 1993). O antigens E2 and E3 have already been incorporated into E1 (Popoff & Le Minor, 1997) as they have the same chromosomal gene cluster with the differences known to be due to genes on converting phages, and the difference between E1 and E4 is proposed to be also due to the presence of a gene(s) on a converting phage in E4, although the phage has not been observed (Xiang et al., 1993). For other rhamnose-containing O antigens (O11, O28, O42, O53, O57 and O59), a gradient of divergence was observed when probes of the four rml genes of O antigen B strain LT2 were used for hybridization (Xiang et al., 1993). The rmlC genes of O11, O28, O42, O53, O57 and O59 strains, like that of E1, all failed to hybridize. For some of these O antigens, rmlA showed only weak hybridization, whereas all were strongly positive for rmlB and rmlD.
In this study, we sequenced the rml genes from a representative strain for each O antigen containing rhamnose, including an O62 strain, other than for those already sequenced or, as discussed above, known to be almost identical to those of the B or E1 O antigen gene clusters. We also sequenced the 5' end of the O antigen gene clusters of O60, O63, O65 and O66 to see if they contain rml genes. A further analysis of sequence and phylogeny was performed based on rml genes from all rhamnose-containing O antigens of S. enterica.
METHODS
The same method was used to sequence the 5' end sequences of M1939 (60,V), M1953 (63,IIIa), M330 (65,IIIb) and M322 (66,V) O antigen gene clusters.
Amplification of long-range PCR products was carried out by using the expand long-range PCR kit from Boehringer Mannheim according to the manufacturer's instructions. Sequencing was carried out by the Sydney University and Prince Alfred Macromolecular Analysis Centre (SUPAMAC) using an Applied Biosystems model 377A automated DNA sequencing system and the Applied Biosystems dye terminator cycle sequencing kit.
Cloning of the rmlC gene from M269 (28,I).
No rmlC gene was found immediately downstream of rmlA in M269 (28,I) and two O53 strains. A selection method for cloning rmlC genes was devised by modification of the method described by Clarke & Whitfield (1992). Bacteriophage Ffm (Wilkinson et al., 1972) lyses E. coli strains with rough LPS (Schmidt et al., 1974). P5435 is an E. coli K-12 strain that was constructed so that it contains all the genes necessary for the biosynthesis of K-12 O antigen with the exception of rmlC, which was deleted (data not shown). Plasmid gene banks of M269 (28,I) and the two O53 strains were constructed as follows: the chromosomal DNA was partially digested with Sau3AI. DNA fragments of 2–8 kb were collected from an agarose gel and ligated to BamHI-digested pGEM7zf(+) vector. The ligation mix was then transformed into P5435 by electroporation and the transformants applied to plates pre-seeded with 10⁵ p.f.u. phage Ffm. Bacteria transformed with an rmlC-containing plasmid would make smooth LPS, which would prevent them from being lysed by phage Ffm, and such bacteria were obtained by selection for resistance to the phage. The bacteria were further examined by agglutination with O16 antisera and the plasmids were confirmed by sequencing of the insert.
Computer analysis.
DNA sequence data were assembled and edited using programs from ANGIS (The Australian National Genomic Information Service) at the University of Sydney. Pairwise comparisons and polymorphism analysis of DNA sequence data were conducted using the MULTICOMP package (Reeves et al., 1994), which incorporates a number of programs for DNA sequence and phylogeny analysis. Phylogenetic trees were constructed both by the parsimony method using PAUP (version 4.0) and the neighbour-joining method (Saitou & Nei, 1987) using PHYLIP (version 3.4, written by J. Felsenstein, Department of Genetics, University of Washington, Seattle, USA). Except for some minor differences, the trees generated have the same topology, and only neighbour-joining trees are presented in this study. Intragenic recombination was detected by the Stephens test (Stephens, 1985) and the Maximum chi-squared program (version 1.0, written by B. Spratt and N. Ross, School of Biological Sciences, University of Sussex, UK) (Smith, 1992).
RESULTS
There are four O antigens for which we have no information on whether they contain rhamnose (see Introduction). As all rml gene sets found so far in S. enterica are located at the 5' end of the O antigen gene cluster, we sequenced the region from strains M1939 (60,V), M1953 (63,IIIa), M330 (65,IIIb) and M322 (66,V). M1939 (60,V) has rmlB and rmlA at the 5' end of the O antigen gene cluster, but no rmlD or rmlC genes were found downstream of rmlA. As rmlB and rmlA are also involved in the biosynthesis of dTDP-N-acetyl fucosamine (Kuhn et al., 1984) and dTDP-N-acetyl viosamine (Reeves et al., 1996), we suggest that the two rml genes of M1939 (60,V) are involved in the biosynthesis of a sugar other than rhamnose. The O63, O65 and O66 strains lacked rml genes at this region. We presume that they contain no rhamnose in their O antigens and did not work further with them.
Attempted cloning of rmlC genes from M269 (28,I) and the two O53 strains
In M269 (28,I), M303 (53,II) and M1891 (53,IIIb), no rmlC gene was found immediately downstream of rmlA. We attempted to clone rmlC genes from plasmid gene banks of these strains, using an E. coli K-12 host strain with rmlC deleted and selecting for O antigen synthesis using phage Ffm (see Methods). One plasmid, pPR1996, carrying a 5 kb fragment starting from residue 301 of rmlB was obtained from M269 (28,I). Further sequencing showed that the rmlC gene of M269 (28,I) was in the O antigen gene cluster but separated from the rmlA gene by a complete ORF of 271 aa that has considerable similarity to a putative glycosyl-transferase gene (amsE) in the amylovoran gene cluster of Erwinia amylovora (Koplin et al., 1993). There are 28 bp and 25 bp intergenic regions upstream and downstream of this ORF, respectively. No rmlC clones were obtained with DNA from the two O53 strains.
Sequence variation
The new rml sequences include representatives of all rhamnose-containing O antigens other than B and E1, which are already sequenced, and C2, D1, D2, D3 and E4, which are known to be very similar to the O antigen B or E1 genes (see above). For the analysis of variation in rml genes, we included the previously published sequences from P9003 (B,I) and M32 (E1,I). Analysis revealed that all strains except M1952 (62,IIIa) are related in that their 5' end rml genes are very similar. All four rml genes of M1952 (62,IIIa), however, are very divergent from those of the other strains. We first focus on the sequence variation among strains other than M1952 (62,IIIa).
The average G+C contents are 0·43, 0·50, 0·45 and 0·34 mol% for rmlB, rmlD, rmlA and rmlC, respectively, and are in general lower than that of the S. enterica genome (0·52 mol%) (Ochman & Lawrence, 1996). Note that the average G+C content of rmlC is much lower than that of the other three rml genes and that of the third codon base of rmlC is only 0·21 mol% (0·18–0·28 mol%).
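For readers who wish to reproduce figures of this kind, the following is a minimal Python sketch of how an overall G+C fraction and a third-codon-position G+C fraction can be computed. It is not the software used in this study; the function names and the toy sequence are hypothetical, and the code assumes an in-frame coding sequence in which only unambiguous bases (A, C, G, T) are counted.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G or C among the unambiguous bases (A, C, G, T) of seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(base in "GC" for base in counted) / len(counted)


def gc3_fraction(cds: str) -> float:
    """G+C fraction at third codon positions of an in-frame coding sequence."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return gc_fraction(cds[2:usable:3])


# Hypothetical toy coding sequence (five codons), for illustration only.
toy_cds = "ATGAAATTAGCTTAA"
overall = gc_fraction(toy_cds)
third = gc3_fraction(toy_cds)
```

Reporting the third-position value separately is useful because third codon positions are the least constrained by the encoded protein and so tend to reflect the base composition of the genome from which a gene originated.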
The sequences were aligned and alignments generated for all polymorphic and informative-only sites. Informative sites are those which affect the branching of trees: sites at which at least two different nucleotides occur, each of them present in at least two sequences. There are 802 polymorphic sites including 505 informative sites. In general, non-informative polymorphic sites have similar distributions to informative sites (shown in Fig. 1), but there is a cluster of 35 such sites at the 3' end of the rmlB gene (positions 726–870, see Fig. 4) varying only in M285 (59,II). Among rmlB sequences, there were 140 polymorphic sites, including 33 amino acid replacement sites. rmlD had 111 polymorphic sites, of which 25 were amino acid replacement sites. For rmlA, there were 218 polymorphic sites, including 45 amino acid replacement sites, while rmlC had 351 polymorphic sites, 119 of which result in amino acid replacement.
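The distinction between polymorphic and informative sites can be made concrete with a short Python sketch. This is an illustration under stated assumptions rather than the MULTICOMP implementation: the function name and the toy alignment are hypothetical, sequences are assumed to be pre-aligned to equal length, and gaps or ambiguous bases are simply ignored.

```python
from collections import Counter


def classify_sites(alignment):
    """Return (polymorphic, informative) lists of 0-based column indices.

    A column is polymorphic if at least two different nucleotides occur in it,
    and informative if at least two of those nucleotides each occur in at
    least two sequences.
    """
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    polymorphic, informative = [], []
    for col in range(lengths.pop()):
        counts = Counter(seq[col].upper() for seq in alignment)
        bases = {b: n for b, n in counts.items() if b in "ACGT"}
        if len(bases) >= 2:
            polymorphic.append(col)
            if sum(n >= 2 for n in bases.values()) >= 2:
                informative.append(col)
    return polymorphic, informative


# Hypothetical four-sequence alignment: columns 1 and 3 are informative,
# column 4 is polymorphic but varies in a single sequence only.
toy_alignment = ["ACGTA", "ACGTA", "ATGAA", "ATGAC"]
poly_sites, info_sites = classify_sites(toy_alignment)
```

A cluster of sites that is polymorphic but not informative, such as a run of substitutions confined to one strain, would appear in the first list only.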
For M1952 (62,IIIa), the percentage difference from other strains at the nucleotide level ranges from 24·1 to 25·1%, 28·2 to 29·7%, 21·1 to 24·6% and 24·7 to 35·0% for rmlB, rmlD, rmlA and rmlC, respectively. Amino acid sequence comparison of the four rml genes of M1952 (62,IIIa), S. enterica LT2 and E. coli K-12 shows similar levels of difference for all three pairwise comparisons (data not shown).
The rmlB and rmlA genes of M1939 (60,V) are thought to be involved in the biosynthesis of a dTDP-sugar other than rhamnose (see above). We compared these with those involved in rhamnose pathways. rmlB of M1939 (60,V) shows 5·3% to 9·4% difference from that of the other 10 O antigens if M1952 (62,IIIa) is excluded, while rmlA has 36·7% to 38·1% difference. It seems that rmlB of M1939 (60,V) has a recent common ancestor with that involved in the rhamnose pathway of S. enterica, while rmlA of M1939 (60,V) comes from a different source.
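Percentage differences such as those quoted above are uncorrected pairwise distances. A minimal Python sketch of the calculation follows; the function names and toy sequences are hypothetical, the sequences are assumed to be aligned, and columns containing a gap or ambiguous base are skipped, which may differ from the treatment in the package used for this study.

```python
def percent_difference(a: str, b: str) -> float:
    """Uncorrected percentage of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":  # skip gaps and ambiguous bases
            compared += 1
            differing += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * differing / compared


def difference_range(query: str, others) -> tuple:
    """Minimum and maximum percentage difference of query from a set of sequences."""
    values = [percent_difference(query, other) for other in others]
    return min(values), max(values)


# Hypothetical toy data: the query differs from the references at 2 and 3 of 10 sites.
low, high = difference_range("ACGTACGTAC", ["ACGTTCGTAA", "TCGTTCGTAA"])
```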
Evidence for recombination in the rml gene set
The neighbour-joining method was used to construct individual rml gene trees (Fig. 2). rml genes of three E. coli strains are also included (Marolda et al., 1999; Rajakumar et al., 1994; Stevenson et al., 1994). If the rml gene sets have all evolved from the same ancestral set and there have been no recombination events, it would be expected that the four gene trees would have similar topology. The most notable feature of this analysis is, however, the variation among the four rml trees. In the rmlB tree, strains from subspecies I and II are clustered in two separate branches except for M285 (59,II), which is apart from other strains mainly because of a cluster of 35 sites at which M285 differs from others (see above). The rmlD tree is similar to the rmlB tree except that M269 (28,I) and M1911 (57,I) no longer cluster with subspecies I strains. In the rmlA tree, two O57 strains of subspecies I and II are grouped with M269 (28,I) in a separate branch, and subspecies II strains are no longer clustered in one branch. The level of variation is greater for rmlA, but if we exclude the 3' 136 bp sequence, it resembles that for the rmlB and rmlD trees (data not shown). In the rmlC tree, genes from strains of the same O antigen but different subspecies group together, and genes of O antigen B, O28 and O57 strains are very divergent from those of other O antigens. As expected, in all four rml trees, M1952 (62,IIIa) forms a deep branch away from the other S. enterica strains. The rmlB, rmlD and rmlA genes of the three E. coli strains form a separate branch and are more divergent than those of S. enterica. As observed in S. enterica, the difference levels increase remarkably in rmlC, with E. coli O7 and Flexneri grouped with S. enterica M1952 (62,IIIa) in a deep branch and rmlC of E. coli K-12 being extremely divergent from those of the other two strains. The non-congruence of the four rml gene trees of S. enterica, as discussed below, is attributed to recombination events.
The situation is complex in that there are often other breaks in the level of similarity and pattern of polymorphic sites, which indicate recombination. The first 1330 bp of M324 (11,VI) (Fig. 1) is apparently from a subspecies other than I or II. Evidence from two other subspecies VI strains, as we show below, suggests that this sequence is subspecies VI-specific. The remaining sequence of M324 (11,VI) has subspecies I sequence. In M269 (28,I), the first 700 bp of rmlB has typical subspecies I sequence, but from around 700 to 1735 it has subspecies II sequence and from positions 1758 to 1870, it is like subspecies I again. The next segment, from position 1873 to 2710, has sequence which does not resemble that of subspecies I or II but is shared by corresponding regions of M1911 (57,I) and M293 (57,II), although the first 1450 bp of M1911 (57,I) and first 2065 bp of M293 (57,II) have sequences typical of the subspecies to which they belong. For M1911 (57,I), the segment between positions 1451 and 2071 is subspecies II-like. The mosaic structures of these strains account for their abnormal groupings in the rml trees as discussed above. The two O53 strains are similar throughout rmlB, rmlD and rmlA and are subspecies II-specific except for the 3' end of rmlA. Both lack the rmlC gene.
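Mosaic structures of this kind can be visualised by scanning a query sequence against reference sequences in windows and noting where the closest reference changes. The Python sketch below illustrates the idea only; it is not the analysis performed in this study, the function name, window sizes and toy sequences are hypothetical, and the sequences are assumed to be aligned without gaps.

```python
def windowed_difference(query: str, reference: str, window: int = 100, step: int = 50):
    """Percentage difference between query and reference in sliding windows.

    Returns a list of (0-based window start, percent difference) tuples.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    profile = []
    for start in range(0, len(query) - window + 1, step):
        pairs = zip(query[start:start + window], reference[start:start + window])
        mismatches = sum(x != y for x, y in pairs)
        profile.append((start, 100.0 * mismatches / window))
    return profile


# Hypothetical toy references standing in for two subspecies-specific sequences.
ref_a = "A" * 40
ref_b = "C" * 40
# A mosaic query: first half like ref_a, second half like ref_b.
mosaic = ref_a[:20] + ref_b[20:]
vs_a = windowed_difference(mosaic, ref_a, window=10, step=10)
vs_b = windowed_difference(mosaic, ref_b, window=10, step=10)
```

In the toy example the query is identical to one reference up to position 20 and to the other thereafter, so the two profiles cross at the junction; with real data the switch is less abrupt and its significance needs a formal test.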
Recombination is also proposed to account for the abrupt similarity changes at the 3' end of rmlC of E1, O11 and O42 strains. The sequences for these strains are highly similar from the 3' end of rmlA to the point 180 bp from the 3' end of rmlC, with a mean difference of 1·9% (0·8–3·3%) for the 5' 370 bp of rmlC. In contrast, the similarity level in the remaining 3' end of rmlC is much lower, with the differences ranging from 33·3% to 42·0%.
It is noteworthy that there are chi-like sequences, indicated in Fig. 3, located adjacent to the junctions of the segments of different origins in M269 (28,I) (1727 5'-TCTGGTGG-3' 1734, 1841 5'-GCTGGTCG-3' 1834), M293 (57,II) (1841 5'-GCTGGTCG-3' 1834) and M1911 (57,I) (1841 5'-ACTGGTTG-3' 1834). The orientation is correct for chi-stimulated homologous recombination. The segments separated by chi are also shown by the Stephens test (Stephens, 1985) to be highly significant partitions for clustered polymorphic sites (data not shown).
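A search for chi-like sequences of this kind can be sketched in Python as a scan of both strands for octamers within a small number of mismatches of the canonical E. coli chi site, 5'-GCTGGTGG-3'. The function names, the mismatch threshold and the toy sequence are assumptions made for illustration; this is not the procedure used to locate the sites reported here. Positions are 1-based on the top strand, and hits on the complementary strand are reported with start greater than end, following the notation used above.

```python
CHI = "GCTGGTGG"  # canonical E. coli chi site, written 5'->3'


def revcomp(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_chi_like(seq: str, max_mismatches: int = 1):
    """Find octamers within max_mismatches of chi on either strand.

    Returns tuples (start, end, strand, octamer, mismatches) with 1-based
    top-strand coordinates; for '-' hits start > end because the octamer is
    read 5'->3' on the complementary strand.
    """
    seq = seq.upper()
    n, k = len(seq), len(CHI)
    hits = []
    for strand, template in (("+", seq), ("-", revcomp(seq))):
        for i in range(n - k + 1):
            octamer = template[i:i + k]
            mismatches = sum(a != b for a, b in zip(octamer, CHI))
            if mismatches <= max_mismatches:
                if strand == "+":
                    start, end = i + 1, i + k
                else:
                    # map reverse-complement index back to top-strand positions
                    start, end = n - i, n - i - k + 1
                hits.append((start, end, strand, octamer, mismatches))
    return hits


# Hypothetical toy sequence carrying one chi-like octamer on each strand.
toy = "AAAA" + "TCTGGTGG" + "AAAA" + revcomp("GCTGGTCG") + "AA"
chi_hits = find_chi_like(toy, max_mismatches=1)
```

Allowing two mismatches would also recover an octamer such as ACTGGTTG, at the cost of more chance matches in a long sequence.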
Subspecies variation and evolutionary relationships of S. enterica rmlB genes
As discussed above for subspecies I and II, the 5' end of the rml gene set, particularly rmlB, is commonly subspecies specific. To examine if the phenomenon applies more generally, a 903 bp segment of the rmlB gene extending from positions 16 to 918 was sequenced for 23 strains from seven subspecies. Most of these strains have O antigens 11, 42, 53, 57 or 59, each present in three to six subspecies (Popoff & Le Minor, 1997). Where possible, strains from different subspecies were chosen for each O antigen. We also included rmlB of M1939 (60,V) which we believe is involved in synthesis of a sugar other than rhamnose. Together with rmlB from the 11 previously sequenced rml gene sets, a total of 35 rmlB sequences are available for analysis.
If we exclude the extremely divergent strain M1952 (62,IIIa), there are 204 polymorphic sites and 152 informative sites. Analysis of the distribution of polymorphic sites (Fig. 4) revealed that sequences from subspecies I, II and the 5' half of IV each have distinct subspecies-specific polymorphic sites regardless of O antigen. All nine subspecies I strains representing seven O antigens have subspecies I-specific sequences. Four out of seven subspecies II strains representing four O antigens have a common sequence, which we treat as subspecies II specific. Four out of five subspecies IV strains representing four O antigens have a similar sequence for the 5' half of the gene, then M1642 (11,IV) and M1894 (53,IV) become subspecies I-like from around position 500 while M1917 (57,IV) and M1799 (42,IV) are subspecies II-like. The three strains from subspecies VI have almost identical polymorphic sites that are unique to these strains and hence very likely subspecies VI-specific, but note that all have antigen O11 which is the only rhamnose-containing O antigen present in subspecies VI.
There is no sequence specific for subspecies IIIa or IIIb strains, for which the entire rmlB genes resemble those from other subspecies (Table 3, Fig. 4). This is also the case for a few subspecies II and IV strains. Evidence for intragenic recombination is also found for a number of strains (Table 3). Some of these segments are confirmed by the Stephens test to be partitions with significant P values for clustered polymorphic sites (data not shown). Others, although not detected by the Stephens test, were found to contain clustered polymorphic sites typical for other subspecies when examined by the Maximum chi-squared program, which allows one to compare the distribution of polymorphic sites in two parental sequences and a potential recombinant sequence with that expected to occur by chance. For most of these segments, the high level of similarity and the distinct subspecies-specific sequences suggested they were generated as a result of inter-subspecies recombination. Interestingly, the cluster of 35 polymorphic sites mentioned above in the rmlB gene of M285 (59,II) is also shared by M1914 (57,II) and M1915 (57,IIIb) (positions 726–870). The high level of divergence of this segment and of another segment shared by M1795 (42,IIIa) and M1796 (42,IIIb) at positions 102–151 suggests that they have been derived from other species.
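The logic of a maximum chi-squared test can be illustrated with a simplified two-sequence Python sketch. It is not the Maximum chi-squared program used here, which also handles a putative recombinant with two parental sequences; the function names and toy data are hypothetical. The input is a list of 0/1 flags along the variable sites of an alignment (1 where the two sequences differ). For every possible breakpoint, a 2×2 table of differences versus identities on each side is scored by chi-squared, the largest value is kept, and its significance is assessed by shuffling the order of the sites.

```python
import random


def chi2_2x2(a: int, b: int, c: int, d: int) -> float:
    """Chi-squared statistic (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def max_chi_squared(diff):
    """Return (largest chi-squared, breakpoint k) over all splits diff[:k] | diff[k:]."""
    total, n = sum(diff), len(diff)
    best, best_k, left = 0.0, None, 0
    for k in range(1, n):
        left += diff[k - 1]  # differences to the left of the breakpoint
        stat = chi2_2x2(left, k - left, total - left, (n - k) - (total - left))
        if stat > best:
            best, best_k = stat, k
    return best, best_k


def permutation_p(diff, trials: int = 1000, seed: int = 0) -> float:
    """P value: share of site-order shuffles whose maximum reaches the observed one."""
    rng = random.Random(seed)
    observed, _ = max_chi_squared(diff)
    shuffled = list(diff)
    reached = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if max_chi_squared(shuffled)[0] >= observed:
            reached += 1
    return (reached + 1) / (trials + 1)


# Hypothetical toy data: identical over the first 10 variable sites, different over the last 10.
flags = [0] * 10 + [1] * 10
stat, breakpoint_index = max_chi_squared(flags)
p_value = permutation_p(flags, trials=200)
```

The permutation step matters because the statistic is a maximum over many candidate breakpoints and so cannot be referred directly to the standard chi-squared distribution.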
DISCUSSION
Variation at the 5' and 3' ends of the rml gene set
There is a gradient in the nature of variation along the rml gene set, but we will focus first on genes at the two ends. S. enterica has a distinct subspecies structure, with housekeeping gene sequence variation correlating with subspecies (see Introduction). The rml genes are not housekeeping genes, but at least for subspecies I, II, IV and VI, the variation at the 5' end of the rml gene set is subspecies specific. For comparison of the level of the variation in rmlB genes with that in housekeeping genes, we have chosen for analysis two sequences from each of the four well represented subspecies (marked * in Fig. 5), as housekeeping gene studies usually have equal representation from each subspecies. The mean difference is 3·69%, which is within the range (3·8–4·6%) previously estimated for several housekeeping genes (Boyd et al., 1994; Nelson & Selander, 1992, 1994; Nelson et al., 1991; Thampapillai et al., 1994). Comparison of the rmlB tree with a tree based on the combined coding sequences of five housekeeping genes (Selander et al., 1996) shows that the relationships for subspecies I, II and VI are similar in the two trees. The exception is subspecies V which, on all previously used criteria, is the most divergent, but in the rmlB tree the subspecies V strain M1939 (60,V) is closest to the subspecies I cluster, branching from it after the other subspecies diverge. It should be noted that no subspecies V serovar has rhamnose in its O antigen and the rmlB gene of M1939 (60,V) is from a different, unknown, sugar pathway.
The three available E. coli rmlB genes group together but vary more than those of S. enterica (Fig. 2). The level of rmlB difference between E. coli and S. enterica is about 20%, with a similar value for rmlD and rmlA (excluding the 3' end of rmlA). This is only a little greater than the 15% difference estimated for the two species (Sharp, 1991) and is consistent with the 5' rml genes having been in E. coli and S. enterica since the two species diverged. Thus the variation within S. enterica and between S. enterica and E. coli is consistent with the genes having diverged as the species and subspecies diverged.
However, the rmlB genes of the two S. enterica O57 strains form a highly divergent separate branch, but are still closer to other S. enterica sequences than to the E. coli sequences, suggesting that they came from an as yet unidentified subspecies of S. enterica.
The situation at the 3' end of the rml gene set is quite different. The 3' end of rmlA and all of rmlC are much more variable than are the genes discussed above, and the variation at this end of the gene set is clearly O antigen and not subspecies specific. The G+C content is also much lower than that of the 5' end, and very close to that of the central serogroup-specific region as determined for gene clusters of O antigens B, C2, D1, D2, D3 and E1 (Brown et al., 1991; Jiang et al., 1991; Liu et al., 1991; Wang et al., 1992; Xiang, 1995). Apparently, the 3' rml genes have a different evolutionary history from the 5' rml genes. We suggest that rmlC and the 3' end of rmlA are commonly transferred with the glycosyl-transferase and O antigen processing genes, which determine O antigen specificity, and are generally in the central region of the gene cluster. The high level of sequence variation in the 3' rmlC gene indicates the divergent sources for these O antigen gene clusters (see below and Fig. 6).
We see three possibilities for the maintenance of subspecies specificity of some 5' rml genes given the high level of lateral transfer of O antigen gene clusters:
1. It could be that rml-containing O antigen gene clusters commonly transfer between subspecies by recombination within the rml genes, the 5' end of the gene cluster gaining subspecies-specific sequence in the process.
2. The gene clusters may transfer as complete gene clusters, with the 5' rml gene set being replaced later by genes from another strain of the same subspecies.
3. The O antigen gene clusters currently present in each subspecies have been there since subspecies divergence.
In support of option 1 and against option 3 in particular is the observation that the gnd gene, located downstream of the O antigen gene cluster, often has a chimeric structure with the 5' and 3' parts of the gene from different subspecies, suggesting that the 5' end of the gene had been transferred between subspecies together with the O antigen gene cluster (Nelson & Selander, 1994; Thampapillai et al., 1994). This is similar to the situation we now observe for the 5' end of the rml gene set located at the other end of some O antigen gene clusters. The presence of both gnd and rml genes with sequence of mixed subspecies origins argues strongly in favour of the movement of O antigen genes between subspecies and argues against option 3. In addition, option 3 cannot account for the high level of similarity of rmlC genes from the strains which have the same O antigen but come from different subspecies, while option 1 can.
Option 2 seems highly improbable. There appears to be no obvious selection pressure maintaining the subspecies-specificity of the 5' rml genes, as indicated by the comparable ratios of synonymous to nonsynonymous nucleotide substitutions among rmlB genes within and between subspecies (data not shown).
We conclude that option 1, that lateral transfer of rml-containing O antigen gene clusters generally involves recombination within the rml gene set, fits all the known data and provides the best explanation for the 5' end of the rml gene set being subspecies specific.
Recombination involving rml gene-containing O antigen gene clusters
The chimeric structures of rml genes in S. enterica throw light on the origins of the extant O antigen gene clusters. The junctions between segments shown in Fig. 3 are presumably the sites of homologous recombination between the donor and the recipient O antigen gene clusters involved in lateral transfer. In general, recombination products that survive will be those that give a clone a new O antigen, which is favoured by natural selection. This will be achieved by the transfer of the central O antigen-specific region, which will often bring with it part of the adjacent common DNA, in this case, the 3' portion of the rml gene set. We see two levels of recombination, that is inter- and intra-species recombination. The scenario illustrated in Fig. 7 could explain the results observed in this study. Clusters Ise and IIse represent two rml-containing O antigen gene clusters present in different subspecies of S. enterica since subspecies divergence. Clusters α and β are two rml-containing polysaccharide gene clusters recently transferred to S. enterica from other species. The inter-species recombination between clusters Ise and α is shown at the 3' end of rmlA as this appears to be the most common site, resulting in cluster IVse. The O antigen is still of type α but most of the rml genes are those of S. enterica.
For O antigens E1, O11 and O42, the gene clusters have highly similar sequences from the 3' end of rmlA to near the 3' end of rmlC, after which they become highly divergent. One possibility is that after one of these O antigen gene clusters (represented by IVse) became established in S. enterica, there was recombination at the 3' end of rmlC with other incoming polysaccharide gene clusters (represented by cluster β in Fig. 7), resulting in gene clusters which differ only in the 3' end of rmlC (represented by cluster Vse).
In the situation where only the donor cluster contains the rml gene set, as illustrated by the recombination between cluster IVse and IIIse (Fig. 7), we suggest that the exchange of O antigen clusters occurs by recombination within housekeeping genes upstream of the O antigen cluster, for example, within galF, resulting in the replacement of the whole rml gene set (VIIIse of Fig. 7). We have not looked at the galF sequence to confirm this but movement of O antigen by recombination within genes adjacent to an O antigen cluster has already been observed in the gnd gene (Thampapillai et al., 1994). There are many cases from subspecies IIIa and IIIb as well as a few from other subspecies in this study showing inter-subspecies recombination involving entire rmlB segments and most probably other rml genes as well. An example is seen in the rml gene set of M1891 (53,IIIb) which appears to have come from subspecies II. The only subspecies V strain in this study might have also derived its rmlB gene from subspecies I and then diverged within the subspecies.
Although it is obvious that O antigen gene clusters of S. enterica have undergone lateral gene transfer, the mechanism for such transfer is not clear. The finding of plasmid-borne O antigens in S. enterica O54 (Popoff & Le Minor, 1985) and E. coli Sonnei (Viret et al., 1993) suggests that plasmids could be the vehicle.
The origins of the O62 gene cluster
The rml gene set of M1952 (62,IIIa) is atypical as its rml genes are all as divergent from those of other S. enterica or E. coli O antigens as are those of S. enterica and E. coli from each other. It appears that the rmlB, rmlD and rmlA genes of M1952 (62,IIIa), E. coli and S. enterica O antigens diverged at approximately the same time. It could be that O62 is one of the O antigens originally in S. enterica but that its rmlB, rmlD and rmlA genes have since evolved independently of those of the other S. enterica O antigens. O62 is only present in subspecies IIIa and it is also possible that its rml gene set was captured from a species closely related to S. enterica and E. coli.
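The comparison made in this paragraph, that one gene set is as divergent from two groups as those groups are from each other, amounts to comparing pairwise distances. A minimal Python sketch of such a comparison using the uncorrected proportion of differing sites (p-distance) follows; the sequence labels and toy sequences are hypothetical and were constructed to reproduce the pattern, and this is not the analysis performed in this study.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences,
    ignoring positions where either sequence has a gap ('-')."""
    compared = 0
    differences = 0
    for x, y in zip(seq_a, seq_b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy aligned sequences (invented for illustration, not real rml data).
toy_alignment = {
    "S_enterica_1": "CGGCATGCATGCATGCATGC",
    "S_enterica_2": "CGGCATGCATGCATGCATGT",
    "E_coli":       "ATTAATGCATGCATGCATGC",
    "O62_like":     "ATGCCGGCATGCATGCATGC",
}

# Distance for every unordered pair of labels.
distances = {
    (a, b): p_distance(toy_alignment[a], toy_alignment[b])
    for a, b in combinations(toy_alignment, 2)
}

for (a, b), d in distances.items():
    print(f"{a} vs {b}: {d:.2f}")
```

In this toy set the two S. enterica sequences are close to each other, while the O62-like sequence is as far from S_enterica_1 as from E_coli, and as far as those two are from each other. An uncorrected p-distance underestimates divergence between distant sequences, so a real analysis would normally apply a substitution model.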
Concluding remarks
The gene clusters of several polysaccharide antigens have been found to have a cassette structure with a central set of variable serogroup-specific genes flanked by highly homologous pathway genes or other genes common to all clusters of that antigen class. This pattern has been observed in the gene clusters of several structurally related O antigens in S. enterica (Brown et al., 1992; Jiang et al., 1991; Liu et al., 1991; Wang et al., 1992) and CPS (Coffey et al., 1998; Frosch et al., 1989; Kroll & Moxon, 1990; Kroll et al., 1989) as well as E. coli group II K antigens (Roberts, 1996). It has been hypothesized, based on these studies, that the outer conserved genes play a role in mediating the exchange of central serogroup-specific genes. This investigation of rml genes of S. enterica is, as far as we know, the first study focusing on the evolution of the outer conserved genes, which is necessary for our understanding of the evolution of O antigen gene clusters.
We have shown that the rml genes of S. enterica do play a role in mediating transfer of the serogroup-specific genes and through this carry a record of the history of the O antigen clusters as they are transferred between subspecies. The 5' part of the rml gene set has subspecies specificity, indicating that it is only rarely included in inter-subspecies transfer. Only the 3' end of the rml gene set appears to represent DNA from the donor species, but even in the highly divergent sequences of the 3' end of rmlC, the S. enterica strains tend to group together (Fig. 6): some group with the E. coli sequences, but only the O57 and O28 sequences group with sequences from more distantly related species. It may be that transfer within the Enterobacteriaceae is much easier than from more distantly related species and that many gene clusters in S. enterica have come via other enterobacterial species, in the process losing all of the rml sequences from their original sources in one of the many recombination events involved.
The rml genes are extremely useful in following the history of O antigen gene clusters within S. enterica, but because of recombination in these genes, they do not appear to be very useful for our original aim of determining the ultimate source of the gene cluster.
ACKNOWLEDGEMENTS |
---|
REFERENCES |
---|
Boyd, E. F., Nelson, K., Wang, F.-S., Whittam, T. S. & Selander, R. K. (1994). Molecular genetic basis of allelic polymorphism in malate dehydrogenase (mdh) in natural populations of Escherichia coli and Salmonella enterica. Proc Natl Acad Sci USA 91, 1280-1284.
Brown, P. K., Romana, L. K. & Reeves, P. R. (1991). Cloning of the rfb gene cluster of a group C2 Salmonella: comparison with the rfb regions of groups B and D. Mol Microbiol 5, 1873-1881.
Brown, P. K., Romana, L. K. & Reeves, P. R. (1992). Molecular analysis of the rfb gene cluster of Salmonella serovar Muenchen (strain M67): genetic basis of the polymorphism between groups C2 and B. Mol Microbiol 6, 1385-1394.
Clarke, B. R. & Whitfield, C. (1992). Molecular cloning of the rfb region of Klebsiella pneumoniae serotype O1:K20: the rfb gene cluster is responsible for synthesis of the D-galactan I O polysaccharide. J Bacteriol 174, 4614-4621.
Coffey, T. J., Enright, M. C., Daniels, M., Morona, J. K., Morona, R., Hryniewicz, W., Paton, J. C. & Spratt, B. G. (1998). Recombinational exchanges at the capsular polysaccharide biosynthetic locus lead to frequent serotype changes among natural isolates of Streptococcus pneumoniae. Mol Microbiol 27, 73-83.
Comstock, L. E., Maneval, D., Jr, Panigrahi, P., Joseph, A., Levine, M. M., Kaper, J. B., Morris, J. G., Jr & Johnson, J. A. (1995). The capsule and O antigen in Vibrio cholerae O139 Bengal are associated with a genetic region not present in Vibrio cholerae O1. Infect Immun 63, 317-323.
DeShazer, D., Brett, P. J. & Woods, D. E. (1998). The type II O-antigen polysaccharide moiety of Burkholderia pseudomallei lipopolysaccharide is required for serum resistance and virulence. Mol Microbiol 30, 1081-1100.
Frosch, M., Weisgerber, C. & Meyer, T. F. (1989). Molecular characterization and expression in Escherichia coli of the gene complex encoding the polysaccharide capsule of Neisseria meningitidis group B. Proc Natl Acad Sci USA 86, 1669-1673.
Guidolin, A., Morona, J. K., Morona, R., Hansman, D. & Paton, J. C. (1994). Nucleotide sequence analysis of genes essential for capsular polysaccharide biosynthesis in Streptococcus pneumoniae type 19F. Infect Immun 62, 5384-5396.
Hobbs, M. & Reeves, P. R. (1994). The JUMPstart sequence: a 39 bp element common to several polysaccharide gene clusters. Mol Microbiol 12, 855-856.
Jiang, X. M., Neal, B., Santiago, F., Lee, S. J., Romana, L. K. & Reeves, P. R. (1991). Structure and sequence of the rfb (O antigen) gene cluster of Salmonella serovar typhimurium (strain LT2). Mol Microbiol 5, 695-713.
Koplin, R., Wang, G., Hotte, B., Priefer, U. B. & Puhler, A. (1993). A 3·9 kb DNA region of Xanthomonas campestris pv. campestris that is necessary for lipopolysaccharide production encodes a set of enzymes involved in the synthesis of dTDP-rhamnose. J Bacteriol 175, 7786-7792.
Kroll, J. S. & Moxon, E. R. (1990). Capsulation in distantly related strains of Haemophilus influenzae type b: genetic drift and gene transfer at the capsulation locus. J Bacteriol 172, 1347-1379.
Kroll, J. S., Zamze, S., Loynds, B. & Moxon, E. R. (1989). Common organization of chromosomal loci for production of different capsular polysaccharides in Haemophilus influenzae. J Bacteriol 174, 3343-3347.
Kuhn, H. M., Meier, U. & Mayer, H. (1984). ECA, das gemeinsame Antigen der Enterobacteriaceae - Stiefkind der Mikrobiologie. Forum Microbiol 7, 274-285.
Liu, D., Verma, N. K., Romana, L. K. & Reeves, P. R. (1991). Relationships among the rfb regions of Salmonella serovars A, B, and D. J Bacteriol 173, 4814-4819.
Lüderitz, O., Staub, A. M. & Westphal, O. (1966). Immunochemistry of O and R antigens of Salmonella and related Enterobacteriaceae. Bacteriol Rev 30, 192-255.
Marolda, C. L., Feldman, M. F. & Valvano, M. A. (1999). Genetic organization of the O7-specific lipopolysaccharide biosynthesis cluster of Escherichia coli VW187 (O7:K1). Microbiology 145, 2485-2495.
Mitchison, M., Bulach, D. M., Vinh, T., Rajakumar, K., Faine, S. & Adler, B. (1997). Identification and characterization of the dTDP-rhamnose biosynthesis and transfer genes of the lipopolysaccharide-related rfb locus in Leptospira interrogans serovar Copenhageni. J Bacteriol 179, 1262-1267.
Nelson, K. & Selander, R. K. (1992). Evolutionary genetics of the proline permease gene (putP) and the control region of the proline utilization operon in populations of Salmonella and Escherichia coli. J Bacteriol 174, 6886-6895.
Nelson, K. & Selander, R. K. (1994). Intergenic transfer and recombination of the 6-phosphogluconate dehydrogenase gene (gnd) in enteric bacteria. Proc Natl Acad Sci USA 91, 10227-10231.
Nelson, K., Whittam, T. S. & Selander, R. K. (1991). Nucleotide polymorphism and evolution in the glyceraldehyde-3-phosphate dehydrogenase gene (gapA) in natural populations of Salmonella and Escherichia coli. Proc Natl Acad Sci USA 88, 6667-6671.
Ochman, H. & Lawrence, J. G. (1996). Phylogenetics and the amelioration of bacterial genomes. In Escherichia coli and Salmonella typhimurium: Cellular and Molecular Biology, pp. 2627-2648. Edited by F. C. Neidhardt and others. Washington, DC: American Society for Microbiology.
Ornellas, E. P. & Stocker, B. A. D. (1974). Relation of lipopolysaccharide character to P1 sensitivity in Salmonella typhimurium. Virology 60, 491-502.
Popoff, M. Y. & Le Minor, L. (1985). Expression of antigenic factor O:54 is associated with the presence of a plasmid in Salmonella. Ann Inst Pasteur Microbiol 136B, 169-179.
Popoff, M. Y. & Le Minor, L. (1992). Antigenic Formulas of the Salmonella Serovars, 6th revision. Paris: WHO Collaborating Centre for Reference and Research on Salmonella, Institut Pasteur, Paris.
Popoff, M. Y. & Le Minor, L. (1997). Antigenic Formulas of the Salmonella Serovars, 7th revision. Paris: WHO Collaborating Centre for Reference and Research on Salmonella, Institut Pasteur.
Rajakumar, K., Jost, B. H., Sasakawa, C., Okada, N., Yoshikawa, M. & Adler, B. (1994). Nucleotide sequence of the rhamnose biosynthetic operon of Shigella flexneri 2a and role of lipopolysaccharide in virulence. J Bacteriol 176, 2362-2373.
Reeves, P. R. (1993). Evolution of Salmonella O antigen variation by interspecific gene transfer on a large scale. Trends Genet 9, 17-22.
Reeves, P. R. (1995). Role of O-antigen variation in the immune response. Trends Microbiol 3, 381-386.
Reeves, P. R. (1997). Specialized clones and lateral transfer in pathogens. In Ecology of Pathogenic Bacteria: Molecular and Evolutionary Aspects, pp. 237-254. Edited by B. A. M. van der Zeijst and others. Amsterdam: Elsevier.
Reeves, P. R., Farnell, L. & Lan, R. (1994). MULTICOMP: a program for preparing sequence data for phylogenetic analysis. Comput Appl Biosci 10, 281-284.
Reeves, P. R., Hobbs, M., Valvano, M. & 8 other authors (1996). Bacterial polysaccharide synthesis and gene nomenclature. Trends Microbiol 4, 495-503.
Roberts, I. S. (1996). The biochemistry and genetics of capsular polysaccharide production in bacteria. Annu Rev Microbiol 50, 285-315.
Saitou, N. & Nei, M. (1987). The neighbour-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4, 406-425.
Schmidt, G., Jann, B. & Jann, K. (1974). Genetic and immunochemical studies on Escherichia coli O14:K7:H-. Eur J Biochem 42, 303-309.
Selander, R. K., Li, J. & Nelson, K. (1996). Evolutionary genetics of Salmonella enterica. In Escherichia coli and Salmonella: Cellular and Molecular Biology, 2nd edn, pp. 2691-2707. Edited by F. C. Neidhardt and others. Washington, DC: American Society for Microbiology.
Sharp, P. M. (1991). Determinants of DNA sequence divergence between Escherichia coli and Salmonella typhimurium: codon usage, map position, and concerted evolution. J Mol Evol 33, 23-33.
Smith, J. M. (1992). Analysing the mosaic structure of genes. J Mol Evol 34, 126-129.
Stephens, J. C. (1985). Statistical methods of DNA sequence analysis: detection of intragenic recombination or gene conversion. Mol Biol Evol 2, 539-556.
Stevenson, G., Neal, B., Liu, D., Hobbs, M., Packer, N. H., Batley, M., Redmond, J. W., Lindquist, L. & Reeves, P. R. (1994). Structure of the O-antigen of E. coli K-12 and the sequence of its rfb gene cluster. J Bacteriol 176, 4144-4156.
Thampapillai, G., Lan, R. & Reeves, P. R. (1994). Molecular evolution in the gnd locus of Salmonella enterica. Mol Biol Evol 11, 813-828.
Tsukioka, Y., Yamashita, Y., Oho, T., Nakano, Y. & Koga, T. (1997). Biological function of the dTDP-rhamnose synthesis pathway in Streptococcus mutans. J Bacteriol 179, 1126-1134.
Vinogradov, E. V., Shashkov, A. S., Knirel, Y. A., Kochetkov, N. K., Dabrowski, J., Grosskurth, H., Stanislavsky, E. S. & Kholodkova, E. V. (1992). The structure of the O-specific polysaccharide chain of the lipopolysaccharide of Salmonella arizonae O61. Carbohydr Res 231, 1-11.
Vinogradov, E. V., Knirel, Y. A., Kochetkov, N. K., Schlecht, S. & Mayer, H. (1994). The structure of the O-specific polysaccharide of Salmonella arizonae O62. Carbohydr Res 253, 101-110.
Viret, J.-F., Cryz, S. J., Jr, Lang, A. B. & Favre, D. (1993). Molecular cloning and characterization of the genetic determinants that express the complete Shigella serotype D (Shigella sonnei) lipopolysaccharide in heterologous live attenuated vaccine strains. Mol Microbiol 7, 239-252.
Wang, L., Romana, L. K. & Reeves, P. R. (1992). Molecular analysis of a Salmonella enterica group E1 rfb gene cluster: O antigen and the genetic basis of the major polymorphism. Genetics 130, 429-443.
Wilkinson, R. G., Gemski, P., Stocker, J. & Stocker, B. A. D. (1972). Non-smooth mutants of Salmonella typhimurium: differentiation by phage sensitivity and genetic mapping. J Gen Microbiol 70, 527-554.
Xiang, S.-H. (1995). Variation in rfb gene clusters of Salmonella enterica and origin of group D2. PhD thesis, University of Sydney.
Xiang, S.-H., Haase, A. M. & Reeves, P. R. (1993). Variation of the rfb gene clusters in Salmonella enterica. J Bacteriol 175, 4877-4884.
Received 16 February 2000; revised 1 May 2000; accepted 31 May 2000.