Department of Ecology and Evolution, Stony Brook University
Correspondence: E-mail: dstoebel{at}life.bio.sunysb.edu.
Abstract |
---|
Key Words: adaptation; Escherichia coli; lateral gene transfer; Shimodaira-Hasegawa test; lac operon
Introduction |
---|
Lactose use by E. coli has become a paradigm of horizontal transfer leading to nonpathogenic adaptation and speciation because we also have an understanding of the ecology of E. coli (Lawrence and Roth 1996; Lawrence and Ochman 1998; Ochman, Lawrence, and Groisman 2000; Lawrence 2001). The importance of the supposed transfer event lies in the supposition that, for bacteria, metabolism is ecology (Lawrence 2001), in that the ability to degrade or synthesize specific compounds can limit the distribution and abundance of bacterial species. Ochman, Lawrence, and Groisman (2000) write that "E. coli, by acquiring the lac operon, gained the ability to use the milk sugar lactose as a carbon source and to explore a new niche, the mammalian colon, where it established a commensal relationship." In contrast, Salmonella, a close relative of E. coli, cannot use lactose and lives in reptiles or as a pathogen in mammals. The hypothesis of Ochman, Lawrence, and Groisman (2000) links a single horizontal transfer event to the acquisition of novel metabolic properties, allowing the expansion of a niche and consequent speciation.
The reality of this transfer event should be questioned, however, as the data said to support E. coli's gain of lac are ambiguous. For example, the fact that S. enterica, the supposed sister group of E. coli, does not have a lac operon and does not ferment lactose (fig. 1) is consistent with a horizontal transfer event, but the pattern could also be due to loss of the lac operon in the Salmonella lineage. (It is also worth noting that the lactose-fermenting E. fergusonii, rather than any Salmonella sp., is E. coli's sister species [Lawrence, Ochman, and Hartl 1991].)
The horizontal transfer of lac was revisited with DNA sequence data by Buvinger et al. (1984), who sequenced 1,430 bp 3' of the lac operon and found a terminal repeat similar to those in the transposon Tn5, suggestive of a past horizontal transfer. This 9-bp motif may be the signature of a transposon, but no other such repeat has been located 5' of the lac operon. The occurrence of one 9-bp repeat is at best marginal evidence that there was ever a transposon located in this region of the E. coli genome, let alone that a transposon brought with it the lac operon. In their survey of the complete genome of E. coli K-12, Lawrence and Ochman (1998) classified the lac operon as horizontally transferred but did not indicate the basis on which they made that determination.
The acquisition of the lac operon by E. coli could be an excellent case study of adaptive horizontal transfer, not only because we understand details of the molecular genetics and physiology of the operon but also because we can use this knowledge to make predictions about the ecological and evolutionary impacts of the acquisition of the operon. Given the tenuous nature of the evidence for this event, I attempted to rigorously test whether E. coli gained the lac operon via horizontal transfer. I used a phylogenetic approach, as methods using genome sequence information like G + C content or codon bias have been criticized for their poor performance (Guindon and Perriere 2001; Koski, Morton, and Golding 2001). Phylogenetic methods have recently been used successfully to verify the vertical transmission of a group of shared genes in the Enterobacteriaceae (Daubin, Moran, and Ochman 2003; Lerat, Daubin, and Moran 2003). In this article, I compare the phylogenies inferred from part of the lac operon of 14 Enterobacteriaceae to the phylogeny inferred from two housekeeping genes, which are likely to be consistent markers of organismal phylogeny. If a horizontal transfer event has occurred, the phylogeny of lac will differ from the phylogeny of most of the genome. The entire operon is over 7000 bp long, so only pieces of lacZ and lacY were sequenced.
Materials and Methods |
---|
To amplify lacZ and lacY, I used a two-part strategy. First, I designed primers from a published sequence to amplify pieces of lacZ and lacY. Although I amplified lacZ for all of the strains used in this study, I was not able to amplify lacY from six of them, as noted in table 1. I could not amplify a continuous lacZY region from any strain with these primers, presumably because the primers' imperfect matches prevented the 3.5-kb reaction. To amplify this large region from those species from which I amplified both lacZ and lacY, I designed species-specific primers from the lacZ and lacY sequences. All primers are listed in item 1 of the Supplementary Material online.
The reaction conditions for lacZ and lacY were 1 U Taq polymerase, 20 mM Tris-HCl, 50 mM KCl, 1.5 mM MgCl2, each primer at 1 mM, and each nucleotide at 0.2 mM. The primers to amplify approximately 1.2 kb of lacZ were lacZ2 and either lacZ6 or lacZ7. The reaction conditions were initial denaturation of 2 min at 94°C, followed by 35 cycles of 94°C for 30 s, 50°C for 60 s, and 72°C for 45 s, and concluded with 10 min of extension at 72°C. To amplify the small piece of lacY, primers lacY2 and lacY5 were used. The reaction conditions were as for lacZ above, but with annealing at 45°C. These PCR products were then directly sequenced. To amplify the 3.5-kb continuous fragment of lacZY, 1.25 U of the proofreading polymerase Platinum Pfx (Invitrogen), 2x amplification buffer, 1 mM MgSO4, 0.2 mM of each nucleotide, and each primer at 1 mM were used. The reaction conditions were 5 min of denaturation at 94°C, followed by 35 cycles of 94°C for 15 s, 50°C for 30 s, and 68°C for 3 min, finishing with extension at 68°C for 10 min. The reactions gave products of several sizes, so the proper sized band was gel purified and ligated into plasmid pCR (Invitrogen). Because I never observed polymorphisms in the directly sequenced lacZ and lacY PCR products, and because the other bands were at least 50% larger or smaller than the expected size, I assumed that the other bands were spurious products rather than duplicated copies of lacZY which varied in length from the published sequence. Sequencing of both PCR products and plasmids was performed by the Stony Brook University DNA sequencing facility. The GenBank accession numbers for the sequences are AY743917–AY743920 and AY746943–AY746962.
Phylogenetic Analysis
The sequences were aligned using Clustal X 1.82 (Thompson et al. 1997) with default settings, and further refined by eye to preserve codon boundaries. For the lacZY data set, the region between the stop codon of lacZ and the start codon of lacY was excluded, as the length of this spacer ranged from 52 bp in E. coli to 210 bp in Serratia sp. MF 416 and could not be aligned unambiguously. PAUP* 4.0b10 (Swofford 2003) was used to infer the phylogeny for each data set. For each data set, Modeltest 3.4 (Posada and Crandall 1998) was used to select the appropriate model of molecular evolution. Modeltest implements both the Akaike Information Criterion and the hierarchical likelihood ratio test for model selection. When the two criteria differed in the models chosen, I selected the model with fewer parameters (which was always the model chosen by the hierarchical likelihood ratio test). I used this model to find the maximum likelihood (ML) tree, and the Neighbor-Joining (NJ) tree from the ML distances. I also built trees using unweighted maximum parsimony (MP). For all data sets, a heuristic search of 1000 bootstrap replicates was used to assess uncertainty in the branching patterns. I rooted the ompA, gap, and lacZ trees with Yersinia pestis. lacY is not present in the Y. pestis genome, so I used Serratia sp. MF 416 as the outgroup for the lacZY tree, as this species was the outgroup to the other species in the lacZ tree. Rooting trees is tenuous when horizontal transfer may have occurred, as an outgroup species may not have the most distantly related copy of a gene. Concordance between the trees suggests that the rooting has not been affected by horizontal transfer.
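The model-selection rule described above can be illustrated with a small sketch. It is not Modeltest itself: the log-likelihoods and parameter counts in `MODELS` are hypothetical, the chain of nested models is shortened to three, and the significance level `alpha=0.01` is an assumption. It only shows how the Akaike criterion and a likelihood ratio test can disagree, and how the "fewer parameters" tie-break resolves the disagreement.

```python
import math

def aic(lnl, k):
    """Akaike Information Criterion: 2k - 2 lnL (lower is better)."""
    return 2 * k - 2 * lnl

def chi2_sf(x, df):
    """Upper tail of the chi-square distribution for integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)             # Q(1, h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))  # Q(1/2, h)
    # Recurrence Q(a + 1, h) = Q(a, h) + h^a e^-h / Gamma(a + 1)
    while a < df / 2.0 - 1e-9:
        q += h ** a * math.exp(-h) / math.gamma(a + 1)
        a += 1.0
    return q

def lrt_p(lnl_simple, lnl_complex, df):
    """P value of the likelihood ratio test between two nested models."""
    return chi2_sf(2.0 * (lnl_complex - lnl_simple), df)

# Hypothetical (lnL, free substitution parameters) for three nested models.
MODELS = {"HKY+G": (-5225.0, 5), "GTR+G": (-5204.9, 9), "GTR+I+G": (-5203.6, 10)}

def pick_by_aic(models):
    return min(models, key=lambda m: aic(*models[m]))

def pick_by_hlrt(models, order, alpha=0.01):
    """Walk a chain of nested models, accepting each more complex model
    only if it fits significantly better than the current choice."""
    chosen = order[0]
    for nxt in order[1:]:
        (l0, k0), (l1, k1) = models[chosen], models[nxt]
        if lrt_p(l0, l1, k1 - k0) < alpha:
            chosen = nxt
    return chosen

def fewer_params(models, a, b):
    """Tie-break used when the two criteria disagree."""
    return a if models[a][1] <= models[b][1] else b

if __name__ == "__main__":
    by_aic = pick_by_aic(MODELS)
    by_lrt = pick_by_hlrt(MODELS, ["HKY+G", "GTR+G", "GTR+I+G"])
    print(by_aic, by_lrt, fewer_params(MODELS, by_aic, by_lrt))
```

With these made-up numbers the Akaike criterion prefers the most complex model while the likelihood ratio test stops one step earlier, so the tie-break returns the simpler of the two.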
To test for the significance of differences in likelihoods between trees, I used the PAUP* 4.0b10 implementation of the Shimodaira-Hasegawa (SH) test (Shimodaira and Hasegawa 1999), with the RELL approximation with 1000 bootstrap replicates. This test places a confidence interval on the likelihood of the ML tree, taking into account the multiple comparisons inherent in comparing a ML tree to other phylogenetic trees. Likelihoods of reversible models (all of those considered here) are calculated on unrooted trees, so the SH test is unaffected by the choice of outgroup. Testing for the support of individual differences was a three-step process of generating a constrained tree with regard to the taxa of interest, and comparing it to the unconstrained tree. First, a constraint was made forcing the taxa of interest to have the sister relationship inferred from the phylogeny of housekeeping genes. Second, the most likely tree, consistent with the constraint, was found for the data set being tested. Finally, all constrained trees were tested simultaneously against the ML tree. This procedure tests the null hypothesis that the placement of a branch in the ML tree is not actually different from the placement in the species phylogeny. If there is no true difference, then constraining the branch to its location in the species tree will not significantly lower the likelihood of the phylogeny.
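The logic of the SH test with RELL resampling can be sketched in a few lines. This is an illustrative reimplementation, not the PAUP* code used here: the function name `sh_test_rell`, the input layout (one list of per-site log-likelihoods per tree), and the demonstration values are all assumptions. The sketch resamples sites with replacement, sums the stored per-site log-likelihoods instead of re-optimizing, centers each tree's bootstrap distribution, and asks how often the resampled gap to the best tree is at least as large as the observed gap.

```python
import random

def sh_test_rell(site_lnl, n_boot=1000, seed=0):
    """site_lnl maps a tree name to its per-site log-likelihoods.
    Returns (observed lnL deficit per tree, SH P value per tree)."""
    rng = random.Random(seed)
    names = list(site_lnl)
    n_sites = len(site_lnl[names[0]])

    # Observed deficit of each tree relative to the ML tree.
    total = {t: sum(site_lnl[t]) for t in names}
    best = max(total.values())
    delta_obs = {t: best - total[t] for t in names}

    # RELL: resample site indices and reuse the stored site likelihoods.
    boot = {t: [] for t in names}
    for _ in range(n_boot):
        idx = [rng.randrange(n_sites) for _ in range(n_sites)]
        for t in names:
            col = site_lnl[t]
            boot[t].append(sum(col[i] for i in idx))

    # Center each tree's replicates so that all trees are equally good
    # on average, which is the null hypothesis being tested.
    centered = {}
    for t in names:
        mean_t = sum(boot[t]) / n_boot
        centered[t] = [x - mean_t for x in boot[t]]

    # P value: share of replicates whose deficit reaches the observed one.
    p_value = {}
    for t in names:
        hits = 0
        for b in range(n_boot):
            top = max(centered[u][b] for u in names)
            if top - centered[t][b] >= delta_obs[t]:
                hits += 1
        p_value[t] = hits / n_boot
    return delta_obs, p_value

if __name__ == "__main__":
    # Hypothetical per-site log-likelihoods for an unconstrained tree and
    # two constrained trees; not values from this study.
    rng = random.Random(42)
    base = [-rng.uniform(1.0, 6.0) for _ in range(300)]
    trees = {
        "unconstrained": base,
        "constraint_1": [x - abs(rng.gauss(0.0, 0.05)) for x in base],
        "constraint_2": [x - abs(rng.gauss(0.0, 0.30)) for x in base],
    }
    print(sh_test_rell(trees, n_boot=500))
```

Because every candidate tree enters the maximum taken in each replicate, the P values already account for the multiple comparisons, and the ML tree always receives P = 1 under this definition.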
Results |
---|
Lack of support for the paraphyly of these taxa could reflect that they have actually evolved vertically, or it could be that the paraphyly is correct but the test failed to reject the null hypothesis (that is, type II error). The SH test is known to be quite conservative, so I attempted to bring more data to bear on this question. For those species from which part of lacY was amplified, the most likely tree for the 3480-bp data set is shown in figure 3B. As in the previous analysis, Serratia sp. MF 426 groups with C. freundii rather than with the other Serratia sp., and E. coli and Salmonella are paraphyletic. (lacY was not amplified from E. fergusonii, so this species is not in this analysis.) In the same fashion as the lacZ analysis, I found the ML tree which placed Serratia sp. MF 416 and Serratia sp. MF 426 together, and the ML tree placing E. coli and Salmonella sp. group IIIb together. I then compared these two trees and the ompA tree to the unconstrained tree (fig. 3B) with the SH test. Although there are significant differences between the ompA tree and the unconstrained tree (P = 0.02), neither the placement of MF 426 (P = 0.071) nor that of E. coli and Salmonella (P = 0.212) is significantly different from placement in the ML tree.
Discussion |
---|
Analysis of the longer lacZY sequences from a subset of species reveals a pattern similar to the lacZ data, which were collected from more species. The lacZY tree differs significantly from the ompA phylogeny, although neither the placement of Salmonella sp. group IIIb nor that of Serratia sp. MF 426 was individually significant. The lack of significance of either individual transfer is somewhat surprising, as the lacZY data set contained over 2000 more base pairs than the lacZ data, even though it contained fewer species. A better understanding of how the number of taxa and the length of sequences affect the SH test would help direct the design of future studies.
Neither the lacZ nor the lacZY analysis supports the idea that E. coli gained the operon through horizontal transfer. This is a setback for studies of the role of horizontal transfer in niche expansion and speciation. Plausible hypotheses existed regarding the adaptive nature of a gain of the lac operon by E. coli, but we know little about the ecology of the species that have actually gained lac via horizontal transfer. These strains might not even use their lac gene products to metabolize lactose. For example, Serratia sp. MF 426 was isolated from a turtle (M. Feldgarden, pers. comm.), an animal that neither produces nor consumes lactose. E. vulneris and E. hermannii have predominantly been isolated from wounds (Brenner et al. 1982a, 1982b), which are not high-lactose environments. K. pneumoniae is found in the intestinal tract of mammals, but it is also found in soil and water (Grimont, Grimont, and Richard 1992). Glycerol-galactoside (another β-galactoside) is found in chloroplasts, and it may be a major substrate for the lac operon of many species (Egel 1979; Boos 1982). Without knowledge of the ecology of most Enterobacteriaceae, we do not know what role the lac operon may have played in adaptation of these other species.
Gene Loss
If Salmonella form a clade, and strains in group IIIb retain an ancestral copy (as shown here), then most Salmonella must have lost the operon. The loss of the lac operon in much of the Salmonella lineage may be part of a common pattern. lacZ could not be amplified from E. blattae and S. odorifera when they were screened for this study. Likewise, β-galactosidase activity (the product of lacZ) is a very labile character in the Enterobacteriaceae (Holt and Kreig 1984, p. 414), potentially reflecting multiple losses of this gene. Only a subset of the strains in this study also have lacY, and this gene was not found in strains from which I could not amplify lacZ. Although the presence or absence of a PCR product is not a definitive test of the presence or absence of a gene, these data suggest that there are three basic lac genotypes of Enterobacteriaceae: those with both lacZ and lacY, those with only lacZ, and those with neither.
A similar pattern exists in the four named species of Shigella, which are actually strains of E. coli that are pathogenic specialists. Different authors have contended that the clones either originated independently (Pupo, Lan, and Reeves 2000) or via a single event (Escobar-Paramo et al. 2003), yet all of the clones have converged on similar lac phenotypes. All Shigella use lactose poorly, but different species seem to have a different genetic basis for this trait. S. flexneri and S. boydii have neither lacZ nor lacY, S. dysenteriae has lacZ but not lacY, and S. sonnei has both genes, but lacY is nonfunctional (Ito et al. 1991). It appears the loss of lacY function has played a major role in the convergent evolution of poor lactose fermentation in Shigella. This might be due to genetic drift of nonfunctional lacY alleles. It is also possible that there is a selective benefit to deleting lacY in some environments. The mechanistic cause of this selection is unclear, but it is not simply the energetic cost of production: LacZ is three times as large as LacY. If there is an energetic cost of production of these proteins, it must be accompanied by selection to retain LacZ.
The Dynamics of Deletion and Horizontal Acquisition
Recent work has shown that rates of homologous horizontal transfer are very low, particularly in the Enterobacteriaceae (Daubin, Moran, and Ochman 2003; Lerat, Daubin, and Moran 2003). If the horizontal transfer events found in this study occurred by homologous replacement, then lac would exhibit an extremely rapid rate of transfer. The loss of the operon in most Salmonella and Shigella species suggests, instead, a more dynamic process of horizontal transfer. I suggest that some lineages lose the operon through selection for loss or through drift. Descendants can regain the operon via illegitimate recombination, and these individuals may persist if ecological pressures favor the ability to metabolize β-galactosides. For example, the lineage leading to Serratia sp. MF 426 may have lost the operon and regained it from a C. freundii-like species, rather than having a vertically transmitted copy replaced by a horizontally transferred one. It might be possible to test the hypothesis of deletion and illegitimate recombination by adding species to the phylogeny which lack the lac operon, and reconstructing the character state of the lac operon at each node as vertically inherited, deleted, or horizontally acquired. If these predictions are correct, Serratia sp. MF 426 should be found to nest within a group of Serratia strains which lack the operon, rather than those, like Serratia sp. MF 416, which seem to have retained the operon through vertical transmission.
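One simple way to picture the proposed character-state reconstruction is Fitch parsimony on a presence/absence character. The sketch below is only an illustration under stated assumptions: the tree shape, the taxon labels (`taxon_A` to `taxon_E`), and the states are hypothetical and are not data from this study. A two-state character also cannot by itself separate a deletion from a horizontal gain, so the three-state scheme suggested above would need additional information, such as the gene tree.

```python
def fitch(tree, states):
    """Bottom-up Fitch pass on a rooted binary tree.
    tree: a leaf name (str) or a pair (left, right) of subtrees.
    states: leaf name -> observed state.
    Returns (possible states at this node, minimum number of changes)."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_cost + right_cost
    # Children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1

# Hypothetical topology and lac operon states, for illustration only.
TOY_TREE = ((("taxon_A", "taxon_B"), "taxon_C"), ("taxon_D", "taxon_E"))
TOY_STATES = {
    "taxon_A": "present",
    "taxon_B": "absent",
    "taxon_C": "absent",
    "taxon_D": "present",
    "taxon_E": "present",
}

if __name__ == "__main__":
    root_states, changes = fitch(TOY_TREE, TOY_STATES)
    print(root_states, changes)
```

In this toy case a strain carrying the operon nests among relatives that lack it, the pattern the text predicts for a lineage that lost and then regained lac.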
Caveats
The results of this phylogenetic analysis show that there are significant differences between the phylogenies of lacZ and those of the housekeeping genes ompA and gap. I have interpreted this divergence as potential horizontal transfer, and used the SH test to determine which transfers are supported by the data. Thus, the conclusions of this article are contingent on ML in general, and particularly on the SH test. If the model of evolution employed in this study is not a reasonable characterization of the actual evolutionary process, then the SH test will have given incorrect P values. The best fit model of evolution was frequently the most complex model I examined (GTR + I + Γ), and it is possible that an even more complex model would have been justified by the data. The effect of model misspecification on the SH test seems to be poorly understood; simulation studies of this problem might prove enlightening. Model misspecification is a potential problem in any statistical analysis, and I have taken steps to avoid this problem by statistically justifying the model of evolution, rather than choosing it arbitrarily.
A major conclusion of this article is that there is no evidence of horizontal transfer of the lac operon into E. coli. While I found a difference in the placements of E. coli, E. fergusonii, and Salmonella group IIIb, this difference was not statistically significant. As in any statistical analysis, the lack of significance may reflect type II error, that is, insufficient power to detect a true difference. The SH test was able to detect several significant differences between the trees, so this method can detect horizontal transfer events. Given the absence of other evidence of the horizontal transfer of the lac operon into E. coli, the null hypothesis of vertical transmission should not be rejected. A power analysis of the SH test would help future workers to interpret the implications of negative results.
Conclusions |
---|
Acknowledgements |
---|
Footnotes |
---|
References |
---|
Andrews, K. J., and E. C. Lin. 1976. Thiogalactoside transacetylase of the lactose operon as an enzyme for detoxification. J. Bacteriol. 128:510–513.
Blattner, F. R., G. Plunkett, 3rd, C. A. Bloch, N. T. Perna, V. Burland, M. Riley, J. Collado-Vides, J. D. Glasner, C. K. Rode, G. F. Mayhew et al. 1997. The complete genome sequence of Escherichia coli K-12. Science 277:1453–1474.
Boos, W. 1982. Synthesis of (2R)-glycerol-o-beta-D-galactopyranoside by beta-galactosidase. Methods Enzymol. 89(Pt D):59–64.
Boyd, E. F., F. S. Wang, T. S. Whittam, and R. K. Selander. 1996. Molecular genetic relationships of the salmonellae. Appl. Environ. Microbiol. 62:804–808.
Brenner, D. J., B. R. Davis, A. G. Steigerwalt, C. F. Riddle, A. C. McWhorter, S. D. Allen, J. J. Farmer, 3rd, Y. Saitoh, and G. R. Fanning. 1982a. Atypical biogroups of Escherichia coli found in clinical specimens and description of Escherichia hermannii sp. nov. J. Clin. Microbiol. 15:703–713.
Brenner, D. J., A. C. McWhorter, J. K. Knutson, and A. G. Steigerwalt. 1982b. Escherichia vulneris: a new species of Enterobacteriaceae associated with human wounds. J. Clin. Microbiol. 15:1133–1140.
Bronikowski, A. M., A. F. Bennett, and R. E. Lenski. 2001. Evolutionary adaptation to temperature. VIII. Effects of temperature on growth rate in natural isolates of Escherichia coli and Salmonella enterica from different thermal environments. Evolution 55:33–40.
Buvinger, W. E., K. A. Lampel, R. J. Bojanowski, and M. Riley. 1984. Location and analysis of nucleotide sequences at one end of a putative lac transposon in the Escherichia coli chromosome. J. Bacteriol. 159:618–623.
Daubin, V., N. A. Moran, and H. Ochman. 2003. Phylogenetics and the cohesion of bacterial genomes. Science 301:829–832.
Deng, W., V. Burland, G. Plunkett, 3rd, A. Boutin, G. F. Mayhew, P. Liss, N. T. Perna, D. J. Rose, B. Mau, S. Zhou, D. C. et al. 2002. Genome sequence of Yersinia pestis KIM. J. Bacteriol. 184:4601–4611.
Egel, R. 1979. The lac-operon for lactose degradation, or rather for the utilization of galactosylglycerols from galactolipids? J. Theor. Biol. 79:117–119.
Eisen, J. A. 1998. Phylogenomics: improving functional predictions for uncharacterized genes by evolutionary analysis. Genome Res. 8:163–167.
Escobar-Paramo, P., C. Giudicelli, C. Parsot, and E. Denamur. 2003. The evolutionary history of Shigella and enteroinvasive Escherichia coli revised. J. Mol. Evol. 57:140–148.
Grimont, F., P. A. D. Grimont, and C. Richard. 1992. The Genus Klebsiella. Pp. 2775–2796 in A. Balows, H. G. Truper, M. Dworkin, W. Harder, and K.-H. Schleifer, eds. The prokaryotes. Springer-Verlag, New York.
Guindon, S., and G. Perriere. 2001. Intragenomic base content variation is a potential source of biases when searching for horizontally transferred genes. Mol. Biol. Evol. 18:1838–1840.
Holt, J. G., and N. R. Kreig. 1984. Bergey's manual of systematic bacteriology. Lippincott, Williams & Wilkins, Baltimore.
Ito, H., N. Kido, Y. Arakawa, M. Ohta, T. Sugiyama, and N. Kato. 1991. Possible mechanisms underlying the slow lactose fermentation phenotype in Shigella spp. Appl. Environ. Microbiol. 57:2912–2917.
Koski, L. B., and G. B. Golding. 2001. The closest BLAST hit is often not the nearest neighbor. J. Mol. Evol. 52:540–542.
Koski, L. B., R. A. Morton, and G. B. Golding. 2001. Codon bias and base composition are poor indicators of horizontally transferred genes. Mol. Biol. Evol. 18:404–412.
Lawrence, J. G. 2001. Catalyzing bacterial speciation: correlating lateral transfer with genetic headroom. Syst. Biol. 50:479–496.
Lawrence, J. G., and H. Ochman. 1998. Molecular archaeology of the Escherichia coli genome. Proc. Natl. Acad. Sci. USA 95:9413–9417.
Lawrence, J. G., H. Ochman, and D. L. Hartl. 1991. Molecular and evolutionary relationships among enteric bacteria. J. Gen. Microbiol. 137:1911–1921.
Lawrence, J. G., and J. R. Roth. 1996. Selfish operons: horizontal transfer may drive the evolution of gene clusters. Genetics 143:1843–1860.
Lerat, E., V. Daubin, and N. A. Moran. 2003. From gene trees to organismal phylogeny in prokaryotes: the case of the gamma-proteobacteria. PLoS Biol. 1:E19.
Miller, J. H., and W. S. Reznikoff. 1978. The operon. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
Ochman, H., J. G. Lawrence, and E. A. Groisman. 2000. Lateral gene transfer and the nature of bacterial innovation. Nature 405:299–304.
Posada, D., and K. A. Crandall. 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14:817–818.
Pupo, G. M., R. Lan, and P. R. Reeves. 2000. Multiple independent origins of Shigella clones of Escherichia coli and convergent evolution of many of their characteristics. Proc. Natl. Acad. Sci. USA 97:10567–10572.
Riley, M., and A. Anilionis. 1980. Conservation and variation of nucleotide sequences within related bacterial genomes: enterobacteria. J. Bacteriol. 143:366–376.
Shimodaira, H., and M. Hasegawa. 1999. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol. Biol. Evol. 16:1114–1116.
Starlinger, P. 1977. DNA rearrangements in procaryotes. Annu. Rev. Genet. 11:103–126.
Swofford, D. L. 2003. PAUP*. Phylogenetic analysis using parsimony (*and other methods). Sinauer Associates, Sunderland, Mass.
Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and D. G. Higgins. 1997. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25:4876–4882.