Structure and distribution of the phosphoprotein phosphatase genes, prpA and prpB, among Shigella subgroups

Baoguang Li, Eric W. Brown, Christine D'Agostino, J. Eugene LeClerc and Thomas A. Cebula

Division of Molecular Biology, Center for Food Safety and Applied Nutrition, Food and Drug Administration, 8301 Muirkirk Road, Laurel, MD 20708, USA

Correspondence
Thomas A. Cebula
tcebula@cfsan.fda.gov


   ABSTRACT
 
Phosphoprotein phosphatases encoded by the prpA and prpB genes function in signal transduction pathways for degradation of misfolded proteins in the extracytoplasmic compartments of Escherichia coli. In order to trace the evolution of prp genes and assess their roles in other enteric pathogens, the structure and distribution of these genes among closely related Shigella subgroups were studied. PCR amplification, probe hybridization studies and DNA sequencing were used to determine the prp genotypes of 58 strains from the four Shigella subgroups, Dysenteriae, Boydii, Sonnei and Flexneri. It was found that the prp alleles among Shigella subgroups were extremely susceptible to gene inactivation and that the mutations involved in prp allele inactivation were varied. They included IS insertions, gene replacement by an IS element, a small deletion within the gene, a large deletion encompassing the entire gene region, and base substitutions that generated premature termination codons. As a result, of 58 strains studied, only eight (14 %) possessed intact prpA and prpB genes. Of the Shigella strains examined, 76 % (44/58) showed at least one of the prp alleles inactivated by one or more IS elements, including IS1, IS4, IS600 and IS629. Phylogenetic analysis revealed that IS elements have been independently acquired in multiple lineages of Shigella, suggesting that loss of functional alleles has been advantageous during Shigella strain evolution.


The GenBank/EMBL/DDBJ accession numbers for the prpA and prpB genes and the IS insertion sequences found inside the prp genes are AY734633–AY734679. Those for the phylogenetic datasets are as follows: mdh sequences, AY434217–AY434274; gapA, AY434159–AY434216; thrB, AY434275–AY434332; argR, AY434333–AY434390; crr, AY434159–AY434216.


   INTRODUCTION
 
Shigella was identified as a prominent human pathogen just over a century ago and recognized as a causative agent of bacillary dysentery. Shigella strains classically have been partitioned into four subgroups, Dysenteriae, Boydii, Sonnei and Flexneri, based on their O-antigen serotypes and capacity to ferment different sugars (Ewing, 1949). It is recognized, however, that the subgroups may be considered as pathogenic lineages of the large Escherichia coli family of organisms (Pupo et al., 2000; Lan & Reeves, 2002; Escobar-Paramo et al., 2003). Strains of all four subgroups cause diarrhoea and dysentery in humans and higher primates. Among humans, Shigella is transmitted through the faecal–oral route and by ingestion of contaminated foods or water (Watarai et al., 1995). An estimated 165 million cases of shigellosis worldwide are reported annually, resulting in more than a million deaths (Kotloff et al., 1999). The serious health effects of shigellosis, coupled with an infectious dose as low as ten bacteria (DuPont et al., 1989), make Shigella a highly dangerous aetiological agent from both a food safety and food security perspective. Understanding the mechanisms by which pathogens such as Shigella can survive in conditions of environmental stress is critical if appropriate strategies are to be developed to reduce or eliminate these bacteria from the food and water supply.

Whole genome sequences have been published for two strains of Flexneri serotype 2a, the most prevalent Shigella subgroup and serotype (Jin et al., 2002; Wei et al., 2003). Surveying a variety of stress-response genes in the sequences revealed that these Shigella strains contained alleles for prpA and prpB, although both presented as pseudogenes in these strains of Flexneri due to deletion or IS insertion. prpA and prpB encode phosphoprotein phosphatases that play a role in the CpxR–CpxA signal transduction pathways responsive to protein misfolding. They were recently identified in E. coli and shown to modulate the transcription of proteins, such as the HtrA protease, that protect cells from accumulation of misfolded proteins in the periplasmic space (Missiakas & Raina, 1997).

In E. coli, defects in the prpA and prpB genes lead to a slow growth phenotype and slight temperature sensitivity above 43 °C (Missiakas & Raina, 1997). These findings prompted us to investigate the integrity, structure and distribution of prp gene alleles among other Shigella strains and subgroups, since genetically, Shigella comprises a series of closely related but discernible lineages of E. coli (Pupo et al., 2000; Escobar-Paramo et al., 2003). Insight into the structure and distribution of the prpA and prpB genes among Shigella subgroups should aid in assessing the role of prp genes in response to stress conditions among enteric pathogens. We therefore determined the prevalence, organization and phylogenetic history of these genes in 58 strains spanning the four Shigella subgroups.


   METHODS
 
Strains and media.
The Shigella strains used in this study are listed in Table 1 and were from the FDA stock collection or were gifts from Drs Keith A. Lampel (FDA) and Nancy Strockbine (CDC). Fifty-eight Shigella and 72 ECOR (E. coli reference) strains (Ochman & Selander, 1984) were inoculated from frozen stock into LB broth and grown overnight.


 
Table 1. Presence of prpA and prpB genes among Shigella strains

 
DNA preparation.
Except where noted, genomic DNA was prepared using the InstaGene DNA extraction matrix (Bio-Rad) according to the manufacturer's instructions. A 200 µl volume of overnight broth culture was used to prepare DNA samples; after extraction, the volume of each preparation was adjusted to 200 µl.

PCR amplification of prpA and prpB genes.
The published sequences of the prpA and prpB genes from E. coli MG1655 (Blattner et al., 1997) were used to design oligonucleotides for PCR amplification and DNA probes for colony hybridization. Oligonucleotides (Table 2) used for amplification were prpA1 and prpA3 for prpA and prpB1 and prpB3 for prpB. prpA and prpB were amplified using 5 µl of each of the DNA preparations and AmpliTaq DNA polymerase according to the manufacturer's instructions. prpA was amplified using 30 cycles of 94 °C for 1 min, 60 °C for 1 min and 72 °C for 2·5 min. prpB was amplified using the same conditions except for an annealing temperature of 55 °C. In order to investigate the prpA gene of 13 Dysenteriae strains that were not amplified by these conditions, primer pairs that spanned up to 4·1 kb of the prpA flanking region were tested. Amplification was achieved using primer pair prpB20 and prpB28 (Table 2) and long PCR conditions using a GeneAmp XL PCR Kit (Perkin Elmer) and DNA prepared with a Puregene DNA Isolation Kit (Gentra) as described previously (Li et al., 2003).


 
Table 2. Primers

 
Oligonucleotide probes and colony hybridization.
Oligonucleotides prpA1 and prpB1 were used as probes for colony hybridization to detect the prpA and prpB sequences, respectively. Each oligonucleotide was end-labelled with [γ-32P]ATP and the labelled probes were column purified (Stratagene) according to the manufacturer's instructions. Strains of Shigella and E. coli were grown in brain-heart infusion broth and on brain-heart infusion agar for filter preparation. Colony hybridization was performed as previously described (Cebula, 1995), except that hybridization and washing were done at 60 °C and 55 °C, respectively.

DNA sequencing.
Sequence analysis was performed on all prpA and prpB amplicons from the Shigella strains studied in this work. PCR products were purified using QIAquick PCR purification spin columns (Qiagen). PCR products were grouped by size, and a representative product from each size category was selected for sequencing of the entire prp gene (Lark Technologies). PCR products from the remaining strains were sequenced by Amplicon Express, either for a complete prp gene sequence or for the prp sequence boundaries when inserts in the gene were found. All prp amplicons that initially showed an intact ORF were resequenced by Lark Technologies.

Phylogenetic analysis.
Sequences from five housekeeping genes, mdh (malate dehydrogenase; 73 min), argR (arginine biosynthesis repressor; 73 min), thrB (homoserine kinase; 0 min), crr (phosphotransferase system glucose-specific enzyme III; 55 min) and gapA (glyceraldehyde-3-phosphate dehydrogenase; 40 min), were used for phylogenetic analysis. Genes were amplified using oligonucleotides designed from conserved sites found in alignments of these genes from E. coli K-12 (Blattner et al., 1997), E. coli O157 : H7 (Perna et al., 2001), and Shigella strains (Pupo et al., 2000). Primer pairs used to amplify Shigella housekeeping genes and their resultant amplicon sizes were as follows: mdh, mdhnhF-mdhnhR (491 bp), mdhcoF-mdhcoR (427 bp); gapA, gapAF-gapAR (466 bp); thrB, thrBF-thrBR (566 bp); crr, crrF-crrR (510 bp); and argR, argRF-argRR (560 bp). Amplicons were purified as described above and cycle-sequenced in both directions by automated Sanger dideoxy-chain-termination methods using the primers listed in Table 2 (Amplicon Express). Sequences were assembled and analysed using the GCG (Genetics Computer Group, University of Wisconsin, Madison, WI, USA) sequence-handling program (Devereux et al., 1984). Multiple sequence alignment was performed using the CLUSTAL X multiple sequence alignment program (Thompson et al., 1997).

The combination of the five genes yielded 2459 nucleotides that were analysed simultaneously in a combined character matrix. Each of these genes is taken to largely recapitulate whole-chromosome (strain) evolution for enteric species and has been applied previously to construct stable phylogenies of Shigella, E. coli and Salmonella strains (Boyd et al., 1994; Pupo et al., 1997, 2000; Lecointre et al., 1998; Brown et al., 2001, 2002). Aligned nucleotide matrices were subjected to phylogenetic analysis using the principle of maximum parsimony (Farris, 1983), available in PAUP* (Phylogenetic Analysis Using Parsimony) v.4.03b (Swofford, 1999). Most-parsimonious trees were derived using heuristic searches with random addition order. Character support for internal tree nodes was determined by 5000 iterations of bootstrapping (Felsenstein, 1985). Relative levels of homoplasy were measured for the tree using CI, the consistency index (Forey et al., 1992). Congruence between genes was assessed using the incongruence length difference (ILD) test, which measures the level of phylogenetic concordance between two genetic datasets, either accepting or rejecting the null hypothesis of congruence among genes (Farris et al., 1995). The version of the ILD test employed here is available in PAUP* v.4.03b. The evolutionary events that gave rise to the current structure of prpA and prpB genes within Shigella were investigated by mapping prp gene distribution, associated IS elements and prp sequence anomalies (e.g. termination codons) onto a phylogenetic tree containing sequences from the 58 Shigella strains and a single outgroup strain, Salmonella enterica Typhimurium LT2, given in GenBank nos AE008854 (mdh), M63369 (gapA), AE008809 (crr), AE008693 (thrB) and AE008854 (argR).
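The homoplasy measure named above can be illustrated with a short calculation. The following is a minimal sketch, assuming the ensemble consistency index is defined as the minimum possible number of character-state changes divided by the observed tree length; the function names and the step counts are hypothetical and are not values from this dataset.

```python
def consistency_index(min_steps: int, observed_steps: int) -> float:
    """Ensemble CI: minimum conceivable steps divided by observed tree length."""
    if observed_steps <= 0 or not 0 <= min_steps <= observed_steps:
        raise ValueError("expected 0 <= min_steps <= observed_steps, observed_steps > 0")
    return min_steps / observed_steps


def homoplasy_fraction(ci: float) -> float:
    """Share of the tree length attributable to extra (convergent) steps."""
    return 1.0 - ci


# Hypothetical step counts, chosen only to illustrate the calculation.
ci = consistency_index(min_steps=47, observed_steps=50)
print(f"CI = {ci:.2f}, homoplasy = {homoplasy_fraction(ci):.0%}")
```

A CI close to 1 means that few characters changed more often than the minimum required, which is the sense in which a high CI is read as a low convergence level.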


   RESULTS
 
PCR amplification of prpA and prpB genes
Both the prpA and prpB genes in E. coli are 657 bp in size (Missiakas & Raina, 1997). Using flanking prpA primers derived from the E. coli K-12 sequence (Blattner et al., 1997), we attempted to amplify full gene-length prpA amplicons from 58 Shigella strains. As shown in Fig. 1, amplified DNA of the expected size was recovered from only 13 (22 %) of the strains, including 11 out of 18 Dysenteriae strains and two out of 11 Flexneri strains. No prpA amplification product was observed from any of 12 Dysenteriae type 1 strains or from a Dysenteriae type 7 strain.



 
Fig. 1. PCR amplification of the prpA gene among Shigella subgroups and identification of the deletion that engulfed the prpA region in Dysenteriae strains. M refers to a kb ladder. (a) Lanes 1–12, Dysenteriae type 1 strains; lanes 13–16, Dysenteriae strains of types 2–5. The lower arrow points to the expected size of prpA amplicons and the upper arrow points to an amplicon with insertion. (b) Lanes 1–10, Dysenteriae strains of types 6–15; lane 11, Dysenteriae 6; lanes 12 and 14, Dysenteriae 3; lane 13, Dysenteriae of unknown serotype; lanes 15 and 16, Boydii strains. The lower arrow points to the expected size of prpA amplicons and the upper arrow indicates the size of insertion-containing amplicons. (c) Lanes 1–5, Boydii strains; lanes 6–15, Sonnei strains; lane 16, a Flexneri strain. The upper and lower arrows indicate the sizes of prpA amplicons with insertions of different sizes. (d) Lanes 1–10, Flexneri strains. The lower arrow indicates the expected size of prpA amplicons and the upper arrow indicates the size of insertion-containing amplicons. (e) A 2·3 kb deletion on the chromosome flanking the prpA gene region was identified by long PCR from 13 prpA-negative Dysenteriae strains. Lanes 1–5, Dysenteriae type 1 strains 1, 2, 3, 4 and 5; lane 6, Dysenteriae type 7 strain 18; lanes 7–13, Dysenteriae type 1 strains 6, 7, 8, 9, 10, 11 and 12; lane 14, E. coli K-12. The upper arrow indicates the normal-sized amplicon from E. coli O157 : H7 as a control and the lower arrow indicates deletion-containing amplicons. (f) Location of the 2·3 kb deletion covering the prpA gene and flanking region. Sequence information is based on the GenBank sequence of the E. coli O157 : H7 EDL933 genome, accession no. AE005406. Z2882, Z2883 and Z2889 refer to ORFs. As diagrammed, the deletion extends from position 6452 to 8747, and the repeated sequences found at the end points of the deletion are noted.

 
The remaining 32 strains produced prpA amplicons larger than the expected size of 657 bp, suggesting the presence of insertion(s) within the prpA gene. Six Dysenteriae strains (type 2, type 5, type 12 and three of unknown type), all seven Boydii strains, and nine out of 11 Flexneri strains yielded approximately 2·0 kb prpA amplicons, while all ten Sonnei strains yielded approximately 2·7 kb amplicons (Fig. 1a–d).

When a primer pair that spanned 4·1 kb of the prpA gene and flanking regions was utilized for the 13 Dysenteriae prpA-negative strains, a PCR product of 1·8 kb was produced, indicating that a deletion of 2·3 kb occurred at the prpA gene region of these strains (Fig. 1e). The 1·8 kb amplicons from the 13 Dysenteriae strains were sequenced to determine the exact location and size of the deletion. Sequence analysis confirmed that a 2295 bp deletion covered the prpA gene and over 1·6 kb of flanking sequence. A repeated sequence of 10 bp (GGTGAATCGC) was identified at the deletion end points, as represented in Fig. 1(f). The retention of only one copy of the repeated sequence in these strains points to Streisinger slippage as a likely mechanism for deletion formation (Streisinger et al., 1966).
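The deletion described above can be illustrated with a toy model. The following sketch assumes a simple slippage outcome in which everything between two copies of a direct repeat is lost together with one copy of the repeat; the flanking sequence is hypothetical, while the 10 bp repeat, the 4·1 kb primer span and the end-point coordinates (from the Fig. 1f legend) are taken from the text.

```python
def delete_between_repeats(seq: str, repeat: str) -> str:
    """Remove one repeat copy plus everything between the first two copies."""
    first = seq.find(repeat)
    second = seq.find(repeat, first + len(repeat)) if first != -1 else -1
    if second == -1:
        raise ValueError("need two non-overlapping copies of the repeat")
    # Keep the sequence up to the first copy, then resume at the second copy.
    return seq[:first] + seq[second:]


REPEAT = "GGTGAATCGC"  # 10 bp repeat reported at the deletion end points

# Hypothetical toy chromosome: flank, repeat, 8 bp "gene region", repeat, flank.
toy = "AAAA" + REPEAT + "TTTTTTTT" + REPEAT + "CCCC"
deleted = delete_between_repeats(toy, REPEAT)
lost = len(toy) - len(deleted)  # one repeat copy plus the intervening 8 bp

# Size check using the coordinates in the Fig. 1(f) legend.
deletion_bp = 8747 - 6452
expected_amplicon_kb = round(4.1 - deletion_bp / 1000, 1)
print(deleted, lost, deletion_bp, expected_amplicon_kb)
```

In the toy sequence a single copy of the repeat survives, mirroring the single retained copy observed at the deletion junction in the 13 Dysenteriae strains.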

Fig. 2 shows the analysis of the prpB gene by PCR. All 12 Dysenteriae type 1 strains were negative for prpB. All other strains of Dysenteriae (i.e. types 2–15 and four Dysenteriae of unknown serotype), Boydii (7/7), and Flexneri (11/11) produced a prpB amplicon of the expected size. All Sonnei strains (10/10) yielded amplicons that were larger than the expected size.



 
Fig. 2. PCR amplification of the prpB gene of 58 strains of Shigella. M refers to a kb ladder. The arrows indicate the expected size of prpB amplicons, except in (c), where the lower arrow indicates the expected size of prpB amplicons and the upper arrow indicates amplicons with an insertion. (a) Lanes 1–12, Dysenteriae type 1 strains; lanes 13–16, Dysenteriae strains of types 2–5. (b) Lanes 1–10, Dysenteriae strains of types 6–15; lanes 11–14, Dysenteriae strains of unknown serotype; lanes 15–16, Boydii strains. (c) Lanes 1–5, Boydii strains; lanes 6–15, Sonnei strains; lane 16, Flexneri. (d) Lanes 1–10, Flexneri strains.

 
Colony hybridization was used as an independent means to assess the presence or absence of prp genes in the Shigella strains analysed here. With two exceptions, strains shown positive for the prpA and prpB genes by PCR amplification were probe-positive regardless of their IS status (data not shown). Although probe-negative results were found for the prpA gene in Dysenteriae type 10 and the prpB gene in Dysenteriae type 7, sequencing of PCR amplicons confirmed that the genes are intact in these strains.

DNA sequence analysis of the prpA gene
The entire prpA gene was sequenced from Boydii, Sonnei, Flexneri, and 11 Dysenteriae strains other than serotype 1. Sequence analysis confirmed that all 13 Shigella strains that possess a normal-sized prpA amplicon contain an intact prpA gene. Like E. coli K-12 (Missiakas & Raina, 1997), the prpA gene in these strains is 657 bp in length. Sequence comparisons among strains containing an intact prpA gene revealed high sequence identity (99 %) among prpA alleles in Shigella strains and over 99·5 % sequence similarity with the E. coli prpA gene.

As noted above, Shigella strains that yielded larger prpA amplicons fell into two discrete groups, producing amplicons that were either 2·0 kb or 2·7 kb in size. The 2·7 kb prpA amplicon, found in all Sonnei strains but not in any other subgroups, contained full-length IS1 and IS600 insertion sequence elements in addition to a complete prpA gene sequence. In this group of prpA alleles, the IS1 sequence element interrupts prpA 176 bp from the prpA start codon, and the IS600 insertion is 461 bp downstream from the prpA start codon (Table 3).


 
Table 3. Insertion sequences in the prpA gene among Shigella strains

 
Strains generating the approximately 2·0 kb prpA amplicon could be separated into three distinct groups. Sequence analyses of several representative strains showed that the 2·0 kb prpA amplicon comprised a full-length prpA gene interrupted by full-length IS600, IS4 or IS629 insertion sequence elements. Shown in summary fashion in Table 3, IS600 was located 188 bp from the prpA start codon in five Boydii strains; 446 bp from the start codon in nine Flexneri strains; and either 188 bp (type 5) or 415 bp (types 2 and 12) from the start codon in three Dysenteriae strains. The prpA gene of two Boydii strains and one Dysenteriae strain contained IS4 located 365 bp from the prpA start codon. IS629 was found 228 bp from the start codon in two other Dysenteriae strains. It is noteworthy that the IS600 insertions in the prpA gene among the four Shigella subgroups, including Sonnei, whose prpA gene harboured both IS1 and IS600, occurred at different locations specific to each group, at positions 188, 415, 446 and 461 bp from the start codon of the prpA gene.
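The two amplicon size classes can be rationalized with simple arithmetic. The sketch below adds IS element lengths to the 657 bp gene length; the IS lengths are approximate reference values assumed for illustration and are not stated in this article, primer-flanking sequence and direct repeats are ignored, and all names are hypothetical.

```python
PRPA_BP = 657  # prpA length given in the text

# ASSUMPTION: approximate IS element lengths, not taken from this article.
IS_LENGTH_BP = {"IS1": 768, "IS600": 1264, "IS4": 1426, "IS629": 1310}


def amplicon_bp(insertions):
    """Gene length plus the summed lengths of the inserted IS elements."""
    return PRPA_BP + sum(IS_LENGTH_BP[name] for name in insertions)


# Insertion patterns described in the text for each amplicon class.
patterns = {
    "intact prpA": [],
    "prpA::IS600": ["IS600"],
    "prpA::IS4": ["IS4"],
    "prpA::IS629": ["IS629"],
    "prpA::IS1 + IS600 (Sonnei)": ["IS1", "IS600"],
}
sizes = {label: amplicon_bp(ins) for label, ins in patterns.items()}
for label, size in sizes.items():
    print(f"{label}: about {size / 1000:.1f} kb")
```

Under these assumed lengths, single insertions give products near 2·0 kb and the double insertion gives a product near 2·7 kb, in line with the size classes seen on the gels.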

Detection of prp genes among ECOR strains
In order to assess the functional and structural status of prp genes within conventional E. coli strains, we probed the ECOR collection (Ochman & Selander, 1984) for the presence of prpA and prpB sequences (Table 4). The prpA gene was detected in 100 % (n=72) of the ECOR strain collection. Surprisingly, however, the prpB gene was detected in only 74 % (n=53) of the ECOR strains. It is notable that an overwhelming majority of the 19 prpB-negative E. coli strains are from phyletic groups B2 (n=10) and D (n=6) (Herzer et al., 1990).


 
Table 4. Presence of prp genes among ECOR strains

The prpA gene was present in all 72 ECOR strains. The prpB gene was present in 53 of the strains, as shown in the table.

 
DNA sequence analysis of the prpB gene
The overall sequence identity for prpB among the Shigella strains examined was greater than 95 %, and sequence comparisons of Shigella prpB with E. coli K-12 prpB showed 95 % sequence similarity as well. Sequence analysis showed that the prpB gene of all Dysenteriae type 1 strains examined was replaced by an IS1 insertion sequence element, confirming our earlier findings (LeClerc et al., 1999). Twelve additional Dysenteriae strains were demonstrated to contain an intact prpB gene, however (Table 1). Although an expected-size prpB amplicon was obtained from all seven Boydii and six Dysenteriae strains, the prpB gene in these strains is a pseudogene due to a premature stop codon located within the prpB ORF (Table 1). Specifically, a stop codon (TAT→TAA) interrupted the prpB gene at position 81 in all Boydii and Dysenteriae type 3 strains. Additionally, CAA→TAA interrupted the prpB gene at position 526 in Dysenteriae types 4, 14, 15 and two Dysenteriae strains of unknown serotype (Table 1).

In contrast, sequence analyses of the ten 1·4 kb products amplified from Sonnei strains revealed that the prpB gene in these strains is interrupted by an IS1 insertion sequence located approximately in the middle of the gene. In each of the strains, the site of IS1 insertion is the same. In the prpB gene of Flexneri strains, a 4 bp (CAGA) deletion was identified 322 bp from the start codon. Given the propensity for insertion sequences to invade prp genes in Shigella and the absence of deletions among other isolates of Shigella or E. coli, it is possible that the 4 bp deletion in Flexneri strains represents a ‘scar’ left by imprecise excision of an IS sequence. The deletion in prpB generated a frame shift, followed by a stop codon 19 bp downstream from the deletion site, resulting in prpB pseudogenes in the 11 Flexneri strains examined.
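The way a single base substitution or a 4 bp deletion truncates an ORF can be shown with a small scan for in-frame stop codons. This is a minimal sketch using hypothetical toy sequences, not the prpB sequence; only the reported positions (81 and 526) and the CAGA deletion motif come from the text, and the function names are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon(orf: str):
    """Return the 1-based position of the first in-frame stop codon, or None."""
    for i in range(0, len(orf) - 2, 3):
        if orf[i:i + 3] in STOP_CODONS:
            return i + 1
    return None


def codon_of(position: int):
    """Map a 1-based nucleotide position to (codon number, base within codon)."""
    return (position - 1) // 3 + 1, (position - 1) % 3 + 1


# Reported substitution sites: 81 is a third codon base (TAT to TAA) and
# 526 is a first codon base (CAA to TAA).
print(codon_of(81), codon_of(526))

# Hypothetical toy ORF with a TAT codon mutated to TAA at its third base.
wild_type = "ATGTATCAAGGCTAA"
nonsense = "ATGTAACAAGGCTAA"

# Hypothetical toy ORF in which deleting CAGA shifts the frame onto TGA.
wt_frame = "ATGGCACAGATGATTGACCGTTAA"
frameshift = wt_frame[:6] + wt_frame[10:]  # removes the 4 bp "CAGA"

print(first_stop_codon(wild_type), first_stop_codon(nonsense))
print(first_stop_codon(wt_frame), first_stop_codon(frameshift))
```

The reported positions are consistent with the codon changes given in the text, since position 81 falls on the last base of codon 27 and position 526 on the first base of codon 176.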

Direct repeat sequences
Table 3 shows the direct repeat sequences that were found at each end of the insertion site of IS elements in the Shigella prpA gene. In the cases of both prpA and prpB, the repeat sequence, unique to the IS element, is a partial duplication of the prpA or prpB gene sequence immediately following the insertion. Four types of direct repeat sequences were found next to the IS600 insertion sites. In the case of IS600 and IS1 both residing in the prpA gene of Sonnei, IS600 had a direct repeat of 5 bp (cAAAA) at position 461, of which the ‘c’ was shared by the prpA gene and IS600, and IS1 had a direct repeat of 10 bp (TCGATCGTGG) at position 173. Four-basepair repeats were found at the site of IS600 insertions in the prpA genes of Dysenteriae other than type 1, Boydii and Flexneri strains, but were different for each Shigella group: CATG at position 415 in Dysenteriae types 2 and 12; GTCT at position 188 in Boydii and Dysenteriae type 5; and GTTt at position 446 in Flexneri, of which the ‘t’ was shared by prpA and IS600. Identical 12 bp direct repeats were found abutting the IS4 insertion site in prpA in Dysenteriae and Boydii strains. In the prpB gene, a direct repeat of 9 bp (CCATCGCAC) was found at the IS1 insertion site in Sonnei strains.
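Direct repeats of this kind can be located by comparing the sequence on either side of an insertion. The sketch below assumes the duplicated target sequence ends the left flank and begins the right flank; the flanking bases are hypothetical, and only the 9 bp repeat reported for the IS1 site in Sonnei prpB is taken from the text.

```python
def flanking_direct_repeat(left_flank: str, right_flank: str, max_len: int = 15) -> str:
    """Longest suffix of the left flank that equals a prefix of the right flank."""
    best = ""
    limit = min(max_len, len(left_flank), len(right_flank))
    for k in range(1, limit + 1):
        if left_flank[-k:] == right_flank[:k]:
            best = left_flank[-k:]  # loop is ascending, so the longest match wins
    return best


# Hypothetical flanks around an IS1 insertion; the repeat itself is from the text.
left = "ACGT" + "CCATCGCAC"
right = "CCATCGCAC" + "TTGA"

repeat = flanking_direct_repeat(left, right)
print(repeat, len(repeat))
```

Applied to real junction sequences, the same comparison would need to tolerate bases shared between the gene and the IS terminus, such as the lower-case "c" and "t" noted above.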

Phylogenetic distribution of prp-gene-associated changes among Shigella subgroups
In order to investigate evolutionary aspects of prp gene distribution and alteration by IS elements, a multilocus nucleotide sequence phylogeny based on five housekeeping genes was constructed for the 58 Shigella strains in this study. The ILD results for the five genes tested simultaneously revealed that the sequences were not phylogenetically discordant with each other (P=0·166 for 1000 partitions), supporting the view that these genes have similar evolutionary structures that are generally representative of Shigella chromosome evolution.

Upon inspection, the most parsimonious tree to result from the combined analysis yielded four disparate clades of Shigella strains, designated clades I–IV (Fig. 3). Despite the use of an independent collection of Shigella strains in the analysis presented here, it is important to note that the four clades, along with various subclades within these phylogenetic partitions, essentially recapitulated the major Shigella lineages and subgroups originally defined by Pupo et al. (2000) and reinforced by Escobar-Paramo et al. (2003) and Hyma et al. (2005). As an example, Sonnei and Dysenteriae serotype 1 formed distinct, well-defined clades (supported by unanimous bootstrap values), indicating that phylogenetic intermingling between these two specific clonal lineages and other Shigella strains has not occurred. This finding was consistent with previous work that also identified these two lineages as isolated clones (Pupo et al., 2000). Also congruent with previous phylogenetic characterizations of Shigella, Dysenteriae type 10, represented here by Shigella strain 21, falls out by itself at the base of the tree and away from the main Shigella clusters. Finally, the three major clusters of Shigella – representing assemblages of various Boydii strains (except for serotypes 7 and 13, neither of which clusters in the main Shigella group and both of which are more closely related to Escherichia albertii: Hyma et al., 2005), Flexneri strains and Dysenteriae strains, and originally designated S1–S3 by Pupo et al. (2000) – were largely upheld in the centre of the tree as constituents of clade II (Fig. 3). The only notable exception to this congruence was Dysenteriae strain 19, representative of the D8 Shigella lineage. In this case, D8 falls out within a clade equivalent to Shigella cluster 3 as a sister branch to strain 20 (Dysenteriae type 9), while, in previous analyses, D8 forms a unique lineage separate from any of the three main clusters of Shigella. It should be noted, however, that this single topological difference between analyses does not preclude the phylogenetic investigation of prp gene alterations, since the D8 strain (19) retains fully intact prpA and prpB alleles that are untainted by IS element insertion or other inactivating mutational events. Thus, effective reconciliation of our strain data with earlier Shigella phylogenies points to a consistent evolutionary signal in the Fig. 3 cladogram that allows us to explore evolutionary aspects of prp gene deactivation among Shigella strains.




 
Fig. 3. Phylogenetic mapping of IS elements and other alterations of Shigella strains. The tree shown represents the single most parsimonious tree to result from a combined analysis of 2435 nucleotide characters from five housekeeping genes (mdh, gapA, argR, thrB and crr). The tree had a length of 368 steps and a CI of 0·94, indicating a 6 % convergence level. Measures of clade confidence are reported below each node in the form of bootstrap values (out of 5000 iterations) (Felsenstein, 1985). The internal brackets to the right of each clade reflect monophyletic strain groupings with respect to Shigella lineages and clones. The broken internal bracket indicates a group of strains that formed a polytomy on the tree and is ambiguous with respect to strain monophyly. The larger external brackets designate the four distinct clades (denoted by Roman numerals I–IV) of Shigella strains present in the tree. The major Shigella lineages originally described by Pupo et al. (2000) are denoted, where applicable, in grey highlights as follows: SS, Sonnei; S1–S3, clusters retaining multiple Shigella subgroups (S2 on our tree is putative since it could not be confirmed by known overlapping serotypes); D1, Dysenteriae serotype 1; D10, Dysenteriae type 10. The individual branch lengths are presented above each branch; the parenthetical values denote the number of unambiguous substitutions that mapped to the tree only once. (a) Phylogenetic mapping of prpA gene distribution and associated IS elements. (b) Phylogenetic mapping of prpB distribution, IS elements, and other nucleotide-based polymorphisms associated with prpB. The mapping and optimization scheme shown represents the most parsimonious scenario for the evolution of IS and nucleotide-based configurations. In clades where equally parsimonious optimizations existed, the accelerated transformation (ACCTRANS) scheme is presented. Keys defining each symbol are presented to the left of each tree. Black and white symbols represent gains and losses of characteristics, respectively.

 
The prpA genes, most with insertion sequences, appeared in three of the four clades, indicating a wide distribution across evolutionarily distinct Shigella lineages (Fig. 3a). The exception was clade III, composed entirely of strains of Dysenteriae 1, where a single deletion event accounts for the absence of prpA in this lineage. It is noteworthy that the same group of strains also lacks a prpB gene despite the presence of prpB in remaining Shigella clades. When binary data from insertion, substitution and deletion events were mapped onto the Shigella strain tree, several IS families were found to have been gained or lost independently in the evolution of these strains. IS600, for instance, has emerged four times and been lost at least twice among prpA alleles, while IS4 appears twice. prpB-associated IS1 elements emerged twice as well (Fig. 3b). In total, four distinct IS combinations found associated with prpA and prpB mapped to the tree 13 times, suggesting that the current position of many of the IS elements examined here probably resulted from independent transfers. Additionally, two substitutions resulting in Prp-truncating stop codons each appear to have occurred separately but in parallel throughout the evolution of various Shigella lineages, while a single 4 bp deletion found in prpB can be accounted for once in the Flexneri group (in clade II). Similar to previous accounts of IS-mediated ablation of the tna (tryptophan utilization) operon and the indole-negative phenotype (Rezwan et al., 2004), these data indicate a reticulated parallel evolutionary path for many of the deleterious features found in the prp genes of Shigella strains.


   DISCUSSION
 
By using PCR amplification, colony hybridization and DNA sequence analyses, we examined a collection of Shigella strains representing four designated subgroups for the presence and integrity of the phosphoprotein phosphatase genes prpA and prpB. Surprisingly, most of the 58 Shigella strains examined (86 %, n=50) were found to be defective in at least one of the prp genes and a large majority of strains (71 %, n=41) were deficient in both genes. Additionally, a majority of Shigella strains (55 %, n=32) possessed a prp gene that was inactivated by an IS element. In summary, all 12 strains of Dysenteriae type 1 examined entirely lacked both the prpA and prpB alleles. Those strains appear to have lost prpA by deletion and prpB by replacement with IS1. In all Boydii strains examined, the prpA genes were inactivated by IS600 or IS4 insertion and prpB was terminated prematurely by a stop codon. The prpA and prpB genes of all Sonnei strains examined had undergone inactivation by three IS insertion events, i.e. inactivation of prpA by both IS1 and IS600 and of prpB by IS1. Nine of eleven Flexneri strains carried prpA with an IS600 insertion and prpB of all strains had a four-base deletion, creating a downstream stop codon. Only in Dysenteriae strains of serotypes other than 1 were significant numbers of prpA alleles (11/18) and prpB alleles (12/18) intact.
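The proportions quoted above follow directly from the strain counts. The short sketch below recomputes them as a consistency check; it uses only counts stated in the text, and the variable and function names are illustrative.

```python
TOTAL_STRAINS = 58


def percent(count: int, total: int = TOTAL_STRAINS) -> int:
    """Percentage of strains, rounded to the nearest whole number."""
    return round(100 * count / total)


# Counts as stated in the text.
counts = {
    "defective in at least one prp gene": 50,
    "deficient in both prp genes": 41,
    "prp gene inactivated by an IS element": 32,
}
percentages = {label: percent(n) for label, n in counts.items()}

# Strains with both genes intact are the complement of the first category.
intact_both = TOTAL_STRAINS - counts["defective in at least one prp gene"]

for label, pct in percentages.items():
    print(f"{label}: {pct} %")
print(f"intact prpA and prpB: {intact_both} strains ({percent(intact_both)} %)")
```

The complement of the first category gives eight strains with both genes intact, matching the figure of 14 % reported in the Abstract.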

The frequency of inactivated prp genes among Shigella strains due to deletion, IS element replacement, IS insertion and premature termination contrasts with that seen in E. coli, which largely retains intact prpA and prpB loci. When we assessed the status of the prp genes among ECOR strains, a collection assembled to represent the genetic diversity within the E. coli species (Ochman & Selander, 1984), we identified the expected-sized prpA gene by PCR in all 72 ECOR strains of the collection, while the prpB gene was not detected in 19 of the 72 ECOR strains (see Table 4). It is noteworthy that there were no obvious growth defects in the 19 prpB-negative ECOR strains and that four of those strains are human pathogens. Such lack of a prominent phenotype might be expected, since in E. coli K-12, a prpB null mutation had less effect than the prpA mutation and combined mutant alleles produced only a mild synergistic effect (Missiakas & Raina, 1997). The prpA and prpB gene functions seem to overlap.

Given the redundancy of prp gene functions in E. coli, the observation that most Shigella strains carry disrupted prp alleles or are devoid of prp genes entirely is striking for strains that persist in natural settings. Nineteen prpB-negative ECOR strains notwithstanding, this finding for Shigella sharply contrasts with the prp status among feral E. coli, which retained intact prpA and prpB gene sequences in 72 (100 %) and 53 (74 %) of ECOR strains, respectively. From an evolutionary perspective, loss of prp function seemingly would limit tolerance of the bacterium to external stresses, and consequently reduce its environmental elasticity (Missiakas & Raina, 1997). This might, in part, explain the limited habitats for Shigella as human pathogens. Shigella strains appear to lack the environmental tolerance that other closely related species possess, suggesting a relatively narrow niche for these strains in nature. Consistent with this, Shigella strains, in general, are susceptible to oxidative, temperature and osmotic stresses. In comparative studies with E. coli, for instance, Shigella shows relatively weaker acid tolerance and poor survival in the presence of H2O2 and other germicides (Lin et al., 1995; Sagripanti et al., 1997; Taormina et al., 2001). Such deficiencies in stress response might hinder movement of the bacterium into different environments or novel niches. On the other hand, the absence of prpB in a small subset of feral E. coli strains argues that the PrpB phenotype may have little effect on subsequent bacterial fitness. That is, loss of prpB may do little in limiting environmental elasticity or host range in E. coli. From a genomic perspective, however, Shigella appears to have sustained a series of unique acquisitions and losses that now distinguish many Shigella subgroups from their E. coli brethren (Jin et al., 2002; Wei et al., 2003).
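
The proportions quoted for the two collections can be checked with simple arithmetic. The sketch below uses only counts stated in the text (58 Shigella strains, 72 ECOR strains); the function and dictionary names are illustrative choices, and rounding to the nearest whole per cent is assumed.

```python
# Recompute the percentages quoted in the Discussion from the stated counts.

def percent(count, total):
    """Percentage rounded to the nearest integer."""
    return round(100 * count / total)


SHIGELLA_TOTAL = 58
ECOR_TOTAL = 72
ECOR_PRPB_NEGATIVE = 19

summary = {
    "Shigella: defective in at least one prp gene": percent(50, SHIGELLA_TOTAL),
    "Shigella: defective in both prp genes": percent(41, SHIGELLA_TOTAL),
    "Shigella: prp gene inactivated by an IS element": percent(32, SHIGELLA_TOTAL),
    "ECOR: prpA detected": percent(ECOR_TOTAL, ECOR_TOTAL),
    "ECOR: prpB detected": percent(ECOR_TOTAL - ECOR_PRPB_NEGATIVE, ECOR_TOTAL),
}

for label, value in summary.items():
    print(f"{label}: {value} %")
```

The recomputed values (86, 71, 55, 100 and 74 %) agree with those given in the text.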

Although less probable, another explanation is that the loss or inactivation of Prp function among certain lineages might enhance survival in particular niches, helping to explain the apparent fitness of Shigella for human pathogenesis, as evidenced by its low infective dose. Due to the extensive cross-talk between multiple genes involved in environmental stress responses, it might be advantageous for Shigella to harbour functionally inert prp alleles. Such a hypothesized condition would have the effect of biochemically ‘streamlining’ stress-response pathways in Shigella. Rezwan et al. (2004), in explaining the high prevalence of the indole-negative phenotype among expansive clusters of Shigella strains, postulated a selective advantage for strains that have lost the ability to produce indole. Precedents for a selective advantage through gene loss are the ‘black holes’ of Shigella, defined as deletions of genes whose products are detrimental to bacterial pathogenicity, the deletions thus having the effect of increasing virulence (Maurelli et al., 1998). In this well-characterized example, loss of lysine decarboxylase by deletion of the cadA gene in Flexneri 2a was accompanied by an extensive chromosomal deletion of ~90 kb (Maurelli et al., 1998). The deletion surrounding prpA in Dysenteriae type 1 strains, albeit much smaller, might be of similar origin. It is intriguing that IS insertions might be an intermediate on the way to deletion formation, as Shigella has been particularly prone to gene inactivation by IS elements (Nyman et al., 1981; Rezwan et al., 2004). However, it is important to note that effects of prp gene loss on stress response and virulence remain to be determined for Shigella strains.

Phylogenetic analyses revealed similarities in prp allele structure and distribution among certain Shigella and E. coli strains. Previous MLST (multilocus sequence typing) analysis of strains from these two organisms revealed a clustering of Dysenteriae type 1 with E. coli group E (Pupo et al., 2000). The findings reported here also suggest that this serotype is closer to diarrhoeagenic E. coli lineages, away from commensal or extraintestinal groupings (i.e. groups A, B1 and B2). Like Dysenteriae type 1, ECOR strains 38 and 41 (group D) retain intact and IS1-ablated prpA and prpB alleles, respectively, and a mutS–rpoS intergenic region identical in size (~4·6 kb) to this Shigella serotype, unlike most E. coli lineages (LeClerc et al., 1999). These data support a model that a single event, common to both Shigella and E. coli, may have forged genomic architecture in this region of the chromosome and suggest that strains in this lineage were an ancestral reservoir from which the Dysenteriae 1 serotype emerged.

The application of phylogenetic mapping techniques also yielded evidence for multiple insertions of several of the IS elements (IS1, IS4 and IS600) in the evolution of Shigella prp genes. The multiple gains and losses of insertion sequences that were uncovered by phylogenetic analysis support a model of IS elements moving laterally through the genomes of Shigella strains following their radiation out of E. coli. These findings reinforce previous results of Rezwan et al. (2004), who demonstrated a substantial role for IS-mediated inactivation of the tna (tryptophanase) operon leading to the indole-negative phenotype for many of the major Shigella strain lineages. Additionally supporting a conclusion of the independent horizontal acquisition of IS sequences is the finding that IS1 replaces prpB entirely in Dysenteriae type 1 but is inserted into a full-length prpB gene in Sonnei strains. Moreover, the insertion sites and repeat sequences flanking prpA IS600 are unique for these two Shigella subgroups, underscoring the independence of the events leading to these two insertions. It is notable that most examples of prp inactivation by IS elements appear to have occurred after the radiation of Shigella clones out of E. coli. As an example, the three Shigella lineages (S1–S3) originally described by Pupo et al. (2000) retain several ECOR group B1 and A strains (30, 33, 28 and 7) as their nearest phylogenetic neighbours, yet each of these ECOR strains retains unablated prpA and prpB sequences (Table 4). Additionally, the full-length genome of Flexneri revealed a chromosome peppered with the vestiges of IS elements including IS1, IS4, IS600 and IS629. Flexneri strain Sf301, for example, retained, on average, 18 times the number of IS1 and IS4 sequences found in the E. coli K-12 genome as well as numerous copies of IS600 and IS629; K-12 retains neither of these latter two elements (Jin et al., 2002). Schneider et al. (2000) emphasized the significant role that IS elements play in mediating chromosomal rearrangements, particularly deletions and inversions. Taken together, these observations point to numerous IS events forging the evolutionary structure of prp gene sequences among Shigella lineages as they emerged from their E. coli ancestors.

In summary, although Prp proteins are known to play a role in stress response in E. coli, the sporadic distribution and IS-element-based disruption of many of the prpA and prpB alleles from multiple Shigella subgroups indicate a more limited role in stress response for the two genes in these bacteria. It remains to be understood whether a difference in prp gene retention is a factor in the differences in environmental elasticity that exist between Shigella and E. coli strains. The lack of one or more functional prp alleles may have imposed an adaptive premium on strains of Shigella that require relatively stringent conditions for survival in the environment. An analysis of prp genes in enteroinvasive strains of E. coli (EIEC), largely considered an evolutionary intermediate between most E. coli and Shigella strains, may aid in better understanding the role of these stress-response genes in the limited survival conditions observed for Shigella bacteria. Detailed studies of specific stress-response genes such as prpA and prpB should assist in the development of preventative measures for curbing the staggering incidence of shigellosis worldwide.


   ACKNOWLEDGEMENTS
 
We acknowledge Drs Keith A. Lampel (FDA) and Nancy Strockbine (CDC) for their gracious gift of Shigella strains. We would also like to acknowledge Lauren Morgenroth for technical contributions to both the molecular and cladistic analyses.


   REFERENCES
 
Blattner, F. R., Plunkett, G., 3rd, Bloch, C. A. & 14 other authors (1997). The complete genome sequence of Escherichia coli K-12. Science 277, 1453–1474.

Boyd, E. F., Nelson, K., Wang, F. S., Whittam, T. S. & Selander, R. K. (1994). Molecular genetic basis of allelic polymorphism in malate dehydrogenase (mdh) in natural populations of Escherichia coli and Salmonella enterica. Proc Natl Acad Sci U S A 91, 1280–1284.

Brown, E. W., LeClerc, J. E., Li, B., Payne, W. L. & Cebula, T. A. (2001). Phylogenetic evidence for horizontal transfer of mutS alleles among naturally occurring Escherichia coli strains. J Bacteriol 183, 1631–1644.

Brown, E. W., Kotewicz, M. L. & Cebula, T. A. (2002). Detection of recombination among Salmonella enterica strains using the incongruence length difference test. Mol Phylogenet Evol 24, 102–120.

Cebula, T. A. (1995). Allele-specific polymerase chain reaction (PCR) in mutation analysis: the Salmonella typhimurium his paradigm. In Application of Molecular Biology in Environmental Chemistry, pp. 11–33. Edited by R. A. Minear, A. M. Ford, L. L. Needham & N. J. Karch. Boca Raton, FL: CRC Press.

Devereux, J., Haeberli, P. & Smithies, O. (1984). A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res 12, 387–395.

DuPont, H. L., Levine, M. M., Hornick, R. B. & Formal, S. B. (1989). Inoculum size in shigellosis and implications for expected mode of transmission. J Infect Dis 159, 1126–1128.

Escobar-Paramo, P., Giudicelli, C., Parsot, C. & Denamur, E. (2003). The evolutionary history of Shigella and enteroinvasive Escherichia coli revised. J Mol Evol 57, 140–148.

Ewing, W. H. (1949). Shigella nomenclature. J Bacteriol 57, 633–638.

Farris, J. S. (1983). The logical basis of phylogenetic analysis. In Proceedings of the 2nd Meeting of the Willi Hennig Society (Adv Cladistics 2, pp. 7–36). Edited by N. Platnick & V. Funk. New York: Columbia University Press.

Farris, J. S., Kallersjo, M., Kluge, A. G. & Bult, C. (1995). Testing significance of incongruence. Cladistics 10, 315–319.

Felsenstein, J. (1985). Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791.

Forey, P. L., Humphries, C. J., Kitching, I. L., Scotland, R. W., Siebert, D. J. & Williams, D. M. (1992). Cladistics: a Practical Course in Systematics. Oxford: Clarendon Press.

Herzer, P. J., Inouye, S., Inouye, M. & Whittam, T. S. (1990). Phylogenetic distribution of branched RNA-linked multicopy single-stranded DNA among natural isolates of Escherichia coli. J Bacteriol 172, 6175–6181.

Hyma, K. E., Lacher, D. W., Nelson, A. M., Bumbaugh, A. C., Janda, J. M., Strockbine, N. A., Young, V. B. & Whittam, T. S. (2005). Evolutionary genetics of a new pathogenic Escherichia species: Escherichia albertii and related Shigella boydii strains. J Bacteriol 187, 619–628.

Jin, Q., Yuan, Z., Xu, J. & 30 other authors (2002). Genome sequence of Shigella flexneri 2a: insights into pathogenicity through comparison with genomes of Escherichia coli K-12 and O157. Nucleic Acids Res 30, 4432–4441.

Kotloff, K. L., Winickoff, J. P., Ivanoff, B., Clemens, J. D., Swerdlow, D. L., Sansonetti, P. J., Adak, G. K. & Levine, M. M. (1999). Global burden of Shigella infections: implications for vaccine development and implementation of control strategies. Bull WHO 77, 651–666.

Lan, R. & Reeves, P. (2002). Escherichia coli in disguise: molecular origins of Shigella. Microb Infect 4, 1125–1132.

LeClerc, J. E., Li, B., Payne, W. L. & Cebula, T. A. (1999). Promiscuous origin of a chimeric sequence in the Escherichia coli O157 : H7 genome. J Bacteriol 181, 7614–7617.

Lecointre, G., Rachdi, L., Darlu, P. & Denamur, E. (1998). Escherichia coli molecular phylogeny using the incongruence length difference test. Mol Biol Evol 15, 1685–1695.

Li, B., Tsui, H.-C. T., LeClerc, J. E., Dey, M., Winkler, M. E. & Cebula, T. A. (2003). Molecular analysis of mutS expression and mutation in natural isolates of pathogenic Escherichia coli. Microbiology 149, 1323–1331.

Lin, J., Lee, I. S., Slonczewski, J. L. & Foster, J. W. (1995). Comparative analysis of extreme acid survival in Salmonella typhimurium, Shigella flexneri, and Escherichia coli. J Bacteriol 177, 4097–4104.

Missiakas, D. & Raina, S. (1997). Signal transduction pathways in response to protein misfolding in the extracytoplasmic compartments of E. coli: role of two new phosphoprotein phosphatases PrpA and PrpB. EMBO J 16, 1670–1685.

Maurelli, A. T., Fernandez, R. E., Bloch, C. A., Rode, C. K. & Fasano, A. (1998). "Black holes" and bacterial pathogenicity: a large genomic deletion that enhances the virulence of Shigella spp. and enteroinvasive Escherichia coli. Proc Natl Acad Sci U S A 95, 3943–3948.

Nyman, K., Nakamura, K., Ohtsubo, H. & Ohtsubo, E. (1981). Distribution of the insertion sequence IS1 in gram-negative bacteria. Nature 289, 609–612.

Ochman, H. & Selander, R. K. (1984). Standard reference strains of Escherichia coli from natural populations. J Bacteriol 157, 690–693.

Perna, N. T., Plunkett, G., 3rd, Burland, V. & 24 other authors (2001). Genome sequence of enterohaemorrhagic Escherichia coli O157 : H7. Nature 409, 529–533.

Pupo, G. M., Karaolis, D. K. R., Lan, R. & Reeves, P. R. (1997). Evolutionary relationships among pathogenic and nonpathogenic Escherichia coli strains inferred from multilocus enzyme electrophoresis and mdh sequence studies. Infect Immun 65, 2685–2692.

Pupo, G. M., Lan, R. & Reeves, P. R. (2000). Multiple independent origins of Shigella clones of Escherichia coli and convergent evolution of many of their characteristics. Proc Natl Acad Sci U S A 97, 10567–10572.

Rezwan, F., Lan, R. & Reeves, P. R. (2004). Molecular basis of the indole-negative reaction in Shigella strains: extensive damages to the tna operon by insertion sequences. J Bacteriol 186, 7460–7465.

Sagripanti, J. L., Eklund, C. A., Trost, P. A., Jinneman, K. C., Abeyta, C., Jr, Kaysner, C. A. & Hill, W. E. (1997). Comparative sensitivity of 13 species of pathogenic bacteria to seven chemical germicides. Am J Infect Control 25, 335–339.

Schneider, D., Duperchy, E., Coursange, E., Lenski, R. E. & Blot, M. (2000). Long-term experimental evolution in Escherichia coli. IX. Characterization of insertion sequence-mediated mutation and rearrangements. Genetics 156, 477–488.

Streisinger, G., Okada, Y., Emrich, J., Newton, J., Tsugita, A., Terzaghi, E. & Inouye, M. (1966). Frameshift mutations and the genetic code. Cold Spring Harbor Symp Quant Biol 31, 77–84.

Swofford, D. L. (1999). Phylogenetic analysis using parsimony (PAUP* v. 4.03b) program and documentation. Washington, DC: Smithsonian Institution.

Taormina, P. J., Niemira, B. A. & Beuchat, L. R. (2001). Inhibitory activity of honey against foodborne pathogens as influenced by the presence of hydrogen peroxide and level of antioxidant power. Int J Food Microbiol 69, 217–225.

Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F. & Higgins, D. G. (1997). The CLUSTAL X Windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25, 4876–4882.

Watarai, M., Tobe, T., Yoshikawa, M. & Sasakawa, C. (1995). Contact of Shigella with host cells triggers release of Ipa invasins and is an essential function of invasiveness. EMBO J 14, 2461–2470.

Wei, J., Goldberg, M. B., Burland, V. & 14 other authors (2003). Complete genome sequence and comparative genomics of Shigella flexneri 2a strain 2457T. Infect Immun 71, 2775–2786.

Received 23 February 2005; revised 26 April 2005; accepted 13 May 2005.



Copyright © 2005 Society for General Microbiology.