(Received for publication, January 27, 1997, and in revised form, April 3, 1997)
From the Departments of Microbiology and Immunology, Anatomy and Cell Biology, and ** Biochemistry and Molecular Biology, University of North Texas Health Science Center at Fort Worth, Fort Worth, Texas 76107-2699, the ¶ Department of Microbiology and Cell Science, University of Florida, Gainesville, Florida 32611-0100, and the Department of Biochemistry and Biophysics, Texas A & M University, College Station, Texas 77843-2128
The RNA-binding protein CsrA (carbon storage regulator) is a new kind of global regulator, which facilitates specific mRNA decay. A recombinant CsrA protein containing a metal-binding affinity tag (CsrA-H6) was purified to homogeneity and authenticated by N-terminal sequencing, matrix-assisted laser desorption/ionization time of flight mass spectrometry, and other studies. This protein was entirely contained within a globular complex of approximately 18 CsrA-H6 subunits and a single ~350-nucleotide RNA, CsrB. cDNA cloning and nucleotide sequencing revealed that the csrB gene is located downstream from syd in the 64-min region of the Escherichia coli K-12 genome and contains no open reading frames. The purified CsrA-CsrB ribonucleoprotein complex was active in regulating glg (glycogen biosynthesis) gene expression in vitro, as was the RNA-free form of the CsrA protein. Overexpression of csrB enhanced glycogen accumulation in E. coli, a stationary phase process that is repressed by CsrA. Thus, CsrB RNA is a second component of the Csr system, which binds to CsrA and antagonizes its effects on gene expression. A model for regulatory interactions in Csr is presented, which also explains previous observations on the homologous system in Erwinia carotovora. A highly repeated nucleotide sequence located within predicted stem-loops and other single-stranded regions of CsrB, CAGGA(U/A/C)G, is a plausible CsrA-binding element.
During studies on the genetic regulation of glycogen biosynthesis in Escherichia coli, a metabolic pathway that is induced as cultures enter the stationary phase of growth, we identified a gene encoding a factor that effects potent negative regulation of glycogen biosynthesis, csrA or "carbon storage regulator" (1, 2). Subsequent studies have shown that csrA is a global regulatory gene that controls numerous genes and enzymes of carbohydrate metabolism. In E. coli K-12, it acts as a negative regulator of glycogen biosynthesis, gluconeogenesis, and glycogen catabolism and as a positive factor for glycolysis, and it affects cell surface properties (1, 3-5). Recent experiments of Chatterjee et al. (6) and Cui et al. (7) document a key regulatory role for the csrA homolog of the pathogenic Erwinia species, rsmA (repressor of stationary phase metabolites), in the expression of several virulence factors of soft rot disease of higher plants, including pectinase, cellulase, and protease activities, and further suggest that it may modulate the production of the quorum-sensing metabolite N-(3-oxohexanoyl)-L-homoserine lactone. Homologs of this metabolite are secreted by numerous Gram-negative bacteria, in which they serve to activate the expression of a wide variety of genes in response to cell density (reviewed in Refs. 8 and 9). Widespread phylogenetic distribution of csrA homologs among eubacteria points to a broad significance and ancient origin for this regulatory system in this group of organisms (10, 11).
Studies conducted in vivo have indicated that the CsrA gene product facilitates the decay of glgCAP mRNA, which encodes two enzymes required for glycogen biosynthesis and the enzyme glycogen phosphorylase (3, 5). The CsrA gene product is a 61-amino acid protein whose deduced amino acid sequence contains a conserved RNA-binding motif, KH (3). A cis-acting region located close to or overlapping the glgC ribosome binding site mediates its regulatory effects (Ref. 3, and RNA mobility shift experiments).1 To further elucidate the mechanism of CsrA, we have now expressed and purified a recombinant CsrA protein, CsrA-H6.2 Unexpectedly, this protein was found to be noncovalently bound to an RNA molecule, CsrB, in a large multisubunit complex. The characterization of purified recombinant CsrA protein, the CsrA-CsrB ribonucleoprotein complex, and the csrB gene are described herein.
Experimental evidence supports a model in which CsrB RNA competes with cellular mRNAs for binding to CsrA and thereby antagonizes its activity. This hypothesis also provides the mechanistic framework for understanding observations made on the homologous system of Erwinia carotovora. In this species, the apparent csrB homolog was previously called aepH, because it somehow activates the expression of extracellular proteins (12), which in fact are the same proteins that are repressed via rsmA (7). However, rsmA and aepH are not linked in the genome and have not been demonstrated to function together in a regulatory system. Evidence that the regulatory factor encoded by the aepH region is the CsrB RNA homolog of E. carotovora will be discussed.
Agarose Ni-NTA affinity resin was purchased from Qiagen (Chatsworth, CA). The enzymes for DNA manipulation were from the sources previously described (3, 13). Sheep anti-rabbit horseradish peroxidase, prestained and unstained protein standards for SDS-PAGE, polyvinylidene difluoride membranes, and nitrocellulose membranes were purchased from Bio-Rad. DNA and RNA size markers and T4 polynucleotide kinase were purchased from Life Technologies, Inc. RNase A, high quality imidazole, and native protein molecular weight markers were purchased from Sigma. The [γ-32P]ATP, [α-32P]dATP, [α-35S]dATP, and [35S]methionine were purchased from DuPont NEN. Oligonucleotides were synthesized and purified by BioSynthesis, Inc. (Lewisville, TX).
The strains used in this study include BW3414 (ΔlacU169), TR1-5BW3414 (ΔlacU169 csrA::kanR), TR1-5BW3414[pCSR10] (csrA-overexpressing strain) (1), TR1-5BW[pCSRH6-19] (expresses the recombinant protein CsrA-H6) (this study), DH5α (supE44 ΔlacU169 (φ80lacZΔM15) hsdR17 recA1 endA1 gyrA96 thi-1 relA1) (14), and LE392 (hsdR514 (rK- mK-) supE44 supF58 lacY galK2 galT22 metB1 trpR55 λ-) (from D. Daniels and F. R. Blattner, University of Wisconsin, Madison). Plasmids include pUC18, pUC19 (15), pCSR10 (csrA cloned into pUC19) (1), pKK223-3 (16), and pCSRB-SR and pCSRB-SF, in which csrB was cloned into pUC18 or pUC19, respectively, such that csrB was either in the same (SF) or in the opposite orientation (SR) with respect to the lac promoter (this study). Plasmids pAKC671 and pAKC672 encode the E. carotovora aepH* and aepH genes (12). Bacteriophage DD628, encoding DNA from the 63-64-min region of the E. coli K-12 MG1655 genome, was obtained from F. R. Blattner (University of Wisconsin).
Standard procedures were used for isolation of plasmid DNA and restriction fragments, restriction mapping, transformation, and molecular cloning, as described previously (13) and for PCR amplification (17). Alternatively, plasmid DNA was purified using Qiagen plasmid mini-DNA cartridges. Dideoxynucleotide sequencing (18) was performed using the SequenaseTM version 2.0 kit under the conditions described by the manufacturer (U.S. Biochemical Corp.). Data base searches were performed using BLAST analysis (19) via the Internet services at the National Center for Biotechnology Information.
Preparation of a Plasmid for Expression of CsrA-H6
A PCR product containing the csrA coding region and six in-frame histidine codons followed by a termination codon was prepared using HindIII-treated pCSR10 (1) and the oligonucleotide primers ATGCTGATTCTGACTCGTCG and TTAATGATGATGATGATGATGGTAACTGGACTGCTGGGAT and was treated with polynucleotide kinase and T4 DNA polymerase and ligated into the dephosphorylated SmaI site of the expression vector pKK223-3. The ligation mixture was used to transform DH5α to ampicillin resistance. A resulting plasmid clone, pCSRH6-19, complemented the csrA::kanR mutation of TR1-5BW3414, and its csrA coding region was free of any PCR-generated mutations, as determined by DNA sequence analysis.
cDNA was prepared by treating CsrB RNA (5 µg) with poly(A) polymerase (4 units) in a reaction containing 250 µM ATP, 40 mM Tris-HCl, pH 8.0, 10 mM MgCl2, 2.5 mM MnCl2, 250 mM NaCl, 1 mM dithiothreitol, followed by cDNA synthesis using the Riboclone cDNA Synthesis Systems avian myeloblastosis virus reverse transcriptase (Promega, Madison, WI) according to the manufacturer's specifications. cDNA was made blunt-ended using T4 DNA polymerase and was cloned into the SmaI site of pUC19 using DH5α as the host strain for transformation. Approximately 200 clones were saved, 14 of which were at least partially sequenced and mapped on the E. coli genome using data base searches.
The csrB gene was amplified from DD628 DNA by PCR using the oligonucleotide primers GTAAGCGCCTTGTAAGACTTC and CTGGAGACGAACGCGGTCATG, and the PCR product was treated with T4 DNA polymerase and polynucleotide kinase and cloned into the SmaI site of pUC18 to yield pCSRB-SR. Subsequently, pCSRB-SR was treated with EcoRI and BamHI, and the insert DNA was subcloned into pUC19 to yield the plasmid clone pCSRB-SF, which was later sequenced.
For large scale preparation of protein, 3 liters of an overnight culture of TR1-5BW3414[pCSRH6-19] was used to inoculate a 500-liter fermenter containing 350 liters of LB medium (20) with ampicillin (100 µg/ml). The cells were grown aerobically at 37 °C, harvested at late exponential phase, resuspended in 7 liters of binding buffer (50 mM sodium phosphate, pH 8.0, 500 mM NaCl), lysed in a French pressure cell, and kept at −80 °C until use (conducted at the Fermentation Facility, University of Alabama, Birmingham). The cell lysate (1 liter) was thawed at room temperature, imidazole was added to 20 mM, and the mixture was centrifuged at 10,000 × g at 4 °C for 30 min. The clear supernatant solution was mixed with 2 ml of prewashed Ni-NTA resin in binding buffer. After stirring on ice for 2 h, the batch was poured into a glass column and washed first with 40 volumes of binding buffer until the absorbance at 280 nm was less than 0.01 and then with 40 volumes of wash buffer (50 mM sodium phosphate, pH 6.0, 500 mM NaCl, 10% glycerol, and 20 mM imidazole) until the absorbance was less than 0.01. The CsrA-H6 protein was eluted with a 50-ml linear gradient of imidazole (20 mM to 500 mM) in wash buffer followed by 12 ml of 1 M imidazole in wash buffer. Column fractions containing protein were analyzed by SDS-PAGE (15% gel), and the resolved polypeptides were detected with Coomassie Blue staining or by Western immunoblot analysis. The fractions containing electrophoretically pure protein were pooled, concentrated, dialyzed against 10 mM Tris-OAc, pH 8.0, and assayed for total protein (21).
The CsrA-CsrB complex was subjected to nondenaturing electrophoresis on 7.5% polyacrylamide gels (22) and was detected by staining protein with Coomassie Blue or nucleic acid with acridine orange, which renders single-stranded RNA red and double-stranded DNA green (23). Denaturing electrophoresis was carried out using SDS-polyacrylamide slab gels (24) to separate polypeptides that had been denatured with SDS and β-mercaptoethanol according to a method specifically suggested for the analysis of His-tagged proteins (Qiagen).
In Western blotting of SDS or nondenaturing gels, proteins were transferred onto a 0.2-µm nitrocellulose membrane in transfer buffer (25 mM Tris-HCl, 192 mM glycine, pH 8.3, 20% methanol; Ref. 25). The membranes were blocked in 0.3% gelatin and incubated for 4 h with 2,000-fold diluted rabbit anti-CsrA peptide (Lys-Glu-Val-Ser-Val-His-Arg-Glu-Glu-Ile-Tyr; residues 38-48; prepared by Research Genetics, Huntsville, AL). After washing three times, the membranes were incubated with sheep anti-rabbit horseradish peroxidase conjugate and developed with 4-chloro-1-naphthol.
N-terminal Amino Acid Sequencing and Amino Acid Composition Analysis
The purified CsrA-H6 complex (1 mg/ml protein in 10 mM Tris-OAc) was loaded onto a trifluoroacetic acid-activated glass fiber filter, and its amino-terminal sequence was determined by automated Edman degradation using an ABI 475A gas phase instrument (26). The reaction cartridge temperature was 50 °C.
For amino acid composition analysis, 2 µg of concentrated affinity-purified CsrA-CsrB was subjected to electrophoresis on 15% SDS-PAGE and blotted onto polyvinylidene difluoride membrane. The membrane was rinsed with water, stained with 0.025% Coomassie Blue and 40% methanol, and destained in 50% methanol. The protein was excised, hydrolyzed in 6 N HCl for 24 or 48 h, and analyzed on an ABI model 420 automated amino acid analyzer (Foster City, CA) (performed by BioSynthesis, Inc., Lewisville, TX).
MALDI TOF Mass Spectrometry
Mass spectra were acquired using a Vestec LaserTec Research linear instrument (Perceptive Biosystems, Houston, TX), employing a nitrogen laser (337 nm), a 1.2-m flight tube, and an accelerating voltage of 10 kV (27). The mass axis was set by calibration with insulin. Samples were diluted into 0.1% trifluoroacetic acid and mixed on the target with an equal volume of MALDI matrix consisting of a saturated solution of either 3,5-dimethoxy-4-hydroxycinnamic (sinapinic) acid or α-cyano-4-hydroxycinnamic acid dissolved in 40% acetonitrile and 0.1% trifluoroacetic acid. Data from 25-75 spectra were averaged and assigned using the GRAMS-LaserTec program.
Affinity-purified CsrA-CsrB (4 µg of protein) and standard proteins (α-lactalbumin, 14.2 kDa; carbonic anhydrase, 29 kDa; bovine serum albumin monomer, 66 kDa; bovine serum albumin dimer, 132 kDa; urease trimer, 272 kDa; and urease hexamer, 545 kDa) were analyzed by electrophoresis on a series of nondenaturing gels containing 5, 5.5, 6, 7, 8, and 9% acrylamide, and their resulting Rf values were calculated as the migration distance relative to that of the dye bromphenol blue. The values of 100(log(Rf × 100)) for each of the proteins were plotted versus gel concentrations as percentages, and the logarithm of the negative slope of each curve was then plotted against the logarithm of the known molecular mass of each protein to generate a curve from which the molecular mass of CsrA-CsrB was estimated (28).
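The two-stage plotting procedure described above can be sketched in Python as follows. This is a minimal illustration only: the function names are hypothetical, and because no Rf measurements are reported in the text, the Rf values are synthetic numbers generated from an assumed power law purely to exercise the calculation.

```python
import math

GEL_PERCENTS = [5.0, 5.5, 6.0, 7.0, 8.0, 9.0]            # acrylamide concentrations from the text
STANDARD_MASSES_KDA = [14.2, 29.0, 66.0, 132.0, 272.0, 545.0]  # standard proteins from the text


def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def retardation_coefficient(gel_percents, rf_values):
    """Negative slope of 100*log10(Rf*100) plotted against gel percentage."""
    ys = [100.0 * math.log10(rf * 100.0) for rf in rf_values]
    slope, _ = linear_fit(gel_percents, ys)
    return -slope


def estimate_mass_kda(standard_masses, standard_coefficients, unknown_coefficient):
    """Read a mass off the log(negative slope) versus log(mass) standard curve."""
    log_m = [math.log10(m) for m in standard_masses]
    log_k = [math.log10(k) for k in standard_coefficients]
    slope, intercept = linear_fit(log_m, log_k)
    return 10.0 ** ((math.log10(unknown_coefficient) - intercept) / slope)


def synthetic_rf(mass_kda, gel_percents):
    """Hypothetical Rf values (assumed power law); NOT data from the paper."""
    k = 0.5 * math.sqrt(mass_kda)
    return [10.0 ** ((200.0 - k * t) / 100.0) / 100.0 for t in gel_percents]


standard_ks = [
    retardation_coefficient(GEL_PERCENTS, synthetic_rf(m, GEL_PERCENTS))
    for m in STANDARD_MASSES_KDA
]
# Pretend the unknown behaves like a 256-kDa particle and check it is recovered.
unknown_k = retardation_coefficient(GEL_PERCENTS, synthetic_rf(256.0, GEL_PERCENTS))
estimated_mass = estimate_mass_kda(STANDARD_MASSES_KDA, standard_ks, unknown_k)
```

With real measurements, the synthetic_rf helper would be replaced by the measured Rf values for each standard and for the complex.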
RNA was purified from the CsrA-CsrB complex by extraction once with phenol/chloroform (1:1) and once with chloroform and was precipitated with ethanol. RNA was quantified by absorbance at 260 nm or by an orcinol assay (29). For molecular mass estimation, purified RNA was subjected to denaturing electrophoresis on 1.2% agarose gels containing 2.2 M formaldehyde (17).
Secondary Structure Predictions for CsrB RNA
The entire nucleotide sequence was folded with the program STAR 4.1, developed by F. H. D. van Batenburg, running on a Macintosh PowerMac 7500 platform. STAR arrives at a secondary structural prediction using one of three algorithms (sequential, stochastic, or genetic folding algorithm) by simulating the folding pathway of an RNA from the 5′-end of the molecule rather than by necessarily calculating the most stable structure predicted by the RNA sequence. The results presented here were obtained with a stochastic algorithm (30) using the default nearest neighbor energy rules, an increment value of 25 nucleotides, and a population size of 10. Structure predictions obtained with STAR were rendered by LoopDloop (obtained from D. Gilbert, Indiana University). Secondary structural models generated by the sequential folding algorithm (31) were substantially similar to those obtained with the stochastic algorithm. Nucleotides 1-217 and 295-360 were folded identically by both algorithms; nucleotides 218-294 were folded into four stem structures, which were distinct in each case. Models generated with a genetic algorithm (32) were also similar to the stochastic model, except that two sets of long range interactions between single-stranded regions involving nucleotides 64-66 base pairing with 315-317 and 118-121 with 270-273 were predicted. In the absence of other phylogenetic or chemical modification data, these three models cannot be distinguished from one another.
Samples of native complex or complex pretreated with 10 mM EDTA were diluted to 5 µg/ml with 10 mM Tris-OAc, pH 8.0, and 2% glutaraldehyde. Glow-discharged Formvar-coated grids were touched to 5-µl drops of fixed sample, excess fluid was removed, and the grids were floated on 2% uranyl acetate for 2 min. Grids were then blotted with filter paper, air-dried, and examined on a Zeiss EM910 instrument (33). Images were recorded either on Kodak so-163 or digitally with a Dage-MTI Model 72 CCD camera connected to a Scion LG-3 frame grabber board in an Apple Power Macintosh 7200/90 computer. Digital images were used to measure the diameter of the complexes using NIH Image and were analyzed statistically with Abacus Statview.
Treatment of the CsrA-CsrB Complex with Ribonucleases
RNA-free CsrA protein was prepared by treating CsrA-CsrB (0.6 mg of protein) with DNase-free RNase A (10 µg/ml) in 5 mM EDTA and 10 mM Tris acetate, pH 8.0, at 37 °C for 30 min. After dialyzing at 4 °C overnight against two changes of binding buffer (50 mM sodium phosphate, pH 8.0, 500 mM NaCl, and 20 mM imidazole), the protein was separated from the reaction mixture by binding to a Ni-NTA spin column (Qiagen) and elution with 1 M imidazole in binding buffer. The eluate was dialyzed overnight against two changes of 10 mM Tris-OAc, pH 8.0, and concentrated to 0.4 mg/ml in an Amicon stirred cell (Amicon Inc, Beverly, MA). Alternatively, RNA of the CsrA-CsrB was hydrolyzed with the Ca2+-dependent enzyme micrococcal nuclease, which was then inactivated using EGTA (34).
S-30-coupled Transcription-Translation
Effects of the CsrA gene product on the expression of glycogen biosynthesis genes encoded by plasmid pOP12 (glgB, glgC, and glgA) were examined in transcription-translation reactions using an S-30 extract prepared from strain TR1-5BW3414 (csrA::kanR) and containing 100 µM cAMP and 1 µg of cAMP receptor protein. The methodology has been previously described in detail (3, 13).
Glycogen Staining
Effects of the E. coli csrA and csrB genotypes on glycogen levels were observed by staining colonies with iodine vapor (3).
Based upon previous experiments, which showed that the carboxyl terminus of CsrA could be modified without destroying its in vivo biological activity (1), a plasmid was constructed to allow the expression of recombinant CsrA containing 6 histidine residues at the carboxyl terminus, permitting its purification via metal-binding affinity chromatography (35). This plasmid, pCSRH6-19, complemented the csrA::kanR mutation (data not shown), indicating that it encodes a biologically active CsrA gene product.
Preliminary small scale experiments showed that the CsrA-H6 protein was expressed by TR1-5BW3414[pCSRH6-19] and could be purified in modest yield (0.1 mg/liter of batch culture) by a single step of affinity chromatography on Ni-NTA agarose (data not shown). The CsrA-H6 protein preparation used for all of the experiments described here was prepared in large scale from cells grown to late exponential phase in a 500-liter fermenter (see "Materials and Methods"); its electrophoretic properties were indistinguishable from those of smaller scale preparations. The fractions that were pooled and saved from the affinity matrix contained a single detectable ~7-kDa band on SDS-PAGE using Coomassie Blue staining or Western blot analysis (Fig. 1, A and B). The yield of CsrA protein from the large scale preparation was approximately 3 mg/35 liters of culture.
Authentication of Purified Recombinant CsrA Protein
To authenticate the CsrA-H6 polypeptide of the CsrA-CsrB complex and to determine whether it may be covalently modified, N-terminal sequencing, amino acid composition analysis, and MALDI TOF mass spectrometry were performed. Fifteen cycles of automated Edman degradation yielded the sequence Met-Leu-Ile-Leu-Thr-Arg-Arg-Val-Gly-Glu(X)-Thr-Leu-Met-Ile-Gly, identical to the deduced amino acid sequence of CsrA (1), except for an unexplained minor peak (X) observed in addition to the expected glutamic acid residue at cycle 10. Amino acid analysis was also consistent with the deduced composition of the CsrA protein, with two exceptions: Gly was almost 4-fold higher than expected, apparently because of carryover from the transfer buffer, and His was somewhat lower than expected, 4.6 versus 7 residues (data not shown). MALDI TOF mass spectrometry revealed a molecular mass of 7677.7, differing by less than 3 Da from the predicted value of CsrA-H6 (data not shown), indicating that the polypeptide was not covalently modified, except for the deformylation of the N-terminal methionine residue.
CsrA-H6 Protein Is Bound to RNA in a Discrete Multisubunit Complex
The ultraviolet absorbance spectrum of purified CsrA-H6 was found to be typical of nucleic acid instead of protein, i.e. the ratio of absorbance at 260 nm to that at 280 nm was 2.0 (data not shown). In view of the fact that CsrA modulates glgC mRNA stability in vivo (3) and its deduced amino acid sequence contains a putative RNA-binding domain, this nucleic acid was suspected to be RNA. This was confirmed by several approaches. The ribose content of the preparation was quantified by an orcinol assay (29), and the mass ratio of protein to RNA was determined to be 1.27. Substantial orcinol reactivity relative to absorbance at 260 nm excluded DNA, which is not reactive with orcinol, as a major constituent of the preparation (data not shown). Analysis of the CsrA-H6 preparation by 7.5% native PAGE revealed a single major product that was detected by Coomassie Blue staining, Western blot analysis, and acridine orange staining (Fig. 1C, lanes 1-3, respectively), indicative of a discrete ribonucleoprotein complex. A ladder of faster running minor components extending downward from the major component was observed with the most sensitive treatment, Western blotting. The complex yielded red fluorescence with acridine orange treatment, a characteristic of single-stranded RNA, in contrast to DNA, which fluoresces green (23). Fig. 1D shows that phenol-purified RNA from the complex consisted of a major RNA band of approximately 350 nucleotides. The native complex was estimated to contain approximately 20 subunits of CsrA-H6 per single-stranded RNA, based upon an approximate size of 350 nucleotides for the RNA, 7.6 kDa for the protein subunit, and a protein:RNA mass ratio of 1.27. Attempts to separate the CsrA protein from the RNA under nondenaturing conditions, e.g. strong anion exchange chromatography on Mono-Q FPLC, were unsuccessful, suggestive of strong noncovalent interactions within the complex.
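The estimate of approximately 20 subunits per RNA follows from simple arithmetic, sketched here in Python. The RNA mass of 121 kDa is the value quoted elsewhere in this article for the ~350-nucleotide CsrB RNA; the function name is illustrative only, and the calculation assumes a single RNA molecule per complex.

```python
def subunits_from_mass_ratio(protein_to_rna_ratio, rna_mass_kda, subunit_mass_kda):
    """Estimate protein subunits per RNA molecule from a protein:RNA mass ratio."""
    protein_mass_kda = protein_to_rna_ratio * rna_mass_kda  # total protein mass per RNA
    return protein_mass_kda / subunit_mass_kda


# Values from the text: ratio 1.27, RNA ~121 kDa, CsrA-H6 subunit ~7.6 kDa.
subunits_by_ratio = subunits_from_mass_ratio(1.27, 121.0, 7.6)
print(f"{subunits_by_ratio:.1f} CsrA-H6 subunits per CsrB RNA")
```

The result rounds to 20, matching the rough estimate given in the text.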
Negative Staining Electron Microscopy of Native CsrA-CsrB Complexes
Transmission electron microscopy of negatively stained ribonucleoprotein complexes revealed them to be globular in shape and 7.89 ± 0.75 or 7.46 ± 0.12 nm (mean ± S.D.) in diameter, depending upon whether they were treated with EDTA or not treated, respectively (Fig. 2). Size variability of complexes within each preparation was attributed to some RNA decay during handling.
Molecular Mass Determination of CsrA-CsrB by Native PAGE
Nondenaturing polyacrylamide gel electrophoresis of multimeric proteins on a series of gels with increasing acrylamide concentration is a sensitive and reliable means for estimation of molecular mass, irrespective of charge (28). Since CsrA-CsrB was globular upon negative staining electron microscopy, its size should also be accurately estimated by this approach. By this method, the molecular mass of the CsrA-CsrB complex was estimated to be 256 kDa. Subtraction of the RNA mass (121 kDa) from this value indicated that there are approximately 18 CsrA subunits/complex, in good agreement with the rough estimation of 20 CsrA subunits/RNA determined by the mass ratio of protein to RNA. This size also corresponds well to the larger structures observed by electron microscopy. Attempts to estimate the size of the complex by sedimentation equilibrium centrifugation were thwarted by precipitation of the complex under the high salt conditions required for the analysis.
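The subtraction behind the estimate of approximately 18 subunits can be written out as a short Python sketch. The 7.6-kDa subunit mass is the approximate value used elsewhere in this article; the function name is hypothetical, and one RNA molecule per complex is assumed.

```python
def subunits_from_complex_mass(complex_mass_kda, rna_mass_kda, subunit_mass_kda):
    """Estimate subunits per complex after subtracting the RNA mass."""
    protein_mass_kda = complex_mass_kda - rna_mass_kda  # mass attributable to protein
    return protein_mass_kda / subunit_mass_kda


# Values from the text: complex 256 kDa, RNA 121 kDa, subunit ~7.6 kDa.
subunits_by_page = subunits_from_complex_mass(256.0, 121.0, 7.6)
print(f"{subunits_by_page:.1f} CsrA-H6 subunits per complex")
```

The result rounds to 18.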
Purified Recombinant CsrA Is Biologically Active in the Absence of the RNA Component
To determine whether the affinity-purified CsrA-CsrB was biologically active, its effects on coupled transcription-translation of glycogen biosynthesis genes were tested in S-30 extracts prepared from TR1-5BW3414 (csrA::kanR), using pOP12 DNA as the genetic template (3, 13). Preliminary experiments established an appropriate concentration range for testing the CsrA-CsrB (data not shown). Fig. 3 shows that several pOP12-encoded genes were expressed, including the glycogen biosynthetic genes glgB, glgC, and glgA, as well as asd, which encodes an enzyme not involved in glycogen biosynthesis, aspartate semialdehyde dehydrogenase (lanes 1 and 12). The addition of CsrA-CsrB (0.5 µg of protein/35-µl reaction) specifically and potently inhibited the expression of the glg genes (lane 2), and in higher concentrations (2.0 µg/reaction; lane 3) their expression was almost undetectable. Decoupling of transcription and translation in these reactions using rifampin showed that this specific inhibition of the glg genes occurs posttranscriptionally,1 in agreement with the conclusion that CsrA facilitates the decay of glg mRNA in vivo (3).
The requirement of the RNA component of the CsrA-CsrB complex for biological activity was tested by treatment of the complex with micrococcal nuclease followed by EGTA inactivation of this enzyme or by RNase A treatment followed by affinity repurification of the CsrA protein (see "Materials and Methods"). As shown by native PAGE (Fig. 3, B and C), micrococcal nuclease treatments yielded faster migrating CsrA-CsrB complexes, indicative of RNA degradation, while the CsrA-H6 protein was not altered, as shown by SDS-PAGE (panel D). RNase A treatment followed by affinity purification yielded CsrA-H6 protein essentially free of RNA. On native PAGE, the RNA-free protein migrated slower than the original CsrB complexes and was observed near the top of the stacking and running gels, consistent with the fact that this protein is somewhat basic, and suggestive of possible protein aggregation (panels B and C). No acridine orange staining was observed in this region of the native gels. The mobility of the RNA-free protein on SDS-PAGE was unaltered (panel D). The protein remained fully active in specific genetic regulation under all treatment conditions. In fact, micrococcal nuclease-treated or RNA-free CsrA-H6 preparations were somewhat more inhibitory to glg gene expression than was the CsrA-CsrB complex. Clearly, the RNA component of the CsrA-CsrB complex was not required for its biological activity in vitro.
cDNA Cloning and Sequencing of CsrB RNA
Molecular characterization of the purified RNA component of the CsrA-CsrB complex was accomplished by preparing cDNA clones from it, determining the nucleotide sequence from 14 different cDNA clones, and searching data bases for homologous genes. This approach revealed that the RNA was transcribed from a locus in the 64-min region of the E. coli K-12 genome, which had been previously sequenced3 but had not been otherwise studied. The gene encoding this RNA was designated as csrB (Fig. 4). The csrB gene is immediately downstream from and in the same orientation as syd, a recently described gene that encodes a protein that interacts with the SecY gene product, a component of the protein secretion apparatus (36). No significant open reading frames were found within the nucleotide sequence of csrB, indicating that CsrB does not function as a messenger RNA. At the 3′-end of csrB is a perfect 10-base pair stem and loop sequence followed immediately by UUUUUUUAUU, characteristic of a Rho-independent transcription terminator (37). No typical promoter sequence is present between csrB and the 3′-end of syd, the significance of which awaits transcription analyses and other studies of this gene.
A Highly Repeated Element Potentially Involved in Binding to CsrA
To identify the sites in CsrB that could interact with 18 CsrA protein subunits to form the observed CsrA-CsrB complex, we scanned the RNA sequence for highly repeated elements. Located within the CsrB sequence are numerous imperfect repeats of the consensus sequence 5′-CAGGA(U/C/A)G-3′ (Fig. 4). When the RNA secondary structure was predicted, these sequences were found predominantly in single-stranded regions of the molecule. Most strikingly, approximately half of the sequences were consistently localized to the loop regions of characteristic five-member hairpin loops distributed throughout the RNA, while others were consistently not in such regions (Fig. 5). These hairpin loops are closed by the 5′-C:G-3′ base pair of the consensus sequence, which itself forms the terminal base pair of the nonconserved short stem. The sequence 5′-CAGGAUG is the most prevalent variant of the loop motif. Excluding the apparent Rho-independent terminator, all except two of the predicted stem-loop structures of CsrB contain the conserved sequence elements.
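Scanning a sequence for the consensus element can be illustrated with a short Python sketch. This is not necessarily how the original scan was performed; the function names are hypothetical, the mismatch tolerance is an assumption intended to capture "imperfect" repeats, and the toy sequence is invented for demonstration rather than taken from CsrB.

```python
import re

# Exact consensus from the text: CAGGA(U/A/C)G. The lookahead reports overlapping hits.
CONSENSUS = re.compile(r"(?=(CAGGA[UAC]G))")
# Bases allowed at each of the seven consensus positions.
ALLOWED = ["C", "A", "G", "G", "A", "UAC", "G"]


def to_rna(sequence):
    """Upper-case a sequence and convert DNA T to RNA U."""
    return sequence.upper().replace("T", "U")


def find_consensus_sites(sequence):
    """Return (1-based position, matched heptamer) for exact consensus matches."""
    return [(m.start() + 1, m.group(1)) for m in CONSENSUS.finditer(to_rna(sequence))]


def find_imperfect_sites(sequence, max_mismatches=1):
    """Return (position, heptamer, mismatches) for windows within max_mismatches."""
    rna = to_rna(sequence)
    hits = []
    for i in range(len(rna) - len(ALLOWED) + 1):
        window = rna[i:i + len(ALLOWED)]
        mismatches = sum(base not in ok for base, ok in zip(window, ALLOWED))
        if mismatches <= max_mismatches:
            hits.append((i + 1, window, mismatches))
    return hits


toy_rna = "GUCAGGAUGAACAGGAAGCCCAGGACGUU"  # hypothetical sequence, not CsrB
exact_hits = find_consensus_sites(toy_rna)
near_hit = find_imperfect_sites("CAGGGUG")  # one mismatch at the fifth position
```

On the toy sequence the exact scan reports the three variants CAGGAUG, CAGGAAG, and CAGGACG at positions 3, 12, and 21.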
Data Base Searches for csrB Homologs: An Apparent csrB Homolog in E. carotovora Activates Genes That Are Repressed by the csrA Homolog, rsmA
Data base searches also identified a sequence homologous to csrB in the plant pathogen E. carotovora (Fig. 4). This region was previously found to activate the expression of the same extracellular virulence proteins that are repressed by rsmA, the csrA homolog of this species (7). A 141-nucleotide open reading frame initiating with the codon GTG was proposed to encode a protein responsible for the observed effects and was called aepH for "activator of extracellular proteins" (12). However, no evidence showed that aepH is actually translated, and this small reading frame itself is not conserved in E. coli. Rather, the E. coli and E. carotovora regions both contain the highly repeated consensus sequences. As in E. coli, the E. carotovora repeated sequences are also predicted to be predominantly found in single-stranded regions and in loops of stem and loop structures of the RNA encoded by this region (not shown). Murata and co-workers (12) had previously recognized these unusual repeated sequences and had found that they were significant for the function of this region, since mutations and deletions that were well upstream from the proposed aepH reading frame but among or immediately upstream from the repeated elements inactivated the gene.
Although csrA homologs are now known from a variety of eubacterial species, including Haemophilus influenzae, the entire genome of which has been sequenced (38), no csrB homolog was located in the H. influenzae sequence or elsewhere by nucleotide data base searches. This indicates that either other bacterial species lack csrB or, more likely, that the sequence of CsrB RNA is not highly conserved beyond the level of the bacterial family. The latter is a direct prediction of the hypothesis that the variably spaced repeated elements of CsrB RNA are essential for its function, and not a protein coding sequence or an overall RNA secondary structure as occurs in ribozymes.
Examination of the glgC message and mRNAs of other CsrA-regulated genes did not reveal the repeated stem-loop structures. Nevertheless, the Shine-Dalgarno sequence of glgC (AAGGAGU) is located within the cis-acting region that mediates CsrA regulation (3) and contains the central portion of the repeat within a predicted single-stranded region (data not shown).
Genomic Cloning and Overexpression of csrBUsing PCR, a
0.5-kilobase pair region containing an almost minimal csrB
gene was amplified and subsequently cloned. Fig. 6 shows
that the clones in which csrB can be expressed via the
lacZ promoter of the vector, e.g. pCSRB-SF,
strongly enhanced glycogen accumulation by E. coli, while
plasmids in which the csrB gene is oriented in the opposite
direction, e.g. pCSRB-SR, had a weak or negligible
stimulatory effect. These results demonstrated that multiple copies of
csrB in the cell do not simply bind to (or titrate out) a
DNA-binding protein inhibitor of glycogen synthesis but rather that
expression of csrB is required for its effects on glycogen.
The effect of pCSRB-SF on glycogen levels was not as strong as that of the
TR1-5 csrA::kanR mutation, which causes cells to
accumulate more than 20-fold higher levels of glycogen (1, 5). The
introduction of pCSRB-SF into a strain that was defective in
csrA, TR1-5BW3414, did not alter its glycogen levels (Fig. 6). Several E. coli strains that are wild type for
csrA were transformed with pCSRB-SF (BW3414, DH5, JM101,
and E. coli B), and in each case the transformants were
found to accumulate elevated levels of glycogen (data not shown).
Furthermore, the multicopy plasmid clones pAK671 and pAK672, which
encode csrB homologs aepH* and aepH+ of E. carotovora (12), also
stimulated glycogen accumulation in E. coli, indicating that
csrB and aepH are functionally equivalent (data
not shown).
Csr is a regulatory system that was originally identified via a mutation that inactivates a small RNA-binding protein, CsrA, which is the central factor of this system (1). The role of CsrA in procaryotic metabolism and physiology is just beginning to be understood. It is a negative regulator of certain processes associated with the early stationary phase of growth, including glycogen synthesis and catabolism (1, 5), and gluconeogenesis (1, 4) in E. coli, and the expression of several extracellular virulence factors in E. carotovora (6, 7). CsrA also modulates the glycolytic pathway in E. coli (4), affects cell surface properties (1), and has been proposed to directly or indirectly affect DNA gyrase activity (39). CsrA is related in sequence to a diverse subset of RNA-binding proteins known as KH proteins, and its regulatory role in the glycogen biosynthesis pathway is to bind to and facilitate the decay of the glgCAP message (Ref. 3, and RNA mobility shift experiments).1 The current study demonstrates that the CsrA protein itself is a direct modulator of glg gene expression and has provided a highly purified, biologically active, and well-characterized recombinant CsrA gene product useful for structural and mechanistic studies.
Purified recombinant CsrA protein was found to be bound to a ~350-nucleotide RNA in a large globular multisubunit complex. The fact that 14 different cDNA clones generated from this RNA all originated from a single genetic locus in E. coli showed that it was not a collection of various CsrA-regulated messages bound to CsrA in an intermediate state of turnover but instead represented a single gene product, CsrB. Extensive endonucleolytic degradation of CsrB RNA and repurification of the RNA-free CsrA protein did not inactivate the protein but instead generated a preparation that was somewhat more active in regulating gene expression in S-30 transcription-translation reactions. More rigorous investigations of the in vitro activities of CsrB and CsrA are planned, in which S-30 extracts prepared from a mutant in both csrA and csrB (currently under construction in our laboratory) will be used to monitor the effects of purified CsrB and CsrA on glg gene expression. Nevertheless, it is already clear that CsrB RNA is not required for CsrA activity. Thus, the CsrA-CsrB complex is not a ribozyme. Furthermore, CsrB lacks a significant open reading frame in its sequence and does not appear to be a messenger RNA. Finally, overexpression of csrB in E. coli strains that are wild type for csrA caused an increase in intracellular glycogen levels, which are under strong negative control of CsrA, while a strain that was defective in csrA showed no effect of increased CsrB. Based upon the evidence presented here, CsrB is proposed to be a second regulatory component of the Csr system. CsrB defines a novel function for an RNA molecule, which is sequestration and inhibition or antagonism of an mRNA decay factor.
Further evidence of the regulatory function of CsrB is derived from previous studies of the csrB homolog of another species of the Enterobacteriaceae, the plant pathogen E. carotovora (12). The csrB region of E. carotovora, aepH, functions antagonistically to the highly conserved csrA homolog of this species, rsmA (7). The aepH region positively regulates several secreted virulence proteins, which are repressed by rsmA. The mechanism for the positive effects of aepH on these virulence proteins has not been shown, except that transcript levels of regulated genes were affected (12). Taken in context with the studies reported here for the E. coli CsrB RNA, the small open reading frame previously noted in the aepH region of the E. carotovora genome probably does not encode the regulatory factor responsible for these effects. Furthermore, (i) there is no evidence that the AepH protein is actually synthesized; (ii) the aepH open reading frame is not conserved in E. coli; (iii) the highly repeated sequence elements noted for E. coli CsrB RNA are conserved in E. carotovora; and (iv) transposition mutations and deletions significantly upstream from the aepH reading frame, but within or slightly upstream from the repeated elements in E. carotovora, inactivated the regulatory function of this region and indicated that the repeated elements are functional (12).
We propose that the Csr systems of E. coli and E. carotovora include, but are not necessarily limited to, a
regulatory RNA transcript, CsrB, which binds to and antagonizes the
activity of an RNA-binding protein, CsrA or its homolog RsmA. CsrA
functions by modulating mRNA stability. A model depicting these
interactions within Csr is shown in Fig. 7. A prediction
of this model that has been tentatively tested but should be more
thoroughly explored is that factors affecting CsrB levels in the cell
determine the activity of CsrA. A further prediction is that turnover
of CsrB RNA could present an effective means to rapidly increase the
active concentration of CsrA in the cell and halt the expression of
CsrA-inhibited genes by causing their mRNAs to be degraded. Thus,
in addition to its interesting role as a biological control system, Csr
has the potential to be exploited as a means of modulating gene
expression in biotechnology applications, independently of or in
addition to existing transcriptional control mechanisms.
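The prediction that CsrB levels set the active concentration of CsrA can be illustrated with a deliberately crude titration sketch. It assumes an all-or-none, tight-binding limit in which every CsrB transcript removes about 18 CsrA subunits from the free pool (the approximate stoichiometry of the purified complex). The function name free_csra and all molecule counts are hypothetical and are not measurements from this study.

```python
SITES_PER_CSRB = 18  # approximate CsrA subunits bound per CsrB RNA in the purified complex


def free_csra(total_csra, csrb_copies, sites=SITES_PER_CSRB):
    """Free (active) CsrA molecules under an assumed tight-binding, full-occupancy limit."""
    return max(total_csra - sites * csrb_copies, 0)


# Hypothetical copy numbers per cell, chosen only to show the trend.
for csrb in (0, 50, 100, 200):
    print(f"CsrB copies = {csrb:3d}  free CsrA = {free_csra(3600, csrb)}")
```

In this toy picture, raising CsrB depletes free CsrA roughly linearly until it is exhausted, and degrading CsrB releases it again. A realistic treatment would need measured binding constants and cellular abundances, which are not available here.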
The observations that 18 CsrA subunits are bound to CsrB RNA to form a ribonucleoprotein complex and that 18 imperfect repeated elements are localized to predicted single-stranded regions of CsrB RNA strongly suggest that the sequence CAGGA(U/A/C)G serves as a recognition element for CsrA. The intriguing relationship of this sequence to the Shine-Dalgarno sequence (40), which is involved in binding of the ribosome to mRNA, should not go unrecognized, especially in view of the fact that the cis-acting region for the regulation of glgC expression by CsrA approximates the ribosome binding site (3). The elucidation of the molecular interactions of CsrA with these highly repeated sequences of CsrB, as well as with mRNAs such as glgC, may also provide general insight into the structure and function of KH domains, which are found in numerous diverse RNA-binding proteins.
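Locating candidate recognition elements of this kind in a transcript amounts to a simple pattern search. The following is a minimal sketch, assuming that the consensus CAGGA(U/A/C)G is matched exactly; because the repeats in CsrB are described as imperfect, an exact search of this kind would be expected to undercount them. The function name find_motifs is hypothetical, and the toy sequence is made up for illustration and is not the CsrB sequence.

```python
import re

# Consensus CAGGA(U/A/C)G as a regular expression; the lookahead lets
# overlapping occurrences be reported as separate hits.
CSRA_MOTIF = re.compile(r"(?=(CAGGA[UAC]G))")


def find_motifs(rna):
    """Return (0-based start, matched 7-mer) for each exact consensus hit."""
    rna = rna.upper().replace("T", "U")  # accept DNA or RNA notation
    return [(m.start(), m.group(1)) for m in CSRA_MOTIF.finditer(rna)]


# Made-up toy sequence (NOT CsrB): three consensus 7-mers followed by the
# glgC Shine-Dalgarno sequence AAGGAGU, which is not a full match.
toy = "GGUCAGGAUGAACAGGAAGCUUCAGGACGUUAAGGAGU"
for start, hit in find_motifs(toy):
    print(start, hit)
```

The Shine-Dalgarno sequence at the end of the toy string is not reported, since it shares only the central AGGA with the consensus. A tolerant search that allows mismatches would be needed to recover imperfect repeats.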
No other bacterial RNAs are currently known to function in the same capacity as CsrB, although the number and variety of transcripts with trans-acting regulatory functions are growing rapidly. Examples include several antisense RNAs (41, 42), RNA III of Staphylococcus aureus, which is a global regulator of virulence factors and can affect gene transcription as well as translation (e.g. Ref. 43), and the small E. coli transcript DsrA, which regulates the transcription of rcsA (a regulatory gene for capsule biosynthesis) by antagonizing H-NS-mediated silencing (44). Due to the absence of open reading frames in their gene sequences, RNA-based regulatory systems such as these may be more difficult to recognize than systems based on protein factors. Moreover, mutagenesis may be less efficient for small genes that lack strict coding requirements. We anticipate that such systems will be increasingly recognized as information from ongoing microbial genome sequencing projects is applied to the molecular analysis of bacterial physiology and regulation.
We thank Paul Nixon for help in generating Fig. 5 and Arun Chatterjee for providing plasmids containing aepH.