(Received for publication, December 23, 1994; and in revised form, June 5, 1995)
The gene encoding the β subunit of Bacillus subtilis RNA polymerase was isolated from a λgt11 expression library using an antibody probe. Gene identity was confirmed by the similarity of its predicted product to the Escherichia coli β subunit and by mapping an alteration conferring rifampicin resistance within the conserved rif coding region. Including the rif region, four colinear blocks of sequence similarity were shared between the B. subtilis and E. coli β subunits. In E. coli, these conserved blocks are separated by three regions that either were not conserved or were entirely absent from the B. subtilis protein. The B. subtilis β gene was part of a cluster with the order rplL (encoding ribosomal protein L7/L12), orf23 (encoding a 22,513-dalton protein that is apparently essential for growth), rpoB (β), and rpoC (β′). This organization differs from the corresponding region in E. coli by the inclusion of orf23. Experiments using promoter probe vectors and site-directed mutagenesis located a major rpoB promoter overlapping the 3′-coding region of orf23, 250 nucleotides upstream from the β initiation codon. Thus, the B. subtilis rpoB region differs from its E. coli counterpart in both genetic and transcriptional organization.
The basic features of the transcriptional machinery are remarkably conserved in all organisms. In particular, the β, β′, and α subunits that comprise the catalytic core of the eubacterial RNA polymerase are similar to subunits of the three nuclear RNA polymerases of eukaryotes (see (1) and (2) for reviews). The β′ subunit of Escherichia coli shares eight regions of sequence similarity with the largest subunit of the eukaryotic enzymes (3), and the β subunit of E. coli was initially reported to share nine regions with the second largest subunit (4, 5). Additional sequence alignment of β homologues has further refined the boundaries of these nine regions into 12 colinear segments, which are conserved in both eubacteria and eukaryotes (6). α, the third largest subunit of the eubacterial core enzyme, also has similarity to the third subunit of RNA polymerase II and to the fourth subunits of RNA polymerases I and III, although this similarity is less striking than that found among the β or β′ homologues (reviewed in (1) and (2)). This conservation of primary amino acid sequence suggests common functions for the shared regions.
Although the well characterized RNA polymerase from E. coli is available to represent the Gram-negative lineage of eubacteria, there has been less information regarding RNA polymerases from Gram-positive bacteria. With the view that the transcriptional apparatus of a genetically amenable Gram-positive bacterium could contribute to a structure-function analysis of RNA polymerases, we began a study of the genes encoding the core subunits from the spore-forming bacterium Bacillus subtilis. We earlier reported the isolation and characterization of rpoA, the gene for the α subunit (7, 8). Here, we describe the genetic and transcriptional organization of the region containing rpoB, the gene encoding the β subunit, and show that this organization differs substantially from that of the corresponding region in E. coli. Because β is involved in most of the catalytic functions of RNA polymerase, including nucleotide binding (9, 10), transcription initiation, elongation, and termination (10, 11, 12, 13, 14, 15), and interactions with both the σ subunit (16, 17) and the NusA protein (18, 19, 20), we also compared the likely functional domains of the B. subtilis and E. coli β subunits. This comparison revealed two regions of E. coli β that were entirely absent from the B. subtilis protein and thus are likely dispensable for function. Significantly, another region of 186 residues, which is known to be dispensable in E. coli, shared little overall sequence similarity with B. subtilis β but nonetheless contained a common, 69-residue segment. We infer from these results that the common segment has a more fundamental role in β function than previously believed.
Figure 1:
Genetic organization of the B. subtilis β-β′ region. The chromosome in the rpoB region is represented by the heavy line and kilobase scale. The rectangles above the physical map indicate the open reading frames encoding ribosomal protein L7/L12, the 22,513-dalton protein P23, and the β and β′ subunits of RNA polymerase, all of which are transcribed from left to right; the arrows adjacent to the L7/L12 and β′ reading frames indicate that they extend beyond the cloned region. The restriction map shows the sites used in the recombinant DNA constructions described under "Materials and Methods." (R), EcoRI linkers derived from the construction of the λgt11 library (7); H, HindIII; S, SpeI; P, PvuII; and Y, StyI. The regions of the chromosome used in some of the constructions are indicated by the horizontal lines beneath the restriction map. pKB10 was made by subcloning the indicated 1.1-kb EcoRI-SpeI fragment into the pCP115 integration vector (27). Upon integration into the B. subtilis chromosome, pKB10 would prevent any transcription initiating upstream of the L7/L12 gene from entering the β and β′ genes. pKB3, pKB4, and pKB13 were made by subcloning the indicated fragments into the single-copy transcriptional fusion vector pDH32 (26). The two fragments conferring promoter activity are labeled +. The physical map of the β-β′ region was derived from the restriction map and DNA sequence of the chromosomal inserts carried by the two λgt11 phages.
The promoter activity manifested by pKB4 was further localized by subcloning PCR-generated fragments into pDH32. pKB14 carried a fragment extending from nt 829 to the SpeI site at nt 1101 (Figs. 3 and 5); pKB15, a fragment from nt 935 to the SpeI site; and pKB16, a fragment from nt 829 to 989. DNA sequencing confirmed that no alterations had been introduced by the PCR reactions and that all three fragments were oriented with the gene direction toward lacZ. These three plasmids were then linearized and integrated into the amyE locus of strain PB2 to yield strains PB365, PB366, and PB367.
Figure 3:
Nucleotide sequence of the orf23-rpoB (P23-β) interval. The sequence shown represents the C-terminal coding region for P23, the P23-β intercistronic region, and the N-terminal coding region for β. This includes the 0.4-kb HindIII-SpeI fragment that contains rpoB promoter activity (Fig. 1). The two 5′-ends of rpoB message detected in the primer extension experiment described in Fig. 4 are indicated by the vertical lines marked A at nt 876 and B at nt 1012. The locations of the three complementary primers used to analyze the rpoB message are indicated by <P973, <P1065, and <P1150, with (<) denoting the 3′-end of the primer. The proposed -35 and -10 recognition sequences of the rpoB promoter are double underlined at nt 840-845 and 863-868, respectively; the indicated mutations change the -35 sequence from TTGACT to TAAACT in pKB17. The proposed ribosomal binding site for the rpoB message is underlined at nt 1076-1083.
Figure 4:
Mapping the 5′-end of the rpoB message by primer extension. Wild type strain PB2 was grown in 2× SG sporulation medium and harvested during logarithmic growth; the RNA was then extracted. Primer extensions were done using a molar excess of three different synthetic primers, P973, P1065, and P1150 (described under "Materials and Methods"). These primers are complementary to message synthesized using the orf23-rpoB (P23-β) intercistronic region as template; their locations are shown in Fig. 3. For each experiment, samples containing 75 µg of RNA were loaded onto lane 1 of a sequencing gel. A sequencing ladder was run in parallel using the same primer employed in the extension reactions; the letters A, C, G, and T indicate the dideoxynucleotide used to terminate the reaction. The sequences indicated on the right are from the non-transcribed strand and are the complement of the sequences that can be read from the ladder. Reactions using primers P973, P1065, and P1150 all gave a 5′-signal centered around the thymidine complementary to adenosine 876 (Fig. 3); the experiment using primer P1065 is shown in panel A. Reactions using primers P1065 and P1150 both gave a 5′-signal centered around the thymidine complementary to adenosine 1012 (Fig. 3); the experiment using primer P1150 is shown in panel B.
Sequences necessary for the promoter activity manifested by pKB16 were further defined by site-directed mutagenesis. A mutation in the proposed -35 recognition sequence of the rpoB promoter carried by pKB16 was created using the primer 5′-GCAAAAAAAGTTAAACTCGGTATTTTAACTATG, which is the same as nt 829-861 except for the two altered residues in the -35 sequence, and extended to nt 989 using the same second primer employed to amplify the pKB16 insert. DNA sequencing confirmed that this mutagenized fragment was identical to that carried by pKB16 except for the alteration of the proposed -35 sequence from TTGACT to TAAACT. The mutagenized fragment was cloned into pDH32 to yield pKB17, which was linearized and integrated into the amyE locus of PB2 to yield strain PB368.
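The coordinates quoted for the mutagenic primer can be cross-checked with a few lines of Python. This is only an illustrative sketch: the primer sequence, its start at nt 829, the two hexamers, and the -10 position at nt 863 are taken from the text and the Fig. 3 legend, whereas the function names are hypothetical and not part of the original methods.

```python
# Illustrative consistency check of the mutagenic primer coordinates.
# Sequence and coordinates are those quoted in the text; the helper
# names below are hypothetical.

PRIMER = "GCAAAAAAAGTTAAACTCGGTATTTTAACTATG"  # mutagenic primer, 5' to 3'
PRIMER_START = 829        # first nucleotide spanned by the primer
WILD_TYPE_35 = "TTGACT"   # proposed -35 hexamer carried by pKB16
MUTANT_35 = "TAAACT"      # altered hexamer carried by pKB17
MINUS_10_FIRST = 863      # first nucleotide of the proposed -10 hexamer


def locate_hexamer(primer: str, start: int, hexamer: str) -> tuple[int, int]:
    """Return the 1-based (first, last) coordinates of hexamer in primer."""
    offset = primer.find(hexamer)
    if offset < 0:
        raise ValueError(f"{hexamer} not found in primer")
    first = start + offset
    return first, first + len(hexamer) - 1


def changed_positions(wild: str, mutant: str, first: int) -> list[tuple[int, str, str]]:
    """List (coordinate, wild-type base, mutant base) for each mismatch."""
    return [
        (first + i, w, m)
        for i, (w, m) in enumerate(zip(wild, mutant))
        if w != m
    ]


def spacer_length(minus35_last: int, minus10_first: int) -> int:
    """Number of base pairs lying between the -35 and -10 hexamers."""
    return minus10_first - minus35_last - 1


if __name__ == "__main__":
    first, last = locate_hexamer(PRIMER, PRIMER_START, MUTANT_35)
    print("primer spans nt", PRIMER_START, "to", PRIMER_START + len(PRIMER) - 1)
    print("mutant -35 hexamer at nt", first, "to", last)
    print("changes:", changed_positions(WILD_TYPE_35, MUTANT_35, first))
    print("spacer (bp):", spacer_length(last, MINUS_10_FIRST))
```

Under these inputs the primer should span nt 829-861 and place the altered hexamer at nt 840-845, in agreement with the Fig. 3 legend, with the two substitutions falling at the second and third positions of the hexamer.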
Initial screening for promoter activity of the integrated fusions was on tryptose blood agar base plates (Difco Laboratories) containing 5-bromo-4-chloro-3-indolyl-β-D-galactoside. For quantitative estimates of promoter activity, we performed β-galactosidase assays essentially as described by Miller (28). B. subtilis cells were grown to late logarithmic stage in 2× SG sporulation medium (29) and then diluted 1:25 into fresh medium. Samples were taken throughout the logarithmic and stationary phases of growth. Activity was expressed in Miller units, defined as 1000 × A420/min/ml/unit of optical density at 600 nm.
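As a worked illustration of the unit definition, the calculation can be sketched as follows. The function name and the sample readings are hypothetical, and the absorbance wavelength is assumed to be 420 nm, the usual choice for the o-nitrophenol product in Miller's assay; no measured values from this study are implied.

```python
# Minimal sketch of the Miller unit calculation described in the text.
# Assumption: product absorbance is read at 420 nm. All sample numbers
# used with this function are invented for illustration.

def miller_units(a420: float, minutes: float, volume_ml: float, od600: float) -> float:
    """Return 1000 x A420 per minute, per ml assayed, per unit of OD600."""
    if minutes <= 0 or volume_ml <= 0 or od600 <= 0:
        raise ValueError("minutes, volume_ml, and od600 must be positive")
    return 1000.0 * a420 / (minutes * volume_ml * od600)


if __name__ == "__main__":
    # Hypothetical reading: A420 = 0.45 after 15 min, using 0.1 ml of a
    # culture at OD600 = 0.6.
    print(miller_units(a420=0.45, minutes=15.0, volume_ml=0.1, od600=0.6))
```

Normalizing to both culture volume and optical density is what allows samples taken at different points along the growth curve to be compared directly.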
Figure 2:
Alignment of the primary sequences of the B. subtilis and E. coli β subunits. The predicted sequence of B. subtilis β (upper) is from this work and that of E. coli β (lower) is from Ovchinnikov et al. (35). Each of the four conserved blocks was separately aligned by means of the FASTA program of Pearson and Lipman (32), with identical residues indicated by a colon (:) and conserved substitutions (36) by a period (.). The underlined segments denote the 12 regions that are highly conserved in the second largest subunits of RNA polymerases from eubacteria and eukaryotes (6). The four conserved blocks are separated by three variable regions that either are not conserved or are absent entirely from the B. subtilis protein. Key residues of E. coli β that have been altered by mutation are shown below the E. coli sequence. These include residues at which single substitutions cause either rifampicin resistance (*) or an altered termination phenotype (^) (see (12), (13), (18), (37), (38)). Arrows indicate one particular substitution that confers rifampicin resistance on both organisms; this is the site of the E. coli rpoB2 alteration (H526Y) (18) and the corresponding B. subtilis rfm2103 alteration (H482Y) (this work). Other arrows indicate E. coli substitutions that render RNA polymerase insensitive to the Alc termination factor of bacteriophage T4 (R368H (paf32) (39)) or that alter promoter clearance capabilities of the mutant polymerase (K1065R (12) and H1237A (10)). Affinity labeling studies have shown that Lys-1065 and His-1237 lie near the binding site for the transcript-initiating nucleotide (10).
Rifampicin is an antibiotic that traps prokaryotic RNA polymerases in the initial transcribing complex, preventing elongation of the nascent transcript beyond three or four nucleotides (40, 41). All known mutations resulting in rifampicin resistance map to the gene encoding the β subunit (42, 43, 44). In E. coli β, most alterations conferring rifampicin resistance lie between residues 512 and 573 in conserved block 2 (12, 13, 18, 37, 38). We used PCR to isolate this region from the chromosome of B. subtilis strain PB355 (rfm2103) and found that it contained only one alteration from the wild type sequence: a C to T transition that would change residue 482 of B. subtilis β from a histidine to a tyrosine. Transformation of strain PB2 with the fragment bearing the rfm2103 allele conferred the ability to grow in the presence of high levels of rifampicin, indicating that the H482Y alteration alone is sufficient for resistance. As shown in Fig. 2, this is the identical change in primary sequence caused by the E. coli rpoB2 allele (18), which elicits severe defects in transcription termination and is incompatible with the rho-15, nusA10, nusA11, and dnaA46 mutations (11, 19, 45, 46). Indeed, most rifampicin-resistant mutations are highly pleiotropic, causing altered initiation, elongation, and termination phenotypes (9, 11, 15, 19, 46, 47, 48). Because rifampicin-resistant mutants in conserved block 1 have the same in vivo and in vitro phenotypes as mutants in conserved block 2, Jin and Gross (18) and Severinov et al. (49) have suggested that these two blocks perform a common catalytic function in the core enzyme.
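The statement that a single C to T transition converts histidine to tyrosine follows directly from the standard genetic code, as the short sketch below illustrates. The text does not report which histidine codon occupies position 482, so both possibilities are enumerated; the function names are hypothetical.

```python
# Illustrative enumeration of single C->T transitions that turn a
# histidine codon into a tyrosine codon under the standard genetic code.
# Assumption: the actual codon at residue 482 is not given in the text,
# so both histidine codons are considered.

HIS_CODONS = {"CAT", "CAC"}  # histidine
TYR_CODONS = {"TAT", "TAC"}  # tyrosine


def c_to_t_transitions(codon: str) -> list[str]:
    """Return every codon reachable from codon by one C->T change."""
    return [
        codon[:i] + "T" + codon[i + 1:]
        for i, base in enumerate(codon)
        if base == "C"
    ]


def his_to_tyr_routes() -> list[tuple[str, str]]:
    """List (histidine codon, tyrosine codon) pairs one C->T step apart."""
    routes = []
    for codon in sorted(HIS_CODONS):
        for mutant in c_to_t_transitions(codon):
            if mutant in TYR_CODONS:
                routes.append((codon, mutant))
    return routes


if __name__ == "__main__":
    for wild, mutant in his_to_tyr_routes():
        print(f"{wild} (His) -> {mutant} (Tyr)")
```

In either case the change falls at the first codon position, so one transition is enough to account for the H482Y alteration.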
Conserved blocks 3 and 4 are also thought to comprise part of the active center for catalytic function. Affinity labeling studies indicate that both Lys-1065 and His-1237 of E. coli β lie near the site that binds the transcript-initiating nucleotide, although neither residue appears to be directly involved in subsequent phosphodiester bond formation (10). Notably, Lys-1065 and His-1237 are conserved in all prokaryotic and eukaryotic β homologues characterized to date (10, 12), and these residues are found at the expected positions in B. subtilis β (Fig. 2). Genetic evidence underscores the importance of these residues in β function. Alteration of E. coli Lys-1065 to arginine (K1065R) results in a dominant lethal mutant enzyme that is blocked in promoter clearance (12), and alteration of His-1237 to alanine (H1237A) also results in a mutant polymerase with impaired promoter clearance (10). Other single residues surrounding E. coli Lys-1065 are also important for β function, because the R1069A, G1071A, and K1073A substitutions result in dominant lethal mutants with decreased promoter clearance capabilities and aberrant response to pause sites (50). All three of these residues are conserved in B. subtilis.
On the basis of the strong similarity of its predicted product to E. coli β, we conclude that we have isolated the gene encoding the B. subtilis β subunit. This conclusion is reinforced by two independent criteria. First, the initial antibody screening and epitope selection indicated that the λgt11 clones encoded a protein with the antigenic properties of B. subtilis β. Second, as an additional biological criterion, we mapped within the B. subtilis β coding sequence an alteration conferring rifampicin resistance. In keeping with the standard genetic nomenclature, we will refer to the gene for the β subunit as rpoB.
To confirm the presence of a promoter within this 1.1-kb fragment, we moved this region into the single-copy transcriptional fusion vector pDH32 (26). Upon introduction of the resulting fusion into the B. subtilis chromosome at the amyE locus, this fragment showed clear promoter activity (Fig. 1). To locate this activity more precisely, we first subcloned two portions of the 1.1-kb fragment into the pDH32 vector and introduced these two fusions into the amyE locus. As shown in Fig. 1, the upstream 0.7-kb EcoRI-HindIII fragment, which included the rplL-orf23 intercistronic region, had no detectable promoter activity. In contrast, the downstream 0.4-kb HindIII-SpeI fragment, which included the orf23-rpoB intercistronic region, had strong promoter activity. The nucleotide sequence of this region is shown in Fig. 3. Primer extension experiments (Fig. 4) located two 5′-ends of the rpoB message within the region. One signal was centered near nt 876 (labeled A in Figs. 3, 4, and 5) and the other near nt 1012 (labeled B in Figs. 3-5).
To establish whether the 5′-ends detected in the primer extension experiment correlated with regions that contained promoter activity, we made a series of additional transcriptional fusions in the pDH32 vector, as shown in Fig. 5. From the activities manifested by these fusions, we concluded that whereas the 5′-end centered around nt 876 lay near sequences conferring promoter activity, the 5′-end centered around nt 1012 did not. Instead, it seems likely that this latter 5′-end represented a site at which the rpoB message was processed. Because we detected this same signal near nt 1012 using two different primers, we think it unlikely that it was an artifact of the extension reactions.
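The reasoning behind this assignment can be laid out as a simple interval comparison. The sketch below uses the fragment coordinates given for pKB14, pKB15, and pKB16 and the feature coordinates from the Fig. 3 legend; the dictionary and function names are hypothetical, and the pairing of pKB15 with strain PB366 and pKB16 with strain PB367 is an assumption based on the order in which plasmids and strains are listed.

```python
# Illustrative interval check: which fusion fragments contain the proposed
# promoter and which contain each mapped 5'-end. Coordinates are those
# quoted in the text; all names are hypothetical.

FRAGMENTS = {
    "pKB14": (829, 1101),
    "pKB15": (935, 1101),  # assumed to correspond to strain PB366
    "pKB16": (829, 989),   # assumed to correspond to strain PB367
}

FEATURES = {
    "promoter": (840, 868),  # proposed -35 through -10 hexamers
    "end_A": (876, 876),     # 5'-end signal A
    "end_B": (1012, 1012),   # 5'-end signal B
}


def contains(fragment: tuple[int, int], feature: tuple[int, int]) -> bool:
    """True if feature lies entirely within fragment (inclusive bounds)."""
    return fragment[0] <= feature[0] and feature[1] <= fragment[1]


def feature_table() -> dict[str, dict[str, bool]]:
    """Map each fragment name to the features it fully contains."""
    return {
        name: {feat: contains(span, fspan) for feat, fspan in FEATURES.items()}
        for name, span in FRAGMENTS.items()
    }


if __name__ == "__main__":
    for name, row in feature_table().items():
        print(name, row)
```

Under these assumptions, the fragment that carries signal B but lacks the proposed promoter is the one reported to have no promoter activity, whereas the fragment that carries the promoter and signal A but lacks signal B is the one reported to be strongly active, which is the pattern expected if B marks a processing site rather than a start site.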
Figure 5:
Activity of lacZ reporter genes transcriptionally fused with fragments from the P23-β interval. Panel A depicts the HindIII-SpeI fragment that contains rpoB promoter activity (Fig. 1). The C-terminal coding region of P23 and the N-terminal coding region of β are indicated by the rectangles above the nucleotide scale. The locations of the 5′-ends of rpoB message detected in the primer extension experiments of Fig. 4 are shown by the arrows, with the signal centered at nt 876 marked A and the signal centered at nt 1012 marked B. The fragments shown beneath the nucleotide scale were subcloned into the transcriptional fusion vector pDH32 and integrated in single copy at the amyE loci of the indicated strains. The fragments carried by strains PB367 and PB368 are identical except for the mutation of the proposed -35 recognition sequence (TTGACT to TAAACT), indicated by the filled triangle. Panel B shows the β-galactosidase activities directed by the four transcriptional fusions depicted in panel A. The strains carrying these fusions were grown and assayed for β-galactosidase activity as described under "Materials and Methods." The fusion carried by strain PB367 had strong promoter activity, and this activity was abolished by the -35 mutation in the otherwise identical fusion carried by strain PB368. We therefore conclude that the A signal centered at nt 876 represents the likely site of initiation for the rpoB message. In contrast, the fusion carried by strain PB366 had no promoter activity, suggesting that the B signal represents a processing site.
Inspection of the region necessary for promoter activity in the fusion experiments revealed the sequence TTGACT-(17 base pairs)-TAATAT just upstream from the 5′-end near nt 876 (Fig. 3). This sequence and spacing closely match the consensus recognized by RNA polymerase holoenzyme containing the major σ factor of B. subtilis, σA (52). To determine whether this sequence was required for rpoB promoter activity, we made a double mutation in the proposed -35 recognition sequence, changing it from TTGACT to TAAACT (Fig. 3). When the fragment containing this mutation was fused to the lacZ reporter gene of the pDH32 vector and integrated into the chromosome, promoter activity was completely abolished (Fig. 5).
On the basis of promoter activity measurements, primer extension experiments, and mutational analysis, we conclude that a major rpoB promoter partly overlaps the 3′-coding region for the preceding orf23 gene. Our experiments further suggest that the estimated 213-nt leader of the message originating from this promoter is processed near nt 1012. In E. coli, the rpoB message is processed at an RNase III site located within a region of secondary structure in the 321-base pair rplL-rpoB interval (53, 54). In B. subtilis, there is no stable secondary structure apparent in the sequence of the orf23-rpoB interval, and the mechanism of the inferred processing remains to be established. Because there is no obvious factor-independent terminator sequence separating orf23 and rpoB, transcription originating upstream from the gene for r-protein L7/L12 may also contribute to rpoB expression. However, as shown by the plasmid integration experiments, this upstream transcription is not required for normal growth rate and sporulation frequency. Thus, the rpoB promoter we identified is sufficiently strong to provide adequate levels of β subunit under most growth conditions.
We have identified rpoB, the gene encoding the β subunit of B. subtilis RNA polymerase. The evidence for this identification includes (i) the antigenic properties of the rpoB product, (ii) the high similarity of the predicted rpoB product with E. coli β, and (iii) the presence within the rpoB coding region of an alteration conferring rifampicin resistance. We have further shown that B. subtilis rpoB is transcribed from a promoter that overlaps the 3′-end of the preceding orf23 coding region and that this promoter is sufficiently active to support wild type growth rate and sporulation frequency.
The transcriptional organization of the B. subtilis rpoB region stands in sharp contrast to that of E. coli, in which the primary transcripts originate from the promoters of the upstream ribosomal protein operons L11 and L10 (54, 55, 56, 57). In E. coli, differential regulation of r-protein and RNA polymerase subunit expression results partly from the action of a transcriptional attenuator that lies between the gene for r-protein L7/L12 and rpoB (53, 58) and partly from specific translational feedback mechanisms operating on the r-protein and polymerase subunit messages (59, 60, 61, 62). We do not yet know whether B. subtilis rpoB expression is subject to the same translational regulation as has been proposed for the enteric system. However, because B. subtilis has a strong promoter immediately upstream from rpoB, for which no counterpart exists in E. coli, it is possible that differential regulation of r-protein and RNA polymerase gene expression could be accomplished largely at the transcriptional level.
The E. coli β subunit plays a key role in transcriptional initiation, elongation, and termination. What information regarding structure-function relationships can be derived from comparison of the predicted sequences of B. subtilis and E. coli β? As shown in Fig. 2, the two proteins share highest sequence conservation in four large blocks. This degree of conservation between such highly divergent organisms suggests that the four large blocks mediate common functions, and there is ample biochemical and genetic evidence that key residues within these blocks are critical for β function in E. coli. However, it is the differences between the two β subunits that provide our most significant results. On the one hand, the four conserved blocks are separated by three regions that are either variable or absent in B. subtilis β. On the other hand, we find that the B. subtilis protein has a C-terminal, 46-residue extension of the fourth conserved block that is not present in E. coli. These differences appear to be characteristic of the Gram-positive and Gram-negative lineages of eubacteria.
In E. coli, this 69-residue segment contains the paf32 alteration (39) and is therefore thought to define part of a contact site with the Alc protein, a site-specific termination factor encoded by bacteriophage T4 that acts as a block to the transcription of host genes (63, 64). On the basis of these results, Severinov et al. (39) proposed that non-essential regulatory proteins specific to the E. coli system may have evolved to target this dispensable region. However, the conservation of the 69-residue segment within variable region 1 of B. subtilis β, albeit at a different location, suggests that the segment is widespread among prokaryotes and therefore has a more fundamental role in β function.
This notion is supported by work of Landick et al. (13), who identified a series of substitutions in E. coli β that affect transcription termination in vivo. As shown in Fig. 2, four of these substitutions map within the common 69-residue segment, and a fifth maps in the region immediately adjacent in the E. coli protein. Furthermore, we note that the common segment is also conserved in the β subunits of Pseudomonas putida (65), Buchnera aphidicola (66), Mycobacterium leprae (67), and Mycobacterium tuberculosis (44), lending further support to the suggestion of a more universal role. In Pseudomonas and Buchnera, the segment lies toward the C-terminal portion of variable region 1, as is the case in E. coli. In contrast, the mycobacterial segment lies toward the N-terminal portion of the region, as is the case in B. subtilis. Thus, it appears that the difference in location of this segment within variable region 1 reflects a difference in domain organization between Gram-positive and Gram-negative bacteria. The importance of this common, 69-residue segment may also extend to the eukaryotic lineage. Shaaban et al. (6) found that alterations mapping in or near the corresponding region of the second largest subunit of yeast RNA polymerase III also affected termination properties of the enzyme.
The nucleotide sequence(s) reported in this paper has been submitted to GenBank.