(Received for publication, June 13, 1995)
Ribosomal protein S4 from Escherichia coli binds a
large domain of 16 S ribosomal RNA and also a pseudoknot structure in
the α operon mRNA, where it represses its own synthesis. No
similarity between the two RNA binding sites has been detected. To find
out whether separate protein regions are responsible for rRNA and mRNA
recognition, proteins with N-terminal or C-terminal deletions have been
overexpressed and purified. Protein-mRNA interactions were detected by
(i) a nitrocellulose filter binding assay, (ii) inhibition of primer
extension by reverse transcriptase, and (iii) a gel shift assay.
Circular dichroism spectra were taken to determine whether the proteins
adopted stable secondary structures. From these studies it is concluded
that amino acids 48-104 make specific contacts with the mRNA,
although residues 105-177 (out of 205) are required to observe
the same toeprint pattern as full-length protein and may stabilize a
specific portion of the mRNA structure. These results parallel
ribosomal RNA binding properties of similar fragments (Conrad, R. C.,
and Craven, G. R. (1987) Nucleic Acids Res. 15,
10331-10343, and references therein). It appears that the same
protein domain is responsible for both mRNA and rRNA binding
activities.
Functional studies of ribosomes have tended to focus on the roles of the ribosomal RNAs in recent years, as a number of studies have uncovered specific contributions of different rRNA domains to ribosome activities (1). As more ribosomal protein sequences have become available, it is becoming clear that a number of these proteins are highly conserved among all organisms and must also have specific and necessary roles in ribosome function. An intriguing set of ribosomal proteins are those that bind directly and independently to the ribosomal RNAs and also autogenously regulate ribosomal protein expression. In many cases the regulation is due to protein recognition of the mRNA translational initiation region (2), although protein binding to a pre-mRNA splice site has also been observed (3). These instances of a single protein carrying out two different RNA-related functions provide interesting systems for studying how RNA recognition has evolved and is related to specific protein functions (4).
In several instances there is
convincing similarity between the secondary structures of the mRNA and
rRNA targets of an autoregulatory ribosomal
protein(4, 5, 6) . It is reasonable to
conclude that both mRNA and rRNA bind in the same active site of any
one of these proteins. In other cases there is no obvious similarity
between the two RNA substrates. For instance, the mRNA target site for Escherichia coli S4 protein is a complex pseudoknot of about 110 nucleotides within the α operon mRNA (7). Nearly the entire 5′ domain of the 16 S rRNA, a fragment of 460 nucleotides, is needed to form the ribosomal binding site for the protein (8),
although a smaller region is protected from cleavage by bound
protein(9) . There is no primary or secondary structural
similarity between the two target sites, for which there are two
possible explanations. S4 may be recognizing a three-dimensional RNA
structure that is common to the two RNAs but not obvious from
comparisons at the secondary structure level, or separate rRNA and mRNA
binding domains may have evolved in S4. In vivo experiments
showing that mutant S4 proteins with C-terminal deletions assemble into
ribosomes but are defective in mRNA regulation (10, 11) have been interpreted in favor of the latter
explanation(12) .
In this paper we build on the previous
work of Craven and colleagues (12, 13, 14, 15) , who cleaved S4 by
various methods and studied the ability of the protein fragments to
bind 16 S rRNA and promote ribosome assembly in vitro. We have
prepared a number of similarly truncated proteins by overexpression of
the S4 gene rather than cleavage methods, and show that N- and
C-terminal regions of S4 that are not necessary for rRNA binding are
also not required for S4 recognition of the mRNA pseudoknot. The
results show that no more than about 130 of the 205 amino acids are
needed to fold a stable RNA binding domain, and suggest that two
regions within this domain may recognize different parts of the RNA.
The wild type S4 protein was extracted from the crude lysate by raising the salt concentration to 0.7 M NaCl, followed by centrifugation at 10,000 rpm for 30 min. The supernatant was then diluted with an equal volume of 6 M urea buffer (6 M urea, 20 mM potassium phosphate, 0.5 mM DTT, pH 5.6) and then dialyzed against the same 6 M urea buffer. The protein
fragments were not extracted into the supernatant by the high salt
buffer, and had to be solubilized from the pellet of cell debris. 30 ml
of 6 M urea buffer was added to the pelleted debris, the
mixture stirred in the cold for 1 h, and centrifuged 40 min at 10,000
rpm.
Proteins in 6 M urea buffer were purified by high
performance liquid chromatography using a Bio-Rad TSK SP-5-PW cation
exchange column (75 × 7.5 mm) with a 3-h gradient from
0 to 40% 1 M KCl in 6 M urea buffer at a flow rate
of 0.8 ml/min. Full-length S4 reproducibly yielded nearly 40 mg of pure
protein from 1 liter of cell culture. After purification, all the
proteins were dialyzed into TK buffer (30 mM Tris-HCl, 350
mM KCl, 0.5 mM DTT, pH 7.6), using dialysis membrane
with a molecular mass cut-off of 3500 daltons (SpectraPor 3), and
stored at -70 °C. Protein purity was checked by
electrophoresis of samples in polyacrylamide gels containing SDS or
urea/acetate at pH 4.5; all preparations were better than 95% pure.
Amino acid compositions were obtained for all the proteins and agreed
with the predicted compositions. N-formyl methionine is
missing from all the analyses, and we presume it has been
proteolytically removed as it is in wild type S4. Protein
concentrations were determined by absorbance at 280 nm using extinction
coefficients calculated according to the method of Gill and von
Hippel(20) . The protein fragments and their extinction
coefficients are listed in Table 1. Note that previous work on S4-mRNA binding (7) used a larger extinction coefficient reported by Rhode et al. (21); binding constants cited from that paper have been corrected to account for this.
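The concentration calculation described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the chromophore coefficients are the values commonly quoted from Gill and von Hippel for denatured protein (an assumption here, since the text does not list them), and the peptide sequence is a hypothetical toy example rather than the S4 sequence.

```python
# Chromophore extinction coefficients at 280 nm (M^-1 cm^-1).
# Assumed values, as commonly quoted from Gill and von Hippel.
EPS_TRP = 5690
EPS_TYR = 1280
EPS_CYSTINE = 120


def extinction_280(sequence, n_cystine=0):
    """Sum chromophore contributions for a one-letter amino acid sequence."""
    seq = sequence.upper()
    return (seq.count("W") * EPS_TRP
            + seq.count("Y") * EPS_TYR
            + n_cystine * EPS_CYSTINE)


def concentration_uM(a280, epsilon, path_cm=1.0):
    """Beer-Lambert law, c = A / (epsilon * l), converted from M to µM."""
    return a280 / (epsilon * path_cm) * 1e6


# Hypothetical toy peptide (NOT the S4 sequence): 2 Trp and 2 Tyr.
toy_sequence = "MWYYAKW"
toy_epsilon = extinction_280(toy_sequence)
toy_conc = concentration_uM(0.1394, toy_epsilon)
```

Because an association constant scales inversely with the protein concentration scale, replacing one extinction coefficient with another rescales a previously reported constant by the ratio of the new coefficient to the old one.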
S4 protein and its fragments were warmed to
37 °C for 30 min in their own storage buffer supplemented with 7
mM 2-mercaptoethanol, cooled at room temperature for 5 min,
and then placed on ice before use. Both protein and subunits were kept
on ice until use. The final buffer composition for the toeprinting
reaction was 50 mM Tris-HCl, pH 8.0, 75 mM KCl, 10
mM DTT, 3 mM MgCl₂, and 0.7 mM 2-mercaptoethanol. The mRNA concentration was 57 nM, and
dNTPs were 0.1 mM each. 200 units of SuperScript reverse
transcriptase (Life Technologies, Inc.) were used for each 10-µl
assay. The order of addition of components to the assay buffer was
dNTPs, protein, and RNA with annealed primer. Following a 10-min
incubation of all the components at 0 °C, reverse transcriptase was
added and extension allowed to proceed at the indicated temperature for
15 min; temperatures from 28 to 37 °C were used. Reactions were
then quenched with 10 µl of formamide dye mix (99% (v/v) deionized
formamide, 10 mM NaOH, 1 mM Na₂EDTA, and 0.05% each of bromphenol blue and xylene cyanol), placed in a boiling water bath for 3 min, and then placed on ice before loading half of the reaction onto an 8% denaturing polyacrylamide sequencing gel (40 × 20 × 0.035 cm). Dideoxy sequencing reactions, carried out with avian myeloblastosis virus reverse transcriptase (Life Sciences, Inc.) and the same DNA primer as used for toeprint reactions, were
included on each gel for reference. Gels were exposed to preflashed
x-ray film at -70 °C, and quantitation of bands was done on a
Molecular Dynamics scanning densitometer.
The reactions were loaded onto a polyacrylamide gel (7.7% acrylamide, 0.2% bis-acrylamide, 10 cm long and 0.8 mm thick) with 30 mM Tris-HCl, 350 mM KOAc, and 2 mM MgSO₄, pH 7.6, as the running buffer. The gel was
cooled to 4 °C before loading the samples; electrophoresis was
carried out at 75 V for 3 h at 4 °C. The running buffer was
recirculated periodically to prevent formation of a pH gradient. The
RNA and complexes were stained with either 0.0001% ethidium bromide or
2% methylene blue.
Fig. 1A shows titrations
of mRNA and 16 S rRNA fragments with the recombinant S4 protein
in a filter binding assay. The binding constant estimated for the mRNA (14 ± 3 µM⁻¹) is slightly higher than that measured previously with ribosome-derived protein under the same conditions (7.4 µM⁻¹ (7)). The S4 affinity for the rRNA fragment (17 ± 5 µM⁻¹) is also comparable to that measured previously (14 µM⁻¹ (8)). S4 exhibits a high level of nonspecific binding to tRNA and 23 S rRNA fragments in the filter binding assay. Under the salt conditions used here, the apparent nonspecific binding affinities are on the order of 1 µM⁻¹, and the maximum retention extrapolates to nearly 1.0 (8, 28). The same behavior is seen in the binding of overexpressed S4 with a 505-nucleotide fragment of 23 S rRNA (K of about 3 µM⁻¹ and maximum retention of about 1.0, data not shown). The six S4 fragments were also used in the same filter assay; representative titrations of RNA fragments with S4(2-104) are shown in Fig. 1B. Binding of the mRNA, 16 S rRNA, and 23 S rRNA fragments is indistinguishable, and all three extrapolate to a maximum retention of about 1.0. All the protein fragments bound all the RNA fragments with apparent affinities ranging between 1 and 4 µM⁻¹. The curves varied considerably between experiments, and it was not possible to reliably determine whether binding to the 16 S rRNA and α mRNA fragments was significantly stronger than nonspecific binding for any of the protein fragments.
Figure 1: Nitrocellulose filter retention assays. The fraction of ³⁵S-labeled RNA retained on filters is plotted as a function of protein concentration. The curves are least squares best fits to the indicated sets of data. A, S4(2-206) titrating the 5′ domain of 16 S rRNA (K = 12 µM⁻¹, maximum retention = 0.58, background retention = 0.091) and the α mRNA fragment (K = 15 µM⁻¹, maximum retention = 0.48, background retention = 0.038). B, S4(2-104) titrating the 5′ domain of 16 S rRNA, the α mRNA fragment, and a 23 S rRNA fragment. The curve is fit to the α mRNA data set, with K = 0.94 µM⁻¹, maximum retention = 1.0, background retention = 0.069.
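The three parameters quoted in the legend (K, maximum retention, background retention) are what a single-site binding isotherm with a baseline would need. The sketch below is one plausible form of such a curve, not the fitting procedure actually used: it assumes K is an association constant in µM⁻¹, that protein is in excess over RNA, and that "maximum retention" is the plateau of the curve rather than its amplitude. The function names and the grid search are illustrative choices, and the data points are synthetic.

```python
def retention(protein_uM, K_per_uM, r_max, r_bg):
    """Fraction of RNA retained for a single-site isotherm with a baseline."""
    bound = K_per_uM * protein_uM / (1.0 + K_per_uM * protein_uM)
    return r_bg + (r_max - r_bg) * bound


def fit_K(data, r_max, r_bg, k_grid):
    """Least squares estimate of K over a grid of candidate values."""
    def sse(K):
        return sum((retention(p, K, r_max, r_bg) - r) ** 2 for p, r in data)
    return min(k_grid, key=sse)


# Parameters quoted in the Fig. 1A legend for the mRNA fragment.
K, R_MAX, R_BG = 15.0, 0.48, 0.038

# Synthetic, noise-free titration generated from the model itself.
concs_uM = [0.01, 0.03, 0.1, 0.3, 1.0]
synthetic = [(p, retention(p, K, R_MAX, R_BG)) for p in concs_uM]

# Candidate K values from 0.1 to 50.0 µM^-1 in steps of 0.1.
grid = [k / 10 for k in range(1, 501)]
k_fit = fit_K(synthetic, R_MAX, R_BG, grid)
```

With real titrations the maximum and background retention would normally be fitted together with K; they are held fixed here only to keep the sketch short.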
The high level of nonspecific binding seen for S4 and its
fragments seems to be at variance with results from Craven's
laboratory, in which binding of similar proteins to 23 S rRNA could not
be detected (12; see also “Discussion”). The filter “pull-through” assay used by that group measures the loss of labeled S4 from nitrocellulose filters upon titration with 16 S rRNA, and therefore depends on the inability of S4-rRNA complexes to bind nitrocellulose. This behavior is observed only for 16 S rRNA fragments with 3′ termini extending beyond approximately nucleotide 900; presumably the larger rRNA “surrounds” the protein and prevents it from contacting the filter (8). Nonspecifically
bound S4 might not be protected by the RNA in the same way, rendering
the pull-through assay less sensitive to nonspecific binding than the
filter retention assay shown in Fig. 1. Since the filter
retention assay cannot distinguish weak but specific binding of the S4
fragments to rRNA from nonspecific binding, we have looked for other
assays to determine whether the fragments retain specific interactions
with the mRNA.
Figure 2:
Toeprint assays with S4(2-206) and wild type mRNA (A) or CKT4 mRNA (B). Assays were carried out as described under “Materials and Methods,” with reverse transcription at 32 °C. Dots are at positions of S4(2-206)-dependent pauses that occur with wild type mRNA but not with CKT4 mRNA.
Figure 3:
Diagram of the mRNA pseudoknot
secondary structure, based on compensatory base pair changes preserving
S4 binding affinity(7) . The Shine-Dalgarno sequence and
initiation codon are indicated by dots and an underline, respectively; numbering is from the first
nucleotide of the
promoter transcript(46) . Positions of
RVT pausing induced by S4(2-206) are shown by arrows.
Since (i) S4 and 30 S subunits are able to induce similar RVT pauses and (ii) the most intense pauses occur near the 3′ end of the pseudoknot structure, we think it likely that the RVT is sensing stabilization of the pseudoknot by S4, rather than S4 protein itself. Several weak pauses toward the 5′ end of the mRNA are usually seen as well (Fig. 2). Since RVT transcription should completely unfold the pseudoknot by the time it reaches these pause sites, they are presumably induced by nonspecific S4 binding. To test this, two mRNA mutations with reduced S4 affinity were used in the toeprint assay, CKT12 (G to C) and CKT4 (CC to GG). Both have binding constants on the order of 1 µM⁻¹ in the filter binding assay (7). Neither mutant mRNA showed S4-induced pauses between C and G, arguing that the wild type RNA pattern of toeprints in this region is due to a specific S4-mRNA complex (Fig. 2B and data not shown). Weak pauses toward the 5′ end of the mutant mRNAs are similar to those seen in the wild type mRNA and are presumably associated with nonspecific binding.
Filter binding assays were carried out with the same RNA under an identical protocol of incubations as used for the toeprint assay. The binding constant measured was 7.2 ± 0.7 µM⁻¹, slightly lower than measured under the conditions of Fig. 1A. Determination of an apparent binding constant from the toeprint assay is somewhat uncertain, since nonspecific inhibition of the transcriptase at higher S4 concentrations complicates quantitation of the paused transcripts and precludes an accurate determination of the efficiency with which bound S4 induces RVT pausing. From densitometry of the Fig. 2A gel, we estimate that the A-U pause reaches 50% of its maximum possible value at about 0.3 µM S4 concentration; thus, K is about 3 µM⁻¹. It is likely that RVT biases the measurement by displacing bound S4, so 3 µM⁻¹ should be considered a lower limit. The affinity of the complex inducing the toeprint pattern is therefore in the range expected for specific binding.
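The step from a half-maximal S4 concentration to an apparent binding constant can be illustrated with a short calculation. It assumes a simple single-site isotherm in which the pause intensity is proportional to the fraction of mRNA bound, so that the midpoint concentration is the reciprocal of the association constant. The function names are illustrative, and the two numerical inputs are the values quoted in the text.

```python
def association_constant(half_max_uM):
    """K (µM^-1) for a single-site isotherm whose midpoint is at half_max_uM."""
    return 1.0 / half_max_uM


def fraction_bound(protein_uM, K_per_uM):
    """Fraction of RNA bound at a given free protein concentration."""
    return K_per_uM * protein_uM / (1.0 + K_per_uM * protein_uM)


k_toeprint = association_constant(0.3)  # midpoint quoted for the toeprint assay
k_filter = 7.2                          # filter binding value quoted in the text
ratio = k_filter / k_toeprint           # how far the toeprint estimate falls short
```

Under these assumptions the toeprint estimate is roughly half the filter binding value, which fits the statement that the toeprint figure is a lower limit.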
Figure 4: Toeprint assays with S4 fragments. Assays were carried out as described under “Materials and Methods,” with reverse transcription at 32 °C. The S4 fragments used were S4(2-177) (A), S4(48-177) (B), S4(2-123) (C), S4(2-104) (D), S4(48-123) (E), and S4(48-104) (F). The lanes labeled C are controls with no added protein. Protein concentrations in µM are indicated above the remaining lanes. Dots indicate nucleotide positions for comparison with Fig. 2A.
Two of the remaining fragments, S4(2-123) and S4(2-104), gave patterns similar to that of intact S4 at the U-G, C-A, and C-G pause sites, but did not show any effect on the cluster of pauses around U (Fig. 4, C and D). The simplest interpretation of this result is that the protein forms two sets of contacts with the RNA, one of which is localized to residues 124-177 and specifically stabilizes the U region.
Of the two shortest protein fragments, S4(48-123) induced extremely weak pausing at C-G, C-A, and C-U (Fig. 4E). Stronger pausing is observed at U. The two A pauses seen in Fig. 4E were not consistently observed. S4(48-104) does not induce a detectable toeprint at concentrations of over 2 µM (Fig. 4F).
All of the S4 fragments were also assayed with CKT4 RNA. No toeprint pattern in the C-G region was observed for any of them, arguing that the pause sites seen in Fig. 4 reflect specific S4-mRNA complexes (data not shown).
Figure 5: Gel shift assays. Electrophoresis of RNA (5 µM) incubated with the indicated S4 fragments (10-15 µM) was carried out as described under “Materials and Methods.”
The CD spectrum of S4
extracted from ribosomes has been reported as part of an extensive
physical study of the protein(32) . Although obtained under
different buffer conditions, it is essentially the same as what we
observe (Fig. 6A). The approximate Tm of S4 is 41-45 °C, from observations of the
irreversible unfolding of the protein in scanning calorimetry
experiments(32, 33) . This is consistent with the
large decrease in negative ellipticity that we observe between 37 and
55 °C (Fig. 6C).
Figure 6:
CD spectra of S4 and S4 fragments. A, CD spectra of proteins with the normal N terminus and various C termini, as indicated. Spectra were taken at 8 °C in TK buffer with 2-mercaptoethanol. B, spectra of fragments with amino acids 2-47 deleted; conditions were the same as in panel A. C, temperature dependence of S4 and S4 fragment ellipticity. CD spectra were recorded at different temperatures, and the ellipticity at the wavelength of the minimum (209-210 nm) in the 8 °C spectrum is plotted for S4(2-206) (+), S4(2-177), S4(48-177), S4(2-123), S4(48-123), S4(2-104), and S4(48-104). Ellipticity is reported per mole of protein fragment.
Fig. 6 (A and B) also displays the CD spectra of the S4 fragments at 8 °C. As successive deletions are made to the C terminus in proteins with an intact N terminus, there are incremental decreases in the intensity of the CD signal (Fig. 6A), as if secondary structures can be sequentially removed from the C terminus without major disruption of the remaining protein. Deletion of N-terminal amino acids has a more unusual set of effects on the protein structure, as seen in comparisons of panels A and B in Fig. 6. S4(2-177) and S4(48-177) have virtually identical spectra, suggesting that the N terminus has little or no secondary structure. (The ellipticity of S4(48-177) is at most 6.8% more negative than that of S4(2-177) at 218 nm, and less than 1% different at the spectrum minimum of 210 nm.) However, S4(48-123) has substantially less secondary structure than S4(2-123). In the same way, S4(48-104) is essentially unstructured, while S4(2-104) has significant secondary structure. One interpretation of these results is that the central part of the protein (48-104) is stabilized by either the N-terminal 47 amino acids or a region near the C terminus (124-177). This interpretation is consistent with the observation that fragments producing a specific toeprint have in common only amino acids 48-104, although S4(48-104) itself does not show any specific RNA binding.
The above studies show that S4 specificity for 16 S rRNA resides in residues 48-177. We find that the same region is responsible for S4-α mRNA binding specificity: S4(2-177) and S4(48-177) both give the same toeprint pattern as intact S4, and specifically interact with the pseudoknot form having faster mobility in gel electrophoresis. The C-terminal deletion decreases the S4-mRNA binding affinity, suggesting that this region either stabilizes the rest of the protein or makes nonspecific contacts with the mRNA. This is consistent with qualitative observations made by Daya-Grosjean et al. (39) and Green and Kurland (38) of truncated S4 proteins binding to 16 S rRNA, although Changchien et al. (12) obtained a binding constant closer to that of intact S4 with the C-terminal deletion they examined. Our own filter binding measurements, made under conditions similar to those used by Changchien et al. (12) but with the 5′ domain of 16 S rRNA instead of the intact rRNA, show that S4(2-177) binds rRNA about 6-fold more weakly than the intact protein.
Streptomycin-independent revertants of S4 do not regulate mRNA translation in vivo (10, 11), which led to the suggestion that C-terminal sequences deleted in these mutants are required for translational repression activity but not ribosome assembly (12). Our finding that S4(2-177) binds mRNA specifically suggests that defective regulation by these mutants is not due to lack of α mRNA recognition. Two factors may be responsible: the weaker S4-mRNA binding affinity will require a higher pool size of free S4 to achieve regulation, and the mutants are turned over much more rapidly than wild type S4 and do not accumulate as free protein (40). Thus there is no reason to postulate a specific role for the S4 C terminus in α mRNA binding, or separate mRNA and rRNA binding domains.
S4 fragments with C-terminal deletions produced by Me₂SO-HBr reaction with Trp-170, hydroxylamine cleavage at Asn-123/Gly-124, or cyanogen bromide reaction at Met-105 have been described (14, 15, 41). All of these interact preferentially with 16 S rRNA over 23 S rRNA in an assay in which the binding of labeled protein to nitrocellulose filters is prevented by formation of the protein-RNA complex (42). The ability of the fragments cleaved at Trp-170 and Met-105 to participate in ribosome assembly was investigated; both fragments promote assembly of 30 S subunits with a full complement of proteins, but sedimentation of the subunits shows either a smaller S value (Trp-170 (14)) or a much broader peak indicating heterogeneous conformations (Met-105 (41)). From these experiments, Conrad and Craven (41) concluded that the sequence 48-104 contained sufficient information for RNA recognition, but that N- and C-terminal sequences are essential for proper assembly of the 30 S subunit. Because others had found UV-induced cross-links of the C-terminal half of S4 to rRNA (43) and observed rRNA-induced protection of Lys-121 and Lys-148 from reductive methylation (44), the possibility was raised that a region C-terminal to Arg-104 makes a second set of RNA contacts (41).
These results on the 16 S
rRNA binding capacity of S4 fragments are again consistent with the
behavior we see for similar fragments binding mRNA.
S4(2-104) and S4(2-123) both give a toeprint pattern
missing one set of bands found with S4(2-206), but otherwise
identical. Neither of these fragments shows a gel shift with the mRNA;
probably the binding affinity is even weaker than with S4(2-177)
and S4(48-177). The most straightforward interpretation of these
results is that a set of RNA-protein contacts has been deleted in the
fragments, although indirect effects of the 124-177 region on the
conformation of the protein-mRNA complex are also a possibility. In
either case, it appears that a region in the C-terminal half of the protein is needed to form a functional S4 complex with both α mRNA and 16 S rRNA.
Although the results obtained with S4(2-104) and S4(48-177) suggest that RNA binding specificity should be retained by S4(48-104), this fragment does not interact in any specific way with the mRNA. The somewhat larger fragment S4(48-123) may bind specifically but very weakly. The CD spectra suggest a reason for the lack of binding in these fragments. S4(48-104) has little, if any, secondary structure, and the little secondary structure present in S4(48-123) denatures above 25 °C. The structure of the 48-104 region may need either N- or C-terminal sequences to fold stably.
Figure 7:
Conservation of the S4 sequence. The continuous sequence is that of the E. coli rpsD gene (46); below it is the secondary structure predicted by the PHD program (45) (L = turn, H = helix, E = sheet). (Only predictions with a confidence level of about 80% are shown.) The line
above the E. coli sequence shows the positions at which 10/11
bacterial and chloroplast S4 homologs are identical, and the top line
indicates where 4/5 eukaryotic S4 homologs are identical to the
bacterial consensus. The termini of the fragments discussed in this
work are located at the ends of the sequence regions shown, or at the
position of the vertical line.
Taking into
consideration the RNA binding properties of the fragments, the sequence
conservation of the protein, and the predicted secondary structure, we
suggest that S4 can be divided into four regions. The N-terminal 46
amino acids of the protein are not predicted to have secondary
structure, consistent with the removal of these amino acids by trypsin
digestion of an S4-rRNA complex (13) and the nearly identical
CD spectra of S4(2-177) and S4(48-177); these residues are
neither conserved nor essential for mRNA or rRNA binding. The region
from 48-104 is probably responsible for most of the RNA contacts, is predicted to have extensive α-helical structure, and is also well-conserved. (This region is unstructured by itself, however, and is therefore not a protein domain in the sense of an independently folding structure.) A third region extends from 105 to somewhere between 137 and 145; it is also well conserved and predicted to have α-helix and β-sheet structure. This is potentially the region responsible for the altered toeprint seen with fragments terminating at Ile-123 and Arg-104. The fourth region of S4 extends from about 145 to the C
terminus; it is neither conserved nor predicted to have much secondary
structure. At least the sequence from 178 to 206 is not essential for
mRNA or rRNA binding, although it appears to increase the binding
affinity without altering specific interactions.
Finally, we note that S4(48-177), which contains the RNA recognition domain, is small enough that its structure could, in principle, be determined by NMR, although the marginal stability of the fragment at 30-35 °C makes such experiments difficult. The conservation of this region suggests that a homologous fragment with higher stability could be isolated from thermophilic organisms.