 |
INTRODUCTION |
Stochastic fluctuations in gene expression have drawn the
attention of several research groups. It has been postulated (1, 2) and
observed (3) that proteins are produced in short "bursts" occurring
at random time intervals rather than in a continuos manner. This
expression pattern arises from the fact that elementary processes such
as polymerase binding and open complex formation involve small number
of molecules and therefore show a wide distribution of reaction times.
The fluctuations in a number of protein molecules resulting from
stochastic gene expression are of special importance if the
concentration of the protein constitutes the message that regulates
expression of another gene. In this case, one may expect that the
activation pattern of the regulated gene is also random, to some
extent. The stochastic nature of gene expression and other biochemical
processes involving small numbers of molecules may explain the
variations observed in isogenic populations of bacterial cells.
Examples are individual chemotactic responses (4), the distribution of
generation times observed in Escherichia coli cells (5), and
the individual cell responses observed during induction of lactose (6)
and arabinose (7) operons at subsaturating inducer concentrations.
Stochastic effects have also been observed in engineered genetic
networks in E. coli (8). Taking into account the stochastic
phenomena described above, it is clear that the cell must contain
various mechanisms that ensure deterministic regulation of cellular
processes. The check points in eukaryotic cell cycle regulation (9) are
examples of such a mechanism.
Computer simulations implementing various Monte Carlo algorithms proved
to be useful in studying stochastic processes in biochemical reaction
networks. This is especially true for several prokaryotic systems for
which, due to their relative simplicity, a large number of quantitative
measurements are available. Examples are the kinetic analysis of
developmental pathway bifurcation in phage
(2), computer simulation
of the chemotactic signaling pathway (4), and mechanistic modeling of
the all or none effect in lactose operon regulation (10).
The goal of this article is to study the kinetic mechanisms that are
responsible for the stochastic phenomena of gene expression. Because
both RNA and protein elongation rates are approximately equal for all
genes, different levels of gene expression are the result of varying
frequencies of transcription and translation initiations. Therefore, we
examined the influence of transcription and translation initiation
frequencies on the pattern of gene expression. The model of a
prokaryotic gene was built using kinetic parameters of LacZ determined
by Kennell and Riezman (11). The model is consistent with the
experimental results presented in their work. We have systematically
varied the frequencies of transcription and translation initiation
reactions, keeping other parameters of the model constant. Our results
indicate that the genes expressed from strong promoters (high frequency
of transcription initiation) produce proteins with constant velocity
and low variation in the number of molecules. A decrease in the
frequency of transcription initiation causes a noisy pattern of gene
expression, i.e. the protein is produced in bursts at random
time intervals. In contrast, a decrease in the frequency of translation
initiation lowers the speed of protein synthesis but does not lead to
noisy expression patterns.
 |
EXPERIMENTAL PROCEDURES |
Formulation of the Model--
We present the kinetic model of
single prokaryotic gene expression. The transcription and translation
processes are represented by the collection of kinetic equations that
are numerically integrated with the use of the Monte Carlo approach of
Gillespie (12, 13).
We assume that the cells in which the model gene is expressed are in
the exponential growth phase. All of the experimental data cited in our
work concern the cells in this growth phase. Moreover, under these
conditions, the concentrations of free RNA polymerase and ribosomes
available for the single gene are most probably kept constant by the
delicate balance between many coupled reactions (14-16) such as
synthesis of RNA polymerase and ribosomes, binding of RNA polymerase
and ribosomes to other expressing genes, nonspecific binding of RNA
polymerase to DNA, and its idling at pause sites. A detailed model of
all the processes that buffer concentrations of free RNA polymerase and
ribosomes would require taking into account the biosynthesis of all the
components of gene expression machinery and their binding within all
the transcription units in the cell. Most of the parameters of such a
complex model would be unknown and impossible to estimate. Instead, we
included in the model randomly changing pools of the free RNA
polymerase and ribosomes. Because these pools result from many small
contributions of other processes, their distributions should be
Gaussian. The mean values of these distributions have been set close to
the estimations presented by other authors. The fluctuations in the RNA
polymerase and ribosome concentrations described by the standard deviations of the distributions are unknown. Therefore, we checked the
sensitivity of our results to these parameters.
Although many research groups are developing detailed kinetic models of
various elements of the gene expression machinery (e.g.
transcription initiation (17)), the in vivo values of the
rate constants of many elementary reactions remain unknown. In
contrast, many experimental works describe the kinetics of prokaryotic
gene expression (mainly of the LacZ gene in E. coli) in terms of simple effective parameters such as the
frequencies of transcription and translation initiation, the rate of
RNA polymerase and ribosome movement, mRNA half-life, etc. (11,
18). In our model, we adjust unknown values of elementary reactions to
reproduce measured values of transcription and translation inititiation frequencies, mRNA level, and the protein synthesis rate in
vivo. We assume that effective parameters obtained in this way
implicitly include reactions involving nucleotides, translation
initiation factors, bivalent ions, and other compounds necessary for
proper function of the gene expression machinery. The summary of the kinetic model and its parameters for the LacZ gene is given
in Table I. Details of the model and
simulation algorithm are described below.
Model of Transcription and Its Parameters--
Transcription
initiation is represented in our model as a two-step
process:
|
(Eq. 1)
|
|
(Eq. 2)
|
|
(Eq. 3)
|
where P represents the promoter region of the gene,
RNAP represents RNA polymerase, and TrRNAP
represents the transcribing RNA polymerase. Reactions in Equations 1
and 2 describe reversible RNA polymerase binding; the reaction in
Equation 3 represents isomerization of closed binary complex to open
binary complex. The second order rate constant of
RNAP1 binding has been set to
108 M
1
s
1, which is the order of magnitude of RNAP
DNA binding determined in in vitro assays (17). The values
of closed complex to open complex isomerization rates reported in the
literature are of the order of 1 s
1. The rate
of reaction (Equation 3) was set to this value. If the rate constants
of reactions in Equations 1 and 3 are set to the values listed above,
the rate of RNAP dissociation must be of the order of 10 s
1 to reproduce a transcription initiation
frequency close to the value of 0.3 s
1
reported by Kennell and Riezman (11). Moreover, the binding constant resulting from the rates of reactions in Equations 1 and 2
is 107 M
1, which is
in reasonable agreement with values estimated according to in
vitro measurements (17).
To clear the promoter region, active RNA polymerase must move 30 to 60 nucleotides (17). Taking into account that the rate of polymerase
movement is about 40 nucleotides/s, this step takes about 1 s
(17). The length of the mRNA chain that is synthesized during this
time corresponds roughly to the length of the leader region containing
the ribosome binding site (RBS). Therefore, the synthesis of the RBS
and promoter clearance occur at approximately the same rate of 1 s
1. It is irrelevant for the conclusions of
our model whether the RBS appears slightly earlier or later than
promoter clearance. Therefore, we modeled these two processes by the
single first order reaction with a rate constant of 1 s
1:
|
(Eq. 4)
|
where ElRNAP denotes polymerase elongating a given
mRNA molecule. The elongation of the mRNA is not modeled. We
assume that the RNA polymerase completes synthesis of the mRNA
molecule with a rate sufficient to allow for a ribosome movement rate
of 15 amino acids/s.
Translation, mRNA Decay, and Protein Degradation--
In
prokaryote transcription, translation and mRNA degradation are
tightly coupled. Following the experimental results of Yarchuk et
al. (18), we have assumed the following interdependence between translation and mRNA decay: (i) RNase E and ribosomes compete for
the RBS; (ii) if RNase E binds to the RBS faster than the ribosome, it
degrades mRNA in the 5' to 3' direction and does not interfere with
the movement of the ribosomes that have been already bound; and (iii)
every ribosome that successfully binds to the RBS completes translation
of the protein. The above assumptions have been modeled by the
following reactions:
|
(Eq. 5)
|
|
(Eq. 6)
|
|
(Eq. 7)
|
|
(Eq. 8)
|
where Ribosome represents the pool of free ribosomes,
RibRBS represents the ribosome binding site protected by the
bound ribosome, and ElRib denotes the ribosome elongating
protein chain. The reaction in Equation 8 models decay of unprotected
RBS. Parameters of the reactions in Equations 5-8 are more difficult
to estimate than those concerning transcription initiation. The
second order rate constant for ribosome binding was set to
108 M
1
s
1, which is of the order of diffusion
limited macromolecule binding. The rate of the RBS clearance step in
translation initiation (the reaction in Equation 7) was set to 0.5 s
1, which is close to experimental estimates
(19). Parameters of the reactions in Equations 6 and 8 have been
adjusted to reproduce the experimentally known rate of protein
synthesis and ribosome spacing of the LacZ gene. Another
important constraint is provided by experiments of Yarchuk et
al. (18). The authors introduced mutations in the RBS region of
the LacZ gene that resulted in a 200-fold variation in the
expression level. A decrease in the protein synthesis rate has been
followed by a nearly equivalent change in the stationary mRNA
level. This means that for the less effective ribosome binding
sites, the rate of mRNA decay exceeded or at least was no lower
than the transcription initiation frequency. Otherwise, the significant
mRNA level would be observed even in the case of RBS practically
unprotected by ribosomes. We have set the rate of the reaction in
Equation 8 to 0.3 s
1, which is equal to the
transcription initiation frequency. Therefore, the rate of ribosome
dissociation has to be set to 2.25 s
1 to
reproduce the rate of protein synthesis measured by Kenell and Riezman.
The elongation of the protein chain has been modeled as a single first
order reaction:
|
(Eq. 9)
|
We have used the commonly accepted rate of ribosome movement of
15 amino acids/s. Therefore, the rate of the reaction in Equation 9 has been set to 15/L
s
1, where L denotes the length of
protein chain. For the 1024-amino acid-long product of the
LacZ gene, the value of this parameter is 0.015. This
treatment of the elongation reaction is simplified with respect to
other Monte Carlo models of prokaryotic gene expression (1, 20). In
these publications, the addition of a single amino acid by a single
ribosome has been simulated as a separate reaction, and ribosome
overlap has been prevented by excluded volume constraints. It will be
shown in the following sections that despite neglecting fine details of
ribosome movement and using the average rate of protein chain synthesis
instead, the model is able to reproduce the experimental data
concerning expression of the LacZ gene. Simplification of
the elongation reaction made the simulation computationally fast, thus
we could collect good statistics from long timescale trajectories for a
large number of parameter combinations.
Protein degradation has been simulated by the following
reaction:
|
(Eq. 10)
|
Although protein degradation rates are variable, most of the
native proteins have half-lives of the order of hours (21). In our
simulations, we have applied the value of 3 h that has been
measured for wild type
-galactosidase in E. coli cells
(22).
RNA Polymerase and Ribosome Pools--
The numbers of available
RNAP and ribosome molecules were generated from the Gaussian
distribution. In the case of RNAP, the mean of the distribution was set
to 35 molecules. This value is close to estimations made by other
authors according to experimental data (14, 15, 17). Because the number
of ribosome molecules present in the cell is approximately an order of
magnitude larger than the number of RNAP molecules, the mean of
the ribosome pool distribution has been set to 350.
We are not able to estimate the S.D. of the above-mentioned pools. We
have repeated all the experiments with the standard deviations equal to
10% and 50% of the means to study the influence of the fluctuations
in RNAP and ribosome pools on the gene expression pattern. In the
following sections, results will be presented for the standard
deviations of the pools equal to 10% of the means, unless it is stated
explicitly that standard deviations are set to 50% of the means.
Simulation Protocol--
To compute the time evolution of the
system described above, we have used the Monte Carlo algorithm of
Gillespie (12). This method has a sound physical basis and has already
been proven useful in the simulation of prokaryotic gene expression (1, 2). For the convenience of the reader, a brief description of the
Gillespie algorithm is given below. The rigorous derivation of this
computational protocol can be found elsewhere (12, 13).
Gillespie's algorithm is formulated to numerically study systems
composed of a large number of chemical species whose interactions are
described by rate equations. At each step of simulation, two random
numbers are generated. The first one is used to determine which
reaction in the system will happen; the second one determines the
waiting time for this event. The probability of a given reaction to be
chosen is proportional to the product of its stochastic rate constant
and numbers of substrate molecules. The distribution of the waiting
times is given by the following equation:
|
(Eq. 11)
|
where
|
(Eq. 12)
|
P(
, µ) is the probability that the waiting time
for the reaction is
, provided that reaction µ was chosen to
happen; hµ is the product of the numbers of
substrate molecules in reaction µ; and cµ is
the stochastic rate constant of this reaction. After the reaction and
its waiting time are determined, one elementary reaction is executed
(e.g. if the reaction is A + B
AB, one A molecule and one
B molecule will be removed from the system, and one AB molecule will be
added). The time of the simulation is then increased by
, and the
next simulation step is executed.
The stochastic rate constant of the reaction is defined as the
probability that the elementary reaction will happen in the infinitesimally short time interval. It can be calculated from the rate
constant using simple relations. For the first order reactions, both
constants are equal. For the second order reaction, the stochastic rate
constant is equal to the rate constant divided by the volume of the system.
It has been rigorously proven that for an infinitely large number of
molecules, the Markov Chain generated by the Monte Carlo scheme given
above converges to the same trajectory of the system as the
deterministic integration of differential equations (12, 13). The
advantage of the Gillespie algorithm over the differential equation
approach is that it also remains exact for arbitrary low numbers of
molecules in the system. This is extremely important in the simulation
of biochemical systems, where some processes involve single molecules
(e.g. one molecule of promoter in transcription initiation).
Randomly fluctuating pools of available ribosomes and RNA polymerase
have been modeled in the following way. Before executing every step of
the Gillespie algorithm, the number of RNAP molecules and ribosomes has
been generated from Gaussian distributions with the means and standard
deviations defined in the previous section. Then, the values of
aµ (see Equation 12) for every reaction were calculated, and the step of the Gillespie algorithm has been executed. Generation of the number of substrate molecules from the
appropriate probability distribution has already been applied by Arkin
et al. (2) in the Gillespie algorithm simulation to model
rapid equilibrium phenomena in regulatory protein binding.
We have applied the Gillespie algorithm to the model described in the
previous sections. At the beginning of the simulation, only a single
promoter (P) was present in the system. The numbers of all
other molecules, except random pools of RNAP and ribosomes, were set to
0. Simulation was pursued until it reached the generation time of
E. coli, and then cell division was simulated,
i.e. simulation was continued with the system containing one
promoter molecule and all other numbers of molecules divided by 2. The
generation time was set to 35 min. We have neglected the fact that
during DNA replication, two copies of the bacterial gene can be
expressed. Linear volume change of the growing cell has been assumed.
Before every step of the Gillespie algorithm, stochastic rate constants of second order reactions have been recomputed by dividing the appropriate rate constants by the current volume of the cell.
Simulation of several generations is important in the simulation of
constitutive gene expression. As shown below, after starting our
simulation with 0 protein molecules, the level of gene expression equilibrates after a few generations. We have found that 10 generations are sufficient to equilibrate the system. For each simulation, we
computed 100 independent trajectories of the system covering 10 generations each.
We have used our own implementation of the Gillespie algorithm. The
software has been tested against examples given in the original work
(12).
 |
RESULTS |
Validation of the Model--
To check the validity of the model,
we performed the simulation with the parameters describing the
LacZ gene. Fig. 1A
presents the gene expression pattern spanning 10 bacterial generations. Standard deviation of RNAP and ribosome pools has been set to 3.5 and
35, respectively (10% of the mean values). To compare these data with
the results of Kennell and Riezman (11), we have analyzed the
changes in the number of protein molecules in the first generation
because the experimental data covered the time scale of about 30 min
after induction. The transcription initiation frequency obtained
in this simulation was equal to 0.258 s
1,
close to that of 0.3 s
1 reported by Kennell
and Riezman for the LacZ gene. As seen in Fig.
1B, after about 500 s, the number of protein molecules
begins to grow linearly with time. The rate of protein synthesis, which was determined as the slope of this part of the curve, is 26 s
1, which is in good agreement with the value
of 20 s
1 reported by Kennell and
Riezman (11).

View larger version (30K):
[in this window]
[in a new window]
|
Fig. 1.
Simulation of LacZ gene
expression. A, number of protein molecules as a
function of time. The results for 100 trajectories recorded over 10 bacterial generations are shown (generation time, 2100 s). After
every generation, the number of protein and mRNA molecules was
divided by 2. B, the number of protein molecules recorded in
the first generation. In every 10-s time interval, the number of
protein molecules was averaged over 100 trajectories, and the S.D. was
computed. Data on the plot are mean (average trajectory) and ± 3 values for each time interval. Linear function is fitted to all
data points on the average trajectory recorded for times exceeding
500 s. C, average and ± 3 trajectories for
the number of mRNA molecules in the first generation. D,
average and ± 3 trajectories for the number of ribosomes
moving on LacZ mRNA.
|
|
The average number of mRNA molecules present in the cell was
determined from the data shown in Fig. 1C. After about
1000 s, the number of mRNA molecules reached a stationary
level. The mean number of molecules in this part of the curve is
56 ± 4. This result is in good agreement with the value of 62 LacZ mRNA molecules/cell measured by Kennell and Riezman
(11).
Fig. 1D shows the number of ribosomes translating the LacZ
mRNA. The average number of molecules present after 1000 s is
1838 ± 73. Therefore, according to our simulation, 1 molecule of
LacZ mRNA is occupied, on average, by 32 ribosomes. This yields a
ribosome spacing of 96 nucleotides (the length of Z message is 3074 bp according to the GenBankTM entry). This number is in
agreement with the experimental value of 110 nucleotides
reported by Kennell and Riezman (11).
As will be shown in the next sections, when the rate of ribosome
dissociation is increased in our simulation, the model predicts that
the mRNA level will decrease. This behavior is consistent with the
experimental data of Yarchuk et al. (18).
We have repeated the simulation above with the S.D. of RNAP and
ribosome pools set to 50% of the mean values. There are only minor
differences with respect to the results reported above. The rate of
protein synthesis is 24 s
1; the mRNA
level is 54 ± 4 molecules; the number of translating ribosomes is
1756 ± 43; and there is a ribosome spacing of 95 nucleotides.
We conclude that the results of our simulations are consistent with the
experimental data concerning expression of the LacZ gene. In
the following sections, we will show predictions of the model for the
genes with frequencies of transcription and translation initiations
different from those of the LacZ gene.
Effect of Varying Transcription Initiation Frequency--
Promoter
efficiency can be decreased by two factors: a low binding constant of
RNA polymerase within the promoter region and/or a low rate of closed
complex isomerization. We have performed two sets of simulations in
which transcription initiation frequency has been lowered in both ways.
Low RNAP specificity was simulated by increasing the RNAP dissociation
rate. We assume that RNAP binding is limited by
three-dimensional diffusion in solution or by
one-dimensional diffusion along DNA molecule, so the
promoter sequence influences only the dissociation rate and not the
binding rate. The simulations have been performed for the following
RNAP dissociation rates: 10, 100, 1000, and 10000 s
1. This resulted in transcription initiation
frequencies of 0.258, 0.052, 0.0054, and 0.0005 s
1. These values cover the entire range of
transcription initiation frequencies (1 s
1 to
10
4 s
1) observed
thus far in prokaryotic systems. All other parameters of the model and
simulation protocol have been kept constant and set to the values used
for the simulation of the LacZ gene.
Fig. 2 shows the results of the
calculations. The plots display the values recorded in the 10th
generation (time range, 18,900-21,000 s) when the expression pattern
of the gene is equilibrated. In the case of strong promoters (Fig. 2,
A and B), all 100 trajectories shown are similar.
In both cases, the number of protein molecules grows linearly, with
smaller differences between trajectories observed in the case of
stronger promoter. The expression pattern in the case of the smallest
frequency of transcription initiation analyzed (Fig. 2D) is
qualitatively different. The protein is produced in pulses rather than
evenly, as in the cases described above. In some trajectories, the
number of protein molecules does not grow at all in the 10th
generation, and the plots indicate only slow decay due to the protein
degradation. In other cases, the gene is expressed for some time,
resulting in a sudden "burst" in the amount of protein. The time
intervals between these events are random. The gene expression pattern
for the transcription initiation frequency 0.0054 s
1 (Fig. 2C) is the intermediate
case with respect to the two extremes already described.

View larger version (52K):
[in this window]
[in a new window]
|
Fig. 2.
Expression patterns for various transcription
initiation frequencies. On each plot, 100 trajectories recorded in
the 10th generation are plotted with red lines. Average
and ± 3 trajectories are plotted with black
lines. In D, the 3 trajectory is not shown
because it would have negative numbers of molecules. A-D
present results for transcription initiation frequencies of 0.258, 0.052, 0.0054, and 0.0005 s 1,
respectively.
|
|
To express the above observations quantitatively, the following
analysis was performed. For each 10-s time interval in the whole
trajectory of the simulation (10 generations; time range, 0-21,000 s),
the mean and S.D. of the number of protein molecules were calculated.
The variation coefficient for each time interval was then calculated as
the ratio of the S.D. and the corresponding mean value. The values
plotted as a function of time are presented on Fig.
3A. The plot shows that the
relative fluctuations of gene expression are much larger in the 1st
generation of the cells than in subsequent ones. After ~2.5
generations, the curves converge to the values 0.03, 0.08, 0.26, and
0.84 for transcription initiation frequencies 0.258, 0.052, 0.0054, and
0.0005 s
1, respectively. Therefore, in the
case of weakly expressed genes, the fluctuations in the number of
protein molecules converge to about 84% of the mean value, whereas for
very strongly expressed genes, these fluctuations do not exceed a few
percent.

View larger version (32K):
[in this window]
[in a new window]
|
Fig. 3.
Variation coefficients of the number of
protein molecules calculated for different transcription initiation
frequencies. For every simulation, the average number of protein
molecules and its S.D. in 100 trajectories were computed in each 10-s
time interval. Data on plots are S.D./mean ratios for every time
interval. Plots span the time scale of 10 generations. A,
transcription initiation frequency (TIF) lowered by increasing RNAP
dissociation. Curves I, II, III, and IV
correspond to a TIF of 0.258, 0.052, 0.0054, and 0.0005 s 1, respectively. B, TIF lowered
by increasing closed complex isomerization rate. Curves I, II,
III, and IV correspond to a TIF of 0.258, 0.035, 0.0035, and 0.0001 s 1. C and
D show results for the simulations in which the S.D. of RNAP
and ribosome pools has been changed from 10% to 50% of the mean
values. C, TIF changed by RNAP dissociation rate. Curves
I, II, III, and IV correspond to TIF values of
0.259, 0.051, 0.0051, and 0.0003 s 1.
D, TIF changed by closed complex isomerization rate. Curves
I, II, III, and IV correspond to TIF values of
0.259, 0.034, 0.0034, and 0.0001 s 1.
Numbers in parentheses represent the values to which the
curves converge.
|
|
In the next set of simulations, promoter efficiency has been lowered by
setting closed complex isomerization rates (the reaction in Equation 3)
to 1, 0.1, 0.01, and 0.001 s
1. The resulting
transcription initiation frequencies were 0.258, 0.035, 0.0035, and
0.0001 s
1. Gene expression patterns obtained
in these simulations do not differ qualitatively from those presented
in Fig. 2. Fig. 3B shows the plots of variation coefficient
versus time.
To examine the influence of the fluctuations in the pools of available
RNAP and ribosomes, we repeated all of the above experiments with the
standard deviations of these pools set to 50% of mean values. Results
do not differ qualitatively from the previous ones. Fig. 3,
C and D, shows the plots of variation
coefficients versus time for these calculations.
Effects of Varying Translation Initiation Frequency--
We
performed simulations in which translation initiation frequency was
lowered by increasing ribosome dissociation. As in the case of RNA
polymerase, we have assumed that ribosome binding is diffusion limited
and that changes in ribosome binding site affect the dissociation rate
rather than the association rate. Simulations have been
executed for the following values of ribosome dissociation rate: 2.25, 22.5, 225, and 2250 s
1. The protein levels
obtained in these calculations covered the 200-fold range observed by
Yarchuk et al. (18) for RBS sites engineered to have various
translation initiation rates. mRNA levels recorded in the
simulations are shown on Fig. 4. The
number of mRNA molecules is lowered by the increase of the ribosome
dissociation rate. These results are in agreement with the experimental
observations of Yarchuk et al. (18). Therefore, the results
shown in Fig. 4 strongly support our treatment of the translation
initiation kinetics.

View larger version (30K):
[in this window]
[in a new window]
|
Fig. 4.
Number of mRNA molecules calculated for
different ribosome dissociation rates. A: I,
ribosome dissociation rate of 2.25 s 1;
II, ribosome dissociation rate of 22.5 s 1. B, ribosome dissociation rate
of 2250 s 1. The results for a ribosome
dissociation rate of 225 s 1 are not shown
because the curve would overlap with the curve shown in
B.
|
|
The gene expression patterns obtained in the simulations described
above are shown in Fig. 5. The protein
levels cover a range similar to that in the simulations with
varying frequency of transcription initiation. The striking difference
is that even for the lowest rate of protein synthesis, the protein
molecules are still produced evenly, rather than in pulses. Variation
coefficients calculated as described in the previous section are shown
in Fig. 6A. As one can see,
the shape of the curves is similar to those calculated for various
transcription initiation frequencies, but the final levels are
significantly lower. Fluctuations in the number of protein molecules
are higher for the lower rates of ribosome binding, but they do not
exceed 15%.

View larger version (25K):
[in this window]
[in a new window]
|
Fig. 5.
Expression patterns for various ribosome
dissociation rates. On each plot, 100 trajectories recorded in the
10th generation are plotted with red lines. Average ± 3 trajectories are plotted with black lines.
A: I, ribosome dissociation rate of 2.25 s 1; II, ribosome dissociation rate
of 22.5 s 1. B: I,
ribosome dissociation rate of 225 s 1;
II, ribosome dissociation rate of 2250 s 1.
|
|

View larger version (21K):
[in this window]
[in a new window]
|
Fig. 6.
Variation coefficients of the number of
protein molecules calculated for various ribosome
dissociation rates. For every simulation, the average number of
protein molecules and its S.D. in 100 trajectories were computed for
each 10-s time interval. Data on the plot are S.D./mean ratios for
every time interval. Plots span the time scale of 10 generations.
Curves I, II, III, and IV show the results for
ribosome dissociation rates of 2.25, 22.5, 225, and 2250 s 1. Numbers in parentheses denote
the values to which the curves converge. A, results for the
experiments in which the standard deviations of RNAP and ribosome pools
were set to 10% of the mean value. B, results for the
experiments in which the standard deviations of RNAP and ribosome pools
were set to 50% of mean value.
|
|
We have repeated the above calculations with the standard deviations of
the number of available RNAP and ribosome molecules set to 50% of the
means. Results do not differ qualitatively. Variation coefficients of
the number of protein molecules are plotted on Fig. 6B.
The decrease of RBS effectiveness followed by the decrease of mRNA
level, as has been observed by Yarchuk et al. (18), implies that the strength of the ribosome binding site is governed by ribosome
binding/dissociation rather than by following the RBS clearance
reaction (the reaction in Equation 7). If the ribosomes were stalled at
RBS, they would protect mRNA from RNase E binding, and the
low protein level would not be followed by the decrease in the mRNA
level. Despite this fact, we have performed simulations in which
frequency of translation initiation has been lowered by decreasing the
rate of the reaction in Equation 7 (data not shown). Results do not
differ significantly from the ones shown in Figs. 4-6.
 |
DISCUSSION |
The kinetic model of gene expression presented above was tested
against the experimental data concerning the LacZ gene in E. coli. The rate of protein synthesis, mRNA level, and
ribosome spacing were in reasonable agreement with the experimental
observations. Moreover, the changes in mRNA level due to the
variation in ribosome binding rate were correctly reproduced. The
model, when applied to study the influence of transcription and
translation initiation frequencies on the gene expression pattern,
shows a significant, qualitative difference between these two cases.
When the level of gene expression is lowered by decreasing promoter
strength, the decrease in the number of protein molecules is
accompanied by large fluctuations reaching 100% of the mean value. In
contrast, low expression of the gene, without large fluctuations in the number of protein molecules, can be maintained by decreasing the efficiency of the ribosome binding site, i.e. lowering the
rate of ribosome binding. For protein levels comparable to those
corresponding to the smallest rate of transcription initiation
frequency analyzed, the fluctuations do not exceed 15%.
According to our simulations, the model is not sensitive to the
magnitude of the fluctuations of the pools of available RNA polymerase
and ribosomes. The simulations repeated with variation coefficients of
these pools equal to 10% and 50% resulted in good agreement with
experimental data. The main conclusions of our work concerning the
fluctuations in gene expression patterns are also unaffected by the
variation coefficients of RNAP and ribosome pools. The influence of the
variation coefficient of the pools is limited to a minor change in the
exact values of standard deviations of the number of protein molecules.
According to our model, the qualitative difference between the
expression patterns of genes with low frequency of transcription initiation and low frequency of translation initiation is caused by the
following mechanism. The transcription initiation event is always
controlled by a single promoter molecule, whereas in the case of
translation initiation, many mRNA molecules recruiting ribosomes or
RNases are involved. Therefore, in this case, fluctuations are lowered
by averaging over the number of ribosome binding sites taking part in
the reaction.
Although variable frequencies of the translation initiation have been
observed in bacterial cells, gene expression is more commonly regulated
on the promoter level. This seems to contradict the results of our
model, which suggest that large fluctuations in the number of protein
molecules result from maintenance of low expression levels by low
frequency of transcription initiation. These fluctuations are
especially important in the context of the synthesis of regulatory
proteins. Regulatory proteins are expressed in low amounts, and random
changes in their concentration could affect regulated genes in an
unpredictable way. This raises the question of why the low frequency of
transcription initiation is the more frequent mechanism of maintaining
low expression of these proteins. To answer this question, one
needs to consider, at least semiquantitatively, the kinetics of
protein-DNA binding. Kinetic studies of Lac repressor binding to DNA
revealed that this protein finds its target operator site with a very
high rate of about 7 × 109
M
1 s
1
(23). To appreciate the consequences of the high value of this second
order rate constant, one can express it as the rate with which a single
repressor molecule binds its operator site in the volume of the cell
(10
15 L). The result, 12 s
1, is an order of magnitude higher than the
largest transcription initiation rates observed. This means that
binding of even a single repressor molecule to its operator site would
be still faster than the rate of RNA polymerase recruitment and
activation. Thus, should the fluctuations in the number of protein
molecules be comparable to those presented in Fig. 2D, the
number of repressor molecules present in most cells would be sufficient
for rapid binding of the target gene, and the fluctuations would not
influence the gene regulation process. The situation can be more
complex if the gene is influenced by several regulatory proteins, as
happens in the complex life cycles of bacteriophages. In this case,
the random changes in relative amounts of these proteins can trigger bifurcations in developmental pathways as shown by Arkin et
al. (2).
We conclude that the low level of proteins regulating bacterial operons
can be maintained by low transcription initiation frequency because
regulatory proteins bind to DNA sites with very high efficiency,
exceeding the diffusion limit. In contrast, keeping the low expression
level by inefficient ribosome binding is not optimal from the energetic
point of view because mRNA molecules have to be unnecessarily
synthesized. Therefore, expression of most regulatory proteins from
weak promoters is an evolutionarily preferred strategy despite the
large random fluctuations in the number of protein molecules.
It is also worth commenting on the difference between constitutive and
induced gene expression as implied by our model. The curves in Figs. 3
and 6 indicate that the relative magnitude of random variation is
significantly higher in the bacterial generation in which the gene has
been induced than in the subsequent generations. Therefore, according
to the model, constitutively expressed genes show a lower level of
fluctuations than induced genes. On the other hand, many induced genes
that code for enzymes are expressed to very high levels, so that the
fluctuations in the number of protein molecules quickly fall to very
low values (see Fig. 3A).
We have presented a kinetic model of gene expression that is able to
reproduce a variety of experimental observations concerning the model
prokaryotic gene, LacZ. We believe that the results obtained by the
application of the model provide insights into the origins of the
stochastic patterns of prokaryotic gene expression. These effects may
be of interest in both efforts aimed at understanding the molecular
details of gene regulation and biotechnology, where optimization of
protein overexpression in bacteria is one of the primary goals.