(Received for publication, November 21, 1995; and in revised form, January 25, 1996)
The cI gene of coliphage 186 maintains lysogeny and confers immunity to 186 infection by repressing the major early promoter, pR, and the promoter for the late transcription activator gene, pB. Gel mobility shift and DNase I footprinting show that CI protein binds to the DNA at pR and pB and also to sites 300 base pairs upstream and downstream of pR, called FL and FR. Mutations which cause virulence reduce CI binding to pR. The biochemical and genetic data identify three CI operators at pR, two at pB, and single operators at FL and FR. The operators at the pB, FL, FR, and central pR sites are inverted repeat sequences, separated by 5 base pairs (Type A) or, in the case of pR, by 4 base pairs (Type A′). A different inverted repeat operator sequence (Type B) is proposed for the binding sites on each side of the central site at pR. Thus, CI appears to recognize two distinct DNA sequences. CI binds cooperatively to adjacent operators, and binding at pR is strongly dependent on these cooperative interactions. A high order CI multimer appears to be the active DNA binding species, even at single operators.
Coliphage 186 shares with the well characterized phage λ the ability to achieve a lysogenic state that is extremely stable yet which can efficiently switch to the lytic state in response to activation of the host SOS system (1). Elucidation of the gene control mechanisms and strategies used in λ's genetic switch (2, 3) has been of profound value in molecular biology and has informed thinking about the ways in which higher organisms utilize alternative stable developmental states. 186 is a member of the P2 phage family (4) and, since it shows very little similarity at the DNA or protein sequence level to λ, appears to represent a relatively independent solution to the requirements of such a genetic switch.
The 186 cI gene is the central player in the maintenance of the stable lysogenic state (5). The cI gene product represses two promoters: pR, the promoter for the early lytic operon, including the apl, cII, and replicase genes, and pB, the promoter for the late promoter activator gene, B (6, 7; see Fig. 1). CI thus, directly or indirectly, represses all the lytic genes of the prophage and also blocks lytic development of 186 phage that infect the lysogen (8). Expression of cI is maintained during lysogeny by transcription from the pL promoter. The leftward pL promoter and the rightward pR promoter are arranged face-to-face, with their transcripts overlapping by 62 bases (6). This is quite different from the analogous PRM and PR promoters of λ, which are arranged back-to-back.
Figure 1: CI binding sites in the 186 early control region. Map of the 186 genome from the PstI site at 65.5% to the BglII site at 79.6% (14, 25), showing the location of the four CI binding regions, pB, FL, pR, and FR, identified by gel mobility shift studies (Fig. 2). Genes are indicated by gray boxes: B, late promoter activator (26); 69, unknown function; int, integrase (14); cI, immunity repressor; apl, excisionase and transcriptional control (6, 11); cII, establishment of lysogeny (5); dhr, inhibitor of host replication; fil, inhibitor of cell division (27). The promoters pB, pL, and pR are denoted by solid arrowheads, and their transcripts by arrows. Terminators are shown as stem loops. The phage attachment site attP is shown.
Figure 2: Binding of CI to pB, FL, pR, and FR sites by gel mobility shift assay. CI gel mobility shift assays (see "Materials and Methods") were carried out with gel-purified, radiolabeled DNA fragments prepared by PCR from plasmid clones using primers USP and RSP, at least one of which was 5′-labeled with polynucleotide kinase and [γ-32P]ATP (see "Materials and Methods"). The pB, FL, pR, and FR DNA fragments were derived from pEC627, pEC629, pEC625, and pJC251, respectively. The nonbinding control fragment was a similarly labeled fragment amplified from the 186 tum gene (primers 9 and 87). The CI concentrations (nM) are given above each track. The gel origin is at the top of the gel image.
The processes of establishment, stable maintenance, and efficient breakdown of the lysogenic transcriptional state in 186 display some unusual features, and characterization of the activity of the CI protein is needed to understand these processes. During lysogeny, the face-to-face promoter arrangement would seem to create difficulties for CI in maintaining repression of pR. If the CI protein binds at pR, then RNA polymerase from pL must pass through the CI-pR complex in order to transcribe the cI gene. This passage of RNA polymerase seems likely to remove CI from the DNA and thus make pR accessible for RNA polymerase binding. Whether this situation is a problem and, if so, what special mechanisms are used to cope with it, are questions that are relevant to the important topic of how mobile protein-DNA complexes, such as polymerases, interact with each other (9) and with static protein-DNA complexes, such as repressors and nucleosomes (10). The strategy for establishment of lysogeny in 186 appears very similar to that of λ, with lysogenic transcription requiring CI and the initial production of CI being dependent on another phage protein, CII (5). However, the activation of lysogenic transcription by 186 CI seems to be indirect, with CI repression of pR removing an inhibition of pL caused by converging transcription from pR (6). Efficient breakdown of CI repression during SOS induction of the prophage is initiated not by RecA, as in λ, but by a phage protein, Tum, that antagonizes CI repression of pR and pB (1). Efficient derepression may also require repression of cI transcription from pL by the Apl protein, which binds between pR and pL (11). Interactions between Apl and CI at pR are likely to be critical in the operation of the lysis-lysogeny switch.
To investigate CI repression, Lamont et al. (8) isolated and examined a number of virulent (vir) mutants of 186. These are phage mutants that are insensitive to lysogenic immunity and are able to develop lytically in a 186 lysogen. It was expected that these mutants would carry mutations at pR which interfered with CI repression. Indeed, in all 19 vir mutants, mutations were found within the -49 to -3 region of pR. These mutations are clustered into three sites. All of the vir mutants carry at least one mutation in the central site (Site I), and most carry additional changes in the leftmost site (Site II). Two mutants have changes in the rightmost site (Site III), with one of these also altered at Site II. The three mutants with changes at Site I only are poorly virulent, forming plaques with low efficiency (10-15%) on lysogens, and not forming plaques on a strain carrying the cI gene on a multicopy plasmid. Mutants carrying additional changes at Sites II or III plate with higher efficiency on lysogens (22-52%), and most are able to form plaques on the strain with the cI plasmid. It was expected that these mutations disrupt CI operators and thus act by reducing CI binding to pR. However, no DNA sequence element that was conserved between the three sites and which could serve as a likely CI operator sequence could be found (8).
The experiments reported here were designed to: (i) confirm that 186 CI is a sequence-specific DNA-binding protein, (ii) identify the CI binding regions in the early control portion of the 186 genome, (iii) confirm that the vir mutations reduce CI binding at pR, and (iv) identify a likely CI recognition sequence. In the course of this work, a number of other aspects of DNA binding by CI became apparent.
The following oligodeoxynucleotide primers (Bresatec) were used. Any 186 sequence is underlined, with the first and last 186 sequence positions given.
USP (universal sequencing primer: M13 -20 primer), GTAAAACGACGGCCAGT; RSP (M13 reverse sequencing primer), CACACAGGAAACAGCTATGACCATG; Primer 34 (2660-2689), ACATCCACGTTGCTCCATCCTAAAGAATCT; Primer 68 (2907-2894), CCGCCCAAGCTTGAATTCAGAAACACCCTCAA; Primer 71 (2653-2670), CGGACCAAGCTTGACGTCAAGTACATCCACGTT; Primer 89 (3087-3070), TCCCCGCGGTACGAGAGCACCGATGACGAGTTG; Primer 87 (tum left), CGTAGTGGAGGTCATATGGATAGAGAGCT; Primer 9 (tum right), CACTTCCTGAAGGATGC.
Figure 3: DNase I footprinting with CI at pR. The CI DNase I footprint on the top strand (the 186 l-strand) of the pR binding site is shown. The interpretation of footprint data for both strands of all the CI binding sites is given in Fig. 4. The rightmost four tracks show the DNA fragment treated with DNase I in the presence of the indicated concentrations of CI (nM). DNA fragments were prepared by PCRs in which one of the primers was radiolabeled at its 5′ end (see Fig. 2 legend). The sequencing marker tracks (C, G, T) were prepared by dideoxy chain termination sequencing, using the same radiolabeled primer and dsDNA template with Sequenase version 2.0 (United States Biochemical Corp.). The numbering of the sequence positions is explained in Fig. 4. The solid bracket to the right of each gel image indicates the dense CI-affected region, with the dashed brackets showing the extended portions of the footprint (see text). The asterisk indicates the running position of a small amount of undenatured DNA probe (determined by a gel lane with DNA untreated with DNase I).
The CI binding reactions were made by mixing 5 µl of dilutions of CI (Affi-Gel Blue fraction) in 50 mM Tris-HCl, pH 8, 250 mM NaCl, 1 mM EDTA, 10% glycerol with 45 µl of the radiolabeled DNA probe (final concentration in the binding reaction 0.5-1 nM) in 10 mM Tris-HCl, pH 7.5, 1 mM EDTA, 0.1 mM dithiothreitol, 5% glycerol, 50 µg/µl bovine serum albumin. The reactions were incubated at room temperature for 15 min before the addition of 5 µl of 0.2 µg/ml DNase I in 100 mM KCl, 100 mM MgCl2, 20 mM CaCl2, 10 mM Tris-HCl, pH 8.0, 2 mM dithiothreitol. After 2 min at room temperature, the reaction was stopped by the addition of 50 µl of 50 mM EDTA, 1% SDS, 20 µg/ml glycogen, 600 mM sodium acetate, and 200 µl of ethanol. The DNA was ethanol-precipitated, phenol/chloroform-extracted, and reprecipitated. The pellet was dissolved in 5 µl of formamide loading buffer (60% formamide, 12 mM EDTA, 0.03% bromphenol blue, 0.03% xylene cyanol), heated to 95 °C, chilled, and loaded onto a 6% sequencing gel. Following electrophoresis, the gel was dried onto Whatman 3MM paper under vacuum. The bands were visualized and manipulated as described for the gel mobility shift gel (see "Materials and Methods"). In some of the gel images, the sequence markers have been contrast-adjusted differently from the footprint tracks.
Figure 4: Summary of DNase I footprinting with CI. The effects of CI on DNase I cleavage at the pR (a), pB (b), FL (c), and FR (d) sites were judged from various footprint experiments, including that shown in Fig. 3. Protected bonds are indicated by vertical lines; bonds whose cleavage was enhanced by CI are indicated by arrows. The lengths of these lines or arrows represent the sensitivity of the effect, that is, the CI concentration at which the effect was apparent (the long, intermediate, and short lines correspond to the effects being apparent at 130 nM, 260 nM, and 520 nM, respectively). The small open circles denote bonds that remained DNase I-accessible in the presence of CI. In some cases, bonds were protected but remained somewhat accessible; these bonds are indicated by a line capped by a small open circle. The effects were judged by visual inspection of phosphorimages at high magnification; these effects are not always apparent from the printed images of Fig. 3. No effects were seen outside the regions shown. The brackets above and below each sequence indicate the dense portion of the footprint believed to contain the major binding determinants, with the shaded arrows between the strands showing the putative CI operator sequences (see text). The FL, pR, and FR sequences are numbered from the pR start site (186 position 2747; (6)). The pB sequence is numbered from the pB start site (186 position 270; (28)). The -35, -10, and +1 promoter sequences (including the -10 sequence of pL) are shown in bold, as is the stop codon of the apl gene.
Fig. 2 shows gel shift assays with CI purified to at least 98% homogeneity (16) and with DNA fragments containing each of the four binding regions. In each case, there was a single major retarded species at each concentration tested, indicating that only one major complex was formed at each concentration. The mobility of the retarded species was similar for the different fragments, implying binding of a similar number of protein subunits in each case. A minor, less retarded species was seen in some experiments (see the FR fragment in Fig. 2). The lack of binding to the control DNA fragment in Fig. 2 showed that CI binding to the pB, FL, pR, and FR sites was sequence-specific.
The gel retardations showed an unusual effect that we are not able to explain: the mobility of the retarded species decreased in small steps with increasing CI concentration (Fig. 2). This effect was seen with all CI binding DNA fragments and occurred over increments of CI concentration even smaller than shown in Fig. 2. The phenomenon was not affected by the order in which samples were prepared or loaded. It is unlikely that the effect is caused by a nonspecific DNA-binding contaminant in the CI preparation because the binding is both sequence-specific, as shown with the control DNA fragment (Fig. 2), and CI-specific, as crude cell extracts not containing CI show no such binding to these fragments (data not shown) (29). Presumably, the decreasing mobility reflects increasing numbers of CI subunits per complex. However, if this were the case, one would expect both a greater magnitude of shift and the appearance of multiple bands in at least some tracks instead of the single band seen.
At each site there was a particularly dense region of relatively strong protections (indicated by the brackets in Fig. 3 and Fig. 4). Within this region there were a few positions that were still sensitive or showed an enhanced sensitivity to DNase attack in the presence of CI.
Extending beyond these dense footprint regions, usually on one side only, was a less dense region of enhancements and weaker protections that tended to be interspersed with positions where DNase I cleavage was not affected by CI. Most of the effects in this region were apparent only at higher CI concentrations. This extended portion of the footprint was often quite large; in the case of the pR site, this portion of the footprint was almost 60 bp long and reached the -14 position of pR. The less dense region effects were not intensified in footprints with crude CI extracts (29). The accessible and protected positions in the extended regions tended to show a 9-11-bp periodicity, reminiscent of the "phasing" effect seen with the HK022 CI protein, in which protein bound at specific sites nucleates, by cooperative interactions, binding to nonspecific sites (17, 18). The extended footprint region may be involved in some way in the "stepping" effect seen with the gel shifts.
The footprint at the pB site contained an interesting feature. The dense region of the footprint was divided on each strand by a group of four consecutive strong enhancements, with the enhancements on one strand being offset in the 3′ direction from those on the other. Since DNase I cuts in the minor groove and since the nearest phosphates across the minor groove are offset 3 bp in the 3′ direction (19), these enhancements indicate that the minor groove in one region of the DNA helix was made particularly DNase-susceptible by CI. Such sensitivity is often found when the minor groove lies on the outside of a DNA bend (20).
There are two copies of this inverted repeat element at the pB site and single copies at FL and FR. The alignment of the eight half-sites is shown in Fig. 5 and yields the half-site conservation aTTCaC (base frequencies are given in Fig. 5), with a 5-bp A/T-rich spacer. The formal consensus recognition sequence of 17 bases is MTTCWCWWWWWGWGAAK (W = A or T, M = A or C, K = G or T).
Figure 5: Proposed CI operators at the pB, FL, and FR sites. The DNA half-sites of the inverted-repeat sequences proposed as CI operators at the pB, FL, and FR sites (see Fig. 4) are aligned. The diamond shows the center of symmetry of the sequences. The small triangles indicate bonds which remain DNase-accessible (unfilled) or whose DNase cleavage is enhanced (filled) in the presence of CI (Fig. 4); upward pointing triangles indicate the bond on the complementary strand. Conserved bases and their frequencies are indicated below the sequences. To quantitate the discriminative power of the putative CI binding sequence (see text), a full-length consensus sequence MTTCWCWWWWWGWGAAK (M = A or C, K = T or G, W = A or T) was derived from the alignment. We scored the degree of match of each of the four putative binding sequences with the consensus sequence by assigning a match at a nonredundant consensus position (A, C, G, or T) a score of 1 and a match at a redundant position (W, M, or K) a score of 0.5. With this scheme, each binding sequence scores at least 12 points out of a maximum of 12.5. A computer program was written to scan random DNA sequence for matches to consensus sequences; this showed that a score of 12 or more on this consensus occurred by chance less than once per 1000 kb.
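The scoring scheme described in the Fig. 5 legend can be sketched in a few lines of Python. This is only an illustration and not the program used in the study: the function name `consensus_score` and the example sites are hypothetical, and only the consensus string and the 1 / 0.5 scoring rule are taken from the legend.

```python
# IUPAC codes needed for the 17-base consensus (W = A or T, M = A or C, K = G or T).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "M": "AC", "K": "GT",
}

CONSENSUS = "MTTCWCWWWWWGWGAAK"  # consensus quoted in the Fig. 5 legend


def consensus_score(site: str, consensus: str = CONSENSUS) -> float:
    """Score a site against the consensus.

    A match at a nonredundant position (A, C, G, or T) scores 1;
    a match at a redundant position (W, M, or K) scores 0.5.
    """
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    score = 0.0
    for base, code in zip(site.upper(), consensus):
        allowed = IUPAC[code]
        if base in allowed:
            score += 1.0 if len(allowed) == 1 else 0.5
    return score


# Hypothetical example sites (not sequences from the 186 genome).
perfect_site = "ATTCACAAAAAGAGAAT"      # matches every consensus position
one_redundant_miss = "GTTCACAAAAAGAGAAT"  # G does not match M at position 1

perfect = consensus_score(perfect_site)
near_perfect = consensus_score(one_redundant_miss)
```

With this rule a site matching every position reaches the stated maximum of 12.5, and a single mismatch at a redundant position still leaves a score of 12, the threshold used in the legend.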
A sequence-specific DNA-binding protein must use the information in its DNA recognition sequence to specify binding to its functional binding sites and to minimize binding to nonfunctional sites. Therefore, an important test for a proposed recognition sequence is to show that it can provide for discrimination between known binding sites and known nonbinding sites. We quantitated the discriminatory power of this recognition sequence by determining (i) how similar each of the four sequences at pB, FL, and FR is to the consensus, and (ii) how frequently sequences with this degree of similarity to the consensus arise in random DNA sequences (described in the legend to Fig. 5). We found that such sequences occurred less than once per 1000 kb of random DNA sequence, showing that the consensus is highly discriminative and emphasizing its legitimacy as a recognition sequence of the CI protein. By comparison, the λ CI operators are less well conserved. Using a consensus sequence for the λ CI operators, sequences similar to the best scoring λ operators occurred less than once per 1000 kb. However, some of the λ operators scored quite poorly, with similar matches occurring every 10 kb of random sequence.
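The chance frequency quoted above was obtained by scanning random DNA sequence with a computer program. As a rough cross-check, the same quantity can be computed exactly under an assumption made here purely for illustration: that the four bases are equally likely and independent at every position. The helper name `chance_probability` is hypothetical, and the scoring rule is the one given in the Fig. 5 legend.

```python
from collections import defaultdict

IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "AT", "M": "AC", "K": "GT"}
CONSENSUS = "MTTCWCWWWWWGWGAAK"


def chance_probability(consensus: str, threshold: float) -> float:
    """Probability that a uniformly random site scores >= threshold.

    Scores are tracked in half-points so that they stay integers:
    a nonredundant match adds 2 half-points, a redundant match adds 1.
    """
    dist = {0: 1.0}  # half-point score -> probability
    for code in consensus:
        n_allowed = len(IUPAC[code])
        p_match = n_allowed / 4.0           # assumed uniform base composition
        gain = 2 if n_allowed == 1 else 1
        updated = defaultdict(float)
        for score, prob in dist.items():
            updated[score + gain] += prob * p_match
            updated[score] += prob * (1.0 - p_match)
        dist = dict(updated)
    cutoff = round(2 * threshold)
    return sum(prob for score, prob in dist.items() if score >= cutoff)


p_site = chance_probability(CONSENSUS, 12.0)
# The consensus is its own reverse complement, so one strand suffices;
# 1000 kb contains roughly one million candidate start positions.
expected_per_1000_kb = p_site * 1_000_000
```

Under these assumptions the expected count is about 0.3 matches per 1000 kb, which is in line with the reported figure of less than once per 1000 kb.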
The location of the eight half-sites relative to the footprints is indicated in Fig. 4. The position of the sequences at pB, FL, and FR is consistent with the DNase I footprints in three ways, further strengthening their assignment as recognition sequences for CI. (i) The sequences lie totally within the dense portion of the footprints. (ii) The two operators at pB are positioned symmetrically on either side of the strong enhancements in the center of the pB footprint and are on the same face of the DNA helix as each other (having a center-to-center distance of 31 bp). The DNase-sensitive minor groove between these sites lies on the opposite face of the helix. Thus, the hypersensitivity of the DNA in the center of the pB footprint may be explained by DNA distortion, perhaps bending, induced by interactions between CI molecules bound to adjacent sites. (iii) There is a tendency for certain bonds in approximately equivalent positions in the half-sites of the sequences to remain exposed to DNase or to show enhanced cleavage in the presence of CI (Fig. 4; indicated by triangles in Fig. 5), suggesting that these sequences interact similarly with CI. The relationship between these cleavages on the two strands indicates sensitivity of the minor groove at a single region in each operator arm.
In order to further examine CI binding at pR, we first narrowed down the location of the binding determinants, since the extended CI footprint at the pR site was very large, extending from positions -78 to +76. The dense portion of the footprint was considerably smaller, from -53 to +14, and contains the loci of the vir mutants. To test the idea that all of the CI binding determinants at pR are contained within the dense footprint region, we examined CI binding to a pR DNA fragment from which most of the 186 sequences around the dense footprint region were removed.
Using the gel shift assay, we compared CI binding to a DNA fragment containing the -58 to +14 sequence of pR, which carries little more than the dense footprint region (the minimal pR fragment), with binding to a DNA fragment containing the -81 to +126 sequence, which covers the entire CI footprint at the pR site (large pR fragment). One result is shown in Fig. 6, top. The amounts of bound and unbound DNA in the gel were quantitated and graphed in Fig. 6, bottom. This shows that the affinity of CI for the pR site was not affected by the replacement of sequences from the extended footprint region with non-186 DNA. The apparent dissociation constant for CI to pR, Kd, was obtained as the CI concentration at which half the DNA was bound. This was 30 nM for the large pR fragment and 31 nM for the minimal pR fragment. Ratios close to one for the relative binding strengths of the two fragments were found in three experiments (average Kd minimal/Kd large = 0.97 ± 0.05).
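The apparent Kd is defined above as the CI concentration at which half the DNA is bound. One simple way to read that value off a titration is to interpolate between the two lanes that bracket 50% binding. The sketch below assumes interpolation on a logarithmic concentration scale, which is a choice made here and not a method stated in the text. The concentrations are those listed in the Fig. 6 legend, but the fraction-bound values are hypothetical, generated from a simple c / (c + 30) curve for demonstration.

```python
import math


def half_bound_concentration(concs, fractions):
    """Return the concentration at which the fraction bound crosses 0.5.

    Interpolates linearly in log(concentration) between the two
    neighbouring points that bracket 0.5. Zero concentrations are skipped
    because their logarithm is undefined.
    """
    points = [(c, f) for c, f in zip(concs, fractions) if c > 0]
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo <= 0.5 <= f_hi and f_hi > f_lo:
            t = (0.5 - f_lo) / (f_hi - f_lo)
            log_c = math.log(c_lo) + t * (math.log(c_hi) - math.log(c_lo))
            return math.exp(log_c)
    raise ValueError("fraction bound does not cross 0.5 in this titration")


# CI concentrations (nM) from the Fig. 6 legend.
concs_nM = [0, 2, 6, 18, 54, 128, 384, 1152]
# Hypothetical fractions bound, NOT measured data.
fractions = [c / (c + 30.0) for c in concs_nM]

kd_estimate_nM = half_bound_concentration(concs_nM, fractions)
```

For this synthetic curve the estimate falls within about 1 nM of the value of 30 nM used to generate it, which gives a feel for the resolution available from a threefold dilution series.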
Figure 6: CI binding determinants at the pR site lie within the dense footprint region. Gel mobility shift assays (see "Materials and Methods") were carried out to compare the affinity of CI for a DNA fragment carrying pR sequence from -81 to +126, containing the entire CI footprint region (large pR fragment), with a fragment carrying pR sequence from -58 to +14, containing only the dense footprint region (minimal pR fragment). Both fragments were prepared by PCR (see "Materials and Methods"): the large fragment was obtained from pEC631 with primers 35 and USP, the minimal fragment was from pEC627 with primers RSP and USP (USP labeled in both cases). The CI concentrations were 0, 2, 6, 18, 54, 128, 384, and 1152 nM. The graph shows the fraction of DNA that was bound by CI, obtained from quantitation of the phosphorimage and calculated as 1 - (unbound DNA/total DNA) with a correction factor subtracted to give a fraction DNA bound value of 0 in the absence of CI.
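The fraction-bound calculation in the legend can be written out explicitly. In this sketch the correction factor is taken to be the apparent fraction bound in the lane without CI, which is one reading of the legend and should be treated as an assumption. The band intensities are hypothetical numbers in arbitrary units, and the function name is illustrative.

```python
def fraction_bound(unbound, total, unbound_no_ci, total_no_ci):
    """Fraction of DNA bound, corrected so the no-CI lane gives exactly 0.

    raw        = 1 - (unbound DNA / total DNA) for the lane of interest
    correction = the same quantity measured in the lane without CI
    """
    raw = 1.0 - unbound / total
    correction = 1.0 - unbound_no_ci / total_no_ci
    return raw - correction


# Hypothetical (unbound, total) band intensities; the first lane has no CI.
lanes = [(95.0, 100.0), (80.0, 100.0), (50.0, 100.0), (10.0, 100.0)]
unbound0, total0 = lanes[0]

corrected = [fraction_bound(u, t, unbound0, total0) for u, t in lanes]
```

Subtracting the no-CI value removes background signal in the bound region of the lane, so that the titration curve starts at zero.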
Figure 7: A pB-FL-FR-like sequence with altered half-site spacing at Site I. The sequence conservations in the operators proposed for CI binding at the pB, FL, and FR sites (Fig. 5) are aligned with the half-sites of an inverted-repeat sequence at Site I in pR. The diamond shows the center of symmetry of the sequences; note that the half-site spacings for the pB-FL-FR sequences are different from the Site I sequence. The changes found in the left and right half-site sequences in the vir mutants (8) are indicated, with the subscripts denoting the frequency of each change. The asterisk indicates a mutation that does not reduce the match to the consensus. We derived from these five sequences the Type A sequence consensus: TCWCWW(W)WWGWGA (W = A or T; the W in parentheses is optional, reflecting the alternative spacings). In a search of 1000 kb of random DNA sequence, matches as good as or better than the poorest match among the pB, FL, or FR sequences to the longer consensus (Type A) were found 11 times (see Fig. 5 legend) and matches as good as or better than the match of the Site I sequence to the shorter consensus (Type A′) occurred 23 times. Thus, pB-FL-FR-like or Site I-like sequences arose by chance once per 29 kb.
The sequence conservations between Site I and the pB-FL-FR sequences provide sufficient discriminatory power to account for the binding seen to these sequences and the lack of binding elsewhere on the 186 genome. A consensus TCWCWW(W)WWGWGA (W = A or T) was derived from the five sequences. The central W is optional, reflecting the alternative spacings. Sequences matching the 5-bp-spacer consensus at least as well as the pB-FL-FR sequences do, or the 4-bp-spacer consensus at least as well as the Site I sequence does, occurred only once every 29 kb of random DNA sequence. No other strong matches to this consensus occur in the 186 early control region. We term the pB, FL, and FR recognition sequence Type A and the Site I recognition sequence Type A′ to indicate its different half-site spacing.
All of the 19 vir mutants carry a change at Site I, and there are in total 29 base changes at this site (8). This large body of mutational data provides a stringent test of the proposed recognition sequence. The very location of the A′-type sequence at Site I supports its role in CI recognition. However, the fit between this sequence and the vir mutations is much more extensive than co-location and not only provides very strong evidence that it is a CI recognition sequence but also indicates those bases that are critical for binding. Firstly, the mutations lie in both arms of the sequence (Fig. 7), showing the importance of both half-sites, a feature expected for a symmetrical binding sequence. Secondly, in all of the 19 vir mutants, the match to the consensus sequence is worsened. Of the 29 mutations occurring at Site I, 27 involve one of the three fully conserved positions in the half-sites. Two mutations lie at less conserved positions, one slightly improving the match to the consensus. However, both these mutations occur in combination with changes at the fully conserved positions. There is therefore a remarkable agreement between the vir mutations and the consensus sequence. The mutations indicate that the two fully conserved C residues within the half-sites are critical for CI binding, since every one of the 19 vir mutants carries a change at one of these positions.
The Type A recognition sequence cannot explain CI binding in the left (Site II) and right (Site III) regions of pR as we were unable to find such sequences in these regions, even when other alternative half-site spacings were tested. We had begun with the assumption that CI is able to recognize only one type of sequence. However, some DNA-binding proteins, for example, the integrase proteins of λ and other phages, have two independent DNA binding domains and are able to recognize distinct DNA sequences (22). We therefore examined the sequences at Sites II and III for alternative recognition elements. An element that occurs at Site II and at Site III and which is a likely candidate for a second type of CI recognition sequence is shown in Fig. 8. The element is again an inverted repeat sequence with an A/T-rich spacer; however, the half-site sequences are quite different from the A-type sequences. The consensus sequence TNGRYWWWRYCNA (W = A or T, R = A or G, Y = C or T, N = any base) was derived from the alignment. The discriminatory power provided by this sequence is strong; matches to the consensus as good as or better than those obtained by the Site II and III sequences arose only once per 67 kb of random DNA sequence. We term this second proposed CI recognition sequence Type B. No significant matches to this consensus were found elsewhere in the 186 early control region.
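Each proposed consensus is described as an inverted repeat, which means the degenerate pattern should read the same on both strands. A quick way to check this is to take the reverse complement using the standard complements of the IUPAC ambiguity codes (W and N pair with themselves, R with Y, M with K). The sketch below does so for the consensus strings quoted in the text; the function names and dictionary labels are illustrative.

```python
# Complements of the IUPAC nucleotide codes used in the consensus sequences.
COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "W": "W", "S": "S", "N": "N",
    "R": "Y", "Y": "R", "M": "K", "K": "M",
}


def reverse_complement(pattern: str) -> str:
    """Reverse complement of a sequence or degenerate IUPAC pattern."""
    return "".join(COMPLEMENT[code] for code in reversed(pattern.upper()))


def is_inverted_repeat(pattern: str) -> bool:
    """True if the pattern is identical to its own reverse complement."""
    return reverse_complement(pattern) == pattern.upper()


# Consensus strings quoted in the text; the optional central W of
# TCWCWW(W)WWGWGA is written out for the 5-bp and 4-bp spacer forms.
CONSENSUS_SEQUENCES = {
    "Type A, 17 bases": "MTTCWCWWWWWGWGAAK",
    "Type A, 5-bp spacer": "TCWCWWWWWGWGA",
    "Type A', 4-bp spacer": "TCWCWWWWGWGA",
    "Type B": "TNGRYWWWRYCNA",
}

symmetry = {name: is_inverted_repeat(seq) for name, seq in CONSENSUS_SEQUENCES.items()}
```

All four patterns come out as self-complementary, which is consistent with a symmetric protein contacting the two half-sites in the same way.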
Figure 8: A second putative CI recognition sequence at Sites II and III. The half-site sequences of the operators proposed for CI binding at vir Sites II and III (Fig. 4) are aligned. The diamond shows the center of symmetry of the sequences. The small triangles indicate bonds which remain DNase I-accessible (unfilled) or whose DNase cleavage is enhanced (filled) in the presence of CI (Fig. 4); upward pointing triangles indicate the bond on the complementary strand. Conserved bases and their frequencies are indicated below the sequences. The changes found in the left and right half-site sequences in the vir mutants (8) are indicated, with the subscripts denoting the frequency of each change. The asterisks indicate mutations that do not reduce the match to the consensus. One mutation, lying just left of the Site II sequence, is not shown. The consensus Type B sequence: TNGRYWWWRYCNA (W = A or T, R = A or G, Y = C or T, N = any base) was derived from the alignment. The match of the Site II or III sequences to this consensus was found to arise by chance once per 67 kb.
Again, strong evidence for these Type B sequences being CI binding determinants is provided by the vir mutations (Fig. 8). There are 16 vir mutants with changes at Site II or Site III, involving a total of 21 mutations. Of these, 18 reduce the match to the consensus. (Three mutations do not alter the match to the consensus; however, these mutations occur in combination with other changes that do disrupt the consensus). Nineteen of the twenty-one mutations occur in Site II and are distributed over both arms of sequence symmetry, with 11 mutations involving the same fully conserved position. Only two mutations have been isolated at Site III. One of these (in vir121) reduces the match to the consensus. The other (in vir100) does not reduce the match to the consensus. However, although it is clear that the Site III mutation in vir121 increases virulence (8) and reduces CI binding to pR (see below), it is not clear that the Site III mutation in vir100 does so. The vir100 mutant, unlike vir121, also carries a change at Site II and its virulence is no different from other Site II mutants(8) , so the vir100 change in Site III may have no effect on CI binding.
Further evidence for these recognition sequences at Sites II and III is apparent in the DNase footprint. There is a pattern of CI-dependent DNase I enhancements that is very well conserved between all four half-sites (Fig. 4; marked by filled triangles in Fig. 8). These enhancements indicate minor groove sensitivity at the same position in each half-site and argue that CI interacts in a similar fashion with the four sequences.
Two further observations support our three-site model (B-A′-B) at pR. Firstly, the location of the three sequences correlates very closely to the dense footprint region (see Fig. 4). Secondly, the three sequences all lie on the same face of the DNA helix. The center-to-center spacing is 21.5 bp between the Site II and Site I sequences (B-A′) and 20.5 bp between the Site I and Site III sequences (A′-B). Binding to the same face of the DNA helix is a feature of binding at the pB site and is also consistent with the interactions seen between CI bound at pR, as described below.
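The statement that the operators lie on the same face of the helix can be illustrated by converting the center-to-center spacings into helical turns. The sketch below assumes a helical repeat of 10.5 bp per turn for B-form DNA, a commonly used value that is not given in the text. The spacings are those reported for pR (21.5 and 20.5 bp) and pB (31 bp); the function name is illustrative.

```python
BP_PER_TURN = 10.5  # assumed helical repeat of B-form DNA


def helical_offset_degrees(spacing_bp: float, bp_per_turn: float = BP_PER_TURN) -> float:
    """Angular offset between two sites, relative to the nearest whole turn.

    An offset near 0 degrees puts the sites on the same face of the helix;
    an offset near 180 degrees puts them on opposite faces.
    """
    turns = spacing_bp / bp_per_turn
    fractional_turn = turns - round(turns)
    return fractional_turn * 360.0


# Center-to-center spacings (bp) quoted in the text.
spacings = {
    "pR Site II to Site I": 21.5,
    "pR Site I to Site III": 20.5,
    "pB operator to operator": 31.0,
}

offsets = {name: helical_offset_degrees(bp) for name, bp in spacings.items()}
```

Under this assumption each spacing lies within about 17 degrees of a whole number of turns (two turns for the pR pairs, three for pB), whereas sites on opposite faces would differ by roughly 180 degrees.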
Figure 9: Binding of CI to pR DNA from virulent mutants. A, gel mobility shift assay (see "Materials and Methods") comparison of affinity of CI for DNA fragments carrying vir mutations. DNA fragments were prepared by PCR (see "Materials and Methods") from 186, 186del1vir122, 186cI vir97, and 186del1vir121 phage, using primers 71 and 68. From a number of experiments, the average Kd values (determined as in Fig. 6) for the wild-type, vir122, vir97, and vir121 fragments were 36, 102, 295, and 228 nM, respectively. B, DNase I protection study of CI binding to DNA fragments carrying vir mutations. DNA fragments were prepared as in A, except that primers 34 (labeled) and 89 were used. The data are for the top strand. The procedure was as described in the Fig. 3 legend, except that the reactions were stopped by extraction with phenol saturated with 10 mM Tris-HCl, 50 mM EDTA, and the DNA was ethanol-precipitated only once. To the left of the figure are indicated the pR coordinate of each band, the location of the predicted operators at pR (vertical lines), and the positions of the mutations used (arrowheads). To the right of each set of tracks, CI-affected cleavages are indicated by dots. C, sequence of the densely CI-protected region of the pR site, showing the mutations carried by vir122, vir97, and vir121. The underlined bases are the positions at which vir mutations occur. The predicted CI operator sequences are shown by converging arrows; the -10 and -35 hexamers of pR are in boldface.
Gel shift studies with these fragments showed that all three mutant fragments bound CI more weakly than wild-type (Fig. 9A). Furthermore, each of the three mutations weakened CI binding. From the K values (Fig. 9A, legend), mutation at Site I reduced binding 2.8-fold and further mutation at Site II or Site III reduced binding an additional 2.9- and 2.3-fold, respectively. These effects on CI binding correlate with the degree of virulence shown by the mutants, weak for vir122 and strong for vir97 and vir121 (8), and correlate with the number of intact CI operators in these fragments. Although we have tested only three vir mutants, it is now reasonable to assume that CI binding is weakened in all of the vir mutants.
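The fold reductions quoted above follow from ratios of the average K values given in the Fig. 9 legend. The sketch below repeats that arithmetic. It assumes that a larger K means weaker binding and that the vir97 and vir121 fragments are compared against vir122, which carries the Site I mutation alone; the variable and function names are illustrative.

```python
# Average K values (nM) as listed in the Fig. 9 legend.
k_values_nM = {"wild-type": 36, "vir122": 102, "vir97": 295, "vir121": 228}


def fold_reduction(k_mutant, k_reference):
    """Fold weakening of binding, taken as the ratio of the two K values
    (assumes a larger K corresponds to weaker binding)."""
    return k_mutant / k_reference


# Site I mutation alone, relative to wild-type.
site_I = fold_reduction(k_values_nM["vir122"], k_values_nM["wild-type"])
# Additional effect of a Site II or Site III mutation, relative to vir122.
site_II = fold_reduction(k_values_nM["vir97"], k_values_nM["vir122"])
site_III = fold_reduction(k_values_nM["vir121"], k_values_nM["vir122"])

print(f"Site I:   {site_I:.2f}-fold")
print(f"Site II:  {site_II:.2f}-fold (additional)")
print(f"Site III: {site_III:.2f}-fold (additional)")
```

Using the rounded legend values, the Site III ratio comes out nearer 2.2 than the quoted 2.3. Presumably this small difference reflects rounding of the reported averages.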
We noted previously that CI binding to pR, pB, FL, and FR DNA fragments produced, at each CI concentration, a single retarded species of similar mobility (Fig. 2), despite the fact that each fragment contains a different number of CI operators. The identical gel shift patterns seen with pR fragments carrying one, two, or three intact operators confirm this result. Thus, at any one CI concentration, it appears that a similar number of CI subunits is binding to the DNA whether this DNA carries one, two, or three operators.
Fig. 9B shows the DNase I protection results with the vir mutant fragments at CI concentrations at which there was very little CI binding in the extended footprint region. We found no evidence for independent binding to individual operators at the pR site. Instead, each mutation affected binding not only to its own site but to the whole of the dense footprint region. With the wild-type fragment, cleavage at almost every position was strongly protected or enhanced by CI at 210 nM (indicated by dots to the right of the gel lanes). Most effects were also visible at 70 nM. The Site I mutation (vir122) strongly reduced binding to the central Site I region but also to the whole dense footprint area, with a subset of the wild-type effects remaining at 210 nM only. Further mutation at Site II or Site III (vir97 and vir121) eliminated all CI binding at these concentrations, despite the fact that there was an intact Type B site on these fragments. Again, the mutation at one site affected binding at the others: mutation at Site II removed the weak remaining binding at Sites I and III; mutation at Site III removed the weak remaining binding at Sites I and II.
These results indicate a high degree of cooperativity in CI binding at pR, that is, the favorable interactions between bound CI subunits are strong compared with the interactions between the subunits and the DNA. Thus, the strong CI binding to pR appears to be a result of strong cooperativity between CI protomers at relatively weak DNA binding sites. Strong cooperativity in CI binding is supported by the gel shift results with pB and pR fragments. Noncooperative or weakly cooperative binding of a protein to fragments containing multiple operators should give rise, at any one protein concentration, to multiple retarded species, with the species differing in the number of protein subunits bound. The lack of such species (Fig. 2 and Fig. 9A), therefore, argues for strong cooperativity in CI binding to adjacent operators.
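The expectation that weak cooperativity produces intermediate species can be illustrated with a textbook two-site binding model. The sketch below is not fitted to our data. It assumes two identical sites with an intrinsic association constant, one protein unit bound per site, and a cooperativity factor omega that multiplies the weight of the doubly occupied state; all names and numerical values are hypothetical.

```python
def species_fractions(conc, k_assoc, omega):
    """Fractions of free, singly bound, and doubly bound DNA for a fragment
    with two identical sites (simple statistical-weight model).

    conc    -- free protein concentration (arbitrary units)
    k_assoc -- intrinsic association constant per site (1 / conc units)
    omega   -- cooperativity factor (1 means independent binding)
    """
    x = k_assoc * conc
    weights = {
        "free": 1.0,
        "single": 2.0 * x,         # either of the two sites occupied
        "double": omega * x * x,   # both sites occupied
    }
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Compare both cases at the point where free and doubly bound DNA are equal.
for omega in (1.0, 1e4):
    conc = 1.0 / omega ** 0.5  # with k_assoc = 1, omega * x**2 = 1 here
    fractions = species_fractions(conc, k_assoc=1.0, omega=omega)
    print(f"omega = {omega:g}: singly bound fraction = {fractions['single']:.3f}")
```

In this model, independent binding (omega = 1) leaves half of the DNA in the singly bound state at the midpoint of the titration, whereas strong cooperativity makes that intermediate almost undetectable, so that only one retarded species would be expected.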
Gel shift and DNase I footprinting studies have confirmed that the 186 CI repressor is a sequence-specific DNA-binding protein and have identified four binding regions in the early control region of the phage genome. CI binds to the lytic promoters pR and pB that it represses and also binds to sites at the -330 and +350 positions of pR (FL and FR). The biological role, if any, of these flanking sites is not yet known. The CI binding region at pR is distinct from that of the Cro-like Apl protein, although there is some overlap between the two regions (11). We showed that the CI binding determinants at pR are located between the -58 and +14 positions of the promoter and confirmed for three vir mutants that the mutations at pR carried by these mutants reduced CI binding.
DNA recognition by CI is unusual. There appear to be two distinct recognition sequences, Type A and B. Furthermore, CI appears to be able to recognize a Type A half-site spacing variant, A`. This gives the structure AA-A-BA`B-A for CI binding to the pB-FL-pR-FR region. Three types of evidence provide strong support for our operator assignments. (i) There is good sequence conservation between the A/A`-type sequences and between the B-type sequences, with very strong conservation between the A-type sequences at pB, FL, and FR. (ii) The operators are consistent with the DNase I protection data, including many of the fine details of the footprints. (iii) The proposed operator sequences at the pR site can explain the large body of genetic data provided by the vir mutations. The A` sequence (Site I) is disrupted in all 19 of the vir mutants, invariably involving one of the fully conserved bases. Fifteen of these mutants also carry a disruption of the left B-type sequence (Site II), and one mutant carries a disruption of the right B-type sequence (Site III), with the most conserved bases often involved. The number of intact operators correlates with the strength of binding of CI to pR in the three mutants tested and with the sensitivity to immunity of all the vir mutants (8).
A number of proteins are known to recognize two different sequences or different half-site spacings. For example, the λ integrase protein is able to recognize distinct ``core'' and ``arm'' type DNA sequences, utilizing different regions of the protein (22), a property that seems to be general to the large family of ``complex'' integrases (see, for example, (23)). An example of recognition of alternative half-site spacings is provided by the AraC protein, which has a flexible linker between the dimerization domain and the DNA binding domain of the protein (21). Further work is planned to define the spacing requirements within and between CI recognition sequences and to identify the DNA-binding regions of the protein.
Certain features of CI binding can be deduced from the data. In some of the A-type sites and in both the B-type sites, the minor groove in each arm of the operator remains sensitive or becomes hypersensitive to DNase I in the presence of CI (see Fig. 5 and Fig. 8). Thus, the contacts made by CI with the bases in these operator arms must occur via the major groove. The inverted repeat nature of the operators and the spacing of the exposed bonds (9-11 bp at Type A sites and 8 bp at Type B sites) indicate that a rotationally symmetrical CI dimer is recognizing successive major grooves that are close to being on one face of the DNA helix.
Strong cooperative binding of CI to adjacent operator sequences was indicated by (i) the presence of only a single retarded species in gel shift experiments with fragments carrying multiple operators, and (ii) the finding that vir mutations at pR weakened binding to nonmutated operators. Cooperative binding is not surprising, as CI has been shown to exist in a monomer-dimer-tetramer-octamer equilibrium in solution (16) with interaction energies very similar to those of λ CI (24).
Although the data of Shearwin and Egan (16) show that CI in solution is predominantly dimeric at the concentrations used in our studies, the gel shift experiments suggest that a higher-order CI multimer is the active binding species, since the mobility of the major retarded species was very similar for DNA fragments with differing numbers of operators. This multimer seems able to occupy DNA containing up to two operators of any one type (two A-type operators at pB or two B-type operators at pR). Assuming that each operator is contacted by a CI dimer, then this species must be at least a tetramer. Studies of CI-DNA stoichiometries at the different operators are planned.
Once we have suitably characterized the relationship of Type A, A`, and B sequences to CI binding, we will be in a position to investigate the possible interaction between CI bound at FL, FR, and pR and its biological significance.