(Received for publication, December 3, 1996, and in revised form, April 11, 1997)
From the Institut de Génétique et Microbiologie, Université Paris-Sud, URA CNRS D 2225, Bâtiment 409, Centre Universitaire d'Orsay, F-91405 Orsay Cedex, France
AlcR is the transactivator mediating
transcriptional induction of the alc gene cluster in
Aspergillus nidulans. The AlcR DNA-binding domain consists
of a zinc binuclear cluster different from the other members of the
Zn2Cys6 family by several features. In
particular, it is able to bind to symmetric and asymmetric sites with
the same affinity, with both sites being functional in A. nidulans. Here, we show that unlike the other proteins of the
Zn2Cys6 binuclear cluster family, AlcR binds
most probably as a monomer to its cognate targets. Two molecules of the
AlcR protein can simultaneously bind in a noncooperative manner to
inverted repeats. The consensus core has been determined precisely
(5′-CCGCN-3′), and the AlcR-binding site in the aldA
promoter has been localized. The sequence downstream of the zinc
cluster is necessary for high affinity binding. Furthermore, our data
show that the use of the carrier protein glutathione S-transferase in AlcR binding experiments introduces an
important bias in the recognition of DNA sites due to its quaternary
dimeric structure.
The Aspergillus nidulans activator AlcR is a member of the DNA-binding protein family whose DNA-binding domain contains a highly conserved zinc binuclear cluster (1, 2). The proteins in this class, such as GAL4 (3), PPR1 (4), and HAP1 (5) in Saccharomyces cerevisiae and UaY (6), PrnA (7), NirA (8), and AlcR (1, 2) in A. nidulans, are transcriptional activators that control a wide variety of metabolic pathways.
Among the Zn2Cys6 zinc binuclear cluster proteins, some have been characterized both biochemically and structurally by analysis of their three-dimensional conformations. Most of them, such as GAL4, PPR1, and UaY, bind to symmetric DNA sites as dimers through their coiled-coil dimerization element. HAP1 recognizes asymmetric sites (9), and it has been shown recently that the Zn2Cys6 zinc cluster is responsible for asymmetric binding, with the coiled-coil region stabilizing the complex (10). However, other proteins of this family, e.g. ARGR2 and MAL63 (11), have been suggested to function as monomers.
AlcR, the specific transactivator of the alc cluster
involved in ethanol utilization and in other related carbon metabolic pathways (12, 13), appears to be different from the other members of
the Zn2Cys6 family. It contains in its
DNA-binding domain, between the third and fourth cysteines, an unusual
extended sequence of 16 residues instead of the six to eight usually
found, and no predicted dimerization regions were found downstream of the zinc cluster. Furthermore, unlike the other members of this family,
AlcR appears to bind with the same affinity to both symmetric and
asymmetric sites containing the consensus motif 5′-CCGCA-3′
(14, 15).
Both types of targets localized in the alcR and alcA promoters have been shown to be functional in
vivo (15, 16).
The AlcR-binding sites have been determined previously using a bacterial expression system with a glutathione S-transferase (GST) fusion protein (GST-AlcR-(7-60)) (14, 15). Bacterially expressed fusion proteins are widely used to identify specific DNA sequences that are recognized by regulators. Among them, GST presents several advantages since it provides high yields of protein that can be easily purified to homogeneity. Furthermore, cleavage by thrombin releases the DNA-binding protein (17). The GST fusion protein system was successfully used to determine the DNA-binding sites for a number of proteins such as SWI5 in yeast (18), N-Myc in mouse (19), and T/E1A in human (20). In A. nidulans, binding sites for the CreA repressor (16) and for the NirA (21) and UaY (6) activators have been localized using GST fusion proteins. These binding sites have been shown to be functional in vivo, as have the AlcR-binding sites.
In this report, we have compared the binding specificities of the GST-AlcR-(1-60) fusion protein and a longer AlcR-(1-197) protein, tagged at its carboxyl terminus with six histidine residues. We demonstrate that the use of GST introduces an important bias in the recognition of DNA sites as the result of its quaternary dimeric structure. It prevented the identification of an AlcR-binding site in the aldA promoter that is now established. It hindered the important observation that the AlcR protein binds presumably as a monomer to DNA, unlike the other proteins of the Zn2Cys6 binuclear cluster family. Two molecules of the AlcR protein can simultaneously bind in a noncooperative manner to symmetric sites, whereas only one molecule occupies a direct repeat site. Finally, we show also that the sequence downstream of the zinc cluster is necessary for high affinity binding.
To construct the GST-AlcR fusion expression vector, a DNA fragment encompassing the AlcR DNA-binding domain (amino acids 1-60) was amplified by polymerase chain reaction and cloned in frame with the GST gene in the pGEX-2T plasmid (Pharmacia Biotech Inc.). The protein was expressed from the tac promoter and purified as described earlier (17, 14). The protein obtained was 80-90% pure as judged by SDS gel electrophoresis using Coomassie Blue staining.
The AlcR peptide was separated from GST by cleavage with thrombin (1 unit/mg of fusion protein; Sigma) in the presence of 2.5 mM CaCl2 for 10 min at 30 °C. The mixture was loaded onto a Resource S high pressure liquid chromatography column (6 ml; Pharmacia Biotech Inc.) pre-equilibrated in 10 mM phosphate (pH 7.2) and 0.1 M NaCl. AlcR was eluted using an increasing gradient of 1 M NH4Cl, 10 mM phosphate (pH 7.2), and 0.1 M NaCl. The eluted AlcR peptide was then passed through a p-aminobenzamidine column (Sigma) to remove contaminating thrombin, and pH was adjusted to 6.0.
A plasmid expressing AlcR-(1-197) tagged with His6 at its
C terminus was constructed by cloning the
NcoI-BamHI fragment into the pET-22b vector
(QIAGEN Inc.). The NcoI site was introduced into the ATG
codon during polymerase chain reaction amplification. Escherichia
coli BL21(DE3) cells bearing the expression plasmid were grown at
37 °C to A600 = 0.6. After 3 h of
induction with 1 mM
isopropyl-β-D-thiogalactopyranoside in the presence of 20 µM ZnCl2, the cells were harvested by
centrifugation and resuspended in 50 mM sodium phosphate
buffer (pH 7.9) containing 0.3 M NaCl, 5 mM
β-mercaptoethanol, and 20 µM ZnCl2. After
sonication, the AlcR protein was partially purified on a
Ni2+/nitrilotriacetic acid-agarose column according to the
recommendations of the supplier (QIAGEN Inc.) using a stepwise gradient
of imidazole. The fraction eluted at 40 mM imidazole was
directly used for electrophoretic mobility shift assays (EMSAs). The
purity of the His-tagged AlcR protein was estimated to be 15% by SDS
gel electrophoresis. It migrated as a 33-kDa polypeptide. Background
contamination arose from E. coli proteins bound
nonspecifically to Ni2+/nitrilotriacetic acid-agarose
rather than from products of AlcR-(1-197)-His6 degradation
since even at high protein concentration, no extra DNA complex was
observed in gel band shift experiments (see Figs. 3 and 4).
Electrophoretic Mobility Shift Assays
The DNA sequences of
the oligonucleotide probes used in EMSAs are listed in Table I. DNA
binding shift assays were carried out as described previously (14) with
several modifications. They were performed in 20 µl of reaction
mixture containing 25 mM Tris-HCl (pH 8.0), 100 mM KCl, 4 mM spermidine, 1 mM
dithiothreitol, 10 µM ZnCl2, 2 µg of
(dI-dC)n, 5% glycerol, 20-40 fmol of radiolabeled DNA probe,
and increasing amounts of AlcR protein. After 20 min of incubation at
room temperature, samples were loaded onto 6% polyacrylamide gels and
run in 0.25 × TBE buffer (Tris borate/EDTA) at 4 °C and 18 V/cm for 45 min. 10% polyacrylamide gels were utilized for EMSA with
the AlcR-(1-60) peptide to increase the separation between the probe
and the complex. Double-strand oligonucleotides containing a specific
target for AlcR were labeled with T4 polynucleotide kinase (New England
Biolabs Inc.) and [γ-32P]ATP (3000 Ci/mmol; Amersham
Corp.). However, single-strand oligonucleotides present in the same
mixture were also end-labeled, which results in two different types of
probes on the EMSA gels. Single-strand oligonucleotides migrated faster
than double-strand oligonucleotides (see Figs. 2 and 4); therefore, to
calculate the AlcR relative affinity, only the double-strand probe had
to be taken into account. The relative apparent affinity of AlcR was
defined from gel shift reactions as the concentration of AlcR protein
required to bind half of double-strand DNA. Quantification was
performed on a PhosphorImager (Molecular Dynamics, Inc.).
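The half-saturation definition above can be illustrated with a short Python sketch. It is not the authors' quantification procedure: the function names, the log-linear interpolation between titration points, and any numbers passed in are assumptions made for illustration. The sketch converts counts for the bound and free double-strand probe into a fraction bound and interpolates the protein concentration at which that fraction reaches 0.5.

```python
import math


def probe_concentration_nM(probe_fmol, volume_ul):
    """Probe concentration in nM (1 fmol/µl equals 1 nM)."""
    return probe_fmol / volume_ul


def fraction_bound(bound_counts, free_counts):
    """Fraction of double-strand probe found in protein-DNA complexes.

    Counts from the single-strand probe band are assumed to be excluded
    before calling this function.
    """
    return bound_counts / (bound_counts + free_counts)


def half_saturation(protein_concs, fractions):
    """Protein concentration at which half of the probe is bound.

    Hypothetical helper: interpolates linearly in log10(concentration)
    between the two titration points that bracket a fraction of 0.5.
    """
    points = sorted(zip(protein_concs, fractions))
    for (c0, f0), (c1, f1) in zip(points, points[1:]):
        if f0 <= 0.5 <= f1 and f1 > f0:
            t = (0.5 - f0) / (f1 - f0)
            log_c = math.log10(c0) + t * (math.log10(c1) - math.log10(c0))
            return 10 ** log_c
    raise ValueError("titration does not cross 50% bound")
```

With 20-40 fmol of probe in a 20-µl reaction, the probe is present at 1-2 nM, so the half-saturation concentration approximates an apparent Kd only when it lies well above the probe concentration.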
In Vitro Transcription-Translation
Two plasmids of different length encoding the AlcR N-terminal region, namely AlcR-(1-163) and AlcR-(1-197), were constructed by cloning amplified DNA fragments into the T7 expression vector pET-22b digested with NcoI and XhoI or with NcoI and BamHI, respectively. The corresponding proteins were expressed in a transcription-translation system (Promega) according to the recommendations of the supplier by adding 1 µg of each plasmid either alone or mixed into the reaction mixture. Expression of AlcR proteins was monitored by SDS-polyacrylamide gel electrophoresis followed by autoradiography. 5-10 µl of translation products were used directly in DNA mobility shift assays.
Footprinting Analysis
The DNAs used in the footprinting
analysis were oligonucleotides containing AlcR sites as listed in Table
I. Methylation interference was performed as described previously (14).
Briefly, single-strand DNA was end-labeled with
[γ-32P]ATP (3000 Ci/mmol), annealed with its
complementary nonlabeled strand, and used for EMSAs. End-labeled DNA
probes were partially methylated by dimethyl sulfate following the
procedure of Maxam and Gilbert (22). The chemically modified probes
(10³ counts/s) were incubated with 1000 ng of partially
purified AlcR-(1-197) as described above. Bound and unbound DNAs were
sliced from a preparative EMSA gel and electroeluted onto DEAE NA45
membrane in 1 × TBE buffer. Recovered DNAs were purified by
phenol/chloroform extraction, cleaved by 1 M piperidine
according to Maxam and Gilbert (22), and subjected to electrophoresis
on a 16% polyacrylamide gel containing 8 M urea.
Previous studies have shown that the GST-AlcR-(7-60) fusion protein binds to
inverted and direct repeat sites with the same consensus core
(5′-CCGCA-3′) in the alcR and alcA promoters.
Both types of sites have been shown to be functional in vivo
(15, 16). Surprisingly,
the interactions between the AlcR fusion proteins and guanines within
the motif appeared to be different in both types of targets (14, 15).
Since GST (the fusion protein carrier) is known to be dimeric in
solution (23, 24) and hence could introduce a bias in the DNA binding
specificity, we decided to perform in parallel the same binding
experiments with another bacterially synthesized AlcR protein
consisting of six histidine residues fused to the C terminus of the
truncated AlcR protein (residues 1-197) and purified on a nickel
column as described under "Experimental Procedures." Such chimeric
proteins are also widely used. For example, NMR studies of the Fru
repressor from E. coli showed that the extra histidine
residues have no influence on the protein conformation and its activity
(25).
The three AlcR proteins utilized in this DNA binding study are depicted
in Fig. 1 (A and B). As shown in
Fig. 1B, the GST-AlcR-(1-60) protein migrated on SDS gels
according to its predicted size (34 kDa), whereas the AlcR-(1-197)
(25 kDa) and AlcR-(1-60) (7 kDa) proteins exhibited aberrant
electrophoretic mobility. These results remain unexplained. In the
case of the AlcR-(1-60) peptide, the molecular mass determined by mass
spectrometry was 7.1 kDa. The
GST-AlcR-(1-60) protein contains only the AlcR DNA-binding domain,
including the amino terminus (amino acids 1-6), which was deleted in
our previous studies. Cleavage by thrombin resulted in the isolated
AlcR-(1-60) peptide, which was purified (see "Experimental Procedures"). The His-tagged AlcR-(1-197) protein (Fig.
1C) comprises additional domains equivalent to those present
in other proteins of the zinc cluster family, the so-called linker and
dimerization regions. Therefore, the role in AlcR binding of these two
regions, described as essential elements for the binding specificity of
proteins of this Zn2Cys6 class, can also be examined.
Two Molecules of AlcR Bind to Inverted Repeat Sites
Two
chimeric proteins, His-tagged AlcR-(1-197) and GST-AlcR-(1-60), were
initially tested by gel retardation assays with a wild-type inverted
repeat probe (probe b in the alcA promoter) (Fig.
2). Upon an increase in the AlcR-(1-197) protein
concentration, a second complex of higher molecular mass was formed
(Fig. 3A). Competition experiments showed
that both AlcR-(1-197) complexes (I and II) are specific (data not
shown). The apparent Kd for complex I was estimated
as 4 × 10⁻⁸ M and that for complex II as
2 × 10⁻⁸ M, which is not significantly
different. As shown in Fig. 3A, the mobility of the fast
migrating complex (complex I) corresponded to that of the complex
obtained with a single copy site (probe 1/2b), indicating the DNA
interaction of one AlcR molecule. Therefore, the slow migrating complex
(complex II) contains two AlcR molecules bound noncooperatively to a
palindromic sequence.
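The expected band pattern for noncooperative binding can be sketched with a textbook model of two identical, independent sites. This is an illustration only and not an analysis performed in this study: the function name is hypothetical, both half-sites are assumed to share one intrinsic Kd, and the protein is assumed to be in large excess over the probe.

```python
def species_fractions(protein_conc, intrinsic_kd):
    """Fractions of free probe, complex I, and complex II.

    Assumes two identical, independent (noncooperative) sites, each
    occupied with probability theta = [P] / ([P] + Kd).
    """
    theta = protein_conc / (protein_conc + intrinsic_kd)
    return {
        "free": (1.0 - theta) ** 2,            # neither site occupied
        "complex_I": 2.0 * theta * (1.0 - theta),  # exactly one site occupied
        "complex_II": theta ** 2,              # both sites occupied
    }
```

Under these assumptions, complex I predominates at low protein concentration and complex II accumulates progressively as the concentration rises, which is the qualitative pattern expected for two molecules binding independently.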
With the AlcR-(1-60) peptide containing only the zinc binuclear cluster, at very high protein concentration (1000 ng), two complexes were observed, indicating the binding of two AlcR molecules (Fig. 3B). The low affinity binding observed with the AlcR DNA-binding domain alone supports the idea that the sequence downstream of the zinc cluster (amino acids 61-197) contributes significantly to high affinity binding. This is indicative of either an increase in stability or thermostability of the complex and/or additional contacts between AlcR and DNA. Interestingly, it was observed previously that the binding of the AlcR-(7-60) peptide is unstable, and DNA binding activity was restored by changing the conditions of gel band shift experiments as described in Ref. 14 and under "Experimental Procedures."
With GST-AlcR-(1-60), only one complex was obtained whatever the
protein concentration (Fig. 3C). The apparent
Kd was estimated as 10⁻⁹ M,
indicating a high affinity binding of GST-AlcR-(1-60). Therefore, the
presence of the GST moiety enhances GST-AlcR-(1-60) affinity for its
specific target by 10-fold. The presence of only one complex is not
surprising since the GST protein naturally occurs as a dimer, and thus,
one single complex contains two AlcR molecules. Therefore, the dimeric
structure of the GST protein prevents AlcR binding to single sites.
The simplest interpretation of these experiments is that AlcR is able to bind DNA as a monomer. To test this hypothesis, transcription-translation assays in the reticulocyte lysate system were performed using two plasmid constructions encompassing alcR encoding His-tagged proteins of different length: AlcR-(1-163)-His6 and AlcR-(1-197)-His6. The AlcR-(1-163) protein contains the region corresponding to the two heptad repeats in GAL4 involved in dimerization, and the AlcR-(1-197) protein contains an additional downstream region (Fig. 1C).
Fig. 4A shows that the two 35S-labeled AlcR proteins were produced in the in vitro reticulocyte transcription-translation system and could be expressed simultaneously. As shown in Fig. 4B, no AlcR heterodimer-DNA complex was formed when the two different AlcR proteins expressed simultaneously in the reticulocyte lysate were assayed against the inverted repeat probe b target in the alcA promoter. These results show that within the AlcR protein containing 197 residues, no dimerization sequence is present. It is a strong indication that AlcR binds DNA as a monomer.
Identification of the AlcR-binding Site in the aldA Promoter
One intriguing question was the localization of
AlcR-binding sites in the promoter of the aldA gene (26),
the transcription of which is absolutely dependent upon alcR
expression (12). Analysis of the promoter showed the presence of an
inverted repeat with T instead of A in the fifth position of the
consensus core (5′-AGCGGCTCCGCT-3′) (Fig. 2 and Table
I). Previous gel band shift experiments with
GST-AlcR-(7-60) and overlapping restriction fragments in the
aldA promoter failed to demonstrate any retardation (1).
Therefore, it was important to test this potential AlcR target with the
AlcR-(1-197) protein in parallel with the GST-AlcR-(1-60) protein.
This sequence is similar to the functional inverted repeat target in
the alcR promoter (14) with a symmetric change of the last
base pair A to T (probe aldA (B1)).
As shown in Fig. 5, this nucleotide change completely prevented
the binding of the GST-AlcR-(1-60) fusion protein to the
probe. In contrast, the His-tagged AlcR-(1-197) protein was able to
form two complexes with a slightly lower affinity as compared with the
inverted repeat sequence present in the alcA promoter. A
similar pattern of binding was observed when the last A was replaced by C. Taken together, these results imply that there is no strong preference in the fifth position of the consensus motif for tight binding. Therefore, the presence of the GST moiety hinders the identification of an AlcR palindromic target in the aldA
promoter.
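The kind of sequence inspection that reveals such a site can be sketched in Python. This is a minimal illustration and not the procedure used in this study: the function names are hypothetical, and the scan simply reports every 5′-CCGCN-3′ core on either strand and labels a pair of cores as a direct or inverted repeat according to strand.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_CORE = re.compile(r"(?=(CCGC[ACGT]))")  # lookahead so overlapping cores are found


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_core_sites(seq):
    """List (top-strand start, strand, 5-mer) for each CCGCN core.

    Start positions are 0-based and always refer to the leftmost base
    of the site on the top strand, whichever strand carries the core.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in _CORE.finditer(seq)]
    for m in _CORE.finditer(reverse_complement(seq)):
        hits.append((len(seq) - m.start() - 5, "-", m.group(1)))
    return sorted(hits)


def classify_pair(hit_a, hit_b):
    """Two cores on the same strand form a direct repeat, otherwise inverted."""
    return "direct" if hit_a[1] == hit_b[1] else "inverted"


# The aldA promoter sequence quoted in the text
sites = find_core_sites("AGCGGCTCCGCT")
```

For the aldA sequence the scan finds one core on each strand, both reading CCGCT, so the pair is classified as an inverted repeat with T in the fifth position.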
Specificity of AlcR Recognition for Direct Repeats
Previous
footprinting and gel retardation experiments have shown that the direct
repeat sequence in the alcR promoter (probe A) (Table I and
Fig. 2) may be occupied by the GST-AlcR-(7-60) fusion protein only if
the adjacent inverted repeat target (probe B) is bound by the fusion
protein (14). As shown in Fig. 6A, no binding
to the alcR direct repeat target was observed with the
GST-AlcR-(1-60) fusion protein, even at high protein concentration. Conversely, the His-tagged AlcR-(1-197) protein formed a single complex with a lower affinity compared with the alcA direct
repeat target (probe c) (see below). It is evident that one AlcR
molecule is able to bind to the natural direct repeat in the
alcR promoter, whereas the GST-AlcR fusion protein does
not.
The direct repeat target (probe c) (Table I and Fig. 2) in the
alcA promoter was recognized with high affinity by both AlcR proteins (apparent Kd = 3 × 10⁻⁸
M for the fusion and His-tagged proteins, respectively).
Similarly, a single complex was observed, indicating the binding of one
AlcR-(1-197) molecule and of one dimeric GST-AlcR-(1-60) molecule,
respectively (Fig. 6B). Therefore, it was important to
address the question whether the two sites could be occupied randomly,
or if AlcR-(1-197) occupies preferentially one site. In addition, it
was interesting to determine if the different pattern of binding
specificity observed with GST-AlcR-(1-60) and His-tagged AlcR-(1-197)
could be attributed to differences in the interacting guanines in the
sites.
As illustrated in Fig. 7, methylation interference analysis
showed that AlcR-(1-197) made strong contacts with the two central guanines (positions 183 and
185) in the bottom strand within the 3′-site (with G at position
186 also being protected, although to a lesser extent), whereas methylation of the central G (position
200) in the top strand
did not interfere. These results differ from those obtained with
GST-AlcR-(1-60) or with
GST-AlcR-(7-60) (15) (note that the affinity was too low to perform
footprint experiments with AlcR-(1-60)): (i) both sites were protected
instead of one; and (ii) all the guanines in the consensus motifs
interfered with the complex formation, and furthermore, the
methylation of the two upstream guanines in the motifs (positions
204 and 191) interfered (although differently) with the
GST-AlcR-(1-60) protein. These experiments show clearly that in
addition to the altered DNA specificity with GST-AlcR-(1-60) for
the direct repeat probe c target, there is also a change in the
interaction between the GST-AlcR fusion protein and DNA.
Analyses of eukaryotic site-specific DNA-binding proteins are
generally carried out with bacterially synthesized chimeric proteins, using either fusion partners such as glutathione
S-transferase and β-galactosidase or
additional histidine residues. We show here the
limit of the approach based on the utilization of a heterologous
protein linked to a DNA-binding domain for defining a specific target.
Until now, to our knowledge, it had not yet been described that such an
approach would introduce a serious bias related to the quaternary
structure of the heterologous carrier protein. For example, the GST
protein is a dimer in solution (23, 24), and β-galactosidase is a
tetramer (27, 28). Therefore, these proteins fused to a DNA-binding
domain impose a conformation resulting from their quaternary structure
for selecting the DNA targets. This was clearly demonstrated by
comparing the binding activities of the GST-AlcR-(1-60) and His-tagged
AlcR-(1-197) proteins.
As a first approach, the localization of AlcR-specific binding sites
was performed with the GST-AlcR-(7-60) protein containing only the
zinc binuclear cluster domain. Results were confirmed using the
isolated and purified AlcR-(7-60) peptide cleaved by thrombin (14,
15). However, two important elements were missing in this analysis.
First, the absence of the region downstream of the zinc cluster shown
to be necessary for high affinity binding prevented further analysis of
the molecular interactions between AlcR-(1-60) and its cognate
targets. Second, the utilization of AlcR with its six amino-terminal
residues deleted (GST-AlcR-(7-60)) unexpectedly resulted in a change
in binding specificity. Therefore, this
study was performed with a longer AlcR protein (AlcR-(1-197))
containing the amino terminus and the region downstream of the zinc
cluster that includes the so-called linker and dimerization regions
found in most proteins of the Zn2Cys6 class.
Comparisons with GST-AlcR-(1-60) binding specificity showed that
AlcR-(1-197) recognizes the same motif (5′-CCGCA-3′) organized in
direct and inverted repeat sequences. In agreement with these results,
physiological studies on A. nidulans have shown by deletion
and site-directed mutagenesis that both types of targets are functional
in the promoters of the alcR (16) and alcA (15)
genes.
However, the use of the GST-AlcR-(1-60) fusion protein in previous
studies has prevented the identification of important features that
place AlcR in an original and unique position in the zinc binuclear
cluster family. The most significant result is probably that AlcR is
able to bind as a monomer to its targets. That could explain its
unusual specificity for both inverted and direct repeats. In agreement
with this result, the region downstream of the zinc cluster, which is
organized in heptad repeats in the other zinc cluster proteins (GAL4
(3), PPR1 (4), and HAP1 (5)), is not present in AlcR. In fact, in AlcR,
two downstream regions could be similar to leucine zippers. They
contain four leucine or hydrophobic amino acids (able to replace
leucine) every seven residues. However, proline residues are also
present (12) (see Fig. 1C), which are known to impair
α-helical structures (29). The second and strong argument is that no
functional dimerization elements were detected in these regions since
no formation of heterodimers by transcription-translation assays in the
reticulocyte lysate system was detected with two AlcR proteins of
different length.
The use of the AlcR-(1-197) protein allowed us to establish the
AlcR-binding site in the aldA promoter. The localization of one palindromic site (see Fig. 2) is important for understanding the
absolute dependence of aldA transcriptional induction on
AlcR. The same consensus motif differing at its last nucleotide from the one determined previously (14) was observed. This indicates that
within the consensus site (5′-CCGCA-3′), the last base does not
contribute significantly to AlcR binding. Moreover, results obtained
with different direct repeat sites (probes A and c) clearly show that
in addition to the consensus core, the flanking regions contribute
significantly to AlcR binding.
Another important observation is the absence of cooperativity between the two AlcR molecules upon binding to symmetric sites. This differentiates AlcR, once again, from GAL4, for which the synergistic transcriptional activation of the structural genes under its control is thought to result mainly from binding cooperativity (30). The alcA promoter is one of the strongest inducible promoters found in filamentous fungi (31) and is widely used for heterologous protein expression (32).
It was shown by deletion and mutational analyses that the strong synergistic transcriptional activation of the alcA gene is mediated via AlcR binding to the clustered targets organized as direct and inverted repeats. The target immediately upstream of the transcriptional start site (probe c) comprises a direct repeat sequence separated by 15 nucleotides from a single copy site. Surprisingly, the three sites were shown to be necessary for full transcriptional activation, showing the importance of the isolated site. These results are therefore in agreement with our finding that AlcR can bind as a monomer to its binding sites. This raises the question of the functional AlcR targets, which in vivo are organized in tandem or inverted repeats and in vitro occur as single copy sites with the same consensus core. In future research, we will have to consider that other proteins could be involved in the specific activation process mediated by the AlcR activator.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBankTM/EMBL Data Bank with accession number(s) G16800[GenBank]7 (alcA), G16800[GenBank]9 (alcR), and G16801[GenBank]1 (aldA).
We are grateful to Dr. S. Fillinger for critical reading of the manuscript and to Dr. M. Blight for its English version.