(Received for publication, December 29, 1995; and in revised form, January 31, 1996)
The aryl hydrocarbon receptor (AHR) and the aryl hydrocarbon
receptor nuclear translocator (ARNT) belong to a novel subclass of
basic helix-loop-helix transcription factors. The AHR·ARNT
heterodimer binds to the xenobiotic responsive element (XRE).
Substitution of each of four amino acids in the basic region of ARNT
with alanine severely diminishes or abolishes XRE binding, intimating
that these amino acids contact DNA bases. Three of these amino acids
are conserved among basic helix-loop-helix proteins, and the
corresponding amino acids of Max and USF are known to contact DNA
bases. Alanine scanning mutagenesis of the basic domain of AHR and
substitution with conservative amino acids at particular positions in
this domain and in a more amino-proximal AHR segment previously shown
to be required for XRE binding (Fukunaga, B. N., and Hankinson, O.
(1996) J. Biol. Chem. 271, 3743-3749) demonstrate that
the most carboxyl-proximal amino acid position of the basic domain and
a position within the amino-proximal segment are intolerant to amino
acid substitution with regard to XRE binding, suggesting that these two
amino acids make base contacts. Amino acid positions in these AHR
regions and in the ARNT basic region less adversely affected by
substitution are also identified. The amino acids at these positions
may contact the phosphodiester backbone. The apparent bipartite nature
of the DNA binding region of AHR and the identity of those of its amino
acids that apparently make DNA contacts impute a novel protein-DNA
binding behavior for AHR.
The AHR mediates carcinogenesis by certain
environmental pollutants, including the halogenated aromatic
hydrocarbon, TCDD, and the polycyclic aromatic hydrocarbon,
benzo(a)pyrene (reviewed in (1) ). Prior to binding of ligand,
AHR is located in the cytoplasm as part of a complex that has a
molecular mass of about 280 kDa. This complex is comprised of AHR, two
molecules of the 90-kDa heat shock protein, and possibly other
proteins(2, 3, 4) . After binding ligand, AHR
dissociates from the above complex and translocates to the nucleus
where it heterodimerizes with ARNT. The heterodimer of AHR and ARNT
constitutes a transcription factor referred to as the transformed AHR
complex, which stimulates the synthesis of the CYP1A1 protein and
several other proteins involved in xenobiotic
metabolism (5, 6, 7, 8). Induction
of CYP1A1 is regulated exclusively at the transcriptional
level (9, 10). Activation of transcription occurs
through interaction of the transformed AHR complex with several copies
of short sequences, termed xenobiotic responsive elements (XREs) or
dioxin-responsive elements, located within the 5′-flanking region of
the CYP1A1 gene.
The AHR and ARNT proteins both contain
bHLH motifs toward their amino
termini(11, 12, 13) . Additionally, an
approximately 300-amino acid PAS homology region is located more
centrally in both proteins. PAS regions are also found in the Drosophila proteins PER and SIM (14) and the mammalian
hypoxia-inducible factor 1α (15). PAS regions mediate
homodimerization of PER and heterodimerization of PER with SIM (16) and are necessary for heterodimerization of AHR and
ARNT (17, 18). As well as possessing the PAS homology
region, the AHR·ARNT dimer differs from other bHLH-bearing
transcription factors in at least two other ways: (i) AHR activity is
ligand activated and (ii) unlike most other bHLH-bearing transcription
factors, whose DNA recognition sequence is the E-box sequence,
CANNTG (19), the AHR·ARNT heterodimer recognizes an
asymmetrical XRE sequence that only partially resembles the E-box. The
consensus core XRE sequence is
5′-TNGCGTG-3′ (20, 21, 22). We previously
determined the orientation of the AHR·ARNT heterodimer on the
asymmetric XRE sequence by UV light covalent cross-linking analysis.
ARNT contacts the thymidine in the XRE core (5′-CGTG-3′), whereas AHR
binds 5′-proximal to this (23).
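The two recognition sequences discussed above, the E-box (CANNTG) and the consensus core XRE (TNGCGTG), can be compared with a short motif scan. The following is a minimal sketch, assuming that N stands for any of the four nucleotides; the function names and the demonstration sequence are hypothetical and are not taken from this study.

```python
import re
from typing import List, Tuple


def consensus_to_regex(consensus: str) -> "re.Pattern[str]":
    """Translate a consensus such as TNGCGTG into a regular expression.

    Assumption: N means "any nucleotide"; other ambiguity codes are not handled.
    """
    parts = ["[ACGT]" if base == "N" else base for base in consensus.upper()]
    # Wrapping the motif in a lookahead lets overlapping sites be reported.
    return re.compile("(?=(" + "".join(parts) + "))")


def find_sites(sequence: str, consensus: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched site) for every match on the given strand."""
    pattern = consensus_to_regex(consensus)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


XRE_CORE = "TNGCGTG"  # consensus core XRE quoted in the text
E_BOX = "CANNTG"      # E-box consensus quoted in the text

# Hypothetical sequence, written only to contain one site of each kind.
demo = "GGCTTGCGTGAGAACACGTGTT"

print(find_sites(demo, XRE_CORE))  # [(3, 'TTGCGTG')]
print(find_sites(demo, E_BOX))     # [(14, 'CACGTG')]
```

The scan covers only the strand supplied. Because the XRE is asymmetrical, a fuller search would also scan the reverse complement.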
X-ray crystallographic
analysis of four bHLH transcription factors (homodimers of Max, USF,
MyoD, and E47) bound to their cognate DNA sequences has shown that
protein-DNA interactions occur directly through the basic domain of
each monomer (which manifests as an α-helical extension of the
subunit's helix 1), whereas the HLH domain mediates
intermolecular
dimerization (24, 25, 26, 27). We
previously demonstrated corresponding requirements for the equivalent
regions of AHR and ARNT by deletion analysis(17, 28) .
The two central nucleotides of the E-box and, to a lesser extent, the nucleotides flanking the invariant 5′-CA and TG-3′ residues dictate which bHLH proteins bind a given E-box (29, 30, 31). Specificity for the central two nucleotides (typically CG or GC) is determined primarily by the identity of the amino acid located one helical turn toward the carboxyl terminus from a conserved glutamic acid residue in the basic regions of the bHLH proteins (found at position 36 of Max, see Fig. 3). When arginine is found at this position (as, for example, in Max, USF, and TFEB), CG is preferred. When a smaller, nonpolar residue is present (as, for example, in AP4, MyoD, and E12), GC is preferred (19).
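The specificity rule just described can be written as a small lookup. This is only a sketch of the rule as stated in the text: the set of residues treated as "smaller, nonpolar" is an assumption made for illustration, and the function names are hypothetical.

```python
from typing import Optional

# Assumption: residues counted as "smaller, nonpolar" for this illustration.
SMALL_NONPOLAR = {"A", "G", "V", "L", "I", "M"}


def preferred_central_dinucleotide(residue: str) -> Optional[str]:
    """Apply the rule from the text to the residue at the specificity position.

    Arginine -> CG; a smaller, nonpolar residue -> GC; anything else is
    left undecided (None) because the text gives no rule for it.
    """
    residue = residue.upper()
    if residue == "R":
        return "CG"
    if residue in SMALL_NONPOLAR:
        return "GC"
    return None


def preferred_e_box(residue: str) -> Optional[str]:
    """Build the preferred CANNTG variant, or None if the rule does not apply."""
    central = preferred_central_dinucleotide(residue)
    return None if central is None else "CA" + central + "TG"


print(preferred_e_box("R"))  # CACGTG
print(preferred_e_box("L"))  # CAGCTG
print(preferred_e_box("K"))  # None
```

The rule describes a preference, not an absolute requirement. The text notes that the flanking nucleotides also contribute to specificity, and the sketch ignores them.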
Figure 3: Comparison of basic regions of other bHLH proteins with basic region of ARNT and putative DNA-binding regions of AHR. The consensus sequence for bHLH proteins that bind the 5′-CACGTG-3′ E-box is shown at the top. DNA contacts for Max (24) and for molecule 1 of USF (25) were determined by x-ray crystallography. Those for TFEB were inferred by alanine scanning mutagenesis (30). Amino acids in the various basic regions (or nominal basic region in the case of AHR) that do not conform to the consensus are shown in red. The region of AHR immediately amino-terminal to its nominal basic region is shown at the bottom. m, mouse; f, frog; h, human.
The basic domain of ARNT conforms well to
the consensus polypeptide sequence for this submotif, whereas the basic
domain of AHR (which we refer to as its nominal basic domain) conforms
poorly, consistent with the observation that the DNA recognition
sequence for AHR is noncanonical for this class of transcription
factors. In this paper, we have studied the basic domain of ARNT and
the equivalent region of AHR, as well as flanking regions, in order to
characterize the protein-DNA interactions of the AHR·ARNT·XRE
complex. Using an alanine scanning mutagenesis approach, we tested
individual residues for XRE binding by performing EMSA. The alanine
scanning mutagenesis approach was used because (i) alanines confer
α-helical secondary structure in polypeptide chains, and
α-helical secondary structure is manifested by the basic regions of
solved bHLH proteins as they contact DNA, and (ii) alanine lacks a
reactive side group. Our results indicate that the mode of interaction
of the basic domain of ARNT with DNA resembles that of other bHLH
proteins. However, we provide evidence that AHR exhibits a pattern of
DNA interaction that is novel for bHLH proteins. We find that XRE
binding requires only the carboxyl-proximal portion of the nominal
basic region of AHR. We previously demonstrated that DNA binding by AHR
also requires a block of amino acids amino-terminal to the nominal
basic domain(18) . By characterizing both alanine substitution
mutations and certain conservative amino acid substitution mutations in
both the above amino-terminal block and in the nominal basic region of
AHR, we discriminate between amino acid positions that are less
tolerant to substitution with regard to XRE binding and those that are
more tolerant. The former may contact bases in the XRE, and the latter
may contact the phosphodiester backbone. By peptide plot analysis and
through informative mutant AHR proteins, we impute a non-α-helical
structure for the extended putative DNA-binding region of AHR, further
indicating the unique nature of the DNA-binding domain of this bHLH
protein.
Figure 1: XRE binding analysis of
mutant AHR and ARNT proteins in combination with normal AHR and ARNT.
Equimolar amounts of normal or mutant AHR proteins were mixed with
normal or mutant ARNT proteins as indicated in the presence (+) or
the absence (-) of 10 nM TCDD along with radiolabeled
XRE, and EMSA was then performed. The resultant AHR·ARNT·XRE
complexes (open arrows) were resolved by nondenaturing 4.5%
polyacrylamide gel electrophoresis. The solid arrows indicate
free probe. A, XRE binding analysis of ARNT alanine scanning
mutant proteins. B, XRE binding analysis of AHR alanine
scanning mutant proteins. C, XRE binding analysis of
additional AHR alanine scanning and conservative substitution mutant
proteins and XRE binding analysis of ARNT amino-terminal deletion
mutant proteins.
X-ray crystallographic analysis of the DNA-binding domains of several bHLH proteins bound to DNA has shown that certain residues are not involved in DNA interaction (24-27, see Fig. 3). In order to test whether the corresponding amino acid positions within and flanking the basic domain of ARNT share this lack of DNA interaction, several mutants containing blocks of alanines substituted for the normal amino acids were studied. ARNT alanine mutants A(86-89), A(92, 95-97), and A(103-106) showed no significant reduction or increase in XRE binding when compared with wild type ARNT when tested by EMSA. These results indicate that the amino acids that are substituted in A(92, 95-97) do not contact DNA and that amino acids immediately amino-terminal (A(86-89)) or carboxyl-terminal (A(103-106)) to the basic region are also unlikely to contact DNA.
Four single alanine substitution mutant ARNT proteins (H94A, E98A, R101A, and R102A) exhibited markedly reduced XRE binding, between 0 and 12% of that observed for wild type ARNT (p < 0.05), indicating that critical XRE-protein contacts occur at these positions. Three other single alanine substitution ARNT mutants (R91A, N93A, and R99A) exhibited moderately diminished complex formation (between 40 and 75% of the wild type ARNT value; p < 0.05), suggesting that these positions may also be involved in contacting the XRE. ARNT mutant R100A bound the XRE as efficiently as wild type ARNT. The conserved arginine residue at position 100 is therefore probably not involved in XRE-protein contact, as has been shown by x-ray crystallography for the corresponding arginine residue of other bHLH proteins contacting the E-box (see Fig. 3). These findings are summarized in Table 1 and Fig. 3.
Each ARNT mutant with
reduced XRE binding activity was tested for its ability to
heterodimerize with AHR, because reduced levels of
AHR·ARNT·XRE complex formation could reflect diminished
dimerization capacity rather than a direct effect on DNA binding.
[³⁵S]Methionine-radiolabeled, in vitro synthesized mutant or wild type ARNT protein was mixed with an
equimolar amount of unlabeled AHR in the presence or the absence of 10
nM TCDD. Following dimerization, the mixtures were
immunoprecipitated using AHR antibody. The degree of heterodimerization
for each mutant protein with wild type AHR was calculated as a
percentage of the amount of wild type ARNT coimmunoprecipitated with
wild type AHR. The first six lanes of Fig. 2A represent
the controls for the coimmunoprecipitation assay. These reactions were
performed with wild type AHR and wild type ARNT proteins incubated in
the presence or the absence of TCDD and utilized AHR antibodies or the
corresponding preimmune IgG fraction, as indicated. The results of the
control reactions showed that TCDD treatment increased the amount of
ARNT protein that coimmunoprecipitated with AHR and that very little or
no ARNT was coimmunoprecipitated by the preimmune IgG fraction.
Therefore, the coimmunoprecipitations were efficient, TCDD-inducible,
and specific for AHR·ARNT heterodimers. Each ARNT mutant performed
as efficiently as wild type ARNT in the heterodimerization assays,
demonstrating that the observed reductions in XRE binding ability
represent altered DNA binding capabilities rather than decreased
formation of AHR·ARNT heterodimers.
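The normalization used in the coimmunoprecipitation assay, in which each mutant is expressed as a percentage of the wild type ARNT signal, amounts to the following arithmetic. This is a minimal sketch: the function name, the optional background term, and the example band intensities are hypothetical and are not data from this study.

```python
def percent_of_wild_type(mutant_signal: float,
                         wild_type_signal: float,
                         background: float = 0.0) -> float:
    """Express a mutant's coimmunoprecipitated signal as a percentage of wild type.

    `background` is an optional, assumed correction (for example, the signal
    recovered with preimmune IgG); it defaults to no correction.
    """
    reference = wild_type_signal - background
    if reference <= 0:
        raise ValueError("wild type signal must exceed the background")
    return 100.0 * (mutant_signal - background) / reference


# Hypothetical band intensities in arbitrary units.
print(percent_of_wild_type(250.0, 450.0, background=50.0))  # 50.0
```

A mutant scoring near 100% in this calculation while showing reduced XRE binding would point to a defect in DNA binding rather than in heterodimerization, which is the inference drawn in the text.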
Figure 2: Dimerization analysis of AHR and ARNT mutant proteins in combination with normal AHR or ARNT, respectively. The first six lanes of each panel represent control reactions. AHR antibody (Ab) was used in all reactions except in the fifth and sixth control lanes of each panel, where preimmune IgG was used. Equimolar amounts of radiolabeled ARNT or its mutant derivatives were incubated with normal AHR or its mutant derivatives as indicated below. Immunoprecipitated pellets and acetone supernatants were subjected to 7.5% SDS-polyacrylamide gel electrophoresis. AHR, AHR antibody; PI, preimmune IgG; -, no TCDD treatment; +, TCDD treatment; p, immunoprecipitate; s, supernatant. The positions of the molecular mass markers are indicated on the left in kDa. A, dimerization analysis of the indicated radiolabeled ARNT alanine scanning mutants with normal AHR. B, dimerization analysis of the indicated AHR alanine scanning mutants with radiolabeled normal ARNT. (The eighth lane is derived from the same gel as the first seven lanes. A(40, 42) appears to migrate more slowly than the other AHR proteins because of the "smile" effect manifested in lanes at the ends of gels.) C, dimerization analysis of additional AHR alanine scanning and conservative substitution mutant proteins with radiolabeled ARNT and dimerization analysis of radiolabeled ARNT amino-terminal deletion mutants with normal AHR.
A number of AHR single
substitution alanine mutants (E28A, G29A, I30A, K31A, S32A, P34A, R37A,
and D40A) formed AHR·ARNT·XRE complexes to the same (or
greater) degree as wild type AHR. Mutant P34A is of particular
interest, because nearly all basic helix-loop-helix proteins lack
proline residues within their basic domains. Proline interrupts
α-helical structure, and proline 34, being located at such a
central position within the AHR basic domain, might be expected to have
a significant role. However, our results show that substitution of
alanine for this proline affects neither heterodimerization with ARNT nor
XRE binding. Two mutants (H38A and R39A) formed complexes at less than
5% of the level observed for wild type AHR. Heterodimerization of these
mutants with ARNT was not significantly different from wild type AHR in
the case of H38A and was significantly higher than that for wild type
AHR in the case of R39A. Thus the loss of complex formation observed
with these mutants is not due to decreased heterodimer formation.
Six other AHR alanine single substitution mutants (N33A, S35A, and
K36A within the basic domain and R41A, L42A, and N43A within helix 1)
generated reduced levels of the XRE complex, ranging from approximately
28 to 69% of the wild type AHR value. None of these mutants
heterodimerized any less efficiently with ARNT than did wild type AHR.
Additionally, mutant A(40, 42), which contains the substitutions
present in mutants D40A and L42A, formed the XRE complex at nearly the
same reduced level as observed for the L42A mutant (approximately 38
and 28% of the wild type AHR value, respectively). Contrasting the
result observed for this mutant is that of another double alanine
substitution mutant, A(27-29), which generated the XRE complex at
approximately 40% of the efficiency of the wild type AHR. Substitutions
at positions Glu²⁸ and Gly²⁹ individually with
alanine in mutants E28A and G29A, respectively, resulted in mutant AHR
proteins that formed the XRE complex at undiminished levels. Although
A(27-29) carries alterations in the peptide sequence to which the
AHR antibodies were raised, it was precipitated by the antibodies as
efficiently as AHR and therefore could be tested for its
heterodimerization potential with ARNT. A(27-29) heterodimerized
with ARNT as efficiently as wild type AHR. The unexpected reduction in
XRE binding by A(27-29) compared with mutants E28A and G29A may
result from the run of three alanines generated in the former mutant.
This may alter the secondary structure of the protein (as discussed
later), rendering the mutant less efficient at interacting with the
XRE. A summary of the above findings is found at the bottom of Table 2.
All the mutants dimerized with ARNT as efficiently as wild type AHR.
(Although Arg contains a substitution within the peptide
used to generate the AHR antibodies, mutation at this position does not
affect its ability to be precipitated by the AHR
antibodies(18) .) Substitution of the tyrosine residue at
position 9 with either tryptophan or serine markedly reduced XRE
binding activity (10 and 8% of the wild type AHR level of complex
formation for mutants Y9W and Y9S, respectively). Substitution of
lysine for arginine at position 39 adversely affected XRE binding to nearly the
same degree as alanine substitution. These data suggest that DNA base
contact probably occurs at Arg³⁹ and also perhaps at
Tyr⁹ (if indeed the latter contacts DNA directly).
Substitution of histidine 38 with asparagine affected XRE binding only
mildly, suggesting that His³⁸ probably contacts the
phosphodiester backbone. Substitution of arginine 14 with lysine
moderately affected XRE binding, indicating that this amino acid could
make either base or phosphate contact or both (assuming that this amino
acid contacts DNA directly).
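The reasoning applied throughout these results, in which the severity of the binding defect is used to suggest the kind of DNA contact, can be sketched as a simple classifier. The percentages below are the approximate values quoted in the text for five AHR mutants. The cut-offs are assumptions chosen to match the ranges reported (12% or less for markedly reduced binding, below 80% for moderately reduced binding) and are not thresholds defined by the authors.

```python
def classify_binding(percent_of_wild_type: float,
                     severe_max: float = 12.0,
                     moderate_max: float = 80.0) -> str:
    """Bin an XRE binding value (percent of wild type) into a severity class.

    Both cut-offs are illustrative assumptions, not values from the study.
    """
    if percent_of_wild_type <= severe_max:
        return "severe"
    if percent_of_wild_type < moderate_max:
        return "moderate"
    return "unaffected"


# Tentative reading used in the text; a direct DNA contact is itself an assumption.
INTERPRETATION = {
    "severe": "possible base contact",
    "moderate": "possible phosphodiester backbone contact",
    "unaffected": "probably no DNA contact",
}

# Approximate percentages of wild type AHR binding quoted in the text.
reported = {"Y9W": 10, "Y9S": 8, "L42A": 28, "A(40, 42)": 38, "A(27-29)": 40}

for mutant, value in reported.items():
    label = classify_binding(value)
    print(f"{mutant}: {value}% -> {label} ({INTERPRETATION[label]})")
```

Such a classification is only a first pass. For A(27-29), the text attributes the reduced binding to a possible change in secondary structure rather than to the loss of a contact, and the base versus backbone assignments rest on the conservative substitutions as well.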
bHLH and bHLH leucine zipper proteins govern the expression
of critical genes involved in growth control and differentiation
through specific activation or repression programs. Regulation of
transcription by these proteins involves their interaction with
specific DNA recognition sequences in target genes. Most bHLH and bHLH
leucine zipper protein-DNA interactions occur at the E-box sequence
(5′-CANNTG-3′). The bHLH-PAS proteins, AHR and ARNT, heterodimerize in
the presence of an activating ligand and then transcriptionally
activate responsive genes, such as CYP1A1. The DNA recognition
element for the AHR·ARNT dimer, the XRE (5′-TNGCGTG-3′), is
asymmetrical and only resembles an E-box at the underlined nucleotides.
DNA binding by HLH proteins requires formation of homo- or
heterodimers. The HLH protein motif is critical for formation of these
dimers. Secondary dimerization domains such as the leucine zipper of
bHLH leucine zipper proteins or the PAS domain found in AHR, ARNT,
hypoxia-inducible factor 1, and SIM not only function as
additional protein-protein dimerization interfaces but also probably
serve as a means to determine the permissible combinations of homo- or
heterodimers. In the case of ARNT, evidence is building that it
represents one of the ubiquitously expressed bHLH proteins, which,
through heterodimerization with several other bHLH-PAS proteins, is
involved in the regulation of multiple genes. For example,
hypoxia-inducible factor 1 is a heterodimer of hypoxia-inducible
factor 1α and ARNT. ARNT can also homodimerize in vitro and bind the E-box sequence, CACGTG, and can activate transcription
of a reporter gene driven by the same E-box
sequence (35, 36).
Crystallographic studies of four
homodimeric bHLH proteins complexed with their cognate DNA sequence
show that in each case the basic domain of each subunit manifests an
α-helical extension of helix 1 as it interacts with DNA through the
major groove (24, 25, 26, 27). We
found a pattern of amino acid residue-XRE interactions for ARNT that
is similar to those known for homodimers of the bHLH proteins Max and
USF co-crystallized with their target sequences (Fig. 3). Max
and USF both bind the E-box subclass CACGTG, whose half site is
identical to the segment of the XRE core sequence believed to be
contacted by ARNT(22, 23) . We found a strong
sensitivity to alanine substitution for ARNT residues His⁹⁴,
Glu⁹⁸, Arg¹⁰¹, and Arg¹⁰².
His⁹⁴, Glu⁹⁸, and Arg¹⁰² correspond in
position to Max basic domain residues His²⁸,
Glu³², and Arg³⁶, which each make direct contact
with DNA bases within the E-box (see Fig. 3). Our finding of
high sensitivity to alanine substitution at these positions in ARNT
suggests that these residues make base contacts within the XRE. The
spacing of these residues in both Max and ARNT is significant in that
it sets the α-helical register such that every fourth residue
aligns to the same plane on the α-helix. Thus, all three residues
can face into the major groove of DNA, enabling base interaction. Max
residue Arg³⁵, corresponding to ARNT Arg¹⁰¹,
makes contact only with the phosphodiester backbone. ARNT Arg¹⁰¹
could be involved in a base contact, or it may contact the
phosphodiester backbone and be particularly sensitive to alanine
substitution. The moderately reduced levels of AHR·ARNT·XRE
complex formation resulting from substitution of each of ARNT amino
acid residues Arg⁹¹, Asn⁹³, and Arg⁹⁹
with alanine suggest that these residues contact the
phosphodiester backbone.
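The helical register argument can be checked with simple arithmetic. An ideal α-helix has about 3.6 residues per turn, or roughly 100° of rotation per residue, so residues spaced four apart drift by only about 40° around the helix axis. The sketch below applies this to the ARNT positions altered in mutants H94A, E98A, and R102A. The ideal-helix geometry is an assumption, since no structure of ARNT bound to DNA is available here, and the function names are hypothetical.

```python
from typing import List, Sequence

DEGREES_PER_RESIDUE = 100.0  # ideal alpha-helix: 360 degrees / 3.6 residues per turn


def helical_angle(position: int, reference: int) -> float:
    """Angular position around the helix axis relative to a reference residue."""
    return ((position - reference) * DEGREES_PER_RESIDUE) % 360.0


def angular_spread(positions: Sequence[int]) -> float:
    """Width of the arc occupied by the residues, measured from the first one.

    Angles are converted to signed offsets in (-180, 180]; this simple measure
    is adequate for small groups of residues such as those compared here.
    """
    reference = positions[0]
    angles: List[float] = [helical_angle(p, reference) for p in positions]
    signed = [a - 360.0 if a > 180.0 else a for a in angles]
    return max(signed) - min(signed)


arnt_contacts = [94, 98, 102]  # positions altered in H94A, E98A, and R102A
print([helical_angle(p, 94) for p in arnt_contacts])  # [0.0, 40.0, 80.0]
print(angular_spread(arnt_contacts))                  # 80.0

# For contrast, residues two apart sit on nearly opposite faces of the helix.
print(angular_spread([94, 96]))                       # 160.0
```

On an ideal helix the three residues therefore fall within an arc of about 80°, consistent with all three side chains being able to face the major groove.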
Introduction of a run of four alanine
substitutions flanking the amino-terminal border of the basic domain in
ARNT alanine mutant A(86-89) had no effect on
AHR·ARNT·XRE complex formation. These alterations fall within
an alternatively spliced region of ARNT (11, 37) and
demonstrate that Phe
and Leu
, which are
located immediately adjacent to the basic region in the two
alternatively spliced proteins, are insensitive to alanine
substitution, consistent with the observation that both alternatively
spliced products bind the XRE.
The putative XRE contacts of AHR are
illustrated in Fig. 3. Two adjacent residues at the extreme
carboxyl end of the nominal basic domain of AHR (His³⁸ and
Arg³⁹) exhibited high sensitivity to alanine substitution.
When a conservative substitution to lysine was made for
Arg³⁹, XRE binding was impaired strongly, suggesting that
Arg³⁹ makes base contact in DNA. Substitution of histidine
38 with asparagine had much less of an adverse effect on XRE binding
than did substitution with alanine, suggesting that His³⁸
contacts the phosphodiester backbone. Three other residues within
the nominal basic domain (Asn³³, Ser³⁵, and
Lys³⁶) and three residues near the amino terminus of helix 1
(Arg⁴¹, Leu⁴², and Asn⁴³) exhibited
only moderate sensitivity to alanine substitution, suggesting that they
may contact the phosphodiester backbone. Residue Ser³⁵
presents the only potential target for phosphorylation within the
nominal basic domain. This serine lies within a recognition sequence
for phosphorylation by protein kinase C, and the reduced level of
complex formation in the alanine mutant S35A could reflect a loss of
the phosphorylation target, rather than presence of a phosphodiester
contact. However, a recent study suggests that AHR is probably not
phosphorylated in the amino-terminal half of the protein(38) .
We further characterized the AHR-XRE interaction by extending the
observations of Fukunaga and Hankinson(18) , who demonstrated
that amino acids near the amino terminus of AHR, well removed from the
nominal basic domain, are required for XRE binding. Two positions,
Tyr⁹ and Arg¹⁴, at which alanine substitution
abolished complex formation, suggesting possible DNA base contacts, and
several other positions that were moderately sensitive to alanine
substitution, suggesting possible phosphodiester contacts, were
identified. We generated mutants in which we replaced Tyr⁹
and Arg¹⁴ with amino acids conservative with regard to
charge and side group. The results from these mutants support the
notion that amino acid residue Tyr⁹ makes base contact but
leave open the question as to whether Arg¹⁴ contacts base(s)
in DNA or the phosphodiester backbone. It is unlikely that tyrosine 9
is phosphorylated, because its site does not conform to any known
phosphorylation recognition sequences. (It of course remains possible
that the amino-terminal region of AHR does not contact DNA directly but
plays some other role in DNA binding, such as directing the nominal
basic region into the correct conformation for DNA binding).
A
prominent feature both within the nominal basic domain of AHR and in
the adjacent amino-terminal region is the presence of several proline
residues. Proline residues affect the secondary structure of a protein
by interrupting α-helices. Insight into the secondary structural
nature of the nominal basic domain of AHR was provided by the data from
two alanine substitution mutants. Substitution of alanine for the
proline residue at position 34 did not affect AHR·ARNT·XRE
complex formation. Alanine substitution mutant A(27-29) exhibited
significantly lower levels of complex formation than wild type AHR. A
run of three alanines is created in this mutant that would not
adversely affect α-helicity but could be detrimental to some other
presently unidentified structural motif. In fact, a Garnier-Robson plot
analysis (Generunner, Hastings Software) of wild type AHR's amino
terminus through helix 1 (residues 1-60) identified no regions of
predicted α-helicity, whereas both P34A and A(27-29) mutant
proteins were predicted to have α-helical regions in the nominal
basic domain as well as helix 1. Taken together, the above findings
suggest a unique non-α-helical nature for the relevant region of
AHR as it associates with DNA.
Certain other bHLH proteins contain
proline residues in their basic regions and, like AHR, bind recognition
sequences divergent from the E-box. These include E2F1, which binds the
sequence 5′-GCGCGAAA-3′ (39) and some heterodimeric
transcription factors, including the Hes family of proteins which
recognize the N-box (5′-CACNAG-3′) as well as the E-box (40, 41, 42) and enhancer of split, E(spl),
which only binds the N-box (43). E2F1 is particularly
interesting because it contains a proline residue in a similar
position in its basic domain as AHR. (In fact this proline is part of a
proline-glycine pair, which is particularly disruptive of α-helix
formation.) However, unlike AHR, binding of E2F1 to DNA does not
require amino acids amino-terminal to its basic domain (44).
In summary, we have provided evidence that ARNT interacts with the
XRE in a manner highly analogous to several bHLH proteins that
recognize the 5′-GTG-3′ E-box half site, whereas AHR interacts with DNA
in a manner unique among bHLH proteins. The putative DNA-binding region
we have identified in AHR spans at least 35 amino acids, from Tyr⁹ to Asn⁴³
or beyond, and is composed of two highly
basic blocks of amino acids separated by a 16-amino acid intervening
sequence (18) containing four proline residues. The intervening
amino acids may be critical for establishing a precise protein
conformation, because deletion of this region resulted in a complete
loss of DNA binding. These findings indicate that AHR possesses a novel
DNA-binding motif among bHLH proteins. Ultimately, x-ray
crystallography should definitively reveal the structure of the
DNA-protein interaction between the AHR·ARNT dimer and the XRE,
whereas reports such as the present study and that of Fukunaga and
Hankinson (18) provide strong indications of DNA-protein
contacts and also identify which interactions are essential for XRE
binding.