(Received for publication, September 13, 1995; and in revised form, November 7, 1995)
From the
The aryl hydrocarbon receptor (AHR) is a ligand-activated transcription factor that binds DNA in the form of a heterodimer with the AHR nuclear translocator protein (ARNT). Both proteins possess basic helix-loop-helix motifs. ARNT binds to the side of the xenobiotic responsive element (XRE) that resembles an E-box (the sequence recognized by the majority of other basic helix-loop-helix proteins), whereas AHR binds to the side of the XRE that does not conform to the E-box sequence. The basic region of ARNT closely resembles those of other E-box-binding proteins, whereas the ``nominal basic region'' of AHR (amino acids 27-39), although required for XRE binding, deviates from this consensus. By extensive mutational analysis it is shown here that an additional block of amino acids of AHR (from tyrosine 9 to lysine 20) that contains a highly basic segment is required for XRE binding and transcriptional activation. Deletion of the first nine amino acids negates XRE binding. Substitution of either tyrosine 9 or arginine 14 with alanine eliminates XRE binding, whereas alanine substitutions at certain other sites within the block reduce but do not eliminate binding. The reported absence of the first nine amino acids in the purified protein may therefore be artifactual. These results suggest that the amino acids of AHR involved in binding to the XRE constitute a novel DNA-binding domain, comprising amino acids located within and amino-terminal to the nominal basic region.
AHR binds a variety of environmentally important
carcinogens, including polycyclic aromatic and halogenated aromatic
hydrocarbons, and mediates carcinogenesis by these compounds. The
unliganded AHR is a component of a soluble cytosolic protein complex
containing a 90-kDa heat shock protein and perhaps other
proteins(1) . After binding ligand, AHR dissociates from the
above complex and translocates to the nucleus, where it heterodimerizes
with ARNT. The AHR/ARNT dimer binds specific DNA sequences, termed
XREs, in the enhancer regions of certain enzymes involved in the
metabolism of xenobiotics (reviewed in (2) ).
Mouse AHR and ARNT are 20% identical in amino acid sequence and resemble each other, as well as the SIM (single minded) protein of Drosophila, in domain structure(3, 4, 5) . The three proteins contain bHLH motifs in their amino-terminal regions and share a more centrally located, approximately 300-amino acid region of homology, which is also possessed by another Drosophila protein, PER (period). This PAS domain contains two approximately 50-amino acid degenerate direct repeats, termed PAS A and PAS B(6) . Recently, the PAS domain has been shown to mediate homodimerization of PER and heterodimerization of PER and SIM (7) as well as contributing toward dimerization of AHR with ARNT(5, 8) .
The bHLH motif is common to a number of transcription factors. Most bHLH-containing transcription factors bind as dimers to specific DNA sequences termed E-boxes (5`-CANNTG-3`) (reviewed in (9) ). The XRE sequence, to which the AHR/ARNT dimer binds, does not conform to the canonical E-box sequence. A consensus XRE sequence that can confer ligand-induced expression of a linked reporter gene has been identified as 5`-(T/G)NGCGTG(A/C)(G/C)A-3`(10, 11, 12) . The consensus sequence for binding the AHR/ARNT dimer is less restrictive and has been identified as 5`-CGTG(A/C)(G/C/T)(A/T)-3`. The four core nucleotides of the XRE (5`-CGTG-3`) are absolutely required for binding, whereas substitutions at other positions reduce binding affinity by up to about 8-fold(10, 11) . ARNT contacts the thymidine in the 5`-CGTG-3` core and thus in a region identical to an E-box half site (GTG), whereas AHR binds 5` proximal to this in a region differing from the E-box sequence(13) . Recent crystallographic data of the bHLH domains of the Max, USF, E47, and MyoD transcription factors binding to their specific target sequences have confirmed that their HLH domains are responsible, in part or entirely, for dimerization; that the basic domains are required for DNA binding; and that all contacts with nucleotide bases are restricted to amino acids in their basic domains(14, 15, 16, 17) . Consistent with these observations, we previously demonstrated the requirement of the basic domains and HLH domains in AHR and ARNT for TCDD-induced XRE binding and dimerization, respectively(5, 8) . Whereas the basic region of ARNT conforms well to the consensus for bHLH proteins, the basic region of AHR conforms only very poorly. Conformity only occurs at the extreme carboxyl-terminal end of the AHR basic region (Fig. 1). This observation is compatible with the fact that AHR binds to the ``non-E-box-like'' side of the XRE sequence. 
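The consensus sequences quoted above can be written as regular expressions, which makes it easy to see that a site can satisfy both XRE consensuses without containing an E-box. The following is only an illustrative sketch: the probe sequence and the helper names (`find_sites`, `reverse_complement`) are hypothetical and are not taken from this study.

```python
import re

# Consensus sequences from the text, transcribed as regular expressions.
FUNCTIONAL_XRE = re.compile(r"[TG][ACGT]GCGTG[AC][GC]A")  # 5'-(T/G)NGCGTG(A/C)(G/C)A-3'
BINDING_XRE = re.compile(r"CGTG[AC][GCT][AT]")            # 5'-CGTG(A/C)(G/C/T)(A/T)-3'
E_BOX = re.compile(r"CA[ACGT]{2}TG")                      # 5'-CANNTG-3'


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(pattern, seq):
    """List (strand, start, match) for non-overlapping matches on both strands.

    The start index is 0-based and refers to the strand that was scanned.
    """
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        hits += [(strand, m.start(), m.group()) for m in pattern.finditer(s)]
    return hits


# Hypothetical probe built around one consensus XRE (assumption, not from the paper).
probe = "GATCTTGCGTGAGAAGATC"

print("functional XRE:", find_sites(FUNCTIONAL_XRE, probe))
print("binding XRE:   ", find_sites(BINDING_XRE, probe))
print("E-box:         ", find_sites(E_BOX, probe))
```

Scanning both strands matters because the E-box subclass CACGTG is palindromic, whereas the XRE is not, so an XRE is reported on one strand only.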
We have assigned the boundaries of the AHR basic region by alignment to the corresponding region of the other bHLH proteins. Henceforth we will call this region the ``nominal basic'' region of AHR, because although it is less basic in character, it corresponds in position to the basic domain of other bHLH proteins.
Figure 1: Alignment of basic regions of certain bHLH proteins that bind the CACGTG subclass of E-boxes. The consensus sequence shown is that derived from all mammalian bHLH proteins that bind the CACGTG sequence(18, 19) . i, polar; @, aliphatic. Amino acids conforming to the consensus are shown in bold.
AHR purified from mouse liver lacked the nine most amino-terminal
amino acids encoded by the cloned cDNA for mouse
AHR(3, 20) . Bradfield and co-workers (3) suggested that these nine amino acids constituted a leader
peptide that is cleaved from the primary translation product to
generate the mature protein. Our interest in the amino-terminal region
of AHR developed from our observation that an AHR derivative lacking
the putative leader peptide generated in vitro from the
appropriately deleted cDNA is incapable of binding the XRE sequence in
conjunction with ARNT (this paper). In the current work we perform a
mutational analysis of AHR from its amino terminus to the nominal basic
region. Our results define a novel region required for DNA binding and
for in vitro and in vivo functionality. Our approach
was first to perform deletion analysis to identify the amino terminus
of the region required for DNA binding and then to substitute
individual amino acids or blocks of amino acids with alanine residues
in order to identify essential amino acids in the region. We utilized
this ``alanine scanning'' strategy for two reasons. (i) The
solved crystal structures of certain bHLH proteins demonstrate that
their basic regions are α-helical when interacting with their
specific DNA sequences(14, 15, 16, 17) . We
therefore did not wish to compromise any α-helical conformation
that may exist in the amino-terminal region of AHR. Substitution with
alanine should not have any disruptive effect on α-helix formation,
because alanine is the most common amino acid in protein
α-helices(21) . (ii) Alanine lacks a reactive side group.
Proline residues were left unaltered due to their potential structural
role. Utilizing this approach we identify several amino acids located
amino-terminal to the AHR nominal basic region that are required for
TCDD-induced XRE binding and transcriptional activation. These are
required in addition to amino acids located in the nominal basic
region.
Two clones were isolated and analyzed for each mutation. All clones were sequenced to confirm their mutations. Plasmids were prepared by QIAGEN maxiprep according to the supplier's protocols (QIAGEN, Chatsworth, CA).
Alanine scanning mutants Y9A, S11A, R12A, K13A,
R14A, R15A, and K16A were generated by PCR using Ultma Polymerase. 5`
primers (that contain the alanine codon substitutions) were used with
the external 3` PCR primer listed above. The 5` primers are as follows:
Y9A, 5`-CACCATGTCTAGCGGCGCCAACATCACCGCTGCCAGCCGC-3`; S11A,
5`-CACCATGTCTAGCGGCGCCAACATCACCTATGCCGCCCGCAAGCGG-3`; R12A,
5`-CACCATGTCTAGCGGCGCCAACATCACCTATGCCAGCGCCAAGCGGCGCAAGCC-3`; K13A,
5`-CACCATGTCTAGCGGCGCCAACATCACCTATGCCAGCCGCGCGCGGCGCAAGCC-3`; R14A,
5`-CACCATGTCTAGCGGCGCCAACATCACCTATGCCAGCCGCAAGGCGCGCAAGCCG-3`; R15A,
5`-CACCATGTCTAGCGGCGCCAACATCACCTATGCCAGCCGCAAGCGGGCCAAGCCGGTG-3`; and
K16A,
5`-CACCATGTCTAGCGGCGCCAACATCACCTATGCCAGCCGCAAGCGGCGCGCTCCGGTGCAG-3`.
The PCR products were digested with restriction enzymes NarI
and Bpu1102I and then ligated to similarly digested
pcDNA3/AHR.
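As a consistency check on the 5` primers listed above, each primer can be translated in silico to confirm that the alanine codon falls at the intended residue. The sketch below assumes that the reading frame begins at the first ATG in each primer (the presumed initiator codon) and uses the standard genetic code; the function name `translate_from_atg` is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# 5' primers exactly as listed in the text.
PRIMERS = {
    "Y9A": "CACCATGTCTAGCGGCGCCAACATCACCGCTGCCAGCCGC",
    "S11A": "CACCATGTCTAGCGGCGCCAACATCACCTATGCCGCCCGCAAGCGG",
    "R12A": "CACCATGTCTAGCGGCGCCAACATCACCTATGCCAGCGCCAAGCGGCGCAAGCC",
    "K13A": "CACCATGTCTAGCGGCGCCAACATCACCTATGCCAGCCGCGCGCGGCGCAAGCC",
    "R14A": "CACCATGTCTAGCGGCGCCAACATCACCTATGCCAGCCGCAAGGCGCGCAAGCCG",
    "R15A": "CACCATGTCTAGCGGCGCCAACATCACCTATGCCAGCCGCAAGCGGGCCAAGCCGGTG",
    "K16A": "CACCATGTCTAGCGGCGCCAACATCACCTATGCCAGCCGCAAGCGGCGCGCTCCGGTGCAG",
}


def translate_from_atg(primer):
    """Translate from the first ATG (assumed start codon), dropping any partial codon."""
    coding = primer[primer.find("ATG"):]
    usable = len(coding) - len(coding) % 3
    return "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, usable, 3))


for name, primer in PRIMERS.items():
    position = int(name[1:-1])            # e.g. "R14A" -> residue 14
    peptide = translate_from_atg(primer)
    residue = peptide[position - 1]       # residues are numbered from 1
    print(f"{name}: {peptide} (residue {position} = {residue})")
```

Under these assumptions every primer encodes alanine at the position named in the mutant designation, and the unmutated positions agree with the residues discussed in the text (for example, tyrosine 9 and arginine 14).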
Two clones were isolated and analyzed for each mutation. All clones were sequenced to confirm their mutations. Plasmids were prepared by QIAGEN maxiprep according to the supplier's protocols (QIAGEN).
Figure 2: XRE binding analysis of AHR amino-terminal mutants. Equimolar amounts of AHR and its mutant derivatives were mixed with equimolar amounts of ARNT, incubated with or without 10 nM TCDD as indicated and subjected to gel mobility shift analysis. The open arrow indicates the AHR·ARNT·XRE complex. The solid arrow indicates free probe.
AHR amino-terminal deletion mutants N-2, N-4, N-6, and N-7 showed that loss of TCDD-induced XRE binding capacity (to <1%) does not occur until removal of amino acid 9 (N-8). The results obtained with AHR mutant N-14 are consistent with this observation. Clustered alanine scanning mutants CA(12-16), CA(18-21), and CA(22,23,25) were designed to identify the carboxyl-terminal boundary of the domain in the amino-terminal region required for DNA binding. CA(12-16) showed a loss of TCDD-induced XRE binding capacity (to <1%), indicating that one or more of amino acid residues 12-16 is required for DNA binding. Mutant CA(18-21) showed a significant (49%, p < 0.05) decrease in XRE binding capacity, whereas mutant CA(22,23,25) bound XRE at levels not significantly different (p > 0.05) from full-length AHR. These results demonstrate that the carboxyl-terminal boundary of the amino-terminal region required for TCDD-induced XRE binding does not extend beyond amino acid 21.
Single alanine substitutions within
amino acids 9-21 allowed for the identification of specific amino
acids within this region required for TCDD-induced XRE binding. Amino
acids 9 (Y9A) and 14 (R14A) are required for DNA binding, as
substitutions to alanines at these positions resulted in profound
reductions (to <1 and 4%, respectively) of TCDD-induced XRE binding
capacity compared with normal AHR. Alanine substitutions at amino acid
positions 11 (S11A), 12 (R12A), 13 (K13A), 16 (K16A), and 20 (K20A)
resulted in less marked but nevertheless significant (p <
0.05) reductions in XRE binding. AHR mutants R15A, V18A, Q19A, and T21A
had TCDD-induced XRE binding capacities not significantly different
from that of full-length AHR. Although AHR mutants N-8, N-14,
CA(12-16), and Y9A formed AHR·ARNT·XRE complexes at
very low levels (<1%), these complexes were detectable upon extended
exposure of the gels to film (data not shown), indicating that XRE
binding was drastically reduced but not totally abolished.
Figure 3: Immunoprecipitation of AHR amino-terminal mutants with AHR antibodies. Equimolar amounts of radiolabeled AHR and its mutant derivatives were incubated with AHR antibodies. Immunoprecipitated pellets and acetone-precipitated supernatants were subjected to 7.5% PAGE. p, immunoprecipitate; s, supernatant. The positions of the molecular mass markers are indicated on the left.
Figure 4: Dimerization of AHR amino-terminal mutants with full-length ARNT. AHR and its mutant derivatives were incubated with equimolar amounts of radiolabeled ARNT. The AHR antibody preparation was used throughout, except in the fifth and sixth lanes, where preimmune IgG (PI) was used. Immunoprecipitated pellets and acetone-precipitated supernatants were subjected to 7.5% PAGE. -, no TCDD treatment; +, TCDD treatment; p, immunoprecipitate; s, supernatant. The positions of the molecular mass markers are indicated on the left.
The first six lanes of Fig. 4 represent the controls for the coimmunoprecipitation assay. These controls utilized full-length AHR and ARNT incubated in the absence or the presence of TCDD and treated with AHR antibodies or the corresponding preimmune IgG, as indicated. The data demonstrate that TCDD treatment increased the amount of ARNT coimmunoprecipitated with AHR and that very little ARNT was precipitated from the coincubation mixture upon treatment with the preimmune IgG preparation. Therefore, the coimmunoprecipitations were efficient, inducible, and specific for AHR/ARNT heterodimers. All mutant proteins heterodimerized with ARNT as efficiently as full-length AHR, indicating that the reduced TCDD-induced XRE binding capacity is not due to loss of dimerization capacity.
Consistent with their dramatically reduced XRE binding activities (<4%), N-8, N-14, CA(12-16), Y9A, and R14A generated markedly reduced CAT activities (16-29% of the activity obtained with wild-type AHR) when cotransfected with pMC6.3k. In addition, the majority of other mutants (N-2, N-6, CA(18-21), S11A, and R12A) with significantly but less markedly reduced XRE binding capacities also generated significantly reduced CAT activities. The reductions in CAT activities for the mutants were not as marked as their reductions in XRE binding. Interestingly, a few mutant clones (CA(22,23,25), R15A, Q19A, and T21A) that bound XRE at levels not significantly different from full-length AHR generated CAT activities that were significantly (p < 0.01) greater than that of full-length AHR. In summary, the CAT activities associated with the mutant AHR constructs reflected their in vitro XRE binding activities.
Poland and co-workers purified murine AHR and determined the amino-terminal amino acid sequence of their purified material(20) . The protein encoded by the AHR cDNA contains nine amino acids amino-terminal to the sequence obtained by Poland and co-workers(3, 4) . Poland and co-workers suggested that these nine amino acids constitute a leader peptide that is removed to generate the mature protein(3) . However, our data show that removal of these nine amino acids and mutation of the ninth amino acid both eliminate XRE binding, suggesting that the absence of these amino acids in the protein purified by Poland and co-workers is an artifact of purification or sequencing.
XRE binding analysis of our AHR cDNA
amino-terminal deletion mutants clearly demonstrated that DNA binding
capacity was maintained until removal of the tyrosine at position
9 (N-8). The clustered alanine scanning clones defined the location of
the carboxyl-terminal boundary of the amino-terminal region required
for DNA binding at Thr21. XRE binding analysis of alanine
scanning mutants demonstrated that substitution of Tyr9 or Arg14
with alanine results in the loss of TCDD-induced XRE
binding capacity. Alanine substitutions at Ser11, Arg12, Lys13, Lys16,
and Lys20 significantly reduced DNA binding function but to
a lesser degree. Many of the amino acids for which substitution with
alanine reduces or eliminates XRE binding reside in a stretch of five
basic amino acids (Arg12 to Lys16). The results
of the coimmunoprecipitation experiments demonstrate that the decreased
capacity of the mutant proteins to bind the XRE sequence is not due to
reduced dimerization ability. Consistent with in vitro data,
most AHR mutants that possessed reduced XRE binding capacities also
generated significantly reduced in vivo CAT activities when
cotransfected along with the pMC6.3k reporter plasmid (containing a CAT
gene driven by the CYP1A1 enhancer/promoter) into CV-1 cells
in the presence of TCDD. We have shown that AHR mutants that are
completely unable to dimerize with ARNT and/or bind the XRE are
completely unable to stimulate CAT activity in the above
assay(8) . All the current mutants retained at least some XRE
binding activity (although it was barely detectable in Y9A and
equivalent mutants) and also generated significant CAT activity. The
observation that the CAT activities of the mutants were not so severely
reduced as their in vitro XRE binding activities is probably
due to overexpression of the encoded AHR proteins in the transfected
cells and/or to the fact that the CYP1A1 5`-flanking region
present in pMC6.3k contains multiple functional XRE sequences, and
these sequences can act in a cooperative fashion to stimulate
transcription(31) . The highly basic region of amino acids
12-17 resembles a nuclear localization signal. However, because
alanine substitutions within this region (except for arginine 14) only
modestly reduce transcriptional activation of the CAT gene in the
cotransfection assay and do not reduce CAT activities to any greater
degree than they reduce XRE binding, this region cannot be required for
nuclear localization. All the amino acids where we showed that alanine
substitution affects XRE binding are conserved in mouse, rat, and human
AHR(32) .
The basic region of ARNT conforms well to the consensus for the basic regions of other bHLH proteins that bind the E-box subclass CACGTG (Fig. 1). In particular, arginine 102 of ARNT corresponds in position to arginine residues in other bHLH proteins that bind the above E-box subclass(14, 15, 33) . The nominal basic region of AHR also contains an arginine residue at the corresponding location. However, ARNT but not AHR contains a glutamic acid residue (glutamate 98 in ARNT) that is known, from x-ray crystallographic analysis of other bHLH proteins, to contact the CA bases at each end of the E-box(14, 15, 16, 17) . These observations are consistent with the findings that ARNT binds the E-box side of the XRE, that AHR binds the side of the XRE that does not resemble the E-box(1), and that a homodimer of ARNT can apparently bind the above E-box subclass (34, 35) . They also suggest that the half-sites for AHR and ARNT binding are divided between the third and fourth nucleotides in the above sequence.
The most plausible explanation
for our observations is that the amino acids within the amino-terminal
region that we identified as being required for DNA binding as well as
amino acids within the nominal basic region(8) directly contact DNA. However, physical analysis of the
AHR·ARNT·XRE complex will be required to prove this.
Crystallographic analysis of other bHLH proteins bound to their DNA
targets has shown that the DNA binding domains of these proteins only
extend over a span of 11-14 amino acids and that in each case the
basic region and helix 1 form a continuous α-helix. In contrast,
our results indicate that DNA binding by AHR extends from amino acid 9
to amino acid 39, a stretch of 31 amino acids. Furthermore, this
segment contains four proline residues, strongly suggesting that it
does not form a continuous α-helix. The Drosophila protein
Hairy and mammalian Hes proteins (36, 37) contain a
centrally located proline within their basic regions. AHR and ARNT,
however, do not bind the Hairy/Hes N-box consensus sequence
(5`-CANNAG-3`). Additionally, the presence of proline residues in AHR
does not inhibit DNA binding to the extent that AHR acts as a negative
regulator of transcription, as seen with the Id family of transcription
factors that lack a basic region(38) . The bHLH protein E2F1,
which binds the sequence 5`-GGCGGG-3` as a homodimer, resembles AHR
with regard to the nonconformity of its basic region and the position
of a proline residue in this region. However, unlike AHR, DNA binding
by E2F1 does not require amino acid residues amino-terminal to its
basic region(39) . Thus the region of AHR we have defined
appears to represent a novel DNA-binding domain very different from
that of other bHLH proteins.
Tyrosine 9, which is required for DNA
binding, does not reside within a known protein kinase phosphorylation
sequence. Serine 11 and threonine 21 are potential phosphorylation
sites. Phosphorylation could be involved in DNA binding at the former
amino acid residue, because alanine substitution at this site reduces
XRE binding. Because alanine substitution at the latter residue does
not affect DNA binding, phosphorylation at threonine 21 is not involved
in DNA binding. However, although some indirect evidence exists for the
involvement of phosphorylation of AHR in XRE
binding(40, 41, 42, 43, 44) ,
direct analysis of phosphorylation sites on AHR suggests that no
phosphorylations occur within the basic or amino-terminal regions of
the protein(45) . Alanine substitution at arginine 14 has a
much greater effect on DNA binding than substitutions at other
positions within the highly basic region encompassing amino acids
12-16, suggesting that Arg14 may contact a DNA base
and that the other amino acids in this block may not play a role in DNA
sequence discrimination and may contact the phosphodiester backbone. A
similar argument suggests that tyrosine 9 may also contact a base(s) in
DNA.