The AHR is a bHLH protein that mediates the
metabolic, carcinogenic, and teratogenic effects of compounds such as
TCDD (1). In response to agonists, the AHR interacts with a
related protein known as ARNT to form a dimeric complex
that is capable of binding genomic enhancer elements, known as DREs,
and activating transcription at adjacent
promoters (2, 3, 4, 5). The AHR and
ARNT have sequence similarities to two regulatory proteins found in Drosophila, SIM and
PER (6, 7, 8, 9, 10). SIM
is a developmentally regulated bHLH protein involved in controlling
central nervous system midline gene expression (11) . PER lacks
a bHLH domain and thus may be an inhibitor of a related signaling
pathway involved in the maintenance of circadian rhythms (12) .
The hallmark of this family of proteins is that they all possess
homology in a sequence of 200-300 amino acids termed a PAS domain (13). In the AHR, the PAS domain has been shown to be involved
in ligand binding and interaction with Hsp90, and it may serve as a secondary
surface to support ARNT
dimerization (2, 14, 15, 16).
Basic/helix-loop-helix proteins are involved in a variety of tightly
regulated biological processes, such as the regulation of myogenesis
(MyoD/E47)(17) , neurogenesis
(Achaete-scute/Daughterless)(18) , regulation of immunoglobulin
genes (TFEC/TFE3) (19) , cellular
proliferation (Myc/Max)(20, 21) , and xenobiotic
metabolism (AHR/ARNT)(10) . Biochemical and crystallographic
data suggest that the HLH domains often act in concert with secondary
dimerization surfaces (e.g. ``leucine zippers'' and
possibly PAS domains) to position the two
helical basic regions
within opposing major grooves of B-DNA, generating a ``scissor
grip'' structure with high affinity for the core DNA sequence,
CANNTG (22, 23, 24) . This DNA enhancer
sequence is commonly referred to as an E-box and contains either CG or
GC dinucleotides at the degenerate positions (i.e. CACGTG or
CAGCTG) (25, 26, 27, 28) . Current
models suggest that E-boxes can be viewed as containing two half-sites,
with each partner's basic region determining half-site
specificity (e.g. the 5`-CAN or the NTG-3` half-sites within
5`-CANNTG-3`). The multiplicity of half-sites and potential
dimerization partners may allow production of a large number of homo-
or heterodimeric pairs, each with unique sequence binding specificities
and consequences for cellular signaling. In contrast to the recognition
sites for most bHLH dimers, the cognate response element of the
AHR·ARNT complex, the DRE, usually contains
TNGCGTG (5, 29, 30, 31, 32).
Unlike the E-box, the DRE is not palindromic, and thus the DNA
half-site specificities of each protein are not readily apparent and
are probably different.
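The contrast between the palindromic E-box and the non-palindromic DRE can be checked mechanically. The following is a minimal sketch: the function names are illustrative, and TTGCGTG is used as one arbitrary instance of TNGCGTG (the choice of T at the degenerate position is an assumption made only for the example).

```python
# Watson-Crick complements for unambiguous DNA bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(site: str) -> bool:
    """A site is palindromic when both strands read the same 5' to 3'."""
    return site.upper() == reverse_complement(site)


# Two E-boxes named in the text, and one instance of the DRE core (N = T here).
for site in ("CACGTG", "CAGCTG", "TTGCGTG"):
    print(site, is_palindromic(site))
```

For a palindromic site the two half-sites are reverse complements of each other, which is why each partner of a dimer can be assigned to one half. For the DRE this symmetry is absent, so the assignment has to be determined experimentally.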
In this study, we employed a DNA selection
and amplification protocol to identify those bHLH-PAS protein
combinations that could form productive DNA binding species and to
characterize their individual DNA recognition sites. To validate the
approach, we first demonstrated that the AHR·ARNT heterodimer
would select the known DRE sequence from a pool of over 10⁷
sequences. We then used this selection approach to demonstrate
that ARNT also has the capacity to form homodimers as well as
heterodimers with SIM, with each complex generating a unique DNA
sequence binding specificity. Integration of the DNA selection and
coprecipitation results allowed us to deduce the half-site
specificities and pairing rules for the AHR, ARNT, and SIM.
EXPERIMENTAL PROCEDURES
Materials
The SIM cDNA expression
plasmid, pSIMNB40, was a gift from Dr. Stephen Crews (University of
North Carolina, Chapel Hill). The plasmids pmuAHR, pmuAHRCΔ516,
pmuAHRGNΔ315, and phuARNT were constructed as described previously (2, 33). The affinity-purified anti-ARNT polyclonal
immunoglobulins were a gift from Dr. Alan Poland(34) . The
affinity-purified anti-AHR polyclonal immunoglobulins, G1295 4B, were
raised in a goat against a synthetic peptide corresponding to the
N-terminal sequence of the protein as described
previously (35). Purified immunoglobulin was obtained from
Sigma. Nickel-nitrilotriacetic acid resin was obtained from Qiagen
(Chatsworth, CA).
Oligonucleotides
Oligonucleotides were
synthesized at the Northwestern University Biotechnology Center using
an Applied Biosystems DNA synthesizer (Foster City, CA). The commonly
recognized core DRE sequences are underlined, E-box recognition sites
are in bold, and SIM/ARNT recognition sites are in italics. OL73,
TCGAGTAGATCACGCAATGGGCCCAGC; OL74, TCGAGCTGGGCCCATTGCGTGATCTAC; OL185,
GGCGGATCCTGAGTCTGAAC; OL186, CGTCTCGAGACGCTCAGG; OL187,
GGCGGATCCTGAGTCTGAAC(N)₁₃CCTGAGCGTCTCGAGACG; OL224,
GGCGGATCCGATCTAGATTC(N)₇GCGTG(N)₇CCTGAGCGTCTCGAGACG;
OL225, GGCGGATCCGATCTAGATTC; OL316,
TCGAGCTGGGCAGGTCATGTGGCAAGGC; OL317,
TCGAGCCTTGCCACATGACCTGCCCAGC; OL318,
TCGAGCTGGGGGCATTGCGTGACATACC; OL319, TCGAGGTATGTCACGCAATGCCCCCAGC;
OL321, TCGAGCTGGGCAGGTCACGTGGCAAGGC; OL322,
TCGAGCCTTGCCACGTGCACCTGCCCAGC; OL323,
TCGAGCTGGGCAGGTCAGCTGGCAAGGC; OL324,
TCGAGCCTTGCCAGCTGACCTGCCCAGC; OL329,
TCGAGCTGGGCATGTCACGTGACCGAGC; OL330,
TCGAGCTCGGTCACGTGACATGCCCAGC; OL331,
TCGAGCCATGGGATGTGCGTGACATTTC; OL332,
TCGAGAAATGTCACGCACATCCCATGGC; OL464,
TCGAGCCATGGGATGTACGTGACATTTC; OL465,
TCGAGAAATGTCACGTACATCCCATGGC; OL501,
TCGACTAGAAATTTGTACGTGCCACAGA; OL502,
TCTGTGGCACGTACAAATTTCTAGTCGA; OL503,
TCGACTAGAAATTTGTGCGTGCCACAGA; OL504,
TCTGTGGCACGCACAAATTTCTAGTCGA.
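Paired oligonucleotides in this list can be checked for complementarity before they are annealed. The sketch below assumes that the leading TCGA on each strand of a pair such as OL73/74 is a single-stranded 5` overhang and that the remaining bases form the duplex. The source does not describe the design, so this reading, along with the function names, is an assumption.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def anneals_with_overhang(top: str, bottom: str, overhang: str = "TCGA") -> bool:
    """True if both strands start with `overhang` and the remaining
    bases of one strand are the reverse complement of the other."""
    if not (top.startswith(overhang) and bottom.startswith(overhang)):
        return False
    return reverse_complement(top[len(overhang):]) == bottom[len(overhang):]


# Sequences copied from the list above.
OL73 = "TCGAGTAGATCACGCAATGGGCCCAGC"
OL74 = "TCGAGCTGGGCCCATTGCGTGATCTAC"
OL318 = "TCGAGCTGGGGGCATTGCGTGACATACC"
OL319 = "TCGAGGTATGTCACGCAATGCCCCCAGC"

print(anneals_with_overhang(OL73, OL74))
print(anneals_with_overhang(OL318, OL319))
```

Not every pair in the list follows this layout; OL501/502, for example, appear to be complementary over their full length, so the overhang argument would need to be adjusted for such pairs.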
Protein Expression
In vitro expression of the AHR, AHRCΔ516, AHRGNΔ315, ARNT, and SIM
proteins was carried out in rabbit reticulocyte lysates (Promega) as
previously reported (2). For verification of protein
expression, the translation was performed in the presence of
[³⁵S]methionine, and the product was analyzed by
SDS-polyacrylamide gel electrophoresis. The expressed
proteins were quantitated by excising the radiolabeled proteins from the
gel and scintillation counting. Baculovirus expression and purification
of histidine-tagged AHR and ARNT were carried out as reported
previously (36).
Gel Shift Analysis
The DNA probes were
radiolabeled either with [γ-³²P]ATP, by end
labeling with T4 polynucleotide kinase (37), or by PCR of the
appropriate template in the presence of
[α-³²P]dCTP, using OL186 and either OL185 or
OL225 as primers (27). Unincorporated nucleotides were removed
using a 1-ml G-25 Sephadex spin column. The protein combinations were
incubated for 30 min at 30 °C to facilitate protein dimerization.
The clone AHRCΔ516, a constitutively active form of the AHR that
interacts with ARNT and binds DNA in a ligand-independent manner, was
used to circumvent the use of agonist in some experiments (2).
When full-length AHR was used, the incubation period was extended to 2
h in the presence of 10 µM of the AHR agonist
β-naphthoflavone. To minimize nonspecific interactions, 200 ng of
poly(dI-dC) was added to the protein mixture along with KCl (final
concentration, 100 mM). After 10 min of incubation at room
temperature, the DNA probe was added (100,000 cpm), and the sample was
allowed to incubate for an additional 10 min. Samples were then
subjected to 4% acrylamide nondenaturing gel electrophoresis using 0.5×
TBE (45 mM Tris base, 45 mM boric acid, 1
mM EDTA, pH 8.0) as the running buffer (38).
DNA Selection and Amplification
The DNA
binding site selection and amplification was performed essentially as
described (27). For example, 10 ng of OL187 containing 13
sequential 4-fold degenerate nucleotides (4¹³ ≈ 7 × 10⁷
possible sequences) was annealed to a 5-fold
molar excess of primer OL186. The complementary strand was synthesized
by incubation with the Klenow fragment of DNA polymerase (5 units) at
37 °C for 1 h. The resultant double-stranded DNA was purified by
agarose gel electrophoresis (NuSieve, FMC Bioproducts, Rockland, ME),
electroelution, and precipitation. For the first round of selection, 10
ng of the double-stranded oligonucleotide pool and either 1 fmol of in vitro expressed protein or 20 fmol of baculovirus-expressed
protein were subjected to gel shift analysis. The electrophoresis was
terminated when the bromphenol blue dye marker had migrated 1.5 cm. In
this manner, the protein-complexed oligomer could be efficiently
recovered and the majority of unbound oligonucleotide eliminated. The
protein-bound oligonucleotide was then isolated from the upper 1 cm of
the gel and was eluted for 3 h at 37 °C in buffer containing 10
mM Tris-HCl (pH 8.0), 1 mM EDTA, 50 mM NaCl,
and 0.2% SDS. The eluant was extracted with phenol:chloroform:isoamyl
alcohol (25:24:1), 10 µg of glycogen was added, and the DNA was
precipitated. One-fifth of the recovered oligonucleotide pool was
amplified by PCR. PCR conditions were 95 °C (1 min), 55 °C (1
min), 72 °C (30 s) for 25 cycles. Reactions contained 10 mM Tris-HCl, 50 mM KCl, 1.5 mM MgCl₂,
0.001% (w/v) gelatin, 200 µM of each deoxyribonucleotide
triphosphate, 2 units of Taq polymerase and primers OL185 and
OL186 (when OL187 was the template) or OL186 and OL225 (when OL224 was
the template) in a total volume of 100 µl. After ethidium bromide
visualization, approximately 5 ng of amplified template was
radiolabeled by PCR and subjected to gel shift analysis for the
subsequent round of selection. In these later rounds of selection, the
bromphenol blue dye was allowed to migrate approximately 8 cm from the
top of the gel to achieve higher resolution. The gels were then dried,
the specific complexes were visualized following autoradiography, and
the appropriate areas were excised. In most analyses, a double-stranded
oligonucleotide corresponding to a commonly used synthetic DRE
(annealed OL73/74) served as a migration marker. In initial rounds of
selection and amplification, specific complex formation was determined
by its migration similar to the OL73/74 complex and its dependence on
the expressed protein (e.g. absent in lanes containing only
one member of the heteromeric pair or unprogrammed reticulocyte lysate
when analyzing for homomeric interactions). The presence of the AHR or
ARNT in a complex was verified by the ability to
``supershift'' the complex upon polyacrylamide gel
electrophoresis using either anti-AHR or anti-ARNT immunoglobulins (1
ng). Once a discrete protein-oligomer complex could be detected
(typically after three or four rounds of selection and amplification),
the amplified oligonucleotide was either digested with BamHI
and XhoI and subcloned into pBluescript SK (Stratagene) or
extracted with phenol:chloroform:isoamyl alcohol (25:24:1) and directly
subcloned into pGEM-T (Promega). Individual clones were sequenced using
the dideoxy chain termination method(39) .
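The theoretical complexity of each degenerate pool, and a rough estimate of how many copies of each sequence are present in the first round, can be computed as follows. This is a back-of-the-envelope sketch: the average mass of 650 g/mol per base pair is a commonly used approximation and is not taken from this study, and the function names are illustrative.

```python
# Fixed regions flanking the 13 degenerate positions of OL187 (from the oligonucleotide list).
LEFT_REGION = "GGCGGATCCTGAGTCTGAAC"
RIGHT_REGION = "CCTGAGCGTCTCGAGACG"

AVOGADRO = 6.02214076e23
GRAMS_PER_MOL_PER_BP = 650.0  # assumed average for double-stranded DNA


def pool_complexity(n_random: int) -> int:
    """Number of distinct sequences with n_random 4-fold degenerate positions."""
    return 4 ** n_random


def copies_per_sequence(mass_ng: float, length_bp: int, n_random: int) -> float:
    """Approximate number of molecules per unique sequence in mass_ng of duplex DNA."""
    moles = (mass_ng * 1e-9) / (length_bp * GRAMS_PER_MOL_PER_BP)
    return moles * AVOGADRO / pool_complexity(n_random)


ol187_length = len(LEFT_REGION) + 13 + len(RIGHT_REGION)

print(pool_complexity(13))  # 67108864, i.e. about 7 x 10^7 (OL187)
print(pool_complexity(14))  # 268435456, i.e. about 3 x 10^8 (OL224, 7 + 7 positions)
print(round(copies_per_sequence(10, ol187_length, 13)))
```

Under these assumptions, 10 ng of the OL187-derived pool would contain on the order of a few thousand copies of each possible sequence, so every 13-mer is expected to be represented in the first round.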
Dissociation Rate Analysis
The
dissociation rates of each DNA binding complex (i.e. full-length AHR·ARNT,
ARNT·ARNT, and SIM·ARNT) were
determined by gel shift analysis using the indicated DNA sequences as
probes. For each off-rate analysis, a master binding reaction
equivalent to at least six reactions as described above was used with 1
ng of end-labeled probe. Following binding, a 100-200-fold molar
excess of unlabeled, double-stranded oligonucleotide that was
identical to the probe DNA was added. Aliquots (25 µl) were removed
and analyzed at the indicated times. To determine the end point value,
a 100-fold excess of unlabeled competitor was added prior to the
introduction of labeled probe. Complex formation at each time point was
determined using a Fuji PhosphorImager. The amount of protein-DNA
complex from the end point value was subtracted from each intermediate
time point value. To evaluate possible degradation of the protein-DNA
complexes, a mixture containing the protein(s) and end-labeled probe
was incubated for 20 min in the absence of competitor oligonucleotide
(data not shown). Quantitation of this control indicated a degradation
of less than 5% of the complex over the time period analyzed. Half-life (t½)
was calculated from the slope of the linear
regression curve, where t½ = 0.693/k and k = -(2.303)(slope).
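The half-life calculation can be sketched as follows. The factor 2.303 implies that the regression is performed on the base-10 logarithm of the remaining complex against time; that reading, the normalization to the first time point, and all names and numbers in the example are assumptions, not details given in the text.

```python
import math


def fit_slope(times, values):
    """Ordinary least-squares slope of values regressed on times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return sxy / sxx


def half_life(times, signals, end_point=0.0):
    """Half-life of a complex from gel shift band intensities.

    times     -- sampling times after competitor addition
    signals   -- band intensity of the protein-DNA complex at each time
    end_point -- intensity when competitor is added before the probe
    """
    corrected = [s - end_point for s in signals]               # subtract end point value
    log_fraction = [math.log10(c / corrected[0]) for c in corrected]
    k = -2.303 * fit_slope(times, log_fraction)                # k = -(2.303)(slope)
    return 0.693 / k                                           # t1/2 = 0.693/k


# Hypothetical data: ideal first-order decay over a constant background of 50.
times = [0, 2, 4, 8]
signals = [1000 * 10 ** (-0.1 * t) + 50 for t in times]
print(half_life(times, signals, end_point=50))
```

The result carries the same time units as `times`. With real data, the linearity of the log plot should be inspected before a single rate constant is reported.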
Coprecipitation
Sf9 soluble extract
containing approximately 120 µg of baculovirus-expressed ARNT and
³⁵S-labeled reticulocyte lysate-expressed protein (ARNT,
full-length AHR, AHRCΔ516, AHRGNΔ315, or SIM) was combined with
the nickel-nitrilotriacetic acid resin in wash buffer (50 mM Tris, pH 7.4, 100 mM KCl, 10% glycerol, 10 mM
β-mercaptoethanol, 0.4% Tween 20, and 5 mM imidazole)
and mixed gently for 2 h at 4 °C. In parallel reactions, uninfected
Sf9 soluble extract containing similar amounts of total protein was
substituted for ARNT soluble extract as a negative control. Samples
containing oligonucleotides were incubated at room temperature for 10
min in the presence of poly(dI-dC) (10 µg) followed by the addition
of the indicated oligonucleotides prior to the 2-h incubation. The
resin was pelleted by centrifugation at 16,000 ×
g for 10 s, and the samples were washed five times using 1 ml of
wash buffer. The pellets were resuspended and analyzed by
SDS-polyacrylamide gel electrophoresis and autoradiography.
Statistical Analysis
The χ-square
goodness of fit test was used to determine whether frequencies of
nucleotides at each position of the oligonucleotide were different from
expected random frequencies (40). In the case where DNA
selection and amplification-derived AHR·ARNT sequences were
compared with those present in bona fide DREs, two by two
contingency tables were used to compare frequencies of nucleotides.
Significance for all tests was set at p < 0.01.
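The goodness of fit test at a single position can be sketched as follows. The sketch assumes that the "expected random frequencies" are 25% for each nucleotide and uses the tabulated critical value for 3 degrees of freedom at p = 0.01. The counts are hypothetical, and the function names are illustrative.

```python
# Critical value of the chi-square distribution, 3 degrees of freedom, p = 0.01.
CRITICAL_DF3_P01 = 11.345


def chi_square_uniform(counts):
    """Chi-square statistic for base counts at one position against
    an expected frequency of 25% for each of A, C, G, and T (an assumption)."""
    n = sum(counts.values())
    expected = n / 4
    return sum((counts.get(base, 0) - expected) ** 2 / expected for base in "ACGT")


def is_nonrandom(counts):
    """True if the position deviates from random at p < 0.01."""
    return chi_square_uniform(counts) > CRITICAL_DF3_P01


# Hypothetical position in 24 clones where G dominates.
position = {"A": 1, "C": 1, "G": 22, "T": 0}
print(chi_square_uniform(position), is_nonrandom(position))
```

A position at which the four nucleotides are equally represented gives a statistic of zero and is reported as N in a consensus sequence.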
RESULTS
Validation of the DNA Selection and Amplification
Strategy
To validate the DNA selection and amplification
technique, we first examined the nucleotide specificity of the
AHR·ARNT heterodimer. We amplified a pool of oligonucleotides,
derived from OL187, to generate double-stranded oligomers that
contained 13 consecutive random nucleotides, theoretically encoding
approximately 7 × 10⁷ (4¹³) unique sequences. The
oligonucleotides that specifically bound to the AHR·ARNT complex
were subjected to three rounds of selection and amplification. Twenty-four
selected oligonucleotides were cloned and sequenced (Fig. 1A). Statistical analysis by χ-square was
performed to identify those nucleotides preferentially selected by
the AHR·ARNT complex (Fig. 1B). Nucleotides that
occurred at greater than expected frequencies (p < 0.01)
were used to derive a consensus recognition sequence, TNGCGTGC (Fig. 1C). Of these oligonucleotides, 22 contained the
GCGTG core sequence that is commonly found in bona fide DREs. Two
oligonucleotides, AHA23 and AHA24, contained similar core motifs, TCGTG
and GTGTG, respectively. Subsequent gel shift analysis indicated that
these two sequences were capable of binding AHR·ARNT complexes,
albeit at lower affinities than those sites containing the complete
core motif, GCGTG (results not shown). Analysis of the OL187
oligonucleotide pool that was cloned, amplified, and sequenced
directly, without selection by the AHR·ARNT heterodimer, served as
a control. χ-Square analysis of these sequences indicated that the
AHR·ARNT selected sequence was not the result of biased
oligonucleotide synthesis (Fig. 1D).
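The tabulation of nucleotide frequencies at each aligned position can be sketched as follows. The four aligned sequences are hypothetical and serve only to show the layout; they are not clones from this study. Values are expressed as frequency multiplied by 100.

```python
from collections import Counter


def position_frequencies(aligned):
    """For equal-length aligned sequences, return one dict per position
    giving the frequency of each base multiplied by 100 (rounded)."""
    n = len(aligned)
    table = []
    for column in zip(*aligned):          # iterate over alignment columns
        counts = Counter(column)          # missing bases count as 0
        table.append({base: round(100 * counts[base] / n) for base in "ACGT"})
    return table


# Hypothetical aligned clones sharing a GCGTG core.
clones = ["TTGCGTG", "TAGCGTG", "TCGCGTG", "TGGCGTG"]

for index, row in enumerate(position_frequencies(clones), start=1):
    print(index, row)
```

Each row can then be tested against the expected random frequencies, and only positions that pass the significance threshold contribute a defined nucleotide to the consensus.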
Figure 1:
AHR·ARNT recognition sites
selected from random sequences after three rounds of selection. A, the double-stranded oligonucleotide pool generated from
OL187 was incubated with 1 fmol of reticulocyte lysate-expressed
AHRCΔ516 and ARNT. The mixture was subjected to the selection and
amplification protocol, and the individual clones were sequenced. The
most highly conserved sequence, GCGTG, is boxed. B,
tabulation of nucleotide frequencies at each position (n = 24). All frequencies were multiplied by 100. The
frequencies of individual nucleotides were analyzed by χ-square at
the p < 0.01 level. Frequencies above the expected random
level are underlined. C, an AHR·ARNT consensus
sequence derived from statistically significant nucleotides. D, a sample from the double-stranded oligonucleotide pool
(OL187) was cloned and sequenced to verify equal representation of each
nucleotide, and the frequencies were calculated and analyzed by
χ-square (n = 19).
Analysis of Sequences Flanking the GCGTG
Motif
As shown in Fig. 1, the position of the GCGTG
was biased toward the 3`-end of the oligonucleotide. To determine if
this bias was the result of flanking nucleotides, we constructed an
additional nucleotide pool (OL224) that fixed the core motif, GCGTG,
between seven random nucleotides on the 3`- and 5`-ends. This
oligonucleotide pool, containing approximately 3 × 10⁸
possible sequences, was subjected to three rounds of the
selection and amplification protocol, and the selected oligonucleotides
were sequenced (Fig. 2A). χ-Square analysis was
performed (Fig. 2B) and indicated that nucleotide
preference occurred at 11 of the 14 flanking positions, resulting in a
consensus sequence of GGGNATYGCGTGACANNCC (underlined sequences are
fixed, Fig. 2C). Again, analysis of the control
oligonucleotide pool indicated that nucleotide preference was not the
result of biased oligonucleotide synthesis (Fig. 2D).
To confirm that our consensus sequence was highly specific for the
AHR·ARNT complex, we synthesized the corresponding oligonucleotide
(OL318/319) and performed gel shift analysis. As demonstrated in Fig. 3, complex formation required both proteins. Neither ARNT
nor AHRCΔ516 recognized this motif alone (Fig. 3, lanes
1-3); recognition of the consensus sequence by full-length
AHR and ARNT was ligand responsive (Fig. 3, lanes 4 and 5); and the complex was recognized by anti-ARNT and anti-AHR
immunoglobulins (Fig. 3, lanes 6 and 7).
Addition of purified immunoglobulin did not affect the migration of the
AHR·ARNT complex (Fig. 3, lane 8).
Figure 2:
AHR·ARNT selection analysis of
sequences flanking the GCGTG core. A, the double-stranded
OL224 oligonucleotide pool containing the fixed sequence, GCGTG,
flanked by seven random nucleotides on each side was incubated with 1
fmol of reticulocyte lysate-expressed AHRCΔ516 and ARNT. The
mixture was subjected to three rounds of DNA selection and
amplification, and the individual clones were sequenced. B,
tabulation of the nucleotide frequency at each position (n = 25). All frequencies were multiplied by 100. The
frequencies of individual nucleotides were analyzed by χ-square at
the p < 0.01 level. Frequencies above the expected level
are underlined. C, nucleotides with above expected
frequencies were used to derive an AHR·ARNT consensus sequence. D, the double-stranded oligonucleotide pool (OL224) was cloned
and sequenced to verify equal representation of each nucleotide, and
the frequencies were calculated and analyzed by
χ-square (n = 23).
Figure 3:
Specific recognition of the derived
consensus sequence obtained from DNA selection and amplification by the
AHR·ARNT complex. Approximately 0.5 fmol of reticulocyte
lysate-expressed proteins were subjected to gel shift analysis using
OL318/319, the derived AHR·ARNT consensus sequence (Fig. 2C), as a probe. The AHR·ARNT complex is
indicated by the arrow. The incubations contained the
following proteins: lane 1, ARNT alone; lane 2,
AHRCΔ516 alone; lane 3, both AHRCΔ516 and ARNT; lane 4, full-length AHR and ARNT incubated with dimethyl
sulfoxide (vehicle control); lane 5, full-length AHR and ARNT
incubated with 10 µM β-naphthoflavone; lane
6, AHRCΔ516 and ARNT incubated with anti-ARNT immunoglobulin; lane 7, AHRCΔ516 and ARNT incubated with anti-AHR
immunoglobulin (G1295); lane 8, AHRCΔ516 and ARNT
incubated with purified IgG.
Comparison of the AHR·ARNT Selected Sequence
With Bona Fide Enhancer Elements
To support the idea that
our strategy would select for biologically relevant DNA binding motifs,
we compared the consensus sequence selected by the AHR·ARNT
complex in vitro to sequences known to correspond to
functional enhancers in vivo. For this comparison, we first
analyzed 10 bona fide DREs to determine the frequency of
nucleotides at each position (Fig. 4). These frequencies were
then compared to the corresponding frequencies observed in the selected
and amplified oligonucleotides (Fig. 4D). The in
vitro derived consensus was similar to the bona fide DREs at 14
out of 19 of the nucleotide positions. Statistically significant
differences were detected at the outermost positions (-8,
-9, 9, 10) and at the -5 position.
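A two by two comparison of this kind, for one nucleotide at one position, can be sketched as follows. The text states only that two by two contingency tables were used. The Pearson χ-square statistic without continuity correction is therefore an assumption, and the counts are hypothetical.

```python
# Critical value of the chi-square distribution, 1 degree of freedom, p = 0.01.
CRITICAL_DF1_P01 = 6.635


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]], where
    a, b = base present / absent among in vitro selected clones and
    c, d = base present / absent among bona fide DREs."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # a row or column is empty; no comparison possible
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical counts: a base found in 20 of 25 selected clones
# but in only 1 of 10 bona fide DREs at the same position.
statistic = chi_square_2x2(20, 5, 1, 9)
print(statistic, statistic > CRITICAL_DF1_P01)
```

With only 10 bona fide DREs the expected counts in some cells are small, so an exact test may be preferable in practice; the sketch is meant only to make the comparison concrete.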
Figure 4:
Sequence comparison of AHR·ARNT
binding motifs with bona fide DREs found upstream of several regulated
genes. A, the indicated binding motifs are found in the
upstream regions of the following genes: sites A-F and
DRE 4, mouse cytochrome P4501A1(31, 52) ; rXRE1 and
rXRE2, rat cytochrome P4501A1 are identical to sites D and E and thus were omitted from the analysis(29) ; Ya
DRE, glutathione S-transferase Ya(53) ; QR DRE,
quinone reductase(54) ; and huXRE, human cytochrome
P4501A1(55) . B, tabulation of nucleotide frequencies
at each position (n = 10). All frequencies were
multiplied by 100. The frequencies of individual nucleotides were
analyzed by
-square at the p < 0.01 level.
Frequencies above the expected level are underlined. C, a bona fide AHR·ARNT consensus sequence derived from
statistically significant nucleotides. D, nucleotide
frequencies of the DRE obtained by the selection and amplification
analysis (in vitro, Fig. 2) were compared to those that
occur in bona fide DREs from part A. Frequencies that were
statistically significantly different are indicated by a superscript. a, C occurs more frequently in the bona
fide DREs than the in vitro analysis. b, A occurs
more frequently in the in vitro analysis than in bona fide
DREs. c, A occurs more frequently in the bona fide DREs than
in the in vitro analysis. d, C occurs more frequently
in the in vitro analysis than in bona fide DREs. e, G
occurs more frequently in the bona fide DREs than in the in vitro analysis.
Selection and Amplification of ARNT-Homodimer
Recognition Sequences
Purified ARNT obtained from
baculovirus-infected Sf9 cells (36) with the addition of
unprogrammed reticulocyte lysate was subjected to the same DNA
selection and amplification protocol described above, using
double-stranded oligonucleotides generated from OL187. After four
rounds of selection and amplification, 20 ARNT-specific sequences were
aligned and analyzed by χ-square to yield a consensus sequence,
CACGTG (Fig. 5). Unlike the oligonucleotides selected from OL187
by the AHR·ARNT complex, no bias was observed due to flanking
nucleotides, and no statistically significant specificities were
observed for nucleotides that flanked this core (Fig. 5B). Four sequences (AA17, AA18, AA19, and AA20)
that contained the AACGTG motif were also amplified. Gel shift analysis
demonstrated that these sequences were recognized by the ARNT complex
but at a lower affinity than sequences containing the CACGTG sequence
(data not shown). To confirm that the derived consensus sequence,
CACGTG, was specific for ARNT homodimers, we synthesized the
corresponding consensus oligonucleotide and demonstrated that a
specific ARNT·DNA complex was formed in gel shift analysis (Fig. 6A, lane 2). The presence of ARNT in the
complex was confirmed by supershifting the complex in the presence of
anti-ARNT immunoglobulin (Fig. 6A, lane 4) but
not by purified immunoglobulin (Fig. 6A, lane
5). In agreement with our previous results(36) , purified
bHLH-PAS proteins require heat denaturable factor(s) found in
reticulocyte lysate for function (Fig. 6A, lanes
1-3). The addition of bovine serum albumin also stabilizes
the ARNT dimer formation to a lesser degree, demonstrating that the
only bHLH-PAS protein in the complex is ARNT (Fig. 6A, lane 3). Finally, we confirmed that the ARNT·DNA binding
complex could be formed at the lower concentrations of ARNT that are
typically generated in the reticulocyte lysate expression system and
that may also be found in cells (i.e. 1 fmol/5 µl) (Fig. 6B, lane 1).
Figure 5:
Determination of ARNT homodimer DNA
recognition sites. A, the double-stranded oligonucleotide pool
containing 13 random nucleotides (OL187) was incubated with 20 fmol of
baculovirus-expressed ARNT and 10 µg of unprogrammed reticulocyte
lysate, the mixture was subjected to four rounds of DNA selection and
amplification, and the individual clones were sequenced. The most
highly conserved sequence, CACGTG, is boxed. Lower case
letters represent nucleotides that comprise the primer annealing
region of the oligonucleotide and do not represent randomly selected
nucleotides. B, tabulation of the nucleotide frequency at each
position (n = 20). All frequencies were multiplied by
100. The frequencies of individual nucleotides were analyzed by
χ-square at the p < 0.01 level. Frequencies above the
expected level are underlined. C, An ARNT consensus
sequence derived from statistically significant
nucleotides.
Figure 6:
Specificity of ARNT homodimer recognition
of the derived consensus sequence. Approximately 4 fmol of baculovirus (Bac)-expressed ARNT (A) or 1 fmol of reticulocyte
lysate-expressed ARNT (B) was subjected to gel shift analysis
using OL329/330, the derived ARNT consensus sequence (Fig. 5C) as a probe. A, lane 1,
baculovirus-expressed ARNT alone; lane 2,
baculovirus-expressed ARNT with 10 µg of unprogrammed reticulocyte
lysate; lane 3, baculovirus-expressed ARNT with 10 µg of
bovine serum albumin; lane 4, baculovirus-expressed ARNT with
10 µg of unprogrammed reticulocyte lysate and anti-ARNT
immunoglobulin; lane 5, baculovirus-expressed ARNT with 10
µg of unprogrammed reticulocyte lysate and purified IgG. B, lane 1, reticulocyte lysate-expressed ARNT; lane 2, reticulocyte lysate-expressed ARNT with anti-ARNT
immunoglobulin. Panel B was subjected to three times longer
exposure than panel A. The arrow indicates the
ARNT·ARNT DNA binding complex.
ARNT and SIM Interact Resulting in Unique DNA Binding
Specificity
The selection and amplification protocol was
performed to determine if ARNT could interact with SIM and recognize a
specific DNA sequence. Using the oligonucleotide pool derived from
OL187, we were unable to select and amplify a discrete
SIM·ARNT·DNA complex that was dependent on the presence of
both proteins. We repeated the procedure using OL224 as the
oligonucleotide source (see ``Discussion''). Following four
rounds of selection and amplification, a pool of specific SIM/ARNT
selected DNA was cloned and sequenced. Given the apparently weak
interaction of the complex and the comigration of nonspecific
protein-oligonucleotide species, 80 of the selected sequences were
radiolabeled, and each was individually reanalyzed by gel shift
analysis to confirm its interaction with both SIM and ARNT. Of the 80
amplified oligonucleotides, 19 were specific for the SIM·ARNT
complex as judged by the formation of specific gel shift bands that
were detected only in the presence of both proteins and that were
recognized by the ARNT-specific antibodies (Fig. 7A).
Nucleotides that were associated with SIM·ARNT·DNA
complex formation were identified by χ-square analysis and were
used to derive a consensus sequence, GNNNNGTGCGTGANNNTCC (Fig. 7, B and C). Gel shift analysis using an
oligonucleotide corresponding to the derived consensus sequence
(OL331/332) confirmed that it was specific for the SIM·ARNT
complex (Fig. 8A). Again, complex formation required
both proteins, since neither ARNT nor SIM could recognize the sequence
alone (Fig. 8A, lanes 1-3), and the
complex was recognized by ARNT-specific antibodies but not purified IgG (Fig. 8A, lanes 4 and 5).
Figure 7:
Determination of SIM·ARNT DNA
recognition sites. A, double-stranded OL224 containing the
fixed sequence, GCGTG, and flanked by seven random nucleotides was
incubated with 1 fmol each of reticulocyte lysate-expressed SIM and
ARNT, the mixture was subjected to four rounds of DNA selection and
amplification, and the individual clones were sequenced. The most
highly conserved sequence, GTGCGTGA, is boxed. B,
tabulation of the nucleotide frequencies at each position (n = 19). All frequencies were multiplied by 100. The
frequencies of individual nucleotides were analyzed by χ-square at
the p < 0.01 level. Frequencies above the expected level
are underlined. C, a SIM·ARNT consensus sequence
derived from statistically significant
nucleotides.
Figure 8:
Gel shift analysis of the SIM·ARNT
DNA binding complex. Approximately 0.5 fmol of reticulocyte
lysate-expressed SIM and 4 fmol of baculovirus-expressed ARNT were
subjected to gel shift analysis using OL331/332, which contained the
derived SIM·ARNT consensus sequence (Fig. 7C), as
a probe. A, specificity of SIM·ARNT heterodimer
recognition of the derived consensus sequence. The incubation mixtures
contained ARNT with 5 µl of unprogrammed reticulocyte lysate (lane 1), SIM alone (lane 2), ARNT and SIM (lane
3), ARNT and SIM with the anti-ARNT immunoglobulin (lane
4), and ARNT and SIM with purified IgG (lane 5). The arrow indicates the SIM·ARNT DNA binding complex. B, gel shift analysis of SIM and ARNT using
³²P-labeled oligonucleotides corresponding to different
flanking and core sequences. Gel shift experiments were performed with
SIM alone (lanes 2, 5, 8, and 11),
ARNT alone (lanes 3, 6, 9, and 12),
or both SIM and ARNT (lanes 1, 4, 7, and 10) using
³²P-labeled OL331/332
(GGGATGTGCGTGACATTC, lanes 1-3), OL464/465
(GGGATGTACGTGACATTC, lanes 4-6), OL501/502
(AATTTGTACGTGCCACAGA, lanes 7-9), or OL503/504
(AATTTGTGCGTGCCACAGA, lanes 10-12).
Unprogrammed reticulocyte lysate was added, if necessary, to normalize
the amount of lysate in each reaction.
While
this work was in review, a report described a consensus sequence found
upstream of SIM-regulated genes in Drosophila,
GTACGTG (41). This core sequence differed by a single
nucleotide from the sequence deduced by our in vitro approach (i.e. GTACGTG versus GTGCGTG). Since our selected
SIM·ARNT sequence was biased for a G at this position due to the
use of oligonucleotides with a fixed GCGTG core, we chose to examine
the impact of this single nucleotide difference on binding by the
SIM·ARNT complex. To control for effects of adjacent sequences, we
engineered oligonucleotides that contained these two core sequences
within flanking sequences derived from either the SIM·ARNT consensus
that was deduced in Fig. 7C (i.e. GGGATGT(A/G)CGTGACATTC; OL464/465 and OL331/332, respectively) or
the SIM-dependent enhancer found upstream of the Drosophila Tl gene (i.e. AATTTGT(A/G)CGTGCCACAGA; OL501/502 and
OL503/504, respectively). Gel shift analysis indicated that all four
sequences were bound by SIM/ARNT with a similar binding affinity.
Thus, either an A or a G is well tolerated at this position, with no
difference in binding observed when the core sequence is within the
context of the flanking sequences derived from the Tl enhancer (Fig. 8, A and B).
Half-site Recognition of ARNT, AHR, and
SIM
The experiments described above suggest that ARNT is
capable of forming a homodimer that recognizes the previously described
E-box sequence, CACGTG, forming a heterodimer with the AHR recognizing
TNGCGTG and forming a heterodimer with SIM recognizing GT(G/A)CGTG.
Since all ARNT-containing complexes bind sequences with a GTG
3`-half-site and the ARNT alone complex binds a palindrome of this site
(CACGTG), we conclude that this half-site corresponds to an ARNT
binding half-site. The observation that unique heteromeric partners
each yield different 5`-half-sites is consistent with T(C/T)GC being
the 5`-half-site of the AHR and GT(G/A)C being the 5`-half-site of SIM.
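The half-site model summarized above can be written as a small lookup that predicts which complex should recognize a given oligonucleotide. This is a sketch of the stated pairing rules only: it ignores flanking-sequence preferences and affinities, and the names and the use of regular expressions are illustrative choices, not part of the study.

```python
import re

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


ARNT_HALF_SITE = "GTG"  # 3' half-site shared by all ARNT-containing complexes

# 5' half-site contributed by the partner of ARNT in each complex.
FIVE_PRIME_HALF_SITE = {
    "ARNT/ARNT": reverse_complement(ARNT_HALF_SITE),  # CAC, giving CACGTG
    "AHR/ARNT": "T[CT]GC",                            # T(C/T)GC
    "SIM/ARNT": "GT[GA]C",                            # GT(G/A)C
}

SITE_PATTERNS = {
    name: re.compile(half + ARNT_HALF_SITE)
    for name, half in FIVE_PRIME_HALF_SITE.items()
}


def predicted_binders(oligo: str) -> set:
    """Complexes whose full site occurs on either strand of the oligonucleotide."""
    strands = (oligo, reverse_complement(oligo))
    return {
        name
        for name, pattern in SITE_PATTERNS.items()
        if any(pattern.search(strand) for strand in strands)
    }


# Consensus probes from the oligonucleotide list.
OL318 = "TCGAGCTGGGGGCATTGCGTGACATACC"
OL329 = "TCGAGCTGGGCATGTCACGTGACCGAGC"
OL331 = "TCGAGCCATGGGATGTGCGTGACATTTC"

for name, seq in (("OL318", OL318), ("OL329", OL329), ("OL331", OL331)):
    print(name, predicted_binders(seq))
```

Under these rules each consensus probe is assigned to exactly one complex, in line with the probes used for the corresponding gel shift experiments.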
Examination of Other Possible PAS-Protein DNA
Complexes
In an effort to determine if additional PAS
proteins could interact and generate DNA binding specificity, we
attempted our selection and amplification protocol with either OL187 or
OL224 and SIM, AHR, or a combination of the AHR and SIM. After several
rounds of selection, neither the AHR, SIM, nor a combination of the two
proteins formed specific DNA binding complexes. To increase the
sensitivity of these attempts, experiments were also performed using
baculovirus-expressed AHR. All combinations were repeated three times
without detection of a specific DNA binding complex. In addition, we
synthesized oligonucleotides containing a palindrome of the predicted
recognition half-sites of the AHR and SIM (core sequences of
T(C/T)GCGC(A/G)A and GTGCGCAC, respectively). Gel shift analysis of
either the AHR or SIM with these radiolabeled oligonucleotides failed
to yield specific DNA binding complex formation (data not shown).
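The palindromic test oligonucleotides described above follow from joining each predicted half-site to its reverse complement. The sketch below reproduces that construction; the helper names are hypothetical and the half-sites are written in IUPAC codes (Y = C/T, R = A/G).

```python
# Complement table, including the two ambiguity codes used here.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "R": "Y", "Y": "R"}

def reverse_complement(site):
    """Reverse complement of a site written in IUPAC codes."""
    return "".join(COMPLEMENT[base] for base in reversed(site))

def palindrome_of(five_prime_half_site):
    """Join a 5'-half-site to its reverse complement to form a palindrome."""
    return five_prime_half_site + reverse_complement(five_prime_half_site)

if __name__ == "__main__":
    print(palindrome_of("TYGC"))  # predicted AHR half-site T(C/T)GC
    print(palindrome_of("GTGC"))  # predicted SIM half-site (G variant)
    print(palindrome_of("CAC"))   # complement of the ARNT GTG half-site
```

Under this construction the AHR half-site gives T(C/T)GCGC(A/G)A, the SIM half-site gives GTGCGCAC, and the ARNT half-site gives the E-box CACGTG, matching the core sequences quoted in the text.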
DNA Binding Specificity of bHLH-PAS Dimers for Their Selected Consensus Sequences and Various E-boxes
As an additional demonstration of DNA binding specificity, we used
competitive binding analysis to compare the affinities of bHLH-PAS
dimers for oligonucleotides corresponding to their consensus DNA
sequences and a variety of E-boxes. Competitive binding analysis with
each productive bHLH-PAS pair (i.e. AHR·ARNT, ARNT·ARNT, or
SIM·ARNT) demonstrated that each DNA binding
complex had a greater affinity for its derived consensus sequence
than for any of the E-box sequences tested (see Fig. 9, A-C). The presence of the ARNT homodimer consensus sequence
(OL329/330) diminished complex formation in all reactions that
contained the ARNT protein (Fig. 9, A and C, lane 3). The ARNT homomeric species demonstrated the greatest
affinity for the E-box CACGTG (Fig. 9B, lanes 3 and 5), with much lower affinity for the TNGCGTG sequence (Fig. 9B, lane 2) or the other E-boxes, CAGCTG
or CATGTG (Fig. 9B, lanes 6 and 7).
Figure 9: Specificity of DNA recognition by AHR·ARNT, ARNT·ARNT, and
SIM·ARNT complexes. Gel shift analyses are shown of incubation mixtures
containing reticulocyte lysate-expressed ARNT and AHR (0.5 fmol of each
protein) with OL318/319 as the probe (A), baculovirus-expressed ARNT
(4 fmol) and 10 µg of unprogrammed reticulocyte lysate with OL329/330 as
the probe (B), or reticulocyte lysate-expressed SIM and ARNT (0.5 fmol
of each protein) with OL331/332 as the probe (C), each with a 100-fold
molar excess of the indicated competing oligonucleotides: lane 1, none;
lane 2, OL318/319; lane 3, OL329/330; lane 4, OL331/332; lane 5,
OL321/322; lane 6, OL323/324; and lane 7, OL316/317.
Relative DNA Binding Affinities of AHR-, ARNT-, and SIM-containing Complexes
To obtain estimates of the
relative DNA binding affinities of the full-length AHR·ARNT,
ARNT·ARNT, and SIM·ARNT complexes, we performed dissociation
rate analysis using the gel shift assay as an end point. As shown in Fig. 10, the calculated half-life values of the full-length
AHR·ARNT and ARNT·ARNT complexes are similar (3.2 versus 5.06 min),
whereas that of the SIM·ARNT complex was considerably
shorter (less than 0.2 min).
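To put these half-lives on a common scale, they can be converted to apparent dissociation rate constants. The sketch below assumes simple first-order dissociation (k_off = ln 2 / t½), which the text does not state explicitly; the function names are hypothetical, and the SIM·ARNT value is only a lower bound because its half-life is reported as less than 0.2 min.

```python
import math

def off_rate(half_life_min):
    """Apparent first-order dissociation rate constant (per minute)."""
    return math.log(2) / half_life_min

def fraction_bound(t_min, half_life_min):
    """Fraction of complex remaining after t_min, assuming first-order decay."""
    return 0.5 ** (t_min / half_life_min)

if __name__ == "__main__":
    # Half-lives (min) as reported in the text; 0.2 is an upper limit for SIM-ARNT.
    half_lives = {"AHR-ARNT": 3.2, "ARNT-ARNT": 5.06, "SIM-ARNT": 0.2}
    for name, t_half in half_lives.items():
        print(f"{name}: k_off ~ {off_rate(t_half):.3f} per min, "
              f"fraction bound at 1 min ~ {fraction_bound(1.0, t_half):.3f}")
```

With these assumptions, the AHR·ARNT and ARNT·ARNT complexes dissociate at comparable rates, while the SIM·ARNT complex dissociates at least an order of magnitude faster.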
Figure 10: Dissociation rate analysis of the full-length AHR·ARNT,
ARNT·ARNT, and SIM·ARNT DNA binding complexes. Each binding reaction
containing the indicated proteins was allowed to come to binding
equilibrium with 1 ng of the appropriate radiolabeled oligonucleotide
(i.e. the derived consensus sequence of each DNA binding complex). An
excess of unlabeled oligonucleotide was added to the mixture, and
aliquots were removed at the indicated time points. Each value
represents the average of two independent experiments ± S.E. See
"Experimental Procedures" for details.
Demonstration of PAS Protein Interactions by Coprecipitation
To further establish the interaction of
bHLH-PAS proteins, we utilized a coprecipitation assay (Fig. 11). Protein-protein interactions of ARNT-AHR,
ARNT-AHRCΔ516, and ARNT-ARNT, but not ARNT-SIM, were observed.
Specificity of the ARNT-containing interactions was demonstrated by the
lack of coprecipitation using the 35S-labeled GNΔ315 AHR
construct, in which most of the dimerization domain has been replaced by
the DNA binding and dimerization domain of Gal4 (2).
Interestingly, ARNT-ARNT interactions were observed only when the
incubations contained the CACGTG-containing oligonucleotides.
Figure 11: Coprecipitation analysis of AHR-ARNT, ARNT-ARNT, and
SIM-ARNT interactions. 35S-Labeled full-length AHR (top left),
AHRCΔ516 (top middle), AHRGNΔ315 (top right), ARNT (bottom left), and
SIM (bottom right) were coprecipitated in the presence (+) or absence
(-) of the baculovirus-expressed six histidine-tagged ARNT (ARNT-his)
using nickel-nitrilotriacetic acid resin in the presence or absence of
the following: 10 µM -naphthoflavone (top left), OL318/319 (top
middle), OL329/330, OL316/317, OL323/324 (bottom left), and OL331/332
(bottom right).
DISCUSSION
Strategy
Our hypothesis was that
bHLH-PAS proteins could form a variety of heteromeric and homomeric
combinations and that each complex would display unique oligonucleotide
binding specificities. We predicted that the analysis of these
different recognition sites would allow us to deduce the half-site
specificity of each protein. To test these ideas, we utilized a DNA
selection and amplification strategy to identify the preferred
recognition sequences of various AHR, ARNT, and SIM
combinations(27) . The oligonucleotides bound by these protein
complexes were isolated from pools of millions of independent, unbound
sequences. Once selected by the protein complex, the oligonucleotides
were isolated from nondenaturing polyacrylamide gels and amplified by
PCR. To increase the specificity of the method, the oligonucleotide
pools were typically subjected to multiple rounds of selection and
amplification prior to cloning and sequence analysis. The power of this
method arises from the fact that it is independent of any prior
knowledge or preconceptions regarding DNA binding specificity and has
the potential to yield information about protein-DNA interactions not
readily attainable by more conventional methods such as DNA
footprinting or site-directed mutagenesis of a single oligonucleotide
sequence.
Specific versus Nonspecific
Interactions
A number of approaches were used to ensure
that amplified sequences were specific for the protein complex and not
simply sequences that were nonspecifically comigrating in the gel.
First, bands of amplified oligonucleotides were analyzed (considered
specific) only if the band was dependent upon the presence of all of
the bHLH-PAS proteins used in the assay. Second, specificity was
confirmed by the capacity of ARNT- or AHR-specific antibodies to
supershift the radiolabeled complex. Third, a consensus was deduced
from each set of selected oligonucleotides, and this information was
used to design consensus oligonucleotides that were used in gel shift
assays to confirm specificity of interaction. Only in the case of
SIM·ARNT sequences was the presence of a comigrating nonspecific
oligonucleotide observed. In this case, we reanalyzed each of the 80
amplified oligonucleotides independently by gel shift analysis to
eliminate any nonspecific sequences (see above).
Validation of the DNA Selection and Amplification Strategy
To validate our strategy, we first employed this
technique using the AHR·ARNT complex, which recognizes the DRE
sequence, TNGCGTG (5, 29, 30, 31, 32).
We anticipated one of two outcomes. Either the AHR·ARNT complex
would recognize sequences containing this known core and validate our
experimental approach, or the complex would recognize a different DNA
sequence, such as the E-box motif, that is commonly recognized by most
other bHLH proteins. In our initial experiment, we performed
AHR·ARNT selection on a pool of oligonucleotides that had mixed
bases incorporated at 13 sequential positions (OL187). Chi-square
analysis of the nucleotide frequencies at various positions revealed a
consensus sequence of TNGCGTGC. This sequence was essentially identical
to the previously described DRE, TNGCGTG (5, 29, 30, 31, 32).
No sequences conforming to E-boxes were found in any of the 24 clones
that were sequenced.
The analysis presented in Fig. 1 indicated that the positioning of the TNGCGTG core
sequence within the random 13-mer was biased by the flanking sequences
required for annealing PCR primers (i.e. most core sequences
were found closer to the 3'-end of the oligonucleotide, Fig. 1A). This observation led us to examine the impact
of flanking sequences on AHR·ARNT DNA binding specificity. The
analysis using OL224 as the oligonucleotide pool revealed a consensus
binding sequence of GGGNAT(C/T)GCGTGACANNCC (Fig. 2).
Nucleotides that were present at frequencies above
expected random values were identified at 11 of the 14 flanking
positions, including those in positions -4, -3, 4, 5, and
6. These results are consistent with those obtained using substitution
mutagenesis of a DRE-containing
oligonucleotide(31, 32) . The selection of flanking
nucleotides suggests that both the AHR and ARNT (or other proteins
within this complex) are capable of DNA contacts at sites adjacent to
the commonly recognized core sequence. In addition, our results suggest
that positions not identified previously (-9, -8, -7, -5, 9, and 10)
are selected for and thus could also play a role in AHR·ARNT-DNA
recognition.
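The idea of testing each position for nucleotide frequencies above random expectation can be sketched as a chi-square goodness-of-fit test. This is a minimal illustration, not the authors' procedure: it assumes an expected frequency of 25% per base, uses the standard critical value for 3 degrees of freedom at α = 0.05 (7.815), and the counts in the usage example are invented for illustration.

```python
# Chi-square critical value for 3 degrees of freedom at alpha = 0.05.
CRITICAL_DF3_P05 = 7.815

def chi_square_uniform(counts):
    """Chi-square statistic for base counts against equal expected frequencies."""
    total = sum(counts.get(base, 0) for base in "ACGT")
    expected = total / 4.0  # assumption: each base expected 25% of the time
    return sum((counts.get(base, 0) - expected) ** 2 / expected for base in "ACGT")

def is_selected(counts):
    """True if the position departs from random expectation at alpha = 0.05."""
    return chi_square_uniform(counts) > CRITICAL_DF3_P05

if __name__ == "__main__":
    # Hypothetical counts at two positions across 24 sequenced clones.
    positions = {
        "core position": {"G": 24},
        "flanking position": {"A": 7, "C": 5, "G": 6, "T": 6},
    }
    for label, counts in positions.items():
        print(label, round(chi_square_uniform(counts), 2), is_selected(counts))
```

In practice the expected frequencies would need to reflect the actual base composition of the starting oligonucleotide pool rather than an assumed uniform distribution.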
If binding affinity is the only determinant of a functional DRE in
vivo, then our consensus sequence for the AHR·ARNT complex
should be identical to bona fide DREs. In an attempt to address this
question, we compared our selected sequences to 10 DREs known to
function upstream of TCDD-regulated genes. Since similarity is a
difficult assertion to prove statistically, we identified those
nucleotides that were statistically different. The most interesting
discrepancy between the in vitro and in vivo consensus is the preference for an A at position -5 for the in vitro derived sequence and the lack of an A at -5 in
any reported DRE. The absence of A at -5 may be an indication
that inappropriate contacts are occurring in vitro, that
additional proteins are required for in vivo interactions, or
that some attenuation of binding affinity is required for optimal
control of gene expression in vivo.
DNA Recognition by ARNT Homodimers
In an
effort to determine half-site recognition of ARNT and to determine if
ARNT could recognize a specific DNA sequence as a homodimer or as a
heterodimer with other bHLH-PAS partners, we performed a series of
selection and amplification experiments with various combinations of
the AHR, ARNT, and SIM. The observation that ARNT is not found in
association with Hsp90 (42) and is present at high
concentrations in the nuclear compartment of hepatoma cells (34) led us to first attempt to characterize oligonucleotide
sequences that were specifically bound by ARNT alone (presumably as an
ARNT homodimer). We attempted to increase the sensitivity of the
selection by using ARNT that had been purified from a baculovirus
expression system(36) . Given our previous results suggesting
that the purified AHR from this expression system required
uncharacterized protein factors for DNA binding, we routinely added 10
µg of unprogrammed reticulocyte lysate to the ARNT/oligonucleotide
incubation mixture(36) . Using these conditions, we found that
ARNT recognized the sequence CACGTG. This complex migrated to a
position similar to that of the AHR·ARNT and SIM·ARNT
heterodimers, suggesting that ARNT recognized this sequence as an
oligomer of a size similar to the other complexes, presumed to be
dimeric. Further, the ability of bovine serum albumin to stabilize the
ARNT complex (albeit to a lesser degree) indicates that the ARNT DNA
binding complex is not the result of an interaction with an unknown
protein present in the reticulocyte lysate. As shown in both Fig. 6B and Fig. 11, we could also detect this
interaction using the concentrations of ARNT generated in our
reticulocyte lysate system (
1 fmol). This indicates that the
interaction can occur at the lower ARNT concentrations that may be
found in cell nuclei(34) . While this manuscript was in review,
work by Sogawa et al.(43) also reported that ARNT
homodimers recognize the CACGTG motif and used chimeric reporter
constructs to suggest that this interaction may be capable of
up-regulating endogenous promoters downstream of the corresponding
E-box element in vivo. Our dissociation rate experiments
indicate that the relative stabilities of the AHR·ARNT and
ARNT·ARNT complexes on their respective recognition sites are
similar (Fig. 10). However, analysis of these complexes by
coprecipitation yielded lower amounts of complexed ARNT·ARNT than
of AHR·ARNT (especially in the absence of oligonucleotides).
The reason for this discrepancy is unclear, but it suggests
that ARNT·ARNT interactions are weaker than AHR·ARNT
interactions in the absence of DNA. Taken together, these studies
suggest that the ARNT·ARNT homodimer may act as an important
transcriptional regulator through its interaction with E-box elements.
DNA Recognition by SIM·ARNT Heterodimers
The observation that ARNT homodimeric
complexes could specifically interact with DNA suggested that other
bHLH-PAS combinations might recognize unique DNA sequences and shed
light on the half-site recognition and pairing rules of this family of
transcription factors. Although our initial attempts to demonstrate
SIM-ARNT-DNA interactions using OL187 were unsuccessful, we also
initiated the selection analysis with OL224. This strategy was
initiated for two reasons. First, the AHR and SIM share the highest
degree of sequence similarity in their bHLH, PAS, and C-terminal
domains (Fig. 12) (6); thus, we predicted that these would
have the most similar DNA recognition sequences. Second, our
preliminary experiments led us to suspect that amplification of ARNT
homodimer-specific sequences (CACGTG) was preferentially occurring in
our attempts to select and amplify sequences specific for the
SIM·ARNT complex. Our results presented in Fig. 5 indicated
that use of OL224 would minimize DNA interactions resulting from ARNT
homodimers, thus minimizing contamination by ARNT-specific sequences (i.e. CACGTG). Using this strategy, we were able to amplify
oligonucleotides that bound SIM·ARNT complexes specifically, with
the consensus sequence GNNNNGTGCGTGANNNTCC. Our failure to detect
SIM-ARNT interactions using the coprecipitation assay (Fig. 11),
combined with the rapid dissociation rate of the SIM·ARNT·DNA
complex (Fig. 10), indicates that the SIM-ARNT interaction is
relatively weak. The weak interaction of the SIM-ARNT complex found in
this study is in contrast to that reported by Sogawa et
al. (43).
Figure 12: Comparison of basic regions and recognition half-sites of
bHLH-PAS proteins to other bHLH proteins. The conserved glutamic acid
and arginine residues of class A and class B are indicated with a solid
line. The conserved arginine that distinguishes class B is indicated
with a dotted line. The letter B denotes a basic amino acid, and dashes
represent highly degenerate positions.
Recently, a number of SIM-responsive elements
have been cloned from Drosophila using an enhancer trapping
technique(41) . Sequence alignment of these regulatory elements
revealed a consensus motif, (G/A)(T/A)ACGTG. This sequence differs by a
single nucleotide when compared to the SIM·ARNT consensus core
sequence we describe in Fig. 7, GTGCGTG. The difference exists
at the -2 position (underlined) within the putative SIM binding
5'-half-site (A versus G). To examine the importance of this
nucleotide position, we performed a series of gel shift experiments to
determine the impact that this nucleotide had on SIM·ARNT
recognition. We found that both A and G at the -2 position are
specifically bound by the SIM·ARNT complex (Fig. 8B). Our
inability to predict an A nucleotide at this position arose from our
use of OL224, which has a fixed GCGTG core (see above). Thus, we conclude
that the in vitro SIM·ARNT consensus core sequence is
more appropriately GT(A/G)CGTG, with GTACGTG possibly having greater
relevance to SIM-responsive gene regulation in vivo.
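The relationship between the two consensus motifs can be checked by enumerating the sequences each one allows and taking the overlap. The sketch below is illustrative only; the variable and function names are hypothetical, and the motifs are written in IUPAC codes (R = A/G, W = A/T).

```python
from itertools import product

# IUPAC codes needed for the two motifs quoted in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "W": "AT"}

def expand(consensus):
    """Enumerate every concrete sequence allowed by an IUPAC consensus."""
    return {"".join(bases) for bases in product(*(IUPAC[code] for code in consensus))}

# In vitro SIM-ARNT consensus core, GT(A/G)CGTG.
in_vitro = expand("GTRCGTG")
# Consensus of the Drosophila SIM-responsive elements, (G/A)(T/A)ACGTG.
enhancer_trap = expand("RWACGTG")

shared = in_vitro & enhancer_trap

if __name__ == "__main__":
    print(sorted(in_vitro))
    print(sorted(enhancer_trap))
    print(sorted(shared))
```

The only sequence permitted by both motifs is GTACGTG, which is consistent with the suggestion in the text that this variant may be the more relevant one in vivo.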
Half-site Recognition of ARNT, AHR, and SIM
The identification of half-site recognition of ARNT,
AHR, and SIM in combination with analysis of the amino acid sequences
of their basic regions should provide insights into the relationships
between the bHLH-PAS proteins and members of other bHLH families.
Interestingly, the ARNT-specific sequence half-site is also recognized
by other bHLH proteins such as Max (44), Myc (45), and
USF (22, 46, 47). The bHLH proteins that bind
the 3'-half-site GTG sequence (binding CACGTG as homodimers) have been
denoted as class B proteins and are distinguished by the presence of an
arginine (R) residue in their basic region immediately following the
sequence ERRR (i.e. ERRRR) (48) (Fig. 12). The
bHLH proteins that lack this C-terminal Arg residue commonly recognize
the 3'-half-site CTG sequence (binding CAGCTG) and are denoted class A.
Our results suggest that ARNT is a class B protein since its homomeric
form recognizes the palindromic CACGTG sequence with greatest affinity,
and its basic region has an Arg residue at the characteristic position.
In addition, many bHLH proteins possess a critical glutamic acid
residue (ERRR), which has been shown to contact the CA of the E-box
sequence CANNTG(49) . Although this residue is present in the
basic region of ARNT, it does not occur at corresponding positions in
either the AHR or SIM proteins. Thus, by predictions derived from these
rules and from their primary amino acid sequences, neither the AHR nor
SIM proteins would be expected to bind any known E-box half-sites. Our
results support this prediction and suggest that when complexed with
ARNT, the AHR has the greatest affinity for the 5'-half-site T(C/T)GC,
and SIM has the greatest affinity for the half-site GT(A/G)C. We
suggest that these proteins represent a unique class of bHLH proteins
and designate this group as class C. While this paper was in review,
another group reported that ARNT occupies the 3'-GTG half-site
of the DRE (50).
Pairing Rules of bHLH-PAS DNA Binding Complexes
Our results indicate that certain rules dictate
pairing and subsequent DNA binding of bHLH-PAS proteins. In contrast to
the identification of DNA binding complexes formed with ARNT alone, AHR
and ARNT, or SIM and ARNT, no oligonucleotide sequences could be
selectively amplified when the AHR and SIM (each alone or mixed) were
used as the binding species. These experiments were repeated multiple
times, using either OL187 or OL224 and the higher concentrations of
protein that were attainable with baculovirus-expressed AHR. The fact
that heterodimeric binding of the bHLH-PAS proteins was detected only
with ARNT suggests that ARNT may be a general dimerization partner for
PAS proteins that respond to cellular signals. In addition, the
multiplicity of productive bHLH-PAS protein combinations may have a
significant impact on the spectrum of DNA binding sites, enhancer
elements, and responsive genes affected by these proteins in the
presence and absence of compounds such as TCDD. A second explanation
for the limited number of bHLH-PAS protein pairs that were detected by
this method cannot be ruled out. Our inability to detect AHR or SIM
homodimeric or AHR-SIM heterodimeric interactions with DNA may be due
to a failure of the method to detect weaker protein-protein or
protein-DNA interactions in vitro.
Summary
These data support several
important conclusions. First, ARNT is capable of forming distinct DNA
binding complexes with another molecule of ARNT, the AHR, or SIM. This
suggests that bHLH-PAS proteins may be involved in a combinatorial
mechanism of gene regulation that involves the formation of multiple
homo- or heterodimeric pairs, each with a role in controlling
expression of distinct batteries of genes(51) . For example,
the observation that ARNT may interact with E-box elements suggests
that in the absence of AHR agonists, ARNT homodimers play a role in the
regulation of a second battery of genes, possibly through interactions
at E-boxes that may be down-regulated in the presence of TCDD. Second,
since ARNT is capable of recognizing DNA as a component of several
distinct complexes, we were able to elucidate the DNA recognition
half-sites of these PAS proteins. As predicted by amino acid sequence
homology to other class B bHLH proteins, ARNT recognizes the
3'-half-site GTG. In contrast, the basic region amino acid sequences of
both the AHR and SIM are unique and specify distinct 5'-half-sites,
T(C/T)GC and GT(A/G)C, respectively. Finally, the AHR·ARNT complex
displays a preference for nucleotides that flank the core T(C/T)GCGTG
motif, suggesting that the protein-DNA interactions of this complex
extend beyond the core motif. Other PAS protein complexes (i.e.
ARNT·ARNT or SIM·ARNT) display fewer preferences for
flanking nucleotides, suggesting that the sequence specificity of
various PAS protein complexes may differ substantially or may be less
restricted than that of the AHR·ARNT complex.