(Received for publication, September 15, 1995)
Core promoters are defined by the presence of a TATA box
at approximately 30 base pairs upstream of the transcriptional start
site (+1) and/or an initiator element centered around the +1
site. The prevalence, function, and significance of the various
combinations of core promoter elements are as yet unclear. We describe
here the identification and characterization of an initiator element in
the TATA-containing human β-globin promoter. Mutagenesis of the
β-globin initiator element at positions +2/+3 and
+4/+5 abrogates transcription in a heterologous construct.
Interestingly, we have found a β-globin initiator binding activity
in nuclear extracts whose presence or absence correlates with function
of the β-globin initiator. Accordingly, this binding activity may
be part of the machinery required for β-globin initiator-dependent
transcription. Our analysis further describes a previously
uncharacterized β-thalassemia mutation at the +1 site as a
mutation that decreases β-globin initiator activity. Finally,
consistent with other initiator elements, the β-globin initiator
requires a TFIID-containing fraction for in vitro activity.
Thus, the human β-globin promoter contains an initiator element
whose function, as revealed by a β-thalassemia mutation, is of
physiological relevance.
Study of RNA polymerase II transcription necessitates characterization of both the cis and trans elements involved in promoter function. Viral and minimal promoters, as well as those promoters directing highly regulated tissue-specific expression, contain a variety of upstream elements and exhibit heterogeneity in their core promoter elements. In this context, it is important to examine the relative roles of both the upstream elements and the core promoter, and their mutual interactions, in order to understand the regulation of complex tissue-specific promoters.
Accurate
transcriptional initiation has been classically thought to require a
TATA box. However, the finding of numerous promoters that do not
contain a TATA box and yet accurately initiate transcription led to the
discovery of elements centered around the start site as components of
the core promoter (1, 2). These initiator (Inr) elements direct accurate transcription from artificial
constructs containing only upstream Sp1 sites (1, 2).
Mutation of Inr elements in several promoters decreased or abolished
transcription (see (3) for review), and in heterologous
constructs Inr elements stimulated transcription in the presence of a
TATA box (4, 5, 6). Experiments performed
with promoters containing both a TATA box and Inr suggest that the TATA
box is the predominant selector of the site of initiation and that the
Inr contributes to the magnitude of the initiation (2).
Several models have been put forward to describe how specific proteins initiate transcription through Inr elements. One suggests that factors binding to the Inr, and necessary for its function, are present in the TFIID complex (4, 5, 7, 8, 9, 10, 11). A second model suggests that Inr-dependent transcription is mediated by initiator-binding proteins such as YY1 and TFII-I (6, 12, 13), as both can substitute for members of the basal machinery in reconstituted systems in vitro (TBP and TFIIA, respectively) (14, 15). In a third model, recognition of the Inr by RNA polymerase II serves as the nucleation event, analogous to the role of TBP in TATA-containing promoters (3, 16). Finally, an alternate model suggests that TBP provides a nucleation function through its ability to recognize the -30 regions of TATA-less promoters (17).
These additional complexities have prompted us to reevaluate the
role of a core promoter in the expression of the human β-globin
gene, a paradigm for developmentally regulated genes. Early studies
defined several sequences contributing to the activity of the
β-globin promoter. Internal deletion/substitution and point
mutation analysis assigned the TATA box, a CCAAT box at approximately
-75, and a CACC box at approximately -90 as the major
determinants of transcriptional regulation (18, 19).
Mutations around the +1 site decreased transcription by
approximately 50% (18, 19). In a more recent study a C → T
mutation at -1 was shown to reduce β-globin promoter
expression to about 80% of wild-type activity in MEL
cells (20).
Human β-thalassemia is a disorder
characterized by reduced or absent β-globin expression. The
resulting globin chain imbalance due to unimpeded α-globin
expression leads to precipitation of globin polypeptide chains in
developing erythroid cells, and the ensuing anemia. Study of these
naturally occurring β-thalassemia mutations has proven useful in
revealing the in vivo relevance of specific cis elements in the
β-globin promoter (21). Wong et
al. (22) reported a patient with mild asymptomatic
β-thalassemia whose DNA was homozygous for an A → C
transversion at +1 of the β-globin promoter. In this report we
describe the characterization of the +1 region of the human
β-globin promoter as a functional initiator element and demonstrate
that the +1 β-thalassemia mutation is a mutation in the
β-globin Inr element (βInr). Furthermore, we show that in
vitro transcription from the βInr is dependent on partially
purified TFIID and that a βInr DNA binding activity exists whose
binding correlates with βInr functional activity.
Figure 1:
The human β-globin
promoter contains a functional initiator element. A, in
vitro transcription with templates containing Sp1 sites either by
themselves (Sp1; lane 1), or upstream of the TdT Inr (Sp1/TdT; lane 2), the
β-globin +1 region
(from -8 to +13) (Sp1/β+1; lanes 3 and 5), or the
β-globin +1 region in the reverse
orientation (Sp1/β+1R; lane 4). Lane 6 is a transcription reaction treated with 2 µg/ml
α-amanitin (+α-aman). Heat-inactivated nuclear
extract (HINE, 47 °C for 15 min) was used in the
transcription reaction in lane 7 (26). Lanes
8-10 are identical to lanes 5-7 except that
the adenovirus MLP was used as template. Arrows indicate the
primer extension product representing the correctly initiated
transcript. B, the primer extension product resulting from an in vitro transcription reaction using the Sp1/β+1
template was electrophoresed next to the sequence of the
Sp1/β+1 template. Both primer extension and sequencing
reactions used the same primer.
Figure 2:
Mutational analysis of the β-globin
initiator element. In vitro transcriptions of the wild-type
+1 region (lane 1) and double point mutations in the
+1 region (lanes 2-6). The numbers above the lanes indicate the position of the mutations relative to
the +1 transcriptional start site. The mutations in the Inr
element are listed below the lanes. All constructs contained Sp1 sites
upstream (1).
Figure 3:
Nuclear extract contains a βInr
binding activity whose presence correlates with βInr functional
activity. 32P-labeled double-stranded wild-type (lane
6) and mutant (lanes 1-5) oligomers were used in
binding assays and run on a 4% polyacrylamide gel. The arrow indicates an activity whose presence correlates with the
functional analysis of the βInr mutations (Fig. 2). The
mutant designations are identical to those in Fig. 2.
Figure 4:
A naturally occurring β-thalassemia
mutation is a mutation in the β-globin Inr element. A,
representative in vitro transcriptions of templates containing
the β-globin promoter from -815 to +18 upstream of a
growth hormone (GH) reporter. Template designations are
indicated above each lane and depicted below. Mutations in each
template are underlined. B, graphical representation
of the data in A. Experiments for each template were performed
at least three times. Error bars indicate standard deviations from the
mean transcription level relative to βGH (all βGH templates) or
βMLPGH (all templates with the MLP designation). Note that the
error bars all show 10-15% deviation, which represents the
intrinsic error in the assay.
Representative in vitro transcriptions from these
constructs are shown in Fig. 4A. Fig. 4B provides quantitation of the results of three to four experiments.
The incorporation of the A → C transversion at the +1 site
into the βGH (wild-type) template resulted in transcriptional
activity ~75% of wild-type (Fig. 4A, lane
2). We also observed a slight shift in the pattern of initiation
from three predominant sites to two major and two minor sites (compare lanes 1 and 2 in Fig. 4). Introduction of two
known β-thalassemia mutations into the β-globin TATA box
(-30βGH: CATA to CACA; -31βGH: CATA to CGTA) (28, 29) resulted in
transcription levels ~40% of
wild-type (Fig. 4, A and B), consistent with
transient expression data of TATA box β-thalassemia mutations and
mutagenesis studies (19, 20, 30, 31). The TATA
box mutations, however, did not alter the pattern of initiation (Fig. 4A, lanes 2 and 3). The double
mutant (-30βTHALGH), which contained both the -30 TATA
box T → C transition and the +1 A → C transversion,
further reduced transcription to ~20% of wild-type activity.
Replacement of the β-globin TATA box sequence (CATA) with the
adenovirus MLP TATA box (TATA) provided a second promoter background
into which we incorporated the +1 A → C transversion and the
2,3 mutant. In the context of a stronger TATA box (compare lanes 1 and 8 in Fig. 4, A and B) (32) the A → C
+1 mutation reduced expression to ~40% of the parent
βMLPGH template, compared to ~75% in the
wild-type background (compare βMLPGH and βMLPTHALGH to βGH
and βTHALGH in Fig. 4, A and B). Here
again the initiation pattern was altered (compare lanes 10, 11, and 12). Interestingly, the 2,3 mutant, which
abolished transcription in the Sp1 assay (Fig. 2, lane
3), reduced transcription in the βMLP2,3GH template to
approximately 20% of βMLPGH levels (Fig. 4, panel
A, lane 12 and panel B). This similar reduction
in initiation by the 2,3 and the +1 β-thalassemia mutations in
the βMLPGH construct suggests that both mutations affect
Inr-dependent transcription and that results using the Sp1-based
templates can be reproduced in the context of a natural promoter.
Figure 5:
A TFIID-containing phosphocellulose
fraction rescues Inr-dependent transcription in heat-inactivated
nuclear extracts. In vitro transcriptions employed the
adenovirus MLP (lanes 1-3) or the Sp1/β+1
template (lanes 4-6). Lanes 1 and 4 are transcriptions using a MEL nuclear extract. Lanes 2 and 5 are transcriptions using heat-inactivated nuclear
extracts (47 °C, 15 min). Lanes 3 and 6 are in vitro transcriptions using heat-inactivated nuclear extract
and 2 µl of a 0.85 M P-11 fraction.
In this report we describe the identification and
characterization of an initiator element in the TATA-containing human
β-globin promoter. In so doing, we provide evidence that a protein
fraction containing TFIID is required for Inr-dependent transcription
and detect a DNA binding activity whose binding to the βInr
correlates with its functional activity. Finally, we demonstrate that a
base substitution at the β-globin +1 site, found in
association with a human β-thalassemia, impairs the activity of the
initiator element, thereby implicating the β-globin Inr as a
functional element in vivo.
Consistent with studies of
other initiator elements (1, 2, 6), the
β-globin Inr functions in a heterologous context and in an
orientation-dependent manner (Fig. 1). Comparison of
transcription from the Sp1/TdT Inr and Sp1/βInr templates indicates
that within these contexts the βInr is weaker than the TdT Inr, a
finding consistent with observations suggesting that deviations from
the loose Inr consensus sequence element (YYA(+1)NT/AYY)
decrease Inr activity (7). Accordingly, mutation of the
β-globin Inr with double point mutations replacing nucleotides
-2 through +8 with purines reveals that positions
-1,-2 (YY), 2,3 (NT/A), and 4,5 are necessary for βInr
activity (Fig. 2).
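
The comparison against the loose Inr consensus can be illustrated with a short script. This is a minimal sketch, not part of the analysis reported here: it assumes the consensus is written in IUPAC codes as YYANWYY (Y = C or T, N = any base, W = A or T, with the A at +1), and the function names and the 7-nucleotide example sequences are hypothetical illustrations rather than the β-globin or TdT sequences.

```python
# Loose Inr consensus in IUPAC codes: Y Y A(+1) N W Y Y (positions -2 to +5).
# The mapping below covers only the codes that appear in this consensus.
IUPAC = {"Y": "CT", "A": "A", "N": "ACGT", "W": "AT"}
CONSENSUS = "YYANWYY"
START_OFFSET = 2  # index of the +1 A within the consensus


def mismatches(window: str) -> int:
    """Count positions where a 7-nt window deviates from the consensus."""
    if len(window) != len(CONSENSUS):
        raise ValueError("window must be 7 nucleotides long")
    return sum(
        base not in IUPAC[code]
        for base, code in zip(window.upper(), CONSENSUS)
    )


def scan(seq: str, max_mismatches: int = 0):
    """Yield (index of the +1 position, window, mismatch count) per hit."""
    seq = seq.upper()
    for i in range(len(seq) - len(CONSENSUS) + 1):
        window = seq[i:i + len(CONSENSUS)]
        m = mismatches(window)
        if m <= max_mismatches:
            yield i + START_OFFSET, window, m


if __name__ == "__main__":
    # Hypothetical sequences, chosen only to exercise the functions.
    print(mismatches("CCATTCT"))      # perfect match to the consensus
    print(mismatches("CCCTTCT"))      # A -> C at +1
    print(list(scan("GGCCATTCTGG")))  # locate the candidate +1 position
```

A mismatch count of this kind only measures deviation from the consensus. It does not predict transcriptional strength, which in the experiments described here was determined by in vitro transcription.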
Gel shift analysis of MEL cell nuclear
extracts with the Inr sequence reveals a protein binding activity (Fig. 3) that strictly correlates with transcriptional activity in vitro (Fig. 2). Previous reports proposed TFII-I and
YY1 transcription factors as candidates for mediating initiator
activity (6, 12, 14, 15). However,
others have reported that the functional activities of YY1 mutant
binding sites do not correlate precisely with YY1 binding activities
over the same mutant sites (7). This discrepancy is complicated
by the assays used to define the activities of YY1. The mutational
analysis performed by Javahery et al. (7) employed Sp1
templates containing a YY1 site and used crude nuclear extracts for in vitro transcriptions, whereas reconstituted in vitro systems were used to define YY1 as a functional initiator
protein (15). It is formally possible that these two functional
assays do not measure the same activities. Further experiments are
required to ascertain whether results obtained with systems using
reconstituted factors are in accord with those using crude nuclear
extracts. A second caveat is the possibility that the context of the
Inr may influence the functional assays (7). Nonetheless, our
analysis of a panel of βInr mutants provides an example of a
correlation between Inr functional activity and Inr DNA binding
activity.
A report by Wong et al. (22) described an
Asian-Indian patient with a mild, asymptomatic β-thalassemia. Their
analysis of the patient's β-globin promoter indicated that he
was homozygous for an A → C transversion at +1. Our analysis (Fig. 4) indicated that this transversion is a mutation in the
initiator element, as transcription levels were reduced to 75% of wild-type levels (Fig. 4, panel A, lane 2 and panel
B). Previously described TATA box β-thalassemia mutants
express at approximately 25% of the wild-type levels in transient
assays in HeLa cells (31). Consistent with these results,
templates containing a -30 β-thalassemia mutation (T → C) (-30βGH)
and a -31 β-thalassemia mutation (A → G) (-31βGH)
were expressed in vitro at 30% of
wild-type levels (Fig. 4, A and B). These
results indicate that the in vitro system data can accurately
reflect the in vivo environment. Incorporation of the +1
A → C β-thalassemia mutation into the -30βGH template
resulted in a further reduction in transcription, again indicating that
the +1 mutation affects the function of the initiator element.
Finally, the conversion of the β-globin TATA box (CATA) to the
adenovirus major late promoter TATA box (TATA) supplied a second
template (βMLPGH) with which to study the effects of the +1
β-thalassemia mutation. Curiously, the incorporation of the +1
mutation into the βMLPGH background resulted in transcription
levels that were 35% of wild-type (Fig. 4B, compare
βMLPGH and βMLPTHALGH). This effect is 2-fold greater than that
seen with the wild-type β-globin promoter background (Fig. 4B, compare
βGH with βTHALGH). This
difference may be due simply to the higher expression from the
βMLPGH template or may indicate a cooperativity between the TATA
box and Inr element (34). We observe that, despite the lack of
transcription from the Sp1/2,3 double mutant (Fig. 2), the same
mutation in the βMLP2,3GH template results in transcription at 20% of
wild-type (Fig. 4). From these data we conclude that other
elements are able to compensate for the mutation within the
βTHALGH, βMLPTHALGH, and βMLP2,3GH templates. Although
other mechanisms are possible, these data are compatible with the
requirement for TFIID, whose footprint on the adenovirus major late
promoter extends from approximately -45 to +35 (26).
If so, the mutations in the βInr and TATA box may result in
destabilization, or partial disruption, of the interaction of TFIID
with the core promoter, and thereby lead to a decrease in
transcription. Mutation of the Inr in the Sp1-based template results in a complete loss of
transcription (Fig. 2), as that template contains only one site
for TFIID, whereas the partial loss of transcription in the
β-globin templates through either TATA box or Inr mutations (Fig. 4) may be due to the destabilization or weakening of TFIID
binding in the core promoter. Our results, therefore, are not mutually
exclusive but may reflect the ability of TFIID to bind to both the TATA
box and the Inr.
Consistent with this interpretation, our data
reveal that a phosphocellulose fraction containing TFIID is required
for Inr activity (Fig. 5). These observations are
consistent with other data and models proposing that the machinery for
Inr-dependent transcription is contained in the TFIID
complex (5, 8, 9, 10, 35).
Validation of this hypothesis will require further experiments to
demonstrate a correlation between TFIID binding and the functional
activity of the βInr mutants (8). If this is the case, the
βInr binding activity we detect may reside within the TFIID complex.
Alternatively, the existence of an independent DNA binding activity
might suggest that Inr-dependent transcription requires distinct
mechanisms which differ from those used by other promoters (see
Introduction). Finally, the rescue of Inr-dependent transcription from
heat-inactivated nuclear extracts by the addition of partially purified
TFIID is incomplete, despite the complete rescue of adenovirus major
late promoter transcription (Fig. 5). These data suggest the
existence of an additional heat-sensitive factor that is required for
optimal βInr activity. These findings resemble those described for
the TdT Inr (4, 10, 33).
Several groups have produced data that suggest there are distinct mechanisms of initiation mediated by Inr elements. For example, Zenzie-Gregory et al. (36) have reported that Inr-dependent transcription in vitro does not show the lag period found using TATA-dependent templates. Moreover, increasing amounts of nuclear extract reduced activity of a TATA-dependent template, but not that of an Inr-dependent template. In addition, transient assays suggest that overexpression of TBP inhibits expression from TATA-containing, but not TATA-less, promoters (37). With an upstream activator site, the combination of a TATA box and an Inr significantly increased promoter activity, suggesting that the Inr element may cooperate with a TATA box and also significantly enhance a promoter's response to activators (34). Additional evidence of a lack of a TBP rate-limiting step in Inr-containing promoters is provided by Ham et al. (38), who showed that overexpression of TBP can relieve a block on expression of minimal promoters containing a papillomavirus E2 activator binding site and a TATA box. However, the same promoter containing a TATA box and Inr is activated by E2 without the requirement for TBP overexpression. Consistent with this model are data suggesting that the interaction of c-Fos and TBP is required for TATA box-mediated, but not for Inr-dependent, transcription (39). Finally, Lescure et al. (40) provide data that part of the N terminus of TBP is required for the assembly of preinitiation complexes in TATA-containing, but not TATA-less, promoters. Collectively, these data argue for distinct mechanisms of transcriptional initiation mediated by Inr elements and suggest that Inrs contribute to the response of promoters to upstream activators. The study of Inr-dependent promoters in TATA-containing and TATA-less contexts will thus further our understanding of the mechanisms of transcriptional initiation.