(Received for publication, August 18, 1995; and in revised form, October 20, 1995)
Neural-specific expression of the mouse regulatory type-Iβ (RIβ)
subunit gene of cAMP-dependent protein kinase is controlled
by a fragment of genomic DNA comprised of a TATA-less promoter flanked
by 1.5 kilobases of 5'-upstream sequence and a 1.8-kilobase intron.
This DNA contains a complex arrangement of transcription factor binding
motifs, and previous experiments have shown that many of these are
recognized by proteins found in brain nuclear extract. To identify
sequences critical for RIβ expression in functional neurons, we
performed a deletion analysis in transgenic mice. Evidence is presented
that the GC-rich proximal promoter is responsible for cell
type-specific expression in vivo because RIβ DNA
containing as little as 17 base pairs (bp) of 5'-upstream sequence was
functional in mouse brain. One likely regulatory element coincides with
the start of transcription and includes an EGR-1 motif and three
consecutive SP1 sites within a 21-bp interval. Maximal RIβ promoter
activity required the adjacent 663 bp of 5'-upstream DNA where most,
but not all, of the regulatory activity was localized between position
-663 and -333. A 37-bp direct repeat lies within this
region that contains two basic helix-loop-helix binding sites, each of
which is overlapped by two steroid hormone receptor half-sites, and a
shared AP1 consensus sequence. Intron I sequences were also tested, and
deletion of a 388-bp region containing numerous Sp1-like sequences
lowered transgene activity significantly. These results have identified
specific regions of the RIβ promoter that are required for the
expression of this signal transduction protein in mouse neurons.
The variety of neuronal cell types that comprise the mammalian
nervous system is determined by highly refined spatial and temporal
patterns of gene expression(1, 2) . This idea is
supported by the restricted expression patterns of numerous
transcription factors within the developing nervous system. Members of
the homeobox (3, 4), POU
domain (5, 6), and bHLH (7, 8) families of DNA binding proteins
help initiate the cascades of gene expression that establish
region-specific classes of neurons. The diversity of neuronal
phenotypes is further enhanced during the life of the organism by
stimulus-dependent modifications in differentiated neurons. This
adaptive plasticity, controlled by various neurotransmitters and
cytokines, elicits new patterns of gene expression and long-term
changes in cell behavior(9, 10) .
A better understanding of how neuronal diversity and function are
achieved will
be facilitated by the identification of the mechanisms that regulate
neural-specific gene expression. A mouse gene that we have focused on
encodes the regulatory type Iβ (RIβ) subunit of cAMP-dependent
protein kinase (PKA) (11). Protein phosphorylation by PKA
serves a pivotal role in neuronal function (12, 13, 14). The R subunits of the
holoenzyme, of which four genes have been identified, prevent catalysis
in the absence of cAMP (15). Each type of regulatory subunit
contributes distinct qualities that broaden enzyme function and help
regulate the myriad cellular responses controlled by this second
messenger pathway. The RIβ subunit gene is expressed primarily in
neurons, and within the central nervous system RIβ mRNA appears
throughout the brain and within the spinal cord (16, 17). Relative to the other R subunits, RIβ
makes the holoenzyme more sensitive to cAMP (18), and recent
gene disruption experiments indicate that mice lacking RIβ show
deficits in long-term depression and depotentiation in hippocampal
neurons (19).
The neural-specific component of RIβ gene
expression was localized to 3.5 kb of genomic DNA that includes 1.5 kb
upstream of a GC-rich (TATA-less) promoter, exon I, and a 1.8-kb
intron. A fusion gene containing this promoter fragment and the
bacterial lac z coding region mimicked the expression pattern
of the endogenous RIβ gene in the central nervous system of
transgenic mice (20). Analysis of sequence and protein binding
activity of this DNA (21) has identified a complex array of
recognition sequences for general transcriptional activators like
Sp1 (22), developmental regulators like bHLH and POU-domain
proteins (6, 23), and physiological regulators such as
immediate-early gene products (9) and steroid hormone receptors
(SHR) (24, 25). To address the functional role of
these various binding sites in vivo, we performed a deletion
analysis in transgenic mice. The results indicate that the sequences
required for neuronal expression lie within the GC-rich proximal
promoter and that flanking regions upstream of transcription start and
within intron I are required for full transgene promoter activity.
Figure 1:
DNA sequence of the RIβ promoter
and flanking regions. Exons I and II are boxed, and the
location of intron I is indicated. Transcription initiation sites are
identified by the ˆ, and the 5'-start site is designated +1.
The translation initiation codon at +1993 is labeled MET. The
coding region of the lac z gene was fused in frame at this
ATG. Consensus binding motifs for known transcription factors are shown
in bold and underlined. Closed circles indicate SHR superfamily half-sites. Cleavage sites for the
restriction enzymes used in the deletion analysis are indicated by the black arrowheads.
Previous results demonstrated that 3.5 kb of DNA encompassing
the RIβ promoter was sufficient to direct lac z gene
expression in the central nervous system of transgenic
mice (20). Fig. 1 presents the sequence of this DNA and
highlights the complex array of transcription factor binding sites used
by various tissue-specific regulators, immediate-early genes, and
mediators of hormone action. There are, for instance, 12 SP1 binding
sites clustered within a 350-bp region surrounding the GC-rich proximal
promoter. An additional six SP1 motifs are located within a 200-bp
region in intron I. Most of these consensus sequences (14/18) are the
"GT" box version, which has been reported to be the
preferred binding sequence of a brain-specific member of the SP1
family (30). The proximal promoter region also contains two
EGR-1 (31, 32) and two AP2 motifs (33). Twelve
of the 15 E-boxes are located upstream of the transcription initiation
sites. This sequence, CANNTG, is recognized by the bHLH family of
proteins(34) , which includes regulators of nerve cell
differentiation(8, 23) . Another family of proteins
implicated in neural development recognizes the POU consensus
sequence(35) ; five out of six POU motifs are clustered in a
250-bp region of intron I. The most prevalent consensus sequence is
recognized by the superfamily of SHRs. At least 40 half-sites comprised
mostly of GGTCA, GGTGA, and AGGACA were identified. Employing gel
mobility shift assays, we established that many of these consensus
sequences bind proteins found in brain nuclear extracts as well as
purified SP1, AP2, and MyoD/E47 dimers(21) .
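The motif counts given here rest on matching short consensus strings against the sequence in Fig. 1. As an illustration only, a scan of this kind might be sketched in Python as follows. Only the E-box consensus CANNTG and the half-site sequences GGTCA, GGTGA, and AGGACA are taken from the text; the function names, the test sequence, and the decision to search both strands and report each site once are assumptions made for this sketch and do not describe the procedure used in the original analysis.

```python
import re

# Consensus patterns quoted in the text; N in CANNTG is any base.
MOTIFS = {
    "E-box": "CA[ACGT][ACGT]TG",
    "SHR half-site": "GGTCA|GGTGA|AGGACA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motifs(seq, motifs=MOTIFS):
    """List (start, end, name, strand) for every motif hit on either strand.

    Coordinates are 0-based, end-exclusive, and always refer to the forward
    strand. A zero-width lookahead is used so overlapping sites are reported.
    A palindromic site (such as CAGCTG) is reported once, on the + strand.
    """
    seq = seq.upper()
    n = len(seq)
    found = {}
    for name, pattern in motifs.items():
        scanner = re.compile("(?=(" + pattern + "))")
        for m in scanner.finditer(seq):
            start = m.start()
            found.setdefault((start, start + len(m.group(1)), name), "+")
        for m in scanner.finditer(reverse_complement(seq)):
            end = n - m.start()  # map back to forward-strand coordinates
            start = end - len(m.group(1))
            found.setdefault((start, end, name), "-")
    return [(s, e, name, strand) for (s, e, name), strand in sorted(found.items())]


# Hypothetical 25-bp test sequence (not from the RI-beta promoter).
hits = find_motifs("AACAGCTGTTGGTCAAACTGACCTT")
for hit in hits:
    print(hit)
```

Reporting sites in forward-strand coordinates makes it straightforward to ask which motifs fall inside a given deletion interval, which is the question the transgenic constructs address.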
Efforts to
identify the functional domains of the RIβ promoter were initiated
using standard transient transfection protocols and cell lines that
expressed RIβ protein. It was determined that the RIβ promoter
retained tissue-specific regulation in vitro since the
RIβlac reporter plasmid was active in NB2a and HT-22 neuroblastoma
cell lines but inactive in cells of non-neuronal origin such as Chinese
hamster ovary and JEG cells (data not shown). To localize the
regulatory sequences important for this expression pattern, we
introduced a series of deletion constructs into NB2a cells. It was
discovered, however, that plasmid sequences in the expression vector
influenced promoter activity. As indicated in Fig. 2A,
cells transfected with 1.5 kb of 5'-upstream DNA (pBSRIβlac-1515)
produced less β-gal activity than cells given a plasmid containing
just 17 bp of 5'-upstream sequence (pBSRIβlac-17). This result
might suggest that sequences between -1515 and -17 had an
inhibitory effect on promoter function. Alternatively, this
"induction" might have been caused by the juxtaposition of plasmid
sequences closer to the RIβ proximal promoter. To distinguish
between these possibilities, we repeated this transfection using
plasmid-free RIβlac DNA and discovered that the relative strengths
of these two constructs were reversed (Fig. 2B). Thus,
when assayed in the absence of plasmid DNA, it appears that deleting
the majority of 5'-upstream DNA has a negative effect on promoter
activity in vitro.
Figure 2:
Transient transfection of NB2a
neuroblastoma cells with RIβlac DNA. A, RIβlac
plasmids with different lengths of 5'-upstream DNA (see Fig. 1)
were transfected by calcium phosphate precipitation and assayed for
β-gal enzyme activity. B, β-gal activity obtained with
the same RIβlac deletions as in A with the exception that
all plasmid sequences were removed prior to transfection. The data are
expressed as the fold increase in enzyme activity obtained in
comparison to transfections using a promoterless lac z gene, plac F (26). Each bar represents the average
± S.D. of two experiments performed in
triplicate.
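The normalization described in this legend, enzyme activity expressed as a fold increase over a promoterless control and summarized as an average ± S.D., can be illustrated with a minimal Python sketch. The function name and all readings below are hypothetical, and dividing each replicate by the mean of the control readings is an assumption; the legend does not state how replicates from the two experiments were pooled.

```python
from statistics import mean, stdev


def fold_increase(construct_readings, promoterless_readings):
    """Return (mean, sample S.D.) of activity relative to the promoterless control.

    Assumption for this sketch: every construct reading is divided by the
    mean of the promoterless readings before averaging.
    """
    baseline = mean(promoterless_readings)
    folds = [reading / baseline for reading in construct_readings]
    return mean(folds), stdev(folds)


# Invented readings in arbitrary enzyme units, for illustration only.
avg, sd = fold_increase([12.0, 15.0, 18.0], [3.0, 3.0, 3.0])
print(f"{avg:.1f} +/- {sd:.1f} fold over promoterless control")
```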
Because of the contradictory results
obtained with transient transfection and because of the uncertain
physiological relevance of testing neuroblastoma cells instead of
neurons, we measured the activity of various RIβlac constructs in
transgenic mouse brain. Fig. 3A illustrates the
transgenes used in this experiment and indicates the number of lineages
that expressed lac z. Animals were considered positive if
β-gal staining was detected in sections of brain tissue. Note that
almost all of the lines expressed the transgene even with as little as
17 bp of upstream sequence. As expected, no animals expressed
transgenes containing deletions through the region of transcription
initiation and exon I (RIβlac+243). Upon inspection of
β-gal staining in brain slices, we noted that enzyme levels
appeared to be quite variable within each group of transgenic mice. Fig. 4, for instance, shows the expression pattern of two
lineages carrying RIβlac-7500. β-gal staining in line 349 was
strong and broadly dispersed throughout all the major regions of the
brain (Fig. 4, A and C), whereas expression in
line 344 was completely absent or significantly reduced in many regions (Fig. 4, B and D). These patterns were
heritable and observed in siblings of different gender and at different
ages. Line 344 contained three times the number of integrated
transgenes as line 349, so the reduced expression in 344 could not be
attributed to gene copy number. In fact, no correlation between copy
number and transgene expression was observed in 75 independent
lineages. This suggests that the variability of lac z expression probably relates to the random nature of transgene
integration and the domain effects caused by the flanking host
chromatin(36) .
Figure 3:
Expression of RIβlac 5'-deletion
constructs in the brains of adult mice. A, transgenes
containing the indicated amounts of 5'-DNA are shown. The restriction
sites used for this analysis were as follows: BII, BglII; X, XbaI; K, KpnI; S, StuI; SII, SstII; E, EagI; and n, NsiI. The number of expressing
lineages out of the total number of lineages positive for transgene
incorporation are shown on the right (Exp/TG). B,
transgene expression in regions of mouse brain. Tissue slices were
incubated with X-gal, and the relative intensity of staining in the
indicated regions was scored using an arbitrary scale from 0 to 5. The
number of lineages examined is shown above the graph.
A minimum of three mice was assayed from each lineage. The graph indicates averaged values. C, β-gal enzyme activity
in whole brain extracts. The number of expressing lineages that were
assayed is shown above the graph. A minimum of three
mice was assayed from each lineage. The Wilcoxon Rank Sum test (29) was used to determine statistical significance (p < 0.05; see text).
Figure 4:
Variability of transgene expression in
adult mouse brain. A and C, lineage 349, transgenic
for RIβlac-7500, expressed β-gal activity in most regions of
the brain, including the neocortex (N), caudate-putamen (Cp), septum (S), hippocampus (H), thalamus (T), hypothalamus (Hy), and piriform cortex (P). B and D, lineage 344, also transgenic
for RIβlac-7500, showed intense staining in the cortex but
little to no enzyme activity in other anatomical regions. E,
lineage 874, transgenic for RIβlac-17, expressed only in the
neocortex (arrows show approximate lateral extent of labeled
cells). F, lineage 867, also transgenic for RIβlac-17,
showed scattered expression in the cingulate region of the
neocortex, the dorsal hypothalamic nucleus, and the amygdala (A). Positive cells are also scattered throughout the
hypothalamus and piriform cortex.
To quantitate the expression of each
transgenic lineage, we scored the relative level of lac z staining in specific regions of the brain (Fig. 3B), and as a second method we measured lac z activity in whole brain extracts (Fig. 3C). The
results of these assays showed a similar trend; the construct with the
most 5'-upstream DNA, RIβlac-7500, was equivalent to RIβlac-663
in activity. Removal of DNA down to position -333
reduced promoter activity, and deletion to -17 diminished
transgene expression significantly. Although RIβlac-17 was
rather weak, β-gal expression was always restricted to the central
nervous system and undetectable in non-neuronal tissue. To calculate
whether the differences observed between these deletion constructs were
statistically significant, we employed the Wilcoxon Rank Sum test ((29), see "Materials and Methods"), which is used
for data sets with large variances. Using this test, it was determined
that the differences between -663, -333, and -17 were
statistically significant (p < 0.01). Two-way comparisons
involving -7500, -1456, -1063, and -663 showed
no statistical differences (p > 0.10). We conclude from
this series of deletions that sequences positioned within the intervals
-663/-333 and -333/-17 contribute to high level
expression of the RIβ promoter. Moreover, the residual activity of
RIβlac-17 (Fig. 3 and Fig. 4) indicates that
neural-specific expression is controlled by additional sequences
downstream of -17. We suggest that these sequences encompass the
RIβ proximal promoter and transcription initiation (see
"Discussion").
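For readers unfamiliar with the rank sum comparison used above, the calculation can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of reference 29: it assigns average ranks to ties and uses the large-sample normal approximation without tie or continuity correction, both of which are assumptions of the sketch. The function names and the activity values are invented for the example and are not data from this study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_sum_test(a, b):
    """Return (W, z, two-sided p) comparing sample a with sample b.

    W is the sum of the ranks of sample a in the pooled data. The p value
    comes from a normal approximation (an assumption of this sketch).
    """
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    w = sum(ranks[:n1])
    u = w - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))
    return w, z, p


# Hypothetical per-lineage activities for two deletion constructs.
short_construct = [1, 2, 3, 4, 5]
long_construct = [6, 7, 8, 9, 10]
w, z, p = rank_sum_test(short_construct, long_construct)
print(w, round(z, 2), p < 0.05)
```

Because the test works on ranks rather than raw activities, a single lineage with unusually high expression, as can arise from position effects, does not dominate the comparison.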
Intron I of the RIβ gene also
contains a variety of protein binding motifs (Fig. 1). To test
the functional requirement for this DNA, we made transgenic mice with
the constructs shown in Fig. 5. The largest deletion removed
1900 bp from the end of exon I to the beginning of the initiator MET of
the lac z gene. Six of 10 founders demonstrated only very
weak staining in brain sections (Fig. 5B). No
expression was detected in whole brain extracts (Fig. 5C). Another deletion removed most of the intron
but left intact the 5'- and 3'-splice sites
(Δ+243/+1661). Again, the loss of activity was dramatic:
4 of 10 founders expressed only low levels of β-gal. Finally, a
relatively minor deletion that removed 388 bp of intron sequence
(Δ+243/+643) also had a marked effect when compared to
the intact intron. These results demonstrate that intron I is required
for meaningful RIβ expression in vivo, although its
absence does not eliminate brain-specific expression.
Figure 5:
Expression of RIβlac intron deletions
in the brains of adult mice. A, various amounts of intron I
were removed from RIβlac-1515. The number of transgenic mice
established for each construct is indicated with the number that
expressed lac z. RIβlac-1456, which has an intact
intron, served as the positive control. The restriction sites used were RI, EcoRI; BII, BglII; A, AvaI; SII, SstII; X, XbaI;
and n, NsiI. B, transgene expression in
different brain regions, as measured by relative X-gal staining. C, β-gal activity in whole brain extracts. The number of expressing lineages that were assayed is shown above the graph. A minimum of three mice was assayed from each
lineage.
In contrast to many genes, the 3.5 kb of DNA that flank the
RIβ promoter contain a rather complex arrangement of transcription
factor binding sites that includes 18 SP1 sites, 15 bHLH E-boxes, 40
steroid hormone receptor half-sites, and multiple EGR-1, AP2, AP1, and
POU sequence motifs (Fig. 1, (21)). The redundancy of
these binding sites has presented an interesting challenge in
identifying the sequences responsible for RIβ gene expression, and
rather than systematically mutating specific classes of binding sites,
we prepared a series of constructs in which varying amounts of DNA were
removed from the 5`-upstream region and from intron 1. Although a test
of these deletions was initiated in neuroblastoma cells by transient
transfection, we turned to transgenic mice to ascertain which mutations
would cause a meaningful effect on gene expression in a functional
nervous system.
Sequences primarily responsible for neuron-specific
expression of the RIβ gene most likely reside within the proximal
promoter between positions -17 and +243. This conclusion is
based on the activities of RIβlac-17 and RIβlacΔIV, both
of which expressed β-gal in a brain-specific manner, and
RIβlac+243, which failed to express in the absence of the
proximal promoter. Consistent with this result was the observation that
the -17/+243 promoter fragment stimulated lac z activity in NB2a neuroblastoma cells but did not function in
non-neuronal cell types (data not shown). A likely regulatory element
within the proximal promoter coincides with the start of transcription
and includes an EGR-1 motif and three consecutive SP1 sites within a
21-bp interval (Fig. 1). Gel shift experiments confirmed that
this region in RIβ binds pure SP1 and additional proteins in brain
nuclear extract (21). Similar sequences are present in a number
of genes expressed in neural tissue (37, 38),
including the promoter for the RII subunit of PKA (39),
and are required for expression of aldolase C in mouse
brain (40). Sequences such as these have also been shown to
bind inhibitors of transcription (41).
Although the proximal
RIβ promoter could function in a tissue-specific manner, its
activity was enhanced considerably by 5'-upstream sequences. RIβlac-663
provided maximum transgene expression, indicating that
sequences between -663 and -17 function as positive
regulators of the basal promoter. This region contains numerous
transcription factor binding sites, possibly the most complex being a
37-bp direct repeat (located between -555 and -519) that
contains two E-boxes, each of which is overlapped by two steroid
hormone receptor half-sites, and a shared AP1 consensus
sequence (21). Again, gel mobility shift assays demonstrated
that this multiple repeat sequence readily formed a complex with brain
nuclear extract containing bHLH-, SHR-, and AP1-related proteins.
Nuclear proteins isolated from liver failed to bind these
oligonucleotides (21). Sequences like these have been shown to
function as enhancer elements in numerous genes (42, 43, 44). Evidence that the 37-bp repeat
may be required for high level expression comes from the
significant reduction in transgene activity obtained with
RIβlac-333. This latter deletion construct, however, still enhanced
basal promoter activity, suggesting that additional sequences, some of
which are redundant with those in region -663/-333, are
also involved in controlling promoter activity.
Two additional
regulatory regions may also have been uncovered further upstream: a
positive element between -7500 and -1456 whose removal
lowered promoter activity and an inhibitory sequence between
-1063 and -663 whose deletion increased expression.
Interestingly, this latter segment contains a duplicated E-box (Fig. 1; near position -736), which in the low affinity
nerve growth factor receptor gene acts as a negative regulatory
element(45) . The significance of these deletions is hard to
measure because of the large variability inherent to this experiment.
However, if we use the Wilcoxon Rank Sum test and combine the
activities from RIβlac-1456 and RIβlac-1063 into
a single data set and compare this to RIβlac-663, then the
differences between these become statistically significant (p < 0.05). We do not believe, however, that this inhibitory
region contains the type of silencer elements observed in other
neural-specific genes (46) because their elimination did not
stimulate proximal promoter activity in non-neuronal Chinese hamster
ovary or JEG cell lines nor did it induce expression in the
non-neuronal tissues of these transgenic mice.
The final region
shown to have regulatory activity was intron I, which contains many
binding motifs including Sp1, AP1, bHLH, POU, CarG, and SHR sites (Fig. 1, (21) ). We made two large deletions in intron I
and saw a dramatic decrease in promoter activity. Interpretation of
this effect is complicated by the observation that transgene expression
is often dependent on 5'-intronic sequences (47). While the
loss of promoter activity may have resulted from a splicing
requirement, we note that the 5'- and 3'-splice sites were left
intact in RIβlacΔ418, and even RIβlacΔ388, which
removed one AP1, four SP1, and three SHR half-sites, had significantly
lower promoter activity. It has recently been shown with transgenic
mice that deletion of peripheral Sp1 sites causes de novo methylation and inactivation of the aprt gene (48). Since the RIβ promoter encompasses a GC
island, a similar phenomenon may explain the loss of transgene
expression following the removal of SP1 sites in RIβlacΔ388.
Our experiments show some of the intricacies associated with
analyzing promoter function. RIβlac expression in mice, for
instance, was subject to large variations in expression due to position
effects at the site of transgene integration. The mechanism responsible
for this inhibition is unknown but may involve
methylation(48) . As a result of the variability in expression,
a large number of transgenic mouse lineages were produced, and
utilizing the Wilcoxon Rank Sum test as a statistical tool, we were
able to delineate major regulatory regions in the RIβ promoter.
Using transgenic mice, however, to test the relative importance of the
specific binding sites within each region may prove more difficult. The
NB2a neuroblastoma cell line was originally used for a deletion analysis
because of the relative simplicity of transient transfections. The
contradictory results obtained with these cells (Fig. 2)
required that we first establish RIβ promoter regulation in
vivo. Now that a similar expression pattern has been established
between transgenics and the transfection of RIβlac genes (minus
plasmid sequences) in vitro, it should be possible to use a
cell culture system for detailed mapping of these regulatory regions.
Endogenous RIβ mRNA is detected at varying levels in most
regions of the brain and spinal cord (16), and RIβ
transgene expression has been detected in peripheral nerves as
well (20). Our data suggest that expression of RIβ reflects
a constitutive regulation common to many types of neurons and that this
occurs at the level of its GC-rich proximal promoter. The large number
of transcription factor binding sites flanking this region may serve to
integrate information from numerous signal transduction pathways,
allowing expression of RIβ mRNA to be finely regulated. The
redundancy of these sequences may guarantee the continual expression of
this gene. Given the major role PKA plays in the physiology of neurons,
precise regulation of individual subunit gene expression may be
necessary to provide appropriate cellular responses.
The nucleotide sequence reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number S72345.