(Received for publication, June 29, 1994; and in revised form, September 22, 1994)
From the
The glucagon gene is expressed in the endocrine pancreas, the intestine, and the brain. In the endocrine pancreas, expression of the glucagon gene is restricted to the alpha cells of the islets of Langerhans. We previously showed that 168 base pairs of the promoter was critical for this restricted expression. To further characterize the mechanisms involved in alpha cell specificity, we analyzed the responsible DNA sequences by transient transfection studies into glucagon- and insulin-producing cell lines. We localized alpha cell-specific sequences between nt -100 and -52, a region that corresponds to the upstream promoter element G1. Four protein complexes, B1, B2, B3, and B6, interact with G1; B6 requires most of G1 to be formed. B1, B2, and B3, by contrast, bind on closely overlapping sequences, display similar methylation interference patterns, and appear to be related complexes. Point mutations of G1 indicate, however, that their binding specificities are different. All four complexes are islet-specific, and impairment of their binding results in decreased transcription. We conclude that G1 interacts with islet cell-specific proteins to restrict glucagon gene expression to the alpha cells.
Developmental regulation of gene expression and the events leading to cell differentiation remain poorly characterized. Identification of nuclear proteins and their cis-acting cognate DNA control elements constitute a preliminary step to approach the molecular mechanisms of gene expression during embryogenesis and in the differentiated cell.
Glucagon is a 29-amino acid peptide involved in glucose homeostasis. It is the first hormone to be produced by the embryonic pancreas (1, 2). Characterization of the factors involved in the regulated expression of the glucagon gene should help to understand the events leading to islet cell differentiation. Studies with transgenic mice and glucagon-producing cells transiently transfected with DNA constructs containing various lengths of the rat glucagon gene 5`-flanking sequence linked to the chloramphenicol acetyl transferase (CAT) reporter gene have shown that the most proximal 300 bp were sufficient to direct alpha cell-specific expression (3, 4). Three DNA control elements have been further defined within these 300 bp: G1, G2, and G3. G2 and G3 function as enhancer-like sequences, and G1 functions as an upstream promoter element with little intrinsic activation potential.
We previously showed that 168 bp of the rat glucagon gene promoter were critical for alpha cell specificity (5). To further understand the molecular mechanisms leading to cell-specific expression of the glucagon gene, we first localized the responsible sequences within this region by 5`- and 3`-deletional analyses of the promoter and studied their interactions with nuclear proteins by binding and functional assays. We report here that G1 is the critical DNA control element of the rat glucagon gene promoter, which confers alpha cell-specific expression. At least four protein complexes (B1, B2, B3, and B6) bind to G1. B6 requires most of G1 for its formation, whereas B1, B2, and B3 bind to overlapping sequences, display similar DNA interactions as assessed by dimethyl sulfate (DMS) interference assays, and may be related complexes. Specific mutations of G1, however, can selectively prevent or impair the binding of B1, B2, or B3. The four complexes are islet cell-specific, and mutations that affect their binding result in decreased transcriptional activity. Our data indicate that glucagon gene expression depends on the interactions of islet cell-specific DNA-binding proteins with sequences of the upstream promoter element G1.
The human cell lines THP-1 (human monocyte), Hep-2 (human epidermoid carcinoma), HepG2 (human hepatocellular carcinoma), ME-43 (melanoma), JAR (choriocarcinoma), and the Epstein-Barr virus-transformed B lymphocyte line, Mann, were grown in the same conditions as described above.
Nuclear extracts were prepared according to Shapiro et al.(9) or Schreiber et al.(10) . Protein concentrations were determined by the Bio-Rad protein assay kit (Bio-Rad).
A 408-bp fragment (nt -350 to +58) subcloned into Bluescript (Stratagene, La Jolla, CA) was 5`-end-labeled on the upper strand and used for the DNase I footprinting assay essentially as described previously (5) except that no polyvinyl alcohol was added in the incubation buffer.
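The fragment and oligonucleotide sizes quoted in the text follow from their nucleotide coordinates. The sketch below checks that arithmetic under one assumption that the text does not state explicitly: positions are numbered without a nt 0, so the transcription start site is +1 and the base before it is -1. The helper name `span_length` is hypothetical.

```python
def span_length(start: int, end: int) -> int:
    """Length in bp of a fragment running from nt `start` to nt `end`, inclusive.

    Assumption: numbering has no nt 0 (the start site is +1 and the
    preceding base is -1).
    """
    if start == 0 or end == 0:
        raise ValueError("nt 0 does not exist in this numbering convention")
    if start > end:
        raise ValueError("start must not lie downstream of end")
    length = end - start + 1
    # A fragment that crosses the -1/+1 boundary skips the nonexistent nt 0.
    if start < 0 < end:
        length -= 1
    return length


# Coordinates taken from the text.
print(span_length(-350, 58))   # footprinting fragment
print(span_length(-115, -60))  # oligonucleotide G1-56
print(span_length(-95, -63))   # oligonucleotide G1-33
```

Under this convention the three calls give 408, 56, and 33 bp, matching the sizes stated for the footprinting fragment, G1-56, and G1-33.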
Oligonucleotides used for EMSAs (see Table 1 for sequences) and site-directed mutagenesis were synthesized on a gene assembler (Pharmacia Biotech Inc.) by using the phosphoramidite method and purified on 20% sequencing gels. Site-directed mutagenesis of G1 was performed on G1-56 subcloned into Bluescript. The wild-type G1-56 and the resulting mutated G1 (M1 to M12) were then isolated and used as binding sites in EMSAs. Relative localization within G1 of the mutated nucleotides is indicated in Table 1. 5` and internally deleted mutants were constructed from the appropriate 5`- and 3`-truncated DNA fragments of the rat glucagon 5`-flank, as previously described (5, 18).
Figure 1: Cell-specific expression of the rat glucagon gene 5`-flanking sequences. CAT plasmids containing fragments of the 5`-flanking region of the glucagon gene were transfected along with a control plasmid, pSV2Apap, into glucagon-producing (InR1G9 and αTC1) and insulin-producing (HIT-15 and βTC1) cells. CAT activities represent CAT/placental alkaline phosphatase enzymatic activity ratios and are expressed relative to that obtained with the positive control, RSVCAT; values are indicated along with the standard error of the mean (n = 4). poCAT (the CAT gene without the promoter) was used as a negative control. A and C, DNA transfection into InR1G9 and HIT-15 cells. B and D, DNA transfection into αTC1 and βTC1 cells. G2 refers to sequences nt -200 to -165, and G3 refers to nt -270 to -230. *, #, and + indicate p values of <0.01, <0.02, and <0.05, respectively.
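The normalization described in this legend can be sketched in a few lines. This is an illustration only: the text does not say how replicate measurements were paired or averaged, so the pairing of each CAT reading with the PAP reading from the same transfection, the function name `relative_activity`, and all numbers in the demo call are assumptions, not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def relative_activity(cat, pap, ref_cat, ref_pap):
    """Return (mean, SEM) of CAT/PAP ratios expressed relative to a reference.

    cat, pap         -- paired readings for the test construct, one pair per transfection
    ref_cat, ref_pap -- paired readings for the reference construct (e.g. RSVCAT)
    """
    # Mean CAT/PAP ratio of the reference construct defines the value 1.0.
    ref_ratio = mean(c / p for c, p in zip(ref_cat, ref_pap))
    # Correct each test reading for transfection efficiency, then rescale.
    ratios = [(c / p) / ref_ratio for c, p in zip(cat, pap)]
    # Standard error of the mean uses the sample standard deviation.
    sem = stdev(ratios) / sqrt(len(ratios))
    return mean(ratios), sem


# Hypothetical readings for n = 4 transfections (arbitrary units).
activity, sem = relative_activity(
    cat=[1.0, 2.0, 3.0, 4.0],
    pap=[1.0, 1.0, 1.0, 1.0],
    ref_cat=[10.0, 10.0, 10.0, 10.0],
    ref_pap=[1.0, 1.0, 1.0, 1.0],
)
print(f"relative CAT activity = {activity:.3f} +/- {sem:.3f}")
```

Dividing by the cotransfected PAP signal first means that a dish with poor DNA uptake does not read as a weak promoter.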
Selective expression of DNA constructs nt -1600CAT, nt -1100CAT, nt -350CAT, nt -213CAT, and nt -168CAT was observed in the glucagon-producing cell lines, InR1G9 and αTC1. By contrast, little or no activity above that obtained with the promoterless construct poCAT was measured in the insulin-producing cells, HIT-15 and βTC1.
The fact that nt -168CAT was still specifically expressed in InR1G9 and αTC1 cells indicates that a critical determinant for alpha cell-specific expression is present within the promoter of the rat glucagon gene or the 51 bp of the first exon, a result in agreement with our previous analyses (5).
Additional DNA constructs
were made in order to further localize the alpha-specific DNA element.
We thus linked the rat glucagon gene enhancer sequences G2 or G3 to
different lengths of the 5`-deleted promoter (nt -136 to
+51). G2 and G3 were chosen for their capability to drive
transcription in both glucagon- and insulin-producing cells with
similar efficiency when linked to a heterologous promoter(5) .
With constructs containing either 31 or 60 bp of the glucagon promoter
driven by G2 or G3, no clear specificity of expression between
glucagon- and insulin-producing cells could be demonstrated (Fig. 1, C and D). Specific expression in
InR1G9 and αTC1 cells became evident, however, when at least 75 bp
of the promoter were present. A further increase in specificity was
observed when 136 bp were included. Replacement of G2 or G3 by a
heterologous enhancer, represented by the hepatocyte nuclear factor 3
(HNF3) binding site of the transthyretin gene (HNF3-136CAT),
which is active in both beta and alpha cell phenotypes(19) ,
gave similar results, indicating that alpha cell-specific expression is
not critically dependent on G2 or G3. In addition, activities measured
after transient transfection of the 3`-deleted mutant nt -200 to
-118 linked to nt -31CAT, which lacks G1, were comparable
in both glucagon- and insulin-producing cells. Our results indicate
that critical determinants of alpha cell-specific expression are
present between nt -118 and -60 of the glucagon promoter.
This region corresponds to the previously identified upstream promoter
element G1(5) . G1 thus serves as the alpha cell-specific
element of the glucagon gene.
Figure 2: DNase I footprint analysis of the rat glucagon gene promoter. The upper strand of the nt -350 to +58 fragment of the 5`-flanking sequence of the rat glucagon gene was 5`-end-labeled as described under ``Materials and Methods.'' Lanes F, 1, 2, and 3 represent DNase I-digested free DNA (F) and DNA to which 25 (lane 1), 50 (lane 2), and 75 (lane 3) µg of nuclear extracts prepared from InR1-G9 cells were added. Protected regions are indicated by brackets, and their approximate borders are numbered according to the transcription start site. Enhanced DNase I cutting site is indicated by an arrow.
Figure 3: Schematic representation of the first 300 bp of rat glucagon gene 5`-flanking sequence. G1, G2, and G3 boxes represent regulatory elements(7) . T, TATA box. Nucleotides are numbered relative to the transcriptional start site (indicated by an arrow). The DNA sequence of G1 is indicated above the map. Fragments used for electrophoretic mobility shift assays and methylation interference experiments are represented below the diagram. The relative positions of the mutated nucleotides within G1 are represented by black boxes. See ``Materials and Methods'' for the sequences of the mutated oligonucleotides.
Figure 4: Binding of nuclear proteins from InR1G9 cells to the G1 control element. Electrophoretic mobility shift assays were performed by incubating 32P-end-labeled G1-121 in the absence (-) or presence of the indicated molar excess of unlabeled competitor oligonucleotides (indicated above each lane) with 6 µg of InR1-G9 extracts. B1, B2, and B6 indicate the positions of specific protein-bound complexes, whereas B4 and B5 point to nonspecific complexes. NS, nonspecific oligonucleotide. The asterisk represents migration of free oligonucleotides.
To examine the minimal sequences needed for binding of B1, B2,
and B6, we first competed for these three complexes by progressively
shortened oligonucleotides (Fig. 3). As shown in Fig. 4, B and C, oligonucleotides G1-56 (nt -115
to -60) and G1-33 (nt -95 to -63) were both
able to compete for B1, B2, and B6, although relatively more
G1-33 was necessary to displace B2 and B6 as compared with
G1-56 and G1-121. By contrast, a mutant G1-56
(G1r-56, mutated from nt -89 to -68) or a nonspecific
oligonucleotide were unable to displace any of the three complexes (Fig. 4B). To further localize the binding sites of B1,
B2, and B6, we used two different G1-33 mutants as competitors,
G1-33r3 and G1-33r5 (Table 1). As shown in Fig. 4D, only modest competition of B1 was observed
with the 5`-mutant G1-33r5, whereas both B1 and B2 were
efficiently displaced by the 3`-mutant G1-33r3. Neither
G1-33 mutant affected B6. These results indicate that the core of
G1 (nt -89 to -80) is critical for the interactions with B1
and B2 and that B6 requires at least the 33 bp of G1-33 to be
formed. We then performed EMSAs with the various shortened
oligonucleotides. With G1-56, we observed not only B1 and B2 but
a new complex, B3, in addition to two nonspecific complexes, B4 and B5 (Fig. 5A). B3 was specifically competed for by
G1-56 but to a lower extent by G1r-56 or nonspecific DNA.
Nonspecific competitor DNA always somewhat affected B3 formation,
indicating that B3 has a lower binding specificity to G1 as compared
with B1 and B2. B3 was also efficiently competed for by the mutant
G1-33r3 but not by G1-33r5 (data not shown). Of note,
binding of B1 and B3 to G1-56 was maximal at 5 mM MgCl2 and at 20 °C, whereas B2 binding was optimal
at 3.5 mM MgCl2 and not influenced by temperature
(4 versus 20 °C) (data not shown). B6 was never observed
to form with G1-56, suggesting that sequences between nt
-60 and -52, which are present in G1-121 but not in
G1-56 are important for its formation. The reason for not
observing B3 with labeled G1-121 is unclear, although shorter DNA
fragments might be preferable binding sites in EMSAs; alternatively, B3
may be a component of B6. In this hypothesis, B3 might interact with
other proteins to form the slow migrating B6 complex. When the 3`-end
of G1 is deleted (as with G1-56 and G1-33), B6 may not be
able to form, allowing for the appearance of the faster migrating (and
thus probably smaller) B3 complex.
Figure 5: Binding of nuclear proteins from InR1G9 cells to shortened G1 oligonucleotides. Gel retardation assays were performed with 6 µg of nuclear extracts as in Fig. 4. A and C, binding of nuclear extracts to G1-56. B, binding of InR1G9 nuclear extracts to G1-33. The molar excess of the competitor oligonucleotide is indicated above each lane. B1, B2, and B3 indicate the positions of specific complexes, whereas B4, B5, and dots point to nonspecific complexes. The asterisk represents free oligonucleotides.
B1, B2, and B3 were also able to bind G1-33, although, with this shorter binding site, B2 migrated close to B1 (Fig. 5B). Addition of mutant G1-56 competitors (Table 1 and Fig. 9), which either bind B1 and not B2 (M4) or B2 and not B1 (M9), clearly showed, however, that both B1 and B2 are binding to G1-33. G1-33 is thus sufficient to bind B1, B2, and B3.
Figure 9: Effect of G1 mutants on protein binding. A, EMSA performed by incubating 32P-end-labeled G1-56 and 11 G1 mutant oligonucleotides (see Table 1) with 6 µg of InR1-G9 nuclear extracts. B, G1 mutant oligonucleotide-specific binding was investigated by competition experiments. The binding reactions were carried out by incubating labeled G1-56 in the absence (-) or presence of a 100-fold molar excess of mutated DNA competitor (M1 to M12) as indicated above each lane. Positions of the specific DNA-protein complexes (B1 to B3), nonspecific complexes (B4, B5, and dots), and free DNA (asterisk or F) are indicated with arrowheads.
Transient expression of DNA constructs containing 5`-deleted fragments of the glucagon gene promoter driven by either G2 or G3 revealed that alpha cell-specific expression was partially restored with 75 bp of the promoter compared with 60 bp. Since the binding sites of B1, B2, and B3 are located 5` of nt -75, we tested whether we could detect the interaction of protein complexes with G1-3`, an oligonucleotide spanning sequences of the promoter between nt -77 and -47. However, no specific complexes were observed (data not shown).
Figure 6: Binding of increasing amounts of InR1G9 nuclear extracts to G1-56. Gel retardation assays were performed as in Fig. 4. A, the amount of nuclear extracts incubated with labeled G1-56 was increased from 6 (lane 1) to 9 (lane 2), 12 (lane 3), 15 (lane 4), 18 (lane 5), and 21 µg (lane 6). B, the amount of nuclear extracts was 6 (first lane) and 12 µg (second and third lanes). The unlabeled competitor oligonucleotide is indicated above lane 3. B1, B2, and B3 indicate specific complexes, whereas B4 and B5 indicate nonspecific complexes. The asterisk represents free labeled G1-56.
Figure 7: Methylation interference analysis of B1 and B2. G1-56 coding (C) and noncoding (NC) strands were individually 32P-end-labeled and incubated with 48 µg of InR1-G9 extracts as described under ``Materials and Methods.'' A, methylation interference pattern of B1 and B2. F represents the cleavage pattern of free DNA; B1 and B2 correspond to the cleavage products obtained from B1 and B2 complexes. DNA sequences of modified methylation are indicated along the ladder. B, schematic representation of the methylation interference profiles of B1 and B2. Filled arrowheads indicate G residues in B1 and B2 complexes at which methylation specifically interferes with protein binding. Open arrowheads represent enhanced DNA cleavage. Intensity of each band was evaluated by laser densitometry.
Enhanced DNA cleavage was noted 5` and 3` of the contact points for all three complexes (at nt -109 and -74 on the upper and lower strands, respectively), probably indicating more exposed sites either by protein binding-induced structural alteration of the DNA or by the formation of hydrophobic pockets at the boundaries of the protein binding sites.
Figure 8: Cell type distribution of B1, B2, B3, and B6. Equal amounts of nuclear extracts (6 µg) from cell lines indicated below were assayed for the presence of B1, B2, B3, and B6, using the G1-121 and G1-56 probes. EMSAs were performed as described in the legend to Fig. 4. A, cellular specificity of DNA-protein complexes B1, B2, and B6 in extracts from phenotypically different islet cells: InR1G9 and HIT-15 (left panel, lanes 1 and 2, respectively) and αTC1 and βTC1 (right panel, lanes 1 and 2, respectively). B and D, cellular specificity from non-islet cells. B, comparison of the EMSA profiles obtained with nuclear extracts from InR1-G9 (lane 1) and the non-pancreatic rodent cell lines PC12 (lane 2) and BHK-21 (lane 3) using labeled G1-121. C, comparison of the EMSA profiles obtained with nuclear extracts from InR1G9 cells (lane 1), ME43 (lane 2), HepG2 (lane 3), Hep-2 (lane 4), JAR (lane 5), and Mann (lane 6) using labeled G1-121. D, patterns obtained with InR1G9 (lane 1) and the non-pancreatic human cell lines Mann (lane 2), THP-1 (lane 3), Hep-2 (lane 4), HepG2 (lane 5), ME-43 (lane 6), and JAR (lane 7) using labeled G1-56. Asterisk indicates unbound oligonucleotides. B1, B2, B3, and B6 indicate the positions of bound specific complexes. For descriptions of the different cell lines, see ``Materials and Methods.''
Figure 10: Functional analysis of mutations within G1 by transient expression into InR1G9 and HIT-15 cells. Plasmid constructs containing the CAT gene (CAT) under the control of sequence nt -350 to +58 of the rat glucagon gene were tested for CAT activity by transient transfections into InR1G9 and HIT-15 cells as described under ``Materials and Methods.'' The regulatory elements are represented by hatched boxes, and the wild-type (WT) and mutated G1 sequences are indicated below the map. The E box motif is underlined. The site of transcription initiation is indicated by the arrow. Each construct was cotransfected with the plasmid pSV2Apap to correct for differences in transfection efficiencies, and relative CAT activity values represent CAT/PAP enzymatic activity ratios relative to that of construct WT with the standard error of the mean (n = 4). pBLCAT3 (the CAT gene without the promoter) was used as a negative control. #, +, and * indicate p values of <0.01, <0.02, and <0.05, respectively.
To investigate the possibility that G1 might function as a silencer in non-alpha cell lines rather than as an alpha cell-specific element, we transfected the mutant G1 constructs into the insulin-producing cell line, HIT-15. None of the constructs was capable of significantly activating transcription above base line (Fig. 10). These results strengthen the proposed alpha cell-specific role of G1.
In the studies presented here, we have attempted to better
define the factors that contribute to alpha cell-specific expression of
the glucagon gene. We have previously shown that the glucagon gene
promoter (nt -168 to +51) was critical for cell-specific
expression, whereas the enhancers G2 and G3 were capable of activating
transcription in phenotypically different islet cells(5) . We
show here that the DNA sequences of the promoter that determine alpha
cell specificity are localized between nt -118 and -60 and
correspond to G1, a DNA control element previously identified by DNase
I footprint assays. Additional elements are present within the rat
glucagon gene promoter, but their contribution to differential
expression between islet cell phenotypes is probably small. G1 is a large 49-bp element that binds at least four
protein complexes, B1, B2, B3, and B6. Surprisingly, B1, B2, and B3
interact with a limited part of G1, between nt -95 and -75.
No complex binding to the proximal half of G1 (G1-3`), between nt
-75 and -52 was detected by EMSAs, although many different
conditions were attempted both for running gels (Tris-glycine and TBE
(0.045 M Tris-borate, 0.001 M EDTA) buffers) and
performing the binding reactions of the EMSAs (range of KCl and
MgCl2 concentrations). Several indications suggest, however,
that the proximal half of G1 plays a significant role in both
transcriptional activity and cell specificity. Transcriptional
activities of G2/G3-60CAT in alpha and beta cells are comparable,
whereas a clear difference in favor of alpha cells is apparent for
G2/G3-75CAT, indicating the presence of an alpha cell-specific
determinant in the proximal 75 bp. Furthermore, mutations at nt
-72 to -73 (M11) result in the most dramatic decrease in
transcription. It is thus likely that the proximal part of G1 is
functionally important and consequently interacts with transcription
factors. The B6 complex actually requires the proximal region of G1 to
be formed. The facts that B6 needs most of G1 for binding and that it
is a slow migrating complex suggest that it may contain several
proteins, one or more of which may interact with the proximal part of
G1. Since we cannot detect specific complexes binding to G1-3`,
it may be hypothesized that protein interactions with the proximal part
of G1 require the cooperativity of proteins binding to its distal half
and among them, potentially, B1, B2, and B3. In that regard, M11 (nt
-72 to -73) decreases B3 binding, whereas B3 may not
interact with these nucleotides. The facts that B6 can only bind when
an intact proximal half of G1 is present and that B3 only forms in the
absence of this proximal half suggest that B3 may be contained in B6.
B1, B2, and B3 probably share common subunits as suggested in the EMSAs by the preferential formation of B3 at the expense of B1 and B2 at high protein concentrations. These complexes may thus represent combinations of homo- and heterodimers belonging to a family of transcription factors. They bind on overlapping sequences between nt -95 and -75 and display very similar DMS interference patterns. Mutational analyses clearly reveal, however, different specificities at least for B1 and B2. The overlapping binding sites of B1, B2, and B3 suggest that only one of the complexes, exclusively of the others, may interact with G1 at a particular time; if this is indeed the case, it will be important to determine whether all three complexes can interact with G1 in vivo, what favors the binding of one complex over the others, and what are their respective effects on transcription. In that regard, it may be of interest to note that B1 may have a higher affinity for G1 than B2.
Can the interactions of G1 with B1, B2, B3, or B6 explain alpha cell-specific expression of the glucagon gene? All four complexes are found in both insulin- and glucagon-producing cells. The nature of the four beta cell complexes may differ from those found in alpha cells. Combinations of different subunits contained in B1, B2, B3, or B6 between alpha and beta cells might then explain cell-specific expression of the glucagon gene. Alternatively, additional complexes that we have not detected in our binding assays or proteins not directly contacting DNA and present only in alpha cells may be required to confer specificity. In any case, multiple proteins are likely to act in concert to ensure specificity. This hypothesis is strengthened by our transfection studies indicating that 75 bp of the promoter impart cell specificity as compared with 60 bp but that the entire G1 is necessary for optimal alpha cell-specific expression. An additional consideration may be that complexes from non-glucagon-producing islet cells bind to G1 and prevent transcription; transfections of the mutant G1 constructs in HIT-15 cells do not support this possibility.
Search by computer analysis did not reveal any sequence homology between G1 and other known DNA control elements, except for an E box (CAGATG from nt -83 to -78) and two TAAT sequences (from nt -57 to -54 and nt -91 to -88).
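A motif search of this kind amounts to scanning the promoter sequence for short consensus strings and reporting their positions relative to the transcription start site. The following is a minimal sketch, not the program actually used, which the text does not name. The function `find_motif` and the demo sequence are hypothetical; the demo string is a placeholder, not the rat glucagon G1 sequence. Only the given strand is scanned, so a site on the opposite strand would require a second pass over the reverse complement.

```python
import re


def find_motif(seq: str, motif: str, first_nt: int):
    """Return (start, end) coordinates of every occurrence of `motif` in `seq`.

    `seq` is read 5' to 3' and must lie entirely upstream of the start
    site; `first_nt` is the negative coordinate of its first base.
    """
    if first_nt >= 0 or first_nt + len(seq) > 0:
        raise ValueError("sequence must lie entirely upstream of nt +1")
    hits = []
    # A lookahead reports overlapping occurrences as well.
    for m in re.finditer(f"(?={re.escape(motif)})", seq.upper()):
        start = first_nt + m.start()
        hits.append((start, start + len(motif) - 1))
    return hits


# Placeholder sequence for illustration only.
demo = "GGTAATCCCAGATGCC"
print(find_motif(demo, "CAGATG", first_nt=-20))  # E box consensus used in the text
print(find_motif(demo, "TAAT", first_nt=-20))    # homeodomain core motif
```

The coordinates reported in the text are consistent with this inclusive convention: nt -83 to -78 spans the 6 bases of CAGATG, and nt -57 to -54 and nt -91 to -88 each span the 4 bases of TAAT.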
We noted previously that disruption of the E box by linker scanning mutation resulted in a complete loss of transcriptional activity(5) . We thus hypothesized that the E box could serve as a main determinant of cell-specific expression of the glucagon gene. Helix-loop-helix proteins, which bind E boxes, play a central role in the differentiation process of a wide variety of cell types (26, 27, 28, 29) and have been suggested to be involved in the cell-specific expression of the insulin gene as well as other genes, such as those encoding gastrin and secretin expressed at some stages in islet cells(20, 21, 22) . We show, however, that B1, B2, and B3 do not show binding specificities for the E box of G1, suggesting that the E box motif of G1 has no relevance by itself to the cell-specific expression of the glucagon gene.
The two TAAT sequences present in G1 suggest that homeobox-containing transacting factors may play a role in alpha cell-specific expression. Such sites have already been shown to be critical for the cell-specific expression of two other islet hormone genes encoding insulin and somatostatin(30, 31, 32) . Further characterization of B1, B2, B3, and B6 should help us understand the development and differentiation of alpha cells.