(Received for publication, September 13, 1995; and in revised form, October 31, 1995)
The commitment of cells to specific lineages during development
is determined in large part by the relative expression of various
homeodomain (HOX) selector proteins, which mediate the activation of
distinct genetic programs. But the mechanisms by which individual HOX
genes are themselves targeted for expression in different cell types
remain largely uncharacterized. Here, we demonstrate that STF-1, a
homeodomain protein that functions in pancreatic morphogenesis and in
glucose homeostasis, is encoded by an "orphan" homeobox gene
on mouse chromosome 5. When fused to a β-galactosidase reporter
gene, a 6.5-kilobase genomic fragment of 5′-flanking sequence from the
STF-1 gene shows pancreatic islet-specific activity in transgenic mice.
Two distinct elements within the STF-1 promoter are required for
islet-restricted expression: a distal enhancer sequence located between
-3 and -6.5 kilobases and a proximal E-box sequence located
at -104, which is recognized primarily by the
helix-loop-helix/leucine zipper nuclear factor USF. As point mutations within the
-104 E-box that disrupt USF binding correspondingly impair STF-1
promoter activity, our results demonstrate that USF is an important
component of the regulatory apparatus which directs STF-1 expression to
pancreatic islet cells.
The vertebrate pancreas consists of endocrine and exocrine components, which arise from a common progenitor cell in the duodenal anlage (1). Within the endocrine component of the pancreas, a pluripotent precursor cell, which initially expresses multiple islet hormones, undergoes progressive restriction to form the four subpopulations of cells comprising the adult islets of Langerhans: insulin, somatostatin, glucagon, and pancreatic polypeptide-producing cells (2, 3). The mechanism by which these developmental pathways are activated is unclear, but current evidence implicates the homeobox factor STF-1 (IPF-1/IDX-1) as an important determinant in this process. Indeed, the requirement for STF-1 in development is supported by homologous recombination studies in which targeted disruption of the STF-1/IPF-1 gene leads to congenital absence of the pancreas (4).
STF-1 (also referred to as Pdx1) expression is first detectable at embryonic day 8.5 in cells of the pancreatic anlage and in pluripotent precursor cells. Transiently expressed in both endocrine and exocrine components of the developing pancreas, STF-1 production is progressively restricted to insulin- and somatostatin-producing islet cells (5, 6). In these cells, STF-1 appears to regulate both insulin and somatostatin genes by binding to functional elements within each promoter (5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16).
Although STF-1 appears to be an important regulator of pancreatic
genes, the mechanism by which STF-1 expression is itself targeted to
pancreatic cells remains uncharacterized. Here, we show that a 6.5-kb fragment of the STF-1 promoter is sufficient to direct
islet-specific expression of a β-galactosidase reporter gene in
transgenic mice as well as in cultured cells. Within this 6.5-kb
fragment, an E-box element, located at -104 relative to the major
transcription initiation site, appears to be particularly critical for
STF-1 promoter activity. Our studies suggest that this element is
recognized by an upstream activator which is essential for islet
expression of STF-1.
Figure 1: Chromosomal location of the STF-1 gene. STF-1 is encoded by an orphan homeobox gene located in the distal region of mouse chromosome 5. Top, schematic diagram showing the position of STF-1 (referred to here as Pdx1) relative to other markers on mouse chromosome 5. Centimorgan scale is indicated on the right. Bottom, table showing recombination frequency between STF-1 and various markers on chromosome 5. Left column indicates markers employed for chromosomal assignment.
To isolate the gene encoding
STF-1, we screened 10 bacteriophage clones from a rat EMBL
3 genomic library with a
³²P-labeled STF-1 cDNA probe and
obtained two positive clones, each containing a genomic insert of 15
kb. In addition to 6.5 kb of 5′-flanking and 3.5 kb of 3′-flanking
sequence, the 15-kb STF-1 genomic fragment contained the entire STF-1
coding region, which was interrupted by a single 4-kb intron inserted
immediately upstream (Ala-135) of the homeobox coding sequence (amino
acids 140-215) (Fig. 2A).
Figure 2: The STF-1 promoter is TATA-less and utilizes multiple transcription initiation sites. A, schematic representation of the 15-kb genomic clone of STF-1 is shown with kilobase scale above. 6.5-kb 5′-flank, 4-kb intron, and 3-kb 3′-flank are indicated. IVS refers to the 4-kb intron that interrupts exons I (shaded and indicated as I) and II (shaded and indicated as II). Lower panel shows the nucleotide sequence of the 5′-flanking region of the STF-1 gene. Transcription start sites mapped by RNase protection (filled arrowheads) and primer extension (empty arrowheads) are indicated as S1 (major initiation site, bold), S2, and S3. Potential upstream factor binding sites for bHLH/bHLH-ZIP proteins (E-box), CTF/NF-1 (CAAT), and C/EBP are underlined and labeled above along with their locations relative to the major start site. B, left, RNase protection assay of Tu6 RNA (Tu6, lane 2) or control yeast tRNA (yeast, lane 3) using antisense STF-1 RNA probe. Undigested probe is shown on the left (lane 1). Right, primer extension analysis on yeast (lane 4), Tu6 mRNA (lane 5), or RIN mRNA (lane 6) using an antisense STF-1 primer. Sequencing ladder is shown on the far right (GATC, lanes 7-10). Correspondence between RNase-protected products and primer-extended products is marked, and the first three start sites are designated S1, S2, and S3 (see also A, bottom).
The absence of consensus TATA box or initiator sequences in the 5′-flanking region of the STF-1 genomic clone (Fig. 2A) prompted us to map the transcriptional initiation sites for this gene. Using RNase protection and primer extension analysis on mRNAs from the insulin-producing cell lines RIN and Tu6 (Fig. 2, A and B), we identified three principal initiation sites, termed S1, S2, and S3, which were located 91, 107, and 120/125 nucleotides upstream of the translational start site, respectively. A fourth minor transcriptional initiation site 137 nucleotides upstream of the translational start site was also observed. Like other TATA-less promoters, the STF-1 promoter contains G/A and G/C-rich sequences 30 bp upstream of the S1 and S2 start sites (23, 24, 25).
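Because promoter elements in this paper are numbered relative to the major start site S1, whereas the start sites above are given as distances from the translational start site, a short calculation may help relate the two coordinate systems. The following is a minimal sketch only: it assumes the common convention that S1 is position +1 and that there is no position 0, and the function name `relative_to_s1` is hypothetical rather than taken from the original methods.

```python
# Distances (nt) upstream of the translational start site, as reported in the text.
# S3 was reported as a doublet (120/125), so both ends are listed.
UPSTREAM_OF_ATG = {"S1": 91, "S2": 107, "S3a": 120, "S3b": 125, "minor": 137}


def relative_to_s1(distance_upstream_of_atg, s1_distance=91):
    """Convert a distance upstream of the ATG into a position relative to S1.

    Assumption (not stated in the source): S1 is +1 and there is no position 0,
    so a site d nt upstream of S1 is at -d and a site d nt downstream is at 1 + d.
    """
    offset = distance_upstream_of_atg - s1_distance  # > 0 means upstream of S1
    if offset > 0:
        return -offset
    return 1 - offset


for name, distance in UPSTREAM_OF_ATG.items():
    print(f"{name}: {relative_to_s1(distance):+d}")
```

Under this assumed convention, S2 falls at -16, the S3 doublet at -29/-34, and the minor site at -46; the first nucleotide of the translational start codon would lie at +92.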
Figure 3:
A 6.5-kb fragment of the STF-1 promoter
targets expression of a β-galactosidase reporter gene to pancreatic
islet cells of transgenic mice. Representative cryosections of adult
pancreas from transgenic (top) and control (bottom)
littermates evaluated for lacZ activity using X-Gal as chromogenic
substrate. Arrows point to pancreatic islet
cells.
To define functional
elements that direct STF-1 expression to pancreatic islet cells, we
examined the activity of the -6500 STF Luc reporter in two
distinct pancreatic islet cell lines (βTC 3, HIT). As predicted
from results in transgenic mice, the STF-1 reporter showed
20-100-fold more activity in these islet cells than in
non-islet lines such as HeLa, PC12, and COS (Fig. 4A).
By contrast, the 4-kb intron and 3-kb 3′-flanking region of the STF-1
gene showed no such activity when inserted into a minimal SV40
chloramphenicol acetyltransferase promoter plasmid (not shown),
suggesting that the 6.5-kb STF-1 promoter fragment is specifically
responsible for targeted expression of STF-1 in islet cells.
Figure 4:
Distal and proximal elements within the
STF-1 promoter direct STF-1 expression to pancreatic islet cells. A, activity of a -6500 STF-1 luciferase reporter plasmid
following transfection into pancreatic islet (βTC 3, HIT) versus non-islet cell lines (PC12, COS, HeLa). Representative
assay showing STF-1 promoter activity in HIT cells (100%) relative to
other cell lines after normalization with co-transfected Rous sarcoma
virus-chloramphenicol acetyltransferase control plasmid. Assays were
repeated at least three times. B, representative assay of
STF-1 luciferase (STF Luc) promoter constructs following transfection
into HIT cells. Constructs are named according to 5′-promoter boundary
relative to the major transcriptional start site (S1, Fig. 2, A and B). Schematic diagrams show position of
potential binding sites for nuclear factors; the major transcriptional
start site is represented by the filled arrow. Asterisk indicates uncharacterized binding activity in the distal 3 kb. For
each construct, activity was calculated relative to -6500 STF Luc
(100%) following normalization for transfection efficiency using Rous
sarcoma virus-chloramphenicol acetyltransferase as an internal control.
Assays were repeated at least four times.
To delineate sequences within the STF-1 promoter that confer islet cell expression, we generated a series of 5′-deletion constructs and analyzed these reporters by transfection into HIT cells (Fig. 4B). Deletion of sequences from -6500 to -3500 bp from the -6500 STF reporter construct reduced STF-1 reporter activity 4-fold, suggesting the presence of a distal activating sequence within that region. Further truncation of the STF-1 promoter from -3500 to -190 bp did not affect reporter activity in HIT cells significantly (Fig. 4B), but deletion of STF-1 promoter sequences from -190 to -95 bp severely attenuated reporter activity in HIT cells, indicating that a proximal element was also required for STF-1 promoter function. Inspection of the sequence in the -190 to -95 region of the STF-1 promoter revealed three consensus E-box motifs (Fig. 2A). Although removal of two tandem E-boxes at -177 did not reduce promoter activity, deletion of the proximal E-box sequence at -104 (-95 STF Luc) completely abolished STF-1 expression in HIT cells.
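The relative activities quoted for the deletion constructs follow from the normalization described in the legend to Fig. 4: luciferase activity is divided by the activity of the co-transfected chloramphenicol acetyltransferase (CAT) control and then expressed as a percentage of the -6500 STF Luc value. A minimal sketch of that arithmetic is given below. The raw readings are hypothetical numbers in arbitrary units, chosen only so that the result illustrates a 4-fold reduction; they are not measurements from this study, and the function name is likewise an assumption.

```python
def relative_activity(luc, cat, ref_luc, ref_cat):
    """Luciferase activity normalized to the CAT internal control,
    expressed as a percentage of the reference construct."""
    return 100.0 * (luc / cat) / (ref_luc / ref_cat)


# Hypothetical raw readings (luciferase, CAT), arbitrary units; illustration only.
readings = {
    "-6500 STF Luc": (8000.0, 2.0),
    "-3500 STF Luc": (1500.0, 1.5),
}

ref_luc, ref_cat = readings["-6500 STF Luc"]
for name, (luc, cat) in readings.items():
    pct = relative_activity(luc, cat, ref_luc, ref_cat)
    print(f"{name}: {pct:.0f}% of reference ({100.0 / pct:.1f}-fold below reference)")
```

Dividing by the internal control before comparing constructs means that differences in transfection efficiency between plates do not masquerade as differences in promoter strength.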
Figure 5:
A proximal E-box element in the STF-1
promoter binds the upstream factor USF. A, DNase I protection
assay using a ³²P-labeled STF-1 promoter fragment extending
from -182 to -37 (145 bp). Autoradiogram shows digestion
pattern with no extract (lanes 1-4), with nuclear
extracts from HIT or HeLa cells (lanes 2 and 3,
respectively), or with recombinant USF-1 (lane 5). B,
electrophoretic mobility shift assay of HIT nuclear extract incubated
with a ³²P-labeled double-stranded oligonucleotide
containing the STF-1 E-box (-118/-95) (lane 1).
Unlabeled competitor oligonucleotides (50-fold molar excess) were added
to binding reactions as indicated (lanes 2-5). STF
E, wild-type STF-1 E-box oligonucleotide; STF-E MUT,
mutant STF-E oligonucleotide containing substitutions in the E-box
motif at -106 (C/A) and -102 (T/G). Ins-1 (P),
P-element from insulin I promoter; Gal-4, GAL4 recognition
site. C, left, gel shift assay of HIT nuclear
extracts using STF-E-box as probe (lane 1). Addition of USF or
TFE-3 antibody to reactions as indicated (lanes 2 and 3). Complexes C1, C2, C3 are as labeled. Right, gel
shift assay of HIT (lanes 1-3) extracts using STF-E
probe. Addition of USF-1- and USF-2-specific antisera to binding
reactions is indicated.
Previous reports demonstrating that E-boxes like the -118/-95 STF-1 motif (CACGTG) preferentially bind bHLH-ZIP proteins such as Myc, Max, TFE-3, TFE-B, and USF prompted us to examine whether any of these candidate proteins was contained within the C1, C2, or C3 complexes (26, 27). The ability of the -118/-95 E-box binding protein to withstand heat denaturation (not shown) led us to first examine whether USF, a heat-stable upstream factor, was a component of C1, C2, or C3 (28). Remarkably, addition of anti-USF antiserum to gel mobility shift reactions inhibited formation of all three complexes (Fig. 5C, left panel), but anti-TFE-3 antiserum had no effect on complexes C1, C2, or C3, suggesting that these complexes were most likely formed by USF proteins. In gel shift assays, recombinant USF-1 gave rise to a protein-DNA complex that migrated at the same relative position as complex C2 (not shown), and in DNase I protection studies, recombinant USF-1 footprinting activity coincided with that observed in HIT extracts (Fig. 5A).
Two forms of USF, termed USF-1 and USF-2,
appear to be expressed in most cell types (18). To distinguish
which of these USF proteins was contained within the C1, C2, and C3
complexes, we incubated HIT or HeLa extract with either anti-USF-1
specific or anti-USF-2 specific antiserum (Fig. 5C, right panel). Although USF-1 antiserum could
"supershift" all three complexes, the USF-2 specific
antiserum only inhibited formation of the C1 and C3 complexes. These
results suggest that complexes C1 and C3 correspond to USF-1/USF-2
heterodimers, whereas C2 may contain a USF-1 homodimer.
To verify whether the CACGTG E-box sequence was essential for STF-1 promoter activity, we constructed a mutant STF-1 oligonucleotide that contains two base pair substitutions in the E-box (-118/-95). In gel mobility shift assays with HIT nuclear extracts, this mutant E-box motif (AACGCG) could not form C1, C2, and C3 complexes and could not compete for binding of USF-1 to wild-type E-box oligonucleotide (Fig. 6A). Correspondingly, full-length (6.5 kb) STF-1 and truncated (-190 STF) reporter plasmids containing the mutant STF E motif were nearly 10-fold less active than their wild-type counterparts in pancreatic islet cells (Fig. 6B). These results indicate that the proximal E-box, which binds USF, is indeed critical for STF-1 promoter activity.
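At the sequence level, the effect of the two substitutions can be illustrated by testing each hexamer against the generic E-box consensus CANNTG, where N is any base. The sketch below uses only the wild-type (CACGTG) and mutant (AACGCG) hexamers quoted in the text; the function name `is_ebox` is hypothetical. A consensus match is of course no substitute for the binding assays reported here, since it says nothing about which factor, if any, occupies the site.

```python
import re

# Generic E-box consensus CANNTG (N = any base).
EBOX = re.compile(r"CA[ACGT]{2}TG")


def is_ebox(hexamer):
    """Return True if the 6-nt sequence matches the CANNTG consensus."""
    return EBOX.fullmatch(hexamer.upper()) is not None


# Hexamers as quoted in the text.
for label, seq in [("wild-type", "CACGTG"), ("mutant", "AACGCG")]:
    print(f"{label} {seq}: matches CANNTG = {is_ebox(seq)}")
```

The wild-type hexamer satisfies the consensus, whereas the mutant does not, because the substitutions alter the first and fifth positions of the motif, both of which are invariant in CANNTG.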
Figure 6: Binding of USF to the -104 E-box is important for STF-1 promoter activity. A, effect of E-box mutations on USF binding activity. Gel shift assay of HIT nuclear extract using wild-type (E-WT) or mutant (E-MUT) STF-1 E-box probes. Sequence of wild-type and mutant probes from -106 to -102 are shown below. C1-3, complexes C1, C2, and C3. B, effect of E-box mutation on STF-1 promoter activity in HIT cells. Representative assay of HIT cells transfected with wild-type, mutant, or deleted (-118/-95) STF-1 E-box motifs in the context of 6500 or 190 bp of STF-1 promoter. Reporter activities are shown relative to wild-type -6500 STF-1 Luc (100%) construct after normalizing for transfection efficiency with a cotransfected Rous sarcoma virus-chloramphenicol acetyltransferase control plasmid. Assays were repeated at least three times.
The majority of vertebrate homeobox genes are confined to four chromosomal clusters, termed HOX A-D (29, 30). Within each cluster, individual homeobox genes are ordered in a 5′ to 3′ pattern, which is colinear with each antero-posterior expression pattern during development. It is not entirely clear whether this colinear organization is critical for proper expression of hox genes, but current evidence suggests that such clusters may contain upstream enhancers that coordinately regulate hox gene expression (29, 31). Remarkably, the STF-1 gene does not map to any hox cluster but rather belongs to a group of so-called orphan hox genes. Although the regulatory implications of this distinct chromosomal location for STF-1 remain to be shown, our results suggest that orphan homeobox genes like STF-1 may be regulated by signals that are distinct from those employed for the HOX clusters. In this regard, the STF-1 promoter displays pancreatic islet cell-specific activity both in transgenic animals and in transient transfection assays, and the lineage-specific activity of this transgene contrasts with the segmental expression pattern of most hox genes.
Two
elements within the first 6500 bp of STF-1 5`-sequence appear to be
important for islet-specific expression: a distal element located
between -6500 and -3500 and a proximal element located at
-104. Although the identity of the distal element remains to be
elucidated, the proximal -104 element consists of a consensus
E-box motif that predominantly recognizes the upstream activator USF.
Multiple lines of evidence suggest that USF is important for STF-1
promoter activity. First, both non-discriminating USF-1 and USF-2
antibodies as well as USF-1- and USF-2-specific antibodies recognize
the complexes specific for the STF-1 E-box. Second, the STF-1 E-box
binding activity in HIT nuclear extracts has characteristics
reminiscent of USF: the complexes are heat stable and demonstrate
half-lives similar to those of recombinant USF-1. Finally, point
mutations that inhibit formation of USF complexes on the STF E-box
correspondingly attenuate STF-1 reporter activity. These results
suggest that USF complexes are indeed important for STF-1 promoter
activity and consequently for pancreatic organogenesis.
Other nuclear factors in addition to USF, most notably Myc and Max, can also bind with high affinity to the STF-1 E-box (CACGTG) motif. Myc has been shown to stimulate target gene transcription by binding as a heterodimer with Max to E-box motifs (32, 33). As myc gene expression is typically undetectable in post-mitotic cells such as those in pancreatic islets, Myc-Max complexes may not be involved in STF-1 promoter regulation there. During development, however, STF-1 expression appears to be concentrated in proliferating ductal cells (6), and myc may consequently stimulate STF-1 expression under those conditions. In this regard, it is tempting to speculate that the profound changes in STF-1 expression, which are observed during pancreatic development, may in part reflect changes in E-box binding activities that ultimately restrict STF-1 production to pancreatic islet cells.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number(s) U39640.