(Received for publication, October 19, 1995; and in revised form, December 15, 1995)
The promoter and its upstream regulatory region of the mouse
cellular retinoic acid-binding protein I (crabp-I) gene were
examined in transgenic mouse embryos, a mouse embryonal carcinoma cell
line P19, and a mouse embryonic fibroblast cell line 3T6. In transgenic
mouse embryos, a β-galactosidase reporter gene under the control of the crabp-I promoter and its upstream regulatory region displayed
a very specific pattern of expression characteristic of crabp-I gene expression during developmental stages. In tissue
culture systems, the minimal promoter of this gene was identified, and
regions containing positive and negative regulatory activities were
dissected from the upstream 3-kilobase sequence using assays for
transient reporter activity. It is concluded that the minimal promoter
of the mouse crabp-I gene is located between 120 and 150 base
pairs upstream from the transcription initiation site. Several cell
type-specific positive and negative regulatory regions for this
promoter have been identified. A region encoding a common negative
regulatory activity in both P19 and 3T6 cells is also inhibitory to two
heterologous promoters, and specific protein-DNA interactions between
this DNA fragment and nuclear extracts of P19 and 3T6 are demonstrated
by gel retardation experiments.
Retinoic acid (RA) exerts pleiotropic effects in
animals, and the effects are mediated through various cellular
components. The RA receptors and retinoid X receptors are
transcription factors that regulate gene expression in response to RA
(for review, see (1) and (2)), whereas a group of
cellular retinoic acid-binding proteins (CRABPs) are believed to be
involved in metabolic pathways of RA (for review, see (3) and (4)). crabp-I is ubiquitously expressed in adult
tissues at a very low basal level and is highly expressed in several
RA-sensitive tissues such as the eye and the
testis (5, 6). In embryos, strong expression of this
gene is also spatially and temporally specific to tissues that are most
sensitive to RA, especially the central nervous
system (7, 8, 9, 10). Based upon the
promoter sequence, the mouse crabp-I gene has been
characterized as a housekeeping gene (11). However, its
upstream region contains numerous inverted repeat sequences and
putative binding sites for transcription factors, suggesting that a
complex regulatory mechanism may be involved in its cell- and
stage-specific expression (12). The bovine crabp-I
gene has also been characterized (13), and it appears that
both the exon/intron junctions and the promoter region of this gene are
highly conserved among animal species.
Although crabp-I-deficient mice displayed no apparent
phenotype (14, 15), previous studies in transgenic
mice (16) and embryonal carcinoma cells (17) showed an
association of elevated crabp-I expression with abnormal
cellular differentiation and RA-regulated gene expression. Studies in
embryonic palate cells demonstrated that expression of RA
receptor-β, TGF-β3, and tenascin was altered as a result of
introduction of anti-crabp-I oligonucleotides (18).
Recent biochemical studies provided more evidence for a role of crabp-I in RA catabolism (19). It is suggested that
the level of crabp-I expression must be tightly controlled
because an abnormally high level of expression may disturb RA
concentration, thereby affecting gene expression in specific cells at a
critical time (16).
Consistent with the observation of weak crabp-I expression in most adult tissues, its expression is
also very weak in most cell lines examined, except in a mouse embryonic
fibroblast cell line 3T6 (11, 20). Significant
induction of this gene has only been observed in embryonal carcinoma
cell lines, such as P19 and F9, treated with RA (20). The
study of the mouse crabp-I genomic structure has revealed
several interesting features within a 3-kb upstream sequence, such as a
GC content of greater than 70%, 9 pairs of inverted repeats, 5 copies
of GC boxes (Sp-1 sites, GGGCGG), and several potential binding sites
for transcription factors (11, 12). Recently, using
pharmacological treatments, we showed that RA induction of this gene
could be enhanced by 5-azacytidine (21) and
DL-erythro-dihydrosphingosine (sphinganine) (22). The
effect of sphinganine was associated with an 870-bp DNA fragment at the
extreme 5′-end of the upstream region containing a putative AP-1 binding
site (TGACTCA). The effect of 5-azacytidine was examined by analyzing
the methylation status of the 3-kb upstream sequence, which revealed
hypermethylation of this region in cells where crabp-I
expression was low. Demethylation was associated with up-regulation of
expression of this gene (21). The biological activity of the 3-kb
upstream sequence was demonstrated in transgenic mouse embryos using an Escherichia coli β-galactosidase (lacZ)
reporter (12). However, the transgene expression pattern
differs slightly from the endogenous crabp-I expression
pattern detected by in situ hybridization (8),
possibly due to the use of a heterologous DNA fragment, the mouse
Hox1.3 (23), in the fusion.
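The sequence features listed above (GC content, GC boxes, the putative AP-1 site) can be located by a simple scan of both strands. The following is a minimal sketch, not the procedure used in the cited studies; the function names and the toy sequence are hypothetical, and only the two motif strings (GGGCGG and TGACTCA) come from the text.

```python
import re

# Motif strings quoted in the text; the labels are illustrative only.
MOTIFS = {"GC box (Sp-1)": "GGGCGG", "AP-1": "TGACTCA"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_content(seq: str) -> float:
    """Fraction of G and C residues in the sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_motif(seq: str, motif: str) -> list[tuple[int, str]]:
    """Return (1-based start, strand) for each occurrence on either strand.

    A zero-width lookahead is used so that overlapping occurrences are found.
    Minus-strand hits are reported at their plus-strand start coordinate.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        for match in re.finditer(f"(?={pattern})", seq):
            hits.append((match.start() + 1, strand))
    return sorted(set(hits))


# Hypothetical toy sequence, NOT the crabp-I upstream region.
TOY = "AAGGGCGGTTTGACTCATTCCGCCCAA"

for label, motif in MOTIFS.items():
    print(label, find_motif(TOY, motif))
print(f"GC content: {gc_content(TOY):.2f}")
```

Applied to the real 3-kb upstream sequence, the same scan would be expected to report the GC boxes and the AP-1 site described above, but that sequence is not reproduced here.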
In this study, to address both the promoter and the upstream regulatory activities of the mouse crabp-I gene, we constructed a series of lacZ reporter fusion genes by inserting a lacZ structural gene fragment, in frame, into the fifth amino acid codon of the mouse crabp-I gene. The biological activity of the full-length fusion gene was tested in both cultured cells and transgenic mouse embryos, and systematic deletion mutants were made to dissect minimal promoter and cell type-specific regulatory regions. Gel retardation assays were conducted to demonstrate specific protein-DNA interactions between the regulatory DNA fragments and nuclear extracts of P19 and 3T6 cells.
Figure 1: Reporter fusion gene constructs for 5′-deletion analysis of the mouse crabp-I upstream region. The BamHI fragment of the E. coli lacZ structural gene (pMC1871, Pharmacia) was filled in with Klenow enzyme and fused, in frame, into the blunt-ended KpnI site of the mouse crabp-I EcoRI genomic fragment containing exon I (12). A fragment containing the SV40 poly(A) site was added to the 3′-end of this fusion. This generated the parental vector, designated CRABP-lacZ, which contained 3.2 kb of crabp-I genomic sequence including 3 kb of the upstream region. The constructs 870, 1960, 2990, 3020, and 3110 were made using PCR-amplified fragments, and the constructs 2100, 2140, 2400, and 2600 were made by HindIII (H), XhoI (X), PstI (P), and SmaI (S) digestion, respectively. Nucleotide positions are numbered from the 5′-end to the 3′-end, according to the published sequence (Fig. 1 in (12)), for consistency. A filled triangle indicates the putative AP-1 site, and an arrow under the CRABP-lacZ construct indicates the putative RARE. Above the constructs, detailed features of the promoter and its immediate 5′-flanking region are shown. Five vertical bars indicate the five Sp-1 sites, and a horizontal arrow indicates the position of transcription initiation. The translation initiation codon (ATG) is indicated at nt 3233. The key identifies the CRABP-I region, the CRABP-I coding region, lacZ, and the SV40 poly(A) site.
Figure 2: Spatially and temporally specific CRABP-lacZ transgene expression in transgenic mouse embryos. Transgenic mouse embryos were dissected at gestational stages E9.5 (A), E10.5 (B), E11.5 (C), and E12.5 (D) and analyzed for lacZ expression in whole-mount embryos as described (12). The stained (lacZ-positive) areas are indicated. m, mesencephalon; r, rhombencephalon; s, spinal cord. The magnification is 30×, 20×, 15×, and 15× for A, B, C, and D, respectively.
Figure 3: Specific reporter activity in 5′-deletion analysis. Promoter activity of each construct (shown in Fig. 1) was determined as described under "Experimental Procedures" and expressed as absorbance (A) per 30 µg of protein. Triplicate cultures were used in each experiment, and three independent experiments were conducted in P19 (solid bars) and 3T6 (open bars) cells to obtain the means (A/30 µg of protein) and S.E. values.
Figure 4: Relative reporter activity in internal deletion analysis. A, various sequences in the region between nt 2100 and 2600 were deleted from the parental construct CRABP-lacZ by restriction enzyme digestion. The constructs 2100/2140, 2100/2400, 2100/2600, 2140/2400, and 2140/2600 were made using HindIII-XhoI, HindIII-PstI, HindIII-SmaI, XhoI-PstI, and XhoI-SmaI digestion (restriction sites shown in Fig. 1), respectively. A filled triangle indicates the putative RARE between nt 2100 and 2140. B, relative reporter activity of each construct is expressed as a percentage of the activity of the parental construct CRABP-lacZ in P19 (solid bars) and 3T6 (open bars) cells; three independent experiments were conducted to obtain the means and S.E. values.
Figure 5: Regulatory activity of region 2100/2600 on the ras promoter. A, the 2100/2600, 2400/2600, and 2140/2400 constructs were made by ligating the HindIII-SmaI, PstI-SmaI, and XhoI-PstI fragments (restriction sites shown in Fig. 1), respectively, to the 5′-end of a lacZ reporter containing the ras promoter in the sense orientation. Antisense constructs for each of these regions (2100/2600, 2400/2600, and 2140/2400) were also made. B, relative reporter activity of each construct is expressed as a percentage of the activity of the parental vector ras-lacZ in P19 (solid bars) and 3T6 (open bars) cells; three independent experiments were conducted to obtain the means and S.E. values.
Figure 6: Regulatory activity of regions 2100/2600 and 2140/2400 on the thymidine kinase promoter. A, the 2100/2600 and 2140/2400 constructs were made by ligating the HindIII-SmaI and XhoI-PstI fragments (restriction sites shown in Fig. 1), respectively, to the 5′-end of a luciferase reporter containing the thymidine kinase promoter in the sense orientation. Antisense constructs for both regions (2100/2600 and 2140/2400) were also made. B, relative reporter activity of each construct is expressed as a percentage of the activity of the parental vector thymidine kinase-luciferase in P19 (solid bars) and 3T6 (open bars) cells; three independent experiments were conducted to obtain the means and S.E. values.
Figure 7: Gel retardation. The HindIII-PstI fragment (restriction sites shown in Fig. 1) was labeled with ³²P using Klenow enzyme and tested for binding to specific proteins isolated from P19 and 3T6 nuclei as described in the text. The sample order is as follows: 1) no nuclear extract (probe alone), 2) 10 µg of P19 extract, 3) 10 µg of P19 extract + 10× unlabeled fragment (cold competitor), 4) 10 µg of P19 extract + 100× unlabeled fragment, 5) 10 µg of P19 extract + 500× unlabeled fragment, 6) 7 µg of 3T6 extract, 7) 7 µg of 3T6 extract + 10× unlabeled fragment, 8) 7 µg of 3T6 extract + 100× unlabeled fragment, and 9) 7 µg of 3T6 extract + 500× unlabeled fragment. The arrowhead indicates the position of the specifically retarded band.
We have demonstrated that the mouse crabp-I gene promoter, including approximately 3 kb of its upstream region, is able to direct expression of a lacZ reporter in transgenic mouse embryos. The expression pattern agrees with results from several in situ hybridization studies (7, 8, 9, 10), indicating that this region contains spatial and temporal information for crabp-I gene expression. By systematic deletion analysis in both high-expressing (3T6) and low-expressing (P19) cells, the minimal promoter was located between nt 2990 and 3020, approximately 120-150 bp upstream from the transcription initiation site. In transient reporter assays, a 390-bp fragment (nt 2600-2990) immediately upstream from the transcription initiation site (nt 3140) encodes the maximal promoter activity in the high-expressing cell line 3T6, whereas the complete 3-kb upstream sequence is needed to obtain the maximal promoter activity in the low-expressing cell line P19. Additional sequence further upstream of the 3-kb sequence appears to have no effect on this promoter in either cell line (data not shown). The 870-bp fragment at the 5′-end of this 3-kb fragment is important for maximal activity in P19 cells. This agrees with our previous study (21), which showed that this 870-bp fragment is required for optimal crabp-I expression in P19 cells treated with RA and sphinganine, a compound known to increase AP-1 activity (30).
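Because positions in this paper are numbered from the 5′-end of the published sequence rather than from the initiation site, the distances quoted above follow from simple subtraction. The sketch below works through that arithmetic; the helper names are hypothetical, and the only inputs are coordinates stated in the text and figure legends (initiation site at nt 3140, ATG at nt 3233).

```python
# Transcription initiation site, in the 5'-to-3' numbering used in the text.
TSS_NT = 3140


def bp_upstream(nt: int, tss: int = TSS_NT) -> int:
    """Distance in bp upstream of the initiation site (negative if downstream)."""
    return tss - nt


def span_length(start_nt: int, end_nt: int) -> int:
    """Length in bp of a fragment named by its two boundary coordinates."""
    return end_nt - start_nt


# Minimal promoter boundaries (nt 2990 and nt 3020).
print(bp_upstream(3020), bp_upstream(2990))  # 120 150
# Fragment needed for maximal activity in 3T6 cells (nt 2600-2990).
print(span_length(2600, 2990))               # 390
# Deletion end point nt 2100, described as roughly 1 kb upstream.
print(bp_upstream(2100))                     # 1040
# Translation initiation codon at nt 3233 lies downstream of the start site.
print(bp_upstream(3233))                     # -93
```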
Common to both P19 and 3T6 cells, a strong negative effect is observed for deletion to nt 2100, approximately 1 kb upstream from the transcription initiation site. A further deletion of 40 bp (deletion to nt 2140) abolishes this negative effect in both cell types. Taken alone, these 5′-deletion results would suggest that the 40-bp sequence (2100-2140) contains negative regulatory information for crabp-I gene expression. However, studies of internal deletions (Fig. 4) show that the sequence 2100-2140 is critical for a high level of reporter expression in both P19 and 3T6 cells, as deletion of these 40 bp decreases the full promoter activity by more than 94% in both cell types. This argues against a purely negative activity of the 40-bp sequence, because removing a negative element should either have little effect or relieve the repression, not reduce expression. It is possible that the 2100/2140 sequence is a portion of a complex regulatory unit, which could operate very differently depending upon the sequence in its vicinity and the combination of available regulatory proteins.
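Relative reporter activities such as the "greater than 94% decrease" above are percentages of the parental construct, summarized as a mean and S.E. over three independent experiments. A minimal sketch of that calculation follows, assuming each experiment yields one reading for the deletion construct and one for the parental construct; the function names and all numbers are hypothetical and are not data from this study.

```python
from math import sqrt
from statistics import mean, stdev


def relative_activity(construct: list[float], parental: list[float]) -> list[float]:
    """Per-experiment activity of a construct as a percentage of the parental construct."""
    if len(construct) != len(parental):
        raise ValueError("need one parental reading per construct reading")
    return [100.0 * c / p for c, p in zip(construct, parental)]


def mean_and_se(values: list[float]) -> tuple[float, float]:
    """Mean and standard error (sample standard deviation / sqrt(n))."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed for a standard error")
    return mean(values), stdev(values) / sqrt(n)


# Hypothetical readings from three independent experiments (illustration only).
parental_readings = [2.0, 4.0, 5.0]
deletion_readings = [0.1, 0.2, 0.3]

percentages = relative_activity(deletion_readings, parental_readings)
avg, se = mean_and_se(percentages)
print(f"{avg:.1f}% of parental activity (S.E. {se:.1f}), "
      f"a {100 - avg:.1f}% decrease")
```

Normalizing within each experiment before averaging, as done here, is one reasonable design choice; the text does not state whether the authors normalized before or after averaging.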
The regulatory mechanism under physiological conditions could be much more complicated considering the complexity of DNA structure, modification of DNA, and multiple protein interactions in the cells. Based upon studies of internal deletions (Fig. 4) and heterologous promoters (Fig. 5 and Fig. 6), it is suggested that fragment 2100/2600 may be a complex regulatory unit that can be affected by many factors. It is clear that the 40-bp sequence is critical for full crabp-I gene promoter activity in the context of the natural crabp-I gene regulatory region (as demonstrated by internal deletion analysis in Fig. 4), and yet, in conjunction with its 3′-flanking sequence, this region becomes a negative regulatory element (Fig. 4, Fig. 5, and Fig. 6). It is interesting that the 40-bp sequence 2100/2140 contains a putative DR5-type RARE. However, by itself, this sequence has little effect on heterologous promoters (data not shown). It would be interesting to determine the protein factors bound to these sequences and how they interact with each other.
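A DR5-type element is a direct repeat of two receptor half-sites separated by five nucleotides. The sketch below shows how such an element could be searched for, assuming the commonly cited half-site consensus (A/G)G(G/T)TCA; that consensus is an assumption brought in for illustration and is not given in this paper, and the function name and toy sequence are hypothetical.

```python
import re

# Assumed half-site consensus (A/G)G(G/T)TCA; not taken from the text.
HALF_SITE = "[AG]G[GT]TCA"
# Reverse complement of the half-site, for repeats lying on the minus strand.
HALF_SITE_RC = "TGA[AC]C[CT]"
SPACER = "[ACGT]{5}"  # DR5: five nucleotides between the two half-sites

_PATTERNS = {
    "+": re.compile(f"(?={HALF_SITE}{SPACER}{HALF_SITE})"),
    "-": re.compile(f"(?={HALF_SITE_RC}{SPACER}{HALF_SITE_RC})"),
}


def find_dr5(seq: str) -> list[tuple[int, str]]:
    """Return (1-based start, strand) of each DR5-type direct repeat.

    Lookahead patterns are used so that overlapping candidates are reported.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in _PATTERNS.items():
        hits.extend((m.start() + 1, strand) for m in pattern.finditer(seq))
    return sorted(hits)


# Hypothetical toy sequence: AGGTCA + 5-nt spacer + GGTTCA, flanked by TT.
TOY = "TTAGGTCACCGAAGGTTCATT"
print(find_dr5(TOY))  # [(3, '+')]
```

A match of this kind only flags a candidate site; as noted above, the 40-bp sequence by itself had little effect on heterologous promoters.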
Based upon studies using heterologous promoters, the sequences
derived from 2140/2400 and 2100/2600 are able to function as strong
negative regulatory elements in both cell types. However, 5′-deletion
analysis reveals positive activity of 2140/2400 when it is fused to its
natural 3′-flanking sequence (the construct 2140) in P19 cells.
This also suggests that the sequence 2140/2400 constitutes a portion of
a complex regulatory unit. Interaction between this sequence and its
neighboring sequences determines the final regulatory activity of the
whole unit. Likewise, sequence 2400/2600 has little effect on
heterologous promoters, yet, when situated in its natural position, it
is able to increase and inhibit crabp-I promoter activity in
P19 and 3T6 cells, respectively (Fig. 3). Consistent with these
results, the gel retardation experiment (Fig. 7) shows that fragment
2100-2400 can be bound by specific nuclear factors that are
present in both P19 and 3T6 cells, and the protein-bound fragments
migrate at the same position in both cases. In contrast, very different
and more complex patterns of band shift have been observed for
the 2100/2600 fragment (data not shown).
P19 cells, in undifferentiated states, express endogenous crabp-I at a very low level. The expression can be specifically induced by RA, and this induction is blocked by cycloheximide, a protein synthesis inhibitor (19). In contrast, 3T6 cells express endogenous crabp-I constitutively at a much higher level, yet RA has little effect on its expression (11, 20). Based upon data collected from this and other studies, a model is proposed for the regulatory elements controlling crabp-I gene expression as shown in Fig. 8. It is hypothesized that both positive and negative regulatory mechanisms are needed for the control of crabp-I gene expression. For most cell types, the crabp-I gene utilizes the minimal promoter located between nt 2990 and nt 3020, which is constantly demethylated (21) and active. The upstream region of this promoter contains numerous regulatory DNA elements for both positive and negative transcription factors and their associated proteins. The region between nt 2600 and nt 2990 contains sequence for some positive factors that are probably more abundant in highly expressing cells such as 3T6. The 870-bp fragment at the extreme 5′-end contains a sequence for positive factors that can be induced by certain drugs (such as sphinganine) in some cells (like P19). In contrast, the region 2100/2600 contains a sequence for negative transcription factors (such as repressors). For most cell types, either the lack of these positive factors for sequences 1/870 and 2600/2990 or the presence of negative factors for sequence 2100/2600 prevents an optimal level of expression. In the presence of RA, this promoter activity is enhanced in certain cell types such as embryonal carcinoma cells because of diminishing levels of negative factors, induction of positive factors, or a combination of both.
A total of nine inverted repeats are present within the 3-kb region; these have the potential to form complex structures, thereby bringing the regulatory elements into close proximity. This model is being tested by asking specifically whether the dissected elements can be associated with known regulatory proteins in terms of physical interaction and biological activity. With this information, it would be possible to begin to address how crabp-I gene expression is regulated in specific cell types and during developmental stages.
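The regions named in this model can be written down as a small lookup table keyed by nucleotide coordinate. The sketch below is only a restatement of the proposed model under stated assumptions: the class, field, and function names are hypothetical, the role descriptions paraphrase the text, and the numerical activity scale of Fig. 8 is deliberately not reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    start: int  # first nt, in the 5'-to-3' numbering used in the text
    end: int    # last nt (inclusive)
    role: str   # qualitative role proposed in the model


# Regions and roles as described in the discussion; wording is paraphrased.
MODEL = [
    Region(1, 870, "positive; factors inducible by drugs such as sphinganine (e.g. in P19)"),
    Region(2100, 2600, "negative; binding sites for negative factors such as repressors"),
    Region(2600, 2990, "positive; factors likely abundant in high-expressing cells such as 3T6"),
    Region(2990, 3020, "minimal promoter"),
]


def regions_at(nt: int) -> list[Region]:
    """Return every modelled region containing the given nucleotide.

    Boundaries are inclusive, so a shared boundary (nt 2600 or nt 2990)
    is reported as belonging to both neighboring regions.
    """
    return [region for region in MODEL if region.start <= nt <= region.end]


for position in (500, 1500, 2120, 3000):
    roles = [region.role for region in regions_at(position)] or ["no element assigned in the model"]
    print(position, roles)
```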
Figure 8: A model for mouse crabp-I gene regulatory elements. Nucleotide numbering runs from the 5′-end to the 3′-end according to previous sequence data (12) for consistency. Putative regulatory elements in the 3-kb upstream region, such as AP-1, RARE, and Sp-1, are indicated above the sequence. Question marks represent unknown factors. The transcription initiation site is indicated with a horizontal arrow under the sequence. Relative regulatory activity of each region in P19 and 3T6, as shown above the sequence, is arbitrarily scaled from -4 to +4 according to relative activity detected in transient transfection studies (Fig. 3, Fig. 4, Fig. 5, and Fig. 6). The negative signs for region 2140/2400 shown in parentheses indicate negative activity of this region when fused to the heterologous promoters.