(Received for publication, March 26, 1996, and in revised form, October 30, 1996)
From the Department of Medicine and
Biochemistry and Biophysics, The University of North Carolina,
Chapel Hill, North Carolina 27599-7038 and the ¶ Department of
Biology and Molecular Biology Institute, San Diego State
University, San Diego, California 92182
Since several lines of evidence implicate the 3′-flanking region in regulating α1(I) collagen gene transcription, we analyzed 12.4 kilobase pairs of 3′-flanking sequence of the murine α1(I) collagen gene for transcriptional elements. A region of the 3′-flanking region stimulated expression of the heterologous β-globin gene promoter in an enhancer trap plasmid and of the α1(I) collagen gene promoter in a collagen-luciferase reporter gene construct when located 3′ to the luciferase reporter gene. DNase I footprinting analysis demonstrated the presence of three regions where DNA binding proteins specifically interact within this 3′-stimulatory region. Inspection of the DNA sequence revealed a consensus E-box, a binding site for basic helix-loop-helix proteins, in one of the protein binding sites. Mobility shift assays demonstrated that the upstream stimulatory factors (USF) USF-1 and USF-2 bind to this E-box. Mutating the E-box in the context of the 3′-flanking region confirmed that it contributes to the enhancement of transcriptional activity of the α1(I) collagen gene promoter. Mutations in all three protein binding sites abolished transcriptional activation by the 3′-flanking region, suggesting a complex interaction among the trans-acting factors in enhancing transcriptional activity. Thus, a region of the 3′-flanking region of the α1(I) collagen gene stimulates transcription of the α1(I) collagen gene promoter, and USF-1 and USF-2 contribute to this transcriptional stimulation.
Type I collagen, the most abundant protein in vertebrates, has
diverse biological functions. It promotes cell migration,
differentiation, and tissue morphogenesis during development.
Additionally, it provides tensile strength to connective tissues such
as bone, tendons, and skin, and forms a supporting framework of
connective tissues in all major internal organs and the vascular
system. Type I collagen is also the major protein produced during
repair of tissue injuries and wound healing. Excess deposition of type I collagen occurs in fibrogenic diseases, such as hepatic fibrosis (1),
pulmonary fibrosis (2), primary systemic sclerosis (3), and
eosinophilic myalgia syndrome (4). Type I collagen is the product of two genes, the α1(I) and the α2(I) collagen genes, whose products form a heterotrimeric protein composed of two α1(I) and one α2(I) polypeptide chains. Although located on different chromosomes, both
genes are generally coordinately regulated in a developmental and
tissue-specific manner. Expression of the type I collagen genes is
active in many cell types and under various physiological conditions,
and their regulation is accordingly complex (5-8). Type I collagen
gene expression is also modulated by agents such as cytokines and
chemical or viral transformation.
Transcriptional regulatory elements have been previously identified in the 5′-flanking region, the promoter region, and the first introns of both type I collagen genes in several species (9-19). While sequences in the minimal promoter, within 220 bp upstream of the start site of transcription, appear to be sufficient for the basal activity and partial tissue specificity of the α1(I) collagen gene promoter in transient transfection assays, the precise function of distal 5′-flanking sequences and the first intron is less clear. Moreover, several lines of evidence suggest that, in addition to these regulatory elements, sequences in the 3′-flanking region may contribute to transcriptional regulation of the α1(I) collagen gene. First, the 5′-regulatory elements were not always sufficient for precisely regulated, tissue-specific, high level expression of the gene when tested in transient transfection experiments or in transgenic mice (10, 20-23). Second, in the transgenic HucII mouse strain, a single copy of the human α1(I) collagen gene, which included 1.6 kb of 5′-flanking region, the entire structural gene, and 20 kb of 3′-flanking sequence, was expressed as efficiently as the endogenous collagen gene in an appropriate tissue-specific manner (24) and was induced appropriately during hepatic fibrogenesis (7). Finally, the human α1(I) collagen gene contains several DNase I-hypersensitive sites located immediately 3′ of the structural gene (25), which are often indicative of regulatory elements.
Therefore, we initiated a systematic analysis for regulatory elements located within the 3′-flanking region of the murine α1(I) collagen gene. We located a segment of the 3′-flanking region that enhanced expression of the heterologous β-globin gene promoter as well as the endogenous α1(I) collagen gene minimal promoter in NIH 3T3 fibroblast cells. DNase I footprinting analysis demonstrated three sites of DNA-protein interaction within this transcriptional stimulatory region. One of the binding sites contained a consensus E-box, to which both USF-1 and USF-2, two basic helix-loop-helix (bHLH) proteins, were found to bind. When the E-box was mutated within the context of surrounding wild-type 3′-flanking sequence, reduced levels of reporter gene activity were observed. However, when all three cis-acting elements were mutated, a complete loss of transcriptional stimulatory activity was obtained. These results demonstrate that transcription of the α1(I) collagen gene is stimulated by a region within the 3′-flanking region of the gene and that USF-1 and USF-2 participate in the stimulation of transcriptional activity of the α1(I) collagen gene promoter.
Fragments of the genomic clones pCE4 and pCE5 (kindly provided by K. Harbers), which contain the 3′-flanking sequences of the murine α1(I) collagen gene (26) (Fig. 1), were cloned into the unique SphI site of the enhancer trap plasmid pβe− (27). The entire α1(I) collagen sequence present in pCE5 was sequenced. This fragment contains the last 12 codons of the α1(I) collagen gene. As a reference, the first base following the translational stop codon is designated as +1. The complete genomic insert in pCE5 (4.6 kb) was cloned into pβe−, creating plasmid pβCE5, and a series of deletions shown in Fig. 1B were created by digestion with Bal31 exonuclease utilizing unique PstI and HindIII sites in the plasmid. The plasmid pβCOL-(+1899-4597) was created by digesting pβCE5 with PstI, gel-purifying the large DNA fragment, and religating using T4 DNA ligase. A partial HincII digestion of the pCE4 genomic insert produced two fragments (5.4 and 2.4 kb) which were individually cloned into the SphI site of pβe−. The reporter gene pGLCOL3 was constructed by ligating the α1(I) collagen promoter (−220 to +116) into the BglII and HindIII sites of the luciferase reporter gene pGL2-Basic (Promega, Madison, WI). Reporter genes pCOL-(+1899-4597) and pCOL-(+4597 to 1899) were constructed by ligating the 2698-bp 3′ α1(I) collagen fragment from plasmid pβCOL-(+1899-4597) (Fig. 1) into the blunted SalI site of pGLCOL3, which is located at the 3′ end of the luciferase gene, in both orientations. Plasmid pCOL-(+1899-4597) contains the 3′-flanking region in the 5′ to 3′ orientation with respect to the direction of transcription from the collagen promoter, while in plasmid pCOL-(+4597 to 1899) the 3′-flanking sequence is positioned 3′ to 5′ with respect to the direction of transcription from the collagen promoter. Plasmids pCOL-(+3590-4597) and pCOL-(+4597 to 3590) were created by digesting pβCOL-(+1899-4597) with HindIII and PstI, blunting the ends of the isolated 1-kb fragment, and ligating the fragment into pGLCOL3 which was digested with BamHI and the ends blunted. This places the α1(I) collagen 3′-flanking sequence 3′ with respect to the luciferase gene. Plasmids pCOL-(+4090-4597) and pCOL-(+4597 to 4090) were constructed by generating a PCR product using primers that hybridized to positions +4090 to +4113 (5′-ATC CGG ATC CGT AAC CTA AAG ATG GTG GGT TTT C-3′) and positions +4597 to +4577 (5′-ATC CGG ATC CGA ATT CCC ACT AGT GCG GGG G-3′) of the α1(I) collagen 3′-flanking sequence. The PCR reaction contained 10 mM Tris-HCl, pH 8.9, 50 mM KCl, 2.5 mM MgCl₂, 2 µM dNTPs, 1 µM each primer, 100 ng of pCE5 as the template, and 2.5 units Taq DNA polymerase (Boehringer Mannheim). The reaction was cycled as follows: 94 °C for 1 min, 60 °C for 1 min, 72 °C for 1 min for 30 cycles, an extension incubation at 72 °C for 10 min, followed by incubation at 4 °C, using a GeneAmp PCR System 9600 (Perkin-Elmer). The PCR fragment was digested with BamHI, using the BamHI recognition site designed in the primers, and cloned into the BamHI site of pGLCOL3 in both orientations. To construct plasmids containing the 3′-footprinted regions, oligonucleotides were synthesized and complementary strands annealed. The annealed oligonucleotides were ligated together and the ligation reaction subsequently electrophoresed in polyacrylamide gels. Bands representing three and five copies of the ligated oligonucleotides were excised and eluted from the gel. The oligonucleotides were then ligated into the BamHI site of pGLCOL3. The oligonucleotides used for the three footprinted regions were as follows. 3′FP1 top strand: 5′-GAT CCG CGG CTG TCA CGT GGC ATG GGC TGG TAT GTG CTC TAA ATA-3′, bottom strand: 5′-GAT CTA TTT AGA GCA CAT ACC AGC CCA TGC CAC GTG ACA GCC GCG-3′; 3′FP2 top strand: 5′-GAT CCT CCT TTC CGC TGA CAT CAT TGC TGC CA-3′, bottom strand: 5′-GAT CTG GCA GCA ATG ATG TCA GCG GAA AGG AG-3′; and 3′FP3 top strand: 5′-GAT CCC TTT GGG GAG GGA CCT GGA GCA A-3′, bottom strand: 5′-GAT CTT GCT CCA GGT CCC TCC CCA AAG G-3′. In creating the construct p3′FP1M, where the E-box was mutated while in the context of surrounding wild-type 3′-flanking sequence in plasmid pCOL-(+3590-4597), the overlap extension method described by Ho
et al. (28) was used. Briefly, two oligonucleotides were synthesized which contained the 6-bp mutation (underlined) (primer 1:
5
-CAC CCC GCA GCG GCT GT
G
CAT GGG CTG GTA TGT GCT-3
; primer 2: 5
-AGC ACA TAC CAG CCC ATG
C
AC AGC CGC TGC GGG GTG-3
). Two additional primers were synthesized which flanked the 5′ end (primer 3: 5′-ATC CGG ATC CTG CGC CTG AAA ATC TAT ACA TAT AC-3′) and 3′ end (primer 4: 5′-ATC CGG ATC CGA ATT CCC ACT AGT GCG GGG G-3′) of the α1(I) collagen 3′-flanking sequence of the pCOL-(+3590-4597) insert. Two PCR reactions were performed using the conditions described above, one using primers 2 and 3 and the second using primers 1 and 4. An aliquot of each PCR reaction was combined in a second-round PCR reaction along with primers 3 and 4 using the same PCR conditions. The product of this second PCR reaction was then digested with BamHI and subsequently cloned into the BamHI site of pGLCOL3.
To mutate all three of the footprinted regions in the 3′-flanking region while in the context of plasmid pCOL-(+3590-4597), a mutagenesis approach similar to that described above was utilized. The plasmid p3′FP1M was used as a template in PCR reactions using primers designed to mutate 3′FP2 (primer 1: 5′
-TGA ACC CAA GCC CTC CTT TC
GCT GCC
TTA AAT ACA GAT GCC-3
; primer 2: 5
-GGC ATC TGT ATT TAA GGC AGC
GA AAG GAG GGC TTG GGT TCA-3
; the mutated nucleotides, +3797-3809, are underlined) along with the 5′-flanking primer (primer 3 above) and 3′-flanking primer (primer 4 above). After the second PCR reaction, the fragment carrying the 3′FP1 and 3′FP2 mutations was used as a template in PCR reactions to generate mutations in 3′FP3 as described above using 3′FP3M primers (primer 1: 5′
-TTG GAA TCC AAG TCC CTT TGG
GG AGC ATG
GTC ACT CCT GG-3
; primer 2: 5
-CCA GGA GTG ACC ATG CTC C
C CAA AGG GAC TTG GAT TCC
AA-3
; the mutated nucleotides, +4050-4059, are underlined) along with the 5′-flanking primer (primer 3 above) and 3′-flanking primer (primer 4 above). After the second PCR reaction the PCR product was digested with BamHI, gel-purified, and cloned into BamHI-digested pGLCOL3, creating p3′FP1-2-3M.
DNA sequencing was performed by the dideoxy method using the Sequenase version 2.0 kit (Boehringer Mannheim) according to the manufacturer's recommended protocol to confirm the presence of the mutations in the respective footprinted regions.
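As an illustration of the oligonucleotide design described above, the following minimal Python sketch checks that each top/bottom strand pair listed for the three footprinted regions is complementary outside its 5′-GATC overhang, which is what allows the annealed duplex to ligate into a BamHI-cut vector. The sequences are copied from the text; the function names (`reverse_complement`, `duplex_ok`) and the dictionary layout are illustrative assumptions and not part of the published protocol.

```python
# Sketch: verify that annealed oligonucleotide pairs are complementary
# outside their 5'-GATC single-stranded overhangs.
# Helper names are hypothetical; sequences are taken from the Methods text.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq: str) -> str:
    """Remove the cosmetic triplet spacing and upper-case the sequence."""
    return seq.replace(" ", "").upper()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return clean(seq).translate(COMPLEMENT)[::-1]


def duplex_ok(top: str, bottom: str, overhang: str = "GATC") -> bool:
    """True if both strands start with the overhang and the remaining
    bases of the top strand pair exactly with those of the bottom strand."""
    top, bottom = clean(top), clean(bottom)
    if not (top.startswith(overhang) and bottom.startswith(overhang)):
        return False
    return top[len(overhang):] == reverse_complement(bottom[len(overhang):])


# Top and bottom strands, written 5' to 3' as in the text.
OLIGOS = {
    "3'FP1": (
        "GAT CCG CGG CTG TCA CGT GGC ATG GGC TGG TAT GTG CTC TAA ATA",
        "GAT CTA TTT AGA GCA CAT ACC AGC CCA TGC CAC GTG ACA GCC GCG",
    ),
    "3'FP2": (
        "GAT CCT CCT TTC CGC TGA CAT CAT TGC TGC CA",
        "GAT CTG GCA GCA ATG ATG TCA GCG GAA AGG AG",
    ),
    "3'FP3": (
        "GAT CCC TTT GGG GAG GGA CCT GGA GCA A",
        "GAT CTT GCT CCA GGT CCC TCC CCA AAG G",
    ),
}

RESULTS = {name: duplex_ok(top, bottom) for name, (top, bottom) in OLIGOS.items()}

if __name__ == "__main__":
    for name, ok in RESULTS.items():
        print(name, "complementary outside GATC overhang:", ok)
```

A check of this kind is only a transcription sanity test; it says nothing about annealing efficiency or concatemer formation during ligation.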
Transfections and Reporter Gene Assays
NIH 3T3 fibroblast cells and HeLa cells were cultured in 150-mm plates with Dulbecco's minimum essential medium (Life Technologies, Inc.) supplemented with 10% calf serum and grown in a 5% CO₂, 95% air atmosphere at 37 °C. Transfections using the β-globin reporter gene were performed using the calcium phosphate precipitation method as described (13, 19). In addition to 20 µg of the pβe−-based reporter gene constructs, each transfection contained 20 µg of the pα reporter gene (27), which served as an internal control for transfection efficiencies. Cells were treated with 75 µM chloroquine during the transfection, and after 4 h the transfection mixture was removed, and the cells were shocked for 1 min using 10% glycerol. The cells were harvested 36-48 h after transfection, and total RNA was prepared by the acid-phenol method (29). Radiolabeled antisense β-globin and α-globin RNA probes were generated by SP6 and T7 RNA polymerase, respectively. T7 transcription of pα creates a 244-nucleotide transcript of which 131 nucleotides are protected (30). SP6 transcription of pβe− creates a 500-nucleotide transcript of which 350 nucleotides are protected (27). Total RNA samples were analyzed by RNase protection assay as described (31), and α- and β-globin transcripts were quantitated by scanning with an image analyzer or direct counting of the bands. Transient transfections using the luciferase reporter gene plasmids were performed using LipofectAMINE (Life Technologies, Inc.). NIH 3T3 cells were seeded into 6-well dishes at a density of 9 × 10⁴ cells per well. The day after seeding, 0.5 µg of luciferase reporter plasmid, 0.5 µg of pRSV-βgal, and 1.1 µg of carrier DNA (pUC19) were added to the cells using 11 µg of LipofectAMINE reagent per well following the recommended protocol of the manufacturer. Liposomes were incubated with the cells for 8 h. The RSV-βgal reporter gene plasmid was co-transfected to normalize for transfection efficiencies. Luciferase and β-galactosidase reporter gene assays were performed 36-48 h after transfection as described previously (13, 19).
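The normalization step described above (luciferase activity corrected for transfection efficiency with the co-transfected β-galactosidase reporter, then expressed relative to the parental plasmid) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the readings are invented placeholder values rather than data from this study, and the exact calculation used in the cited protocols (13, 19) is not specified here.

```python
# Sketch: normalize luciferase readings to co-transfected beta-galactosidase
# and express a construct as fold stimulation over the parental plasmid.
# All numbers below are hypothetical placeholders, not data from the paper.
from statistics import mean


def normalized_activity(luciferase: float, beta_gal: float) -> float:
    """Luciferase activity per unit of beta-galactosidase activity."""
    if beta_gal <= 0:
        raise ValueError("beta-galactosidase activity must be positive")
    return luciferase / beta_gal


def fold_stimulation(test_wells, control_wells) -> float:
    """Mean normalized activity of the test construct divided by the
    mean normalized activity of the control (parental) construct."""
    test = mean(normalized_activity(luc, gal) for luc, gal in test_wells)
    control = mean(normalized_activity(luc, gal) for luc, gal in control_wells)
    return test / control


# (luciferase light units, beta-gal units) per well; hypothetical values.
PARENTAL_WELLS = [(1000.0, 2.0), (1500.0, 3.0)]
CONSTRUCT_WELLS = [(3000.0, 2.0), (6000.0, 4.0)]

FOLD = fold_stimulation(CONSTRUCT_WELLS, PARENTAL_WELLS)

if __name__ == "__main__":
    print(f"fold stimulation over parental plasmid: {FOLD:.2f}")
```

Dividing by the β-galactosidase signal before averaging keeps a well with unusually high or low DNA uptake from dominating the comparison.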
DNase I footprinting analysis and mobility shift assays were performed as described previously (13). Nuclear proteins were obtained using the method described by Schreiber et al. (32). Supershift assays were performed, as described previously (13), using USF-1 and USF-2 polyclonal antibodies (Santa Cruz Biotechnology, Inc., Santa Cruz, CA). The oligonucleotides used in the mobility shift assays representing 3′FP1 were top strand: 5′-GAT CCG CTG TCA CGT GGC ATG GGC TGA-3′; bottom strand: 5′-GAT CTC AGC CCA TGC CAC GTG ACA GCG-3′. The oligonucleotides that contained a mutation within the consensus E-box, 3′FP1M, were top strand: 5′
-GAT CCG CTG T
GC ATG GGC TGA-3
; bottom strand: 5
-GAT CTC AGC CCA TGC
ACA GCG-3
(the mutated nucleotides are
underlined).
Oligonucleotides were synthesized using a Cyclone Plus Oligonucleotide Synthesizer (Milligen, Novato, CA) and were gel-purified after electrophoresis in 10% polyacrylamide gels.
DNA Sequence
The DNA sequence described in this paper has been deposited in the GenBank data base.
To investigate the presence of transcriptional regulatory elements located in the 3′-flanking region of the α1(I) collagen gene, we cloned several fragments representing 12.4 kb of this region into the enhancer trap plasmid, pβe− (27). This plasmid contains the human β-globin promoter and structural gene with a unique SphI cloning site located 2.2 kb downstream (or 3.3 kb upstream) of the β-globin transcriptional start site. The recombinant plasmids were transiently transfected into NIH 3T3 cells along with a plasmid containing the human α-globin gene, to serve as an internal control for transfection efficiency. The relative levels of α- and β-globin mRNA were assessed by RNase protection assays using total RNA from transfected cells. Insertion of the entire 4.6-kb genomic fragment of pCE5 into the enhancer trap plasmid stimulated the expression of the β-globin gene 1.7-fold compared with expression of pβe− alone, whereas α1(I) collagen gene fragments derived from the pCE4 plasmid did not exhibit any stimulatory effects on β-globin expression (Fig. 1A and summarized in Fig. 1B). Therefore, we concentrated our efforts on α1(I) collagen 3′-flanking sequences in pCE5 to further localize the position of the enhancing activity using deletional analysis. Fig. 1B summarizes the results of two to six transfection experiments carried out with independently purified plasmid preparations. The smallest construct containing enhancing activity was pβCOL-(+2643-4597), which reproducibly stimulated β-globin expression 2.8-fold as compared with pβe−. This level of stimulation by the α1(I) collagen 3′ fragment is comparable with the stimulation of pβe− by the strong SV40 early gene enhancer in NIH 3T3 cells (27). To eliminate the possibility that expression of the α-globin gene interferes with or affects β-globin expression, the β-globin constructs were also transfected alone in a separate experiment with identical results (data not shown).
To determine whether this enhancing activity is cell type-specific, transient transfections were performed using the same reporter gene constructs in HeLa cells, which express no or low levels of type I collagen (33). None of the α1(I) collagen 3′-flanking genomic sequences stimulated β-globin gene expression in HeLa cells (Fig. 1C). In fact, the α1(I) collagen-derived fragments in plasmids pβCOL-(+1899-4597) and pβCOL-(+2643-4597) had a slight inhibitory effect on β-globin gene expression relative to the control pβe− plasmid in these cells.
Since a segment of the α1(I) collagen gene 3′-flanking region stimulated expression from the heterologous β-globin promoter, we wanted to determine if this region could also stimulate expression from the homologous α1(I) collagen gene promoter. Therefore, we cloned the α1(I) collagen genomic insert from the plasmid pβCOL-(+1899-4597), in both orientations, behind the luciferase reporter gene in pGLCOL3, in which the luciferase gene is driven by the α1(I) collagen minimal promoter (−220 to +116), creating the plasmids pCOL-(+1899-4597) and pCOL-(+4597 to 1899) (Fig. 2A). When these plasmids were transiently transfected into NIH 3T3 cells, and transfection efficiencies were normalized by co-transfection with pRSV-βgal, a significant increase in transcriptional activity of the α1(I) collagen promoter was observed when the 3′-flanking sequence was in the 5′ to 3′ orientation. However, when positioned in the opposite orientation (3′ to 5′), an inhibitory effect was observed (Fig. 2B), indicating the presence of an inhibitory element. To further localize the stimulatory element within the α1(I) collagen 3′-flanking region, the shorter collagen sequence in pβCOL-(+3590-4597) was inserted behind the luciferase reporter gene in the pGLCOL3 plasmid, in both orientations, creating the plasmids pCOL-(+3590-4597) and pCOL-(+4597 to 3590) (Fig. 2A). Transient transfections with these plasmid constructs into NIH 3T3 cells also demonstrated a stimulatory effect of this region on the α1(I) collagen gene promoter when positioned in either orientation (Fig. 2B). However, further deleting 3′-flanking sequence to nucleotides +4090 to +4597 (plasmids pCOL-(+4090-4597) and pCOL-(+4597 to 4090)) eliminated the stimulatory effect. Together, these data indicate that a transcriptional stimulatory element located in the α1(I) collagen gene 3′-flanking region is positioned between nucleotides +3590 and +4089.
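The deletion-mapping logic in this paragraph can be made explicit with a short Python sketch: positions shared by every stimulatory fragment, minus positions carried by any non-stimulatory fragment, delimit the candidate region. The fragment coordinates come from the constructs named in the text; the function name `map_element` and the simplifying assumption that a single, contiguous, necessary element accounts for the activity are ours and are not claims made by the study.

```python
# Sketch: deduce the candidate stimulatory interval from deletion constructs.
# Assumes one contiguous element that is required for stimulation, which is
# a simplification made for illustration only.


def map_element(active, inactive):
    """Return the inclusive (start, end) of positions present in every
    active fragment and absent from every inactive fragment, or None."""
    candidates = set.intersection(
        *(set(range(start, end + 1)) for start, end in active)
    )
    for start, end in inactive:
        candidates -= set(range(start, end + 1))
    if not candidates:
        return None
    return min(candidates), max(candidates)


# Coordinates relative to +1, the first base after the stop codon.
ACTIVE_FRAGMENTS = [(1899, 4597), (2643, 4597), (3590, 4597)]
INACTIVE_FRAGMENTS = [(4090, 4597)]

REGION = map_element(ACTIVE_FRAGMENTS, INACTIVE_FRAGMENTS)

if __name__ == "__main__":
    print("candidate stimulatory region:", REGION)
```

With these inputs the remaining interval is +3590 to +4089, the roughly 500-bp segment examined by footprinting.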
DNA Binding Proteins Specifically Interact at Three Locations within the
To determine the locations of DNA binding proteins in this stimulatory element in the 3′-flanking region of the α1(I) collagen gene, we performed DNase I footprinting analysis using NIH 3T3 nuclear extracts. Three sites of specific DNA-protein interactions were located and called 3′FP1, 3′FP2, and 3′FP3 (Fig. 3A, left, middle, and right, respectively; and summarized in Fig. 3B).
To determine the effect that each of the footprinted regions exerts on expression of the α1(I) collagen promoter region, we cloned one copy, in both orientations, as well as three and five copies of the oligonucleotides representing each footprint into the BamHI site of pGLCOL3. We investigated single and multiple copies of the binding sites in order to optimize their effects on transcriptional activation. The plasmid constructs were transiently transfected into NIH 3T3 cells and transfections normalized to the activity of the co-transfected pRSV-βgal plasmid. Although no stimulatory effect on transcriptional activity of the α1(I) collagen gene minimal promoter was observed when one copy of 3′FP1 was present in either orientation, the 3′FP1 sequence stimulated expression nearly 4-fold when three copies were present and approximately 6-fold when five copies were present (Fig. 4). On the other hand, 3′FP2 slightly stimulated expression of the α1(I) collagen gene promoter with only one copy present but not with five copies (Fig. 4). 3′FP3 slightly stimulated expression of the parental pGLCOL3 plasmid when either a single copy or three copies of the footprinted region were positioned 3′ of the luciferase reporter gene (Fig. 4).
USF-1 and USF-2 Interact at a Consensus E-box Binding Site Located in the
Since 3′FP1 stimulated expression of the α1(I) collagen gene minimal promoter, we wished to identify the protein(s) that interacts with this cis-acting element. Mobility shift assays were performed with a radiolabeled 3′FP1 oligonucleotide using NIH 3T3 nuclear extracts. A DNA-protein complex was formed in the mobility shift assay (Fig. 5, lane 2), which was specifically competed by unlabeled 3′FP1 oligonucleotide (Fig. 5, lane 3). Examination of the nucleotide sequence of 3′FP1 revealed a consensus E-box (CANNTG), a known binding site for bHLH proteins (34), located at nucleotides +3695 to +3700 in the α1(I) collagen 3′-flanking sequence. To determine if the consensus E-box is involved in protein binding to 3′FP1, we used a mutated 3′FP1 oligonucleotide in which the six nucleotides representing the consensus E-box were mutated in the context of surrounding wild-type sequence. This mutated oligonucleotide failed to compete for protein binding (Fig. 5, lane 4), demonstrating that the E-box is required for protein binding to the 3′FP1 footprint sequence. The nucleotide sequence of this E-box matches the USF-1/USF-2 consensus binding site (CACGTG) (35). Therefore, we used antibodies directed against USF-1 or USF-2 in the mobility shift assay to test if these two bHLH proteins interact with 3′FP1. Addition of either the USF-1 or the USF-2 antibody in the binding reaction supershifted the DNA-protein complex (Fig. 6, lanes 4 and 5), with the USF-1 antibody supershifting the complex to a greater degree than the USF-2 antibody. When both antibodies were included in the binding reaction (Fig. 6, lane 6), the entire complex was eliminated, indicating that both USF-1 and USF-2 interact with 3′FP1 as either homodimers or heterodimers. Additionally, binding to 3′FP1 requires Mg²⁺, with maximal binding activity requiring at least 5 mM MgCl₂ (data not shown). This is in agreement with a previous report demonstrating that decreased Mg²⁺ concentration reduces binding affinity of USF-1 to DNA (36). Binding of USF-1/USF-2 to 3′FP1 was not salt-dependent, as binding occurred over a wide range of salt concentrations (26 to 200 mM NaCl) in the binding reaction (data not shown).
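The sequence inspection described above (finding a CANNTG E-box in 3′FP1 and noting that it matches the USF site CACGTG) can be reproduced with a short Python sketch. The top-strand sequence is the 3′FP1 cloning oligonucleotide given under Methods; the function names are illustrative assumptions, and the reported offsets are 0-based positions within that oligonucleotide, not genomic coordinates.

```python
# Sketch: scan a sequence for consensus E-boxes (CANNTG) and flag those
# that match the USF-1/USF-2 consensus site CACGTG.
# Offsets are 0-based within the oligonucleotide, not genomic positions.
import re

# A lookahead keeps overlapping matches from being skipped.
EBOX_PATTERN = re.compile(r"(?=(CA[ACGT]{2}TG))")
USF_SITE = "CACGTG"


def find_eboxes(seq: str):
    """Return a list of (offset, hexamer) for every CANNTG in the sequence."""
    seq = seq.replace(" ", "").upper()
    return [(m.start(), m.group(1)) for m in EBOX_PATTERN.finditer(seq)]


def is_usf_site(hexamer: str) -> bool:
    """True if the E-box hexamer matches the USF consensus CACGTG."""
    return hexamer == USF_SITE


# 3'FP1 top strand from the Methods section (5' to 3').
FP1_TOP = "GAT CCG CGG CTG TCA CGT GGC ATG GGC TGG TAT GTG CTC TAA ATA"

HITS = find_eboxes(FP1_TOP)

if __name__ == "__main__":
    for offset, hexamer in HITS:
        print(offset, hexamer, "USF consensus:", is_usf_site(hexamer))
```

Because CACGTG is its own reverse complement, scanning one strand is sufficient for this particular site; other CANNTG variants would need the opposite strand checked as well.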
USF Participates in the Activation of the
To determine the functional role of USF interaction with this 3′ E-box in the stimulatory effect of the 3′-flanking region on the α1(I) collagen gene promoter, we generated a mutated reporter gene construct in which only the CACGTG (nucleotides +3695 to +3700) E-box site was mutated in pCOL-(+3590-4597) (see Fig. 2A), creating p3′FP1M. When p3′FP1M was transiently transfected into NIH 3T3 cells and transfection efficiencies normalized to the co-transfected pRSV-βgal plasmid, a significant decrease in reporter gene activity was observed compared with the wild-type parental pCOL-(+3590-4597) plasmid (Fig. 7). This indicates that at least part of the stimulatory properties of this region on the α1(I) collagen gene promoter are due to USF-1/USF-2 binding to the E-box. To assess if the other footprinted regions, 3′FP2 and 3′FP3, participate in the stimulatory properties in the pCOL-(+3590-4597) plasmid, we created plasmid p3′FP1-2-3M, which contained mutations in all three footprinted regions (nucleotides +3695-3700; nucleotides +3797-3809; nucleotides +4050-4059). When this plasmid was transiently transfected into NIH 3T3 cells, transcriptional activation was completely abolished and, in fact, expression was reduced compared with the parental pGLCOL3 plasmid (Fig. 7). This result supports data presented in Fig. 2B demonstrating that an inhibitory element is present, as shown with plasmid pCOL-(+4597 to 1899). Taken together, these data indicate that USF-1/USF-2 is involved in the stimulatory properties of the 3′-flanking region on the α1(I) collagen gene promoter; however, 3′FP2 and 3′FP3 also cooperate in stimulating transcription of the α1(I) collagen gene promoter.
The genes encoding the α1 and α2 polypeptide chains of type I collagen are regulated in a developmental, inducible, and tissue-specific manner. In previous studies of the human, murine, and rat α1(I) collagen genes using either transient transfections into cultured cells or transgenic animals, cis-regulatory elements have been identified in the 5′-flanking regions, the promoter region, and first introns of these genes (9-19). The precise functions of many of these cis-acting elements remain to be elucidated, and some conflicting results have been reported, perhaps reflecting the use of different reporter gene constructs and different cell types. Most studies support the notion that the 5′ region of the α1(I) collagen gene contains a strong proximal promoter which exhibits at least partial tissue specificity and which is modulated by additional regulatory elements.
Several lines of evidence suggested that the 3′-flanking region of the α1(I) collagen gene contains transcriptional regulatory elements. Therefore, we initiated a systematic analysis of the 3′-flanking region to locate important regulatory elements. Our analysis of the 3′-flanking region of the α1(I) collagen gene has located a fragment (between nucleotides +1899 and +4597 downstream of the translational termination site of the α1(I) collagen gene) that enhances transcription of the heterologous β-globin gene promoter in NIH 3T3 cells, a cell line that expresses moderate levels of type I collagen. Although the level of enhancement is not dramatic, it is comparable to the enhancement of the same reporter gene construct by the SV40 enhancer in the same cells (27). This fragment also stimulates transcription driven by the α1(I) collagen promoter in an orientation-dependent manner. Deletional analysis of the 3′-stimulatory region narrowed the enhancing region to within a 500-bp segment. DNase I footprinting assays demonstrated the presence of three regions that specifically interact with DNA binding proteins. Only one of the footprinted regions was capable of stimulating expression of the α1(I) collagen promoter when multiple copies were inserted 3′ of the luciferase reporter gene. Further analysis of this protein binding site demonstrated that both USF-1 and USF-2 interact with the nucleotide sequence of a consensus E-box, the binding site for basic helix-loop-helix (bHLH) proteins. A functional role of this site in stimulating expression of the α1(I) collagen gene promoter was assessed by mutating only the nucleotides comprising the consensus E-box while in the context of the wild-type pCOL-(+3590-4597) collagen-containing sequence. The collagen-luciferase reporter gene construct containing the mutated E-box had decreased transcriptional activity compared with the construct containing the wild-type E-box. However, the transcriptional stimulatory properties of the 3′-flanking region were completely lost only when all three protein binding sites were mutated. This was surprising since 3′FP2 and 3′FP3 alone did not stimulate transcription (Fig. 4). These data suggest a complex interaction between the trans-acting factors interacting with the 3′-flanking region. In addition, when all three cis-acting elements were mutated, promoter activity was reduced below the level of the α1(I) promoter-driven plasmid pGLCOL3. This indicates that a negative regulatory element is present between nucleotides +3590 and +4597. This finding is supported by the construct pCOL-(+4597 to 1899), in which an inhibitory effect on transcriptional activity was observed (see Fig. 2).
The USF-1 binding site in the endogenous α1(I) collagen gene is located over 20 kb downstream from the promoter. Regulatory elements have also been demonstrated in the 3′-flanking region of several other genes, including the mouse cytosolic glutathione peroxidase gene (37), the human keratin 19 gene (38), the murine c-fos protooncogene (39), the human angiotensinogen gene (40), and the human tyrosine hydroxylase gene (41). One could speculate that looping of the chromatin would place the two regions in close enough proximity to one another for appropriate interactions to occur.
Both USF-1 and USF-2 are ubiquitously expressed, although their relative abundance varies in different cell types (42). In fact, USF was originally identified and characterized in HeLa cells (43-45). Thus, it was surprising that the 3′ region of the α1(I) collagen gene enhanced transcription in NIH 3T3 cells but not in HeLa cells. Perhaps differences in the USF proteins, transcriptional co-factors, or other DNA binding proteins allow for this difference in transcriptional activity between the two cell types. The expression of both USF-1 and USF-2 genes results in multiply spliced messages producing variations in the amino terminus of the protein (42). This, therefore, may give rise to proteins with different transcriptional stimulating activities since the amino-terminal portion of the proteins contains the trans-activating domain. Since the USF family members homodimerize and heterodimerize with each other, but do not heterodimerize with other bHLH proteins (46, 47), various ratios of multiple USF-1 and USF-2 proteins will generate complex sets of USF members with potentially different transcriptional activating capacities. Additionally, transcriptional co-factors that are necessary for USF to stimulate the α1(I) collagen gene promoter may be absent in HeLa cells. USF does not appear to trans-activate effectively in highly purified in vitro systems (48), suggesting a role for additional interacting proteins. It is believed that USF-1 affects transcription by interacting with the multisubunit TFIID complex (43, 49) and specifically with the TFIID subunit TAFII55 (50). Additional factor(s) also appear to be required for USF to stimulate transcription, one of which appears to be PC5, a novel co-factor which has been demonstrated to mediate transcriptional activation by USF-1 (51).
USF contributes to the regulation of multiple genes, some of which are expressed in a tissue-specific or inducible manner. The human β-globin locus control region (52), the murine type II regulatory subunit gene of cyclic adenosine 3′,5′-monophosphate-dependent protein kinase (53), the murine p53 gene (54), the human cell cycle-dependent cyclin B1 gene (55), the chicken αA-crystallin gene (56), the rat cardiac ventricular myosin light chain 2 gene (57), and the rat preprotachykinin-A gene (58) are regulated in part by USF. USF has also been shown to be involved in the TGF-β1 responsive element in the human plasminogen activator inhibitor gene (59) and in the glucose responsive element in the pyruvate kinase gene (60). Our study demonstrates that the ubiquitous USFs contribute to the moderately high transcriptional rate of the α1(I) collagen gene in NIH 3T3 cells.
The present study, in conjunction with previous studies, demonstrates that the α1(I) collagen gene is regulated by a complex array of cis-acting elements in the 5′-flanking region, the promoter, the first intron, and the 3′-flanking region. Like USF-1 and USF-2, which interact with the 3′-regulatory element identified in this study, most of the other factors involved in transcriptional regulation of the α1(I) collagen gene are ubiquitous factors. For example, the proximal promoter has been shown to interact with SP1, NF-I, CBF, and cKrox (13, 61, 62), none of which is present exclusively in collagen-producing cells. Several elements, however, have been shown to contribute to cell-specific expression of the α1(I) collagen promoter. These include a TGF-β response element located approximately 1.6 kb upstream of the transcriptional start site (63), a sequence required for expression in bone (64), and elements that modulate expression in dermal fibroblasts, osteoblasts, and odontoblasts, and in tendon and fascia fibroblasts (65). However, even without these elements the α1(I) collagen promoter exhibits a remarkable degree of tissue specificity (10, 65, 66). Taken together, these observations suggest that the various regulatory elements located in the 5′-flanking region, the first intron, the 3′-flanking region, and possibly additional sites so far unidentified act in concert to provide appropriate tissue specificity and high levels of activity of the α1(I) collagen gene promoter and that the correct spatial arrangement of the various elements is required for appropriate activity.
The nucleotide sequence reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number U50767.