(Received for publication, January 3, 1996; and in revised form, February 13, 1996)
The members of the syndecan family are heparan sulfate proteoglycans that are expressed in temporally and spatially regulated patterns in various tissues, where they mediate extracellular influences on cell morphology and behavior. Functional characterization of the mouse syndecan-1 promoter was carried out to elucidate the mechanisms that maintain the high transcription levels of the syndecan-1 gene in various epithelia. For this purpose, 9.5 kilobase pairs of the upstream region of the mouse syndecan-1 gene were cloned, sequenced, and used to prepare chimaeric constructs with a reporter gene, followed by transient or stable transfections into NMuMG epithelial and 3T3 fibroblastic cells. In NMuMG cells, cultured either in the presence or absence of serum, the 2.5-kilobase pair promoter region gave constitutive transcription activity, whereas in 3T3 cells serum depletion decreased the promoter activity significantly. Deletion of the upstream sequences to -437 base pairs relative to the translation initiation site had little effect on this promoter activity. Further deletion to -365 base pairs removed three GT boxes and slightly increased the promoter activity, whereas deletion of the next two GC boxes (to -326 base pairs) reduced the promoter activity dramatically. All of the GC or GT box sequences bound the same set of Sp1-like nuclear proteins in gel shift assays. Nuclear protein binding was also demonstrated around both of the most intense transcription initiation sites. Deleting each of these regions separately resulted in total loss of transcription initiation from the deleted site and decreased the promoter activity in proportion to the intensity of the abolished start site. This indicates that the transcription initiation of the syndecan-1 gene is directed through initiator-like elements directly overlapping the start sites, as shown for several TATA-less housekeeping and growth-regulated genes.
We propose that the constitutive high level gene expression in epithelial cells is achieved by the proximal promoter, which is controlled by members of the Sp1 transcription factor family.
The syndecans are a family of integral membrane proteoglycans. They take part in the regulation of cell morphology and behavior by conveying extracellular information to cells. The syndecans share a common domain structure, first described for murine syndecan-1 (1). The extracellular domains of these proteins contain attachment sites for glycosaminoglycan side chains, which may be composed of either heparan sulfate or chondroitin sulfate (2). The intracellular domain is highly conserved among all four known members of the syndecan family (for review, see (3, 4, 5)) and apparently contains signals for the proper localization of the molecule within polarized epithelial cells (6). Syndecans can simultaneously bind both structural proteins of the extracellular matrix and heparin-binding growth factors, such as basic fibroblast growth factor (7). Indeed, the presence of heparin or heparan sulfate seems to enhance signal transduction by basic fibroblast growth factor (8, 9). Interestingly, however, forced expression of syndecan-1 in 3T3 cells down-regulates the growth response to basic fibroblast growth factor (10). Therefore, syndecan-like molecules may either promote or antagonize growth factor action (for review, see (11)).
Each member of the syndecan family has a specific pattern of expression (12). Syndecan-1 expression is restricted mainly to epithelia in adults, but during embryonic development it is temporarily expressed at high levels in proliferating and condensing mesenchymes, e.g. in the development of teeth (13), limbs (14), kidneys (15), and lungs (16). Likewise, keratinocytes in healing wounds express enhanced levels of syndecan-1 (17).
The role of syndecan-1 in the control of cell growth and morphology is illustrated by its altered expression in clinical malignancies and experimental cell culture models of transformation. First, in steroid-regulated S115 mammary epithelial cells, testosterone-induced transformation is associated with the loss of syndecan-1 expression, whereas the non-transformed, epithelioid phenotype, together with an organized actin cytoskeleton and normal growth, is restored in cells genetically engineered to express syndecan-1 in the presence of the hormone (18, 19). Second, decreased syndecan-1 expression is correlated with poor differentiation status of UV-induced skin tumors in mice (20) and with tumor formation by transformed keratinocytes in nude mice (21). Third, syndecan-1 expression is restricted to myeloma tumors with a well-differentiated, i.e. less aggressive, phenotype (22). Finally, patients with syndecan-1 positive squamous cell carcinomas have a more favorable overall and recurrence-free prognosis than patients with syndecan-1 negative carcinomas (23).
The unique developmental expression of syndecan-1 and its loss in several neoplasias prompted us to characterize the structure of the syndecan-1 gene and its transcriptional regulation. In a previous paper we reported the complete structure and nucleotide sequence of the murine syndecan-1 gene, including the first 1 kb of the upstream region (24). We have now sequenced a further 8.5-kb fragment of this upstream region, characterized the functional regions of the proximal promoter, mapped the protein-DNA interactions of the regions required for high level expression, and analyzed the remainder of the gene for putative enhancer or silencer elements. Together, these data suggest that the proximal promoter of the syndecan-1 gene is a major regulatory element for syndecan-1 expression in epithelial cells and is controlled by members of the Sp1 transcription factor family.
The promoter fragments for constructs p-492CAT (-492 to -95), p-437CAT (-437 to -95), p-365CAT (-365 to -95), p-351CAT (-351 to -95), p-326CAT (-326 to -95), and p-289CAT (-289 to -95) were generated by polymerase chain reaction (PCR). All of these fragments share the same downstream primer (5`-dTTGCTCTAGACTTTGCTG-3`), located in the 5`-untranslated region of exon I. An XbaI site (TCTAGA) was incorporated into this primer. The reaction conditions were chosen according to the manufacturer's recommendations (Perkin-Elmer). PCR products were first cloned into the pGEM-T vector (Promega) and then sequenced from both ends to confirm the correct positioning of the oligonucleotide primers in the template and to check the orientation of the insert. Inserts were excised from the pGEM-T vectors by SphI/XbaI digestion and recloned into the SphI/XbaI sites of pCAT-basic. Finally, all of the PCR-derived inserts of these CAT constructs were sequenced using the Sequenase kit and synthetic oligonucleotide primers. One A to G mutation (at position -265) was found in construct p-289CAT. This mutation was not located in a protein binding area.
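The construct names above encode the 5` end of each PCR fragment, and all fragments end at -95. The following minimal sketch tabulates the nominal genomic span of each fragment and checks that the shared downstream primer carries the XbaI recognition sequence. It assumes that both endpoints are counted as part of the fragment and ignores any non-genomic primer bases; the names `CONSTRUCTS` and `fragment_length` are illustrative and do not come from the original work.

```python
# Fragment coordinates (5' end, 3' end) relative to the translation start,
# copied from the construct list in the text.
CONSTRUCTS = {
    "p-492CAT": (-492, -95),
    "p-437CAT": (-437, -95),
    "p-365CAT": (-365, -95),
    "p-351CAT": (-351, -95),
    "p-326CAT": (-326, -95),
    "p-289CAT": (-289, -95),
}

DOWNSTREAM_PRIMER = "TTGCTCTAGACTTTGCTG"  # shared downstream primer from the text
XBAI_SITE = "TCTAGA"                      # XbaI recognition sequence


def fragment_length(start: int, end: int) -> int:
    """Nominal span in bp, assuming both endpoints are included."""
    return end - start + 1


# Nominal span of every promoter fragment.
lengths = {name: fragment_length(*coords) for name, coords in CONSTRUCTS.items()}

# 0-based offset of the XbaI site within the primer (-1 would mean "absent").
xbai_offset = DOWNSTREAM_PRIMER.find(XBAI_SITE)

if __name__ == "__main__":
    for name, bp in lengths.items():
        print(f"{name}: {bp} bp")
    print("XbaI site offset in primer:", xbai_offset)
```

Under the inclusive-endpoint assumption the spans shrink stepwise from the longest to the shortest construct, which is the only property the deletion series relies on.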
PCR amplification was used to delete the TATA sequence and footprinted region B from the promoter. For the TATA deletion, the TATA sequence TTTATTATAA was omitted from the upstream primer (5`-dCTGCAGAGCCTTTGGGGGCGGAGCG-3`), and the downstream primer (5`-dGGCAGGCTGCAGGCGCACGCCAGCG-3`) was selected 332 bp downstream from it. For the region B deletion, the sequence AACTAG was omitted from the downstream primer (5`-TCTGCAGTTGCAACCCACCCCCAGC-3`), and the upstream primer (5`-AGAGTGGGGTGGGCTTCGA) was selected 137 bp upstream from it. The following reaction conditions were used in PCR amplification. For the first two cycles: denaturation at 94 °C (2 min), annealing at 50 °C (1 min), and extension at 72 °C (3 min). For the following 33 cycles: denaturation at 94 °C (1 min), annealing at 55 °C (1 min), and extension at 72 °C (2 min). The PCR products were first cloned into the pGEM-T vector, after which the p(TATA)CAT and p(B)CAT transfection constructs were generated by replacing the PstI/XhoI (-271 to -137) and ApaI/PstI (-340 to -271) fragments of construct p-437CAT, respectively, with the mutated fragments from the pGEM-T clones. The mutations were verified by sequencing.
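The two-stage cycling program described above can be written down as a small data structure, which makes the cycle count and the total programmed hold time easy to check. This is only a sketch: the class and function names are hypothetical, and ramp times between temperatures, which depend on the instrument, are not included.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int   # hold temperature in degrees Celsius
    minutes: int  # hold time in minutes


@dataclass(frozen=True)
class Stage:
    cycles: int
    steps: Tuple[Step, ...]


# Values taken from the reaction conditions given in the text.
PROGRAM = (
    Stage(2, (Step("denaturation", 94, 2),
              Step("annealing", 50, 1),
              Step("extension", 72, 3))),
    Stage(33, (Step("denaturation", 94, 1),
               Step("annealing", 55, 1),
               Step("extension", 72, 2))),
)


def total_cycles(program) -> int:
    """Number of cycles summed over all stages."""
    return sum(stage.cycles for stage in program)


def total_hold_minutes(program) -> int:
    """Programmed hold time only; temperature ramps are not modeled."""
    return sum(stage.cycles * sum(step.minutes for step in stage.steps)
               for stage in program)


if __name__ == "__main__":
    print("cycles:", total_cycles(PROGRAM))
    print("hold time (min):", total_hold_minutes(PROGRAM))
```

The lower annealing temperature in the first two cycles is kept as a separate stage so that it is not averaged away when the program is summarized.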
To search for putative enhancer elements, the vector pSynPromCAT was first derived from the pCAT-promoter plasmid (Promega) by exchanging the SV40 promoter region upstream of the CAT reporter gene for the mouse syndecan-1 promoter. The SV40 promoter was excised from the pCAT-promoter plasmid by BglII/StuI digestion, and the syndecan-1 promoter fragment BglII/XhoI (-1310 to -137 bp) was then ligated in its place after blunt ending the XhoI site with Klenow polymerase. Gene fragments were cloned into the polylinker region located far upstream of the reporter gene. Four XbaI fragments named Xb4 (-9422 to -5064), Xb7 (-5063 to -4376), Xb5 (+5370 to +9436), and Xb3 (+9752 to +14982) were cloned into the XbaI site of the polylinker in the forward orientation, and the resulting constructs were designated pSynProm-Xb4CAT, pSynProm-Xb7CAT, pSynProm-Xb5CAT, and pSynProm-Xb3CAT, respectively. The gene fragments XbaI/SphI (-4375 to -2394) and BamHI/XbaI (+191 to +5369) were cloned into the polylinker of the vector in the forward orientation, and the resulting constructs were designated pSynProm-Xb2/5`CAT and pSynProm-Xb2/3`CAT, respectively.
To make stably transfected cell clones, CAT constructs were co-transfected with pBGS (a gift from Bruce Granger, University of Montana), a plasmid containing an SV40 promoter and a neomycin-resistance cassette. Two days after transfection the cells were plated on 10-cm dishes (Falcon), and selection with Geneticin (G418; 750 µg/ml, 98% pure; Sigma) was started. The G418 concentration was chosen by titrating the selection efficacy with wild type cells. The surviving cell clones were pooled, and the relative copy number of the CAT constructs was determined by Southern hybridization using the 563-bp XbaI/NcoI fragment from the vector pCAT-basic (Promega) as a probe.
Figure 1: Mouse syndecan-1 RNA levels and promoter activity in control and serum-deprived NMuMG and 3T3 cells. A, cells were grown for 48 h in the presence of Dulbecco's modified Eagle's medium supplemented with either 10% FCS or 2% carboxymethyl-Sephadex medium (CMS). Syndecan-1 RNA levels were analyzed by Northern blotting using the PM-4 probe (1). A rat glyceraldehyde-3-phosphate dehydrogenase (GAPDH) probe was used to control for RNA loading. Below, the abundance of syndecan-1 RNA in serum-deprived cells relative to control cells is given as a percentage. B, relative activity of the 2.5-kb syndecan-1 promoter fragment in control and serum-deprived NMuMG and 3T3 cells. Both cell lines were stably co-transfected with a promoter-CAT chimaeric construct and a neomycin resistance gene construct. The effect of serum deprivation on promoter activity is shown as relative CAT activities of cell extracts. The amounts of extract used in the CAT reactions were normalized to their protein content.
We first analyzed the longest proximal promoter construct, p-2.5CAT (-2528 to -137), in both NMuMG and 3T3 cells cultured in the presence and absence of serum, using stably transfected cells. Serum depletion was performed by culturing the clones for 2 days before harvest in 2% carboxymethyl-Sephadex eluted FCS. In both cell lines the activity of the CAT construct closely followed the expression of the endogenous syndecan-1 gene. In NMuMG cells the promoter activity was high in both culture conditions. The CAT activity of NMuMG cell extracts after serum depletion was in fact 65% higher than the activity of extracts from normal culture conditions (Fig. 1B). In the 3T3 cell clone, on the other hand, serum depletion decreased the CAT activity of the cell extracts to one-half of the activity seen in normal culture conditions (Fig. 1B). The actual suppression of the promoter activity is more dramatic than this figure suggests, because the half-life of the CAT enzyme is over 50 h, so that enzyme synthesized before serum depletion still contributes to the measured activity. According to these results the 2.5-kb syndecan-1 promoter fragment seemed to be responsible for the basal transcription activity of the syndecan-1 gene in both NMuMG and 3T3 cells.
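The argument about the CAT half-life can be made concrete with a simple first-order decay estimate. The sketch below is an illustration under stated assumptions, not part of the original analysis: it takes 50 h as the half-life (the text gives this as a lower bound), 48 h as the duration of serum depletion, and ignores both new enzyme synthesis and dilution by cell division. The function name `remaining_fraction` is hypothetical.

```python
def remaining_fraction(elapsed_h: float, half_life_h: float) -> float:
    """Fraction of pre-existing enzyme left after first-order decay."""
    return 0.5 ** (elapsed_h / half_life_h)


# Assumed inputs: 2 days of serum depletion, 50 h half-life (lower bound).
DEPLETION_H = 48.0
CAT_HALF_LIFE_H = 50.0

fraction_left = remaining_fraction(DEPLETION_H, CAT_HALF_LIFE_H)

if __name__ == "__main__":
    print(f"Pre-existing CAT remaining after {DEPLETION_H:.0f} h: "
          f"{fraction_left:.0%}")
```

Under these assumptions roughly half of the enzyme made before serum depletion would still be present at harvest even if synthesis had stopped completely, so a measured drop to one-half is compatible with a much larger drop in promoter activity.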
A more detailed characterization of the promoter region was carried out by transiently transfecting a series of promoter constructs into NMuMG cells. Deletion of the 5`-flanking region from -2528 to -437 had only a small effect on the promoter activity, as seen with constructs p-2.5CAT (-2528 to -137), p-1.0CAT (-1023 to -137), p-830CAT (-830 to -137), p-492CAT (-492 to -95), and p-437CAT (-437 to -95) (Fig. 2). Deletion to -365 (construct p-365CAT) removed three GT box sequences (GGGGTGGG or GGGGTGTGG) and resulted in a 20-30% increase in promoter activity (Fig. 2). Further deletion to -326 (construct p-326CAT) removed two GC box sequences (GGGGCGGGG) and reduced the promoter activity by about 75% (Fig. 2). The short constructs p-326CAT (containing one GT box and one GC box) and p-289CAT (containing one GC box) showed only low basal transcription activity, as did the shortest construct, p-271CAT (-271 to -137), which included only the putative TATA sequence and the first and second transcription initiation sites (Fig. 2). Stably expressing NMuMG cell clones were prepared by co-transfecting these chimaeric promoter-CAT constructs with plasmid pBGS, which contains a neomycin-resistance gene (6). The CAT activities of the stably transfected clones were in accordance with the data from transient transfections (data not shown). These results demonstrated that the single GT/GC box sequence (between nucleotides -282 and -295), together with the downstream promoter region, was not able to activate high level transcription of the reporter gene in NMuMG cells. At least one of the GC boxes located between nucleotides -326 and -365 was also needed for high level transcription activity of the syndecan-1 promoter. The more upstream region (-365 to -437) contained some negatively acting cis-elements.
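The GC and GT box sequences quoted above are short exact motifs, so their occurrences in any sequence can be located with a simple scan of both strands. The sketch below is a minimal illustration, not the method used in the study: the function names are hypothetical, only exact matches to the three motifs given in the text are reported, and the two sample inputs are oligonucleotide sequences quoted elsewhere in this paper (the Sp1 consensus competitor and the upstream primer used for the region B deletion).

```python
import re

# Exact motifs as quoted in the text.
SP1_BOXES = {
    "GC box": ["GGGGCGGGG"],
    "GT box": ["GGGGTGGG", "GGGGTGTGG"],
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def find_boxes(seq: str):
    """Return (kind, motif, strand, offset) for every exact motif match.

    Offsets are 0-based. For the "-" strand they refer to positions in the
    reverse-complemented sequence, not in the input sequence.
    """
    seq = seq.upper()
    hits = []
    for strand, text in (("+", seq), ("-", reverse_complement(seq))):
        for kind, motifs in SP1_BOXES.items():
            for motif in motifs:
                # Lookahead so that overlapping occurrences are also found.
                for match in re.finditer(f"(?={motif})", text):
                    hits.append((kind, motif, strand, match.start()))
    return sorted(hits, key=lambda hit: (hit[2], hit[3]))


if __name__ == "__main__":
    samples = {
        "Sp1 consensus oligonucleotide": "ATTCGATCGGGGCGGGGCGAGC",
        "region B deletion upstream primer": "AGAGTGGGGTGGGCTTCGA",
    }
    for label, sequence in samples.items():
        print(label, find_boxes(sequence))
```

An exact-match scan will miss degenerate sites such as the one-mismatch sequence discussed for footprinted region G, so it should be read as a way of locating the canonical boxes only.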
Figure 2: Deletion analysis of the mouse syndecan-1 promoter region. At the top of the figure a schematic presentation of the 5`-flanking region of the gene is shown. The open box indicates exon I. The horizontal arrows show the locations of the oligonucleotide primers used in PCR reactions to generate the promoter fragments. Restriction enzymes (XhoI, PstI, DraI, StuI, and HindIII) used in the preparation of the other constructs are also shown. The vertical arrow indicates the beginning of the translated sequence, which is counted as +1 in the nucleotide numbering of the sequence. The closed circles represent the locations of the consensus binding sequences for Sp1 (GC and GT boxes). The scale bar represents 100 bp. Below, the structures and names of the chimaeric constructs are shown. The lines represent the syndecan-1 promoter fragments and the hatched boxes the CAT gene. The 5`- and 3`-ends of the promoter fragments are drawn to scale with the map at the top of the figure. The relative CAT activities of the constructs transfected into NMuMG cells are represented by the black columns in the diagram.
Figure 3: DNase I footprinting analysis of the mouse syndecan-1 proximal promoter. A, an XhoI-BamHI restriction fragment containing promoter sequences from -137 to -388 was labeled on the sense strand at the XhoI end and used as a probe. The probe was incubated either without (lane 2) or with (lane 3) the nuclear extract prepared from epithelial NMuMG cells. The nucleotide positions (numbers) at the left of the panel are based on the G + A chemical sequencing reaction ladder shown in lane 1. At the right of the panel the regions protected from DNase I digestion are indicated by bars and letters. B, an XhoI-DraI restriction fragment (-137 to -830) labeled on the sense strand at the XhoI site was used in DNase I footprinting analysis as described in A. C, nucleotide sequence of the promoter region extending from -470 to -221. The footprinted regions, marked FP-A, -B, -C, -D, -E, -F, and -G, are underlined. The GC/GT-box sequences and the TATA sequence are indicated by boldface letters.
Figure 4: Gel mobility shift analysis of footprinted regions D, E, and F including GC or GT box sequences. End-labeled double stranded oligonucleotide probes covering footprinted regions D (lanes 1-3), E (lanes 4-6), and F (lanes 7-9) were incubated without nuclear extract (lanes 1, 4, and 7), with nuclear extract from epithelial NMuMG cells (lanes 2, 5, and 8), or with nuclear extract from mesenchymal 3T3 cells (lanes 3, 6, and 9). The shifted protein complexes are indicated by horizontal lines to the left of the panel.
Figure 5: Gel mobility shift/competition analysis of the footprinted regions including GC/GT boxes. The end-labeled double stranded oligonucleotide covering footprinted region E was used as a probe in all of the assays. In lane 1 the probe only was loaded. The nuclear extracts from NMuMG (lanes 2-10) or 3T3 cells (lanes 11-15) were first incubated for 10 min with the unlabeled double stranded competitor oligonucleotides F (lanes 3, 4, and 12), E (lanes 5, 6, and 13), D (lanes 7, 8, and 14), and B (lanes 9, 10, and 15), after which the probe was added. A 20-fold (lanes 3, 5, 7, and 9) or 100-fold (lanes 4, 6, 8, 10, and 12-15) molar excess of the competitor was used. In lanes 2 and 11 no competitors were used. The specifically shifted protein complexes are indicated by horizontal lines to the left of the panel.
Because the proximal promoter included several binding sequences for the transcription factors Sp1, AP-2, and NF-κB, we also carried out competition assays with their double stranded consensus binding site oligonucleotides. Oligonucleotide E was used as a probe in these experiments. The binding of all complexes was clearly inhibited by the Sp1 binding site oligonucleotide (ATTCGATCGGGGCGGGGCGAGC), but not by the AP-2 (GATCGAACTGACCGCCCGCGGCCCGT) or NF-κB (AGTTGAGGGGACTTTCCCAGGC) binding site oligonucleotides (Fig. 6A). These results indicated that the nuclear proteins binding to promoter regions D, E, and F share similar binding specificity through Sp1-like binding domains. This was supported by the fact that the retarded complex generated by purified Sp1 protein co-migrated in the mobility shift assays with the uppermost complex produced by the nuclear extract (Fig. 6B). Moreover, in all supershift experiments using the polyclonal anti-Sp1 antiserum with probes D, E, and F and nuclear extract, the slowest migrating complex was lost and a supershifted complex was produced (Fig. 7). The intensities of the two other complexes were partly reduced (Fig. 7). A polyclonal antiserum against a peptide of human neurofibromin (anti-P111) was used as a negative control. These experiments confirmed that Sp1 was the nuclear factor bound to the probe in the slowest migrating complex. The two faster migrating complexes may represent other members of the Sp1 multigene family, which share some immunological homology with Sp1 (33). Hence, we conclude that the GC and GT box sequences of the syndecan-1 promoter bind Sp1 and possibly some other Sp1 multigene family members. Interestingly, there were no differences between NMuMG and 3T3 cell extracts in the binding of nuclear proteins to these regions.
Figure 6: Gel mobility shift analysis demonstrating that Sp1 binds to the GC/GT boxes. A, nuclear extract from NMuMG cells was incubated with the unlabeled double stranded competitor oligonucleotide before the probe was added. The double stranded oligonucleotide covering footprinted region E was used as a probe. Lane 1 contained the probe only and lane 2 the probe with nuclear extract but no competitor. The competitor oligonucleotides including consensus binding sites for Sp1 (lane 3), AP-2 (lane 4), and NF-κB (lane 5) were used in 100-fold molar excess. The specifically shifted protein complexes are indicated by horizontal lines to the left of the panel. B, the slowest migrating complex generated from the 3T3 cell nuclear extract with oligonucleotide probe E (lane 1) co-migrates with the retarded complex produced with pure Sp1 protein (lanes 2 and 3). 0.5 (lane 2) and 1.0 (lane 3) footprinting units of the Sp1 protein were used.
Figure 7: Characterization of GC/GT box binding proteins by immunosupershift. A polyclonal rabbit anti-Sp1 peptide antibody (lanes 3, 7, and 11) or, as a control, a polyclonal antibody against human neurofibromin (anti-P111; lanes 4, 8, and 12) was incubated for 10 min with the nuclear extract from NMuMG cells before the oligonucleotide probe was added. End-labeled oligonucleotide probes covering footprinted regions D (lanes 1-4), E (lanes 5-8), and F (lanes 9-12) were used. Lanes 1, 5, and 9 contained the probes only, and lanes 2, 6, and 10 the probes incubated with nuclear extract without antibody. The specifically shifted protein complexes are indicated by horizontal lines to the left of the panel and the supershifted complex by horizontal arrows.
Footprinted region C and the 5` part of region G were not studied in gel shift experiments. Both included GT box sequences, and we therefore assumed that their binding properties were the same as those of footprinted region F. In contrast, the 3` part of footprinted region G bound both Sp1-like nuclear proteins and two unknown cell type-specific nuclear proteins. The double stranded oligonucleotide probe G (5`-dCCTAGGAGGCGTGGAAGGGGGTGT), covering the 3` part of footprinted region G, included a possible Sp1 binding sequence (GAGGCGTGG) that differs at only one nucleotide from the consensus binding sequence for Sp1, (G/T)(G/A)GGC(G/T)(G/A)(G/A)(G/T). In a gel shift assay using probe G and nuclear extract from NMuMG cells, at least five specifically retarded complexes were obtained (Fig. 8, lanes 2 and 3). The proteins in the three slowest migrating complexes (I, II, and III) appeared to be the same Sp1-like proteins as seen with oligonucleotide probes D, E, and F, as described above. These complexes co-migrated with the nuclear protein complexes produced with probe D (Fig. 8, lane 10), and their formation was specifically competed by the Sp1 consensus binding site oligonucleotide or by oligonucleotide D (Fig. 8, lanes 4 and 6). As with oligonucleotide probes D, E, and F, the slowest migrating complex (I) contained transcription factor Sp1, because in supershift experiments using the polyclonal anti-Sp1 antiserum it was replaced by a supershifted complex (Fig. 8, lane 7). The rabbit IgG used as a negative control in this experiment did not produce a supershifted complex (Fig. 8, lane 8). The intensity of complex I was relatively low compared with that seen with probes of similar specific labeling activity that included complete GC or GT box sequences. This might be a result of the mismatch in the Sp1 binding sequence. The protein complexes IV and V were not competed by Sp1 binding oligonucleotides, indicating that they bound to different parts of the probe.
Interestingly, nuclear extract from 3T3 cells produced only protein complexes I, II, and III when incubated with probe G (Fig. 8, lane 9). Thus, the nuclear factors producing complexes IV and V were present only in NMuMG cell extract. We conclude that the 3` part of footprinted region G binds the same set of Sp1-like proteins as footprinted regions D, E, and F, probably through the unusual GC box sequence. In addition, two unknown epithelial cell-specific nuclear protein complexes bound to this footprinted region.
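The statement that GAGGCGTGG differs from the Sp1 consensus at a single position can be checked by comparing the site with the degenerate consensus one position at a time. The sketch below is a minimal illustration of that comparison; the names `SP1_CONSENSUS` and `mismatch_positions` are hypothetical, and the consensus is encoded exactly as it is quoted in the text.

```python
# Sp1 consensus as quoted in the text; each entry lists the bases allowed
# at that position.
SP1_CONSENSUS = ["GT", "GA", "G", "G", "C", "GT", "GA", "GA", "GT"]


def mismatch_positions(site: str, consensus=SP1_CONSENSUS):
    """Return the 1-based positions at which `site` departs from the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return [
        position
        for position, (base, allowed) in enumerate(zip(site.upper(), consensus), start=1)
        if base not in allowed
    ]


if __name__ == "__main__":
    for site in ("GAGGCGTGG", "GGGGCGGGG"):
        print(site, mismatch_positions(site))
```

With the consensus encoded this way, the candidate site from probe G departs from it only at position 7, whereas the canonical GC box matches at every position.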
Figure 8: Gel mobility shift analyses of footprinted region G. The end-labeled double stranded oligonucleotide probe G covering the 3` part of footprinted region G (lanes 1-9) and the oligonucleotide probe D (lane 10) were used. In lane 1, only probe G was loaded. Nuclear extracts from NMuMG (lanes 2-8 and 10) and 3T3 cells (lane 9) were used. The five complexes generated from NMuMG cell extract (lane 2) are indicated by Roman numerals and arrows at the left of the panel. The competition assays were carried out as for probes D, E, and F, as described in the legend to Fig. 5. The unlabeled oligonucleotide G (lane 3), oligonucleotides covering footprinted regions D (lane 4) and B (lane 5), and the Sp1 consensus binding site oligonucleotide (lane 6) were used as competitors in 100-fold molar excess over the probe. A polyclonal rabbit anti-Sp1 peptide antibody (lane 7) and, as a control, rabbit IgG (lane 8) were used in supershift experiments. The supershifted complex is indicated by an asterisk and arrow at the left of the panel. Nuclear extract from 3T3 cells produced only the three slowest migrating complexes (lane 9). Oligonucleotide probe D was used to demonstrate that the three largest complexes produced with probe G co-migrate with the complexes binding to GC/GT boxes (lane 10).
Footprinted region B, which covered one of the transcription initiation sites, also bound a nuclear protein complex. This was demonstrated in gel shift assays using the double stranded oligonucleotide probe B (5`-dGGGCGGCTAGTTTTGCAACTGCAGAG-3`) and nuclear extracts from NMuMG and 3T3 cells. A single specifically retarded complex was produced when nuclear extract from NMuMG cells was used (Fig. 9, lanes 2 and 7), whereas extract from 3T3 cells produced only a barely visible band (Fig. 9, lane 3). Formation of this complex was not competed by oligonucleotides D, E, and F (Fig. 9, lanes 4-6), which demonstrated that the binding specificity of this protein complex was different from that of the Sp1-like proteins described above.
Figure 9: Gel mobility shift analyses of footprinted region B. The end-labeled double stranded oligonucleotide covering footprinted region B was used as a probe. Probe only was loaded in lane 1. With nuclear extract from NMuMG cells a single shifted protein complex was formed (lane 2), whereas 3T3 cell extract produced only a faintly visible band (lane 3). The unlabeled competitor oligonucleotides D (lane 4), E (lane 5), F (lane 6), and B itself (lane 7) were used in 100-fold molar excess as in previous shift assays. The specifically shifted protein complex is indicated by a horizontal line.
Figure 10: Effect of deleting promoter elements on transcription activity and start site usage. A, a schematic presentation of the promoter region is shown at the top of the figure. The filled circles represent the locations of the footprinted sequences for Sp1 (GC and GT boxes). The horizontal arrows show the locations of the transcription initiation sites. The size of each arrow indicates the relative abundance of the corresponding initiation site (25). The vertical arrow indicates the beginning of the translated sequence. Below, the structures and names of the chimaeric constructs are shown. The lines represent the syndecan-1 promoter fragments and the hatched boxes the CAT genes. The relative CAT activities of the constructs transiently transfected into NMuMG cells are represented by the black columns in the diagram. B, primer extension analyses of the RNAs transcribed from the promoter-CAT constructs. A primer complementary to the 5`-end of the CAT gene was hybridized to total RNA from NMuMG cell clones that were stably transfected with constructs p-437CAT (lane 1), p(TATA)CAT (lane 2), and p(B)CAT (lane 3). The primer extension products are indicated by brackets. Product A is located at position -252 bp and the lower band of product B at -281 bp. B` indicates the shortened extension product B resulting from the TATA sequence deletion. The sequencing reactions (lanes G, A, T, and C) were done with the same primer as the primer extension reactions; thus, the complementary nucleotides of the transcription initiation sites can be read directly from the sequencing ladder.
In vivo, the expression of the syndecan-1 gene is constitutive in several epithelial cells. In addition, during wound healing and embryonic development the expression appears to be strongly inducible (for reviews, see (3) and (34)). We have previously reported the genomic organization and nucleotide sequence of the mouse syndecan-1 gene (24). In this work we characterized the upstream gene regions responsible for its transcriptional regulation in cell culture conditions. We focused mainly on the identification of the promoter elements involved in constitutive gene expression. Using CAT assays from transiently and stably transfected cells we mapped a highly active proximal promoter region, which binds Sp1 and probably other members of the Sp family. Moreover, we found that transcription initiation was directed by initiator-like elements, as in TATA-less promoters. No enhancer elements were found in CAT assays when the upstream region (up to -9.4 kb) and the first intron (to +15 kb) were studied, whereas some suppressor elements were located upstream of the promoter (-2.4 to -4.4 kb).
Previously described features of the promoter included a TATA-like sequence 250 bp upstream from the translation start site (counted as +1) and several Sp1 and AP-2 transcription factor binding sites upstream of it (from -284 to -430) (24). Five E box sequences (from -549 to -1612) were also found, as well as a long TAATAA repeat (from -917 to -879), possibly a binding site for the Antennapedia homeobox transcription factor (25). Three major transcription initiation sites were located around the putative TATA sequence (25). The genomic organization of the chicken syndecan-4 gene resembles that of mouse syndecan-1 (35), but at the moment no information is available on the structure or function of the promoters of the other members of the syndecan gene family.
As described previously by Kim and co-workers (12), syndecan-1 is expressed in most cultured epithelial and mesenchymal cells. We used epithelial NMuMG and mesenchymal 3T3 cells, both of which expressed high levels of syndecan-1 mRNA in normal culture conditions. However, when these cells were cultured in serum depleted conditions, the syndecan-1 mRNA levels decreased dramatically in 3T3 cells but not in NMuMG cells (Fig. 1). By preparing polyclonal cell lines expressing the promoter-CAT constructs we were able to demonstrate that the 2.5-kb promoter fragment mimicked the regulation of the endogenous gene.
To find out the functional
elements of the promoter, the detailed characterization was done by
transient transfections with NMuMG cells. The deletion of most of the
5`-flanking sequence (from -2.5 kb to -437 bp) had only a
minimal effect on promoter activity. At least five E-box sequences, which
are possible targets for helix-loop-helix transcription factors, such as
members of myc and MyoD gene families(36, 37) , as well as the long TAATAA repeat were located in this deleted
promoter area. They are, thus, most likely not involved in the
constitutive epithelial expression of syndecan-1 gene. Further deletions
of the promoter sequences first revealed some negative regulatory
element(s) (-437 to -365). Downstream from that the short
region from -365 to -326 was shown to be critical for the
function of the promoter, because the deletion of it effectively reduced
the reporter gene expression to a minimum. The inhibitory region contained
two GT box sequences and the positive two GC box sequences, which were all
protected by nuclear proteins in DNase I footprinting assays. GC boxes are
typical Sp1 transcription factor binding sites found in many genes. Also,
functional GT box elements have been found in several other promoters and
enhancers, including those of the -globin gene (38) ,
tyrosine aminotransferase gene(39, 40) , tryptophan
oxygenase gene(41, 42) , and interleukin 2 gene(43) . Both of the two GC boxes and one of the GT box sequences were
studied by gel mobility shift assays. Interestingly, they all bound a
similar set of three protein complexes from nuclear extracts of both
epithelial NMuMG and mesenchymal 3T3 cells. The transcription factor Sp1
proved to be one of the binding proteins, as demonstrated by competition
assays, by co-migration of pure Sp1 protein, and by supershifts with
specific antibodies. We propose that the two uncharacterized complexes
represent other members of the Sp1 multigene family, since their binding
was readily competed by an Sp1 consensus binding site oligonucleotide. To date,
at least four separate Sp1-like nuclear factors (Sp1, Sp2, Sp3, and Sp4)
have been characterized (33, 44). They all bind to
both GC and GT box sequences, even though Sp1 and Sp3 bind GT boxes with
higher affinity than GC boxes. A highly conserved Sp1-like zinc finger
region for DNA binding and a glutamine-rich region for transactivation are
typical for all these proteins. Interestingly, it has been shown that Sp3
represses Sp1-mediated transcription activation by competition (45). This might explain the slight transcription inhibition seen
with the longer syndecan-1 promoter constructs, which contained
footprinted region G. This region bound the two other proteins more
strongly than Sp1. Some other zinc finger DNA-binding proteins have also
been shown to bind to GC and GT boxes. For example, EGR-1, a member of the
early growth response gene family, binds to a GT box sequence of the
interleukin 2 gene promoter and activates gene expression (43).
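Candidate E-box and GC box elements of the kind discussed above can be located in a promoter sequence by simple pattern matching. The following is a minimal sketch, not the procedure used in this study. It assumes the commonly cited E-box consensus (CANNTG) and the GC box core (GGGCGG, searched on both strands). The function names and the example sequence are hypothetical, and the sequence is not taken from the syndecan-1 promoter. GT boxes are omitted because no consensus for them is given in the text.

```python
import re

# Assumed consensus motifs (commonly cited forms, not taken from this paper).
MOTIFS = {
    "E-box": "CANNTG",   # N = any nucleotide
    "GC box": "GGGCGG",  # core of the Sp1 binding site
}

# Translation of the nucleotide codes used above into regex character classes.
CODES = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def reverse_complement(motif):
    """Return the reverse complement of a motif written with A, C, G, T, N."""
    return motif.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def to_regex(motif):
    """Convert a motif string into a regular expression."""
    return "".join(CODES[base] for base in motif)


def find_motifs(sequence, motifs=MOTIFS):
    """List (name, 0-based position, strand, matched text) for every motif hit."""
    sequence = sequence.upper()
    hits = []
    for name, motif in motifs.items():
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            if strand == "-" and pattern == motif:
                continue  # palindromic motif: the minus strand adds nothing new
            # A lookahead is used so that overlapping occurrences are reported.
            for m in re.finditer("(?=(%s))" % to_regex(pattern), sequence):
                hits.append((name, m.start(), strand, m.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    toy_sequence = "TTCACGTGAAGGGCGGTTCCGCCCAA"  # hypothetical example
    for hit in find_motifs(toy_sequence):
        print(hit)
```

A sequence match of this kind only nominates candidate sites. As in the experiments described here, protein binding and function have to be established separately, for example by footprinting, gel mobility shift, and deletion analysis.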
The assembly of the transcription initiation machinery in the syndecan-1 promoter was directed by initiator elements typically found in TATA-less promoters. The syndecan-1 promoter contains a putative TATA box sequence 23 bp upstream from one of the previously described transcription initiation sites (25). However, the most frequently used transcription initiation site is located only 2 bp, and the second most intense site 31 bp, upstream from this TATA box. Deletion of the sequences spanning these two major transcription start sites resulted in a loss of transcription initiation from the deleted site, whereas deletion of a classical TATA box would be expected to abolish initiation from a downstream site. In addition, the mutations decreased the transcription activity of the promoter in proportion to the usage of the abolished start site. These data indicate that transcription initiation from both of these major start sites is independently regulated by elements directly overlapping the initiation sites. The sequences around the sites exhibited high homology with the consensus sequence of the initiator element (PyPyANPyPyPy), where the A is the start site (46). Furthermore, it has been shown that in TATA-less promoters direct protein-protein interactions between Sp1 and the TFIID complex are essential for the assembly of the preinitiation complex (47). Consistent with this are the several Sp1 binding sequences upstream of the transcription initiation sites in the syndecan-1 gene.
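The comparison of a start-site sequence with the initiator consensus can be illustrated as follows. This is a minimal sketch under stated assumptions. The consensus is encoded exactly as written above (PyPyANPyPyPy, with Py = C or T, N = any base, and the A as the start site). The function names and the example sequences are hypothetical and are not the syndecan-1 start-site sequences.

```python
import re

# Initiator consensus as written in the text: Py Py A N Py Py Py.
# Per-position allowed bases; index 2 is the A at the start site.
INR_CLASSES = ["CT", "CT", "A", "ACGT", "CT", "CT", "CT"]

# Lookahead pattern so that overlapping candidate sites are all reported.
INR_PATTERN = re.compile(
    "(?=(%s))" % "".join("[%s]" % allowed for allowed in INR_CLASSES)
)


def find_initiators(sequence):
    """Return (0-based index of the start-site A, matched 7-mer) for full matches."""
    sequence = sequence.upper()
    return [(m.start() + 2, m.group(1)) for m in INR_PATTERN.finditer(sequence)]


def inr_score(window):
    """Count how many of the 7 positions of a window agree with the consensus.

    The window must be 7 nucleotides long with the candidate start site at index 2.
    """
    if len(window) != len(INR_CLASSES):
        raise ValueError("window must be exactly 7 nucleotides long")
    return sum(base in allowed for base, allowed in zip(window.upper(), INR_CLASSES))


if __name__ == "__main__":
    toy_sequence = "GGCTCAGTCTTGG"  # hypothetical example
    print(find_initiators(toy_sequence))  # full matches to the consensus
    print(inr_score("GCAGTCT"))           # partial agreement with the consensus
```

The positional score gives a rough measure of homology for sites that deviate from the consensus at one or two positions, which a strict pattern match would miss.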
The Sp1-like
transcription factors seem to have an essential role in the regulation of
the transcription activity of the syndecan-1 gene. The significance of Sp1
for the transcriptional activity has been studied in detail, for example,
in the SV40 promoter (48, 49, 50), in the
hamster dihydrofolate reductase promoter (51, 52, 53), and in the rat transforming growth factor- promoter (54). In all of these promoters the Sp1 elements are required for
efficient transcription. Although Sp1 was first thought to be a ubiquitous
transcription factor chiefly regulating housekeeping genes, recent data
indicate that its expression level varies severalfold in different cells
and tissues, especially during development (55). Interestingly,
the binding activity of Sp1 is also regulated by phosphorylation, as
demonstrated during terminal differentiation of the liver (56). The
versatility of Sp1 in transcription regulation has been further expanded
by the demonstration that Sp1 also functions through a class of
co-activators (57). The co-activators connect
trans-activators to the general initiation complex, and it is
assumed that they could also exert a negative effect. Recently Sp1 has
been shown to have an essential role in cell type-specific gene regulation
or in hormone/growth factor-mediated transcription regulation. For
example, mutation of the Sp1 site in the keratinocyte-specific rabbit K3
keratin promoter resulted in a 50% loss of promoter activity (58), transforming growth factor-β
was shown to stimulate α2(I) collagen gene expression by increasing the affinity of an Sp1-containing
protein complex for the promoter (59), and the
induction of cathepsin D gene expression by estrogen was found to be
mediated by an estrogen receptor-Sp1 complex (60). In addition,
Li and co-workers (61) have demonstrated that the cellular
transcription factor Sp1 and the bovine papillomavirus type 1 enhancer
protein E2 synergistically activate transcription from the viral promoter.
By electron microscopy, the DNA was shown to make a loop between the
enhancer element and the Sp1 complex, which suggests that Sp1, by
physically interacting with the E2 protein, brings the enhancer protein into an
appropriate position to influence transcription.
Structural organization and functional analyses of several matrix proteoglycan promoters have recently been reported. The published promoter sequences include those of mouse aggrecan (62), human versican (63), human decorin (64, 65), human biglycan (66), human perlecan (67), and mouse and human serglycin genes (68, 69). Of these, the promoters of the extracellular proteoglycans perlecan and biglycan most closely resemble that of the mouse syndecan-1 gene. The promoter region and 5'-end of the perlecan gene were located in a CpG island. The 5'-flanking region of the biglycan gene was also GC-rich. In both genes no canonical TATA or CAAT boxes were found, but several Sp1 transcription factor binding sites were located within the first 200 bp of the promoter. In the perlecan gene five transcription initiation sites were dispersed around the GC boxes, but in the biglycan gene only a single transcription initiation site was found. In the mouse aggrecan gene, transcription was also initiated from four separate sites and no TATA sequence was found, but otherwise the promoter structures of the extracellular proteoglycans aggrecan, versican, and decorin did not share notable homology with the mouse syndecan-1 promoter. Functional analysis of these promoters is currently limited and does not allow conclusions about their cell specificity. The promoter region of an intracellular proteoglycan, serglycin, has been functionally analyzed (70). The cell-specific regulatory elements were present in the 250-bp promoter fragment; however, no homology to the mouse syndecan-1 promoter was found.
In summary, we have identified functional regions of the mouse syndecan-1 promoter that are responsible for constitutive gene expression in epithelial cells. The transcription initiation sites behaved like initiator elements of TATA-less promoters. The upstream region contained several functional GC and GT boxes, which bound members of the Sp1 gene family. This work provides a basis for further studies aimed at characterizing syndecan-1 gene suppression during malignant transformation and the formation of carcinomas.
The nucleotide sequence reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number Z22532.