(Received for publication, January 27, 1995; and in revised form, July 7, 1995)
From the
Three reporter genes, the chloramphenicol acetyltransferase (CAT) gene, the lacZ gene, and the intronless NF-L DNA, were used to test the activity of the proximal promoter region (-292 bp) of the human neurofilament light (hNF-L) gene in transgenic mice. Surprisingly, the hNF-L/CAT construct was highly sensitive to position effect, and its expression was found at low levels in several tissues of adult transgenic mice (Beaudet, L., Charron, G., Houle, D., Tretjakoff, I., Peterson, A., and Julien, J.-P. (1992) Gene (Amst.) 116, 205-214). In contrast, the hNF-L/lacZ and the hNF-L/intronless constructs were expressed exclusively in the nervous system during embryonic development and in adult animals. DNA sequence analysis of the different reporter genes revealed the presence of matrix attachment regions (MARs) within the 3`-untranslated regions of all three transgenes. DNA unwinding elements were found within the MARs of the lacZ and hNF-L gene constructs but not in the CAT gene construct. When this element was removed from the lacZ construct, expression of the hNF-L/lacZ transgene became susceptible to position effect and was no longer tissue-specific. These results indicate that DNA unwinding elements are essential for the position effect independence conferred by MARs on the hNF-L basal promoter.
A variety of genes, such as those coding for luciferase (2), β-galactosidase (lacZ) (3), chloramphenicol acetyltransferase (CAT) (4), growth hormone (5), or alkaline phosphatase (6), have been extensively used as reporter genes because they possess enzymatic activities that are easily detectable either in vitro or in vivo. With the availability of transgenic mouse technology, reporter-containing vectors have been developed for applications as diverse as analysis of enhancer and promoter function (7, 8, 9, 10) and cell lineage studies (11). Transgenic mice have proven to be powerful tools for delimiting the regions responsible for tissue-specific (12), developmental (13), or stimulatory (14) patterns of expression.
When a transgene integrates into a host genome, its expression pattern might be affected by elements localized near the insertion site (15). This position effect has been related to the absence of insulator sequences responsible for the establishment of a higher order chromatin structure called the loop domain (16). Loop domains may represent independent transcription units, with genes located within a loop being subject to the regulatory environment inside the loop and insulated from the environment outside it (17, 18). Loops are maintained on the nuclear matrix by MARs located at both extremities, which are called domain boundaries (19). Even though domain boundaries do not seem to have transcriptional regulatory properties of their own, their insulating effect on expression has been demonstrated as new domain boundaries have been characterized (20, 21, 22, 23, 24, 25). It has been shown that not only position-independent expression but also finer regulation of the developmental and stimulatory pattern of transgene expression can be achieved when transgenes are flanked by MARs (26). Recently, compilation and comparison of MAR sequences have allowed the assessment of many MAR characteristics (27). MARs range from 250 bp to several kb in length, are enriched in A/T nucleotides, bind to the nuclear matrix in vitro and in vivo, and often contain consensus topoisomerase II cleavage sites; single-stranded, kinked, or curved DNA; and potential binding sites for homeobox-containing DNA-binding proteins. Sometimes MARs colocalize with replication origins (28, 29), regions implicated in transcriptional regulation (30, 31), and regions of nuclease hypersensitivity (22). Considerable effort has been made to delimit the minimal requirements for a MAR to have an effect on transcription. Although binding to the nuclear matrix seems to be a prerequisite, this criterion is insufficient by itself to ensure an effect on transcription.
Stable transfection and mutagenesis experiments have shown that sequences with unwinding capability play an important role in transcriptional activation (32). However, their influence on tissue-specific and developmental transcriptional activation has yet to be determined.
The proximal promoter region (positions -292 to +15 bp) of the human neurofilament light gene (hNF-L) is not sufficient to drive high level, neuron-specific expression of a CAT reporter gene in transgenic mice (1). However, nervous system-specific expression is obtained with the same promoter linked to a lacZ reporter gene or an hNF-L/intronless gene. These results prompted us to look for functional similarities between the lacZ and the hNF-L/intronless reporter genes that could explain their correct tissue-specific expression. Our results demonstrate that unwinding elements colocalize with MARs found in the 3`-untranslated region (UTR) of the lacZ and the hNF-L/intronless reporter genes. Deletion of a DNA segment containing these unpaired DNA regions abolishes tissue-specific expression of the hNF-L/lacZ transgene in transgenic mice. We propose that unwinding elements are one of the important features required for a MAR to insulate the hNF-L basal promoter from position effect.
To delete all three introns of the hNF-L gene, we took advantage of a BglII restriction site in the middle of exon 1 that is conserved between the human and mouse genes and of an EcoRI restriction site near the polyadenylation signal that is also conserved in both genes. First, a 4.9-kb BamHI-XbaI fragment that includes the hNF-L gene (1) was subcloned into a Bluescript pSK+ vector to create the plasmid pSKhNF-L. pSKhNF-L was digested to completion with BglII and EcoRI. A 1,220-bp BglII-EcoRI fragment was isolated from the mouse cDNA clone (33) and ligated to the open pSKhNF-L vector, creating the plasmid phNF-L/intronless.
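The fragment-swap strategy above relies on knowing where BglII and EcoRI cut. As a minimal sketch, assuming a linear molecule and complete digestion, the following locates recognition sites and returns fragment lengths. The recognition sequences and cut offsets are the standard ones for these enzymes; the `toy` sequence and the function names are hypothetical and only illustrate the bookkeeping, not the actual hNF-L sequence.

```python
# Recognition sequence and top-strand cut offset for enzymes named in the text.
ENZYMES = {
    "BglII": ("AGATCT", 1),    # A^GATCT
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "XbaI": ("TCTAGA", 1),     # T^CTAGA
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "ScaI": ("AGTACT", 3),     # AGT^ACT (blunt)
}


def cut_positions(seq, enzyme):
    """Return 0-based top-strand cut positions of one enzyme in seq."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def digest(seq, enzymes):
    """Fragment lengths of a linear molecule after complete digestion."""
    cuts = sorted({p for e in enzymes for p in cut_positions(seq, e)})
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical toy sequence with one BglII site and one EcoRI site.
toy = "CCCC" + "AGATCT" + "TTTTTTTT" + "GAATTC" + "GGGG"
print(digest(toy, ["BglII", "EcoRI"]))
```

A circular plasmid such as pSKhNF-L would need the first and last fragments joined, which this sketch deliberately leaves out.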
The hNF-L/lacZ construct was
directly isolated by complete BamHI digestion of the plasmid
phNF-L/lacZ and microinjected. Only the basal
sequences of SV40 that allow correct polyadenylation were present in
this 4,035-bp fragment. This fragment includes no plasmid sequences.
A pair of oligonucleotides specific for the first hNF-L exon was used; their sequences are 5`-GCACATCTCCAGCGTGCGT-3` (primer 522) and 5`-AGATCTGCGCGTACTGGATCTGCGCC-3` (primer 784). A second pair of oligonucleotides specific for the mouse and rat G3PDH mRNAs was used to quantify the amount of RNA in each sample; their sequences are 5`-TTGATGGTATTCGAGAGAAGGG-3` (primer 808) and 5`-TCCAGGAGCGAGATCCCGTC-3` (primer 809).
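Basic properties of the four primers listed above can be checked with a few lines of code. The sketch below computes length, GC fraction, and a rough melting temperature using the Wallace rule, Tm = 2(A+T) + 4(G+C) °C. That rule is a crude estimate intended for short oligonucleotides and is an assumption of this example; the source does not state how the primers were designed. The primer sequences are taken from the text; the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "522": "GCACATCTCCAGCGTGCGT",
    "784": "AGATCTGCGCGTACTGGATCTGCGCC",
    "808": "TTGATGGTATTCGAGAGAAGGG",
    "809": "TCCAGGAGCGAGATCCCGTC",
}


def gc_fraction(seq):
    """Fraction of G and C bases in seq."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough Tm in degrees C by the Wallace rule: 2(A+T) + 4(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 2), wallace_tm(seq))
```

Primer 784 begins with AGATCT, the BglII recognition sequence, which is palindromic and therefore reads the same on the complementary strand.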
The first strand synthesis was performed by adding 8 µl of 1.25 mM dNTP mix and 250 ng of primers 784 and 808 in the presence of 30 units of RNAguard and 20 units of avian myeloblastosis virus reverse transcriptase. The mixture was incubated at 42 °C for 1 h, and the reaction was then stopped at 95 °C for 5 min. Each sample was subjected to two PCR reactions, one specific for the hNF-L transcript using primers 522 and 784 and the other for the G3PDH transcript using primers 808 and 809; the latter serves as an internal control. PCR reactions were done according to the manufacturer's instructions (New England Biolabs) using an aliquot of the cDNA mix as template, 0.20 µCi of [α-32P]dATP, and 1 unit of VENT DNA polymerase (New England Biolabs). After 25 cycles, aliquots of the hNF-L PCR products and of the G3PDH PCR products were separated by electrophoresis on a 1% agarose gel, blotted, and autoradiographed.
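Because the G3PDH reaction serves as an internal control for RNA input, a transgene signal is typically interpreted relative to the control signal from the same sample. The following is a minimal sketch of that normalization. The band intensities, tissue names, and function name are hypothetical placeholders, not measurements from this study, and the source does not describe a quantitative densitometry step.

```python
def normalize_to_control(target, control):
    """Return the target/control signal ratio for each tissue.

    Tissues with a missing or zero control signal are skipped, because
    the ratio is undefined without a usable internal control.
    """
    ratios = {}
    for tissue, signal in target.items():
        reference = control.get(tissue, 0.0)
        if reference > 0:
            ratios[tissue] = signal / reference
    return ratios


# Hypothetical band intensities in arbitrary units (illustration only).
hnfl_signal = {"brain": 900.0, "spinal cord": 600.0, "liver": 10.0}
g3pdh_signal = {"brain": 1000.0, "spinal cord": 800.0, "liver": 1000.0}

for tissue, ratio in normalize_to_control(hnfl_signal, g3pdh_signal).items():
    print(f"{tissue}: {ratio:.2f}")
```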
Nuclear matrix isolation was performed according to Cockerill and Garrard (30), except that EDTA was replaced by EGTA in all the buffers. The nuclei suspension was adjusted to a final concentration of 1 mM CaCl2 and was digested with 100 µg/ml of DNase I for 1 h at 24 °C. After centrifugation for 10 min at 750 × g, pellets were resuspended in RSB with 0.25% sucrose, and an equal volume of solution containing 20 mM Tris-HCl (pH 7.4), 4 M NaCl, and 20 mM EGTA was added. After a 10-min incubation, the nuclei were centrifuged at 1,500 × g for 15 min. Pellets were extracted twice by suspension in 10 mM Tris-HCl (pH 7.4), 2 M NaCl, 0.5 mM phenylmethylsulfonyl fluoride, and 0.25 mg/ml bovine serum albumin and centrifugation at 4,500 × g for 15 min. The resulting nuclear matrices were washed three times with 10 mM Tris-HCl (pH 7.4), 50 mM NaCl, 1 mM MgCl2, and 0.25 mg/ml bovine serum albumin by centrifugation for 30 s at 10,000 × g.
Using three different reporter genes (Fig. 1A), very different conclusions regarding the
strength, the tissue-specificity, and the developmental expression
driven by the basal hNF-L promoter were reached. Fig. 1B summarizes the expression pattern
generated in transgenic mice with each hNF-L/reporter gene
construct. The hNF-L/lacZ and the hNF-L/intronless transgenes were correctly expressed in
neuronal cells; examples are shown in Fig. 1, C and D, respectively. However, the same promoter was not sufficient
to direct CAT expression in those cells(1) . The striking
finding from those studies is that the different reporter genes exerted
some influence on the activity of the hNF-L promoter.
Figure 1: Schematic representation of the three hNF-L/reporter gene constructs and their relative expression in transgenic mice. A, schematic representation of the three constructs. The black boxes indicate the hNF-L promoters (positions -292 to +15). The white boxes represent the reporter gene sequences; the 3`-UTR of the reporter gene is underlined. The other boxes identify the origin of the various parts of the 3`-UTR. The shaded box represents the SV40 region 4713-4104 on the SV40 map (70); the hatched box represents SV40 region 2774-2533 for hNF-L/CAT or 2666-2533 for hNF-L/lacZ; the cross-hatched box represents SV40 region 2533-1782. The hNF-L/CAT construct consists of the Tn9 chloramphenicol acetyltransferase gene followed at its 3` end by the SV40 intron from the small t-antigen gene, which is flanked on each side by coding sequences. This segment is linked to the SV40 polyadenylation signal-containing region. The hNF-L/lacZ construct is derived from plasmid pCH110 (Pharmacia). Its 3`-UTR contains an SV40 DNA fragment that comprises the polyadenylation signal region. The hNF-L/intronless transgene was generated using restriction sites conserved between the human NF-L gene and the mouse cDNA. We deleted as a block the region from the middle of exon 1 up to the first polyadenylation signal of the human NF-L gene and replaced it with the corresponding coding sequences from the mouse NF-L cDNA, making a hybrid NF-L/intronless transgene (detailed under "Experimental Procedures"). No SV40 DNA sequence was added to this hNF-L/intronless construct, and polyadenylation occurs from the first endogenous hNF-L poly(A) signal (71). B, BamHI; Bg, BglII; Hd, HindIII; RI, EcoRI; Sc, ScaI; Xb, XbaI. B, relative expression pattern of the three hNF-L/reporter gene constructs in transgenic mice. Expression was judged to be in the nervous system when it colocalized with nervous system structures. C, X-gal staining of a whole-mount hNF-L/lacZ embryo. β-Galactosidase (lacZ) staining of one transgenic embryo expressing the transgene. The embryo was analyzed 13.5 days after microinjection, and pictures were taken from side, back, and front views. Positive tissues are identified by an X. D, reverse transcription PCR expression analysis of the hNF-L/intronless transgene. RNAs from various tissues of transgenic mouse line 38 were reverse transcribed prior to the PCR. Two sets of oligonucleotides were used: the first set corresponds to oligonucleotides specific for the first exon of the hNF-L gene, and the second set corresponds to the mouse G3PDH gene and is used as an internal control. Upper panel, amplified products from an hNF-L/intronless transgenic mouse mRNA; lower panel, amplified products from G3PDH mRNA.
Figure 2: A/T content of the three hNF-L/reporter gene constructs. All constructs were analyzed for their A/T content on segments of either functional significance, like promoter, coding region, and UTR, or segments of different species origin. Schematic representation of each transgene is placed at 50% on the x axis while the y axis represents the percentage of A/T. A, hNF-L/intronless construct. B, hNF-L/CAT construct. C, hNF-L/lacZ gene construct.
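The per-segment A/T analysis described in the Figure 2 legend can be sketched as follows. This is an illustrative calculation, not the authors' software: the segment names, coordinates, and toy sequence are hypothetical, and a sliding-window variant is included only as a common alternative view of the same quantity.

```python
def at_content(seq):
    """Percentage of A and T bases in seq."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def at_by_segment(seq, segments):
    """A/T percentage for named segments given as (name, start, end).

    Coordinates are 0-based and half-open, like Python slices.
    """
    return {name: at_content(seq[start:end]) for name, start, end in segments}


def sliding_at(seq, window, step=1):
    """A/T percentage in successive windows, as (start, percent) pairs."""
    last_start = len(seq) - window
    return [(i, at_content(seq[i:i + window])) for i in range(0, last_start + 1, step)]


# Hypothetical 30-bp toy construct with three annotated segments.
toy = "GCGCGCGCGC" + "ATGCATGCAT" + "AATTAATTAT"
segments = [("promoter", 0, 10), ("coding", 10, 20), ("3'-UTR", 20, 30)]
print(at_by_segment(toy, segments))
```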
Every DNA region of the hNF-L/CAT construct bound to the brain nuclear matrix, but with different affinities (Fig. 3A). The SV40 polyadenylation sequence (band 3) presents the strongest binding signal, followed by the CAT gene itself (band 2) and the hNF-L promoter (band 1), which is only weakly bound. Poly(dA·dT) is a better competitor than unrelated E. coli genomic DNA (Fig. 3A, lanes 8 and 15), indicating that A/T-rich sequences are responsible for specific nuclear matrix DNA attachment, as expected (30, 38). Fig. 3C shows the binding pattern of the hNF-L/lacZ construct DNA fragments. Three regions are attached to the nuclear matrix in the presence of nonspecific competitor DNA. These regions correspond to the hNF-L promoter (band 1), the lacZ 3`-UTR (band 2), and the SV40 flanking sequences (band 3), respectively. The lacZ gene itself was not attached to the nuclear matrix. The lacZ 3`-UTR has the highest affinity for the nuclear matrix, and its attachment is not competed by large quantities of nonspecific competitor DNA, whereas the promoter region and the SV40 flanking sequences were competed away. For the hNF-L/intronless construct (Fig. 3E), two regions attached to the nuclear matrix. The first corresponds to the promoter and first hNF-L exon (band 1), whereas the second corresponds to the 3`-UTR (band 2). The latter band has the highest affinity for the nuclear matrix, and binding is efficiently competed by poly(dA·dT). For all these constructs, identical results were obtained with brain or liver nuclear matrices. Fig. 3 (B, D, and F) summarizes the results for each construct. In conclusion, the 3`-UTR regions of all three reporter genes are bound to the nuclear matrix with high affinity.
Figure 3: In vitro nuclear matrix binding assays. Binding assays were performed by incubating labeled DNA probes with nuclear matrices purified from adult C3H mouse brain and liver. Increasing concentrations of E. coli genomic DNA (125-1,000 µg/ml) were used as nonspecific competitor with or without poly(dA·dT) (250 µg/ml). Isolated DNA from the pellet fraction was run on an agarose gel in parallel with 20% of input DNA probe (lanes 1) and autoradiographed (A, C, E). Under each panel, a schematic representation of the constructs summarizes the binding pattern. The thickness of the band underlining the construct region is indicative of the relative attachment in the binding assays (B, D, F). A, the plasmid phNF-L/CAT construct was digested with HindIII, ScaI, and BamHI restriction enzymes. DNA fragments were labeled and served as probes. This digestion pattern generated bands at 320 bp (band 1), 657 bp (band 2), 972 bp (band 3), and 2.8 kb, corresponding respectively to the hNF-L promoter, the CAT gene, the SV40 polyadenylation signal, and the pUC-9 plasmid. C, the plasmid phNF-L/lacZ was digested with HindIII, EcoRI, and BamHI restriction enzymes prior to end-labeling. Bands at 320 bp (band 1), 3.7 kb, 450 bp (band 2), 750 bp (band 3), and 2.95 kb correspond respectively to the hNF-L promoter, the lacZ gene, the 5` segment of the 3`-UTR, the 3` segment of the 3`-UTR, and the Bluescript plasmid. E, the plasmid phNF-L/intronless was digested with EcoRI, BglII, and XbaI restriction enzymes prior to end-labeling. This generated bands at 1,120 bp (band 1), 1,220 bp, 485 bp (band 2), and 2.95 kb, corresponding respectively to the hNF-L promoter and first exon, mNF-L exons 2, 3, and 4, the hNF-L 3`-UTR, and the Bluescript plasmid.
Figure 4: Localization of unwinding elements in each hNF-L/reporter gene construct. Plasmids phNF-L/intronless, phNF-L/lacZ, and phNF-L/CAT were tested for the presence of unwinding elements. 25 µg of supercoiled plasmid was treated with 0, 1, 5, and 10% chloroacetaldehyde (lanes 1-4). After linearization of each vector at the 5` end of the promoter region with the appropriate restriction enzyme, plasmids were digested with S1 nuclease to remove single-stranded regions. Digestion products were separated by electrophoresis, hybridized with a 307-bp probe recognizing the hNF-L promoter, and autoradiographed. ori indicates the band generated by the bacterial ColE1 origin of replication. The positions of unwinding elements are indicated by arrows. Positions of double-stranded DNA size markers are indicated on the left.
Figure 5: X-gal staining of whole-mount hNF-L/lacZ embryos. β-Galactosidase (lacZ) staining of the two transgenic embryos expressing the transgene. Embryo 1 (a, b, and c) and embryo 2 (d, e, and f) were analyzed 13.5 days after microinjection, and pictures were taken from the side (a and d), the back (b and e), and the front (c and f). Positive tissues are identified by an X. These two embryos correspond to different insertion events.
The hNF-L basal promoter does contain regulatory elements that confer neuronal specificity. As demonstrated in transgenic mice, a region at the very 5` end of the promoter, between -292 and -190 bp, is necessary for neuronal expression (45). This -292-bp basal promoter was found to be sufficient to allow expression of the lacZ reporter gene and the hNF-L/intronless gene in a correct tissue-specific and developmental manner (Fig. 1B). However, CAT reporter gene expression under the same basal hNF-L promoter is highly susceptible to insertion site influence and is not neuron-specific (1).
All three transgenes have the same basal hNF-L promoter, and therefore their patterns of expression cannot be fully explained by the influence of this promoter. Clearly, other unknown elements within the reporter genes themselves interfere with the expression of these transgenes. The presence of regulatory elements such as enhancers and silencers in the CAT or the lacZ reporter genes is unlikely, because numerous studies indicate that these genes are neutral with respect to transcriptional activation. In fact, both CAT and lacZ genes have been extensively used in enhancer and silencer trap vectors (reviewed in (46) and (47)). MARs are likely candidates to explain the apparent discrepancies in our analysis of the NF-L promoter (23, 48).
The chromatin of
interphase nuclei is organized into topologically constrained loops
averaging 80-90 kb in length that are attached to the nuclear
matrix(19, 49) . This DNA organization seems to be
important not only in the compaction of the chromatin fiber but also
for the utilization of genetic information. Each domain can define an
independent unit of gene activity insulated from the regulatory
influences of adjacent domains(18) . MARs have been found in
the 5`- and 3`-flanking regions of fushi-tarazu, sgs-4, and
alcohol dehydrogenase genes of Drosophila(19, 21, 50) , the J-C
intron of the mouse immunoglobulin
gene(30, 51) , the first intron of the human HPRT gene, the chicken lysozyme gene, the human interferon-
gene,
the human and murine
-globin gene, the chicken
- and
-globin gene, and the apolipoprotein B gene (reviewed in (27) ). MARs enhance general promoter functions in an
orientation- and partially distance-independent manner, and their
effect is restricted to the integrated state of transfected
templates(23) . These findings strongly suggest that some MARs
might serve as a crucial control point for gene regulation.
Our results are indicative of nuclear matrix attachment through A/T stretches, as described for many well characterized MAR sequences (38). For the hNF-L/CAT transgene, the MAR coincides with SV40 sequences already reported by Pommier et al. (52) as having the highest in vitro affinity for the nuclear matrix in the entire SV40 genomic DNA. The SV40 sequences present in the hNF-L/lacZ transgene bind to the nuclear matrix with a lower affinity (52). In this construct, the region having the highest affinity for the nuclear matrix is localized in the 3`-UTR of the lacZ reporter gene. In addition, these experiments enabled us to identify a novel MAR element in the 3`-UTR of the hNF-L gene. By comparing the sequences of the hNF-L/lacZ and hNF-L/intronless MARs, we tried to identify similarities that are absent from the hNF-L/CAT transgene. Apart from their A/T-rich sequences, no highly conserved sequence motif was found that could explain their common neuronal expression pattern.
One intrinsic property of A/T-rich sequences is that they unpair easily (53). Single-stranded DNA regions are formed by opening of the double helix under torsional stress in different paranemic structures (reviewed in (54)). Sequences composed of a minimum of 15-20 consecutive purines on one strand tend to adopt that state (55, 56, 57, 58, 59, 60, 61). In vivo, unpaired sequences can be identified by digestion of nuclei with S1 nuclease, which generates a pattern of single-strand nuclease-sensitive sites. This mapping assay, done on different gene loci, has revealed that single-strand nuclease-sensitive sites are generally found in promoter regions (62, 63), within genes (59, 64), or at the 3` end (65). Their occurrence on chromatin coincides with regions of transcriptional activity (66, 67, 68). It has been demonstrated that DNA strand unpairing can be induced by supercoiling and that its initiation is independent of external protein factors (66). This characteristic has allowed us to test directly for the presence of unpaired sequences in supercoiled plasmids containing the different MARs. Sequences with unwinding properties were present only in plasmids containing NF-L fusion constructs correctly expressed in the nervous system of transgenic mice. They were found within the associated MARs and colocalized with polypurine sequences.
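The sequence criterion mentioned above, a run of at least 15-20 consecutive purines on one strand, lends itself to a simple scan. The sketch below is an illustration under stated assumptions, not the method used in this study: the default threshold of 15, the function name, and the toy sequence are all hypothetical choices. A pyrimidine run on the given strand is reported as a purine run on the complementary strand.

```python
import re


def purine_tracts(seq, min_len=15):
    """Find runs of at least min_len consecutive purines on either strand.

    Returns (start, end, strand) tuples with 0-based half-open coordinates
    on the given strand. "+" marks a purine (A/G) run on the given strand;
    "-" marks a pyrimidine (C/T) run, which is a purine run on the
    complementary strand.
    """
    seq = seq.upper()
    pattern = re.compile(r"[AG]{%d,}|[CT]{%d,}" % (min_len, min_len))
    hits = []
    for match in pattern.finditer(seq):
        strand = "+" if match.group()[0] in "AG" else "-"
        hits.append((match.start(), match.end(), strand))
    return hits


# Hypothetical toy sequence: a 17-nt purine run and a 15-nt pyrimidine run.
toy = "CATC" + "AGGAAGGGAGAAGGAGA" + "TCGA" + "TCTTCCTCTTCCTCT" + "GACG"
for start, end, strand in purine_tracts(toy):
    print(start, end, strand, toy[start:end])
```

A sequence scan of this kind only flags candidates; unwinding itself was assayed experimentally in supercoiled plasmids, as described in the Figure 4 legend.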
The hNF-L/intronless and
the hNF-L/lacZ transgenes are basically not
susceptible to insertion site interference for their tissue-specific
expression, whereas the hNF-L/CAT construct is
highly sensitive to position effect (1). A conceivable
explanation for the different expression patterns of these hNF-L/reporter gene constructs in transgenic mice is that the
different transgenes are in different chromatin conformations, the hNF-L/intronless and the hNF-L/lacZ transgenes being in independent loop domains while transcription
of the hNF-L/CAT transgene is dependent on the integration
site because it does not carry the information for self-modulation of
chromatin structure.
Unwinding elements within the MARs were found only in the two NF-L transgenes correctly expressed in the nervous system. Our results show that when this element is deleted from the lacZ reporter gene, only two out of seven transgenic embryos expressed the transgene, and did so in a nonspecific fashion, in contrast to six out of seven transgenic embryos correctly expressing the complete hNF-L/lacZ construct. This indicates that removal of the unwinding element made expression of the lacZ transgene insertion-dependent. Using a lacZ transgene with a deleted 3` end, Allen et al. (69) have shown that one out of six transformed embryonal stem cells expressed a hsp/lacZ transgene, this low proportion of expressing versus nonexpressing cells being caused by rare insertion events near activating sequences.
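The embryo counts quoted above (six of seven versus two of seven) are small, so a reader may want to see how an exact test treats them. The sketch below implements a two-sided Fisher exact test with only the standard library. It is offered as an illustration: the text reports no statistical test, and the choice of test and of how to tabulate the counts are assumptions of this example. Two tabulations are shown, one counting any expression and one counting only correct nervous system expression.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def weight(x):
        # Unnormalized probability of a table with x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    extreme = sum(weight(x) for x in range(lowest, highest + 1)
                  if weight(x) <= observed)
    return extreme / comb(total, col1)


# Rows: complete construct, deleted construct.
# Any expression: 6 of 7 versus 2 of 7 embryos.
p_any = fisher_exact_two_sided(6, 1, 2, 5)
# Correct (nervous system-specific) expression: 6 of 7 versus 0 of 7.
p_correct = fisher_exact_two_sided(6, 1, 0, 7)
print(f"any expression: p = {p_any:.4f}")
print(f"correct expression: p = {p_correct:.4f}")
```

With samples this small, the outcome depends strongly on which tabulation is used, so such a test complements the qualitative comparison and does not replace it.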
The same phenomenon is likely to be observed with other basal tissue-specific promoters. A good strategy for assessing the transcriptional characteristics of a promoter would be to test it with and without a MAR containing unwinding elements. The lacZ reporter gene from pCH110 is a good choice because the influence of the unwinding element can be removed easily by a BamHI digestion. The addition of MARs with DNA unwinding elements should be considered when designing transgenes or viral vectors because it might circumvent some of the expression problems usually encountered.