(Received for publication, November 2, 1995; and in revised form, January 31, 1996)
The roles of each DNase hypersensitive site (HS), and the DNA
sequences between them, in the activity of the locus control region of
the mammalian β-globin gene domain were examined by placing human
and rabbit restriction fragments containing the cores of HS2, HS3, HS4,
and HS5, along with varying amounts of flanking DNA, upstream of a
hybrid ε-globin-luciferase reporter gene and testing for effects on
expression both prior to and after integration into the chromosomes of
K562 cells, a human erythroid cell line. Prior to integration,
fragments containing HS2 enhanced expression to the greatest extent,
and the modest enhancement by some fragments containing HS3 correlated
with the presence of a well-conserved binding site for AP1/NFE2. The
stronger effects of larger locus control region DNA fragments in clones
of stably transfected cells indicate a role for sequences outside the
HS cores after integration into the genome. The strong effect of a
1.9-kilobase HindIII fragment containing HS3 after, but not
prior to, integration argues for the presence of a chromatin
domain-opening activity. Use of a rabbit DNA fragment containing both
HS2 and HS3 demonstrated a synergistic interaction between the two HSs
when their natural context and spacing are preserved.
The expression of the β-like globin gene cluster in birds
and mammals is regulated both by proximal elements, such as promoters,
and by a distal locus control region, or LCR. The LCR will
greatly increase the level of expression of linked reporter genes in
erythroid cells and will allow expression of constructs regardless of
their position of integration in the genome of transgenic
mice (1). Deletion of the LCR leaves the gene cluster in a
chromatin conformation that is inaccessible to DNase I (2).
Hence, several functions have been proposed for the LCR, including
enhancement, insulation from position effects, and activation of a
chromosomal domain (reviewed in Refs. 3 and 4).
The LCR is a large regulatory region covering about 20 kb of DNA at the 5′
end of the gene cluster. It is marked by five DNase hypersensitive sites,
HS1-HS5 (5, 6). The number and spacing of these
sites are very well conserved in
mammals (7, 8, 9, 10, 11).
The positions of DNase I cleavage have been mapped precisely, and the
minimal sequences required for position-independent expression of a
linked β-globin gene in transgenic mice have been established for
HS2 (12, 13, 14),
HS3 (15, 16), and HS4 (17, 18). These
200-400-bp minimal regions will be referred to as the cores of
the HSs. The DNA sequences within the cores are very well conserved,
especially for HS2. Prominent conserved blocks of sequence
are also found outside the cores of the
HSs (7, 11, 19), but the functional
significance of these blocks is largely untested at present.
Numerous studies have examined the roles of individual HSs in various expression
assays. HS1, -2, -3, and -4 will produce position-independent
expression of linked globin genes in transgenic mice, each with a
characteristic level of expression and a preferred developmental
stage (20, 21). In transfected erythroid cells, HS2
will enhance expression of β-like globin genes both transiently
from unintegrated constructs (22, 23) and after stable
integration into a chromosome (12, 24). HS3 will
enhance expression stably in MEL cells (9, 24, 25) and transiently in K562
cells (11). HS4 has a much stronger effect in transgenic mice
than in stably transfected MEL cells (17) and will generate a
DNase hypersensitive site in transgenic mice (18). HS5 has been
implicated in insulation from some position
effects (26, 27), analogous to the situation in the
chicken β-like globin gene cluster (28). It is not yet
clear how these individual activities lead to the full effect of the
LCR, and it is likely that the HSs act together for some or perhaps all
functions. Indeed, the quantitative effects of the individual HSs on
expression in transgenic mice are markedly less than the effects of the
entire LCR (e.g. Refs. 13, 17, 24, and 29), and Collis et
al. (24) have shown that restriction fragments containing
the cores of three HSs, but not two HSs, are needed for levels of
expression in stably transfected cells approaching that of the entire
LCR. The full activity of a specific function, such as domain opening
or enhancement, may require some combination of HSs, possibly including
DNA sequences between the HS cores.
In order to examine the effects
of sequence elements both inside and outside the cores of the HSs on
expression, restriction fragments of increasing lengths have been added
to an ε-globin-luciferase hybrid gene and tested in the human cell
line K562, which produces ε-, γ-, ζ-, and α-globins.
Tests of expression of the reporter gene prior to integration
(transient expression) reveal enhancement function, and measurement of
levels of expression after integration into a chromosome (stable
expression) should show the effects of enhancement, domain opening, and
perhaps protection from some position effects. However, the use of a
selectable marker in the stable transfection assays precludes the
observation of strong negative position effects, since clones showing
such negative effects would not express the drug resistance gene.
Results reported here show that sequences outside the cores of HS2 and
HS3 do increase the level of expression in stably integrated
constructs. A large DNA fragment containing HS3 has a chromatin
domain-opening activity. A striking synergism is seen between rabbit
HS2 and HS3 when their natural spacing and LCR sequences between them
are preserved, and this effect is dependent on sequences outside the
cores. Only the HS5 region showed an appreciable effect on protection
from position effects in these assays.
Figure 1:
Map of LCR fragments and the ε-luciferase expression vector. The top line is a
schematic of the human β-globin gene cluster. The next set of
lines shows the locations of the restriction fragments containing
segments of the human and rabbit LCR, indicated underneath the thick
lines showing the positions of the DNase HSs. Each of these fragments
was placed 5′ to the ε-globin-luciferase hybrid reporter gene,
diagrammed on the bottom line.
A map of the LCR indicating the positions
of the fragments tested is shown in Fig. 1. Table 1 provides details about the restriction endonuclease
cleavage sites used to add the LCR fragments to the expression vectors,
along with an indication of the orientation of the insert in the
expression plasmid. The sequenced regions of the rabbit β-globin
LCR are very similar to those of the human (11), and Table 1 lists both the rabbit DNA positions and the aligning human
sequence in GenBank file HUMHBB. The alignment allows one
to refer to the LCR positions in terms of the human DNA sequence, even
for DNA segments from other mammals (33).
To produce individual stably expressing clones of K562
cells, 1 10
cells were electroporated with 100
µg of total DNA containing 90 µg of linearized test plasmid and
10 µg of linearized pM5neo as a selectable marker. The plasmid
pM5neo contains the Tn5 phosphotransferase gene driven by the long
terminal repeat of the myeloproliferative sarcoma virus, thereby
conferring resistance to the drug G418 in hematopoietic
cells (36). G418 (final concentration of 1.2 mg/ml) was added
24 h after electroporation to begin drug selection. Individual stably
expressing clones were plucked from soft agar 2 weeks after
electroporation and grown another 4-6 weeks until there were
enough cells to prepare both cell extract and genomic DNA. Transgene
expression was determined using luciferase assays as described above.
Transgene copy number was determined by hybridizing a luciferase DNA
probe to Southern blots containing 10 µg of genomic DNA from each
clone digested with PstI plus EcoRV. Copy number
standards containing the equivalent of 1, 10, and 100 copies of
pBS
-luc in 10 µg of
-DNA were included on each blot. For
each clone, the expression per copy of the transfecting gene was
calculated by dividing the luciferase activity by the number of copies
of the ε-luciferase construct in the transfected cells. The test
constructs are subject to strong positive position effects after
integration, occasionally resulting in a small number of individual
clones producing much more luciferase than do the rest of the clones.
In order to reduce the impact of these outliers on the summary value
for the group of clones, the geometric mean is reported in the
appropriate tables and figures.
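The per-copy calculation and the geometric-mean summary described above can be illustrated with a minimal Python sketch. The function names and the clone values are hypothetical and chosen only for illustration; they are not data from this study.

```python
import math


def expression_per_copy(luciferase_activity, copy_number):
    """Luciferase activity divided by the number of integrated reporter copies."""
    if copy_number <= 0:
        raise ValueError("copy number must be positive")
    return luciferase_activity / copy_number


def geometric_mean(values):
    """Geometric mean, computed as the exponential of the mean of the logarithms."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical clones: (luciferase activity, transgene copy number).
# The third clone mimics a strong positive position effect.
clones = [(4000.0, 2), (9000.0, 3), (600000.0, 1)]
per_copy = [expression_per_copy(activity, copies) for activity, copies in clones]

arithmetic = sum(per_copy) / len(per_copy)
geometric = geometric_mean(per_copy)
print(f"arithmetic mean per copy: {arithmetic:.0f}")
print(f"geometric mean per copy:  {geometric:.0f}")
```

With one highly expressing clone in the set, the arithmetic mean is dominated by that clone, whereas the geometric mean stays closer to the values of the remaining clones, which is the reason given above for reporting it.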
The luciferase activity is a
reliable indicator of the amount of RNA produced from these reporter
constructs. The amount of ε-luciferase RNA was measured by an RNase
protection assay (37) in several of the stably transfected
clones whose level of expression of luciferase ranged from 4000 to over
1
10
relative light units per s. A plot of the
amount of RNA versus the luciferase activity shows a strong,
linear correlation between the amount of RNA and the luciferase
activity over this very wide dynamic range. The slope of the line is
0.66, showing that the increase in luciferase activity can exceed that
in the amount of RNA only by about 2-fold.
Figure 2:
Effect of individual rabbit HS sites on
transient expression of the ε-luciferase gene in K562 cells.
Increasing amounts of the test plasmid were transfected in triplicate
into K562 cells, along with the plasmid pRSVlacZ as a control
for transfection efficiency. Luciferase activity was normalized to the
β-galactosidase activity, and the averages (± S.D.) are
plotted as a function of the amount of transfecting DNA. The diamonds are values for the parental reporter ε-luciferase
(εluc), triangles are for rabbit 0.1-kb AvaII to HindIII HS2-ε-luciferase, inverted
triangles are for rabbit 0.5-kb StuI to ScaI HS3-ε-luciferase, filled circles are for rabbit 0.2-kb BglII to XbaI HS4-ε-luciferase, and squares are for human 3.0-kb HindIII to BamHI HS5-ε-luciferase.
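The normalization described in this legend, and the fold enhancement over the parental reporter that follows from it, can be sketched in Python as below. This is only an illustrative sketch: the function names and all readings are hypothetical, and it assumes one luciferase and one β-galactosidase reading per transfected dish.

```python
from statistics import mean, stdev


def normalize(luciferase, beta_gal):
    """Divide each luciferase reading by the beta-galactosidase reading
    from the same transfection to correct for transfection efficiency."""
    if len(luciferase) != len(beta_gal):
        raise ValueError("each luciferase reading needs a matching beta-gal reading")
    return [luc / gal for luc, gal in zip(luciferase, beta_gal)]


def summarize(values):
    """Average and sample standard deviation of replicate determinations."""
    return mean(values), stdev(values)


def fold_enhancement(test_values, parental_values):
    """Mean normalized activity of a test construct relative to the parental reporter."""
    return mean(test_values) / mean(parental_values)


# Hypothetical triplicate readings at a single DNA concentration.
parental = normalize([100.0, 120.0, 110.0], [10.0, 12.0, 11.0])
test = normalize([630.0, 540.0, 600.0], [10.5, 9.0, 10.0])

avg, sd = summarize(test)
print(f"test construct: {avg:.1f} +/- {sd:.1f}")
print(f"fold enhancement: {fold_enhancement(test, parental):.1f}")
```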
Figure 3:
Summary of the effects of LCR fragments on
the transient expression of the ε-luciferase gene in K562 cells.
The bar graph shows average fold enhancement over expression
of the parental ε-luciferase. Average fold enhancement values for
transiently transfected cells (arithmetic means) were obtained from
multiple determinations at more than one concentration of test DNA. The
average fold enhancement values for sets of clones of stably transfected cells
(geometric means) are derived from the data in Fig. 5 and Fig. 6. Each LCR fragment was placed 5′ to ε-luciferase in
either the reverse or native genomic orientation and tested for level
of expression. The effect of the 5.6-kb EcoRI to NsiI fragment containing both HS3 and HS2 was much greater
than the effects of other sites, and hence it is plotted on a scale
different from the others. Restriction sites defining each fragment are
as follows: A = AvaII, Ac = AccI, B = BglII, Ba = BamHI, E = EcoRI, F = Fnu4HI, H = HindIII, Hp = HphI, K = KpnI, N = NsiI, Nc = NcoI, P = PstI, Pv = PvuII, Sc = ScaI, Ss = SspI, St = StuI, and X = XbaI. r = rabbit, h =
human.
Figure 5:
Expression of ε-luciferase in sets of
stably expressing K562 clones and effects of fragments containing HS4
and HS5. Individual G418-resistant clones were assayed for both
luciferase activity and transgene copy number, and the luciferase
activity per gene copy is plotted along with the gene copy number. Each
set of clones is labeled according to the test construct used, along
with a schematic map of the LCR and relevant fragments. A, 15
clones expressing the parental ε-luciferase, with clones from each
of the four versions of the reporter construct. B, sets of
clones with ε-luciferase linked to four different DNA fragments
containing HS4. C, set of clones with ε-luciferase linked
to a fragment containing the human HS5.
Figure 6:
Expression of sets of clones of stably
transfected K562 cells with ε-luciferase linked to fragments
containing HS2, HS3, and HS2 plus HS3 together. A, expression
per copy for sets of clones with five different fragments containing
HS2. B, sets of clones with the ε-luciferase gene linked
to four different fragments containing HS3. C, sets of clones
with the ε-luciferase gene linked to fragments containing both HS3
and HS2.
Some restriction fragments encompassing the HS3 core also enhanced transient
ε-luciferase expression, although not as much as HS2. As shown
previously (11), a rabbit 504-bp StuI to ScaI
fragment (corresponding to positions 4489 to 4993 in the human
sequence) increased the expression of ε-luciferase an average of
6-fold at all DNA concentrations tested (Fig. 2). Since others
have not observed an effect on transient expression by human or mouse
fragments containing HS3 (22, 25), we examined this
effect in more detail. Indeed, the data in Fig. 4 show that the
human 225-bp HphI to Fnu4HI fragment that has been defined as
the HS3 core (16) does not enhance expression in either
orientation, nor does a larger 1.9-kb HindIII fragment, in
agreement with previous results (22). However, the human 0.8-kb PstI to PvuII fragment encompassing the HS3 core does
produce a modest enhancement, similar to that of the rabbit StuI to ScaI fragment. Additionally, a large 2.9-kb NsiI to HindIII fragment from rabbit will also
enhance expression in transiently transfected cells. For all three
fragments that show enhancement in this assay, the effect is greater in
the reverse genomic orientation. Although most DNA fragments showed
similar effects in the different versions of the ε-luciferase
reporter construct in various assays, these fragments containing HS3
placed into pBSε-luciferase.4 showed enhancement only at lower
concentrations of DNA (
5 µg/transfection), whereas the
fragments in pBSε-luciferase.1 showed enhancement at higher DNA
concentrations as well (11).
Figure 4:
Finer dissection of the effect of restriction fragments containing HS3 on
transient expression of ε-luciferase in K562 cells. Restriction
maps are shown on the top and bottom lines. The second line shows conserved blocks, which are strings of six
or more positions in the multiple alignment of sequences from human,
galago, rabbit, goat, and mouse with no more than one mismatch per
position (33), and an Alu repeat in human. Those
conserved blocks that are known (GATA1 and AP1 or a relative) or
proposed to be binding sites for proteins are indicated below the
second line; CBF = CCAAT binding factor, CAC = proteins that bind to a CAC motif, such as Sp1, TEF2, and
relatives, Dyad = a prominent conserved dyad. Some
conserved blocks are so short on this scale that they appear as thin filled boxes. The positions and orientations of fragments
tested are indicated along with the mean fold enhancement ±
S.D. (range for two experiments), with the number of experiments (most
of which were triplicate determinations) in parentheses.
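The definition of a conserved block used in this legend (a run of six or more alignment positions, each with no more than one mismatch) can be expressed as a short Python sketch. This is an assumption-laden illustration, not the alignment software used for the published analysis (33): the function name, the treatment of gaps as non-conserved, and the toy sequences are all hypothetical.

```python
from collections import Counter


def conserved_blocks(alignment, min_length=6, max_mismatches=1):
    """Return (start, end) half-open column ranges of conserved blocks.

    A column counts as conserved when at most `max_mismatches` sequences
    differ from the most common residue and that residue is not a gap.
    A block is a run of at least `min_length` consecutive conserved columns.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must all have the same length")

    blocks = []
    run_start = None
    # Iterate one column past the end so that a trailing run is closed.
    for col in range(length + 1):
        conserved = False
        if col < length:
            column = [seq[col].upper() for seq in alignment]
            residue, count = Counter(column).most_common(1)[0]
            conserved = residue != "-" and len(column) - count <= max_mismatches
        if conserved:
            if run_start is None:
                run_start = col
        else:
            if run_start is not None and col - run_start >= min_length:
                blocks.append((run_start, col))
            run_start = None
    return blocks


# Toy alignment of three hypothetical sequences (not real LCR sequence).
toy = [
    "ACGTACGTAA",
    "ACGTACGTTT",
    "ACGAACGTCC",
]
print(conserved_blocks(toy))
print(conserved_blocks(toy, min_length=3, max_mismatches=0))
```

Setting `max_mismatches=0` and raising `min_length` gives a stricter criterion of contiguous invariant positions.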
As shown on the map in Fig. 4, the rabbit and human DNA fragments that enhance expression contain a well-conserved sequence that matches the binding site for AP1, NFE2, and related proteins, whereas the 225-bp HS3 core HphI to Fnu4HI fragment does not contain this site. This site was removed from the rabbit fragment containing HS3 by cutting with NcoI, and the resulting 400-bp NcoI to ScaI fragment no longer showed enhancement in transiently transfected K562 cells (Fig. 4), supporting a possible role for AP1-like proteins in this ability to enhance expression transiently. The inability of the human 1.9-kb HindIII fragment to enhance in this assay suggests that a negative element may be located 5′ to the PstI site at position 4344.
Tests of rabbit and human DNA
fragments containing HS4 and HS5 showed no effect on ε-luciferase
expression in these assays (Fig. 2). Use of larger DNA fragments
up to 2.4 kb containing HS4 had no effect on enhancement (Fig. 3). In contrast, a HindIII to NsiI
fragment from the rabbit genome enhanced ε-luciferase expression
6-fold (Fig. 3). This fragment contains HS4 and was originally
thought to contain the homolog to human HS5, based on hybridization
data and the expected distance between these hypersensitive sites in
rabbit. However, analysis of recently determined DNA sequences
shows that this region of the rabbit DNA does not match
human HS5. Thus, the small increase in expression achieved with
this DNA fragment results from the inclusion of distal sequences that
are not homologous to human.
The presence of fragments containing HS2 caused a substantial increase in the expression per copy of the transfecting gene (Fig. 6A; note the difference of scale from the plot in Fig. 5A). However, the level varied considerably among individual clones within each set, regardless of the size of the tested LCR fragment, showing that these constructs are still subject to some position effects. The largest increase in expression was obtained with the 1.5-kb KpnI to BglII fragment from human; it was 2.5 to 7 times as effective as fragments containing only HS2 core sequences, i.e. the rabbit 0.1-kb AvaII to HindIII fragment and the human 0.4-kb HindIII to XbaI HS2 core fragment (summarized in Fig. 3). Likewise, the rabbit 2.2-kb BglII to AccI fragment also produced a larger increase in expression than did the core fragments after integration. Although less of an effect was obtained with the rabbit 2.2-kb HindIII fragment, all three larger DNA fragments containing HS2 gave stronger enhancement than that obtained with the human HS2 core (0.4-kb HindIII to XbaI fragment). These data indicate that sequences adjacent to the core play a positive role after integration, such as in domain opening.
Several DNA fragments containing HS3 also produced an increase in expression in stably transfected cells, when compared to constructs lacking LCR fragments. Two of the tested fragments, the 0.5-kb StuI to ScaI fragment and the 2.9-kb NsiI to HindIII fragment from rabbit (corresponding to human positions 3079 to 6558) gave an increase in expression (Fig. 6B) comparable to the values obtained in transient transfections (Fig. 3). In contrast, the human HS3 core fragment showed a 4-fold enhancement in stably transfected cells, whereas it had no effect in transient transfections (Fig. 3). The larger 1.9-kb HindIII fragment containing human HS3 showed a dramatic increase in expression in virtually all clones (Fig. 6B), averaging over 100-fold, whereas it had no effect in transiently transfected cells (Fig. 3). The capacity of large fragments containing HS3 to greatly increase expression only after stable integration argues for an effect on domain opening when sequences flanking the core are present.
Most small or large
restriction fragments containing HS4 had no effect on ε-luciferase
expression in stably transfected clones; as seen in Fig. 5B, the profile of expression per copy is very
similar to that of the parental vectors without an LCR fragment.
However, the 3.9-kb NsiI to HindIII fragment from
rabbit gave an average 9-fold increase in expression.
This synergism requires sequences
outside the hypersensitive site cores. Removal of 1.2 kb from the 5′
end of this fragment dramatically reduces the activity seen in clones
of stably transfected cells, as shown for the 4.4-kb NcoI to NsiI fragment in Fig. 6C. These clones average
a 34-fold increase over the expression of ε-luciferase alone, a
level comparable to that seen with large fragments containing only
HS2 (Fig. 3). Thus, the DNA sequences flanking the HS3 core,
including the binding site for AP1-like proteins (Fig. 4), are
needed for the synergism observed between rabbit HS2 and HS3.
Comparison of the sequences of the β-like globin gene
LCRs from various mammals has shown that the cores of the HSs are well
conserved (7, 8, 9, 10, 11),
and indeed small fragments (200 to 400 bp) that constitute the cores of
HS2 (12), HS3 (15), and HS4 (17, 18) are active in some assays of LCR function. In
addition, the number of HSs and the distances between them are almost invariant,
and many blocks of sequence between the cores of the HSs are conserved.
This was shown initially as blocks of sequences that consistently align
in multiple pairwise comparisons among LCR sequences from four mammals (11) and is confirmed dramatically by the results of a
simultaneous alignment of these sequences from five mammals: human,
galago, rabbit, goat, and mouse (19, 31, 33).
Even at a very stringent criterion for "conserved," such as
strings of seven contiguous invariant positions in the multiple
alignment, conserved blocks of sequence are detected between the cores
of the HSs, sometimes at a density equal to that within the cores.
These sequence comparisons implicate DNA segments between
the HS cores in the function of the LCR. However, much experimental work
has focused on the HS cores singly or in combination. The data in this
paper show that sequences between the HS cores have a strong positive
effect on reporter constructs, especially after integration into the
chromosome. These effects of flanking sequences are particularly
notable for a domain opening activity of fragments containing HS3, for
a synergistic interaction between rabbit HS2 and HS3, and for
demonstrating a role for a highly conserved AP1 binding site close to
HS3.
The very large increase in expression for the human 1.9-kb HindIII fragment containing HS3 dramatically illustrates the
effect of sequences outside the core after integration. With the
ε-globin-luciferase reporter construct in K562 cells, little to no
effect was observed with the 225-bp HphI to Fnu4HI
HS3 core fragment, whereas this core fragment is highly effective with
the human β-globin gene as a reporter in transgenic mice and stably
transfected MEL cells (15). However, the effect of the minimal
HS3 core was reduced substantially when tested with an H2-K gene driven
by the β-globin gene promoter (38). Thus, in some assays,
the body of the β-globin gene may contribute to the high level of
expression obtained with HS3 constructs. In our assays, the additional
sequences in the 1.9-kb HindIII fragment are needed to see the
effect of HS3 on the ε-globin gene in K562 cells. In assays of
β-globin gene expression in transgenic mice, the human 1.9-kb HindIII fragment gives a higher level of expression than the
225-bp HphI to Fnu4HI core
fragment (15, 20), providing further evidence that
sequences outside the core contribute to the activity of HS3.
This effect of the large 1.9-kb DNA fragment containing HS3 is seen only
after integration of the constructs in stably transfected cells. The
absence of an effect prior to integration shows that this DNA fragment
does not function as a classical enhancer. Also, the expression
(adjusted for copy number) varied among clones, arguing against a
strong insulator effect. The most reasonable interpretation of these
data is that the 1.9-kb fragment containing HS3 has cis-acting
sequences needed for opening a chromatin domain at the site of
integration. A different line of experiments also leads to the
assignment of a dominant domain-opening activity for this DNA fragment.
Constructs containing this 1.9-kb fragment with HS3 attached to a
β-globin gene are active as single copy integrants in transgenic
mice (39), whereas constructs with an HS2 fragment are effective
only in multiple copies (40). The single copy integrants
containing HS3, but not HS2, formed DNase hypersensitive sites in these
experiments, demonstrating an opened chromatin domain. Such a
domain-opening activity has been inferred previously for an artificial
construct composed of small DNA fragments containing the cores of HS1,
HS2, and HS3 (8), and our experiments help localize at least
part of this function to the region surrounding the HS3 core.
Similarly, a domain-opening activity, but not enhancement, has been
mapped to the region surrounding the HS2
core (13, 41).
The strongest effects on expression
of the ε-luciferase reporter gene were obtained with a 5.6-kb
rabbit DNA fragment containing both HS3 and HS2 along with all the
sequences between them and some DNA flanking HS3 on the 5′ end and HS2
on the 3′ end. Thus, it contains all the conserved sequences from this
region of the LCR. This combination of HS2 and HS3 in their natural
sequence context and with the native spacing produces a very large
increase in expression of the linked reporter gene, far greater than
the effects obtained with individual sites. This synergism could
reflect interactions between the proteins bound to the two sites, the
effects of sequences between the HS cores, and/or the effects of an
optimal spacing. The reduction in activity upon deletion of DNA 5′ to
the HS3 core shows that flanking sequences are needed for this effect;
it does not arise simply from interactions between proteins bound to the HS
cores. Indeed, combinations of human HS2 and HS3 core fragments show no
increase over the level of expression of the individual fragments.
Additional experiments in progress using other DNAs as
spacers indicate that non-globin sequences will not substitute in these
constructs, further supporting the involvement of LCR
sequences between the cores. In this context, it is interesting to note
that constructs with HS2 (1.5-kb fragment) and HS3 (1.9-kb fragment)
juxtaposed 5′ to the human β-globin gene had no additional effect,
compared to the effects of individual sites, in stably transfected
cells (24). These authors report that three HSs together (e.g. HS4-HS3-HS2 or HS4-HS3-HS1) were required to generate a
substantially higher level of expression than the individual sites, but
the DNA fragments used did not include all the sequences between the
HSs. The synergistic effects between the pair of sites obtained in the
present work suggest that spacing and/or sequences between the cores
could be important.
A third example of the effects of sequences
outside the HS cores is a well-conserved binding site for members of
the AP1 family (such as NFE2). We find that the ability of some DNA
fragments containing HS3 to enhance expression prior to integration
correlates with the presence of this site, and constructs with this AP1
binding site closer to the target promoter (reverse genomic
orientation) tend to show a stronger enhancement. In studies of HS3
effects on human β-globin gene expression after integration in MEL
cells, addition of this AP1 binding site had no effect on the ability
of the intact human HS3 core fragment to stimulate transcription, but
it did increase the activity of subregions of the HS3
core (38). This AP1 binding site is in the region 5′ to the HS3
core that, when deleted, causes a loss of synergism between HS3 and HS2.
Thus, several lines of evidence implicate this site in LCR function,
although it is outside the minimal HS3 core (defined by its effect on
the human β-globin gene). It should be emphasized that despite the
modest enhancement seen in transient transfections, the strongest
effect from DNA fragments containing HS3 is seen after integration.
The effects of sequences outside the HS cores and the ability of
these cores, in the proper context, to interact synergistically, are
consistent with a model for the LCR forming a large holocomplex that
interacts alternately with promoters of β-like globin
genes (42). Just as enhancers are composed of multiple sites
(enhansons) for binding trans-activating
proteins (43), full LCR function may be achieved by the several
HSs together, perhaps in a context-dependent manner. Our data show that
the HS3 and HS2 interactions require sequences outside the cores.
Recent loss-of-function experiments show that deletion of the HS3 or
HS4 core fragments from the human β-globin gene cluster in
transgenic mice causes a catastrophic loss of expression of the
β-like globin genes at all developmental stages (44),
whereas gain-of-function experiments showed that individual HSs were
adequate to drive a high level of expression of the target genes. One
explanation is that the residual DNAs, outside the HS cores, elicit a
dominant-negative phenotype on the rest of the locus (44).
Thus, these recent experiments also support a role for sequences
outside the cores in the function of the LCR.