©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
Nuclear Factor Binding Sites in Human beta Globin IVS2 (*)

(Received for publication, August 9, 1995)

Christine E. Jackson (1)(§), David O'Neill (3)(¶), and Arthur Bank (1)(2)(**)

From the (1)Departments of Genetics and Development, (2)Medicine, and (3)Pathology, College of Physicians and Surgeons, Columbia University, New York, New York 10032

ABSTRACT

The second intron of the human beta globin gene (beta IVS2) has been previously identified as a region required for proper expression of beta globin. To further characterize this region, we have footprinted the entire beta IVS2 and have analyzed regions of interest by electrophoretic mobility shift assay. Through these studies we have identified four utilized binding sites for the erythroid regulatory factor GATA-1, two sites bound by general transcription factor Oct-1, two sites bound by the nuclear matrix attachment DNA binding protein special A-T-rich binding protein 1, and a site bound by a potential homeobox protein. Additionally, we have found several factors displaying temporal or tissue specificity by electrophoretic mobility shift assay, which may be involved in the regulation of beta globin expression. These proteins are not supershifted by antibodies to factors important in erythroid regulation such as GATA-1, NFE-2, or YY1, or by antibodies against more general transcription factors.


INTRODUCTION

The 70-kb (^1) human beta globin gene complex has been extensively studied as a model of gene regulation. The region consists of 20 kb known as the locus control region, or LCR, which can confer erythroid-specific expression and position independence on any gene of interest (1, 2, 3), followed by a series of beta family genes, five of which are temporally expressed (embryonic epsilon; fetal ^Agamma and ^Ggamma; adult delta and beta) (4, 5). A number of nuclear proteins have been identified that play a role in transcription of one or more beta family genes, through binding to the LCR, promoter, and/or enhancer sequences of these genes (6, 7, 8, 9, 10, 11, 12). Some of these factors may have a role in the normal temporal switching of globin, as switching has been shown to be regulated at the level of transcription (4, 5). Despite these many findings, the details of the molecular mechanisms regulating beta family gene expression and switching are still unclear.

The adult human beta globin gene has been shown to include enhancers 3' to the structural gene and within the gene itself, specifically, in the region spanning the 3'-end of IVS2 and the beginning of exon 3 (13, 14, 15). Previous work has indicated that beta IVS2 is required for beta globin expression (16). Two DNase I hypersensitive sites have been identified within the beta globin structural gene, a stronger site in exon 3, and a weaker site in the center of IVS2 (17). The beta IVS2 intronic enhancer region has a utilized binding site for erythroid transcription factor GATA-1.(^2) We have previously shown in murine erythroleukemia (MEL) cells that the IVS2 sequences from the globin gene will not substitute for beta IVS2 in human beta globin expression. The replacement of beta IVS2 with IVS2 leads to a substantial decline in the expression of beta globin and renders the gene uninducible (18).

beta IVS2 has also been shown to be a nuclear matrix attachment region (MAR), one of 10 found in the beta globin complex (19, 20). These MARs may be involved in transcription, splicing, and replication of DNA (21, 22, 23).

In order to identify DNA binding proteins that may play a role in the expression of human beta globin, we have characterized the entire human beta IVS2 region by DNase I footprint analysis. Based on the footprint pattern and transcription factor binding analysis utilizing a computer-generated map, sites were chosen to be analyzed by EMSA. We report here evidence that three of nine potential GATA-1 binding sites are utilized, as well as a fourth degenerate GATA-1 site. Additional proteins seen to bind IVS2 are Oct-1, a ubiquitous factor binding to a homeobox consensus site, several stage or tissue-specific factors, and the nuclear matrix binding protein SATB1. The binding of this latter protein to beta IVS2 suggests that beta IVS2 and nuclear matrix attachment may be involved in the regulation of beta globin transcription.


MATERIALS AND METHODS

Subcloning and Footprint Analysis

beta IVS2 was cleaved into three smaller pieces as depicted (see Fig. 1). The parent vector for beta IVS2 subcloning was pSP72beta3.4, which contains a 3.4-kb ClaI/SphI fragment of the human beta globin gene subcloned into the ClaI and SphI sites of vector pSP72 (Promega). The BamHI-EcoRI fragment of the gene, including all of beta IVS2, was analyzed further. To produce subclone I, pSP72beta3.4 was sequentially digested with restriction enzymes BamHI and MunI, and the 200-bp insert was ligated into the EcoRI and BamHI sites of the Bluescript SK+ vector (Stratagene). For subclone II, pSP72beta3.4 was sequentially digested with MunI and DraI, and the 329-bp insert was subcloned into the EcoRI and SmaI sites of plasmid pGEM 7Zf+ (Promega). For subclone III, pSP72beta3.4 was sequentially digested with EcoRI and DraI, and the resultant 400-bp insert was subcloned into the EcoRI and SmaI sites of pGEM 7Zf+.


Figure 1: beta IVS2 subclones for footprint analysis. Restriction map of the three subclones (I, II, and III) used in footprinting beta IVS2, as described under "Materials and Methods." beta IVS2 has been numbered 1-850, with 1 corresponding to the start of IVS2 (base number 62684 in the GenBank human beta globin gene complex sequence, accession number J00179).
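The numbering convention stated in this legend amounts to a fixed offset between intron positions and GenBank coordinates. The following is a minimal Python sketch of that conversion, assuming 1-based, inclusive coordinates on both sides; the function and constant names are illustrative and are not part of the original work.

```python
# Offset implied by the legend: IVS2 position 1 is GenBank base 62684 (J00179).
# Assumption: both numbering schemes are 1-based and inclusive.
IVS2_START_GENBANK = 62684
IVS2_LENGTH = 850  # IVS2 is numbered 1-850 in this paper


def ivs2_to_genbank(pos: int) -> int:
    """Convert a 1-based IVS2 position to the corresponding GenBank coordinate."""
    if not 1 <= pos <= IVS2_LENGTH:
        raise ValueError(f"IVS2 position must be in 1..{IVS2_LENGTH}, got {pos}")
    return IVS2_START_GENBANK + pos - 1


def genbank_to_ivs2(coord: int) -> int:
    """Convert a GenBank coordinate back to the 1-based IVS2 position."""
    pos = coord - IVS2_START_GENBANK + 1
    if not 1 <= pos <= IVS2_LENGTH:
        raise ValueError(f"coordinate {coord} lies outside IVS2")
    return pos
```

With this convention the last intron position, 850, corresponds to GenBank base 63533.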



All three subclones were entirely footprinted at least twice each in both orientations, using nuclear extracts from CEM (24), HEL-92 (25), HeLa (26), K562 (27), and MEL (28) cells. CEM is a human T-lymphocyte cell line; HEL-92 and K562 are human fetal-embryonic hematopoietic lines; and MEL is a murine adult hematopoietic line. HeLa cells are human cervical carcinoma cells and are not hematopoietic. Linearized subclones were dephosphorylated and end-labeled with [gamma-^32P]ATP using T4 polynucleotide kinase. Following a second restriction digest, probes were purified by electrophoresis on a 5% nondenaturing polyacrylamide gel (29). A portion of each probe was subjected to Maxam-Gilbert sequencing (29) to provide a sequence ladder for footprint gels. Labeled probe was footprinted by DNase I digestion as described previously (30) using 50 µg of nuclear extract and 160 ng of DNase I/reaction, except where indicated.

Nuclear Extracts

All nuclear extracts were prepared from cells as described previously (31) or by a small scale adaptation of this method (32).

Electrophoretic Mobility Shift Assays

After annealing, oligonucleotide probes were end-labeled using T4 polynucleotide kinase and [gamma-^32P]ATP. In general, 20,000 cpm of probe/reaction were used. EMSAs were run on 3.5, 4, or 5% nondenaturing polyacrylamide gels as described previously (30). Each reaction included 10 µg of nuclear extract and 5 µg of double-stranded poly(dI·dC) as nonspecific competitor. EMSA buffer was 20 mM Tris-HCl, pH 7.6, 10% glycerol, 0.2 mM EDTA, 2.5 mM MgCl2, and 60 mM KCl, with 1 mM dithiothreitol and 0.2 mM phenylmethylsulfonyl fluoride. Reactions were incubated for 20 min on ice. For competition assays, 10-, 100-, or 1000-fold molar excess of unlabeled competitor oligonucleotides was added to the reaction. For supershift assays, reactions were incubated as usual for 20 min on ice, and then 1 µg of antibody was added and the reactions were further incubated for 45 min to 1 h on ice. All antibodies were Supershift grade antibodies from Santa Cruz Biochemicals, except for the anti-SATB1 antibody, which was the generous gift of Dr. T. Kohwi-Shigematsu (La Jolla Cancer Research Foundation, La Jolla, CA).

Oligonucleotides

The following oligonucleotides (sense strands listed), which were synthesized and polyacrylamide gel electrophoresis-purified by Genosys Biotechnologies, Inc., were used in these analyses: site 2, 5'-GAATGGGAAACAGACGAATGATTGCATCAGTGTGGAAGTCTC-3'; site 5, 5'-GTACATTACTATTTGGAATATATGTGTGCTTATTTGCATATTCATAATCTCCCTACTTTA-3'; site 6, 5'-TTATTTTTAATTGATACATAATCATTATACATATTTATGGG-3'; site 7, 5'-GTTTATCTTATTTCTAATACTTTCCCTAATCTCTTTCTTT-3'; site 9, 5'-CATGCCTCTTTGCACCATTC-3'; site 10, 5'-ATTCTAAAGAATAACAGTGATAATTTCTG-3'; site 12, 5'-GTTTCATATTGCTAATAGCAGCTACAAT-3'; antennapedia consensus, 5'-CGCTCCCAATTAAATTGGCGGCTGGT-3'; SATB1, 5'-TCTTTAATTTCTAATATATTTAGAATCTTTAATTTCTAATATATTTAGAA-3' (21); and Santa Cruz Biochemical oligonucleotides GATA-1, GATA-1 mutant, Oct-1, Oct-1 mutant, and AP-1.
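Because only sense strands are listed, the complementary strand needed for annealing follows by reverse complementation. The Python sketch below derives it and checks each sequence against the oligonucleotide length quoted for it elsewhere in this paper (42, 60, 41, 40, 20, and 28 bp for sites 2, 5, 6, 7, 9, and 12). The helper names are illustrative assumptions, not part of the original methods.

```python
# Translation table for Watson-Crick complements of unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense(sense: str) -> str:
    """Return the antisense strand (5'->3') for a sense strand given 5'->3'."""
    return sense.translate(COMPLEMENT)[::-1]


# Sense strands as listed above, paired with the lengths stated in the text.
SENSE_OLIGOS = {
    "site 2": ("GAATGGGAAACAGACGAATGATTGCATCAGTGTGGAAGTCTC", 42),
    "site 5": ("GTACATTACTATTTGGAATATATGTGTGCTTATTTGCATATTCATAATCTCCCTACTTTA", 60),
    "site 6": ("TTATTTTTAATTGATACATAATCATTATACATATTTATGGG", 41),
    "site 7": ("GTTTATCTTATTTCTAATACTTTCCCTAATCTCTTTCTTT", 40),
    "site 9": ("CATGCCTCTTTGCACCATTC", 20),
    "site 12": ("GTTTCATATTGCTAATAGCAGCTACAAT", 28),
}

for name, (seq, stated_length) in SENSE_OLIGOS.items():
    status = "ok" if len(seq) == stated_length else "length mismatch"
    print(f"{name}: {len(seq)} nt ({status}); antisense 5'-{antisense(seq)}-3'")
```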


RESULTS

Footprinting Analysis of Human beta IVS2

The 850-bp human beta IVS2 was subcloned in three pieces into pGEM 7 or Bluescript SK+ vectors, as described under "Materials and Methods" (Fig. 1). DNase I footprinting was performed on both strands of each construct using nuclear extracts from the following cell lines: CEM, HEL-92, K562, HeLa, and MEL. The intron was found to be extensively footprinted. The central region of the intron has previously been shown to include a DNase I hypersensitive site (17). Of the 14 footprinted regions identified (Table 1), seven were selected for further analysis, and one site had previously been characterized in this laboratory.^2 The sequence of beta IVS2 was mapped using the Eukaryotic Transcription Factor Binding Sites (tfsites) data base of the University of Wisconsin GCG package (33). This data base includes consensus sequences for the binding of sequence-specific eukaryotic transcription factors. Footprint data and this map were analyzed to determine which sequences to characterize by EMSA.



GATA-1 Binding of Human beta IVS2

beta IVS2 includes nine consensus sequences for the binding of GATA-1, yet not all of these sites are bound in vitro by GATA-1. Four sites were found to bind GATA-1 (footprints 2, 7, 10, and 14, Fig. 2; Table 1; Fig. 3, A and B; and see Fig. 8B). Sites 7, 10, and 14 conform to the general GATA-1 consensus sequences as listed in the GCG tfsites data base (33, 34), but site 2 does not match any of these sequences. However, it is clear that GATA-1 can bind a great variety of sequences (35, 36). When compared with the commonly used GATA-1 consensus sequences YTATCW (35, 36) or MYWATCWY (34), the site 2 sequence (TGCATCAG) matches the first sequence at four of six positions and the second sequence at four of eight positions. Double-stranded oligonucleotides were synthesized to generate EMSA probes for sites 2, 7, and 10. Each of these three probes was specifically competed by an excess of unlabeled competitor GATA-1 consensus oligonucleotide, but not by a GATA-1 mutant consensus oligonucleotide (Fig. 3B and Fig. 8B). Site 14 has been previously characterized in this laboratory,^2 and was found to bind GATA-1.
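The match counts quoted for site 2 can be reproduced by scoring the site against each consensus with IUPAC ambiguity codes (Y = C or T, W = A or T, M = A or C). The Python sketch below uses an ungapped sliding alignment and reports the best-scoring offset; this scoring scheme and the function names are assumptions made for illustration, since the text does not state how the comparison was performed.

```python
# IUPAC nucleotide codes needed for the two consensus sequences quoted above.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "Y": "CT", "W": "AT", "M": "AC"}


def count_matches(consensus: str, window: str) -> int:
    """Count positions at which the base is permitted by the consensus code."""
    if len(consensus) != len(window):
        raise ValueError("consensus and window must have the same length")
    return sum(base in IUPAC[code] for code, base in zip(consensus, window))


def best_alignment(consensus: str, site: str) -> tuple[int, int]:
    """Slide the consensus along the site without gaps.

    Returns (best match count, 0-based offset of the first best window).
    """
    width = len(consensus)
    best = (-1, -1)
    for offset in range(len(site) - width + 1):
        matches = count_matches(consensus, site[offset:offset + width])
        if matches > best[0]:
            best = (matches, offset)
    return best


SITE2_CORE = "TGCATCAG"
for consensus in ("YTATCW", "MYWATCWY"):
    matches, offset = best_alignment(consensus, SITE2_CORE)
    print(f"{consensus}: {matches}/{len(consensus)} positions match at offset {offset}")
```

Under these assumptions the best alignments give four of six matches for YTATCW and four of eight for MYWATCWY, in agreement with the counts stated in the text.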


Figure 2: Characterization of nuclear factor binding sites in beta IVS2. All experiments were performed using CEM, HEL-92, HeLa, K562, and MEL nuclear extracts. beta IVS2 has been numbered 1-850, with 1 corresponding to the start of IVS2 (base number 62684 in GenBank beta globin complex sequence). Included as reference points are the BamHI site in exon 2 and the EcoRI site in exon 3. A, sites of footprints observed by DNase I digestion of end-labeled probe. Thicker bands are footprints observed on both DNA strands; thinner bands are footprints observed on one strand. Sites have been labeled as to which extracts generated the footprint and are numbered 1-14 for easy reference. s, sense; a, antisense; C, CEM; E, erythroid (HEL-92, K562, and MEL); A, all extracts; EF, embryonic-fetal (HEL-92 and K562); hs, hypersensitive site. B, consensus sites for binding of GATA-1 (33, 34). C, sites found experimentally to bind GATA-1.




Figure 3: EMSA of site 2 with the 42-bp site 2 oligonucleotide. Gels are 5% polyacrylamide. A, EMSA with nuclear extracts as marked. Lane 1, control with no nuclear extract. Bands 1a and b are seen only in erythroid cell lines HEL-92, K562, and MEL. B, competition EMSA of site 2; competitor oligonucleotides as marked. GATAm is the mutant GATA oligonucleotide. Lanes 2, 5, and 7 include a 10-fold molar excess of unlabeled competitor probe; lanes 3, 6, and 8 include a 100-fold molar excess of unlabeled competitor probe.




Figure 8: EMSA of site 7 with the 40-bp site 7 oligonucleotide. Reactions include 10 µg of the indicated nuclear extract and 5 µg of poly(dI·dC) as nonspecific competitor. Gels are 5% polyacrylamide. A, lane 1, control with no nuclear extract; lane 8, MEL nuclear extract with anti-GATA-1 antibody (Ab). Band 1, SATB1; band 2, GATA-1; band 3, ubiquitous band. B, competition EMSA with CEM (lanes 1-8) and K562 (lanes 9-16) nuclear extracts, and competitor oligonucleotide as marked. Lanes 3 and 11 include a 100-fold molar excess of unlabeled competitor oligonucleotide; all other competition lanes include a 1000-fold molar excess of unlabeled competitor oligonucleotide. Band 4 is a faint band that is seen in CEM and K562 nuclear extracts when probe has been labeled to a high specific activity. C, supershift EMSA with CEM (lanes 1-4) and K562 (lanes 5-8) nuclear extracts, and antibodies (Ab) as marked.



Site 5 Analysis

Footprint site 5, a broad footprint seen with all nuclear extracts tested (Fig. 4), contains consensus sequences for transcription factors Oct-1 and Oct-2 (ATTTGCAT), and GATA-1 (ATAATCTC) (Table 1). EMSA done with a 60-bp oligonucleotide containing the ubiquitous footprint 5 sequence revealed a complex pattern (Fig. 5A). Within this site are two possibly stage-specific bands seen only with HEL-92, K562 (bands 3, a and b), and weakly with CEM nuclear extracts in some gels (not shown). There is also a different sized band seen only with MEL nuclear extract (band 4), a unique band found only with CEM and K562 (band 5), and a ubiquitous band (band 1). Each of these bands was competed by an excess of unlabeled site 5 probe (Fig. 5B, lanes 2, 8, and 9) but not by a nonspecific probe for the general transcription factor AP-1 (not shown). The large ubiquitous band (Fig. 5A, band 1) was competed by an Oct-1 consensus oligonucleotide (Fig. 5B, lanes 3, 10, and 11) and was also supershifted by an anti-Oct-1 antibody (lane 5). Additionally, the GATA-1 consensus oligonucleotide did not compete any bands even though a GATA-1 consensus is present in the site 5 oligonucleotide (lane 4). In an attempt to identify the nuclear factors responsible for bands 3, a and b, 4, and 5, supershift analyses were performed using antibodies against c-Fos, c-Jun/AP-1, Oct-2 (lane 6), Fli-1, Pu.1, Ets 1/Ets 2, GATA-1, NFE-2, YY1, and SATB1. All of these were negative by supershift assay with the site 5 oligonucleotide.


Figure 4: DNase I footprints of sites 5 and 6 of beta IVS2. Lane 1, T+C lane of Maxam-Gilbert sequencing reaction; lane 2, C lane of Maxam-Gilbert sequencing reaction; lanes 3 and 9, control with no nuclear extract and 20 ng of DNase I. Reactions include 50 µg of the nuclear extract indicated and 160 ng of DNase I.




Figure 5: EMSA of site 5 with the 60-bp site 5 oligonucleotide. Gels are 4% polyacrylamide. A, EMSA with nuclear extracts as marked. Lane 1, control with no nuclear extract. Reactions include 10 µg of nuclear extract and 5 µg of poly(dI·dC) as a nonspecific competitor. Band 1, ubiquitous Oct-1 band; band 2, ubiquitous band; bands 3a and b, bands specific to HEL-92 and K562 nuclear extracts; band 4, band specific to MEL; band 5, band specific to CEM and K562. B, competition and supershift EMSA of site 5, with competitor oligonucleotide or antibody (Ab) as marked. Lanes 1-6, K562 nuclear extract; lanes 7-11, MEL nuclear extract. Lanes 8 and 10 include a 100-fold molar excess of unlabeled competitor oligonucleotide; all other oligonucleotide competition lanes include a 1000-fold molar excess of unlabeled competitor oligonucleotide.



Site 6 Analysis

Footprint site 6 is also seen with all nuclear extracts (Fig. 4) and includes consensus sequences for the homeobox protein engrailed (CAATTAAA) and GATA-1 (ATAATCAT). Fig. 6A shows an EMSA gel with an oligonucleotide synthesized to the site 6 footprint sequence. As occurred for site 5, the site 6 oligonucleotide EMSAs revealed a complex pattern buried within this ubiquitous footprint. This includes an embryonic-fetal erythroid specific band seen only with HEL-92 and K562 (band 2b). There is also a larger band seen with HEL-92, K562, and possibly MEL (band 2a), although the low mobility of this band makes resolution difficult. All bands are competed by an excess of unlabeled site 6 probe (Fig. 6B, lanes 2, 3, 6, and 7; Fig. 6C, lanes 2 and 9; Fig. 6D, lanes 2, 3, and 4), but not by the GATA-1 consensus oligonucleotide, although the gel shift oligonucleotide contains a GATA-1 consensus sequence (Fig. 6C, lanes 4 and 10). One ubiquitous band is competed by an oligonucleotide containing the consensus sequence for the homeobox protein Antennapedia (CAATTAAA) (Fig. 6D, band 4, lanes 5, 6, and 7), and is also competed by an oligonucleotide containing the consensus sequence for the homeobox protein Oct-1 (ATTTGCAT) (Fig. 6C, lanes 6 and 12) but not by the mutant Oct-1 oligonucleotide (Fig. 6C, lanes 7 and 13). This may be the site in IVS2 previously described to bind the homeobox protein HOX 2B (B6) (37, 38).
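A consensus sequence may lie on either strand of a probe, so locating the two motifs named for site 6 requires searching the listed sense strand for both the motif and its reverse complement. The Python sketch below does this with exact string matching; the function names and the choice to report 1-based positions on the sense strand are illustrative assumptions.

```python
# Translation table for Watson-Crick complements of unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq: str, motif: str) -> list[tuple[str, int]]:
    """Find exact motif hits on both strands of seq.

    Returns (strand, start) pairs, where start is the 1-based position of the
    leftmost base of the hit on the sense strand and strand is '+' or '-'.
    """
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((strand, start + 1))
            start = seq.find(pattern, start + 1)
    return hits


# Site 6 sense strand as listed under "Oligonucleotides".
SITE6 = "TTATTTTTAATTGATACATAATCATTATACATATTTATGGG"

for name, motif in (("engrailed/Antennapedia", "CAATTAAA"),
                    ("GATA-1", "ATAATCAT")):
    print(name, motif, find_motif(SITE6, motif))
```

In the listed site 6 sense strand, the GATA-1 sequence is found in the sense orientation, whereas CAATTAAA is found only as its reverse complement (TTTAATTG), that is, on the antisense strand.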


Figure 6: EMSA of site 6 with the 41-bp site 6 oligonucleotide. Gels are 3.5% polyacrylamide, with 5 or 10 µg of each nuclear extract as marked and 5 µg of poly(dI·dC) as nonspecific competitor/reaction. A, lane 1, control with no nuclear extract. Band 1, ubiquitous band; band 2a, HEL-92-, K562-, and MEL- (?) specific band; band 2b, HEL-92- and K562-specific band; band 3, SATB1; and band 4, ubiquitous "homeobox" band. Lower bands on the gel are likely proteolytic degradation products and are not reproducible. B, competition and supershift EMSA of site 6 with CEM (lanes 1-4) and K562 (lanes 5-8) nuclear extracts, with competitor oligonucleotide or antibody (Ab) as marked. Lanes 2 and 6 include a 100-fold molar excess of unlabeled competitor oligonucleotide; all other competition lanes include a 1000-fold molar excess of unlabeled competitor oligonucleotide. C, competition EMSA of site 6 with CEM (lanes 1-7) and K562 (lanes 8-13) nuclear extracts and competitor oligonucleotides as marked. GATAm and oct-1m are the GATA and Oct-1 mutant oligonucleotides, respectively. Lane 3 includes a 100-fold molar excess of unlabeled competitor oligonucleotide; all other competition lanes include a 1000-fold molar excess of unlabeled competitor oligonucleotide. D, competition EMSA of site 6 with HEL-92 nuclear extract. ANT is the Antennapedia consensus oligonucleotide. Lanes 2 and 5 include a 10-fold molar excess of unlabeled competitor oligonucleotide; lanes 3 and 6, a 100-fold molar excess of unlabeled competitor oligonucleotide; lanes 4 and 7, a 1000-fold molar excess of unlabeled competitor oligonucleotide.



Interestingly, the nuclear matrix attachment DNA binding protein SATB1 was found to bind to the site 6 probe in CEM nuclear extract (Fig. 6A, band 3). beta IVS2 has been described as one of nine sites in 90 kb of globin gene sequence studied to contain a MAR (19). Additionally, it has been noted that MARs from the human beta globin gene can bind SATB1 (21). A SATB1 consensus oligonucleotide inhibits band 3 formation with the site 6 probe (Fig. 6C, lane 3), and the SATB1 band is supershifted by an anti-SATB1 antibody (Fig. 6B, lane 4). SATB1 is a 103-kDa protein (39), and the SATB1 band runs very slowly on EMSA. It is also possible that the SATB1 band seen in Fig. 6B is a complex of SATB1 and some other protein. SATB1 complexes have been suggested in a recent paper describing the binding of SATB1 to an ^Agamma globin regulatory region (20). As was done for footprinted site 5, a supershift assay was performed using the same panel of antibodies against general factors, erythroid factors, and ets proteins. No additional supershifts were seen by EMSA.

Site 7 Analysis

The sense strand footprint at site 7, which is seen with CEM, HEL-92, and K562 nuclear extracts (Fig. 7) has consensus sequences for the homeobox protein bicoid (CCTAATCTC) and GATA-1 (CCTAATCTC). This footprint is broader in HEL-92 and K562 nuclear extracts than in CEM, and the region footprinted only in HEL-92 and K562 includes a GATA-1 consensus sequence. EMSA with a 40-bp oligonucleotide including the site 7 footprint showed a more complex pattern with erythroid nuclear extracts than other extracts (Fig. 8A). The site 7 probe used spanned two GATA-1 consensus sequences (CCTAATCTC and TTATCTTA), and in fact proved to bind GATA-1 in the erythroid lines, generating a GATA-1 band which could be supershifted with an anti-GATA-1 antibody (Fig. 8A, lane 8; Fig. 8C, lane 7). This band was also specifically competed with a GATA-1 consensus oligonucleotide, but not with a GATA-1 mutant oligonucleotide (Fig. 8B, band 2, lanes 13 and 14). Additionally, a higher band seen strongly with CEM and faintly with K562 was due to binding of nuclear matrix binding protein SATB1 (band 1). Band 1 is specifically competed by the SATB1 consensus oligonucleotide (Fig. 8B, band 1, lanes 3 and 11) and is supershifted by the anti-SATB1 antibody (Fig. 8C, lanes 2 and 6). The SATB1 binding of the site 7 oligonucleotide seemed stronger than the binding to the site 6 oligonucleotide. A faint band was seen in CEM and K562 nuclear extracts when the site 7 probe was labeled to a high specific activity (Fig. 8, B and C, band 4). This band could be competed with an Oct-1 consensus oligonucleotide, but not with an Oct-1 mutant oligonucleotide (Fig. 8B, lanes 7 and 15 and 8 and 16). However, no supershift was seen using the anti-Oct-1 antibody (Fig. 8C, lanes 4 and 8).


Figure 7: DNase I footprint of beta IVS2 site 7. Lane 1, C lane of Maxam-Gilbert sequencing reaction; lane 2, control with 10 ng of DNase I; lane 3, control with 20 ng of DNase I. Reactions with nuclear extract include 50 µg of nuclear extract except as marked, and were treated with 160 ng of DNase I.



Other Gel Shift Analyses

A 20-bp oligonucleotide was synthesized to further investigate a faint but reproducible 7-bp footprint (site 9) found only on the sense strand (data not shown). It is possible that this represents either a functional RNA binding protein or a single-stranded DNA binding protein (40). However, no bands were seen by EMSA of double-stranded probe or labeled sense or antisense single-stranded probe with any nuclear extract (data not shown).

A 28-bp oligonucleotide was synthesized to characterize a footprint (site 12, Fig. 9) seen with nuclear extracts CEM, HEL-92, and K562. This oligonucleotide generated a complex gel shift pattern. One band was seen only with CEM, HEL-92, K562 and NIH 3T3 embryonic fibroblast nuclear extracts (41) (data not shown). Further characterization of binding to this site 12 probe showed that the bands are not competed by a general factor binding oligonucleotide (AP-1), and no supershifts were observed with the anti-SATB1 antibody (data not shown).


Figure 9: DNase I footprint of site 12 of beta IVS2. Lane 1, C lane of Maxam-Gilbert sequencing reaction; lane 2, control with no nuclear extract and 10 ng DNase I; lanes 3 and 9, control with no nuclear extract and 20 ng DNase I. Reactions include 50 µg of the nuclear extract indicated and 160 ng DNase I.




DISCUSSION

Human beta globin IVS2 has been entirely footprinted and further characterized by EMSA. Previous data have indicated that this region has several interesting structural and functional features: it contains a 3'-enhancer region (13, 14, 15) and two DNase I hypersensitive sites (17), and it is required for proper expression of the beta globin gene (16). We have previously analyzed the expression in MEL cells of beta constructs in which beta IVS2 has been replaced by or globin IVS2, and have found that these globin IVSs are not interchangeable. When beta IVS2 is replaced with IVS2, the base-line expression of beta is greatly decreased, and the cells are not inducible with Me2SO (18). In addition, constructs in which beta IVS2 has been replaced with IVS2 produce beta transcripts that are improperly initiated in K562 cells (42). Comparison of , , , and beta IVS2 using restriction maps and maps generated by the tfsites data base reveals no significant sequence conservation on the nucleotide level, and few conserved potential transcription factor binding sites, except for two GATA-1 binding sites that are conserved in position. The first is the second intronic GATA-1 site (Fig. 2), which is conserved in position between and beta globin. The second is the seventh GATA-1 site, which is conserved in position between ^Agamma and beta globin. Neither of these sites was found to bind GATA-1 in our experiments, and neither appears to be functionally important, at least in the expression of beta globin.

The footprint pattern of beta IVS2 and those areas studied by EMSA have revealed a very dense and complex pattern of protein binding (Table 1). Previous studies in which only the DNase I hypersensitive site of murine beta IVS2 was characterized also revealed a complex gel shift pattern(43) . These gel shift analyses covered about one-third of the total sequence of murine beta IVS2. Two proteins were identified as binding to murine beta IVS2, GATA-1 and Spi-1/Pu.1, an Ets family protein(44) . These murine beta IVS2 binding sites are not conserved in human beta IVS2. Human beta IVS2 does contain four potential Ets binding sites, but only one of these is footprinted in human beta IVS2 (site 14), and this one site has been shown to bind GATA-1 only.^2 The complexity of the binding pattern seen in IVS2 suggests that it is an area of complex regulatory function, and might be involved in the regulation of beta expression in the adult, and perhaps in earlier stages of erythropoiesis. Certainly a regulatory function is supported by the extensive binding of erythroid transcription factor GATA-1 to this region. The redundancy of the GATA-1 consensus sequences alone (nine sites), unique among the globin genes, indicates that some function is likely. By comparison, the human globin gene has only three GATA-1 consensus sequences, and the human gene only two GATA-1 consensus sequences. Three of the nine consensus sites in beta IVS2 are bound by GATA-1 as is a fourth related sequence. It is interesting that although binding sites for GATA-1 have been extensively characterized(34, 35, 36) , one still cannot predict with certainty which sites are utilized in vivo or in vitro.

Several stage- or tissue-specific bands were seen on EMSA of beta IVS2. The gel shift analyses on site 5 in particular revealed several bands of interest (Fig. 5, A and B), none of which could be supershifted by antibodies to general transcription factors or known factors important in the regulation of globin expression such as NFE-2, YY1, or GATA-1. Of particular interest is a band seen only with MEL (adult erythroid) cells (Fig. 5A, band 4), as this could represent a potential factor for positive expression of beta globin. The two bands seen with HEL-92 and K562 (bands 3, a and b) and faintly in CEM and the band seen only with CEM and K562 nuclear extracts (band 5) are also intriguing. Perhaps these proteins are not seen in murine MEL nuclear extract due to the species difference, or possibly they play a role in embryonic-fetal erythropoiesis, or in lymphoid cells. The HEL-92-, K562-, and possibly MEL-specific band bound to the site 6 oligonucleotide (Fig. 6A, band 2a) can be approximately sized due to the presence of SATB1 (band 3) binding to this oligonucleotide in CEM nuclear extract. The HEL-92 and K562 specific band runs more slowly than SATB1, which has a molecular mass of 103 kDa. This size would be larger than known erythroid regulatory proteins, with the exception of PE. PE is a 108-kDa protein that binds to sites near human globin (45). Its broad pattern of tissue distribution argues against it being any of the uncharacterized proteins we have found. The particular pattern of expression of the HEL-92- and K562-specific band, i.e. only in embryonic-fetal erythroid cells, could be relevant to down-regulation of beta globin expression early in development. This could be of particular importance if GATA-1 binding to beta IVS2 is indeed important in positive regulation of beta globin expression, as GATA-1 is certainly present in embryonic-fetal erythroid cells. Another differentially expressed band at site 12 (Fig. 9), seen with CEM, HEL-92, K562, and NIH 3T3 nuclear extracts, but not with the adult HeLa or MEL cells, is of unknown significance.

We have found one potential homeobox protein binding site in beta IVS2 (site 6, Fig. 6A, band 4). Previous data have shown that homeobox proteins may be important in erythroid differentiation(46) . Eight of nine genes in the HOX 2 cluster are expressed in erythroid cells, but rarely in B or T cells(46, 47, 48, 49) . There is also indirect evidence that HOX 3C may be necessary for adult hematopoiesis(50) . A band found in all extracts at site 6 was competed by an oligonucleotide with the Oct-1 consensus sequence (ATTTGCAT) (Fig. 6C, band 4, lanes 6 and 12) and by an oligonucleotide with the Antennapedia consensus sequence (CAATTAAA) (Fig. 6D, band 4, lane 7). The Antennapedia sequence is within the site 6 footprint and site 6 probe and is listed in the tfsites data base as the engrailed consensus. This is the core consensus for many HOX proteins including HOX B6, which may have a role in erythroid differentiation(51) . However, we do not see any erythroid cell specificity of the particular protein binding at this site. Although the site 7 footprinted region and oligonucleotide contain the consensus sequence for the homeobox protein bicoid, all bands seen with K562 nuclear extracts could be competed by GATA-1, SATB1, or Oct-1 sequence oligonucleotides, and so all bands in erythroid cell extracts are accounted for. There seems to be no erythroid-related homeobox binding at this site in beta IVS2.

We have found that the nuclear matrix-associated DNA binding protein SATB1 binds to site 6 of beta IVS2 with CEM nuclear extract and more intensely to site 7 of beta IVS2 with CEM and K562 nuclear extract (Fig. 6A, band 3; Fig. 8A, band 1). MARs are postulated to play an important role in the functional organization of chromatin loop domains. There is evidence that replication and transcription occur at the interface of DNA and the nuclear matrix and that the nuclear matrix is involved in RNA splicing(21, 22, 23) . Recent reports have indicated that DNA binding of some transcription factors is associated with the nuclear matrix (52, 53, 54, 55) . MARs have a strong potential for extensive unpairing or unwinding. Although MARs often contain or reside close to enhancer sequences(21) , their role is not clear as yet.

SATB1 is one of the characterized MAR binding proteins. It is a 103-kDa protein that binds as a monomer and is expressed primarily in thymus (21, 39). It binds selectively to MARs with well mixed ATC sequences (21, 39). beta IVS2 has been previously characterized to be one of nine sites in the 90 kb of the human beta globin gene locus to function as an MAR (19). MARs are regions of DNA at least 200 bp in length and are generally 70% AT-rich (21, 22). The areas binding SATB1 in beta IVS2 are about 73% AT-rich and do consist of a well mixed ATC sequence. Two distinct sites, in footprints 6 and 7, bind SATB1; two sites seem to be required for a strong SATB1 interaction to occur (21, 39). The site 6 and 7 oligonucleotides are, respectively, 83 and 75% AT-rich. The bands run very slowly on EMSA and may consist of a complex of SATB1 and some other protein, as has been suggested (20).
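The AT content quoted for the site 6 and site 7 oligonucleotides can be recomputed directly from the sense strands listed under "Oligonucleotides." The short Python sketch below does so; the function name is an illustrative assumption, and the calculation is simply the fraction of A and T bases in each sequence.

```python
def at_percent(seq: str) -> float:
    """Return the percentage of A and T bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(base in "AT" for base in seq) / len(seq)


# Sense strands as listed under "Oligonucleotides".
SITE6 = "TTATTTTTAATTGATACATAATCATTATACATATTTATGGG"
SITE7 = "GTTTATCTTATTTCTAATACTTTCCCTAATCTCTTTCTTT"

for name, seq in (("site 6", SITE6), ("site 7", SITE7)):
    print(f"{name}: {at_percent(seq):.1f}% AT over {len(seq)} nt")
```

Rounded to the nearest percent, these sequences give 83 and 75%, matching the values stated in the text.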

Preliminary data show that SATB1 is a suppressor of transcription based on transient cotransfection assays with a reporter gene (21). One regulatory region to which SATB1 binds is the Igµ heavy chain intronic enhancer, which is flanked by MARs. In this context SATB1 may help to repress expression in non-B cells (56). SATB1 has also been observed to bind the beta globin gene (21). In addition, SATB1 has recently been reported to bind to the human ^Agamma 3'-regulatory region at sites I and IV (20). These sites had been previously characterized as binding HOX protein 2.8 (2H) (57, 58). Besides being highly expressed in CEM cells, SATB1 was also found in heart, skeletal muscle, fetal liver, K562 cells, and B and T cells (20). It was proposed that the ^Agamma regulatory region might influence gene expression through interaction with the nuclear matrix. The regulatory region was also found to be an MAR, and this group speculated that promoter/enhancer interaction is mediated by SATB1 binding of MARs. However, they found no MAR near the ^Agamma promoter (20).

MARs and SATB1 binding in beta IVS2 could have any of several functions. Previous data seem to indicate a correlation between MARs and enhancer regions (21, 22). Also, each beta family gene (except ^Ggamma) harbors an MAR, while by comparison, no such sites exist in the large alpha globin gene complex (19). MARs might mediate an attachment between individual beta globin family genes and the beta globin LCR (which also contains MARs) (19), possibly having some role in globin switching. The beta IVS2 MAR might increase expression mediated by the beta IVS2 enhancer, just as the beta 3'-MAR (19), situated 500 bp downstream of the beta 3'-enhancer, might facilitate expression mediated by that enhancer. Alternatively, the beta IVS2 MAR in combination with GATA-1 binding or binding of other factors might function as an independent enhancer in beta IVS2. Possibly, MARs could mediate interaction between IVS2 and the beta promoter or 3'-enhancer.

From the complexity of the DNase I footprint and EMSA results we have obtained, it is clear that there are many interactions between human beta IVS2 sequences and nuclear factors, both known factors and those yet to be characterized. The biological significance of the protein-DNA interactions in beta IVS2, and any interactions between beta IVS2 and other regulatory sequences 5' or 3' to the human beta gene, other beta family genes, or the LCR, remain to be determined. The details of the relationships between DNA binding factors and globin gene switching also remain to be elucidated. Deletion analysis and site-directed mutagenesis of human beta IVS2 trans-acting factor binding sites may provide new insights into the relationship between protein binding to this region and human beta globin gene function.


FOOTNOTES

*
This work was supported in part by Public Health Service Grants DK-25274, HL-28381, and HL-48345 from the National Institutes of Health. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number(s) J00179.

§
Supported by a National Institutes of Health Hematology Training Grant DK-07373.

¶
Supported by a National Institutes of Health Clinical Investigator Award DK-02260.

**
To whom correspondence should be addressed: Columbia University College of Physicians and Surgeons, 701 W. 168th St., HHSC 1602, New York, NY 10032. Tel.: 212-305-4186; Fax: 212-923-2090.

(^1)
The abbreviations used are: kb, kilobase pair(s); MEL, murine erythroleukemia; MAR, matrix attachment region; LCR, locus control region; IVS2, intervening sequence 2; SATB1, special A-T-rich binding protein 1; bp, base pair(s); EMSA, electrophoretic mobility shift assay.

(^2)
M. Flamm, submitted for publication.


REFERENCES

  1. Blom van Assendelft, G., Hanscombe, O., Grosveld, F., and Greaves, D. R. (1989) Cell 56, 969-977
  2. Grosveld, F., Blom van Assendelft, G., Greaves, D., and Kollias, G. (1987) Cell 51, 975-985
  3. Talbot, D., Collis, P., Antoniou, M., Vidal, M., Grosveld, F., and Greaves, D. R. (1989) Nature 338, 352-355
  4. Orkin, S. H. (1990) Cell 63, 665-672
  5. Stamatoyannopoulos, G., and Nienhuis, A. W. (1987) in The Molecular Basis of Blood Diseases (Stamatoyannopoulos, G., Nienhuis, A. W., Leder, P., and Majerus, P. W., eds), pp. 66-105, W. B. Saunders Company, Philadelphia, PA
  6. Tsai, S. F., Martin, D. I. K., Zon, L., D'Andrea, A. D., Wong, G. G., and Orkin, S. H. (1989) Nature 339, 446-451
  7. Evans, T., and Felsenfeld, G. (1989) Cell 58, 877-885
  8. Andrews, N. C., Erdjument-Bromage, H., Davidson, M. B., Tempst, P., and Orkin, S. H. (1993) Nature 362, 722-728
  9. Chan, J. Y., Han, X.-L., and Kan, Y. W. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 11366-11370
  10. Miller, I. J., and Bieker, J. J. (1993) Mol. Cell. Biol. 13, 2776-2786
  11. Caterina, J. J., Donze, D., Sun, C.-W., Ciavatta, D. J., and Townes, T. M. (1994) Nucleic Acids Res. 22, 2383-2391
  12. Moi, P., Chan, K., Asunis, I., Cao, A., and Kan, Y. W. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 9926-9930
  13. Antoniou, M., de Boer, E., Habets, G., and Grosveld, F. (1988) EMBO J. 7, 377-384
  14. Behringer, R. R., Hammer, R. E., Brinster, R. L., Palmiter, R. D., and Townes, T. M. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 7056-7060
  15. Trudel, M., and Costantini, F. (1987) Genes & Dev. 1, 954-961
  16. Collis, P., Antoniou, M., and Grosveld, F. (1990) EMBO J. 9, 233-240
  17. Groudine, M., Kohwi-Shigematsu, T., Gelinas, R., Stamatoyannopoulos, G., and Papayannopoulou, T. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 7551-7555
  18. LaFlamme, S., Acuto, S., Markowitz, D., Vick, L., Landschultz, W., and Bank, A. (1987) J. Biol. Chem. 262, 4819-4826
  19. Jarman, A. P., and Higgs, D. R. (1988) EMBO J. 7, 3337-3344
  20. Cunningham, J. M., Purucker, M. E., Jane, S. M., Safer, B., Vanin, E. F., Ney, P. A., Lowrey, C. H., and Nienhuis, A. W. (1994) Blood 84, 1298-1308
  21. Dickinson, L. A., Joh, T., Kohwi, Y., and Kohwi-Shigematsu, T. (1992) Cell 70, 631-645
  22. Cockerill, P. N., and Garrard, W. T. (1986) Cell 44, 273-282
  23. Sun, J.-M., Chen, H. Y., and Davie, J. R. (1994) J. Cell Biochem. 55, 252-263
  24. Foley, G. E., Lazarus, H., Farber, S., Uzman, B. G., Boone, B. A., and McCarthy, R. E. (1965) Cancer 18, 522-529
  25. Martin, P., and Papayannopoulou, T. (1982) Science 216, 1233-1235
  26. Gey, G. O., Coffman, W. D., and Kubicek, M. T. (1952) Cancer Research 12, 264-272
  27. Lozzio, C. B., and Lozzio, B. B. (1975) Blood 45, 321-334
  28. Friend, C., Scher, W., Holland, J. G., and Sato, T. (1971) Proc. Natl. Acad. Sci. U. S. A. 68, 378-382
  29. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, pp. 6.46-6.48 and 13.88-13.94, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
  30. O'Neill, D., Bornschlegel, K., Flamm, M., Castle, M., and Bank, A. (1991) Proc. Natl. Acad. Sci. U. S. A. 88, 8953-8957
  31. Dignam, J. D., Martin, P. L., Shastry, B. S., and Roeder, R. G. (1983) Methods Enzymol. 101, 582-598
  32. Schreiber, E., Matthias, P., Muller, M. M., and Schaffner, W. (1989) Nucleic Acids Res. 17, 6419-6420
  33. Genetics Computer Group (1991) Program Manual for the GCG Package, version 7, Genetics Computer Group, Madison, WI
  34. Wall, L., de Boer, E., and Grosveld, F. (1988) Genes & Dev. 1089-1100
  35. Ko, L. J., and Engel, J. D. (1993) Mol. Cell. Biol. 13, 4011-4022
  36. Merika, M., and Orkin, S. H. (1993) Mol. Cell. Biol. 13, 3999-4010
  37. Vallerga, A., Shen, W.-F., Hauser, C., Largman, C., and Lawrence, H. J. (1991) Blood 78, 254 (Suppl.)
  38. Shen, W.-F., Largman, C., Lowney, P., Hauser, C., Simonitch, T. A., Hack, F. M., and Lawrence, H. J. (1989) Proc. Natl. Acad. Sci. U. S. A. 86, 8536-8540
  39. Nakagomi, K., Kohwi, Y., Dickinson, L. A., and Kohwi-Shigematsu, T. (1994) Mol. Cell. Biol. 14, 1852-1860
  40. Bandziulis, R. J., Swanson, M. S., and Dreyfuss, G. (1989) Genes & Dev. 3, 431-437
  41. Jainchill, J. L., Aaronson, S. A., and Todaro, G. J. (1969) J. Virol. 4, 549-553
  42. Donovan-Peluso, M., Acuto, S., Swanson, M., Dobkin, C., and Bank, A. (1987) J. Biol. Chem. 262, 17051-17057
  43. Galson, D. L., and Housman, D. E. (1988) Mol. Cell. Biol. 8, 381-392
  44. Galson, D. L., Hensold, J. O., Bishop, T. R., Schalling, M., D'Andrea, A. D., Jones, C., Auron, P. E., and Housman, D. E. (1993) Mol. Cell. Biol. 13, 2929-2941
  45. Lloyd, J. A., Case, S. S., Ponce, E., and Lingrel, J. B (1994) J. Biol. Chem. 269, 19385-19393
  46. Lawrence, H. J., and Largman, C. (1992) Blood 80, 2445-2453
  47. Shen, W.-F., Largman, C., Lowney, P., Simonitch, T. A., Hack, F. M., and Lawrence, H. J. (1990) Adv. Exp. Med. Biol. 271, 211-219
  48. Shen, W.-F., Detmer, K., Mathews, C. H. E., Hack, F. M., Morgan, D. A., Largman, C., and Lawrence, H. J. (1992) EMBO J. 11, 983-989
  49. Mathews, C. H. E., Detmer, K., Boncinelli, E., Lawrence, H. J., and Largman, C. (1991) Blood 78, 2248-2252
  50. Takeshita, K., Bollekens, J. A., Hijiya, N., Ratajczak, M., Ruddle, F. H., and Gewirtz, A. M. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 3535-3538
  51. Lawrence, H. J., Johnson, R. A., Perrine, S., and Largman, C. (1994) Ann. N. Y. Acad. Sci. 718, 165-176
  52. Bortell, R., Owen, T. A., Bidwell, J. P., Gavazzo, P., Breen, E., van Wijnen, A. J., De Luca, H. F., Stein, J. L., Lian, J. B., and Stein, G. S. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 6119-6123
  53. Dworetzky, S. I., Wright, K. L., Fey, E. G., Penman, S., Lian, J. B., Stein, J. L., and Stein, G. S. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 4178-4182
  54. Isomura, T., Tamiya Koizumi, K., Suzuki, M., Yoshida, S., Taniguchi, M., Matsuyama, M., Ishigaki, T., Sakuma, S., and Takahashi, M. (1992) Nucleic Acids Res. 20, 5305-5310
  55. Waitz, W., and Loidl, P. (1991) Oncogene 6, 29-35
  56. Forrester, W. C., van Genderen, C., Jenuwein, T., and Grosschedl, R. (1994) Science 265, 1221-1225
  57. Lavelle, D., Ducksworth, J., Eves, E., Gomes, G., Keller, M., Heller, P., and De Simone, J. (1991) Proc. Natl. Acad. Sci. U. S. A. 88, 7318-7322
  58. Sengupta, P. K., Lavelle, D. E., and De Simone, J. (1994) Blood 83, 1420-1427
