Identification and Characterization of a Conserved Erythroid-specific Enhancer Located in Intron 8 of the Human 5-Aminolevulinate Synthase 2 Gene*

Katharina H. Surinya‡, Timothy C. Cox§, and Brian K. May‡∥*

From the ‡Department of Biochemistry and §Department of Genetics, University of Adelaide, Adelaide, South Australia, Australia 5005

    ABSTRACT

Thirty-five kilobases of sequence encompassing the human erythroid 5-aminolevulinate synthase (ALAS2) gene have been determined. Analysis revealed a very low GC content, few repetitive elements, and evidence for the insertion of a reverse-transcribed mRNA sequence and a neighboring gene. We have investigated whether introns 1, 3, and 8, which correspond to DNase I-hypersensitivity sites in the structurally related mouse ALAS2 gene, affect expression of the human ALAS2 promoter in transient expression assays. Whereas intron 3 was marginally inhibitory, introns 1 and 8 of the human gene stimulated promoter activity. Intron 8 harbored a strong erythroid-specific enhancer activity which was orientation-dependent. Deletion analysis of this region localized enhancer activity to a fragment of 239 base pairs. Transcription factor binding sites clustered within this region include GATA motifs and CACCC boxes, critical regulatory sequences of many erythroid cell-expressed genes. These sites were also identified in the corresponding intron of both the murine and canine ALAS2 genes. Mutagenesis of these conserved sites in the human intron 8 sequence and transient expression analysis in erythroid cells established the functional importance of one GATA motif and two CACCC boxes. The GATA motif bound GATA-1 in vitro. The two functional CACCC boxes each bound Sp1 or a related protein in vitro, but binding of the erythroid Krüppel-like factor and the basic Krüppel-like factor could not be detected. The intron 8 enhancer region was not activated by GATA-1 together with Sp1 in transactivation experiments in COS-1 cells, indicating the involvement of a related Sp1 protein or of another unidentified erythroid factor.
Overall, these results demonstrate that a GATA-1-binding site and CACCC boxes located within the human ALAS2 intron 8 are critical for the erythroid-specific enhancer activity in transfected erythroid cells, and due to the conserved nature of these binding sites across species, it seems likely that these sites play a functional role in the tissue-restricted expression of the gene in vivo.

    INTRODUCTION

Large amounts of hemoglobin are synthesized during erythropoiesis, and this requires the coordinated synthesis of heme molecules and globin chains together with cellular iron uptake (1-3). A major regulatory site of erythroid heme formation is at the first step of the heme biosynthetic pathway, catalyzed by an erythroid-specific isoform of 5-aminolevulinate synthase (ALAS)1 (EC 2.3.1.37), designated ALAS2 (2). A second closely related isoform, ALAS1, is expressed ubiquitously and, while supplying heme for various hemoproteins in nonerythroid cells, probably also supplies small amounts of heme during erythropoiesis (2, 4). ALAS is the only enzyme of the heme pathway for which there are two distinct genes with the gene encoding ALAS1 located on chromosome 3 and the gene for ALAS2 on the X chromosome (5-7). For all other enzymes of the heme pathway, there is only one structural gene with either a composite promoter that is expressed ubiquitously and at elevated levels in erythroid cells or, alternatively, two separate promoters providing these functions (2, 8-10).

Molecular studies have established that expression of the ALAS2 gene occurs only in erythroid cells and is regulated at both the transcriptional and post-transcriptional levels. The gene is transcriptionally activated by erythropoietin (2) in concert with the other genes for heme pathway enzymes (11) and the genes encoding the globins (1-2, 12). Subsequent translation of the ALAS2 mRNA is modulated by the cellular iron status through an iron-responsive element located in the 5'-untranslated region, thus coupling iron availability with protoporphyrin production and heme formation and ultimately hemoglobin assembly (2, 13-15). Defects in the ALAS2 gene underlie the impaired heme biosynthesis observed in X-linked sideroblastic anemia (3).

The structural organization of the ALAS2 genes for human (13, 16), mouse (17), and chicken (18) is remarkably similar, consisting of 11 exons with a feature being the presence of a 5-6-kb intron in the 5'-untranslated region. DNase I-hypersensitivity mapping studies performed on the murine ALAS2 gene in mouse erythroleukemia (MEL) cells have identified five hypersensitive sites located in the immediate promoter region, at the 5' end of intron 1, within intron 3, and at the 3' end of intron 8 (17). Such DNase I-hypersensitive sites are indicative of nucleosome free regions of DNA associated with transcription regulatory factors (19-20). We have determined the entire sequence of the human ALAS2 locus, and we report in this communication that there are multiple transcription factor binding sites in the intronic regions corresponding to the DNase I-hypersensitivity sites identified in the murine ALAS2 gene.

In our recent transient expression studies of the 5'-flanking region of the human ALAS2 gene, we have demonstrated that deletion constructs from -10.3 kb to -293-bp express efficiently in both erythroid cells and nonerythroid cells (21). Although an inadequate assembly of repressive nucleosomes on the transiently transfected reporter gene constructs may explain expression in nonerythroid cells, there existed the possibility that the tissue-specific expression of the ALAS2 gene is conferred by an erythroid cell-specific enhancer located elsewhere in the gene. We have investigated whether the intronic sequences corresponding to DNase I-hypersensitivity sites in the mouse ALAS2 gene, namely introns 1, 3, and 8 play a role in the transcriptional regulation of the human ALAS2 gene. In the present study, we have shown that sequences located within the human ALAS2 intron 8 confer strong erythroid cell-specific enhancer activity to both the ALAS2 promoter and a heterologous promoter. This regulatory region was localized to 239 bp and is highly conserved in the human, mouse, and dog ALAS2 genes. By using site-directed mutagenesis and transient expression analysis of both ALAS2 and heterologous promoter/reporter gene constructs in erythroid cells, we have identified control elements in the intron 8 enhancer region and have analyzed protein binding to these sites by gel shift assays.

    EXPERIMENTAL PROCEDURES

Manual Sequencing of the Human ALAS2 Locus-- The complete sequence of the cosmid clone, pTC-EA1 (13, 16), which contained the entire ALAS2 gene, was determined with the view that it would facilitate a thorough examination of potential regulatory regions. For initial cloning events, the cosmid clone was digested with HindIII, and all fragments subcloned into HindIII-restricted pBluescript (Stratagene) or pTZ18 (Amersham Pharmacia Biotech) vector DNAs, with the exception of the largest fragment (approximately 18 kb) which self-propagated after religation due to the presence of the essentially intact pAVCV007 vector DNA. These subclones were then subjected to multiple and consecutive rounds of single and double enzyme digestion, and overlapping fragments subcloned into appropriately restricted vector DNAs. End sequences of each subclone were generated by double-stranded DNA sequencing using universally available vector sequencing primers and Sequenase™ V2.0 (U. S. Biochemical Corp.). When convenient restriction sites were limited, "shotgun" cloning and sequencing were performed. Remaining gaps in the sequence were targeted by selective digestion, cloning, and sequencing or, alternatively, directly sequenced using specifically designed oligonucleotide primers. Positioning and joining of the cosmid HindIII restriction fragments were achieved by the isolation and sequencing of appropriate restriction fragments spanning all HindIII sites. The integrity of sequence data from the original HindIII subclones was checked by comparison with sequence obtained from multiple EcoRI-derived cosmid subclones as well as by comparison of the sequence-derived restriction map with that of the physically derived map (16). Approximately 95% of the sequence was compiled from data obtained in both directions. The majority of remaining sequence was determined by reading multiple overlapping clones in the same direction.

Sequence Analysis-- The generated sequence of the ALAS2 locus was examined for simple repetitive elements by searching against the GenBank™ and EMBL data bases and by performing a GRAIL II analysis. Sequence composition and restriction analysis of the locus were performed using DNAsis-Mac V2.0 (Hitachi). Putative transcription factor binding sites were identified by word searches using the core consensus as the query and the GCG sequence analysis software package "FIND." Regions in which a clustering of putative cis-regulatory sequences was identified were further scrutinized for additional consensus binding motifs by direct visual inspection. For GATA-like sequences, only sites matching the full consensus (5'-WGATAR-3') (22-23) have been considered as potential binding sites.
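
As an illustration of the consensus word search described above, the following Python sketch scans both strands of a sequence for the full GATA consensus 5'-WGATAR-3' (W = A or T, R = A or G). It is not the GCG "FIND" program used in this study; the function names are hypothetical, and the test sequence is the GATA-A oligonucleotide listed under Gel Shift Assays.

```python
import re

# Full GATA consensus 5'-WGATAR-3' in IUPAC terms: W = A/T, R = A/G.
# The lookahead lets overlapping matches be reported as well.
GATA_CONSENSUS = re.compile(r"(?=([AT]GATA[AG]))")
MOTIF_LENGTH = 6

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_gata_sites(seq):
    """Return sorted (start, strand, motif) tuples for every WGATAR match.

    start is 0-based and always given in sense-strand coordinates;
    strand is "+" for the sense strand and "-" for the antisense strand.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in GATA_CONSENSUS.finditer(seq)]
    for m in GATA_CONSENSUS.finditer(reverse_complement(seq)):
        # Map the position on the reverse complement back to the sense strand.
        start = len(seq) - (m.start() + MOTIF_LENGTH)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


# GATA-A gel shift oligonucleotide (sense strand) from this paper.
gata_a = "AGCTACTGCCTATCTAGTCATTGC"
print(find_gata_sites(gata_a))
```

For this oligonucleotide the scan reports a single antisense-strand match, AGATAG, which appears on the sense strand as CTATCT.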

Amplification and Sequencing of Murine ALAS2 Intron 8-- Clones encompassing intron 8 of the murine ALAS2 gene were generated using the polymerase chain reaction and employing both a degenerate exon 8 (AE8Z-1: 5'-CACGAATTCGGRGCBCTGACBTTYGTDGA-3') and exon 9 primer (AE9Z-1: 5'-GCGAGAATTCCVCCSACRSMRCCMAADGCYTT-3') on murine (CBA × C57Bl/6J) genomic DNA (EcoRI sites are underlined). Sequence of the murine ALAS2 intron was obtained following digestion of the amplified product with EcoRI and cloning into an appropriately restricted phagemid vector. One clone was sequenced and was identical to that of intron 8 sequence determined by analysis of a previously isolated murine ALAS2 genomic clone. A homology plot of the human and murine sequences was carried out using the DNAsis-Mac V2.0 software.
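
The degenerate primers above use standard IUPAC ambiguity codes. As a rough aid to reading them, the sketch below counts how many distinct sequences each degenerate primer represents and confirms that each carries an EcoRI site (GAATTC). The helper names are hypothetical and are not part of the original procedure.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

ECORI_SITE = "GAATTC"


def degeneracy(primer):
    """Number of distinct unambiguous sequences encoded by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


# Primer sequences as given in the text.
AE8Z_1 = "CACGAATTCGGRGCBCTGACBTTYGTDGA"
AE9Z_1 = "GCGAGAATTCCVCCSACRSMRCCMAADGCYTT"

for name, primer in (("AE8Z-1", AE8Z_1), ("AE9Z-1", AE9Z_1)):
    print(name, "degeneracy:", degeneracy(primer),
          "EcoRI site present:", ECORI_SITE in primer)
```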

Construction of Intron/Reporter Gene Plasmids-- Human ALAS2 intron 1, intron 3, and intron 8 fragments were isolated from the cosmid clone, pTC-EA1 (13), and ligated into the firefly luciferase (LUC) reporter gene vector, pGL2-Basic (Promega) containing 293 bp of human ALAS2 promoter sequence (pALAS-293-LUC) (21). pALAS-293-LUC previously described in Surinya et al. (21) contains ALAS2 promoter sequence from -293 bp to +28 bp and is referred to here as pAp-LUC.

The synthesis of a plasmid construct containing 293 bp of ALAS2 promoter sequence and intron 1 was performed in several steps. A HindIII site was introduced at -7/-2 in the ALAS2 promoter and +4973/+4978 at the translation initiation site located in exon 2 by site-directed mutagenesis in a subclone containing -6.0 kb to +5.0 kb of contiguous human ALAS2 sequence. A 289-bp BglII-HindIII fragment (-293 to -4) and a 4.98-kb HindIII fragment (-4 to +4978) were individually isolated from this subclone. The 289-bp BglII-HindIII fragment was ligated into the BglII-HindIII digested promoterless firefly luciferase (LUC) reporter gene vector, pGL2-Basic, and then the 4.98-kb HindIII fragment was subsequently ligated into this HindIII-linearized plasmid construct to generate pAp-I1(4.9kb)-LUC. Clones were initially screened by restriction enzyme analysis and then sequenced to confirm the correct orientation of the intron 1 fragment downstream of the ALAS2 promoter.

The entire sequence of intron 3 containing 850 bp was generated by the polymerase chain reaction using Pfu polymerase (Stratagene) and pTC-EA1 as the template and primers with engineered EcoRI sites as follows: AE3Z-1, 5'-CAAGAATTCGTSAAGRCTTTCAAGACAG-3' (binding at the 3' end of exon 3), and AE4Z-1, 5'-GTGAATTCAGKCATATTRTTCTGAATCAG-3' (binding at the 5' end of exon 4). The introduced EcoRI sites in these primers are underlined. The amplified product was cloned into the EcoRI-digested pBluescript KS+ vector. A SmaI-HindIII fragment was then isolated from this clone, blunted with Klenow, and ligated upstream of the 293-bp ALAS2 promoter, into the SmaI-linearized pAp-LUC, and the resulting construct was designated pAp-I3(850)-LUC. Only clones of this plasmid construct containing intron 3 in the forward orientation were isolated. One clone was sequenced and was identical to that for intron 3 as determined from sequence analysis of human ALAS2 genomic clones.

To isolate intron 8 sequence, the polymerase chain reaction was performed again using pTC-EA1 as the template and the following primers (introduced EcoRI sites are underlined): primer 1, 5'-GAAGAATTCGTAAGTGAATGCTTTGGGCCT-3' (which bound at the exon 8/intron 8 boundary), and primer 2, 5'-AGGGAATTCCTGGAGACAGAAAGGAATAAG-3' (which bound near the intron 8/exon 9 boundary). The 579-bp amplified product was digested with EcoRI, and a 460-bp fragment (as a result of an internal EcoRI site in intron 8) was ligated into the similarly digested pBluescript KS+ phagemid (pKS-I8(460)). Sequence analysis confirmed that the amplified fragment was identical to intron 8. Since repeated attempts to synthesize plasmid constructs containing intron 8 ligated 3' of the luciferase reporter gene were unsuccessful, pKS-I8(460) was digested with KpnI and BamHI, and the resulting fragment was cloned into KpnI/BglII linearized pAp-LUC to generate pAp-I8(460)-LUC. The KpnI-BamHI intron 8 fragment was also ligated into the KpnI/BglII-linearized construct pALAS-293mut6-LUC (referred to as pAt-LUC) containing a TATA box at the -27 site (as described in Ref. 21), to generate pAt-I8(460)-LUC. Although attempts to synthesize plasmid constructs containing intron 8 ligated in the reverse orientation upstream of the ALAS2 promoter were also unsuccessful, a 460-bp human ALAS2 intron 8 fragment was ligated in both orientations upstream of the thymidine kinase promoter in the plasmid ptk-LUC. This plasmid contained a 164-bp BamHI-BglII fragment of the thymidine kinase promoter isolated from the pBLCAT2 vector (provided by Dr. C. Hahn) (24) and ligated into the BglII-linearized pGL2-Basic vector. 
To synthesize these plasmid constructs, a KpnI-SacI fragment and a KpnI-SmaI fragment were individually isolated from pKS-I8(460) and ligated in the native and reverse orientations, upstream of the thymidine kinase promoter in the similarly digested ptk-LUC vector to generate ptk-I8(460)-LUC and ptk-I8(460)R-LUC, respectively.

To synthesize plasmid constructs containing the 3'-flanking sequence of the human ALAS2 gene, a 2.935-kb BglII-BamHI fragment was isolated from a subclone containing a 9.45-kb HindIII fragment from the genomic clone, pTC-EA1, corresponding to the 3'-end of the human ALAS2 gene. This BglII-BamHI fragment was ligated into the BamHI-linearized pAp-LUC, downstream of the luciferase reporter gene, and the resulting construct was designated pAp-3'(2.9kb)-LUC.

Construction of Intron 8 Deletion Constructs-- A series of human ALAS2 intron 8 deletion constructs ranging in length from 403 bp to 115 bp were synthesized as follows. A PstI fragment containing 403 bp of intron 8 sequence was isolated from pKS-I8(460), blunted with T4 DNA polymerase enzyme, and ligated into the SmaI-linearized pAp-LUC to create pAp-I8(403)-LUC. To generate a plasmid containing 279 bp of intron 8 sequence, extending from the HindIII site to the introduced EcoRI site located at the end of intron 8, the plasmid pAp-I8(403)-LUC was digested with HindIII, and the 590-bp HindIII fragment containing intron 8 and ALAS2 promoter was religated with the vector fragment, resulting in pAp-I8(279)-LUC. To synthesize constructs containing 115 and 176 bp of intron 8 sequence, the polymerase chain reaction was performed using pAp-I8(403)-LUC as the template and the following primers: primer 3, 5'-CACCCTCTCGAGAAGCTTCATCTTAGCTCC-3' (an introduced XhoI site is underlined); with primer 4, 5'-GTCGTAGCGTCGACTTCTGCTGCTTTGAGATA-3' (an introduced SalI site is underlined); and, primer 5, 5'-AAAGTCCTCGAGCAAAGCAGCAGAATTATC-3' (an introduced XhoI site is underlined); with primer 6, 5'-GCCAAAGGGTCGACCTGGAGACAGAAAGGAAT-3' (an introduced SalI site is underlined), respectively. The amplified products were each digested with XhoI and SalI and individually ligated into the XhoI-linearized pAp-LUC plasmid in both orientations. The sequence of the amplified fragments was confirmed as that from intron 8. The resulting plasmids were designated as pAp-I8(115)-LUC and pAp-I8(176)-LUC, respectively. To synthesize a plasmid containing a 239-bp fragment of intron 8 sequence from the PstI site extending to the first introduced SalI site, the polymerase chain reaction was employed using pTC-EA1 as the template and primer 1 with primer 4. The amplified product was digested with PstI and SalI and ligated into the similarly digested pBluescript KS+ (pKS-I8(239)). 
This plasmid was then digested with SmaI and SalI, and a 242-bp fragment was ligated into the SmaI/XhoI-linearized pAp-LUC to generate pAp-I8(239)-LUC. A 132-bp SmaI-HindIII fragment was isolated from pKS-I8(239), blunted with Klenow enzyme, and ligated into SmaI-linearized pAp-LUC to create pAp-I8(129)-LUC. All constructs were verified by restriction mapping and sequence analysis.

Site-directed Mutagenesis-- Site-directed mutagenesis was performed using the Bio-Rad Muta-Gene M13 in vitro Mutagenesis kit according to the manufacturer's instructions. The plasmid pKS-I8(239) was transformed into the Escherichia coli CJ236 strain, and following superinfection with the helper phage M13K07, single-stranded DNA was purified and used as a template in the mutagenesis reaction. To inactivate the two CACCC sites (designated CACCC site A and CACCC site B) and the two GATA sites (GATA site A and GATA site B), PvuII sites were introduced. In the final step, a SmaI-XhoI fragment harboring the mutation was excised from pBluescript KS+ and subcloned into the similarly digested pAp-LUC vector. To generate constructs containing the mutated CACCC site A in combination with either the mutated CACCC site B, GATA site A, or GATA site B, a two-step cloning procedure was performed. The plasmid pAp-I8mut1-LUC containing the mutated CACCC site A was digested with HindIII and SalI, and the 117-bp fragment was removed and subsequently replaced with a HindIII-SalI fragment isolated from pAp-I8mut2-LUC, pAp-I8mut3-LUC, and pAp-I8mut4-LUC containing a mutation in CACCC site B, GATA site A, and GATA site B, respectively, to generate pAp-I8mut5-LUC, pAp-I8mut6-LUC, and pAp-I8mut7-LUC. A SmaI-XhoI fragment was isolated from pAp-I8mut5-LUC and pAp-I8mut7-LUC and ligated into the similarly digested ptk-LUC vector. These constructs, designated ptk-I8mut1-LUC and ptk-I8mut2-LUC, contain mutations in CACCC sites A and B, and CACCC site A and GATA site B, respectively. Mutant clones were confirmed by DNA sequence analysis. The primers used in these reactions were as follows with the mutations underlined: CACCC site A, 5'-TAAACCCCTCCTCAGCTGTAGCCCCAAGCTT-3'; CACCC site B, 5'-CAGCTAAAGGTTCAGCTGAGCTACTGCCT-3'; GATA site A, 5'-CCAGCTACTGCCAGCTGAGTCATTGCAT-3'; GATA site B, 5'-ACTTGAAAGTCCAGCTGCAAAGCAGCAG-3'.
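
Because the underlining that marked the mutated bases has been lost in this text, a simple check is to locate the introduced PvuII recognition sequence (CAGCTG) in each mutagenic primer. The sketch below does only that; the dictionary and function names are hypothetical, and it makes no claim about how the authors screened their clones.

```python
# PvuII recognition sequence; it is palindromic, so scanning one strand suffices.
PVUII_SITE = "CAGCTG"

# Mutagenic primer sequences as listed in the text.
mutagenic_primers = {
    "CACCC site A": "TAAACCCCTCCTCAGCTGTAGCCCCAAGCTT",
    "CACCC site B": "CAGCTAAAGGTTCAGCTGAGCTACTGCCT",
    "GATA site A": "CCAGCTACTGCCAGCTGAGTCATTGCAT",
    "GATA site B": "ACTTGAAAGTCCAGCTGCAAAGCAGCAG",
}


def site_positions(seq, site=PVUII_SITE):
    """Return the 0-based start of every occurrence of site in seq."""
    seq = seq.upper()
    width = len(site)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


for name, primer in mutagenic_primers.items():
    print(name, site_positions(primer))
```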

Cell Lines and DNA Transfections-- The human erythroleukemia cell line, K562, the adherent murine erythroleukemia MEL (F4-12B2) cell line, and COS-1 cells were all maintained as described previously (21). For electroporation, exponentially growing K562, MEL, or COS-1 cells were transfected with 2 pmol of the reporter construct and 250 µg of sheared salmon sperm DNA (21). As an internal control, 5 µg of the β-galactosidase expression vector, RSV-β-galactosidase, was cotransfected into K562 and COS-1 cells and 10 µg of this vector into MEL cells. Cells were harvested 24 h after transfection, and cell lysates were assayed for luciferase and β-galactosidase activity.

Plasmid DNA was prepared by the CsCl/ethidium bromide equilibrium density gradient procedure (25). All transient transfections were performed in triplicate, with at least three different plasmid DNA preparations.

Reporter Gene Assays-- Transfected cells were harvested (21), and the supernatant was assayed to determine total protein concentration (Bio-Rad protein microassay procedure). Subsequent assays (luciferase and β-galactosidase) were performed with 100 µg of cell lysate (21). Luciferase activities were normalized for transfection efficiency using the β-galactosidase activity as an internal control, and the data were expressed as "relative luciferase activity." The fold transactivation obtained with the addition of an ALAS2 intronic fragment or the 3'-flanking sequence was calculated relative to the promoter construct (pAp-LUC), which was assigned a value of 1.0.
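
The normalization described above can be sketched in a few lines of Python. This is an illustration only: the function names, the construct label "test-construct", and all numerical readings are invented for the example and are not data from this study.

```python
from statistics import mean


def normalize(readings):
    """Divide each luciferase reading by its paired beta-galactosidase reading.

    readings is a list of (luciferase, beta_galactosidase) tuples, one per
    replicate transfection.
    """
    return [luc / bgal for luc, bgal in readings]


def fold_activity(readings_by_construct, reference="pAp-LUC"):
    """Express mean normalized activity relative to the reference construct.

    The reference construct is therefore assigned a value of 1.0.
    """
    ref_mean = mean(normalize(readings_by_construct[reference]))
    return {
        name: mean(normalize(readings)) / ref_mean
        for name, readings in readings_by_construct.items()
    }


# Invented readings for illustration only (arbitrary units).
example = {
    "pAp-LUC": [(1000.0, 2.0), (1250.0, 2.5), (750.0, 1.5)],
    "test-construct": [(5000.0, 2.0), (6250.0, 2.5), (3750.0, 1.5)],
}
print(fold_activity(example))
```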

Gel Shift Assays-- The preparation of nuclear extracts, radiolabeling of single-stranded oligonucleotides with [γ-32P]ATP, and the binding reactions used in the detection of GATA- and CACCC box-binding proteins were performed as described (21).

The sequences of the sense strand synthetic oligonucleotides used in the gel shift experiments are as follows. Binding motifs are underlined: GATA-A, 5'-AGCTACTGCCTATCTAGTCATTGC-3'; GATA-B, 5'-TTGAAAGTCCTATCTCAAAGCAGC-3'; β-globin GATA-cons (26), 5'-TTGGCTCCCTTATCATGTCCCTG-3'; CAC-A, 5'-CTAGTCCCCCACCCTAGCGAA-3'; CAC-B, 5'-AAAGGTCCCCACCCAGCTACT-3'; β-globin CACCC (27), 5'-AGCTAGCCACACCCTGAAGCT-3'; Sp1-cons, 5'-ATTCGATCGGGGCGGGGCGAGC-3'; nonspecific competitor (CACCC site A containing a PvuII mutation), 5'-CTAGTCCTCCAGCTGTAGCGAA-3'.

For supershift assays, the GATA-1-specific monoclonal antibody, N-6 (28) (provided by Dr. M. Crossley), and polyclonal antibodies to Sp1 (PEP2) (Santa Cruz Biotechnology, Inc.), BKLF, and EKLF (provided by Dr. M. Crossley) were incubated in the binding reaction prior to the addition of probe. Gel shift competition assays were performed with unlabeled competitor oligonucleotides included in the binding reactions. To determine the binding affinity constants (Kd) of GATA-1, the purification of a GST-GATA-1 zinc finger fusion protein (GST-GATA-1(f); Ref. 29), the binding reactions, and electrophoresis conditions were performed as described previously (21).

Transactivation Studies-- Transactivation experiments in COS-1 cells were performed with 2 pmol of the reporter construct and 10 µg of the respective cDNA clones for the murine GATA-1 (pXM/GF-1; kindly provided by Dr. S. H. Orkin), Sp1 (provided by Dr. M. Crossley), or murine FOG (pMT2/FOG; provided by Dr. S. H. Orkin) (30). For transactivation experiments in K562 cells with exogenously expressed EKLF, 2 pmol of the reporter construct and 7.5 µg of the EKLF cDNA expression clone, pSG5/EKLF (27) (provided by Dr. J. Bieker), were employed. The vector pGL2-Basic was included as a control. Cells were harvested 24 h after transfection, and 100 µg of total protein was assayed for luciferase activity. The fold transactivations were determined following subtraction of the background activity obtained with pGL2-Basic (21) which was 1.2- to 1.4-fold.

    RESULTS

Sequence Features of the Human ALAS2 Locus-- We previously reported the genomic organization of the human ALAS2 gene (13, 16). We have now determined over 35 kb of sequence encompassing the ALAS2 locus, and this includes intronic sequences together with about 10.3 kb of 5'-flanking region and 2.9 kb of 3'-flanking region. Several noteworthy features were revealed (Fig. 1). There was a marked paucity of common repetitive elements with only two truncated LINE homologous sequences (2.3 kb and 310 bp in length) in the 5'-flanking region and four copies of Alu sequences, three of which were located immediately 3' to an exon (Fig. 1). Other sequence features (see Fig. 1) include the following: 1) two large dinucleotide (AC-type) repeat sequences located in introns 4 and 7, the largest of which has been examined in further detail and shown to exhibit 78% heterogeneity (31), whereas the smaller repeat showed considerably less variation2; 2) a perfectly repeated 17-bp sequence located approximately 950 bp apart in the 5'-flanking region. Immediately upstream of the 3' copy of the direct repeat was a long poly(A) stretch, suggesting that the intermediate sequence resulted from the reverse transcription of an mRNA followed by integration into the genome. Additional support for such an event was the observation that the aforementioned 310-bp truncated LINE element comprised the first one-third of the sequence. No other sequences in the current data bases showed any significant similarity to this integrated "cDNA"; 3) a directly repeated 23-bp sequence within intron 9 with only one mismatch evident between the two copies and no intervening DNA between the two sequences; 4) overlapping expressed sequence tags (ESTs; GenBank™ accession numbers C01178, AA554484, and N59092) located approximately 1.2 kb 3' to the final human ALAS2 exon, indicating the presence of a convergent transcript (Fig. 1).


Fig. 1.   Sequence features of the human ALAS2 locus. The organization of the ALAS2 gene (16) is schematically represented on the completely sequenced pTC-EA1 cosmid insert. A kilobase scale bar is positioned below the gene structure. Positions of various repetitive elements, an inserted cDNA, and the clustered expressed sequence tags (ESTs) are indicated below the scale bar.

Potential Regulatory Regions of the Human ALAS2 Gene-- CpG dinucleotide density across the 35-kb ALAS2 locus was very low (6.85 × 10⁻³) compared with that of the GpC frequency (4.11 × 10⁻²), which was closer to that expected for both (4.51 × 10⁻²). This low density was also reflected by a low frequency of HhaI and HpaII restriction enzyme recognition sequences (data not shown). Often, a high density of CpG dinucleotides (CpG islands) occurs in housekeeping gene promoters and regulatory regions and corresponds to hypomethylated DNA (32). Therefore, our data are in keeping with the cell-specific expression of the gene.
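
The comparison of observed CpG and GpC frequencies with an expected value can be sketched as follows. This is a minimal illustration that assumes the expected dinucleotide frequency is the product of the C and G base frequencies, a common convention; the text does not state the formula used, and the function name and the short test sequence are hypothetical.

```python
def dinucleotide_stats(seq):
    """Return observed CpG and GpC frequencies and the expected frequency.

    Observed frequencies are counts divided by the number of dinucleotide
    positions (len(seq) - 1). The expected value is assumed here to be
    freq(C) * freq(G), which applies equally to CpG and GpC.
    """
    seq = seq.upper()
    n = len(seq)
    positions = n - 1
    freq_c = seq.count("C") / n
    freq_g = seq.count("G") / n
    return {
        # "CG" and "GC" cannot overlap themselves, so str.count is exact.
        "CpG": seq.count("CG") / positions,
        "GpC": seq.count("GC") / positions,
        "expected": freq_c * freq_g,
    }


# Toy sequence for illustration only; not part of the ALAS2 locus.
print(dinucleotide_stats("ACGTGCGC"))
```

A CpG frequency well below both the GpC frequency and the expected value, as reported for this locus, is the pattern associated with CpG depletion.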

Possible regulatory elements, notably GATA motifs and CACCC box protein-binding sites, were identified by sequence analysis throughout the 10.3 kb of 5'-flanking sequence, 2.9 kb of 3'-flanking sequence, and also in all introns (data not shown; Ref. 13). In contrast, sequences with high similarity to the binding site motif for the erythroid-enriched factor, NF-E2 (33), were few, with single sites (10/11 identity) in the promoter (13, 21) and immediately 3' to the final exon, and a site (9/11 identity) located within intron 3.

Potential GATA sites were found throughout the human ALAS2 gene sequence (data not shown). Although many of these sites may prove to be non-functional, a high frequency of sites was found within the 4.9-kb intron 1 (17 consensus sites) and the 562-bp intron 8 (4 consensus sites). Within intron 1, it was of interest that 16 of the 17 potential GATA sites were present in the same antisense orientation. Six CACCC-like sequences (34) were also identified in intron 1, four being located within 500 bp of each other at the 5' end of the intron. Intron 8 also contained two CACCC elements, and these two sequences together with the four potential GATA-1-binding sites were all located within a 270-bp region as described later. Again, all GATA sites in intron 8 were in the same relative (antisense) orientation. Intron 3 consisting of 850 bp of sequence possessed fewer consensus binding sites for these transcription factors (two CACCC sites and one GATA site) although perhaps notably all were clustered. Intriguingly, introns 1, 3, and 8 were the locations of DNase I-hypersensitivity sites in the mouse ALAS2 gene (Fig. 2A; Ref. 17). Since such hypersensitive sites often reflect the binding of transcription factors (19), these introns were further examined by transient reporter gene assays for a possible role in the transcriptional regulation of the human ALAS2 gene.
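
To compare the two introns on an equal footing, the site counts above can be converted to sites per kilobase. The short sketch below uses only the counts and intron lengths stated in the text; the function name is hypothetical.

```python
def sites_per_kb(n_sites, length_bp):
    """Density of binding sites expressed per kilobase of sequence."""
    return n_sites / (length_bp / 1000)


# Counts and lengths as stated in the text.
intron1_density = sites_per_kb(17, 4900)  # 17 GATA sites in the 4.9-kb intron 1
intron8_density = sites_per_kb(4, 562)    # 4 GATA sites in the 562-bp intron 8

print(f"intron 1: {intron1_density:.2f} GATA sites per kb")
print(f"intron 8: {intron8_density:.2f} GATA sites per kb")
```

On these figures, intron 8 carries roughly twice the density of consensus GATA sites found in intron 1.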


Fig. 2.   Analysis of introns 1, 3, and 8 and the 3'-flanking region of the human ALAS2 gene. A, structure of the murine and human ALAS2 genes composed of 11 exons together with 5'- and 3'-flanking regions. The locations of DNase I-hypersensitivity sites identified in the murine ALAS2 gene (17) are indicated by upward facing arrows, and the corresponding regions in the human ALAS2 gene, namely the immediate promoter region, introns 1, 3, and 8 together with the 3'-flanking region, are bracketed. B, expression of ALAS2 intron and 3'-flanking region constructs in transiently transfected K562, MEL, and COS-1 cells. The transcription initiation site at +1 is depicted by the arrow. The orientation of the introns and the 3'-flanking region is also indicated by the arrows. Luciferase activities were standardized relative to β-galactosidase activity (RSV-β-galactosidase) as an internal control for variation in transfection efficiencies. The normalized luciferase activities are expressed relative to pAp-LUC containing 293 bp of human ALAS2 promoter, assigned a value of 1.0. Intron 8 was ligated upstream of the ALAS2 promoter containing a canonical TATA box at -27 (pAt-I8(460)-LUC), and the normalized luciferase activity is expressed relative to pAt-LUC containing a canonical TATA box in the ALAS2 promoter (ALAS2-T) (21). The data are averages obtained from constructs tested in quadruplicate in at least three experiments and are represented as the mean ± S.D. ND (not determined) corresponds to those constructs not tested in a particular cell line. C, intron 8 was ligated upstream of the heterologous thymidine kinase promoter. Constructs were transiently expressed in K562, MEL, and COS-1 cells and cotransfected with RSV-β-galactosidase. The normalized luciferase activities are expressed relative to ptk-LUC, assigned a value of 1.0. The orientation of intron 8 is indicated by the arrow, and the data were obtained as described above.

Introns 1, 3, and 8 of the Human ALAS2 Gene Affect ALAS2 Promoter Activity-- Intron 1 (4.9 kb) was cloned downstream of the human ALAS2 promoter, whereas intron 3 (850 bp) and the most 3' 460 bp of intron 8 were each cloned upstream of the human ALAS2 promoter which was fused to the firefly luciferase reporter gene (Fig. 2B). All constructs were transiently transfected into either K562, MEL (F4-12B2), or COS-1 cells, the latter as a nonerythroid control, and luciferase activity was determined in cell lysates. The plasmid construct pAp-LUC containing ALAS2 promoter (-293 to +28) fused to the luciferase reporter gene was assigned a value of 1.0 in each cell line (Fig. 2B). Intron 1 located in its native position in the 5'-untranslated region (pAp-I1(4.9kb)-LUC) increased transcription of the ALAS2 promoter by 3.2- and 2.4-fold in K562 and MEL cell lines, respectively, but reduced expression in COS-1 cells (Fig. 2B). The insertion of intron 3 upstream of the ALAS2 promoter (pAp-I3(850)-LUC) reduced expression in K562 cells to about 60% of pAp-LUC alone (Fig. 2B). Importantly, inclusion of intron 8 sequence upstream of the promoter (pAp-I8(460)-LUC) substantially increased expression to 12.0- and 4.0-fold in K562 and MEL cells, respectively, but did not alter expression in COS-1 cells (Fig. 2B). Attempts to synthesize ALAS2/reporter gene constructs containing intron 8 in the reverse orientation or downstream of the reporter gene were unsuccessful. The 3'-flanking region (2.9 kb) was also investigated since this sequence contained putative transcription factor binding sites although they did not correlate with the location of DNase I-hypersensitivity sites detected in the murine ALAS2 gene (17). It was found that a construct containing 2.9 kb of 3'-flanking sequence (pAp-3'(2.9kb)-LUC) had a marginally repressive effect on the ALAS2 promoter-luciferase reporter gene in K562 cells (Fig. 2B), but this sequence was not further examined.

To determine if intron 8 could also confer erythroid cell-specific activity to a heterologous promoter, constructs were generated in which intron 8 was ligated upstream of the thymidine kinase promoter. As seen in Fig. 2C, in the native orientation (ptk-I8(460)-LUC), intron 8 increased transcription of the thymidine kinase promoter 25.0- and 12.0-fold in K562 and MEL cells, respectively, but only 1.8-fold in COS-1 cells. In the reverse orientation (ptk-I8(460)R-LUC), intron 8 displayed reduced enhancer activity of 4.0- and 2.0-fold in K562 and MEL cells, respectively (Fig. 2C). Hence, intron 8 exhibits erythroid cell-specific enhancer activity although the extent of this activity was orientation-dependent.

The human ALAS2 promoter contains a non-canonical TATA box (13) which is able to bind GATA-1 in vitro (21). We examined whether the binding of GATA-1 to the non-canonical TATA site may be required for the enhancer activity of intron 8 (35) by converting this site to a consensus TATA box (pAt-LUC) that did not bind GATA-1 in vitro (21). However, intron 8 stimulated the activity of this modified promoter to a degree similar to that observed with the native ALAS2 promoter (Fig. 2B).

"Phylogenetic Footprinting" of Intron 8-- The correlation between the observed enhancer activity of the human ALAS2 intron 8 and the presence of a DNase I-hypersensitive site in the murine ALAS2 intron 8 (17) prompted a comparison of these sequences together with that of the available canine sequence (36) (GenBank™ accession number U17083). As seen in Fig. 3, the murine intron 8 is approximately twice the size of the human and canine introns and contains an additional 521 bp of sequence that exhibits similarity to a rodent-specific repetitive sequence and was apparently inserted following the divergence of this species. There is a striking degree of sequence identity between the human and canine ALAS2 intron 8 sequences (76% identity over the 3'-most 460 bp), compared with 59% between the human and murine sequences and 54% between the canine and murine sequences (Fig. 3). Of particular note is the complete conservation in all three species of a putative CACCC box-binding site (5'-CCCCACCC-3'), designated CACCC site B (Fig. 3, nucleotides 327-334), and two putative GATA-1-binding sites (sites A and B at nucleotides 344-349 and 380-385, respectively). The GATA sites A and B are both located on the non-coding strand (5'-AGATAG-3') (Fig. 3) and conform to the consensus for GATA-1 (22-23). A second possible CACCC box-binding site (5'-CCCCACCC-3'), designated CACCC site A (Fig. 3), was located further upstream at nucleotides 268-275 in the human ALAS2 intron 8 sequence but contains a single nucleotide mismatch of C to T in the other two species (Fig. 3). There are two other putative GATA-1-binding sites in the human and canine ALAS2 intron 8 sequences, but these sites are not conserved in the murine intron 8 sequence (Fig. 3).


Fig. 3.   Comparison of the intron 8 sequences between human, canine, and murine ALAS2 genes. The murine ALAS2 intron 8 contains an additional 521 bp of sequence inserted at the 5' end of the intron. Regions of sequence homology are shaded. Only the human ALAS2 intron 8 sequence is numbered. Conserved putative GATA and CACCC box-binding sites are darkly shaded. Two putative GATA sites conserved between the human and canine ALAS2 intron 8 sequences are also darkly shaded.
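A search for the motifs discussed here (5'-CCCCACCC-3' and 5'-AGATAG-3') must consider both strands, since the GATA sites lie on the non-coding strand. The sketch below shows one way such a scan could be written; it is not the procedure used in this study, the function names are hypothetical, and the short test sequence is an invented toy example rather than the intron 8 sequence.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_motif(seq, motif):
    """Find a motif on both strands.

    Returns sorted (start, end, strand) tuples with 1-based inclusive
    coordinates on the supplied strand; "-" marks hits whose motif reads
    5'->3' on the opposite strand.
    """
    seq = seq.upper()
    motif = motif.upper()
    patterns = [(motif, "+")]
    rc = reverse_complement(motif)
    if rc != motif:  # avoid reporting palindromic motifs twice
        patterns.append((rc, "-"))
    hits = []
    for pattern, strand in patterns:
        start = seq.find(pattern)
        while start != -1:
            hits.append((start + 1, start + len(pattern), strand))
            start = seq.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


# Invented toy sequence for illustration; NOT the ALAS2 intron 8 sequence.
toy_seq = "GGCCCCACCCTTCTATCTAA"

print(find_motif(toy_seq, "CCCCACCC"))  # CACCC box on the given strand
print(find_motif(toy_seq, "AGATAG"))    # GATA motif on the opposite strand
```

Exact string matching is the simplest choice; a scan for degenerate consensus sites would need pattern matching that tolerates alternative bases.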

Localization of Human ALAS2 Intron 8 Enhancer Activity to a 239-bp Fragment-- To identify the sequences within the human ALAS2 intron 8 that confer erythroid cell-specific enhancer activity, constructs were generated with different lengths (403 bp to 115 bp) of intron 8 and encompassing different regions of the intron cloned in the native orientation in the plasmid pAp-LUC (Fig. 4). The locations of the putative GATA and CACCC box-binding sites in each intron 8 construct are shown in Fig. 4A. These constructs and pAp-I8(460)-LUC containing 460 bp of the human ALAS2 intron 8 were transiently transfected into K562 cells and their activities expressed relative to pAp-LUC. As seen in Fig. 4B, the constructs pAp-I8(460)-LUC, pAp-I8(403)-LUC, and pAp-I8(239)-LUC increased activity of the human ALAS2 promoter 12.0-, 11.5-, and 12.9-fold, respectively, in K562 cells, indicating that enhancer activity is located within 239 bp of intron 8 sequence. In comparison, constructs pAp-I8(279)-LUC and pAp-I8(115)-LUC increased the levels of ALAS2 promoter transcription only 3.2- and 2.3-fold, respectively, whereas constructs pAp-I8(129)-LUC and pAp-I8(176)-LUC containing the non-conserved GATA sites C and D did not affect transcription of the human ALAS2 promoter in erythroid cells (Fig. 4B). These results established that the erythroid cell-specific enhancer activity of the 239-bp fragment is significantly reduced when this fragment is divided into regions of 129 and 115 bp, suggesting that there may be a cooperative interaction between sites located within these two regions.


Fig. 4.   Deletion analysis of the ALAS2 intron 8 enhancer. A, sequence and restriction map of the 460-bp ALAS2 intron 8 enhancer region. Sequences in intron 8 with homology to the consensus GATA and CACCC box-binding sites are boxed. The GATA sites on the non-coding strand are underlined by arrows and are designated GATA-A, GATA-B, GATA-C, and GATA-D. The CACCC sites are designated CAC-A and CAC-B. B, expression of ALAS2 intron 8 deletion constructs in transiently transfected K562 cells. The sizes of the intron 8 fragments are shown. The locations of GATA and CACCC box-binding sites in these fragments are indicated; GATA sites are boxed and shaded black, and CACCC box-binding sites are shown as lightly shaded boxes. Luciferase activities were standardized relative to β-galactosidase activity (RSV-β-galactosidase) as an internal control for variation in transfection efficiency. The normalized luciferase activities are expressed relative to pAp-LUC, assigned a value of 1.0. The data are averages obtained from constructs tested in quadruplicate in at least three experiments and are represented as the mean ± S.D.

Characterization of Control Elements in the Human ALAS2 Intron 8-- The 239-bp enhancer region located within the human ALAS2 intron 8 contains the conserved sites described earlier, namely CACCC site B, GATA sites A and B on the non-coding strand, and the partially conserved CACCC site A (see Fig. 4). The functional contributions of these sites were investigated by mutating the sites individually or in various combinations in the plasmid pAp-I8(239)-LUC. Expression was analyzed in K562 and MEL cells, and luciferase activities of the mutant constructs were expressed relative to pAp-I8(239)-LUC, which was assigned a value of 100 (Fig. 5A). Mutagenesis of CACCC site A (pAp-I8mut1-LUC) substantially reduced expression of the enhancer in K562 and MEL cells to 37 and 47%, respectively, and mutagenesis of CACCC site B (pAp-I8mut2-LUC) reduced expression to 43 and 58%, respectively, in these cells (Fig. 5A). Inactivation of both CACCC sites (pAp-I8mut5-LUC) further reduced enhancer activity to 21 and 35%, respectively. Interestingly, the conserved GATA site A in intron 8 did not contribute to enhancer activity since mutagenesis of this site (pAp-I8mut3-LUC) marginally increased enhancer activity to 119 and 122% in K562 and MEL cells, respectively (Fig. 5A). The lack of a functional contribution from this GATA site was confirmed by a double mutation of GATA site A with CACCC site A (pAp-I8mut6-LUC), which expressed at 49% of the wild-type level in K562 cells (Fig. 5A). In marked contrast, the conserved GATA site B was important for enhancer activity since mutagenesis of this site (pAp-I8mut4-LUC) reduced expression to 36 and 45% in K562 and MEL cells, respectively (Fig. 5A). In addition, mutagenesis of both CACCC site A and GATA site B (pAp-I8mut7-LUC) severely reduced enhancer activity to 7 and 21%, respectively, in K562 and MEL cells (Fig. 5A).
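One way to read the double-mutant values is to compare each with the product of the corresponding single-mutant values, which is what would be expected if the two sites contributed independently. The sketch below applies this comparison to the percentages reported above; the independence model and the function name are assumptions made for illustration and are not an analysis performed in this study.

```python
# Single-site mutant activities (% of pAp-I8(239)-LUC), as reported in the text.
single = {
    "K562": {"CAC-A": 37, "CAC-B": 43, "GATA-A": 119, "GATA-B": 36},
    "MEL": {"CAC-A": 47, "CAC-B": 58, "GATA-A": 122, "GATA-B": 45},
}

# Double-site mutant activities (% of wild type), as reported in the text.
double_observed = {
    "K562": {
        ("CAC-A", "CAC-B"): 21,
        ("CAC-A", "GATA-A"): 49,
        ("CAC-A", "GATA-B"): 7,
    },
    "MEL": {
        ("CAC-A", "CAC-B"): 35,
        ("CAC-A", "GATA-B"): 21,
    },
}


def expected_if_independent(cell, sites):
    """Product of single-mutant activities, in % of wild type.

    Assumes (hypothetically) that each site scales activity independently.
    """
    product = 1.0
    for site in sites:
        product *= single[cell][site] / 100.0
    return 100.0 * product


for cell, doubles in double_observed.items():
    for sites, observed in doubles.items():
        expected = expected_if_independent(cell, sites)
        label = " + ".join(sites)
        print(f"{cell} {label}: observed {observed}%, "
              f"independent expectation {expected:.1f}%")
```

In K562 cells, for example, the independent expectation is about 16% for the CACCC site A plus site B mutant (observed 21%) and about 13% for the CACCC site A plus GATA site B mutant (observed 7%). Whether such differences are meaningful cannot be judged without the S.D. values shown in Fig. 5A.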


Fig. 5.   Effect of mutating the conserved GATA- and CACCC box-binding sites in ALAS2 intron 8 on enhancer activity. A, CACCC sites A and B (CAC-A and CAC-B, respectively) and the GATA sites A and B (GATA-A and GATA-B, respectively) were each mutated to a PvuII site, represented by a cross, in the 239-bp ALAS2 intron 8 sequence and ligated upstream of the 293-bp ALAS2 promoter. These constructs were cotransfected with a β-galactosidase expression construct (RSV-β-galactosidase) and transiently expressed in K562 and MEL cells. The normalized luciferase activities of the mutant constructs are expressed relative to pAp-I8(239)-LUC which was set at 100%. The data are averages obtained from constructs tested in quadruplicate in at least three experiments and are represented as the mean ± S.D. ND (not determined) corresponds to those constructs not tested in a particular cell line. B, ALAS2 intron 8 constructs containing the mutated CACCC site A and CACCC site B (ptk-I8mut1-LUC) or the mutated CACCC site A and GATA site B (ptk-I8mut2-LUC) in the 239-bp fragment of intron 8 were ligated upstream of the thymidine kinase promoter. These constructs and ptk-I8(460)-LUC were cotransfected with RSV-β-galactosidase and transiently expressed in K562 and MEL cells as described above. The normalized luciferase activities of the mutant constructs are expressed relative to ptk-I8(460)-LUC (set at 100%).

The effect of these mutations on conferring enhancer activity to the heterologous thymidine kinase promoter was also examined. As seen in Fig. 5B, mutagenesis of CACCC site A in combination with CACCC site B (ptk-I8mut1-LUC) reduced enhancer activity to 62 and 34% in K562 and MEL cells, respectively, whereas inactivation of both CACCC site A and GATA site B (ptk-I8mut2-LUC) almost abolished enhancer activity in these cells.

GATA-1 Protein Binds to GATA Sites A and B in the Human ALAS2 Intron 8 Enhancer-- The binding of nuclear proteins to GATA sites A and B (GATA-A and GATA-B probes, respectively) was investigated by gel shift assays with nuclear extracts from either K562, MEL, or COS-1 cells and also from COS-1 cells transfected with the murine GATA-1 cDNA expression vector, pXM/GF-1. A major retarded complex was obtained with the GATA-B probe (Fig. 6A) using nuclear extracts from K562 (lane 8) and MEL (lane 10) cells, and a complex of the same mobility was detected with the GATA-A probe (lanes 1 and 3) although the intensity was reduced. This complex corresponded in mobility to that detected with a β-globin GATA-1 consensus sequence (26) (data not shown). This retarded protein complex was also observed with nuclear extracts from COS-1 cells expressing recombinant murine GATA-1 (lanes 6 and 13) but was not detected with nuclear extracts from mock-transfected COS-1 cells (lanes 5 and 12). The protein complex detected using nuclear extracts from K562, MEL, or COS-1 cells expressing recombinant GATA-1 was confirmed immunologically as GATA-1 since it was substantially supershifted with the GATA-1 monoclonal antibody, N-6 (28) (Fig. 6A).


Fig. 6.   Gel shift analysis of GATA sites A and B. A, radiolabeled double-stranded oligonucleotides containing the GATA site A (GATA-A probe) and GATA site B (GATA-B probe) were incubated with nuclear extracts from K562 (lanes 1, 2, 8, and 9), MEL (lanes 3, 4, 10, and 11), and COS-1 cells (lanes 5 and 12) and COS-1 cells expressing recombinant GATA-1 (lanes 6, 7, 13, and 14). For supershift assays, the GATA-1 monoclonal antibody, N-6, was added to nuclear extracts from K562 (lanes 2 and 9) and MEL (lanes 4 and 11) cells and COS-1 cells expressing recombinant GATA-1 (lanes 7 and 14) prior to the addition of probe. The retarded complex corresponding to GATA-1 in the absence of antibody (arrow) and the supershifted complex (arrowhead) are indicated. B, radiolabeled GATA-A and GATA-B probes were incubated with nuclear extracts from COS-1 cells expressing recombinant GATA-1 (lanes 1-6 and 7-12, respectively). The retarded complex (arrow) was competed with a 10- and 50-fold molar excess of the GATA-A (lanes 2 and 3) or GATA-B (lanes 8 and 9) oligonucleotide, respectively, in self-competition, or of GATA-cons (lanes 4, 5, 10, and 11), and with a 100-fold molar excess of a nonspecific (NS) competitor (lanes 6 and 12).

Competition experiments with nuclear extracts from COS-1 cells expressing recombinant GATA-1 protein indicated that the GATA sites A and B bind GATA-1 with similar affinities (Fig. 6B). The binding of GATA-1 to either the GATA-A (lane 1) or GATA-B (lane 7) probe was effectively and specifically inhibited with a 50-fold molar excess of the GATA-A (lane 3) or GATA-B (lane 9) oligonucleotide, respectively, in self-competition or with a 50-fold molar excess of the β-globin GATA-1 consensus sequence (lanes 5 and 11). A nonspecific competitor at a 100-fold molar excess was ineffective in these assays (lanes 6 and 12).

To extend these studies, the DNA binding affinity of GATA-1 for GATA sites A and B was compared in gel shift assays using a purified GST-GATA-1(f) fusion protein (29). An increasing concentration of GST-GATA-1(f) was incubated with a constant amount of each probe and the extent of DNA binding determined. A concentration of 2 µM protein was required to give 50% DNA binding with both probes (data not shown). These results demonstrate that both GATA site A and GATA site B can bind GATA-1 in vitro with a similar binding affinity, but only GATA site B appears to be important in enhancer function as determined by mutational analysis.
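A half-maximal binding concentration of this kind is often read as an apparent dissociation constant. The sketch below assumes a simple 1:1 binding equilibrium with protein in large excess over probe, so that the fraction of probe bound is [P] / (Kd + [P]); this model, the function name, and the concentrations chosen for the loop are illustrative assumptions, and the true fraction of active protein in the preparation is not known.

```python
def fraction_bound(protein_um, kd_um):
    """Fraction of probe bound for simple 1:1 binding, protein in excess.

    Both arguments are concentrations in micromolar.
    """
    if protein_um < 0:
        raise ValueError("protein concentration cannot be negative")
    if kd_um <= 0:
        raise ValueError("dissociation constant must be positive")
    return protein_um / (kd_um + protein_um)


# Apparent Kd taken from the half-maximal binding point reported in the text.
APPARENT_KD_UM = 2.0

# Illustrative titration points (hypothetical, not the concentrations used).
for concentration in (0.5, 1.0, 2.0, 4.0, 8.0):
    bound = fraction_bound(concentration, APPARENT_KD_UM)
    print(f"{concentration:4.1f} uM protein -> {100 * bound:5.1f}% probe bound")
```

Under this model, two probes that reach 50% binding at the same protein concentration have the same apparent affinity, which is the comparison drawn for GATA sites A and B.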

Sp1 Binds to the Functional CACCC Sites in the Human ALAS2 Intron 8 Enhancer-- Gel shift assays were performed with radiolabeled probes containing CACCC site A and CACCC site B (CAC-A and CAC-B probes, respectively) to investigate nuclear protein binding to these sites. The CACCC site from the murine adult β-globin promoter which binds EKLF, BKLF and Sp1 in vitro (34) and an Sp1 consensus sequence were included as control probes. A major slow migrating complex was detected with both the CAC-A probe (Fig. 7A, lanes 1 and 3) and CAC-B probe (lanes 8 and 10) using nuclear extracts from K562 and MEL cells. A retarded complex of similar mobility was detected with COS-1 cell nuclear extracts but with reduced intensity (lanes 5 and 12); the intensity of this complex was greatly increased with nuclear extracts from COS-1 cells transfected with an Sp1 cDNA expression clone (lanes 7 and 14). This complex detected in both erythroid and COS-1 cell nuclear extracts contained Sp1 and/or Sp1-related proteins since it was supershifted with an antibody to Sp1 (Fig. 7A). The complex corresponding to Sp1 was also detected with both the β-globin CACCC box and the Sp1-cons probes (data not shown).


Fig. 7.   Gel shift analysis of CACCC sites A and B. A, radiolabeled double-stranded oligonucleotides containing CACCC site A (CAC-A probe) and CACCC site B (CAC-B probe) were incubated with nuclear extracts from K562 (lanes 1, 2, 8, and 9), MEL (lanes 3, 4, 10, and 11) and COS-1 (lanes 5, 6, 12, and 13) cells and COS-1 cells expressing recombinant Sp1 (lanes 7 and 14). For supershift assays, an anti-Sp1 antibody (lanes 2, 4, 6, 9, 11, and 13) was added prior to the addition of the probe. The retarded complexes corresponding to Sp1 (arrow) and an additional complex (asterisk) detected with the CAC-A probe and MEL cell nuclear extracts are indicated. B, radiolabeled CAC-A and CAC-B probes were incubated with nuclear extracts from COS-1 cells expressing recombinant Sp1 (lanes 1-6 and 7-12). The major retarded complex (arrow) was competed with a 10- and 50-fold molar excess of the CAC-A (lanes 2 and 3) and CAC-B (lanes 8 and 9) oligonucleotides in self-competition, a 10-fold molar excess of the β-globin CACCC (lanes 4 and 10) and Sp1-cons (lanes 5 and 11) oligonucleotides, and a 100-fold molar excess of a nonspecific (NS) competitor (lanes 6 and 12).

The binding of BKLF to the β-globin CACCC box probe has been observed using MEL cell nuclear extracts, and formation of this complex was inhibited by an antibody specific to BKLF (21, 34). However, a similarly migrating complex detected with the CAC-A probe and MEL cell nuclear extracts (Fig. 7A, lane 3) was not affected by an anti-BKLF antibody (data not shown), and no corresponding complex could be detected with the CAC-B probe (Fig. 7A, lane 10). Similarly, the binding of EKLF to CACCC sites A and B could not be detected (data not shown) although binding of EKLF to the β-globin CACCC box control probe was observed using nuclear extracts from COS-1 cells expressing recombinant EKLF (21, 34).

Competition experiments using nuclear extracts from COS-1 cells expressing recombinant Sp1 and the CAC-A and CAC-B probes (Fig. 7B) showed that the binding of the nuclear protein complex corresponding to Sp1 to these sites was substantially and specifically inhibited with a 10-fold molar excess of either the β-globin CACCC box oligonucleotide (lanes 4 and 10) or the Sp1-cons oligonucleotide (lanes 5 and 11). However, in self-competition, a 50-fold molar excess of either the CAC-A (lane 3) or CAC-B (lane 9) oligonucleotide resulted in only a partial inhibition. Similar competition experiments performed with the β-globin CACCC box and the Sp1-cons probes demonstrated that while a 10-fold molar excess of the β-globin CACCC box or Sp1-cons oligonucleotide, respectively, was effective in self-competition, a 200-fold molar excess of the CAC-A oligonucleotide or the CAC-B oligonucleotide was required for similar inhibition (data not shown). Overall, these experiments show that CACCC sites A and B only weakly bind Sp1 or Sp1-related proteins but do not detectably bind EKLF or BKLF, unlike the β-globin CACCC site which binds Sp1, BKLF, and EKLF in vitro (34).

Effect of Exogenous GATA-1, Sp1, FOG, and EKLF on Human ALAS2 Intron 8 Activity-- As described previously, the ALAS2 promoter construct (pAp-LUC) is active in nonerythroid COS-1 cells, presumably through the interaction of Sp1 with the -54 CACCC box (21), but the inclusion of intron 8 (pAp-I8(460)-LUC) did not further stimulate this promoter activity. We investigated the effect of exogenously expressed GATA-1 in these cells. As seen in Fig. 8 and as described previously (21), exogenous GATA-1 stimulated expression of pAp-LUC 4.0-fold, but no further increase was detected with the inclusion of intron 8 sequence (pAp-I8(460)-LUC) even though this intron contains functional GATA-1-binding sites. In contrast, the human ALAS2 intron 1 in the plasmid pAp-I1(4.9kb)-LUC was transactivated 9.5-fold by exogenously expressed GATA-1 in COS-1 cells (Fig. 8), but the specific sites within intron 1 through which GATA-1 functioned have not yet been investigated.


Fig. 8.   Effect of exogenous GATA-1 and Sp1 on the activity of intron 8. The constructs pAp-LUC, pAp-I8(460)-LUC, pAp-I1(4.9kb)-LUC, and pGL2-Basic were cotransfected with the murine GATA-1 cDNA expression clone, pXM/GF-1, alone or together with an Sp1 cDNA expression clone in COS-1 cells, and luciferase activities were determined. The data are averages obtained from constructs tested in quadruplicate in at least four experiments and represented as the mean ± S.D. pGL2-Basic was included as a control and assigned a value of 1.0.

Since the levels of Sp1 in COS-1 cells appeared to be lower than in K562 cells (see Fig. 7A, compare lanes 1 and 5), we investigated whether the ALAS2 intron 8 enhancer would respond when both GATA-1 and Sp1 were expressed exogenously in COS-1 cells. GATA-1 and Sp1 together resulted in a 7.1-fold increase in promoter activity, but no additional increase in transcription was observed with the inclusion of intron 8 (Fig. 8). Similar cotransfection experiments were performed with exogenously expressed GATA-1 and its recently isolated cofactor, FOG (30), together with Sp1. However, the expression of exogenous FOG in fact reduced promoter activity of pAp-LUC, and reporter gene activity again was not altered by the inclusion of intron 8 (pAp-I8(239)-LUC) (data not shown).

We have previously reported (21) that exogenously expressed EKLF in K562 cells increases expression of the ALAS2 promoter (pAp-LUC) through the -54 CACCC site by approximately 4.0-fold. A similar level of stimulation was seen with the construct pAp-I8(460)-LUC containing the ALAS2 intron 8 enhancer (data not shown), indicating that EKLF cannot function through CACCC sites located within this intron, a finding that is in agreement with the inability to detect the in vitro binding of EKLF to either CACCC site A or B.

    DISCUSSION

We report here that introns 1, 3, and 8 in the human ALAS2 gene play a role in regulating gene transcription in transfected cells. These introns were chosen for investigation because DNase I-hypersensitivity sites were identified in the corresponding introns in the mouse ALAS2 gene (17), the structural organization of which is remarkably similar to that of the human (2, 16), and because sequence analysis of the human ALAS2 locus revealed possible regulatory elements located within these introns. Intron 3 (850 bp) appeared inhibitory, whereas intron 1 (4.9 kb) and intron 8 sequence (460 bp) each conferred a stimulatory effect on promoter activity. Although 17 potential GATA sites and 6 CACCC box-binding sites are present in intron 1 and we have demonstrated that its activity can be increased by exogenous GATA-1 in nonerythroid cells, we have not pursued this large intron further in this study. Instead, due to the significant increase in promoter activity conferred by intron 8 and the small size of this intron, we have investigated the role of putative transcription factor binding sites located within this intron.

The enhancer activity of intron 8 from the human ALAS2 gene was erythroid cell-specific, but unlike classic enhancers this activity was orientation-dependent. A 25-fold level of stimulation was observed in K562 cells when the enhancer was located upstream of the heterologous tk promoter and oriented in the same direction as the promoter but only 4-fold in the reverse orientation. Other enhancers have been reported where the extent of functional activity is dependent upon the orientation of the enhancer in the DNA construct (37-38).

A comparison of intron 8 sequences from the human, mouse, and canine ALAS2 genes revealed a remarkable degree of sequence conservation over 460 bp, although the mouse intron contains additional 5' sequence that corresponds to a rodent-specific repetitive element but does not contribute to enhancer activity.² Indeed, the level of sequence conservation over this 460-bp intronic region exceeds that of exonic sequences of some genes (39), a fact that in itself implies an important role in the regulation of ALAS2 gene expression.

The erythroid-specific enhancer region in the human ALAS2 intron 8 was subsequently localized to a 239-bp region. GATA-1 and CACCC box proteins play key regulatory roles in determining the expression of erythroid-specific genes with sites for these proteins located in both promoter (40-44) and enhancer regions (45-48). In this study, we have focused on the possible contribution of these sites to intron 8 enhancer activity. Sequence analysis of the 239-bp enhancer region identified two putative GATA-1-binding sites, conserved in both sequence and location in the murine and canine introns, and two possible CACCC boxes, one of which (site B) was also completely conserved in sequence and location in the murine and canine introns. The second CACCC box (site A) contained a single C to T nucleotide mismatch in the other two species. Mutational analysis demonstrated that both CACCC boxes were functional, but of the two conserved GATA sites, only the more distal site (site B) contributed to enhancer activity. Inactivation of the second conserved GATA site (site A) marginally increased enhancer activity, suggestive of an inhibitory role (49). The lack of a functional contribution from the conserved GATA site A is of interest since this site bound GATA-1 protein in vitro with an affinity similar to that of GATA site B. The conserved GATA sites are likely to be located on opposite faces of the DNA helix, and it is possible that for GATA-1 functionality, a stereospecific alignment of the GATA site with other nearby sites, such as CACCC box-binding sites, may be critical (41, 50-51).
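The suggestion that the two conserved GATA sites lie on opposite faces of the helix can be checked roughly from the coordinates given in Results (site A at nucleotides 344-349, site B at 380-385). The sketch below assumes ideal B-form DNA with a helical repeat of 10.5 bp per turn; that repeat value, the use of site midpoints, and the function names are assumptions made for illustration, not parameters measured in this study.

```python
BP_PER_TURN = 10.5  # assumed average helical repeat of B-form DNA in solution


def site_center(start, end):
    """Midpoint of a site given 1-based inclusive coordinates."""
    return (start + end) / 2.0


def angular_offset(center_a, center_b, bp_per_turn=BP_PER_TURN):
    """Rotational offset in degrees (0-360) between two positions on the helix."""
    turns = abs(center_b - center_a) / bp_per_turn
    return (turns % 1.0) * 360.0


# Coordinates of the conserved GATA sites in human intron 8 (from the text).
gata_a = site_center(344, 349)
gata_b = site_center(380, 385)

spacing = gata_b - gata_a
offset = angular_offset(gata_a, gata_b)
print(f"center-to-center spacing: {spacing:.0f} bp")
print(f"helical turns: {spacing / BP_PER_TURN:.2f}")
print(f"rotational offset: {offset:.0f} degrees (180 = opposite faces)")
```

With these assumptions the 36-bp spacing corresponds to about 3.4 helical turns, an offset of roughly 154°, which is compatible with the two sites facing approximately opposite sides of the helix. The same function can be applied to the CACCC site coordinates to compare their phasing with each GATA site.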

Several proteins are known to bind to CACCC box sequences in gel shift assays, notably the following members of the Krüppel family of transcription factors: Sp1 (52), EKLF (27), and BKLF (34). An EKLF-responsive CACCC box has been identified in the mouse adult β-globin gene promoter (53-57) although all three proteins can bind to this sequence in gel shift assays (34). A similar situation exists for the -54 CACCC sequence in the human ALAS2 promoter which responds to EKLF in transactivation experiments but binds Sp1, EKLF, and BKLF in vitro (21). In the present study, gel shift assays were performed to identify proteins that bind to CACCC sites A and B in the human ALAS2 intron 8. By using erythroid cell nuclear extracts and extracts with exogenously expressed transcription factors, the binding of Sp1 or an immunologically related protein was detected at both sites but not the binding of EKLF or BKLF. Thus CACCC sites A and B in intron 8 do not mimic the -54 CACCC sequence in the human ALAS2 promoter (21) or the CACCC box in the mouse adult β-globin promoter (34).

To identify further proteins that bind to the CACCC sites A and B in the human ALAS2 intron 8, transactivation experiments were performed in COS-1 cells with exogenously expressed transcription factors. However, while exogenous GATA-1 transactivated the human ALAS2 promoter construct (21) and exogenous Sp1 further increased this expression, the inclusion of intron 8 in this construct did not alter reporter gene expression under these conditions. Similarly, intron 8 failed to be activated when FOG, a recently identified cooperative partner of GATA-1 (30), was also cotransfected together with GATA-1 and Sp1. Likewise, exogenously expressed EKLF in K562 cells also failed to activate the enhancer, although in these experiments, the human ALAS2 promoter was stimulated by EKLF through the -54 CACCC sequence (21). The data overall demonstrate that intron 8 is not responsive to EKLF in erythroid cells or to Sp1 and FOG together with GATA-1 in the nonerythroid environment of COS-1 cells. This suggests that there is either an erythroid-specific factor immunologically related to Sp1 that binds to CACCC sites A and B which is required to cooperate with GATA-1 or that another, as yet unidentified, erythroid transcription factor binds elsewhere in intron 8 and is involved in the activation process.

There is evidence that GATA-1 bound to the 3'-enhancer of the chicken β-globin gene interacts with another GATA-1 molecule bound at the non-canonical TATA box in the promoter, imparting erythroid cell-specific enhancer activity to transcription initiation (35). Since the human ALAS2 gene also contains a non-canonical TATA box (13, 21), the possibility existed for a similar interaction between the intron 8 enhancer and the non-canonical TATA box in the ALAS2 promoter. However, conversion of the non-canonical TATA box to a consensus TATA box that was no longer able to bind GATA-1 in vitro (21) did not alter the enhancer activity of intron 8 in erythroid cells. This result is in keeping with our observations that this intron can also increase transcription of a heterologous promoter and that no TATA/GATA-like sequence is found in the mouse ALAS2 promoter (17).

Numerous enhancers have been identified which regulate or maintain expression of erythroid cell-specific genes. In addition to the well known locus control region of the β-globin gene cluster (58-59), tissue-specific enhancer regions have been reported in the 5'- (46-47, 60-65) and 3'-flanking (45, 66-67) regions of various erythroid cell expressed genes. These enhancers have been shown to bind transcription factors including GATA-1, CACCC box-binding proteins, and NF-E2 (45-48, 61). An important question concerns the mechanism by which enhancers or the locus control region regulate gene expression. It has been widely accepted that enhancers function to increase the rate of initiation of gene transcription (68-69). However, there is now substantial evidence that the locus control region and enhancers may act by preventing the formation of repressive structures that silence genes, with little effect on the rate of initiation of gene transcription (70-73). Future studies are planned with stably transformed erythroid cell lines and transgenic mice to evaluate intron 8 enhancer activity in a chromatin environment and to address the issue of whether this region increases the rate or probability of gene transcription.

    ACKNOWLEDGEMENTS

We are extremely grateful to the following people: Dr. Merlin Crossley (polyclonal EKLF and BKLF antibodies and the GATA-1 monoclonal antibody, N-6, and the GST-GATA-1(f) expression construct); Dr. S. H. Orkin (pXM/GF-1 and pMT2/FOG); Dr. M. Frances Shannon (Sp1 antibody); Dr. Jim Bieker (pSG5/EKLF); and Dr. G. Bergholz (MEL (F4-12B2) cells). We sincerely thank Chris Matthews for advice in the preparation of textual figures and the purification of GST-GATA-1(f).

    FOOTNOTES

* The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number AF068624.

Recipient of a National Health and Medical Research Council of Australia CJ Martin Postdoctoral Training Fellowship and an AMRAD Post-Doctoral Award.

‖ To whom correspondence should be addressed. Tel.: 61-8-8303-3139; Fax: 61-8-8303-4348; E-mail: bmay{at}biochem.adelaide.edu.au.

1 The abbreviations used are: ALAS, 5-aminolevulinate synthase; EKLF, erythroid Krüppel-like factor; BKLF, basic Krüppel-like factor; bp, base pair(s); kb, kilobase(s); MEL, murine erythroleukemia; RSV, Rous sarcoma virus; GST, glutathione S-transferase; LUC, luciferase; tk, thymidine kinase; FOG, Friend of GATA-1.

2 K. H. Surinya, T. C. Cox, and B. K. May, unpublished data.

    REFERENCES

  1. Dierks, P. (1990) in Biosynthesis of Heme and Chlorophylls (Dailey, H. A., ed), pp. 201-233, McGraw-Hill Inc., New York
  2. May, B. K., Dogra, S. C., Sadlon, T. J., Bhasker, C. R., Cox, T. C., and Bottomley, S. S. (1995) Prog. Nucleic Acid Res. Mol. Biol. 51, 1-51[Medline] [Order article via Infotrieve]
  3. Bottomley, S. S., May, B. K., Cox, T. C., Cotter, P. D., and Bishop, D. F. (1995) J. Bioenerg. Biomembr. 27, 161-168[Medline] [Order article via Infotrieve]
  4. Kappas, A., Sassa, S., Galbraith, R. A., and Nordmann, Y. (1995) in Metabolic and Molecular Basis of Inherited Disease (Scriver, C. R., Beaudet, A. L., Sly, W. S., and Valle, D., eds), 7th Ed., pp. 2103-2159, McGraw-Hill Inc., New York
  5. Sutherland, G. R., Baker, E., Callen, D. F., Hyland, V. J., May, B. K., Bawden, M. J., Healy, H. M., and Borthwick, I. A. (1988) Am. J. Hum. Genet. 43, 331-335
  6. Cox, T. C., Bawden, M. J., Abraham, N. G., Bottomley, S. S., May, B. K., Baker, E., Chen, L. Z., and Sutherland, G. R. (1990) Am. J. Hum. Genet. 46, 107-111
  7. Bishop, D. F., Henderson, A. S., and Astrin, K. H. (1990) Genomics 7, 207-214
  8. Chretien, S., Dubart, A., Beaupain, D., Raich, N., Grandchamp, B., Rosa, J., Goossens, M., and Romeo, P.-H. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 6-10
  9. Kaya, A. H., Plewinska, M., Wong, D. M., Desnick, R. J., and Wetmur, J. G. (1994) Genomics 19, 242-248
  10. Taketani, S., Inazawa, J., Abe, T., Furukawa, T., Kohno, H., Tokunaga, R., Nishimura, K., and Inokuchi, H. (1995) Genomics 29, 698-703
  11. Beaumont, C., Deybach, J. C., Grandchamp, B., Silva, V. D., de Verneuil, H., and Nordmann, Y. (1984) Exp. Cell Res. 154, 474-484
  12. Karlsson, S., and Nienhuis, A. W. (1985) Annu. Rev. Biochem. 54, 1071-1108
  13. Cox, T. C., Bawden, M. J., Martin, A., and May, B. K. (1991) EMBO J. 10, 1891-1902
  14. Bhasker, C. R., Burgiel, G., Neupert, B., Emery-Goodman, A., Kuhn, L. C., and May, B. K. (1993) J. Biol. Chem. 268, 12699-12705
  15. Melefors, O., Goossen, B., Johansson, H. E., Stripecke, R., Gray, N. K., and Hentze, M. W. (1993) J. Biol. Chem. 268, 5974-5978
  16. Conboy, J. G., Cox, T. C., Bottomley, S. S., Bawden, M. J., and May, B. K. (1992) J. Biol. Chem. 267, 18753-18758
  17. Schoenhaut, D. S., and Curtis, P. J. (1989) Nucleic Acids Res. 17, 7013-7028
  18. Lim, K.-C., Ishihara, H., Riddle, R. D., Yang, Z., Andrews, N., Yamamoto, M., and Engel, J. D. (1994) Nucleic Acids Res. 22, 1226-1233
  19. Elgin, S. C. R. (1988) J. Biol. Chem. 263, 19259-19262
  20. Gross, D. S., and Garrard, W. T. (1988) Annu. Rev. Biochem. 57, 159-197
  21. Surinya, K. H., Cox, T. C., and May, B. K. (1997) J. Biol. Chem. 272, 26585-26594
  22. Merika, M., and Orkin, S. H. (1993) Mol. Cell. Biol. 13, 3999-4010
  23. Ko, L. J., and Engel, J. D. (1993) Mol. Cell. Biol. 13, 4011-4022
  24. Luckow, B., and Schütz, G. (1987) Nucleic Acids Res. 15, 5490
  25. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) in Molecular Cloning: A Laboratory Manual (Ford, N., Nolan, C., and Ferguson, M., eds), 2nd Ed., pp. 1.42-1.46, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
  26. Wall, L., deBoer, E., and Grosveld, F. (1988) Genes Dev. 2, 1089-1100
  27. Miller, L. J., and Bieker, J. J. (1993) Mol. Cell. Biol. 13, 2776-2786
  28. Ito, E., Toki, T., Ishihara, H., Ohtani, H., Gu, L., Yokoyama, M., Engel, J. D., and Yamamoto, M. (1993) Nature 362, 466-468
  29. Crossley, M., Merika, M., and Orkin, S. H. (1995) Mol. Cell. Biol. 15, 2448-2456
  30. Tsang, A. P., Visvader, J. E., Turner, C. A., Fujiwara, Y., Yu, C., Weiss, M. J., Crossley, M., and Orkin, S. H. (1997) Cell 90, 109-119
  31. Cox, T. C., Kozman, H. M., Raskind, W. H., May, B. K., and Mulley, J. C. (1992) Hum. Mol. Genet. 1, 639-641
  32. Bird, A. P. (1986) Nature 321, 209-213
  33. Andrews, N. C., Erdjument-Bromage, H., Davidson, M. B., Tempst, P., and Orkin, S. H. (1993) Nature 362, 722-728
  34. Crossley, M., Whitelaw, E., Perkins, A., Williams, G., Fujiwara, Y., and Orkin, S. H. (1996) Mol. Cell. Biol. 16, 1695-1705
  35. Fong, T. C., and Emerson, B. M. (1992) Genes Dev. 6, 521-532
  36. Boyer, G., Nonneman, D., Shibuya, H., Stoy, S. J., O'Brien, D., and Johnson, G. S. (1995) Anim. Genet. 3, 206-207
  37. Fujiwara, J., Kimura, T., Ayusawa, D., and Oishi, M. (1994) J. Biol. Chem. 269, 18558-18562
  38. Madhusudhan, K. T., Naik, S. S., and Patel, M. S. (1995) Biochemistry 34, 1288-1294
  39. Hardison, R. C., Oeltjen, J., and Miller, W. (1997) Genome Res. 7, 959-966
  40. deBoer, E., Antoniou, M., Mignotte, V., Wall, L., and Grosveld, F. (1988) EMBO J. 7, 4203-4212
  41. Mignotte, V., Eleouet, J.-F., Raich, N., and Romeo, P.-H. (1989) Proc. Natl. Acad. Sci. U. S. A. 86, 6548-6552
  42. Zon, L. I., Youssoufian, H., Mather, C., Lodish, H. F., and Orkin, S. H. (1991) Proc. Natl. Acad. Sci. U. S. A. 88, 10638-10641
  43. Rahuel, C., Vinit, M.-A., Lemarchandel, V., Cartron, J.-P., and Romeo, P.-H. (1992) EMBO J. 11, 4095-4102
  44. Max-Audit, I., Eleouet, J.-F., and Romeo, P.-H. (1993) J. Biol. Chem. 268, 5431-5437
  45. O'Prey, J., Ramsey, S., Chambers, I., and Harrison, P. R. (1993) Mol. Cell. Biol. 13, 6290-6303
  46. Porcher, C., Picat, C., Daegelen, D., Beaumont, C., and Grandchamp, B. (1995) J. Biol. Chem. 270, 17368-17374
  47. Lacronique, V., Lopez, S., Miquerol, L., Porteu, A., Kahn, A., and Raymondjean, M. (1995) J. Biol. Chem. 270, 14989-14997
  48. Orkin, S. H. (1995) Eur. J. Biochem. 231, 271-281
  49. Raich, N., Clegg, C. H., Grofti, J., Romeo, P.-H., and Stamatoyannopoulos, G. (1995) EMBO J. 14, 801-809
  50. Merika, M., and Orkin, S. H. (1995) Mol. Cell. Biol. 15, 2437-2447
  51. Gregory, R. C., Taxman, D. J., Seshasayee, D., Kensinger, M. H., Bieker, J. J., and Wojchowski, D. M. (1996) Blood 87, 1793-1801
  52. Kadonaga, J. T., Carner, K. R., Masiarz, F. R., and Tjian, R. (1987) Cell 51, 1079-1090
  53. Bieker, J. J., and Southwood, C. M. (1995) Mol. Cell. Biol. 15, 852-860
  54. Donze, D., Townes, T. M., and Bieker, J. J. (1995) J. Biol. Chem. 270, 1955-1959
  55. Nuez, B., Michalovich, D., Bygrave, A., Ploemacher, R., and Grosveld, F. (1995) Nature 375, 316-318
  56. Perkins, A. C., Sharpe, A. H., and Orkin, S. H. (1995) Nature 375, 318-322
  57. Wijgerde, M., Gribnau, J., Trimborn, T., Nuez, B., Philipsen, S., Grosveld, F., and Fraser, P. (1996) Genes Dev. 10, 2894-2902
  58. Tuan, D., Solomon, W., Li, Q., and London, I. M. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 6384-6388
  59. Forrester, W. C., Thompson, C., Elder, J. T., and Groudine, M. (1986) Proc. Natl. Acad. Sci. U. S. A. 83, 1359-1363
  60. Higgs, D. R., Wood, W. G., Jarman, A. P., Sharpe, J., Lida, J., Pretorius, I. M., and Ayyaub, H. (1990) Genes Dev. 4, 1588-1601
  61. Nemoto, Y., Terajima, M., Shoji, W., and Obinata, M. (1996) J. Biol. Chem. 271, 13542-13548
  62. Youssoufian, H. (1994) Blood 83, 1428-1435
  63. Nicolis, S., Bertini, C., Ronchi, A., Crotta, S., Lanfranco, L., Moroni, E., Giglioni, B., and Ottolenghi, S. (1991) Nucleic Acids Res. 19, 5285-5291
  64. Onodera, K., Takahashi, S., Nishimura, S., Ohta, J., Motohashi, H., Yomogida, K., Hayashi, N., Engel, J. D., and Yamamoto, M. (1997) Proc. Natl. Acad. Sci. U. S. A. 94, 4487-4492
  65. McDevitt, M. A., Fujiwara, Y., Shivdasani, R. A., and Orkin, S. H. (1997) Proc. Natl. Acad. Sci. U. S. A. 94, 7976-7981
  66. Trudel, M., and Costantini, F. (1987) Genes Dev. 1, 954-961
  67. Choi, O.-R., and Engel, J. D. (1986) Nature 323, 731-734
  68. Treisman, R., and Maniatis, T. (1985) Nature 315, 73-75
  69. Weber, F., and Schaffner, W. (1985) Nature 315, 75-77
  70. Walters, M. C., Fiering, S., Eidemiller, J., Magis, W., Groudine, M., and Martin, D. I. K. (1995) Proc. Natl. Acad. Sci. U. S. A. 92, 7125-7129
  71. Walters, M. C., Magis, W., Fiering, S., Eidemiller, J., Scalzo, D., Groudine, M., and Martin, D. I. K. (1996) Genes Dev. 10, 185-195
  72. Martin, D. I. K., Fiering, S., and Groudine, M. (1996) Curr. Opin. Genet. & Dev. 6, 488-495
  73. Sutherland, H. G. E., Martin, D. I. K., and Whitelaw, E. (1997) Mol. Cell. Biol. 17, 1607-1614


Copyright © 1998 by The American Society for Biochemistry and Molecular Biology, Inc.