©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
Analysis of the Chicken GPAT/AIRC Bidirectional Promoter for de Novo Purine Nucleotide Synthesis (*)

(Received for publication, September 7, 1994; and in revised form, October 20, 1994)

Anthony Gavalas (§) Howard Zalkin (¶)

From the Department of Biochemistry, Purdue University, West Lafayette, Indiana 47907


ABSTRACT

GPAT and AIRC encode two enzymes that catalyze steps 1 and 6 plus 7, respectively, of the de novo purine biosynthetic pathway. The chicken genes are closely linked and divergently transcribed from a 230-base pair intergenic region. The promoter was scanned by deletion mutagenesis in a bireporter vector that allowed assay of transcriptional activity in both directions in transfected HepG2 and chicken LMH cells. Three classes of deletions were obtained: those affecting bidirectional transcription, those predominantly affecting GPAT transcription, and those predominantly affecting AIRC transcription. Defects in bidirectional transcription resulted from removal of an initiator-like element overlapping the AIRC transcription start site, as well as deletions removing a series of GC and CCAAT boxes from the AIRC proximal half of the promoter and a CCAAT-containing segment from the GPAT side. Several regions in the GPAT proximal half of the promoter, including an octamer-like motif downstream from the transcription start site, were required predominantly for GPAT expression. Evidence for interaction of HeLa nuclear proteins with some of these sites was obtained by gel retardation, DNase I, and methylation interference assays. Overall, the results showed that the intergenic region is an integrated bidirectional promoter and that a novel initiator-like element plays a central role in coordinating expression of the divergently transcribed AIRC and GPAT genes.


INTRODUCTION

De novo biosynthesis of purine nucleotides proceeds by a 14-step branched pathway via IMP. GPAT-encoded glutamine 5`-phosphoribosylpyrophosphate amidotransferase catalyzes the first committed step of the pathway, and 5`-phosphoribosylaminoimidazole carboxylase/5`-phosphoribosyl 4-(N-succinocarboxamide)-5-aminoimidazole synthetase, encoded by AIRC, catalyzes steps 6 and 7. The approximate chromosomal locations of the seven human genes required for AMP synthesis were deduced by complementing Chinese hamster ovary mutants deficient in AMP synthesis and by subsequent cytogenetic analysis of Chinese hamster ovary-human somatic cell hybrids. GPAT and AIRC were thus mapped to overlapping regions of chromosome 4, whereas other genes of the pathway were localized on different chromosomes (Barton et al., 1991). More recently, the human GPAT-AIRC locus has been mapped by in situ fluorescence hybridization to the q12 region of chromosome 4 (Brayton et al., 1994).

In order to set the groundwork for investigations of gene expression and regulation of this pathway in vertebrates, we recently cloned and characterized the chicken and human GPAT genes and the proximal AIRC genes (Brayton et al., 1994; Gavalas et al., 1993). This work established that GPAT and AIRC are closely linked and divergently transcribed from intergenic regions of approximately 230 and 625 bp (^1) in chickens and humans, respectively. Intron/exon boundaries are strictly conserved, as is the approximate size of the GPAT gene. On the other hand, human AIRC is approximately 2-fold larger than the corresponding chicken gene. The two promoters have also diverged significantly, although the close linkage of the two genes has been retained along with a high GC content and the presence of several Sp1 boxes. Both promoters lack TATA elements. Although the functional consequences of tight linkage between GPAT and AIRC are not known, promoters with the capacity to direct bidirectional transcription may provide one mechanism for the co-regulation of functionally related genes. As such, this arrangement may constitute the eukaryotic equivalent of a prokaryotic operon. This structural unit was named a dioskourion (from the Greek Dioskouri, the mythological inseparable twin sons of Zeus) (Gavalas et al., 1993). Bidirectional promoters may also be useful as a genetic engineering tool, directing expression of two genes in predetermined relative amounts and/or in a tissue-specific manner.

Previous experiments (Gavalas et al., 1993) have shown that chicken GPAT-AIRC promoter strength was about 10-fold higher in the AIRC direction compared with the GPAT direction using a bireporter promoter vector in transfected HepG2 cells. In addition, the intergenic region was dissected to yield "half-promoters" having about 30% function in the GPAT direction and 80% function in the AIRC direction. In this earlier work, a bidirectional promoter was defined operationally as a short segment of DNA that initiates bidirectional transcription in vivo. The question remains, however, whether common cis-elements and assembled transcription factors are used for transcription in both directions, as in an "authentic" bidirectional promoter, or whether expression in the two directions employs distinct cis-elements. The latter case could result from juxtaposition of separate promoters. In order to identify cis-acting sites and to distinguish between the two types of promoter function, deletion mutants were constructed and tested for transcriptional activity by transfection of a bireporter vector carrying the LUC and CAT genes in divergent orientations. Two hepatoma cell lines, human HepG2 and chicken LMH, were used to evaluate the effect of these deletions. cis-Elements were found for bidirectional expression in both cell lines. One of these cis-elements, central for the expression of both genes, is a novel initiator (Inr)-like element, situated around the AIRC transcription start site. Mobility shift, DNase I, and methylation interference assays limited this element to 41 bp, showed that nuclear protein binding results in hypersensitivity to DNase I at the AIRC transcription start site, and identified nucleotides involved in specific DNA/protein contacts. cis-Elements also exist that are largely side-specific, most notable of which is an octamer-like motif found downstream from the GPAT transcription start site.


MATERIALS AND METHODS

Construction of Plasmids and Promoter Mutations

The plasmid pSKP-R01 (Gavalas et al., 1993) was used to introduce SmaI sites by site-directed mutagenesis (Kunkel, 1985) in the positions depicted in Fig. 1 and Fig. 7, resulting in plasmids pSma-0 to pSma-9. In this nomenclature, the suffix (0-9) indicates the position of the SmaI site. SmaI/BamHI digestion of these plasmids generates two fragments. The large fragment represents the vector carrying a portion of the promoter, and the small one represents the rest of the promoter. By combining small and large parts of different plasmids, the mutations shown in Fig. 1 were generated. The positions of the SmaI sites are used to designate regions deleted in a plasmid. Plasmid pSma-2.7i was constructed by introducing SmaI sites at positions 2 and 7, digesting the resulting double mutant with SmaI, religating, and screening by restriction mapping for plasmids carrying the region 2-7 in the inverted orientation with respect to the starting plasmid. All of the mutations were sequenced by the dideoxy sequencing method (Sanger et al., 1977) using single-stranded phagemid DNA and a primer that annealed to the part of the CAT gene present in these plasmids.


Figure 1: Map of promoter deletions. A, the plasmid pSK-PRO1 was used for initial construction before subcloning the mutated promoters into the bireporter plasmid pLUC/CAT-3. The middle box between HindIII and SalI represents the promoter. The hatched parts correspond to the 5`-untranslated regions of the two genes that were incorporated into the promoter. The LUC and CAT boxes represent 5` parts of these genes that are incorporated into this plasmid. The line between the promoter and the reporter boxes represents short polylinker sequences. B, schematic representation of the promoter. Consensus sites for cis-elements and an octamer-like sequence are shown. The lowercase letter indicates a base deviating from the consensus. Arrows denote the transcription start sites, and arrowheads indicate the positions of SmaI sites that were introduced by site-directed mutagenesis. The exact sequence of this region and the exact position of the SmaI sites are shown in Fig. 7. WT, wild type. C, schematic representation of the promoter mutations. The deletions are noted by the interrupted line and the number of deleted bp is written in the gap. Mutant 2.7i bears no deletion; instead, the region between points 2 and 7 is inverted. Constructs are named according to the region deleted.




Figure 7: Nucleotide sequence of the AIRC/GPAT bidirectional promoter. Flanking sites for HindIII at the 5` end and SalI at the 3` end that are used for subcloning are not shown. Arrowheads indicate the positions of SmaI sites used for construction of deletions. Potential GC and CCAAT elements are boxed. Large bent arrows and filled squares represent the transcription start sites as determined from transient transfection experiments and endogenous mRNA, respectively. The lowercase sequence at positions 50-90 contains the AIRC Inr-like element, the upward vertical arrow shows the position of the DNase I-hypersensitive site on the bottom (noncoding) strand, and asterisks mark the interfering nucleotides on the top strand (above the sequence) and the bottom strand (below the sequence). Half-arrows above the Inr-like element indicate the presence of an imperfect palindrome with a 3-bp spacer. A region similar to the adenovirus major late promoter Inr is noted at positions 43-55. Direct (underlined) repeats close to the GPAT transcriptional start sites are noted by Roman numerals. An octamer-like motif is shown with lowercase letters at positions 320-327, and its DNase I footprint is shown by solid lines above (top strand) and below (bottom strand) the sequence.



The resulting mutagenized promoters were subcloned as HindIII/SalI restriction fragments into the bireporter plasmid pLUC/CAT-3. This plasmid is identical to the plasmid pLUC/CAT-1 that was used earlier (Gavalas et al., 1993) with the exception that the contiguous BamHI and SacI sites at the 3` end of the chloramphenicol acetyltransferase (CAT) reporter have been replaced by a ClaI restriction site. The resulting plasmids were checked by restriction mapping and were purified through two cesium chloride gradient ultracentrifugations in preparation for the transient transfection assays.

Transient Transfections and Reporter Assays

HepG2 cells were grown in Eagle's minimal essential medium supplemented with 10% fetal bovine serum at 37 °C with 5% CO2 to 60-80% confluency in 35-mm plates. LMH cells were grown in Waymouth medium supplemented with 10% fetal bovine serum in 5% CO2 at 37 °C. HepG2 cells were transfected by the lipofection method (Felgner et al., 1987) or by the calcium phosphate (Ausubel et al., 1987) method, whereas LMH cells were transfected only by the calcium phosphate procedure. Transfected cells were incubated with 10% serum-supplemented medium for 24-48 h. After removal from the plate, two-thirds of the cells were used to prepare extract for the CAT assay (Seed and Sheen, 1988), and one-third was used for preparation of extract for beta-galactosidase and luciferase (LUC) assays. LUC and beta-galactosidase activities were determined by chemiluminescent assays as described by the suppliers of the reagents (Promega and Tropix). The light emission was measured using a Monolight 2010 instrument (Analytical Luminescence Laboratory) as relative light units. Protein concentration was determined by the Bradford assay (Bradford, 1976), and CAT and LUC specific activities were normalized for beta-galactosidase specific activity, derived from transfection with an RSV-lacZ plasmid (Darn et al., 1987). Plasmid pLUC/CAT-3 was used to obtain the background values of the reporter assays.
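The normalization described above can be illustrated with a minimal Python sketch. It assumes that specific activity is the assay signal divided by the protein assayed, and that the background from the promoterless pLUC/CAT-3 vector is subtracted after normalization to beta-galactosidase; the text does not state how the background was applied, so this order is an assumption. All names (`PlateReadings`, `corrected_activity`) and the numbers in the usage note are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PlateReadings:
    """Raw readings from one transfected plate (field names are illustrative)."""
    reporter_signal: float      # CAT or LUC reading, arbitrary units
    reporter_protein_ug: float  # protein in the aliquot assayed for the reporter
    bgal_signal: float          # beta-galactosidase reading, relative light units
    bgal_protein_ug: float      # protein in the aliquot assayed for beta-galactosidase


def specific_activity(signal: float, protein_ug: float) -> float:
    """Assay signal per microgram of extract protein."""
    if protein_ug <= 0:
        raise ValueError("protein amount must be positive")
    return signal / protein_ug


def bgal_normalized(plate: PlateReadings) -> float:
    """Reporter specific activity per unit of beta-galactosidase specific activity."""
    reporter = specific_activity(plate.reporter_signal, plate.reporter_protein_ug)
    bgal = specific_activity(plate.bgal_signal, plate.bgal_protein_ug)
    if bgal <= 0:
        raise ValueError("beta-galactosidase activity must be positive")
    return reporter / bgal


def corrected_activity(sample: PlateReadings, promoterless: PlateReadings) -> float:
    """Subtract the promoterless-vector background (assumed order of operations)."""
    return max(bgal_normalized(sample) - bgal_normalized(promoterless), 0.0)
```

With hypothetical readings, `corrected_activity(PlateReadings(1200.0, 10.0, 400.0, 5.0), PlateReadings(80.0, 10.0, 400.0, 5.0))` gives the background-corrected, transfection-normalized value for one plate. The result is floored at zero so that a sample at or below background is not reported as negative.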

Gel Retardation and Competition Assays for Protein-DNA Binding

Probes containing promoter regions were isolated from plasmids pSma-2 and pSma-7 using the appropriate enzymes. Digestion of pSma-2 with HindIII/XmaI (XmaI is a SmaI isoschizomer) and XmaI/AvaI gave the probes that encompass regions between SmaI sites 0-2 and 2-4, designated 0.2 and 2.4, respectively. The AvaI site is located at position 188 in Fig. 7. Digestion of pSma-7 with AvaI/XmaI and XmaI/SalI gave the probes encompassing regions 5.7 and 7.9, respectively. After the first cut, the end was dephosphorylated and labeled using T4 polynucleotide kinase and [gamma-32P]ATP. Then digestion with the second restriction enzyme released the 32P-labeled fragment of interest, which was isolated by polyacrylamide gel electrophoresis and electroelution. Probe 79 (see Fig. 3) was prepared by the polymerase chain reaction using end-labeled primers.


Figure 3: Analysis of protein binding to the AIRC Inr region by gel retardation. A, protein-DNA complexes with promoter probe fragment shown in C. The probe was isolated from plasmid pSma-2 by digestion with HindIII and XmaI, sites that flank the sequence shown, and was end-labeled with [gamma-32P]ATP. Arrows identify two specific protein-DNA complexes. These complexes were competed by Sp1 oligonucleotide and by segments of the promoter proximal to the AIRC transcription start site defined in C. Plasmid pBluescript polylinker (169 bp) was the nonspecific competitor. All competitors were used at a 100-fold molar excess. Noncompeted unmarked bands may be nonspecific. B, protein-DNA complexes with probe PCR #79 (see C). This probe has 5` and 3` ends corresponding to probes PCR #7 and PCR #9, respectively. An arrow marks the position of a protein-DNA complex competed by the unlabeled probe. The molar excess concentration of the competitor is given. The nonspecific competitor (100-fold molar excess) is the same as in A. C, the sequence of the promoter around the AIRC transcription start site is shown. The arrow and the filled square represent the transcription start site determined by transient transfections and endogenous mRNA, respectively. Arrowheads mark the positions where SmaI sites were introduced and, therefore, the end points of deletion mutations. The large box indicates the minimal sequence around the transcription start site that competes for the low mobility specific complex, and the small box indicates the Sp1 site. The box with the dotted line represents the homology with the adenovirus major late promoter Inr (see also Fig. 7). The open bars represent DNA fragments used as competitors. PCR #1 is the same as the sequence shown.



Typical protein-DNA binding reactions of 20 µl contained 5 µg of HeLa nuclear extract (Promega), 1 µg of poly(dI·dC)-poly(dI·dC) (Boehringer Mannheim), 25 mM HEPES, pH 7.6, 50 mM KCl, 0.1 mM EDTA, 5 mM MgCl2, 10% glycerol, 1 mM dithiothreitol, and 0.1% Nonidet P-40. Incubation was for 25 min at room temperature. Subsequently, 20-25 fmol of labeled probe were added, and the reaction was incubated for an additional 25 min at room temperature. Protein-DNA complexes were resolved by electrophoresis on a 4% native polyacrylamide gel. Competitor DNAs were prepared either by restriction digestion or PCR and were preincubated, before the addition of the labeled probe, in the binding reaction. Commercially available duplex oligonucleotides (Promega) were also used as competitors: Sp1, 5`-ATTCGATCGGGGCGGGGCGAGC-3`; TFIID, 5`-GCAGAGCATATAAGGTGAGGTAGGA-3`; OCT1, 5`-TGTCGAATGCAAATCACTAGAA-3`; CTF/NF1, 5`-CCTTTGGCATGCTGCCAATATG-3`.
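To give a sense of scale for the competition experiments, the following Python sketch converts a molar excess over the labeled probe into a mass of duplex competitor DNA. It is an illustration only: the average mass of 650 g/mol per base pair is a common approximation and is not a value taken from this work, and the function name is hypothetical.

```python
# Assumed average molar mass of one base pair of duplex DNA (approximation).
AVG_BP_MASS_G_PER_MOL = 650.0


def competitor_mass_ng(probe_fmol: float, fold_excess: float, competitor_bp: int) -> float:
    """Mass (ng) of duplex competitor needed for a given molar excess over the probe."""
    if probe_fmol <= 0 or fold_excess <= 0 or competitor_bp <= 0:
        raise ValueError("all inputs must be positive")
    competitor_mol = probe_fmol * fold_excess * 1e-15           # fmol -> mol
    grams = competitor_mol * competitor_bp * AVG_BP_MASS_G_PER_MOL
    return grams * 1e9                                          # g -> ng
```

Under these assumptions, a 100-fold molar excess of the 169-bp pBluescript polylinker over 25 fmol of probe corresponds to about 275 ng of DNA, whereas the same molar excess of a 22-bp oligonucleotide is about 36 ng. Molar excess, not mass, is therefore the appropriate basis for comparing competitors of different length.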

Polymerase Chain Reaction

Due to a high GC content, PCR through the GPAT/AIRC promoter region tends to give mostly nonspecific amplified fragments and/or a very low yield of the correct fragment. In order to optimize these reactions, we used the Stratagene Opti-Prime PCR optimization kit. Buffers 6 and 10 (10 mM Tris·HCl, pH 8.8 and 9.2, respectively, 1.5 mM MgCl2, and 75 mM KCl) were found to give only specific bands for the PCR fragments depicted in Fig. 3.

DNase I and Methylation Interference Assays

For the GPAT side of the promoter, DNase I footprints were carried out using 50 or 100 µg of HeLa nuclear extract and scaling up the binding reaction 5 and 10 times, respectively, with the exception of the probe, which stayed the same. Probes were derived from pSma-7 by XmaI/SalI digestion and were labeled on either strand. For the free DNA control, bovine serum albumin replaced HeLa extract. For the AIRC Inr footprint, the binding reaction was scaled up 5-fold. Probes were derived from pSma-2 by HindIII/XmaI digestion and were labeled on either strand. After incubation, a reaction containing 100-125 fmol of probe was digested with 3 units of DNase I for 1 min, and the reactions were electrophoresed on a 4% polyacrylamide preparative gel. The gel was exposed to film overnight at 4 °C; then the shifted band was cut out, and the DNA was eluted on DEAE ion exchange membrane (Schleicher and Schuell). Samples were run on a 10% polyacrylamide sequencing gel, and the gel was exposed at -80 °C with an intensifier screen for up to 4 days.

For the AIRC Inr methylation interference assays, the probes used above were methylated (Ausubel et al., 1987). A 5-fold scaled-up binding reaction was performed with HeLa extract and was electrophoresed on a 4% acrylamide preparative gel. The gel was exposed at 4 °C, and then the bands representing the complex and the free probe were recovered. DNA was eluted and cleaved with piperidine to yield the G ladder (Ausubel et al., 1987). Samples were run on a 10% polyacrylamide sequencing gel, and the gel was exposed at -80 °C for up to 4 days.


RESULTS

Construction and in Vivo Expression of Mutant Promoters

Deletion mutations were designed in such a way that their end points would encompass possible regulatory elements that were inferred by visual inspection of the promoter sequence. The presence of precise transcriptional start sites for both genes suggests the existence of elements that are able to direct the basal transcriptional apparatus to the correct place of transcription initiation. Deletions were constructed by introducing SmaI sites into the promoter positions shown in Fig. 1 (A and B; and in more detail in Fig. 7). By combining various SmaI/BamHI fragments, the deletions shown in Fig. 1C were obtained. Use of the bireporter vector facilitated analysis of the effect that such deletions have on both sides of the promoter. Transient transfection assays in human and avian cells revealed the presence of elements that positively affect transcription on both sides. Results for assays of LUC and CAT reporters to monitor transcription of AIRC and GPAT, respectively, are given in Fig. 2. The mutations fall into three classes: those that remove elements required for bidirectional transcription and those that affect transcription predominantly in one direction, either GPAT or AIRC. These deletions and the further analysis of cis-elements are examined in the sections that follow.


Figure 2: Transcriptional activity of mutant promoters. The activities of the mutant promoters in HepG2 and LMH cells are expressed as the percentage of the wild type and are represented by solid bars. Values are the average of at least five independent transfections with standard deviations shown as error bars.
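The summary values plotted in Fig. 2 can be sketched in Python as follows. This assumes that each mutant value is paired with a wild-type value from the same transfection experiment and that the sample standard deviation is reported; neither detail is stated in the legend, and the function names and the numbers in the usage note are hypothetical.

```python
from statistics import mean, stdev
from typing import List, Tuple


def percent_of_wild_type(mutant: List[float], wild_type: List[float]) -> List[float]:
    """Express each normalized mutant activity as a percentage of its paired wild-type value."""
    if len(mutant) != len(wild_type):
        raise ValueError("mutant and wild-type lists must be paired")
    if any(w <= 0 for w in wild_type):
        raise ValueError("wild-type activities must be positive")
    return [100.0 * m / w for m, w in zip(mutant, wild_type)]


def summarize(percentages: List[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) across independent transfections."""
    if len(percentages) < 2:
        raise ValueError("at least two transfections are needed for a standard deviation")
    return mean(percentages), stdev(percentages)
```

For example, `summarize(percent_of_wild_type([2.0, 3.0, 4.0, 3.0, 3.0], [10.0, 10.0, 10.0, 10.0, 10.0]))` returns the bar height and error bar for one hypothetical mutant assayed in five transfections.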



A Novel Element Around the AIRC Transcription Start Site Required for Bidirectional Transcription

A site in region 1.2 close to the position for AIRC transcription initiation is one of the important cis-elements for bidirectional promoter function. Expression from the AIRC and GPAT directions was reduced to less than 10 and 30% of wild type level, respectively, in HepG2 cells in promoter deletion 1.2 (Fig. 2). The defect in bidirectional expression of this promoter deletion was comparable in LMH cells. To further define the functional site(s) and to search for proteins that interact with the segment of the promoter that flanks the AIRC transcription start site, electrophoretic mobility shift assays were carried out using HeLa cell nuclear extract and a DNA probe containing the 0-2 portion of the promoter. The nucleotide sequence of the DNA probe is given in Fig. 3C. Two specific protein-DNA complexes were detected and are marked by arrows in Fig. 3A. These two protein-DNA complexes were competed by specific subregions of the promoter. Competition with an Sp1 oligonucleotide eliminated the slower migrating complex but not the faster migrating protein-DNA complex. Competition with promoter DNA of decreasing size narrowed down the region responsible for the formation of the faster mobility complex to 41 bp spanning the transcription start site (boxed in Fig. 3C). Shorter oligonucleotides covering positions 30-56, 35-65, 45-71, 61-87, and 76-92 (see Fig. 3 and Fig. 7) and a TATA-binding protein oligonucleotide were not able to compete for formation of this complex (data not shown). A specific protein complex was formed with a 41-bp labeled probe (PCR #79) corresponding to this region (Fig. 3B). However, the increased amount of nonspecific binding implies that specificity determinants may have been left out of this 41-bp promoter fragment. In order to further characterize the DNA sequence that mediates the formation of the protein-DNA complex close to the transcription start site, we performed DNase I and methylation interference assays. The DNA region shown in Fig. 3C was isolated from pSma-2 as a HindIII/XmaI fragment and used as a probe. A DNase I footprint was not obtained. Instead, protein binding resulted in a DNase I-hypersensitive site on the noncoding strand just in front of the AIRC transcription start site (Fig. 4B). No such effect was observed on the other strand (data not shown). Methylation interference assays implicated essential amino acid contacts with several guanine residues about one turn of the helix upstream of the DNase I-hypersensitive site relative to the AIRC transcription start (Fig. 4A). Methylation of 5 guanines on the top strand and 2 on the bottom strand interfered with complex formation.


Figure 4: Methylation interference and DNase I assays for the AIRC Inr region. A, methylation interference assay. The promoter fragment shown in Fig. 3 was methylated. The piperidine cleavage reactions of the free (F), nonbound (N), and bound (B) DNA are shown. Interfering methylated nucleotides are noted with an asterisk. The position of G residues in the top strand around the binding site is noted with a line. B, DNase I cleavage of the bottom AIRC noncoding strand. Binding of nuclear extract resulted in the hypersensitive site marked by the arrow.



By virtue of its position around the AIRC transcription start site and its capacity to activate transcription at distinct positions in the absence of a TATA box, we infer that the site described above may represent an Inr-like element that affects transcription bidirectionally. This element does not have sequence similarity with known Inr elements (Azizkhan et al., 1993; Weis and Reinberg, 1992). Interestingly, an overlapping stretch of 13 nucleotides (see Fig. 7) has similarity with the adenovirus major late promoter Inr (Weis and Reinberg, 1992). However, this element does not appear to contribute to protein binding in this region, because the competition experiments showed that most of it is dispensable for binding, and a 30-bp double-stranded oligonucleotide, nucleotides 35-65 (see Fig. 3 and Fig. 7), encompassing this region of homology did not compete either.

Bidirectional Elements between the Transcriptional Start Sites for GPAT and AIRC

Three of the internal deletions constructed between SmaI site 2 and site 7 resulted in decreased bidirectional promoter activity. Deletion 2.3 removes two GC boxes, deletion 3.4 removes two CCAAT boxes and an overlapping GC box, and deletion 5.6 removes a CCAAT box (see Fig. 1B and Fig. 7). These mutations reduced transcription in both directions to between 20 and 60% of the intact promoter in both cell lines. A larger effect was seen for the cognate side, but, overall, elements in these deletions were required for full expression on both sides. In each case, the defects were more prominent in HepG2 cells than in the LMH line. GC and CCAAT boxes between sites 2 and 4 and between sites 5 and 6 are thus implicated in bidirectional transcription.

HeLa nuclear extracts and a SmaI/AvaI restriction fragment from plasmid pSma-2 (nucleotides 90-188, see Fig. 7) were used to detect protein-DNA complexes in the 2.4 subsection of the promoter. A specific complex was detected that was readily competed with an Sp1 oligonucleotide (Fig. 5) but not by a CTF/NF1 oligonucleotide (Chodosh et al., 1988) (data not shown). This result provides evidence for the binding of Sp1 to one or more of the three Sp1 sites in the 2.4 promoter region. No other complexes were detected using this probe or a probe encompassing sequences downstream of the AvaI site up to point 7 on the GPAT side under the conditions used. Thus, specific protein complexes with CCAAT motifs were not detected.
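The specificity of the competitor oligonucleotides listed under "Materials and Methods" can be illustrated by a simple motif scan. The Python sketch below searches both strands of each oligonucleotide for a consensus core; the core strings used here (GGGCGG for the GC box, CCAAT, and ATGCAAAT for the octamer) are the commonly cited consensus cores and are an assumption of this example, as are the function and variable names.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Competitor oligonucleotides as given in the text (top strand, 5' to 3').
OLIGOS = {
    "Sp1": "ATTCGATCGGGGCGGGGCGAGC",
    "OCT1": "TGTCGAATGCAAATCACTAGAA",
    "CTF/NF1": "CCTTTGGCATGCTGCCAATATG",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif(seq: str, motif: str) -> List[Tuple[str, int]]:
    """List (strand, 0-based top-strand start) for every motif occurrence on either strand."""
    seq, motif = seq.upper(), motif.upper()
    patterns = [("+", motif)]
    rc = reverse_complement(motif)
    if rc != motif:  # avoid reporting palindromic motifs twice
        patterns.append(("-", rc))
    hits = []
    for strand, pattern in patterns:
        start = seq.find(pattern)
        while start != -1:
            hits.append((strand, start))
            start = seq.find(pattern, start + 1)  # allow overlapping matches
    return hits
```

Calling `find_motif(OLIGOS["Sp1"], "GGGCGG")` returns a hit, whereas `find_motif(OLIGOS["CTF/NF1"], "GGGCGG")` returns an empty list, which parallels the observation that the Sp1 oligonucleotide, but not the CTF/NF1 oligonucleotide, competed for the complex formed on the 2.4 probe. Exact string matching is a deliberate simplification; it does not model degenerate positions or binding affinity.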


Figure 5: Sp1 binding in the promoter. The SmaI/AvaI fragment containing 2.4 DNA from plasmid pSma-2 was used as a labeled probe, and an Sp1 oligonucleotide was used as competitor. The molar excess of the competitor is given above the lanes. The nonspecific competitor is the pBluescript polylinker used in 100-fold molar excess. The arrow points to the specific complex.



Deletion 4.5 also had a bidirectional effect, but, in contrast to the mutations described above, it resulted in increased transcription on both sides. This could result from removal of a repression element analogous to a structural control element in the dihydrofolate reductase (DHFR) promoter (Azizkhan et al., 1993) or could be because interactions required for bidirectional transcription are distance-dependent.

Sites Specific for GPAT Transcription

Sequences within SmaI positions 6-9, flanking the GPAT transcription initiation site, have a predominantly unidirectional effect on GPAT transcription. Deletion 6.7, which removes a GC box and disrupts three of six direct repeats (see Fig. 7), had no significant effect on AIRC transcription strength but reduced GPAT transcription to 50% or less of the wild type in LMH and HepG2 cells (Fig. 2). Deletion 7.8 removes the GPAT transcription start site and flanking sequences as well as three copies of the direct repeats (see Fig. 7). The effects of this mutation were similar to those of deletion 6.7. Expression from the GPAT side was reduced to less than 50% of the wild type, whereas AIRC transcription was less affected. The 25-bp region between SmaI sites 8 and 9 contains an octamer-like motif (see Fig. 1 and Fig. 7) that was deleted in combination with segment 7.8 in the 7.9 mutant. We compared transcriptional activity in mutants 7.8 and 7.9 to assess the importance of the octamer-like motif in expression. By this analysis the 7.9 deletion had somewhat different effects in HepG2 and LMH cells. In both cell lines the major effect was reduced transcription from the GPAT direction compared with the intact promoter. In HepG2 cells transcription from the GPAT side was reduced from about 45% of wild type in deletion 7.8 to less than 15% of wild type in deletion 7.9, consistent with an essential role for the octamer-like site. However, in LMH cells, deletion of the 8.9 DNA with the octamer-like motif increased GPAT expression from 21% of wild type in deletion 7.8 to 43% of wild type in deletion 7.9. The basis for the discordant results is not presently known.

In order to search for protein-DNA interactions in the promoter region between SmaI sites 6 and 9, gel mobility shift assays were carried out with two DNA probes. The first probe was isolated as an AvaI/SmaI fragment from pSma-7 (see Fig. 7, nucleotides 188-262). Specific binding was not detected in this region (data not shown), even though it appears to contain the cis-acting sites needed for GPAT transcription (Fig. 2). This may reflect weak interactions in this region of the promoter that need the presence of distal sequences for stabilization. The second probe was isolated from plasmid pSma-7 as an XmaI/SalI fragment (see Fig. 7, nucleotides 262-349). Two specific complexes were detected using this probe and HeLa nuclear extract (Fig. 6, A and B). Both were competed with an octamer-containing oligonucleotide but not by nonspecific DNA. The two bands may result either from two proteins binding on this site or from protein-protein interactions resulting in the second complex that migrates more slowly. A DNase I footprint was obtained between nucleotides 314-334 on the top strand and nucleotides 320-339 on the bottom strand (Fig. 6C). These positions encompass the octamer-like motif downstream of the GPAT transcription start site (Fig. 7).


Figure 6: Protein binding to the GPAT octamer-like motif. A, gel retardation assay and competition by octamer DNA. The DNA complex was formed with HeLa nuclear extract and an XmaI/SalI end-labeled probe from plasmid pSma-7. Arrows point to specific complexes. Molar excess of an octamer-containing oligonucleotide is shown. B, effect of nonspecific pBluescript polylinker competitor (100-fold excess). C, DNase I footprints of the octamer-like site. The protein-DNA complex was formed with HeLa nuclear extract and an AvaI/SalI probe labeled on either strand. Lanes are shown for 0 (-), 50 µg (+), and 100 µg (++) of protein. Nucleotide positions were determined from a dideoxy sequencing ladder run alongside (not shown). The boundaries for the protected regions are numbered according to the sequence in Fig. 7.



Sites for AIRC Transcription

Deletion 0.1, which removes a potential Sp1 site, has a predominantly unidirectional effect on transcription. The 0.1 promoter mutation showed decreased activity, about 40% of the wild type, only on the AIRC side in HepG2 cells (Fig. 2). An Sp1 oligonucleotide competed for the formation of the lower mobility complex detected with a 0.2 DNA probe (Fig. 3A). Thus, the GC box found downstream of the AIRC transcription start site was implicated in mediating transcriptional activation of the AIRC side of the promoter. It is not known why formation of the Sp1 complex was decreased in some experiments (Fig. 3A).

Half-promoters and Internal Inversion

Larger deletions and an internal inversion were constructed in order to determine whether two separate promoters can be derived from this single bidirectional promoter and to define the contribution of internal elements to the relative strength of the two sides. Deletion 0.4, which removes all of the AIRC proximal sites, abolished transcription not only from the AIRC side but also from the GPAT side (Fig. 2). The 5.9 deletion was qualitatively similar to that of 0.4, but in this case transcription from both directions was reduced to 20-30% of the wild type. This reinforces the notion that the most important cis-elements for bidirectional transcription are to be found on the AIRC side. Even though the AIRC half-promoter had residual function, its function was bidirectional. Thus, by this analysis, it was not possible to dissect the intergenic region into two unidirectional promoters.

Construct 2.7i is an inversion of the bidirectional promoter between points 2 and 7. With this inversion we expected to see increased transcription from the GPAT direction and a corresponding decrease from the AIRC side. This would allow an estimate of the relative contribution of Inr-like element(s) to transcription of both sides. However, transcription from both directions decreased relative to the wild type. This would appear to reflect a defined organization of sites with restricted interplay between elements that are at and that flank the sites for transcription initiation.


DISCUSSION

Earlier work established that the chicken GPAT and AIRC genes are tightly linked and divergently transcribed from an intergenic region of about 230 bp (Gavalas et al., 1993). Operationally, the intergenic region was referred to as a bidirectional promoter. The objective of this work was to identify cis-elements that are important for promoter function, identify potential initiator elements in the TATA-less promoter that direct transcription initiation from well defined points, and distinguish between two models for bidirectional transcription: a model in which bidirectional transcription is driven by two largely independent promoters arranged back-to-back and a model in which a single integrated bidirectional promoter drives expression of both genes. In the former case, transcription of each gene should be driven by its own set of cis-acting sites, whereas in the latter case shared cis-elements would be used for transcription from both sides. In order to address these questions, potential cis-elements were inferred in the promoter sequence, and the promoter was scanned by deletion mutagenesis. The function of the mutant promoters was then tested in a bireporter vector that allowed a simultaneous assay of the effect of each deletion in both directions. The results support a model in which several important cis-elements function bidirectionally, and, therefore, expression of these genes is tightly coupled through these elements. Fig. 7 gives an overview of the basal promoter sequence and the elements to be discussed. This sequence includes the intergenic region plus approximately 60 bp encoding the 5'-untranslated region of each mRNA.

An important element required for bidirectional transcription overlaps the AIRC transcription start site. The boundaries of this element within the 1.2 sequence shown in Fig. 7 were mapped by protein-DNA binding. By virtue of its requirement for transcription and a position overlapping the site for transcription initiation, we refer to this cis-element as Inr-like.

There are at least three classes of Inr elements in RNA polymerase II-transcribed promoters (Azizkhan et al., 1993; Weis and Reinberg, 1992). A 17-bp element around the transcription start site of the terminal deoxynucleotidyltransferase gene (Weis and Reinberg, 1992) directs transcription from a single nucleotide in vivo and in vitro and has a CTCANTCT consensus sequence, where the A represents the initiation site. Adenovirus major late and IVa2 promoter Inr elements (Smale and Baltimore, 1989) and the porphobilinogen deaminase gene Inr (Beaupain et al., 1990) belong to this class.

The minimal promoter elements required for expression of the dihydrofolate reductase (DHFR) gene are an Sp1 site and the DHFR Inr, which represents the second class of Inr elements. This Inr element is required for the hamster, mouse, and human DHFR genes (Azizkhan et al., 1993) and for the genes for hypoxanthine phosphoribosyltransferase, Ki-Ras, 3-phosphoglycerate kinase, osteonectin, and interferon regulatory factor 1 (Linton et al., 1989, and references therein).

The adeno-associated virus type 2 p5 promoter has a third class of initiator that can function by itself or can direct TATA- and Sp1- activated transcription (Seto et al., 1991). A similar element is found in the TATA-less promoter of the human DNA polymerase beta gene (Weis and Reinberg, 1992).

The AIRC Inr-like element has no sequence similarity with the types of Inr elements mentioned above. A unique feature of this element is its bidirectionality. The AIRC Inr-like element may mediate assembly of the basal transcription machinery on both sides, or it may do so only for the AIRC side while acting as a transcriptional activator for the distal side of the promoter. The experiments described here defined this element as a 41-bp region, which is unusually long compared with other types of Inr elements. Part of this element is an imperfect palindrome with a 3-bp spacer (Fig. 7). Further experiments will be needed to establish whether mutations in this region affect the selection of the transcription start site and whether this element is able to direct basal transcription in the absence of any upstream activator elements.

Apart from the AIRC Inr-like element, other sites were shown to be important for bidirectional transcription. AIRC proximal GC boxes in region 2.4 are required for transcription of AIRC and GPAT. It is likely but was not directly established that the two CCAAT boxes in fragment 3.4 contribute to the function of this region in bidirectional transcription. Direct evidence for the role of the two CCAAT boxes in fragment 3.4 and the one in fragment 5.6 would require more precise mutational disruption or detection of specific protein complexes with these sites. The bidirectionality of the GC and CCAAT elements in the GPAT/AIRC promoter is not surprising, because they function in both orientations upstream of the transcription start sites of their target genes (Wingender, 1990). Examples of activation of bidirectional transcription include a GC box in the center of the 130-bp intergenic region for the alpha1(IV) and alpha2(IV) collagen genes (Heikkila et al., 1993), a GC box in exon 1 of the alpha2(IV) gene (Heikkila et al., 1993), and GC boxes between the transcription start sites of DHFR and Rep-1 (Fujii et al., 1992).
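Because GC and CCAAT boxes can function in either orientation, a sequence search for such elements has to consider both strands. The following Python sketch illustrates one way to do this. It is not the procedure used in this work: the function names are hypothetical, the test sequence is invented for illustration and is not the chicken GPAT/AIRC promoter, and the GC box is assumed to be represented by the commonly cited GGGCGG core.

```python
# Minimal sketch: locate a motif on either strand of a DNA sequence.
# All names and the example sequence are hypothetical illustrations.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif_both_strands(seq: str, motif: str) -> list:
    """Return sorted (0-based start on the top strand, strand) hits.

    A '-' hit means the motif reads 5'->3' on the bottom strand.
    A motif equal to its own reverse complement is reported on both strands.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif.upper()), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


# Invented test sequence containing one box of each type in each orientation.
EXAMPLE = "TTCCAATGGGGCGGAACCGCCCTTATTGGAA"

ccaat_hits = find_motif_both_strands(EXAMPLE, "CCAAT")
gc_hits = find_motif_both_strands(EXAMPLE, "GGGCGG")
print("CCAAT boxes:", ccaat_hits)
print("GC boxes:", gc_hits)
```

A hit of this kind identifies only a candidate site; as noted above, a functional role would still have to be shown by mutation or by detection of a specific protein complex.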

Aside from sequences that activate GPAT-AIRC bidirectional transcription, several cis-elements influence only one promoter side. A GC box downstream of the AIRC transcription start site enhances expression of its cognate side. Sequences upstream and around the GPAT transcription start site in regions 6.7 and 7.8, respectively, function to activate transcription from this side. Downstream from the GPAT transcription start site an octamer-like motif, ATGTAAAT (differing by only 1 nucleotide from the consensus ATGCAAAT), was implicated in full expression of the GPAT side. The octamer motif can mediate activation of transcription by a subfamily of the POU transcription factors, the octamer-binding factors (Herr, 1992). Oct-1, the best characterized of its class, is broadly expressed and regulates transcription of small nuclear RNAs, histone H2B, and others via the ATGCAAAT motif. Other octamer-binding factors, such as Oct-2, Oct-4, and Oct-6, are expressed in a temporally and spatially restricted manner and are involved in developmental regulation (Schöler, 1991). The positioning of the element downstream of the GPAT transcription start site is peculiar to this promoter, because octamer motifs are generally found upstream of the transcription initiation site of their target gene.
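The single-nucleotide difference between the GPAT octamer-like motif and the octamer consensus can be verified by a position-by-position comparison of the two sequences quoted above. The Python sketch below is purely illustrative; the function name `mismatches` is hypothetical and is not part of the analysis reported here.

```python
# Minimal sketch: list the positions at which a motif departs from a consensus.


def mismatches(motif: str, consensus: str) -> list:
    """Return (1-based position, motif base, consensus base) for each mismatch."""
    if len(motif) != len(consensus):
        raise ValueError("sequences must have equal length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(motif.upper(), consensus.upper()), start=1)
        if a != b
    ]


# Sequences as quoted in the text.
GPAT_OCTAMER_LIKE = "ATGTAAAT"
OCTAMER_CONSENSUS = "ATGCAAAT"

diffs = mismatches(GPAT_OCTAMER_LIKE, OCTAMER_CONSENSUS)
print(diffs)  # [(4, 'T', 'C')]
```

The only departure from the consensus is a T in place of C at the fourth position of the motif.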

Deletion 4.5 in the middle of the promoter resulted in increased activity on both sides. In the absence of recognizable sequence motifs, this result may reflect a distance-dependent effect for cooperation of the two promoter sides or the presence of an uncharacterized repressor-like structural control element analogous to that of the DHFR/Rep-1 promoter (Azizkhan et al., 1993). Decreased bidirectional transcriptional activity in mutant 2.7i emphasizes the importance of correct alignment of activator sequences and Inr elements for maximal activity.

Earlier experiments (Gavalas et al., 1993) suggest that the GPAT and AIRC half-promoters retained 30 and 80% of the wild type activity for their cognate side, respectively. However, in these mutants, promoter context was altered with respect to upstream flanking sequences. These changes altered the vector substantially and made the comparison with the wild type less reliable than in the present work. Here, the sequence context was maintained in the half-promoter mutants; the vector was not altered. Therefore, we consider the results that indicate that the GPAT half-promoter has essentially no activity and that the AIRC half-promoter retains approximately 20-40% of the bidirectional activity to be a better approximation of function.

A number of genes in vertebrates are divergently transcribed from a bidirectional promoter element. These include the housekeeping genes surf-1 and surf-2 of the surfeit locus (Colombo et al., 1992), the murine and human alpha1(IV) and alpha2(IV) collagen genes (Shimada et al., 1989; Soininen et al., 1988), histone H2A and H2B genes (Hentschel and Birnstiel, 1981), the DHFR and Rep-1 genes (Linton et al., 1989; Schilling and Farnham, 1989), and the Wilms tumor locus (Huang et al., 1990). In other cases of bidirectional transcription, genes were not identified for both sides. These include the proliferating cell nuclear antigen gene promoter (Rizzo et al., 1990), an SV40-like monkey genomic locus (Saffer and Singer, 1984), the HTF9 CpG island (Lavia et al., 1987), the human histidyl-tRNA synthetase gene (Tsui et al., 1993), the c-myc oncogene promoter (Chang et al., 1991), and the VH441 immunoglobulin heavy chain promoter (Nguyen et al., 1991).

The AIRC/GPAT locus is the only case where two genes encoding enzymes of the same pathway are closely linked. This linkage may provide for co-regulation, but it is not a prerequisite for it, because five of seven human genes for AMP synthesis are found on different chromosomes. Given the properties of the AIRC Inr-like element, a number of models may explain how bidirectional transcription occurs from this promoter. Two basal transcription complexes (Buratowski, 1994) may assemble independently on each transcription start site. In this case, the protein(s) binding to the AIRC Inr would direct assembly of the transcription complex on the AIRC side and act as transcriptional activator(s) for the distal GPAT side. Alternatively, transcriptional complexes having two orientations may assemble on the AIRC Inr. This would fit with the palindromic nature of the AIRC Inr and the bidirectional activity of deletion 5.9. Another possibility is that the AIRC and GPAT transcription start sites are brought close together in space through looping of the central promoter region, so that the AIRC Inr is able to direct assembly of the basal transcription complexes on both sites. The data currently available do not distinguish among these possibilities.


FOOTNOTES

*
This research was supported by National Institutes of Health Grant GM46466. Oligonucleotides were synthesized by the Purdue Laboratory for Macromolecular Structure supported by National Institutes of Health Diabetes Research and Training Grant DK20524. This is Journal Paper 14158 from the Purdue University Agricultural Research Station. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number(s) L12533.

§
Present address: Laboratoire de Génétique Moleculaire des Eukaryotes, Institut de Chimie Biologique, Rue Humann 11, 67085 Strasbourg, France.

¶
To whom correspondence should be addressed: Dept. of Biochemistry, Purdue University, 1153 Biochemistry Bldg., West Lafayette, IN 47907. Tel.: 317-494-1618; Fax: 317-494-7897; E-mail: Zalkin@biochem.purdue.edu.

(^1)
The abbreviations used are: bp, base pair(s); Inr, initiator; CAT, chloramphenicol acetyltransferase; LUC, luciferase; PCR, polymerase chain reaction.


ACKNOWLEDGEMENTS

We thank P. Stockbine and David Williams (State University of New York at Stony Brook) for the generous gift of LMH cells and P. Guo and E. Scholz (Purdue University) for help with the cell cultures and transfections.


REFERENCES

  1. Ausubel, F. M., Brent, F., Kingston, R. E., Moore, D. D., Seidmann, J. G., Smith, J. A., and Struhl, K. (eds) (1987) Current Protocols in Molecular Biology, pp. 9.1.1-9.1.3 and 12.3.1-12.3.3, John Wiley & Sons, Inc., New York
  2. Azizkhan, J. C., Jensen, D. E., Pierce, A. J., and Wade, M. (1993) Crit. Rev. Eukaryotic Gene Expression 3, 229-254
  3. Barton, J. W., Hart, I. M., and Patterson, D. (1991) Genomics 9, 314-321
  4. Beaupain, D., Eleouet, J. F., and Romeo, P. H. (1990) Nucleic Acids Res. 18, 6509-6515
  5. Bradford, M. M. (1976) Anal. Biochem. 72, 248-254
  6. Brayton, K. A., Chen, Z., Zhou, G., Nagy, P. L., Gavalas, A., Trent, J. M., Deaven, L. L., Dixon, J. E., and Zalkin, H. (1994) J. Biol. Chem. 269, 5313-5321
  7. Buratowski, S. (1994) Cell 77, 1-3
  8. Chang, Y., Spicer, D. B., and Sonenshein, G. E. (1991) Oncogene 6, 1979-1982
  9. Chodosh, L. A., Baldwin, A. S., Carthew, R. W., and Sharp, P. A. (1988) Cell 53, 11-24
  10. Colombo, P., You, J., Garson, K., and Fried, M. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 6358-6362
  11. Dorn, A., Bollekens, J., Stawb, A., Benoist, C., and Mathis, D. (1987) Cell 50, 863-872
  12. Felgner, P. L., Gadek, T. R., Holm, M., Roman, R., Chan, H. W., Wenz, M., Northrop, J. P., Ringold, G. M., and Danielsen, M. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 7413-7417
  13. Fujii, H., Shinya, E., and Shimada, T. (1992) FEBS Lett. 314, 33-36
  14. Gavalas, A., Dixon, J. E., Brayton, K. A., and Zalkin, H. (1993) Mol. Cell. Biol. 13, 4784-4792
  15. Heikkila, P., Soininen, R., and Tryggvason, K. (1993) J. Biol. Chem. 268, 24677-24682
  16. Hentschel, C. C., and Birnstiel, M. L. (1981) Cell 25, 301-313
  17. Herr, W. (1992) in Transcriptional Regulation (McKnight, S., and Yamamoto, K., eds) pp. 1103-1135, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY
  18. Huang, A., Campbell, C. E., Bonetta, L., Hill, M. S., McNeill, S., Coppes, M. J., Law, D. J., Feinberg, A. P., Yeger, H., and Williams, B. R. G. (1990) Science 250, 991-994
  19. Kunkel, T. A. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 488-492
  20. Lavia, P., MacLeod, D., and Bird, A. (1987) EMBO J. 6, 2773-2779
  21. Linton, J. P., Yen, J. J., Selby, E., Chen, Z., Chinsky, J. M., Liu, K., Kellems, R. E., and Crouse, G. F. (1989) Mol. Cell. Biol. 9, 3058-3072
  22. Nguyen, Q. T., Doyen, N., d'Andon, M. F., and Rougeon, F. (1991) Nucleic Acids Res. 19, 5339-5344
  23. Rizzo, M. G., Ottavio, L., Travali, S., Chang, C., Kaminska, B., and Baserga, R. (1990) Exp. Cell Res. 188, 286-293
  24. Saffer, J. D., and Singer, M. F. (1984) Nucleic Acids Res. 12, 4769-4788
  25. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
  26. Schilling, L. J., and Farnham, P. J. (1989) Mol. Cell. Biol. 9, 4568-4570
  27. Schöler, H. R. (1991) Trends Genet. 7, 323-329
  28. Seed, B., and Sheen, J.-Y. (1988) Gene (Amst.) 67, 271-277
  29. Seto, E., Shi, Y., and Shenk, T. (1991) Nature 354, 241-245
  30. Shimada, T., Fujii, H., and Lin, H. (1989) J. Biol. Chem. 264, 20171-20174
  31. Smale, S. T., and Baltimore, D. (1989) Cell 57, 103-113
  32. Soininen, R., Huotari, M., Hostikka, S. L., Prockop, D. J., and Tryggvason, K. (1988) J. Biol. Chem. 263, 17217-17220
  33. Tsui, H. W., Mok, S., De Souza, L., Martin, A., and Tsui, F. W. (1993) Gene (Amst.) 131, 201-208
  34. Weis, L., and Reinberg, D. (1992) FASEB J. 6, 3300-3309
  35. Wingender, E. (1990) Crit. Rev. Eukaryotic Gene Expression 1, 11-48
