©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
Transcriptional Regulation of the Carcinoembryonic Antigen Gene
IDENTIFICATION OF REGULATORY ELEMENTS AND MULTIPLE NUCLEAR FACTORS (*)

(Received for publication, August 10, 1994; and in revised form, December 1, 1994)

Wendy Hauck, Clifford P. Stanners (§)

From the Department of Biochemistry and McGill Cancer Centre, McGill University, Montreal, Quebec H3G 1Y6, Canada

ABSTRACT

Human carcinoembryonic antigen (CEA) belongs to a family of membrane glycoproteins that are overexpressed in many carcinomas; CEA functions in vitro as a homotypic intercellular adhesion molecule and can inhibit differentiation when expressed ectopically in myoblasts. The regulation of expression of CEA is therefore of considerable interest. The CEA gene promoter region between -403 and -124 base pairs upstream of the translation initiation site directed high levels of expression in CEA-expressing SW403 cells and was 3 times more active in differentiated than in undifferentiated Caco-2 cells, correlating exactly with the 3-fold increase in CEA mRNA seen in differentiated Caco-2 cells. Inclusion of additional upstream sequences between -1098 and -403 base pairs repressed all activity. By in vitro footprinting and deletion analyses, four cis-acting elements were mapped within the positive regulatory region, and one element within the silencing region. Several nuclear factors binding to these domains were identified: USF, Sp1, and an Sp1-like factor. By co-transfection, USF directly activated the CEA gene promoter in vivo in both SW403 and Caco-2 cells. In addition, the levels of factors binding to each positively acting element increased dramatically with differentiation in Caco-2 cells. Thus the transcriptional control of the CEA gene depends on the interaction of several regulatory elements that bind multiple specific factors.


INTRODUCTION

CEA,(^1) a membrane glycoprotein first observed in human fetal colon and colorectal cancer (1), is a widely used clinical tumor marker. CEA has been shown to function in vitro as a homotypic intercellular adhesion molecule (2, 3) and could thus play an important role during development. A model for a possible carcinogenic role of CEA overproduction in the colon has been suggested (2, 4). In support of this model, we have recently shown that the ectopic expression of CEA on the surface of rat L6 myoblasts can completely block terminal differentiation and the normal loss of proliferative capacity (5). These attributes make CEA an important candidate for studies on control of gene expression.

CEA is a member of the immunoglobulin supergene family and is the prototype for its own subfamily of closely related molecules that vary in domain composition and tissue distribution (for a review, see (6) ). This subfamily consists of 29 closely linked genes on chromosome 19, including those coding for CEA itself, nonspecific cross-reacting antigen (NCA), biliary glycoprotein (BGP), CEA gene family member 6 (CGM6), and a number of other genes with yet undetermined products (hsCGMs)(6, 7) . As with CEA, NCA and BGP have been shown to function in vitro as intercellular adhesion molecules(2, 3, 6) .

Cloned cDNAs for CEA, NCA, and BGP have been used as probes to study their expression in normal and tumor tissue (6). Whereas CEA mRNA is present at low levels in normal adult colon and is usually overexpressed in malignant colon and other cancers of epithelial cell origin, NCA mRNA is found in normal colon, lung, and granulocytes and is elevated by a greater factor in tumors of the colon, breast, and lung (6). Several forms of BGP have been isolated from bile ducts, gallbladder mucosa, and various tumors (6). In contrast to CEA and NCA mRNAs, the expression of two BGP mRNAs has been shown to be down-regulated in colorectal carcinomas in comparison to normal adjacent mucosa (8). The increased levels of CEA have been shown not to be due to gene rearrangements or amplification (9) but, instead, to hypomethylation of upstream regions (9, 10, 11) and/or factor changes leading to altered rates of transcription; post-transcriptional changes have also been implicated (9, 12).

CEA and NCA mRNA levels have also been investigated in the differentiating Caco-2 cell system(12) . When cultured in vitro, between 4 and 11 days after reaching confluence, this human colon adenocarcinoma cell line differentiates and becomes highly polarized, with tight junctions between individual cells and a brush border membrane containing enzymes characteristic of a fully differentiated intestinal epithelium(13, 14) . We found that CEA transcript levels were 3-fold higher in fully differentiated Caco-2 cells than in undifferentiated monolayers(12) . In the present study, we have used the differentiation control of the CEA gene in Caco-2 cells to check the biological validity of the CEA promoter analysis.

To carry out this analysis, we characterized the upstream noncoding region of the CEA gene. A 424-bp 5′ flanking sequence has been reported to confer cell type-specific expression on a reporter gene (15). Numerous purine-rich sites were postulated to play a role in transcriptional control (11), but the exact regulatory elements involved remained unknown. We now demonstrate that both positive and negative elements reside within 1098 bp upstream of the translational start site and that a 403-bp upstream sequence confers cell type-specific and differentiation-dependent expression on the luciferase reporter gene. We identify five nuclear factor binding sites and three of the multiple trans-acting factors (USF, Sp1, and Sp1-like) interacting with the stimulatory domain. In addition, we present direct evidence that USF activates CEA gene transcription in vivo.


MATERIALS AND METHODS

Plasmid Constructions

Upon probing a human epithelial genomic EMBL3 phage library (gift from N. Laskin, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY) with the 5′ untranslated and leader region of human CEA cDNA (16), a 17.7-kilobase pair genomic clone was found to contain sequence identical to that of the CEA coding region. In addition, 2.3 kilobase pairs of upstream sequence were found to be virtually identical to the CEA gene (COSCEA01) sequence already published (15); only five dispersed single base differences were found in 1098 bp of sequenced DNA immediately upstream of the initiation codon. The -3500 to +1 (translational start site of the CEA gene) upstream region of this genomic clone was used for promoter analyses.

5′ deletion mutants of the CEA gene promoter lacked the CEA translation initiation codon and were fused immediately 5′ to the firefly luciferase (LUC) reporter gene (17) at the SmaI site of the pXP2 vector (18). The resulting plasmids were named according to the sizes of their respective CEA promoter restriction fragments as shown in Fig. 1. Internal deletion mutants of the CEA gene promoter, p1098Delta279LUC and p1098aDelta279LUC, were constructed as follows: the AvrII to AvaI fragment was blunt-ended and inserted in the sense and antisense (a) orientation into the blunt-ended AvrII site in p124LUC, thus preserving the transcriptional initiation site at its proper position. p1098+279LUC contains the 974-bp AvrII fragment, spanning from -124 to -1098, inserted into the AvrII site of p403LUC. pRSVLUC (18), containing the RSV long terminal repeat fused to the LUC gene in pXP2, was obtained from Dr. M. Featherstone (McGill Cancer Centre, Montreal); pRSVZbeta-gal was obtained from Dr. E. Shoubridge (Montreal Neurological Institute, Montreal) and contains the beta-galactosidase gene driven by the RSV long terminal repeat.
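The fragment sizes quoted for these constructs follow directly from the upstream coordinates. The short Python sketch below is an illustration only, not part of the original methods; the helper name `fragment_length` is hypothetical, and it assumes, as the text does, that a fragment's size is the simple difference between its two boundary positions.

```python
# Illustrative sketch (not from the original paper).
# Coordinates are bp upstream of the translational start site (+1),
# written as negative integers, as in the text.

def fragment_length(upstream_end: int, downstream_end: int) -> int:
    """Size in bp of a fragment, taken as the difference of its boundaries."""
    if upstream_end > downstream_end:
        raise ValueError("upstream_end must be the more negative coordinate")
    return downstream_end - upstream_end

# Region present in p403LUC but absent from p124LUC.
positive_region = fragment_length(-403, -124)

# AvrII fragment spanning -1098 to -124.
avrii_fragment = fragment_length(-1098, -124)

print(positive_region, avrii_fragment)  # prints: 279 974
```

Both values agree with the 279-bp and 974-bp sizes used in the construct names and descriptions.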


Figure 1: Localization of the elements determining CEA gene promoter activity. CEA promoter activity in various cell lines is shown. The activity of each construct in each of the cell lines is presented relative to the activity of the promoterless vector, pXP2. In all cases, CEA sequences from -2 to -124, including the 5′ untranslated region and transcription initiation site, are present in plasmid constructs. The translational start site is at position +1. Ovals represent regions protected in DNase I footprinting assays. Names of constructs correspond to the sizes of the fragments tested. Sites used to create constructs are shown on the restriction map, with precise positions shown in Table 1. The data represent the mean ± S.D. of three to four independent experiments, each performed in duplicate, and corrected for protein concentration and for transfection efficiency by the activity of the internal RSVZbeta-gal control plasmid. N.D., not determined.





Cells and DNA Transfections

Four human cultured cell lines, SW403 (CCL 230, human colon adenocarcinoma), HT29 (HTB 38, human colon adenocarcinoma), Caco-2 (HTB 37, human colon adenocarcinoma), and HepG2 (HB 8065, human hepatocellular carcinoma) were obtained from ATCC (Rockville, MD) and maintained in Dulbecco's modified Eagle's medium supplemented with 10% fetal bovine serum and penicillin/streptomycin (Life Technologies, Inc.). The LR-73 (Chinese hamster ovary cell-derived) (19) and HeLa R19 (human epithelial cervical carcinoma) (20) cell lines were grown in alpha-minimal essential medium(21) , supplemented with 10% fetal bovine serum and penicillin/streptomycin.

Cells were plated at a density of 1 × 10^6 cells/100-mm plastic Petri dish 24 h before transfection. 10 µg of pRSVLUC or equimolar amounts of the promoter-LUC gene constructs were cotransfected with 5 µg of pRSVZbeta-gal and 10 µg of calf thymus carrier DNA by calcium phosphate-DNA coprecipitation as described previously (22). The precipitate was removed after 15 h, and the cultures were incubated for another 72 h in normal growth medium. Undifferentiated Caco-2 monolayers were transfected at 2 days before confluence; fully differentiated Caco-2 cells were transfected at 11 days after confluence. Their state of differentiation was confirmed by the presence of domes and the development of a brush border membrane (12). γ-Interferon (Collaborative Research Inc., Bedford, MA) was applied to Caco-2 cells at 2000 units/ml as described previously (12) immediately following transfection, with a fresh medium change. LUC activity was measured as described by DeWet et al. (17). beta-Galactosidase activity (23) in these same extracts was measured to correct for variations in transfection efficiency. Relative luciferase activity was calculated as the ratio between LUC and beta-galactosidase activities for each transfection and then reported as -fold above background (considered as activity of the parental promoterless vector, pXP2). Each plasmid was tested in duplicate plates and in three to five different transfection experiments.
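The normalization described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis procedure: the function names are hypothetical, the readings are invented arbitrary-unit values rather than data from this study, and the pooling of experiments into a mean and sample standard deviation is assumed from the figure legend.

```python
# Illustrative sketch; all readings below are hypothetical, NOT data
# from the paper.
from statistics import mean, stdev

def normalized_activity(luc: float, beta_gal: float) -> float:
    """LUC signal corrected for transfection efficiency (LUC / beta-gal)."""
    if beta_gal <= 0:
        raise ValueError("beta-galactosidase activity must be positive")
    return luc / beta_gal

def fold_above_background(construct, background) -> float:
    """Normalized construct activity relative to normalized pXP2 activity.

    Each argument is a (luc, beta_gal) pair from the same experiment.
    """
    return normalized_activity(*construct) / normalized_activity(*background)

# Three hypothetical independent experiments: (LUC, beta-gal) readings for
# a promoter construct and for the promoterless pXP2 vector.
experiments = [
    {"construct": (9000.0, 2.0), "pXP2": (100.0, 2.0)},
    {"construct": (5000.0, 1.0), "pXP2": (50.0, 1.0)},
    {"construct": (12000.0, 3.0), "pXP2": (150.0, 3.0)},
]

folds = [fold_above_background(e["construct"], e["pXP2"]) for e in experiments]
print(f"{mean(folds):.1f} ± {stdev(folds):.1f} fold above background")
```

Dividing by the beta-galactosidase reading before taking the ratio to pXP2 means that a plate with poor DNA uptake does not masquerade as a weak promoter.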

To assess the effect of USF on CEA promoter activity in vivo, 10 µg of p403LUC and either 5 or 15 µg of pRSV.USF (as indicated in Table 2) were coprecipitated with calcium phosphate and transiently transfected into the SW403 or Caco-2 cell line. pRSV.USF was constructed by inserting human USF cDNA (24, 25) into a pUC18-derived expression vector driven by the RSV long terminal repeat (18) . Rat HNF-4 cDNA in the pSG5 expression vector (26) was generously provided by Dr. F. M. Sladek (University of California, Riverside, CA).



Nuclear Extracts and DNase I Footprinting Assays

Nuclear extracts were prepared as described by Therrien et al.(27) . All buffers contained leupeptin, pepstatin A, and aprotinin at concentrations of 1 µg/ml, and phenylmethylsulfonyl fluoride at 1 mM. Extracts were assayed for protein content by the Bio-Rad protein assay and stored at -75 °C.

DNase I footprinting assays were performed essentially as described by LeFèvre et al. (28) with modifications as described by Howell et al. (29). Briefly, 1-2 fmol (10,000 cpm) of DNA probes, labeled at only one end, were incubated with 0-160 µg of nuclear extracts for 15 min on ice; restriction enzyme-grade bovine serum albumin (Boehringer Mannheim) was added so that equal total amounts (160 µg) of protein were present in each reaction. Freshly diluted DNase I (Life Technologies, Inc.) at 20 ng/30 µl reaction volume was added for 3 min on ice. Reactions were stopped, and the DNA was digested with proteinase K, phenol-extracted, and precipitated. The dried DNAs were suspended in formamide loading dye, and equal counts/min were loaded on an 8% polyacrylamide, 8 M urea sequencing gel. Sequencing reactions (23) were run alongside as size markers. The dried gels were autoradiographed at -75 °C with Cronex Quanta III intensifying screens (E. I. du Pont de Nemours & Co.).
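The protein balancing used in these titrations amounts to topping up each reaction with bovine serum albumin to a fixed total. A small Python sketch of that arithmetic follows; the helper name `bsa_to_add` is hypothetical, and the intermediate extract amount in the example is chosen only for illustration.

```python
# Illustrative sketch of the BSA top-up arithmetic (not from the paper).
TOTAL_PROTEIN_UG = 160.0  # total protein per reaction, as stated in the text

def bsa_to_add(extract_ug: float, total_ug: float = TOTAL_PROTEIN_UG) -> float:
    """Micrograms of BSA needed so extract + BSA equals the fixed total."""
    if not 0 <= extract_ug <= total_ug:
        raise ValueError("extract amount must lie between 0 and the total")
    return total_ug - extract_ug

# 0 and 160 µg are the ends of the range in the text; 40 µg is a
# hypothetical intermediate point.
for extract in (0, 40, 160):
    print(f"{extract:>3} µg extract + {bsa_to_add(extract):>5.1f} µg BSA")
```

Holding total protein constant keeps nonspecific effects on DNase I digestion comparable across lanes, so that differences in cleavage can be attributed to the nuclear extract itself.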

Gel Mobility Shift Assays

Probes used for gel mobility shift assays corresponded to the protected sequences in the DNase I footprinting experiments, including a few nucleotides on either side. Their sequences and those of other competitor oligonucleotides used are shown in Table 1. The double-stranded oligonucleotides were synthesized (Sheldon Biotechnology Centre, McGill University), purified by gel electrophoresis, phosphorylated with T4 polynucleotide kinase (Pharmacia Biotech Inc.) in the presence of [γ-^32P]ATP (Amersham Corp.), and purified from unincorporated radioactivity by passage through NICK columns (Pharmacia). About 2 fmol (15,000 cpm) of 5′-end-labeled, double-stranded oligonucleotides were then incubated with 20-60 µg of various nuclear extracts and 4 µg of poly(dI-dC) in binding buffer (29) at 23 °C for 15 min. For competition experiments, the competitor oligonucleotides and extracts were mixed together for 15 min at 23 °C, then probe was added for an additional 15-min incubation. All samples were subjected to electrophoresis on 5% nondenaturing polyacrylamide gels buffered with 1× TGE (Tris/glycine/EDTA) (23). The gels were autoradiographed at -75 °C with an intensifying screen. Purified human AP-2 (30) and Sp1 (31) transcription factors were obtained from Promega Corp. (Madison, WI); rabbit polyclonal IgG specific to the human p95 and p106 Sp1 proteins was obtained from Santa Cruz Biotechnology, Inc. (Santa Cruz, CA). Human USF cDNA (24, 25) in a T7-driven plasmid and a USF-specific antibody were gifts of Dr. R. G. Roeder (The Rockefeller University, New York, NY). USF protein was obtained directly from USF cDNA by T7 RNA polymerase (New England Biolabs) transcription in vitro, followed by translation in vitro using rabbit reticulocyte lysate (Promega).

To determine equivalent amounts of nuclear extracts from Caco-2 cells in their undifferentiated and differentiated states, oligonucleotides binding to ubiquitous factors such as Sp1 (31) or AP1 (32) could not be used since the levels of these factors have been observed to change with differentiation ((33) and data not shown). Thus equal concentrations of isolated nuclei were used to normalize the extracts. Using this approach, with the preparations shown in Fig. 6, 20 µg of undifferentiated Caco-2 nuclear extract was equivalent to 18 µg of differentiated Caco-2 nuclear extract. The data shown in Fig. 6 are representative of several independent experiments using independent nuclear extracts.
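The equivalence between the two extracts can be illustrated with a short Python sketch. It assumes that the extracts are compared by the amount of protein recovered from an equal number of nuclei; the function name `equivalent_amount` and the per-nucleus yields are hypothetical, chosen only so that the ratio reproduces the 20 µg to 18 µg relationship reported above.

```python
# Illustrative sketch; the yields are hypothetical, not measured values.

def equivalent_amount(reference_ug: float,
                      reference_yield: float,
                      other_yield: float) -> float:
    """Mass of the other extract derived from the same number of nuclei.

    Yields are protein recovered per fixed number of nuclei, in any
    consistent unit (for example µg per 10^6 nuclei).
    """
    if reference_yield <= 0 or other_yield <= 0:
        raise ValueError("yields must be positive")
    return reference_ug * other_yield / reference_yield

# Hypothetical yields differing by 10%, as stated for the two cell states.
undifferentiated_yield = 50.0
differentiated_yield = 45.0

print(equivalent_amount(20.0, undifferentiated_yield, differentiated_yield))
# prints: 18.0
```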


Figure 6: Analysis of Caco-2 nuclear proteins binding to CEA regulatory elements versus differentiation. Oligonucleotides containing the DNA sequence of three of the CEA regulatory elements were used as probes to reveal Caco-2 nuclear proteins which specifically recognized them and to indicate the relative abundance of these factors in equivalent amounts of nuclear extracts prepared from undifferentiated (U) and differentiated (D) Caco-2 cells. By using equal concentrations of isolated nuclei before the elution of nuclear proteins, 20 µg of undifferentiated Caco-2 nuclear extract was found to be equivalent to 18 µg of differentiated Caco-2 nuclear extract in the experiment shown here. The indicated unlabeled oligonucleotides used as competing DNAs assessed the specificity of the complexes formed. Arrows point to the specific complexes.




RESULTS

Functional Analysis Reveals Negative and Positive Cell-specific Regulatory Elements

A genomic CEA clone containing 3.5 kilobase pairs of sequence upstream of the initiation codon was isolated from a normal human epithelial cell library and was found to be virtually identical in the 5′ noncoding sequence to that reported by Schrewe et al. (15) for the CEA gene. In the present report, we specifically localize the regulatory elements in the CEA upstream region by various 5′ deletion constructs in several CEA-producing and non-producing cell lines. A very high CEA-producing human colon carcinoma cell line, SW403, showed high levels of luciferase expression when driven by a 403-bp upstream region of the CEA gene (the p403LUC construct in Fig. 1): 91-fold above background relative to only 2-fold and 5.5-fold above background in CEA-non-expressing rodent LR-73 and human HeLa R19 cells, respectively. In a low CEA-expressing human hepatocarcinoma cell line, HepG2, the 403-bp minimal promoter gave 7.7-fold above background activity, whereas in HT29 cells, a medium CEA-expressing human colon carcinoma cell line, it was surprisingly not active (1.8-fold above background). This was perhaps due to the inherently low transfection efficiency for HT29 cells, despite corrections for efficiency.

Deletion of the -403 to -124 sequence, leaving only the transcription initiation site and 5′ untranslated region intact (p124LUC construct), abolished all promoter activity (Fig. 1). Thus regulatory elements responsible for the control of cell-specific expression of the CEA gene reside within this 279-bp upstream region. Inversion of the 403-bp minimal promoter in the p403aLUC antisense construct strongly reduced but did not abolish activity, suggesting that the elements within the minimal promoter may use cryptic signals on either strand for transcriptional initiation (no TATA box can be found in the upstream sequence of the CEA gene). Inclusion of further upstream sequences (-1098 to -403) in conjunction with the minimal promoter, as seen with the p1098LUC construct, repressed promoter activity markedly in SW403 cells (Fig. 1). The -1098 to -403 region alone, placed in either orientation before the fragment containing the transcription initiation site, showed no activity whatsoever (constructs p1098Delta279LUC and p1098aDelta279LUC). Hence, a silencer region, which can down-regulate CEA gene transcription, must lie within the -1098 to -403 bp sequence. This silencer can nevertheless be completely overcome by the upstream placement of a second 279-bp (from -403 to -124) region at -1098 bp (construct p279+1098LUC) (Fig. 1).

Differentiation Dependence of Regulatory Elements

To determine whether the transcriptional control of CEA expression seen previously with the differentiation of Caco-2 cells (12) could be accounted for by the CEA gene upstream regulatory region, all promoter constructs were transiently transfected into undifferentiated, subconfluent Caco-2 cells and into fully differentiated Caco-2 cells. The last two columns in Fig. 1 summarize the results. The 403-bp minimal promoter was found to be 3 times more active in differentiated than in undifferentiated Caco-2 cells, thus correlating exactly with the 3-fold increase seen in CEA mRNA in differentiated monolayers (12). CEA regulatory elements found within this 403-bp region could therefore be responsible for differentiation-dependent CEA expression in Caco-2 cells.

As seen with the SW403 cell line, the inclusion of additional upstream (-1098 to -403) sequence also repressed promoter activity in Caco-2 cells, and further deletions of the minimal promoter past position -403 generally decreased activity. Only in Caco-2 cells, however, does the p300LUC construct give higher activity than the p403LUC construct. It is thus possible that a second silencing element, recognized by factors only in Caco-2 nuclear extracts, is present in the -403 to -300 region.

Since γ-interferon increases levels of CEA mRNA quite dramatically in the Caco-2 cell line (12), this cytokine was applied for 3 days to examine changes in promoter activity. However, no effects on promoter activity were seen (data not shown), despite the presence of several possible γ-interferon activation sites and γ-interferon-stimulated response elements in the upstream promoter region (-835 to -1650).

DNase I Footprinting

To localize the regulatory elements responsible for the activities seen with the luciferase constructs in Fig. 1, we performed DNase I protection experiments using end-labeled fragments as probes incubated with nuclear extracts from various CEA-expressing and non-expressing cell lines. In all cases, any observed footprints were considered valid only if demonstrated on both strands in repeated experiments. Within the -403 to -124 upstream region, the deletion of which abolished all promoter activity (Fig. 1; p124LUC construct), we identified four sites protected from DNase I digestion by various CEA-expressing cell line nuclear extracts but not by CEA non-expressing LR-73 extracts (Fig. 2, B and C). Footprints (FP) FP1 and FP4 were clearly visible with nuclear extracts from most of the CEA-expressing lines; their positions in the upstream sequence are shown in Fig. 2A and in Fig. 1 (ovals). Comparison of the luciferase activities of p300LUC with p280LUC, containing and lacking the FP4 element, respectively, indicates an approximately 2-fold contribution of this element to the minimal promoter activity.


Figure 2: Localization of binding sites for nuclear proteins within the CEA promoter using DNase I footprinting. A, restriction enzyme map of the CEA promoter showing sites used to generate the probes used in the footprinting experiments. Probes I-III were labeled as indicated by asterisks at either the 5′ or 3′ end. The circled numbers show the positions of the footprints detected. B, localization of sites within the -2 to -403 region. 5′-end-labeled probe I was incubated as described under ``Materials and Methods,'' with nuclear extracts from LR-73, SW403, and Caco-2 (undifferentiated monolayers) as indicated. G refers to the G sequencing track; lanes 0 contained 160 µg of bovine serum albumin only, and lanes 10-160 show results for 10-160 µg of the indicated nuclear protein extracts. The positions of the various footprints are indicated on the right by bars and labels; their precise sequences are listed in Table 3. The positions of some guanine residues are shown on the left for orientation purposes. Footprints were numbered according to their proximity to the transcription initiation site. C, localization of binding sites with a 3′-end-labeled -403 to -2 probe (Probe II). Footprints FP2 and FP3 are revealed using HepG2 nuclear extract. This probe is also shown using SW403 nuclear extract for comparison. DNase I hypersensitive sites () produced by binding of nuclear proteins to the FP2 and FP3 elements are visible. D, localization of binding sites in the silencer region with a 3′-end-labeled -835 to -403 probe (Probe III). In addition to SW403 nuclear extract, a nuclear extract prepared from HT29 cells was assayed for comparison. Both extracts reveal DNase I hypersensitive sites () flanking the FP5 element.





Two additional footprints, labeled FP2 and FP3, were revealed using nuclear extract from HepG2 cells and were also present, although less apparent, with extracts from SW403 cells (Fig. 2C). Two DNase I-hypersensitive sites are visible between FP2 and FP3.

In Fig. 2D, the -835 to -403 silencing region was used as an end-labeled probe in DNase I footprinting experiments; only one protected region could be detected in repeated experiments. This footprint, labeled FP5, was more apparent in HT29 nuclear extract and was flanked by two DNase I-hypersensitive sites. Deletion of the -1098 to -403 region containing this element led to an increase in promoter activity in all cell lines tested (Fig. 1, p1098LUC versus p403LUC constructs).

USF Binds to the CEA FP1 Element

Comparison of the protected sequence in FP1 to a transcription factor data base identified several candidate binding factors. These included the upstream stimulatory factor (USF) (34) and hepatic nuclear factor 4 (HNF-4) (26), also known as liver-specific factor A1 (LF-A1) (35). Two complexes with labeled FP1 oligonucleotide (Table 1), C1 and C2, were visible by band-shift assays using SW403 (Fig. 3A) or HepG2 (Fig. 3C) nuclear extracts. Formation of both complexes was competed effectively by 5- and 10-fold molar excesses of either unlabeled FP1 oligonucleotide (Fig. 3A, lanes 2-4) or of GAL2 oligonucleotide, whose sequence represents a USF-binding element in the GAL2 gene (36) (Fig. 3A, lanes 5 and 6). The two complexes were not affected, however, by an excess of an oligonucleotide representing the HNF-4/LF-A1 motif taken from the alpha1-antitrypsin gene (35) (alpha1-AT, Fig. 3A, lane 7), suggesting that the FP1 element is not an HNF-4/LF-A1 site. This corroborates our studies in which 10 µg of p403LUC and up to 20 µg of functional HNF-4 cDNA were co-transfected into SW403, Caco-2, HepG2, and LR-73 cells. Activation of the CEA promoter in p403LUC by HNF-4 could not be detected (data not shown), even though USF was capable of activating p403LUC in the same experiment (see below).


Figure 3: Gel mobility shift assay reveals that the USF transcription factor binds to the FP1 regulatory element. A, synthetic, double-stranded oligonucleotide representing FP1 (Table 1) was end-labeled, incubated with 60 µg of SW403 nuclear extract and electrophoresed, showing two complexes (C1 and C2). 5-, 10-, or 100-fold molar excesses of competing oligonucleotides were added as indicated. B, USF synthesized in vitro (see ``Materials and Methods'') was added to the FP1 probe in lanes 9-11; a 5-fold molar excess of unlabeled FP1 DNA was added in lane 10; lanes 11 and 12, USF specific antibody was first added to USF protein or SW403 nuclear extract, respectively, for 15 min at 0 °C, after which the FP1 probe was added and incubated for another 15 min on ice. The supershifted (s.s.) complex is indicated by an arrow on the right. C, for comparison, 40 µg of HepG2 nuclear extract was analyzed for binding under the same conditions as that with SW403 extract. 5- and 50-fold molar excesses of unlabeled, double-stranded FP1 oligonucleotide were used as competitor DNAs.



Since the GAL2 USF oligonucleotide competed effectively for complex formation, we tested whether purified USF transcription factor would recognize the FP1 element. The complex formed (Fig. 3B, lanes 9 and 10) comigrated with C1 from SW403 extract (Fig. 3B, lane 8). As expected, specific anti-USF antibody supershifted the DNA/USF complex (Fig. 3B, lane 11) but also supershifted the C1 complex from SW403 nuclear extract (Fig. 3B, lane 12). Thus colon carcinoma cell lines clearly contain USF, which can bind to the FP1 element in the CEA gene promoter. The presence of two complexes in HepG2 nuclear extracts comigrating with those obtained with SW403 extract (Fig. 3C) suggests that USF is also present in this hepatoma-derived cell line.

USF Activation of the CEA Gene Promoter in Vivo

To test directly whether USF could modulate CEA gene promoter activity in vivo, an expression plasmid carrying USF cDNA, pRSV.USF, was co-transfected into various cells together with the CEA minimal promoter construct, p403LUC. Co-transfection with 15 µg of pRSV.USF produced 2.6-fold greater LUC activity in SW403 cells and 3.1-fold greater activity in undifferentiated Caco-2 cells than the level produced by co-transfection with 15 µg of pUC, the parental vector of pRSV.USF, lacking USF cDNA; in addition, the effect of pRSV.USF was concentration-dependent (Table 2). In contrast, the activity of the basic luciferase vector, pXP2, was not influenced by co-transfected pRSV.USF (data not shown), indicating that USF-mediated transactivation was not due to sequences within the luciferase vector. In addition, as mentioned above, co-transfection with functional HNF-4 cDNA also had no effect on the activity of p403LUC. Thus it seems likely that CEA promoter activity is partially controlled by the level of USF in differentiated Caco-2 cells and colon tumors. However, USF is a ubiquitously expressed factor and CEA shows a much more restricted tissue-specific expression pattern. Therefore, other elements and trans-acting factors of the CEA gene promoter must interact to achieve tissue-specific expression.

Sp1, Sp1-like, and an Unknown Nuclear Factor Bind the CEA FP2 and FP3 Elements

Gel mobility shift assays with oligonucleotides representing the FP2 and FP3 regions are shown in Fig. 4 and Fig. 5, respectively. Three retarded complexes, C1, C2, and C3, were seen with the FP2 probe (Fig. 4A). Two of these, C1 and C2, comigrated with the two retarded complexes seen with the FP3 probe (Fig. 5A). FP2 and FP3 compete with each other for the formation of C1 and C2 (Fig. 4A and Fig. 5A), as would be expected from the high sequence homology between these two adjacent sites. The C3 complex appears to be due to an unrelated factor specifically recognizing the FP2 element, since excess FP2 oligonucleotide, but not FP3 oligonucleotide, competes for its formation (Fig. 4A, lanes 4 and 6). The FP2 sequence exhibited some homology to the AP1 consensus recognition site, but molar excesses of an oligonucleotide representing an AP1 site did not compete with the FP2 probe (Fig. 4A, lanes 9 and 10) or with the FP3 probe (Fig. 5A, lanes 9 and 10). The FP3 sequence showed some homology to a PEA3 site (32, 37), but an excess of an oligonucleotide with the PEA3 sequence from the polyoma enhancer (PY) did not inhibit complex formation with either the FP3 probe (Fig. 5, lanes 11-13) or the FP2 probe (data not shown). Positive competition by an oligonucleotide representing an Sp1 binding site (38) with both FP2 and FP3 as probes (Fig. 4A and Fig. 5A, lanes 7 and 8), however, indicated that this factor may be responsible for complexes C1 and C2, but not C3. Although the FP2 and FP3 sites are remarkably GA-rich, rather than the typical GC-rich Sp1 site sequence (38), purified Sp1 protein did form complexes with both the FP2 and FP3 probes, which comigrated with the C1 complex in SW403 extract (Fig. 4B, lanes 11 and 12; Fig. 5B, lanes 14 and 15). Antibody specific to Sp1 supershifted the Sp1 C1 complexes formed with the FP2 and FP3 elements, as expected, but also the FP2 and FP3 C1 complexes formed with SW403 nuclear extracts (Fig. 4B, lanes 14 and 15; Fig. 5B, lanes 17 and 18). The C2 and C3 complexes remained unaffected. Since Sp1 protein did not produce a band comigrating with the C2 complex and the latter was not affected by an Sp1-specific antibody, but Sp1 oligonucleotide did compete for the formation of the C2 complex, we can only surmise that the second complex is due to an Sp1-like factor, and not Sp1 itself. Thus, the FP2 and FP3 regulatory elements bind Sp1 as well as an Sp1-related factor. In addition, another unknown factor binds exclusively to the FP2 site to produce the C3 complex. Similar complexes were obtained with Caco-2 extracts (Fig. 6).


Figure 4: Element FP2 is recognized by Sp1 and another novel transcription factor. A, end-labeled, double-stranded FP2 oligonucleotide (Table 1) was incubated with 40 µg of SW403 nuclear extract and subjected to electrophoretic analysis. Three complexes (C1-C3) were revealed. Unlabeled, double-stranded FP2, FP3, Sp1, and AP1 oligonucleotides (Table 1; 10- and 100-fold molar excesses) were used as competitor DNAs. B, 3.5 ng of purified Sp1 protein was incubated with FP2 probe in lanes 12 and 13 under the same binding conditions. Antibody specific to Sp1 was preincubated with 14 ng of Sp1 protein (lane 14) and 40 µg of SW403 nuclear extract (lane 15). The Sp1 supershifted (s.s.) complex is indicated by an arrow on the right.




Figure 5: Element FP3 is bound by Sp1. A, electrophoretic analysis of end-labeled, double-stranded FP3 oligonucleotide (Table 1) incubated with 40 µg of SW403 nuclear extract shows two specific complexes (C1 and C2). Unlabeled, double-stranded FP3, FP2, Sp1, AP1, and PY oligonucleotides (Table 1; molar excesses as shown) were used as competitor DNAs. B, in lanes 15 and 16, FP3 probe and 3.5 ng of purified Sp1 protein were incubated as described in Fig. 4. Sp1-specific antibody was preincubated with 14 ng of Sp1 protein (lane 17) and 40 µg of SW403 nuclear extract (lane 18). The Sp1 supershifted (s.s.) complex is indicated by an arrow on the right. C, 20 µg of HepG2 nuclear extract was incubated with FP3 probe under the same binding conditions for comparison purposes.



Levels of Nuclear Factors in Differentiating Caco-2 Cells

Caco-2 cells show increased levels of CEA mRNA (12) and increased transcriptional activity of transfected CEA promoter-luciferase constructs (Fig. 1) with differentiation. Equivalent amounts of undifferentiated and differentiated Caco-2 nuclear extracts (based on equivalent numbers of nuclei that differed by only 10% in the concentration of total nuclear proteins; see "Materials and Methods") were examined for the levels of transcription factors by gel mobility shift assays using labeled oligonucleotides representing three of the five CEA cis-acting regulatory elements identified above (Fig. 6).

Probes FP1, FP2, and FP3 each produced several complexes (Fig. 6, lanes 1-12), similar to those seen with SW403 extracts (Fig. 3, Fig. 4, and Fig. 5). The levels of the complexes obtained were dramatically higher using differentiated (D) than undifferentiated (U) Caco-2 nuclear extracts (Fig. 6). Thus, the levels of USF, Sp1, the Sp1-like factor, and the unknown factor responsible for the C3 complex with the CEA FP2 element all appear to increase with differentiation in Caco-2 cells. The higher levels support our contention that an increase in the abundance of positive factors interacting with the regulatory elements in the minimal promoter is partially responsible for the rise in CEA transcription observed in this differentiating system.


DISCUSSION

In this study, we have delineated the basic organization of the CEA gene promoter (summarized in Fig. 7 and Table 3), which is the first to be determined for genes of the CEA subgroup of this human tumor marker family. Functional assays using various 5′-flanking sequences of the CEA promoter linked to the luciferase reporter gene and transfected into CEA-producing cells, coupled with DNase footprint assays, revealed that the 5′ upstream region contains four positive regulatory elements (FP1-FP4) and one negative regulatory element (FP5). The upstream stimulatory factor (USF) (34), also known as the adenovirus major late transcription factor (MLTF), was shown to bind to the FP1 element; the positive control of CEA transcription by USF was confirmed directly by the demonstration of specific stimulation of the CEA promoter in vivo by a co-transfected USF-producing plasmid. Sp1 (38) and an Sp1-like transcription factor were found to bind to both the FP2 and the FP3 elements. Through computer sequence analyses, FP4 was found to resemble an AP-2 transcription factor site (30, 39), and preliminary experiments (data not shown) supported this possibility; oligonucleotides containing AP-2 binding sites competed with the FP4 site for nuclear factor binding, and purified AP-2 protein was capable of forming a complex with the FP4 element. Other factors binding to the FP4 site, a third factor binding to the FP2 site, and factors recognizing the FP5 silencer element remain to be identified.


Figure 7: Schematic representation of the various nuclear proteins binding the regulatory elements of the CEA promoter. Those factors that have been identified are indicated. Factors binding to the BGP promoter are also shown to allow comparison of transcription factor complexes specific to the CEA gene promoter. The arrow signifies the major transcriptional start site. Sequences and positions are compared in Table 3.



The biological significance of these element and factor assignments was further tested using the Caco-2 colonocyte system, which shows an increase in CEA mRNA with differentiation into polarized epithelium. Since the levels of the positively acting factors increased dramatically with differentiation in this system, it is possible that the transcription factors identified here could also control the expression of CEA in normal colonic epithelial cells, which show an increase in CEA mRNA (40) and protein (41) production during their differentiation in transit from the bottom to the top of a crypt. The basis for transcriptional changes seen for CEA and other CEA family members in tumors remains to be investigated.

Sequence comparisons of the CEA gene regulatory elements with the upstream noncoding sequences of other CEA gene family members are shown in Table 3. The control of expression of this family is of particular interest because of the unusually high similarity of the nucleotide sequences of its members (often over 90%). PSG5 and PSG11 are members of the pregnancy-specific glycoprotein (PSG) subgroup of the CEA gene family whose upstream sequences showed homology to the CEA FP2 and FP3 elements only. We have also analyzed the control of transcription of a second CEA family member, BGP (42), which, unlike CEA (6), can show decreased transcript levels in colon carcinomas relative to adjacent normal tissue (8). Comparison of the CEA and BGP gene promoters (see Fig. 7 for summary) revealed differences that could explain their differential regulation. Only two of the corresponding BGP gene elements could be shown to bind nuclear factors: thus Sp1, the Sp1-like factor (recognizing FP2 and FP3 in CEA), and the silencer factors (recognizing FP5 in CEA) do not bind to the BGP promoter, while a second factor, HNF-4 (26), as well as USF, binds to the USF site (42). Experiments are in progress to determine whether the different changes in transcription of the CEA and BGP genes seen with colon carcinogenesis can be rationalized by changes in these factors.

Although the Sp1 transcription factor has long been characterized (31), only recently have dramatically increased levels of this ubiquitously expressed factor been correlated with differentiation (33). Genes regulated by Sp1 include the following: fibronectin, which shows greatly inhibited expression upon neoplastic transformation (43); E-cadherin, which is generally down-regulated in tumors (44); and the human papillomavirus type 18 E6-E7 oncogene, which also has an unusual Sp1 site (45). Specific recognition of the DNA sequence is provided by the three zinc finger domains of Sp1 (46). Although the core sequence is a typical GC box, substitutions are tolerated as long as certain G residues are present for contact with the "fingers" (46). These contact points exist in the FP2 and FP3 elements of the CEA gene, within the aligned homologous sequences of the NCA gene, and within the CGM1 sequence aligned to the CEA FP2 element (see Table 3).

The USF gene has recently been cloned and is now known to code for a ubiquitous factor with a helix-loop-helix repeat domain and a leucine zipper (34). Its protein-binding interface is similar to that of Myc and Max (47). All Myc family members recognize an identical core DNA target sequence of CACGTG and appear to bind to DNA as homo- or heterodimers, dependent on specificities contained within the leucine zipper (34, 48, 49). Since we have directly demonstrated that USF activates the CEA promoter in vivo, any factors interacting with and modulating this nuclear factor should also modulate CEA gene expression. Although purified Myc protein did not bind to the CEA FP1 site (data not shown), it remains possible that the heterodimer c-Myc/Max could bind to this element or to USF itself. Both Myc/Max and USF have also been shown to bend DNA toward the minor groove to the same angle and orientation (50). Pognonec et al. (51) have also demonstrated that the DNA-binding activity of USF is regulated via a redox-dependent mechanism. The activity of the CEA promoter could therefore be partially controlled by a complex balance between the binding of various other b-HLH proteins to form heterodimers with USF, the redox mechanism, and competition for DNA binding by other factors, such as HNF-4, as shown for the BGP promoter (42). As previously mentioned, CEA mRNA levels are up-regulated in colon carcinomas. About 70% of colon carcinomas overexpress the c-Myc gene as well as other members of the Myc gene family (52, 53), although the status of USF expression is presently unknown.
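As an aside on the CACGTG core sequence mentioned above, a candidate site of this kind can be located in a promoter sequence with a simple string scan. The Python sketch below is only illustrative: the function names are hypothetical, and the example sequence is an invented toy string, not the CEA FP1 element or any sequence from this study. Because CACGTG is its own reverse complement, scanning one strand is sufficient.

```python
import re

E_BOX = "CACGTG"  # core target sequence cited in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motif(seq: str, motif: str = E_BOX) -> list[int]:
    """Return 0-based start positions of every (possibly overlapping) motif match."""
    # A lookahead is used so that overlapping occurrences are also reported.
    pattern = re.compile(f"(?={re.escape(motif.upper())})")
    return [m.start() for m in pattern.finditer(seq.upper())]


# Toy sequence, made up for illustration only.
toy_sequence = "ttcacgtgaaCACGTG"
print(find_motif(toy_sequence))
print(reverse_complement(E_BOX) == E_BOX)
```

A match found this way indicates only sequence similarity; whether a factor actually binds must be established experimentally, as was done here by footprinting and gel mobility shift assays.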

This study has identified many of the cis-acting elements involved in the transcriptional control of the CEA gene and some of the trans-acting factors that interact with them. USF, in particular, is involved in both CEA and BGP control and may play an important part in the overall control system of the CEA gene family. These assignments, coupled with further studies on other family members, should lead to the rationalization of the observed tissue-specific, differentiation-dependent expression of the family. The possible deregulation of these trans-acting transcriptional factors in colon carcinogenesis could represent the basis for changes in the expression of CEA and other family members seen in tumors, changes that could be instrumental in the carcinogenic process (2). Ectopic expression of CEA, for example, has been shown recently to block myogenic differentiation and leave cells with division potential (5). It will now be of interest to determine whether the CEA regulatory elements identified here are targets for the action of oncogenes and tumor suppressor genes.


FOOTNOTES

*
This work was supported by grants from the Medical Research Council of Canada and the National Cancer Institute of Canada. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

§
To whom correspondence should be addressed: McGill Cancer Centre, McGill University, 3655 Drummond St., Rm. 701, Montreal, Quebec H3G 1Y6, Canada. Tel.: 514-398-7279; Fax: 514-398-6769.

(^1)
The abbreviations used are: CEA, carcinoembryonic antigen; bp, base pair(s); LUC, luciferase; RSV, Rous sarcoma virus; FP, footprint; NCA, nonspecific cross-reacting antigen; BGP, biliary glycoprotein; CGM6, CEA gene family member 6; USF, upstream stimulatory factor; HNF-4, hepatic nuclear factor 4; PSG, pregnancy-specific glycoprotein.


ACKNOWLEDGEMENTS

We are indebted to Drs. P. Pognonec, T. Gutjahr, and R. Roeder (Rockefeller University, New York) for USF cDNA and anti-USF antibody, and to Dr. F. M. Sladek (University of California, Riverside, CA) for HNF-4 cDNA. We also thank Dr. N. Beauchemin for stimulating discussions and critical advice and Drs. M. Chamberlin and J. Chou (National Institutes of Health, Bethesda, MD) for helpful discussions of unpublished work.


REFERENCES

  1. Gold, P., and Freedman, S. O. (1965) J. Exp. Med. 121, 439-462
  2. Benchimol, S., Fuks, A., Jothy, S., Beauchemin, N., Shirota, K., and Stanners, C. P. (1989) Cell 57, 327-334
  3. Oikawa, S., Inuzuka, C., Kuroki, M., Matsuoka, Y., Kosaki, G., and Nakazato, H. (1989) Biochem. Biophys. Res. Commun. 164, 39-45
  4. Jessup, J. M., and Thomas, P. (1989) Cancer Metastasis Rev. 8, 263-280
  5. Eidelman, F. J., Fuks, A., DeMarte, L., Taheri, M., and Stanners, C. P. (1993) J. Cell Biol. 123, 467-475
  6. Thompson, J. A., Grunert, F., and Zimmermann, W. (1991) J. Clin. Lab. Anal. 5, 344-366
  7. Barnett, T. R., Drake, L., and Pickle, W., II (1993) Mol. Cell. Biol. 13, 1273-1282
  8. Neumaier, M., Paululat, S., Chan, A., Matthaes, P., and Wagener, C. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 10744-10748
  9. Boucher, D., Cournoyer, D., Stanners, C. P., and Fuks, A. (1989) Cancer Res. 49, 847-852
  10. Tran, R., Kashmiri, S. V. S., Kantor, J., Greiner, J. W., Pestka, S., Shively, J. E., and Schlom, J. (1988) Cancer Res. 48, 5674-5679
  11. Willcocks, T. C., and Craig, I. W. (1990) Genomics 8, 492-500
  12. Hauck, W., and Stanners, C. P. (1991) Cancer Res. 51, 3526-3533
  13. Pinto, M., Robine-Leon, S., Appay, M.-D., Kedinger, M., Triadou, N., Dussaulx, E., Lacroix, B., Simon-Assmann, P., Haffen, K., Fogh, J., and Zweibaum, A. (1983) Biol. Cell 47, 323-330
  14. Rousset, M. (1986) Biochimie (Paris) 68, 1035-1040
  15. Schrewe, H., Thompson, J. A., Bona, M., Hefta, L. J., Maruya, A., Hassauer, M., Shively, J. E., von Kleist, S., and Zimmermann, W. (1990) Mol. Cell. Biol. 10, 2738-2748
  16. Beauchemin, N., Benchimol, S., Cournoyer, D., Fuks, A., and Stanners, C. P. (1987) Mol. Cell. Biol. 7, 3221-3230
  17. De Wet, J. R., Wood, K. V., DeLuca, M., Helinski, D. R., and Subramani, S. (1987) Mol. Cell. Biol. 7, 725-737
  18. Nordeen, S. K. (1983) BioTechniques 6, 454-456
  19. Pollard, J. W., and Stanners, C. P. (1979) J. Cell. Physiol. 98, 571-586
  20. Abraham, G., and Colonno, R. J. (1984) J. Virol. 51, 340-345
  21. Stanners, C. P., Elicieri, G. L., and Green, H. (1971) Nature 230, 52-54
  22. Baserga, R., Croce, C., and Rovera, G. (1980) Introduction of Macromolecules into Viable Mammalian Cells, Vol. 1, Wistar Symposium Series, Alan R. Liss, New York
  23. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
  24. Sawadogo, M. (1988) J. Biol. Chem. 263, 11994-12001
  25. Sawadogo, M., Van Dyke, M. W., Gregor, P. D., and Roeder, R. G. (1988) J. Biol. Chem. 263, 11985-11993
  26. Sladek, F. M., Zhong, W., Lai, E., and Darnell, J. E., Jr. (1990) Genes & Dev. 4, 2353-2365
  27. Therrien, M., and Drouin, J. (1991) Mol. Cell. Biol. 11, 3492-3503
  28. LeFèvre, C., Imagawa, M., Dana, S., Grindlay, J., Bodner, M., and Karin, M. (1987) EMBO J. 6, 871-981
  29. Howell, B. W., Lagacé, M., and Shore, G. C. (1989) Mol. Cell. Biol. 9, 2928-2933
  30. Williams, T., and Tjian, R. (1991) Genes & Dev. 5, 670-682
  31. Briggs, M. R., Kadonaga, J. T., Bell, S. P., and Tjian, R. (1986) Science 234, 47-52
  32. Wasylyk, C., Wasylyk, B., Heidecker, G., Huleihel, M., and Rapp, U. R. (1989) Mol. Cell. Biol. 9, 2247-2250
  33. Saffer, J. D., Jackson, S. P., and Annarella, M. B. (1991) Mol. Cell. Biol. 11, 2189-2199
  34. Gregor, P. D., Sawadogo, M., and Roeder, R. G. (1990) Genes & Dev. 4, 1730-1740
  35. Ramji, D. P., Tadros, M. H., Hardon, E. M., and Cortese, R. (1991) Nucleic Acids Res. 19, 1139-1146
  36. Bram, R. J., and Kornberg, R. D. (1987) Mol. Cell. Biol. 7, 403-409
  37. Wasylyk, B., Wasylyk, C., Flores, P., Begue, A., Leprince, D., and Stehelin, D. (1990) Nature 346, 191-193
  38. Kadonaga, J. T., Carner, K. R., Masiarz, F. R., and Tjian, R. (1987) Cell 51, 1079-1090
  39. Williams, T., Admon, A., Lüscher, B., and Tjian, R. (1988) Genes & Dev. 2, 1557-1569
  40. Jothy, S., Yuan, S.-Y., and Shirota, K. (1993) Am. J. Pathol. 143, 250-257
  41. Ahnen, D. J., Nakane, P. K., and Brown, W. R. (1982) Cancer 49, 2077-2090
  42. Hauck, W., Nédellec, P., Turbide, C., Stanners, C. P., Barnett, T. R., and Beauchemin, N. (1994) Eur. J. Biochem. 223, 529-541
  43. Dean, D. C., Bowlus, C. L., and Bourgeois, S. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 1876-1880
  44. Behrens, J., Löwrick, O., Klein-Hitpass, L., and Birchmeier, W. (1991) Proc. Natl. Acad. Sci. U. S. A. 88, 11495-11499
  45. Hoppe-Seyler, F., and Butz, K. (1992) Nucleic Acids Res. 20, 6701-6706
  46. Kriwacki, R. W., Schultz, S. C., Steitz, T. A., and Caradonna, J. P. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 9759-9763
  47. Blackwood, E. M., and Eisenman, R. N. (1991) Science 251, 1211-1217
  48. Beckmann, H., Su, L.-K., and Kadesch, T. (1990) Genes & Dev. 4, 167-179
  49. Fisher, D. E., Carr, C. S., Parent, L. A., and Sharp, P. A. (1991) Genes & Dev. 5, 2342-2352
  50. Fisher, D. E., Parent, L. A., and Sharp, P. A. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 11779-11783
  51. Pognonec, P., Kato, H., and Roeder, R. G. (1992) J. Biol. Chem. 267, 24563-24567
  52. Finley, G. G., Schulz, N. T., Hill, S. A., Geiser, J. R., Pipas, J. M., and Meisler, A. I. (1989) Oncogene 4, 963-971
  53. Melhem, M. F., Meisler, A. I., Finley, G. G., Bryce, W. H., Jones, M. O., Tribby, I. I., Pipas, J. M., and Koski, R. A. (1992) Cancer Res. 52, 5853-5864
  54. Oikawa, S., Kosaki, G., and Nakazato, H. (1987) Biochem. Biophys. Res. Commun. 146, 464-469
  55. Thompson, J. A., Koumari, R., Wagner, K., Barnert, S., Schleussner, C., Schrewe, H., Zimmermann, W. A., Müller, G., Schempp, W., Zaninetta, D., Ammaturo, D., and Hardman, N. (1990) Biochem. Biophys. Res. Commun. 167, 848-859
  56. Brophy, B. K., MacDonald, R. E., McLenachan, P. A., Beggs, K. T., and Mansfield, B. (1992) Biochim. Biophys. Acta 1131, 119-121
