©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
The Mouse p44 Mitogen-activated Protein Kinase (Extracellular Signal-regulated Kinase 1) Gene
GENOMIC ORGANIZATION AND STRUCTURE OF THE 5`-FLANKING REGULATORY REGION (*)

(Received for publication, February 2, 1995; and in revised form, August 17, 1995)

Gilles Pagès (§) E. Richard Stanley (¶) Maude Le Gall Anne Brunet Jacques Pouysségur

From the Centre de Biochimie, CNRS UMR134, Parc Valrose, 06108 Nice Cedex 2, France

ABSTRACT

Mitogen-activated protein kinases (MAPKs), or extracellular signal-regulated kinases, are ubiquitous kinases conserved from fungi to mammals. Their activity is regulated by phosphorylation on both threonine and tyrosine, and they play a crucial role in the regulation of proliferation and differentiation. We report here the cloning of the murine p44 MAP kinase (extracellular signal-regulated kinase 1) gene, the determination of its intron/exon boundaries, and the characterization of its promoter. The gene spans approximately eight kilobases (kb) and can be divided into nine exons and eight introns, each coding region exon containing from one to three of the highly conserved protein kinase domains. Primer extension analysis reveals the existence of two major start sites of transcription located at -183 and -186 base pairs (bp) as well as four discrete start sites for transcription located at -178, -192, -273, and -292 bp relative to the initiation of translation. The start site region lacks TATA-like sequences but does contain initiator-like sequences proximal to the major start sites obtained by primer extension. One kilobase of the promoter region has been sequenced. It contains three putative TATA boxes far upstream of the main start site region, one AP-1 box, one AP-2 box, one Malt box, one GAGA box, one half serum-responsive element, and putative binding sites for Sp1 (five), GC-rich binding factor (five), CTF-NF1 (one), Myb (one), p53 (two), Ets-1 (one), NF-IL6 (two), MyoD (two), Zeste (one), and hepatocyte nuclear factor-5 (one). To determine the sites critical for the function of the p44 MAPK promoter, we constructed a series of chimeric genes containing variable regions of the 5`-flanking sequence of the p44 MAPK gene and the coding region for luciferase. Activity of the promoter, measured by its capacity to direct expression of a luciferase reporter gene, is strong, being comparable with the activity of the Rous sarcoma virus promoter. 
Progressive deletions of the 1 kb (-1200/-78) promoter region allowed us to define a minimal region of 186 bp (-284/-78) that has maximal promoter activity. Within this context, deletion of the AP-2 binding site reduces the activity of the promoter by 30-40%. Further deletion of this minimal promoter that removes the major start sites (-167/-78) surprisingly preserves promoter activity. This result suggests a major role for this region, which contains the Sp1 sites. Finally, removal of the major start sites of transcription as well as the Sp1 sites reveals additional promoter activity at the upstream minor transcription start sites (-240/-167), an activity that is enhanced by the upstream cis-acting elements. In summary, our findings reveal a complex pattern of transcriptional regulation of the mouse p44 MAPK promoter.


INTRODUCTION

Mitogen-activated protein kinases (MAPKs) (^1) or extracellular signal-regulated kinases were first described as two proteins of 42 and 44 kDa that were phosphorylated on both tyrosine and threonine residues following stimulation of 3T3-L1 adipocytes with insulin (1, 2). These same phosphoproteins had been visualized previously by two-dimensional gel electrophoresis (3, 4, 5). MAPKs are ubiquitously expressed, being found in all cell systems studied including yeast, worms, flies, frogs, plants, and mammals (6). They are activated by a wide variety of extracellular signals, and their activation requires the phosphorylation of the highly conserved TEY motif present in almost all described MAPKs. An increasing body of data, in particular in yeast, suggests that MAPKs belong to a multigene family. In yeast, each reported isoform has been implicated in a different signaling pathway, leading to mating, cell wall synthesis, or regulation of osmotic pressure. Studies with the mammalian homologues of yeast MAPK suggest that they play equivalent roles in different processes, including proliferation, differentiation, and response to environmental stress (7, 8, 9, 10, 11).

As far as the p42 and p44 MAPKs are concerned, two approaches have demonstrated their role in controlling fibroblast cell growth. First, we showed that overexpression of either a dominant-negative p44 MAPK mutant or an antisense construct prevented growth factor-induced cell cycle entry. Second, we (12) and others (13, 14) demonstrated that expression of a constitutively active form of a MAPK activator (MEK1) led to the constitutive activation of p42 and p44 MAPK, an action sufficient to promote cell cycle entry and oncogenicity in fibroblasts. At this stage, therefore, the two MAPK isoforms, which are coordinately regulated and capable of phosphorylating identical substrates in vitro, appear to be redundant. Alternatively, they might serve different functions as a consequence of alternatively spliced isoforms that display distinct subcellular localization, as recently reported (15). To resolve this issue we isolated genomic MAPK clones in order to study their regulation and subsequently to disrupt each corresponding mouse gene. Here we describe the detailed structure of the murine p44 MAPK (extracellular signal-regulated kinase 1) gene. We have also investigated its promoter by deletion analysis to identify cis elements important for driving transcription.


EXPERIMENTAL PROCEDURES

Materials

Restriction and DNA-modifying enzymes were obtained from New England Biolabs (Ozyme, France) or from Eurogentec (Liege, Belgium). [α-32P]dCTP, [α-35S]dATP, and [γ-32P]dATP were from Amersham Corp. or ICN. Synthetic oligonucleotides were from Eurogentec. The genomic DNA library was made with SV 129 D3 embryonic stem cell DNA and constructed in λGEM12. This library was kindly provided by J-M. Garnier (laboratory of P. Chambon, Strasbourg, France).

Genomic Clones

The genomic library was screened with a 0.73-kb KpnI/KpnI fragment from the hamster p44 MAPK cDNA (16). Six phage plaques were isolated from the library in the first screening. Of these, two (clones 14 and 15) hybridized most strongly to the probe at high stringency and were subcloned for further analysis. Clone 15 was found to correspond to p44 MAPK, and clone 14 corresponded to another isoform of p44 MAPK with approximately 80% homology at the nucleotide level. Two adjacent SacI fragments of 4.3 and 4.8 kb from clone 15 were subcloned into Bluescript KS. The partial sequence of these subclones was obtained using Universal M13, T3, T7, SK, and KS primers as well as oligonucleotide primers derived from the coding sequence of the hamster p44 MAPK.

DNA Sequence Analysis

Sequencing was performed by the double-stranded dideoxy chain termination technique using the Pharmacia kit. Restriction analysis and determination of overlapping sequences were done using the MacVector program for Macintosh (IBI, New Haven, CT).

Primer Extension Analysis

Three oligonucleotides, ERS 2 corresponding to bases 80-61 of the cDNA (base 1 = A of ATG), ERS 5 corresponding to bases 20-1, and GP9 corresponding to bases 208-189, were designed to hybridize with total RNA isolated from the same ES cells used to construct the genomic library and were used to prime reverse transcription. The oligonucleotides were end-labeled with [γ-32P]dATP and T4 polynucleotide kinase and purified by ethanol precipitation. Labeled oligonucleotides were co-precipitated with 30 µg of total RNA and resuspended in hybridization buffer (5 mM PIPES, pH 6.4, 0.5 mM NaCl, 1.0 mM EDTA, and 80% formamide). The mixture was heated to 95 °C for 10 min and annealed at 50 °C for 16 h to avoid the generation of secondary structures. The RNA and annealed oligonucleotides were extracted with phenol/chloroform and ethanol-precipitated. The pellet was resuspended in reverse transcriptase buffer with 5 mM dNTPs, 25 units of RNase inhibitor, and 50 units of Moloney murine leukemia virus reverse transcriptase. Elongation was carried out for 2 h at 42 °C. The reaction was stopped by incubation at 65 °C for 10 min, and the products were incubated with 10 units of RNase H for 30 min at 37 °C. The reaction products were phenol/chloroform-extracted, ethanol-precipitated, and resuspended in formamide loading buffer. Half of the primer-extended products were electrophoresed on a 6% acrylamide, 7 M urea sequencing gel in parallel with the products of a double-stranded sequencing reaction.

Construction of Chimeric Luciferase Plasmids

A BglII/HindIII fragment corresponding to the promoter region was cloned in front of the luciferase reporter gene as follows. First, a 5` 2-kb SacII/SacII fragment derived from a 4.3-kb subclone of genomic lambda phage 15 was introduced into Bluescript KS so that the 5` SacII site was in the Bluescript polylinker. Then a BglII (internal site)/HindIII (polylinker site) fragment was subcloned in the PxP1 luciferase vector (17) to obtain the BH construct. The BN construct was obtained by cutting the BH plasmid with NheI (internal site) and HindIII (polylinker site), blunt-ending both extremities with the Klenow enzyme, and religating the vector on itself. The P3` vector was obtained as follows. A 700-bp PstI/PstI fragment of the promoter was first subcloned in Bluescript KS, and then, using the BamHI and HindIII sites of Bluescript, this fragment was introduced at the corresponding BamHI and HindIII sites of the PxP1 vector. The PH vector was obtained by cutting the P3` vector with BamHI (polylinker site) and NheI (internal site) and ligating this fragment into the BH vector from which a BamHI/NheI fragment had been removed. The NH vector was obtained by cutting the BH vector with NheI (internal site) and BglII (corresponding to the BglII site described above), blunt-ending the extremities with Klenow enzyme, and religating the vector on itself. The SH construct was obtained by completely digesting the BH vector with BglII and partially digesting with StyI, blunt-ending with Klenow, and religating the vector on itself. The BsH vector was obtained by digesting the BH vector with BglII and BssHII, blunt-ending with Klenow, and religating the vector on itself. The NP vector was obtained by cutting the P3` vector with BamHI (polylinker site) and NheI (see above), blunt-ending the extremities with Klenow enzyme, and religating the vector on itself. 
The antisense construct was obtained by introducing into the PxP1 vector a KpnI/KpnI fragment of 1.8 kb isolated from the SacII construct described above. The +AP2 and -AP2 vectors were constructed by PCR using, respectively, oligo 1 (CGGGATCCCTTAGCATTACTGAG) + 4 (GAAAGCTTGATATCGAATTCCTGC) and oligo 2 (CGGGATCCCTCTTGGCAGACTAAAG) + 4. The +AP2/Bs and -AP2/Bs constructs were obtained by cutting, respectively, the +AP2 and -AP2 constructs with BssHII (internal site) and HindIII (polylinker site), blunt-ending with Klenow, and religating the vector on itself. The S/Bs construct was obtained by cutting the SH vector with BssHII (internal site) and HindIII (polylinker site), blunt-ending with Klenow, and religating the vector on itself. The Rous sarcoma virus (RSV) luciferase gene was as described previously (18).

Transient Transfection and Luciferase Assay

CCL39 cells in 12-well dishes (10^5/well) were transiently transfected by CaPO4 precipitation with the indicated plasmids (2 µg/well). Sixteen hours after addition of DNA, the cells were washed twice with phosphate-buffered saline and incubated with Dulbecco's modified Eagle's medium with 7.5% fetal calf serum. Two days later, the cells were washed with cold phosphate-buffered saline, and luciferase assays were performed as follows (Promega protocols and applications guide). Cells were lysed in lysis buffer (25 mM Tris-phosphate, pH 7.8, 2 mM dithiothreitol, 2 mM 1,2-diaminocyclohexane-N,N,N`,N`-tetraacetic acid, 10% glycerol, and 1% Triton X-100) for 15 min at room temperature, and the lysate was cleared by centrifugation (5 min, 12,000 × g). The assay of luciferase activity was performed in a luminometer in a buffer containing 20 mM Tricine, 1.07 mM (MgCO3)4Mg(OH)2·5H2O, 2.67 mM MgSO4, 0.1 mM EDTA, 33.3 mM dithiothreitol, 270 µM coenzyme A, 470 µM luciferin, and 530 µM ATP. Protein concentration was measured using the bicinchoninic acid (BCA) protein assay kit (Pierce) with bovine serum albumin as standard.
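
The way such readings are typically turned into comparable promoter activities can be sketched as follows. This is a minimal illustration in Python and not the authors' analysis: it assumes that light output is divided by the protein content of the lysate and then expressed as a percentage of a reference construct (BH is taken as the reference, as in the promoter comparisons). The function names and all numerical values are hypothetical.

```python
def specific_activity(light_units, protein_ug):
    """Light units per microgram of lysate protein (assumed normalization)."""
    if protein_ug <= 0:
        raise ValueError("protein amount must be positive")
    return light_units / protein_ug


def percent_of_reference(samples, reference="BH"):
    """Express each construct as a percentage of the reference construct.

    samples maps a construct name to a (light_units, protein_ug) pair.
    """
    specific = {
        name: specific_activity(light, protein)
        for name, (light, protein) in samples.items()
    }
    ref_value = specific[reference]
    return {name: 100.0 * value / ref_value for name, value in specific.items()}


# Hypothetical readings, for illustration only (not data from this study).
example = {
    "BH": (2000.0, 10.0),          # reference construct
    "construct_A": (500.0, 5.0),   # hypothetical deletion construct
}
relative = percent_of_reference(example)
print(relative)
```

Dividing by protein before taking the ratio keeps wells with different cell numbers comparable; other normalizations (for example a co-transfected control reporter) would require a different first step.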

Preparation of RNA

Cells were washed in ice-cold phosphate-buffered saline and lysed in RNA Insta-Pure buffer from Eurogentec (Liege, Belgium). The supernatant was cleared by centrifugation and ethanol-precipitated. RNA was resuspended in sterile water.


RESULTS

Given the existence of a large MAPK family and the presence of various pseudogenes, it was crucial to characterize with certainty the genomic clones that hybridize with the entire hamster p44 MAPK cDNA. Of the two phages, 14 and 15, that hybridized at high stringency to the p44 MAPK probe, only phage 15 was assigned to the mouse p44 MAPK (extracellular signal-regulated kinase 1) gene. This identification was confirmed by sequencing all of the exons. In contrast, phage 14 corresponds to a close family member of p44 MAPK. From phage 15, we estimated that the transcription unit of the mouse gene for p44 MAPK (extracellular signal-regulated kinase 1) spans approximately 8 kb. Fig. 1 shows the relationship of the gene to its corresponding mRNA/cDNA and protein. Sequencing of the plasmid subclones that hybridize to the hamster p44 MAPK probe allowed the determination of the position of the intron/exon junctions. We found nine exons (Fig. 1 and Table 1), and all of the splice acceptor and donor sequences agree with the "GT-AG" rule (19). Each exon of the gene encodes one or more of the conserved subdomains previously identified in protein kinases (20). The first exon contains all of the 5`-untranslated region and also contains the region coding for the GXGXXG domain that is a determinant of ATP binding. The possibility of alternative splicing with the presence of an additional intron in that region is discussed later. 
Exon 2 contains lysine 72 of subdomain II implicated in phosphate transfer, subdomain III, and subdomain IV with conservation of glutamic acid 89 and hydrophobic residues; exon 3 encodes the subdomains V and VI with the HRD motif; exon 4 encodes subdomain VII with the invariant DFG motif and subdomain VIII containing the APE triplet; exon 5 encodes the subdomain IX where aspartic acid 228 is conserved; exon 6 encodes subdomain X; exon 7 encodes subdomain XI; exon 8 encodes a region of the protein apparently implicated in the specificity of substrate recognition of the MAP kinase family plus 5.4% of the 3`-untranslated region; and exon 9 encodes the remaining 3`-untranslated region.
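
The "GT-AG" check applied to the intron/exon junctions can be illustrated with a short Python sketch. It is a hedged example rather than the procedure used here: it assumes a toy annotation in which exon bases are written in uppercase and intron bases in lowercase, and the function names and the sequences are invented for illustration.

```python
import re


def introns_from_case(annotated):
    """Return the lowercase runs (assumed to mark introns) of an annotated sequence."""
    return [match.group(0) for match in re.finditer(r"[a-z]+", annotated)]


def follows_gt_ag_rule(intron):
    """True if the intron begins with GT (donor) and ends with AG (acceptor)."""
    seq = intron.lower()
    return len(seq) >= 4 and seq.startswith("gt") and seq.endswith("ag")


# Invented toy sequence: two introns embedded in three exon segments.
toy_gene = "ATGGCGgtaagtccttttcagGCTGCGgtgagtaacagCTTTGA"
for intron in introns_from_case(toy_gene):
    print(intron, follows_gt_ag_rule(intron))
```

The test only inspects the two terminal dinucleotides; it does not score the longer donor and acceptor consensus sequences or the branch point.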


Figure 1: Organization of the mouse p44 MAPK gene in relation to its mRNA and predicted protein structure. Positions of exons (filled) and introns (open) are shown aligned with the common restriction enzyme sites and the position of the major transcriptional start sites. The locations of the introns are indicated by the nucleotide number on the cDNA (the exons are boxed, black for coding regions and hatched bars for 5` and 3` noncoding regions) where base 1 corresponds to the "A" of the ATG. Roman numerals correspond to the conserved kinase subdomains previously defined by Hanks et al. (20).





Identification of the Transcription Start Site

The 5` end of the mRNA was determined by primer extension using different 20-base oligonucleotides (see "Experimental Procedures") derived from the sequence of the mouse p44 MAPK. ERS 2 and GP9 are specific for p44 MAPK, while ERS 5 can hybridize to p42 MAPK as well as p44 MAPK. The ERS 5 oligo was used to determine the exact position of the start sites because the extended fragments obtained with GP9 or ERS 2 were too long for their size to be determined accurately on classical sequencing gels. The experiment performed with ERS 5 reveals the existence of two major start sites (-183 and -186) as well as four minor initiation sites at positions -178, -192, -273, and -292 bp upstream of the ATG (A of ATG = +1) (Fig. 2). Because of the high proportion of GC content upstream of and including the ERS 5 primer, it was difficult to obtain reliable sequence within this region, and therefore an unrelated template and primer were used to calibrate the primer extension analysis. When the primer extension analysis was performed with the GP9 oligonucleotide, we detected a major band of 44 bp, indicating the existence of another possible transcript shorter than the classical p44 MAPK transcript (data not shown). Interestingly, the sequence shown in Fig. 2 also reveals the presence of one splice donor (CGGGTGGGT at -293) and three splice acceptor (CCGCGCAGG at -138, TGGTGAAGG at +92, and GGGCAGC at +100) consensus sites within the promoter, the 5`-untranslated sequence, and the beginning of the coding region. Fig. 2 shows only the positions of the start sites of transcription in the absence of alternative splicing.


Figure 2: Identification of the transcription start sites on the mouse p44 MAP kinase gene by primer extension. Primer extension reaction was performed with 30 µg of total ES cell RNA and ERS 5 oligonucleotide. The sizes of the extended fragments (178, 183, 186, 192, 273, and 292 bp) are indicated by arrows. The double-stranded sequencing reaction shown on the left, used to determine the size of the fragments described above, was obtained with the RP oligonucleotide primer used to sequence an internal PstI fragment of the p44 MAPK gene subcloned in the PTZ vector (Bio-Rad) (see "Experimental Procedures").



Sequence of the 5`-Regulatory Region

To obtain the sequence of the ATG 5`-flanking region, we subcloned a 2-kb SacII/SacII fragment from the original 4.3-kb SacI/SacI fragment. This SacII/SacII fragment was digested with BstEII, AccI, NheI, and BglII in order to obtain smaller fragments. The sequence obtained is shown in Fig. 3. It was searched for reported consensus sequences that are recognized by DNA-binding proteins. The sequence reveals three putative TATA boxes (TATAAAA at -846; GATACATA at -638; CATAGAGA at -384), one hepatocyte nuclear factor-5 site (TATTTGT at -1321) (21), two p53 sites (GGGCTTGCTT at -1193 and GGGCTAGCCT at -369) (22), two MyoD sites (CACCTG at -952 and CACCTG at -772) (23), one Zeste site (TGAGCC at -947) (24), two sites for NF-IL6 (TGAAGGAAT at -754 and TGTGGCAAT at -680) (25), one-half serum-responsive element site (GATGTCC at -743) (26), one GAGA box (AGAGAGAGAG at -509) (24), one Malt box (GGATGGA at -491) (27), one site for Ets-1 (GAGGATGT at -413) (25), one Myb site (TAACTG at -304) (28), one AP-2 site in reverse orientation (GGCCTGGGG at -259) (29), one AP-1 site in reverse orientation (AGACTAA at -232) (30), one CTF-NF1 site (ACCCTAGTGGCCAA at -222) (31), five Sp1 sites (GGGCGG at -112, -97, -91, -46, and -35) (32), and five GC-rich binding factor (GCF) binding sites ((G/C/T)(G/C)CG(C/G)(C/G)(C/G)C(G/C/T)) overlapping the Sp1 sites (33). All of the factors cited above have been described as positive regulators of transcription, whereas GCF exhibits transcriptional repressor activity. To study the relevance of these sequences, a comprehensive structure-function analysis of the promoter was performed.
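
A consensus search of this kind can be sketched in a few lines of Python. The example below is illustrative only and is not the software used for the analysis: it writes the degenerate GCF consensus quoted above as a regular-expression character class, scans both strands, and reports zero-based forward-strand coordinates of the leftmost base. The function names and the test sequences are assumptions introduced here.

```python
import re

# Consensus sequences quoted in the text; GCF is degenerate and is written
# as a regular expression: (G/C/T)(G/C)CG(C/G)(C/G)(C/G)C(G/C/T).
MOTIFS = {
    "Sp1": "GGGCGG",
    "MyoD": "CACCTG",
    "Myb": "TAACTG",
    "GCF": "[GCT][GC]CG[CG][CG][CG]C[GCT]",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def scan(seq, pattern):
    """Find overlapping matches of pattern on both strands.

    Returns a sorted list of (start, strand, matched_text), where start is
    the zero-based forward-strand index of the leftmost matched base.
    """
    seq = seq.upper()
    lookahead = re.compile("(?=(" + pattern + "))")  # allows overlapping hits
    hits = [(m.start(), "+", m.group(1)) for m in lookahead.finditer(seq)]
    rc = reverse_complement(seq)
    for m in lookahead.finditer(rc):
        start = len(seq) - (m.start() + len(m.group(1)))
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


# Invented test sequence containing one Sp1 and one MyoD consensus.
toy = "AAGGGCGGTTCACCTGAA"
for name, pattern in MOTIFS.items():
    print(name, scan(toy, pattern))
```

Palindromic motifs are reported once per strand by this sketch, and exact matching will miss the single-base mismatches that are tolerated for some of the sites discussed in the text.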


Figure 3: Sequence of the ATG 5`-flanking region of the mouse p44 MAPK gene. Analysis of the sequence flanking the ATG start from -1 (first base upstream of the A of ATG) to -1325. Consensus sequences for DNA-binding proteins (TATA box; AP-2; AP-1; Myb; p53; Ets-1; GAGA box; Malt box; NF-IL6; CTF-NF1; Sp1; GCF; serum-responsive element (Half SRE); MyoD; Zeste; hepatocyte nuclear factor 5 (HNF-5)), splice donor or splice acceptor sequences, oligonucleotides used in primer extension analysis (ERS 2, ERS 5, GP9), and major restriction sites are underlined. The restriction enzyme sites are BglII, PstI, NheI, StyI, BssHII, and SacII. The positions of the start sites of transcription are shown by R (178, 183, 186, 192, 273, and 292). The lowercase letters represent the position of the intron. In the case of Sp1 or GCF the number of each site in the underlined sequence is given.



Identification of the Promoter Region

The promoter activity of a 1128-bp BglII/SacII (BH) fragment (see "Experimental Procedures" and Fig. 3) was analyzed by measuring its ability to direct production of a luciferase reporter enzyme in transient transfection assays in CCL39 lung fibroblasts. Positive and negative controls included RSV luciferase, which contains the RSV promoter; PxP1 (EV), which has no promoter; and a KpnI/KpnI (AS) fragment of 1.8 kb introduced into the vector in reverse orientation. Deletions of the BH fragment from the 5` or the 3` end (see "Experimental Procedures" and Figs. 3 and 4) were also tested. The BH fragment showed high promoter activity comparable with that of the control RSV promoter. 5` deletion of this construct to the distal PstI (PH) or NheI (NH) sites did not strongly affect the activity of the promoter. Positioned within segment -367/-178 (proximal start site) are sequence motifs that are perfect or single-base mismatches to the consensus binding sites for transcription factors Myb (28), AP-2 (29), AP-1 (30), and CTF-NF1 (31). Interestingly, the +AP-2 construct shows a small increase in promoter activity when compared with the BH construct, suggesting removal of upstream inhibitory sequences. Thus, we can regard the -284/-78 region as the maximally active promoter. To determine whether any of these consensus sequences were functionally important AP-2, AP-1, or CTF-NF1 binding sites, we performed a finer deletion analysis. As shown in Fig. 4, deletion of the AP-2 binding site reduced the activity of the +AP-2 construct by 30-40%. A deeper deletion to the StyI site, which deletes the AP-1 and CTF-NF1 sites, decreased the activity of the +AP-2 construct by 66%. Therefore, these results indicate that the potential AP-2, AP-1, and CTF-NF1 binding sites contribute to the strength of the promoter.


Figure 4: Measurement of p44 MAP kinase promoter activity by transient expression of chimeric luciferase reporter gene-promoter constructs in CCL39 lung fibroblasts. Activities from different p44 MAPK promoter constructs measured in cells stimulated with 10% fetal calf serum were compared and plotted (percentage of the luciferase activity of the BH construct considered as 100%). The names of the constructs and the numbering relative to the initiation of translation (+1) are given (see also Fig. 3 and "Experimental Procedures"). These data are representative of five independent transient transfection experiments.



Transcription Can Be Initiated from All of the Start Sites Determined by Primer Extension

An intriguing observation is that deletion of the major start sites of transcription (BsH construct) decreased but did not abolish transcription, suggesting the intervention of initiator-like sequences (34) within the -167/-78 region or specific initiation of transcription in fibroblasts. The quite strong promoter activity (50% of maximum) is probably driven by the few remaining sequences containing the three Sp1 binding sites. However, these Sp1 sites are not strictly required, since their removal (+AP-2/Bs or -AP-2/Bs constructs) leaves residual promoter activity (12 or 8% of maximal). On the other hand, a 3` deletion to the proximal PstI site (NP construct) that deletes the Sp1 sites as well as the major transcription start sites still possesses 5-10% of promoter activity. The addition to this construct of the upstream cis-acting elements (P3` construct) enhanced this activity 10-fold. This result shows that transcription can be initiated from the discrete start sites (-273, -292) and that there exist positive cis-acting elements in the -939/-367 region. Indeed, deletion of these discrete start sites (BN construct) abolished promoter activity.


DISCUSSION

MAPK belongs to a multigene family, and previous reports have shown that expression of dominant negative mutants or antisense constructs of p44 MAPK was able to inhibit fibroblast proliferation (7). Because of their potential importance in growth control (7, 35) and differentiation (36, 37, 38), the genes for human p44 MAPK (extracellular signal-regulated kinase 1), p42 MAPK (extracellular signal-regulated kinase 2), and p63 MAPK (extracellular signal-regulated kinase 3) have been mapped (39). However, it has not been possible to attribute specific biological roles to the individual isoforms, even if some data describe differential activation of p42 MAPK versus p44 MAPK in platelets (40). As a first step in such an analysis, we have isolated and partially characterized several different mouse MAPK genomic clones and characterized in detail the gene for mouse p44 MAPK (extracellular signal-regulated kinase 1) and a portion of its 5`-flanking regulatory region.

The p44 MAPK (extracellular signal-regulated kinase 1) gene spans approximately 8 kb and is divided into nine exons. An interesting aspect of the gene's structure is that one or more of the domains highly conserved among protein kinases are contained within individual exons. This is the first example of such a distribution, and it is strikingly different from what is observed in related kinases, such as mammalian cdc2 (41), which is divided into only four exons, without precise division of the protein kinase subdomains among them. This unusual subdivision could result from the evolution of an ancestral gene that has progressively acquired specific characteristics. The first seven exons encode the protein kinase domains. An additional exon, exon 8, encodes the carboxyl terminus of MAPK. The C-terminal domain it encodes can be considered to be specific for p44 MAPK because it is one of the most divergent domains among the MAPK-related kinases, the other variable domain being subdomain X. The ninth exon directly encodes 95% of the 3`-untranslated region of p44 MAPK mRNA.

We also describe the presence of two predominant start sites of transcription located at -183 and -186 bp upstream from the ATG (A = +1). However, overexposure of the gels allowed us to see additional discrete start sites. The presence of multiple sites of transcription initiation is open to interpretation. First, the oligonucleotides used in primer extension analysis could have hybridized to mRNA not yet described. This possibility has to be considered because during the screening of the genomic library five other clones, each apparently encoding a different gene, were shown to hybridize to the p44 MAPK probe at high stringency. The phenomenon could also be explained by the absence of a real consensus sequence for a TATA box. For the SV40 and histone H2A genes, removal or mutation of the canonical TATA box results in the initiation of transcription at many sites within the promoter (42, 43, 44). A third interpretation of the detection of minor transcripts is the possibility of alternative splicing suggested by the presence of one splice donor and three splice acceptor consensus sequences. While it is possible that the splice donor site at -293 is used, it is not likely to be used with the splice acceptor site at position -138 since (a) the major start sites of transcription would be located at positions -338, -343, -346, and -352 in Fig. 2, too close to the ATG at position -338 and (b) the sequence located upstream of this ATG does not match the consensus sequence of Kozak. If splicing occurs between the donor site at position -293 and the splice acceptor sites at +92 or +100, the downstream ATG at position +166 could be used. However, the sequence upstream of this ATG also does not match the consensus sequence of Kozak, and if translation did start at this ATG the conserved GXGXXG ATP binding domain would be deleted. If such alternative splicings did occur, then shorter or longer mRNA could be transcribed from the same gene. 
Detection of a shorter mRNA has already been described for p42 MAPK/extracellular signal-regulated kinase 2, apparently as a result of alternative splicing of the gene (45, 46). For the reasons outlined above and because we could not detect, by high resolution polyacrylamide gel electrophoresis and Western blot, proteins with higher or lower molecular weight in fibroblast or ES cell extracts (data not shown), we believe that it is unlikely that such alternatively spliced transcripts exist in these cells.
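
The comparison of an ATG context with the Kozak consensus can be made concrete with a small Python sketch. It is a simplified illustration, not the criterion applied above: it assumes the common reduction of the consensus (GCC(A/G)CCATGG) to its two most influential positions, a purine at -3 and a G at +4, and the three-level classification, the function name, and the test sequences are introduced here for illustration.

```python
def kozak_strength(seq, atg_index):
    """Classify the context of an ATG using the -3 and +4 positions only.

    atg_index is the zero-based index of the A of the ATG. Returns
    "strong" (purine at -3 and G at +4), "adequate" (one of the two),
    or "weak" (neither).
    """
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 3 or atg_index + 3 >= len(seq):
        raise ValueError("not enough flanking sequence")
    purine_at_minus3 = seq[atg_index - 3] in "AG"
    g_at_plus4 = seq[atg_index + 3] == "G"
    if purine_at_minus3 and g_at_plus4:
        return "strong"
    if purine_at_minus3 or g_at_plus4:
        return "adequate"
    return "weak"


# Invented contexts, for illustration only.
print(kozak_strength("GCCACCATGG", 6))  # full consensus
print(kozak_strength("TTTTTTATGT", 6))  # neither key position matches
```

A two-position test is deliberately coarse; a position weight matrix over the whole -6 to +4 window would discriminate more finely between candidate start codons.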

We have shown that a 1128-bp BglII/SacII fragment was sufficient to drive transcription of the luciferase gene. The activity of this promoter is high, being comparable with that of the RSV promoter, which is considered to be a strong promoter. We have also shown that transcription can be initiated from each start site of transcription determined by primer extension and probably from initiator-like sequences (34) or by fibroblast-specific initiation of transcription, at least in vitro. First, transcription can be initiated from the minor start sites of transcription remaining in the NP construct. The basal transcriptional activity detected with this construct is strongly enhanced in the P3` construct, suggesting positive intervening sequences within the -939/-367 region. However, deletion of an NheI/SacII fragment from the 3` region of the BH fragment (BN) results in complete loss of activity, demonstrating that the TATA box located upstream of the NheI site does not have a relevant activity. Second, the +AP-2/Bs and -AP-2/Bs constructs containing the major start sites of transcription can also drive transcription but at a low level when compared with the BH construct. Presence of a small amount of mRNA in lung (47) and CCL39 fibroblasts (16) suggests that these sites are predominantly used in CCL39 cells. Third, the -167/-78 region, where no start sites of transcription are detected, is responsible for high basal promoter activity. Different interpretations could account for this result. We can suspect the intervention of initiator-like sequences (34) or of start sites undetectable in ES cells that can be alternatively used when the major ones are deleted. We can imagine that these different possibilities of initiation reflect what happens in vivo when tissue-specific cis-acting elements are used.

The MAPK promoter contains consensus binding sites for many transcription factors. Their presence however, does not prove their involvement in the regulation of p44 MAPK promoter activity. In fact, it is difficult to attribute a role to the binding sites located upstream of the NheI restriction site. The fact that the luciferase activity of the P3` construct is higher than the NP construct proves that this region contains positive regulatory elements. However, the role of the AP-2, AP-1, and CTF-NF1 sequences is clearer because their deletions decrease the p44 MAPK promoter activity. The fact that Jun, a partner of the AP-1 complex is phosphorylated by a Jun kinase, a member of the MAP kinase family, makes it tempting to speculate that MAPK could regulate its own transcription(8) . However, cotransfection of the BH construct with expression vector for Fos and Jun or constitutive active form of MAP kinase kinase (12) only shows a small increase in the MAPK promoter activity (data not shown). The role of Sp1 sites is unclear because they are all situated before the major start sites of transcription. Sp1 sites have been shown to play a role in transcription of housekeeping genes such as hprt(48, 49) or dhfr(50, 51) genes in the regulation of genes specific to or maximally expressed in the nervous central system such as nicotinic acetylcholine receptors (52) and plasminogen activator (53) as well as in the regulation of transcription of growth control-regulated genes such as c-myc(54) , epidermal growth factor receptor (55) and Ha-ras(56) . In each of these genes many start sites of transcription have been documented such as in the p44 MAPK gene. 
In the case of the epidermal growth factor receptor promoter, transcriptional activity can be detected in chloramphenicol acetyltransferase transient transfection assays even in the absence of the major start sites of transcription (57), and the proximal bases in front of the initiation of translation can bind nuclear proteins in gel retardation assays, suggesting a major role for this region in the initiation of transcription. This region can function as a promoter and mediates inductive responses to epidermal growth factor, phorbol 12-myristate 13-acetate, and cAMP (58). The same holds for the BsH construct, which shows high transcriptional activity. We suppose that three of the five Sp1 sites present in this construct (which ends at the SacII site) are implicated in this activity. However, it is difficult to say whether they play the same role in vivo. Sp1 belongs to a multigene family whose expression varies among tissues. Expression of Sp1 is high in lung and thymus (59), whereas p44 MAPK expression is low in lung and nearly undetectable in thymus (47). On the other hand, the very high amount of MAPK mRNA in brain suggests that the previously described brain-specific Sp1-like factors (53, 60) could regulate transcription of MAPK. The presence of binding sites for GCF overlapping the Sp1 sites suggests a balance between the activities of these two transcription factors under different physiological conditions. A report also describes phosphorylation of Sp1 by a DNA-dependent protein kinase (61). This phosphorylation could result from activation of a kinase pathway by UV light or of a signaling pathway leading to apoptosis. One can therefore also imagine Sp1-dependent activation of p44 MAPK transcription after such stress.

The variation of p44 MAPK mRNA levels among organs could implicate tissue-specific elements of the promoter (47). Thus, the presence of a binding site for hepatocyte nuclear factor-5, which is involved in gene expression in the liver, two binding sites for MyoD, which is implicated in myocyte differentiation, and a GAGA box, which has been shown to be important in Drosophila development, would suggest developmental and tissue-specific regulation of the p44 MAPK gene. However, p44 MAPK mRNA and protein levels are not influenced by growth factors or by the position in the cell cycle in a given cell line, suggesting that the p44 MAPK gene, like many housekeeping genes, is not subject to acute regulation. In contrast, the complexity or "plasticity" of the promoter region defined here might reflect its ubiquitous expression, from embryonic stem cells (ES cells, data not shown) to most differentiated tissues.

In summary, the work reported here on the cloning and characterization of the p44 MAPK gene is the first step toward inactivation of the gene by homologous recombination in mouse embryonic stem cells. Parallel studies with the p42 MAPK gene will be necessary to determine whether each MAP kinase serves a specific function or whether the two are fully redundant and can entirely substitute for each other.


FOOTNOTES

*
This work was supported by the Centre National de la Recherche Scientifique (UMR 134), the Association pour la Recherche contre le Cancer, a grant from Roussel Uclaf, and National Institutes of Health Grants CA 26504 and CA 32551 (to E. R. S.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

§
To whom correspondence should be addressed. Tel.: 33 93 52 99 66; Fax: 33 93 52 99 17.

¶
On sabbatical leave from the Albert Einstein College of Medicine, Department of Cell Biology, New York. Present address: Albert Einstein College of Medicine, Department of Cell Biology, 1300 Morris Park Avenue, Bronx, NY 10461.

(^1)
The abbreviations used are: MAPK, mitogen-activated protein kinase; MAP, mitogen-activated protein; PIPES, 1,4-piperazinediethanesulfonic acid; RSV, Rous sarcoma virus; kb, kilobase(s); bp, base pair(s); GCF, GC-rich binding factor.


ACKNOWLEDGEMENTS

We thank Dominique Grall and Martine Valetti for skilled technical and secretarial assistance and Corinne Cibre for computing artwork. We also thank Dr. Fergus McKenzie for reading the manuscript.


REFERENCES

  1. Ray, L. B., and Sturgill, T. W. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 1502-1506
  2. Ray, L. B., and Sturgill, T. W. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 3753-3757
  3. Cooper, J. A., Sefton, B. M., and Hunter, T. (1984) Mol. Cell. Biol. 4, 30-37
  4. Kohno, M., and Pouysségur, J. (1986) Biochem. J. 238, 451-457
  5. Vila, J., and Weber, M. J. (1988) J. Cell. Physiol. 135, 285-292
  6. L'Allemain, G. (1994) Prog. Growth Factor Res. 5, 291-334
  7. Pagès, G., Lenormand, P., L'Allemain, G., Chambard, J. C., Meloche, S., and Pouysségur, J. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 8319-8323
  8. Dérijard, B., Hibi, M., Wu, I.-H., Barrett, T., Su, B., Deng, T., Karin, M., and Davis, R. J. (1994) Cell 76, 1025-1037
  9. Galcheva-Gargova, Z., Dérijard, B., Wu, I.-H., and Davis, R. J. (1994) Science 265, 806-808
  10. Kyriakis, J. M., Banerjee, P., Nikolakaki, E., Dai, T., Rubie, E. A., Ahmad, M. F., Avruch, J., and Woodgett, J. R. (1994) Nature 369, 156-160
  11. Rouse, J., Cohen, P., Trigon, S., Morange, M., Alonso-Lamazares, A., Zamanillo, D., Hunt, T., and Nebreda, A. R. (1994) Cell 78, 1027-1037
  12. Brunet, A., Pagès, G., and Pouysségur, J. (1994) Oncogene 9, 3379-3387
  13. Cowley, S., Paterson, H., Kemp, P., and Marshall, C. J. (1994) Cell 77, 841-852
  14. Mansour, S. J., Matten, W. T., Hermann, A. S., Candia, J. M., Rong, S., Fukusawa, K., and Van Woude, G. F. (1994) Science 265, 966-970
  15. Boulton, T. G., Yancopoulos, G. D., Gregory, J. S., Slaughter, C., Moomaw, C., Hsu, J., and Cobb, M. H. (1990) Science 249, 64-67
  16. Meloche, S., Pagès, G., and Pouysségur, J. (1992) Mol. Biol. Cell 3, 63-71
  17. Nordeen, S. K. (1988) BioTechniques 6, 454-458
  18. de Wet, J. R., Wood, K. V., De Luca, M., Helinski, D. R., and Subramani, S. (1987) Mol. Cell. Biol. 7, 725-737
  19. Mount, S. M. (1982) Nucleic Acids Res. 10, 459-472
  20. Hanks, S. K., Quinn, A. M., and Hunter, T. (1988) Science 241, 42-52
  21. Grange, T., Roux, J., Rigaud, G., and Pictet, R. (1991) Nucleic Acids Res. 19, 131-139
  22. Funk, W. D., Pak, D. T., Karas, R. H., Wright, W. E., and Shay, J. W. (1992) Mol. Cell. Biol. 12, 2866-2871
  23. Blackwell, T. K., and Weintraub, H. (1990) Science 250, 1104-1110
  24. Biggin, M. D., and Tjian, R. (1989) Trends Genet. 5, 377-383
  25. Faisst, S., and Meyer, S. (1992) Nucleic Acids Res. 20, 3-26
  26. Perkins, K. K., Admon, A., Patel, N., and Tjian, R. (1990) Genes & Dev. 4, 822-834
  27. Richet, E., and Raibaud, O. (1989) EMBO J. 8, 981-987
  28. Biedenkapp, H., Borgmeyer, U., Sippel, A. E., and Klempnauer, K.-H. (1988) Nature 335, 835-837
  29. Imagawa, M., Chiu, R., and Karin, M. (1987) Cell 51, 251-260
  30. Angel, P., Imagawa, M., Chiu, R., Stein, B., Imbra, R., Rahmsdorf, H. J., Jonat, C., Herrlich, P., and Karin, M. (1987) Cell 49, 729-739
  31. Jones, K. A., Kadonaga, J. T., Rosenfeld, P. J., Kelly, T. J., and Tjian, R. (1987) Cell 48, 79-89
  32. Briggs, M. R., Kadonaga, J. T., Stephen, P. B., and Tjian, R. (1986) Science 234, 47-52
  33. Kageyama, R., and Pastan, I. (1989) Cell 59, 815-825
  34. Weis, L., and Reinberg, D. (1992) FASEB J. 6, 3300-3309
  35. Blenis, J. (1991) Cancer Cells 3, 445-449
  36. Thomas, S. M., DeMarco, M., D'Arcangelo, G., Halegoua, S., and Brugge, J. S. (1992) Cell 68, 1031-1040
  37. Wood, K. W., Sarnecki, C., Roberts, T. M., and Blenis, J. (1992) Cell 68, 1041-1050
  38. Sale, E. M., Atkinson, P. G. P., and Sale, G. J. (1995) EMBO J. 14, 674-684
  39. Li, L., Wysk, M., Gonzalez, F. A., and Davis, R. J. (1994) Oncogene 9, 647-649
  40. Papkoff, J., Rey-Hui, C., Blenis, J., and Forsman, J. (1994) Mol. Cell. Biol. 14, 463-472
  41. Dalton, S. (1992) EMBO J. 11, 1797-1804
  42. Grosschedl, R., and Birnstiel, M. L. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 1432-1436
  43. Benoist, C., and Chambon, P. (1981) Nature 290, 304-309
  44. Mathis, D., and Chambon, P. (1981) Nature 290, 310-315
  45. Gonzalez, F. A., Raden, D. L., Rigby, M. R., and Davis, R. J. (1992) FEBS Lett. 304, 170-178
  46. Gonzalez, F. A., Seth, A., Raden, D. L., Bowman, D. S., Fay, F. S., and Davis, R. J. (1993) J. Cell Biol. 122, 1089-1101
  47. Boulton, T. G., Nye, S. H., Robbins, D. J., Ip, N. Y., Radziejewska, E., Morgenbesser, S. D., DePinho, R. A., Panayotatos, N., Cobb, M. H., and Yancopoulos, G. D. (1991) Cell 65, 663-675
  48. Melton, D. W., McEwan, C., Mc Kie, A. B., and Reid, A. M. (1986) Cell 44, 319-328
  49. Dynam, W. S., Sazer, S., Tjian, R., and Schimke, R. T. (1986) Nature 319, 246-248
  50. McGrogan, M., Simonsen, C. C., Smouse, D. T., Farnham, P. J., and Schimke, R. (1985) J. Biol. Chem. 260, 2307-2314
  51. Sazer, S., and Schimke, R. T. (1986) J. Biol. Chem. 261, 4685-4690
  52. Yang, X., Fyodorov, D., and Deneris, E. S. (1995) J. Biol. Chem. 270, 8514-8520
  53. Pecorino, L. T., Darrow, A. L., and Strickland, S. (1991) Mol. Cell. Biol. 11, 3139-3147
  54. Watt, R., Nishikura, K., Sorrentino, J., ar-Rushdi, A., Croce, C. M., and Rovela, G. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 6307-6311
  55. Ishii, S., Xu, Y. H., Stratton, R. H., Roe, B. A., Merlino, G. T., and Pastan, I. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 4920-4924
  56. Ishii, S., Merlino, G. T., and Pastan, I. (1985) Science 230, 1378-1381
  57. Johnson, A. C., Ishii, S., Jinno, Y., Pastan, I., and Merlino, G. T. (1988) J. Biol. Chem. 263, 5693-5699
  58. Hudson, G. L., Thompson, K. L., Xu, J., and Gill, G. N. (1990) Proc. Natl. Acad. Sci. U. S. A. 87, 7536-7540
  59. Saffer, J. D., Jackson, S. P., and Annarella, M. B. (1991) Mol. Cell. Biol. 11, 2189-2199
  60. Hagen, G., Müller, S., Beato, M., and Suske, G. (1992) Nucleic Acids Res. 20, 5519-5525
  61. Jackson, S. P., Mac Donald, J. J., Lees-Miller, S., and Tjian, R. (1990) Cell 63, 155-165
