©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
The Gene Encoding Human Splicing Factor 9G8
STRUCTURE, CHROMOSOMAL LOCALIZATION, AND EXPRESSION OF ALTERNATIVELY PROCESSED TRANSCRIPTS (*)

(Received for publication, February 6, 1995; and in revised form, May 19, 1995)

Michel Popielarz, Yvon Cavaloc, Marie-Geneviève Mattei (1), Renata Gattoni, James Stévenin (§)

From the Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS/INSERM/ULP, BP 163, 67404 ILLKIRCH Cédex, C.U. de Strasbourg, France and the Unité 406 Génétique Médicale et Développement, INSERM, Faculté de Médecine de la Timone, 13385 Marseille, France

ABSTRACT

The 9G8 factor is a 30-kDa member of the SR splicing factor family. We report here the isolation and characterization of the human 9G8 gene. This gene spans 7745 nucleotides and consists of 8 exons and 7 introns within the coding sequence, thus contrasting with the organization of the SC35/PR264 or RBP1 SR genes. We have located the human 9G8 gene in the p22-21 region of chromosome 2. The 5'-flanking region is GC-rich and contains basal promoter sequences and potential regulatory elements. Transfection experiments show that the 400-base pair flanking sequence has promoter activity. Northern blot analysis of poly(A)+ RNA isolated from human fetal tissues has allowed us to identify five different species, generated by alternative splicing of intron 3, which may be retained or excised as a shorter version, as well as by the use of two polyadenylation sites. We also show that the different isoforms are differentially expressed in the fetal tissues. The persistence of sequences between exons 3 and 4 results in the synthesis of a 9G8 protein lacking the SR domain, which is expected to be inactive in constitutive splicing. Thus, our results raise the possibility that alternative splicing of intron 3 provides a mechanism for modulation of the 9G8 function.


INTRODUCTION

The splicing of nuclear pre-mRNA occurs in a multicomponent complex containing small nuclear ribonucleoproteins (U snRNP), splicing factors, and hnRNP proteins (for reviews, see Green (1991) and Moore et al. (1993)). Numerous splicing factors have been characterized in both lower (mainly the PRP factors) and higher eukaryotes. In higher eukaryotes, a unique set of factors, which belong to the family of the splicing factors rich in serine and arginine residues (called the SR factors), is involved in the first steps of splice site recognition. At present, seven SR factors have been identified and characterized: SF2/ASF (Krainer et al., 1990a, 1990b; Ge and Manley, 1990), SC35/PR264 (Fu and Maniatis, 1990, 1992; Vellard et al., 1992), SRp20 (also named X16) (Zahler et al., 1992; Ayane et al., 1991), SRp55 (Roth et al., 1991; Mayeda et al., 1992), SRp75 (Zahler et al., 1993b), RBP1 (Kim et al., 1992), and the 9G8 factor (Cavaloc et al., 1994). All these factors share two common characteristics: one or two RNA binding domains (RBD) near the amino terminus and a domain rich in serine/arginine (the SR domain) at the carboxyl terminus. The SR factors range in mass from 20 to 75 kDa, and the best characterized are the 30-kDa SF2/ASF and SC35 (Krainer et al., 1990a, 1990b, 1991; Ge and Manley, 1990; Ge et al., 1991; Fu and Maniatis, 1990, 1992; Zahler et al., 1992).

It has been argued that the various SR factors are interchangeable in constitutive splicing because each is able to complement SR-deficient extracts (for instance a cytoplasmic S100 extract) (Fu et al., 1992; Zahler et al., 1992). Recently, it has been shown that SF2/ASF and SC35 are able to form commitment complexes with a pre-mRNA substrate (Fu, 1993) and that they are required for the stable interaction of U1 snRNP with the 5' splice site (Kohtz et al., 1994). In agreement with these results, the region of SF2/ASF containing the two RBDs is able to recognize a typical 5' splice site in a short transcript (Zuo and Manley, 1994).

An interesting aspect of SR factors is that they may modulate alternative splicing in a concentration-dependent manner when several 5' splice sites are in competition. Increasing amounts of SF2/ASF or SC35 generally result in the preferred selection of the more proximal 5' splice site (Krainer et al., 1990b; Ge and Manley, 1990; Fu et al., 1992). However, a more extended comparison, including SRp40, SRp55, and SRp75, indicates that each SR factor has a differential ability to modulate alternative splicing in vitro (Zahler et al., 1993a). Moreover, as these factors are differentially expressed in different tissues, Zahler et al. (1993a) proposed that the SR factors may be involved in tissue-specific regulation of alternative splicing in vivo. In support of this idea, overexpression of SF2/ASF by transfection experiments led to a modulation of alternative splicing in vivo (Caceres et al., 1994).

We recently isolated the 9G8 factor, which has a molecular mass (30 kDa) similar to those of SF2/ASF and SC35 (Cavaloc et al., 1994). However, its primary sequence in the RBD is only 40% conserved relative to SF2/ASF and SC35. In addition, 9G8 presents some specific features, since it contains an RRSRSXSX consensus sequence repeated six times in the SR domain and a CCHC zinc knuckle motif in its median region (Cavaloc et al., 1994). The occurrence of a large family of SR splicing factors which are differentially expressed in organisms raises questions related to the structure of their genes, the existence of a common ancestral gene, and the molecular basis of their modulated expression. In contrast with the abundant data on the SC35/PR264 gene (Sureau et al., 1992; Sureau and Perbal, 1994), very little is known about the genomic structure and the expression of these factors. We report here the isolation and characterization of the 9G8 gene, the determination of the exon/intron organization, a succinct analysis of the promoter, and the determination of the structure of the mRNA isoforms produced by alternative splicing and polyadenylation.


MATERIALS AND METHODS

Isolation of Human 9G8 Gene

A 5' 32P-labeled 38-nt probe spanning positions +153/+190 of the cDNA (QE203) or a 32P-labeled random-primed cDNA PCR product obtained with the QO60 and QN140 oligonucleotides (+682/+902) was used to screen a human placental genomic library in λGEM-12. Duplicate plaque lifts were prepared and probed as described (Sambrook et al., 1989). Three positive genomic clones containing inserts of 17 (9G8-I), 15 (9G8-II), and 17 kb (9G8-III) were isolated from 8.10 recombinant phages.

Design of Probes

The cDNA 1 probe, a 330-bp BstBI-BglII fragment (265/595), and the cDNA 2 probe, a 709-bp EcoRI-EcoRI fragment (262/971), were obtained from the 9G8 cDNA clone 3 (Cavaloc et al., 1994); the 3'-untranslated region probe, a 383-bp NdeI-AvaII fragment (1003/1386), was obtained from the 3.2-kb SacI subclone of 9G8-III. The probes used for the recognition of intron 3 were obtained by PCR amplification of a 1064-bp fragment containing the entire intron 3 and short sequences of the surrounding exons, using the primers QM95 (5'-TTTGATAGACCACCTGCC-3') and QP101 (5'-TTCGTCCCCTGCTCCTGCTGC-3'). The resulting fragment was then cleaved with RsaI and XbaI, and two DNA bands of 450 bp (IVS 3 up) and 327 bp (IVS 3 down) were gel-purified. Each probe was labeled with [α-32P]dCTP by random priming.

Southern Blot Analysis and Subcloning of DNA Fragments

For the Southern blot analysis of the recombinant phages, 2 µg of each DNA were digested with SacI or EcoRI restriction enzymes. One µg was run on 5% polyacrylamide gel and then transferred to a Hybond N+ nylon membrane filter with 0.4 M NaOH as described by the supplier. The phage DNA digested by SacI was used to subclone, by the shotgun technique, the different fragments produced. For the genomic Southern analysis, 15 µg of human placental DNA digested by EcoRI or SacI were electrophoresed on 0.8% agarose gel and transferred to a nylon membrane (Hybond N+).

The blot containing the recombinant phage DNA was probed with the 38-mer QE203 oligonucleotide or the QO60/QN140 PCR product (+682/+902), labeled with 32P. The blot containing human placental DNA was probed with the 32P-labeled random-primed cDNA 1 and cDNA 2 probes. The hybridization was performed at 42 °C for 16 h in the hybridization solution (2× SSC, 0.1% SDS, 1× Denhardt's solution, 30% formamide, and 10 µg/ml salmon sperm DNA), except for the QE203 probe where formamide was omitted. The filters were then washed with 0.2× SSC and 0.1% SDS at 45 °C (QE203 probe) or at 60 °C (other probes) and subjected to an overnight autoradiography.

Primer Extension Analysis

The 5' 32P-labeled QM94 (5'-GCAGCGCCCAGGGCTCGAGTGAC-3') or QO14 (5'-GTAACGCGACATGATGACAGACC-3') oligonucleotides were annealed overnight with aliquots of 1 µg of 293 cell poly(A)+ RNA in a solution of 1× NPES (250 mM NaCl, 40 mM Pipes, pH 6.4, 5 mM EDTA, 0.2% SDS). After precipitation and washing, the extension reaction was performed with 10 units of avian myeloblastosis virus reverse transcriptase for 30 min at 42 °C. The RNA was then degraded by 0.2 M NaOH at 42 °C, and the extension DNA product was electrophoresed on a 6% denaturing polyacrylamide gel containing 8 M urea.

Construction of Reporter and Expression Plasmids

Artificial restriction sites HindIII and BamHI were inserted by PCR at positions -420 and +25, respectively. After endonuclease digestion, the PCR product was inserted in the corresponding sites of pBLCAT3 (Luckow and Schütz, 1987), giving the 9G8 "full-length" construct (9G8FL) (Fig. 6A). This region contains one StuI site at position -205 and a NotI site at position -38. Deletion mutants 9G8H/S and 9G8S/N were obtained by releasing the HindIII-StuI and StuI-NotI fragments, respectively, from the 9G8FL vector, blunting, and religating the vector backbone. Deletion mutant 9G8S/-72 was obtained by releasing the StuI-NotI fragment from the 9G8FL vector and inserting the phosphorylated double-stranded 36-mer oligonucleotide corresponding to the -72/NotI fragment of the wild type promoter. The 9G8-88/-34 mutant was generated by cutting the 9G8FL construct with SphI and digesting with Bal-31 for various times. DNA was then blunted with the Klenow fragment of DNA polymerase and religated. All constructs were confirmed by restriction analysis and sequencing. The pE1A, pSVCREM, and pSVBmyb clones that express the 293-amino acid protein of adenoviral E1A unit, CREM, and c-Myb, respectively, were described previously (Leff et al., 1984; Foulkes et al., 1992; Sureau et al., 1992).


Figure 6: A, structure of the different deletion mutants used in transient transfection experiments. The 9G8 promoter from positions -414 to +26 was linked to the bacterial CAT gene. The StuI and NotI sites are indicated. The different deletions are shown and the positions of the deleted nucleotides are indicated. B, transcriptional activity of the 9G8 upstream sequences. CAT activity levels with different deletion mutants are represented. The value corresponds to the average of independent experiments.



Transfections and CAT Assays

JEG-3 human choriocarcinoma cells were maintained in Dulbecco's modified Eagle's medium supplemented with 10% fetal calf serum and were transfected by calcium phosphate coprecipitation technique. Cells were plated at a density of 10 cells/10-cm plate and transfected with 10 µg of total plasmid DNA. 3 µg of reporter plasmid was included in each transfection sample together with 1 µg of pSVCREM, pE1A, or pSVBmyb expression plasmids. CAT activity was assayed as described previously (Sassone-Corsi et al., 1988) and was quantified by PhosphorImager counting. The activity values correspond to the percentage of chloramphenicol modified by the chloramphenicol acetyltransferase.

Northern Blot Analysis

The human fetal multiple tissue Northern blot, containing 2 µg of poly(A)+ RNA from each tissue, was obtained from Clontech (catalog number 7761-1). The membrane was prehybridized and hybridized in pre/hybridization solution (5× SSC, 10× Denhardt's solution, 45% formamide, 1.5% SDS, and 100 µg/ml salmon sperm DNA) at 42 °C. After a 20-h hybridization with radiolabeled probe (2 10 cpm/ml), the blot was washed for 30 min at 55 °C in 0.1× SSC and 0.1% SDS and exposed for 16 h with Kodak X-Omat film or quantified by PhosphorImager counting.


RESULTS

Isolation and Structural Organization of the Human 9G8 SR Splicing Factor Gene

We previously cloned a cDNA encoding the 9G8 SR factor (Cavaloc et al., 1994). To isolate the corresponding genomic clone, a 38-nt oligonucleotide probe encompassing an amino acid sequence of the RBD not present in the other SR factors was used to screen a human placental genomic library in λGEM-12. We obtained two genomic clones, 9G8-I and -II, containing inserts of 17 and 15 kb, respectively (see also Cavaloc et al., 1994). However, preliminary analysis indicated that they do not cover the entire open reading frame of the 9G8 mRNA. Therefore, a 211-bp PCR product, from positions 692 to 902 of the 9G8 mRNA, covering the C-terminal region of the 9G8 factor, was used and allowed the isolation of another 17-kb clone (9G8-III). Analysis of the three clones by restriction endonuclease mapping and Southern blotting using the specific probes mentioned above revealed that the inserts overlap and together cover the entire open reading frame of the 9G8 gene (Fig. 1). The 5.5-kb SacI fragment at the 3' terminus of the clone I insert, as well as the 2.6-kb internal fragment and the 3.2-kb fragment at the 3' terminus of the clone III insert, were subcloned into pBluescript SK+ and used for further characterization and sequencing analysis.


Figure 1: Intron/exon organization and partial restriction map of the human 9G8 SR factor gene. Exons are represented as solid boxes. Exon 8 resulting from the use of the distal polyadenylation site is depicted as an expanded open box. The partial restriction map and the overlapping human genomic clones 9G8-I, -II, and -III are shown below the schematic representation of the 9G8 gene. The different restriction enzyme sites are represented as follows: H, HindIII; S, SacI; X, XhoI; and E, EcoRI. The complete genomic sequence is available in GenBank® (accession number L41887).



Sequencing of the Human 9G8 Gene and Its Genomic Organization

The complete exon-intron organization of the 9G8 gene was determined by sequencing the entire gene. Using the previously cloned 9G8 cDNA sequence as the reference mRNA sequence, we have determined that the 9G8 gene is 7745 nucleotides long and contains 8 exons and 7 introns (Table 1). The exons range from 36 (exon 7) to 1572 bp (last exon), and the intron sizes vary from 308 to 1298 bp (Table 1). From the sizes of exons and introns as well as the splice site sequences, which fulfill the GT-AG rule (Breathnach and Chambon, 1981), we deduced that the 9G8 gene exhibits features typical of many eukaryotic genes. The translated sequence of 9G8 is highly fragmented, as the open reading frame is distributed over the 8 exons. Interestingly, the RBD is contained in exon 2, and exon 3 encodes the zinc knuckle motif (Cavaloc et al., 1994). In contrast, the SR domain, which is 110 amino acids in size, covers exons 4-8. A study of the DNA sequence downstream of the stop codon reveals two polyadenylation signals at 610 and 1511 bp downstream of the stop codon.
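The two structural checks mentioned above (n exons imply n - 1 introns, and each intron begins with GT and ends with AG) can be illustrated with a minimal Python sketch. The toy sequence, the coordinates, and the function names are hypothetical and are not taken from the 9G8 sequence; exon coordinates are assumed to be 0-based, half-open intervals.

```python
def introns_from_exons(exons):
    """Return the intron intervals lying between consecutive exons.

    `exons` is a sorted list of (start, end) half-open, 0-based intervals.
    """
    return [(a_end, b_start) for (_, a_end), (b_start, _) in zip(exons, exons[1:])]


def follows_gt_ag_rule(genomic, start, end):
    """True if the intron genomic[start:end] begins with GT and ends with AG."""
    intron = genomic[start:end].upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


# Hypothetical two-exon toy gene (not the 9G8 sequence).
toy_gene = "ATGGCC" + "GTAAGTTTTCAG" + "GGATAA"
toy_exons = [(0, 6), (18, 24)]
toy_introns = introns_from_exons(toy_exons)
all_gt_ag = all(follows_gt_ag_rule(toy_gene, s, e) for s, e in toy_introns)
```

Applied to a list of eight exon intervals, `introns_from_exons` returns seven introns, matching the organization reported for the 9G8 gene.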



Southern Analysis and Chromosomal Localization

Southern blot analysis of human genomic DNA digested by EcoRI or SacI was performed using the two cDNA probes, cDNA 1 (positions 265/595) covering exons 2 and 3 and cDNA 2 (positions 262/971) covering exon 2 to exon 8 (Fig. 2). The short cDNA probe revealed a single band after digestion with either EcoRI (2.7 kb) or SacI (7.5 kb), in agreement with what was expected from the structure of the genomic clones (Fig. 1). In contrast, the extended probe (cDNA 2) detected an additional fragment in the EcoRI (8 kb) and SacI (2.5 kb) restriction digests (Fig. 2), consistent with the structure of the 9G8-III clone shown in Fig. 1. Thus, these results confirm the structural organization of the 9G8 gene. They also indicate that 9G8 is encoded by a single copy gene and that no pseudogenes are present in the genome. The chromosomal location of the 9G8 gene was determined by in situ hybridization using the cDNA 2 probe. Analysis of 100 metaphase cells revealed a total of 237 silver grains on chromosomes, and 51 of these (21.5%) were located on chromosome 2. Analysis of the grain distribution indicated that 36 out of 51 (70.6%) of these mapped to the p22-21 region on the short arm of chromosome 2, with an intense localization in the p21 band (Fig. 3). This result allows us to assign the 9G8 gene unequivocally to the p22-21 region of human chromosome 2.
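As a quick check of the grain statistics quoted above, the two percentages can be recomputed from the reported counts. The following Python sketch uses only the numbers given in the text; the variable and function names are illustrative.

```python
# Grain counts reported for the in situ hybridization (100 metaphase cells).
total_grains = 237       # silver grains over all chromosomes
chr2_grains = 51         # grains located on chromosome 2
p22_p21_grains = 36      # chromosome 2 grains mapping to the p22-21 region


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


chr2_pct = percent(chr2_grains, total_grains)      # share of all grains on chromosome 2
region_pct = percent(p22_p21_grains, chr2_grains)  # share of chromosome 2 grains in 2p22-21
print(chr2_pct, region_pct)
```

Both values agree with the percentages given in the text (21.5% and 70.6%).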


Figure 2: Southern blot analysis of the 9G8 gene. Human genomic DNA (15 µg) was digested with EcoRI or SacI restriction endonucleases and, after gel electrophoresis and blotting, was hybridized using cDNA 1 (positions +265 to +595) or cDNA 2 probes (+262 to +971). The marker size (in kilobases) is indicated on the left.




Figure 3: Idiogram of distribution of signals on chromosome 2. In situ hybridization of human metaphase chromosomal spreads was performed as described (Mattei et al., 1993). 70.6% of grains located on chromosome 2 mapped to the p22-21 region of the short arm, with a maximum in the p21 band.



Identification of the Transcription Initiation Site

To determine the transcription initiation site, we performed a primer extension analysis (Fig. 4). Primer extension on poly(A)+ RNA isolated from 293 cells, using two 23-mer oligonucleotides, QM94 (upstream of the AUG codon) and QO14 (encompassing the AUG codon), resulted in the synthesis of cDNAs of 72-70 residues (Fig. 4) or 117-115 residues (not shown), respectively. Comparing the extension products with a sequence ladder generated by extending the same primers from a plasmid containing an incomplete 9G8 cDNA localized the initiation site at a G residue, downstream from the CT-rich sequence CTCTTCCTC/G + 1. It is not known whether the cDNA beginning 2 residues downstream of the longer cDNA (Fig. 4) corresponds to a true initiation site at an A residue or to a premature stop of the primer extension.


Figure 4: Primer extension analysis. A 5' end-labeled 23-mer oligonucleotide QM94 was annealed to 1 µg of poly(A)+ or poly(A)- RNA of 293 cells or 2.5 µg of tRNA of Escherichia coli. Extension with reverse transcriptase was as described under "Materials and Methods." The same primer was employed for dideoxy sequencing (lanes A, C, G, and T) in which the template was an incomplete cDNA clone. The primer extension and sequencing products were electrophoresed on a 6% polyacrylamide gel. The DNA band representing the major transcription start site, 72 nt upstream of the 5' end of the QM94 primer, is denoted by an arrow.



Structure of the 9G8 Upstream Region

Analysis of the sequence of the 5'-flanking region of the 9G8 gene contained in the 5.5-kb SacI subfragment reveals a high GC content (57%) and several promoter elements (Fig. 5). A TATA motif (TATATAA) is present at position -29, and three potential SP1 binding sites (GGCGGG) are found at positions -87, -148, and -224. A computer search also reveals putative regulatory elements (Locker and Buzard, 1990; Faisst and Meyer, 1992). Two sequence motifs for liver-specific factor A1 (LFA1), TGAACC and TGACCC, are present at positions -149 and -345, and one possible AP-2 motif (GCCTGGg), which diverges from the AP-2 consensus by one nucleotide, is located at position -298. In addition, an ATGACGcA sequence, which exhibits a good match with the consensus ATF site, is present at position -59 and overlaps a TGACGcat sequence, with significant homology to the CRE motif. Sequences identical to the core consensus for Ets (GGAAPu) are also present at positions -266, -261, -208, and -115. Finally, minimal consensus sites for Myb (AACNG) are located at positions -96, -121, -239, and -375.
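A consensus search of this kind can be sketched in a few lines of Python. This is only an illustration of how degenerate motifs such as GGAAPu (written GGAAR in IUPAC notation) or AACNG can be located on both strands; it is not the program used for the analysis above, the toy sequence is hypothetical, and positions are reported as 0-based offsets on the input strand rather than relative to the transcription start site.

```python
import re

# IUPAC codes needed for the consensus motifs quoted in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}

# Consensus motifs from the text (Pu is written R in IUPAC notation).
MOTIFS = {"SP1": "GGCGGG", "Ets": "GGAAR", "Myb": "AACNG"}


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(seq, consensus):
    """Return sorted (offset, strand) hits of a degenerate motif on both strands.

    Offsets are 0-based and always refer to the leftmost base on the input strand.
    """
    # A lookahead group lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in consensus) + "))")
    n, k = len(seq), len(consensus)
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    hits += [(n - m.start() - k, "-") for m in pattern.finditer(revcomp(seq))]
    return sorted(hits)


# Hypothetical toy sequence (not the 9G8 promoter).
toy = "ATGGCGGGTTAACCGTTTCCAT"
toy_hits = {name: find_motif(toy, cons) for name, cons in MOTIFS.items()}
```

Short consensus sequences such as these occur frequently by chance, so hits from a scan like this only indicate candidate sites.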


Figure 5: Sequence of the upstream region of the 9G8 transcription unit. The sequence of the upstream region of the 9G8 gene from positions -414 to +26 is represented, and the transcription start site is designated +1. The potential binding sites for various transcription factors are indicated on the corresponding sequences by arrows, which also give the orientation of these elements (Locker and Buzard, 1990; Faisst and Meyer, 1992). The natural StuI and NotI sites are indicated with open boxes and the artificial HindIII and BamHI sites (see "Materials and Methods") with solid boxes.



Functional Analysis of the 5`-Flanking Region

To characterize the upstream region of the 9G8 gene, a DNA fragment spanning from -414 to +26 (clone 9G8FL), as well as fragments containing various deletions, were fused to the bacterial CAT reporter gene in the pBLCAT3 plasmid (Fig. 6A). These constructs were transiently expressed in JEG-3 cells, which express 9G8 mRNA at levels similar to those of HeLa cells, and cellular extracts were assayed for CAT activity 36 h after transfection. In Fig. 6B, we show that the upstream region was able to induce significant CAT activity. However, a deletion of the -202 to -36 region (clone 9G8S/N), which leaves the TATA motif intact, results in an 8-fold reduction of the CAT activity (compare the first and the fifth bars). A similar reduction was observed with the almost complete deletion of the 9G8 upstream region from -414 to -36 (not shown). Moreover, two smaller deletions, -202/-72 (clone 9G8S/-72) and -88/-34 (clone 9G8-88/-34), which together cover the -202 to -36 deletion (Fig. 6A), each resulted in a 2.5-fold reduction compared with the wild type construct. In contrast, deletion of the region -414 to -203 (clone 9G8H/S) induced a 2-fold stimulation of the promoter activity (compare the first and the second bars), indicating that the most upstream sequences do not contain strong positive acting elements and that some negative elements may be present. Thus, this deletional analysis shows that the upstream region of the 9G8 gene is effective in activating a CAT reporter gene and that the positive acting elements are mainly distributed along the -202 to -34 region, immediately upstream of the TATA box. The presence of many putative regulatory elements served as a guide for testing, in a preliminary manner, several trans-acting factors. The effects of CREM+PKA, Myb, and E1A factors, which are known to activate many cellular genes, were analyzed by cotransfection experiments. We found that each factor stimulates transcription about 3-fold (not shown), suggesting that the 9G8 promoter is able to respond to different trans-acting factors.

The 9G8 Gene Is Alternatively Spliced

Northern blot analysis of human fetal poly(A)+ RNA isolated from the brain, liver, kidney, and lung (Clontech) was carried out using a 330-nt cDNA probe covering exons 2 and 3 of the 9G8 RNA (cDNA 1). We detected five mRNAs of approximately 1.3, 2.0, 2.4, 2.6, and 3.8 kb in size, the 2.4- and 2.6-kb species being very close (Fig. 7, lanes 1-4). We observed that the same five transcripts are present in adult poly(A)+ RNAs (Multiple Tissue Northern blot 1, Clontech), suggesting that none of the isoforms is specific for a developmental stage (not shown).


Figure 7: Northern blot analysis of the 9G8 mRNA species in human fetal tissues. In the upper panel, the analysis performed with the human multiple fetal tissue Northern blot (Clontech) is represented. The tissues are indicated above each lane, and the probe used is indicated at the top of each blot (see "Materials and Methods" for the description of the probes). The molecular weight markers are indicated on the right, and the sizes of the different 9G8 mRNA species are given on the left of each blot. The scheme of each mRNA species is given in the lower panel. Exons are represented by open boxes, and coding regions are indicated by black boxes. Intron 3 is delineated by a hatched box.



To further analyze the different isoforms, the blot was hybridized with DNA probes specific for (i) the 3'-noncoding region between the two potential poly(A) signals (3' UTR; Fig. 7, lanes 5-8), because the distance between the two potential polyadenylation signals (1 kb) could account for some of the size differences observed among the mRNAs, and (ii) intron 3, since further characterization of the 9G8 cDNA clones isolated previously (Cavaloc et al., 1994) indicated that two out of nine cDNA clones contained this intron (not shown). We observed that the 3.8- and 2.6-kb mRNA species use the distal poly(A) signal (Fig. 7, second panel), and that only the 3.8-, 2.4-, and 2.0-kb species contain intron 3 sequences (not shown). Nevertheless, the 2.0-kb species is too short to contain the entire intron 3. Looking for consensus signals within this intron, we found one potential 3' splice site in its middle. By using DNA probes specific for the sequences downstream (Fig. 7, lanes 9-12) and upstream (Fig. 7, lanes 13-16) of this splice site, we show that the 2.0-kb mRNA is generated by the use of this alternative 3' splice site and that the 3.8- and 2.4-kb mRNAs contain the entire intron 3 sequence. The putative structure of all the mRNA isoforms is given in Fig. 7. A quantitative analysis of the relative abundance of the different species in various fetal tissues indicates that the 1.3-kb species, which encodes the whole 9G8 factor, is highly predominant in the liver, but it appears as a minor isoform in the kidney. In contrast, the 2.4-kb isoform, which contains the entire intron 3, is the predominant species in kidney. A quantitative estimation of the 9G8 mRNA isoforms indicates that the relative ratio of the intron 3 minus transcripts (1.3- and 2.6-kb species) to the intron 3 plus transcripts (2.0-, 2.4-, and 3.8-kb species) varies from about 1 to 5 between the kidney and the liver, respectively.


DISCUSSION

The 9G8 gene is divided into eight exons and seven introns, and the coding sequence is highly fragmented, since it starts in exon 1 and stops in exon 8. Thus, its exon/intron organization is very different from those of the two other SR factor genes SC35/PR264 (Sureau and Perbal, 1994) and RBP1 (Kim et al., 1992), since in these genes the coding sequence is contained in only two exons. Previous comparison of amino acid sequences of the RBD has shown that 9G8 presents a good homology with SRp20 and RBP1 (Cavaloc et al., 1994), suggesting that several SR factors may originate from a common ancestral gene, as already proposed (Birney et al., 1993). However, the very different organization of the 9G8 and RBP1 genes indicates that profound modifications have occurred after ancient gene duplication. It has been proposed that intron sequences frequently demarcate important functional or structural domains of proteins. We observe indeed that the RNA binding domain of the 9G8 factor covers precisely exon 2 sequences, whereas the middle region containing the specific CCHC zinc knuckle is located in the third exon. Finally, the SR domain of 9G8 is distributed over exons 4-8, but a specific region of this domain, not found in the other SR factors, is encoded precisely in exon 5. This region contains four conserved repetitions of the consensus RRSRSXSX (Cavaloc et al., 1994) and might originate from intragenic recombination that occurred during evolution. We have shown that the human 9G8 gene is located on chromosome 2p22-21. This region is the site of several known genes, including human T cell leukemia virus enhancer factor (HTLF) (Li et al., 1992), and translocations of chromosome 2p22-16 with chromosome 11q23 have been reported in human leukemia (Bloomfield and de la Chappelle, 1987).

Examination of the 400-bp region upstream of the 9G8 gene shows several interesting features. The G + C content of 57% and the CpG content of 7% are indicators that the promoter region of the 9G8 gene is in a CpG island (Larsen et al., 1992). However, the presence of a TATA box and many potential regulatory elements does not allow us to classify this gene as a typical housekeeping gene (Locker and Buzard, 1990; Faisst and Meyer, 1992). The promoter of the 9G8 gene has been shown to be functional in JEG-3 cells and to respond to several trans-acting factors. In this respect, the 9G8 promoter resembles that of the SC35/PR264 gene, which contains several Myb-responsive elements (Sureau et al., 1992). In fact, 9G8 expression, similar to that of the other SR factors, is likely ubiquitous, but it has been shown that the SR factors recognized by mAb104 (SRp20, SRp30, SRp40, SRp55, and SRp75) are expressed at different levels in various calf tissues (Zahler et al., 1993a).
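The base composition measures used in this kind of argument can be computed as in the following Python sketch. It is a minimal illustration under stated assumptions: "CpG content" is taken here as the percentage of dinucleotide positions occupied by CG, which may not be the exact definition used above, and the commonly used observed/expected CpG ratio is included for comparison. The function names and the toy sequence are hypothetical.

```python
def gc_percent(seq):
    """Percentage of G + C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def cpg_percent(seq):
    """Percentage of dinucleotide positions that are CpG (assumed definition)."""
    seq = seq.upper()
    return 100.0 * seq.count("CG") / (len(seq) - 1)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (number of CpG x length) / (number of C x number of G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    return seq.count("CG") * len(seq) / (c * g) if c and g else 0.0


# Hypothetical toy sequence (not the 9G8 promoter).
toy = "CGCGATAT"
toy_gc = gc_percent(toy)
toy_cpg = cpg_percent(toy)
toy_ratio = cpg_obs_exp(toy)
```

CpG islands are typically recognized by a high G + C content together with a CpG frequency close to that expected from the base composition, whereas bulk vertebrate DNA is depleted in CpG.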

We show in this paper that the expression of the 9G8 mRNA is the target of different regulations such as alternative splicing and alternative polyadenylation, leading to five readily detectable species from 1.3 to 3.8 kb. One interesting feature is the retention of all or part of intron 3, because it leads to the translation of a truncated form of the 9G8 protein devoid of the SR domain by introducing a stop codon downstream of the exon 3/intron 3 junction (Fig. 8). In fact, although there are no important variations in the total amounts of 9G8 transcripts within the tested fetal tissues (Fig. 7), some important changes in the distribution of each species occur. The existence of the alternative splicing of intron 3, most likely due to a suboptimal 3' splice site at the end of this intron (see the sequence in Fig. 8), seems puzzling if it is thought that maximal levels of SR factors are required for an efficient splicing machinery. However, we have to take into account that levels of SR factors are variable (Zahler et al., 1993a) and may participate in the modulation of alternative splicing, as proposed by these authors. Moreover, our data raise the interesting possibility that the splicing of intron 3 might be subject to regulation, because its 3' splice site is similar to other weak 3' splice sites, such as the female-specific 3' splice site of doublesex pre-mRNA (Ryner and Baker, 1991; Hedley and Maniatis, 1991) or that of the fibronectin ED1 exon (Caputi et al., 1994). In this respect, purine-rich motifs of the form CAGGAGGAA, CAGCAGGAG, and CAGGGACGAAG, located downstream within exon 4 of this gene, resemble cis-acting motifs found in the M2 exon of the IgM gene (Tanaka et al., 1994) and in the troponin (Xu et al., 1993), fibronectin (Lavigueur et al., 1993; Caputi et al., 1994), or bovine growth hormone genes (Dirksen et al., 1994). They may be important for the excision of the whole intron 3. In fact, an mRNA variant containing a retained intron had also been previously described for the ASF/SF2 protein (Ge et al., 1991), but the amount of this species was small (<5%). Interestingly, the SR splice variant of ASF/SF2 retains its full ability to modulate alternative splicing, but loses its characteristics of a constitutive splicing factor (Zuo and Manley, 1993; Caceres and Krainer, 1993). Nevertheless, it is still unknown whether this type of truncated SR factor is involved in the regulation of alternative splicing pathways in vivo.


Figure 8: Representation of the splicing events occurring on intron 3 and their consequences for the primary sequence of the 9G8 protein. A, intron 3 and the flanking exons are represented. Exonic sequences are boxed, and the different splice sites are indicated by an arrow. Under the diagram, the amino acid sequence of each splice variant is indicated (stop codons are indicated by an asterisk). B, the sequences of the two alternatively used 3' splice sites are shown. The exonic sequences are written in uppercase letters and intronic sequences in lowercase letters.



We also show that the 9G8 pre-mRNA is alternatively processed in its 3'-untranslated region. Looking at polyadenylation signals within the genomic primary sequence, we find one ATTAAA motif at position 1433 and one AATAAA motif at position 2334 that are used in vivo. It had been observed that SC35/PR264 pre-mRNA is subject to alternative polyadenylation coupled to alternative splicing within its 3'-untranslated region (Sureau and Perbal, 1994). Differences in the stability of the SC35/PR264 mRNA species have been demonstrated, in agreement with studies showing that sequences contained in the 3'-untranslated region are involved in the stability of the mRNA or in the control of translation (Sachs, 1993). In conclusion, we have identified splice variants of the 9G8 transcript that may allow the synthesis of significant and variable amounts of a 9G8 SR factor lacking the SR domain. It will be interesting to determine whether the differential expression of the 9G8 factor may be involved in the modulation of alternative splicing in vivo.


FOOTNOTES

*
This work was supported by funds from the Institut National de la Santé et de la Recherche Médicale, the Centre National de la Recherche Scientifique, the Centre Hospitalier Universitaire Régional, the Association pour la Recherche contre le Cancer, and the Association Française de lutte contre les Myopathies. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

§
To whom correspondence should be addressed. Tel.: 33-88-65-33-61; Fax: 33-88-65-32-01.

The abbreviations used are: snRNP, small nuclear ribonucleoprotein; RBD, RNA binding domain; nt, nucleotide(s); PCR, polymerase chain reaction; kb, kilobase(s); bp, base pair(s); CAT, chloramphenicol acetyltransferase; Pipes, 1,4-piperazinediethanesulfonic acid.


ACKNOWLEDGEMENTS

We are grateful to Dr. I. Davidson for critical reading of the manuscript. We thank G. Hildwein for excellent technical assistance, the cell culture group for growing cells, the photographic staff for preparation of the manuscript, and B. Chatton, N. Foulkes, and J. Soret for the gift of clones.


REFERENCES

  1. Ayane, M., Preuss, U., Köhler, G., and Nielsen, P. J. (1991) Nucleic Acids Res. 19, 1273-1279
  2. Birney, A., Sanjay, K., and Krainer, A. R. (1993) Nucleic Acids Res. 21, 5803-5816
  3. Bloomfield, C. D., and de la Chappelle, A. (1987) Semin. Oncol. 14, 372-383
  4. Breathnach, R., and Chambon, P. (1981) Annu. Rev. Biochem. 50, 349-383
  5. Caceres, J. F., and Krainer, A. R. (1993) EMBO J. 12, 4715-4726
  6. Caceres, J. F., Stamm, S., Helfman, D. M., and Krainer, A. R. (1994) Science 265, 1706-1709
  7. Caputi, M., Casari, M., Guenzi, S., Tagliabue, R., Sidoli, A., Melo, C. A., and Baralle, F. E. (1994) Nucleic Acids Res. 22, 1018-1022
  8. Cavaloc, Y., Popielarz, M., Fuchs, J.-P., Gattoni, R., and Stévenin, J. (1994) EMBO J. 13, 2639-2649
  9. Dirksen, W. P., Hampson, R. K., Sun, Q., and Rottman, F. M. (1994) J. Biol. Chem. 269, 6431-6436
  10. Faisst, S., and Meyer, S. (1992) Nucleic Acids Res. 20, 3-26
  11. Foulkes, N. S., Mellström, B., Benusiglio, E., and Sassone-Corsi, P. (1992) Nature 355, 80-84
  12. Fu, X.-D. (1993) Nature 365, 82-85
  13. Fu, X. D., and Maniatis, T. (1990) Nature 343, 437-441
  14. Fu, X. D., and Maniatis, T. (1992) Science 256, 535-538
  15. Fu, X. D., Mayeda, A., Maniatis, T., and Krainer, A. R. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 11224-11228
  16. Ge, H., and Manley, J. L. (1990) Cell 62, 25-34
  17. Ge, H., Zuo, P., and Manley, J. L. (1991) Cell 66, 373-382
  18. Green, M. R. (1991) Annu. Rev. Cell Biol. 7, 559-599
  19. Hedley, M. L., and Maniatis, T. (1991) Cell 65, 579-586
  20. Kim, Y.-J., Zuo, P., Manley, J. L., and Baker, B. S. (1992) Genes & Dev. 6, 2569-2579
  21. Kohtz, J. D., Jamison, S. F., Will, C. L., Zuo, P., Lührmann, R., Garcia-Blanco, M. A., and Manley, J. L. (1994) Nature 368, 119-124
  22. Krainer, A. R., Conway, G. C., and Kozak, D. (1990a) Genes & Dev. 4, 1158-1171
  23. Krainer, A. R., Conway, G. C., and Kozak, D. (1990b) Cell 62, 35-42
  24. Krainer, A. R., Mayeda, A., Kozak, D., and Binns, G. (1991) Cell 66, 383-394
  25. Larsen, F., Gundersen, G., Lopez, R., and Prydz, H. (1992) Genomics 13, 1095-1107
  26. Lavigueur, A., La Branche, H., Kornblihtt, A. R., and Chabot, B. (1993) Genes & Dev. 7, 2405-2417
  27. Leff, T., Elkaim, R., Goding, C. R., Jalinot, P., Sassone-Corsi, P., Perricaudet, M., Kedinger, C., and Chambon, P. (1984) Proc. Natl. Acad. Sci. U. S. A. 81, 4381-4385
  28. Li, C., Lusis, A. J., Sparkes, R., Tran, S.-M., and Gaynor, R. (1992) Genomics 13, 658-664
  29. Locker, J., and Buzard, G. (1990) DNA Seq. 1, 3-11
  30. Luckow, B., and Schütz, G. (1987) Nucleic Acids Res. 15, 5490
  31. Mattei, M. G., Bruce, B., and Karsenty, G. (1993) Genomics 16, 786-788
  32. Mayeda, A., Zahler, A. M., Krainer, A. R., and Roth, M. B. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 1301-1304
  33. Moore, J. M., Query, C. C., and Sharp, P. A. (1993) in The RNA World (Gesteland, R. F., and Atkins, J. F., eds) pp. 303-358, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY
  34. Roth, M. B., Zahler, A. M., and Stolk, J. A. (1991) J. Cell Biol. 115, 587-596
  35. Ryner, L. C., and Baker, B. S. (1991) Genes & Dev. 5, 2071-2085
  36. Sachs, A. B. (1993) Cell 74, 413-421
  37. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
  38. Sassone-Corsi, P., Lamph, W. W., Kamps, M., and Verma, I. M. (1988) Cell 54, 553-560
  39. Sureau, A., and Perbal, B. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 932-936
  40. Sureau, A., Soret, J., Vellard, M., Crochet, J., and Perbal, B. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 11683-11687
  41. Tanaka, K., Watakabe, A., and Shimura, Y. (1994) Mol. Cell. Biol. 14, 1347-1354
  42. Vellard, M., Sureau, A., Soret, J., Martinerie, C., and Perbal, B. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 2511-2515
  43. Xu, R., Teng, J., and Cooper, T. A. (1993) Mol. Cell. Biol. 13, 3660-3674
  44. Zahler, A. M., Lane, W. S., Stolk, J. A., and Roth, M. B. (1992) Genes & Dev. 6, 837-847
  45. Zahler, A. M., Neugebauer, K. M., Lane, W. S., and Roth, M. B. (1993a) Science 260, 219-222
  46. Zahler, A. M., Neugebauer, K. M., Stolk, J. A., and Roth, M. B. (1993b) Mol. Cell. Biol. 13, 4023-4028
  47. Zuo, P., and Manley, J. L. (1993) EMBO J. 12, 4727-4737
  48. Zuo, P., and Manley, J. L. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 3363-3367
