(Received for publication, March 21, 1995; and in revised form, June 25, 1995)
A previous study has identified a C → U mutation at
position -3 in the 3` splice site of intron 10 of the
phenylalanine hydroxylase pre-mRNA in a patient with phenylketonuria. In vivo, this mutation induces the skipping of the downstream
exon. This result is puzzling because both CAG and UAG have been
reported to function equally well as 3` splice sites. In this report, we
show that the C → U mutation affects predominantly the first step
of the splicing reaction and that it blocks spliceosome assembly at an
early stage. The 3` region of the phenylalanine hydroxylase intron 10
has two unusual characteristic features: multiple potential branch
sites and a series of four guanosine residues, which interrupt the
polypyrimidine tract at positions -8 to -11 from the 3`
splice site. We show that the mutation precludes the use of the
proximal branch site, while having no effect on the remote one. We also
show that in the UAG transcript, the four guanosine residues inhibit
the splicing of intron 10. The substitution of these purine residues by
one cytosine residue, regardless of the position, increases the
splicing efficiency of the mutant UAG precursor while having no effect
on the wild-type CAG precursor. Substituting the four purine residues
by four pyrimidines relieves the inhibition and rescues the use of the
proximal branch site. These results demonstrate that according to the
context, the C and U nucleotides preceding the AG are not equivalent
for the splicing reaction.
The removal of intervening sequences from precursor mRNA
involves the precise recognition of the 5` and 3` splice sites at the
exon/intron boundaries. The splicing reaction proceeds by two
transesterification steps (1, 2, 3). In the
first step, the precursor is cleaved at the 5` splice site to yield the
upstream exon and the lariat intermediate. In the second step, the
mature mRNA is formed and the intron is released as a lariat. The
accuracy of the splicing process is due to the coordinated assembly of
a large number of protein factors and small nuclear ribonucleoprotein
particles (U1, U2, and U4/U5/U6 snRNPs) on the precursor
to form the spliceosome (1, 2, 3). The
spliceosomal components interact with cis-acting sequences that include
the 5` splice site, the 3` splice site (YAG), the branch site, and the
pyrimidine tract downstream of the branch site.
The functional significance of these elements in the splicing reaction has been inferred from in vitro and in vivo site-directed mutational analysis in both yeast and mammalian systems (4, 5, 6, 7, 8). Most of the mutations in the invariant 5` GU and 3` AG dinucleotides have a strong effect on the splicing process. Some mutations abolish or reduce mRNA production, whereas others induce exon skipping or utilization of cryptic sites. The effects of mutations in other positions are less clear because of discrepancies between in vivo and in vitro results. However, the importance of nucleotides outside of the consensus sequences in the splicing process has been reported (4, 5, 9, 10). Krawczak et al. (11) collected 101 cases of splicing mutations that are responsible for human genetic diseases. Although the most frequently observed mutations concerned the invariant 5` GU and 3` AG dinucleotides, some that occurred outside of these invariant sequences also dramatically altered the processing of pre-mRNA. For example, position +5 of the 5` splice site, whose mutation leads to abnormally spliced products in genetic disorders, was recently shown to be critical for the accuracy of the splicing process through base pairing with U6 snRNP (12, 13, 14).
Concerning position -3 at the 3` splice site, two
mutations were reported in patients with
β-thalassemia (15). One was a C → A substitution at the 3`
splice site of intron 2 of the
β-globin gene. The other was a U → G substitution at the 3` splice site of intron 1. The resulting
RNA splicing defects were consistent with what was previously observed
by means of in vitro mutational studies. The replacement of 3`
CAG by 3` GAG impaired the second step of the splicing reaction,
whereas AAG or UAG had only a limited effect on the splicing
process (16, 17, 18). These studies were
extended in a more recent paper; using the properties of AG-independent
introns, which can support the first step of the splicing reaction in
the absence of the dinucleotide AG, it was shown that the nucleotides
preceding the AG are not equivalent, with CAG = UAG > AAG > GAG (19). This finding can be related to the frequency
with which wild-type 3` splice sites are represented in the human
genome, where CAG is the most frequent (75%), followed by UAG (23%), AAG
(2%), and GAG (<1%) (20). More recently, a C → U
transition at position -3 in intron 2 of the CFTR gene was
reported in a patient with cystic fibrosis (21). PCR sequencing
analysis revealed that exon 3 was skipped during splicing of the mRNA.
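To make the position -3 nomenclature concrete, the following Python sketch reads the last three intron nucleotides as the acceptor triplet and looks up the frequency quoted above (20). It is only an illustration: the function name `acceptor_triplet` and the example sequence are hypothetical and are not taken from any gene discussed here; the percentages simply restate those given in the text.

```python
# Frequencies of 3' splice site triplets in human introns, as quoted in the
# text (20). Values are kept as strings because GAG is reported only as "<1%".
REPORTED_FREQUENCY = {"CAG": "75%", "UAG": "23%", "AAG": "2%", "GAG": "<1%"}


def acceptor_triplet(intron_3prime_end: str) -> str:
    """Return the last three intron nucleotides (positions -3, -2, -1) as RNA."""
    rna = intron_3prime_end.upper().replace("T", "U")
    if len(rna) < 3 or not rna.endswith("AG"):
        raise ValueError("sequence must end with the invariant AG dinucleotide")
    return rna[-3:]


# Hypothetical intron end, for illustration only (not a real sequence).
triplet = acceptor_triplet("cuuucuuucag")
print(triplet, REPORTED_FREQUENCY[triplet])
```

Under this convention, the mutation studied here changes the triplet from CAG to UAG while leaving the invariant AG untouched.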
In 1993, a CAG to UAG mutation was described in the 3` splice site
of intron 10 of the phenylalanine hydroxylase pre-mRNA in a patient
with phenylketonuria (22). This mutation causes the skipping of
exon 11 and the premature termination of RNA translation in the
following exon. This was the first report showing that a pyrimidine to
pyrimidine (C → U) substitution at the -3 position in the 3`
splice site affects the splicing process.
To gain more insight into the molecular mechanism responsible for such a deleterious effect, we investigated the mutated precursor by an in vitro splicing assay. We show that the mutation markedly reduces the splicing efficiency and that it prevents the use of the branch site located 22 nt upstream of the 3` splice site (position -22). In addition, the substitution of four contiguous purine residues, which interrupt the pyrimidine tract, by pyrimidine residues relieves the inhibition of the mutant precursor. In contrast, a more limited effect is observed on the wild-type transcript. These results demonstrate that, according to the context, the C and U nucleotides preceding the AG are not equivalent for the splicing reaction.
Figure 1: In vitro splicing reaction of the wild-type 3`CAG and the mutant 3`UAG precursors. Splicing reactions were incubated for 0 (lanes 0), 30 (lanes 1), 60 (lanes 2), or 120 min (lanes 3). Mut and WT designate the mutant and the wild-type precursor, respectively. M designates ³²P-labeled HpaII fragments of pBR322. Due to the cloning procedure, the 5` exon migrates at 120 nt (exon 10, 54 nt plus 67 nt from the polylinker).
Figure 4: The effect of one G → C mutation within the tetraguanosine sequence on splicing of the 3` CAG and 3` UAG precursors. The tetraguanosine sequence and the G → C mutation introduced at each position are indicated at the top of the figure. Splicing of the 3` UAG (Mut) and 3` CAG (WT) precursors was analyzed for each mutation for 0 (lane 0), 30 (lane 1), 60 (lane 2), or 120 min (lane 3).
Figure 2: Spliceosome assembly on the wild-type 3`CAG and 3`UAG mutant pre-mRNAs. Splicing reactions were incubated under splicing conditions for 0 (lanes 0), 5 (lanes 1), 10 (lanes 2), 15 (lanes 3), 30 (lanes 4), or 60 min (lanes 5), treated with heparin (2 mg/ml), and loaded onto native polyacrylamide gels. H, nonspecific complexes; SP, splicing complexes; Mut, mutant (C → U) precursor; WT, wild-type precursor; Ad, adenovirus precursor used as a control; A, pre-splicing complex; B, spliceosome.
Figure 3: Localization of the branch point by primer extension analysis. To obtain enough material for branch site determination, the amount of mutant pre-mRNA was increased 2-fold. Wild-type (Wt) and mutant final lariats were subjected to primer extension analysis using a 15-mer oligonucleotide complementary to positions -1 to -15 from the 3` splice site in the presence (lanes A, C, G, and T) or in the absence (lane E) of dideoxynucleotides. The nucleotide sequences of the 3` region of intron 10 of both wild-type and mutant precursors are also shown. The arrows show the branch sites used in the splicing process.
To test if this was the case, we constructed plasmids replacing the
guanine residues at each position by one cytosine residue. Two or four
cytosine substitutions were also tested. The results of splicing
experiments with the CAG and UAG pre-mRNAs containing a single guanine
to cytosine substitution at each of four positions are shown in Fig. 4. The results demonstrate that a single cytosine
substitution has a significant stimulating effect on splicing of the mutant
pre-mRNA. Depending on the HeLa cell nuclear extract used, the
level of stimulation with one G → C substitution was
2-3-fold, regardless of the position (see Fig. 6). These
results demonstrate that there is no positional effect and indicate that
a pyrimidine is important at each of the four positions.
The subsequent splicing experiments with pre-mRNAs containing two or
four guanines substituted with cytosines indicated that the effect of
splicing stimulation was additive; increasing the number of pyrimidines
proportionately increases the splicing efficiency (Fig. 5). The
level of stimulation with four G → C substitutions was about
8-10-fold. The substitution of all guanine residues by cytosines
roughly restores the same splicing efficiency as the wild-type
precursor, which is correlated with the predominant use of the proximal
branch site (data not shown). In contrast, the single substitutions
have no effect on the wild-type precursor. Only the substitution by
four cytosine residues leads to an increase of the splicing activity by
about 2-fold (Fig. 6).
Figure 6: Splicing efficiency of the 3`CAG and 3`UAG precursors after substituting cytosine residues within the tetraguanosine sequence. Quantification was performed on the final lariat after 2 h of incubation under splicing conditions. The results were from five independent experiments that were carried out under the same experimental conditions. The splicing efficiency was calculated for each precursor with or without G to C substitution as described under ``Experimental Procedures'' (y axis). The values are expressed as percentages of the splicing efficiency of the wild-type 3`CAG precursor, where 100% is that of the wild-type transcript. The x axis corresponds to pre-mRNAs. 4G, wild-type tetrapurine sequence; 3G+1C, one G → C substitution independent of the position; 2G+2C, two substitutions by cytosine residues in the tetrapurine sequence at positions -10 and -11; 4C, four substitutions by cytosine residues. Because substitution of one residue of the tetrapurine sequence by a pyrimidine has the same effect on the splicing efficiency regardless of the position, the single substitutions are presented as one bar. The shaded bars designate the 3`CAG precursor, and the hatched bars designate the 3`UAG precursor.
Figure 5: The effect of double and quadruple G → C substitutions within the tetraguanosine sequence on splicing of the 3` CAG and 3` UAG precursors. The splicing reaction was analyzed on the 3` UAG (Mut) and 3` CAG (WT) precursors after replacement of the four guanosine residues by either two cytosine residues at positions -10 and -11 or four cytosine residues. The incubation times were 0 (lane 0), 30 (lane 1), 60 (lane 2), or 120 min (lane 3).
From these results, we propose that the G residues within the pyrimidine tract are negative elements for the splicing of the UAG precursor. The substitution by cytosine residues markedly increases the splicing efficiency and has a greater effect on the UAG precursor than on the CAG (Fig. 7). However, even after restoration of an uninterrupted polypyrimidine tract, the CAG and UAG transcripts behave differently. The splicing efficiency of the UAG transcript remains lower than its CAG counterpart.
Figure 7: The substitution of the purine residues by cytosines has a greater effect on the splicing reaction of the UAG precursor than of the CAG. The results shown in Fig. 6 have been expressed as the ratio of the splicing efficiency of the CAG and UAG transcripts containing the G to C substitutions over their respective nonsubstituted transcripts.
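The two normalizations used in Figs. 6 and 7 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the quantification procedure itself (which is described under ``Experimental Procedures''): the function names are hypothetical, and the numbers in the example are arbitrary placeholders, not measured values.

```python
def percent_of_wild_type(efficiency: float, wild_type_efficiency: float) -> float:
    """Fig. 6 style: efficiency as a percentage of the wild-type 3'CAG value."""
    if wild_type_efficiency <= 0:
        raise ValueError("wild-type efficiency must be positive")
    return 100.0 * efficiency / wild_type_efficiency


def fold_stimulation(substituted: float, nonsubstituted: float) -> float:
    """Fig. 7 style: substituted transcript over its own nonsubstituted parent."""
    if nonsubstituted <= 0:
        raise ValueError("nonsubstituted efficiency must be positive")
    return substituted / nonsubstituted


# Arbitrary placeholder efficiencies (NOT data from this study).
wt_4g, mut_4g, mut_4c = 20.0, 4.0, 12.0

print(percent_of_wild_type(mut_4g, wt_4g))  # mutant relative to wild type
print(fold_stimulation(mut_4c, mut_4g))     # effect of the G to C substitutions
```

Expressing each substituted transcript relative to its own parent, as in Fig. 7, separates the effect of the G to C substitutions from the baseline difference between the CAG and UAG precursors.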
In this paper we show that in vitro, the C → U mutation
at position -3 of the 3` splice site of phenylalanine
hydroxylase intron 10 severely impairs both the spliceosomal assembly
and the catalytic steps of the splicing reaction. This mutation was
identified in a patient with phenylketonuria. In vivo, the
effect of this mutation was to induce the skipping of the downstream
exon, leading to a truncated protein terminated in exon
12 (22). This is the first report showing that in position
-3 the C and U pyrimidines are not equivalent for the splicing
process. The results we obtained were surprising for several reasons.
First, in over 98% of human 3` splice sites a pyrimidine residue
precedes the AG consensus dinucleotide. Although the CAG acceptor site
is more common (75%), the UAG is still quite frequently used (23%; 20).
This was particularly relevant for the phenylalanine hydroxylase gene,
where in six out of the 13 introns the UAG is present at the 3` splice
site (30). Second, recent results have shown that when two
closely spaced AG dinucleotides are in competition, the U and C at the
-3 position are equally recognized by the splicing
apparatus (19). Third, as in other mammalian introns, a UAG can
be used to remove intron 10 from the phenylalanine hydroxylase gene.
This has been observed in a patient with phenylketonuria in whom a G
to A substitution at position -11 creates a UAG 3` splice
site (34). This last observation is of particular interest
because it suggests that the elements upstream of the four G residues
function perfectly well in a UAG context. It also points to the role of
the four G residues in the inhibitory process in a UAG context.
Our in vitro experiments show that the C → U mutation
predominantly blocks the first step of the splicing process. This is
correlated with a dramatic reduction of splicing complexes (Fig. 2). The mechanism by which this inhibition occurs remains
unknown. However, we can postulate that the C → U mutation
disrupts an early interaction needed for the 3` splice site to be
efficiently recognized by the splicing machinery.
The pyrimidine tract is known to be an important element involved in this process. In the case of AG-dependent introns, in which the pyrimidine tract is short, decreasing the number of pyrimidines downstream of the branch site or upstream of the 3`AG was shown to strongly impair the first step of the splicing reaction (16). Recently, Roscigno et al. (35) reported that reducing consecutive uracil residues in the pyrimidine tract by inserting purines affected spliceosomal assembly before A complex formation. Moreover, they showed that there is both a positional and a compositional effect. The substitution by G residues has a more pronounced effect on the splicing efficiency than that by A residues; they totally block the splicing reaction. All these findings point to the fact that the length of the pyrimidine tract, its composition, and the distance between the branch site and the pyrimidines can affect the splicing efficiency (16, 35, 36). Examination of the pyrimidine tract in intron 10 of the phenylalanine hydroxylase gene revealed several characteristic features. It is short and interrupted by four guanine residues very close to the 3` splice site. Indeed, substituting these purine residues by pyrimidines increases the splicing efficiency of the mutant pre-mRNA (see Fig. 6). Each individual substitution increases the splicing efficiency. Replacement of the four purines by four cytosine residues restores roughly the same splicing activity as is found in the wild-type precursor. These results are consistent with previous reports concerning the role of the pyrimidine tract in the splicing reaction and support the finding that the 3` region is recognized at an early step of complex formation (37).
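The notion of purines interrupting the pyrimidine tract can be illustrated with a short Python sketch that lists the positions of purine residues upstream of the AG, numbered as in the text (the last intron nucleotide is position -1). The function name `purine_positions` is hypothetical, and the example sequence is a made-up toy intron end built only to place four G residues at positions -8 to -11; it is not the actual sequence of intron 10.

```python
def purine_positions(intron_3prime_end: str) -> list[int]:
    """Positions (relative to the 3' splice site) of purines upstream of the AG.

    Assumes the last two nucleotides are the invariant AG (positions -2, -1),
    which are excluded from the scan.
    """
    rna = intron_3prime_end.upper().replace("T", "U")
    n = len(rna)
    # Index i in the string corresponds to position i - n.
    return [i - n for i, base in enumerate(rna[:-2]) if base in "AG"]


# Toy sequences for illustration only (NOT the intron 10 sequence).
interrupted = "CUUUCGGGGUCUUCAG"  # four G residues at positions -11 to -8
restored = "CUUUCCCCCUCUUCAG"     # the same tract after four G to C changes

print(purine_positions(interrupted))
print(purine_positions(restored))
```

In this toy case the first call reports the four interrupting positions and the second reports none, which corresponds to the uninterrupted tract obtained after the quadruple substitution.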
From the results we present here, we can suggest that increasing the pyrimidine content in the UAG precursor allows efficient recognition of the pyrimidine tract/3` splice site. But if this is so, why is the wild-type precursor so different in its requirement for the pyrimidine stretch? In fact, we have observed that increasing the number of pyrimidines in the wild-type transcript has no effect on the splicing reaction except when the four G residues are replaced by four C residues. In that case, the splicing efficiency increased by 2-fold. These results seem to indicate that the wild-type intron partially escapes the need for a very strong pyrimidine tract. In apparent contradiction of this assertion, these results also suggest that the information content of the pyrimidine tract is limited. It was previously reported that the C and U residues at position -3 are equally efficient for 3` splice site recognition (19). In contrast to these findings, our results clearly indicate that in the phenylalanine hydroxylase intron 10, the pyrimidines adjacent to the 3`AG are not equivalent. This is also illustrated by the fact that even after restoration of an uninterrupted pyrimidine tract, the splicing efficiency of the UAG precursor remains lower than that of its CAG counterpart. From these results, we propose that the C residue at position -3 increases the strength of the 3` splice site and is one of the determinants that compensate for the relative weakness of the pyrimidine tract. This assumption might explain why in mammalian introns CAG 3` splice sites are more abundant than UAG sites.
The recognition of the 3` splice site involves collaborative interactions between the branch site and the pyrimidine tract/3` AG (31, 38). We have shown that splicing of the wild-type precursor involves the use of several branch sites, localized at -22, -23, and -34 nt upstream of the 3` splice site. The proximal one lies in the UAAUAAC sequence, which, in six out of seven positions, closely resembles the yeast branch site sequence. The results show that it is preferentially used for the splicing of the wild-type transcript. This is consistent with previous work from Zhuang et al. (38) showing that the UACUAAC sequence is the most efficient branch site for mammalian mRNA splicing. In the previous section, we suggested that the information content of the pyrimidine tract is weak and that a cytosine residue at position -3 contributes to increase the strength of the 3` splice site. Another element that could compensate for the weak pyrimidine tract would be the presence of a strong branch site. In agreement with this hypothesis, it has been reported that in vitro a strong branch point sequence can partially overcome a weak polypyrimidine tract and vice versa (35). In the course of branch site determination, we noticed that a small percentage of the lariats have branch sites on A residues at positions -25, -26, and -39. Why multiple branch sites are used in the splicing of intron 10 is presently unknown. The existence of multiple branch sites has already been reported and is particularly well illustrated in alternative splicing (39, 40, 41, 42). In those cases, it was proposed that the presence of multiple branch sites provides a number of binding sites for splicing factors. In a similar way, we can suggest that the presence of several branch sites could increase the recruitment of splicing factors, allowing the stabilization of U2 snRNP on the selected branch site.
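The statement that the proximal branch site sequence matches the yeast sequence in six out of seven positions can be checked with a position-by-position comparison. The Python sketch below is only an illustration; the function name `matching_positions` is hypothetical, and the two sequences are those quoted in the text (UAAUAAC and UACUAAC).

```python
YEAST_BRANCH_SITE = "UACUAAC"  # sequence quoted in the text (38)


def matching_positions(candidate: str, consensus: str = YEAST_BRANCH_SITE) -> int:
    """Count positions at which two equal-length RNA sequences are identical."""
    candidate = candidate.upper().replace("T", "U")
    if len(candidate) != len(consensus):
        raise ValueError("sequences must have the same length")
    return sum(a == b for a, b in zip(candidate, consensus))


# Proximal branch site sequence of intron 10, as given in the text.
print(matching_positions("UAAUAAC"), "of", len(YEAST_BRANCH_SITE))
```

The single mismatch is at the third position (A in intron 10, C in the yeast sequence).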
The results presented here show that the splicing defect occurs at an early stage
of spliceosome assembly (Fig. 2). We can suggest that the
mutation interferes with some steps required for the formation of the A
complex. Among the numerous proteins required for stable U2 snRNP
interaction on the branch site is the splicing factor
U2AF (43). It is an abundant component of E complex whose
binding to the pre-mRNA is strongly dependent on the 3` splice site and
the integrity of the pyrimidine tract (37, 44). SF3a
and SF3b splicing activities are also involved in the U2 snRNP binding
process (45, 46). Recently, a p80 protein, which
associates with the branch site sequence prior to A complex formation,
has been identified, and it has been proposed that this protein could
be involved in the communication between U1 snRNP and the branch
region (47). One possibility could be that the C → U mutation
at the 3` splice site blocks the interaction of one or several
of these proteins, thus preventing the recognition of the branch site
by the U2 snRNP particle. Experiments are currently in progress to
understand the molecular mechanism by which the C → U mutation at
position -3 upstream of the 3` splice site prevents the splicing
of intron 10.