©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
Inefficient Spliceosome Assembly and Abnormal Branch Site Selection in Splicing of an HIV-1 Transcript in Vitro(*)

(Received for publication, June 9, 1995)

Helle Dyhr-Mikkelsen (§) Jørgen Kjems (¶)

From the Department of Molecular Biology, University of Aarhus, DK-8000 Aarhus C, Denmark

ABSTRACT

Continuous replication of human immunodeficiency virus type 1 (HIV-1) requires balanced expression of spliced and nonspliced mRNAs in the cytoplasm. This process is regulated post-transcriptionally by the viral-encoded Rev protein. An important prerequisite for Rev responsiveness is the presence of weak splice sites in the viral mRNA. We have investigated the splicing of the second intron of the HIV-1 Tat/Rev transcript in vitro and show that the 3`-splice site region is responsible for the inefficient splicing of the HIV-1 transcript. In contrast, the HIV-1 5`-splice site is highly functional in combination with a heterologous 3`-splice site. Incubation of the HIV-1 transcript in nuclear extract leads to a rapid accumulation of 50 S nonproductive pre-spliceosome complexes. These complexes contain mainly U1 and U2 small nuclear ribonucleoproteins and are formed independently of the presence of the downstream 3`-splice site. The HIV-1 transcripts, which do proceed through the first splicing step, utilize primarily a uridine as the branch acceptor nucleotide. Sequence comparison with other HIV-1 introns suggests that nucleotides other than adenosines are commonly used as branch points in these viruses.


INTRODUCTION

Most mammalian genes are interrupted by introns which are removed during processing of the primary RNA transcripts in the nucleus. The cellular splicing machinery recognizes conserved sequences near the two splice sites and removes the intron prior to nuclear transport to the cytoplasm. This is a complex process involving the assembly of U1, U2, U5, and U4/U6 small nuclear ribonucleoproteins (snRNPs)(^1) on the pre-mRNA (reviewed in (1) and (2) ). Initially, U1 snRNP interacts with a conserved sequence at the 5`-splice site, U2 snRNP then binds to the branch point region just upstream from the 3`-splice site, and the remaining U5 and U4/U6 snRNPs enter this complex as a tri-snRNP particle to form the complete spliceosome. In addition to the snRNPs, a number of other auxiliary protein factors play important roles in spliceosome assembly. In particular, ASF/SF2 and U2AF assist the binding of U1 and U2 snRNPs to conserved sequences at the 5`-splice site and the branch point region, respectively, and SC-35 is critical for the recognition of the 3`-splice site (reviewed in (1) ). The RNA splicing process is generally very efficient, leading to the export of only one major mRNA species to the cytoplasm. However, the expression of a substantial number of mRNAs is tissue-specific and developmentally regulated by alternative splicing. Differentiated splicing products may be obtained by a number of different mechanisms including the utilization of alternative 5`- and 3`-splice sites, exon and intron inclusion or skipping, and mutually exclusive exons, all of which have been found in biological systems(3) . The complexity of the splicing reaction makes it a well-suited target for regulation of gene expression, and the mechanisms involved in biological systems appear to be diverse(3) .

Retroviruses have evolved a post-transcriptional regulatory system based on intron retention, in order to express multiple proteins from the same promoter. Crucial for the life cycle of all retroviruses is a balanced expression of an unspliced mRNA of about 9 kb and a singly spliced mRNA of about 4 kb, encoding the Gag/Pol and Env proteins, respectively. Most of our knowledge about elements controlling this differential splicing comes from studies of avian sarcoma viruses. In this group of viruses the ratio between genomic and singly spliced mRNAs appears to be constitutive and mainly regulated by a suboptimal 3`-splice site (4, 5, 6, 7) and a negative regulator of splicing (NRS element), which acts in cis to decrease the splicing efficiency of the viral transcript(8, 9) .

In complex retroviruses, which include lentiviruses (e.g. HIV-1), spumaviruses, and human T-cell leukemia virus, the post-transcriptional regulation of splicing appears to be more complex. In addition to the two classes of mRNA found in all retroviruses, the complex viruses express a major class of approximately 2-kb-long mRNAs, which encode a number of small regulatory proteins. Best studied are the HIV-1 regulatory proteins Tat and Rev, both of which are essential for virus propagation. Tat is a transcriptional activator, whereas Rev appears to function only at a post-transcriptional level, up-regulating the appearance of unspliced and singly spliced mRNAs in the cytoplasm (reviewed in (10) ). There has been some controversy about what level of gene expression is subject to Rev regulation. Rev may function directly at the level of mRNA splicing(11, 12, 13, 14, 15) , mRNA stability and transport(16, 17, 18, 19, 20) , and/or translation(21, 22) . Although some of these functions may be closely coupled, this suggests that Rev is a multifunctional protein.

The specificity of Rev relies on direct binding to the Rev response element (RRE) and the presence of cis-acting repressive sequences within the transcript. In contrast to the RRE, which is a well defined RNA element located at the start of the env gene, the identity of cis-acting repressive sequences is less well understood. These elements may constitute nuclear retention or instability elements and have been mapped to the Gag, Pol, and Env regions(23, 24, 25, 26, 27) .

Based on the observation that the introduction of weak splice sites into an RRE-containing beta-globin gene renders the mRNA Rev responsive, it has been suggested that weak splice sites may function as nuclear retention elements(11) . This interpretation is supported by a recent study showing that the integrity of the splice sites in HIV-1 mRNA is necessary for Rev regulation of HIV-1 gene expression(14) . This implies that Rev-mediated regulation of HIV-1 gene expression requires an intrinsically inefficient splicing process. A search for splicing regulatory elements has identified a cis-acting repressor element within the first Tat-coding exon that suppresses splicing of the upstream intron in vitro and in vivo(28) . This element is position-dependent, but works in the context of a heterologous intron. The cis elements controlling the splicing of the second intron of the Tat/Rev transcript have been investigated in vivo, and it was concluded that a nonoptimal 3`-splice site signal is the main determinant for inefficient splicing(29) . We have studied the splicing of the same HIV-1 intron and show that the 5`-splice site region directs a rapid accumulation of a 50 S complex, containing U1 and U2 snRNPs, and that the 3`-splice site is inefficiently recognized by the splicing apparatus. A possible role for the 50 S complex could be to retain the mRNA in the nucleus in the absence of Rev protein.


EXPERIMENTAL PROCEDURES

Construction of Plasmids

In the following, an asterisk (*) denotes that the restriction site has been blunt-ended with Klenow enzyme prior to ligation. The cloning scheme is indicated in Fig. 1. The pPIP7.A plasmid contains an artificial transcription unit including optimal splice signals(30) . pgTat-CMV contains a 2.7-kb TaqI fragment including the Tat gene of the HIV-1 strain HXB3 (nt 5392-8039; numbered according to Ratner et al.(31) ) behind a CMV promoter(11) . pgTat-CMV3 and pgTat-CMV4 are derived from pgTat-CMV by deleting the SspI*-BglII* (nucleotides 5737-7198) and SspI*-HindIII* (nucleotides 5737-7718) fragments within the intron, respectively(11) . pTAT3 and pTAT4 were constructed by cloning the BamHI-XhoI fragment of pgTat-CMV3 and pgTat-CMV4, respectively, into the BamHI-SalI sites of pBS+ (Stratagene). pTATPIP was constructed by replacing the SacI-HincII fragment of pPIP7.A with the 374-bp SacI-AluI fragment of pTAT4 containing the 5`-splice site. pPIPTAT was constructed by replacing the HincII-HindIII fragment of pPIP7.A with the 356-bp AluI-HindIII fragment of pTAT4 containing the 3`-splice site. pDeltaTATPIP was constructed by replacing the EcoRI*-SalI* fragment of pPIP7.A with a 260-bp RsaI-SspI* fragment of pgTat-CMV (essentially the same fragment as the RsaI-AluI fragment of pTAT4, see Fig. 1), containing the 5`-splice site. pPIPDeltaTAT was constructed by replacing the SalI*-HindIII* fragment of pPIP7.A with a 135-bp RsaI-TaqI* fragment of pTAT4, containing the 3`-splice site.
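As a quick consistency check on the coordinates quoted above, the following sketch computes the size of the HIV-1 TaqI fragment and of the two intron deletions. It assumes that the quoted nucleotide ranges are inclusive and ignores the few nucleotides gained or lost when restriction sites are blunt-ended; the function and variable names are illustrative only and do not come from the original work.

```python
def span_length(first_nt, last_nt):
    """Length of a region given inclusive nucleotide coordinates (assumption)."""
    return last_nt - first_nt + 1

# Coordinates quoted in the text (numbering according to Ratner et al.).
taq_fragment = span_length(5392, 8039)   # TaqI fragment in pgTat-CMV
deletion_cmv3 = span_length(5737, 7198)  # SspI-BglII deletion (pgTat-CMV3)
deletion_cmv4 = span_length(5737, 7718)  # SspI-HindIII deletion (pgTat-CMV4)

# HIV-1 sequence left from the TaqI fragment after each deletion.
remaining = {
    "pgTat-CMV3": taq_fragment - deletion_cmv3,
    "pgTat-CMV4": taq_fragment - deletion_cmv4,
}

print(taq_fragment, deletion_cmv3, deletion_cmv4, remaining)
```

Under these assumptions the TaqI fragment comes to roughly 2.65 kb, in line with the 2.7 kb quoted, and the larger deletion removes about 0.5 kb more of the intron than the smaller one.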


Figure 1: Constructs. A, pTAT3 and pTAT4 derive from pgTat-CMV, which contains part of the HIV-1 HXB-3 genome. The TaqI fragment of pgTat-CMV, which includes the Tat/Rev intron flanked by 234 and 84 bp of the 5`- and 3`-Tat-coding exons, respectively, was shortened by deleting parts of the intron. The resulting fragments were inserted downstream of a T7 promoter to yield pTAT3 and pTAT4. The RRE remains intact in pTAT3 but is absent in pTAT4. B, structure of the precursor RNA transcribed from pPIP7.A, pTAT4 and the chimerical constructs pPIPTAT, pPIPDeltaTAT, pTATPIP, and pDeltaTATPIP (the RNA transcripts are denoted by the plasmid names without the p prefix). PIP7.A contains a modified version of the first intron in the major late transcript of adenovirus(30) . The PIPTAT transcript contains the 5`-half of PIP7.A RNA and the 3`-half of TAT4 RNA. The TATPIP transcript contains the 5`-half of TAT4 RNA and the 3`-half of PIP7.A RNA. PIPDeltaTAT and DeltaTATPIP RNAs are similar to PIPTAT and TATPIP RNAs but with shorter HIV-1 splice site regions. TAT5`ss contains a truncated Tat/Rev gene including only the 5`-exon and 80 bp of the downstream intron of the TAT4 construct. Positions of restriction sites used for cloning are indicated, and segments originating from PIP7.A and TAT4 constructs are denoted with black and white bars, respectively. Exons and introns are denoted with thick and thin boxes, respectively, and major splice sites are indicated by open circles (5`-splice sites) and closed circles (3`-splice sites). The in vitro splicing activity of each construct is indicated (+, 50-70% splicing turnover; -, less than 5% turnover). The constructs are drawn to the scales indicated (note that the scale of Panel B is 5 times enlarged compared to Panel A). Numbering is done according to Ratner et al.(31) . For details of vector construction, see "Experimental Procedures."



Preparation of RNA Transcripts

Pre-mRNAs were synthesized in vitro by T7 RNA polymerase transcription. The plasmid templates were generally linearized with HindIII, except for pPIPDeltaTAT, which was linearized with AluI, and TAT5`ss RNA, which was synthesized from the pTAT4 template that had been linearized with AvaII and treated with Klenow enzyme. In some experiments pTAT4 was linearized with Sau3A, HaeIII, and AvaI, resulting in truncated 3`-exons containing 80, 33, and 19 nucleotides, respectively. Transcription of radioactively labeled and capped mRNAs, used for in vitro splicing analysis, was done as described previously(12) , and the RNAs were purified on 4% polyacrylamide denaturing gels. Biotinylated RNAs, used for streptavidin affinity purification, were synthesized in 100-µl reaction mixtures, containing 5 µg of linearized plasmid, 40 mM Tris/HCl (pH 7.4), 6 mM MgCl2, 4 mM spermidine, 10 mM dithiothreitol, 40 units of RNasin, 0.7 mM ATP, 0.7 mM UTP, 0.7 mM GTP, 0.7 mM CTP, 0.07 mM biotin-11-UTP (10% of total UTP), 1 µCi of alpha-35S-UTP (Amersham Corp., 3000 Ci/mmol), and 200 units of T7 RNA polymerase. The RNAs were separated from unincorporated nucleotides on a Sephadex G-50 spin column and purified on 4% polyacrylamide denaturing gels. The final concentration of the RNA was calculated from the specific activity of the incorporated 35S label.
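The yield calculation mentioned at the end of this paragraph can be sketched as follows. This is a generic estimate and not necessarily the exact procedure used here: it assumes that the radioactive tracer contributes negligible mass, that unmodified UTP and biotin-11-UTP form one additive pool and are incorporated equally well, and that the number of uridines per transcript is known. The counts and the uridine number in the example are hypothetical.

```python
def rna_yield_pmol(cpm_in_rna, cpm_input, utp_mM, biotin_utp_mM,
                   volume_ul, u_per_transcript):
    """Estimate transcript yield (pmol) from the fraction of label incorporated."""
    # 1 mM x 1 ul = 1 nmol, so this is the total UTP pool in nmol.
    utp_pool_nmol = (utp_mM + biotin_utp_mM) * volume_ul
    # Fraction of the input label recovered in purified RNA.
    fraction_incorporated = cpm_in_rna / cpm_input
    # The same fraction of the cold UTP pool is assumed to be incorporated.
    u_incorporated_pmol = fraction_incorporated * utp_pool_nmol * 1000.0
    # Each transcript contains u_per_transcript uridines.
    return u_incorporated_pmol / u_per_transcript

# Hypothetical counts and uridine content; concentrations and volume
# are those of the 100-ul reaction described in the text.
example_yield = rna_yield_pmol(
    cpm_in_rna=50_000, cpm_input=1_000_000,
    utp_mM=0.7, biotin_utp_mM=0.07,
    volume_ul=100, u_per_transcript=100,
)
print(round(example_yield, 2))
```

With 5% of the label incorporated, the 77-nmol UTP pool gives 3850 pmol of incorporated uridine, or 38.5 pmol of a transcript carrying 100 uridines.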

In Vitro Splicing and Complex Gel Analysis

In vitro splicing was performed essentially as described previously (12) . For denaturing gel analysis, the mRNA was incubated for 90 min at 30 °C, whereas for native gel runs and sucrose gradient centrifugations, samples were incubated for only 20 min at 30 °C. To protect the RNA from 3`-end exonucleases in the nuclear extract, 10 µg of tRNA were included in each splicing reaction. Splicing products were analyzed on denaturing gels, containing 6% polyacrylamide, 8 M urea, and 50-100 mM Tris borate (pH 8.3), and splicing complexes were analyzed by loading 5 µl of the splicing reactions onto native gels containing 2.5% 80:1 acrylamide/bisacrylamide and 50 mM Tris/glycine (pH 8.8).

Purification of Splicing Complexes and Northern Blotting

Streptavidin affinity purification of biotinylated RNA and probing for U1, U2, U4, U5, and U6 snRNPs were performed as described by Kjems and Sharp(32) . U11 and U12 snRNA antisense RNAs were prepared as described by Wassarman and Steitz(33) .

Debranching, Primer Extension, and Sequencing of RNA Products

Debranching of lariats was done as described in Ruskin and Green (34) by reincubating the phenol-extracted splicing reaction in a 25-µl mixture, containing 20 µl of debranching buffer (50 mM HEPES (pH 7.8), 100 mM KCl, 0.1 M EDTA), and 5 µl of the S100 fraction of HeLa cell extract (35) for 30 min at 30 °C, followed by phenol extraction and precipitation. Primer extension analysis was performed essentially as described in Kjems et al.(36) , using 1 pmol of 5`-end-labeled primer (5`-GTCGGGTCCCCTCGGG-3`), complementary to positions 15-30 downstream from the 3`-splice site in TAT4, and gel-purified RNA templates. To obtain sequence information, 1 mM ddTTP or ddATP, or 0.5 mM ddGTP or ddCTP, was included in each of the respective sequencing reactions. The TATPIP exon-exon product was purified and identified by the following procedure: 1 µg of nonradioactive TATPIP was spliced in a 200-µl splicing reaction and co-electrophoresed with a radioactively labeled sample on a denaturing gel. RNA co-migrating with the exon-exon band was purified, and the sequence across the ligation site was determined by primer extension as described above. Preparation of lariat TAT3 and TAT4 RNAs for primer extension was done similarly.


RESULTS

A Nonoptimal HIV-1 3`-Splice Site Renders Splicing an Inefficient Process in Vitro

Controlling the cytoplasmic appearance of non-, single-, and double-spliced mRNAs is an important aspect of the HIV-1 life cycle. To determine elements important for this regulation, we have investigated the splicing reaction of the second intron of the double-spliced Tat and Rev transcript from HIV-1 strain HXB-3 using an in vitro splicing assay. This intron remains unspliced in transcripts encoding Gag/Pol, Env, Vif, Vpr, and Vpu, and its retention is subject to Rev control. The RNA transcripts TAT3 and TAT4, used in this study, contain the second and third exons encoding the Tat protein and different truncations of the second intron (Fig. 1). In the absence of Rev, these deletions did not influence the ratio between spliced and unspliced cytoplasmic RNA in vivo, suggesting that all sequences important for Rev independent splicing remain intact(11, 32) .

The splicing efficiency was investigated by incubating the transcript in a nuclear extract from HeLa cells. This is an appropriate system for studying HIV-1 regulation of splicing, based on the observation that HIV-1 precursor RNA exhibits a similar splicing pattern, when expressed in HeLa cells and T-cell lines(37) . Under optimal splicing conditions, less than 2% of TAT4 was converted into a lariat product, migrating above the precursor RNA (Fig. 2A, lanes 1 and 2). Based on a primer extension analysis (see below) this band contains intermediate lariat (IL; intron bound to the 3`-exon) and possibly also lariat intron alone. The presence of the RRE within the intron (TAT3) had no effect on splicing efficiency compared to that observed using TAT4 RNA (result not shown).


Figure 2: In vitro splicing analysis of TAT4 and chimerical constructs. A, autoradiograms showing the splicing products of TAT4 (lanes 1 and 2), PIPTAT (lanes 3 and 4), TATPIP (lanes 5 and 6), PIPDeltaTAT (lanes 7 and 8), DeltaTATPIP (lanes 9 and 10), and TAT5`ss (lanes 11 and 12). - and + denote lanes with samples that have been incubated at splicing conditions for 90 min in the absence and presence of ATP, respectively. The bands were identified as follows. The intermediate lariat RNA (IL) and lariat RNA (L) products were identified by their abnormal behavior on gels containing different salt and acrylamide concentrations. In addition, gel analysis of debranched splicing reactions was performed on TATPIP-derived lariats (see Panel B). The linear splicing products including the unprocessed precursor RNA (P), ligated exon-exon RNA (EE), and 5`-exon RNA (5E) were identified on the basis of apparent size as compared to molecular size markers, and the TATPIP-derived exon-exon product was sequenced by primer extension (data not shown). The samples were loaded on a 6% polyacrylamide gel containing 75 mM Tris borate (pH 8.3). B, lariat identification of TATPIP splicing products on a 6% gel containing 100 mM Tris borate. -D denotes untreated splicing products; +D denotes splicing products treated with debranching extract. M denotes the lane containing single-stranded DNA size marker; numbers on the left indicate the molecular sizes in base pairs. Identities of individual bands are indicated. The 5`-exon of TATPIP generally migrates as two bands of variable intensities. Sequence analysis of the TATPIP-specific exon-exon product showed no sign of alternative 5`-splice site usage, suggesting that the lower 5`-exon band may result from partial RNA degradation of the upper 5`-exon band.



To study the efficiency of the 5`- and 3`-splice sites of HIV-1 in vitro a number of chimerical constructs between the PIP7.A, a construct optimized for splicing, and TAT4 RNAs were constructed (Fig. 1). When the 5`-half of PIP7.A including the 5`-splice site was substituted with the 5`-half of HIV-1 mRNA (TATPIP; Fig. 1), splicing became highly efficient, yielding more than 70% splicing products (Fig. 2A, lanes 5 and 6). The identities of the branched splicing products were confirmed both by a change in mobility when altering the ionic strength of the gel and by debranching (Fig. 2B). Purification and direct sequencing of the exon-exon RNA product confirmed that the normal 5`-splice site of the HIV-1 RNA was correctly joined to the 3`-splice site of the PIP7.A RNA (result not shown).

When substituting the 3`-half of PIP7.A with the 3`-half of TAT4 (PIPTAT; Fig. 1) splicing was as inefficient as observed for TAT4 (Fig. 2A, lanes 3 and 4). Similar results were obtained when shorter regions of the HIV-1 transcript, containing the 5`- or 3`-splice site regions, were inserted into PIP7.A to replace the corresponding splice site (DeltaTATPIP and PIPDeltaTAT, respectively; Figs. 1 and 2A, lanes 7-10), although a slight increase in splicing of PIPDeltaTAT RNA was observed as compared to PIPTAT and TAT4 (compare Fig. 2A, lanes 2, 4, and 8). These data imply that the region containing the 3`-splice site of the HIV-1 transcript is responsible for the inefficient splicing in vitro.

It has previously been shown that an element positioned downstream of the 3`-splice site of the first Tat intron inhibits the splicing of the upstream intron(28) . To investigate the possibility that sequences within the 3`-exon flanking the second Tat/Rev intron function as inhibitory elements, the splicing efficiency of the TAT4 transcripts truncated at different positions within the 3`-exon was analyzed. No significant differences in splicing efficiency were detected using constructs containing 87, 80, 33, and 19 nucleotides of the 3`-exon, implying that no cis-acting inhibitory elements are present in the 3`-exon of the TAT4 transcript (results not shown).

Identification of the Branch Point Sequence in the HIV-1 Intron

Examination of the sequence upstream from the 3`-splice site in the HIV-1 intron revealed no obvious branch site consensus. To identify the branched nucleotide utilized for the inefficient lariat formation in vitro, 1 µg of TAT3 and TAT4 transcripts labeled to low specific activity was incubated under splicing conditions for an extended period of time to increase the yield of splicing products. Approximately 2% of the radioactive label incorporated in TAT3 or TAT4 pre-mRNA transcripts appeared in bands corresponding to branched RNAs. The bands were excised from the gel, extracted, and annealed to a primer complementary to a region within the common 3`-exon of TAT3 and TAT4. When extended by reverse transcriptase, specific stops were observed as compared to a control reaction containing a template of unspliced TAT4 RNA. This suggests that the observed splicing product corresponds to an intermediate lariat. Surprisingly, reverse transcription was almost completely terminated at a cytidine located 47 nucleotides upstream from the 3`-splice site in TAT3 and TAT4 (Fig. 3A, lanes 1 and 2), whereas no termination was observed at this position in the control reaction (Fig. 3A, lane 3). Since reverse transcription generally is arrested one nucleotide 3` to a branched nucleotide, this strongly suggests that the sequence UACUUUC is recognized as the branch site and that the U at position 6 of this sequence (the uridine immediately 5` to the cytidine at which extension stops) is branched to the 5`-end of the intron (Fig. 3B). In addition, weaker bands were observed at nucleotides more proximal to the 3`-splice site, which may represent alternative branch site nucleotides.
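The inference from the primer extension stop to the branched nucleotide can be illustrated with a short sketch. The intron sequence used here is a toy placeholder, not the HIV-1 sequence: only the UACUUUC motif and the distance of 47 nucleotides come from the text, and the convention that a distance of 1 denotes the last intron nucleotide is an assumption made for the example.

```python
def branch_from_rt_stop(intron, stop_distance):
    """Infer the branched nucleotide from a reverse transcriptase stop.

    intron: RNA sequence (5'->3') ending at the 3'-splice site.
    stop_distance: position of the stop counted back from the 3' end of
        the intron (1 = last intron nucleotide; an assumed convention).
    Returns (branch_index, branch_nucleotide, stop_nucleotide).
    """
    if not 1 <= stop_distance < len(intron):
        raise ValueError("stop_distance must lie inside the intron")
    stop_index = len(intron) - stop_distance
    # Extension is arrested one nucleotide 3' to the branch, so the
    # branched nucleotide lies one position 5' of the stop.
    branch_index = stop_index - 1
    return branch_index, intron[branch_index], intron[stop_index]

# Toy intron: arbitrary filler around the UACUUUC motif, built so that
# the final C of the motif lies 47 nucleotides from the 3' end.
toy_intron = "GGGGG" + "UACUUUC" + "CU" * 22 + "AG"

result = branch_from_rt_stop(toy_intron, 47)
print(result)
```

In this toy case the stop falls on the C that ends the motif and the inferred branch is the uridine just 5` of it, which mirrors the reasoning applied to lanes 1 and 2 of Fig. 3A.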


Figure 3: Characterization of branch sites used in the in vitro splicing reaction of TAT3 and TAT4. A, reverse transcriptase was used to extend cDNA from a primer annealing to the 3`-exon of gel-purified intermediate lariat splicing product from 1 µg of TAT3 and TAT4 mRNA (lanes 1 and 2, respectively) or gel-purified unspliced TAT4 RNA (lane 3). Lane 2A is a longer exposure of lane 2. To identify the termination sites, untreated TAT4 RNA was sequenced by primer extension in the presence of dideoxy-nucleotides (lanes A, G, C, and U; nucleotides indicate the corresponding RNA sequence). BP indicates the major termination site for reverse transcriptase at the C residue within a putative branch site sequence indicated. B, alignment of 3`-splice site sequences of HIV-1 HXB-3. The sequences originate from the second intron of the Tat and Rev major transcripts (REV 2/TAT 2; analyzed in this report), the 3`-splice site of the Env intron (ENV), and the first Rev intron (REV 1A/1B). The major branch site is denoted with an arrow, and homologous sequences in other introns are underlined. Small and capital letters indicate nucleotides which are variable and highly conserved, respectively, among different HIV-1 and HIV-2 strains. Small letters above the sequence indicate common nucleotide changes within the putative branch point consensus sequences in other HIV-1 and HIV-2 strains. The asterisk indicates the position of the 3`-splice site of the HIV-1 precursor mRNA, and positions are numbered according to Ratner et al.(31) .



Splicing of the HIV-1 Intron Leads to Accumulation of a 50 S Pre-spliceosome Complex Containing Mainly U1 and U2 snRNPs

The splicing process requires a stepwise assembly of the snRNPs and other auxiliary splicing factors on the pre-mRNA in a highly ordered fashion. The splicing complexes formed on PIP7.A, TAT4 RNA, and chimerical constructs, when incubated in nuclear extract, were analyzed by native gel electrophoresis. As expected, PIP7.A and TATPIP, which both splice efficiently, formed pre-spliceosomes (complex A) and spliceosomes (complex B) very efficiently in the presence of ATP (Fig. 4A). In contrast, both TAT4 and PIPTAT formed only one complex, independent of ATP (Fig. 4A).


Figure 4: Analysis of splicing complexes. A, complex gel showing the splicing products of the indicated mRNAs. 32P-labeled pre-mRNA was incubated under splicing conditions for 20 min in the absence (-) or presence (+) of ATP and loaded directly on a native polyacrylamide gel. The H, A, and B complexes are indicated for PIP7.A and TATPIP at the left- and right-hand side of the autoradiogram, respectively. B, sucrose gradient profiles of a 200-µl splicing reaction containing approximately 1 µg of biotinylated PIP7.A or TAT4 pre-mRNA labeled with 35S to low specific activity. The positions of the 40 and 60 S PIP7.A-specific complexes are indicated. Due to the large amount of mRNA used in this type of preparative splicing reaction the 60 S spliceosomes formed on PIP7.A are only visible as a shoulder on the 40 S peak(32) . C, analysis of splicing complexes. Northern blot showing the snRNA composition of affinity-purified splicing complexes of the individual fractions shown in B. Splicing complexes formed on biotinylated RNA (+) were affinity purified from pools of fractions 10-12, 13-15, 16-18, and 19-21 of the sucrose gradient profile shown in Panel B. The sedimentation range of these fractions is approximately 40 S to 60 S as indicated above(32) . Unbiotinylated pre-mRNA (-) was used as a control for unspecific binding of snRNA to streptavidin beads and derived from fractions 10-12 and 19-21. Recovered complexes were denatured and the eluted snRNAs were electrophoresed on an 8% polyacrylamide-8 M urea gel. The RNA was electroblotted onto nitrocellulose membrane and probed with a mixture of antisense U1, U2, U4, U5, and U6 snRNA that yielded approximately equal autoradiographic signals for the five snRNAs in the 60 S peak of the PIP7.A splicing reaction. The identities of the bands are indicated on the left side.



To investigate the content of these complexes in more detail, biotinylated mRNAs were incubated under splicing conditions and fractionated on sucrose gradients. The PIP7.A RNA sedimented as a 40 S peak, corresponding to the A complex, and a 60 S ATP-dependent peak, corresponding to the B complex. (The 60 S peak was visible only as a shoulder on the 40 S peak of the sucrose gradient profile shown in Fig. 4B due to excess mRNA in this type of preparative gradient.) In contrast, the TAT4 mRNA sedimented in a peak at around 50 S (Fig. 4B). This peak formed independently of ATP (data not shown). Specific splicing complexes were purified from individual fractions of the sucrose gradient by streptavidin affinity chromatography and tested for the content of snRNA by Northern blotting, probing with a mixture of antisense snRNAs (Fig. 4C). As expected, the 40 S complex of the PIP7.A construct contained mostly U1 snRNP and some U2 snRNP, and the 60 S complex of the PIP7.A construct contained all five snRNPs, corresponding to the fully assembled spliceosome. In contrast, the TAT4-specific 50 S peak contained only U1 and U2 snRNPs and no U4/U6 and U5 snRNPs, suggesting that the low level of HIV-1 intron splicing is due to an inefficient assembly of the spliceosome.

To measure the kinetics of the HIV-1-specific 50 S complex formation, a splicing reaction containing TAT4 was incubated for different periods of time prior to complex purification and Northern blot analysis. The 50 S peak containing U1 and U2 snRNP was fully formed within 10 min, and extended incubation did not change the complex composition (Fig. 5).


Figure 5: Northern blot showing the time course of the snRNA composition of splicing complexes formed on TAT4. TAT4 pre-mRNA was incubated under splicing conditions for 10 and 120 min and analyzed as described in the legend to Fig. 4. PIP7.A was included for comparison. Splicing complexes of the pre-mRNAs were affinity-purified from pools of fractions 11-12, 15-16, and 19-20, corresponding to an approximate size of 40, 50, and 60 S, respectively. + and - indicate the presence and absence of biotin in the RNA, respectively, and the identities of the bands are indicated on the left side.



It has been suggested that the U11 and U12 snRNPs may play an important role in regulating the splicing in Rous sarcoma virus(38) . To test a putative role of these snRNPs in HIV-1 splicing, the Northern blots shown in Fig. 4C were reprobed with a mixture of antisense U11 and U12 snRNA probes. Although both snRNPs were easily detected in nuclear extract, neither of the snRNAs appeared to be specifically associated with PIP7.A or TAT4 pre-mRNA (results not shown).

The 3`-Splice Site Region Is Dispensable for U1 and U2 snRNP Binding

To characterize the regions within the HIV-1 sequence responsible for U1 and U2 snRNP binding, Northern blot analysis was performed on chimerical mRNA constructs (Fig. 6, A and B). The snRNP content of individual fractions of the TATPIP gradient looked very similar to that of PIP7.A except that the 40 S complex of the chimerical construct contained relatively more U2 snRNP (compare Fig. 4C, left panel, and Fig. 6A). In contrast, incubation of PIPTAT pre-mRNA in nuclear extract produced a complex containing mainly U1 snRNP and relatively less U2 snRNP as compared to TAT4 (compare Fig. 4C, right panel, and Fig. 6B). This suggests that the 5`-half of the HIV-1 transcript binds not only U1 snRNP but also U2 snRNP. To analyze the complexes formed on a construct containing the HIV-1 5`-splice site region alone, Northern blot analysis was performed using TAT5`ss, which lacks the 3`-exon and most of the intron (Fig. 1). This construct does not produce any detectable splicing products when incubated in nuclear extract (TAT5`ss, Fig. 2A). An ATP-independent peak, sedimenting at around 40 S, was detected (results not shown), and Northern blot analysis of this peak revealed the presence of U1 and U2 snRNPs (Fig. 6C). Considering the differences in length of the transcripts, this 40 S peak may correspond to the 50 S peak observed for the TAT4 transcript, suggesting that the 3`-splice site region of TAT4 is dispensable for stable U1 and U2 snRNP interactions with the HIV-1 transcript.


Figure 6: Northern blot showing the snRNA composition of affinity-purified splicing complexes formed on TATPIP (A), PIPTAT (B), and TAT5`ss (C). Splicing complexes were separated on sucrose gradients and analyzed on Northern blots using antisense snRNA probes, as described in the legend to Fig. 4. Splicing complexes formed on biotinylated TATPIP and PIPTAT RNA were affinity purified from pools of fractions 13-15, 16-18, and 19-21, and control lanes of unbiotinylated RNA derived from fractions 13-15 and 19-21. TAT5`ss-specific complexes were purified from fractions 9-11, 15-17, and 18-20 for the biotinylated RNA (+), and fractions 12-14 and 18-20 for the unbiotinylated RNA (-). The identities of the bands are indicated on the left side.




DISCUSSION

Balanced expression of differentially spliced mRNAs is an evolutionarily conserved feature among retroviruses. Studies of two distantly related retroviruses, Rous sarcoma virus and HIV-1, have revealed that the mechanisms controlling this balance are strikingly similar. Both viruses apparently use a combination of suboptimal 3`-splice site signals (4, 5, 6, 7, 29) (this report) and cis-acting negative regulators of splicing(8, 9, 28) .

The elements controlling the splicing of the second intron in the HIV-1 Tat/Rev transcript have been investigated previously in vivo(29) . In agreement with that report, we found that the 3`-splice site region of the HIV-1 Tat intron contains inefficient splicing signals for in vitro splicing, whereas the 5`-splice site was highly efficiently spliced to a heterologous 3`-splice site. Our analysis also suggests that the inefficiency of splicing is not controlled by cis-acting repressive sequences in the downstream exon, as observed for the 3`-splice site of the first Tat intron(28) .

A mammalian 3`-splice site consensus is composed of a highly conserved AG immediately upstream of the splice site, a continuous stretch of 7 or more pyrimidines (preferably uridines), just upstream from the AG nucleotides, and a branch point sequence YNYURAY (Y = pyrimidine, R = purine, N = any nucleotide), in which the highly conserved A is used as the branch point. During splicing the branch point sequence base pairs with GUAGUA in U2 snRNP, bulging out the branch point adenosine from the helix (reviewed in (1) ). Analysis of cellular genes in general shows that a long U-rich polypyrimidine tract can compensate for a poor branch site, and vice versa. However, inspection of the 3`-splice site of the second Tat/Rev intron reveals a very irregular polypyrimidine tract and no obvious branch point candidate. Improvement of the polypyrimidine tract, or introduction of a consensus branch site 29-35 nucleotides upstream from the 3`-splice site, increased the splicing efficiency in vivo, suggesting that both elements play an important role in maintaining a suboptimal 3`-splice site(29) .
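A search for the YNYURAY consensus described above can be sketched with a regular expression. This is only an illustration of how such a scan might be done, not a method used in this study; the function names are hypothetical and the two test sequences are short constructs built around a consensus motif (UACUAAC) and the HIV-1 motif (UACUUUC).

```python
import re

# IUPAC ambiguity codes needed for the branch point consensus (RNA alphabet).
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "Y": "[CU]",    # pyrimidine
    "R": "[AG]",    # purine
    "N": "[ACGU]",  # any nucleotide
}

def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC_RNA[symbol] for symbol in consensus)

def find_consensus_sites(rna, consensus="YNYURAY"):
    """Return (start, matched sequence) for every site, overlaps included."""
    # A lookahead with a capture group lets overlapping sites be reported.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(rna)]

consensus_hits = find_consensus_sites("GGUACUAACGG")
hiv_motif_hits = find_consensus_sites("GGUACUUUCGG")
print(consensus_hits, hiv_motif_hits)
```

The consensus motif is found once, whereas the HIV-1 motif is not reported because it carries pyrimidines where the consensus requires a purine at position 5 and the adenosine at position 6.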

Surprisingly, we found that the uridine, underlined within the sequence UACUUUC, functioned as the major branch site in the formation of the lariat splicing product in vitro. Although naturally occurring branch sites may differ at several positions from the consensus, the branch point adenosine is highly conserved. Mutational studies have shown that the occurrence of uridines at positions 5 or 6 in a branch point motif severely decreases the splicing efficiency (39, 40). It is possible that the capacity of the remaining nucleotides to form 5 Watson-Crick base pairs with the U2 snRNA can partially compensate for the lack of an adenosine at the branch point position.
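The base pair count mentioned above can be reproduced with a short calculation. This is a sketch under two assumptions that are not stated explicitly in the text: that the branch nucleotide occupies the sixth position of the heptamer (as the adenosine does in YNYURAY) and is bulged out, and that the remaining six nucleotides align antiparallel to the U2 sequence GUAGUA quoted in this discussion. The function name and the `bulge_index` parameter are hypothetical.

```python
# U2 snRNA branch point binding sequence, written 5'->3' as quoted in the text.
U2_BRANCH_BINDING = "GUAGUA"

# Only canonical Watson-Crick pairs are counted (no G-U wobble pairs).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def count_watson_crick_pairs(branch_seq, bulge_index=5, u2=U2_BRANCH_BINDING):
    """Count Watson-Crick pairs between a branch sequence and U2.

    ASSUMPTION: the nucleotide at bulge_index (0-based) is bulged out and
    unpaired; the other nucleotides pair antiparallel with u2.
    """
    paired = branch_seq[:bulge_index] + branch_seq[bulge_index + 1:]
    if len(paired) != len(u2):
        raise ValueError("branch sequence must be one nucleotide longer than u2")
    partner = u2[::-1]  # read U2 3'->5' so it lines up with the branch 5'->3'
    return sum((b, p) in WATSON_CRICK for b, p in zip(paired, partner))


# UACUAAC is one sequence that fits the YNYURAY consensus.
print(count_watson_crick_pairs("UACUAAC"))
# UACUUUC is the HIV-1 sequence discussed in this report.
print(count_watson_crick_pairs("UACUUUC"))
```

Under these assumptions the consensus-type sequence forms six Watson-Crick pairs and the HIV-1 sequence forms five, losing only the pair at position 5. Choosing a different bulged position would change the count, so the result should be read as an illustration of the argument rather than a structural determination.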

Branching to a uridine residue has been observed in artificial systems (39) and in the splicing of the alternatively processed calcitonin/CGRP-I pre-mRNA (41). In a study using beta-globin mRNA mutated at the branch site it was demonstrated that all four nucleotides can serve as branch acceptors in the first step of splicing, with the following efficiencies: A > C > G > U (39). Alternatively, if an adenosine is present 1 nucleotide upstream of the normal branch site, this may function as a branch site (39). Since no adenosine is located adjacent to the branch point position in the Tat/Rev intron, branching may be forced to occur mainly at the uridine residue.

Inspection of sequences upstream from two other major 3`-splice sites in HIV-1 revealed a similar branch point sequence containing uridine at the putative branch point nucleotide (Fig. 3B). Several observations suggest that this resemblance is functionally important: the putative branch point sequences are highly conserved among different HIV-1 and HIV-2 strains; the rare substitutions that are observed do not significantly destabilize the base pairing with U2 snRNA; the branch points are positioned at approximately the same distance from the 3`-splice sites (47-52 nucleotides); and no other obvious homology is found between the sequences upstream of the different 3`-splice sites (Fig. 3B). It remains to be experimentally confirmed whether the putative branch sites indicated in Fig. 3B are functional in vivo.

Taking advantage of our in vitro approach, we were able to analyze the splicing complexes formed on the HIV-1 mRNA and thereby determine at what step the assembly of the spliceosome was inhibited. Most of the mRNA rapidly accumulated into a 50 S complex containing U1 and U2 snRNPs independent of the presence of ATP. A similar complex was formed when using an RNA lacking the 3`-exon and most of the intron, suggesting that both U1 and U2 snRNPs mainly interact with the 5`-splice site or surrounding regions. Early stages in spliceosome formation have been investigated on other mRNAs. An ATP-independent commitment complex containing U1 snRNP, U2AF, and possibly also SF2/ASF has been characterized as the initial product of spliceosome assembly (30, 42, 43). Formation of this complex requires the presence of both 5`- and 3`-splice sites and the polypyrimidine tract. The lack of a consensus 3`-splice site in the HIV-1 intron suggests that the 50 S complex formed may not represent a fully assembled commitment complex or A complex. In particular, the polypyrimidine tract may be too irregular for efficient binding of U2AF. Interestingly, both the TAT4 RNA and the 5`-splice site region alone bind U1 and U2 snRNP, whereas a construct containing the 5`-splice site region of PIP7.A and the 3`-splice site of TAT4 (PIPTAT) binds only U1 snRNP. An explanation for the U2 snRNP binding to the 5`-half of the TAT4 transcript, independently of the downstream 3`-splice site, could be that at least three functional 3`-splice sites, used in the expression of the Rev and Env genes, are present upstream of the 5`-splice site in the transcript. Even though none of these 3`-splice sites exhibit optimal polypyrimidine stretches or branch site sequences, it is possible that the binding of U2AF and U2 snRNP to these sites is stabilized through interactions with the U1 snRNP across the exon, as has been suggested for other exons (44). Alternatively, the U2 snRNP binding may be a result of an increased nonspecific association with the HIV-1 mRNA, as observed previously with other mRNAs (42, 43, 45).

Supposing that similar complexes are formed in vivo, what role could this early arrest in spliceosome assembly play in the context of Rev regulation? While the nascent transcript is being synthesized, heterogeneous nuclear RNPs, snRNPs, and auxiliary splicing factors will interact with the RNA. The RRE within the HIV-1 transcript probably also binds Rev protein at this stage. A logical reason for an inefficient second step of spliceosome assembly may be to accumulate high levels of nuclear unspliced HIV-1 mRNA partially assembled into spliceosomes, which are capable of interacting with Rev. Interestingly, several other lines of evidence suggest that Rev functions on mRNA in the context of U1 snRNP. By a genetic approach it has been shown that the base pairing between U1 snRNA and conserved sequences at the 5`-splice site is important for Rev responsiveness (13). A similar result was recently obtained in a yeast system, demonstrating that some spliceosome assembly steps are required before Rev can act on the transcript (15). These observations are consistent with results obtained in vitro, which have demonstrated that Rev can inhibit splicing of an RRE-containing mRNA that has been preincubated in nuclear extract in the absence of ATP. In contrast, Rev completely loses inhibitory activity if the spliceosome is preassembled in the presence of ATP prior to addition of Rev (32). These observations, and the data presented in this report, strongly suggest that Rev acts on the transcript after U1 snRNP binding but before assembly of a complete spliceosome. Maintenance of intrinsically inefficient splicing signals, such as a nonoptimal branch site, may therefore play a key role in timing Rev function.


FOOTNOTES

*
This work was supported in part by grants from the Danish Medical Research Council and Novo Nordisk's Fond. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

§
Supported by Aarhus University.

¶
Supported by the Danish Medical Research Council. To whom correspondence should be addressed: Dept. of Molecular Biology, University of Aarhus, C. F. Møllers Allé, Bldg. 130, DK-8000 Aarhus C, Denmark. Tel.: 45-8942-2686; Fax: 45-8619-6500; Kjems@biobase.dk.

(^1)
The abbreviations used are: sn, small nuclear; RNP, ribonucleoprotein; RRE, Rev response element; CMV, cytomegalovirus; dd, dideoxy; bp, base pair(s); kb, kilobase pair(s); SS, splice site; HIV-1, human immunodeficiency virus type I.


ACKNOWLEDGEMENTS

We are grateful to David D. Chang for providing the pgTat-CMV and pgTat-CMV3 plasmids. We thank Rita Rosendahl and Allan Jensen for technical assistance and Finn Skou Pedersen, Anne Tolstrup, Torben Heick Jensen, and Roger A. Garrett for discussions and critical reading of the manuscript.


REFERENCES

  1. Moore, M. J., Query, C. C., and Sharp, P. A. (1993) in The RNA World (Gesteland, R. F., and Atkins, J. F., eds) pp. 303-358, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
  2. Nilsen, T. W. (1994) Cell 78, 1-4
  3. Maniatis, T. (1991) Science 251, 33-34
  4. Katz, R. A., Kotler, M., and Skalka, A. M. (1988) J. Virol. 62, 2686-2695
  5. Katz, R. A., and Skalka, A. M. (1990) Mol. Cell. Biol. 10, 696-704
  6. Fu, X. D., Katz, R. A., Skalka, A. M., and Maniatis, T. (1991) Genes & Dev. 5, 211-220
  7. Berberich, S. L., and Stoltzfus, C. M. (1991) J. Virol. 65, 2640-2646
  8. McNally, M. T., Gontarek, R. R., and Beemon, K. (1991) Virology 185, 99-108
  9. McNally, M. T., and Beemon, K. (1992) J. Virol. 66, 6-11
  10. Cullen, B. R. (1992) Microbiol. Rev. 56, 375-394
  11. Chang, D. D., and Sharp, P. A. (1989) Cell 59, 789-795
  12. Kjems, J., Frankel, A. D., and Sharp, P. A. (1991) Cell 67, 169-178
  13. Lu, X. B., Heimer, J., Rekosh, D., and Hammarskjold, M. L. (1990) Proc. Natl. Acad. Sci. U. S. A. 87, 7598-7602
  14. Hammarskjold, M. L., Li, H., Rekosh, D., and Prasad, S. (1994) J. Virol. 68, 951-958
  15. Stutz, F., and Rosbash, M. (1994) EMBO J. 13, 4096-4104
  16. Emerman, M., Vazeux, R., and Peden, K. (1989) Cell 57, 1155-1165
  17. Felber, B. K., Hadzopoulou, C. M., Cladaras, C., Copeland, T., and Pavlakis, G. N. (1989) Proc. Natl. Acad. Sci. U. S. A. 86, 1495-1499
  18. Malim, M. H., Hauber, J., Le, S. Y., Maizel, J. V., and Cullen, B. R. (1989) Nature 338, 254-257
  19. Malim, M. H., and Cullen, B. R. (1993) Mol. Cell. Biol. 13, 6180-6189
  20. Fischer, U., Meyer, S., Teufel, M., Heckel, C., Lührmann, R., and Rautmann, G. (1994) EMBO J. 13, 4105-4112
  21. D'Agostino, D. M., Felber, B. K., Harrison, J. E., and Pavlakis, G. N. (1992) Mol. Cell. Biol. 12, 1375-1386
  22. Arrigo, S. J., and Chen, I. S. (1991) Genes & Dev. 5, 808-819
  23. Maldarelli, F., Martin, M. A., and Strebel, K. (1991) J. Virol. 65, 5732-5743
  24. Cochrane, A. W., Jones, K. S., Beidas, S., Dillon, P. J., Skalka, A. M., and Rosen, C. A. (1991) J. Virol. 65, 5305-5313
  25. Schwartz, S., Felber, B. K., and Pavlakis, G. N. (1992) J. Virol. 66, 150-159
  26. Rosen, C. A., Terwillinger, E., Dayton, A., Sodroski, J. G., and Haseltine, W. A. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 2071-2075
  27. Brighty, D. W., and Rosenberg, M. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 8314-8318
  28. Amendt, B. A., Hesslein, D., Chang, L. J., and Stoltzfus, C. M. (1994) Mol. Cell. Biol. 14, 3960-3970
  29. Staffa, A., and Cochrane, A. (1994) J. Virol. 68, 3071-3079
  30. Jamison, S. F., Crow, A., and Garcia, B. M. (1992) Mol. Cell. Biol. 12, 4279-4287
  31. Ratner, L., Haseltine, W., Patarca, R., Livak, K. J., Starcich, B., Josephs, S. F., Doran, E. R., Rafalski, J. A., Whitehorn, E. A., Baumeister, K., et al. (1985) Nature 313, 277-284
  32. Kjems, J., and Sharp, P. A. (1993) J. Virol. 67, 4769-4776
  33. Wassarman, K. M., and Steitz, J. A. (1992) Mol. Cell. Biol. 12, 1276-1285
  34. Ruskin, B., and Green, M. R. (1985) Science 229, 135-140
  35. Dignam, J. D., Martin, P. L., Shastry, B. S., and Roeder, R. G. (1983) Methods Enzymol. 101, 582-598
  36. Kjems, J., Brown, M., Chang, D. D., and Sharp, P. A. (1991) Proc. Natl. Acad. Sci. U. S. A. 88, 683-687
  37. Purcell, D. F., and Martin, M. A. (1993) J. Virol. 67, 6365-6378
  38. Gontarek, R. R., McNally, M. T., and Beemon, K. (1993) Genes & Dev. 7, 1926-1936
  39. Hornig, H., Aebi, M., and Weissmann, C. (1986) Nature 324, 589-591
  40. Reed, R., and Maniatis, T. (1988) Genes & Dev. 2, 1268-1276
  41. Adema, G. J., Bovenberg, R. A. L., Jansz, H. S., and Baas, P. D. (1988) Nucleic Acids Res. 16, 9513-9526
  42. Liao, X. C., Colot, H. V., Wang, Y., and Rosbash, M. (1992) Nucleic Acids Res. 20, 4237-4245
  43. Michaud, S., and Reed, R. (1991) Genes & Dev. 5, 2534-2546
  44. Hoffman, B. E., and Grabowski, P. J. (1992) Genes & Dev. 6, 2554-2568
  45. Wassarman, D. A., and Steitz, J. A. (1992) Science 257, 1918-1925
