(Received for publication, December 3, 1996, and in revised form, June 25, 1997)
From the Molecular Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, California 90089-1340
Sex-lethal (Sxl) is an RNA-binding protein, containing two conserved RNA binding domains (RBDs) and a glycine-rich region, which functions as a regulator of alternative splicing in Drosophila sex determination. Previous work demonstrated that Sxl monomers interact cooperatively upon binding to target RNAs and that the cooperativity depends on the glycine-rich N terminus. Here we use band shift experiments to show that RNA binding patterns are altered when Sxl is combined with other proteins having similar glycine-rich domains, including mammalian heterogeneous nuclear (hn) RNP L and Drosophila Hrb87F (an hnRNP A/B homolog). Direct involvement of the Sxl glycine-rich region in protein interactions was verified by Far-Western analysis. Two interaction domains, the Sxl N terminus and the Sxl first RNA binding domain, were suggested by the yeast two-hybrid assay. In a systematic examination of the RNA binding properties of Sxl domains, it was found that the Sxl termini as well as the RBDs influence RNA binding specificity. Finally, selection of the Sxl optimal binding site (SELEX) confirms the importance of U-runs in the Sxl binding site and suggests a second type of non-U-run target that may be associated with RNA secondary structure.
How specific and general splicing factors find their target pre-mRNA sequences and determine the correct splicing pattern remains a major question in understanding alternative splicing. One emerging picture is that individual factors, identified by either biochemical or genetic methods, work in larger complexes by interacting with other factors. Understanding how the splicing factors function in these systems relies on studying their interactions with target RNAs and with other splicing proteins.
Various structural studies on a number of known RNA-binding proteins and splicing factors have identified several functional domains responsible for RNA or protein interactions. One highly conserved domain (RBD, RRM, or RNP-CS) has been intensively studied as an independent functional motif for specific RNA binding (1-4). Another conserved domain, the RS domain, has also been intensively studied as a mediator of multiple interactions with other splicing proteins (5, 6). However, the involvement of the overall protein structure in RNA binding and protein interactions has been relatively neglected. Also, potential interaction or effector domains besides the well known RS domain have not been clarified.
We have been using the sex determination pathway in Drosophila to study splicing regulation by specific factors. In this system, we are provided with a well defined hierarchy of genes regulated by alternative splicing. The Sex-lethal (Sxl) protein is a sex-specific RNA-binding protein that regulates the alternative splicing of a number of pre-mRNAs of downstream genes, as well as its own pre-mRNA (7, 8). Sxl interacts directly with multiple cis-elements on its target pre-mRNAs (9-13). We have previously studied how Sxl binds to Sxl pre-mRNA and found that Sxl monomers cooperate through their N termini when binding to pre-mRNA regions containing two adjacent binding sites, each consisting of a short U-run (12). Because Sxl binds RNA at some distance from the Sxl splice sites it regulates, we postulated that in addition to the interactions between Sxl molecules, Sxl might also interact with other specific or general splicing factors (12). This seems especially likely during autoregulation, which involves a number of additional components identified by genetic studies (14-16).
The Sxl N terminus, which we had demonstrated to be required for cooperative RNA binding (12), is very rich in glycine, serine, asparagine, and proline. A similar structure is found in a number of known RNA-binding proteins, such as mammalian hnRNP proteins A1, B2, and L (18-20) and Drosophila A/B-like proteins (21-23). Among these proteins, A1 has been found to counteract the SR protein SF2/ASF in alternative splicing (24), and L was shown to enable processing of intronless pre-mRNA (25). Glycine-rich regions are also found in a variety of other known splicing factors, including the U1 snRNP component 70K protein (26), U2AF35 (27), SF2/ASF (28, 29), and the Drosophila P element splicing regulator PSI (30). Glycine-rich regions in all these proteins might function as common protein interaction domains.
In this report, we show that the glycine-rich Sxl N terminus influences interactions with hnRNP proteins containing RNA binding and glycine-rich domains. When we systematically analyzed the putative domains of Sxl for direct interaction in other assays, we confirmed the importance of the N terminus. Moreover, the first RBD may be sufficient to mediate interactions in vivo. We also show that the N and C termini, lying outside the RBDs, play a role in targeting Sxl to different RNAs. Finally, selection of the Sxl optimal binding site (SELEX) confirms the importance of U-runs in Sxl binding and suggests a second type of non-U-run target.
Production of glutathione S-transferase (GST) fusion proteins in bacteria, synthesis of 32P-labeled RNAs by in vitro transcription, and conditions for in vitro RNA binding using band shift or filter binding methods were described in detail previously (12). As described by Wang and Bell (12) for Sxl and variant proteins, binding was performed at a high protein/RNA ratio, with protein concentrations on the order of 1 µM. UV cross-linking was performed by exposing an RNA binding reaction in a microtiter plate to UV light in a UV Stratalinker (Stratagene) for 15 min. The plate was kept on ice during the exposure.
DNA Constructs. The plasmids for making GSTL, GSTPTB/hnRNP I, and GSTU2AF65 were gifts from the laboratories of G. Dreyfuss (University of Pennsylvania), M. Garcia-Blanco (Duke University Medical Center), and M. Green (University of Massachusetts Medical Center), respectively. Additional GST fusion plasmids were created by cloning BamHI-EcoRI fragments into pGEX-2T as follows. The cDNA clone of Hrb87F (a gift from S. Haynes, National Institutes of Health) was PCR-amplified with primers that added a BamHI site immediately upstream of the ATG (CGGGATCCATG) and an EcoRI site downstream of the open reading frame. The snf cDNA, pBC-D25 (provided by J. Romac and J. Keene, Duke University Medical Center), was transferred into pGEX-2T using the flanking BamHI and EcoRI sites in the original pET3a vector (31). The cDNA of the snf1621 mutant was isolated from snf1621 homozygous flies by reverse transcriptase-PCR that placed a BamHI site immediately upstream of the ATG and an EcoRI site after the open reading frame.
To make the GSTSxl/snf construct, snf cDNA was reamplified by PCR with primers 5′-ATCTCATGATGGAGATGCTACCCAAC-3′ (upstream) and 5′-GGACAGCGGCTGCTTCTTGGCGAACGTTAT-3′ (downstream) to add BspMII and AlwNI sites, then ligated into the same sites of GST-2T-SxlcF1 (12) to replace the two Sxl RBDs with those of snf. The Sxl early cDNA fusion was created by PCR amplification that added a BamHI site immediately before the ATG and continued to a unique BspMII site. The PCR product was used to replace the BamHI to BspMII region of the plasmid pGEX-2T-SxlcF1. For the shorter version of the alternative Sxl exon 5 splicing, the BspMII to SfiI region of Sxl cDNA MS11 (32) was used to replace the equivalent region of Sxl cDNA cF1 (7).
The SxlcF1 deletion mutants were constructed using the unique sites BspMII, BspHI, AflII, and AlwNI. SxlN1 deletes 38 amino acids. The N terminus is 117 amino acids, RBD-1 is 80 amino acids, RBD-2 is 86 amino acids, and the C terminus is 71 amino acids. The 3′ end of all the C-terminal deletion cDNAs has a 14-bp XbaI linker, CTAGTCTAGACTAG (all linkers from New England Biolabs), that contains a TAG stop codon in all three reading frames.
To maintain the correct reading frame of mutants N1, N3, and B2, an 8-bp XhoI linker, CCTCGAGG, was inserted between the blunt-ended BamHI site and the appropriate site on SxlcF1. A 10-bp XhoI linker, CCCTCGAGGG, was similarly used for making N4. All constructs without a linker were made by ligation of the blunt-ended sites. Klenow fragment was used to blunt-end all sites except for AlwNI, whose overhanging 3′ end was digested with T4 DNA polymerase.
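The statement that the 14-bp XbaI linker carries a TAG stop codon in all three reading frames can be checked directly. The following is a minimal sketch; the function name and structure are illustrative assumptions and are not part of the original methods.

```python
# Minimal sketch: confirm that a linker contains a stop codon in each of the
# three forward reading frames. The helper name is hypothetical.

def frames_with_stop(seq, stop="TAG"):
    """Return {frame_offset: True/False} for the three forward frames."""
    seq = seq.upper()
    result = {}
    for offset in range(3):
        # Split the sequence into complete codons starting at this offset.
        codons = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
        result[offset] = stop in codons
    return result

# Linker sequence as given in the text.
XBAI_LINKER = "CTAGTCTAGACTAG"

stops = frames_with_stop(XBAI_LINKER)
for offset, found in stops.items():
    print(f"frame +{offset + 1}: TAG present = {found}")
```

Because a TAG is present at every offset, translation entering the linker from any upstream frame terminates within it, which is why one linker can serve all the C-terminal deletions.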
The Drosophila U1 snRNA cDNA clone was a gift from Dr. S. Mount (University of Maryland). It was PCR-amplified with primers 5′-GGAATTCATACTTACCTGGCGTAGAG-3′ (upstream, with an EcoRI site) and 5′-GGGTACCTCGGGACGGCGCGAACGCC-3′ (downstream, with a KpnI site). The PCR product was cloned into the pGEM4 vector to be transcribed from the SP6 promoter. All constructs were confirmed by sequencing.
Sxl protein to be used as probe was expressed as a GST fusion from vector pGEX-2TK (Pharmacia Biotech Inc.). The protein was labeled as follows. Ten µl of fusion protein at about 1 mg/ml was added to 5 µl of 10 × HMK buffer (200 mM Tris, pH 7.5, 1 M NaCl, 120 mM MgCl2), 5 µl of [γ-32P]ATP, and 50 units of HMK (Sigma). The reaction was incubated at 37 °C for 30 min. To the reaction, 20 µl of glutathione beads (1:1 in MTPBS (150 mM NaCl, 16 mM Na2HPO4, 4 mM NaH2PO4, pH 7.3)) was added, and the mixture was rotated for 5 min at room temperature. The beads were washed three times with 100 µl of MTPBS, and the bound proteins were then released by three elutions with 30 µl of 20 mM Tris, pH 8, 5 mM reduced glutathione. All the fractions were pooled and stored at -80 °C.
To make in vitro translated Sxl probe for Far-Western analysis, the cDNA portion of pGEX-2TK-SxlcF1 from BamHI to EcoRI was blunt-ended and ligated into the EcoRV site of pcDNA3 (Invitrogen). [35S]Met-labeled Sxl protein was produced in the TNT reticulocyte lysate (Promega) according to the manufacturer's protocol. The 38-kDa Sxl band was present after translation and absent from the control of pcDNA3 alone (data not shown).
For Far-Western blotting, various proteins separated by 12% SDS-PAGE were transferred to nitrocellulose. Incubation and washing followed the procedures of Kaelin et al. (33). In vitro translated protein at 10^5 to 10^6 cpm/ml was added. To check for similar protein levels, the filter was also incubated with anti-GST antibody recognizing all the fusion proteins (Pharmacia). After exposure to x-ray film to visualize the labeled Sxl binding, a Western blot was developed (data not shown).
Yeast Two-hybrid Analysis. The yeast two-hybrid system of Gyuris et al. (34) was used. The "bait" vector containing the LexA DNA binding domain and the "prey" vector containing an activation domain were modified as follows. For the bait, the EcoRI-SalI polylinker region of pEG202 was replaced by a double-stranded linker created from two oligonucleotides, containing BamHI, XbaI, and EcoRI sites, with 4-base single-stranded ends noted by parentheses: 5′-(AATT)AGGATCCTCTAGAGAATTCG(AGCT)-3′. For the prey, the pJG4-5 BamHI site was destroyed by end-filling and ligation, then the EcoRI-XhoI region of the polylinker was replaced by a linker, containing BamHI, BglII, and EcoRI sites, with 4-base single-stranded ends: 5′-(AATT)AGGATCCGAGATCTCCCGAATTCC(AGCT)-3′. The cDNA clones of Sxl and Sxl deletions were removed from the GSTSxl clones as BamHI-EcoRI fragments and ligated in frame to the modified bait and prey vectors. Yeast strain EGY48 (ura3 his3 trp1 3LexA binding sites-LEU2) and yeast LexA binding sites-lacZ reporter plasmid pSH18-34 were used. Interacting bait and prey will activate the LexA binding site-LEU2 reporter to give a Leu+ phenotype and will also activate the LexA binding site-lacZ reporter. The LEU2 assay is more sensitive than the lacZ assay due to the nature of the LexA binding sites present in each. Liquid cultures were grown in selective media in the presence of galactose. Yeast β-galactosidase assays were performed as described (35). One unit equals 1000 × A420/(A600 of assayed culture × volume assayed (ml) × time (min)).
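The β-galactosidase unit definition can be written as a short calculation. The sketch below assumes the standard reading of the definition, in which culture density, assay volume, and reaction time all appear in the denominator; the function name and the example absorbance values are hypothetical and are not data from this study.

```python
# Minimal sketch of the beta-galactosidase unit calculation, assuming
# units = 1000 * A420 / (A600 * volume_ml * time_min).
# Function name and example inputs are illustrative assumptions.

def beta_gal_units(a420, a600, volume_ml, time_min):
    """Return activity units normalized to cell density, volume, and time."""
    if a600 <= 0 or volume_ml <= 0 or time_min <= 0:
        raise ValueError("A600, volume, and time must all be positive")
    return 1000.0 * a420 / (a600 * volume_ml * time_min)

# Hypothetical readings: A420 = 0.5, A600 = 0.8, 0.1 ml assayed for 30 min.
example_units = beta_gal_units(a420=0.5, a600=0.8, volume_ml=0.1, time_min=30)
print(f"{example_units:.1f} units")
```

Normalizing by A600 makes activities comparable between cultures of different density, which matters here because bait and prey combinations may grow at different rates in selective medium.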
The 79-base template oligonucleotide includes PCR priming regions, the T7 RNA polymerase recognition site, and a 26-base random region for selection: ATTATGCTGAGTGATATCCCGCTTAACCCATGGTT-N26-GCCTAGGTGATCAAGATC. The degeneracy of a totally randomized 26-base sequence would be 4.5 × 10^15, which would be equivalent to roughly 0.2 mg of RNA. Primers matching the two flanking regions, 35 and 18 bases, respectively, were used for PCR amplification. Gel-purified 79-mer (~0.4 mg) and 35-mer (~0.18 mg) were annealed and used as the templates for in vitro transcription. Five mg of RNA was synthesized, producing roughly 10 copies of RNA molecules/template, then phenol-extracted, precipitated, and redissolved in binding buffer (12). This was then passed through a 1-ml column composed of agarose beads coupled with reduced glutathione (Sigma) saturated with GSTSxl protein. The RNAs were allowed to bind for 10 min at room temperature with mild orbital shaking. The column was then washed two times with 1.5 ml of binding buffer, followed by four times with 1.5 ml of wash buffer (binding buffer without the tRNA). The sample was eluted with MTPBS plus 0.1 M NaCl, collected as ten 0.5-ml fractions, and then with MTPBS plus 1 M NaCl, also collected as ten 0.5-ml fractions. One-third of each RNA fraction was reverse-transcribed and PCR-amplified by 20 cycles of 94, 58, and 72 °C for 45 s. Only fractions 4, 5, and 6 of the 1 M NaCl elution had large amounts of PCR products, so they were pooled as templates for the next round of selection. Approximately 5 mg of RNA was used for binding during each of three rounds of column selection. We then switched the selection to the band shift assay, using previously described conditions (12). The Sxl protein used for selection was also switched from GST fusion to non-fusion. The area of shifted RNAs was cut out, electroeluted, and PCR-amplified as before.
After two rounds of selection, the final PCR products were digested with KpnI and BamHI and cloned into pGEM4 (Promega) for sequencing and in vitro transcription. The RNA made after cutting the cloned fragments at HindIII has flanking sequences different from those of the PCR products to avoid possible effects on Sxl binding.
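The pool-size figures quoted for the SELEX library can be reproduced with a short calculation. This is a rough sketch under stated assumptions: an average mass of about 330 g/mol per nucleotide, and a molecule length of 79 nt taken from the template oligonucleotide as an upper bound (the actual transcript is shorter). The function and constant names are illustrative.

```python
# Minimal sketch: complexity of a fully randomized 26-base region and the
# RNA mass needed to represent every sequence once.
# Assumptions (not from the source): ~330 g/mol per nucleotide on average,
# and a 79-nt molecule length used as an upper bound.

AVOGADRO = 6.02214076e23        # molecules per mole
AVG_NT_MASS_G_PER_MOL = 330.0   # assumed average nucleotide mass

def pool_complexity(n_random, alphabet_size=4):
    """Number of distinct sequences in a fully randomized region."""
    return alphabet_size ** n_random

def mass_mg_one_copy_each(n_random, length_nt):
    """Mass in mg of a pool holding exactly one copy of every sequence."""
    moles = pool_complexity(n_random) / AVOGADRO
    grams = moles * length_nt * AVG_NT_MASS_G_PER_MOL
    return grams * 1000.0

complexity = pool_complexity(26)
mass_mg = mass_mg_one_copy_each(26, length_nt=79)
print(f"distinct sequences: {complexity:.2e}")
print(f"mass for one copy of each: {mass_mg:.2f} mg")
```

Under these assumptions the result agrees with the roughly 0.2 mg stated in the text, so the 5 mg of RNA used per round corresponds to multiple copies of each possible sequence.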
Previous work using band shift analyses demonstrated several properties of Sxl binding (12). First, Sxl binds a U-run site; second, Sxl binds cooperatively to RNA containing two adjacent U-run sites but is monomeric in solution; third, when 38 amino acids of the glycine-rich N terminus are removed, cooperativity is lost although binding affinity to a single U-run site is unchanged.
To test the hypothesis that the glycine-rich region may be a general
protein interaction domain, we examined interactions between Sxl and
human hnRNP L, which contains a glycine-rich region very similar to
Sxl. We performed band-shift assays with Sxl and GST L fusion protein
using RNAs identical to those used previously (12). These include RNA
S7B, a 52 nucleotide region of the Sxl pre-mRNA
containing the sequence U9AU8 and modifications
in which the U-runs have been systematically disrupted to (UC)-runs.
Consistent with previous results, the RNAs containing zero, one, or two
U-runs are bound by Sxl either not at all, as a monomer, or as a dimer, respectively (Fig. 1A,
lanes 1, 4, and 7; Ref. 12). In contrast, protein
L (in the form of a GST-L fusion) could bind, probably as a dimer
judging by its slow mobility, to all three of these RNAs (Fig.
1A, lanes 3, 6, and 9).
When Sxl and L were both added to the two-U-run RNA, a strong, broad band was observed, intermediate in size between the Sxl and L bands (Fig. 1A, lane 8). The predominance of the intermediate complex suggests that Sxl and L influence each other's binding, possibly by a protein interaction of the same type previously observed between Sxl and GSTSxl, which together produced a similar intermediate band (12). As on two-U-run RNA, Sxl and L together on one-U-run RNA again produced an intermediate band, but it was of lower mass because of the single Sxl binding site (Fig. 1A, compare lanes 5 and 8). Strikingly, even when there was no U-run on the RNA, and so no Sxl binding site, a weak but distinct intermediate band was still observed (Fig. 1A, lane 2). The presence of an intermediate band in the absence of a Sxl binding site suggests two possibilities. A Sxl-L protein interaction might occur without any binding of Sxl to RNA; alternatively, an interaction with L may allow Sxl to bind RNA that lacks the normal U-run site of Sxl.
To show that the interaction between Sxl and L is due to the Sxl N terminus, we performed the same binding assay with SxlN1 (lacking the first 38 N-terminal amino acids) and SxlN2 (lacking the entire N terminus of 116 amino acids). It has been shown previously that SxlN1 no longer demonstrates cooperativity when binding RNA; thus, at an appropriate protein concentration, SxlN1 forms two discrete bands on two-U-run RNA in which one or two binding sites are filled (12). Here, each N-terminal deletion protein forms two bands on the two-U-run RNA although the bands are not well separated on this particular gel (Fig. 1A, lanes 10 and 13). On the same two-U-run RNA, protein L produced less of the intermediate band with SxlN1 than with Sxl (Fig. 1A, compare lanes 8 and 11). Most clearly, with SxlN2, there was no intermediate band, and instead, there were discrete L and SxlN2 bands (Fig. 1A, compare lanes 8 and 14). The loss of the intermediate band, especially with SxlN2, strongly suggests that the Sxl N terminus is important for the Sxl-L interaction, just as previously demonstrated for the Sxl-Sxl interactions (12). These experiments were performed several times at various protein concentrations (data not shown).
In a reciprocal experiment, portions of protein L were removed to create a C-terminal deletion (GST LΔC) lacking 267 amino acids, and an N-terminal deletion (GST LΔN) lacking 36 amino acids of the glycine-rich domain. The C-terminal deletion LΔC binds RNA less well than the entire protein but forms a strong intermediate band with Sxl (Fig. 1B, lanes 5 and 6). Thus, LΔC behaves similarly to the entire L protein. In contrast, LΔN forms a band with an unexpectedly fast migration rate (Fig. 1B, lane 8). Although LΔN contains a deletion of only 27 amino acids, it migrates much faster than the entire L protein (Fig. 1B, lane 4) and even migrates faster than the much smaller protein LΔC, which lacks 267 amino acids (Fig. 1B, lane 6). This faster migration rate suggests that the N-terminal deletion protein may bind as a monomer (or small multimer) while the normal protein binds as a dimer (or larger multimer). This might be analogous to the loss of cooperativity in RNA binding that was previously observed for SxlN1 (12). Furthermore, Sxl appears to enhance the binding of LΔN; however, unlike the other situations examined, two complexes are observed that might be Sxl-Sxl and Sxl-LΔN (Fig. 1B, lane 7). One interpretation is that LΔN protein no longer interacts with itself but still has enough of the glycine-rich region left for interaction with Sxl. Finally, replacement of the missing glycine-rich sequences in LΔN with the N-terminal 21 amino acids of Sxl (GST L/S) leads to restoration of the intermediate complex (Fig. 1B, lane 9).
In summary, Sxl and hnRNP L show different patterns of RNA binding when they are in combination compared with when each is separate, suggesting the likelihood of a protein interaction. Deletions analysis suggests that this interaction is likely to be mediated by the glycine-rich domains of the two proteins.
Sxl Interacts with Other Sxl Isoforms and with Another hnRNP Protein. There are two natural variants of the Sxl N terminus. The
N-terminal 26 amino acids of Sxl (isoform cF1, called Sxl herein (7)),
which is the product of the late Sxl transcripts, are replaced by a different sequence of 24 amino acids in the early Sxl
protein (SxlcE1, from cDNA clone cE1 (36)). In addition, alternative splicing of the late transcripts generates another isoform,
SxlcF1S, in which the N terminus is missing eight amino acids just
after the SxlN1 deletion (32). Given that SxlcE1 is known to have
functions similar to Sxl (36), we predicted that all isoforms would
interact with Sxl. As shown in Fig.
2A, lanes 1-4,
that appears to be the case. Intermediate bands are evident between Sxl
and GSTSxlcE1 or GSTSxlcF1S. Since two-U-run RNA was used as the
substrate, both proteins are assumed to interact as they bind RNA.
These interactions are virtually identical to those between Sxl and
GSTSxl, which we studied in detail previously (12). However, it appears
that GSTSxlcE1 interacts with Sxl less well compared with the other two
proteins (Fig. 2A, lanes 1 and 2).
To see whether known hnRNP proteins in addition to protein L could interact with Sxl, we performed assays with Drosophila Hrb87F (hrp36), which is an hnRNP A/B type protein with a glycine-rich domain at its C terminus (21-23). Interestingly, although GSTHrb87F alone binds at a very low level to two-U-run RNA, in combination with Sxl it shows a strong shifted band above the band with Sxl alone (Fig. 2A, lanes 7 and 8). A UV cross-linking experiment shows that GSTHrb87F alone is capable of a low level of binding to the two-U-run RNA; a protein concentration at least 100-fold higher than Sxl is required for equivalent levels of binding (Fig. 2B). Apparently Sxl interacts with Hrb87F to stabilize or strengthen its binding to a weak RNA site.
For comparison, two proteins lacking a glycine-rich domain were tested in this assay. Splicing factor U2AF65 and polypyrimidine tract-binding protein, PTB/hnRNP I, both bind polypyrimidine tracts (5, 37, 38). PTB/hnRNP I has noticeable structural similarity to L protein but lacks the glycine-rich region (37-39). Neither U2AF65 nor PTB/hnRNP I shows any interaction with Sxl (data not shown).
To confirm the interactions observed in the band-shift assay, a Far-Western analysis was conducted using kinase-labeled GSTSxl as a probe against various other GST fusion proteins. These results show interactions between Sxl and all the Sxl isoforms (Fig. 2C, lanes 2-4, Sxl, SxlcE1, SxlcF1S) as well as between Sxl and Hrb87F (Fig. 2C, lane 5).
snf, a homolog of snRNP proteins U1A and U2B" that consists almost entirely of two RBDs, has been identified genetically as being involved in the Sxl autoregulatory loop (14). We did not observe a strong interaction either between Sxl and snf in the Far-Western assay (Fig. 2C, lane 6) or between Sxl and a mutant form of snf (lane 7); however, Sxl does interact with the mosaic protein to which the Sxl N and C termini have been added (lane 8, Sxl/snf). We note that when more snf protein was used, weak interactions with Sxl could be observed (data not shown).
In brief, besides interacting between themselves, Sxl molecules can interact with other proteins, most likely through a common glycine-rich domain. In particular, hnRNP L helps Sxl bind to RNA that Sxl alone does not bind, and Sxl helps Hrb87F bind to RNA that the latter alone binds only weakly.
Interactions between Different Parts of Sxl Protein. Because
the band-shift assay had identified the Sxl N terminus as important for
cooperative interactions between Sxl molecules (12), we attempted to
determine whether protein interactions could be observed by other
methods. Far-Western protein blotting, using radioactively labeled,
in vitro transcribed and translated Sxl as probe against
various Sxl deletions diagrammed in Fig. 3A, showed that protein
interactions are dependent on the Sxl N terminus (Fig. 3C).
Compared with the interaction with intact Sxl (Fig. 3C,
lane 1), the signal is substantially reduced when the entire
N-terminal region is removed (Fig. 3C, lanes 3 and 5, SxlN2 and N3). Conversely, the
interaction is not affected when the C terminus is progressively
deleted (Fig. 3C, lanes 7-9, SxlC1,
C2, and C3). The RNA binding domains singly or together bind little or not at all (Fig. 3C, lanes 10-12,
SxlB1, B2, B12), and deletion of
either RBD has no effect on the interaction (Fig. 3C,
lanes 13 and 14, SxlD1 and
D2). Surprisingly, the C-terminal region alone (SxlN4)
interacts well with Sxl (Fig. 3C, lane 6). However, it is possible that the C terminus is normally prevented from
intermolecular interactions by other parts of Sxl structure because not
all proteins having a C terminus interact with Sxl.
We also used the yeast two-hybrid assay to analyze interactions between Sxl and a series of altered Sxl proteins (Fig. 3A). Activities were measured using lacZ or LEU2 reporter genes that respond to interactions between wild-type Sxl fused to the LexA DNA binding domain (the bait) and Sxl deletion mutants fused to an activation domain (the prey). Some of the differences in activity can be attributed to a level of protein expression in yeast cells that is lower for intact Sxl than for any of the Sxl deletions (data not shown). In this assay, the N-terminal region appears neither necessary nor sufficient for a significant interaction (Fig. 3A). Nevertheless, the decrease in activity from SxlN1 to SxlN2 does suggest that the N terminus might play some role in protein interaction in this system (Fig. 3A, SxlN1, SxlN2).
It is more obvious that RBD-1 is important for interaction (Fig. 3A). For example, the N-terminal deletions (Fig. 3A, SxlN1-N4) do not completely lose activity until RBD-1 is lost. In addition, the isolated RBD-1 (Fig. 3A, SxlB1) is capable of interacting with Sxl. However, the presence of RBD-1 is not always sufficient because the C-terminal deletions SxlC1 and SxlC2 contain RBD-1 but do not interact with Sxl. This might be due to the influence of overall protein structure; for example, without the C terminus, the N terminus might block the ability of RBD-1 to interact with intact Sxl. Alternatively, the C-terminal deletion proteins may interact so strongly with each other (see below) that this overwhelms the measured interaction with intact Sxl.
When individual domains were used as bait and prey, the ability of RBD-1 to behave as an interacting domain became even more apparent (Fig. 3B). The two proteins containing RBD-1, SxlB1 and SxlC2, each interacted strongly with itself and with the other protein but did not interact with other regions (Fig. 3B). This is in striking contrast to the failure of SxlC2 to interact with intact Sxl (Fig. 3A, SxlC2). Thus, it appears that for each of the C-terminal deletions, the deletion proteins might interact so strongly among themselves that this overwhelms the interaction that is measured with intact Sxl. Such an explanation may also be invoked to explain the lack of apparent interaction of the isolated N terminus. As an additional consideration, it is possible that interactions between two RNA binding domains are dependent upon the presence of cellular RNA. If RNA molecules are involved, they might bring together the bait and prey proteins with or without the involvement of direct protein-protein contacts.
The N terminus alone activates transcription of the lacZ reporter gene even in the absence of the prey plasmid (Fig. 3B, SxlC3). Apparently, the artificially exposed Sxl N terminus works as a transcriptional activation domain when fused to the LexA DNA binding domain. This suggests that the N terminus can interact with other proteins, albeit with the transcription apparatus. Nevertheless, the N terminus can be assayed as the prey plasmid because it is already fused to an activation domain, but even here it shows no obvious interaction with any other Sxl domain (Fig. 3B, SxlC3). Again, self-interaction may play a role, and competition between the RBD-1 interaction and the N-terminal interaction may also be involved. Unfortunately, the interaction between the two isolated N termini cannot be tested.
In considering a final difference between the two assays, it is unclear why the C terminus (SxlN4) interacts with Sxl on the Far-Western blot (Fig. 3C, lane 6) but shows no evidence of interaction in the yeast two-hybrid system (Fig. 3, A and B, SxlN4). This may result from differences in fusion proteins used in each assay. In yeast, a LexA-Sxl fusion must interact with a SxlN4-activation domain fusion. In the Far-Western assay, the GST-SxlN4 fusion was probed with nonfusion Sxl.
In summary, Far-Western analysis clearly shows the importance of the glycine-rich Sxl N terminus in protein interactions. The yeast two-hybrid assay appears to measure two different interaction domains, the N terminus and RBD-1.
Influence of Sxl Protein Structure on Binding Properties. The Sxl deletions used above were also tested for RNA binding ability. The following three RNA substrates were used: (a) RNA S5A, which contains a U-run from the Sxl male-specific 3′ splice site and polypyrimidine tract and to which Sxl is known to bind; (b) RNA S8A, which lacks U-runs and originates from the downstream unregulated 3′ splice site and polypyrimidine tract to which Sxl does not bind; and (c) Drosophila U1 snRNA, which is a structured RNA that serves as a negative control and to which Sxl is not expected to bind (12). Shown in Fig.
4A is a band-shift experiment
with these RNAs. Wherever possible, the amount of each protein was
adjusted so that a similar percentage of U-run RNA (S5A) was bound. For
these tests, the concentrations of purified GST fusion proteins were
estimated from staining of SDS-PAGE gels. The amount of each protein
was chosen after preliminary binding tests at different protein
concentrations so that roughly equivalent levels of binding to S5A
would be observed for all the mutants. Each protein concentration was
then held constant for binding to the different RNAs. The protein
concentrations fell in the range of 0.1-10 µM, and
proteins that did not bind to S5A were tested using a relatively high
concentration. Finally, the relative binding affinities summarized in
Fig. 4B were normalized to account for the different amounts
of each protein used.
The only proteins retaining completely normal specificity and affinity were those that lacked only the N or C terminus (Fig. 4, SxlN2 and SxlC1). However, RBD-1 alone or with the N terminus was nearly normal, showing only slightly reduced affinity (SxlB1 and SxlC2). In contrast, RBD-2 together with the C terminus (SxlN3) became completely nonspecific, even to the extent of binding U1 snRNA; RBD-2 alone (SxlB2) lost the ability to bind U-run RNA but acquired a surprising ability to bind U1 snRNA. The two RNA binding domains together (SxlB12) became nonspecific in binding.
Determination of the Sxl Optimal Binding Site (SELEX). To study further how Sxl interacts with RNA, we performed selection/amplification (SELEX) of binding sites from a random pool of 26-mers to identify the Sxl optimal binding sites. Listed in Fig. 5A are 15 RNA sequences that bound Sxl. A run of 8 or more undisrupted Us, or a disrupted U12 (Fig. 5B, lanes 8-11), correlates with stronger binding compared with a U-run of 7 or fewer, or a disrupted run of 8, 9, or 10 Us (Fig. 5B, lanes 1-7). The only RNA with 16 Us showed very strong binding and formation of a higher complex, apparently with two Sxl proteins that bound cooperatively (Fig. 5B, − and + lanes 11). This conclusion is in agreement with our previous
report that Sxl binds to the U-runs on the Sxl and
tra pre-mRNA, as a monomer or dimer according to the run
length (12). The low frequency of long U-runs obtained may be because
such sequences are more prone to mutagenesis during PCR amplification.
Two sequences lack a U-run (RNAs 14 and 15). It is interesting that two
bands are present initially in these RNAs, presumably due to secondary
structure, but Sxl appears to bind only the upper band (Fig.
5B, lanes 14 and 15). Sxl does not
bind to the control RNA, C2, which also lacks U-runs.
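The relationship between U-run length and binding strength described above can be explored by scanning a sequence for its longest uninterrupted run of U. The sketch below is illustrative only: the function names are hypothetical, the threshold of 8 is taken from the qualitative observation in the text rather than from a fitted model, and the test sequences are the U9AU8 motif of RNA S7B and a (UC)-disrupted variant, not the SELEX clones of Fig. 5A.

```python
# Minimal sketch: find the longest uninterrupted U-run in an RNA sequence.
# Names and the threshold constant are illustrative assumptions.
import re

STRONG_RUN_THRESHOLD = 8  # "8 or more undisrupted Us" per the text

def longest_u_run(rna):
    """Length of the longest uninterrupted run of U (T is treated as U)."""
    rna = rna.upper().replace("T", "U")
    runs = re.findall(r"U+", rna)
    return max((len(run) for run in runs), default=0)

def has_long_u_run(rna, threshold=STRONG_RUN_THRESHOLD):
    """True if the sequence carries a U-run at or above the threshold."""
    return longest_u_run(rna) >= threshold

s7b_core = "U" * 9 + "A" + "U" * 8   # U9AU8 motif described for RNA S7B
disrupted = "UC" * 9                 # U-runs disrupted to (UC)-runs

print(longest_u_run(s7b_core), has_long_u_run(s7b_core))
print(longest_u_run(disrupted), has_long_u_run(disrupted))
```

A run-length scan of this kind captures only the U-run class of site; it would not flag the non-U-run RNAs 14 and 15, whose binding appears to depend on secondary structure.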
We next used some of these newly isolated sequences as binding substrates for Sxl and the isolated Sxl RNA binding domains (SxlB1 and SxlB2). They were tested in a filter binding assay together with the same RNAs used in Fig. 4, U-run RNA S5A, U1 snRNA, and no-U-run RNA S8A. Reflecting the band-shift experiment in Fig. 5B, wild-type Sxl bound the SELEX sequences 15, 6, 10, and 8 with increasing affinity (Fig. 5C, Sxl). The first Sxl RBD (SxlB1) binds all substrates containing U-runs (RNAs 6, 8, 10) with somewhat reduced affinity compared with the wild-type protein (Fig. 5C, SxlB1 (hatched bars)) similar to the band shift results in Fig. 4 (compare SxlB1 and Sxl on U-run RNA). However, modifying the results of Fig. 4, in this assay, it is seen that RBD-1 (SxlB1) actually has a specificity somewhat different from wild-type Sxl because it bound to the SELEX RNA 15, which lacks U-runs, much more strongly than wild-type Sxl (Fig. 5C, compare Sxl (black bars) and SxlB1 (hatched bars) on RNA 15). The second Sxl RBD (SxlB2, gray bars), in agreement with the results shown in Fig. 4, binds U-run containing RNAs 6 and S5A weakly and binds U1 snRNA quite well (Fig. 5C). None of the proteins bound to the control RNA C1.
In summary, all the tested SELEX sequences having U-runs behaved similarly to U-run RNA S5A, binding Sxl better than RBD-1 or RBD-2. In contrast, the no-U-run RNA 15 and U1 snRNA were preferred by RBD-1 or RBD-2, respectively. Overall, the SELEX results argue for two types of binding: one containing U-runs influenced by their sequence context, the other somewhat weaker and lacking U-runs but associated with secondary structure.
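The distinction drawn above between U-run and no-U-run sequences can be illustrated with a minimal sketch that scans a sequence for its longest uninterrupted stretch of uridines. This is not the procedure used in this study. The function names, the example sequences, and the cutoff of 8 Us are hypothetical and chosen only for illustration.

```python
import re


def longest_u_run(rna: str) -> int:
    """Return the length of the longest uninterrupted run of U in a sequence.

    DNA-style input is accepted: T is treated as U (an assumption made
    here for convenience, not something specified in the text).
    """
    seq = rna.upper().replace("T", "U")
    runs = re.findall(r"U+", seq)
    return max((len(run) for run in runs), default=0)


def classify(rna: str, min_run: int = 8) -> str:
    """Label a sequence 'U-run' or 'no-U-run'.

    min_run is a hypothetical cutoff chosen for this sketch only.
    """
    return "U-run" if longest_u_run(rna) >= min_run else "no-U-run"


if __name__ == "__main__":
    # Hypothetical sequences, not the SELEX RNAs described in the text.
    examples = {
        "long_run": "GA" + "U" * 16 + "GC",
        "interrupted": "UUUUUGUUGUUUU",
        "no_run": "ACGACGGCA",
    }
    for name, seq in examples.items():
        print(name, longest_u_run(seq), classify(seq))
```

A scan of this kind captures only run length. It does not capture the sequence context or the secondary structure that the binding data suggest also contribute.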
It is currently thought that for Sxl to regulate the splicing of its own pre-mRNA, it must directly contact other splicing proteins. In this way, Sxl would be able to bridge the considerable distance between its intronic binding sites and the splice sites it regulates. It has already been demonstrated that Sxl molecules interact: cooperative interactions occur between two Sxl monomers as they bind to adjacent U-run sites on the Sxl pre-mRNA (12). Because the cooperativity was shown to be dependent upon the glycine-rich N terminus, that region was thought to be a protein interaction domain. It was also proposed that the large glycine-rich region may simultaneously interact with Sxl and with additional proteins (12).
In this study, we used band-shift assays as before to demonstrate that Sxl can interact on a small region of Sxl pre-mRNA with other proteins having a glycine-rich region. When human hnRNP L is combined with Sxl, an RNA-protein complex is formed that is intermediate in size between the complexes formed by each protein alone. It is especially noteworthy that the intermediate complex forms even on RNA that lacks any Sxl binding site, suggesting either that L protein helps Sxl bind to a site that Sxl does not bind alone or that Sxl binds the L protein without itself binding RNA. If the first possibility is correct, it would be similar to the interaction observed between Sxl and the Drosophila hnRNP protein Hrb87F. Hrb87F binds extremely weakly to the U-run that Sxl normally binds; however, in combination with Sxl, Hrb87F binding is much improved (Fig. 2, A and B).
These experiments also address the question of whether, when two cooperating Sxl monomers are bound to RNA, they might be capable of simultaneously interacting with additional proteins. When L and Sxl were combined upon RNA containing two adjacent Sxl binding sites, to which Sxl alone would bind cooperatively (12), an intermediate band was observed (Fig. 1A, lane 2). This result suggests that a simultaneous interaction between two Sxl molecules, L protein, and RNA may be possible.
Although these experiments were meant to test the in vitro interaction properties of the Sxl N terminus, it would not be surprising if the interaction between Sxl and certain Drosophila hnRNP proteins turns out to have relevance in vivo. A number of mammalian hnRNP proteins, including A1, F, and I/PTB, have been shown to directly influence splicing (40-43). In addition, it was recently shown that overexpression of Drosophila Hrb98DE, which is an hnRNP A/B protein, causes exon skipping in vivo (43).
Direct evidence for the involvement of the glycine-rich N terminus in protein interactions was provided by Far-Western analysis of the interactions between two Sxl molecules. The Sxl N-terminal region alone is sufficient for an interaction with Sxl (Fig. 3C). Consistent with these results, when the Sxl termini were added to snf, which is a protein that has two RBDs but no glycine-rich region, it acquired the ability to interact with Sxl (Fig. 2C).
The yeast two-hybrid assay gave results consistent with the idea that the N terminus is important for the interaction between Sxl molecules (Fig. 3A). However, the assay was complicated by the fact that the isolated Sxl N terminus is intrinsically able to activate transcription, presumably by inappropriate protein interaction with transcription factors (Fig. 3B). In addition, it appears likely that in yeast, the C-terminal deletions, including the isolated N terminus, might each form a tight interaction with itself that precludes interactions with other partners (Fig. 3, A and B). The N terminus might also interact tightly with unknown cellular proteins to similarly interfere with the measured interactions.
The behavior of the Sxl N terminus as a transcriptional activation domain is consistent with other observations of similar behaviors by glycine-rich regions of certain hnRNP proteins. For example, when the glycine-rich N terminus of hnRNP P2 (also called TLS/FUS) is fused by chromosomal translocation to CHOP, a C/EBP family DNA binding transcription factor, it acts as a transcriptional activator that produces dominant enhancement of liposarcomas (44-47). Similarly, the glycine-rich N terminus of RNA-binding protein EWS becomes oncogenic when fused to a series of DNA binding domains (for example, see Ref. 48). Finally, when the glycine-rich domain from Drosophila SARFH, a homolog of EWS and TLS/FUS, is fused to CHOP, the fusion protein leads to cell transformation (47, 49).
Unexpectedly, we found that in the yeast system the isolated RBD-1 behaved as an interaction domain, although in the Far-Western assay it did not (Fig. 3C). It may be that certain interactions are stabilized by yeast cellular RNA, possibly through transcripts containing poly(U) stretches. RBD-1 may be able to interact with other proteins only after binding to RNA; alternatively, activation may be achieved indirectly if the two proteins containing RBD-1, as bait and prey, both bind to a single cellular transcript. With regard to the importance of the RBD for protein interactions, we note that interactions between wild-type Sxl monomers were previously found to be dependent on RNA binding (12), and it is possible that normally the glycine-rich domain may coordinate with an RBD.
The glycine-rich domains found in various RNA-binding proteins do not have any clear structural features beyond the likelihood that stretches of amino acids such as glycine or proline result in flexible coils. These regions may be important for a variety of protein interactions in splicing, acting analogously to the RS domains, which connect a number of different proteins involved in general and regulated splicing (50). The interactions of Sxl with itself, with hnRNP L, and with Hrb87F provide preliminary evidence for the possibility of such a network.
Sxl Binding to RNA
We also examined the role of the Sxl RNA binding domains, and their functional interplay with the glycine-rich and C-terminal regions, in the recognition of RNA sequences. To date, most of the studied cases involving proteins with two or more RBDs have shown that an isolated RBD loses RNA binding specificity, or affinity, or both (1-5, 13, 51). For Sxl, we find that each isolated RBD has acquired an altered specificity, which is either mildly altered in the case of RBD-1 (SxlB1) or radically altered in the case of RBD-2 (SxlB2). In addition, the isolated pair of RBDs (SxlB12) has become nonspecific. Somewhat surprisingly, this lack of specificity can be rescued by addition of either the N or C terminus. This result suggests that an important factor in binding specificity is the establishment of an overall structural stability rather than any particular protein sequence. We would suggest that when two RBDs must act together, the overall structural context can be important; in the case of Sxl, this context can be provided by either the N or C terminus.
There are several differences between our results and those of a similar study by Kanaar et al. (13). Briefly, Kanaar et al. found that each RBD alone showed nonspecific binding and lower binding affinity than the intact protein, with very low affinity shown by RBD-2. The isolated pair of RBDs showed correct specificity and even stronger binding than the intact Sxl protein. Surprisingly, their intact protein was less specific in recognizing U-runs than the RBD pair. We note three points of difference. First, we found that Sxl RBD-1 shows nearly normal specificity with regard to recognizing U-runs, although with somewhat lower affinity. Nevertheless, we also find that RBD-1 specificity is not completely normal because it bound a SELEX sequence that lacks U-runs more strongly than full-length Sxl did (Fig. 5C, compare 15 and S5A). The discrepancy may arise from differences in the extent of RBD-1: we excluded the linker region between the two RBDs, whereas Kanaar et al. (13) included it. Alternatively, the testing of different RNAs may have led to different results. Second, we find that RBD-2 has not simply lost affinity but rather has changed its specificity so that it exhibits a surprisingly strong affinity toward Drosophila U1 snRNA. When the C terminus is added to RBD-2 (SxlN3), it regains some ability to bind U-runs and becomes generally nonspecific like the RBD-2 of Kanaar et al. (13). This result suggests that some of the differences may be due to the RBD length. Third, and importantly, we find no evidence for a negative role of the termini that decreases both binding affinity and specificity of the RBD pair. Instead, we find that the two RBDs together have lost all binding specificity, and it is not regained until either the N or C terminus is added (SxlC1, SxlN2). Moreover, we observe that the entire protein shows strong binding and accurate specificity, capable of distinguishing between U-run and no-U-run polypyrimidine tracts. Finally, in contrast to Kanaar et al. (13), we and others have previously demonstrated that Sxl does not bind the U-rich ftz polypyrimidine tract under our chosen conditions (9, 12). These differences may result from different methods of cloning and isolating the proteins.
As part of our study on Sxl interactions with RNA, we analyzed the Sxl binding site. Our results of selection and amplification of the optimal binding site (SELEX) do not concur with consensus sequences previously reported by others in which Sxl binding sites included AUnNnAGU (52) or U5(G/U)UU(G/U)U8 where Gs interrupting the Us are favored (43). These reported consensus sequences, emphasized because of their appearance within the Sxl binding site on tra pre-mRNA, are not found in most of the natural binding targets of Sxl on Sxl pre-mRNA (12, 53, 54) or male-specific lethal-2 RNA (55). Nevertheless, there is some resemblance to the consensus of Singh et al. (43) where several sequences contain U-runs flanked by Gs, and one strong binding sequence has a U-run interrupted by Gs (RNA8). Aside from the interpretation, the results of Sakashita and Sakamoto (52) were very close to ours with regard to the U-run lengths, locations, and binding strength, as well as the identification of a few sequences lacking U-runs. Our observation of the sequences lacking U-runs suggests that a second type of somewhat weaker binding site associated with RNA secondary structure may exist.
We thank G. Dreyfuss, S. Jamison, M. Garcia-Blanco, M. Green, S. Haynes, J. Keene, and S. Mount for generously providing cDNA clones. We thank Gail Miyasato for excellent assistance with constructs and purifications.