Human immunodeficiency virus encodes the regulatory protein Rev,
which is required for expression of viral structural proteins. It binds
to an RNA element (RRE) in the viral transcript and up-regulates the
cytoplasmic appearance of unspliced and singly spliced viral mRNA. We
have studied the structure of Rev alone and complexed with the RRE and
two monoclonal antibodies, using a protein footprinting approach. The
method involves radioactive labeling at the C-terminal end of Rev
fusion protein followed by limited proteolysis under native conditions,
using 10 different proteinases. Rev protein was mainly cleaved within
the basic domain and in the C-terminal part. The periodicity of the
proteolytic cleavages within the basic domain strongly suggests that it
forms an
Human immunodeficiency virus type 1 (HIV-1)
In vivo and in vitro studies using mutated Rev protein have revealed a
domain structure as indicated (see Fig. 1A). A tract of basic amino
acids (residues 35-50) is involved in specific binding of RRE RNA and
nuclear/nucleolar localization (2, 3, 4, 5). A slightly more extended
region is important for assembly of Rev into oligomers (5, 6, 7, 8). A
leucine-rich sequence located at positions 78-83 constitutes another
functionally important region (2, 6, 9). This region may function as an
activation domain interacting with a cellular factor (2) and/or play a
role in protein oligomerization (10, 11).
The affinity and specificity of this fusion protein for RRE
RNA were compared with those of nonfusion Rev protein using a gel
retardation assay. The molar concentration required to complex half of
the radiolabeled input RRE probe was approximately the same for both
types of protein (Fig. 2). Furthermore, the fusion Rev protein did not
bind the reverse RRE probe except at very high concentrations (more
than 2 µM). This implies that the modified termini of the
Rev protein did not interfere significantly with the folding of the RNA
binding domain.
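The comparison above rests on the protein concentration at which half of the probe is shifted. As a minimal sketch of how such a midpoint could be read off a titration, the Python below interpolates on a logarithmic concentration axis. The function name and the example numbers are hypothetical and are not data from this study.

```python
import math

def half_max_concentration(concs_nM, fraction_bound, target=0.5):
    """Estimate the concentration at which `target` of the probe is bound.

    Interpolates linearly in log10(concentration) between the two
    titration points that bracket the target fraction.
    """
    points = sorted(zip(concs_nM, fraction_bound))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo <= target <= f_hi and f_hi > f_lo:
            t = (target - f_lo) / (f_hi - f_lo)
            log_c = math.log10(c_lo) + t * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("target fraction is not bracketed by the titration")

# Hypothetical titration (nM protein, fraction of probe in complex).
example_concs = [10, 30, 100, 300]
example_bound = [0.10, 0.30, 0.70, 0.95]
print(half_max_concentration(example_concs, example_bound))
```

Running the same function on the fusion and nonfusion titrations would give two midpoints that can be compared directly, which is the kind of comparison summarized in Fig. 2.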
Some
regions were more accessible to proteinases than others. Most cuts were
observed in the basic domain, in the transactivation domain, and in the
C-terminal region. When using Arg-C, Lys-C, trypsin, proteinase K,
subtilisin Carlsberg, and Pronase, we often observed a strong diffuse
band (marked with asterisks in Fig. 3A). This
band was easily identified by its abnormal mobility using different gel
conditions and was omitted from the analysis (see legend to Fig. 3).
A strong and a weak Lys-C-specific band probably
represent cleavage after Lys
Plotting the migration (in mm) of all visible bands in
Fig. 3C as a function of the mass of the putative
peptide gave points that form a smooth curve
(Fig. 3D). This supports the assignments and indicates
that the gel resolution in the basic domain is at the level
of single amino acids.
Proteinase K, subtilisin Carlsberg, and
Pronase exhibited similar cleavage patterns, although the intensity of
each individual cleavage varied considerably (Fig. 3A).
Since these proteinases are relatively unspecific (cutting
preferentially before hydrophobic amino acids), exact assignments of
the proteolytic cleavage products are more difficult. Strong cleavage
occurred at several positions between the activation domain of Rev and
Glu
Enzymatic footprinting of nucleic acids is a powerful
approach for studying solution structure and molecular interactions of
DNA and RNA. We have used a parallel approach to study the structure of
a protein and to characterize the amino acids involved in the binding
of other macromolecules. The method is analogous to standard nucleic
acid footprinting except that radioactively end-labeled proteins are
used instead of nucleic acids, and proteinases are used instead of
endonucleases. The peptide cleavage products are subsequently resolved
on SDS gels and readily identified using appropriate internal size
markers. Using this method, the structure of HIV-1 Rev protein was
probed with 10 different proteinases under native conditions providing
insight into the overall folding of the fusion protein. Although it is
possible that some artifactual bands corresponding to cleavages of
incorrectly folded proteins may occur, the observation that the RRE
consistently inhibited proteolytic cleavages at specific amino acids by
more than 70% under specific binding conditions suggests that most of
the proteins contain a correctly folded RRE binding domain.
The RNA
binding efficiency of GST-Rev fusion protein has been investigated
previously. In one report, a 3 times higher molar concentration of
partially purified GST-Rev protein compared to nonfusion Rev protein
was needed to bind the same amount of radiolabeled input probe (6).
However, in a more recent report it has been
demonstrated that GST-Rev and nonfusion Rev protein bind RRE with
similar affinities (51). We find that the GST-Rev fusion protein
used in our study and nonfusion Rev protein exhibit essentially the
same binding affinity and specificity, using a similar gel mobility
shift assay. Although the fusion of the GST and the heart muscle kinase
site to the N and C termini of Rev, respectively, may alter the
structure locally, it is conceivable that the structure of the RNA
binding domain is not affected by the modified termini of the protein.
Some regions of Rev fusion protein are more accessible to
proteolytic cleavage than others, which may reflect a location on the
surface. The C-terminal domain (residues 75-116) was cleaved
strongly at multiple positions by most of the proteinases, whereas the
N-terminal domain (residues 1-34) generally was much less
accessible to proteinases. The central part (residues 35-66) was
cleaved by Arg-C yielding a number of evenly spaced bands. Strongest
cleavage was observed at residues flanking the central part of the
basic region (Arg
Our data show that the
RRE protection extends outside the basic domain at Arg
In a similar footprinting analysis of
elongation factor Tu, bound either to GTP or GDP, conformational
changes in the protein were accompanied by significant changes in the
proteinase cleavage pattern.
The protein footprinting methodology described
in this paper provides a general method for mapping protein domains
involved in binding of other proteins, nucleic acids, or other
macromolecules. The only requirements are that the fusion protein is
stable and that the region of interest is correctly folded when
situated in a fusion protein. Alternative methods for selective
visualization of terminal protein fragments have been used previously.
One method involves immunodetection by antisera, raised toward N- and
C-terminal peptides of the protein, in a Western blot analysis (52).
However, that method is unsuitable for small
proteins like Rev, partly because the antibody epitopes span a
significant portion of the protein and partly because blotting of small
peptides is relatively inefficient. Another method utilizes chemical
linkage of a fluorescent group to the N terminus of a protein, which
may then be visualized in a gel by UV radiation (53). The
disadvantage of this method is the requirement of irreversible
modification of all internal amino groups under denaturing conditions,
making it less useful for protein footprinting of native proteins.
These problems are avoided using the fusion-protein approach described
in this report. Independent of labeling technique, the most laborious
process in protein footprinting is the identification of the peptide
cleavage products at the amino acid level. An interesting possibility,
which we are currently testing, is to combine the protein footprinting
method with mass spectrometry technology to obtain a rapid and accurate
identification of proteolytic cleavage products.
Specificities are given according to Ref. 54.
We thank Anne Marie Szilvay for providing Rev mAbs,
Dag E. Helland and Anne Marie Szilvay for providing the Tat mAb, and
Lars Sottrup-Jensen and Claus Oxvig for peptide sequencing and amino
acid analysis. The pSVH6rev plasmid was kindly provided by Alan W.
Cochrane, and the subtilisin Carlsberg proteinase was a gift from Steen
Mortensen (Novo Nordisk). We thank Annette H. Andersen for technical
assistance and Finn Skou Pedersen, Allan Jensen, Helle Dyhr-Mikkelsen,
and Roger A. Garrett for discussions and critical reading of the
manuscript.
α-helical structure with one side facing the solvent. In
the presence of RRE, these cleavages became significantly reduced. In
addition, strong protection was observed at position 66 outside the
basic domain. As a control for the specificity of the footprinting
reaction, we confirmed the position of the epitopes for two monoclonal
antibodies. This protein footprinting methodology is generally
applicable to other proteins for which terminal modifications are
acceptable, and provides a useful tool for mapping structure, substrate
binding, and conformational changes.
encodes two regulatory proteins, Tat and Rev, which are
absolutely required for virus production. Tat protein up-regulates the
synthesis of full-length viral mRNA, whereas the Rev protein acts at a
post-transcriptional level promoting the expression of unspliced and
singly spliced mRNA in the cytoplasm (for review, see Ref. 1). Rev
thereby induces a shift in viral protein expression from the early
regulatory proteins (most importantly Tat, Rev, and Nef) to the
structural proteins (Gag/Pol and Env). Both Tat and Rev mediate their
functions through specific interaction with the viral RNA elements TAR
and RRE, respectively.
Figure 1:
Construction of Rev fusion protein. A, schematic structure of Rev
fusion protein expressed in E. coli. The N-terminal GST portion can be
removed after purification by cleavage at an internal thrombin site
(dotted box). The protein can be specifically labeled
at a serine residue within the heart muscle kinase site
(cross-hatched box) using the heart muscle
kinase enzyme. The functional domains of Rev, based on data from in
vitro and in vivo experiments, are indicated. The basic
region involved in RNA binding and nuclear/nucleolar localization
(solid box) is flanked by regions important for Rev
oligomerization (hatched boxes). The activation
domain (shaded box) is essential for Rev function
in vivo and may interact with a cellular factor and/or play a
role in Rev oligomerization. Protein segments are not drawn to scale.
B, amino acid sequence of Rev fusion protein. Putative
identifications of strong and weak proteinase cleavage sites are
denoted by solid and open arrows,
respectively, and the name of the corresponding proteinase is
indicated. C indicates the position of chemically induced
cleavage used as a marker. Positions are numbered according to the N
terminus of the Rev amino acid sequence. The basic and activation
domains are boxed, and dotted lines indicate
regions in which amino acid assignments were uncertain. Abbreviations:
HMK, heart muscle kinase target site; T, thrombin;
L, Lys-C; A, Arg-C; G, Glu-C; As,
Asp-N; Y, trypsin.
Several mechanisms have been suggested for Rev function. It may
directly interfere with spliceosome assembly (12, 13, 14, 15), protect
the RNA from mRNA degradation (16), activate cytoplasmic transport
of incompletely spliced mRNA (16, 17, 18),
and/or influence the translatability of mRNA in the
cytoplasm (19, 20, 21). The Rev response is
mediated through the specific interaction of multiple Rev proteins with
RRE. This interaction has been studied in vitro by a number of
different techniques including gel retardation
assays (6, 22, 23, 24, 25, 26, 27, 28, 29),
RNA footprinting and chemical modification interference
analysis (14, 30, 31), circular dichroism
experiments (32), and systematic evolution of ligands by
exponential enrichment experiments (33, 34, 35).
Recent NMR studies have resolved structural features of a single high
affinity Rev binding site located within the RRE
RNA (36, 37). However, in the absence of NMR and x-ray
crystallography data for Rev, information about the protein structure
remains scarce. A circular dichroism analysis of a peptide spanning the
basic domain indicates that this region forms an α-helix in
solution (38). Based on results using the same technique, it was
more recently proposed that the helical structure of the basic domain
forms one arm of a more extended helix-loop-helix motif in
Rev (39). We have studied the structure of Rev using protein
footprinting and identified single amino acids protected by the RRE
upon binding. The approach involves limited proteolysis of Rev fusion
protein specifically labeled at the C-terminal end, in the absence and
presence of molecular ligands.
Construction of Plasmid
The pRRE plasmid
contains a 223-base pair region (position 7768-7991) encoding the
RRE of the HXB-3 isolate of HIV-1 (31). The Rev sequence was
derived from the pSVH6rev plasmid (40), which contains a
functional synthetic Rev gene including codons more efficiently
utilized in prokaryotes. This sequence corresponds to Rev from the
HIV-1 HXB2 strain but contains a serine and a threonine at positions 61
and 114, respectively. Sequencing of the pSVH6rev plasmid revealed a
single base mutation resulting in a Val → Ile
substitution in Rev as compared with the original published sequence.
However, since this mutation frequently occurs at this amino acid
position in wild-type HIV-1 strains, it is not likely to affect the
structure and function of Rev. The pSVH6rev plasmid was used as a
polymerase chain reaction template and amplified using M13(-40)
forward primer and a 5'-GCGAATTCCTTCTTTAGCTCC-3' primer creating an
EcoRI restriction site at the 3'-end. The resulting polymerase
chain reaction fragment was digested with EcoRI and partially
with BamHI, and the full-length Rev fragment was ligated to a
BamHI-EcoRI digest of pGEX-GTH.
The resulting plasmid encodes a GST-Rev fusion protein
containing a C-terminal recognition sequence for the catalytic subunit
of cAMP-dependent heart muscle kinase (see Fig. 1). The construct
was verified by sequencing.
Expression and Labeling of Protein
Fusion protein
was expressed in Escherichia coli strain BL21 and purified
essentially as described by Pharmacia Biotech Inc. The bacterial
cultures were induced at log phase with 0.1 mM
isopropyl-1-thio-β-D-galactopyranoside, and growth was
continued for an additional 3-5 h. Bacteria were harvested by
centrifugation and resuspended in buffer A (20 mM HEPES, pH
7.9, 200 mM NaCl, 20% glycerol, 10 mM
β-mercaptoethanol) containing 0.2 mM phenylmethylsulfonyl
fluoride, 0.5 µg/ml leupeptin, 2.0 µg/ml aprotinin, and 0.1
mM EDTA. Resuspended bacteria were sonicated on ice in short
bursts and cleared of insoluble material by centrifugation. Fusion
protein was collected from the supernatants using glutathione-Sepharose
4B (Pharmacia) at a concentration of 0.5 µl of Sepharose bed
volume/ml of bacterial culture. After 30 min with gentle agitation at
room temperature, beads were collected by centrifugation and washed 3
times with PBS (140 mM NaCl, 2.7 mM KCl, 10
mM Na2HPO4, 1.8 mM KH2PO4, pH 7.3). At this step, fusion protein
bound to beads was normally stored at -70 °C in PBS
containing 20% glycerol. For labeling purposes, fusion protein bound to
Sepharose beads was washed 3 times in heart muscle kinase buffer (20
mM Tris/HCl, pH 7.5, 100 mM NaCl, 12 mM
MgCl2) and subsequently labeled in 300 µl of protein
kinase reaction mixture (heart muscle kinase buffer containing 100
units of kinase (Sigma) and 0.33 mCi
[γ-32P]ATP (Amersham Corp., 7000 Ci/mmol)) per
ml of bed volume of glutathione-Sepharose 4B for 30 min at 4 °C.
Unincorporated nucleotides were removed by washing the beads 5 times
with PBS. End-labeled fusion protein was eluted from the beads either
by gentle shaking 3-4 times in 1 bed volume of PBS containing
10-20 mM reduced glutathione for 30 min at 4 °C or
by thrombin cleavage, incubating with 50 units of thrombin
(Pharmacia) per ml of Sepharose bed volume for 2 h at 20 °C.
Nonfusion Rev protein was prepared as described previously (31).
Protein concentration and purity were determined by hydrolysis of
protein samples in 6 M HCl, 0.01% phenol, 5% thioglycolic acid
at 110 °C for 18 h followed by quantification of free amino
acids (41). The purity of both the fusion and the nonfusion Rev
proteins was assessed to be above 70%.
Preparation of RNA and Gel Mobility Shift
Analysis
RNA used for protein footprinting was synthesized in
200-µl reaction mixtures containing 10 µg of linearized pRRE
plasmid DNA, 40 mM Tris/HCl, pH 7.4, 6 mM
MgCl2, 4 mM spermidine, 10 mM
dithiothreitol, 50 units of RNasin, 0.5 mM ATP, 0.5
mM UTP, 0.5 mM GTP, 0.5 mM CTP, 1 µCi
[α-32P]UTP (Amersham Corp., 900 Ci/mmol), and
200 units of T3 RNA polymerase (Stratagene). The RNA was purified on a
4% polyacrylamide-8 M urea gel, extracted in 0.25 M
NaAc, pH 6.0, 1 mM EDTA in the presence of phenol, and ethanol
precipitated. The final concentration of the RNA was calculated from
the specific activity of incorporated 32P label. The RNA was
renatured by incubation in renaturation buffer (10 mM
HEPES/KOH, pH 7.5, 50 mM NaCl) for 2 min at 80 °C followed
by slow cooling to 37 °C. Transcription of radiolabeled RNA
substrates of high specific activity and gel mobility shift analysis
were done as described previously (31).
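The RNA concentration above is derived from the specific activity of the incorporated label. A minimal sketch of that arithmetic follows, assuming the labeled UTP contributes negligibly to the 0.5 mM unlabeled UTP pool (1 µCi at 900 Ci/mmol is roughly 1 pmol). The function names, the counts, and the uridine number per transcript are hypothetical placeholders, not values from this study.

```python
def utp_specific_activity(total_cpm_added, utp_mM, volume_ul):
    """Return cpm per pmol of UTP in the transcription reaction.

    mM x µl gives nmol; multiplying by 1000 converts to pmol.
    The tiny molar amount of labeled UTP is ignored (assumption).
    """
    pmol_utp = utp_mM * volume_ul * 1000.0
    return total_cpm_added / pmol_utp

def transcript_pmol(cpm_in_rna, cpm_per_pmol_utp, uridines_per_transcript):
    """Convert counts in purified RNA to pmol of full-length transcript."""
    pmol_ump_incorporated = cpm_in_rna / cpm_per_pmol_utp
    return pmol_ump_incorporated / uridines_per_transcript

# Hypothetical counting results for a 200-µl reaction with 0.5 mM UTP.
activity = utp_specific_activity(total_cpm_added=2.0e6, utp_mM=0.5, volume_ul=200)
print(transcript_pmol(cpm_in_rna=4.0e5, cpm_per_pmol_utp=activity,
                      uridines_per_transcript=50))
```

Dividing by the number of uridines per transcript is what turns incorporated nucleotide into moles of RNA; that number has to be taken from the actual transcript sequence.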
Ligand Binding and Proteinase Digestion
One µl
of RNA renaturation buffer containing 500 ng of RRE RNA was added to
100 ng of Rev fusion protein (approximately 2 times molar excess of
RNA) in 9 µl of Rev binding buffer (10 mM HEPES/KOH, pH
7.5, 100 mM KCl, 1 mM MgCl2, 0.5
mM EDTA, 1 mM dithiothreitol, 10% glycerol, 0.5 unit
of RNasin, 100 ng/µl bovine serum albumin, 50 ng/µl E.
coli tRNA) followed by 20 min of incubation on ice. In the control
reaction the RRE was replaced with 500 ng of E. coli tRNA
(Boehringer Mannheim). For Rev-mAb binding, approximately 100 ng of Rev
fusion protein was incubated in 9 µl of buffer B (5 mM
Tris/HCl, pH 7.4, 75 mM NaCl, 1 mM EDTA, 0.025%
Nonidet P-40, 100 ng/µl bovine serum albumin) for 10 min at room
temperature followed by the addition of 1 µl of mAb (1
µg/µl) and reincubation for 10 min at room temperature.
Immediately after the binding of RNA or mAb to Rev, 10 µl of the
respective proteinase (diluted in water) was added, and the mixture was
incubated for 15 min at 37 °C. Reactions were put on ice and
stopped by the addition of 6.7 µl of 4× SDS loading buffer
(24% glycerol, 6.8% SDS, 230 mM Tris/HCl, pH 6.8, 0.01% Serva
Blue W (Serva), 3.3% β-mercaptoethanol). Concentration ranges in
the final reaction mixture of the different proteinases were as
follows: 0.05 unit/µl thrombin (Pharmacia unit definition),
1-10 ng/µl Lys-C (Sigma), 0.005-0.05 unit/µl Arg-C
(Sigma unit definition), 5-50 pg/µl trypsin,
tosyl-phenylalanine chloromethyl ketone-treated (Cooper Biomedicals),
0.5-5 ng/µl Glu-C (Boehringer Mannheim), 0.02-0.5
ng/µl Asp-N (Sigma), 5-50 pg/µl proteinase K (Boehringer
Mannheim), 5-50 pg/µl subtilisin Carlsberg, 0.5-5
ng/µl Pronase (Boehringer Mannheim), 0.5-5 ng/µl
thermolysin (Sigma), 5-50 pg/µl bromelain (Boehringer
Mannheim).
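The molar excess of RNA over protein quoted for the binding reaction follows from the two mass inputs and the molar masses of the components. The sketch below shows the conversion. The molar masses used in the example are hypothetical placeholders chosen only to illustrate the arithmetic, and the ratio obtained depends directly on them.

```python
def ng_to_pmol(mass_ng, molar_mass_da):
    """Convert a mass in ng to pmol (ng divided by g/mol gives nmol)."""
    return mass_ng / molar_mass_da * 1000.0

def molar_ratio(rna_ng, rna_molar_mass_da, protein_ng, protein_molar_mass_da):
    """Return mol RNA per mol protein in a binding reaction."""
    return (ng_to_pmol(rna_ng, rna_molar_mass_da)
            / ng_to_pmol(protein_ng, protein_molar_mass_da))

# 500 ng RNA and 100 ng protein as in the binding reaction; both molar
# masses below are assumed round numbers, not measured values.
print(molar_ratio(rna_ng=500, rna_molar_mass_da=80_000,
                  protein_ng=100, protein_molar_mass_da=40_000))
```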
SDS-Polyacrylamide Gel Electrophoresis
The
cleavage products were resolved using discontinuous
Tricine-SDS-polyacrylamide gel electrophoresis (42) to achieve
optimal resolution of small peptides. Electrophoresis was done in
0.4-mm thick, 30 × 40-cm slab gels. The acrylamide percentage was
either 16 or 20% for the resolving gel and 7% for the stacking gel.
Samples were routinely run through the stacking gel at 20 mA and then
at constant 40 mA current until the Serva Blue W dye ran out of the
gel. Gels were dried and autoradiographed with screens at -80 °C.
Protein Sequence Analysis and Chemical
Cleavage
Protein separated by SDS-polyacrylamide gel
electrophoresis was electroblotted (43) onto a ProBlott membrane
(Applied Biosystems). The band of interest was excised and subjected to
sequence analysis on an Applied Biosystems 477A instrument equipped
with an on-line 120-A chromatograph. Analysis was performed using
approximately 20 pmol of peptide. Specific cleavage at the Rev
tryptophanyl residue was done as described by Huang et al. (44).
Preparation of Full-length Radiolabeled
Protein
Rev protein containing a glutathione
S-transferase (GST) tag at the N terminus and the recognition
sequence for the catalytic subunit of cAMP-dependent heart muscle
kinase at the C terminus was expressed in E. coli (Fig. 1A). The GST tag allowed rapid affinity
purification of the fusion protein on a glutathione-Sepharose
matrix (45), and the presence of the heart muscle kinase site
(RRASV) facilitated specific labeling at the serine residue in the
presence of [γ-32P]ATP and the heart muscle
kinase enzyme (46). In a control experiment, we observed no
significant labeling of GST-Rev fusion protein lacking the heart muscle
kinase site, implying that the kinase is highly specific toward its
native site (data not shown). The specifically labeled Rev fusion
protein will therefore be referred to as end-labeled Rev protein in the
text below. Positioning the GST and the heart muscle
kinase site at opposite ends of the protein ensured that only
full-length Rev was labeled. In an initial construct, in which both the
GST and the heart muscle kinase tag were placed at the N terminus
(using the pGEX-2TK vector (Pharmacia)), a considerable amount of
radioactive protein degradation products was observed that interfered
with the footprinting analysis (data not shown). A cleavage site for
thrombin endopeptidase located between the GST tag and the Rev protein
enabled the removal of the GST part if necessary (Fig. 1A).
Since partial cleavage by thrombin after position 66 within the Rev
sequence was observed, most experiments were performed on GST-Rev
fusion protein.
Figure 2:
Gel mobility shift analysis of protein-RNA
complex formation. Approximately 2 ng of uniformly labeled RRE or
antisense RRE (rRRE) was incubated with increasing
concentration of wild-type Rev or GST-Rev fusion protein (nM)
as indicated. RRE and rRRE indicate the positions of
free probes, and Ori. marks the origin of the
gel.
Structural Analysis of Rev Using
Proteinases
End-labeled Rev fusion protein was digested under
native conditions over a wide concentration range with 10
endopeptidases: Lys-C, Arg-C, trypsin, Glu-C, and Asp-N, which are
relatively sequence specific and proteinase K, subtilisin Carlsberg,
Pronase, thermolysin, and bromelain, which cleave less specifically
(Fig. 3A, and see for specificities). All
of these proteinases are active under conditions that are optimal for
the stability of Rev-RRE complexes and do not contain detectable
RNase activity (results not shown). Partial cleavage of the Rev protein
was observed only within a relatively narrow range of proteinase
concentrations. To favor single-hit kinetics,
conditions for protein footprinting were chosen such that at least 50%
of the radioactivity remained in the band containing uncleaved protein.
Under these conditions, only a subset of the potential proteolytic
target sites was cleaved, which probably reflects the accessibility
of the cleavage sites within the protein structure. By comparing the
bands produced by proteinases and chemicals of different specificities,
it was possible to make a putative identification of most of the bands
as described below (Figs. 1B and 3A).
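The 50% criterion for single-hit conditions can be made concrete with a simple model. The sketch below assumes, purely for illustration, that cuts are distributed over molecules according to Poisson statistics. This assumption is not stated in the text, and real proteolysis may deviate from it. The function names are hypothetical.

```python
import math

def mean_cuts(fraction_uncleaved):
    """Mean number of cuts per molecule if cuts are Poisson distributed."""
    return -math.log(fraction_uncleaved)

def single_cut_share(fraction_uncleaved):
    """Fraction of the cleaved molecules that carry exactly one cut."""
    m = mean_cuts(fraction_uncleaved)
    p_exactly_one = m * math.exp(-m)
    p_at_least_one = 1.0 - math.exp(-m)
    return p_exactly_one / p_at_least_one

for uncleaved in (0.9, 0.7, 0.5):
    print(uncleaved, round(single_cut_share(uncleaved), 3))
```

With an end-labeled substrate only the cut nearest the label is visible, so molecules cut more than once bias the pattern toward fragments close to the labeled end. Keeping most of the protein uncleaved limits that bias.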
Figure 3:
Analysis of proteolytic digests of GST-Rev fusion protein. A,
autoradiogram of a 20% protein gel showing proteolytic cleavage
products. Rev fusion protein was digested with increasing
concentrations of the indicated proteinases. C denotes a
control lane containing untreated fusion protein, and T denotes a lane containing thrombin-cleaved Rev fusion protein as a
marker. GST-Rev and Rev indicate the N terminus of the GST-Rev fusion
protein and Rev, respectively. Putative identifications of
corresponding amino acids are indicated except for the basic region
(residues 35-50), which is more closely investigated in
panels C and D. Assignment of most bands
occurring in the GST portion is not attempted. Identification of a
secondary thrombin cleavage site at Arg within the Rev
sequence is based on peptide sequencing (see "Experimental
Procedures"). The remaining bands were identified on the basis of
their relative positions, compared with products from other specific
proteinases and with bands in marker lanes containing mixtures of
unrelated peptides (not shown). An unidentified artifactual band,
which migrated at different positions (ranging from 2
to 40 kDa) depending on type of proteinase, duration of
electrophoresis, and acrylamide percentage of the gel, is indicated by
a star. Since all proteinase digests were performed multiple
times, using different electrophoresis conditions, these bands were
easily identified and omitted from the analysis. Proteinase
concentrations in the final reaction mixtures were as follows: 0.05
unit/µl thrombin; 1, 3, and 10 ng/µl Lys-C; 0.005, 0.015, and
0.05 unit/µl Arg-C; 5, 15, and 50 pg/µl trypsin; 0.5, 1.5, and
5 ng/µl Glu-C; 0.05, 0.15, and 0.5 ng/µl Asp-N; 5, 15, and 50
pg/µl proteinase K; 5, 15, and 50 pg/µl subtilisin Carlsberg;
0.5, 1.5, and 5 ng/µl Pronase; 0.5, 1.5, and 5 ng/µl
thermolysin; 5, 15, and 50 pg/µl bromelain. B, diagram
showing the relationship between the logarithm of the mass and
migration of the bands shown in panel A. The mass was
calculated for the radioactive C-terminal fragment produced by
proteolytic cleavage, and the migration was measured as the distance
between the bottom of the stacking gel and the center of the
radioactive band. For peptides above 5 kDa, an almost linear
relationship was obtained. In the basic region (positions 35-50)
the curve became more horizontal, reflecting larger spacing between the
bands. The reverse effect occurred in the 50-80 region, where
less resolution was observed. Lower molecular mass peptides (<5
kDa) generally tend to migrate too slowly to conform to the linear
relationship. The boxed region is analyzed in more
detail in panel D. C, autoradiogram
of a 20% protein gel showing Arg-C and trypsin digests in the Rev
region coelectrophoresed along with a marker for Trp
(W), thrombin-digested protein (T), and a
control containing untreated protein (C). At the highest
trypsin and Arg-C concentrations, multiple bands became visible, all of
which could be accounted for by individual arginines in the Rev
sequence. Assuming that the Trp
band migrates in between
the suggested Arg
and Arg
bands and knowing
the identity of Arg
and Arg
enabled a putative assignment of the remaining bands. The band
labeled Arg may either correspond to Arg
or
Arg
. Final proteinase concentrations were as follows: 0.02
and 0.05 unit/µl for Arg-C and 5 pg/µl and 15 pg/µl for
trypsin. The Trp
was cleaved by CNBr at the carboxyl side
as described under "Experimental Procedures."
Electrophoresis conditions were as described in panel A. D, diagram showing the relationship between
the calculated peptide mass and the distance migrated for the bands
shown in panel C (for details, see legend to
panel B).
SDS-polyacrylamide gel electrophoresis generally provides an almost
linear correlation between the logarithm of polypeptide mass and gel
mobility (47). However, a nonlinear relationship is occasionally
observed mainly because amino acids have different molecular weights,
bind SDS with different affinities, and are not uniformly
charged (48). In particular, peptides containing stretches of
acidic or basic amino acids often migrate abnormally, and heterologous
protein markers only allow a rough estimate of proteolytic cleavage
positions. A plot of the logarithm of the mass of the putatively
identified Rev peptides as a function of distances migrated in the SDS
gel was almost linear for masses above 5 kDa (Fig. 3B).
However, abnormally large spacing between the bands was observed in the
basic region (position 35-50), consistent with the notion that
basic proteins generally exhibit a high apparent size on SDS
gels (49). Abnormal migration was also observed for the peptide
cleaved before Asp (Fig. 3B).
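The near-linear relation between log mass and migration described above can be used to estimate the mass of an unassigned band from assigned ones. The following is a minimal sketch using an ordinary least-squares fit in plain Python. The marker values are hypothetical and the fit is only meaningful in the range where the relation is close to linear (above about 5 kDa according to the text).

```python
import math

def fit_log_mass(migrations_mm, masses_da):
    """Least-squares fit of log10(mass) = intercept + slope * migration."""
    ys = [math.log10(m) for m in masses_da]
    n = len(migrations_mm)
    mean_x = sum(migrations_mm) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in migrations_mm)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(migrations_mm, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_mass(migration_mm, slope, intercept):
    """Predict the mass (Da) of a band from its migration distance."""
    return 10 ** (intercept + slope * migration_mm)

# Hypothetical marker bands: migration in mm and fragment mass in Da.
marker_migration = [40.0, 90.0, 150.0, 210.0]
marker_mass = [30_000.0, 18_000.0, 10_000.0, 6_000.0]
slope, intercept = fit_log_mass(marker_migration, marker_mass)
print(round(estimate_mass(120.0, slope, intercept)))
```

Bands that fall well off the fitted line, as reported for the basic region, are the ones whose assignment needs independent markers.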
and Lys
,
respectively, which are the only lysine residues in Rev
(Fig. 1B and 3A). In contrast, Glu-C only
cleaved at two to three of 11 potential Glu-C sites in Rev
(Fig. 1B and 3A), and the most accessible site
occurred just below the band corresponding to Asp
and was
identified as Glu
. A strong cleavage site was also
observed in the C-terminal end of Rev probably corresponding to
Glu
. This assignment is based on the detection of two
weak Glu-C-specific bands, apparent at high enzyme concentrations,
immediately below corresponding to Glu
and Glu
(result not shown). Asp-N cleaved strongly before
Asp
, which is positioned immediately C-terminal to the
activation domain (Fig. 1B and 3A). A weak band
appeared on some gels at the same position as Glu
, which
possibly originates from cleavage before Asp
. Since Asp-N
also cuts at cysteine residues, albeit with lower efficiency, we cannot
exclude that the bands derive from cleavage before Cys
or
Cys
, which are the only cysteine residues in Rev. Cleavage
at the other aspartic acids, at positions 7 and 9, was not detected.
When plotted on the mass/mobility diagram, the band corresponding to
the Asp
cleavage product migrated abnormally slowly
(Fig. 3B). However, the identification is probably
correct since no other aspartic acids (or cysteines) occur in the
10-83 region. Arg-C cleaved at regions flanking the Rev segment,
probably corresponding to residues Arg
and
Arg
in the thrombin recognition site and in the heart
muscle kinase site, respectively, and at several positions within, and
C-terminal to, the basic domain (Fig. 1B and
3C). Interestingly, bands in this region appeared as an evenly
spaced pattern representing a subset of the arginine residues. Based on
relative mobility of the bands and two specific markers corresponding
to cleavage after Trp
and Arg
, it
was possible to assign each of the cleavages to single amino acids
(Fig. 3, C and D). Most accessible were
residues Arg
, Arg
or
Arg
, Arg
, and Arg
, whereas
Arg
, Arg
, and Arg
were cleaved
to a minor extent. Trypsin treatment yielded a similar pattern to
Arg-C. However, the specificity of Arg-C and trypsin for the arginines
differed greatly. Most noticeable were the strong Arg-C cleavages after
Arg
and Arg
, which are not,
or are weakly, cleaved by trypsin, and the strong cleavage after
Arg
by trypsin, which is not cleaved by Arg-C above
background (Fig. 3, A and C). In addition,
trypsin-digested samples exhibited a weak band below
Arg
, which probably corresponds to Arg
(Fig. 3C). At an increased trypsin concentration,
several additional weak bands appeared within the basic region of Rev
that could all be accounted for by corresponding arginine residues in
the amino acid sequence (Fig. 3C). It is possible that
increased cleavage activity at the higher proteinase concentration
partially denatures the protein structure and exposes additional
arginines.
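The evenly spaced cleavage pattern noted above is the kind of observation that a helical-wheel projection makes easy to check. The sketch below places residues at 100° per residue, the standard value for an α-helix with 3.6 residues per turn, and measures how tightly a set of positions clusters on one face. The residue offsets in the example are hypothetical and are not the Rev cleavage sites.

```python
import math

def wheel_angle(residue, reference, degrees_per_residue=100.0):
    """Angular position (degrees) of a residue on an α-helical wheel."""
    return ((residue - reference) * degrees_per_residue) % 360.0

def face_clustering(residues, degrees_per_residue=100.0):
    """Mean resultant length of the wheel angles (0 = spread, 1 = one face)."""
    reference = residues[0]
    angles = [math.radians(wheel_angle(r, reference, degrees_per_residue))
              for r in residues]
    x = sum(math.cos(a) for a in angles)
    y = sum(math.sin(a) for a in angles)
    return math.hypot(x, y) / len(residues)

# Hypothetical positions spaced i, i+4, i+7, i+11 versus a contiguous run.
print(round(face_clustering([0, 4, 7, 11]), 2))
print(round(face_clustering(list(range(18))), 2))
```

A value near 1 for the accessible sites would be consistent with one side of a helix facing the solvent, whereas a contiguous stretch of residues gives a value near 0.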
, whereas no cleavage was observed in, or N-terminal
to the basic domain. Bromelain, which is a relatively unspecific
proteinase, cleaved strongly at a position near Asp
,
Glu
, and just below Glu
. In addition, weak
cleavage was observed near Lys
and Arg
(Fig. 3A). Thermolysin cleaved strongly at a
position around the activation domain and in the 95-100 region.
Footprinting the RRE Binding Site
Proteinases
attack the surface of a folded protein, and their activity may
therefore be sensitive to steric hindrance by intermolecular
interactions. Probing a protein in the presence and absence of a
substrate may therefore provide information about which amino acids are
involved in binding. End-labeled Rev protein was probed with 10
different proteinases in the absence and presence of a 2-fold molar
excess of RRE RNA (Fig. 4). In the reaction without RRE, a
similar amount of E. coli tRNA was added as control RNA.
Strong protection of specific cleavage by Arg-C was observed at
Arg, Arg
or
Arg
, Arg
, and
Arg
, whereas cleavage at Arg
,
Arg
, and Arg
was unaffected (Fig. 4).
Weak bands corresponding to Arg
and Arg
were
reduced to background levels upon RNA binding. The cleavage pattern by
Lys-C, Glu-C, Asp-N, proteinase K, subtilisin Carlsberg, Pronase,
thermolysin, and bromelain was unaffected by the presence of RRE RNA
(results not shown). Minor, but consistent protection against trypsin
digestion was observed at Arg
or Arg
,
indicating that trypsin may be less sensitive to RNA protection
(results not shown). The protection of arginines by the RRE may indicate
that these amino acids interact directly with, or are shielded by, the
RNA. Alternatively, RNA-induced conformational changes or protein
multimerization may render these residues inaccessible to the proteinase.
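The comparison of cleavage in the presence and absence of RRE can be made explicit as a ratio of band intensities. The sketch below is one hedged way to do this, assuming that band intensities have been quantified from the autoradiogram and that one band unaffected by the ligand is available as a loading reference. The function names, the 0.5 threshold, and all intensity values are hypothetical; the source reports protection qualitatively.

```python
def protection_ratios(without_ligand, with_ligand, reference):
    """Return {site: normalised intensity with ligand / without ligand}.

    Each lane is a dict mapping a band label to its intensity.
    `reference` names a band assumed to be unaffected by the ligand; it is
    used to correct for differences in loading between the two lanes.
    """
    ratios = {}
    for site, intensity in without_ligand.items():
        if site == reference:
            continue
        minus = intensity / without_ligand[reference]
        plus = with_ligand[site] / with_ligand[reference]
        ratios[site] = plus / minus
    return ratios


def protected_sites(ratios, threshold=0.5):
    """Sites whose cleavage falls below `threshold` of the ligand-free level."""
    return sorted(site for site, r in ratios.items() if r < threshold)


# Hypothetical band intensities (arbitrary units), not data from this study.
lane_trna = {"site_A": 100.0, "site_B": 80.0, "ref": 50.0}
lane_rre = {"site_A": 20.0, "site_B": 80.0, "ref": 50.0}
ratios = protection_ratios(lane_trna, lane_rre, reference="ref")
print(protected_sites(ratios))
```

A ratio near 1 corresponds to an unaffected cleavage, and a ratio well below 1 corresponds to protection. The ratio alone cannot distinguish direct shielding from conformational change or multimerization.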
Figure 4:
Autoradiogram of an SDS protein gel showing
the footprint obtained with RRE RNA. Rev fusion protein was digested
with Arg-C in the presence of RRE RNA or the same amount of tRNA,
indicated by + and -, respectively. C denotes
control lanes in which Arg-C proteinase was omitted, and T indicates a marker lane where Rev fusion protein was digested with
thrombin, which cleaves at Arg and at
Arg
. Proteolytic cleavages that are specifically
sensitive to RRE include Arg
, Arg
or
Arg
, Arg
, and Arg
(Fig.
1B). Proteinase concentrations in the final reaction mixtures
were: 0.05 unit/µl thrombin, 0.015 unit/µl Arg-C, and 0.025
unit/µl Arg-C. The SDS gel contained 20%
acrylamide.
Mapping Monoclonal Antibody Epitopes
Since mAb
recognition sites, natural substrate binding sites, and
proteolytic-sensitive sites are generally located on the surface of
native proteins, these sites will often overlap structurally.
Mapping the epitopes of two Rev-specific mAbs therefore served
as an appropriate positive control for the protein footprinting
approach. We tested the specificity of the
technique using two mAbs (mAb1 and mAb2) that
have previously been shown to interact with peptides spanning amino
acids 75-88 and 91-105, respectively
(50). A
Tat-specific mAb was used as a negative control. Rev fusion protein
labeled at the C terminus was digested with Arg-C, Glu-C, Asp-N,
proteinase K, subtilisin Carlsberg, and bromelain in the presence and
absence of mAb1 or mAb2. Binding of mAb1 protected Rev against Asp-N-specific
cleavage at Asp and against proteinase K- and subtilisin
Carlsberg-specific cleavages in a region near Asp
(Fig. 5). Binding of mAb2 resulted in protection against
proteinase K and subtilisin Carlsberg in the 92-96 region and
against subtilisin Carlsberg cleavage near Glu
. Cleavage
with Arg-C, Glu-C, and bromelain was not inhibited by mAb1 or mAb2
binding. Of particular interest is the unaffected cleavage at
Glu
, suggesting that this residue is not recognized by
mAb2. The results obtained with protein footprinting correlate very
well with the epitope mapping data obtained previously (50) and
reinforce the conclusion that this method provides reliable information
about domains involved in intermolecular interactions.
Figure 5:
Autoradiogram of an SDS protein gel showing
the footprint obtained with Rev-specific mAbs. Rev fusion protein was
digested with the indicated proteinases in the presence of mAb1 that
specifically recognizes residues 75-88 or in the presence of mAb2
recognizing residues 91-105. As a control, a mAb recognizing
residues 49-85 in the Tat protein was included. C
denotes a control lane without added proteinase, and T shows
thrombin-cleaved Rev fusion protein as a marker. Sites specifically
sensitive to Rev-specific mAb1 binding included the Asp-N cleavage at
Asp and the proteinase K and subtilisin Carlsberg cleavages
slightly N-terminal to this position. Binding of mAb2 protected
Rev against cleavage by proteinase K and subtilisin Carlsberg in the
92-96 region and against subtilisin Carlsberg cleavage near
Glu
. Proteinase concentrations in the final reaction
mixtures were: 1.5 ng/µl Glu-C, 0.2 ng/µl Asp-N, 50 pg/µl
proteinase K, and 5 pg/µl subtilisin Carlsberg. The SDS gel
contained 16% acrylamide.
, Arg
or Arg
,
Arg
, and Arg
), whereas the core of the basic
domain (residues 40-48) was more resistant to proteolytic
cleavage yielding only weak bands putatively identified as Arg
and Arg
. Interestingly, when placed in an
α-helical projection, the identified cleavage sites are confined to
one face of the helix (Fig. 6). This strongly suggests that the
basic region forms an α-helix in the context of the whole protein,
exposing one face of the helix to the solvent. Such an interpretation
is supported by circular dichroism data, which show that a peptide
spanning only the basic domain of Rev forms an α-helix in solution
and that the helicity of the peptide is important for specific RRE
binding (38). Since the N-terminal region of Rev protein is
relatively resistant to proteolytic cleavage, our data do not allow
testing of the proposal that residues 8-55 of Rev form an
extended helix-loop-helix motif (39).
Figure 6:
Helical wheel projection of amino acids
threonine 34 to glutamine 51. Circles denote basic residues,
tilted boxes denote polar residues, and boxes denote hydrophobic residues. Arginines cleaved strongly by Arg-C
(boldface arrows) include Arg,
Arg
or Arg
, and Arg
. Weak Arg-C-specific
cleavages (thin arrows) include Arg
and Arg
. Affected arginines are all located on one
face of an α-helix. No cleavage was observed after
Arg
, Arg
, Arg
, Arg
,
and Glu
by Arg-C or Glu-C. Numbers refer to
positions in the Rev sequence (see Fig.
1B).
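The geometric argument behind a helical wheel can be sketched numerically. In an ideal α-helix with 3.6 residues per turn, each residue is rotated 100° relative to the previous one, so residues spaced i, i+3, i+4, and i+7 fall within a narrow arc of the wheel. The code below is an illustrative sketch, not the procedure used to draw Fig. 6; the function names are hypothetical and the offsets are arbitrary examples, not the cleavage sites reported here.

```python
def wheel_angle(offset, degrees_per_residue=100.0):
    """Angular position of a residue `offset` positions along an ideal alpha-helix."""
    return (offset * degrees_per_residue) % 360.0


def smallest_arc(offsets, degrees_per_residue=100.0):
    """Width in degrees of the narrowest arc of the wheel containing all residues."""
    angles = sorted(wheel_angle(o, degrees_per_residue) for o in offsets)
    # Gaps between neighbouring angles, including the wrap-around gap.
    gaps = [b - a for a, b in zip(angles, angles[1:])]
    gaps.append(360.0 - angles[-1] + angles[0])
    # The residues occupy everything except the largest empty gap.
    return 360.0 - max(gaps)


# Arbitrary example offsets relative to the first residue of the helix.
print(smallest_arc([0, 3, 4, 7]))  # residues clustered on one face
print(smallest_arc([0, 1, 2, 3]))  # consecutive residues spread around the helix
```

A small arc for a set of accessible residues is consistent with those residues lying on a single solvent-exposed face, which is the reasoning applied to the cleavage sites in the basic domain.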
Comparison of the proteolytic
digestion pattern of the protein alone and in complex with RRE shows that
the amino acids forming the putative α-helical structure are also
affected by RNA binding. Four major Arg-C-specific cleavage sites,
corresponding to Arg
, Arg
or
Arg
, Arg
, and Arg
, were
considerably reduced in the presence of RRE RNA, whereas no effects
were observed in the presence of the same amount of tRNA. The
protection of Arg
, Arg
or Arg
,
and Arg
against Arg-C cleavage may reflect direct
protection from the proteinase by the RRE. This interpretation is supported
by binding studies of Rev and related peptides to the RRE. Based on
in vitro RNA footprinting and chemical modification
interference experiments, it has been shown that a peptide, containing
amino acids 34-50 of Rev (Rev 34-50), binds specifically to
the RRE, forming almost the same contacts to the RNA as the intact
protein
(14). Moreover, mutating Arg
,
Arg
, Arg
or Arg
in Rev
34-50 decreases the specificity of the RNA binding significantly,
suggesting that these amino acids contact the RNA
(38). The
importance of these amino acids has also been studied in vivo.
Substitution of both Arg
and Arg
or
Arg
and Arg
strongly reduces RRE binding and
Rev activity in vivo (3, 5, 6, 7). However, a
recent exhaustive scan using single arginine substitutions in Rev
showed that, in contrast to the Rev 34-50 study by Tan et
al., no single arginine within the basic domain is essential for
Rev function in vivo (51). This suggests that the
arginines within the basic domain of intact Rev protein are
functionally redundant for RRE binding (51).
.
This amino acid has not previously been assigned any role in RNA
binding and mutating Arg
has only marginal effect on Rev
function
(3). Possibly, Arg
interacts with the RRE,
providing an explanation for the decreased specificity of RNA binding
and stricter sequence requirement observed for Rev 34-50
compared with intact Rev protein
(38, 51).
Alternatively, protection of Arg
upon RRE binding may
reflect steric hindrance of the proteinase by the RNA,
oligomerization of the protein on the RRE, or induced conformational
changes in the protein, resulting in protein structures that are less
sensitive to proteinases.
In contrast, when
probing the Rev fusion protein, no additional cleavage sites or major
enhancements were observed upon RRE binding, suggesting that
conformational changes in Rev are minimal. This observation is
supported by circular dichroism spectra of Rev, which show only
marginal changes in the content of helical structure upon RRE
binding (32).
Table:
List of proteinases used in this study
©1995 by The American Society for Biochemistry and Molecular Biology, Inc.