(Received for publication, April 15, 1996, and in revised form, August 27, 1996)
From the ¶ Department of Research Medicinal Chemistry, the
Department of Molecular, Cellular, and Structural
Biology, and the § Department of Infectious Diseases, Isis
Pharmaceuticals, Carlsbad, California 92008
We describe our initial application
of a biochemical strategy, comprising combinatorial screening and
rational optimization, which directly identifies oligonucleotides with
maximum affinity (per unit length), specificity, and rates of
hybridization to structurally preferred sites on folded RNA, to the
problem of design of antisense oligonucleotides active against the
hepatitis C virus (HCV). A fully randomized sequence DNA
oligonucleotide (10-mer) library was equilibrated with each of two
folded RNA fragments (200 and 370 nucleotides (nt)), together spanning
the 5′ 440 nt of an HCV transcript (by overlapping 130 nt), which were
varied over a range of concentrations. The equilibrations were
performed in solution under conditions determined to preserve RNA
structure and to limit all RNA-DNA library oligonucleotide interactions
to 1:1 stoichiometry. Subsequent Escherichia coli RNase H
(endoribonuclease H: EC 3.1.26.4) cleavage analysis identified two
preferred sites of highest affinity heteroduplex hybridization. The
lengths and sequences of different substitute chemistry
oligonucleotides complementary to these sites were rationally optimized
using an iterative and quantitative analysis of binding affinity and
specificity. Thus, DNA oligonucleotides that hybridized with the same
affinity to the preferred sites in the folded RNA fragments found by
screening as to short (25 nt) RNA complements were identified but
were found to vary in length (10-18 nt) from site to site.
Phosphorothioate (P=S) and 2′-fluoro (2′-F) uniformly substituted
oligonucleotides also were found, which hybridized optimally to these
sites, supporting the design of short (10-15-nt) and maximally
specific oligonucleotides that are more nuclease-resistant (via
P=S) and have higher affinity (via 2′-F) than DNA. Finally, the
affinities of DNA and uniform 2′-F-, P=S-substituted 10-20-mer oligonucleotide complements for the best hybridization site, from HCV
nt 355 to nt 364-374, closely corresponded to antisense mechanism inhibition activities in an in vitro translation assay and
in a human cell-based HCV core protein expression assay, respectively. These results validate our strategy for the selection of
hybridization-optimized and biologically active antisense
oligonucleotides targeting HCV RNA and support the potential for
utility in further applications.
In order to ensure sequence-unique recognition of antisense and ribozyme oligonucleotide drug candidates with a chosen site on any given transcript or transcript precursor RNA, a necessary and sufficient requirement for a minimum of 15-17 complementary base pairs has been proposed (1, 2), and typically 20 are sought (3, 4). However, both the kinetics and the thermodynamics of antisense oligonucleotide hybridization may be profoundly attenuated when an energetic cost is incurred to disrupt secondary and tertiary structures of folded RNA that block the hybridization (5, 6, 7, 8). Further, the complexity of folded RNA suggests that there should be great variability of RNA structure-dependent hybridization parameters for oligonucleotides complementary to different positions along the primary sequence. In addition, examination of known RNA secondary structures reveals that identification of structurally ideal RNA hybridization sites for oligonucleotides as long as 20-mers1 is improbable. Therefore, it should be expected that the usual piecemeal sampling ("walks") at arbitrary intervals along linear RNA transcript sequences with a manageable set (typically 10-50) of complementary 20-mer oligonucleotides (3, 4) generally will identify a few, at best, that give sufficient net binding affinity, after the payment for RNA structure disruption, to support adequate biological inhibitory activity in cell-based screens. At effective concentrations, oligonucleotides chosen in this manner may then bind with near to, equivalent, or even higher affinity at alternative RNA sites on the same or other transcript(s) that are closely sequence-related, but more favorably structured, resulting in diminished site and transcript specificity (9, 10, 11, 12, 13).
In efforts to advance antisense oligonucleotide identification beyond the "linear thinking" of sequence complementarity, computer RNA folding algorithms as well as enzymatic and chemical reagent mapping approaches have been used to determine RNA secondary structure. These procedures do not predict tertiary interactions and may be unreliable (14, 15, 16), and the latter require use of multiple reagents or enzymes and can be tedious to perform. Even when single-stranded regions of RNA are accurately located, subsequent indirect predictions for the kinetics and thermodynamics of complementary oligonucleotide hybridization may be in error by many orders of magnitude (5) due to steric, topological, and tertiary structural constraints not evident in secondary structure determinations. Attempts to enhance the hybridization of antisense oligonucleotides to transcripts by rational design, including the use of tethered oligonucleotide probes (17, 18, 19), pseudo-half-knots (20, 21), symmetric secondary structures (22, 23), and stem bridging (24), while all enjoying some early stage success, require identification of particular RNA structural elements and are limited in scope when viewed against the potential structural diversity of available RNA targets in cells. Most recently, potentially powerful combinatorial or semicombinatorial strategies have been applied to the problem of oligonucleotide recognition of RNA structure. Although (partial) randomization of the RNA substrate recognition nucleotides of trans-cleaving ribozymes (25, 26) is promising, the methodology is not directly applicable to small antisense oligonucleotides. Probing hybridization to structured RNA with multiple length fragments of a given antisense nucleic acid generated by semicombinatorial solid phase synthesis (27) or by solution-based alkaline fragmentation of end-labeled RNA (28) represent innovative screening approaches. 
However, solid phase screening may bias the former; conserved end fragments of the latter do not mimic therapeutic antisense oligonucleotides; and both methods require unique synthesis of (the equivalent of) full-length antisense nucleic acid for a complete search of a given RNA metabolite. All methods currently are subject to restrictions on the length of target RNA that can be accommodated, which may or may not be easily surmounted to eventually provide more efficient searches of long transcript RNAs.
For these reasons, there clearly is a need for additional and better general methods to directly identify oligonucleotides that hybridize to folded RNA of undetermined structure with minimal energetic cost (29) of RNA structure disruption. Such oligonucleotides would maximize rates of hybridization (or "kinetic accessibility") and, hence, as previously shown, also would maximize affinity per unit length (5) and the effectiveness of in vivo antisense inhibition (28). Enhanced specificity also should be realized when the probability is greatly reduced of finding close sequence-related RNA sites with comparably favorable structure for oligonucleotide hybridization. Oligonucleotides with these properties most likely would be restricted in length to 10-15 nt, but they would have sufficient biological potency because of the fullest possible realization of hybridization affinity, which could be enhanced further with alternative chemical compositions. The potential problem of formation of internal oligonucleotide structure that would be further stabilized by high affinity chemical compositions, and which would lessen the net hybridization affinity with RNA (8), would also be minimized with shorter oligonucleotides. Other potential advantages of shorter oligonucleotides would be enhancement of primary sequence mismatch specificity (30, 31, 32) and product release after RNase-, ribozyme-, or artificial agent-mediated RNA cleavage (30), improved cellular uptake (33), reduced nonspecific protein binding (11), and less expensive synthesis and purification.
To address this existing need for alternative methods to identify
oligonucleotides optimized for hybridization to transcript RNA, we have
developed a strategy schematically depicted in Fig. 1. In an initial
solution-based combinatorial approach, we sequentially use short, fully
randomized (i.e. equimolar) sequence DNA oligonucleotide library hybridization affinity screening and Escherichia
coli RNase H (endoribonuclease H: EC 3.1.26.4) cleavage analysis (Fig. 1A) to identify energetically preferred hybridization
sites on folded RNA. We then follow up by quantitative rational
optimization of affinity and specificity of hybridization of individual
oligonucleotides to these sites (Fig. 1B). Here we report
the application of this approach to the problem of antisense
recognition of the 5′-noncoding region (NCR) and initial coding region
of a hepatitis C virus (HCV) transcript. HCV is the major cause of
post-transfusion acute hepatitis, which may often progress to chronic
hepatitis, cirrhosis, and hepatocellular carcinoma (34), and
therapeutic options for treatment of patients remain limited and
largely ineffective (35). The 5′-NCR is attractive for antisense
targeting because it is relatively long (~340 nt) and has highly
conserved primary sequence (36, 37) and secondary structure (38) among
HCV isolates worldwide.
Plasmid
pGEM42-NCE 12 contains an HCV genomic
insert (nt 1-1357) obtained from HCV type II isolated from sera of a
chronically infected Japanese patient. Sense oligonucleotide primer for
the 370-nt fragment contained the 17-mer T7 RNA polymerase promoter followed by nt 1-20 of the HCV insert, and antisense oligonucleotide primer was complementary to nt 333-370. Sense oligonucleotide primer
for the 200 nt fragment contained the T7 promoter followed by nt
240-260 of HCV, and antisense oligonucleotide primer was complementary
to nt 403-440. Transcript fragments were prepared from polymerase
chain reaction-synthesized duplex DNA fragments of pGEM42-NCE 1 using
the Ambion MegaScript kit according to instructions and were extracted
with phenol-CHCl3, precipitated with ethanol, and then
purified using Boehringer Mannheim G-50 Quick Spin columns as
instructed. RNA transcripts were 5′-end-labeled with 32P,
after dephosphorylation with calf intestinal alkaline phosphatase (Boehringer Mannheim), using [γ-32P]ATP (ICN
Biochemicals) and T4 polynucleotide kinase (Promega), and were
3′-end-labeled with 32P using
[32P]phosphocytosine phosphate (ICN Biochemicals) and T4
RNA ligase (Boehringer Mannheim) (5). Labeled transcripts were purified by 8%
denaturing PAGE, extracted with phenol-CHCl3, and
precipitated with ethanol; specific activities were ~5000
cpm/fmol.
RNases ONE
(Promega), T1 and CL3 (Boehringer Mannheim), A (Life Technologies,
Inc.), and V1 (Pharmacia Biotech Inc.) were variously used in 10 µl
containing hybridization buffer (1 mM Tris-HCl, pH 7.4, 50 mM NaCl, 5 mM MgCl2), 3 mg of tRNA,
and 10^4 cpm of 5′-32P- or
3′-32P-end-labeled RNA transcript fragment in the presence
(for footprinting) or absence (for mapping) of prehybridized individual
oligonucleotides for 5 min at 37 °C. Reactions were quenched with 5 µl of 9 M urea, and products were resolved on 10%
denaturing PAGE. Enzymatic activities were adjusted to limit the extent
of digestion to 10%. Gels were quantitatively imaged using a Molecular
Dynamics PhosphorImager, and digitized values corresponding to all
sequence positions where enzymatic cleavage was observed were used in
all analyses.
DNA
and 2′-deoxy, P=S oligonucleotides were synthesized by standard
phosphoramidite chemistry on an ABI synthesizer (model 380B) and
purified as described (3, 4), and 2′-F, P=S oligonucleotides were
synthesized and purified similarly, as described previously (39).
Oligoribonucleotides were synthesized using an ABI synthesizer (380B)
and were purified as described previously (5). Briefly, 5′-dimethoxytrityl 2′-tert-butyldimethylsilyl nucleoside
3′-O-phosphoramidites with phenoxyacetyl protecting groups
on the exocyclic amines of A, C, and G were used. The wait step after
pulse delivery of tetrazole was 900 s. Base deprotection was
achieved by overnight incubation at room temperature in methanolic
ammonia, and the 2′-silyl group was removed at room temperature in 1 M tetrabutylammonium fluoride in tetrahydrofuran. RNA
oligonucleotides were purified using a C18 Sep-Pak cartridge followed
by ethanol precipitation.
Randomized DNA oligonucleotide libraries were made (40) on an ABI
synthesizer (model 394) using experimentally determined adjusted
proportions of phosphoramidites of each of the four nucleotide bases
(assayed by ratio of incorporation into all possible dimers) such that,
when mixed into a single vial, equimolar incorporation of all four
bases at each sequence position was reproducibly obtained, thus
ensuring equimolar representation of all possible sequence oligonucleotides (calculated as 4^10 = 1,048,576 sequences).
Briefly, phosphoramidites were mixed in a single vial on the fifth port
of the ABI 394 synthesizer, and the coupling wait step was increased to
5 min. The ratio of phosphoramidites in the mixture was tested by
making a single coupling to dT-CPG, cleaving and deprotecting the
product, and analyzing the crude dinucleotide material on RP-HPLC.
Proportions of the individual phosphoramidites were adjusted
accordingly, and the procedure was repeated until equal amounts of the
four dimers were obtained. The 3′-position was sequence-randomized by
mixing the four base CPGs, removing the 5′-dimethoxytrityl on the
synthesizer, cleaving and deprotecting the product, and analyzing by
RP-HPLC. The proportion of each CPG was adjusted until equal amounts of
each base were obtained. Each synthesis used 1 µM of the
mixed base CPG. After cleavage and deprotection with ammonium hydroxide
at 55 °C for 16 h the dimethoxytrityl-off oligonucleotide
libraries were purified by RP-HPLC. The libraries were analyzed by
denaturing PAGE and diluted to a final library concentration of 10 µM. Limit hydrolyses by snake venom phosphodiesterase I
(U.S. Biochemical Corp.) and examination of products by UV absorption on RP-HPLC confirmed the equimolar representation of bases. A few
sequences may be underrepresented, without affecting this result, due
to relatively poor aqueous solubility (i.e. of long strings
of G nucleotides).
Prior to mixing for hybridization, the diluted transcript fragment and DNA library were each heated independently to 90 °C for 1 min and cooled slowly to 37 °C. Hybridizations were done over 20 h at 37 °C in 30 µl containing hybridization buffer, 10^4 cpm of 32P-end-labeled transcript (1 nM to 10 µM total RNA), and 1 mM DNA library (calculated concentration, 100 pM individual sequences). Some hybridizations proceeded for up to 40 h with no change in the results (not shown), except for enhanced nonspecific background degradation of target RNA in some cases. Dithiothreitol was added to 1 mM, and then (E. coli) RNase H (U.S. Biochemical Corp.) was added in an amount varied (0.001-0.1 units) for each [RNA] to obtain the best signal-to-noise ratio, and reactions were allowed to proceed for 1-10 min at 37 °C. Reactions were quenched, and products were analyzed as for enzymatic structure mapping and footprinting.
Gel Mobility Shift Assay. For determinations of
Ka values, short RNA oligonucleotides (25-mers)
containing the preferred HCV transcript binding site sequences
identified by combinatorial hybridization affinity screening, 23-48
and 353-378, were made by automated synthesis. Hybridizations were
done in 20 µl containing hybridization buffer, 1000 cpm of
5′-32P-labeled RNA, and antisense oligonucleotide ranging
from 10 pM to 10 µM. Mixes were heated at
90 °C for 5 min, cooled slowly to 37 °C, and incubated for
20 h at 37 °C. 10 µl of loading buffer (15% Ficoll, 0.25%
bromphenol blue, 0.25% xylene cyanol) was added, and mixes were
resolved at 10 °C on native 20% PAGE using 44 mM Tris-borate and 1 mM MgCl2 running buffer for
~4 h at 122 W. Gels were quantitatively imaged using a Molecular
Dynamics PhosphorImager. The log linear range of the assay was
determined (not shown) to cover from 50 pM to 10 µM in apparent Kd
for antisense oligonucleotides.
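As an illustration of how an apparent Kd might be read from such a titration, the following is a minimal sketch, not the authors' analysis. It assumes simple 1:1 binding with the labeled RNA present in trace amounts, so that the fraction of RNA shifted depends only on the oligonucleotide concentration. The function names and the titration values are hypothetical; the data are simulated from an assumed Kd of 10 nM rather than measured.

```python
import math

def fraction_bound(oligo_conc_M, kd_M):
    """1:1 binding with trace labeled RNA: theta = [oligo] / (Kd + [oligo])."""
    return oligo_conc_M / (kd_M + oligo_conc_M)

def estimate_kd(concs_M, fractions):
    """Estimate apparent Kd as the concentration giving 50% shifted RNA,
    by linear interpolation on a log10 concentration axis."""
    pairs = sorted(zip(concs_M, fractions))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(pairs, pairs[1:]):
        if f_lo <= 0.5 <= f_hi and f_hi > f_lo:
            t = (0.5 - f_lo) / (f_hi - f_lo)
            log_kd = math.log10(c_lo) + t * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_kd
    raise ValueError("titration does not bracket 50% bound")

# Hypothetical titration spanning 10 pM to 10 uM (the range used in the assay);
# fractions are simulated from an assumed Kd of 10 nM, not experimental values.
titration_M = [1e-11, 1e-10, 1e-9, 1e-8, 1e-7, 1e-6, 1e-5]
simulated_fractions = [fraction_bound(c, 1e-8) for c in titration_M]
kd_estimate_M = estimate_kd(titration_M, simulated_fractions)
```

In practice the fractions would come from the quantified shifted and unshifted bands, and a nonlinear fit to the full isotherm would generally be preferable to interpolation at the midpoint.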
HCV transcript was prepared as
above except that it encompassed the first 5′ 1.4-kilobase pairs.
Heterologous control truncated intercellular adhesion molecule 1 (tr-ICAM-1) transcript was synthesized similarly (3). In
vitro translation (IVT) reactions contained (in 15 µl) 300 ng of
HCV RNA (44 nM final concentration), 100 ng of tr-ICAM-1
RNA (30 nM final concentration), 5 µl rabbit reticulocyte lysate (Promega), 8.8 µCi of [35S]methionine (1175 Ci/mmol), 13 µM IVT amino acid mix minus methionine (Promega), 8 units of RNasin, and oligonucleotides (30 nM
to 1 µM). Before mixing, RNAs were heated at 65 °C for
5 min and then at 37 °C for 15 min. Reaction mixes were incubated at
37 °C for 1 h and then quenched with an equal volume of Laemmli
gel loading buffer, boiled for 8 min, and placed on ice. After
microcentrifuging for 8 min, samples were electrophoresed on a 14%
polyacrylamide gel (Novex) for 2 h at 125 V. Gels were fixed (10%
propanol, 5% acetic acid, 3% glycerol, 20 min), vacuum-dried, and
subjected to PhosphorImager analysis.
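The molar concentrations quoted for the IVT reactions can be checked against the RNA masses with a short calculation. The sketch below assumes a transcript length of 1400 nt for the HCV RNA and uses a common approximation for the molecular weight of single-stranded RNA (320.5 g/mol per nucleotide plus 159 g/mol for the 5′ triphosphate); both the function name and these constants are assumptions for illustration and are not taken from the text.

```python
def rna_concentration_nM(mass_ng, length_nt, volume_ul,
                         avg_nt_mass=320.5, end_correction=159.0):
    """Convert an RNA mass in a given volume to a molar concentration (nM).

    avg_nt_mass and end_correction are a widely used approximation for
    single-stranded RNA, assumed here for illustration.
    """
    molecular_weight = length_nt * avg_nt_mass + end_correction  # g/mol
    moles = mass_ng * 1e-9 / molecular_weight                    # mol
    molar = moles / (volume_ul * 1e-6)                           # mol/L
    return molar * 1e9                                           # nM

# 300 ng of an assumed ~1400-nt HCV transcript in a 15 µl reaction
hcv_nM = rna_concentration_nM(mass_ng=300, length_nt=1400, volume_ul=15)
```

Under these assumptions the result is about 45 nM, in line with the 44 nM stated for the HCV RNA.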
Simian virus
40 large tumor antigen-immortalized human hepatocytes H8Ad-17c (41)
were selected for expression of HCV sequences after calcium
phosphate-mediated transfection with a neomycin resistance expression
vector containing HCV type II sequences (nt 1-1357, including the
5′-NCR, core protein, and the majority of the envelope protein E1
genes) fused to the human cytomegalovirus immediate early promoter
(35). HCV sequence-expressing H8Ad-17c cells were seeded into six-well
dishes at a density of 5 × 10^5 cells/well, rinsed
once with Optimem (Life Technologies, Inc.), treated with
oligonucleotides for 4 h in the presence of 5 µg/ml N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium
chloride, and then rinsed once and refed with growth medium. Cells were lysed in 1 × Laemmli sample buffer 18-20 h after treatment with oligonucleotide and then boiled, and cell debris was removed by centrifugation. Proteins were separated on 16% SDS-PAGE and
transferred to polyvinylidene difluoride membrane. Western blots were blocked
in phosphate-buffered saline containing 2% normal goat serum and 0.3%
Tween 20, were probed with a polyclonal antibody derived from serum
taken from an HCV type II-positive patient and a monoclonal antibody to
glyceraldehyde-3-phosphate dehydrogenase (GAPDH) (Advanced Immunochemical), rinsed, and incubated with
125I-radiolabeled goat anti-human antibody, and
immunoreactive proteins were subjected to PhosphorImager analysis. HCV
RNA levels were not measured, because they previously have been shown
not to be reduced by antisense oligonucleotides uniformly modified with 2′-substitutions that do not support RNase H activity (35).
Strategy for Combinatorial Screening for Preferred Hybridization Sites
Library Design. We used short (10-mer) DNA libraries in order to restrict the library complexity (calculated as 4^10 = 1,048,576 sequences) so that all possible sequences would be synthesized with equimolar base representation at each sequence position and in equivalent and sufficient amounts (40). The use of short oligonucleotide libraries also avoids negatively biasing selections for suboptimal recognition with long oligonucleotides, as was discussed earlier (9, 10, 11, 12, 13), keeps the heteroduplex affinity at the selected preferred sites within the operational range of measurement of readily accessible assays in order to facilitate subsequent quantitative optimization, and attenuates the stability of intramolecular interactions (i.e. stem-loops) of library sequences. Short oligonucleotides with very stable internal structure will diminish the affinity of hybridization with RNA (7, 8) and will not be selected by affinity-based screening. Unlike libraries of other nucleic acid screening strategies, the libraries used in this strategy have no internal fixed sequence positions (42) or external conserved flanking sequences (43, 44, 45).3
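The growth of library complexity with oligonucleotide length can be made concrete with a short calculation. This is a minimal sketch; the function name is hypothetical, and the 15- and 20-mer lengths are included only for comparison with the 10-mer library used here.

```python
def library_complexity(length_nt, alphabet_size=4):
    """Number of distinct sequences in a fully randomized library."""
    return alphabet_size ** length_nt

# Complexity for the 10-mer library and, for comparison, longer libraries
complexities = {n: library_complexity(n) for n in (10, 15, 20)}
for n, count in complexities.items():
    print(f"{n}-mer library: {count:,} sequences")
```

A fully randomized 20-mer library would contain roughly a million times more sequences than the 10-mer library, which illustrates why equimolar and sufficient representation of every sequence is only practical for short libraries.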
Combinatorial Hybridization Affinity Screening. Combinatorial affinity screening (Fig. 1A) was performed at hybridization equilibrium, in solution, under quasiphysiologic conditions (i.e. pH, temperature, salt species, and concentration). Screening on solid supports or employing physical separation was avoided because of the potential for the introduction of bias and the difficulties inherent in satisfying the following requisite conditions.
The concentration of DNA library must be held (typically at 1 mM) such that the calculated concentration (100 pM for a 1 mM 10-mer library) of individual
library sequences ([X10] in Fig. 1) is
limiting and significantly less (by 100-fold) than the Kd (~10 nM for a DNA 10-mer mixed
sequence, as determined by the gel mobility shift assay) for the
highest affinity heteroduplex with the RNA. Heteroduplex formation is
then controlled solely by the concentration of RNA (5), which is
titrated around this Kd value in order to
preferentially select for the tightest interactions. These will reflect
sequence-dependent variation in heteroduplex stability to
some degree, but mostly they will reflect the absence of unfavorable
structure both at select local sites of RNA and of the complementary
oligonucleotides. Under conditions where the concentration of
individual library sequences is much lower than the Kd for
heteroduplex formation, intermolecular interactions among complementary
library sequences are strongly disfavored. These conditions also allow
using a [RNA] that is still low enough (even up to 10 µM) that physical aggregation is mitigated (at least for
RNA < 400 nt). Most importantly, the only statistically significant population of bound RNA-DNA complexes has 1:1
stoichiometry, so all RNA-DNA interactions behave as if measured
independently (i.e. are "unlinked") and there is no
"melting out" of individual RNA molecules. By these
means,4 unbiased affinity selection in a
single round by massive parallel co-processing of all unlinked
possibilities for heteroduplex hybridization is achieved.
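The statement that heteroduplex formation is controlled solely by the RNA concentration when each library sequence is limiting can be illustrated numerically. The sketch below treats a single oligonucleotide binding a single RNA site with 1:1 stoichiometry, using the values quoted above (100 pM per sequence, Kd of about 10 nM). It compares the simple approximation, fraction bound = [RNA] / (Kd + [RNA]), with the exact solution of the binding equilibrium. The function names are hypothetical, and the single-site model is an assumption made for illustration.

```python
import math

def fraction_bound_approx(rna_M, kd_M):
    """Fraction of a limiting oligonucleotide bound when [oligo] << Kd."""
    return rna_M / (kd_M + rna_M)

def fraction_bound_exact(rna_total_M, oligo_total_M, kd_M):
    """Exact 1:1 equilibrium: solve the quadratic for the complex concentration."""
    b = rna_total_M + oligo_total_M + kd_M
    complex_M = (b - math.sqrt(b * b - 4.0 * rna_total_M * oligo_total_M)) / 2.0
    return complex_M / oligo_total_M

KD_M = 10e-9       # ~10 nM, as quoted for a mixed-sequence DNA 10-mer
OLIGO_M = 100e-12  # calculated concentration of each library sequence

# Titrate RNA around the Kd, as in the screening experiments (1 nM to 10 µM)
for rna_M in (1e-9, 1e-8, 1e-7, 1e-6, 1e-5):
    approx = fraction_bound_approx(rna_M, KD_M)
    exact = fraction_bound_exact(rna_M, OLIGO_M, KD_M)
    print(f"[RNA] = {rna_M:.0e} M  approx = {approx:.4f}  exact = {exact:.4f}")
```

With the oligonucleotide at 1% of the Kd, the two expressions agree closely across the titration, so the bound fraction is effectively a function of the RNA concentration alone.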
We currently use an RNase H cleavage assay to enable positional identification of affinity-selected heteroduplex regions on the end-labeled RNA via RNA cleavage product size separation on high resolution denaturing PAGE; from the sequences of the RNA at cleavage sites, the complementary sequences of bound oligonucleotides are inferred. The cleavage assays are performed sequentially to the combinatorial affinity hybridizations in the same solutions. It should be noted that, for reasons not well understood, RNase H cleavage at all nt positions of heteroduplexed DNA oligonucleotides with folded RNA is not (often) observed. This has two implications. First, cleavage of even one nt position is a secondary event dependent on the primary event of preferred hybridization of at least one DNA library oligonucleotide at sequences encompassing the cleavage site(s). Second, it is not possible to predict the exact length and sequence of preferentially hybridizing DNA oligonucleotides from affinity cleavage patterns. This requires further optimization (see below).
The RNase H cleavage method of analysis of DNA oligonucleotide
hybridization with folded RNA has high sensitivity (10
pM) for heteroduplex in the presence of much larger amounts
of unbound RNA and library sequences. It also detects multiple
hybridization sites simultaneously, and it can identify preferred
hybridization sites that are longer than the (10-mer) library
oligonucleotides, since affinity cleavage patterns on gels are the sum
of all patterns for individual 1:1 complexes, including those that are
sequence-overlapping. This summation can contribute to, at most, a
severalfold enhanced cleavage product yield at sites allowing
hybridization of oligonucleotides longer than those of the library.
However, at the best sites we observe (i.e. Fig.
2C and results not shown)4
considerable amplification (100-fold at sites with high affinity and
rates of hybridization) of cleavage products over the maximum yield
calculated from the limiting concentration of individual library
sequences that can be present in 1:1 complexes at equilibrium without
amplification. Thus, we posit that there is, in addition, multiple
turnover of oligonucleotides during RNase H digestion at favorable
sites. This speculation remains unconfirmed and is a subject of further
study, but the end result is considerably improved assay sensitivity
over the calculated limit values.
Dependence on (E. coli) RNase H cleavage creates the possibility that some RNA sites acceptable for oligonucleotide hybridization may be overlooked because they are sterically inaccessible to RNase H or are not contiguous in the primary sequence (i.e. are stem-bridging). However, one advantage may be that those sites that are identified may be more predictive for recruitment of intracellular RNase H (albeit mammalian in therapeutic applications), which often is required for robust antisense biological inhibitory activity (3, 4, 46). Accurate assessment of the ability of this biochemical strategy to predict biological efficacy will require thorough testing as it may be adversely affected by any number of elements of the cellular milieu, including interference by RNA binding proteins (see below), transcript coding sequence scanning by ribosomes, and unfavorable differential subcellular localization of target RNA metabolite, antisense oligonucleotide, and RNase H. The favorable results for the initial investigation targeting HCV RNA are reported below and suggest that additional studies are warranted.
Application of Combinatorial Hybridization Affinity
Screening to 5′ HCV RNA
We started with a 5′ 370-nt HCV RNA fragment (Fig. 2A),
because it encompasses the entire 5′-NCR, translation start codon, and
30 nt of the coding region and the secondary structure previously has
been determined by mapping with multiple single strand- and double-strand-specific endoribonucleases (38). It is large enough to
expect a folded structure, at least upstream of the artificial 3′-end
(see below), closely approximating that in the full-length transcript
under the same conditions. Application of combinatorial hybridization
affinity screening analysis gave results summarized in Fig. 2,
A-C. Acceptable quality single nt resolution of RNA cleavage products on standard denaturing PAGE typically is limited to
~100 nt from the end label, with lower resolution useful to at least
another 50 nt. Within these limits, using the 5′-end-labeled transcript
fragment, we detected several lesser quality affinity cleavage sites,
characterized by only a few low intensity bands on PAGE, and one
modestly preferred site starting at about nt 29 (the "29-site")
with a larger number of higher intensity cleavage products on PAGE. The
29-site aligns well with results of our own (Fig. 2A) and
the previously reported independent enzymatic structure-mapping results
(38) showing a large internal loop structure as the only substantially
less structured domain within the first 5′ 125 nt. This site is also
where the greatest spontaneous background hydrolytic cleavage in mock
experimental controls is observed (Fig. 2C), indicating
single-stranded local structure, as expected. The 370-nt fragment also
was 3′-end-labeled, and the combinatorial screening and analysis was
repeated. Again, only several lesser preferred affinity cleavage sites
were found (at nt 267-269, 302-304, and 351-352), having a few low
intensity cleavage products on PAGE, consistent with the high degree of enzyme-mapped secondary structure (38) and recently determined higher
order structure (47) in this region (Fig. 2A). The cleavage sites from nt 267-269 are consistent with a binding site(s) for a
10-mer oligonucleotide(s) at least overlapping a site from nt 264-282
for which a phosphorothioate antisense oligonucleotide gave
demonstrable inhibition in both in vitro translation and cell culture experiments (48).
The short affinity cleavage pattern (covering ~10 nt) at the 29-site
suggested the possibility of tight restrictions on the length of
optimally binding complementary oligonucleotides. This site also is
very near the 5′-end of the transcript, and inhibition of viral
translation by antisense complementary to it would be expected to be
incomplete due to the initiation of translation at downstream sequences
comprising the internal ribosome entry site (IRES) (36, 49). Although a
cap structure may be present at the 5′-end of HCV RNA expressed in
H8Ad-17c cells, extensive secondary structure within the 5′-NCR and the
presence of multiple AUG codons upstream of the normal initiator AUG
should interfere with ribosome scanning and result in retention of an
internal ribosome entry (35). Therefore, we wanted to see if there was a larger preferential hybridization site in the initial coding region
downstream of the AUG translation start codon at nt 342. The 370-nt
fragment is not long enough to confidently analyze the initial coding
region for hybridization sites, because of potential "end effects"
at the artificial 3′-end (see below). Therefore, we made a 200-nt
fragment centered on this AUG. Radioisotopic labeling of each end, in
turn, allowed for complete combinatorial hybridization affinity
screening with RNase H cleavage analysis to single nt resolution. The
affinity cleavage results for the central portion of this fragment
(Fig. 2A), removed from artificial 5′- and 3′-ends (as
below), indicated a good candidate hybridization site at ~340-370
nt. It was characterized by a long sequence stretch (covering ~28 nt)
of high density and intensity of affinity cleavage products seen on
denaturing polyacrylamide sequencing gels.
Overlap of RNA Fragments: End Effects, Structure Mapping, and Affinity Cleavage
RNA fragments are shorter than full-length transcripts and have
5′, 3′, or both termini that correspond to sequence positions internal
to the transcripts; thus, such termini are artificial ends. Since they
are lacking contiguous 5′ and/or 3′ sequences, the local structure in
the vicinity of such unnaturally positioned ends may be different than
for the corresponding sequences in full-length transcripts. These
differences, as detected by enzymatic mapping and/or oligonucleotide
hybridization affinity, may be termed end effects. In this study, the
370-nt fragment has a 3′ artificial end, and the 200-nt fragment has
both 5′ and 3′ artificial ends; but we have designed substantial
sequence overlap (130 nt) of the 3′-end of the 370-nt fragment with the
200-nt fragment and of the 5′-end of the 200-nt fragment with the
370-nt fragment. For both RNase H affinity cleavage of hybridized
oligonucleotides and consensus enzymatic structure mapping, there was
good correspondence between the 370- and 200-nt fragments for much of
the 130-nt overlap region (Fig. 2B). Differences that may
best be ascribed to local end effects were evident for the 200-nt
fragment from nt 240 to 251 and for the 370-nt fragment from nt 350
to 370. The 315-325-nt region was a clear outlier, which probably is
most consistent with a large steric effect and/or long range structure
interaction occluding access to mapping enzymes and RNase H for the
370-nt fragment but not for the 200-nt fragment. This interpretation is
consistent with the recent identification of a large RNA pseudoknot structure, involving upstream (nt 126-134) and downstream (nt 305-311, 315-323, and 325-331) sequences of the 5′-NCR, both present only in the 370-nt fragment (47). Finally, although most affinity cleavage sites aligned roughly with single-stranded domains determined by enzymatic structure mapping for either fragment, the latter was not
always usefully predictive of the former. For example, at the most
interesting candidate hybridization site at nt ~340-370 on the
200-nt fragment, the majority of nt positions of affinity cleavage
align poorly, if at all, with consensus single-stranded sequences. A
similar situation holds for the less dramatic affinity cleavage site at
~nt 295-301, and there is no affinity cleavage seen at
(predominantly) single-stranded nt ~302-315. These results are
consistent with a previous study of an RNA stem-loop fragment from
mutant Ha-ras mRNA (5) for which consensus enzymatically mapped single-stranded positions of the loop were as unfavorable for
complementary oligonucleotide hybridization as was the double-stranded stem. Together, these observations suggest that (i) some
single-stranded regions that are favorable for oligonucleotide
hybridization are missed by (larger) mapping enzymes, presumably due to
steric occlusion and, more importantly, (ii) only a subset of RNA
single-stranded stretches long enough for favorable antisense
oligonucleotide hybridization will actually support it. Therefore, the
more efficient and accurate approach for identification of this subset
is combinatorial hybridization affinity cleavage using only RNase H
instead of the far more tedious and less informative determination of
secondary structure by mapping using multiple enzymes.
Characterization of Combinatorial Screening-identified Hybridization Sites and Oligonucleotide Optimization
Strategy
We followed the quantitatively rigorous approach in
Fig. 1B in order to characterize the hybridization and to
rationally optimize the sequence, length, and chemical composition of
oligonucleotides complementary to identified sites. An iterative
testing cycle was used to minimize the total number of oligonucleotides
synthesized and analyzed. The rational optimization goal was design of
oligonucleotides with simultaneous realization of two attributes. (i)
The most obvious one is high affinity (Ka) for the
preferred sites on folded RNA (but not so high as to lose sequence
mismatch specificity) (31, 32), since affinity generally is thought to
correlate with biological antisense potency (31, 32, 50, 51). (ii) Of
equal importance for full realization of the antisense therapeutics promise is to achieve the highest possible transcript hybridization specificity. An accessible indicator of this is the site specificity, Krel, which we define for any candidate
oligonucleotide as the ratio of the value of Ka to
the value of the intrinsic potential affinity, Ka°,
for hybridization to sequence complementary, short, and unstructured
RNA external standards (Fig. 1B); ideally,
Krel = Ka/Ka° approaches 1.0. In other words,
any oligonucleotide that binds with sequence complementarity to an
identified, structurally preferred transcript site as well as it can to
the same excerpted RNA sequence without structure is unlikely to bind
to any other sequence-related (or even identical sequence) transcript
site nearly as well and so will be maximally specific.
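The ratio defined above can be illustrated with a minimal sketch. The function name, the oligonucleotide labels, and the affinity values below are hypothetical and are not taken from this study; the intrinsic affinity for the short unstructured RNA standard is passed as `ka_standard`.

```python
def site_specificity(ka_folded: float, ka_standard: float) -> float:
    """Return Krel, the ratio of the affinity for the folded RNA site
    to the intrinsic affinity for the short, unstructured RNA standard."""
    if ka_folded <= 0 or ka_standard <= 0:
        raise ValueError("association constants must be positive")
    return ka_folded / ka_standard


# Hypothetical association constants in M^-1 (illustration only).
candidates = {
    "oligo_A": (1.0e8, 1.0e8),  # binds the folded site as well as the standard
    "oligo_B": (2.0e8, 2.0e9),  # higher Ka, but structure costs a factor of 10
}

for name, (ka, ka_std) in candidates.items():
    print(name, "Krel =", site_specificity(ka, ka_std))
```

In this illustration the second candidate has the higher absolute affinity but the lower site specificity, which is the trade-off the optimization goal is meant to expose.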
In practice (Fig. 1B), Ka is determined
by titrating individual candidate oligonucleotides against folded RNA
fragments and assaying by enzymatic footprinting. Values for the
intrinsic affinity, Ka°, are similarly determined except that short
sequences of RNA (≤25 nt) are excerpted from the preferred
hybridization sites on the longer fragments, and a gel mobility shift
assay is used. Although this cannot claim to give "ideal" intrinsic
binding affinities, it is reasonable to expect that there cannot be a
very large energetic cost of disruption of structure in either the
preferred target RNA site or in the complementary DNA oligonucleotides,
or else they would have been screened against (8). Since, in addition, the control RNA oligonucleotides used to determine
intrinsic Ka° values by gel mobility shift are much shorter
than the parent RNA fragments used in combinatorial screening,
potential inhibition of hybridization by higher order structure
involving sequences flanking, or some distance removed from, the
hybridization sites should be minimized. Finally, in preliminary assay
calibration experiments,4 affinity constants for
hybridization of 10-mer DNA oligonucleotides complementary to a 47-mer
mutant Ha-ras mRNA stem-loop determined by either the
enzymatic footprinting or gel shift methods were in agreement within a
2-3-fold variance. Under our assay conditions, neither method was very
sensitive to oligonucleotide sequence for mixed sequence
oligonucleotides of a given length and backbone chemistry (Fig.
3).
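How an association constant might be extracted from such a titration can be sketched as follows. This is not the fitting procedure used in this study, which is not described here; it assumes a simple 1:1 binding isotherm with the oligonucleotide in large excess over RNA, so that free and total oligonucleotide concentrations are taken as equal. The function names and the simulated data are hypothetical.

```python
def fraction_bound(ka: float, oligo_conc: float) -> float:
    """Fraction of RNA bound for 1:1 binding: Ka*[L] / (1 + Ka*[L])."""
    return ka * oligo_conc / (1.0 + ka * oligo_conc)


def estimate_ka(oligo_concs, fractions):
    """Estimate Ka (M^-1) from titration points.

    Uses the linearized isotherm f/(1-f) = Ka*[L] and a least-squares
    line through the origin. Points with f <= 0 or f >= 1 are skipped
    because the transform is undefined there.
    """
    num = 0.0
    den = 0.0
    for conc, f in zip(oligo_concs, fractions):
        if not 0.0 < f < 1.0:
            continue
        y = f / (1.0 - f)
        num += conc * y
        den += conc * conc
    if den == 0.0:
        raise ValueError("no usable titration points")
    return num / den


# Simulated, noise-free titration with an assumed Ka of 1e8 M^-1.
concs = [1e-9, 1e-8, 1e-7, 1e-6]
observed = [fraction_bound(1.0e8, c) for c in concs]
print("estimated Ka (M^-1):", estimate_ka(concs, observed))
```

The linearized fit weights high-concentration points heavily; with real, noisy footprinting or gel shift data a nonlinear fit of the isotherm would usually be preferable.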
DNA Oligonucleotides
The results of applying this rational
approach for DNA oligonucleotides complementary to the combinatorial
screen-identified HCV transcript fragment sites starting at nt 29 and
340 are shown in Fig. 2D and Fig. 3, A-D. The
highest observed affinity binding (Ka) of 10-mers
was restricted to one (or perhaps two) oligonucleotide(s) for the
29-site (Fig. 3A), consistent with expectations from the
relatively short affinity cleavage pattern, and to at least two
oligonucleotides (separated by 10 nt) for the site from nt 355-380
(Fig. 3C), also consistent with the longer affinity cleavage
pattern. Importantly, experimentally apparent optimal binding of at
least one 10-mer DNA oligonucleotide was found for the 29-site:
Ka and the intrinsic affinity Ka° both approximate the 10 nM limit value (under the conditions used) for a 10-mer mixed sequence heteroduplex, and therefore
Ka/Ka° is approximately 1.0. Since the site
from nt 340-380 is so much longer, Ka° values were
not initially obtained for all 10-mers, but Ka
values for some of these oligonucleotides were of similar magnitude to that for the best one for the 29-site. Further, there was no evidence by either single-stranded RNA footprinting or RNase H affinity cleavage
analysis for binding of any of these individual 10-mer oligonucleotides
outside of the combinatorial screening-identified preferred sites (data
not shown).4 These results validate the combinatorial
hybridization affinity-screening protocol on the binding level and also
lend support to the existence of "ideal" hybridization sites in
folded transcript RNA, at least for 10-mer oligonucleotides.
The binding of DNA 20-mers was compared with that of DNA 10-mers at the 29-site (Fig. 3B). Only a small increase in Ka is maximally obtained for the best 20-mer at the expense of a significant reduction in Krel. Other 20-mers actually bind with lower or no better affinity than the best 10-mer, although the best 10-mer sequence is embedded in all 20-mers tested. These results underscore the strong contribution of RNA structure to oligonucleotide hybridization and the importance of matching oligonucleotide length to RNA target site structure. The conclusion that "more (oligonucleotide length) is not always better" has now been shown for both sequence mismatch specificity (31) and optimal structural recognition-dependent affinity and specificity (this study).
The most preferred subsite for hybridization within the HCV transcript
nt 340-380 region identified by affinity cleavage (Fig. 2A)
is from nt 355-380 (the "355-site") (Fig. 3C), and
hybridization appears to be more favorable than at the 29-site. All
10-mer DNA oligonucleotides starting from nt 355-370 hybridize with
Ka within ~100-fold of the highest affinity 10-mer
for the 29-site (Fig. 3C). Thus, this site is relatively
long (~25 nt) and appears structurally favorable for hybridization of
oligonucleotides somewhat longer than 10-mers in order to realize
greater affinity without undue loss of specificity. This conjecture was
confirmed with a series of DNA oligonucleotides increasing in length
from 10 to 20 nt, all starting at the 355-nt sequence position (Fig.
3D). It was found that Ka values are
proportional to length up to the 18-mer. Surprisingly,
Krel values actually improve with length, such
that the Krel value for the 18-mer is nearly
ideal. It is not clear why this happens. One possibility might be that the already favorably preorganized site in folded transcript RNA (the
latter point inferred from the 10-mer DNA data) requires only a
modestly more costly reorganization with increasing length of DNA
oligonucleotides, resulting in minimal lost energy from the net
biochemical energetic gain (ΔGt) upon
hybridization. In contrast, it might be expected that the net energy
gain (ΔGc) upon hybridization of longer DNA
oligonucleotides to control sequences would be diminished by an
increasingly more costly de novo organization into longer
helices of more nucleotides of a short random coil RNA complement. This
would then result in a progressively larger value for
ΔΔG for DNA oligonucleotides of increasing length, where
ΔΔG = ΔGt − ΔGc. For whatever reason, a similar increase in
Krel with increasing oligonucleotide length is
also seen for the uniform P=S substitution (Fig. 3E).
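The free-energy argument above can be tied to Krel with a short worked relation. This is a sketch under stated assumptions: the text does not give a sign convention, so the "net energy gain" is read here as a positive quantity, RT ln K, and the symbol K_a^∘ stands for the intrinsic affinity for the short unstructured RNA standard. The temperature of 310.15 K and the value K_rel = 0.1 are illustrative choices, not values from this study.

```latex
\begin{align}
\Delta G_t &= RT \ln K_a, \qquad \Delta G_c = RT \ln K_a^{\circ} \\
\Delta\Delta G &= \Delta G_t - \Delta G_c
  = RT \ln \frac{K_a}{K_a^{\circ}} = RT \ln K_{rel} \\
\text{e.g. } K_{rel} = 0.1,\ T = 310.15\ \mathrm{K}: \quad
\Delta\Delta G &\approx (0.616\ \mathrm{kcal/mol}) \times \ln(0.1)
  \approx -1.42\ \mathrm{kcal/mol}
\end{align}
```

Under this reading, ΔΔG is negative when Krel is below 1 and rises toward zero as Krel approaches 1, which matches the progressively larger ΔΔG described for longer oligonucleotides. With the conventional definition, ΔG = −RT ln K, the signs are reversed but the magnitudes are unchanged.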
We expected
that preferred DNA hybridization sites in folded RNA would also be
preferred hybridization sites for other oligonucleotide chemistries,
since all alternative chemistries discovered to date (39, 52) that bind
single-stranded RNA by Watson-Crick hybridization form right-handed
heteroduplexes with helical conformations close to A or A′ form, as
does DNA. If true, then our combinatorial DNA library screening
strategy would have significant predictive value for biological and
therapeutic applications requiring enhanced nuclease resistance
(i.e. most commonly using P=S) (3, 4) and affinity per unit
length (i.e. highly potentiated for 2′-F) (39). As shown in
Fig. 3, the 355-site discovered using DNA oligonucleotides (Fig.
3D) also is favorable for hybridization of oligonucleotides
substituted with alternative backbone chemical compositions (Fig. 3,
E-F).
However, it also appears that more subtle variations in the conformational
preferences of heteroduplexes of different substitute oligonucleotide
chemistries are manifested as deviations of maximum length limits for
best possible binding at sterically and topologically constrained sites
in folded RNA; evidence for variable topological-conformational constraints over the nt 355-380 site is seen in the uneven
Ka profile (Fig. 3C) for 10-mer DNA
oligonucleotides. In fact, it follows that higher affinity,
conformationally preorganized backbone chemistries (i.e.
2′-F) that form more rigidified heteroduplex helices (39) and are
therefore less conformationally accommodating should exhibit decreased
limit lengths for optimal binding at constrained sites, as observed:
20-mer P=S diester (Fig. 3E) and 18-mer phosphodiester (Fig.
3D) versus 14-mer 2′-F, P=S diester (Fig.
3F) limit lengths for the HCV transcript 355-site.
Alternatively, or additionally, minimal internal structure in
oligonucleotides longer than 14-16 nt could be stabilized by the 2′-F
modification to the extent that the requisite cost of its disruption on
net affinity of hybridization to the 355-site is significant. Although combinatorial screening for hybridization with 10-mer DNA
oligonucleotides strongly biases against structure in these
oligonucleotides, it cannot altogether eliminate the possibility of
internal structure in longer oligonucleotides with selected 10-mer
sequences embedded therein. However, our inability to generate
plausible folded structures makes this explanation appear less
probable. Also, the evidence points to little structure in the RNA
target from nt 355-380, and so it is not likely there is much
structure in the antisense complement spanning this region.
The generally reduced values of Krel for
oligonucleotides of both the lower affinity P=S and higher affinity
2′-F, P=S series compared with those of the DNA series may be
attributed solely to the P=S modification. Chemical synthesis of the
P=S substitution creates a random diastereomeric distribution in
oligonucleotides. A more restricted conformational subset of this total
product distribution may hybridize well (hence a lower average
Ka for the whole population) to more constrained
sites in folded RNA fragments than the larger subset that may be
expected to hybridize well (hence a higher intrinsic Ka° for
the whole population) to more conformationally accommodating, short,
unstructured RNA oligonucleotides, thus accounting for these observed
Krel values. Other ways in which the P=S
modification could negatively bias the observed Krel values for this site compared with values
for DNA may also be possible, but they remain to be identified.
Regardless of the underlying cause, if this observation is found to
apply to other preferred sites of hybridization on folded RNA, then
this even less than expected affinity obtained with the P=S
substitution would argue for minimizing its incorporation in future
designs.
The possibility exists that some additional improvement in
Ka and Krel values for
shorter length 2′-F, P=S oligonucleotides could be realized by a finer
subsite mapping within the nt 350-380 site (i.e.
12-16-mers starting at nt 357 or 359 or centered at nt 365). Even so,
results presented here with the nt 355 start site 2′-F, P=S
oligonucleotide series (Fig. 3F) show that alternative chemistries to DNA (Fig. 3D) can be used successfully in the
context of shorter (10-15-mer) oligonucleotides in order to
simultaneously achieve enhanced nuclease resistance and affinity per
unit length and the maximal specificity that is permitted by the
structure of a given preferred hybridization site.
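The length series and finer subsite mapping discussed above amount to enumerating antisense complements of windows on the target RNA. A minimal sketch follows; the function names are hypothetical, the sequence is an arbitrary placeholder and not HCV sequence, and the "start-length" labels only mirror names such as 355-14 used in this study.

```python
# Watson-Crick complement of an RNA base, written as a DNA base.
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_dna(rna_window: str) -> str:
    """Return the antisense DNA (5' to 3') for an RNA window (5' to 3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna_window.upper()))


def length_series(rna: str, first_nt: int, start_nt: int, lengths):
    """Antisense oligonucleotides of several lengths sharing one start position.

    rna      : target RNA excerpt, 5' to 3'
    first_nt : transcript position number of rna[0]
    start_nt : transcript position where every target window begins
    lengths  : oligonucleotide lengths to generate
    """
    offset = start_nt - first_nt
    if offset < 0:
        raise ValueError("start_nt lies before the excerpt")
    series = {}
    for n in lengths:
        window = rna[offset:offset + n]
        if len(window) < n:
            raise ValueError(f"excerpt too short for a {n}-mer")
        series[f"{start_nt}-{n}"] = antisense_dna(window)
    return series


# Arbitrary placeholder excerpt, numbered as if it began at nt 355.
PLACEHOLDER_RNA = "AUGC" * 6

for label, oligo in length_series(PLACEHOLDER_RNA, 355, 355, range(10, 21, 2)).items():
    print(label, oligo)
```

Shifting `start_nt` (for example to 357 or 359) while holding the length fixed would give the corresponding subsite walk.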
Correlations of in Vitro Optimization of Binding with Biological Antisense Activities
Prior to commencing this work, we felt that there was sufficient
justification to support the premise that a biochemical strategy could
predict biologically active antisense oligonucleotides often enough to
prove useful. Briefly summarized, numerous examples are known of
identity in structure and function of RNA elements either purified (and
often truncated) in biochemical studies or (usually full-length) in
cellular studies (6, 7, 8, 14, 15, 16, 22, 23, 53, 54, 55, 56, 57). High affinity RNA
binding proteins often may not be effective competitors of antisense,
since they usually recognize combinations of RNA sequence and structure
elements that are not favorable for antisense hybridization (53, 54, 55). A
major class of generally lower affinity and specificity RNA
single-stranded binding proteins, RNA chaperones, alter folding
kinetics, but not final structure, of RNA and, likewise, alter
oligonucleotide hybridization kinetics but not specificity with folded
RNA (58, 59). Finally, more directly relevant to the application
described here, preliminary studies elsewhere3 have
demonstrated that identical specificities are obtained for oligonucleotide hybridization to long, folded HCV 5′-NCR-coding region
transcript fragments in the presence and absence of protein introduced
with whole cell cytoplasmic or nuclear extracts.
Results from our in vitro oligonucleotide binding
optimization strategy with purified RNA fragments (Fig. 3D)
do show, in fact, a positive, predictive correlation with biological
antisense inhibition activity in two assay formats. The first is a cell membrane-free IVT assay (Fig. 4). The IVT assay uses a
much longer HCV 5′ 1.4-kilobase transcript fragment in the presence of
cellular RNA-binding proteins. The absence of any inhibition of
translation of the HCV transcript by the 355-10 and 355-12, the
randomized sequence 20-mer (355-20R, except at the highest
concentration), or the sense sequence 20-mer (355-20S) DNA
oligonucleotides or of the heterologous tr-ICAM internal control
transcript translation by any of the oligonucleotides argues against a
(length-dependent) nonspecific effect. It is apparent that
a minimal threshold affinity (i.e. for a DNA 14-mer) is
required to elicit an inhibitory biological response by 355-site
antisense oligonucleotides. For the 355-14 to 355-20 length
oligonucleotides the dose-response curves are reflective of binding
isotherms (Fig. 4A), and higher affinity oligonucleotides
have lower values (within experimental error) for the effective
concentration giving 50% reduction (EC50) in HCV core
protein production (Fig. 4B). These results suggest a direct
correlation of inhibition by an antisense mechanism with hybridization
affinity. Compression of the range of EC50 values for
14-20-mers (~3-fold) compared with the larger range of
Ka values with purified RNA (>10-fold) might
reflect attenuation of binding affinities by proteins.3
These results are in accord with recent IVT assay analyses of walks
with individual DNA or phosphorothioate oligonucleotides over long 5′
HCV transcript fragments, which identified several subsites within the
nt 340-380 region as preferred positions for biological antisense
activity (36, 48, 60).
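An EC50 of the kind compared above can be read from a dose-response series in several ways. The sketch below is not the procedure used in this study; it assumes responses expressed as percent of untreated control and interpolates linearly on a log-concentration axis between the two doses that bracket 50%. The function name and the example numbers are hypothetical.

```python
import math


def ec50_by_interpolation(doses, percent_of_control):
    """Estimate the dose giving 50% of control by log-linear interpolation.

    doses              : positive concentrations (any consistent unit)
    percent_of_control : measured response at each dose, in percent
    Returns None if the response never crosses 50% between adjacent doses.
    """
    points = sorted(zip(doses, percent_of_control))
    for (d0, r0), (d1, r1) in zip(points, points[1:]):
        if r0 >= 50.0 >= r1 and r0 != r1:
            frac = (r0 - 50.0) / (r0 - r1)
            log_dose = math.log10(d0) + frac * (math.log10(d1) - math.log10(d0))
            return 10.0 ** log_dose
    return None


# Hypothetical dose-response data (nM, percent of control).
doses_nm = [10.0, 100.0, 1000.0]
active = [90.0, 50.0, 10.0]
inactive = [98.0, 95.0, 92.0]

print("active oligonucleotide EC50 (nM):", ec50_by_interpolation(doses_nm, active))
print("inactive control EC50:", ec50_by_interpolation(doses_nm, inactive))
```

A control that never reaches 50% inhibition returns `None` rather than an extrapolated value, in keeping with the absence of inhibition reported for the shortest and control oligonucleotides.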
The second biological assay measures HCV core protein expression in
human cells. In this assay, the inhibitory activities of 10-20-mer
2′-F, P=S oligonucleotides targeting the 355-site are dose-responsive
(Fig. 5) and closely correlate with their biochemical
hybridization affinity to folded RNA transcript fragments (Fig.
3F). There is little or no activity of a randomized sequence 20-mer (355-20R), and none of the oligonucleotides tested inhibited expression of the internal control GAPDH. As with the affinities, there
is a diminishing gain in inhibition obtained with the 2′-F rigidified
backbone at longer than 14-16-mers. Nevertheless, the inhibitory
activity of the 2′-F, P=S 355-20-mer oligonucleotide was unsurpassed
(Fig. 6B) when compared with activities
determined identically of the most active 20-mer 2′-F, P=S
oligonucleotides of a linear sequence walk targeting the first 5′
375 nt of the same HCV sequence (Fig. 6, A and B).
Since the majority of these oligonucleotides showed no activity, a few
gave only slight inhibition (Fig. 6A), and
sequence-randomized controls (330-20R and 340-20R) gave little or no
inhibition (Fig. 6B), our results are consistent with
sequence-specific antisense inhibition and not with
sequence-nonspecific, but length-dependent, P=S
oligonucleotide inhibition (9, 11). Further, the most active antisense
oligonucleotides identified by the walk are complementary to sites of
preferred hybridization, as inferred from positions of RNase H cleavage
after combinatorial oligonucleotide screening (i.e. 260-20 and especially 340-, 345-, 350-, and 355-20 oligonucleotides). Similar
results using the HCV core protein expression assay recently were
obtained from a walk with uniformly modified 2′-methoxyethoxy
phosphodiester 20-mers (35). Other workers also have obtained the most
effective antisense activity in various cell-based assays when
oligonucleotides were directed against the initiation AUG codon and
initial coding region of HCV RNA (48, 60). Finally, it is of interest
and of possible therapeutic value that the inhibition achieved in this
study by binding optimized antisense oligonucleotides complementary to
coding sequences was seen without the recruitment of RNase H activity
(not supported by the 2′-F and 2′-methoxyethoxy modifications (35,
39)), which may provide for greater flexibility in future design and
utilization.
Conclusions and Implications
Combinatorial DNA oligonucleotide screening (Fig. 1A) and rational optimization (Fig. 1B) of oligonucleotide sequence, length, and chemistry for affinity and specificity of hybridization to structurally preferred sites on folded HCV RNA transcript fragments have been quantitatively established in vitro and biologically validated. This correlation lends further support to the growing value to biology of careful biochemical studies with purified RNA fragments and, in particular, shows that potential artifacts resulting from artificial ends of transcript fragments can be avoided. Further applications of this strategy will be required to better assess the general predictive value for efficacy of antisense oligonucleotides in living cells. In the present study, the identified preferred sites for DNA hybridization also hybridize well with oligonucleotides having unnatural chemistry-substituted backbones conferring enhanced nuclease resistance and hybridization affinity. The results of this study support the hypothesis that binding-optimized shorter oligonucleotides (i.e. 10-15-mers) may often have equivalent or higher affinity and significantly greater hybridization site specificity than longer ones (i.e. 20-mers). We recommend comprehensive biological testing focused on oligonucleotides optimized for hybridization to the various candidate sites identified by our biochemical strategy in order to determine those sites where antisense generates the most potent biological inhibition. In those cases where the correlation of in vitro binding and biological activity is not confounded by the cellular environment, the global RNA binding specificity, that is, the discrimination against binding to all intracellular RNA sites other than the one targeted (10), should be improved using oligonucleotides optimized in vitro for binding affinity and specificity for a particular preferred site.
As for the alternative semicombinatorial antisense fragment probing strategies (25, 28) described earlier, practical selections against RNAs much longer than is possible with the technology reported here are greatly desired. Primer extension copying of the RNase H-generated RNA fragments into DNA should facilitate improved resolution on denaturing PAGE. However, the preferred eventual assay to use with our strategy would be realized if combinatorial hybridization screening in solution, in accordance with the conditions established in this study, could be followed by (i) rapid removal (preferably enzymatically) of unbound library oligonucleotides (i.e. using single-strand-specific DNases similarly to, and within the time period of, successful posthybridization RNase H treatment of this study), (ii) melting of the RNA-preferentially hybridized oligonucleotide heteroduplexes, and then (iii) some format of oligonucleotide matrix array (61) capture and imaging of released oligonucleotides. If successful, such an alternative method to RNase H cleavage for hybridization analysis could allow direct screening of libraries synthesized incorporating novel chemistries. In selecting these analog chemistries, our results suggest that less rigid backbones may better accommodate the variable topological and conformational constraints on available hybridization sites on folded RNA and thereby will more fully realize the potential affinity enhancement of the chemistry used.
We thank O. Uhlenbeck and H. Moser
for intellectual support and periodic critical review; S. T. Crooke for
challenging questions; P. D. Cook, D. J. Ecker, K. Anderson, and J. Kiely for providing research support and resources; E. DeBaets and P. Davis for randomized DNA library synthesis; Isis oligonucleotide
synthesis facility, especially H. Sasmor, for DNA, P=S, and 2′-F/P=S
oligonucleotides; members of our laboratories, especially D. Robertson
and R. Griffey, for helpful suggestions; reviewers for helpful comments
and suggestions; and Kaketsuken Chemo-Sero Therapeutic Research
Institute for the HCV genomic clone and the H8Ad-17c cell line.
Subsequent to the initial submission of this manuscript it was reported (62) that high affinity chemistry-substituted antisense oligonucleotides as short as 7-mers can be potent and selective inhibitors of gene expression in cells, consistent with some of the assertions and results of this study.