A Novel Long Distance Base-pairing Interaction in Human Immunodeficiency Virus Type 1 RNA Occludes the Gag Start Codon*

Truus E. M. Abbink and Ben Berkhout‡

From the Department of Human Retrovirology, Academic Medical Center, University of Amsterdam, 1100 DE Amsterdam, The Netherlands

Received for publication, October 8, 2002, and in revised form, November 21, 2002

    ABSTRACT

The 5'-untranslated region (5'-UTR) is the most conserved part of the HIV-1 RNA genome, and it contains regulatory motifs that mediate various steps in the viral life cycle. Previous work showed that the 5'-terminal 290 nucleotides of HIV-1 RNA adopt two mutually exclusive secondary structures, long distance interaction (LDI) and branched multiple hairpin (BMH). BMH has multiple hairpins, including the dimer initiation signal (DIS) hairpin that mediates RNA dimerization. LDI contains a long distance base-pairing interaction that occludes the DIS region. Consequently, the two conformations differ in their ability to form RNA dimers. In this study, we have presented evidence that the full-length 5'-UTR also adopts the LDI and BMH conformations. The downstream 290-352 region, including the Gag start codon, folds differently in the context of the LDI and BMH structures. These nucleotides form an extended hairpin structure in the LDI conformation, but the same sequences create a novel long distance interaction with upstream U5 sequences in the BMH conformation. The presence of this U5-AUG duplex was confirmed by computer-assisted RNA structure prediction, biochemical analyses, and a phylogenetic survey of different virus isolates. The U5-AUG duplex may influence translation of the Gag protein because it occludes the start codon of the Gag open reading frame.

    INTRODUCTION

Human immunodeficiency virus type 1 (HIV-1) virions contain two full-length positive-stranded RNA molecules as their genome. The full-length RNA not only serves as the viral genome but also functions as an mRNA that encodes the Gag and Gag-Pol polyproteins. The highly structured 5'-UTR is the most conserved part of the HIV-1 genome and is involved in several steps of the viral replication cycle (1). Distinct functions have been assigned to individual sequence and/or structure motifs (presented in different colors in Fig. 1A). The 5'-UTR consists of an upstream repeat (R) region that recurs at the 3'-terminus of the HIV-1 genome and that comprises TAR and the polyadenylation (poly(A)) signal. The well characterized TAR hairpin mediates transcription activation by binding the viral Tat protein and the cellular protein cyclin T (2-10). The poly(A) hairpin inhibits premature polyadenylation of the nascent RNA by masking the AAUAAA polyadenylation signal (11, 12). The U5 region is located downstream of the R region and contains two important signals for reverse transcription, the primer activation signal (PAS) and the primer binding site (PBS) (13, 14). Additional essential motifs are located further downstream in the 5'-UTR. These include the RNA dimer initiation signal (DIS), the major splice donor site (SD) that is required for the generation of subgenomic mRNAs, the packaging signal (Ψ) that is required for the assembly of infectious virus particles, and a hairpin motif that includes the Gag start codon (15-23).

The secondary structure of the HIV-1 5'-UTR has been studied extensively, and a variety of structure models have been proposed (1, 18, 19). Recently, the 5'-UTR was shown to fold alternative secondary structures (Fig. 1A) (24). The ground state conformation is formed by a long distance interaction of the poly(A) and DIS regions and is termed LDI. The alternative, metastable conformation is a branched structure with multiple hairpins and is termed BMH. The two conformations differ in their ability to form RNA dimers. The DIS sequence is masked in the LDI conformation by long distance base pairing with upstream sequences, thus preventing dimer formation. In contrast, the DIS hairpin with the palindromic loop sequence is folded in the BMH structure.

Thus, BMH RNA is able to engage in a kissing-loop interaction with the DIS palindrome of a second RNA molecule, thereby forming loose dimers (15, 25-30). Heat treatment or incubation with the HIV-1 nucleocapsid (NC) protein triggers the formation of a tight dimer with extended inter-strand base pairing (15, 25, 28, 31). Interestingly, the NC protein also mediates the switch from LDI to BMH (24). This RNA switch mechanism may allow regulation and appropriate timing of the different 5'-UTR functions. For instance, the HIV-1 genomic RNA should be translated into the Gag and Gag-Pol proteins prior to RNA dimerization and packaging into assembling virions.

The LDI and BMH structures have been studied in transcripts that comprise the 5'-terminal 290 nucleotides (nts) of the HIV-1 leader RNA. Because the SD site is located at nucleotide position 289, these results suggest that both genomic and subgenomic HIV-1 mRNAs can fold the LDI conformation. The 5'-UTR of the genomic HIV-1 RNA consists of 335 nucleotides up to the AUG start codon of the Gag open reading frame (ORF). In this work, we studied the folding of the downstream leader region 290-368 that contains the SD and Ψ signals and part of the Gag ORF. Computer-assisted folding and a phylogenetic survey of the leader RNA of different primate lentiviruses revealed a novel long distance interaction between U5 sequences and the Gag initiation codon: the U5-AUG duplex. The proposed U5-AUG long distance interaction was analyzed by mutational analysis, polyacrylamide gel electrophoresis, and RNA structure probing. The U5-AUG long distance interaction is formed exclusively in the BMH structure and not in the alternative LDI fold. The duplex is of particular interest because it occludes the AUG start codon of the Gag ORF, and it therefore has the potential to be involved in regulation of mRNA translation.

    EXPERIMENTAL PROCEDURES

RNA Secondary Structure Prediction-- Computer-assisted RNA secondary structure predictions were performed using the Mfold version 3.0 algorithm (32, 33) offered by the MBCMR Mfold server (mfold.burnet.edu.au/). Standard settings were used for all folding jobs (37 °C and 1.0 M NaCl, with a 5% suboptimality range). Folding was performed with sequences comprising nucleotides 1-368 of the genomic RNA sequence of the wild-type (wt) and mutant HIV-1 LAI RNA. Phylogenetic studies were based on MFold data obtained with 500-nucleotide leader fragments of the primate lentiviral genomes.

Constructs-- For mutation of the HIV-1 leader RNA, we used the plasmid Blue-5'LTR (34). This pBluescript-derived construct contains the XbaI-ClaI fragment of the infectious pLAI clone, including the 5'-LTR, the complete 5'-UTR, and part of the Gag ORF (-454/+376). Mutations were created by a standard PCR mutagenesis protocol. For construction of the s1 mutation, oligonucleotide primers TA007 (5'-CCC76AAGCTTGCCTTGAGTGCTTCAAGTAGTGTGCACCCATCTGTTGTGTGACTCT GG130-3') and AD-GAG (complementary to position 442-462 of the HIV-1 genome) were used in a standard PCR reaction. For the w1 and w2 mutations, we used the forward primers TA009 (5'-CCC76AAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGTTTGTCTGTTGTGTG ACTCTGG130-3') and TA008 (5'-CCC76AAAGCTTGCCTTGAGTGCTTCAAGTAGTGTG GAAAGTCTGTTGTGTGACTCTGG130-3'). The mutated nucleotides are underlined, and the nucleotide positions of the HIV-1 sequence are indicated in superscript. The sequence of the PCR products was confirmed by sequencing. The mutant PCR products were digested by HindIII and ClaI and cloned into the Blue-5'LTR vector. The XbaI-ClaI fragments were subsequently cloned into pLAI-R37, a derivative of the full-length infectious clone pLAI (35). The mutant proviral constructs were designated pLAI-s1, -w1, and -w2. Transfection of the SupT1 cell line was performed by electroporation, and CA-p24 levels were determined as described previously (13, 36).

In Vitro Transcription and RNA Dimerization-- pLAI, pLAI-s1, -w1, and -w2 plasmids were used as templates in a PCR reaction with primers T7-2 (corresponding to position 1-18 of the HIV-1 genome with an upstream T7 RNA promoter sequence) and R:A368-A347 (complementary to 368-347 of the HIV-1 genome). The PCR products were ethanol-precipitated and used for in vitro transcription by T7 RNA polymerase with [α-32P]dCTP according to the manufacturer's protocol (MEGAshortscript T7 transcription kit, Ambion, Inc.). Transcription reactions were stopped by addition of formamide-containing loading buffer and applied to 5% denaturing polyacrylamide gels. Gel slices containing the radiolabeled transcript were excised and soaked in TBE buffer (90 mM Tris borate, 2 mM EDTA) overnight at room temperature to elute the RNA. The RNA was ethanol-precipitated and dissolved in water. Equal amounts of RNA were heat-denatured and slowly renatured in the presence of dimerization buffer L (40 mM NaCl, 0.1 mM MgCl2, 10 mM Tris-HCl, pH 7.5). Aliquots were analyzed on polyacrylamide gels in 0.25 × TBE (22.5 mM Tris borate, 0.5 mM EDTA) and 0.25 × TBM (22.5 mM Tris borate, 0.1 mM MgCl2), with either a formamide-containing or a non-denaturing loading buffer. Gels were dried and applied to a Storm PhosphoImager. We used the computer program ImageQuant 5.0 (Amersham Biosciences) to quantify the RNA signals. The dimerization yield was determined by dividing the amount of dimer by the total amount of RNA (dimer plus monomer).
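The dimerization yield defined above is a simple ratio of band signals. A minimal sketch of that calculation follows; the function name and the example band intensities are hypothetical and are not taken from the ImageQuant output of this study.

```python
def dimerization_yield(dimer_signal: float, monomer_signal: float) -> float:
    """Return the fraction of RNA in the dimer band: dimer / (dimer + monomer).

    Both arguments are background-corrected band intensities in the same
    arbitrary units (hypothetical inputs, e.g. from a phosphorimager scan).
    """
    if dimer_signal < 0 or monomer_signal < 0:
        raise ValueError("band signals must be non-negative")
    total = dimer_signal + monomer_signal
    if total == 0:
        raise ValueError("total RNA signal must be positive")
    return dimer_signal / total


if __name__ == "__main__":
    # Hypothetical intensities for one lane: 30 units dimer, 70 units monomer.
    print(f"{dimerization_yield(30.0, 70.0):.0%}")
```

Because the yield is normalized to the total signal in the lane, it is insensitive to moderate differences in the amount of RNA loaded per lane.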

RNA Structure Probing-- pLAI and pLAI-s1 plasmids were used as templates in a PCR reaction with primers T7-2 and TA015 (complementary to 442-462 of the HIV-1 genome). The PCR products were ethanol-precipitated and used for in vitro transcription with the Ambion MEGAshortscript T7 transcription kit. Transcripts were DNase I-treated, phenol-extracted, ethanol-precipitated, and dissolved in water. The RNA samples were heat-denatured, followed by addition of sodium cacodylate (pH 7.0) and MgCl2 to final concentrations of 100 and 1 mM, respectively. The RNA (10 µg) was treated at room temperature with 2 µl of kethoxal for 10 min, with 1 µl of dimethylsulfate (DMS) for 5 min, or mock-treated. The reactions were stopped by addition of 50 µg of Escherichia coli tRNA. The RNA was ethanol-precipitated and dissolved in 22 µl of water, and 4 µl was used in a primer extension assay with 5'-end-labeled oligonucleotide primers. Antisense primers reverse poly(A) 104/77 (complementary to 77-104 of the HIV-1 genome), cn3 (complementary to position 133-161), lys21 (complementary to 182-202), R:G290-G270 (complementary to 270-290), R:A368-A347, and TA015 were end-labeled with [γ-32P]ATP and T4 polynucleotide kinase. The kinase was inactivated at 80 °C for 10 min. For a primer extension reaction, 2 ng of the labeled probe was heat annealed to the RNA in 83 mM Tris-HCl (pH 7.5) and 125 mM KCl. Avian myeloblastosis virus reverse transcriptase (RT, 5 units) was added in RT buffer to yield a mixture with 3 mM MgCl2, 10 mM dithiothreitol, 10 µM dNTP, and 50 µg/ml actinomycin D. The reactions were incubated at 37 °C for 1 h and stopped by the addition of 200 mM NaOH and an additional incubation of 20 min. The samples were ethanol-precipitated, dissolved in formamide loading buffer, and applied to 10% polyacrylamide sequencing gels. The products were visualized by a Storm PhosphoImager.

    RESULTS

Identification of the U5-AUG Long Distance Interaction-- Previous work by Huthoff and Berkhout (24) showed that the HIV leader RNA is able to form two mutually exclusive secondary structures. In the ground state structure, the leader RNA adopts the LDI conformation that is based on an interaction between the poly(A) and DIS regions. In the presence of the viral NC protein, the LDI conformation switches to the BMH conformation that presents the poly(A) and DIS hairpins. Studies thus far have focused on transcripts that comprise the 5'-terminal 290 nucleotides of the HIV-1 leader RNA. In this study, we have analyzed the RNA folding of the complete 5'-UTR. Using the MFold computer program, we identified a novel long distance interaction that includes the start codon of the Gag ORF. This base-pairing possibility occurs between nucleotides 105-115 in the U5 region and 334-344 surrounding the AUG initiation codon and is termed the U5-AUG duplex. These two sequence elements are marked in the linear presentation of the 5'-UTR and the LDI and BMH structures (Fig. 1A). The duplex consists of 11 consecutive base pairs, including four G-U base pairs (Fig. 1B). The MFold results indicate that formation of the U5-AUG duplex occurs exclusively in BMH-like structures and not in the LDI conformation (results not shown).
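The composition of the duplex (11 base pairs, four of them G-U) can be checked with a short script. The sketch below pairs the two segments in antiparallel orientation and counts Watson-Crick, wobble, and mismatched positions. The sequences are an assumption: they are not quoted in the text but were reconstructed here from the mutagenic primers and the base-pair descriptions given for the wt, s1, w1, and w2 constructs, so they should be treated as illustrative. All function and variable names are hypothetical.

```python
from collections import Counter

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def classify_pair(x: str, y: str) -> str:
    """Classify one RNA base pair as watson-crick, wobble, or mismatch."""
    if (x, y) in WATSON_CRICK:
        return "watson-crick"
    if (x, y) in WOBBLE:
        return "wobble"
    return "mismatch"


def duplex_summary(upstream: str, downstream: str) -> Counter:
    """Count pair types for two equal-length segments, both written 5'->3'.

    The duplex is antiparallel, so the first upstream nucleotide is paired
    with the last downstream nucleotide.
    """
    if len(upstream) != len(downstream):
        raise ValueError("segments must have equal length")
    return Counter(
        classify_pair(x, y) for x, y in zip(upstream, reversed(downstream))
    )


# ASSUMED sequences (reconstructed, not quoted from the paper).
AUG_SEGMENT = "AGAUGGGUGCG"      # nts 334-344, Gag AUG at 336-338
U5_SEGMENTS = {
    "wt": "UGUGCCCGUCU",         # nts 105-115
    "s1": "UGCACCCAUCU",         # three G-U pairs replaced
    "w1": "UGUGUUUGUCU",         # central C-G pairs -> U-G
    "w2": "UGUGAAAGUCU",         # central C-G pairs -> A-G mismatches
}

if __name__ == "__main__":
    for name, u5 in U5_SEGMENTS.items():
        print(name, dict(duplex_summary(u5, AUG_SEGMENT)))
```

Under these assumed sequences, the wild-type count reproduces the four G-U pairs stated above. A count of this kind says nothing about thermodynamic stability, which requires a nearest-neighbor energy model such as the one used by Mfold.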


Fig. 1.   Overview of the HIV-1 5'-UTR organization and structure of the wild-type and mutant U5-AUG duplex. A, top, organization of the genomic 5'-UTR with the regulatory motifs is indicated by colored boxes. The two segments that form the long distance base-pairing interaction (U5-AUG duplex) are indicated in red. The Gag initiation codon is marked by an asterisk. Middle, traditional secondary structure model of the genomic 5'-UTR that highlights the hairpin structures and the regulatory motifs (1). Bottom, the alternative LDI and BMH structures of the genomic 5'-UTR. The U5 and AUG segments are single-stranded in the BMH fold and are now proposed to form the U5-AUG duplex. B-D, base pairing of the wild-type and mutant U5-AUG duplexes. Nucleotide positions are indicated. Mutated nucleotides are indicated in bold. The thermodynamic stability is indicated at the right (ΔG in kcal/mol).

To test for the presence of the U5-AUG duplex, we designed mutants that either strengthen or weaken the base pairing interaction (Fig. 1C). The duplex is stabilized in the s1 mutant by substitution of three G-U base pairs by one G-C and two A-U base pairs. The duplex is destabilized in the w1 and w2 mutants by replacing the central C-G base pairs either by U-G base pairs or by A-G mismatches, respectively. We first set out to determine the dimerization properties of the wt and mutant transcripts on a non-denaturing gel. Radiolabeled transcripts of the genomic RNA (nts 1-368) were synthesized in vitro, incubated at RNA dimerization conditions, and analyzed on gel (Fig. 2A). RNA monomers and dimers were detected for all transcripts in TBE and TBM gels. The most noticeable observation is that the s1 transcript with the stabilized U5-AUG duplex migrates faster in TBM gels than the wt transcript. Formamide-denatured samples were included as control (indicated above the lanes), and the remarkable migration of the s1 transcript is lost upon denaturation. The s1 dimer also migrates faster than the wt dimer in the TBE gels, but the fast migrating s1 monomer is observed as a diffuse band. Apparently, Mg2+ in the gel is required to stabilize the U5-AUG duplex in s1 monomers. The results of two experiments were quantified to calculate the level of RNA dimerization for the wt and mutant transcripts (Fig. 2B). The TBM gel shows both dimer types (loose and tight dimers), whereas only tight dimers are detected on TBE gels. The small difference in dimerization efficiencies in TBE versus TBM gels is therefore likely because of the presence of loose dimers on the latter gel type. The mutant transcripts s1, w1, and w2 dimerize more efficiently than the wt transcript, independent of the presence of Mg2+. Apparently, all mutations in the U5 motif result in elevated levels of RNA dimerization, which may be because of their destabilizing effect on the LDI conformation.


Fig. 2.   Gel electrophoresis of wild-type and mutant U5-AUG duplex transcripts. A, migration of wild-type, s1, w1, and w2 transcripts on non-denaturing TBE or TBM gels. RNA monomers (M) and dimers (D) are indicated on the left. The presence of denaturing formamide (F) in the loading buffer is indicated above the lanes. Arrows indicate the unusually fast migrating monomer and dimer of the s1 transcript. B, dimerization yields for the wt and mutant transcripts. The results are the average of two experiments, and the standard deviation is indicated. The dimerization yield was determined on TBE and TBM gels.

To test whether the fast migration of the s1 transcript is caused by stabilization of the U5-AUG interaction, we created a set of double mutants (Fig. 1D). The downstream segment 334-343 of the U5-AUG duplex was substituted by sequences that disrupt or weaken base pairing. The three central C-G base pairs were opened in the AUG3 mutant, and nearly all base pairs were disrupted in the AUG10 mutant. The destabilizing mutations were introduced both in the wt and s1 mutant transcripts. The wt and mutant transcripts were subjected to non-denaturing gel electrophoresis (Fig. 3A). Most importantly, opening of the U5-AUG duplex in the s1-AUG3 and s1-AUG10 mutants corrects the unusual migration of the s1 transcript. The AUG3 and -10 mutations have no effect on the migration of the wt transcript. These results confirm that formation of the U5-AUG duplex in transcript s1 induces a conformation in the HIV-1 leader RNA that migrates relatively fast during gel electrophoresis. We also quantified the dimerization efficiencies of this set of mutants (Fig. 3B). The wt transcript shows a moderate increase in dimerization efficiency upon introduction of the AUG3 or -10 mutations (from 30 to 34% dimers). The s1 transcript shows increased dimerization (60% dimers), and this effect is countered by the AUG3 or -10 mutations (47% dimers). Thus, the increased dimerization efficiency of the s1 transcript is caused, at least partially, by stabilization of the U5-AUG duplex in the BMH context.


Fig. 3.   Gel electrophoresis of the U5-AUG double mutant transcripts. A, migration of wild-type and mutant transcripts on a non-denaturing TBM gel. See Fig. 2 legend for further details. B, dimerization yields for the wild-type and mutant transcripts. The dimerization yield for each transcript was determined as in Fig. 2. The results of one representative experiment were quantified.

New HIV-1 RNA Structure Models for the Full-length 5'-UTR-- We next set out to determine the secondary structure of the wt and s1 mutant transcript (1-462) by RNA structure probing. Because the fast migrating s1 structure was only visible in the presence of Mg2+ (Fig. 2), the transcripts were heat-denatured and refolded in the presence of Mg2+. The transcripts were treated with limiting amounts of kethoxal or DMS and subsequently used as template for reverse transcription with several antisense DNA primers. The cDNA products were analyzed by denaturing gel electrophoresis. The complete set of probing data is listed in Table I. To facilitate the discussion of this complex data set, we will first present the new secondary structure models in Fig. 4. The wt RNA is folded in the ground state LDI conformation, in which the poly(A) and DIS regions (marked orange and pink) are base paired in a long distance interaction that extends the stem of the PBS domain. The downstream region 282-352 folds an extended stem-loop structure with three internal loops and a GGAG loop (marked yellow). The top of this extended hairpin is, in fact, the previously described Ψ or SL3 hairpin that is required for viral RNA packaging (37). We termed the extended hairpin ΨE. The SD site (marked gray) is located within an internal loop of ΨE. The Gag initiation codon (marked by an asterisk) is located in the central internal loop and the adjacent stem segment of ΨE. The downstream Gag sequences (nts 358-367) are possibly engaged in long distance base pairing with nucleotides 60-67 in the R region directly downstream of TAR. This interaction is termed the R-Gag duplex. In contrast, the s1 mutant RNA folds the BMH structure that exposes both the poly(A) and DIS hairpins. The downstream sequences in s1 RNA fold the SD hairpin and the short version of the Ψ hairpin, and the leader domain is closed by the U5-AUG duplex (105-115 pairs with 334-344).


                              
Table I
Secondary structure probing of the wt and s1 mutant leader RNA
The reactivities of wt and s1 RNA to kethoxal (G-specific) and DMS (A- and C-specific) were estimated and classified into five categories: +++ = highly reactive, ++ = reactive, + = moderately reactive, +/- = marginally reactive, - = not reactive. The sequences that constitute the U5-AUG duplex are indicated by outlined boxes, and the Gag initiation codon is indicated in bold. The sequence substitutions in the s1 RNA are indicated in italics. s indicates reverse transcription stops.


Fig. 4.   The extended LDI and BMH structure models of the complete HIV-1 5'-UTR. The models are based primarily on the RNA structure probing data that are presented in detail in Table I and Fig. 5. The regulatory motifs are marked in colors as in Fig. 1A. The two segments of the U5-AUG duplex are presented in a red-outlined box, and the Gag initiation codon is marked by an asterisk. Overall, the percentage of single-stranded nucleotides in the BMH conformation is similar to that of the LDI conformation (39%).

The structures shown in Fig. 4 are consistent with the MFold analyses. The LDI conformation with the extended PBS stem and the extended ΨE hairpin is the most stable structure adopted by the wt RNA. The BMH folding with the multiple hairpins (poly(A), DIS, SD, and short Ψ) and the novel U5-AUG duplex is the most stable structure adopted by s1 RNA. Apparently, the metastable BMH folding is facilitated by stabilization of the U5-AUG interaction. We previously demonstrated that the BMH fold can also be triggered by stabilization of the poly(A) or the DIS hairpin (24). A few leader RNA motifs do not change their structure during the LDI to BMH switch: the TAR hairpin (nts 1-57, marked green), the upper primer activation signal/primer binding site domain (nts 116-239, marked lilac and blue), and the short Ψ hairpin (nts 305-331, marked yellow). The constitutive folding of the TAR and PBS domains in the LDI and BMH structures was described previously (24). Apparently, these structures fold autonomously, suggesting that their biological function is independent of the LDI/BMH switch.

Structure Probing-- The structure probing data of the wt and s1 RNA are presented to highlight the differences between the LDI and BMH structures. There are three regions that differ significantly in accessibility to the single strand-specific reagents kethoxal and DMS in the two transcripts. The first region is segment 105-115 of the U5-AUG duplex in which the s1 mutations were introduced (Fig. 5A). G106 and G108 are accessible to kethoxal in the wt transcript, whereas G106 and A108 are not sensitive to kethoxal and DMS in the s1 transcript. Apparently, these nucleotides are base-paired in the s1 transcript. Interestingly, the control primer extension reaction yields two major stop products on the s1 RNA template at position U118 and U120 (marked s in Table I). Because the wt transcript has an identical sequence, it is likely that the RT enzyme is stopped by a structure that is specific for the s1 template. Apparently, the RT enzyme stopped three and five nucleotides before reaching the U5-AUG duplex. The second region that shows differential s1-wt reactivity concerns the sequences flanking the Gag initiation codon (Fig. 5B). Purines 332-336 are exclusively accessible to kethoxal and DMS in the wt transcript, indicating that these nucleotides are single-stranded. The third region that exhibits major probing differences is domain 235-242 (Fig. 5C). This sequence is completely sensitive to kethoxal and DMS in the s1 transcript, whereas it is only partially sensitive in the wt transcript. Together, these results support the folding of the U5-AUG interaction in the s1 mutant transcript. As a result, nucleotides 240-242 become single-stranded exclusively in the BMH fold of the s1 transcript (Figs. 4 and 5C). These nucleotides are paired to nucleotides 113-115 in the LDI conformation of the wt transcript.



Fig. 5.   RNA structure probing data for the wt and s1 transcripts. Each gel segment shows the primer extension products of wt and s1 transcripts after mock treatment (-) or treatment with limiting amounts of the G-specific reagent kethoxal (K) and the A/C-specific chemical DMS (D). The relevant parts of the LDI and BMH structure are shown on the left and right, respectively. Nucleotide positions that are discussed in the text are indicated. The s1 mutations are indicated by arrows. A, nts 92-124. B, nts 322-338. C, nts 209-241. D, nts 53-90. E, nts 253-282.

The poly(A) and DIS regions also react differently in the wt and s1 transcripts (Fig. 5D). All five A residues of the poly(A) signal 73AAUAAA78 are equally accessible to DMS in the wt transcript, confirming that the poly(A) signal is single-stranded as in the LDI structure. In contrast, 73AA74 is less exposed to DMS than 76AAA78 in the s1 transcript, indicating that the poly(A) hairpin of the BMH conformation is formed. We previously used this differential reactivity within the poly(A) signal to differentiate between the LDI and BMH structures (24). Several nucleotides in the DIS region (264 and 274-276) are more exposed in wt RNA compared with s1 RNA, confirming the LDI fold of wt RNA (Fig. 5E). In contrast, A263 is exclusively DMS-sensitive in the s1 transcript, consistent with the folding of the DIS hairpin in the BMH structure. These combined results confirm that the wt transcript adopts the LDI conformation as the ground state structure and the s1 mutations force the RNA into the alternative BMH fold.

Slight differences in reactivity between the two transcripts are also observed for positions 66-68, 274-305, and 356 (Table I). For instance, G290 and G292 are more reactive and G298 is less reactive in s1 RNA. These differences led to the proposed folding of the SD hairpin in the BMH structure and the ΨE hairpin and R-Gag duplex in the LDI conformation (Fig. 4). The nucleotides in the bottom stem segment of ΨE and in the R-Gag duplex are moderately accessible to kethoxal/DMS (Table I), suggesting that these RNA structures are metastable.

Phylogenetic Analysis of U5-AUG Duplex-- We have shown that the s1 mutant folds the U5-AUG duplex as part of the BMH fold. The U5-AUG interaction is not present in the LDI fold, which is the most stable structure of the wt leader RNA. It proved difficult to formally demonstrate that the U5-AUG duplex will be formed in the wt leader once the RNA switches into the BMH structure because the LDI conformation is strongly favored. We therefore performed an extensive phylogenetic analysis of leader sequences of other lentiviruses to provide further evidence for the U5-AUG interaction in the form of base pair co-variations (Fig. 6). This survey presents convincing evidence for the proposed long distance base pairing. For instance, the closing base pair U-A is replaced by C-G in the HIV-1 isolate from the N (new) group. The U5-AUG duplex is also conserved in the more distantly related simian immunodeficiency viruses (SIV) and HIV-2 lentiviruses. It is not surprising that the AUG start codon is absolutely conserved, but we identified numerous sequence changes in the nucleotides that flank the start codon. These changes are compensated by complementary changes in the upstream U5 sequences. For instance, the U5-AUG duplex in SIVl'Hoest shows five co-variations, two of which affect the Gag ORF. Despite all sequence variations, it is remarkable that the stability of the U5-AUG interaction is kept within certain limits, ranging from 10 to 13 base pairs. Because the U5-AUG duplex is present exclusively in the BMH structure, major changes in its stability will have a direct impact on the LDI-BMH equilibrium. This requirement may explain the conservation of the U5-AUG duplex stability, exactly as was described for other leader RNA structures such as the poly(A) hairpin (38, 39). In summary, the phylogenetic data indicate that U5-AUG base pairing, but not the actual nucleotide sequence, is conserved among primate lentiviruses. 
The combined results support a function for this long distance interaction in the viral replication cycle. To directly test this, we performed replication experiments with the wt and w2-mutated viruses. One representative replication curve in the SupT1 cell line is shown in Fig. 7. This result indicates that preventing the formation of the U5-AUG duplex leads to a significant replication defect. Studies are ongoing to further analyze these mutant viruses and select for phenotypic revertants.


Fig. 6.   Phylogenetic analysis of the U5-AUG duplex in primate lentiviruses. The U5-AUG duplex is shown for different HIV-1 subtypes, HIV-2, and all SIV lineages. Nucleotide positions are indicated for each duplex. The HIV-1 LAI isolate is shown on top as the prototype. Nucleotide changes in the other HIV-SIV isolates are marked in bold when they conserve the base-pairing potential. This includes semi-co-variations (e.g. A-U to G-U) and true co-variations (e.g. A-U to G-C). Asterisks indicate the Gag initiation codon.
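The distinction between semi-co-variations and true co-variations used in this survey can be expressed as a small rule. The sketch below is an illustration under stated assumptions, not the procedure used to build Fig. 6: it assumes that the reference pair is itself a valid base pair, that G-U counts as a pair, and that a change that abolishes pairing is labeled "disruptive". The function names and labels are hypothetical.

```python
# Watson-Crick and G-U wobble pairs are both treated as base pairs here.
PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def classify_variation(ref_pair: tuple, var_pair: tuple) -> str:
    """Compare a variant base pair with the reference pair.

    Each pair is (nucleotide in the U5 strand, nucleotide in the AUG strand).
    Returns "unchanged", "disruptive", "semi-co-variation" (one partner
    changed, pairing kept), or "co-variation" (both partners changed,
    pairing kept).
    """
    if ref_pair == var_pair:
        return "unchanged"
    if var_pair not in PAIRS:
        return "disruptive"
    changed = sum(r != v for r, v in zip(ref_pair, var_pair))
    return "semi-co-variation" if changed == 1 else "co-variation"


if __name__ == "__main__":
    print(classify_variation(("A", "U"), ("G", "U")))  # one partner changed
    print(classify_variation(("A", "U"), ("G", "C")))  # both partners changed
    print(classify_variation(("U", "A"), ("C", "G")))  # closing pair, group N
```

Applied column by column to an alignment of the two duplex strands, a rule of this kind separates changes that merely tolerate pairing from changes in which both strands have been altered, the latter being the stronger evidence for a conserved structure.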


Fig. 7.   The w2-mutated virus has a significant replication defect. The SupT1 cell line was transfected with 1 µg of the wt and w2 proviral construct. CA-p24 production was measured in the culture medium at several days post-transfection.


    DISCUSSION

We analyzed the secondary structure of the complete 5'-UTR of the HIV-1 genomic RNA (nts 1-368). Previous studies on the 5'-terminal 290-nucleotide fragment indicate that the 5'-UTR is able to adopt two mutually exclusive structures, LDI and BMH. This study reveals that the complete 5'-UTR also folds these alternative conformations. The 3'-terminal 290-368 segment contributes differently to the LDI and BMH structures. In the context of the ground state LDI conformation, these sequences fold the well known Ψ hairpin, but in a significantly extended form. This ΨE hairpin includes the SD signal and the Gag start codon. The internal loops of this ΨE hairpin are remarkably symmetrical and purine-rich; only a single U and no C residues are present among the 29 single-stranded nucleotides. Different extended forms of the Ψ hairpin have been described (19, 37, 40-42). These alternative conformations are not confirmed by our RNA structure probing results with the full-length 5'-UTR. It is clear from our study that the lower half of the ΨE hairpin cannot be too rigid because it has to melt to allow the formation of the SD hairpin and the U5-AUG duplex in the BMH context. The LDI conformation of the complete 5'-UTR also contains a long distance base-pairing interaction between sequences in the Gag ORF and the R region immediately downstream of TAR (R-Gag duplex). Studies are in progress to verify the presence of this duplex and its role in HIV-1 biology.

In the metastable BMH conformation, the lower part of ΨE is opened to allow the formation of the SD hairpin that flanks the short version of the Ψ hairpin. Furthermore, the downstream Gag sequences are also free to engage in alternative base pairing to form the long distance interaction with upstream U5 sequences: the U5-AUG duplex. In fact, formation of the U5-AUG duplex creates a base-pairing partner for the single-stranded U5 region between the poly(A) hairpin and the PBS domain that thus far could not be paired with other sequences in the HIV-1 RNA genome. Former studies suggested that the Gag initiation codon is located in the bottom stem of a bulged hairpin (1, 43). This folding was based on Mfold and probing studies of incomplete leader sequences that lack nucleotides 105-115 that are necessary to form the U5-AUG interaction. The U5-AUG duplex closes a domain with multiple hairpins (PBS, DIS, SD, and Ψ) that are separated by a purine-rich ring structure with A9, G7, C1, and U0. The bulges and internal loops of several of the hairpin motifs (DIS, SD, and Ψ) are also purine-rich with A5, G4, C0, and U0. The probing results of the complete 5'-UTR clearly indicate that the multiple purines of the ring are single-stranded. An extended format of the DIS hairpin has recently been proposed based on nuclear magnetic resonance analysis of a small RNA fragment (44). Such a DIS extension will, at least partially, close the ring structure that we propose. However, the DIS extension is not confirmed by our probing data, and the proposed base pairing is not supported by base pair co-variations in different viral isolates. In fact, several isolates including our LAI strain contain sequence variations that do not allow this DIS extension. The function of the open purine ring will be tested in future studies.
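The purine content of the ring and of the hairpin loops follows directly from the nucleotide counts quoted above. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and the only inputs are the counts stated in the text.

```python
def purine_fraction(counts: dict) -> float:
    """Return (A + G) / (A + G + C + U) for a dict of nucleotide counts."""
    total = sum(counts.get(nt, 0) for nt in "AGCU")
    if total == 0:
        raise ValueError("no nucleotides counted")
    return (counts.get("A", 0) + counts.get("G", 0)) / total


# Counts as stated in the text for the BMH conformation.
RING = {"A": 9, "G": 7, "C": 1, "U": 0}
HAIRPIN_LOOPS = {"A": 5, "G": 4, "C": 0, "U": 0}

if __name__ == "__main__":
    print(f"ring: {purine_fraction(RING):.2f}")                    # 16 of 17
    print(f"hairpin loops: {purine_fraction(HAIRPIN_LOOPS):.2f}")  # 9 of 9
```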

The U5-AUG duplex was analyzed by experimental and theoretical approaches. The duplex was strengthened in the s1 variant by mutations in the U5 domain. Probing experiments showed that s1 RNA folds into the BMH conformation with the U5-AUG duplex. The wt transcript folds into the LDI conformation, which was originally discovered because it migrates faster on non-denaturing gels than the BMH conformation (45). Strikingly, the s1 transcript migrates even faster, suggesting that the BMH structure is compacted by closing of the U5-AUG duplex. It will be of interest to test whether this potentially compact RNA fold is suitable for X-ray studies. Strengthening of the U5-AUG duplex shifts the equilibrium from the LDI to the BMH conformation. Previous work showed that such a shift usually coincides with an increased RNA dimerization capacity (24, 46, 47). Indeed, the s1 transcript dimerizes more efficiently than wt RNA, and this effect was neutralized by mutations in the Gag region that weaken the U5-AUG duplex.
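
The effect of duplex strengthening on the LDI-BMH equilibrium can be illustrated with a simple two-state Boltzmann model. This is a sketch under stated assumptions only: it treats the two conformations as being in thermodynamic equilibrium and ignores kinetic trapping. The free-energy values are hypothetical numbers chosen for illustration, not measurements from this study.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def bmh_fraction(delta_g_kcal, temp_c=37.0):
    """Fraction of RNA in BMH for a two-state LDI <-> BMH equilibrium.

    delta_g_kcal = G(BMH) - G(LDI); positive values favor LDI.
    """
    rt = R_KCAL * (temp_c + 273.15)
    k = math.exp(-delta_g_kcal / rt)  # K = [BMH]/[LDI]
    return k / (1.0 + k)


# Hypothetical values (assumptions, not measured in the paper).
wt_gap = 2.0         # LDI assumed more stable than BMH by 2 kcal/mol
stabilization = 4.0  # assumed extra stability from a strengthened duplex

print("wt-like BMH fraction:", bmh_fraction(wt_gap))
print("s1-like BMH fraction:", bmh_fraction(wt_gap - stabilization))
```

In this model, any mutation that stabilizes only the BMH fold lowers the free-energy difference and raises the BMH fraction, which is the qualitative behavior described for the s1 variant.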

Phylogenetic analyses of the 5'-UTR sequences of all known types of HIV and SIV revealed that a similar U5-AUG duplex can be formed despite considerable divergence in the sequence of the U5 and AUG segments. Many base pair co-variations were observed, providing evidence for the existence and biological importance of this structural motif. Because the U5-AUG duplex occludes the Gag initiation codon, it is possible that this interaction influences the translation of the Gag protein. It has been shown that the HIV-1 5'-UTR can function as an internal ribosomal entry site (IRES) (see Footnote 2). Nevertheless, the 5'-UTRs of different viral isolates exclude upstream AUG triplets (1), which is consistent with a regular scanning mechanism of translation. It is therefore possible that translation of the genomic HIV-1 RNA proceeds both by scanning and by internal initiation. The translational mode may differ with the stage of infection or between spliced and unspliced RNA. We previously speculated that NC may shift the 5'-UTR conformation from LDI to BMH late in the infection process. This could coincide with a switch in the mechanism of translation, from scanning to internal initiation or vice versa. In the former scenario, the BMH conformation and the U5-AUG duplex may be important structures for the proposed IRES function. Certain features of the BMH structure do in fact resemble the IRES element of the pestiviral RNA genome, which was recently shown to be critically dependent on clustered single-stranded adenosines within this structured RNA motif (48). More strikingly, polypurine A-rich sequences were also shown to exhibit IRES activity (49), and we observed an abundance of single-stranded purines, especially adenosines, in the ring structure that is formed by closure of the U5-AUG duplex.
There is also accumulating evidence that unpaired adenosine residues can dock into the minor groove of a receptor helix, and this A-minor motif appears to be a very important element for the acquisition of global RNA architecture (49). Thus, the destiny of the HIV-1 genomic RNA, the ribosome or the virion (50), may be regulated by structural changes in the leader RNA, similar to mechanisms that have been described for the cauliflower mosaic virus (51, 52). Translation studies are in progress to test these intriguing possibilities.
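
The logic of a base pair co-variation check can be sketched as follows. This is a simplified illustration that assumes a gap-free, perfectly antiparallel duplex without bulges; the sequences are hypothetical toy strings, not the U5 or AUG segments of any real isolate, and the function names are invented for this example.

```python
# Watson-Crick and G-U wobble pairs allowed in an RNA helix.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def duplex_pairs(upstream, downstream):
    """Pair the upstream strand (5'->3') with the downstream strand read 3'->5'."""
    if len(upstream) != len(downstream):
        raise ValueError("segments must have equal length")
    return list(zip(upstream, reversed(downstream)))


def can_pair(upstream, downstream):
    """True if every position forms an allowed base pair."""
    return all(p in PAIRS for p in duplex_pairs(upstream, downstream))


def covarying_positions(ref, other):
    """Positions where both partners differ from the reference yet still pair."""
    ref_pairs = duplex_pairs(*ref)
    other_pairs = duplex_pairs(*other)
    return [
        i
        for i, (a, b) in enumerate(zip(ref_pairs, other_pairs))
        if a[0] != b[0] and a[1] != b[1] and b in PAIRS
    ]


# Hypothetical toy segments (upstream, downstream), for illustration only.
reference = ("GGUCA", "UGACC")
variant = ("GAUCA", "UGAUC")   # G-C replaced by A-U at position 1
disrupted = ("GAUCA", "UGACC")  # only one partner changed

print(can_pair(*reference), can_pair(*variant), can_pair(*disrupted))
print(covarying_positions(reference, variant))
```

A change in one strand that is compensated by a change in the partner strand preserves the helix, which is why such co-variations are taken as support for the proposed duplex, whereas an uncompensated change would disrupt it.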

Another role for the U5-AUG duplex may reside in RNA packaging, because this long distance interaction has a major impact on the structural presentation of the leader RNA sequences that are involved in RNA packaging. The U5-AUG duplex can form exclusively in the genomic HIV-1 RNA that includes the Gag region and not in the multiple subgenomic forms of HIV-1 RNA. Thus, this structure may contribute to specific packaging of the full-length genomic RNA into new virus particles. Interestingly, previous work showed that deletion of the upstream sequences of the U5-AUG duplex (nts 98-126) induces an RNA packaging defect (53). Gag- and NC-binding sites on the HIV-1 RNA have been mapped to a segment of ~120 nucleotides that includes the DIS, SD, and Psi hairpins and sequences of the Gag ORF (37, 43, 54-56). Most of these studies used relatively small fragments of the 5'-UTR that cannot fold into the LDI conformation, and these RNAs will constitutively fold into the multiple hairpins of the alternative BMH structure. However, binding of Gag/NC to the full-length 5'-UTR may depend on formation of the U5-AUG duplex and the concomitant LDI-to-BMH switch. RNA packaging studies with mutant viruses are severely hampered by the fact that the 5'-UTR encodes many overlapping regulatory signals that cannot be studied independently (57). For instance, alterations in the DIS region affect RNA dimerization but also result in an RNA packaging defect (58-60). Likewise, mutation of the Gag initiation codon in an HIV-1-based vector was shown to result in very low levels of intracellular genomic RNA, which consequently results in reduced RNA packaging (61). In general, many indirect effects can be expected from mutations that influence the overall folding of the HIV-1 5'-UTR, and such side effects severely complicate the description of discrete RNA motifs like the packaging signal.
In vitro studies with short RNA fragments will certainly miss some of the important features of the HIV-1 5'-UTR because the proper secondary and tertiary RNA structure is not formed.

    ACKNOWLEDGEMENTS

We thank Hendrik Huthoff for critical reading of the manuscript and Wim van Est for the artwork.

    FOOTNOTES

* The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

‡ To whom correspondence should be addressed: Dept. of Human Retrovirology, Academic Medical Center, University of Amsterdam, P.O. Box 22700, 1100 DE Amsterdam, The Netherlands. Tel.: 31-20-5664822; Fax: 31-20-6916531; E-mail: b.berkhout@amc.uva.nl; Web address: www.berkhoutlab.com.

Published, JBC Papers in Press, November 27, 2002, DOI 10.1074/jbc.M210291200

2 Brasey, A., Lopez-Lastra, M., Ohlmann, T., Beerens, N., Berkhout, B., Darlix, J., and Sonenberg, N. (2003) J. Virol., in press.

    ABBREVIATIONS

The abbreviations used are: HIV-1, human immunodeficiency virus type 1; 5'-UTR, 5'-untranslated region; nts, nucleotides; LDI, long distance interaction; BMH, branched multiple hairpin; DIS, dimer initiation signal; R, repeat; TAR, transactivation region; poly(A), polyadenylation; PAS, primer activation signal; PBS, primer binding site; SD, splice donor; NC, nucleocapsid; ORF, open reading frame; DMS, dimethylsulfate; RT, reverse transcriptase; wt, wild-type; SIV, simian immunodeficiency virus.

    REFERENCES

1. Berkhout, B. (1996) Progr. Nucleic Acid Res. Mol. Biol. 54, 1-34
2. Puglisi, J. D., Tan, R., Calnan, B. J., Frankel, A. D., and Williamson, J. R. (1992) Science 257, 76-80
3. Aboul-ela, F., Karn, J., and Varani, G. (1995) J. Mol. Biol. 253, 313-332
4. Aboul-ela, F., Karn, J., and Varani, G. (1996) Nucleic Acids Res. 24, 3974-3981
5. Ippolito, J. A., and Steitz, T. A. (1998) Proc. Natl. Acad. Sci. U. S. A. 95, 9819-9824
6. Berkhout, B. (1992) Nucleic Acids Res. 20, 27-31
7. Klaver, B., and Berkhout, B. (1994) EMBO J. 13, 2650-2659
8. Dingwall, C., Ernberg, I., Gait, M. J., Green, S. M., Heaphy, S., Karn, J., Lowe, A. D., Singh, M., Skinner, M. A., and Valerio, R. (1989) Proc. Natl. Acad. Sci. U. S. A. 86, 6925-6929
9. Wei, P., Garber, M. E., Fang, S.-M., Fisher, W. H., and Jones, K. A. (1998) Cell 92, 451-462
10. Berkhout, B., Silverman, R. H., and Jeang, K. T. (1989) Cell 59, 273-282
11. Klasens, B. I. F., Das, A. T., and Berkhout, B. (1998) Nucleic Acids Res. 26, 1870-1876
12. Klasens, B. I. F., Thiesen, M., Virtanen, A., and Berkhout, B. (1999) Nucleic Acids Res. 27, 446-454
13. Beerens, N., Groot, F., and Berkhout, B. (2001) J. Biol. Chem. 276, 31247-31256
14. Beerens, N., and Berkhout, B. (2002) J. Virol. 76, 2329-2339
15. Laughrea, M., and Jette, L. (1994) Biochemistry 33, 13464-13474
16. Lever, A., Gottlinger, H., Haseltine, W., and Sodroski, J. (1989) J. Virol. 63, 4085-4087
17. Aldovini, A., and Young, R. A. (1990) J. Virol. 64, 1920-1926
18. Harrison, G. P., and Lever, A. M. L. (1992) J. Virol. 66, 4144-4153
19. Baudin, F., Marquet, R., Isel, C., Darlix, J. L., Ehresmann, B., and Ehresmann, C. (1993) J. Mol. Biol. 229, 382-397
20. Clavel, F., and Orenstein, J. M. (1990) J. Virol. 64, 5230-5234
21. Purcell, D. F. J., and Martin, M. A. (1993) J. Virol. 67, 6365-6378
22. O'Reilly, M. M., McNally, M. T., and Beemon, K. L. (1995) Virology 213, 373-385
23. Kerwood, D. J., Cavaluzzi, M. J., and Borer, P. N. (2001) Biochemistry 40, 14518-14529
24. Huthoff, H., and Berkhout, B. (2001) RNA (N. Y.) 7, 143-157
25. Skripkin, E., Paillart, J. C., Marquet, R., Ehresmann, B., and Ehresmann, C. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 4945-4949
26. Paillart, J. C., Marquet, R., Skripkin, E., Ehresmann, B., and Ehresmann, C. (1994) J. Biol. Chem. 269, 27486-27493
27. Muriaux, D., Girard, P.-M., Bonnet-Mathoniere, B., and Paoletti, J. (1995) J. Biol. Chem. 270, 8209-8216
28. Clever, J. L., Wong, M. L., and Parslow, T. G. (1996) J. Virol. 70, 5902-5908
29. Haddrick, M., Lear, A. L., Cann, A. J., and Heaphy, S. (1996) J. Mol. Biol. 259, 58-68
30. Laughrea, M., and Jette, L. (1996) Biochemistry 35, 9366-9374
31. Muriaux, D., Fosse, P., and Paoletti, J. (1996) Biochemistry 35, 5075-5082
32. Mathews, D. H., Sabina, J., Zuker, M., and Turner, D. H. (1999) J. Mol. Biol. 288, 911-940
33. Zuker, M., and Turner, D. H. (1999) in Algorithms and Thermodynamics for RNA Secondary Structure Prediction: A Practical Guide (Barciszewski, J., and Clark, B. F. C., eds), pp. 11-43, Kluwer Academic Publishers, Dordrecht/Boston/London
34. Klaver, B., and Berkhout, B. (1994) J. Virol. 68, 3830-3840
35. Berkhout, B., van Wamel, J., and Klaver, B. (1995) J. Mol. Biol. 252, 59-69
36. Back, N. K. T., Nijhuis, M., Keulen, W., Boucher, C. A. B., Oude Essink, B. B., van Kuilenburg, A. B. P., Van Gennip, A. H., and Berkhout, B. (1996) EMBO J. 15, 4040-4049
37. Zeffman, A., Hassard, S., Varani, G., and Lever, A. (2000) J. Mol. Biol. 297, 877-893
38. Das, A. T., Klaver, B., Klasens, B. I. F., van Wamel, J. L. B., and Berkhout, B. (1997) J. Virol. 71, 2346-2356
39. Berkhout, B., Klaver, B., and Das, A. T. (1997) Nucleic Acids Res. 25, 940-947
40. Hayashi, T., Shioda, T., Iwakura, Y., and Shibuta, H. (1992) Virology 188, 590-599
41. Hayashi, T., Ueno, Y., and Okamoto, T. (1993) FEBS Lett. 327, 213-218
42. Huynen, M. A., Perelson, A., Vieira, W. A., and Stadler, P. F. (1996) J. Comput. Biol. 3, 253-274
43. Clever, J., Sassetti, C., and Parslow, T. G. (1995) J. Virol. 69, 2101-2109
44. Greatorex, J., Gallego, J., Varani, G., and Lever, A. (2002) J. Mol. Biol. 322, 543
45. Berkhout, B., and van Wamel, J. L. B. (2000) RNA (N. Y.) 6, 282-295
46. Huthoff, H., and Berkhout, B. (2001) Nucleic Acids Res. 29, 2594-2600
47. Huthoff, H., and Berkhout, B. (2002) Biochemistry 41, 10439-10445
48. Fletcher, S. P., and Jackson, R. J. (2002) J. Virol. 76, 5024-5033
49. Doherty, E. A., Batey, R. T., Masquida, B., and Doudna, J. A. (2001) Nat. Struct. Biol. 8, 339-343
50. Butsch, M., and Boris-Lawrie, K. (2002) J. Virol. 76, 3089-3094
51. Hemmings-Mieszczak, M., Steger, G., and Hohn, T. (1998) RNA (N. Y.) 4, 101-111
52. Hemmings-Mieszczak, M., Steger, G., and Hohn, T. (1997) J. Mol. Biol. 267, 1075-1088
53. Vicenzi, E., Dimitrov, D. S., Engelman, A., Migone, T.-S., Purcell, D. F. J., Leonard, J., Englund, G., and Martin, M. A. (1994) J. Virol. 68, 7879-7890
54. Damgaard, C. K., Dyhr-Mikkelsen, H., and Kjems, J. (1998) Nucleic Acids Res. 26, 3667-3676
55. Amarasinghe, G. K., Zhou, J., Miskimon, M., Chancellor, K. J., McDonald, J. A., Matthews, A. G., Miller, R. R., Rouse, M. D., and Summers, M. F. (2001) J. Mol. Biol. 314, 961-970
56. Amarasinghe, G. K., De Guzman, R. N., Turner, R. B., Chancellor, K. J., Wu, Z. R., and Summers, M. F. (2000) J. Mol. Biol. 301, 491-511
57. Berkhout, B. (2000) Adv. Pharmacol. 48, 29-73
58. Berkhout, B., and van Wamel, J. L. B. (1996) J. Virol. 70, 6723-6732
59. Paillart, J.-C., Berthoux, L., Ottmann, M., Darlix, J.-L., Marquet, R., Ehresmann, B., and Ehresmann, C. (1996) J. Virol. 70, 8348-8354
60. Clever, J. L., and Parslow, T. G. (1997) J. Virol. 71, 3407-3414
61. Richardson, J. H., Child, L. A., and Lever, A. M. L. (1993) J. Virol. 67, 3997-4005


Copyright © 2003 by The American Society for Biochemistry and Molecular Biology, Inc.