1 Institut für Theoretische Chemie und Molekulare Strukturbiologie, Universität Wien, Währingerstraße 17, A-1090 Wien, Austria
2 Bioinformatik, Institut für Informatik, Universität Leipzig, Kreuzstraße 7b, D-04103 Leipzig, Germany
3 The Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA
Correspondence
Ivo L. Hofacker
ivo{at}tbi.univie.ac.at
ABSTRACT
Supplementary figures are supplied in JGV Online.
INTRODUCTION
RNA secondary structures have been shown to be very sensitive to mutations (Fontana et al., 1993; Schuster et al., 1994): mutations in about 10 % of the sequence positions almost surely lead to unrelated structures if the mutated positions are chosen randomly. Secondary structure elements that are consistently present in a group of sequences with less than, say, 95 % mean pairwise identity are therefore most likely the result of stabilizing selection, not a consequence of the high degree of sequence homology. This fact can be exploited to design algorithms that reliably detect conserved RNA secondary structure elements in a small sample of related RNA sequences (Hofacker et al., 1998; Hofacker & Stadler, 1999). This method was recently applied quite successfully to a survey of the genomes of Picornaviridae (Witwer et al., 2001) and the RNA pre-genome of Hepadnaviridae (Stocsits et al., 1999).
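The 95 % threshold above refers to the mean pairwise identity of an alignment. A minimal sketch of how such a value could be computed is given here; the gap handling (columns with two gaps are skipped, a gap opposite a residue counts as a mismatch) and the function names are assumptions made for illustration and are not taken from the study.

```python
from itertools import combinations


def pairwise_identity(a, b):
    """Fraction of identical residues between two aligned sequences.

    Assumption: columns where both sequences carry a gap are ignored,
    and a gap opposite a residue counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue  # uninformative column for this pair of sequences
        compared += 1
        if x == y:
            matches += 1
    return matches / compared if compared else 0.0


def mean_pairwise_identity(alignment):
    """Average of pairwise_identity over all unordered sequence pairs."""
    pairs = list(combinations(alignment, 2))
    if not pairs:
        raise ValueError("need at least two aligned sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)
```

Other conventions for gapped columns exist and would give slightly different percentages, so the convention should be stated whenever such a value is reported.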
Here we report a comprehensive survey of members of the family Flaviviridae, which possess a single-stranded positive-sense RNA (ss+RNA) genome. The family is subdivided into the three genera Flavivirus, Pestivirus and Hepacivirus and the group of GB virus C/hepatitis G viruses (GBV-C) with a currently uncertain taxonomic classification (van Regenmortel et al., 2000). The RNA genome, which has a size of 9·6–12·3 kb, is characterized by a similar organization (Fig. 1) in all genera and acts as the only mRNA found in infected cells. It contains a single long open reading frame flanked by 5' and 3' untranslated regions (UTRs). These are known to fold into specific secondary structures required for genome replication and translation. Viral proteins are synthesized as a single polyprotein, which is co- and post-translationally cleaved by viral and cellular proteinases.
The second group, consisting of Hepacivirus (hepatitis C virus; HCV), Pestivirus (PESTI) and GBV-C, controls translation by means of an IRES in the 5' UTR and has a short, less-structured 3' UTR. Pestivirus and Hepacivirus have very similar IRES regions (Pestova et al., 1998); the IRES of GBV-C is 50 % longer and structurally quite different (Simons et al., 1996). Therefore we treat these two groups separately.
While the 5' and 3' UTRs of Flaviviridae have been the object of several studies, very little is known about the secondary structures of the coding regions despite some evidence that the coding region might also contain functional RNA motifs (Simmonds & Smith, 1999; Tuplin et al., 2002).
METHODS
Multiple sequence alignments were calculated using CLUSTAL W (Thompson et al., 1994). All sequence positions reported here refer to the multiple sequence alignments that are available as part of the supplemental material.
RNA genomes were folded in their entirety using McCaskill's partition function algorithm (McCaskill, 1990) as implemented in the VIENNA RNA package (Hofacker et al., 1994), based on the energy parameters published in Mathews et al. (1999). The result of this computation is a matrix of base pairing probabilities for each potential base pair (i, j) of the genomic RNA.
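To make the notion of a base pairing probability matrix concrete, the sketch below computes Boltzmann-weighted pair probabilities for a very short sequence by exhaustive enumeration of all nested structures. It is an illustration only: it does not use McCaskill's dynamic programming recursions, the VIENNA RNA package or the Mathews et al. (1999) parameters. The toy energy model (a fixed, hypothetical energy per base pair and no loop energies) and all names are assumptions.

```python
import math

# Toy assumptions: every allowed base pair contributes the same energy
# and loops carry no energy. These are NOT the published parameters.
CAN_PAIR = {"AU", "UA", "GC", "CG", "GU", "UG"}
PAIR_ENERGY = -1.0  # kcal/mol per base pair (hypothetical value)
RT = 0.616          # kcal/mol at roughly 37 degrees C
MIN_LOOP = 3        # minimum number of unpaired bases in a hairpin loop


def structures(seq, i, j):
    """Yield every nested structure on seq[i..j] as a tuple of (k, l) pairs."""
    if j - i < MIN_LOOP + 1:
        yield ()  # interval too short to hold any base pair
        return
    # Case 1: position i stays unpaired.
    yield from structures(seq, i + 1, j)
    # Case 2: position i pairs with a downstream position k.
    for k in range(i + MIN_LOOP + 1, j + 1):
        if seq[i] + seq[k] in CAN_PAIR:
            for inner in structures(seq, i + 1, k - 1):
                for outer in structures(seq, k + 1, j):
                    yield ((i, k),) + inner + outer


def pair_probabilities(seq):
    """Return (Z, {(i, j): probability}) under the toy energy model."""
    seq = seq.upper()
    z = 0.0
    weight_with_pair = {}
    for s in structures(seq, 0, len(seq) - 1):
        w = math.exp(-PAIR_ENERGY * len(s) / RT)  # Boltzmann factor
        z += w
        for pair in s:
            weight_with_pair[pair] = weight_with_pair.get(pair, 0.0) + w
    return z, {pair: w / z for pair, w in weight_with_pair.items()}
```

The enumeration grows exponentially with sequence length, which is why a partition function algorithm with polynomial running time is required for complete genomes. The sketch is only meant to show what the entries of the probability matrix mean: the probability of a pair (i, j) is the summed Boltzmann weight of all structures containing that pair, divided by the partition function Z.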
The ALIDOT algorithm (Hofacker et al., 1998; Hofacker & Stadler, 1999) was used to search the base pairing probability matrix for conserved secondary structure patterns. This method requires an independent prediction of the secondary structure for each of the sequences and a multiple sequence alignment that is obtained without any reference to the predicted secondary structures. The algorithm ranks base pairs using both the thermodynamic information contained in the base pairing probability matrix and the information on compensatory, consistent (e.g. GC → GU) and inconsistent mutations contained in the multiple sequence alignment. The approach is different from efforts to simultaneously compute alignment and secondary structures (Gorodkin et al., 1997; Sankoff, 1985) and from programs such as CONSTRUCT (Lück et al., 1996, 1999) and ALIFOLD (Hofacker et al., 2002) because it does not assume that the sequences have a single common structure. An implementation of this algorithm is available from http://www.tbi.univie.ac.at/RNA/. The ALIFOLD algorithm (Hofacker et al., 2002) is used to obtain consensus structures of regions with significant structural conservation.
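The distinction between compensatory, consistent and inconsistent mutations can be illustrated with a short sketch that labels a pair of alignment columns. This is a simplified classification written for illustration, not the ALIDOT ranking itself, which additionally weighs the pairing probabilities. The function name and the rule that a single non-pairing sequence makes the column pair inconsistent are assumptions.

```python
CAN_PAIR = {"AU", "UA", "GC", "CG", "GU", "UG"}


def classify_column_pair(alignment, i, j):
    """Label alignment columns i and j (0-based) as a candidate base pair."""
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    observed = {seq[i].upper() + seq[j].upper() for seq in alignment}
    if observed - CAN_PAIR:
        # At least one sequence (or a gap) cannot form this base pair.
        return "inconsistent"
    if len(observed) == 1:
        return "conserved"  # pair is possible but shows no sequence variation
    types = sorted(observed)
    for a in range(len(types)):
        for b in range(a + 1, len(types)):
            # Both positions differ between two pair types, e.g. GC vs AU.
            if types[a][0] != types[b][0] and types[a][1] != types[b][1]:
                return "compensatory"
    return "consistent"  # only one side changes, e.g. GC vs GU
```

For example, two sequences showing GC and GU at the two columns would be labelled consistent, whereas GC and AU would be labelled compensatory.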
Computational results are shown as Hogeweg-style mountain plots (Hogeweg & Hesper, 1984) with colour codes indicating sequence covariations.
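As a rough illustration of the mountain representation, the sketch below converts a single structure in dot-bracket notation into a height profile, where the height at a position is the number of base pairs enclosing it. The dot-bracket input and the function name are assumptions for this example. The plots in this study are derived from pair probabilities, for which the height would presumably be the sum of the probabilities of all pairs enclosing a position.

```python
def mountain_heights(dot_bracket):
    """Return, for each position, the number of base pairs enclosing it."""
    heights = []
    depth = 0
    for symbol in dot_bracket:
        if symbol == "(":
            heights.append(depth)  # the opening base is not enclosed by its own pair
            depth += 1
        elif symbol == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced structure: unmatched ')'")
            heights.append(depth)
        elif symbol == ".":
            heights.append(depth)  # unpaired bases form plateaus
        else:
            raise ValueError("unexpected symbol: %r" % symbol)
    if depth != 0:
        raise ValueError("unbalanced structure: unmatched '('")
    return heights
```

Hairpins appear as peaks, unpaired stretches as plateaus and helices as slopes, which makes structures of long sequences easier to compare than conventional drawings.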
RESULTS
The mean pairwise sequence identity of all four species of the genus Flavivirus (less than 50 %) was too low to yield good alignments. The species TBE differs most from the other species in both sequence and structure. From an alignment of the remaining species (DEN, JEV and YFV) we obtained a common structure for the CS (Fig. 2), which supports the prediction of Hahn et al. (1987). We then compared only DEN and JEV. In our data the CS contained no sequence variation but was predicted with pair probabilities close to one. Adjacent to the CS we found a further stem which participates in genome cyclization and which contains several sites of sequence variation (P2'' in supplemental material A). Between CS and P2'' there is a well-conserved hairpin structure supported by numerous compensatory mutations (DV2/JE2 in supplemental material A).
TBE.
The conserved cyclization motifs first reported by Hahn et al. (1987) for mosquito-borne viruses are absent in the TBE group. Putative CSs were proposed for Powassan virus RNA (Mandl et al., 1993), for TBE and cell fusing agent (Khromykh et al., 2001). We did not find any mutations in any of the proposed genome cyclization motifs; thus we could not use our method to confirm the predicted structure by means of sequence covariation. Thermodynamic folding, however, provided strong evidence for the CS "A"-motif (Fig. 2) (Khromykh et al., 2001; Mandl et al., 1993) because these base pairs appeared with probabilities close to one in the folds of the complete genome. Khromykh's region CS "B" formed in only a single sequence (TEU27491) and thus could not be considered as a common motif for all members of TBE.
5' UTR.
The 5' UTRs of DEN, JEV and YFV fold into very similar secondary structures, while the structure for TBE is significantly different (Fig. 3); see DV1, JE1, YF1 and TB1, respectively. For DEN we found structural conservation, while the sequences of JEV and YFV were highly conserved. A manually improved alignment of this region for JEV and YFV to DEN showed that there was significant structure conservation among all three species. Furthermore, all structures contained an interior loop of three Us (one on the 5'- and two on the 3'-strand; DV1, JE1 and YF1 in Fig. 3).
A stem carrying the initiator AUG proposed by Hahn et al. (1987) for YFV was found to fold in all sequences but is not supported by sequence covariation. The 5' UTR structure proposed by Khromykh et al. (2001) for TBE was inconsistent with the available sequence data. We found a different structure that was confirmed by several mutations, both consistent and compensatory (TB1 in Fig. 3).
Coding region.
Several conserved secondary structures were found in the coding regions of DEN, JEV, YFV and TBE. These structures are available on the web site. So far, no functions have been proposed for these regions. Hahn et al. (1987) had already proposed the stem–loop DV2 for DEN 2 virus.
3' UTR.
Conserved structures in the 3' UTR are shown in Fig. 3: the structures show strong similarity between species. Sequence variation in the stem DV6a was high in DEN and present in JEV. For YFV, we found a stem corresponding to DV6a and to JE7a.
Structures similar to DV6, JE7, YF27 or TB19 were also proposed by Hahn et al. (1987) for DEN 2 and YFV, by Khromykh et al. (2001) for DEN, YFV, JEV and TBE, by Rauscher et al. (1997) (B for DEN, YFV, and JEV and I, II and III for TBE), by Proutski et al. (1999) (TL1/RCS2 or TL2/CS2 for DEN and JEV, and "stem–loop 1 in subregion I" for YFV), and by Leitmeyer et al. (1999) for DEN.
DEN.
For the DEN 3' UTR, we found the same structures as Rauscher et al. (1997) where the analysis was restricted to the isolated 3' UTR. None of the long-range interactions interfered with any of these structural motifs. Leitmeyer et al. (1999) propose additional base pairings that we could not find because they conflicted with the cyclization domains.
We found only parts of the secondary structures proposed by Proutski et al. (1997) for DEN2 to be conserved in all DEN species. In particular, we did not find structures I2 and I3, II1 and III except region 3' LSH (our DV7). DV6 and DV7 are also discussed by Proutski et al. (1999) for DEN4. All other structures that are reported in that study are disrupted by the cyclization of the viral genome. Assuming that cyclization of the genome is vital, we can reinterpret the deletion studies reported by Men et al. (1996) in the following way: the deletion of DV6a (TL2) yields a delayed and reduced growth in simian and mosquito cells. When the deletions were extended further towards the 3' end of the sequence, the CS region was destroyed [mutant 3'd 172–83 of Men et al. (1996)] and hence no viable viruses were found. The non-viable mutant 3'd 172–107 may be explained by the importance of the sequence motif CAAAAA for virus propagation (Men et al., 1996). Our data indicate that, in this case, the sequence motif is important rather than any structure associated with it. For the mutants 3'd 333–183 and 3'd 384–183, Men et al. (1996) measured a greatly delayed and reduced growth in living cells. We would argue that these deletions destroy a possible prolongation of the cyclization region that we found for dengue viruses (data not shown). Our data indicate that each single sequence allows additional stems for cyclization in this region even though their exact positions vary slightly among the different sequences. It is plausible that such an extended cyclization region adds to the efficiency of virus replication but is not necessarily essential for its viability.
YFV and JEV.
The sequences in our dataset had about 91·7 % pairwise identity in the 3' UTR. We observed only a small number of compensatory mutations to verify structural features predicted based on our thermodynamic algorithm. We essentially found the same structures as Rauscher et al. (1997); again none of the structures reported by Rauscher et al. (1997) conflicted with CS regions. YF28 was shorter by 9 bp than reported by Hahn et al. (1987). YF28 and YF27 corresponded to 3' LSH and I1, respectively, JE7 and JE8 to 3' LSH and II2, respectively, as proposed by Proutski et al. (1997). No further structures could be found, for reasons similar to those explained for DEN above.
TBE.
We recovered structures very similar to those reported by Mandl et al. (1998) and Rauscher et al. (1997). In particular, TB17 and TB18 correspond to IV and VI of Mandl et al. (1998) and Rauscher et al. (1997), respectively, TB19 contains stem III of Mandl et al. (1998) and Rauscher et al. (1997) and TB16 corresponds to VII, VIII and IX of Mandl et al. (1998) and Rauscher et al. (1997). Structure A1 reported by Mandl et al. (1998) was shorter because of conflicts with cyclization sequences P1' and CS "A". Structure A2 (Mandl et al., 1998) did not seem to be conserved. For structures MS and V of Mandl et al. (1998) we had evidence from thermodynamic folding. However, these two structures conflict with P2. TB16 to TB21 conform with structures proposed by Proutski et al. (1997).
Pestivirus, Hepacivirus and GBV-C
5' UTR.
The 5' UTRs of these virus groups contain an IRES. For parts of the HCV IRES even tertiary structure studies are available (Kieft et al., 2002; Lukavsky et al., 2003). The sequences of 5' UTRs of GBV-C and HCV are significantly more conserved than the rest of their respective genomes (Table 1). For these two virus groups we found that the secondary structure of the 5' UTR is less conserved than we expected (due to the few sequence covariations); an overview is given in Fig. 4. This was consistent with the data reported by Witwer et al. (2001) for Picornaviridae. In contrast, the IRES of PESTI turned out to be highly conserved.
GBV-C.
The 5' UTR sequences of GBV-C were more highly conserved than the rest of the genome (Table 1). Most of the sequence variation occurred around nucleotide (nt) positions 410–437, which comprised the structural element HG6 (IVb) (Fig. 5a). This motif was also predicted in previous studies (Simons et al., 1996; Smith et al., 1997).
The sequences were too conserved in the remainder of the 5' UTR to support predicted structures by means of sequence variation. The thermodynamic prediction, however, found structures similar to those previously proposed (Katayama et al., 1998; Simons et al., 1996; Smith et al., 1997).
HCV.
The 5' UTR of HCV comprises 341–342 nt. The RNA fold algorithm recovered structures similar to those reported in previous studies (Collier et al., 2002; Honda et al., 1996b; Kalliampakou et al., 2002; Kieft et al., 2001; Kolupaeva et al., 2000a; Odreman-Macchioli et al., 2000; Psaridi et al., 1999; Spahn et al., 2001; Tang et al., 1999; Fig. 4b). Our algorithm was not designed to predict pseudoknots. However, we made sure that nucleotides that are known to be involved in pseudoknots (Pestova et al., 1998) do not pair with other parts of the sequence.
Due to high sequence conservation (Table 1) we found only two sites with compensatory mutations in HC3 (called IIIa, b and c by Honda et al., 1996b) in our dataset of nine complete genomic sequences. When additional sequences of the IRES region were included in the analysis, the structure was well supported by compensatory mutations (data not shown). This structure, HC3, has received considerable attention since it appears to act as a binding site for the eIF3–40S complex. It has an internal loop, which is twisted in itself (Collier et al., 2002). Even though we found a mean identity of 98·3 % in this region, there were two compensatory mutations just before and after this highly structured part of the HCV IRES. This confirms Collier's interpretation that the shape of the backbone rather than the sequence composition is important for translation initiation.
We found a stem, HC2, which corresponds to IIa proposed by Honda et al. (1996a). For the nucleotides following stem IIa, the prediction favoured long-range interactions with nt 8571–8552 (NS5B); see HCVCS2 (discussed later). When the isolated IRES region (i.e. nt 44–357) was folded separately, stems IIa and IIb were recovered as proposed by Honda et al. (1996a).
Pestivirus.
As with HCV and GBV-C, the sequence of the 5' UTR was more conserved than the rest of the genome (Table 1), but we still found a considerable number of consistent and compensatory mutations.
Stem PV1 was proposed as Ia by Brown et al. (1992) and as domain A by Deng & Brock (1993) (Fig. 5b).
Fletcher & Jackson (2002) observed that a deletion of nucleotides comprising stem PV2 (II in Brown et al., 1992; domain C in Deng & Brock, 1993) decreased the activity of the IRES to 19 %. Although the pair probabilities in stem PV2 were small (Fig. 5b), we found no inconsistencies and a considerable number of compensatory mutations. This may point to the importance of the structure rather than the sequence for IRES function in this region.
As in previous studies (Deng & Brock, 1993; Fletcher & Jackson, 2002; Kolupaeva et al., 2000b; Moser et al., 2001) our method detected stem PV3 as an important feature of Pestivirus IRES structure. Even though our algorithm does not allow pseudoknots, both stems of the pseudoknot reported by Pestova et al. (1998) show up in the base pairing probabilities.
Coding region
GBV-C.
We found two significantly conserved stems (HG9 and HG10) in the E1 region, which were previously proposed by Simmonds & Smith (1999) based on a different algorithm (data presented in the supplemental material).
Conserved secondary structures seemed to be concentrated in the NS5A and NS5B region of the GBV-C genome (Fig. 6). Some of these had already been proposed by Cuceanu et al. (2001) (Fig. 6c, e). Furthermore, HG38 corresponds to SLV and HG39 to SLIV. In our data the SLI motif is completely conserved in the sequence. In SLVI we found more inconsistent mutations than compensatory mutations (data not shown) and the proposed SLVII structure could not be found with our method.
HCV.
Again, we found most of the conserved structures in the NS5A and NS5B regions. Some of these have been previously reported as important for the efficiency of the IRES function (Tuplin et al., 2002; Zhao & Wimmer, 2001). One of the motifs detected by Tuplin et al. (2002) is HC4, shown in Fig. 6(d). Tuplin et al. (2002) further found HC6 as SL443, HC27 as SL8828 and HC28 as SL9011. According to our data there was no evidence for the existence of SL7730 and SL9118. SL8926 showed too many inconsistencies in our data and SL8376 was not folded because of interactions of this region with the 3' UTR (discussed later).
Ray et al. (1999) argued that HCV persistence is associated with sequence variability in putative envelope genes E1 and E2. We found a conserved RNA structure, HC7, in the E1 region (Fig. 6f).
Pestivirus.
All putative conserved secondary structural elements in the coding region of PESTI were very short. A stem–loop downstream of the initiator AUG appears in our data to have too many inconsistencies and thus cannot be considered as a conserved feature of PESTI, in agreement with the analysis of Myers et al. (2001). The most prominent stems found in the coding region are shown in Fig. 6(g) and (h).
3' UTR
GBV-C.
The 3' UTR sequences of GBV-C are highly conserved (96·7 %). Not surprisingly, we predicted structures similar to those previously reported (Katayama et al., 1998; Okamoto et al., 1997; Xiang et al., 2000) but not all of them were supported by sequence covariation (data not shown). Some of the previously proposed structures conflict with long-range interactions to the 5' UTR predicted by our method (discussed later). One example well supported by sequence covariation is the structure HG43 that was also proposed by Cuceanu et al. (2001) and Xiang et al. (2000).
HCV.
The 3' UTR consists of a short sequence of variable length and composition (variable region), a U-rich stretch (poly-U-UC region) variable in its length and a highly conserved sequence of approximately 100 nt at the 3' end (conserved region, X-tail) (Kolykhalov et al., 1996; Tanaka et al., 1996; Yamada et al., 1996). Within this X-tail we found only a single mutation, which is compatible with the predicted structure. Our stem HC29 corresponds to SL1 as previously reported (Blight & Rice, 1997; Ito & Lai, 1997; Yamada et al., 1996). Stems SL2 and SL3, as proposed by Blight & Rice (1997) and Ito & Lai (1997), compete in our data with the formation of two long-range interactions, LR1 and LR2. The probability of base pairs in LR1 was around P=0·54, significantly higher than HC29 (SL1). The elements SL2 and SL3 were thermodynamically unfavourable in the genomic context and could only be detected when a sequence window was used that was too small to contain the long-range interactions. More recently, Yi & Lemon (2003) introduced several point mutations in the X-tail of the 3' UTR of HCV. Their results could not provide proof for the existence of SL2 or SL3 but indicated that there are stringent requirements for the sequence in this region.
Pestivirus.
Pestiviruses are very heterogeneous in their 3' UTR region, due to extended AU-rich insertions in some strains. The only RNA feature that was shared among all available sequences is the terminal stem PV15 that was originally described by Deng & Brock (1993) and also by Becher et al. (1998) and Yu et al. (1999).
Genome cyclization.
Surprisingly, we discovered strong evidence for genome cyclization not only in the genus Flavivirus, where this effect has already been described in the literature, but also within HCV and GBV-C. The most prominent cyclization motifs are shown in Fig. 7.
In HCV, putative cyclization domains comprised base pairs of nt 1–3 with nt 8627–8625 (HCVCS1), 88–92 with 8602–8606 (HCVCS2) and 95–110 with 8556–8571 (HCVCS3). Within HCVCS3 we found two sites of compensatory mutations (Fig. 7). In HCV, nucleotides from the IRES region (nt 1–3, 88–92 and 95–110) are paired with nucleotides within the coding region for the protein NS5B. At the same time we observed that two regions of the 3' UTR pair with the NS5B region as well: (i) LR1: nt 8628–8661 (NS5B) paired with nt 9599–9633 (3' UTR) and (ii) LR2: nt 8978–8995 (NS5B) paired with nt 9583–9598 (3' UTR). This brought the 5' and 3' regions into very close proximity, as illustrated in supplemental material C. Sequence position 8627 is involved in the interaction with the IRES; the adjacent nt 8628 pairs with the 3' UTR.
All of the mutations (15 point mutations and 6 double mutations) studied in Yi & Lemon (2003) exhibit reduced or no replication activity. Most of them would disrupt base pairs in either LR1 or LR2, supporting our proposed interactions. However, five of the point mutations are in predicted loop regions and would be expected to cause only minor secondary structural changes. This could indicate that there are sequence constraints beyond conservation of secondary structure. However, to prove or disprove the existence of LR1 and LR2, more mutation experiments would be needed.
DISCUSSION
In the genus Flavivirus a cyclization of the genome had already been described in the literature and localized to very conserved cyclization sequences. Apart from recovering these known cyclization sequences, we detected further sequences which took part in cyclization for all species in this study (P1', P1 and P2). These sequences varied considerably in sequence, length and position. Men et al. (1996) showed that deleting these sequences led to a greatly delayed and reduced growth in simian and mosquito cells. It is possible that these additional cyclization domains are not strictly necessary for virus viability, but only support and stabilize viral genome cyclization.
Most surprisingly, we also found viral genome cyclization in GBV-C and HCV, which had not been reported before, although Yi & Lemon (2003) propose a cyclization of the HCV genome assisted by some cellular protein. Our algorithm yielded base pair probabilities both for the previously reported secondary structures in the 5' and 3' UTRs and for genome cyclization. In both cases, our data revealed no inconsistencies. Thus known structures compete with genome cyclization. Our evaluation conditions favoured genome cyclization based both on thermodynamic prediction, in the case of HCV, and sequence covariation. This result can be interpreted either as a relic of an ancient common ancestor of these genera and the genus Flavivirus or, more speculatively, as a switch providing different functions in different states of the viral life-cycle (e.g. a switch between replication and translation states of the virus).
While in Flavivirus and GBV-C the 5' and 3' ends pair within the untranslated regions, we found base pairing in HCV between the 5' and the 3' ends to a region some 1000 nt upstream of the 3' end (i.e. a region within the NS5B protein). More interestingly, we observed that, in this way, 5' and 3' ends were brought closely together. This could be a reason for the particular importance of the NS5B region as assumed in the literature (Oh et al., 1999, 2000). It may also explain the results of Friebe et al. (2001) and Kim et al. (2002), who observed that domains HC1(I) and HC2(II) in the 5' UTR are essential for replication, while domain HC3(III) helps to facilitate replication but is not absolutely required.
Furthermore, in this report (and in the supplementary material available online), we present a large number of secondary structure elements that have not been described before, most importantly within the coding region. This information could be used to identify additional regions that might be important for virus viability and propagation, and thus to gain more insight into the life-cycle of the members of the family Flaviviridae.
REFERENCES
Blight, K. J. & Rice, C. M. (1997). Secondary structure determination of the conserved 98-base sequence at the 3' terminus of hepatitis C virus genome RNA. J Virol 71, 7345–7352.
Brinton, M. A. & Dispoto, J. H. (1988). Sequence and secondary structure analysis of the 5'-terminal region of flavivirus genome RNA. Virology 162, 290–299.
Brown, E. A., Zhang, H., Ping, L. H. & Lemon, S. M. (1992). Secondary structure of the 5' nontranslated regions of hepatitis C virus and pestivirus genomic RNAs. Nucleic Acids Res 20, 5041–5045.
Collier, A. J., Gallego, J., Klinck, R., Cole, P. T., Harris, S. J., Harrison, G. P., Aboul-Ela, F., Varani, G. & Walker, S. (2002). A conserved RNA structure within the HCV IRES eIF3-binding site. Nat Struct Biol 9, 375–380.
Cuceanu, N. M., Tuplin, A. & Simmonds, P. (2001). Evolutionarily conserved RNA secondary structures in coding and non-coding sequences at the 3' end of the hepatitis G virus/GB-virus C genome. J Gen Virol 82, 713–722.
Deng, R. & Brock, K. V. (1993). 5' and 3' untranslated regions of pestivirus genome: primary and secondary structure analyses. Nucleic Acids Res 21, 1949–1957.
Fletcher, S. P. & Jackson, R. J. (2002). Pestivirus internal ribosome entry site (IRES) structure and function: elements in the 5' untranslated region important for IRES function. J Virol 76, 5024–5033.
Fontana, W., Konings, D. A. M., Stadler, P. F. & Schuster, P. (1993). Statistics of RNA secondary structures. Biopolymers 33, 1389–1404.
Friebe, P., Lohmann, V., Krieger, N. & Bartenschlager, R. (2001). Sequences in the 5' nontranslated region of hepatitis C virus required for RNA replication. J Virol 75, 12047–12057.
Gorodkin, J., Heyer, L. J. & Stormo, G. D. (1997). Finding common sequences and structure motifs in a set of RNA molecules. In Proceedings of the ISMB-97, pp. 120–123. Edited by T. Gaasterland, P. Karp, K. Karplus, C. Ouzounis, C. Sander & A. Valencia. Menlo Park, CA: AAAI Press.
Hahn, C. S., Hahn, Y. S., Rice, C. M., Lee, E., Dalgarno, L., Strauss, E. G. & Strauss, J. H. (1987). Conserved elements in the 3' untranslated region of flavivirus RNAs and potential cyclization sequences. J Mol Biol 198, 33–41.
Hofacker, I. L. & Stadler, P. F. (1999). Automatic detection of conserved base pairing patterns in RNA virus genomes. Comput Chem 23, 401–414.
Hofacker, I. L., Fontana, W., Stadler, P. F., Bonhoeffer, S., Tacker, M. & Schuster, P. (1994). Fast folding and comparison of RNA secondary structures. Monatsh Chem 125, 167–188.
Hofacker, I. L., Fekete, M., Flamm, C., Huynen, M. A., Rauscher, S., Stolorz, P. E. & Stadler, P. F. (1998). Automatic detection of conserved RNA structure elements in complete RNA virus genomes. Nucleic Acids Res 26, 3825–3836.
Hofacker, I. L., Fekete, M. & Stadler, P. F. (2002). Secondary structure prediction for aligned RNA sequences. J Mol Biol 319, 1059–1066.
Hogeweg, P. & Hesper, B. (1984). Energy directed folding of RNA sequences. Nucleic Acids Res 12, 67–74.
Honda, M., Brown, E. A. & Lemon, S. M. (1996a). Stability of a stem–loop involving the initiator AUG controls the efficiency of internal initiation of translation on hepatitis C virus RNA. RNA 2, 955–968.
Honda, M., Ping, L. H., Rijnbrand, R. C., Amphlett, E., Clarke, B., Rowlands, D. & Lemon, S. M. (1996b). Structural requirements for initiation of translation by internal ribosome entry within genome-length hepatitis C virus RNA. Virology 222, 31–42.
Ito, T. & Lai, M. M. C. (1997). Determination of the secondary structure of and cellular protein binding to the 3'-untranslated region of the hepatitis C virus RNA genome. J Virol 71, 8698–8706.
Kalliampakou, K. I., Psaridi-Linardaki, L. & Mavromara, P. (2002). Mutational analysis of the apical region of domain II of the HCV IRES. FEBS Lett 511, 79–84.
Katayama, K., Kageyama, T., Fukushi, S., Hoshino, F. B., Kurihara, C., Ishiyama, N., Okamura, H. & Oya, A. (1998). Full-length GBV-C/HGV genomes from nine Japanese isolates: characterization by comparative analysis. Arch Virol 143, 1–13.
Khromykh, A. A., Meka, H., Guyatt, K. J. & Westaway, E. G. (2001). Essential role of cyclization sequences in flavivirus RNA replication. J Virol 75, 6719–6728.
Kieft, J. S., Zhou, K., Jubin, R. & Doudna, J. A. (2001). Mechanism of ribosome recruitment by hepatitis C IRES RNA. RNA 7, 194–206.
Kieft, J. S., Zhou, K., Grech, A., Jubin, R. & Doudna, J. A. (2002). Crystal structure of an RNA tertiary domain essential to HCV IRES-mediated translation initiation. Nat Struct Biol 9, 370–374.
Kim, Y. K., Kim, C. S., Lee, S. H. & Jang, S. K. (2002). Domains I and II in the 5' nontranslated region of the HCV genome are required for RNA replication. Biochem Biophys Res Commun 290, 105–112.
Kolupaeva, V. G., Pestova, T. V. & Hellen, C. U. (2000a). An enzymatic footprinting analysis of the interaction of 40S ribosomal subunits with the internal ribosomal entry site of hepatitis C virus. J Virol 74, 6242–6250.
Kolupaeva, V. G., Pestova, T. V. & Hellen, C. U. (2000b). Ribosomal binding to the internal ribosomal entry site of classical swine fever virus. RNA 6, 1791–1807.
Kolykhalov, A. A., Feinstone, S. & Rice, C. M. (1996). Identification of a highly conserved sequence element at the 3' terminus of hepatitis C virus genome RNA. J Virol 70, 3363–3371.
Leitmeyer, K. C., Vaughn, D. W., Watts, D. M., Salas, R., Villalobos, I., de Chacon, I. V., Ramos, C. & Rico-Hesse, R. (1999). Dengue virus structural differences that correlate with pathogenesis. J Virol 73, 4738–4747.
Lück, R., Steger, G. & Riesner, D. (1996). Thermodynamic prediction of conserved secondary structure: application to the RRE element of HIV, the tRNA-like element of CMV, and the mRNA of prion protein. J Mol Biol 258, 813–826.
Lück, R., Gräf, S. & Steger, G. (1999). ConStruct: a tool for thermodynamic controlled prediction of conserved secondary structure. Nucleic Acids Res 27, 4208–4217.
Lukavsky, P. J., Kim, I., Otto, G. A. & Puglisi, J. D. (2003). Structure of HCV IRES domain II determined by NMR. Nat Struct Biol 10, 1033–1038.
Mandl, C. W., Holzmann, H., Kunz, C. & Heinz, F. X. (1993). Complete genomic sequence of Powassan virus: evaluation of genetic elements in tick-borne versus mosquito-borne flaviviruses. Virology 194, 173–184.
Mandl, C. W., Holzmann, H., Meixner, T., Rauscher, S., Stadler, P. F., Allison, S. L. & Heinz, F. X. (1998). Spontaneous and engineered deletions in the 3' noncoding region of tick-borne encephalitis virus: construction of highly attenuated mutants of a flavivirus. J Virol 72, 2132–2140.
Mathews, D. H., Sabina, J., Zuker, M. & Turner, D. H. (1999). Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J Mol Biol 288, 911–940.
McCaskill, J. S. (1990). The equilibrium partition function and base pair binding probabilities for RNA secondary structure. Biopolymers 29, 1105–1119.
Men, R., Bray, M., Clark, D., Chanock, R. M. & Lai, C. J. (1996). Dengue type 4 virus mutants containing deletions in the 3' noncoding region of the RNA genome: analysis of growth restriction in cell culture and altered viremia pattern and immunogenicity in rhesus monkeys. J Virol 70, 3930–3937.
Meyers, G. & Thiel, H. J. (1996). Molecular characterization of pestiviruses. Adv Virus Res 47, 53–118.
Moser, C., Bosshart, A., Tratschin, J. D. & Hofmann, M. A. (2001). A recombinant classical swine fever virus with a marker insertion in the internal ribosome entry site. Virus Genes 23, 63–68.
Myers, T. M., Kolupaeva, V. G., Mendez, E., Baginski, S. G., Frolov, I., Hellen, C. U. & Rice, C. M. (2001). Efficient translation initiation is required for replication of bovine viral diarrhea virus subgenomic replicons. J Virol 75, 4226–4238.
Odreman-Macchioli, F. E., Tisminetzky, S. G., Zotti, M., Baralle, F. E. & Buratti, E. (2000). Influence of correct secondary and tertiary RNA folding on the binding of cellular factors to the HCV IRES. Nucleic Acids Res 28, 875–885.
Oh, J. W., Ito, T. & Lai, M. M. (1999). A recombinant hepatitis C virus RNA-dependent RNA polymerase capable of copying the full-length viral RNA. J Virol 73, 7694–7702.
Oh, J. W., Sheu, G. T. & Lai, M. M. (2000). Template requirement and initiation site selection by hepatitis C virus polymerase on a minimal viral RNA template. J Biol Chem 275, 17710–17717.
Okamoto, H., Nakao, H., Inoue, T., Fukuda, M., Kishimoto, J., Iizuka, H., Tsuda, F., Miyakawa, Y. & Mayumi, M. (1997). The entire nucleotide sequences of two GB virus C/hepatitis G virus isolates of distinct genotypes from Japan. J Gen Virol 78, 737–745.
Pestova, T. V., Shatsky, I. N., Fletcher, S. P., Jackson, R. J. & Hellen, C. U. (1998). A prokaryotic-like mode of cytoplasmic eukaryotic ribosome binding to the initiation codon during internal translation initiation of hepatitis C and classical swine fever virus RNAs. Genes Dev 12, 67–83.
Proutski, V., Gould, E. A. & Holmes, E. C. (1997). Secondary structure of the 3' untranslated region of flaviviruses: similarities and differences. Nucleic Acids Res 25, 1194–1202.
Proutski, V., Gritsun, T. S., Gould, E. A. & Holmes, E. C. (1999). Biological consequences of deletions within the 3'-untranslated region of flaviviruses may be due to rearrangements of RNA secondary structure. Virus Res 64, 107–123.
Psaridi, L., Georgopoulou, U., Varaklioti, A. & Mavromara, P. (1999). Mutational analysis of a conserved tetraloop in the 5' untranslated region of hepatitis C virus identifies a novel RNA element essential for the internal ribosome entry site function. FEBS Lett 453, 49–53.
Rauscher, S., Flamm, C., Mandl, C. W., Heinz, F. X. & Stadler, P. F. (1997). Secondary structure of the 3'-noncoding region of flavivirus genomes: comparative analysis of base pairing probabilities. RNA 3, 779–791.
Ray, S. C., Wang, Y. M., Laeyendecker, O., Ticehurst, J. R., Villano, S. A. & Thomas, D. L. (1999). Acute hepatitis C virus structural gene sequences as predictors of persistent viremia: hypervariable region 1 as a decoy. J Virol 73, 2938–2946.
Rivas, E. & Eddy, S. R. (2000). Secondary structure alone is generally not statistically significant for the detection of noncoding RNAs. Bioinformatics 16, 583–605.
Sankoff, D. (1985). Simultaneous solution of the RNA folding, alignment, and proto-sequence problems. SIAM J Appl Math 45, 810–825.
Schuster, P., Fontana, W., Stadler, P. F. & Hofacker, I. L. (1994). From sequences to shapes and back: a case study in RNA secondary structures. Proc R Soc Lond B Biol Sci 255, 279–284.
Simmonds, P. & Smith, D. B. (1999). Structural constraints on RNA virus evolution. J Virol 73, 5787–5794.
Simons, J. N., Desai, S. M., Schultz, D. E., Lemon, S. M. & Mushahwar, I. K. (1996). Translation initiation in GB viruses A and C: evidence for internal ribosome entry and implication for genome organization. J Virol 70, 6126–6135.
Smith, D. B., Cuceanu, N., Davidson, F., Jarvis, L. M., Mokili, J. L., Hamid, S., Ludlam, C. A. & Simmonds, P. (1997). Discrimination of hepatitis G virus/GBV-C geographical variants by analysis of the 5' non-coding region. J Gen Virol 78, 1533–1542.
Spahn, C. M., Kieft, J. S., Grassucci, R. A., Penczek, P. A., Zhou, K., Doudna, J. A. & Frank, J. (2001). Hepatitis C virus IRES RNA-induced changes in the conformation of the 40S ribosomal subunit. Science 291, 1959–1962.
Stocsits, R., Hofacker, I. L. & Stadler, P. F. (1999). Conserved secondary structures in hepatitis B virus RNA. In Computer Science in Biology, pp. 73–79. Univ. Bielefeld, Bielefeld, Germany. Proceedings of the GCB'99, Hannover, Germany.
Tanaka, T., Kato, N., Cho, M. J., Sugiyama, K. & Shimotohno, K. (1996). Structure of the 3' terminus of the hepatitis C virus genome. J Virol 70, 3307–3312.
Tang, S., Collier, A. J. & Elliott, R. M. (1999). Alterations to both the primary and predicted secondary structure of stem-loop IIIc of the hepatitis C virus 1b 5' untranslated region (5'UTR) lead to mutants severely defective in translation which cannot be complemented in trans by the wild-type 5'UTR sequence. J Virol 73, 2359–2364.
Tautz, N., Harada, T., Kaiser, A., Rinck, G., Behrens, S. & Thiel, H. J. (1999). Establishment and characterization of cytopathogenic and noncytopathogenic pestivirus replicons. J Virol 73, 9422–9432.
Thompson, J. D., Higgins, D. G. & Gibson, T. J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22, 4673–4680.
Tuplin, A., Wood, J., Evans, D. J., Patel, A. H. & Simmonds, P. (2002). Thermodynamic and phylogenetic prediction of RNA secondary structures in the coding region of hepatitis C virus. RNA 8, 824–841.
van Regenmortel, M. H. V., Fauquet, C., Bishop, D. & 8 other authors (2000). Virus Taxonomy: The Classification and Nomenclature of Viruses. The Seventh Report of the International Committee on Taxonomy of Viruses. San Diego: Academic Press. http://www.ncbi.nlm.nih.gov/ICTVdb/
Witwer, C., Rauscher, S., Hofacker, I. L. & Stadler, P. F. (2001). Conserved RNA secondary structures in Picornaviridae genomes. Nucleic Acids Res 29, 5079–5089.
Xiang, J., Wunschmann, S., Schmidt, W., Shao, J. & Stapleton, J. T. (2000). Full-length GB virus C (Hepatitis G virus) RNA transcripts are infectious in primary CD4-positive T cells. J Virol 74, 9125–9133.
Yamada, N., Tanihara, K., Takada, A., Yorihuzi, T. T., Tsutsumi, M., Shimomura, H., Tsuji, T. & Date, T. (1996). Genetic organization and diversity of the 3' noncoding region of the hepatitis C virus genome. Virology 223, 255–261.
Yi, M. K. & Lemon, S. M. (2003). 3' nontranslated RNA signals required for replication of hepatitis C virus RNA. J Virol 77, 3557–3568.
You, S. & Padmanabhan, R. (1999). A novel in vitro replication system for dengue virus. Initiation of RNA synthesis at the 3'-end of exogenous viral RNA templates requires 5'- and 3'-terminal complementary sequence motifs of the viral RNA. J Biol Chem 274, 33714–33722.
Yu, H., Grassmann, C. W. & Behrens, S. E. (1999). Sequence and structural elements at the 3' terminus of bovine viral diarrhea virus genomic RNA: functional role during RNA replication. J Virol 73, 3638–3648.
Zhao, W. D. & Wimmer, E. (2001). Genetic analysis of a poliovirus/hepatitis C virus chimera: new structure for domain II of the internal ribosomal entry site of hepatitis C virus. J Virol 75, 3719–3730.
Received 26 June 2003; accepted 10 December 2003.