Laboratory for Clinical and Molecular Virology, University of Edinburgh, Summerhall, Edinburgh EH9 1QH, UK
Author for correspondence: Peter Simmonds. Fax +44 131 650 7965. e-mail Peter.Simmonds{at}ed.ac.uk
Introduction
The great genetic stability of HGV/GBV-C and related viruses implied by the evidence for co-evolution with primates is difficult to reconcile with their observed rapid sequence change in individuals over short observation periods (Nakao et al., 1997). We have previously found evidence for constraints on sequence change at many sites in the coding sequence, even at those where substitutions would be synonymous (Simmonds & Smith, 1999). Evidence that RNA secondary structure formation through internal base-pairing limits sequence variability at these sites was provided by the finding of multiple covariant sites spatially associated with potential stem–loop structures amongst HGV/GBV-C sequences of different genotypes. Furthermore, these occurred at positions in the genome that showed reductions in synonymous variability. In that study we excluded non-random nucleotide composition and biased codon usage as confounding factors in the use of RNA folding prediction algorithms and calculation of free energies.
In the current study we have used a variety of phylogenetic and free energy-based predictive algorithms to compare the extent and conservation of RNA secondary structure formation in the 3' untranslated region (3'UTR) with upstream coding sequences from NS5B, a region encoding the viral RNA polymerase. Our findings indicate that the part of the RNA genome containing the coding sequences is more extensively structured than the 3'UTR, and shows better conservation between variants of HGV/GBV-C infecting different primates. These findings imply an important functional role(s) for the observed secondary structure.
Methods
Viroid sequences analysed were citrus exocortis viroid (accession no. X53715), potato spindle tuber viroid (U23058), chrysanthemum chlorotic mottle viroid (AJ247123), Mexican papita viroid (L78463) and potato spindle tuber viroid (X76846). Delta virus sequences of different genotypes were obtained from the following entries: AF098261, AJ000558, M21012, D01075 and M28267. Coding sequences of serum albumin were obtained from the following mammalian species: cat (X84842), cow (Y17729), gerbil (AB006197), horse (X74045), macaque (M90463) and rat (U01222). -globin coding sequences were obtained from the following mammalian species: baboon (X05289), orang-utan (M12158), duck (X02008), marsupial cat (M17083) and human (V00493).
Additional HGV/GBV-C 3'UTR sequences.
3'UTR sequences were obtained from 17 samples whose genotype had been deduced from sequence comparisons of the 5'UTR and E2 regions (Smith et al., 1997, 2000). RNA was extracted using proteinase K–SDS and phenol–chloroform as described previously (Jarvis et al., 1994). Purified RNA was then reverse-transcribed and amplified by hemi-nested RT–PCR using primers derived from conserved regions of the HGV/GBV-C genome at the carboxyl end of the NS5B gene and the extreme 3'-end: Z3580 outer sense (positions 8829–8848, 5' GGTGGTNCATCAATTGGATT 3', where N = A, C, G or T); Z3581 inner sense (positions 8881–8900, 5' GGTTCTTAGCCCTGCTCATC 3'); and Z3582 outer and inner antisense (positions 9212–9231, 5' AGTAGAACCCGGCCTTTGGG 3'). Reverse transcription was carried out at 42 °C for 30 min using avian myeloblastosis virus reverse transcriptase (Promega). The conditions for the first round of PCR were hot start at 80 °C for 2 min followed by 30 cycles of 94 °C for 18 s, 58 °C for 21 s and 72 °C for 90 s. At the end of the last cycle, samples were heated to 72 °C for 5 min to allow termination of incomplete strands. The second round of PCR was performed using 1 µl of the primary PCR product for the same number of cycles and conditions. The amplified PCR products were cloned into pGEM-T vector (Promega), and sequenced with both sense and antisense plasmid primers using T7 DNA polymerase (Sequenase, USB). The consensus sequence of one to three clones for each sample was used for phylogenetic analysis.
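The primer set above can be sanity-checked with a short script. The following Python sketch is illustrative only and is not part of the original protocol: it confirms that each primer length matches its quoted coordinate span, expands the degenerate N in Z3580, and reverse-complements the antisense primer. The helper names (`expand_degenerate`, `reverse_complement`, `PRIMERS`) are hypothetical.

```python
from itertools import product

# IUPAC expansion limited to the symbols used by the primers in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT"}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

# Primer name -> (5'-3' sequence, first position, last position), as quoted above.
PRIMERS = {
    "Z3580": ("GGTGGTNCATCAATTGGATT", 8829, 8848),
    "Z3581": ("GGTTCTTAGCCCTGCTCATC", 8881, 8900),
    "Z3582": ("AGTAGAACCCGGCCTTTGGG", 9212, 9231),
}


def expand_degenerate(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def span_matches_length(seq, start, end):
    """True if an inclusive coordinate range has the same length as the primer."""
    return len(seq) == end - start + 1


if __name__ == "__main__":
    for name, (seq, start, end) in PRIMERS.items():
        print(name, len(seq), span_matches_length(seq, start, end))
    print(expand_degenerate(PRIMERS["Z3580"][0]))
    print(reverse_complement(PRIMERS["Z3582"][0]))
```

The antisense primer Z3582 is written 5' to 3' on the complementary strand, so its reverse complement is the sequence expected in the sense-strand genome at positions 9212–9231.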
Free energy on RNA folding.
The last 1000 bases of aligned HGV/GBV-C complete genome sequences were split into NS5B and 3'UTR regions. The NS5B sequence included the whole sequence upstream of and including the stop codon (670 bases in the sequence U36380), while the 3'UTR included the whole sequence downstream from this position (321 bases). The free energy of folding was calculated with the programs RNADraw v1.1 or MFOLD using default settings. The contribution of nucleotide order to free energy of folding was estimated by comparison of free energy with the mean value of sequences generated by independent sequence order randomizations. The variability in free energy on folding 50 sequence order randomizations of three representative HGV/GBV-C sequences of genotype 1 (U36380), 2 (AB013501) and 3 (D90601) was comparable to the combined variability shown by three sequence order randomizations of the seven HGV/GBV-C sequences shown in Table 1 [NS5 region: ±0·044 (±2·7%), ±0·041 (±2·8%) and ±0·040 (±2·8%); 3'UTR: ±0·043 (±4·0%), ±0·041 (±4·3%) and ±0·04 (±3·7%) for the three sequences with 50 randomizations].
Free energies in the HGV/GBV-C NS5B and 3'UTR were compared with the mean values of sets of four 5'UTR sequences (U44402, U63715, AB008335 and D87263), five plant viroid sequences, five delta virus sequences, and five albumin and five -globin sequences from a range of vertebrate species.
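The sequence order randomization control described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the procedure implemented in RNADraw or MFOLD: `toy_fold_score` is a hypothetical stand-in that counts the maximum number of nested base pairs (a Nussinov-style recursion) and returns it as a negative score, whereas the study used thermodynamic free energies. Expressing the difference as a percentage of the randomized mean is likewise an assumption about how the excess is reported.

```python
import random
from statistics import mean

# Watson-Crick and G-U wobble pairs (RNA alphabet; convert T to U beforehand).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def toy_fold_score(seq, min_loop=3):
    """Toy stand-in for a folding energy: minus the maximum number of
    nested base pairs, with at least `min_loop` unpaired bases per hairpin."""
    n = len(seq)
    if n == 0:
        return 0.0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: base j is unpaired
            for k in range(i, j - min_loop):  # case: base j pairs with base k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return -float(dp[0][n - 1])


def shuffled(seq, rng):
    """Randomize base order while preserving base composition."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def excess_folding_energy(seq, fold=toy_fold_score, n_random=50, seed=0):
    """Compare the native score with the mean of order-randomized controls.

    Returns (native, control_mean, percent_excess); a positive percentage
    means the native order folds more stably than the randomized controls."""
    rng = random.Random(seed)
    native = fold(seq)
    control = mean(fold(shuffled(seq, rng)) for _ in range(n_random))
    if control == 0:
        raise ValueError("randomized controls have zero folding score")
    return native, control, 100.0 * (native - control) / control


if __name__ == "__main__":
    hairpin = "GGGGGGGGAAAACCCCCCCC"  # hypothetical test sequence
    print(excess_folding_energy(hairpin))
```

Any function that maps a sequence to a free energy can be passed as `fold`, so the same control could wrap an external folding program. The toy recursion is cubic in sequence length and is only practical for short test sequences.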
Sequence software.
All randomization, free energy calculations and secondary structure predictions were made with the programs RNADraw v1.1 and MFOLD using default settings. Sequence alignments and distance measurements were performed with the Simmonic 2000 package, which is available from the authors.
Results
One of the problems of verifying structure predictions for the 3'UTR using covariance was the lack of comparative sequence information for two of the four HGV/GBV-C genotypes. In GenBank, there are currently only 25 complete or near-complete 3'UTR sequences, of which only two are of type 1 and two of type 4. We obtained additional 3'UTR sequences of genotypes 1–4 from HGV/GBV-C-infected individuals from various geographical locations. Sequence variability in the 3'UTR was largely confined to the predicted stem–loop 3'SLVII (Fig. 4), which demonstrated a large number of covariant substitutions both between and within genotypes. Covariant sites, and in genotype 3 a paired deletion, were also found in stem–loop 3'SLIII. The remaining predicted structures were in regions of the 3'UTR invariant between HGV/GBV-C genotypes, although covariant changes and some structural differences with the HGV/GBV-CCPZ sequence were detected in loops 3'SLVI and 3'SLIV (data not shown).
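The idea of a covariant site can be made concrete with a small Python sketch. It is an illustration under assumptions, not the covariance scanning method of Simmonds & Smith (1999): given an RNA alignment and two columns predicted to pair, it reports whether the pairing is invariant, maintained by changes at both positions (covariant), maintained by a change at one position only (for example G-C to G-U, here called semi-covariant), or disrupted. The function name, the category labels and the toy alignment are all hypothetical.

```python
from itertools import combinations

# Watson-Crick and G-U wobble pairs, written as two-letter strings.
CANONICAL = {"AU", "UA", "GC", "CG", "GU", "UG"}


def classify_pair(alignment, i, j):
    """Classify alignment columns i and j (0-based) of a predicted base pair.

    Sequences with a gap at both positions (a paired deletion) are ignored."""
    observed = {(s[i], s[j]) for s in alignment if (s[i], s[j]) != ("-", "-")}
    if not observed:
        return "deleted"
    if any(a + b not in CANONICAL for a, b in observed):
        return "disrupted"
    if len(observed) == 1:
        return "invariant"
    for (a1, b1), (a2, b2) in combinations(observed, 2):
        if a1 != a2 and b1 != b2:
            return "covariant"  # both partners changed, pairing retained
    return "semi-covariant"  # only one partner changed, pairing retained


if __name__ == "__main__":
    # Hypothetical toy alignment of a 3 bp stem closed by a 4-base loop.
    toy = ["GGCAAAAGCC", "GACAAAAGUC", "GGCAAAAGUC"]
    for i, j in [(0, 9), (1, 8), (2, 7)]:
        print((i, j), classify_pair(toy, i, j))
```

In this scheme a paired deletion of the kind noted for genotype 3 in 3'SLIII does not count against a predicted pair, because sequences gapped at both positions are skipped.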
Discussion
Computer-predicted folding patterns and RNase cleavage experiments have previously demonstrated the existence of a long stable hairpin structure (3'-LSH) within the distal part of the 3'UTR of several different flaviviruses (Proutski et al., 1997; Brinton et al., 1986; Rice et al., 1985), some positive-strand RNA plant viruses (Strauss & Strauss, 1983), HCV (Blight & Rice, 1997; Kolykhalov et al., 1996), GBV-B (Rijnbrand et al., 2000) and pestiviruses (Yu et al., 1999; Deng & Brock, 1993). Other studies have provided evidence for a specific interaction between the 3'-LSH of flaviviruses and host cellular proteins or components of the virus replication complex, have demonstrated specific binding of cellular proteins to the 3'-terminal 98 nucleotides of the HCV RNA (Ito & Lai, 1997), and have determined which regions of the HCV 3'UTR are critical for in vivo virus replication (Lefrere et al., 1999).
The configurations of the predicted terminal loops 3'SLII and 3'SLI in the 3'UTR of HGV/GBV-C (Fig. 2) closely resemble those predicted for HCV (Tanaka et al., 1996; Kolykhalov et al., 1996) and GBV-B (Rijnbrand et al., 2000). However, there was no evidence for a conserved third loop (3'SLIII) or for primary sequence similarity between HGV/GBV-C and HCV or GBV-B. Additionally, the terminal loop is shorter than the HCV and GBV-B homologues (14 base pairs in the stem instead of 19 or 20), although the predicted free energy for formation of the terminal loop (-73 kJ, -2·2 kJ/b) is similar to that of HCV (-110 kJ, -2·4 kJ/b), and indicates a high probability of its formation in vivo.
The 3'UTR sequences of human HGV/GBV-C genotypes were highly conserved, with mean pairwise distances between genotypes ranging from 3·8 to 6·6%, compared with 12·8 to 13·4% over the rest of the genome. The HGV/GBV-CCPZ sequence, however, did not display the same differential in divergence, with 23–26% sequence divergence from human genotypes in the 3'UTR, only slightly lower than observed upon comparison of coding sequences (30%). Structurally, only loop 3'SLV was found in both human and chimpanzee variants, although the alignment indicated that the region containing the two terminal loops was missing from the published HGV/GBV-CCPZ sequence. The lack of structural conservation between HGV/GBV-C sequences is not unexpected, given the lack of similarity between other members of the hepaciviruses and pestiviruses in this region. For example, only the terminal three loops are conserved between HCV and GBV-B, and there is also considerable sequence variability between HCV genotypes in non-coding regions 5' to this, including the great variability in length of the poly(U) tract. Clearly there are varying constraints on sequence change between different regions of the 3'UTR. However, apart from the involvement of the terminal loops in transcription initiation of HCV (Lohmann et al., 1999) and potentially other hepaciviruses, it remains unclear what other functional roles secondary structure in the 3'UTR may play.
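Mean pairwise distances between genotypes of the kind quoted above can be computed from an alignment as sketched below. This is a minimal Python illustration that assumes an uncorrected proportion of differing sites (p-distance) with gapped positions ignored; the text does not state which distance measure the Simmonic 2000 package applied, and the function names are hypothetical.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences,
    ignoring any column where either sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


def mean_between_group_distance(group_a, group_b):
    """Mean p-distance over all pairs drawn one from each group
    (for example, all sequences of two different genotypes)."""
    distances = [p_distance(a, b) for a in group_a for b in group_b]
    return sum(distances) / len(distances)


if __name__ == "__main__":
    # Hypothetical toy sequences standing in for two genotypes.
    genotype_x = ["AAAA"]
    genotype_y = ["AAAU", "AAUU"]
    print(100 * mean_between_group_distance(genotype_x, genotype_y), "%")
```

Multiplying the result by 100 gives a percentage divergence comparable in form to the figures in the text, subject to the assumption about the distance measure.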
Secondary structure in NS5B
The prediction methods used to determine the structure of the 3'UTR were also used to analyse the coding sequence of NS5B. Surprisingly, this region showed even greater free energy on folding than the 3'UTR, several large stem–loops such as those numbered SLNS5BIII and SLNS5BV and, in contrast to the 3'UTR, substantial structural similarity between human and chimpanzee HGV/GBV-C variants. Generally, secondary structures were either identical between HGV/GBV-C variants or, particularly on comparison with the HGV/GBV-CCPZ sequence, showed some differences in the identity of the bases involved in base-pairing, but retained conservation of the overall shape and size of the stem–loop. The finding of secondary structure in this region and differences in free energy with sequence order-randomized sequences throughout the genome (Fig. 1) confirms and extends our previous predictions for extensive structure of the HGV/GBV-C genome based on a novel method of covariance scanning and analysis of the distribution of variability at synonymous sites (Simmonds & Smith, 1999). The involvement of such a high proportion of bases in internal base-pairing in the NS5B region, and by implication elsewhere in the genome, suggests that the RNA molecule may be extensively folded through local and possibly longer-range interactions to form a tertiary RNA structure.
The conservation in structure and the large number of covariant substitutions suggest a functional role(s) for the RNA structures. Particularly striking was the similarity in free energy on folding the NS5B sequences with the free energies observed for plant viroids and the non-coding region of delta viruses. For these agents, the secondary structure is essential for the replication of the genome, where specific domains may catalyse RNA cleavage, ligation and editing of the genomic RNA sequence. For HGV/GBV-C, amongst several possibilities, RNA folding may be required for packaging of the HGV/GBV-C genome into virus particles, or to protect the genome from RNA-degrading enzymes, particularly as HGV/GBV-C and related viruses do not appear to encode a conventional nucleocapsid protein. In other RNA viruses, secondary structures such as the cis-acting replication element in picornaviruses may play a role in initiation of RNA synthesis through long-range interactions with the 3'-terminal region of the genome (Goodfellow et al., 2000; McKnight & Lemon, 1998). The interactions between different genomic regions implied by these observations suggest that HGV/GBV-C might also have an organized overall structure of the RNA genome, in which stem–loop structures may play a role in virus replication.
Enzymatic and chemical methods have been used to provide evidence of secondary structures independently of sequence analysis. While we have considered this approach for the further investigation of the HGV/GBV-C structures described in this and our previous study (Simmonds & Smith, 1999), the problem with the analysis of sequences such as NS5B is that they are too long to be easily resolvable by conventional methods. Although it may be possible to separately analyse shorter lengths of sequence in this region, splitting sequences in this way could disrupt the longer-range interactions such as the base-pairing in the base of stem–loops SLNS5BIII and V. Direct visualization by electron microscopy of RNA folded in physiological conditions is potentially a better method to determine secondary structure of longer sequences of RNA, particularly if combined with hybridization with gold-labelled probes to identify specific sequences within the observed structures. We are currently carrying out this analysis with RNA transcripts from the two regions analysed in this study. Additionally, experimental manipulation of the recently described infectious clone of HGV/GBV-C and methods to culture the virus in vitro (Xiang et al., 2000) may allow a direct investigation of the functional significance of RNA folding in this region of the genome.
Using methods described in this and our previous study, we have also commenced secondary structure analyses of other members of the flavivirus family. HCV shows an even greater excess of free energy on folding the NS5B region (23%, Table 2), several stem–loop structures conserved between HCV genotypes (which are much more divergent in nucleotide sequence than between the HGV/GBV-C sequences analysed in this study), and the occurrence of multiple covariant sites in each of the predicted stem–loops (data not shown). The availability of a replicating clone of HCV (Lohmann et al., 1999) may allow the role of such structures to be experimentally investigated.
References
Birkenmeyer, L. G., Desai, S. M., Muerhoff, A. S., Leary, T. P., Simons, J. N., Montes, C. C. & Mushahwar, I. K. (1998). Isolation of a GB virus-related genome from a chimpanzee. Journal of Medical Virology 56, 44-51.
Blight, K. J. & Rice, C. M. (1997). Secondary structure determination of the conserved 98-base sequence at the 3' terminus of hepatitis C virus genome RNA. Journal of Virology 71, 7345-7352.
Brinton, M. A., Fernandez, A. V. & Dispoto, J. H. (1986). The 3'-nucleotides of flavivirus genomic RNA form a conserved secondary structure. Virology 153, 113-121.
Bukh, J. & Apgar, C. L. (1997). Five new or recently discovered (GBV-A) virus species are indigenous to New World monkeys and may constitute a separate genus of the Flaviviridae. Virology 229, 429-436.
Deng, R. T. & Brock, K. V. (1993). 5' and 3' untranslated regions of pestivirus genome: primary and secondary structure analyses. Nucleic Acids Research 21, 1949-1957.
Erker, J. C., Desai, S. M., Leary, T. P., Chalmers, M. L., Montes, C. C. & Mushahwar, I. K. (1998). Genomic analysis of two GB virus A variants isolated from captive monkeys. Journal of General Virology 79, 41-45.
Gonzalez-Perez, M. A., Norder, H., Bergstrom, A., Lopez, E., Visona, K. A. & Magnius, L. O. (1997). High prevalence of GB virus C strains genetically related to strains with Asian origin in Nicaraguan hemophiliacs. Journal of Medical Virology 52, 149-155.
Goodfellow, I., Chaudhry, Y., Richardson, A., Meredith, J., Almond, J. W., Barclay, W. & Evans, D. J. (2000). Identification of a cis-acting replication element within the poliovirus coding region. Journal of Virology 74, 4590-4600.
Ito, T. & Lai, M. M. C. (1997). Determination of the secondary structure of and cellular protein binding to the 3'-untranslated region of the hepatitis C virus RNA genome. Journal of Virology 71, 8698-8706.
Jarvis, L. M., Watson, H. G., McOmish, F., Peutherer, J. F., Ludlam, C. A. & Simmonds, P. (1994). Frequent reinfection and reactivation of hepatitis C virus genotypes in multitransfused hemophiliacs. Journal of Infectious Diseases 170, 1018-1022.
Katayama, Y., Apichartpiyakul, C., Handajani, R., Ishido, S. & Hotta, H. (1997). GB virus C/hepatitis G virus (GBV-C/HGV) infection in Chiang Mai, Thailand, and identification of variants on the basis of 5'-untranslated region sequences. Archives of Virology 142, 2433-2445.
Katayama, K., Kageyama, T., Fukushi, S., Hoshino, F. B., Kurihara, C., Ishiyama, N., Okamura, H. & Oya, A. (1998). Full-length GBV-C/HGV genomes from nine Japanese isolates: characterization by comparative analyses. Archives of Virology 143, 1063-1075.
Kolykhalov, A. A., Feinstone, S. M. & Rice, C. M. (1996). Identification of a highly conserved sequence element at the 3' terminus of hepatitis C virus genome RNA. Journal of Virology 70, 3363-3371.
Leary, T. P., Muerhoff, A. S., Simons, J. N., Pilot-Matias, T. J., Erker, J. C., Chalmers, M. L., Schlauder, G. S., Dawson, G. J., Desai, S. M. & Mushahwar, I. K. (1996). Sequence and genomic organization of GBV-C: a novel member of the Flaviviridae associated with human non-A–E hepatitis. Journal of Medical Virology 48, 60-67.
Leary, T. P., Desai, S. M., Erker, J. C. & Mushahwar, I. K. (1997). The sequence and genomic organization of a GB virus A variant isolated from captive tamarins. Journal of General Virology 78, 2307-2313.
Lefrere, J. J., Roudot-Thoraval, F., Morand-Joubert, L., Brossard, Y., Parnet-Mathieu, F., Mariotti, M., Agis, F., Rouet, G., Lerable, J., Lefevre, G., Girot, R. & Loiseau, P. (1999). Prevalence of GB virus type C/hepatitis G virus RNA and of anti-E2 in individuals at high or low risk for blood-borne or sexually transmitted viruses: evidence of sexual and parenteral transmission. Transfusion 39, 83-94.
Linnen, J., Wages, J., Zhang-Keck, Z. Y., Fry, K. E., Krawczynski, K. Z., Alter, H., Koonin, E., Gallagher, M., Alter, M., Hadziyannis, S., Karayiannis, P., Fung, K., Nakatsuji, Y., Shih, J. W. K., Young, L., Piatak, M., Hoover, C., Fernandez, J., Chen, S., Zou, J. C., Morris, T., Hyams, K. C., Ismay, S., Lifson, J. D., Hess, G., Foung, S. K. H., Thomas, H., Bradley, D., Margolis, H. & Kim, J. P. (1996). Molecular cloning and disease association of hepatitis G virus: a transfusion-transmissible agent. Science 271, 505-508.
Lohmann, V., Korner, F., Koch, J. O., Herian, U., Theilmann, L. & Bartenschlager, R. (1999). Replication of subgenomic hepatitis C virus RNAs in a hepatoma cell line. Science 285, 110-113.
McKnight, K. L. & Lemon, S. M. (1998). The rhinovirus type 14 genome contains an internally located RNA structure that is required for viral replication. RNA 4, 1569-1584.
Muerhoff, A. S., Smith, D. B., Leary, T. P., Erker, J. C., Desai, S. M. & Mushahwar, I. K. (1997). Identification of GB virus C variants by phylogenetic analysis of 5'-untranslated and coding region sequences. Journal of Virology 71, 6501-6508.
Nakao, H., Okamoto, H., Fukuda, M., Tsuda, F., Mitsui, T., Masuko, K., Iizuka, H., Miyakawa, Y. & Mayumi, M. (1997). Mutation rate of GB virus C/hepatitis G virus over the entire genome and in subgenomic regions. Virology 233, 43-50.
Okamoto, H., Nakao, H., Inoue, T., Fukuda, M., Kishimoto, J., Iizuka, H., Tsuda, F., Miyakawa, Y. & Mayumi, M. (1997). The entire nucleotide sequences of two GB virus C/hepatitis G virus isolates of distinct genotypes from Japan. Journal of General Virology 78, 737-745.
Proutski, V., Gould, E. A. & Holmes, E. C. (1997). Secondary structure of the 3' untranslated region of flaviviruses: similarities and differences. Nucleic Acids Research 25, 1194-1202.
Rice, C. M., Lenches, E. M., Eddy, S. R., Shin, S. J., Sheets, R. L. & Strauss, J. H. (1985). Nucleotide sequence of yellow fever virus: implications for flavivirus gene expression and evolution. Science 229, 726-733.
Rijnbrand, R., Abell, G. & Lemon, S. M. (2000). Mutational analysis of the GB virus B internal ribosome entry site. Journal of Virology 74, 773-783.
Simmonds, P. & Smith, D. B. (1999). Structural constraints on RNA virus evolution. Journal of Virology 73, 5787-5794.
Simons, J. N., Desai, S. M., Schultz, D. E., Lemon, S. M. & Mushahwar, I. K. (1996). Translation initiation in GB viruses A and C: evidence for internal ribosome entry and implications for genome organization. Journal of Virology 70, 6126-6135.
Smith, D. B., Cuceanu, N., Davidson, F., Jarvis, L. M., Mokili, J. L. K., Hamid, S., Ludlam, C. A. & Simmonds, P. (1997). Discrimination of hepatitis G virus/GBV-C geographical variants by analysis of the 5' non-coding region. Journal of General Virology 78, 1533-1542.
Smith, D. B., Basaras, M., Frost, S., Haydon, D., Cuceanu, N., Prescott, L., Kamenka, C., Millband, D., Sathar, M. A. & Simmonds, P. (2000). Phylogenetic analysis of GBV-C/hepatitis G virus. Journal of General Virology 81, 769-780.
Strauss, E. G. & Strauss, J. H. (1983). Replication strategies of the single stranded RNA viruses of eukaryotes. Current Topics in Microbiology and Immunology 105, 1-98.
Tanaka, T., Kato, N., Cho, M. J., Sugiyama, K. & Shimotohno, K. (1996). Structure of the 3' terminus of the hepatitis C virus genome. Journal of Virology 70, 3307-3312.
Tanaka, Y., Mizokami, M., Orito, E., Ohba, K., Kato, T., Kondo, Y., Mboudjeka, I., Zekeng, L., Kaptue, L., Bikandou, B., Mpele, P., Takehisa, J., Hayami, M., Suzuki, Y. & Gojobori, T. (1998). African origin of GB virus C/hepatitis G virus. FEBS Letters 423, 143-148.
Xiang, J., Wunschmann, S., Schmidt, W., Shao, J. & Stapleton, J. T. (2000). Full-length GB virus C (hepatitis G virus) RNA transcripts are infectious in primary CD4-positive T cells. Journal of Virology 74, 9125-9133.
Yu, H. Y., Grassmann, C. W. & Behrens, S. E. (1999). Sequence and structural elements at the 3' terminus of bovine viral diarrhea virus genomic RNA: functional role during RNA replication. Journal of Virology 73, 3638-3648.
Received 17 August 2000; accepted 10 January 2001.