Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Mishima, Japan
Correspondence: E-mail: tgojobor{at}genes.nig.ac.jp.
Abstract |
---|
Key Words: RNA virus, evolution, synonymous substitution rate, replication frequency, infection mode, transmission mode
Introduction |
---|
Nucleotide substitution is considered one of the important evolutionary mechanisms because it is the major source of new mutants in RNA viruses. To study the mutation rate, we examined the rate of synonymous substitution, because natural selection does not strongly influence the fixation probability of synonymous substitutions, at least at the protein level, and the rate of synonymous substitution therefore reflects the mutation rate to a great extent (Miyata and Yasunaga 1980; Bush et al. 1999).
The aims of this study were to examine the degree of variation in synonymous substitution rates among RNA viruses and to identify the main source of that variation. For this purpose, we estimated the synonymous substitution rates for a total of 49 species of RNA viruses belonging to 39 genera of 15 families.
Materials and Methods |
---|
We also obtained the years of isolation for all strains from the database and the available publications; these are listed for all RNA virus species in appendix A of the supplemental materials. We then estimated the rate of synonymous substitution for the genes encoding the outer structural protein. For hepatitis D virus only, we used the whole genome sequence because this virus does not have the structural protein. The RNA virus species used in this paper are summarized in tables 1–4. These tables also show natural hosts, infection modes, and transmission modes, the references for which are listed in appendix A of the supplemental materials. In examining the infection modes, we always focused on the infection mode occurring between a given virus species and its natural host, because that infection mode is reasonably considered to represent an important feature of the RNA virus.
In the second approach, we estimated the rates of synonymous substitution for the remaining three RNA viruses, Puumala, HTLV-1, and HGV, using divergence times that have already been reported. These viruses were reported to coevolve with their host species (Horai 1995; Yanagihara et al. 1995; Asikainen et al. 2000; Robertson 2001). Therefore, the divergence time of a virus was considered to correspond to the divergence time of its host. For each of the three RNA viruses, we first constructed a multiple alignment of the coding region using the computer program CLUSTALW. From the multiple alignment, a phylogenetic tree was constructed by the maximum likelihood method on the basis of the HKY model. The ancestral sequence at the divergence node was estimated by the maximum likelihood approach. The rate of synonymous substitution was estimated by dividing the average number of synonymous substitutions from the ancestral sequence to all tips of the phylogenetic tree by the time elapsed from the known divergence time of the host to the present.
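The final step of this second approach reduces to a simple ratio. The following is a minimal Python sketch of that ratio, assuming the synonymous distances from the ancestral node to each tip have already been obtained from some other tool. The function name `synonymous_rate` and the numbers in the usage lines are hypothetical illustrations, not values from this study.

```python
import math


def synonymous_rate(subs_to_tips, divergence_time_years):
    """Mean ancestor-to-tip synonymous distance divided by the elapsed time.

    subs_to_tips: synonymous substitutions per synonymous site from the
        ancestral node to each tip (assumed to be precomputed).
    divergence_time_years: host divergence time, in years before present.
    """
    if not subs_to_tips:
        raise ValueError("at least one tip distance is required")
    if divergence_time_years <= 0:
        raise ValueError("divergence time must be positive")
    mean_subs = sum(subs_to_tips) / len(subs_to_tips)
    return mean_subs / divergence_time_years


# Hypothetical input: three tips and a host divergence 1 million years ago.
example_rate = synonymous_rate([0.10, 0.12, 0.08], 1_000_000)
print(f"{example_rate:.2e} substitutions per site per year")
```

Averaging over all tips, rather than using a single lineage, reduces the influence of any one unusually fast or slow branch on the estimate.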
A Test of Substitution Saturation
We conducted a statistical test to examine whether the number of substitutions was saturated, using the method of Xia and Xie (2001). In this method, the numbers of transitions and transversions are each plotted against an evolutionary distance such as the number of nucleotide substitutions. Figure 1, for example, shows this comparison for human enterovirus A. When transitions occur much more frequently than transversions, no substitution saturation is recognized. When transversions gradually outnumber transitions, on the other hand, substitution saturation is suspected because multiple substitutions may have occurred at the same site. We therefore compared the two regression slopes. If the slope for transitions against evolutionary distance was significantly steeper than that for transversions, the substitutions were considered not to be saturated.
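The comparison of two regression slopes can be sketched in Python as follows. This is an illustration of the general idea only: it uses the textbook pooled-variance t statistic for the difference between two least-squares slopes, which is an assumption and not necessarily the computation performed by the software used in this study. The function names and the data points are hypothetical. Pairwise distances are also not independent observations, so the statistic should be read as a rough guide.

```python
import math


def least_squares(x, y):
    """Return the slope, the sum of squares of x, and the residual sum of squares."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, sxx, rss


def compare_slopes(distance, transitions, transversions):
    """t statistic for (transition slope - transversion slope), pooled variance."""
    b_ts, sxx_ts, rss_ts = least_squares(distance, transitions)
    b_tv, sxx_tv, rss_tv = least_squares(distance, transversions)
    df = 2 * len(distance) - 4  # two lines, two fitted parameters each
    pooled_var = (rss_ts + rss_tv) / df
    se_diff = math.sqrt(pooled_var * (1 / sxx_ts + 1 / sxx_tv))
    if se_diff == 0:
        raise ValueError("both fits are exact; the t statistic is undefined")
    return b_ts, b_tv, (b_ts - b_tv) / se_diff, df


# Hypothetical pairwise data: transitions rise faster than transversions.
distance = [0.1, 0.2, 0.3, 0.4, 0.5]
transitions = [0.06, 0.13, 0.18, 0.25, 0.30]
transversions = [0.02, 0.04, 0.07, 0.08, 0.11]
b_ts, b_tv, t_stat, df = compare_slopes(distance, transitions, transversions)
print(f"slope(ts)={b_ts:.3f} slope(tv)={b_tv:.3f} t={t_stat:.2f} df={df}")
```

A large positive t statistic indicates that the transition slope is steeper than the transversion slope, which corresponds to the unsaturated case described above. Converting the statistic to a P value requires the t distribution with `df` degrees of freedom.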
Results and Discussion |
---|
Moreover, when the rate variation of RNA viruses was compared with that of nonviral organisms, we found that the former was about 1,000-fold larger than the latter. Nonviral organisms have been reported to evolve at synonymous substitution rates that vary by only 2 orders of magnitude, within the range of (0.12–12.4) × 10⁻⁹ (Li, Tanimura, and Sharp 1987; Wolfe, Li, and Sharp 1987; Bulmer, Wolfe, and Sharp 1991; Gaut et al. 1996; Pawlowski et al. 1997).
We considered three possible hypotheses to explain why such variation exists. The first is that variation in the error rate per replication (replication error rate) among RNA viruses is the main cause. The second is that variation in the number of replications per unit time (replication frequency) among RNA viruses is the main cause. The third is that both contribute to the variation.
To investigate which of these three hypotheses was correct, we compared the replication error rate with the synonymous substitution rate for eight different RNA viruses, as shown in table 5 (Schrag, Rota, and Bellini 1991; Mansky and Temin 1995; Drake and Holland 1999; Stech et al. 1999; Mansky 2000; Escarmis et al. 2002). The replication error rate of porcine reproductive and respiratory syndrome virus was estimated from the passage number, the number of nucleotide substitutions during the passage, and the time required for viral budding (Dea et al. 1995; Allende et al. 2000). The results showed that the replication error rate (on the order of 10⁻⁵) was almost constant among the eight species of RNA viruses in spite of the wide variation in their synonymous substitution rates. Because nearly constant replication error rates cannot contribute strongly to the variation in the synonymous substitution rates among RNA viruses, these results indicate that replication frequency is the main source of that variation.
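The reasoning in this paragraph can be made explicit with a small calculation. The Python sketch below assumes that synonymous changes are effectively neutral, so that the substitution rate per site per year is approximately the replication error rate per site per replication multiplied by the number of replications per year. The function name and all numerical values are hypothetical and serve only to illustrate the arithmetic; they are not estimates from table 5.

```python
import math


def implied_replication_frequency(substitution_rate, error_rate):
    """Replications per year implied by: rate = error_rate * frequency.

    Assumes neutrality of synonymous changes (an assumption of this sketch).
    """
    if error_rate <= 0:
        raise ValueError("error rate must be positive")
    return substitution_rate / error_rate


# Hypothetical values: a constant error rate and two rates 5 orders apart.
error_rate = 1e-5   # errors per site per replication
slow_rate = 1e-7    # substitutions per site per year
fast_rate = 1e-2    # substitutions per site per year

slow_freq = implied_replication_frequency(slow_rate, error_rate)
fast_freq = implied_replication_frequency(fast_rate, error_rate)
print(f"implied fold difference in replication frequency: {fast_freq / slow_freq:.0e}")
```

With the error rate held constant, any fold difference in the substitution rate maps directly onto the same fold difference in the implied replication frequency.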
First, we compared the infection modes with the rates of synonymous substitution. The rates of synonymous substitution for viruses inducing both acute and persistent infection were higher than those for viruses inducing only acute infection. The rates for viruses inducing only acute infection were in turn higher than those for viruses inducing only persistent infection, and the rates for viruses inducing only persistent infection were higher than those for viruses inducing latent infection. All of these differences were statistically significant (P < 0.05) by the two-tailed Wilcoxon test (fig. 3). Indeed, the replication frequencies of viruses inducing only acute infection are considered to be higher than those of viruses inducing only persistent infection, because viruses causing acute symptoms are expected to infect neighboring host cells more frequently than viruses causing persistent symptoms (Overbaugh and Bangham 2001).
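A group comparison of this kind can be illustrated with an exact two-sided Wilcoxon rank-sum test. The Python sketch below assumes that the two groups are independent samples and that the rank-sum form of the Wilcoxon test is the relevant one; the text does not specify the variant, so this is an assumption. The function names and the rate values are hypothetical. The P value is obtained by enumerating every possible assignment of ranks to the first group, so the approach is practical only for small samples.

```python
import math
from itertools import combinations


def midranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_test(group_a, group_b):
    """Return the rank sum of group_a and the exact two-sided P value."""
    n_a = len(group_a)
    ranks = midranks(list(group_a) + list(group_b))
    observed = sum(ranks[:n_a])
    expected = n_a * (len(ranks) + 1) / 2
    observed_dev = abs(observed - expected)
    total = 0
    extreme = 0
    for subset in combinations(ranks, n_a):
        total += 1
        # Count assignments at least as far from the expectation as observed.
        if abs(sum(subset) - expected) >= observed_dev - 1e-12:
            extreme += 1
    return observed, extreme / total


# Hypothetical synonymous substitution rates for two infection-mode groups.
acute_only = [5.1e-3, 2.0e-3, 8.7e-3, 3.3e-3]
persistent_only = [1.2e-4, 4.0e-4, 9.0e-5, 6.1e-4]
rank_sum, p_value = rank_sum_test(acute_only, persistent_only)
print(f"rank sum = {rank_sum}, two-sided P = {p_value:.4f}")
```

A rank-based test is a natural choice here because the rates span several orders of magnitude, and ranks are unaffected by that scale.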
The replication frequencies of viruses inducing both acute and persistent infection might be somewhat higher than those of viruses inducing only acute infection, because viruses inducing both types of infection can replicate repeatedly in the acute phase. On the other hand, the replication frequencies of viruses inducing latent infection are clearly lower than those of viruses inducing only persistent infection, because replication of the virus is strictly limited in latent infection. Thus, we reasonably assume that the highest replication frequency is shown by viruses inducing both acute and persistent infection, the second highest by viruses inducing only acute infection, the third by viruses inducing only persistent infection, and the fourth by viruses inducing latent infection. In fact, the rate of synonymous substitution is highest for viruses inducing both acute and persistent infection, second highest for viruses inducing only acute infection, third for viruses inducing only persistent infection, and fourth for viruses inducing latent infection. This agreement is presumably because differences in the infection modes affect the replication frequencies.
Furthermore, we compared the transmission mode with the rate of synonymous substitution among RNA viruses. As mentioned earlier, the transmission modes of RNA viruses were classified into six kinds: aerosol, contagious, fecal-oral, blood, bite, and vector. In figure 3, the synonymous substitution rates of viruses transmitted by the aerosol, contagious, or fecal-oral route were higher than those of viruses transmitted via blood, bite, or vector, and the differences were significant (P < 0.05) by the two-tailed Wilcoxon test. These results imply that differences in viral transmission modes are also correlated with the rate of synonymous substitution. The correlation can be understood as follows. Viruses that spread rapidly among hosts through aerosol, contagious, or fecal-oral transmission would replicate quickly because they can infect many individuals surrounding an infected host. On the other hand, viruses that spread slowly among hosts via blood, a bite, or a vector would replicate slowly by comparison. This indicates that the transmission mode affects the replication frequency and that differences in replication frequency contribute to the variation in the rate of synonymous substitution for RNA viruses. In fact, there is a good example in which a change of transmission mode seriously affected the evolutionary rate (Salemi et al. 1999). That report is consistent with our results that differences in the transmission mode affect the replication frequency, and that differences in the replication frequency in turn produce differences in the rate of synonymous substitution.
To summarize, the synonymous substitution rates among RNA viruses varied by 5 orders of magnitude. Moreover, in the present study, we showed that the variation in the synonymous substitution rates among RNA viruses was caused mainly by variation in the replication frequency, and that differences in the infection and transmission modes affected the variation in replication frequency.
Supplementary Material |
---|
Acknowledgements |
---|
Footnotes |
---|
Literature Cited |
---|
Allende, R., W. W. Laegreid, G. F. Kutish, J. A. Galeota, R. W. Wills, and F. A. Osorio. 2000. Porcine reproductive and respiratory syndrome virus: description of persistence in individual pigs upon experimental infection. J. Virol. 74:10834-10837.
Asikainen, K., T. Hanninen, H. Henttonen, J. Niemimaa, J. Laakkonen, H. K. Andersen, N. Bille, H. Leirs, A. Vaheri, and A. Plyusnin. 2000. Molecular evolution of puumala hantavirus in Fennoscandia: phylogenetic analysis of strains from two recolonization routes, Karelia and Denmark. J. Gen. Virol. 81:2833-2841.
Bulmer, M., H. Wolfe, and P. M. Sharp. 1991. Synonymous nucleotide substitution rates in mammalian genes: implications for the molecular clock and the relationship of mammalian orders. Proc. Natl. Acad. Sci. USA 88:5974-5978.
Bush, R. M., W. M. Fitch, C. A. Bender, and N. J. Cox. 1999. Positive selection on the H3 hemagglutinin gene of human influenza virus A. Mol. Biol. Evol. 16:1457-1465.
Dea, S., N. Sawyer, R. Alain, and R. Athanassious. 1995. Ultrastructural characteristics and morphogenesis of porcine reproductive and respiratory syndrome virus propagated in the highly permissive MARC-145 cell clone. Adv. Exp. Med. Biol. 380:95-98.
Drake, J. W., and J. J. Holland. 1999. Mutation rates among RNA viruses. Proc. Natl. Acad. Sci. USA 96:13910-13913.
Escarmis, C., G. Gomez-Mariano, M. Davila, E. Lazaro, and E. Domingo. 2002. Resistance to extinction of low fitness virus subjected to plaque-to-plaque transfers: diversification by mutation clustering. J. Mol. Biol. 315:647-661.
Gaut, B. S., B. R. Morton, B. C. McCaig, and M. T. Clegg. 1996. Substitution rate comparisons between grasses and palms: synonymous rate differences at the nuclear gene Adh parallel rate differences at the plastid gene rbcL. Proc. Natl. Acad. Sci. USA 93:10274-10279.
Horai, S. 1995. Evolution and the origins of man: clues from complete sequences of hominoid mitochondrial DNA. Southeast Asian J. Trop. Med. Public Health 26:146-154.
Jenkins, G. M., A. Rambaut, O. G. Pybus, and E. C. Holmes. 2002. Rates of molecular evolution in RNA viruses: a quantitative phylogenetic analysis. J. Mol. Evol. 54:156-165.
Korber, B., M. Muldoon, J. Theiler, F. Gao, R. Gupta, A. Lapedes, B. H. Hahn, S. Wolinsky, and T. Bhattacharya. 2000. Timing the ancestor of the HIV-1 pandemic strains. Science 288:1789-1796.
Lee, L. M., and D. K. Henderson. 2001. Emerging viral infections. Curr. Opin. Infect. Dis. 14:467-480.
Li, W. H., M. Tanimura, and P. M. Sharp. 1987. An evaluation of the molecular clock hypothesis using mammalian DNA sequences. J. Mol. Evol. 25:330-342.
Mansky, L. M. 2000. In vivo analysis of human T-cell leukemia virus type 1 reverse transcription accuracy. J. Virol. 74:9525-9531.
Mansky, L. M., and H. M. Temin. 1995. Lower in vivo mutation rate of human immunodeficiency virus type 1 than that predicted from the fidelity of purified reverse transcriptase. J. Virol. 69:5087-5094.
Miyata, T., and T. Yasunaga. 1980. Molecular evolution of mRNA: a method for estimating evolutionary rates of synonymous and amino acid substitutions from homologous nucleotide sequences and its application. J. Mol. Evol. 16:23-36.
Nei, M., and T. Gojobori. 1986. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3:418-426.
Overbaugh, J., and C. R. Bangham. 2001. Selection forces and constraints on retroviral sequence variation. Science 292:1106-1109.
Parashar, U. D., L. M. Sunn, and F. Ong, et al. (17 coauthors). 2000. Case-control study of risk factors for human infection with a new zoonotic paramyxovirus, Nipah virus, during a 1998–1999 outbreak of severe encephalitis in Malaysia. J. Infect. Dis. 181:1755-1759.
Pawlowski, J., I. Bolivar, J. F. Fahrni, C. de Vargas, M. Gouy, and L. Zaninetti. 1997. Extreme differences in rates of molecular evolution of foraminifera revealed by comparison of ribosomal DNA sequences and the fossil record. Mol. Biol. Evol. 14:498-505.
Rambaut, A. 2000. Estimating the rate of molecular evolution: incorporating non-contemporaneous sequences into maximum likelihood phylogenies. Bioinformatics 16:395-399.
Robertson, B. H. 2001. Viral hepatitis and primates: historical and molecular analysis of human and nonhuman primate hepatitis A, B, and the GB-related viruses. J. Viral. Hepat. 8:233-242.
Salemi, M., M. Lewis, J. F. Egan, W. W. Hall, J. Desmyter, and A. M. Vandamme. 1999. Different population dynamics of human T cell lymphotropic virus type II in intravenous drug users compared with endemically infected tribes. Proc. Natl. Acad. Sci. USA 96:13253-13258.
Schrag, S. J., P. A. Rota, and W. J. Bellini. 1991. Spontaneous mutation rate of measles virus: direct estimation based on mutations conferring monoclonal antibody resistance. J. Virol. 73:51-54.
Stech, J., X. Xiong, C. Scholtissek, and R. G. Webster. 1999. Independence of evolutionary and mutational rates after transmission of avian influenza viruses to swine. J. Virol. 73:1878-1884.
Tanaka, Y., K. Hanada, M. Mizokami, A. E. Yeo, J. W. Shih, T. Gojobori, and H. J. Alter. 2002. Inaugural Article: A comparison of the molecular clock of hepatitis C virus in the United States and Japan predicts that hepatocellular carcinoma incidence in the United States will increase over the next two decades. Proc. Natl. Acad. Sci. USA 99:15584-15589.
Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.
Wolfe, K. H., W. H. Li, and P. M. Sharp. 1987. Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs. Proc. Natl. Acad. Sci. USA 84:9054-9058.
Xia, X., and Z. Xie. 2001. DAMBE: software package for data analysis in molecular biology and evolution. J. Hered. 92:371-373.
Yanagihara, R., N. Saitou, V. R. Nerurkar, K. J. Song, I. Bastian, G. Franchini, and D. C. Gajdusek. 1995. Molecular phylogeny and dissemination of human T-cell lymphotropic virus type I viewed within the context of primate evolution and human migration. Cell Mol. Biol. 41:145-161.
Yang, Z., S. Kumar, and M. Nei. 1995. A new method of inference of ancestral nucleotide and amino acid sequences. Genetics 141:1641-1650.