Institute of Molecular Evolutionary Genetics and Department of Biology, The Pennsylvania State University
Abstract |
---|
Introduction |
---|
Hemagglutinin (HA) is the major envelope glycoprotein of influenza A and B viruses, and hemagglutinin-esterase (HE) in C viruses is a protein homologous to HA. HA (HE) is cleaved into the signal peptide (about 20 amino acids in influenza A viruses), protein HA1 (HE1) (about 320 amino acids), and protein HA2 (HE2) (about 220 amino acids) when mature proteins are produced (fig. 1). HA1 (HE1) is a receptor-binding protein and the major target of immune responses, whereas HA2 (HE2) is an anchor protein of the envelope and mediates fusion of the envelope and the cellular endosomal membrane. Influenza A virus HA genes are classified into 15 subtypes (H1–H15) according to their antigenic properties (WHO Memorandum 1980), whereas B and C virus HA (HE) genes are not classified into subtypes. Because influenza A virus pandemics in humans appear to occur when new subtypes of HA genes are introduced from aquatic birds, an understanding of the origin and evolution of HA genes is of particular importance.
The purpose of this paper is to study the evolutionary relationships of influenza A, B, and C virus HA (HE) genes. We are also interested in estimating the divergence times between these genes.
Materials and Methods |
---|
Amino acid sequences of influenza A, B, and C virus HA2s (HE2s) were collected from the international DNA databank (DDBJ release 43). After excluding sequences from laboratory-adapted viruses and identical sequences within species, we obtained 57, 34, 58, 10, 29, 2, 41, 1, 4, 2, 1, 1, 3, 1, and 2 amino acid sequences for the H1–H15 subtypes of A virus HA2s, respectively. We also obtained 15 sequences for B virus HA2s and 35 sequences for C virus HE2s. A total of 296 amino acid sequences were aligned by the computer program CLUSTAL W (Thompson, Higgins, and Gibson 1994). After removing all alignment gaps, 207 amino acid sites were used for estimating p, Poisson correction (PC), and gamma distances (Nei and Kumar 2000). The gamma shape parameter (a) was estimated to be 1.83 by Gu and Zhang's (1997) method. The phylogenetic tree was constructed by the neighbor-joining (NJ) method (Saitou and Nei 1987), and the reliability of each interior branch was tested by the bootstrap method with 1,000 resamplings (Felsenstein 1985; Kumar et al. 2001). The NJ trees were also constructed for 17 amino acid sequences which were randomly chosen from each subtype of A virus HA2s and from B virus HA2s and C virus HE2s (table 1).
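The three distance measures named above can be illustrated with a minimal Python sketch. It assumes the standard textbook definitions (p as the proportion of differing sites, PC distance d = -ln(1 - p), and gamma distance d = a[(1 - p)^(-1/a) - 1]); the function names and the two short example sequences are hypothetical and are not taken from the study's data or software.

```python
import math


def p_distance(seq1: str, seq2: str) -> float:
    """Proportion of sites that differ between two aligned, gap-free sequences."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be aligned, equal-length, and non-empty")
    diffs = sum(a != b for a, b in zip(seq1, seq2))
    return diffs / len(seq1)


def poisson_distance(p: float) -> float:
    """Poisson correction (PC) distance: d = -ln(1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.log(1.0 - p)


def gamma_distance(p: float, a: float) -> float:
    """Gamma distance with shape parameter a: d = a * ((1 - p)**(-1/a) - 1)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    if a <= 0.0:
        raise ValueError("shape parameter a must be positive")
    return a * ((1.0 - p) ** (-1.0 / a) - 1.0)


if __name__ == "__main__":
    # Hypothetical 10-residue fragments, for illustration only.
    s1 = "ACDEFGHIKL"
    s2 = "ACDEFGHIKV"
    p = p_distance(s1, s2)
    print(p, poisson_distance(p), gamma_distance(p, a=1.83))
```

For any p greater than zero the gamma distance exceeds the PC distance, which in turn exceeds p, because each correction allows for more multiple substitutions at the same site.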
We obtained 50, 25, 24, 10, 21, 2, 25, 1, 4, 2, 1, 1, 3, 1, and 2 amino acid sequences for the H1–H15 subtypes of A virus HAs from the databank, respectively, and made a multiple alignment for a total of 172 sequences by CLUSTAL W. After removing all alignment gaps, 540 amino acid sites were used for estimating gamma distances with a = 1.20, which was obtained by Gu and Zhang's method. An NJ tree was constructed, and the branch lengths were recalculated by the ordinary least squares method (Rzhetsky and Nei 1993) to estimate the rate of amino acid substitution accurately (see below).
When the years of isolation are available for viral sequences in a phylogenetic tree, the rate of amino acid substitution may be estimated by the regression coefficient of the numbers of amino acid substitutions from a common root on the years of isolation (Nei 1983; Suzuki, Wyndham, and Gojobori 2001). Using the phylogenetic tree for 172 sequences of influenza A virus HAs, we estimated the rate of amino acid substitution for duck A virus HAs because duck provided the largest number (28) of sequences among aquatic birds. For estimating the divergence times between subtypes of A virus HA genes, we constructed a linearized tree (Takezaki, Rzhetsky, and Nei 1995) for 28 amino acid sequences of duck A virus HAs using the gamma distance with a = 1.20. The standard errors (SEs) and 99% confidence intervals (CIs) of the rates and the divergence times were estimated by the bootstrap method, under the assumption that the topologies of the phylogenetic trees for 172 sequences of influenza A virus HAs and 28 sequences of duck A virus HAs were correct (Nei and Kumar 2000).
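The regression step described above can be sketched in Python as follows. This is a minimal illustration that assumes an ordinary least-squares fit of root-to-tip distance on year of isolation; the function name and the four data points are hypothetical and do not come from the sequences analyzed here.

```python
from typing import Sequence, Tuple


def substitution_rate(
    years: Sequence[float], root_to_tip: Sequence[float]
) -> Tuple[float, float]:
    """Return (slope, intercept) of the least-squares regression of
    root-to-tip distance (substitutions per site) on year of isolation.
    The slope is the estimated substitution rate per site per year."""
    n = len(years)
    if n != len(root_to_tip) or n < 2:
        raise ValueError("need at least two (year, distance) pairs of equal length")
    mean_x = sum(years) / n
    mean_y = sum(root_to_tip) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("years of isolation must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, root_to_tip))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Hypothetical isolates: year of isolation and distance from the root.
    years = [1960, 1970, 1980, 1990]
    distances = [0.010, 0.013, 0.016, 0.019]
    rate, _ = substitution_rate(years, distances)
    print(f"estimated rate: {rate:.2e} per site per year")
```

Standard errors and confidence intervals of the kind reported in this study would additionally require resampling sites and recomputing the distances, which this sketch does not attempt.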
Results |
---|
Discussion |
---|
The rate of amino acid substitution for duck A virus HAs (3.19 × 10⁻⁴ per site per year) was lower than that for human and swine A virus HAs ([0.56–2.03] × 10⁻³ per site per year) but similar to that for B virus HAs (5.3 × 10⁻⁴ per site per year [Air et al. 1990]) and C virus HEs (2.3 × 10⁻⁴ per site per year [Muraki et al. 1996]). These results suggest that the rate for HAs (HEs) is more or less constant in the natural reservoir but is accelerated in the newly infected host species. This is probably caused by variation in the strengths of immune responses and functional constraints on HAs (HEs) among different host species (Yamashita et al. 1988; Bean et al. 1992; Schafer et al. 1993; Scholtissek, Ludwig, and Fitch 1993; Makarova et al. 1999; Suzuki and Gojobori 1999).
The earliest divergence time between subtypes of influenza A virus HA genes was estimated to be about 2,000 years ago. Also, the divergence time between A and B virus HA genes was estimated to be about 4,000 years ago, whereas A and B virus HA genes and C virus HE genes diverged about 8,000 years ago. These estimates are substantially larger than those (200–300 years) obtained by Saitou and Nei (1986), who used human HA sequences. Because the evolutionary rate for human A virus HAs is known to be higher than that for aquatic birds, their estimates are considered to be underestimates. In fact, influenza pandemics in humans have been recorded as early as 412 B.C. (Kaplan and Webster 1977), suggesting that influenza A viruses existed more than 2,400 years ago. This observation is consistent with the estimates obtained in the present study.
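The conversion from a distance to a divergence time can be sketched as follows. The sketch assumes that, under a molecular clock, the height of a node in a linearized tree (substitutions per site from the node to the present-day tips) equals the rate multiplied by the elapsed time, so that time is height divided by rate. The function name and the example node height are hypothetical; only the duck HA rate is taken from the text.

```python
def divergence_time(node_height: float, rate_per_year: float) -> float:
    """Years since divergence, given the node height in a linearized tree
    (substitutions per site from node to tips) and a clock rate
    (substitutions per site per year)."""
    if rate_per_year <= 0:
        raise ValueError("rate must be positive")
    if node_height < 0:
        raise ValueError("node height must be non-negative")
    return node_height / rate_per_year


if __name__ == "__main__":
    duck_ha_rate = 3.19e-4   # per site per year, as reported in the text
    example_height = 0.64    # hypothetical node height, for illustration only
    years = divergence_time(example_height, duck_ha_rate)
    print(f"about {years:.0f} years")
```

With these illustrative inputs the result is on the order of 2,000 years. If a pairwise distance between two tips is used instead of a node height, it should be halved first, because both lineages have accumulated substitutions since the split.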
We estimated the rates and the divergence times under the assumption that the molecular clock has held throughout the evolutionary history of HA (HE) genes. To examine whether this was really the case, we tested the linear relationship between the year of isolation and the number of amino acid substitutions in figure 4 and found that the linearity was not supported at the 1% significance level in both panels (a) and (f). However, the rate of amino acid substitution for human A virus HAs obtained from panel (a) (1.20 × 10⁻³ per site per year) was similar to that from previous studies (1.0 × 10⁻³ per site per year [Saitou and Nei 1986]), and the rate for duck A virus HAs obtained from panel (f) (3.89 × 10⁻⁴ per site per year) was similar to that obtained from panel (e) (2.48 × 10⁻⁴ per site per year). These observations suggest that the rates obtained from panels (a) and (f) are approximately correct. Also, the molecular clock was not rejected at the 1% significance level for the phylogenetic tree in figure 5 by the likelihood-ratio test (Rambaut 2000; Yang 2000) but was rejected for the tree in figure 2c.
The latter observation may reflect the fact that the biochemical functions are different between HAs and HEs and the natural reservoirs are not the same for influenza A, B, and C viruses. Therefore, some caution is necessary in estimating the divergence times between influenza A, B, and C virus HA (HE) genes. However, the rate of amino acid substitution for duck influenza A virus HAs was similar to that for B virus HAs and C virus HEs, as indicated previously. Also, in reality, no strict molecular clock is likely to hold for any protein, but it is known that rough divergence times can be obtained even if the molecular clock is violated to some extent (Nei and Kumar 2000, pp. 187–206; Nei, Xu, and Glazko 2001). Therefore, these estimates also appear to be appropriate as rough estimates.
In conclusion, influenza virus HA (HE) genes apparently evolved at a rate of amino acid substitution on the order of 10⁻⁴ per site per year in the natural reservoir. These genes apparently diverged into influenza A, B, and C virus HA (HE) genes several thousand years ago and subsequently into subtypes in influenza A viruses from several thousand to several hundred years ago.
Acknowledgements |
---|
Footnotes |
---|
Keywords: influenza virus, hemagglutinin, hemagglutinin-esterase, rate of amino acid substitution, divergence time
Address for correspondence and reprints: Yoshiyuki Suzuki, Institute of Molecular Evolutionary Genetics, The Pennsylvania State University, 328 Mueller Laboratory, University Park, Pennsylvania 16802. yis1{at}psu.edu
References |
---|
Air G. M., A. J. Gibbs, W. G. Laver, R. G. Webster, 1990 Evolutionary changes in influenza B are not primarily governed by antibody selection Proc. Natl. Acad. Sci. USA 87:3884-3888
Bean W. J., M. Schell, J. Katz, Y. Kawaoka, C. Naeve, O. Gorman, R. G. Webster, 1992 Evolution of the H3 influenza virus hemagglutinin from human and nonhuman hosts J. Virol 66:1129-1138
Cox N. J., F. Fuller, N. Kaverin, H. D. Klenk, R. A. Lamb, B. W. J. Mahy, J. McCauley, K. Nakamura, P. Palese, R. Webster, 2000 Family Orthomyxoviridae Pp. 585-597 in M. H. V. van Regenmortel, C. M. Fauquet, D. H. L. Bishop, et al. (11 co-editors), eds. Virus Taxonomy. Academic Press, London
Felsenstein J., 1985 Confidence limits on phylogenies: an approach using the bootstrap Evolution 39:783-791
Gammelin M., A. Altmuller, U. Reinhardt, J. Mandler, V. R. Harley, P. J. Hudson, W. M. Fitch, C. Scholtissek, 1990 Phylogenetic analysis of nucleoproteins suggests that human influenza A viruses emerged from a 19th-century avian ancestor Mol. Biol. Evol 7:194-200
Gu X., J. Zhang, 1997 A simple method for estimating the parameter of substitution rate variation among sites Mol. Biol. Evol 14:1106-1113
Hayashida H., H. Toh, R. Kikuno, T. Miyata, 1985 Evolution of influenza virus genes Mol. Biol. Evol 2:289-303
Hinshaw V. S., R. G. Webster, B. Turner, 1980 The perpetuation of orthomyxoviruses and paramyxoviruses in Canadian waterfowl Can. J. Microbiol 26:622-629
Kaplan M. M., R. G. Webster, 1977 The epidemiology of influenza Sci. Am 12:88-106
Kendal A. P., G. R. Noble, J. J. Skehel, W. R. Dowdle, 1978 Antigenic similarity of influenza A (H1N1) viruses from epidemics in 1977–1978 to "Scandinavian" strains isolated in epidemics of 1950–1951 Virology 89:632-636
Krossoy B., I. Hordvik, F. Nilsen, A. Nylund, C. Endresen, 1999 The putative polymerase sequence of infectious salmon anemia virus suggests a new genus within the Orthomyxoviridae J. Virol 73:2136-2142
Kumar S., K. Tamura, I. B. Jakobsen, M. Nei, 2001 MEGA2: molecular evolutionary genetics analysis software Bioinformatics 17:1244-1245
Makarova N. V., N. V. Kaverin, S. Krauss, D. Senne, R. G. Webster, 1999 Transmission of Eurasian avian H2 influenza virus to shorebirds in North America J. Gen. Virol 80:3167-3171
Muraki Y., S. Hongo, K. Sugawara, F. Kitame, K. Nakamura, 1996 Evolution of the haemagglutinin-esterase gene of influenza C virus J. Gen. Virol 77:673-679
Nakada S., R. S. Creager, M. Krystal, R. P. Aaronson, P. Palese, 1984 Influenza C virus hemagglutinin: comparison with influenza A and B virus hemagglutinins J. Virol 50:118-124
Nakajima K., U. Desselberger, P. Palese, 1978 Recent human influenza A (H1N1) viruses are closely related genetically to strains isolated in 1950 Nature 274:334-339
Nei M., 1983 Genetic polymorphism and the role of mutation in evolution Pp. 165-190 in M. Nei and R. K. Koehn, eds. Evolution of genes and proteins. Sinauer, Sunderland, Mass
Nei M., S. Kumar, 2000 Molecular evolution and phylogenetics Oxford University Press, Oxford, New York
Nei M., P. Xu, G. Glazko, 2001 Estimation of divergence times from multiprotein sequences for a few mammalian species and several distantly related organisms Proc. Natl. Acad. Sci. USA 98:2497-2502
Palese P., J. F. Young, 1982 Variation of influenza A, B, and C viruses Science 215:1468-1474
Rambaut A., 2000 Estimating the rate of molecular evolution: incorporating noncontemporaneous sequences into maximum likelihood phylogenies Bioinformatics 16:395-399
Reid A. H., T. G. Fanning, J. V. Hultin, J. K. Taubenberger, 1999 Origin and evolution of the 1918 "Spanish" influenza hemagglutinin gene Proc. Natl. Acad. Sci. USA 96:1651-1656
Rohm C., N. Zhou, J. Suss, J. Mackenzie, R. G. Webster, 1996 Characterization of a novel influenza hemagglutinin, H15: criteria for determination of influenza A subtypes Virology 217:508-516
Rzhetsky A., M. Nei, 1993 Theoretical foundation of the minimum-evolution method of phylogenetic inference Mol. Biol. Evol 10:1073-1095
Saitou N., M. Nei, 1986 Polymorphism and evolution of influenza A virus genes Mol. Biol. Evol 3:57-74
Saitou N., M. Nei, 1987 The neighbor-joining method: a new method for reconstructing phylogenetic trees Mol. Biol. Evol 4:406-425
Schafer J. R., Y. Kawaoka, W. J. Bean, J. Suss, D. Senne, R. G. Webster, 1993 Origin of the pandemic 1957 H2 influenza A virus and the persistence of its possible progenitors in the avian reservoir Virology 194:781-788
Scholtissek C., S. Ludwig, W. M. Fitch, 1993 Analysis of influenza A virus nucleoproteins for the assessment of molecular genetic mechanisms leading to new phylogenetic virus lineages Arch. Virol 131:237-250
Scholtissek C. V., V. von Hoyningen, R. Rott, 1978 Genetic relatedness between the new 1977 epidemic strains (H1N1) of influenza and human influenza strains isolated between 1947 and 1957 (H1N1) Virology 89:613-617
Slemons R. D., D. C. Johnson, J. S. Osborn, F. Hayes, 1974 Type-A influenza viruses isolated from wild free-flying ducks in California Avian Dis 18:119-124
Smith W., C. H. Andrewes, P. P. Laidlaw, 1933 A virus obtained from influenza patients Lancet 225:66-68
Suarez D. L., 2000 Evolution of avian influenza viruses Vet. Microbiol 74:15-27
Suzuki Y., T. Gojobori, 1999 A method for detecting positive selection at single amino acid sites Mol. Biol. Evol 16:1315-1328
Suzuki Y., A. Wyndham, T. Gojobori, 2001 Virus evolution Pp. 377-413 in D. J. Balding, M. Bishop, and C. Cannings, eds. Handbook of statistical genetics. Wiley, Chichester
Takezaki N., A. Rzhetsky, M. Nei, 1995 Phylogenetic test of the molecular clock and linearized trees Mol. Biol. Evol 12:823-833
Thompson J. D., D. G. Higgins, T. J. Gibson, 1994 CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice Nucleic Acids Res 22:4673-4680
Webster R. G., W. J. Bean, O. T. Gorman, T. M. Chambers, Y. Kawaoka, 1992 Evolution and ecology of influenza A viruses Microbiol. Rev 56:152-179
Webster R. G., M. Yakhno, V. S. Hinshaw, W. J. Bean, K. G. Murti, 1978 Intestinal influenza: replication and characterization of influenza viruses in ducks Virology 84:268-278
WHO Memorandum. 1980 A revision of the system of nomenclature for influenza viruses Bull. WHO 58:585-591
Yamashita M., M. Krystal, W. M. Fitch, P. Palese, 1988 Influenza B virus evolution: co-circulating lineages and comparison of evolutionary pattern with those of influenza A and C viruses Virology 163:112-122
Yang Z., 2000 Phylogenetic analysis by maximum likelihood (PAML). Version 3.0 University College London, London, U.K