Abteilung Virologie, Universität des Saarlandes, Institut für Medizinische Mikrobiologie und Hygiene, Klinikum Homburg, Haus 47, D-66421 Homburg, Germany¹
Unité de Rétrovirologie Moléculaire, Institut Pasteur, F-75724 Paris cedex 15, France²
Author for correspondence: Andreas Meyerhans. Fax +49 6841 16 3980. e-mail Andreas.Meyerhans{at}med-rz.uni-sb.de
Abstract |
---|
Introduction |
---|
The mechanisms that may contribute to the fixation of mutations include positive and negative selection as well as stochastic effects such as bottlenecking and the massive destruction of virus and virus-infected cells by the intense antiviral immune response. A widely used means of distinguishing between these processes is the analysis of non-synonymous (ns) and synonymous (s) nucleotide substitutions. When normalized to the number of non-synonymous and synonymous sites, a ratio of dns/ds >> 1 would indicate positive selection and dns/ds << 1 would indicate negative selection, while dns/ds ≈ 1 is compatible with drift (Nei & Gojobori, 1986).
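For orientation, the normalization mentioned above is commonly written as follows, after Nei & Gojobori (1986). The symbols N_d and S_d (counted non-synonymous and synonymous differences) and N and S (numbers of non-synonymous and synonymous sites) are notation introduced here for illustration only; the text itself gives no formula.

```latex
\[
p_{ns} = \frac{N_d}{N}, \qquad p_{s} = \frac{S_d}{S}
\]
\[
d_{ns} = -\frac{3}{4}\ln\!\left(1 - \frac{4}{3}\,p_{ns}\right), \qquad
d_{s} = -\frac{3}{4}\ln\!\left(1 - \frac{4}{3}\,p_{s}\right)
\]
\[
\frac{d_{ns}}{d_{s}} \gg 1 \ \text{(positive selection)}, \qquad
\frac{d_{ns}}{d_{s}} \ll 1 \ \text{(negative selection)}, \qquad
\frac{d_{ns}}{d_{s}} \approx 1 \ \text{(compatible with drift)}
\]
```

The logarithmic term is the Jukes-Cantor correction for multiple substitutions at the same site; for small proportions it leaves d close to p.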
Counting ns and s substitutions within an HIV sequence dataset relies on a proper reconstruction of virus evolution by phylogeny. Without such a reconstruction, standard pairwise methods for estimating dns and ds tend to overestimate the number of substitutions (Zanotto et al., 1999). Split decomposition is a mathematical clustering technique that has been applied successfully to the analysis of virus evolution (Dopazo et al., 1993; Plikat et al., 1997). It is a non-approximative method by which a set of sequences in the form of a distance matrix is decomposed into a number of binary splits. The splits can then be presented as a network, in which the nodes and tips of the branches correspond to individual sequences.
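The decomposition of a distance matrix into binary splits can be illustrated with a minimal Python sketch. It assumes the isolation index of Bandelt & Dress (1992) as the criterion for retaining a split and enumerates every bipartition by brute force, so it is only practical for a handful of sequences; it is not the algorithm implemented in SplitsTree. The function names and the four-taxon toy matrix are hypothetical.

```python
from itertools import combinations


def isolation_index(dist, side_a, side_b):
    """Isolation index of the split side_a | side_b (Bandelt & Dress, 1992).

    dist is a symmetric matrix (list of lists) with a zero diagonal;
    side_a and side_b are lists of row indices.
    """
    best = None
    for i in side_a:
        for j in side_a:
            for k in side_b:
                for m in side_b:
                    within = dist[i][j] + dist[k][m]
                    beta = max(within,
                               dist[i][k] + dist[j][m],
                               dist[i][m] + dist[j][k]) - within
                    if best is None or beta < best:
                        best = beta
    return best / 2.0


def d_splits(names, dist):
    """Return {frozenset(names on the side holding names[0]): index}
    for every bipartition whose isolation index is positive."""
    n = len(names)
    splits = {}
    # Keep taxon 0 on side A so that each bipartition is visited once.
    for size in range(0, n - 1):
        for rest in combinations(range(1, n), size):
            side_a = [0] + list(rest)
            side_b = [x for x in range(n) if x not in side_a]
            index = isolation_index(dist, side_a, side_b)
            if index > 0:
                splits[frozenset(names[x] for x in side_a)] = index
    return splits


# Toy tree ((a,b),(c,d)): pendant branches of length 1, internal branch 2.
NAMES = ["a", "b", "c", "d"]
DIST = [
    [0, 2, 4, 4],
    [2, 0, 4, 4],
    [4, 4, 0, 2],
    [4, 4, 2, 0],
]

if __name__ == "__main__":
    for side, weight in sorted(d_splits(NAMES, DIST).items(),
                               key=lambda item: sorted(item[0])):
        print(sorted(side), weight)
```

For distances that fit a tree exactly, as in this toy matrix, the retained splits correspond to the branches of the tree and the isolation indices to the branch lengths. Incompatible splits with positive index are what give rise to a network rather than a tree.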
Given the clonal origin of an HIV infection represented by the node with most branches at an early time-point after the primary infection, it was possible to make a reliable estimation of the number of ns and s substitutions along the evolutionary path of an HIV-1 nef quasispecies in a single individual (Plikat et al., 1997). By comparison with the ns and s values expected from a neutral model, it was concluded that drift plays a prominent role in HIV evolution. Here, the approach has been applied to 73 HIV-1 datasets comprising the nef, env and gag genes, encompassing approximately 1000 sequences. As judged by the proportions of ns and s substitutions, negative selection and random processes are uppermost in shaping the evolution of virus quasispecies in vivo.
Methods |
---|
Analysis of the sequence data.
Nucleic acid sequences were aligned by using the multiple sequence alignment algorithm as implemented in Clustal W (Thompson et al., 1994). Gap penalty parameters were set to 3·0 for opening a new gap and 0·05 for extension of an existing gap and the output format was set to MSF. The sequence alignments were translated to NEXUS format by using a modified version of READSEQ and used as input for SplitsTree 2.4 (Bandelt & Dress, 1992; Huson, 1998; Thompson et al., 1994). For each dataset, the most parsimonious path was mapped out on the phylogram and substitutions were scored in terms of non-synonymous and synonymous substitutions as well as transitions and transversions.
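The scoring of transitions and transversions along one step of such a path can be sketched as below. This is an illustrative Python fragment, not the procedure actually used for the datasets; the function names are hypothetical, and gaps or ambiguous characters are simply skipped, which is an assumption.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
NUCLEOTIDES = PURINES | PYRIMIDINES


def classify_substitution(base_from, base_to):
    """Return 'transition', 'transversion' or None (no scorable change)."""
    a, b = base_from.upper(), base_to.upper()
    if a == b or a not in NUCLEOTIDES or b not in NUCLEOTIDES:
        return None  # identical, gap or ambiguous character
    # Purine<->purine and pyrimidine<->pyrimidine changes are transitions.
    same_class = (a in PURINES) == (b in PURINES)
    return "transition" if same_class else "transversion"


def count_ts_tv(seq_from, seq_to):
    """Count transitions and transversions between two aligned sequences."""
    if len(seq_from) != len(seq_to):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"transition": 0, "transversion": 0}
    for a, b in zip(seq_from, seq_to):
        kind = classify_substitution(a, b)
        if kind is not None:
            counts[kind] += 1
    return counts


if __name__ == "__main__":
    print(count_ts_tv("ACGT-A", "GCTT-C"))
```

Summing these counts over all branches of the mapped path gives a transition/transversion ratio for the dataset.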
The expected numbers of ns and s mutations were calculated for one sequence in each dataset, assuming all substitutions to be equally probable. Alternatively, they were calculated by imposing a transition/transversion ratio derived from each dataset in a manner described previously (Plikat et al., 1997). Any deviation of the observed values from these theoretical values was tested for significance by means of a χ²-test. Datasets for which χ² ≥ 3·841 (P<0·05) were interpreted as being under selection. If ns substitutions were significantly more or less frequent than expected, the genomes would be considered to be under positive or negative (purifying) selection, respectively. Since scoring transitions and transversions does not require knowledge of the genome that initiated the infection, this method is particularly useful for analysing unrooted datasets.
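A minimal Python sketch of this comparison is given below for the case in which all substitutions are equally probable. It is an illustration under stated assumptions rather than the original implementation: the standard genetic code is used, changes that create a stop codon are counted as non-synonymous, incomplete or ambiguous codons are skipped, and all function names are hypothetical. The threshold of 3.841 is the critical value quoted above.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}
CRITICAL_VALUE = 3.841  # chi-square, one degree of freedom, P = 0.05


def expected_ns_fraction(seq):
    """Fraction of all single-nucleotide changes that alter the amino acid."""
    seq = seq.upper().replace("U", "T")
    ns = total = 0
    for start in range(0, len(seq) - 2, 3):
        codon = seq[start:start + 3]
        if codon not in CODON_TABLE or CODON_TABLE[codon] == "*":
            continue  # ambiguous codon or stop codon: skipped (assumption)
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                total += 1
                if CODON_TABLE[mutant] != CODON_TABLE[codon]:
                    ns += 1
    if total == 0:
        raise ValueError("no scorable codons in sequence")
    return ns / total


def chi_square_ns_s(obs_ns, obs_s, p_ns):
    """Goodness-of-fit statistic for observed ns/s counts against p_ns."""
    n = obs_ns + obs_s
    exp_ns = n * p_ns
    exp_s = n * (1.0 - p_ns)
    return (obs_ns - exp_ns) ** 2 / exp_ns + (obs_s - exp_s) ** 2 / exp_s


def classify(obs_ns, obs_s, p_ns):
    """Interpret the observed counts as described in the text."""
    if chi_square_ns_s(obs_ns, obs_s, p_ns) < CRITICAL_VALUE:
        return "no significant deviation"
    expected_ns = (obs_ns + obs_s) * p_ns
    return "positive selection" if obs_ns > expected_ns else "negative selection"


if __name__ == "__main__":
    p = expected_ns_fraction("ATGGGTTTTAAA")  # hypothetical toy sequence
    print(round(p, 3), classify(60, 40, p))
```

Imposing a dataset-specific transition/transversion ratio would amount to weighting each of the nine possible changes per codon instead of counting them equally.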
Results |
---|
For the sequences published by McDonald et al. (1997), CD4+ cell counts were given corresponding to the date of final sequence sampling. Plotting the CD4+ cell counts against the χ² values determined for each of these sequence sets shows that there was essentially no difference between the modes of evolution in patients with CD4+ counts around 400 per µl and those with CD4+ counts in ranges defining AIDS (data not shown).
A number of studies have shown that the accumulation of HIV-1 genetic diversity over time is essentially linear (Gojobori et al., 1990; Plikat et al., 1997). Table 1 includes a number of datasets from several individuals over time. Although the founding genome was not known, it was interesting to investigate the manner in which cross-sectional diversity increased over time for these patients. The mean numbers of ns and s substitutions per base are shown as a function of time after seroconversion in Fig. 3.
Discussion |
---|
A number of these datasets have revealed evidence of positive selection (Holmes et al., 1992; Liu et al., 1997; Simmonds et al., 1991; Wolinsky et al., 1996; Zhang et al., 1997), while the present analysis failed to do so. In these previous studies, ns and s values were invariably derived from 2×2 comparisons of sequences, without taking into account the phylogenetic relationships between the sequences. There is no obvious reason why 2×2 analyses should have a preferential effect on non-synonymous as opposed to synonymous substitutions. However, as phylogenetic reconstruction identifies a smaller number of mutations in a dataset compared with a 2×2 analysis, the statistical significance of the difference between observed and expected values for ns or s substitutions becomes a major issue. For example, for a random model of mutation, approximately 80% of substitutions are non-synonymous. As the number of observed mutations becomes smaller, it clearly becomes harder to distinguish the observed distribution of ns and s substitutions from that expected. Conversely, any method that increases the absolute values of ns and s might tend to establish statistical significance more frequently than warranted.
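The dependence on the number of observed mutations can be made explicit with a short calculation. In the sketch below, n is the total number of scored substitutions, p_e = 0.8 is the expected non-synonymous fraction quoted above, and 3.841 is the critical value used in Methods; the observed fraction p_o = 0.6 is a hypothetical value chosen purely for illustration.

```latex
\[
\chi^2 = \frac{(O_{ns}-E_{ns})^2}{E_{ns}} + \frac{(O_{s}-E_{s})^2}{E_{s}}
       = \frac{n\,(p_o - p_e)^2}{p_e\,(1 - p_e)}
\]
\[
p_e = 0.8,\; p_o = 0.6:\qquad
\chi^2 = \frac{n \times 0.04}{0.16} = \frac{n}{4},
\qquad
\frac{n}{4} \ge 3.841 \;\Longleftrightarrow\; n \ge 16
\]
```

With the same proportions, ten scored substitutions give a value of 2.5, which is not significant, whereas one hundred give a value of 25. The statistic grows linearly with n, which is why inflated pairwise counts can reach significance where a phylogenetically reduced count does not.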
However, it is not just a question of statistical significance, because many of the datasets involve large numbers of observed ns and s substitutions (Table 1). The question then becomes: which method more accurately describes the biological process? Given that diversification is a case of descent with modification, phylogenetic reconstruction would appear warranted. Indeed, it is increasingly being used in assessing the significance of ns and s mutations as judged by codon-based methods (Nielsen & Yang, 1998; Yamaguchi-Kabata & Gojobori, 2000; Zanotto et al., 1999). Such analyses have shown some evidence of positive selection. The difference is that these studies are of much higher resolution, asking questions about individual codons, whereas analyses of whole regions are of lower resolution. Taken together, this suggests that the majority of substitutions are not positively selected, while a small fraction might well be (Sala & Wain-Hobson, 2000).
Among these intrapatient datasets, there was clearly a great deal of genetic noise. In other words, there was little evidence of fixation of mutations in the virus quasispecies. The reasons for this may be many, although one variable could be the half-life of HIV-infected resting T cells in the peripheral blood, the preferred source of material for all of the studies cited above. These half-lives are variably cited in terms of months to years (Michie et al., 1992; Perelson et al., 1996, 1997). Hence, the time required to observe fixation may need to be several half-lives. Another variable could be the rapid expansion of some populations of variants that spill over into the periphery. The most striking observation concerns the dynamics of defective proviruses: occasionally 40% of genomes can be defective at a single site at a given time-point. A few months earlier or later, the proportion may be 5% or less (Martins et al., 1991).
Another unknown in the assessment of mutation frequencies and mutation matrices from patient-derived datasets is the effect of recombination. In the accompanying study of longitudinal SIV sequence variation, Cheynier et al. (2001) show that the frequency of deletions in an intrapatient dataset is comparable to that for transversions. Jetzt et al. (2000) have determined the ex vivo recombination rate to be about 2–3 recombination events per genome per cycle. This is approximately tenfold higher than the point substitution rate (Mansky & Temin, 1995). A priori, recombination should not favour ns or s substitutions per se. However, in a situation where the recombination rate is high compared with the substitution rate, it might be argued that the majority of homoplasies result from recombination.
In conclusion, a strong element of genetic noise is in evidence among intrapatient HIV quasispecies. Although more precise methods do yield evidence of positive selection, it would seem that the majority of substitutions observed in a dataset are unselected or are under negative selection.
Acknowledgments |
---|
References |
---|
Bandelt, H. J. & Dress, A. W. (1992). Split decomposition: a new and useful approach to phylogenetic analysis of distance data. Molecular Phylogenetics and Evolution 1, 242-252.
Brown, A. J. & Cleland, A. (1996). Independent evolution of the env and pol genes of HIV-1 during zidovudine therapy. AIDS 10, 1067-1073.
Brown, A. J., Lobidel, D., Wade, C. M., Rebus, S., Phillips, A. N., Brettle, R. P., France, A. J., Leen, C. S., McMenamin, J., McMillan, A., Maw, R. D., Mulcahy, F., Robertson, J. R., Sankar, K. N., Scott, G., Wyld, R. & Peutherer, J. F. (1997). The molecular epidemiology of human immunodeficiency virus type 1 in six cities in Britain and Ireland. Virology 235, 166-177.
Cheynier, R., Kils-Hütten, L., Meyerhans, A. & Wain-Hobson, S. (2001). Insertion/deletion frequencies match those of point mutations in the hypervariable regions of the simian immunodeficiency virus surface envelope gene. Journal of General Virology 82, 1613-1619.
Delassus, S., Cheynier, R. & Wain-Hobson, S. (1991). Evolution of human immunodeficiency virus type 1 nef and long terminal repeat sequences over 4 years in vivo and in vitro. Journal of Virology 65, 225-231.
Donaldson, Y. K., Bell, J. E., Holmes, E. C., Hughes, E. S., Brown, H. K. & Simmonds, P. (1994). In vivo distribution and cytopathology of variants of human immunodeficiency virus type 1 showing restricted sequence variability in the V3 loop. Journal of Virology 68, 5991-6005.
Dopazo, J., Dress, A. W. M. & von Haeseler, A. (1993). Split decomposition: a technique to analyze viral evolution. Proceedings of the National Academy of Sciences, USA 90, 10320-10324.
Gojobori, T., Moriyama, E. N. & Kimura, M. (1990). Molecular clock of viral evolution, and the neutral theory. Proceedings of the National Academy of Sciences, USA 87, 10015-10018.
Holmes, E. C., Zhang, L. Q., Simmonds, P., Ludlam, C. A. & Brown, A. J. (1992). Convergent and divergent sequence evolution in the surface envelope glycoprotein of human immunodeficiency virus type 1 within a single infected patient. Proceedings of the National Academy of Sciences, USA 89, 4835-4839.
Holmes, E. C., Zhang, L. Q., Robertson, P., Cleland, A., Harvey, E., Simmonds, P. & Leigh Brown, A. J. (1995). The molecular epidemiology of human immunodeficiency virus type 1 in Edinburgh. Journal of Infectious Diseases 171, 45-53.
Hughes, E. S., Bell, J. E. & Simmonds, P. (1997). Investigation of the dynamics of the spread of human immunodeficiency virus to brain and other tissues by evolutionary analysis of sequences from the p17gag and env genes. Journal of Virology 71, 1272-1280.
Huson, D. H. (1998). SplitsTree: analyzing and visualizing evolutionary data. Bioinformatics 14, 68-73.
Jetzt, A. E., Yu, H., Klarmann, G. J., Ron, Y., Preston, B. D. & Dougherty, J. P. (2000). High rate of recombination throughout the human immunodeficiency virus type 1 genome. Journal of Virology 74, 1234-1240.
Leitner, T., Kumar, S. & Albert, J. (1997). Tempo and mode of nucleotide substitutions in gag and env gene fragments in human immunodeficiency virus type 1 populations with a known transmission history. Journal of Virology 71, 4761-4770.
Liu, S. L., Schacker, T., Musey, L., Shriner, D., McElrath, M. J., Corey, L. & Mullins, J. I. (1997). Divergent patterns of progression to AIDS after infection from the same source: human immunodeficiency virus type 1 evolution and antiviral responses. Journal of Virology 71, 4284-4295.
McDonald, R. A., Mayers, D. L., Chung, R. C., Wagner, K. F., Ratto-Kim, S., Birx, D. L. & Michael, N. L. (1997). Evolution of human immunodeficiency virus type 1 env sequence variation in patients with diverse rates of disease progression and T-cell function. Journal of Virology 71, 1871-1879.
Mansky, L. M. & Temin, H. M. (1995). Lower in vivo mutation rate of human immunodeficiency virus type 1 than that predicted from the fidelity of purified reverse transcriptase. Journal of Virology 69, 5087-5094.
Martins, L. P., Chenciner, N., Asjo, B., Meyerhans, A. & Wain-Hobson, S. (1991). Independent fluctuation of human immunodeficiency virus type 1 rev and gp41 quasispecies in vivo. Journal of Virology 65, 4502-4507.
Michie, C. A., McLean, A., Alcock, C. & Beverley, P. C. (1992). Lifespan of human lymphocyte subsets defined by CD45 isoforms. Nature 360, 264-265.
Nei, M. & Gojobori, T. (1986). Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Molecular Biology and Evolution 3, 418-426.
Nielsen, R. & Yang, Z. (1998). Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics 148, 929-936.
Perelson, A. S., Neumann, A. U., Markowitz, M., Leonard, J. M. & Ho, D. D. (1996). HIV-1 dynamics in vivo: virion clearance rate, infected cell life-span, and viral generation time. Science 271, 1582-1586.
Perelson, A. S., Essunger, P., Cao, Y., Vesanen, M., Hurley, A., Saksela, K., Markowitz, M. & Ho, D. D. (1997). Decay characteristics of HIV-1-infected compartments during combination therapy. Nature 387, 188-191.
Plikat, U., Nieselt-Struwe, K. & Meyerhans, A. (1997). Genetic drift can dominate short-term human immunodeficiency virus type 1 nef quasispecies evolution in vivo. Journal of Virology 71, 4233-4240.
Sala, M. & Wain-Hobson, S. (2000). Are RNA viruses adapting or merely changing? Journal of Molecular Evolution 51, 12-20.
Simmonds, P., Zhang, L. Q., McOmish, F., Balfe, P., Ludlam, C. A. & Brown, A. J. (1991). Discontinuous sequence change of human immunodeficiency virus (HIV) type 1 env sequences in plasma viral and lymphocyte-associated proviral populations in vivo: implications for models of HIV pathogenesis. Journal of Virology 65, 6266-6276.
Thompson, J. D., Higgins, D. G. & Gibson, T. J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research 22, 4673-4680.
Wolfs, T. F., Zwart, G., Bakker, M., Valk, M., Kuiken, C. L. & Goudsmit, J. (1991). Naturally occurring mutations within HIV-1 V3 genomic RNA lead to antigenic variation dependent on a single amino acid substitution. Virology 185, 195-205.
Wolinsky, S. M., Korber, B. T., Neumann, A. U., Daniels, M., Kunstman, K. J., Whetsell, A. J., Furtado, M. R., Cao, Y., Ho, D. D. & Safrit, J. T. (1996). Adaptive evolution of human immunodeficiency virus-type 1 during the natural course of infection. Science 272, 537-542.
Yamaguchi-Kabata, Y. & Gojobori, T. (2000). Reevaluation of amino acid variability of the human immunodeficiency virus type 1 gp120 envelope glycoprotein and prediction of new discontinuous epitopes. Journal of Virology 74, 4335-4350.
Zanotto, P. M., Kallas, E. G., de Souza, R. F. & Holmes, E. C. (1999). Genealogical evidence for positive selection in the nef gene of HIV-1. Genetics 153, 1077-1089.
Zhang, L., Diaz, R. S., Ho, D. D., Mosley, J. W., Busch, M. P. & Mayer, A. (1997). Host-specific driving force in human immunodeficiency virus type 1 evolution in vivo. Journal of Virology 71, 2555-2561.
Received 20 October 2000; accepted 8 March 2001.