Sustained G->A hypermutation during reverse transcription of an entire human immunodeficiency virus type 1 strain Vau group O genome

Jean-Pierre Vartanian, Michel Henry and Simon Wain-Hobson

Unité de Rétrovirologie Moléculaire, Institut Pasteur, 28 rue du Dr Roux, 75724 Paris cedex 15, France

Author for correspondence: Simon Wain-Hobson. Fax +33 1 45 68 88 74. e-mail simon{at}pasteur.fr


   Abstract
 
Two full-length human immunodeficiency virus type 1 O sequences are described, one of which was hypermutated in all regions of the genome. This indicates that the intracellular [dTTP]/[dCTP] bias conducive to G->A hypermutation may be sustained throughout the synthesis of minus-strand DNA. In turn, this suggests the possibility of mutation of host sequences.


   Main text
 
Retroviral G->A hypermutation takes place during reverse transcription when the intracellular dTTP concentration greatly exceeds that of dCTP (Martinez et al., 1994; Vartanian et al., 1997). This results in extensive and monotonous incorporation of dT opposite rG, leading to G->A substitutions. It has been found in a wide variety of retroviruses and in hepatitis B virus, a hepadnavirus that also replicates by reverse transcription (Günther et al., 1997), although the degree of hypermutation is never greater than for the lentiviruses, where occasional segments may have 60% of Gs substituted (Janini et al., 2001; Vartanian et al., 1991; Wain-Hobson et al., 1995). Hypermutation may be reproduced in vitro (Martinez et al., 1994), during strong stop synthesis and in culture, again by modulating the [dTTP]/[dCTP] ratio (Vartanian et al., 1997). It occurs essentially during minus-strand DNA synthesis, although occasional genomes with hypermutated plus-strands have been noted (Vartanian et al., 1991). Given that minus-strand synthesis takes at least an hour and that elongation beyond a mismatch, albeit facile for the human immunodeficiency virus type 1 (HIV-1) reverse transcriptase (RT), slows down synthesis (Martinez et al., 1994; Richetti & Buc, 1990), hypermutation throughout minus-strand synthesis would indicate that dNTP pool imbalances can be maintained for considerable periods of time and do not reflect rapid fluctuations around steady state concentrations. One complete caprine arthritis–encephalitis virus (CAEV) genome indicated that hypermutation occurred erratically, with hypermutated segments interspersed with non-mutated regions (Wain-Hobson et al., 1995). Borman et al. (1995) described 2·4 kb of sequence data derived from a complete cloned HIV-1 O (strain Vau) provirus. Segments of env and int were highly hypermutated.

To see if G->A hypermutation can be sustained over 10 kb of minus-strand DNA synthesis, this clone (Borman et al., 1995) has been completely sequenced. All regions of the provirus were hypermutated, including the two LTRs. Given that a typical HIV-1 base composition is approximately 36% A, 23% G, 19% C and 22% T and that of the hypermutated Vau genome is approximately 43% A, 16% G, 19% C and 22% T, this would suggest that approximately 31% of Gs were substituted. However, given the paucity of HIV-1 O sequences, as well as their divergence, the extent of hypermutation was difficult to ascertain by comparison with other isolates. Accordingly, a complete non-hypermutated Vau sequence was assembled by sequencing four fragments amplified from the same Hirt DNA preparation from which the hypermutated provirus was cloned. Primers were designed from what appeared to be non-hypermutated regions. The fragments were cloned into a TOPO TA cloning vector and the inserts sequenced. Both strands were covered and all differences with respect to the two full-length published sequences, MVP5180 (Gürtler et al., 1994) and ANT70 (Vanden Haesevelde et al., 1994), were checked on the fluorograms. Accession numbers for the two Vau sequences are AF407418 and AF407419. Alternatively they can be found at ftp.pasteur.fr/pub/retromol/Vau.
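The compositional estimate above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the whole drop in G content is due to G->A substitution, which is consistent with the matching 7-point rise in A. With the rounded percentages quoted in the text the estimate comes out at about 30%, close to the 31% obtained from the direct count of 680 substitutions out of 2189 Gs.

```python
def fraction_of_g_lost(g_typical: float, g_observed: float) -> float:
    """Fraction of G residues lost, assuming every lost G became an A."""
    if g_typical <= 0:
        raise ValueError("typical G content must be positive")
    return (g_typical - g_observed) / g_typical


# Rounded base compositions (percent) quoted in the text.
typical_hiv1 = {"A": 36, "G": 23, "C": 19, "T": 22}
hypermutated_vau = {"A": 43, "G": 16, "C": 19, "T": 22}

# Estimate from composition alone.
fraction_from_composition = fraction_of_g_lost(
    typical_hiv1["G"], hypermutated_vau["G"]
)

# Direct count reported in the text: 680 of 2189 Gs substituted.
fraction_from_counts = 680 / 2189

# The loss of G should be mirrored by the gain of A if G->A dominates.
g_loss = typical_hiv1["G"] - hypermutated_vau["G"]
a_gain = hypermutated_vau["A"] - typical_hiv1["A"]

print(f"from composition: {fraction_from_composition:.1%}")
print(f"from counts:      {fraction_from_counts:.1%}")
print(f"G loss = {g_loss} points, A gain = {a_gain} points")
```

The two estimates differ slightly because the compositions are rounded to whole percentages and because the typical composition is an average over HIV-1 genomes rather than the Vau reference itself.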

The extent of G->A hypermutation is shown in Fig. 1 in the form of differences with respect to the reference Vau sequence. A total of 680 out of 2189 Gs were substituted, or 31%, while the ratio of G->A/A->G transitions was 680/13. As is typical, substitutions occurred mainly in GpG (60%) and to a lesser extent in GpA (34%) dinucleotides, to the detriment of GpT (3%) and GpC (3%) (Borman et al., 1995; Fitzgibbon et al., 1993; Vartanian et al., 1991, 1994). Many runs of G were substituted, reflecting once again the ability of the RT to elongate beyond mismatches, even runs of three to four mismatches.
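A tally of this kind can be sketched as follows. This is a minimal illustration, not the procedure used for the analysis: the function name `tally_g_to_a` is hypothetical, the two toy sequences are invented, and the code assumes an ungapped plus-strand alignment of equal length in which the 3' neighbour is read from the reference sequence.

```python
from collections import Counter


def tally_g_to_a(reference: str, mutant: str) -> dict:
    """Count G->A and A->G differences and the GpN context of each G->A.

    Assumes both sequences are aligned, ungapped and of equal length.
    The dinucleotide context is the substituted G plus its 3' neighbour
    in the reference sequence.
    """
    if len(reference) != len(mutant):
        raise ValueError("sequences must have the same length")
    ref, mut = reference.upper(), mutant.upper()

    g_to_a = 0
    a_to_g = 0
    context = Counter()
    for i, (r, m) in enumerate(zip(ref, mut)):
        if r == "G" and m == "A":
            g_to_a += 1
            if i + 1 < len(ref):  # the last base has no 3' neighbour
                context["Gp" + ref[i + 1]] += 1
        elif r == "A" and m == "G":
            a_to_g += 1

    return {
        "total_g": ref.count("G"),
        "g_to_a": g_to_a,
        "a_to_g": a_to_g,
        "context": dict(context),
    }


# Invented toy sequences, for illustration only.
toy_reference = "ATGGGACGTGA"
toy_mutant = "ATAAGACGTAA"
result = tally_g_to_a(toy_reference, toy_mutant)
print(result)
print(f"{result['g_to_a'] / result['total_g']:.0%} of Gs substituted")
```

Applied to the two Vau sequences, the same ratio would be 680/2189, or 31%. Reading the 3' neighbour from the reference rather than from the hypermutated sequence matters in runs of G, where the neighbouring G may itself have been substituted.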



 
Fig. 1. Hypermutated HIV-1Vau O genome shown as differences with respect to the non-mutated sequence. Dots indicate identities. The start of U3, R, U5 and open reading frames are indicated.

 
Turning to the complete HIV-1Vau sequence, once again Vpu and Env proved to be the most variable proteins, differing by approximately 25–30% among O group isolates. By comparison, the Gag, Pol, Vif and Nef proteins varied by approximately 10–20%. Inevitably, with three complete O group sequences, specific motifs start to come to the fore. The differences are subtle: for example, the stem and loop involved in ribosomal frameshifting between gag and pol are shorter and larger, respectively, compared to M and N group sequences. Rev is 11 residues shorter at its carboxy terminus compared to most HIV-1 M sequences, with the exception of clade C sequences.

Not surprisingly, Env variation is particularly evident within gp120. The V4 hypervariable region of all published HIV-1 O Envs shows an additional disulphide bridge (Fig. 2). This is intriguing given that a similar situation pertains to a fraction of HIV-1 A/E clade viruses circulating in Thailand (Fig. 2), although its precise location is slightly different (McCutchan et al., 1996). It represents a case of parallel evolution, for this is the only feature in common with the A/E viruses. This extra disulphide bridge is absent in N and SIVcpz strains. Unfortunately, this region of Env was insufficiently ordered in the crystal structure to permit analysis.



 
Fig. 2. Primary and secondary structures of the Env V4 region. (a) Sequence alignment of the V4 Env region and flanking sequences derived from all HIV-1 O sequences as well as two HIV-1 M A/E clade sequences from Thailand and Central African Republic. (b) Proposed secondary structures of the V4 loops based on that of HIV-1 Lai (Wain-Hobson et al., 1985). The three disulphide bridges are numbered.

 
Only for Env are there numerous HIV-1 O sequences. These were aligned using the program CLUSTAL W (Thompson et al., 1994). Phylogenetic trees were derived by neighbour-joining analysis applied to pairwise sequence distances calculated using the Kimura two-parameter method, generating unrooted trees (Felsenstein, 1989). A phylogenetic tree based on a selection of HIV-1 env sequences is shown in Fig. 3. The Vau sequence clusters well with the other O sequences at the base of the radiation. Analysis of the gag and pol sequences showed that Vau clustered in the same position, indicating that it is not a recombinant (data not shown). As noted before, the radius of the O group sequences is comparable to that of the M group viruses. The radii for full-length HIV-1 O Gag and Pol trees were comparable to those of HIV-1 M (not shown), indicating that the two expansions are generally in phase (Charneau et al., 1994).
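For readers unfamiliar with the Kimura two-parameter distance, the sketch below shows how it is obtained from one aligned pair of sequences. It is an illustration under stated assumptions rather than the PHYLIP implementation: the function name `k2p_distance` is hypothetical, the example sequences are invented, and positions containing gaps or ambiguity codes are simply skipped. With P the proportion of transitions and Q the proportion of transversions, the distance is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)].

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Positions where either sequence is not A, C, G or T are ignored.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine

    if compared == 0:
        raise ValueError("no comparable positions")

    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))


# Invented example: one transition (A/G) and one transversion (A/C) in 10 sites.
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAC")
print(f"K2P distance: {d:.4f}")
```

Because the correction treats transitions and transversions separately, a sequence carrying many G->A transitions would be expected to give an inflated distance, which is one reason to build trees from non-hypermutated sequences.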



 
Fig. 3. Phylogenetic trees of all available complete HIV-1 O env sequences as well as SIVcpz and HIV-1 M reference sequences. Nucleic acid sequences were aligned by CLUSTAL W (Thompson et al., 1994) while trees were calculated by neighbour-joining analysis applied to pairwise sequence distances calculated using the Kimura two-parameter method (Felsenstein, 1989). Branch lengths are drawn to scale with the bar indicating 0·1 nucleotide replacements per site. The final output was generated with TREEVIEW (Page, 1996). The number at each node represents the percentage of bootstrap replicates (out of 100). Only bootstrap values > 95 are given. GenBank accession numbers are in parentheses: HIV-1 group M, clade A (L39106), clade B (Lai, K02013), clade C (U52953), clade D (M27372), clade E (U54771), clade F (U88826), clade G (U88826), HIV-1 group O, ANT70 (L20587), MVP5180 (L20571), 655Ha (U82993), 276Ha (U82991), 193Ha (U82990), 341Ha (U82992), HIV-1 group N, YBF30 (AJ006022), SIVcpzGab (X52154), SIVcpzANT (U42720), SIVCAM-3 (AF115393), SIVcpzUS (AF103818), SIVlhoest (AF075269), SIVsun (AF131870), SIVmndGB1 (M27470), SIVstm (M83293), SIVmn (M16403), HIV-2-EHO (U27200), HIV-2-ROD (M15390), SIVagmVER155 (M29975), SIVagmVER3 (M30931), SIVagmTYO-1 (X07805), SIVagmGRI677 (M584101), SIVsab (U04005), SIVsyk (L06042) and SIVcolCGU1 (AF301156).

 
That hypermutation occurred throughout the entire provirus indicates that highly biased dNTP pools can be maintained for a few hours. This follows from the knowledge that proviral synthesis takes at least an hour, while elongation after a mismatch is slow relative to elongation from a correctly matched primer terminus (Sala et al., 1995). Given the large number of mismatches, minus-strand polymerization would be expected to be slowed down considerably. What this means for chromosomal DNA synthesis is less clear. Certainly, the frequency of mutation resulting from biased dNTP pools would be greatly reduced by proofreading and mismatch repair, something unavailable to a retrovirus. However, given that G:T mismatches are corrected to A:T rather than G:C at a frequency of approximately 1:20 (Brown & Jiricny, 1987), it would seem possible that if a [dTTP]/[dCTP] bias could be maintained for a few hours, mutation might well occur in the host cell genome (Gojobori & Yokoyama, 1987; Krawczak et al., 1995). Fluctuations of dNTP pools might provide a biochemical link underlying oncogenesis, spontaneous and hereditary diseases as well as retroviral hypermutation.


   Acknowledgments
 
We would like to thank François Clavel for the lambda clone, Eric Pelletier for sequencing and Francine McCutchan for an electronic precursor to Fig. 2. This work was supported by grants from the Institut Pasteur and l’Agence Nationale pour la Recherche sur le SIDA.


   References
 
Borman, A. M., Quillent, C., Charneau, P., Kean, C. M. & Clavel, F. (1995). A highly defective HIV group O provirus: evidence for the role of local sequence determinants in hypermutation during negative strand DNA synthesis. Virology 208, 601-609.

Brown, T. C. & Jiricny, J. (1987). A specific mismatch repair event protects mammalian cells from loss of 5-methylcytosine. Cell 50, 945-950.

Charneau, P., Borman, A. M., Quillent, C., Guetard, D., Chamaret, S., Cohen, J., Remy, G., Montagnier, L. & Clavel, F. (1994). Isolation and envelope sequence of a highly divergent HIV-1 isolate: definition of a new HIV-1 group. Virology 205, 247-253.

Felsenstein, J. (1989). PHYLIP: Phylogeny Inference Package (version 3.2). Cladistics 5, 164-166.

Fitzgibbon, J. E., Mazar, S. & Dubin, D. T. (1993). A new type of G->A hypermutation affecting human immunodeficiency virus. AIDS Research and Human Retroviruses 9, 833-838.

Gojobori, T. & Yokoyama, S. (1987). Molecular evolutionary rates of oncogenes. Journal of Molecular Evolution 26, 148-156.

Günther, S., Sommer, G., Plikat, U., Wain-Hobson, S., Will, H. & Meyerhans, A. (1997). Naturally occurring hepatitis B virus subgenomes bearing the hallmarks of retroviral G->A hypermutation. Virology 235, 104-108.

Gürtler, L. G., Hauser, P. H., Eberle, J., von Brunn, A., Knapp, S., Zekeng, L., Tsague, J. M. & Kaptue, L. (1994). A new subtype of human immunodeficiency virus type 1 (MVP-5180) from Cameroon. Journal of Virology 68, 1581-1585.

Janini, M., Rogers, M., Birx, D. R. & McCutchan, F. E. (2001). Human immunodeficiency virus type 1 DNA sequences genetically damaged by hypermutation are often abundant in patient peripheral blood mononuclear cells and may be generated during near-simultaneous infection and activation of CD4+ T cells. Journal of Virology 75, 7973-7986.

Krawczak, M., Smith-Sorensen, B., Schmidtke, J., Kakkar, V. V., Cooper, D. N. & Hovig, E. (1995). Somatic spectrum of cancer-associated single basepair substitutions in the TP53 gene is determined mainly by endogenous mechanisms of mutation and selection. Human Mutation 5, 48-57.

McCutchan, F. E., Artenstein, A. W., Sanders-Buell, E., Salminen, M. O., Carr, J. K., Mascola, J. R., Yu, X. F., Nelson, K. E., Khamboonruang, C., Schmitt, D., Kieny, M. P., McNeil, J. G. & Burke, D. S. (1996). Diversity of the envelope glycoprotein among human immunodeficiency virus type 1 isolates of clade E from Asia and Africa. Journal of Virology 70, 3331-3338.

Martinez, M. A., Vartanian, J. P. & Wain-Hobson, S. (1994). Hypermutagenesis of RNA using human immunodeficiency virus type 1 reverse transcriptase and biased dNTP concentrations. Proceedings of the National Academy of Sciences, USA 91, 11787-11791.

Page, R. D. M. (1996). Treeview: an application to display phylogenetic trees on personal computers. Computer Applications in the Biosciences 12, 357-358.

Richetti, M. & Buc, H. (1990). Reverse transcriptases and genomic variability: the accuracy of DNA replication is enzyme specific and sequence dependent. EMBO Journal 9, 1583-1593.

Sala, M., Wain-Hobson, S. & Schaeffer, F. (1995). HIV-1 reverse transcriptase tG:T mispair formation on RNA and DNA templates with mismatched primers: a kinetic and thermodynamic study. EMBO Journal 14, 4622-4627.

Thompson, J., Higgins, D. & Gibson, T. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research 22, 4673-4680.

Vanden Haesevelde, M., Decourt, J. L., De Leys, R. J., Vanderborght, B., van der Groen, G., van Heuverswijn, H. & Saman, E. (1994). Genomic cloning and complete sequence analysis of a highly divergent African human immunodeficiency virus isolate. Journal of Virology 68, 1586-1596.

Vartanian, J. P., Meyerhans, A., Asjo, B. & Wain-Hobson, S. (1991). Selection, recombination, and G->A hypermutation of human immunodeficiency virus type 1 genomes. Journal of Virology 65, 1779-1788.

Vartanian, J. P., Meyerhans, A., Sala, M. & Wain-Hobson, S. (1994). G->A hypermutation of the human immunodeficiency virus type 1 genome: evidence for dCTP pool imbalance during reverse transcription. Proceedings of the National Academy of Sciences, USA 91, 3092-3096.

Vartanian, J. P., Plikat, U., Maheux, R., Guillemot, L., Meyerhans, A. & Wain-Hobson, S. (1997). HIV genetic variability is directed and restricted by DNA precursor availability. Journal of Molecular Biology 270, 139-151.

Wain-Hobson, S., Sonigo, P., Danos, O., Cole, S. & Alizon, M. (1985). Nucleotide sequence of the AIDS virus, LAV. Cell 40, 9-17.

Wain-Hobson, S., Sonigo, P., Guyader, M., Gazit, A. & Henry, M. (1995). Erratic G->A hypermutation within a complete caprine arthritis–encephalitis virus (CAEV) provirus. Virology 209, 297-303.

Received 9 October 2001; accepted 26 November 2001.