Development of a homology model for clade A human immunodeficiency virus type 1 gp120 to localize temporal substitutions arising in recently infected women

Mary Poss1, David C. Holley2, Roman Biek1, Harold Cox3 and John Gerdes3

1 Division of Biological Sciences, University of Montana, Missoula, MT 59812, USA
2 Department of Pharmaceutical Sciences, University of Montana, Missoula, MT 59812, USA
3 Department of Chemistry, University of Montana, Missoula, MT 59812, USA

Correspondence
Mary Poss
mary.poss{at}umontana.edu


   ABSTRACT
 
The virus population transmitted by a human immunodeficiency virus type 1 (HIV-1) infected individual undergoes restriction and subsequent diversification in the new host. However, in contrast to men, who have limited virus diversity at seroconversion, there is measurable diversity in viral envelope gene sequences in women infected with clade A HIV-1. In this study, virus sequence diversity in three unrelated, clade A infected women preceding and shortly after seroconversion was evaluated. It was demonstrated that there is measurable evolution of envelope gene sequences over this time interval. Furthermore, in each of the three individuals, amino acid substitutions arose at five or six positions in sequences derived at or shortly after seroconversion relative to sequences obtained from the seronegative sample. Presented here is a model of clade A gp120 to determine the location of substitutions that appeared as the virus population became established in three clade A HIV-1 infected women.


   MAIN TEXT
 
Transmission of human immunodeficiency virus type 1 (HIV-1) from an infected to a naïve host causes significant changes in the genetic make-up of the virus population. HIV-1-infected individuals may harbour up to 10 % virus diversity in the viral envelope gene (env), depending on the length of infection, but only a subset of these variants initiate the new infection. The inoculum transmitted to a recipient contains diverse genotypes (Learn et al., 2002) but in men the variant pool detected close to the time of seroconversion is essentially homogeneous in env (Wolfs et al., 1992; Zhang et al., 1993; Zhu et al., 1993). In contrast, env diversity in clade A-infected women is 1–5 % at seroconversion (Long et al., 2000; Poss et al., 1995). The factors responsible for different profiles of virus diversity in men and women near seroconversion are unknown.

Diversity in the HIV-1 glycoprotein, gp120, encoded by env can significantly impact viral fitness by altering cell tropism and neutralization sensitivity. Substitutions in clade B gp120 that affect receptor and co-receptor binding and antibody interactions have been determined by mutagenesis studies (Boussard et al., 2002; Kwong et al., 2002; Pantophlet et al., 2003; Saphire et al., 2001; Wyatt et al., 1998; Zwick et al., 2003). However, it is not known whether any key functional sites change as the virus establishes infection in a new host. Furthermore, despite the large number of clade A HIV-1-infected individuals worldwide, there has been no research on the structural biology of gp120 of non-clade B viruses. To gain a better understanding of forces acting on gp120 of a colonizing virus population, we developed a homology model of clade A HIV-1 gp120 to locate substitutions that arose between pre- and post-seroconversion samples in each of three clade A-infected women for whom extensive analysis of HIV-1 diversity (Poss et al., 1995), plasma viral load and post-seroconversion env evolution (Poss et al., 1998), and viral phenotype (Painter et al., 2003) have been described.

To determine if there was measurable evolution near the time of infection, we recovered full-length env sequences from pre-seroconversion plasma viral RNA samples from each individual and from samples obtained at 20 (Q23), 7 (Q45) or 10 and 49 weeks (Q47) from the pre-seroconversion sample (Poss et al., 1995). The analysis employed a Markov Chain Monte Carlo (MCMC) framework and a general time reversible substitution model for the Bayesian estimation of evolutionary rates (Drummond et al., 2002), incorporating a prior of time to most recent common ancestor estimated from a larger dataset of V1–2 and V3 sequences from these subjects (Poss et al., 1998). The estimated evolutionary rate for Q23 sequences was 2·5 % per site per year (95 % confidence intervals 1·61–3·35) based on 16 plasma viral RNA sequences from two samples obtained 20 weeks apart. Rates of 0·28 % (0·07–0·51) were derived for Q45 based on 17 plasma viral sequences obtained 7 weeks apart. Viral RNA sequences were not available for Q47 at seroconversion and thus estimated rates of 0·13 % (0·05–0·22) were based on eight viral RNA pre-seroconversion sequences, seven proviral DNA sequences obtained 10 weeks later at the time seroconversion was detected, and 10 proviral sequences obtained 7 months after seroconversion. In all cases, confidence intervals excluded zero indicating that measurable rates of evolution occurred in env. There was no correlation between plasma viral loads and evolutionary rates in these individuals.

Protein sequences from pre-seroconversion samples were compared to those obtained at or near seroconversion to determine if amino acid replacement occurred in the evolving virus population. Although intra-sample sequence diversity was high (Fig. 1), in all three subjects amino acids at several positions in sequences derived at or shortly after seroconversion were replaced when compared with the sequences detected at the time of infection (Table 1). Six substitutions were fixed in the seroconversion virus population of Q23. Four of these sites have been shown by mutagenesis studies of clade B gp120 to affect the binding of CD4 and one to affect binding of antibodies to the CD4-binding site (CD4BS) (Table 1). Using a null distribution of 66 positions in gp120 affecting CD4 binding (Kwong et al., 1998; Pantophlet et al., 2003; Rizzuto et al., 1998), there was a significantly higher number of substitutions at sites affecting binding to the CD4-binding site than would be expected if substitutions were randomly distributed (Fisher's exact test, P=0·009). Substitution T430A is of particular interest because alanine mutagenesis of this site in clade B gp120 strongly increased binding of antibody IgG1b12 (b12) (Pantophlet et al., 2003; Saphire et al., 2001) but decreased binding of CD4 (Rizzuto et al., 1998), suggesting that this substitution could affect the structure of the receptor-binding domain. Three out of five temporal substitutions from Q45 viruses involved sites that affected CD4 and b12 or CCR5 binding to clade B gp120, which is significantly more than would be expected by chance (P=0·037). Temporal changes in Q47 sequences did not involve sites that have been investigated for impact on receptor, co-receptor or antibody binding, although sites 278 and 364 are adjacent to residues that contact CD4 (Kwong et al., 1998). 
None of the temporal substitutions in either Q23 or Q45 sequences affected potential N-linked glycosylation sites or involved V1, V2 or V3, and none occurred in the gp41 portion of the glycoprotein (data not shown). In contrast, temporal substitutions in Q47 sequences did involve one site in V1 and gp41 (data not shown) and did result in two new potential N-linked glycosylation sites (Table 1).
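The enrichment test described above can be sketched as a one-sided Fisher's exact test, which reduces to a hypergeometric tail probability. This is only an illustration: the total number of gp120 positions (`TOTAL_SITES`) is a hypothetical placeholder, the count of four hits is one reading of the Q23 result, and the text does not state whether the reported test was one- or two-sided, so the value printed here should not be expected to reproduce the published P values.

```python
from math import comb


def fisher_enrichment_p(total_sites, functional_sites, substitutions, hits):
    """One-sided P(X >= hits) when `substitutions` sites are drawn at random
    from `total_sites`, of which `functional_sites` are functionally important."""
    denom = comb(total_sites, substitutions)
    upper = min(functional_sites, substitutions)
    tail = sum(
        comb(functional_sites, k)
        * comb(total_sites - functional_sites, substitutions - k)
        for k in range(hits, upper + 1)
    )
    return tail / denom


TOTAL_SITES = 480  # hypothetical gp120 length, NOT taken from the study
CD4_SITES = 66     # null distribution of CD4-binding positions cited in the text

# Six temporal substitutions in Q23, four assumed to fall at CD4-binding sites.
p_value = fisher_enrichment_p(TOTAL_SITES, CD4_SITES, substitutions=6, hits=4)
print(f"one-sided p = {p_value:.4f}")
```

The exact tail sum is used rather than an approximation because the counts involved (five or six substitutions per subject) are far too small for a chi-squared test to be reliable.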



 
Fig. 1. Distribution of gp120 amino acid substitutions. A consensus sequence was determined for the pre-seroconversion sequences obtained from each of three clade A-infected women. The percentage of sequences that contained an amino acid that differed from the pre-seroconversion consensus at each position is shown for pre-seroconversion sequences (grey lines, ‘I’) and sequences obtained at or after seroconversion (dashed lines, ‘Sc’). The secondary structure elements are relative to the HXBc2 crystal and include only the regions that appear in the crystal structure. The locations of the deleted V1–2 and V3 loops are indicated in the bottom panel by arrowheads. The numbers of sequences in the pre-seroconversion and post-seroconversion datasets, respectively, are: Q23, 8 and 8; Q45, 9 and 8; and Q47, 8 and 17.
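The per-position percentages plotted in Fig. 1 can be illustrated with a short sketch. It assumes the sequences are already aligned to equal length; the four-residue sequences below are invented toy data, not sequences from Q23, Q45 or Q47, and the function names are hypothetical.

```python
from collections import Counter


def consensus(aligned_seqs):
    """Most common residue in each alignment column (ties go to the residue seen first)."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned_seqs))


def percent_differing(aligned_seqs, reference):
    """Percentage of sequences whose residue differs from `reference` at each position."""
    n = len(aligned_seqs)
    return [
        100.0 * sum(seq[i] != ref_aa for seq in aligned_seqs) / n
        for i, ref_aa in enumerate(reference)
    ]


# Toy aligned sequences (hypothetical).
pre_seroconversion = ["MKVT", "MKVT", "MRVT", "MKVT"]
seroconversion = ["MKAT", "MKAT", "MKAT", "MKVT"]

pre_consensus = consensus(pre_seroconversion)
print(pre_consensus)
print(percent_differing(pre_seroconversion, pre_consensus))
print(percent_differing(seroconversion, pre_consensus))
```

Both datasets are compared with the pre-seroconversion consensus, so a position at which most later sequences differ from it corresponds to a candidate temporal substitution.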

 

 
Table 1. Summary of temporal substitutions

 
Of the temporal substitutions identified in these subjects, 50 % or more fell outside variable regions. It is particularly noteworthy that none of the early temporal changes affected V3 because fixation of amino acids in V3 did occur over a 2 year period post-seroconversion in sequences obtained from these individuals (Poss et al., 1998). Substitutions that arise in V3 often correlate with a phenotypic change in the virus population (Scarlatti et al., 1997; Speck et al., 1997). The fact that the V3 region in the viruses from these individuals does not evolve near the time of infection and seroconversion suggests that there may be selection against change in V3 as the virus population becomes established.

Despite the high prevalence of non-clade B HIV-1 infection globally, most structural, evolutionary and therapeutic research on HIV-1 is based on clade B viruses. Thus, it is not clear whether results obtained from clade B gp120 mutagenesis studies are applicable to clade A gp120. A crystal structure for clade B gp120 is available (Kwong et al., 1998, 2000a) and provides the opportunity to determine the structural relatedness of gp120 from clades A and B. Using the clade B structure, we first identified substitutions specific to clade A gp120 from an alignment of consensus sequences of clade B and clade A (available at http://hiv-web.lanl.gov/), which are shown in red in Fig. 2(A, B). Although clade A-specific substitutions are distributed throughout the sequence, there is a preponderance of differences in the ‘silent face’ of the gp120 outer domain in β sheets 12 and 13 and α helix 2, which flank the V3 loop (Fig. 2A). This α helix is known to be variable both within and between clades, which can lead to presentation of distinct antigenic surfaces to the immune system (Kwong et al., 2000a). In the 3-D structure, substitutions characteristic of clade A occur at periodic intervals in the α helix and in a continuous stretch of the apposing β12 sheet to form a clade A-specific planar surface. In addition, there are several clade A changes within the CD4-binding pocket (Fig. 2B). Clade A HIV-1 gp120s, including all representatives discussed herein, contain a substitution at P369, which is flanked by conserved CD4 contact residues D368 and E370. Substitutions of P369 decrease the binding of the CD4-binding site antibody b12 (Saphire et al., 2001) and are a feature of clade B neutralization escape mutants (Mo et al., 1997). Clade A-specific substitutions are also present in the bridging sheet, which is involved in co-receptor binding (Kwong et al., 1998). It is noteworthy that there are few clade differences in the inner domain of gp120, a region of contact with gp41 (Kwong et al., 1998).
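Identifying clade-specific positions from two aligned consensus sequences amounts to a column-by-column comparison. The sketch below assumes a pairwise alignment with `-` as the gap character and numbers positions along the reference sequence; the strings are toy data rather than the actual clade A and clade B consensus sequences, and the function name is hypothetical.

```python
def clade_differences(reference, other, gap="-"):
    """List (reference_position, reference_residue, other_residue) for every
    alignment column where the two consensus sequences differ.
    Positions are 1-based and count only non-gap reference residues, so an
    insertion in `other` is reported at the preceding reference position."""
    if len(reference) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    differences = []
    position = 0
    for ref_aa, other_aa in zip(reference, other):
        if ref_aa != gap:
            position += 1
        if ref_aa != other_aa:
            differences.append((position, ref_aa, other_aa))
    return differences


# Toy aligned consensus fragments (hypothetical).
clade_b_fragment = "ACDEF-G"
clade_a_fragment = "ASDEFNG"

print(clade_differences(clade_b_fragment, clade_a_fragment))
```

Numbering along the reference keeps positions comparable with the clade B crystal structure, which is the reason mutagenesis results for clade B can be looked up for a clade A site.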



 
Fig. 2. Topological distribution of clade A-specific substitutions and temporal substitutions arising after infection with clade A HIV-1. (A, B) The structure of clade B gp120 (1G9N) is shown as a ribbon diagram with sites differing between consensus clade B and clade A gp120 depicted in red, the truncated V1–2 and V3 loops indicated in green and the V4 and V5 loops shown in cyan. The arrow in (A) indicates the planar face containing clade A-specific substitutions. Perspective in (A) is with the virus membrane upper right and cell membrane lower left (CD4 is not shown), and in (B) the molecule is rotated approximately 110 ° around the vertical axis to display the CD4-binding pocket and CD4, shown as a violet tube structure. (C, D) Homology model of clade A HIV-1 based on the sequence of Q47S6 is shown in the same orientation as the clade B structure in (A, B). Temporal substitutions that arose in Q23, Q45 and Q47 sequences are displayed in magenta, cyan and purple, respectively. The temporal substitutions that were shared between viruses are indicated with striping. The V3 loop has been added to the model and adjustments have been made to accommodate the longer clade A V4 loop.

 
We developed a homology model of clade A gp120 based on the clade B structure, 1G9N (Kwong et al., 2000a), to determine the spatial distribution of temporal substitutions from three subjects. The prototype clade A model employed the gp120 sequence of Q23Sc4 (AY069928) because previous phylogenetic analysis demonstrated that Q23 viruses were basal in the clade A tree (Poss et al., 1997), suggesting that they were suitable representatives of clade A gp120. The clade A homology model was built at the Molecular Computational Core Facility at the University of Montana, utilizing an Octane SGI (Silicon Graphics) workstation running SYBYL 6.8 software with the Biopolymer, Composer and ProTable modules (Tripos). The Q23Sc4 sequence was threaded against the 1G9N scaffold using the Composer module, which considers only residues that align with the scaffold, thereby providing a preliminary homology model. The original 1G9N structure is devoid of the first three variable loops. For the homology model, the V1–V2 loop regions were not considered. However, the clade A V3 loop structure was fashioned by a protocol related to Kwong's method (Kwong et al., 2000b) utilizing published NMR conformations of the gp120 clade B V3 loop region (Vranken et al., 2001). Whereas there are clade-specific regions of gp120 (shown in Fig. 2A, B), the core clade A gp120 model structure was found to possess few abnormal φ, ψ and ω values (using ProTable; Lovell et al., 2000) relative to those of the X-ray-derived core clade B gp120 structure, suggesting a noteworthy degree of core structural similarity between the protein representations. The clade A V4 loop is longer than that of the 1G9N crystal. Therefore, the additional amino acids were accommodated by clipping at the V4 N-terminal side, inserting the extra clade A amino acids, and then adjusting select bond rotations at the loop C-terminal end to allow loop attachment to the core structure. To obtain a refined Q23Sc4 homology model (Fig. 2C, D), side chain clashes between the V3 and V4 loops were minimized, first by changing select bond torsions to provide φ, ψ and ω values consistent with established standards (Lovell et al., 2000), followed by local V3 and V4 energetic minimizations (Tripos force field) to their respective nearest energy wells.
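The backbone check described above rests on torsion (dihedral) angles computed from the coordinates of four consecutive atoms. The following is a minimal sketch of that geometric calculation only; it is not the ProTable implementation, and the coordinates are toy values chosen so that the expected angles are obvious.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Torsion angle in degrees about the p1-p2 bond for four points in 3-D."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / length for x in b1)  # unit vector along the central bond
    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Toy coordinates: the fourth point is rotated about the x axis.
print(dihedral_degrees((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0)))   # eclipsed (cis)
print(dihedral_degrees((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1)))   # perpendicular
print(dihedral_degrees((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0)))  # anti (trans)
```

For a protein backbone, φ would use the atoms C(i−1), N(i), Cα(i), C(i); ψ would use N(i), Cα(i), C(i), N(i+1); and ω would use Cα(i), C(i), N(i+1), Cα(i+1). Using `atan2` rather than `acos` keeps the sign of the angle and avoids numerical trouble near 0° and 180°.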

The sequence with the longest V4 region, Q47S6 (AY288084), was threaded against the Q23Sc4 homology structure to display the temporal substitutions. Virus populations from each of the three individuals evolved a temporal substitution in the inner domain near the truncated N terminus of gp120 (Fig. 2C), a region that is proximal to the viral membrane and that also accommodated clade A-specific changes (Fig. 2A). The remainder of the inner domain did not change in the period following infection. Both Q45 and Q47 viruses had amino acid substitutions in V4 and the V4 stem, a region on the outer domain of the protein that does not contribute to virus neutralization. In both the Q23 and Q45 virus populations, temporal substitutions involved the bridging sheet and the V5 region (Fig. 2D), which forms an upper surface to the CD4-binding pocket and has been implicated by mutagenesis in gp120–receptor interaction (Table 1). One of the Q47 substitutions (H364P) also lies on the lateral surface of the binding pocket and is adjacent to key CD4 contact residues (Kwong et al., 1998).

Our data demonstrate that in three independent infections there is measurable evolution in HIV-1 env preceding and following seroconversion. We provide the first model of clade A gp120 and demonstrate that it has significant structural similarity with the clade B glycoprotein and that substitutions arising as the virus population becomes established in three hosts are not randomly distributed in the protein. Furthermore, in two of the three subjects there were significantly more changes at sites shown by mutagenesis of clade B gp120 to affect receptor or co-receptor binding than would be expected by chance. Further studies of the substitutions that arise soon after infection will be valuable for understanding the selective forces acting on the infecting virus population and for informing efforts aimed at preventing the establishment of new infections.


   ACKNOWLEDGEMENTS
 
The authors thank Drs Joan Kreiss, Harold Martin, Jr and their collaborators at the Ganjoni Municipal Clinic and Coast Provincial General Hospital, Mombasa, Kenya for sample collection and Dr Julie Overbaugh for providing plasma samples. This research was supported in part by NIH AI44609. Partial support for D. C. H. and H. C. came from Montana NSF-EPSCoR, EPS-0091995 and for J. G. from NIH-NCRR, NIH P20 RR15583. Funding for the Molecular Computational Core Facility, Center for Structural and Functional Neuroscience, was derived from NIH P20 RR15583-01 and the NSF EPS-0091995.


   REFERENCES
 
Boussard, C., Doyle, V. E., Mahmood, N., Klimkait, T., Pritchard, M. & Gilbert, I. H. (2002). Design, synthesis and evaluation of peptide libraries as potential anti-HIV compounds, via inhibition of gp120/cell membrane interactions, using the gp120/cd4/fab17 crystal structure. Eur J Med Chem 37, 883–890.

Drummond, A. J., Nicholls, G. K., Rodrigo, A. G. & Solomon, W. (2002). Estimating mutation parameters, population history and genealogy simultaneously from temporally spaced sequence data. Genetics 161, 1307–1320.

Kwong, P. D., Wyatt, R., Robinson, J., Sweet, R. W., Sodroski, J. & Hendrickson, W. A. (1998). Structure of an HIV gp120 envelope glycoprotein in complex with the CD4 receptor and a neutralizing human antibody. Nature 393, 648–659.

Kwong, P. D., Wyatt, R., Majeed, S., Robinson, J., Sweet, R. W., Sodroski, J. & Hendrickson, W. A. (2000a). Structures of HIV-1 gp120 envelope glycoproteins from laboratory-adapted and primary isolates. Structure Fold Des 8, 1329–1339.

Kwong, P. D., Wyatt, R., Sattentau, Q. J., Sodroski, J. & Hendrickson, W. A. (2000b). Oligomeric modeling and electrostatic analysis of the gp120 envelope glycoprotein of human immunodeficiency virus. J Virol 74, 1961–1972.

Kwong, P. D., Doyle, M. L., Casper, D. J. & 18 other authors (2002). HIV-1 evades antibody-mediated neutralization through conformational masking of receptor-binding sites. Nature 420, 678–682.

Learn, G. H., Muthui, D., Brodie, S. J., Zhu, T., Diem, K., Mullins, J. I. & Corey, L. (2002). Virus population homogenization following acute human immunodeficiency virus type 1 infection. J Virol 76, 11953–11959.

Long, E. M., Martin, H. L., Jr, Kreiss, J. K., Rainwater, S. M., Lavreys, L., Jackson, D. J., Rakwar, J., Mandaliya, K. & Overbaugh, J. (2000). Gender differences in HIV-1 diversity at time of infection. Nat Med 6, 71–75.

Lovell, S. C., Word, J. M., Richardson, J. S. & Richardson, D. C. (2000). The penultimate rotamer library. Proteins 40, 389–408.

Mo, H., Stamatatos, L., Ip, J. E., Barbas, C. F., Parren, P. W., Burton, D. R., Moore, J. P. & Ho, D. D. (1997). Human immunodeficiency virus type 1 mutants that escape neutralization by human monoclonal antibody IgG1b12. J Virol 71, 6869–6874.

Painter, S. L., Biek, R., Holley, D. C. & Poss, M. (2003). Envelope variants from women recently infected with clade A human immunodeficiency virus type 1 confer distinct phenotypes that are discerned by competition and neutralization experiments. J Virol 77, 8448–8461.

Pantophlet, R., Ollmann Saphire, E., Poignard, P., Parren, P. W., Wilson, I. A. & Burton, D. R. (2003). Fine mapping of the interaction of neutralizing and nonneutralizing monoclonal antibodies with the CD4 binding site of human immunodeficiency virus type 1 gp120. J Virol 77, 642–658.

Poss, M., Martin, H. L., Kreiss, J. K., Granville, L., Chohan, B., Nyange, P., Mandaliya, K. & Overbaugh, J. (1995). Diversity in virus populations from genital secretions and peripheral blood in women recently infected with human immunodeficiency virus type 1. J Virol 69, 8118–8122.

Poss, M., Gosink, J., Thomas, E., Kreiss, J. K., Ndinya-Achola, J., Mandaliya, K., Bwayo, J. & Overbaugh, J. (1997). Phylogenetic evaluation of Kenyan HIV-1 isolates. AIDS Res Hum Retroviruses 13, 493–499.

Poss, M., Rodrigo, A. G., Gosink, J. J., Learn, G. H., de Vange Panteleeff, D. D., Martin, H. L., Jr, Bwayo, J., Kreiss, J. K. & Overbaugh, J. (1998). Evolution of envelope sequences from the genital tract and peripheral blood of women infected with clade A human immunodeficiency virus type 1. J Virol 72, 8240–8251.

Rizzuto, C. D., Wyatt, R., Hernandez-Ramos, N., Sun, Y., Kwong, P. D., Hendrickson, W. A. & Sodroski, J. (1998). A conserved HIV gp120 glycoprotein structure involved in chemokine receptor binding. Science 280, 1949–1953.

Saphire, E. O., Parren, P. W., Pantophlet, R. & 7 other authors (2001). Crystal structure of a neutralizing human IGG against HIV-1: a template for vaccine design. Science 293, 1155–1159.

Scarlatti, G., Tresoldi, E., Bjorndal, A. & 9 other authors (1997). In vivo evolution of HIV-1 co-receptor usage and sensitivity to chemokine-mediated suppression. Nat Med 3, 1259–1265.

Speck, R. F., Wehrly, K., Platt, E. J., Atchison, R. E., Charo, I. F., Kabat, D., Chesebro, B. & Goldsmith, M. A. (1997). Selective employment of chemokine receptors as human immunodeficiency virus type 1 coreceptors determined by individual amino acids within the envelope V3 loop. J Virol 71, 7136–7139.

Vranken, W. F., Fant, F., Budesinsky, M. & Borremans, F. A. (2001). Conformational model for the consensus V3 loop of the envelope protein gp120 of HIV-1 in a 20 % trifluoroethanol/water solution. Eur J Biochem 268, 2620–2628.

Wolfs, T. F., Zwart, G., Bakker, M. & Goudsmit, J. (1992). HIV-1 genomic RNA diversification following sexual and parenteral virus transmission. Virology 189, 103–110.

Wyatt, R., Kwong, P. D., Desjardins, E., Sweet, R. W., Robinson, J., Hendrickson, W. A. & Sodroski, J. G. (1998). The antigenic structure of the HIV gp120 envelope glycoprotein. Nature 393, 705–711.

Zhang, L. Q., MacKenzie, P., Cleland, A., Holmes, E. C., Brown, A. J. & Simmonds, P. (1993). Selection for specific sequences in the external envelope protein of human immunodeficiency virus type 1 upon primary infection. J Virol 67, 3345–3356.

Zhu, T., Mo, H., Wang, N., Nam, D. S., Cao, Y., Koup, R. A. & Ho, D. D. (1993). Genotypic and phenotypic characterization of HIV-1 patients with primary infection. Science 261, 1179–1181.

Zwick, M. B., Kelleher, R., Jensen, R., Labrijn, A. F., Wang, M., Quinnan, G. V., Jr, Parren, P. W. & Burton, D. R. (2003). A novel human antibody against human immunodeficiency virus type 1 gp120 is V1, V2, and V3 loop dependent and helps delimit the epitope of the broadly neutralizing antibody immunoglobulin G1 b12. J Virol 77, 6965–6978.

Received 13 January 2004; accepted 23 February 2004.


