Major changes in the G protein of human respiratory syncytial virus isolates introduced by a duplication of 60 nucleotides

Alfonsina Trento1,2, Mónica Galiano2, Cristina Videla2, Guadalupe Carballal2, Blanca García-Barreno1, José A. Melero1 and Concepción Palomo1

1 Unidad de Biología Viral, Centro Nacional de Microbiología, Instituto de Salud Carlos III, Majadahonda, 28220 Madrid, Spain
2 Centro de Educación Médica e Investigaciones Clínicas, CEMIC, Hospital Universitario, Av. Galván 4102, Buenos Aires C1431FWO, Argentina

Correspondence
José A. Melero
jmelero{at}isciii.es


   ABSTRACT
 
The entire nucleotide sequence of the G gene of three human respiratory syncytial virus (HRSV) isolates (antigenic group B) has been determined. These three viruses (named BA viruses) were isolated in Buenos Aires in 1999 from specimens collected in different hospitals and at different dates. BA viruses have an exact duplication of 60 nucleotides in the G gene, starting after residue 791. This duplication is flanked by a repeat of four nucleotides (GUGU) and can fold into a relatively stable secondary structure. These features suggest a possible mechanism for the generation of a duplicated G segment. The predicted polypeptide is lengthened by 20 amino acids (residues 260–279) and this is reflected in the slower electrophoretic mobility of the G protein precursor of BA viruses compared with related viruses. The changes reported here expand the examples of drastic genetic alterations that can be introduced into the G protein sequence of HRSV while it replicates in its natural host.


   MAIN TEXT
 
Human respiratory syncytial virus (HRSV) is a major cause of lower respiratory tract disease in babies and vulnerable adults. It causes annual epidemics during winter months in temperate countries or during the rainy season in tropical regions (reviewed by Collins et al., 2001). HRSV is classified in the genus Pneumovirus, family Paramyxoviridae. The viral genome, a negative-sense single-stranded RNA molecule, encodes at least 11 distinct proteins, two of which are the major surface glycoproteins anchored in the viral membrane. These consist of the attachment (G) glycoprotein, which mediates virus binding to cells (Levine et al., 1987), and the fusion (F) glycoprotein, which promotes fusion of the viral and cellular membranes (Walsh & Hruska, 1983). The G protein is produced in two forms: a membrane-bound form and a soluble form, which is secreted into the medium and is generated by initiation of translation at an internal in-frame AUG codon (Roberts et al., 1994). HRSV isolates have been classified into antigenic groups A and B, based mainly on the reactivity of viruses with monoclonal antibodies directed against the G protein (Anderson et al., 1985; Mufson et al., 1985). A third glycoprotein, the SH protein, is incorporated at a low level into virus particles; however, it is expressed in large amounts at the surface of infected cells (Collins & Mottet, 1993). Whilst the function of the SH protein is currently unknown, it has been found to induce changes in membrane permeability when expressed in bacteria (Perez et al., 1997).

The G protein is a type II glycoprotein that shares neither sequence nor structural features with the attachment proteins (HN or H) of other paramyxoviruses (Wertz et al., 1985). Spontaneous mutants with deletions of the SH and G genes (Karron et al., 1997) and genetically engineered viruses with deletions of the entire G gene have been isolated in tissue culture (Techaarpornkul et al., 2001). These viruses can replicate efficiently in certain cell types (e.g. Vero cells) but replicate inefficiently in others (e.g. HEp-2 cells) and they are attenuated in BALB/c mice (Teng et al., 2001). Therefore, it seems that the G protein, although not necessary for infection of certain cell types, is required for efficient infectivity, and this may be the reason for its presence in all virus isolates analysed to date. Nevertheless, the G protein shows extensive sequence and antigenic variation between viruses. The G protein is also one of the targets of neutralizing antibodies (reviewed by Melero et al., 1997).

The capacity of the G protein to accommodate drastic sequence changes is illustrated by a series of escape mutants selected with certain monoclonal antibodies. Besides single amino acid substitutions, some escape mutants had: (i) frame-shift mutations that altered the C-terminal one-third of the G protein (García-Barreno et al., 1990); (ii) premature stop codons that shortened the length of the G polypeptide by between 1 and 42 amino acids (Rueda et al., 1991, 1995); and (iii) A->G hypermutations that were translated into several amino acid changes, some of them involving a conserved cluster of cysteines found in the middle of the G protein ectodomain (Rueda et al., 1994; Martínez et al., 1997; Walsh et al., 1998).

There is some evidence that the changes mentioned above can also arise in the G protein during propagation of HRSV in its natural host. For instance, Sullender et al. (1991) described two viruses isolated from the same child 2 years apart that differed in 17 nucleotides of the G protein gene. These changes were translated into 11 amino acid differences, seven of them resulting from frame-shift mutations. Viruses with G proteins of different length (between 295 and 299 amino acids) due to mutations that determined termination codon usage have been isolated from clinical specimens (Sullender et al., 1991; Martínez et al., 1999). Finally, evidence for A->G hypermutations was provided by comparison of G gene sequences from certain natural isolates (Martínez & Melero, 2002).

We now describe three clinical isolates of HRSV (BA3833/99B, BA3859/99B and BA4128/99B; named BA viruses), classified within antigenic group B, that contain a duplication of 60 nucleotides in the C-terminal one-third of the G protein gene. These viruses were isolated during an active surveillance study of respiratory infections in Buenos Aires, Argentina, from 1995 to 2001. Firstly, viral antigens were detected in clinical specimens by indirect immunofluorescence. Subsequently, viruses were isolated by inoculation of clinical samples in susceptible cells. A total of 38 RSVs were isolated in 1999; these viruses were classified in either antigenic group A (47·4 %) or antigenic group B (52·6 %) by reactivity with group-specific monoclonal antibodies. To gain further information about the phylogenetic relationship of the virus isolates, partial sequences of the C-terminal one-third of the G protein gene were obtained.

Initially, total RNA extracted from infected cells was used to obtain a cDNA segment of the G gene by hemi-nested RT-PCR. Reverse transcription was carried out with a negative sense primer that contained an oligo(dT) tail (LG3-, 5'-GGCCCGGGAAGCTTTTTTTTTTTTTTT-3'). Subsequently, PCR amplification was done with Taq polymerase using LG3- and the primer LG5+ (5'-GGATCCCGGGGCAAATGCAAACATGTCC-3'), which included the start sequence of the G protein gene. For group B viruses, a second amplification was performed using primers LG3- and GB496+ (5'-GATGATTACCATTTTGAAGTGTTCA-3'), which started at nucleotide 496 of the G gene sequence of strain CH18537 (a prototype strain of antigenic group B) (Johnson et al., 1987). The DNA product of the hemi-nested RT-PCR from BA viruses migrated significantly more slowly than the equivalent DNA amplified from other viruses, suggestive of a larger size. When this DNA was sequenced using the Big-Dye Terminator Sequencing kit (Applied Biosystems), a duplication of 60 nucleotides was observed, as illustrated in Fig. 1(A).



 
Fig. 1. Sequence analysis of BA viruses. (A) Partial nucleotide sequence of the G protein gene (positive sense) of RSV strains. (B) Phylogenetic analysis of the viruses denoted in (A). Nucleotide sequences of the entire G protein gene were aligned using the CLUSTAL_X program, version 1.81 (Thompson et al., 1997). Phylogenetic analysis was done using the MEGA software, version 2.1. Viruses were clustered using the neighbour-joining method, with Kimura's 2-parameter model. The bar denotes nucleotide substitutions per site. (C) RNA secondary structure of the G segment (negative sense) duplicated in BA viruses, predicted with the algorithms of Zuker et al. (1999). The stability of the structure is indicated as -kcal mol⁻¹.

 
To exclude other major sequence alterations in the G gene of BA viruses, the entire gene was amplified by RT-PCR with primers OG1-21+ (5'-GGGGCAAATGCAACCATGTC-3') and BG9- (5'-GGAATTCGTCGACTTTTTTTTTTGAATAA-3'). This amplification was done with another preparation of total RNA extracted from newly infected cells. In parallel, the G genes of the strains Mon/15/90 (a close relative of the BA viruses) and CH18537 (a reference strain of antigenic group B) were also amplified. The complete G gene sequence of the five viruses was determined using the ‘Big-Dye’ method (GenBank accession nos M17213, AY333361, AY333362, AY333363 and AY333364). The sequence of the prototype strain CH18537 was identical to the sequence published by Johnson et al. (1987). A phylogenetic analysis of the five entire sequences (Fig. 1B) confirmed that Mon/15/90 was closely related to the three new BA isolates. Two of these viruses (BA3833/99B and BA4128/99B) had identical sequences. The other virus (BA3859/99B) had two differences (A72->G and A329->G in the negative strand), the last one being translated into a single amino acid change (L105->P). These three viruses were isolated from children hospitalized in different centres in Buenos Aires. Furthermore, the isolates were obtained at different dates of the 1999 epidemic (BA3833/99B was isolated on 1 June 1999, BA3859/99B on 9 June 1999 and BA4128/99B on 20 August 1999), suggesting that they represented a substantial virus burden within the outbreak. No obvious differences in the growth rate and syncytia formation between BA viruses and other HRSV isolates were observed.
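The Kimura 2-parameter distance that underlies the neighbour-joining tree in Fig. 1(B) can be illustrated with a minimal Python sketch. This is not the MEGA implementation; the function name and the toy sequences are hypothetical, and skipping any alignment column that contains a gap or ambiguity code is an assumption made here for simplicity.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned nucleotide sequences.

    d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P and Q are the
    proportions of transitions and transversions among the compared sites.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    # Treat RNA and DNA alike by mapping U to T.
    norm_a = seq_a.upper().replace("U", "T")
    norm_b = seq_b.upper().replace("U", "T")
    for a, b in zip(norm_a, norm_b):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: ignore gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Hypothetical toy sequences, not HRSV data.
TOY_REF = "AAAAAAAAAA"
TOY_TRANSITION = "GAAAAAAAAA"    # one A->G transition in ten sites
TOY_TRANSVERSION = "CAAAAAAAAA"  # one A->C transversion in ten sites
```

A matrix of such pairwise distances is what a neighbour-joining procedure would take as input; the tree-building step itself is not shown here.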

Eight other viruses of antigenic group B isolated in Buenos Aires during the same outbreak as the BA viruses were sequenced. None of these viruses had the 60 nucleotide duplication. Phylogenetic analysis of all group B isolates revealed that viruses from different genetic branches circulated in Buenos Aires during the 1999 epidemic (M. Galiano and others, unpublished data). One of these viruses (BA3737/99B) was closely related to the BA isolates, with only eight nucleotide differences in the last 400 nucleotides of the G gene (excluding the duplication) and none of them in the duplicated segment of the later viruses. Thus, a virus similar to BA3737/99B could have been the ancestor of the viruses with the 60 nucleotide duplication.

The extra sequence in the BA viruses starts with a motif of four nucleotides, CACA (nucleotides 732–735, mRNA sense), which is repeated at the end of the duplicated segment (Fig. 1A). This introduces an uncertainty about the starting site of the duplication, which could equally start in any of these four nucleotides. This ambiguity, however, does not alter the amino acid sequence deduced for the G protein of the three BA viruses.
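The ambiguity described above can be reproduced with a toy example. The sketch below uses a hypothetical helper name and made-up sequences (not HRSV data); it scans for every window whose following `unit_len` nucleotides are an exact copy of the window itself. When the duplicated unit starts with a motif that also follows it, several adjacent windows pass this test, so the start of the duplication cannot be assigned to a single position.

```python
from typing import List


def tandem_duplication_starts(seq: str, unit_len: int) -> List[int]:
    """Return every 0-based index i where seq[i:i+unit_len] is
    immediately followed by an identical copy of itself."""
    starts = []
    for i in range(len(seq) - 2 * unit_len + 1):
        first = seq[i:i + unit_len]
        second = seq[i + unit_len:i + 2 * unit_len]
        if first == second:
            starts.append(i)
    return starts


# Hypothetical toy sequences chosen only to mimic the layout in the text:
# a unit that begins with a four-nucleotide motif, followed by the same motif.
MOTIF = "CACA"
UPSTREAM = "GGATG"
BODY = "TGGCTAAGCT"
DOWNSTREAM = "GGTTCA"

UNIT = MOTIF + BODY                                   # the duplicated unit
PARENT = UPSTREAM + UNIT + MOTIF + DOWNSTREAM         # no duplication
MUTANT = UPSTREAM + UNIT + UNIT + MOTIF + DOWNSTREAM  # exact tandem copy

parent_hits = tandem_duplication_starts(PARENT, len(UNIT))
mutant_hits = tandem_duplication_starts(MUTANT, len(UNIT))
```

In this toy case `mutant_hits` contains five consecutive indices (each position of the motif plus the position just past it), and all of them describe the same mutant sequence, while `parent_hits` is empty.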

A relatively stable secondary structure of the vRNA sequence that is duplicated in BA viruses was predicted using the algorithms developed by Zuker et al. (1999) (Fig. 1C). This structure suggests a possible mechanism for generating the duplicated segment: the viral polymerase could have switched to the original vRNA strand and copied the 60 nucleotides represented in Fig. 1(C) a second time before continuing the synthesis of the cRNA intermediate. It is worth stressing that no stable structures were predicted in that region of the BA virus antigenome. Consequently, the above mechanism is less likely to occur during synthesis of the vRNA strand from the cRNA intermediate. Although the CACA motif is found repeatedly throughout the HRSV genome, the generation of stable secondary RNA structures and, most importantly, the viability of mutations may restrict the incorporation of nucleotide duplications in natural isolates.
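The structure in Fig. 1(C) was predicted with energy-based algorithms (Zuker et al., 1999). As a much simpler stand-in, the sketch below uses Nussinov-style base-pair maximization, which counts pairs rather than estimating free energy. It is meant only to illustrate how dynamic programming can find a fold in a short RNA segment. The function name, the minimum hairpin loop of three nucleotides and the toy sequence are assumptions, and the result should not be read as a stability estimate.

```python
# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in an RNA sequence
    (Nussinov-style dynamic programming, NOT a free-energy model).

    min_loop is the minimum number of unpaired nucleotides required
    between the two partners of a pair (hairpin loop constraint).
    """
    seq = seq.upper().replace("T", "U")
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j] (inclusive).
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i + 1][j]  # case 1: nucleotide i stays unpaired
            # case 2: nucleotide i pairs with some k further downstream
            for k in range(i + min_loop + 1, j + 1):
                if (seq[i], seq[k]) in PAIRS:
                    inside = dp[i + 1][k - 1]
                    outside = dp[k + 1][j] if k + 1 <= j else 0
                    best = max(best, 1 + inside + outside)
            dp[i][j] = best
    return dp[0][n - 1]


# Hypothetical toy hairpin, not the HRSV segment.
TOY_HAIRPIN = "GGGAAAUCCC"
toy_pairs = max_base_pairs(TOY_HAIRPIN)
```

A thermodynamic predictor would additionally weight stacking, loops and bulges, which is why a pair count alone cannot stand in for the stability value reported in the figure.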

The nucleotide sequence of the G gene from BA viruses is translated in a polypeptide of 315 amino acids, the largest found so far among HRSV isolates (Fig. 2). This protein shares structural features with the G proteins of other HRSV strains, such as the cluster of cysteines and the presence of multiple potential sites for O- and N-glycosylation in the protein ectodomain. The duplicated sequence lies in the C-terminal one-third of the G polypeptide and includes some of the potential O-glycosylation sites.
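The arithmetic behind these numbers can be checked with a short Python sketch. The helper names are hypothetical, and the position of the first nucleotide of the start codon (16) is an assumption that is not stated in the text; it is used here because it reproduces the codon number given above for the A329->G change (L105->P).

```python
# Assumption (not stated in the text): the G open reading frame starts
# at nucleotide 16 of the gene, counting from 1.
ASSUMED_ORF_START = 16


def codon_number(nt_position, orf_start=ASSUMED_ORF_START):
    """1-based codon index that contains a 1-based nucleotide position."""
    if nt_position < orf_start:
        raise ValueError("position lies upstream of the reading frame")
    return (nt_position - orf_start) // 3 + 1


def insertion_effect(insert_len_nt):
    """Return (in_frame, added_residues) for an insertion of given length.

    An insertion whose length is a multiple of three keeps the reading
    frame and adds whole codons; any other length shifts the frame, so
    the number of added residues is reported as None.
    """
    in_frame = insert_len_nt % 3 == 0
    added = insert_len_nt // 3 if in_frame else None
    return in_frame, added


in_frame, added_residues = insertion_effect(60)   # the BA duplication
parent_length = 315 - added_residues              # length without the copy
codon_of_nt_329 = codon_number(329)               # site of the L105->P change
```

Under these assumptions the 60 nucleotide duplication is in frame and adds 20 residues, so a 315 residue protein corresponds to a 295 residue parent, the lower end of the 295 to 299 range mentioned earlier. This contrasts with the frame-shift mutants described above, where the inserted or deleted length is not a multiple of three.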



 
Fig. 2. Scheme of the G protein primary structure of BA viruses. The primary structure of the G protein from BA3833/99B virus is represented. Symbols indicate the transmembrane region (—), the potential N- (▼) and O- (|) glycosylation sites and the cysteine residues (●). The variable regions of the G protein are indicated. The amino acid sequence between residues 240 and 280 is shown, highlighting the segment of 20 amino acids which is duplicated (boldface and italics).

 
Despite the increase in protein length, no perceptible size differences were observed by immunoblot when the G proteins of BA viruses were compared with the homologous protein of the closely related strain Mon/15/90 (Fig. 3A). However, when the infected cells were treated with tunicamycin to visualize the unglycosylated precursor (Palomo et al., 1991; Martínez et al., 1997), a clear difference in size was observed between the G protein precursor of Mon/15/90 and BA viruses (Fig. 3B). This result suggests that the nucleotide sequence which is duplicated in the latter viruses is indeed translated into protein. It is likely that this size difference was not reflected in the mature protein because the G band is very heterogeneous, owing to multiple glycosylations, and because resolution is poor in that part of the gel. It is worth mentioning that Teng & Collins (2002) did not observe a difference in the electrophoretic mobility of the mature G protein when 26 amino acids were deleted from the central part of the G protein ectodomain.



 
Fig. 3. Immunoblot of the G protein. (A) HEp-2 cells were infected with the indicated viruses (m.o.i. of 1–2 p.f.u. per cell), as described previously (Martínez et al., 1997). (B) HEp-2 cells were infected in a similar manner, except that tunicamycin (10 µg ml⁻¹) was added to the culture medium immediately after infection. Extracts were made after 48 h in buffer containing: 10 mM Tris/HCl, pH 7·6, 5 mM EDTA, 140 mM NaCl, 1 % Triton X-100 and 1 % sodium deoxycholate. Proteins were separated by 10 % SDS-PAGE, electrotransferred to Immobilon membranes and developed by immunoblot using monoclonal antibody 021/1G, which is specific for the G protein. Molecular mass markers are shown on the left and the positions of the mature G protein, partially glycosylated intermediates and the G protein precursor are indicated on the right.

 
BA viruses illustrate a new type of drastic change introduced in the G protein during natural propagation of HRSV. The fact that the three viruses have very similar sequences (two of them being identical) suggests that they originated by a unique event that occurred shortly before their isolation. It is interesting that similar 60 nucleotide insertions in the G protein gene have been detected in clinical specimens collected in Japan in the 2002–2003 season (R. Saito and H. Suzuki, Department of Public Health, University of Niigata, Japan, personal communication).

The C-terminal one-third of the G molecule has been shown to be immunologically relevant. Epitopes recognized by strain-specific monoclonal antibodies directed against the G protein of group A viruses have been mapped in that segment of the G polypeptide (Melero et al., 1997). In addition, human convalescent sera react with the G protein C-terminal one-third of certain HRSV strains (Palomo et al., 2000) and with synthetic peptides derived from them (Cane, 1997). Thus, it is possible that the duplicated amino acids in BA viruses change the antigenic structure of the G molecule, conferring on them an evolutionary advantage in re-infecting individuals exposed previously to the ancestor virus. However, the antigenic properties of the BA virus G proteins cannot be assessed at present due to the lack of specific reagents.

Viruses with a three nucleotide duplication in the G protein gene have been reported (Sullender et al., 1991; García et al., 1994). Another virus isolated in Buenos Aires in 2001 had a six nucleotide duplication (to be reported). Thus, it seems that the HRSV polymerase is prone to copying short sequences of the G protein gene repeatedly. In fact, when analysed in detail, the G protein sequence of many virus strains contains multiple short sequence repeats. The 60 nucleotide duplication reported here represents an extreme example of repeated sequences in the G protein gene. Whether this duplication originated from a partial vRNA secondary structure, as illustrated in Fig. 1(C), is not known. This structure could not be formed if the vRNA is bound tightly to the nucleoprotein. However, it is possible that short segments of vRNA devoid of nucleoprotein are generated during the process of RNA replication. Transient RNA secondary structures could then be formed. These structures could also underlie other mechanisms that generate RSVs with multiple A->G changes (hypermutations) (Martínez & Melero, 2002) or defective genomes, as described for other negative-stranded RNA viruses.


   ACKNOWLEDGEMENTS
 
We thank Beatriz Ebekian and Carmen Ricarte (CONICET) for excellent technical assistance. This work was supported in part by grants 01/24 from Instituto de Salud Carlos III and QLK2-CT-1999-00443 from the European Union (to J. A. M.), ERBIC18CT980374 from the European Union (to G. C. and J. A. M.) and from Fundación Rene Baron (to G. C.).


   REFERENCES
 
Anderson, L. J., Hierholzer, J. C., Tsou, C., Hendry, R. M., Fernie, B. N., Stone, Y. & McIntosh, K. (1985). Antigenic characterization of respiratory syncytial virus strains with monoclonal antibodies. J Infect Dis 151, 626–633.

Cane, P. A. (1997). Analysis of linear epitopes recognized by the primary human antibody response to a variable region of the attachment (G) protein of respiratory syncytial virus. J Med Virol 51, 297–304.

Collins, P. L. & Mottet, G. (1993). Membrane orientation and oligomerization of the small hydrophobic protein of human respiratory syncytial virus. J Gen Virol 74, 1445–1450.

Collins, P. L., Chanock, R. M. & Murphy, B. R. (2001). Respiratory syncytial virus. In Fields Virology, pp. 1443–1484. Edited by D. M. Knipe & P. M. Howley. Philadelphia: Lippincott Williams & Wilkins.

García, O., Martín, M., Dopazo, J. & 8 other authors (1994). Evolutionary pattern of human respiratory syncytial virus (subgroup A): cocirculating lineages and correlation of genetic and antigenic changes in the G glycoprotein. J Virol 68, 5448–5459.

García-Barreno, B., Portela, A., Delgado, T., López, J. A. & Melero, J. A. (1990). Frame shift mutations as a novel mechanism for the generation of neutralization resistant mutants of human respiratory syncytial virus. EMBO J 9, 4181–4187.

Johnson, P. R., Spriggs, M. K., Olmsted, R. A. & Collins, P. L. (1987). The G glycoprotein of human respiratory syncytial viruses of subgroups A and B: extensive sequence divergence between antigenically related proteins. Proc Natl Acad Sci U S A 84, 5625–5629.

Karron, R. A., Buonagurio, D. A., Georgiu, A. F. & 8 other authors (1997). Respiratory syncytial virus (RSV) SH and G proteins are not essential for viral replication in vitro: clinical evaluation and molecular characterization of a cold-passaged, attenuated RSV subgroup B mutant. Proc Natl Acad Sci U S A 94, 13961–13966.

Levine, S., Klaiber-Franco, R. & Paradiso, P. R. (1987). Demonstration that glycoprotein G is the attachment protein of respiratory syncytial virus. J Gen Virol 68, 2521–2524.

Martínez, I. & Melero, J. A. (2002). A model for the generation of multiple A to G transitions in the human respiratory syncytial virus genome: predicted RNA secondary structures as substrates for adenosine deaminases that act on RNA. J Gen Virol 83, 1445–1455.

Martínez, I., Dopazo, J. & Melero, J. A. (1997). Antigenic structure of the human respiratory syncytial virus G glycoprotein and relevance of hypermutation events for the generation of antigenic variants. J Gen Virol 78, 2419–2429.

Martínez, I., Valdés, O., Delfraro, A., Arbiza, J., Russi, J. & Melero, J. A. (1999). Evolutionary pattern of the G glycoprotein of human respiratory syncytial viruses from antigenic group B: the use of alternative termination codons and lineage diversification. J Gen Virol 80, 125–130.

Melero, J. A., García-Barreno, B., Martínez, I., Pringle, C. R. & Cane, P. A. (1997). Antigenic structure, evolution and immunobiology of human respiratory syncytial virus attachment (G) protein. J Gen Virol 78, 2411–2418.

Mufson, M. A., Örvell, C., Rafnar, B. & Norrby, E. (1985). Two distinct subtypes of human respiratory syncytial virus. J Gen Virol 66, 2111–2124.

Palomo, C., García-Barreno, B., Peñas, C. & Melero, J. A. (1991). The G protein of human respiratory syncytial virus: significance of carbohydrate side-chains and the C-terminal end to its antigenicity. J Gen Virol 72, 669–675.

Palomo, C., Cane, P. A. & Melero, J. A. (2000). Evaluation of the antibody specificities of human convalescent-phase sera against the attachment (G) protein of human respiratory syncytial virus: influence of strain variation and carbohydrate side chains. J Med Virol 60, 468–474.

Perez, M., García-Barreno, B., Melero, J. A., Carrasco, L. & Guinea, R. (1997). Membrane permeability changes induced in Escherichia coli by the SH protein of human respiratory syncytial virus. Virology 235, 342–351.

Roberts, S. R., Lichtenstein, D., Ball, L. A. & Wertz, G. W. (1994). The membrane-associated and secreted forms of the respiratory syncytial virus attachment glycoprotein G are synthesized from alternative initiation codons. J Virol 68, 4538–4546.

Rueda, P., Delgado, T., Portela, A., Melero, J. A. & García-Barreno, B. (1991). Premature stop codons in the G glycoprotein of human respiratory syncytial viruses resistant to neutralization by monoclonal antibodies. J Virol 65, 3374–3378.

Rueda, P., García-Barreno, B. & Melero, J. A. (1994). Loss of conserved cysteine residues in the attachment (G) glycoprotein of two human respiratory syncytial virus escape mutants that contain multiple A–G substitutions (hypermutations). Virology 198, 653–662.

Rueda, P., Palomo, C., García-Barreno, B. & Melero, J. A. (1995). The three C-terminal residues of human respiratory syncytial virus G glycoprotein (Long strain) are essential for integrity of multiple epitopes distinguishable by antiidiotypic antibodies. Viral Immunol 8, 37–46.

Sullender, W. M., Mufson, M. A., Anderson, L. J. & Wertz, G. W. (1991). Genetic diversity of the attachment protein of subgroup B respiratory syncytial viruses. J Virol 65, 5425–5434.

Techaarpornkul, S., Barretto, N. & Peeples, M. E. (2001). Functional analysis of recombinant respiratory syncytial virus deletion mutants lacking the small hydrophobic and/or attachment glycoprotein gene. J Virol 75, 6825–6834.

Teng, M. N. & Collins, P. L. (2002). The central conserved cysteine noose of the attachment G protein of human respiratory syncytial virus is not required for efficient viral infection in vitro or in vivo. J Virol 76, 6164–6171.

Teng, M. N., Whitehead, S. S. & Collins, P. L. (2001). Contribution of the respiratory syncytial virus G glycoprotein and its secreted and membrane-bound forms to virus replication in vitro and in vivo. Virology 289, 283–296.

Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F. & Higgins, D. G. (1997). The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25, 4876–4882.

Walsh, E. E. & Hruska, J. (1983). Monoclonal antibodies to respiratory syncytial virus proteins: identification of the fusion protein. J Virol 47, 171–177.

Walsh, E. E., Falsey, A. R. & Sullender, W. M. (1998). Monoclonal antibody neutralization escape mutants of respiratory syncytial virus with unique alterations in the attachment (G) protein. J Gen Virol 79, 479–487.

Wertz, G. W., Collins, P. L., Huang, Y., Gruber, C., Levine, S. & Ball, L. A. (1985). Nucleotide sequence of the G protein of human respiratory syncytial virus reveals an unusual type of viral membrane. Proc Natl Acad Sci U S A 82, 4075–4079.

Zuker, M., Mathews, D. H. & Turner, D. H. (1999). Algorithms and thermodynamics for RNA secondary structure prediction: a practical guide. In RNA Biochemistry and Biotechnology, pp. 11–43. Edited by J. Barciszewski & B. F. C. Clark. Dordrecht: Kluwer.

Received 19 May 2003; accepted 15 July 2003.