Evolution of the genome of Human enterovirus B: incongruence between phylogenies of the VP1 and 3CD regions indicates frequent recombination within the species

A. Michael Lindberg1, Per Andersson1, Carita Savolainen2, Mick N. Mulders2,† and Tapani Hovi2

1 Department of Chemistry and Biomedical Sciences, University of Kalmar, SE-391 82 Kalmar, Sweden
2 Enterovirus Laboratory, National Public Health Institute (KTL), 00300 Helsinki, Finland

Correspondence
Michael Lindberg
michael.lindberg@hik.se


   ABSTRACT
 
Enteroviruses show a high degree of sequence variation both between and within serotypes due to the lack of proofreading of the viral RNA-dependent RNA polymerase. In addition, recombination is known to occur not only within but also between different serotypes. We have previously shown that capsid coding sequences of coxsackievirus B4 (CVB4) cluster in several coexisting genotypes (intergenotypic nucleotide difference of 12 % or more) whereas a single lineage of echovirus 30 (EV30) has been prevailing and evolving throughout the last two decades. In the major capsid gene, VP1, clustering of both nucleotide and amino acid sequences correlates with serotype. We have now determined a 501 nucleotide sequence in the non-structural 3CD region of CVB4 and EV30 field strains. Phylogenetic analysis revealed that sequences of Human enterovirus B (HEV-B) were segregated in the 3CD region into three distinct clusters without the VP1-associated serotype/genotype correlation. One of the clusters comprised the E2 strain of CVB4, the EV30 prototype and five other CVB4 field strains whereas the other two clusters, in addition to CVB4 and EV30 strains, also included other HEV-B serotypes. We believe that intertypic recombination is the most likely explanation for the observed incongruence. Similarity analysis based on complete genomes of the CVB4 and EV30 prototypes and the CVB4 E2 strain revealed that a putative recombination spot was mapped within the 2B gene. The incongruence observed in the two genomic domains (P1 and P3) suggests a certain degree of independent evolution, which may be explained by interserotypic recombination within an enterovirus species. It is thus difficult to exclude recombination in the history of any given strain.

The nucleotide sequences reported in this paper have been submitted to GenBank and assigned accession nos AF311938 (EV30B), AF311939 (CVB4E2b) and AF516186–AF516205 for the partial 3CD sequences of the CVB4 and EV30 strains.

†Present address: Laboratoire National de Santé, Department of Immunology, Luxembourg.


   INTRODUCTION
 
The genus Enterovirus, family Picornaviridae, includes a large group of human pathogens causing a variety of diseases such as poliomyelitis, aseptic meningitis, myocarditis, systemic infection in the newborn, acute haemorrhagic conjunctivitis and probably insulin-dependent diabetes mellitus type 1 (Melnick, 1996). The enteroviruses of humans are divided into five species: Human enterovirus A to D (HEV-A to -D) and Poliovirus (PV) (King et al., 2000). Of these, the HEV-B species comprises the widest repertoire of serotypes, including the six group B coxsackieviruses (CVB) and 31 echoviruses (EV) as well as coxsackievirus A9 (CVA9) and enterovirus 69. The 7500-nucleotide-long single-stranded RNA genome is composed of a single large open reading frame divided into three domains, P1 to P3, each encoding three to four proteins, flanked by 5' and 3' untranslated regions (UTRs). Sequence data covering complete genomes have been available for representatives of all the species for some time and segregation of the coding regions of the genomes in definite clusters has been used as the main criterion in defining the species (Hyypiä et al., 1997; Oberste et al., 1999c; Pöyry et al., 1996). The clustering of different enterovirus types into the established species shows the same genetic relationship whether the comparison is based on regions encoding structural or non-structural genes or proteins (Pöyry et al., 1996). Within a particular species, the sequence divergence is greater in the capsid protein-coding P1 region than in the non-structural protein-coding P2 and P3 regions. Different hypotheses have been put forward to explain this difference. The high divergence of the structural genes has been explained as forced differentiation of the capsid proteins by the host immune surveillance. This is supported by the fact that the sequence diversity, including indels, is primarily observed in regions of the capsid proteins exposed to the environment and the immune response.
An alternative, but not mutually exclusive, explanation is that the viruses are not adapted to the immune surveillance, but, instead, the variation is a consequence of the quasispecies nature of virus populations, which provides a fluctuating pool of variants that also differ in overall fitness. Relatively greater variation in the capsid region would then reflect enrichment of certain variants in past microenvironments especially suitable for their inherent properties, while the lesser genetic divergence of the non-structural proteins, presumably more sensitive to deleterious mutations, would be a result of negative selection (Sala & Wain-Hobson, 1999, 2000). The 3'UTR follows the clustering of the coding region of the genome but in the relatively conserved 5'UTR only two clusters are seen, one formed by HEV-A and -B and the other by HEV-C and -D and the polioviruses (Santti et al., 1999). It was suggested recently that the disparity of the diversity pattern of the 5'UTR sequences of enteroviruses is likely to result from ancient recombination(s) rather than being based on accumulation of point mutations (Santti et al., 1999). Whether the 5'UTR is a region of frequent recombination events is presently not known.

It has been known for a long time that genomic recombination may take place between closely related enteroviruses replicating in the same cell. Hirst (1962) and Ledinko (1963) reported that poliovirus could recombine under certain propagation conditions in the laboratory. Recently, extensive studies on virus strains excreted by recipients of the trivalent attenuated polio vaccine have revealed that under those conditions, recombination is often observed and chimeric strains containing genomic regions from two or sometimes from the three different poliovirus serotypes are readily isolated from the vaccinees (Cuervo et al., 2001; Furione et al., 1996; Georgescu et al., 1995; Guillot et al., 2000). It has been proposed that the phylogenetic relationships observed for CVB3 and CVA9, two members of the HEV-B cluster, might be due to previous recombination events (Santti et al., 1999, 2000). We recently reported that the genetic lineage represented by the prototype strain of echovirus 9 (EV9H) is, as expected, closely related to the Barty genotype of EV9 (EV9B) in the structural genes while the non-structural genes of the P3 region are closely related to a lineage represented by the echovirus 18 prototype (EV18M) (Andersson et al., 2002). A recombination site was mapped by a maximum-likelihood approach to the 2C gene, a region that is a frequent recombination spot in polioviruses (Duggal et al., 1997; Duggal & Wimmer, 1999). However, the impact of recombination on the biology and evolution of enteroviruses has not been seriously assessed. Evidence indicating recombination in the evolutionary history of currently circulating strains has instead been considered exceptional.

In this report we have analysed genes of the P1 and P3 regions using previously described clinical isolates of CVB4 and EV30, members of HEV-B (Mulders et al., 2000; Savolainen et al., 2001). Phylogenetic analysis of the two regions demonstrated that their evolutionary histories are strikingly different and this observation suggests that recombination within the HEV-B species is probably more frequent than previously anticipated.


   METHODS
 
Cells and viruses.
The prototype EV30 strain, Bastianni, was obtained from ATCC (VR-322) and was propagated using continuous cultures of a local variant of green monkey kidney cells maintained in DMEM containing 5 % fetal bovine serum. The propagation of CVB4E2b, a variant of the previously described E2 strain of CVB4, and of the other CVB4 and EV30 isolates included in this study has been described previously (Lindberg et al., 1997; Mulders et al., 2000; Savolainen et al., 2001).

Virus characterization, viral RNA extraction and cDNA synthesis.
The methods used to characterize the EV30 and CVB4 strains used in this study have been described previously (Lindberg & Polacek, 2000; Lindberg et al., 1997). Purified viral RNA was reverse-transcribed using Superscript II (Life Technologies) and primer NotdT25 (5'-ATAAGAATGCGGCCGCT25-3') prior to the amplification of viral genomic fragments by PCR.

PCR and long-distance PCR for sequence analyses of the EV30B and CVB4E2b genomes.
To generate templates for complete genotyping of CVB4E2b and EV30B, three overlapping amplicons were generated for each virus genome by long-distance PCR as previously described for the sequence analysis of the coxsackievirus B5 and echovirus 5 prototypes (Lindberg & Polacek, 2000; Lindberg et al., 1999). Entero5ncpac, which covers the first 19 nucleotides of an enterovirus genome (5'-ATGTCGACTTAATTAAAACAGCCTGTGGGTT-3'), was used for PCR amplification and sequence analysis of the 5' end of the genome as previously described (Lindberg & Polacek, 2000). The first 19 nucleotides of the reported EV30B and CVB4E2b sequences are thus primer-derived. This region is, however, completely conserved throughout all HEV-B genomes so far described. The primer NotdT25 was used both for oligo(dT)-primed reverse transcription and to generate templates for PCR amplification of the 3' end of the genome using the 3'-RACE (rapid amplification of cDNA ends) method as previously described for genotyping of complete enterovirus genomes (Polacek et al., 1999).

Sequencing strategy and analysis.
The complete nucleotide sequences of the EV30B and CVB4E2b genomes were determined using dideoxy sequencing of the three overlapping amplicons in conjunction with a primer-walking strategy as previously described (Lindberg & Polacek, 2000). The VP1 sequences of the CVB4 and EV30 strains used in this study have previously been reported (Mulders et al., 2000; Savolainen et al., 2001). In addition, a 501 nucleotide sequence (167 amino acids) covering the 3CD junction (156 nucleotides of 3C and 345 nucleotides of 3D) was determined for all CVB4 and EV30 clinical isolates used in the study. The sequencing reactions were performed with an ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction kit (Applied Biosystems). Sequence data were collected with an ABI PRISM 310 Genetic Analyser (Applied Biosystems) and the EV30B and CVB4E2b sequences were assembled using Sequencher 3.1 software (GeneCodes). Analyses of sequence relationships were performed using the GCG (Genetics Computer Group) package version 10 (Devereux et al., 1984) and available internet resources.

Sequences used in comparisons.
Previously reported enterovirus sequences used in the comparison were obtained from GenBank. The origin, abbreviations and accession numbers for the virus sequences included in the analyses are shown in Table 1.


 
Table 1. Virus sequences of human enterovirus species used in the analyses

 
Phylogenetic analysis.
The amino acid sequences were aligned by ClustalW using the default settings followed by manual editing, and the nucleotide sequences were then aligned according to the amino acid alignment using DAMBE (Thompson et al., 1994; Xia & Xie, 2001). All positions with gaps were deleted. The resulting datasets included 759 nucleotides for VP1 and 489 nucleotides for 3CD. Before any attempt at phylogenetic reconstruction, likelihood mapping (LM) analysis was used to evaluate whether the datasets contained phylogenetic signals that correspond to tree-, star- or net-like evolution. Random quartets (10 000) of the sequences in each dataset were selected and the major type of phylogenetic signal was evaluated for each dataset using the TN93 model of sequence substitution and gamma-distributed variation of rates across sites (Krings et al., 1997; Lemey et al., 2002; Strimmer & von Haeseler, 1997). The relationships between EV30B, CVB4E2b, the clinical EV30 and CVB4 isolates and the other members of HEV-B included in the datasets were thus characterized using the LM method as included in Tree-Puzzle software version 5.0 (Strimmer & von Haeseler, 1996).
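The two alignment steps described above (threading each coding sequence onto its aligned protein sequence, then deleting every column that contains a gap) can be illustrated with a minimal Python sketch. This is not the DAMBE implementation; the function names `thread_codons` and `drop_gapped_columns` and the toy sequences are hypothetical and serve only to show the principle.

```python
def thread_codons(aligned_protein: str, nucleotides: str, gap: str = "-") -> str:
    """Map an unaligned coding sequence onto its aligned protein sequence.

    Each residue consumes one codon; each protein gap becomes three gap characters.
    """
    codons = [nucleotides[i:i + 3] for i in range(0, len(nucleotides), 3)]
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(nucleotides) % 3 != 0 or len(codons) != n_residues:
        raise ValueError("coding sequence length does not match the protein")
    codon_iter = iter(codons)
    return "".join(gap * 3 if aa == gap else next(codon_iter)
                   for aa in aligned_protein)


def drop_gapped_columns(alignment: list[str], gap: str = "-") -> list[str]:
    """Remove every column in which at least one sequence has a gap."""
    if len({len(s) for s in alignment}) > 1:
        raise ValueError("sequences must have equal length")
    keep = [i for i in range(len(alignment[0]))
            if all(s[i] != gap for s in alignment)]
    return ["".join(s[i] for i in keep) for s in alignment]


# Toy example (hypothetical sequences, not taken from the study)
protein_alignment = ["MK-L", "MKTL"]
coding = ["ATGAAACTG", "ATGAAAACCCTG"]
nt_alignment = [thread_codons(p, n) for p, n in zip(protein_alignment, coding)]
stripped = drop_gapped_columns(nt_alignment)
```

Because gaps are introduced in whole codons, the stripped alignment keeps the reading frame, which is consistent with the 759- and 489-nucleotide datasets both being multiples of three.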

Distance matrix-based methods of phylogenetic reconstruction were performed using a beta version of PAUP* 4.0 (Swofford, 2000). Various models of molecular evolution were evaluated including the HKY85 model of sequence evolution (Hasegawa et al., 1985), the Tamura-Nei algorithm, TN93 (Tamura & Nei, 1993) and the GTR model (Rodriguez et al., 1990) using the likelihood ratio test as implemented in Modeltest (Posada & Crandall, 1998). In this test the GTR model was chosen. Biased nucleotide frequencies were, however, observed for the two datasets (3CD A: 28·1 %, G: 20·5 %, C: 27·2 %, U: 24·2 %; VP1 A: 28·6 %, G: 19·6 %, C: 28·9 %, U: 22·9 %). Therefore the LogDet model of nucleotide sequence substitution was included in the analysis (Lake, 1994; Lockhart et al., 1994; Steel, 1994), because it is less affected by nucleotide composition bias, an imbalance that may distort reconstruction of the evolutionary history (Swofford et al., 1996). In addition, the results obtained were verified with phylogenetic reconstruction using PHYLIP version 3.6 (Felsenstein, 2001) and Tree-Puzzle version 5.0 (Strimmer & von Haeseler, 1996). A neighbour-joining tree was constructed using parameters estimated from each dataset (Saitou & Nei, 1987). The estimated parameters included transition/transversion rate (Ts/Tv) and proportion of invariable sites (I). Rates of substitution at variable sites were assumed to follow a gamma distribution of four rate categories, with the average rate for each category represented by the mean value. The shape parameter of the gamma distribution, α, was estimated from each dataset. The tree generated was studied and compared with trees generated by the distance method using other models of sequence evolution and bootstrap analyses were performed (1000 replicates for distance methods and 100 replicates for maximum-likelihood analysis) to evaluate the statistical support of the generated trees (Felsenstein, 1985).
Trees showing the same evolutionary patterns were also constructed using maximum-likelihood criteria using PAUP* 4.0 and including, from each dataset, the estimated parameters described above (data not shown).
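To make the role of base composition concrete, the following Python sketch computes nucleotide frequencies and one commonly used form of the LogDet (paralinear) distance, d = -(1/4) ln[det F / sqrt(det Πx det Πy)], where F is the 4 × 4 matrix of joint site-pattern frequencies for a sequence pair and Πx, Πy are diagonal matrices of the base frequencies of each sequence. This is an illustration under that assumed formula, not the PAUP* implementation, whose exact treatment (for example of invariable sites) may differ; the function names are hypothetical.

```python
import math

BASES = "ACGT"


def base_frequencies(seq: str) -> dict:
    """Relative frequency of A, C, G and T (U is read as T)."""
    seq = seq.upper().replace("U", "T")
    counts = {b: seq.count(b) for b in BASES}
    total = sum(counts.values())
    return {b: counts[b] / total for b in BASES}


def determinant(matrix: list) -> float:
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def logdet_distance(seq_x: str, seq_y: str) -> float:
    """LogDet distance between two aligned, gap-free sequences (assumed formula)."""
    x = seq_x.upper().replace("U", "T")
    y = seq_y.upper().replace("U", "T")
    if len(x) != len(y):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(x, y) if a in BASES and b in BASES]
    n = len(pairs)
    f = [[0.0] * 4 for _ in range(4)]          # joint frequency matrix F
    for a, b in pairs:
        f[BASES.index(a)][BASES.index(b)] += 1.0 / n
    row = [sum(f[i]) for i in range(4)]         # base frequencies in x
    col = [sum(f[i][j] for i in range(4)) for j in range(4)]  # in y
    det_f = determinant(f)
    norm = math.sqrt(math.prod(row) * math.prod(col))
    if det_f <= 0.0 or norm == 0.0:
        raise ValueError("LogDet distance is undefined for this pair")
    return -0.25 * math.log(det_f / norm)
```

For two identical sequences F is diagonal and the distance is zero; the normalization by the marginal base frequencies is what makes the measure comparatively insensitive to compositional differences between the two sequences.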

Similarity analysis of the complete genome alignments was performed with the SimPlot software package version 2.5, using a sliding window of 400 nucleotides moving in steps of 50 nucleotides (Lole et al., 1999). The sequence identity, corrected for multiple sequence substitutions by the JC model (Jukes & Cantor, 1969), for the selected sequence against all the other sequences in the alignment was plotted for each window analysed. This analysis has previously been used for detecting recombination in human immunodeficiency virus and enteroviruses (Andersson et al., 2002; Lole et al., 1999; Santti et al., 1999). The indicated site of recombination was verified using a method based on maximum-likelihood analysis and using representatives of the putative parental lineages, CVB4JVB and EV30B, and the proposed chimera, CVB4E2b. The method is implemented in the software LARD and has previously been used to detect the recombination sites in dengue virus and GB virus C/hepatitis G virus and enterovirus (Andersson et al., 2002; Worobey & Holmes, 2001; Worobey et al., 1999).
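The sliding-window comparison described above can be sketched in Python as follows. The window (400 nt) and step (50 nt) are taken from the text; plotting the corrected similarity as 1 minus the Jukes-Cantor distance, d = -(3/4) ln(1 - 4p/3), is an assumption about how the correction is applied, and the function names are hypothetical rather than part of SimPlot.

```python
import math


def jc_distance(p: float) -> float:
    """Jukes-Cantor distance for an observed proportion p of differing sites."""
    if p >= 0.75:
        return float("inf")  # the correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def sliding_similarity(query: str, reference: str,
                       window: int = 400, step: int = 50) -> list:
    """Return (window midpoint, JC-corrected similarity) for each window.

    Both sequences must come from the same gap-free alignment.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        p = sum(1 for a, b in zip(q, r) if a != b) / window
        points.append((start + window // 2, 1.0 - jc_distance(p)))
    return points
```

Running `sliding_similarity` for one query genome against each other genome in the alignment gives one curve per reference; a recombination event then appears as a position where the curve with the highest similarity changes from one reference to another.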


   RESULTS
 
Sequence analysis of the complete EV30B and CVB4E2b genomes
The complete nucleotide sequence of EV30B was determined using long-distance PCR and a primer-walking strategy. The sequence was aligned with previously published enterovirus sequences and the proposed cleavage sites in the polyprotein were identified by amino acid sequence homology (Table 2). In addition, the complete genomic sequence was determined for the E2 strain variant of CVB4 included in this analysis. Sixty differences between a previously reported CVB4E2 sequence and the sequence used in this work were observed including a frameshift in the 3D gene (Table 3). This new variant of the CVB4E2 sequence will be referred to as CVB4E2b (accession no. AF311939) and used in all comparisons described below.


 
Table 2. Genome features of EV30B

 

 
Table 3. Genomic differences between CVB4E2b and CVB4E2org

 
Phylogenetic analysis of HEV-B VP1
The VP1 sequences of representative virus strains of each of the CVB4 and EV30 lineages previously reported were aligned with the corresponding sequences of other completely sequenced members of the HEV-B species and compared with representative virus strains of the other four species of enterovirus that consist of human pathogens. Phylogenetic analysis was performed for both nucleotide and deduced amino acid sequences, and the resulting inferred relationship was evaluated by bootstrap analysis. The VP1 dendrogram showed complete bootstrap support in this gene for the correlation between serotype and genotype (Fig. 1a). Within the HEV-B cluster, the included strains of serotypes CVB3, 4 and 5, EV9 and EV30 formed monophyletic groups supported by high bootstrap values. This result is in accordance with previous observations of a correlation between serotypic and genotypic relationships within VP1 (Oberste et al., 1999a, c, 2000).




 
Fig. 1. Dendrograms showing phylogenetic relationships in the VP1 (a) and 3CD (b) regions. For each dataset, the columns including gaps were deleted from the codon-corrected alignments, resulting in datasets containing 759 (VP1) and 489 (3CD) nucleotide positions, respectively. The LogDet model of nucleotide substitution (Lake, 1994; Lockhart et al., 1994; Steel, 1994) was used and a neighbour-joining tree was constructed (Saitou & Nei, 1987) and evaluated by 1000 bootstrap pseudoreplicates (Felsenstein, 1985). The percentages of bootstrap replicates supporting the inferred trees are shown at the nodes. Only bootstrap values above 70 % are shown (Hillis & Bull, 1993). (a) VP1. The clustering of HEV-B strains of the same serotype is fully supported by bootstrap analysis. (b) 3CD. The HEV-B strains cluster into three main lineages.

 
Evolutionary relationship within the 3CD sequence of HEV-B
A 501-nucleotide sequence corresponding to 167 amino acids from the 3CD region was determined from the representative lineages derived from the CVB4 and EV30 VP1 analyses. The sequences were aligned with the corresponding region of the enteroviruses used in the VP1 analysis. The robustness of the reconstructed genetic relationship was evaluated with bootstrapping. The genetic relationship of the 3CD dataset cosegregated into three distinct monophyletic groups supported by high bootstrap values (Fig. 1b). The three clusters constitute the HEV-B clade and are distinct from the other enterovirus species. The EV30 strains were predominantly (seven of nine included strains) clustered into one group (I, Fig. 1b) mixed with three CVB4 strains and representatives of several other HEV-B serotypes. This group included four EV30 strains of which three were closely related in their VP1 gene and supported by high bootstrap values. Two other EV30 strains showed a closer relationship with other HEV-B serotypes than with members of their own serotype: EV3013240net77 clustered with EV1F and EV3020428net79 with CVB410197net77, the latter pair of strains belonging to different serotypes but isolated in the same country within a reasonable time scale. This may indicate that the CVB4 and EV30 lineages have had the opportunity to participate in recombination events.

The second cluster (II) included one EV30 strain (EV3017570net87), the CVB4 prototype and four other CVB4 strains and other members of the HEV-B species. The previously reported close relationship between EV9H and EV18M in the P3 region, due to recombination mapped to the 2C gene, was also verified (Andersson et al., 2002). CVB3W and CVA9G also showed a close relationship in this region, as has previously been noted (Santti et al., 1999). The close relationships between CVB5F and EV11G and between EV6C and one CVB4 strain, CVB405560net62, also showed high bootstrap support.

The third group (III) included the EV30 prototype (EV30B), CVB4E2b and five other members of CVB4. There are currently no sequences in GenBank derived from other serotypes of HEV-B associated with this group. High bootstrap supported the close relationship between the 3CD region of EV30B and CVB4E2b.

Genome-wide similarity analysis of HEV-B
An analysis of the complete HEV-B genomes present in GenBank was therefore conducted to determine whether the incongruence between the VP1 and 3CD trees regarding the evolutionary relationship between EV30B and CVB4E2b could be explained by recombination between their ancestral lineages. Aligned full-length HEV-B genomes including the CVB4 prototype, CVB4E2b and EV30B were included in a genome-wide similarity analysis. In this analysis one of the genomes is compared with the other aligned sequences using a window of a predefined number of nucleotides, which moves in overlapping steps along the genome. In the similarity analysis of the CVB4 prototype (CVB4JVB, Fig. 2a), sequence identity was relatively high in the 5'UTR (average 80–95 %), whereas it was lower in the capsid-coding P1 region (75–80 % within a serotype, otherwise 50–70 %). In the P2 and P3 regions, the different serotypes were relatively close to each other (75–85 %) and no strain was significantly singled out over the entire region.




 
Fig. 2. Similarity analysis of complete HEV-B genomes using a sliding window of 400 nucleotides moving in 50 nucleotide steps, with all positions with gaps deleted and the nucleotide similarity plotted using the JC model of nucleotide substitution (Jukes & Cantor, 1969). The approximate position in an HEV-B genome is indicated above each similarity analysis. The plots show comparisons of CVB4JVB (a), CVB4E2b (b) and EV30B (c) against all other genomes in the alignment.

 
Comparative analysis using the CVB4E2b genomic sequence showed a different pattern (Fig. 2b). The 5'UTR and P1 regions were related to the prototype (CVB4JVB). In the P2 and P3 regions, all HEV-B genomes showed similar sequence identities except one: the EV30 prototype (EV30B). Based on the pairwise sequence comparisons, a shift in sequence identity was observed in the 2B gene (Fig. 2b). Downstream of this location the EV30B genome was significantly more related to CVB4E2b than to any other aligned HEV-B genome. This analysis supports the observation that the two regions used in the tree reconstruction (Fig. 1a, b) display different evolutionary histories and that the CVB4E2b genome may be regarded as a genome that displays different phylogenetic relationships (chimera) in two different regions of the genome. The CVB4E2b strain is closely related to the CVB4JVB and EV30B lineages in the P1 and the P3 regions, respectively. The similarity analysis comparing EV30B to all other HEV-B genomes gave the expected similarity profile (Fig. 2c). The EV30B genome is the only complete EV30 sequence available in GenBank so no sequence identity above background was observed in the 5'UTR, P1 and the beginning of the P2 region. The expected shift in sequence identity between EV30B and CVB4E2b was observed in the 2B gene and was prominent throughout the rest of the genome.

To verify the observed shift in genetic relationship in the CVB4E2b genome, we used a maximum-likelihood approach to map the putative recombination breakpoint. The two viruses representing the designated parental lineages, CVB4JVB and EV30B, were aligned with the putative hybrid genome, CVB4E2b. By comparing the relationships between the aligned sequences, the recombination point in the CVB4E2b genome was mapped between nt 3478 and 3479 in the 2B gene (likelihood ratio for recombination, 210·972). This is in accordance with the results obtained by the similarity analysis.
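The idea behind breakpoint mapping with two parental sequences and a putative recombinant can be illustrated with a deliberately simplified Python sketch. It is not the maximum-likelihood method implemented in LARD: instead of comparing likelihoods, it scans every candidate breakpoint and counts mismatches, assuming the recombinant follows parent A to the left of the breakpoint and parent B to the right. The function name `best_breakpoint` and the toy sequences are hypothetical.

```python
from itertools import accumulate


def best_breakpoint(parent_a: str, parent_b: str, recombinant: str) -> tuple:
    """Find the breakpoint k that minimizes mismatches under a mosaic model.

    The recombinant is compared with parent_a over [0, k) and with parent_b
    over [k, n). Returns (k, mismatches at k, mismatches without recombination).
    """
    n = len(recombinant)
    if not (len(parent_a) == len(parent_b) == n):
        raise ValueError("all three sequences must be aligned to the same length")
    diff_a = [int(x != r) for x, r in zip(parent_a, recombinant)]
    diff_b = [int(x != r) for x, r in zip(parent_b, recombinant)]
    prefix_a = [0] + list(accumulate(diff_a))  # mismatches to A in [0, k)
    prefix_b = [0] + list(accumulate(diff_b))  # mismatches to B in [0, k)
    total_b = prefix_b[n]
    best_k, best_cost = 0, None
    for k in range(n + 1):
        cost = prefix_a[k] + (total_b - prefix_b[k])
        if best_cost is None or cost < best_cost:
            best_k, best_cost = k, cost
    # Best single-parent explanation, for comparison with the mosaic model
    no_recombination = min(prefix_a[n], total_b)
    return best_k, best_cost, no_recombination


# Toy example (hypothetical sequences, not taken from the study)
result = best_breakpoint("AAAAAAAAAA", "CCCCCCCCCC", "AAAACCCCCC")
```

A large drop in mismatches for the mosaic model relative to the best single-parent explanation plays the same qualitative role as the likelihood ratio reported above, although the two quantities are not numerically comparable.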


   DISCUSSION
 
From the analyses of the full-length genomes of several HEV-B viruses, we have demonstrated in this and a previous work that recombination plays a role in the evolution of enteroviruses (Andersson et al., 2002). The P1 domain encoding the capsid proteins of a particular genetic lineage appears capable of readily recombining with P2–P3 domains derived from a variety of genotypes within a species cluster. In this paper, we have shown that fourteen CVB4 strains, nine EV30 strains and all the remaining completely sequenced HEV-B strains segregated in the 3CD region into three distinct clusters, which were supported by high bootstrap values and contained representatives of more than one serotype (Fig. 1b). Strains belonging to a given serotype or even to a genotype as determined by alignment of VP1 sequences clustered in two or three separate 3CD clusters. Our data indicate that recombination events are frequent among HEV-B strains and not restricted to genomes of the same serotype but occur across different serotypes of the same species. The capsid-coding domain functions as an index region giving the basic identity for a strain, which through recombination can engage, as engines for replication, different P2–P3 combinations when available. Evidence for recombination, based on a limited number of strains and restricted to particular HEV-B outbreaks, has also recently been reported by others (Oprisan et al., 2002).

We initially chose EV30 as a target for entire genome sequencing because of its importance as a causative agent of aseptic meningitis (Oberste et al., 1999b; Savolainen et al., 2001). In recent years, EV30 strains have caused a sort of pandemic wave of meningitis outbreaks (Savolainen et al., 2001). After obtaining the sequence we realized the close relationship of the EV30 P2 and P3 regions to those of the previously published CVB4E2. Resequencing the E2 strain revealed some differences compared with the previously published sequence but did not change the general picture. The CVB4E2 strain appeared to inherit genomic information that defines the CVB4 serotype, here represented by the CVB4 prototype lineage, and shared evolutionary history with the EV30 prototype lineage in the non-structural genes. A probable site of recombination was estimated to be located in the 2B gene and verified by maximum-likelihood analysis. We have previously shown that the EV9 prototype genome, EV9H, displays the same mixed inheritance: the P1–P2 region is related to the EV9 lineage (represented by EV9B), whereas the P3 region is genetically most closely related to a lineage represented by the EV18 prototype genome, EV18M. The point of recombination was located in the 2C gene (Andersson et al., 2002). The position in the 2B gene described here and that in the 2C gene, reported for the EV9/EV18 chimera, have also been identified as hot spots for recombination in poliovirus (Duggal et al., 1997; Duggal & Wimmer, 1999). This demonstrates that at least two regions in the non-structural genes may be involved in recombination events between HEV-B genomes, and hence that recombination may occur at several positions in the P2–P3 region. It also suggests that the designated ‘replication vehicle’ may be further divided, in terms of recombination, into functional subunits.

The CVB4E2 strain may thus be regarded as a recombinant strain sharing the P1 region with the CVB4 prototype (CVB4JVB) lineage and the 3' part clustering with the EV30B lineage. An alternative explanation is that EV30B is the recombinant, swapping from an unknown 3' sequence to a genotype present in CVB4E2 and other CVB4 strains included in this analysis. To test this hypothesis further, more complete EV30 genomes need to be sequenced and compared with all HEV-B corresponding regions deposited in GenBank. We can only speculate about the ‘origin’ of the P2–P3 regions of any given HEV-B ‘prototype’ or other reference strain. We can no longer exclude recombination in the ancestral history of any strain.

In the case of intertypic recombinants frequently generated in recipients of the trivalent oral poliovirus vaccine, recombination can be considered to increase the fitness of the attenuated strains for replication in the human gut. A speculative extension of the data described above is that recombination events occurring between HEV-B strains while circulating in human populations might affect their tissue tropism and hence pathogenicity. This is an area of research that needs to be further explored.

The most plausible mechanism causing shuffling of genome segments between closely related genomes is homologous recombination according to the copy choice model (Jarvis & Kirkegaard, 1992; Kirkegaard & Baltimore, 1986; Tang et al., 1997). There seem to be hot spots of recombination within an enterovirus genome. For instance, characterization of the recombination sites within the poliovirus genome under non-physiological and physiological conditions revealed that recombination within the P1 region was severely restricted under physiological conditions. Using an in vitro system for studying recombination in the poliovirus genome, Duggal et al. (1997) observed fewer limitations on recombination between poliovirus genomes in vitro than in vivo. This suggests that recombination may occur outside putative hot spots of recombination. However, when more physiological constraints were applied, recombination within the genes of the P1 region was restricted (Duggal & Wimmer, 1999). The experimental setup did, however, select viral genomes that were able to replicate; it is therefore possible that recombination occurs genome-wide in natural infections as well, but that most recombinants are rapidly selected against due to low fitness.

In conclusion, we have shown that the evolutionary history inferred from the gene encoding the highly variable structural protein VP1 differs from that inferred from the P3 region (the 3CD dataset). This apparent incongruence, underscored by extensive phylogenetic analysis, suggests a model in which different segments of the genome are combined into a complete replicating virus genome but evolve with a certain degree of independence. The available data suggest that the P1 region (probably together with at least part of the 2A gene) may constitute one region, while the borders and restrictions defining one or several regions in the 5'UTR, P2 and P3 regions still need to be characterized. Complete genomic sequences of the CVB4 and EV30 prototypes and the ‘diabetogenic’ CVB4E2b strain (Chatterjee et al., 1992; Kang et al., 1994) clearly demonstrated at least a dual inheritance in the CVB4E2b genome, with a putative crossover site in the 2B gene. It is, however, plausible that recombination in the HEV-B species occurs frequently and at several locations in the genome. More sequence data derived from the non-structural genes of all enterovirus serotypes and variants are needed to predict whether there are dominating lineages in the non-structural genes comparable with the genotype/serotype correlation in the P1 region. The significance of recombination in pathogenesis needs to be addressed. It is our belief that revealing the mechanisms and frequency with which regions of an enterovirus genome are combined would be of great importance for the understanding of how these viruses persist, replicate and cause disease in their hosts.


   ACKNOWLEDGEMENTS
 
We thank Kjell Edman for interesting discussions and for reviewing the manuscript and Anne Andersson for technical assistance. This work was supported by grants from the University of Kalmar and The Knowledge Foundation, Sweden, and grants from the Academy of Finland and the Päivikki and Sakari Sohlberg Foundation (Helsinki) to T. H.


   REFERENCES
 
Andersson, P., Edman, K. & Lindberg, A. M. (2002). Molecular analysis of the echovirus 18 prototype. Evidence of interserotypic recombination with echovirus 9. Virus Res 85, 71–83.

Chatterjee, N. K., Hou, J., Dockstader, P. & Charbonneau, T. (1992). Coxsackievirus B4 infection alters thymic, splenic, and peripheral lymphocyte repertoire preceding onset of hyperglycemia in mice. J Med Virol 38, 124–131.

Cuervo, N. S., Guillot, S., Romanenkova, N., Combiescu, M., Aubert-Combiescu, A., Seghier, M., Caro, V., Crainic, R. & Delpeyroux, F. (2001). Genomic features of intertypic recombinant Sabin poliovirus strains excreted by primary vaccinees. J Virol 75, 5740–5751.

Devereux, J., Haeberli, P. & Smithies, O. (1984). A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res 12, 387–395.

Duggal, R. & Wimmer, E. (1999). Genetic recombination of poliovirus in vitro and in vivo: temperature-dependent alteration of crossover sites. Virology 258, 30–41.

Duggal, R., Cuconati, A., Gromeier, M. & Wimmer, E. (1997). Genetic recombination of poliovirus in a cell-free system. Proc Natl Acad Sci U S A 94, 13786–13791.

Felsenstein, J. (1985). Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791.

Felsenstein, J. (2001). PHYLIP: phylogenetic inference package, 3.6a2 edn. Department of Genetics, University of Washington, Seattle, WA, USA.

Furione, M., Guillot, S., Otelea, D., Balant, J., Candrea, A. & Crainic, R. (1996). Polioviruses with natural recombinant genomes isolated from vaccine-associated paralytic poliomyelitis. Virology 196, 199–208.

Georgescu, M.-M., Delpeyroux, F. & Crainic, R. (1995). Tripartite genome organization of a natural type 2 vaccine/nonvaccine recombinant poliovirus. J Gen Virol 76, 2343–2348.

Guillot, S., Caro, V., Cuervo, N., Korotkova, E., Combiescu, M., Persu, A., Aubert-Combiescu, A., Delpeyroux, F. & Crainic, R. (2000). Natural genetic exchange between vaccine and wild poliovirus strains in humans. J Virol 74, 8434–8443.

Hasegawa, M., Kishino, H. & Yano, T. (1985). Dating of the human–ape splitting by a molecular clock of mitochondrial DNA. J Mol Evol 21, 160–174.

Hillis, D. M. & Bull, J. J. (1993). An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Syst Biol 42, 182–192.

Hirst, G. K. (1962). Genetic recombination with Newcastle disease virus, polioviruses, and influenza. Cold Spring Harbor Symp Quant Biol 27, 303–309.

Hyypiä, T., Hovi, T., Knowles, N. J. & Stanway, G. (1997). Classification of enteroviruses based on molecular and biological properties. J Gen Virol 78, 1–11.

Jarvis, T. C. & Kirkegaard, K. (1992). Poliovirus RNA recombination: mechanistic studies in the absence of selection. EMBO J 11, 3135–3145.

Jukes, T. H. & Cantor, C. R. (1969). Evolution of protein molecules. In Mammalian Protein Metabolism, pp. 21–132. Edited by H. N. Munro. New York: Academic Press.

Kang, Y., Chatterjee, N. K., Nodwell, M. J. & Yoon, J.-W. (1994). Complete nucleotide sequence of a strain of coxsackievirus B4 virus of human origin that induces diabetes in mice and its comparison with nondiabetogenic coxsackie B4 JVB strain. J Med Virol 44, 353–361.

King, A. M. Q., Brown, F., Christian, P. & 8 other authors (2000). Picornaviridae. In Virus Taxonomy. Seventh Report of the International Committee on Taxonomy of Viruses, pp. 657–678. Edited by M. H. V. van Regenmortel, C. M. Fauquet, D. H. L. Bishop, E. B. Carstens, M. K. Estes, S. M. Lemon, J. Maniloff, M. A. Mayo, D. J. McGeoch, C. R. Pringle & R. B. Wickner. San Diego: Academic Press.

Kirkegaard, K. & Baltimore, D. (1986). The mechanism of RNA recombination in poliovirus. Cell 47, 433–443.

Krings, M., Stone, A., Schmitz, R. W., Krainitzki, H., Stoneking, M. & Pääbo, S. (1997). Neanderthal DNA sequences and the origin of modern humans. Cell 90, 19–30.

Lake, J. A. (1994). Reconstructing evolutionary trees from DNA and protein sequences: paralinear distances. Proc Natl Acad Sci U S A 91, 1455–1459.

Ledinko, N. (1963). Genetic recombination with poliovirus type 1. Virology 20, 107–119.

Lemey, P., Salemi, M., Bassit, L. & Vandamme, A. M. (2002). Phylogenetic classification of TT virus groups based on the N22 region is unreliable. Virus Res 85, 47–59.

Lindberg, A. M. & Polacek, C. (2000). Molecular analysis of the prototype coxsackievirus B5 genome. Arch Virol 145, 205–221.

Lindberg, A. M., Polacek, C. & Johansson, S. (1997). Amplification and cloning of complete enterovirus genomes by long distance PCR. J Virol Methods 65, 191–199.

Lindberg, A. M., Johansson, S. & Andersson, A. (1999). Echovirus 5: infectious transcripts and complete nucleotide sequence from uncloned cDNA. Virus Res 59, 75–87.

Lockhart, P. J., Steel, M. A., Hendy, M. D. & Penny, D. (1994). Recovering evolutionary trees under a more realistic model of sequence evolution. Mol Biol Evol 11, 605–612.

Lole, K. S., Bollinger, R. C., Paranjpe, R. S., Gadkari, D., Kulkarni, S. S., Novak, N. G., Ingersoll, R., Sheppard, H. W. & Ray, S. C. (1999). Full-length human immunodeficiency virus type 1 genomes from subtype C-infected seroconverters in India, with evidence of intersubtype recombination. J Virol 73, 152–160.

Melnick, J. L. (1996). Enteroviruses: polioviruses, coxsackieviruses, echoviruses, and newer enteroviruses. In Fields Virology, 3rd edn, pp. 655–712. Edited by B. N. Fields, D. M. Knipe & P. M. Howley. Philadelphia: Lippincott–Raven.

Mulders, M. N., Salminen, M., Kalkkinen, N. & Hovi, T. (2000). Molecular epidemiology of coxsackievirus B4 and disclosure of the correct VP1/2Apro cleavage site: evidence for high genomic diversity and long-term endemicity of distinct genotypes. J Gen Virol 81, 803–812.

Oberste, M. S., Maher, K., Kilpatrick, D. R., Flemister, M. R., Brown, B. A. & Pallansch, M. A. (1999a). Typing of human enteroviruses by partial sequencing of VP1. J Clin Microbiol 37, 1288–1293.

Oberste, M. S., Maher, K., Kennett, M. L., Campbell, J. J., Carpenter, M. S., Schnurr, D. & Pallansch, M. A. (1999b). Molecular epidemiology and genetic diversity of echovirus type 30 (E30): genotypes correlate with temporal dynamics of E30 isolation. J Clin Microbiol 37, 3928–3933.

Oberste, M. S., Maher, K., Kilpatrick, D. R. & Pallansch, M. A. (1999c). Molecular evolution of the human enteroviruses: correlation of serotype with VP1 sequence and application to picornavirus classification. J Virol 73, 1941–1948.

Oberste, M. S., Maher, K., Flemister, M. R., Marchetti, G., Kilpatrick, D. R. & Pallansch, M. A. (2000). Comparison of classic and molecular approaches for the identification of untypeable enteroviruses. J Clin Microbiol 38, 1170–1174.

Oprisan, G., Combiescu, M., Guillot, S., Caro, V., Combiescu, A., Delpeyroux, F. & Crainic, R. (2002). Natural genetic recombination between co-circulating heterotypic enteroviruses. J Gen Virol 83, 2193–2200.

Polacek, C., Lundgren, A., Andersson, A. & Lindberg, A. M. (1999). Genomic and phylogenetic characterization of coxsackievirus B2 prototype strain Ohio-1. Virus Res 59, 229–238.

Posada, D. & Crandall, K. A. (1998). Modeltest: testing the model of DNA substitution. Bioinformatics 14, 817–818.

Pöyry, T., Kinnunen, L., Hyypiä, T., Brown, B., Horsnell, C., Hovi, T. & Stanway, G. (1996). Genetic and phylogenetic clustering of enteroviruses. J Gen Virol 77, 1699–1717.

Rodriguez, F., Oliver, J. L., Marin, A. & Medina, J. R. (1990). The general stochastic model of nucleotide substitution. J Theor Biol 142, 485–501.

Saitou, N. & Nei, M. (1987). The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4, 406–425.

Sala, M. & Wain-Hobson, S. (1999). Drift and conservatism in RNA virus evolution: are they adapting or merely changing? In Origin And Evolution Of Viruses, pp. 115–140. Edited by E. Domingo, R. Webster & J. Holland. San Diego: Academic Press.

Sala, M. & Wain-Hobson, S. (2000). Are RNA viruses adapting or merely changing? J Mol Evol 51, 12–20.

Santti, J., Hyypiä, T., Kinnunen, L. & Salminen, M. (1999). Evidence of recombination among enteroviruses. J Virol 73, 8741–8749.

Santti, J., Harvala, H., Kinnunen, L. & Hyypiä, T. (2000). Molecular epidemiology and evolution of coxsackievirus A9. J Gen Virol 81, 1361–1372.

Savolainen, C., Hovi, T. & Mulders, M. N. (2001). Molecular epidemiology of echovirus 30 in Europe: succession of dominant sublineages within a single major genotype. Arch Virol 146, 521–537.

Steel, M. A. (1994). Recovering a tree from the leaf colorations it generates under a Markov model. Appl Math Lett 7, 19–24.

Strimmer, K. & von Haeseler, A. (1996). Quartet puzzling: a quartet maximum-likelihood method for reconstructing tree topologies. Mol Biol Evol 13, 964–969.

Strimmer, K. & von Haeseler, A. (1997). Likelihood-mapping: a simple method to visualize phylogenetic content of a sequence alignment. Proc Natl Acad Sci U S A 94, 6815–6819.

Swofford, D. L. (2000). PAUP*: phylogenetic analysis using parsimony (*and other methods), 4th edn. Sinauer Associates, Sunderland, MA, USA.

Swofford, D. L., Olsen, G. J., Waddell, P. J. & Hillis, D. M. (1996). Phylogenetic inference. In Molecular Systematics, 2nd edn, pp. 407–514. Edited by D. M. Hillis, C. Moritz & B. K. Mable. Sinauer Associates, Sunderland, MA, USA.

Tamura, K. & Nei, M. (1993). Estimation of the number of nucleotide substitutions in the control regions of mitochondrial DNA in humans and chimpanzees. Mol Biol Evol 10, 512–526.

Tang, R. S., Barton, D. J., Flanegan, J. B. & Kirkegaard, K. (1997). Poliovirus RNA recombination in cell-free extracts. RNA 3, 624–633.

Thompson, J. D., Higgins, D. G. & Gibson, T. J. (1994). Clustal W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22, 4673–4680.

Worobey, M. & Holmes, E. C. (2001). Homologous recombination in GB virus C/hepatitis G virus. Mol Biol Evol 18, 254–261.

Worobey, M., Rambaut, A. & Holmes, E. C. (1999). Widespread intra-serotype recombination in natural populations of dengue virus. Proc Natl Acad Sci U S A 96, 7352–7357.

Xia, X. & Xie, Z. (2001). DAMBE: software package for data analysis in molecular biology and evolution. J Hered 92, 371–373.

Received 11 November 2002; accepted 24 January 2003.