Inter- and intralineage recombinants are common in natural populations of Turnip mosaic virus

Zhongyang Tan1, Yasuhiko Wada2,3, Jishuang Chen4 and Kazusato Ohshima1

1 Laboratory of Plant Virology, Faculty of Agriculture, Saga University, Saga 840-8502, Japan
2 Laboratory of Animal Production and Management, Faculty of Agriculture, Saga University, Saga 840-8502, Japan
3 BioInfomatics Research Division, Japan Science and Technology Corporation, Tokyo 102-0081, Japan
4 Institute of Bioengineering, Zhejiang University of Science and Technology, Hangzhou Xiasha 310018, PR China

Correspondence
Kazusato Ohshima
ohshimak@cc.saga-u.ac.jp


   ABSTRACT
A recombination map of the genome of Turnip mosaic virus (TuMV) was assembled using data from 19 complete genomic sequences, previously reported, and a composite sample of three regions of the genome, one-third in total, of a representative Asia-wide collection of 70 isolates. Thus, a total of 89 isolates of worldwide origin was analysed for recombinants. Eighteen recombination sites were found spaced throughout the 5' two-thirds of the genome, but there were only two in the 3' one-third; thus, 24 and 35 % of the P1 and NIa-VPg gene sequences examined were recombinants, whereas only 1 % of the corresponding NIa-Pro and CP gene sequences were recombinants. Recombinants with parents from the same or from different lineages were found, and some recombination sites characterized particular lineages. Most of the strain BR recombinants belonged to the Asian-BR group, as defined previously, and it was concluded that this lineage resulted from a recent migration, whereas many of the strain B recombinants from Asia fell into the world-B group. Again, a large proportion of isolates in this group were recombinants. Some recombination sites were found only in particular lineages, and hence seemed more likely to be the surviving progeny from single recombinational events, rather than the progeny of multiple events occurring at recombination hotspots. It seems that the presence of recombination sites, as well as sequence similarities, may be used to trace the migration and evolution of TuMV.

The GenBank/EMBL/DDBJ accession numbers of the nucleotide sequence data obtained in this study are AB179888–AB180040.


   INTRODUCTION
Studies of the molecular evolutionary history of viruses help to provide understanding of such important features of their biology as changes in virulence and geographical range, and their ‘emergence’ as new epidemics, and this understanding is essential for designing strategies for controlling viruses. The complexity of such studies is becoming increasingly obvious as they involve understanding variation caused by mutation, recombination, selection and adaptation (Simon & Bujarski, 1994; Roossinck, 1997; Bousalem et al., 2000; García-Arenal et al., 2001; Rubio et al., 2001; Bateson et al., 2002; Chen et al., 2002a; Monci et al., 2002; Moury et al., 2002; Ohshima et al., 2002; Stenger et al., 2002; Tomimura et al., 2003; Moreno et al., 2004).

Turnip mosaic virus (TuMV) infects a wide range of plant species, mostly from the family Brassicaceae. It is probably the most widespread and important virus infecting both crop and ornamental species of this family and occurs throughout the world, including the temperate and tropical regions of Africa, Asia, Europe, Oceania and North and South America (Provvidenti, 1996; Ohshima et al., 2002). TuMV was ranked second only to Cucumber mosaic virus (CMV) as the most important virus infecting field-grown vegetables in a survey of virus disease in 28 countries and regions (Tomlinson, 1987; Walsh & Jenner, 2002). TuMV belongs to the genus Potyvirus. This is the largest genus of the largest family of plant viruses, the Potyviridae (Riechmann et al., 1992; Shukla et al., 1994; Berger et al., 2000), which itself belongs to the picorna-like supergroup of viruses of animals and plants. TuMV, like other potyviruses, is transmitted by aphids in a non-persistent manner (Hamlyn, 1953). Potyviruses have flexuous filamentous particles, 700–750 nm long, each of which contains a single copy of the genome, which is a single-stranded positive-sense RNA molecule about 10 000 nt long. The genomes of potyviruses have terminal untranslated regions between which is a single open reading frame that is translated into a single large polyprotein. This is hydrolysed into several proteins by proteases that are part of the same polyprotein (Riechmann et al., 1992). Potyvirus P1 genes, the most variable in the potyvirus genome, encode a serine proteinase (Verchot et al., 1992; Choi et al., 2002), which is also the most variable potyvirus protein (Tomimura et al., 2003), but the precise function of P1 in viral infections has not yet been established. Neither has the function of the P3 gene (Urcuqui-Inchima et al., 2001; Jenner et al., 2002, 2003). The other proteins encoded by potyvirus genomes are relatively conserved, although the N termini of the CP genes are also variable.

There have been two reports of studies of the phylodemography of TuMV (Ohshima et al., 2002; Tomimura et al., 2003). Comparisons of genomic sequences showed that TuMV isolates fell into four well-defined close-knit groups: basal-B, basal-BR, Asian-BR and world-B. Parts of around one-fifth of the sequences were phylogenetically anomalous, and this was shown to have resulted from recombination, which seemed therefore to be a significant generator of genetic diversity in populations of this virus. Interestingly, the ‘clear’ recombinants were found only in isolates from east Asia, and ‘uncertain’ recombinants, perhaps more ancient, were found only in lineages from other parts of the world.

In the study reported here, we analysed a total of 89 representative TuMV genomic sequences (on average about one-third of each) using methods based on different evolutionary assumptions, and we discuss the results in terms of the information they provide on the extent to which recombination drives the evolution of TuMV populations.


   METHODS
Virus isolates and host tests.
Isolates of TuMV were collected by ourselves and others. Details of the isolates, their names, country of origin, original host plant, year of isolation and strain (host type) are shown in Table 1, together with details of the worldwide isolates for which the complete genomic sequences have been reported previously (Tomimura et al., 2003; GenBank accession nos AF394601 and AF394602). All the isolates collected were inoculated into Chenopodium quinoa and serially cloned through single lesions at least three times. They were propagated in Brassica rapa cv. Hakatasuwari or Nicotiana benthamiana plants. Plants infected systemically with each of the TuMV isolates were homogenized in 0·01 M potassium phosphate buffer (pH 7·0) and the isolates were mechanically inoculated into young plants of B. rapa cv. Hakatasuwari, Brassica pekinensis cvs Nozaki-1go and Kekkyu Kyoto-3go, Brassica napus cv. Norin-32go and Raphanus sativus cvs Taibyo-sobutori and Akimasari. Inoculated plants were kept for at least 4 weeks in a glasshouse at 25 °C.


Table 1. TuMV isolates analysed in this study

Seventy Asian and 19 non-Asian isolates are listed.

 
Extraction of viral RNA.
Viral RNAs were extracted from purified virions (Choi et al., 1977) or TuMV-infected B. rapa and N. benthamiana leaves using the RNeasy Plant Mini kit (Qiagen). The RNAs were reverse transcribed and amplified using high-fidelity Platinum Pfx DNA polymerase (Invitrogen). Amplified cDNAs were separated by electrophoresis in agarose gels and bands were excised with a razor blade and purified using the QIAquick Gel Extraction kit (Qiagen). Purified cDNAs that had been hydrolysed by appropriate restriction enzymes were ligated into alkaline phosphatase-treated enzyme-hydrolysed pBluescript II SK(+) vectors. The ligated plasmids were then used to transform Escherichia coli XL-1 Blue (Stratagene). The cDNAs were sequenced as described below.

Sequencing.
Nucleotide sequences of the genes encoding P1, the C terminus of cylindrical inclusion (Ct-CI) (from nt 5680 to the 3' end of this gene as described for the 1J nucleotide sequence; Ohshima et al., 1996), 6K2, NIa-VPg and NIa-Pro and the coat protein (CP) of each isolate were determined from two independent RT-PCR products and cloned plasmids. Each cloned plasmid and RT-PCR product was sequenced by ‘primer walking’ in both directions using the BigDye Terminator v3.0 Cycle Sequencing Ready Reaction kit (Applied Biosystems) and an Applied Biosystems Genetic Analyzer DNA model 310; ambiguous nucleotides in any sequence were checked in sequences obtained from at least four other independent plasmids. Sequence data were assembled using BioEdit version 5.0.9 (Hall, 1999).

Recombinant analyses.
The sequences of 70 Asian and 19 non-Asian TuMV isolates (Tomimura et al., 2003), a total of 89, were used for evolutionary analyses. First, we joined the nucleotide sequences of the P1, Ct-CI, 6K2, NIa-VPg, NIa-Pro and CP genes, and termed this the P1+R12+Pro+CP region. Among the 29 phylogenies obtained from the aligned gene sequences or genomic slices, only those from the region around nt 5501–6500 in degapped sequences, which we call region 12 (R12; see Tomimura et al., 2003), and which extends from Ct-CI to the middle of the NIa-Pro sequences, were almost identical to those constructed from entire genomic sequences in all three maximum-likelihood (ML), maximum-parsimony (MP) and neighbour-joining (NJ) trees. Therefore, we joined the R12 sequences between P1 and the remainder of NIa-Pro+CP sequences to produce P1+R12+Pro+CP sequences of each isolate for further evolutionary analyses. The homologous regions of two sequences of Japanese yam mosaic virus (JYMV) (Fuji & Nakamae, 1999, 2000), one of Scallion mosaic virus (ScaMV) (Chen et al., 2002b) and one of Narcissus yellow stripe virus (NYSV) (Chen et al., 2003) were used to align the TuMV P1+R12+Pro+CP sequences, as BLAST searches had shown them to be the sequences in the international sequence databases most closely and consistently related to those of TuMV; TuMV P1 genes were more closely related to those of JYMV than ScaMV, whereas for many other TuMV regions/genes such as R12+Pro the converse was true, except that the TuMV CP gene is most closely related to that of NYSV. We thus aligned all 89 P1 sequences with those of two JYMV isolates as the outgroup, the R12+Pro sequences with those of JYMV and ScaMV and the CP sequences with that of NYSV, using CLUSTAL X (Jeanmougin et al., 1998). However, this procedure resulted in some gaps that were not in multiples of 3 nt. 
Therefore, the amino acid sequences corresponding to individual regions with the appropriate outgroups indicated above were aligned using CLUSTAL X with TRANSALIGN (kindly supplied by Georg Weiller) to maintain the degapped alignment of the encoded amino acids and then reassembled to form sequences 3324 nt long. The aligned sequences were checked using SISCAN version 2 (Gibbs et al., 2000) and PHYLPRO (Weiller, 1998) for incongruent relationships that might have resulted from recombination. These analyses also assessed which non-recombinant sequences have regions that are closest to regions of the recombinant sequences and hence indicate the likely lineages that provided those regions in the recombinants. For simplicity in this paper, we call these the ‘parental’ isolates of recombinants, although they are merely those that include the most closely related regions. The transversion and transition rates of all pairs of sequences were calculated using DAMBE (Xia & Xie, 2001).
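
The codon-preserving alignment and degapping steps described above can be sketched as follows. This is a minimal illustration in Python, not the TRANSALIGN implementation: the function names `thread_codons` and `degap` and the toy sequences are assumptions introduced here. The idea is that each residue of an aligned protein is replaced by its own codon, so every gap in the nucleotide alignment is a multiple of 3 nt, and columns containing gaps are then removed codon by codon.

```python
def thread_codons(aligned_protein: str, cds: str) -> str:
    """Lay an unaligned coding sequence onto its aligned protein, one codon per residue."""
    residues = aligned_protein.replace("-", "")
    if len(cds) != 3 * len(residues):
        raise ValueError("CDS length must be three times the number of residues")
    codons = []
    pos = 0
    for aa in aligned_protein:
        if aa == "-":
            codons.append("---")          # a protein gap becomes a 3 nt gap
        else:
            codons.append(cds[pos:pos + 3])
            pos += 3
    return "".join(codons)


def degap(aligned: list[str]) -> list[str]:
    """Drop every codon column that contains a gap in any sequence."""
    length = len(aligned[0])
    keep = [i for i in range(0, length, 3)
            if all("-" not in seq[i:i + 3] for seq in aligned)]
    return ["".join(seq[i:i + 3] for i in keep) for seq in aligned]


# Toy example (hypothetical sequences, not TuMV data).
nt_a = thread_codons("MK-L", "ATGAAACTG")
nt_b = thread_codons("MKTL", "ATGAAAACACTG")
degapped = degap([nt_a, nt_b])
```

In this toy case both degapped sequences are 9 nt long and remain in frame, which is the property needed before synonymous and non-synonymous sites can be analysed separately.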

Phylogenetic analyses.
Two JYMV sequences and one from ScaMV were used as an outgroup to construct a P1+R12+Pro+CP phylogenetic tree, because the CP sequence of only one NYSV isolate has been published (Chen et al., 2003). The phylogenetic relationships of the sequences were determined by three methods: the ML algorithm of TREEPUZZLE version 5.0 (Strimmer & von Haeseler, 1996; Strimmer et al., 1997), the MP algorithm of PAUP 4.0 beta version 10 (Swofford, 1998) and the NJ algorithm of PHYLIP version 3.5 (Felsenstein, 1993). For ML analyses, 1000 puzzling steps were calculated using the Hasegawa–Kishino–Yano (HKY) model of substitution (Hasegawa et al., 1985). For MP analyses, the heuristic search option and 1000 bootstrap resamplings were used. For NJ analyses, distance matrices were calculated by DNADIST with the Kimura two-parameter option (Kimura, 1980), and trees were constructed from these matrices by the NJ method (Saitou & Nei, 1987). A bootstrap value for each internal node of the NJ trees was calculated using 1000 random resamplings with SEQBOOT (Felsenstein, 1985). The calculated trees were displayed by TREEVIEW (Page, 1996). Nucleotide and amino acid similarities were estimated using the Kimura two-parameter method (Kimura, 1980) and Dayhoff PAM matrix (Dayhoff et al., 1983), respectively.
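
For readers unfamiliar with the Kimura two-parameter distance used for the NJ analyses and similarity estimates, the calculation can be sketched in Python as below. This is an illustrative sketch only; it is not the DNADIST or DAMBE code, and the function names are assumptions introduced here. With P the proportion of sites differing by a transition and Q the proportion differing by a transversion, the distance is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)].

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a, seq_b):
    """Return (transitions, transversions, compared sites) for two aligned sequences."""
    ts = tv = n = 0
    a_seq = seq_a.upper().replace("U", "T")
    b_seq = seq_b.upper().replace("U", "T")
    for a, b in zip(a_seq, b_seq):
        if a not in "ACGT" or b not in "ACGT":
            continue                      # skip gaps and ambiguous bases
        n += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1                       # purine<->purine or pyrimidine<->pyrimidine
        else:
            tv += 1                       # purine<->pyrimidine
    return ts, tv, n


def k2p_distance(seq_a, seq_b):
    """Kimura (1980) two-parameter distance between two aligned sequences."""
    ts, tv, n = count_ts_tv(seq_a, seq_b)
    if n == 0:
        raise ValueError("no comparable sites")
    p = ts / n
    q = tv / n
    # math.log raises ValueError if the sequences are too divergent (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Counting transitions and transversions separately, as in `count_ts_tv`, is also the basis of the pairwise transition and transversion comparisons mentioned under Recombinant analyses.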


   RESULTS
Gene sequences
The 70 Asian and 19 non-Asian TuMV sequences analysed in this study are listed in Table 1. The regions encoding the P1, 6K2, NIa-VPg, NIa-Pro and CP genes of all Asian isolates were 1086, 159, 576, 729 and 864 nt long, respectively. The 5' non-coding regions (NCRs) of all isolates were 94 nt long (excluding the 35 nt primer sequence), whereas the 3'NCRs of most isolates were 213 nt long, except those of isolates C1 and TW, which were 212 nt long. All the recognized motifs of potyvirus genes and encoded proteins were found (data not shown); however, the P1 gene, and the protein it encodes, had few totally conserved residues.

Recombination sites
In a previous study we showed that at least 10 % of 76 TuMV isolates had recombinant P1 and CP genes (Ohshima et al., 2002) and that these caused inconsistencies in phylogenetic analyses and poor bootstrap support for some lineages in the resulting trees. Therefore, we first checked the P1+R12+Pro+CP sequences for evidence of recombination. After gaps had been removed, the ‘phylogenetic profiles’ of the sequences were examined using PHYLPRO (Weiller, 1998). This clearly showed that parts of some sequences were phylogenetically anomalous, in that different parts of them had different phylogenetic affinities (data not shown). Proof that these anomalies resulted from recombination rather than convergent selection was obtained by the ‘sister scanning’ method (Gibbs et al., 2000). All the isolates that had been identified as recombinants in the P1+R12+Pro+CP region by Tomimura et al. (2003) had statistically significant ‘conflicting related signals’ (CRSs) in sites that differed synonymously from those in other sequences. Many isolates seemed to have significant CRSs that differed synonymously in the middle of the P1 gene. It was found, for example, that isolate HOD517J was most closely related (Z-values>3·0) to sequence CHN5 in the N-terminal half of the P1 gene (nt 250–520), but that the C-terminal half (nt 520–850) was closer to CHBJ1 in total nucleotide and synonymous and non-synonymous site analyses (for synonymous site analysis, see Fig. 1a).



Fig. 1. SISCAN analyses comparing synonymous sites. (a) The P1 sequence of isolate HOD517J compared with that of CHN5 (blue-black line), that of CHBJ1 (red line) and a sequence constructed by random choice of nucleotides from ‘parental’ sequences (other coloured lines); the CHN5 and CHBJ1 sequences represent the likely parental sequences of HOD517J. Note the strong support (i.e. Z-value>3·0) for HOD517J being more closely related to CHBJ1 than CHN5 in the 5' 520 nt in total nucleotide differences and between nt 250 and 520 in synonymous analysis, and the converse in the 3' 330 nt in synonymous analysis. For clarity, an analysis within the P1 gene is shown. (b) The P1+R12+Pro+CP sequence of isolate CH6 compared with that of CHL13 (blue-black line), that of FD27J (red line) and a sequence constructed by random choice of nucleotides from ‘parental’ sequences (other coloured lines); the sequences of CHL13 and FD27J represent the likely parental sequences of CH6. Note the strong support (Z-value>3·0) for CH6 being more closely related to CHL13 than FD27J between nt 600 and 1000, and the converse between nt 1000 and 1200 in synonymous site analysis. (c) The P1+R12+Pro+CP sequence of isolate HRD compared with that of HOD517J (blue-black line), that of ND10J (red line) and a sequence constructed by random choice of nucleotides from ‘parental’ sequences (other coloured lines); the HOD517J and ND10J sequences represent the likely parental sequences of HRD. Note the strong support (Z-value>3·0) for HRD being more closely related to HOD517J than ND10J between nt 1100 and 1400, and the converse between nt 1400 and 2800 in synonymous site analysis. For all three graphs each window comparison involved subsequences of 100 nt with a step between window positions of 50 nt.

 
When surveying the P1+R12+Pro+CP sequences, using the PHYLPRO program, to determine the parents of the isolates, we found different parental relationships for some isolates in two adjacent 100 nt sequences, although most of these did not show clear phylogenetic anomalies in PHYLPRO. We therefore checked, using SISCAN, 100 nt slices of all sequences for evidence of recombination. Surprisingly, we found two major recombination sites in many Asian isolates at the junction of the P1 gene and the Ct-CI region (around nt 950–1000), and also in the NIa-VPg gene (around nt 1410). It seemed that PHYLPRO easily detected anomalies with parents from different lineages, but did not show clear anomalies with parents from the same lineage. Many of these isolates were BR strains and were recombinants of Asian-BR parents, or they were B strain and recombinants of world-B parents. It was found, for example, that isolate CH6 showed significant affinities (Z-values>3·0) with sequence CHL13 in the C-terminal half of the P1 gene (nt 600–1000), but that its Ct-CI and 6K2 regions (nt 1000–1200) were more similar to FD27J when the combined synonymous and non-synonymous differences were analysed (for synonymous site analysis, see Fig. 1b). On the other hand, isolate HRD showed significant affinities (Z-values>3·0) with sequence HOD517J in the 6K2 and the N-terminal half of the NIa-VPg genes (nt 1000–1400), but the C-terminal half of its NIa-VPg and NIa-Pro genes (nt 1400–2800) showed more affinities with ND10J when the combined synonymous and non-synonymous sites were analysed (for synonymous site analysis, see Fig. 1c).
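
The windowed comparison underlying these scans can be illustrated with a short Python sketch. It uses the 100 nt window and 50 nt step stated in the legend to Fig. 1, but it reports only raw per-window identity to two candidate parents; it does not reproduce the SISCAN Z-value statistic, and the function names and toy sequences are assumptions introduced here.

```python
def window_starts(length, size=100, step=50):
    """0-based start offsets of every full window along an alignment."""
    return list(range(0, length - size + 1, step))


def identity(a, b):
    """Fraction of identical positions between two equal-length strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def scan(query, parent_a, parent_b, size=100, step=50):
    """For each window, report which candidate parent the query resembles more."""
    rows = []
    for start in window_starts(len(query), size, step):
        end = start + size
        id_a = identity(query[start:end], parent_a[start:end])
        id_b = identity(query[start:end], parent_b[start:end])
        closer = "A" if id_a > id_b else "B" if id_b > id_a else "tie"
        rows.append((start + 1, end, id_a, id_b, closer))   # 1-based nt coordinates
    return rows


# Toy mosaic (hypothetical): 5' half copied from parent A, 3' half from parent B.
parent_a = "A" * 200
parent_b = "C" * 200
query = "A" * 100 + "C" * 100
calls = [row[4] for row in scan(query, parent_a, parent_b)]
```

A switch in the closer parent between adjacent windows, as in this toy mosaic, is the kind of signal that then has to be tested statistically before a recombination site is accepted.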

Phylogenetic relationships
We calculated trees from the concatenated P1+R12+Pro+CP sequences of the 89 isolates, including all clear recombinants identified previously (Ohshima et al., 2002; Tomimura et al., 2003) and in this study. We investigated the relationships of the isolates by three methods (ML, MP and NJ) but only the ML tree is shown (Fig. 2). All the trees partitioned most of the sequences into the same four consistent groups: basal-B, basal-BR, Asian-BR and world-B.



Fig. 2. ML trees calculated from the combined P1+R12+Pro+CP sequences of all 32 ‘non-recombinant’ isolates of TuMV (acronyms in black), and all 57 clear recombinants listed in Table 2 and those previously identified by Tomimura et al. (2003) (acronyms in red). Numbers at each node indicate the percentage of supporting puzzling steps (or bootstrap samples) (only values >70 are shown) in ML, MP and NJ, respectively. Horizontal branch-length is drawn to scale with the bar indicating 0·1 nt replacements per site. The homologous sequences of two isolates (mild and j1) of JYMV and an isolate of ScaMV were used as the outgroup.

 
Recombinants in different lineages
Most of the recombinants identified in this study fell into the Asian-BR and world-B groups. Indeed, all the Asian-BR group isolates were recombinants, as were 25 of 48 isolates (52 %) in the world-B group and 56 of 70 Asian isolates (80 %), but only one of 19 non-Asian isolates (5 %). Thus, only 32 of the 89 isolates (36 %) showed no evidence of recombination, and many of these were from non-Asian countries (Fig. 2).

Table 2 lists the probable recombination sites in the P1+R12+Pro+CP region as indicated by SISCAN, together with the parental (the closest isolate) sequences. The many isolates that had significant CRSs in sites that differed synonymously from those in other sequences in the middle of the P1 gene were mostly Asian. One cross-over site, nt 350–400, was found only in two Nepalese isolates, whereas the two other sites, nt 450–500 and nt 500–550, were found in Chinese/Japanese/Korean isolates or Chinese/Japanese isolates, respectively. All these isolates were strain BR and were recombinants of world-B and Asian-BR parents; furthermore, the closest parental isolates of the recombinants had all been collected in China. We looked in more detail at the sequences around the recombination sites in the P1 gene, including the two sites that were found Asia-wide; nt 478–493 and nt 503–517, corresponding to aa 159–164 and aa 168–173 in the encoded amino acid sequences, were likely to be recombination sites (data not shown). On the other hand, many isolates analysed in this study by SISCAN had CRSs at the junction between the P1 and Ct-CI genes and/or in the NIa-VPg gene. For instance, the CHL13, CHL14 and RHS1 isolates seemed to have recombination sites within the P1 gene and between the P1 and Ct-CI genes, and were recombinants of world-B, Asian-BR and basal-BR parents. Many of the other sequences, notably A102/11, ITA7, St48 and IS1 from the basal groups and CHN12, CHYK56, CHK55, CHZJ27, Ka1J, DMJ and RAD1 from the world-B group, had recombination sites in their P1 and NIa-VPg genes when examined by SISCAN using synonymous differences, but these were not clear recombinants. Table 3 summarizes the numbers of isolates which have recombination sites within different subpopulations and in different genes/regions. In particular, many Asian isolates were recombinants whether they were strain BR or B. 
The recombination sites in the BR isolates were found mainly in the P1 and NIa-VPg genes, and also between the P1 and Ct-CI gene/region, whereas B strain recombination sites were found mostly at the junction of the P1 and Ct-CI gene/region. However, careful analyses using representative entire sequences are necessary to identify recombination sites between the junction of P1 and the Ct-CI gene/region.


Table 2. Recombinants and their recombination sites within the P1+R12+Pro+CP region as revealed by SISCAN analysis

The P1+R12+Pro+CP region is made up as follows: P1, nt 1–972; Ct-CI, nt 973–1044; 6K2, nt 1045–1203; NIa-VPg, nt 1204–1779; NIa-Pro, nt 1780–2508; CP, nt 2509–3324.
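
As a reading aid, the coordinates given above can be turned into a small Python lookup that maps a position in the concatenated P1+R12+Pro+CP alignment to its gene or region. This is only a convenience sketch; the names `REGIONS`, `region_at` and `regions_spanned` are assumptions introduced here.

```python
# Boundaries (1-based, inclusive) of the concatenated P1+R12+Pro+CP region.
REGIONS = [
    ("P1", 1, 972),
    ("Ct-CI", 973, 1044),
    ("6K2", 1045, 1203),
    ("NIa-VPg", 1204, 1779),
    ("NIa-Pro", 1780, 2508),
    ("CP", 2509, 3324),
]


def region_at(nt):
    """Name of the gene/region containing a concatenated-alignment position."""
    for name, start, end in REGIONS:
        if start <= nt <= end:
            return name
    raise ValueError(f"position {nt} lies outside nt 1-3324")


def regions_spanned(first_nt, last_nt):
    """Names of all genes/regions overlapped by an interval of positions."""
    return [name for name, start, end in REGIONS
            if start <= last_nt and end >= first_nt]


total_length = sum(end - start + 1 for _, start, end in REGIONS)
```

For example, `region_at(1410)` places the recombination site reported around nt 1410 in NIa-VPg, and `regions_spanned(950, 1000)` shows that the site around nt 950–1000 straddles P1 and Ct-CI.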

 

Table 3. Numbers of isolates which have recombination sites within subpopulation/strain and gene/region

BR or B strains include BR and B(R) strains or B and (B) strains, respectively, as listed in Table 1. Note that some isolates have multiple recombination sites.

 

   DISCUSSION
This study reports evidence that recombination has occurred frequently during the evolution of TuMV. Comparisons of many isolates of TuMV found more recombination sites in the P1 and NIa-VPg genes than in the NIa-Pro and CP genes. These comparisons also showed that individual recombinants were derived from parents from the same major lineage (intralineage recombination) as frequently as from different major lineages (interlineage recombination), although the latter were easier to detect. Furthermore, these recombinants provided useful additional information about the migratory spread of the virus, perhaps the first such report for a plant virus population.

An earlier study of the entire sequences of 38 isolates had shown that eight were recombinants and five multiple recombinants, namely three double, one triple and one quadruple (Tomimura et al., 2003). All these isolates had been collected in Asia and this led us to collect more samples from Asian populations to determine more accurately how common recombinants were in those TuMV populations. An earlier study of the P1/CP genes (Ohshima et al., 2002) showed that only two isolates, 1J and FD27J, from the Asian population (5 %) had clear recombination sites in their P1 genes, whereas in this study we found that 21 of 70 Asian isolates (30 %) had recombinant P1 genes, as did 21 of 45 Asian strain BR isolates (47 %) (Table 3). This study also showed that 57 of 89 isolates (64 %) were recombinants of the P1+R12+Pro+CP region (Fig. 2), although it will be necessary to analyse this region for more isolates, including non-Asian regions and especially Europe, to assess accurately the number of recombinants in all TuMV populations. So why did we find a larger proportion of recombinants in the work reported here? One possible cause is sampling bias, as more Asia-wide isolates were analysed in this study than in previous studies and the new samples came from more widely distributed sites, so more between-population recombinants were detected. Our results indicate that it is important to analyse representative numbers of isolates collected in at least two neighbouring parts of the same biogeographical region. Bateson et al. (2002) found a similar population structure in studies of populations of Papaya ringspot virus, another potyvirus; major lineages revealed the pattern of worldwide migration, whereas in south-east Asia and the western Pacific there was ‘a single mixed population with some well-defined subpopulations’.

We analysed recombinants in the P1 genes not only of the 89 sequences listed in Table 1 but also of the sequences available in the international sequence databases, 124 isolates in all (data not shown). Six Japanese isolates showed anomalies in SISCAN synonymous site analyses and had clear recombination sites within their P1 genes, indicating that all Japanese Asian-BR group isolates examined so far are recombinants. In addition, because PHYLPRO and SISCAN are ‘sliding window’ methods, it is not possible to analyse fully the N termini of P1 sequences using only P1 sequences; we therefore joined the 5'NCRs and P1 sequences of each isolate and then analysed them by SISCAN. In BLAST searches, the TuMV 5'NCR was most closely related to the homologous region of the genome of Plum pox virus (PPV); we thus aligned them by CLUSTAL X using not only JYMV or ScaMV but also PPV as members of an outgroup and joined the degapped aligned 5'NCR sequences to the TuMV P1 sequences that had been aligned using only the two JYMV sequences as an outgroup. When we analysed these sequences for phylogenetic anomalies using all the nucleotide sites, most but not all of the Asian isolates had recombination sites almost at the N terminus of the P1 gene (nt 1–12; from the beginning of degapped P1 sequences) and with the most closely related parental sequences from the Asian BR and world-B groups (data not shown). Unfortunately, the recombination sites were located near the junction of the 5'NCR and P1 sequences, so we could not analyse them using synonymous or non-synonymous sites separately, and could not assess whether or not these are clear recombination sites at the 5' end of the P1 gene. Phylogenetic analyses of the aligned P1 sequences showed that many strain BR recombinants had world-B group sequences between nt 141 and 605 and Asian-BR group sequences between nt 752 and 1086 (Fig. 3). In all, 47 % of BR strains had mosaic P1 genes (Table 3). 
Our sequence comparisons provided no clear functional explanation for the large proportion of P1 mosaic structures. It is possible that recombinants may adapt more quickly to new hosts or have enhanced pathogenicity or P1/HC-Pro fusion (Kasschau & Carrington, 1998; Urcuqui-Inchima et al., 2001). The newly identified recombination sites in the P1 and NIa-VPg genes were most common in the ‘emerging’ Asian subpopulation and in BR strains, and our comparisons also show that most recombination sites in TuMV genomes are found in the P1 and NIa-VPg genes, which are, respectively, the most variable and relatively well-conserved genes of potyvirus genomes. The P1 protein is also the most variable of potyvirus proteins but the reason for this is at present unknown (Shukla et al., 1994; Urcuqui-Inchima et al., 2001), although it may result from recombination, as the recombinants we found in the P1 gene all had parents from different lineages (world-B and Asian-BR), whereas recombinants in the NIa-VPg gene mostly had parents from the same lineage (Asian-BR and Asian-BR).



Fig. 3. Recombination maps of TuMV genomes. The estimated nucleotide positions of the recombination sites are shown relative to the 5' end of the P1 gene. The nucleotide positions correspond to those of the 1J sequence (Ohshima et al., 1996). The positions were estimated approximately from the data shown in Table 2, together with that published by Tomimura et al. (2003) and Ohshima et al. (2002). The wide box shows the P1, R12+Pro and CP regions sequenced in this study; the narrow box shows the regions described in earlier reports. The recombination sites in the narrow boxes need further analyses using representative isolates. The coloured boxes in green, blue and red are, respectively, of Asian-BR, world-B and basal-BR group parents as assessed by PHYLPRO and SISCAN. Horizontal, vertical and diagonal cross-hatching in the boxes shows the different parents of each recombination type-group isolate.

 
There are conflicting reports on whether recombination sites are randomly distributed in the genomes of potyviruses or if particular regions of the genome are favoured. In Potato virus Y (PVY) and TuMV (Glais et al., 2002; Moury et al., 2002; Tomimura et al., 2003) there was no clear pattern, whereas in Slovakian isolates of PPV and Spanish isolates of Watermelon mosaic virus recombination sites appeared to be in particular regions of the genome (Glasa et al., 2002; Moreno et al., 2004), although most of these studies involved relatively small numbers of isolates, or isolates that came from a limited collecting area, and so may have been unrepresentative and not likely to provide useful generalizations. The present larger study has shown that 18 recombination sites were found in the 5' two-thirds of the genome, but only two in the 3' one-third; thus, 24 and 35 % of the P1 and NIa-VPg gene sequences we examined were recombinants, whereas only 1 % of the respective NIa-Pro and CP gene sequences were recombinants. In addition, recombination sites were found between the P1 and Ct-CI regions but not between the NIa-Pro and CP genes, namely in the NIb gene. The recombination sites within the P1 gene and NIa-VPg genes were present in the wider Asian population (Table 2, Fig. 3). These results may indicate that some recombination sites in the P1 and NIa-VPg genes are adaptively advantageous, or at least do not incur a fitness penalty, so that such recombinants persist as successful ‘founders’. In addition, these results indicate that there are different types of recombination site: those that are obvious and are widely distributed during epidemics, and those that are less obvious and may be ancient. 
The latter may be difficult to identify, perhaps because mutations have erased the clearest evidence of recombination (Rubio et al., 2001), and this possibility was supported by recombination analyses in previous studies, although we need to analyse recombination sites using entire genomes of representative non-Asian isolates. Tomimura et al. (2003) suggested that the presence of clear recombinants in a subpopulation could be a molecular signature of recent ‘emergence’. Likewise, a ‘star phylogeny’, as found in populations of recently emerged epidemic viruses such as Simian immunodeficiency virus, Human immunodeficiency virus (HIV) and CMV, may also indicate a recent emergence with minimal selection (Myers et al., 1993; Roossinck et al., 1999). Although recombinants may affect topology of the phylogeny, star phylogenies were seen in the Asian-BR group and one of the subgroups in the world-B group in the ML phylogeny, which mostly consisted of Asian recombinants (Fig. 2). The presence of star phylogenies and many recombinants in Asian subpopulations adds credibility to our conclusion that TuMV has recently ‘emerged’ in Asia.

Among the 57 Asian recombinants, 60 % were intralineage recombinants and 40 % were both interlineage and intralineage recombinants, and most of the latter were at least triple recombinants (Fig. 3). For instance, the isolates in recombination types K and M were likely to be parents of the isolates in recombination type J, and themselves seemed to be at least double recombinants. Thus, some of the intralineage recombinants seemed to be both parent and progeny, and detailed comparisons of these recombination sites may be required in order to reveal the order of events. One of the two recombination sites found within the P1 gene in this study was around nt 727 and was found in two Chinese isolates (collected in Beijing), a Korean isolate and six Japanese isolates (all from Kyushu in south-west Japan), locations that are close to one another. The other recombination site, around nt 752, was found in Chinese and Japanese isolates collected in Guilin (mid-south of China) and all parts of Japan (Fig. 3, Table 2). Therefore, this latter group of Asian TuMV isolates, all BR strains and collected in different countries, share a recombination site in their P1 genes that probably came from a single recombination event. Studies of Puumala virus (Sironen et al., 2001) and HIV-1 (Cornelissen et al., 2000) have also shown how recombinants provide information from which the migratory spread of hosts and viruses may be deduced. Therefore, although it is not possible to exclude the possibility that some of the recombination sites found in Asia-wide isolates are in recombinational hotspots, we believe that it is more likely that they come from a limited number of recombinational events and, together with sequence similarities, usefully trace the migration and evolution of this virus.
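
The reasoning above, in which isolates that share a recombination site are grouped and their collecting areas compared, can be illustrated with a minimal sketch. It is not the analysis used in this study. The isolate identifiers are hypothetical placeholders, the function name summarise_sites is an assumption, and the number of isolates listed for the nt 752 site is illustrative only, because the text does not give it. Only the countries, localities and approximate breakpoint positions follow the text.

```python
from collections import defaultdict

# Hypothetical records: (isolate_id, country, locality, breakpoint_nt).
# Isolate IDs are placeholders. Counts for nt 727 follow the text
# (two Chinese, one Korean, six Japanese isolates); the two records
# for nt 752 are illustrative only.
RECORDS = (
    [("cn%d" % i, "China", "Beijing", 727) for i in range(1, 3)]
    + [("kr1", "Korea", None, 727)]
    + [("jp%d" % i, "Japan", "Kyushu", 727) for i in range(1, 7)]
    + [("cn3", "China", "Guilin", 752), ("jp7", "Japan", None, 752)]
)


def summarise_sites(records):
    """Group isolates by shared recombination site.

    Returns a dict mapping each breakpoint position to the number of
    isolates carrying it and the sorted list of countries of origin.
    """
    isolates = defaultdict(list)
    countries = defaultdict(set)
    for isolate_id, country, _locality, site in records:
        isolates[site].append(isolate_id)
        countries[site].add(country)
    return {
        site: {
            "n_isolates": len(isolates[site]),
            "countries": sorted(countries[site]),
        }
        for site in isolates
    }


summary = summarise_sites(RECORDS)
for site in sorted(summary):
    info = summary[site]
    print("nt", site, "isolates:", info["n_isolates"],
          "countries:", ", ".join(info["countries"]))
```

A site shared by isolates from several countries is compatible with a single recombination event followed by spread, but a table of this kind cannot by itself exclude a recombinational hotspot; that distinction would also need the sequence similarities of the isolates.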


   ACKNOWLEDGEMENTS
 
We thank Akemi Sato and Hisao Shinohara (Saga University, Japan) for their careful technical assistance. We thank Drs John A. Walsh (Horticulture International, UK) and Jang-Kyung Choi (Kangwon National University, Korea) for supplying TuMV isolates, and Mark Gibbs and Adrian Gibbs (Australian National University) for advice on sequence analysis and for their very kind critical reading of the manuscript. This work was supported by Grant-in-Aid for Scientific Research no. 15580036 from the Japan Society for the Promotion of Science.


   REFERENCES
 
Bateson, M. F., Lines, R. E., Revill, P., Chaleeprom, W., Ha, C. V., Gibbs, A. J. & Dale, J. L. (2002). On the evolution and molecular epidemiology of the potyvirus Papaya ringspot virus. J Gen Virol 83, 2575–2585.

Berger, P. H., Barnett, O. W., Brunt, A. A. & 14 other authors (2000). Family Potyviridae. In Virus Taxonomy. Seventh Report of the International Committee on Taxonomy of Viruses, pp. 703–724. Edited by M. H. V. van Regenmortel, C. M. Fauquet, D. H. L. Bishop, E. B. Carstens, M. K. Estes, S. M. Lemon, J. Maniloff, M. A. Mayo, D. J. McGeoch, C. R. Pringle & R. B. Wickner. New York: Academic Press.

Bousalem, M., Douzery, E. J. P. & Fargette, D. (2000). High genetic diversity, distant phylogenetic relationships and intraspecies recombination events among natural populations of Yam mosaic virus: a contribution to understanding potyvirus evolution. J Gen Virol 81, 243–255.

Chen, Y.-K., Goldbach, R. & Prins, M. (2002a). Inter- and intramolecular recombinations in the Cucumber mosaic virus genome related to adaptation in Alstroemeria. J Virol 76, 4119–4124.

Chen, J., Zheng, H. Y., Chen, J. P. & Adams, M. J. (2002b). Characterisation of a potyvirus and a potexvirus from Chinese scallion. Arch Virol 147, 683–693.

Chen, J., Chen, J. P., Langeveld, S. A., Derks, A. F. L. M. & Adams, M. J. (2003). Molecular characterization of carla- and potyviruses from Narcissus in China. J Phytopathol 151, 26–29.

Choi, J. K., Maeda, T. & Wakimoto, S. (1977). An improved method for purification of turnip mosaic virus. Ann Phytopathol Soc Jpn 43, 440–448.

Choi, I.-R., Horken, K. M., Stenger, D. C. & French, R. (2002). Mapping of the P1 proteinase cleavage site in the polyprotein of Wheat streak mosaic virus (genus Tritimovirus). J Gen Virol 83, 443–450.

Cornelissen, M., van den Burg, R., Zorgdrager, F. & Goudsmit, J. (2000). Spread of distinct human immunodeficiency virus type 1 AG recombinant lineage in Africa. J Gen Virol 81, 515–523.

Dayhoff, M. O., Barker, W. C. & Hunt, L. T. (1983). Establishing homologies in protein sequences. Methods Enzymol 91, 524–545.

Felsenstein, J. (1985). Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791.

Felsenstein, J. (1993). PHYLIP (phylogeny inference package), version 3.5. Distributed by the author. Department of Genetics, University of Washington, Seattle, USA.

Fuji, S. & Nakamae, H. (1999). Complete nucleotide sequence of the genomic RNA of a Japanese yam mosaic virus, a new potyvirus in Japan. Arch Virol 144, 231–240.

Fuji, S. & Nakamae, H. (2000). Complete nucleotide sequence of the genomic RNA of a mild strain of Japanese yam mosaic potyvirus in Japan. Arch Virol 145, 635–640.

García-Arenal, F., Fraile, A. & Malpica, J. M. (2001). Variability and genetic structure of plant virus populations. Annu Rev Phytopathol 39, 157–186.

Gibbs, M. J., Armstrong, J. S. & Gibbs, A. J. (2000). Sister-scanning: a Monte Carlo procedure for assessing signals in recombinant sequences. Bioinformatics 16, 573–582. http://www.anu.edu.au/BoZo/software/

Glais, L., Tribodet, M. & Kerlan, C. (2002). Genomic variability in Potato potyvirus Y (PVY): evidence that PVYNW and PVYNTN variants are single to multiple recombinants between PVYO and PVYN isolates. Arch Virol 147, 363–378.

Glasa, M., Marie-Jeanne, V., Labonne, G., Subr, Z., Kúdela, O. & Quiot, J.-B. (2002). A natural population of recombinant Plum pox virus is viable and competitive under field conditions. Eur J Plant Pathol 108, 843–853.

Hall, T. A. (1999). BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser 41, 95–98.

Hamlyn, B. M. G. (1953). Quantitative studies on the transmission of cabbage black ringspot virus by Myzus persicae (Sulz.). Ann Appl Biol 40, 393–402.

Hasegawa, M., Kishino, H. & Yano, T. (1985). Dating of the human–ape splitting by a molecular clock of mitochondrial DNA. J Mol Evol 22, 160–174.

Jeanmougin, F., Thompson, J. D., Gouy, M., Higgins, D. G. & Gibson, T. J. (1998). Multiple sequence alignment with Clustal X. Trends Biochem Sci 23, 403–405.

Jenner, C. E., Tomimura, K., Ohshima, K., Hughes, S. L. & Walsh, J. A. (2002). Mutations in Turnip mosaic virus P3 and cylindrical inclusion protein are required to overcome two Brassica napus resistance genes. Virology 300, 50–59.

Jenner, C. E., Wang, X., Tomimura, K., Ohshima, K., Ponz, F. & Walsh, J. A. (2003). The dual role of the potyvirus P3 protein on Turnip mosaic virus as a symptom and avirulence determinant in brassicas. Mol Plant Microbe Interact 16, 777–784.

Kasschau, K. D. & Carrington, J. C. (1998). A counterdefensive strategy of plant viruses: suppression of posttranscriptional gene silencing. Cell 95, 461–470.

Kimura, M. (1980). A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol 16, 111–120.

Monci, F., Sanchez-Campos, S., Navas-Castillo, J. & Moriones, E. (2002). A natural recombinant between the geminiviruses tomato yellow leaf curl Sardinia virus and tomato yellow leaf curl virus exhibits a novel pathogenic phenotype and is becoming prevalent in Spanish populations. Virology 303, 317–326.

Moreno, I. M., Malpica, J. M., Díaz-Pendón, J. A., Moriones, E., Fraile, A. & García-Arenal, F. (2004). Variability and genetic structure of the population of watermelon mosaic virus infecting melon in Spain. Virology 318, 451–460.

Moury, B., Morel, C., Johansen, E. & Jacquemond, M. (2002). Evidence for diversifying selection in Potato virus Y and in the coat protein of other potyviruses. J Gen Virol 83, 2563–2573.

Myers, G., MacInnes, K. & Myers, L. (1993). Phylogenetic moments in the AIDS epidemic. In Emerging Viruses, pp. 120–137. Edited by S. S. Morse. New York: Oxford University Press.

Nicolas, O. & Laliberté, J.-F. (1992). The complete nucleotide sequence of turnip mosaic potyvirus RNA. J Gen Virol 73, 2785–2793.

Ohshima, K., Tanaka, M. & Sako, N. (1996). The complete nucleotide sequence of turnip mosaic virus RNA Japanese strain. Arch Virol 141, 1991–1997.

Ohshima, K., Yamaguchi, Y., Hirota, R. & 10 other authors (2002). The molecular evolution of Turnip mosaic virus; evidence of host adaptation, genetic recombination and geographical spread. J Gen Virol 83, 1511–1521.

Page, R. D. M. (1996). TreeView: an application to display phylogenetic trees on personal computers. Comput Appl Biosci 12, 357–358.

Provvidenti, R. (1996). Turnip mosaic potyvirus. In Viruses of Plants, pp. 1340–1343. Edited by A. A. Brunt, K. Crabtree, M. J. Dallwitz, A. J. Gibbs & L. Watson. Wallingford, UK: CAB International.

Riechmann, J. L., Laín, S. & García, J. A. (1992). Highlights and prospects of potyvirus molecular biology. J Gen Virol 73, 1–16.

Roossinck, M. (1997). Mechanisms of plant virus evolution. Annu Rev Phytopathol 35, 191–209.

Roossinck, M. J., Zhang, L. & Hellwald, K.-H. (1999). Rearrangements in the 5' nontranslated region and phylogenetic analyses of cucumber mosaic virus RNA 3 indicate radial evolution of three subgroups. J Virol 73, 6752–6758.

Rubio, L., Angeles, M., Ayllón, A., Kong, P., Fernández, A., Polek, M., Guerri, J., Moreno, P. & Falk, B. W. (2001). Genetic variation of Citrus tristeza virus isolates from California and Spain: evidence for mixed infections and recombination. J Virol 75, 8054–8062.

Saitou, N. & Nei, M. (1987). The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4, 406–425.

Shukla, D. D., Ward, C. W. & Brunt, A. A. (1994). 1. Introduction. In The Potyviridae, pp. 1–26. Edited by D. D. Shukla, C. W. Ward & A. A. Brunt. Wallingford, UK: CAB International.

Simon, A. E. & Bujarski, J. J. (1994). RNA–RNA recombination and evolution in virus-infected plants. Annu Rev Phytopathol 32, 337–362.

Sironen, T., Vaheri, A. & Plyusnin, A. (2001). Molecular evolution of Puumala hantavirus. J Virol 75, 11803–11810.

Stenger, D. C., Seifers, D. L. & French, R. (2002). Patterns of polymorphism in wheat streak mosaic virus: sequence space explored by a clade of closely related viral genotypes rivals that between the most divergent strains. Virology 302, 58–70.

Strimmer, K. & von Haeseler, A. (1996). Quartet puzzling: a quartet maximum likelihood method for reconstructing tree topologies. Mol Biol Evol 13, 964–969.

Strimmer, K., Goldman, N. & von Haeseler, A. (1997). Bayesian probabilities and quartet puzzling. Mol Biol Evol 14, 210–211.

Swofford, D. L. (1998). PAUP. Phylogenetic analysis using parsimony. Version 4. Sunderland, MA: Sinauer Associates.

Tomimura, K., Gibbs, A. J., Jenner, C. E., Walsh, J. A. & Ohshima, K. (2003). The phylogeny of Turnip mosaic virus; comparisons of thirty-eight genomic sequences reveal a Eurasian origin and a recent ‘emergence’ in east Asia. Mol Ecol 12, 2099–2111.

Tomlinson, J. A. (1987). Epidemiology and control of virus diseases of vegetables. Ann Appl Biol 110, 661–681.

Urcuqui-Inchima, S., Haenni, A.-L. & Bernardi, F. (2001). Potyvirus proteins: a wealth of functions. Virus Res 74, 157–175.

Verchot, J., Herndon, K. L. & Carrington, J. C. (1992). Mutational analysis of the tobacco etch potyviral 35-kDa proteinase: identification of essential residues and requirements for autoproteolysis. Virology 190, 298–306.

Walsh, J. A. & Jenner, C. E. (2002). Turnip mosaic virus and the quest for durable resistance. Mol Plant Pathol 3, 289–300.

Weiller, G. F. (1998). Phylogenetic profiles: a graphical method for detecting genetic recombinations in homologous sequences. Mol Biol Evol 15, 326–335.

Xia, X. & Xie, Z. (2001). DAMBE: software package for data analysis in molecular biology and evolution. J Hered 92, 371–373.

Received 22 March 2004; accepted 25 May 2004.