Multiple infection, recombination and genome relationships among begomovirus isolates found in cotton and other plants in Pakistan

Ana I. Sanz1, Aurora Fraile1, Fernando García-Arenal1, Xueping Zhou2, David J. Robinson2, Saif Khalid3, Tahir Butt2 and Bryan D. Harrison2

Departmento de Biotecnologia, E.T.S. Ingenieros Agronomos, Universidad Politecnica, 28040 Madrid, Spain1
Scottish Crop Research Institute, Invergowrie, Dundee DD2 5DA, UK2
National Agricultural Research Centre, Islamabad 45500, Pakistan3

Author for correspondence: Bryan Harrison. Fax +44 1382 562426. e-mail djrobi@scri.sari.ac.uk


   Abstract
Begomoviruses occur in many plant species in Pakistan and are associated with an epidemic of cotton leaf curl disease that has developed since 1985. PCR analysis with primer pairs specific for each of four already sequenced types of DNA-A of cotton leaf curl virus (CLCuV-PK types a, 26, 72b and 804a), or for okra yellow vein mosaic virus (OYVMV), indicated that many individual naturally infected plants of cotton and other malvaceous species contained two or three begomovirus sequences. Similarly, sequence differences among overlapping fragments of begomovirus DNA-A, amplified from individual naturally infected plants, indicated much multiple infection in malvaceous and non-malvaceous species. Some cotton plants contained DNA-A sequences typical of begomoviruses from non-malvaceous species, and some non-malvaceous plants contained sequences typical of CLCuV-PK. Some DNA-A sequences were chimaeric; they each included elements typical of different types of CLCuV-PK, or of different malvaceous and/or non-malvaceous begomoviruses. Often an apparent recombination site occurred at the origin of replication. No complete CLCuV-PK DNA-A sequence was found in malvaceous or non-malvaceous species collected in Pakistan outside the area of the cotton leaf curl epidemic but chimaeric sequences, including a part that was typical of CLCuV-PK DNA-A, did occur there. We suggest that recombination among such pre-existing sequences was crucial for the emergence of CLCuV-PK. Recombination, following multiple infection, could also explain the network of relationships among many of the begomoviruses found in the Indian subcontinent, and their evolutionary divergence, as a group, from begomoviruses causing similar diseases in other geographical regions.


   Introduction
Members of the genus Begomovirus, family Geminiviridae, are plant viruses that have serologically related geminate particles and genomes consisting of two molecules of circular single-stranded DNA (DNA-A and DNA-B) or, less commonly, one molecule (equivalent to DNA-A; Navot et al., 1991 ; Harrison & Robinson, 1999 ), each molecule containing about 2600–2800 nucleotides. Begomoviruses are transmitted by whiteflies of the Bemisia tabaci complex (including B. argentifolii; Brown et al., 1995 ), which occur mainly in tropical and warm temperate areas, where the viruses cause important diseases in many dicotyledonous crops (Harrison, 1985 ; Brown & Bird, 1992 ). In recent years, a particularly serious epidemic of cotton leaf curl disease has developed in Pakistan (Ali et al., 1995 ). Affected cotton (Gossypium hirsutum) plants contain begomovirus DNA-A-like molecules (Nadeem, 1995 ; Zhou et al., 1998 ), which will be referred to subsequently as DNA-A, but no DNA-B-like component has been detected (Liu et al., 1998 ; Mansoor et al., 1999 ). However, some experimentally infected test plants (Liu et al., 1998 ) and some naturally infected cotton plants (Mansoor et al., 1999 ) contained, in addition to DNA-A, circular single-stranded DNA molecules about half the size of DNA-A and derived from it by various combinations of sequence deletion, inversion, duplication and rearrangement (Liu et al., 1998 ). These defective molecules could be transmitted, along with DNA-A, by grafting and by B. tabaci. It is not known whether they play any important part in disease aetiology.

Recent work has also revealed another type of circular single-stranded DNA of about 1·4 kb in all leaf curl-affected cotton plants that were tested from several locations in Pakistan; this molecule resembles a genome component of nanoviruses that encodes a replication-associated protein (Rep). However, this nanovirus-like DNA becomes packaged in begomovirus coat protein and is transmissible by B. tabaci (Mansoor et al., 1999 ). When cotton plants were inoculated with cloned infectious molecules of the nanovirus-like component, together with cloned molecules of a DNA-A from leaf curl-affected cotton, each of these agents infected the plants systemically but only very mild symptoms developed in co-infected plants, not those typical of cotton leaf curl disease (Mansoor et al., 1999 ). The aetiology of the disease is therefore still unclear.

A further complication is that four different types of Pakistan cotton leaf curl virus (CLCuV-PK) DNA-A (types a, 26, 72b and 804a) have been distinguished by comparing their complete sequences, which differ by 8–29% and, for three of these types, by as much as distinct begomovirus species (Zhou et al., 1998 ). However, the differences are not uniformly distributed along the DNA-A sequence, so that some parts differ much more than the mean value, whereas other parts are virtually the same. It was concluded that three of the types of CLCuV-PK DNA-A are probably recombinant molecules, and all four were found to share part, but not the whole, of their sequence with DNA-A from okra plants infected by okra yellow vein mosaic virus (OYVMV; Zhou et al., 1998 ). However, the other parent(s) of each of the three putative recombinants is not known and the direction of recombination is therefore unclear. These data, and other analyses of partial sequences of DNA-A, suggested that begomovirus isolates possessing recombinant DNA-A molecules might be widespread in cotton and other malvaceous species in Pakistan (Zhou et al., 1998 ; Harrison & Robinson, 1999 ; Sanz et al., 1999 ).

Two kinds of evidence suggest that begomoviruses infecting non-malvaceous plants in Pakistan may play a part in the evolution of Pakistani cotton-infecting begomoviruses. Firstly, tests with a panel of monoclonal antibodies showed that Pakistani begomovirus isolates from a range of naturally infected non-malvaceous species were antigenically strongly related to, although mostly distinguishable from, CLCuV-PK (Harrison et al., 1997a ). Indeed, begomoviruses found in various hosts in India or Pakistan are antigenically more closely related to one another than to begomoviruses associated with similar diseases in other geographical areas, such as the Americas or the African/Mediterranean region (Swanson et al., 1992 ; Nateshan et al., 1996 ; Harrison et al., 1997a ). Secondly, CLCuV-PK was transmitted experimentally by B. tabaci to bean, tobacco and tomato, as well as to the malvaceous species okra (Abelmoschus esculentus), which developed okra leaf curl disease (Harrison et al., 1997a ). Individual plants of a variety of species in Pakistan might therefore be co-infected with two or more begomovirus isolates. Indeed, brief records of begomovirus co-infection in three plants (one cotton, two okra) from Pakistan (Zhou et al., 1998 ; Sanz et al., 1999 ) support this idea. Co-infection is presumably a precondition for recombination to occur, and earlier work has not detected any bar to begomovirus co-infection at either the strain or species level. For instance, Lazarowitz (1991) reported co-infection of squash plants in California with two strains of squash leaf curl virus, and Harrison et al. (1997b ) detected the DNA-A of two distinct begomoviruses in many severely mosaic-affected cassava plants from Uganda.

In this paper, we describe work to assess the extent to which begomovirus co-infection occurs in cotton and other plants in Pakistan, and to explore the variety of begomovirus DNA-A sequences occurring in singly or multiply infected plants. The results show that co-infection and recombination are rife in naturally infected plants, and that they involve several begomoviruses, and hosts in several botanical families. Based on this evidence, we propose a hypothesis to explain the evolutionary divergence, as a group, of begomoviruses in the Indian subcontinent from those occurring in other geographical regions.


   Methods
Sources of viral DNA.
In 1993, and in 1995–1997, young leaves were collected from naturally infected plants of cotton and other species showing begomovirus-like symptoms and growing at widely separated locations in Pakistan. Many samples were collected in Punjab Province (see Harrison et al., 1997a ; Zhou et al., 1998 ; Sanz et al., 1999 ), especially from cotton and okra. DNA-containing extracts were prepared from the leaf tissue as described by Harrison et al. (1997a) . In some instances, viral dsDNA was further purified using a plasmid isolation kit (Boehringer Mannheim) (Sanz et al., 1999 ). Additional leaf samples, mostly collected in Sindh Province in 1997, were used as sources of DNA from begomovirus-infected plants of other species and are listed below. All DNA samples were stored at -20 °C.

(a) Samples from malvaceous species.

p12 Hollyhock (Althaea rosea) from Mirpur Mathelo, north Sindh

p17 China rose (Hibiscus rosa-sinensis) from Moen jo Daro, central Sindh

p26 Hollyhock from Hyderabad, south Sindh

p31 China rose from Karachi, south Sindh

S12 and S13 Hollyhock from Mirpur Mathelo

S26 Hollyhock from Hyderabad

(b) Samples from other species.

p1 Tomato (Lycopersicon esculentum) from Multan, Punjab

p19 Solanum nigrum from Nawabshah, central Sindh

p20 Guar (Cyamopsis tetragonoloba) from Shahdadpur, central Sindh

p24 and p25 Unidentified weeds from Mirpur Khas, south Sindh

p27 Tobacco (Nicotiana tabacum) from Hyderabad

p28 Tomato from Tando Muhammad Khan, south Sindh

p29 Bottlegourd (Luffa cylindrica) from Tando Muhammad Khan

p30 Watermelon (Citrullus lanatus) from Tando Muhammad Khan

All the samples from Sindh, except those from Mirpur Mathelo, were collected outside the area of the cotton leaf curl epidemic.

PCR and sequence determination.
The first kind of analysis was used to test for mixed infections involving any of the four main types of CLCuV-PK (clc26, clca, clc72b and clc804a) and the isolates of OYVMV described by Zhou et al. (1998) . Five pairs of primers were designed, four that were specific for DNA-A of each type of CLCuV-PK and one OYVMV-specific pair that detected DNA-A of isolates of both type oyvm201 and type oyvm301. Positions of the primers are shown in Fig. 1; the primer pairs were (sequences not listed in full are given by Zhou et al., 1998 ): (a) for CLCuV-PK type 26, primer pair CL-CR/R2 and CL-72/F (5' ATTCGAGGGTGTGTTGATGGC 3', identical to nt 2469–2489); (b) for CLCuV-PK type a, primer pair CL-CR/R2 and CL-AR/F2 (5' GCGTTTGTTTTTAAAGCACGTGG 3', nt 2556–2578); (c) for OYVMV, primer pair CL-CR/R2 and OYV/F (5' TGGGTGAGAAAGACGAATGCT 3', nt 1579–1599); (d) for CLCuV-PK type 72b, primer pair CL72-AL/R and CL72-AL/F; (e) for CLCuV-PK type 804a, primer pair CL800-AL/R and CL11/F. Each primer pair was used in a separate reaction (Harrison et al., 1997a ) to test samples from malvaceous plants.
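As an illustration only, the three primers whose sequences are quoted above can be checked for internal consistency (primer length against the stated nucleotide span) and located on a circular template. The sketch below is not part of the published method. The function names and the toy genome are hypothetical, the search is for exact matches on one strand only, and the toy genome is built so that the primer site spans nt 1 simply to show why a circular molecule must be searched across its origin.

```python
# Primer sequences and coordinates as quoted in the text (5' to 3').
PRIMERS = {
    "CL-72/F": ("ATTCGAGGGTGTGTTGATGGC", 2469, 2489),
    "CL-AR/F2": ("GCGTTTGTTTTTAAAGCACGTGG", 2556, 2578),
    "OYV/F": ("TGGGTGAGAAAGACGAATGCT", 1579, 1599),
}


def span_length(start: int, end: int) -> int:
    """Length in nt of an inclusive coordinate span."""
    return end - start + 1


def find_on_circle(genome: str, primer: str) -> list[int]:
    """Return 1-based start positions of exact primer matches on a
    circular genome (one strand only, no mismatches allowed)."""
    primer = primer.upper()
    # Append the first len(primer) - 1 nt so matches can wrap past the end.
    doubled = (genome + genome[: len(primer) - 1]).upper()
    return [i + 1 for i in range(len(genome)) if doubled.startswith(primer, i)]


# Hypothetical 31 nt circular template in which the CL-72/F site wraps
# around the end of the linear representation.
toy_genome = "GATGGC" + "A" * 10 + "ATTCGAGGGTGTGTT"

for name, (seq, start, end) in PRIMERS.items():
    print(name, len(seq), span_length(start, end))
print(find_on_circle(toy_genome, PRIMERS["CL-72/F"][0]))
```

A real specificity check would also have to consider the reverse primer, the complementary strand and tolerated mismatches, none of which is modelled here.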



Fig. 1. Diagram showing regions of begomovirus DNA-A amplified in PCR with different combinations of primers. The grey circle represents DNA-A, grey arrows indicate the positions of open reading frames, and the black triangle marks the origin of replication (nt 1). Black lines indicate the regions amplified with the primers represented by black squares. Products shown outside the DNA-A circle are those analysed in the overlapping sequence method; those obtained with the virus type-specific primers are shown inside the circle.

 
The second kind of analysis used three other pairs of primers (Sanz et al., 1999 ) that represent relatively well conserved sequences in DNA-A of Pakistani begomoviruses, and amplify DNA fragments with overlapping sequences (Fig. 1). These primers were used to test samples from a wider range of plant species, both malvaceous and non-malvaceous, with the aim of detecting a greater variety of begomoviruses. Excluding the primer sequences, the fragment amplified with primers IRv and IRc (Fragment 2) overlaps with 202 nt of Fragment 1 (amplified with primers AC1v and AC1c), and with 79 nt of Fragment 3 (amplified with primers CPv and CPc). Fragment 3 includes the CP gene, except for its 3'-terminal 15 nt. However, primer CL3a (Harrison et al., 1997a ) was used instead of primer CPc for seven DNA samples; the fragment thus amplified (Fragment 3a) lacks 73 nt at the 3' end of the CP gene. Each primer pair was used in a separate reaction, in conditions described by Sanz et al. (1999) . The amplified fragments were cloned, and their sequences were determined with an automatic sequencer (ABI Prism, Perkin Elmer) (Sanz et al., 1999 ).
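The logic of the overlapping sequence method can be sketched as follows: if two fragments amplified from the same plant disagree in the region they share, more than one DNA-A sequence was probably present. The code is a minimal illustration, not the procedure used in this work. It assumes that the 3' end of the upstream fragment overlaps the 5' end of the downstream fragment, that the two are already in register with no gaps, and that any mismatch counts as a difference (`max_mismatches=0`). The function names and the toy sequences are hypothetical.

```python
def overlap_mismatches(upstream: str, downstream: str, overlap_len: int) -> int:
    """Count mismatches between the last overlap_len nt of `upstream`
    and the first overlap_len nt of `downstream`."""
    if overlap_len <= 0 or overlap_len > min(len(upstream), len(downstream)):
        raise ValueError("overlap_len must fit inside both fragments")
    tail = upstream[-overlap_len:].upper()
    head = downstream[:overlap_len].upper()
    return sum(a != b for a, b in zip(tail, head))


def suggests_mixed_infection(
    upstream: str, downstream: str, overlap_len: int, max_mismatches: int = 0
) -> bool:
    """True when the shared region differs by more than max_mismatches."""
    return overlap_mismatches(upstream, downstream, overlap_len) > max_mismatches


# Toy fragments with a 5 nt overlap (the real overlaps are 202 nt and 79 nt).
fragment_1 = "ATGCCGTTAGGCTA"
fragment_2_concordant = "GGCTA" + "TTTACG"
fragment_2_discordant = "GGATA" + "TTTACG"

print(suggests_mixed_infection(fragment_1, fragment_2_concordant, 5))
print(suggests_mixed_infection(fragment_1, fragment_2_discordant, 5))
```

Concordant overlaps do not prove single infection, because a second sequence may simply not have been amplified or cloned.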

Sequence analysis.
The new sequences (EMBL acc. nos AJ245495–AJ245501 and AJ270853–AJ270873) obtained as described above were compared with other partial DNA-A sequences obtained by Sanz et al. (1999) (EMBL acc. nos AJ228561–AJ228599) but not previously analysed in this manner. The comparisons also included the relevant parts of the published sequences of DNA-A of the four types of CLCuV-PK and two types of OYVMV (AJ002447–AJ002459), and of two types of tomato leaf curl virus from India (U15015 and U38239). Sequence data were analysed with the Wisconsin Package, version 8.1 (Anon., 1994 ). Multiple alignments were optimized manually. A phylogenetic tree representing the CP gene (768 nt) of 33 virus isolates was constructed by a Maximum Likelihood method using PUZZLE version 4.0, with 10000 puzzling steps. Where some nucleotides at the 3' end of the CP gene were not determined, as explained above, it was assumed that the open reading frames of all isolates had the same length as that of isolate clc26, and the missing nucleotides were considered to be unknown.
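One simple way to treat undetermined nucleotides when comparing pre-aligned sequences is to leave them out of both the numerator and the denominator of the percentage identity. The sketch below shows this under stated assumptions: the sequences are already aligned and of equal length, unknown positions are coded as `N`, and the function name and toy sequences are hypothetical. It does not reproduce the Wisconsin Package or PUZZLE calculations.

```python
def percent_identity(seq_a: str, seq_b: str, unknown: str = "N") -> float:
    """Percentage identity of two pre-aligned sequences, skipping any
    position where either sequence has the `unknown` symbol."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == unknown or b == unknown:
            continue  # undetermined position: not counted either way
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no determined positions to compare")
    return 100.0 * matches / compared


# Toy example: the last 3 nt of the second sequence were not determined.
reference = "ATGGCATTAA"
partial = "ATGGCTTNNN"
identity = percent_identity(reference, partial)  # 6 of 7 compared positions match
print(f"{identity:.1f}%")
```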


   Results
Top
Abstract
Introduction
Methods
Results
Discussion
References
 
Detection of co-infection in malvaceous plants by PCR
Twenty-four samples of leaf curl-affected cotton, 16 of okra and 3 of hollyhock, collected at widely separated locations in Pakistan in 1993 or 1995–1997, were tested by PCR with primers specific for DNA-A of each of the four main types of CLCuV-PK and that of OYVMV; fragments of the expected size were consistently amplified. In cotton, DNA typical of CLCuV-PK type a was found in 1993, 1995 and 1997 (in a total of 3 samples), that of types 26 and 72b in 1995–1997 (17 and 12 samples, respectively) but that of type 804a only in 1996 (4 samples). No amplification with the OYVMV-specific primers was obtained with samples from cotton. Only one type of CLCuV-PK DNA was detected in 16 of the cotton samples (data not shown) but, of the 23 samples collected in 1995–1997, 8 contained DNA typical of two or three types (Table 1). Multiple infections were not found in samples collected near the southern (e.g. north Sindh) or western borders of the area affected by the leaf-curl epidemic (0 of 5 samples), where the disease incidence was <1%, but were common (8 of 19 samples) in the main epidemic area, where the disease incidence was 50–100%.


Table 1. Evidence of begomovirus co-infection in malvaceous plants with leaf curl symptoms sampled in 1995–1997

 
In okra, OYVMV-like DNA-A was detected in 5 of the 6 samples from plants showing yellow vein mosaic symptoms, and none contained that of CLCuV-PK. In contrast, 8 of the 10 okra plants with leaf curl symptoms contained DNA typical of CLCuV-PK (types a, 26 or 804a). Four of the plants that contained CLCuV-PK DNA also yielded a product in PCR with the OYVMV-specific primers (Table 1). Also, a hollyhock plant with leaf curl symptoms apparently contained DNA of both CLCuV-PK type 804a and OYVMV, and a second hollyhock sample (S12) contained both of these, plus DNA of CLCuV-PK type 26 (Table 1). In a further test, the 1486 bp fragment amplified with the OYVMV primers from leaf curl-affected okra T19 (Table 1) was cloned and sequenced. In the intergenic region (IR), the iterons (Arguello-Astorga et al., 1994 ) were very like those of OYVMV isolate 201, and the whole sequence had its greatest similarity to that of OYVMV isolate 301 (88% identical). This sequence was only 74% identical with the sequence of CLCuV-PK type 804a (Zhou et al., 1998 ) and, in particular, lacked a sequence corresponding to the type 804a primer, CL800-AL/R, and so was distinct from the fragment amplified with the CLCuV-PK 804a primers. Thus, sample T19 was apparently co-infected, as were okra samples 206, 208 and T14. The results therefore provide evidence that multiple infection with begomoviruses occurs in both okra and hollyhock as well as in cotton.

The two leaf curl-affected okra plants in which CLCuV-PK was not detected gave a product with the OYVMV-specific primers even though yellow vein mosaic symptoms were uncommon (<1%) or apparently absent in the source fields. Further work is needed to clarify the aetiology of these two kinds of begomovirus-associated disease symptom in okra.

Co-infection detected by sequence comparisons
The three primer pairs, which amplify overlapping sequences in begomovirus DNA-A, were used in tests on samples from a range of begomovirus-infected plant species. Overlapping DNA fragments were obtained from 18 samples and, in 12 of these, the overlapping sequences differed (Table 2). Hence, begomovirus co-infection seems easy to find in Pakistan in non-malvaceous plants, such as bottlegourd, tobacco and tomato, as well as in malvaceous species. However, nearly all the sequences closely resembled those of reference virus isolates (sequence identity always at least 89%, usually >93%), as indicated in Table 2. In most instances, affinities of the non-overlapping and overlapping parts of Fragment 1 were closest to the same reference isolate; the same was true for Fragments 3 and 3a.


Table 2. Components of sequence mixtures revealed by comparison of overlapping sequences in two regions of DNA-A

 
In tests of this kind, a pair of co-infecting viruses would theoretically yield six distinct sequence fragments but the greatest number obtained in practice was three, probably because the primers could not amplify all the relevant sequences or because not all the kinds of fragment amplified were cloned and sequenced. Nevertheless, Table 2 brings out several points. With sample 49, for example, the sequences differed at both overlaps: Fragment 1 was novel, Fragment 2 was like CLCuV-PK type a and Fragment 3 was like type 804a. Thus this sample appeared to contain at least two viruses, and possibly three, depending on whether or not the novel AC1 sequence in Fragment 1 was connected to the CLCuV-PK type 804a-like Fragment 3. With other samples, such as sample 85, three sequence fragments were amplified but only one of the two overlaps provided evidence of multiple infection. With a few samples, all three sequence fragments were amplified and no sequence differences occurred at the overlaps.

Relationships among isolates
When comparing 13 CP genes of Pakistani begomovirus isolates from malvaceous plants, Zhou et al. (1998) found that the isolates from cotton fell into three clusters, with the isolates from okra being in, or close to, two of these clusters. Twenty-two additional begomovirus CP gene sequences from Pakistani malvaceous and non-malvaceous plants are now available, so making a more comprehensive analysis possible (Fig. 2). The CP gene sequences from cotton or okra now fall into four main clusters, three of which also contain CP genes of viruses from non-malvaceous plants. The new cluster, typified by isolate olc (okra leaf curl) 90, includes the CP genes of viruses from bottlegourd, hollyhock, tobacco and a weed, all of which are virtually indistinguishable from that of olc90. Among the cotton viruses, several examples were found of two additional minor CP gene variants [typified by clc49 (1·9% nucleotide sequence difference from clc804a) and clc82 (1·3% difference from olc311)] and, among the viruses from okra or saklai (Hibiscus tiliaceus; slc60), three extra variants with up to 6% sequence difference from CLCuV-PK type 804a could be recognized. Among viruses from other species, the CP genes of p28 (tomato) and p30 (watermelon) had strong affinities (<3·3% difference) to that of CLCuV-PK type 804a (Fig. 2). The two virus isolates from leaf curl-affected tomato in India [U15015 (Padidam et al., 1995a ) and U38239] have CP genes that differ substantially from all the others but the U38239 CP gene is closest to that of CLCuV-PK type 72b (Fig. 2).



Fig. 2. Phylogenetic tree obtained, using the PUZZLE program, from an alignment of the sequences of 33 begomovirus CP genes from Pakistan or India. Numbers above each branch are the percentage support for that branch. The scale bar indicates the horizontal distance equivalent to 0·1 replacements per position. The sources of viral DNA are cotton with leaf curl (clc prefix), okra with leaf curl (olc) or yellow vein mosaic (oyvm), saklai with leaf curl (slc) and other plant species, as indicated.

 
Examination of the sequences of Fragment 2 of virus isolates from non-malvaceous species, including the overlaps with Fragments 1 and 3 (Table 2), revealed that, in contrast to the CP genes, most of these sequences have affinities to those of one or other of the above two tomato leaf curl viruses (ToLCV) from India. For example, the sequence of ToLCV-U15015 corresponding to Fragment 2 (713 nt) was 95·5–95·7% identical with the Fragment 2 sequences of p19 (from Solanum nigrum), p20 (Cyamopsis tetragonoloba), p28 (tomato) and p29 (bottlegourd); the p19, p28 and p29 sequences were >99% identical with that of p20. Similarly, the sequences of the 3' region of Fragment 1 of p28 and p29 were each like those of one of the ToLCV isolates (Table 2). However, one of the overlapping sequences at the 3' region of the AC1 gene, obtained from tobacco (p27), is CLCuV-PK-like. Thus some viruses from non-malvaceous plants have sequence resemblance in parts of their DNA-A to isolates from malvaceous species. Conversely, the 3' region of the AC1 gene of one component of a mixed infection of okra (sample O1; Table 2) resembles that of a ToLCV isolate.

Another complexity revealed by our sequence comparisons is that different parts of some individual sequences have different affinities and appear to have been produced by recombination. Clear-cut changes in affinity occurred at points in the IR, or in the AC1 or CP genes (Table 3). For instance, in the AC1 sequence from sample 74, the sequences brought together are typical of those of the two most disparate types of CLCuV-PK described by Zhou et al. (1998) , types 26 and 72b. Similarly, the CP gene of slc60 combines sequences typical of CLCuV-PK types 26 (5' half) and 804a (3' half). Of the equivalent sequences of olc1 (from sample O1) and olc16, the 5' 700 nt closely resemble those of CLCuV-PK type 804a but the rest are unlike any of the other sequences, although similar to one another. In other instances (p12, p31), putative recombinant viral sequences from malvaceous plants incorporate elements typical of ToLCV; and part of a viral sequence from tobacco (p27) is typical of an okra-infecting virus (oyvm201; Table 3). In these last three examples, the presumed recombination site is close to the origin of replication (ori), but it cannot be located precisely because almost all the reference isolates have the same 21 nt sequence at this point. To check the reliability of the sequences determined in this work, two independent clones of 12 PCR products were sequenced. The sequences included examples of Fragments 1, 2 and 3, Fragments from plants considered (because of the sequence differences between overlapping Fragments) to be dually infected, and putative recombinant Fragments. In 11 instances, the duplicate sequences were identical; in the twelfth, we could not decide whether one of the two sequences was derived from a true recombinant DNA-A, or was an artefact produced by template-switching during PCR. We conclude that template-switching, if it occurred at all, could not explain the number of recombinant sequences observed.


Table 3. Examples of putative recombinant sequences in Pakistani begomoviruses

 
In further analyses, the sequences of Fragment 2 (Fig. 3) were divided into four parts: (a) extending in the 5' direction from the IR, in the AC1 gene (252–258 nt in length), (b) the part of the IR on the 5' side of ori (145–164 nt), (c) the part of the IR on the 3' side of ori (115–142 nt), and (d) extending in the 3' direction from the IR, in the AV2 gene (181 nt). For the p27 sequence (Fig. 3), the percentage identities of these four parts with those of ToLCV-U38239 are 80, 60, 92 and 97, respectively whereas, when compared with oyvm201, they are 97, 91, 44 and 72. The converse relationships are found in the p12 sequence, which has percentage identities of 91, 91, 45 and 71 with the tomato virus, and 79, 60, 94 and 95 with oyvm201. Fragment 2 of p31 (Fig. 3) apparently has one recombination site at ori and a second near the start of the AV2 gene. Its percentage sequence identities are 95, 91, 47 and 74 with ToLCV-U38239, and 79, 55, 97 and 78 with CLCuV-PK type 26. The 3' (AV2) part of this sequence is novel, so providing evidence that three viruses have contributed DNA sequences to p31. Thus, recombination sites apparently can occur in the AC1, AV2 or CP genes as well as at ori, and begomovirus DNA-A sequences from non-malvaceous plants apparently have exchanged portions with those from malvaceous species.
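The reasoning applied to the four parts of Fragment 2 can be expressed as a small calculation: each part is assigned to the reference with which it has the highest identity, and a change of assignment between adjacent parts marks a putative recombination site. The percentage identities in the sketch are those quoted above. The 85% cut-off below which a part is labelled "novel", the boundary names and the function names are assumptions made for illustration and are not taken from the analysis itself.

```python
# The four parts of Fragment 2, in 5'-to-3' order, and the boundaries between them.
PARTS = ("AC1", "IR 5' of ori", "IR 3' of ori", "AV2")
BOUNDARIES = ("AC1/IR junction", "ori", "IR/AV2 junction")

# Percentage identities quoted in the text, per part and reference.
IDENTITIES = {
    "p27": {"ToLCV-U38239": (80, 60, 92, 97), "oyvm201": (97, 91, 44, 72)},
    "p12": {"ToLCV-U38239": (91, 91, 45, 71), "oyvm201": (79, 60, 94, 95)},
    "p31": {"ToLCV-U38239": (95, 91, 47, 74), "clc26": (79, 55, 97, 78)},
}


def closest_reference(identities: dict, novel_below: float = 85) -> list[str]:
    """For each part, name the reference with the highest identity, or
    'novel' when even the best identity is below `novel_below` (assumed cut-off)."""
    labels = []
    for i in range(len(PARTS)):
        ref, best = max(
            ((name, values[i]) for name, values in identities.items()),
            key=lambda pair: pair[1],
        )
        labels.append(ref if best >= novel_below else "novel")
    return labels


def affinity_changes(labels: list[str]) -> list[str]:
    """Boundaries at which the closest reference changes."""
    return [BOUNDARIES[i] for i in range(len(BOUNDARIES)) if labels[i] != labels[i + 1]]


for isolate, identities in IDENTITIES.items():
    labels = closest_reference(identities)
    print(isolate, labels, affinity_changes(labels))
```

With these assumptions the sketch places one change of affinity at ori for p12 and p27, and two (at ori and at the IR/AV2 junction) for p31, in line with the interpretation given in the text.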




Fig. 3. Comparison of sequences of Fragment 2 from three putative recombinant begomovirus isolates (p12, p27 and p31) with the equivalent sequences of oyvm201, clc26 and ToLCV (U38239, shown twice). The most common residue at each position is shown in reverse contrast. Arrows mark the left and right ends of the IR and the symbol ‘O’ marks ori: these points demarcate the four parts into which the sequences were divided for further analysis (see text). Fragment 2 of U38239 runs from nt 2327 (here numbered 1) of the complete DNA-A sequence, through nt 1 (here 410) to nt 324 (733).

 
In the IR, seven kinds of combination of its 5' and 3' halves were reported by Zhou et al. (1998) , and the sequences determined by Sanz et al. (1999) include two others. For convenience, Zhou et al. (1998) designated the kinds of 5' sequence A–F and the kinds of 3' sequence 1–3. We can now add six more combinations; the complete range is shown in Table 4. Inspection of the AC1 and AV2 sequences in Fragment 2 indicated that they were, in almost all instances, typical of the virus variant that provided the IR sequence to which they were joined. Indeed a classification based on just the four AC1 codons closest to the 5' part of the IR was almost the same as that based on the complete 5' half of the IR, shown in Table 4. However, we suspect that a few of the IR sequence combinations shown in Table 4 may not represent separate recombination events. For example, IR sequences H, I and J contain the same 6-nt iterons (Table 4), and sequence categories H4, I4 and J6 have the same length and differ from one another (7–17%) less than do the other IR sequence categories. Also, AC1 sequences (258 nt) adjacent to the IR 5' sequences of categories H, I and J differed only by 7–10%. These various AC1-IR regions may therefore represent three lineages that have diverged somewhat by genetic drift following a single recombination event. Likewise, the 9–10% difference between the AC1 3' sequences (258 nt) of isolate p12 and one of the isolates from sample O1, on the one hand, and those of isolates p31 and ToLCV-U38239, on the other hand, all of which have IR 5' sequences of category G, suggests post-recombinational divergence.


Table 4. Categories of intergenic region among 31 Pakistani begomovirus sequences

 
Comparison of the iteron sequences of all the isolates (Table 4) supports the classification (A–J) of the 5' halves of the IR but is somewhat less discriminatory. For instance, the iterons in sequences of category G are contained within those of category B although the complete B and G sequences differ considerably.


   Discussion
Our experiments were not intended to establish the aetiology of the different begomovirus-associated diseases found in Pakistan. This must be the subject of further research. Our aim was to study the kinds of begomovirus DNA-A occurring in a range of plant species and to assess the variety of interactions among the virus genomes. Our results provide much new information on these topics. They indicate that begomovirus co-infection is common in Pakistan. It was detected in 14 out of 43 plants tested by virus type-specific PCR, and in 12 out of 18 tested by the overlapping sequence method. At least 5 plants contained sequences typical of three types of DNA-A. Although the same samples were not used for the two kinds of test, the overlapping sequence method seemed to be better at detecting the occurrence of multiple viral sequences in malvaceous species (8/12 samples) than the PCR method (14/43 samples), which used primers of greater specificity, based on more variable regions of DNA-A. Indeed, even the remarkably high figure obtained with the overlapping sequence method must be considered a minimum, because it is unlikely that all the kinds of begomovirus sequence present in the samples could be amplified with the primers used, and that all those amplified were cloned and sequenced. However, although the overlapping sequence approach was effective for detecting multiple sequences in single plants, it provides no information on the relative concentrations of the different kinds of DNA-A in them.
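For ease of comparison, the detection frequencies quoted in this paragraph can be converted to percentages. The short calculation below uses only the counts given in the text; the dictionary keys and the function name are labels chosen for illustration. Because the two methods were applied to different samples, the percentages are descriptive and not a formal comparison.

```python
# (plants with evidence of co-infection, plants tested), as quoted in the text.
COUNTS = {
    "type-specific PCR, malvaceous plants": (14, 43),
    "overlapping sequences, all samples": (12, 18),
    "overlapping sequences, malvaceous samples": (8, 12),
}


def percent_detected(detected: int, tested: int) -> float:
    """Detection frequency as a percentage, rounded to one decimal place."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return round(100 * detected / tested, 1)


for method, (detected, tested) in COUNTS.items():
    print(f"{method}: {detected}/{tested} = {percent_detected(detected, tested)}%")
```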

Evidence was obtained of co-infection in cotton, in other malvaceous species and in non-malvaceous species. Sequences typical of viruses from non-malvaceous plants were found in malvaceous ones, and sequences of CLCuV-PK were found in non-malvaceous plants. One factor favouring the spread of begomoviruses among these plants is that many dicotyledonous species in Pakistan are hosts of whiteflies of the B. tabaci complex (Ali et al., 1995 ), which are the known or likely vectors of all the viruses. Another possible factor, which may favour virus establishment in the vector-inoculated plants, is that begomovirus replication and/or movement proteins, which are host-adapted, may be able to mediate in trans the replication and systemic invasion of closely related co-infecting viruses that would otherwise be unable to infect the relevant plant species systemically. This phenomenon has been observed in experiments (Lazarowitz, 1991 ; Frischmuth et al., 1993 ; Ingham et al., 1995 ; Hou et al., 1998 ), and might partly explain the frequency and variety of the begomovirus sequence mixtures found in naturally infected plants in Pakistan.

The increased number of nucleotide sequences now available for Pakistani begomoviruses has provided a clearer picture of the extents of variation and affinity among them. Considerable similarities were found among the viral CP genes, with most of those from non-malvaceous plants being classified in the clusters containing CP genes obtained from malvaceous plants. This conclusion parallels that drawn from comparisons of epitope profiles of begomovirus particles (Harrison et al., 1997a ). In contrast, comparisons of IR sequences, the most variable part of begomovirus DNA-A (Padidam et al., 1995b ), provided much new evidence of variation and recombination. Additional kinds of IR sequence were obtained from infected malvaceous and non-malvaceous species. Moreover, some of the sequences from malvaceous plants incorporated elements indistinguishable from sequences of viruses infecting non-malvaceous species; and some sequence elements obtained from infected non-malvaceous plants were typical of viruses from malvaceous species. Many putative recombinant sequences had a recombination site close to ori, in agreement with previous findings (Stanley, 1995 ; Zhou et al., 1998 ; Sanz et al., 1999 ). Interestingly, we found no evidence for recombination in the 5' half of the IR, or in the region (250 nt) of the AC1 gene immediately adjoining it. The association of the 5' half of the IR and its cognate AC1 gene was therefore maintained. This is probably required for virus viability because, for the genomic DNA to be replicated, a specific amino-acid motif in the N-terminal portion of the AC1-encoded Rep protein must recognize the iteron(s) in the 5' half of the IR (Lazarowitz et al., 1992 ; Fontes et al., 1994 ; Jupin et al., 1995 ). Also, iteron sequences differ among begomoviruses, and the recognition is nucleotide sequence-specific. In contrast, a putative recombination site was detected in the AV2 gene, near its junction with the IR. 
The frequency of supposed recombinant sequences among Pakistani begomoviruses presumably reflects the many opportunities for recombination presented by multiple infection of individual plants, together with the existence of shared sequence motifs at various points in the viral DNA-A.
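
As an illustration only, the idea of a chimaeric sequence with a single recombination site can be sketched in a few lines of Python. This is not the procedure used in this work, where recombinants were inferred from sequence comparisons; the sequences, the function name best_breakpoint and the assumption of pre-aligned, equal-length sequences with one crossover are all hypothetical. The sketch finds the crossover position at which a query is best explained as the left part of one putative parent joined to the right part of another.

```python
def best_breakpoint(query: str, parent_a: str, parent_b: str) -> tuple[int, int]:
    """Return (pos, mismatches) for the single crossover that best explains
    `query` as parent_a[:pos] + parent_b[pos:].

    Assumes the three sequences are already aligned and of equal length.
    Ties are resolved to the leftmost position; in practice a breakpoint can
    only be localized to the interval between informative (differing) sites.
    """
    n = len(query)
    if not (n == len(parent_a) == len(parent_b)):
        raise ValueError("sequences must be aligned and of equal length")

    # left[i]: mismatches between query[:i] and parent_a[:i]
    left = [0] * (n + 1)
    for i in range(n):
        left[i + 1] = left[i] + (query[i] != parent_a[i])

    # right[i]: mismatches between query[i:] and parent_b[i:]
    right = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        right[i] = right[i + 1] + (query[i] != parent_b[i])

    pos = min(range(n + 1), key=lambda p: (left[p] + right[p], p))
    return pos, left[pos] + right[pos]


# Toy 20-nt "parents" (hypothetical) that differ at positions 3, 7, 11, 15, 19.
parent_a = "ACGTTGCAACGTTGCAACGT"
parent_b = "ACGATGCTACGATGCTACGA"
chimaera = parent_a[:10] + parent_b[10:]  # constructed crossover at position 10

print(best_breakpoint(chimaera, parent_a, parent_b))  # (8, 0)
```

The reported position is 8 rather than 10 because the two toy parents are identical between positions 8 and 10, so any crossover from 8 to 11 explains the chimaera equally well. Shared sequence motifs of this kind are the reason a recombination site can usually be assigned only to a region, not to a single nucleotide.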

The pattern of relationships among Pakistani begomoviral DNA-A sequences resembles a network, with frequent evolutionary interactions among viruses from malvaceous species, and less frequent interactions between these viruses and those from non-malvaceous species. This picture confirms and extends that provided by the complementary data of Sanz et al. (1999), who found that begomoviral sequences from cotton had essentially the same kinds of nucleotide diversity as those from other malvaceous species, and argued that all these sequences represent a single undifferentiated population. We conclude that begomoviruses in Pakistan, and probably in the whole Indian subcontinent, have a complex lineage, largely irrespective of their preferred host species, and distinct from the lineages of begomoviruses associated with similar diseases in other geographical areas, such as the Americas or the African/Mediterranean region.

We have not found the complete DNA-A of CLCuV-PK in Pakistan outside the area affected by the cotton leaf curl epidemic (Punjab and the extreme north of Sindh), although we now have evidence from the sequences found in mixed infections, and from the recombinant sequences, that pieces of sequence that are typical of each of the main types of CLCuV-PK DNA-A (a, 26, 72b and 804a) occur in begomovirus-infected malvaceous or non-malvaceous species in central or south Sindh. A form of CLCuV-PK DNA-A could presumably have emerged in the Punjab by rounds of recombination among such viruses. We have also obtained evidence that the begomovirus epidemic in cotton, in turn, has resulted in numerous multiple infections, and probably yet more rounds of recombination, which have ensured that further types of begomovirus continued to emerge in Pakistan. Whether cotton leaf curl disease in Pakistan is caused by any one, or some combination, of these begomovirus variants, perhaps in association with another virus-like agent (Liu et al., 1998; Mansoor et al., 1999), remains to be determined.


   Acknowledgments
 
Our work was supported in part by contract CI1*-CT94-0052 from the European Commission. SCRI is grant-aided by the Scottish Executive Rural Affairs Department.


   Footnotes
 
The EMBL accession numbers of the sequences reported in this paper are AJ245495–AJ245501 and AJ270853–AJ270873.

b Present address: CropDesign N.V., Technologiepark 3, B-9052 Gent, Belgium.

c Permanent address: Institute of Biotechnology, Zhejiang University, Hangzhou 310029, China.

d Permanent address: Virology Section, Ayub Agricultural Research Institute, Faisalabad, Pakistan.


   References
 
Ali, M., Ahmad, Z., Tanveer, M. & Mahmood, T. (1995). Cotton Leaf Curl Virus in the Punjab: Current Situation and Review of Work. Multan: Central Cotton Research Institute/Ministry of Food, Agriculture and Livestock, Government of Pakistan/Asian Development Bank.

Anon. (1994). Program Manual for the Wisconsin Package, Version 8. Madison: Genetics Computer Group.

Arguello-Astorga, G., Herrera-Estrella, L. & Rivera-Bustamante, R. (1994). Experimental and theoretical definition of geminivirus origin of replication. Plant Molecular Biology 26, 553-556.

Brown, J. K. & Bird, J. (1992). Whitefly-transmitted geminiviruses and associated disorders in the Americas and the Caribbean Basin. Plant Disease 76, 220-225.

Brown, J. K., Frohlich, D. R. & Rosell, R. C. (1995). The sweetpotato or silverleaf whiteflies: biotypes of Bemisia tabaci or a species complex? Annual Review of Entomology 40, 511-534.

Fontes, E. P. B., Gladfelter, H. J., Schaffer, R. L., Petty, I. T. D. & Hanley-Bowdoin, L. (1994). Geminivirus replication origins have a modular organization. Plant Cell 6, 405-416.

Frischmuth, T., Roberts, S., von Arnim, A. & Stanley, J. (1993). Specificity of bipartite geminivirus movement proteins. Virology 196, 666-673.

Harrison, B. D. (1985). Advances in geminivirus research. Annual Review of Phytopathology 23, 55-82.

Harrison, B. D. & Robinson, D. J. (1999). Natural genomic and antigenic variation in whitefly-transmitted geminiviruses (begomoviruses). Annual Review of Phytopathology 37, 369-398.

Harrison, B. D., Liu, Y. L., Khalid, S., Hameed, S., Otim-Nape, G. W. & Robinson, D. J. (1997a). Detection and relationships of cotton leaf curl virus and allied whitefly-transmitted geminiviruses occurring in Pakistan. Annals of Applied Biology 130, 61-75.

Harrison, B. D., Zhou, X., Otim-Nape, G. W., Liu, Y. & Robinson, D. J. (1997b). Role of a novel type of double infection in the geminivirus-induced epidemic of severe cassava mosaic in Uganda. Annals of Applied Biology 131, 437-448.

Hou, Y.-M., Paplomatas, E. J. & Gilbertson, R. L. (1998). Host adaptation and replication properties of two bipartite geminiviruses and their pseudorecombinants. Molecular Plant–Microbe Interactions 11, 208-217.

Ingham, D. J., Pascal, E. & Lazarowitz, S. (1995). Both bipartite geminivirus movement proteins define viral host range, but only BL1 determines viral pathogenicity. Virology 207, 191-204.

Jupin, I., Hericourt, F., Benz, B. & Gronenborn, B. (1995). DNA replication specificity of TYLCV geminivirus is mediated by the amino-terminal 116 amino acids of the Rep protein. FEBS Letters 362, 116-120.

Lazarowitz, S. G. (1991). Molecular characterization of two bipartite geminiviruses causing squash leaf curl disease: role of viral replication and movement functions in determining host range. Virology 180, 70-80.

Lazarowitz, S. G., Wu, L. C., Rogers, S. G. & Elmer, J. S. (1992). Sequence-specific interaction with the viral AL1 protein identifies a geminivirus DNA replication origin. Plant Cell 4, 799-809.

Liu, Y., Robinson, D. J. & Harrison, B. D. (1998). Defective forms of cotton leaf curl virus DNA-A that have different combinations of sequence deletion, duplication, inversion and rearrangement. Journal of General Virology 79, 1501-1508.

Mansoor, S., Khan, S. H., Bashir, A., Saeed, M., Zafar, Y., Malik, K. A., Briddon, R., Stanley, J. & Markham, P. G. (1999). Identification of a novel circular single-stranded DNA associated with cotton leaf curl disease in Pakistan. Virology 259, 190-199.

Nadeem, A. (1995). Molecular characterization of two cotton geminiviruses. PhD Dissertation, University of Arizona, USA.

Nateshan, H. M., Muniyappa, V., Swanson, M. M. & Harrison, B. D. (1996). Host range, vector relations and serological relationships of cotton leaf curl virus from southern India. Annals of Applied Biology 128, 233-244.

Navot, N., Pichersky, E., Zeidan, M., Zamir, D. & Czosnek, H. (1991). Tomato yellow leaf curl virus; a whitefly-transmitted geminivirus with a single genomic molecule. Virology 185, 151-161.

Padidam, M., Beachy, R. N. & Fauquet, C. M. (1995a). Tomato leaf curl geminivirus in India has a bipartite genome and coat protein is not essential for infectivity. Journal of General Virology 76, 25-35.

Padidam, M., Beachy, R. N. & Fauquet, C. M. (1995b). Classification and identification of geminiviruses using sequence comparisons. Journal of General Virology 76, 249-263.

Sanz, A. I., Fraile, A., Gallego, J. M., Malpica, J. M. & García-Arenal, F. (1999). Genetic variability of natural populations of cotton leaf curl geminivirus, a single-stranded DNA virus. Journal of Molecular Evolution 49, 672-681.

Stanley, J. (1995). Analysis of African cassava mosaic virus recombinants suggests strand nicking occurs within the conserved nonanucleotide motif during the initiation of rolling circle DNA replication. Virology 206, 707-712.

Swanson, M. M., Varma, A., Muniyappa, V. & Harrison, B. D. (1992). Comparative epitope profiles of the particle proteins of whitefly-transmitted geminiviruses from nine crop legumes in India. Annals of Applied Biology 120, 425-433.

Zhou, X., Liu, Y., Robinson, D. J. & Harrison, B. D. (1998). Four DNA-A variants among Pakistani isolates of cotton leaf curl virus and their affinities to DNA-A of geminivirus isolates from okra. Journal of General Virology 79, 915-923.

Received 14 January 2000; accepted 3 March 2000.