Further variability within the genus Crinivirus, as revealed by determination of the complete RNA genome sequence of Cucurbit yellow stunting disorder virus

Juan M. Aguilar1, Maribel Franco1, Cristina F. Marco1, Benjamín Berdiales2, Emilio Rodriguez-Cerezo2, Verónica Truniger3 and Miguel A. Aranda3

1 Estación Experimental ‘La Mayora’, Consejo Superior de Investigaciones Científicas, 29750 Algarrobo-Costa, Málaga, Spain
2 Centro Nacional de Biotecnología (CNB), Consejo Superior de Investigaciones Científicas, Campus Universidad Autónoma, 28049 Cantoblanco, Madrid, Spain
3 Centro de Edafología y Biología Aplicada del Segura (CEBAS), Consejo Superior de Investigaciones Científicas, Campus Universitario de Espinardo, Apdo Correos 164, 30100 Espinardo, Murcia, Spain

Correspondence
Miguel Aranda
m.aranda@cebas.csic.es


   ABSTRACT
The complete nucleotide (nt) sequences of genomic RNAs 1 and 2 of Cucurbit yellow stunting disorder virus (CYSDV) were determined for the Spanish isolate CYSDV-AlLM. RNA1 is 9123 nt long and contains at least five open reading frames (ORFs). Computer-assisted analyses identified papain-like protease, methyltransferase, RNA helicase and RNA-dependent RNA polymerase domains in the first two ORFs of RNA1. This is the first study on the sequences of RNA1 from CYSDV. RNA2 is 7976 nt long and contains the hallmark gene array of the family Closteroviridae, characterized by ORFs encoding a heat shock protein 70 homologue, a 59 kDa protein, the major coat protein and a divergent copy of the coat protein. This genome organization resembles that of Sweet potato chlorotic stunt virus (SPCSV), Cucumber yellows virus (CuYV) and Lettuce infectious yellows virus (LIYV), the other three criniviruses sequenced completely to date. However, several differences were observed. The most striking novel features of CYSDV compared to SPCSV, CuYV and LIYV are a unique gene arrangement in the 3'-terminal region of RNA1, the identification in this region of an ORF potentially encoding a protein which has no homologues in any databases, and the prediction of an unusually long 5' non-coding region in RNA2. Additionally, the CYSDV genome resembles that of SPCSV in having very similar 3' regions in RNAs 1 and 2, although for CYSDV similarity in primary structures did not result in predictions of equivalent secondary structures. Overall, these data reinforce the view that the genus Crinivirus contains considerable genetic variation. Additionally, several subgenomic RNAs (sgRNAs) were detected in CYSDV-infected plants, suggesting that generation of sgRNAs is a strategy used by CYSDV for the expression of internal ORFs.

The GenBank accession numbers of the sequences reported in this paper are AY242077 (CYSDV RNA 1) and AY242078 (CYSDV RNA2).


   INTRODUCTION
Cucurbit yellow stunting disorder virus (CYSDV) is a whitefly-transmitted virus which extensively affects cucurbit crops in many warm and temperate areas of production (Abou-Jawdah et al., 2000; Célix et al., 1996; Desbiez et al., 2000; Hassan & Duffus, 1991; Kao et al., 2000; Louro et al., 2000; Wisler et al., 1998). CYSDV can induce yellowing symptoms and cause important yield reductions in infected plants and, therefore, its economic impact is very high. CYSDV is a member of the genus Crinivirus (family Closteroviridae) (Martelli et al., 2000, 2002). Its particles are flexible rods with lengths between 750 and 800 nm (Liu et al., 2000). CYSDV has a narrow host range limited to species of the family Cucurbitaceae and is confined to phloem-associated cells (Célix et al., 1996; Marco et al., 2003). The CYSDV genome consists of two molecules of single-stranded (ss) RNA of positive (+) polarity designated RNAs 1 and 2 (Célix et al., 1996).

In recent years, the number of newly identified whitefly-transmitted criniviruses has increased rapidly (Wisler et al., 1998). Despite this and the very significant negative impact that criniviruses have on crop production, the only reported crinivirus complete genome sequences available to date are those of Lettuce infectious yellows virus (LIYV; Klaassen et al., 1995), the type member of the genus Crinivirus, Sweet potato chlorotic stunt virus (SPCSV; Kreuze et al., 2002) and Cucumber yellows virus (CuYV; Hartono et al., 2003). Analyses of the genome sequences of these three viruses revealed several new features compared to other closteroviruses, for example the identification of a particular gene arrangement of the five-gene module typical of the family Closteroviridae (Karasev, 2000; Klaassen et al., 1995; Martelli et al., 2000, 2002) and the presence of a unique open reading frame (ORF) in the SPCSV genome that putatively encodes a protein belonging to the RNase III family (Kreuze et al., 2002). Livieratos & Coutts (2002) cloned and sequenced cDNAs of the RNA 2 of a CYSDV isolate. Their work showed that the RNA 2 of this isolate is 7281 nt long and contains seven ORFs flanked by 5'- and 3'-untranslated regions of 486 nt and 223 nt, respectively. Going from 5' to 3', proteins potentially encoded by CYSDV RNA 2 are a 5 kDa hydrophobic protein, a 62 kDa heat shock protein 70 homologue (Hsp70h), 59 kDa and 9 kDa proteins of unknown function, a 28·5 kDa coat protein (CP), a 53 kDa minor coat protein (CPm) and a 26·5 kDa protein of unknown function (Livieratos & Coutts, 2002). Thus, the CYSDV RNA 2 contains the closterovirus hallmark gene array, with a similar arrangement to those of LIYV, SPCSV and CuYV (Hartono et al., 2003; Klaassen et al., 1995; Kreuze et al., 2002; Martelli et al., 2000, 2002). 
Although information is available on CYSDV RNA 2 genomic sequences, including isolate variability of the CP and Hsp70h genes (Rubio et al., 1999, 2001), the CYSDV RNA 1, with an estimated size of 9 kb (Célix et al., 1996), has not been sequenced. In this paper, we report the complete nucleotide (nt) sequence of RNAs 1 and 2 of a Spanish isolate of CYSDV. Analyses of the two genomic RNAs showed further new features compared to the other sequenced criniviruses, such as the identification of a new ORF in the 3' region of RNA 1 and an unusually long 5' region in RNA 2 in which no significant ORF could be predicted. In addition, we present data on the possible mechanisms of expression of CYSDV RNAs 1 and 2.


   METHODS
Virus sources and purification of viral RNAs.
The CYSDV isolate (CYSDV-AlLM) characterized in this study was obtained from a naturally infected melon plant collected in Almería (Spain) in 1997, and is maintained in the Estación Experimental ‘La Mayora’-CSIC (Málaga, Spain) on Cucumis melo cv. Amarillo or C. sativus cv. Bellpuig through transmission by Bemisia tabaci (B biotype) (Guirao et al., 1997). Samples of this isolate are kept under the accession number PV-0592/EWSN_6 in the DSMZ-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH collection (Braunschweig, Germany). Additional isolates in this study for which the 5'-terminal sequence of RNA 2 was determined were CYSDV-AM2, obtained from a naturally infected melon plant collected in Almería (Spain) in 1999, CYSDV-AM14 and CYSDV-AM16, obtained from naturally infected cucumber plants collected in Málaga (Spain) in 1999, and CYSDV-Tex, which was obtained from a naturally infected melon plant collected in Texas (USA) in 2002.

Double-stranded RNA (dsRNA) extracts were prepared according to the method of Valverde et al. (1990) with the modifications introduced by Célix et al. (1996). Total RNA was extracted from healthy and CYSDV-infected plants using TRIzol reagent (Sigma) following the manufacturer's instructions.

cDNA synthesis, cloning and sequencing.
To generate cDNAs of the 5'- and 3'-ends of CYSDV dsRNAs we proceeded as follows. First, poly(A) was added to the 3'-ends of the dsRNA minus and plus strands. One microlitre of 0·1 M methylmercuric hydroxide (Sigma) was added to 9 µl of dsRNA extract (10 ng µl⁻¹) and incubated for 10 min at room temperature. One microlitre of 1·4 M β-mercaptoethanol was then added, mixed and incubated on ice for 10 min. To that mixture, 4 µl of 5× yeast poly(A) polymerase buffer (USB) was added together with 1 µl of 2 mM ATP, 1 µl of RNase inhibitor (10 U µl⁻¹; Amersham Pharmacia Biotech), 1 µl of yeast poly(A) polymerase (USB) and 2 µl of sterile distilled water. The mixture was incubated for 30 min at 37 °C, phenol/chloroform extracted and ethanol-precipitated. After drying, the resulting concentrate was dissolved in 11 µl of water and 1 µl of 0·1 M methylmercuric hydroxide was added. The sample was incubated for 10 min at room temperature and then 1 µl of 1·4 M β-mercaptoethanol was added. cDNAs were synthesized by adding 5 µl of 5× reverse transcriptase buffer (Roche Diagnostics), 2 µl of 5 mM dNTPs, 1 µl of RNase inhibitor (10 U µl⁻¹; Amersham Pharmacia Biotech), 1 µl of 100 mM DTT, 1 µl of reverse transcriptase (Expand RT, Roche Diagnostics) and 2 µl of oligo(dT) (100 ng µl⁻¹) and the mixture was then incubated for 90 min at 37 °C. The cDNA was PCR-amplified using specific oligonucleotides designed from known internal sequences (Célix et al., 1996; Livieratos & Coutts, 2002; E. Rodríguez-Cerezo & M. A. Aranda, unpublished results) and the Expand High Fidelity kit (Roche Diagnostics) following the manufacturer's instructions.
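As a quick arithmetic check on the two reactions described above, the following minimal sketch sums the listed volumes and confirms that each 5× buffer ends up at 1× in the final mix. The dictionary keys and the helper names (`total_volume`, `final_concentration`) are illustrative labels introduced here, not part of the protocol.

```python
# Volumes in microlitres, transcribed from the protocol text above.
POLYA_TAILING = {
    "dsRNA extract": 9,
    "0.1 M methylmercuric hydroxide": 1,
    "1.4 M beta-mercaptoethanol": 1,
    "5x poly(A) polymerase buffer": 4,
    "2 mM ATP": 1,
    "RNase inhibitor": 1,
    "yeast poly(A) polymerase": 1,
    "water": 2,
}

REVERSE_TRANSCRIPTION = {
    "tailed dsRNA dissolved in water": 11,
    "0.1 M methylmercuric hydroxide": 1,
    "1.4 M beta-mercaptoethanol": 1,
    "5x reverse transcriptase buffer": 5,
    "5 mM dNTPs": 2,
    "RNase inhibitor": 1,
    "100 mM DTT": 1,
    "reverse transcriptase": 1,
    "oligo(dT)": 2,
}


def total_volume(mix):
    """Total reaction volume in microlitres."""
    return sum(mix.values())


def final_concentration(stock, volume, total):
    """Concentration after dilution, in the same unit as `stock`."""
    return stock * volume / total


tailing_total = total_volume(POLYA_TAILING)        # 20 µl
rt_total = total_volume(REVERSE_TRANSCRIPTION)     # 25 µl

print("poly(A) buffer:", final_concentration(5, 4, tailing_total), "x")
print("RT buffer:", final_concentration(5, 5, rt_total), "x")
print("ATP:", final_concentration(2, 1, tailing_total), "mM")
print("dNTPs:", final_concentration(5, 2, rt_total), "mM")
```

Under these volumes the tailing reaction totals 20 µl and the reverse transcription 25 µl, so both buffers are diluted exactly fivefold.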

To generate cDNAs internal to the 5'- and 3'-ends of CYSDV RNAs, specific oligonucleotides designed from internal sequences (Célix et al., 1996; Livieratos & Coutts, 2002; E. Rodríguez-Cerezo & M. A. Aranda, unpublished results) were used as primers in RT-PCR reactions in which total RNA from CYSDV-infected plants was used as a template. Reverse transcription and PCR were carried out essentially as described for the generation of cDNAs to the 5'- and 3'-ends.

DNA fragments obtained after PCR were fractionated by electrophoresis in 1 % agarose gels and purified. Some DNA fragments were ligated to the plasmid pCR-BluntII-TOPO (Invitrogen) or pGEM-T Easy vector System II (Promega) and cloned in E. coli. For each cDNA fragment two clones (Fig. 1) were sequenced and each nucleotide position of these clones was read at least twice. Other DNA fragments were directly sequenced. In this case, sequences from at least two independent RT-PCR reactions were obtained for each genomic region (Fig. 1).



Fig. 1. Genomic structure of CYSDV and sequencing strategy. The genomic RNA1 and RNA2 are represented by a line, with the ORFs indicated by boxes. ORFs are shown as boxes above, below or in the middle of the line, depending on the frame they are in. The start of ORF1b is drawn assuming a +1 ribosomal translational frameshifting at nt 5999 to 6002. Domains predicted from deduced amino acid sequences are indicated inside the boxes: L-Pro, putative papain-like leader proteinase, for which an arrow and a dotted line indicate the predicted autocatalytic cleavage site; MTR, methyltransferase domain; HEL, helicase domain; RdRp, RNA-dependent RNA polymerase domain; Hsp70h, Heat shock protein 70 homologue; CP, coat protein; CPm, minor coat protein. Approximate molecular mass is shown in parentheses if no function could be predicted for potential peptides coded by ORFs. Black boxes indicate cDNA clones used for sequencing. Designation of clones is indicated below each black box. Shaded boxes represent directly sequenced RT-PCR products.

 
Sequence analyses.
Computer analyses of nucleotide sequence coding capacity were performed using the FRAMES (GCG software package) and ORF FINDER (NCBI site; http://www.ncbi.nlm.nih.gov/) programs. Database searches for protein similarities were performed with the BLAST search tool at the NCBI site (http://www.ncbi.nlm.nih.gov/). Protein profiles and domains were analysed at the Prosite database of protein families and domains (http://us.expasy.org/prosite/). Transmembrane helices in proteins were predicted with the TMHMM program (http://www.cbs.dtu.dk/services/TMHMM-2.0) (Krogh et al., 2001; Möller et al., 2001). Prediction of subcellular localization was carried out using the TARGETP program (http://www.cbs.dtu.dk/services/TargetP-1.0) (Emanuelsson et al., 2000). Sequence alignments were obtained using the CLUSTALX program (Thompson et al., 1997). Percentages of sequence identity were estimated from aligned sequences as the number of identical residues shared by the two sequences, multiplied by 100 and divided by the length of the shorter sequence excluding gaps.
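The identity measure defined in the last sentence can be sketched as follows. This is a minimal illustration, assuming the two sequences are already aligned to equal length and that gaps are written as `-`; the function name `percent_identity` and the toy sequences are hypothetical and are not taken from the study.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Identical residues x 100 / ungapped length of the shorter sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # Count columns where both sequences carry the same residue (not a gap).
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )

    # Length of each sequence once alignment gaps are removed.
    shorter = min(
        len(aligned_a.replace(gap, "")),
        len(aligned_b.replace(gap, "")),
    )
    if shorter == 0:
        raise ValueError("at least one sequence contains only gaps")

    return 100.0 * identical / shorter


# Toy alignment: 6 identical columns, both sequences 7 residues long.
print(round(percent_identity("MKT-AILV", "MKTGAI-V"), 1))  # 85.7
# A short fragment fully contained in a longer sequence scores 100 %.
print(percent_identity("AC--", "ACGT"))  # 100.0
```

Normalizing by the shorter sequence means that a partial sequence is not penalized for the residues it lacks, which matters when comparing domains of different lengths.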

Northern blot analyses.
For Northern blot analysis, 10 µg of total RNA was incubated at 55 °C for 75 min in a solution containing 1 M glyoxal and 50 % DMSO in 0·5 M BES pH 6·7. Denatured samples were electrophoresed in a 0·8 % agarose gel in 0·5 M BES pH 6·7 running buffer. RNAs were transferred to positively charged nylon membranes (Roche Diagnostics) by blotting. After UV-cross-linking, membranes were hybridized in a solution containing 50 % formamide at 65 °C for at least 8 h. For hybridizations, digoxigenin-11-UTP-labelled cRNA probes were used. Probes were synthesized by in vitro transcription, following standard protocols (Sambrook & Russell, 2001), from plasmids containing inserts corresponding to different genomic regions. Thus, probes ‘I’ to ‘III’ correspond to nt 1 to 950, 7539 to 8195 and 8342 to 9085 of CYSDV RNA 1, respectively. Probes ‘IV’ to ‘IX’ correspond to nt 11 to 430, 1844 to 2327, 3350 to 4161, 4842 to 5523, 5634 to 6190 and 7099 to 7957 of CYSDV RNA 2, respectively (see below). Plasmids used to prepare the probes were either those obtained for sequencing (Fig. 1) or were generated by subcloning the specific regions following standard protocols (Sambrook & Russell, 2001). After hybridization, membranes were washed for 15 min, once at room temperature in 2× SSC and twice at 65 °C in 0·1× SSC. Chemiluminescent detection was carried out using the reagents and protocols supplied with a DIG-labelling and detection kit (Roche Diagnostics).


   RESULTS AND DISCUSSION
Fragmentary sequences corresponding to CYSDV RNAs 1 and 2 (Célix et al., 1996; Livieratos & Coutts, 2002; E. Rodríguez-Cerezo & M. A. Aranda, unpublished results) were used to design CYSDV-specific primers for the generation of a set of cDNA clones covering the whole length of both RNAs. Clones corresponding to the 5'- and 3'-ends of the CYSDV genomic RNAs were produced by cloning RT-PCR products obtained using poly(A)-tailed dsRNAs as a template and oligo(dT) and specific primers. Similarly, clones internal to the viral RNA ends were obtained by cloning RT-PCR products obtained using total RNA from infected plants and specific primers. No fewer than two cDNA clones were sequenced per genomic region (Fig. 1). Nucleotide sequences corresponding to each pair of cDNA clones were identical, except for clones pLM7-4 and pLM8-2 (Fig. 1), in which thirteen nucleotide differences in coding regions were observed. Of these, two were silent whereas eleven implied an amino acid change. In addition, preliminary analysis of the nucleotide sequences deduced from these cDNA clones suggested that several unexpected features could be present in diverse regions of the CYSDV genome. To resolve heterogeneities and to confirm these features, we directly sequenced at least two independently RT-PCR-amplified products for the genomic regions indicated in Fig. 1. RT-PCR amplification products were obtained from total RNA from infected plants using specific primers. Altogether, the nucleotide sequences obtained defined the complete genome sequence of CYSDV and showed that RNAs 1 and 2 are 9123 and 7976 nt, respectively. The complete sequences of both RNAs have been submitted to the GenBank Nucleotide Sequence Database and have been assigned the accession numbers AY242077 (RNA 1) and AY242078 (RNA2).

CYSDV RNA1 – identification of a new ORF
An analysis of the coding capacity of RNA1 showed that it contains at least five ORFs, which we have designated ORF1a, ORF1b, ORF2, ORF3 and ORF4. These ORFs are flanked by two putatively untranslated regions (UTRs) (Fig. 1). The 5'-UTR of CYSDV RNA1 is 94 nt and the 3'-UTR is 221 nt.

The first AUG translation initiation codon is at position 95 downstream of the 5'-end and, consequently, ORF1a extends from nt 95 to 6028 and potentially encodes a 1977 amino acid protein with a calculated molecular mass of 226·7 kDa. In this putative protein two motifs conserved among proteins associated with virus replication could be identified. A methyltransferase domain (MTR) (Rozanov et al., 1992) is located between amino acids 514 and 871 and this sequence has high similarity to the corresponding domain of other criniviruses (Table 1). In the carboxy-terminal portion (between amino acids 1678 and 1941) there is a helicase domain (HEL) (Gorbalenya & Koonin, 1993); again, the homologous domains of other criniviruses have high similarity to this sequence (Table 1). The region between the MTR and HEL motifs does not show significant similarity to any protein sequence available in databases. Interestingly, in this region, the TMHMM program (Krogh et al., 2001; Möller et al., 2001) predicts two transmembrane domains at amino acid positions 1238 to 1260 and 1352 to 1374, suggesting that this protein may be localized to membranes, analogous to the replicase of the closterovirus Beet yellows virus (Erokhina et al., 2000, 2001). An alignment of the amino acid sequences upstream of MTR in SPCSV, CuYV and LIYV with that of the similar region of CYSDV showed the existence of a putative papain-like cysteine proteinase domain (L-Pro), similar to that which has been reported for other viruses of the family Closteroviridae (Peng et al., 2001). The conserved cysteine and histidine residues that are predicted to participate directly in catalysis (Peng et al., 2001) are Cys-393 and His-442, and the presumed cleavage site is between Gly-461 and Val-462. Thus, autoproteolysis would yield a putative leader proteinase of 52·3 kDa.


Table 1. Percentage nucleotide identity and amino acid identity and similarity (in parentheses) between CYSDV-AlLM and SPCSV, CuYV and LIYV in RNA1

 
The protein encoded by ORF1b is closely related to the corresponding ORF1b products of CuYV, SPCSV and LIYV (Table 1) and contains the typical conserved motifs of RNA-dependent RNA polymerases (RdRp) (Koonin & Dolja, 1993). Several AUG codons could potentially initiate translation of this ORF. However, none of them is in an optimal context for translation initiation (Kozak, 1991). In fact, it has been proposed that the closteroviral ORFs homologous to CYSDV ORF1b are expressed by a +1 ribosomal frameshift during translation (Agranovsky et al., 1994; Karasev et al., 1995; Klaassen et al., 1995) and the CYSDV ORF1b could be expressed using this same mechanism as it is in the +1 reading frame with respect to ORF1a. Frameshifting in CYSDV RNA1 would yield a fusion protein of 286 kDa. However, the analysis of the nucleotide sequence in the overlapping region of ORFs 1a and 1b from CYSDV did not reveal any ‘slippery’ heptanucleotide nor any particularly significant secondary structure which might be indicative of frameshifting (Agranovsky et al., 1994; Karasev et al., 1995; Klaassen et al., 1995). Nevertheless, a comparison of the nucleotide and amino acid sequences in the region upstream of the putative frameshifting region in CYSDV and other closteroviruses showed the existence of significant homology between them (data not shown); moreover, as has been pointed out previously (Kreuze et al., 2002), this region shares significant similarity with the well characterized +1 frameshifting site in the prfB gene of E. coli (data not shown; Farabaugh, 1996).

The organization of the ORFs downstream of ORFs 1a/1b seems to be unique to CYSDV among criniviruses. ORF2 (nt 7554 to 7700) encodes a putative small protein (p5) of 48 amino acids with a predicted molecular mass of 5·2 kDa. In this protein, the TARGETP program (Emanuelsson et al., 2000) predicts the existence of an amino-terminal signal peptide which might be responsible for targeting the protein to the ER. Interestingly, there is a putative N-myristoylation site near the signal peptide cleavage site predicted by TARGETP. Thus, after processing, p5 might remain attached to the plasma membrane anchored by a myristoyl residue (Cross, 1990). ORF3 (nt 7704 to 8342) encodes a putative protein (p25) of 212 amino acids, with a predicted molecular mass of 25·1 kDa. No sequence homology has been found between p25 and any other protein in the databases nor has any conserved motif been identified in its sequence. The putative protein encoded by ORF4 (p22; nt 8324 to 8902) is 192 amino acids long and has a predicted molecular mass of 22·4 kDa. p22 is similar in size to the proteins encoded by the most 3'-terminal genes of the RNA1 of SPCSV and LIYV, and of the BYV genome. All these proteins share some sequence similarity (Table 1; results not shown), which suggests that they could have some functional equivalence.
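The RNA1 coordinates quoted in this section can be cross-checked against the reported protein and UTR lengths with a few lines of arithmetic. The sketch below assumes that each quoted end coordinate is the last nucleotide of the stop codon, an assumption that is not stated in the text but that reproduces every reported length; the names `RNA1_ORFS` and `protein_length` are introduced here for illustration only.

```python
RNA1_LENGTH = 9123  # nt, as reported for CYSDV-AlLM RNA1

# (first nt, last nt) of each ORF, 1-based and inclusive, as quoted in the text.
# ORF1b is omitted because its coordinates are not given.
RNA1_ORFS = {
    "ORF1a": (95, 6028),
    "ORF2 (p5)": (7554, 7700),
    "ORF3 (p25)": (7704, 8342),
    "ORF4 (p22)": (8324, 8902),
}


def protein_length(start: int, end: int) -> int:
    """Amino acids encoded, assuming `end` is the last nt of the stop codon."""
    nt = end - start + 1
    if nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return nt // 3 - 1  # subtract the stop codon


for name, (start, end) in RNA1_ORFS.items():
    print(f"{name}: {protein_length(start, end)} aa")

# UTRs are whatever lies outside the first and last ORF.
five_prime_utr = min(start for start, _ in RNA1_ORFS.values()) - 1
three_prime_utr = RNA1_LENGTH - max(end for _, end in RNA1_ORFS.values())
print(f"5'-UTR: {five_prime_utr} nt, 3'-UTR: {three_prime_utr} nt")
```

With these inputs the calculation returns 1977, 48, 212 and 192 amino acids and UTRs of 94 and 221 nt, matching the values given in the text.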

CYSDV RNA 2 – prediction of an unusually long 5'-UTR
RNA2 is 7976 nt. This is 695 nt longer than the sequence determined by Livieratos & Coutts (2002) for the same RNA of another isolate of CYSDV. An alignment of both sequences showed that of these 695 nt, 691 nt are in the 5'-terminal region (Fig. 2), and that the overall nucleotide similarity between both sequences in the shared region is very high (99 % nt identity). In order to confirm the presence of these extra nucleotides in the 5'-end of the genomic RNA2 of CYSDV-AlLM (the CYSDV isolate characterized in this study; see Methods), total nucleic acid extracts of CYSDV-AlLM-infected plants were analysed in Northern blots using a probe complementary to nt 11 to 430 of the CYSDV RNA2 sequence described here. Consistently, this probe detected an RNA species of around 8 kb which likely corresponds to the genomic RNA2 (Fig. 3c). In addition, the six 5'-terminal nucleotides (5'-GAAATA-3') of CYSDV-AlLM RNA2 are identical to the six 5'-terminal nucleotides of RNA1, similar to what has been described for SPCSV (Kreuze et al., 2002) and LIYV (Klaassen et al., 1995). Based on these results, we concluded that these extra 691 nucleotides are genuinely part of the RNA2 5'-terminal region of CYSDV-AlLM. Differences between this and the sequence determined by Livieratos & Coutts (2002) could be the result of the natural variability of CYSDV isolates. Thus, we decided to analyse the RNA2 5'-terminal regions of four CYSDV field isolates from Almería (Spain), Málaga (Spain) and Texas (USA) by preparing and sequencing RT-PCR products for this genomic region. The results indicated that all four CYSDV isolates have very similar nucleotide sequences to the sequence of CYSDV-AlLM (Fig. 2).



Fig. 2. Comparison of the RNA2 5'-terminal nucleotide sequence of CYSDV-AlLM (AlLM) with those of other CYSDV isolates. Isolates included are CYSDV-AM2 (AM2), obtained from a naturally infected melon plant collected in Almería (Spain) in 1999, CYSDV-AM14 and CYSDV-AM16 (AM14 and AM16), obtained from naturally infected cucumber plants collected in Málaga (Spain) in 1999, CYSDV-Tex (Tex), obtained from a naturally infected melon plant collected in Texas (USA) in 2002, and the isolate (L-C) sequenced by Livieratos & Coutts (2002). The first 10 nt of isolates CYSDV-AM2, -AM16, -AM14 and -Tex (indicated by dots) were not determined. Asterisks indicate nucleotide identity and dashes indicate absence of nucleotide sequence for the CYSDV-L-C isolate. Conserved nucleotides are highlighted by boxes. Identical nucleotides in all six isolates are highlighted in grey. The first ATG codon is shown in italics.

 


Fig. 3. Detection of CYSDV sgRNAs. (a) Probes used in Northern analyses were synthesized by in vitro transcription from plasmids that contained inserts corresponding to the genomic regions, indicated by black boxes below the diagrams representing the genome of CYSDV. Probes ‘I’ to ‘III’ correspond to nt 1 to 950, 7539 to 8195 and 8342 to 9085 of CYSDV RNA 1, respectively. Probes ‘IV’ to ‘IX’ correspond to nt 11 to 430, 1844 to 2327, 3350 to 4161, 4842 to 5523, 5634 to 6190 and 7099 to 7957 of CYSDV RNA 2, respectively. The sgRNAs predicted to be detected are indicated below the diagrams representing the genome of CYSDV. (b) Northern blot hybridization analysis of total ssRNAs from healthy (lane 1) and CYSDV-infected (lanes 2 and 3) melon plants using probes ‘I’ to ‘III’ (RNA1). (c) Northern blot hybridization analysis of total ssRNAs from healthy (lane 1) and CYSDV-infected (lanes 2 and 3) melon plants using probes ‘IV’ to ‘IX’ (RNA2). Positions and sizes of denatured marker DNAs in kb are shown on the left of the panels. Subgenomic RNAs are indicated on the right of the panels. The sgRNAs are designated by their estimated sizes. Higher contrast has been given to the image corresponding to probe ‘IX’ to show sgRNA0.9 more clearly.

 
Despite the increased size of the 5'-terminal region of CYSDV RNA2, no additional ORFs were predicted in this RNA using the FRAMES and ORF FINDER programs. The longest nucleotide sequences without a stop codon in the first 1177 nt of RNA2 are 144–168 nt; however, no AUG initiation codon was found for them. It has been reported that translation initiation could occur at non-AUG codons for both cellular and viral gene expression (Kozak, 1989; Shirako, 1998), but none of the possible alternative translation initiation codons was found to be in a favourable context (Kozak, 1989, 1991); moreover, the putative translation products that would be generated did not show significant homology with any other proteins in the databases or have any conserved motif. Therefore, our data suggest that the first 1177 nt of CYSDV RNA2 are a non-coding genomic region. If this is the case, this would be the longest 5'-UTR described among plant ssRNA viruses, as far as we know. Reported functions of 5'-UTRs include control of the expression of downstream ORFs and control of RNA replication (reviewed in Duggal et al., 1994). Interestingly, other closteroviruses have long, apparently non-coding, intercistronic regions upstream of p5/Hsp70h (Karasev, 2000), suggesting that these genes may require particularly complex control regions to regulate their expression.
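The kind of scan described above, looking for the longest stop-free stretches in each reading frame and asking whether they contain an AUG, can be sketched as follows. This is a simplified stand-in and not the FRAMES or ORF FINDER implementation; the function name `open_stretches` and the toy sequence are hypothetical, and only the three forward frames are considered.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def open_stretches(seq: str, frame: int):
    """Yield (start, end, has_aug) for each stop-free run of codons.

    Coordinates are 0-based with `end` exclusive; `frame` is 0, 1 or 2.
    """
    seq = seq.upper().replace("U", "T")
    start = frame
    has_aug = False
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            if i > start:
                yield (start, i, has_aug)
            start = i + 3          # next run begins after the stop codon
            has_aug = False
        elif codon == "ATG":
            has_aug = True
    # Close the final run at the last complete codon of this frame.
    last = frame + 3 * ((len(seq) - frame) // 3)
    if last > start:
        yield (start, last, has_aug)


# Toy RNA, frame 0: GCU GCA UAA GGG CCC AAA UAG AUG GCC
toy = "GCUGCAUAAGGGCCCAAAUAGAUGGCC"
runs = list(open_stretches(toy, 0))
print(runs)  # [(0, 6, False), (9, 18, False), (21, 27, True)]

longest = max(runs, key=lambda r: r[1] - r[0])
print(longest)  # (9, 18, False): the longest run here contains no AUG
```

In the toy example the longest open stretch lacks an initiation codon, which is the same situation reported for the 144 to 168 nt stretches in the first 1177 nt of RNA2.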

Further analyses of the CYSDV RNA2 sequence showed that it contained at least eight ORFs, potentially encoding proteins p5 (ORF5; nt 1178 to 1303), Hsp70h (ORF6; nt 1234 to 2895), p6 (ORF7; nt 2902 to 3066), p59 (ORF8; nt 3060 to 4616), p9 (ORF9; nt 4595 to 4834), CP (ORF10; nt 4927 to 5682), CPm (ORF11; nt 5682 to 7061) and p26 (ORF12; nt 7066 to 7752). Therefore, the predicted genomic structure of CYSDV RNA2 is essentially similar to that proposed by Livieratos & Coutts (2002). However, an additional ORF, ORF7, was predicted from our data (Fig. 1, Table 2). This ORF was in-frame with ORF6 and extended from nt 2902 to 3066. Its putative AUG initiation codon, which is not in an optimal context for translation initiation (Kozak, 1991), is only six nucleotides downstream of the ORF6 termination codon. If this AUG codon is in fact functional, then ORF7 would encode the p6 protein, which would have 54 amino acids and a predicted molecular mass of 6·5 kDa. Nevertheless, no sequence homology has been found between p6 and any other protein in the databases, nor has any conserved motif been identified in its sequence. Further studies are needed to analyse the biological significance of ORF7. The remaining proteins potentially encoded by CYSDV RNA2 according to our analyses are very similar (Table 2) to those proposed by Livieratos & Coutts (2002) and their characteristics have been discussed by those authors.
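The statements that ORF7 is in frame with ORF6 and starts six nucleotides after its termination codon follow directly from the coordinates listed above. The sketch below works this out for all consecutive RNA2 ORFs; `reading_frame` and `spacing` are helper names introduced here, and the end coordinates are again assumed to include the stop codon.

```python
# (label, first nt, last nt), 1-based inclusive, as listed in the text.
RNA2_ORFS = [
    ("ORF5 (p5)", 1178, 1303),
    ("ORF6 (Hsp70h)", 1234, 2895),
    ("ORF7 (p6)", 2902, 3066),
    ("ORF8 (p59)", 3060, 4616),
    ("ORF9 (p9)", 4595, 4834),
    ("ORF10 (CP)", 4927, 5682),
    ("ORF11 (CPm)", 5682, 7061),
    ("ORF12 (p26)", 7066, 7752),
]


def reading_frame(start: int) -> int:
    """Frame (0, 1 or 2) of an ORF relative to nt 1 of the RNA."""
    return (start - 1) % 3


def spacing(prev_end: int, next_start: int) -> int:
    """Nucleotides between two ORFs; a negative value means they overlap."""
    return next_start - prev_end - 1


for (name_a, _, end_a), (name_b, start_b, _) in zip(RNA2_ORFS, RNA2_ORFS[1:]):
    gap = spacing(end_a, start_b)
    relation = f"{gap} nt apart" if gap >= 0 else f"overlapping by {-gap} nt"
    same = reading_frame(start_b) == reading_frame(
        next(s for n, s, _ in RNA2_ORFS if n == name_a)
    )
    print(f"{name_a} -> {name_b}: {relation}, same frame: {same}")
```

For the ORF6/ORF7 pair this gives a spacing of 6 nt in the same frame, as stated; ORF10 and ORF11 share a single nucleotide at position 5682.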


Table 2. Percentage nucleotide identity and amino acid identity and similarity (in parentheses) between CYSDV-AlLM and CYSDV-L-C, CuYV, SPCSV and LIYV in RNA2

 
The putative 3'-UTR of CYSDV RNA2 is 224 nt. It is very similar in size to the putative 3'-UTR of RNA1 and both 3'-UTRs share 87 % nt identity. Despite this high similarity, both 3'-UTRs are predicted to form different secondary structures (Zuker, 1989). In this respect, each sequenced crinivirus seems to represent a unique case. Thus, the corresponding 3'-UTRs of SPCSV RNA1 and RNA2 are identical over their last 208 nt and are predicted to form identical stable secondary structures (Kreuze et al., 2002), whereas the corresponding 3'-UTRs of LIYV RNA1 and RNA2 are very different (nt identity <31 %; Klaassen et al., 1995). These results suggest that different functional requirements may exist for the 3'-UTRs of these three criniviruses.

Expression of internal ORFs via subgenomic RNAs
Several mechanisms, such as production of 3' co-terminal sgRNAs, frameshifting and polyprotein processing, have been shown or proposed to be used by closteroviruses to facilitate the expression of internal ORFs (reviewed by Karasev, 2000). To identify putative CYSDV sgRNAs, we carried out Northern blot analyses on total ssRNAs from healthy and infected plants with probes for different regions of CYSDV RNAs 1 and 2 (Fig. 3).

For CYSDV RNA1, probe ‘I’, complementary to the 5'-terminal 950 nt (Fig. 3a), hybridized to an RNA of around 9 kb which most probably corresponds to CYSDV RNA1, but did not hybridize to any RNAs from healthy plants (Fig. 3b). Probe ‘II’, complementary to nt 7539 to 8195 (Fig. 3a), hybridized to CYSDV RNA1 and also to a single smaller RNA of approximately 1·6 kb (Fig. 3b). Probe ‘III’, complementary to nt 8342 to 9085 of the 3'-terminal region (Fig. 3a), hybridized to the same RNAs as probe ‘II’ and, in addition, to a single smaller RNA of around 0·8 kb (Fig. 3b). We speculate that the 0·8 kb sgRNA detected by probe ‘III’ may correspond to a subgenomic mRNA for ORF4 and that the sgRNA of around 1·6 kb detected by probes ‘II’ and ‘III’ may correspond to a subgenomic mRNA for ORF2 and/or ORF3. If the 1·6 kb sgRNA were monocistronic, p5 or p25 might not be expressed. Alternatively, they might be expressed from a different sgRNA that accumulates at undetectable levels, or expressed using a different mechanism. If the 1·6 kb sgRNA were dicistronic, then the expression of p25 might occur by ribosomal leaky scanning or internal ribosome entry, as has been shown for other viruses (reviewed in Maia et al., 1996), or by a different mechanism. The lack of detection of an sgRNA for ORF1b agrees with the hypothesis proposed previously (see above) that this ORF is expressed by a +1 ribosomal frameshift during translation.
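The reasoning that links each probe to the RNAs it should detect is an interval-overlap check: a probe hybridizes to a 3'-co-terminal RNA only if part of the probe region lies downstream of that RNA's 5' end. The sketch below is a simplified model that places each hypothetical sgRNA 5' end at the first nucleotide of the ORF it would express (real sgRNAs would start somewhat upstream, and their ends were not mapped in this study); `hybridizes` and `predicted_targets` are illustrative names.

```python
RNA1_LENGTH = 9123

# Probe coordinates on RNA1 (1-based, inclusive), from the text.
PROBES = {"I": (1, 950), "II": (7539, 8195), "III": (8342, 9085)}

# Assumed 5' ends of the RNA species; sgRNA starts are set to the ORF start,
# which is a simplification made for this example only.
TARGETS = {
    "genomic RNA1": 1,
    "sgRNA (ORF2/ORF3)": 7554,
    "sgRNA (ORF4)": 8324,
}


def hybridizes(probe, rna_start, rna_end=RNA1_LENGTH):
    """True if the probe region overlaps an RNA spanning rna_start..rna_end."""
    probe_start, probe_end = probe
    return probe_start <= rna_end and probe_end >= rna_start


def predicted_targets(probe_name):
    """RNA species expected to give a signal with the named probe."""
    return [
        name for name, start in TARGETS.items()
        if hybridizes(PROBES[probe_name], start)
    ]


for probe_name in PROBES:
    print(probe_name, "->", predicted_targets(probe_name))
```

Under this model probe I detects only the genomic RNA, probe II adds the ORF2/ORF3 sgRNA, and probe III detects all three species, which is the pattern described for Fig. 3b.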

For CYSDV RNA2, probe ‘IV’, complementary to nt 11 to 430 (Fig. 3a), hybridized to an RNA of around 8 kb (Fig. 3c), which most likely corresponds to CYSDV RNA2. Probe ‘V’, complementary to nt 1844 to 2327 (Fig. 3a), hybridized to CYSDV RNA 2 and to an RNA of around 6·8 kb (Fig. 3c). According to this size estimation, this RNA might correspond to a subgenomic mRNA for p5 and/or Hsp70h. However, it would be surprising if p5 needed to be expressed from a subgenomic mRNA, since the corresponding ORF seems to be the first ORF in CYSDV RNA2 and, therefore, it would be expected to be expressed directly from CYSDV RNA2. Probe ‘V’ also hybridized to several RNAs in healthy samples (Fig. 3c), perhaps as a consequence of the conservation of Hsp70 genes between the host and the virus. Probes ‘VI’ to ‘IX’ (Fig. 3a) identified three 3'-co-terminal sgRNAs of sizes around 5, 3 and 0·9 kb (Fig. 3c). According to these size estimations, these RNAs could serve as subgenomic mRNAs for ORFs 8 and/or 9, CP and ORF12. We expected to detect another sgRNA of around 2·3 kb which would serve for the expression of CPm. Such an sgRNA was not seen and therefore CPm could be either expressed from an sgRNA undetectable under our experimental conditions or expressed using a different mechanism. Based on the observations above, a tentative CYSDV sgRNAs map was proposed (Fig. 3a).
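The size assignments above rest on simple arithmetic: a 3'-co-terminal sgRNA that begins at the start of an ORF spans from that position to the 3' end of the genomic RNA. The following sketch computes this lower bound for each RNA2 ORF; it ignores any leader sequence upstream of the ORF, so the values are minimum sizes under that assumption, and `min_sgrna_size` is a name introduced here.

```python
RNA2_LENGTH = 7976  # nt, as reported for CYSDV-AlLM RNA2

# First nucleotide of each ORF on RNA2 (1-based), from the text.
ORF_STARTS = {
    "ORF5 (p5)": 1178,
    "ORF6 (Hsp70h)": 1234,
    "ORF8 (p59)": 3060,
    "ORF9 (p9)": 4595,
    "ORF10 (CP)": 4927,
    "ORF11 (CPm)": 5682,
    "ORF12 (p26)": 7066,
}


def min_sgrna_size(genome_length: int, orf_start: int) -> int:
    """Length in nt of a 3'-co-terminal RNA starting exactly at the ORF start."""
    return genome_length - orf_start + 1


for name, start in ORF_STARTS.items():
    print(f"{name}: at least {min_sgrna_size(RNA2_LENGTH, start)} nt")
```

The results (6799 nt for ORF5, 4917 nt for ORF8, 3050 nt for CP, 2295 nt for CPm and 911 nt for ORF12) agree with the observed bands of around 6·8, 5, 3 and 0·9 kb and with the expected, but undetected, 2·3 kb species for CPm.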

Conclusions
We have determined the complete genomic sequence of CYSDV, the fourth crinivirus genome fully sequenced to date. This analysis has revealed new features of crinivirus genome composition. The most striking novel features of CYSDV compared to LIYV, SPCSV and CuYV are a unique gene arrangement in the 3'-terminal region of RNA1, the identification in this region of an ORF potentially encoding a protein with no homologues in the databases and the prediction of an unusually long 5' non-coding region in RNA2. Additionally, the CYSDV genome resembles those of SPCSV and CuYV in having very similar 3' regions in RNAs 1 and 2 (Hartono et al., 2003; Kreuze et al., 2002), although in the case of CYSDV similarity in primary structures did not result in predictions of equivalent secondary structures. Overall, these data reinforce the view that the genus Crinivirus contains considerable genetic variation (Karasev, 2000; Kreuze et al., 2002). Furthermore, our Northern blot analyses suggested that the generation of subgenomic mRNAs is a strategy used by CYSDV for the expression of internal ORFs. Aspects that deserve more attention are the lack of detection of a candidate sgRNA for the CPm ORF, the precise mapping of the observed sgRNAs and the corresponding analyses of protein expression from these sgRNAs. Further studies on the molecular biology of CYSDV and of criniviruses in general have the potential to provide interesting information on gene function and regulation of expression of complex positive-strand RNA genomes. In addition, these studies may have considerable practical implications due to the economic importance of diseases caused by viruses in this genus.


   ACKNOWLEDGEMENTS
 
J. M. Aguilar and M. Franco contributed equally to this work. We wish to thank E. Moriones and Y. Hernando for critical review of the manuscript and M. Victoria Martín for technical assistance. Financial support from Ministerio de Ciencia y Tecnología (Spain; grant AGL2000-1156), Fundación Séneca de la Región de Murcia (Spain) and Seminis Vegetable Seeds is gratefully acknowledged.


   REFERENCES
 
Abou-Jawdah, Y., Sobh, H., Fayad, A., Lecoq, H., Delecolle, B. & Trad-Ferre, J. (2000). Cucurbit yellow stunting disorder virus – a new threat to cucurbits in Lebanon. J Plant Pathol 82, 55–60.

Agranovsky, A. A., Koonin, E. V., Boyko, V. P., Maiss, E., Frötschl, R., Lunina, N. A. & Atabekov, J. G. (1994). Beet yellows closterovirus: complete genome structure and identification of a leader papain-like thiol protease. Virology 198, 311–324.

Célix, A., López-Sesé, A., Almarza, N., Gómez-Guillamón, L. & Rodríguez-Cerezo, E. (1996). Characterization of Cucurbit yellow stunting disorder virus, a Bemisia tabaci-transmitted closterovirus. Phytopathology 86, 1370–1376.

Cross, G. A. (1990). Glycolipid anchoring of plasma membrane proteins. Annu Rev Cell Biol 6, 1–39.

Desbiez, C., Lecoq, H., Aboulama, S. & Peterschmitt, M. (2000). First report of Cucurbit yellow stunting disorder virus in Morocco. Plant Dis 84, 596.

Duggal, R., Lahser, F. C. & Hall, T. C. (1994). Cis-acting sequences in the replication of plant viruses with plus-sense RNA genomes. Annu Rev Phytopathol 32, 287–309.

Emanuelsson, O., Nielsen, H., Brunak, S. & von Heijne, G. (2000). Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J Mol Biol 300, 1005–1016.

Erokhina, T. N., Zinovkin, R. A., Vitushkina, M. V., Jelkmann, W. & Agranovsky, A. A. (2000). Detection of beet yellows closterovirus methyltransferase-like and helicase-like proteins in vivo using monoclonal antibodies. J Gen Virol 81, 597–603.

Erokhina, T. N., Vitushkina, M. V., Zinovkin, R. A., Lesemann, D. E., Jelkmann, W., Koonin, E. V. & Agranovsky, A. A. (2001). Ultrastructural localization and epitope mapping of the methyltransferase-like and helicase-like proteins of Beet yellows virus. J Gen Virol 82, 1983–1994.

Farabaugh, P. J. (1996). Programmed translational frameshifting. Microbiol Rev 60, 103–134.

Gorbalenya, A. E. & Koonin, E. V. (1993). Helicases: amino acid sequence comparisons and structure–function relationship. Curr Opin Struct Biol 3, 419–429.

Guirao, P., Beitia, F. & Cenis, J. L. (1997). Biotype determination of Spanish populations of Bemisia tabaci (Homoptera: Aleyrodidae). Bull Entomol Res 87, 587–593.

Hartono, S., Natsuaki, T., Genda, Y. & Okuda, S. (2003). Nucleotide sequence and genome organization of Cucumber yellows virus, a member of the genus Crinivirus. J Gen Virol 84, 1007–1012.

Hassan, A. A. & Duffus, J. E. (1991). A review of a yellowing and stunting disorder of cucurbits in the United Arab Emirates. Emir J Agric Sci 2, 1–16.

Kao, J., Jia, L., Tian, T., Rubio, L. & Falk, B. W. (2000). First report of Cucurbit yellow stunting disorder virus (Genus Crinivirus) in North America. Plant Dis 84, 101.

Karasev, A. V. (2000). Genetic diversity and evolution of closteroviruses. Annu Rev Phytopathol 38, 293–324.

Karasev, A. V., Boyko, V. P., Gowda, S. & 10 other authors (1995). Complete sequence of the Citrus tristeza virus RNA genome. Virology 208, 511–520.

Klaassen, V. A., Boeshore, M. L., Koonin, E. V., Tian, T. & Falk, B. W. (1995). Genome structure and phylogenetic analysis of Lettuce infectious yellows virus, a whitefly-transmitted, bipartite closterovirus. Virology 208, 99–110.

Koonin, E. V. & Dolja, V. V. (1993). Evolution and taxonomy of positive-strand RNA viruses: implications of comparative analysis of amino acid sequences. Crit Rev Biochem Mol Biol 28, 375–430.

Kozak, M. (1989). Context effects and inefficient initiation at non-AUG codons in eukaryotic cell-free translation systems. Mol Cell Biol 9, 5073–5080.

Kozak, M. (1991). Structural features in eukaryotic mRNAs that modulate the initiation of translation. J Biol Chem 266, 19867–19870.

Kreuze, J. F., Savenkov, E. I. & Valkonen, J. P. T. (2002). Complete genome sequence and analyses of the subgenomic RNAs of Sweet potato chlorotic stunt virus reveal several new features for the genus Crinivirus. J Virol 76, 9260–9270.

Krogh, A., Larsson, B., von Heijne, G. & Sonnhammer, E. L. L. (2001). Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol 305, 567–580.

Liu, H.-Y., Wisler, G. C. & Duffus, J. E. (2000). Particle lengths of whitefly-transmitted criniviruses. Plant Dis 84, 803–805.

Livieratos, I. C. & Coutts, R. H. A. (2002). Nucleotide sequence and phylogenetic analysis of Cucurbit yellow stunting disorder virus RNA2. Virus Genes 24, 225–230.

Louro, D., Vaira, A. M., Accotto, G. P. & Nolasco, G. (2000). Cucurbit yellow stunting disorder virus (genus Crinivirus) associated with the yellowing disease of cucurbit crops in Portugal. Plant Dis 84, 1156.

Maia, I. G., Seron, K., Haenni, A. L. & Bernardi, F. (1996). Gene expression from viral RNA genomes. Plant Mol Biol 32, 367–391.

Marco, C. F., Aguilar, J. M., Abad, J., Gómez-Guillamón, M. L. & Aranda, M. A. (2003). Melon resistance to Cucurbit yellow stunting disorder virus is characterized by reduced virus accumulation. Phytopathology 93, 844–852.

Martelli, G. P., Agranovsky, A. A., Bar-Joseph, M. & 15 other authors (2000). Family Closteroviridae. In Virus Taxonomy. Seventh Report of the International Committee on Taxonomy of Viruses. pp. 943–952. Edited by M. H. V. Van Regenmortel, C. M. Fauquet, D. H. L. Bishop, E. B. Carstens, M. K. Estes, S. M. Lemon, J. Maniloff, M. A. Mayo, D. J. McGeoch, C. R. Pringle and R. B. Wickner. San Diego: Academic Press.

Martelli, G. P., Agranovsky, A. A., Bar-Joseph, M. & 13 other authors (2002). The family Closteroviridae revised. Arch Virol 147, 2039–2045.

Möller, S., Croning, M. D. R. & Apweiler, R. (2001). Evaluation of methods for the prediction of membrane spanning regions. Bioinformatics 17, 646–653.

Peng, C.-W., Peremyslov, V. V., Mushegian, A. R., Dawson, W. O. & Dolja, V. V. (2001). Functional specialization and evolution of leader proteinases in the family Closteroviridae. J Virol 75, 12153–12160.

Rozanov, M. N., Koonin, E. V. & Gorbalenya, A. E. (1992). Conservation of the putative methyltransferase domain: a hallmark of the ‘Sindbis-like’ supergroup of positive-strand RNA viruses. J Gen Virol 73, 2129–2134.

Rubio, L., Soong, J., Kao, J. & Falk, B. W. (1999). Geographic distribution and molecular variation of isolates of three whitefly-borne closteroviruses of cucurbits: Lettuce infectious yellows virus, Cucurbit yellow stunting disorder virus, and Beet pseudo-yellows virus. Phytopathology 89, 707–711.

Rubio, L., Abou-Jawdah, Y., Lin, H.-X. & Falk, B. W. (2001). Geographically distant isolates of the crinivirus Cucurbit yellow stunting disorder virus show very low genetic diversity in the coat protein gene. J Gen Virol 82, 929–933.

Sambrook, J. & Russell, D. W. (2001). Molecular cloning – a Laboratory Manual, 3rd edn. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory.

Shirako, Y. (1998). Non-AUG translation initiation in a plant RNA virus: a forty-amino-acid extension is added to the N terminus of the Soil-borne wheat mosaic virus capsid protein. J Virol 72, 1677–1682.

Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F. & Higgins, D. G. (1997). The CLUSTALX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25, 4876–4882.

Valverde, R. A., Nameth, S. T. & Jordan, R. L. (1990). Analysis of double-stranded RNA for plant virus diagnosis. Plant Dis 74, 255–258.

Wisler, G. C., Duffus, J. E., Liu, H.-Y. & Li, R. H. (1998). Ecology and epidemiology of whitefly-transmitted Closteroviruses. Plant Dis 82, 270–280.

Zuker, M. (1989). On finding all suboptimal foldings of an RNA molecule. Science 244, 48–52.

Received 7 March 2003; accepted 30 April 2003.