The complete sequence of hepatitis E virus genotype 4 reveals an alternative strategy for translation of open reading frames 2 and 3

Youchun Wang (1,2), Huayuan Zhang (2), Roger Ling (1), Hemin Li (2) and Tim J. Harrison (1)

(1) Department of Medicine, Royal Free and University College Medical School, University College London, Royal Free Campus, Rowland Hill Street, London NW3 2PF, UK
(2) Department of Hepatitis, National Institute for the Control of Pharmaceuticals and Biological Products, Temple of Heaven, Beijing, PR China

Author for correspondence: Tim Harrison. Fax +44 20 7433 2852. e-mail t.harrison@rfc.ucl.ac.uk


   Abstract
Isolates of hepatitis E virus (HEV) have recently been described from China that are distinct from Burmese, Mexican and US viruses and constitute a novel genotype (genotype 4). Here, the complete genomic sequence of a representative isolate of genotype 4 HEV, amplified directly from the stool of an acutely infected patient, is presented. Analysis of the entire sequence confirms our previous conclusion, based upon partial sequence data, that these Chinese isolates belong to a novel genotype. Typical of genetic variation in HEV, most nucleotide substitutions occur in the third base of the codon and do not affect the amino acid sequence. The genotype 4 virus is unusual in that a single nucleotide insertion in the ORF 3 region changes the initiation of ORF 3, and perhaps also ORF 2. The consequences of these changes are discussed.


   Introduction
Hepatitis E virus (HEV) is the major cause of enterically transmitted non-A, non-B hepatitis and is responsible for significant morbidity and mortality, particularly in developing countries (Bradley, 1992; Harrison, 1999). The geographical distribution of the virus was thought originally to be restricted to countries with poor sanitation and the rare cases of hepatitis E in the West were attributed to infections acquired through travel. However, isolates of HEV from North America and Europe have been reported recently (Schlauder et al., 1998, 1999) and the virus may be endemic worldwide. It is possible that domestic animals, including pigs (Clayson et al., 1995; Meng et al., 1998), constitute an important reservoir.

The first complete genomic sequence was derived from virus implicated in an epidemic of hepatitis E in Burma (Reyes et al., 1990; Tam et al., 1991). The genome is a positive-sense, polyadenylated RNA molecule of around 7500 nt. The largest open reading frame (ORF 1), located at the 5' end of the genome, is believed to encode a non-structural polyprotein which contains motifs recognizable as consensus elements of methyl transferase, protease, helicase and RNA-dependent RNA polymerase activities (Koonin et al., 1992). ORF 2 is located at the 3' end of the genome and is believed to encode the major capsid protein. Unusually for a non-enveloped virus, the predicted polypeptide has a signal sequence at the amino terminus and sites for N-linked glycosylation. The short ORF 3 overlaps the other two and encodes a polypeptide of uncertain function.

Complete and partial nucleotide sequences have been determined for many HEV isolates. Complete viral sequences from Pakistan (Tsarev et al., 1992), China (Bi et al., 1993; Yin et al., 1994) and India (Donati et al., 1997; Panda et al., 1995) and partial sequences of isolates from Africa and the Asian republics of the former Soviet Union (Chatterjee et al., 1997) have high identity (>90% nucleotide identity) to the Burmese prototype. In contrast, the sequence of a virus implicated as the cause of an epidemic of hepatitis E in Mexico (Huang et al., 1992) shares less than 77% identity with Burmese-like viruses.

The concept of ‘Old World’ and ‘New World’ hepatitis E viruses was eclipsed by the discovery of a third genotype infecting pigs (Meng et al., 1997), and causing sporadic cases of acute hepatitis in humans (Kwo et al., 1997), in the United States. Complete sequences of the US genotype have been reported (Schlauder et al., 1998) and, although these are distinct from the Burmese-like group and the single isolate from Mexico, they share all of the characteristic features of the HEV genome. We reported recently that some isolates of HEV from China are distinct from the Burmese-like (genotype 1) viruses known to be endemic in that country and constitute a fourth genotype (Wang et al., 1999). Recent reports of further divergent HEV sequences from Italy and Greece suggest that the virus is also present in Europe and that further genotypes may exist (Schlauder et al., 1999).

The commercially available diagnostic assays for anti-HEV antibodies are based on recombinant proteins or synthetic peptides derived from ORFs 2 and 3 of the Burmese and Mexican genotypes (Yarbough et al., 1991; Dawson et al., 1992). We failed to detect anti-HEV in sera from patients infected with genotype 4 HEV although, in some instances, the acute phase samples may have been taken prior to the development of detectable levels of antibody (Wang et al., 1999). The availability of complete ORF 2 and 3 sequences from genotype 4 HEV should enable evaluation of sequence variation in the regions critical for antibody assays and help determine whether it is necessary to modify current assays to detect antibodies to this new genotype.

Here, we report the entire nucleotide sequence of a representative of the Chinese genotype of HEV. Typical of genetic variation in HEV, many of the variant nucleotides occur in the third base of the codons, so that the predicted amino acid sequences remain conserved. However, the Chinese genotype has a single base insertion which is likely to affect the translation of the ORF 2 and ORF 3 proteins. This insertion was confirmed in an independent isolate. The implications of this unique feature of the Chinese genotype of HEV are discussed.


   Methods
Clinical samples.
HEV ORF 1 cDNA was amplified from the stool sample (designated T1) collected 3 days after the onset of jaundice in a 33-year-old male with acute hepatitis. The partial sequence (nt 102–387) was compared to the various genotypes of HEV and was found to be 85% identical to S15 (genotype 4; Wang et al., 1999 ) but only 74–77%, 79% and 79–80% identical to Burmese-like, Mexican and US isolates (genotypes 1–3, respectively). Thus, HEV-T1 was considered to be a suitable source of viral RNA for determination of the complete sequence of a representative isolate of HEV genotype 4.

As described below, isolate HEV-T1 was found to have a single base insertion affecting the translational strategy of ORFs 2 and 3. In order to determine whether this change is present in other genotype 4 viruses, viral cDNA was amplified from a second stool sample and identified as genotype 4 using the above criteria. This sample, designated T11, was collected 2 days after the onset of jaundice in a 39-year-old male with acute hepatitis. Both cases of hepatitis were sporadic (community acquired) and were not associated with an epidemic.

Extraction of RNA.
One tenth volume of 10x PBS was added to the stool samples and the suspensions were mixed thoroughly and clarified by centrifugation at 3000 r.p.m. at room temperature for 10 min. The supernatants were stored at -20 °C for RNA extraction. HEV RNA was extracted from 560 µl stool suspension using a QIAamp Viral RNA kit (QIAGEN) and the manufacturer’s protocol for large sample volumes.

PCR amplification.
Primers for the amplification of the Chinese genotype of HEV, and for 5' and 3' RACE, are listed in Table 1. Some primers were based on sequences conserved between the Burmese and Mexican genotypes; others were designed on the basis of sequence information from the T1 isolate. cDNA was synthesized from 9 µl purified RNA using AMV or SuperScript RT and the outer, antisense PCR primer. First-round PCRs were carried out in a 50 µl reaction using 10 µl cDNA, 25 pmol of each outer primer, 5 µl 10x PCR buffer, 2 mM MgCl2 and 5 U Taq polymerase. Second-round reactions were carried out in 50 µl volumes with 5 µl first-round product, 25 pmol of each inner primer and 5 U Taq polymerase.


Table 1. PCR and RACE primers

 
The ORF 3 region of HEV has a GC content of more than 85% and cannot be amplified using standard PCR methods. First-round PCR was carried out in a 50 µl reaction using 10 µl cDNA, 25 pmol of primers BO13 and EO13 (Table 1), 5 µl 10x PCR buffer, 2 mM MgCl2, 10 mM GC Melt (Clontech), 5 µl DMSO and 5 U Taq polymerase. Second-round amplifications were carried out in 50 µl reactions with 5 µl first-round product, 25 pmol of primers BI13 and EI13, 5 µl 10x PCR buffer, 2 mM MgCl2, 10 mM GC Melt, 5 µl DMSO and 5 U Taq polymerase. PCR products were resolved on 2% agarose gels.
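
A GC-rich stretch such as the one described above can be located computationally before primers are designed. The following is a minimal Python sketch of a sliding-window GC calculation. The function names, the 20 nt window and the example sequence are assumptions made for illustration; they are not taken from the HEV-T1 sequence or from the procedure used in this study.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def gc_rich_windows(seq: str, window: int = 20, threshold: float = 0.85):
    """Return (start, gc) for every window whose GC fraction exceeds threshold.

    Start positions are 1-based, matching the nucleotide numbering in the text.
    """
    hits = []
    for i in range(len(seq) - window + 1):
        gc = gc_fraction(seq[i:i + window])
        if gc > threshold:
            hits.append((i + 1, gc))
    return hits


# Hypothetical sequence (not HEV): AT-rich flanks around a 20 nt GC-only core.
example = "ATATATATAT" + "GCGCGGCCGCGGGCCCGCGC" + "TATATATATA"
for start, gc in gc_rich_windows(example):
    print(f"window starting at nt {start}: {gc:.0%} GC")
```

Windows flagged in this way indicate where additives such as GC Melt or DMSO, as used above, might be needed; the 85% threshold simply mirrors the figure quoted in the text.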

RACE.
To amplify the 5' end, first-strand cDNA was synthesized using primer E1EO (Table 1). First-strand cDNA (20 µl) was treated with 1 µl RNase H (Promega) at 37 °C for 30 min and purified using QIAEX II (QIAGEN) according to the manufacturer’s instructions, yielding 20 µl purified cDNA. Purified cDNA (10 µl) was used in each of two homopolymer tailing reactions: 20 µM dATP or dGTP and 20 U terminal deoxynucleotidyl transferase (Promega), using the buffer supplied by the manufacturer, and incubated at 37 °C for 30 min. The tailed cDNA was purified using a QIAEX II kit (QIAGEN) and amplified in the first-round PCR with primers E1EO and R2 (dG-tailed) or E1EO and R1 (dA-tailed). Five microlitres of each first-round PCR product was amplified further in second-round, semi-nested PCR with primers E1EI and R2 or R1.

To amplify the 3' end, first-strand cDNA was synthesized with primer R2 in a 20 µl reaction. First-strand cDNA (10 µl) was used in a first-round PCR with primers E2BO and R1 and 5 µl first-round product was used in a second-round PCR (semi-nested) with primers E2BI and R1.

Cloning and sequencing of amplicons.
The products of PCR amplification and RACE were cloned into pGEM-T (Promega) or pCRII (Invitrogen). Recombinant plasmids were purified and the inserts were sequenced, either manually, using Sequenase (version 2.0; Amersham Pharmacia Biotech), or using an ABI Prism dRhodamine terminator cycle sequencing ready reaction kit (PE Applied Biosystems) and an ABI Prism 310 genetic analyser. The complete sequence of the T1 isolate has been deposited in the EMBL and GenBank nucleotide databases (accession no. AJ272108).

Phylogenetic analysis.
The nucleotide sequences were aligned using PILEUP and compared using GAP (Wisconsin Sequence Analysis Package; Genetics Computer Group, Madison, Wisconsin, version 9.0) or Clustal X from the European Bioinformatics Institute (EBI). These alignments were analysed using the DNADIST program of PHYLIP (version 3.5c; Felsenstein, 1993 ) or Clustal X to calculate the evolutionary distances between sequences.
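
The two quantities reported throughout the Results, percentage identity and evolutionary distance, can be illustrated with a short Python sketch. This is not the GAP, Clustal X or DNADIST implementation. The Jukes-Cantor correction is used here only as the simplest stand-in for a distance model, and the model actually selected in DNADIST is not stated in the text. The function names and example sequences are hypothetical.

```python
import math


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def jukes_cantor_distance(a: str, b: str) -> float:
    """Jukes-Cantor distance d = -(3/4) ln(1 - 4p/3), where p is the proportion of differing sites."""
    p = 1.0 - percent_identity(a, b) / 100.0
    if p >= 0.75:
        raise ValueError("sequences are too divergent for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical pre-aligned sequences differing at one of ten sites.
seq_a = "ATGGCCTTAA"
seq_b = "ATGGCATTAA"
print(percent_identity(seq_a, seq_b))
print(round(jukes_cantor_distance(seq_a, seq_b), 4))
```

The corrected distance is always at least as large as the raw proportion of differences, which is why a distance of 0·150 does not correspond exactly to 85% identity.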

The following full-length HEV sequences were used for analysis. Genotype 1: Burmese prototype [B1, accession no. M73218 (Tam et al., 1991 )]; Burmese isolate [B2, D10330 (Tam et al., 1991 ; Aye et al., 1993 )]; Pakistan isolate [P1, M80581 (Tsarev et al., 1992 )]; Chinese isolates [C1, L25547 (Bi et al., 1993 ), C2, M94177 (S. R. Yin and others, unpublished results) and C3, D11093 (T. Uchida and others, unpublished results)]; Indian isolates [I1, X98292 (Donati et al., 1997 ) and I2, X99441 (A. Von Brunn and others, unpublished results)]. Genotype 2: the Mexican prototype [M1, M74506 (Huang et al., 1992 )]. Genotype 3: US isolates [US1 and US2, AF060668 and AF060669, respectively (Schlauder et al., 1998 )].


   Results and Discussion
Sequence analysis
Analysis of ORF 1 sequences of the HEV-T1 isolate, using primers and amplification conditions described previously (Wang et al., 1999 ), confirmed that this isolate was genotype 4. The T1 sample, therefore, was a valuable source of viral RNA to determine the full-length sequence. A further 15 sets of overlapping primers (Table 1), including primers for 5' and 3' RACE, were designed to amplify this novel genome. Some primers were based on conserved regions of Burmese-like, Mexican and US isolates, whereas others were based specifically on the sequence determined for the T1 isolate. The sizes of the amplicons from 13 sets of primers were approximately as predicted, but primer set 10 produced a product (902 bp), approximately 80 bp shorter than expected. Cloning and sequencing of this PCR product confirmed the absence of 71 bases in a GC-rich region within ORF 3. Primer set 11 was designed, based on the T1 sequence, to amplify this region. Successful amplification required the addition of 10 mM GC Melt and 5% DMSO to the reaction and ‘touchdown’ PCR.

PCR products were cloned into pGEM-T or pCR2.1 and two or three clones were sequenced in each direction, either manually using the -40 or reverse sequencing primers and Sequenase version 2.0, or automatically using an ABI model 310 DNA sequencer and the ABI ready reaction sequencing kit. The full-length sequence was assembled with removal of overlaps between adjacent PCR products. If the identity within each overlapping region failed to reach 99%, the amplification, cloning and sequencing were repeated.
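
The assembly rule described above (join adjacent amplicons, drop the duplicated overlap and repeat the work if the overlap agrees at less than 99%) can be sketched in Python as follows. The sketch assumes that the overlap length is already known and that the overlap contains no insertions or deletions; the function names and example sequences are hypothetical.

```python
def overlap_identity(left: str, right: str, overlap: int) -> float:
    """Percent identity between the last `overlap` bases of left and the first `overlap` bases of right."""
    if overlap <= 0 or overlap > min(len(left), len(right)):
        raise ValueError("overlap must be positive and no longer than either amplicon")
    tail, head = left[-overlap:], right[:overlap]
    matches = sum(x == y for x, y in zip(tail, head))
    return 100.0 * matches / overlap


def merge_amplicons(left: str, right: str, overlap: int, min_identity: float = 99.0) -> str:
    """Join two adjacent amplicons, keeping the overlapping region only once.

    Raises ValueError when the overlap identity is below min_identity, which in
    the workflow described in the text would trigger repeat amplification,
    cloning and sequencing.
    """
    identity = overlap_identity(left, right, overlap)
    if identity < min_identity:
        raise ValueError(f"overlap identity {identity:.1f}% is below {min_identity}%")
    return left + right[overlap:]


# Hypothetical amplicons sharing a 6 nt overlap.
print(merge_amplicons("AAACCCGGG", "CCCGGGTTT", overlap=6))
```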

Genome organization of genotype 4 HEV
The HEV-T1 genome comprises 7232 nt, excluding the 3' poly(A) tail (Table 2). The 5' nontranslated region (NTR) comprises 25 bases and the 3' NTR, 68 bases. The presence of three major ORFs was confirmed. ORF 1 begins at nt 26 and ends at nt 5146 (5121 nt in length) and potentially encodes a product of 1707 aa. ORF 2 (nt 5146–7161) comprises 2016 nt and encodes 672 aa. The predicted size of ORF 2 for HEV-T1 is 14 codons longer than for other isolates in the sequence databases. ORF 1 and ORF 2 in HEV-T1 overlap by one base, whereas the ORF 2 of other isolates begins 41 bases downstream of ORF 1. This variation is caused by the insertion of a single nucleotide (U) at position nt 5159. This insertion also affects ORF 3, which starts at nt 5174 and ends at nt 5509 (336 nt), and potentially encodes a polypeptide of 112 aa. ORF 3 of HEV-T1 starts 28 bases downstream of ORF 1, whereas ORF 1 and ORF 3 in other isolates reportedly overlap by one base.
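
The coordinates quoted above can be checked with simple arithmetic. The Python sketch below recomputes ORF lengths, codon counts and reading frames from the start and end positions given in the text. Numbering the frames relative to ORF 1 (frame 1) follows the convention stated in the legend to Fig. 1. Treating the reported ORF end positions as excluding the stop codon is an assumption; under that assumption the last ORF 2 codon, a 3 nt stop codon and the 68 nt 3' NTR sum to the reported genome length.

```python
# ORF coordinates for HEV-T1 as reported in the text (1-based, inclusive).
ORFS = {
    "ORF 1": (26, 5146),
    "ORF 2": (5146, 7161),
    "ORF 3": (5174, 5509),
}


def orf_length(start: int, end: int) -> int:
    """Length in nucleotides of an ORF given inclusive 1-based coordinates."""
    return end - start + 1


def codon_count(start: int, end: int) -> int:
    """Number of codons in the ORF; the length must be a multiple of three."""
    length = orf_length(start, end)
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length // 3


def frame_relative_to_orf1(start: int, orf1_start: int = 26) -> int:
    """Reading frame (1, 2 or 3) of a start position, with ORF 1 defined as frame 1."""
    return (start - orf1_start) % 3 + 1


for name, (start, end) in ORFS.items():
    print(name, orf_length(start, end), "nt,", codon_count(start, end),
          "codons, frame", frame_relative_to_orf1(start))

# Assumption: ORF 2 coordinates exclude the stop codon (3 nt); the 3' NTR is 68 nt.
genome_length = ORFS["ORF 2"][1] + 3 + 68
print("genome length:", genome_length)
```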


Table 2. Comparison of the genomic organization of various HEV isolates

 
The consequences of the single nucleotide insertion are that ORF 2 has an additional 14 codons at the 5' end and may be translated from the initiation codon which is believed to start translation of ORF 3 in other HEV genotypes (Fig. 1) or from the initiation codon used for the ORF 2 of other genotypes of HEV. If present, the additional 14 aa would extend the signal sequence at the amino terminus of the ORF 2 protein but might not affect cleavage by the cellular signal peptidase or the size of the processed ORF 2 product. The situation regarding ORF 3 is also unclear. The 112 aa polypeptide predicted above assumes translation from the first AUG in the ORF. Alternatively, there is another in-frame AUG two codons downstream.
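
How a single inserted nucleotide can bring an upstream AUG into frame with a downstream ORF may be easier to follow with a toy example. The Python sketch below uses a short, invented RNA sequence, not the HEV-T1 sequence, in which an upstream AUG is followed almost immediately by a stop codon in its own frame. After one U is inserted, the same AUG reads through into the downstream ORF and extends it at the 5' end. All names and sequences are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def read_orf(rna: str, start: int) -> list:
    """Return the codons read from 1-based position `start` up to, but not including, the first in-frame stop codon."""
    codons = []
    for i in range(start - 1, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons


def insert_base(rna: str, position: int, base: str) -> str:
    """Insert `base` so that it occupies 1-based `position` in the returned sequence."""
    return rna[:position - 1] + base + rna[position - 1:]


# Toy RNA: an upstream AUG at nt 1 and a downstream ORF starting at nt 12 in another frame.
toy = "AUGGCUGCCUAAUGCCUAAAUGA"
print(read_orf(toy, 1))    # short upstream ORF, stopped by UAA
print(read_orf(toy, 12))   # downstream ORF read from its own AUG

# Insert a single U at nt 10; the downstream AUG moves to nt 13.
mutant = insert_base(toy, 10, "U")
print(read_orf(mutant, 1))   # upstream AUG now reads into the downstream ORF
print(read_orf(mutant, 13))  # downstream ORF from its own AUG is unchanged
```

In the toy mutant, translation from the upstream AUG adds four codons to the 5' end of the downstream ORF, which is analogous to the 14 additional codons discussed above for ORF 2 of HEV-T1.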



Fig. 1. Organization of HEV genotype 4. (a) shows the conventional translation strategy of genotypes 1–3 and (b) shows the open reading frames of genotype 4, with alternative initiation codons for ORF 2 and truncated ORF 3. ORF 1 motifs are indicated: methyl transferase (MT), cysteine protease (Pro), helicase and RNA polymerase (Replicase) activities. ORF 2 has a signal sequence at the amino terminus (black box) and potential N-linked glycosylation sites (indicated by lollipops). The 7·5 kb genomic RNA is shown beneath (b). (c) and (d) show in detail the strategy for the initiation of translation of ORFs 2 and 3, considering the ORF 1 reading frame as frame 1. (c) shows the conventional strategy of genotypes 1–3; the initiation codon for ORF 2 is underlined (sequence from isolate I1, accession no. X98292). (d) shows the equivalent region of genotype 4; the inserted U residue is marked with an asterisk (sequence from isolate T1, accession no. AJ272108). Three potential initiation codons for ORF 2 are underlined. Note that, although ORF 3 is drawn assuming translation from the first available AUG in the truncated open reading frame, the third codon is an alternative.

 
Because the insertion of a single nucleotide is crucial to the translational strategy of HEV-T1, it was confirmed by sequencing both strands of the relevant amplicon using manual and automated methodology. Nevertheless, it was possible that the sequence represented an excess of defective virus in the T1 sample, or that T1 was an unusual variant of genotype 4. We therefore wished to confirm the insertion in another sample of genotype 4 HEV. The ORF 3 region of HEV-T11 was cloned and sequenced from the stool of another patient acutely infected with genotype 4 HEV and an additional U was detected at the same position as in HEV-T1.

To date, full-length sequences for three genotypes of HEV have been deposited in the nucleotide sequence databases. The Burmese-like isolates are considered as genotype 1, the single Mexican isolate, genotype 2, and the US isolates, genotype 3 (Wang et al., 1999 ). The nucleotide identities, based on full-length sequences, are approximately 75·0% between genotypes 1 and 2, 74·0% between genotypes 1 and 3, and 74·0% between genotypes 2 and 3. HEV-T1, as a representative of genotype 4, shows 74·5–75·8% identity to the Burmese-like isolates, 74·5% to the Mexican isolate and 75·3–76·3% to the US isolates (Table 3). These comparisons support our hypothesis that T1 and related viruses constitute a novel genotype of HEV.


Table 3. Pairwise comparisons (% nucleotide identity) of full-length HEV sequences

 
Amplification of the 5' terminus of HEV-T1 using RACE showed the length of the 5' NTR to be 25 nt, which is roughly equivalent to that of most HEV sequences in the nucleotide databases (Table 2). The length of 5' NTR of HEV sequences in the databases varies, but this may reflect the experimental strategies used to define the 5' ends. Notably, the 5' NTR of HEV-US2 is the longest, at 35 nt, whereas no 5' NTR sequences were reported for the Pakistan and US1 strains. Only three bases were reported before the ORF 1 initiation codon of the Mexican strain. Alignment of 5' NTR sequences of different HEV isolates (data not shown) revealed a high degree of conservation; only two nucleotide positions varied between T1 and the Burmese-like isolates.

The length of the 3' NTR of HEV-T1, at 68 nt, is similar to that of other isolates (65–74 nt), with the exception of the Pakistan isolate (Table 2). The sequence of the 3' NTR is highly variable with some deletions, although these are not located in the same positions in the various isolates. There is more than 92% identity between Burmese-like isolates in the overlapping region. HEV-T1 has 75–79%, 77% and 61% identity to the Burmese-like isolates, the Mexican and US strains, respectively, whereas Burmese-like isolates have 70–72% and 72–76% identity to the Mexican and US strains, respectively.

Analysis of ORF 1
The predicted ORF 1 polypeptide of HEV-T1 is 1707 aa. The HEV-T1 polypeptide has 85·4–86·7%, 85% and 88·5–88·7% identity, at the amino acid level, to the Burmese-like, Mexican and US strains, respectively, whereas the ORF 1 polypeptides of Burmese-like strains have more than 95% identity to each other, 81·5–83·0% identity to the US strains and 81·8–84·2% identity to the Mexican strain (Table 4). The HEV-T1 ORF 1 polyprotein is 14–16 aa longer than those of the Burmese-like and Mexican strains, 9 aa longer than that of HEV-US1 and 1 aa shorter than that of HEV-US2. The variability in ORF 1 length is due mostly to insertions within the hypervariable region (nt 2145–2376). This region was amplified independently with two different primer sets. The overlapping sequences of the two PCR products covered the entire length of the hypervariable region and were identical.


Table 4. Pairwise comparisons (% amino acid identity, with % nucleotide identity in parentheses) of ORF 1 sequences

 
All of the motifs expected within ORF 1 were conserved in the T1 isolate: the putative methyl transferase domain is located at aa 56–240, the Y domain at aa 216–434, the papain-like protease at aa 433–592, the X domain at aa 802–960, the RNA helicase domain at aa 977–1219 and the RNA-dependent RNA polymerase domain at aa 1224–1707 (Koonin et al., 1992 ). The NTP-binding domains GVPGSGKS and DEAP in the putative helicase region and the GDD site found in all RNA-dependent RNA polymerases were conserved in the genotype 4 sequence.

Analysis of ORF 2
The HEV-T1 ORF 2 product is 672 aa in length and is the most highly conserved of the three ORF products. The HEV-T1 ORF 2 is 91·6–92·4%, 91·9–93% and 90·1% identical at the amino acid level to Burmese-like, US and Mexican strains, respectively, whereas the identity between the HEV-US and Mexican strains is approximately 90% (Table 5). Within genotypes, the identity is more than 98% between isolates of the Burmese-like genotype and 98% between US1 and US2.


Table 5. Pairwise comparisons (% amino acid identity, with % nucleotide identity in parentheses) of ORF 2 sequences

 
The amino acid sequences of 11 isolates were aligned using the Clustal X program. The amino acid sequence from aa 13 to 178 of HEV-T1 varies considerably from that of other isolates but the sequence from aa 179 to 358 is highly conserved. A hydrophobic signal peptide was identified at the extreme amino terminus of the ORF 2 protein of HEV-T1 using primary structure analysis methods and a signal peptidase cleavage site is predicted between aa 37 and 38 (Nielsen et al., 1996). This site is equivalent to that on other HEV isolates, so that the additional 14 aa that may be present at the amino terminus of the genotype 4 protein, if it is translated from the first available AUG, are predicted to be removed during processing of the mature protein. The role of membrane association and glycosylation of the ORF 2 protein is unclear; a recent report suggests that only the non-glycosylated form may be stable and involved in capsid assembly (Torresi et al., 1999). The cytosolic, non-glycosylated ORF 2 protein is potentially larger for genotype 4 isolates than for those of other genotypes.

Analysis of ORF 3
The HEV-T1 ORF 3 is 336 nt long and encodes 112 aa. Because of the single nucleotide (U) insertion at position nt 5160, the likely initiation codon of ORF 3 in HEV-T1 is 28 nt downstream of that predicted for HEV isolates described previously and the ORF 3 in HEV-T1 is shorter than in other HEV strains. The overlapping regions in ORF 3 of HEV-T1 and other isolates were aligned and the identity was calculated using the Clustal X program. The ORF 3 of HEV-T1 was 83·3–83·9%, 83·1% and 85·6–87·1% identical at the nucleotide level to Burmese-like, Mexican and HEV-US strains, respectively, whereas the Burmese-like isolates were 86·2–87% and 89·5–90·9% identical to the HEV-US and Mexican isolates within this region (Table 6).


Table 6. Pairwise comparisons (% amino acid identity, with % nucleotide identity in parentheses) of ORF 3 sequences

 
The amino acid sequence within this region is the most highly variable of the three ORFs. The Burmese-like isolates have 96·7–100% identity to each other whereas HEV-US1 is 97·6% identical to HEV-US2. HEV-T1 has 75·9–77·8%, 75% and 79·6–83·3% identity to the Burmese-like, Mexican and HEV-US strains respectively. The Burmese-like isolates are 82·2–84·4% and 85·4–88·6% identical to HEV-US and Mexican strains, respectively.

Comparison with partial ORF 1 sequences of other HEV variants
Partial sequences have been reported of two HEV isolates (G9 and G20) from Guangzhou, China (Huang et al., 1995) and of three isolates (TW4, TW7 and TW8) from Taiwan (Wu et al., 1998). Analysis of 196 nt within the RNA-dependent RNA polymerase region of ORF 1 suggests that these isolates are distinct from the Burmese-like and Mexican strains (Wu et al., 1998). These sequences are also distinct from the HEV-US1 and HEV-US2 viruses (Schlauder et al., 1998). Seventeen sequences of 196 nt, within the RNA-dependent RNA polymerase region of ORF 1, were aligned using the Clustal X program. HEV-T1 has 75–83% identity to these other sequences at the nucleotide level. The identity between TW4 and TW8 is 98%. TW7 has 87–90% identity to G9 and G20, whereas it has 78%, 84% and 84% identity to HEV-T1, TW4 and TW8, respectively. Genetic distances were also calculated with DNADIST. The genetic distance between TW4 and TW8 was 0·0153 and the distances between TW4 and TW7, G20, G9 and HEV-T1 were more than 0·150. The distances of TW7 to G9 and G20 were 0·122 and 0·0918, respectively. The distance of HEV-T1 to the five variants was also more than 0·150. In addition, four short sequences, also reported from Taiwan (Hsieh et al., 1998), show approximately 88% nucleotide identity to T1 whereas, over the same short stretch of sequence, T1 has 89% identity to S15 (genotype 4; Wang et al., 1999) and 83% or less to other HEV genotypes, including those reported recently from Europe (Schlauder et al., 1999). HEV variants isolated from China that do not fall into genotype 1 (Burmese-like) may, therefore, be divided into at least three groups. T1 and related viruses, including those reported by Hsieh et al. (1998), form the first group (genotype 4), TW4 and TW8 form the second and TW7, G9 and G20 the third, although Wu et al. (1998) consider that G9 and G20 belong to distinct groups. Thus, there seem to be at least two further genotypes of HEV in China. It is not clear whether these may be related to new genotypes identified recently in Europe (Schlauder et al., 1999).

Comparison with partial ORF 2 sequences of other HEV variants
Recently, three other variants of HEV were reported from Chinese patients with acute hepatitis: T21 (accession no. AF151963; Yang et al., 2000) and LZ-105 and HF-044 (accession nos AF103940 and AF134812; Li et al., 2000). In order to investigate whether these three isolates also belonged to genotype 4, 16 partial ORF 2 sequences were aligned using Clustal X. HEV-T1 was 85%, 85%, 87% and 88% identical to T21, T11, LZ-105 and HF-044, respectively, and 76–78%, 74% and 77–79% identical to Burmese-like, Mexican and US isolates, respectively. The genetic distances were also calculated with DNADIST: the distances of HEV-T1 to T21, HF-044, T11 and LZ-105 were 0·150, 0·147, 0·123 and 0·120, respectively (all equal to or less than 0·150), whereas the distances of HEV-T1 to Burmese-like, Mexican and US isolates were 0·227–0·250, 0·215 and 0·213–0·227, respectively (all greater than 0·150). An unrooted tree was drawn using Treeview and the variants T21, T11, LZ-105 and HF-044 clustered with HEV-T1. Taken together, these results indicate that these four variants, like HEV-T1, belong to genotype 4.
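
The grouping described above amounts to comparing each distance from HEV-T1 with a cut-off of 0·150. A minimal Python sketch of that comparison follows, using the distances quoted in the text. For the Burmese-like and US groups only the smallest reported distance is used. The cut-off is read from this paragraph and is treated here as a working value, not as a formal definition of a genotype; the function and variable names are hypothetical.

```python
# DNADIST distances from HEV-T1 over partial ORF 2, as quoted in the text.
# For groups reported as a range, the smallest value of the range is used.
DISTANCE_FROM_T1 = {
    "T21": 0.150,
    "HF-044": 0.147,
    "T11": 0.123,
    "LZ-105": 0.120,
    "Burmese-like (minimum)": 0.227,
    "Mexican": 0.215,
    "US (minimum)": 0.213,
}

CUTOFF = 0.150  # working cut-off implied by the text, not a formal standard


def same_genotype(distance: float, cutoff: float = CUTOFF) -> bool:
    """True when the distance is equal to or less than the cut-off."""
    return distance <= cutoff


genotype4 = [name for name, d in DISTANCE_FROM_T1.items() if same_genotype(d)]
others = [name for name, d in DISTANCE_FROM_T1.items() if not same_genotype(d)]
print("grouped with HEV-T1:", genotype4)
print("separate genotypes:", others)
```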

Diagnosis of HEV genotype 4 infections
The question as to whether current diagnostic assays require modification to detect genotype 4 HEV infections must be answered experimentally using relevant convalescent sera. However, some insight may be gained by comparing the genotype 4 amino acid sequences with the prototype strains. Table 7(a) shows the immunoreactive domains 4-2 (ORF 3) and 3-2 (ORF 2) identified by Yarbough et al. (1991) and which constitute the basis of commercially available antibody assays. With the exception of the amino-terminal end of 4-2, T1 shows considerable conservation, particularly with genotype 2 (recombinant proteins from genotypes 1 and 2 are included in the antibody assays). Linear, immunoreactive epitopes which have been identified in the ORF 2 protein are located predominantly in the carboxyl-terminal region. Peptide sequences identified by Kaur et al. (1992) and Khudyakov et al. (1994a , b ) are shown in Table 7(b and c), respectively. With the exception of the single epitope in the amino-terminal region of ORF 2, T1 again seems well conserved with the other genotypes. Current antibody assays may be suitable for the detection of IgG responses to genotype 4 infections but this will require confirmation using validated sera. Serological tests for HEV show poor concordance and those based on synthetic peptides, in particular, may be unreliable (Mast et al., 1998 ).


Table 7. Comparison of amino acid sequences of known epitopes in ORFs 2 and 3

 
Expression of ORFs 2 and 3
By analogy with other families of viruses with a similar coding strategy, it is assumed that the ORFs at the 3' end of the HEV genome are translated from subgenomic RNAs. However, the initial report of the detection of such species in infected liver (Reyes et al., 1990) has not been confirmed. The close proximity of the amino termini of the ORF 2 and 3 products raises the question of whether they may be translated from the same mRNA. The relative abundance of ORF 2 and 3 products may be controlled at the level of transcription, if there are separate mRNAs, or at the level of translation, if there is a single mRNA species with different efficiencies of ribosomal initiation for the two ORFs. In the latter case, the relative abundance of the ORF 2 and 3 products in genotype 4 isolates may differ from that in other genotypes.

Since the submission of this paper, Buisson et al. (2000) have reported HEV isolates from Nigeria that fall into the Mexican genotype but were unable to compare these sequences with genotype 4. Pairwise comparisons using GAP (GCG10) revealed 76·9% nucleotide identity between T1 and Nigerian isolates 1, 4, 5, 6 and 7 (accession nos AF172999, AF17300, AF17301, AF173230 and AF173231, respectively) and 76·0% between T1 and Nigerian isolate 9 (AF173232), in the ORF 2 region. These data are consistent with the assignment of T1 and these Nigerian isolates to separate genotypes.

In contrast to our usage of Arabic numerals, Buisson et al. (2000) use Roman numerals for the four genotypes. Furthermore, they label the Mexican and US genotypes III and II (our 2 and 3). Such inconsistencies require resolution and agreement with the International Committee on Taxonomy of Viruses (ICTV).


   Acknowledgments
 
Y.W. is the recipient of a research development award in tropical medicine from the Wellcome Trust. Stool samples were kindly provided by Dr Liao Shengdong of Tiantan Hospital, Beijing.


   Footnotes
 
The EMBL/GenBank accession number of the complete sequence of isolate HEV-T1 is AJ272108.


   References
Aye, T. T., Uchida, T., Ma, X., Iida, F., Shikata, T., Ichikawa, M., Rikihisa, T. & Win, K. M. (1993). Sequence and gene structure of the hepatitis E virus isolated from Myanmar. Virus Genes 7, 95-110.

Bi, S. L., Purdy, M. A., McCaustland, K. A., Margolis, H. S. & Bradley, D. W. (1993). The sequence of hepatitis E virus isolated directly from a single source during an outbreak in China. Virus Research 28, 233-247.

Bradley, D. W. (1992). Hepatitis E: epidemiology, aetiology and molecular biology. Reviews in Medical Virology 2, 19-28.

Buisson, Y., Grandadam, M., Nicand, E., Cheval, P., van Cuyck-Gandre, H., Innis, B., Rehel, P., Coursaget, P., Teyssou, R. & Tsarev, S. (2000). Identification of a novel hepatitis E virus in Nigeria. Journal of General Virology 81, 903-909.[Abstract/Free Full Text]

Chatterjee, R., Tsarev, S., Pillot, J., Coursaget, P., Emerson, S. U. & Purcell, R. H. (1997). African strains of hepatitis E virus that are distinct from Asian strains. Journal of Medical Virology 53, 139-144.

Clayson, E. T., Innis, B. L., Myint, K. S., Narupiti, S., Vaughn, D. W., Giri, S., Ranabhat, P. & Shrestha, M. P. (1995). Detection of hepatitis E virus infections among domestic swine in the Kathmandu Valley of Nepal. American Journal of Tropical Medicine and Hygiene 53, 228-232.

Dawson, G. J., Chau, K. H., Cabal, C. M., Yarbough, P. O., Reyes, G. R. & Mushahwar, I. K. (1992). Solid-phase enzyme-linked immunosorbent assay for hepatitis E virus IgG and IgM antibodies utilizing recombinant antigens and synthetic peptides. Journal of Virological Methods 38, 175-186.

Donati, M. C., Fagan, E. A. & Harrison, T. J. (1997). Sequence analysis of full length HEV clones derived directly from human liver in fulminant hepatitis E. In Viral Hepatitis and Liver Disease, pp. 313-316. Edited by M. Rizzetto, R. H. Purcell, J. L. Gerin & G. Verme. Torino: Edizioni Minerva Medica.

Felsenstein, J. (1993). PHYLIP inference package version 3.5c. Department of Genetics, University of Washington, Seattle, WA, USA.

Harrison, T. J. (1999). Hepatitis E virus – an update. Liver 19, 171-176.

Hsieh, S. Y., Yang, P. Y., Ho, Y. P., Chu, C. M. & Liaw, Y. F. (1998). Identification of a novel strain of hepatitis E virus responsible for sporadic acute hepatitis in Taiwan. Journal of Medical Virology 55, 300-304.

Huang, C. C., Nguyen, D., Fernandez, J., Yun, K. Y., Fry, K. E., Bradley, D. W., Tam, A. W. & Reyes, G. R. (1992). Molecular cloning and sequencing of the Mexico isolate of hepatitis E virus (HEV). Virology 191, 550-558.

Huang, R. T., Nakazono, N., Ishii, K., Kawamata, O., Kawaguchi, R. & Tsukada, Y. (1995). II. Existing variations on the gene structure of hepatitis E virus strains from some regions of China. Journal of Medical Virology 47, 303-308.

Kaur, M., Hyams, K. C., Purdy, M. A., Krawczynski, K., Ching, W. M., Fry, K. E., Reyes, G. R., Bradley, D. W. & Carl, M. (1992). Human linear B-cell epitopes encoded by the hepatitis E virus include determinants in the RNA-dependent RNA polymerase. Proceedings of the National Academy of Sciences, USA 89, 3855-3858.

Khudyakov, Y. E., Favorov, M. O., Jue, D. L., Hine, T. K. & Fields, H. A. (1994a). Immunodominant antigenic regions in a structural protein of the hepatitis E virus. Virology 198, 390-393.

Khudyakov, Y. E., Favorov, M. O., Khudyakova, N. S., Cong, M. E., Holloway, B. P., Padhye, N., Lambert, S. B., Jue, D. L. & Fields, H. A. (1994b). Artificial mosaic protein containing antigenic epitopes of hepatitis E virus. Journal of Virology 68, 7067-7074.

Koonin, E. V., Gorbalenya, A. E., Purdy, M. A., Rozanov, M. N., Reyes, G. R. & Bradley, D. W. (1992). Computer-assisted assignment of functional domains in the nonstructural polyprotein of hepatitis E virus – delineation of an additional group of positive-strand RNA plant and animal viruses. Proceedings of the National Academy of Sciences, USA 89, 8259-8263.

Kwo, P. Y., Schlauder, G. G., Carpenter, H. A., Murphy, P. J., Rosenblatt, J. E., Dawson, G. J., Mast, E. E., Krawczynski, K. & Balan, V. (1997). Acute hepatitis E by a new isolate acquired in the United States. Mayo Clinic Proceedings 72, 1133-1136.

Li, K., Zhuang, H. & Zhu, W. (2000). Partial nucleotide sequencing of hepatitis E viruses isolated from 14 cities of China: identification of 2 major variants of hepatitis E virus. Journal of Medical Virology (in press).

Mast, E. E., Alter, M. J. & Holland, P. V. (1998). Evaluation of assays for antibody to hepatitis E virus by a serum panel. Hepatology 27, 856-861.

Meng, X. J., Purcell, R. H., Halbur, P. G., Lehman, J. R., Webb, D. M., Tsareva, T. S., Haynes, J. S., Thacker, B. J. & Emerson, S. U. (1997). A novel virus in swine is closely related to the human hepatitis E virus. Proceedings of the National Academy of Sciences, USA 94, 9860-9865.

Meng, X. J., Halbur, P. G., Shapiro, M. S., Govindarajan, S., Bruna, J. D., Mushahwar, I. K., Purcell, R. H. & Emerson, S. U. (1998). Genetic and experimental evidence for cross-species infection by swine hepatitis E virus. Journal of Virology 72, 9714-9721.

Nielsen, H., Engelbrecht, J., von Heijne, G. & Brunak, S. (1996). Defining a similarity threshold for a functional protein sequence pattern: the signal peptide cleavage site. Proteins 24, 165-177.

Panda, S. K., Nanda, S. K., Zafrullah, M., Ansari, I. U. H., Ozdener, M. H. & Jameel, S. (1995). An Indian strain of hepatitis E virus (HEV): cloning, sequence, and expression of structural region and antibody responses in sera from individuals from an area of high-level HEV endemicity. Journal of Clinical Microbiology 33, 2653-2659.

Reyes, G. R., Purdy, M. A., Kim, J. P., Luk, K. C., Young, L. M., Fry, K. E. & Bradley, D. W. (1990). Isolation of a cDNA from the virus responsible for enterically transmitted non-A, non-B hepatitis. Science 247, 1335-1339.

Schlauder, G. G., Dawson, G. J., Erker, J. C., Kwo, P. Y., Knigge, M. F., Smalley, D. L., Rosenblatt, J. E., Desai, S. M. & Mushahwar, I. K. (1998). The sequence and phylogenetic analysis of a novel hepatitis E virus isolated from a patient with acute hepatitis reported in the United States. Journal of General Virology 79, 447-456.

Schlauder, G. G., Desai, S. M., Zanetti, A. R., Tassopoulos, N. C. & Mushahwar, I. K. (1999). Novel hepatitis E virus (HEV) isolates from Europe: evidence for additional genotypes of HEV. Journal of Medical Virology 57, 243-251.

Tam, A. W., Smith, M. M., Guerra, M. E., Huang, C. C., Bradley, D. W., Fry, K. E. & Reyes, G. R. (1991). Hepatitis E virus (HEV): molecular cloning and sequencing of the full-length viral genome. Virology 185, 120-131.

Torresi, J., Li, F., Locarnini, S. A. & Anderson, D. A. (1999). Only the non-glycosylated fraction of hepatitis E virus capsid (open reading frame 2) protein is stable in mammalian cells. Journal of General Virology 80, 1185-1188.

Tsarev, S. A., Emerson, S. U., Reyes, G. R., Tsareva, T. S., Legters, L. J., Malik, I. A. & Purcell, R. H. (1992). Characterization of a prototype strain of hepatitis E virus. Proceedings of the National Academy of Sciences, USA 89, 559-563.

Wang, Y. C., Ling, R., Erker, J. C., Zhang, H. Y., Li, H. M., Desai, S., Mushahwar, I. K. & Harrison, T. J. (1999). A divergent genotype of hepatitis E virus in Chinese patients with acute hepatitis. Journal of General Virology 80, 169-177.

Wu, J. C., Sheen, I. J., Chiang, T. Y., Sheng, W. Y., Wang, Y. J., Chan, C. Y. & Lee, S. D. (1998). The impact of travelling to endemic areas on the spread of hepatitis E virus infection: epidemiological and molecular analyses. Hepatology 27, 1415-1420.

Yang, J., Zhang, H., Wang, Y., Mao, Q. & Li, H. (2000). Partial sequence comparison of three sporadic hepatitis E variants. Chinese Journal of Virology (in press).

Yarbough, P. O., Tam, A. W., Fry, K. E., Krawczynski, K., McCaustland, K. A., Bradley, D. W. & Reyes, G. R. (1991). Hepatitis E virus: identification of type-common epitopes. Journal of Virology 65, 5790-5797.

Yin, S. R., Purcell, R. H. & Emerson, S. U. (1994). A new Chinese isolate of hepatitis E virus: comparison with strains recovered from different geographical regions. Virus Genes 9, 23-32.

Received 15 December 1999; accepted 21 March 2000.