Comparative analysis of the genome organization of human adenovirus 11, a member of the human adenovirus species B, and the commonly used human adenovirus 5 vector, a member of species C

Ya-Fang Mei, Johan Skog, Kristina Lindman and Göran Wadell

Department of Virology, Umeå University, SE-901 85 Umeå, Sweden

Correspondence
Göran Wadell
goran.wadell@climi.umu.se


   ABSTRACT
 
Adenovirus type 11 (Ad11), a member of the human adenovirus species B (HAdV-B), has a tropism for the urinary tract. The genome of Ad11 was found to comprise 34 794 bp and is 1141 bp shorter than the Ad5 genome of species HAdV-C. The G+C content of the Ad11 genome is 48·9 %, whereas that of Ad5 is 55·2 %. Ad11 and Ad5 share 57 % nucleotide identity and possess the same four early regions, but the E3 region of Ad11 could not be divided into E3A and E3B. The late genes of Ad11 and Ad5 are organized into six and five regions, respectively. Thirty-eight putative ORFs were identified in the Ad11 genome. The ORFs in the late regions, the E2B region and IVa2 show high amino acid identity between Ad11 and Ad5, whereas the ORFs in E1, E2A, E3 and E4, protein IX and the fibre protein show low amino acid identity. The highest and lowest identities were noted in the pre-terminal protein and fibre proteins: 85 % and 24·6 %, respectively. The E3 20·3K and 20·6K ORFs and the L6 agnoprotein were present in the Ad11 genome only, whereas the E3 11·6K cell death protein was identified only in Ad5. All ORFs but the E3 10·3K and L4 pVIII protein vary not only in composition but also in size. Ad11 may have a higher vector capacity than Ad5, since it has a shorter genome and a shorter fibre. Furthermore, in the E3 region, two additional ORFs can be deleted to give extra capacity for foreign DNA.

The complete sequence of human adenovirus 11 has been deposited in GenBank under accession no. AF532578.

The characteristics of the complete Ad11 genome were presented at the 12th International Congress of Virology in Paris, 27 July–1 August 2002, by Y.-F. Mei and others.


   INTRODUCTION
 
Human adenoviruses constitute a large group within the family Adenoviridae, with 51 serotypes identified so far. These serotypes are divided into six species, designated A–F, based on DNA homology (Green et al., 1979; Wadell et al., 1980; Wadell, 1984). The nine human adenovirus members of species B have been classified further into two subspecies, B:1 and B:2. Adenovirus 3 (Ad3), Ad7, Ad16, Ad21 and Ad50 belong to subspecies B:1 and cause respiratory infection, whereas Ad11, Ad34 and Ad35 of subspecies B:2 are associated with persistent infections of the urinary tract and kidney. Ad11 was first isolated from a stool sample of a patient with poliomyelitis and has subsequently been isolated from the urine of patients with haemorrhagic cystitis, as well as from pregnant women. Ad11 can also be found in healthy children and adults. Ad11, Ad34 and Ad35 are strongly over-represented among isolates from immunosuppressed patients and bone marrow transplant recipients.

Ad11 targets an unknown cellular receptor, which is more widely distributed than the usual coxsackie–adenovirus receptor (CAR) and is highly expressed on the surface of cells from various organs. Furthermore, relatively few persons have been exposed to Ad11 and, consequently, seroprevalence is low (D'Ambrosio et al., 1982). Previously, we demonstrated that Ad11 binds strongly to and replicates in kidney cells (HEK 293 cells), an endothelial cell line, and committed haematopoietic cell lines. Ad11 attaches most efficiently to carcinoma cell lines, for example from epithelial carcinoma (A549 cells), hepatoma (HepG2), prostatic cancer (DU 145 and LNCaP), laryngeal cancer (Hep2) and breast cancer (CAMA and MG7) and also to glioblastoma, medulloblastoma and neuroblastoma cells (Segerman et al., 2000; Mei et al., 2002; Skog et al., 2002; Zhang et al., 2003). Thus, Ad11 shows affinity for a broad range of host cells and has a higher binding capacity than members of species A, C, D, E and F (Mei et al., 2002).

Ad2 and Ad5 belong to species C. In 1984, the complete genome organization of Ad2 was reported (Roberts et al., 1984) and the complete Ad5 genome was presented some years later (Chroboczek et al., 1992). Ad5 has been the focus of vector development for more than two decades (Graham & Prevec, 1992), and both Ad2 and Ad5 have become useful vectors for gene therapy and vaccination. However, the use of Ad2- and Ad5-based vectors for human gene therapy has been hampered by pre-existing immunity against Ad2 and Ad5, which could affect the efficacy and even the safety of adenovirus vector administration. CAR is the host cell receptor for Ad2 and Ad5 (Bergelson et al., 1997; Tomko et al., 1997). CAR has recently been localized to the basolateral membrane rather than the apical surface of human airway epithelial cells (Walters et al., 2002). Thus, a number of human tissues that represent important targets for gene therapy, such as the airway epithelium and cancer cells, are refractory to adenovirus infection due to the limited availability of the adenovirus receptor. A prerequisite for the development of an alternative adenovirus vector is therefore that it should manifest low serum prevalence in the human population and undergo CAR-independent attachment, preferably with enhanced specificity and efficiency of gene transfer to target cells. Ad11 meets these requirements. In order to enable the construction of an Ad11-based vector, we present here the complete genome of human adenovirus type 11 and a comparative analysis with the commonly used vector Ad5.


   METHODS
 
Virus strains and viral DNA preparation.
Human Ad11 with the prototype genome type (strain Slobitski) was cultivated in the A549 cell line. The cells were grown in DMEM supplemented with 5 % foetal bovine serum. Purification of viral DNA was carried out as described previously (Shinagawa et al., 1983).

DNA cloning.
For sequencing, BamHI restriction fragments of Ad11 DNA were cloned into pBlueScript SK(-). The two termini, the BamHI F and B fragments, were separately purified from agarose gels. Two pairs of primers specific for the left and right termini were synthesized. The two terminal PCR fragments were then cloned into the pUC19 vector. Confirmatory sequencing of various parts of the genome was performed by sequencing the viral DNA directly. All reported sequences are the result of at least three sequencing reactions.

Genomic DNA sequencing.
The sequence reaction was carried out using the DYEnamic ET terminator cycle sequencing premix kit (Amersham Pharmacia). DNA sequencing was performed on an ABI PRISM 377 DNA sequencer. Assembly of the complete sequence of the Ad11 genome was accomplished using SeqMan software from the Lasergene package (DNAStar). Further sequence analysis was performed using the University of Wisconsin Genetics Computer Group (GCG) programs.

Nucleotide sequence accession numbers.
The complete sequence of human Ad11 has been deposited in GenBank under accession no. AF532578. Other human adenovirus sequences used for comparison had accession numbers X73487 (Ad12), AF108105 (Ad17), J01917 (Ad2), L19443 (Ad40), M73260 (Ad5) and AF394196 (human adenovirus species E, actually derived from chimpanzee adenovirus CV68).


   RESULTS AND DISCUSSION
 
Small portions of the sequence of Ad11p, the prototype genotype, strain Slobitski, have been reported previously by Mei & Wadell (1992, 1993). Here, we present the complete genome of Ad11. The new accession number AF532578 replaces the older accession numbers M94458 and L08231.

General characteristics of the Ad11 genome
The complete Ad11p genome is 34 794 base pairs (bp) in length, which is 1141 bp shorter than the Ad5 genome (35 935 bp). Ad5 is used as a vector and for this reason its genome is used here for comparison with that of Ad11. Four early regulatory regions, E1, E2, E3 and E4, are located in the genome of Ad11 essentially where they are located in the Ad5 genome. Ad5 possesses five late gene expression cassettes, whereas Ad11 contains six potential late regions according to the position of polyadenylation signals and ORFs. Thirty-eight putative open reading frames can be discerned in the Ad11 genome (Fig. 1; Table 1). The genome of Ad11 was analysed for recognition sites of 212 restriction enzymes; 204 of these were predicted to cut the genome at least once. Four enzymes each have a single recognition site. From the 5' to the 3' terminus, these unique cleavage sites are NheI at nt 6709, AscI at nt 7929, PacI at nt 18139 and RsrII at nt 23302. The eight restriction enzymes that cannot cut the genome of Ad11 are FseI, NotI, PmeI, SbfI, SfiI, SgfI, SrfI and SwaI.
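The restriction analysis described above amounts to counting recognition sites per enzyme and sorting the enzymes into non-cutters, unique cutters and multiple cutters. The following is a minimal sketch of that bookkeeping, not the procedure used in the study. It assumes the genome is available as a plain string; the recognition sequences are the standard ones for the enzymes named, the function names are hypothetical, and the toy sequence is invented for illustration and is not Ad11 sequence. All sites listed are palindromic, so scanning one strand is sufficient.

```python
import re

# Recognition sequences for some of the enzymes named in the text.
# IUPAC codes used here: N = any base, W = A or T.
SITES = {
    "NheI": "GCTAGC",
    "AscI": "GGCGCGCC",
    "PacI": "TTAATTAA",
    "RsrII": "CGGWCCG",
    "NotI": "GCGGCCGC",
    "PmeI": "GTTTAAAC",
    "SwaI": "ATTTAAAT",
    "SfiI": "GGCCNNNNNGGCC",
}
IUPAC = {"N": "[ACGT]", "W": "[AT]"}


def site_positions(genome, site):
    """Return 1-based start positions of every match, including overlaps."""
    pattern = "".join(IUPAC.get(base, base) for base in site)
    # A lookahead is used so that overlapping sites are all reported.
    return [m.start() + 1 for m in re.finditer(f"(?=({pattern}))", genome.upper())]


def classify(genome, sites=SITES):
    """Group enzymes by how many times they are predicted to cut."""
    result = {"none": [], "unique": [], "multiple": []}
    for enzyme, site in sites.items():
        n = len(site_positions(genome, site))
        key = "none" if n == 0 else "unique" if n == 1 else "multiple"
        result[key].append(enzyme)
    return result


# Hypothetical toy sequence, NOT the Ad11 genome.
toy = "AAGCTAGCTTTTAATTAACCGCTAGCAA"
groups = classify(toy)
```

Run against the deposited sequence (AF532578) with a full enzyme table, the "none" and "unique" groups would be expected to correspond to the eight non-cutters and four single cutters listed above; that check is left to the reader.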



 
Fig. 1. Genome organization of Ad11. The linear double-stranded genome is depicted in the centre as a double line, with the inverted terminal repeats (ITRs) at each end. Transcription units are shown as arrows, relative to their position and orientation in the Ad11 genome. These include early genes (E1A, E1B, E2A, E2B, E3 and E4), genes expressed at intermediate times of infection (IX and IVa2) and late genes (L1–L6). All of the late genes are expressed from the major late promoter (MLP) and contain the tripartite leader (TPL) at their 5' ends. A triangle represents the virus-associated (VA) RNA. One map unit (m.u.) is equivalent to 347·94 bp.
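The map unit scale used in this paper divides the genome into 100 equal parts. A short sketch of the conversion follows, assuming only the genome length stated in the text; the function names are illustrative.

```python
AD11_GENOME_BP = 34_794  # Ad11 genome length given in the text


def nt_to_map_units(nt, genome_bp=AD11_GENOME_BP):
    """Convert a nucleotide position to map units (0-100 scale)."""
    return 100.0 * nt / genome_bp


def map_units_to_nt(mu, genome_bp=AD11_GENOME_BP):
    """Convert map units to the nearest nucleotide position."""
    return round(mu * genome_bp / 100.0)


bp_per_mu = AD11_GENOME_BP / 100  # 347.94 bp per map unit, as in the legend
```

For example, `nt_to_map_units(26864)` places the start of the E3 region at roughly 77.2 m.u.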

 

 
Table 1. Adenovirus 11 genome

The beginning of each region was defined by the location of either the TATA box or the ATG of the first ORF, whereas the ending of each region was defined by the position of the first poly(A) signal. R, residues.

 
The Ad11 genome has a base composition of 24·5 % G, 24·4 % C, 26 % A and 25·1 % T. Thus, the Ad11 genome has a G+C content of 48·9 %, which is similar to a previous estimate (51 %) derived from biochemical analysis (Green, 1970; Green et al., 1979). The Ad5 genome has a G+C content of 55·2 % and the base composition is 27·2 % G, 28 % C, 23·3 % A and 21·5 % T. Thus, Ad5 has a higher G+C content than Ad11. Nucleotide sequence comparison between Ad11 and other human adenoviruses indicated that Ad11 of subspecies B:2 has a genome sequence that differs distinctly from that of any other adenovirus species. The degree of identity of Ad11 to Ad17 of species D, CV68 of species E, Ad5 of species C, Ad12 of species A and Ad40 of species F is 62 %, 60 %, 57 %, 43 % and 39 %, respectively (Clustal method, DNAStar program). The open reading frames of Ad11 were determined and the putative gene products were identified based on their similarity to the corresponding ORFs of Ad5 or Ad2 (Table 1).
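The G+C values quoted above follow directly from the base compositions. A minimal sketch of the arithmetic is given here, together with a helper for computing G+C content from a raw sequence; the helper and variable names are illustrative and ambiguous bases are simply ignored, which is an assumption of this sketch.

```python
# Base compositions (%) as reported in the text.
ad11 = {"G": 24.5, "C": 24.4, "A": 26.0, "T": 25.1}
ad5 = {"G": 27.2, "C": 28.0, "A": 23.3, "T": 21.5}


def gc_from_composition(comp):
    """G+C content (%) from a per-base percentage table."""
    return comp["G"] + comp["C"]


def gc_percent(seq):
    """G+C content (%) of a sequence, counting only A, C, G and T."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    acgt = sum(seq.count(base) for base in "ACGT")
    return 100.0 * gc / acgt if acgt else 0.0


ad11_gc = gc_from_composition(ad11)  # 48.9
ad5_gc = gc_from_composition(ad5)    # 55.2
```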

Early regions
At the two ends of the Ad11 genome, the 137 bp inverted terminal repeats (ITRs) initiate DNA replication, which proceeds by a strand displacement mechanism. The displaced single strand is subsequently used as the template for a second round of DNA synthesis. The presence of the ITR enables the 5' and 3' ends of the displaced strand to interact to form a panhandle structure, which resembles the terminus of the parental DNA and can thus serve as an origin for second-strand DNA synthesis. Elongation can then proceed. The 22 nucleotides at the two extreme ends are identical between Ad11 and Ad5 and contain the ATAATATACC consensus sequence, which has been defined as the minimum origin for a site recognized by the pre-terminal protein–viral DNA polymerase (pTP–pol) and forms a pTP–pol complex (Chen et al., 1990). The ITR of Ad5 has binding sites for nuclear factors I and III (NFI and NFIII) and these are relatively well conserved in the Ad11 ITR. The 103-nucleotide ITR of Ad5 shows 65 % identity to the 137-nucleotide ITR of Ad11. From the left ITR of Ad11, the 1530 bp downstream sequence covers two regions: the encapsidation site and the E1A gene, which show 52 % and 57 % identity to the corresponding sequences of members of species C, respectively.

The encapsidation region is located between nt 249 and 392 in the Ad11 genome and shows 67·4 % identity to the same region of Ad5. The critical encapsidation region of Ad5 shows three EF-1A and two E2F transcription factor binding sites. In contrast, the corresponding region of Ad11 shows high similarity to the E2F binding site but little or no similarity to the EF-1A binding site. A repeated element that is crucial for packaging has been identified in the Ad5 genome. This element includes a conserved TTTG motif followed by eight nucleotides and CG (TTTGN8CG), as indicated in Fig. 2. Among the four repeat elements for packaging, only the last one of Ad11 is identical to that of Ad5 (Schmid & Hearing, 1998). It has been reported that even two mismatches could reduce the packaging efficiency to less than 10 %. Thus, it is probable that the function of the packaging domain of Ad5 would be incompatible with packaging of the Ad11 genome.
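The packaging repeat described above (TTTG, eight arbitrary nucleotides, then CG) can be located with a simple pattern search, and two repeats can be compared by counting mismatches. The sketch below assumes a plain-string sequence; the function names and the toy sequence are hypothetical and do not reproduce the alignment in Fig. 2.

```python
import re

# TTTG + 8 arbitrary nucleotides + CG; lookahead keeps overlapping hits.
PACKAGING_MOTIF = re.compile(r"(?=(TTTG[ACGT]{8}CG))")


def find_packaging_repeats(seq):
    """Return (1-based start, matched 14-mer) for each TTTGN8CG element."""
    return [(m.start() + 1, m.group(1)) for m in PACKAGING_MOTIF.finditer(seq.upper())]


def mismatches(a, b):
    """Count positions at which two equal-length elements differ."""
    if len(a) != len(b):
        raise ValueError("elements must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical toy sequence, not taken from Ad11 or Ad5.
toy = "GGTTTGACGTACGTCGAA"
hits = find_packaging_repeats(toy)
```

Counting mismatches between corresponding Ad11 and Ad5 elements in this way gives a quick indication of which repeats fall below the two-mismatch tolerance mentioned in the text.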



 
Fig. 2. Comparison of the nucleotide sequences of the left 400 bp of the Ad11 and Ad5 genomes. The DNA sequences of the left ends of the two serotypes covering the left ITR and the encapsidation region were aligned using GCG software. Identical nucleotides are indicated by asterisks and gaps are marked with dots. The grey-shaded region indicates the conserved AT-rich sequence that is important for the binding of the pTP/pol complex. The sequences for nuclear factor I (NFI) and NFIII are marked and the bold letters represent the consensus sequence. The sequences corresponding to EF-1A and E2F, the transcription factor binding sites, are indicated below the sequence. All repeat packaging elements, TTTG-N8-CG, are indicated by black boxes and marked as AI, AII, AV and AVI.

 
The gene products of the Ad5 E1 region are associated with transactivation of viral and cellular genes, transformation of cells in culture and induction of cellular DNA synthesis and mitosis. The E1 transcription unit of Ad11 is located at the left end of the genome, as in Ad5, and in view of the presence of a polyadenylation signal, the E1 region can be divided into E1A and E1B portions. The E1A region is located between 1·36 and 4·3 map units (m.u.). Three different types of ORF with varied splicing patterns were identified in the E1A region of Ad11. The ORFs with 262, 231 and 58 residues (R) of the Ad11 E1A region correspond to ORFs with 290, 243 and 55 residues within the Ad5 E1A region and share identities of only 35·9, 36·4 and 14·3 %, respectively. The pRb binding motif (LXCXE) was identified in the Ad11 E1A 262R and 231R ORFs, whereas a zinc-finger motif (CX2CX13CX2C) was only found in the 262R of Ad11 E1A (Whyte et al., 1988). The 21K and 55K proteins of Ad11 E1B show identities of 49·7 and 53·1 % to Ad5 E1B proteins. Thus, it is possible that the E1 proteins of Ad11 have functions similar to those of the Ad5 E1 proteins.
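The two E1A motifs mentioned above, LXCXE and CX2CX13CX2C, translate directly into regular expressions in which X is any residue. The sketch below is an illustration under that assumption; the function name and the toy peptides are invented and are not the Ad11 E1A sequences.

```python
import re

RB_BINDING = re.compile(r"L.C.E")                # LXCXE
ZINC_FINGER = re.compile(r"C.{2}C.{13}C.{2}C")   # CX2CX13CX2C


def motif_hits(protein, pattern):
    """Return (1-based start, matched residues) for each occurrence."""
    return [(m.start() + 1, m.group()) for m in pattern.finditer(protein.upper())]


# Hypothetical toy peptides for illustration only.
toy_rb = "MADLTCHEAG"
toy_zn = "KK" + "CAAC" + "G" * 13 + "CAAC" + "KK"

rb_hits = motif_hits(toy_rb, RB_BINDING)
zn_hits = motif_hits(toy_zn, ZINC_FINGER)
```

Applied to the 262R and 231R ORFs, the expectation from the text is that both contain an LXCXE hit and only 262R contains the zinc-finger pattern.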

Early region 2 (E2) is divided into two parts, E2A and E2B. Three E2-encoded proteins are necessary for viral DNA replication: DNA polymerase, pre-terminal protein (pTP) and DNA binding protein (DBP). The DNA polymerase gene is situated at nt 8435–5067, the pTP gene at nt 10354–8438 and the DBP at nt 23402–21846. The three corresponding proteins of Ad11 have 78·7, 85 and 57·1 % identity to Ad5, respectively. The Ad11 pTP possesses two ascertained cleavage sites (154MRGF◊G158 and 315MRGG◊V319) and one putative site (162MHGR-T166), whereas Ad5 pTP has the three cleavage sites that are known substrates of adenovirus protease. The nuclear localization sequence (349RLPVRRRRRVP360) of Ad11 pTP is identical to the corresponding sequence of Ad5. Analysis of the Ad11 DBP revealed that there are two zinc-binding domains, which correspond to those of Ad5 DBP. The first (HXCX8CXH) is conserved in Ad11 (aa 259–273) and Ad5 (aa 273–287). The second is characterized by cysteine residues at positions 396, 398, 450 and 467 in Ad5 DBP, all of which are conserved in Ad11 (aa 383, 385, 437 and 454) (Fig. 3). In Ad2, the N-terminal domain is heavily phosphorylated at serine and threonine residues and contains most of the phosphate moieties bound by the DBP molecule. In a comparison of the first 100 nucleotides of the gene, it was noted that the putative Ad11 DBP manifests 22 serine or threonine residues, whereas only 18 such residues are present in Ad5. In the Ad5 DBP, a bipartite nuclear localization signal (NLS), 42PPKKR46 and 84PKKKKK89, was identified in the N-terminal domain (Morin et al., 1989) and a similar sequence, 44PPKRN48 and 84PPKKKP89, is present in Ad11 DBP, which could function as an NLS for the transport of the protein into the nucleus. The hydrophilic 13-amino acid sequence MRTQEEEEEPSEA in Ad5 is absent from Ad11 and thus the overall size of the Ad11 DBP (518 amino acids) is shorter than Ad5 DBP (529 amino acids).
The E2B region of Ad5 encodes the DNA polymerase and pTP. These display significant homologies to the corresponding proteins of Ad11.
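The gene coordinates quoted above can be related to protein length with simple arithmetic. The sketch below assumes that each coordinate pair is inclusive, describes an unspliced coding region and includes the stop codon; these are assumptions of the illustration rather than statements from the text. Under them, the DBP coordinates reproduce the 518 residues reported for Ad11 DBP.

```python
def span_length(start, end):
    """Inclusive length of a region; works for leftward genes (start > end)."""
    return abs(end - start) + 1


def residues_from_span(start, end, includes_stop=True):
    """Number of amino acids encoded by an unspliced coding span."""
    codons, remainder = divmod(span_length(start, end), 3)
    if remainder:
        raise ValueError("span length is not a multiple of three")
    return codons - 1 if includes_stop else codons


# Ad11 DBP coordinates from the text (leftward-transcribed gene).
dbp_nt = span_length(23402, 21846)
dbp_residues = residues_from_span(23402, 21846)
```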



 
Fig. 3. Comparison of the amino acid sequences of the DNA binding proteins of Ad11 and Ad5. Identical and conserved amino acid residues are indicated by asterisks. Two dots show highly similar amino acids and one dot shows some similarity in the properties of the amino acids. Nuclear localization signals are shaded grey. The sequence motifs for coordinating zinc atoms are indicated in black.

 
The E3 proteins encoded by species C adenoviruses are involved in modifying the host immune response. E3 is located between the genes for the precursor protein VIII (L4) and the fibre protein (L5). The E3 region of Ad11 is situated between nt 26864 and 30625 and only one polyadenylation signal, characterized by the AATAAA motif, and one TATA box (29177TATAAAAG29184) are present in the Ad11 E3 region. Five Ad11 E3 ORFs, 12·1K, 18·5K, 10·3K, 15·2K and 15·3K, have homologues in Ad5, showing identities of 53·3, 35·8, 50, 31·8 and 53·1 %, respectively. The E3 20·3K and 20·6K ORFs have been demonstrated previously in both the Ad11 and Ad35 genomes, but do not have counterparts in the Ad5 genome (Flomenberg et al., 1988; Mei & Wadell, 1992). The Ad5 E3 11·6K protein, also designated the adenovirus death protein (ADP), is encoded neither in the Ad11 genome nor in the Ad35 genome, but corresponding 9·0K and 7·7K proteins have been detected in Ad3 and Ad7 of species B:1 (Hong et al., 1988). It has been reported that the ADP facilitates late cytolysis of the infected cell and causes efficient release of progeny virus (Tollefson et al., 1996). Plaques formed in Ad5-infected cells were more transparent than in cells infected by Ad11; thus, this phenomenon could be due to dysfunction or absence of the 11·6K protein.

The E4 transcription units from the closely related species C serotypes 2 and 5 have been well defined concerning their transcriptional and post-transcriptional regulation and their gene products. The mRNAs potentially encode six different polypeptides, the products of ORF1, ORF2, ORF3, ORF3/4, ORF6 and ORF6/7. All but ORF3/4 have been reported to exist in infected cells. In Ad11, the E4 transcription unit is located between m.u. 91·4 and 99·1 at the right end of the adenovirus genome and is transcribed in a leftward direction. The E4 unit is controlled by the E4 promoter and should generate a primary transcript of approximately 2700 bases. The E4 region of Ad11 possesses all the ORFs that appear in Ad5 E4. However, the amino acid identity of E4 ORF3, ORF4 and ORF6 between Ad11 and Ad5 was only 51·3, 44·3 and 59 %, respectively.

Intermediate regions of the Ad11 genome
The genes encoding protein IX and protein IVa2 were originally classified as E1B and E2B regions, respectively. Recently, these two proteins were grouped as intermediate genes (Tribouley et al., 1994; Lutz & Kedinger, 1996; Rosa-Calatrava et al., 2001). The activity of the adenovirus major late promoter is enhanced by protein IVa2. The Ad11 IVa2 sequence shows 79·1 % identity to the Ad5 virus counterpart. Protein IX is a minor capsid component and is essential for the packaging of viral DNA; it also possesses transcriptional properties (Ghosh-Choudhury et al., 1987; Rosa-Calatrava et al., 2001). Sequence identity of protein IX between Ad11 and Ad5 was only 45·3 % (Table 1 and Table 2). This low value implies that the Ad5 protein IX may not package the Ad11 genome efficiently. The Ad3 genome is packaged into the Ad5 vector with low efficiency (Ghosh-Choudhury et al., 1987; Ostapchuk & Hearing, 2001).


 
Table 2. Comparison of the gene products in Ad11 and Ad5

 
TATA boxes and polyadenylation sequences
Two TATAAAAG boxes were identified in the Ad11 genome. One is located at the major late promoter region, as found in Ad5, and the other lies within nt 29177–29184, just in front of E3B. However, the second TATAAAAG box is not present in Ad5 E3 region; instead, there is a similar sequence, TACAAAAG. An additional box, 18770TACAAAAG18777, can be found downstream of the Ad11 L4 region. Thus, Ad11 shows three TAT/CAAAAG boxes and Ad5 has two. The TATAAAAG box and TACAAAAG box manifest differences in cyclization kinetics of DNA with or without TATA-box binding protein (Davis et al., 1999).
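The boxes discussed above differ only at the third position, so both variants can be found with a single pattern, TA[T/C]AAAAG. The sketch below scans the forward strand only, which is an assumption of the illustration; the function name and the toy sequence are hypothetical.

```python
import re

# Matches TATAAAAG and TACAAAAG.
BOX = re.compile(r"TA[TC]AAAAG")


def find_boxes(seq):
    """Return (1-based start, 1-based inclusive end, box) for each match."""
    return [(m.start() + 1, m.end(), m.group()) for m in BOX.finditer(seq.upper())]


# Hypothetical toy sequence, not taken from Ad11 or Ad5.
toy = "GGTATAAAAGCCTACAAAAGTT"
boxes = find_boxes(toy)
```

The coordinates are reported in the same inclusive style as in the text, for example 29177TATAAAAG29184 for an 8-nt box.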

The polyadenylation signal was used in dividing early and late expression cassettes and also in separating the E3 region into E3A and E3B. However, only one polyadenylation signal is present at the end of the Ad11 E3 region, while two polyadenylation signals characterize the Ad5 E3 region. Thus, the E3 region of Ad11 cannot be divided into E3A and E3B portions as defined in Ad5. More interestingly, a potential ORF encoding 169 residues was found downstream of the fibre gene and followed by two additional polyadenylation signals. This late gene expression cassette has not been reported in the Ad2 or the Ad5 genome sequences.

Virus-associated RNA region
Adenovirus virus-associated (VA) RNAs are short RNA polymerase III transcripts. The human Ad2, Ad5 and Ad7 genomes each carry two VA RNA genes, VA RNA I and II. In Ad11, only one VA RNA can be found (Table 3) (Kidd et al., 1995). The VA RNAs of Ad5 are located in tandem at nt 10621–11031 of the genome. In Ad11, the VA RNA gene is located at nt 10432–10588. The Ad11 VA RNA region and the Ad5 VA RNA region comprise 157 and 411 nucleotides, respectively. Nucleotide comparison showed that the single Ad11 VA RNA I is more closely related to Ad5 VA RNA II and the degree of identity is 63·6 % and 66·8 % to the genes of Ad5 and Ad2, respectively. The VA RNAs have been suggested to interact with protein kinase PKR, thus facilitating the down-regulation of the interferon-mediated antiviral response and selective translation of viral mRNAs in infected cells (Ghadge et al., 1994). The functional role of the VA RNA I and II from Ad5 is still unclear, but VA RNA is known to confer resistance to α-interferon on the infected cell. The latter molecule plays a role as an RNA silencer in mammalian cells (Vance & Vaucheret, 2001).


 
Table 3. Genome differences between Ad11 and Ad5

 
Late genes
The transcription of Ad11 late genes is initiated predominantly at the major late promoter (MLP), after the initiation of DNA replication, which usually occurs at 8 h post-infection. In Ad11, the MLP is located at 16·78 m.u. and contains an inverted CAAT box at nt 5842–5839, an upstream stimulatory factor (USF) binding site (nt 5859–5867) and a canonical TATAAAAG box (nt 5888–5893). The initiator element (INR) (TCACTGTCTTCC) of Ad11 (nt 5916–5927) is located 25 bp downstream from the last A of the TATAAAA box. The G at the sixth base pair in the INR of Ad11 is replaced by C in Ad5 (Fig. 4). The INR sequence has been demonstrated to enhance initiation of transcription from the MLP in Ad2 (Hu & Manley, 1981). The two sequence elements, DE1 and DE2, at nt 6004–6014 and 6019–6036, downstream from the transcription site, are identical to the same elements of Ad5. These two elements enhance transcription from the MLP after the onset of DNA replication. The MLP region from nt 5819 to 6041 of the Ad11 genome shows 78·9 % identity to the Ad5 MLP. The tripartite leader (TPL) in the Ad11 genome, situated at nt 5919–5959 (41 nucleotides), nt 6979–7050 (72 nucleotides) and nt 9493–9579 (87 nucleotides), shows identities of 80, 77·8 and 73·6 %, respectively, to the tripartite leader of Ad5 (Table 3). The overall length of the Ad5 and Ad11 TPLs is 200 nt.
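The leader lengths quoted above follow from the inclusive coordinates of the three segments. A brief check of the arithmetic is sketched here, using only the positions given in the text; the variable names are illustrative.

```python
# Ad11 tripartite leader segments (inclusive nt coordinates from the text).
tpl_segments = [(5919, 5959), (6979, 7050), (9493, 9579)]


def span_length(start, end):
    """Inclusive length of a coordinate range."""
    return end - start + 1


segment_lengths = [span_length(s, e) for s, e in tpl_segments]
tpl_total = sum(segment_lengths)
```

The three segments come to 41, 72 and 87 nucleotides, giving the overall TPL length of 200 nt stated above.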



 
Fig. 4. Comparison of the transcriptional control elements of the major late promoter (MLP) regions of Ad11 and Ad5. The transcription control elements are shaded. DE1 and DE2 enhance transcription from the MLP after the onset of DNA replication. USF, upstream stimulatory factor; INR, initiator element.

 
In the Ad5 genome, the late regions are transcribed from the MLP after the onset of viral DNA replication. The primary transcripts are then cleaved and become polyadenylated at one of five locations along the genome, forming five late regions (L1–L5). The late regions encode mostly structural proteins and some proteins involved in the morphogenesis of the virion. In contrast, the Ad11 genome can be divided into six late families (L1–L6) since polyadenylation sites exist at six places along the genome. At least 16 putative late genes have been identified from the Ad11 MLP, extending to the right for approximately 29 kb and terminating at around nt 33915.

Two putative ORFs, for the 52/55K and precursor protein IIIa (pIIIa), are encoded within the L1 region. The 52/55K protein is 27 amino acids shorter in Ad11 (388 amino acids) than in Ad5 (415 amino acids). The amino acid identity between Ad11 and Ad5 is 75 % for the 52/55K protein and 77·7 % for pIIIa. The 52/55K protein is involved in virion assembly (Hasson et al., 1989). pIIIa is a virion phosphoprotein, which is cleaved by the viral protease at 567LGGR◊G571 during virion assembly and associated with the hexon polypeptide (Cuillel et al., 1987).

Four potential ORFs are encoded within the L2 region. Protein III, also designated the penton base protein, is one of three major capsid proteins and plays an important role during virus internalization. It interacts with αvβ3 and αvβ5 integrins on the cell surface via its RGD motif (Wickham et al., 1993). The RGD motif is found in the penton base proteins of both Ad11 and Ad5. Ad11 can attach efficiently to and be expressed in some peripheral lymphocytes (Segerman et al., 2000), which contain few integrin receptors. Consequently, Ad11 may enter these cells through an RGD-independent pathway. In addition, the LDV motif that interacts with α4β1 integrin also exists in the penton base proteins of both Ad11 and Ad5 (Komoriya et al., 1991). However, the function of the LDV motif during adenovirus infection is still unknown. A fibre-interacting domain, HSRLSNLLGIRKR, in the Ad5 penton base (Caillet-Boudin, 1989) is conserved with the exception of two residues: the first histidine is substituted by glutamic acid and the last arginine by lysine in Ad11. Furthermore, the Ad11 penton base peptide is 10 residues shorter and shares 73·5 % identity with the Ad5 penton base. The L2 region of Ad11 also encodes precursor protein VII (pVII), protein V and precursor protein X (pX). All of these proteins are rich in arginine and lysine and are known to function as core proteins that bind to virus DNA. The pVII, protein V and pX of Ad11 show 72·5, 65·8 and 74·7 % identity to the equivalent proteins of Ad5. Ad11 pVII is 192 residues in length and contains one adenovirus protease cleavage site (21MYGG↓A25). Ad11 pX comprises 76 residues and contains two consensus protease cleavage sites (25MLGR↓G29 and 43LRGG↓F47). Two functionally identical cleavage sites were also found in Ad5 pX (24MAGH↓G28 and 48MRGG↓I52).

Precursor protein VI (pVI), protein II (hexon) and the 23K protease are encoded by the L3 region. Ad11 pVI is four amino acid residues shorter than Ad5 pVI. They share 68·3 % amino acid sequence identity. Two endoprotease cleavage motifs (30LNGG↓A34 and 232IVGL↓G236) are present near the N and C termini of Ad11 pVI. These motifs are also present in Ad5 pVI. pVI is associated with the transport of hexon molecules to the nucleus, but the mature VI protein is a minor capsid component (Kauffman & Ginsberg, 1976). The hexon proteins of Ad11 and Ad5 contain 948 and 952 residues, respectively. Seven variable regions have been defined as being type-specific epitopes by X-ray crystallographic studies of the Ad5 hexon (Rux & Burnett, 2000). A comparison between the hexons of Ad11 and Ad5 reveals a relatively high overall identity (78·8 %). However, pronounced differences between Ad11 and Ad5 were found in all seven hypervariable regions (HVRs), especially in HVR I, II and IV. Fewer acidic amino acids were exposed in the HVR I of Ad11, as compared with five glutamic acid residues in Ad5.

The L4 region of Ad11 encodes four ORFs, for the 100K, 33K, L4 agnoprotein and precursor protein VIII (pVIII). The 100K protein shows 64·8 % amino acid identity between Ad11 and Ad5. This protein attaches to the hexon and takes part in its folding and transport to the nucleus. With the help of the 100K protein, the hexon protein can fold into a trimer (Oosterom-Dragon & Ginsberg, 1981). The hexon proteins of Ad11 and Ad5 show a relatively high identity (78·8 %). One may speculate that the pronounced differences in the 100K protein between Ad11 and Ad5 might identify a species-specific protein. The 100K protein manifests a high identity (97·7 %) between Ad2 and Ad5 within species C, whereas the same protein shows a low identity (64·8 %) between Ad11 and Ad5.

The L4 33K proteins of Ad11 and Ad5 differ in size, comprising 226 and 229 amino acid residues, respectively. The Ad11 33K coding region comprises two exons located at nt 25603–25921 and nt 26090–26452, respectively. The L4 33K proteins of Ad11 and Ad5 manifest an identity of 57·6 %. A comparison between Ad11 and Ad5 revealed that the additional residues are mainly in the middle part of Ad5 33K, with segments of 7 and 3 amino acids that are absent from the Ad11 protein. However, at the N terminus, serine–threonine-rich motifs are retained in both the Ad11 and Ad5 proteins. Therefore, both can function as phosphoproteins. pVIII in the L4 region contains 227 amino acids and is the only ORF among the late genes displaying a relatively high degree of identity (79·8 %) between Ad11 and Ad5. In Ad5 and Ad11 pVIII, three adenovirus protease cleavage sites (L/IXGG◊X or L/IXGX◊G) are found. pVIII, together with three other hexon-associated proteins, designated pIIIa, pVI and protein IX, connects the core with the inner surface of the adenovirus capsid, thus stabilizing it.
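The protease cleavage consensus given above, (L/I)XGG-X or (L/I)XGX-G, can be scanned for with two patterns. The sketch below is an illustration only: the function name is hypothetical, and because several sites quoted elsewhere in the text begin with methionine (for example MLGR-G in pX), the set of residues allowed at the first position is exposed as a parameter rather than fixed. The toy peptide joins two pentapeptides quoted in the text for Ad11 pX with invented flanking residues.

```python
import re


def cleavage_sites(protein, p4="LI"):
    """Find candidate protease sites.

    Returns sorted (1-based start, pentapeptide, site type) tuples for
    [p4]XGG-X and [p4]XGX-G, where X is any residue.
    """
    protein = protein.upper()
    patterns = {
        "XGG-X": re.compile(rf"(?=([{p4}].GG.))"),
        "XGX-G": re.compile(rf"(?=([{p4}].G.G))"),
    }
    hits = []
    for name, pattern in patterns.items():
        hits += [(m.start() + 1, m.group(1), name) for m in pattern.finditer(protein)]
    return sorted(hits)


# Toy peptide: LRGGF and MLGRG are quoted in the text; flanks are invented.
toy = "AAALRGGFKKMLGRGAA"
strict = cleavage_sites(toy)             # L or I only at the first position
relaxed = cleavage_sites(toy, p4="LIM")  # also allow M at the first position
```

A pentapeptide that satisfies both patterns is reported once per pattern, so downstream code may wish to merge hits by position.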

The L5 region contains only one ORF, encoding the fibre protein, which was originally designated protein IV in Ad2 and Ad5. The fibres of Ad11 and Ad5 differ substantially in size and contain 325 and 582 amino acid residues, respectively. The two fibre proteins show the least identity of all the proteins specified by the two viruses, with only 24·6 % identity for the whole fibre and 29·8 % for the fibre knob. A hydrophobic motif (FNPVYPYE/D) is present at the N-terminal end of both fibres. This motif is involved in the specific interaction between the fibre tail and the penton base (Caillet-Boudin, 1989). The knob of the fibre protein, situated towards the C terminus, contains the CAR-binding site and includes residues S408, P409, K417, K420 and Y477 in Ad5 (Roelvink et al., 1999). Among these residues, only K420 could be identified in Ad11. This could explain why Ad11 does not bind to CAR (Skog et al., 2002). The epitopes expressed on the Ad11 fibre knob have previously been characterized by Mei & Wadell (1996).

A conspicuous difference between the Ad11 and Ad5 genomes is that Ad11 possesses an additional putative ORF situated downstream of the fibre gene. This ORF is followed by two polyadenylation signals (AATAAA), and the region is therefore designated L6. It encodes a putative, as yet unidentified, protein of 169 amino acids. In contrast, the two polyadenylation signals are absent from the corresponding region of the Ad5 genome. Consequently, there may be six late regions in the Ad11 genome but only five in the Ad5 genome.
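A search for polyadenylation signals downstream of a candidate ORF, as described above, can be sketched in a few lines of Python. This is an illustration under stated assumptions and not the procedure used in this study: the function name, the 200 nt window and the toy sequence are hypothetical, and only the given strand is scanned.

```python
def find_polya_signals(genome, orf_end, window=200, signal="AATAAA"):
    """Return 1-based start positions of `signal` downstream of an ORF.

    orf_end is the 1-based coordinate of the last nucleotide of the ORF.
    Only signals lying entirely within the `window` nucleotides that
    follow orf_end are reported.
    """
    seq = genome.upper()
    region = seq[orf_end:orf_end + window]  # first nt after the ORF onwards
    positions = []
    idx = region.find(signal)
    while idx != -1:
        positions.append(orf_end + idx + 1)  # convert to 1-based coordinate
        idx = region.find(signal, idx + 1)
    return positions


# Hypothetical toy sequence: a 9 nt ORF followed by two AATAAA signals.
toy_genome = "ATGGCCTAA" + "GGC" + "AATAAA" + "CGT" + "AATAAA" + "GG"
print(find_polya_signals(toy_genome, orf_end=9))
```

For a real genome, the sequence would be read from the deposited GenBank record and orf_end set to the annotated end of the ORF of interest.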

Conclusions
Overall, the length and general organization of the Ad11 genome are similar to those of the Ad5 genome. However, the Ad11 genome is distinguished by a markedly different packaging region, VA RNA region, fibre protein and highly variable region of the hexon, by a relatively large E3 region and possibly by six late gene expression cassettes. Ad11, a newly characterized member of subspecies B:2, could therefore become a vector candidate for human gene therapy and vaccination, both because of its lower seroprevalence in the human population and because of its affinity for a non-CAR receptor, which is apparently expressed on all human cells tested.

Note added in proof. After the submission of this paper, our attention was drawn to a GenBank entry (AY163756) that describes the complete genome of Ad11 and cites a paper in press by D. Stone, A. Furthmann, V. Sandig & A. Lieber.


   REFERENCES
 
Bergelson, J. M., Cunningham, J. A., Droguett, G., Kurt-Jones, E. A., Krithivas, A., Hong, J. S., Horwitz, M. S., Crowell, R. L. & Finberg, R. W. (1997). Isolation of a common receptor for coxsackie B viruses and adenoviruses 2 and 5. Science 275, 1320–1323.

Caillet-Boudin, M. L. (1989). Complementary peptide sequences in partner proteins of the adenovirus capsid. J Mol Biol 208, 195–198.

Chen, M., Mermod, N. & Horwitz, M. S. (1990). Protein–protein interactions between adenovirus DNA polymerase and nuclear factor I mediate formation of the DNA replication preinitiation complex. J Biol Chem 265, 18634–18642.

Chroboczek, J., Bieber, F. & Jacrot, B. (1992). The sequence of the genome of adenovirus type 5 and its comparison with the genome of adenovirus type 2. Virology 186, 280–285.

Cuillel, M., Milleville, M. & D'Halluin, J. C. (1987). Expression of protein IIIa of human adenovirus type 2 in Escherichia coli. Gene 55, 295–301.

D'Ambrosio, E., Del Grosso, N., Chicca, A. & Midulla, M. (1982). Neutralizing antibodies against 33 human adenoviruses in normal children in Rome. J Hyg 89, 155–161.

Davis, N. A., Majee, S. S. & Kahn, J. D. (1999). TATA box DNA deformation with and without the TATA box-binding protein. J Mol Biol 291, 249–265.

Flomenberg, P. R., Chen, M. & Horwitz, M. S. (1988). Sequence and genetic organization of adenovirus type 35 early region 3. J Virol 62, 4431–4437.

Ghadge, G. D., Malhotra, P., Furtado, M. R., Dhar, R. & Thimmapaya, B. (1994). In vitro analysis of virus-associated RNA I (VAI RNA): inhibition of the double-stranded RNA-activated protein kinase PKR by VAI RNA mutants correlates with the in vivo phenotype and the structural integrity of the central domain. J Virol 68, 4137–4151.

Ghosh-Choudhury, G., Haj-Ahmad, Y. & Graham, F. L. (1987). Protein IX, a minor component of the human adenovirus capsid, is essential for the packaging of full length genomes. EMBO J 6, 1733–1739.

Graham, F. L. & Prevec, L. (1992). Adenovirus-based expression vectors and recombinant vaccines. Biotechnology 20, 363–390.

Green, M. (1970). Oncogenic viruses. Annu Rev Biochem 39, 701–756.

Green, M., Mackey, J. K., Wold, W. S. & Rigden, P. (1979). Thirty-one human adenovirus serotypes (Ad1–Ad31) form five groups (A–E) based upon DNA genome homologies. Virology 93, 481–492.

Hasson, T. B., Soloway, P. D., Ornelles, D. A., Doerfler, W. & Shenk, T. (1989). Adenovirus L1 52- and 55-kilodalton proteins are required for assembly of virions. J Virol 63, 3612–3621.

Hong, J. S., Mullis, K. G. & Engler, J. A. (1988). Characterization of the early region 3 and fiber genes of Ad7. Virology 167, 545–553.

Hu, S. L. & Manley, J. L. (1981). DNA sequence required for initiation of transcription in vitro from the major late promoter of adenovirus 2. Proc Natl Acad Sci U S A 78, 820–824.

Kauffman, R. S. & Ginsberg, H. S. (1976). Characterization of a temperature-sensitive, hexon transport mutant of type 5 adenovirus. J Virol 19, 643–658.

Kidd, A. H., Garwicz, D. & Oberg, M. (1995). Human and simian adenoviruses: phylogenetic inferences from analysis of VA RNA genes. Virology 207, 32–45.

Komoriya, A., Green, L. J., Mervic, M., Yamada, S. S., Yamada, K. M. & Humphries, M. J. (1991). The minimal essential sequence for a major cell type-specific adhesion site (CS1) within the alternatively spliced type III connecting segment domain of fibronectin is leucine–aspartic acid–valine. J Biol Chem 266, 15075–15079.

Lutz, P. & Kedinger, C. (1996). Properties of the adenovirus IVa2 gene product, an effector of late-phase-dependent activation of the major late promoter. J Virol 70, 1396–1405.

Mei, Y. F. & Wadell, G. (1992). The nucleotide sequence of adenovirus type 11 early 3 region: comparison of genome type Ad11p and Ad11a. Virology 191, 125–133.

Mei, Y. F. & Wadell, G. (1993). Hemagglutination properties and nucleotide sequence analysis of the fiber gene of adenovirus genome types 11p and 11a. Virology 194, 453–462.

Mei, Y. F. & Wadell, G. (1996). Epitopes and hemagglutination binding domain on subgenus B:2 adenovirus fibers. J Virol 70, 3688–3697.

Mei, Y. F., Lindman, K. & Wadell, G. (2002). Human adenoviruses of subgenera B, C, and E with various tropisms differ in both binding to and replication in the epithelial A549 and 293 cells. Virology 295, 30–43.

Morin, N., Delsert, C. & Klessig, D. F. (1989). Mutations that affect phosphorylation of the adenovirus DNA-binding protein alter its ability to enhance its own synthesis. J Virol 63, 5228–5237.

Oosterom-Dragon, E. A. & Ginsberg, H. S. (1981). Characterization of two temperature-sensitive mutants of type 5 adenovirus with mutations in the 100,000-dalton protein gene. J Virol 40, 491–500.

Ostapchuk, P. & Hearing, P. (2001). Pseudopackaging of adenovirus type 5 genomes into capsids containing the hexon proteins of adenovirus serotypes B, D, or E. J Virol 75, 45–51.

Roberts, R. J., O'Neill, K. E. & Yen, C. T. (1984). DNA sequences from the adenovirus 2 genome. J Biol Chem 259, 13968–13975.

Roelvink, P. W., Mi Lee, G., Einfeld, D. A., Kovesdi, I. & Wickham, T. J. (1999). Identification of a conserved receptor-binding site on the fiber proteins of CAR-recognizing adenoviridae. Science 286, 1568–1571.

Rosa-Calatrava, M., Grave, L., Puvion-Dutilleul, F., Chatton, B. & Kedinger, C. (2001). Functional analysis of adenovirus protein IX identifies domains involved in capsid stability, transcriptional activity, and nuclear reorganization. J Virol 75, 7131–7141.

Rux, J. J. & Burnett, R. M. (2000). Type-specific epitope locations revealed by X-ray crystallographic study of adenovirus type 5 hexon. Mol Ther 1, 18–30.

Schmid, S. I. & Hearing, P. (1998). Cellular components interact with adenovirus type 5 minimal DNA packaging domains. J Virol 72, 6339–6347.

Segerman, A., Mei, Y. F. & Wadell, G. (2000). Adenovirus types 11p and 35p show high binding efficiencies for committed hematopoietic cell lines and are infective to these cell lines. J Virol 74, 1457–1467.

Shinagawa, M., Matsuda, A., Ishiyama, T., Goto, H. & Sato, G. (1983). A rapid and simple method for preparation of adenovirus DNA from infected cells. Microbiol Immunol 27, 817–822.

Skog, J., Mei, Y. F. & Wadell, G. (2002). Human adenovirus serotypes 4p and 11p are efficiently expressed in cell lines of neural tumour origin. J Gen Virol 83, 1299–1309.

Tollefson, A. E., Ryerse, J. S., Scaria, A., Hermiston, T. W. & Wold, W. S. (1996). The E3-11·6-kDa adenovirus death protein (ADP) is required for efficient cell death: characterization of cells infected with adp mutants. Virology 220, 152–162.

Tomko, R. P., Xu, R. & Philipson, L. (1997). HCAR and MCAR: the human and mouse cellular receptors for subgroup C adenoviruses and group B coxsackieviruses. Proc Natl Acad Sci U S A 94, 3352–3356.

Tribouley, C., Lutz, P., Staub, A. & Kedinger, C. (1994). The product of the adenovirus intermediate gene IVa2 is a transcriptional activator of the major late promoter. J Virol 68, 4450–4457.

Vance, V. & Vaucheret, H. (2001). RNA silencing in plants – defense and counterdefense. Science 292, 2277–2280.

Wadell, G. (1984). Molecular epidemiology of human adenoviruses. Curr Top Microbiol Immunol 110, 191–220.

Wadell, G., Hammarskjold, M. L., Winberg, G., Varsanyi, T. M. & Sundell, G. (1980). Genetic variability of adenoviruses. Ann N Y Acad Sci 354, 16–42.

Walters, R. W., Freimuth, P., Moninger, T. O., Ganske, I., Zabner, J. & Welsh, M. J. (2002). Adenovirus fiber disrupts CAR-mediated intercellular adhesion allowing virus escape. Cell 110, 789–799.

Whyte, P., Buchkovich, K. J., Horowitz, J. M., Friend, S. H., Raybuck, M., Weinberg, R. A. & Harlow, E. (1988). Association between an oncogene and an anti-oncogene: the adenovirus E1A proteins bind to the retinoblastoma gene product. Nature 334, 124–129.

Wickham, T. J., Mathias, P., Cheresh, D. A. & Nemerow, G. R. (1993). Integrins alpha v beta 3 and alpha v beta 5 promote adenovirus internalization but not virus attachment. Cell 73, 309–319.

Zhang, L. Q., Mei, Y. F. & Wadell, G. (2003). Human adenovirus serotypes 4 and 11 show higher binding affinity and infectivity for endothelial and carcinoma cell lines than serotype 5. J Gen Virol 84, 687–695.

Received 20 February 2003; accepted 18 April 2003.