Overlapping reading frames in closely related human papillomaviruses result in modular rates of selection within E2

Apurva Narechania1, Masanori Terai1 and Robert D. Burk1,2,3,4

1 Department of Microbiology and Immunology, The Albert Einstein Cancer Center, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, NY 10461, USA
2 Department of Pediatrics, The Albert Einstein Cancer Center, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, NY 10461, USA
3 Department of Obstetrics, Gynecology and Women's Health, The Albert Einstein Cancer Center, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, NY 10461, USA
4 Department of Epidemiology and Population Health, The Albert Einstein Cancer Center, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, NY 10461, USA

Correspondence
Robert D. Burk
burk@aecom.yu.edu


   ABSTRACT
A core group of four open reading frames (ORFs) is present in all known papillomaviruses (PVs): the E1 and E2 replication/transcription proteins and the L1 and L2 structural proteins. Because they are involved in processes that are essential to PV propagation, the sequences of these proteins are well-conserved. However, sequencing of novel subtypes for human papillomaviruses (HPV) 54 (AE9) and 82 (AE2/IS39), coupled to analysis of four other closely related genital HPV pairs, indicated that E2 has a higher dN/dS ratio than E1, L1 or L2. The elevated ratio is not homogeneous across the length of the ORF, but instead varies with respect to E2's three domains. The E2 hinge region is of particular interest, because its hypervariability (dN/dS>1) differs markedly from the two domains that it joins: the transcription-activation domain and the DNA-binding domain. Deciphering whether the hinge region's high rate of non-synonymous change is the result of positive Darwinian selection or relaxed constraint depends on the evolutionary behaviour of E4, an ORF that overlaps E2. The E2 hinge region is contained within E4 and non-synonymous changes in the hinge are associated with a disproportionate amount of synonymous change in E4, a case of simultaneous positive and purifying selection in overlapping reading frames. Modular rates of selection among E2 domains are a likely consequence of the presence of an embedded E4. E4 appears to be positioned in a part of the HPV genome that can tolerate non-synonymous change and purifying selection of E4 may be indicative of its functional importance.

The GenBank/EMBL/DDBJ accession numbers for the sequences reported in this work are AF436129 (HPV54A) and AF293961 (HPV82A).


   INTRODUCTION
Human papillomaviruses (HPVs) include a group of closed-circular, double-stranded DNA viruses established as the aetiological agents of cervical neoplasia and invasive cervical cancer (Bosch & de Sanjosé, 2003; Muñoz et al., 2003). Over 100 putative HPVs have been described, more than 90 of which have been cloned and officially designated (de Villiers et al., 2004). Despite an array of types, the development of cervical carcinoma is restricted to a subset of high-risk viruses, of which HPV16 and HPV18 together account for nearly two-thirds of observed cases (Muñoz et al., 2003). Classification of HPV types into categories, such as ‘cancer-causing’ and ‘non-cancer-causing’ or cutaneous and mucosal, is based on both epidemiological and phylogenetic studies (Chan et al., 1995; Pfister & Fuchs, 1994; Van Ranst et al., 1992, 1993). Most oncogenic types share a common ancestor and most common ancestors radiate within an ecological niche, such as the skin or mucosa.

The typical HPV genome contains six open reading frames (ORFs) encoding the structural proteins (L1 and L2), proteins that mediate the viral life cycle (E1 and E2) and proteins that regulate host-cell DNA replication and transformation (E6 and E7) (Knipe et al., 2001). E6 and E7 bind the cell-cycle regulators p53 and pRb, respectively. In high-risk types, these oncoproteins promote degradation of p53 and pRb, resulting in host-cell immortalization and proliferation (reviewed by Fehrmann & Laimins, 2003). Some papillomaviruses (PVs) also contain the E4 and E5 genes, whose roles, although largely unknown, seem to involve functions in the later phases of the viral life cycle (Longworth & Laimins, 2004; Peh et al., 2004). The E5 ORF is situated between E2 and L2 in typical PV genomes. In most cases, E5 does not overlap with any of its neighbours; however, the E4 ORF is contained completely within E2.

A great deal of effort has been spent studying the evolutionary implications of overlapping reading frames in viruses (Hein & Stovlbaek, 1995; Krakauer, 2000; Miyata & Yasunaga, 1978). A fundamental question is how natural selection operates on two protein sequences encoded by the same stretch of DNA. We address this question in the context of HPV E2 and E4, highlighting rates of synonymous and non-synonymous substitution in overlapping reading frames. To this end, we utilized two recently cloned and sequenced viral isolates, HPV AE2/IS39 and HPV AE9. Although initially thought to be novel types (>10 % variance in L1 nucleotide sequences) based on sequence analysis of the MY09/MY11 region of the L1 ORF, upon analysis of their complete genomes, HPV AE2/IS39 and HPV AE9 were found to be subtypes (2–10 % variance in L1 sequences) of HPV82 and HPV54, respectively. In this report, we examine rates of evolution of each ORF by using dN/dS ratios across these subtype pairs and other closely related mucosal types. Closely related genomes allow examination of evolutionary change prior to saturation of genetic changes. We focus specifically on the modular rates of selection exhibited by E2 and on a possible explanation for this modularity: the overlapping ORF, E4.


   METHODS
A cervicovaginal sample containing HPV AE2/IS39 (HPV82 subtype) originated from a white, 22-year-old female with persistent stage 1 cervical intraepithelial neoplasia, and a sample containing HPV AE9 (HPV54 subtype) was obtained from a white, 19-year-old female with normal cervical cytology. HPV AE2/IS39- and HPV AE9-containing samples were originally detected by MY09/MY11 PCR and dot-blot analyses, as described previously (Tachezy et al., 1994; Terai & Burk, 2002).

To clone the potentially novel genomes, PCR primers were designed by alignment of closely related HPV genomes using the sequences of the partial L1 region amplified by MY09/MY11 primers. Additional primers were used to amplify the entire genome in fragments by using overlapping PCR amplification, as described previously (Terai & Burk, 2002). PCR products were separated by agarose-gel electrophoresis, stained with ethidium bromide and visualized under UV illumination. After confirmation of appropriate product sizes, each PCR product was purified (Qiagen gel-extraction kit) and ligated into the pGEM-T Easy vector (Promega) according to the manufacturer's instructions. Initially, SP6 and T7 primers flanking each HPV DNA insert were used to determine the nucleotide sequence. Additional primers were designed by sequence walking. Sequencing was performed in the Albert Einstein Cancer Center DNA-sequencing facility. Overlapping fragments were assembled manually and several additional primers were used to clarify sequence ambiguities. The primer sequences used are available from the authors on request.

Pairwise sequence alignments were performed by using CLUSTAL W (Thompson et al., 1994) with a gap cost of 10·0 and the IUB weight matrix. Multiple sequence alignments, providing prototype and subtype phylogenetic context, were generated in the same way. Overall, ORF-wide numbers and rates of non-synonymous and synonymous change were calculated by using SNAP (Synonymous/Non-synonymous Analysis Program) (Korber, 2002; Nei & Gojobori, 1986), and K-Estimator was used to perform sliding-window analysis of dN/dS (Comeron, 1999).
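As an illustration of the kind of calculation that underlies the ORF-wide dN and dS values, the following is a minimal Python sketch of the Nei & Gojobori (1986) counting method with a Jukes–Cantor correction. It is not the SNAP source code. The function names (syn_sites, count_diffs, dn_ds) are hypothetical, and two simplifications are assumptions of this sketch: substitutions to stop codons are counted as non-synonymous when sites are tallied, and codons containing gaps, ambiguity codes or stops are skipped.

```python
from itertools import permutations, product
from math import log

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def syn_sites(codon):
    """Number of synonymous sites in one codon (between 0 and 3)."""
    s = 0.0
    for pos in range(3):
        for b in BASES:
            if b != codon[pos]:
                mutant = codon[:pos] + b + codon[pos + 1:]
                # Assumption: a change to a stop codon counts as non-synonymous.
                if CODE[mutant] == CODE[codon]:
                    s += 1 / 3
    return s


def count_diffs(c1, c2):
    """Synonymous and non-synonymous differences between two sense codons,
    averaged over all mutational pathways that avoid stop codons."""
    diff = [i for i in range(3) if c1[i] != c2[i]]
    sd = nd = 0.0
    paths = 0
    for order in permutations(diff):
        cur, s, n, ok = c1, 0, 0, True
        for pos in order:
            nxt = cur[:pos] + c2[pos] + cur[pos + 1:]
            if CODE[nxt] == "*":
                ok = False  # discard pathways through a stop codon
                break
            if CODE[nxt] == CODE[cur]:
                s += 1
            else:
                n += 1
            cur = nxt
        if ok:
            sd += s
            nd += n
            paths += 1
    if paths == 0:
        # Every pathway hit a stop: treat all differences as non-synonymous.
        return 0.0, float(len(diff))
    return sd / paths, nd / paths


def jukes_cantor(p):
    """Correct a proportion of differences for multiple hits (requires p < 0.75)."""
    return -0.75 * log(1 - 4 * p / 3)


def dn_ds(seq1, seq2):
    """Return (dN, dS) for two aligned, in-frame coding sequences."""
    S = N = Sd = Nd = 0.0
    for i in range(0, min(len(seq1), len(seq2)) - 2, 3):
        c1, c2 = seq1[i:i + 3].upper(), seq2[i:i + 3].upper()
        if c1 not in CODE or c2 not in CODE or "*" in (CODE[c1], CODE[c2]):
            continue  # skip gaps, ambiguity codes and stop codons
        s = (syn_sites(c1) + syn_sites(c2)) / 2
        S += s
        N += 3 - s
        sd, nd = count_diffs(c1, c2)
        Sd += sd
        Nd += nd
    return jukes_cantor(Nd / N), jukes_cantor(Sd / S)
```

The sketch fails with a math domain error when the proportion of differences reaches 0·75, which is one way in which saturation makes dN/dS analysis of highly divergent pairs unworkable.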

Phylogenetic trees were constructed in PAUP (version 4.10) (Swofford, 1998) from multiple sequence alignments of the E6, E7, E1, E2, L2 and L1 concatenated ORFs, using both distance (neighbour-joining) and parsimony methods. Alignment gaps were coded as missing before distance and parsimony trees were reconstructed by using equal-weighted characters and 100 bootstrap replicates. To ensure adequate searches in the tree space, 100 random-addition heuristic searches and TBR (tree bisection and reconnection) swapping were employed.

HPVs used in this study and their GenBank accession numbers are as follows: HPV7 (NC_001595), HPV11 (NC_001525), HPV27 (NC_001584), HPV40 (NC_001589), HPV44 (NC_001689), HPV54 (NC_001676), HPV55 (NC_001692), HPV82 (NC_002172), HPV13 (NC_001349), HPV26 (NC_001583), HPV30 (NC_001585), HPV32 (NC_001586), HPV42 (NC_001534), HPV51 (NC_001533), HPV53 (NC_001593), HPV56 (NC_001594), HPV66 (NC_001695), HPV69 (NC_002171), HPV74 (NC_004501), HPV91 (NC_004085), HPV54A (AF436129) and HPV82A (AF293961).


   RESULTS AND DISCUSSION
The genomes of HPV AE2/IS39 (abbreviated as HPV82A) and HPV AE9 (abbreviated as HPV54A) were each amplified and cloned as three overlapping PCR fragments. For HPV82A, these fragments measured 3·8, 3·9 and 0·5 kb, and for HPV54A, 4·3, 3·6 and 0·5 kb. The complete genome sequences are available from GenBank under the accession numbers AF436129 (HPV54A) and AF293961 (HPV82A).

The genomes of HPV54A and HPV82A share a high level of similarity with their prototypes. A comparison of whole genomes reveals 93·5 % nucleotide similarity between HPV54 and HPV54A, and 89·8 % similarity between HPV82 and HPV82A. The currently accepted criterion for HPV classification relies on comparison of L1 ORF nucleotide sequences. The L1 ORF in the HPV54 pair diverges by 4·6 %, whilst that in the HPV82 pair diverges by 7·7 %, placing both isolates within the subtype range.

As expected, subtype and prototype genome lengths are also comparable. HPV82 and HPV82A are 7871 and 7904 bp, respectively, whilst HPV54 and HPV54A are 7759 and 7717 bp in length. Each pair contains the anticipated PV ORFs, including those encoding the E6 and E7 proteins, the replication proteins (E1 and E2) and components of the viral capsid (L1 and L2). The only major difference in genomic architecture across the two pairs is HPV54's lack of an E5 homologue. HPV82 contains an unambiguous E5 ORF, but lacks a proximal start codon, whereas the corresponding genomic region in HPV54 is as variable as some sections of its upstream regulatory region. Lack of evidence for E5 in HPV54 reinforces its status as a unique outlier, rooting members of the α-PV species clades 1, 8 and 10 (Fig. 1), all of which contain an intact E5 ORF. The identification of an HPV54 subtype implies that this ancient part of the tree is still evolving. In their greater phylogenetic context, HPV82A and HPV54A sort into species groups 5 and 13, respectively. This evolutionary relationship held in phylogenies constructed from either the concatenated ORFs (Fig. 1) or L1 nucleotide sequences alone (data not shown).



Fig. 1. Phylogenetic trees constructed by using maximum-parsimony and distance methods resulted in identical topologies. Representative cladograms of (a) HPV82 and its taxonomic context and (b) HPV54 and its taxonomic context are shown. Both trees were based on the alignment of the nucleotide sequences of compiled ORFs (E6, E7, E1, E2, L2 and L1) of the indicated PV genomes. Bootstrap values for the parsimony calculation are provided at all nodes. Numbers to the right indicate the taxonomic species group of the α-PVs (de Villiers et al., 2004).

 
The distribution of polymorphisms across each ORF within HPV54 and HPV82 is shown in Table 1, accompanied by four closely related genital HPV pairs. The subtypes HPV44 and HPV55 and the sister taxa HPV6 and HPV11, HPV7 and HPV40, and HPV2 and HPV27 were chosen because they all exhibit whole-genome divergence of <15 %. These closely related pairs allow unambiguous alignment and are less likely to be subject to saturation of genetic variation. To examine the possibility of directional selection, we calculated the ratio of non-synonymous substitutions per non-synonymous site (dN) to synonymous substitutions per synonymous site (dS). A dN/dS ratio equal to 1 indicates neutrality, a ratio <1 indicates purifying selection and a ratio >1 indicates positive Darwinian selection (Yang, 1998). Overall, the small ratios of non-synonymous to synonymous changes and dN/dS values indicate that E1, L1 and L2 are under strong purifying selection, forming a functional core essential to the viral life cycle. Previously, we had also included E2 in this group because of its presence in all known PVs and its role in viral-genome replication (Narechania et al., 2004). However, in the current analysis, a remarkably consistent pattern of elevated non-synonymous E2 variation was observed, a pattern more in line with the values seen for the E6/E7 ORFs than the stable core E1, L1 and L2 ORFs. The median E2 dN/dS value (0·31) was significantly different from that of the E1/L1/L2 group (0·09) (Mann–Whitney P<0·0004) (Siegel, 1956).
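The group comparison described above rests on the Mann–Whitney test, which asks whether values in one group tend to rank above those in another. The following Python sketch shows the idea with an exact permutation P value for small samples. It is an illustration only, not the procedure taken from Siegel (1956). The function names are hypothetical and the numbers in the usage note are toy values, not the entries of Table 1.

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """U statistic for x: count of (xi, yj) pairs with xi > yj; ties count 0.5."""
    return sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)


def exact_two_sided_p(x, y):
    """Exact two-sided P value obtained by enumerating every relabelling.

    Counts the fraction of ways of splitting the pooled values into groups
    of the original sizes for which U deviates from its expectation at
    least as much as the observed U. Only practical for small samples.
    """
    pooled = list(x) + list(y)
    n = len(x)
    expected = n * len(y) / 2
    observed = abs(mann_whitney_u(x, y) - expected)
    hits = total = 0
    for idx in combinations(range(len(pooled)), n):
        chosen = set(idx)
        gx = [pooled[i] for i in idx]
        gy = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point noise in the comparison.
        if abs(mann_whitney_u(gx, gy) - expected) >= observed - 1e-12:
            hits += 1
    return hits / total
```

With toy inputs such as exact_two_sided_p([0.3, 0.4, 0.5], [0.05, 0.1, 0.15, 0.2]), every value in the first group exceeds every value in the second, so only the two most extreme of the 35 possible relabellings are counted.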


Table 1. Complete ORF table

All values were calculated with SNAP as described by Korber (2002).

 
Much of E2's variation is attributable to the hinge region connecting the transcription activation domain (TAD) to the DNA-binding domain (DBD) (Table 2; Fig. 2) (Eriksson et al., 1999; Gauthier et al., 1991; Ham et al., 1991). In all cases except for the HPV54 pair, the hinge region did not manifest the purifying selection that constrains the rest of the E2 ORF (dN/dS median, 1·11). HPV16 variant genomes also showed elevated dN/dS ratios in the E2 hinge region (Z. Chen & R. D. Burk, unpublished data). The TAD and DBD regions display intermediate median dN/dS values (0·18), lower than that for the hinge, but significantly higher than that observed for E1, L1 or L2 (Mann–Whitney P<0·002). This pattern holds in all cases except for the HPV54 pair. Aside from a low hinge dN/dS and a high TAD dN/dS, the HPV54 pair also exhibits relatively high E1 and L1 dN/dS ratios, again highlighting its status as a unique outlier of species clades 1, 8 and 10. Although relatively elevated E2 dN/dS values are a trend that is consistent across all analysed pairs, there are many functional protein motifs that are conserved across all types. For example, the α-1 recognition helix in the E2 DBD is well-conserved across all types used in the current study, as are the cognate DBD sites (ACCgNNNNcGGT) in the long control region (reviewed by Hegde, 2002).
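Conservation of the cognate E2-binding sites can be checked by scanning a long control region sequence for the quoted consensus. The short Python sketch below does this with a regular expression. It is an illustration rather than the procedure used in this study. The name find_e2_sites is hypothetical, and the sketch assumes that the lower-case g and c of the consensus are required bases, which is stricter than treating them as merely preferred.

```python
import re

# Assumption: the lower-case g/c in ACCgNNNNcGGT are treated as required bases.
# The lookahead lets overlapping matches be reported as well.
E2_SITE = re.compile(r"(?=(ACCG[ACGT]{4}CGGT))")


def find_e2_sites(seq):
    """Return (1-based start, matched 12-mer) for each consensus match in seq."""
    return [(m.start() + 1, m.group(1)) for m in E2_SITE.finditer(seq.upper())]
```

Because the consensus is its own reverse complement, scanning one strand is sufficient under this strict pattern.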


Table 2. E2 domain data

All values were calculated with SNAP as described by Korber (2002).

 


Fig. 2. Distributions of nucleotide and amino acid changes within the E2 ORF. (a) HPV54 and HPV54A; (b) HPV82 and HPV82A. Positions of the three functional domains of E2 (TAD, the hinge and DBD) are delimited along the horizontal axis in white, black and grey, respectively. These positions were derived by alignment to HPV18 E2 (Hegde, 2002). Vertical lines above the horizontal axis indicate non-synonymous changes (i.e. amino acid changes), whereas those below the axis correspond to synonymous changes. The height of the vertical lines both above and below the horizontal axis represents the number of non-synonymous and synonymous changes per codon, respectively.

 
The most compelling observations in the current analysis were the high overall dN/dS ratio of E2 and the concentration of non-synonymous changes in the hinge region of this ORF. E2 is unique in the PV genome in that it is the only ORF that completely contains another (in this case, E4). Moreover, E4 frames the hinge region completely in all pairs analysed (for two examples, see Fig. 3). In each sliding-window analysis, except for HPV54 where little time has apparently passed since subtype divergence, non-synonymous changes dominate the E2 reading frame, whereas E4 favours synonymous changes. This observation is at odds with the prevailing notion concerning overlapping genes – that severe evolutionary constraint should operate on nucleotide sequences encoding two layers of protein in the dense genetic environment of viruses (Mizokami et al., 1997; Pavesi, 2000; Pavesi et al., 1997; Walewski et al., 2001). For overlapping genes, codon changes at the third position in one reading frame are unusual, because they result in changes in the second position of the +1 reading frame. However, what we see here is a pattern of E4 purifying selection superimposed on the E2 hinge, the one domain that can physically accommodate high levels of non-synonymous change. The relaxed constraint observed for the hinge is therefore attributable to both its role as a flexible connector between the functionally conserved TAD and DBD domains and the existence of the overlapping E4 reading frame. In PVs with no evidence of an E4 ORF, we might expect this phenomenon to disappear. However, to our knowledge, all PVs contain an E4-like ORF except for the two avian PVs that have been characterized to date: Psittacus erithacus papillomavirus (PePV) and Fringilla coelebs papillomavirus (FPV) (Terai et al., 2002). An alignment of PePV and FPV E2 ORFs yielded only 52 % identity. At this level of divergence, saturation becomes a problem and a satisfactory dN/dS analysis could not be performed.
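The positional relationship between overlapping frames can be made concrete with a toy example. The Python sketch below classifies a single substitution as synonymous or non-synonymous in a reading frame beginning at a chosen offset. The function name effect and the six-nucleotide sequence in the usage note are hypothetical and are not taken from any HPV genome.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def effect(seq, pos, new_base, frame):
    """Classify a substitution at 0-based index pos in the frame starting at index frame.

    Returns (codon position 1-3, 'synonymous' or 'non-synonymous'), or None
    when pos does not fall inside a complete codon of that frame.
    Expects an upper-case sequence of unambiguous bases.
    """
    offset = pos - frame
    if offset < 0:
        return None
    start = frame + 3 * (offset // 3)
    if start + 3 > len(seq):
        return None
    old = seq[start:start + 3]
    k = pos - start
    new = old[:k] + new_base + old[k + 1:]
    kind = "synonymous" if CODE[old] == CODE[new] else "non-synonymous"
    return k + 1, kind
```

For the toy sequence GCTGAA, a T to C change at index 2 is a silent third-position change in frame 0 (GCT to GCC, both alanine) but a second-position replacement in the +1 frame (CTG to CCG, leucine to proline). Conversely, a G to C change at index 3 alters the first codon position in frame 0 (GAA to CAA, glutamate to glutamine) while being a silent third-position change in the +1 frame (CTG to CTC, both leucine). The second case mirrors the pattern reported here, in which non-synonymous change in one frame coincides with synonymous change in the overlapping frame.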



Fig. 3. Sliding-window traces of rates of synonymous (dS, dashed line) and non-synonymous (dN, solid line) change in the overlapping E2 hinge and E4 reading frames. The analysis employed a 100 nt window sliding 10 positions at a time, registering the dN and dS values at the centre position (i.e. in a 100 nt window, the centre position would be 50·5 bp). Lines below the horizontal axis delimit the position of the E2 hinge within a particular trace. Plots shown include pairwise alignments between HPV44 and HPV55 in the E2 reading frame (top left) and the E4 reading frame (bottom left), and alignments between HPV2 and HPV27 in the E2 (top right) and E4 (bottom right) reading frames.
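The window geometry described in the legend can be sketched in a few lines of Python. This is not K-Estimator. The function names are hypothetical, and the p_distance statistic (the raw proportion of differing sites) is only a stand-in for the dN and dS estimates plotted in the figure.

```python
def sliding_windows(length, window=100, step=10):
    """Yield (start, end, centre) in 1-based inclusive coordinates."""
    for start in range(1, length - window + 2, step):
        end = start + window - 1
        yield start, end, (start + end) / 2


def p_distance(a, b):
    """Proportion of aligned positions that differ (placeholder statistic)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def window_profile(seq1, seq2, stat=p_distance, window=100, step=10):
    """Return [(centre, stat value)] for each window along a pairwise alignment."""
    length = min(len(seq1), len(seq2))
    return [
        (centre, stat(seq1[start - 1:end], seq2[start - 1:end]))
        for start, end, centre in sliding_windows(length, window, step)
    ]
```

The first window covers positions 1 to 100 and is registered at position 50·5, in agreement with the legend. Any pairwise statistic that takes two aligned strings can be passed as stat in place of p_distance.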

 
Despite the claim that constraint should be the rule in overlapping reading frames, more recent reports indicate that some viral genomes demonstrate simultaneous positive and purifying selection. Simian immunodeficiency virus (Hughes et al., 2001), potato leafroll virus (Guyader & Ducray, 2002) and Sendai virus (Fujii et al., 2001) all show increased non-synonymous change in one ORF with concurrent dominance of synonymous change in an overlapping reading frame. We suggest that the modularity displayed in the E2 dN/dS profile is a probable consequence of the presence of the E4 overlapping reading frame, and that the E2 hinge is the ideal genomic position in which to fix an overlapping gene. As E4 is subject to purifying selection relative to the E2 hinge in all tested cases and there is evidence of an E4 ORF in all mammalian PVs, this ORF may have greater functional importance than thought previously.


   ACKNOWLEDGEMENTS
 
This work was supported in part by a grant from the National Cancer Institute, National Institutes of Health to R. D. B. (CA78527).


   REFERENCES
Bosch, F. X. & de Sanjosé, S. (2003). Chapter 1: human papillomavirus and cervical cancer – burden and assessment of causality. J Natl Cancer Inst Monogr 31, 3–13.

Chan, S.-Y., Delius, H., Halpern, A. L. & Bernard, H.-U. (1995). Analysis of genomic sequences of 95 papillomavirus types: uniting typing, phylogeny, and taxonomy. J Virol 69, 3074–3083.

Comeron, J. M. (1999). K-Estimator: calculation of the number of nucleotide substitutions per site and the confidence intervals. Bioinformatics 15, 763–764.

de Villiers, E. M., Fauquet, C., Broker, T. R., Bernard, H.-U. & zur Hausen, H. (2004). Classification of papillomaviruses. Virology 324, 17–27.

Eriksson, A., Herron, J. R., Yamada, T. & Wheeler, C. M. (1999). Human papillomavirus type 16 variant lineages characterized by nucleotide sequence analysis of the E5 coding segment and the E2 hinge region. J Gen Virol 80, 595–600.

Fehrmann, F. & Laimins, L. A. (2003). Human papillomaviruses: targeting differentiating epithelial cells for malignant transformation. Oncogene 22, 5201–5207.

Fujii, Y., Kiyotani, K., Yoshida, T. & Sakaguchi, T. (2001). Conserved and non-conserved regions in the Sendai virus genome: evolution of a gene possessing overlapping reading frames. Virus Genes 22, 47–52.

Gauthier, J. M., Dillner, J. & Yaniv, M. (1991). Structural analysis of the human papillomavirus type 16-E2 transactivator with antipeptide antibodies reveals a high mobility region linking the transactivation and the DNA-binding domains. Nucleic Acids Res 19, 7073–7079.

Guyader, S. & Ducray, D. G. (2002). Sequence analysis of Potato leafroll virus isolates reveals genetic stability, major evolutionary events and differential selection pressure between overlapping reading frame products. J Gen Virol 83, 1799–1807.

Ham, J., Dostatni, N., Gauthier, J.-M. & Yaniv, M. (1991). The papillomavirus E2 protein: a factor with many talents. Trends Biochem Sci 16, 440–444.

Hegde, R. S. (2002). The papillomavirus E2 proteins: structure, function, and biology. Annu Rev Biophys Biomol Struct 31, 343–360.

Hein, J. & Stovlbaek, J. (1995). A maximum-likelihood approach to analyzing nonoverlapping and overlapping reading frames. J Mol Evol 40, 181–189.

Hughes, A. L., Westover, K., da Silva, J., O'Connor, D. H. & Watkins, D. I. (2001). Simultaneous positive and purifying selection on overlapping reading frames of the tat and vpr genes of simian immunodeficiency virus. J Virol 75, 7966–7972.

Knipe, D. M., Howley, P. M., Griffin, D. E., Lamb, R. A., Martin, M. A., Roizman, B. & Straus, S. E. (editors) (2001). Fields Virology, 4th edn. Philadelphia, PA: Lippincott Williams & Wilkins.

Korber, B. (2002). HIV signature and sequence variation analysis. In Computational Analysis of HIV Molecular Sequences, pp. 55–72. Edited by A. G. Rodrigo & G. H. Learn. Dordrecht: Kluwer Academic Publishers.

Krakauer, D. C. (2000). Stability and evolution of overlapping genes. Evolution Int J Org Evolution 54, 731–739.

Longworth, M. S. & Laimins, L. A. (2004). Pathogenesis of human papillomaviruses in differentiating epithelia. Microbiol Mol Biol Rev 68, 362–372.

Miyata, T. & Yasunaga, T. (1978). Evolution of overlapping genes. Nature 272, 532–535.

Mizokami, M., Orito, E., Ohba, K., Ikeo, K., Lau, J. Y. N. & Gojobori, T. (1997). Constrained evolution with respect to gene overlap of hepatitis B virus. J Mol Evol 44 (Suppl 1), S83–S90.

Muñoz, N., Bosch, F. X., de Sanjosé, S., Herrero, R., Castellsagué, X., Shah, K. V., Snijders, P. J. F. & Meijer, C. J. L. M. (2003). Epidemiologic classification of human papillomavirus types associated with cervical cancer. N Engl J Med 348, 518–527.

Narechania, A., Terai, M., Chen, Z., DeSalle, R. & Burk, R. D. (2004). Lack of the canonical pRB-binding domain in the E7 ORF of artiodactyl papillomaviruses is associated with the development of fibropapillomas. J Gen Virol 85, 1243–1250.

Nei, M. & Gojobori, T. (1986). Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol 3, 418–426.

Pavesi, A. (2000). Detection of signature sequences in overlapping genes and prediction of a novel overlapping gene in hepatitis G virus. J Mol Evol 50, 284–295.

Pavesi, A., De Iaco, B., Granero, M. I. & Porati, A. (1997). On the informational content of overlapping genes in prokaryotic and eukaryotic viruses. J Mol Evol 44, 625–631.

Peh, W. L., Brandsma, J. L., Christensen, N. D., Cladel, N. M., Wu, X. & Doorbar, J. (2004). The viral E4 protein is required for the completion of the cottontail rabbit papillomavirus productive cycle in vivo. J Virol 78, 2142–2151.

Pfister, H. & Fuchs, P. G. (1994). Anatomy, taxonomy and evolution of papillomaviruses. Intervirology 37, 143–149.

Siegel, S. (1956). Nonparametric Statistics for the Behavioral Sciences. New York: McGraw-Hill.

Swofford, D. L. (1998). PAUP*. Phylogenetic Analysis Using Parsimony (*and other methods), version 4. Sunderland, MA: Sinauer Associates.

Tachezy, R., Van Ranst, M. A., Cruz, Y. & Burk, R. D. (1994). Analysis of short novel human papillomavirus sequences. Biochem Biophys Res Commun 204, 820–827.

Terai, M. & Burk, R. D. (2002). Identification and characterization of 3 novel genital human papillomaviruses by overlapping polymerase chain reaction: candHPV89, candHPV90, and candHPV91. J Infect Dis 185, 1794–1797.

Terai, M., DeSalle, R. & Burk, R. D. (2002). Lack of canonical E6 and E7 open reading frames in bird papillomaviruses: Fringilla coelebs papillomavirus and Psittacus erithacus timneh papillomavirus. J Virol 76, 10020–10023.

Thompson, J. D., Higgins, D. G. & Gibson, T. J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22, 4673–4680.

Van Ranst, M., Kaplan, J. B. & Burk, R. D. (1992). Phylogenetic classification of human papillomaviruses: correlation with clinical manifestations. J Gen Virol 73, 2653–2660.

Van Ranst, M., Tachezy, R., Delius, H. & Burk, R. D. (1993). Taxonomy of the human papillomaviruses. Papillomavirus Rep 4, 61–65.

Walewski, J. L., Keller, T. R., Stump, D. D. & Branch, A. D. (2001). Evidence for a new hepatitis C virus antigen encoded in an overlapping reading frame. RNA 7, 710–721.

Yang, Z. (1998). Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol Biol Evol 15, 568–573.

Received 5 November 2004; accepted 8 February 2005.