Application of maximum-likelihood models to selection pressure analysis of group I nucleopolyhedrovirus genes

Robert L. Harrison and Bryony C. Bonning

Department of Entomology and Interdepartmental Program in Genetics, Iowa State University, Ames, IA 50011, USA

Correspondence
Bryony Bonning
bbonning@iastate.edu


   ABSTRACT
Knowledge of virus genes under positive selection pressure can help identify molecular determinants of species-specific virulence or host range without prior knowledge of the mechanisms governing host range and virulence. Towards this end, codon-based models of substitution were used in a maximum-likelihood approach to analyse selection pressures acting on 83 genes of group I nucleopolyhedroviruses (NPVs). Evidence for positive selection was found for nine genes: ac38, ac66, arif-1, lef-7, lef-10, lef-12, odv-e18, odv-e56 and vp80. The baculovirus DNA helicase gene (dnahel) was not found to be positively selected using models that allowed the intensity of selection pressure to vary among codon sites. Further analysis with a method that allows selection pressure intensity to vary among lineages suggests that positive selection may have occurred in dnahel during the divergence of Bombyx mori NPV and the NPVs of Autographa californica and Rachiplusia ou. NPV genes that have undergone positive selection may modulate the ability of different NPVs to replicate efficiently in cells (lef-7, lef-10, lef-12) or to establish primary infection of the midgut (odv-e18, odv-e56) of different host species.


   INTRODUCTION
The family Baculoviridae is a large group of arthropod-specific viruses that infect species mainly within the order Lepidoptera (butterflies and moths; Adams & McClintock, 1991). Baculoviruses possess a double-stranded circular DNA genome of approximately 80–180 kb, which is contained within an enveloped, rod-shaped virion. Two different virion phenotypes are produced during infection: the occlusion-derived virus (ODV), which establishes primary infection of the midgut of the host, and budded virus (BV), which mediates the cell-to-cell spread of the virus to other tissues within the host. Virions of the ODV phenotype are embedded within distinctive crystalline occlusions. The family Baculoviridae is currently divided on the basis of occlusion morphology into two genera, Nucleopolyhedrovirus (NPV) and Granulovirus. On the basis of phylogenetic studies, members of Nucleopolyhedrovirus are further subdivided into group I and group II nucleopolyhedroviruses (Blissard et al., 2000). The group I viruses include Autographa californica multiple nucleopolyhedrovirus (AcMNPV), the type species for the family and a model for studies on both basic baculovirus biology and the application of baculoviruses as gene expression vectors and insecticidal agents.

A number of NPV genes influence the species-specific virulence or host range of NPVs by affecting the ability to infect and replicate in cells of specific species, the dose required to cause mortality or the survival time of infected hosts (Chen & Thiem, 1997; Chen et al., 1998; Clem et al., 1991; Clem & Miller, 1993; Croizier et al., 1994; Lu & Miller, 1996; Maeda et al., 1993; Popham et al., 1998). In most of these cases, the species-specific effect of a gene on virulence or host range was discovered after expression of the gene had been eliminated by ORF disruption or deletion. Eliminating or reducing virulence against one species upon knocking out a gene suggests that acquisition of new genes during evolution (by recombination) can shape NPV host range. However, in a single case, individual amino acid replacements in a gene encoding an essential DNA helicase expanded the host range of AcMNPV to include a normally refractory species (Argaud et al., 1998; Kamita & Maeda, 1997). This example raises the possibility that nucleotide substitutions in key genes also influence virulence and host range.

Nonsynonymous (amino acid-changing) nucleotide substitutions in NPV genes may lead to alterations in the activity of the encoded protein that facilitate adaptation to a new host species, or overcome the defences of a current host. Such mutations would confer a fitness advantage and would be expected to be fixed in the population at a higher rate than synonymous (silent) substitutions, which are generally invisible to natural selection. When the rate of nonsynonymous substitutions per potential nonsynonymous site in a gene is greater than the rate of synonymous substitutions per potential synonymous site, the gene is said to be undergoing positive selection (Yang, 2001). This concept is expressed as the ratio of nonsynonymous to synonymous substitution rates, ω, which is greater than one for positively selected genes. The value of ω is less than one for genes undergoing negative or purifying selection, in which nonsynonymous mutations are deleterious and are eliminated at a faster rate than synonymous mutations. Most genes appear to be subject to negative selection most of the time (Endo et al., 1996; Yang, 2002).

Maximum-likelihood models that estimate the value of ω for aligned sequences have been used to identify sites within viral envelope glycoprotein genes that map to regions previously found to be involved in host immune recognition and receptor binding (Holmes et al., 2002; Twiddy et al., 2002; Woelk & Holmes, 2001; Woelk et al., 2001), suggesting that this method can be used to identify viral genes involved in adapting to new or current hosts. A similar evaluation of the selection pressures on NPV genes may help identify genes involved in species-specific virulence and host range, although positively selected sites may also result from selection for improved stability or transmission, or from compensatory changes triggered by variation in other genes. We applied maximum-likelihood models of codon substitution to examine the selection pressures on 83 group I nucleopolyhedrovirus genes that are either widespread in distribution or for which protein expression had been previously demonstrated.


   METHODS
Nucleotide sequences.
Group I NPV gene sequences were collected from the GenBank database and published sources (Table 1). Most data sets consisted of sequences from the five group I NPV genomes that have been completely sequenced (Autographa californica multiple nucleopolyhedrovirus, AcMNPV, Ayres et al., 1994; Bombyx mori nucleopolyhedrovirus, BmNPV, Gomi et al., 1999; Epiphyas postvittana multiple nucleopolyhedrovirus, EppoMNPV, Hyink et al., 2002; Orgyia pseudotsugata multiple nucleopolyhedrovirus, OpMNPV, Ahrens et al., 1997; Rachiplusia ou multiple nucleopolyhedrovirus, RoMNPV, Harrison & Bonning, 2003). In a few cases, AcMNPV and RoMNPV sequences for a given gene translated to produce identical amino acid sequences; in these cases, only the AcMNPV sequence was used. Some AcMNPV-C6 genes had been re-sequenced during a previous study (Harrison & Bonning, 2003), and the re-determined sequences for these genes were used when they differed from the Ayres et al. sequence. Sequences from other group I NPVs were included when available. In addition, the p35 sequence from the group II NPV of Leucania separata was included in the p35 data set.


Table 1. Maximum-likelihood analysis of selection pressures on 83 NPV genes

 
Alignment and phylogenetic tree construction.
For each data set, predicted amino acid sequences were aligned by CLUSTALW (Thompson et al., 1994) in the MEGALIGN program (Lasergene) using the Gonnet matrices with gap penalties of 10 or 15 and gap extension penalties of 0·2 or 0·3 (depending on the degree of sequence divergence observed in the data set). The alignments were adjusted manually when necessary. Regions where the alignment of homologous sites was uncertain due to insertions or deletions were removed. The sequences in the alignment were then converted back to the original nucleotide sequences. Phylogenetic trees of each nucleotide sequence alignment were constructed with PAUP* 4.0 (Sinauer Associates) using maximum-parsimony (MP), minimum evolution (ME) and maximum-likelihood (ML) methods. MP, ME and ML trees were sought by a heuristic search using tree bisection–reconnection. MP and ME tree reconstruction started with ten initial trees generated by random addition of sequences. ME tree reconstruction used the LOGDET/paralinear pairwise distance matrix. ML tree reconstruction used the HKY substitution model with variation in substitution rates among sites modelled by a gamma distribution with four categories. The transition/transversion ratio and gamma shape parameter were estimated from the ME tree previously constructed and fixed at these values for the remainder of the heuristic search.
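The conversion of an amino acid alignment back to the original nucleotide sequences can be illustrated with a minimal Python sketch. It assumes that the ungapped coding sequence is exactly three times the length of the ungapped protein sequence. The function name thread_codons and the example sequences are hypothetical and are not part of the MEGALIGN workflow described above.

```python
def thread_codons(aligned_protein, nucleotides, gap="-"):
    """Map an ungapped coding sequence onto one gapped row of a protein alignment."""
    residues = [aa for aa in aligned_protein if aa != gap]
    if len(nucleotides) != 3 * len(residues):
        raise ValueError("coding sequence length must be 3 x the number of residues")
    codons = []
    pos = 0
    for aa in aligned_protein:
        if aa == gap:
            # A gap in the protein alignment becomes a three-column gap.
            codons.append(gap * 3)
        else:
            # Each residue is replaced by the codon that encoded it.
            codons.append(nucleotides[pos:pos + 3])
            pos += 3
    return "".join(codons)


# Hypothetical aligned protein row "M-KL" encoded by ATG AAA CTG.
example = thread_codons("M-KL", "ATGAAACTG")
print(example)  # ATG---AAACTG
```

Threading codons onto the protein alignment in this way keeps homologous codons in the same columns, which is what the codon-based models require.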

Analysis of selection pressure.
The PAML (phylogenetic analysis by maximum-likelihood) software package (Yang, 1997; http://abacus.gene.ucl.ac.uk/software/paml.html) was used to evaluate selection pressures on NPV genes. This software uses a maximum-likelihood approach with codon-based models to estimate the ratio (ω) of dN, the rate of nonsynonymous substitutions per nonsynonymous site, to dS, the rate of synonymous substitutions per synonymous site.

All nucleotide sequence alignments were fitted to six models with different hypotheses about the distribution of estimated values of ω (Yang et al., 2000): (1) M0 assumes one ω value for all codons; (2) M1 divides codons into an invariant class p0, where ω is set at zero (purifying selection), and a neutral class p1, where ω is set at one (neutral evolution); (3) M2 includes p0 and p1 from M1, and adds a third class (p2), where ω is estimated from the underlying data and can be greater than one; (4) M3 divides codons among three classes of sites (p0, p1 and p2), with ω estimated independently for all three classes and allowed to be greater than one; (5) M7 features ten classes modelled with a discrete beta distribution whose shape is determined by parameters p and q, and ω values for these classes cannot be greater than one; and (6) M8 includes the ten classes of M7 (collectively referred to as p0), and uses an additional class (p1) where ω can be greater than one.

In addition, the DNA helicase gene (dnahel) was further analysed with the free-ratios model, which allows ω to be independently estimated for each individual branch in a phylogenetic tree (Yang, 1998).

Models M0 and M1 are nested within models M2 and M3, and model M7 is nested within M8. Nested models can be compared statistically using a likelihood ratio test, in which twice the difference between the log-likelihood values for two models is compared with a χ² distribution with the degrees of freedom equal to the difference in the number of parameters between the two models (Yang et al., 2000). This comparison supplies a P value for testing whether the null hypothesis (no positive selection, embodied in models M1 and M7) fits the data as well as the more general models that allow for the possibility of positive selection. Positive selection can be inferred from this analysis when (1) models M2, M3 or M8 indicate a group of codons with an ω ratio greater than one, and (2) the likelihood of the positive selection model is significantly higher than that of the nested null hypothesis model (at P<0·05). The M0 model, which assumes a single ω ratio for all lineages, and the free-ratios model can also be compared in this manner.
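The likelihood ratio test described above can be sketched in a few lines of Python. This is an illustration only, not the procedure implemented in PAML: the function names and the log-likelihood values are hypothetical, and the chi-square tail probability is computed with a standard recurrence so that only the standard library is needed. The choice of two degrees of freedom in the example assumes a comparison such as M7 against M8, where the more general model has two extra parameters.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting points: df = 2 (even) or df = 1 (odd).
    if df % 2 == 0:
        k, tail = 2, math.exp(-half)
    else:
        k, tail = 1, math.erfc(math.sqrt(half))
    # Recurrence: sf(x; k + 2) = sf(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        tail += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return tail


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return the test statistic 2 * (lnL_alt - lnL_null) and its P value."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods for a null model and a more general model.
stat, p = likelihood_ratio_test(lnl_null=-1000.0, lnl_alt=-995.0, df=2)
print(stat, round(p, 4))  # 10.0 0.0067
```

With these made-up values the null model would be rejected at P<0·05, which is the second of the two conditions required above for inferring positive selection.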

The empirical Bayes procedure is used to calculate the probabilities for individual codons belonging to each of the site classes and can be used to predict which codons are under positive selection. The program output lists the codon sites with a probability ≥0·5 of being in the positively selected class.
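The idea behind the empirical Bayes calculation can be shown with a small Python sketch. It assumes that the class proportions and the likelihood of the data at one codon under each site class are already available from the fitted model. All numbers and the function name are hypothetical, and the sketch is not the PAML implementation.

```python
def site_class_posteriors(proportions, site_likelihoods):
    """Posterior probability that one codon belongs to each site class.

    proportions      -- estimated proportion of codons in each class
    site_likelihoods -- likelihood of the data at this codon under each class
    """
    weighted = [p * lik for p, lik in zip(proportions, site_likelihoods)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Hypothetical three-class model; the last class is the one with omega > 1.
proportions = [0.70, 0.25, 0.05]
site_likelihoods = [1e-6, 2e-6, 4e-5]

posteriors = site_class_posteriors(proportions, site_likelihoods)
# The site would be listed if its posterior for the selected class is >= 0.5.
listed = posteriors[-1] >= 0.5
print([round(x, 5) for x in posteriors], listed)
```

In this example the codon has a posterior probability of 0·625 of belonging to the positively selected class, so it would meet the ≥0·5 reporting threshold even though that class holds only 5 % of sites.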

Codon frequency bias was accounted for using the F61 model of codon frequency, in which frequencies for each codon are calculated individually. The transition/transversion ratio (κ) was estimated from the underlying data. For each data set, the analysis was run using each MP, ME and ML tree saved by PAUP* that possessed a unique topology.

Alignments and output files from these analyses can be downloaded from http://www.ent.iastate.edu/dept/faculty/bonningb/selection_pressure-group1.zip.


   RESULTS AND DISCUSSION
Alignment and phylogeny of group I NPV sequences
Selection pressure analysis requires alignment of homologous codon sites. Failure to accurately align homologous codon sites may result in the false identification of positively selected sites. For this reason, our analysis focused almost exclusively on sequences from group I NPVs, which were less divergent than group II NPV sequences. Even with data sets consisting solely of group I sequences, it was necessary in a majority of the data sets to remove regions of the alignment where it was unclear if homologous codon sites had been properly aligned due to the insertion of gaps needed to achieve optimal alignment scores.

Most data sets consisted of sequences from AcMNPV, RoMNPV, BmNPV, OpMNPV and EppoMNPV. Phylogenetic analysis of these sequences, irrespective of the method used, divided them into two clades, one consisting of AcMNPV, RoMNPV and BmNPV, and the other consisting of OpMNPV and EppoMNPV. In data sets containing sequences from other viruses (e.g. Choristoneura fumiferana MNPV), these sequences grouped with the clade containing OpMNPV and EppoMNPV.

Selection pressure analysis
Models M3 or M8 were the best fit (in terms of having the highest log-likelihood scores) for 81 of the 83 data sets (Table 1). The strictly neutral model (M1) was a poor fit for many of the data sets. For 70 genes, the log-likelihood scores for M1 were lower than the scores for M0 (the model assuming a single ω value for all sites). For four of these 70 data sets, the M2 model (which includes two classes from M1) also fit the data less well than M0.

For 52 genes, none of the models designed to detect positive selection identified a class with ω>1. At least one model identified a class of positively selected sites in the remaining 31 data sets. For 18 of 21 data sets for which the M3 model detected a class of positively selected sites, the M2 model did not identify positively selected sites. Because M2 has two fixed-value classes (p0 and p1, with ω set at 0 and 1, respectively), the extra category p2 can be taken up by codon sites with ω values lying between 0 and 1, leaving no class to capture sites with ω>1 (Yang et al., 2000).

For nine genes, models M2, M3 or M8 identified a class of positively selected sites and rejected null hypothesis models at P<0·05. For many of the remaining 22 data sets, the nested null hypothesis models could not be rejected at this significance level. In some cases, M3 contained a positively selected site class but was not a significantly better fit for the data than M2, which failed in these cases to identify positively selected sites. These genes were not considered to be positively selected.

Of the data sets for which more than one tree topology was obtained, the models differed in identifying positively selected classes under different tree topologies in only one instance. This result is consistent with previous results indicating that the ability to detect positive selection is not strongly affected by tree topology (Yang et al., 2000).

Positively selected structural genes
Three genes encoding proteins that are components of the virion were identified as being positively selected (Table 2).


Table 2. NPV structural gene data sets where positive selection was detected and their relevant parameter values

 
odv-e18.
For models M3 and M8, approximately 5 % of the codon sites in this data set were placed into categories with ω values reported as 99, which is a pre-set upper limit imposed by PAML when the estimated number of synonymous substitutions (and dS) is 0 or a very low number. M3 and M8 convincingly rejected the null hypothesis models at P≪0·01, and M3 rejected M2 (which did not identify positively selected sites) at a very low P value as well. Empirical Bayes analysis placed the same four sites in the positively selected class for M3 and M8. None of these sites mapped to a hydrophobic domain (29–47) hypothesized to be involved in the localization of this protein to ODV envelopes and envelope precursors (Braunagel et al., 1996b).

odv-e56.
M3 and M8 both contained site classes in odv-e56 consisting of approximately 4 % of the sites with ω>3. The other models were rejected at P≤0·01. Like ODV-E18, ODV-E56 is found in the envelope of occluded virus. None of the positively selected sites identified by empirical Bayes analysis were found in a hydrophobic domain hypothesized to be involved in the viral envelope localization of this protein (Braunagel et al., 1996a).

As ODV envelope proteins, ODV-E18 and ODV-E56 may interact with midgut cell surface proteins and mediate binding and internalization of ODV. Substitutions in key sites of these proteins may enhance binding and internalization in different species.

vp80.
Classes of positively selected codons with ω>2 were present in both M3 and M8 in the vp80 data set. M3 rejected M1 and M2 (which did not identify positively selected sites) at P≪0·01, but M8 could not reject M7 at P<0·05. For both M3 and M8, empirical Bayes analysis identified three positively selected sites that are located in the conserved N and C termini of the predicted protein sequence (Li et al., 1997b).

vp80 encodes a protein associated with the NPV nucleocapsid (Lu & Carstens, 1992; Li et al., 1997b). It is unclear how changes in a nucleocapsid protein like VP80 would modulate virulence or host range. Positive selection detected with this gene may represent co-variation to compensate for sequence alterations occurring elsewhere in the genome, or it may reflect adaptation that stabilizes virions under different environmental conditions.

Positively selected genes encoding replication and expression factors
Three genes with previously characterized roles in virus replication and gene expression were identified as being positively selected (Table 3).


Table 3. NPV replication and expression gene data sets where positive selection was detected and their relevant parameter values

 
lef-7.
Both M2 (data not shown) and M3 indicated positively selected sites in lef-7. M8 also contained a category of positively selected sites, but the parameter values of this category differed from those calculated for M2 and M3. M8 could not reject M7 at the P<0·05 level.

lef-7 was found to be necessary for late promoter-driven reporter gene expression in a transient assay in a Spodoptera frugiperda cell line, but not in a Trichoplusia ni cell line (Lu & Miller, 1995a). lef-7 knockout mutant viruses exhibited impaired replication in S. frugiperda and Spodoptera exigua cell lines but not in a T. ni cell line (Chen & Thiem, 1997). Substitutions in this gene may modulate the ability of NPVs to replicate in the cells of different hosts.

lef-10.
Models M2, M3 and M8 all indicated positively selected categories of sites in lef-10. For M3 and M8, these categories contained approximately 18 % of the sites in this small ORF, and exhibited ω values of approximately 6·5. Null hypothesis models were rejected for both M3 and M8, and the same eight codon sites were placed into the positively selected category.

lef-12.
Positively selected categories for M3 and M8 each contained approximately 1–2 % of total sites with similar ω values (approximately 9·7–9·8). M3 and M8 rejected the other models at P≤0·01. A single codon site (position 153) was identified as being positively selected by both models with a posterior probability >0·99.

lef-10 and lef-12 were identified as genes that supported late promoter-driven reporter gene expression in a transient assay (Lu & Miller, 1994; Rapp et al., 1998). lef-12 was found to be essential for expression in assays performed with S. frugiperda-derived Sf21 cells, but not in High 5 cells derived from T. ni, suggesting that LEF-12 operates as a species-specific late gene expression factor (Rapp et al., 1998). Further analysis revealed that lef-12 was itself a late gene (Guarino et al., 2002). In contrast to results obtained with transient expression assays, viral mutants in which expression of lef-12 had been eliminated were able to express late genes and replicate in S. frugiperda-derived Sf9 cells, albeit at reduced levels (Guarino et al., 2002). With both lef-10 and lef-12, mutations may facilitate efficient late gene expression in different species.

Positively selected auxiliary genes and genes of unknown function
Two uncharacterized but widespread ORFs and one gene involved in actin rearrangement were identified as undergoing positive selection (Table 4).


Table 4. NPV auxiliary and unknown function gene data sets where positive selection was detected and their relevant parameter values

 
ac38.
M3 and M8 identified categories of positively selected sites in ac38 with similar parameter values. However, M3 could not reject M2, which did not identify positively selected sites (data not shown). M8 was able to reject M7 at P<0·01. Bayesian analysis with the parameter estimates of both M3 and M8 identified the same five sites as being positively selected.

ac66 and arif-1.
Positively selected sites were identified by M3 and M8 for both ac66 and arif-1, but the parameter estimates for the categories containing these sites were different for the two models. For both ac66 and arif-1, the positively selected categories under M8 were smaller with higher ω values. For arif-1, M2 (which did not identify positively selected sites) was not rejected by M3.

ac38 and ac66 are present in all lepidopteran NPV genomes sequenced to date. However, these ORFs remain uncharacterized. The ac38 predicted amino acid sequences have no significant sequence identity with proteins with a known function, while ac66 specifies an amino acid sequence with significant identity to desmoplakin, a structural component of intercellular junctions called desmosomes that link the intermediate filaments of cells together (Ruhrberg & Watt, 1997).

The arif-1 ORF encodes the 48 kDa actin rearrangement-inducing factor, a protein that localizes to vesicular structures at the plasma membrane of infected cells (Roncarati & Knebel-Mörsdorf, 1997). ARIF-1 mediates the dissociation of the host cell actin network and of the virus-induced actin cables that form early during infection, as well as the subsequent formation of actin aggregates at the plasma membrane (Dreschers et al., 2001). Mutations in AcMNPV arif-1 had no effect upon replication in S. frugiperda or T. ni cells in vitro (Roncarati & Knebel-Mörsdorf, 1997; Dreschers et al., 2001). However, ARIF-1 could be required in some way for the in vivo replication cycle.

Analysis of the DNA helicase gene
dnahel encodes a DNA helicase (P143) required for viral DNA replication and late gene expression (Gordon & Carstens, 1984; Lu & Carstens, 1991; Lu & Miller, 1995b; McDougal & Guarino, 2000). Substitution of part of the AcMNPV dnahel gene with the homologous sequence from BmNPV dnahel resulted in recombinant AcMNPV that could replicate in a B. mori cell line and kill B. mori larvae, a species normally refractory to AcMNPV infection (Maeda et al., 1993; Croizier et al., 1994). Substitutions at two sites encoding different amino acids in the AcMNPV and BmNPV dnahel gene products were found to be minimally required for the expanded host range of the recombinant AcMNPV (Argaud et al., 1998; Kamita & Maeda, 1997). This result suggests that P143 works in a host-specific fashion to facilitate virus replication, and that non-synonymous substitutions in dnahel may contribute to the capacity of NPVs to replicate in different hosts.

However, selection pressure analysis with models that allow ω to vary among sites failed to detect positive selection in this gene (Table 5). A category of codons with ω=1·213 was indicated in M8, but the null hypothesis model M7 could not be rejected (P=0·333). The two sites in dnahel (positions 564 and 577) identified as being minimally required for expansion of the AcMNPV host range to include B. mori had average ω values in M8 of 0·099 and 0·08, respectively.


Table 5. Summary of selection pressures on dnahel

 
The models that allow ω to vary among codon sites do not allow ω to vary among lineages. With such models, positive selection is not detected if ω averaged over all lineages is not greater than one. It is possible that positive selection occurred in dnahel only within the clade containing AcMNPV, BmNPV and related viruses. To investigate this possibility, the dnahel alignment was subjected to analysis using the free-ratios model, in which ω is allowed to vary among lineages. The free-ratios model was a significantly better fit for the data than M0 (P<10⁻⁷). With this analysis, the branch of the tree leading to the BmNPV dnahel gene has an ω>1, with dN=0·0105 and dS=0·0001 (Fig. 1). With 2582·2 potential nonsynonymous sites and 906·8 potential synonymous sites, 27·1 nonsynonymous substitutions and 0·1 synonymous substitutions are estimated to have occurred along the branch of this tree leading to the BmNPV dnahel sequence. This result is consistent with positive selection taking place during the divergence of the BmNPV sequence and the AcMNPV-like sequences.
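The branch estimates quoted above follow from simple arithmetic, which the short Python sketch below reproduces from the reported site and substitution counts. The variable names are illustrative. Because the estimated number of synonymous substitutions is close to zero, the resulting ratio is very sensitive to rounding and should be read only as an indication that ω exceeds one on this branch.

```python
# Values reported for the branch leading to the BmNPV dnahel sequence.
nonsyn_sites = 2582.2   # potential nonsynonymous sites
syn_sites = 906.8       # potential synonymous sites
nonsyn_subs = 27.1      # estimated nonsynonymous substitutions
syn_subs = 0.1          # estimated synonymous substitutions

# Rates per site: substitutions divided by the number of potential sites.
dn = nonsyn_subs / nonsyn_sites
ds = syn_subs / syn_sites
omega = dn / ds

print(round(dn, 4), round(ds, 4))  # 0.0105 0.0001
print(omega > 1)                   # True
```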



Fig. 1. Phylogeny of group I NPV dnahel genes. The two numbers at each branch are maximum-likelihood estimates of dN and dS, respectively, for dnahel along that branch under the free-ratios model. With the exception of the branch leading to the OpMNPV and EppoMNPV genes (dashed line), the branches are drawn in proportion to the estimates of their lengths (measured in estimated substitutions per codon site).

 
Because positive selection is likely only to occur at a few sites within a gene and at a few time intervals during the evolutionary history of a gene, detecting positive selection is difficult. The methods used in this study employ models that make this task easier by allowing selection pressures to vary among sites and among lineages. The variation in selection pressures among sites has proven to be a more difficult problem in detecting positive selection, and models that allow ω to vary among sites have been successful in detecting positive selection even in sequences where most of the sites are under purifying selection (Yang, 2002).

The likelihood ratio tests used to compare models determine whether sequences in an alignment contain sites under positive selection and are very conservative (Anisimova et al., 2001). The power of the likelihood ratio tests to detect positive selection decreases with decreasing number of sequences in a data set. Because our data sets often consisted of only five sequences, it is likely that not all of the positively selected NPV genes of the group of 83 examined in this study were detected. In addition, the inclusion of the OpMNPV sequences likely resulted in higher values of dS, which would have increased the difficulty of identifying genes with ω>1. The OpMNPV genome has an overall G+C composition of 55 %, while the G+C compositions of EppoMNPV, AcMNPV, RoMNPV and BmNPV are approximately 39–41 %. This difference in nucleotide frequencies is especially pronounced in the third (wobble) codon position of coding sequences, where the G+C % is approximately 70–80 % for OpMNPV genes, compared with 48–52 % for genes from the other viruses. As a result, homologous codon sites that code for the same amino acid are frequently expected to have a G or a C in the wobble position of the OpMNPV sequence, while an A or a T is more likely to occur at the same position in the other sequences. The sequence differences at the wobble positions of such codons would be scored as synonymous substitutions, when in fact the differences in some cases may be a function of the different nucleotide frequencies between OpMNPV and the other sequences.
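The wobble-position G+C content discussed above can be computed with a short Python function. This is a minimal sketch that assumes an in-frame coding sequence containing only A, C, G and T. The function name gc3_fraction and the two nine-base sequences are hypothetical examples, not data from this study.

```python
def gc3_fraction(coding_sequence):
    """Fraction of codons with G or C at the third (wobble) position."""
    seq = coding_sequence.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("sequence must be non-empty and a multiple of three in length")
    third_positions = seq[2::3]  # every third base, starting at codon position 3
    return sum(base in "GC" for base in third_positions) / len(third_positions)


# Two hypothetical sequences that both encode Ala-Leu-Lys.
gc_rich = "GCCCTGAAG"   # wobble bases C, G, G
at_rich = "GCTCTAAAA"   # wobble bases T, A, A

print(gc3_fraction(gc_rich), gc3_fraction(at_rich))  # 1.0 0.0
```

The two example sequences differ at every wobble position yet encode the same peptide, so each difference would be counted as a synonymous substitution even if it reflects only a difference in base composition.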

The ability of the empirical Bayes method to identify sites occurring in positively selected classes suffers from reduced accuracy when a low number of sequences are analysed (Anisimova et al., 2002). Although the empirical Bayes method often identified the same residues as being positively selected with different models (M3 and M8), the identification of positively selected sites in this study should be regarded with caution because of the relatively low number of sequences in the data sets.

This analysis identified nine genes that have undergone positive selection amongst group I NPVs. Two of these genes (odv-e18 and arif-1) were previously identified as being positively selected when the same codon substitution models were applied to alignments of AcMNPV and RoMNPV genes (Harrison & Bonning, 2003). Genes under positive selection pressure may account for differences in species-specific virulence or host range among NPVs, although contributions to environmental stability or transmission efficiency cannot be ruled out. It is also possible that some of these genes, such as vp80, are not responding directly to selection pressure but are exhibiting co-variation to compensate for changes elsewhere in the genome. Empirical studies will be required to assess the contribution of the positively selected genes identified in this study to species-specific virulence and host range.


   ACKNOWLEDGEMENTS
 
We thank Dr Ziheng Yang (Department of Biology, University College London), Dr Gavin Naylor (Department of Zoology and Genetics, Iowa State University), and Dr Karin Dorman (Department of Statistics, Iowa State University) for helpful discussions. This material is based upon work supported by Hatch Act and State of Iowa funds.


   REFERENCES
Adams, J. R. & McClintock, J. T. (1991). Baculoviridae. Nuclear polyhedrosis viruses, part 1: nuclear polyhedrosis viruses of insects. In Atlas of Invertebrate Viruses, pp. 87–204. Edited by J. R. Adams & J. R. Bonami. Boca Raton, FL: CRC Press.

Ahrens, C. H., Russell, R. L. Q., Funk, C. J., Evans, J. T., Harwood, S. H. & Rohrmann, G. F. (1997). The sequence of the Orgyia pseudotsugata multicapsid nuclear polyhedrosis virus genome. Virology 229, 381–399.

Anisimova, M., Bielawski, J. P. & Yang, Z. (2001). Accuracy and power of the likelihood ratio test in detecting adaptive molecular evolution. Mol Biol Evol 18, 1585–1592.

Anisimova, M., Bielawski, J. P. & Yang, Z. (2002). Accuracy and power of Bayes prediction of amino acid sites under positive selection. Mol Biol Evol 19, 950–958.

Argaud, O., Croizier, L., López-Ferber, M. & Croizier, G. (1998). Two key mutations in the host-range specificity domain of the p143 gene of Autographa californica nucleopolyhedrovirus are required to kill Bombyx mori larvae. J Gen Virol 79, 931–935.

Ayres, M. D., Howard, S. C., Kuzio, J., López-Ferber, M. & Possee, R. D. (1994). The complete DNA sequence of Autographa californica nuclear polyhedrosis virus. Virology 202, 586–605.

Barrett, J. W., Krell, P. J. & Arif, B. M. (1995). Characterization, sequencing and phylogeny of the ecdysteroid UDP-glucosyltransferase gene from two distinct nuclear polyhedrosis viruses isolated from Choristoneura fumiferana. J Gen Virol 76, 2447–2456.

Blissard, G., Black, B., Crook, N., Keddie, B. A., Possee, R., Rohrmann, G., Theilmann, D. & Volkman, L. (2000). Family Baculoviridae. In Virus Taxonomy. Seventh Report of the International Committee on the Taxonomy of Viruses, pp. 195–202. Edited by M. H. V. van Regenmortel, C. M. Fauquet, D. H. L. Bishop, E. B. Carstens, M. K. Estes, S. M. Lemon, J. Maniloff, M. A. Mayo, D. J. McGeoch, C. R. Pringle & R. B. Wickner. San Diego: Academic Press.

Braunagel, S. C., Elton, D. M., Ma, H. & Summers, M. D. (1996a). Identification and analysis of an Autographa californica nuclear polyhedrosis virus structural protein of the occlusion-derived virus envelope: ODV-E56. Virology 217, 97–110.

Braunagel, S. C., He, H., Ramamurthy, P. & Summers, M. D. (1996b). Transcription, translation, and cellular localization of three Autographa californica nuclear polyhedrosis virus structural proteins: ODV-E18, ODV-E35, and ODV-EC27. Virology 222, 100–114.

Carstens, E. B., Liu, J. J. & Dominy, C. (2002). Identification and molecular characterization of the baculovirus CfMNPV early genes: ie-1, ie-2, and pe38. Virus Res 83, 13–30.

Chen, C. J. & Thiem, S. M. (1997). Differential infectivity of two Autographa californica nucleopolyhedrovirus mutants on three permissive cell lines is the result of lef-7 deletion. Virology 227, 88–95.

Chen, C.-J., Quentin, M. E., Brennan, L. A., Kukel, C. & Thiem, S. M. (1998). Lymantria dispar nucleopolyhedrovirus hrf-1 expands the larval host range of Autographa californica nucleopolyhedrovirus. J Virol 72, 2526–2531.

Chou, C. M., Huang, C. J., Lo, C. F., Kou, G. H. & Wang, C. H. (1996). Characterization of Perina nuda nucleopolyhedrovirus (PenuNPV) polyhedrin gene. J Invertebr Pathol 67, 259–266.

Clem, R. J. & Miller, L. K. (1993). Apoptosis reduces both the in vitro replication and the in vivo infectivity of a baculovirus. J Virol 67, 3730–3738.

Clem, R. J., Fechheimer, M. & Miller, L. K. (1991). Prevention of apoptosis by a baculovirus gene during infection of insect cells. Science 254, 1388–1389.

Croizier, G., Croizier, L., Argaud, O. & Poudevigne, D. (1994). Extension of Autographa californica nuclear polyhedrosis virus host range by interspecific replacement of a short DNA sequence in the p143 helicase gene. Proc Natl Acad Sci U S A 91, 48–52.

Dreschers, S., Roncarati, R. & Knebel-Mörsdorf, D. (2001). Actin rearrangement-inducing factor of baculoviruses is tyrosine-phosphorylated and colocalizes to F-actin at the plasma membrane. J Virol 75, 3771–3778.

Endo, T., Ikeo, K. & Gojobori, T. (1996). Large-scale search for genes on which positive selection may operate. Mol Biol Evol 13, 685–690.

Gomi, S., Majima, K. & Maeda, S. (1999). Sequence analysis of the genome of Bombyx mori nucleopolyhedrovirus. J Gen Virol 80, 1323–1337.

Gordon, J. D. & Carstens, E. B. (1984). Phenotypic characterization and physical mapping of a temperature sensitive mutant of Autographa californica nuclear polyhedrosis virus defective in DNA synthesis. Virology 138, 69–81.

Guarino, L. A., Mistretta, T.-A. & Dong, W. (2002). Baculovirus lef-12 is not required for viral replication. J Virol 76, 12032–12043.

Harrison, R. L. & Bonning, B. C. (2003). Comparative analysis of the genomes of Rachiplusia ou and Autographa californica multiple nucleopolyhedroviruses. J Gen Virol 84, 1827–1842.

Hill, J. E. & Faulkner, P. (1994). Identification of the gp67 gene of a baculovirus pathogenic to the spruce budworm, Choristoneura fumiferana multinucleocapsid nuclear polyhedrosis virus. J Gen Virol 75, 1811–1813.

Hill, J. E., Kuzio, J. & Faulkner, P. (1995). Identification and characterization of the v-cath gene of the baculovirus, CfMNPV. Biochim Biophys Acta 1264, 275–278.

Holmes, E. C., Woelk, C. H., Kassis, R. & Bourhy, H. (2002). Genetic constraints and the adaptive evolution of rabies virus in nature. Virology 292, 247–257.

Hyink, O., Dellow, R. A., Olsen, M. J., Caradoc-Davies, K. M. B., Drake, K., Herniou, E. A., Cory, J. S., O'Reilly, D. R. & Ward, V. K. (2002). Whole genome analysis of the Epiphyas postvittana nucleopolyhedrovirus. J Gen Virol 83, 957–971.

Kamita, S. G. & Maeda, S. (1997). Sequencing of the putative DNA helicase-encoding gene of the Bombyx mori nuclear polyhedrosis virus and fine-mapping of a region involved in host range expansion. Gene 190, 173–179.

Lapointe, R., Back, D. W., Ding, Q. & Carstens, E. B. (2000). Identification and molecular characterization of the Choristoneura fumiferana multicapsid nucleopolyhedrovirus genomic region encoding the regulatory genes pkip, p47, lef-12, and gta. Virology 271, 109–121.

Li, X., Pang, A., Lauzon, H. A., Sohi, S. S. & Arif, B. M. (1997a). The gene encoding the capsid protein P82 of the Choristoneura fumiferana multicapsid nucleopolyhedrovirus: sequencing, transcription and characterization by immunoblot analysis. J Gen Virol 78, 2665–2673.

Li, X., Pang, A., Lauzon, H. A. M., Sohi, S. S. & Arif, B. M. (1997b). The gene encoding the capsid protein P82 of the Choristoneura fumiferana multicapsid nucleopolyhedrovirus: sequencing, transcription and characterization by immunoblot analysis. J Gen Virol 78, 2665–2673.

Li, X., Lauzon, H. A., Sohi, S. S., Palli, S. R., Retnakaran, A. & Arif, B. M. (1999). Molecular analysis of the p48 gene of Choristoneura fumiferana multicapsid nucleopolyhedroviruses CfMNPV and CfDEFNPV. J Gen Virol 80, 1833–1840.

Liu, J. J. & Carstens, E. B. (1995). Identification, localization, transcription, and sequence analysis of the Choristoneura fumiferana nuclear polyhedrosis virus DNA polymerase gene. Virology 209, 538–549.

Liu, J. J. & Carstens, E. B. (1996). Identification, molecular cloning, and transcription analysis of the Choristoneura fumiferana nuclear polyhedrosis virus spindle-like protein gene. Virology 223, 396–400.

Liu, J. C. & Maruniak, J. E. (1999). Molecular characterization of genes in the GP41 region of baculoviruses and phylogenetic analysis based upon GP41 and polyhedrin genes. Virus Res 64, 187–196.

Lu, A. & Carstens, E. B. (1991). Nucleotide sequence of a gene essential for viral DNA replication in the baculovirus Autographa californica nuclear polyhedrosis virus. Virology 181, 336–347.

Lu, A. & Carstens, E. B. (1992). Nucleotide sequence and transcriptional analysis of the p80 gene of Autographa californica nuclear polyhedrosis virus: a homologue of the Orgyia pseudotsugata nuclear polyhedrosis virus capsid-associated protein. Virology 190, 201–209.

Lu, A. & Miller, L. K. (1994). Identification of three late expression factor genes within the 33·8- to 43·4-map-unit region of Autographa californica nuclear polyhedrosis virus. J Virol 68, 6710–6718.

Lu, A. & Miller, L. K. (1995a). Differential requirements for baculovirus late expression factor genes in two cell lines. J Virol 69, 6265–6272.

Lu, A. & Miller, L. K. (1995b). The roles of eighteen baculovirus late expression factor genes in transcription and DNA replication. J Virol 69, 975–982.

Lu, A. & Miller, L. K. (1996). Species-specific effects of the hcf-1 gene on baculovirus virulence. J Virol 70, 5123–5130.

Maeda, S., Kamita, S. G. & Kondo, A. (1993). Host range expansion of Autographa californica nuclear polyhedrosis virus (NPV) following recombination of a 0·6-kilobase-pair DNA fragment originating from Bombyx mori NPV. J Virol 67, 6234–6238.

McDougal, V. V. & Guarino, L. A. (2000). The Autographa californica nuclear polyhedrosis virus p143 gene encodes a DNA helicase. J Virol 74, 5273–5279.

Popham, H. J. R., Pellock, B. J., Robson, M., Dierks, P. M. & Miller, L. K. (1998). Characterization of a variant of Autographa californica nuclear polyhedrosis virus with a nonfunctional ORF 603. Biol Control 12, 223–230.

Rapp, J. C., Wilson, J. A. & Miller, L. K. (1998). Nineteen baculovirus open reading frames, including LEF-12, support late gene expression. J Virol 72, 10197–10206.

Rodrigues, J. C., De Souza, M. L., O'Reilly, D., Velloso, L. M., Pinedo, F. J., Razuck, F. B., Ribeiro, B. & Ribeiro, B. M. (2001). Characterization of the ecdysteroid UDP-glucosyltransferase (egt) gene of Anticarsia gemmatalis nucleopolyhedrovirus. Virus Genes 22, 103–112.

Roncarati, R. & Knebel-Mörsdorf, D. (1997). Identification of the early actin-rearrangement-inducing factor gene, arif-1, from Autographa californica multicapsid nuclear polyhedrosis virus. J Virol 71, 7933–7941.

Ruhrberg, C. & Watt, F. M. (1997). The plakin family: versatile organizers of cytoskeletal architecture. Curr Opin Genet Dev 7, 392–397.

Thompson, J. D., Higgins, D. G. & Gibson, T. J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22, 4673–4680.

Twiddy, S. S., Woelk, C. H. & Holmes, E. C. (2002). Phylogenetic evidence for adaptive evolution of dengue viruses in nature. J Gen Virol 83, 1679–1689.

Wilson, J. A., Hill, J. E., Kuzio, J. & Faulkner, P. (1995). Characterization of the baculovirus Choristoneura fumiferana multicapsid nuclear polyhedrosis virus p10 gene indicates that the polypeptide contains a coiled-coil domain. J Gen Virol 76, 2923–2932.

Woelk, C. H. & Holmes, E. C. (2001). Variable immune-driven natural selection in the attachment (G) glycoprotein of respiratory syncytial virus (RSV). J Mol Evol 52, 182–192.

Woelk, C. H., Jin, L., Holmes, E. C. & Brown, D. W. G. (2001). Immune and artificial selection in the haemagglutinin (H) glycoprotein of measles virus. J Gen Virol 82, 2463–2474.

Yang, Z. (1997). PAML: a program package for phylogenetic analysis by maximum likelihood. CABIOS 13, 555–556.

Yang, Z. (1998). Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol Biol Evol 15, 568–573.

Yang, Z. (2001). Adaptive molecular evolution. In Handbook of Statistical Genetics, pp. 327–350. Edited by D. J. Balding, M. Bishop & C. Cannings. London: Wiley.

Yang, Z. (2002). Inference of selection from multiple sequence alignments. Curr Opin Genet Dev 12, 688–694.

Yang, Z., Nielsen, R., Goldman, N. & Pedersen, A.-M. K. (2000). Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155, 431–449.

Zanotto, P. M., Sampaio, M. J., Johnson, D. W., Rocha, T. L. & Maruniak, J. E. (1992). The Anticarsia gemmatalis nuclear polyhedrosis virus polyhedrin gene region: sequence analysis, gene product and structural comparisons. J Gen Virol 73, 1049–1056.

Received 6 August 2003; accepted 22 September 2003.


