Evolutionary timescale of rabies virus adaptation to North American bats inferred from the substitution rate of the nucleoprotein gene

Gareth J. Hughes{dagger}, Lillian A. Orciari and Charles E. Rupprecht

Rabies Section, Centers for Disease Control and Prevention, 1600 Clifton Road, Mail-Stop G33, Atlanta, GA 30333, USA

Correspondence
Charles E. Rupprecht
cyr5{at}cdc.gov


   ABSTRACT
Top
ABSTRACT
INTRODUCTION
METHODS
RESULTS
DISCUSSION
REFERENCES
 
Throughout North America, rabies virus (RV) is endemic in bats. Distinct RV variants exist that are closely associated with infection of individual host species, such that there is little or no sustained spillover infection away from the primary host. Using Bayesian methodology, nucleotide substitution rates were estimated from alignments of partial nucleoprotein (N) gene sequences of nine distinct bat RV variants from North America. Substitution rates ranged from 2·32x10–4 to 1·38x10–3 substitutions per site per year. A maximum-likelihood (ML) molecular clock model was rejected for only two of the nine datasets. In addition, using sequences from bat RV variants across the Americas, the evolutionary rate for the complete N gene was estimated to be 2·32x10–4. This rate was used to scale trees using Bayesian and ML methods, and the time of the most recent common ancestor for current bat RV variant diversity in the Americas was estimated to be 1660 (range 1267–1782) and 1651 (range 1254–1773), respectively. Our reconstructions suggest that RV variants currently associated with infection of bats from Latin America (Desmodus and Tadarida) share the earliest common ancestor with the progenitor RV. In addition, from the ML tree, times were estimated for the emergence of the three major lineages responsible for bat rabies cases in North America. Adaptation to infection of the colonial bat species analysed (Eptesicus fuscus, Myotis spp.) appears to have occurred much quicker than for the solitary species analysed (Lasionycteris noctivagans, Pipistrellus subflavus, Lasiurus borealis, Lasiurus cinereus), suggesting that the process of virus adaptation may be dependent on host biology.

{dagger}Present address: Laboratory for Clinical and Molecular Virology, The University of Edinburgh, Summerhall, Edinburgh EH9 1QH, UK.


   INTRODUCTION
Top
ABSTRACT
INTRODUCTION
METHODS
RESULTS
DISCUSSION
REFERENCES
 
Rabies virus (RV) is the type member of the genus Lyssavirus, family Rhabdoviridae. RV has a negative-sense RNA genome of ~12 kb and consists of five genes encoding the nucleoprotein (N), phosphoprotein, matrix protein, glycoprotein and polymerase (Tordo et al., 1986). The N encapsidates the viral RNA, forming the template for transcription and replication (Tordo & Kouknetzoff, 1993). RNA viruses are characterized by a high mutation rate during replication due to the lack of proofreading and post-replication error correction by RNA polymerases (Domingo & Holland, 1994). This genetic diversity seems to provide an adaptive potential for RV that can vary according to natural history (Morimoto et al., 1998; Kissi et al., 1999).

Indigenous bat rabies was first recognized in the USA in 1953 (Scatterday & Galton, 1954), although the exact age of its emergence remains uncertain. During 2001, bats were responsible for 17 % of reported wildlife rabies cases and all three cases of human rabies in the USA (Krebs et al., 2002). In North America, rabid bats have been found for almost every species that has been thoroughly investigated (Constantine, 1979). The genetic diversity of current bat RV variants suggests that up to 30 different lineages of RV may exist in the USA alone (Smith, 2002). A more conservative assessment of RV diversity in the Americas suggests 13 distinct RV variants (Nadin-Davis et al., 2001). In the USA during 2001, bat rabies where the host could be identified at species level was dominated by two host species: Eptesicus fuscus (big brown bat; 47·1 %) and Tadarida brasiliensis (Mexican free-tailed bat; 28·3 %). In addition, small numbers of cases were attributed to Lasiurus cinereus (hoary bat; 5·5 %), Lasiurus borealis (red bat; 4·3 %), Myotis lucifugus (little brown bat; 3·7 %), Lasionycteris noctivagans (silver-haired bat; 3·0 %) and Pipistrellus subflavus (eastern pipistrelle; 1·2 %). In Canada, with the exception of T. brasiliensis, the species of bats most often found rabid are identical to the USA (Nadin-Davis et al., 2001). The emergence of the lineage of RV associated with infection of Lasionycteris noctivagans and P. subflavus, which has caused a disproportionately high number of human cases, has been attributed to an increased viral infectivity (Morimoto et al., 1996; Messenger et al., 2003), although this hypothesis remains largely unproven.

Timing the divergence of viral lineages relies on an accurate estimation of the rate of nucleotide substitution and subsequent application of a molecular clock. Computational methods to estimate substitution rate are problematic (Drummond et al., 2003) and the validity of applying clock-like models to viral sequence data is contentious (Holmes, 2003). Recently, Bayesian techniques using Markov Chain Monte Carlo (MCMC) methods have been developed for estimating substitution rates from dated sequences (Drummond et al., 2002) and have been successfully applied to RV where estimations of evolutionary rate and subsequent divergence times were shown to be both comparable to maximum-likelihood methods and supported by field prevalence data (Hughes et al., 2004).

Here, we have applied computational methods to nucleotide sequence data from the N gene of RV. We used distinct datasets, which represented all of the major bat RV variants in the USA and Canada, to estimate rates of molecular evolution for individual variants. In turn, using complete N gene sequences from North American bat RV variants, together with bat species associated with Latin America, we reconstructed the timescale for the evolution of bat RV variants in North America.


   METHODS
Top
ABSTRACT
INTRODUCTION
METHODS
RESULTS
DISCUSSION
REFERENCES
 
Datasets.
We used nine sets of partial N gene sequences (all of which could be differentiated by either host species or geography) and an alignment of complete N gene sequences, all of which were bat-associated RV variants from North, South and Central America (Table 1). Within each dataset, identical sequences with the same year of isolation were removed. Partial N gene sequences from the USA were obtained directly from the Centers for Disease Control and Prevention (CDC) RV archival database (sequences used in this study are to be submitted to GenBank) and corresponded to nt 1107–1350 (244 nt) of the Pasteur RV N gene (GenBank accession no. M13215; Tordo et al., 1986). An alignment of RV sequences from Desmodus rotundus (vampire bat) covering the same region of the N gene as the US sequences was generated using data obtained from GenBank (accession nos AB083804, AB083805, AB083807–AB083809, AB083811, AB083813, AB083814, AB083818, AF045166, AF351847, AF351852, U22478, U22479). Partial N gene sequences from Canada were obtained from GenBank and have been described in detail elsewhere (Nadin-Davis et al., 2001). Canadian RV sequences corresponded to nt 287–670 (384 nt) of the Pasteur RV N gene. Complete N gene sequences were taken from GenBank (accession nos AF045166, AF351827–AF351829, AF351831–AF351848, AF351851–AF351862, AF394872–AF394876, AF394879–AF394886, AY039224, AY039226–AY039229, U22478).


View this table:
[in this window]
[in a new window]
 
Table 1. Datasets used in this study

 
Positive selection analysis.
Positive selection analysis was performed using various models of codon substitution (Yang et al., 2000) implemented using the CODEML program of the PAML package (Yang, 1997). Various maximum-likelihood (ML) models that have varying constraints on the values of non-synonymous substitutions per non-synonymous site (dn) and synonymous substitutions per synonymous site (ds) and their ratio ({omega}) were applied to each dataset. Models allowing for positive selection (i.e. {omega}>1) are nested within models that do not allow for positive selection. This allows the significance of the fit for positive selection models to be tested using the likelihood ratio test. Positive selection is inferred if the positive selection model has a significantly higher likelihood than the null model and a value of {omega}>1 is estimated. If evidence of positive selection is suggested, Bayesian methods are used to identify which individual codons fall into the {omega}>1 class. For CODEML analysis, sequence alignments were trimmed to include only complete non-stop codons. For each set of sequences, an ML tree generated using PHYLIP (Felsenstein, 1981) was used for positive selection analysis. Model testing using CODEML was performed using codon substitution models described elsewhere (Woelk et al., 2002). Where no evidence of positive selection was detected, the value for {omega} (between the bounds of 0 and 1) from M0 was taken as the mean value for the alignment.

Estimation of the rate of molecular evolution.
Estimates of the rate of molecular evolution (substitutions per site per year; µ) were obtained using the BEAST program (available at http://www.evolve.zoo.ox.ac.uk). This program uses a Bayesian MCMC method that requires no assumptions be made regarding tree topology (Drummond et al., 2002). Previously, we have employed this method for RV sequences where the results were found to be comparable to ML-based methods (Hughes et al., 2004). In addition, for feline immunodeficiency virus, the MCMC method has been shown to produce shorter confidence limits than those based on ML (Biek et al., 2003).

In each case, an input file for BEAST was generated using the BEAUti program (available at http://www.evolve.zoo.ox.ac.uk) with sequences dated with the year of isolation. Analysis was performed using the HKY85 substitution model (Hasegawa et al., 1985) without site rate heterogeneity. A coalescent model (prior) of constant population size was used. MCMC analysis was optimized using the criteria suggested in the program's documentation. This included an operator acceptance probability of ~25 % and an effective sample size of >100. BEAST output was assessed using the Tracer program (available at http://www.evolve.zoo.ox.ac.uk). The distribution shape for estimated values of µ was checked to ensure that adequate sampling of the chain had taken place. For each estimation, three replicate BEAST runs were performed to test the repeatability of the analysis.

To test for possible variation in estimated substitution rates along the N gene, we used a sliding window analysis to assess region-specific rates and corresponding values of sequence diversity. Nucleotide diversity ({pi}) was calculated from the alignment of 53 complete N gene sequences of bat RV variants using the computer program DnaSP (Rozas et al., 2003). For estimation of both µ and {pi}, a sliding window of 250 nt with a step size of 100 was used. Values of µ were then calculated from individual alignments of 250 nt as described above.

Molecular clock testing and estimation of divergence times.
The hypothesis of a molecular clock (i.e. a constant rate of nucleotide substitution across the whole tree) was tested using ML trees generated by the program TipDate (Rambaut, 2000). In each case, an ML tree was generated from PHYLIP and used as input. For each dataset, an ML tree was generated in TipDate under the assumption of a single rate of nucleotide substitution using dated tips (SRDT model) and the likelihood of this tree was compared with that of an ML tree generated without the assumption of a molecular clock (where the substitution rate is allowed to vary for individual branches; DR model). Trees were generated using the HKY85 model of nucleotide substitution with transition to transversion ratios estimated directly from the data. For SRDT trees, rates of substitution estimated from BEAST analysis were used to scale trees and the ML root estimated by TipDate. The likelihoods of the SRDT and DR models were then compared using a likelihood ratio test as described elsewhere (Rambaut, 2000). The assumption of the molecular clock was not rejected if the likelihood of the SRDT tree was not significantly worse (P<0·05) than that of the DR tree. For the complete N gene alignment, SRDT trees were generated using the mean and upper and lower 95 % confidence limits (CL) of the BEAST-estimated values of µ. SRDT trees scaled according to time were used to estimate times of the most recent common ancestor (TMRCA) and divergence times of major lineages. An additional estimate of the TMRCA was made using BEAST by performing the MCMC analysis without dated tips using the mean and upper and lower 95 % CL of the values of µ to scale trees.

Simulations.
In order to assess the ability of the MCMC method to accurately infer rates, we simulated nucleotide evolution with pre-defined rates and tested the precision of the MCMC method using varying numbers of taxa and ranges of isolation dates. The evolution of DNA sequences using defined substitution rates was simulated using the program Seq-Gen (Rambaut & Grassly, 1997). Phylogenetic trees generated by the program COMPONENT (available at http://taxonomy.zoology.gla.ac.uk) using pre-defined numbers of taxa were used as input for Seq-Gen. Two sets of simulations were performed to assess the precision of the MCMC method to estimate substitution rates: (i) to test the effect of the period of isolation (time between first isolation and last) for a fixed number of taxa (n=50) with varying substitution rate; and (ii) to test the effect of the number of taxa over a fixed isolation period (25 years) with varying substitution rate. For all simulations, sequences of 250 nt in length were generated using the HKY85 model of nucleotide substitution.

To test the effect of varying the number of taxa, sequences were removed at random from the COMPONENT tree to ensure that topological relationships between sequences were retained (although the process of random removal meant that overall topology was not identical). Where differing isolation periods were required (but with the same number of taxa), trees were rescaled according to the timescale required for simulation. Seq-Gen-generated alignments were then used as input for BEAST as described for the biological datasets.


   RESULTS
Top
ABSTRACT
INTRODUCTION
METHODS
RESULTS
DISCUSSION
REFERENCES
 
Positive selection analysis
A mean value for {omega} (taken from M0) for each dataset is shown in Table 1. Although there was considerable variation in {omega} between datasets (from 0·02 for CN Eptesicus to 0·14 for US RAC), there was no evidence of positive selection for any dataset using codon substitution models in CODEML.

Simulations of sequence evolution
The results of BEAST analysis of simulated datasets are shown in Fig. 1. For ease of interpretation, results are shown as the 95 % confidence range deviation from the set values of µ for which the simulated alignment was generated. Where limited diversity was present within an alignment (i.e. lower substitution rates, limited date range and low numbers of sequences), the certainty around the mean value of µ became unstable. With a substitution rate of less than <0·0005, the 95 % confidence range differed by more than 1 log from the value used to generate the data, indicating the inability of the MCMC method to determine µ precisely.



View larger version (16K):
[in this window]
[in a new window]
 
Fig. 1. Variation in the range of estimated substitution rates for simulated data using pre-defined values of µ. Data were generated using varying ranges of isolation dates with a fixed number of taxa (n=50) (a) and varying numbers of taxa using a fixed period of isolation (25 years) (b). Individual data points show the 95 % confidence interval deviation from the pre-defined value of µ. Missing data points indicate failure of BEAST to estimate values of µ due to the limited amount of variation generated by low substitution rates.

 
When genetic distance was acquired at slower rates, the accuracy of the MCMC method decreased dramatically; increasing the number of taxa and the range of isolation dates (therefore increasing the level of nucleotide diversity with time) enhanced the ability of the MCMC method to estimate rates precisely. Overall, the simulations supported the ability of the method to determine rates of viral evolution, but did show that better-suited data (an optimal number of taxa, i.e. that is not too computationally expensive, and a large range of isolation dates) dramatically increased the certainty of the mean value.

Estimation of the rate of molecular evolution and molecular clock testing
Estimated values of µ for each dataset, with 95 % CL, are shown in Fig. 2. For seven of the nine partial N gene datasets, the assumption of a molecular clock was not significantly rejected (Table 2). Rejection of the molecular clock for US Eptesicus, CN Eptesicus and bat complete N was highly significant (P<0·001). In all three cases where the SRDT model was significantly rejected by the DR model, the likelihood of the SRDT model was higher than that using a single rate without dated tips model (SR) (results not shown). The improved likelihood of the SRDT model over the SR model has been suggested as an indication that rates and divergence times inferred from the SRDT model reflect the mean values for the dataset and are appropriate to use for further analysis (Jenkins et al., 2002; Twiddy et al., 2003). In addition, it has been demonstrated that rejection of the molecular clock model due to low-rate heterogeneity can still yield reliable estimates of evolutionary rate (Jenkins et al., 2002) and divergence times similar to those produced by a relaxed molecular clock model (Lemey et al., 2004).



View larger version (8K):
[in this window]
[in a new window]
 
Fig. 2. Estimated rates of molecular evolution. Substitution rates were calculated using an MCMC-based method and are shown on a logarithmic scale. Datasets are arranged in descending order of mean substitution rate. Open circles indicate that the assumption of a molecular clock was significantly rejected (P<0·001). Error bars show 95 % CL of the mean.

 

View this table:
[in this window]
[in a new window]
 
Table 2. Likelihood ratio testing for the assumption of a molecular clock

 
The results of sliding window analysis along the N gene are shown in Fig. 3. Although the analysis showed variation in both {pi} and µ along the N gene, there was no significant correlation between {pi} and µ (Spearman's rank correlation coefficient, rs=0·098, n=12, P=0·762), {pi} and the 95 % confidence interval range of µ (rs=0·150, n=12, P=0·700), {pi} and midpoint position (rs=0·070, n=12, P=0·829) and µ and midpoint position (rs=–0·091, n=12, P=0·779). The lack of a significant correlation between {pi} and µ suggests that estimating evolutionary rates using the MCMC method is not influenced by the variability within the gene (or gene fragment) but by the phylogenetic position of individual taxa. No value of µ was greatly different to the estimate for the complete N gene, and although seven of the 12 sliding window values were greater than the upper 95 % CL of the complete N gene value of µ, three alignments estimated extremely small lower CL, together suggesting that the mean value for the complete N gene represented a good estimation of µ (and its variability) across the whole gene.



View larger version (14K):
[in this window]
[in a new window]
 
Fig. 3. Sliding window analysis using an alignment of 53 complete bat RV variant N genes (window size 250 nt, step size 100). Closed circles show substitution rates and 95 % CL estimated from BEAST. Open circles indicate where lower CL were exceptionally small. Open squares show the nucleotide diversity calculated from DnaSP. Broken lines show the mean and 95 % CL of the substitution rate calculated using the complete N gene sequences (1353 nt).

 
TMRCA and the timescale of bat RV variant evolution
The SRDT tree for the complete N gene alignment, generated using the mean value of µ estimated by BEAST (2·3x10–4), is shown in Fig. 4. The TMRCA for current bat RV variant diversity in the Americas (based on the complete N gene alignment) was estimated to be 1651 (range 1254–1773) from the ML SRDT tree and 1660 (range 1267–1782) from MCMC-based analysis (BEAST). The inclusion of rate heterogeneity in the HKY model gave an identical mean value of µ for the complete N gene dataset and did not significantly alter the estimated TMRCA (results not shown).



View larger version (15K):
[in this window]
[in a new window]
 
Fig. 4. ML SRDT tree for complete bat N gene sequences scaled to time using the mean value of µ estimated from BEAST. Sequence names have been altered to include the year of isolation.

 
The divergence times of individual lineages were estimated from the ML SRDT tree from TipDate. The first branching from the parental virus is predicted to have been the divergence of an RV that gave rise to the current variants associated with infection of T. brasiliensis and D. rotundus, i.e. bats geographically associated with Latin America and the southern USA. It appears that adaptation to T. brasiliensis and D. rotundus was a relatively long process; current diversity of RV isolates associated with these two bat species has a TMRCA of 1837 (range 1656–1892), suggesting that the progenitor viruses for the RV variants associated with infection of these two species circulated for ~160 years before being displaced by the evolution of variants better adapted to species-specific transmission.

Within North America, compartmentalization of RV into lineages associated with infection of solitary bat species (hosts for the LnPs and L-clade viruses) and colonial bat species (E. fuscus, Myotis spp.) is estimated to have occurred 29 years (range 18–62) after the divergence of the Tadarida/Desmodus lineage. In turn, each of the two North American lineages underwent further division to become the four extant RV variants that have been analysed in this study and which together represent the majority of current bat rabies cases in the USA (Krebs et al., 2002). Further division of the LnPs/L-clade lineage is predicted to have occurred during 1850 (range 1683–1897), with the development of the LnPs and L-clade lineages. The establishment of RV variants associated with E. fuscus and Myotis spp. occurred much earlier, around 1768 (range 1507–1847), indicating a much shorter period of adaptation than was required for the Tadarida/Desmodus and LnPs/L-clade lineages.


   DISCUSSION
Top
ABSTRACT
INTRODUCTION
METHODS
RESULTS
DISCUSSION
REFERENCES
 
Currently, the validity of applying a molecular clock to RNA virus evolution is unclear (Holmes, 2003). Saturation of synonymous mutations (leading to potential underestimations of substitution rates; Holmes, 2003), recombination (reviewed by Awadalla, 2003), RNA secondary structure (Simmonds & Smith, 1999) and selection pressures can undoubtedly all contribute to misleading estimates of evolutionary rate. The lack of recombination found for complete bat N genes (Haydon et al., 2004), the fact that the RV genome is complexed with the N (Tordo & Kouknetzoff, 1993) and thus not subject to constraints on RNA secondary structure, and the strong evidence for purifying selection found in this study and elsewhere (Kissi et al., 1995; Holmes et al., 2002; Kuzmin et al., 2004) imply that the RV N gene is suitable for estimations of evolutionary rate and subsequent analysis of evolutionary history. Here, we assumed that the substitution rate of the RV N gene has remained constant over the course of the timescale measured (349–746 years). Although RV has undoubtedly undergone periods of intense host adaptation, the strong evidence for purifying selection suggests that fitness fluctuations should not have affected the evolutionary dynamics of the N gene.

In this study we limited grouping of sequences (as much as was possible given the lack of sequence data available for a number of variants) to those that represented, as accurately as could be determined, distinct lineages of RV (Smith, 2002). The lack of sustained spillover between RV variants (Smith, 2002; Krebs et al., 2002) ensured that our partial N gene datasets represented distinct RV transmission networks. Of the nine partial N gene datasets (each representing a distinct RV variant), seven produced trees under the assumption of a molecular clock (SRDT model) that were not significantly less likely than trees generated using the non-clock model (DR), i.e. the model of a molecular clock was not rejected. The biological significance of datasets for which the clock was significantly rejected is unclear, although both are RV variants associated with infection of the same bat species (E. fuscus).

The relative consistency of µ estimations between datasets supports the hypothesis of a molecular clock governing evolution of the RV N gene, irrespective of the host species. Although rates do vary by host species, there is considerable overlap in the CL for individual values. In fact, the similarity of estimated values of µ (range 2·32x10–4–1·38x10–3) suggested that the number of replication cycles per unit of time was similar for the RV variants studied here. Additionally, our estimates of substitution rate (i.e. both dn and ds combined) were close to previously estimated ds rates for the N gene (5·27x10–4; Holmes et al., 2002) and the glycoprotein gene (3·1x10–4–5·5x10–4; Badrane & Tordo, 2001). This was no doubt due to the lack of positively selected codons along the N gene found here and elsewhere (Holmes et al., 2002; Kuzmin et al., 2004).

Using complete bat RV variant N gene sequences, we have been able to reconstruct the evolutionary history of bat rabies in the Americas. Although the molecular clock model for the complete N gene alignment was rejected (probably in part due to both the amalgamation of RV variants with subtly different substitution rates and minor variation in µ along the N gene), we have used the SRDT tree for evolutionary timescaling as we believe the mean µ for this alignment to represent the best available approximation of substitution rate for the N gene. Additionally, there is good concordance between TMRCA estimates for the complete N gene alignment generated using ML (SRDT trees) and the MCMC-based methods.

There is some variation in previous estimates of the TMRCA of existing worldwide RV diversity. The establishment of non-chiropteran rabies (globally) and the development of RV variants in the Americas have been dated to have occurred between 542 and 1113 (Badrane & Tordo, 2001). Another study estimated the TMRCA of current global RV diversity to be within the last 500 years, implying that RV variants associated with New World bats originated from RV introduced during European colonization (Holmes et al., 2002), a hypothesis not refuted by our estimation of the TMRCA of current bat RV diversity in the Americas.

The results of our TMRCA analysis suggest that RV associated with infection of Latin American bats (D. rotundus and the migratory T. brasiliensis) share an earlier common ancestor with the progenitor RV responsible for seeding bat rabies in North America compared with those variants associated with non-Tadarida North American bat species. The presence of bat rabies in the Americas prior to European colonization may have been due to infection with extinct lineages of RV, the diversity of which is no longer evident, although this hypothesis is not testable. Whether or not current North American bat rabies is directly descended from an extinct, pre-Tadarida/Desmodus RV lineage is also uncertain. The position of terrestrial RV variants in this evolutionary pathway is also unclear and will require considerable further analysis.

While rabies in carnivores was well appreciated in the Old World for thousands of years, most of the earliest explorers remarked upon its notable absence in ‘New Spain’, at least in dogs, until the early 1700s (Baer et al., 1996). Several non-mutually exclusive mechanisms have been proposed for the arrival of rabies in the New World. Ancestral lyssaviruses may already have been present in the rudiments of various warm-blooded hosts in the super continent of Pangea, as various rhabdovirus lineages diverged and radiated among plants, invertebrates and vertebrates (Shope, 1975; Rupprecht et al., 1991). Much more recently, the disease may have entered through terrestrial animal migrations via the ancient Bering land bridge (Crandell, 1975). During viral incubation periods that can encompass months, infected animals may have rafted along coastlines on storm debris or flotsam, stowed away on primitive human conveyances or been brought intentionally for translocated domestic purposes (Smith & Seidel, 1993). In addition, rabies may have entered directly by flight from infected bats, given their unique and considerable aerodynamic abilities (Constantine, 2003). However, these pre-historic contributions of bats to rabies in the Americas and associations with particular fauna are largely speculative, with the exception of evidence based on fossil records of vampire bats.

Based on the fossil record, several species of vampire bats were once widely distributed throughout North America. During the Quaternary period, many suitable large-bodied and abundant mammalian species may have served as suitable prey for vampire bats but perished during the Pleistocene extinctions. As such, barring any major impact of climate change or a shortage of suitable roosts such as caves, a post-Pleistocene ecological bottleneck may have been created due to a relative shortage in blood supply for vampires during pre-Columbian times (Ray et al., 1988). With the colonization of the New World, the introduction of cattle should have allowed the local expansion of haematophagus bat populations. For example, in modern times, Desmodus populations are larger in livestock areas of Argentina than in non-cattle-rearing regions (Delpietro & Russo, 1993). Although not identified as rabies at the time, Spanish colonists recognized a link between bat bites and death in humans and animals as early as the 1500s and throughout the intervening centuries, but such observations were not attributed to rabies until the 1900s in Brazil and elsewhere (Baer & Smith, 1991). Extant vampires do not hibernate, do not migrate and are currently limited in distribution from Mexico to Argentina, approximated by a 10 °C winter isotherm (McNab, 1973). The entry of RV into colder latitudes would have required a more tolerant migratory species, such as Tadarida. The association of the Desmodus/Tadarida lineage and subsequent formation of the bat RV variants circulating today is supported by the results of this study. Of course, other bat species may have been equally or more important. The prominence of considerable cattle losses through blood feeding by rabid vampires provided an incentive worthy of study and notation, whereas rabies in insectivorous or frugivorous species may have gone largely undetected, except by accident.

Following the introduction of RV into North American bats, our analysis suggests that there was rapid adaptation to new host species. The evolution of RV variants associated with colonial species (E. fuscus, Myotis spp.) occurred first (~1768), approximately 70 years before the compartmentalization of Tadarida/Desmodus-associated variants and 80 years before the compartmentalization of LnPs/L-clade-associated variants. Perhaps the biology of colonial bat species (a higher density of susceptible contacts) ensured a greater number of replication cycles with time and thus a speedier adaptation process than for solitary species (where contacts with susceptible hosts would be less frequent). Of course, if such a situation were true, it would ensure higher levels of nucleotide substitution during the adaptive period, in turn suggesting that the sampling period for isolates used in this study is beyond the period of intensive adaptation. Whether or not such a process occurs in actuality will only be testable should nucleotide sequence data be available over the course of an analogous period of adaptation. Given the current epidemiological situation of bat rabies in North America and its apparent recent endemic stability among chiropteran hosts, the development of such an extant scenario appears remote, barring major environmental perturbations on a continental scale, such as more dramatic climatic change or enhanced habitat degradation.


   ACKNOWLEDGEMENTS
 
The authors thank Jean Smith for retrieval of sequence data, Andrew Rambaut for continued guidance with sequence analysis software and Dan Haydon for helpful comments on the manuscript. G. J. H. was funded by an American Society for Microbiology/National Center for Infectious Diseases post-doctoral fellowship.


   REFERENCES
Top
ABSTRACT
INTRODUCTION
METHODS
RESULTS
DISCUSSION
REFERENCES
 
Awadalla, P. (2003). The evolutionary genomics of pathogen recombination. Nat Rev Genet 4, 50–60.[CrossRef][Medline]

Badrane, H. & Tordo, N. (2001). Host switching in Lyssavirus history from the Chiroptera to the Carnivora orders. J Virol 75, 8096–8104.[Abstract/Free Full Text]

Baer, G. M. & Smith, J. S. (1991). Rabies in nonhematophagous bats. In The Natural History of Rabies, 2nd edn, pp. 341–366. Edited by G. M. Baer. Boca Raton, FL: CRC Press.

Baer, G. M., Neville, J. & Turner, G. S. (1996). Rabbis and Rabies. Mexico City: de Imprenta Marsella.

Biek, R., Rodrigo, A. G., Holley, D., Drummond, A., Anderson, C. R., Jr, Ross, H. A. & Poss, M. (2003). Epidemiology, genetic diversity, and evolution of endemic feline immunodeficiency virus in a population of wild cougars. J Virol 77, 9578–9589.[Abstract/Free Full Text]

Constantine, D. G. (1979). An updated list of rabies-infected bats in North America. J Wildl Dis 15, 347–349.[Medline]

Constantine, D. G. (2003). Geographic translocation of bats: known and potential problems. Emerg Infect Dis 9, 17–21.[Medline]

Crandell, R. A. (1975). Arctic fox rabies. In The Natural History of Rabies, 1st edn, pp. 23–40. Edited by G. M. Baer. New York: Academic Press.

Delpietro, H. A. & Russo, R. G. (1996). Ecological and epidemiologic aspects of the attacks by vampire bats and paralytic rabies in Argentina and analysis of the proposals carried out for their control. Rev Sci Tech 15, 971–984.[Medline]

Domingo, E. & Holland, J. J. (1994). Mutation rates and rapid evolution of RNA viruses. In The Evolutionary Biology of Viruses, pp. 161–184. Edited by S. S. Morse. New York: Raven Press.

Drummond, A. J., Nicholls, G. K., Rodrigo, A. G. & Solomon, W. (2002). Estimating mutation parameters, population history and genealogy simultaneously from temporally spaced sequence data. Genetics 161, 1307–1320.[Abstract/Free Full Text]

Drummond, A., Pybus, O. G. & Rambaut, A. (2003). Inference of viral evolutionary rates from molecular sequences. Adv Parasitol 54, 331–358.[CrossRef][Medline]

Felsenstein, J. (1981). Evolutionary trees from DNA sequences: a maximum likelihood approach. J Mol Evol 17, 368–376.[Medline]

Hasegawa, M., Kishino, H. & Yano, T. (1985). Dating the human–ape splitting by a molecular clock of mitochondrial DNA. J Mol Evol 22, 160–174.[Medline]

Haydon, D. T., Bastos, A. D. S. & Awadalla, A. (2004). Low linkage disequilibrium indicative of recombination in foot-and-mouth disease virus gene sequence alignments. J Gen Virol 85, 1095–1110.[Abstract/Free Full Text]

Holmes, E. C. (2003). Molecular clocks and the puzzle of RNA virus origins. J Virol 77, 3893–3897.[Free Full Text]

Holmes, E. C., Woelk, C. H., Kassis, R. & Bourhy, H. (2002). Genetic constraints and the adaptive evolution of rabies virus in nature. Virology 292, 247–257.[CrossRef][Medline]

Hughes, G. J., Páez, A., Boshell, J. & Rupprecht, C. E. (2004). A phylogenetic reconstruction of the epidemiological history of canine rabies virus variants in Colombia. Infect Genet Evol 4, 45–51.[CrossRef][Medline]

Jenkins, G. M., Rambaut, A., Pybus, O. G. & Holmes, E. C. (2002). Rates of molecular evolution in RNA viruses: a quantitative phylogenetic analysis. J Mol Evol 54, 156–165.[CrossRef][Medline]

Kissi, B., Tordo, N. & Bourhy, H. (1995). Genetic polymorphism in the rabies virus nucleoprotein gene. Virology 209, 526–537.[CrossRef][Medline]

Kissi, B., Badrane, H., Audry, L., Lavenu, A., Tordo, N., Brahimi, M. & Bourhy, H. (1999). Dynamics of rabies virus quasispecies during serial passage in heterologous hosts. J Gen Virol 80, 2041–2050.[Abstract/Free Full Text]

Krebs, J. W., Noll, H. R., Rupprecht, C. E. & Childs, J. E. (2002). Rabies surveillance in the United States during 2001. J Am Vet Med Assoc 221, 1690–1701.[Medline]

Kuzmin, I. V., Botvinkin, A. D., McElhinney, L. M., Smith, J. S., Orciari, L. A., Hughes, G. J., Fooks, A. R. & Rupprecht, C. E. (2004). Molecular epidemiology of terrestrial rabies in the former Soviet Union. J Wildl Dis 40, 617–631.[Abstract/Free Full Text]

Lemey, P., Pybus, O. G., Rambaut, A., Drummond, A. J., Robertson, D. L., Roques, P., Worobey, M. & Vandamme, A.-M. (2004). The molecular population genetics of HIV-1 group O. Genetics 167, 1059–1068.[Abstract/Free Full Text]

McNab, B. K. (1973). Energetics and the distribution of vampires. J Mammal 15, 131–144.

Messenger, S. L., Smith, J. S., Orciari, L. A., Yager, P. A. & Rupprecht, C. E. (2003). Emerging pattern of rabies deaths and increased viral infectivity. Emerg Infect Dis 9, 151–154.[Medline]

Morimoto, K., Patel, M., Corisdeo, S., Hooper, D. C., Fu, Z. F., Rupprecht, C. E., Koprowski, H. & Dietzschold, B. (1996). Characterization of a unique variant of bat rabies virus responsible for newly emerging human cases in North America. Proc Natl Acad Sci U S A 93, 5653–5658.[Abstract/Free Full Text]

Morimoto, K., Hooper, D. C., Carbaugh, H., Fu, Z. F., Koprowski, H. & Dietzchold, B. (1998). Rabies virus quasispecies: implications for pathogenesis. Proc Natl Acad Sci U S A 95, 3152–3156.[Abstract/Free Full Text]

Nadin-Davis, S. A., Huang, W., Armstrong, J., Casey, G. A., Bahloul, C., Tordo, N. & Wandeler, A. I. (2001). Antigenic and genetic divergence of rabies viruses from bat species indigenous to Canada. Virus Res 74, 139–156.[CrossRef][Medline]

Rambaut, A. (2000). Estimating the rate of molecular evolution: incorporating non-contemporaneous sequences into maximum likelihood phylogenies. Bioinformatics 16, 395–399.[Abstract]

Rambaut, A. & Grassly, N. C. (1997). Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees. Comput Appl Biosci 13, 235–238.[Abstract]

Ray, C. E., Linares, O. J. & Morgan, G. S. (1988). Paleontology. In Natural History of Vampire Bats, pp. 19–30. Edited by A. M. Greenhall & U. Schmidt. Boca Raton, FL: CRC Press.

Rozas, J., Sánchez-DelBarrio, J. C., Messeguer, X. & Rozas, R. (2003). DnaSp, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19, 2496–2497.[Abstract/Free Full Text]

Rupprecht, C. E., Dietzschold, B., Wunner, W. H. & Koprowski, H. (1991). Antigenic relationships of lyssaviruses. In The Natural History of Rabies, 2nd edn, pp. 69–100. Edited by G. M. Baer. Boca Raton, FL: CRC Press.

Scatterday, J. E. & Galton, M. M. (1954). Bat rabies in Florida. Vet Med 49, 133–135.

Shope, R. (1975). Rabies virus antigenic relationships. In The Natural History of Rabies, 1st edn, pp. 141–152. Edited by G. M. Baer. New York: Academic Press.

Simmonds, P. & Smith, D. B. (1999). Structural constraints on RNA virus evolution. J Virol 73, 5787–5794.[Abstract/Free Full Text]

Smith, J. S. (2002). Molecular epidemiology. In Rabies, pp. 79–111. Edited by A. C. Jackson & W. H. Wunner. New York: Academic Press.

Smith, J. S. & Seidel, H. D. (1993). Rabies: a new look at an old disease. Prog Med Virol 40, 82–106.[Medline]

Tordo, N. & Kouknetzoff, A. (1993). The rabies virus genome: an overview. Onderstepoort J Vet Res 60, 263–269.[Medline]

Tordo, N., Poch, O., Ermine, A., Keith, G. & Rougeon, F. (1986). Walking along the rabies genome: is the large G-L intergenic region a remnant gene? Proc Natl Acad Sci USA 83, 3914–3918.[Abstract]

Twiddy, S. S., Holmes, E. C. & Rambaut, A. (2003). Inferring the rate and time-scale of dengue virus evolution. Mol Biol Evol 20, 122–129.[Abstract/Free Full Text]

Woelk, C. H., Pybus, O. G., Jin, L., Brown, D. W. & Holmes, E. C. (2002). Increased positive selection in persistent (SSPE) versus acute measles virus infections. J Gen Virol 83, 1419–1430.[Abstract/Free Full Text]

Yang, Z. (1997). PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci 13, 555–556.[Medline]

Yang, Z., Nielsen, R., Goldman, N. & Pedersen, A. M. (2000). Codon-substitution models for heterogeneous selection pressures at amino acid sites. Genetics 155, 431–449.[Abstract/Free Full Text]

Received 22 October 2004; accepted 7 February 2005.



This Article
Abstract
Full Text (PDF)
Alert me when this article is cited
Alert me if a correction is posted
Citation Map
Services
Email this article to a friend
Similar articles in this journal
Similar articles in PubMed
Alert me to new issues of the journal
Download to citation manager
Google Scholar
Articles by Hughes, G. J.
Articles by Rupprecht, C. E.
Articles citing this Article
PubMed
PubMed Citation
Articles by Hughes, G. J.
Articles by Rupprecht, C. E.
Agricola
Articles by Hughes, G. J.
Articles by Rupprecht, C. E.


HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS
INT J SYST EVOL MICROBIOL MICROBIOLOGY J GEN VIROL
J MED MICROBIOL ALL SGM JOURNALS