Department of Zoology, Brigham Young University
In a recent letter in Molecular Biology and Evolution, Schierup and Hein (2000a) showed that the likelihood ratio test (LRT) of the molecular clock (Felsenstein 1981) "wrongly" rejects the clock hypothesis when recombination has occurred. However, this result should not be taken as a failure of the LRT. In the presence of recombination, the history of the sequences is often described not by one single tree but by several, so the LRT is correctly rejecting the null hypothesis actually tested: that the data are evolving under a clock on one single tree.
To appropriately test the clock hypothesis in the presence of recombination, we need a test that is independent of tree topology. Muse and Weir (1992) proposed a triplet likelihood ratio test for equality of evolutionary rates in two species at a time, using a third species as an outgroup (to avoid confusion, I will call this test the relative-rate test [RRT]). The RRT is therefore independent of topology and might be used for potentially recombinant sequences if the selected outgroup did not recombine with the ingroup. Here, the performance of the RRT with recombinant data is presented.
Recombinant alignments were simulated using the coalescent with recombination (Hudson 1983). A program written in C for this purpose is available from the author. Alignments of 11 sequences (10 recombining ingroup sequences and 1 nonrecombining outgroup) of 1,000 nt were evolved with a molecular clock under the Jukes-Cantor (JC) model of evolution (Jukes and Cantor 1969). For each level of recombination and diversity, 1,000 replicates were generated. A maximum-likelihood (ML) tree was estimated for each simulated data set under the JC+Γ model of evolution in PAUP* (Swofford 1998) without assuming a clock. The Γ distribution for rate variation among sites (Yang 1993) was included because recombination is known to introduce such rate heterogeneity (Schierup and Hein 2000b), and the likelihood increases significantly when this variation is accounted for. For the ML tree, the likelihood under the unconstrained model, where each lineage is allowed to have its own rate (alternative hypothesis), was compared with the likelihood obtained when the molecular clock was enforced (null hypothesis). If the data are evolving under a clock, the difference in likelihood between these two models should be close to zero. To establish statistical significance, twice the difference in log likelihood is assumed to be distributed as a χ² with n − 2 degrees of freedom, where n is the number of sequences (Felsenstein 1981). The likelihoods for the LRTs were calculated in PAUP*, while the RRTs were performed with HYPHY (Kosakovsky and Muse 2000). In the latter case, multiple tests were calculated for each data set, and the Bonferroni correction was applied to avoid an increase in false positives. If any of the pairwise tests for a given data set was significant, the RRT was considered to reject the clock hypothesis for that data set.
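The LRT decision rule described above can be illustrated with a minimal Python sketch. It is not the PAUP* or HYPHY procedure used for the analyses; the function names `chi2_sf` and `clock_lrt` and the log-likelihood values in the usage lines are hypothetical, and the χ² tail probability is computed with the standard closed-form series for integer degrees of freedom so that only the standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term = 1.0
        total = 1.0
        for r in range(1, df // 2):
            term *= (x / 2.0) / r
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: normal tail plus a finite sum of terms x^(r - 1/2) / (1*3*...*(2r-1)).
    total = 0.0
    term = math.sqrt(x)
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    tail = math.erfc(math.sqrt(x / 2.0))
    return tail + math.sqrt(2.0 / math.pi) * math.exp(-x / 2.0) * total


def clock_lrt(lnl_unconstrained, lnl_clock, n_sequences, alpha=0.05):
    """Compare twice the log-likelihood difference to a chi-square with n - 2 df."""
    statistic = 2.0 * (lnl_unconstrained - lnl_clock)
    df = n_sequences - 2
    p_value = chi2_sf(statistic, df)
    return statistic, df, p_value, p_value < alpha


# Hypothetical log likelihoods for one data set of 11 sequences (illustration only).
statistic, df, p_value, reject = clock_lrt(-2500.0, -2512.5, n_sequences=11)
print(statistic, df, reject)
```

With 11 sequences, the statistic is referred to a χ² with 9 degrees of freedom, so the clock is rejected at the 5% level when twice the log-likelihood difference exceeds about 16.9.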
As Schierup and Hein (2000b) previously demonstrated, the LRT rejected the molecular-clock hypothesis even with low levels of recombination (table 1). Increasing divergence made the LRT more prone to reject the molecular clock. However, when the RRT was used, results were very different, and increasing levels of recombination did not affect the rejection levels (table 1). The RRT rejected the clock hypothesis (after the Bonferroni correction) less than 5% of the time (around 2%–3%), so it is a conservative test. This could be due to the lack of power of relative-rate tests under some conditions (Bromham et al. 2000), but it could also be due to the conservative Bonferroni correction (Rice 1989). More powerful Bonferroni-type corrections exist that could easily be applied to single data sets (Hochberg 1988).
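To make the contrast between the two corrections concrete, the following Python sketch applies the Bonferroni correction and the step-up procedure of Hochberg (1988) to a set of pairwise P values, and then applies the decision rule used here (reject the clock for a data set if any pairwise test is significant). The function names and the P values are hypothetical and for illustration only; this is not the HYPHY implementation.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject each test whose P value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def hochberg_reject(p_values, alpha=0.05):
    """Hochberg step-up: find the largest rank k with p_(k) <= alpha / (m - k + 1)
    and reject the k hypotheses with the smallest P values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, ascending P
    n_reject = 0
    for rank in range(m, 0, -1):  # start from the largest P value
        if p_values[order[rank - 1]] <= alpha / (m - rank + 1):
            n_reject = rank
            break
    reject = [False] * m
    for i in order[:n_reject]:
        reject[i] = True
    return reject


def rrt_rejects_clock(p_values, alpha=0.05, method=bonferroni_reject):
    """Data-set-level rule: the clock is rejected if any pairwise test is significant."""
    return any(method(p_values, alpha))


# Hypothetical P values from four pairwise tests on one data set.
p_values = [0.04, 0.001, 0.03, 0.02]
print(bonferroni_reject(p_values))
print(hochberg_reject(p_values))
```

In this made-up example the Bonferroni threshold is 0.05/4 = 0.0125, so only one test is significant, whereas the Hochberg procedure rejects all four because the largest P value is already below 0.05. Both procedures control the familywise error rate, which is why the step-up version can be more powerful without being applied less strictly.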
The RRT as described here is a conservative method for testing the molecular-clock hypothesis, independent of recombination. This is true only if the outgroup used did not recombine with the ingroup. There are two main applications of the RRT:
In summary, the RRT can be a useful tool for investigating several molecular evolutionary processes, such as recombination and selection. The RRT is easily implemented in the software HYPHY (Kosakovsky and Muse 2000).
Acknowledgements
Mikkel Schierup suggested the use of the relative ratio test for recombinant data. This manuscript benefited from conversations with Mikkel Schierup, Andrew Rambaut, and Michael Worobey. Thanks to Eddie Holmes and two anonymous reviewers for useful suggestions. This work was supported by a BYU Graduate Studies Award and by an NSF Doctoral Dissertation Improvement Grant (NSF DEB 0073154).
Footnotes
Edward Holmes, Reviewing Editor
1 Present address: Variagenics, Inc., Cambridge, Massachusetts.
2 Keywords: recombination, molecular clock, likelihood ratio tests, relative-rate test
3 Address for correspondence and reprints: David Posada, Variagenics, Inc., 60 Hampshire Street, Cambridge, Massachusetts 02139-1548. dposada@variagenics.com.
References
Bromham L., D. Penny, A. Rambaut, M. D. Hendy, 2000 The power of the relative rates tests depends on the data J. Mol. Evol 50:296-301
Cunningham C. W., 1997 Can three incongruence tests predict when data should be combined? Mol. Biol. Evol 14:733-740
Felsenstein J., 1981 Evolutionary trees from DNA sequences: a maximum likelihood approach J. Mol. Evol 17:368-376
Grassly N. C., E. C. Holmes, 1997 A likelihood method for the detection of selection and recombination using nucleotide sequences Mol. Biol. Evol 14:239-247
Hochberg Y., 1988 A sharper Bonferroni procedure for multiple tests of significance Biometrika 75:800-802
Hudson R. R., 1983 Properties of a neutral allele model with intragenic recombination Theor. Popul. Biol 23:183-201
Jukes T. H., C. R. Cantor, 1969 Evolution of protein molecules Pp. 21–132 in H. M. Munro, ed. Mammalian protein metabolism. Academic Press, New York
Kosakovsky S. L., S. V. Muse, 2000 HYPHY: hypothesis testing using phylogenies. Beta 1.7 Program in Statistical Genetics, Department of Statistics, North Carolina State University, Raleigh
Muse S. V., B. S. Weir, 1992 Testing for equality of evolutionary rates Genetics 132:269-276
Rice W. R., 1989 Analyzing tables of statistical tests Evolution 43:223-225
Robertson D. L., 2001 Links to recombinant sequence detection/analysis programs http://grinch.zoo.ox.ac.uk/RAP_links.html
Schierup M. H., J. Hein, 2000a. Recombination and the molecular clock Mol. Biol. Evol 17:1578-1579
Schierup M. H., J. Hein, 2000b. Consequences of recombination on traditional phylogenetic analysis Genetics 156:879-891
Swofford D. L., 1998 PAUP*: phylogenetic analysis using parsimony (*and other methods). Version 4.0 beta Sinauer, Sunderland, Mass
Yang Z., 1993 Maximum likelihood estimation of phylogeny from DNA sequences when substitution rates differ over sites Mol. Biol. Evol 10:1396-1401
Zhou J., B. G. Spratt, 1992 Sequence diversity within the argF, fbp and recA genes of natural isolates of Neisseria meningitidis: interspecies recombination within the argF gene Mol. Microbiol 23:2135-2146