Centre for the Study of Evolution, School of Life Sciences, University of Sussex, Brighton, UK
Correspondence: E-mail: gwenael.piganeau{at}obs-banyuls.fr.
Abstract
Key Words: clonal inheritance, Macaca nemestrina, mtDNA, recombination
Introduction
To investigate further whether recombination occurs and, if it does, how prevalent it is, we compiled 267 data sets of protein-coding sequences from mtDNA in which multiple individuals from a species had been sequenced. We subjected these data sets to four tests of recombination: (1) the maxchi test of Maynard Smith (1992), (2) the geneconv test of Sawyer (1989) (http://www.math.wustl.edu/sawyer/geneconv/), (3) the correlation between r² (Hill and Robertson 1968), a measure of linkage disequilibrium, and distance (henceforth, the LDr² test), and (4) the correlation between |D'| (Lewontin 1964), another measure of linkage disequilibrium, and distance (the LD|D'| test). We chose to use four tests, instead of one, to investigate whether any evidence of recombination was method dependent. The tests we chose represent different approaches to the detection of recombination and are among the most efficient methods available for low recombination rates (Posada and Crandall 2001).
Material and Methods
Tests of Recombination
The data were subjected to four tests of recombination. (1) In the maxchi test of Maynard Smith (1992), we follow Maynard Smith's original implementation of his test rather than the recent version suggested by Posada and Crandall (2001), which surveys only the middle third of the alignment for evidence of recombination. We survey the entire region but prevent anomalous behavior of the statistic by requiring that all expected values in the chi-square are at least 2. This requirement prevents large chi-square values from being produced when the expected values are very small. This test appears to be considerably more powerful than the implementation of Posada and Crandall (2001). Simulations, however, suggest that the type I error is not affected. Data sets in which all chi-square tests on the original data failed because of small expected values were excluded from further analysis. (2) In Sawyer's geneconv test (Sawyer 1989), we considered global P values, which automatically correct for multiple comparisons, for internal fragments. We used the default parameters because there has been no systematic analysis of the behavior of the geneconv test. (3) We considered the relationship between the measure of linkage disequilibrium, r² (Hill and Robertson 1968), and distance between sites, with the significance assessed by a Mantel test (the LDr² test). (4) We considered the relationship between the measure of linkage disequilibrium, |D'| (Lewontin 1964), and distance between sites, with the significance assessed by a Mantel test (the LD|D'| test). A data set was excluded from further analysis with the LD|D'| test if |D'| = 1 for all pairwise comparisons; including these data sets would make the test conservative because all randomized data sets have the same correlation as the original data set (i.e., a correlation of 0).
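The role of the minimum expected value in the maxchi test can be illustrated with a minimal sketch. It assumes that two aligned sequences have already been reduced to the polymorphic sites of the alignment, and it scans every possible breakpoint for the largest chi-square; the function names and the `min_expected` parameter are hypothetical, and the randomization step used to assess significance is not shown.

```python
def chi_square_2x2(a, b, c, d, min_expected=2.0):
    """Chi-square for the table [[a, b], [c, d]].

    Returns None when the table is empty or any expected count is below
    min_expected (the filter described in the text, here set to 2).
    """
    n = a + b + c + d
    if n == 0:
        return None
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    expected = [r * k / n for r in rows for k in cols]
    if min(expected) < min_expected:
        return None
    observed = [a, b, c, d]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def max_chi_square(seq1, seq2, min_expected=2.0):
    """Return (largest chi-square, breakpoint index) over all breakpoints.

    seq1 and seq2 are assumed to be equal-length strings of polymorphic sites.
    Returns (None, None) if every breakpoint fails the expected-value filter.
    """
    agree = [x == y for x, y in zip(seq1, seq2)]
    best = (None, None)
    for k in range(1, len(agree)):
        left, right = agree[:k], agree[k:]
        a, b = sum(left), len(left) - sum(left)      # matches, mismatches before k
        c, d = sum(right), len(right) - sum(right)   # matches, mismatches after k
        chi2 = chi_square_2x2(a, b, c, d, min_expected)
        if chi2 is not None and (best[0] is None or chi2 > best[0]):
            best = (chi2, k)
    return best
```

A pair of sequences for which every breakpoint returns `None` corresponds to the situation in which a data set would be excluded because all chi-square tests fail.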
One thousand randomizations were performed for each test to assess significance; however, if none of the randomized data sets exceeded the observed value, we repeated the analysis with 10,000 and then 100,000 randomizations. If the test still yielded no randomized data sets that exceeded the observed value, we set the probability value to 0.00001 in all further analyses. These methods are available at www.lifesci.sussex.ac.uk/CSE/test.
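The LDr² test and the escalating randomization scheme can be sketched as follows. This is an illustration under stated assumptions rather than the program we distribute: sites are assumed to be polymorphic and biallelic and coded as 0/1, the randomization shuffles site positions (one way of performing a Mantel-type permutation), and a randomized data set is counted when its correlation is at least as negative as the observed one. The function names and the `schedule`, `floor`, and `seed` parameters are hypothetical.

```python
import random


def r_squared(site_a, site_b):
    """r^2 between two polymorphic biallelic sites coded as equal-length 0/1 lists."""
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    p_ab = sum(1 for x, y in zip(site_a, site_b) if x == 1 and y == 1) / n
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either variable has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0


def ld_r2_test(sites, positions, schedule=(1000, 10000, 100000),
               floor=0.00001, seed=0):
    """Correlation between r^2 and distance, with escalating randomizations.

    Returns (observed correlation, probability value). If no randomized data
    set is as extreme as the observed one after the last round in schedule,
    the probability value is set to floor.
    """
    rng = random.Random(seed)
    pairs = [(i, j) for i in range(len(sites)) for j in range(i + 1, len(sites))]
    ld = [r_squared(sites[i], sites[j]) for i, j in pairs]

    def correlation(pos):
        return pearson(ld, [abs(pos[i] - pos[j]) for i, j in pairs])

    observed = correlation(positions)
    shuffled = list(positions)
    for n_rand in schedule:
        hits = 0
        for _ in range(n_rand):
            rng.shuffle(shuffled)
            if correlation(shuffled) <= observed:
                hits += 1
        if hits > 0:
            return observed, hits / n_rand
    return observed, floor
```

Recombination is expected to make linkage disequilibrium decline with distance, which is why the sketch treats strongly negative correlations as the extreme tail.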
Statistical Analysis
To combine results across data sets, we used Fisher's method of combining probabilities by summing -2Ln(p) across data sets. Because -2Ln(p) is chi-square distributed with 2 degrees of freedom under the null hypothesis, the sum is chi-square distributed with 2n degrees of freedom, where n is the number of data sets.
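Fisher's method as described above can be sketched in a few lines. The sketch assumes independent probability values and uses the closed-form upper tail of the chi-square distribution for an even number of degrees of freedom, so that no statistical library is needed; the function names are hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    Uses P(X > x) = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def fisher_combined(p_values):
    """Return (sum of -2Ln(p), degrees of freedom, combined probability)."""
    statistic = sum(-2.0 * math.log(p) for p in p_values)
    df = 2 * len(p_values)
    return statistic, df, chi2_sf_even_df(statistic, df)
```

For very large sums the exponential term underflows and the combined probability is returned as 0, which should be read as "smaller than can be represented" rather than as an exact value.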
To test whether there was more evidence of recombination in sexual species than in asexual species, for each test we summed -2Ln(p), where p is the probability value, across the data sets in each group. We divided each sum by the number of data sets in the group to obtain the average and considered the difference between the sexual species and the asexual species as our test statistic,

$$Z = \frac{1}{n_{sex}} \sum_{i=1}^{n_{sex}} \left[-2\ln(p_i)\right] - \frac{1}{n_{asex}} \sum_{j=1}^{n_{asex}} \left[-2\ln(p_j)\right],$$

where n_sex and n_asex are the numbers of sexual and asexual data sets. To find the distribution of this statistic under the null hypothesis that there is no difference between sexual and asexual species, we pooled the probability values from the sexual and asexual data sets, randomly selected, without replacement, n_sex and n_asex data sets, and recalculated Z. We repeated this procedure 1,000 times.
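The randomization procedure for comparing sexual and asexual species can be sketched as below. The sketch assumes a one-sided comparison (sexual species showing more evidence of recombination, i.e., larger Z) and counts randomized values at least as large as the observed one; the function names and the `n_perm` and `seed` parameters are hypothetical.

```python
import math
import random


def mean_fisher_score(p_values):
    """Average of -2Ln(p) over a group of data sets."""
    return sum(-2.0 * math.log(p) for p in p_values) / len(p_values)


def z_statistic(p_sex, p_asex):
    """Difference in average -2Ln(p) between sexual and asexual data sets."""
    return mean_fisher_score(p_sex) - mean_fisher_score(p_asex)


def permutation_test(p_sex, p_asex, n_perm=1000, seed=0):
    """Return (observed Z, proportion of randomized Z values >= observed Z).

    Probability values are pooled and reassigned to groups of the original
    sizes without replacement, as described in the text.
    """
    rng = random.Random(seed)
    observed = z_statistic(p_sex, p_asex)
    pooled = list(p_sex) + list(p_asex)
    n_sex = len(p_sex)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if z_statistic(pooled[:n_sex], pooled[n_sex:]) >= observed:
            count += 1
    return observed, count / n_perm
```

Shuffling the pooled list and splitting it at n_sex is equivalent to drawing the two groups without replacement.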
ANOVA analyses were performed on the probability values from each test after transformation to Sqrt(-Ln(p)), which was found to be approximately normally distributed. Kruskal-Wallis tests were also performed.
Results
Only four of the data sets are individually significant, because we have performed four tests on 267 species; that is, applying the Bonferroni correction means that a test has to be individually significant at a probability level of 0.05/(4 × 267) ≈ 4.7 × 10⁻⁵ to provide significant evidence of recombination. These data sets are from a plant parasitic nematode, Bursaphelenchus conicaudatus; a primate, Macaca nemestrina; a fish, Micropterus salmoides; and a rodent, Microtus longicaudus. These data sets all contain recombinant sequences between two distinct haplotypes, which suggests that they may be recombinants between subspecies or species (see below). However, there is strong evidence of recombination even if these data sets are removed, because the combined probability values are highly significant for each test (maxchi, P = 0.001; geneconv, P = 0.014; LDr², P < 10⁻⁴; and LD|D'|, P = 0.013). The problem is simply that we cannot say precisely which species has undergone recombination. Let us imagine that we performed a statistical test on 100 independent data sets and found one to be significant at a probability value of 0.0001 and 20 to be significant at a probability value of 0.05. Clearly the first data set is individually significant because the probability of observing one or more data sets at that level of significance is very small, even when we have 100 data sets (P = 0.01). In contrast, we expect on average five data sets to be significant at 5% by chance alone. However, the probability of observing 20 or more is very small (P < 0.0001). So we know that there is evidence of recombination in the remaining 99 data sets, and it is likely to be in one or more of the data sets that have a probability value of 0.05. However, we cannot tell which data set has actually undergone recombination.
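The arithmetic behind the Bonferroni threshold and the hypothetical example of 100 data sets can be checked with a short sketch. It assumes independent tests and an exact binomial tail; the function names are hypothetical.

```python
import math

# Bonferroni threshold for four tests applied to 267 data sets.
n_tests, n_data_sets, alpha = 4, 267, 0.05
bonferroni_threshold = alpha / (n_tests * n_data_sets)


def prob_at_least_one(p, n):
    """Probability that at least one of n independent tests reaches level p by chance."""
    return 1.0 - (1.0 - p) ** n


def binomial_upper_tail(k, n, p):
    """Probability of k or more successes in n independent trials with success rate p."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))


# Hypothetical example from the text: 100 independent data sets.
one_at_0001 = prob_at_least_one(0.0001, 100)        # one data set at P = 0.0001
expected_at_5_percent = 100 * 0.05                  # expected number significant at 5%
twenty_or_more = binomial_upper_tail(20, 100, 0.05)  # 20 or more significant at 5%
```

Here `one_at_0001` is close to 0.01 and `twenty_or_more` falls well below 0.0001, in line with the values quoted in the example.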
In table 2, we list the species that contribute most to the evidence of recombination. A complete list of the outcomes of the four tests for the 267 data sets is available in table 1 in Supplementary Material online. To give a measure of the overall evidence of recombination, we summed -2Ln(p) across tests, where p is the probability value. This value cannot be converted into an overall probability value because the tests are not statistically independent; however, it provides a metric that is related to the overall support for recombination.
There does not appear to be any evidence that the frequency of recombination differs across our data set; an ANOVA of the transformed probability values shows no evidence of differences between phyla, classes, or orders. These results remained unchanged if the analyses were restricted to groups (i.e., phyla, classes, or orders) that had more than 20 species.
A visual inspection of the data sets that show the strongest evidence of recombination reveals a small number of cases in which the recombinant molecules can be identified visually. Two of these are in primates. In Macaca nemestrina, one of the data sets that is individually significant, there are two distinct groups of haplotypes, with a third group of two sequences that appear to be recombinants between the first two (figure 1). Both putative recombinants are significant according to the maxchi test (P < 10⁻⁵ and P = 0.01 for Rec1 and Rec2, respectively). The putative points of recombination are between nucleotides 420 and 436 for Rec1 and between 489 and 507 for Rec2. If we just consider those positions that are fixed in either group 1 or group 2 (i.e., that define the two major haplotypic groups), then for Rec1 there are 27 matches and two mismatches to the group 1 sequences before the putative breakpoint, and five matches and 15 mismatches after it. For the second recombinant sequence, there are 30 matches and six mismatches before the breakpoint and eight matches and six mismatches after the breakpoint. The fact that there are nucleotides, both before and after the breakpoint, that match the "wrong" parental group of sequences might be the result of the formation of heteroduplexes during recombination (Szostak et al. 1983), parallel mutations, or further recombination. Interestingly, recombination in this species seems to have been between two quite different haplotypes that may correspond to the main subspecies of M. nemestrina: M. nemestrina nemestrina and M. nemestrina leonina. All the group 1 sequences are listed as the former, whereas two of the four group 2 sequences are given as M. nemestrina leonina.
There is also evidence of recombination in another primate, Papio papio, the Guinea baboon. In this species, one individual is distinctly different from the others up to nucleotide 270, after which it becomes identical to some of the other sequences. Interestingly, the sequence up to nucleotide 270 is more similar to that of several other baboon species, including P. anubis, the olive baboon, whose range is adjacent to that of P. papio (Newman, Jolly, and Rogers 2003). However, the sequence of P. papio is not identical to that of P. anubis, which suggests that either recombination occurred sometime in the past or there has been subsequent recombination within P. papio.
However, cases in which recombination events can be identified are exceptional; in the vast majority of cases we cannot identify recombinants visually. This is perhaps not surprising, because recombination is only obvious to the eye when there have been very few recent recombination events between sequences that are quite different.
Discussion
There are a variety of ways in which artifactual evidence of recombination could be produced. First, there may be numts in the data; these are nuclear-encoded copies of the mtDNA. We have gone to some length to exclude numts by eliminating all data sets in which there are stop codons or in which the ratio of nonsynonymous to synonymous polymorphism is unreasonably high (data sets in which this ratio was greater than approximately one third were excluded). These strategies will exclude most data sets in which all sequences are numts, but they may not be an effective guard against data sets in which a small number of sequences are recent numts. Second, there could be contamination between two DNA samples. This could produce apparent recombinants in two ways: (a) the polymerase may jump from one template to another during PCR, or (b) different primers may bind to the two templates with different efficiencies; for example, the first set of primers may bind very well to the start of the gene from individual A, whereas the primers for the second half of the gene may bind to that template very poorly. Third, recombination could be introduced during sequence assembly. This would likely produce perfect recombinants, that is, a sequence composed of a perfect match to one sequence followed by a perfect match to another sequence. Each of these artifacts should be easy to catch if the DNA has been sequenced in both directions, but there can be no guarantee that this has happened or that both strands have been sequenced to a high standard. Ultimately, confirmation of the recombinants we have discovered will require careful resequencing. Testing for recombination may be one way to assess data quality.
It is also possible that the evidence of recombination is not caused by experimental error but by a problem with the tests of recombination. Innan and Nordborg (2002) have pointed out that the linkage disequilibrium methods can generate evidence of recombination if there is a concentration of hypermutable sites, at which the infinite sites assumption is violated. It also seems likely that maxchi and geneconv will be susceptible to this bias. However, there are a number of reasons for believing that clusters of hypermutable sites are not responsible for the evidence of recombination. First, hypermutable sites cannot generate the sort of patterns of recombination that we have detected in M. nemestrina, P. papio, and a number of other species. Second, sexual taxa show significantly more evidence of recombination than do asexual taxa, and, yet, there is no reason why asexual taxa should not have clusters of hypermutable sites. Third, it seems likely that the clustering would have to be strong to generate the strong evidence of recombination we have detected (it should be emphasized that all data sets are from protein-coding genes so that none of them contain control region sequence).
If recombination is occurring in nature, then it has some implications, but possibly in areas in which it is often not thought about. Clonality is often explicitly assumed in analyses of demography; for example, mtDNA has been used extensively to trace the spread of humans across the world. It is likely that much of this work will be unaffected because the patterns are established by migration and limited gene flow, so the mitochondrial molecules involved in the pattern never have the chance to recombine. However, recombination may affect inferences about changes in population size because recombination can mimic some of the patterns induced by population size expansion. MtDNA has also been used to date various events. Under some circumstances, these dates will be affected by recombination. For example, mtDNA has been used to date our most recent matrilineal ancestor. Recombination will affect this date (it will generally mean that the date has been underestimated; Eyre-Walker 2000), but by how much is unclear. However, the area in which mtDNA is used most often, molecular systematics, is the area in which recombination is rarely considered, and yet it might have important implications. In the M. nemestrina and P. papio sequences shown in figure 1, there are clearly recombinants between two subspecies, or genetically distinct groups of animals. Without an appreciation of recombination, the phylogenetic status of some individuals would be incorrect. Furthermore, the obvious evidence of introgression would be missed.
It is likely that some of the evidence for recombination is a consequence of laboratory error, but unless the quality of DNA sequencing is very poor, recombination in mtDNA is moderately frequent and occurs both within and between species and subspecies.
Acknowledgements
Footnotes
2 Present address: School of Biological Sciences, University of Bristol, Bristol, UK
Michael Nachman, Associate Editor
References
Andolfatto, P., J. Scriber, and B. Charlesworth. 2003. No association between mitochondrial DNA haplotypes and a female-limited mimicry phenotype in Papilio glaucus. Evolution 57:305-316.
Avise, J. C. 2000. Phylogeography: the history and formation of species. Harvard University Press, Cambridge, Mass.
Birky, C. W. J. 1995. Uniparental inheritance of mitochondrial and chloroplast genes: mechanisms and evolution. Proc. Natl. Acad. Sci. USA 92:11331-11338.
Elson, J. R., R. M. Andrews, P. F. Chinnery, R. N. Lightowlers, D. M. Turnbull, and N. Howell. 2001. Analysis of European mtDNAs for recombination. Am. J. Hum. Genet. 68:145-153.
Evans, B. J., J. Carlos Morales, J. Supriatna, and D. J. Melnick. 1999. Origin of the Sulawesi macaques (Cercopithecidae: Macaca) as suggested by mitochondrial DNA phylogeny. Biol. J. Linn. Soc. 66:539-560.
Evans, B. J., J. Supriatna, N. Andayani, and D. J. Melnick. 2003. Diversification of Sulawesi macaque monkeys: decoupled evolution of mitochondrial and autosomal DNA. Evolution 57:1931-1946.
Eyre-Walker, A. 2000. Do mitochondria recombine in humans? Philos. Trans. R. Soc. Lond. B Biol. Sci. 355:1573-1580.
Gouy, M., F. Milleret, C. Mugnier, M. Jacobzone, and C. Gautier. 1984. ACNUC: a nucleic acid sequence data base and analysis system. Nucleic Acids Res. 12:121-127.
Gyllensten, U., D. Wharton, A. Josefsson, and A. C. Wilson. 1991. Paternal inheritance of mitochondrial DNA in mice. Nature 352:255-257.
Hey, J. 2000. Human mitochondrial DNA recombination: can it be true? Trends Ecol. Evol. 15:181-182.
Hill, W. G., and A. Robertson. 1968. Linkage disequilibrium in finite populations. Theoret. Appl. Genet. 38:226-231.
Hoarau, G., S. Holla, R. Lescasse, W. T. Stam, and J. L. Olsen. 2002. Heteroplasmy and evidence for recombination in the mitochondrial control region of the flatfish Platichthys flesus. Mol. Biol. Evol. 19:2261-2264.
Hoeh, W. R., D. T. Stewart, C. Saavedra, B. W. Sutherland, and E. Zouros. 1997. Phylogenetic evidence for role-reversals of gender-associated mitochondrial DNA in Mytilus (Bivalvia: Mytilidae). Mol. Biol. Evol. 14:959-967.
Ingman, M., H. Kaessmann, S. Pääbo, and U. Gyllensten. 2000. Mitochondrial genome variation and the origin of modern humans. Nature 408:708-713.
Innan, H., and M. Nordborg. 2002. Recombination or mutational hot spots in human mtDNA? Mol. Biol. Evol. 19:1122-1127.
Jorde, L. B., and M. Bamshad. 2000. Questioning evidence for recombination in human mitochondrial DNA. Science 288:1931a.
Kondo, R., Y. Satta, E. T. Matsuura, H. Ishiwa, N. Takahata, and S. I. Chigusa. 1990. Incomplete maternal transmission of mitochondrial DNA in Drosophila. Genetics 126:657-663.
Kraytsberg, Y., M. Schwartz, T. A. Brown, K. Ebralidse, W. S. Kunz, D. A. Clayton, J. Vissing, and K. Khrapko. 2004. Recombination of human mitochondrial DNA. Science 304:981.
Kvist, L., J. Martens, A. A. Nazarenko, and M. Orell. 2003. Paternal leakage of mitochondrial DNA in the great tit (Parus major). Mol. Biol. Evol. 20:243-247.
Ladoukakis, E. D., and E. Zouros. 2001a. Direct evidence for homologous recombination in mussel (Mytilus galloprovincialis) mitochondrial DNA. Mol. Biol. Evol. 18:1168-1175.
Ladoukakis, E. D., and E. Zouros. 2001b. Recombination in animal mitochondrial DNA: evidence from published sequences. Mol. Biol. Evol. 18:2127-2131.
Lewontin, R. C. 1964. The interaction of selection and linkage. I. Genetic considerations; heterotic models. Genetics 49:49-67.
Magoulas, A., and E. Zouros. 1993. Restriction-site heteroplasmy in anchovy (Engraulis encrasicolus) indicates incidental biparental inheritance of mitochondrial DNA. Mol. Biol. Evol. 10:319-325.
Maynard Smith, J. M. 1992. Analyzing the mosaic structure of genes. J. Mol. Evol. 34:126-129.
Maynard Smith, J., and N. H. Smith. 2002. Recombination in animal mitochondrial DNA. Mol. Biol. Evol. 19:2330-2332.
McVean, G., P. Awadalla, and P. Fearnhead. 2002. A coalescent-based method for detecting and estimating recombination from gene sequences. Genetics 160:1231-1241.
Newman, T. K., C. J. Jolly, and J. Rogers. 2003. Mitochondrial phylogeny and systematics of baboons (Papio). Am. J. Phys. Anthropol. 122.
Piganeau, G., and A. Eyre-Walker. 2004. A reanalysis of the indirect evidence for recombination in human mitochondrial DNA. Heredity 92:282-288.
Posada, D., and K. A. Crandall. 2001. Evaluation of methods for detecting recombination from DNA sequences: computer simulations. Proc. Natl. Acad. Sci. USA 98:13757-13762.
Roos, C., T. Ziegler, J. K. Hodges, H. Zischler, and C. Abegg. 2003. Molecular phylogeny of Mentawai macaques: taxonomic and biogeographic implications. Mol. Phylogenet. Evol. 29:139-150.
Sawyer, S. 1989. Statistical tests for detecting gene conversion. Mol. Biol. Evol. 6:526-538.
Schwartz, M., and J. Vissing. 2002. Paternal inheritance of mitochondrial DNA. N. Engl. J. Med. 347:576-580.
Skibinski, D. O. F., C. Gallagher, and C. M. Beynon. 1994. Mitochondrial DNA inheritance. Nature 368:817-818.
Szostak, J. W., T. L. Orr-Weaver, R. J. Rothstein, and F. W. Stahl. 1983. The double-strand-break model for recombination. Cell 33:25-35.
Wallis, G. P. 2000. Mitochondrial recombination or coevolution of sites? Trends Ecol. Evol. 15:470-471.
Zouros, E., A. O. Ball, C. Saavedra, and K. R. Freeman. 1994. Mitochondrial DNA inheritance. Nature 368:818.