Sequence Gaps Join Mice and Men: Phylogenetic Evidence from Deletions in Two Proteins

Celine Poux*, Teun van Rheede*, Ole Madsen* and Wilfried W. de Jong*,†

*Department of Biochemistry, University of Nijmegen, Nijmegen, The Netherlands;
†Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam

Recent nuclear sequence analyses have provided evidence that primates and rodents are more closely related than previously believed (Madsen et al. 2001; Murphy et al. 2001a, 2001b). This proposal is difficult to reconcile with morphological insights (Liu et al. 2001; Novacek 2001) and is not generally supported by current mitochondrial sequence data (Reyes, Pesole, and Saccone 2000; Nikaido et al. 2001; Arnason et al. 2002; Janke et al. 2002). Moreover, the supporting data and analyses have been criticized on methodological grounds (Rosenberg and Kumar 2001). Here we report deletions in two nuclear protein-coding genes that lend independent support to this contested grouping.

Some 18 orders of placental mammals are currently recognized, but their phylogenetic relationships remain highly controversial. Extensive sequence comparisons of mainly nuclear genes support a basal division into four major clades (Xenarthra, Afrotheria, Laurasiatheria, and Euarchontoglires), which has far-reaching implications for early mammalian biogeography and morphological diversification (Murphy et al. 2001b). Euarchontoglires is composed of the orders Primates, Rodentia, Lagomorpha (rabbits, hares, and pikas), Scandentia (tree shrews), and Dermoptera (flying lemurs). In contrast, morphology groups Primates, Scandentia, and Dermoptera with Chiroptera (bats) in the clade Archonta, whereas Rodentia and Lagomorpha (jointly called Glires) are in a distant clade with Macroscelidea (elephant shrews) (Liu et al. 2001; Novacek 2001). Also, sequence data from 12 proteins encoded by the mitochondrial genome generally do not support Euarchontoglires (e.g., Nikaido et al. 2001) or even maintain rodent polyphyly in many cases (Reyes, Pesole, and Saccone 2000; Arnason et al. 2002; Janke et al. 2002). Only by excluding some taxa with high or atypical substitution rates (or both) can sound mitochondrial support be obtained (Waddell, Kishino, and Ota 2001). Establishing the monophyly of the most speciose eutherian order, Rodentia, and finding its sister group have indeed proved most difficult on the basis of sequence evidence (e.g., Graur, Hide, and Li 1991; Adkins et al. 2001; Huchon et al. 2002). As for the molecular data sets that support Euarchontoglires, it has been questioned whether these can actually resolve the relationship of rodents and primates or whether more genes and longer sequences are needed (Rosenberg and Kumar 2001). Given, too, that Euarchontoglires is the least supported of the four major clades in some analyses (Madsen et al. 2001), additional evidence for its monophyly is certainly needed. This could be provided by "rare genomic changes," such as insertions and deletions (indels) in proteins (Rokas and Holland 2000). Indels in protein-coding DNA sequences require more complex mutational mechanisms and are generally more constrained than single base substitutions. Such indels can therefore be good indicators for monophyly, as demonstrated already for two of the other major clades, Xenarthra (van Dijk et al. 1999) and Afrotheria (Madsen et al. 2001), as well as in deeper vertebrate phylogeny (Venkatesh, Erdmann, and Brenner 2001).
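To make the idea of an indel as a discrete phylogenetic character concrete, the following minimal Python sketch scores a gap in an alignment as a presence/absence character. It is an illustration only, not the procedure used for figure 1: the function name code_gap_character, the taxon labels, and the toy sequences are all invented for this example.

```python
def code_gap_character(alignment, start, end):
    """Score an alignment gap as a binary character.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    start, end: 0-based, end-exclusive column range of the candidate indel.
    Returns a dict mapping taxon -> 1 if every column in the range is a gap
    ('-') in that taxon, else 0.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    if not 0 <= start < end <= lengths.pop():
        raise ValueError("column range lies outside the alignment")
    return {
        taxon: int(all(ch == "-" for ch in seq[start:end]))
        for taxon, seq in alignment.items()
    }


# Invented toy alignment (hypothetical taxa and residues, not real data).
toy_alignment = {
    "outgroup": "ACDEFGHIKL",
    "taxon_1": "ACDEFGHIRL",
    "taxon_2": "ACD---HIKL",
    "taxon_3": "ASD---HIKL",
}

print(code_gap_character(toy_alignment, 3, 6))
# {'outgroup': 0, 'taxon_1': 0, 'taxon_2': 1, 'taxon_3': 1}
```

Scoring the whole gap as one character, rather than scoring each gapped column separately, reflects the assumption that a contiguous indel arose in a single mutational event.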

While studying genes involved in various neurodegenerative disorders, we noticed two deletions that might be informative for the naturalness of Euarchontoglires. One is a large deletion in exon 8 of the gene for spinocerebellar ataxia 1 (SCA1), resulting in an 18-residue deletion in the encoded protein (fig. 1, top). The other is a 6-bp deletion at the 5' end of the intronless coding region of the prion protein gene (PRNP; fig. 1, bottom). Both deletions perfectly distinguish Euarchontoglires from all other placentals and outgroup marsupials. Obviously, the most parsimonious interpretation is that these deletions originated once and independently in the SCA1 and PRNP genes of the last common ancestor of Euarchontoglires, thus supporting their monophyly. If the morphological or mitogenomic trees are true, both deletions must have originated at least twice in exactly the same lineages.
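The parsimony argument can be checked with a small Python sketch of Fitch's algorithm, which counts the minimum number of state changes a binary character needs on a given rooted tree. Several things here are assumptions made for illustration: each order is reduced to a single tip, the placement of the lineages outside Euarchontoglires is arbitrary, Laurasiatheria is represented only by Chiroptera, and the names fitch, min_changes, and the two tree constants are hypothetical. The character states follow the text: the deletion is present (1) in the Euarchontoglires orders and absent (0) elsewhere.

```python
def fitch(tree, states):
    """Fitch parsimony on a rooted binary tree given as nested 2-tuples.

    Returns (possible ancestral states at this node, changes in the subtree).
    """
    if isinstance(tree, str):  # a tip: its observed state, no changes
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    shared = left_set & right_set
    if shared:  # children agree on at least one state: no extra change
        return shared, left_changes + right_changes
    # children disagree: take the union and count one change
    return left_set | right_set, left_changes + right_changes + 1


def min_changes(tree, states):
    """Minimum number of state changes for the character on the tree."""
    return fitch(tree, states)[1]


# 1 = deletion present, 0 = deletion absent (as described in the text).
deletion = {
    "Primates": 1, "Scandentia": 1, "Dermoptera": 1,
    "Rodentia": 1, "Lagomorpha": 1,
    "Chiroptera": 0, "Macroscelidea": 0, "Xenarthra": 0, "Marsupialia": 0,
}

# Simplified topology with Euarchontoglires monophyletic (illustrative only).
EUARCHONTOGLIRES_TREE = (
    "Marsupialia",
    ("Xenarthra",
     ("Macroscelidea",
      ("Chiroptera",
       (("Primates", ("Scandentia", "Dermoptera")),
        ("Rodentia", "Lagomorpha"))))),
)

# Simplified morphological topology: Archonta, and Glires with Macroscelidea.
MORPHOLOGY_TREE = (
    "Marsupialia",
    ("Xenarthra",
     (("Chiroptera", ("Primates", ("Scandentia", "Dermoptera"))),
      ("Macroscelidea", ("Rodentia", "Lagomorpha")))),
)

print(min_changes(EUARCHONTOGLIRES_TREE, deletion))  # 1
print(min_changes(MORPHOLOGY_TREE, deletion))        # 2
```

Fitch's algorithm counts changes without distinguishing a deletion from a restoration of the deleted segment, so the count of 2 on the morphological topology is a lower bound on the number of events, in line with the "at least twice" stated above.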



Fig. 1. Deletions in the SCA1 protein (top) and the prion protein gene (bottom) support Euarchontoglires. Protein and DNA sequences, respectively, are shown as being most informative. Sequences correspond with positions 415 to 445 in the human SCA1 protein, and with nucleotides 1–44 of the coding sequence of the human PRNP gene. Eutherian species are grouped according to the four recently proposed basal clades of placental mammals (Murphy et al. 2001b). Gray shading emphasizes the overall sequence conservation; dashes denote alignment gaps. The underlined Leu-Ser-Pro repeat in SCA1 is discussed in the text. Most sequences were newly determined by direct sequencing of PCR-amplified genomic DNA fragments and can be found with full species names under accession numbers AJ438463–AJ438487 for SCA1 and AJ438193–AJ438207 for PRNP. Human and mouse SCA1 sequences are from the database (a, XM004164; b, NM009124), and PRNP sequences indicated with c are from Wopfner et al. (1999).

 
Although reversal of the observed deletions in SCA1 and PRNP is difficult to imagine, a repeated origin cannot totally be excluded. Indels are certainly not free from homoplasy, especially in regions with sequence repeats. In the SCA1 gene, for example, a sequence repeat CTG TCN CCC, coding for Leu-Ser-Pro (underlined in fig. 1, top), might in principle have triggered the large deletion more than once. In the middle of this same region, a 6-bp deletion has caused the loss of two alanines in armadillo, whereas a 3-bp insertion results in an additional alanine in most Laurasiatheria (fig. 1, top). This latter insertion might indeed agree nicely with a basal separation of Eulipotyphla (represented here by hedgehog and mole) from the other Laurasiatheria (Murphy et al. 2001b). However, both the deletion and the insertion are likely to be caused by the GCC (Ala) repeat in this gene region and therefore to have little phylogenetic significance. It is the congruence of independent evidence that makes the two deletions as shown in figure 1 convincing indicators for the monophyly of Euarchontoglires. The probability of parallel origins of such deletions in two independent genes is difficult to evaluate statistically (van Dijk et al. 1999; Rokas and Holland 2000), but certainly it is extremely small. And even if these deletions were due to homoplasy, it would be a most curious coincidence that they occur in precisely the same species that are also grouped by independent sequence evidence (Madsen et al. 2001; Murphy et al. 2001a, 2001b).
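Because repeats are the main source of concern for indel homoplasy, it can be useful to flag them before interpreting a gap. The sketch below, offered as an illustration rather than as part of the original analysis, scans a DNA string for the CTG TCN CCC (Leu-Ser-Pro) motif named above. The function name find_lsp_repeats and the toy sequence are invented; the scan ignores reading frame and reports 0-based positions.

```python
import re

# CTG TCN CCC with N = any base. The lookahead lets matches that share
# a base (the final C of one copy can start the next) all be reported.
LSP_MOTIF = re.compile(r"(?=(CTGTC[ACGT]CCC))")


def find_lsp_repeats(dna):
    """Return (0-based start, matched 9-mer) for each motif occurrence."""
    return [(m.start(), m.group(1)) for m in LSP_MOTIF.finditer(dna.upper())]


# Invented toy sequence (not SCA1): two motif copies flanking a GCC run.
toy_dna = "GGACTGTCACCCGCCGCCGCCCTGTCGCCCAAG"

print(find_lsp_repeats(toy_dna))
# [(3, 'CTGTCACCC'), (21, 'CTGTCGCCC')]
```

Two or more copies of the motif flanking an indel would be consistent with a repeat-mediated origin, which is the situation in which a recurrent deletion is most plausible.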

Acknowledgements

This work was supported by grants from the Netherlands Organisation for Scientific Research and the European Commission.

Footnotes

Rodney Honeycutt, Reviewing Editor

Keywords: insertions, deletions, phylogeny, Glires, primates, Euarchontoglires

Address for correspondence and reprints: Wilfried W. de Jong, Department of Biochemistry 161, University of Nijmegen, P.O. Box 9101, 6500 HB Nijmegen, The Netherlands. w.dejong@ncmls.kun.nl.

References

    Adkins R. M., E. L. Gelke, D. Rowe, R. L. Honeycutt. 2001. Molecular phylogeny and divergence times for major rodent groups: evidence from multiple genes. Mol. Biol. Evol. 18:777-791.

    Arnason U., J. A. Adegoke, K. Bodin, E. W. Born, Y. B. Esa, A. Gullberg, M. Nilsson, R. V. Short, X. Xu, A. Janke. 2002. Mammalian mitogenomic relationships and the root of the Eutherian tree. Proc. Natl. Acad. Sci. USA 99:8151-8156.

    Graur D., W. A. Hide, W.-H. Li. 1991. Is the guinea-pig a rodent? Nature 351:649-652.

    Huchon D., O. Madsen, M. J. J. Sibbald, K. Ament, M. J. Stanhope, F. Catzeflis, W. W. de Jong, E. J. P. Douzery. 2002. Rodent phylogeny and a timescale for the evolution of Glires: evidence from an extensive taxon sampling using three nuclear genes. Mol. Biol. Evol. (in press).

    Janke A., O. Magnell, G. Wieczorek, M. Westerman, U. Arnason. 2002. Phylogenetic analysis of 18S rRNA and the mitochondrial genomes of the wombat, Vombatus ursinus, and the spiny anteater, Tachyglossus aculeatus: increased support for the Marsupionta hypothesis. J. Mol. Evol. 54:71-80.

    Liu F. G., M. M. Miyamoto, N. P. Freire, P. Q. Ong, M. R. Tennant, T. S. Young, K. F. Gugel. 2001. Molecular and morphological supertrees for eutherian (placental) mammals. Science 291:1786-1789.

    Madsen O., M. Scally, C. J. Douady, D. J. Kao, R. W. DeBry, R. Adkins, H. M. Amrine, M. J. Stanhope, W. W. de Jong, M. S. Springer. 2001. Parallel adaptive radiations in two major clades of placental mammals. Nature 409:610-614.

    Murphy W. J., E. Eizirik, W. E. Johnson, Y. P. Zhang, O. A. Ryder, S. J. O'Brien. 2001a. Molecular phylogenetics and the origins of placental mammals. Nature 409:614-618.

    Murphy W. J., E. Eizirik, S. J. O'Brien, et al. 2001b. Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science 294:2348-2351.

    Nikaido M., K. Kawai, Y. Cao, M. Harada, S. Tomita, N. Okada, M. Hasegawa. 2001. Maximum likelihood analysis of the complete mitochondrial genomes of eutherians and a reevaluation of the phylogeny of bats and insectivores. J. Mol. Evol. 53:508-516.

    Novacek M. J. 2001. Mammalian phylogeny: genes and supertrees. Curr. Biol. 11:R573-R575.

    Reyes A., G. Pesole, C. Saccone. 2000. Long-branch attraction phenomenon and the impact of among-site rate variation on rodent phylogeny. Gene 259:177-187.

    Rokas A., P. W. Holland. 2000. Rare genomic changes as a tool for phylogenetics. Trends Ecol. Evol. 15:454-459.

    Rosenberg M. S., S. Kumar. 2001. Incomplete taxon sampling is not a problem for phylogenetic inference. Proc. Natl. Acad. Sci. USA 98:10751-10756.

    van Dijk M. A. M., E. Paradis, F. Catzeflis, W. W. de Jong. 1999. The virtues of gaps: xenarthran (edentate) monophyly supported by a unique deletion in alpha-A-crystallin. Syst. Biol. 48:94-106.

    Venkatesh B., M. V. Erdmann, S. Brenner. 2001. Molecular synapomorphies resolve evolutionary relationships of extant jawed vertebrates. Proc. Natl. Acad. Sci. USA 98:11382-11387.

    Waddell P. J., H. Kishino, R. Ota. 2001. A phylogenetic foundation for comparative mammalian genomics. Genome Informatics 12:141-154.

    Wopfner F., G. Weidenhofer, R. Schneider, A. von Brunn, S. Gilch, T. F. Schwarz, T. Werner, H. M. Schatzl. 1999. Analysis of 27 mammalian and 9 avian PrPs reveals high conservation of flexible regions of the prion protein. J. Mol. Biol. 289:1163-1178.

Accepted for publication June 2, 2002.