Department of Zoology, Natural History Museum, London, England
Many microbial eukaryotes lack functional mitochondria required for oxidative phosphorylation and instead gain their energy from anaerobic metabolism. Some of these eukaryotes, such as anaerobic ciliates and fungi, have aerobic sister groups with mitochondria and thus are clearly secondary anaerobes (Dore and Stahl 1991; Embley et al. 1995). Other amitochondriate anaerobes, including diplomonads (e.g., Giardia and Spironucleus), microsporidians (e.g., Vairimorpha), parabasalids (e.g., Trichomonas), and pelobionts (e.g., Entamoeba), for which aerobic sister groups have been less readily apparent, were presumed to be representative of the most primitive nucleated cells and to have diverged from other eukaryotes prior to the mitochondrion endosymbiosis (Cavalier-Smith 1983). These lineages were known as Archezoa (Cavalier-Smith 1983) and became the focus of molecular enquiries into the origins and early diversification of eukaryotes. However, one result of these studies was that the Archezoa hypothesis was rejected on a case-by-case basis, as it was demonstrated that archezoans arose from within groups which possessed mitochondria or that they contained genes derived from the mitochondrial endosymbiont (reviewed in Embley and Hirt 1998; Roger 1999).
The strongest datum rejecting the Archezoa hypothesis for diplomonads is the report of a mitochondrial chaperonin 60 (cpn60) gene on the genome of Giardia lamblia (Roger et al. 1998). In aerobic eukaryotes, cpn60 plays an important role in mitochondrial protein import (Martin 1997), and its phylogeny reflects an α-proteobacterial ancestry consistent with an origin from the mitochondrion endosymbiont (Viale and Arakaki 1994). The cpn60 gene is found on the host nuclear genome, which is usually interpreted as indicating that it was transferred from the mitochondrial symbiont to the host nucleus early in the history of that symbiosis (Viale and Arakaki 1994). Thus, a reasonable explanation for the presence of cpn60 genes on the genome of Giardia would be that it too is descended from ancestors that once contained the mitochondrion endosymbiont.
In contrast, several recent publications have suggested that Giardia may still be a true archezoan but one which has acquired mitochondrial-like genes through horizontal gene transfer (HGT) events independent of the mitochondrial endosymbiosis (Sogin 1997; Doolittle 1998; Roger et al. 1998). Such transfer was suggested to have occurred from another eukaryote (Roger et al. 1998); from an α-proteobacterium closely related to, but distinct from, the mitochondrial endosymbiont (Sogin 1997); or from an α-proteobacterium ingested as food (Doolittle 1998).
Here we show that the nuclear-encoded cpn60 of the diplomonad Spironucleus barkhanus, a distant relative of Giardia, is of α-proteobacterial origin and is specifically related to that of G. lamblia, and that both diplomonad proteins branch noncontroversially among mitochondrial homologs from other eukaryotes in phylogenetic reconstructions, indicating that extant diplomonads are derived from ancestors which had experienced the mitochondrial endosymbiosis. Furthermore, we show that cpn60 provides no evidence for an early divergence of diplomonads from other eukaryotes.
From an expressed sequence tag (EST) project on the diplomonad S. barkhanus ATCC strain 50380, we identified a cDNA with strong similarities to characterized mitochondrial cpn60 gene sequences and used it to isolate a full-length gene from a lambda DASHII genomic library using standard methods (Horner, Hirt, and Embley 1999). Conceptual translation of the S. barkhanus cpn60 sequence indicated a peptide of 512 amino acids with a putative N-terminal extension of 16 amino acids relative to proteobacterial sequences (fig. 1). This extension is similar in length to the putative mitochondrial targeting peptides of apicomplexan and euglenozoan cpn60 identified by alignment (not shown). The S. barkhanus putative leader peptide resembles protist cpn60 mitochondrial targeting signals in the presence of aromatic residues and the absence of negatively charged side chains. The primary sequence similarity between protist cpn60 mitochondrial targeting peptides is rather low, which hinders inferences of possible function. However, mitochondrial targeting peptide detection software (www.mips.biochem.mpg.de/cgi-bin/proj/medgen/mitofilter) did recognize the Spironucleus N-terminal extension as a plausible targeting peptide (fig. 1) with approximately the same confidence as the Trichomonas vaginalis presequence, which has been shown to be functional in this respect (Bui, Bradley, and Johnson 1996). No potential targeting peptide was detected in the G. lamblia sequence, nor was the functional Entamoeba histolytica mitosome targeting presequence (Mai et al. 1999; Tovar, Fischer, and Clark 1999) recognized by this software (fig. 1). Studies using antibodies heterologous to GroEL/cpn60 have previously indicated a punctate localization of this gene product in G. lamblia (Soltys and Gupta 1994; Roger et al. 1998), but no membrane-bounded organelle was demonstrated.
Eukaryote mitochondrial and representative bacterial cpn60/GroEL gene sequences were recovered from GenBank. Translated gene products were aligned using CLUSTAL W (Thompson, Higgins, and Gibson 1994) and refined manually. DNA alignments were back-aligned using the program PUTGAPS (J. O. McInerney, NHM). Ambiguously aligned regions were removed, leaving 280 aligned codons (77 taxa) for phylogenetic analyses. To reduce the effects of compositional bias and saturation on phylogenetic reconstruction, codon position 3 was excluded. Because we observed base composition variation between cpn60 sequences for codon positions 1+2 (43.3% for Plasmodium falciparum to 56.6% for Bradyrhizobium japonicum), we used LogDet/Paralinear DNA distances (Lake 1994; Lockhart et al. 1994) in PAUP 4.0b4a (Swofford 1998). To partially correct for site rate variation, we removed (Waddell and Steel 1997) the proportion of inferred invariable sites (17% of all sites, 100% of constant sites) estimated using maximum likelihood (ML) (Lockhart et al. 1996; Hirt et al. 1999), leaving 464 variable nucleotide positions for analysis (77 taxa). It has been suggested that for data sets that display nucleotide composition variation, it may be better to analyze inferred protein sequences rather than DNA sequences (Hasegawa and Hashimoto 1993), the expectation being that compositional bias will be mitigated at the amino acid level, allowing the correct tree to be recovered. Balanced against this view is the loss of information from collapsing characters from DNA to protein (Yang and Roberts 1995; Miyamoto and Fitch 1996) and observations that amino acid composition biases also occur in cpn60 protein sequences (Roger et al. 1998). ML analyses of protein alignments were performed with PROTML in MOLPHY 2.3 (Adachi and Hasegawa 1996) using the heuristic quick-add OTU method, with the JTT-f amino acid replacement model. Since PROTML assumes that all sites can vary, we first removed the proportion of sites inferred to be invariable (9.6% of all sites, 90% of constant sites) using a variable/invariable model in PUZZLE (Strimmer and von Haeseler 1996). The resulting data set contained 253 amino acid positions for 77 taxa.
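The DNA-distance steps described above (dropping codon position 3, checking base composition at positions 1+2, and computing a LogDet/Paralinear distance) can be sketched in a few lines of Python. This is an illustrative sketch only, not the PAUP implementation used in the study. The function names are hypothetical, G+C percentage is assumed as the composition measure, the invariable-site removal step is omitted, and the distance uses the normalized paralinear form d = -(1/4) ln[det F / sqrt(det Dx det Dy)], where F is the 4 x 4 joint base-frequency matrix for a sequence pair and Dx, Dy are diagonal matrices of each sequence's base frequencies.

```python
import math

BASES = "ACGT"

def first_second_positions(cds):
    """Drop every third nucleotide (codon position 3) from an in-frame coding sequence."""
    return "".join(base for i, base in enumerate(cds) if i % 3 != 2)

def gc_percent(seq):
    """Percentage of G+C among unambiguous bases (assumed composition measure)."""
    counted = [b for b in seq if b in BASES]
    return 100.0 * sum(b in "GC" for b in counted) / len(counted)

def determinant(matrix):
    """Determinant of a square matrix by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def paralinear_distance(seq_x, seq_y):
    """Paralinear/LogDet distance between two aligned nucleotide sequences."""
    # Keep only columns where both sequences have an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq_x, seq_y) if a in BASES and b in BASES]
    total = len(pairs)
    # Joint frequency matrix F: rows index bases in seq_x, columns bases in seq_y.
    joint = [[0.0] * 4 for _ in range(4)]
    for a, b in pairs:
        joint[BASES.index(a)][BASES.index(b)] += 1.0 / total
    row_freqs = [sum(row) for row in joint]
    col_freqs = [sum(joint[i][j] for i in range(4)) for j in range(4)]
    det_joint = determinant(joint)
    norm = math.sqrt(math.prod(row_freqs) * math.prod(col_freqs))
    if det_joint <= 0 or norm == 0:
        raise ValueError("distance undefined for this pair")
    return -0.25 * math.log(det_joint / norm)
```

Because the distance is built from the full joint frequency matrix rather than from a single stationary base composition, it does not assume that the two sequences share the same composition, which is the property that motivates its use here.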
The altered pattern of amino acid conservation for diplomonad cpn60 discussed above is also expected to affect phylogenetic analyses of these data. Thus, the diplomonad branches are by far the longest in all trees (fig. 2), and inspection of the cpn60 protein alignments revealed a number of unambiguously aligned sites which were constant for all eukaryote sequences except for S. barkhanus (26 sites), G. lamblia (11 sites), or both (15 sites). Other groups of taxa contributed far fewer such sites (five for the plant clade, three for the fungal clade). Moreover, the proportions, as well as the distributions, of variable sites appear to be different for cpn60 sequences from mitochondriate and secondarily amitochondriate lineages. Using a codon capture/recapture method (Sidow, Nguyen, and Speed 1992), the number of variable codons in the 280-amino-acid data set was inferred to be 211 among aerobic, mitochondrial taxa. Inclusion of diplomonad, Trichomonas, and Entamoeba sequences increased this value to 245 codons. These observations demonstrate that the number and distribution of invariable sites differ across the cpn60 tree (Miyamoto and Fitch 1995; Lockhart et al. 1996) and suggest that phylogenetic inferences from this data set which use homogeneous models should be treated with caution.
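The general idea behind a capture/recapture estimate of the number of variable sites can be illustrated with the classic Lincoln-Petersen estimator. The sketch below is an assumption about the simplest form of such a calculation and is not a reimplementation of the Sidow, Nguyen, and Speed (1992) procedure, whose details may differ. The function names and the toy sequences are hypothetical. Two groups of aligned sequences are treated as two independent "captures" of variable columns, and the number of columns variable in both is the "recapture".

```python
def variable_columns(alignment):
    """Return the indices of alignment columns showing more than one state."""
    return {i for i, column in enumerate(zip(*alignment)) if len(set(column)) > 1}

def capture_recapture_estimate(group_a, group_b):
    """Lincoln-Petersen estimate of the total number of variable columns.

    group_a and group_b are lists of equal-length aligned sequences
    covering the same alignment columns.
    """
    var_a = variable_columns(group_a)
    var_b = variable_columns(group_b)
    both = var_a & var_b
    if not both:
        raise ValueError("no columns are variable in both groups")
    return len(var_a) * len(var_b) / len(both)

# Toy example with invented four-residue sequences.
group_a = ["MKLV", "MRLV", "MKIV"]   # columns 1 and 2 vary
group_b = ["MKLV", "MKIA", "MKLV"]   # columns 2 and 3 vary
estimate = capture_recapture_estimate(group_a, group_b)
```

In this toy case two columns vary in each group and one varies in both, so the estimate is 2 x 2 / 1 = 4 variable columns, more than either group reveals on its own.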
Spironucleus barkhanus cpn60 forms a strongly supported monophyletic group, irrespective of method of analysis, with the one from Giardia lamblia (fig. 2). The simplest explanation for this group is that these two diplomonads inherited their cpn60 gene from a common ancestor. Since Spironucleus and Giardia represent separate arms of what is currently recognized as the deepest split among diplomonads (Rozario et al. 1996), the cpn60 gene was already present early in the history of this group. A weakly supported relationship between T. vaginalis and the diplomonads was recovered in the best LogDet/Paralinear distances tree and in the eukaryotes-only protein ML tree (fig. 2B). Thus, the cpn60 sequence data, while not strongly supporting it, cannot exclude a potential sister relationship between diplomonads and T. vaginalis, which has been suggested by analysis of other molecular data sets (Edlind et al. 1996; Keeling and Doolittle 1996; Hashimoto et al. 1997, 1998; Roger et al. 1998; Hirt et al. 1999). Given that mitochondria and hydrogenosomes share common ancestry (Embley, Horner, and Hirt 1997; Akhmanova et al. 1998; Martin and Müller 1998; Dyall and Johnson 2000), a relationship between Giardia and Trichomonas would support secondary loss of mitochondria in diplomonads (Embley and Hirt 1998; Roger 1999).
A relationship between Giardia and Entamoeba, as recovered by Roger et al. (1998), was not supported in our phylogenetic analyses when we introduced corrections to mitigate known problems with the cpn60 DNA and protein data sets (Roger et al. 1998) (fig. 2). Instead, both LogDet/Paralinear distances (with strong bootstrap support) and protein ML (with weak bootstrap support) recovered E. histolytica together with the aerobic mitochondrion-containing slime mold Dictyostelium discoideum. The presence of a shared single amino acid deletion (fig. 1) is consistent with the hypothesis that these two organisms are related. Several authors have noted the instability of the placement of E. histolytica gene sequences in phylogenetic trees (Keeling and Doolittle 1996; Baldauf 1999). However, close scrutiny of published gene trees reveals that E. histolytica and D. discoideum appear together sufficiently frequently to provide a good working hypothesis of relationship. For example, trees based on α-tubulin (Keeling and Doolittle 1996), small-subunit RNA (Kumar and Rzhetsky 1996), and EF-2 (Hirt et al. 1999) have all depicted this topology. Moreover, Cavalier-Smith (1998) recently suggested uniting these taxa within the subphylum Conosa of the phylum Amoebozoa based on morphological features.
Like Roger et al. (1998), we found an α-proteobacterial origin for Giardia (and Spironucleus) cpn60, and, like them, we believe that acquisition of this gene from the mitochondrial endosymbiont is the preferred explanation. Nevertheless, Roger et al. (1998), Sogin (1997), and Doolittle (1998) have posed other explanations involving HGT that they suggested could also account for the mitochondrial-like phylogeny of Giardia cpn60 and that would preserve the status of Giardia as primitively amitochondriate (a bona fide archezoan sensu Cavalier-Smith [1983]). The first of these alternatives was the casual suggestion that HGT of cpn60 from another eukaryote to Giardia could explain the cpn60 data (Roger et al. 1998). As we show here, there is no support from phylogenetic analyses for this suggestion. The second alternative posits that Giardia could have obtained its cpn60 gene through HGT from an α-proteobacterial donor that was distinct from the mitochondrial endosymbiont (Sogin 1997; Doolittle 1998; Roger et al. 1998). It is germane to consider what observations might actually require, or provide support for, this scenario.
Available data indicate a single origin of mitochondria (Gray, Burger, and Lang 1999). If all mitochondrial eukaryotes obtained their cpn60 from one source (the mitochondrion endosymbiont) and the diplomonads from another (not the mitochondrion endosymbiont), then these cpn60 genes should form two distinct clades. If descendants of the different α-proteobacterial donor lineages were not included in the analysis, or if they were extremely closely related, the diplomonad sequences should still constitute the basal branch in a eukaryote clade. In our analyses, bootstrap support for a clade containing all eukaryotic cpn60 was only moderate (64%) from LogDet/Paralinear distances of DNA sequences, but there was no evidence from bootstrap partitions that diplomonad sequences particularly eroded that support. Furthermore, the diplomonad cpn60 sequences were recovered as an internal branch within the mitochondrial clade, not as a sister to all other eukaryotes, as the α-proteobacterial-but-not-mitochondrion endosymbiont scenario predicts.
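The topological prediction discussed above (diplomonads as the basal branch of the eukaryote clade versus nested within it) can be expressed as a simple check on a rooted tree. The following Python sketch is illustrative only: the nested-tuple tree representation, the function names, and both toy topologies are invented for the example and are not trees from this study.

```python
def leaves(tree):
    """Return the set of leaf names in a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return {tree}
    names = set()
    for child in tree:
        names |= leaves(child)
    return names

def is_sister_to_rest(clade, group):
    """True if one child of the clade's root holds exactly the taxa in `group`,
    i.e. the group is the basal branch, sister to everything else in the clade."""
    group = set(group)
    return any(leaves(child) == group for child in clade)

diplomonads = {"Giardia", "Spironucleus"}

# Toy topology predicted by an independent-donor scenario: diplomonads basal.
basal_tree = (("Giardia", "Spironucleus"),
              (("Trichomonas", "Homo"), "Saccharomyces"))

# Toy topology with diplomonads nested inside the eukaryote clade.
nested_tree = ((("Giardia", "Spironucleus"), "Trichomonas"),
               ("Homo", "Saccharomyces"))

basal_result = is_sister_to_rest(basal_tree, diplomonads)    # True
nested_result = is_sister_to_rest(nested_tree, diplomonads)  # False
```

A check of this kind only describes topology. Whether a given placement is meaningful still depends on the support values for the relevant branches.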
In conclusion, we observe no relationships in our phylogenetic analyses that require ad hoc explanations involving lateral gene transfer from obscure or unknown donors as a source of diplomonad cpn60 genes. The obvious explanation of the trees is that the two highly divergent diplomonads G. lamblia and S. barkhanus are not archezoans, but, like other eukaryotes, inherited their cpn60 genes from a common ancestor that once contained the mitochondrial endosymbiont.
Supplementary Material
The sequence reported in this paper was deposited in GenBank under accession number AY033509.
Acknowledgements
The EST survey on Spironucleus was in collaboration with Mark Ragan (Canadian Institute of Advanced Research). D.S.H. was supported by a fellowship awarded by the Natural History Museum. We thank Robert Hirt for comments on the manuscript.
Footnotes
William Martin, Reviewing Editor
1 Abbreviation: ML, maximum likelihood.
2 Keywords: cpn60, horizontal gene transfer, anaerobic eukaryotes, Archezoa, mitochondria, Giardia, Spironucleus, Entamoeba
3 Address for correspondence and reprints: T. Martin Embley, Department of Zoology, Natural History Museum, London SW7 5BD, United Kingdom. tme{at}nhm.ac.uk.
References
Adachi J., M. Hasegawa, 1996 MOLPHY version 2.3b: programs for molecular phylogenetics based on maximum likelihood Comput. Sci. Monogr 28:1-150
Akhmanova A., F. Voncken, T. van Alen, A. van Hoek, B. Boxma, G. Vogels, M. Veenhuis, J. H. Hackstein, 1998 A hydrogenosome with a genome Nature 396:527-528
Baldauf S. L., 1999 A search for the origins of animals and fungi: comparing and combining molecular data Am. Nat 154:178-188
Braig K., Z. Otwinowski, R. Hegde, D. C. Boisvert, A. Joachimiak, A. L. Horwich, P. B. Sigler, 1994 The crystal structure of the bacterial chaperonin GroEL at 2.8 Å Nature 371:578-586
Brocchieri L., S. Karlin, 2000 Conservation among HSP60 sequences in relation to structure, function and evolution Protein Sci 9:476-486
Buckle A. M., R. Zahn, A. R. Fersht, 1997 A structural model for GroEL-polypeptide recognition Proc. Natl. Acad. Sci. USA 94:3571-3575
Bui E. T. N., P. J. Bradley, P. J. Johnson, 1996 A common evolutionary origin for mitochondria and hydrogenosomes Proc. Natl. Acad. Sci. USA 93:9651-9656
Cavalier-Smith T., 1983 A 6 kingdom classification and a unified phylogeny Pp. 1027-1034 in W. Schwemmler and H. E. A. Schenk, eds. Endocytobiology II. De Gruyter, Berlin
Cavalier-Smith T., 1998 A revised six-kingdom system of life Biol. Rev 73:203-266
Clarke C. G., A. J. Roger, 1995 Direct evidence for secondary loss of mitochondria in Entamoeba histolytica Proc. Natl. Acad. Sci. USA 92:6518-6521
Doolittle W. F., 1998 You are what you eat: a gene transfer ratchet could account for bacterial genes in eukaryotic nuclear genomes Trends. Genet 14:307-311
Dore J., D. A. Stahl, 1991 Phylogeny of anaerobic rumen chytridiomycetes inferred from small subunit ribosomal RNA sequence comparisons Can. J. Bot 69:1964-1971
Dyall S. B., P. J. Johnson, 2000 Origins of hydrogenosomes and mitochondria: evolution and organelle biogenesis Curr. Opin. Microbiol 3:404-411
Edlind T. D., J. Li, G. S. Visvesvara, M. H. Vodkin, G. L. McLaughlin, S. K. Katiyar, 1996 Phylogenetic analysis of β-tubulin sequences from amitochondrial protozoa Mol. Phylogenet. Evol 5:359-367
Embley T. M., B. J. Finlay, P. L. Dyal, R. P. Hirt, M. Wilkinson, A. G. Williams, 1995 Multiple origins of anaerobic ciliates with hydrogenosomes within the radiation of aerobic ciliates Proc. R. Soc. Lond. B Biol. Sci 262:87-93
Embley T. M., R. P. Hirt, 1998 Early branching eukaryotes? Curr. Opin. Genet. Dev 8:624-629
Embley T. M., D. S. Horner, R. P. Hirt, 1997 Anaerobic eukaryote evolution: hydrogenosomes as biochemically modified mitochondria? Trends Ecol. Evol 12:437-441
Felsenstein J., 1978 Cases in which parsimony or compatibility methods will be positively misleading Syst. Zool 27:401-410
Fenton W. A., K. Yechexkel, K. Furtak, A. L. Horwich, 1994 Residues in chaperonin GroEL required for polypeptide binding and release Nature 371:614-619
Gray M. W., G. Burger, B. F. Lang, 1999 Mitochondrial evolution Science 283:1476-1481
Hasegawa M., T. Hashimoto, 1993 Ribosomal RNA trees misleading Nature 361:23
Hashimoto T., Y. Nakamura, T. Kamaishi, M. Hasegawa, 1997 Early evolution of eukaryotes inferred from the amino acid sequences of elongation factors 1 and 2 Arch. Protistenkd 148:287-295
Hashimoto T., L. B. Sanchez, T. Shirakura, M. Müller, M. Hasegawa, 1998 Secondary absence of mitochondria in Giardia and Trichomonas revealed by valyl-tRNA synthetase phylogeny Proc. Natl. Acad. Sci. USA 95:6860-6865
Hirt R. P., J. M. Logsdon, B. Healy, M. W. Dorey, W. F. Doolittle, T. M. Embley, 1999 Microsporidia are related to fungi: evidence from the largest subunit of RNA polymerase II and other proteins Proc. Natl. Acad. Sci. USA 96:580-585
Horner D. S., R. P. Hirt, T. M. Embley, 1999 A single eubacterial origin of eukaryotic pyruvate : ferredoxin oxidoreductase genes: implications for the evolution of anaerobic eukaryotes Mol. Biol. Evol 16:1280-1291
Horner D. S., R. P. Hirt, S. Kilvington, D. Lloyd, T. M. Embley, 1996 Molecular data suggest an early acquisition of the mitochondrion endosymbiont Proc. R. Soc. Lond. B Biol. Sci 263:1053-1059
Keeling P. J., W. F. Doolittle, 1996 α-tubulin from early-diverging eukaryotic lineages and the evolution of the tubulin family Mol. Biol. Evol 13:1297-1305
Kumar S., A. Rzhetsky, 1996 Evolutionary relationships of eukaryotic kingdoms J. Mol. Evol 42:183-193
Lake J. A., 1994 Reconstructing evolutionary trees from DNA and protein sequences: paralinear distances Proc. Natl. Acad. Sci. USA 91:1455-1459
Lockhart P. J., A. W. D. Larkum, M. A. Steel, P. J. Waddel, D. Penny, 1996 Evolution of chlorophyll and bacteriochlorophyll: the problem of invariant sites in sequence analysis Proc. Natl. Acad. Sci. USA 93:1930-1934
Lockhart P. J., M. A. Steel, M. D. Hendy, D. Penny, 1994 Recovering evolutionary trees under a more realistic model of sequence evolution Mol. Biol. Evol 11:605-612
Mai Z., S. Ghosh, M. Frisardi, B. Rosenthal, R. Rogers, J. Samuelson, 1999 Hsp60 is targeted to a cryptic mitochondrion-derived organelle ("crypton") in the microaerophilic protozoan parasite Entamoeba histolytica Mol. Cell. Biol 19:2198-2205
Martin J., 1997 Molecular chaperones and mitochondrial protein folding J. Bioenerg. Biomembr 29:35-43
Martin W., M. Müller, 1998 The hydrogen hypothesis for the first eukaryote Nature 392:37-41
Miyamoto M. M., W. M. Fitch, 1995 Testing the covarion hypothesis of molecular evolution Mol. Biol. Evol 12:503-513
Miyamoto M. M., W. M. Fitch, 1996 Constraints on protein evolution and the age of the eubacteria/eukaryote split Syst. Biol 45:568-575
Roger A. J., 1999 Reconstructing early events in eukaryotic evolution Am. Nat 154:S146-S163
Roger A. J., C. G. Clarke, W. F. Doolittle, 1996 A possible mitochondrial gene in the early branching amitochondriate protist Trichomonas vaginalis Proc. Natl. Acad. Sci. USA 93:14618-14622
Roger A. J., S. G. Svard, J. Tovar, C. G. Clark, M. W. Smith, F. D. Gillin, M. L. Sogin, 1998 A mitochondrial-like chaperonin 60 gene in Giardia lamblia: evidence that diplomonads once harbored an endosymbiont related to the progenitor of mitochondria Proc. Natl. Acad. Sci. USA 95:229-234
Rozario C., L. Morin, A. J. Roger, M. W. Smith, M. Müller, 1996 Primary structure and phylogenetic relationships of glyceraldehyde-3-phosphate dehydrogenase gene of free-living and parasitic diplomonad flagellates J. Eukaryot. Microbiol 43:330-340
Sidow A., T. Nguyen, T. P. Speed, 1992 Estimating the fraction of invariable codons with a capture-recapture method J. Mol. Evol 35:253-260
Sogin M. L., 1997 History assignment: when was the mitochondrion founded? Curr. Opin. Genet. Dev 7:792-799
Soltys B. J., R. S. Gupta, 1994 Presence and cellular distribution of a 60-KDa protein related to mitochondrial HSP60 in Giardia lamblia J. Parasitol 80:580-590
Strimmer K., A. von Haeseler, 1996 Quartet puzzling: a quartet maximum likelihood method for reconstructing tree topologies Mol. Biol. Evol 13:964-969
Swofford D. L., 1998 PAUP*. Phylogenetic analysis using parsimony (*and other methods) Sinauer, Sunderland, Mass
Thompson J. D., D. G. Higgins, T. J. Gibson, 1994 CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position specific gap penalties and weight matrix choice Nucleic Acids Res 22:4673-4680
Tovar J., A. Fischer, C. G. Clark, 1999 The mitosome, a novel organelle related to mitochondria in the amitochondrial parasite Entamoeba histolytica Mol. Microbiol 32:1013-1021
Viale A. M., A. K. Arakaki, 1994 The chaperone connection to the origins of the eukaryotic organelles FEBS Lett 341:146-151
Viitanen P. V., G. H. Lorimer, R. Seetharam, R. S. Guptal, J. Oppenheim, J. O. Thomas, N. J. Cowan, 1992 Mammalian mitochondrial chaperonin 60 functions as a single toroidal ring J. Biol. Chem 267:695-698
Waddell P. J., M. A. Steel, 1997 General time-reversible distances with unequal rates across sites: mixing gamma and inverse Gaussian distributions with invariant sites Mol. Phylogenet. Evol 8:398-414
Yang Z., D. Roberts, 1995 On the use of nucleic acid sequences to infer early branchings in the tree of life Mol. Biol. Evol 12:451-458