* Department of Biological Sciences, Smith College, Northampton, Massachusetts
Program in Organismic and Evolutionary Biology, University of Massachusetts-Amherst
Bioinformatics Research Center, Department of Statistics, North Carolina State University, Raleigh
Correspondence: E-mail: lkatz{at}smith.edu.
Abstract
Key Words: protein evolution, histone H4, ciliates, macronucleus, chromosomal rearrangements, fate of paralogs
Introduction
In most nonciliate eukaryotes, histone H4 proteins differ at only a few amino acid positions, demonstrating that a limited number of polypeptides can maintain structure and proper gene expression (Luger et al. 1997; Wolffe and Hayes 1999). Histone H4 has the slowest rate of nonsynonymous substitution among 27 proteins studied in mammals; further, only three amino acid substitutions were found in comparisons of 16 eukaryotic taxa that included several vertebrates, echinoderms, arthropods, and plants (Graur 1985; Wells and McBride 1989). Exceptions to this conservative pattern of evolution come from taxa in poorly sampled eukaryotic clades, including Giardia (Wu et al. 2000), Leishmania (Lukes and Maslov 2000), and Entamoeba (Binder et al. 1995).
Similarly, paralogs (duplicated copies) of histone H4 within the genomes of most eukaryotes encode identical or very similar amino acid sequences. With only one exception, the 12 human histone H4 paralogs differ by no more than 2 aa across the entire length of the gene (Piontkivska, Rooney, and Nei 2002) and are identical in amino acid sequence for the region used in this study. The single exception to this pattern (GenBank accession number Z80788) appears to be a highly divergent pseudogene because no expression of this paralog was detected in analyses of several human cell lines (Doenecke, personal communication). The conservation of amino acid but not nucleotide sequences among nonciliate histone H4 paralogs indicates that purifying selection eliminates nonsynonymous substitutions within most eukaryotic genomes (Piontkivska, Rooney, and Nei 2002). In contrast, previous studies suggested that the histone H4 protein diversifies faster in ciliates than in other eukaryotes (Sadler and Brunk 1992; Salvini et al. 1997; Bernhard and Schlegel 1998), although these analyses focused mainly on qualitative assessments of overall patterns of divergence. For instance, there are more amino acid differences in histone H4 between two classes of ciliates (Spirotrichea and Oligohymenophorea) than there are between land plants and animals (Bernhard and Schlegel 1998).
To elucidate the pattern of histone H4 diversity within ciliates and to compare patterns of substitutions with those in other well-sampled eukaryotic clades, we sequenced a portion of the histone H4 gene from 13 ciliate species. Ciliates studied include representatives from six of nine classes of ciliates along with two related orders of uncertain taxonomic position (table 1). We analyzed variation among major eukaryotic clades using a genealogical perspective and compared patterns of nucleotide and amino acid substitutions among histone H4 paralogs.
Materials and Methods
Characterization of Ciliate Histone Sequences
We amplified a portion of histone H4 from total-genomic DNAs using gene-specific primers H4F011+ ([CUA]4 GGNRTNACNAARCCNGCNAT) and H4R011- ([CAU]4 TTNARNGCRTANACNACRTC) and platinum Taq DNA polymerase (GibcoBRL). PCR products purified using the Qiaquick PCR purification (Qiagen) were cloned in pAMP1 vector (GibcoBRL) and miniprepped using the Qiaprep Spin Miniprep kit (Qiagen). Sequences were generated from all clones in both directions using the BigDye terminator RR mix from PE Applied Biosystems. Reactions were cleaned using gel filtration cartridges from Edge Biosystems, and samples were run on either an ABI 3100 or ABI 377 automated sequencer. For paralogs represented by multiple clones, we randomly chose a sequence to represent the paralog for analyses.
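The primers above use IUPAC ambiguity codes (N for any base, R for A or G), so each one is really a pool of distinct oligonucleotides. As a rough illustration, the size of each pool can be computed as below. This is only a sketch: the 5' tails ([CUA]4 and [CAU]4) are left out, and the function and variable names are hypothetical.

```python
# Number of bases each IUPAC nucleotide code can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Return how many distinct sequences a degenerate primer represents."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


# Gene-specific portions of the two primers, without the 5' tails.
H4F011 = "GGNRTNACNAARCCNGCNAT"
H4R011 = "TTNARNGCRTANACNACRTC"

print(degeneracy(H4F011))  # 4096
print(degeneracy(H4R011))  # 2048
```

The forward primer has five N and two R positions (4^5 x 2^2 = 4096), and the reverse primer has four N and three R positions (4^4 x 2^3 = 2048).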
Data Analysis
We searched for histone H4 GenBank entries using the Entrez browser through the program Editseq implemented by DNASTAR, Inc. We limited our searches to exclude ESTs, GSSs, patented sequences, and sequences less than 100 nt, and only included sequences from clades containing at least three genera. For the well-sampled taxa (plants + green algae ["greens"], animals, and fungi), we only included species with paralogs to maintain a data set of reasonable size (table S1 in Supplementary Material online at www.mbe.oupjournals.org). Multiple sequence alignments, including our contigs assembled in Seqman (DNAStar, Inc), were generated by ClustalW (Thompson, Higgins, and Gibson 1994) as implemented by Megalign (DNAStar, Inc) with a gap penalty of 10 and gap length penalty of 10.
To construct genealogies, we used PAUP* version 4.0b10 software (Swofford 2002), implementing the Neighbor-Joining algorithm. Nucleotide distances were estimated using maximum-likelihood settings in PAUP* (Swofford 2002), with a GTR model and parameters estimated by Modeltest (Posada and Crandall 1998) as implemented in HyPhy (Kosakovsky Pond and Muse 2003). Amino acid distances were calculated in Tree-Puzzle version 5.0 (Strimmer and von Haeseler 1996) using a JTT model, with variation in rates among sites modeled by a gamma distribution with six rate classes.
We compared the rates of nonsynonymous and synonymous substitutions within the constrained animal, fungi, "green," and ciliate clades. The dN/dS ratios were estimated by maximum-likelihood methods in HyPhy (Kosakovsky Pond and Muse 2003), with the MG94_3x4 model of substitution on a constrained topology (see below). The ratio of the synonymous and nonsynonymous rates estimated by this model (Muse and Gaut 1994) is very similar to the ratio of the expected number of nonsynonymous substitutions per nonsynonymous site to the expected number of synonymous substitutions per synonymous site (dN/dS [Muse 1996]), and hence we use this term. Values of dN/dS were compared using Kruskal-Wallis tests corrected for ties (P = 0.003) (Kruskal and Wallis 1953), Mood's median test (Mood 1950), and Fisher's exact test implemented by the program RxC (bioweb.usu.edu/mpmbio).
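For readers unfamiliar with the tie correction mentioned above, the Kruskal-Wallis statistic can be sketched in a few lines. This is a textbook formulation using midranks, not the software used in this study; the function name is hypothetical, and converting H to a P value (by a chi-square approximation with k - 1 degrees of freedom) is left out.

```python
from collections import Counter


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic, corrected for ties.

    `groups` is a list of lists of observations (e.g., dN/dS values per clade).
    """
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    counts = Counter(pooled)

    # Assign each distinct value the average (mid) rank of its tied block.
    rank_of = {}
    start = 1
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = start + (t - 1) / 2
        start += t

    # Uncorrected H from the rank sums of each group.
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    tie_term = sum(t ** 3 - t for t in counts.values())
    correction = 1 - tie_term / (n ** 3 - n)
    return h / correction if correction > 0 else 0.0
```

Because many dN/dS estimates are exactly 0 or sit at the cap, ties are common in data of this kind, which is why the corrected form matters.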
A primary goal of the analyses is to identify clades with exceptionally high dN/dS ratios. Because some of our analyses involve averaging dN/dS ratios over lineages, we wanted to limit the effect of outlying observations, which are likely to arise from the sampling properties of ratio estimators. Toward that end, we restricted the range of estimates of dN and dS by setting values less than 10^-10 to 0 and estimates of 10 or more to 10. Similarly, dN/dS values were capped at 2, including infinite (N/0) ratios, and estimates of dN/dS equal to 0/0 were set to 0. We believe these settings make our analyses conservative in identifying clades with high dN/dS values: they minimize the effects of ratios with extreme values in either the numerator or denominator, which likely result from large sampling variation, and they only reduce large dN/dS estimates, making it more difficult for an estimate to be identified as extreme.
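The bounding rules described above can be written out as a short function. This is a minimal sketch of the stated thresholds, not the authors' code. The names are hypothetical, and it assumes the bounds on dN and dS are applied before the ratio is formed.

```python
RATE_FLOOR = 1e-10   # dN or dS estimates below this are set to 0
RATE_CAP = 10.0      # dN or dS estimates of 10 or more are set to 10
RATIO_CAP = 2.0      # dN/dS values are capped at 2


def bound_rate(rate):
    """Apply the floor and cap to a single dN or dS estimate."""
    if rate < RATE_FLOOR:
        return 0.0
    return min(rate, RATE_CAP)


def bounded_dn_ds(dn, ds):
    """Return the dN/dS ratio after applying the bounding rules."""
    dn, ds = bound_rate(dn), bound_rate(ds)
    if ds == 0.0:
        # 0/0 is set to 0; N/0 (an infinite ratio) is set to the cap.
        return 0.0 if dn == 0.0 else RATIO_CAP
    return min(dn / ds, RATIO_CAP)


print(bounded_dn_ds(0.0, 0.0))      # 0.0 (0/0 case)
print(bounded_dn_ds(0.1, 0.0))      # 2.0 (infinite ratio capped)
print(bounded_dn_ds(1e-12, 1e-12))  # 0.0 (both below the floor)
print(bounded_dn_ds(50.0, 1.0))     # 2.0 (dN capped at 10, ratio capped at 2)
```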
Results
We found no evidence of paralogs in the populations of Stentor sp., Strombidium sp., or Tokophrya lemnarum, even though up to 11 clones were sampled from these three taxa (table 1). Clearly, both the number of clones we sampled and the relative amplification of each paralog within the macronuclear genome affect the number of paralogs identified, and more paralogs are likely to exist. Although we did not characterize the termination codon, all but one of our sequences can be translated into an open reading frame using the universal genetic code. The exception is H. grandinella paralog P4 (HgraP4), which has an in-frame TAA and needs the ciliate genetic code (NCBI translation table 6) to generate an open reading frame. Clonal lines and population samples showed similar patterns of sequence divergence (table 1).
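The effect of the genetic code on an in-frame TAA can be illustrated with a small translation routine. In NCBI translation table 6, TAA and TAG encode glutamine (Q) rather than stop. The sketch below builds both codon tables from the NCBI amino acid strings (bases ordered T, C, A, G); the short test sequence is hypothetical and is not the HgraP4 sequence.

```python
from itertools import product

BASES = "TCAG"
# Amino acid strings for NCBI translation tables 1 (standard) and 6 (ciliate nuclear).
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CILIATE = "FFLLSSSSYYQQCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def codon_table(amino_acids):
    """Map each of the 64 codons to its amino acid ('*' marks a stop)."""
    codons = ("".join(c) for c in product(BASES, repeat=3))
    return dict(zip(codons, amino_acids))


def translate(seq, table):
    """Translate a DNA sequence in frame 1, ignoring any trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(table[seq[i:i + 3]] for i in range(0, usable, 3))


fragment = "GGTATCTAAAAGCCC"  # hypothetical fragment with an in-frame TAA
print(translate(fragment, codon_table(STANDARD)))  # GI*KP
print(translate(fragment, codon_table(CILIATE)))   # GIQKP
```

Under the standard code the reading frame is interrupted at the TAA, whereas under table 6 the same fragment translates without a stop.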
Genealogical analyses of ciliate histone H4 sequences, along with paralogs from animals, fungi, and "greens," yield topologies that are discordant with both morphology and other molecular genealogies. For example, the fungi and ciliates are polyphyletic in nucleotide and amino acid genealogies generated in a NJ analysis with ML distances (figure S1 in Supplementary Material online). Our genealogies also fail to show evidence of ancient paralogs (figure 1 and figure S1 in Supplementary Material online), inferred when parallel clades are generated for a given set of taxa. Hence, we conclude that heterogeneity in rates of evolution explains the discordant topologies.
The ratio of nonsynonymous to synonymous substitution rates, estimated by maximum likelihood using a codon-based model of sequence evolution (Muse and Gaut 1994) on the constrained topology, is higher in ciliates than in other eukaryotic clades. Comparisons of binned dN/dS estimates reveal that ciliates have significantly greater than expected numbers of values of 0.2 or more compared with other clades (Fisher's exact test, P < 0.001 [table 2]). Similarly, 10% trimmed mean dN/dS (trimmed to reduce the impact of outliers) is highest in ciliates (dN/dS = 0.045) and is at least twice that found in animals (dN/dS = 0.000), "greens" (dN/dS = 0.002), and fungi (dN/dS = 0.021 [table 2]). Nonparametric tests for comparing the distributions of dN/dS values among the four clades are highly significant (P < 0.001 in Kruskal-Wallis adjusted for ties, and P = 0.006 in median test), and both tests indicate that ciliate dN/dS values are the largest.
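A trimmed mean of the kind reported above can be sketched as follows. The text does not specify exactly how the 10% was removed, so this version assumes that 10% of the observations are dropped from each tail of the sorted values; the function name and the example data are hypothetical.

```python
def trimmed_mean(values, proportion=0.10):
    """Mean after dropping `proportion` of the observations from each tail."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # observations removed per tail
    kept = ordered[k:len(ordered) - k] if k else ordered
    return sum(kept) / len(kept)


# Hypothetical dN/dS estimates with one value at the cap of 2.
estimates = [0.0, 0.0, 0.0, 0.0, 0.1, 0.1, 0.2, 0.3, 0.5, 2.0]
print(round(trimmed_mean(estimates), 3))  # 0.15
```

In this example the untrimmed mean is 0.32, so removing one value from each tail roughly halves the estimate, which shows how a single capped ratio can dominate a simple average.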
Discussion
The diversity of histone H4 proteins in ciliates may be related to the presence of two functionally distinct genomes within each cell: the transcriptionally inactive micronucleus and the functional macronucleus. We propose three hypotheses for the dramatic protein evolution in ciliate histone H4s: (1) as previously suggested, relaxed functional constraint on the nucleosomes of amitotic macronuclei allows for an increase in the ratio of nonsynonymous substitutions to synonymous substitutions in comparison to other eukaryotes (Sadler and Brunk 1992; Salvini et al. 1997; Bernhard and Schlegel 1998); (2) adaptive substitutions accumulate in micronuclear copies of paralogs that are essentially hidden from selection in the macronucleus during asexual divisions; and (3) the divergence reflects the accumulation of deleterious mutations. It is likely that all three hypotheses interact to produce the observed pattern. However, we believe that the third hypothesis alone, the accumulation of deleterious mutations, cannot account for the observed level of variation given the tremendous genetic load that this hypothesis would suggest has accumulated within the approximately 850 to 2,200 Myr old ciliate clade (Knoll 1992; Wright and Lynn 1997; Cavalier-Smith 2002).
If relaxed functional constraints explain most of the observed pattern, then we expect that ciliate histone H4 proteins will not function in nonciliate eukaryotes. Surprisingly, yeast transformed with the Tetrahymena thermophila histone H4 gene are viable (Fogel and Brunk 1997), despite 12 amino acid substitutions between the two proteins across the region we analyzed. This transformation result indicates that at least this one divergent ciliate protein has retained its essential function when present in the yeast nucleosome. A further prediction of a hypothesis of relaxed functional constraints on the macronuclear genome is that all ciliates will contain a conserved histone H4 protein that allows the condensation of chromatin during mitosis in micronuclei, as this nucleus contains a "conventional" (unprocessed) eukaryotic genome. We do not find any evidence for a conserved histone H4 gene within any of our taxa; for example, the predicted histone H4 proteins of all phyllopharyngean ciliates are highly divergent (clade "P" [fig. 1]).
In contrast, there is evidence to support the second hypothesis that adaptive evolution contributes to the variation observed among ciliate histone H4. Under this hypothesis, divergent paralogs evolve because the dual nature of ciliate genomes enables ciliates to maintain copies of paralogs in their micronuclei while "hiding" them from selection, or at least reducing the impact of selection, in their transcriptionally active macronuclei. All ciliates exhibit some degree of fragmentation and amplification of chromosomes during the development of macronuclei (Jahn and Klobutcher 2002; Yao, Duharcourt, and Chalker 2002). Selection on macronuclei differs from that of conventional nuclei, as processed macronuclear genomes have the potential to: (1) break up linkage groups such that the fate of paralogs is less affected by polymorphisms at linked loci; (2) allow assortment and recombination of paralogs during asexual divisions of the macronucleus, possibly resulting in the removal of deleterious mutations from the macronucleus that are still present in the micronucleus; (3) redefine "genetic load" through the differential amplification of macronuclear "chromosomes," thus making it relatively inexpensive for ciliates to carry duplicated genes and deleterious mutations; and/or (4) enable the accumulation of mutations in the unexpressed micronucleus until conjugation. In effect, during the asexual divisions that follow conjugation, a duplicated copy of a gene can experience reduced selection in the processed macronuclear genome (e.g., if it contains fewer [or no] copies of a deleterious gene), while potentially acquiring compensatory substitutions in the micronucleus. After conjugation, the parental macronucleus is replaced by a new macronucleus that essentially develops from the micronucleus, enabling the expression of any gene that has acquired compensatory changes. Such a mechanism may enable ciliates to explore protein space in a manner that is unique among eukaryotes.
This second hypothesis is further supported by the fact that the most divergent histone H4 genes are found in ciliates with extensively fragmented macronuclear chromosomes (members of the classes Phyllopharyngea and Spirotrichea, and the related orders Clevelandellida and Armophorida). Macronuclear "chromosomes" from ciliates in these three clades often contain only a single gene (Riley and Katz 2001). This extensive processing effectively breaks up linkage groups and potentially reduces the impact of deleterious mutations in macronuclei through assortment or recombination during asexual divisions more than in other ciliates.
Finally, if the dual nature of ciliate genomes allows the rapid evolution of histone H4 proteins, then we expect elevated rates of amino acid substitutions to also be found in other ciliate proteins. In fact, fast protein evolution has been reported for several ciliate genes, including elongation factor 1α, heat shock protein 70, actin, and eukaryotic release factor 1 (Bhattacharya and Ehlting 1995; Budin and Philippe 1998; Moreira, Le Guyader, and Philippe 1999; Moreira et al. 2002), and divergent paralogs of α-tubulin have been found in taxa with extensively processed macronuclear genomes (Israel et al. 2002). In most cases, elevated rates of protein evolution in ciliates are accompanied by the presence of divergent paralogs, suggesting that the potential of adaptive evolution through gene duplication (Ohno 1970; Hughes 1994; Force et al. 1999; Hughes 2000; Lynch and Conery 2000) may interact with the dual nature of ciliate genomes to explain the divergence among ciliate histone H4s.
Literature cited
Bernhard, D., and M. Schlegel. 1998. Evolution of histone H4 and H3 genes in different ciliate lineages. J. Mol. Evol. 46:344-354.
Bhattacharya, D., and J. Ehlting. 1995. Actin coding regions: gene family evolution and use as a phylogenetic marker. Archiv. fur Protistenkunde 145:155-164.
Binder, M., S. Ortner, B. Plaimauer, M. Fodinger, G. Wiedermann, O. Scheiner, and M. Duchene. 1995. Sequence and organization of an unusual histone H4 gene in the human parasite Entamoeba histolytica. Mol. Biochem. Parasitol. 71:243-247.
Budin, A., and H. Philippe. 1998. New insights into the phylogeny of eukaryotes based on ciliate Hsp70 sequences. Mol. Biol. Evol. 15:943-956.
Cavalier-Smith, T. 2002. The phagotrophic origin of eukaryotes and phylogenetic classification of protozoa. Int. J. Syst. Evol. Microbiol. 52:297-354.
Fogel, G. B., and C. F. Brunk. 1997. Expression of Tetrahymena histone H4 in yeast. Biochim. Biophys. Acta Gene Struct. Express. 1354:116-126.
Force, A., M. Lynch, F. B. Pickett, A. Amores, Y. L. Yan, and J. Postlethwait. 1999. Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151:1531-1545.
Graur, D. 1985. Amino-acid composition and the evolutionary rates of protein-coding genes. J. Mol. Evol. 22:53-62.
Hughes, A. L. 1994. The evolution of functionally novel proteins after gene duplication. Proc. R. Soc. Lond. B Biol. Sci. 256:119-124.
Hughes, A. L. 2000. Adaptive evolution of genes and genomes. Oxford University Press, New York.
Israel, R. L., S. L. Kosakovsky Pond, S. V. Muse, and L. A. Katz. 2002. Evolution of duplicated alpha-tubulin genes in ciliates. Evolution 56:1110-1122.
Jahn, C. L., and L. A. Klobutcher. 2002. Genome remodeling in ciliated protozoa. Ann. Rev. Microbiol. 56:489-520.
Katz, L. A. 2001. Evolution of nuclear dualism in ciliates: a reanalysis in light of recent molecular data. Int. J. Syst. Evol. Microbiol. 51:1587-1592.
Knoll, A. H. 1992. The early evolution of eukaryotes: a geological perspective. Science 256:622-627.
Kosakovsky Pond, S., and S. Muse. 2003. HyPhy: hypothesis testing using phylogenies. (http://www.hyphy.org).
Kruskal, W. H., and W. A. Wallis. 1953. Use of ranks in one-criterion analysis of variance (errata). J. Am. Stat. Assoc. 48:907-911.
Luger, K., A. W. Mader, R. K. Richmond, D. F. Sargent, and T. J. Richmond. 1997. Crystal structure of the nucleosome core particle at 2.8 angstrom resolution. Nature 389:251-260.
Lukes, J., and D. A. Maslov. 2000. Unexpectedly high variability of the histone H4 gene in Leishmania. Parasitol. Res. 86:259-261.
Lynch, M., and J. S. Conery. 2000. The evolutionary fate and consequences of duplicate genes. Science 290:1151-1155.
Mood, A. M. 1950. Introduction to the theory of statistics. McGraw-Hill, New York.
Moreira, D., S. Kervestin, O. Jean-Jean, and H. Philippe. 2002. Evolution of eukaryotic translation elongation and termination factors: variations of evolutionary rate and genetic code deviations. Mol. Biol. Evol. 19:189-200.
Moreira, D., H. Le Guyader, and H. Philippe. 1999. Unusually high evolutionary rate of the elongation factor 1α genes from the Ciliophora and its impact on the phylogeny of eukaryotes. Mol. Biol. Evol. 16:234-245.
Muse, S. V. 1996. Estimating synonymous and nonsynonymous substitution rates. Mol. Biol. Evol. 13:105-114.
Muse, S. V., and B. S. Gaut. 1994. A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates with application to the chloroplast genome. Mol. Biol. Evol. 11:715-724.
Ohno, S. 1970. Evolution by gene duplication. Springer-Verlag, New York.
Piontkivska, H., A. P. Rooney, and M. Nei. 2002. Purifying selection and birth-and-death evolution in the histone H4 gene family. Mol. Biol. Evol. 19:689-697.
Posada, D., and K. Crandall. 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14:817-818.
Prescott, D. M. 1994. The DNA of ciliated protozoa. Microbiol. Rev. 58:233-267.
Raikov, I. B. 1982. The protozoan nucleus: morphology and evolution. Springer-Verlag, New York.
Riley, J. L., and L. A. Katz. 2001. Widespread distribution of extensive genome fragmentation in ciliates. Mol. Biol. Evol. 18:1372-1377.
Sadler, L. A., and C. F. Brunk. 1992. Phylogenetic relationships and unusual diversity in histone H4 proteins within the Tetrahymena pyriformis complex. Mol. Biol. Evol. 9:70-84.
Salvini, M., E. Bini, A. Santucci, and R. Batistoni. 1997. H4 histone in the macronucleus of Blepharisma japonicum (Protozoa, Ciliophora, Heterotrichida). FEMS Microbiol. Lett. 149:93-98.
Steinbrück, G., S. Raszikowski, M. Golembiewska-Skoczylas, and B. Sapetto-Rebow. 1995. Characterization of low and high molecular weight DNA in the macronucleus of the ciliate Chilodonella steini. Acta Protozool. 34:125-134.
Strimmer, K., and A. von Haeseler. 1996. Quartet puzzling: a quartet maximum likelihood method for reconstructing tree topologies. Mol. Biol. Evol. 13:964-969.
Swofford, D. 2002. PAUP*: phylogenetic analysis using parsimony (*and other methods). Version 4.0b10. Sinauer Associates, Sunderland, Mass.
Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. Clustal W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.
Wells, D., and C. McBride. 1989. A comprehensive compilation and alignment of histones and histone genes. Nucleic Acids Res. 17:R311-R346.
Wolffe, A. P., and J. J. Hayes. 1999. Chromatin disruption and modification. Nucleic Acids Res. 27:711-720.
Wright, A.-D. G., and D. H. Lynn. 1997. Maximum ages of ciliate lineages estimated using a small subunit rRNA molecular clock: crown eukaryotes date back to the Paleoproterozoic. Arch. Protistenkd. 148:329-341.
Wu, G., A. G. McArthur, A. Fiser, A. Sali, M. L. Sogin, and M. Muller. 2000. Core histones of the amitochondriate protist, Giardia lamblia. Mol. Biol. Evol. 17:1156-1163.
Yao, M. C., S. Duharcourt, and D. L. Chalker. 2002. Genome-wide rearrangements of DNA in ciliates. Pp. 730-758 in N. L. Craig, R. Craigie, M. Gellert, and A. Lambowitz, eds. Mobile DNA II. ASM Press, Washington D.C.
Zhang, J. 2000. Rates of conservative and radical nonsynonymous nucleotide substitutions in mammalian nuclear genes. J. Mol. Evol. 50:56-68.