Department of Biological Sciences, Mississippi State University, Mississippi State;
Institute of Molecular Evolutionary Genetics and Department of Biology, The Pennsylvania State University, University Park
Introduction
With the exception of histone H4, all histone genes can be classified into three main subtypes on the basis of their expression pattern and genomic organization (Isenberg 1979; Maxson, Cohn, and Kedes 1983; Wu et al. 1986; Doenecke et al. 1997): replication-dependent (RD), replication-independent (RI), and tissue-specific (TS) histones. These histones are encoded by multigene families. RI histones, which are also called replacement histones, are expressed at constant but low levels throughout the cell cycle and in quiescent differentiated cells. These histones may contain introns and are not found in histone gene clusters. Instead, they occupy solitary locations within the genome. In contrast, RD histones, which are also called nonreplacement histones, are expressed only during the S phase of the cell cycle, do not contain introns, and are organized as clusters (fig. 1). Finally, TS histones are expressed only in certain cell types, such as the testis. An additional numerical classification (H3.1, H3.2, etc.) is also often used. This numbering system is an arbitrary designation based on small amino acid sequence differences. For example, the vertebrate RI H3 genes are classified into the H3.3 category, whereas vertebrate RD genes can be classified into H3.1 or H3.2 gene categories. These variants are defined on the basis of migration patterns on Triton X-100 polyacrylamide gels, and they sometimes differ from one another by as little as one amino acid (e.g., mouse H3.1 vs. mouse H3.2). It is not clear whether these small differences result in functional differentiation, although this is apparently so in the case of larger amino acid differences (e.g., between mouse H3.1/H3.2 and mouse H3.3).
Histones, especially H3 and H4, are among the most conserved proteins in the eukaryotic genome. There are only about three amino acid differences between animal and plant H3 proteins, which are composed of 135 residues. Because of this high sequence similarity, the multigene families encoding histones are generally believed to be subject to concerted evolution, which homogenizes the member genes of a multigene family by interlocus gene recombination or gene conversion (e.g., Kedes 1979; Coen, Strachan, and Dover 1982; Holt and Childs 1984; Matsuo and Yamazaki 1989; DeBry and Marzluff 1994; Thatcher and Gorovsky 1994). However, protein sequence homogeneity can also be attained by strong purifying selection without concerted evolution. In this case, genes can evolve independently or according to the birth-and-death model of evolution (Nei and Hughes 1992; Nei, Gu, and Sitnikova 1997). This latter model of evolution assumes that new genes are created by repeated gene duplication and that some of these duplicated genes are maintained in the genome for a long time, whereas others are deleted or become nonfunctional. Birth-and-death evolution has been shown to be the primary mode of evolution for large multigene families, such as the major histocompatibility complex (MHC), immunoglobulin (Ig), antibacterial ribonuclease, and nematode chemoreceptor gene families (Nei and Hughes 1992; Ota and Nei 1994; Nei, Gu, and Sitnikova 1997; Robertson 1998; Zhang, Dyer, and Rosenberg 2000), as well as for smaller multigene families such as the ubiquitins (Nei, Rogozin, and Piontkivska 2000).
The purpose of this paper is to study the evolutionary mode of H3 genes in relation to the two aforementioned hypotheses. Assuming a rapid process of interlocus recombination or gene conversion, we would expect that in each species the proportion of synonymous nucleotide differences per synonymous site (pS) between member genes is similar or only slightly higher than the proportion of nonsynonymous nucleotide differences per nonsynonymous site (pN), irrespective of whether there is purifying selection. However, if birth-and-death evolution under strong purifying selection were the major evolutionary force, pS would be much higher than pN because the member genes might diverge extensively by silent nucleotide substitution. Therefore, one may be able to distinguish between these two modes of evolution by comparing pS and pN.
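The comparison of pS and pN described above can be illustrated with a minimal Python sketch. It follows the original, unmodified Nei-Gojobori counting scheme, so it ignores transition/transversion bias and is not the modified method applied in this study. The function names (`syn_sites`, `codon_diffs`, `ps_pn`), the treatment of changes to stop codons as nonsynonymous when counting sites, and the skipping of codons that contain gaps or ambiguity codes are all assumptions made for illustration.

```python
from itertools import permutations

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, with codons enumerated in TCAG order.
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def syn_sites(codon):
    """Number of synonymous sites in one sense codon (between 0 and 3)."""
    syn_changes = 0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            # Assumption: a change to a stop codon counts as nonsynonymous.
            if CODE[mutant] == CODE[codon]:
                syn_changes += 1
    return syn_changes / 3.0


def codon_diffs(c1, c2):
    """Synonymous and nonsynonymous differences between two sense codons,
    averaged over all mutational pathways that avoid stop codons."""
    diff_pos = [i for i in range(3) if c1[i] != c2[i]]
    paths = []
    for order in permutations(diff_pos):
        current, syn, non = c1, 0, 0
        for pos in order:
            nxt = current[:pos] + c2[pos] + current[pos + 1:]
            if CODE[nxt] == "*":
                break  # discard pathways that pass through a stop codon
            if CODE[nxt] == CODE[current]:
                syn += 1
            else:
                non += 1
            current = nxt
        else:
            paths.append((syn, non))
    if not paths:
        # Assumption: with no valid pathway, count every change as nonsynonymous.
        return 0.0, float(len(diff_pos))
    sd = sum(p[0] for p in paths) / len(paths)
    nd = sum(p[1] for p in paths) / len(paths)
    return sd, nd


def ps_pn(seq1, seq2):
    """Return (pS, pN) for two aligned, in-frame coding sequences."""
    if len(seq1) != len(seq2) or len(seq1) % 3 != 0:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    s_sites = sd = nd = 0.0
    n_codons = 0
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3].upper(), seq2[i:i + 3].upper()
        if c1 not in CODE or c2 not in CODE or "*" in (CODE[c1], CODE[c2]):
            continue  # skip gaps, ambiguity codes, and stop codons
        n_codons += 1
        s_sites += (syn_sites(c1) + syn_sites(c2)) / 2.0
        d_syn, d_non = codon_diffs(c1, c2)
        sd += d_syn
        nd += d_non
    n_sites = 3.0 * n_codons - s_sites
    if s_sites == 0 or n_sites == 0:
        raise ValueError("no synonymous or nonsynonymous sites to compare")
    return sd / s_sites, nd / n_sites


# Two hypothetical genes encoding the same peptide (Ala-Gly-Arg-Lys)
# that differ by one silent change in the first codon.
ps, pn = ps_pn("GCTGGTCGTAAA", "GCCGGTCGTAAA")
print(ps, pn)
```

In this hypothetical pair the encoded proteins are identical, so pN is zero while pS is about 0.3. Under birth-and-death evolution with strong purifying selection, this pattern (pS much greater than pN) would be expected between member genes; under rapid homogenization, both values would be expected to be low.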
Materials and Methods
In some instances, H3 gene sequences found in GenBank contained errors. These sequences are listed in the histone sequence database maintained by the National Human Genome Research Institute (http://genome.nhgri.nih.gov/histones/). For our study, we used the corrected versions listed in this database.
We used the program CLUSTAL-X (Thompson et al. 1997) to align the deduced H3 amino acid sequences. The alignment of nucleotide sequences was constructed on the basis of the amino acid sequence alignment. Both final alignments were checked for errors by visual inspection. The nucleotide alignment consisted of a set of sequences with 444 nucleotide sites (148 amino acid sites), excluding the start and stop codons. In plants, animals, and fungi, there are typically 135 amino acids in the H3 protein. However, there are various amino acid insertions and deletions in the H3 proteins of protists. Still, it was relatively easy to construct an alignment by taking the deduced amino acid sequences into account, the majority of which are conserved among all eukaryotes. In our study, we used uncorrected p distances for both nucleotide and deduced amino acid sequences to measure the extent of sequence divergence because most mathematical models are unlikely to apply to this highly conserved protein-coding gene. The proportions of synonymous (pS) and nonsynonymous (pN) nucleotide differences per site were also computed for all sequences by using a modified version of the Nei-Gojobori method (Zhang, Rosenberg, and Nei 1998). Phylogenetic trees were reconstructed from these distances using the neighbor-joining method (Saitou and Nei 1987). We rooted phylogenetic trees using the diplomonad protist Giardia intestinalis because diplomonads appear to form the earliest branch in the eukaryotic lineage (Roger et al. 1998). All analyses were performed with the computer program MEGA2 (Kumar et al. 2001).
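As a rough illustration of the uncorrected p distance mentioned above, the following Python sketch computes the proportion of differing sites between aligned sequences. It is not the MEGA2 implementation. The function names, the example sequences, and the choice of pairwise deletion for gapped sites are assumptions made for illustration.

```python
def p_distance(seq1, seq2, missing="-?"):
    """Proportion of compared sites at which two aligned sequences differ.

    Assumption: sites where either sequence has a gap or missing-data
    character are skipped (pairwise deletion).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same aligned length")
    compared = 0
    differing = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a in missing or b in missing:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_matrix(alignment):
    """Pairwise p distances for a dict mapping sequence name to sequence."""
    names = sorted(alignment)
    return {(x, y): p_distance(alignment[x], alignment[y])
            for i, x in enumerate(names)
            for y in names[i + 1:]}


# Hypothetical toy alignment; these are not sequences from this study.
toy = {"geneA": "ATGGCTCGT", "geneB": "ATGGCCCGT", "geneC": "ATGGCACG-"}
for pair, d in distance_matrix(toy).items():
    print(pair, round(d, 3))
```

The same function works for nucleotide and amino acid alignments because it only compares characters. A matrix of this kind is the sort of input a neighbor-joining implementation would take.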
Results
Complete Genome Comparisons
Because the complete genome sequence of C. elegans is known, we gave special attention to this data set. As shown in the tree in figure 2, the amino acid sequences of C. briggsae and C. elegans RD histones form species-specific clusters that are supported by high bootstrap values. The only exceptions are the two RI sequences found on chromosomes III and X, which differ by a single amino acid substitution and cluster with the other animal RI proteins. At the nucleotide level, there is some mixing of RD genes between the two Caenorhabditis species, based on a neighbor-joining phylogeny of p distances for the complete H3 nucleotide sequences (data not shown). This pattern results from substitutions at synonymous sites because C. elegans RD genes are identical at the amino acid level but show substantial levels of nucleotide sequence divergence.
The 14 C. elegans sequences are spread out over five chromosomes. A few of these sequences are found in small clusters, such as sequences 1, 2, and 3 on chromosome II (fig. 1). In the case of these three genes, sequence 3 is found in an orientation opposite to that of the other two; yet, all three have identical nucleotide sequences. This apparent homogenization may be partly because of the inverted duplication of the third gene sequence on chromosome II. A similar situation is found on rice chromosome 6 (sequences 9 and 10). In this case, the sequences have begun to diverge; so the inverted duplication must have occurred some time ago.
Discussion
Our results suggest that H3 genes evolve according to a birth-and-death process rather than in a concerted fashion. Apparently, most cases involving low sequence divergence between genes can be explained by recent gene duplication. This is shown best by the C. elegans example involving duplication and inversion of a gene on chromosome II of this species. This pattern of evolution is similar to the evolution of ubiquitin genes, which are also highly conserved (Nei, Rogozin, and Piontkivska 2000). However, the patterns observed in the chicken histone cluster are more complicated. In this case, it is not entirely clear whether concerted evolution or birth-and-death evolution is more important in the long-term evolution of H3 genes in this species, because of the rather low levels of sequence divergence between chicken H3 genes.
It is extremely difficult to determine whether concerted evolution or recent gene duplication is responsible for the low levels of sequence divergence. Recent gene duplication could explain why chicken RD sequences cluster together and display very low levels of synonymous sequence divergence (fig. 3, tables 1 and 2). If the duplications occurred within a relatively short and recent time period, species-specific clustering would characterize the phylogeny in question, and the number of synonymous substitutions would appear small because not enough time would have elapsed for nucleotide substitutions to accumulate. In fact, it is interesting to note that all chicken H3 sequences are of the H3.2 type. As discussed previously, this might be evidence of H3.1 gene loss followed by recent duplication of H3.2 genes. However, a broader taxonomic sampling of avian H3 genes would help answer the question of whether or not the chicken histone cluster evolves in a concerted manner. The information gained from comparisons of histone sequence divergence within and between species, as well as the phylogenetic relationships among these sequences, might be useful in discriminating between concerted evolution and birth-and-death evolution under strong purifying selection.
Wells, Bains, and Kedes (1986) hypothesized that an RI gene was the progenitor of all H3 genes. However, Thatcher and Gorovsky (1994) argued that H3 RI proteins arose independently in animals, plants, and Tetrahymena. We found that the H3 amino acid sequence from the protist Phreatamoeba was the closest relative of animal and plant H3 amino acid sequences. However, this relationship may have resulted by chance alone because the sequence appears equally distant from fungal H3 sequences, and the bootstrap value is very low (29%). Nevertheless, the clustering of Phreatamoeba with animals, plants, and fungi is supported by a relatively high bootstrap value (78%), suggesting that an H3 gene similar to the one from Phreatamoeba may have been the ancestor of the animal, plant, and fungal H3 sequences. It is interesting that Phreatamoeba has an AIA amino acid sequence motif in the positions homologous to the RI H3.3 AIG motif, and Dictyostelium, which also clusters with animals, plants, and fungi, has an AIG motif in this same position. This suggests that an H3 RI-like gene was the progenitor of all H3 proteins. If the AIG/A motif of Dictyostelium and Phreatamoeba arose through convergence, then the hypothesis of Thatcher and Gorovsky (1994) is favored, but if the motif did not arise through convergence, the hypothesis of Wells, Bains, and Kedes (1986) would be favored. In the case of the latter, it would mean that RD proteins arose independently multiple times on the basis of the topology shown in figure 2. Further analyses are needed to discriminate between these hypotheses. Perhaps the sequencing of protist genomes will provide more H3 sequences, which might help to solve this problem.
Studies of histone gene families will help to identify mechanisms by which some multigene families evolve. In general, the study of multigene family evolution will become increasingly important as more complete genome sequences become available. From such studies completed to date, we already know that multigene families are the rule and not the exception as far as genome organization is concerned. Therefore, it is important to understand multigene family evolution if we are to understand the broader scope of genome evolution. This has potential ramifications for fields of research other than evolutionary genomics and molecular evolution. For example, in molecular parasitology it is known that multigene families control antigenic variation of malarial parasites of the genus Plasmodium, and it has been recently suggested that an understanding of the mechanisms by which these multigene families function and evolve holds the key to malaria control (Snounou, Jarra, and Preiser 2000).
Footnotes
Keywords: histone, multigene family, concerted evolution, birth-and-death evolution, purifying selection
Address for correspondence and reprints: Alejandro P. Rooney, Department of Biological Sciences, Mississippi State University, P.O. Box GY, Mississippi State, MS 39762. arooney{at}biology.msstate.edu.
References
Benton M. J., 1990 Phylogeny of the major tetrapod groups: morphological data and divergence dates J. Mol. Evol 30:409-424
Brown V. D., Z.-F. Wang, A. S. Williams, W. F. Marzluff, 1996 Structure of a cluster of mouse histone genes Biochim. Biophys. Acta 1306:17-22
Chaubet N., G. Philipps, M. E. Chabout, M. Ehling, C. Gigot, 1986 Nucleotide sequence of two corn histone H3 genes. Genomic organization of the two corn H3 and H4 genes Plant Mol. Biol 6:253-263
Chabout M. E., N. Chaubet, G. Philipps, M. Ehling, C. Gigot, 1987 Genome organization and nucleotide sequences of two histone H3 and H4 genes of Arabidopsis thaliana Plant Mol. Biol 8:179-191
Clayton R. A., O. White, K. A. Ketchum, J. C. Venter, 1997 The first genome from the third domain of life Nature 387:459-462
Coen E., T. Strachan, G. A. Dover, 1982 Dynamics of concerted evolution of ribosomal DNA and histone gene families in the melanogaster species subgroup of Drosophila J. Mol. Biol 158:17-35
DeBry R. W., 1998 Comparative analysis of evolution in a rodent histone H2a pseudogene J. Mol. Evol 46:355-360
DeBry R. W., W. F. Marzluff, 1994 Selection on silent sites in the rodent H3 histone gene family Genetics 138:191-202
Doenecke D., W. Albig, C. Bode, B. Drabent, K. Franke, K. Gavenis, O. Witt, 1997 Histones: genetic diversity and tissue-specific gene expression Histochem. Cell Biol 107:1-10
Dover G., 1982 Molecular drive: a cohesive mode of species evolution Nature 299:111-117
Felsenstein J., 1985 Confidence limits on phylogenies: an approach using the bootstrap Evolution 39:783-791
Gabrielli F., 1989 Human histone variants Pp. 316 in L. S. Hnilica, G. S. Stein, and J. L. Stein, eds. Histones and other basic proteins. CRC Press, Boca Raton, Fla
Graham J., 1995 Tandem genes and clustered genes J. Theor. Biol 175:71-87
Hnilica L. S., G. S. Stein, J. L. Stein, 1989 Histones and other basic proteins CRC Press, Boca Raton, Fla
Holt C. A., G. Childs, 1984 A new family of tandem repetitive early histone genes in the sea urchin Lytechinus pictus: evidence for concerted evolution within tandem arrays Nucleic Acids Res 12:6455-6471
Isenberg I., 1979 Histones Annu. Rev. Biochem 48:159-191
Kedes L., 1979 Histone messengers and histone genes Annu. Rev. Biochem 28:837-870
Kumar S., S. B. Hedges, 1998 A molecular timescale for vertebrate evolution Nature 392:917-919
Kumar S., K. Tamura, I. B. Jakobsen, M. Nei, 2001 MEGA2: molecular evolutionary genetics analysis software Arizona State University, Tempe, Ariz. http://www.megasoftware.net
Marzluff W. F., R. A. Graves, 1984 Organization and expression of mouse histone genes Pp. 281-315 in G. Stein, J. Stein, and W. Marzluff, eds. Histone genes: structure, organization, and regulation. Wiley, New York
Matsuo Y., T. Yamazaki, 1989 Nucleotide variation and divergence in the histone multigene family in Drosophila melanogaster Genetics 122:87-97
Maxson R., R. Cohn, L. Kedes, 1983 Expression and organization of histone genes Annu. Rev. Genet 17:239-277
Nakayama T., S. Takechi, Y. Takami, 1993 The chicken histone gene family Comp. Biochem. Physiol. B 104:635-639
Nei M., X. Gu, T. Sitnikova, 1997 Evolution by the birth-and-death process in multigene families of the vertebrate immune system Proc. Natl. Acad. Sci. USA 94:7799-7806
Nei M., A. L. Hughes, 1992 Balanced polymorphism and evolution by the birth-and-death process in the MHC loci Pp. 27-38 in K. Tsuji, M. Aizawa, and T. Sasazuki, eds. 11th Histocompatibility workshop and conference. Oxford University Press, Oxford, U.K
Nei M., I. B. Rogozin, H. Piontkivska, 2000 Purifying selection and birth-and-death evolution in the ubiquitin gene family Proc. Natl. Acad. Sci. USA 97:10866-10871
Ohta T., 1983 On the evolution of multigene families Theor. Popul. Biol 23:216-240
Ohta T., 1993 An examination of generation-time effect on molecular evolution Proc. Natl. Acad. Sci. USA 90:10676-10680
Ota T., M. Nei, 1994 Divergent evolution and evolution by the birth-and-death process in the immunoglobulin VH gene family Mol. Biol. Evol 11:469-482
Puig S., E. Matallana, J. E. Perez-Ortin, 1999 Stochastic nucleosome positioning in a yeast chromatin region is not dependent on histone H1 Curr. Microbiol 39:168-172
Roberts S. B., M. Sanicola, S. W. Emmons, G. Childs, 1987 Molecular characterization of the histone gene family of Caenorhabditis elegans J. Mol. Biol 196:27-38
Robertson H. M., 1998 Two large families of chemoreceptor genes in the nematodes Caenorhabditis elegans and Caenorhabditis briggsae reveal extensive gene duplication, diversification, movement, and intron loss Genome Res 8:449-463
Roger A. J., S. G. Svard, J. Tovar, C. G. Clark, M. W. Smith, F. D. Gillin, M. L. Sogin, 1998 A mitochondrial-like chaperonin 60 gene in Giardia lamblia: evidence that diplomonads once harbored an endosymbiont related to the progenitor of mitochondria Proc. Natl. Acad. Sci. USA 95:229-234
Saitou N., M. Nei, 1987 The neighbor-joining method. A new method for reconstructing phylogenetic trees Mol. Biol. Evol 4:406-425
Sharp P. M., W. H. Li, 1987 Molecular evolution of ubiquitin genes Trends Ecol. Evol 2:328-332
Sittman D. B., R. A. Graves, W. F. Marzluff, 1983 Histone mRNA concentrations are regulated at the level of transcription and mRNA degradation Proc. Natl. Acad. Sci. USA 80:1849-1853
Smith G. P., 1974 Unequal crossover and the evolution of multigene families Cold Spring Harb. Symp. Quant. Biol 38:507-513
Snounou G., W. Jarra, P. R. Preiser, 2000 Malaria multigene families: the price of chronicity Parasitology Today 16:28-30
Soto M., J. M. Requena, C. Alonso, 1996 Organization, transcription and regulation of the Leishmania infantum histone H3 genes Biochem. J 318:813-819
Stenico M., A. T. Lloyd, P. M. Sharp, 1994 Codon usage in Caenorhabditis elegans: delineation of translational selection and mutational biases Nucleic Acids Res 22:2437-2446
Tan Y., S. T. Bishoff, M. A. Riley, 1993 Ubiquitins revisited: further examples of within- and between-locus concerted evolution Mol. Phylogenet. Evol 2:351-360
The C. elegans Sequencing Consortium. 1998 Genome sequence of the nematode C. elegans: a platform for investigating biology Science 282:2012-2018
Thompson J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, D. G. Higgins, 1997 The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools Nucleic Acids Res 24:4876-4882
Thatcher T. H., M. A. Gorovsky, 1994 Phylogenetic analysis of the core histones H2A, H2B, H3, and H4 Nucleic Acids Res 22:174-179
Ushinsky S. C., H. Bussey, A. A. Ahmed, Y. Wang, J. Friesen, B. A. Williams, R. K. Storms, 1997 Histone H1 in Saccharomyces cerevisiae Yeast 13:151-161
Vrana P. B., W. C. Wheeler, 1996 Molecular evolution and phylogenetic utility of the polyubiquitin locus in mammals and higher vertebrates Mol. Phylogenet. Evol 6:259-269
Wang Z.-F., T. Krasikov, M. R. Frey, J. Wang, A. G. Matera, W. F. Marzluff, 1996a Characterization of the mouse histone gene cluster on chromosome 13: 45 histone genes in three patches spread over 1 Mb Genome Res 6:688-701
Wang Z.-F., R. Tisovec, R. W. DeBry, M. R. Frey, A. G. Matera, W. F. Marzluff, 1996b Characterization of a 55-kb mouse histone gene cluster on chromosome 3 Genome Res 6:702-714
Wells D., W. Bains, L. Kedes, 1986 Codon usage in histone gene families of higher eukaryotes reflects functional rather than phylogenetic relationships J. Mol. Evol 23:224-241
Wells J. R., L. S. Coles, A. J. Robins, 1989 Organization of histone genes and their variants Pp. 253-267 in L. S. Hnilica, G. S. Stein, and J. L. Stein, eds. Histones and other basic proteins. CRC Press, Boca Raton, Fla
Wu R. S., H. T. Panusz, C. L. Hatch, W. M. Bonner, 1986 Histones and their modifications Crit. Rev. Biochem 20:201-263
Zhang J., K. D. Dyer, H. F. Rosenberg, 2000 Evolution of the rodent eosinophil-associated RNase gene family by rapid gene sorting and positive selection Proc. Natl. Acad. Sci. USA 97:4701-4706
Zhang J., H. F. Rosenberg, M. Nei, 1998 Positive Darwinian selection after gene duplication in primate ribonuclease genes Proc. Natl. Acad. Sci. USA 95:3708-3713
Zimmer E. A., S. L. Martin, S. M. Beverley, Y. W. Kan, A. C. Wilson, 1980 Rapid duplication and loss of genes coding for the chains of hemoglobin Proc. Natl. Acad. Sci. USA 77:2158-2162