Molecular Evolution of the Nontandemly Repeated Genes of the Histone 3 Multigene Family

Alejandro P. Rooney, Helen Piontkivska and Masatoshi Nei

Department of Biological Sciences, Mississippi State University, Mississippi State;
Institute of Molecular Evolutionary Genetics and Department of Biology, The Pennsylvania State University, University Park


    Abstract
 
In some species, histone gene clusters consist of tandem arrays of each type of histone gene, whereas in other species the genes may be clustered but not arranged in tandem. In certain species, however, histone genes are found scattered across several different chromosomes. This study examines the evolution of histone 3 (H3) genes that are not arranged in large clusters of tandem repeats. Although H3 amino acid sequences are highly conserved both within and between species, we found that the nucleotide sequence divergence at synonymous sites is high, indicating that purifying selection is the major force for maintaining H3 amino acid sequence homogeneity over long-term evolution. In cases where synonymous-site divergence was low, recent gene duplication appeared to be a better explanation than gene conversion. These results, and other observations on gene inactivation, organization, and phylogeny, indicated that these H3 genes evolve according to a birth-and-death process under strong purifying selection. Thus, we found little evidence to support previous claims that all H3 proteins, regardless of their genome organization, undergo concerted evolution. Further analyses of the structure of H3 proteins revealed that the histones of higher eukaryotes might have evolved from a replication-independent–like H3 gene.


    Introduction
 
The primary function of histones is to bind DNA in the chromatin of eukaryotes (reviewed in Hnilica, Stein, and Stein 1989). Chromatin contains a repetitive structure known as the nucleosome, which contains roughly 200 base pairs of DNA wrapped around an octamer of histone proteins (reviewed in Hnilica, Stein, and Stein 1989). The histone octamer is composed of four different types of core histones (H2A, H2B, H3, and H4), each represented by two copies. The H1 histone protein binds linker DNA that connects two different histone octamers.

With the exception of histone H4, all histone genes can be classified into three main subtypes on the basis of their expression pattern and genomic organization (Isenberg 1979; Maxson, Cohn, and Kedes 1983; Wu et al. 1986; Doenecke et al. 1997): replication-dependent (RD), replication-independent (RI), and tissue-specific (TS) histones. These histones are encoded by multigene families. RI histones, which are also called replacement histones, are expressed at constant but low levels throughout the cell cycle and in quiescent differentiated cells. These histones may contain introns and are not found in histone gene clusters. Instead, they occupy solitary locations within the genome. In contrast, RD histones, which are also called nonreplacement histones, are expressed only during the S phase of the cell cycle, do not contain introns, and are organized as clusters (fig. 1). Finally, TS histones are expressed only in certain cell types, such as the testis. An additional numerical classification (H3.1, H3.2, etc.) is also often used. This numbering system is an arbitrary designation based on small amino acid sequence differences. For example, the vertebrate RI H3 genes are classified into the H3.3 category, whereas vertebrate RD genes can be classified into H3.1 or H3.2 gene categories. These variants are defined on the basis of migration patterns on Triton X-100 polyacrylamide gels, and they sometimes differ from one another by as little as one amino acid (e.g., mouse H3.1 vs. mouse H3.2). It is not clear whether these small differences result in functional differentiation, although this is apparently so in the case of larger amino acid differences (e.g., between mouse H3.1/H3.2 and mouse H3.3).



Fig. 1.—H3 genome organization in species showing a nontandem clustered organization (A through E), a dispersed organization (F through H), and a tandem clustered organization (I). Expressed genes are shown in white, and an arrow indicates the direction of transcription when known. Pseudogenes are represented by black boxes. In A through E only a portion of the histone cluster is shown. The major human histone cluster located on chromosome 6 (fig. 1A ), and the minor cluster located on chromosome 1 (fig. 1B ) are both shown. The chicken also has two clusters, but it is not certain whether these are on the same chromosome or not. In mouse, there are two clusters (the major one on chromosome 13 and the minor one on chromosome 3), but only a portion of the major one is shown here

 
The total number of histone genes in a species is variable. For example, there are two copies of each of the core histones in the yeast Saccharomyces cerevisiae (Maxson, Cohn, and Kedes 1983; Wells, Coles, and Robins 1989; Ushinsky et al. 1997), whereas there can be up to 1,000 copies in some sea urchin species (Kedes 1979). In species that have a few hundred to a few thousand histone gene copies, the genes are usually organized into tandemly repeating quintets of the five histone types (e.g., sea urchin; fig. 1). On the other hand, histone genes are not always arranged in tandem in species that have smaller copy numbers. In these species, the histone genes are often organized into one or two large clusters, as in the mouse where there are two relatively large clusters found on different chromosomes (Wang et al. 1996a, 1996b). In certain species, histone genes are dispersed throughout the genome in small groups of one-to-few copies (e.g., Caenorhabditis elegans and corn; Chabouté et al. 1987; Roberts et al. 1987).

Histones, especially H3 and H4, are among the most conserved proteins in the eukaryotic genome. There are only about three amino acid differences between animal and plant H3 proteins, which are composed of 135 residues. Because of this high sequence similarity, the multigene families encoding histones are generally believed to be subject to concerted evolution, which homogenizes the member genes of a multigene family by interlocus gene recombination or gene conversion (e.g., Kedes 1979; Coen, Strachan, and Dover 1982; Holt and Childs 1984; Matsuo and Yamazaki 1989; DeBry and Marzluff 1994; Thatcher and Gorovsky 1994). However, protein sequence homogeneity can also be attained by strong purifying selection without concerted evolution. In this case, genes can evolve independently or according to the birth-and-death model of evolution (Nei and Hughes 1992; Nei, Gu, and Sitnikova 1997). This latter model of evolution assumes that new genes are created by repeated gene duplication and that some of these duplicated genes are maintained in the genome for a long time, whereas others are deleted or become nonfunctional. Birth-and-death evolution has been shown to be the primary mode of evolution for large multigene families, such as the major histocompatibility complex (MHC), immunoglobulin (Ig), antibacterial ribonuclease genes, and nematode chemoreceptor gene families (Nei and Hughes 1992; Ota and Nei 1994; Nei, Gu, and Sitnikova 1997; Robertson 1998; Zhang, Dyer, and Rosenberg 2000), as well as for smaller multigene families such as the ubiquitins (Nei, Rogozin, and Piontkivska 2000).

The purpose of this paper is to study the evolutionary mode of H3 genes in relation to the two aforementioned hypotheses. Assuming a rapid process of interlocus recombination or gene conversion, we would expect that in each species the proportion of synonymous nucleotide differences per synonymous site (pS) between member genes is similar to or only slightly higher than the proportion of nonsynonymous nucleotide differences per nonsynonymous site (pN), irrespective of whether there is purifying selection. However, if birth-and-death evolution under strong purifying selection were the major evolutionary force, pS would be much higher than pN because the member genes might diverge extensively by silent nucleotide substitution. Therefore, one may be able to distinguish between these two modes of evolution by comparing pS and pN.
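The contrast between pS and pN can be made concrete with a small sketch. The Python code below illustrates the unmodified Nei-Gojobori counting procedure, which is not the modified version used in this study. The function names (`syn_sites`, `codon_differences`, `ps_pn`) and the example sequences are hypothetical. The sketch assumes the standard genetic code, gap-free in-frame sequences of equal length without stop codons, and that single-nucleotide changes to a stop codon count as nonsynonymous when sites are tallied.

```python
from itertools import permutations

# Standard genetic code, built in TCAG order (an assumption of this sketch).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def syn_sites(codon):
    """Number of synonymous sites in one codon (between 0 and 3)."""
    total = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODE[mutant] == CODE[codon]:
                total += 1.0 / 3.0
    return total


def codon_differences(c1, c2):
    """Synonymous and nonsynonymous differences between two codons,
    averaged over all mutational paths that avoid stop codons."""
    diff = [i for i in range(3) if c1[i] != c2[i]]
    if not diff:
        return 0.0, 0.0
    syn_total = non_total = 0.0
    n_paths = 0
    for order in permutations(diff):
        current, syn, non, valid = c1, 0, 0, True
        for pos in order:
            nxt = current[:pos] + c2[pos] + current[pos + 1:]
            if CODE[nxt] == "*":
                valid = False  # discard paths that pass through a stop codon
                break
            if CODE[nxt] == CODE[current]:
                syn += 1
            else:
                non += 1
            current = nxt
        if valid:
            syn_total += syn
            non_total += non
            n_paths += 1
    if n_paths == 0:
        raise ValueError("no stop-free path between %s and %s" % (c1, c2))
    return syn_total / n_paths, non_total / n_paths


def ps_pn(seq1, seq2):
    """Return (pS, pN) for two aligned coding sequences."""
    if len(seq1) != len(seq2) or len(seq1) % 3 != 0:
        raise ValueError("sequences must be aligned and in frame")
    s_sites = sd = nd = 0.0
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        s_sites += (syn_sites(c1) + syn_sites(c2)) / 2.0
        s, n = codon_differences(c1, c2)
        sd += s
        nd += n
    n_sites = len(seq1) - s_sites
    ps = sd / s_sites if s_sites else 0.0  # guard against zero sites
    pn = nd / n_sites if n_sites else 0.0
    return ps, pn


# Hypothetical sequences that both encode Ala-Arg-Thr but differ at every
# third codon position: all differences are synonymous.
ps, pn = ps_pn("GCTCGTACT", "GCCCGCACC")
print(ps, pn)
```

In this toy comparison every difference is silent, which is the pattern expected for genes that diverge independently under strong purifying selection; frequent gene conversion would instead erase synonymous differences along with nonsynonymous ones.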


    Materials and Methods
 
We analyzed histone H3 nucleotide sequences from representatives of the major eukaryotic taxonomic groups (animals, plants, fungi, and protists). Both mRNA and genomic DNA sequences were used. In order to simplify their identification, sequences were labeled with each species' common name, whenever possible, and numbered. The list of species and DNA sequences analyzed is given at http://mep.bio.psu.edu/databases/histoneH3/appendix.doc. Apart from the sequences of the two nematode species, C. elegans and C. briggsae, all H3 nucleotide sequences used in this study were obtained from GenBank. The H3 sequences from C. elegans and C. briggsae were extracted from the complete genome database of C. elegans and the partial genome database of C. briggsae by using the Washington University BLAST server (http://genome.wustl.edu/gsc/Blast). In the case of C. elegans, the chromosomal location of all H3 genes is known; so we labeled the sequences in the Appendix according to the chromosome on which they were found (I, II, III, etc.). However, we labeled the C. briggsae sequences according to the cosmid on which they were found because the chromosomal location of these genes has not yet been determined. The C. briggsae sequences do not yet have accession numbers, although they are freely available.

In some instances, H3 gene sequences found in GenBank contained errors. These sequences are listed in the histone sequence database maintained by the National Human Genome Research Institute (http://genome.nhgri.nih.gov/histones/). For our study, we used the corrected versions listed in this database.

We used the program CLUSTAL-X (Thompson et al. 1997) to align the deduced H3 amino acid sequences. The alignment of nucleotide sequences was constructed on the basis of the amino acid sequence alignment. Both final alignments were checked for errors by visual inspection. The nucleotide alignment consisted of a set of sequences with 444 nucleotide sites (148 amino acid sites), excluding the start and stop codons. In plants, animals, and fungi, there are typically 135 amino acids in the H3 protein. However, there are various amino acid insertions and deletions in the H3 proteins of protists. Still, it was relatively easy to construct an alignment by taking the deduced amino acid sequences into account, the majority of which are conserved among all eukaryotes. In our study, we used uncorrected p distances for both nucleotide and deduced amino acid sequences to measure the extent of sequence divergence because most mathematical models are unlikely to apply to this highly conserved protein-coding gene. The number of synonymous (pS) and nonsynonymous (pN) nucleotide differences per site was also computed for all sequences by using a modified version of the Nei-Gojobori method (Zhang, Rosenberg, and Nei 1998). Phylogenetic trees were reconstructed from these distances using the neighbor-joining method (Saitou and Nei 1987). We rooted phylogenetic trees using the diplomonad protist Giardia intestinalis because diplomonads appear to form the earliest branch in the eukaryotic lineage (Roger et al. 1998). All analyses were performed with the computer program MEGA2 (Kumar et al. 2001).
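As an illustration of the uncorrected p distance, the following Python sketch computes the proportion of differing sites between aligned sequences and assembles a pairwise distance table of the kind a neighbor-joining program takes as input. It is not the MEGA2 implementation. The function names, the sequence labels, and the toy sequences are hypothetical, and skipping any site with a gap in either sequence (pairwise deletion) is an assumption of this sketch.

```python
from itertools import combinations


def p_distance(seq1, seq2, gap="-"):
    """Uncorrected p distance: differing sites / compared sites."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differences = 0
    for a, b in zip(seq1, seq2):
        if a == gap or b == gap:
            continue  # pairwise deletion of gapped sites (assumption)
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def distance_table(alignment):
    """Map each unordered pair of labels to its p distance."""
    return {(x, y): p_distance(alignment[x], alignment[y])
            for x, y in combinations(sorted(alignment), 2)}


# Toy alignment with hypothetical labels and sequences.
alignment = {
    "gene-1": "ATGGCTCGTACT",
    "gene-2": "ATGGCCCGCACC",
    "gene-3": "ATGGCTCGTAC-",
}
for pair, d in distance_table(alignment).items():
    print(pair, round(d, 3))
```

The same function applies to amino acid alignments, because it only compares characters column by column.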


    Results
 
Amino Acid Sequence Evolution
The phylogenetic tree of H3 amino acid sequences from various species is shown in figure 2. This tree includes several distinct clusters of H3 sequences as indicated in the figure. In protists and fungi the number of repeat copies is small, but H3 protein sequences are quite divergent among different species. In contrast, the extent of sequence divergence among different species of animals and plants is very small except for a few sequences. For example, RD proteins from animal species are virtually identical except for the duck-2 and human-12 sequences, which are different from the standard animal RD sequence by two and four amino acid differences, respectively. The H3.1 and H3.2 sequences are also very similar to one another at the amino acid level and differ by only one amino acid residue. To some extent, these clusters appear to be vertebrate specific and include sequences from different species of mammals, birds, and amphibians, which last shared a common ancestor approximately 350 Myr ago (Benton 1990). It is worthwhile to note that these protein sequences from highly divergent taxa are identical to one another as long as they belong to the same H3-type. Also, some species (e.g., human and chicken) have only one kind of RD H3 protein, whereas other species (e.g., mouse and frog) have two kinds.



Fig. 2.—Phylogeny of H3 amino acid sequences from representative eukaryotic lineages. Only unique sequences in each species are shown. The numbers of identical sequences for each taxon are listed next to the name of the species. Uncorrected p distances were used for the phylogeny reconstruction. Bootstrap values based on 1,000 replications are shown only when they are greater than 50%. RD: replication-dependent genes, RI: replication-independent genes

 
Nucleotide Sequence Evolution
Although the amino acid sequences are identical for many proteins, RD nucleotide sequences showed a relatively high degree of synonymous substitutions, even between genes within a species (table 1). The topology of the tree derived from nucleotide sequences (fig. 3) is not reliable because most comparisons have reached the saturation level. However, it is evident that the extent of synonymous nucleotide differences both within and between species is generally very large. For example, most of the human genes are very divergent from one another (fig. 3). In fact, the average value of pS (S) for the human RD genes is 0.463 (table 1), which is close to the saturation level under the condition that there are no amino acid differences (Nei, Rogozin, and Piontkivska 2000), and the pS values for intraspecific comparisons of human genes are virtually the same as those for interspecific comparisons between human and other vertebrate genes. Similarly, the sequence similarities of mouse RD genes are nearly the same as those between mouse and chicken RD genes (table 2 and fig. 3). However, there was one case where the sequences from one species were very closely related to each other. The H3 RD genes of the chicken showed a value of S = 0.067 (table 1), although the magnitude of S was significantly greater than the magnitude of N (P < 0.001, Z-test; table 1). In fact, it is clear from the tree in figure 3 and from the S values reported in table 1 that all species for which multiple RD sequences were available showed extensive levels of divergence at synonymous nucleotide sites. The RI proteins show the same patterns as RD proteins with respect to high levels of synonymous nucleotide divergence and low amino acid sequence divergence (table 1; figs. 2 and 3). Compared with animal RD proteins, animal RI proteins appear to show a slightly higher rate of sequence divergence (table 1); yet, this does not hold for plants.
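A Z-test of the hypothesis S = N can be sketched as follows. This is a generic two-sided normal test and not necessarily the exact procedure implemented in MEGA2. The function name `z_test` is hypothetical, and apart from S = 0.067 for the chicken RD genes, all numerical inputs (the value of N and both standard errors) are invented for illustration because they are not given in the text.

```python
import math


def z_test(s, n, se_s, se_n):
    """Two-sided Z-test of S = N from estimates and standard errors.

    Assumes the two estimates are independent, so that the variance of
    their difference is the sum of the two variances.
    """
    z = (s - n) / math.sqrt(se_s ** 2 + se_n ** 2)
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# S is taken from the text; N and the standard errors are hypothetical.
z, p = z_test(s=0.067, n=0.002, se_s=0.015, se_n=0.001)
print("Z = %.2f, P = %.2g" % (z, p))
```

In practice the standard errors would be estimated from the data, for example by bootstrap resampling of codons.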


Table 1 Average Numbers of Synonymous (S) and Nonsynonymous (N) Substitutions per Site and the Average R (i.e., transition/transversion) Value, in Representative Animal, Plant and Fungus H3 Genes

 


Fig. 3.—Phylogeny of eukaryotic H3 genes using synonymous nucleotide distances (pS). Bootstrap values greater than 50% are shown above the lines. (I): replication-independent genes

 

Table 2 Levels of pS (below diagonal) and pN (above diagonal) x 100 Observed Between Representative RD H3 Sequences from Chicken, Mouse, and Human

 
In some cases, it has been suggested that RD genes found in close proximity on a chromosome undergo gene conversion (e.g., DeBry and Marzluff 1994). To test whether or not this is true, we examined sequences from two closely related species of rodents (shrew mouse and mouse). Shrew mouse genes (1–3) that are putative orthologues of mouse chromosome 13 genes (1–3) are more closely related to certain mouse genes (mouse 1, 3, and 4) than they are to other shrew mouse genes. For example, the overall p distance between sequence 1 of the shrew mouse and sequence 3 of the mouse is p = 0.022, whereas the distances between sequence 1 of the shrew mouse and sequences 2 and 3 of the shrew mouse are p = 0.024 and p = 0.034, respectively. When we tested S = N for mouse chromosome 13 genes (mouse 1–9) and their putative shrew mouse orthologues (shrew mouse 1–3), we found that S was significantly different from N in both cases (table 1; P < 0.001, Z-test).
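The logic of this comparison can be written as a nearest-neighbor check: under frequent gene conversion a gene should be closest to another gene from its own species, whereas under independent divergence it can be closest to its orthologue in the other species. The Python sketch below uses only the three p distances quoted above; the labels, the helper names, and the convention that the species name is everything before the final number are assumptions made for illustration.

```python
# p distances quoted in the text for shrew mouse sequence 1.
DISTANCES = {
    ("shrew mouse 1", "mouse 3"): 0.022,
    ("shrew mouse 1", "shrew mouse 2"): 0.024,
    ("shrew mouse 1", "shrew mouse 3"): 0.034,
}


def species_of(label):
    """Species name, assuming labels look like '<species> <number>'."""
    return label.rsplit(" ", 1)[0]


def nearest_neighbor(query, distances):
    """Return (label, distance) of the sequence closest to `query`."""
    candidates = []
    for (a, b), d in distances.items():
        if a == query:
            candidates.append((d, b))
        elif b == query:
            candidates.append((d, a))
    if not candidates:
        raise ValueError("no distances involve %s" % query)
    d, label = min(candidates)
    return label, d


query = "shrew mouse 1"
label, d = nearest_neighbor(query, DISTANCES)
same_species = species_of(label) == species_of(query)
print(label, d, "same species" if same_species else "other species")
```

With these values the closest sequence belongs to the other species, although the margin over the nearest within-species sequence (0.022 versus 0.024) is small.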

Complete Genome Comparisons
In the case of C. elegans, the complete genome sequence is known; so we give special attention to this data set. As shown in the tree in figure 2, the amino acid sequences of C. briggsae and C. elegans RD histones form species-specific clusters that are supported by high bootstrap values. The only exceptions are the two RI sequences found on chromosomes III and X, which differed by a single amino acid substitution and clustered with the other animal RI proteins. At the nucleotide level, there is some mixing of RD genes between the two Caenorhabditis species, based on a neighbor-joining phylogeny of p distances for the complete H3 nucleotide sequences (data not shown). This pattern results from substitutions at synonymous sites because C. elegans RD genes are identical at the amino acid level but show substantial levels of nucleotide sequence divergence.

The 14 C. elegans sequences are spread out over five chromosomes. A few of these sequences are found in small clusters, such as sequences 1, 2, and 3 on chromosome II (fig. 1). In the case of these three genes, sequence 3 is found in an orientation opposite to that of the other two; yet, all three have identical nucleotide sequences. This apparent homogenization may be explained in part by the inverted duplication of the third gene sequence on chromosome II. A similar situation is found on rice chromosome 6 (sequences 9 and 10). In this case, the sequences have begun to diverge; so the inverted duplication must have occurred some time ago.
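When genes lie in opposite orientations, as in an inverted duplication, their sequences read from the same chromosomal strand must be brought into the same orientation before they are compared. A minimal Python sketch of that step follows; the function names and the toy sequences are hypothetical and are not the C. elegans H3 genes.

```python
# Translation table for complementing DNA (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def identical_in_either_orientation(seq_a, seq_b):
    """True if the two sequences match directly or as reverse complements."""
    return seq_a == seq_b or seq_a == reverse_complement(seq_b)


# Toy example: the second copy is read from the same strand as the first
# but is encoded on the opposite strand, so it appears reverse-complemented.
gene_forward = "ATGGCTCGTACTAAG"
gene_inverted = reverse_complement(gene_forward)
print(gene_inverted)
print(identical_in_either_orientation(gene_forward, gene_inverted))
```

Sequences deposited in databases are often already given in the coding orientation, in which case this step is unnecessary.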


    Discussion
 
Our study of nontandem H3 genes shows that a birth-and-death process with strong purifying selection best describes the long-term evolutionary mode of these genes. First, both RD and RI H3 genes were substantially divergent at the nucleotide level despite the fact that the amino acid sequences differed by only one or a few sites between H3 variants. Second, pseudogenes are found in both RD and RI gene clusters. By searching GenBank, we found three human H3 pseudogenes, two mouse H3 pseudogenes, one H3 pseudogene from rice, one H3 pseudogene in a species related to the soybean, and four putative pseudogenes from C. elegans. In C. elegans, two genes found on chromosome III share protein sequence similarities with the C. elegans RI genes, but they might be pseudogenes as they show numerous insertions or deletions and a substantial number of amino acid substitutions when aligned with the other H3 sequences. Although the sequences are still in frame, it is possible that they could be pseudogenes if, for example, their promoters have somehow become inactivated. It is worth noting that the sequences X-2 and V-6 of C. elegans may also be pseudogenes because they contain many single nucleotide gaps when aligned with the other H3 genes. Third, the genes in solitary locations of the genome (RI genes) displayed the same evolutionary patterns as the genes that are organized into clusters (RD genes). This observation runs counter to expectations under the model of concerted evolution, which predicts that clustered genes show evidence of gene conversion or unequal crossing-over more often than solitary genes. Fourth, the H3 amino acid and nucleotide sequence phylogenies are consistent with predictions under the birth-and-death model (Nei and Hughes 1992; Ota and Nei 1994; Nei, Gu, and Sitnikova 1997), in which case genes will cluster by type and not by species. If genes evolve in a concerted manner, they should cluster by species and show relatively short branch lengths based on pS. Our results suggest that H3 genes cluster according to inferred gene duplication events, although the chicken H3 genes appear to be an exception that we shall discuss later in more detail. Finally, we did not find that S = N for any species (table 1), though the magnitude of S should be nearly the same as the magnitude of N under the concerted evolution process. This result suggests that purifying selection is important for maintaining amino acid sequence homogeneity of H3 genes.

Our results suggest that H3 genes evolve by a birth-and-death process rather than in a concerted fashion. Apparently, most cases involving low sequence divergence between genes can be explained by recent gene duplication. This is shown best by the C. elegans example involving duplication and inversion of a gene on chromosome II of this species. This pattern of evolution is similar to the evolution of ubiquitin genes, which are also highly conserved (Nei, Rogozin, and Piontkivska 2000). However, the patterns observed in the chicken histone cluster are more complicated. In this case, it is not entirely clear whether concerted evolution or birth-and-death evolution is more important in the long-term evolution of H3 genes in this species, because of the rather low levels of sequence divergence between chicken H3 genes.

It is extremely difficult to determine whether concerted evolution or recent gene duplication is responsible for the low levels of sequence divergence. Recent gene duplication could explain why chicken RD sequences cluster together and display very low levels of synonymous sequence divergence (fig. 3, tables 1 and 2). Because recent gene duplication could have occurred within a relatively short time period, species-specific clustering would characterize the phylogeny in question, and the number of synonymous substitutions would be small, as not enough time could have elapsed to allow for the accumulation of nucleotide substitutions. In fact, it is interesting to note that all chicken H3 sequences are of the H3.2 type. As discussed previously, this might be evidence of H3.1 gene loss followed by recent duplication of H3.2 genes. However, a broader taxonomic sampling of avian H3 genes would help answer the question of whether or not the chicken histone cluster evolves in a concerted manner. The information gained from comparisons of histone sequence divergence within and between species, as well as the phylogenetic relationships among these sequences, might be useful in discriminating between concerted evolution and birth-and-death evolution under strong purifying selection.

Wells, Bains, and Kedes (1986) hypothesized that an RI gene was the progenitor of all H3 genes. However, Thatcher and Gorovsky (1994) argued that H3 RI proteins arose independently in animals, plants, and Tetrahymena. We found that the H3 amino acid sequence from the protist Phreatamoeba was the closest relative of animal and plant H3 amino acid sequences. However, this relationship may have resulted by chance alone because the sequence appears equally distant from fungal H3 sequences, and the bootstrap value is very low (29%). Nevertheless, the clustering of Phreatamoeba with animals, plants, and fungi is supported by a relatively high bootstrap value (78%), suggesting that an H3 gene similar to the one from Phreatamoeba may have been the ancestor of the animal, plant, and fungal H3 sequences. It is interesting that Phreatamoeba has an AIA amino acid sequence motif in the positions homologous to the RI H3.3 AIG motif, and Dictyostelium, which also clusters with animals, plants, and fungi, has an AIG motif in this same position. This suggests that an H3 RI-like gene was the progenitor of all H3 proteins. If the AIG/A motif of Dictyostelium and Phreatamoeba arose through convergence, then the hypothesis of Thatcher and Gorovsky (1994) is favored, but if the motif did not arise through convergence, the hypothesis of Wells, Bains, and Kedes (1986) would be favored. In the case of the latter, it would mean that RD proteins arose independently multiple times on the basis of the topology shown in figure 2. Further analyses are needed to discriminate between these hypotheses. Perhaps the sequencing of protist genomes will provide more H3 sequences, which might help to solve this problem.

Studies of histone gene families will help to identify mechanisms by which some multigene families evolve. In general, the study of multigene family evolution will become increasingly important as more complete genome sequences become available. From such studies completed to date, we already know that multigene families are the rule and not the exception as far as genome organization is concerned. Therefore, it is important to understand multigene family evolution if we are to understand the broader scope of genome evolution. This has potential ramifications for fields of research other than evolutionary genomics and molecular evolution. For example, in molecular parasitology it is known that multigene families control antigenic variation of malarial parasites of the genus Plasmodium, and it has been recently suggested that an understanding of the mechanisms by which these multigene families function and evolve holds the key to malaria control (Snounou, Jarra, and Preiser 2000).


    Acknowledgements
 
We thank D. Doenecke, X. Gu, R. L. Honeycutt, Y. Matsuo, S. Yokoyama, and an anonymous reviewer for helpful discussions and comments. This work was supported by grants from NIH (GM20293) and NASA (NCC2-1057) to M.N.


    Footnotes
 
Rodney Honeycutt, Reviewing Editor

Keywords: histone, multigene family, concerted evolution, birth-and-death evolution, purifying selection

Address for correspondence and reprints: Alejandro P. Rooney, Department of Biological Sciences, Mississippi State University, P.O. Box GY, Mississippi State, MS 39762. arooney@biology.msstate.edu


    References
 

    Benton M. J., 1990 Phylogeny of the major tetrapod groups: morphological data and divergence dates J. Mol. Evol 30:409-424

    Brown V. D., Z.-F. Wang, A. S. Williams, W. F. Marzluff, 1996 Structure of a cluster of mouse histone genes Biochim. Biophys. Acta 1306:17-22

    Chaubet N., G. Philipps, M. E. Chabouté, M. Ehling, C. Gigot, 1986 Nucleotide sequence of two corn histone H3 genes. Genomic organization of the two corn H3 and H4 genes Plant Mol. Biol 6:253-263

    Chabouté M. E., N. Chaubet, G. Philipps, M. Ehling, C. Gigot, 1987 Genome organization and nucleotide sequences of two histone H3 and H4 genes of Arabidopsis thaliana Plant Mol. Biol 8:179-191

    Clayton R. A., O. White, K. A. Ketchum, J. C. Venter, 1997 The first genome from the third domain of life Nature 387:459-462

    Coen E., T. Strachan, G. A. Dover, 1982 Dynamics of concerted evolution of ribosomal DNA and histone gene families in the melanogaster species subgroup of Drosophila J. Mol. Biol 158:17-35

    DeBry R. W., 1998 Comparative analysis of evolution in a rodent histone H2a pseudogene J. Mol. Evol 46:355-360

    DeBry R. W., W. F. Marzluff, 1994 Selection on silent sites in the rodent H3 histone gene family Genetics 138:191-202

    Doenecke D., W. Albig, C. Bode, B. Drabent, K. Franke, K. Gavenis, O. Witt, 1997 Histones: genetic diversity and tissue-specific gene expression Histochem. Cell Biol 107:1-10

    Dover G., 1982 Molecular drive: a cohesive mode of species evolution Nature 299:111-117

    Felsenstein J., 1985 Confidence limits on phylogenies: an approach using the bootstrap Evolution 39:783-791

    Gabrielli F., 1989 Human histone variants Pp. 3–16 in L. S. Hnilica, G. S. Stein, and J. L. Stein, eds. Histones and other basic proteins. CRC Press, Boca Raton, Fla

    Graham J., 1995 Tandem genes and clustered genes J. Theor. Biol 175:71-87

    Hnilica L. S., G. S. Stein, J. L. Stein, 1989 Histones and other basic proteins CRC Press, Boca Raton, Fla

    Holt C. A., G. Childs, 1984 A new family of tandem repetitive early histone genes in the sea urchin Lytechinus pictus: evidence for concerted evolution within tandem arrays Nucleic Acids Res 12:6455-6471

    Isenberg I., 1979 Histones Annu. Rev. Biochem 48:159-191

    Kedes L., 1979 Histone messengers and histone genes Annu. Rev. Biochem 48:837-870

    Kumar S., S. B. Hedges, 1998 A molecular timescale for vertebrate evolution Nature 392:917-919[ISI][Medline]

    Kumar S., K. Tamura, I. B. Jakobsen, M. Nei, 2001 MEGA2: molecular evolutionary genetics analysis software Arizona State University, Tempe, Ariz. http://www.megasoftware.net

    Marzluff W. F., R. A. Graves, 1984 Organization and expression of mouse histone genes Pp. 281–315 in G. Stein, J. Stein, and W. Marzluff, eds. Histone genes: structure organization, and regulation. Wiley, New York

    Matsuo Y., T. Yamazaki, 1989 Nucleotide variation and divergence in the histone multigene family in Drosophila melanogaster Genetics 122:87-97[Abstract/Free Full Text]

    Maxson R., R. Cohn, L. Kedes, 1983 Expression and organization of histone genes Annu. Rev. Genet 17:239-277[ISI][Medline]

    Nakayama T., S. Takechi, Y. Takami, 1993 The chicken histone gene family Comp. Biochem. Physiol. B 104:635-639[ISI][Medline]

    Nei M., X. Gu, T. Sitnikova, 1997 Evolution by the birth-and-death process in multigene families of the vertebrate immune system Proc. Natl. Acad. Sci. USA 94:7799-7806[Abstract/Free Full Text]

    Nei M., A. L. Hughes, 1992 Balanced polymorphism and evolution by the birth-and-death process in the MHC loci Pp. 27–38 in K. Tsuji, M. Aizawa, and T. Sasazuki, eds. 11th Histocompatibility workshop and conference. Oxford University Press, Oxford, U.K

    Nei M., I. B. Rogozin, H. Piontkivska, 2000 Purifying selection and birth-and-death evolution in the ubiquitin gene family Proc. Natl. Acad. Sci. USA 97:10866-10871

    Ohta T., 1983 On the evolution of multigene families Theor. Popul. Biol 23:216-240

    Ohta T., 1993 An examination of generation-time effect on molecular evolution Proc. Natl. Acad. Sci. USA 90:10676-10680

    Ota T., M. Nei, 1994 Divergent evolution and evolution by the birth-and-death process in the immunoglobulin VH gene family Mol. Biol. Evol 11:469-482

    Puig S., E. Matallana, J. E. Perez-Ortin, 1999 Stochastic nucleosome positioning in a yeast chromatin region is not dependent on histone H1 Curr. Microbiol 39:168-172

    Roberts S. B., M. Sanicola, S. W. Emmons, G. Childs, 1987 Molecular characterization of the histone gene family of Caenorhabditis elegans J. Mol. Biol 196:27-38

    Robertson H. M., 1998 Two large families of chemoreceptor genes in the nematodes Caenorhabditis elegans and Caenorhabditis briggsae reveal extensive gene duplication, diversification, movement, and intron loss Genome Res 8:449-463

    Roger A. J., S. G. Svard, J. Tovar, C. G. Clark, M. W. Smith, F. D. Gillin, M. L. Sogin, 1998 A mitochondrial-like chaperonin 60 gene in Giardia lamblia: evidence that diplomonads once harbored an endosymbiont related to the progenitor of mitochondria Proc. Natl. Acad. Sci. USA 95:229-234

    Saitou N., M. Nei, 1987 The neighbor-joining method. A new method for reconstructing phylogenetic trees Mol. Biol. Evol 4:406-425

    Sharp P. M., W. H. Li, 1987 Molecular evolution of ubiquitin genes Trends Ecol. Evol 2:328-332

    Sittman D. B., R. A. Graves, W. F. Marzluff, 1983 Histone mRNA concentrations are regulated at the level of transcription and mRNA degradation Proc. Natl. Acad. Sci. USA 80:1849-1853

    Smith G. P., 1974 Unequal crossover and the evolution of multigene families Cold Spring Harb. Symp. Quant. Biol 38:507-513

    Snounou G., W. Jarra, P. R. Preiser, 2000 Malaria multigene families: the price of chronicity Parasitology Today 16:28-30

    Soto M., J. M. Requena, C. Alonso, 1996 Organization, transcription and regulation of the Leishmania infantum histone H3 genes Biochem. J 318:813-819

    Stenico M., A. T. Lloyd, P. M. Sharp, 1994 Codon usage in Caenorhabditis elegans: delineation of translational selection and mutational biases Nucleic Acids Res 22:2437-2446

    Tan Y., S. T. Bishoff, M. A. Riley, 1993 Ubiquitins revisited: further examples of within- and between-locus concerted evolution Mol. Phylogenet. Evol 2:351-360

    The C. elegans Sequencing Consortium. 1998 Genome sequence of the nematode C. elegans: a platform for investigating biology Science 282:2012-2018

    Thompson J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, D. G. Higgins, 1997 The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools Nucleic Acids Res 25:4876-4882

    Thatcher T. H., M. A. Gorovsky, 1994 Phylogenetic analysis of the core histones H2A, H2B, H3, and H4 Nucleic Acids Res 22:174-179

    Ushinsky S. C., H. Bussey, A. A. Ahmed, Y. Wang, J. Friesen, B. A. Williams, R. K. Storms, 1997 Histone H1 in Saccharomyces cerevisiae Yeast 13:151-161

    Vrana P. B., W. C. Wheeler, 1996 Molecular evolution and phylogenetic utility of the polyubiquitin locus in mammals and higher vertebrates Mol. Phylogenet. Evol 6:259-269

    Wang Z.-F., T. Krasikov, M. R. Frey, J. Wang, A. G. Matera, W. F. Marzluff, 1996a Characterization of the mouse histone gene cluster on chromosome 13: 45 histone genes in three patches spread over 1 Mb Genome Res 6:688-701

    Wang Z.-F., R. Tisovec, R. W. DeBry, M. R. Frey, A. G. Matera, W. F. Marzluff, 1996b Characterization of a 55-kb mouse histone gene cluster on chromosome 3 Genome Res 6:702-714

    Wells D., W. Bains, L. Kedes, 1986 Codon usage in histone gene families of higher eukaryotes reflects functional rather than phylogenetic relationships J. Mol. Evol 23:224-241

    Wells J. R., L. S. Coles, A. J. Robins, 1989 Organization of histone genes and their variants Pp. 253–267 in L. S. Hnilica, G. S. Stein, and J. L. Stein, eds. Histones and other basic proteins. CRC Press, Boca Raton, Fla

    Wu R. S., H. T. Panusz, C. L. Hatch, W. M. Bonner, 1986 Histones and their modifications Crit. Rev. Biochem 20:201-263

    Zhang J., K. D. Dyer, H. F. Rosenberg, 2000 Evolution of the rodent eosinophil-associated RNase gene family by rapid gene sorting and positive selection Proc. Natl. Acad. Sci. USA 97:4701-4706

    Zhang J., H. F. Rosenberg, M. Nei, 1998 Positive Darwinian selection after gene duplication in primate ribonuclease genes Proc. Natl. Acad. Sci. USA 95:3708-3713

    Zimmer E. A., S. L. Martin, S. M. Beverley, Y. W. Kan, A. C. Wilson, 1980 Rapid duplication and loss of genes coding for the α chains of hemoglobin Proc. Natl. Acad. Sci. USA 77:2158-2162

Accepted for publication September 4, 2001.