Department of Ecology and Evolution, University of Chicago
Correspondence: E-mail: whli@uchicago.edu.
Abstract
Key Words: synonymous rates, nonsynonymous rates, mutational bias, selective constraint, tissue-specific, genetic redundancy
Introduction
Recent large-scale gene expression studies have made it possible to examine the expression patterns of many genes across developmental stages and tissues, and thus enable a more concrete description of housekeeping genes on a genomic scale. A proposed working concept of housekeeping genes has been "those genes critical to the activities that must be carried out for successful completion of the cell cycle" (Warrington et al. 2000). Several recent studies have attempted to identify housekeeping genes genome-wide from a number of tissues and developmental stages. For example, Warrington et al. (2000) examined the expression levels of 7,000 genes in 11 different human adult and fetal tissues using high-density oligonucleotide arrays and identified 535 housekeeping genes that are turned on early in fetal development and stay on throughout adulthood in all tissues. Similarly, Hsiao et al. (2001) analyzed the expression patterns of 7,070 genes in 59 tissue samples representing 19 human tissue types and identified 451 housekeeping genes, defined as genes expressed in all 19 tissues, of which 358 were also found by Warrington et al. (2000). Both studies show that housekeeping genes are not necessarily expressed at the same level across all tissues; rather, each tissue seems to have a specific expression profile of housekeeping genes. Furthermore, housekeeping genes are not necessarily the most highly expressed genes in every tissue. The old concept and views are therefore insufficient to reflect the nature of housekeeping genes. To better understand the role these genes play in the genome, more studies are needed on their function, expression, and evolutionary patterns.
To date, few attempts have been made to examine how housekeeping genes evolve in general and how they differ from tissue-specific genes. A few studies have shown that broadly expressed genes tend to evolve more slowly than narrowly expressed genes (Duret and Mouchiroud 2000; Hastings 1996; Hughes and Hughes 1995). In the present study, taking advantage of the recent large-scale studies that identified housekeeping genes, we addressed the issue of evolutionary rates in housekeeping genes. We used the genes from Hsiao et al. (2001) because that study sampled more tissues than Warrington et al. (2000) and because all of its data can be easily obtained from the associated Web site dedicated to an inventory of housekeeping genes. Furthermore, the two studies share a majority of their housekeeping genes.
Materials and Methods
All protein sequences were aligned using ClustalW with the default parameters, and the alignments were back-translated to their corresponding DNA sequences. The alignments were then visually inspected and modified when necessary. The number of substitutions per nonsynonymous site and the number of substitutions per synonymous site, denoted as Ka and Ks, respectively, were calculated using the maximum likelihood method implemented in the PAML package (Yang 1997). We excluded genes with Ks > 1, leaving a data set of 1,581 genes in total for our final analyses (table 1).
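The back-translation step described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the function name `back_translate`, the gap character, and the toy sequences are all hypothetical, and the sketch assumes each coding sequence starts in frame and encodes exactly the residues of its aligned protein (any trailing stop codon is ignored).

```python
def back_translate(aligned_protein, cds, gap="-"):
    """Thread an unaligned coding sequence onto its aligned protein sequence.

    Each residue consumes the next codon of `cds`; each alignment gap
    becomes a three-column gap, so the codon alignment stays in frame.
    """
    cds = cds.upper()
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(cds) < 3 * n_residues:
        raise ValueError("coding sequence is too short for the protein")

    codons = []
    pos = 0  # index of the next unused nucleotide in cds
    for aa in aligned_protein:
        if aa == gap:
            codons.append(gap * 3)
        else:
            codons.append(cds[pos:pos + 3])
            pos += 3
    return "".join(codons)


# Toy example with invented sequences (not data from the study).
seq_a = back_translate("MK-L", "ATGAAACTG")
seq_b = back_translate("MKTL", "ATGAAGACCCTG")
print(seq_a)
print(seq_b)
```

Threading codons onto the protein alignment, rather than aligning the DNA directly, keeps gaps in multiples of three so that codon-based estimators of Ka and Ks see intact codons.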
The effective number of codons (ENC), a measure of codon usage bias that ranges from 20 to 61 (Wright 1990), was calculated for all genes. A gene with an ENC of 20 uses only one codon for each synonymous codon set and thus shows the strongest codon usage bias, whereas an ENC of 61 indicates no synonymous codon usage preference.
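To make the two extremes of ENC concrete, the sketch below implements one common formulation of Wright's estimator for the standard genetic code. It is a simplified illustration under stated assumptions, not the software used in the study: the function name `effective_number_of_codons` is hypothetical, amino acids seen fewer than twice are skipped, and the function raises an error instead of applying any correction when a degeneracy class has no usable amino acid.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
# Number of synonymous codons per amino acid (stop codons excluded).
DEGENERACY = Counter(aa for aa in CODON_TABLE.values() if aa != "*")


def effective_number_of_codons(cds):
    """Estimate ENC for an in-frame coding sequence (simplified sketch)."""
    cds = cds.upper()
    counts = defaultdict(Counter)
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3])
        if aa is not None and aa != "*":
            counts[aa][cds[i:i + 3]] += 1

    # Homozygosity F for each amino acid, grouped by degeneracy class.
    f_by_class = defaultdict(list)
    for aa, codon_counts in counts.items():
        n = sum(codon_counts.values())
        if DEGENERACY[aa] == 1 or n < 2:
            continue
        sum_p2 = sum((c / n) ** 2 for c in codon_counts.values())
        f = (n * sum_p2 - 1) / (n - 1)
        if f > 0:
            f_by_class[DEGENERACY[aa]].append(f)

    # 2 accounts for Met and Trp; the other terms are (number of amino
    # acids in the class) / (mean F of the class).
    enc = 2.0
    for degeneracy, n_amino_acids in ((2, 9), (3, 1), (4, 5), (6, 3)):
        values = f_by_class[degeneracy]
        if not values:
            raise ValueError(f"no usable {degeneracy}-fold amino acids")
        enc += n_amino_acids / (sum(values) / len(values))
    return min(enc, 61.0)


# Invented test sequences: one codon per amino acid vs. all codons equally.
first_codon = {}
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        first_codon.setdefault(aa, codon)
biased_gene = "".join(codon * 5 for codon in first_codon.values())
uniform_gene = "".join(c * 10 for c, aa in CODON_TABLE.items() if aa != "*")

enc_biased = effective_number_of_codons(biased_gene)
enc_uniform = effective_number_of_codons(uniform_gene)
```

With these artificial inputs, the maximally biased gene sits at the lower bound of the range and the uniform gene at the capped upper bound, matching the interpretation given in the text.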
The conventional view of housekeeping genes has been that they are low-copy-number genes in the genome. If this notion holds, the slower nonsynonymous rate could also be attributed to lower genetic redundancy in housekeeping genes. To examine this issue, we used the gene families compiled in the ENSEMBL database to determine the copy numbers of the genes used in our study. Genes with more than one member in each species were grouped into multiple-copy gene families.
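The grouping rule in the last sentence can be expressed as a short sketch. The data layout (a dictionary mapping family identifiers to per-species member lists), the species labels, and the function name `is_multiple_copy` are assumptions made for illustration only; they do not reflect the ENSEMBL file format or the authors' scripts.

```python
def is_multiple_copy(family_members, species):
    """Return True if the family has more than one member in every species.

    family_members: dict mapping a species label to a list of gene IDs.
    species: the species labels that must each hold more than one member.
    """
    return all(len(family_members.get(sp, [])) > 1 for sp in species)


# Invented example families (hypothetical identifiers).
SPECIES = ("species_a", "species_b")
families = {
    "fam1": {"species_a": ["a1", "a2"], "species_b": ["b1", "b2", "b3"]},
    "fam2": {"species_a": ["a3"], "species_b": ["b4"]},
    "fam3": {"species_a": ["a4", "a5"], "species_b": ["b5"]},
}

multiple_copy = {
    family_id: is_multiple_copy(members, SPECIES)
    for family_id, members in families.items()
}
print(multiple_copy)
```

Requiring more than one member in each species, rather than in any species, is the stricter reading of the rule; a family expanded in only one lineage is not counted as multiple-copy under this sketch.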
Results and Discussion
Possible Causes for Rate Differences
Why do housekeeping genes evolve more slowly in general than tissue-specific genes? To answer this question, we examined several factors that might affect evolutionary rates. First, the higher nonsynonymous substitution rates in tissue-specific genes could be due to higher mutation rates. However, this explanation is unlikely: if higher mutation rates were the main reason, we would expect to see much higher synonymous substitution rates in tissue-specific genes than in housekeeping genes. Instead, the average synonymous rate in tissue-specific genes does not show the same magnitude of increase as the average nonsynonymous rate (10% increase vs. 100% increase), suggesting that mutation rate differences are not the main cause of the nonsynonymous rate difference.
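The argument above can be written out as a back-of-the-envelope calculation. The sketch below is a simplification introduced here, not a model from the study: it assumes that synonymous sites are effectively neutral, so that $K_s$ tracks the mutation rate $\mu$, and it summarizes everything else that shapes the nonsynonymous rate in a single factor $f$ (a hypothetical symbol). The superscripts TS and HK denote tissue-specific and housekeeping genes.

```latex
\begin{aligned}
K_s &\propto \mu, \qquad K_a \propto f\,\mu \\[4pt]
\frac{K_a^{\mathrm{TS}}}{K_a^{\mathrm{HK}}}
  &= \frac{f^{\mathrm{TS}}}{f^{\mathrm{HK}}}\cdot
     \frac{K_s^{\mathrm{TS}}}{K_s^{\mathrm{HK}}} \\[4pt]
\frac{f^{\mathrm{TS}}}{f^{\mathrm{HK}}}
  &\approx \frac{2.0}{1.1} \approx 1.8
\end{aligned}
```

Under these assumptions, a 10% difference in synonymous rate accounts for only a small part of the twofold difference in nonsynonymous rate; most of it falls on the residual factor $f$, which absorbs every cause other than mutation rate and so does not by itself single out selective constraint.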
Second, housekeeping genes might be under stronger selective constraints than tissue-specific genes. To compare selective constraints on these two types of genes, we calculated Ka/Ks for each gene, because this ratio has commonly been used as an indicator of selective constraint. The average Ka/Ks for housekeeping genes is 0.093, whereas it is more than twofold higher for lung-specific genes (mean Ka/Ks = 0.259) and liver-specific genes (mean Ka/Ks = 0.233). The Wilcoxon rank sum test shows that Ka/Ks is significantly higher for each type of tissue-specific gene, except prostate-specific genes, and for the pooled tissue-specific genes (table 3), indicating that, on average, tissue-specific genes are under weaker selective constraints than housekeeping genes. It is hard to know, however, to what extent selective constraint contributes to the rate difference, although previous studies have attributed the rate difference between broadly and narrowly expressed genes solely to differences in selective constraint (Duret and Mouchiroud 2000; Hastings 1996).
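For readers unfamiliar with the rank sum test used above, the following sketch shows how such a comparison of Ka/Ks values could be carried out. It is an illustration only: it uses the large-sample normal approximation with a tie correction and no continuity correction, the function names are hypothetical, and the Ka/Ks values are invented rather than taken from the study. With samples this small, an exact test would normally be preferred.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test via the normal approximation.

    Returns (U statistic for x, z score, two-sided p value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2

    # Variance of U with the standard correction for ties.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values()) / (n * (n - 1))
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term)
    if var_u == 0:
        raise ValueError("all values are identical; the test is undefined")

    z = (u - n1 * n2 / 2) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, z, p


# Invented Ka/Ks values for two gene classes (not data from the study).
housekeeping = [0.05, 0.08, 0.10]
tissue_specific = [0.20, 0.25, 0.30]
u_stat, z_score, p_value = rank_sum_test(housekeeping, tissue_specific)
print(u_stat, round(z_score, 3), round(p_value, 4))
```

Because the test works on ranks, it makes no assumption about the shape of the Ka/Ks distribution, which suits ratios that are typically skewed toward small values.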
Although it is not clear how many housekeeping genes are needed for the normal function of an organism, the subset examined here offers a glimpse of how these important genes evolve. More studies are needed to better characterize housekeeping genes. For example, how should housekeeping genes be defined in terms of their function and expression? How are housekeeping genes organized in the genome? Is there a minimum number of housekeeping genes required for the normal function of cells, and if so, does this number vary with the complexity of the organism? Finally, how do housekeeping genes differ from tissue-specific genes from expressional, functional, and evolutionary perspectives?
Literature Cited
Butte, A. J., V. J. Dzau, and S. B. Glueck. 2001. Further defining housekeeping, or "maintenance," genes: focus on "A compendium of gene expression in normal human tissues". Physiol. Genomics 7:95-96.
Duhig, T., C. Ruhrberg, O. Mor, and M. Fried. 1998. The human surfeit locus. Genomics 52:72-78.
Duret, L., and D. Mouchiroud. 2000. Determinants of substitution rates in mammalian genes: expression pattern affects selection intensity but not mutation rate. Mol. Biol. Evol. 17:68-70.
Duret, L., D. Mouchiroud, and M. Gouy. 1994. HOVERGEN, a database of homologous vertebrate genes. Nucleic Acids Res. 22:2360-2365.
Hastings, K. E. M. 1996. Strong evolutionary conservation of broadly expressed protein isoforms in the troponin I gene family and other vertebrate gene families. J. Mol. Evol. 42:631-640.
Hsiao, L.-L., F. Dangond, T. Yoshida, et al. (23 co-authors). 2001. A compendium of gene expression in normal human tissues. Physiol. Genomics 7:97-104.
Hughes, A. L., and M. K. Hughes. 1995. Self peptides bound by HLA class I molecules are derived from highly conserved regions of a set of evolutionarily conserved proteins. Immunogenetics 41:257-262.
Kagawa, Y., and S. Ohta. 1990. Regulation of mitochondrial ATP synthesis in mammalian cells by transcriptional control. Int. J. Biochem. 22:219-229.
Lercher, M. J., A. O. Urrutia, and L. D. Hurst. 2002. Clustering of housekeeping genes provides a unified model of gene order in the human genome. Nat. Genet. 31:180-183.
Warrington, J. A., A. Nair, M. Mahadevappa, and M. Tsyganskaya. 2000. Comparison of human adult and fetal expression and identification of 535 housekeeping/maintenance genes. Physiol. Genomics 2:143-147.
Watson, J. D., N. H. Hopkins, J. W. Roberts, J. A. Steitz, and A. M. Weiner. 1965. Molecular biology of the gene, vol. 1, p. 704. Benjamin/Cummings, Menlo Park, Calif.
Wilcoxon, F. 1945. Individual comparisons by ranking methods. Biometrics 1:80-83.
Wright, F. 1990. The 'effective number of codons' used in a gene. Gene 87:23-29.
Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. CABIOS 13:555-556.