Further defining housekeeping, or "maintenance," genes
Focus on "A compendium of gene expression in normal human tissues"

ATUL J. BUTTE1, VICTOR J. DZAU2,3 and SUSAN B. GLUECK4

1 Children’s Hospital Informatics Program, Boston, Massachusetts 02115
2 Editor-in-Chief, Physiological Genomics
3 Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts 02115
4 Deputy Editor, Physiological Genomics

HOUSEKEEPING GENES are constitutively expressed to maintain cellular function. As such, they are presumed to produce the minimally essential transcripts necessary for normal cellular physiology. With the advent of microarray technology, it has recently become possible to identify at least the "starter set" of housekeeping genes, as exemplified by the work of Velculescu et al. (2), as well as by Warrington et al. (3) in a paper previously published in this journal. In that paper, Warrington et al. examined the expression of 7,000 full-length genes in 11 different human tissues, both adult and fetal, to determine the suite of transcripts that were commonly expressed throughout human development and in different tissues. They identified 535 transcripts via microarray hybridization as likely candidates for housekeeping genes, or, as the authors proposed, "maintenance" genes.

Among the findings of this previous study was the fact that there are constitutively expressed genes particular to certain tissues, as well as sets of genes expressed only in the fetus or only in the adult. Cluster analysis was employed to identify groups of transcripts expressed in the seven adult tissues studied (695) and the four fetal tissues tested (767). The analytical technology was able to detect fluctuations in expression between fetal and adult samples, indicating that such genes, while permanently activated, are perhaps not required at the same level throughout development. Forty-seven transcripts commonly expressed between fetal and adult samples did not vary in expression level and might serve as useful internal controls for other expression profiling experiments.

In this online release of Physiological Genomics, Hsiao et al. (Ref. 1; see page 97 in this release) extend the findings of Warrington et al. (3) to focus on the tissue-selective gene clusters that appear characteristic of particular organs. Their data have been logged into a database (HuGE Index: Human Gene Expression Index, http://www.hugeindex.org) that is intended to serve as a benchmark compendium of expression profiles of various human tissues, enabling researchers to compare normal and disease states. Hsiao et al. (1) reverse-transcribed cDNAs from RNA extracted from 19 different tissue types, then hybridized cRNAs made from the resulting transcript pool to Affymetrix GeneChip HuGeneFL oligonucleotide human genome microarrays. They classified any gene expressed in all 19 tissues as a housekeeping/maintenance gene and then identified tissue-selective genes using principal component analysis.
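The selection rule described above, in which a gene counts as a maintenance gene only if it is detected in every tissue sampled, can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the function name `maintenance_genes`, the tissue labels, and the gene symbols are hypothetical toy values, and the sketch assumes that a present/absent call has already been made for each gene in each tissue.

```python
def maintenance_genes(calls):
    """Return the genes called present in every tissue.

    `calls` maps a tissue name to the set of genes detected in that tissue.
    """
    tissue_sets = list(calls.values())
    if not tissue_sets:
        return set()
    # A gene qualifies only if it appears in the detected set of all tissues.
    return set.intersection(*tissue_sets)


# Toy present/absent calls (hypothetical, for illustration only).
calls = {
    "liver": {"ACTB", "GAPDH", "ALB"},
    "muscle": {"ACTB", "GAPDH", "TPM1"},
    "lung": {"ACTB", "GAPDH", "ITGB2"},
}

print(sorted(maintenance_genes(calls)))
```

Under this strict rule, adding a tissue can only shrink the list, which may be one reason the count of maintenance genes depends on how many tissues are profiled.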

The 451 maintenance genes encode proteins that mediate cellular functions such as transcription, translation, and signaling. Of these genes, 358 were common to the previous study. As would be expected, differences in expression levels of the housekeeping genes from one tissue to another could be used to separate and cluster different tissues successfully according to tissue type (certainly, if these tissues can be visually distinguished from each other, then there must exist genes that accurately separate the tissues as well). Further statistical analysis revealed groups of tissue-selective genes predominantly expressed in a single tissue type; for example, among the muscle-selective genes were those associated with contraction, such as tropomyosin, or with glucose metabolism, such as lactate dehydrogenase. Additionally, genes were identified whose expression levels varied most significantly between patient samples. For example, in lung, integrin-β2 was among genes with a coefficient of variation greater than two standard deviations from the mean. The authors (1) noted that variation in integrin-β2 is involved in a predisposition to recurrent bacterial lung infections.
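One way to read the variability criterion above is to compute each gene's coefficient of variation (standard deviation divided by mean) across patient samples from one tissue, and then flag genes whose coefficient lies more than two standard deviations above the mean coefficient for that tissue. The Python sketch below assumes that reading, which the text does not spell out. The function names, the `n_sd` parameter, and the expression values are hypothetical and are not taken from the HuGE Index.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean of one gene's values."""
    return stdev(values) / mean(values)


def highly_variable(expression, n_sd=2.0):
    """Return genes whose CV exceeds mean(CV) + n_sd * SD(CV).

    `expression` maps a gene name to its expression values across
    patient samples from a single tissue.
    """
    cvs = {gene: coefficient_of_variation(v) for gene, v in expression.items()}
    cv_values = list(cvs.values())
    cutoff = mean(cv_values) + n_sd * stdev(cv_values)
    return {gene for gene, cv in cvs.items() if cv > cutoff}


# Toy data (hypothetical): nine stable genes and one variable gene,
# each measured in three patient samples.
expression = {f"stable_gene_{i}": [100, 110, 90] for i in range(9)}
expression["variable_gene"] = [50, 200, 350]

print(highly_variable(expression))
```

With only a handful of genes no value can sit two standard deviations above the mean, so a cutoff of this kind is meaningful only when many genes are measured, as on a microarray.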

This paper by Hsiao et al. (1) and the previous work by Warrington et al. (3) are important in highlighting the potential that gene expression profiling holds for characterizing normal biological states and disease states. Their finding that particular genes are highly variable among normal individuals raises the question of how precisely a housekeeping gene should be defined.


Is a housekeeping gene expressed at the same level throughout the day or life of an organ?
Certainly, genes are differentially expressed during development, but what about lower frequency cycles? For example, there are many tissues in the body that respond to glucocorticoids. Since serum cortisol levels can normally vary two- to threefold during the course of a day, expression of particular genes in those tissues may vary several-fold or more depending on when samples are acquired. Genes responding to other hormones, such as insulin, luteinizing hormone, or follicle-stimulating hormone, may vary at a different frequency or amplitude (for instance, gene expression peaking meal to meal, or month to month). Thus the set of genes expressed at one point of the cycle may differ from the set expressed at a different point. How can we expand the definition of housekeeping genes to include these genes that change in the course of normal molecular physiology?

Is a housekeeping gene expressed at the same level throughout an organ?
At a rough level for particular organs, it may be the case that a few samples of liver accurately reflect the other parts of the liver. Can the same be said for muscle? Certainly, different parts of the brain and placenta may express different genes, and each sample differs in homogeneity of cell types. How can we expand the definition of housekeeping genes in this hierarchical sense; that is, if a gene is a housekeeping gene in a subcomponent of an organ, is it considered a housekeeping gene in the entire organ?

Does the list of housekeeping genes constitute the minimum set of genes that need to be expressed for a cell to survive?
Is there any cell type that expresses only this set of genes? If a cell requires expression of either gene A, B, or C for its normal functioning, would we consider any one of these genes to belong to the set of housekeeping genes? How can we expand the definition of housekeeping genes to consider genes that potentially share a function?
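The "gene A, B, or C" scenario can be made concrete with a small Python sketch. It is only one possible formalization, offered here as an assumption rather than as a definition from the papers discussed: genes are grouped into hypothetical sets of interchangeable members, and a group is treated as housekeeping if every tissue expresses at least one member. The function name, group names, and toy calls are all invented for illustration.

```python
def housekeeping_groups(groups, calls):
    """Return names of gene groups that are covered in every tissue.

    `groups` maps a group name to a set of interchangeable genes.
    `calls` maps a tissue name to the set of genes detected in that tissue.
    A group is covered in a tissue if at least one member is detected there.
    """
    return {
        name
        for name, members in groups.items()
        if all(members & detected for detected in calls.values())
    }


# Toy example (hypothetical): each tissue expresses a different member
# of the group "ABC", so no single gene is detected everywhere.
groups = {"ABC": {"A", "B", "C"}, "D_only": {"D"}}
calls = {
    "tissue1": {"A", "D"},
    "tissue2": {"B", "D"},
    "tissue3": {"C"},
}

print(sorted(housekeeping_groups(groups, calls)))
```

In this toy case a gene-by-gene rule would find no housekeeping genes at all, whereas the group-level rule recognizes that the shared function is present in every tissue.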

The list of genes expressed in normal tissues described in this issue will become a valuable resource for many investigators. It is encouraging that the authors are making online queries against these data possible. Starting with this list of housekeeping genes, the next bioinformatics questions to ask of this data set could include the following two. First, over 270 genes on the HuGeneFL microarray have been annotated with the molecular function of DNA binding, and over 150 genes are known to code for proteins with zinc fingers. How do all of these potential transcription factors fare in the list of housekeeping genes? Are there specific transcription factors that are always present in every tissue? Second, over 4,900 genes on the HuGeneFL microarray have been linked to Online Mendelian Inheritance in Man (OMIM) disease entries through LocusLink. Are the diseases linked to housekeeping genes more likely to affect multiple organs than those diseases linked to non-housekeeping genes?

Continued delineation of the transcriptome in normal tissues, as in the work by Hsiao et al. (1), will certainly allow these and many other explorations of genomic physiology.

FOOTNOTES

Article published online before print. See web site for date of publication (http://physiolgenomics.physiology.org).

Address for reprint requests and other correspondence: A. Butte, Children’s Hospital, Boston, MA 02115 (E-mail: atul_butte{at}harvard.edu).

REFERENCES

  1. Hsiao LL, Dangond F, Yoshida T, Hong R, Jensen RV, Misra J, Dillon W, Lee KF, Clark KE, Haverty P, Weng Z, Mutter GL, Frosch MP, MacDonald ME, Milford EL, Crum CP, Bueno R, Pratt RE, Mahadevappa M, Warrington JA, Stephanopoulos G, Stephanopoulos G, and Gullans SR. A compendium of gene expression in normal human tissues. Physiol Genomics 7: 97–104, 2001. First published October 2, 2001; 10.1152/physiolgenomics.00040.2001.
  2. Velculescu VE, Madden SL, Zhang L, Lash AE, Yu J, Rago C, Lal A, Wang CJ, Beaudry GA, Ciriello KM, Cook BP, Dufault MR, Ferguson AT, Gao Y, He TC, Hermeking H, Hiraldo SK, Hwang PM, Lopez MA, Luderer HF, Mathews B, Petroziello JM, Polyak K, Zawel L, Zhang W, Zhang X, Zhou W, Haluska FG, Jen J, Sukumar S, Landes GM, Riggins GJ, Vogelstein B, and Kinzler KW. Analysis of human transcriptomes. Nat Genet 23: 387–388, 1999.
  3. Warrington JA, Nair A, Mahadevappa M, and Tsyganskaya M. Comparison of human adult and fetal expression and identification of 535 housekeeping/maintenance genes. Physiol Genomics 2: 143–147, 2000.