Gene expression profiling of mouse postnatal cerebellar development

RYO MATOBA1,2, SAKAE SAITO1,2, NORIKO UENO1, CHIYURI MARUYAMA1,2, KENICHI MATSUBARA1,2 and KIKUYA KATO1,2

1 Taisho Laboratory of Functional Genomics, Nara Institute of Science and Technology
2 Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation, 8916-5 Takayama, Ikoma, Nara, 630-0101, Japan


    ABSTRACT
 
Expression patterns of 1,869 genes were determined using adapter-tagged competitive PCR (ATAC-PCR) at 6 time points during mouse postnatal cerebellar development. The expression patterns were classified into 12 clusters that were further assembled into 3 groups by hierarchical cluster analysis. Among the 1,869 genes, 1,053 known genes were assigned to 90 functional categories. Statistically significant correlation between the clusters or groups of gene expression and the functional categories was ascertained. Genes involved in oncogenesis or protein synthesis were highly expressed during the earlier stages of development. Those responsible for brain functions such as neurotransmitter receptor and synapse components were more active during the later stages of development. Many other genes also showed expression patterns in accordance with literature information. The gene expression patterns and the inferred functions were in good agreement with anatomical as well as physiological observations made during the developmental process.

adapter-tagged competitive polymerase chain reaction; cluster analysis; SwissProt database


    INTRODUCTION
 
Development of the nervous system is a complicated but ordered process executed by a large number of genes. Each part of the developmental process is executed by a set of genes whose expression levels must be defined by programs encoded in the genome. Analysis of these programs is one of the most challenging subjects of modern biology.

So far we have tried to understand such programs by analyzing individual genes. Genetic approaches based on mutant analysis have resulted in the isolation of key genes for specific processes, particularly in invertebrate model systems. However, such approaches are insufficient to completely address the genetic mechanism of any developmental process. To approach such issues, it is absolutely necessary to describe expression states of the entire gene population.

Large-scale analysis of gene expression is not conceptually new (14). The main factor preventing extensive application of this concept has been technological. For example, Northern hybridization (27) is not applicable for testing thousands of genes, whereas differential hybridization (27) and differential display (17) can point out only a fraction of the genes that are differentially expressed. cDNA microarrays have been successfully applied to monitor expression levels of active genes in budding yeast (7, 32) and some mammalian cell lines (13). So far, however, application to the mammalian nervous system has been limited to identification of differentially expressed genes (18, 40). High mRNA complexity results in higher background noise and consequently makes quantitation unreliable. Other approaches based on the frequency of clones, such as digital expression profiling (23) or serial analysis of gene expression (SAGE) (36), are labor intensive and not sensitive enough to detect modest changes.

These technical limitations can be overcome by introduction of adapter-tagged competitive PCR (ATAC-PCR) (15), an advanced form of quantitative PCR, which is characterized by addition of adapters with different spacer length to different cDNA samples. Because the technique is free from tedious steps inherent to conventional quantitative PCR, a large number of genes can be assayed. ATAC-PCR has high sensitivity, and can identify changes in gene expression as small as twofold without ambiguity. Combined with capillary sequencers, ATAC-PCR has the ability to process as many genes as DNA microarrays.

We analyzed gene expression profiles in postnatal mouse cerebellar development with ATAC-PCR. The cerebellum is one of the best-studied regions in the mammalian nervous system (2, 3). There are two major cell types: Purkinje cells and granule cells. Purkinje cells have already finished proliferation at birth, and grow in size and become functionally mature after birth. Granule cells proliferate after birth in the external germinal layer, the outermost layer of the cerebellar cortex. At birth, there are few granule cells, which then proliferate vigorously and soon exceed Purkinje cells both in number and volume. The granule cells then transform their shapes, and start to migrate inward through the molecular layer, extend axons (parallel fibers), and settle at the granule cell layer. Cell proliferation reaches a peak during the first week after birth, whereas cell migration and elongation primarily occur during the second week after birth. This process is completed by the third week after birth. At that point, the cerebellar cortex enters into its second phase of development, which involves gradual maturation of synapses without morphological changes, resulting in the full assumption of adult characteristics.

The cerebellar cortex has unique features that are favorable for gene expression analysis. Because the majority of cellular mass is occupied by a single cell type (i.e., the granule cell) (3), RNA obtained from the whole structure is likely to represent that from granule cells. The postnatal developmental processes occur synchronously. Furthermore, naturally occurring mutants (30) and targeted gene disruptions blocking particular steps of its development are available (5, 21). A preliminary study supported this view (20) and has prompted us to survey the expression profiles of genes on a large scale.

We determined expression levels of 1,869 genes by ATAC-PCR at 6 time points during postnatal cerebellar development. The expression patterns classified by cluster analysis were compared with a new functional category table constructed using information obtained from the literature. The gene expression patterns and the inferred functions were in good agreement with anatomical as well as physiological observations made during the developmental process.


    METHODS
 

Adapter-tagged competitive PCR.
The protocol of ATAC-PCR was essentially the same as previously described (15, 20). RNA preparations from each sample were converted into cDNA, digested by a restriction enzyme, and then ligated to an adapter by using the cohesive end created by the enzyme. We routinely used multiple adapters, each having a common sequence in the "outer" region that lies next to an "inner" spacer of different sizes. After mixing the ligated samples into a single tube, PCR amplification was performed using an adapter primer and a gene-specific primer. The products were separated by polyacrylamide gel electrophoresis. Products from different samples can be discriminated by the sizes of the inner spacer region. The amount of each fragment reflects the amount of original template, and relative expression levels in each sample can be deduced from their signal intensities.

Usually, six different cDNA samples attached to different adapters were used, three of which were assigned to different amounts of control cDNA samples. In the case of the cerebellar experiment, cDNA derived from the adult cerebrum was used as the control: 10, 3, and 1 portions of cDNA with different adapters were included in each PCR reaction. As for cerebellar samples, one portion each of three out of six cerebellar samples was included in the reaction. PCR amplification was performed with the carboxyfluorescein (FAM)-labeled adapter primer corresponding to the common region of adapters and with a gene-specific primer. Products were separated by polyacrylamide gel electrophoresis. With each PCR reaction, a calibration curve was made with three control samples. Thus accurate quantitation can be made with the three cerebellar samples. Sequences of primers and adapters are as follows: C1S-FAM, 5'-6FAM-GTACATATTGTCGTTAGAACGC-3'; MB-1, 5'-GTACATATTGTCGTTAGAACGCG-3' and 5'-GATCCGCGTTCTAACGACAATATGTAC-3'; MB-2, 5'-GTACATATTGTCGTTAGAACGCGACT-3' and 5'-GATCAGTCGCGTTCTAACGACAATATGTAC-3'; MB-3, 5'-GTACATATTGTCGTTAGAACGCGCATACT-3' and 5'-GATCAGTATGCGCGTTCTAACGACAATATGTAC-3'; MB-4, 5'-GTACATATTGTCGTTAGAACGCGATCCATACT-3' and 5'-GATCAGTATGGATCGCGTTCTAACGACAATATGTAC-3'; MB-5, 5'-GTACATATTGTCGTTAGAACGCGTCAATCCATACT-3' and 5'-GATCAGTATGGATTGACGCGTTCTAACGACAATATGTAC-3'; and MB-6, 5'-GTACATATTGTCGTTAGAACGCGTACTCAATCCATACT-3' and 5'-GATCAGTATGGATTGAGTACGCGTTCTAACGACAATATGTAC-3'.
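
The calibration step described above can be illustrated with a minimal sketch. It assumes that the calibration curve is a straight line fitted by least squares to log signal against log amount for the three control samples; the text does not specify the functional form, so this is only one plausible choice. The function names and all peak areas are hypothetical.

```python
import math

def fit_calibration(amounts, signals):
    """Least-squares line through (log10 amount, log10 signal) for the controls."""
    xs = [math.log10(a) for a in amounts]
    ys = [math.log10(s) for s in signals]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def relative_amount(signal, slope, intercept):
    """Invert the calibration line: template amount in units of control portions."""
    return 10 ** ((math.log10(signal) - intercept) / slope)

# Hypothetical peak areas for the 10-, 3-, and 1-portion cerebrum controls
control_amounts = [10.0, 3.0, 1.0]
control_signals = [5000.0, 1500.0, 500.0]
slope, intercept = fit_calibration(control_amounts, control_signals)

# Hypothetical peak areas for the three cerebellar samples in the same reaction
sample_signals = [1000.0, 2500.0, 4000.0]
estimates = [relative_amount(s, slope, intercept) for s in sample_signals]
```

Because each reaction carries its own three control peaks, the line is refitted for every gene, so differences in amplification efficiency between primer pairs do not enter the comparison.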

An ABI model 3700 DNA analyzer was used for gel separation, with a current production rate of more than 1,000 assays per day. Genes subjected to ATAC-PCR analysis were selected in descending order of abundance, prioritizing known genes.

To obtain expression patterns at six time points, three combinations of RNA samples were used with each gene. cDNA derived from adult cerebrum was used as a standard for calibration. With each set, two assays with different combinations of calibrations were performed. The first consisted of 10 portions of the standard with adapter MB-1, 3 portions with the MB-3, 1 portion with the MB-6, and 1 portion each of cerebellar samples with other adapters. The second consisted of 10 portions of the standard with adapter MB-6, 3 portions with the MB-4, 1 portion with the MB-1, and 3 portions each of cerebellar samples with other adapters. Using these two individual assays with different calibrations, most of the obtained data points were within the range of calibration. Those data which had discrepancies between the two were discarded. The overall success rate of assays was about 70%.
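
The way two assays with different sample loadings might be reconciled can be sketched as follows. The text states only that data points with discrepancies between the two assays were discarded. The twofold tolerance, the use of a geometric mean, and the rule of accepting a single in-range assay are all assumptions made for illustration, and the function name is hypothetical.

```python
import math

CAL_MIN, CAL_MAX = 1.0, 10.0   # calibration range in control portions (10/3/1 design)
MAX_FOLD_DISCREPANCY = 2.0     # hypothetical tolerance; the criterion is not stated

def combine_assays(est_1portion, est_3portion):
    """Return a per-portion expression level, or None if the point is discarded.

    est_1portion: estimate from the assay loaded with 1 portion of sample cDNA
    est_3portion: estimate from the assay loaded with 3 portions of sample cDNA
    """
    candidates = []
    if CAL_MIN <= est_1portion <= CAL_MAX:
        candidates.append(est_1portion / 1.0)
    if CAL_MIN <= est_3portion <= CAL_MAX:
        candidates.append(est_3portion / 3.0)
    if not candidates:
        return None                  # neither assay fell inside the calibration range
    if len(candidates) == 1:
        return candidates[0]         # assumption: a single in-range assay is accepted
    fold = max(candidates) / min(candidates)
    if fold > MAX_FOLD_DISCREPANCY:
        return None                  # the two assays disagree: discard the data point
    return math.sqrt(candidates[0] * candidates[1])  # geometric mean of the two
```

Loading 3 portions in the second assay moves weakly expressed genes up into the calibrated range, which is consistent with the observation that most data points fell within the range of calibration in at least one assay.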

Statistical analyses.
Cluster analysis was performed using ClustanGraphics3 developed by Wishart (41). The data matrix was at first standardized to z-score, and cluster analysis was performed using Ward’s method (41). Optimal reordering of the cases was also performed using the software. Among the several hierarchical clustering procedures we tested, Ward’s method gave consistently better results than other methods. In the case of cerebellar development, clustering was truncated at the 12-cluster level. The proximity matrix was then reordered by a recently developed method. This method reorders the cases so that the rank correlation between the actual and target row-wise ranks is maximized.
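
The standardization and clustering steps can be sketched in a few lines. This is not the ClustanGraphics3 implementation; it is a naive illustration of z-score standardization followed by agglomeration under Ward's criterion, stopped at a chosen number of clusters. The use of the population standard deviation, the function names, and the toy profiles are assumptions.

```python
import math

def zscore(row):
    """Standardize one expression profile to zero mean and unit variance."""
    n = len(row)
    mean = sum(row) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in row) / n)
    if sd == 0:
        return [0.0] * n             # flat profile: no temporal pattern to cluster
    return [(v - mean) / sd for v in row]

def ward_clusters(profiles, k):
    """Merge clusters until k remain, each time fusing the pair whose union
    gives the smallest increase in within-cluster sum of squares."""
    clusters = [([i], list(p)) for i, p in enumerate(profiles)]  # (members, centroid)
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                na, nb = len(ma), len(mb)
                dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
                cost = na * nb / (na + nb) * dist2   # Ward's merge cost
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        na, nb = len(ma), len(mb)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append((ma + mb, centroid))
    return [sorted(members) for members, _ in clusters]

# Toy profiles at six time points (hypothetical values)
raw = [
    [9.0, 8.0, 6.0, 3.0, 2.0, 1.0],   # high early, then declining
    [8.0, 9.0, 5.0, 2.0, 1.0, 1.0],   # high early, then declining
    [1.0, 1.0, 2.0, 6.0, 8.0, 9.0],   # low early, high late
    [2.0, 1.0, 3.0, 7.0, 9.0, 8.0],   # low early, high late
]
standardized = [zscore(r) for r in raw]
groups = ward_clusters(standardized, 2)
```

For the full data set the same call would be made with `k = 12`. The pairwise search here is cubic in the number of genes, so a dedicated package is the practical choice for 1,869 profiles.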

Functional categories were assigned to all known genes in our expressed sequence tag (EST) collections based on the SwissProt database and/or Medline abstracts. The functional categories are mutually independent, with several exceptions: "intracellular signal transduction" does not include serine-threonine kinases and tyrosine kinases; "cell surface molecule" includes all those except adhesion molecules.

Statistical tests to select functional categories enriched in specific clusters or groups were based on the binomial distribution. Functional categories enriched in a specific cluster or group were selected by comparing their occurrence in that cluster or group with their occurrence in the entire population. The cutoff points were arbitrarily set at either 0.01 (Fig. 3) or 0.05 (Fig. 4).
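
A binomial test of this kind can be sketched as follows. The sketch assumes a one-sided tail probability, with the category's frequency in the whole population as the per-gene probability. The text does not say whether the tests were one-sided or two-sided, and the function names and example counts are hypothetical.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): probability of k or more hits."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

def binomial_lower_tail(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p): probability of k or fewer hits."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(0, k + 1))

def enrichment_p(hits_in_cluster, cluster_size, category_total, population_total):
    """Chance of at least this many category members in the cluster if genes
    carried the category at the population-wide frequency."""
    p = category_total / population_total
    return binomial_upper_tail(hits_in_cluster, cluster_size, p)

# Hypothetical counts: a category with 15 members among 1,053 annotated genes,
# 6 of which fall in a cluster of 100 genes
p_value = enrichment_p(6, 100, 15, 1053)
is_enriched = p_value < 0.01   # cutoff used for the cluster-level comparison
```

The lower tail serves the complementary question of whether a category is significantly rare in a cluster.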

Multidimensional scaling was performed using a software package (STATISTICA 97) with default settings.


    RESULTS
 

Experimental design.
We sequenced more than 20,000 3' end cDNA fragments from mouse cerebellum and hippocampus cDNA libraries and obtained more than 10,000 unique sequences (19). Genes were selected in descending order of abundance estimated from appearance in the libraries, and among rare species (those that appeared only once) known genes were prioritized. A total of 1,869 genes were selected, and gene-specific primers for ATAC-PCR were constructed. The use of genes in this collection is preferable to the use of a preselected standard set of genes such as the National Center for Biotechnology Information (NCBI) UniGene set, because our customized set did not include any genes absent in cerebellar mRNA.

We selected six time points throughout early postnatal cerebellar development (2 days, 4 days, 8 days, 12 days, 3 wk, and 6 wk after birth) and determined expression levels. The expression levels in the developing cerebellum relative to those in the adult cerebrum were assayed using ATAC-PCR (15, 20). An outline of ATAC-PCR is schematically represented in Fig. 1 and described in detail in METHODS. To ensure accuracy, we repeated the assay with the following three combinations of time points: 2 days, 4 days, and 8 days; 12 days, 3 wk, and 6 wk; and 4 days, 12 days, and 6 wk. The expression profile covering 1,869 genes was generated from these data sets. (Please refer to the Supplementary Material for this article, published online at the Physiological Genomics web site.)



 
Fig. 1. Outline of adapter-tagged competitive PCR (ATAC-PCR). We used adult cerebrum as the control RNA. Three samples were from postnatal mouse cerebellum at 6 time points.

 
Cluster analysis of developmental expression patterns.
The gene expression patterns thus obtained were classified by hierarchical cluster analysis (10, 39), which is especially useful for analysis of temporal changes (7, 20, 32). We used Ward’s method, followed by a reordering of the similarity matrix. A schematic representation of the data matrix is shown in Fig. 2. Truncation at the 12-cluster level led to the following categorization: four clusters are represented by expression patterns characterized by elevated expression at early stages of development (group A, A1–A4); five clusters are represented by patterns characterized by elevated expression at the later stages (i.e., 12 days, 3 wk, or 6 wk) (group B, B1–B5); the remaining three clusters display complicated expression patterns (group C, C1–C3). More than 80% of the genes can be placed into the group A or B clusters. Note that there are several outliers (i.e., expression patterns atypical to each cluster) that are inherent to hierarchical cluster analysis of a large data set. Because outliers were few and setting exclusion criteria was difficult, they were not removed. The following analysis was based on these 12 clusters and 3 groups.



 
Fig. 2. Cluster analysis of 1,869 genes using their expression patterns. The 1,869 genes are vertically aligned: the data matrix is standardized to z-score, i.e., converted to zero mean and unit variance, and schematically represented. Each row represents the expression pattern of each gene, and columns represent time points: column 1, 2 days after birth; column 2, 4 days; column 3, 8 days; column 4, 12 days; column 5, 3 wk; and column 6, 6 wk. Expression levels are indicated by color, with scale shown at the bottom right: dark red, yellow, and dark green represent high, middle, and low expression levels, respectively. Four clusters, A1–A4, are characterized by elevated expression at the early stages (from 2 days to 8 days) followed by decline; five clusters, B1–B5, are characterized by low expression at early stages followed by elevated expression at later stages (from 12 days to 6 wk); and three clusters, C1–C3, are characterized by other complicated patterns.

 
Strategy to correlate expression pattern and gene function.
We then attempted to correlate the gene expression patterns to functions. Among the 1,869 genes assayed, 1,053 genes were matched to sequences deposited in the GenBank database and thus had functional annotations. To explore the characteristics of each cluster, we first classified known genes into a limited number of functional categories, and then assessed the correlation between functional categories and clusters of genes obtained by expression profiling. Each functional category was labeled by a keyword that well represents the character of the category.

In a study using budding yeast, Munich Information Center for Protein Sequences (MIPS) classification of genes was used for correlating gene functions and their expression patterns (34). However, it is not applicable to mammalian systems, because the unicellular system lacks most of the functions that characterize multicellular organisms. Furthermore, it is not optimized for use with cluster analysis of gene expression.

In our EST collections, about 1,600 known genes were identified, including the 1,053 known genes assayed here; this covers more than one-quarter of the 5,841 known genes listed in the UniGene set (UniGene Build no. 75). All of the known genes were classified into 90 functional categories. Keywords representing functional categories are supplied as supplementary material. The categorization of functions was done mainly based on descriptions in the SwissProt database and/or Medline abstracts. Biologically relevant activities were used to assign functional categories to the gene products. In addition, we intended the assignment criteria to be rather "loose" so that enough genes could be included in each category. For example, detailed classifications of metabolic pathways were avoided. Functional categories are varied in nature. For example, "eye" and "testis" indicate sites of expression. The category "Ca2+-related" represents genes whose products require Ca2+ for their function, such as intracellular signal transducers, carrier proteins of unknown function, and calcium channels. Categories such as "brain" and "intracellular signal transduction" can cover a wide variety of genes. The functional category "brain" represents in some cases the site of expression, but in other cases, this represents functions related to the nervous system. Up to four keywords were assigned to each gene.

Correlation of functional categories enriched in specific gene expression patterns.
We selected functional categories that appeared more than 10 times in the 1,869 genes and searched for statistically significant correlations between the categories and the clusters representing expression patterns. The results are shown in Fig. 3. There were five functional categories that were enriched in group A clusters, characterized by high expression during early stages of development followed by decline. They included "cancer-related," which was enriched in cluster A1, "ribosomal protein," in clusters A1 and A3, "RNA processing," in cluster A1, "intracellular signal transduction," in cluster A4, and "transcription factor," in cluster A4. Eight functional categories were enriched in group B clusters, which showed initial low expression followed by augmented expression during late stages of development (12 days, 3 wk, and 6 wk after birth). They included "carbohydrate metabolism," which was enriched in cluster B2, and those related to brain functions such as "ion channel and transporter" in cluster B3, "neurotransmitter receptor" in cluster B3, and "synapse component" in cluster B4. Figure 3 also includes genes that are predominantly or specifically expressed in the cerebellum. These two annotations were defined by data obtained by ATAC-PCR: genes whose expression levels at 6 wk exceeded those in the adult cerebrum by more than 20-fold were defined as "cerebellum specific," and those with 10- to 20-fold higher levels were defined as "cerebellum dominant." These two annotations were also enriched in group B clusters.
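
The two expression-based annotations reduce to a threshold rule on the 6-wk ratio against adult cerebrum, sketched below. The function name is hypothetical. Treating a ratio of exactly 10 as "cerebellum dominant" is an assumption, because the text gives the range only as 10- to 20-fold.

```python
def cerebellum_annotation(fold_vs_cerebrum_6wk):
    """Annotate a gene from its 6-wk cerebellum / adult cerebrum expression ratio."""
    if fold_vs_cerebrum_6wk > 20:
        return "cerebellum specific"
    if fold_vs_cerebrum_6wk >= 10:       # assumption: lower bound is inclusive
        return "cerebellum dominant"
    return None                          # no expression-based annotation

# Hypothetical ratios for four genes
ratios = [35.0, 20.0, 12.5, 3.0]
annotations = [cerebellum_annotation(r) for r in ratios]
```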



 
Fig. 3. Functional categories enriched in specific clusters of gene expression patterns during mouse cerebellar development. Each value represents the number of genes belonging to each cluster and each functional category. Dark shade indicates statistically significant enrichment. Light shade indicates statistically significant rare cases.

 
Because the number of genes belonging to each cluster was often not sufficiently large, we next searched for statistically significant correlations between the functional categories and the three groups of clusters: groups A, B, and C. Functional categories appearing more than five times were selected and examined. Results are shown in Fig. 4. Two functional categories were enriched in group A: "cancer-related" and "ribosomal protein." These findings suggest that genes for cell proliferation and protein synthesis are actively expressed in early development, when granule cells are replicating vigorously. In group B, seven functional categories were enriched: "RNA synthesis," "carbohydrate metabolism," "brain," "ion channel and transporter," "synapse component," "neurotransmitter receptor," and "oligodendroglia." Except for "RNA synthesis" and "carbohydrate metabolism," all categories are related to functions of the mature nervous system. The functional annotation, "cerebellum-dominant," was also enriched in group B. These findings were in concert with the maturation of the cerebellum, as determined by its morphological and functional organization.



 
Fig. 4. Functional categories enriched in specific groups of gene expression patterns during mouse cerebellar development. Each value represents the number of genes belonging to each group and each functional category. Dark shade indicates statistically significant enrichment. Light shade indicates statistically significant rare cases.

 
Segregation of "carbohydrate metabolism" in group B was likely to reflect the fact that mature neurons consume large amounts of energy. Interestingly, genes encoding components needed for protein synthesis (i.e., ribosomal proteins) were found to be actively expressed during early development, and genes for RNA synthesis were actively expressed during late development and/or adulthood. This discrepancy suggests that in mature neurons, RNA is degraded at an accelerated rate, and/or that more weight is given to transcriptional controls than to translational controls.

Expression patterns of individual genes.
Several interesting insights into individual genes can be obtained using the functional categories.

Ten of fifteen "cancer-related" genes belonged to the group A clusters (Fig. 5, "cancer-related"), strongly suggesting their involvement in cell proliferation. Genes encoding ribosomal proteins showed elevated expression during early stages of development, more than half of them being grouped in clusters A1 and A3, characterized by elevated expression around 4 days (Fig. 5, "ribosomal protein"). This peak represents the active phase of protein synthesis vs. the cell proliferation stage. It is interesting to note that genes whose products belong to the functional category "cell growth" did not necessarily exhibit elevated expression patterns during the early stage of cerebellar development (Fig. 5, "cell growth"). Further studies are needed, but the majority of the experimental evidence for these genes has been obtained not with intact nervous systems but with other in vitro systems. We suspect that functions in the nervous system in vivo might be different from those in in vitro systems.



 
Fig. 5. Gene expression of members of functional categories. The data are schematically shown as in Fig. 2. In each row, cells represent expression levels at specific time points: from left to right, 2 days, 4 days, 8 days, 12 days, 3 wk, and 6 wk. The color scale is the same as in Fig. 2.

 
Most of the genes whose products belong to the "neurotransmitter receptor" and "ion channel and transporter" functional categories exhibited elevated expression during late development (i.e., 3 wk and/or 6 wk), suggesting that the mature cerebellum has more complex electrophysiological properties than the immature cerebellum (Fig. 5, "neurotransmitter receptor" and "ion channel and transporter"). On the other hand, some genes were found to be highly expressed during early development. Members of these gene families consist of subtypes whose products possess unique electrophysiological properties. Some of the subtypes are known to be more active during development than during adulthood. For example, in the cerebellum, a gene encoding a Ca2+-dependent K+ channel exhibited transiently elevated expression from 7 days to 14 days (22). Another example is the N-methyl-D-aspartate (NMDA) receptor (1, 33, 37). The gene for the NR2B subunit is expressed beginning in the late embryonic stages, and its expression gradually disappears during the second postnatal week. In contrast, expression of the genes encoding the NR2A and NR2C subunits is absent at birth and appears only during postnatal development. Although these genes did not appear in our data set, two group A genes whose products belong to the functional category "ion channel and transporter" are likely to follow similar patterns. The gene encoding the M1 muscarinic acetylcholine receptor, which showed transiently high expression at day 8, is such an example; in addition, its behavior was in accord with previous results by Northern hybridization (26).

The functional categories "brain," "intracellular signal transduction," and "cytoskeleton" each consist of a large number of gene products, whose characteristics are not entirely consistent. Only members of the "brain" category were found to be enriched in group B. Genes whose products have important functions in the mature brain were in general expressed to a lesser degree in the early stages and were progressively increased in expression over the course of development.

Transcription factors were enriched in cluster A4, but the biological meaning of this correlation is unclear. This category includes 12 transcription factors known to be involved in development or differentiation. Their expression patterns were not consistent with one another, and no overall tendency was observed.

Genes specific to oligodendroglia were found in group B with one exception: a gene encoding a brain-specific lipid-binding protein (Fig. 5, "oligodendroglia"). There are very few oligodendroglia at birth, but they increase in number over the course of development (8). Our finding is in complete agreement with this observation and suggests that their multiplication is most active at around 3 wk after birth.

Several dominant expression patterns were found for specific genes encoding adhesion molecules and matrix proteins, which are known to be functionally important during development of the nervous system (Fig. 5, "adhesion molecule," "cell matrix protein"). In particular, three gene products, reelin (9), tenascin (4), and matrix metalloproteinases (35) are known to play important roles during postnatal cerebellar development. The function of reelin was inferred from the Reeler mutant mouse to be involved in the formation of layer structures. In situ hybridization experiments revealed dense expression in the external germinal layer (29), and results with ATAC-PCR agreed well with these observations. Tenascin is likely to be expressed in astrocytes in the cerebellar cortex and is thought to take part in guidance of granule cell migration (4). Its expression was transiently elevated during the middle stage of development, which agrees well with the timing expected from its proposed physiological action.

Programmed cell death is an important mechanism of development. Recent studies demonstrated that the majority of programmed cell death occurred in the brain within the region of cell proliferation, although many cells were dying in the postmitotic regions (6). Five apoptosis-related gene products, Bax-α (12), Requiem (11), TDAG51 (24), Nedd2 (16), and Siva (25), have been found to induce apoptosis. These activities were demonstrated only with blood cell lines and not with neuronal cell lines. Their expression patterns, except that of Siva, belonged to group A clusters, demonstrating elevated expression during early development (Fig. 5, "apoptosis"). This behavior is likely to be related to the known programmed cell death in the external germinal layer, the region where granule cells are proliferating. The role of late-onset apoptosis-related genes is awaiting further analyses.

Multidimensional scaling of developmental stages.
All of the above analyses have focused on characterizing how each gene product functions during postnatal cerebellar development. Instead, the relationships between each developmental stage can be explored based on similarities between each time point calculated using the values of the 1,869 genes as variables. We applied multidimensional scaling (28), which is a statistical procedure for fitting a set of points in a space such that the distances between points correspond as closely as possible to a given set of dissimilarities (or similarities) between a set of objects. Here, Pearson’s product-moment correlation coefficient was calculated between each of the developmental stages, and they were plotted in three-dimensional space. In this analysis, distances between each time point represent dissimilarities deduced from the correlation coefficient: expression states of time points are similar when they are close to each other. Because of the characteristics of the correlation coefficient, the quantitative aspects of each transcript were ignored, and the expression status of each gene was treated equally. As shown in Fig. 6, day 2 and day 4 are located close to each other, and day 12, week 3, and week 6 are close to each other. They represent two distinct groups, each representing the cell proliferation stage and the maturation stage. Day 8 is far away from the others, indicating distinctive physiological states.
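
The input to the scaling procedure can be sketched as a matrix of dissimilarities between time points. The sketch below computes Pearson's correlation coefficient between every pair of time points across genes and converts it to a dissimilarity as 1 - r. That conversion is an assumption, since the text does not state how dissimilarities were deduced from the coefficient. The scaling itself, performed by the statistical package, is not reproduced, and the toy data and function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson's product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def dissimilarity_matrix(columns):
    """1 - r for every pair of time points; columns[t] holds all genes at time t."""
    k = len(columns)
    return [[1.0 - pearson(columns[i], columns[j]) for j in range(k)]
            for i in range(k)]

# Toy matrix: rows are genes, columns are 2d, 4d, 8d, 12d, 3w, 6w (hypothetical)
genes = [
    [9.0, 8.0, 6.0, 3.0, 2.0, 1.0],
    [8.0, 9.0, 5.0, 2.0, 1.0, 1.0],
    [1.0, 1.0, 2.0, 6.0, 8.0, 9.0],
    [2.0, 1.0, 3.0, 7.0, 9.0, 8.0],
    [5.0, 5.0, 9.0, 5.0, 5.0, 5.0],
]
time_points = [list(col) for col in zip(*genes)]   # transpose to one list per time point
d = dissimilarity_matrix(time_points)
```

Because the correlation coefficient is unchanged by shifting or rescaling either vector, two time points with proportional expression levels across genes are at distance zero, which is the sense in which the quantitative aspects of each transcript are ignored.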



 
Fig. 6. Multidimensional scaling of mouse postnatal cerebellar development. Each time point was placed in three dimensions so that distances between each time point optimally represent the similarities. 2d, 2 days after birth; 4d, 4 days after birth; 12d, 12 days after birth; 3w, 3 wk after birth; and 6w, 6 wk after birth.

 

    DISCUSSION
 
Expression profiling, monitoring of gene expression with a large number of genes, is a technical challenge as well as an intellectual challenge. The current trend in molecular biology is to attempt to understand biological processes in terms of structures and interactions of individual molecules. This reductionist approach is not applicable to the vast amounts of data obtained with expression profiling. A method of summarizing and integrating the data is necessary to extract valuable information. Cluster analysis classifies gene expression patterns into a limited number of groups, making possible the handling of a small number of gene clusters, rather than an intractable number of individual genes.

Information regarding gene product functions that have been determined thus far is stored mainly as articles in scientific journals. To couple the information with gene expression patterns obtained by cluster analysis, we added to each gene keywords representing functional categories which summarize literature information. Assignment of keywords allowed for statistical and categorical correlation of gene expression patterns and gene product functions.

We analyzed the expression of 1,869 genes at 6 time points, which covered only parts of the entire population. Nevertheless, they gave a number of interesting findings and clues for further analysis. From the analysis of known genes, their expression profiles were in complete agreement with the anatomy and physiology of the developing cerebellum. Strong correlation between gene product functions and expression patterns suggests that functions of novel genes might be deduced from their expression patterns. In most of the cases, however, such assertions cannot be made easily, because functions of known genes are not as highly condensed in the clusters. When used in combination with other information such as primary structure, however, hypotheses can be made, leading to more focused experimental design.

Several molecules have been identified as being involved in controlling the development of cerebellar granule cells. A helix-loop-helix transcription factor, Math1, is essential for the genesis of granule cells (5). The sonic hedgehog pathway has been shown to be involved in postnatal granule cell proliferation (38), supporting a model of local control of proliferation by Purkinje cells. We also identified here at least 63 transcription factors, many of which are likely to control the expression of other genes during development. Linking the expression patterns of these genes to those of other genes in their respective pathways is a difficult but very important problem.

The use of functional categories was successful for correlating gene expression patterns with product functions. As demonstrated by the analyses of adhesion molecules and apoptosis-related genes, it is also a good method for discriminating and characterizing members within a functional category. However, several methodological problems were observed. The results are subject to a large degree of experimenter bias. We do not claim that our keyword list is a definitive version, and keyword lists designed by other investigators may provide different interpretations. In addition, keywords are condensations of information and may often exclude important details. More importantly, the literature information is derived from past research and therefore reflects any biases of that work; the specific conditions under which that work was performed may or may not be relevant to the current situation. For example, genes belonging to the category "cell growth" were not necessarily highly expressed during the cell proliferation period. The cell growth-promoting effects of these gene products have been studied mainly in in vitro systems. Their main functions in the brain, on the other hand, could be distinct from the previously observed in vitro activities. Thus the analysis of expression patterns may suggest other, unknown functions of known gene products.

Interpretation of expression patterns is subject to several other caveats. It should be noted that protein levels do not necessarily parallel mRNA levels and that protein levels are more likely to reflect the physiological states of samples (20). Repeated experiments exploring the limits of ATAC-PCR revealed that differences of less than twofold were ambiguous. Expression patterns with marked changes and those with minor changes were indistinguishable after standardization, and data sets with small changes may yield incorrect expression patterns. Transcription of some genes might be loosely controlled, and patterns might differ among individual samples.
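
The effect of standardization described above can be shown with a small sketch. It assumes that standardization means scaling each profile to mean 0 and standard deviation 1, which may differ from the procedure actually applied to the ATAC-PCR data; the two profiles and the helper names are hypothetical, and the twofold cutoff is taken from the text.

```python
from statistics import mean, pstdev


def standardize(profile):
    """Scale a profile to mean 0 and standard deviation 1."""
    m, s = mean(profile), pstdev(profile)
    return [(x - m) / s for x in profile]


def max_fold_change(profile):
    """Ratio of the highest to the lowest relative expression level."""
    return max(profile) / min(profile)


# Hypothetical relative expression levels at 6 time points.
marked = [1.0, 2.0, 4.0, 6.0, 8.0, 10.0]      # tenfold change
minor = [1.00, 1.02, 1.06, 1.10, 1.14, 1.18]  # same shape, 1.18-fold change

z_marked = standardize(marked)
z_minor = standardize(minor)

# After standardization the two profiles coincide (up to rounding error),
# so the magnitude of the change has to be checked separately.
same_shape = all(abs(a - b) < 1e-9 for a, b in zip(z_marked, z_minor))
print("indistinguishable after standardization:", same_shape)

for name, profile in (("marked", marked), ("minor", minor)):
    fold = max_fold_change(profile)
    status = "usable" if fold >= 2.0 else "ambiguous (less than twofold)"
    print(f"{name}: {fold:.2f}-fold, {status}")
```

Filtering on the fold change before standardization is one way to keep profiles with small changes from being clustered as if they were informative.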

Both cluster analysis and multidimensional scaling can be used to demonstrate the relationships among developmental stages by means of a similarity matrix of expression states. For a small number of cases, multidimensional scaling is more appropriate because the relationships are displayed visually in a two- or three-dimensional space. It should be noted that the results shown in Fig. 6 are based on the assumption that every gene carries equal weight. Although the results are in agreement with anatomical and physiological observations, it may be more appropriate to assume that genes of some functional categories weigh more than others. For example, since the properties of neurons are determined mainly by their electrophysiological properties, assigning higher weights to genes belonging to such groups might better reflect the states of tissues.
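
The role of gene weights can be sketched as follows. The example builds a stage-by-stage dissimilarity matrix using a weighted Euclidean distance, which is an assumption and not necessarily the measure underlying Fig. 6; the function name `stage_distance_matrix`, the toy expression values, and the weights are all hypothetical. The resulting matrix is the kind of input that a multidimensional scaling routine would take.

```python
from math import sqrt


def stage_distance_matrix(expression, weights=None):
    """Weighted Euclidean distances between developmental stages.

    expression[g][t] is the level of gene g at stage t; weights[g] is the
    weight of gene g (all genes weigh 1.0 when weights is None).
    """
    n_stages = len(expression[0])
    if weights is None:
        weights = [1.0] * len(expression)
    dist = [[0.0] * n_stages for _ in range(n_stages)]
    for a in range(n_stages):
        for b in range(a + 1, n_stages):
            d = sqrt(sum(w * (row[a] - row[b]) ** 2
                         for w, row in zip(weights, expression)))
            dist[a][b] = dist[b][a] = d
    return dist


# Toy data: 3 genes (rows) measured at 3 stages (columns).
expression = [
    [0.0, 3.0, 3.0],  # changes between stages 1 and 2
    [0.0, 0.0, 4.0],  # changes between stages 2 and 3
    [1.0, 1.0, 1.0],  # constant
]

equal = stage_distance_matrix(expression)
# Hypothetical weighting that emphasizes the second gene, e.g. a gene from
# a category judged more relevant to the state of the tissue.
weighted = stage_distance_matrix(expression, weights=[1.0, 4.0, 1.0])

print("equal weights:", equal)
print("second gene weighted 4x:", weighted)
```

With equal weights, stage 2 lies somewhat closer to stage 1 than to stage 3; after up-weighting the second gene, the separation between stages 2 and 3 grows while that between stages 1 and 2 is unchanged, so the choice of weights directly changes the apparent relationships among stages.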

A tissue is composed of multiple cell types, and expression patterns obtained from RNA extracted from whole tissue are a weighted average of those of each cell type. Care is therefore necessary in interpreting our data, especially for the middle stages of development; at these stages, there are at least three granule cell types: those in the outer epidermal germinal layer, inner germinal layer, and granule cell layer. In general, observed changes in expression in each cell type would be masked by opposite changes in other cell types, such that the real changes of gene expression in individual cells should be sharper than those observed. More accurate observations can be made by sampling each cell layer separately using advanced techniques such as laser-capture microdissection (31).
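
A small numerical sketch of this masking effect follows. The cell-type fractions and expression levels are hypothetical, and the fractions are held constant between the two time points for simplicity, which will not be true in the developing cerebellum.

```python
def tissue_level(fractions, levels):
    """Whole-tissue expression as the weighted average over cell types."""
    return sum(f * x for f, x in zip(fractions, levels))


# Hypothetical contribution of three granule cell populations to total RNA.
fractions = [0.5, 0.2, 0.3]

# Hypothetical expression of one gene in each population at two time points:
# a 4-fold rise in the first, no change in the second, a 4-fold fall in the third.
early = [1.0, 2.0, 4.0]
late = [4.0, 2.0, 1.0]

observed_early = tissue_level(fractions, early)
observed_late = tissue_level(fractions, late)
observed_fold = observed_late / observed_early

print(f"whole tissue: {observed_early:.2f} -> {observed_late:.2f} "
      f"({observed_fold:.2f}-fold)")
```

Under these assumptions, fourfold changes within individual cell types appear as a change of less than twofold in the whole tissue, which is below the range that could be measured reliably here.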

The work presented here is only the beginning of a long period of postgenomic research. The technologies for both the assays and the statistical analysis may be widely applicable to the analysis of complicated systems, including other parts of the nervous system.


    ACKNOWLEDGMENTS
 
We thank Ikuko Ikeda, Keiko Miyaoka, and Satoko Maki for technical assistance.

The expression data and the list of functional categories will be available from our web site (http://love2.aist-nara.ac.jp).

This work was partly supported by a Grant-in-Aid from the Ministry of Education, Science, Sports, and Culture.


    FOOTNOTES
 
Article published online before print. See web site for date of publication (http://physiolgenomics.physiology.org).

Address for reprint requests and other correspondence: K. Kato, Taisho Laboratory of Functional Genomics, Nara Institute of Science and Technology, 8916-5 Takayama, Ikoma, Nara, 630-0101, Japan (E-mail address: kkato@bs.aist-nara.ac.jp).

1 Supplementary Material to this article is available online at http://physiolgenomics.physiology.org/cgi/content/full/4/2/155/DC1.


    REFERENCES
 

  1. Akazawa C, Shigemoto R, Bessho Y, Nakanishi S, and Mizuno N. Differential expression of five N-methyl-D-aspartate receptor subunit mRNAs in the cerebellum of developing and adult rats. J Comp Neurol 347: 150–160, 1994.
  2. Altman J. Postnatal development of the cerebellar cortex in the rat. J Comp Neurol 145: 353–514, 1972.
  3. Altman J and Bayer SA. Development of the Cerebellar System: In Relation to its Evolution, Structure, and Functions. Boca Raton, FL: CRC, 1996.
  4. Bartsch S, Bartsch U, Dorries U, Faissner A, Weller A, Ekblom P, and Schachner M. Expression of tenascin in the developing and adult cerebellar cortex. J Neurosci 12: 736–749, 1992.
  5. Ben-Arie N, Bellen HJ, Armstrong DL, McCall AE, Gordadze PR, Guo Q, Matzuk MM, and Zoghbi HY. Math1 is essential for genesis of cerebellar granule neurons. Nature 390: 169–172, 1997.
  6. Blaschke AJ, Staley K, and Chun J. Widespread programmed cell death in proliferative and postmitotic regions of the fetal cerebral cortex. Development 122: 1165–1174, 1996.
  7. Chu S, DeRisi J, Eisen M, Mulholland J, Botstein D, Brown PO, and Herskowitz I. The transcriptional program of sporulation in budding yeast. Science 282: 699–705, 1998.
  8. Curtis R, Cohen J, Fok-Seang J, Hanley MR, Gregson NA, Reynolds R, and Wilkin GP. Development of macroglial cells in rat cerebellum I. Use of antibodies to follow early in vivo development and migration of oligodendrocytes. J Neurocytol 17: 43–54, 1988.
  9. D’Arcangelo G, Miao GG, Chen SC, Soares HD, Morgan JI, and Curran T. A protein related to extracellular matrix proteins deleted in the mouse mutant reeler. Nature 374: 719–723, 1995.
  10. Eisen MB, Spellman PT, Brown PO, and Botstein D. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA 95: 14863–14868, 1998.
  11. Gabig TG, Mantel PL, Rosli R, and Crean CD. Requiem: a novel zinc finger gene essential for apoptosis in myeloid cells. J Biol Chem 269: 29515–29519, 1994.
  12. Han J, Sabbatini P, Perez D, Rao L, Modha D, and White E. The E1B 19K protein blocks apoptosis by interacting with and inhibiting the p53-inducible and death-promoting Bax protein. Genes Dev 10: 461–477, 1996.
  13. Iyer VR, Eisen MB, Ross DT, Schuler G, Moore T, Lee JCF, Trent JM, Staudt LM, Hudson J Jr, Boguski MS, Lashkari D, Shalon D, Botstein D, and Brown PO. The transcriptional program in the response of human fibroblasts to serum. Science 283: 83–87, 1999.
  14. Kato K. A collection of cDNA clones with specific expression patterns in mouse brain. Eur J Neurosci 2: 704–711, 1990.
  15. Kato K. Adaptor-tagged competitive PCR: a novel method for measuring relative gene expression. Nucleic Acids Res 25: 4694–4696, 1997.
  16. Kumar S, Kinoshita M, Noda M, Copeland NG, and Jenkins NA. Induction of apoptosis by the mouse Nedd2 gene, which encodes a protein similar to the product of the Caenorhabditis elegans cell death gene ced-3 and the mammalian IL-1 beta-converting enzyme. Genes Dev 8: 1613–1626, 1994.
  17. Liang P and Pardee AB. Differential display of eukaryotic messenger RNA by means of the polymerase chain reaction. Science 257: 967–971, 1992.
  18. Luo L, Salunga RC, Guo H, Bittner A, Joy KC, Galindo JE, Xiao H, Rogers KE, Wan JS, Jackson MR, and Erlander MG. Gene expression profiles of laser-captured adjacent neuronal subtypes. Nat Med 5: 117–122, 1999.
  19. Matoba R, Kato K, Saito S, Kurooka C, Maruyama C, Sakakibara Y, and Matsubara K. Gene expression in mouse cerebellum during its development. Gene 241: 125–131, 2000.
  20. Matoba R, Kato K, Kurooka C, Maruyama C, Sakakibara Y, and Matsubara K. Correlation between gene functions and developmental expression patterns in the mouse cerebellum. Eur J Neurosci 12: 1357–1371, 2000.
  21. Millen KJ, Wurst W, Herrup K, and Joyner AL. Abnormal embryonic cerebellar development and patterning of postnatal foliation in two mouse Engrailed-2 mutants. Development 120: 695–706, 1994.
  22. Muller YL, Reitstetter R, and Yool AJ. Regulation of Ca2+-dependent K+-channel expression in rat cerebellum during postnatal development. J Neurosci 18: 16–25, 1998.
  23. Okubo K, Hori N, Matoba R, Niiyama T, Fukushima A, Kojima Y, and Matsubara K. Large scale cDNA sequencing for analysis of quantitative and qualitative aspects of gene expression. Nat Genet 2: 173–179, 1992.
  24. Park CG, Lee SY, Kandala G, Lee SY, and Choi Y. A novel gene product that couples TCR signaling to Fas (CD95) expression in activation-induced cell death. Immunity 4: 583–591, 1996.
  25. Prasad KV, Ao Z, Yoon Y, Wu MX, Rizk M, Jacquot S, and Schlossman SF. CD27, a member of the tumor necrosis factor receptor family, induces apoptosis and binds to Siva, a proapoptotic protein. Proc Natl Acad Sci USA 94: 6346–6351, 1997.
  26. Russo-Neustadt A, Rotter A, and Frostholm A. Distribution of muscarinic receptors in the developing cerebellum. Brain Res 548: 179–186, 1991.
  27. Sambrook J, Fritsch EF, and Maniatis T. Molecular Cloning: A Laboratory Manual. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press, 1989.
  28. Schiffman SS, Reynolds ML, and Young FW. Introduction to Multidimensional Scaling: Theory, Methods, and Applications. New York: Academic, 1983.
  29. Schiffmann SN, Bernier B, and Goffinet AM. Reelin mRNA expression during mouse brain development. Eur J Neurosci 9: 1055–1071, 1997.
  30. Sidman RL. Experimental Neurogenetics: Genetics of neurological and psychiatric disorders. New York: Raven, 1983.
  31. Simone NL, Bonner RF, Gillespie JW, Emmert-Buck MR, and Liotta LA. Laser-capture microdissection: opening the microscopic frontier to molecular analysis. Trends Genet 14: 272–276, 1998.
  32. Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, Eisen MB, Brown PO, Botstein D, and Futcher B. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell 9: 3273–3297, 1998.
  33. Takahashi T, Feldmeyer D, Suzuki N, Onodera K, Cull-Candy SG, Sakimura K, and Mishina M. Functional correlation of NMDA receptor epsilon subunits expression with the properties of single channel and synaptic currents in the developing cerebellum. J Neurosci 16: 4376–4382, 1996.
  34. Tavazoie S, Hughes JD, Campbell MJ, Cho RJ, and Church GM. Systematic determination of genetic network architecture. Nat Genet 22: 281–285, 1999.
  35. Vaillant C, Didier-Bazes M, Hutter A, Belin MF, and Thomasset N. Spatiotemporal expression patterns of metalloproteinases and their inhibitors in the postnatal developing rat cerebellum. J Neurosci 19: 4994–5004, 1999.
  36. Velculescu VE, Zhang L, Vogelstein B, and Kinzler KW. Serial analysis of gene expression. Science 270: 484–487, 1995.
  37. Watanabe M, Mishina M, and Inoue Y. Distinct spatiotemporal expressions of five NMDA receptor channel subunit mRNAs in the cerebellum. J Comp Neurol 343: 513–519, 1994.
  38. Wechsler-Reya RJ and Scott MP. Control of neuronal precursor proliferation in the cerebellum by Sonic Hedgehog. Neuron 22: 103–114, 1999.
  39. Wen X, Fuhrman S, Michaels GS, Carr DB, Smith S, Barker JL, and Somogyi R. Large-scale temporal gene expression mapping of central nervous system development. Proc Natl Acad Sci USA 95: 334–339, 1998.
  40. Whitney LW, Becker KG, Tresser NJ, Caballero-Ramos CI, Munson PJ, Prabhu VV, Trent JM, McFarland HF, and Biddison WE. Analysis of gene expression in multiple sclerosis lesions using cDNA microarrays. Ann Neurol 46: 425–428, 1999.
  41. Wishart D. ClustanGraphics Primer. Edinburgh: Clustan, 1999.