Time-dependent changes in ARE-driven gene expression by use of a noise-filtering process for microarray data

Jiang Li1 and Jeffrey A. Johnson1,2,3,4

1 School of Pharmacy, University of Wisconsin, Madison, Wisconsin 53705
2 Molecular and Environmental Toxicology Center, University of Wisconsin, Madison, Wisconsin 53705
3 Waisman Center, University of Wisconsin, Madison, Wisconsin 53705
4 Center for Neuroscience, University of Wisconsin, Madison, Wisconsin 53705


    ABSTRACT
 TOP
 ABSTRACT
 INTRODUCTION
 MATERIALS AND METHODS
 RESULTS
 DISCUSSION
 References
 
The current study was designed to identify the time-dependent gene expression profiles of antioxidant responsive element (ARE)-driven genes induced by tert-butylhydroquinone (tBHQ). A set of simple noise-filtering methods was introduced to evaluate and minimize the variance of microarray datasets. Gene expression induced by tBHQ (10 µM) in IMR-32 human neuroblastoma cells was analyzed by means of large-scale oligonucleotide microarray. Rank analysis was used to determine the acceptable number of independent samples necessary to eliminate false positives from the dataset. A dramatic reduction in the number of genes passing the rank analysis was achieved by using a 3 x 3 matrix comparison. Reproducibility was evaluated based on the coefficient of variation for average difference change. Completion of these analyses revealed that 101 of the 9,670 genes examined showed dynamic changes with treatment ranging from 4 h to 48 h. Since certain ARE-driven genes have been already identified, gene clustering would presumably group them together based on similar regulation. Self-organizing map grouped the genes induced by tBHQ into 12 (4x3) distinct clusters. Those previously identified ARE-driven genes were shown to group into different clusters. Since all potential ARE-driven genes did not cluster together, we speculate that multiple transcription factors and/or multiple signal transduction pathways contribute to transcriptional activation of the ARE. In conclusion, many novel potential ARE-driven genes were identified in this study. They function in detoxification and antioxidant defense, neuronal proliferation and differentiation, and signal transduction. The noise-filtering process applied to these microarray data, therefore, has proven to be very useful in identification of the time-dependent changes in ARE-drive gene expression.

antioxidant responsive element; human neuroblastoma cell; phase II detoxifying enzymes; oligonucleotide microarray analysis


    INTRODUCTION
 TOP
 ABSTRACT
 INTRODUCTION
 MATERIALS AND METHODS
 RESULTS
 DISCUSSION
 References
 
TRANSCRIPTIONAL ACTIVATION of many antioxidant genes is mediated through a cis-acting enhancer named the antioxidant responsive element (ARE). AREs have been detected in the promoters of the following phase II detoxifying enzymes: rat and mouse glutathione S-transferase (GST) Ya, rat GST P, rat and human NAD(P)H:quinone oxidoreductase (NQO1), and murine heme oxygenase-1 (HO1) (12, 13). Several other phase II detoxifying enzymes and antioxidant genes such as {gamma}-glutamylcysteine ligase catalytic (GCLC) and regulatory (GCLR) subunits, as well as ferritin heavy and light chains, are also suspected to be upregulated through ARE activation (29, 31). AREs share a consensus motif, TGAC/TNNNGC, originally identified by mutational analysis of the rat GST-Ya ARE (10, 25). Thus identification of the ARE was an initial step leading to elucidation of the molecular mechanism for antioxidant response. tert-butylhydroquinone (tBHQ), a metabolite of the widely used food antioxidant butylated hydroxyanisole (BHA), is a known inducer of phase II enzyme systems including IMR-32 human neuroblastoma cell (18). Increasing evidence indicates that tBHQ induces these phase II detoxifying enzymes through the ARE (2, 23). In addition, the transcription factor, NF-E2-related factor 2 (Nrf2), is essential for ARE-mediated induction of these genes (12, 13, 29, 31). Our laboratory and others have already confirmed that exposure to tBHQ results in nuclear accumulation of Nrf2 (6, 17, 26). The binding of Nrf2 to the ARE leads to transcriptional activation of a score of genes such as NQO1, HO1, multiple forms of GST, glutathione reductase (GR), and thioredoxin reductase (TR) (1, 4, 8, 14, 15, 19, 21).

Oligonucleotide microarray analysis allows us to monitor the expression level of thousands of genes in parallel and has the potential to surpass traditional approaches in terms of sensitivity and speed. Presently, molecular biologists and bioinformatic analysts are working together to establish different biostatistical models for interpretation of microarray data. Most of these methods, however, are time-consuming and hard to manage, leading to unfavorable reviews from the common users. In addition, researchers often complain that microarray data can be so confusing that it sometimes provides few clues for further investigation. A major reason for these types of interpretations is the inherent variability of microarray data that leads to false-positive signals concealing real changes. This is true especially for genes with low basal expression levels (28). Another principal contributing factor is the complexity of transcriptional regulatory networks responsible for the observed changes in gene expression. This makes it difficult to predict the signal transduction pathway(s) driving the changes based only on gene expression profiles. Clearly, the main goal of the analysis is to determine whether the microarray data make biological sense. In the present study, oligonucleotide microarrays were used to identify the time-dependent changes in ARE-driven genes induced by tBHQ. We also introduced a set of simple, useful methods comprising rank evaluation, cross comparisons, and reproducibility analysis to evaluate and minimize the variance of microarray datasets.


    MATERIALS AND METHODS
 TOP
 ABSTRACT
 INTRODUCTION
 MATERIALS AND METHODS
 RESULTS
 DISCUSSION
 References
 
Reagents.
The tert-butylhydroquinone (tBHQ) was from Fisher; monoclonal antibody against NQO1 was provided by Dr. David Ross (University of Colorado); polyclonal antibody against GCLC and GCLR was provided by Dr. Terrance Kavanagh (University of Washington); an monoclonal antibody against HO1 was from BD Biosciences.

Cell culture and drug treatment.
IMR-32 human neuroblastoma cells were grown in DMEM supplemented with fetal calf serum (10%), 100 IU/ml penicillin, and 100 mg/ml streptomycin. Cultures were maintained at 37°C in a humidified 10% CO2 atmosphere. IMR-32 cells (60% confluent) were treated with tBHQ at a final concentration of 10 µM or vehicle (0.01% EtOH) for 4, 8, 24, and 48 h prior to harvesting.

Microarray analysis.
Details for the sample preparation and microarray processing are available from Affymetrix (Santa Clara, CA). Total RNA was prepared from cells by using a Qiagen RNeasy midi kit. Eight micrograms of total RNA was used to prepare double-stranded cDNA (Superscript choice kit; GIBCO-BRL) with a T7-(dT)24 primer containing a T7 RNA polymerase promoter site (Operon). The cRNA was prepared and biotin labeled by in vitro transcription (Enzo Biochem). Labeled cRNA was fragmented by incubation at 94°C for 35 min in the presence of 40 mM Tris acetate, pH 8.1, 100 mM potassium acetate, and 30 mM magnesium acetate. Fifteen micrograms of fragmented cRNA was hybridized 16 h at 45°C to an HG U95a array (Affymetrix, Santa Clara, CA). This array contains ~12,000 probe sets corresponding to 9,670 full-length human cDNAs. Sixteen to twenty pairs of 25-mer oligonucleotides that span the coding region represent each gene. After hybridization, the GeneChips were automatically washed and stained with streptavidin-phycoerythrin by using a fluidics Station. Finally, probe arrays were scanned at 3-µm resolution using the GeneChip System confocal scanner made for Affymetrix by Aligent. Affymetrix Microarray Suite 4.1 was used to scan and analyze the relative abundance of each gene derived from the average difference of intensities. Analysis parameters used by the software were set to values corresponding to moderate stringency (SDT = 30, SRT = 1.5). Fluorescence intensity was measured for each probe array and normalized to the average fluorescence intensity for the entire probe array. Output from the GeneChip analysis was merged with the Unigene or GenBank descriptor and stored as an Excel data spreadsheet. Gene cluster analysis was performed using GENECLUSTER 1.0 (MIT, Cambridge, MA) (30). Gene categorization was based on NetAffx (http://www.netaffx.com/index2.jsp) and other sources.

RT-PCR.
Validation of upregulated expression was performed by RT-PCR for NQO1, GCLR, GCLC, and HO1. PCR primers which were specific for these genes were used for cDNA synthesis and amplification as follows: NQO1, forward 5'-CATTCTGAAAGGCTGGTTTGA-3' and reverse 5'-TTTCTTCCATCCTTCCAGGAT-3', resulting in a PCR product of 300 bp; GCLR, forward 5'-TTTGGTCAGGGAGTTTCCAG-3' and reverse 5'-ACACAGCAGGAGGCAAGATT-3', resulting in a PCR product of 400 bp; GCLC, forward 5'-TGAGATTTAAGCCCCCTCCT-3' and reverse 5'-TTGGGATCAGTCCAGGAAAC-3', resulting in a PCR product of 380 bp; HO1, forward 5'-ACATCTATGTGGCCCTGGAG-3' and reverse 5'-GCGGTAGAGCTGCTTGAACT-3', resulting in a PCR product of 380 bp. Total RNA, purified from cell pellets with Trizol Reagent (GIBCO-BRL), was subjected to RT-PCR with the Promega Transcription System (Madison, WI). The reaction mix (20 µl) contained 200 µM dNTP, 0.45 µM of each primer, 1 µg of total RNA, and AMV reverse transcriptase (15 U). RNA was reverse transcribed at 42°C for 30 min. DNA was amplified by an initial incubation at 94°C for 4 min followed by 25 ~ 35 cycles of 94°C for 0.5 min, 55 ~ 58°C for 0.6 min, 72°C for 0.5 min, and a final extension at 72°C for 7 min. the PCR products were then separated by electrophoresis in a 1.2% agarose gel and visualized by ethidium bromide staining. The number of cycles and melting temperature were adjusted depending on the genes amplified.

Western blot.
The whole cell extract was resolved by SDS-PAGE, transferred to polyvinylidene difluoride (PVDF) membrane, and blocked with 5% nonfat milk. The PVDF membranes were incubated with primary antibodies [mouse monoclonal antibody against NQO1 (1:1,000), rabbit polyclonal antibody against GCLC (1:2,000) or GCLR (1:2,000), and mouse monoclonal antibody against HO1 (1:250)] overnight at 4°C. Membranes were washed and incubated with anti-rabbit (1:5,000) or anti-mouse (1:5,000) IgG labeled with horseradish peroxidase for 2 h. The chemiluminescence emitted from luminal oxidization by horseradish peroxidase was detected by using the enhanced chemiluminescence Western blotting detection system (Amersham Pharmacia Biotech, Piscataway, NJ).


    RESULTS
 TOP
 ABSTRACT
 INTRODUCTION
 MATERIALS AND METHODS
 RESULTS
 DISCUSSION
 References
 
Rank evaluation.
IMR-32 cells were treated with tBHQ (10 µM) or vehicle (0.01% EtOH), and four pairs of samples were generated over a 4-mo period. All samples were analyzed using the human U95a arrays. As shown in Fig. 1A, there were dramatic variations in the number of genes called different (increased or decreased). Even between the pair-matched comparisons there were great variations (bold in Fig. 1A). Affymetrix gives great weight to the "difference call" as a basic parameter for evaluating transcriptional up- or downregulation. We defined "increase," "decrease," or "no change" of expression for individual genes based on ranking of the difference call from six 2 x 2, four 3 x 3, and one 4x4 matrix comparisons. Briefly, no Change = 0, marginal increase = 1, increase = 2, marginal decrease = -1, and decrease = -2. The cutoff value for increase or decrease was set as ±n2 (n >= 2) because of the marginal calls. For example, in a 2 x 2 matrix that generates four data sets, genes called "marginal increase(1)/decrease(-1)" would have to be called in each comparison to pass the criteria, n2 = 4/-4. Therefore, the cutoff in the 3 x 3 matrix would be n2 = 9/-9, and that in the 4 x 4 matrix would be n2 = 16/-16. Analysis of the data from the 8-h time point by a Latin square comparison is shown in Fig. 1B. A significant decrease in the number of genes passing the rank analysis was observed when comparing 2 x 2 with 3 x 3 for both increased and decreased genes. There also appeared to be a greater variability in the number of increased genes than that of decreased genes (see Fig. 3B). This variability in gene number, however, does not reflect a consistent change in the same gene, thereby accounting for the low number of decreased genes following tBHQ treatment. These data indicate that a minimum of three independent samples should be run to determine differential gene expression between control and treatment groups. Application of this 3 x 3 matrix analysis to the microarray data at each time point led to the identification of ~200 genes whose expression was consistently increased (196 genes) or decreased (4 genes).



View larger version (25K):
[in this window]
[in a new window]
 
Fig. 1. The n x n matrix comparisons plus ranking analysis. A: by means of cross-comparisons between treatment and control groups, the number of genes called increase (I) or decrease (D) are shown as I/D. Data gathered from pair-wise comparisons were in bold type. B: histograms of the number of genes that were called as I or D. In addition to four pair-matched "1 by 1" comparisons (Jan, Feb, Mar, and Apr), a random selection was conducted to generate six "2 by 2" (Jan+Feb, Jan+Mar, Jan+Apr, Feb+Mar, Feb+Apr, and Mar+Apr), four "3 by 3" (Jan+Feb+Mar, Jan+Feb+Apr, Feb+Mar+Apr, and Jan+Mar+Apr), and one "4 by 4" (Jan+Feb+Mar+Apr) matrix comparisons. Statistic analysis was conducted by one-tailed, unequal variance t-test. Significant P values (P < 0.05) and the comparisons are shown in the graph.

 


View larger version (39K):
[in this window]
[in a new window]
 
Fig. 3. Verification of upregulated gene expression in IMR-32 cells treated with tert-butylhydroquinone (tBHQ). After treatment with tBHQ for 8 and 24 h, total RNA (1 µg) from IMR-32 cells was extracted for microarray analysis (A) and RT-PCR (B) with specific primers for NAD(P)H:quinone oxidoreductase (NQO1), heme oxygenase-1 (HO1), {gamma}-glutamylcysteine ligase catalytic subunit (GCLC), and {gamma}-glutamylcysteine ligase regulatory subunit (GCLR). Statistical analysis was conducted by one-tailed, unequal variance t-test. *Significant difference of average difference between control (C) and treatment (T) groups (P < 0.05). Protein lysates from IMR-32 cells were prepared and subjected to Western blot with specific antibodies toward NQO1, GCLC, GCLR, and HO1 (C).

 
Reproducibility of data.
Evaluation of the reproducibility of paired experiments was based on the coefficient of variation (CV) (standard deviation/mean) for average difference change (ADC). Distribution curves of the CV for each time point are depicted in Fig. 2. Previously identified ARE-driven genes such as NQO1, HO1, TR, GR, and GCLR were used to determine a rational cutoff value for CV. The CVs of these genes were all below 1.0, and therefore we considered this value as our cutoff. Only those changes complying with this criterion were considered further for clustering analysis. Of the 200 genes, 101 (1% of 9,670 genes) genes made the final list. These genes are listed by functional categorization in the Supplemental Table,1 published online at the Physiological Genomics web site. Interestingly, there were no decreased genes, suggesting that tBHQ is a selective and potent activator of the ARE. Transcriptional upregulation peaked at 8 h with more than 50% of listed genes showing increased expression at this time point.



View larger version (25K):
[in this window]
[in a new window]
 
Fig. 2. Coefficient of variation (CV) as a parameter with which to evaluate the reproducibility of microarray analysis. The CV (standard deviation/mean) was calculated based on average difference change (ADC = ADtreat - ADbase, where ADtreat is average difference of treatment, and ADbase is average difference of baseline) from the data obtained after a 3 x 3 ranking analysis of the four time point experiments. Scatter plots are for means of ADC vs. CVs for 4 h data (A), 8 h data (B), 24 h data (C), and 48 h data (D). The arbitrary cutoff value of 1.0 is depicted as a horizontal line throughout the graphs.

 
Gene categorization.
In this study, one of the major clusters of transcriptionally activated genes in IMR-32 cells was the phase II detoxification enzymes. Within this category of genes, there were early-response genes [NQO1, HO1, GR, glutathione transferase M3 (GSTM3), GCLR, and TR] and late-response genes (ferritin heavy and light chain). Interestingly, unlike other phase II detoxification enzymes, HO1 was transiently upregulated at 8 h treatment and not changed at 24 or 48 h. Other anti-oxidative systems like hepatic dihydrodiol dehydrogenase (HDD) and its isoform, KIAA0119, malate NADP oxidoreductase, and breast cancer cytosolic NADP(+)-dependent malic enzyme were also found to be upregulated by tBHQ (Table 1). All of these genes ranked higher than others induced by tBHQ, suggesting their importance in the tBHQ-mediated antioxidant effect. We were surprised to note that GCLC did not change after treatment with tBHQ, albeit some researchers have reported this gene to be upregulated by tBHQ via the ARE (15, 20).


View this table:
[in this window]
[in a new window]
 
Table 1. Transcriptional upregulation in phase II detoxifying enzymes induced by tBHQ

 
Consistent with the microarray data (Fig. 3A), RT-PCR confirmed the increase in transcript levels of detoxifying enzymes (NQO1, GCLR, and HO1) (Fig. 3B). Western blot analysis for NQO1, GCLR, and HO1 protein also demonstrated a corresponding change in protein levels (Fig. 3C). As expected from the microarray and RT-PCR data, the protein levels of GCLC did not change with treatment. RT-PCR results for other genes, including neurofilament heavy subunit, HDD, TR, GR, ferritin heavy and light chain, were again consistent with the corresponding microarray data (data not show).

Increases were also observed in genes associated with neuronal growth and differentiation, such as neuronal olfactomedin-related ER localized protein, axin, ß2-syntrophin, secretogranin II, and musashi.

Nuclear transcription factors including c-Jun, c-Fos, Jun-B, Jun-D, Fra-1, Fra-2, ATF-3, ATF-4, NF-{kappa}B, and small Maf proteins (MafK and MafF), which have been reported to possibly influence the Nrf2-ARE interaction, did not change at the mRNA level after treatment with tBHQ. Of particular note, KIAA0132, the human homolog of mouse KEAP1, was increased. KEAP1 is a cytosolic chaperone of Nrf2. Exposure of cells to inducers disrupts the KEAP1-Nrf2 complex and allows Nrf2 to translocate into the nucleus, where it binds to the ARE and stimulates transcription (6). Other categories of genes induced by tBHQ are listed in the Supplemental Table.

SOM clustering.
The 101 genes mentioned above were included in the self-organizing maps (SOM) clustering. We used GENECLUSTER to analyze the data. Normalized hybridization intensities (ADC) for individual genes at time t were defined as (ADC of gene x at time t) - (mean ADC of gene x)/(standard deviation of ADC of gene x), to allow clustering to occur on the basis of the expression profiles rather than by absolute level. A range of two-dimensional matrices was examined with a rapid settling on a 4 x 3 SOM. Less than 12 nodes provided distinct patterns (wide variation between expression of individual genes within a cluster), and increasing beyond 12 nodes led to cluster duplication (nearly identical cluster patterns). Figure 4 shows the average normalized gene expression patterns for the genes contained within clusters 0–11. Most genes previously identified in the literature to be induced by tBHQ ("Source of Evidence" in Table 1) were grouped within different clusters, although some of them were grouped together as shown in Table 1. The expression profiles induced by a specific agent, therefore, were far more complicated than expected. In addition, these genes grouped into 12 distinct clusters with strikingly consistent patterns, yet no functional correlation existed among the clustered genes (see the Supplemental Table). For example, the genes grouped in cluster 7 are involved in detoxification and cellular antioxidant defense (NQO1, ferritin heavy chain, and HDD), cytoskeleton construction (dynein heavy polypeptide), immune response [HLA class-1 (HLA-A26) heavy chain], and energy metabolism (transketolase). These data suggest that multiple pathways and/or transcription factors must be involved in altering expression of each individual gene in response to tBHQ.



View larger version (27K):
[in this window]
[in a new window]
 
Fig. 4. Cluster analysis reveals gene expression wave in IMR-32 cells after treatment with tBHQ: cluster analysis of probe sets on human U95a arrays. Self-organizing maps (SOM) were used to group the 101 identified genes into clusters based on similar expression dynamics over the four time points. SOM algorithm grouped them into 12 discrete clusters. The label in the top left corner of each panel represents the cluster number (c0c11). The number at the top center of each panel represents the number of genes in that particular cluster. Each time point (4, 8, 24, and 48 h) is represented by black dots. Thick lines indicate the mean expression values, and thin lines indicate standard deviation.

 

    DISCUSSION
 TOP
 ABSTRACT
 INTRODUCTION
 MATERIALS AND METHODS
 RESULTS
 DISCUSSION
 References
 
The data presented here have identified and characterized the time-dependent changes in gene expression associated with tBHQ treatment in IMR-32 cells. This time-dependent study led to the discovery of early-response and late-response gene clusters. Our laboratory and others have confirmed that the induction of many detoxifying enzymes is dependent on the binding of Nrf2 to the ARE (6, 17, 26). The 5'-flanking regions of some genes revealed in this study were analyzed by MatInspector (web site http://transfac.gbf.de). Promoter regions from NQO1, GCLR, malate NADP oxidoreductase, HO1, Wnt5a, KIAA0132, ferritin heavy chain, and TR were available in the GenBank database. Putative Nrf2 binding site(s) were found in all these genes. The overall occurrence of Nrf2 sites in this set of promoters is more than expected randomly, suggesting that Nrf2 is likely to act as a common regulator of ARE-driven gene expression.

SOM, based on an unsupervised neural network algorithm, was applied to cluster and analyze gene expression patterns in this study. In contrast to the rigid structure of hierarchical clustering, the strong prior hypotheses used in Bayesian clustering, and the nonstructure of k-means clustering, SOMs are ideally suited to exploratory data analysis by allowing one to impose partial structure on the clusters, facilitating easy visualization and interpretation (3, 7). Most researchers believe that this analysis assigns genes to the single group or "cluster" that most closely shares a related expression pattern across specimens. They also believe that this approach has biological relevance, because coordinate regulation of groups of genes often signifies a role in common processes or pathways (9). In this study, SOM automatically and quickly extracted the gene expression profiles among the most prominent features of the data. The SOM results showed that some of the potential ARE-driven genes were clustered together based on their gene expression similarity. However, there are also examples of ARE-driven genes that did not co-cluster with other known ARE-related genes, suggesting that multiple pathway and/or transcription factors may be involved in controlling the expression level of each individual gene.

Oligonucleotide microarray data reflect the mRNA level of each individual clone directly, which may not only be influenced by transcriptional regulation, but also by the posttranscriptional regulation and mRNA stability. As we know, most protein kinases or transcription regulators, which play an important role in the signal transduction pathway, seldom are regulated at the mRNA level or the translational level. Their activation or inactivation is usually associated with protein-protein interaction often mediated by protein kinases and phosphatases. Changes in gene expression based on microarray analysis, therefore, are unable to tell us the signal transduction pathway(s) involved directly. However, we cannot deny the usefulness of microarray analysis, since researchers still gain some critical information by combining their microarray analysis with clustering and reference confirmation.

The advent of high-density oligonucleotide microarrays has greatly facilitated the ability to simultaneously examine the abundance of multiple mRNAs. Although oligonucleotide microarrays are powerful tools for profiling gene expression, the dynamic change and the large number of signals produced require efficient procedures for distinguishing false-positive results from changes in expression that are "real" (independently reproducible). There are two sources of randomly generated error associated with microarray analysis, namely, systematic and experimental error. The latter is a result of the experimental design. Most bioinformatic specialists ignore this type of error, which reflects, for example, the variations in treatment, cell density, and culture medium in cell lines; or tissue regions harvested (reproducibility of the dissection), genetic background, and diet in animal models. In contrast, the systematic error is generated during the analytical phase of microarray analysis. False-positive signals can be produced at multiple steps including probe array manufacture, preparation of cRNAs for microarray analysis, hybridization or washing steps, and global scaling and normalization of overall signal intensities between probe arrays. Therefore, the running of independent samples is necessary to establish a high degree of confidence in the data and will likely diminish both sources of error, whereas repeatedly running the same samples will decrease systematic error but amplify the experimental error. Our analysis does not ensure that all negative calls are truly negative. The percentage of false negatives is hard to quantify, but the confirmation rate of RT-PCR for selected positive calls after our analysis has been 100% (data not show).

Another issue of concern for the user is the several parameters presented by Affymetrix algorithms for detecting differential expression. Several previous studies have used arbitrary fold-change thresholds (typically 2- to 3-fold) to define significant expression change and stratify microarray results (5). This analysis maintains that an individual mRNA species has to equal or exceed a mandated level of fold-change between control and experimental RNA preparations before it is considered to have undergone a change that is likely to be significant. Although this approach is intuitively appealing, we could not find published reports in which its utility has been systematically evaluated. In the application of this technique, there are several limitations. Fold-change, as a ratio, is particularly vulnerable to artifacts produced by global scaling of probe array datasets, and assigning an arbitrary cutoff may mask biologically significant changes (20, 22). In this study, we give high emphasis to "difference call," which is generated based on the "decision matrix thresholds" Affymetrix provided: maximum [increase/total, decrease/total], increase/decrease ratio, log average ratio change, and Dpos-Dneg ratio (27). Each comparison metric is weighted and entered into a decision matrix to derive a difference call that indicates whether a transcript has increased (I), marginally increased (MI), decreased (D), marginally decreased (MD), or not changed (NC) in expression level. We have developed a ranking analysis of difference call as shown in the results section to determine the increased or decreased gene expression based on a rational cutoff value (±n2). These data make it absolutely clear that a minimum of three independent samples should be run to generate a reliable set of changed genes.

According to Affymetrix Suite 4.1, "average difference" (AD) serves as a relative indicator in the level of expression of a transcript. ADC is a parameter calculated from average difference of treatment (ADtreat) balanced by average difference of baseline (ADbase), i.e., ADC = ADtreat - ADbase. We used ADC for reproducibility analyses, because ADC represents the difference between control and treatment more accurately than fold change does in Affymetrix algorithms. The CV is a relative measure of the variation, since dividing by the mean directly accounts for the magnitude of the values. Large values of CV suggest that the data are quite variable and, therefore, inaccurate. Above all, we showed that our approach to noise filtration not only permits an experiment-wide assessment of overall data quality, but also allows the users to score and rank individual genes according to their likelihood of manifesting changes that are reproducible. These noise-filtering methods have also proven to be very useful in screening data gathered from primary cortical neuronal cultures, human neural stem cells, and dissected brain regions from transgenic mice overexpressing human amyloid precursor protein.

In conclusion, this study not only identified new genes induced by tBHQ, but also provided new insight into the complex pathways governing the regulation of antioxidant defense gene. Clearly, there is a significant variability in microarray data that may be contributed to the complexity in transcriptional regulation of gene expression. The noise-filtering methods presented here, however, make more efficient use of microarray data and increase the probability of generating a biological relevant dataset.


    ACKNOWLEDGMENTS
 
We thank Matthew Slattery and Molecular Biology Core Facility of the University of Wisconsin Molecular and Environmental Health Science Center for conducting the Affymetrix gene array hybridization. We also thank Thor Stein, Andrew Kraft, Delinda Johnson, and Jong-Min Lee for reading the manuscript and very helpful suggestions.

This study was supported by National Institute of Environmental Health Sciences Grants ES-08089 (to J. A. J), ES-10042 (to J. A. Johnson), and ES-09090 and by the Burroughs Wellcome New Investigator in Toxicological Sciences Award (to J. A. Johnson).


    FOOTNOTES
 
Article published online before print. See web site for date of publication (http://physiolgenomics.physiology.org).

Address for reprint requests and other correspondence: J. A. Johnson, School of Pharmacy, Univ. of Wisconsin, 6125 Rennebohm Hall, 777 Highland Ave, Madison, WI 53706 (E-mail: jajohnson{at}pharmacy.wisc.edu).

10.1152/physiolgenomics.00003.2002.

1 Supplementary material to this article is available online at http://physiolgenomics.physiology.org/cgi/content/full/9/3/137/DC1. Back


    References
 TOP
 ABSTRACT
 INTRODUCTION
 MATERIALS AND METHODS
 RESULTS
 DISCUSSION
 References
 

  1. Alam J, Stewart D, Touchard C, Boinapally S, Choi AM, and Cook JL. Nrf2, a Cap’n’Collar transcription factor, regulates induction of the heme oxygenase-1 gene. J Biol Chem 274: 26071–26078, 1999.[Abstract/Free Full Text]
  2. Budihardjo I, Oliver H, Lutter M, Luo X, and Wang X. Biochemical pathways of caspase activation during apoptosis. Annu Rev Cell Dev Biol 15: 269–290, 1999.[ISI][Medline]
  3. Butte AJ, Tamayo P, Slonim D, Golub TR, and Kohane IS. Discovering functional relationships between RNA expression and chemotherapeutic susceptibility using relevance networks. Proc Natl Acad Sci USA 97: 12182–12186, 2000.[Abstract/Free Full Text]
  4. Chan JY and Kwong M. Impaired expression of glutathione synthetic enzyme genes in mice with targeted deletion of the Nrf2 basic-leucine zipper protein. Biochim Biophys Acta 1517: 19–26, 2000.[ISI][Medline]
  5. Claverie JM. Computational methods for the identification of differential and coordinated gene expression. Hum Mol Genet 8: 1821–1832, 1999.[Abstract/Free Full Text]
  6. Dhakshinamoorthy S and Jaiswal AK. Functional characterization and role of INrf2 in antioxidant response element-mediated expression and antioxidant induction of NAD(P)H:quinone oxidoreductase 1 gene. Oncogene 20: 3906–3917, 2001.[ISI][Medline]
  7. Dieckgraefe BK, Stenson WF, Korzenik JR, Swanson PE, and Harrington CA. Analysis of mucosal gene expression in inflammatory bowel disease by parallel oligonucleotide arrays. Physiol Genomics 4: 1–11, 2000.[Abstract/Free Full Text]
  8. Eftekharpour E, Holmgren A, and Jurlink BH. Thioredoxin reductase and glutathione synthesis is upregulated by t-butylhydroquinone in cortical astrocytes but not in cortical neurons. Glia 31: 241–248, 2000.[ISI][Medline]
  9. Eisen MB, Spellman PT, Brown PO, and Botstein D. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA 95: 14863–14868, 1998.[Abstract/Free Full Text]
  10. Friling RS, Bensimon A, Tichauer Y, and Daniel V. Xenobiotic-inducible expression of murine glutathione S-transferase Ya subunit gene is controlled by an electrophile-responsive element. Proc Natl Acad Sci USA 87: 6258–6262, 1990.[Abstract]
  11. Hudson FN and Kavanagh TJ. Cloning and characterization of the proximal promoter region of the mouse glutamate-L-cysteine ligase regulatory subunit gene. Biochim Biophys Acta 1492: 447–451, 2000.[ISI][Medline]
  12. Itoh K, Ishii T, Wakabayashi N, and Yamamoto M. Regulatory mechanisms of cellular response to oxidative stress. Free Radic Res 31: 319–324, 1999.[ISI][Medline]
  13. Jaiswal AK. Regulation of genes encoding NAD(P)H:quinone oxidoreductases. Free Radic Biol Med 29: 254–262, 2000.[ISI][Medline]
  14. Jeyapaul J and Jaiswal AK. Nrf2 and c-Jun regulation of antioxidant response element (ARE)-mediated expression and induction of gamma-glutamylcysteine synthetase heavy subunit gene. Biochem Pharmacol 59: 1433–1439, 2000.[ISI][Medline]
  15. Kang KW, Cho MK, Lee CH, and Kim SG. Activation of phosphatidylinositol 3-kinase and Akt by tert-butylhydroquinone is responsible for antioxidant response element-mediated rGSTA2 induction in H4IIE cells. Mol Pharmacol 59: 1147–1156, 2001.[Abstract/Free Full Text]
  16. Kwak MK, Itoh K, Yamamoto M, Sutter TR, and Kensler TW. Role of transcription factor Nrf2 in the induction of hepatic phase 2 and antioxidative enzymes in vivo by the cancer chemoprotective agent, 3H-1, 2-dimethiole-3-thione. Mol Med 7: 135–145, 2000.[ISI]
  17. Lee JM, Hanson JM, Chu WA, and Johnson JA. Phosphatidylinositol 3-kinase, not extracellular signal-regulated kinase, regulates activation of the antioxidant-responsive element in IMR-32 human neuroblastoma cells. J Biol Chem 276: 20011–20016, 2001.[Abstract/Free Full Text]
  18. Lee JM, Moehlenkamp JD, Hanson JM, and Johnson JA. Nrf2-dependent activation of the antioxidant responsive element by tert-butylhydroquinone is independent of oxidative stress in IMR-32 human neuroblastoma cells. Biochem Biophys Res Commun 280: 286–292, 2001.[ISI][Medline]
  19. Li N, Venkatesan MI, Miguel A, Kaplan R, Gujuluva C, Alam J, and Nel A. Induction of heme oxygenase-1 expression in macrophages by diesel exhaust particle chemicals and quinones via the antioxidant-responsive element. J Immunol 165: 3393–3401, 2000.[Abstract/Free Full Text]
  20. Liu T, Lai H, Wu W, Chinn S, and Wang PH. Developing a strategy to define the effects of insulin-like growth factor-1 on gene expression profile in cardiomyocytes. Circ Res 88: 1231–1238, 2001.[Abstract/Free Full Text]
  21. McMahon M, Itoh K, Yamamoto M, Chanas SA, Henderson CJ, McLellan LI, Wolf CR, Cavin C, and Hayes JD. The Cap’n’Collar basic leucine zipper transcription factor Nrf2 (NF-E2 p45-related factor 2) controls both constitutive and inducible expression of intestinal detoxification and glutathione biosynthetic enzymes. Cancer Res 61: 3299–3307, 2001.[Abstract/Free Full Text]
  22. Mills JC and Gordon JI. A new approach for filtering noise from high-density oligonucleotide microarray datasets. Nucleic Acids Res 29: E72–2, 2001.[Medline]
  23. Murphy TH, Miyamoto M, Sastre A, Schnaar RL, and Coyle JT. Glutamate toxicity in a neuronal cell line involves inhibition of cystine transport leading to oxidative stress. Neuron 2: 1547–1558, 1989.[ISI][Medline]
  24. Orino K, Tsuji Y, Torti FM, and Torti SV. Adenovirus E1A blocks oxidant-dependent ferritin induction and sensitizes cells to pro-oxidant cytotoxicity. FEBS Lett 461: 334–338, 1999.[ISI][Medline]
  25. Paulson KE, Darnell JE Jr, Rushmore T, and Pickett CB. Analysis of the upstream elements of the xenobiotic compound-inducible and positionally regulated glutathione S-transferase Ya gene. Mol Cell Biol 10: 1841–1852, 1990.[ISI][Medline]
  26. Ramos-Gomez M, Kwak MK, Dolan PM, Itoh K, Yamamoto M, Talalay P, and Kensler TW. Sensitivity to carcinogenesis is increased and chemoprotective efficacy of enzyme inducers is lost in nrf2 transcription factor-deficient mice. Proc Natl Acad Sci USA 98: 3410–3415, 2001.[Abstract/Free Full Text]
  27. Schadt EE, Li C, Su C, and Wong WH. Analyzing high-density oligonucleotide gene expression array data. J Cell Biochem 80: 192–202, 2000.[ISI][Medline]
  28. Schulze A and Downward J. Navigating gene expression using microarrays: a technology review. Nat Cell Biol 3: E190–E195, 2001.[ISI][Medline]
  29. Talalay P. Chemoprotection against cancer by induction of phase 2 enzymes. Biofactors 12: 5–11, 2000.[ISI][Medline]
  30. Tamayo P, Slonim D, Mesirov J, Zhu Q, Kitareewan S, Dmitrovsky E, Lander ES, and Golub TR. Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation. Proc Natl Acad Sci USA 96: 2907–2912, 1999.[Abstract/Free Full Text]
  31. Wild AC and Mulcahy RT. Regulation of gamma-glutamylcysteine synthetase subunit gene expression: insights into transcriptional control of antioxidant defenses. Free Radic Res 32: 281–301, 2000.[ISI][Medline]