Dissecting tBHQ induced ARE-driven gene expression through long and short oligonucleotide arrays
Jiang Li1,
Maria L. Spletter1 and
Jeffrey A. Johnson1,2,3
1 School of Pharmacy, University of Wisconsin-Madison, Madison, Wisconsin
2 Environmental Toxicology Center, University of Wisconsin-Madison, Madison, Wisconsin
3 Waisman Center, University of Wisconsin-Madison, Madison, Wisconsin
ABSTRACT
This paper compares the gene expression profiles identified by short (Affymetrix U95Av2) and long (Agilent Hu1A) oligonucleotide arrays in a model of upregulation of a cluster of antioxidant responsive element-driven genes by treatment with tert-butylhydroquinone. MAS 5.0, dCHIP, and RMA were applied to normalize the Affymetrix data, and Lowess regression was applied to the Agilent data. SAM was used to identify differential gene expression. A set of biological markers and housekeeping genes was chosen to evaluate the performance of the normalization approaches. Both arrays identified a definite set of overlapping genes regardless of the data-mining tools used. However, unique gene expression profiles specific to each platform were also revealed and confirmed by quantitative RT-PCR. Further analysis of the data revealed by alternative approaches suggested that alternative splicing, multiple vs. single probe(s) measurement, and use or nonuse of mismatch probes may account for the discrepant data. Therefore, these two microarray technologies offer relatively reliable data. Integration of the gene expression profiles from different array platforms may not only aid cross-validation but also provide a more complete view of the transcriptional scenario.
microarray; cross-platform comparison; antioxidant responsive element; tert-butylhydroquinone
INTRODUCTION
THE RECENT POPULARITY of microarray technology can be attributed to its successful application to a wide range of biological areas including toxicity profiling, drug discovery and screening, clinical diagnosis, and outcome prediction. This revolutionary technology has brought a new outlook for the study of gene expression networks at the transcriptional level, providing a valuable insight into the molecular mechanisms underlying processes from embryonic development and organ formation, to progression of diseases, and to the effect of drugs. Despite being a relatively new technique, arrays exist in a variety of forms and can be classified based on a number of attributes, including length of target sequence (long cDNAs or oligonucleotides), commercial or custom made, glass or membrane based, and spotted or in situ synthesized (39). Because of an unprecedented marketing potential predicted in the near future, several major biotechnology companies have been competitively involved in developing their own patented microarray platforms integrated with their independent data-mining solutions.
Affymetrix (Santa Clara, CA) and Incyte Genomics (Palo Alto, CA) were the pioneer companies in commercial microarrays. Their technologies and products well represent the two initial array formats. Recent years have seen a shake-up in the microarray industry. In late 2001, Incyte Genomics stopped providing commercial microarrays. Incyte may be gone, but some heavy hitters, most notably Agilent Technologies (Palo Alto, CA), Amersham Biosciences (Piscataway, NJ), and Applied Biosystems (Foster City, CA), entered the marketplace in early 2002 and 2004. Agilent uses proprietary SurePrint ink-jet technology and offers human, mouse, and rat cDNA arrays based on Incyte's LifeSeq Gold databases, whereas the Affymetrix array uses public genomic databases. In a collaborative effort with Rosetta Inpharmatics (Kirkland, WA), Agilent launched a new, flexible platform, 60-mer oligonucleotide arrays, that embraces an in situ oligonucleotide synthesis method in which the ink-jet printing process is modified to accommodate delivery of phosphoramidites to directed locations on a glass surface. Using the flexibility offered by this system, Hughes et al. (20) characterized the importance of various experimental parameters to hybridization specificity and sensitivity. The company claims that its longer 60-mer probe affords between five and eight times more sensitivity than Affymetrix's 25-mer probes (see "Performance comparison of Agilent's 60-mer and 25-mer in situ synthesized oligonucleotide microarrays;" http://www.chem.agilent.com).
Flexibility in the creation of new arrays is becoming increasingly important as more refinement of the genomic sequences, more knowledge of alternative splicing, and more understanding of the limited cross-validation among different array platforms occur. Multiple microarray formats for measuring genome-wide gene expression levels are currently available. Increasingly accessible microarray platforms allow the unrestrained and rapid generation of a large volume of expression data sets. Therefore, whether data generated from multiple microarray formats are interchangeable, convertible, and comparative and to what extent we can rely on these data are important issues facing life scientists. In fact, several comparative studies have already been undertaken. However, the results show either high correlation (3, 32, 66) or low correlation (29, 44, 56) between the data generated from different array formats. There is also a lack of information on the comparative study between widely used Affymetrix short oligonucleotide arrays and recently developed Agilent long oligonucleotide arrays.
To address the concerns of cross-platform microarray studies as described above, researchers must select an appropriate experimental model. Toxicologists immediately recognized the impact that the microarray could have on the study of drug toxicity and rapidly embraced this technology as one of the bright futures of toxicological analysis; they termed it toxicogenomics. The genes encoding detoxification enzymes, and their transcriptional regulation, form one of the major gene clusters studied by toxicologists.
Examples of detoxification enzymes include NAD(P)H:quinone oxidoreductase (NQO1), epoxide hydrolases, glutathione S-transferases (GSTs), N-acetyltransferases, sulfotransferases, UDP-glucuronosyltransferases (UGTs), and other enzyme superfamilies (51). The ability of these enzymes to conjugate redox-cycling chemicals is an important protective mechanism against electrophiles and oxidative stress. The transcriptional activation of the detoxification enzymes and/or antioxidant genes by redox-cycling chemicals has been traced to a cis-acting element called the antioxidant responsive element (ARE) that regulates constitutive and/or inducible gene expression. ARE sequences have been detected in the promoter region of genes, including rat and mouse GST-Ya (6, 16, 53), rat GST-P (17), rat and human NQO1 (9, 23), murine heme oxygenase-1 (HO1) (1, 37), and murine ferritin heavy chain (50) as well as the murine γ-glutamylcysteine ligase catalytic (GCLC) (12, 48) and regulatory (GCLR) subunits (13, 46). The mechanisms that regulate phase II detoxification gene expression through ARE activation are under intense investigation. Several ARE-binding proteins have been proposed and/or identified. Nrf2, a member of the "cap 'n' collar" (CNC) basic leucine zipper transcription factor (bZIP) family, has been demonstrated to play a central role in gene expression (47). Using Affymetrix microarrays, we have published multiple data sets related to Nrf2-dependent ARE activation models, which include tert-butylhydroquinone (tBHQ)-induced ARE-driven gene expression in a neuroblastoma cell line (40, 41) and primary neuronal culture (33), Nrf2-dependent ARE-driven gene expression in primary mouse/rat neuronal and astrocytic cultures through adenovirus-mediated Nrf2 overexpression (27, 55), and Nrf2-dependent gene expression in mouse liver by comparison of Nrf2 knockout mice with wild-type littermates (42). Cross-comparison of the multiple data sets has allowed us to finally narrow the field down to 20 potential biological markers that appear to be upregulated by the Nrf2-ARE pathway. A majority of them have also been reported by other researchers to be involved in the Nrf2-ARE signaling pathway. Therefore, dissection of tBHQ-induced ARE-driven gene expression is a stringent and useful model to evaluate differential gene expression identified by different array formats. In this study, we analyzed identical RNA preparations using two commercially available high-density microarray platforms.
MATERIALS AND METHODS
Tissue Collection and Cell Culture
Human fetal cortical tissue (between 8 and 13 wk postconception) was collected after routine terminations of pregnancy. All fetal tissue was kindly provided to the laboratory of Dr. Clive Svendsen while at the Medical Research Council Centre for Brain Repair, University of Cambridge, by Dr. Eric Jauniaux, Department of Obstetrics and Gynaecology, University College, London, UK. Full ethical approval was granted by the Local Research Ethics Committee, University College Hospital, London, UK. The methods of collection conform with the arrangements recommended by the Polkinghorne Committee and National Institutes of Health for the collection of such tissues and with the guidelines set out by the United Kingdom Department of Health as well as the University of Wisconsin (IRB reference no. 2001-288). The human neural stem cell line CTX066 was established by Dr. Clive Svendsen's laboratory at the University of Cambridge on July 19, 2000. The cell line was a gift to our laboratory in May 2001. CTX066 cells were passaged every 14 days by sectioning into 150-µm sections that then were reseeded into fresh growth medium containing 70% DMEM, 30% Ham's F-12, 1% penicillin-streptomycin-amphotericin B supplemented with 20 ng/ml EGF, 10 ng/ml leukemia inhibitory factor, and 1% (vol/vol) N2 at a density equivalent to 200,000 cells/ml. One-half the growth medium was replenished every fourth day (64).
Cells were treated with vehicle (ethanol, 0.01%) or tBHQ (20 µM) for 24 h. According to the cell viability assay (MTS) and ARE-luciferase assay, this concentration of tBHQ induces significant ARE activation with no observable toxicity in this culture model.
Total RNA Isolation and Quantitative Measurement
Human total RNA was isolated with the use of TRIzol (Life Technologies). The quality of total RNA and fragmented cRNA was assessed on a 2100 Bioanalyzer (Agilent Technologies), using the RNA 6000 LabChip kit. The ratio of 260/280 absorbance was also measured by UV spectrophotometry. Five micrograms of high-quality total RNA (as shown in Supplemental Fig. S1; available at the Physiological Genomics web site) were used as a template for the following cRNA synthesis for both array platforms.
Target Preparation and Array Hybridization
Both in situ-synthesized long oligonucleotide arrays (Hu1A) from Agilent Technologies and short oligonucleotide arrays (U95Av2) from Affymetrix were used to measure differential gene expression. Some technical parameters related to the array formats are briefly described in Table 1. Target preparation and array hybridization were conducted according to the manufacturers' protocols.
Feature Extraction and Data Mining
Affymetrix Microarray Suite 5.0 (MAS 5.0) was used to scan and analyze the relative abundance of each gene from the intensity signal value. Total chip intensities were scaled to an average intensity of 2,500, with manufacturer-defined parameters. Analysis parameters used by the software were set to values corresponding to moderate stringency (SDT = 30, SRT = 1.5). Output from the microarray analysis was merged with the Unigene or GenBank descriptor and stored as an Excel data spreadsheet. Significantly changed genes were determined using the Wilcoxon signed rank test for each comparison. Probe sets with P values <0.0025 were called Increased/Decreased, probe sets with P values in the range 0.0025 < P < 0.003 were called Marginally Increased/Decreased, and the remaining probe sets were called No Change. An additional level of ranking was used to incorporate multiple comparisons such that No Change = 0, Marginal Increase/Decrease = 1/−1, and Increase/Decrease = 2/−2. The final rank equaled the sum of the ranks from the three comparisons, and the value varied from −6 to 6 for a three-paired comparison. The coefficient of variation (CV) and mean (SD) for the average fold change were also calculated.
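The ranking scheme described above can be sketched in a few lines of Python. This is an illustrative sketch only: the call labels, the `CALL_RANK` table, and the helper names are assumptions introduced here and are not part of MAS 5.0. It shows how three per-comparison calls collapse into a final rank between −6 and 6, and how the mean, SD, and CV of the fold change might be computed.

```python
from statistics import mean, stdev

# Hypothetical mapping from a change call to its rank, following the
# scheme in the text (No Change = 0, Marginal = +/-1, Increase/Decrease = +/-2).
CALL_RANK = {
    "Increase": 2,
    "Marginal Increase": 1,
    "No Change": 0,
    "Marginal Decrease": -1,
    "Decrease": -2,
}


def final_rank(calls):
    """Sum the per-comparison ranks for one probe set."""
    return sum(CALL_RANK[call] for call in calls)


def fold_change_summary(fold_changes):
    """Return (mean, SD, CV) of the fold changes from the paired comparisons."""
    m = mean(fold_changes)
    sd = stdev(fold_changes)  # sample standard deviation
    return m, sd, sd / m


# Three paired comparisons for one made-up probe set.
print(final_rank(["Increase", "Marginal Increase", "No Change"]))
print(fold_change_summary([2.0, 2.5, 3.0]))
```

A probe set called Increased in all three comparisons reaches the maximum rank of 6, and one called Decreased in all three reaches −6.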
Other normalization approaches also were applied to Affymetrix data. Through close examination of signal intensity from perfect match (PM) and mismatch (MM), several publications reported that MM probes interfere with fold change calculation and also presented cases where genes were mistakenly classified as absent because of high MM signal (26, 35, 67). Because MM signal may overcorrect PM signal, and this may not represent the real MM as hypothesized, PM only-based signal intensity normalization and calculation have been recommended.
The analysis methods considered in this paper include 1) algorithms built around an intensity-modeling approach implemented in dCHIP v1.3 (http://www.dchip.org) (36) and 2) expression values calculated using the robust multi-array average (RMA) method (4, 21), in which normalization is done by quantile normalization. Software for RMA is available for download at http://www.bioconductor.org for use in the R package for statistical computing (http://www.r-project.org).
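To make the quantile normalization step concrete, the following is a minimal pure-Python sketch. It is not the Bioconductor implementation: the function name and the toy values are assumptions, ties are broken by position rather than averaged, and the background correction and summarization steps of RMA are omitted. Each array's values are replaced by the across-array mean of the values at the same rank, so that all arrays end up with an identical intensity distribution.

```python
def quantile_normalize(arrays):
    """Quantile-normalize a list of equal-length intensity lists (one per array)."""
    n_probes = len(arrays[0])
    n_arrays = len(arrays)

    # Mean intensity at each rank, taken across the sorted arrays.
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [
        sum(s[rank] for s in sorted_arrays) / n_arrays for rank in range(n_probes)
    ]

    normalized = []
    for a in arrays:
        # Probe indices ordered from lowest to highest intensity on this array.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = rank_means[rank]
        normalized.append(out)
    return normalized


# Two made-up arrays with three probes each.
print(quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
```

Because every array is forced onto the same distribution, this step removes array-to-array intensity differences by construction.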
For the Agilent arrays, the default settings of the Agilent G2566AA Feature Extraction software were used; these apply the Lowess (locally weighted linear regression curve fit) normalization method after local background subtraction (see "Robust local normalization of gene expression microarray data;" http://www.chem.agilent.com). Specifically, the method performs a large number of local regressions in overlapping windows along the length of the data and then joins the regressions together to form a smooth curve. Lowess regression has been shown to significantly reduce the systematic bias contributed by dual dye-labeling systems. Unlike dCHIP and RMA, which normalize across arrays, it is a within-array normalization. Data flagged as having poor quality by the Agilent extraction software were removed from the analysis. Output from the Agilent analysis was merged with the Unigene or GenBank descriptor and stored as an Excel data spreadsheet.
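The idea of joining many local regressions into a smooth curve can be illustrated with the sketch below. It is an assumption-laden simplification and not the algorithm used by the Agilent Feature Extraction software: the function names, the `frac` window parameter, and the tricube weighting are choices made here for illustration, and the robustness iterations of full Lowess are left out. For each feature, a weighted straight line is fitted to its nearest neighbors in average log intensity (A), and the fitted value is subtracted from the log ratio (M).

```python
import math


def lowess_fit(a, m, frac=0.4):
    """Fit M as a smooth function of A with tricube-weighted local lines."""
    n = len(a)
    k = max(2, math.ceil(frac * n))  # number of neighbors in each window
    fitted = []
    for x0 in a:
        distances = sorted(abs(x - x0) for x in a)
        h = distances[k - 1] or 1e-12  # window half-width
        weights = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in a]
        sw = sum(weights)
        mean_a = sum(w * x for w, x in zip(weights, a)) / sw
        mean_m = sum(w * y for w, y in zip(weights, m)) / sw
        sxx = sum(w * (x - mean_a) ** 2 for w, x in zip(weights, a))
        sxy = sum(w * (x - mean_a) * (y - mean_m) for w, x, y in zip(weights, a, m))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(mean_m + slope * (x0 - mean_a))
    return fitted


def lowess_normalize(a, m, frac=0.4):
    """Subtract the intensity-dependent trend from each log ratio."""
    return [mi - fi for mi, fi in zip(m, lowess_fit(a, m, frac))]


# Made-up data with a purely intensity-dependent dye bias.
a_values = [float(i) for i in range(1, 11)]
m_values = [0.5 * x + 1.0 for x in a_values]
print(lowess_normalize(a_values, m_values, frac=0.5))
```

In this toy case the bias is exactly linear in A, so the normalized log ratios are all close to zero. The sketch runs in quadratic time and is meant for explanation, not for full-size arrays.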
Signal Log Ratio Versus Average Log Intensity Plot
Normalized data were visually displayed by plotting a signal log ratio (M) vs. the average log intensity (A), namely, MvA plot.
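As a brief illustration of the quantities plotted, the sketch below computes M and A for paired intensities. The function name and the use of base-2 logarithms are assumptions made here; the text does not state the log base.

```python
import math


def mva(control, treated):
    """Return (M, A) pairs: M = log2(T) - log2(C), A = mean of log2(T) and log2(C)."""
    points = []
    for c, t in zip(control, treated):
        log_c = math.log2(c)
        log_t = math.log2(t)
        points.append((log_t - log_c, 0.5 * (log_t + log_c)))
    return points


# A made-up feature with a fourfold increase.
print(mva([4.0], [16.0]))  # → [(2.0, 3.0)]
```

Unchanged genes fall near M = 0 across the range of A, which is why intensity-dependent bias shows up as curvature or widening of the cloud in such a plot.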
Cross-Platform Comparisons of Expression Data
Each microarray platform reported the GenBank ID of the sequence interrogated by each of the probes or probe sets on the array. These GenBank IDs were then converted to their corresponding UniGene ID. RESOURCERER, developed by The Institute for Genomic Research (TIGR), is a high-throughput web-based database for the annotation and comparison of commonly available microarray resources. It allows comparisons between resources from the same species, using either the TIGR Gene Indices or UniGene, and between species, using the TIGR EGO database (http://pga.tigr.org). The relationships between expression measurements made with different array types were assessed by Pearson correlation coefficient.
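The matching and correlation steps can be sketched as follows. This is an illustrative sketch, not the RESOURCERER workflow: the function names and the UniGene-style identifiers are hypothetical, and each platform is assumed to have already been reduced to one expression value per UniGene ID.

```python
import math


def paired_by_unigene(affy, agilent):
    """Return paired value lists for UniGene IDs present on both platforms."""
    shared = sorted(set(affy) & set(agilent))
    return [affy[g] for g in shared], [agilent[g] for g in shared]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up log ratios keyed by hypothetical UniGene IDs.
affy = {"Hs.1": 1.0, "Hs.2": 2.0, "Hs.3": 3.0, "Hs.9": 5.0}
agilent = {"Hs.1": 2.0, "Hs.2": 4.0, "Hs.3": 6.0, "Hs.7": 1.0}
x, y = paired_by_unigene(affy, agilent)
print(pearson_r(x, y))
```

Only identifiers found on both arrays contribute to the coefficient, which mirrors restricting the comparison to the shared UniGene clusters.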
Detection of Differential Gene Expression
Tusher et al. (58) proposed a permutation-based algorithm called significance analysis of microarrays (SAM), which controls the false discovery rate (FDR) of differential gene lists. SAM uses a statistic based on the Student's t-test and performs data permutations to estimate the FDR. The software is free for download at http://stat.stanford.edu/~tibs/SAM.
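The logic of a t-like statistic combined with label permutations can be sketched as below. This is a deliberately simplified illustration and not the SAM software: it assumes a two-class unpaired design, uses a fixed cutoff on the absolute score in place of SAM's Δ-based procedure, and the function names and the default `s0` value are assumptions. The score is the difference in group means divided by the pooled standard error plus a small constant `s0`; the FDR is estimated as the average number of genes passing the cutoff under permuted labels divided by the number passing with the real labels.

```python
import math
from itertools import combinations


def relative_difference(control, treated, s0=0.1):
    """t-like score: mean difference over (pooled standard error + s0)."""
    n1, n2 = len(control), len(treated)
    m1 = sum(control) / n1
    m2 = sum(treated) / n2
    ss = sum((x - m1) ** 2 for x in control) + sum((x - m2) ** 2 for x in treated)
    s = math.sqrt((1 / n1 + 1 / n2) * ss / (n1 + n2 - 2))
    return (m2 - m1) / (s + s0)


def estimate_fdr(genes, n_control, cutoff, s0=0.1):
    """Return (number of genes called, estimated FDR) at a fixed |score| cutoff.

    Each row of `genes` lists the control values first, then the treated values.
    """
    n_total = len(genes[0])
    called = sum(
        abs(relative_difference(g[:n_control], g[n_control:], s0)) >= cutoff
        for g in genes
    )
    if called == 0:
        return 0, 0.0

    # Enumerate every way of relabeling the samples as control or treated.
    false_counts = []
    for control_idx in combinations(range(n_total), n_control):
        treated_idx = [i for i in range(n_total) if i not in control_idx]
        count = 0
        for g in genes:
            c = [g[i] for i in control_idx]
            t = [g[i] for i in treated_idx]
            if abs(relative_difference(c, t, s0)) >= cutoff:
                count += 1
        false_counts.append(count)
    expected_false = sum(false_counts) / len(false_counts)
    return called, min(1.0, expected_false / called)


# One made-up gene measured on three control and three treated arrays.
print(estimate_fdr([[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]], n_control=3, cutoff=3.0, s0=0.0))
```

With three arrays per condition there are only 20 possible relabelings, so the permutation-based FDR estimate is necessarily coarse.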
Data Deposition
Raw data from this study are available from the Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo) and are listed under the following accession numbers: GSE768 (for the entire experimental series), GSE759 (for the neural stem cells), GSE761 (for the neuroblastoma cells), GPL91 (for Affymetrix U95Av2 arrays), and GPL553 (for Agilent Hu1A arrays).
Real-Time RT-PCR
Total RNA used for microarray analysis also was used for quantitative (q)RT-PCR. Double-strand cDNA was produced from 4 µg of each RNA sample using an Ambion RT-catalyzed T7 oligo(dT) primer (MessageAmp aRNA kit, Ambion). The cDNAs used for real-time PCR were not the same as those used for the two microarray experiments. PCR primers specific for the genes of interest were used for amplification (as shown in Supplemental Table S1). PCR products with a single band in agarose gel electrophoresis were purified by QIAquick PCR Purification kit (Qiagen), and quantitative analysis was performed by UV spectrometer. The LightCycler (Roche) was used for quantitative PCR. A serial dilution of the purified PCR product was performed to generate a standard curve for the specific gene of interest. A quantity of 0.11 µl of each cDNA pool was used for each PCR in the presence of SYBR Green, according to the manufacturer's instructions (http://biochem.roche.com/prodinfo_fst.htm?/lightcycler/WhatIS.htm). Data are presented as means ± SE (n = 3) of the ratio between the concentrations of a specific template from vehicle- and tBHQ-treated samples. A one-tailed, paired t-test was also performed on the concentrations between tBHQ- and vehicle-treated groups, and P < 0.05 was considered significant.
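The summary statistics reported for qRT-PCR can be sketched as follows. The function names and the example concentrations are hypothetical, the ratio is assumed to be tBHQ-treated over vehicle, and only the paired t statistic is computed; converting it to a one-tailed P value requires the t distribution with n − 1 degrees of freedom, which is not included here.

```python
import math
from statistics import mean, stdev


def ratio_mean_se(vehicle, treated):
    """Mean and standard error of the treated/vehicle concentration ratios."""
    ratios = [t / v for v, t in zip(vehicle, treated)]
    return mean(ratios), stdev(ratios) / math.sqrt(len(ratios))


def paired_t_statistic(vehicle, treated):
    """Paired t statistic on the concentration differences (treated - vehicle)."""
    diffs = [t - v for v, t in zip(vehicle, treated)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Made-up template concentrations from three paired samples.
vehicle = [1.0, 2.0, 4.0]
treated = [2.0, 4.0, 8.0]
print(ratio_mean_se(vehicle, treated))
print(paired_t_statistic(vehicle, treated))
```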
RESULTS
Both in situ-synthesized long oligonucleotide arrays (Hu1A, Agilent Technologies) and short oligonucleotide arrays (U95Av2, Affymetrix) were used to measure differential gene expression in RNA samples generated from human neural stem cells (CTX066) treated with vehicle (ethanol, 0.01%) or tBHQ (20 µM) for 24 h. The same RNA samples were used for the comparison study. Three arrays per condition were used for each platform. The samples were harvested in three different cell passages instead of pooling samples or repeating the same sample. Provided that the sample labeling of mRNA was carried out separately from different extractions, this approach led us as close to independent experimental results as was feasible in this context (7, 65). According to RESOURCERER 6.0, there were 9,556 UniGene clusters that were represented in both the Affymetrix U95Av2 array [~12,000 clones and/or expressed sequence tags (ESTs)] and the Agilent Hu1A array (~17,000 clones and/or ESTs). Therefore, comparisons were limited to the 9,556 unique clusters.
Data Normalization
We assembled a list of 20 candidate genes based on the multiple data sets generated in our lab (Table 2). All of these genes had been previously implicated in the tBHQ-induced ARE activation pathway. Because these genes also were individually described as upregulated by tBHQ and/or through the Nrf2-ARE pathway, our main focus in this study was on transcriptional upregulation-related gene expression profiles. We also monitored 23 housekeeping genes, the gene expression levels of which remained constant. These are classified as cytoskeletal proteins, ribosomal proteins, or metabolic enzymes and have previously been used as controls in several publications (Supplemental Table S2; Refs. 8, 34, 59).
Multiple statistical model-based data normalization approaches have been proposed for the Affymetrix platform (53). However, a "gold standard" method has not yet been defined. Thus, method selection should be motivated by the application at hand and the goals of the data analysis.
As shown in Fig. 1, when compared with other normalization approaches, MAS 5.0 analysis of both intergroup and intragroup pairings showed the largest variation across the whole data set, except for that of the RMA intergroup analysis. This is especially obvious in the scatterplots, which form a so-called "funnel shape" at low signal intensities. Here, the relationship between control and treatment groups was regarded as an intergroup relationship, and the relationship among the three biological replicates within a group was considered an intragroup relationship. The dCHIP model produced a better intergroup or intragroup correlation than the MAS scaling procedure, as shown in Fig. 1B. Figure 1C illustrates the good data distribution from intracontrol groups or intratreatment groups after quantile normalization by RMA. However, scatterplots for intergroups demonstrated a huge data variation across the whole spectrum of signal intensity values. This indicates that there was essentially no correlation between controls and treatments.

Fig. 1. Comparison of log ratio measurement between the two samples after different data normalization. Expression measures (intensity value) for the two samples are plotted against each other. These two samples could be intragroup [controls (C) or treatments (T)] or intergroup (treatment vs. control). The straight-line fit with the Pearson correlation coefficient and the corresponding R² are shown. A: data from Microarray Suite version 5.0 (MAS5) scaling. B: data from dCHIP [perfect match (PM)-only model]. C: data from robust multi-array average (RMA; quantile normalization). D: data from Agilent (Lowess regression). The relative intensity values, not the ratios, were used to present Agilent data. E: summary of the Pearson correlation coefficient. One-tailed, pair-matched t-test was performed. MM, mismatch.
As shown in Fig. 1E, a one-tailed, paired t-test on the Pearson correlation coefficients (R²) indicated that the averaged R² values from the intracontrol group (0.693, 0.852) and the intratreatment group (0.669, 0.810) from MAS 5.0 and dCHIP (PM only) are very close to each other (P > 0.05), indicating that these two procedures did not affect the global similarity between samples. dCHIP (PM only) provided a higher correlation than dCHIP (PM, MM) in terms of intragroup correlations and gave a better fit than MAS 5.0 scaling in terms of inter- or intragroup correlation. The quantile normalization from RMA significantly increased the intragroup correlation (control, 0.913; treatment, 0.896) and also dropped the intergroup correlation (0.067) dramatically, implying that this approach amplified the difference between the control and treatment groups but decreased the variation (bias) generated from biological replicates. Therefore, RMA seems to be a more effective way than MAS 5.0 or dCHIP (PM only) to normalize the Affymetrix data in terms of biostatistical significance between control and treatment groups.
One representative set of pair-matched comparison data was displayed as MvA plots, as shown in Fig. 2, using multiple data normalization tools. Comparison of the MvA plots generated from raw Affymetrix data normalized by MAS 5.0, dCHIP, and RMA showed that the low average log intensity data had greater variation in the MAS 5.0-generated MvA plot. The tightness of an MvA plot generated after dCHIP (PM only) or RMA normalization only suggests that the low-level background noise is consistent, not that probe data in that range are accurate or sensitive for measuring changes between arrays. Normalized data have been modified to reduce error on the basis of different statistical hypotheses and algorithms, and they may or may not reflect the biological relevance of the data sets. Therefore, we highlighted 20 known differentially expressed genes and 23 consistently expressed housekeeping genes (described in Supplemental Table S2) in the MvA plots. The biological markers selected here ranged from genes with low basal intensity (HO1, NQO1) to those with high basal intensity (GCLR, ferritin). The average signal intensity for most of the housekeeping genes is relatively high.

Fig. 2. Signal log ratio vs. average log intensity (MvA) plots showing log fold change as a function of average log expression level. Mean log intensity is plotted against log fold change for each normalization approach. Housekeeping genes and biological markers are highlighted as blue and red dots, respectively. The rest of the genes are denoted with black. For MAS- and RMA-normalized graphs, ribosomal proteins identified by RMA-significance analysis of microarrays (SAM) analysis are highlighted as blue. A: MvA plot for Affymetrix (Affy) data after MAS scaling. B: MvA plot for Affy data after dCHIP (PM only) normalization. C: MvA plot for Affy data after RMA (quantile) normalization. D: MvA plot for Agilent data after Lowess regression normalization.
All three MvA plots from MAS 5.0 (Fig. 2A), dCHIP (Fig. 2B), and RMA (Fig. 2C) showed a similar distribution pattern for the selected markers. Most biological markers were far above the baseline cloud formed by the majority of unchanged genes, while the housekeeping genes tightly resided in this region. These data indicate that, although dCHIP and RMA showed tighter scatter plots across the data set than did MAS 5.0, it was difficult to draw a conclusion on which normalization approach was better from a biological point of view.
For the Agilent array system, the most commonly used normalization method is called Loess (Lowess) regression. Figures 1D and 2D show the performance of this Lowess normalization on Agilent data. Correlations among the inter- and intragroups were similar to those of dCHIP or RMA (intragroup only). MvA scatter plots showed that Lowess regression analysis provided a nice plot across the whole span of the average intensity values.
SAM on Differential Expression Identification
Because we had established a set of biological markers and housekeeping genes, we interrogated SAM on how many biological markers could be identified as significant positives and how many housekeeping genes could be excluded as significant positives. Before doing this, we needed to estimate the magnitude of gene expression in the tBHQ-induced ARE activation model. Assuming 500 genes or 1,000 genes showed differential gene expression, how many significant positives could we get from SAM after different normalization approaches, and what was the corresponding FDR? Figure 3 summarizes the results from SAM analysis. There were similar numbers of positive genes identified by RMA-SAM (158) and MAS 5.0 (122). Interestingly, only 17 positive genes could be identified by dCHIP-SAM. However, the majority of them were biological markers or had significant biological relevance. When comparing the different array formats, it was noted that Agilent arrays identified most of the biological markers (18/20, 90%). For the Affymetrix format, MAS 5.0 worked better than other approaches, identifying 17 of 20 biological markers. In contrast, the RMA-SAM strategy identified the fewest markers (5/20, 25%) and also had the most errors (8/23, 34.8%). RT-PCR for the five identified housekeeping genes indicated that they were false positives (data not shown). There were also a large number (26) of ribosomal proteins considered as upregulated after RMA-SAM (data not shown). It appears that RMA-SAM analysis introduced a large number of false positives and false negatives into the final report. Hence, we believe the RMA-SAM strategy was not an appropriate way to identify the differential gene expression. Although the dCHIP-SAM plot showed a very rigid curve and the number of differential genes expressed did not correlate with the change of the Δ value, it did identify 50% of the biological markers within 17 discovered positive genes. We propose that dCHIP-SAM might help the researcher get a panel of true biological markers in unknown biological models. However, false negatives due to the data normalization could not be neglected.
Receiver operator characteristic (ROC) curves provide a graphical representation of the trade-off between sensitivity and specificity. In signal detection theory, a ROC is created by plotting the true positive (TP) rate (sensitivity) against the false positive (FP) rate (1 − specificity) obtained as the discrimination threshold is varied. Proposed numbers of differential gene expression changes were considered as the dynamic threshold. We ranked the genes on the basis of the absolute score generated by SAM. Because the 20 biological markers were actually differentially expressed in these experiments and the 23 housekeeping genes were considered unchanged across the treatments, it was easy to determine the TP rate (the number of biological markers identified/20) and FP rate (the number of housekeeping genes misidentified/23). The best possible prediction method would yield a graph that had data points in the top left corner of the ROC space (Fig. 4). The closer the curve followed the left-hand border and then the top border of the ROC space, the more accurate the test. The closer the curve came to the 45-degree diagonal of the ROC space, the less accurate the test. Hence, in terms of Affymetrix data, RMA gave the worst prediction, whereas MAS 5.0 and dCHIP (PM) showed similar accuracy, which was slightly better than dCHIP (PM, MM). Agilent-SAM also provided better prediction than RMA-SAM but not as good as MAS 5.0-SAM or dCHIP-SAM.
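The construction of these ROC points can be sketched in Python as follows. The function name and the gene labels are hypothetical; the sketch assumes the genes have already been ranked by decreasing absolute SAM score, so that the point recorded after the first k genes corresponds to a threshold of k proposed differential genes.

```python
def roc_points(ranked_genes, markers, housekeeping):
    """Return (FP rate, TP rate) after each gene in a ranked list.

    TP rate = markers identified / total markers.
    FP rate = housekeeping genes misidentified / total housekeeping genes.
    Genes in neither reference set do not move the curve.
    """
    points = [(0.0, 0.0)]
    tp = 0
    fp = 0
    for gene in ranked_genes:
        if gene in markers:
            tp += 1
        elif gene in housekeeping:
            fp += 1
        points.append((fp / len(housekeeping), tp / len(markers)))
    return points


# Made-up ranking with two markers, one housekeeping gene, and one other gene.
ranked = ["marker1", "other", "hk1", "marker2"]
print(roc_points(ranked, {"marker1", "marker2"}, {"hk1"}))
```

In the study itself the denominators would be the 20 biological markers and the 23 housekeeping genes.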

Fig. 4. Receiver operator characteristic (ROC) curve. A ROC was created by plotting the true positive (TP) rate (sensitivity) against false positive (FP) rate (1 − specificity) obtained as its discrimination threshold was varied. Proposed numbers of differential gene expression changes were considered as the dynamic threshold. We ranked the genes based on the absolute score generated by SAM after various data normalization [for Affymetrix data, MAS 5.0, dCHIP (PM), dCHIP (PM, MM), and RMA; for Agilent data, Lowess regression]. Because the 20 biological markers were actually differentially expressed in these experiments and the 23 housekeeping genes were considered unchanged across the treatments, it is easy to determine the TP rate (the number of biological markers identified/20) and FP rate (the number of housekeeping genes misidentified/23).
Finally, we chose the MAS 5.0-SAM and Agilent-SAM approaches to determine how many overlapping genes were present. In this study, two levels of estimation were set up based on the dynamic change of the Δ value or FDR value. If the number of differentially expressed genes was set to 475, 16.0% of genes were identified by both array formats; however, only 8 genes overlapped. If we assumed 920, 15.8% of genes were on both arrays, with just 22 genes overlapping. These data suggest that only a limited number of gene expression profiles were captured by both Affymetrix and Agilent arrays and that each format generated its own gene list.
Exemplifying the Cause of Discrepant Data Generated by Affymetrix and Agilent Arrays
We analyzed by qRT-PCR the gene expression changes of a total of 25 genes that increased either in both platforms or in one and not the other. We noticed that the introduced biological markers were mostly confirmed by qRT-PCR, as shown in Table 3, and some new markers revealed by both array systems were also verified, including ATF3, asparagine synthetase, Gem-GTPase, and uridine phosphorylase. As one of the potential Nrf2-binding partners, ATF3 has been shown to be involved in Nrf2-dependent ARE activation (15). The expression of asparagine synthetase, which catalyzes ATP-dependent conversion of aspartate to asparagine using an amine group from glutamine or ammonia, is induced upon amino acid and glucose deprivation, and induction increases cancer cell resistance to chemotherapy. Gem-GTPase, a gene expressed in mitogen-stimulated cells that binds GTP, calmodulin, microtubules, and the kinesin-like protein KIF9, may be involved in cell shape control, neuron differentiation, and calcium channel regulation. Uridine phosphorylase is involved in nucleoside metabolism. Therefore, integration of the gene expression data generated from both array systems may quickly narrow the field down to 22 markers that can be cross-validated with strong biological relevance. Although the number is limited, it gives us high confidence to further interrogate whether the gene expression changes identified by only one of the platforms are real and can be verified by qRT-PCR. The issues related to alternative splicing, site of probe construction, probe set vs. single measurements, and use or nonuse of MM probes have been proposed to be responsible for the discrepant data.
Alternative splicing.
Alternative splicing is proposed to account for the genetic complexity in mammalian transcriptomes. In humans, 60% of genes undergo alternative splicing (31, 61). The Affymetrix microarray selects 16–20 adjacent 25-mer probes from near the 3'-end to represent a single gene on the array. A discontinuous cumulative effect from the tiled probes indicates that potential alternative splicing may exist in the region interrogated. Compared with the single, long oligonucleotide probe used by the Agilent array, those tiled probes covering 500 bp of the cDNA sequence presumably have more chances to identify splicing isoforms. Here we give an example of identification of an increased splicing isoform by the multiple-probe array (Affymetrix) but not the single-probe array (Agilent). It should be noted that the procedures and microarrays used for this experiment were not optimized to identify or monitor alternative splicing. Agilent actually selected several probes that targeted different locations in a gene, and only the single probe with the best performance (supposed to give the most accurate fold change across the tissues) was finally synthesized on the arrays.
The flavin-containing monooxygenases (FMOs) are a family of enzymes that catalyze the oxygenation of numerous nitrogen-, sulfur-, phosphorus-, and other nucleophilic heteroatom-containing chemicals and xenobiotics (5, 68). FMOs show overlapping substrate specificity among the six family members. All FMOs are the result of alternative splicing. The boundaries and sizes of these coding exons are highly conserved among family members and across animal species. Expression of FMOs is tissue and species dependent (18). For example, FMO3 is the most abundant isoform in adult human liver (43), whereas in adult human kidney and fetal human brain, FMO1 is the most prominent form (10). Transcriptional regulation of FMO expression has not been well characterized. A recent report through genetic comparisons of the gene expression of detoxification enzymes in Nrf2 knockout vs. wild-type mice by Affymetrix arrays showed that there was a significant decrease in FMO (consensus sequence) and FMO3 in liver samples from knockout mice (42). Because tBHQ has been proven to stabilize Nrf2 by inhibition of the ubiquitin-proteasome pathway (unpublished data), we proposed that FMO could be a potential Nrf2-driven gene. In the present study, the data generated from the Affymetrix array showed a substantial increase in FMO1, but not in other FMOs, after human neural stem cells were treated with tBHQ for 24 h. The gene expression changes for FMO1 were confirmed by qRT-PCR, using a sequence-specific set of primers for the unique 3'-UTR (BLAST search confirmed). Indeed, adenovirus-mediated overexpression of Nrf2 increased FMO1 at the mRNA level (data not shown). However, the data generated from the Agilent array showed no changes in FMO1 expression. By comparing the locations of the probe(s) selected by both array formats, we found that the Affymetrix probe set covered not only exon 9 but also the unique 3'-UTR region for FMO1 (Fig. 5Aa), whereas the Agilent probe for FMO1 covered only 60 bp in exon 9 that were also shared by FMO4 (BLAST search confirmed). A 20-bp sequence from this 60-mer probe demonstrated 100% homology with Jagged 2, which is highly expressed in neural stem cells. In many cases, in terms of specificity, the probe was designed within the 3'-UTRs because they share less homology with each other than is the case with the coding regions between genes (see "The Design and Annotation of the Applied Biosystems Human Genome Survey Microarray;" http://www.appliedbiosystems.com). Only the probes covering the 3'-UTR region showed a dramatic difference in fluorescence intensity between vehicle- and tBHQ-treated samples, as shown in Fig. 5Ab. One set of primers specifically designed for the 3'-UTR showed significant changes by qRT-PCR (Table 3), suggesting that, in some cases, multiple probes have advantages over a single probe in detecting alternative splicing. In addition, on the basis of the thermodynamics and kinetics of nucleotide hybridization, precision and accuracy are affected when oligonucleotide hybridizations are not long enough to reach probe/target on-off equilibrium.
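The kind of probe specificity check described above (a 20-bp stretch of a 60-mer probe matching an unrelated transcript) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: it looks for exact, same-strand matches of length k, it is not a substitute for a BLAST search, and the function name and the toy sequences are hypothetical rather than the actual FMO1, FMO4, or Jagged 2 sequences.

```python
def shared_kmers(probe, transcript, k=20):
    """Return 0-based start positions in `probe` of every k-mer that also
    occurs exactly (same strand) in `transcript`."""
    probe, transcript = probe.upper(), transcript.upper()
    return [
        start
        for start in range(len(probe) - k + 1)
        if probe[start:start + k] in transcript
    ]


# Toy sequences and a short k, for illustration only.
hits = shared_kmers("ACGTACGTTTGACC", "GGGTTTGACCAAA", k=6)
```

Any non-empty result flags a stretch of the probe that could cross-hybridize with the second transcript and would warrant a closer look with a proper alignment tool.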

Fig. 5. Exemplifying the cause of data discrepancy generated by Affymetrix and Agilent arrays. The sequence of probe set (Affymetrix) or probe (Agilent) selected by both array formats for a specific gene was aligned with the referred full-length cDNA sequence. Fluorescence intensity of PM and MM for each probe within a probe set was plotted against the probe number. Data were gathered from one representative pair of samples treated with tert-butylhydroquinone (tBHQ; T) or vehicle (V).
Multiple probes vs. single probes.
As we described earlier, multiple probes may have advantages over single probes in identifying alternative splice variants. If there is a certain level of RNA degradation, cross-hybridization, and/or poor hybridization occurring just in the region from which the sole probe was selected, the result will be skewed. Therefore, using results from multiple probes would provide an overall effect that is sometimes better than results from single probes in determining the gene expression changes, if the single probe does not perform well enough to represent the interrogated gene. For example, RNA polymerase III, subunit 32, an abundant subunit of RNA polymerase III and part of a subcomplex important for the initiation of transcription, was identified as an upregulated gene by the Affymetrix array, and this upregulation was confirmed by qRT-PCR using a set of primers specifically targeting the coding region. However, the single probe selected for the Agilent array was not able to identify this change. The probe construction sites for both arrays were aligned with the full cDNA sequence, as shown in Fig. 5Ba. Although Agilent claimed that their probes were selected from the 3'-end of the sequence, this one was an exception. Given that poor hybridization, RNA degradation from the 5'-end, and/or lower labeling efficiency (RT-in vitro transcription labeling has a higher efficiency in incorporating labels into cRNA and cDNA at the 3'-end than at the 5'-end) occurred in that region, there is no way to identify the changes based on this single probe. Indeed, the Agilent probe, a 60-mer oligonucleotide in the 5'-end of the coding region, showed a relatively low basal gene expression in both vehicle- (1,008.99, 515.986, and 462.643) and tBHQ-treated samples (794.876, 473.95, and 433.343) compared with the average signal intensity throughout the whole array (5,764.91 ± 178.34). Figure 5Bb shows the fluorescence intensity of PM and MM for each probe of RNA polymerase III.
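The comparison of the Agilent probe signal with the array-wide average can be reproduced with simple arithmetic. The sketch below uses only the intensities quoted above; the variable names are ours, and taking the plain mean of the three replicate values is an assumption made for illustration.

```python
from statistics import mean

# Agilent intensities for the RNA polymerase III probe, as quoted in the text.
vehicle = [1008.99, 515.986, 462.643]
treated = [794.876, 473.95, 433.343]
array_average = 5764.91  # average signal intensity across the whole array

vehicle_mean = mean(vehicle)
treated_mean = mean(treated)

# How weak the probe signal is relative to the array-wide average.
fraction_of_array_average = vehicle_mean / array_average
# Treated-to-vehicle ratio seen by this single probe.
treated_to_vehicle = treated_mean / vehicle_mean
```

With these values the probe signal is roughly 11% of the array-wide average and the treated-to-vehicle ratio is below 1, which is consistent with the single probe failing to report the increase seen by the Affymetrix probe set and qRT-PCR.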
Another example comes from a well-characterized Nrf2-driven stress-response gene, sequestosome 1, also called A170 in mice. Electrophilic agents such as diethylmaleate and tBHQ were reported to induce the gene expression of sequestosome 1 (22). As shown in the Affymetrix probe set, 13 of 16 probes representing this gene showed a dramatic difference (Fig. 6Ab). However, the Agilent probe showed no change, with extremely high fluorescence intensity in both vehicle (33,933.2, 17,726.9, and 13,422.5) and treatment samples (29,401.7, 23,554.4, and 19,142) compared with the averaged intensity value of the whole array. In addition, TR1 (vehicle, 21.01 ± 7.13 nM; tBHQ, 65.23 ± 9.13 nM), which had a similar basal expression level to sequestosome 1 (vehicle, 22.83 ± 10.21 nM; tBHQ, 52.29 ± 18.28 nM) according to qRT-PCR, had significantly lower intensity values than did sequestosome 1 (320.8, 404.3, and 574.3 for vehicles; 2,273.6, 1,473.6, and 2,744.5 for treatments). These data suggest that cross-hybridization perhaps occurred when selecting probe(s) from this region. qRT-PCR, using the primers specific to the coding region, confirmed the increased gene expression of sequestosome 1 after tBHQ treatment (Table 3). Once again, the site of probe construction is important in determining gene expression level, and the assessment of hybridization behavior from multiple probes does provide more comprehensive data than that from a single probe in some cases.
TR1 is a good example for the multiple-probe approach. As shown in Fig. 7A, 15 of 16 probes showed the same trend except for probe 4 when comparing the levels between vehicle and treatment groups. However, in many cases, the 16 probes within a probe set did not perform similarly. For instance, upregulation of NPTX1, a neuronal-specific protein involved in synaptic uptake of extracellular material in response to xenobiotic-induced toxicity, was identified only by the Agilent array and confirmed by qRT-PCR (Table 3). Through an analysis of the hybridization performance of the 16 probes, we found that there were many (10/16) bad probes, including probe 2, probe 3 (high MM), probe 5, probes 7–11, probe 15, and probe 16 (high MM), selected to represent NPTX1 (Fig. 6Bb). After integration of the results from all 16 probes, the real difference, reflected by probes 1, 4, 6, and 12–14, was disguised, suggesting that the site of probe construction is crucial to determination of the gene expression changes. Actually, the sequences for probes 12–15 matching with the single probe from Agilent showed similar results, implying that there was increased expression of NPTX1 after tBHQ treatment (Fig. 6Ba). Thus this region should only be considered as a site for probe construction.
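The probe-by-probe inspection described above can be approximated by a simple tally of which probes in a probe set report an increase. The following Python sketch is a hedged illustration only: the function name, the ratio cutoff `min_ratio`, and the toy intensities are hypothetical, and the sketch compares PM signals from a single vehicle/treatment pair rather than reproducing the analysis behind Figs. 6 and 7.

```python
def classify_probes(pm_vehicle, pm_treated, min_ratio=1.5):
    """Split the probes of one probe set into those whose PM signal rises by
    at least `min_ratio` after treatment and those that do not.
    Probe numbers are 1-based, as in the figures."""
    increased, unchanged = [], []
    for number, (v, t) in enumerate(zip(pm_vehicle, pm_treated), start=1):
        (increased if t >= min_ratio * v else unchanged).append(number)
    return increased, unchanged


# Toy PM intensities for a six-probe set, for illustration only.
up, flat = classify_probes([100, 100, 100, 100, 100, 100],
                           [200, 100, 110, 300, 90, 160])
```

A probe set in which only a minority of probes land in the first list is a candidate for the situation described for NPTX1, where a real change seen by a few well-behaved probes is masked once all probes are summarized together.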

Fig. 7. Exemplifying the use of MM correction to minimize the extent of gene expression changes between vehicle and tBHQ treatments. Fluorescence intensity of PM and MM for each probe within a probe set was plotted against the probe number. Data were gathered from one representative pair of samples treated with tBHQ or vehicle. Thioredoxin reductase-1 (TR1), carbonyl reductase-1 (CR1), stromal cell-derived factor 1 (STROM), and dual-specificity phosphatase-1 (DSP1) were selected as examples.
Use or nonuse of MM probes.
In some cases, a single mutation may not substantially influence the hybridization efficiency. Introducing two or more mutations may start to substantially change the hybridization behavior. Therefore, MM probes with a single mutation in the middle of the probe may not represent cross-hybridization or background signal well, as is assumed by Affymetrix MAS 5.0. Carbonyl reductase-1 (CR1), a detoxification enzyme, is an example. CR1 is a broad-specificity, NADPH-dependent oxidoreductase that catalyzes the reduction of aromatic ketones, quinones, prostaglandins, and ketone-containing drugs; decreased expression may correlate with reduced survival in certain cancers. CR1 is also regulated through the Nrf2-ARE pathway (30, 57). Data from Affymetrix arrays always showed no changes based on the detection call from MAS 5.0 or significance testing by SAM. As shown in Fig. 7B, the gene expression changes induced by tBHQ were significantly diminished by MM-corrected probes 1, 5, 6, 8, and 9. Although qRT-PCR and the Agilent arrays confirmed the changes (Table 3), this difference may not be reflected by MAS 5.0 when considering MM signals to exclude the data from the probes with significant differences. Similar results were observed for other genes such as dual-specificity phosphatase-1 (Fig. 7C), stromal cell-derived factor 1 (Fig. 7D), and adenylyl cyclase type 8 (data not shown), which were identified to have differential gene expression by Agilent arrays but not by Affymetrix arrays. By ignoring the MM and only taking the signal intensity values from the PM into consideration, those changes are statistically significant when using a parametric or nonparametric test (P < 0.05). qRT-PCR also confirmed those changes (Table 3). Some of the genes identified solely by Agilent do make biological sense. For instance, dual-specificity phosphatase-1, a stress factor-induced, nonmembrane-spanning protein phosphatase that inactivates members of the mitogen-activated protein kinase family, is involved in chemotaxis and possibly heat shock and oxidative stress responses. Stromal cell-derived factor 1, an α-chemokine that acts through the G protein-coupled receptor CXCR4, stimulates leukocyte adhesion, migration, and chemotaxis. Adenylyl cyclase type 8, an ATP-pyrophosphate lyase that converts ATP to cAMP, the activity of which is stimulated by calcium/calmodulin, may play a role in neuronal processes associated with alcoholism and Alzheimer's disease.
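To make the effect of MM correction concrete, the following Python sketch contrasts a PM-only summary with a plain PM minus MM subtraction for one probe set. It is a deliberately simplified illustration: MAS 5.0 does not use a plain subtraction of this kind, and the function name and the toy intensities are hypothetical, not data from this study.

```python
from statistics import mean


def probe_set_change(pm_vehicle, mm_vehicle, pm_treated, mm_treated):
    """Return (PM-only change, MM-subtracted change) in mean probe-set signal
    between treated and vehicle samples."""
    pm_only = mean(pm_treated) - mean(pm_vehicle)
    corrected_vehicle = mean(pm - mm for pm, mm in zip(pm_vehicle, mm_vehicle))
    corrected_treated = mean(pm - mm for pm, mm in zip(pm_treated, mm_treated))
    return pm_only, corrected_treated - corrected_vehicle


# Toy case in which the MM signal rises together with the PM signal.
pm_only, corrected = probe_set_change(
    pm_vehicle=[100, 120, 110], mm_vehicle=[40, 50, 45],
    pm_treated=[300, 330, 310], mm_treated=[230, 260, 240],
)
```

In this toy case the PM-only increase of about 203 intensity units shrinks to 5 units after subtraction, because the MM probes, which differ from the PM probes by a single base, pick up much of the same target signal.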
 |
DISCUSSION
|
---|
A better experimental design for cross-platform comparison should have both technical replicates and biological replicates to make a stronger conclusion. For the Affymetrix platform, it would be ideal to have 3 passages of 2 different treatments and 3 arrays per sample, resulting in a total of 18 arrays. For the Agilent platforms, a dye-swap design is necessary. There are many published articles that discuss the approaches to two-color microarray experimental design. Many complex designs have been proposed, but four of the simplest comparisons are recommended. They include direct comparisons, balanced block designs, reference designs, and loop designs (52). For the control vs. treatment, a pair-matched comparison is made (e.g., 1v1, 2v2, 3v3 = 3 x 2 arrays). For the intragroup comparisons, all pairwise combinations need to be completed (1v2, 2v3, 1v3 = 6 x 2 arrays) or a pooling scheme is needed to generate a common reference for both the controls and treatments (reference designs). If the RNA sample is limiting, balanced block design, as we have utilized, must be an option. However, we have to acknowledge that any gene biased by preferential labeling with one dye may result in false positives and negatives or may not appear to significantly distinguish the classes.
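The array counts implied by these designs can be enumerated directly. The Python sketch below is one reading of the scheme, under the assumption that the "x 2" in the text refers to running each pairing in both dye orientations (dye swap); the sample labels (C1 to C3 for controls, T1 to T3 for treatments) and helper names are hypothetical.

```python
from itertools import combinations

passages = [1, 2, 3]

# Single-channel (Affymetrix) design: 3 passages x 2 treatments x 3 arrays.
affymetrix_arrays = len(passages) * 2 * 3

# Two-color design: pair-matched control vs. treatment comparisons.
matched_pairs = [(f"C{p}", f"T{p}") for p in passages]

# Intragroup comparisons: all pairwise combinations within each group.
intragroup_pairs = [
    pair
    for group in ("C", "T")
    for pair in combinations([f"{group}{p}" for p in passages], 2)
]


def with_dye_swap(pairs):
    """Each pairing is hybridized twice, with the two dyes exchanged."""
    return list(pairs) + [(b, a) for a, b in pairs]


matched_arrays = with_dye_swap(matched_pairs)
intragroup_arrays = with_dye_swap(intragroup_pairs)
```

Under these assumptions the counts are 18 single-channel arrays, 6 two-color arrays for the matched comparisons, and 12 for the intragroup comparisons, which agrees with the totals given above.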
According to our results, the reproducibility of data was highly dependent on the data normalization approaches used. As shown in Fig. 1E, dCHIP (PM only)-normalized Affymetrix data showed the highest intergroup reproducibility that was similar to Lowess-normalized Agilent data. However, in terms of intragroup reproducibility, RMA showed the highest reproducibility, even higher than for Agilent data (P < 0.05). Intragroup data generated from MAS 5.0 showed less reproducibility than dCHIP, RMA, and Agilent data. However, less reproducibility is not always equivalent to less accuracy. RMA provided a higher intergroup correlation coefficient than dCHIP and MAS 5.0 scaling while significantly decreasing the intragroup correlation. However, RMA-SAM analysis resulted in a large number of false positives to the final report. A recent paper tried to explain why RMA works better in some microarray experiments, like spike-in samples, while generating so many false positives in others (54). The increased sensitivity of probe set algorithms that ignore the MM signal, such as RMA, would be expected to come at an increased cost of noise, where the quality of low level signals defined by RMA were "noisy" projects that could lead to data interpretations of poor integrity. Specifically, detection of spike-in controls would be expected to be independent of confounding noise within arrays and projects. However, the increased sensitivity of some probe set algorithms would be expected to lead to a high proportion of false positives in projects where there was a relatively high level of unwanted noise.
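One common way to express reproducibility of this kind is the average pairwise correlation between arrays, computed separately for array pairs within the same treatment group and for pairs from different groups. The Python sketch below illustrates that calculation only; it assumes Pearson correlation on already-normalized signals, which may differ from the exact measure behind Fig. 1E, and the function names, array labels, and toy values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length signal vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_pairwise_correlation(arrays, groups):
    """arrays: array name -> signals; groups: array name -> group label.
    Returns (mean within-group r, mean between-group r)."""
    within, between = [], []
    for a, b in combinations(sorted(arrays), 2):
        r = pearson(arrays[a], arrays[b])
        (within if groups[a] == groups[b] else between).append(r)
    return sum(within) / len(within), sum(between) / len(between)


# Toy signals for two vehicle (V) and two treated (T) arrays.
arrays = {"V1": [1, 2, 3, 4], "V2": [2, 4, 6, 8],
          "T1": [4, 3, 2, 1], "T2": [8, 6, 4, 2]}
groups = {"V1": "V", "V2": "V", "T1": "T", "T2": "T"}
within_r, between_r = mean_pairwise_correlation(arrays, groups)
```

Comparing the two averages for each normalization method gives a quick, if crude, view of how much a method tightens replicate agreement relative to the separation between treatment groups.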
Further analysis of the data revealed by alternative approaches suggested that alternative splicing, multiple vs. single probe(s) measurement, and use or nonuse of MM probes may account for the discrepant data. Other factors resulting in the discrepant data may include probes selected from the wrong or partially accurate cDNA sequence. Jarvinen et al. (24) compared microarray data generated from the Affymetrix U95Av2 and Agilent cDNA arrays and found that more than one-half of the discrepant cases that they investigated could be explained by incorrect clones after sequencing. However, when comparing the sequence-matched probes from these two platforms, Mecham et al. (45) showed significantly improved cross-platform consistency in terms of gene expression ratios and different calls. Although we exemplified several issues perhaps related to the discrepancy derived from Affymetrix and Agilent arrays, it is possible that multiple factors may simultaneously be responsible for the discrepancy.
On the basis of this study, the data interchangeability across different microarray formats is limited. However, through cross-platform comparisons, researchers could validate positive genes generated from different formats and quickly narrow down the real biological markers in unknown biological models. For this reason, both Affymetrix (14) and Agilent (60) platforms provided very successful examples of clinical diagnosis and outcome prediction. Through combination with clustering analysis, those finally settled biomarkers could classify an unknown sample into different categories (49). In this case, the variation in gene discovery across different microarray platforms seems not to be a big issue. Unfortunately, for most array-based initial global screening experiments, missing the important messages could result in incomplete or incorrect conclusions. By merging the data sets, the investigator may generate a more sophisticated gene expression pattern and provide a wide view of transcriptional networks. This combination will not be realized as a better option for presenting and analyzing microarray data until more cross-platform studies have been carried out and reported using performance figures with predictive algorithms such as logistic regression and discriminant function analyses.
Although the methodologies described in this study are the most widely used, we should realize that none of them is the gold standard. Microarray bioinformatics is a booming area, and many new solutions are constantly appearing and being updated. The next generation of strategies for microarray data analysis should be a combination of multiple data normalization approaches with various significance tests. Multiple formats of microarray technology will exist for years, and the microarray industry will experience continuous reconstruction. Just like computer formats (personal computer or Macintosh), no single array format could constantly dominate the market. Improvement of the existing microarray formats on multiple levels may be more urgent than simply trying to make a bigger array covering more uncertified genes. A number of factors, including variabilities generated from multiple steps of microarray technology, often lead to results that fail validation by other techniques such as qRT-PCR. The multiple levels we refer to include probe (probe sets) selection, quality controls, industrial standards for array manufacture, general limits of fluorescent microarray technology, dye bias, data deposition, and data mining tools. We hope that the data generated from microarray technology will become more accurate in the future. Currently, RNA standards are being developed that hopefully will help us identify probes that are inaccurate (see "External RNA controls for monitoring performance of Codelink whole genome bioarrays;" http://www.amershambiosciences.com). With these standards, researchers will be able to create algorithms that precisely identify the conversion constants that will convert intensity value directly to mRNA copy number. 
Until that time, we must understand the limitations of the systems used and realize that we may be catching only a few fish out of the school with the hope that these few fish will be key indicators for the overall function of the school.
It should also be noted that both Affymetrix and Agilent are continually updating and refining their array designs, and future experiments on newer arrays may also yield some different results. Hence, researchers who are interested in a particular field are encouraged to have custom designs of either array format. A stronger conclusion on cross-platform performance will be made based on those custom arrays.
 |
GRANTS
|
---|
This study was supported by National Institute of Environmental Health Sciences Grants ES-08089 and ES-10042 (both to J. A. Johnson).
 |
ACKNOWLEDGMENTS
|
---|
We thank Affymetrix and Agilent for support of this comparative study and on-site training. We thank Dr. Clive Svendsen (Waisman Center, Univ. of Wisconsin at Madison) for the human neural stem cells and Dr. Douglas Bates (Dept. of Statistics, Univ. of Wisconsin at Madison) for helpful suggestions on Bioconductor. We thank Dorothy Nesbit for technical support on LightCycler. We also thank Delinda Johnson and Marcus Calkins for contributing experiences in microarray data analysis.
 |
FOOTNOTES
|
---|
Article published online before print. See web site for date of publication (http://physiolgenomics.physiology.org).
Address for reprint requests and other correspondence: J. A. Johnson, Univ. of Wisconsin-Madison, School of Pharmacy, 6125 Rennebohm Hall, 777 Highland Ave., Madison, WI 53705-2222 (E-mail: jajohnson{at}pharmacy.wisc.edu).
10.1152/physiolgenomics.00214.2004.
1 The Supplemental Material for this article (Supplemental Figure S1 and Supplemental Tables S1 and S2) is available online at http://physiolgenomics.physiology.org/cgi/content/full/00214.2004/DC1. 
 |
REFERENCES
|
---|
- Alam J, Stewart D, Touchard C, Boinapally S, Choi AM, and Cook JL. Nrf2, a Cap'n'Collar transcription factor, regulates induction of the heme oxygenase-1 gene. J Biol Chem 274: 26071–26078, 1999.
- Aono J, Yanagawa T, Itoh K, Li B, Yoshida H, Kumagai Y, Yamamoto M, and Ishii T. Activation of Nrf2 and accumulation of ubiquitinated A170 by arsenic in osteoblasts. Biochem Biophys Res Commun 305: 271–277, 2003.
- Barczak A, Rodriguez MW, Hanspers K, Koth LL, Tai YC, Bolstad BM, Speed TP, and Erle DJ. Spotted long oligonucleotide arrays for human gene expression analysis. Genome Res 13: 1775–1785, 2003.
- Bolstad BM, Irizarry RA, Astrand M, and Speed TP. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 19: 185–193, 2003.
- Cashman JR. Structural and catalytic properties of the mammalian flavin-containing monooxygenase. Chem Res Toxicol 8: 166–181, 1995.
- Chan JY and Kwong M. Impaired expression of glutathione synthetic enzyme genes in mice with targeted deletion of the Nrf2 basic-leucine zipper protein. Biochim Biophys Acta 1517: 19–26, 2000.
- Churchill GA. Fundamentals of experimental design for cDNA microarrays. Nat Genet 32, Suppl: 490–495, 2002.
- de Longueville F, Surry D, Meneses-Lorente G, Bertholet V, Talbot V, Evrard S, Chandelier N, Pike A, Worboys P, Rasson JP, Le Bourdelles B, and Remacle J. Gene expression profiling of drug metabolism and toxicology markers using a low-density DNA microarray. Biochem Pharmacol 64: 137–149, 2002.
- Dhakshinamoorthy S and Jaiswal AK. Functional characterization and role of INrf2 in antioxidant response element-mediated expression and antioxidant induction of NAD(P)H:quinone oxidoreductase 1 gene. Oncogene 20: 3906–3917, 2001.
- Dolphin CT, Cullingford TE, Shephard EA, Smith RL, and Phillips IR. Differential developmental and tissue-specific regulation of expression of the genes encoding three members of the flavin-containing monooxygenase family of man, FMO1, FMO3 and FM04. Eur J Biochem 235: 683–689, 1996.
- Eftekharpour E, Holmgren A, and Juurlink BH. Thioredoxin reductase and glutathione synthesis is upregulated by t-butylhydroquinone in cortical astrocytes but not in cortical neurons. Glia 31: 241–248, 2000.
- Galloway DC, Blake DG, Shepherd AG, and McLellan LI. Regulation of human gamma-glutamylcysteine synthetase: co-ordinate induction of the catalytic and regulatory subunits in HepG2 cells. Biochem J 328: 99–104, 1997.
- Galloway DC and McLellan LI. Inducible expression of the gamma-glutamylcysteine synthetase light subunit by t-butylhydroquinone in HepG2 cells is not dependent on an antioxidant-responsive element. Biochem J 336: 535–539, 1998.
- Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H, Loh ML, Downing JR, Caligiuri MA, Bloomfield CD, and Lander ES. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286: 531–537, 1999.
- Gong P, Stewart D, Hu B, Vinson C, and Alam J. Multiple basic-leucine zipper proteins regulate induction of the mouse heme oxygenase-1 gene by arsenite. Arch Biochem Biophys 405: 265–274, 2002.
- Hayes JD, McLeod R, Ellis EM, Pulford DJ, Ireland LS, McLellan LI, Judah DJ, Manson MM, and Neal GE. Regulation of glutathione S-transferases and aldehyde reductase by chemoprotectors: studies of mechanisms responsible for inducible resistance to aflatoxin B1. IARC Sci Publ 139: 175–187, 1996.
- Hayes JD and Pulford DJ. The glutathione S-transferase supergene family: regulation of GST and the contribution of the isoenzymes to cancer chemoprotection and drug resistance. Crit Rev Biochem Mol Biol 30: 445–600, 1995.
- Hines RN, Cashman JR, Philpot RM, Williams DE, and Ziegler DM. The mammalian flavin-containing monooxygenases: molecular characterization and regulation of expression. Toxicol Appl Pharmacol 125: 1–6, 1994.
- Hudson FN and Kavanagh TJ. Cloning and characterization of the proximal promoter region of the mouse glutamate-L-cysteine ligase regulatory subunit gene. Biochim Biophys Acta 1492: 447–451, 2000.
- Hughes TR, Mao M, Jones AR, Burchard J, Marton MJ, Shannon KW, Lefkowitz SM, Ziman M, Schelter JM, Meyer MR, Kobayashi S, Davis C, Dai H, He YD, Stephaniants SB, Cavet G, Walker WL, West A, Coffey E, Shoemaker DD, Stoughton R, Blanchard AP, Friend SH, and Linsley PS. Expression profiling using microarrays fabricated by an ink-jet oligonucleotide synthesizer. Nat Biotechnol 19: 342–347, 2001.
- Irizarry RA, Bolstad BM, Collin F, Cope LM, Hobbs B, and Speed TP. Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res 31: e15, 2003.
- Ishii T, Itoh K, Takahashi S, Sato H, Yanagawa T, Katoh Y, Bannai S, and Yamamoto M. Transcription factor Nrf2 coordinately regulates a group of oxidative stress-inducible genes in macrophages. J Biol Chem 275: 16023–16029, 2000.
- Jaiswal AK. Regulation of genes encoding NAD(P)H:quinone oxidoreductases. Free Radic Biol Med 29: 254–262, 2000.
- Jarvinen AK, Hautaniemi S, Edgren H, Auvinen P, Saarela J, Kallioniemi OP, and Monni O. Are data from different gene expression microarray platforms comparable? Genomics 83: 1164–1168, 2004.
- Kondo M, Shibata T, Kumagai T, Osawa T, Shibata N, Kobayashi M, Sasaki S, Iwata M, Noguchi N, and Uchida K. 15-Deoxy-Delta(12,14)-prostaglandin J(2): the endogenous electrophile that induces neuronal apoptosis. Proc Natl Acad Sci USA 99: 7367–7372, 2002.
- Kothapalli R, Yoder SJ, Mane S, and Loughran TP Jr. Microarray results: how accurate are they? BMC Bioinformatics 3: 22, 2002.
- Kraft AD, Johnson DA, and Johnson JA. Nuclear factor E2-related factor 2-dependent antioxidant response element activation by tert-butylhydroquinone and sulforaphane occurring preferentially in astrocytes conditions neurons against oxidative insult. J Neurosci 24: 1101–1112, 2004.
- Kubera C, Gavin I, and Huberman E. Isolation of two unknown genes potentially involved in differentiation of the hematopoietic pathway, and studies of spermidine/spermine acetyltransferase regulation. US Dept Energy J Undergrad Res 2: 34–39, 2002.
- Kuo WP, Jenssen TK, Butte AJ, Ohno-Machado L, and Kohane IS. Analysis of matched mRNA measurements from two different microarray technologies. Bioinformatics 18: 405–412, 2002.
- Kwak MK, Wakabayashi N, Itoh K, Motohashi H, Yamamoto M, and Kensler TW. Modulation of gene expression by cancer chemopreventive dithiolethiones through the Keap1-Nrf2 pathway. Identification of novel gene clusters for cell survival. J Biol Chem 278: 8135–8145, 2003.
- Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, et al. Initial sequencing and analysis of the human genome. Nature 409: 860–921, 2001.
- Lee JK, Bussey KJ, Gwadry FG, Reinhold W, Riddick G, Pelletier SL, Nishizuka S, Szakacs G, Annereau JP, Shankavaram U, Lababidi S, Smith LH, Gottesman MM, and Weinstein JN. Comparing cDNA and oligonucleotide array data: concordance of gene expression across platforms for the NCI-60 cancer cells. Genome Biol 4: R82.1–R82.12, 2003.
- Lee JM, Shih AY, Murphy TH, and Johnson JA. NF-E2-related factor-2 mediates neuroprotection against mitochondrial complex I inhibitors and increased concentrations of intracellular calcium in primary cortical neurons. J Biol Chem 278: 37948–37956, 2003.
- Lee PD, Sladek R, Greenwood CM, and Hudson TJ. Control genes and variability: absence of ubiquitous reference transcripts in diverse mammalian expression studies. Genome Res 12: 292–297, 2002.
- Li C and Hung Wong W. Model-based analysis of oligonucleotide arrays: model validation, design issues and standard error application. Genome Biol 2, August 2001 [ePub 2001 August 3].
- Li C and Wong WH. Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection. Proc Natl Acad Sci USA 98: 31–36, 2001.
- Li N, Venkatesan MI, Miguel A, Kaplan R, Gujuluva C, Alam J, and Nel A. Induction of heme oxygenase-1 expression in macrophages by diesel exhaust particle chemicals and quinones via the antioxidant-responsive element. J Immunol 165: 33933401, 2000.[Abstract/Free Full Text]
- Li J and Johnson JA. Time-dependent changes in ARE-driven gene expression by use of a noise-filtering process for microarray data. Physiol Genomics 9: 137144, 2002.[Abstract/Free Full Text]
- Li J and Johnson JA. Comparative studies using cDNA vs. oligonucleotide arrays. In: An Introduction to Toxicogenomics, edited by Burczynski ME. Boca Raton, FL: CRC, 2003, p. 1727.
- Li J, Lee JM, and Johnson JA. Microarray analysis reveals an antioxidant responsive element-driven gene set involved in conferring protection from an oxidative stress-induced apoptosis in IMR-32 cells. J Biol Chem 277: 388394, 2002.[Abstract/Free Full Text]
- Li J, Pankratz M, and Johnson JA. Differential gene expression patterns revealed by oligonucleotide versus long cDNA arrays. Toxicol Sci 69: 383390, 2002.[Abstract/Free Full Text]
- Li J, Stein TD, and Johnson JA. Genetic dissection of systemic autoimmune disease in Nrf2-deficient mice. Physiol Genomics 18: 261272, 2004.[Abstract/Free Full Text]
- Lomri N, Gu Q, and Cashman JR. Molecular cloning of the flavin-containing monooxygenase (form II) cDNA from adult human liver. Proc Natl Acad Sci USA 89: 16851689, 2003.[CrossRef]
- Mah N, Thelin A, Lu T, Nikolaus S, Kuehbacher T, Gurbuz Y, Eickhoff H, Kloeppel G, Lehrach H, Mellgard B, Costello CM, and Schreiber S. A comparison of oligonucleotide and cDNA-based microarray systems. Physiol Genomics 16: 361–370, 2003.
- Mecham BH, Klus GT, Strovel J, Augustus M, Byrne D, Bozso P, Wetmore DZ, Mariani TJ, Kohane IS, and Szallasi Z. Sequence-matched probes produce increased cross-platform consistency and more reproducible biological results in microarray-based gene expression measurements. Nucleic Acids Res 32: e74, 2004.
- Moinova HR and Mulcahy RT. Up-regulation of the human gamma-glutamylcysteine synthetase regulatory subunit gene involves binding of Nrf-2 to an electrophile responsive element. Biochem Biophys Res Commun 261: 661–668, 1999.
- Motohashi H, O'Connor T, Katsuoka F, Engel JD, and Yamamoto M. Integration and diversity of the regulatory network composed of Maf and CNC families of transcription factors. Gene 294: 1–12, 2002.
- Mulcahy RT and Gipp JJ. Identification of a putative antioxidant response element in the 5'-flanking region of the human gamma-glutamylcysteine synthetase heavy subunit gene. Biochem Biophys Res Commun 209: 227–233, 1995.
- Ochs MF and Godwin AK. Microarrays in cancer: research and applications. Biotechniques, Suppl: 4–15, 2003.
- Orino K, Tsuji Y, Torti FM, and Torti SV. Adenovirus E1A blocks oxidant-dependent ferritin induction and sensitizes cells to pro-oxidant cytotoxicity. FEBS Lett 461: 334–338, 1999.
- Prestera T and Talalay P. Electrophile and antioxidant regulation of enzymes that detoxify carcinogens. Proc Natl Acad Sci USA 92: 8965–8969, 1995.
- Quackenbush J. Using DNA microarrays to assay gene expression. In: Bioinformatics: a Practical Guide to the Analysis of Genes and Proteins, edited by Baxevanis AD and Francis-Ouellette BF. Indianapolis, IN: Wiley, 2005, p. 410–443.
- Rushmore TH, King RG, Paulson KE, and Pickett CB. Regulation of glutathione S-transferase Ya subunit gene expression: identification of a unique xenobiotic-responsive element controlling inducible expression by planar aromatic compounds. Proc Natl Acad Sci USA 87: 3826–3830, 1990.
- Seo J, Bakay M, Chen YW, Hilmer S, Shneiderman B, and Hoffman EP. Interactively optimizing signal-to-noise ratios in expression profiling: project-specific algorithm selection and detection p-value weighting in Affymetrix microarrays. Bioinformatics 20: 2534–2544, 2004.
- Shih AY, Johnson DA, Wong G, Kraft AD, Jiang L, Erb H, Johnson JA, and Murphy TH. Coordinate regulation of glutathione biosynthesis and release by Nrf2-expressing glia potently protects neurons from oxidative stress. J Neurosci 23: 3394–3406, 2003.
- Tan PK, Downey TJ, Spitznagel EL Jr, Xu P, Fu D, Dimitrov DS, Lempicki RA, Raaka BM, and Cam MC. Evaluation of gene expression measurements from commercial microarray platforms. Nucleic Acids Res 31: 5676–5684, 2003.
- Thimmulappa RK, Mai KH, Srisuma S, Kensler TW, Yamamoto M, and Biswal S. Identification of Nrf2-regulated genes induced by the chemopreventive agent sulforaphane by oligonucleotide microarray. Cancer Res 62: 5196–5203, 2002.
- Tusher VG, Tibshirani R, and Chu G. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci USA 98: 5116–5121, 2001.
- Vandesompele J, De Preter K, Pattyn F, Poppe B, Van Roy N, De Paepe A, and Speleman F. Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome Biol 3, June 2002 [Epub 2002 June 18].
- van 't Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, Schreiber GJ, Kerkhoven RM, Roberts C, Linsley PS, Bernards R, and Friend SH. Gene expression profiling predicts clinical outcome of breast cancer. Nature 415: 530–536, 2002.
- Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Scherer S, Scott G, Steffen D, Worley KC, Burch PE, et al. The sequence of the human genome. Science 291: 1304–1351, 2001.
- Wang Y, Devereux W, Woster PM, Stewart TM, Hacker A, and Casero RA Jr. Cloning and characterization of a human polyamine oxidase that is inducible by polyamine analogue exposure. Cancer Res 61: 5370–5373, 2001.
- Wild AC and Mulcahy RT. Regulation of gamma-glutamylcysteine synthetase subunit gene expression: insights into transcriptional control of antioxidant defenses. Free Radic Res 32: 281–301, 2000.
- Wright LS, Li J, Caldwell MA, Wallace K, Johnson JA, and Svendsen CN. Gene expression in human neural stem cells: effects of leukemia inhibitory factor. J Neurochem 86: 179–195, 2003.
- Yang YH and Speed T. Design issues for cDNA microarray experiments. Nat Rev Genet 3: 579–588, 2002.
- Yuen T, Wurmbach E, Pfeffer RL, Ebersole BJ, and Sealfon SC. Accuracy and calibration of commercial oligonucleotide and custom cDNA microarrays. Nucleic Acids Res 30: e48, 2002.
- Zhou Y and Abagyan R. Algorithms for high-density oligonucleotide array. Curr Opin Drug Discov Devel 6: 339–345, 2003.
- Ziegler DM. An overview of the mechanism, substrate specificities, and structure of FMOs. Drug Metab Rev 34: 503–511, 2002.