A fuzzy logic approach to analyzing gene expression data

PETER J. WOOLF1,2 and YIXIN WANG1

1 Bioinformatics, Department of Molecular Biology, Parke-Davis Pharmaceutical Research, Warner-Lambert, Ann Arbor, Michigan 48105
2 Department of Chemical Engineering, University of Michigan, Ann Arbor, Michigan 48109


    ABSTRACT
 
Woolf, Peter J., and Yixin Wang. A fuzzy logic approach to analyzing gene expression data. Physiol Genomics 3: 9–15, 2000.—We have developed a novel algorithm for analyzing gene expression data. This algorithm uses fuzzy logic to transform expression values into qualitative descriptors that can be evaluated by using a set of heuristic rules. In our tests we designed a model to find triplets of activators, repressors, and targets in a yeast gene expression data set. For the conditions tested, the predictions made by the algorithm agree well with experimental data in the literature. The algorithm can also assist in determining the function of uncharacterized proteins and is able to detect a substantially larger number of transcription factors than could be found at random. This technology extends current techniques such as clustering in that it allows the user to generate a connected network of genes using only expression data.

gene expression profiling; gene regulatory model; data mining


    INTRODUCTION
 
CELLS REGULATE THE EXPRESSION of their genes in response to environmental changes. Normally this regulation is beneficial to the cell, protecting it from starvation or injury; however, errors in this regulation can lead to serious diseases ranging from cancer to heart disease. The pharmaceutical industry is beginning to recognize that gene regulation can be useful both for assaying drugs and as a source of new molecular targets, provided the regulatory network is well understood. As such, changes in gene expression patterns can be used to assay drug efficacy throughout the drug discovery process. One assay that takes advantage of the existing level of sequence information and that is complementary to sequence and genetic analysis is gene expression profiling. Expression profiling technologies such as GeneChip measure the expression level of thousands of genes simultaneously using an array of oligonucleotides bound to a silicon surface. These arrays are hybridized under stringent conditions with a complex sample representing mRNAs expressed in the test cell or tissue. The results from these expression profiling technologies are quantitative and highly parallel, thereby allowing us to take an accurate snapshot of the workings of the cell in a particular state.

Expression profiling assays generate huge data sets that are not amenable to simple analysis. The greatest challenge in maximizing the use of this data is to develop algorithms to interpret and interconnect results for different genes under different conditions. Currently, most expression data is analyzed using clustering techniques, algorithms that identify distinct expression patterns by grouping genes with similar expression patterns (1, 9). Thus clustering can only separate genes that share an expression profile from genes that do not. However, genes in the cell make up a complex network that cannot be revealed with current techniques such as clustering. To determine the network describing how the genes interrelate, more elaborate data mining techniques need to be developed.

Fuzzy logic is a technique drawn from engineering and other applied sciences that is used to control systems as diverse as washing machines and autofocus cameras (2, 10). It provides a way to transform precise numbers, such as 32.43, into qualitative descriptors, such as "high," in a process called "fuzzification." Although other techniques can be used to change precise values into discrete descriptors, fuzzy logic provides a systematic and unbiased way to perform this transformation, thereby removing the need for expert knowledge about the system. For example, is 32.43 a high value? If 32.43 is a measure of the ambient air temperature in degrees Celsius, then most people would say that 32.43°C is a high temperature. But this analysis requires our own expert knowledge, which can vary from person to person. Someone from a tropical climate may feel that 32.43°C is a medium temperature, whereas someone from a very cold climate may take 32.43°C as a very high temperature. When dealing with gene expression data, the problem is even more complicated, because no expert exists to determine what defines a "high" expression level. Using fuzzy logic, the full range of data is first measured and is then broken into discrete subsections based on the observed data. These discrete subsections then provide a qualitative description of the data. Once transformed, this qualitative data can be analyzed using heuristic rules, which in turn generate fuzzy solutions. For example, the heuristic rule "if high then move fast" takes "high" as a fuzzy input and "fast" as a fuzzy solution. In another process called "defuzzification," this heuristic solution can be transformed from a qualitative descriptor back into a precise number.

There are three main advantages of applying fuzzy logic to the analysis of gene expression data. First, fuzzy logic inherently accounts for noise in the data because it extracts trends, not precise values. Second, in contrast to other automated decision making algorithms, such as neural networks or polynomial fits, algorithms in fuzzy logic are cast in the same language used in day-to-day conversation. As a result, predictions made using fuzzy logic are easily interpretable and can be extrapolated in predictable ways. Third, fuzzy logic techniques are computationally efficient and can be scaled to include an unlimited number of components. Thus they are able to recognize a large number of biologically important patterns.

In this work we present a fuzzy logic based algorithm for analyzing gene expression data. Using fuzzy logic, we have developed an analysis technique that can identify logical relationships between genes and in some cases even predict the function of an unknown gene. This algorithm was validated using yeast expression data gathered from the Affymetrix GeneChip system. By using yeast gene expression data collected at different time points of the cell cycle, we were able to identify many regulatory elements and their target genes within the cell that work together to maintain and control certain cellular processes. Several cases are validated by available experimental results, including the signaling network controlled by the transcription factors HAP1 and ROX1, which control the transition from anaerobic to aerobic growth. These results suggest that our fuzzy logic technique can indeed find biologically relevant connections between sets of genes, which in turn could help to describe the complex web of interactions that regulate gene expression.


    METHODS
 

GeneChip experiments.
Public domain GeneChip data describing the yeast cell cycle (1) was chosen to validate the fuzzy logic algorithm. Because the yeast cell cycle is a tightly regulated process at the genetic level, the expression data was expected to show detectable relationships among different genes that might be picked up by our algorithm. Also, years of experimental work on yeast has generated an extensive body of biological data describing many of the proteins in the organism, allowing us to confirm or dismiss our findings based on data in the literature.

Data filtering.
Before expression data was analyzed, the data was first filtered to ensure that 1) the expression data is above the noise level determined by the GeneChip software and 2) the data set includes only genes whose expression levels vary significantly. In chip hybridization, the noise level is defined as the standard deviation in fluorescence intensities of the nonhybridizing probes. For the yeast data set, the noise level was determined to be 30 in fluorescence intensity; thus, to filter out the nondetectable genes in the samples, the highest measurement for a particular gene had to exceed the noise level to allow that gene into the calculation. Also, for a gene to be selected, the maximum value in the series of measurements had to be at least three times greater than the minimum value, ensuring that the observed signal change was significant. The threshold factor of 3 was chosen after the variation among multiple measurements from repeated assays was evaluated; 1,898 (30%) genes that met both criteria were selected.
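
As a rough illustration, the two filtering criteria can be sketched in Python as follows. The function names, the dictionary layout, the gene labels, and the handling of nonpositive minima are assumptions made for this sketch; they are not taken from the original C implementation.

```python
NOISE_LEVEL = 30.0      # fluorescence-intensity noise level reported for the yeast data
MIN_FOLD_CHANGE = 3.0   # required ratio of the maximum to the minimum measurement


def passes_filter(series, noise=NOISE_LEVEL, fold=MIN_FOLD_CHANGE):
    """Return True if one gene's series of measurements meets both criteria."""
    highest, lowest = max(series), min(series)
    if highest <= noise:
        return False  # the gene never rises above the noise level
    # Assumption: nonpositive minima are clamped to a tiny positive floor so that
    # the ratio is defined. The text does not say how such values were handled.
    floor = max(lowest, 1e-9)
    return highest / floor >= fold


def filter_genes(expression):
    """Keep only the genes (name -> list of measurements) that pass the filter."""
    return {gene: s for gene, s in expression.items() if passes_filter(s)}


# Hypothetical measurements for three made-up genes.
example = {
    "gene_a": [10, 20, 25],    # below the noise level
    "gene_b": [40, 60, 90],    # detectable, but less than a threefold change
    "gene_c": [20, 50, 120],   # detectable, with a sixfold change
}
print(sorted(filter_genes(example)))
```

In this toy input only gene_c survives, because it is the only series that both exceeds the noise level and changes by at least a factor of three.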

Gene regulatory model.
In developing the algorithm, we chose to search for genes that follow the pattern of a gene product (C) controlled by both an activator (A) and repressor (B), although in theory any pattern can be searched for. In general for the activator-repressor model, when the activator is high and the repressor is low, the concentration of the target C would be high. Conversely, when the repressor concentration is high, and the activator is low, the concentration of the target is low. These qualitative, or heuristic, rules are similar to the judgement calls made by an expert analyzing the data and were used as a basis for developing our fuzzy algorithm.

Fuzzy logic algorithm.
In analyzing genetic expression data, the data is transformed from crisp values to fuzzy values in a process called "fuzzification." Data is fuzzified by first normalizing the data from 0 to 1, then the normalized value is broken up into various membership classes. For example, Fig. 1 shows the three fuzzy sets used in this algorithm, "HI," "MED," and "LO" as a function of the normalized value. For a normalized value of 0.25, the fuzzy value is 0.5 LO, 0.5 MED, and 0 HI; or said another way, 0.25 is 50% low, 50% medium, and 0% high. The three fuzzy sets HI, MED, and LO were chosen after manually examining expression data and finding that the abundance of most transcripts was either high, medium or low. Other schemes that include a different number or shape of fuzzy sets could also be used to better represent the data; however, these modifications tend to make the analysis less general and more complex and therefore were not pursued in this study.



 
Fig. 1. Fuzzy membership as a function of a normalized expression level.
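
The fuzzification step can be sketched in Python as shown here. The triangular membership shapes peaking at 0, 0.5, and 1 are an assumption chosen to reproduce the worked value in the text (0.25 maps to 0.5 LO, 0.5 MED, 0 HI); the exact curves in Fig. 1 may differ. Per-gene min-max normalization is likewise an assumed reading of "normalizing the data from 0 to 1."

```python
def normalize(series):
    """Rescale one gene's measurements linearly onto the interval [0, 1]."""
    lowest, highest = min(series), max(series)
    if highest == lowest:
        return [0.0 for _ in series]  # flat profile: no range to rescale
    return [(x - lowest) / (highest - lowest) for x in series]


def fuzzify(x):
    """Map a normalized value in [0, 1] to its LO, MED, and HI memberships.

    Assumed shape: triangular sets with peaks at 0 (LO), 0.5 (MED), and 1 (HI).
    """
    return {
        "LO": max(0.0, 1.0 - 2.0 * x),
        "MED": max(0.0, 1.0 - abs(2.0 * x - 1.0)),
        "HI": max(0.0, 2.0 * x - 1.0),
    }


print(fuzzify(0.25))  # {'LO': 0.5, 'MED': 0.5, 'HI': 0.0}
```

With these shapes the three memberships always sum to 1 for inputs in [0, 1], which keeps the later defuzzification step simple.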

 
After the data is fuzzified, triplets of data are compared using a set of heuristic rules in the form of a decision matrix (see Fig. 2). Triplets were defined as the expression values of three different proteins (A, B, and C) all taken at the same time point in the yeast growth cycle time series. Fuzzified values of A and B are entered into this matrix, and at points where their predictions overlap, a score is generated as the fuzzified value of predicted C. This score forms a fuzzy value for C that can be defuzzified back into a crisp number. The predicted expression values of C for each time point in the time series were calculated. Then, the entire series of the predicted values was compared with that of the observed C measurements. For each triplet, the agreement with the assertion in the rule table can be calculated based on the square of the residual, r², between the calculated C and the observed C. Those triplets that have a low r² value fit the assertion better and as such are reported with higher confidence. In our initial screen, we only accepted those triplets with r² < 0.015, corresponding to an average error of 3% or less, well below the error associated with the expression data, which was estimated as 15%.



 
Fig. 2. Decision matrix describing an activator (A) and repressor (B) acting on a target (C).
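
The following Python sketch shows one way the rule evaluation, defuzzification, and residual calculation could fit together. The contents of the rule table are hypothetical, since the entries of Fig. 2 are not reproduced in the text; only the two corner rules (A high and B low gives C high; A low and B high gives C low) come from the description of the model. Combining memberships by product, defuzzifying by a weighted centroid, and summing the squared residuals over time points are also assumptions.

```python
CENTERS = {"LO": 0.0, "MED": 0.5, "HI": 1.0}  # assumed crisp value for each class

# Hypothetical rule table: (class of A, class of B) -> predicted class of C.
RULES = {
    ("LO", "LO"): "MED", ("LO", "MED"): "LO",  ("LO", "HI"): "LO",
    ("MED", "LO"): "HI", ("MED", "MED"): "MED", ("MED", "HI"): "LO",
    ("HI", "LO"): "HI",  ("HI", "MED"): "HI",  ("HI", "HI"): "MED",
}


def fuzzify(x):
    """Assumed triangular memberships peaking at 0, 0.5, and 1."""
    return {
        "LO": max(0.0, 1.0 - 2.0 * x),
        "MED": max(0.0, 1.0 - abs(2.0 * x - 1.0)),
        "HI": max(0.0, 2.0 * x - 1.0),
    }


def predict_c(a, b):
    """Predict the normalized level of target C from activator A and repressor B."""
    fa, fb = fuzzify(a), fuzzify(b)
    strength = {"LO": 0.0, "MED": 0.0, "HI": 0.0}
    for (class_a, class_b), class_c in RULES.items():
        # Each box contributes in proportion to how strongly A and B fall in it.
        strength[class_c] += fa[class_a] * fb[class_b]
    total = sum(strength.values())
    # Defuzzify: weighted average of the class centers.
    return sum(CENTERS[c] * w for c, w in strength.items()) / total


def squared_residual(a_series, b_series, c_series):
    """Sum of squared differences between predicted and observed C."""
    return sum(
        (predict_c(a, b) - c) ** 2
        for a, b, c in zip(a_series, b_series, c_series)
    )


print(predict_c(1.0, 0.0))  # activator high, repressor low
print(predict_c(0.0, 1.0))  # activator low, repressor high
```

A triplet whose observed C series tracks the predicted series closely gives a small residual and would pass the r² cutoff described above.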

 
In some cases the data set of A and B fails to properly explore the decision matrix (i.e., A is almost always high, and B is almost always low); thus a second score called the variance was also assigned to the data set. The variance is defined as the statistical variance among the total hits in each box of the decision matrix. If the data set hits are evenly distributed throughout the decision matrix, then the variance score is low, and the resulting predictions are credible. However, if the data set is poorly distributed, then the variance will be high and the predictions may or may not be believable because of the lack of combinations of A and B tested. For the initial screen, only those A/B pairs with a variance of 1.5 or less were chosen.
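
A minimal Python sketch of this coverage score is given here. The text does not specify how a "hit" is counted, so this sketch assumes that each time point is assigned to the single box of its strongest A and B classes, and that the score is the population variance of the nine box counts. Both choices, and the class boundaries at 0.25 and 0.75, are assumptions.

```python
from collections import Counter
from statistics import pvariance

CLASSES = ("LO", "MED", "HI")


def strongest_class(x):
    """Class with the largest membership, assuming triangular sets at 0, 0.5, 1."""
    if x < 0.25:
        return "LO"
    if x <= 0.75:
        return "MED"
    return "HI"


def hit_variance(a_series, b_series):
    """Variance of the number of time points landing in each decision-matrix box."""
    hits = Counter(
        (strongest_class(a), strongest_class(b))
        for a, b in zip(a_series, b_series)
    )
    counts = [hits[(class_a, class_b)] for class_a in CLASSES for class_b in CLASSES]
    return pvariance(counts)


# Nine hypothetical time points that visit every box exactly once.
even_a = [0.1, 0.1, 0.1, 0.5, 0.5, 0.5, 0.9, 0.9, 0.9]
even_b = [0.1, 0.5, 0.9] * 3
print(hit_variance(even_a, even_b))          # evenly spread

# Nine time points that all fall in the (HI, LO) box.
print(hit_variance([0.9] * 9, [0.1] * 9))    # poorly spread
```

The evenly spread pair scores 0, whereas the pair confined to one box scores 8 and would be rejected by a cutoff of 1.5.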

To get an overall idea of how well the assertion fits the data, the r² value and the variance are multiplied and scaled by a factor of 100,000 to give an overall score. Thus triplets with low r² values and low variance will have the lowest score and also should be the most credible statements. Other data that are only low in one parameter may be filtered out because either the fit is too poor or the data set is biased.
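
The scoring and screening step reduces to a few lines of Python. The constants are the values quoted in the text; the function names are illustrative only.

```python
SCALE = 100_000          # scaling factor quoted in the text
R2_CUTOFF = 0.015        # initial-screen cutoff on the squared residual
VARIANCE_CUTOFF = 1.5    # initial-screen cutoff on the variance score


def overall_score(r2, variance):
    """Combined score: lower values indicate a more credible triplet."""
    return r2 * variance * SCALE


def passes_initial_screen(r2, variance):
    """Apply both cutoffs used in the initial screen."""
    return r2 < R2_CUTOFF and variance <= VARIANCE_CUTOFF


# Hypothetical triplet with a good fit and well-spread A/B data.
print(overall_score(0.004, 0.5))
print(passes_initial_screen(0.004, 0.5))
```

Because the two factors are multiplied, a triplet needs both a good fit and good coverage of the decision matrix to reach a low score.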

All fuzzy logic analyses were written in the C programming language and run on an 8-processor SGI Origin 2000 system, which required ~200 h to analyze the relationships between 1,898 genes. Because all combinations of triplets are checked, the algorithm scales as O(n³) with the number of genes examined, n. However, because the problem consists of solving a large number of smaller, independent comparisons, the algorithm lends itself to parallel computing and scales nearly linearly with the number of available processors.
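
To make the cubic scaling concrete, the Python sketch below counts the triplets and shows one way the work could be split into independent pieces. It assumes that a triplet is an ordered choice of three distinct genes for the roles A, B, and C, and that the work is partitioned by target gene; neither detail is stated in the text.

```python
from itertools import permutations


def count_triplets(n):
    """Number of ordered (A, B, C) triplets of distinct genes among n genes."""
    return n * (n - 1) * (n - 2)


def triplets_for_target(genes, target):
    """Yield every (A, B, C) triplet with a fixed target C.

    Each target's triplets are independent of all others, so different
    targets could be handed to different processors.
    """
    others = [g for g in genes if g != target]
    for activator, repressor in permutations(others, 2):
        yield (activator, repressor, target)


print(count_triplets(1898))  # 6826559376

genes = ["g1", "g2", "g3", "g4"]  # hypothetical gene labels
print(len(list(triplets_for_target(genes, "g1"))))  # 6
```

Under this assumption the 1,898 filtered genes give roughly 6.8 billion triplets, which is consistent with the long run time reported.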

A US patent application (serial no. 60/181477) has been filed on this algorithm. Copies of the program are available upon request.


    RESULTS
 
Of the 6,321 known proteins in yeast, we found only 1,898 (30%) genes whose expression levels were above the noise threshold and had a maximum value at least three times greater than the minimum value, ensuring that the observed signal change was significant. Only these genes were processed by the fuzzy logic algorithm. Using our initial screening cutoff for r² and variance, we narrowed our list down to the top 0.007% of all possible triplets, thereby forming the basis for our initial calculations.

To evaluate and validate the algorithm, we examined the best-scoring triplets to see if they made biological sense. The complete table of all the triplets can be obtained from us. One of the best-scoring triplets, CYB2-HAP1-CYC7, is shown in Fig. 3. Figure 3, top, shows the tight fit of the fuzzy logic prediction for CYC7 expression level compared with the experimental data. This triplet has the second best score from our selection, and the functions of all three components have been extensively studied. The overall score for this triplet is 1,295 when the calculated expression data of CYC7 are compared with the observed data (Fig. 3, top), indicating a very high confidence for the correlation. Moreover, the expression data of CYB2 (A) and HAP1 (B) show a fairly wide range of values and are evenly distributed throughout the decision matrix (Fig. 3, bottom); thus the variance score is low and the resulting predictions are credible. It is worthwhile to note that neither CYB2 (A) nor HAP1 (B) could be categorized in the same cluster as CYC7 (C) as shown in Fig. 3, middle; thus only through the use of this fuzzy logic algorithm could this triplet be uncovered.



 
Fig. 3. The fuzzy logic prediction for CYB2 (A), HAP1 (B), and CYC7 (C). Top: correlation of the predicted and the observed values for CYC7 (C). Middle: relationship of CYB2 (A) and HAP1 (B), showing that these two proteins explore the relationship space well and as such have a low variance value. Bottom: relationship of CYB2 (A) and CYC7 (C), indicating that the relationship between these two proteins could not be uncovered by clustering. Note that for some time points the normalized value equals zero and as such has no mark.

 
Previous studies show that all three genes in this triplet are involved in the yeast respiration process, and the predicted relationships between the genes are borne out by experimental results in the literature (4, 6). The transcription factor HAP1 has been shown to repress the nuclear-encoded cytochrome gene CYC7 under anaerobic growth and activate CYC7 under aerobic growth (7). The fuzzy logic prediction suggests that HAP1 represses CYC7, which in turn accurately predicts that the cells used in this data set were primarily grown under anaerobic conditions.

The algorithm also predicts that CYB2 should activate CYC7, again in agreement with experimental findings. CYB2, L-(+)-lactate cytochrome c oxidoreductase, is a soluble protein from the intermembrane space of mitochondria. This protein transfers electrons from L-(+)-lactate to cytochrome c and is upstream of cytochrome c on the electron transport chain. Experimental findings indicate that CYB2 interacts preferentially with CYC7 during the electron transfer process (4) and as such should positively regulate the expression of CYC7 as found by the algorithm.

Following up on the relationships revealed by this triplet, we selected all the triplets that contain either HAP1 or HAP1-regulated genes in an effort to generate an interconnected network describing the control roles of HAP1. The network predicted by the fuzzy logic algorithm is described in Fig. 4 and is highly consistent with the experimental data obtained from previous studies. Moreover, we could functionally identify unidentified genes involved in this process and generate hypotheses for future experimental tests. For example, previous studies show that an unidentified protein X masks the activation domain of HAP1 and allows HAP1 to act as a repressor under anaerobic conditions (5, 11). Currently, protein X remains uncharacterized; however, the fuzzy logic prediction suggests that several genes including YDL174C, YGL037C, YLR251W, YLR252W, and YNL007C could be this uncharacterized protein. Functionally, these proteins appear to repress the ability of HAP1 to activate CYC7; however, further experiments are needed to determine the exact protein involved.



 
Fig. 4. HAP1 and ROX1 regulatory network as predicted by the yeast expression data set.

 
Both CYT1 and CYC1 are regulated by HAP1 (8); however, the fuzzy logic algorithm only uncovered a relationship for CYT1. On further study we found that CYT1, like CYC7 discussed above, has a single HAP1 binding site, whereas the CYC1 promoter has two HAP1 binding sites (7). A major consequence of the two cooperative sites of CYC1 is that the HAP1 modulation profile should be sigmoidal rather than linear. In contrast, the single promoter site in CYT1 and CYC7 should provide these genes a more linear reduction in activity as HAP1 expression changes. Our fuzzy logic algorithm is limited to detecting linear relationships; thus proteins controlled by two redundant sites may not be as easily detected.

Experimentally, it has been shown that HAP1 regulates ROX1, a gene that encodes a repressor protein for the hypoxic genes (3, 12). When cells are grown under aerobic conditions, heme accumulates to levels sufficient to induce ROX1 expression and the hypoxic genes are repressed. When cells are limited for oxygen, heme levels fall, ROX1 repressor levels are reduced, and hypoxic gene expression is derepressed. The relationship between HAP1 and ROX1 was not revealed by the fuzzy logic algorithm, because the expression of ROX1 is transient and highly unstable. However, two genes, CYT1 and GPD2, were found by the algorithm as targets of positive regulation by ROX1, whereas other hypoxic genes were not identified. This result suggests that the view of hypoxic gene regulation is correct in terms of the phenomenology but that there is a great deal more complexity to ROX1 regulation.


Pairs.
Table 1 lists the most frequently occurring pairs of genes in triplets identified by the fuzzy logic algorithm, many of which appear to be biologically relevant. In several cases, both gene products function in the same cellular process. For example, AGP1 and MEP2 are often found together. Functionally, AGP1 encodes a broad substrate range amino acid permease whose expression is subject to nitrogen repression, whereas MEP2 is a high-affinity ammonia permease induced by nitrogen starvation. Thus it makes sense that AGP1 and MEP2 are found in the same triplet. Similarly, HAP1 is a transcription factor with a broad spectrum of targets including genes involved in sterol biosynthesis such as FAA1 and ARE2. FAA1 is a long chain fatty acyl:CoA synthetase involved in lipid metabolism and protein N-myristoylation. ARE2 is a sterol-ester synthetase involved in ergosterol esterification. In general, HAP1 is known as a repressor; thus the fact that this algorithm identifies HAP1 as repressing FAA1 and ARE2 is consistent with known biological data.


 
Table 1. Frequent pairs of genes identified by the fuzzy logic algorithm

 
YGP1 and CBF2 represent a very interesting relationship predicted by the algorithm. YGP1 is a highly glycosylated secreted protein involved in cellular adaptations prior to stationary phase. The gene is expressed at a basal level during logarithmic growth and induced up to 50-fold above basal level when cells enter stationary phase. Conversely, CBF2 is a chromosome centromere binding protein in a multisubunit protein complex and is involved in cellular replication and cell division. Its expression is expected to be high during logarithmic growth but low during stationary phase. This reciprocal relationship between YGP1 and CBF2 is well represented in the predicted results, as the two proteins were found in either B or C positions depending on which protein acted as the activator in the triplet. When CDC46, a protein enriched in nondividing cells, is the activator, YGP1 is found as the target in the triplet. When proteins promoting cell cycle progression such as CDC45, RPL14A, ZDS2, NHP6A, and IPL1 are activators, CBF2 acts as the target. CBF2 was found 54 times as the regulator, whereas YGP1 was found 17 times to be the regulator, suggesting that with different activators CBF2 played its regulatory role in most of the assays in this analysis. The observation may reflect the fact that most of the assay conditions in this study were designed to promote cell division.

In addition, there are also pairs of genes predicted by the fuzzy logic algorithm where one or both of the genes are uncharacterized. By analogy to the examples shown above, it may be possible to infer the cellular function of these unknown proteins by examining what known proteins are found to associate with the set. This ability to bootstrap functional information out of the expression data could be particularly useful in analyzing human data, where a much larger percentage of proteins are uncharacterized.

Transcription factors.
Because we used our algorithm to search for activator-repressor-target triplets, we expected to find that a disproportionately large number of triplets would include transcription factors. Among the 1,898 genes in our selected data set, we found 124 genes annotated as transcription factors in GenBank descriptions. The expected probability of finding a transcription factor in our data set is therefore 6.5%. After our initial screen by fuzzy logic analysis, we discovered that transcription factors were found at 9.0% in activator or repressor positions, representing a 36% enrichment over what would be expected from their frequency in the original data set. When looking only at the 100 best-scoring triplets, we found transcription factors at 14%, or 110% more frequently than expected.
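
The baseline frequency and the enrichment figures can be rechecked with a short Python calculation. It uses only the rounded percentages quoted in the text, so the results land near, but not exactly on, the reported 36% and 110%; the published figures were presumably computed from unrounded counts that are not given here. The function name is illustrative.

```python
def enrichment(observed, expected):
    """Fractional enrichment of an observed frequency over the expected one."""
    return observed / expected - 1.0


# Baseline: 124 annotated transcription factors among 1,898 filtered genes.
baseline = 124 / 1898
print(f"baseline frequency: {baseline:.1%}")

# Rounded frequencies quoted in the text.
print(f"initial screen: {enrichment(0.090, 0.065):.0%} enrichment")
print(f"top 100 triplets: {enrichment(0.14, 0.065):.0%} enrichment")
```

The baseline reproduces the 6.5% stated above, and both enrichment values fall within a few percentage points of the reported ones.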


    DISCUSSION
 
In general, the findings of the algorithm agree well with experimental results from the literature. This should not come as a surprise, seeing as the algorithm searches for relationships that fit our logical understanding of how an activator, repressor, and target should interact. Thus, by using essentially the same criteria that an experimenter would use to describe the regulatory function of a protein, the fuzzy logic algorithm approximates the thought process an expert would use in analyzing this data. However, in contrast to an expert, the algorithm is automated, unbiased, and general. Expression data can be difficult to interpret and, if not analyzed properly, can be easily misconstrued. By applying a computational algorithm to the analysis of the data, we have provided a "lens" through which the data can be sorted in an unbiased manner, quickly and efficiently.

Although in this study the algorithm was only used to search for triplets of activator, repressor, and target genes, other variations of the algorithm are also possible. The choice of the activator-repressor model provided a simple method to demonstrate that this technology can yield biologically meaningful results. However, the technique is general and can be applied to other relationships and more complicated systems. Examples include other classes of relationships such as coactivators and corepressors or more complicated systems that involve genes whose transcription is regulated in complex ways by any number of transcription factors. Potentially, the technology could also be extended to describe complete general networks of gene interactions based on expression data alone.

However, using fuzzy logic to analyze expression data does have some limitations. To a first approximation, the interaction between multiple proteins is essentially linear, thus the algorithm searched for linear behavior. However, in the case of multiple redundant promoter binding sites (such as the two HAP1 binding sites on the CYC1 gene), this linear approximation is not accurate, causing the algorithm to overlook these biologically relevant connections. This situation could be remedied by including a more sophisticated "fuzzification" step to include nonlinear effects; however, this added complexity may only correct for a few missed connections while edging out many of the more common near linear relationships. Also, the goal of this algorithm is not to yield quantitative predictions, but instead to draw general trends that connect the regulation of multiple genes. Thus including specific nonlinear effects would not help to draw many connections, but would end up adding a significant computational burden to an already difficult problem.

The fuzzy logic algorithm found a disproportionately large number of transcription factors in the roles of activators and repressors; however, not all of the activators and repressors found were transcription factors. Two possible reasons for this discrepancy are 1) transcription factors are expressed at low levels and as such are difficult to detect, and/or 2) other gene products such as enzymes can indirectly regulate transcription. Transcription factors are generally present only at a very low concentration; thus changes in transcription factor expression levels can be difficult to detect using current expression profiling techniques. Presumably, if expression profiling technology were to become more sensitive, then the fuzzy logic algorithm would detect an even greater bias of transcription factors in the activator and repressor roles. However, in many cases the expression level of a particular protein is not governed by the expression of a transcription factor, but instead by the concentration of some intracellular compound, such as Ca2+ concentration or cAMP levels, which in turn are controlled by enzymes inside the cell. In these cases, changes in the expression level of the enzyme have a "transcription-factor-like" effect and would be detected by the algorithm as an activator or repressor. From a drug design point of view, these "transcription-factor-like" enzymes are possibly more interesting than true transcription factors, because it is generally easier to change the activity of an enzyme in the cytosol with a drug than to block a true transcription factor in the nucleus. Moreover, the data set used in this study came from a single experiment in which cell cycle control was the main process of study. Transcription factors that are not involved in pathways related to this cellular process might not show significant change in their expression and thus could not be evaluated by the fuzzy logic algorithm. To perform a more comprehensive survey on transcription factors, we are analyzing a data set that includes gene expression profiles of both wild-type and various mutant yeast cells. Many more transcription factors can be evaluated because the cellular processes they control have been perturbed.

Although the validation of this algorithm was performed using GeneChip data in this report, the fuzzy logic algorithm should work equally well with other expression profiling techniques such as Serial Analysis of Gene Expression (SAGE). SAGE has the advantage that it can detect completely unknown proteins, whereas GeneChip technologies require that at least the sequence of the protein’s mRNA be known. This ability to detect unknown proteins would be particularly well suited to the functional characterization that the fuzzy logic algorithm makes possible.

An additional advantage of the fuzzy logic algorithm is that data can come from any source within an organism (tissue, cell type, treatment, or physiological state), and the output actually will be improved by a deeper and more diverse data set. The reason for this improvement is that the algorithm needs to observe changes in the expression level of a protein relative to changes in other expression levels. Each new data set provides a different set of expression levels that can be tested to see whether they fit the proposed regulatory model. In our studies, many data sets were eliminated solely because they did not sufficiently explore the combinations of expression levels (too high a sigma value), making their predictions impossible to believe. By including data sets from cells in different states, the algorithm gains more information about the details of the regulatory network.

A primary application of this algorithm is to independently validate or discover drug targets. Traditional techniques for drug target discovery require a detailed understanding of the biology underlying the disease, which can be slow and difficult to obtain. In contrast, expression profiling is a rapid high-throughput process that gives a large amount of information about the cell in a form that could be easily processed on a computer. By using a fuzzy logic approach to analyzing expression profile data, it is possible to confirm the mechanism of a known target. Moreover, because the fuzzy logic algorithm does not require biological information about the gene, genes with unknown functions can be included just as easily as genes with known functions. This ability to identify functional clues for uncharacterized genes is a great advantage in drug target discovery, because potential drug targets then can be followed up with the detailed biology.


    FOOTNOTES
 
Article published online before print. See web site for date of publication (http://physiolgenomics.physiology.org).

Address for reprint requests and other correspondence: Y. Wang, Bioinformatics, Dept. of Molecular Biology, Parke-Davis Pharmaceutical Research, Warner-Lambert, Ann Arbor, MI 48105 (E-mail: yixin.wang{at}wl.com).


    REFERENCES
 

  1. Cho RJ, Campbell MJ, Winzeler EA, Steinmetz L, Conway A, Wodicka L, Wolfsberg TG, Gabrielian AE, Landsman D, Lockhart DJ, and Davis RW. A genome-wide transcriptional analysis of the mitotic cell cycle. Mol Cell 2: 65–73, 1998.
  2. Cox E. Fuzzy fundamentals. IEEE Spectrum 29: 58–61, 1992.
  3. Deckert J, Perini R, Balasubramanian B, and Zitomer RS. Multiple elements and auto-repression regulate Rox1, a repressor of hypoxic genes in Saccharomyces cerevisiae. Genetics 139: 1149–1158, 1995.
  4. Fytlovich S, Gervais M, Agrimonti C, and Guiard B. Evidence for an interaction between the CYP1(HAP1) activator and a cellular factor during heme-dependent transcriptional regulation in the yeast Saccharomyces cerevisiae. EMBO J 12: 1209–1218, 1993.
  5. Hach A, Hon T, and Zhang L. A new class of repression modules is critical for heme regulation of the yeast transcriptional activator Hap1. Mol Cell Biol 19: 4324–4333, 1999.
  6. Lodi T and Guiard B. Complex transcriptional regulation of the Saccharomyces cerevisiae CYB2 gene encoding cytochrome b2: CYP1(HAP1) activator binds to the CYB2 upstream activation site UAS1-B2. Mol Cell Biol 11: 3762–3772, 1991.
  7. Prezant T, Pfeifer K, and Guarente L. Organization of the regulatory region of the yeast CYC7 gene: multiple factors are involved in regulation. Mol Cell Biol 7: 3252–3259, 1987.
  8. Schneider JC and Guarente L. Regulation of the yeast CYT1 gene encoding cytochrome c1 by HAP1 and HAP2/3/4. Mol Cell Biol 11: 4934–4942, 1991.
  9. Tavazoie S, Hughes JD, Campbell MJ, Cho RJ, and Church GM. Systematic determination of genetic network architecture. Nat Genet 22: 281–285, 1999.
  10. Zadeh LA. Fuzzy logic and its application to approximate reasoning. Information Processing 74: 591–594, 1974.
  11. Zhang L, Hach A, and Wang C. Molecular mechanism governing heme signaling in yeast: a higher-order complex mediates heme regulation of the transcriptional activator HAP1. Mol Cell Biol 18: 3819–3828, 1998.
  12. Zitomer RS, Limbach MP, Rodriguez-Torres AM, Balasubramanian B, Deckert J, and Snow PM. Approaches to the study of Rox1 repression of the hypoxic genes in the yeast Saccharomyces cerevisiae. Methods Enzymol 11: 279–288, 1997.