SPECIAL COMMUNICATION
Use of serial analysis of gene expression to generate kidney expression libraries

M. Ashraf El-Meanawy1, Jeffrey R. Schelling1, Fatima Pozuelo1, Matthew M. Churpek1, Eckhard K. Ficker, Sudha Iyengar3, and John R. Sedor1,2

Departments of 1 Medicine, 2 Physiology and Biophysics, and 3 Epidemiology and Biostatistics, School of Medicine, Case Western Reserve University, and Rammelkamp Center for Research and Education, MetroHealth Medical Center, Cleveland, Ohio 44109


    ABSTRACT

Chronic renal disease initiation and progression remain incompletely understood. Genome-wide expression monitoring should clarify mechanisms that cause progressive renal disease by determining how clusters of genes coordinately change their activity. Serial analysis of gene expression (SAGE) is a technique of expression profiling, which permits simultaneous, comparative, and quantitative analysis of gene-specific, 9- to 13-bp sequence tags. Using SAGE, we have constructed a tag expression library from ROP-+/+ mouse kidney. Tag sequences were sorted by abundance, and identity was determined by sequence homology searching. Analyses of 3,868 tags yielded 1,453 unique kidney transcripts. Forty-two percent of these transcripts matched mRNA sequence entries with known function, 35% of the transcripts corresponded to expressed sequence tag (EST) entries or cloned genes, whose function has not been established, and 23% represented unidentified genes. Previously characterized transcripts were clustered into functional groups, and those encoding metabolic enzymes, plasma membrane proteins (transporters/receptors), and ribosomal proteins were most abundant (39, 14, and 12% of known transcripts, respectively). The most common, kidney-specific transcripts were kidney androgen-regulated protein (4% of all transcripts), sodium-phosphate cotransporter (0.3%), renal cytochrome P-450 (0.3%), parathyroid hormone receptor (0.1%), and kidney-specific cadherin (0.1%). Comprehensively characterizing and contrasting gene expression patterns in normal and diseased kidneys will provide an alternative strategy to identify candidate pathways, which regulate nephropathy susceptibility and progression, and novel targets for therapeutic intervention.

end-stage renal disease; mouse model; chronic renal failure; genetic techniques; mRNA


    INTRODUCTION

THE INCIDENCE OF END-STAGE renal disease has risen at an annual rate of 7-9% for the last decade (3). This increase is due, at least in part, to incomplete understanding of renal disease pathophysiology and limited therapeutic options to prevent disease progression. Kidney disease-oriented research has primarily focused on candidate molecules, emphasizing fibrogenic pathways (7) and hemodynamic alterations in the setting of reduced nephron number (8) as key mechanisms regulating chronic renal disease initiation and progression. However, chronic renal failure results from complex interaction of genetic and environmental risk factors, and interruption of a single effector pathway is unlikely to result in significant therapeutic benefit (31).

In contrast to focusing on candidate effector pathways, gene expression profiling is an alternative, powerful tool to better understand renal disease pathogenesis by generating a detailed analysis of mRNA expression profiles. Novel molecular techniques used to generate transcript libraries can simultaneously determine the net consequences of gene-gene and gene-environment interactions on expression of thousands of genes. Rather than applying a priori assumptions (i.e., hypothesis testing), kidney transcript profiles from normal animals and animals with progressive kidney disease could be "mined" by analytic methods developed to discover unexpected relationships between genes or pathways (i.e., "bottom-up approach") (6). Although hypothesis-driven experimentation remains critical for knowledge discovery, thoughtful analysis of expression profiles generated with other model systems has yielded unanticipated results and defined new paradigms. For example, cluster analysis of transcript profiles showed that serum specifically activated wound-healing processes in fibroblasts rather than serving as a general signal for cell growth (18). Global expression monitoring also demonstrated that receptor tyrosine kinases, with unique ligand specificities and distinct biological effects, induce broadly overlapping, rather than independent, sets of genes (13, 27), suggesting that cellular responses depend less on the specific ligand than on cellular differentiation state, signal strength, or combinatorial interactions between activated signaling pathways. Finally, clustering of transcripts with similar expression patterns can assign a precise biological pathway to a gene of "generic" function (e.g., kinase) and provide clues to the function of an unknown gene by relating it to characterized genes (1, 10).

Gene expression profiles can be generated and compared by multiple techniques. Subtractive hybridization (33), subtraction libraries (14), and differential display (2, 5) are semiquantitative methods, which may not detect small, pathophysiologically important variations in gene expression. In contrast, DNA microarrays quantitatively assay expression levels of thousands of genes, using discrete DNA sequences robotically imprinted on glass microscope slides (9), but require technology and instrumentation not available to most investigators. Serial analysis of gene expression (SAGE) also allows simultaneous, quantitative analysis of a large number of transcripts (36) but does not require expensive instrumentation. Using SAGE, an individual investigator with access to an automated sequencer can quantify mRNAs expressed at a level of 100 copies/cell within months (36) and can identify transcripts expressed at as low as 1 transcript/cell in larger SAGE libraries (37). To date, SAGE expression libraries have been generated predominantly from yeast, tumor specimens, and cultured cells (37, 42). Because quantitative analyses of transcript levels appear to be critical for a better understanding of normal cell function and cellular response to injury, we have utilized SAGE to generate a cDNA tag library, which is composed of unique 9- to 13-bp cDNA sequence signatures, from mouse kidney mRNA. The results demonstrate the feasibility of applying SAGE to compare gene expression between normal and diseased kidney, to discover new mechanisms of renal disease pathogenesis, and to identify novel targets for renal disease therapy.


    MATERIALS AND METHODS

RNA preparation. As a prelude to comparing kidney expression libraries generated from wild-type animals and animals with progressive chronic renal failure, kidneys were harvested from 26-wk-old male ROP-Es/+ mice (Jackson Laboratories, Bar Harbor, ME), which are glomerulosclerosis prone but have normal kidney structure and function in the absence of a triggering event such as hyperglycemia or nephron reduction (12, 40). One and three-quarter kidneys were used for polyadenylated RNA (A+) extraction by using RNeasy and Oligotex extraction kits (Qiagen, Valencia, CA) according to the manufacturer's recommendation.

Production of ROP-Es/+ kidney tag library. SAGE kidney libraries were generated as described by Velculescu et al. (36) with modifications to maximize the final yield. A schematic of SAGE can be found on the Internet (http://www.sagenet.org) and in Fig. 1. Briefly, biotin-TEG-oligo-dT (Integrated DNA Technologies, Coralville, IA) was used to prime first-strand cDNA synthesis of mouse kidney A+ RNA using a MMLV-RT cDNA synthesis kit (GIBCO-BRL, Gaithersburg, MD). Second-strand cDNA was then generated with a final yield efficiency of 75-85%. Biotinylated cDNA was digested with the restriction enzyme Nla III (New England Biolabs, Beverly, MA), and 3'-cDNA end fragments were isolated with streptavidin-coated magnetic beads (Dynal, Oslo, Norway). 3'-cDNAs were split into two pools, and each pool was ligated by using T4 DNA ligase (GIBCO-BRL) to custom-made 40-bp SAGE oligonucleotide DNA linkers (L1 and L2; Integrated DNA Technologies). To release linker-cDNA tag hybrids, the ligation reactions were then digested with the type IIS restriction endonuclease Bsm FI (New England Biolabs) at 60°C for 2 h with continuous rotation in a hybridization oven. Magnetic beads were pelleted, and released linker-tags were blunted with T4 DNA polymerase (New England Biolabs) in the presence of deoxynucleotides. The L1- and L2-linker-tag complexes were then ligated together to form linker-ditag-linker constructs. To determine the optimum input of the linker-ditag-linker for large-scale PCR, aliquots of serially diluted (1:10-1:1,280) ligation products were PCR amplified for 25 cycles (95°C for 30 s, 55°C for 1 min, then 70°C for 1 min) and analyzed by 8% PAGE. Subsequently, a large-scale (96 tubes) PCR reaction was conducted at optimum template dilution, and PCR products were pooled and digested with Nla III to release kidney cDNA ditag sequences. Released ditags were separated from linkers by 12% PAGE, eluted, and ligated together to produce concatemers.
Concatemers were size fractionated by using 8% PAGE, and those between 600 and 1,200 bp were isolated and cloned into pZero (Invitrogen, San Diego, CA), which had been digested with Sph I. After electroporation into DH10B (GIBCO-BRL), clones were analyzed for recombinants by direct PCR using M13 forward and M13 reverse primers. PCR products >500 bp in size were isolated by 1.2% agarose gel electrophoresis and sequenced by using a BigDye terminator cycle sequencing kit (Perkin-Elmer, Foster City, CA). Sequencing reactions were analyzed by using an ABI-377 automated sequencer (Perkin-Elmer).


Fig. 1.   Schematic of serial analysis of gene expression (SAGE). See MATERIALS AND METHODS for details. Adapted from Velculescu et al. (36, 37) and http://www.sagenet.org/.

Data analysis. Concatemer sequences were analyzed by using SAGE software v1.0 (provided by Dr. Kenneth Kinzler's laboratory, Johns Hopkins University, Baltimore, MD), which automatically detects and counts tags from sequence files. Tag counts directly reflect transcript abundance (36, 42). SAGE software excludes replicate ditags from the tag sequence catalogue, because the probability of any two tags being coupled in the same ditag is small, even for abundant transcripts (36). The identities of the genes corresponding to the tags were determined by homology searches of public databases at the National Center for Biotechnology Information web site (http://www.ncbi.nlm.nih.gov), including GenBank, European Molecular Biology Laboratory (EMBL), DNA Database of Japan (DDBJ), Protein Data Bank (PDB), and the expressed sequence tag (EST) division of GenBank, using basic local alignment search tool (BLAST)-N v2.1 (cutoff 70, no filters) (35). This version of the BLAST search algorithm is no longer available, but our results can be replicated by using Advanced BLAST v2.0 [advanced search options: ungapped alignment, expect set to 100-10,000, no filters and word size set to 6 (see BLAST help file http://www.ncbi.nlm.nih.gov/BLAST/blast_help.htm)]. The nonredundant Mus musculus GenBank database was initially searched with the tag sequences. If no appropriate matches were obtained, the entire nonredundant GenBank database was next searched, and if necessary, the tag sequence was submitted to dbEST, the nonredundant GenBank, EMBL, DDBJ EST database. Frequently, SAGE tag sequences matched more than one transcript. In these cases, genes matching the SAGE tags were identified using the published algorithm (36, 39). First, GenBank entries from mammalian organisms were identified (for searches that were not limited to M. musculus). Matches with nonmammalian sequences were excluded. Second, genomic, non-mRNA sequences were eliminated.
Finally, search results were analyzed for multiple entries for the same gene, and the final match was checked to verify that the SAGE sequence flanked the 3'-most Nla III restriction endonuclease recognition site.
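The final verification step described above (checking that a tag lies next to the 3'-most Nla III site of a candidate mRNA) can be illustrated with a minimal Python sketch. This is not the SAGE software used in this study. The function names, the default tag length of 10 bases, and the assumption that the database entry is given as the sense strand in 5'-to-3' orientation are all illustrative choices.

```python
ANCHOR = "CATG"  # Nla III recognition sequence


def predicted_tag(mrna, tag_len=10):
    """Return the tag_len bases immediately 3' of the 3'-most CATG site.

    Assumes `mrna` is the sense strand written 5'->3'. Returns None when the
    sequence has no CATG site or too few bases follow the last site.
    """
    seq = mrna.upper()
    site = seq.rfind(ANCHOR)  # position of the 3'-most Nla III site
    if site == -1:
        return None  # a transcript without an Nla III site yields no tag
    start = site + len(ANCHOR)
    tag = seq[start:start + tag_len]
    return tag if len(tag) == tag_len else None


def tag_matches_entry(tag, mrna):
    """True if `tag` is the sequence flanking the 3'-most CATG of `mrna`."""
    return predicted_tag(mrna, len(tag)) == tag.upper()
```

With a hypothetical entry such as `GGCATGAAACCCGGGTTTCATGACGTACGTTAGGCTAAAAAA`, only the tag following the second CATG would be accepted. A tag that follows the upstream CATG would be rejected as a match for that entry.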

Immunoblotting. Tissue samples were isolated from neonatal rat kidney and heart (as a positive control). DRK23 cells overexpressing rat Kv2.1 were analyzed as a negative control. Samples were homogenized in 5-10 vol of 0.3 M sucrose, 10 mM NaPO4 (pH 7.4) supplemented with protease inhibitor mix (Complete; Roche, Indianapolis, IN). After removal of debris and nuclei (3,000 g, 10 min), the supernatant was spun at 50,000 g for 1 h at 4°C to pellet a crude membrane fraction. Protein concentrations were determined by the bicinchoninic acid method (Pierce, Rockford, IL). Total protein was separated on 11% SDS polyacrylamide gels and transferred to polyvinylidene difluoride membranes. Membranes were blocked overnight with 5% nonfat dry milk in PBS plus 0.1% Tween and immunoblotted with a commercially available mouse monoclonal anti-Kv1.5 antibody (1:250 dilution; 1 h at room temperature; Transduction Laboratories, Lexington, KY) followed by horseradish peroxidase-conjugated secondary antibody (1:3,000; 1 h at room temperature; Amersham Pharmacia Biotech, Piscataway, NJ). Western blots were developed with the ECL-Plus detection system (Amersham Pharmacia, Arlington Heights, IL), as previously described (32).


    RESULTS

Tag library generation. As described in MATERIALS AND METHODS, a pilot PCR reaction was performed to determine the optimum dilution for the input DNA by using serial dilutions (1:10-1:1,280) of recovered linker-ditag-linker constructs (Fig. 2). As expected, amplified linker-ditag-linker constructs were ~102 bp. A 1:40 dilution of the ligation reaction gave optimum results, with the intensity of the ethidium bromide-stained, amplified linker-ditag-linker being greater than the background products of ~80 bp (ligated linkers without kidney tag sequence) and 90 bp (ligated linkers with a single kidney tag sequence). A large-scale PCR reaction using the previously determined optimum conditions was subsequently performed, and the 102-bp linker-ditag-linker products were digested with Nla III to free ditags from the linkers. Ditags contain two 9- to 13-bp sequences, which have been derived from kidney transcripts and are joined in the sequence 5'-CATG(N)22-26CATG-3'. The first half of the ditag represents the sense sequence of an ROP-Es/+ kidney RNA, and the second half represents the antisense sequence of a different kidney transcript. Ditags migrated with the 22- to 26-bp molecular size marker, whereas linkers migrated at 40 bp (Fig. 3). In our experience, purification of the 102-bp band by preparative PAGE, before Nla III treatment, improves digestion efficiency and ultimately the quality of the tag library. Eluted ditags were ligated together to generate concatemers (Fig. 4), which were ligated into pZero and transformed into bacteria as described in MATERIALS AND METHODS. Bacterial clones were screened for plasmids containing a kidney tag concatemer insert by direct PCR (Fig. 5), and the concatemer sequence was determined as described in MATERIALS AND METHODS.
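The ditag structure described above, together with the exclusion of replicate ditags described in MATERIALS AND METHODS, can be sketched in Python as follows. This is an illustration under stated assumptions and not the SAGE software v1.0 used here. The function names, the fixed tag length of 10 bases, and the treatment of a ditag and its reverse complement as the same ditag are hypothetical choices.

```python
from collections import Counter

ANCHOR = "CATG"  # Nla III site that punctuates ditags in a concatemer
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_ditags(concatemer, min_len=22, max_len=26):
    """Return the fragments between consecutive CATG sites that are ditag sized."""
    pieces = concatemer.upper().split(ANCHOR)
    # The first and last pieces are flanking vector sequence, not ditags.
    return [p for p in pieces[1:-1] if min_len <= len(p) <= max_len]


def count_tags(concatemers, tag_len=10):
    """Count tags across concatemers, using each distinct ditag only once."""
    seen_ditags = set()
    counts = Counter()
    for concatemer in concatemers:
        for ditag in extract_ditags(concatemer):
            # Orientation-independent key, so a ditag read from either strand
            # is recognized as a replicate.
            key = min(ditag, reverse_complement(ditag))
            if key in seen_ditags:
                continue
            seen_ditags.add(key)
            counts[ditag[:tag_len]] += 1                      # sense half
            counts[reverse_complement(ditag)[:tag_len]] += 1  # antisense half
    return counts
```

Calling `count_tags` on a list of concatemer reads returns a `Counter` keyed by tag sequence. Sorting it with `most_common()` gives an abundance-ordered list of the kind summarized in Table 1.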


Fig. 2.   Determination of optimum template dilution for large-scale PCR. An ethidium bromide-stained, nondenaturing 8% polyacrylamide gel demonstrates PCR products obtained from serial dilutions (top) of the linker-ditag-linker template. The 1:10 dilution is not shown. Amplified linker-ditag-linker constructs migrate at 102 bp. Background bands of other sizes stain with equal or lower intensity. The 1:40 dilution of template had an optimal signal-to-noise ratio and was used for the large-scale PCR (see MATERIALS AND METHODS).



Fig. 3.   Nla III digestion of the 102-bp linker-ditag-linker large-scale PCR product yields a 40-bp band (linkers) and a 26-bp ditag band. The Nla III digestion products were analyzed by 16% PAGE, and the gel was stained with Sybr green-I. The 26-bp ditag band was excised from the gel and concatemerized as described in MATERIALS AND METHODS.



Fig. 4.   Sybr green-I stained, nondenaturing 5% PAGE of 26-bp ditags concatemerized as described in MATERIALS AND METHODS. Concatemers ranging in size from 600 to 1,200 bp were subcloned in the pZero cloning vector.



Fig. 5.   A representative 1.2% agarose gel electrophoresis of cloned kidney SAGE tag concatemers, which were generated by direct PCR of recombinant bacterial clones. The gel has been ethidium bromide stained. The PCR products vary in size from 700 to 1,400 bp, which include ~300 bp derived from the multiple cloning site.

Tag library analysis. Our library contained 3,868 sequence tags, which represented 1,575 unique mRNA transcripts. Ditags, containing identical tag sequences, were encountered 90 times. In each case, the tags were counted only once in tag abundance calculations. Such ditags are potentially produced by biased PCR, because the probability of any two tags being coupled in the same ditag is small even for abundant transcripts (36). The actual number of unique genes (1,575) identified by 3,868 kidney tags is consistent with previous predictions (37). Assuming that a cell contains 15,000 mRNA molecules (16), the 3,868-tag sequence library provides 1-fold coverage for mRNA molecules present at a minimum of 4 copies/cell. However, previous analyses of SAGE libraries have suggested that three- to fourfold coverage is necessary to identify all the transcripts at any level of abundance with high certainty. The 100 most abundant tags (6.3% of the unique transcripts represented in this expression library) represented 36% of isolated mRNAs. As expected, the frequency distribution pattern demonstrated that a minority of genes were responsible for the majority of transcripts, indicating that most genes are expressed at low abundance.
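The coverage arithmetic in this paragraph can be reproduced with a short Python sketch. The onefold-coverage figure follows directly from the numbers given above. The detection probability is our own simplifying assumption (each tag treated as an independent draw from the cellular mRNA pool) and is not the calculation used in the cited analyses. The function names are illustrative.

```python
def onefold_copies(mrna_per_cell, n_tags):
    """Abundance (copies/cell) at which one tag is expected in the library."""
    return mrna_per_cell / n_tags


def detection_probability(copies_per_cell, mrna_per_cell, n_tags):
    """P(at least one tag observed), assuming independent draws from the pool."""
    p_single_draw = copies_per_cell / mrna_per_cell
    return 1.0 - (1.0 - p_single_draw) ** n_tags


MRNA_PER_CELL = 15_000  # assumption stated in the text (16)
N_TAGS = 3_868          # size of this kidney tag library

copies = onefold_copies(MRNA_PER_CELL, N_TAGS)
print(f"1-fold coverage at {copies:.1f} copies/cell")

# Probability of seeing a transcript of that abundance as the library grows
for fold in (1, 3, 4):
    p = detection_probability(copies, MRNA_PER_CELL, fold * N_TAGS)
    print(f"{fold}-fold coverage: P(detect) = {p:.2f}")
```

Under this simple model, a transcript at the onefold threshold is detected in only about 63% of libraries of the present size, rising to roughly 95% and 98% at three- and fourfold coverage. This is one way to see why several-fold coverage is recommended.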

Table 1 shows the 50 most abundant transcripts with GenBank matches and corresponding sequence tags from the normal mouse kidney tag library. The entire kidney library is available at http://www.metrohealth.org/research/kidneytag.html. Seventeen of these high-abundance transcripts were identified in the EST databases, and of these, 15 ESTs identified genes with characterized functions. As expected, housekeeping and mitochondrial genes were among the most highly expressed genes. In addition, several kidney-predominant or kidney-specific genes were also identified. Mouse androgen-regulated protein previously has been shown to be expressed in the S3 region of proximal tubules and accounted for ~4% of transcripts. Other transcripts, known to be kidney specific, included renal cytochrome P-450 (0.3% of all transcripts), parathyroid hormone receptor (0.1%), and kidney-specific cadherin (0.1%). Other genes that are highly, but not exclusively, expressed in the kidney were also identified in the group of highly expressed genes, including the renal sodium phosphate (Na+-Pi) cotransporter (0.3% of all transcripts); plasma glutathione peroxidase (a superoxide scavenger pathway enzyme, 2.3% of all transcripts); mouse kidney testosterone-regulated RP2 mRNA (a transcript regulated in patients with polycystic kidney disease, 0.2% of transcripts); the tetraspanin CD63 (predominantly expressed in glomerulus and postulated to be necessary for normal glomerular function, 0.2% of transcripts); and prolidase (0.2% of transcripts). Prolidase is an enzyme that recycles dipeptides with COOH-terminal proline or hydroxyproline residues and has previously been shown to be expressed at a high level in mouse kidney (17). Interestingly, the imidopeptide substrate of prolidase has been incriminated in inflammation and tissue damage in prolidase-deficient humans (21), suggesting that prolidase activity may be necessary for healing after inflammatory injury to the kidney.
Although one report demonstrates kidney Kv1.5 expression by Northern blotting (30), the Kv1.5 potassium channel was unexpectedly abundant (3.4% of all transcripts). This gene is predominantly expressed in ventricle, nerve cells, and vascular smooth muscle, and the high level of expression may reflect contribution of renal resistance vasculature or renal nerve transcripts to the SAGE tags. Because such a high frequency of tags corresponding to Kv1.5 mRNA was found in the kidney SAGE library, Kv1.5 expression was confirmed by immunoblot analysis (Fig. 6). Kv1.5 was abundantly expressed in rat heart, as previously published (4), and in neonatal rat kidney but not in a cell line overexpressing a related potassium channel (Kv2.1).

                              
Table 1.   Fifty most frequent SAGE tags identified in a sample of 3,868 tags isolated from a normal ROP-Es/+ mouse kidney (26 wk)



Fig. 6.   Kv1.5 expression was analyzed by immunoblotting as described in MATERIALS AND METHODS. Lysates from rat neonatal kidney (lanes 1 and 2), heart (lane 3, positive control) and DRK23 cells overexpressing Kv2.1 (lane 4, negative control) were resolved by SDS-PAGE. Left: molecular mass markers (in kDa); right arrow: position of Kv1.5.

Table 2 catalogues the genes identified in the kidney tag library by function. Functions were assigned by applying a classification system developed for cDNA clones isolated in an expression profile of mouse proximal tubule (34). Genes identified by microarray technology have also been grouped by function in a similar manner (18). Approximately 40% of kidney transcripts identified by SAGE tags matched mRNA sequence entries with known function, 37% of the transcripts corresponded to EST entries or cloned genes with undetermined function, and 23% represented unidentified genes. Of the tag sequences identified in the EST databases, 26 were identified only in kidney cDNA libraries that have been used in EST projects (http://www.ncbi.nlm.nih.gov/UniGene/Mm.Home.html). Of these 26 tags, 24 were identified in libraries generated from C57BL kidneys, an interesting finding because the C57BL and ROP mice share a large amount of their genetic backgrounds (20). The remaining two tags were identified in the cDNA ESTs generated from Barstead Balb/c mouse kidney.

                              
Table 2.   Genes expressed in normal mouse kidney, as identified by SAGE tags, sorted by function

The functional categories with the highest numbers of identified genes would be predicted by the physiology of the normal kidney. For example, genes encoding metabolic enzymes, proteins regulating mitochondrial function, or membrane proteins (transporters/receptors) accounted for 19% of all unique tag sequences and 47% of genes whose function has been established. Consistent with the high energy requirement of filtration and solute transport, transcripts for 59 proteins involved in mitochondrial respiration were identified. These included the mouse cytochrome-c oxidase Vb subunit gene, cytochrome-c oxidase polypeptides I and III, adrenodoxin, and NADH-ubiquinone oxidoreductase chain 3 genes. Enzymes involved in the Krebs cycle, the final common pathway for oxidation of amino acids, fatty acids, and carbohydrates and, ultimately, ATP generation, were also identified, including succinyl-CoA synthase, isocitrate dehydrogenase, and malate dehydrogenase. Glucose synthesis is an important kidney metabolic function, and enzymes involved in gluconeogenesis, such as hexokinase and fructose-1,6-bisphosphatase, were represented in this kidney tag library. Arachidonic acid metabolites have been implicated in normal renal physiology. In addition to CYP4B1, other genes involved in arachidonate metabolism were identified, including arachidonate 5-lipoxygenase and phospholipase A2.

Because solute transport is a major physiological function of the kidney, it was predicted that transcripts encoding transport proteins would also be identified in the expression library. In addition to the renal Na+-Pi transporter already mentioned, other expressed genes with known transport functions included the α-subunit of Na+-K+-ATPase, H+-ATPase, the inward rectifier potassium channel, the mouse basolateral thick ascending limb of Henle chloride channel, carbonic anhydrase, and the furosemide-sensitive Na-K-2Cl cotransporter.

In contrast to these functional categories of genes that regulate normal renal physiology, transcripts promoting tissue injury and remodeling, including proteases, cytokines, and matrix genes, were uncommonly identified, which was expected in this control (wild-type) mouse strain. However, genes encoding proteins that would limit tissue injury were expressed. Some examples include IκB; the p58 cellular inhibitor of the interferon-inducible, double-stranded RNA-dependent, regulated protein kinase; clusterin (a complement regulator); type 2 plasminogen activator inhibitor; matrix Gla protein (an inhibitor of ectopic calcification); GM2 protein (a platelet-activating factor inhibitor); and p57KIP2 (a cyclin-dependent kinase inhibitor). Genes required for maintenance of renal structure, cytoskeletal proteins, and nuclear matrix proteins, as well as transcription factors, were expressed at intermediate levels. Interestingly, tuberin, the tuberous sclerosis complex 2 (TSC2) gene product that is mutated in tuberous sclerosis, was also identified. Tuberous sclerosis patients, expressing the mutated TSC2 gene, have renal hamartomas and cysts. Identification of the TSC2 transcript in the normal kidney suggests constitutive expression of this gene may maintain normal renal structure. Recently, tuberin expression has been demonstrated predominantly in intercalated cells of the distal convoluted tubule and the cortical and medullary collecting duct (24).


    DISCUSSION

Our data demonstrate the feasibility of generating expression libraries from kidney and, in aggregate, identify transcripts whose expression would be predicted by the physiology of the normal kidney. Genes regulating energy generation and solute transport dominated identified transcripts with known functions, whereas genes normally expressed in inflammation or ongoing tissue remodeling or injury were not identified, as expected, in normal (wild-type) kidney. These data also demonstrate the potential power of expression profiling, which not only catalogues expressed genes but also allows an assessment of how the expression pattern of one gene relates to others. We suggest that analysis of expression profiles generated from normal and diseased kidneys, by SAGE or other methodologies, should provide valuable new insights into renal disease pathogenesis and could identify critical regulators of renal disease initiation and progression.

SAGE provides a technique for high-throughput evaluation of gene expression and is based on two principles. First, a short sequence of 9-13 bp can be generated from DNA digestion with appropriate combinations of restriction endonucleases. This sequence tag contains sufficient information to uniquely identify a transcript, provided that it is derived from a specific location within the mRNA sequence. Second, many transcript tags can be concatenated and sequenced, revealing the sequence of multiple tags simultaneously (36). A major strength of the technique is that it provides quantitative gene expression data (37), in contrast to differential display and subtraction hybridization. The quantitative aspect of SAGE has been validated in other laboratories (36-38). For example, quantitative hybridization experiments by Iyer and Struhl determined that SUP44/RPS4 is expressed at ~75 copies/cell in yeast. A yeast transcript profile, generated by SAGE, determined the abundance of this gene to be 63 copies/cell (37). Reproducibility and reliability of tag libraries generated from different amounts of input RNA from the same kidney sample have recently been reported (38). The abundance of the same tags correlated between the two libraries, suggesting this technique and its modifications for small samples will provide similar results in different laboratories.

The SAGE technique has limitations, however. First, a small number of transcripts would be predicted to lack an Nla III restriction site and would not be detected. Second, transcripts expressed at low abundance, or only in a fraction of the cell population, may not be reliably detected. However, SAGE has been used to analyze gene expression in yeast arrested at different phases of the cell cycle, yielding transcripts as low as 0.3 copies/cell (37). Therefore, the size of the SAGE library depends on the level of confidence desired for detecting low-abundance mRNA molecules. Monte Carlo simulations suggest the probability of identifying a single-copy transcript in a library containing 30,000 tags is between 72 and 97% (37). In addition, in an organ with as many different cell types as the kidney, a large tag library will need to be screened to identify transcripts unique for a subpopulation of cells. However, SAGE has been recently adapted for small samples and validated in microdissected kidney tubules containing ~50,000 cells (38). Third, differential mRNA expression may not be reflected at the protein level (15). Finally, identities of transcripts may remain unassigned (i.e., no match in either the nr or dbEST databases), but the percentage of these unknown transcripts will diminish as large-scale sequencing of the mouse and human genomes nears completion. Alternative high-throughput screening techniques, such as cDNA or oligonucleotide microarrays, may ultimately render SAGE obsolete. Recently, commercially available cDNA arrays on nylon membranes containing 588 unique murine genes were used to identify changes in gene expression in an in vitro model of branching morphogenesis (26). At present, though, microarray technology is not widely available, and only at considerable cost if robotically generated glass chips are used. In addition, the repertoire of mouse genes on chips and nylon membranes is limited, and SAGE analysis can result in discovery of novel genes.
No matter which method is used to generate expression profiles, tools for exploring gene expression libraries are in their infancy. Data analysis of expression profiles ultimately will depend on integrating kidney mRNA libraries with external information resources and will require software development, such as the VectorArray application used to analyze the gene expression profiles during branching morphogenesis (26). Necessary bioinformatics tools include links between kidney genes identified in an expression profile and GenBank, links to user-friendly biological pathway databases, and access to databases that can identify functionally important nucleotide (e.g., regulatory elements) or protein (e.g., kinase domain) motifs.

Some of the most abundant genes expressed in ROP-Es/+ kidney were also identified in a previously published abundance profile, which was produced by sequencing randomly selected cDNA clones from microdissected renal tubules (34). Examples of common transcripts identified in both that study and ours included the androgen-regulated protein and cytochrome-c oxidase. Not surprisingly, SAGE reliably identified common tubular transcripts, validating the technique in kidney given that proximal tubules account for ~60% of renal mass. As a more comprehensive approach, SAGE should emerge as a preferable method for generating kidney expression libraries. The profile of abundant genes in our library (Table 1) also compares favorably to the abundant, nonmitochondrial genes recently identified by SAGE in C57BL/6J kidney (38). Of these 26 genes, we have identified 21 in similar abundance in our partial, ROP-Es/+ kidney SAGE library.

SAGE has been used to generate gene expression profiles from normal human skeletal muscle (39), pancreas (36), and the yeast Saccharomyces cerevisiae (37). Integration of the yeast gene expression data with the S. cerevisiae genomic map allowed generation of chromosomal expression maps, which identified regions that were transcriptionally active and revealed genes that had not been predicted by sequence information alone (37). More recently, gene expression profiles have been generated from in vitro models of normal and disease states, as well as from actual tissues, to gain insights into pathogenesis. SAGE expression libraries generated from cancerous and normal colonic epithelia and p53- and mock-transfected cells have compared 300,000 and 7,200 transcripts, respectively (29, 37). Fewer than 1% of the transcripts were expressed at significantly different levels in these comparative analyses. Because genes exhibiting the greatest difference in expression between normal and diseased tissue are likely to be the most relevant to disease processes (42), SAGE expression libraries have the potential to identify candidate pathways important in disease pathogenesis, novel diagnostic markers, and potential therapeutic targets.
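
A comparison of tag counts between two libraries can be sketched as follows. This Python example flags tags whose relative frequency differs between a control and a diseased library by using a pooled two-proportion z-test. The test is a simple stand-in chosen for illustration and is not the significance procedure used in the cited studies. The tag sequences, counts, and function names are hypothetical, and no correction for multiple comparisons is applied.

```python
import math


def two_proportion_p_value(count_a, total_a, count_b, total_b):
    """Two-sided p-value for a difference in tag frequency between two libraries."""
    pooled = (count_a + count_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / total_a + 1.0 / total_b))
    if se == 0.0:
        return 1.0  # tag absent from (or making up all of) both libraries
    z = (count_a / total_a - count_b / total_b) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def differential_tags(lib_a, lib_b, alpha=0.01):
    """Return (tag, count_a, count_b, p) for tags with p < alpha, smallest p first."""
    total_a = sum(lib_a.values())
    total_b = sum(lib_b.values())
    results = []
    for tag in sorted(set(lib_a) | set(lib_b)):
        a = lib_a.get(tag, 0)
        b = lib_b.get(tag, 0)
        p = two_proportion_p_value(a, total_a, b, total_b)
        if p < alpha:
            results.append((tag, a, b, p))
    return sorted(results, key=lambda row: row[3])


if __name__ == "__main__":
    # Hypothetical tag counts, for illustration only.
    control = {"AAAAAAAAAA": 50, "CCCCCCCCCC": 50, "GGGGGGGGGG": 900}
    diseased = {"AAAAAAAAAA": 50, "CCCCCCCCCC": 150, "GGGGGGGGGG": 800}
    for tag, a, b, p in differential_tags(control, diseased):
        print(tag, a, b, p)
```

Because every tag in a library is tested, a real analysis would need a stricter threshold or an explicit multiple-testing correction, and low-count tags would be better served by an exact test.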

Expression libraries are a powerful tool to apply to mechanisms of renal disease, particularly because available treatments are limited, only marginally effective, and rarely curative. Progressive kidney disease probably results from complex interactions of both environmental and genetic traits (31). Further understanding of the pathogenesis of chronic renal failure and development of new therapies will require molecular tools, such as expression profiling, designed to dissect complex diseases that result from interactions of multiple genes. Expression profiling has already been applied to in vitro and in vivo models of kidney disease. Suppression subtractive hybridization identified connective tissue growth factor in glucose-stimulated mesangial cells, a finding confirmed in streptozotocin-induced diabetic nephropathy (22). Differential display has been used to identify candidate diabetic nephropathy genes in the GK rat (25), and PCR-based subtractive hybridization has been used to analyze kidney gene profiles after 5/6 nephrectomy (41) and injection of anti-Thy1.1 antibody (23). In each case, transcripts were identified whose induction in diseased kidney was previously unrecognized. Interestingly, the most abundant transcript identified in our ROP-Es/+ kidney library, mitochondrial cytochrome-c oxidase I, was shown to undergo a 4.5-fold induction in expression in 5/6 nephrectomized mice compared with sham-operated control animals (41). Previous reports also showed increased cytochrome oxidase I activity after unilateral nephrectomy (19). The compensatory increase in cytochrome-c oxidase I expression may be linked to kidney cell deletion through enhanced oxygen radical generation and apoptosis (40). However, the power of expression profiling to identify mechanisms of kidney disease will be fully realized only when sophisticated analyses, such as Monte Carlo simulations and hierarchical cluster analysis, are applied (6, 11). These techniques will allow identification of key candidate genes or pathways regulating kidney disease pathogenesis by 1) assigning a precise biological pathway to a gene of generic function (e.g., kinase), 2) relating an unknown gene to characterized genes, and 3) revealing unexpected relationships between previously known genes or pathways.
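
The idea of relating an unknown gene to characterized genes by cluster analysis can be sketched in a few lines of Python. The example below groups transcripts by the similarity of their expression profiles across conditions, using average-linkage agglomerative clustering with a Pearson correlation distance. It is a minimal sketch under stated assumptions: the gene names and expression values are hypothetical, the linkage and distance choices are illustrative, and it is not the method of Refs. 6 or 11.

```python
import math


def pearson_distance(x, y):
    """Distance of 1 - r, so that co-regulated profiles are close to 0."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0.0 or sd_y == 0.0:
        return 1.0  # a flat profile is treated as uncorrelated
    return 1.0 - cov / (sd_x * sd_y)


def hierarchical_clusters(profiles, n_clusters):
    """Merge the two closest clusters (average linkage) until n_clusters remain."""
    clusters = [[name] for name in profiles]

    def linkage(c1, c2):
        dists = [pearson_distance(profiles[a], profiles[b]) for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


if __name__ == "__main__":
    # Hypothetical tag counts across four conditions, for illustration only.
    profiles = {
        "kinase_X": [1, 2, 4, 8],
        "unknown_tag_1": [2, 4, 8, 16],
        "transporter_Y": [8, 4, 2, 1],
        "unknown_tag_2": [9, 5, 2, 1],
    }
    for cluster in hierarchical_clusters(profiles, 2):
        print(cluster)
```

In this toy data set each uncharacterized tag falls into the cluster of the known gene whose profile it tracks, which is the kind of association that could suggest a candidate pathway for an unidentified transcript.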

In summary, SAGE is a powerful tool that enables quantitative identification of differentially expressed genes. Alternative methods of gene expression analysis (differential display, chip microarray, subtractive hybridization) either lack the quantitative analytical capacity or require prohibitively expensive equipment. Our data confirm that it is feasible for a small laboratory to generate and analyze comprehensive mRNA expression libraries by using tissue from in vivo animal models and SAGE technology. On the basis of these studies, we believe analysis of kidney expression libraries could identify renal disease susceptibility genes and new pathogenic mechanisms by comparing mRNA expression profiles from diseased and control animals. Such an approach certainly should improve on simple models that appear sufficient to explain a pathogenic process, but fail to improve outcome due to the true complexity of disease mechanisms. Finally, the SAGE technique may also have potential implications for the diagnosis of human renal disease, if the methods can be modified to accommodate smaller tissue volumes from biopsy specimens. Expression profiles from human renal biopsies could be used to stage disease and to identify patients at risk for progression. Microarray technology already has been used to stratify human breast cancer tissues (28). Further understanding of kidney disease pathogenesis and development of new kidney disease therapies will require application of genetic and genomic tools, such as SAGE and other expression profiling methodologies, which are designed to dissect complex pathogenic mechanisms that result from interaction of multiple genes.


    ACKNOWLEDGEMENTS

The detailed SAGE protocol and SAGE software were generously provided by Drs. V. E. Velculescu and K. W. Kinzler (Oncology Center and the Program in Human Genetics and Molecular Biology, Johns Hopkins University, Baltimore, MD). The authors gratefully acknowledge Dr. Velculescu's helpful discussions concerning the technical aspects of expression library generation using SAGE.


    FOOTNOTES

Support for this project was provided by National Institute of Diabetes and Digestive and Kidney Diseases Grants DK-38558, DK-02281, DK-07470, DK-54644, and DK-54178 and the Kidney Foundation of Ohio. Dr. Schelling is an Established Investigator of the American Heart Association.

Address for reprint requests and other correspondence: J. R. Sedor, Dept. of Medicine, BG 531, MetroHealth Medical Center, 2500 MetroHealth Dr., Cleveland, Ohio 44109-1998 (E-mail: jrs4@po.cwru.edu).

The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Received 25 October 1999; accepted in final form 22 March 2000.


    REFERENCES

1. Alon U, Barkai N, Notterman DA, Gish K, Ybarra S, Mack D, and Levine AJ. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc Natl Acad Sci USA 96: 6745-6750, 1999.

2. Amson RB, Nemani M, Roperch JP, Israeli D, Bougueleret L, Le Gall I, Medhioub M, Linares-Cruz G, Lethrosne F, Pasturaud P, Piouffre L, Prieur S, Susini L, Alvaro V, Millasseau P, Guidicelli C, Bui H, Massart C, Cazes L, Dufour F, Bruzzoni-Giovanelli H, Owadi H, Hennion C, and Charpak G. Isolation of 10 differentially expressed cDNAs in p53-induced apoptosis: activation of the vertebrate homologue of the Drosophila seven in absentia gene. Proc Natl Acad Sci USA 93: 3953-3957, 1996.

3. Anonymous. Incidence and prevalence of ESRD. USRDS (United States Renal Data System). Am J Kidney Dis 30: S40-S53, 1997.

4. Attali B, Lesage F, Ziliani P, Guillemare E, Honore E, Waldmann R, Hugnot JP, Mattei MG, Lazdunski M, and Barhanin J. Multiple mRNA isoforms encoding the mouse cardiac Kv1-5 delayed rectifier K+ channel. J Biol Chem 268: 24283-24289, 1993.

5. Babity JM, Armstrong JN, Plumier JC, Currie RW, and Robertson HA. A novel seizure-induced synaptotagmin gene identified by differential display. Proc Natl Acad Sci USA 94: 2638-2641, 1997.

6. Bassett DE Jr, Eisen MB, and Boguski MS. Gene expression informatics-it's all in your mine. Nat Genet 21: 51-55, 1999.

7. Border WA, and Noble NA. TGF-beta in kidney fibrosis: a target for gene therapy. Kidney Int 51: 1388-1396, 1997.

8. Brenner BM, Lawler EV, and Mackenzie HS. The hyperfiltration theory: a paradigm shift in nephrology. Kidney Int 49: 1774-1777, 1996.

9. Brown PO, and Botstein D. Exploring the new world of the genome with DNA microarrays. Nat Genet 21: 33-37, 1999.

10. Chu S, DeRisi J, Eisen M, Mulholland J, Botstein D, Brown PO, and Herskowitz I. The transcriptional program of sporulation in budding yeast. Science 282: 699-705, 1998.

11. De Waard V, van den Berg BM, Veken J, Schultz-Heienbrok R, Pannekoek H, and van Zonneveld AJ. Serial analysis of gene expression to assess the endothelial cell response to an atherogenic stimulus. Gene 226: 1-8, 1999.

12. Esposito C, He CJ, Striker GE, Zalups RK, and Striker LJ. Nature and severity of the glomerular response to nephron reduction is strain-dependent in mice. Am J Pathol 154: 891-897, 1999.

13. Fambrough D, McClure K, Kazlauskas A, and Lander ES. Diverse signaling pathways activated by growth factor receptors induce broadly overlapping, rather than independent, sets of genes. Cell 97: 727-741, 1999.

14. Grant DS, Kinsella JL, Kibbey MC, LaFlamme S, Burbelo PD, Goldstein AL, and Kleinman HK. Matrigel induces thymosin beta 4 gene in differentiating endothelial cells. J Cell Sci 108: 3685-3694, 1995.

15. Gygi SP, Rochon Y, Franza BR, and Aebersold R. Correlation between protein and mRNA abundance in yeast. Mol Cell Biol 19: 1720-1730, 1999.

16. Hereford LM, and Rosbash M. Number and distribution of polyadenylated RNA sequences in yeast. Cell 10: 453-462, 1977.

17. Ishii T, Tsujino S, Matsunobu S, Endo F, Sato K, and Sakuragawa N. Cloning of mouse prolidase cDNA: predominant expression of prolidase mRNA in kidney. Biochim Biophys Acta 1308: 15-16, 1996.

18. Iyer VR, Eisen MB, Ross DT, Schuler G, Moore T, Lee JCF, Trent JM, Staudt LM, Hudson JJ, Boguski MS, Lashkari D, Shalon D, Botstein D, and Brown PO. The transcriptional program in the response of human fibroblasts to serum. Science 283: 83-87, 1999.

19. Kazimierczak J, Chavaz P, Krstic R, and Bucher O. Morphometric and enzyme histochemical behaviour of the kidney of young rats before and after unilateral nephrectomy. Histochemistry 46: 107-120, 1976.

20. Lenz O, Zheng F, Vilar J, Doublier S, Lupia E, Schwedler S, Striker LJ, and Striker GE. The inheritance of glomerulosclerosis in mice is controlled by multiple quantitative trait loci. Nephrol Dial Transplant 13: 3074-3078, 1998.

21. Morton NE, and Collins A. Tests and estimates of allelic association in complex inheritance. Proc Natl Acad Sci USA 95: 11389-11393, 1998.

22. Murphy M, Godson C, Cannon S, Kato S, Mackenzie HS, Martin F, and Brady HR. Suppression subtractive hybridization identifies high glucose levels as a stimulus for expression of connective tissue growth factor and other genes in human mesangial cells. J Biol Chem 274: 5830-5834, 1999.

23. Narita I, Nakayama H, Goto S, Takeda T, Sakatsume M, Saito A, Nakagawa Y, and Arakawa M. Identification of genes specifically expressed in chronic and progressive glomerulosclerosis. Kidney Int 63: S215-S217, 1997.

24. Onda H, Lueck A, Marks PW, Warren HB, and Kwiatkowski DJ. Tsc2(+/-) mice develop tumors in multiple sites that express gelsolin and are influenced by genetic background. J Clin Invest 104: 687-695, 1999.

25. Page R, Morris C, Williams J, von Ruhland C, and Malik AN. Isolation of diabetes-associated kidney genes using differential display. Biochem Biophys Res Commun 232: 49-53, 1997.

26. Pavlova A, Stuart RO, Pohl M, and Nigam SK. Evolution of gene expression patterns in a model of branching morphogenesis. Am J Physiol Renal Physiol 277: F650-F663, 1999.

27. Pawson T, and Saxton TM. Signaling networks-do all roads lead to the same genes? Cell 97: 675-678, 1999.

28. Perou CM, Jeffrey SS, Van de Rijn M, Rees CA, Eisen MB, Ross DT, Pergamenschikov A, Williams CF, Zhu SX, Lee JC, Lashkari D, Shalon D, Brown PO, and Botstein D. Distinctive gene expression patterns in human mammary epithelial cells and breast cancers. Proc Natl Acad Sci USA 96: 9212-9217, 1999.

29. Polyak K, Xia Y, Zweier JL, Kinzler KW, and Vogelstein B. A model for p53-induced apoptosis. Nature 389: 300-305, 1997.

30. Sasaki Y, Ishii K, Nunoki K, Yamagishi T, and Taira N. The voltage-dependent K+ channel (Kv1.5) cloned from rabbit heart and facilitation of inactivation of the delayed rectifier current by the rat beta subunit. FEBS Lett 372: 20-24, 1995.

31. Schelling JR, Zarif L, Sehgal AR, Iyengar S, and Sedor JR. Genetic susceptibility to end stage renal disease. Curr Opin Nephrol Hypertens 8: 465-472, 1999.

32. Singh R, Wang B, Shirvaikar A, Khan S, Kamat S, Schelling JR, Konieczkowski M, and Sedor JR. The IL-1 receptor and Rho directly associate to drive cell activation in inflammation. J Clin Invest 103: 1561-1570, 1999.

33. Stevens CJM, Te Kronnie G, Samallo J, Schipper H, and Stroband HWJ. Isolation of carp cDNA clones, representing developmentally-regulated genes, using a subtractive-hybridization strategy. Roux's Arch Dev Biol 205: 460-467, 1996.

34. Takenaka M, Imai E, Kaneko T, Ito T, Moriyama T, Yamauchi A, Hori M, Kawamoto S, and Okubo K. Isolation of genes identified in mouse renal proximal tubule by comparing different gene expression profiles. Kidney Int 53: 562-572, 1998.

35. Trinczek B, Robert-Nicoud M, and Schwoch G. In situ localization of cAMP-dependent protein kinases in nuclear and chromosomal substructures: relation to transcriptional activity. Eur J Cell Biol 60: 196-202, 1993.

36. Velculescu VE, Zhang L, Vogelstein B, and Kinzler KW. Serial analysis of gene expression. Science 270: 484-487, 1995.

37. Velculescu VE, Zhang L, Zhou W, Vogelstein J, Basrai MA, Bassett DE Jr, Hieter P, Vogelstein B, and Kinzler KW. Characterization of the yeast transcriptome. Cell 88: 243-251, 1997.

38. Virlon B, Cheval L, Buhler JM, Billon E, Doucet A, and Elalouf JM. Serial microanalysis of renal transcriptomes. Proc Natl Acad Sci USA 96: 15286-15291, 1999.

39. Welle S, Bhatt K, and Thornton CA. Inventory of high-abundance mRNAs in skeletal muscle of normal men. Genome Res 9: 506-513, 1999.

40. Zheng F, Striker GE, Esposito C, Lupia E, and Striker LJ. Strain differences rather than hyperglycemia determine the severity of glomerulosclerosis in mice. Kidney Int 54: 1999-2007, 1998.

41. Zhang H, Wada J, Kanwar YS, Tsuchiyama Y, Hiragushi K, Hida K, Shikata K, and Makino H. Screening for genes up-regulated in 5/6 nephrectomized mouse kidney. Kidney Int 56: 549-558, 1999.

42. Zhang L, Zhou W, Velculescu VE, Kern SE, Hruban RH, Hamilton SR, Vogelstein B, and Kinzler KW. Gene expression profiles in normal and cancer cells. Science 276: 1268-1272, 1997.


Am J Physiol Renal Fluid Electrolyte Physiol 279(2):F383-F392
0363-6127/00 $5.00 Copyright © 2000 the American Physiological Society